infoxtractor

Author	SHA1	Message	Date
goldstein	dc6d28bda1	Merge pull request 'feat(store): Alembic scaffolding + initial ix_jobs migration' (#18) from feat/alembic-init into main	2026-04-18 09:37:37 +00:00
Dirk Riemann	1c60c30084	feat(store): Alembic scaffolding + initial ix_jobs migration (spec §4) Lands the async-friendly Alembic env (NullPool, reads IX_POSTGRES_URL), the hand-written 001 migration matching the spec's table layout exactly (CHECK on status, partial index on pending rows, UNIQUE on (client_id, request_id)), the SQLAlchemy 2.0 ORM mapping, and a lazy engine/session factory. The factory reads the URL through ix.config when available; Task 3.2 makes that the only path. Smoke-tested: alembic upgrade head + downgrade base against a live postgres:16 produce the expected table shape and tear down cleanly. Unit tests assert the migration source contains every required column/index so the migration can't drift from spec at import time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:37:21 +02:00
goldstein	a54a968313	Merge pull request 'test(pipeline): end-to-end hermetic test with fakes + synthetic fixture' (#17) from feat/pipeline-e2e-fakes into main	2026-04-18 09:24:51 +00:00
Dirk Riemann	b109bba873	test(pipeline): end-to-end hermetic test with fakes + synthetic fixture Wires the five pipeline steps together with FakeOCRClient + FakeGenAIClient, feeds the committed synthetic_giro.pdf fixture via file:// URL, and asserts the full response shape. - scripts/create_fixture_pdf.py: PyMuPDF-based builder. One-page A4 PDF with six known header strings (bank name, IBAN, period, balances, statement date). Re-runnable on demand; the committed PDF is what CI consumes. - tests/fixtures/synthetic_giro.pdf: committed output. - tests/unit/test_pipeline_end_to_end.py: 5 tests covering * ix_result.result fields populated from the fake LLM * provenance.fields["result.closing_balance"].provenance_verified True * text_agreement True when Paperless-style texts match the value * metadata.timings has one entry per step in the right order * response.error is None and context is not serialised 197 tests total; ruff clean. No integration tests, no real clients, no network. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:24:29 +02:00
goldstein	118d77c428	Merge pull request 'feat(pipeline): ResponseHandlerStep (spec §8)' (#16) from feat/step-response-handler into main	2026-04-18 09:21:50 +00:00
Dirk Riemann	565d8d0676	feat(pipeline): ResponseHandlerStep — shape-up final payload (spec §8) Final pipeline step. Three mechanical transforms: 1. include_ocr_text -> concatenate non-tag line texts, pages joined with \n\n, write to ocr_result.result.text. 2. include_geometries=False (default) -> strip ocr_result.result.pages + ocr_result.meta_data. Geometries are heavy; callers opt in. 3. Delete response.context so the internal accumulator never leaks to the caller (belt-and-braces; Field(exclude=True) already does this). validate() always returns True per spec. 7 unit tests in tests/unit/test_response_handler_step.py cover all three branches + context-not-in-model_dump check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:21:36 +02:00
goldstein	83c1996702	Merge pull request 'feat(pipeline): ReliabilityStep (spec §6)' (#15) from feat/step-reliability into main	2026-04-18 09:20:38 +00:00
Dirk Riemann	132f110463	feat(pipeline): ReliabilityStep — writes reliability flags (spec §6) Thin wrapper around ix.provenance.apply_reliability_flags. Validate skips entirely when include_provenance is off OR when no provenance data was built (text-only request, etc.). Process reads context.texts + context.use_case_response and lets the verifier mutate the FieldProvenance entries + fill quality_metrics counters in place. 11 unit tests in tests/unit/test_reliability_step.py cover: validate skips on flag off / missing provenance, runs otherwise; per-type flag behaviour (string verified + text_agreement, Literal -> None, None value -> None, short numeric -> text_agreement None, date with both sides parsed, IBAN whitespace-insensitive, disagreement -> False); quality_metrics verified_fields / text_agreement_fields counters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:20:18 +02:00
goldstein	6d9c239e82	Merge pull request 'feat(pipeline): GenAIStep (spec §6.3, §7, §9.2)' (#14) from feat/step-genai into main	2026-04-18 09:18:59 +00:00
Dirk Riemann	abee9cea7b	feat(pipeline): GenAIStep — LLM call + provenance mapping (spec §6.3, §7, §9.2) Assembles the prompt, picks the structured-output schema, calls the injected GenAIClient, and maps any emitted segment_citations into response.provenance. Reliability flags stay None here; ReliabilityStep fills them in Task 2.7. - System prompt = use_case.system_prompt + (provenance-on) the verbatim citation instruction from spec §9.2. - User text = SegmentIndex.to_prompt_text([p1_l0] style) when provenance is on, else plain OCR flat text + texts joined. - Response schema = UseCaseResponse directly, or a runtime create_model("ProvenanceWrappedResponse", result=(UCR, ...), segment_citations=(list[SegmentCitation], Field(default_factory=list))) when provenance is on. - Model = request override -> use-case default. - Failure modes: httpx / connection / timeout errors -> IX_002_000; pydantic.ValidationError -> IX_002_001. - Writes ix_result.result + ix_result.meta_data (model_name + token_usage); builds response.provenance via map_segment_refs_to_provenance when provenance is on. 17 unit tests in tests/unit/test_genai_step.py cover validate (ocr_only skip, empty -> IX_001_000, text-only, ocr-text path), process happy path, system-prompt shape with/without citation instruction, user text tagged vs. plain, response schema plain vs. wrapped, provenance mapping, error mapping (IX_002_000 + IX_002_001), and model selection (request override + use-case default). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:18:44 +02:00
goldstein	acb2d55ce3	Merge pull request 'feat(pipeline): OCRStep (spec §6.2)' (#13) from feat/step-ocr into main	2026-04-18 09:16:04 +00:00
Dirk Riemann	81054baa06	feat(pipeline): OCRStep — run OCR + page tags + SegmentIndex (spec §6.2) Runs after SetupStep. Dispatches the flat page list to the injected OCRClient, writes the raw OCRResult onto response.ocr_result, injects <page file="..." number="..."> open/close tag lines around each page's content, and builds a SegmentIndex over the non-tag lines when provenance is on. Validate follows the spec triad rule: - include_geometries/include_ocr_text/ocr_only + no files -> IX_000_004 - no files -> False (skip) - files + (use_ocr or triad) -> True 9 unit tests in tests/unit/test_ocr_step.py cover all three validate branches, OCRResult written, page tags injected (format + file_index), SegmentIndex built iff provenance on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:15:46 +02:00
goldstein	632acdcd26	Merge pull request 'feat(pipeline): SetupStep (spec §6.1)' (#12) from feat/step-setup into main	2026-04-18 09:14:19 +00:00
Dirk Riemann	97aa24f478	feat(pipeline): SetupStep — validate + fetch + MIME + pages (spec §6.1) First pipeline step. Validates the request (IX_000_002 on empty context), normalises every Context.files entry to a FileRef, downloads them in parallel via asyncio.gather, byte-sniffs MIMEs (IX_000_005 for unsupported), loads the use-case pair from REGISTRY (IX_001_001 on miss), and builds the flat pages + page_metadata list on response_ix.context. Fetcher / ingestor / MIME detector / tmp_dir / fetch_config all inject via the constructor so unit tests stay hermetic — production wires the real ix.ingestion defaults via the app factory. 7 unit tests in tests/unit/test_setup_step.py cover validate errors, happy path (fetcher + ingestor invoked correctly, context populated, use_case_name echoed), FileRef headers pass through, unsupported MIME -> IX_000_005, unknown use case -> IX_001_001, text-only request, and the _InternalContext type assertion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:14:04 +02:00
goldstein	d801038c74	Merge pull request 'feat(ingestion): fetch_file + MIME sniff + DocumentIngestor' (#11) from feat/ingestion into main	2026-04-18 09:12:19 +00:00
Dirk Riemann	290e51416f	feat(ingestion): fetch_file + MIME sniff + DocumentIngestor (spec §6.1) Three layered modules the SetupStep will wire together in Task 2.4. - fetch.py: async httpx fetch with configurable timeouts + incremental size cap (stream=True, accumulate bytes, raise IX_000_007 when exceeded). file:// URLs read locally. Auth headers pass through. The caller injects a FetchConfig — env reads happen in ix.config (Chunk 3). - mime.py: python-magic byte-sniff + SUPPORTED_MIMES frozenset + require_supported(mime) helper that raises IX_000_005. - pages.py: DocumentIngestor.build_pages(files, texts) -> (list[Page], list[PageMetadata]). PDFs via PyMuPDF (hard 100 pg/PDF cap -> IX_000_006), images via Pillow (multi-frame TIFFs yield multiple Pages), texts as zero-dim Pages so GenAIStep can still cite them. 21 new unit tests (141 total) cover: fetch success with headers, 4xx/5xx mapping, timeout -> IX_000_007, size cap enforced globally + per-file, file:// happy path + missing file, MIME detection for PDF/PNG/JPEG/TIFF, require_supported gate, PDF/TIFF/text page counts, 101-page PDF -> IX_000_006, multi-file file_index assignment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:12:00 +02:00
goldstein	2709fb8d6b	Merge pull request 'feat(clients): OCRClient + GenAIClient protocols + fakes' (#10) from feat/client-protocols into main	2026-04-18 09:08:38 +00:00
Dirk Riemann	118a9abd09	feat(clients): OCRClient + GenAIClient protocols + fakes (spec §6.2, §6.3) Adds the two Protocol-based client contracts the pipeline steps depend on, plus test-oriented fakes. Real engines (Surya, Ollama) land in Chunk 4. - ix.ocr.client.OCRClient — runtime_checkable Protocol with async ocr(). - ix.genai.client.GenAIClient — runtime_checkable Protocol with async invoke(); GenAIInvocationResult + GenAIUsage dataclasses carry the parsed model, token usage, and model name. - FakeOCRClient / FakeGenAIClient: return canned results; both expose a raise_on_call hook for error-path tests. 8 unit tests across tests/unit/test_ocr_fake.py + test_genai_fake.py confirm protocol conformance, canned-return behaviour, usage/model-name defaults, and raise_on_call propagation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:08:24 +02:00
goldstein	1344b9ddb4	Merge pull request 'feat(pipeline): Step ABC + Pipeline runner + Timer' (#9) from feat/pipeline-core into main	2026-04-18 09:07:09 +00:00
Dirk Riemann	dcd1bc764a	feat(pipeline): Step ABC + Pipeline runner + Timer (spec §3, §4) Adds the transport-agnostic pipeline orchestrator. Each step implements async validate + process; the runner wraps both in a Timer, writes per-step entries to response.metadata.timings, and aborts on the first IXException by writing response.error. - Step exposes a step_name property (defaults to class name) so tests and logs label steps consistently. - Timer is a plain context manager that appends one {step, elapsed_seconds} entry on exit regardless of whether the body raised, so the timeline stays reconstructable for failed steps. - 9 unit tests cover ordering, skip-on-false, IXException in validate vs. process, timings populated for every executed step, and shared-response mutation across steps. Non-IX exceptions propagate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:06:46 +02:00
goldstein	b397a80c0b	feat(provenance): mapper + verifier (spec §9.4, §6) (#8) Provenance mapper and reliability verifier land.	2026-04-18 09:01:35 +00:00
Dirk Riemann	1e340c82fa	feat(provenance): mapper + verifier for ReliabilityStep (spec §9.4, §6) Lands the two remaining provenance-subsystem pieces: mapper.py — map_segment_refs_to_provenance: - For each LLM SegmentCitation, pick seg-ids per source_type (`value` vs `value_and_context`), cap at max_sources_per_field, resolve each via SegmentIndex, track invalid references. - Resolve field values by dot-path (`result.items[0].name` supported — `[N]` bracket notation is normalised to `.N` before traversal). - Skip fields that resolve to zero valid sources (spec §9.4). - Write quality_metrics with fields_with_provenance / total_fields / coverage_rate / invalid_references. verify.py — verify_field + apply_reliability_flags: - Dispatches per Pydantic field type: date → parse-both-sides compare; int/float/Decimal → normalize + whole-snippet / numeric-token scan; IBAN (detected via `iban` in field name) → upper+strip compare; Literal / None → flags stay None; else string substring. - _unwrap_optional handles BOTH typing.Union AND types.UnionType so `Decimal | None` (PEP 604, what get_type_hints emits on 3.12+) resolves correctly — caught by the integration-style test_writes_flags_and_counters. - Number comparator scans numeric tokens in the snippet so labels ("Closing balance CHF 1'234.56") don't mask the match. - apply_reliability_flags mutates the passed ProvenanceData in place and writes verified_fields / text_agreement_fields to quality_metrics. Tests cover each comparator, Literal/None skip, short-value skip (strings and numerics), Decimal via optional union, and end-to-end flag+counter writing against a Pydantic use-case schema that mirrors bank_statement_header. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 11:01:19 +02:00
goldstein	2d22115893	feat(provenance): normalisers + short-value skip rule (spec §6) (#7) Normalizer primitives land.	2026-04-18 08:56:45 +00:00
Dirk Riemann	527fc620fe	feat(provenance): normalisers + short-value skip rule (spec §6) Pure functions the ReliabilityStep will compose to compare extracted values against OCR snippets (and context.texts). Kept in one module so every rule is directly unit-testable without pulling in the step ABC. Highlights: - `normalize_string`: NFKC + casefold + strip common punctuation (. , : ; ! ? () [] {} / \\ ' " `) + collapse whitespace. Substring-compatible. - `normalize_number`: returns the canonical "[-]DDD.DD" form (always 2dp) after stripping currency symbols. Heuristic separator detection handles Swiss-German apostrophes ("1'234.56"), de-DE commas ("1.234,56"), and plain ASCII ("1234.56" / "1234.5" / "1234"). Accepts native int/float/ Decimal as well as str. - `normalize_date`: dateutil parse with dayfirst=True → ISO YYYY-MM-DD. Date and datetime objects short-circuit to their isoformat(). - `normalize_iban`: uppercase + strip whitespace. Format validation is the call site's job; this is pure canonicalisation. - `should_skip_text_agreement`: dispatches on type + value. Literal → skip, None → skip, numeric |v|<10 → skip, len(str) ≤ 2 → skip. Numeric check runs first so `10` (len("10")==2) is treated on the numeric side (not skipped) instead of tripping the string length rule. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:56:31 +02:00
goldstein	b2ff27c1ca	feat(segmentation): SegmentIndex + prompt-text formatter (spec §9.1) (#6) SegmentIndex lands.	2026-04-18 08:54:02 +00:00
Dirk Riemann	1321d57354	feat(segmentation): SegmentIndex + prompt-text formatter (spec §9.1) Builds the ID <-> on-page-anchor map used by both the GenAIStep (to emit the segment-tagged user message) and the provenance mapper (to resolve LLM-cited IDs back to bbox/text/file_index). Design notes: - `build()` is a classmethod so the pipeline constructs the index in one place (OCRStep) and passes the constructed instance along in the internal context. No mutable global state; tests build indexes inline from fake OCR fixtures. - Per-page metadata (file_index) arrives via a parallel `list[PageMetadata]` rather than being smuggled into OCRResult. Keeps segmentation decoupled from ingestion — the OCR engine legitimately doesn't know which file a page came from. - Page-tag lines (`<page …>` / `</page>`) are filtered via a regex so the LLM can never cite them as provenance. `line_idx_in_page` increments only for real lines so the IDs stay dense (p1_l0, p1_l1, ...). - Bounding-box normalisation divides x-coords by page width, y-coords by page height. Zero dimensions (defensive) pass through unchanged. - `to_prompt_text(context_texts=[...])` appends paperless-style texts untagged, separated from the tagged body by a blank line (spec §7.2b). Deterministic for prompt caching. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:53:46 +02:00
goldstein	810979e416	feat(use_cases): registry + bank_statement_header (spec §7) (#5) First use case lands.	2026-04-18 08:51:58 +00:00
Dirk Riemann	b80c7952f7	feat(use_cases): registry + bank_statement_header (spec §7) First use case lands. The schema is intentionally flat — nine scalar fields, no nested arrays — because Ollama's structured-output guidance stays most reliable when the top level has only scalars, and every field we care about (bank_name, IBAN, period, opening/closing balance) can be rendered as one. Registration is explicit in `use_cases/__init__.py`, not a side effect of importing the use-case module. That keeps load order obvious and lets tests patch the registry without having to reload modules. `get_use_case(name)` is the one-liner adapters use; it raises `IX_001_001` with the offending name in `detail` when the lookup misses, which keeps log-scrape simple. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:51:43 +02:00
goldstein	230068e484	feat(contracts): ResponseIX + Provenance + Job (spec §3, §9.3) (#4) Lands the outgoing-response data contracts.	2026-04-18 08:50:37 +00:00
Dirk Riemann	02db3b05cc	feat(contracts): ResponseIX + Provenance + Job envelope (spec §3, §9.3) Completes the data-contract layer. Highlights: - `ResponseIX.context` is an internal mutable accumulator used by pipeline steps (pages, files, texts, use_case classes, segment index). It MUST NOT leak into the serialised response, so we mark the field with `Field(exclude=True)` and carry the shape in a small `_InternalContext` sub-model with `extra="allow"` so steps can stash arbitrary state without schema churn. Tested: `model_dump()` and `model_dump_json()` both drop it. - `FieldProvenance` gains `provenance_verified: bool | None` and `text_agreement: bool | None` — the two MVP reliability flags written by the new ReliabilityStep. Both default None so rows predating the ReliabilityStep (empty LLM output, cloud-import replay) parse cleanly. - `quality_metrics` stays a free-form `dict[str, Any]` — the MVP adds `verified_fields` and `text_agreement_fields` counters without carving them into the schema, which keeps future metric additions free. - `Job.status` and `Job.callback_status` are `Literal[...]` so Pydantic rejects unknown states at the edge. Invariant (`status='done' iff response.error is None`) stays worker-enforced — callers sometimes hydrate in-flight rows and we do not want validation to reject them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:50:22 +02:00
goldstein	5990218172	feat(contracts): RequestIX + Context + Options (spec §3) (#3) Lands the incoming-request Pydantic v2 contracts.	2026-04-18 08:47:47 +00:00
Dirk Riemann	181cc0fbea	feat(contracts): RequestIX + Context + Options per spec §3 Adds the incoming-request data contracts as Pydantic v2 models. Matches the MVP spec §3 exactly — fields dropped from the reference spec (use_vision, reasoning_effort, version, ...) stay out, and `extra="forbid"` catches any caller that sends them so drift surfaces immediately instead of silently. Context.files is `list[str | FileRef]`: plain URLs stay str, dict entries parse as FileRef. This keeps the common case (public URL) one-liner while still supporting Paperless-style auth headers and per-file size caps. ix_id stays optional with a docstring warning that callers MUST NOT set it — the transport layer assigns the 16-char hex handle on insert. The field is present so `Job` round-trips out of the store. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:47:31 +02:00
goldstein	ebdba99d9f	feat(errors): IXException + IXErrorCode (spec §8) (#2) Lands the single exception type and ten IX_* codes used throughout the pipeline.	2026-04-18 08:46:19 +00:00
Dirk Riemann	ae595c937a	feat(errors): add IXException + IXErrorCode per spec §8 Adds the single exception type used throughout the pipeline. Every failure maps to one of the ten IX_* codes from the MVP spec §8 with a stable machine-readable code and an optional free-form detail. The `str()` form is log-scrapable with a single regex (`IX_xxx_xxx: <msg> (detail=...)`), so mammon-side reliability UX can classify failures without brittle string parsing. Enum values equal names so callers can serialise either. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:46:01 +02:00
goldstein	663cb4ae10	feat(scaffold): project skeleton with uv + pytest + forgejo CI (#1) Lands Task 1.1 from the MVP plan: empty-project skeleton so later tasks have somewhere to land. Local tests + ruff pass. CI trigger fix included so feat branches get runs going forward.	2026-04-18 08:42:56 +00:00
Dirk Riemann	4120d106aa	ci: trigger re-run	2026-04-18 10:41:57 +02:00
Dirk Riemann	097ebf5db7	ci: run on every push (not just main) so feat branches also get CI Matches mammon's pattern more closely and makes PR CI reliable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:40:44 +02:00
Dirk Riemann	7e141829ac	fix(ci): create empty tests/integration so pytest doesn't error on missing dir Integration tests land in Chunk 3; until then CI needs the directory to exist. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:39:26 +02:00
Dirk Riemann	a71f023ed9	fix(ci): match mammon's Forgejo Actions pattern (no explicit container image) The previous python:3.12-slim container lacked node, which actions/checkout@v4 requires. The Forgejo runner's default image includes node + apt + curl, so we can bootstrap python + uv the same way mammon does. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:37:56 +02:00
Dirk Riemann	57cdfd73fb	feat(scaffold): project skeleton with uv + pytest + forgejo CI - pyproject.toml: runtime deps (FastAPI, SQLAlchemy async, Pydantic, PyMuPDF, python-magic, Pillow, dateutil), dev group (pytest, pytest-asyncio, pytest-httpx, ruff, mypy), optional `ocr` extra that pulls surya-ocr + torch (kept optional so CI without GPU can run the base package). - pytest config: asyncio_mode=auto; `live` marker for tests that need a real Ollama/Surya (gated on IX_TEST_OLLAMA=1). - Single smoke test (tests/unit/test_scaffolding.py) verifies the package imports and exposes __version__ — keeps CI green until the real test modules land in later chunks. - .forgejo/workflows/ci.yml: runs ruff + pytest against a Postgres 16 service container. Explicit IX_TEST_MODE=fake keeps real-client tests out. - .env.example: every IX_* var from spec §9 with on-prem-friendly defaults. - uv.lock committed for reproducible builds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:36:43 +02:00
Dirk Riemann	86538ee8de	Implementation plan for ix MVP Detailed, TDD-structured plan with 5 chunks covering ~30 feature-branch tasks from foundation scaffolding through first live deploy + E2E smoke. Each task is one PR; pipeline core comes hermetic-first, real Surya/Ollama clients in Chunk 4, containerization + first deploy in Chunk 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:34:30 +02:00
Dirk Riemann	5e007b138d	Address spec review — auth, timeouts, lifecycle, error codes - FileRef type added so callers (mammon/Paperless) can pass Authorization headers alongside URLs. context.files is now list[str | FileRef]. - Job lifecycle state machine pinned down, including worker-startup sweep for rows stuck in 'running' after a crash. - Explicit IX_002_000 / IX_002_001 codes for Ollama unreachable and structured-output schema violations, with per-call timeout IX_GENAI_CALL_TIMEOUT_SECONDS distinct from the per-job timeout. - IX_000_007 code for file-fetch failures; per-file size, connect, and read timeouts configurable via env. - ReliabilityStep: Literal-typed fields and None values explicitly skipped from provenance verification (with reason); dates parse both sides before ISO comparison. - /healthz semantics pinned down (CUDA + Surya loaded; Ollama reachable AND model available). /metrics window is last 24h. - (client_id, request_id) is UNIQUE in ix_jobs, matching the idempotency claim. - Deploy-failure workflow uses `git revert` forward commit, not force-push — aligned with AGENTS.md habits. - Dockerfile / compose require --gpus all. Pre-deploy requires `ollama pull gpt-oss:20b`; /healthz verifies before deploy completes. - CI clarified: Forgejo Actions runners are GPU-less and LAN-disconnected; all inference is stubbed there. Real-Ollama tests behind IX_TEST_OLLAMA=1. - Fixture redaction stance: synthetic-template PDF committed; real redacted fixtures live out-of-repo. - Deferred list picks up use_case URL/Base64, callback retries, multi-container workers. quality_metrics retains reference-spec counters plus the two new MVP ones. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:28:43 +02:00
Dirk Riemann	124403252d	Initial design: on-prem LLM extraction microservice MVP Establishes ix as an async, on-prem, LLM-powered structured extraction microservice. Full reference spec stays in docs/spec-core-pipeline.md; MVP spec (strict subset — Ollama only, Surya OCR, REST + Postgres-queue transports in parallel, in-repo use cases, provenance-based reliability signals) lives at docs/superpowers/specs/2026-04-18-ix-mvp-design.md. First use case: bank_statement_header (feeds mammon's needs_parser flow). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:23:17 +02:00
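
Commit dcd1bc764a describes `Timer` as a plain context manager that appends one `{step, elapsed_seconds}` entry on exit whether or not the body raised. The following is only a minimal sketch of that behaviour, not the repository's code: the constructor signature, the use of `time.perf_counter`, and the plain `list` standing in for `response.metadata.timings` are all assumptions made for illustration.

```python
import time


class Timer:
    """Hypothetical sketch: records one timing entry on exit, even on failure."""

    def __init__(self, step: str, timings: list) -> None:
        self.step = step          # label for the entry (assumed: the step name)
        self.timings = timings    # assumed stand-in for response.metadata.timings
        self._start = 0.0

    def __enter__(self) -> "Timer":
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        elapsed = time.perf_counter() - self._start
        # Append unconditionally so failed steps still show up in the timeline.
        self.timings.append({"step": self.step, "elapsed_seconds": elapsed})
        return False  # never swallow the exception; the caller decides


timings: list = []

with Timer("SetupStep", timings):
    pass

try:
    with Timer("OCRStep", timings):
        raise RuntimeError("simulated step failure")
except RuntimeError:
    pass

print([entry["step"] for entry in timings])
```

Returning `False` from `__exit__` lets the exception propagate, which fits the commit's note that the runner, not the timer, decides how to handle an `IXException`.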

43 commits
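
Commit 527fc620fe states the short-value skip rule for `should_skip_text_agreement`: Literal → skip, None → skip, numeric |v| < 10 → skip, len(str) ≤ 2 → skip, with the numeric check running before the string-length check. The sketch below restates that ordering under assumptions: the `(value, field_type)` signature and the treatment of `bool` as non-numeric are guesses for illustration and may differ from the actual module.

```python
from decimal import Decimal
from typing import Any, Literal, get_origin


def should_skip_text_agreement(value: Any, field_type: Any) -> bool:
    """Hypothetical sketch of the skip rule described in the commit message."""
    # Literal-typed fields are never compared against text.
    if get_origin(field_type) is Literal:
        return True
    # Nothing to compare when the value is missing.
    if value is None:
        return True
    # Numeric check runs first, so 10 is judged by magnitude, not by len("10").
    # Assumption: bool is excluded even though it subclasses int.
    if isinstance(value, (int, float, Decimal)) and not isinstance(value, bool):
        return abs(value) < 10
    # Very short strings match too easily to be meaningful.
    return len(str(value)) <= 2


print(should_skip_text_agreement(10, int))
print(should_skip_text_agreement("AB", str))
```

Running the numeric branch first is what keeps `10` from being skipped by the two-character string rule, as the commit message points out.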