Detailed, TDD-structured plan with 5 chunks covering ~30 feature-branch tasks from foundation scaffolding through first live deploy + E2E smoke. Each task is one PR; pipeline core comes hermetic-first, real Surya/Ollama clients in Chunk 4, containerization + first deploy in Chunk 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
InfoXtractor MVP — Implementation Plan
For agentic workers: REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
Goal: Ship an on-prem, async, LLM-powered structured-extraction microservice with one end-to-end use case (bank_statement_header) reachable from mammon via REST.
Architecture: FastAPI container + single asyncio worker + shared postgis job store. Transport-agnostic pipeline core with pluggable OCRClient (Surya) and GenAIClient (Ollama). Provenance-based reliability signals per field. No cloud services.
Tech Stack: Python 3.12, FastAPI, uvicorn, SQLAlchemy 2.0 async, asyncpg, Alembic, Pydantic v2, pydantic-settings, httpx, PyMuPDF, python-magic, Pillow, surya-ocr (CUDA), Ollama REST API, pytest (+ pytest-asyncio, pytest-postgresql fixture), uv for dep management.
Spec reference: docs/superpowers/specs/2026-04-18-ix-mvp-design.md (approved 2026-04-18 after 2 review rounds).
Habits (inherited from mammon; see AGENTS.md):
- Every task lands as its own feature branch: `feat/<task-slug>`.
- TDD: write failing test → minimal implementation → green → refactor → commit.
- Every commit updates code + tests + docs (`AGENTS.md`, `README.md`, section in this plan) in one shot.
- Push: `git push forgejo feat/<task-slug>` → PR via Forgejo API → wait for CI green → merge → `git push server main` to deploy.
- Never skip hooks, never force-push `main`, never amend merged commits.
- After each deploy, run `scripts/e2e_smoke.py` against the live service.
File structure (target)
infoxtractor/
├── AGENTS.md # Guiding principles, habits, stack, deploy
├── README.md # One-paragraph + pointers
├── Dockerfile # nvidia/cuda:12.4 + python 3.12 + surya
├── docker-compose.yml # GPU reservation, env_file, monitoring labels
├── pyproject.toml # deps via uv; no setup.py
├── uv.lock
├── alembic.ini
├── alembic/
│ ├── env.py
│ └── versions/
│ └── 001_initial_ix_jobs.py
├── src/ix/
│ ├── __init__.py
│ ├── app.py # FastAPI create_app factory + lifespan
│ ├── config.py # AppConfig (pydantic-settings)
│ ├── errors.py # IXException + IX_* error codes
│ ├── logging.py # JSON formatter, ix_id context
│ ├── contracts/ # Pydantic data contracts
│ │ ├── __init__.py
│ │ ├── request.py # RequestIX, Context, FileRef, Options, *Options
│ │ ├── response.py # ResponseIX, IXResult, OCRResult, OCRDetails, Page, Line, Metadata
│ │ ├── provenance.py # ProvenanceData, FieldProvenance, ExtractionSource, BoundingBox, SegmentCitation
│ │ └── job.py # Job envelope (status enum, lifecycle)
│ ├── use_cases/
│ │ ├── __init__.py # REGISTRY
│ │ └── bank_statement_header.py
│ ├── pipeline/
│ │ ├── __init__.py
│ │ ├── step.py # Step ABC
│ │ ├── pipeline.py # Pipeline orchestrator + Timer
│ │ ├── setup_step.py
│ │ ├── ocr_step.py
│ │ ├── genai_step.py
│ │ ├── reliability_step.py
│ │ └── response_handler_step.py
│ ├── segmentation/
│ │ ├── __init__.py
│ │ └── segment_index.py # SegmentIndex + prompt formatting
│ ├── provenance/
│ │ ├── __init__.py
│ │ ├── mapper.py # map_segment_refs_to_provenance
│ │ ├── normalize.py # normalizers (string, number, date, iban)
│ │ └── verify.py # provenance_verified + text_agreement logic
│ ├── ocr/
│ │ ├── __init__.py
│ │ ├── client.py # OCRClient Protocol
│ │ ├── fake.py # FakeOCRClient (for tests)
│ │ └── surya_client.py # SuryaOCRClient (real)
│ ├── genai/
│ │ ├── __init__.py
│ │ ├── client.py # GenAIClient Protocol, GenAIInvocationResult, GenAIUsage
│ │ ├── fake.py # FakeGenAIClient (for tests)
│ │ └── ollama_client.py # OllamaClient (real)
│ ├── ingestion/
│ │ ├── __init__.py
│ │ ├── fetch.py # file downloader w/ auth headers, timeouts, size cap
│ │ ├── mime.py # python-magic wrapper
│ │ └── pages.py # DocumentIngestor: PDF/image/text → Page list
│ ├── store/
│ │ ├── __init__.py
│ │ ├── models.py # SQLAlchemy ORM for ix_jobs
│ │ ├── engine.py # lazy async engine, session factory
│ │ └── jobs_repo.py # claim_next, insert, get, list_by_correlation, sweep_orphans, update
│ ├── worker/
│ │ ├── __init__.py
│ │ ├── loop.py # worker task: claim → run pipeline → deliver callback
│ │ └── callback.py # one-shot webhook delivery
│ ├── adapters/
│ │ ├── __init__.py
│ │ ├── rest/
│ │ │ ├── __init__.py
│ │ │ ├── routes.py # /jobs, /jobs/{id}, /jobs (list), /healthz, /metrics
│ │ │ └── schemas.py # request/response bodies
│ │ └── pg_queue/
│ │ ├── __init__.py
│ │ └── listener.py # LISTEN ix_jobs_new + 10s fallback poll
│ └── metrics/
│ ├── __init__.py
│ └── counters.py # plain-JSON counter queries
├── tests/
│ ├── __init__.py
│ ├── conftest.py
│ ├── fixtures/
│ │ ├── synthetic_giro.pdf # generated from template
│ │ └── ocr_canned/ # canned Surya outputs for integration tests
│ ├── unit/
│ │ ├── test_contracts.py
│ │ ├── test_errors.py
│ │ ├── test_segment_index.py
│ │ ├── test_provenance_normalize.py
│ │ ├── test_provenance_verify.py
│ │ ├── test_provenance_mapper.py
│ │ ├── test_setup_step.py
│ │ ├── test_ocr_step.py
│ │ ├── test_genai_step.py
│ │ ├── test_reliability_step.py
│ │ ├── test_response_handler_step.py
│ │ ├── test_pipeline.py
│ │ ├── test_use_case_registry.py
│ │ ├── test_ingestion_fetch.py
│ │ ├── test_ingestion_pages.py
│ │ └── test_use_case_bank_statement_header.py
│ ├── integration/
│ │ ├── test_jobs_repo.py
│ │ ├── test_rest_adapter.py
│ │ ├── test_pg_queue_adapter.py
│ │ ├── test_worker_loop.py
│ │ └── test_pipeline_end_to_end.py # fakes, real DB
│ └── live/
│ └── test_ollama_surya_smoke.py # gated on IX_TEST_OLLAMA=1
├── scripts/
│ ├── e2e_smoke.py # post-deploy gate (Mac → :8994)
│ ├── create_fixture_pdf.py # builds synthetic_giro.pdf deterministically
│ └── forgejo_pr.py # wrapper: create branch → PR → merge
├── .env.example
├── .gitignore
└── docs/
├── spec-core-pipeline.md
└── superpowers/
├── specs/2026-04-18-ix-mvp-design.md
└── plans/2026-04-18-ix-mvp-implementation.md (this file)
Boundary rules:
- `pipeline/` knows about `contracts/`, `segmentation/`, `provenance/`, `ocr.client`, `genai.client`. NOT `store/`, `adapters/`, `worker/`.
- `adapters/` knows about `store/` and `contracts/`, NOT `pipeline/` directly (it hands requests to the job store; the worker pulls from the store).
- `worker/` knows about `store/`, `pipeline/`, `contracts/`. Bridges the two.
- `store/` knows only about `contracts/` (for JSONB serialization).
- `ocr/surya_client.py` and `genai/ollama_client.py` are the only files that import external libraries beyond stdlib/FastAPI/SQLAlchemy — all other modules stay hermetic.
Chunk 1: Foundation (scaffolding + contracts + use case + SegmentIndex)
Purpose: Set up the project skeleton and land the data contracts, error model, use-case registry with the first use case, and SegmentIndex. No pipeline, no transport, no LLM — just the Pydantic/types core plus the scaffolding to run tests and CI.
Task 1.1: Project scaffolding
Branch: feat/scaffold
Files:
- Create: `pyproject.toml`, `.python-version`, `uv.lock`, `.env.example`
- Create: `src/ix/__init__.py`
- Create: `tests/__init__.py`, `tests/conftest.py`
- Create: `.forgejo/workflows/ci.yml` (pytest on push + PR)
- Create: `pytest.ini` with `asyncio_mode=auto`

- [ ] Write `pyproject.toml` with deps: `fastapi`, `uvicorn[standard]`, `sqlalchemy[asyncio]>=2`, `asyncpg`, `alembic`, `pydantic>=2`, `pydantic-settings`, `httpx`, `pymupdf`, `python-magic`, `pillow`, `python-dateutil`. Dev: `pytest`, `pytest-asyncio`, `pytest-httpx`, `ruff`, `mypy`.
- [ ] `.env.example`: every var from spec §9, all placeholders obvious (`<password>`, `<host-internal-url>`).
- [ ] `pytest.ini`: `asyncio_mode = auto`, collect from `tests/`.
- [ ] `.forgejo/workflows/ci.yml`: runs `uv sync && uv run pytest tests/unit tests/integration -v`; service container `postgres:16`; env `IX_POSTGRES_URL` points at the service. Excludes `tests/live/` (real Ollama).
- [ ] Commit, push branch, create PR, merge. No deploy yet (nothing to deploy) — stop at merge.
Task 1.2: Error model
Branch: feat/errors
Files:
- Create: `src/ix/errors.py`
- Create: `tests/unit/test_errors.py`

- [ ] Write failing tests for `IXException` and every `IX_*` code in spec §8. Each code is a class attribute: `IXError.IX_000_000 = "IX_000_000: request_ix is None"`. The exception carries `code` + `detail`.
- [ ] Implement an enum-like class with `__str__` producing `"IX_000_000: message (detail=...)"`.
- [ ] Green, commit, PR, merge.
Task 1.3: Data contracts — RequestIX / Options / Context / FileRef
Branch: feat/contracts-request
Files:
- Create: `src/ix/contracts/request.py`
- Create: `src/ix/contracts/__init__.py`
- Create: `tests/unit/test_contracts.py`

- [ ] Failing tests: round-trip `RequestIX.model_validate_json(...)` with each shape in spec §3. String-or-FileRef union. Defaults: `include_provenance=True`, `service="surya"`, etc. Validation errors on unknown fields.
- [ ] Implement Pydantic models per spec §3: `Options`, `OCROptions`, `GenAIOptions`, `ProvenanceOptions`, `Context`, `FileRef`, `RequestIX`.
- [ ] Green, commit, PR, merge.
Task 1.4: Data contracts — ResponseIX / Provenance / IXResult / OCRResult / Metadata / Job
Branch: feat/contracts-response
Files:
- Create: `src/ix/contracts/response.py`
- Create: `src/ix/contracts/provenance.py`
- Create: `src/ix/contracts/job.py`
- Modify: `src/ix/contracts/__init__.py` (export)
- Modify: `tests/unit/test_contracts.py`

- [ ] Failing tests: `FieldProvenance` with the new `provenance_verified`/`text_agreement` fields; `None`-allowed. `quality_metrics` keys: `fields_with_provenance`, `total_fields`, `coverage_rate`, `invalid_references`, `verified_fields`, `text_agreement_fields`. `ResponseIX.context` excluded from `model_dump` (use `exclude`). `Job` envelope: `status` is a Literal, `callback_status` starts as `None`.
- [ ] Implement per spec §3 + §9.3. `ResponseIX.context` uses `Field(exclude=True)`; a sibling internal model `_InternalContext` holds `pages`, `files`, `texts`, `use_case_request`, `use_case_response`, `segment_index`. Keep it simple: one class, `Field(exclude=True)` on the attribute.
- [ ] Green, commit, PR, merge.
Task 1.5: Use-case registry and first use case
Branch: feat/use-case-bank-statement-header
Files:
- Create: `src/ix/use_cases/__init__.py` (REGISTRY)
- Create: `src/ix/use_cases/bank_statement_header.py`
- Create: `tests/unit/test_use_case_registry.py`
- Create: `tests/unit/test_use_case_bank_statement_header.py`

- [ ] Failing tests: `REGISTRY["bank_statement_header"]` returns `(Request, BankStatementHeader)`; unknown name raises `IX_001_001`; `Request().system_prompt` contains the substring "extract header metadata".
- [ ] Implement per spec §7. Pydantic models. Register either on import of the module (side-effect registration) or via explicit registry assembly in `__init__.py`. Prefer explicit — `REGISTRY = {"bank_statement_header": (Request, BankStatementHeader)}` — no import-time side effects.
- [ ] Green, commit, PR, merge.
Task 1.6: SegmentIndex
Branch: feat/segment-index
Files:
- Create: `src/ix/segmentation/__init__.py`
- Create: `src/ix/segmentation/segment_index.py`
- Create: `tests/unit/test_segment_index.py`

- [ ] Failing tests (take from spec §9.1):
  - `build()` assigns IDs `p1_l0`, `p1_l1`, … across the flat page list. `<page>` tag lines are excluded from IDs.
  - `lookup_segment("p1_l0")` returns `{page, bbox, text, file_index}`; unknown → `None`.
  - `to_prompt_text()` emits `"[p1_l0] text\n…"` and appends raw `context.texts` untagged at the end.
  - BoundingBox normalization divides by page width/height.
- [ ] Implement. `SegmentIndex` is built from an `OCRResult` + `pages` metadata; holds `_id_to_position: dict[str, dict]` and `_ordered_ids: list[str]`.
- [ ] Green, commit, PR, merge.
Task 1.7: Provenance normalizers
Branch: feat/provenance-normalize
Files:
- Create: `src/ix/provenance/__init__.py`
- Create: `src/ix/provenance/normalize.py`
- Create: `tests/unit/test_provenance_normalize.py`

- [ ] Failing tests for each normalizer in spec §6 ReliabilityStep:
  - String: `" FOO bar!!! "` → `"foo bar"` (after NFKC + casefold + whitespace collapse + punctuation strip).
  - Number: `"CHF 1'234.56"` ↔ `Decimal("1234.56")` → same canonical form.
  - Date: `"31.03.2026"` ↔ `date(2026, 3, 31)` → `"2026-03-31"` via `dateutil` (`dayfirst=True`).
  - IBAN: `"de 89 3704 0044 0532 0130 00"` → `"DE89370400440532013000"`.
  - Short-value rule: `_should_skip_text_agreement("0", field_type=int)` → `True`; `"AB"` for str → `True`.
- [ ] Implement. Pure functions, no external state, fully unit-testable.
- [ ] Green, commit, PR, merge.
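Two of the normalizers above can be sketched directly; the exact punctuation rule is an assumption here, the authoritative definitions live in spec §6.

```python
import re
import unicodedata


def normalize_string(value: str) -> str:
    """NFKC normalize, casefold, strip punctuation, collapse whitespace."""
    value = unicodedata.normalize("NFKC", value).casefold()
    value = re.sub(r"[^\w\s]", "", value)      # punctuation strip (assumed rule)
    return re.sub(r"\s+", " ", value).strip()  # whitespace collapse


def normalize_iban(value: str) -> str:
    """Drop all whitespace and uppercase, the usual IBAN compact form."""
    return re.sub(r"\s+", "", value).upper()
```

Keeping these as pure functions is what lets the text-agreement check later compare an LLM value and an OCR segment without caring which side was messier.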
Task 1.8: Provenance mapper + verifier
Branch: feat/provenance-mapper-verifier
Files:
- Create: `src/ix/provenance/mapper.py` (`map_segment_refs_to_provenance` per spec §9.4)
- Create: `src/ix/provenance/verify.py` (`verify_field_value` + `text_agreement_for_field`)
- Create: `tests/unit/test_provenance_mapper.py`
- Create: `tests/unit/test_provenance_verify.py`

- [ ] Failing tests for the mapper: given a fake `SegmentIndex` + fake `segment_citations` → correct `FieldProvenance.sources`; `invalid_references` count; value resolution via dot-path (`"result.invoice_number"`, `"items.0.name"`); `max_sources_per_field` cap.
- [ ] Failing tests for the verifier: `provenance_verified` true/false per field type; `text_agreement` with and without `context.texts`; Literal → `None`; `None` value → `None`; short value → `text_agreement` `None`; date parses both sides.
- [ ] Implement; pure functions.
- [ ] Green, commit, PR, merge.
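The dot-path value resolution the mapper tests exercise can be sketched as a small walker; `resolve_dot_path` is a hypothetical helper name, not necessarily what `mapper.py` will call it.

```python
from typing import Any


def resolve_dot_path(obj: Any, path: str) -> Any:
    """Walk dict keys, list indices, and attributes along a dotted path,
    e.g. "result.invoice_number" or "items.0.name"."""
    current = obj
    for part in path.split("."):
        if isinstance(current, dict):
            current = current[part]
        elif isinstance(current, (list, tuple)):
            current = current[int(part)]   # numeric segment indexes a list
        else:
            current = getattr(current, part)
    return current
```

Treating list indices as just another path segment keeps one resolver working for both flat headers and nested line-item schemas.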
Chunk 1 end state: pytest tests/unit runs green locally and in Forgejo Actions. No runtime service yet. ~8 merged PRs to main. Time estimate: one focused afternoon.
Chunk 2: Pipeline core
Purpose: Wire up the Step ABC + Pipeline orchestrator + all five steps + fake OCR/GenAI clients. At end of chunk, pipeline runs end-to-end with fakes and produces a full ResponseIX for bank_statement_header, entirely hermetic.
Task 2.1: Step ABC + Pipeline orchestrator + Timer
Branch: feat/pipeline-core
Files:
- Create: `src/ix/pipeline/__init__.py`
- Create: `src/ix/pipeline/step.py`
- Create: `src/ix/pipeline/pipeline.py`
- Create: `tests/unit/test_pipeline.py`

- [ ] Failing tests using synthetic steps: order preserved; `validate=False` skips the step; `validate` raise → error written + abort; `process` raise → error written + abort; each step's elapsed seconds added to `metadata.timings`.
- [ ] Implement per spec §3/§4. `Pipeline(steps=[...])._execute_step` wraps each step in a timer + try/except and sets `response_ix.error` on raise.
- [ ] Green, commit, PR, merge.
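The skip/abort/timing contract above can be sketched like this, with a simplified `Response` standing in for `ResponseIX` and a synchronous `run` where the real orchestrator is presumably async:

```python
import time
from abc import ABC, abstractmethod


class Response:
    def __init__(self) -> None:
        self.error: str | None = None
        self.timings: dict[str, float] = {}


class Step(ABC):
    name: str = "step"

    def validate(self, response: Response) -> bool:
        return True  # False → skip this step; raise → abort the pipeline

    @abstractmethod
    def process(self, response: Response) -> None: ...


class Pipeline:
    def __init__(self, steps: list[Step]) -> None:
        self.steps = steps

    def run(self, response: Response) -> Response:
        for step in self.steps:
            start = time.monotonic()
            try:
                if step.validate(response):
                    step.process(response)
            except Exception as exc:
                response.error = f"{step.name}: {exc}"
                break  # abort on first failure; error stays on the response
            finally:
                response.timings[step.name] = time.monotonic() - start
        return response
```

The `finally` clause is the important detail: even an aborting step leaves its elapsed time in `timings`, so slow failures are visible in the response metadata.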
Task 2.2: OCRClient and GenAIClient protocols + fakes
Branch: feat/client-protocols
Files:
- Create: `src/ix/ocr/__init__.py`, `src/ix/ocr/client.py`, `src/ix/ocr/fake.py`
- Create: `src/ix/genai/__init__.py`, `src/ix/genai/client.py`, `src/ix/genai/fake.py`
- Create: `tests/unit/test_ocr_fake.py`, `tests/unit/test_genai_fake.py`

- [ ] Failing tests: `FakeOCRClient(canned=OCRResult(...))` returns the canned result; `FakeGenAIClient(parsed=MyModel(...))` returns a `GenAIInvocationResult` with that parsed instance + stubbed usage.
- [ ] Implement Protocols + fakes. Protocols are `@runtime_checkable`.
- [ ] Green, commit, PR, merge.
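The Protocol-plus-fake pattern looks roughly like this; the method name, shapes, and synchronous signature are assumptions for illustration (the real clients are presumably async and return `OCRResult`).

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class OCRClient(Protocol):
    def recognize(self, pages: list[bytes]) -> dict: ...


class FakeOCRClient:
    """Returns a canned result and records calls, keeping tests hermetic."""

    def __init__(self, canned: dict) -> None:
        self.canned = canned
        self.calls: list[list[bytes]] = []

    def recognize(self, pages: list[bytes]) -> dict:
        self.calls.append(pages)
        return self.canned
```

Because the fake satisfies the Protocol structurally, nothing in `pipeline/` ever imports `fake.py` or `surya_client.py` by name — the boundary rules above stay intact.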
Task 2.3: Ingestion — fetch + MIME + pages
Branch: feat/ingestion
Files:
- Create: `src/ix/ingestion/__init__.py`
- Create: `src/ix/ingestion/fetch.py`
- Create: `src/ix/ingestion/mime.py`
- Create: `src/ix/ingestion/pages.py`
- Create: `tests/unit/test_ingestion_fetch.py` (pytest-httpx mocks)
- Create: `tests/unit/test_ingestion_pages.py` (fixture PDFs/images)

- [ ] Failing tests:
  - `fetch_file(FileRef, …)` passes headers; size cap raises `IX_000_007`; timeout raises `IX_000_007`; non-2xx raises `IX_000_007`.
  - `detect_mime(bytes)` classifies PDF/PNG/JPEG/TIFF correctly; unknown raises `IX_000_005`.
  - `DocumentIngestor.build_pages(files, texts)`: PDF with 3 pages → 3 `Page` objects with `page_no`/`width`/`height`; multi-frame TIFF → multiple Pages; plain text entry → one Page; >100 PDF pages raises `IX_000_006`.
- [ ] Implement. `fetch_file` uses an httpx `AsyncClient` with timeouts from config and `stream=True` to enforce the size cap incrementally. `DocumentIngestor` uses PyMuPDF for PDFs, PIL for images.
- [ ] Green, commit, PR, merge.
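The incremental size-cap idea reduces to a pure helper over a chunk iterator; in production the chunks would come from httpx's streaming response and the failure would be `IXException(IX_000_007)` rather than the `ValueError` used in this sketch.

```python
from typing import Iterable


def read_with_cap(chunks: Iterable[bytes], max_bytes: int) -> bytes:
    """Accumulate a byte stream, aborting as soon as the cap is exceeded
    rather than after the whole body has been buffered."""
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) > max_bytes:
            raise ValueError(f"IX_000_007: file exceeds {max_bytes} bytes")
    return bytes(buf)
```

Enforcing the cap per chunk (not via Content-Length, which a server can omit or lie about) is what makes the limit trustworthy for arbitrary upstream file hosts.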
Task 2.4: SetupStep
Branch: feat/step-setup
Files:
- Create: `src/ix/pipeline/setup_step.py`
- Create: `tests/unit/test_setup_step.py`

- [ ] Failing tests: `validate` raises `IX_000_000` if the request is None; `IX_000_002` if no files + no texts. `process` downloads files (pytest-httpx mocks), assembles `response_ix.context.pages`, loads the use case; unknown use case → `IX_001_001`.
- [ ] Implement per spec §6. Use `ingestion.fetch_file` + `DocumentIngestor`.
- [ ] Green, commit, PR, merge.
Task 2.5: OCRStep
Branch: feat/step-ocr
Files:
- Create: `src/ix/pipeline/ocr_step.py`
- Create: `tests/unit/test_ocr_step.py`

- [ ] Failing tests: `validate` raises `IX_000_004` when geometries/text/ocr_only are set but there are no files. `validate` returns `False` for pure-text requests. `process` runs `FakeOCRClient`, injects page tags, and builds a `SegmentIndex` when provenance is on.
- [ ] Implement per spec §6.
- [ ] Green, commit, PR, merge.
Task 2.6: GenAIStep
Branch: feat/step-genai
Files:
- Create: `src/ix/pipeline/genai_step.py`
- Create: `tests/unit/test_genai_step.py`

- [ ] Failing tests:
  - System prompt concatenated with the citation instruction when provenance is on.
  - Text content format: `[p1_l0] foo\n[p1_l1] bar`.
  - Response schema wrapped in `ProvenanceWrappedResponse` when provenance is on.
  - `FakeGenAIClient` returns a parsed result → written to `ix_result.result`.
  - `IX_002_000`/`IX_002_001` surfaced on client raise.
  - Provenance mapping produces `ProvenanceData` with the expected field paths.
- [ ] Implement per spec §6. Use `provenance.mapper.map_segment_refs_to_provenance`.
- [ ] Green, commit, PR, merge.
Task 2.7: ReliabilityStep
Branch: feat/step-reliability
Files:
- Create: `src/ix/pipeline/reliability_step.py`
- Create: `tests/unit/test_reliability_step.py`

- [ ] Failing tests:
  - Skipped when `include_provenance=False`.
  - Per-type dispatch: Literal field → `None` flags; `None` value → `None` flags; short value → `text_agreement=None`.
  - Dates parse both sides before comparison.
  - Counters `verified_fields` and `text_agreement_fields` written.
  - Tests using `BankStatementHeader` + concrete `ProvenanceData` → exact flag values.
- [ ] Implement using `provenance.verify` + `provenance.normalize`; dispatch via type hints on the use-case response schema (introspect via `get_type_hints`).
- [ ] Green, commit, PR, merge.
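The type-hint dispatch can be sketched with stdlib `typing` introspection; `Header` and `field_kind` are hypothetical names, and the real step would route each kind to the matching normalizer instead of returning a label.

```python
from typing import Literal, Optional, get_args, get_origin, get_type_hints


class Header:  # stand-in for a use-case response schema
    account_type: Literal["private", "business"]
    iban: str
    closing_balance: Optional[float]


def field_kind(hint) -> str:
    origin = get_origin(hint)
    if origin is Literal:
        return "literal"          # → provenance_verified / text_agreement = None
    if origin is not None:        # Optional[X] arrives as Union[X, None]
        args = [a for a in get_args(hint) if a is not type(None)]
        if len(args) == 1:
            return field_kind(args[0])
    return getattr(hint, "__name__", str(hint))


kinds = {name: field_kind(hint) for name, hint in get_type_hints(Header).items()}
```

Unwrapping `Optional` before dispatch matters: a `closing_balance: Optional[float]` should still get the number normalizer, with the `None`-value rule handled separately.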
Task 2.8: ResponseHandlerStep
Branch: feat/step-response-handler
Files:
- Create: `src/ix/pipeline/response_handler_step.py`
- Create: `tests/unit/test_response_handler_step.py`

- [ ] Failing tests per spec §8: attach OCR text; strip geometries when not requested; delete `context`.
- [ ] Implement.
- [ ] Green, commit, PR, merge.
Task 2.9: End-to-end pipeline test with fakes
Branch: feat/pipeline-e2e-fakes
Files:
- Create: `tests/unit/test_pipeline_end_to_end.py`
- Create: `tests/fixtures/synthetic_giro.pdf` (generated)
- Create: `scripts/create_fixture_pdf.py`

- [ ] `scripts/create_fixture_pdf.py` builds a deterministic PDF with known header fields (bank name, IBAN, period, balances) using reportlab or PyMuPDF. The script re-runs on demand; the output is committed.
- [ ] Failing test: feed the fixture + canned OCR + canned LLM response through the full `Pipeline([Setup, OCR, GenAI, Reliability, ResponseHandler])` and assert `response_ix.ix_result.result == expected`, `provenance_verified[closing_balance] is True`, timings populated.
- [ ] Implement — only wiring; no new logic.
- [ ] Green, commit, PR, merge.
Chunk 2 end state: Full pipeline runs in tests with fakes. No DB, no transport, no real clients. Running pytest tests/unit -v goes green end-to-end. ~9 merged PRs.
Chunk 3: Job store + REST adapter + worker loop
Purpose: Persist jobs in Postgres, expose REST endpoints, run the worker task in the FastAPI lifespan, deliver callbacks. At chunk end, the container (locally or on the server) accepts POST /jobs, runs the fake-backed pipeline against a real DB, and returns results via polling or callback.
Task 3.1: Alembic scaffolding + initial migration
Branch: feat/alembic-init
Files:
- Create: `alembic.ini`, `alembic/env.py`, `alembic/versions/001_initial_ix_jobs.py`
- Create: `src/ix/store/__init__.py`, `src/ix/store/models.py`, `src/ix/store/engine.py`

- [ ] `alembic/env.py`: async engine, `NullPool`, reads `IX_POSTGRES_URL`.
- [ ] `models.py`: `IxJob` ORM mapping to the `ix_jobs` table per spec §4.
- [ ] `001_initial_ix_jobs.py`: CREATE TABLE + indexes (including UNIQUE on `(client_id, request_id)`). No NOTIFY trigger (NOTIFY is fired by writers, not DDL).
- [ ] Smoke test: `alembic upgrade head` against a disposable postgres (via docker) creates the table and indexes. No unit test here — verified in integration tests.
- [ ] Commit, PR, merge.
Task 3.2: Config module (AppConfig)
Branch: feat/config
Files:
- Create: `src/ix/config.py`
- Create: `tests/unit/test_config.py`

- [ ] Failing tests: every env var in spec §9 loads from its `IX_*` env var; defaults match the spec.
- [ ] Implement `AppConfig` via pydantic-settings; no `.env` auto-load in tests — use `monkeypatch`.
- [ ] Green, commit, PR, merge.
Task 3.3: JobsRepo (store CRUD)
Branch: feat/jobs-repo
Files:
- Create: `src/ix/store/jobs_repo.py`
- Create: `tests/integration/test_jobs_repo.py`
- Modify: `tests/conftest.py` — add fixtures `postgres_url`, `engine`, `session_factory` (Forgejo CI service container)

- [ ] Failing tests (integration, need a real DB):
  - `insert_pending(request, callback_url)` creates a row; returns `job_id` + `ix_id`.
  - Inserting with an existing `(client_id, request_id)` returns the existing `job_id` (idempotency), status unchanged.
  - `claim_next_pending()` → locks a pending row and updates it to `running`; returns `None` if none available; concurrent callers each claim distinct rows (SKIP LOCKED).
  - `get(job_id)` returns the full `Job` with nested request/response parsed.
  - `mark_done(job_id, response)` / `mark_error(job_id, response_with_error)` / `update_callback_status(...)`.
  - `sweep_orphans(now, max_running_age)` → returns the list of rescued job IDs; their status goes back to `pending`, `attempts`++.
- [ ] Implement using SQLAlchemy 2.0 async. Each method is a single transaction.
- [ ] Green, commit, PR, merge.
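The claim query behind `claim_next_pending()` typically looks like the sketch below: `FOR UPDATE SKIP LOCKED` in the subquery is what lets concurrent workers each claim a distinct pending row without blocking on each other. Table and column names here are assumptions based on this plan.

```python
# Hypothetical claim statement for ix_jobs; the repo would execute this via
# SQLAlchemy text() inside a single transaction and parse the RETURNING row.
CLAIM_NEXT_SQL = """
UPDATE ix_jobs
   SET status = 'running', started_at = now()
 WHERE job_id = (
         SELECT job_id
           FROM ix_jobs
          WHERE status = 'pending'
          ORDER BY created_at
          FOR UPDATE SKIP LOCKED
          LIMIT 1
       )
RETURNING job_id;
"""
```

With SKIP LOCKED, a second worker's subquery simply skips the row the first worker has locked and picks the next pending one, so no explicit queue table or advisory locking is needed.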
Task 3.4: FastAPI app + REST routes
Branch: feat/rest-adapter
Files:
- Create: `src/ix/app.py`
- Create: `src/ix/adapters/__init__.py`, `src/ix/adapters/rest/__init__.py`, `src/ix/adapters/rest/routes.py`, `src/ix/adapters/rest/schemas.py`
- Create: `tests/integration/test_rest_adapter.py`

- [ ] Failing tests (integration, FastAPI TestClient + real DB):
  - `POST /jobs` with a valid body → 201, returns `{job_id, ix_id, status: "pending"}`.
  - `POST /jobs` idempotent on `(client_id, request_id)` — a second call returns the same `job_id` with 200.
  - `GET /jobs/{id}` returns the `Job` shape; 404 on unknown.
  - `GET /jobs?client_id=…&request_id=…` returns the row or 404.
  - `GET /healthz` returns JSON with `postgres`/`ollama`/`ocr` keys. In tests, `ollama` and `ocr` are mocked via a dependency-injection hook.
  - `GET /metrics` returns 24h counters.
- [ ] Implement the `create_app()` factory. Lifespan: create the engine, run `alembic upgrade head`, spawn the worker task (Task 3.5), tear down on shutdown.
- [ ] Green, commit, PR, merge.
Task 3.5: Worker loop + callback delivery
Branch: feat/worker-loop
Files:
- Create: `src/ix/worker/__init__.py`, `src/ix/worker/loop.py`, `src/ix/worker/callback.py`
- Modify: `src/ix/app.py` (lifespan spawns the worker task)
- Create: `tests/integration/test_worker_loop.py`

- [ ] Failing tests (integration):
  - Worker claims a pending job, runs a fake pipeline, writes the response, updates status to `done`.
  - On pipeline exception: status → `error`, response carries the error code.
  - With `callback_url` set and a 200 response: `callback_status` → `delivered`.
  - On callback 500 or timeout: `callback_status` → `failed`; `status` stays `done`/`error`.
  - Worker startup orphan sweep: a job left in `running` with `started_at < now - 2 * per_job_timeout` → reset to `pending`, `attempts`++.
- [ ] Implement. The worker's pipeline factory is injected — tests pass a stub; production wiring builds the real `Pipeline` with `FakeOCRClient`/`FakeGenAIClient` for now (Chunk 4 swaps them).
- [ ] Green, commit, PR, merge.
Task 3.6: Postgres queue adapter
Branch: feat/pg-queue-adapter
Files:
- Create: `src/ix/adapters/pg_queue/__init__.py`, `src/ix/adapters/pg_queue/listener.py`
- Modify: `src/ix/app.py` (lifespan spawns the listener task if enabled)
- Create: `tests/integration/test_pg_queue_adapter.py`

- [ ] Failing tests (integration):
  - Caller inserts a row directly and `NOTIFY ix_jobs_new, '<job_id>'` → worker picks it up within 1 s.
  - Missed NOTIFY (e.g., listener not started yet) → the 10 s fallback poll finds the row.
- [ ] Implement. `listener.py` opens a dedicated asyncpg connection (outside the SQLAlchemy pool) to run `LISTEN`; on notify, it sets an asyncio event that the worker's `wait_for_notify_or_poll(10s)` reacts to.
- [ ] Green, commit, PR, merge.
Chunk 3 end state: FastAPI container serves the REST API, backed by a real Postgres. Pipeline still uses fakes under the hood (real Surya + Ollama land in Chunk 4). ~6 PRs.
Chunk 4: Real OCR + real LLM clients
Purpose: Wire SuryaOCRClient and OllamaClient into production. Tests gated on IX_TEST_OLLAMA=1. Pipeline factory switches from fakes to real clients based on env.
Task 4.1: OllamaClient (real)
Branch: feat/ollama-client
Files:
- Create: `src/ix/genai/ollama_client.py`
- Create: `tests/unit/test_ollama_client.py` (uses pytest-httpx to mock)
- Create: `tests/live/test_ollama_client_live.py` (gated on `IX_TEST_OLLAMA=1`)

- [ ] Failing unit tests: `invoke` POSTs to `/api/chat` with `format=<schema>`; parses the response into the Pydantic schema; surfaces `IX_002_000` on connection error / timeout; surfaces `IX_002_001` on schema-parse failure.
- [ ] Live test: a real call to `host.docker.internal:11434` (or `192.168.68.42:11434` from the Mac) with `gpt-oss:20b` against a tiny `BankStatementHeader`-shaped schema; skipped unless `IX_TEST_OLLAMA=1`.
- [ ] Implement. httpx `AsyncClient` with a per-call timeout from config.
- [ ] Green, commit, PR, merge.
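The request body that `invoke` POSTs to `/api/chat` can be sketched as a pure builder: Ollama's structured-output support accepts a JSON schema in the `format` field. The helper name, the example schema, and the prompt contents are illustrative assumptions; only the payload shape follows the Ollama API.

```python
def build_chat_payload(model: str, system: str, user: str, schema: dict) -> dict:
    """Assemble an Ollama /api/chat body with a JSON-schema-constrained reply."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "format": schema,   # constrains the model's reply to this JSON schema
        "stream": False,    # one complete response, simpler to parse
    }


payload = build_chat_payload(
    model="gpt-oss:20b",
    system="Extract bank statement header fields.",
    user="[p1_l0] Kontoauszug ...",
    schema={"type": "object", "properties": {"iban": {"type": "string"}}},
)
```

Keeping the payload assembly separate from the httpx call makes the `IX_002_001` schema-parse path unit-testable without pytest-httpx at all.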
Task 4.2: SuryaOCRClient (real)
Branch: feat/surya-client
Files:
- Create: `src/ix/ocr/surya_client.py`
- Create: `tests/unit/test_surya_client.py` (mocked `surya.recognition.RecognitionPredictor`)
- Create: `tests/live/test_surya_client_live.py` (gated on `IX_TEST_OLLAMA=1` — reuses the flag; rename to `IX_TEST_LIVE=1` if that collides)

- [ ] Failing unit tests with the Surya API mocked: given 3 Pages, returns an `OCRResult` with 3 matching pages, each with lines + 8-coord polygons.
- [ ] Live test: runs Surya against `tests/fixtures/synthetic_giro.pdf`; asserts the extracted text contains the known IBAN substring.
- [ ] Implement. `selfcheck()` loads the model at startup and runs a 1-page sanity OCR on a blank page; used by `/healthz`.
- [ ] Add surya to `pyproject.toml`: `surya-ocr` + `torch>=2.2` (CUDA 12.4 wheels).
- [ ] Green, commit, PR, merge.
Task 4.3: Pipeline factory + /healthz wiring
Branch: feat/production-wiring
Files:
- Modify: `src/ix/app.py` (build the production pipeline in lifespan, not fakes)
- Modify: `src/ix/genai/__init__.py` (factory: `make_client(config) -> GenAIClient`)
- Modify: `src/ix/ocr/__init__.py` (factory: `make_client(config) -> OCRClient`)
- Modify: `src/ix/adapters/rest/routes.py` (`/healthz` probes real clients)

- [ ] Failing tests: the factory returns `OllamaClient`/`SuryaOCRClient` in production mode; `FakeOCRClient`/`FakeGenAIClient` when the `IX_TEST_MODE=fake` env var is set (used by integration tests).
- [ ] Implement.
- [ ] Green, commit, PR, merge.
Chunk 4 end state: Running container can handle a real PDF end-to-end with real OCR and real LLM. Unit tests stay hermetic; live tests run on the Mac against the home server. ~3 PRs.
Chunk 5: Containerization + deployment + E2E
Purpose: Dockerize, configure the server, push-to-deploy, run the first live smoke test.
Task 5.1: Dockerfile + docker-compose
Branch: feat/dockerize
Files:
- Create: `Dockerfile`
- Create: `docker-compose.yml`
- Modify: `.env.example` (final list of vars)

- [ ] Dockerfile: base `nvidia/cuda:12.4.0-runtime-ubuntu22.04`, install Python 3.12 via deadsnakes, install `uv`, copy source, `uv sync --no-dev`, CMD `alembic upgrade head && uvicorn ix.app:create_app --factory --host 0.0.0.0 --port 8994`.
- [ ] docker-compose.yml: single service `infoxtractor`, port 8994, `runtime: nvidia`, GPU reservation, `env_file: .env`, monitoring labels, backup labels, `extra_hosts: host.docker.internal:host-gateway`.
- [ ] Build locally (`docker compose build`) to verify.
- [ ] Commit, PR, merge (no deploy yet — see next task).
Task 5.2: Server setup + post-receive hook
Branch: feat/deploy-setup
Files:
- Create: `docs/deployment.md`
- Create: `scripts/setup_server.sh` (one-shot: creates the bare repo + post-receive hook + `infoxtractor` DB on postgis + `.env` on the server)

- [ ] `setup_server.sh` (run manually once): SSH to `server@192.168.68.42`, create the `/home/server/Public/infoxtractor/repos.git` bare repo with a post-receive hook that checks out to `/home/server/Public/infoxtractor/app/`, runs `docker compose up -d --build`, polls `/healthz` for 60 s, exits non-zero on failure.
- [ ] Creates the `infoxtractor` DB + role on the postgis container.
- [ ] Writes `/home/server/Public/infoxtractor/app/.env` with real passwords (user provides them via environment or prompt).
- [ ] Commit, PR, merge. Run the script manually; document the run in `deployment.md`.
Task 5.3: Add server git remote + first deploy
Branch: feat/first-deploy
- [ ] Local: `git remote add server ssh://server@192.168.68.42/home/server/Public/infoxtractor/repos.git`.
- [ ] Verify `ollama pull gpt-oss:20b` is done on the host (check `docker exec ollama ollama list`).
- [ ] `git push server main`. Hook rebuilds. `/healthz` check. Smoke: `curl http://192.168.68.42:8994/healthz`.
- [ ] Document remote setup in `deployment.md`.
- [ ] No code PR — this task is ops.
Task 5.4: E2E smoke test script
Branch: feat/e2e-smoke
Files:
- Create: `scripts/e2e_smoke.py`

- [ ] Submits `tests/fixtures/synthetic_giro.pdf` via `POST http://192.168.68.42:8994/jobs` (from the Mac), polls, asserts per spec §12. Exits non-zero on failure. Prints timings.
- [ ] Runs from the Mac after every `git push server main` (documented as part of the deploy habit in AGENTS.md).
- [ ] Commit, PR, merge, deploy. Run the smoke script; paste output into the PR description.
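The smoke script's poll loop can be sketched with an injectable fetcher so the deadline logic is testable without the live service; `poll_job` and the status values mirror this plan's job lifecycle but the helper name is an assumption.

```python
import time
from typing import Callable


def poll_job(fetch: Callable[[], dict],
             timeout_s: float = 120.0,
             interval_s: float = 2.0) -> dict:
    """Call fetch() until the job leaves pending/running or the deadline passes.
    In the real script, fetch() would GET /jobs/{id} via httpx."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = fetch()
        if job["status"] not in ("pending", "running"):
            return job  # done or error — caller asserts per spec §12
        time.sleep(interval_s)
    raise TimeoutError("job did not finish before the smoke-test deadline")
```

Raising on deadline (instead of returning the last pending state) is what makes the script a usable deploy gate: a hung worker turns into a non-zero exit code.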
Chunk 5 end state: Service live on http://192.168.68.42:8994, deploy gated by /healthz + E2E smoke. First consumer (mammon) can start building its integration.
Out of scope for this plan (owned by mammon or future ix work)
- Mammon integration — owned by the mammon repo; spec'd separately.
- Second use case (receipt/invoice) — after `bank_statement_header` is proven live.
- Multi-container worker, Prometheus exporter, OpenTelemetry exporter, vision path, Config Server, Kafka transport, Azure/AWS/OpenAI adapters — all in spec §14.
Review / handoff
After all chunks merged and deployed:
- [ ] Run `scripts/e2e_smoke.py` against the live service; screenshot / paste output.
- [ ] Ensure the monitoring dashboard shows `infoxtractor` healthy at http://192.168.68.42:8001.
- [ ] Confirm `/healthz` returns 200 for 5 minutes straight (no Surya OOMs, no missing Ollama model).
- [ ] Tag release: `git tag v0.1.0 && git push forgejo v0.1.0`.
- [ ] Open a follow-up in mammon: "plan ix integration for needs_parser docs" referencing this spec + service URL.