Completes the data-contract layer. Highlights:
- `ResponseIX.context` is an internal mutable accumulator used by pipeline
steps (pages, files, texts, use_case classes, segment index). It MUST NOT
leak into the serialised response, so we mark the field with
`Field(exclude=True)` and carry the shape in a small `_InternalContext`
sub-model with `extra="allow"` so steps can stash arbitrary state without
schema churn. Tested: `model_dump()` and `model_dump_json()` both drop it.
- `FieldProvenance` gains `provenance_verified: bool | None` and
`text_agreement: bool | None` — the two MVP reliability flags written by
the new ReliabilityStep. Both default None so rows predating the
ReliabilityStep (empty LLM output, cloud-import replay) parse cleanly.
- `quality_metrics` stays a free-form `dict[str, Any]` — the MVP adds
`verified_fields` and `text_agreement_fields` counters without carving
them into the schema, which keeps future metric additions free.
- `Job.status` and `Job.callback_status` are `Literal[...]` so Pydantic
rejects unknown states at the edge. The invariant
(`status='done' iff response.error is None`) stays worker-enforced:
callers sometimes hydrate in-flight rows and we do not want validation
to reject them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the incoming-request data contracts as Pydantic v2 models. Matches the
MVP spec §3 exactly: fields dropped from the reference spec (use_vision,
reasoning_effort, version, ...) stay out, and `extra="forbid"` rejects any
caller that sends them, so drift surfaces immediately instead of being
silently ignored.
Context.files is `list[str | FileRef]`: plain URLs stay str, dict entries
parse as FileRef. This keeps the common case (public URL) a one-liner while
still supporting Paperless-style auth headers and per-file size caps.
ix_id stays optional with a docstring warning that callers MUST NOT set it —
the transport layer assigns the 16-char hex handle on insert. The field is
present so `Job` round-trips out of the store.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the single exception type used throughout the pipeline. Every failure
maps to one of the ten IX_* codes from the MVP spec §8 with a stable
machine-readable code and an optional free-form detail. The `str()` form is
log-scrapable with a single regex (`IX_xxx_xxx: <msg> (detail=...)`), so
mammon-side reliability UX can classify failures without brittle string
parsing.
Enum values equal names so callers can serialise either.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands Task 1.1 from the MVP plan: empty-project skeleton so later tasks have somewhere to land. Local tests + ruff pass. CI trigger fix included so feat branches get runs going forward.
The previous python:3.12-slim container lacked node, which actions/checkout@v4
requires. The Forgejo runner's default image includes node + apt + curl, so
we can bootstrap python + uv the same way mammon does.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- pyproject.toml: runtime deps (FastAPI, SQLAlchemy async, Pydantic, PyMuPDF,
python-magic, Pillow, dateutil), dev group (pytest, pytest-asyncio,
pytest-httpx, ruff, mypy), optional `ocr` extra that pulls surya-ocr + torch
(kept optional so CI without GPU can run the base package).
- pytest config: asyncio_mode=auto; `live` marker for tests that need a real
Ollama/Surya (gated on IX_TEST_OLLAMA=1).
- Single smoke test (tests/unit/test_scaffolding.py) verifies the package
imports and exposes __version__ — keeps CI green until the real test
modules land in later chunks.
- .forgejo/workflows/ci.yml: runs ruff + pytest against a Postgres 16 service
container. Explicit IX_TEST_MODE=fake keeps real-client tests out.
- .env.example: every IX_* var from spec §9 with on-prem-friendly defaults.
- uv.lock committed for reproducible builds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Detailed, TDD-structured plan with 5 chunks covering ~30 feature-branch
tasks from foundation scaffolding through first live deploy + E2E smoke.
Each task is one PR; pipeline core comes hermetic-first, real Surya/Ollama
clients in Chunk 4, containerization + first deploy in Chunk 5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- FileRef type added so callers (mammon/Paperless) can pass Authorization
headers alongside URLs. context.files is now list[str | FileRef].
- Job lifecycle state machine pinned down, including worker-startup sweep
for rows stuck in 'running' after a crash.
- Explicit IX_002_000 / IX_002_001 codes for Ollama unreachable and
structured-output schema violations, with per-call timeout
IX_GENAI_CALL_TIMEOUT_SECONDS distinct from the per-job timeout.
- IX_000_007 code for file-fetch failures; per-file size, connect, and
read timeouts configurable via env.
- ReliabilityStep: Literal-typed fields and None values are explicitly skipped
during provenance verification (with a reason); dates are parsed on both
sides before ISO comparison.
- /healthz semantics pinned down (CUDA + Surya loaded; Ollama reachable
AND model available). /metrics window is last 24h.
- (client_id, request_id) is UNIQUE in ix_jobs, matching the idempotency
claim.
- Deploy-failure workflow uses a forward `git revert` commit, not a
force-push, in line with AGENTS.md habits.
- Dockerfile / compose require --gpus all. Pre-deploy requires
`ollama pull gpt-oss:20b`; /healthz verifies before deploy completes.
- CI clarified: Forgejo Actions runners are GPU-less and LAN-disconnected;
all inference is stubbed there. Real-Ollama tests behind IX_TEST_OLLAMA=1.
- Fixture redaction stance: synthetic-template PDF committed; real
redacted fixtures live out-of-repo.
- Deferred list picks up use_case URL/Base64, callback retries,
multi-container workers. quality_metrics retains reference-spec counters
plus the two new MVP ones.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Establishes ix as an async, on-prem, LLM-powered structured extraction
microservice. Full reference spec stays in docs/spec-core-pipeline.md;
MVP spec (strict subset — Ollama only, Surya OCR, REST + Postgres-queue
transports in parallel, in-repo use cases, provenance-based reliability
signals) lives at docs/superpowers/specs/2026-04-18-ix-mvp-design.md.
First use case: bank_statement_header (feeds mammon's needs_parser flow).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>