feat(contracts): ResponseIX + Provenance + Job (spec §3, §9.3) #4

goldstein merged 1 commit from feat/contracts-response into main 2026-04-18 08:50:38 +00:00

Completes the data-contract layer:

  • ResponseIX with internal-only context excluded from serialisation (Field(exclude=True) + _InternalContext carrier).
  • FieldProvenance gains MVP reliability flags provenance_verified and text_agreement (both bool | None).
  • Job envelope with Literal status enums.
  • quality_metrics stays a free-form dict so future counters can land without contract churn.

CI trigger is flaky for now; local tests green (34 passed, ruff clean).

goldstein added 1 commit 2026-04-18 08:50:32 +00:00
feat(contracts): ResponseIX + Provenance + Job envelope (spec §3, §9.3)
All checks were successful
tests / test (push) Successful in 1m2s
tests / test (pull_request) Successful in 1m0s
02db3b05cc
Completes the data-contract layer. Highlights:

- `ResponseIX.context` is an internal mutable accumulator used by pipeline
  steps (pages, files, texts, use_case classes, segment index). It MUST NOT
  leak into the serialised response, so we mark the field with
  `Field(exclude=True)` and carry the shape in a small `_InternalContext`
  sub-model with `extra="allow"` so steps can stash arbitrary state without
  schema churn. Tested: `model_dump()` and `model_dump_json()` both drop it.

- `FieldProvenance` gains `provenance_verified: bool | None` and
  `text_agreement: bool | None` — the two MVP reliability flags written by
  the new ReliabilityStep. Both default None so rows predating the
  ReliabilityStep (empty LLM output, cloud-import replay) parse cleanly.

- `quality_metrics` stays a free-form `dict[str, Any]` — the MVP adds
  `verified_fields` and `text_agreement_fields` counters without carving
  them into the schema, which keeps future metric additions free.

- `Job.status` and `Job.callback_status` are `Literal[...]` so Pydantic
  rejects unknown states at the edge. The invariant
  (`status='done' iff response.error is None`) stays worker-enforced rather
  than model-validated: callers sometimes hydrate in-flight rows and we do
  not want validation to reject them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
goldstein merged commit 230068e484 into main 2026-04-18 08:50:38 +00:00
Reference: goldstein/infoxtractor#4