feat(provenance): mapper + verifier (spec §9.4, §6) #8
Completes the provenance subsystem.
- `mapper.py`: `map_segment_refs_to_provenance` resolves LLM `SegmentCitation`s to `FieldProvenance` entries, honours `max_sources_per_field`, counts invalid references, and resolves field values via dot-path (`items[0].name` → `items.0.name`).
- `verify.py`: `verify_field` + `apply_reliability_flags` dispatch by Pydantic field type and write the `provenance_verified` / `text_agreement` flags in place. Handles PEP 604 `X | None` unions correctly (`types.UnionType` on 3.12+), scans numeric tokens so labels do not mask balance matches, and recognises IBANs by field name.

CI trigger is flaky for now; local tests green (103 passed, ruff clean).
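For reviewers who want a feel for the dot-path handling, here is a minimal sketch of bracket normalisation followed by traversal. It is not the code in `mapper.py`: the helper names `normalise_path` and `resolve_path` are hypothetical, and returning `None` for a missing step is an assumption about the desired behaviour.

```python
import re
from typing import Any

# Matches a bracketed list index such as "[0]" or "[12]".
_BRACKET_INDEX = re.compile(r"\[(\d+)\]")


def normalise_path(path: str) -> str:
    """Rewrite bracket indices as dot segments: items[0].name -> items.0.name."""
    return _BRACKET_INDEX.sub(r".\1", path)


def resolve_path(data: Any, path: str) -> Any:
    """Walk dicts, sequences, and objects along a dot-path.

    Returns None as soon as a step cannot be resolved (assumed behaviour).
    """
    current = data
    for part in normalise_path(path).split("."):
        if isinstance(current, dict):
            if part not in current:
                return None
            current = current[part]
        elif isinstance(current, (list, tuple)):
            # Sequence steps must be in-range decimal indices.
            if not part.isdecimal() or int(part) >= len(current):
                return None
            current = current[int(part)]
        else:
            # Fall back to attribute access, e.g. for model instances.
            current = getattr(current, part, None)
            if current is None:
                return None
    return current
```

Normalising first keeps the traversal loop to a single `split(".")`, so both `items[0].name` and `items.0.name` take the same code path.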
Lands the two remaining provenance-subsystem pieces:

mapper.py (map_segment_refs_to_provenance):
- For each LLM SegmentCitation, pick seg-ids per source_type (`value` vs `value_and_context`), cap at max_sources_per_field, resolve each via SegmentIndex, track invalid references.
- Resolve field values by dot-path (`result.items[0].name` supported; `[N]` bracket notation is normalised to `.N` before traversal).
- Skip fields that resolve to zero valid sources (spec §9.4).
- Write quality_metrics with fields_with_provenance / total_fields / coverage_rate / invalid_references.

verify.py (verify_field + apply_reliability_flags):
- Dispatches per Pydantic field type: date → parse-both-sides compare; int/float/Decimal → normalize + whole-snippet / numeric-token scan; IBAN (detected via `iban` in field name) → upper+strip compare; Literal / None → flags stay None; else string substring.
- _unwrap_optional handles BOTH typing.Union AND types.UnionType so `Decimal | None` (PEP 604, what get_type_hints emits on 3.12+) resolves correctly; caught by the integration-style test_writes_flags_and_counters.
- Number comparator scans numeric tokens in the snippet so labels ("Closing balance CHF 1'234.56") don't mask the match.
- apply_reliability_flags mutates the passed ProvenanceData in place and writes verified_fields / text_agreement_fields to quality_metrics.

Tests cover each comparator, Literal/None skip, short-value skip (strings and numerics), Decimal via optional union, and end-to-end flag+counter writing against a Pydantic use-case schema that mirrors bank_statement_header.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>