infoxtractor/src/ix/pipeline
Dirk Riemann 565d8d0676
feat(pipeline): ResponseHandlerStep — shape-up final payload (spec §8)
Final pipeline step. Three mechanical transforms:

1. include_ocr_text -> concatenate the text of all non-tag lines, join
   pages with \n\n, and write the result to ocr_result.result.text.
2. include_geometries=False (default) -> strip ocr_result.result.pages
   and ocr_result.meta_data. Geometries are heavy; callers opt in.
3. Delete response.context so the internal accumulator never leaks to
   the caller (belt-and-braces; Field(exclude=True) already keeps it
   out of model_dump).
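
The three transforms above can be sketched roughly as follows. This is a minimal illustration, not the code in response_handler_step.py: it uses plain dataclasses instead of the project's models, and `Line`, `Page`, `OCRDetails`, `OCRResult`, `Options`, `Response`, `is_page_tag`, and `shape_response` are hypothetical names. It further assumes that page tags look like `<page ...>` / `</page>`, that lines within a page are joined with a single `\n`, and that "strip" means resetting the fields to empty values.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


# Hypothetical stand-ins for the project's models (names are assumptions).
@dataclass
class Line:
    text: str


@dataclass
class Page:
    lines: list[Line] = field(default_factory=list)


@dataclass
class OCRDetails:
    text: Optional[str] = None
    pages: list[Page] = field(default_factory=list)


@dataclass
class OCRResult:
    result: OCRDetails = field(default_factory=OCRDetails)
    meta_data: dict[str, Any] = field(default_factory=dict)


@dataclass
class Options:
    include_ocr_text: bool
    include_geometries: bool = False  # default per the commit message


@dataclass
class Response:
    ocr_result: OCRResult = field(default_factory=OCRResult)
    context: Optional[dict[str, Any]] = None


def is_page_tag(text: str) -> bool:
    # Assumption: tag lines look like "<page ...>" or "</page>".
    stripped = text.strip()
    return stripped.startswith("<page") or stripped.startswith("</page")


def shape_response(response: Response, options: Options) -> Response:
    ocr = response.ocr_result

    # 1. Build the flat OCR text from non-tag lines; pages joined with \n\n.
    if options.include_ocr_text:
        page_texts = [
            "\n".join(l.text for l in page.lines if not is_page_tag(l.text))
            for page in ocr.result.pages
        ]
        ocr.result.text = "\n\n".join(page_texts)

    # 2. Drop heavy geometry data unless the caller opted in.
    if not options.include_geometries:
        ocr.result.pages = []
        ocr.meta_data = {}

    # 3. Clear the internal accumulator so it cannot reach the caller.
    response.context = None
    return response
```

Note that the text is built before the pages are stripped; with the order reversed, step 1 would have nothing left to concatenate.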

validate() always returns True per spec.

7 unit tests in tests/unit/test_response_handler_step.py cover all
three branches + context-not-in-model_dump check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:21:36 +02:00
__init__.py feat(pipeline): Step ABC + Pipeline runner + Timer (spec §3, §4) 2026-04-18 11:06:46 +02:00
genai_step.py feat(pipeline): GenAIStep — LLM call + provenance mapping (spec §6.3, §7, §9.2) 2026-04-18 11:18:44 +02:00
ocr_step.py feat(pipeline): OCRStep — run OCR + page tags + SegmentIndex (spec §6.2) 2026-04-18 11:15:46 +02:00
pipeline.py feat(pipeline): Step ABC + Pipeline runner + Timer (spec §3, §4) 2026-04-18 11:06:46 +02:00
reliability_step.py feat(pipeline): ReliabilityStep — writes reliability flags (spec §6) 2026-04-18 11:20:18 +02:00
response_handler_step.py feat(pipeline): ResponseHandlerStep — shape-up final payload (spec §8) 2026-04-18 11:21:36 +02:00
setup_step.py feat(pipeline): SetupStep — validate + fetch + MIME + pages (spec §6.1) 2026-04-18 11:14:04 +02:00
step.py feat(pipeline): Step ABC + Pipeline runner + Timer (spec §3, §4) 2026-04-18 11:06:46 +02:00