infoxtractor

Author	SHA1	Message	Date
Dirk Riemann	2e8ca0ee43	feat(ui): add browser UI at /ui for job submission. Minimal Jinja2 + HTMX + Pico CSS UI (all CDN, no build step) that lets a user drop a PDF, pick a registered use case or define one inline, tweak OCR/GenAI/provenance options, submit, and watch the pretty-JSON result come back via 2s HTMX polling. Uploads land in {tmp_dir}/ui/<uuid>.pdf via aiofiles streaming with the existing IX_FILE_MAX_BYTES cap. All submissions go through the same jobs_repo.insert_pending entry point the REST adapter uses, so no logic is duplicated. The REST surface is unchanged. Tests: tests/integration/test_ui_routes.py, 8 cases covering GET /ui, registered + custom use-case submissions (asserting the stored request carries use_case_inline for the custom path), malformed fields_json rejection, and the fragment renderer for pending vs. done. New deps pinned explicitly in pyproject.toml: jinja2, aiofiles, python-multipart (arrive transitively via FastAPI but we own the import surface now). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 21:27:54 +02:00
Dirk Riemann	703da9035e	feat(use-cases): add inline use-case definitions. Adds RequestIX.use_case_inline so callers can define ad-hoc extraction schemas in the request itself, bypassing the backend registry. The pipeline builds a fresh (Request, Response) Pydantic class pair per call via ix.use_cases.inline.build_use_case_classes; structural errors (dup field, bad identifier, choices-on-non-str, empty fields) raise IX_001_001 to match the registry-miss path. Inline wins when both use_case and use_case_inline are set. Existing REST callers see no behavioural change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 21:01:27 +02:00
Dirk Riemann	842c4da90c	chore: MVP deployed; readme, AGENTS.md status, deploy runbook filled in. First deploy done 2026-04-18. E2E extraction of the bank_statement_header use case completes in 35 s against the live service, with 7 of 9 header fields provenance-verified + text-agreement-green. closing_balance asserts from spec §12 all pass. Updates: README.md: status -> "MVP deployed", worked example curl snippet, pointers to deployment runbook + spec + plan; AGENTS.md: status line updated with the live URL + date; pyproject.toml: version comment referencing the first deploy; docs/deployment.md: "First deploy" section filled in with times, field-level extraction result, plus a log of every small Docker/ops follow-up PR that had to land to make the first deploy healthy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 14:08:07 +02:00
Dirk Riemann	5ee74f367c	chore(model): switch default IX_DEFAULT_MODEL to qwen3:14b (already on host). The home server's Ollama doesn't have gpt-oss:20b pulled; qwen3:14b is already there and is what mammon's chat agent uses. Switching the default now so the first deploy passes the /healthz ollama probe without an extra `ollama pull` step. The spec lists gpt-oss:20b as a concrete example; qwen3:14b is equally on-prem and Ollama-structured-output-compatible. Touched: AppConfig default, BankStatementHeader Request.default_model, .env.example, setup_server.sh ollama-list check, AGENTS.md, deployment.md, live tests. Unit tests that hard-coded the old model string but don't assert the default were left alone. Also: ASCII en-dash in e2e_smoke.py Paperless-style text (ruff RUF001). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 12:20:23 +02:00
Dirk Riemann	124403252d	Initial design: on-prem LLM extraction microservice MVP Establishes ix as an async, on-prem, LLM-powered structured extraction microservice. Full reference spec stays in docs/spec-core-pipeline.md; MVP spec (strict subset — Ollama only, Surya OCR, REST + Postgres-queue transports in parallel, in-repo use cases, provenance-based reliability signals) lives at docs/superpowers/specs/2026-04-18-ix-mvp-design.md. First use case: bank_statement_header (feeds mammon's needs_parser flow). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:23:17 +02:00

5 commits
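
Note on commit 703da9035e: the message lists four structural errors (duplicate field, bad identifier, choices on a non-str field, empty fields) that all map to IX_001_001. The sketch below shows one way such checks could look. It is not the code in ix.use_cases.inline; `InlineField`, `InlineUseCaseError`, and `validate_inline_fields` are hypothetical names, and the field shape (name, type, choices) is an assumption made for illustration.

```python
import keyword
from dataclasses import dataclass
from typing import Optional, Sequence


class InlineUseCaseError(ValueError):
    """Hypothetical error type carrying the IX_001_001 code named in the commit."""

    code = "IX_001_001"


@dataclass(frozen=True)
class InlineField:
    # Assumed shape of one inline field definition (not taken from the repo).
    name: str
    type: str = "str"
    choices: Optional[Sequence[str]] = None


def validate_inline_fields(fields: Sequence[InlineField]) -> None:
    """Raise InlineUseCaseError for the four structural problems the commit lists."""
    if not fields:
        raise InlineUseCaseError("IX_001_001: empty fields")
    seen = set()
    for field in fields:
        # A field name must be usable as a Python attribute on a generated class.
        if not field.name.isidentifier() or keyword.iskeyword(field.name):
            raise InlineUseCaseError(f"IX_001_001: bad identifier {field.name!r}")
        if field.name in seen:
            raise InlineUseCaseError(f"IX_001_001: duplicate field {field.name!r}")
        seen.add(field.name)
        # Enumerated choices only make sense for string-typed fields.
        if field.choices is not None and field.type != "str":
            raise InlineUseCaseError(
                f"IX_001_001: choices on non-str field {field.name!r}"
            )
```

Running the checks before any class is built means a malformed inline definition fails with the same error code a registry miss would produce, which is the behaviour the commit message describes.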