InfoXtractor (ix)

Async, on-prem, LLM-powered structured information extraction microservice. Given a document (PDF, image) or raw text plus a use case (a Pydantic schema), it returns a structured JSON result with per-field provenance (source page, bounding box, OCR segment).

Designed to be used by other on-prem services (e.g. mammon) as a reliable fallback / second opinion for format-specific deterministic parsers.

Status: MVP deployed (2026-04-18) at http://192.168.68.42:8994 — LAN only. Browser UI at http://192.168.68.42:8994/ui. Full reference spec at docs/spec-core-pipeline.md; MVP spec at docs/superpowers/specs/2026-04-18-ix-mvp-design.md; deploy runbook at docs/deployment.md.

Use cases: the built-in registry lives in src/ix/use_cases/__init__.py (bank_statement_header for MVP). Callers without a registered entry can ship an ad-hoc schema inline via RequestIX.use_case_inline (see README "Ad-hoc use cases"); the pipeline builds the Pydantic classes on the fly per request. The /ui page exposes this as a "custom" option so non-engineering users can experiment without a deploy.
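As a rough sketch of what an ad-hoc submission might look like, the snippet below builds a `use_case_inline` payload. The inner field names (`name`, `fields`, `type`, `description`) are assumptions for illustration only; the README's "Ad-hoc use cases" section is the authoritative shape.

```python
import json

# Hypothetical ad-hoc use case (not in the registry). The key names inside
# "fields" are illustrative assumptions, not the documented schema.
use_case_inline = {
    "name": "utility_invoice",
    "fields": {
        "invoice_number": {"type": "str", "description": "Printed invoice ID"},
        "total_amount": {"type": "float", "description": "Grand total incl. VAT"},
    },
}

# The pipeline builds the Pydantic classes from this per request; a caller
# just ships it alongside the document in the request body:
request_body = json.dumps({"use_case_inline": use_case_inline})
```

The `/ui` "custom" option produces the same shape from the form's `fields_json` input, so both paths exercise identical pipeline code.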

Guiding Principles

  • On-prem always. All LLM inference, OCR, and user-data processing run on the home server (192.168.68.42). No cloud APIs — OpenAI, Anthropic, Azure, AWS Bedrock/Textract, Google Document AI, Mistral, etc. are not to be used for user data or inference. LLM backend is Ollama (:11434); OCR runs locally (pluggable OCRClient interface, first engine: Surya on the RTX 3090); job state lives in local Postgres on the postgis container. The spec's references to Azure / AWS / OpenAI are examples to replace, not inherit.
  • Grounded extraction, not DB truth. ix returns best-effort extracted fields with segment citations, provenance, and cross-OCR agreement signals. ix does not claim its output is DB-grade; the calling service (e.g. mammon) owns the reliability decision (reconcile against anchors, stage for review, compare to deterministic parsers).
  • Transport-agnostic pipeline core. The pipeline (RequestIX → ResponseIX) knows nothing about HTTP, queues, or databases. Transport adapters (REST, Postgres queue, …) run in parallel alongside the core and all converge on one job store.
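The transport-agnostic boundary can be sketched as below. These dataclass and protocol shapes are hypothetical stand-ins, not the real `RequestIX` / `ResponseIX` / `OCRClient` definitions; the point is only that the core takes a request plus injected dependencies and returns a response, with no HTTP, queue, or DB imports.

```python
from dataclasses import dataclass, field
from typing import Protocol

# Illustrative shapes only -- the real types live in the pipeline package.
@dataclass
class RequestIX:
    use_case: str
    document: bytes

@dataclass
class ResponseIX:
    fields: dict
    provenance: dict = field(default_factory=dict)

class OCRClient(Protocol):
    """Pluggable OCR engine (first implementation: Surya)."""
    def recognize(self, document: bytes) -> list[str]: ...

async def run_pipeline(req: RequestIX, ocr: OCRClient) -> ResponseIX:
    # No transport concerns here: adapters (REST, Postgres queue) call this
    # and persist the result in the shared job store themselves.
    segments = ocr.recognize(req.document)
    return ResponseIX(fields={"segment_count": len(segments)})
```

Because the core only sees injected interfaces, tests can drive it with stub OCR clients and no running server.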

Habits

  • Feature branches + PRs. New work: git checkout -b feat/<name> → commit small, logical chunks → git push forgejo feat/<name> → create PR via Forgejo API → wait for tests to pass → merge → git push server main to deploy.
  • Keep documentation up to date in the same commit as the code. README.md, docs/, and AGENTS.md update alongside the change. Unpushed / undocumented work is work that isn't done.
  • Deploy after merging. git push server main rebuilds the Docker image via post-receive and restarts the container. Smoke-test the live service before walking away.
  • Never skip hooks (--no-verify, etc.) without explicit user approval. Prefer creating new commits over amending. Never force-push main.
  • Forgejo: repo at http://192.168.68.42:3030/goldstein/infoxtractor (to be created). Use basic auth with FORGEJO_USR / FORGEJO_PSD from ~/Projects/infrastructure/.env, or an API token once issued for this repo.
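The "create PR via Forgejo API" step above could look roughly like this, using only the standard library. The endpoint path follows the Gitea-compatible API Forgejo exposes; the branch name and title are placeholders, and credentials come from the env vars named above.

```python
import base64
import json
import os
import urllib.request

API = "http://192.168.68.42:3030/api/v1"

def build_pr_request(branch: str, title: str) -> urllib.request.Request:
    """Build (but do not send) a pull-request creation call for this repo."""
    body = json.dumps({"title": title, "head": branch, "base": "main"}).encode()
    # Basic auth with FORGEJO_USR / FORGEJO_PSD from infrastructure/.env
    creds = f"{os.environ.get('FORGEJO_USR', '')}:{os.environ.get('FORGEJO_PSD', '')}"
    auth = base64.b64encode(creds.encode()).decode()
    return urllib.request.Request(
        f"{API}/repos/goldstein/infoxtractor/pulls",
        data=body,
        headers={"Content-Type": "application/json", "Authorization": f"Basic {auth}"},
        method="POST",
    )

# To actually open the PR:
# urllib.request.urlopen(build_pr_request("feat/ui", "feat(ui): browser UI"))
```

Separating request construction from sending keeps the call easy to dry-run in tests.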

Tech Stack (MVP)

  • Language: Python 3.12, asyncio
  • Web/REST: FastAPI + uvicorn
  • OCR (pluggable): Surya OCR first (GPU, shares RTX 3090 with Ollama / Immich ML)
  • LLM: Ollama at 192.168.68.42:11434, structured outputs via JSON schema. Initial model candidate: qwen2.5:32b / qwen3:14b, configurable per use case
  • State: Postgres on the shared postgis container (:5431), new infoxtractor database
  • Deployment: Docker, git push server main → post-receive rebuild (pattern from other apps)
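A minimal sketch of the structured-output call to Ollama: its /api/chat endpoint accepts a JSON schema in the "format" field and constrains the model's reply to it. The schema and prompt below are illustrative, not the real bank_statement_header definitions.

```python
import json
import urllib.request

OLLAMA = "http://192.168.68.42:11434"

def build_chat_payload(model: str, text: str, schema: dict) -> dict:
    """Assemble an /api/chat body with a JSON-schema-constrained output."""
    return {
        "model": model,  # e.g. "qwen2.5:32b" -- configurable per use case
        "messages": [{"role": "user", "content": f"Extract fields from:\n{text}"}],
        "format": schema,  # JSON schema -> structured output
        "stream": False,
    }

# Toy schema for illustration only:
schema = {
    "type": "object",
    "properties": {"iban": {"type": "string"}},
    "required": ["iban"],
}
payload = build_chat_payload("qwen2.5:32b", "IBAN: DE89 3704 0044 0532 0130 00", schema)

# To actually call the local server:
# req = urllib.request.Request(f"{OLLAMA}/api/chat",
#                              data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# reply = json.load(urllib.request.urlopen(req))["message"]["content"]
```

Keeping the schema per use case is what lets one deployment serve both registered and inline-defined extractions.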

Repository / Deploy

  • Git remotes:
    • forgejo: ssh://git@192.168.68.42:2222/goldstein/infoxtractor.git (source of truth / PRs)
    • server: bare repo with post-receive rebuild hook (to be set up)
  • Workflow: feat branch → git push forgejo feat/name → PR via Forgejo API → merge → git push server main to deploy
  • Monitoring label: infrastructure.web_url=http://192.168.68.42:<PORT>
  • Backup opt-in: backup.enable=true label on the container
  • mammon (../mammon) — first consumer. Uses ix as a fallback / second opinion for Paperless-imported bank statements where deterministic parsers don't match.
  • infrastructure (../infrastructure) — server topology, deployment pattern, Ollama setup, shared postgis Postgres.
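Putting the monitoring and backup conventions together, a compose fragment for this service might look as follows. The service name and image tag are hypothetical; only the two label keys and the port come from the conventions above.

```yaml
# Hypothetical docker-compose fragment -- illustrative, not the deployed file.
services:
  infoxtractor:
    image: infoxtractor:latest
    ports:
      - "8994:8994"
    labels:
      infrastructure.web_url: "http://192.168.68.42:8994"
      backup.enable: "true"
```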