The home server's Ollama doesn't have gpt-oss:20b pulled; qwen3:14b is already there and is what mammon's chat agent uses. Switching the default now so the first deploy passes the /healthz ollama probe without an extra `ollama pull` step. The spec lists gpt-oss:20b as a concrete example; qwen3:14b is equally on-prem and Ollama-structured-output-compatible. Touched: AppConfig default, BankStatementHeader Request.default_model, .env.example, setup_server.sh ollama-list check, AGENTS.md, deployment.md, live tests. Unit tests that hard-coded the old model string but don't assert the default were left alone. Also: ASCII en-dash in e2e_smoke.py Paperless-style text (ruff RUF001). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3.9 KiB
3.9 KiB
InfoXtractor (ix)
Async, on-prem, LLM-powered structured information extraction microservice. Given a document (PDF, image) or text plus a use case (a Pydantic schema), returns a structured JSON result with per-field provenance (source page, bounding box, OCR segment).
Designed to be used by other on-prem services (e.g. mammon) as a reliable fallback / second opinion for format-specific deterministic parsers.
Status: design phase. Full reference spec at docs/spec-core-pipeline.md. MVP spec will live at docs/superpowers/specs/.
Guiding Principles
- On-prem always. All LLM inference, OCR, and user-data processing run on the home server (192.168.68.42). No cloud APIs — OpenAI, Anthropic, Azure, AWS Bedrock/Textract, Google Document AI, Mistral, etc. are not to be used for user data or inference. LLM backend is Ollama (:11434); OCR runs locally (pluggable
OCRClientinterface, first engine: Surya on the RTX 3090); job state lives in local Postgres on the postgis container. The spec's references to Azure / AWS / OpenAI are examples to replace, not inherit. - Grounded extraction, not DB truth. ix returns best-effort extracted fields with segment citations, provenance, and cross-OCR agreement signals. ix does not claim its output is DB-grade; the calling service (e.g. mammon) owns the reliability decision (reconcile against anchors, stage for review, compare to deterministic parsers).
- Transport-agnostic pipeline core. The pipeline (
RequestIX→ResponseIX) knows nothing about HTTP, queues, or databases. Transport adapters (REST, Postgres queue, …) run in parallel alongside the core and all converge on one job store.
Habits
- Feature branches + PRs. New work:
git checkout -b feat/<name>→ commit small, logical chunks →git push forgejo feat/<name>→ create PR via Forgejo API → wait for tests to pass → merge →git push server mainto deploy. - Keep documentation up to date in the same commit as the code.
README.md,docs/, andAGENTS.mdupdate alongside the change. Unpushed / undocumented work is work that isn't done. - Deploy after merging.
git push server mainrebuilds the Docker image viapost-receiveand restarts the container. Smoke-test the live service before walking away. - Never skip hooks (
--no-verify, etc.) without explicit user approval. Prefer creating new commits over amending. Never force-pushmain. - Forgejo: repo at
http://192.168.68.42:3030/goldstein/infoxtractor(to be created). Use basic auth withFORGEJO_USR/FORGEJO_PSDfrom~/Projects/infrastructure/.env, or an API token once issued for this repo.
Tech Stack (MVP)
- Language: Python 3.12, asyncio
- Web/REST: FastAPI + uvicorn
- OCR (pluggable): Surya OCR first (GPU, shares RTX 3090 with Ollama / Immich ML)
- LLM: Ollama at
192.168.68.42:11434, structured outputs via JSON schema. Initial model candidate:qwen2.5:32b/qwen3:14b, configurable per use case - State: Postgres on the shared
postgiscontainer (:5431), newinfoxtractordatabase - Deployment: Docker,
git push server main→ post-receive rebuild (pattern from other apps)
Repository / Deploy
- Git remotes:
forgejo:ssh://git@192.168.68.42:2222/goldstein/infoxtractor.git(source of truth / PRs)server: bare repo withpost-receiverebuild hook (to be set up)
- Workflow: feat branch →
git push forgejo feat/name→ PR via Forgejo API → merge →git push server mainto deploy - Monitoring label:
infrastructure.web_url=http://192.168.68.42:<PORT> - Backup opt-in:
backup.enable=truelabel on the container
Related Projects
- mammon (
../mammon) — first consumer. Uses ix as a fallback / second opinion for Paperless-imported bank statements where deterministic parsers don't match. - infrastructure (
../infrastructure) — server topology, deployment pattern, Ollama setup, sharedpostgisPostgres.