Async on-prem LLM-powered structured information extraction microservice

Dirk Riemann	d90117807b	2026-04-18 13:02:26 +02:00
Checks: tests / test (push) successful in 3m21s; tests / test (pull_request) successful in 3m40s.

fix(deps): pin torch to the CUDA 12.4 wheel channel

The default pypi torch (2.11 as of lockfile) ships cu13 wheels, which refuse to initialise against the deploy host's NVIDIA 12.4 driver (UserWarning: "driver on your system is too old (found version 12040)"). /healthz reported ocr: fail because Surya couldn't pick up the GPU.

Use `tool.uv.sources` to route torch through PyTorch's cu124 index. That pulls torch 2.6.0+cu124 (still satisfies surya-ocr >= 0.9). Lock updated. transformers downgraded to 4.57.6, triton to 3.2.0; all compatible with surya and each other.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

.forgejo/workflows	ci: run on every push (not just main) so feat branches also get CI	2026-04-18 10:40:44 +02:00
alembic	feat(store): Alembic scaffolding + initial ix_jobs migration (spec §4)	2026-04-18 11:37:21 +02:00
docs	fix(deploy): switch to network_mode: host — reach postgis + ollama on loopback	2026-04-18 13:00:02 +02:00
scripts	chore(model): switch default IX_DEFAULT_MODEL to qwen3:14b (already on host)	2026-04-18 12:20:23 +02:00
src/ix	fix(deploy): switch to network_mode: host — reach postgis + ollama on loopback	2026-04-18 13:00:02 +02:00
tests	fix(deploy): switch to network_mode: host — reach postgis + ollama on loopback	2026-04-18 13:00:02 +02:00
.env.example	fix(deploy): switch to network_mode: host — reach postgis + ollama on loopback	2026-04-18 13:00:02 +02:00
.gitignore	feat(docker): Dockerfile (CUDA+python3.12) + compose with GPU reservation	2026-04-18 12:15:26 +02:00
.python-version	feat(scaffold): project skeleton with uv + pytest + forgejo CI	2026-04-18 10:36:43 +02:00
AGENTS.md	chore(model): switch default IX_DEFAULT_MODEL to qwen3:14b (already on host)	2026-04-18 12:20:23 +02:00
alembic.ini	feat(store): Alembic scaffolding + initial ix_jobs migration (spec §4)	2026-04-18 11:37:21 +02:00
docker-compose.yml	fix(deploy): switch to network_mode: host — reach postgis + ollama on loopback	2026-04-18 13:00:02 +02:00
Dockerfile	fix(docker): include README.md in the uv sync COPY so hatchling finds it	2026-04-18 12:42:29 +02:00
pyproject.toml	fix(deps): pin torch to the CUDA 12.4 wheel channel	2026-04-18 13:02:26 +02:00
README.md	Initial design: on-prem LLM extraction microservice MVP	2026-04-18 10:23:17 +02:00
uv.lock	fix(deps): pin torch to the CUDA 12.4 wheel channel	2026-04-18 13:02:26 +02:00
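
The driver mismatch described in the latest commit message (found version 12040 against cu13 wheels) can be illustrated with a small sketch. CUDA encodes versions as 1000 * major + 10 * minor, so 12040 means 12.4. The helper names below are hypothetical and not part of ix, and the comparison is deliberately conservative: it may reject combinations that CUDA's minor-version compatibility would in practice allow.

```python
def decode_cuda_version(code: int) -> tuple[int, int]:
    """Decode CUDA's integer version encoding (1000 * major + 10 * minor)."""
    return code // 1000, (code % 1000) // 10


def wheel_cuda_version(tag: str) -> tuple[int, int]:
    """Parse a wheel tag such as 'cu124' into (12, 4).

    Assumption: the tag is 'cu' + two-digit major + optional one-digit minor,
    so a bare major such as 'cu13' is read as 13.0.
    """
    digits = tag.removeprefix("cu")
    major, minor = digits[:2], digits[2:] or "0"
    return int(major), int(minor)


def driver_can_run(driver_code: int, wheel_tag: str) -> bool:
    """Conservative check: the driver's CUDA version must be >= the wheel's."""
    return decode_cuda_version(driver_code) >= wheel_cuda_version(wheel_tag)


if __name__ == "__main__":
    print(decode_cuda_version(12040))
    print(driver_can_run(12040, "cu124"))
    print(driver_can_run(12040, "cu13"))
```

Under these assumptions the deploy host's driver accepts a cu124 wheel and rejects a cu13 one, which matches the reason given for pinning the wheel channel.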

README.md

InfoXtractor (ix)

Async, on-prem, LLM-powered structured information extraction microservice.

Given a document (PDF, image, text) and a named use case, ix returns a structured JSON result whose shape matches the use-case schema, together with per-field provenance (OCR segment IDs, bounding boxes, cross-OCR agreement flags) that lets the caller decide how much to trust each extracted value.

Status: design phase. Implementation about to start.

Full reference spec: docs/spec-core-pipeline.md (aspirational; MVP is a strict subset)
MVP design: docs/superpowers/specs/2026-04-18-ix-mvp-design.md
Agent / development notes: AGENTS.md

Principles

On-prem always. LLM = Ollama, OCR = local engines (Surya first). No OpenAI / Anthropic / Azure / AWS / cloud.
Grounded extraction, not DB truth. ix returns best-effort fields + provenance; the caller decides what to trust.
Transport-agnostic pipeline core. REST and Postgres-queue adapters run in parallel against a single job store.