# InfoXtractor (ix)
Async, on-prem, LLM-powered structured information extraction microservice.
Given a document (PDF, image, or text) and a named use case, ix returns a structured JSON result whose shape matches the use-case schema, together with per-field provenance (OCR segment IDs, bounding boxes, cross-OCR agreement flags) that lets the caller decide how much to trust each extracted value.
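To make the result-plus-provenance contract concrete, here is a minimal sketch of what a response might look like and how a caller could filter on it. The field names (`segment_ids`, `bbox`, `agreement`) are illustrative assumptions, not the actual ix schema.

```python
# Hypothetical ix result for an "invoice" use case — the exact schema is
# defined per use case; these key names are assumptions for illustration.
result = {
    "use_case": "invoice",
    "fields": {
        "total_amount": {
            "value": "1432.50",
            "provenance": {
                "segment_ids": ["ocr-7", "ocr-8"],  # OCR segments the value came from
                "bbox": [112, 640, 298, 668],       # location of the source text on the page
                "agreement": True,                  # did independent OCR passes agree?
            },
        },
    },
}

# The caller decides what to trust — e.g. keep only fields where
# cross-OCR passes agreed on the underlying text:
trusted = {
    name: f["value"]
    for name, f in result["fields"].items()
    if f["provenance"]["agreement"]
}
print(trusted)  # {'total_amount': '1432.50'}
```

The point of the shape is that trust policy lives in the caller, not in ix: the service reports evidence, the consumer applies thresholds.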
Status: design phase. Implementation about to start.
- Full reference spec: `docs/spec-core-pipeline.md` (aspirational; MVP is a strict subset)
- MVP design: `docs/superpowers/specs/2026-04-18-ix-mvp-design.md`
- Agent / development notes: `AGENTS.md`
## Principles
- On-prem always. LLM = Ollama, OCR = local engines (Surya first). No OpenAI / Anthropic / Azure / AWS / cloud.
- Grounded extraction, not DB truth. ix returns best-effort fields + provenance; the caller decides what to trust.
- Transport-agnostic pipeline core. REST and Postgres-queue adapters run in parallel on a single job store.