infoxtractor

Author	SHA1	Message	Date
Dirk Riemann	842c4da90c	chore: MVP deployed - readme, AGENTS.md status, deploy runbook filled in. First deploy done 2026-04-18. E2E extraction of the bank_statement_header use case completes in 35 s against the live service, with 7 of 9 header fields provenance-verified and text-agreement-green. The closing_balance asserts from spec §12 all pass. Updates: (1) README.md: status -> "MVP deployed"; worked example curl snippet; pointers to deployment runbook, spec, and plan. (2) AGENTS.md: status line updated with the live URL and date. (3) pyproject.toml: version comment referencing the first deploy. (4) docs/deployment.md: "First deploy" section filled in with times, the field-level extraction result, and a log of every small Docker/ops follow-up PR that had to land to make the first deploy healthy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 14:08:07 +02:00
Dirk Riemann	a418969251	fix(deps): pin surya-ocr ^0.17 and drop cu124 index. Our client code imports surya.foundation (added in 0.17). The earlier cu124 torch pin forced uv to downgrade surya to 0.14.1, which doesn't have that module and depends on a transformers version that lacks QuantizedCacheConfig. Net effect: /healthz reported ocr: fail. Drop the cu124 index pin. surya 0.17.1 needs torch >= 2.7, which the default PyPI torch (2.11) satisfies. The deploy host's CUDA 12.4 driver doesn't match torch 2.11's cu13 wheels, so CUDA init warns and the GPU isn't available; torch and Surya transparently fall back to CPU. Slower than GPU but correct for MVP. A host driver upgrade later will unlock GPU with no code changes. Unit suite stays green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 13:21:40 +02:00
Dirk Riemann	d90117807b	fix(deps): pin torch to the CUDA 12.4 wheel channel. The default PyPI torch (2.11 as of lockfile) ships cu13 wheels, which refuse to initialise against the deploy host's NVIDIA 12.4 driver (UserWarning: "driver on your system is too old (found version 12040)"). /healthz reported ocr: fail because Surya couldn't pick up the GPU. Use `tool.uv.sources` to route torch through PyTorch's cu124 index. That pulls torch 2.6.0+cu124 (still satisfies surya-ocr >= 0.9). Lock updated: transformers downgraded to 4.57.6, triton to 3.2.0, all compatible with surya and each other. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 13:02:26 +02:00
Dirk Riemann	57cdfd73fb	feat(scaffold): project skeleton with uv + pytest + forgejo CI. (1) pyproject.toml: runtime deps (FastAPI, SQLAlchemy async, Pydantic, PyMuPDF, python-magic, Pillow, dateutil), dev group (pytest, pytest-asyncio, pytest-httpx, ruff, mypy), optional `ocr` extra that pulls surya-ocr + torch (kept optional so CI without GPU can run the base package). (2) pytest config: asyncio_mode=auto; `live` marker for tests that need a real Ollama/Surya (gated on IX_TEST_OLLAMA=1). (3) Single smoke test (tests/unit/test_scaffolding.py) verifies the package imports and exposes __version__, which keeps CI green until the real test modules land in later chunks. (4) .forgejo/workflows/ci.yml: runs ruff + pytest against a Postgres 16 service container. Explicit IX_TEST_MODE=fake keeps real-client tests out. (5) .env.example: every IX_* var from spec §9 with on-prem-friendly defaults. (6) uv.lock committed for reproducible builds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:36:43 +02:00
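
The scaffold commit (57cdfd73fb) mentions a `live` marker gated on IX_TEST_OLLAMA=1. A minimal sketch of how such a gate might be decided is below. The helper names `live_tests_enabled` and `skip_reason` are hypothetical and not taken from the repository, and the exact-match on "1" is an assumption about how the variable is read.

```python
import os
from typing import Mapping, Optional


def live_tests_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when IX_TEST_OLLAMA is set to exactly "1".

    Hypothetical helper: the real project may parse the variable differently.
    """
    env = os.environ if environ is None else environ
    return env.get("IX_TEST_OLLAMA") == "1"


def skip_reason(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a human-readable skip reason, or None when live tests may run."""
    if live_tests_enabled(environ):
        return None
    return "live test: set IX_TEST_OLLAMA=1 to run against a real Ollama/Surya"
```

In a conftest.py, the result could feed `pytest.mark.skipif(not live_tests_enabled(), reason=...)` so that tests carrying the `live` marker are skipped on CI runners without a real backend.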

4 commits
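
The warning quoted in d90117807b reports the driver as "version 12040". CUDA encodes versions as 1000 * major + 10 * minor, so 12040 reads as 12.4. The sketch below decodes that value and applies a deliberately simplified rule (the driver's CUDA version must be at least the version the wheel was built for). The function names are hypothetical, the (13, 0) requirement for the cu13 wheels is an assumption, and real CUDA compatibility rules have more nuance than this comparison.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Decode CUDA's integer version (1000 * major + 10 * minor)."""
    major, rest = divmod(encoded, 1000)
    return major, rest // 10


def driver_supports(driver_encoded: int, required: Tuple[int, int]) -> bool:
    """Simplified check: driver CUDA version >= version the wheel targets.

    Assumption: ignores minor-version and forward-compatibility packages.
    """
    return decode_cuda_version(driver_encoded) >= required


driver = 12040  # value from the UserWarning in the commit message
print(decode_cuda_version(driver))       # (12, 4)
print(driver_supports(driver, (13, 0)))  # False: cu13 wheels (assumed 13.0)
print(driver_supports(driver, (12, 4)))  # True: cu124 wheels
```

Under this rule the 12.4 driver rejects the cu13 wheels but accepts cu124, which matches the behaviour the two dependency commits describe.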