infoxtractor

Author	SHA1	Message	Date
Dirk Riemann	2e8ca0ee43	feat(ui): add browser UI at /ui for job submission [CI: all checks successful; tests / test (push) in 1m43s, tests / test (pull_request) in 1m21s]. Minimal Jinja2 + HTMX + Pico CSS UI (all CDN, no build step) that lets a user drop a PDF, pick a registered use case or define one inline, tweak OCR/GenAI/provenance options, submit, and watch the pretty-JSON result come back via 2s HTMX polling. Uploads land in {tmp_dir}/ui/<uuid>.pdf via aiofiles streaming with the existing IX_FILE_MAX_BYTES cap. All submissions go through the same jobs_repo.insert_pending entry point the REST adapter uses, so no logic is duplicated. The REST surface is unchanged. Tests: tests/integration/test_ui_routes.py, 8 cases covering GET /ui, registered + custom use-case submissions (asserting the stored request carries use_case_inline for the custom path), malformed fields_json rejection, and the fragment renderer for pending vs. done. New deps pinned explicitly in pyproject.toml: jinja2, aiofiles, python-multipart (they arrive transitively via FastAPI but we own the import surface now). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 21:27:54 +02:00
Dirk Riemann	a418969251	fix(deps): pin surya-ocr ^0.17 and drop cu124 index [CI: all checks successful; tests / test (push) in 1m23s, tests / test (pull_request) in 2m23s]. Our client code imports surya.foundation (added in 0.17). The earlier cu124 torch pin forced uv to downgrade surya to 0.14.1, which doesn't have that module and depends on a transformers version that lacks QuantizedCacheConfig. Net effect: /healthz reported ocr: fail. Drop the cu124 index pin. surya 0.17.1 needs torch >= 2.7, which the default pypi torch (2.11) satisfies. The deploy host's CUDA 12.4 driver doesn't match torch 2.11's cu13 wheels, so CUDA init warns and the GPU isn't available; torch + Surya transparently fall back to CPU. Slower than GPU but correct for MVP. A host driver upgrade later will unlock GPU with no code changes. Unit suite stays green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 13:21:40 +02:00
Dirk Riemann	d90117807b	fix(deps): pin torch to the CUDA 12.4 wheel channel [CI: all checks successful; tests / test (push) in 3m21s, tests / test (pull_request) in 3m40s]. The default pypi torch (2.11 as of lockfile) ships cu13 wheels, which refuse to initialise against the deploy host's NVIDIA 12.4 driver (UserWarning: "driver on your system is too old (found version 12040)"). /healthz reported ocr: fail because Surya couldn't pick up the GPU. Use `tool.uv.sources` to route torch through PyTorch's cu124 index. That pulls torch 2.6.0+cu124 (still satisfies surya-ocr >= 0.9). Lock updated. transformers downgraded to 4.57.6, triton to 3.2.0; all compatible with surya and each other. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 13:02:26 +02:00
Dirk Riemann	57cdfd73fb	feat(scaffold): project skeleton with uv + pytest + forgejo CI [CI: some checks failed; CI / test (pull_request) failing after 4s]. - pyproject.toml: runtime deps (FastAPI, SQLAlchemy async, Pydantic, PyMuPDF, python-magic, Pillow, dateutil), dev group (pytest, pytest-asyncio, pytest-httpx, ruff, mypy), optional `ocr` extra that pulls surya-ocr + torch (kept optional so CI without GPU can run the base package). - pytest config: asyncio_mode=auto; `live` marker for tests that need a real Ollama/Surya (gated on IX_TEST_OLLAMA=1). - Single smoke test (tests/unit/test_scaffolding.py) verifies the package imports and exposes __version__, which keeps CI green until the real test modules land in later chunks. - .forgejo/workflows/ci.yml: runs ruff + pytest against a Postgres 16 service container. Explicit IX_TEST_MODE=fake keeps real-client tests out. - .env.example: every IX_* var from spec §9 with on-prem-friendly defaults. - uv.lock committed for reproducible builds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 10:36:43 +02:00
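
Commit 57cdfd73fb mentions a `live` pytest marker gated on IX_TEST_OLLAMA=1. The repository's actual conftest is not shown in this log, so the following is only a minimal sketch of how such a gate could look. The helper name `live_tests_enabled` and the skip reason are hypothetical; the marker name and the environment variable come from the commit message.

```python
import os


def live_tests_enabled(environ=None) -> bool:
    """Return True only when IX_TEST_OLLAMA is set to exactly "1".

    `environ` defaults to os.environ; passing a dict makes the check testable.
    (Hypothetical helper, not taken from the repository.)
    """
    env = os.environ if environ is None else environ
    return env.get("IX_TEST_OLLAMA") == "1"


def pytest_collection_modifyitems(config, items):
    """Standard pytest hook: skip every test marked `live` unless enabled."""
    if live_tests_enabled():
        return
    # Imported lazily so the helper above stays importable without pytest.
    import pytest

    skip_live = pytest.mark.skip(reason="set IX_TEST_OLLAMA=1 to run live tests")
    for item in items:
        if item.get_closest_marker("live") is not None:
            item.add_marker(skip_live)
```

Placed in a conftest.py, a hook like this leaves unmarked tests untouched, so a CI run without the variable still executes the fake-client suite.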

4 commits
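
Commit 2e8ca0ee43 describes uploads being streamed to {tmp_dir}/ui/<uuid>.pdf under the IX_FILE_MAX_BYTES cap. The handler itself is not part of this log, so the sketch below only illustrates the general idea of a capped, chunked write. It assumes the upload arrives as an async iterator of byte chunks, and it uses plain blocking file writes rather than aiofiles to stay within the standard library. `save_capped` and `UploadTooLarge` are hypothetical names.

```python
import asyncio
import tempfile
import uuid
from pathlib import Path
from typing import AsyncIterator


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured byte cap (hypothetical)."""


async def save_capped(
    chunks: AsyncIterator[bytes], dest_dir: Path, max_bytes: int
) -> Path:
    """Write chunks to dest_dir/<uuid>.pdf, aborting once max_bytes is exceeded.

    Assumption: max_bytes plays the role of IX_FILE_MAX_BYTES. The commit uses
    aiofiles for the writes; this sketch uses blocking writes for brevity.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / f"{uuid.uuid4()}.pdf"
    written = 0
    try:
        with target.open("wb") as fh:
            async for chunk in chunks:
                written += len(chunk)
                if written > max_bytes:
                    raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
                fh.write(chunk)
    except Exception:
        # Remove the partial file so rejected uploads leave nothing behind.
        target.unlink(missing_ok=True)
        raise
    return target
```

Counting bytes while streaming means an oversized upload is rejected as soon as the cap is crossed, without first buffering the whole file in memory.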