Commit graph

8 commits

Author SHA1 Message Date
842c4da90c chore: MVP deployed — readme, AGENTS.md status, deploy runbook filled in
All checks were successful
tests / test (push) Successful in 1m16s
tests / test (pull_request) Successful in 1m12s
First deploy done 2026-04-18. E2E extraction of the bank_statement_header
use case completes in 35 s against the live service, with 7 of 9 header
fields provenance-verified + text-agreement-green. closing_balance
asserts from spec §12 all pass.

Updates:
- README.md: status -> "MVP deployed"; worked example curl snippet;
  pointers to deployment runbook + spec + plan.
- AGENTS.md: status line updated with the live URL + date.
- pyproject.toml: version comment referencing the first deploy.
- docs/deployment.md: "First deploy" section filled in with times,
  field-level extraction result, plus a log of every small Docker/ops
  follow-up PR that had to land to make the first deploy healthy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:08:07 +02:00
c7dc40c51e fix(deploy): switch to network_mode: host — reach postgis + ollama on loopback
All checks were successful
tests / test (push) Successful in 1m12s
tests / test (pull_request) Successful in 1m10s
The shared postgis container is bound to 127.0.0.1 on the host (security
hardening, infrastructure §T12). Ollama is similarly LAN-hardened. The
previous `host.docker.internal + extra_hosts: host-gateway` approach
points at the bridge gateway IP, not loopback, so the container couldn't
reach either service.

Switch to `network_mode: host` (same pattern goldstein uses) and update
the default IX_POSTGRES_URL / IX_OLLAMA_URL to 127.0.0.1. Keep the GPU
reservation block; drop the now-meaningless ports: declaration (host mode
publishes directly).

AppConfig defaults + .env.example + test_config assertions + inline
docstring examples all follow.

Caught on fourth deploy attempt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:00:02 +02:00
5ee74f367c chore(model): switch default IX_DEFAULT_MODEL to qwen3:14b (already on host)
All checks were successful
tests / test (push) Successful in 1m52s
tests / test (pull_request) Successful in 1m45s
The home server's Ollama doesn't have gpt-oss:20b pulled; qwen3:14b is
already there and is what mammon's chat agent uses. Switching the default
now so the first deploy passes the /healthz ollama probe without an extra
`ollama pull` step. The spec lists gpt-oss:20b as a concrete example;
qwen3:14b is equally on-prem and Ollama-structured-output-compatible.

Touched: AppConfig default, BankStatementHeader Request.default_model,
.env.example, setup_server.sh ollama-list check, AGENTS.md, deployment.md,
live tests. Unit tests that hard-coded the old model string but don't
assert the default were left alone.

Also: ASCII en-dash in e2e_smoke.py Paperless-style text (ruff RUF001).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:20:23 +02:00
d0648fe01d feat(e2e): scripts/e2e_smoke.py — live deploy gate
All checks were successful
tests / test (push) Successful in 1m11s
tests / test (pull_request) Successful in 2m14s
Runs from the Mac after every `git push server main`.

Flow: starts a tiny HTTP server on the Mac's LAN IP serving
tests/fixtures/synthetic_giro.pdf → POST /jobs with bank_statement_header
+ Paperless-style texts so text_agreement has something to check against →
poll GET /jobs/{id} until terminal → assert status=done, bank_name
non-empty, closing_balance.provenance_verified=True, text_agreement=True,
elapsed < 60 s. Non-zero exit blocks the deploy.

Uses only stdlib (http.server, urllib) — no extra deps on the Mac-side,
no test framework overhead.

Task 5.4 of MVP plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:18:07 +02:00
6d1bc720b4 feat(deploy): setup_server.sh + deployment runbook
All checks were successful
tests / test (push) Successful in 1m9s
tests / test (pull_request) Successful in 1m10s
- scripts/setup_server.sh: idempotent one-shot. Creates bare repo,
  post-receive hook (which rebuilds docker compose + gates on /healthz),
  infoxtractor Postgres role + DB on the shared postgis container, .env
  (0600) from .env.example with the password substituted in, verifies
  gpt-oss:20b is pulled.
- docs/deployment.md: topology, one-time setup command, normal deploy
  workflow, rollback-via-revert pattern (never force-push main),
  operational checklists for the common /healthz degraded states.
- First deploy section reserved; filled in after Task 5.3 runs.

Task 5.2 of MVP plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:16:58 +02:00
86538ee8de Implementation plan for ix MVP
Detailed, TDD-structured plan with 5 chunks covering ~30 feature-branch
tasks from foundation scaffolding through first live deploy + E2E smoke.
Each task is one PR; pipeline core comes hermetic-first, real Surya/Ollama
clients in Chunk 4, containerization + first deploy in Chunk 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:34:30 +02:00
5e007b138d Address spec review — auth, timeouts, lifecycle, error codes
- FileRef type added so callers (mammon/Paperless) can pass Authorization
  headers alongside URLs. context.files is now list[str | FileRef].
- Job lifecycle state machine pinned down, including worker-startup sweep
  for rows stuck in 'running' after a crash.
- Explicit IX_002_000 / IX_002_001 codes for Ollama unreachable and
  structured-output schema violations, with per-call timeout
  IX_GENAI_CALL_TIMEOUT_SECONDS distinct from the per-job timeout.
- IX_000_007 code for file-fetch failures; per-file size, connect, and
  read timeouts configurable via env.
- ReliabilityStep: Literal-typed fields and None values explicitly skipped
  from provenance verification (with reason); dates parse both sides
  before ISO comparison.
- /healthz semantics pinned down (CUDA + Surya loaded; Ollama reachable
  AND model available). /metrics window is last 24h.
- (client_id, request_id) is UNIQUE in ix_jobs, matching the idempotency
  claim.
- Deploy-failure workflow uses `git revert` forward commit, not
  force-push — aligned with AGENTS.md habits.
- Dockerfile / compose require --gpus all. Pre-deploy requires
  `ollama pull gpt-oss:20b`; /healthz verifies before deploy completes.
- CI clarified: Forgejo Actions runners are GPU-less and LAN-disconnected;
  all inference is stubbed there. Real-Ollama tests behind IX_TEST_OLLAMA=1.
- Fixture redaction stance: synthetic-template PDF committed; real
  redacted fixtures live out-of-repo.
- Deferred list picks up use_case URL/Base64, callback retries,
  multi-container workers. quality_metrics retains reference-spec counters
  plus the two new MVP ones.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:28:43 +02:00
124403252d Initial design: on-prem LLM extraction microservice MVP
Establishes ix as an async, on-prem, LLM-powered structured extraction
microservice. Full reference spec stays in docs/spec-core-pipeline.md;
MVP spec (strict subset — Ollama only, Surya OCR, REST + Postgres-queue
transports in parallel, in-repo use cases, provenance-based reliability
signals) lives at docs/superpowers/specs/2026-04-18-ix-mvp-design.md.

First use case: bank_statement_header (feeds mammon's needs_parser flow).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 10:23:17 +02:00