# Deployment

On-prem deploy to `192.168.68.42`. Push-to-deploy via a bare git repo + `post-receive` hook that rebuilds the Docker Compose stack. Pattern mirrors mammon and unified_messaging.

## Topology

```
Mac (dev)
  │ git push server main
  ▼
192.168.68.42:/home/server/Public/infoxtractor/repos.git (bare)
  │ post-receive → GIT_WORK_TREE=/…/app git checkout -f main
  │ docker compose up -d --build
  │ curl /healthz (60 s gate)
  ▼
Docker container `infoxtractor` (port 8994)
  ├─ 127.0.0.1:11434 → Ollama (qwen3:14b; host-network mode)
  └─ 127.0.0.1:5431  → postgis (database `infoxtractor`; host-network mode)
```

## One-time server setup

Run **once** from the Mac. Idempotent.

```bash
export IX_POSTGRES_PASSWORD=    # set the desired password before running
./scripts/setup_server.sh
```

The script:

1. Creates `/home/server/Public/infoxtractor/repos.git` (bare) + `/home/server/Public/infoxtractor/app/` (worktree).
2. Installs the `post-receive` hook (see `scripts/setup_server.sh` for the template).
3. Creates the `infoxtractor` Postgres role + database on the shared `postgis` container.
4. Writes `/home/server/Public/infoxtractor/app/.env` (mode 0600) from `.env.example` with the password substituted in.
5. Verifies `qwen3:14b` is pulled in Ollama.
6. Prints a hint to open UFW for port 8994 on the LAN subnet if it's missing.

After the script finishes, add the deploy remote to the local repo:

```bash
git remote add server ssh://server@192.168.68.42/home/server/Public/infoxtractor/repos.git
```

## Normal deploy workflow

```bash
# after merging a feat branch into main
git push server main

# tail the server's deploy log
ssh server@192.168.68.42 "tail -f /tmp/infoxtractor-deploy.log"

# healthz gate (the post-receive hook also waits up to 60 s for this)
curl http://192.168.68.42:8994/healthz

# end-to-end smoke — this IS the real acceptance test
python scripts/e2e_smoke.py
```

If the post-receive hook exits non-zero (healthz never reaches 200), the deploy is considered failed.
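The hook's 60 s gate is a `curl` retry loop in shell; its logic can be sketched in Python for clarity. The `wait_for_healthz` name and the injectable `probe`/`clock`/`sleep` parameters are illustrative (added so the sketch is testable without a live server), not part of the actual hook:

```python
import time
import urllib.error
import urllib.request


def wait_for_healthz(url, timeout=60.0, interval=2.0,
                     probe=None, clock=time.monotonic, sleep=time.sleep):
    """Poll `url` until it answers HTTP 200 or `timeout` seconds elapse.

    `probe`, `clock`, and `sleep` are injectable for testing; the default
    probe is a plain GET that maps any connection error to None.
    """
    if probe is None:
        def probe(u):
            try:
                with urllib.request.urlopen(u, timeout=5) as resp:
                    return resp.status
            except (urllib.error.URLError, OSError):
                return None

    deadline = clock() + timeout
    while clock() < deadline:
        if probe(url) == 200:
            return True   # service is up: deploy passes the gate
        sleep(interval)
    return False          # gate timed out: hook treats the deploy as failed
```

A `False` return corresponds to the hook exiting non-zero; note that the old container is not stopped in that case, so the broken-but-running state described below is expected.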
The previous container keeps running: the hook swaps via `docker compose up -d --build`, which first builds the new image and only swaps if the build succeeds; if the new container then fails `/healthz`, it is still up but broken. Investigate with `docker compose logs --tail 200` in `${APP_DIR}` and either fix forward or revert (see below).

## Rollback

Never force-push `main`. Rollbacks happen as **forward commits** via `git revert`:

```bash
git revert HEAD   # creates a revert commit for the last change
git push forgejo main
git push server main
```

## First deploy

- **Date:** 2026-04-18
- **Commit:** `fix/ollama-extract-json` (#36, the last of several Docker/ops follow-ups after PR #27 shipped the initial Dockerfile)
- **`/healthz`:** all three probes (`postgres`, `ollama`, `ocr`) green. The first start took ~7 min on a fresh container because Surya's recognition (1.34 GB) + detection (73 MB) models download from HuggingFace on first run; subsequent rebuilds reuse the named volumes declared in `docker-compose.yml` and come up in <30 s.
- **E2E extraction:** `bank_statement_header` against `tests/fixtures/synthetic_giro.pdf` with Paperless-style texts:
  - Pipeline completes in **35 s**.
  - Extracted: `bank_name=DKB`, `account_iban=DE89370400440532013000`, `currency=EUR`, `opening_balance=1234.56`, `closing_balance=1450.22`, `statement_date=2026-03-31`, `statement_period_end=2026-03-31`, `statement_period_start=2026-03-01`, `account_type=null`.
  - Provenance: 8 / 9 leaf fields have sources; for 7 / 8 of those, `provenance_verified` and `text_agreement` are both True. `statement_period_start` shows up in the OCR but normalisation fails (dateutil picks a different interpretation of the cited day); to be chased in a follow-up.

### Docker-ops follow-ups that landed during the first deploy

All small, each merged as its own PR. In commit order after the scaffold (#27):

- **#31** `fix(docker): uv via standalone installer` — Python 3.12 on Ubuntu 22.04 drops `distutils`, which Ubuntu's pip needed. Switched to the `uv` standalone installer, which has no pip dependency.
- **#32** `fix(docker): include README.md in the uv sync COPY` — `hatchling` validates that the readme file exists when resolving the editable project install.
- **#33** `fix(compose): drop runtime: nvidia` — the deploy host's Docker daemon doesn't register a named `nvidia` runtime; `deploy.resources.devices` is sufficient and matches immich-ml.
- **#34** `fix(deploy): network_mode: host` — `postgis` is bound to `127.0.0.1` on the host (security hardening T12). `host.docker.internal` points at the bridge gateway, not loopback, so the container couldn't reach postgis. Goldstein uses the same pattern.
- **#35** `fix(deps): pin surya-ocr ^0.17` — the earlier cu124 torch pin had forced surya down to 0.14.1, which breaks our `surya.foundation` import and requires a transformers version that lacks `QuantizedCacheConfig`.
- **#36** `fix(genai): drop Ollama format flag; extract trailing JSON` — Ollama 0.11.8 segfaults on Pydantic JSON Schemas (`$ref`, `anyOf`, `pattern`), and `format="json"` terminates reasoning models (qwen3) at `{}` because their chain-of-thought isn't valid JSON. Omit the flag, inject the schema into the system prompt, and extract the outermost balanced `{…}` block from the response.
- **volumes** — named `ix_surya_cache` + `ix_hf_cache` mount `/root/.cache/datalab` + `/root/.cache/huggingface` so rebuilds don't re-download ~1.5 GB of model weights.

Production notes:

- `IX_DEFAULT_MODEL=qwen3:14b` (already pulled on the host). The spec listed `gpt-oss:20b` as a concrete example; swapped to keep the deploy on-prem without an extra `ollama pull`.
- Torch 2.11 default cu13 wheels fall back to CPU against the host's CUDA 12.4 driver, so Surya runs on CPU. Expected inference times: seconds per page. Upgrading the NVIDIA driver (or pinning a cu12-compatible torch wheel newer than 2.7) will unlock GPU with no code changes.

## E2E smoke test (`scripts/e2e_smoke.py`)

What it does (from the Mac):

1. Checks `/healthz`.
2. Starts a tiny HTTP server on the Mac's LAN IP serving `tests/fixtures/synthetic_giro.pdf`.
3. Submits a `POST /jobs` with `use_case=bank_statement_header`, the fixture URL in `context.files`, and a Paperless-style OCR text in `context.texts` (to exercise the `text_agreement` cross-check).
4. Polls `GET /jobs/{id}` every 2 s until terminal or a 120 s timeout.
5. Asserts: `status=="done"`, `bank_name` non-empty, `provenance.fields["result.closing_balance"].provenance_verified=True`, `text_agreement=True`, total elapsed `< 60 s`.

Non-zero exit means the deploy is not healthy. Roll back via `git revert HEAD`.

## Operational checklists

### After `ollama pull` on the host

The `IX_DEFAULT_MODEL` env var in the server's `.env` must match something in `ollama list`. Changing the default means:

1. Edit `/home/server/Public/infoxtractor/app/.env` → `IX_DEFAULT_MODEL=`.
2. `docker compose --project-directory /home/server/Public/infoxtractor/app restart`.
3. `curl http://192.168.68.42:8994/healthz` → confirm `ollama: ok`.

### If `/healthz` shows `ollama: degraded`

`qwen3:14b` (or the configured default) is not pulled. On the host:

```bash
ssh server@192.168.68.42 "docker exec ollama ollama pull qwen3:14b"
```

### If `/healthz` shows `ocr: fail`

Surya couldn't initialize (model missing, CUDA unavailable, OOM). The first run can be slow — models download on first call. Check container logs:

```bash
ssh server@192.168.68.42 "docker logs infoxtractor --tail 200"
```

### If the container fails to start

```bash
ssh server@192.168.68.42 "tail -100 /tmp/infoxtractor-deploy.log"
ssh server@192.168.68.42 "docker compose -f /home/server/Public/infoxtractor/app/docker-compose.yml logs --tail 200"
```

## Monitoring

- The monitoring dashboard auto-discovers via the `infrastructure.web_url` label on the container: `http://192.168.68.42:8001` → "infoxtractor" card.
- Backup opt-in via the `backup.enable=true` + `backup.type=postgres` + `backup.name=infoxtractor` labels.
The daily backup script picks up the `infoxtractor` Postgres database automatically.

## Ports

| Port | Direction | Source | Service |
|------|-----------|--------|---------|
| 8994/tcp | ALLOW | 192.168.68.0/24 | ix REST + healthz (LAN only; not publicly exposed) |

No VPS Caddy entry; no `infrastructure.docs_url` label — this is an internal service.
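For reference, the trailing-JSON extraction described under #36 above (scan the raw model response for the outermost balanced `{…}` block, skipping braces inside string literals, so reasoning text before the answer is ignored) can be sketched like this. A minimal illustration, not the shipped code; the `extract_trailing_json` name is made up:

```python
import json


def extract_trailing_json(text):
    """Return the first parseable balanced {...} block in `text`, or None.

    Braces inside JSON string literals (and escaped quotes) don't count
    toward nesting depth; candidates that fail json.loads are skipped.
    """
    start = text.find("{")
    while start != -1:
        depth, in_str, esc = 0, False, False
        for i in range(start, len(text)):
            c = text[i]
            if esc:
                esc = False            # character after a backslash
            elif in_str:
                if c == "\\":
                    esc = True
                elif c == '"':
                    in_str = False
            elif c == '"':
                in_str = True
            elif c == "{":
                depth += 1
            elif c == "}":
                depth -= 1
                if depth == 0:         # balanced block complete
                    candidate = text[start:i + 1]
                    try:
                        json.loads(candidate)
                        return candidate
                    except json.JSONDecodeError:
                        break          # not real JSON; try the next '{'
        start = text.find("{", start + 1)
    return None
```

`json.loads` only guarantees syntactic validity here; since the schema is injected into the system prompt rather than enforced by Ollama, the extracted block still needs downstream schema validation.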