Deployment

On-prem deploy to 192.168.68.42. Push-to-deploy via a bare git repo + post-receive hook that rebuilds the Docker Compose stack. Pattern mirrors mammon and unified_messaging.

Topology

Mac (dev)
  │  git push server main
  ▼
192.168.68.42:/home/server/Public/infoxtractor/repos.git   (bare)
  │  post-receive → GIT_WORK_TREE=/…/app git checkout -f main
  │                 docker compose up -d --build
  │                 curl /healthz (60 s gate)
  ▼
Docker container `infoxtractor` (port 8994)
  ├─ host.docker.internal:11434  →  Ollama (qwen3:14b)
  └─ host.docker.internal:5431   →  postgis (database `infoxtractor`)

One-time server setup

Run once from the Mac. Idempotent.

export IX_POSTGRES_PASSWORD=<generate-a-strong-one>
./scripts/setup_server.sh

The script:

  1. Creates /home/server/Public/infoxtractor/repos.git (bare) + /home/server/Public/infoxtractor/app/ (worktree).
  2. Installs the post-receive hook (see scripts/setup_server.sh for the template).
  3. Creates the infoxtractor Postgres role + database on the shared postgis container.
  4. Writes /home/server/Public/infoxtractor/app/.env (mode 0600) from .env.example with the password substituted in.
  5. Verifies qwen3:14b is pulled in Ollama.
  6. Prints a hint to open UFW for port 8994 on the LAN subnet if it's missing.
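Step 4's substitution can be sketched like this. This is a hypothetical Python equivalent of what the shell script does (the function names here are made up); the point is that the file is created with mode 0600 before any secret is written into it:

```python
import os

def render_env(example_text: str, password: str) -> str:
    """Return .env.example text with the Postgres password substituted."""
    out = []
    for line in example_text.splitlines():
        if line.startswith("IX_POSTGRES_PASSWORD="):
            line = "IX_POSTGRES_PASSWORD=" + password
        out.append(line)
    return "\n".join(out) + "\n"

def write_env(path: str, text: str) -> None:
    """Create the file 0600 first, then write, so the secret never sits
    in a world-readable file even briefly."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(text)
```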

After the script finishes, add the deploy remote to the local repo:

git remote add server ssh://server@192.168.68.42/home/server/Public/infoxtractor/repos.git

Normal deploy workflow

# after merging a feat branch into main
git push server main

# tail the server's deploy log
ssh server@192.168.68.42 "tail -f /tmp/infoxtractor-deploy.log"

# healthz gate (the post-receive hook also waits up to 60 s for this)
curl http://192.168.68.42:8994/healthz

# end-to-end smoke — this IS the real acceptance test
python scripts/e2e_smoke.py

If the post-receive hook exits non-zero (/healthz never reaches 200), the deploy is considered failed. Note the failure modes differ: docker compose up -d --build builds the new image first and only swaps containers if the build succeeds, so a failed build leaves the previous container running; but if the build succeeds and the new container then never passes /healthz, the new container is up but broken. Investigate with docker compose logs --tail 200 in ${APP_DIR} and either fix forward or revert (see below).
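The hook's gate logic boils down to the following. This is a hedged Python sketch of the shell hook's control flow (the real template lives in scripts/setup_server.sh); `run` and `probe` are injected callables so nothing here touches git, Docker, or the network:

```python
import time

def deploy(run, probe, timeout_s=60, interval_s=2):
    """Check out main, rebuild the stack, then gate on /healthz.

    run:   executes a shell command (e.g. subprocess.run in real life)
    probe: returns True once GET /healthz answers HTTP 200
    """
    run("git checkout -f main")           # into GIT_WORK_TREE=.../app
    run("docker compose up -d --build")   # swaps only if the build succeeds
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True                   # deploy healthy
        time.sleep(interval_s)
    return False                          # non-zero exit: deploy failed
```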

Rollback

Never force-push main. Rollbacks happen as forward commits via git revert:

git revert HEAD     # creates a revert commit for the last change
git push forgejo main
git push server main

First deploy

(fill in after running — timestamps, commit sha, e2e_smoke output)

  • Date: TBD
  • Commit: TBD
  • /healthz first-ok time: TBD
  • e2e_smoke.py status: TBD
  • Notes:

E2E smoke test (scripts/e2e_smoke.py)

What it does (from the Mac):

  1. Checks /healthz.
  2. Starts a tiny HTTP server on the Mac's LAN IP serving tests/fixtures/synthetic_giro.pdf.
  3. Submits a POST /jobs with use_case=bank_statement_header, the fixture URL in context.files, and a Paperless-style OCR text in context.texts (to exercise the text_agreement cross-check).
  4. Polls GET /jobs/{id} every 2 s until terminal or 120 s timeout.
  5. Asserts: status=="done", bank_name non-empty, provenance.fields["result.closing_balance"].provenance_verified=True, text_agreement=True, total elapsed < 60s.
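The poll-and-assert phase (steps 4 and 5) can be sketched as follows. GET /jobs/{id} is replaced by an injected `get_job` callable, and the exact response nesting (bank_name under result, text_agreement next to provenance_verified) is an assumption; see scripts/e2e_smoke.py for the real shape:

```python
import time

TERMINAL = {"done", "failed"}  # assumed set of terminal statuses

def poll_job(get_job, job_id, timeout_s=120, interval_s=2):
    """Poll until the job reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = get_job(job_id)
        if job["status"] in TERMINAL:
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} still pending after {timeout_s}s")

def check(job):
    """The smoke test's assertions, as listed above."""
    assert job["status"] == "done"
    assert job["result"]["bank_name"]  # non-empty; nesting assumed
    field = job["provenance"]["fields"]["result.closing_balance"]
    assert field["provenance_verified"] is True
    assert field["text_agreement"] is True  # location assumed
```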

Non-zero exit means the deploy is not healthy. Roll back via git revert HEAD.

Operational checklists

After ollama pull on the host

The IX_DEFAULT_MODEL env var in the server's .env must match a model shown by ollama list. Changing the default means:

  1. Edit /home/server/Public/infoxtractor/app/.env → IX_DEFAULT_MODEL=<new>.
  2. docker compose --project-directory /home/server/Public/infoxtractor/app restart.
  3. curl http://192.168.68.42:8994/healthz → confirm ollama: ok.
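The invariant behind this checklist (the configured default must appear in ollama list) can be checked mechanically. A sketch, assuming the first whitespace-separated token of each ollama list data row is the model name:

```python
def model_available(ollama_list_output: str, model: str) -> bool:
    """True if `model` appears in the NAME column of `ollama list` output.

    Skips the header row; assumes the model name (including tag, e.g.
    qwen3:14b) is the first token of each data row.
    """
    rows = ollama_list_output.strip().splitlines()[1:]
    return any(row.split()[0] == model for row in rows if row.strip())
```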

If /healthz shows ollama: degraded

qwen3:14b (or the configured default) is not pulled. On the host:

ssh server@192.168.68.42 "docker exec ollama ollama pull qwen3:14b"

If /healthz shows ocr: fail

Surya couldn't initialize (model missing, CUDA unavailable, OOM). First run can be slow — models download on first call. Check container logs:

ssh server@192.168.68.42 "docker logs infoxtractor --tail 200"

If the container fails to start

ssh server@192.168.68.42 "tail -100 /tmp/infoxtractor-deploy.log"
ssh server@192.168.68.42 "docker compose -f /home/server/Public/infoxtractor/app/docker-compose.yml logs --tail 200"

Monitoring

  • Monitoring dashboard auto-discovers via the infrastructure.web_url label on the container: http://192.168.68.42:8001 → "infoxtractor" card.
  • Backup opt-in via backup.enable=true + backup.type=postgres + backup.name=infoxtractor labels. The daily backup script picks up the infoxtractor Postgres database automatically.
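For reference, those labels might look like this in docker-compose.yml. This is a hypothetical fragment: only the label keys named above are grounded in this doc, and the value formats are assumptions.

```yaml
services:
  infoxtractor:
    labels:
      infrastructure.web_url: "http://192.168.68.42:8994"  # dashboard card
      backup.enable: "true"       # opt in to the daily backup script
      backup.type: "postgres"
      backup.name: "infoxtractor"
```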

Ports

| Port     | Direction | Source          | Service                                            |
|----------|-----------|-----------------|----------------------------------------------------|
| 8994/tcp | ALLOW     | 192.168.68.0/24 | ix REST + healthz (LAN only; not publicly exposed) |

No VPS Caddy entry; no infrastructure.docs_url label — this is an internal service.