Deployment
On-prem deploy to 192.168.68.42. Push-to-deploy via a bare git repo + post-receive hook that rebuilds the Docker Compose stack. Pattern mirrors mammon and unified_messaging.
Topology
Mac (dev)
│ git push server main
▼
192.168.68.42:/home/server/Public/infoxtractor/repos.git (bare)
│ post-receive → GIT_WORK_TREE=/…/app git checkout -f main
│ docker compose up -d --build
│ curl /healthz (60 s gate)
▼
Docker container `infoxtractor` (port 8994)
├─ host.docker.internal:11434 → Ollama (qwen3:14b)
└─ host.docker.internal:5431 → postgis (database `infoxtractor`)
One-time server setup
Run once from the Mac. Idempotent.
export IX_POSTGRES_PASSWORD=<generate-a-strong-one>
./scripts/setup_server.sh
The script:
- Creates /home/server/Public/infoxtractor/repos.git (bare) + /home/server/Public/infoxtractor/app/ (worktree).
- Installs the post-receive hook (see scripts/setup_server.sh for the template).
- Creates the infoxtractor Postgres role + database on the shared postgis container.
- Writes /home/server/Public/infoxtractor/app/.env (mode 0600) from .env.example with the password substituted in.
- Verifies qwen3:14b is pulled in Ollama.
- Prints a hint to open UFW for port 8994 on the LAN subnet if it's missing.
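The .env step is the one that handles a secret, so here is a minimal Python sketch of the idea: copy a template, substitute the password, and create the file as 0600. The real script is shell and may do this differently. The function name `write_env_file` and the assumption that the template holds a line starting with `IX_POSTGRES_PASSWORD=` are illustrative, not taken from scripts/setup_server.sh.

```python
import os
from pathlib import Path


def write_env_file(template: Path, target: Path, password: str,
                   key: str = "IX_POSTGRES_PASSWORD") -> None:
    """Copy template to target with key=<password>, leaving target at mode 0600.

    `key` is an assumed variable name; adjust it to whatever .env.example uses.
    """
    lines = []
    replaced = False
    for line in template.read_text().splitlines():
        if line.startswith(f"{key}="):
            line = f"{key}={password}"
            replaced = True
        lines.append(line)
    if not replaced:
        # Template had no such line: append one instead of silently skipping.
        lines.append(f"{key}={password}")

    # Create with 0600 up front so the secret is never briefly world-readable.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write("\n".join(lines) + "\n")
    # A pre-existing file keeps its old mode, so tighten it explicitly too.
    os.chmod(target, 0o600)
```

Running it twice with the same inputs produces the same file, which matches the "Idempotent" note above.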
After the script finishes, add the deploy remote to the local repo:
git remote add server ssh://server@192.168.68.42/home/server/Public/infoxtractor/repos.git
Normal deploy workflow
# after merging a feat branch into main
git push server main
# tail the server's deploy log
ssh server@192.168.68.42 "tail -f /tmp/infoxtractor-deploy.log"
# healthz gate (the post-receive hook also waits up to 60 s for this)
curl http://192.168.68.42:8994/healthz
# end-to-end smoke — this IS the real acceptance test
python scripts/e2e_smoke.py
If the post-receive hook exits non-zero (healthz never reaches 200), the deploy is considered failed. What is left running depends on where it failed: docker compose up -d --build builds the new image first and only swaps containers if the build succeeds, so a failed build leaves the previous container running. If the build succeeds but the new container fails /healthz, the new container is up but broken. Investigate with docker compose logs --tail 200 in ${APP_DIR} and either fix forward or revert (see below).
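The 60 s healthz gate can also be reproduced from the Mac. The following is a rough Python sketch of such a gate, not the hook itself (which is a shell script). The 2 s retry interval and the names `http_status` and `wait_for_healthz` are assumptions made for illustration.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_status(url, timeout_s=5.0):
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, just not with a 2xx
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return 0  # refused, timed out, or malformed response


def wait_for_healthz(url, deadline_s=60.0, interval_s=2.0,
                     probe=http_status, sleep=time.sleep, clock=time.monotonic):
    """Poll url until it returns 200 (True) or deadline_s has elapsed (False)."""
    start = clock()
    while True:
        if probe(url) == 200:
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)
```

For example, `wait_for_healthz("http://192.168.68.42:8994/healthz")` returns False if the service never answers 200 within a minute, which corresponds to the failed-deploy case described above.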
Rollback
Never force-push main. Rollbacks happen as forward commits via git revert:
git revert HEAD # creates a revert commit for the last change
git push forgejo main
git push server main
First deploy
(fill in after running — timestamps, commit sha, e2e_smoke output)
- Date: TBD
- Commit: TBD
- /healthz first-ok time: TBD
- e2e_smoke.py status: TBD
- Notes: none
E2E smoke test (scripts/e2e_smoke.py)
What it does (from the Mac):
- Checks /healthz.
- Starts a tiny HTTP server on the Mac's LAN IP serving tests/fixtures/synthetic_giro.pdf.
- Submits a POST /jobs with use_case=bank_statement_header, the fixture URL in context.files, and a Paperless-style OCR text in context.texts (to exercise the text_agreement cross-check).
- Polls GET /jobs/{id} every 2 s until terminal or 120 s timeout.
- Asserts: status=="done", bank_name non-empty, provenance.fields["result.closing_balance"].provenance_verified=True, text_agreement=True, total elapsed < 60 s.
Non-zero exit means the deploy is not healthy. Roll back via git revert HEAD.
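The poll-and-assert steps can be sketched as follows. This is not the contents of scripts/e2e_smoke.py: the JSON layout (top-level `result` and `provenance` keys, with `text_agreement` stored next to `provenance_verified`), the `"error"` terminal status, and the helper names are assumptions for illustration. `fetch_job` stands in for whatever performs GET /jobs/{id} and returns the decoded body.

```python
import time

# "done" comes from the checklist above; "error" is an assumed second terminal state.
TERMINAL_STATUSES = {"done", "error"}


def poll_job(fetch_job, interval_s=2.0, timeout_s=120.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call fetch_job() until the job is terminal; return (job, elapsed seconds)."""
    start = clock()
    while True:
        job = fetch_job()
        elapsed = clock() - start
        if job.get("status") in TERMINAL_STATUSES:
            return job, elapsed
        if elapsed >= timeout_s:
            raise TimeoutError(f"job not terminal after {timeout_s:.0f} s")
        sleep(interval_s)


def failed_checks(job, elapsed_s, max_elapsed_s=60.0):
    """Return the names of failed acceptance checks (empty list means healthy)."""
    # Assumed layout: result and provenance sit at the top level of the job body.
    result = job.get("result") or {}
    fields = (job.get("provenance") or {}).get("fields") or {}
    balance = fields.get("result.closing_balance") or {}
    checks = {
        "status": job.get("status") == "done",
        "bank_name": bool(result.get("bank_name")),
        "provenance_verified": balance.get("provenance_verified") is True,
        "text_agreement": balance.get("text_agreement") is True,
        "elapsed": elapsed_s < max_elapsed_s,
    }
    return [name for name, ok in checks.items() if not ok]
```

Returning the list of failed check names, rather than stopping at the first failed assert, shows every broken field in a single run. A script built on this would exit non-zero whenever the list is non-empty.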
Operational checklists
After ollama pull on the host
The IX_DEFAULT_MODEL env var on the server's .env must match something in ollama list. Changing the default means:
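- Edit /home/server/Public/infoxtractor/app/.env → IX_DEFAULT_MODEL=<new>.
- docker compose --project-directory /home/server/Public/infoxtractor/app restart. (If the container receives .env through Compose's env_file or environment settings, run up -d instead: restart does not recreate the container, so it keeps the old environment.)
- curl http://192.168.68.42:8994/healthz → confirm ollama: ok.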
If /healthz shows ollama: degraded
qwen3:14b (or the configured default) is not pulled. On the host:
ssh server@192.168.68.42 "docker exec ollama ollama pull qwen3:14b"
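To confirm the pull worked without going through /healthz, the model list can be read straight from Ollama's HTTP API, where GET /api/tags lists the locally available models. The sketch below assumes Ollama is reachable on port 11434 as in the topology. Whether the service's own /healthz probe uses this endpoint is not stated here, and the helper names are illustrative.

```python
import json
import urllib.request


def fetch_tags(base_url="http://localhost:11434", timeout_s=5.0):
    """Return the raw JSON body of Ollama's GET /api/tags (the pulled-model list)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")


def model_is_pulled(tags_json, model):
    """True if `model` appears by exact name in an /api/tags response body."""
    models = json.loads(tags_json).get("models", [])
    return any(entry.get("name") == model for entry in models)
```

On the host, `model_is_pulled(fetch_tags(), "qwen3:14b")` should return True after the pull. From inside the container, the base URL would be http://host.docker.internal:11434 instead.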
If /healthz shows ocr: fail
Surya couldn't initialize (model missing, CUDA unavailable, OOM). First run can be slow — models download on first call. Check container logs:
ssh server@192.168.68.42 "docker logs infoxtractor --tail 200"
If the container fails to start
ssh server@192.168.68.42 "tail -100 /tmp/infoxtractor-deploy.log"
ssh server@192.168.68.42 "docker compose -f /home/server/Public/infoxtractor/app/docker-compose.yml logs --tail 200"
Monitoring
- Monitoring dashboard auto-discovers via the infrastructure.web_url label on the container: http://192.168.68.42:8001 → "infoxtractor" card.
- Backup opt-in via backup.enable=true + backup.type=postgres + backup.name=infoxtractor labels. The daily backup script picks up the infoxtractor Postgres database automatically.
Ports
| Port | Action | Source | Service |
|---|---|---|---|
| 8994/tcp | ALLOW | 192.168.68.0/24 | ix REST + healthz (LAN only; not publicly exposed) |
No VPS Caddy entry; no infrastructure.docs_url label — this is an internal service.