fix(genai): send format="json" (loose mode) to Ollama

Ollama 0.11.8 segfaults on any Pydantic-shaped structured-output schema
with $ref, anyOf, or pattern. Confirmed on the deploy host with the
simplest MVP case (BankStatementHeader alone). The earlier null-stripping
sanitiser wasn't enough.

Switch to format="json", which is "emit valid JSON" mode. We're already
describing the exact JSON shape in the system prompt (via GenAIStep + the
use case's citation instruction appendix) and validating the response body
through Pydantic on parse, which raises IX_002_001 on schema mismatch,
exactly as before.

Stronger guarantees can come back later via a newer Ollama, an API fix, or
a different GenAIClient impl. None of that is needed for the MVP to work
end to end.

Unit tests: the sanitiser is left in place (harmless, still tested). The
"happy path" test now asserts format == "json".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent f6ce97d7fd
commit 2efc4d1088
2 changed files with 20 additions and 11 deletions
@@ -159,7 +159,21 @@ class OllamaClient:
         request_kwargs: dict[str, Any],
         response_schema: type[BaseModel],
     ) -> dict[str, Any]:
-        """Map provider-neutral kwargs to Ollama's /api/chat body."""
+        """Map provider-neutral kwargs to Ollama's /api/chat body.
+
+        Schema strategy for Ollama 0.11.8: we pass ``format="json"`` (loose
+        JSON mode) rather than the full Pydantic schema. The llama.cpp
+        structured-output implementation in 0.11.8 segfaults on schemas
+        involving ``anyOf``, ``$ref``, or ``pattern`` — which Pydantic v2
+        emits for Optional / nested-model / Decimal fields.
+
+        In loose JSON mode Ollama guarantees only syntactically-valid
+        JSON; we enforce the schema on our side by catching the Pydantic
+        ``ValidationError`` at parse time and raising IX_002_001. The
+        system prompt (built upstream in GenAIStep) already tells the
+        model what JSON shape to emit, so loose mode is the right
+        abstraction layer here.
+        """
 
         messages = self._translate_messages(
             list(request_kwargs.get("messages") or [])
@@ -168,9 +182,7 @@ class OllamaClient:
             "model": request_kwargs.get("model"),
             "messages": messages,
             "stream": False,
-            "format": _sanitise_schema_for_ollama(
-                response_schema.model_json_schema()
-            ),
+            "format": "json",
         }
 
         options: dict[str, Any] = {}

@@ -79,13 +79,10 @@ class TestInvokeHappyPath:
         body_json = json.loads(body)
         assert body_json["model"] == "gpt-oss:20b"
         assert body_json["stream"] is False
-        # Format is the pydantic schema with Optional `anyOf [T, null]`
-        # patterns collapsed to just T — Ollama 0.11.8 segfaults on the
-        # anyOf+null shape, so we sanitise before sending.
-        fmt = body_json["format"]
-        assert fmt["properties"]["bank_name"] == {"title": "Bank Name", "type": "string"}
-        assert fmt["properties"]["account_number"]["type"] == "string"
-        assert "anyOf" not in fmt["properties"]["account_number"]
+        # format is "json" (loose mode): Ollama 0.11.8 segfaults on full
+        # Pydantic schemas. We pass the schema via the system prompt
+        # upstream and validate on parse.
+        assert body_json["format"] == "json"
         assert body_json["options"]["temperature"] == 0.2
         assert "reasoning_effort" not in body_json
         assert body_json["messages"] == [
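
Note: a minimal sketch of the flow this commit describes: request loose JSON mode, then enforce the shape on the client side when parsing. It is not the project's code. The names build_chat_body, parse_header and SchemaMismatchError are hypothetical, the field types of BankStatementHeader are assumed from the test above, and a hand-written check stands in for the Pydantic validation the commit relies on so that the sketch has no third-party dependencies.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


class SchemaMismatchError(Exception):
    """Hypothetical stand-in for the project's IX_002_001 error."""

    code = "IX_002_001"


@dataclass
class BankStatementHeader:
    # Field names follow the test in this commit; the types are assumptions.
    bank_name: str
    account_number: Optional[str] = None


def build_chat_body(model: str, messages: list[dict[str, str]]) -> dict[str, Any]:
    """Build a chat body that asks only for syntactically valid JSON."""
    return {
        "model": model,
        "messages": messages,
        "stream": False,
        "format": "json",  # loose mode: no schema is sent to the server
    }


def parse_header(raw: str) -> BankStatementHeader:
    """Validate a loose-mode reply locally (the real code uses Pydantic)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaMismatchError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise SchemaMismatchError("expected a JSON object")

    bank_name = data.get("bank_name")
    account_number = data.get("account_number")
    if not isinstance(bank_name, str):
        raise SchemaMismatchError("bank_name must be a string")
    if account_number is not None and not isinstance(account_number, str):
        raise SchemaMismatchError("account_number must be a string or null")
    return BankStatementHeader(bank_name=bank_name, account_number=account_number)
```

The point of the split is that the server only has to produce parseable JSON, while every shape guarantee lives in parse_header, so a reply with a missing or mistyped field still surfaces as the same error code as before.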