First use case lands.

The schema is intentionally flat (nine scalar fields, no nested arrays) because Ollama's structured output stays most reliable when the top level has only scalars, and every field we care about (bank_name, IBAN, period, opening/closing balance) can be rendered as one.

Registration is explicit in `use_cases/__init__.py`, not a side effect of importing the use-case module. That keeps load order obvious and lets tests patch the registry without reloading modules.

`get_use_case(name)` is the one-liner adapters use; it raises `IX_001_001` with the offending name in `detail` when the lookup misses, which keeps log scraping simple.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
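The registration and lookup behaviour described in this message could look roughly like the sketch below. Only `get_use_case`, the `IX_001_001` code and the `detail` field come from the message; `IXError`, `UseCase`, `REGISTRY`, `register_use_case` and the two placeholder classes are hypothetical names chosen for illustration, not the project's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass
from unittest import mock


class IXError(Exception):
    """Hypothetical error type: a stable code plus a free-form detail."""

    def __init__(self, code: str, detail: str) -> None:
        super().__init__(f"{code}: {detail}")
        self.code = code
        self.detail = detail


@dataclass(frozen=True)
class UseCase:
    """Hypothetical registry entry pairing the two models of a use case."""

    request_cls: type
    response_cls: type


# Plain module-level dict, so tests can patch it without reloading modules.
REGISTRY: dict[str, UseCase] = {}


def register_use_case(name: str, request_cls: type, response_cls: type) -> None:
    """Explicit registration, called from the package __init__ (assumed)."""
    REGISTRY[name] = UseCase(request_cls, response_cls)


def get_use_case(name: str) -> UseCase:
    """Look up a use case; a miss raises IX_001_001 with the name in detail."""
    try:
        return REGISTRY[name]
    except KeyError:
        raise IXError("IX_001_001", detail=name) from None


# Placeholders standing in for the real Pydantic models.
class Request: ...
class BankStatementHeader: ...


# What use_cases/__init__.py might do at import time.
register_use_case("bank_statement_header", Request, BankStatementHeader)

# A test can swap in a fake entry; patch.dict restores REGISTRY on exit.
with mock.patch.dict(REGISTRY, {"fake": UseCase(object, object)}):
    assert get_use_case("fake").request_cls is object
```

Because the offending name travels in `detail` instead of being buried in a formatted message, a log scraper can match on the code and read the name from one field.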
53 lines
2.1 KiB
Python
"""`bank_statement_header` — first (and, for MVP, only) use case.

Shape mirrors spec §7. The module defines the pair of Pydantic models
(``Request`` = prompt/model config, ``BankStatementHeader`` = extraction
schema) without registering itself — registration happens in
:mod:`ix.use_cases` so import-time side effects stay out.

All header fields are ``Optional`` except ``bank_name`` and ``currency``;
the spec lets every other field be null when the document doesn't show it.
The flat (no-nested-list) schema is chosen because Ollama's structured
output stays most reliable when the top level contains only scalars.
"""

from __future__ import annotations

from datetime import date
from decimal import Decimal
from typing import Literal

from pydantic import BaseModel, ConfigDict


class Request(BaseModel):
    """Prompt + default-model config for this use case."""

    model_config = ConfigDict(extra="forbid")

    use_case_name: str = "Bank Statement Header"
    default_model: str = "gpt-oss:20b"
    system_prompt: str = (
        "You extract header metadata from a single bank or credit-card statement. "
        "Return only facts that appear in the document; leave a field null if uncertain. "
        "Balances must use the document's numeric format (e.g. '1234.56' or '-123.45'); "
        "do not invent a currency symbol. Account type: 'checking' for current/Giro accounts, "
        "'credit' for credit-card statements, 'savings' otherwise. Always return the IBAN "
        "with spaces removed. Never fabricate a value to fill a required-looking field."
    )


class BankStatementHeader(BaseModel):
    """Extraction schema for the bank-statement header fields."""

    model_config = ConfigDict(extra="forbid")

    bank_name: str
    account_iban: str | None = None
    account_type: Literal["checking", "credit", "savings"] | None = None
    currency: str
    statement_date: date | None = None
    statement_period_start: date | None = None
    statement_period_end: date | None = None
    opening_balance: Decimal | None = None
    closing_balance: Decimal | None = None
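A minimal usage sketch of how the extraction schema behaves when fed model output, assuming Pydantic v2. The class below is a trimmed copy of `BankStatementHeader`, repeated only so the snippet runs on its own; the JSON payloads and the bank name are made up for illustration.

```python
from datetime import date
from decimal import Decimal
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class BankStatementHeader(BaseModel):
    """Trimmed copy of the schema above (assumption: Pydantic v2)."""

    model_config = ConfigDict(extra="forbid")

    bank_name: str
    account_iban: Optional[str] = None
    account_type: Optional[Literal["checking", "credit", "savings"]] = None
    currency: str
    statement_period_end: Optional[date] = None
    closing_balance: Optional[Decimal] = None


# Hypothetical model output: required fields plus two optional scalars.
raw = (
    '{"bank_name": "Example Bank", "currency": "EUR", '
    '"statement_period_end": "2024-01-31", "closing_balance": "1234.56"}'
)
header = BankStatementHeader.model_validate_json(raw)

# Fields the document did not show simply stay None.
missing_iban = header.account_iban is None

# extra="forbid" rejects keys outside the schema, such as a nested list.
try:
    BankStatementHeader.model_validate_json(
        '{"bank_name": "Example Bank", "currency": "EUR", "transactions": []}'
    )
    rejected_extra = False
except ValidationError:
    rejected_extra = True
```

Typing the balance as `Decimal` and the period end as `date` means the plain strings in the JSON are parsed once at the validation boundary, so downstream code does not have to re-parse them.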