
26. AI provider strategy — OpenAI primary, Anthropic fallback

Date: 2026-06-07

Status

Accepted

Context

Several planned features use an LLM: contract field extraction, check/remittance OCR, and the daily summary card. These run server-side over documents that contain sensitive PII (SSNs, bank and tax identifiers), so provider choice carries data-handling weight, not just cost/quality.

We need a single, swappable place for "call the model" so that the app is not coupled to one vendor's SDK, and so that an outage or a per-request failure on one provider does not take the feature down. No AI code exists yet; this ADR records the intended shape before the adapter is built, so that the first implementation follows it.

Decision

  1. Two providers behind one interface. A provider-agnostic boundary in src/server/ai/ exposes the operations the app needs (e.g. structured extraction, OCR-to-fields). Callers depend on that interface, never on openai or @anthropic-ai/sdk directly.
  2. OpenAI is primary; Anthropic Claude is the fallback. A request tries OpenAI first; on a retryable error (timeout, 5xx, rate limit, malformed/refused output) it falls back to Anthropic. Both keys live server-only in env (OPENAI_API_KEY, ANTHROPIC_API_KEY); neither uses the NEXT_PUBLIC_ prefix, so neither is exposed to the browser bundle.
  3. Keys validated lazily. src/server/env.ts does not require the AI keys at boot (the app can run without AI configured); the adapter validates the key it needs when first called and returns a Result error if absent, consistent with the project's other integration adapters.
  4. Fallback is observable. When the app falls back, it logs the reason (via src/lib/log.ts, never the document contents) so silent primary-provider degradation is visible.
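
The four decisions above could fit together roughly as follows. This is a minimal sketch, not the adapter itself (no AI code exists yet): the names AiProvider, FailureKind, FallbackLog, extractWithFallback, and the "invalid_request" kind are hypothetical, and the sketch assumes that a missing key is reported as a Result error rather than treated as a retryable failure, which the ADR does not specify.

```typescript
// Hypothetical sketch: every name below is illustrative, not existing project code.
type FailureKind = "timeout" | "server_error" | "rate_limit" | "bad_output" | "invalid_request";

type Result<T> =
  | { ok: true; value: T }
  | { ok: false; error: string; kind?: FailureKind };

// Retryable failures per decision 2: timeout, 5xx, rate limit, malformed/refused output.
const RETRYABLE: Record<FailureKind, boolean> = {
  timeout: true,
  server_error: true,
  rate_limit: true,
  bad_output: true,
  invalid_request: false, // assumed non-retryable: a second provider would not fix it
};

// Decision 1: callers see only this interface, never a vendor SDK.
interface AiProvider {
  name: string;
  envKey: string; // e.g. "OPENAI_API_KEY"
  extractFields(documentText: string): Promise<Result<string>>;
}

// Decision 4: the log event carries provider names and a reason, never document contents.
type FallbackLog = (event: { from: string; to: string; reason: FailureKind }) => void;

async function extractWithFallback(
  providers: AiProvider[], // priority order: primary first, fallback second
  documentText: string,
  env: Record<string, string | undefined>,
  log: FallbackLog,
): Promise<Result<string>> {
  let last: Result<string> = { ok: false, error: "no providers configured" };
  for (let i = 0; i < providers.length; i++) {
    const provider = providers[i];
    // Decision 3: the key is checked only when the provider is about to be called.
    if (!env[provider.envKey]) {
      return { ok: false, error: `${provider.envKey} is not set` };
    }
    last = await provider.extractFields(documentText);
    if (last.ok) return last;
    const next = providers[i + 1];
    // Stop on a non-retryable failure or when there is no provider left to try.
    if (last.kind === undefined || !RETRYABLE[last.kind] || next === undefined) return last;
    log({ from: provider.name, to: next.name, reason: last.kind });
  }
  return last;
}
```

Under this shape the primary/fallback order is simply the order of the providers array, so swapping the two providers means reordering one array literal at the call site.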

Consequences

  • Feature availability survives a single-provider incident; the cost is maintaining two prompt/parse paths whose outputs must satisfy the same schema (extraction results are Zod-validated at the boundary regardless of which provider produced them).
  • This deviates from the repo's default lean toward Claude for AI work; the divergence is deliberate (operator preference) and isolated to the src/server/ai/ boundary, so reversing the primary/fallback order is a one-line change, not a refactor.
  • PII / data terms are a precondition, not an afterthought. Real documents must only be sent to a provider account covered by a zero-retention / DPA arrangement. This applies to whichever provider is primary and equally to the fallback, because a fallback request sends the same document to the second provider; confirm OpenAI's account terms (and Anthropic's) before sending live payroll/PII documents.
  • Prompts and output schemas are defined once per operation and shared across providers where possible, to keep extraction results consistent no matter which model answered.
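
"Defined once per operation and shared across providers" could look like the sketch below: a single operation object holds the prompt and the output check, and every provider's raw response passes through the same check. The names Operation, ContractFields, extractContractFields, and the two example fields are hypothetical. The ADR calls for Zod validation at the boundary; a hand-written parser stands in for the Zod schema here only to keep the sketch dependency-free.

```typescript
// Hypothetical sketch: field names and types are illustrative, not the real schema.
interface ContractFields {
  counterparty: string;
  effectiveDate: string; // assumed format: YYYY-MM-DD
}

// One definition per operation, reused by every provider adapter.
interface Operation<T> {
  prompt: string;
  parse(raw: string): T | null; // null means malformed output
}

const extractContractFields: Operation<ContractFields> = {
  prompt: "Return JSON with keys counterparty and effectiveDate (YYYY-MM-DD).",
  parse(raw) {
    let data: unknown;
    try {
      data = JSON.parse(raw);
    } catch {
      return null; // not JSON at all
    }
    if (typeof data !== "object" || data === null) return null;
    const { counterparty, effectiveDate } = data as Record<string, unknown>;
    if (typeof counterparty !== "string" || typeof effectiveDate !== "string") return null;
    if (!/^\d{4}-\d{2}-\d{2}$/.test(effectiveDate)) return null;
    // Only the validated fields cross the boundary; extra keys are dropped.
    return { counterparty, effectiveDate };
  },
};
```

A null from parse is the "malformed output" case, which the decision above lists as a retryable error, so a provider adapter could map it to a fallback rather than returning bad data to the caller.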