Providers¶
Provider Reference¶
Provider Constants¶
| Constant | String Value | API Key Required |
|---|---|---|
| chat.ProviderOpenAI | "openai" | Yes: OPENAI_API_KEY |
| chat.ProviderClaude | "claude" | Yes: ANTHROPIC_API_KEY |
| chat.ProviderGemini | "gemini" | Yes: GEMINI_API_KEY |
| chat.ProviderClaudeLocal | "claude-local" | No: uses local claude binary |
| chat.ProviderOpenAICompatible | "openai-compatible" | Backend-dependent (set via Token) |
The default provider when Config.Provider is empty (and AI_PROVIDER env var is not set) is ProviderOpenAI.
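The resolution order can be sketched as below. This is a minimal illustration, not the package's code: resolveProvider is a hypothetical helper, and the assumption that Config.Provider takes precedence over AI_PROVIDER is implied by the sentence above rather than stated.

```go
package main

import (
	"fmt"
	"os"
)

// ProviderOpenAI mirrors the string value documented for chat.ProviderOpenAI.
const ProviderOpenAI = "openai"

// resolveProvider is a hypothetical helper, not part of the chat package.
// Assumed order: explicit config value, then AI_PROVIDER, then the default.
func resolveProvider(configured string, getenv func(string) string) string {
	if configured != "" {
		return configured
	}
	if env := getenv("AI_PROVIDER"); env != "" {
		return env
	}
	return ProviderOpenAI
}

func main() {
	// With no config value the result depends on the AI_PROVIDER variable.
	fmt.Println(resolveProvider("", os.Getenv))
	// An explicit config value is returned unchanged.
	fmt.Println(resolveProvider("gemini", os.Getenv))
}
```

Passing the environment lookup as a function keeps the helper testable without mutating the process environment.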
Capability Comparison¶
| Provider | Tool Calling | Parallel Tools | Structured Output | Streaming | Notes |
|---|---|---|---|---|---|
| OpenAI | ✓ | ✓ | ✓ JSON Schema | ✓ | |
| Claude | ✓ | ✓ | ✓ Tool-based | ✓ | |
| Gemini | ✓ | ✓ | ✓ JSON Schema | ✓ | |
| Claude Local | ✗ | ✗ | ✓ --json-schema | ✗ | MCP tool support planned |
| OpenAI-Compatible | ✓ | ✓ | ✓ JSON Schema | ✓ | Backend-dependent |
ProviderClaudeLocal¶
ProviderClaudeLocal routes requests through the locally installed claude CLI binary instead of the API. This is valuable in environments where direct outbound HTTPS to api.anthropic.com is blocked but the pre-authenticated claude binary is permitted.
Requirements:
- claude binary installed and authenticated (claude login)
- Binary must be in PATH
- No Token or API key needed
client, err := chat.New(ctx, p, chat.Config{
	Provider:     chat.ProviderClaudeLocal,
	Model:        "claude-sonnet-4-6", // optional; uses claude's default if empty
	SystemPrompt: "You are a helpful assistant.",
})
Multi-turn continuity is maintained via session IDs captured from the CLI's JSON output and passed via --resume on subsequent calls.
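A rough sketch of that session handling follows. Only --resume comes from the text above; the -p and --output-format json arguments, the session_id field name, and both helper functions are assumptions made for illustration, and the provider's real invocation may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// cliResult holds the one field this sketch needs from the CLI's JSON output.
// The field name "session_id" is an assumption.
type cliResult struct {
	SessionID string `json:"session_id"`
}

// parseSessionID extracts the session ID from one JSON response (hypothetical helper).
func parseSessionID(out []byte) (string, error) {
	var r cliResult
	if err := json.Unmarshal(out, &r); err != nil {
		return "", fmt.Errorf("decode CLI output: %w", err)
	}
	if r.SessionID == "" {
		return "", errors.New("CLI output carried no session ID")
	}
	return r.SessionID, nil
}

// buildArgs assembles the arguments for one turn (hypothetical helper).
// "-p" and "--output-format json" are assumed; "--resume" is added only
// once an earlier turn has produced a session ID.
func buildArgs(prompt, sessionID string) []string {
	args := []string{"-p", prompt, "--output-format", "json"}
	if sessionID != "" {
		args = append(args, "--resume", sessionID)
	}
	return args
}

func main() {
	// Sample output from a first turn; the payload is invented for the demo.
	id, err := parseSessionID([]byte(`{"session_id":"abc-123","result":"hi"}`))
	if err != nil {
		panic(err)
	}
	// Arguments for the second turn carry the captured session ID.
	fmt.Println(buildArgs("and then?", id))
}
```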
ProviderOpenAICompatible¶
Use ProviderOpenAICompatible to target any backend that exposes an OpenAI-compatible API, including Ollama, Groq, Fireworks AI, Together AI, LM Studio, and vLLM.
Requirements:
- BaseURL must be set in Config
- Model must be set (no default — model names are backend-specific)
// Ollama (local)
client, err := chat.New(ctx, p, chat.Config{
	Provider: chat.ProviderOpenAICompatible,
	BaseURL:  "http://localhost:11434/v1",
	Model:    "llama3.2",
	Token:    "ollama", // Ollama ignores the token; any non-empty value works
})

// Groq (cloud)
client, err := chat.New(ctx, p, chat.Config{
	Provider: chat.ProviderOpenAICompatible,
	BaseURL:  "https://api.groq.com/openai/v1",
	Model:    "llama-3.3-70b-versatile",
	Token:    os.Getenv("GROQ_API_KEY"),
})
Token chunking falls back to cl100k_base encoding for model names not recognised by the tokenizer, so Ollama and other non-OpenAI model names are handled gracefully.
Provider Registry¶
The provider registry is open for extension. Register a custom provider from any package:
// mypackage/provider.go
func init() {
	chat.RegisterProvider("my-backend", newMyBackend)
}

func newMyBackend(ctx context.Context, p *props.Props, cfg chat.Config) (chat.ChatClient, error) {
	return &MyBackendClient{token: cfg.Token, baseURL: cfg.BaseURL}, nil
}
After importing your package, chat.New(ctx, p, chat.Config{Provider: "my-backend"}) routes to your factory.
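For readers unfamiliar with the pattern, here is a self-contained sketch of how a name-to-factory registry of this kind can work. It is not the chat package's implementation: Client, Config, Factory, Register, and New are simplified stand-ins, and the context and props parameters are omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// Client and Config are simplified stand-ins for chat.ChatClient and chat.Config.
type Client interface{ Name() string }

type Config struct{ Provider, Token, BaseURL string }

// Factory builds a client from a config (ctx and props omitted for brevity).
type Factory func(cfg Config) (Client, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register stores a factory under a provider name; safe to call from init.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// New looks up the factory named by cfg.Provider and invokes it.
func New(cfg Config) (Client, error) {
	mu.RLock()
	f, ok := factories[cfg.Provider]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", cfg.Provider)
	}
	return f(cfg)
}

type myBackend struct{ baseURL string }

func (m *myBackend) Name() string { return "my-backend@" + m.baseURL }

func init() {
	Register("my-backend", func(cfg Config) (Client, error) {
		return &myBackend{baseURL: cfg.BaseURL}, nil
	})
}

func main() {
	c, err := New(Config{Provider: "my-backend", BaseURL: "http://localhost:9000"})
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Name())
}
```

Registering from init means that importing the package is enough to make the provider name resolvable, which matches the behaviour described above.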
Cross-provider fallback & routing¶
A single provider can rate-limit (HTTP 429), suffer an outage (5xx), or be
unreachable. chat.NewFallback wraps an ordered list of clients and, on a
retryable failure from the active one, transparently advances to the next. The
composite is a ChatClient (and a StreamingChatClient iff every wrapped
client streams), so callers are unchanged.
// Wrap already-constructed clients (first is primary, rest are fallbacks):
client, err := chat.NewFallback([]chat.ChatClient{primary, secondary})

// Or build from configs (each provider resolves its own credentials + model):
client, err := chat.NewFallbackFromConfigs(ctx, props, []chat.Config{
	{Provider: chat.ProviderClaude, Model: "claude-opus-4-8"},
	{Provider: chat.ProviderOpenAI}, // uses OpenAI's default model
	{Provider: chat.ProviderGemini},
})
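The advance-on-retryable-failure behaviour can be illustrated with a small standalone sketch. Everything here is a stand-in: the Client interface drops the context parameter, errRetryable and errFatal replace the real failover policy, and newFallback is not the library's constructor.

```go
package main

import (
	"errors"
	"fmt"
)

// Client is a simplified stand-in for chat.ChatClient.
type Client interface {
	Chat(prompt string) (string, error)
}

// Sentinel errors standing in for the real policy's classification.
var (
	errRetryable = errors.New("retryable upstream failure")
	errFatal     = errors.New("fatal upstream failure")
)

// fallback tries clients in order and remembers which one is active.
type fallback struct {
	clients []Client
	active  int
}

func newFallback(clients []Client) (*fallback, error) {
	if len(clients) == 0 {
		return nil, errors.New("fallback needs at least one client")
	}
	return &fallback{clients: clients}, nil
}

// Chat calls the active client and advances only on a retryable error.
func (f *fallback) Chat(prompt string) (string, error) {
	for {
		reply, err := f.clients[f.active].Chat(prompt)
		if err == nil {
			return reply, nil
		}
		last := f.active == len(f.clients)-1
		if !errors.Is(err, errRetryable) || last {
			return "", err
		}
		f.active++
	}
}

// stub is a canned client used for the demonstration.
type stub struct {
	reply string
	err   error
}

func (s stub) Chat(string) (string, error) { return s.reply, s.err }

func main() {
	primary := stub{err: fmt.Errorf("HTTP 429: %w", errRetryable)}
	secondary := stub{reply: "hello from the secondary"}
	client, err := newFallback([]Client{primary, secondary})
	if err != nil {
		panic(err)
	}
	reply, err := client.Chat("hi")
	fmt.Println(reply, err)
}
```

Because the composite satisfies the same interface as the clients it wraps, call sites do not need to know whether they hold one provider or a chain.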
Config surface (opt-in, default off)¶
ai:
  provider: claude          # single-provider default (unchanged)
  fallback:
    enabled: false          # default; single-provider behaviour is byte-for-byte preserved
    providers: [claude, openai, gemini]  # ordered; index 0 is primary
When ai.fallback.enabled is true and ai.fallback.providers is non-empty,
chat.NewWithFallback (used by the AI call sites) builds a composite; otherwise
it behaves exactly like chat.New. providers[0] is the primary and overrides
ai.provider (a WARN is logged if the two disagree). A fallback provider whose
credentials are missing is dropped with a WARN that reports the endpoint host
only; missing credentials for the primary are fatal.
Known limitation: per-provider model. On the config-driven path each provider in the chain uses its own default model; the global ai.model key is not applied per provider (a single model name rarely makes sense across Claude/OpenAI/Gemini). To pin a specific model per provider, construct the chain explicitly with chat.NewFallbackFromConfigs and set each Config.Model.
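The selection rules for the config-driven path can be summarised in code. This is a sketch under stated assumptions: FallbackConfig, resolveChain, and the hasCreds callback are hypothetical names, and the log lines only approximate the WARN output.

```go
package main

import (
	"fmt"
	"log"
)

// FallbackConfig mirrors the ai.fallback keys shown above (hypothetical type).
type FallbackConfig struct {
	Enabled   bool
	Providers []string
}

// resolveChain returns the ordered provider names to construct.
// hasCreds reports whether credentials for a provider are available.
func resolveChain(provider string, fb FallbackConfig, hasCreds func(string) bool) ([]string, error) {
	if !fb.Enabled || len(fb.Providers) == 0 {
		return []string{provider}, nil // single-provider path, same as before
	}
	primary := fb.Providers[0]
	if provider != "" && provider != primary {
		log.Printf("WARN ai.provider %q overridden by fallback primary %q", provider, primary)
	}
	if !hasCreds(primary) {
		return nil, fmt.Errorf("primary provider %q has no credentials", primary)
	}
	chain := []string{primary}
	for _, name := range fb.Providers[1:] {
		if !hasCreds(name) {
			log.Printf("WARN dropping fallback provider %q: credentials missing", name)
			continue
		}
		chain = append(chain, name)
	}
	return chain, nil
}

func main() {
	creds := map[string]bool{"claude": true, "gemini": true}
	fb := FallbackConfig{Enabled: true, Providers: []string{"claude", "openai", "gemini"}}
	chain, err := resolveChain("claude", fb, func(n string) bool { return creds[n] })
	fmt.Println(chain, err)
}
```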
Which errors trigger failover¶
The default policy (chat.DefaultFailoverPolicy, overridable via
WithFailoverPolicy) advances on transient/unavailable conditions and treats
operator-fixable faults as fatal:
| Advance (FailoverNext) | Fatal (FailoverFatal) |
|---|---|
| 408, 429, 500, 502, 503, 504 | 400, 401, 403, 404, 422 |
| network errors (DNS, connection refused/reset, TLS) | caller-cancelled context |
| per-request timeout | claude-local non-zero exit (operator-fixable) |
Status is extracted via errors.As against the SDK error types
(*anthropic.Error, *openai.Error, *genai.APIError), so the providers'
errors.Wrap layers are transparent. The policy never inspects error messages.
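A classification along these lines might look like the sketch below. statusError is a hypothetical stand-in for the SDK error types, classify is not chat.DefaultFailoverPolicy, and treating unlisted statuses and unrecognised errors as fatal is an assumption the table does not confirm.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
)

type Decision int

const (
	FailoverNext Decision = iota
	FailoverFatal
)

// statusError is a hypothetical stand-in for an SDK error carrying an HTTP status.
type statusError struct{ Code int }

func (e *statusError) Error() string { return fmt.Sprintf("HTTP status %d", e.Code) }

// classify inspects error types only, never error messages.
func classify(err error) Decision {
	if errors.Is(err, context.Canceled) {
		return FailoverFatal // caller-cancelled context
	}
	var se *statusError
	if errors.As(err, &se) { // sees through wrapping layers
		switch se.Code {
		case 408, 429, 500, 502, 503, 504:
			return FailoverNext
		}
		return FailoverFatal // assumption: every other status is fatal
	}
	// net.Error covers *net.OpError and *net.DNSError; some TLS failures
	// do not implement it and would need their own cases.
	var ne net.Error
	if errors.As(err, &ne) || errors.Is(err, context.DeadlineExceeded) {
		return FailoverNext
	}
	return FailoverFatal // assumption: unknown errors do not fail over
}

// wrap mimics a provider adding context around an SDK error.
func wrap(err error) error { return fmt.Errorf("provider call: %w", err) }

func main() {
	fmt.Println(classify(wrap(&statusError{Code: 429})) == FailoverNext)
	fmt.Println(classify(wrap(&statusError{Code: 401})) == FailoverFatal)
	dnsErr := &net.DNSError{Err: "no such host", Name: "example.invalid"}
	fmt.Println(classify(wrap(dnsErr)) == FailoverNext)
	fmt.Println(classify(context.Canceled) == FailoverFatal)
}
```

The cancellation check comes first so that a caller who gives up is never mistaken for a provider outage.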
Behaviour across a failover boundary — known limitations¶
- Lossy transcript replay. The composite keeps a provider-neutral transcript
  of user/assistant text turns and replays the user turns into a fallback
  provider on first use (assistant turns and tool-call/tool-result interleaving
  cannot be reconstructed). A conversation that did heavy tool use before
  failover resumes with reduced context. Pass WithStrictToolContext to fail
  fast instead, once a tool has executed.
- Tools re-apply cleanly (handlers are provider-agnostic).
- Streaming fails over only before the first visible event
  (EventTextDelta/EventToolCallStart) reaches your callback; once a delta is
  emitted it cannot be un-emitted, so a later error is terminal.
- Usage is the sum across every provider the composite drove (so a failover's
  combined spend is visible).
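The streaming rule in the list above can be sketched as a guard around the caller's callback. This is an illustration only: Event, streamFunc, and streamWithFallback are hypothetical, every event counts as visible, and every error is treated as retryable to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a simplified stand-in for a streamed chat event.
type Event struct{ Text string }

// streamFunc streams a reply by calling emit for each event.
type streamFunc func(prompt string, emit func(Event)) error

var errUpstream = errors.New("upstream failed")

// streamWithFallback moves to the next provider only while nothing has
// reached the callback; after the first event, an error is terminal.
func streamWithFallback(providers []streamFunc, prompt string, callback func(Event)) error {
	lastErr := errors.New("no providers configured")
	for _, provider := range providers {
		emitted := false
		err := provider(prompt, func(ev Event) {
			emitted = true
			callback(ev)
		})
		if err == nil {
			return nil
		}
		if emitted {
			return err // output already delivered; it cannot be un-emitted
		}
		lastErr = err
	}
	return lastErr
}

// Three canned providers for the demonstration.
func failEarly(string, func(Event)) error { return errUpstream }

func failLate(_ string, emit func(Event)) error {
	emit(Event{Text: "par"})
	return errUpstream
}

func succeed(_ string, emit func(Event)) error {
	emit(Event{Text: "hello"})
	return nil
}

func main() {
	show := func(ev Event) { fmt.Printf("event: %s\n", ev.Text) }
	fmt.Println("err:", streamWithFallback([]streamFunc{failEarly, succeed}, "q", show))
	fmt.Println("err:", streamWithFallback([]streamFunc{failLate, succeed}, "q", show))
}
```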
Logging & redaction¶
Each transition logs exactly one WARN line, "chat provider failover", with
from/to (enum names) and a coarse reason (status/network). The
triggering error's message is never logged verbatim; any endpoint detail is
reduced to the host only, consistent with the chat client's existing rule (see
redact).
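As an illustration of host-only reduction, the sketch below uses the standard net/url package. hostOnly is a hypothetical helper, not the package's redact function, and keeping the port as part of the host is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostOnly reduces an endpoint to its host so that paths, query strings,
// and embedded credentials never reach the log (hypothetical helper).
func hostOnly(endpoint string) string {
	u, err := url.Parse(endpoint)
	if err != nil || u.Host == "" {
		return "<redacted>" // nothing safe to show
	}
	return u.Host // includes the port when one is present
}

func main() {
	fmt.Println(hostOnly("https://user:secret@api.groq.com/openai/v1?key=abc"))
	fmt.Println(hostOnly("http://localhost:11434/v1"))
	fmt.Println(hostOnly("not a url"))
}
```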