Resilience Middleware: Server-Side Rate Limiting and Client-Side Circuit Breaking¶
- Authors
- Matt Cockayne, Claude Opus 4.8 (AI drafting assistant)
- Date
- 21 June 2026
- Status
- DRAFT
Roadmap item D2 (Resilience middleware). This spec is DRAFT and paused for human review. Per
CLAUDE.md Step 0, do not begin implementation until the Open Questions are resolved or explicitly deferred.
Overview¶
GTB's transports already carry half of a resilience story. On the client side,
pkg/http ships WithRetry / RetryConfig (exponential backoff + full jitter,
2026-03-26-http-retry-backoff) and a
ClientChain / ClientMiddleware RoundTripper pipeline
(2026-03-31-http-client-middleware) that
already includes a token-bucket outbound WithRateLimit. On the server side,
pkg/http has the Chain / Middleware handler pipeline and pkg/grpc has the
InterceptorChain / Interceptor pipeline
(2026-03-26-transport-logging-middleware),
both fronting hardened servers (2026-03-24-secure-http-client).
Two resilience primitives are conspicuously missing, and they are the two that most directly complement what already exists:
- Server-side rate limiting. Today a GTB-built management/API server has no first-class way to shed load. WithRateLimit exists only as client egress throttling; there is no ingress equivalent for the Chain or InterceptorChain. A downstream tool that exposes an HTTP or gRPC surface must hand-roll a limiter to protect itself, which is exactly the kind of boilerplate the middleware infrastructure was built to eliminate.
- Client-side circuit breaking. Retry alone has a well-known failure mode: when a downstream is hard down (not transiently flapping), every caller keeps paying the full retry budget (backoff sleeps, connection attempts, and wasted latency) against a service that will not answer. The retry spec itself flags this in its Future Considerations: "A circuit breaker could wrap the retry transport to fail-fast when a downstream service is consistently unavailable, avoiding wasted retry attempts." This spec delivers that.
This spec adds, with no new heavy dependency (token-bucket already vendored via
golang.org/x/time/rate; the breaker is a small hand-rolled state machine):
| Concern | Side | HTTP surface | gRPC surface |
|---|---|---|---|
| Rate limiting | Server (ingress) | RateLimitMiddleware (a Middleware) | RateLimitInterceptor (an Interceptor) |
| Circuit breaking | Client (egress) | WithCircuitBreaker (a ClientMiddleware) | CircuitBreakerInterceptor (a grpc.DialOption factory) |
All four plug into the existing chains. Nothing about the chain types changes.
Scope clarification: what this is not¶
- Not a caching layer. A response/HTTP cache was previously proposed and REJECTED as a roadmap item. This spec deliberately stays clear of it: nothing here stores, keys, or serves responses. The breaker's open-state behaviour is to fail fast (return an error), never to serve a cached/stale body. See Confirmation: no caching overlap.
- Not a replacement for retry. Retry and the breaker are orthogonal layers that compose; see Composition with retry.
- Not a distributed/coordinated limiter. The rate limiter is per-process (per-server-instance), in-memory. Cluster-wide quota coordination (Redis token buckets, etc.) is explicitly out of scope and left to downstream tools.
Decision Log: is this foundation-level or app-level?¶
The shared roadmap brief asks each D* item to argue whether it belongs in GTB's foundation or is really an application concern, and to say so plainly in the verdict if the case is weak. Here is that argument.
The case for foundation-level (strong):
1. It completes a story GTB already started, in GTB's own vocabulary. GTB already owns Chain, ClientChain, InterceptorChain, RetryConfig, and an egress WithRateLimit. Rate limiting and circuit breaking are the two canonical resilience primitives that sit beside retry in every "stability patterns" treatment (Nygard's Release It!, the Polly/resilience4j/gobreaker ecosystems). Shipping retry but neither of the other two leaves the library at an awkward, incomplete altitude: a downstream tool gets retry for free but must reach outside GTB for the partner primitives.
2. The integration points are GTB-internal types, not app types. A breaker is a ClientMiddleware/http.RoundTripper and a grpc.DialOption; a limiter is a Middleware/Interceptor. These signatures are GTB's. An app cannot supply them as cleanly from outside without re-deriving the chain plumbing. The natural home for a Middleware that throttles is the package that defines Middleware.
3. It removes a recurring, security-relevant footgun. Ingress rate limiting is a denial-of-service mitigation. GTB has consistently absorbed transport-hardening concerns (body-size caps DefaultMaxRequestBodyBytes/DefaultMaxGRPCMessageBytes, TLS floors, redirect downgrade rejection). A self-protection limiter is the same class of concern: a foundation that hands you a server should hand you the means to stop it falling over.
4. Consistency dividend. Every GTB tool that adopts it gets identical limiter semantics, identical breaker state names, identical config keys, and identical logs/telemetry. That uniformity is precisely the value proposition of a base framework.
The case against (weak, but recorded honestly):
- Policy (the rate, the failure threshold) is undeniably app-specific. But GTB already ships configurable-policy primitives (RetryConfig, the egress limiter rate, body caps) without anyone calling those "app-level". GTB provides the mechanism and sane defaults; the app supplies the numbers. That division is the established pattern, not a new compromise.
- A determined team could vendor gobreaker and a tollbooth-style limiter directly. But they would then be writing the chain glue GTB is for, and losing the consistency dividend in (4).
Verdict: foundation-level, and not a weak case. This is the missing third of a
trio GTB already commits to two-thirds of, expressed entirely in GTB's own
middleware types, and it carries a self-protection (DoS) dimension consistent with
GTB's existing transport-hardening remit. It is admitted. The only genuinely
app-level part, the policy numbers, is delegated to config/options exactly as
RetryConfig already is. Recommend: accept.
Confirmation: no caching overlap¶
The previously-REJECTED caching-layer item is confirmed non-overlapping:
- No type in this spec reads, writes, stores, or keys a response body.
- The circuit breaker's three states are Closed/Open/HalfOpen. In Open it returns a sentinel error (ErrCircuitOpen) immediately; it does not return a previously-seen response. There is no response store, no TTL, no conditional-request handling, and no Cache-Control parsing anywhere in scope.
- The rate limiter rejects or admits; it never substitutes a stored answer.
No conflict with the rejected caching work exists or is introduced.
Design Decisions¶
1. Server-side rate limiting; client-side circuit breaking, as the brief directs and as the topology demands. A limiter protects the thing receiving load, so it belongs at ingress (server middleware/interceptor). A breaker protects the caller from a sick callee, so it belongs at egress (client middleware / dial interceptor). Putting either on the wrong side is a category error; this spec does not offer the inverted variants.
2. Token-bucket for the limiter (golang.org/x/time/rate), matching the already-shipped egress WithRateLimit. The dependency is already vendored and battle-tested; rolling our own leaky-bucket would add risk for no gain. Token-bucket gives a smooth steady-state rate plus a configurable burst, which is the right shape for API ingress.
3. Classic three-state circuit breaker, hand-rolled, no new dependency. The Closed → Open → HalfOpen → Closed/Open state machine is ~120 lines including the rolling failure counter. Pulling in sony/gobreaker for that is not justified given GTB's std-lib-leaning posture and the "avoid heavy deps if a small impl suffices" directive. The implementation lives in a small, fully unit-tested internal type.
4. Per-route and global, via composition rather than a config matrix. Rather than build a route-pattern→policy table into the limiter, we expose the limiter as an ordinary Middleware/Interceptor and let the existing chain mechanics do per-route scoping. A global limiter is one entry in the server-wide Chain; a per-route limiter is the same constructor wrapped around a specific handler (HTTP) or selected by info.FullMethod (gRPC). See Per-route vs global. This keeps the limiter a leaf primitive and avoids inventing a routing DSL GTB does not otherwise have.
5. Config surface mirrors RetryConfig. Each primitive takes a small config struct with a Default*Config() constructor returning sane values, exactly like DefaultRetryConfig(). Options are constructor arguments, not a second variadic layer, keeping the surface minimal.
6. Limiter rejects with the protocol-correct "too many requests" signal. HTTP → 429 Too Many Requests with a Retry-After header (which the client's retry layer already honours, a pleasing closed loop). gRPC → codes.ResourceExhausted.
7. Breaker fails fast with a typed sentinel, surfaced through the existing error stack. ErrCircuitOpen is a cockroachdb/errors sentinel so callers can errors.Is it. On the gRPC side the open state returns status.Error(codes.Unavailable, …) so it is indistinguishable on the wire from a genuine downstream outage (which is the correct semantic).
8. Observability via the existing logger; OTel-ready but not OTel-coupled. Limiter rejections and breaker state transitions log through logger.Logger (the same dependency the logging middleware already takes). Metrics are listed as a future hook, mirroring how pkg/grpc keeps OTel in a separate otel.go (2026-06-01-otel-observability) rather than threading it through every primitive.
Public API¶
pkg/http β server-side rate limit middleware¶
// RateLimitConfig configures the server-side token-bucket rate limiter.
type RateLimitConfig struct {
	// RequestsPerSecond is the sustained fill rate of the token bucket.
	// Must be > 0. Default: 50.
	RequestsPerSecond float64

	// Burst is the bucket capacity: the maximum number of requests that may
	// be admitted in an instantaneous spike. Must be >= 1. Default: 100.
	Burst int

	// KeyFunc derives the limiter key for a request, enabling per-client
	// limiting. When nil, a single global bucket is used for all requests.
	// A common choice is to key on the client IP (see ClientIPKey).
	KeyFunc func(*http.Request) string

	// OnLimited is invoked when a request is rejected, before the 429 is
	// written. Optional; useful for metrics/telemetry. The default writes a
	// structured debug log via the logger passed to the constructor.
	OnLimited func(*http.Request)
}

// DefaultRateLimitConfig returns a RateLimitConfig suitable for a modest
// management/API server: 50 rps sustained, burst 100, single global bucket.
func DefaultRateLimitConfig() RateLimitConfig

// RateLimitMiddleware returns a Middleware that admits requests under a
// token-bucket limiter and rejects excess traffic with 429 Too Many Requests
// plus a Retry-After header. A nil/invalid config falls back to defaults.
//
// Because it is an ordinary Middleware it composes into any Chain and can be
// scoped globally (one entry in the server chain) or per-route (wrap a single
// handler). Per-client limiting is enabled by setting RateLimitConfig.KeyFunc.
func RateLimitMiddleware(log logger.Logger, cfg RateLimitConfig) Middleware

// ClientIPKey is a ready-made RateLimitConfig.KeyFunc that keys on the client
// IP, preferring the left-most X-Forwarded-For entry when present and falling
// back to RemoteAddr. It reuses the same client-IP derivation as the logging
// middleware for consistency.
func ClientIPKey(r *http.Request) string
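The ClientIPKey contract described above (left-most X-Forwarded-For entry, else RemoteAddr) could look roughly like the following stdlib-only sketch. The names clientIPKey and keyFor are hypothetical and exist only for this illustration; the real helper is specified to reuse the logging middleware's derivation, which this sketch does not attempt to reproduce.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// clientIPKey is a hypothetical stand-in for the spec's ClientIPKey: it
// prefers the left-most X-Forwarded-For entry and falls back to RemoteAddr.
func clientIPKey(r *http.Request) string {
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		first, _, _ := strings.Cut(xff, ",")
		if ip := strings.TrimSpace(first); ip != "" {
			return ip
		}
	}
	// RemoteAddr is normally "host:port"; strip the port so that every
	// connection from one client shares a single bucket.
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

// keyFor builds a throwaway request and returns its limiter key
// (illustration helper, not part of the proposed API).
func keyFor(remoteAddr, xff string) string {
	r, _ := http.NewRequest(http.MethodGet, "http://example.invalid/", nil)
	r.RemoteAddr = remoteAddr
	if xff != "" {
		r.Header.Set("X-Forwarded-For", xff)
	}
	return clientIPKey(r)
}

func main() {
	fmt.Println(keyFor("10.0.0.7:51234", ""))
	fmt.Println(keyFor("10.0.0.7:51234", "203.0.113.9, 10.0.0.1"))
}
```

Note that X-Forwarded-For is set by the caller unless a trusted proxy overwrites it, so keying on it is only as trustworthy as the proxy in front of the server.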
pkg/http β client-side circuit breaker middleware¶
// CircuitState is the breaker's state.
type CircuitState int

const (
	// StateClosed admits all requests; failures are counted.
	StateClosed CircuitState = iota
	// StateOpen rejects all requests immediately with ErrCircuitOpen until the
	// cooldown elapses, then transitions to StateHalfOpen.
	StateOpen
	// StateHalfOpen admits a limited number of trial requests; success closes
	// the breaker, failure re-opens it.
	StateHalfOpen
)

// ErrCircuitOpen is returned (wrapped) by the breaker when it is open. Callers
// may test for it with errors.Is.
var ErrCircuitOpen = errors.New("http: circuit breaker is open")

// CircuitBreakerConfig configures the client-side breaker.
type CircuitBreakerConfig struct {
	// FailureThreshold is the number of consecutive failures (within Closed)
	// that trips the breaker open. Must be >= 1. Default: 5.
	FailureThreshold int

	// Cooldown is how long the breaker stays Open before allowing a trial.
	// Default: 30s.
	Cooldown time.Duration

	// HalfOpenMaxRequests is the number of trial requests allowed in HalfOpen.
	// The first success closes the breaker; any failure re-opens it.
	// Must be >= 1. Default: 1.
	HalfOpenMaxRequests int

	// IsFailure classifies a round-trip outcome as a failure for breaker
	// accounting. When nil, the default treats transport errors and 5xx
	// responses (>=500) as failures; 4xx and 2xx/3xx are successes. This means
	// a 429 (client rate-limited) does NOT trip the breaker; that is retry's
	// job, not the breaker's.
	IsFailure func(resp *http.Response, err error) bool

	// OnStateChange is invoked on every state transition. Optional; useful for
	// logging/telemetry. The constructor also logs transitions via logger.
	OnStateChange func(from, to CircuitState)
}

// DefaultCircuitBreakerConfig returns: threshold 5, cooldown 30s,
// half-open trial 1, default 5xx/transport-error failure classification.
func DefaultCircuitBreakerConfig() CircuitBreakerConfig

// WithCircuitBreaker returns a ClientMiddleware that fails fast while a
// downstream is consistently failing, avoiding wasted retry/backoff cycles.
// Place it OUTSIDE the retry transport (i.e. earlier in the ClientChain, or
// rely on the documented ordering) so the breaker sees the post-retry verdict.
func WithCircuitBreaker(log logger.Logger, cfg CircuitBreakerConfig) ClientMiddleware
pkg/grpc β server-side rate limit interceptor¶
// RateLimitConfig mirrors the HTTP server limiter for gRPC ingress.
type RateLimitConfig struct {
	RequestsPerSecond float64 // default 50
	Burst             int     // default 100

	// KeyFunc derives the limiter key from the RPC context (e.g. peer address
	// or a metadata value). When nil, a single global bucket is used.
	KeyFunc func(ctx context.Context, fullMethod string) string

	// OnLimited is invoked when an RPC is rejected. Optional.
	OnLimited func(ctx context.Context, fullMethod string)
}

func DefaultRateLimitConfig() RateLimitConfig

// RateLimitInterceptor returns an Interceptor (unary + stream) that admits
// RPCs under a token-bucket limiter and rejects excess with
// codes.ResourceExhausted. Composes into any InterceptorChain; per-method
// scoping is achieved via KeyFunc keying on fullMethod (or a method-filtering
// wrapper, analogous to the logging interceptor's WithPathFilter).
func RateLimitInterceptor(log logger.Logger, cfg RateLimitConfig) Interceptor

// PeerKey is a ready-made KeyFunc keying on the RPC peer address.
func PeerKey(ctx context.Context, fullMethod string) string
pkg/grpc β client-side circuit breaker dial option¶
// CircuitState / StateClosed / StateOpen / StateHalfOpen: same trio as HTTP,
// defined once in pkg/grpc for the gRPC side.

// CircuitBreakerConfig mirrors the HTTP breaker, with a gRPC-shaped failure
// classifier.
type CircuitBreakerConfig struct {
	FailureThreshold    int           // default 5
	Cooldown            time.Duration // default 30s
	HalfOpenMaxRequests int           // default 1

	// IsFailure classifies an RPC outcome. When nil, the default treats
	// Unavailable, DeadlineExceeded, and ResourceExhausted as failures and all
	// other codes (including OK) as successes.
	IsFailure func(err error) bool

	OnStateChange func(from, to CircuitState)
}

func DefaultCircuitBreakerConfig() CircuitBreakerConfig

// CircuitBreakerInterceptor returns a grpc.UnaryClientInterceptor that opens
// when a downstream is consistently failing and rejects calls with
// codes.Unavailable while open. Install it on a client connection via
// grpc.WithChainUnaryInterceptor.
//
// A streaming variant (CircuitBreakerStreamInterceptor) is provided for
// symmetry but only accounts for stream *establishment* failures, not
// mid-stream errors; see Open Questions.
func CircuitBreakerInterceptor(log logger.Logger, cfg CircuitBreakerConfig) grpc.UnaryClientInterceptor

func CircuitBreakerStreamInterceptor(log logger.Logger, cfg CircuitBreakerConfig) grpc.StreamClientInterceptor
gRPC asymmetry, deliberate. The server breaker would be nonsensical (a server does not "break" against itself), and the client rate limiter is already the egress golang.org/x/time/rate story on the HTTP side and is rarely needed for gRPC clients. So gRPC ships server rate-limit + client breaker, matching the HTTP shape. No inverted variants are offered.
Per-route vs global¶
The limiter is a leaf primitive; scoping is composition, not configuration.
Global (server-wide): one limiter entry in the chain protects everything.
chain := gtbhttp.NewChain(
	gtbhttp.RecoveryMiddleware(l),
	gtbhttp.RateLimitMiddleware(l, gtbhttp.DefaultRateLimitConfig()), // global
	gtbhttp.LoggingMiddleware(l),
)
_, _ = gtbhttp.Register(ctx, "http", controller, cfg, l, mux, gtbhttp.WithMiddleware(chain))
Per-route: wrap the specific handler with its own limiter before mounting.
mux := http.NewServeMux()
mux.Handle("/api/expensive",
	// A single-middleware chain; Then applies it to the handler.
	gtbhttp.NewChain(
		gtbhttp.RateLimitMiddleware(l, gtbhttp.RateLimitConfig{RequestsPerSecond: 2, Burst: 2}),
	).Then(http.HandlerFunc(expensiveHandler)),
)
mux.HandleFunc("/api/cheap", cheapHandler) // unlimited
Per-client: set KeyFunc (e.g. ClientIPKey) so each key gets its own bucket;
buckets are stored in a bounded, lazily-evicted map (see
Per-client bucket store).
The gRPC side mirrors this: a global limiter is one Interceptor in the
InterceptorChain; per-method limiting keys on info.FullMethod inside KeyFunc
(or wraps with a method filter analogous to the logging interceptor's path filter).
Health endpoints (/healthz, /livez, /readyz) are already mounted outside the WithMiddleware chain by Register (see pkg/http/server.go), so a global limiter never throttles liveness/readiness probes. This is a load-bearing existing guarantee, not new work.
Composition with retry¶
The breaker and retry are orthogonal and stack in a defined order. Within a
NewClient, the established wrapping is: base transport → retryTransport →
clientChain (see NewClient in pkg/http/client.go, lines ~128-141). Because the
breaker is a ClientMiddleware it lives in the clientChain, i.e. outside the
retry transport.
This is the correct ordering and the whole point:
- The breaker sees the final post-retry verdict for a request. One logical call that exhausts its retry budget against a dead service counts as one breaker failure, not N.
- Once the breaker is Open, subsequent calls are rejected before entering the retry layer, so no backoff sleeps and no connection attempts are spent on a service known to be down. This is exactly the waste the retry spec's Future Considerations flagged.
client := gtbhttp.NewClient(
	gtbhttp.WithRetry(gtbhttp.DefaultRetryConfig()),
	gtbhttp.WithClientMiddleware(gtbhttp.NewClientChain(
		gtbhttp.WithCircuitBreaker(l, gtbhttp.DefaultCircuitBreakerConfig()),
		gtbhttp.WithRequestLogging(l),
	)),
)
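The "one logical call equals one breaker failure" claim can be checked with a stdlib-only sketch. Everything here is a hypothetical stand-in: retry is a sleep-free substitute for retryTransport, and countFailures merely occupies the position the breaker would hold (outside retry); neither is GTB code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// roundTripFunc adapts a function to http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// retry is a hypothetical, sleep-free stand-in for the retry transport:
// it re-invokes next until success or until attempts are exhausted.
func retry(attempts int, next http.RoundTripper) http.RoundTripper {
	return roundTripFunc(func(r *http.Request) (*http.Response, error) {
		var err error
		for i := 0; i < attempts; i++ {
			var resp *http.Response
			if resp, err = next.RoundTrip(r); err == nil {
				return resp, nil
			}
		}
		return nil, err
	})
}

// countFailures sits where the breaker would sit: outside retry, so it only
// observes the final post-retry verdict.
func countFailures(n *int, next http.RoundTripper) http.RoundTripper {
	return roundTripFunc(func(r *http.Request) (*http.Response, error) {
		resp, err := next.RoundTrip(r)
		if err != nil {
			*n++
		}
		return resp, err
	})
}

// demo sends one logical request to a transport that always fails and
// reports how many raw attempts and how many breaker-visible failures occurred.
func demo() (attempts, breakerFailures int) {
	dead := roundTripFunc(func(*http.Request) (*http.Response, error) {
		attempts++
		return nil, errors.New("connection refused")
	})
	rt := countFailures(&breakerFailures, retry(3, dead))
	req, _ := http.NewRequest(http.MethodGet, "http://example.invalid/", nil)
	_, _ = rt.RoundTrip(req)
	return attempts, breakerFailures
}

func main() {
	a, f := demo()
	fmt.Printf("raw attempts=%d breaker-visible failures=%d\n", a, f)
}
```

With three attempts configured, the dead transport is hit three times while the outer layer records a single failure, which is the accounting the ordering above relies on.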
Ordering note for the spec author implementing this: the current WithClientMiddleware doc comment says the chain "wraps the transport after retry … so that retry operates on the raw transport". Confirm during implementation that a breaker placed in the chain therefore sits outside retry as described, and tighten the doc comment if the wording is ambiguous. (Captured in Open Questions.)
Internal implementation¶
Token-bucket limiter (shared shape, two packages)¶
Each limiter holds either a single *rate.Limiter (global, KeyFunc == nil) or a
*keyedLimiter (per-key). On each request it calls limiter.Allow(), which is
non-blocking, unlike the egress WithRateLimit, which uses Wait to throttle the
caller. Ingress must reject, not block, or a flood would simply queue and exhaust
memory. Allow() == false → write the rejection.
// httpRateLimit (illustrative)
func RateLimitMiddleware(log logger.Logger, cfg RateLimitConfig) Middleware {
	cfg = cfg.normalized() // apply defaults, clamp
	store := newLimiterStore(cfg.RequestsPerSecond, cfg.Burst)
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			key := "" // empty key selects the single global bucket
			if cfg.KeyFunc != nil {
				key = cfg.KeyFunc(r)
			}
			if !store.limiterFor(key).Allow() {
				if cfg.OnLimited != nil {
					cfg.OnLimited(r)
				}
				log.Debug("request rate-limited", "path", r.URL.Path, "key", key)
				w.Header().Set("Retry-After", "1")
				http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}
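To make the reject-rather-than-block behaviour concrete, here is a minimal stdlib-only token bucket with an injected clock. It is a hypothetical stand-in for rate.Limiter, written only to illustrate Allow semantics (burst, refill, never blocking); the spec itself uses golang.org/x/time/rate.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket is a hypothetical, stdlib-only stand-in for rate.Limiter.
type tokenBucket struct {
	mu     sync.Mutex
	rate   float64 // tokens added per second
	burst  float64 // bucket capacity
	tokens float64
	last   time.Time
	now    func() time.Time // injected clock, so tests need no sleeps
}

func newTokenBucket(rps float64, burst int, now func() time.Time) *tokenBucket {
	return &tokenBucket{rate: rps, burst: float64(burst), tokens: float64(burst), last: now(), now: now}
}

// Allow refills the bucket for the elapsed time, then takes one token if one
// is available. It never blocks: an empty bucket simply returns false.
func (b *tokenBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	t := b.now()
	b.tokens += t.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = t
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demo fires five back-to-back requests at a 2 rps / burst 2 bucket, then one
// more after the clock advances by a second.
func demo() []bool {
	clock := time.Unix(0, 0)
	b := newTokenBucket(2, 2, func() time.Time { return clock })
	var out []bool
	for i := 0; i < 5; i++ {
		out = append(out, b.Allow())
	}
	clock = clock.Add(time.Second)
	return append(out, b.Allow())
}

func main() {
	fmt.Println(demo())
}
```

In this run the first two requests consume the burst, the next three are rejected immediately, and the request after the one-second refill is admitted again.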
Per-client bucket store¶
KeyFunc introduces unbounded-key risk (an attacker rotating source IPs could
allocate a *rate.Limiter per IP and exhaust memory). The store is therefore
bounded and evicting: a mutex-guarded map capped at maxTrackedKeys (default
8192) with simple LRU/last-access eviction. When full, the least-recently-used key is
evicted (its bucket is recreated full on next sighting, which is acceptable since
eviction only happens under key churn). This mirrors the defensive posture of the
existing body-size caps. This memory-safety property is a required test.
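One way the bounded, LRU-evicting store could be shaped is sketched below using only container/list and a map. The type and method names (limiterStore, limiterFor, size, has) and the placeholder bucket type are assumptions for illustration; in the real store the value would be a *rate.Limiter and the cap would be maxTrackedKeys.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// bucket is a placeholder for *rate.Limiter; only its identity matters here.
type bucket struct{ key string }

type entry struct {
	key string
	b   *bucket
}

// limiterStore is a hypothetical bounded, LRU-evicting keyed store.
type limiterStore struct {
	mu    sync.Mutex
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> element holding *entry
}

func newLimiterStore(max int) *limiterStore {
	if max < 1 {
		max = 1 // clamp so eviction always has something to evict
	}
	return &limiterStore{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// limiterFor returns the bucket for key, creating it (and evicting the
// least-recently-used key when the store is full) if it is not tracked yet.
func (s *limiterStore) limiterFor(key string) *bucket {
	s.mu.Lock()
	defer s.mu.Unlock()
	if el, ok := s.items[key]; ok {
		s.order.MoveToFront(el)
		return el.Value.(*entry).b
	}
	if s.order.Len() >= s.max {
		oldest := s.order.Back()
		s.order.Remove(oldest)
		delete(s.items, oldest.Value.(*entry).key)
	}
	e := &entry{key: key, b: &bucket{key: key}}
	s.items[key] = s.order.PushFront(e)
	return e.b
}

func (s *limiterStore) size() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

func (s *limiterStore) has(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.items[key]
	return ok
}

func main() {
	s := newLimiterStore(2)
	s.limiterFor("a")
	s.limiterFor("b")
	s.limiterFor("a") // touch "a" so "b" becomes least recently used
	s.limiterFor("c") // store is full: "b" is evicted
	fmt.Println(s.size(), s.has("a"), s.has("b"), s.has("c"))
}
```

The map gives constant-time lookup and the list gives constant-time recency updates and eviction, so key churn costs a bounded amount of memory regardless of how many distinct keys an attacker presents.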
Circuit breaker state machine¶
A mutex-guarded struct holding state CircuitState, consecutiveFailures int,
openedAt time.Time, and halfOpenInFlight int. The transitions:
- Closed: on each completed call, IsFailure increments or resets the counter; reaching FailureThreshold → Open (record openedAt).
- Open: every call rejected with ErrCircuitOpen until time.Since(openedAt) >= Cooldown; the next call after cooldown → HalfOpen.
- HalfOpen: admit up to HalfOpenMaxRequests trials; first success → Closed (reset counter); any failure → Open (reset openedAt). Trials beyond the cap while a trial is in flight are rejected with ErrCircuitOpen.
Time is injected (now func() time.Time, default time.Now) so cooldown transitions
are deterministically testable without sleeps, consistent with the project's
race-avoidance guidance (no package-level clock, dependency injected via struct
field). The HTTP variant adapts RoundTrip; the gRPC variant adapts the unary/stream
client interceptor signatures. The core state machine is a single shared internal
type to avoid two divergent implementations.
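The transitions above can be sketched as a transport-agnostic core with an injected clock. This is an illustrative sketch under stated assumptions, not the proposed implementation: the lower-case names (breaker, allow, record, trip) are hypothetical, and a transport adapter would call allow before the call and record after it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

func (s state) String() string { return [...]string{"closed", "open", "half-open"}[s] }

var errCircuitOpen = errors.New("circuit breaker is open")

// breaker is a hypothetical core state machine; field names follow the prose.
type breaker struct {
	mu                   sync.Mutex
	state                state
	failures, inFlight   int
	threshold, maxTrials int
	cooldown             time.Duration
	openedAt             time.Time
	now                  func() time.Time // injected clock
}

// allow reports whether a call may proceed, moving Open to HalfOpen once the
// cooldown has elapsed and capping the number of concurrent trials.
func (b *breaker) allow() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.state == open {
		if b.now().Sub(b.openedAt) < b.cooldown {
			return errCircuitOpen
		}
		b.state, b.inFlight = halfOpen, 0
	}
	if b.state == halfOpen {
		if b.inFlight >= b.maxTrials {
			return errCircuitOpen
		}
		b.inFlight++
	}
	return nil
}

// record accounts for a completed call.
func (b *breaker) record(failed bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	switch {
	case b.state == open: // late result from a call admitted earlier; ignore
	case b.state == halfOpen && failed:
		b.trip()
	case b.state == halfOpen:
		b.state, b.failures, b.inFlight = closed, 0, 0
	case failed:
		if b.failures++; b.failures >= b.threshold {
			b.trip()
		}
	default:
		b.failures = 0
	}
}

func (b *breaker) trip() { b.state, b.failures, b.inFlight, b.openedAt = open, 0, 0, b.now() }

// demo trips the breaker, waits out the cooldown on a fake clock, and closes it.
func demo() []string {
	clock := time.Unix(0, 0)
	b := &breaker{threshold: 2, maxTrials: 1, cooldown: 30 * time.Second,
		now: func() time.Time { return clock }}
	var trace []string
	step := func(err error) {
		if err != nil {
			trace = append(trace, "rejected")
			return
		}
		trace = append(trace, b.state.String())
	}
	for i := 0; i < 2; i++ { // two consecutive failures reach the threshold
		_ = b.allow()
		b.record(true)
	}
	step(b.allow()) // still cooling down
	clock = clock.Add(31 * time.Second)
	step(b.allow()) // first call after the cooldown is the trial
	step(b.allow()) // a second concurrent trial exceeds the cap
	b.record(false) // the trial succeeded
	step(nil)
	return trace
}

func main() { fmt.Println(demo()) }
```

Because the clock is a struct field rather than a package-level variable, the cooldown path is exercised without any sleep, which is the testing property the paragraph above asks for.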
gRPC limiter¶
RateLimitInterceptor returns an Interceptor{Unary, Stream}. Unary checks
Allow() before invoking handler; on rejection returns
status.Error(codes.ResourceExhausted, "rate limit exceeded"). Stream checks at
stream-open. Peer address via peer.FromContext (the logging interceptor already
does this, so the helper is reusable).
Project structure¶
pkg/http/
    ratelimit.go            # NEW: RateLimitConfig, RateLimitMiddleware, ClientIPKey, limiterStore
    ratelimit_test.go       # NEW
    circuitbreaker.go       # NEW: CircuitBreakerConfig, WithCircuitBreaker, ErrCircuitOpen, breaker
    circuitbreaker_test.go  # NEW
    client_middleware.go    # UNCHANGED (breaker is a ClientMiddleware; no chain change)
    client.go               # UNCHANGED (ordering already correct)
pkg/grpc/
    ratelimit.go            # NEW: RateLimitConfig, RateLimitInterceptor, PeerKey
    ratelimit_test.go       # NEW
    circuitbreaker.go       # NEW: CircuitBreakerConfig, CircuitBreaker(Stream)Interceptor
    circuitbreaker_test.go  # NEW
internal/circuitbreaker/    # OPTION (see Open Questions): shared state machine
    breaker.go              # core Closed/Open/HalfOpen machine, transport-agnostic
    breaker_test.go
The chain/interceptor types (chain.go, client_middleware.go, pkg/grpc/chain.go) are untouched: every new primitive is just another value of an existing middleware/interceptor type.
Generator impact¶
None for default scaffolding. The generator does not prescribe a middleware
stack; consumers opt in, exactly as with the logging/recovery middleware. The
docs/components/ examples should show the resilience middleware so scaffolded tools
discover it, but no template change is required. (If review wants the generated
server to ship a commented-out RateLimitMiddleware line as a discoverability hint,
that is a small, separate follow-up, flagged in Open Questions.)
Error handling¶
- Limiter: never errors internally; rejection is a normal 429/ResourceExhausted response, not a Go error. Invalid config is clamped to safe defaults by normalized() rather than rejected, so a misconfigured limiter can never fail open into a panic; it fails into the default policy and logs a warning.
- Breaker: open-state rejection returns ErrCircuitOpen (HTTP, errors.Is-able) or codes.Unavailable (gRPC). The breaker never swallows a downstream's real error in Closed/HalfOpen; it passes the real resp, err through and only counts it.
- All errors are created/wrapped with github.com/cockroachdb/errors per project policy.
Testing strategy¶
Table-driven, t.Parallel(), logger.NewNoop(), injected clock; no time.Sleep
for breaker timing. New pkg/ code targets ≥90% coverage per policy.
| Test | Scenario |
|---|---|
| TestRateLimit_AdmitsUnderRate | requests within burst+rate all pass |
| TestRateLimit_Rejects429 | excess request → 429 + Retry-After |
| TestRateLimit_PerClientKey | two IPs get independent buckets |
| TestRateLimit_BucketStoreBounded | key churn never exceeds maxTrackedKeys (memory-safety) |
| TestRateLimit_GlobalNilKeyFunc | single shared bucket when KeyFunc nil |
| TestRateLimit_HealthEndpointsUnaffected | /healthz never throttled (via Register) |
| TestRateLimit_NonBlocking | limiter uses Allow not Wait; rejected request returns promptly |
| TestBreaker_OpensAtThreshold | N consecutive failures → Open |
| TestBreaker_OpenRejectsFast | Open returns ErrCircuitOpen without calling next |
| TestBreaker_HalfOpenAfterCooldown | injected clock past cooldown → HalfOpen |
| TestBreaker_HalfOpenSuccessCloses | trial success → Closed, counter reset |
| TestBreaker_HalfOpenFailureReopens | trial failure → Open, openedAt reset |
| TestBreaker_HalfOpenConcurrencyCap | only HalfOpenMaxRequests trials admitted |
| TestBreaker_DefaultIsFailure_5xxAndTransport | 5xx + transport err count; 4xx/429 do not |
| TestBreaker_ErrorsIsSentinel | errors.Is(err, ErrCircuitOpen) holds |
| TestBreaker_ComposesWithRetry | one retry-exhausted call = one breaker failure |
| TestGRPCRateLimit_Unary/Stream | bufconn: ResourceExhausted on excess |
| TestGRPCBreaker_Unary | bufconn: Unavailable while open, recovers after cooldown |
| TestBreaker_RaceUnderParallel | -race with concurrent RoundTrips |
Concurrency tests run under -race; the breaker and bucket store must be race-clean
with no package-level mutable state (project mandate).
Linting & verification¶
No new nolint directives anticipated. No new third-party dependency
(golang.org/x/time/rate already vendored; breaker is hand-rolled).
Documentation¶
- New sections in docs/components/http.md and docs/components/grpc.md (or equivalent) covering both primitives, the global/per-route/per-client recipes, and the retry-composition ordering diagram.
- Cross-reference from the retry-backoff component docs (closing its "circuit breaker" future-work note) and from the docs/concepts/ resilience overview if one exists.
- Godoc on every exported symbol; the breaker godoc must state the open-state fail-fast semantics and explicitly note it does not serve cached responses.
Backwards compatibility¶
- Purely additive. No existing type, signature, or default changes. Chains, RetryConfig, and NewClient ordering are untouched.
- All four primitives are opt-in; a tool that does not add them sees identical behaviour to today.
- Pre-1.0 API note: even though breaking changes are currently permitted as a minor bump, none are needed here.
Future considerations¶
- OTel metrics: limiter admit/reject counters and breaker state-transition gauges via the existing otel.go pattern. This is the natural next step, deliberately out of this spec's scope to keep it focused.
- Adaptive / concurrency limiting: a Little's-law / AIMD adaptive limiter (à la Netflix concurrency-limits) as an alternative to fixed token-bucket.
- Distributed limiter backend: pluggable store interface so the bucket can live in Redis for cluster-wide quotas. The limiterStore is deliberately an internal interface so this could slot in without an API break.
- Breaker bulkheading: per-host breaker instances inside one client (keyed like the limiter store) so one bad host doesn't open the breaker for healthy hosts.
- Config-driven policy: read RateLimitConfig/CircuitBreakerConfig defaults from a config prefix (e.g. server.http.ratelimit.*) so operators can tune without recompiling; this mirrors how server port/TLS are config-driven. Flagged below.
Open Questions¶
1. Shared internal breaker package? The Closed/Open/HalfOpen machine is identical for HTTP and gRPC. Extract to internal/circuitbreaker (one tested core, two thin adapters), or accept a small amount of duplication to keep each transport package self-contained? Recommendation: extract; it is genuinely shared logic with a transport-agnostic shape.
2. Config-prefix integration now or later? Should the four configs be readable from a config prefix (server.http.ratelimit.requests_per_second, etc.) in v1, so operators tune policy via config like they tune port/TLS, or ship code-only defaults first and add config binding as a follow-up? The brief's "minimal config surface" leaning suggests code-first; confirm.
3. gRPC streaming breaker depth. Should the stream client breaker account only for stream-establishment failures (proposed, simple) or also inspect per-message errors via a wrapped ClientStream (more complete, more code)? Proposed: establishment-only for v1, documented as such.
4. Default rate values. 50 rps / burst 100 is a guess for a "modest management server". Are these the right defaults, or should the limiter ship with no default rate (forcing the caller to choose) to avoid a surprising throttle on an unconfigured high-traffic server? Trade-off: safe-by-default vs least-surprise.
5. Breaker default failure classification: is 429 a failure? Proposed: no. A 429 means "you're being rate-limited", which is retry's domain, not a downstream health signal, so it should not trip the breaker. Confirm this is the desired default (it is the one encoded in DefaultCircuitBreakerConfig).
6. WithClientMiddleware doc-comment ambiguity. The existing comment frames the chain as wrapping "after retry". Confirm the implementer should tighten that wording so the breaker-outside-retry ordering is unambiguous, rather than leaving it implicit.
7. Generator discoverability hint. Should the scaffolded server include a commented-out RateLimitMiddleware line as a discoverability nudge, or is a docs mention sufficient? Proposed: docs only; no template change.
8. E2E/BDD coverage. Per CLAUDE.md, user-facing transport behaviour may warrant a Godog scenario (e.g. "given a server with a 2 rps limiter, when I send 5 rapid requests, then 3 receive 429"). Is an E2E scenario in scope for this item, or are unit + integration tests sufficient given there is no new CLI command? Proposed: one smoke-level CLI/transport BDD scenario for the limiter; breaker covered by unit tests only.
Resolutions (open questions confirmed with user 2026-06-21)¶
1. Shared breaker package. RESOLVED: extract to internal/circuitbreaker, with one tested Closed/Open/HalfOpen core and thin HTTP/gRPC adapters.
2. Config-prefix binding. RESOLVED: bind to a config prefix now (v1). Operators tune server.http.ratelimit.* etc. via config like port/TLS; the four resilience configs are readable from the config layer, not code-only. (Departs from the draft's code-first recommendation; broader scope accepted.)
3. gRPC streaming breaker depth. RESOLVED: also inspect per-message errors via a wrapped ClientStream, not just stream-establishment. (Departs from the draft's establishment-only v1; fuller coverage accepted.)
4. Default rate values. RESOLVED: ship a 50 rps / burst 100 default (safe-by-default for a modest management server). The limiter is still opt-in to the chain; when added without params it applies this default. (Departs from the draft's no-default recommendation.)
5. 429 → breaker. RESOLVED: no, a 429 does not trip the breaker (it's retry's domain, not a downstream-health signal); encoded in DefaultCircuitBreakerConfig.
6. WithClientMiddleware doc comment. RESOLVED: tighten it so the breaker-outside-retry ordering (retry-exhausted = one breaker failure) is explicit.
7. Generator hint. RESOLVED: docs only, no scaffold template change (no commented-out RateLimitMiddleware line).
8. BDD scope. RESOLVED: one limiter smoke-level Godog scenario (e.g. 2 rps limiter, 5 rapid requests, 3 receive 429); the circuit breaker is covered by unit tests only.
Implementation phases¶
1. HTTP server rate limiter: RateLimitConfig, normalized(), limiterStore (bounded/evicting), RateLimitMiddleware, ClientIPKey; tests incl. memory-safety and health-endpoint-unaffected.
2. HTTP client circuit breaker: core state machine (shared per OQ1), ErrCircuitOpen, WithCircuitBreaker, injected clock; tests incl. retry-composition and -race.
3. gRPC server rate limiter: RateLimitInterceptor, PeerKey; bufconn tests.
4. gRPC client circuit breaker: unary + (establishment-only) stream interceptors; bufconn tests.
5. Docs + (optional) BDD: component docs, retry cross-reference, optional limiter Godog scenario; /gtb-verify.