Controls Package Health Check Integration
- Authors: Gemini CLI (AI drafting assistant)
- Date: 24 March 2026
- Status: IMPLEMENTED
1. Overview
The controls package manages the lifecycle of registered services (Start, Stop, Status). While basic status functions exist, there is currently no standardized way to expose these health indicators externally to orchestrators like Kubernetes or monitoring systems.
This specification proposes adding standardized health check handlers for HTTP and gRPC services that leverage the Controller's knowledge of all registered services. This ensures that a single health check endpoint can report the health of the entire application or specific components.
2. Problem Statement
Currently, each service registered with the Controller has a StatusFunc, but:
1. There is no public API on the Controller to aggregate these statuses into a single report.
2. HTTP servers created via pkg/http do not automatically provide a /healthz endpoint.
3. gRPC servers created via pkg/grpc do not implement the standard gRPC Health Checking Protocol.
4. Consumers have to manually wire up health checks, leading to inconsistency.
3. Goals & Non-Goals
Goals
- Add a public Status() method to Controller to aggregate service health.
- Provide a standard /healthz HTTP handler in pkg/http.
- Integrate the standard gRPC Health Checking Protocol in pkg/grpc.
- Ensure health checks are non-blocking and do not cause service disruption.
- Automate the registration of these health checks where possible (transparently).
Non-Goals
- Implementing complex liveness vs. readiness probe logic (this spec focuses on a general health check).
- Persistent storage of health history.
- Automatic restarts of failing services (this remains the responsibility of the orchestrator).
4. Public API
4.1 pkg/controls
Update Controller to expose health information:
    type ServiceStatus struct {
        Name   string `json:"name"`
        Status string `json:"status"` // "OK", "ERROR"
        Error  string `json:"error,omitempty"`
    }

    type HealthReport struct {
        OverallHealthy bool            `json:"overall_healthy"`
        Services       []ServiceStatus `json:"services"`
    }

    // Status returns an aggregate health report for all registered services.
    func (c *Controller) Status() HealthReport
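To make the wire format concrete, the following sketch marshals a HealthReport with encoding/json using the struct tags proposed above. The service names and the error text are illustrative assumptions, and encodeReport and sampleReport are hypothetical helpers that are not part of the proposed API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ServiceStatus and HealthReport mirror the types proposed in section 4.1.
type ServiceStatus struct {
	Name   string `json:"name"`
	Status string `json:"status"` // "OK" or "ERROR"
	Error  string `json:"error,omitempty"`
}

type HealthReport struct {
	OverallHealthy bool            `json:"overall_healthy"`
	Services       []ServiceStatus `json:"services"`
}

// encodeReport renders a report as compact JSON (hypothetical helper).
func encodeReport(r HealthReport) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// sampleReport returns illustrative data only; the service names are made up.
func sampleReport() HealthReport {
	return HealthReport{
		OverallHealthy: false,
		Services: []ServiceStatus{
			{Name: "http", Status: "OK"},
			{Name: "db", Status: "ERROR", Error: "connection refused"},
		},
	}
}

func main() {
	out, err := encodeReport(sampleReport())
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the Error field carries omitempty, healthy services serialize without an "error" key, which keeps the payload small for the common case.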
4.2 pkg/http
Add a health handler:
// HealthHandler returns an http.HandlerFunc that responds with the controller's health report.
// It returns 200 OK if all services are healthy, and 503 Service Unavailable otherwise.
func HealthHandler(controller controls.Controllable) http.HandlerFunc
Update Register to optionally (or by default) include the health endpoint.
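One way the handler could behave is sketched below using only the standard library. The statusReporter interface is a hypothetical stand-in for controls.Controllable (whose full method set is not shown in this spec), and fixedReporter and probe exist only to exercise the handler in-process.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type ServiceStatus struct {
	Name   string `json:"name"`
	Status string `json:"status"`
	Error  string `json:"error,omitempty"`
}

type HealthReport struct {
	OverallHealthy bool            `json:"overall_healthy"`
	Services       []ServiceStatus `json:"services"`
}

// statusReporter is an assumed, minimal stand-in for controls.Controllable.
type statusReporter interface {
	Status() HealthReport
}

// fixedReporter always returns the same report; it exists only for the demo.
type fixedReporter struct{ report HealthReport }

func (f fixedReporter) Status() HealthReport { return f.report }

// HealthHandler writes the report as JSON with 200 when healthy and 503 otherwise.
func HealthHandler(c statusReporter) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		report := c.Status()
		code := http.StatusOK
		if !report.OverallHealthy {
			code = http.StatusServiceUnavailable
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		// The status line is already written, so an encode error
		// cannot be reported back to the client at this point.
		_ = json.NewEncoder(w).Encode(report)
	}
}

// probe invokes the handler in-process and returns the status code it wrote.
func probe(report HealthReport) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
	HealthHandler(fixedReporter{report: report})(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(probe(HealthReport{OverallHealthy: true}))
	fmt.Println(probe(HealthReport{OverallHealthy: false}))
}
```

Returning 503 for any unhealthy service lets an orchestrator probe act on the status code alone, while the JSON body remains available for diagnosis.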
4.3 pkg/grpc
Add a health service registration helper:
// RegisterHealthService registers the standard gRPC health service with the provided server,
// wired to the controller's status.
func RegisterHealthService(srv *grpc.Server, controller controls.Controllable)
5. Internal Implementation
5.1 Controller.Status()
The Status() method will iterate through c.services.services and call each service's StatusFunc. Because StatusFunc now returns an error (as per recent updates), a nil result is recorded as "OK" and a non-nil result as "ERROR", with the error text stored in the HealthReport.
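The aggregation described above might look roughly like the following. This is a simplified sketch: the Controller here holds a flat slice instead of the real c.services.services structure, and addService and failing are hypothetical helpers. It also assumes that a nil StatusFunc counts as healthy, which the testing section leaves open.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type ServiceStatus struct {
	Name   string `json:"name"`
	Status string `json:"status"`
	Error  string `json:"error,omitempty"`
}

type HealthReport struct {
	OverallHealthy bool            `json:"overall_healthy"`
	Services       []ServiceStatus `json:"services"`
}

type StatusFunc func() error

type service struct {
	name   string
	status StatusFunc
}

// Controller is a simplified stand-in for the real controls.Controller.
type Controller struct {
	mu       sync.RWMutex
	services []service
}

// addService is a hypothetical registration helper used only by this sketch.
func (c *Controller) addService(name string, fn StatusFunc) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.services = append(c.services, service{name: name, status: fn})
}

// Status snapshots the service list under a read lock, then runs each
// StatusFunc outside the lock so a slow check cannot block registration.
func (c *Controller) Status() HealthReport {
	c.mu.RLock()
	svcs := make([]service, len(c.services))
	copy(svcs, c.services)
	c.mu.RUnlock()

	report := HealthReport{OverallHealthy: true, Services: make([]ServiceStatus, 0, len(svcs))}
	for _, s := range svcs {
		st := ServiceStatus{Name: s.name, Status: "OK"}
		if s.status != nil { // assumption: nil StatusFunc means healthy
			if err := s.status(); err != nil {
				st.Status = "ERROR"
				st.Error = err.Error()
				report.OverallHealthy = false
			}
		}
		report.Services = append(report.Services, st)
	}
	return report
}

// failing returns a StatusFunc that always reports the given error.
func failing(msg string) StatusFunc {
	return func() error { return errors.New(msg) }
}

func main() {
	c := &Controller{}
	c.addService("cache", nil)
	c.addService("db", failing("connection refused"))
	fmt.Printf("%+v\n", c.Status())
}
```

Copying the slice before invoking the status functions is one way to keep Status() safe under concurrent start/stop; the real implementation may use a different locking strategy.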
5.2 HTTP Integration
The HealthHandler will encode the HealthReport as JSON.
If Register is called, it should wrap the provided http.Handler (if it is a mux) or use an internal mux to ensure /healthz is handled.
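The internal-mux option can be sketched with the standard library's http.ServeMux. The withHealth function is a hypothetical name, and the two demo handlers simply write a marker string so the routing is visible; how Register actually composes handlers is not specified here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withHealth mounts health at the exact path /healthz on an internal mux
// and forwards every other request to the application handler.
func withHealth(app, health http.Handler) http.Handler {
	mux := http.NewServeMux()
	mux.Handle("/healthz", health)
	mux.Handle("/", app)
	return mux
}

// demoHandler builds a wrapped handler whose parts write marker strings.
func demoHandler() http.Handler {
	app := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "app")
	})
	health := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "health")
	})
	return withHealth(app, health)
}

// get issues an in-process GET and returns the response body.
func get(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	h := demoHandler()
	fmt.Println(get(h, "/healthz"))
	fmt.Println(get(h, "/users"))
}
```

A consequence of this approach is that an application handler that already serves /healthz would be shadowed by the injected one, which is an argument for making the injection optional.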
5.3 gRPC Integration
Using google.golang.org/grpc/health, we will update the health status either periodically or on demand. The health server in grpc/health answers Check and Watch requests from state it already holds, so that state must be refreshed from the controller. We can either:
1. Update the health server state every time Status() is queried.
2. Run a background ticker that updates the health server state by calling Controller.Status().
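The second option (a background ticker) could be structured as below. To stay self-contained, this sketch does not import gRPC: servingStatus and statusSetter are local stand-ins for the serving-status enum in grpc_health_v1 and for the SetServingStatus method on the grpc/health server, and recorder is a hypothetical in-memory implementation. The empty service name follows the health protocol's convention for overall server health.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// servingStatus is a local stand-in for the gRPC health serving-status enum.
type servingStatus int

const (
	notServing servingStatus = iota
	serving
)

// statusSetter is a stand-in for the health server's SetServingStatus method.
type statusSetter interface {
	SetServingStatus(service string, s servingStatus)
}

// recorder is an in-memory statusSetter used only for this demo.
type recorder struct {
	mu     sync.Mutex
	states map[string]servingStatus
}

func newRecorder() *recorder {
	return &recorder{states: make(map[string]servingStatus)}
}

func (r *recorder) SetServingStatus(service string, s servingStatus) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.states[service] = s
}

func (r *recorder) get(service string) servingStatus {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.states[service]
}

// syncOnce maps one health result onto the overall ("") serving status.
func syncOnce(hs statusSetter, healthy func() bool) {
	st := notServing
	if healthy() {
		st = serving
	}
	hs.SetServingStatus("", st)
}

// runSync refreshes the state immediately, then on every tick,
// and returns once ctx is cancelled.
func runSync(ctx context.Context, every time.Duration, hs statusSetter, healthy func() bool) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		syncOnce(hs, healthy)
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}

func main() {
	rec := newRecorder()
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	runSync(ctx, 10*time.Millisecond, rec, func() bool { return true })
	fmt.Println(rec.get("") == serving)
}
```

The trade-off against the on-demand option is staleness: with a ticker, a probe may observe a status up to one interval old, but Check calls never wait on slow status functions.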
6. Project Structure
- pkg/controls/controller.go: Implementation of Status().
- pkg/http/handlers.go: New file for health handlers.
- pkg/grpc/health.go: New file for gRPC health service integration.
7. Error Handling
- If a StatusFunc panics, it should be recovered and reported as an error in the HealthReport.
- Connectivity errors in health checks should be wrapped using github.com/cockroachdb/errors.
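The panic-recovery rule can be illustrated with a small wrapper around each status call. This is a sketch under assumptions: safeStatus and panicking are hypothetical names, and fmt.Errorf from the standard library is used here only to keep the example dependency-free, whereas the spec calls for github.com/cockroachdb/errors in the real code.

```go
package main

import "fmt"

type StatusFunc func() error

// safeStatus runs fn and converts a panic into an ordinary error so that
// one misbehaving service cannot crash the whole health check.
func safeStatus(fn StatusFunc) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("status check panicked: %v", r)
		}
	}()
	if fn == nil {
		return nil // assumption: a missing StatusFunc counts as healthy
	}
	return fn()
}

// panicking returns a StatusFunc that always panics with msg.
func panicking(msg string) StatusFunc {
	return func() error { panic(msg) }
}

func main() {
	fmt.Println(safeStatus(nil))
	fmt.Println(safeStatus(panicking("boom")))
}
```

The named return value is what allows the deferred function to replace the result after a panic has been recovered.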
8. Testing Strategy
Implementation must follow the Test-Driven Development (TDD) approach as defined in docs/development/specs/index.md.
8.1 Unit Tests
- Package pkg/controls:
  - Test Status() with various service combinations:
    - All services healthy (returns overall_healthy: true).
    - One or more services returning errors (returns overall_healthy: false).
    - Services with nil StatusFunc (should be treated as healthy by default or handled gracefully).
  - Verify thread-safety of Status() when called concurrently with service start/stop.
- Package pkg/http:
  - Test HealthHandler with mock controllers.
  - Assert correct HTTP status codes (200 for healthy, 503 for unhealthy).
  - Assert JSON response body matches the HealthReport structure.
- Package pkg/grpc:
  - Test gRPC health service integration using grpc_health_v1.
  - Verify Check returns SERVING or NOT_SERVING based on controller status.
8.2 Integration Tests
- HTTP Server Integration:
  - Start a real HTTP server using http.Register.
  - Perform HTTP GET requests to /healthz and verify full end-to-end connectivity and reporting.
- gRPC Server Integration:
  - Start a real gRPC server using grpc.Register.
  - Use a gRPC health client to query the health status.
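The HTTP half of these integration checks might be shaped like the sketch below. Since the signature of http.Register is not given in this spec, httptest.NewServer stands in for it and a trivial inline handler stands in for the real HealthHandler; healthzStatus is a hypothetical helper. The request travels over a real loopback listener, not through an in-process recorder.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthzStatus starts a throwaway server on a loopback port, performs a
// real GET /healthz against it, and returns the observed status code.
func healthzStatus(healthy bool) (int, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		if !healthy {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	srv := httptest.NewServer(mux) // stand-in for a server built via http.Register
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/healthz")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	for _, healthy := range []bool{true, false} {
		code, err := healthzStatus(healthy)
		if err != nil {
			panic(err)
		}
		fmt.Println(healthy, code)
	}
}
```

In the actual test suite this logic would live in a _test.go file and use the testing package's assertions in place of panics.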
8.3 Quality Gates
- Code Coverage: New code in pkg/ must achieve at least 90% coverage.
- Race Detection: go test -race ./... must pass for all tests.
- Linting: Must pass golangci-lint run --fix with no outstanding issues.
9. Documentation Maintenance
Documentation is a first-class citizen in GTB. The following updates are required:
- Library Documentation: Update docs/components/controls.md to document the new Status() method and health registration helpers.
- Concept Documentation: Update docs/concepts/service-orchestration.md to reflect the new health checking capabilities.
- API Reference: Ensure all new public methods, types, and constants are fully documented with GoDoc comments.
- Examples: Add or update examples in pkg/controls/example_test.go (if it exists) or in the documentation to demonstrate how to use health checks.
10. Leveraged Workflows
Implementation MUST leverage the following workflows from .agent/workflows/:
- /gtb-library-contribution: For adding the core logic to pkg/.
- /gtb-verify: To ensure all tests pass, race conditions are absent, and linting is clean.
- /gtb-lint: To resolve any complex linting issues.
- /gtb-docs: For updating the markdown documentation in docs/.
11. Migration & Compatibility
- This is a feature addition and should be backward compatible.
- Existing StatusFunc implementations are already compatible with the new signature (func() error).
12. Implementation Phases
Phase 1: Core Controller
- Implement Controller.Status().
- Add ServiceStatus and HealthReport types.
- Update Services.status() to return results.
Phase 2: HTTP Health Check
- Implement http.HealthHandler.
- Update http.Register to optionally inject the health handler.
Phase 3: gRPC Health Check
- Implement grpc.RegisterHealthService.
- Wire it into the grpc.Register flow.