Coverage Gap Closure Specification
- Authors
- Matt Cockayne, Claude Opus 4.8 (AI drafting assistant)
- Date
- 20 June 2026
- Status
- IMPLEMENTED
Summary
The 2026-06-17 test-coverage closure plan brought its named packages to the
pkg/ ≥90% policy (22 packages, plus the generator ast_extract walkers). A
full-repo go test ./... -cover audit on 2026-06-20 shows 44 of 97 tested
packages at ≥90%, but that raw denominator is misleading: it counts generated
mocks, cmd/* entry points, and pure test-helper packages that carry no
shippable logic and should never be measured.
This spec classifies every package currently below 90% into three buckets (exclude-by-design, covered-by-gated-tests, and addressable-hermetic) and lays out a prioritised plan to close the addressable gaps. The point is to make the 90% policy mean something repo-wide: a package is either ≥90% on unit coverage, or it appears on an explicit, justified exclusion list. Nothing falls through the cracks silently.
This spec covers unit coverage (the default just test). Several packages
have additional gated integration or E2E (Godog) coverage not reflected
in the unit number; that is called out per package. The credential/desktop-gated
integration work is specified separately in
2026-06-20-desktop-gated-integration-tests.md.
Goals & Non-Goals
Goals
- A definitive, categorised inventory of sub-90% packages as of 2026-06-20.
- An explicit exclusion list with per-package rationale, so the policy is enforceable: "≥90% OR on the list."
- A prioritised, effort-estimated plan for the genuinely addressable gaps.
- A definition of the countable package set the policy applies to.
Non-Goals
- No production code changes are expected (test-only), barring a minimal, additive, precedented test seam where unavoidable (note it in the MR).
- Not chasing 90% on TUI/Bubble-Tea, generated, entry-point, or real-dependency code; those are excluded with rationale, not ignored.
- Not the desktop-gated integration work (keychain/WKD/live-VCS/live-chat); that is the separate spec linked above.
The countable package set
A package counts toward the policy unless it is one of:
- `mocks/**`: generated by mockery (`// Code generated … DO NOT EDIT`).
- `cmd/**`: binary entry points (`main`): `cmd/gtb`, `cmd/e2e`, `cmd/gtb-no-aws-smoke`, and the thin `cmd/changelog` / `cmd/docs` shims.
- Test-only helpers: `internal/exectest`, `pkg/credentials/credtest`, `test/e2e/support`, `internal/version` (build-stamp only).

These are never measured. Everything else is either ≥90% or on the exclusion list in this spec.
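The countable-set rule above can be sketched as a small predicate. This is only an illustration of the rule as written in this spec: the function name `countable`, the two lookup variables, and the `mocks/anything` path used in `main` are assumptions, not the project's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// notCountedPrefixes mirrors the mocks/** and cmd/** entries in the spec.
// The variable names and layout are assumptions made for this sketch.
var notCountedPrefixes = []string{"mocks/", "cmd/"}

// notCountedExact mirrors the test-only helper packages listed in the spec.
var notCountedExact = map[string]bool{
	"internal/exectest":        true,
	"pkg/credentials/credtest": true,
	"test/e2e/support":         true,
	"internal/version":         true,
}

// countable reports whether a repo-relative package path is subject to the
// 90% policy. Only top-level mocks/ and cmd/ are skipped, so pkg/cmd/docs
// and internal/cmd/enable still count.
func countable(pkg string) bool {
	if notCountedExact[pkg] {
		return false
	}
	for _, p := range notCountedPrefixes {
		if strings.HasPrefix(pkg, p) {
			return false
		}
	}
	return true
}

func main() {
	// mocks/anything is a made-up path used only to exercise the prefix rule.
	for _, pkg := range []string{"pkg/gateway", "cmd/gtb", "pkg/cmd/docs", "mocks/anything", "internal/version"} {
		fmt.Printf("%-20s countable=%v\n", pkg, countable(pkg))
	}
}
```

Matching on a leading `cmd/` rather than any `cmd` path segment is deliberate here: the spec treats `pkg/cmd/*` and `internal/cmd/*` as countable packages that are either covered or explicitly excluded.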
Bucket A: Exclude by design (with rationale)
These ship logic but are intentionally not unit-covered to 90%; they are covered by E2E/integration or are untestable without a real external dependency. They are the standing exclusion list.
| Package | Unit % | Why excluded | Where it is covered |
|---|---|---|---|
| `pkg/forms` | 63.3 | Bubble Tea TUI: interactive event loop needs a real terminal | E2E / manual |
| `pkg/docs` | 32.1 | Bubble Tea markdown browser TUI | E2E / manual |
| `pkg/cmd/docs` | 81.3 | Command wrapper around the TUI; remaining lines are `tea.NewProgram().Run()` + live HTTP serve | covered to the TTY ceiling; `@cli` mcp/docs smoke |
| `internal/generator/templates` | 15.2 | Skeleton string/AST emitters: output correctness is verified by building the scaffolded project | `@generator` BDD + `generator_build` integration |
| `pkg/signing/kms` | 85.7 | AWS KMS adapter; `newSigner` is 100% via a fake, `NewSigner` is a thin AWS-config wrapper untestable without real AWS | inner logic unit-tested; wrapper excluded |
| `pkg/credentials/keychain` | 82.4 | The real OS-backend round-trip needs a desktop keychain | desktop-gated spec (Item 2) covers the real path |

Rule going forward: adding a package to Bucket A requires a one-line rationale
here. The apidiff/coverage reviewer treats anything sub-90 not on this list
as a gap to fix.
Bucket B: Generator CLI, covered by @generator BDD E2E (not unit)
The internal/cmd/* scaffolding commands are thin wiring over
internal/generator; their behaviour is driven end-to-end by the @generator
Godog suite (features/generator/*) and the generator_build integration tests,
not by unit tests. Their low unit numbers are expected.
| Package | Unit % | Primary real coverage |
|---|---|---|
| `internal/cmd/enable` | 20.5 | `features/generator/feature-toggle.feature`, `signing-lifecycle` |
| `internal/cmd/generate` | 43.4 | `features/generator/*` + `generator_build` |
| `internal/cmd/template` | 54.9 | `features/generator/custom-templates.feature`, `template-lifecycle` |
| `internal/cmd/sign` | 71.1 | `features/cli/sign.feature` |
| `internal/cmd/keys` | 87.0 | `features/cli/keys.feature` |
| `internal/cmd/{disable,regenerate,remove,root}` | 0–low | `@generator` BDD |
| `internal/generator/verifier` | 29.7 | `verifier_integration_test.go` (`generator_build`) |

Decision (Open Question 1 → A, resolved): these are accepted as E2E-covered
and join the exclusion list for the unit policy, on the condition that the
@generator suite + generator_build integration remain green. A thin unit pass
over the pure flag/arg-parsing and error branches of internal/cmd/generate and
internal/cmd/enable is a no-cost bonus only if it falls out naturally (Phase 3).
Bucket C: Addressable hermetic gaps (the real work)
Genuinely unit-testable packages below 90%, ordered by gap size. These are the targets.
| Package | Unit % | Gap | Notes / approach |
|---|---|---|---|
| `pkg/gateway` | 52.2 | 38 | HTTP gateway wiring: drive with `httptest` + an in-process mux; cover register/option/error paths. |
| `pkg/utils` | 71.4 | 19 | Small helpers (incl. `IsInteractive`, path/string utils). Table-test the pure ones; the TTY check has a non-coverable arm. |
| `pkg/setup/telemetry` | 72.4 | 18 | Telemetry initialiser registration; cover the provider-selection + error branches with fakes. |
| `pkg/telemetry` | 75.2 | 15 | Core collector/event plumbing; cover redaction, buffering, spill, and backend-dispatch error paths. |
| `internal/agent` | 79.0 | 11 | AI-agent orchestration helpers; cover the non-LLM control/error branches with a fake chat client. |
| `pkg/cmd/version` | 80.6 | 9 | Version command output (text/JSON) + error branches. |
| `pkg/openapi` | 82.1 | 8 | Spec registration/serving; `httptest` the handler + option/error paths. |
| `pkg/cmd/doctor` | 83.1 | 7 | Diagnostic aggregation/render; cover check-result formatting + JSON output + failure branches. |
| `pkg/telemetry/logs` | 86.4 | 4 | Log-backend error/edge branches. |
| `pkg/telemetry/tracing` | 88.5 | 2 | Near-miss; a couple of error branches. |
| `pkg/telemetry/metrics` | 88.9 | 1 | Near-miss. |
| `pkg/setup` | 89.6 | <1 | One or two error branches over the line. |
| `internal/generator` | 75.9 | n/a | Special case (OQ3 → A): unit-test the pure manifest/merge/validation helpers only; file-emission, the interactive diff-pager, AI orchestration, and real git-clone stay `generator_build`/E2E-covered. The pure-helper surface caps at ~79%; that is the deliberate end state, not a 90% target. The emission/TTY/AI/network portion joins the exclusion rationale. |

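The "`httptest` + an in-process mux" approach named for `pkg/gateway` and `pkg/openapi` might look like the sketch below. It is not the gateway's real wiring: `newMux`, the `/healthz` route, and the `status` helper are hypothetical stand-ins that show how registration and error paths can be exercised without opening a socket.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux is a hypothetical stand-in for gateway wiring: it registers one
// handler and leaves every other path to the mux's default 404.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		fmt.Fprint(w, "ok")
	})
	return mux
}

// status drives the handler in-process and returns the response code.
// No listener is started, so the check stays hermetic.
func status(h http.Handler, method, target string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, target, nil))
	return rec.Code
}

func main() {
	mux := newMux()
	cases := []struct {
		name, method, target string
		want                 int
	}{
		{"registered route", http.MethodGet, "/healthz", http.StatusOK},
		{"wrong method", http.MethodPost, "/healthz", http.StatusMethodNotAllowed},
		{"unknown route", http.MethodGet, "/missing", http.StatusNotFound},
	}
	for _, c := range cases {
		got := status(mux, c.method, c.target)
		fmt.Printf("%-18s got=%d want=%d\n", c.name, got, c.want)
	}
}
```

In a real `_test.go` file the same table would sit inside a `func Test...(t *testing.T)` with `t.Run` per case; it is written as a `main` program here only so the sketch runs on its own.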
Status (implementation, 2026-06-20)
Phases 1–3 below are complete: every Bucket C package is ≥90% except
internal/generator, which reached its OQ3→A pure-helper ceiling:
- Phase 1 (telemetry): `telemetry` 90.6, `setup/telemetry` 100, `telemetry/logs` 100, `telemetry/tracing` 100, `telemetry/metrics` 100.
- Phase 2 (command surface): `cmd/version` 100, `cmd/doctor` 98.7, `openapi` 92.3, `setup` 90.1.
- Phase 3 (larger lifts): `gateway` 100, `utils` 92.9, `internal/agent` 92.2; `internal/generator` 79.2 (pure-helper ceiling: emission/TTY/AI/network excluded, `generator_build`-covered; the white-box tests required 7 narrow inline `//nolint:gosec // G602 false positive` directives on guarded slice indexing, which are comments only with no logic change).

Phase 4 (complete): the policy is now codified and self-enforcing through
.coverage-policy.yaml (threshold + not-counted prefixes + the Bucket A/B and
internal/generator exclusion list, each with a rationale),
scripts/coverage-policy.sh (flags any countable sub-90 package not excluded),
the advisory coverage-policy CI job (allow_failure: true, mirroring
apidiff), and a just coverage-policy runner. The check passes clean against
main. This spec is IMPLEMENTED.
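The check itself lives in `scripts/coverage-policy.sh`, whose contents are not reproduced here. As a rough illustration of the logic described above (flag any countable sub-threshold package that is not excluded), the Go sketch below parses `go test -cover` style lines. The module path `example.com/gtb`, the sample lines, and the `excluded` helper are assumptions for the example, not the project's script or manifest format.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strconv"
	"strings"
)

// coverLine matches lines such as:
//   ok   example.com/gtb/pkg/gateway  0.012s  coverage: 52.2% of statements
var coverLine = regexp.MustCompile(`^ok\s+(\S+)\s.*coverage:\s+([0-9.]+)% of statements`)

// sample is illustrative output; the module path is hypothetical.
const sample = "ok  \texample.com/gtb/pkg/gateway\t0.012s\tcoverage: 52.2% of statements\n" +
	"ok  \texample.com/gtb/pkg/forms\t0.020s\tcoverage: 63.3% of statements\n" +
	"ok  \texample.com/gtb/pkg/setup\t0.008s\tcoverage: 89.6% of statements\n" +
	"ok  \texample.com/gtb/pkg/telemetry\t0.015s\tcoverage: 90.6% of statements\n" +
	"?   \texample.com/gtb/cmd/gtb\t[no test files]\n"

// excluded stands in for the manifest lookup; the suffix list is illustrative.
func excluded(pkg string) bool {
	for _, s := range []string{"/pkg/forms", "/pkg/docs"} {
		if strings.HasSuffix(pkg, s) {
			return true
		}
	}
	return false
}

// gaps returns "package percent" entries for every parsed package that is
// below threshold and not skipped, sorted by package path.
func gaps(output string, threshold float64, skip func(string) bool) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		m := coverLine.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // not a coverage line (e.g. "[no test files]")
		}
		pct, err := strconv.ParseFloat(m[2], 64)
		if err != nil {
			continue
		}
		if pct < threshold && !skip(m[1]) {
			out = append(out, fmt.Sprintf("%s %.1f%%", m[1], pct))
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	for _, g := range gaps(sample, 90, excluded) {
		fmt.Println("gap:", g)
	}
}
```

Because the CI job is advisory (`allow_failure: true`), a non-empty result would be reported rather than block a merge.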
Testing strategy
- Hermetic only: `httptest`, `afero.NewMemMapFs()`, `logger.NewNoop()`/`NewBuffer()`, in-process fakes; no network/keychain/TTY/wall-clock.
- CI determinism is mandatory: every new test must pass under `CI=true` (one Phase-10 slice tripped a `CI=true`-only storage-mode default; lesson applied). Set needed env via `t.Setenv` (serial, not `.Parallel`).
- Table-driven + `t.Parallel()` where safe; per package: `go test ./pkg/X -cover` ≥90%, `golangci-lint run ./pkg/X` clean, `go test -race ./pkg/X` ok.
- Parallelise across packages with isolated subagents (the Phase-10 pattern), integrating and re-verifying each slice on the main worktree.

Implementation phases (prioritised)
1. Telemetry cluster: `pkg/telemetry` (+ `logs`/`tracing`/`metrics`) and `pkg/setup/telemetry`. Self-contained, fakeable, four near-misses among them.
2. Command-surface near-misses: `pkg/cmd/version`, `pkg/cmd/doctor`, `pkg/openapi`, `pkg/setup` (already 89.6%). Cheap output/error-branch passes.
3. Larger lifts: `pkg/gateway` (38-pt gap), `pkg/utils`, `internal/agent`, and the non-AST `internal/generator` helpers. Optionally the thin `internal/cmd/generate`/`enable` unit pass (Bucket B nice-to-have).
4. Codify the policy: publish the Bucket A/B exclusion list and the countable-set definition as a machine-readable manifest (e.g. `.coverage-policy.yaml`) plus a contributor-guide reference, and add the non-blocking CI advisory job (OQ2 → A) that flags any sub-90 package not on the list, mirroring `apidiff`'s `allow_failure: true` pattern.

Each phase is one or more MRs, sized like the Phase-10 slices.
Migration & Compatibility
Test-only; no API or behaviour change. Phase 4 makes the policy enforceable by publishing the exclusion list, which is a documentation/process change, not a code one.
Open Questions
All resolved at review (2026-06-20); recorded here for provenance.
1. Bucket B generator CLI: accept as E2E-covered, or add a unit pass? Resolved → A: accept as E2E-covered for the unit policy (they are thin wiring; the security-relevant validation/escaping lives in `internal/generator`, which Bucket C covers), conditional on the `@generator` + `generator_build` suites staying green. A cheap unit pass over `generate`/`enable` flag-parsing/error branches is a no-cost bonus only if it falls out naturally.
2. CI advisory for sub-90 packages not on the exclusion list? Resolved → A: add a non-blocking advisory job (Phase 4), mirroring the `apidiff` advisory (`allow_failure: true`) so it never gates merges or flakes the pipeline. Requires a machine-readable exclusion list shared by the job and this spec.
3. `internal/generator`: unit-test the pure helpers or also emission? Resolved → A: unit-test the deterministic manifest/merge/validation helpers; leave file-emission to `generator_build` (which compiles the scaffold, a stronger check than asserting emitted strings).