Telemetry
Overview
The telemetry package provides an opt-in framework for collecting anonymous usage analytics from CLI tools built on GTB. It is designed around three principles:
- Explicit consent: telemetry is never enabled by default. Users must opt in via telemetry enable, the init prompt, or the TELEMETRY_ENABLED environment variable.
- Privacy by design: no personally identifiable information is collected. Machine IDs are derived from multiple system signals and hashed with SHA-256. File contents and IP addresses are never recorded, and command arguments are recorded only when the tool author enables extended collection.
- Pluggable backends: tool authors choose where data goes. The framework ships noop, stdout, file, HTTP, and OpenTelemetry (OTLP) backends, and supports custom implementations.
Quick Start
Enable telemetry for your tool
props.Tool{
Name: "mytool",
Features: props.SetFeatures(
props.Enable(props.TelemetryCmd),
),
Telemetry: props.TelemetryConfig{
Endpoint: "https://analytics.example.com/events",
},
}
Emit events from commands
func runMyCommand(p *props.Props) error {
start := time.Now()
// ... command logic ...
p.Collector.TrackCommand("my-command", time.Since(start).Milliseconds(), 0, nil)
return nil
}
User opt-in
mytool telemetry enable # opt in
mytool telemetry status # check current state
mytool telemetry disable # opt out (drops all pending events)
mytool telemetry reset # clear local data + request remote deletion
Two-Level Gating
Telemetry requires two conditions to be active:
| Level | Who controls it | How |
|---|---|---|
| Feature flag | Tool author | props.Enable(props.TelemetryCmd) in code |
| User consent | End user | telemetry enable command or TELEMETRY_ENABLED=true env var |
Both must be active for data to be collected. If either is missing, the collector is a silent noop.
What Is Collected
Every telemetry event contains:
| Field | Example | Description |
|---|---|---|
| event.type | command.invocation | Event category |
| event.name | generate | Specific command or feature |
| tool.name | mytool | Tool identity |
| tool.version | 1.2.3 | Tool version |
| os.type | linux | Operating system |
| os.version | 6.8.0-106-generic | OS/kernel version |
| host.arch | amd64 | CPU architecture |
| go.version | go1.26.1 | Go runtime version |
| machine.id | 4a3f8c1d9e2b6f70 | Anonymised machine identifier (16 hex chars) |
| command.duration_ms | 142 | Execution time (command events only) |
| command.exit_code | 0 | Exit status (command events only) |
Tool authors can add custom metadata via TelemetryConfig.Metadata (included in every event) or the extra parameter on Track/TrackCommand (per-event).
Extended Collection (Enterprise)
For closed enterprise environments where users are contractually bound by security policies, tool authors can enable extended collection to include additional diagnostic data:
| Field | Example | When |
|---|---|---|
| command.args | --name myapp --verbose | ExtendedCollection: true |
| command.error | missing template file | ExtendedCollection: true |
Extended collection is disabled by default and must be explicitly opted into by the tool author:
Telemetry: props.TelemetryConfig{
ExtendedCollection: true, // enterprise only
Endpoint: "https://internal-analytics.corp.example.com/events",
},
When disabled, TrackCommandExtended silently drops args and error messages, so callers do not need to check the flag. Duration and exit code are always recorded regardless of this setting.
Privacy consideration
Only enable ExtendedCollection in tools deployed within controlled enterprise environments where data handling is governed by employment contracts and security policies. Never enable it for public-facing or open-source tools.
Credential Redaction
Even with ExtendedCollection enabled, command.args and command.error values are never shipped verbatim. Every string is routed through pkg/redact before being attached to the outgoing event. The redactor strips URL userinfo, common credential query parameters (apikey, token, access_token, password, …), Authorization headers quoted in free text, well-known provider prefixes (sk-, ghp_, AIza, AKIA, Slack xoxb-, etc.), and very long opaque tokens.
// A command invoked as:
// tool deploy --api-token=sk-proj-abc123def456...
// Ships as:
event.Args = []string{"--api-token=sk-proj-***", "deploy"}
event.Error = `failed POST https://<redacted>@api.example.co/v1?apikey=***: 401`
The redactor is idempotent and never retains the original string. It catches common shapes, not every possible credential format. Tool authors accepting unusual credential formats in their own commands should either match the common shape conventions (prefix + opaque hex/base64) or contribute a pattern upstream via a PR to pkg/redact.
When a custom telemetry backend is used, events arrive pre-redacted, so the backend does not need to repeat the work.
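To make the idea concrete, here is a minimal sketch of pattern-based redaction. It is not the pkg/redact implementation: the function name redactSketch and all three regular expressions are assumptions chosen for illustration, and they cover only a small subset of the shapes listed above.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative patterns only; the real pkg/redact rules are not shown in
// this document and will be more thorough.
var (
	// Userinfo ("user:pass@" or "token@") in a URL authority.
	userinfoRe = regexp.MustCompile(`(://)[^/\s@]+@`)
	// A handful of common credential query parameters.
	queryRe = regexp.MustCompile(`(?i)\b(apikey|token|access_token|password)=[^&\s]+`)
	// A few well-known provider prefixes followed by an opaque tail.
	prefixRe = regexp.MustCompile(`\b(sk-|ghp_|AIza|AKIA|xoxb-)[A-Za-z0-9_-]{8,}`)
)

// redactSketch (hypothetical name) masks credential-shaped substrings in s.
// The replacements never match their own output, so a second pass is a no-op.
func redactSketch(s string) string {
	s = userinfoRe.ReplaceAllString(s, "${1}<redacted>@")
	s = queryRe.ReplaceAllString(s, "${1}=***")
	s = prefixRe.ReplaceAllString(s, "${1}***")
	return s
}

func main() {
	msg := "GET https://user:pw@api.example.com/v1?apikey=abc123 failed"
	fmt.Println(redactSketch(msg))
}
```

Because each replacement leaves text that its own pattern maps back to itself, running the function on already-redacted input returns it unchanged, which is the idempotence property described above.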
OTel Exporter Header Advisories
If WithOTelHeaders is called with a header name that matches the sensitive-header pattern (Authorization, X-API-Key, custom names containing auth/token/secret/bearer/password/credential), the OTel backend emits a WARN at initialisation time:
WARN OTel header Authorization appears to carry credentials; ensure the
exporter uses TLS and that any HTTP middleware logging headers
redacts this name. See docs/components/telemetry.md.
The warning is advisory: the header is still honoured. It exists so operators can audit which headers carry credentials and confirm their exporter uses TLS. Header values never appear in the warning text.
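A check of this kind could look roughly like the sketch below. The function name looksSensitive and the case-insensitive substring matching are assumptions; the actual sensitive-header pattern used by the OTel backend is not reproduced in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// Fragments taken from the description above; how the real backend
// matches them (regex, substring, exact) is an assumption.
var sensitiveFragments = []string{"auth", "token", "secret", "bearer", "password", "credential"}

// looksSensitive (hypothetical name) reports whether a header name
// appears to carry credentials. Only the name is inspected, never the value.
func looksSensitive(name string) bool {
	n := strings.ToLower(name)
	if n == "x-api-key" {
		return true
	}
	for _, f := range sensitiveFragments {
		if strings.Contains(n, f) {
			return true
		}
	}
	return false
}

func main() {
	for _, h := range []string{"Authorization", "X-API-Key", "Content-Type"} {
		fmt.Printf("%s sensitive=%v\n", h, looksSensitive(h))
	}
}
```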
What Is NOT Collected
By default, the following are never collected:
- Command arguments or flags (unless ExtendedCollection is enabled)
- Error messages (unless ExtendedCollection is enabled)
- File paths or file contents
- Environment variables
- IP addresses
- Usernames, hostnames, or any raw PII
- Authentication tokens or credentials
Event Types
props.EventCommandInvocation // "command.invocation" - a command was run
props.EventCommandError // "command.error" - a command failed
props.EventFeatureUsed // "feature.used" - a feature was exercised
props.EventUpdateCheck // "update.check" - update check performed
props.EventUpdateApplied // "update.applied" - update was applied
props.EventDeletionRequest // "data.deletion_request" - GDPR deletion request
These constants are defined in both pkg/props and pkg/telemetry. Since they resolve to plain strings, values from either package are interchangeable.
Machine ID
The machine ID is a privacy-preserving identifier derived from four system signals:
- OS machine ID: /etc/machine-id (Linux), IOPlatformUUID (macOS), MachineGuid (Windows)
- MAC address: first non-loopback network interface
- Hostname
- Username
All four are concatenated and hashed with SHA-256. The first 8 bytes (16 hex chars) are used. Each signal degrades gracefully if unavailable. The hash cannot be reversed to recover any input value.
The machine ID is computed fresh on every invocation; it is not persisted to config.
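The derivation can be sketched in a few lines. The function name machineIDSketch, the "|" separator, and the signal order are assumptions made for this example; the document only states that the four signals are concatenated, hashed with SHA-256, and truncated to 8 bytes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// machineIDSketch (hypothetical name) hashes the four signals and keeps
// the first 8 bytes of the digest, rendered as 16 hex characters.
// An unavailable signal can simply be passed as an empty string.
func machineIDSketch(osID, mac, hostname, username string) string {
	// The separator and ordering here are assumptions for illustration.
	joined := strings.Join([]string{osID, mac, hostname, username}, "|")
	sum := sha256.Sum256([]byte(joined))
	return hex.EncodeToString(sum[:8])
}

func main() {
	id := machineIDSketch("os-machine-id", "00:11:22:33:44:55", "build-host", "alice")
	fmt.Println(id, len(id))
}
```

Changing any one input yields a different identifier, which is the source of the ID drift discussed under Known Limitations.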
Backends
Noop (disabled state)
Used when telemetry is disabled or no backend is configured. Silently discards all events.
Stdout (debugging)
Writes events as pretty-printed JSON. Useful for development.
File (local-only mode)
Appends events as newline-delimited JSON to a local file. Activated when the user sets telemetry.local_only: true in config or TELEMETRY_LOCAL=true.
HTTP
POSTs events as a JSON array to an endpoint. Network errors are silently dropped. Non-2xx responses are logged at debug level.
OpenTelemetry (OTLP)
Exports events as OTel log records via OTLP/HTTP. Compatible with Grafana Cloud, OpenTelemetry Collector, Datadog Agent, and any OTel-capable backend.
backend, err := telemetry.NewOTelBackend(ctx,
"https://otlp-gateway.example.com/otlp",
telemetry.WithOTelHeaders(map[string]string{
"Authorization": "Basic " + authToken,
}),
telemetry.WithOTelService("mytool", "1.2.3"),
telemetry.WithOTelLogger(logger),
)
The endpoint URL is parsed into host and path components. The SDK appends /v1/logs to the path automatically.
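As a rough illustration of that split, the sketch below uses net/url from the standard library. The helper name splitEndpoint is hypothetical, and joining the /v1/logs suffix by hand only mimics what the SDK is described as doing; it is not how the backend itself is written.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitEndpoint (hypothetical helper) returns the host and the path that
// log records would end up being sent to once /v1/logs is appended.
func splitEndpoint(raw string) (string, string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	if u.Host == "" {
		// Typical cause: the scheme was left off the endpoint.
		return "", "", fmt.Errorf("endpoint %q has no host (missing scheme?)", raw)
	}
	return u.Host, strings.TrimSuffix(u.Path, "/") + "/v1/logs", nil
}

func main() {
	host, path, err := splitEndpoint("https://otlp-gateway.example.com/otlp")
	fmt.Println(host, path, err)
}
```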
OTel Options:
| Option | Description |
|---|---|
| WithOTelHeaders(map) | HTTP headers for every request (e.g. auth) |
| WithOTelInsecure() | Disable TLS (local collectors only) |
| WithOTelLogger(l) | Route OTel SDK errors to GTB logger |
| WithOTelService(name, ver) | Set service.name and service.version resource attributes |
OTel SDK errors
The OTel SDK's logger.Emit() is fire-and-forget. Errors surface asynchronously through the SDK's error handler, not through Backend.Send(). Use WithOTelLogger to route these to your GTB logger at debug level.
Custom Backend
Tool authors can supply any implementation of the Backend interface:
Wire it in via TelemetryConfig.Backend:
Telemetry: props.TelemetryConfig{
Backend: func(p *props.Props) any {
return myanalytics.NewBackend(p.Config.GetString("analytics.key"))
},
},
The factory returns any to avoid an import cycle. The returned value must implement telemetry.Backend; a failed type assertion falls back to noop with a warning.
Datadog
The pkg/telemetry/datadog package provides a backend that sends events to Datadog's HTTP Logs Intake API. Events are mapped to Datadog's native log format with ddsource, ddtags, service, and hostname fields, so they appear immediately in Log Explorer without custom parsing.
import "gitlab.com/phpboyscout/go-tool-base/pkg/telemetry/datadog"
Telemetry: props.TelemetryConfig{
Backend: func(p *props.Props) any {
return datadog.NewBackend(
os.Getenv("DD_API_KEY"),
p.Logger,
datadog.WithRegion(datadog.RegionEU1),
)
},
},
Regions: RegionUS1 (default), RegionUS3, RegionUS5, RegionEU1, RegionAP1, RegionAP2, RegionGOV.
Options:
| Option | Description |
|---|---|
| WithRegion(region) | Datadog region (resolves to the correct intake endpoint) |
| WithSource(source) | Override the ddsource tag (default: "gtb") |
Event mapping:
| Event field | Datadog field |
|---|---|
| Type: Name | message |
| ToolName | service |
| MachineID | hostname |
| Type, Version, OS, Arch | ddtags (comma-separated) |
| Metadata | metadata (nested object) |
PostHog
The pkg/telemetry/posthog package provides a backend that sends events to PostHog's Capture API using batch mode. Events map directly to PostHog's event model, so they appear in the Events tab with all properties queryable.
import "gitlab.com/phpboyscout/go-tool-base/pkg/telemetry/posthog"
Telemetry: props.TelemetryConfig{
Backend: func(p *props.Props) any {
return posthog.NewBackend(
os.Getenv("POSTHOG_PROJECT_KEY"),
p.Logger,
posthog.WithInstance(posthog.InstanceEU),
)
},
},
Self-hosted PostHog:
posthog.NewBackend(
os.Getenv("POSTHOG_PROJECT_KEY"),
p.Logger,
posthog.WithEndpoint("https://posthog.internal.example.com/capture/"),
)
Options:
| Option | Description |
|---|---|
| WithInstance(instance) | PostHog cloud instance: InstanceUS (default), InstanceEU |
| WithEndpoint(url) | Custom endpoint for self-hosted (overrides WithInstance) |
Event mapping:
| Event field | PostHog field |
|---|---|
| Type | event |
| MachineID | distinct_id |
| Name | properties.event_name |
| ToolName | properties.tool_name |
| Version | properties.tool_version |
| OS | properties.$os |
| Arch | properties.arch |
| Metadata | properties.* (merged) |
Choosing a Backend
| Backend | Best for | Auth | Protocol |
|---|---|---|---|
| OTLP | Grafana Cloud, any OTel collector, enterprise observability | Basic auth via headers | OTLP/HTTP (protobuf) |
| Datadog | Teams already using Datadog for infrastructure monitoring | DD-API-KEY header | HTTP JSON |
| PostHog | Product analytics, feature adoption tracking, funnels | Project key in payload | HTTP JSON |
| HTTP | Simple custom endpoints, webhooks | None (bring your own) | HTTP JSON |
| Custom | Any other platform | Defined by implementation | Any |
The OTLP backend is the default recommendation for new deployments: it works with any OTel-compatible collector and avoids vendor lock-in. The Datadog and PostHog backends are provided for teams that want native integration with those platforms without writing a custom backend.
Backend Selection Precedence
When the collector is constructed in PersistentPreRunE, backends are selected in this order:
- Custom backend: TelemetryConfig.Backend factory (if set)
- Local-only: file backend (if telemetry.local_only is true in config)
- OTLP: TelemetryConfig.OTelEndpoint (if set)
- HTTP: TelemetryConfig.Endpoint (if set)
- Noop: no backend configured
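The ordering above amounts to a first-match-wins switch. The sketch below uses a simplified, hypothetical selection struct and a chooseBackend helper that returns a label instead of constructing a real backend; it is meant only to show the precedence, not the collector's actual code.

```go
package main

import "fmt"

// selection holds just the inputs that matter for precedence. It is a
// simplified stand-in, not the real TelemetryConfig.
type selection struct {
	HasCustomFactory bool   // TelemetryConfig.Backend is set
	LocalOnly        bool   // telemetry.local_only is true in config
	OTelEndpoint     string // TelemetryConfig.OTelEndpoint
	Endpoint         string // TelemetryConfig.Endpoint
}

// chooseBackend (hypothetical helper) returns the name of the backend
// that wins under the documented order.
func chooseBackend(s selection) string {
	switch {
	case s.HasCustomFactory:
		return "custom"
	case s.LocalOnly:
		return "file"
	case s.OTelEndpoint != "":
		return "otlp"
	case s.Endpoint != "":
		return "http"
	default:
		return "noop"
	}
}

func main() {
	both := selection{
		OTelEndpoint: "https://otlp-gateway.example.com/otlp",
		Endpoint:     "https://analytics.example.com/events",
	}
	fmt.Println(chooseBackend(both))
}
```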
TelemetryConfig
type TelemetryConfig struct {
Endpoint string // HTTP JSON endpoint
OTelEndpoint string // OTLP/HTTP endpoint (takes precedence)
OTelHeaders map[string]string // OTLP auth headers
OTelInsecure bool // Disable TLS for OTLP
Backend func(*Props) any // Custom backend factory
DeletionRequestor func(*Props) any // Custom GDPR deletion requestor
ExtendedCollection bool // Include args + errors (enterprise only)
DeliveryMode DeliveryMode // at_least_once (default) or at_most_once
Metadata map[string]string // Extra key/value pairs in every event
}
Endpoints are set by the tool author at build time and are not user-configurable. The user config file only stores consent (telemetry.enabled) and mode (telemetry.local_only).
Buffer and Spill
Events are buffered in memory (capped at 1000) and flushed on process exit via Cobra's OnFinalize callback.
When the buffer is full, events are spilled to disk:
- Location: config directory (if available and writable), otherwise /tmp
- File size cap: 1 MB per spill file
- File count cap: 10 files; oldest deleted when exceeded
- Recovery: every Flush checks for spill files first, sends them before the current buffer
The shared telemetry.ResolveDataDir(p) helper determines the data directory for both spill files and local-only logs.
Delivery Modes
| Mode | Behaviour | Trade-off |
|---|---|---|
| DeliveryAtLeastOnce (default) | Spill files deleted after successful send | Possible duplicates if ack is lost; no data loss |
| DeliveryAtMostOnce | Spill files deleted before send | Possible data loss; no duplicates |
Environment Variables
| Variable | Values | Effect |
|---|---|---|
| TELEMETRY_ENABLED | true / false | Bypasses interactive consent; overrides config at runtime |
| TELEMETRY_LOCAL | true / false | Forces local-only mode (file backend) |
| CI | true | Sets --skip-telemetry default to true during init |
These names are deliberately un-prefixed so tools building on GTB can use them without GTB-specific naming conventions.
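One way to picture the TELEMETRY_ENABLED override is the sketch below. The consentEnabled helper is hypothetical, and two of its behaviours are assumptions: values are parsed with strconv.ParseBool (which also accepts forms such as 1 and 0), and an unparseable value falls back to the config setting.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// consentEnabled (hypothetical helper) resolves user consent. lookup has
// the same shape as os.LookupEnv so that tests can substitute a fake.
func consentEnabled(configEnabled bool, lookup func(string) (string, bool)) bool {
	if raw, ok := lookup("TELEMETRY_ENABLED"); ok {
		if v, err := strconv.ParseBool(raw); err == nil {
			return v // the environment overrides the config file
		}
	}
	return configEnabled
}

func main() {
	// With the variable unset, the config value decides.
	fmt.Println(consentEnabled(false, os.LookupEnv))
}
```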
GDPR Data Deletion
The telemetry reset command:
- Drops all buffered events and deletes spill files
- Sends a deletion request via the configured DeletionRequestor
- Clears the local-only telemetry log (if present)
- Disables telemetry
Built-in Deletion Requestors
| Requestor | How it works |
|---|---|
| NewHTTPDeletionRequestor(url, logger) | POSTs {"machine_id": "..."} to the endpoint |
| NewEmailDeletionRequestor(address, toolName) | Opens a pre-filled mailto: link |
| NewEventDeletionRequestor(backend) | Sends a data.deletion_request event through the existing backend |
If no requestor is configured, the event-based requestor is used as the universal fallback.
Custom Requestor
Telemetry: props.TelemetryConfig{
DeletionRequestor: func(p *props.Props) any {
return telemetry.NewHTTPDeletionRequestor(
"https://analytics.example.com/deletion",
p.Logger,
)
},
},
Consent Withdrawal
When the user runs telemetry disable:
- Config is updated to telemetry.enabled: false
- All buffered events are immediately dropped
- All spill files are deleted
- The OnFinalize flush re-checks the enabled state and no-ops
No events are sent after an explicit disable, even if they were collected while consent was active.
Init Integration
When TelemetryCmd is enabled and the tool has InitCmd enabled, the TelemetryInitialiser registers with the setup system. During init, the user is prompted to opt in:
? Anonymous usage telemetry
Help improve mytool by sending anonymous usage statistics.
No personally identifiable information is collected.
You can change this at any time with `mytool telemetry enable/disable`.
> Yes / No
The --skip-telemetry flag (default true when CI=true) suppresses the prompt in non-interactive environments. The TELEMETRY_ENABLED env var pre-answers the consent question.
Tools Without Init
For tools that disable InitCmd (like the GTB binary itself), the telemetry enable command auto-creates the config file in the default config directory (~/.toolname/config.yaml) if one doesn't exist.
Testing
Unit Tests
Use the noop collector; Props.Collector is always non-nil:
Or create a disabled collector for explicit testing:
c := telemetry.NewCollector(telemetry.Config{}, telemetry.NewNoopBackend(),
"test", "1.0.0", nil, logger.NewNoop(), "", props.DeliveryAtLeastOnce)
Verifying Events
Use a spy backend to capture events in tests:
type spyBackend struct {
events []telemetry.Event
mu sync.Mutex
}
func (s *spyBackend) Send(_ context.Context, events []telemetry.Event) error {
s.mu.Lock()
defer s.mu.Unlock()
s.events = append(s.events, events...)
return nil
}
func (s *spyBackend) Close() error { return nil }
Known Limitations
Machine ID Variability
The machine ID is computed fresh on every invocation from four system signals: OS machine ID, MAC address, hostname, and username. If any signal changes (network adapter swap, hostname rename, container restart, user switch), the hash changes. This means:
- GDPR deletion requests sent via telemetry reset may not match all historical events if the machine ID has changed since those events were recorded.
- De-duplication on the backend side should use a time window in addition to machine ID.
Persisting the ID to config was considered but rejected: a stored identity that follows the user across machines is a greater privacy risk than occasional ID drift.
Thread Safety of Spill Files
The spill file mechanism trades strict thread safety for simplicity. flushSpillFiles() reads spill files from disk without holding the buffer mutex, while concurrent Track() calls may write new spill files (with the mutex held). In practice:
- Filesystem operations are atomic at the OS level.
- The worst case is missing a freshly-written spill file (caught on the next flush) or attempting to read a file that was concurrently deleted (handled gracefully with a continue).
- Drop() deleting spill files during concurrent Track() is safe: os.Remove on a file that is already gone only returns a not-exist error, which is harmless here, and OnFinalize re-checks the enabled state before flushing.
Backend Error Semantics
Backend.Send() error behaviour varies by implementation:
| Backend | Network errors | Other errors |
|---|---|---|
| Noop | N/A | Always returns nil |
| Stdout | N/A | Returns encoder errors |
| File | N/A | Returns file I/O errors |
| HTTP | Silently returns nil | Non-2xx logged at debug |
| OTLP | Surfaced via OTel error handler | Returns nil from Send |
This means Flush() only logs warnings for file/stdout backend failures. HTTP and OTLP failures are either silently dropped or routed through the OTel SDK error handler. This is by design, since telemetry must never block the CLI, but tool authors debugging delivery issues should enable debug logging.
Backend Fallback on Misconfiguration
If a tool author misconfigures OTelEndpoint (e.g. missing scheme, unreachable host), the backend creation fails at startup. The collector falls back to a noop backend with a warning log. Events are silently discarded until the endpoint is corrected. Enable debug logging during development to surface these warnings.
Buffer Size
The in-memory buffer is capped at 1000 events. This is not currently configurable. For most CLI tools this is more than sufficient (a typical invocation produces 1-3 events). Long-running services with high event rates may see frequent disk spills, which is handled gracefully but adds I/O overhead.
Local-Only Mode
When telemetry.local_only is true in config (or TELEMETRY_LOCAL=true), the file backend is selected and no data is transmitted remotely. This is mutually exclusive with HTTP/OTLP backends: setting both does not produce dual-write. If you need both local logging and remote transmission, use a custom backend that tees to both.
Metadata Merge Precedence
When both TelemetryConfig.Metadata (tool-level) and the extra parameter (per-event) contain the same key, the per-event value wins. This allows commands to override tool-level defaults for specific events.
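The precedence rule is the usual "later map wins" merge. The mergeMetadata helper below is a hypothetical sketch of that rule, not the collector's own function.

```go
package main

import "fmt"

// mergeMetadata (hypothetical helper) combines tool-level and per-event
// metadata into a new map without modifying either input.
func mergeMetadata(toolLevel, perEvent map[string]string) map[string]string {
	merged := make(map[string]string, len(toolLevel)+len(perEvent))
	for k, v := range toolLevel {
		merged[k] = v
	}
	for k, v := range perEvent {
		merged[k] = v // per-event value wins on a key collision
	}
	return merged
}

func main() {
	tool := map[string]string{"channel": "stable", "team": "platform"}
	event := map[string]string{"channel": "beta"}
	fmt.Println(mergeMetadata(tool, event))
}
```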
Insecure Transport
If OTelEndpoint uses the http:// scheme (no TLS), event data is transmitted unencrypted. The code correctly enables insecure mode for this case but does not warn. Use https:// for all production endpoints. The WithOTelInsecure() option is an explicit opt-in for local development collectors.
Related Documentation
- Telemetry Command: CLI commands for managing telemetry
- Props: dependency injection container (Collector field)
- Create a Custom Telemetry Backend: implement your own backend
- Create a Custom Deletion Requestor: GDPR deletion for custom backends
- Telemetry Specification: full design spec
- Vendor Backends Specification: Datadog and PostHog backends