Skip to content

Telemetry

Overview

The telemetry package provides an opt-in framework for collecting anonymous usage analytics from CLI tools built on GTB. It is designed around three principles:

  1. Explicit consent โ€” telemetry is never enabled by default. Users must opt in via telemetry enable, the init prompt, or the TELEMETRY_ENABLED environment variable.
  2. Privacy by design โ€” no personally identifiable information is collected. Machine IDs are derived from multiple system signals and hashed with SHA-256. Command arguments, file contents, and IP addresses are never recorded.
  3. Pluggable backends โ€” tool authors choose where data goes. The framework ships noop, stdout, file, HTTP, and OpenTelemetry (OTLP) backends, and supports custom implementations.

Quick Start

Enable telemetry for your tool

props.Tool{
    Name: "mytool",
    Features: props.SetFeatures(
        props.Enable(props.TelemetryCmd),
    ),
    Telemetry: props.TelemetryConfig{
        Endpoint: "https://analytics.example.com/events",
    },
}

Emit events from commands

func runMyCommand(p *props.Props) error {
    start := time.Now()

    // ... command logic ...

    p.Collector.TrackCommand("my-command", time.Since(start).Milliseconds(), 0, nil)
    return nil
}

User opt-in

mytool telemetry enable    # opt in
mytool telemetry status    # check current state
mytool telemetry disable   # opt out (drops all pending events)
mytool telemetry reset     # clear local data + request remote deletion

Two-Level Gating

Telemetry requires two conditions to be active:

Level Who controls it How
Feature flag Tool author props.Enable(props.TelemetryCmd) in code
User consent End user telemetry enable command or TELEMETRY_ENABLED=true env var

Both must be active for data to be collected. If either is missing, the collector is a silent noop.


What Is Collected

Every telemetry event contains:

Field Example Description
event.type command.invocation Event category
event.name generate Specific command or feature
tool.name mytool Tool identity
tool.version 1.2.3 Tool version
os.type linux Operating system
os.version 6.8.0-106-generic OS/kernel version
host.arch amd64 CPU architecture
go.version go1.26.1 Go runtime version
machine.id 4a3f8c1d9e2b6f70 Anonymised machine identifier (16 hex chars)
command.duration_ms 142 Execution time (command events only)
command.exit_code 0 Exit status (command events only)

Tool authors can add custom metadata via TelemetryConfig.Metadata (included in every event) or the extra parameter on Track/TrackCommand (per-event).

Extended Collection (Enterprise)

For closed enterprise environments where users are contractually bound by security policies, tool authors can enable extended collection to include additional diagnostic data:

Field Example When
command.args --name myapp --verbose ExtendedCollection: true
command.error missing template file ExtendedCollection: true

Extended collection is disabled by default and must be explicitly opted into by the tool author:

Telemetry: props.TelemetryConfig{
    ExtendedCollection: true, // enterprise only
    Endpoint: "https://internal-analytics.corp.example.com/events",
},

When disabled, TrackCommandExtended silently drops args and error messages โ€” callers do not need to check the flag. Duration and exit code are always recorded regardless of this setting.

Privacy consideration

Only enable ExtendedCollection in tools deployed within controlled enterprise environments where data handling is governed by employment contracts and security policies. Never enable it for public-facing or open-source tools.

Credential Redaction

Even with ExtendedCollection enabled, command.args and command.error values are never shipped verbatim. Every string is routed through pkg/redact before being attached to the outgoing event. The redactor strips URL userinfo, common credential query parameters (apikey, token, access_token, password, โ€ฆ), Authorization headers quoted in free text, well-known provider prefixes (sk-, ghp_, AIza, AKIA, Slack xoxb-, etc.), and very long opaque tokens.

// A command invoked as:
//   tool deploy --api-token=sk-proj-abc123def456...

// Ships as:
event.Args  = []string{"--api-token=sk-proj-***", "deploy"}
event.Error = `failed POST https://<redacted>@api.example.co/v1?apikey=***: 401`

The redactor is idempotent and never retains the original string. It catches common shapes โ€” not every possible credential format. Tool authors accepting unusual credential formats in their own commands should either match the common shape conventions (prefix + opaque hex/base64) or contribute a pattern upstream via a PR to pkg/redact.

When a custom telemetry backend is used, events arrive pre-redacted โ€” the backend does not need to repeat the work.

OTel Exporter Header Advisories

If WithOTelHeaders is called with a header name that matches the sensitive-header pattern (Authorization, X-API-Key, custom names containing auth/token/secret/bearer/password/credential), the OTel backend emits a WARN at initialisation time:

WARN  OTel header Authorization appears to carry credentials; ensure the
      exporter uses TLS and that any HTTP middleware logging headers
      redacts this name. See docs/components/telemetry.md.

The warning is advisory โ€” the header is still honoured. It exists so operators can audit which headers carry credentials and confirm their exporter uses TLS. Header values never appear in the warning text.

What Is NOT Collected

By default, the following are never collected:

  • Command arguments or flags (unless ExtendedCollection is enabled)
  • Error messages (unless ExtendedCollection is enabled)
  • File paths or file contents
  • Environment variables
  • IP addresses
  • Usernames, hostnames, or any raw PII
  • Authentication tokens or credentials

Event Types

props.EventCommandInvocation  // "command.invocation" โ€” a command was run
props.EventCommandError       // "command.error" โ€” a command failed
props.EventFeatureUsed        // "feature.used" โ€” a feature was exercised
props.EventUpdateCheck        // "update.check" โ€” update check performed
props.EventUpdateApplied      // "update.applied" โ€” update was applied
props.EventDeletionRequest    // "data.deletion_request" โ€” GDPR deletion request

These constants are defined in both pkg/props and pkg/telemetry. Since they resolve to plain strings, values from either package are interchangeable.


Machine ID

The machine ID is a privacy-preserving identifier derived from four system signals:

  1. OS machine ID โ€” /etc/machine-id (Linux), IOPlatformUUID (macOS), MachineGuid (Windows)
  2. MAC address โ€” first non-loopback network interface
  3. Hostname
  4. Username

All four are concatenated and hashed with SHA-256. The first 8 bytes (16 hex chars) are used. Each signal degrades gracefully if unavailable. The hash cannot be reversed to recover any input value.

The machine ID is computed fresh on every invocation โ€” it is not persisted to config.

$ mytool telemetry status
Telemetry: enabled
Machine ID: 4a3f8c1d9e2b6f70

Backends

Noop (disabled state)

Used when telemetry is disabled or no backend is configured. Silently discards all events.

Stdout (debugging)

Writes events as pretty-printed JSON. Useful for development.

telemetry.NewStdoutBackend(os.Stdout)

File (local-only mode)

Appends events as newline-delimited JSON to a local file. Activated when the user sets telemetry.local_only: true in config or TELEMETRY_LOCAL=true.

telemetry.NewFileBackend("/path/to/telemetry.log")

HTTP

POSTs events as a JSON array to an endpoint. Network errors are silently dropped. Non-2xx responses are logged at debug level.

telemetry.NewHTTPBackend("https://analytics.example.com/events", logger)

OpenTelemetry (OTLP)

Exports events as OTel log records via OTLP/HTTP. Compatible with Grafana Cloud, OpenTelemetry Collector, Datadog Agent, and any OTel-capable backend.

backend, err := telemetry.NewOTelBackend(ctx,
    "https://otlp-gateway.example.com/otlp",
    telemetry.WithOTelHeaders(map[string]string{
        "Authorization": "Basic " + authToken,
    }),
    telemetry.WithOTelService("mytool", "1.2.3"),
    telemetry.WithOTelLogger(logger),
)

The endpoint URL is parsed into host and path components. The SDK appends /v1/logs to the path automatically.

OTel Options:

Option Description
WithOTelHeaders(map) HTTP headers for every request (e.g. auth)
WithOTelInsecure() Disable TLS (local collectors only)
WithOTelLogger(l) Route OTel SDK errors to GTB logger
WithOTelService(name, ver) Set service.name and service.version resource attributes

OTel SDK errors

The OTel SDK's logger.Emit() is fire-and-forget. Errors surface asynchronously through the SDK's error handler, not through Backend.Send(). Use WithOTelLogger to route these to your GTB logger at debug level.

Custom Backend

Tool authors can supply any implementation of the Backend interface:

type Backend interface {
    Send(ctx context.Context, events []Event) error
    Close() error
}

Wire it in via TelemetryConfig.Backend:

Telemetry: props.TelemetryConfig{
    Backend: func(p *props.Props) any {
        return myanalytics.NewBackend(p.Config.GetString("analytics.key"))
    },
},

The factory returns any to avoid an import cycle. The returned value must implement telemetry.Backend โ€” a failed type assertion falls back to noop with a warning.

Datadog

The pkg/telemetry/datadog package provides a backend that sends events to Datadog's HTTP Logs Intake API. Events are mapped to Datadog's native log format with ddsource, ddtags, service, and hostname fields โ€” they appear immediately in Log Explorer without custom parsing.

import "gitlab.com/phpboyscout/go-tool-base/pkg/telemetry/datadog"

Telemetry: props.TelemetryConfig{
    Backend: func(p *props.Props) any {
        return datadog.NewBackend(
            os.Getenv("DD_API_KEY"),
            p.Logger,
            datadog.WithRegion(datadog.RegionEU1),
        )
    },
},

Regions: RegionUS1 (default), RegionUS3, RegionUS5, RegionEU1, RegionAP1, RegionAP2, RegionGOV.

Options:

Option Description
WithRegion(region) Datadog region (resolves to the correct intake endpoint)
WithSource(source) Override the ddsource tag (default: "gtb")

Event mapping:

Event field Datadog field
Type: Name message
ToolName service
MachineID hostname
Type, Version, OS, Arch ddtags (comma-separated)
Metadata metadata (nested object)

PostHog

The pkg/telemetry/posthog package provides a backend that sends events to PostHog's Capture API using batch mode. Events map directly to PostHog's event model โ€” they appear in the Events tab with all properties queryable.

import "gitlab.com/phpboyscout/go-tool-base/pkg/telemetry/posthog"

Telemetry: props.TelemetryConfig{
    Backend: func(p *props.Props) any {
        return posthog.NewBackend(
            os.Getenv("POSTHOG_PROJECT_KEY"),
            p.Logger,
            posthog.WithInstance(posthog.InstanceEU),
        )
    },
},

Self-hosted PostHog:

posthog.NewBackend(
    os.Getenv("POSTHOG_PROJECT_KEY"),
    p.Logger,
    posthog.WithEndpoint("https://posthog.internal.example.com/capture/"),
)

Options:

Option Description
WithInstance(instance) PostHog cloud instance: InstanceUS (default), InstanceEU
WithEndpoint(url) Custom endpoint for self-hosted (overrides WithInstance)

Event mapping:

Event field PostHog field
Type event
MachineID distinct_id
Name properties.event_name
ToolName properties.tool_name
Version properties.tool_version
OS properties.$os
Arch properties.arch
Metadata properties.* (merged)

Choosing a Backend

Backend Best for Auth Protocol
OTLP Grafana Cloud, any OTel collector, enterprise observability Basic auth via headers OTLP/HTTP (protobuf)
Datadog Teams already using Datadog for infrastructure monitoring DD-API-KEY header HTTP JSON
PostHog Product analytics, feature adoption tracking, funnels Project key in payload HTTP JSON
HTTP Simple custom endpoints, webhooks None (bring your own) HTTP JSON
Custom Any other platform Defined by implementation Any

The OTLP backend is the default recommendation for new deployments โ€” it works with any OTel-compatible collector and avoids vendor lock-in. The Datadog and PostHog backends are provided for teams that want native integration with those platforms without writing a custom backend.


Backend Selection Precedence

When the collector is constructed in PersistentPreRunE, backends are selected in this order:

  1. Custom backend โ€” TelemetryConfig.Backend factory (if set)
  2. Local-only โ€” file backend (if telemetry.local_only is true in config)
  3. OTLP โ€” TelemetryConfig.OTelEndpoint (if set)
  4. HTTP โ€” TelemetryConfig.Endpoint (if set)
  5. Noop โ€” no backend configured

TelemetryConfig

type TelemetryConfig struct {
    Endpoint           string               // HTTP JSON endpoint
    OTelEndpoint       string               // OTLP/HTTP endpoint (takes precedence)
    OTelHeaders        map[string]string    // OTLP auth headers
    OTelInsecure       bool                 // Disable TLS for OTLP
    Backend            func(*Props) any     // Custom backend factory
    DeletionRequestor  func(*Props) any     // Custom GDPR deletion requestor
    ExtendedCollection bool                 // Include args + errors (enterprise only)
    DeliveryMode       DeliveryMode         // at_least_once (default) or at_most_once
    Metadata           map[string]string    // Extra key/value pairs in every event
}

Endpoints are set by the tool author at build time and are not user-configurable. The user config file only stores consent (telemetry.enabled) and mode (telemetry.local_only).


Buffer and Spill

Events are buffered in memory (capped at 1000) and flushed on process exit via Cobra's OnFinalize callback.

When the buffer is full, events are spilled to disk:

  • Location: config directory (if available and writable), otherwise /tmp
  • File size cap: 1 MB per spill file
  • File count cap: 10 files โ€” oldest deleted when exceeded
  • Recovery: every Flush checks for spill files first, sends them before the current buffer

The shared telemetry.ResolveDataDir(p) helper determines the data directory for both spill files and local-only logs.


Delivery Modes

Mode Behaviour Trade-off
DeliveryAtLeastOnce (default) Spill files deleted after successful send Possible duplicates if ack is lost; no data loss
DeliveryAtMostOnce Spill files deleted before send Possible data loss; no duplicates
Telemetry: props.TelemetryConfig{
    DeliveryMode: props.DeliveryAtMostOnce,
},

Environment Variables

Variable Values Effect
TELEMETRY_ENABLED true / false Bypasses interactive consent; overrides config at runtime
TELEMETRY_LOCAL true / false Forces local-only mode (file backend)
CI true Sets --skip-telemetry default to true during init

These names are deliberately un-prefixed so tools building on GTB can use them without GTB-specific naming conventions.


GDPR Data Deletion

The telemetry reset command:

  1. Drops all buffered events and deletes spill files
  2. Sends a deletion request via the configured DeletionRequestor
  3. Clears the local-only telemetry log (if present)
  4. Disables telemetry

Built-in Deletion Requestors

Requestor How it works
NewHTTPDeletionRequestor(url, logger) POSTs {"machine_id": "..."} to the endpoint
NewEmailDeletionRequestor(address, toolName) Opens a pre-filled mailto: link
NewEventDeletionRequestor(backend) Sends a data.deletion_request event through the existing backend

If no requestor is configured, the event-based requestor is used as the universal fallback.

Custom Requestor

Telemetry: props.TelemetryConfig{
    DeletionRequestor: func(p *props.Props) any {
        return telemetry.NewHTTPDeletionRequestor(
            "https://analytics.example.com/deletion",
            p.Logger,
        )
    },
},

When the user runs telemetry disable:

  1. Config is updated to telemetry.enabled: false
  2. All buffered events are immediately dropped
  3. All spill files are deleted
  4. The OnFinalize flush re-checks the enabled state and no-ops

No events are sent after an explicit disable, even if they were collected while consent was active.


Init Integration

When TelemetryCmd is enabled and the tool has InitCmd enabled, the TelemetryInitialiser registers with the setup system. During init, the user is prompted to opt in:

? Anonymous usage telemetry
  Help improve mytool by sending anonymous usage statistics.
  No personally identifiable information is collected.
  You can change this at any time with `mytool telemetry enable/disable`.
  > Yes / No

The --skip-telemetry flag (default true when CI=true) suppresses the prompt in non-interactive environments. The TELEMETRY_ENABLED env var pre-answers the consent question.

Tools Without Init

For tools that disable InitCmd (like the GTB binary itself), the telemetry enable command auto-creates the config file in the default config directory (~/.toolname/config.yaml) if one doesn't exist.


Testing

Unit Tests

Use the noop collector โ€” Props.Collector is always non-nil:

p := &props.Props{
    // Collector is nil โ€” telemetry calls are safe but do nothing
}

Or create a disabled collector for explicit testing:

c := telemetry.NewCollector(telemetry.Config{}, telemetry.NewNoopBackend(),
    "test", "1.0.0", nil, logger.NewNoop(), "", props.DeliveryAtLeastOnce)

Verifying Events

Use a spy backend to capture events in tests:

type spyBackend struct {
    events []telemetry.Event
    mu     sync.Mutex
}

func (s *spyBackend) Send(_ context.Context, events []telemetry.Event) error {
    s.mu.Lock()
    defer s.mu.Unlock()
    s.events = append(s.events, events...)
    return nil
}

func (s *spyBackend) Close() error { return nil }

Known Limitations

Machine ID Variability

The machine ID is computed fresh on every invocation from four system signals: OS machine ID, MAC address, hostname, and username. If any signal changes (network adapter swap, hostname rename, container restart, user switch), the hash changes. This means:

  • GDPR deletion requests sent via telemetry reset may not match all historical events if the machine ID has changed since those events were recorded.
  • De-duplication on the backend side should use a time window in addition to machine ID.

Persisting the ID to config was considered but rejected โ€” a stored identity that follows the user across machines is a greater privacy risk than occasional ID drift.

Thread Safety of Spill Files

The spill file mechanism trades strict thread safety for simplicity. flushSpillFiles() reads spill files from disk without holding the buffer mutex, while concurrent Track() calls may write new spill files (with the mutex held). In practice:

  • Filesystem operations are atomic at the OS level.
  • The worst case is missing a freshly-written spill file (caught on the next flush) or attempting to read a file that was concurrently deleted (handled gracefully with a continue).
  • Drop() deleting spill files during concurrent Track() is safe โ€” os.Remove on a non-existent file succeeds silently, and OnFinalize re-checks the enabled state before flushing.

Backend Error Semantics

Backend.Send() error behaviour varies by implementation:

Backend Network errors Other errors
Noop N/A Always returns nil
Stdout N/A Returns encoder errors
File N/A Returns file I/O errors
HTTP Silently returns nil Non-2xx logged at debug
OTLP Surfaced via OTel error handler Returns nil from Send

This means Flush() only logs warnings for file/stdout backend failures. HTTP and OTLP failures are either silently dropped or routed through the OTel SDK error handler. This is by design โ€” telemetry must never block the CLI โ€” but tool authors debugging delivery issues should enable debug logging.

Backend Fallback on Misconfiguration

If a tool author misconfigures OTelEndpoint (e.g. missing scheme, unreachable host), the backend creation fails at startup. The collector falls back to a noop backend with a warning log. Events are silently discarded until the endpoint is corrected. Enable debug logging during development to surface these warnings.

Buffer Size

The in-memory buffer is capped at 1000 events. This is not currently configurable. For most CLI tools this is more than sufficient (a typical invocation produces 1-3 events). Long-running services with high event rates may see frequent disk spills, which is handled gracefully but adds I/O overhead.

Local-Only Mode

When telemetry.local_only is true in config (or TELEMETRY_LOCAL=true), the file backend is selected and no data is transmitted remotely. This is mutually exclusive with HTTP/OTLP backends โ€” setting both does not produce dual-write. If you need both local logging and remote transmission, use a custom backend that tees to both.

Metadata Merge Precedence

When both TelemetryConfig.Metadata (tool-level) and the extra parameter (per-event) contain the same key, the per-event value wins. This allows commands to override tool-level defaults for specific events.

Insecure Transport

If OTelEndpoint uses the http:// scheme (no TLS), event data is transmitted unencrypted. The code correctly enables insecure mode for this case but does not warn. Use https:// for all production endpoints. The WithOTelInsecure() option is an explicit opt-in for local development collectors.