Structured Changelog Diffing Specification

Authors: Matt Cockayne, Claude (claude-opus-4-6) (AI drafting assistant)
Date: 26 March 2026
Status: DRAFT

Overview

The SelfUpdater.GetReleaseNotes method in pkg/setup/update.go fetches release notes between two versions and concatenates them as raw markdown. When multiple releases are included, the output is a wall of unstructured text that makes it difficult to identify what changed -- particularly breaking changes that require action before upgrading.

This specification adds a changelog parser that understands conventional commit-based release notes (as generated by semantic-release) and produces a structured summary categorised into breaking changes, features, fixes, and other changes. Breaking changes are highlighted prominently so users can assess upgrade risk before proceeding.

The parser is implemented as a standalone package (pkg/changelog) so it can be reused outside the update flow (e.g., in documentation generation or CI tooling).


Design Decisions

Separate pkg/changelog package: The parsing logic is general-purpose and not specific to the update command. Placing it in its own package follows the library-first principle and allows reuse.

Parse, don't template: The parser produces structured Go types. Rendering to terminal output, markdown, or other formats is handled by the caller. This keeps the parser pure and testable without terminal dependencies.

Conventional commit format assumed: The parser targets the format generated by semantic-release (section headers like ### Features, ### Bug Fixes, ### BREAKING CHANGES, and bullet points with scope and description). Other formats pass through as "other" changes.

Per-release grouping preserved: The output preserves the per-release structure so consumers can display changes grouped by version or flattened into a single list.

No external dependencies: The parser uses only the standard library (strings, regexp). No markdown AST library is needed since semantic-release output follows a predictable format.


Public API Changes

New Package: pkg/changelog

// Category classifies a change entry.
type Category int

const (
    // CategoryBreaking represents a breaking change requiring user action.
    CategoryBreaking Category = iota
    // CategoryFeature represents a new feature.
    CategoryFeature
    // CategoryFix represents a bug fix.
    CategoryFix
    // CategoryPerformance represents a performance improvement.
    CategoryPerformance
    // CategoryOther represents any other change (refactor, docs, chore, etc.).
    CategoryOther
)

// Entry represents a single change within a release.
type Entry struct {
    // Category is the type of change.
    Category Category
    // Scope is the conventional commit scope (e.g., "http", "chat"). May be empty.
    Scope string
    // Description is the change description text.
    Description string
    // Raw is the original unparsed line from the release notes.
    Raw string
}

// Release represents the parsed changelog for a single release version.
type Release struct {
    // Version is the release tag (e.g., "v1.5.0").
    Version string
    // Entries contains all parsed change entries for this release.
    Entries []Entry
}

// Changelog represents the parsed changelog across multiple releases.
type Changelog struct {
    // FromVersion is the starting version (exclusive).
    FromVersion string
    // ToVersion is the ending version (inclusive).
    ToVersion string
    // Releases contains parsed releases ordered from oldest to newest.
    Releases []Release
}

// Parse parses raw release notes markdown (as returned by SelfUpdater.GetReleaseNotes)
// into a structured Changelog.
func Parse(rawNotes string) (*Changelog, error)

// HasBreakingChanges returns true if any release contains breaking changes.
func (c *Changelog) HasBreakingChanges() bool

// BreakingChanges returns all breaking change entries across all releases.
func (c *Changelog) BreakingChanges() []Entry

// EntriesByCategory returns all entries matching the given category across all releases.
func (c *Changelog) EntriesByCategory(cat Category) []Entry
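
The three query helpers can all be built on one traversal. The following is a minimal sketch, assuming the types declared above; the method bodies are illustrative and are not taken from the actual implementation.

```go
package main

import "fmt"

type Category int

const (
    CategoryBreaking Category = iota
    CategoryFeature
    CategoryFix
    CategoryPerformance
    CategoryOther
)

type Entry struct {
    Category    Category
    Scope       string
    Description string
    Raw         string
}

type Release struct {
    Version string
    Entries []Entry
}

type Changelog struct {
    FromVersion string
    ToVersion   string
    Releases    []Release
}

// EntriesByCategory walks releases in stored order (oldest to newest)
// and collects every entry whose category matches.
func (c *Changelog) EntriesByCategory(cat Category) []Entry {
    var out []Entry
    for _, r := range c.Releases {
        for _, e := range r.Entries {
            if e.Category == cat {
                out = append(out, e)
            }
        }
    }
    return out
}

// BreakingChanges is a thin wrapper over EntriesByCategory.
func (c *Changelog) BreakingChanges() []Entry {
    return c.EntriesByCategory(CategoryBreaking)
}

// HasBreakingChanges reports whether any breaking entry exists.
func (c *Changelog) HasBreakingChanges() bool {
    return len(c.BreakingChanges()) > 0
}

func main() {
    cl := &Changelog{Releases: []Release{
        {Version: "v1.1.0", Entries: []Entry{{Category: CategoryFeature, Scope: "http", Description: "add retries"}}},
        {Version: "v2.0.0", Entries: []Entry{{Category: CategoryBreaking, Description: "drop legacy config"}}},
    }}
    fmt.Println(cl.HasBreakingChanges(), len(cl.EntriesByCategory(CategoryFeature)))
}
```

Routing everything through EntriesByCategory keeps the ordering guarantee (oldest release first) in one place.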

New Function in pkg/changelog: FormatSummary

// FormatSummary produces a human-readable summary of the changelog suitable for
// terminal output. Breaking changes appear first with a warning prefix.
func FormatSummary(cl *Changelog) string

Integration Point in pkg/setup/update.go

The existing GetReleaseNotes method signature is unchanged. A new method is added alongside it:

// GetStructuredReleaseNotes retrieves release notes between two versions and
// returns them as a parsed Changelog.
func (s *SelfUpdater) GetStructuredReleaseNotes(ctx context.Context, from, to string) (*changelog.Changelog, error)

Internal Implementation

Parser

The parser processes the raw markdown line-by-line:

  1. Split into per-release sections by detecting # vX.Y.Z headers.
  2. Within each release, detect section headers (### Features, ### Bug Fixes, ### BREAKING CHANGES, etc.) to determine the current category.
  3. Parse bullet points (* **scope:** description or * description) into Entry structs.
  4. Lines that do not match any pattern are collected as CategoryOther entries.

var (
    releaseHeaderRe = regexp.MustCompile(`^#\s+(v?\d+\.\d+\.\d+.*)`)
    sectionHeaderRe = regexp.MustCompile(`^###?\s+(.+)`)
    // Matches "* **scope:** description" (colon inside the bold markers) and "* description".
    entryRe         = regexp.MustCompile(`^\*\s+(?:\*\*([^*:]+):\*\*\s*)?(.+)`)
)

func Parse(rawNotes string) (*Changelog, error) {
    cl := &Changelog{}
    // ... line-by-line parsing
    return cl, nil
}
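
The four parsing steps can be sketched as a single pass over the lines. This is an illustrative sketch, not the shipped parser: it assumes the colon sits inside the bold markers (`* **scope:** description`), maps unknown section headers to CategoryOther, and places lines that precede any release header into an implicit release with an empty Version.

```go
package main

import (
    "fmt"
    "regexp"
    "strings"
)

type Category int

const (
    CategoryBreaking Category = iota
    CategoryFeature
    CategoryFix
    CategoryPerformance
    CategoryOther
)

type Entry struct {
    Category           Category
    Scope, Description string
    Raw                string
}

type Release struct {
    Version string
    Entries []Entry
}

type Changelog struct{ Releases []Release }

var (
    releaseHeaderRe = regexp.MustCompile(`^#\s+(v?\d+\.\d+\.\d+.*)`)
    sectionHeaderRe = regexp.MustCompile(`^###?\s+(.+)`)
    // Assumption: scope and colon are both inside the bold markers.
    entryRe = regexp.MustCompile(`^\*\s+(?:\*\*([^*:]+):\*\*\s*)?(.+)`)
)

var sectionCategories = map[string]Category{
    "breaking changes":         CategoryBreaking,
    "features":                 CategoryFeature,
    "bug fixes":                CategoryFix,
    "performance improvements": CategoryPerformance,
}

func Parse(rawNotes string) (*Changelog, error) {
    cl := &Changelog{}
    cat := CategoryOther
    for _, raw := range strings.Split(rawNotes, "\n") {
        line := strings.TrimSpace(raw)
        if line == "" {
            continue
        }
        // Step 1: a release header starts a new Release and resets the category.
        if m := releaseHeaderRe.FindStringSubmatch(line); m != nil {
            cl.Releases = append(cl.Releases, Release{Version: m[1]})
            cat = CategoryOther
            continue
        }
        // Step 2: a section header selects the category for following bullets.
        if m := sectionHeaderRe.FindStringSubmatch(line); m != nil {
            c, ok := sectionCategories[strings.ToLower(strings.TrimSpace(m[1]))]
            if !ok {
                c = CategoryOther
            }
            cat = c
            continue
        }
        if len(cl.Releases) == 0 {
            cl.Releases = append(cl.Releases, Release{}) // implicit release
        }
        // Step 3: bullets become entries; step 4: anything else is "other".
        e := Entry{Category: CategoryOther, Description: line, Raw: raw}
        if m := entryRe.FindStringSubmatch(line); m != nil {
            e.Category, e.Scope, e.Description = cat, m[1], m[2]
        }
        last := &cl.Releases[len(cl.Releases)-1]
        last.Entries = append(last.Entries, e)
    }
    return cl, nil
}

func main() {
    notes := "# v1.5.0\n\n### Features\n\n* **http:** add retry support\n* tidy logs\n\n" +
        "### BREAKING CHANGES\n\n* **config:** remove legacy loader\n"
    cl, _ := Parse(notes)
    for _, r := range cl.Releases {
        for _, e := range r.Entries {
            fmt.Printf("%s category=%d scope=%q desc=%q\n", r.Version, e.Category, e.Scope, e.Description)
        }
    }
}
```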

Section-to-Category Mapping

var sectionCategories = map[string]Category{
    "breaking changes":         CategoryBreaking,
    "features":                 CategoryFeature,
    "bug fixes":                CategoryFix,
    "performance improvements": CategoryPerformance,
    // Aliases
    "feat": CategoryFeature,
    "fix":  CategoryFix,
    "perf": CategoryPerformance,
}

In addition to ### BREAKING CHANGES sections, the parser detects BREAKING CHANGE: footers within individual entries and reclassifies them as CategoryBreaking.
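
One way to express the footer reclassification is a small post-processing step applied to each parsed entry. The sketch below is an assumption about how this could look: reclassifyBreaking is a hypothetical helper name, and it also accepts the `BREAKING-CHANGE:` spelling, which the Conventional Commits specification treats as a synonym.

```go
package main

import (
    "fmt"
    "strings"
)

type Category int

const (
    CategoryBreaking Category = iota
    CategoryFeature
    CategoryFix
    CategoryPerformance
    CategoryOther
)

type Entry struct {
    Category    Category
    Scope       string
    Description string
    Raw         string
}

// breakingMarkers lists the footer tokens that flag a breaking change.
var breakingMarkers = []string{"BREAKING CHANGE:", "BREAKING-CHANGE:"}

// reclassifyBreaking (hypothetical helper) returns a copy of e with
// Category set to CategoryBreaking when a footer marker appears in the
// entry text; otherwise e is returned unchanged.
func reclassifyBreaking(e Entry) Entry {
    for _, marker := range breakingMarkers {
        if strings.Contains(e.Description, marker) || strings.Contains(e.Raw, marker) {
            e.Category = CategoryBreaking
            return e
        }
    }
    return e
}

func main() {
    e := Entry{
        Category:    CategoryFeature,
        Scope:       "config",
        Description: "new loader BREAKING CHANGE: legacy loader removed",
    }
    fmt.Println(reclassifyBreaking(e).Category == CategoryBreaking)
}
```

The match is case-sensitive on purpose, since the footer token is conventionally written in upper case.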

FormatSummary

import (
    "fmt"
    "strings"
)

func FormatSummary(cl *Changelog) string {
    var sb strings.Builder

    // Breaking changes are rendered first so they cannot be missed.
    if cl.HasBreakingChanges() {
        sb.WriteString("WARNING: Breaking changes detected!\n\n")
        for _, e := range cl.BreakingChanges() {
            fmt.Fprintf(&sb, "  BREAKING: %s\n", e.Description)
        }
        sb.WriteString("\n")
    }

    // Remaining categories are rendered in a fixed order; empty ones are skipped.
    for _, cat := range []struct {
        category Category
        label    string
    }{
        {CategoryFeature, "Features"},
        {CategoryFix, "Bug Fixes"},
        {CategoryPerformance, "Performance"},
        {CategoryOther, "Other"},
    } {
        entries := cl.EntriesByCategory(cat.category)
        if len(entries) == 0 {
            continue
        }
        fmt.Fprintf(&sb, "%s:\n", cat.label)
        for _, e := range entries {
            if e.Scope != "" {
                fmt.Fprintf(&sb, "  - %s: %s\n", e.Scope, e.Description)
            } else {
                fmt.Fprintf(&sb, "  - %s\n", e.Description)
            }
        }
        sb.WriteString("\n")
    }

    return sb.String()
}

Project Structure

pkg/changelog/
├── changelog.go       <- NEW: types (Changelog, Release, Entry, Category)
├── parse.go           <- NEW: Parse function, regex patterns, section mapping
├── format.go          <- NEW: FormatSummary
├── parse_test.go      <- NEW: parser tests
├── format_test.go     <- NEW: formatter tests
pkg/setup/
├── update.go          <- MODIFIED: add GetStructuredReleaseNotes method

Testing Strategy

| Test | Scenario |
| --- | --- |
| TestParse_SingleRelease | Single release with features and fixes |
| TestParse_MultipleReleases | Three releases, correct ordering and grouping |
| TestParse_BreakingChanges | ### BREAKING CHANGES section parsed as CategoryBreaking |
| TestParse_BreakingChangeFooter | BREAKING CHANGE: in entry body reclassified |
| TestParse_ScopedEntries | * **http:** description extracts scope correctly |
| TestParse_UnscopedEntries | * description has empty scope |
| TestParse_EmptyInput | Empty string returns empty Changelog, no error |
| TestParse_MalformedInput | Non-conventional-commit text collected as CategoryOther |
| TestParse_VersionHeader | Version extracted from # v1.2.3 and # 1.2.3 |
| TestChangelog_HasBreakingChanges | True when breaking entries exist, false otherwise |
| TestChangelog_BreakingChanges | Returns all breaking entries across releases |
| TestChangelog_EntriesByCategory | Filters correctly for each category |
| TestFormatSummary_WithBreaking | Breaking changes appear first with WARNING prefix |
| TestFormatSummary_NoBreaking | No WARNING section when no breaking changes |
| TestFormatSummary_AllCategories | Each category rendered with correct heading |
| TestFormatSummary_EmptyChangelog | Empty changelog produces empty string |

Test Fixtures

Use testdata/ files containing real semantic-release output to ensure the parser handles production formats.

Coverage

  • Target: 95%+ for pkg/changelog/ (pure logic, no I/O).

Linting

  • golangci-lint run --fix must pass.
  • No new nolint directives.

Documentation

  • Godoc for all exported types, functions, and constants in pkg/changelog/.
  • Update docs/components/ with a changelog.md page documenting the parser and formatter.
  • Update the update command documentation to mention structured changelog output.

Backwards Compatibility

  • No breaking changes. GetReleaseNotes is unchanged. GetStructuredReleaseNotes is a new method.
  • The FormatSummary output is purely additive -- consumers who called GetReleaseNotes directly continue to work.

Bundled Changelog in Release Archives

Motivation

The existing GetReleaseNotes / GetStructuredReleaseNotes methods make individual API calls per release to retrieve notes. This has two drawbacks:

  1. Multiple API calls: Fetching notes across many releases requires one API call per release, increasing latency and risk of rate limiting.
  2. Network dependency: The release archive is already downloaded (for the binary) during both online and offline update flows. Including a pre-built changelog in the archive provides structured changelog data at zero extra network cost.

Approach: Go Tool Directive + GoReleaser extra_files

Note: This spec originally recommended git-cliff. The Changelog Generator Tool spec superseded the external dependency with a pure-Go replacement (cmd/changelog). The CHANGELOG.md is now produced by go generate via the go tool changelog generate directive -- no CI-specific step is needed.

The go:generate directive runs the changelog tool before the build, and GoReleaser only needs to include the pre-existing file in the archive.

go:generate directive (in pkg/cmd/root/generate.go):

//go:generate go tool changelog generate --output assets/CHANGELOG.md

GoReleaser config (archive section only -- no before hook changes):

archives:
  - formats: ["tar.gz"]
    files:
      - CHANGELOG.md

The Go changelog generator (cmd/changelog) maps conventional commit types to section headers matching the format Parse expects:

| Commit Type | Section Header |
| --- | --- |
| feat | Features |
| fix | Bug Fixes |
| perf | Performance Improvements |
| refactor, docs, chore, style | Other |
| test, ci | Skipped |

This approach has several advantages:

  • Zero external dependencies: Pure Go, no git-cliff or Rust toolchain needed.
  • Integrated with go generate: Runs as part of the normal build pipeline.
  • Self-contained: Declared as a Go tool directive -- no separate CI step required.
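
The commit-type mapping listed above can be sketched as a lookup with an explicit skip set. The names sectionFor, sectionByType, and skippedTypes are hypothetical, and routing unlisted types (for example build or revert) to "Other" is an assumption of this sketch rather than documented behaviour of cmd/changelog.

```go
package main

import "fmt"

// sectionByType mirrors the commit-type table.
var sectionByType = map[string]string{
    "feat":     "Features",
    "fix":      "Bug Fixes",
    "perf":     "Performance Improvements",
    "refactor": "Other",
    "docs":     "Other",
    "chore":    "Other",
    "style":    "Other",
}

// skippedTypes are omitted from the generated changelog entirely.
var skippedTypes = map[string]bool{"test": true, "ci": true}

// sectionFor (hypothetical helper) returns the section header for a
// conventional commit type and false when the commit should be skipped.
func sectionFor(commitType string) (string, bool) {
    if skippedTypes[commitType] {
        return "", false
    }
    if header, ok := sectionByType[commitType]; ok {
        return header, true
    }
    return "Other", true // assumption: unlisted types fall back to "Other"
}

func main() {
    for _, t := range []string{"feat", "fix", "docs", "ci"} {
        header, ok := sectionFor(t)
        fmt.Printf("%s -> %q (included=%v)\n", t, header, ok)
    }
}
```

Because the headers match the keys Parse recognises ("Features", "Bug Fixes", "Performance Improvements"), a generated CHANGELOG.md round-trips through the parser without extra configuration.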

Extraction During Update

When SelfUpdater extracts a downloaded release archive, it already walks the tar entries to locate the binary. The extraction logic is extended to also look for CHANGELOG.md and, if found, pass its contents to changelog.Parse.

New Method: ParseFromArchive

// ParseFromArchive extracts and parses a CHANGELOG.md file from a release
// archive reader. Returns nil (not an error) if no changelog is found in the
// archive, allowing callers to fall back to API-based retrieval.
func ParseFromArchive(r io.Reader) (*Changelog, error)

This function scans tar entries for a file named CHANGELOG.md, reads its contents, and delegates to Parse. If no changelog entry is found, it returns (nil, nil) so the caller can fall back gracefully.
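
The tar scan can be sketched with the standard library's archive/tar package. This sketch makes several assumptions: the reader yields an uncompressed tar stream (a tar.gz archive would first be wrapped with gzip.NewReader by the caller), the file is matched by base name so a top-level directory prefix is tolerated, and findChangelog and buildTar are hypothetical helper names. The real ParseFromArchive would pass the returned text to Parse.

```go
package main

import (
    "archive/tar"
    "bytes"
    "errors"
    "fmt"
    "io"
    "path"
)

// findChangelog (hypothetical helper) returns the contents of the first
// regular file named CHANGELOG.md. found is false, with a nil error,
// when the archive has no such entry.
func findChangelog(r io.Reader) (contents string, found bool, err error) {
    tr := tar.NewReader(r)
    for {
        hdr, err := tr.Next()
        if errors.Is(err, io.EOF) {
            return "", false, nil // end of archive, nothing found
        }
        if err != nil {
            return "", false, fmt.Errorf("reading archive: %w", err)
        }
        if hdr.Typeflag != tar.TypeReg || path.Base(hdr.Name) != "CHANGELOG.md" {
            continue
        }
        data, err := io.ReadAll(tr)
        if err != nil {
            return "", false, fmt.Errorf("reading %s: %w", hdr.Name, err)
        }
        return string(data), true, nil
    }
}

// buildTar creates a small in-memory archive for demonstration.
func buildTar(files map[string]string) *bytes.Buffer {
    var buf bytes.Buffer
    tw := tar.NewWriter(&buf)
    for name, body := range files {
        hdr := &tar.Header{Name: name, Mode: 0o644, Size: int64(len(body)), Typeflag: tar.TypeReg}
        if err := tw.WriteHeader(hdr); err != nil {
            panic(err)
        }
        if _, err := tw.Write([]byte(body)); err != nil {
            panic(err)
        }
    }
    if err := tw.Close(); err != nil {
        panic(err)
    }
    return &buf
}

func main() {
    archive := buildTar(map[string]string{"tool": "binary", "CHANGELOG.md": "# v1.0.0\n"})
    notes, found, err := findChangelog(archive)
    fmt.Printf("found=%v err=%v notes=%q\n", found, err, notes)
}
```

Returning a found flag here plays the same role as the (nil, nil) result described above: absence of the file is not an error.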

Fallback Strategy

The SelfUpdater uses a two-tier resolution order:

  1. Archive-bundled: Extract CHANGELOG.md from the already-downloaded release archive. Zero extra network I/O.
  2. API-based: If the archive contains no CHANGELOG.md (e.g., older releases built before this feature), fall back to the existing per-release API calls via GetReleaseNotes.

This ensures backwards compatibility with releases that predate the bundled changelog, while new releases benefit from the single-file approach.
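
The resolution order can be sketched as a function that tries each tier in turn. Everything named here is hypothetical: resolveNotes and its two function parameters stand in for the archive lookup and the existing API path, whose real signatures are not given in this document. The sketch also assumes that a read error from the archive is reported rather than silently triggering the API fallback.

```go
package main

import (
    "errors"
    "fmt"
)

// resolveNotes (hypothetical helper) returns raw changelog text and the
// tier that supplied it: "archive" or "api".
func resolveNotes(
    fromArchive func() (notes string, found bool, err error),
    fromAPI func() (string, error),
) (notes, tier string, err error) {
    // Tier 1: changelog bundled in the downloaded archive.
    notes, found, err := fromArchive()
    if err != nil {
        return "", "", fmt.Errorf("archive changelog: %w", err)
    }
    if found {
        return notes, "archive", nil
    }
    // Tier 2: per-release API calls, for releases built before bundling.
    notes, err = fromAPI()
    if err != nil {
        return "", "", fmt.Errorf("API release notes: %w", err)
    }
    return notes, "api", nil
}

func main() {
    missing := func() (string, bool, error) { return "", false, nil }
    api := func() (string, error) { return "# v1.0.0\n", nil }
    notes, tier, err := resolveNotes(missing, api)
    fmt.Printf("tier=%s err=%v notes=%q\n", tier, err, notes)

    broken := func() (string, bool, error) { return "", false, errors.New("truncated archive") }
    _, _, err = resolveNotes(broken, api)
    fmt.Println("error:", err)
}
```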

Performance Considerations

The changelog generator runs as part of go generate, not on every build invocation. Benchmarked at ~40µs per commit, a 100,000-commit repo would complete in ~4s -- acceptable for a build-time tool.

If changelog generation becomes a bottleneck as commit count grows, the --since and --releases flags can limit scope to only the most recent releases rather than the full history.


Future Considerations

  • Interactive changelog viewer: Use pkg/docs/ or Bubble Tea to provide an interactive changelog browser with category filtering.
  • Changelog caching: Cache parsed changelogs locally to avoid re-fetching and re-parsing on subsequent update checks.
  • Migration guide extraction: For breaking changes, parse linked issues or PR bodies to extract migration instructions.
  • Incremental changelog generation: Append only the new release to an existing CHANGELOG.md to avoid regenerating the full history on every release.

Implementation Phases

Phase 1 -- Types and Parser (DONE)

  1. Create pkg/changelog/ package
  2. Define Category, Entry, Release, Changelog types
  3. Implement Parse with regex-based line parser
  4. Implement HasBreakingChanges, BreakingChanges, EntriesByCategory

Phase 2 -- Formatter (DONE)

  1. Implement FormatSummary
  2. Add test fixtures from real semantic-release output

Phase 3 -- API Integration (DONE)

  1. Add GetStructuredReleaseNotes to SelfUpdater
  2. Wire into the update command's release notes display

Phase 4 -- Tests (DONE)

  1. Unit tests for parser with all edge cases
  2. Unit tests for formatter
  3. Integration test for GetStructuredReleaseNotes
  4. Run with race detector

Phase 5 -- Bundled Changelog (DONE)

  1. Implement ParseFromArchive in pkg/changelog/
  2. Add go tool changelog generate directive to go:generate
  3. Add CHANGELOG.md to archive files in .goreleaser.yaml
  4. Extend SelfUpdater archive extraction to detect and parse CHANGELOG.md
  5. Implement fallback: archive-bundled → API-based
  6. Unit tests for ParseFromArchive (valid archive, missing file, malformed content)
  7. Update component documentation

Note: Phase 5 originally specified git-cliff. The Changelog Generator Tool spec replaced it with a pure-Go tool (cmd/changelog) using go-git and leodido/go-conventionalcommits.


Verification

go build ./...
go test -race ./pkg/changelog/...
go test -race ./pkg/setup/...
go test ./...
golangci-lint run --fix

# Verify package exists
ls pkg/changelog/

# Verify integration method
grep -n 'GetStructuredReleaseNotes' pkg/setup/update.go

# Verify archive parsing
grep -n 'ParseFromArchive' pkg/changelog/parse.go

# Verify goreleaser config includes CHANGELOG.md
grep -n 'CHANGELOG.md' .goreleaser.yaml

# Verify changelog tool directive
grep -n 'changelog' go.mod

# Verify go:generate uses the changelog tool
grep -rn 'go tool changelog' internal/cmd/root/generate.go