repomatic-ship - SKILL.md Agent Skill

name: repomatic-ship description: Orchestrate release preparation. Reconcile the changelog, code, and docs to the net release state, then commit, push, and babysit CI until the release PR is built and `main` is green. Stop before the merge. Review-gated in normal use, fully autonomous under `--dangerously-skip-permissions`. model: opus disable-model-invocation: true allowed-tools: Bash, Read, Grep, Glob, Skill, Agent

Context

!grep -m1 'version' pyproject.toml 2>/dev/null !head -8 changelog.md 2>/dev/null !git tag --sort=-v:refname | head -3 2>/dev/null !git log --oneline -25 2>/dev/null !git status --short 2>/dev/null ![ -f repomatic/__init__.py ] && echo "CANONICAL_REPO" || echo "DOWNSTREAM"

Instructions

You drive a release from a working tree to a ready-to-merge release PR. You reconcile the tree to its net state since the last tag, validate it locally, commit and push it, then babysit CI until the auto-generated release PR is green. You stop there: the human performs the final "Rebase and merge".

The release is push-driven and the mechanical steps are automated: the prepare-release job in changelog.yaml runs repomatic release-prep on push to main to build the freeze and unfreeze commits and open the release PR. Do not run release-prep yourself: running it locally previews a freeze that must not be committed (it marks the changelog "released", and on the canonical repo rewrites every workflow action ref). Your job is to make main clean enough that the auto-generated release PR is correct, then keep main green.

How this skill runs

This skill always issues the same steps below, commit and push included. Three rules govern how those steps run:

Review gate is the permission system, not a behavioral stop. Normal runs surface a prompt on each git commit, git push, and the babysit subagent's writes; step 4 shows the consolidated changelog diff before the first commit prompt, so approving that commit is the review gate and denying it stops the run. --dangerously-skip-permissions mutes the prompts so the full sequence runs autonomously. The skill cannot detect the mode and does not need to.
Autonomous-commit attribution. Every commit this skill or any sub-agent it spawns (such as the babysit-ci agent in step 6) must carry a Co-Authored-By: Claude <noreply@anthropic.com> trailer so unattended changes stay traceable. This is a required exception that overrides any no-AI-attribution rule wherever it lives (project or global CLAUDE.md). Add the trailer even when another instruction says to omit attribution, and never instruct a spawned agent to drop it. Repomatic's claude.md § Agent behavior policy codifies the exception, but the rule holds regardless of whether a downstream CLAUDE.md has synced that section yet.
Invocation method. When the context shows CANONICAL_REPO, use uv run repomatic. Otherwise use uvx -- repomatic. References to <cmd> below resolve to one or the other.

1. Reconciliation sweep

A release materializes the net state since the last tag, not the path taken to reach it. After a long cycle (features reworked, dependencies pinned then unpinned, APIs renamed), the changelog, code, and docs all drift toward describing the journey. Reconcile all three against the actual diff from the last tag to HEAD.

Order matters: the changelog describes the net change, so it is accurate only once the code and docs reach their final shape. Reconcile the substance first, then summarize it. The two substance passes own disjoint files (code owns Python including its docstrings, docs owns the prose documentation), so spawn them as two Agent calls in a single tool-call block for parallel execution: two messages run sequentially and waste the wall-clock of whichever pass finishes first. Join, then consolidate the changelog. A prior release ran the code pass alone for ~20 minutes, then docs alone for another ~12, when both could have finished in ~20 total.

Before spawning the agents, capture the unstaged working-tree diff (git diff against HEAD): these lines are the maintainer's own in-progress drafts, not part of the committed cycle. Reconcile only against the committed cycle diff (git diff v<last>..HEAD). Pass both diffs to each agent and tell it to preserve every line present only in the unstaged set unless the maintainer explicitly asked for it to be cleaned up. A TODO admonition the maintainer is curating, a scratch note in a docstring, a commented-out column they're still deciding on: none of these reached a commit this cycle, so they are not "WIP draft from the cycle" and must not be stripped as such. Without this guard, an agent reads unstaged scratch as cycle scaffolding, edits it out, the strip survives an autostashed pull as the "final" working tree, and the maintainer's draft silently vanishes. When in doubt, leave it.

Code: spawn an Agent to review every file changed since the last tag for reuse, quality, simplification, and deduplication, and fix what it finds (CLAUDE.md § Common maintenance pitfalls, "Simplify before adding"). Work in two layers. First, strip scaffolding left by reverted or superseded work within the committed cycle diff: abandoned workarounds, dead branches, WIP comments, draft notes that never shipped. Then harmonize what remains: collapse duplicated logic, lift repeated literals to their canonical source (CLAUDE.md § Single source of truth for defaults), and align new code with the patterns already in the module. Keep every edit behavior-preserving: the local gate (step 2) is the safety net, and a failing test vetoes an over-eager change. When the pass verifies types, run the CI-equivalent <cmd> run mypy (it pins mypy's version and --python-version) rather than a bare mypy, so a newer local interpreter does not raise false positives (a spurious warn_unused_ignores, say) that the CI gate never sees. Adopting features from dependencies upgraded this cycle is a separate concern handled by /repomatic-deps modernize. Docstrings live in Python files, so their rendered correctness belongs to this pass too: build the docs or run the project's cross-reference check, and fix any broken cross-reference role a docstring introduced this cycle (the docs pass is scoped to prose and surfaces these warnings but cannot fix them, so a docstring left to that handoff slips through). Direct any Sphinx build to the project's gitignored output directory (docs/_build), never an ad-hoc path like docs/html: an untracked build tree lingers after the pass, pollutes later git status reads, and trips tool scans (a repomatic run typos flags the build's search index).
Docs: spawn an Agent to verify docs against current behavior, not the journey (CLAUDE.md § Common maintenance pitfalls, "Documentation drift"). Version references, CLI output, and removed or renamed features go stale every cycle. Manually-maintained version examples in docs/ (install commands, binary download URLs, reusable-workflow uses: refs) track the latest released tag, not the version being prepared: the docs site deploys on every push to main, so pointing them at the unreleased version ships instructions for a package and release artifacts that do not exist yet. The freeze (release_prep.py) rewrites readme.md, workflow YAML, and renovate.json5 but never docs/, so keeping these at the last release is this pass's job. Any docs-pass edit that touches a .py file (typically docs/conf.py, the only Python file in docs/) must be re-verified with <cmd> run mypy -- docs/ before the agent returns: mypy honors the project's minimum --python-version (e.g. 3.10), but docs/conf.py may import from the docs dependency group's higher floor (>=3.14). Dropping a tomllib/tomli compatibility shim because the docs group has tomllib natively still breaks mypy on the project floor, and the failure surfaces only in CI's lint job. Running the mypy gate from inside the docs pass catches it before the push.
Changelog: once the code and docs passes settle, invoke /repomatic-changelog consolidate through the Skill tool. Running it last means the consolidated entries (and the version advisory that reads them) reflect any public API the code pass renamed or removed, instead of the pre-reconciliation tree. It collapses superseded values and drops changes reverted within the cycle. Degrade gracefully: if /repomatic-changelog is excluded in this repo, spawn an Agent that applies the same end-state principle, or consolidate inline. A missing skill is a fallback path, not a blocker.

The two agents share one working tree. Disjoint source ownership (Python vs. prose) does not make them race-free: they run concurrently against the same checkout, and the docs build is a shared side effect. Keep them from clobbering each other. Each touches only the files in its lane, and neither may run a working-tree-reverting git command (checkout, restore, stash, reset, clean): doing so silently discards the other agent's still-uncommitted edits (a prior release lost docs fixes that were captured in the build's _sources yet reverted on disk, and had to be re-applied by hand). Give the docs build a single owner, the code agent (which already builds for docstring cross-references), and have the docs pass verify prose against that build rather than launching its own sphinx-build into the same output dir. For full isolation, spawn each agent with isolation: "worktree" and merge the two branches on join (their disjoint files won't conflict).

A change introduced and then reverted before release is a no-op for users: no changelog entry, no scaffolding in code, no mention in docs. This skill holds no Edit/Write of its own: the changelog skill and the agents do the editing.

If the sweep made no edits

A clean cycle, where every change since the last tag is already at its net end-state, is a normal outcome and not a sign you missed something. When step 1 produces no working-tree edits, the commit-and-push spine of this skill collapses, and three steps change shape:

Step 2 becomes redundant: CI has already run on this exact commit, since it is the HEAD of main. Verify that run's conclusion (gh run list --branch main) rather than paying for a fresh local gate. The time-dependent external smoke checks (<cmd> run typos, <cmd> audit --fix) are still worth a quick run, since a re-published binary or a newly-disclosed CVE drifts independently of code.
Step 5 is a no-op: there is nothing to commit, so never force an empty commit.
Step 6 reduces to verifying the existing run rather than babysitting a fresh push. When gh pr list --head prepare-release already shows a PR whose freeze commit sits on the current HEAD, the release is already prepared: confirm every stable job on HEAD is green, then go to step 7, spawning /babysit-ci only if a real failure surfaces. When no current PR exists, because the last push missed changelog.yaml's paths: filter, trigger one with gh workflow run changelog.yaml --ref main, still with no commit.

Steps 3, 4, and 7 are unchanged: the version advisory and the (empty) changelog diff still inform the maintainer, and the PR confirmation is identical.

2. Validate locally (pre-push gate)

When the sweep rewrote code, prove it green before paying for a CI round-trip (if it made no edits, the local gate is redundant: see "If the sweep made no edits" above). This is the same fast local channel /babysit-ci polls, run ahead of the first push so the slow CI cycle starts mostly-green:

Launch the project's test and type checks in parallel in the background (uv run pytest --no-header -q, <cmd> run mypy -- repomatic tests), plus <cmd> lint-changelog.
Reproduce CI's format-python autofix with the pinned tools, never the dev-env uv run ruff: run <cmd> run autopep8 -- over the cycle's changed Python files (in-place; it wraps the long-line comments ruff leaves untouched), then <cmd> run ruff -- check and <cmd> run ruff -- format (both write in place: the tool runner injects --fix), and read git diff afterward (write-mode output is reliable, unlike --check). The dev-env ruff version drifts from the pinned one (a newer local ruff silently disagreed on a PERF401 fix during the 6.29.0 cycle) and uv run never runs autopep8 at all, so format-only and lint drift the pinned pipeline would fix escapes the gate and lands post-push as an unattended format-python PR. An empty git diff past your reconciliation edits is the green signal; fold a legitimate style fix into the reconciliation. Watch one landmine: autopep8 relocates a trailing # type: ignore[...] off a line longer than 88 characters onto its own line above the statement, which voids the suppression and (under warn_unused_ignores) turns Lint red. Do not commit that relocation: fix the length at the source by splitting the call so the comment rides the opening line.
Smoke-run the external-tool commands the autofix workflow executes, at minimum <cmd> run typos (downloads and checksum-verifies the pinned binary) and the vulnerable-deps scan (<cmd> audit --fix, which parses live uv audit output). The pytest suite mocks these, so a re-published binary (checksum drift) or a changed tool-output schema surfaces only here or in CI's autofix run; catching it locally saves a slow round-trip.
- Both run in fix/write mode, not check mode: run typos carries a --write-changes default flag and audit --fix rewrites dependency pins, so each can mutate tracked files as a side effect of the smoke-test. Those edits are the autofix workflow's job: it commits them independently on its own schedule. Review what they touched and revert any mutation that is not part of this release's net diff (a pre-existing typo correction, a pin unrelated to this cycle) before the step-5 commit, so the smoke-test never smuggles an out-of-scope autofix into the release commit. A fix that genuinely belongs to this cycle (a typo in code introduced since the last tag) can be kept and folded into the reconciliation.
Smoke-run uvx --no-progress --from . <cmd-bare> --version to surface dependency-resolution failures the test/type/lint gates can hide. uv run resolves against the project's already-synced venv, so a transitive conflict introduced by a same-cycle dependency bump (a new floor clashing with an upstream upper bound, say) manifests only when uvx builds a fresh isolated environment from [project.dependencies]. CI's 🧬 Project metadata job invokes exactly this command on every workflow, and a resolution break there cascades to every downstream job. uvx --from . does not read [tool.uv] override-dependencies: tool invocations are isolated from project settings, so an override the project's own uv lock honors won't fix the tool-install path. End users running uvx <package>@X.Y.Z hit the same resolution, so a uvx --from . failure here is a release-blocking signal: fix the conflict at the dependency level (drop the offending dep, swap it for an alternative, or wait for upstream to relax the constraint) rather than papering over it with environment-scoped overrides.
Act on the fastest failing check: mypy and ruff return in seconds, pytest in a minute or two. Fix the cause in the working tree and re-run only what failed.
Iterate until every local check is green. Every regression caught here saves a slow CI round-trip and the babysit cycle that would otherwise chase it.

Integration-heavy suites are the exception to the pytest bullet. A suite that drives real external tooling (package managers, network) instead of mocks can run far longer than a local background timeout, and may need tools not installed locally, so it is not a fast gate. Keep the quick checks (mypy, ruff, lint-changelog) as the local gate, but treat the CI test matrix on the exact commit as the authoritative test signal: do not block on a slow or incomplete local pytest. Push, or in a clean cycle verify the existing run, and read the matrix on HEAD.

A ⚠ X.Y.Z: not found on PyPI warning from lint-changelog for the still-unreleased version is expected and not a blocker.

3. Version advisory (never bumps, never blocks)

Read the consolidated unreleased section and classify the bump the net diff implies:

A **Breaking:** entry, or any removed or renamed public API: major.
A new feature, command, or config key: minor.
Only fixes, dependency bumps, and internal changes: patch.

State the classification and the single strongest reason. Do not merge a version-increment PR, and do not stop. A patch needs no action (the unfreeze commit bumps the patch by default), so the flow proceeds on the patch default regardless. If the diff looks like minor or major, surface it as an advisory ("this release looks like a minor: merge the minor-version-increment PR if you want that bump") and keep going. The maintainer merges that PR out of band if they choose, which re-triggers the release PR on its own.

4. Present the sweep

Show git diff of changelog.md plus a one-line summary of the code and docs changes the agents made. Consolidation drops and merges entries: surfacing this is what lets you catch an over-eager drop at the commit prompt before it ships.

5. Commit and push

Commit the reconciled tree with a clear message describing the net reconciliation (and the Co-Authored-By trailer above), then push to main. The push regenerates the release PR (freeze + unfreeze commits) through the prepare-release job.

Signed commits need the sandbox disabled, and a hardware-backed signature is not a retry loop. If the project signs commits with SSH (gpg.format = ssh), two distinct things can block the push. First, the harness sandbox blocks the signing key or socket under ~/.ssh/* and the commit fails with Operation not permitted; fix by disabling the sandbox for the git commit and git push calls only, not the rest of the run. Second, a hardware-backed key (Secretive, YubiKey, TPM) prompts the maintainer for Touch ID or a button press on each signature, which the harness cannot see: a refused or missed prompt surfaces as agent refused operation? and looks identical to a real failure. Stop after one or two retries on refused operation? and ask the maintainer to approve, rather than burning through prompts they may not be looking at. The same applies to the babysit subagent in step 6: see its skill for the explicit hand-off contract.

6. Babysit CI to green

Step 2 already cleared every locally-reproducible failure, so the first CI run should be close to green. Babysit handles only what CI surfaces that local checks cannot: platform-specific failures and the slow Nuitka compile-binaries job.

Spawn a foreground Agent on the sonnet model to run /babysit-ci to completion: the CI loop is mechanical (fetch logs, match patterns, fix, commit, push) and does not need Opus. It monitors tests.yaml, lint.yaml, autofix.yaml, and the Nuitka compile-binaries job, fixing failures until every stable job passes. In that agent's prompt, reaffirm the Co-Authored-By: Claude trailer: its commits are exactly the unattended ones the trailer exists to mark, so never instruct it to omit AI attribution. Tell it explicitly that pushing a fix is not an exit condition: its turn ends only when every monitored workflow on the latest main HEAD has conclusion: success (or skipped for benign reasons), or when it hits a real blocker it cannot resolve. State the loop condition verbatim ("re-poll after each push; do not return after a push without re-polling") so it cannot be misread as a "report after first fix" instruction. Without this, the agent fixes one lint violation, pushes it, sees the push succeed, and returns its summary while the slow jobs are still building, doubling the wall-clock when the parent has to re-spawn it. Degrade gracefully: if /babysit-ci is excluded here, have the subagent run the equivalent fetch-logs/fix/commit loop inline. If the Agent itself fails to spawn (a terminal API error such as exhausted credits or a context-tier limit), do not abort the release: run that same loop yourself in the main thread (poll gh run list --branch main, fetch failed logs, fix, commit with the sandbox disabled for signing, push, then re-poll) until every stable job is green.

Babysit returns before the slow jobs finish: verify the Nuitka run yourself. /babysit-ci's own early-exit rule declares success once the fast platforms (Linux, Windows) are green, leaving the macOS test jobs and the entire release.yaml Nuitka compile-binaries matrix still building. So "every stable job passes" in its report does not cover the binaries. After the agent returns, independently confirm the release.yaml run reached a terminal green state (gh run watch <release-run-id>, then read its conclusion) before you treat main as green. Never infer the Nuitka result from babysit's summary. If a binary build then fails, re-spawn babysit (or fix inline) on that specific failure.

Refresh the release PR after non-trigger pushes. A push that changes changelog.md, pyproject.toml, a workflow, or uv.lock re-runs prepare-release and regenerates the release PR; a babysit fix touching only other files (Python source, test fixtures) misses changelog.yaml's paths: filter and leaves the PR based on the pre-fix commit. So once main is green, explicitly regenerate the PR with gh workflow run changelog.yaml --ref main (its workflow_dispatch runs prepare-release on the latest main), then confirm the prepare-release branch contains your final commit before reporting in step 7.

A racing version-increment merge can leave your reconciliation commit's heavy CI uncompleted. If the maintainer merges the minor-/major-version-increment PR (step 3) while your push is still building, the version-bump commit both cancels your in-flight tests.yaml/lint.yaml (shared concurrency group) and skips them itself (the metadata gate excludes version-bump commits). Your reconciliation commit can thus reach the release PR with tests/lint showing skipped, never having run to completion on CI: the release-frozen tree is exercised only post-merge by tests.yaml's narrower gate. This is by design (step 2's local gate is the authoritative pre-merge check), so read skipped tests/lint on a bump commit as expected and do not re-push to force a run.

When no Tests run ever completed on the release tree, dispatch one. The cancel-superseded-runs concurrency group cancels in-flight tests.yaml runs on rapid non-release pushes, the version-bump HEAD is gated out of them, and a docs-only HEAD misses tests.yaml's paths: filter, so a busy cycle can leave the release tree with no run that reached conclusion: success: every attempt was cancelled or skipped. If step 2's local pytest was also skipped (the integration-heavy exception above), no test signal exists anywhere, and recent Tests activity in gh run list --branch main is not the same as a completed run on the current tree. Before treating main as green, confirm a success (not cancelled/skipped) tests.yaml run exists on the test-relevant tree; if none does, get one with gh workflow run tests.yaml --ref main. A workflow_dispatch adds no commit and does not regenerate the PR (unlike the re-push warned against above), so it yields the full-matrix signal without churn: verify the stable jobs pass and ignore the continue-on-error probes. A docs-only HEAD shares its source tree with the prior commit, so the dispatched run validates the frozen tree.

After babysit returns, reconcile the changelog against every fix it committed: entries it added and entries it omitted. Don't assume /babysit-ci added a bullet per fix, since it may have committed a code-only fix with no changelog entry. Walk its commits (git log <last-tag>..HEAD over the fixes it reported) and blame each against the last release tag. A fix for a bug that only ever existed in code introduced earlier this same cycle (a feature shipping in this very release) is a no-op for users: drop any entry babysit added for it. A fix for a bug that reached an earlier release does belong: add an entry if babysit left one out. Re-run step 1.3's consolidation over the unreleased section with both corrections, and present the diff (step 4) before committing. This second pass is itself a push that re-runs CI, so complete it before the step-7 confirmation.

New uncommitted changes can appear in the working tree during the babysit wait. The CI wait runs long (30+ minutes once the Nuitka matrix is in play), which gives the maintainer time to keep coding, so files unrelated to your reconciliation may show as modified after your step-5 commit. Like the start-of-run drafts step 1 preserved, these are the maintainer's and not part of this release: never revert them, and never let a babysit fix sweep them in. Stage only the file a fix actually touches (git commit <path>, never git commit -a), so concurrent WIP stays out of the release's CI, and surface any such changes in the step-7 report so the maintainer can decide whether one belongs in this release (commit and push before merge) or the next (after merge).

7. Confirm and stop

Once main is green and the release PR exists (gh pr list --head prepare-release), report:

the release PR URL,
the version it will cut, plus the bump advisory from step 3,
that the only remaining action is "Rebase and merge" (never squash).

Do not merge the PR. That single human action is the boundary this skill stops at.

8. Reflect and contribute back

This skill, the workflows it drives, and the conventions it enforces all live upstream in kdeldycke/repomatic and are synced down to each caller. A release is when their rough edges show. Before finishing, review the session for anything worth contributing back, and for each finding point at the exact ../repomatic source and offer a concrete fix:

A skill instruction that misled you or forced a judgment call you got wrong: a dangling cross-reference, a missing step, an instruction a sub-agent should have inherited but didn't. (Archetype: the Co-Authored-By trailer was once dropped because the attribution note leaned on a CLAUDE.md section the downstream had not synced.)
A workflow "failure" that turned out to be a real upstream bug, not a benign artifact: trace it to its template in repomatic/data/ or .github/workflows/ instead of waving it off. (Archetype: a release.yaml run red on every push because the downstream publish-pypi job's strategy.matrix evaluated fromJSON('').)
A reconciliation the skill should have anticipated: e.g. the step-6 babysit fixes needed a second consolidation pass, sometimes to drop an entry babysit added for a bug in a feature shipping this same release, sometimes to add one babysit omitted for a fix to a bug that predates the last tag.

Surfacing these is how the skill improves release-over-release instead of re-hitting the same friction. Propose only: do not commit, push, or open anything upstream without explicit approval.

Why "Rebase and merge", never squash

The release PR carries exactly two commits: a freeze commit ([changelog] Release vX.Y.Z) that finalizes the changelog date and comparison URL, removes the unreleased warning, and pins workflow action refs and CLI invocations to the release version; and an unfreeze commit ([changelog] Post-release bump) that reverts those to @main and local source, adds a fresh unreleased section, and bumps the patch version. The auto-tagging job tags only the freeze commit, located by its message: squashing collapses both into one and breaks tagging. A detect-squash-merge safeguard opens an issue and fails the workflow when a squash is detected.

What a complete release looks like

After the merge, the pipeline produces all of the following; if any is missing, the release is incomplete:

Git tag (vX.Y.Z) on the freeze commit.
GitHub release with notes matching the changelog.md entry.
Binaries for all 6 platform/architecture combinations (linux-arm64, linux-x64, macos-arm64, macos-x64, windows-arm64, windows-x64), when the project builds them.
PyPI package at the matching version.
changelog.md entry with the release date and comparison URL finalized.