metrics

name: metrics
description: >-
  Workflow metrics aggregation. Reads per-epic artifacts, writes per-epic
  metrics.json. Mirrors base:backlog/base:retros/base:learnings layout.
user-invocable: true
argument-hint: rollup | query | validate
allowed-tools: Read, Bash

Format authority

The canonical format of per-epic metrics.json is defined by the JSON Schema at plugins/base/schemas/metrics.schema.json. All readers and writers cite that schema rather than restating the rules. Every write in scripts/ validates against the schema before committing a mutation — the file cannot land in a malformed state.

The semantic policy that does not fit in a JSON Schema — directory layout, naming grammar, field-order canonical reference, consumer-slug resolution, write contract — lives in references/format.md.

The full metric catalogue (25 metrics, Tiers 1–5, with formulae and source artifacts) lives in references/metric-catalogue.md.

Dispatch table

Read $ARGUMENTS. The first whitespace-separated token is the operation; the rest are op-specific arguments passed through to the script. Run the matching script with bash, capture output, surface to the user.

rollup [args...]      → scripts/rollup.sh [args...]
query [args...]       → scripts/query.sh [args...]
validate [args...]    → scripts/validate.sh [args...]
anything else (or empty) → list ops with one-line summaries, exit

Note: Dashboard generation (scripts/dashboard.sh) is CLI/Make only. Use make metrics-dashboard (or make metrics-dashboard ARGS="--out <path>") to generate the static HTML site. The dashboard operation is intentionally absent from this dispatch table — it produces a multi-file site output that is not suitable for inline skill dispatch.

The skill is a thin dispatcher: it does not encode validation, format rules, or routing in prose. Each script documents itself via --help and, because it runs under set -e, fails at the first error. To find out what arguments an op takes, run its script with --help.
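The dispatch rule above can be sketched in a few lines of shell. This is only an illustration of the token-splitting and routing logic, not the skill's actual implementation: the function name resolve_op and the list-ops sentinel are hypothetical, and the sketch prints the script it would run instead of executing it.

```shell
# resolve_op ARGUMENTS
# Prints the script that would handle the operation, or "list-ops" when the
# first token is missing or unknown. Both names are assumptions made for this
# sketch; they do not come from scripts/.
resolve_op() {
  set -f        # disable globbing while the argument string is word-split
  set -- $1     # split on whitespace; $1 is now the first token (the op)
  set +f
  case "${1:-}" in
    rollup|query|validate) printf 'scripts/%s.sh\n' "$1" ;;
    *) printf 'list-ops\n' ;;
  esac
}

resolve_op "rollup --epic checkout"   # prints: scripts/rollup.sh
resolve_op "dashboard"                # prints: list-ops
resolve_op ""                         # prints: list-ops
```

Note that dashboard falls through to the catch-all branch, which matches its deliberate absence from the dispatch table.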

Operation summaries

`rollup`

Per-epic aggregator. Signature: rollup --epic <slug> [--out <path>].

Reads all artifacts under specs/epic-<slug>/, the per-epic rows of verification-results.jsonl, the retro file (if present), BACKLOG.json, and learnings.json. Computes all 25 metrics across Tiers 1–5 (Tier 5 is null when .timings.jsonl is absent). Writes ${CLAUDE_PLUGIN_DATA}/metrics/<consumer-slug>/<epic-slug>.json atomically. Idempotent — re-running overwrites the per-epic file.
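One common way to get an atomic, idempotent write that is checked before it lands is to write to a temporary file in the destination directory, run a check on it, and only then rename it over the target. The sketch below assumes that pattern; write_atomic is a hypothetical helper, and the check command passed to it stands in for whatever schema validation rollup.sh really performs.

```shell
# write_atomic DEST [CHECK_CMD...]
# Reads new content from stdin, optionally runs CHECK_CMD on the staged file,
# and replaces DEST with a single rename. Hypothetical helper for illustration.
write_atomic() {
  local dest=$1 dir tmp
  shift
  dir=$(dirname "$dest")
  mkdir -p "$dir" || return 1
  # Stage next to DEST so the final mv stays on one filesystem, where the
  # rename replaces the target in one step.
  tmp=$(mktemp "$dir/.metrics.XXXXXX") || return 1
  if cat > "$tmp" && { [ "$#" -eq 0 ] || "$@" "$tmp"; }; then
    mv -f "$tmp" "$dest"
  else
    rm -f "$tmp"      # a failed check leaves any existing DEST untouched
    return 1
  fi
}
```

Re-running the helper with the same destination simply replaces the file, which is the overwrite behaviour described above. A rejected payload never reaches the destination path.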

Called by /base:feature Step 6 (substep 3.4) and /base:bug Step 4 after the curator step, before the auto-commit substep. The call is unconditional and degrades cleanly when source artifacts are absent: all tier fields are null, and provenance records which sources were missing.

`dashboard` (CLI/Make only — not in skill dispatch)

Multi-consumer static-site generator. Invoked as: make metrics-dashboard [ARGS="--out <path>"] or directly: bash scripts/dashboard.sh [--out <path>].

Walks every consumer directory under ${CLAUDE_PLUGIN_DATA}/metrics/ (excluding dashboard/ and hidden dirs). Schema-validates each per-epic *.json file. Generates a static HTML site under <out>/ (default: ${CLAUDE_PLUGIN_DATA}/metrics/dashboard/). Uses atomic staging-directory swap so the live site is never half-written. On success prints the file:// URL of index.html and a one-line summary to stdout. Validation failures appear as warnings in the generated HTML without crashing the run.
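A staging-directory swap can be sketched as follows. This is an assumption about the general technique, not a copy of dashboard.sh: publish_dir and the builder callback are hypothetical names, and the swap uses two renames, so there is a very short window between them in which the output directory does not exist.

```shell
# publish_dir OUT BUILD_CMD
# Builds the site into a staging directory beside OUT, then swaps it in.
# BUILD_CMD receives the staging path as its only argument. Hypothetical names.
publish_dir() {
  local out=${1%/} build_cmd=$2 parent stage old
  parent=$(dirname "$out")
  mkdir -p "$parent" || return 1
  stage=$(mktemp -d "$parent/.staging.XXXXXX") || return 1
  if ! "$build_cmd" "$stage"; then
    rm -rf "$stage"               # failed build: the live site is untouched
    return 1
  fi
  old="$parent/.previous.$$"
  rm -rf "$old"
  if [ -e "$out" ]; then
    mv "$out" "$old" || { rm -rf "$stage"; return 1; }
  fi
  mv "$stage" "$out" || return 1  # the finished site appears in one rename
  rm -rf "$old"
}
```

Because the builder only ever writes into the staging path, readers of the live directory see either the previous complete site or the new complete site.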

`query`

Ad-hoc jq expression over all per-epic metrics.json files. Signature: query '<jq-expr>'.

Each record is augmented with _epic (the slug) so users can write '.[] | select(._epic == "foo")' without boilerplate. Thin passthrough; lets users build ad-hoc views without learning the dashboard's command surface.
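The _epic augmentation might look roughly like the sketch below. It assumes the slug is taken from the file name (<epic-slug>.json) and that the records are collected into a single array before the user's expression runs; collect_metrics is a hypothetical name, and query.sh may be organised differently.

```shell
# collect_metrics DIR
# Emits one JSON array holding every DIR/*.json record, each tagged with an
# _epic field derived from its file name. Hypothetical helper; requires jq.
collect_metrics() {
  local dir=$1 f
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue       # no matches: the glob pattern stays literal
    jq --arg epic "$(basename "$f" .json)" '. + {_epic: $epic}' "$f"
  done | jq -s '.'                # slurp the tagged records into one array
}
```

With a helper like this, a query reduces to piping the array into the user's expression, for example collect_metrics "$metrics_dir" | jq '.[] | select(._epic == "foo")', where metrics_dir is an assumed variable pointing at one consumer's directory.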

`validate`

Schema-validates a single per-epic metrics.json file, or all of them. Signature: validate [--file <path>] [--all].

Validates against plugins/base/schemas/metrics.schema.json. Exits non-zero on any failure. Useful for CI and for catching corruption after a manual edit.
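The "non-zero on any failure" behaviour is easy to get wrong when validating many files, because a loop's exit status reflects only its last iteration. A minimal sketch of the accumulate-then-exit pattern follows; validate_all and is_nonempty are hypothetical names, and is_nonempty is only a stand-in for real schema validation, whose tooling the source does not specify.

```shell
# is_nonempty FILE
# Stand-in check for this sketch only; a real run would validate FILE against
# plugins/base/schemas/metrics.schema.json instead.
is_nonempty() { [ -s "$1" ]; }

# validate_all CHECK_CMD FILE...
# Runs CHECK_CMD on every file, reports each failure on stderr, and returns
# non-zero if at least one file failed. Hypothetical helper.
validate_all() {
  local check=$1 failed=0 f
  shift
  for f in "$@"; do
    if ! "$check" "$f"; then
      printf 'FAIL %s\n' "$f" >&2
      failed=1                    # remember the failure, keep checking the rest
    fi
  done
  return "$failed"
}
```

Checking every file before returning means a CI run reports all corrupted files at once rather than stopping at the first.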

Non-goals

No LLM calls from scripts. All scripts in scripts/ refuse to invoke claude, Skill, or any LLM runtime. The SKILL.md is advisory; all computation is shell + jq.
No cross-tenant aggregation. Metrics never leave the consumer's ${CLAUDE_PLUGIN_DATA}/metrics/<consumer-slug>/ directory.
No real-time / streaming metrics. Roll-up is end-of-epic batch only.
No hook installation. The plugin documents how to opt in to Tier 5 runtime signals (see references/hooks.md) but installs no hook automatically.
No forecast / predictive analytics. Trend display only, no modelling.
No LLM-dispatched dashboard. Dashboard generation writes a multi-file static HTML site and is invoked via make metrics-dashboard only, not via the skill dispatch table.

Relationship to other components

/base:feature Step 6 (substep 3.4) — invokes Skill("base:metrics", args: "rollup --epic <slug>") after the curator step, before the auto-commit substep.
/base:bug Step 4 — invokes Skill("base:metrics", args: "rollup --bug <slug>") at the equivalent point.
base:backlog — read-only consumer: rollup.sh reads BACKLOG.json to compute Tier 3 metrics (#15, #16). Never writes to BACKLOG.json.
base:learnings — read-only consumer: rollup.sh reads learnings.json to compute Tier 4 metrics (#18, #19). Never writes to learnings.json.
base:retros — read-only consumer: rollup.sh reads the retro markdown file for this epic. Never writes to retros/.
metrics.schema.json — structural definition consulted by validate.sh and rollup.sh at write time.
references/format.md — canonical field-order and naming-grammar reference cited by all scripts.
references/metric-catalogue.md — per-metric definitions (25 metrics) cited by each script's docblock.
references/hooks.md — opt-in documentation for Tier 5 runtime signals.