name: coverage
description: "This skill should be used when the user says /vibe-test:coverage. Standalone honest-denominator coverage measurement with builder-facing adaptation-prompt UX. Detects the test framework, proposes the diff to add --coverage.all (vitest) or --collectCoverageFrom glob (jest); only applies on builder opt-in. Falls back to c8 --all when adaptation is refused. Defers raw coverage parsing to tessl:analyzing-test-coverage when present. Emits a CI-friendly JSON sidecar at .vibe-test/state/coverage.json for machine consumers; exits 0 regardless of threshold (gate decides pass/fail)."
argument-hint: "[--path ]"
coverage — Honest-Denominator Measurement
Read ../guide/SKILL.md for your overall behavior (persona, experience-level adaptation, Pattern #13 composition rules, version resolution). Then follow this command end to end.
Coverage is the standalone measurement surface. A builder who runs /vibe-test:coverage is asking "what does this repo actually cover, honestly?" The command's flagship claim — honesty — is operationalized two ways: (1) the denominator matches the actual source tree, not just files the tests happened to import; (2) the builder, not the SKILL, chooses whether to mutate their test command. The SKILL proposes the diff, explains the cost, and applies only on opt-in.
What This Command Does, In One Sentence
Detect framework → propose adaptation diff → apply on opt-in or fall back to c8 --all → run coverage → compute weighted score per level → emit three views (markdown / banner / JSON sidecar) → exit 0 regardless of threshold (gate decides pass/fail).
Prerequisites
Blocking prereq (Pattern #16 — must confirm)
- Test framework detectable — the current directory must contain a
package.jsonwith a test runner independencies/devDependencies(vitest, jest, mocha) OR a framework config file (vitest.config.*,jest.config.*). If none:"I don't see vitest, jest, or mocha in this repo. Coverage needs a test runner to measure against. Want to install one, or point me at a subdirectory that has one?"
Wait for the builder's decision. Never measure "coverage" without a runner.
Shaping prereqs (Pattern #16 — adapts silently)
- Scope — if
--path <glob>is passed, narrow the source-file denominator to files matching the glob. Scoped coverage writes to.vibe-test/state/coverage-<scope-hash>.jsonto avoid clobbering full-repo state. - Prior audit — if
audit.json(oraudit-<hash>.json) exists, use itsinventory.scanned_filesas the denominator's source-of-truth. Without prior audit, scan live viasrc/scanner/scan()— takes 1-3s on most repos. - Builder has opted out of adaptation before — check
.vibe-test/state.jsonfor a cached decline; if present, skip the proposal and go straight toc8 --allfallback.
Before You Start
- Guide SKILL —
../guide/SKILL.md. Persona opening/handoff lines, experience-level verbosity, Pattern #13 anchored + dynamic, session memory interfaces. - Data contract —
../guide/references/data-contracts.mdsection "coverage-state". You own writes to<repo>/.vibe-test/state/coverage.json(and scoped variant) + its history copy. You readaudit.jsonfor source-file list when present. - Plays well with —
../guide/references/plays-well-with.md. Entries withapplies_to: coveragearetessl:analyzing-test-coverage(raw parsing deferral). - Friction triggers —
../guide/references/friction-triggers.md/vibe-test:coveragesection. - Tier-adaptive language —
src/reporter/tier-adaptive-language.ts. - Session logger —
../session-logger/SKILL.md.start('coverage', project)at entry,end({sessionUUID, command: 'coverage', outcome})at exit.
Primitive surface
Import surface (all existing src/ modules — no new TypeScript introduced):
src/coverage/runCoverage— orchestrates adapter proposal → adapter apply or c8 fallback → denominator-honesty check.src/coverage/computeWeightedScore/TIER_THRESHOLDS/LEVEL_WEIGHTS— the locked formula.src/coverage/checkDenominator— cherry-picked denominator detection.src/coverage/proposeVitestCoverageAll/proposeJestCollectCoverageFrom— framework-specific proposals (exposed for direct use when coverage needs to show the diff separately from running).src/scanner/scan— for live-scanning when no cached inventory is present.src/scanner/framework-detector— detect the active test framework.src/state/project-state— audit-state reads, sidecar path resolution.src/state/session-log/beacons— instrumentation.src/reporter—createReportObject, three renderers,getLanguageKnobs.
Flow
Step 0 — Session-logger sentinel
Before any user-facing output:
- Read
shared.preferences.persona,shared.preferences.pacing, andplugins.vibe-test.testing_experience(fallbackshared.technical_experience.level) from~/.claude/profiles/builder.json. - Invoke
session-logger.start('coverage', project_basename). Hold the returnedsessionUUIDin memory until Step 8. - Compose the persona-adapted opening line per guide > "Persona Adaptation".
Step 1 — Blocking prereq check
Run the blocking prereq described above. If the check fails, render a gentle block and halt. Do NOT proceed; do invoke session-logger.end({outcome: 'aborted'}).
Step 2 — Pattern #13 announcement (anchored complements)
Parse ../guide/references/plays-well-with.md via loadAnchoredRegistry(). Filter to entries whose applies_to includes coverage.
tessl:analyzing-test-coverage— announce verbatim when present: "Tessl's coverage skill owns raw coverage parsing — I'll defer the numbers to it and overlay tier-appropriate interpretation on top. You'll see the same per-level breakdown either way." In practice, when Tessl is present: Vibe Test still runs the adapter / c8 fallback (we need the denominator-honesty check), but parsing the JSON reporter output is deferred — Tessl's output becomes theper_levelsource rather than ourparseC8Json()fallback.
Surface at most ONE anchored complement announcement per invocation. Dynamic discovery is capped at one additional suggestion per the guide's heuristic table.
Step 3 — Detect framework + current coverage command
- Load the audit-state's
inventoryif present; otherwise invokescan(repoRoot, scopeGlob)to build a fresh one (takes 1-3s). - From
inventory.detection.test, pick the most-specific framework (vitest>jest>mocha>c8-standalone). - Find the existing coverage command. Priority:
package.json→scripts["test:coverage"]package.json→scripts["coverage"]package.json→scripts["test"]with--coveragealready present- Fallback: reconstruct default (
vitest run --coverage/jest --coverage)
- Build
actualSourceFilesfrominventory.scanned_filesfiltered to testable sources (excludenode_modules,dist, test files themselves).
Step 4 — Propose adaptation (never silent modification)
Call runCoverage({framework, cwd, adapterAccepted: null, actualSourceFiles, c8TestCommand}). With adapterAccepted: null, the helper returns the adapter_proposal (diff string + target file) but does not mutate anything and does not run c8 either.
Adaptation-prompt UX (C1 — the flagship):
Render the diff to the builder with tier-appropriate framing:
- first-time / beginner:
"Your current coverage command measures only files that tests import — that's a cherry-picked denominator. Think of it like grading a test where you only answer questions you like: the score looks great, but it's measuring a different thing than you think. Here's a one-line change that adds
--coverage.allso every source file ends up in the denominator:<diff block>Apply? [y/N]. If you skip it, I'll fall back toc8 --allfor this run — same honest denominator, just measured out-of-band." - intermediate:
"Coverage command is cherry-picked — denominator excludes unimported source files. Diff adds
--coverage.allfor vitest (orcollectCoverageFromglob for jest).<diff block>Apply? [y/N]." - experienced:
"Adapter diff for honest denominator:
<diff block>. [y/N]. Fallback: c8 --all."
Apply path:
- Builder says
y→ re-invokerunCoverage({..., adapterAccepted: true}). The helper applies the proposal (mutating the target file atomically) and runs the adapted command viac8TestCommandor falls through toc8 --all. - Builder says
n→ re-invokerunCoverage({..., adapterAccepted: false}). The helper falls through toc8 --allshellingnpx c8 --all --reporter json --reporter text <cmd>. Announce the extra run in the banner: "Declined adaptation — falling back to c8 --all for this run. The adapter diff is still available if you want to apply it later." - Log
friction_type: "coverage_adapter_refused"at confidencemediumwhen builder declines. Cache the decline in.vibe-test/state.json.coverage_adapter_declined = trueso future runs skip the prompt silently (builder can clear by removing the key).
If coverage fails outright (child process crashes, command missing): attach a harness-break finding with severity: critical to the ReportObject and continue with a zeroed score — better to report honestly than fake numbers. Point the builder at /vibe-test:fix for the repair.
Step 5 — Denominator honesty check (C1 — enforcement)
coverage.denominator from runCoverage() already carries is_cherry_picked. When true (ratio < 0.75 threshold):
- Emit finding:
category: 'cherry-picked-denominator',severity: 'high', rationale naming reported-vs-actual count + list ofmissing_files(truncated to 10 for the banner). example_patternshows the minimal-diff fix tailored to framework.
If is_cherry_picked === false but the builder declined adaptation earlier: note this in the banner as "Denominator honest via c8 --all fallback — no config change applied.". The adapter diff is stashed in the JSON sidecar for future reference.
Step 6 — Compute weighted score + per-level breakdown
- Parse the coverage output's per-level map:
- If
tessl:analyzing-test-coverageis present AND the builder authorized deferral → call into Tessl and use its per-level result. - Otherwise → fall back to v0.2 heuristic: assign all measured coverage to
smoke+behavioralin equal split (finer attribution deferred to v0.3).
- If
- Load
applicabilityper test level from the classification matrix for the audit'sapp_type— if no audit exists, default to{smoke: true, behavioral: true, edge: true, integration: false, performance: false}. - Call
computeWeightedScore({perLevel, applicability, tier}). Useaudit.classification.tierwhen available; otherwise'public-facing'as a conservative default. - Attach the full
WeightedScoreResult(score + threshold + pass flag + contributions) to the ReportObject'sscorefield.
The score.per_level rendered in the banner shows each level's raw coverage %, not the weighted contribution — the weighted value lives in the overall score.
Step 7 — Assemble ReportObject + render three views
Build the ReportObject via createReportObject({command: 'coverage', plugin_version, repo_root, scope, commit_hash}). Populate:
classification— carry forward from audit-state if present; otherwise synthesize a minimal{app_type: 'spa' | ..., tier: 'public-facing', modifiers: [], confidence: 0.5}placeholder. The banner renders "classification pulled from audit" vs "default" in the dim section.score— from Step 6. Includes current + target + per-level.findings— cherry-picked-denominator finding if detected; harness-break finding if coverage failed to run.actions_taken— record what actually happened:{kind: 'write', description: 'applied vitest coverage-all adapter', target: 'vitest.config.ts'}OR{kind: 'other', description: 'fell back to c8 --all', target: null}.deferrals— the Pattern #13 matches from Step 2 (verbatim deferral_contract prose).handoff_artifacts—['docs/vibe-test/coverage-<ISO>.md', '.vibe-test/state/coverage.json'].next_step_hint— persona-adapted. Default: "Run/vibe-test:gatewhen ready to check the tier threshold, or/vibe-test:generateto close gaps."
Render three views in parallel:
renderMarkdown(report, {proseSlots})→docs/vibe-test/coverage-<ISO-date>.md.renderBanner(report, {columns, disableColors: !isTty})→ printed to chat.renderJson({report, repoRoot})→.vibe-test/state/coverage.json(scoped variant when--pathset). Schema-validated againstcoverage-state.schema.json. This is the C3 machine-readable artifact — CI consumers read this file directly.
Critical CI semantic (C3): the JSON sidecar carries passes_tier_threshold: boolean but the coverage command itself exits 0 regardless of pass/fail. The gate command (/vibe-test:gate) owns the exit-code contract. Coverage is a measurement, not a decision.
Step 8 — State writes
session-logger.end({sessionUUID, command: 'coverage', outcome: 'completed', key_decisions: [<adapter accepted/declined>, <denominator honest flag>], complements_invoked, artifact_generated})— terminal entry paired to Step 0.beacons.append(repoRoot, {command: 'coverage', sessionUUID, outcome: 'completed', hint: '<score>/<threshold> - <pass|below>'})— Pattern #12.- Update
.vibe-test/state.json— persist the adapter decline flag if builder said no; otherwise leavecoverage_adapter_declinedunset. Coverage does not touch classification / inventory / gap lists (audit owns those).
On any state-write failure: log a runtime_hook_failure friction entry and continue.
Step 9 — Handoff line
Persona-adapted handoff line per guide > "Handoff Language Rules":
| Persona | Handoff line |
|---|---|
professor |
"When you're ready, run /vibe-test:gate for the tier-threshold check, or /vibe-test:generate if you want to close specific gaps first." |
cohort |
"/vibe-test:gate next for CI check, or /vibe-test:generate to close gaps." |
superdev |
"Run /vibe-test:gate." |
architect |
"/vibe-test:gate for the tier-threshold verdict. JSON sidecar at .vibe-test/state/coverage.json is CI-ready." |
coach |
"When you're ready, /vibe-test:gate will check whether this clears your tier threshold." |
null (default) |
"Run /vibe-test:gate when ready." |
Do NOT prescribe /clear between commands. Claude Code auto-compacts.
Tier-Adaptive Language
| Knob | first-time / beginner |
intermediate |
experienced |
|---|---|---|---|
| Adapter-prompt framing | Analogy + diff + "fallback is c8 --all" | 2-line explanation + diff | Diff + [y/N] + fallback reference |
| Per-level table | Full table with plain-English level descriptions | Compact table | Path + percentage only |
| Cherry-picked-denominator rationale | Plain-English + list of missing files | One-line summary + missing count | Ratio + missing count |
JSON output is level-invariant.
Friction Logging
| Trigger | friction_type | confidence |
|---|---|---|
Builder declines the adapter proposal (e.g., --coverage.all addition) |
coverage_adapter_refused |
medium |
Builder disputes the tier threshold the weighted score is measured against (capture claimed + measured tier in symptom) |
tier_threshold_dispute |
medium |
Builder declines a Pattern #13 complement offer (set complement_involved) |
complement_rejected |
medium |
c8 --all fallback fails or produces a clearly wrong denominator |
harness_break |
high |
When in doubt, don't log. False positives poison /evolve.
What the Coverage SKILL is NOT
- Not an auditor. Coverage does not classify app type or rank gaps — it measures.
- Not a gate. Coverage emits
passes_tier_thresholdin the JSON sidecar but exits 0 regardless;/vibe-test:gateowns the exit-code contract. - Not a fixer. When coverage fails to run (harness break), coverage surfaces the finding and points at
/vibe-test:fix. - Not a raw-coverage parser when Tessl is present. Pattern #13 — defer raw parsing to
tessl:analyzing-test-coverage.
Why This SKILL Exists
A score is only as honest as the denominator it's measured against. Most coverage tools report against "files the tests imported" because that's cheaper to compute — and it gives you 88% when the truth is 6%. The adapter-prompt UX is the product: it names the cherry-picked denominator problem out loud, shows the builder the exact diff that fixes it, and lets them opt in instead of silently mutating their config.
The c8 fallback exists because some builders won't want the config change (maybe their coverage config is tangled up in some CI pipeline, maybe they're auditing someone else's repo and don't want to edit anything). The fallback gives them the honest number without touching their code.
C3 — the JSON sidecar — is the machine-readable artifact. /vibe-test:gate reads it under CI mode; external CI systems can read it too. Keeping coverage's exit code at 0 regardless of pass/fail preserves the separation of concerns: coverage measures, gate decides.
Pattern #13 deferral to Tessl keeps Vibe Test honest about its own boundaries: if a dedicated coverage-analysis skill is installed, parsing raw coverage output is its job, not ours. We overlay tier-appropriate interpretation on top of the raw numbers — not re-parse them.