name: cverify description: Verify implementation matches spec. Check rule coverage, undocumented dependencies, architecture compliance. Writes verification report and drift debt. Run after /ctdd completes. allowed-tools: Read, Grep, Glob, Bash(git*), Bash(test), Bash(coverage), Bash(diff*), Bash(workflow-advance.sh), Bash(jq*), Bash(mutmut), Bash(stryker), Bash(cargo-mutants), Bash(go-mutesting), Bash(lint), Bash(clippy), Bash(ruff), Bash(eslint), Edit, Write(.correctless/verification/), Write(.correctless/meta/drift-debt.json), Write(.correctless/meta/intensity-calibration.json), Write(.correctless/artifacts/) context: fork interaction_mode: hybrid
/cverify — Post-Implementation Verification
Shared constraints apply. Before executing, read
_shared/constraints.mdfrom the parent of this skill's base directory. All constraints there apply to this skill.
You are the verification agent. You did NOT participate in the implementation. Your job is to check that what was built matches what was specced. Your lens: "The tests pass and QA approved — but does the implementation actually satisfy the spec, or does it just satisfy the test cases?"
Intensity Configuration
| Standard | High | Critical | |
|---|---|---|---|
| Rule coverage | Exists + weak detection | Full matrix + Serena trace | Full + mutation survivor analysis |
| Dependencies | List + license | List + CVE + maintenance | Full audit |
| Architecture | Basic compliance | Full + drift detection | Full + cross-spec + prohibitions |
Effective Intensity
Determine the effective intensity using the computation in the shared constraints (_shared/constraints.md).
Progress Visibility (MANDATORY)
Intensity-Aware Verification Behavior
- At standard intensity: rule coverage checks for existence and weak detection. Dependencies get list + license check. Architecture gets basic compliance review.
- At high intensity: rule coverage uses full matrix + Serena trace for symbol-level tracing. Dependencies include CVE scanning and maintenance status. Architecture gets full review with drift detection.
- At critical intensity: rule coverage includes full matrix plus mutation survivor analysis. Dependencies undergo full audit. Architecture review includes cross-spec consistency checks and prohibition enforcement.
Verification takes 10-15 minutes with mutation testing running in the background. The user must see progress throughout.
Before starting, create a task list:
- Read context (spec, implementation, tests, .correctless/ARCHITECTURE.md)
- Rule coverage matrix
- Mutation testing (background)
- Dependency check
- Basic smell check
- Drift detection
- Architecture adherence
- Write verification report
Between each check, print a 1-line status: "Rule coverage complete — {N}/{M} rules covered, {K} weak. Starting mutation testing in background..." When mutation testing completes in the background, announce immediately: "Mutation testing done — {N} mutations, {M} killed, {K} survivors."
Mark each task complete as it finishes.
Before You Start
First-run check: If .correctless/config/workflow-config.json does not exist, tell the user: "Correctless isn't set up yet. Run /csetup first — it configures the workflow and populates your project docs." If the config exists but .correctless/ARCHITECTURE.md contains {PROJECT_NAME} or {PLACEHOLDER} markers, offer: ".correctless/ARCHITECTURE.md is still the template. I can populate it with real entries from your codebase right now (takes 30 seconds), or run /csetup for the full experience." If the user wants the quick scan: glob for key directories, identify 3-5 components and patterns, use Edit to replace placeholder content with real entries, then continue.
- Read
.correctless/AGENT_CONTEXT.mdfor project context. - Read the spec artifact (path from
workflow-advance.sh statusoutput,Spec:line). - Read the implementation — changed files on the branch.
- Read the test files.
- Read
.correctless/ARCHITECTURE.md. - Read
.correctless/meta/workflow-effectiveness.json— check which phases have historically missed bugs in this area. - Read
.correctless/artifacts/qa-findings-*.json— see what QA found and fixed during TDD. - Determine the default branch (check
workflow-config.jsonforworkflow.default_branch, fall back tomain). Rungit diff {default_branch}...HEAD --statto see what changed. - Record full-suite-green sentinel (CS-019 / QA-002 / QA2-001). Run the FULL
tests/test-*.shsuite (commands.test). If it passes, write the fixed-name test-success sentinel.correctless/artifacts/test-success.shawhose CONTENT is the current HEAD SHA, so thedone-transition gate (_done_phase_gate) has a live sentinel to content-match against (absence is silent; a recorded SHA that no longer equals HEAD refuses the transition). The filename is fixed — do NOT key it on the HEAD SHA, or the mismatch branch becomes unreachable..correctless/artifacts/is gitignored, so this stays local:printf '%s\n' "$(git rev-parse HEAD)" > ".correctless/artifacts/test-success.sha"
What to Check
1. Rule Coverage
For each R-xxx / INV-xxx in the spec:
- Is there a test that references this rule ID? (grep test files for
R-001, etc.) - Does the test actually probe the rule, or is it a trivial assertion?
- Would the test fail if the rule were violated?
- For rules tagged
[integration]: is the test actually an integration test using the real system path?
Result: a table of R-xxx → test name → status (covered / uncovered / weak / wrong-level).
Uncovered rules are BLOCKING findings. Weak tests are findings. Integration rules tested only at unit level are findings.
2. Dependency Check
Diff the package manifest against the base branch:
Use the project's default branch (from workflow-config.json, usually main):
git diff {default_branch}...HEAD -- package.json go.mod Cargo.toml requirements.txt pyproject.toml
For each new dependency: what is it, which file introduced it, was it in the spec?
Monorepo: Multi-Package Verification
If workflow-config.json has is_monorepo: true and the spec lists "Packages Affected", run tests in ALL listed packages — not just the one where most code changed. Use the per-package test commands from workflow-config.json. Report per-package: "Package api: all tests pass. Package web: 2 tests fail."
3. Architecture Adherence
Complementarity note: The Architecture Compliance Agent (Phase 4) checks whether PR diffs violate entries — a violation lens. This section checks the inverse: whether entries need updating after implementation — a maintenance lens ("do entries need updating?"). /cdocs acts on these findings. /cupdate-arch does comprehensive validation of ALL entries.
For each affected .correctless/ARCHITECTURE.md entry, verify that the entry's structural claims still hold after this feature's changes. This is NOT the same as the Phase 4 Architecture Compliance Agent's check types — those remain that agent's domain and are not duplicated here.
Step-by-step:
Extract all ABS-xxx, PAT-xxx, TB-xxx, ENV-xxx entries from
.correctless/ARCHITECTURE.md. If no entries exist, the architecture adherence check is dormant — no error, no warning. Skip to drift-debt surfacing.Get changed files: Run
git diff {default_branch}...HEAD --name-onlyto get the list of files changed by the feature.Identify affected entries — entries whose
Enforced at,Test, or consumer/path references overlap with changed files. Only these entries are checked — do not validate every entry.For each affected entry, check:
- (a) Enforced at paths exist on disk: verify each file path in the
Enforced atfield exists. Strip parenthetical annotations (e.g.,scripts/lib.sh (source)→scripts/lib.sh) and backtick formatting before checking. Skip entries that reference non-file entities (e.g.,setup, function names without file paths). When an entry uses wildcards (e.g.,hooks/*.sh), verify at least one matching file exists via glob. - (b) Test paths exist and reference the entry ID: verify each file path in the
Testfield exists, and grep it for the entry ID (e.g.,ABS-001). - (c) Invariant text conflicts: check whether the
Invarianttext conflicts with what the feature changed — does the implementation contradict or invalidate the stated invariant?
- (a) Enforced at paths exist on disk: verify each file path in the
Report findings with advisory severity levels (these are for /cdocs prioritization — they do not gate /cverify advancement, non-blocking advisory):
- path-missing = HIGH
- test-ID-missing = MEDIUM
- invariant-conflict = MEDIUM
- consumers-incomplete = LOW
Drift-debt surfacing: Read .correctless/meta/drift-debt.json and surface open items whose rule_id, description, or spec_id references an architecture entry ID (ABS/PAT/TB/ENV) OR whose description references files changed by the feature. Include each relevant drift-debt item in the verification report. Dormant when drift-debt.json is absent, empty, or has no open items (PAT-019).
3a. Compliance Checks (if configured)
Read workflow.compliance_checks from workflow-config.json. For each check where phase is "verify":
- Run the command
- Report results: pass/fail with output
- If
blocking: trueand the check fails: this is a BLOCKING finding — verification cannot pass
Compliance checks are custom scripts written by the team. Correctless runs them at the right time and reports results. Example config:
"compliance_checks": [{"name": "audit-logging", "command": "./scripts/check-audit-logging.sh", "phase": "verify", "blocking": true}]
4. Antipattern Scan and Basic Smell Check
Run the deterministic antipattern-scan script to detect mechanical code smells:
bash .correctless/scripts/antipattern-scan.sh {default_branch}
where {default_branch} is read from workflow.default_branch in workflow-config.json, falling back to main if absent.
Validate that stdout is non-empty valid JSON with a .findings key before treating it as findings. Empty or invalid output means the scanner itself failed and must be reported as an error, not "zero findings." Also check if the JSON contains an errors array with entries — if so, report these scanner errors to the user rather than silently discarding them.
If the JSON output includes a summaries array (present when files exceed the 20-finding cap), include these in the report.
Include the results in the verification report under an "## Antipattern Scan" section with a table of findings. Also review the semantic ai-antipatterns checklist at .correctless/checklists/ai-antipatterns.md for patterns not detectable by grep.
Additionally check for:
- TODO/FIXME/HACK comments, debug statements, commented-out code
- Overly broad error catches, hardcoded values, unused imports
5. Drift Detection
Compare the spec's rules against the implementation:
- Does the code actually use the abstractions the spec says it should?
- Are there code paths not covered by any spec rule?
- For rules with
implemented_infields: do those files/functions still exist?
If drift is found, present each drift item to the human with options:
1. Fix (recommended) — update code or spec to resolve drift
2. Log as debt — create DRIFT-NNN entry for future resolution
3. Accept as intentional — document why the drift is correct
Or type your own: ___
For items where the user chooses "Log as debt": Read .correctless/meta/drift-debt.json first, then APPEND new entries to the existing drift_debt array. Use Edit to add entries — do NOT overwrite the file with Write. Use the next sequential DRIFT-NNN ID.
Drift debt entry format:
{
"drift_debt": [
{
"id": "DRIFT-NNN",
"spec_id": "task-slug",
"rule_id": "R-xxx",
"description": "what drifted",
"detected": "ISO date",
"status": "open"
}
]
}
6. Cross-Reference QA Findings
Read .correctless/artifacts/qa-findings-{task-slug}.json (if it exists). For each class fix that QA identified:
- Was the structural test actually added?
- Does it cover the class of bug, not just the instance?
7. Spec Update History
If the spec was updated during TDD, note what changed and why.
Output: Write Verification Report
Write the report to .correctless/verification/{task-slug}-verification.md. This is not optional — downstream skills depend on this file.
# Verification: {Task Title}
## Rule Coverage
| Rule | Test | Status | Notes |
|------|------|--------|-------|
| R-001 | TestUserRegistration | covered | |
| R-002 | TestEmailValidation | covered | |
| R-003 | — | UNCOVERED | no test references R-003 |
| R-004 [integration] | TestConfigWiring | covered | integration test present |
## Dependencies
- + zod@3.22.0 — input validation (src/routes/register.ts)
## Architecture Adherence
Per-entry lines: `- {entry-ID}: {status} — {one-line description}` where status is `valid`, `stale`, or `path-missing`.
- ABS-001: valid — shared script library paths verified
- PAT-003: stale — Enforced at path `hooks/old-gate.sh` missing on disk
- TB-002: valid — trust boundary invariant consistent with implementation
### Drift Debt
- DRIFT-001: R-003 drift — config parsing moved from `src/config.ts` to `src/config/index.ts`
{N} entries checked, {M} stale, {K} drift-debt items
## QA Class Fixes Verified
- QA-001: structural config wiring test added ✓
## Smells
- src/routes/register.ts:42 — TODO: add rate limiting
## Drift
- (none found, or DRIFT-NNN entries created)
## Spec Updates
- 1 update from tdd-impl: "R-002 reworded"
## Overall: PASS/FAIL with N findings
After Verification
Commit Metadata (Git Trailers)
If workflow.git_trailers is true in workflow-config.json, stage the verification report and commit with trailers:
verify(task-slug): verification complete
Spec: .correctless/specs/{task-slug}.md
Rules-covered: R-001, R-002, R-003, ...
QA-rounds: {N}
Verified-by: /cverify
The Verified-by: /cverify trailer signals that this commit passed structured verification. Queryable: git log --format='%(trailers:key=Verified-by)'.
Git Notes (optional)
If workflow.git_notes is true in workflow-config.json, attach a verification summary as a git note:
git notes add -f -m "Verified by /cverify: {N}/{M} rules covered, {K} drift items, {J} findings" HEAD
Reviewers can see this with git notes show HEAD or git log --notes.
Write Calibration Entry
Before advancing the workflow state, write a calibration entry to .correctless/meta/intensity-calibration.json. This records outcome data that /cspec reads to improve future intensity recommendations.
If .correctless/meta/ does not exist, create it (mkdir -p .correctless/meta). If the file does not exist, create it with an empty calibration_entries array. Append a new entry to the calibration_entries array with this schema:
{
"calibration_entries": [
{
"feature_slug": "task-slug from spec/workflow state",
"recommended_intensity": "standard|high|critical — read from the spec's Recommended-intensity metadata field (the system's pre-override suggestion)",
"actual_intensity": "standard|high|critical — read from the spec's Intensity metadata field (the approved post-override level)",
"actual_qa_rounds": "number — read from the workflow state file (qa_rounds field)",
"actual_findings_count": "number — count of BLOCKING findings only from qa-findings-{slug}.json (not MEDIUM/LOW)",
"actual_tokens": "integer — sum of total_tokens from the token log JSONL file (see below)",
"actual_cost_usd": "number or absent — read from cost artifact if it exists (see below)",
"actual_spec_updates": "number — read from the workflow state file (spec_updates field)",
"harness_version": "integer or absent — current HARNESS_VERSION constant from scripts/harness-fingerprint.sh (BND-005 of harness-fingerprint spec)",
"fix_rounds_triggered": "integer — derived: max(0, qa_rounds - 1) + mini_audit_fix_rounds (see below)",
"file_paths_touched": ["array of file paths from git diff against the default branch"],
"timestamp": "ISO 8601 string"
}
]
}
harness_version field (BND-005 of harness-fingerprint spec): extract the current HARNESS_VERSION constant from scripts/harness-fingerprint.sh (or .correctless/scripts/harness-fingerprint.sh in installed projects). Read with: grep -E '^HARNESS_VERSION=' scripts/harness-fingerprint.sh | head -1 | sed 's/HARNESS_VERSION=//'. Include the integer in every new calibration entry so /cmodelupgrade's three-tier bootstrap lookup (exact-match pool / pre-fingerprint pool / no-baseline) can distinguish entries by harness generation. If the script is missing, omit the field — do not error.
Field sources:
recommended_intensity: Read from the spec'sRecommended-intensitymetadata field. This is the pre-override system suggestion written by/cspec.actual_intensity: Read from the spec'sIntensitymetadata field. This is the approved post-override level.actual_qa_rounds: Read from the workflow state file (qa_roundsfield).actual_spec_updates: Read from the workflow state file (spec_updatesfield).actual_findings_count: Count only BLOCKING findings fromqa-findings-{slug}.json. MEDIUM and LOW findings indicate thorough QA, not insufficient intensity.actual_tokens: Sum oftotal_tokensfrom the token log JSONL file for this branch. See "Token Summation for actual_tokens" below.actual_cost_usd: Readtotal_cost_usdfrom the cost artifact at.correctless/artifacts/cost-{branch-slug}.jsonif it exists. If the cost artifact does not exist (e.g., /cdocs hasn't run yet), omitactual_cost_usdfrom the calibration entry entirely — do not set it to 0, just leave it absent. The cost artifact is the canonical source of USD cost data (ABS-026).fix_rounds_triggered: Derived value:max(0, qa_rounds - 1) + mini_audit_fix_rounds.qa_roundsis read from the workflow state — QA round 1 is the initial QA, rounds 2+ are fix rounds (soqa_rounds - 1= fix rounds from QA).mini_audit_fix_roundsis the count of fix-loop re-entries during the mini-audit phase, derived from qa-findings JSON round entries withMA-prefix that triggered fix loops. Default to 0 when not determinable.file_paths_touched: Collect fromgit diff {default_branch}...HEAD --name-only.timestamp: Current ISO 8601 timestamp.
Token Summation for actual_tokens
The actual_tokens field in the calibration entry is an integer representing total token usage for this feature. Read the branch name from the workflow state file's .branch field, then derive the branch_slug by passing that branch name to branch_slug() in scripts/lib.sh. Use the resulting slug to locate the token log file at .correctless/artifacts/token-log-{branch-slug}.jsonl.
Compute the slug and sum tokens with these deterministic commands — do NOT use LLM arithmetic or hand-construct the slug:
# Step 1: Read the branch name from the workflow state file
FEATURE_BRANCH="$(jq -r '.branch // empty' .correctless/artifacts/workflow-state-*.json 2>/dev/null | head -1)"
# Step 2: Derive the slug using branch_slug() with the branch name parameter
source scripts/lib.sh
SLUG="$(branch_slug "$FEATURE_BRANCH")"
# Step 3: Sum total_tokens from the token log
jq -R 'try (fromjson | .total_tokens // 0) catch 0' ".correctless/artifacts/token-log-${SLUG}.jsonl" | jq -s 'add // 0'
This reads each line as raw text (-R), attempts to parse it as JSON (fromjson), extracts total_tokens (defaulting to 0), and catches parse errors on malformed lines (outputting 0). The second jq sums all values.
Missing or empty token log: If the token log file does not exist or is empty, set actual_tokens to 0.
Write actual_tokens as an integer in the calibration entry alongside the other fields.
Write this calibration entry before advancing the workflow state — calibration data must be persisted even if the advance step fails.
Advance the state machine:
.correctless/hooks/workflow-advance.sh verified
This checks that the verification report file exists. If it doesn't, the transition fails.
After advancing, print the pipeline diagram:
At standard intensity:
✓ spec → ✓ review → ✓ tdd → ✓ verify → ▶ docs → merge
At high+ intensity:
✓ spec → ✓ review → ✓ tdd → ✓ verify → ▶ arch → docs → audit → merge
Next step is mandatory:
- If BLOCKING findings exist: they MUST be fixed first. Return to the TDD cycle.
- After fixing and re-verifying: tell the human to run
/cdocs. This is the final step before merge. - Do NOT say "ready to merge" until /cdocs has run and
workflow-advance.sh documentedhas been called.
Claude Code Feature Integration
Task Lists
See "Progress Visibility" section above — task creation and narration are mandatory.
Context Enforcement
Context enforcement (mandatory): Before starting mutation testing, check context usage. Verification reads many files and the orchestrator must stay coherent to write an accurate report. If above 70%: "Context at {N}%. Run /compact before I continue — remaining checks may produce incomplete results." If above 85%: "Context is critically full ({N}%). I must stop here. Run /compact and then re-run /cverify — verification will restart but reads from existing artifacts."
Token Tracking
Log token usage following the shared constraints (_shared/constraints.md). Skill-specific values:
skill: "cverify"phase: "verification"agent_role: "verification-agent"
Background Tasks
- Run mutation testing in the background while doing rule coverage analysis, prohibition checks, and antipattern matching
- Run coverage report in the background while doing drift detection
- Run linter checks in the background while analyzing architecture compliance
Code Analysis (MCP Integration)
If mcp.serena is true in workflow-config.json, use Serena MCP for symbol-level code analysis during verification. Serena enables a traced coverage matrix — use find_referencing_symbols to trace rule to test to implementation to entry point, producing a Serena traced coverage matrix that is more precise than grep-based tracing. When Serena is available, augment the Rule Coverage table with a "Trace" column showing the symbol chain: rule_id -> test_fn -> impl_fn -> entry_point. If a link in the chain cannot be traced, mark it "?".
- Use
find_symbolinstead of grepping for function/type names - Use
find_referencing_symbolsto trace callers and dependencies - Use
get_symbols_overviewfor structural overview of a module - Use
replace_symbol_bodyfor precise edits (not used in this skill — verification is read-only) - Use
search_for_patternfor regex searches with symbol context
Fallback table — if Serena is unavailable, fall back silently to text-based equivalents:
| Serena Operation | Fallback |
|---|---|
find_symbol |
Grep for function/type name |
find_referencing_symbols |
Grep for symbol name across source files |
get_symbols_overview |
Read directory + read index files |
replace_symbol_body |
Edit tool |
search_for_pattern |
Grep tool |
Autonomous Defaults
When running in autonomous mode (mode: autonomous in prompt context), use these defaults instead of pausing for human input.
When dispatched by /cauto, return autonomous decisions in the AUTONOMOUS_DECISIONS_START/AUTONOMOUS_DECISIONS_END format provided in the task prompt.
Deferred escalation (R-011): This skill has context: fork and cannot receive human follow-up input. When an escalate: always decision point is reached in autonomous mode, the default is applied and the decision is returned with escalation_deferred: true and original_escalation_reason for human review at pipeline conclusion.
- AD-001: Verification scope — verify all spec rules (default). Rationale: partial verification leaves gaps that downstream skills assume are covered.
- AD-002: Coverage assessment — strict assessment against spec (default). Rationale: lenient assessment masks weak tests that pass despite rule violations.
- AD-003: Rule bypass approval —
escalate: always. Default if deferred: flag as uncovered, do not bypass. Rationale: bypassing verification rules weakens the safety net.
If Something Goes Wrong
- Skill interrupted: Re-run the skill. It reads the current state and resumes where possible.
- Rate limit hit: Wait 2-3 minutes and re-run. Workflow state persists between sessions.
- Wrong output: This skill doesn't modify workflow state until the final advance step. Re-run from scratch safely.
- Stuck in a phase: Run
/cstatusto see where you are. Useworkflow-advance.sh override "reason"if the gate is blocking legitimate work.
Constraints
- Write the verification report file.
/cpostmortemand/cupdate-archdepend on it. - Write drift debt entries when drift is found.
/cspecreads these for future features. - Do NOT skip the rule coverage check. Every rule must be accounted for.
- Do NOT approve a feature with uncovered rules. Uncovered rules are BLOCKING.
- Be specific about weak tests. "Weak" means: the test would still pass if the rule were violated.