name: quality-review description: "Stage 2 code quality review. Triggers: 'quality review', 'check code quality', or /review stage 2. Requires spec-review to have passed first. Checks SOLID, DRY, security, and test quality. Do NOT use for spec compliance — use spec-review instead." metadata: author: exarchos version: 1.0.0 mcp-server: exarchos category: workflow phase-affinity: review
Quality Review Skill
Overview
Stage 2 of two-stage review: Assess code quality, maintainability, and engineering best practices.
Prerequisite: Spec review must PASS before quality review.
MANDATORY: Before accepting any rationalization for rubber-stamping code quality, consult
references/rationalization-refutation.md. Every common excuse is catalogued with a counter-argument and the correct action.
Triggers
Activate this skill when:
- Spec review has passed
{{COMMAND_PREFIX}}reviewcommand (after spec review)- Ready to assess code quality
- Before synthesis/merge
Execution Context
This skill runs in a SUBAGENT spawned by the orchestrator, not inline.
The orchestrator provides the state file path, diff output from exarchos_orchestrate({ action: "review_diff" }), task ID, and spec review results (must be PASS).
The subagent reads the state file for artifact paths, uses the diff output instead of full files, runs static analysis, performs a code walkthrough, generates a report, and returns the verdict.
Data Handoff Protocol
The orchestrator is responsible for generating the diff before dispatching the quality-review subagent. The subagent does NOT generate its own diff.
Orchestrator responsibilities:
- Generate diff:
git diff main...HEADorgit diff main...integration-branch - Pass diff content in the subagent dispatch prompt
- Include state file path for artifact resolution
- Include spec review results (must be PASS)
Subagent responsibilities:
- Receive diff content from dispatch prompt (do NOT re-generate)
- Read state file for design/plan artifact paths
- Run static analysis and security scripts against the working tree
- Return structured JSON verdict
Context-Efficient Input
Instead of reading full files, receive the integrated diff:
# Generate integrated diff for review (branch stack vs main)
git diff main...HEAD > /tmp/stack-diff.patch
# Alternative: git diff for integration branch
git diff main...integration-branch > /tmp/integration-diff.patch
# Alternative: use gh CLI to get PR diff
# gh pr diff <number>
# Or use GitHub MCP pull_request_read with method "get_diff" if available
This reduces context consumption by 80-90% while providing the complete picture.
Pre-Review Schema Discovery
Before evaluating, query the review strategy runbook to determine the appropriate evaluation approach:
- Evaluation strategy:
exarchos_orchestrate({ action: "runbook", id: "review-strategy" })to determine single-pass vs two-pass evaluation strategy based on diff size and task count.
Review Scope: Combined Changes
After delegation completes, quality review examines:
- The complete stack diff (all task branches vs main), or the feature/integration-branch diff when using integration branches
- All changes across all tasks in one view
- The full picture of combined code quality
This enables catching:
- Cross-task SOLID violations
- Duplicate code across task boundaries
- Inconsistent patterns between tasks
Review Scope
Quality Review focuses on:
- Code quality and readability
- SOLID principles
- Error handling
- Test quality
- Performance considerations
- Security basics
Does NOT re-check:
- Functional completeness (spec review)
- TDD compliance (spec review)
Review Process
Check Quality Signals
Before reviewing, query quality signals for the skill(s) under review:
{{MCP_PREFIX}}exarchos_view({ action: "code_quality", workflowId: "<featureId>" })
- If
regressionsis non-empty, report active quality regressions to the user before proceeding - If any hint has
confidenceLevel: 'actionable', present thesuggestedActionto the user - If
gatePassRate < 0.80for the target skill, warn about degrading quality
Step 0: Verify Spec Review Passed (MANDATORY)
Before proceeding, confirm spec review passed for all tasks:
action: "get", featureId: "<id>", query: "reviews"
If ANY task has specReview.status !== "pass", STOP and return:
{ "verdict": "blocked", "summary": "Spec review not passed — run spec-review first" }
Step 0.5: Verify Review Triage (Conditional — run when delegation phase preceded this review)
If this review follows a delegation phase, verify triage routing:
exarchos_orchestrate({
action: "verify_review_triage",
featureId: "<feature-id>"
})
featureId resolves PRs (event-store projection) and review.routed events directly from the store — no .state.json/.events.jsonl needed (INV-1). A legacy file-based workflow may instead pass stateFile + eventStream.
passed: true: triage routing correct — continue to Step 1. passed: false: triage issues found — investigate and resolve before proceeding.
Step 1: Static Analysis + Security + Extended Gates
Runbook: Run quality evaluation gates via runbook:
exarchos_orchestrate({ action: "runbook", id: "quality-evaluation" })If runbook unavailable, usedescribeto retrieve gate schemas:exarchos_orchestrate({ action: "describe", actions: ["check_static_analysis", "check_security_scan", "check_convergence", "check_review_verdict"] })
Run automated gates via orchestrate actions. See references/gate-execution.md for orchestrate action signatures and response handling.
check_static_analysis— lint, typecheck, quality-check (D2). Must pass before continuing.check_security_scan— security pattern detection (D1). Include findings in report.- Optional D3-D5 gates:
check_context_economy,check_operational_resilience,check_workflow_determinism— advisory, feed convergence view.
Step 2: Test Desiderata Evaluation
Evaluate agent-generated tests against Kent Beck's Test Desiderata. Four properties are critical for agentic code:
| Property | What to check | Flag when |
|---|---|---|
| Behavioral | Tests assert on observable behavior, not implementation details | Mock call count assertions, internal state inspection, testing private methods |
| Structure-insensitive | Tests survive refactoring without behavioral change | Tests coupled to internal helper method signatures, tests that break when internals are renamed |
| Deterministic | Tests produce the same result every run | Uncontrolled Date.now(), Math.random(), setTimeout race conditions, network-dependent tests |
| Specific | Test failures pinpoint the cause | toBeTruthy() / toBeDefined() without additional specific assertions, catch-all tests with vague descriptions |
Test layer mismatch detection: Flag unit tests with >3 mocked dependencies as potential layer mismatches — unit tests with many mocks often indicate the test is asserting integration concerns rather than unit logic. Advisory finding: suggest re-classifying as integration test with real collaborators.
Include Test Desiderata findings in the quality review report under a "Test Quality" section. Output format: Report Test Desiderata violations as entries in the issues array with category: "test-quality".
Forthcoming — mutation-adequacy (R5, #1520): A dedicated
mutation-adequacyreview dimension will join the review contract in a later slice, scoring whether the test suite actually kills injected mutants (the strongest signal that tests can fail for the right reason). Until that dimension lands, surviving-mutant analysis is out of scope for this skill — do not run or score mutation testing here. Today the closest proxy is the Specific Test Desiderata property above plus thecheck_test_adequacygate at delegation time; full mutation scoring is deferred to R5.
Step 3: Generate Report
Use the template from references/review-report-template.md to structure the review output.
Priority Levels
| Priority | Action | Examples |
|---|---|---|
| HIGH | Must fix before merge | Security issues, data loss risks |
| MEDIUM | Should fix, may defer | SOLID violations, complexity |
| LOW | Nice to have | Style preferences, minor refactors |
Priority Classification Rules
- HIGH: security vulnerabilities, data loss risk, API contract breaks, uncaught exception paths
- MEDIUM: SOLID violations (LSP, ISP), cyclomatic complexity >15, test coverage <70%
- LOW: naming, code style, comment clarity, non-impactful performance
If classification is ambiguous, default to MEDIUM and flag for human decision.
Fix Loop for HIGH-Priority
If HIGH-priority issues found:
- Create fix task listing each HIGH finding with file, issue, and required fix
- Dispatch to implementer subagent
- Re-review quality after fixes
- Only mark APPROVED when all HIGH items resolved and tests pass
Fix loop iteration limit: max 3. If HIGH-priority issues persist after 3 fix-review cycles, pause and escalate to the user with a summary of unresolved issues. The user can override: {{COMMAND_PREFIX}}review --max-fix-iterations 5
Post-Fix Spec Compliance Check (MANDATORY after fix cycle)
After the quality-review fix loop completes and quality passes, re-verify that the quality fixes did not break spec compliance. Run inline (not a full dispatch):
- Run spec verification commands:
npm run test:run npm run typecheckexarchos_orchestrate({ action: "check_tdd_compliance", featureId: "<featureId>", taskId: "<taskId>", branch: "<branch>" }) - If all pass: proceed to APPROVED transition
- If any fail: return to NEEDS_FIXES with spec regression noted in issues array
Required Output Format
The subagent MUST return results as structured JSON. The orchestrator parses this JSON to populate state. Any other format is an error.
{
"verdict": "pass | fail | blocked",
"summary": "1-2 sentence summary",
"issues": [
{
"severity": "HIGH | MEDIUM | LOW",
"category": "security | solid | dry | perf | naming | test-quality | other",
"file": "path/to/file",
"line": 123,
"description": "Issue description",
"required_fix": "What must change"
}
],
"test_results": {
"passed": 0,
"failed": 0,
"coverage_percent": 0
}
}
Anti-Patterns
| Don't | Do Instead |
|---|---|
| Block on LOW priority | Accept and track for later |
| Review before spec passes | Complete spec review first |
| Be overly pedantic | Focus on impactful issues |
| Skip security checks | Always verify basics |
| Accept poor test quality | Tests are code too |
| Apply generic standards to language issues | Reference language-specific rules |
Cross-Task Integration Issues
If an issue spans multiple tasks:
- Classify as "cross-task integration"
- Create fix task specifying ALL affected tasks
- Dispatch fix to implementer with context from all affected tasks
- Mark original tasks as blocked until cross-task fix completes
State Management
Key format: The review key MUST be kebab-case
reviews["quality-review"], not camelCasereviews.qualityReview. The guard matches on the exact key string.
On review complete:
action: "update", featureId: "<id>", updates: {
"reviews": { "quality-review": { "status": "pass", "summary": "...", "issues": [...] } }
}
On all reviews pass — advance to synthesis:
action: "transition", featureId: "<id>", target: "synthesize"
Phase Transitions and Guards
For the full transition table, consult @skills/workflow-state/references/phase-transitions.md.
Quick reference:
review→synthesizerequires guardall-reviews-passed— allreviews.{name}.statusmust be passingreview→delegaterequires guardany-review-failed— triggers fix cycle when any review fails
Schema Discovery
Use exarchos_workflow({ action: "describe", actions: ["update", "init"] }) for
parameter schemas and exarchos_workflow({ action: "describe", playbook: "feature" })
for phase transitions, guards, and playbook guidance. Use
exarchos_orchestrate({ action: "describe", actions: ["check_static_analysis", "check_security_scan", "check_review_verdict"] })
for orchestrate action schemas.
Completion Criteria
- Static analysis passes
- All HIGH-priority issues fixed
- No security vulnerabilities
- Test quality acceptable
- Code is maintainable
- State file updated with review results
Decision Runbooks
For review verdict routing, query the decision runbook:
exarchos_orchestrate({ action: "runbook", id: "review-escalation" })
This runbook provides structured criteria for routing between APPROVED and NEEDS_FIXES verdicts based on finding severity and fix cycle count. APPROVED transitions to synthesize; NEEDS_FIXES transitions back to delegate for a fix cycle. (BLOCKED routing is only relevant in plan-review, not here.)
MCP-Served Quality Check Catalog (Tier 2)
After Tier 1 MCP gates complete, execute the quality check catalog. This provides deterministic quality checks (grep patterns, structural analysis) that run on any MCP platform — no companion plugins required.
Step 2.5: Execute Check Catalog
exarchos_orchestrate({ action: "prepare_review", featureId: "<id>" })
The response contains:
catalog— structured check dimensions with grep patterns, structural thresholds, and heuristic instructionsfindingFormat— the TypeScript interface for submitting findingspluginStatus— which companion plugins are configured in.exarchos.yml
Execute each check in the catalog against the codebase:
- grep checks: Run the
patternagainst files matchingfileGlob - structural checks: Evaluate against the
threshold(e.g., nesting depth, function length) - heuristic checks: Use judgment guided by the
description
Collect all matches as findings in the format specified by findingFormat, then pass them as pluginFindings to check_review_verdict.
Companion Plugin Enhancement (Tier 3 — Platform-Dependent)
On platforms with skill support (Claude Code, Cursor), the orchestrator may additionally invoke impeccable:critique for deeper qualitative analysis. These findings are also passed as pluginFindings.
Convergence & Verdict
Query convergence status and compute verdict via orchestrate. See references/convergence-and-verdict.md for full orchestrate calls, response fields, and verdict routing logic.
Summary: check_convergence returns per-dimension D1-D5 status. check_review_verdict takes finding counts and optional pluginFindings array (from catalog execution and companion plugins), emits gate events, and returns APPROVED or NEEDS_FIXES.
Auto-Transition
All transitions are automatic — no user confirmation. See references/auto-transition.md for per-verdict transition details, Skill invocations, and integration notes.
Recording Results
Before transitioning, record the review verdict. The reviews value MUST be an object with a status field, not a flat string:
APPROVED:
exarchos_workflow({ action: "update", featureId: "<id>", updates: {
reviews: { "quality-review": { status: "pass", summary: "...", issues: [] } }
}})
exarchos_workflow({ action: "transition", featureId: "<id>", target: "synthesize" })
Then invoke {{COMMAND_PREFIX}}synthesize.
NEEDS_FIXES:
exarchos_workflow({ action: "update", featureId: "<id>", updates: {
reviews: { "quality-review": { status: "fail", summary: "...", issues: [{ severity: "HIGH", file: "...", description: "..." }] } }
}})
Then invoke {{COMMAND_PREFIX}}delegate --fixes.
Gate events: Do NOT manually emit
gate.executedevents viaexarchos_event. Gate events are automatically emitted by thecheck_review_verdictorchestrate handler. Manual emission causes duplicates.
Guard shape: The
all-reviews-passedguard requiresreviews.{name}.statusto be a passing value (pass,passed,approved,fixes-applied). Flat strings likereviews: { "quality-review": "pass" }are silently ignored and will block thereview → synthesizetransition.