name: odyssey-review-test-fix
description: "Deep review + fix cycle — archaeology, exploration, multi-dimensional review, targeted fix, generalization, discovery, and knowledge persistence"
argument-hint: " [--dimensions ] [--fix-threshold critical|high|medium|low|all] [--skip-fix] [--skip-generalize] [--auto] [-y] [-c] [--heartbeat]"
allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
Review dimensions: correctness, security, performance, architecture (filterable via --dimensions).
Zero-residual: Every finding MUST have a concrete action. "Report and shelve" and "pre-existing skip" are forbidden.
Target resolution:
| Input | Resolution |
|---|---|
| File/dir path | Review those files |
| HEAD / staged | git diff HEAD / git diff --staged |
| Phase number | state.json → changed files |
| PR number | git diff main...HEAD |
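The resolution table can be sketched as a small dispatcher. This is a minimal Python sketch, not the tool's actual parser: the function name `resolve_target` and the `phase:` / `pr:` prefixes are assumptions, since the source does not say how a phase number is told apart from a PR number.

```python
def resolve_target(target: str) -> dict:
    """Map a raw target argument to the resolution listed in the table."""
    if target == "HEAD":
        return {"kind": "diff", "command": ["git", "diff", "HEAD"]}
    if target == "staged":
        return {"kind": "diff", "command": ["git", "diff", "--staged"]}
    # Hypothetical prefixes: the source does not define how phase and PR
    # numbers are distinguished from each other.
    if target.startswith("phase:") and target[6:].isdigit():
        return {"kind": "phase", "phase": int(target[6:]), "source": "state.json"}
    if target.startswith("pr:") and target[3:].isdigit():
        return {"kind": "diff", "command": ["git", "diff", "main...HEAD"]}
    # Anything else is treated as a file or directory path to review directly.
    return {"kind": "path", "path": target}
```

Returning a plain dict keeps the caller free to either run the git command or read `state.json`, depending on `kind`.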
Flags: --dimensions <list> subset of 4 dims | --fix-threshold <sev> default all | --skip-fix skip S_FIX+S_CONFIRM | --skip-generalize skip S_GENERALIZE+S_DISCOVER | --auto -y -c --heartbeat
Session: .workflow/scratch/{YYYYMMDD}-review-odyssey-{slug}/
Output: session.json | evidence.ndjson | explore.json | understanding.md (sections 1-8)
session.json — review-specific fields:
{ "target": "", "dimensions": [], "review_result": { "remaining_actionable": 0 },
"patterns": [], "confirmation": null, "generalization_stats": null }
evidence.ndjson phases: archaeology|explore|review|fix|discovery|decision|self-iteration
phase_goals[]:
| ID | Goal | done_when | phase | skip_when |
|---|---|---|---|---|
| G1 | Review completed | all dimensions reviewed | S_REVIEW | — |
| G2 | Explore context | explore.json populated | S_EXPLORE | — |
| G3 | Zero remaining | remaining_actionable == 0 | S_CONFIRM | skip_fix |
| G4 | Pattern generalized | patterns[] >= 1 | S_GENERALIZE | skip_generalize |
| G5 | Discoveries triaged | all hits classified | S_DISCOVER | skip_generalize |
| G6 | Learnings persisted | spec entries or no actionable | S_RECORD | — |
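Deriving phase_goals[] from the flags might look like the sketch below. The goal IDs, phases, and skip conditions come from the table; the `status` field and its values (`pending`, `skipped`) are assumptions made for illustration.

```python
# (id, phase, skip_when) taken from the phase_goals table; None means never skipped.
PHASE_GOALS = [
    ("G1", "S_REVIEW", None),
    ("G2", "S_EXPLORE", None),
    ("G3", "S_CONFIRM", "skip_fix"),
    ("G4", "S_GENERALIZE", "skip_generalize"),
    ("G5", "S_DISCOVER", "skip_generalize"),
    ("G6", "S_RECORD", None),
]

def derive_phase_goals(flags: set) -> list:
    """Build the goal list, marking goals whose skip_when flag is set."""
    goals = []
    for goal_id, phase, skip_when in PHASE_GOALS:
        skipped = skip_when is not None and skip_when in flags
        # "status" is a hypothetical field name, not defined by the source.
        goals.append({"id": goal_id, "phase": phase,
                      "status": "skipped" if skipped else "pending"})
    return goals
```

For example, `derive_phase_goals({"skip_generalize"})` marks G4 and G5 as skipped and leaves the other four goals pending.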
Specs: maestro load --type spec --category review. Rest per base Pre-load.
Knowledge Persistence (S_RECORD → understanding.md section 8):
| Category | Content | Follow-up |
|---|---|---|
| Cross-dimension recurring pattern | Pattern + affected dimensions + coding standard | $spec-add review |
| Security finding | Vulnerability type + trigger + fix approach | $spec-add debug |
| Architecture violation pattern | Violation + correct boundary + verification | $spec-add arch |
| Reusable generalization pattern | Signature + risk + fix template | $spec-add coding |
Shared Output Schema
{
"type": "object",
"properties": {
"id": { "type": "string" },
"result_status": { "type": "string", "enum": ["completed", "failed"] },
"findings": { "type": "string", "maxLength": 500 },
"evidence": { "type": "string" },
"error": { "type": "string" }
},
"required": ["id", "result_status", "findings"]
}
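A result row can be checked against this schema before it is merged. The sketch below is a hand-rolled check covering only the keywords the schema uses (`required`, `type: string`, `enum`, `maxLength`); it is not a general JSON Schema validator, and the name `validate_result` is an assumption.

```python
SHARED_OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "result_status": {"type": "string", "enum": ["completed", "failed"]},
        "findings": {"type": "string", "maxLength": 500},
        "evidence": {"type": "string"},
        "error": {"type": "string"},
    },
    "required": ["id", "result_status", "findings"],
}

def validate_result(result: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for key in SHARED_OUTPUT_SCHEMA["required"]:
        if key not in result:
            errors.append(f"missing required field: {key}")
    for key, value in result.items():
        spec = SHARED_OUTPUT_SCHEMA["properties"].get(key)
        if spec is None:
            continue  # the schema does not forbid additional properties
        if not isinstance(value, str):
            errors.append(f"{key}: expected string")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key}: must be one of {spec['enum']}")
        if "maxLength" in spec and len(value) > spec["maxLength"]:
            errors.append(f"{key}: longer than {spec['maxLength']} characters")
    return errors
```

Collecting every problem instead of raising on the first one makes it easier to log a failed agent result as a single evidence entry.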
Termination contract: Call report_agent_job_result EXACTLY ONCE. Read-only. Do NOT modify source files, tasks.csv, wave-*.csv, results.csv, or call spawn_agents_on_csv.
tasks.csv
id,title,description,task_type,dimension,deps,wave,status,findings,evidence,error
Waves:
| Wave | Tasks | Parallelism |
|---|---|---|
| 1 | Archaeology (git-timeline, git-blame) | 2 agents |
| 2 | Review (correctness, security, performance, architecture) | 4 agents |
| 3 | Generalization (syntax-grep, semantic-scan, structural-match, historical-grep) | 4 agents |
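Rows for a wave can be generated rather than typed by hand. The sketch below uses Python's standard `csv` module with `QUOTE_ALL`, which matches the fully quoted rows this document uses; the helper name `build_wave_csv` and the choice to default `status` to `pending` are assumptions.

```python
import csv
import io

# Column order copied from the tasks.csv header.
HEADER = ["id", "title", "description", "task_type", "dimension",
          "deps", "wave", "status", "findings", "evidence", "error"]

def build_wave_csv(tasks: list) -> str:
    """Render task dicts as CSV text: unquoted header, fully quoted rows."""
    buf = io.StringIO()
    buf.write(",".join(HEADER) + "\n")
    writer = csv.DictWriter(buf, fieldnames=HEADER, quoting=csv.QUOTE_ALL,
                            lineterminator="\n", restval="")
    for task in tasks:
        # Missing columns fall back to "" via restval; status defaults to pending.
        writer.writerow({"status": "pending", **task})
    return buf.getvalue()
```

Using `csv.DictWriter` rather than string concatenation means descriptions containing commas or quotes are escaped correctly.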
S_ARCHAEOLOGY → S_EXPLORE : complete
S_EXPLORE → S_REVIEW : complete
S_REVIEW → S_FIX : !skip_fix AND findings
S_REVIEW → S_GENERALIZE : skip_fix OR no findings, !skip_gen
S_REVIEW → S_RECORD : both skip
S_FIX → S_CONFIRM : fix implemented
S_CONFIRM → S_GENERALIZE : confirmed, !skip_gen
S_CONFIRM → S_RECORD : confirmed, skip_gen
S_CONFIRM → S_FIX : needs_rework
S_GENERALIZE → S_DISCOVER : hits
S_GENERALIZE → S_RECORD : no hits
S_DISCOVER → S_FIX : fixable sibling
S_DISCOVER → S_REVIEW : new target, loops < max
S_DISCOVER → S_RECORD : done or max_loops
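The transitions above can be read as a single `next_state` function. This is one possible encoding, under stated assumptions: the `Ctx` field names are illustrative, the default `max_loops` value is invented (the source gives no number), "both skip" is read as "nothing to fix and generalization skipped", and a fixable sibling is given priority over a new target in S_DISCOVER.

```python
from dataclasses import dataclass

@dataclass
class Ctx:
    skip_fix: bool = False
    skip_gen: bool = False
    findings: bool = False
    needs_rework: bool = False
    hits: bool = False
    fixable_sibling: bool = False
    new_target: bool = False
    loops: int = 0
    max_loops: int = 2  # assumed default; the source does not state a value

def next_state(state: str, c: Ctx) -> str:
    """Return the successor state for the transition table."""
    if state == "S_ARCHAEOLOGY":
        return "S_EXPLORE"
    if state == "S_EXPLORE":
        return "S_REVIEW"
    if state == "S_REVIEW":
        if not c.skip_fix and c.findings:
            return "S_FIX"
        return "S_RECORD" if c.skip_gen else "S_GENERALIZE"
    if state == "S_FIX":
        return "S_CONFIRM"
    if state == "S_CONFIRM":
        if c.needs_rework:
            return "S_FIX"
        return "S_RECORD" if c.skip_gen else "S_GENERALIZE"
    if state == "S_GENERALIZE":
        return "S_DISCOVER" if c.hits else "S_RECORD"
    if state == "S_DISCOVER":
        if c.fixable_sibling:
            return "S_FIX"
        if c.new_target and c.loops < c.max_loops:
            return "S_REVIEW"
        return "S_RECORD"
    raise ValueError(f"no outgoing transition from {state}")
```

S_RECORD is terminal here, so asking for its successor raises rather than looping.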
A_INTAKE
Parse target + flags → file list. Create SESSION_DIR, derive phase_goals[]. Search prior knowledge. Write session.json + section 1. Call create_goal with phase_goals as success_criteria.
A_ARCHAEOLOGY
spawn_agents_on_csv (Wave 1):
"arch-timeline","Git Timeline","git log --oneline -20 -- {target_files}","archaeology","","","1","pending","","",""
"arch-blame","Git Blame","git blame on key regions of target files","archaeology","","","1","pending","","",""
spawn_agents_on_csv({ csv_path: "tasks.csv", id_column: "id",
instruction: ARCHAEOLOGY_INSTRUCTION + TERMINATION_CONTRACT,
max_concurrency: 2, max_runtime_seconds: 300,
output_csv_path: "wave-1-results.csv", output_schema: SHARED_OUTPUT_SCHEMA })
Merge → evidence (phase: archaeology). CLI delegate --to claude --mode analysis. Update section 2.
A_EXPLORE
CLI delegate --to claude --mode analysis — call chains, error gaps, similar patterns. Write explore.json. Update section 3. Mark G2.
A_REVIEW
spawn_agents_on_csv (Wave 2):
"rev-correct","Correctness","Logic errors, boundary conditions, null/undefined, race conditions","review","correctness","","2","pending","","",""
"rev-security","Security","Injection, XSS, CSRF, data exposure, auth bypass","review","security","","2","pending","","",""
"rev-perf","Performance","Hot paths, N+1, memory leaks, unnecessary recomputation","review","performance","","2","pending","","",""
"rev-arch","Architecture","Layer violations, circular deps, interface contracts, SoC","review","architecture","","2","pending","","",""
spawn_agents_on_csv({ csv_path: "tasks.csv", id_column: "id",
instruction: REVIEW_INSTRUCTION + TERMINATION_CONTRACT,
max_concurrency: 4, max_runtime_seconds: 600,
output_csv_path: "wave-2-results.csv", output_schema: SHARED_OUTPUT_SCHEMA })
Each returns [{title, severity, file, line, description, suggestion, cwe}]. Merge → evidence (review). Write review_result + section 4 (severity matrix). Mark G1.
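Merged findings can be rolled up by severity for section 4 and for the `{C}C {H}H {M}M {L}L` line of the completion summary. This is a sketch of one reading of "severity matrix"; it assumes severities are the lowercase strings used by `--fix-threshold`, and the helper names are illustrative.

```python
from collections import Counter

SEVERITIES = ["critical", "high", "medium", "low"]

def severity_counts(findings: list) -> dict:
    """Count findings per severity, keeping zero entries for empty tiers."""
    counts = Counter(f["severity"] for f in findings)
    return {sev: counts.get(sev, 0) for sev in SEVERITIES}

def summary_line(findings: list) -> str:
    """Format counts in the C/H/M/L shape used by the completion summary."""
    c = severity_counts(findings)
    return f"{c['critical']}C {c['high']}H {c['medium']}M {c['low']}L"
```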
A_FIX
Exhaustive tier loop — descend severity until remaining_actionable == 0:
for tier in [critical, high, medium, low].filter(>= threshold):
  for each unfixed candidate: read +/-20 → fix → evidence (fix)
  re-review modified area: new findings → append, continue (max 2/tier)
  tier done → auto-commit
Normal: request_user_input per tier. -y: auto-fix all.
Remaining > 0 → retry (no max_loops limit). Unchanged 2 rounds → classify each individually.
Blanket "pre-existing" forbidden.
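The `.filter(>= threshold)` step of the tier loop can be made concrete as follows. This is a minimal sketch: `tiers_for_threshold` and `candidates_by_tier` are hypothetical helper names, and a `fixed` flag on each finding is assumed for illustration.

```python
SEVERITY_ORDER = ["critical", "high", "medium", "low"]  # most to least severe

def tiers_for_threshold(threshold: str) -> list:
    """Return the tiers to process, from critical down to the threshold."""
    if threshold == "all":
        return list(SEVERITY_ORDER)
    return SEVERITY_ORDER[: SEVERITY_ORDER.index(threshold) + 1]

def candidates_by_tier(findings: list, threshold: str) -> dict:
    """Group unfixed findings by tier, in descending severity order."""
    grouped = {tier: [] for tier in tiers_for_threshold(threshold)}
    for finding in findings:
        # "fixed" is an assumed field; findings below the threshold are left out.
        if finding["severity"] in grouped and not finding.get("fixed", False):
            grouped[finding["severity"]].append(finding)
    return grouped
```

With `--fix-threshold high`, only the critical and high tiers are visited; the default `all` visits every tier.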
A_CONFIRM
Run tests + CLI delegate zero-residual review (--to claude --mode analysis).
- remaining == 0 AND new == 0 → confirmed, mark G3
- Otherwise → needs_rework → S_FIX
Update confirmation + remaining_actionable + section 5.
A_GENERALIZE
Base shared_actions. Pattern source: findings (severity >= medium).
spawn_agents_on_csv (Wave 3):
"gen-syntax","Syntax Grep","Grep syntax-layer patterns across project","generalization","syntax","","3","pending","","",""
"gen-semantic","Semantic Scan","Check related modules for same anti-patterns","generalization","semantic","","3","pending","","",""
"gen-structural","Structural Match","Find structurally similar files, check for same issues","generalization","structural","","3","pending","","",""
"gen-historical","Historical Grep","git log -S pattern for introduction/fix history","generalization","historical","","3","pending","","",""
spawn_agents_on_csv({ csv_path: "tasks.csv", id_column: "id",
instruction: GENERALIZATION_INSTRUCTION + TERMINATION_CONTRACT,
max_concurrency: 4, max_runtime_seconds: 600,
output_csv_path: "wave-3-results.csv", output_schema: SHARED_OUTPUT_SCHEMA })
Cross-layer dedup: multi-layer hits → boost confidence | single-layer → needs_review | historically fixed → regression_risk.
Iterative deepening: module with >= 3 hits → targeted deep scan (max 1 round).
Mark G4.
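The cross-layer dedup and iterative-deepening rules can be sketched as below. Several details are assumptions: hits are keyed by `(file, line)`, a `historically_fixed` flag marks hits the historical layer found already fixed once, `regression_risk` takes precedence over the other labels, and a "module" is taken to be the file's parent directory.

```python
from collections import Counter, defaultdict
from pathlib import PurePosixPath

def classify_hits(hits: list) -> dict:
    """Label each (file, line) location after merging hits from all layers."""
    by_location = defaultdict(list)
    for hit in hits:
        by_location[(hit["file"], hit["line"])].append(hit)
    labels = {}
    for location, group in by_location.items():
        layers = {hit["layer"] for hit in group}
        if any(hit.get("historically_fixed") for hit in group):
            labels[location] = "regression_risk"
        elif len(layers) > 1:
            labels[location] = "boosted"  # multi-layer hit: boost confidence
        else:
            labels[location] = "needs_review"
    return labels

def modules_for_deep_scan(hits: list, min_hits: int = 3) -> list:
    """Return modules (parent directories) with at least min_hits hits."""
    counts = Counter(str(PurePosixPath(hit["file"]).parent) for hit in hits)
    return sorted(module for module, n in counts.items() if n >= min_hits)
```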
A_DISCOVER
Base shared_actions. Review overrides: cross-phase loop tracking per base.
A_RECORD
Base shared_actions. Learnings per Knowledge Persistence table. Confirmation gate: Before writing spec entries, present proposed entries to user via request_user_input for confirmation. Skip confirmation only if -y flag is set.
Completion summary:
--- REVIEW-TEST-FIX ODYSSEY COMPLETE ---
Target: {target} Dimensions: {dims}
Findings: {C}C {H}H {M}M {L}L Fix: {fixed}, confirmed={yes|skip}
Patterns: {N} ({by_layer}) Scan hits: {total} ({cross} cross-layer)
Issues: {N} Decisions: {N}r/{M}p/{K}d Learnings: {N} Self-iter: {N}x{M}
Goals: {done}/{total} ({skipped} skipped)
---
-y review-specific points
| Decision Point | Normal | -y |
|---|---|---|
| S_FIX tier candidates | request_user_input | auto-fix, deferred |
| S_FIX re-review new findings | request_user_input | auto-append |
| S_CONFIRM needs_rework | Display → S_FIX | auto proceed |
Goal convergence rules
Stop when review_result.remaining_actionable == 0, confirmation == confirmed,
phase_goals_all_done=true. Fix by severity desc, re-review modified areas,
new findings appended. Every finding must have action (fix/issue/decision).
Decision pending must request_user_input.
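The stop condition can be written as a single predicate over session.json and the goal list. This is a sketch, not the tool's implementation: the `status` values `done` and `skipped` on each goal are assumptions, and the check follows the rule literally, so a run without a `confirmed` confirmation does not count as converged.

```python
def converged(session: dict, goals: list) -> bool:
    """True when all three convergence conditions hold."""
    no_remaining = session["review_result"]["remaining_actionable"] == 0
    confirmed = session.get("confirmation") == "confirmed"
    # Assumed goal statuses: a skipped goal counts as done for convergence.
    all_goals_done = all(g["status"] in ("done", "skipped") for g in goals)
    return no_remaining and confirmed and all_goals_done
```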