name: odyssey-debug
description: "Long-running debug cycle — archaeology, diagnosis, fix, confirmation, generalization, discovery, and knowledge persistence"
argument-hint: " [--skip-fix] [--skip-generalize] [--auto] [-y] [-c]"
allowed-tools: spawn_agents_on_csv, Read, Write, Edit, Bash, Glob, Grep, request_user_input
Unlike quality-debug (fast fix), this treats every bug as a learning signal — digs into git history before hypotheses, confirms fixes with CLI review, scans for siblings of the root cause.
Core philosophy:
- Archaeology before hypothesis — look at what changed before guessing why
- Fix one, find many — a single bug reveals a class of bugs
- Decision journal — human-judgment items recorded, not lost
- CLI-assisted review — delegate for second-opinion analysis
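Three-sentence philosophy constraints (exhaustive iteration):
- Zero leftovers: the root cause must be confirmed all the way down, the fix must pass verification, and the generalization scan must be exhaustive
- Exhaustive iteration: a failed hypothesis is not a reason to stop. Broaden scope → switch perspective → upgrade tooling, until the root cause is confirmed or the result is explicitly INCONCLUSIVE
- Improvement becomes the standard: after the fix, re-confirm that the same area has no new problems, and handle every same-class bug that generalization finds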
Flags: --skip-fix analysis-only | --skip-generalize quick fix | --template <name> predefined strategy | --auto no delegate confirmation | -y auto-confirm all decisions | -c resume last session
Session: SESSION_DIR = .workflow/scratch/{YYYYMMDD}-debug-odyssey-{slug}/
Output — 4 files:
SESSION_DIR/
├── session.json # state + confirmation + patterns + phase_goals
├── evidence.ndjson # append-only evidence trail (phase field distinguishes origin)
├── explore.json # structured CLI exploration snapshot
└── understanding.md # evolving narrative — 9 sections
evidence.ndjson — unified trail:
{"ts":"","phase":"archaeology|explore|diagnosis|discovery|decision|self-iteration","type":"","source":"","content":"","note":""}
Phase-specific fields:
- archaeology: sha, author, date, message, relevance (high|medium|low)
- explore: category (call_chain|recent_change|error_gap|similar_pattern), detail
- diagnosis: hypothesis, result (confirmed|disproved|inconclusive)
- discovery: file, line, classification (safe|risk|bug), action
- decision: question, options, context, status (pending|resolved|deferred), resolution
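A minimal sketch of how a record could be appended to the evidence trail, assuming the shared fields and phase names listed above. The helper name `append_evidence` and its keyword-argument handling of phase-specific fields are illustrative assumptions, not part of the skill.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Phase names taken from the unified record format above.
PHASES = {"archaeology", "explore", "diagnosis", "discovery",
          "decision", "self-iteration"}


def append_evidence(path, phase, type_, source, content, note="", **phase_fields):
    """Append one record to evidence.ndjson (hypothetical helper).

    The file is opened in append mode, so earlier lines are never rewritten.
    """
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase}")
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "phase": phase,
        "type": type_,
        "source": source,
        "content": content,
        "note": note,
        **phase_fields,  # e.g. sha/relevance for archaeology, result for diagnosis
    }
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

One JSON object per line keeps the trail greppable by `phase`, which is what the G1 check (`phase=diagnosis result=confirmed`) relies on.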
explore.json schema:
{
"call_chains": [{"entry":"","chain":["file:line"]}],
"recent_changes": [{"file":"","commits":[{"sha":"","message":"","date":""}]}],
"error_gaps": [{"file":"","line":0,"description":""}],
"similar_patterns": [{"file":"","line":0,"description":""}],
"cli_tool": "", "timestamp": ""
}
session.json schema:
{
"session_id": "debug-odyssey-{YYYYMMDD-HHmmss}", "issue": "",
"flags": { "skip_fix": false, "skip_generalize": false, "auto": false, "auto_confirm": false },
"current_state": "S_INTAKE", "diagnosis_retries": 0,
"root_cause": null, "patterns": [], "confirmation": null,
"phase_goals": [], "phase_goals_all_done": false, "self_iteration_log": [],
"cross_phase_loops": 0, "max_loops": 5,
"generalization_stats": null,
"created_at": "", "updated_at": ""
}
phase_goals[] — auto-derived from flags:
| ID | Goal | done_when | phase | skip_when |
|---|---|---|---|---|
| G1 | Root cause identified | evidence.ndjson has phase=diagnosis result=confirmed | S_DIAGNOSE | — |
| G2 | Explore context gathered | explore.json ≥1 category populated | S_EXPLORE | — |
| G3 | Fix applied and confirmed | confirmation.overall == confirmed | S_CONFIRM | skip_fix |
| G4 | Pattern generalized | patterns[] ≥1 entry | S_GENERALIZE | skip_generalize |
| G5 | Discoveries triaged | all scan hits classified | S_DISCOVER | skip_generalize |
| G6 | Learnings persisted | spec entries created OR no actionable learnings | S_RECORD | — |
When flags[skip_when] == true → auto set status: "skipped", completion_confirmed: true.
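The derivation could look roughly like the sketch below. The table rows mirror the goals above; the entry shape (`id`, `goal`, `phase`) and the function name `derive_phase_goals` are assumptions, while `status` and `completion_confirmed` follow the rule just stated.

```python
# (id, goal, owning phase, skip_when flag or None), copied from the goal table.
GOAL_TABLE = [
    ("G1", "Root cause identified", "S_DIAGNOSE", None),
    ("G2", "Explore context gathered", "S_EXPLORE", None),
    ("G3", "Fix applied and confirmed", "S_CONFIRM", "skip_fix"),
    ("G4", "Pattern generalized", "S_GENERALIZE", "skip_generalize"),
    ("G5", "Discoveries triaged", "S_DISCOVER", "skip_generalize"),
    ("G6", "Learnings persisted", "S_RECORD", None),
]


def derive_phase_goals(flags):
    """Build phase_goals[] from session flags (hypothetical helper)."""
    goals = []
    for goal_id, goal, phase, skip_when in GOAL_TABLE:
        # A goal is pre-skipped only when its skip_when flag is set to true.
        skipped = bool(skip_when and flags.get(skip_when, False))
        goals.append({
            "id": goal_id,
            "goal": goal,
            "phase": phase,
            "status": "skipped" if skipped else "pending",
            "completion_confirmed": skipped,
        })
    return goals
```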
understanding.md — 9 sections (written by owning phase):
1. Issue & Scope ← S_INTAKE
2. Archaeology ← S_ARCHAEOLOGY
3. Exploration ← S_EXPLORE
4. Hypotheses ← S_DIAGNOSE
5. Root Cause ← S_DIAGNOSE
6. Fix & Confirmation ← S_FIX+S_CONFIRM
7. Generalization ← S_GENERALIZE
8. Discoveries ← S_DISCOVER
9. Learnings ← S_RECORD
Pre-load (optional; missing sources do not block)
| Layer | Command | Purpose |
|---|---|---|
| Codebase docs | Read .workflow/codebase/ARCHITECTURE.md | Module boundaries |
| Wiki search | maestro search "<issue keywords>" --json | Prior investigations (top 5) |
| Specs + tools | maestro spec load --category debug --keyword "<symptom>" | Known issues/workarounds |
| Role knowledge | maestro search --category debug → pick relevant → maestro wiki load <id> | Domain knowledge |
| Prior sessions | Glob(".workflow/scratch/*-debug-odyssey-*") | Related sessions |
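Knowledge Persistence (written to the output files during S_RECORD)
S_RECORD writes knowledge worth keeping to understanding.md §9 Learnings, structured by category:
| Category | What to write | Suggested follow-up command |
|---|---|---|
| Recurring root-cause pattern | Pattern description + trigger conditions + fix template | $spec-add debug "..." |
| Non-obvious workaround | Problem scenario + solution + scope of applicability | $spec-add learning "..." |
| Architecture boundary violation | Violation description + correct boundary + how to check | $spec-add arch "..." |
| Reusable generalization pattern | Pattern signature + risk notes + fix template | $spec-add coding "..." |
Two-step mode: write to the output files during execution (temporary record) → after the task completes, the user promotes them to permanent knowledge. No external skill is called during execution.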
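| Dimension | sufficient | insufficient |
|---|---|---|
| Coverage | All known relevant files/modules have been analyzed | Misses targets that grep/git log would have found |
| Depth | ≥80% of findings have file:line-level evidence | Most findings are only general descriptions |
| Actionability | Every conclusion has a concrete follow-up action | Only non-actionable "worth watching" conclusions |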
Rules: Phase complete → evaluate 3 dimensions → any insufficient → re-enter (max 3 rounds per phase).
- Round 1: Broaden scope — add directories, git log depth ×2, add delegate angles
- Round 2: Switch perspective — different CLI tool, reverse tracing, manual code reading
- Round 3: Combine both + targeted deep-dive on remaining gaps
Exit: All sufficient → advance | 3-round limit → record gaps and continue. Log to evidence.ndjson + session.json.self_iteration_log[].
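The gate loop above can be sketched as follows, under the assumption that the caller supplies two callbacks: `evaluate()` returning a sufficient/insufficient verdict per dimension, and `rework(strategy, gaps)` re-entering the phase. Both callback names and the strategy labels are hypothetical.

```python
DIMENSIONS = ("coverage", "depth", "actionability")
# One strategy per round, in the order given by the rules above.
ROUND_STRATEGIES = ("broaden_scope", "switch_perspective", "combined_deep_dive")


def run_quality_gate(evaluate, rework):
    """Re-enter a phase until all dimensions are sufficient or 3 rounds are used.

    evaluate() -> {dimension: bool}; rework(strategy, gaps) performs one round.
    """
    log = []
    verdict = evaluate()
    for round_no, strategy in enumerate(ROUND_STRATEGIES, start=1):
        gaps = [d for d in DIMENSIONS if not verdict.get(d, False)]
        if not gaps:
            break  # all sufficient: advance
        rework(strategy, gaps)
        log.append({"round": round_no, "strategy": strategy, "gaps": gaps})
        verdict = evaluate()
    remaining = [d for d in DIMENSIONS if not verdict.get(d, False)]
    # Remaining gaps after the round limit are recorded, not retried.
    return {"remaining_gaps": remaining, "log": log}
```

The returned `log` is the kind of data that would go into `session.json.self_iteration_log[]`; its exact shape here is an assumption.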
Termination Contract (embed in every instruction):
You MUST call report_agent_job_result EXACTLY ONCE before exiting.
Success → result_status=completed | Failure → result_status=failed with error | Timeout → completed with partial.
NEVER continue indefinitely. NEVER exit silently. Read-only — do NOT modify source files.
Do NOT write to tasks.csv, wave-*.csv, results.csv. Do NOT call spawn_agents_on_csv.
tasks.csv
id,title,description,task_type,deps,wave,status,findings,evidence,error
- Wave 1: Archaeology (git-timeline, git-blame) — parallel
- Wave 2: Generalization (syntax-grep, semantic-scan, structural-match, historical-grep) — parallel, depends on root cause
- Single-agent stages (explore, diagnose, fix, confirm) remain inline
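As an illustration of the tasks.csv layout, the sketch below renders rows with the ten columns listed above and filters them by wave. It uses Python's standard `csv` module; the helper names `render_tasks_csv` and `rows_for_wave`, and the default of `status="pending"`, are assumptions for this example.

```python
import csv
import io

TASK_COLUMNS = ["id", "title", "description", "task_type", "deps",
                "wave", "status", "findings", "evidence", "error"]


def render_tasks_csv(tasks):
    """Render task dicts as CSV text; missing columns default to empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=TASK_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for task in tasks:
        row = {col: "" for col in TASK_COLUMNS}
        row["status"] = "pending"  # assumed initial status for new rows
        row.update(task)
        writer.writerow(row)  # fields containing commas are quoted automatically
    return buf.getvalue()


def rows_for_wave(csv_text, wave):
    """Return the parsed rows that belong to one wave."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["wave"] == str(wave)]
```

Letting the writer handle quoting avoids hand-escaping descriptions that contain commas or quotes.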
Stage 1: Intake (S_INTAKE)
- Parse arguments: issue description, flags
- Generate slug, create SESSION_DIR
- Search: maestro search "<keywords>" + Glob prior sessions + ARCHITECTURE.md + Grep keywords
- Derive phase_goals[] from flags (apply skip_when)
- Write session.json + understanding.md §1
- Display Goal Prompt (Appendix), continue without blocking
Resume (-c): Glob latest session → read session.json → restore current_state → jump.
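A rough sketch of the resume lookup, assuming "latest" means the lexicographically last directory name (names start with YYYYMMDD, so this orders by date; same-day sessions would fall back to slug order, which is an assumption). The function names are hypothetical.

```python
import json
from pathlib import Path


def find_latest_session(root=".workflow/scratch"):
    """Return the newest session directory that has a session.json, or None."""
    candidates = sorted(Path(root).glob("*-debug-odyssey-*"))
    candidates = [p for p in candidates if (p / "session.json").is_file()]
    return candidates[-1] if candidates else None


def resume_state(root=".workflow/scratch"):
    """Return (session_dir, current_state) to jump to, or None if nothing to resume."""
    session_dir = find_latest_session(root)
    if session_dir is None:
        return None
    text = (session_dir / "session.json").read_text(encoding="utf-8")
    session = json.loads(text)
    return session_dir, session["current_state"]
```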
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_INTAKE — goal parsing"
Stage 2: Archaeology (S_ARCHAEOLOGY)
Step 1 — Git archaeology (spawn_agents_on_csv, Wave 1):
Write tasks.csv with Wave 1 rows:
id,title,description,task_type,deps,wave,status,findings,evidence,error
"arch-timeline","Git Timeline","Run git log --oneline -20 -- {files}. Return [{sha,date,author,message,files_changed}] as JSON.","archaeology","","1","pending","","",""
"arch-blame","Git Blame","Top 3 suspicious files: git blame -L {region}. Return [{file,line_range,sha,author,date,content}] as JSON.","archaeology","","1","pending","","",""
spawn_agents_on_csv({ csv_path:"tasks.csv", id_column:"id",
instruction: ARCHAEOLOGY_INSTRUCTION + TERMINATION_CONTRACT,
max_concurrency:2, max_runtime_seconds:300,
output_csv_path:"wave-1-results.csv", output_schema: SHARED_OUTPUT_SCHEMA })
Merge results → evidence.ndjson (phase: "archaeology").
Step 2 — CLI change review:
maestro delegate "PURPOSE: Review recent modifications related to: {issue}
TASK: Analyze intent | Identify risky modifications | Flag potential bug sources
MODE: analysis
CONTEXT: @{relevant_files} | Git log: {top_10_commits}
EXPECTED: JSON [{commit_sha, risk_level, analysis, could_cause_issue, explanation}]
CONSTRAINTS: Focus on behavioral changes, not formatting
" --role analyze --mode analysis
Execute with run_in_background: true, then wait for callback (do NOT halt the Odyssey flow). Append results.
Step 3: Update understanding.md §2.
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_ARCHAEOLOGY — archaeology"
Stage 3: Exploration (S_EXPLORE)
Skip if no enabled CLI tools (W006).
maestro delegate "PURPOSE: Gather codebase evidence for: {issue}
TASK: Trace call chains | Find recent changes | Identify error gaps | Check similar patterns
MODE: analysis
CONTEXT: @**/*
EXPECTED: JSON {call_chains, recent_changes, error_gaps, similar_patterns}
CONSTRAINTS: Max 20 entries/category | Symptom-related code paths
Symptoms: {issue} Archaeology hints: {suspicious_commits}
" --role explore --mode analysis
Execute with run_in_background: true, then wait for callback (do NOT halt the Odyssey flow).
Parse → write explore.json + evidence (phase: "explore"). Update §3. Mark G2 done.
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_EXPLORE — exploration"
Stage 4: Diagnosis (S_DIAGNOSE)
- Form hypotheses from evidence (archaeology + explore), ranked [HIGH]/[MEDIUM]/[LOW] → §4
- Test each (rank order): design test → execute → evidence (phase: "diagnosis")
- Decision journal: ambiguity → evidence (phase: "decision"); Normal: request_user_input | -y: defer
- Root cause: confirmed → session.json.root_cause + §5. Mark G1 done.
Escalation (3-strike):
All hypotheses fail → increment diagnosis_retries.
- < 3: broaden scope via maestro delegate --role analyze, form new hypotheses.
- = 3: Normal → request_user_input (broaden/new/INCONCLUSIVE) | -y → auto INCONCLUSIVE, proceed to S_RECORD.
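The 3-strike rule can be sketched as a small decision function. It assumes a session dict shaped like session.json; the return labels (`BROADEN_AND_REHYPOTHESIZE`, `ASK_USER`, `INCONCLUSIVE`) are illustrative names, not identifiers defined by the skill.

```python
MAX_RETRIES = 3


def escalate(session, auto_confirm):
    """Call once every hypothesis in the current batch has failed."""
    session["diagnosis_retries"] += 1
    if session["diagnosis_retries"] < MAX_RETRIES:
        # Strikes 1 and 2: widen the search and form new hypotheses.
        return "BROADEN_AND_REHYPOTHESIZE"
    if auto_confirm:
        # -y mode: stop diagnosing and go straight to the record stage.
        session["current_state"] = "S_RECORD"
        return "INCONCLUSIVE"
    # Normal mode: the user picks broaden / new / INCONCLUSIVE.
    return "ASK_USER"
```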
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_DIAGNOSE — diagnosis"
Stage 5: Fix (S_FIX)
Skip if --skip-fix.
- Present root cause + proposed fix. Normal: request_user_input | -y: auto proceed
- Implement fix
- Record in evidence (phase: "decision")
📌 Auto-commit: git add -A && git commit -m "odyssey-debug({slug}): S_FIX — fix"
Stage 6: Confirmation (S_CONFIRM)
Skip if --skip-fix.
- Tests: auto-detect framework, run covering tests
- CLI fix review:
maestro delegate "PURPOSE: Review fix for: {issue}
TASK: Verify correctness | Check regressions | Assess completeness
MODE: analysis
CONTEXT: @{modified_files} | Root cause: {summary} | Diff: {git_diff}
EXPECTED: JSON {verdict, findings [{severity, description, suggestion}], regression_risk}
CONSTRAINTS: Focus on correctness, not style
" --role review --mode analysis
Execute with run_in_background: true, then wait for callback (do NOT halt the Odyssey flow).
- Write session.json.confirmation: {test_result, cli_review, overall, timestamp}
- Update §6. needs_rework → Stage 5. confirmed → mark G3 done, advance.
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_CONFIRM — confirmation"
Stage 7: Generalization (S_GENERALIZE)
Skip if --skip-generalize. Learn from one case to catch many: extract the pattern, scan for siblings.
Step 1 — Multi-layer pattern extraction:
| Layer | Method | Example |
|---|---|---|
| Syntax | Regex patterns (direct Grep) | eval(, missing await, unclosed resource |
| Semantic | Anti-pattern description (Agent scan) | Unhandled async errors, unvalidated input |
| Structural | Architecture-level similarity | Same import structure, missing override |
Write session.json.patterns[]: [{id, source, layer, signature, description, risk, fix_template}]
Step 2 — 4-agent scan (spawn_agents_on_csv, Wave 2):
Append Wave 2 rows to tasks.csv:
"gen-syntax","Syntax Grep","Grep syntax-layer signatures '${signature}' across project. Return [{file,line,context,risk_level,layer:'syntax',confidence}].","generalization","","2","pending","","",""
"gen-semantic","Semantic Scan","Check related modules for anti-pattern: ${description}. Return [{file,line,context,risk_level,layer:'semantic',confidence}].","generalization","","2","pending","","",""
"gen-structural","Structural Match","Find structurally similar files to ${buggy_files}, check for anti-pattern. Return [{file,line,description,risk,layer:'structural',confidence}].","generalization","","2","pending","","",""
"gen-historical","Historical Grep","Run git log -S '${signature}' --oneline. Return [{sha,file,date,type:'introduced|fixed',context}].","generalization","","2","pending","","",""
spawn_agents_on_csv({ csv_path:"tasks.csv", id_column:"id",
instruction: GENERALIZATION_INSTRUCTION + TERMINATION_CONTRACT,
max_concurrency:4, max_runtime_seconds:300,
output_csv_path:"wave-2-results.csv", output_schema: SHARED_OUTPUT_SCHEMA })
Step 3 — Cross-layer dedup: same file:line multi-layer → boost confidence | single-layer → needs_review | historical fixed → regression_risk
Step 4 — Iterative deepening: module ≥3 hits → targeted deep scan (max 1 round).
Step 5 — Quality Gate (self-iteration).
Step 6: Write §7 + session.json.generalization_stats: {patterns_extracted, total_hits, cross_layer_confirmed, regression_risks, by_layer, deepening_triggered}. Mark G4 done.
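Steps 3 and 4 can be sketched as below. The hit shape follows the Wave 2 return formats; treating a file's parent directory as its "module", counting hits after dedup, and the names `merge_hits` and `modules_needing_deep_scan` are assumptions made for this example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def merge_hits(hits, historical=()):
    """Cross-layer dedup: group hits by file:line and grade each location."""
    # Files where the signature was already fixed once are regression risks.
    fixed_before = {h["file"] for h in historical if h.get("type") == "fixed"}
    grouped = defaultdict(set)
    for hit in hits:
        grouped[(hit["file"], hit["line"])].add(hit["layer"])
    merged = []
    for file, line in sorted(grouped):
        layers = sorted(grouped[(file, line)])
        status = "cross_layer_confirmed" if len(layers) >= 2 else "needs_review"
        merged.append({
            "file": file,
            "line": line,
            "layers": layers,
            "status": status,
            "regression_risk": file in fixed_before,
        })
    return merged


def modules_needing_deep_scan(merged, threshold=3):
    """Iterative deepening trigger: modules with at least `threshold` hits."""
    counts = defaultdict(int)
    for hit in merged:
        counts[str(PurePosixPath(hit["file"]).parent)] += 1
    return sorted(module for module, n in counts.items() if n >= threshold)
```

The counts from `merge_hits` map naturally onto `total_hits`, `cross_layer_confirmed`, and `regression_risks` in `generalization_stats`.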
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_GENERALIZE — generalization"
Stage 8: Discovery (S_DISCOVER)
Skip if no generalization hits.
- Triage each hit: read ±10 lines → classify safe/risk/bug
- Route (explicit action required for each classification):
  - bug + directly fixable → fix immediately (not just log an issue) → back to S_FIX
  - bug + requires cross-module/architectural decision → create issue (with fix suggestion + impact analysis)
  - risk → evaluate if guard/validation can mitigate directly; if yes, fix it
  - safe → mark skip
- See Appendix -y behavior. Append evidence (phase: "discovery" + "decision")
- Cross-phase loop:
  - discovery finds new bug → S_DIAGNOSE (cross_phase_loops++)
  - same-pattern bug + fix template → S_FIX
  - S_DISCOVER → S_RECORD: triage complete AND remaining_actionable == 0
  - S_DISCOVER → S_RECORD: loops >= max_loops → MUST log each unfixed item with specific reason (blanket "pre-existing" is forbidden)
- Update §8. Mark G5 done.
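The routing and cross-phase transitions above might be expressed as two small functions. This is a sketch: the action labels, the `LOG_RISK` fallback for a risk that cannot be mitigated directly (the text does not say what happens then), and checking the loop limit first are all assumptions.

```python
def route_hit(classification, directly_fixable=False, mitigable=False):
    """Map one triaged hit to an action label (labels are illustrative)."""
    if classification == "bug":
        return "FIX_NOW" if directly_fixable else "CREATE_ISSUE"
    if classification == "risk":
        # Assumed fallback: record the risk when no direct guard is possible.
        return "FIX_NOW" if mitigable else "LOG_RISK"
    if classification == "safe":
        return "SKIP"
    raise ValueError(f"unknown classification: {classification}")


def next_state(session, new_bug_found, has_fix_template, remaining_actionable):
    """Pick the state that follows S_DISCOVER for the current findings."""
    if session["cross_phase_loops"] >= session["max_loops"]:
        # Caller must log every unfixed item with a specific reason.
        return "S_RECORD"
    if new_bug_found:
        if has_fix_template:
            return "S_FIX"  # same-pattern bug: reuse the fix template
        session["cross_phase_loops"] += 1
        return "S_DIAGNOSE"
    if remaining_actionable == 0:
        return "S_RECORD"
    return "S_DISCOVER"  # triage not finished yet
```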
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_DISCOVER — discovery"
Stage 9: Record (S_RECORD)
- Finalize understanding.md §9
- Write learnings to understanding.md §9: record by category per the Knowledge Persistence table (temporary); the completion summary lists the suggested follow-up commands
- Mark G6 done. Pending decisions: Normal → request_user_input | -y → show deferred count
- Goal audit: all confirmed → phase_goals_all_done = true. Any false: Normal → request_user_input (roll back/skip/accept) | -y → auto accept
- Completion: current_state = "COMPLETED", emit summary:
--- DEBUG ODYSSEY COMPLETE ---
Issue: {issue}
Root cause: {root_cause.hypothesis}
Fix: {applied|skipped|inconclusive}
Patterns: {patterns_extracted} ({by_layer} distribution)
Scan hits: {total_hits} ({cross_layer_confirmed} cross-layer confirmed)
Issues: {N} created
Decisions: {N} resolved, {M} pending, {K} deferred
Learnings: {N} spec entries persisted
Self-iter: {N} quality gate rounds across {M} stages
Goals: {done}/{total} confirmed ({skipped} skipped)
---
Next steps: $manage-issue list --source debug-odyssey, $learn-decompose <module>,
$quality-review, $learn-second-opinion <understanding.md>, $learn-investigate "<question>"
📌 Auto-commit: git add understanding.md && git commit -m "odyssey-debug({slug}): S_RECORD — summary"
Goal Prompt Template
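Timing guard: display once, only after Stage 1 completes. Re-displaying it when Stage 9 completes is forbidden.
📋 Debug Odyssey session created. You can copy the following /goal at any time to set the termination condition:
/goal Exhaustive iteration: do not stop until the root cause is confirmed (or explicitly INCONCLUSIVE), the fix is verified,
the generalization scan is exhaustive, and phase_goals_all_done=true.
When a hypothesis fails, broaden scope → switch perspective → upgrade tooling; do not give up easily.
Every same-class bug found by generalization is either fixed or filed as an issue; nothing may be left behind.
Any pending item with phase=decision requires request_user_input; do not resolve it yourself.
Odyssey prints the prompt and then continues without blocking. The user may enter /goal at any moment.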
-y Auto-Confirm Behavior
| Decision Point | Normal | -y mode |
|---|---|---|
| Stage 4 ambiguity | request_user_input blocks | record deferred, best-effort continue |
| Stage 4 3-strike | request_user_input 3-way | auto INCONCLUSIVE |
| Stage 5 fix direction | request_user_input confirm | auto proceed |
| Stage 8 bug triage | request_user_input route | auto create issue |
| Stage 8 ambiguous | request_user_input batch | all deferred |
| Stage 9 decisions | request_user_input per-item | skip, show deferred count |
| Stage 9 goal audit | request_user_input 3-way | auto accept current state |
deferred items shown as "pending decision" in summary; recoverable via -c.
Phase Goal Lifecycle
- pending → done (completion_confirmed=true): normal
- pending → skipped (completion_confirmed=true): flags/manual
- pending → failed (completion_confirmed=false): INCONCLUSIVE
phase_goals_all_done = true only when ALL goals have completion_confirmed == true.
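A minimal sketch of the lifecycle and the final audit, assuming each goal is a dict with `id`, `status`, and `completion_confirmed`. The helper names `settle` and `audit_goals` are hypothetical.

```python
# Terminal status -> value of completion_confirmed, per the lifecycle above.
CONFIRMED_BY_STATUS = {"done": True, "skipped": True, "failed": False}


def settle(goal, status):
    """Move a pending goal to a terminal status."""
    if goal["status"] != "pending":
        raise ValueError(f"{goal['id']} is already {goal['status']}")
    goal["status"] = status
    goal["completion_confirmed"] = CONFIRMED_BY_STATUS[status]
    return goal


def audit_goals(phase_goals):
    """Compute phase_goals_all_done and list the goals still unconfirmed."""
    unconfirmed = [g["id"] for g in phase_goals
                   if not g.get("completion_confirmed", False)]
    return {"phase_goals_all_done": not unconfirmed, "unconfirmed": unconfirmed}
```

A failed G1 (INCONCLUSIVE) keeps `phase_goals_all_done` false, which is what forces the Stage 9 goal audit to ask the user or, under -y, accept the current state.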