name: debug description: > Interactive runtime-evidence debugger for ANY language (frontend, backend, CLI, worker, mobile, shell). USE THIS (instead of adding a print/console.log and asking the user to eyeball it) when about to: "add a log and ask user to check", "open DevTools and tell me what you see", "reproduce the bug and share the output", "check the console/terminal output". Triggers: "debug this", "fix this bug", "why isn't this working", "investigate this issue", "trace the problem", "value is null/undefined", "request returns wrong data", "handler never runs", "UI not updating", "panic/exception thrown". Runs a localhost log server, instruments code, asks YOU to reproduce, reads logs directly (no copy-paste), then interactively confirms with you whether the bug is fixed before cleaning up.
Debug — Interactive Runtime-Evidence Debugging (any language)
Fix bugs with runtime evidence, not guesses. You stay in the loop:
Don't guess → Hypothesize → Instrument → [ask you to Reproduce] → Analyze → Fix → [ask you: fixed?] → Verify
This skill runs on the main thread, so it can ask you questions directly (via
AskUserQuestion) at the two human-in-the-loop gates: reproduce and is it
fixed?. The mechanism is language-agnostic: a localhost HTTP log server receives
NDJSON from instrumented code — browser, Go/Rust/Python/Node service, CLI, shell
via curl — and the logs are read directly from a file.
Hard rules
- Never fix without runtime evidence. Get values from a real run first.
- Never guess. Unsure ⇒ add more instrumentation and reproduce again.
- Never ask the user to copy-paste console output. Capture server-side,
catthe log file. - Never remove instrumentation before the user confirms the fix. Keep
#region debugblocks until verified. - Tag every log with the hypothesis it tests.
Locate the bundled scripts
# Searches common install locations across agents, then the cwd as fallback.
# -L follows symlinks (e.g. a global ~/.claude/skills/debug → repo checkout).
SCRIPTS="$(dirname "$(find -L \
~/.claude/skills/debug .claude/skills/debug \
~/.config/opencode/skill/debug ~/.codex/skills/debug ~/.cursor/skills/debug . \
-path '*debug/scripts/debug_server.js' 2>/dev/null | head -1)")"
Workflow
Phase 1 — Start the log server (no-op if already running)
node "$SCRIPTS/debug_server.js" /path/to/project &
Env: DEBUG_PORT (default 8787), DEBUG_LOG_DIR (default .debug).
Port busy: lsof -ti :8787 | xargs kill -9 then restart.
Phase 2 — Create a session, save the session_id
curl -s -X POST http://localhost:8787/session -d '{"name":"short-bug-desc"}'
Phase 3 — Hypothesize
3–5 hypotheses, each specific, testable from one log line, covering different subsystems.
Phase 4 — Instrument
3–8 points (entry/exit, before/after critical ops, each branch), each tagged with
its hypothesisId, wrapped in #region debug … #endregion. High-frequency
events: log only on state change. Recipes:
Universal — POST NDJSON via native HTTP client or curl:
{"sessionId":"<id>","msg":"entry","data":{"user_id":null},"hypothesisId":"H1"} → http://localhost:8787/log
Shell: dbg(){ curl -s -X POST http://localhost:8787/log -d "{\"sessionId\":\"$SID\",\"msg\":\"$1\",\"data\":$2,\"hypothesisId\":\"$3\"}" >/dev/null; }
JS/TS: fetch('http://localhost:8787/log',{method:'POST',body:JSON.stringify({sessionId:SID,msg,data,hypothesisId:h})}) (or navigator.sendBeacon)
Python: requests.post('http://localhost:8787/log',json={'sessionId':SID,'msg':msg,'data':data,'hypothesisId':h},timeout=0.5)
Go: http.Post("http://localhost:8787/log","application/json",bytes.NewReader(jsonBytes))
Rust: reqwest::blocking::Client::new().post("http://localhost:8787/log").json(&v).send()
Browser blockers (backend/CLI reach localhost directly): mixed-content, CSP
connect-src, CORS preflight → use a same-origin dev-server proxy or sendBeacon;
Chrome-extension content scripts relay via the background service worker.
Phase 5 — Clear logs, then ASK the user to reproduce
node "$SCRIPTS/debug_cleanup.js" clear /path/to/project $SESSION_ID
Give precise reproduction steps (start command, route/args, action, expected vs
observed). Then call AskUserQuestion — e.g. header "Reproduce", options:
"Done — reproduced it", "Couldn't reproduce", "App won't start". Do NOT read logs
until the user confirms they ran the steps.
Phase 6 — Analyze
cat /path/to/project/.debug/debug-$SESSION_ID.log
Mark each hypothesis CONFIRMED / REJECTED / INCONCLUSIVE with the log line as evidence. All rejected/inconclusive ⇒ new hypotheses, more instrumentation, repeat Phase 5. Report the verdicts to the user.
Phase 7 — Fix
Fix only the confirmed root cause. Keep instrumentation. Tag verification logs runId:"post-fix".
Phase 8 — Verify, then ASK the user: is it fixed?
Clear logs, give the same reproduction steps, have the user run them again, read
the post-fix logs, and show a before/after value comparison. Then call
AskUserQuestion — header "Fixed?", options:
- "Fixed" — bug gone, behavior correct → go to Phase 9.
- "Still broken" — → back to Phase 3 with new hypotheses; iterate.
- "Partially / new issue" — → treat the residual as a fresh hypothesis set; iterate.
Do not declare success yourself — the user's answer decides.
Phase 9 — Clean up (only after the user answered "Fixed")
grep -rn "#region debug" and remove every block. Confirm removal to the user.
Five Whys (optional)
For recurring/prod/security bugs, ask "why did this exist?" → CODE (fix) / TEST (add test) / PROCESS (review/checklist) / SYSTEMIC (document pattern).
Log format
NDJSON, one entry/line: {"ts":"…","msg":"entry","data":{"id":5},"hypothesisId":"H1","loc":"app.go:42"}
Checklist
- Server running · session created,
session_idsaved - 3–5 hypotheses · 3–8 tagged logs
- Logs cleared · user asked to reproduce (AskUserQuestion)
- Each hypothesis CONFIRMED/REJECTED/INCONCLUSIVE with evidence
- Fix on confirmed root cause only · before/after compared
- User asked "fixed?" (AskUserQuestion) — their answer decides
- Instrumentation removed only after "Fixed"