systematic-debugging

star 0

Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes

rryando By rryando schedule Updated 6/11/2026

name: systematic-debugging description: Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes

Skill: systematic-debugging

When

Any bug, test failure, or unexpected behavior — before proposing fixes.

Follows ARCS CLI Primer: arcs --commands --json for discovery, --json --lean on all calls.

Flow

flowchart TD
    classDef decision fill:#f59e0b,color:#fff
    classDef stop fill:#ef4444,color:#fff

    Bug[Bug observed] --> ARCS[Check ARCS knowledge]
    ARCS --> Found{Match found?}
    Found -->|Yes| Verify[Verify it applies]
    Found -->|No| Observe

    Verify -->|Applies| Fix
    Verify -->|Doesn't apply| Observe

    Observe[Phase 1: Observe] --> Repro{Reproducible?}
    Repro -->|No| Instrument[Add logging/tracing]
    Instrument --> Observe
    Repro -->|Yes| Hypothesize[Phase 2: Hypothesize]

    Hypothesize --> Compare[Find working example, list differences]
    Compare --> Theory[Form single specific hypothesis]

    Theory --> Isolate[Phase 3: Isolate]
    Isolate --> Test{Smallest change confirms?}
    Test -->|Yes| Fix[Phase 4: Fix]
    Test -->|No| FailCount{3+ failures?}
    FailCount -->|No| Theory
    FailCount -->|Yes| Arch[Question architecture]

    Fix --> WriteFail[Write failing test]
    WriteFail --> Implement[Single targeted fix]
    Implement --> Green{Scoped tests pass?}
    Green -->|Yes| Capture[Capture as ARCS knowledge]
    Green -->|No| FailCount

    class Found,Repro,Test,FailCount,Green decision
    class Arch stop

Phase 1: Observe (Root Cause Investigation)

  • Read the actual error message completely
  • Reproduce consistently before proceeding
  • Check recent changes (git log, git diff)
  • Trace data flow backward from failure point
  • Instrument component boundaries if cause unclear
  • Pre-step: arcs knowledge search <slug> "<error>" --json for gotcha/lesson/pattern entries

Phase 2: Hypothesize (Pattern Analysis)

  • Find a working example in the same codebase
  • Compare working vs broken — list every difference
  • Understand the dependency chain
  • Form ONE specific hypothesis (not multiple)

Phase 3: Isolate

  • Test with the smallest possible change
  • One variable at a time — never stack fixes
  • If hypothesis fails, form a new one from evidence
  • Escalation: 3+ failed fixes → question the architecture, not the symptom

Phase 4: Fix

  • Write a failing test FIRST (proves the bug exists)
  • Implement a single targeted fix
  • Verify the scoped tests for the files you changed pass (your dispatch VERIFY command — never the full suite; the devil-advocate completion gate owns that)
  • If your fix introduces new failures in YOUR scoped tests, revert and return to Phase 2. Failures in files outside your scope are report-only (BLOCKED_BY) — likely a sibling agent's in-flight work; never fix or revert it

Log Triage Protocol

Scan order: failure point → errors → warnings → timing anomalies

rg -n "ERROR|FATAL|panic|exception" <logfile>   # Error grep
jq 'select(.level == "error")' <json-log>       # Structured logs

Output: Timeline of events leading to failure (T-5m, T-3m, T-0).

Git Bisect (Regressions)

git bisect start
git bisect bad HEAD
git bisect good <last-known-good>
git bisect run <test-command>

After finding the commit: read the diff, isolate specific lines, feed into Phase 2.

Dependency Conflict Diagnosis

Symptom Likely Cause
instanceof fails across modules Duplicate package copies
Type mismatch on same interface Different versions loaded
"Cannot find module" intermittent Hoisting conflict
Works with --legacy-peer-deps Peer dep unsatisfied

Diagnose: npm ls <pkg>, npm explain <pkg>, check for multiple copies.

ARCS Knowledge Capture

After root cause identified, persist as knowledge:

  • gotcha — environmental/config traps
  • lesson — architectural insights from this session
  • pattern — reusable solution to recurring problem

Include: root cause summary, evidence, affected files, fix approach.

Capture Resolution as Knowledge

After resolving the issue, persist the learning:

# For a surprising behavior or trap
arcs knowledge create <slug> "Redis connection pool exhaustion under load" \
  --kind=gotcha \
  --summary="Pool size defaults to 10; under concurrent requests >50, connections time out silently" \
  --body="Root cause: default pool size. Fix: set poolSize to max(50, expectedConcurrency). Symptoms: intermittent 503s with no error logs." \
  --json

# For a reusable debugging technique or resolution pattern
arcs knowledge create <slug> "Diagnosing silent connection failures" \
  --kind=lesson \
  --summary="Enable connection-level event logging before load testing" \
  --body="Attach listeners to pool 'error' and 'timeout' events. Default Node.js behavior swallows these." \
  --json

# For a pattern that should be followed going forward
arcs knowledge create <slug> "Connection pool sizing formula" \
  --kind=pattern \
  --summary="Pool size = max(50, 2x expected peak concurrency)" \
  --body="Applies to Redis, Postgres, and HTTP agent pools. Validated under load test 2026-05-26." \
  --json

Kind selection guide:

  • gotcha — surprising behavior, trap, or non-obvious failure mode
  • lesson — learned technique, debugging approach, resolution method
  • pattern — reusable solution that should be applied going forward

Constraints

  • NO FIXES WITHOUT ROOT CAUSE INVESTIGATION. If Phase 1 incomplete, you cannot propose fixes.
  • One variable at a time. Never apply multiple changes simultaneously.
  • 3+ failures = architectural problem. Stop fixing symptoms, question the pattern.
  • Test before fix. Failing test proves the bug; green test proves the fix.
  • Defense in depth: After fixing root cause, add validation at multiple layers to prevent recurrence.
  • Systematic is faster than thrashing. 15-30min systematic vs 2-3h random fixes.

Red Flags (Return to Phase 1)

  • "Quick fix for now, investigate later"
  • "Just try changing X and see"
  • Proposing solutions before tracing data flow
  • Each fix reveals a new problem in a different place
  • "I don't fully understand but this might work"
  • Human says "stop guessing" or "is that not happening?"
Install via CLI
npx skills add https://github.com/rryando/arcs --skill systematic-debugging
Repository Details
star Stars 0
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator