name: troubleshooting-decision-tree-skill description: "Routes to appropriate troubleshooting patterns based on issue category. Use when: (1) encountering connection errors (ECONNREFUSED, timeout, auth failures), (2) debugging integration failures (MCP, API, database), (3) system behavior differs from expectations. Verified: 2025-12-14."
Troubleshooting Decision Tree
Purpose: Route to the right solution fast Success Rate: 84% issue resolution Time Saved: 10-30 min per debug session
Usage Scenarios
(1) When encountering connection errors
- Database: "ECONNREFUSED", "authentication failed", "timeout"
- API: "404 Not Found", "401 Unauthorized", "timeout"
- MCP: "Server not responding", "command not found"
(2) When debugging integration failures
- MCP servers not connecting
- Skills not activating
- Memory bank not loading
- Hooks not running
(3) When system behavior differs from expectations
- Feature returns wrong data
- Tests passing locally but failing in CI
- Deployment succeeded but changes not visible
Failed Attempts
| Attempt | Why It Failed | Lesson Learned |
|---|---|---|
| Manual grep search through entire codebase | Too slow (20-30 min), missed edge cases | Use decision tree to narrow domain first |
| Asking generic "how to fix?" | Too vague, no actionable guidance | Identify error category before asking |
| Checking random Stack Overflow posts | Not project-specific, wasted time | Check project patterns first (CORE-PATTERNS.md) |
Quick Start (< 5 min)
Step 1: Identify Error Category
# Connection errors
grep -i "ECONNREFUSED\|timeout\|authentication failed" [log_file]
# Integration errors
grep -i "not found\|undefined\|cannot read" [log_file]
# Behavior errors
# No grep - compare expected vs actual output
Step 2: Check Relevant Pattern
Connection Errors → Check CORE-PATTERNS.md for:
- Database connection patterns
- API endpoint configuration
- Environment variable setup
Integration Errors → Check your skills for:
- database-credentials-validation-skill (if exists)
- api-integration-skill (if exists)
- mcp-troubleshooting-skill (if exists)
Behavior Errors → Check system-status.json:
- Is this feature marked
passes: true? - Are there known issues in
active_blockers? - Check
recent_fixesfor similar problems
Step 3: Apply Solution
Follow the pattern from CORE-PATTERNS.md or relevant skill.
Step 4: Document if New
If this was a new issue:
- Add to
recent_fixesin system-status.json - Consider creating a skill if it might recur
- Update CORE-PATTERNS.md if it's a core pattern
Decision Tree
Error Occurred
↓
What category?
↓
┌───────────┬────────────────┬──────────────┐
│ Connection│ Integration │ Behavior │
└───────────┴────────────────┴──────────────┘
↓ ↓ ↓
Check Check Check
Database MCP config Feature
patterns Skills system status.json
API config Hook execution Recent fixes
Env vars Dependencies Test results
Detailed Routing
Connection Errors
DATABASE_CONNECTION:
Errors: "ECONNREFUSED", "authentication failed for user X"
Check_First:
- CORE-PATTERNS.md DATABASE_SAFETY section
- Environment variables (DB_HOST, DB_PORT, DB_USER, DB_PASSWORD)
- Database is running: `psql -h localhost -p 5432 -U user -c "SELECT 1"`
API_CONNECTION:
Errors: "ECONNREFUSED", "getaddrinfo ENOTFOUND", "timeout"
Check_First:
- CORE-PATTERNS.md API_INTEGRATION section
- API endpoint configuration
- Network connectivity: `curl -v $API_URL`
MCP_CONNECTION:
Errors: "Server not responding", "command not found"
Check_First:
- .claude/mcp_servers.json validity: `jq empty .claude/mcp_servers.json`
- Command availability: `which npx`, `which node`
- Run: `./scripts/check-mcp.sh` (if available)
Integration Errors
SKILLS_NOT_ACTIVATING:
Check:
- Skills in ~/.claude/skills/ (not project directory)
- YAML frontmatter valid
- Hook executable: `ls -l .claude/hooks/pre-prompt.sh`
MEMORY_BANK_NOT_LOADING:
Check:
- Files in memory-bank/always/
- Prefixed with @ in CLAUDE.md
- Valid markdown syntax
HOOKS_NOT_RUNNING:
Check:
- Hook is executable: `chmod +x .claude/hooks/pre-prompt.sh`
- settings.local.json has hook enabled
- Hook script has no syntax errors
Behavior Errors
WRONG_DATA_RETURNED:
Check:
- system-status.json: Is feature marked passes: true?
- CORE-PATTERNS.md: Are you following the correct pattern?
- Recent_fixes: Has this been fixed before?
TESTS_FAILING:
Check:
- Environment: Are you in correct environment?
- Dependencies: `npm install` or equivalent
- Database: Is test database seeded?
DEPLOYMENT_CHANGES_NOT_VISIBLE:
Check:
- Cache: Clear browser cache (Ctrl+Shift+R)
- Service: Is correct version deployed?
- Traffic: Is traffic routed to latest revision?
Evidence
Created: 2025-12-14 Source: Based on production troubleshooting-workflow-skill Success Rate: 84% issue resolution with decision tree Time Savings: 10-30 min per debug session (vs random searching) Usage Frequency: Used in 90%+ of debugging sessions
Integration
Works With:
- CORE-PATTERNS.md (checks patterns first)
- system-status.json (checks feature status)
- Other troubleshooting skills (routes to specific skills)
- MCP validation scripts (./scripts/check-mcp.sh)
Next Steps After Using:
- Document solution in system-status.json recent_fixes
- Consider creating specific skill if problem might recur
- Update CORE-PATTERNS.md if new pattern discovered
Success Criteria
You've mastered this skill when:
- Can identify error category in < 1 min
- Know where to look for solution (pattern vs skill vs status)
- Resolve 80%+ of issues without external search
- Document new solutions for team reuse