name: profile-memory description: > Profile memory usage in sidecar using Go pprof, system tools, and heap analysis. Covers identifying memory leaks, goroutine leaks, file descriptor accumulation, and CPU profiling. Use when investigating memory issues, profiling performance, debugging memory leaks, or diagnosing unresponsive plugins. disable-model-invocation: true
Memory Profiling for Sidecar
Quick Triage
| Symptom | Tool | Action |
|---|---|---|
| High RSS / memory growth | vmmap, pprof heap | Check system memory, then heap profile |
| Too many open files | lsof | Check FD count and breakdown |
| High CPU | pprof cpu, ps | Capture CPU profile |
| Goroutine leak | pprof goroutines | Check goroutine count and stacks |
| Plugin unresponsive | lsof + goroutines | Check SQLite locks, blocked goroutines |
Triage flow: Is RSS high? -> Check FD count -> Check vmmap -> Check heap profile
System-Level Profiling (No pprof Required)
Find the Process
pgrep -f sidecar
ps aux | grep sidecar
Basic Stats
# RSS, VSZ, CPU%, thread count
ps -o pid,rss,vsz,%cpu,nlwp -p <PID>
# Human-readable RSS
ps -o pid,rss -p <PID> | awk 'NR>1{printf "%d MB\n", $2/1024}'
# Watch over time
while true; do ps -o rss,%cpu -p <PID> | tail -1; sleep 5; done
System Memory
macOS (vmmap):
vmmap --summary <PID>
Key sections: VM_ALLOCATE (Go heap), MALLOC (C heap/SQLite), Physical footprint, Swapped.
Red flags:
- RESIDENT SIZE >500MB for idle sidecar
- REGION COUNT in thousands for VM_ALLOCATE = mmap leak
- SWAPPED SIZE growing = memory pressure
Linux:
cat /proc/<PID>/status | grep -E 'VmRSS|VmSize|Threads'
pmap -x <PID> | tail -5
ls /proc/<PID>/fd | wc -l
File Descriptor Analysis
# Count and breakdown
lsof -p <PID> | wc -l
lsof -p <PID> | awk '{print $5}' | sort | uniq -c | sort -rn
# Find leaked files
lsof -p <PID> | grep REG | awk '{print $9}' | sort | uniq -c | sort -rn | head -20
# Check session file leaks
lsof -p <PID> | grep -c '\.claude/projects'
lsof -p <PID> | grep -c '\.codex/sessions'
# Watch FD count
while true; do echo "$(date): $(lsof -p <PID> 2>/dev/null | wc -l) FDs"; sleep 30; done
Healthy baselines: Total FDs 50-150, REG files 10-30, PIPEs 10-30, DIRs 5-15.
Red flags: 1000+ total FDs, same file opened 4+ times, growing count over time.
Thread Count
# macOS
ps -M -p <PID> | wc -l
# Linux
ls /proc/<PID>/task | wc -l
# Expected: 20-60 threads. 100+ = goroutine leak likely
Go pprof (Runtime Profiling)
Enable pprof
SIDECAR_PPROF=1 sidecar # Default port 6060
SIDECAR_PPROF=6061 sidecar # Custom port
Heap Profile (Current Allocations)
curl http://localhost:6060/debug/pprof/heap > heap.prof
go tool pprof -top heap.prof
go tool pprof heap.prof # Interactive: top20, list <func>, web
Allocs Profile (All Allocations Since Start)
curl http://localhost:6060/debug/pprof/allocs > allocs.prof
go tool pprof -top allocs.prof
Goroutine Profile
# Count
curl -s http://localhost:6060/debug/pprof/goroutine?debug=1 | head -1
# Full stacks
curl http://localhost:6060/debug/pprof/goroutine?debug=2 > goroutines.txt
# Find stuck goroutines
grep -A5 'runtime.chanrecv' goroutines.txt
Memory Stats
curl http://localhost:6060/debug/pprof/heap?debug=1 | head -30
Compare Snapshots (Best for Finding Leaks)
curl http://localhost:6060/debug/pprof/heap > heap1.prof
# Wait (1 hour or overnight)
curl http://localhost:6060/debug/pprof/heap > heap2.prof
go tool pprof -base heap1.prof heap2.prof
# Then: top20
Continuous Monitoring
./scripts/mem-monitor.sh # Default: port 6060, 60s interval
./scripts/mem-monitor.sh 6061 30 # Custom port and interval
Output: CSV time,heap_alloc_bytes,heap_inuse_bytes,goroutines,rss_mb to mem-YYYYMMDD-HHMMSS.log.
CPU Profile
curl http://localhost:6060/debug/pprof/profile?seconds=30 > cpu.prof
go tool pprof -top cpu.prof
go tool pprof cpu.prof # Interactive: top20, list <func>, web
Web UI
go tool pprof -http=:8080 http://localhost:6060/debug/pprof/heap
Requires Graphviz (brew install graphviz / apt install graphviz) for flame graphs.
What to Look For
Goroutine leaks: Count should stabilize at 20-50 after startup. Steady growth = leak. Search for runtime.chanrecv1, runtime.chansend1, time.Sleep.
Heap growth: HeapAlloc should stabilize after loading sessions. Consistent upward trend = leak.
Common pprof signatures: bufio.Scanner (buffer not returned), json.Unmarshal (large objects retained), append in loops (slice growing), channel operations (blocked senders/receivers).
Diagnosing Unresponsive Plugins
- Check goroutines for blocked operations:
curl -s http://localhost:6060/debug/pprof/goroutine?debug=2 | grep -A10 'monitor\|td' - Check SQLite locks:
lsof -p <PID> | grep '\.db' - Check stuck tea.Cmd goroutines:
curl -s http://localhost:6060/debug/pprof/goroutine?debug=2 | grep -B2 'fetchData\|FetchData' - Check for swap pressure:
vmmap --summary <PID> | grep -E 'Physical|Swapped'
Common causes: SQLite locked by concurrent td CLI, memory pressure causing swap thrashing, goroutine blocked on unread channel, accumulating fetchData() goroutines.
Known Leak Patterns
File Descriptor Accumulation (Adapter/Watcher Layer)
Files: internal/adapter/claudecode/watcher.go etc. Uses fsnotify to watch directories.
Symptoms: RSS grows to 5-15GB overnight, lsof shows 1000+ session files open.
lsof -p <PID> | grep '\.claude/projects' | wc -l
lsof -p <PID> | grep '\.codex/sessions' | wc -l
OutputBuffer Substring Retention
OutputBuffer.Update() uses strings.Split() creating substrings sharing backing array. NOT a leak if Update() keeps being called. Only retains if polling stops.
Poll Chain Duplication
Entering interactive mode without incrementing generation counter creates parallel poll chains. Doubles CPU but no memory leak. Fix: increment shellPollGeneration or pollGeneration on entry.
Go MADV_FREE (macOS)
Go runtime uses MADV_FREE marking pages reusable without returning to OS. RSS appears high but memory is available. NOT a leak. Use vmmap --summary and compare "Physical footprint (peak)" vs current.
Investigation Workflow
ps -o pid,rss,%cpu -p <PID>-- establish baselinelsof -p <PID> | wc -l-- if >200, focus on FD leaklsof -p <PID> | grep REG | awk '{print $9}' | sort | uniq -c | sort -rn | head -20vmmap --summary <PID>-- RESIDENT vs VIRTUAL, region count- If pprof available: heap snapshot comparison over time
- If pprof unavailable: restart with
SIDECAR_PPROF=1, reproduce the leak
Overnight Test Procedure
- Start:
SIDECAR_PPROF=1 sidecar - Baseline:
curl .../heap > baseline.prof - Note goroutine count
- Leave overnight (
caffeinate -ion macOS) - Morning: capture new profiles
- Compare:
go tool pprof -base baseline.prof morning.prof - Check goroutines for stuck ones
Notes
- pprof adds ~1-2MB overhead, no TUI performance impact
- HTTP server runs in separate goroutine, localhost only
- On macOS, vmmap/lsof need no special permissions for own processes