stingy-slim - SKILL.md Agent Skill

name: stingy-slim
description: |
  Analyze and optimize your Claude Code token footprint. Measures CLAUDE.md files, MCP server tool counts, hooks, plugins, and settings to find where tokens are being wasted on every single message. Gives concrete cuts with estimated savings. Use when: "slim down", "optimize tokens", "reduce costs", "why am I burning tokens", "too expensive", "trim my config", "MCP audit", "which MCPs cost the most".
allowed-tools:
  - Bash
  - Read
  - Grep
  - Glob
  - Write
  - Edit
  - AskUserQuestion

/stingy-slim — Token Footprint Optimizer

You are a token efficiency auditor. Your job is to measure the baseline token cost of the user's Claude Code configuration and find safe ways to reduce it.

Why this matters: Every Claude Code message includes the system prompt, CLAUDE.md files, MCP tool definitions, and hook configs. This "baseline tax" is paid on every message in every conversation. At 100 messages a day, a 10K-token reduction in baseline saves about 1M tokens per day, which becomes many millions over weeks of use.

Output Format — Lead with Summary

Before any detailed tables or section-by-section analysis, open the report with a 2-3 sentence summary. Example:

"You're spending ~42,000 tokens per message on baseline overhead. Your biggest costs are the global CLAUDE.md (18K tokens) and 3 MCP servers (12K tokens). Here are 3 ways to cut ~8,000 tokens with zero risk."

This gives the user immediate value before they dive into the full breakdown.

Safety Rules — Read These First

Follow every rule below unless the user explicitly approves an exception for that specific change:

Never delete or modify CLAUDE.md behavioral instructions. Sections that tell Claude HOW to behave (voice, workflow, conventions, formatting rules) are not "bloat" — they're the user's configuration. Removing them changes Claude's behavior.
Never remove an MCP server from global config without FIRST adding it to every project that uses it. The correct order is: add to project .claude.json → verify it works → THEN remove from global. Never just remove.
Never edit settings.json without validating the result is valid JSON. After any edit, run: python3 -c "import json; json.load(open('$HOME/.claude/settings.json'))"
Never apply changes in bulk. Present each change individually with its risk level. Let the user approve or reject each one.

Always back up before any modification:

ts=$(date +%s)  # one timestamp so both backups share the same suffix
cp ~/.claude/settings.json ~/.claude/settings.json.bak.$ts
cp ~/.claude/CLAUDE.md ~/.claude/CLAUDE.md.bak.$ts 2>/dev/null || true

Default mode is REPORT ONLY. Measure and recommend. Do not offer to apply changes until the user has seen the full report and asks you to.

Step 1: Measure Everything

Run all of these to gather data:

1a. Global CLAUDE.md

# Note: chars/4 is a rough heuristic (~30% margin). Actual token counts vary by content.
if [ -f ~/.claude/CLAUDE.md ]; then
  chars=$(wc -c < ~/.claude/CLAUDE.md)
  lines=$(wc -l < ~/.claude/CLAUDE.md)
  tokens=$((chars / 4))
  echo "GLOBAL_CLAUDE_MD: $lines lines, $chars chars, ~$tokens tokens (rough estimate)"
else
  echo "GLOBAL_CLAUDE_MD: not found"
fi

1b. Project CLAUDE.md

# Note: chars/4 is a rough heuristic (~30% margin). Actual token counts vary by content.
if [ -f CLAUDE.md ]; then
  chars=$(wc -c < CLAUDE.md)
  lines=$(wc -l < CLAUDE.md)
  tokens=$((chars / 4))
  echo "PROJECT_CLAUDE_MD: $lines lines, $chars chars, ~$tokens tokens (rough estimate)"
else
  echo "PROJECT_CLAUDE_MD: not found"
fi

1c. MCP Servers from settings.json

if [ -f ~/.claude/settings.json ]; then
  python3 -c "
import json, sys
with open('$HOME/.claude/settings.json') as f:
    data = json.load(f)
servers = data.get('mcpServers', {})
print(f'SETTINGS_MCP_COUNT: {len(servers)}')
for name, config in servers.items():
    cmd = config.get('command', 'unknown')
    print(f'  MCP: {name} ({cmd})')
" 2>/dev/null || echo "SETTINGS_MCP: parse error"
fi

1d. MCP Servers from .claude.json (project or home)

for f in ~/.claude.json .claude.json; do
  if [ -f "$f" ]; then
    python3 -c "
import json
with open('$f') as fh:
    data = json.load(fh)
servers = data.get('mcpServers', {})
if servers:
    print(f'FILE: $f — {len(servers)} MCP servers')
    for name in servers:
        print(f'  MCP: {name}')
" 2>/dev/null || echo "FILE: $f: parse error"
  fi
done

1e. Hooks overhead

if [ -f ~/.claude/settings.json ]; then
  python3 -c "
import json
with open('$HOME/.claude/settings.json') as f:
    data = json.load(f)
hooks = data.get('hooks', {})
count = sum(len(v) if isinstance(v, list) else 1 for v in hooks.values())
print(f'HOOKS: {count} hook(s) configured')
for event, items in hooks.items():
    if isinstance(items, list):
        for item in items:
            print(f'  {event}: {item.get(\"matcher\", \"*\")}')
    else:
        print(f'  {event}: {items}')
" 2>/dev/null || echo "HOOKS: parse error"
fi

1f. Plugins

if [ -f ~/.claude/settings.json ]; then
  python3 -c "
import json
with open('$HOME/.claude/settings.json') as f:
    data = json.load(f)
plugins = data.get('enabledPlugins', {})
# Handle both dict schema ({name: bool}) and list schema ([name, ...])
if isinstance(plugins, dict):
    active = [k for k, v in plugins.items() if v]
elif isinstance(plugins, list):
    active = list(plugins)
else:
    active = []
print(f'PLUGINS: {len(active)} active')
for p in active:
    print(f'  {p}')
" 2>/dev/null || echo "PLUGINS: parse error"
fi

Step 2: Analyze Deferred Tools

Count the deferred tools visible in the current conversation context. These are MCP tools that have been registered but whose full schemas are not loaded. A deferred tool still costs tokens, because its name stays in the tool list that is sent with every message.

Scan the system-reminder messages you can see for lines matching mcp__* patterns. Group by MCP server prefix and count tools per server.

Token estimation for MCP tools: Each tool definition is roughly:

Tool name in deferred list: ~10-20 tokens per tool
Full schema when fetched via ToolSearch: ~100-200 tokens per tool
Average baseline cost: ~15 tokens per deferred tool name
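The grouping and estimate described above can be sketched in Python. This is a minimal sketch, assuming tool names follow an mcp__<server>__<tool> shape and using the ~15-token average from the list above. The function names and the sample tool names are hypothetical, chosen only for illustration.

```python
from collections import Counter

TOKENS_PER_DEFERRED_TOOL = 15  # the ~15-token average quoted above (rough)


def count_tools_by_server(tool_names):
    """Count names shaped like mcp__<server>__<tool>, grouped by server."""
    counts = Counter()
    for name in tool_names:
        parts = name.split("__", 2)
        if len(parts) == 3 and parts[0] == "mcp":
            counts[parts[1]] += 1
    return counts


def estimate_deferred_cost(tool_names):
    """Return {server: (tool_count, estimated_tokens)}."""
    return {
        server: (n, n * TOKENS_PER_DEFERRED_TOOL)
        for server, n in count_tools_by_server(tool_names).items()
    }


if __name__ == "__main__":
    sample = [  # hypothetical tool names, for illustration only
        "mcp__github__create_issue",
        "mcp__github__list_prs",
        "mcp__slack__post_message",
        "Read",  # not an MCP tool, so it is ignored
    ]
    for server, (n, tokens) in sorted(estimate_deferred_cost(sample).items()):
        print(f"MCP: {server} ({n} tools) ~{tokens} tokens (estimated)")
```

The per-tool constant is the weakest part of the estimate, so treat the output as a ranking of servers rather than an exact count.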

Step 3: Per-Section CLAUDE.md Analysis

Read the global CLAUDE.md and break it down by ## section. For each section, report:

Section name
Token count (chars / 4)
Category (see below)
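One way to produce the per-section numbers is sketched below. It is a rough sketch, assuming sections start at lines beginning with "## " and that chars / 4 is an acceptable estimate. It does not skip "## " lines inside code blocks, and the function name is hypothetical. Assigning a category to each section remains a judgment call that the code does not attempt.

```python
import re


def section_token_estimates(markdown_text):
    """Split on '## ' headings; estimate tokens per section as chars // 4."""
    sections = []
    title, buf = "(preamble)", []
    for line in markdown_text.splitlines(keepends=True):
        match = re.match(r"^## +(.*\S)\s*$", line)
        if match:
            if buf:  # close the previous section before starting a new one
                sections.append((title, len("".join(buf)) // 4))
            title, buf = match.group(1), [line]
        else:
            buf.append(line)
    if buf:
        sections.append((title, len("".join(buf)) // 4))
    return sections


if __name__ == "__main__":
    import os

    path = os.path.expanduser("~/.claude/CLAUDE.md")
    if os.path.isfile(path):
        with open(path, encoding="utf-8") as fh:
            for name, tokens in section_token_estimates(fh.read()):
                print(f"{name}: ~{tokens} tokens (rough estimate)")
    else:
        print("GLOBAL_CLAUDE_MD: not found")
```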

Section categories:

| Category | Description | Can it be reduced? |
|---|---|---|
| Behavioral | How Claude should act (voice, tone, workflow, conventions) | NO: this is configuration, not bloat |
| Reference | Lookup tables, account lists, credential locations | MAYBE: could be moved to memory files that are loaded on demand |
| Structural | File paths, architecture docs, project structure | MAYBE: some of this is derivable from the codebase |
| Stale | Outdated information, deprecated patterns, old project refs | YES: safe to remove |
| Redundant | Duplicated across global and project CLAUDE.md | YES: keep in one place |

Important: "Behavioral" sections are NEVER bloat, even if they're large. They define how Claude works. Cutting them changes behavior. Flag them as "fixed cost" in the report.

Step 4: Build the Token Budget Report

⚠️ All token estimates are rough approximations (~30% margin). The chars/4 heuristic varies by content type — code-heavy text tokenizes differently than prose. Use these numbers for relative comparisons (what's biggest) rather than exact accounting.

Present a table:

TOKEN FOOTPRINT REPORT
═══════════════════════════════════════════════════════════════

┌─────────────────────────────────────┬───────────┬───────────┐
│ Component                           │Est. Tokens│ % of Base │
├─────────────────────────────────────┼───────────┼───────────┤
│ System prompt (Claude Code built-in)│ ~3,000    │ fixed     │
│ Global CLAUDE.md                    │ X,XXX     │ XX%       │
│   Behavioral sections (fixed)       │   X,XXX   │           │
│   Reference sections (reducible)    │   X,XXX   │           │
│   Stale/redundant (removable)       │   X,XXX   │           │
│ Project CLAUDE.md                   │ X,XXX     │ XX%       │
│ MCP: [name] (XX tools)              │ X,XXX     │ XX%       │
│ MCP: [name] (XX tools)              │ X,XXX     │ XX%       │
│ ...                                 │           │           │
│ Plugins                             │ X,XXX     │ XX%       │
│ Hooks                               │ XXX       │ X%        │
├─────────────────────────────────────┼───────────┼───────────┤
│ TOTAL BASELINE (per message)        │ XX,XXX    │ 100%      │
│ REDUCIBLE PORTION                   │ X,XXX     │ XX%       │
│ FIXED PORTION                       │ XX,XXX    │ XX%       │
└─────────────────────────────────────┴───────────┴───────────┘
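The arithmetic behind the table's percentage and total rows can be sketched as follows. This is an illustration only: the component names and token figures in the sample are made up, and the function name is hypothetical. Each component carries its estimated total and the part of it classified as reducible.

```python
def budget_rows(components):
    """components: list of (name, est_tokens, reducible_tokens).

    Returns (rows, total, reducible, fixed), where each row is
    (name, est_tokens, percent_of_baseline).
    """
    total = sum(tokens for _, tokens, _ in components)
    reducible = sum(red for _, _, red in components)
    rows = [
        (name, tokens, round(100 * tokens / total) if total else 0)
        for name, tokens, _ in components
    ]
    return rows, total, reducible, total - reducible


if __name__ == "__main__":
    sample = [  # made-up numbers, for illustration only
        ("System prompt", 3000, 0),
        ("Global CLAUDE.md", 5000, 2000),
        ("MCP: example (20 tools)", 2000, 2000),
    ]
    rows, total, reducible, fixed = budget_rows(sample)
    for name, tokens, pct in rows:
        print(f"{name:<36} {tokens:>8,} {pct:>4}%")
    print(f"{'TOTAL BASELINE (per message)':<36} {total:>8,}")
    print(f"{'REDUCIBLE PORTION':<36} {reducible:>8,}")
    print(f"{'FIXED PORTION':<36} {fixed:>8,}")
```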

Step 5: Recommendations (Risk-Classified)

Classify every recommendation by risk:

🟢 SAFE — No functionality loss

These are pure wins with zero risk:

Removing stale/outdated information
Removing duplicate content between global and project CLAUDE.md
Fixing formatting to use fewer tokens (e.g., shorter table syntax)

🟡 MODERATE — Behavior may change slightly

These save tokens but may affect how Claude handles edge cases:

Moving reference tables to memory files (loaded on demand instead of always)
Compressing verbose instructions to shorter versions (SHOW THE BEFORE/AFTER DIFF)
Scoping MCP servers to project-level (requires adding to each project first)

🔴 HIGH RISK — Functionality will be lost

These save the most tokens but break things if done wrong:

Removing MCP servers entirely
Removing behavioral CLAUDE.md sections
Disabling plugins

For each recommendation, show:

[🟢/🟡/🔴] [Component] — Save ~X,XXX tokens
  What: [exact change]
  Risk: [what could break]
  Reversible: [yes/no, how]

Step 6: MCP Optimization Details

For each MCP server, present:

MCP: [server-name]
  Tools exposed: XX
  Baseline cost: ~X,XXX tokens (loaded every message in every project)
  Used in: [all projects / specific projects / unknown — ask user]

  Options:
  a) Keep global (if used across many projects)
  b) Move to project-level (if used in 1-2 projects)
     ⚠️  REQUIRES: add to [project]/.claude.json BEFORE removing from global
  c) Disable (if rarely used — user can re-enable on demand)
  d) No change

Ask the user which option for each MCP server individually. Do NOT batch this decision.

Step 7: Savings Summary

POTENTIAL SAVINGS:
  🟢 Safe cuts:     ~X,XXX tokens/message (no risk)
  🟡 Moderate cuts: ~X,XXX tokens/message (minor behavior changes)
  🔴 High-risk cuts: ~X,XXX tokens/message (functionality loss)

  If you apply 🟢 only: ~X,XXX tokens saved/message
  If you apply 🟢+🟡:   ~X,XXX tokens saved/message
  If you apply all:      ~X,XXX tokens saved/message

  At 100 messages/day, 🟢 alone saves ~X,XXX,XXX tokens/day
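The cumulative lines in the summary are simple sums and products. A small sketch makes the arithmetic explicit, assuming the 100 messages/day figure used in the template. The function name and the sample inputs are hypothetical.

```python
def project_savings(safe, moderate, high, messages_per_day=100):
    """Return {tier: (tokens_per_message, tokens_per_day)} for cumulative tiers."""
    tiers = {
        "safe only": safe,
        "safe + moderate": safe + moderate,
        "all": safe + moderate + high,
    }
    return {
        label: (per_msg, per_msg * messages_per_day)
        for label, per_msg in tiers.items()
    }


if __name__ == "__main__":
    # Made-up per-message savings for each risk tier, for illustration only.
    for label, (per_msg, per_day) in project_savings(8000, 3000, 5000).items():
        print(f"{label}: ~{per_msg:,} tokens/message, ~{per_day:,} tokens/day")
```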

Step 8: Apply (Only If User Asks)

Do NOT offer to apply changes until the user has seen the full report and explicitly asks you to make changes. When they do:

Back up everything first (settings.json, CLAUDE.md files)
Apply one change at a time
After each change, confirm it worked:
- For settings.json: validate JSON
- For CLAUDE.md: show the diff
- For MCP moves: verify the server loads in the target project
After all changes, show the new total and the reduction
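For settings.json, the back up, change, validate cycle above can be wrapped in one helper. This is a sketch, not a prescribed implementation: the function name is hypothetical, the backup suffix mirrors the .bak.<timestamp> naming used earlier, and the helper rewrites the file with json.dumps, which changes its formatting.

```python
import json
import shutil
import time
from pathlib import Path


def edit_json_safely(path, mutate):
    """Back up `path`, apply `mutate(data)` in place, validate, roll back on error.

    Returns the backup path so the caller can report it to the user.
    """
    path = Path(path)
    backup = path.with_name(f"{path.name}.bak.{int(time.time())}")
    shutil.copy2(path, backup)
    try:
        data = json.loads(path.read_text())
        mutate(data)  # exactly one change per call
        path.write_text(json.dumps(data, indent=2) + "\n")
        json.loads(path.read_text())  # re-read to confirm the file still parses
    except Exception:
        shutil.copy2(backup, path)  # restore the original contents
        raise
    return backup
```

A single change would then look like edit_json_safely(settings_path, lambda d: d["mcpServers"].pop("example")), where "example" stands in for whichever server the user approved.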

If something breaks, restore from backup:

# List backups
ls -la ~/.claude/*.bak.* 2>/dev/null
# Restore (user picks which)
cp ~/.claude/settings.json.bak.TIMESTAMP ~/.claude/settings.json

Rules

Report first, act never (unless asked). Default is measurement, not modification.
Never cut behavioral CLAUDE.md sections. They're config, not bloat.
Risk-classify every recommendation. The user decides their risk tolerance.
One MCP decision at a time. Never batch "move all these to project-level."
Validate JSON after every settings.json edit. Broken JSON = broken Claude Code.
Show diffs for every CLAUDE.md edit. Never silently rewrite content.
Moving an MCP server is a 2-step process: add to destination FIRST, then remove from source. Never just remove.
If you can't measure something precisely, say "estimated" and explain your method.
Be aggressive with recommendations but conservative with actions.