audit-agents-skills - SKILL.md Agent Skill

name: audit-agents-skills description: "Audit Claude Code agents, skills, and commands for quality and production readiness. Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates."

Audit Agents/Skills/Commands

Quality audit for Claude Code agents, skills, and commands. Scores each file against best practices from SKILLS-BEST-PRACTICES.md and the detailed criteria in scoring/criteria.yaml.

Workflow

Phase 1: Discovery

Scan these directories for agents/skills/commands:

.claude/agents/
.claude/skills/
.claude/commands/

Classify each file by type (agent/skill/command).

Phase 2: Audit

Read SKILLS-BEST-PRACTICES.md first — it defines the qualitative standard. Then for each file, check every criterion from scoring/criteria.yaml plus the Anthropic best practices checklist below.

Anthropic Best Practices Checklist

These checks come directly from Anthropic's internal learnings and MUST be applied on top of the scoring criteria:

Check	What to look for	Why it matters
Description = trigger, short	Description is concise (1-2 sentences), tells the model WHEN to activate — not a summary of what the skill does. Optimized for token budget: always loaded in context for every session	Verbose descriptions waste context tokens and dilute trigger signal. If the model can't trigger it or drowns in noise, the skill is dead
Gotchas section	Has a `## Gotchas` section with real failure cases	Highest-signal content per Anthropic — captures what Claude gets wrong
Progressive disclosure	Uses subfiles/folders instead of one massive SKILL.md	Keeps context lean; Claude reads subfiles on demand
Don't state the obvious	Focuses on non-default knowledge Claude wouldn't know	Wasted tokens if it just says "write clean code"
Don't railroad	Gives info + flexibility, not rigid step-by-step scripts	Claude needs room to adapt to the situation
Single category	Fits cleanly into one of the 9 types (Library, Verification, Data, Business, Scaffold, Quality, CI/CD, Runbook, Infra)	Confused skills that straddle categories are hard to trigger and maintain
Name = directory	Frontmatter `name` matches the directory name	Mismatch causes invocation confusion
Composable	References other skills it depends on or works with	Skills should compose, not duplicate
No hardcoded paths	No `/Users/`, `/home/`, `C:\`	Breaks portability
Reasonable size	SKILL.md < ~200 lines; total skill folder < 5K tokens in main file	Context budget is finite

Phase 3: Report

For each file, output:

### <file path>
- **Type**: agent / skill / command
- **Category**: (one of the 9 types, or "unclear")
- **Score**: X/32 (or X/20 for commands)
- **Grade**: A/B/C/D/F
- **Issues**: (list each failed criterion with 1-line fix suggestion)
- **Best Practices**: (pass/fail for each Anthropic checklist item)

Then a summary table:

| File | Type | Score | Grade | Top Issue |
|------|------|-------|-------|-----------|

Phase 4: Recommendations

Prioritize by impact:

Critical — skill is broken or untriggerable
High — missing gotchas, wrong description, massive size
Low — minor: missing examples, no composition references

Grading

Grade	Range	Meaning
A	90-100%	Production-ready
B	80-89%	Good, meets threshold
C	70-79%	Needs work
D	60-69%	Significant gaps
F	<60%	Major rewrite needed

Production threshold: 80% (Grade B).