name: agent-review description: > Audit of agent files: frontmatter, triggers, voice, length, duplication. TRIGGER when: authoring/reviewing a .claude/agents/.md or plugins//agents/*.md change, or auditing trigger accuracy across the roster. DO NOT TRIGGER when: editing SKILL.md (use skill-review), editing CLAUDE.md/AGENTS.md/memory files (use ai-instruction-and-memory-files), or reviewing non-agent files. user-invocable: false
Agent Review — Architecture & Checklist
1. What makes an agent load
An agent file is claude/.claude/agents/<name>.md (stowed user-scope) or plugins/<plugin>/agents/<name>.md (plugin-scoped). The harness scans .claude/agents/ and ~/.claude/agents/ recursively, and each plugin's agents/ directory. Identity comes from the name frontmatter field — the filename does not have to match.
Required frontmatter:
name— lowercase letters and hyphens; unique across the whole tree. Duplicates within one scope: the harness keeps one and discards the other without warning.description— what the agent does and when to dispatch it. The dispatcher reads this to decide delegation.
Optional frontmatter:
tools— comma-separated allowlist (restrictive). When omitted, the agent inherits every tool from the main conversation. Beware confusion with skillallowed-tools: skillallowed-toolsis additive; agenttoolsis restrictive. They have opposite defaults — mixing them silently breaks the cap.disallowedTools— comma-separated denylist. Applied beforetoolswhen both are present.model—sonnet,opus,haiku, a full model ID, orinherit. Defaults toinherit.maxTurns— integer cap on agentic turns. Must be a YAML integer (maxTurns: 20), not a string (maxTurns: "20") — strings are silently ignored and the cap does not apply.permissionMode—default,acceptEdits,auto,dontAsk,bypassPermissions, orplan. Ignored for plugin subagents.mcpServers— list of MCP servers scoped to the agent. Ignored for plugin subagents.hooks— frontmatter-scoped lifecycle hooks. Ignored for plugin subagents.skills— skills to preload into the agent's context at startup (full body content is injected, not just the description).memory—user,project, orlocal. Enables a persistent memory directory.isolation— set toworktreefor a temporary git worktree.color— one ofred,blue,green,yellow,purple,orange,pink,cyan.background—trueto always run in the background.effort—low,medium,high,xhigh,max; available levels depend on the model.initialPrompt— auto-submitted as the first user turn when the agent runs as the main session agent.
Description text is always-loaded context budget; agent bodies are lazy — paid only when the harness dispatches the agent. Overspend in the description costs more than overspend in the body.
2. Trigger design
Format the description with two parallel lists:
TRIGGER when: <specific situation>; <another>; <another>.
DO NOT TRIGGER when: <obvious misfire>; <adjacent agent or skill's surface>; <out of scope>.
Specificity sources, in priority order:
- File globs (
.claude/agents/*.md,*.tsx) — most reliable; the harness can match them exactly. - Action verbs (editing, auditing, restructuring, reviewing) — second-most reliable; pin the trigger to a what-the-user-is-doing shape.
- Context cues (deciding which file a rule belongs in, debating length) — weakest; only fire when the verbs aren't enough.
DO NOT TRIGGER carries equal weight. Over-broad triggers dispatch the agent into turns its system prompt can't help with; over-narrow leaves it dormant. Name the adjacent agents and skills whose surfaces overlap.
Executor-style carve-out. Invocation-context agents whose description names a single dispatch shape — e.g., code-writer ("dispatch for delegated code changes") — may use narrower trigger language than routed reviewers. Do not relax trigger discipline for routed reviewer agents (staff-*, ciso-reviewer); their dispatcher matches on description shape and they fire across many surfaces.
3. Voice and structure
- Imperative second person. "Review the change," "Apply the checklist," not "The reviewer reviews the change."
- Declarative numbered items for checklist content.
- Section headers are domain or step labels ("Scope," "Review checklist") — never "Introduction," "Background," "Conclusion."
- Tables for matrices, prose for narratives, bullets for parallel options.
4. Length and the behavior test
Target under 200 lines per file (diminishing returns past 300).
The behavior test. For every line: does removing this line change the agent's behavior on at least one realistic input? If no, cut.
Lines that almost always pass the test:
- Anti-pattern descriptions the agent can pattern-match against a diff.
- Rationale that arms the agent to judge edge cases not enumerated in the rule (the why lets the rule generalize).
- Trade-off framing that resolves an otherwise-ambiguous call.
Lines that almost always fail the test:
- "This agent was created after a production incident where..."
- "This is a contentious choice; reasonable engineers disagree..."
- Narrative case studies that retell a past PR.
- Audience-persuasion language ("It's important to remember that...").
5. Operational vs narrative content
Replace narrative with imperative. Agent bodies are operational instructions to the agent, not design documents.
| Tone marker | Operational | Narrative (cut) |
|---|---|---|
| Subject | "Check whether..." | "We've found that..." |
| Tense | imperative present | past, retrospective |
| Audience | the agent, mid-task | a human reading cold |
| Specificity | concrete pattern | generalized lesson |
6. Cross-references vs duplication
Cross-reference (default) when another skill's or agent's triggers already cover the file path or situation being edited. Example: code-review points at agent-review for .claude/agents/*.md rather than restating this checklist inline.
Duplicate (defense-in-depth) only when all three hold:
- The content is critical — getting it wrong has real consequences.
- The two locations reach the agent through different load paths (e.g. always-loaded CLAUDE.md vs on-demand agent body).
- One path could silently fail (agent not dispatched, file not loaded for a particular file type, etc.).
If not all three hold, point at the canonical source.
7. Review checklist
Frontmatter contract —
namematches the agent's identity slug;descriptionpresent and contains bothTRIGGER when:andDO NOT TRIGGER when:blocks.toolsvsallowed-tools: agent files usetools(restrictive allowlist); skill files useallowed-tools(additive). Mixing them silently breaks the cap. Verify the field name matches the file type.Description scope — the description's TRIGGER list matches what the system prompt actually covers. An overpromising description dispatches the agent into work it can't do.
Trigger specificity — TRIGGER conditions use file globs or action verbs, not vague context cues alone. Executor-style exemption: invocation-context agents (
code-writer) may use narrower trigger language; do not relax for routed reviewer agents (staff-*,ciso-reviewer).DO NOT TRIGGER coverage — adjacent-agent and adjacent-skill surfaces are named explicitly; verify every domain-adjacent agent and skill appears.
disallowedToolsrecognition — anydisallowedToolsentry is a recognized field name (not a misspelling oftools), and its value lists actual tool names.maxTurnsinteger validation —maxTurns: 20, notmaxTurns: "20". Strings are silently ignored; the cap will not apply.modelfield discipline — check against~/.claude/CLAUDE.md"Model Routing". Reviewer, code-writing, and specialist agents pinmodel: sonnet; routinely-dispatched subagents do not pinopus.Length — under the 200-line target. Flag anything that drifts past 200 (and especially past 300) without a load-bearing reason.
Behavior test per line — every line should change the agent's behavior on at least one realistic input. Cut narrative, editorial meta-commentary, and incident retellings.
Voice — imperative second person; numbered items for checklists; section headers that label domains or steps.
Cross-reference correctness — referenced agent and skill names exist; referenced section anchors (
§2,§3) match the target file's structure. Renames in the target break these silently.Duplication justification — content duplicated across agents passes the three-condition test in §6. Otherwise, replace the duplicate with a pointer.
Redaction — no real project, organization, codename, or private-tracker-ID references in examples or rationale. Use
PROJ-<digits>orTICKET-<digits>for tracker-shaped placeholders. (Repo-specific; see repo-rootCLAUDE.md.)Behavioral-equivalence audit (compression diffs) — for any diff that removes or shortens lines, before declaring the review complete:
Removed/shortened text Surviving line Behavior-preserving? (quote) (cite) Y — (quote surviving line) / N — (what is lost) Y requires citing the specific surviving line that carries the same instruction. "The meaning is the same" is not sufficient. Any N is a finding; restore the instruction.
Compression that drops a forbidden-action callout ("do not X"), an imperative directive, an escape-hatch exception, or a verification command fails this audit even when the summary reads as equivalent.
Reviewer-agent output contract (
staff-*andciso-revieweronly) — verify the agent carriesWritein its frontmattertoolslist, and has a### File-based outputsection in its## Output formatbody. Both are required forfindings_pathdispatch from the code-review dispatcher. An agent missing either falls back to full inline output and can trigger the heredoc-abort-on-large-findings failure the canary guards against.