roast - SKILL.md Agent Skill

name: roast
description: Analyze a skill for ambiguities, missing validation, unclear instructions, and potential LLM confusion
disable-model-invocation: true
allowed-tools: Read, Glob, Grep
argument-hint: [skill-name or path]

Analyze the skill "$0" for precision issues and potential ambiguities. Validate arguments, locate and read the skill file, then evaluate against focus areas.

Steps 1-2: Validate and Read

Step 1: Argument Validation

CRITICAL: If $0 is empty or not provided:

DO NOT proceed with analysis
DO NOT infer or guess which skill to analyze
Exit immediately with error: "No skill specified. Usage: /roast [skill-name or path]"

If the skill path/name is provided, continue with analysis.

Step 2: Locate and Read the Skill

Search in this order:

If $0 is an absolute path, read that file directly
If $0 is a relative path:
- If it ends in .md, resolve it relative to the current working directory and read that file
- Otherwise, treat it as a directory: resolve it relative to the current working directory and append /SKILL.md
If $0 is a skill name (no path separators):
- Check .claude/skills/$0/SKILL.md (relative to current working directory)

If skill not found in any location:

Exit with error: "Skill '$0' not found in .claude/skills/"
DO NOT proceed with analysis
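
As a rough illustration of Steps 1 and 2, the lookup could be sketched in Python as below. The function and exception names are hypothetical. The sketch also assumes that an argument with no path separators and no .md suffix is a skill name rather than a relative directory, which the search order above does not state explicitly.

```python
from pathlib import Path


class SkillLookupError(Exception):
    """Raised when the argument is missing or no skill file is found."""


def resolve_skill_path(arg: str, cwd: Path) -> Path:
    """Map the argument to a skill file using the search order above."""
    if not arg or not arg.strip():
        # Step 1: fail fast instead of guessing which skill was meant.
        raise SkillLookupError(
            "No skill specified. Usage: /roast [skill-name or path]"
        )
    arg = arg.strip()
    candidate = Path(arg)
    has_separator = "/" in arg or "\\" in arg

    if candidate.is_absolute():
        # Absolute path: read that file directly.
        path = candidate
    elif arg.endswith(".md"):
        # Relative path to a markdown file.
        path = cwd / candidate
    elif has_separator:
        # Relative path to a directory: append SKILL.md.
        path = cwd / candidate / "SKILL.md"
    else:
        # Assumption: a bare name is a skill name, not a directory.
        path = cwd / ".claude" / "skills" / arg / "SKILL.md"

    if not path.is_file():
        raise SkillLookupError(f"Skill '{arg}' not found in .claude/skills/")
    return path
```

Raising on a missing argument or file, rather than returning a default, mirrors the "DO NOT infer or guess" rule in Step 1.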

Steps 3-6: Analyze

Once the skill file is successfully read, evaluate:

Core Questions

Is it accurate enough?
- Does the skill achieve its stated purpose?
- Are there ambiguities that could cause wrong outcomes?
Would more precision improve results?
- Where does vagueness cause the LLM to guess incorrectly?
- What constraints would meaningfully improve success rate?
What's the trade-off?
- What flexibility is lost with added precision?
- Is the restriction worth the gained certainty?
Is anything over-complicated?
- Are there overly restrictive statements that could backfire?
- Where does excessive precision create brittleness or block valid use cases?

What Could Actually Go Wrong

Focus on practical failure modes:

Wrong execution: Instructions that could be interpreted multiple ways, leading to different actions
Silent failures: Missing validation that causes errors to go unnoticed
Scope creep: Unbounded operations that do more than intended
Unsafe operations: Destructive actions without guards
LLM confusion: Ambiguity where the model genuinely can't tell what's expected
Over-restriction: Excessive precision that blocks valid use cases or creates brittleness

Ignore edge cases that don't affect the skill's intent, and theoretical issues that won't occur in practice.

Red Flags

Look for these patterns that cause real problems:

Best-effort inference: Guessing instead of failing when inputs are missing
Vague scope: Unbounded operations without clear stopping points
Ambiguous conditionals: "If needed" without criteria for when that is
Silent failures: Not validating preconditions that will cause errors later
Tool permission gaps: Unrestricted tools for specific tasks
Over-specification: Hardcoded patterns/values that prevent valid alternatives
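
Some of these patterns leave textual traces that a simple scan can surface as candidates for review. The sketch below is only a heuristic pre-pass: the phrase list and function name are assumptions made for illustration, and a hit still needs judgment about whether the vagueness affects the outcome.

```python
import re

# Hypothetical phrase list: wording that often signals an ambiguous
# conditional or vague scope. Illustrative only, not exhaustive.
VAGUE_PHRASES = ["if needed", "if necessary", "as appropriate", "relevant"]


def flag_vague_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, phrase) pairs for each vague phrase found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for phrase in VAGUE_PHRASES:
            # Word boundaries keep "irrelevant" from matching "relevant".
            if re.search(rf"\b{re.escape(phrase)}\b", line, re.IGNORECASE):
                hits.append((number, phrase))
    return hits
```

Line numbers are 1-based so they can be quoted directly in the "Line X" fields of the output.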

Steps 7-9: Output

Present findings as a brief summary:

Assessment

Accuracy: [Sufficient | Needs Work]
Intent: (One-line description of what the skill is trying to accomplish)

What Could Go Wrong

For each practical issue found:

Issue #number

Line X: Brief problem description
Impact: How this affects the skill's outcome
Fix: Concise suggestion (1 line)
Trade-off: What flexibility is lost (if any)
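
To make the per-issue layout concrete, here is a minimal sketch that renders one finding in the shape above. The Issue class and render_issue function are hypothetical names used only for this illustration. The Trade-off line is left out when nothing is lost, following the "(if any)" note.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    line: int
    problem: str
    impact: str
    fix: str
    trade_off: Optional[str] = None  # None when no flexibility is lost


def render_issue(number: int, issue: Issue) -> str:
    """Format one finding using the field order shown above."""
    rows = [
        f"Issue #{number}",
        "",
        f"Line {issue.line}: {issue.problem}",
        f"Impact: {issue.impact}",
        f"Fix: {issue.fix}",
    ]
    if issue.trade_off:
        rows.append(f"Trade-off: {issue.trade_off}")
    return "\n".join(rows)
```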

Over-Complicated Statements

Identify overly restrictive or brittle statements found:

Line X: What's too restrictive
Backfire risk: How this could block valid use cases
Simplify: Suggest looser alternative

Performance Optimizations

Optional precision improvements that make LLM execution more certain but aren't required:

Brief suggestion with trade-off noted

Keep observations focused and concise. Skip theoretical issues.

Step 9: After presenting the summary, ask the user which issues to fix and offer an interactive questionnaire for each selected issue.

Examples

Problematic: "Fix the issue in $0"

Impact: "issue" is undefined, so the LLM will guess what to fix
Fix: "Read $0 and fix [specific thing]"

Problematic: "Find relevant files"

Impact: "Relevant" is subjective, which leads to inconsistent results
Fix: Only if specificity matters for outcome - otherwise fine for exploratory tasks

Problematic: "Check if tests pass"

Impact: No failure handling, LLM might just report instead of analyzing
Fix: Only if subsequent steps depend on pass/fail - otherwise "check" is adequate

Over-complicated: "Use Grep with pattern '^test_.*\.py$' and output_mode 'files_with_matches' to find test files"

Backfire: Blocks alternative test file patterns like *_test.py or test*.py
Simplify: "Find test files" (let LLM choose approach) or "Use Glob with pattern '**/*test*.py'"
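
The backfire in the last example can be checked with a small sketch. The file names are made up, and the wildcard `*test*.py` is an assumed looser alternative chosen for illustration. The regex follows the anchored test_ pattern from the example, with the dot escaped.

```python
import re
from fnmatch import fnmatch

# Hypothetical file names covering common test naming conventions.
names = ["test_api.py", "api_test.py", "tests.py", "README.md"]

strict = re.compile(r"^test_.*\.py$")  # anchored, prefix-only pattern
loose = "*test*.py"                    # assumed looser wildcard

strict_hits = [n for n in names if strict.match(n)]
loose_hits = [n for n in names if fnmatch(n, loose)]

print(strict_hits)  # ['test_api.py']
print(loose_hits)   # ['test_api.py', 'api_test.py', 'tests.py']
```

The strict pattern silently drops api_test.py and tests.py, which is the kind of brittleness the Over-Complicated Statements section is meant to catch.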