description: >
  Use when asked to lint, audit, review, or score AI-facing instruction files such as SKILL.md, AGENT.md, AGENTS.md, CLAUDE.md, platform body.md files, prompt files, rules, policies, and agent-facing references. NOT for application code review, harness configuration review, ordinary docs, tests, or generated build output.
name: reviewing-instructions
Instruction Review
Review AI-facing instruction files for routing precision, behavioral signal, output contracts, failure handling, grounding, and score stability. Do not score ordinary docs or source code.
Read first
- references/scoring-rubric.md for gates, 0-10 bands, caps, confidence, and output schema.
- references/model-resolution.md for model alias mapping and fallback rules.
- references/calibration.md only when a score is borderline or confidence is low.
- references/models/<family>.md only after model family resolution.
Accepted inputs
The user may pass:
- file path, directory path, or plugin name
- omitted scope, meaning discover likely instruction files
- --model <name> to override model family or variant
- requests such as lint, audit, review, score, compare, or rerank
Plugin name without a path separator expands to matching src/skills/<name>,
src/agents/<name>, or src/plugins/<name> when present.
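The expansion rule can be sketched as a small helper. This is a minimal illustration, not the skill's actual implementation: the function name `expand_plugin_name` and the `repo_root` parameter are hypothetical, and only the three roots named above are checked.

```python
from pathlib import Path

# Candidate roots taken from the expansion rule above.
CANDIDATE_ROOTS = ("src/skills", "src/agents", "src/plugins")


def expand_plugin_name(name: str, repo_root: Path) -> list[Path]:
    """Return the directories a user-supplied name resolves to.

    Hypothetical helper: a name containing a path separator is treated as a
    path relative to repo_root; a bare name expands to every matching
    candidate directory that exists.
    """
    if "/" in name or "\\" in name:
        return [repo_root / name]
    return [
        repo_root / root / name
        for root in CANDIDATE_ROOTS
        if (repo_root / root / name).exists()
    ]
```

An empty result means no candidate directory exists for the name, which is a reason to ask the user rather than guess.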
Scope boundaries
Review only markdown or prompt files that guide an AI agent or coding assistant. Include support files only when an entrypoint tells the agent to read them or when they live under that skill or agent folder.
Do not review:
- application source code, tests, or generated artifacts
- ordinary README, changelog, product, or design docs unless agent-facing
- harness config quality; use evolving-config
- code quality; use reviewing-code
If a candidate is ambiguous, put it in Candidates Not Reviewed with the reason.
Discovery
Build the review set in this order:
- Explicit paths from the user.
- Entrypoints: SKILL.md, AGENT.md, AGENTS.md, CLAUDE.md.
- Support files referenced by entrypoints: body.md, references, prompt, rules, context, and policy markdown.
- High-confidence agent-facing markdown in agents, skills, prompts, instructions, references, or rules directories.
For a single explicit file, review that file only unless the user asks for linked files. For a directory, include its entrypoint and local support files.
Model resolution
Use references/model-resolution.md.
Resolution order:
- --model <name> from the user.
- File frontmatter model or platform metadata.
- Parent entrypoint model for support files.
- Tool or target folder family when obvious.
- generic.
Report one line per review set: Model context: <family>/<variant or generic> — source <arg|frontmatter|parent|folder|generic>.
If resolution is ambiguous, use generic and set review confidence to medium or low.
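The resolution order amounts to a first-match lookup with a generic fallback. The sketch below assumes each source has already been read into an optional string; the function name and parameters are hypothetical, and alias mapping from references/model-resolution.md is deliberately left out.

```python
from typing import Optional


def resolve_model_context(
    arg: Optional[str] = None,
    frontmatter: Optional[str] = None,
    parent: Optional[str] = None,
    folder: Optional[str] = None,
) -> tuple[str, str]:
    """Return (model, source) from the first non-empty candidate.

    Hypothetical helper: the source labels match the report line
    (arg, frontmatter, parent, folder, generic).
    """
    candidates = [
        ("arg", arg),
        ("frontmatter", frontmatter),
        ("parent", parent),
        ("folder", folder),
    ]
    for source, value in candidates:
        if value:
            return value, source
    # Nothing resolved: fall back to generic.
    return "generic", "generic"
```

For example, a support file with no frontmatter model but a parent entrypoint model resolves with source `parent`.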
Structural pre-pass
Run the lint script scoped to the review target when Bash is available:
uv run python src/skills/reviewing-instructions/scripts/lint-instructions.py <scope>
If scope is omitted, run the whole-repo pre-pass. If the script ignores scope, filter reported findings to reviewed files before scoring.
If the script fails or is unavailable, record Structural pre-pass: skipped with
the exact reason and continue semantic review.
The pre-pass is advisory. Semantic review and the scoring rubric are authoritative.
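One way to keep the pre-pass from blocking the review is to wrap the command and turn any failure into a skipped status. This is a sketch under stated assumptions: `run_prepass` is a hypothetical wrapper, the 120-second timeout is arbitrary, and treating a non-zero exit with empty stdout as a script failure is a guess about the lint script's behavior, not something this file documents.

```python
import subprocess


def run_prepass(command: list[str]) -> str:
    """Run the lint command and return a one-line pre-pass status.

    Hypothetical wrapper: any launch failure or timeout is reported as
    skipped with the reason, so semantic review can continue.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=120
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"Structural pre-pass: skipped ({exc})"
    # Assumption: a non-zero exit with no findings on stdout is a failure.
    if result.returncode != 0 and not result.stdout.strip():
        reason = result.stderr.strip() or f"exit code {result.returncode}"
        return f"Structural pre-pass: skipped ({reason})"
    return "Structural pre-pass: completed"
```

The caller would pass the command shown above as a list of arguments, with the scope appended when one is given.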
Semantic review
For each confirmed file:
- Read the file fully.
- Confirm it is agent-facing.
- Resolve model context.
- Apply hard gates from the scoring rubric.
- Score each dimension using band-first 0-10 anchors.
- Apply caps and confidence rules.
- Rate applicable lint rules as PASS, WARN, or FAIL.
- List the top 1-3 improvements by impact.
Use evidence for every score and finding: section name, line number, exact text, or missing evidence. No evidence, no finding.
Scoring stability rules
- Choose the rubric band first, then choose the midpoint unless evidence justifies an edge.
- Apply caps before computing the final score.
- Round final scores to the nearest 0.5.
- Use low confidence instead of over-precise scoring when context is partial.
- Do not let one polished section hide a missing hard gate.
- For repeated scoring or reranking, use the same scope, model context, and rubric version.
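The cap-then-round order can be made concrete with a short sketch. Two assumptions are not stated in this file: the overall score is taken as the mean of the dimension scores, and ties round up. The scoring rubric is authoritative on both; the function names are hypothetical.

```python
import math
from statistics import mean
from typing import Optional


def round_half(score: float) -> float:
    """Round to the nearest 0.5; ties round up (an assumption)."""
    return math.floor(score * 2 + 0.5) / 2


def final_score(dimension_scores: list[float], cap: Optional[float] = None) -> float:
    """Combine dimension scores, apply any cap, then round.

    Assumption: the overall score is the plain mean of the dimensions.
    The cap is applied before rounding, per the stability rules above.
    """
    raw = mean(dimension_scores)
    capped = min(raw, cap) if cap is not None else raw
    return round_half(capped)
```

With dimension scores of 9, 9, 8, and 9 and a gate cap of 6, the raw mean of 8.75 is capped to 6 before rounding, so one strong section cannot lift a file past a missing hard gate.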
Output
## Instruction Review Report
Model context: <family/variant> — source <source>
Rubric version: <date or file path>
Review confidence: high | medium | low
### Summary
- Files reviewed: N
- Candidates not reviewed: N
- Structural pre-pass: <errors/warnings or skipped reason>
- Score range: X-Y / 10
- Main risk: <one sentence>
### Scores
path/to/file.md — overall X / 10, confidence <high|medium|low>
- Gates: pass | capped at N because <reason>
- Signal Density: X — <evidence>
- Scope Specificity: X — <evidence>
- Output Structure: X — <evidence>
- Format Efficiency: X — <evidence>
- Failure Handling: X — <evidence>
- Grounding Discipline: X — <evidence>
- Routing Precision: X — <evidence>
- Progressive Disclosure: X — <evidence>
- Lint: PASS <ids>; WARN <ids>; FAIL <ids>
### Findings
1. path — <severity> <rule or dimension>: <issue>. Evidence: <section/line/text>. Fix: <concrete fix>.
### Top Improvements
1. <highest-impact change>
2. <next change>
3. <next change>
### Candidates Not Reviewed
- path — <reason>
Omit empty sections. If no findings remain after evidence checks, say No confirmed findings.
Failure handling
- Missing scope and broad review would be expensive: ask one clarifying question.
- Unknown model alias: use generic, report the alias gap, and lower confidence.
- Vendor docs unavailable: use local model reference or generic; do not block review.
- Conflicting local and vendor guidance: local project rules win; report the conflict.
- Parallel or delegated reviews disagree: apply the same gates and caps, then keep the lower-confidence result out of confirmed findings.