writing-skills - SKILL.md Agent Skill

name: writing-skills description: Applies test-driven development principles to write and execute behavioral tests. Use when creating new skills, editing existing skills, or verifying skills work before deployment.

Writing Skills

When to use this skill

Writing skills IS Test-Driven Development applied to process documentation.

Write test cases (pressure scenarios), watch them fail (baseline behavior), write the skill, watch tests pass (agents comply), refactor (close loopholes).

Core principle: If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.

REQUIRED BACKGROUND: Understand test-driven-development before using this skill. Same principles — adapted to documentation.

TDD Mapping

TDD	Skill Creation
Test case	Pressure scenario with subagent
Production code	`SKILL.md`
RED — test fails	Agent violates rule without skill (baseline)
GREEN — test passes	Agent complies with skill present
Refactor	Close loopholes while maintaining compliance

The Iron Law

NO SKILL WITHOUT A FAILING TEST FIRST

This applies to new skills AND edits. Write skill before testing? Delete it. Start over.

When to Create a Skill

Create when:

Technique wasn't intuitively obvious to you
You'd reference it again across projects
Pattern applies broadly (not project-specific)

Don't create for:

One-off solutions
Standard practices well-documented elsewhere
Project-specific conventions → put in CLAUDE.md
Mechanical constraints → automate; save docs for judgment calls

Directory Structure

Skills/
  skill-name/
    SKILL.md           # Required
    supporting-file.*  # Only if needed (heavy reference or reusable tools)

SKILL.md Frontmatter Rules

---
name: skill-name-with-hyphens      # Letters, numbers, hyphens only
description: Use when [specific triggering conditions and symptoms]
---

Max 1024 characters total in frontmatter
description — written in third-person, starts with "Use when..."
NEVER summarize the skill's workflow in the description

Why "no workflow summary" in description matters

Testing revealed: when a description summarizes the skill's workflow, Claude follows the description instead of reading the full skill. A description saying "code review between tasks" caused Claude to do ONE review, but the skill body shows TWO (spec compliance then quality).

# ❌ BAD - summarizes workflow
description: Use when executing plans - dispatches subagent per task with code review between tasks

# ✅ GOOD - triggering conditions only
description: Use when executing implementation plans with independent tasks in the current session

SKILL.md Structure

## When to use this skill

Core principle in 1-2 sentences.

## When to use this skill

Bullets with SYMPTOMS and use cases. Small flowchart only if decision
non-obvious.

## How to use it

Before/after comparison or steps.

## Quick Reference

Table or bullets for scanning.

## Common Mistakes

What goes wrong + fixes.

Flowcharts

Use only for non-obvious decisions or process loops. Never for reference material, linear instructions, or labels without semantic meaning. Prefer Mermaid over dot/graphviz for portability.

Bulletproofing Against Rationalization

For discipline-enforcing skills (TDD, verification, debugging):

Close loopholes explicitly — state the rule + forbid specific workarounds
Add "spirit vs letter" statement early: "Violating the letter is violating the spirit."
Build rationalization table from baseline testing — every excuse an agent uses goes in the table
Create red flags list — make it easy for agents to self-check

RED-GREEN-REFACTOR for Skills

RED: Run pressure scenario WITHOUT skill. Document exact behavior and rationalizations verbatim.

GREEN: Write minimal skill addressing those specific failures. Run same scenarios WITH skill — agent complies.

REFACTOR: Agent finds new rationalization → add explicit counter → re-test until bulletproof.

Skill Creation Checklist

RED Phase:

Create pressure scenarios (3+ combined pressures for discipline skills)
Run WITHOUT skill — document baseline behavior verbatim

GREEN Phase:

Name uses only letters, numbers, hyphens
Frontmatter ≤ 1024 chars, description starts with "Use when..."
Description is triggering conditions only (no workflow summary)
Addresses specific failures found in RED
One excellent code example (not multi-language)
Run WITH skill — verify compliance

REFACTOR Phase:

Identify new rationalizations → add counters
Build rationalization table
Create red flags list
Re-test until bulletproof

Quality:

Quick reference table
Common mistakes section
No narrative storytelling
Supporting files only for heavy reference or reusable tools

Common Mistakes

Mistake	Fix
Writing skill before baseline test	Run baseline first — it's the RED phase
Workflow summary in description	Triggering conditions only
Multi-language examples	One excellent example
Narrative ("In session X we found...")	Reusable patterns instead
Deploying without testing	Test every skill before deploying
Batching multiple skills without testing each	Complete checklist per skill before next

The Bottom Line

Creating skills IS TDD. Same Iron Law. Same cycle. Same benefits.

Follow TDD for code → follow it for skills. It's the same discipline applied to documentation.