name: spiral description: Five-phase adversarial development loop (design → plan → architect → implement → evaluate) that re-enters until the evaluator flags GREEN. For features where the bar is "insane quality", not "ship it". Use when the user explicitly invokes /spiral or asks for a multi-pass spiral build. user_invocable: true
/spiral — Adversarial development spiral
You orchestrate five sub-agents per iteration. The loop repeats until the Evaluator returns VERDICT: GREEN, capped at 4 iterations. Every phase writes ONE markdown file to .spiral/<feature>/iter-N/. The orchestrator's context stays tiny because sub-agents only return a one-line receipt; the substantive output lives in files.
Core design rules (read first)
These exist because the previous version of this skill stalled on background agents.
- Every sub-agent runs in the FOREGROUND. No
run_in_background. Foreground agents always return a result; background agents rely on completion notifications that can hang or get lost. We trade a few seconds of orchestrator wait time for determinism. - Sub-agents return ≤ 80-word receipts. The prompt explicitly says: "Write your output to
<path>. Reply with ONLY the verdict line plus a 1-sentence summary." The orchestrator NEVER reads the full output back into context — it reads at most the last 2 lines of a file when it needs to make a control-flow decision. - Files are the bus. Inter-phase data only travels through
.spiral/<feature>/iter-N/*.md. Sub-agents read the files they need; the orchestrator just passes paths. - The loop is a real while loop in the orchestrator's turn. Phases run sequentially in one session. No
ScheduleWakeup, noMonitor, no polling. After Phase 5, read the verdict, branch. - Implementation runs inside a worktree. Phase 4 uses
isolation: "worktree"so a failed iteration can't leave the main tree broken. The worktree path is recorded; if the iteration ends GREEN, the orchestrator surfaces the worktree path so the user can merge. - Progress is visible.
TaskCreateone task per phase per iteration; mark complete as they finish.mark_chapteronce per iteration. The user can see exactly where the spiral is at any moment without you having to summarize.
When to invoke
- User explicitly says
/spiral <feature>or "spiral on …". - Feature is non-trivial (multi-screen, new system) and the bar is award-grade craft.
- Repo has a dev server + Playwright suite (the Evaluator needs them).
Do NOT use for: single-component polish, copy fixes, bug triage, refactors. Those have their own skills.
Arguments
/spiral <feature-slug> [brief sentence]
If slug missing, ask once via AskUserQuestion. Slug becomes .spiral/<feature-slug>/. Brief is one sentence of intent — write it to .spiral/<feature-slug>/brief.md and never touch it again.
Layout
.spiral/<feature-slug>/
brief.md
iter-1/
01-design.md
02-plan.md
03-architecture.md
04-implementation.md ← includes "Worktree: <path>" line
05-evaluation.md ← ends with VERDICT: GREEN | RED
screenshots/
iter-2/ (only if iter-1 RED)
...
Phase models
| Phase | Model | Why |
|---|---|---|
| 1 Design | opus | Creative UX synthesis, deep |
| 2 Plan | sonnet | Structured decomposition, cheap |
| 3 Architect | opus | High-stakes, irreversible decisions |
| 4 Implementation | sonnet | Volume work, parallel children |
| 5 Evaluator | opus | Adversarial depth, blind-spot hunting |
Pass model: to the Agent tool to lock the tier. The orchestrator stays on whatever model the user invoked.
Orchestrator turn structure
Inside ONE conversation turn (or a small number of turns the user is happy to leave running), do this:
1. Setup
- Resolve slug, write brief.md, mkdir .spiral/<slug>/iter-1/
- TaskCreate: one task per phase × expected iterations (start with 5 tasks for iter-1)
- mark_chapter("Spiral iter-1: <slug>")
2. For iteration N in 1..4:
- Run Phase 1 (foreground Agent, opus). Wait for ≤80-word receipt.
- TaskUpdate phase 1 → completed; phase 2 → in_progress
- Run Phase 2 (foreground Agent, sonnet). If reply contains "DESIGN-GAP", goto Phase 1 of same iter once with the gap appended.
- TaskUpdate phase 2 → completed; phase 3 → in_progress
- Run Phase 3 (foreground Agent, opus).
- TaskUpdate phase 3 → completed; phase 4 → in_progress
- Run Phase 4 (foreground Agent, sonnet, isolation: "worktree"). Receipt includes worktree path + branch.
- TaskUpdate phase 4 → completed; phase 5 → in_progress
- Run Phase 5 (foreground Agent, opus). Receipt is just the VERDICT line.
- TaskUpdate phase 5 → completed.
- If receipt contains "VERDICT: GREEN":
- Surface 3-bullet summary + worktree path to user. STOP.
- Else (RED):
- mkdir iter-(N+1)/; TaskCreate 5 more tasks; mark_chapter("Spiral iter-(N+1)")
- Continue loop.
3. If loop exits without GREEN (4 iterations RED):
- Read just the score lines from each iter-N/05-evaluation.md (one grep, not full read).
- Surface trend to user: which axes are stuck? Ask whether to keep iterating, pivot, or ship.
The orchestrator NEVER reads a sub-agent's full output. It either reads the agent's receipt text directly, or runs tail -n 5 .spiral/<slug>/iter-N/05-evaluation.md to grab the verdict.
Sub-agent prompt template
Every phase's Agent call follows this shape — copy and fill in:
Role: <one sentence>.
Read in this order:
1. <path>
2. <path>
[3. Prior iter-(N-1)/05-evaluation.md — treat findings as REQUIRED corrections]
Write your output to <output-path>. Cap: <N> words.
Required sections (in order):
- ...
Output rubric:
- Be specific. Cite file:line for non-obvious claims.
- No filler ("delightful", "seamless", "modern").
When done, reply with ONLY:
- A 1-sentence summary
- The literal line `WROTE: <output-path>`
[- For Phase 5 also: the literal line `VERDICT: GREEN` or `VERDICT: RED`]
Do NOT include the file's contents in your reply.
That last rule is the load-bearing one. It keeps your context from blowing up.
Phase 1 — Design (opus, foreground)
Role: senior product designer + UX researcher.
Reads: brief.md, prior iter-(N-1)/05-evaluation.md if exists, CLAUDE.md, the closest existing screen (orchestrator names the path).
Writes iter-N/01-design.md (≤ 1500 words):
- User flows (every path including failure & empty states)
- States table: idle / loading / empty / partial / success / error-recoverable / error-terminal / disabled
- UX principles (3-5 bullets, specific to this feature)
- UI direction (typography, color, motion, spacing; cite tokens; flag new ones)
- Copy register + ~10 candidate strings (EN + FR if repo is bilingual)
- Open questions (gaps for Planner to resolve)
Phase 2 — Planner (sonnet, foreground)
Role: tech lead doing work-wave decomposition.
Reads: iter-N/01-design.md, prior eval if exists, repo layout from CLAUDE.md.
Writes iter-N/02-plan.md (≤ 1000 words):
- Design readiness check (per section of 01-design). If under-specified, write
DESIGN-GAP: <what>at the top and reply with that line. Orchestrator re-runs Phase 1 with the gap appended (max one redo per iteration). - Work waves (ordered, ≤ 6h each, each ends in a typecheckable + testable state)
- Surface inventory (files created vs modified, paths only)
- Wave dependencies
- Risk list (3-5, ranked, with mitigation)
Phase 3 — Architect (opus, foreground)
Role: principal engineer choosing patterns for production-grade code.
Reads: iter-N/01-design.md, iter-N/02-plan.md, prior eval, CLAUDE.md, files named in plan's surface inventory.
Writes iter-N/03-architecture.md (≤ 1500 words):
- State shape (stores, reducers, persisted state; discriminated unions over flag bags)
- API contract (endpoints touched: method, path, request, response, error shape; Zod at HTTP edges)
- Module boundaries (deep modules vs thin routes; where coupling is acceptable)
- Pattern choices (3-7 non-obvious decisions, each justified in one sentence)
- Failure & resilience (timeouts, retries, idempotency, propagate vs catch)
- Scalability notes (only bits that materially matter)
- Test surfaces (unit / integration / E2E split)
Phase 4 — Implementation (sonnet, foreground, worktree)
Role: implementation lead. May spawn its own parallel children for independent waves.
Agent call MUST pass isolation: "worktree". The worktree path + branch come back in the agent's result message — capture them and write them into the receipt header.
Reads (this agent and its children): iter-N/02-plan.md, iter-N/03-architecture.md, CLAUDE.md, source files named in plan.
Per wave:
- Independent wave → spawn child Agent (general-purpose, sonnet) with surface list + architecture subsection + hard "don't touch files outside this list" rule. Children are foreground too. Run independent children as parallel Agent calls in the same message.
- Dependent wave → run after prior completes.
- After every wave: typecheck the affected package (
cd <pkg> && pnpm exec tsc --noEmit). Fix RED before moving on. Never--no-verify.
Writes iter-N/04-implementation.md (≤ 800 words):
Worktree: <absolute path>andBranch: <branch-name>on the first two lines- Waves completed (
[x]checklist) git diff --stat HEADoutput- Typecheck status per package
- Deviations from architecture (with reason; empty if none)
- Known follow-ups (out of scope)
- Hand-off notes for evaluator (≤ 5 sentences: how to exercise the feature, what state to seed)
Standalone /spiral: never commits, leaves changes in the worktree for the user to review.
SUB-SPIRAL MODE (running inside /meta-spiral): the OPPOSITE — every iteration's final state MUST be committed on the worktree branch before the receipt goes back. The meta-merge consumes the BRANCH TIP via git merge; uncommitted worktree changes are silently dropped (this shipped stubs/stale work to main three times). Commit message: spiral(<slug>): iter-N implementation. Exclude .spiral/ files.
Phase 5 — Evaluator (opus, foreground)
Role: adversarial critic + QA engineer. Defaults to RED unless evidence proves GREEN.
Reads: iter-N/01-design.md … iter-N/04-implementation.md, source diffs in the worktree, prior iter-(N-1)/05-evaluation.md if exists (verify previously-flagged issues are resolved).
Required actions:
- Switch to the worktree path from
04-implementation.md. - Start the dev server (use
/runskill or repo's documented command). - Use Playwright MCP to exercise every state from the design's state table — golden path + at least 2 failure paths.
- Save screenshots to
.spiral/<feature>/iter-N/screenshots/. - Write or extend a Playwright spec under
tests/e2e/specs/. Runpnpm test:e2e. - Adversarial critique across four axes — score 0-10 with evidence:
- Originality — one thing no competitor does, or AI-slop pastiche?
- Scalability — at 100× load, what breaks first? file + reason.
- Usability — wet-hands / one-handed / distracted user can recover from every error state?
- Functionality — every state from design actually reachable? Dead ends? Blind spots?
Writes iter-N/05-evaluation.md (≤ 2000 words):
- Scores (4 axes, evidence per score)
- Blockers (numbered, each
file:line+ what's wrong + what GREEN looks like) - AI-slop watch (quote generic / unfelt elements)
- Playwright results (pass/fail per spec; link spec path)
- Screenshots referenced (one bullet per state)
- Final line, alone:
VERDICT: GREENorVERDICT: RED
GREEN requires: all four axes ≥ 7, zero blockers, Playwright suite green, AI-slop watch empty. Anything less is RED.
Reply to orchestrator with ONLY:
- 1-sentence summary
WROTE: .spiral/<slug>/iter-N/05-evaluation.mdVERDICT: GREENorVERDICT: RED
Worktree handling on completion
- GREEN: surface the worktree path. Tell the user: "Changes are in
<worktree-path>on branch<branch>. Review withgit -C <path> diff mainand merge when ready." Do not auto-merge. - RED, continuing: leave the worktree in place; the next iteration's Phase 4 gets a fresh worktree (a new isolation: worktree call). Prior worktrees stay on disk for the user to inspect — DON'T clean them up automatically; mention them in the final summary.
- 4× RED, stopping: surface all worktree paths + the score trend.
Out-of-scope findings → spawn_task
When the evaluator catches improvements that are out of scope (dead code, stale docs, security issues in unrelated files), it MAY include a SPAWN_TASKS: block at the end of its reply listing them. The orchestrator then calls mcp__ccd_session__spawn_task once per item to fire them as side tasks. This keeps the spiral focused while not losing the finds.
Hard rules
- One file per phase. If a sub-agent writes more or returns the file contents in its reply, re-issue with a tighter prompt — don't accept it.
- Orchestrator never edits source. Only writes
.spiral/<slug>/brief.mdand reads receipts/verdicts. - No commits unless the user explicitly asks. CLAUDE.md governs commit hygiene. EXCEPTION: in sub-spiral mode (under /meta-spiral) the worktree branch MUST carry committed work — see Phase 4.
- Stage files explicitly when committing — never
git add -A. (Sub-spiral-mode worktree commits mayadd -Athen unstage.spiral/— the worktree contains only this spiral's work.) - Never
--no-verifypast hooks. - If two consecutive iterations score the same axis < 5, the bet is wrong, not the execution. Stop and surface to the user.
- If a foreground sub-agent appears to hang (no return after a reasonable wall-clock for its task type), don't wait indefinitely — surface it to the user. Foreground SHOULD return; if it doesn't, that's a real failure to escalate, not poll around.
Definition of done
.spiral/<feature>/iter-N/05-evaluation.mdends withVERDICT: GREEN.- Playwright suite green from a clean run inside the worktree.
- Screenshots for every state in
iter-N/screenshots/. - Worktree path surfaced to user; main tree untouched.
- TaskList shows all phase tasks completed.
- User can open
.spiral/<feature>/and audit every decision back to the brief.