name: sdd-review-specs
description: Use when OpenSpec artifacts have been generated by /opsx:propose and need review before implementation begins. Validates proposal scope, spec completeness, design decisions, and task executability.
SDD Review Specs
Overview
AI-generated specs are initial drafts, not final contracts. They reflect what the AI understood — not necessarily what you intended. Reviewing specs is the most critical human judgment gate in spec-driven development.
Core principle: Every AI-generated spec artifact must be reviewed by a human before any code is written.
Violating the letter of this review process is violating the spirit of spec-driven development.
The Iron Law
NO CODE WITHOUT REVIEWED SPECIFICATIONS
If the 4 artifacts haven't been reviewed, you cannot proceed to implementation.
When to Review
ALWAYS review after /opsx:propose generates artifacts. Never skip review, because these failures are common:
- AI misinterprets scope boundaries
- AI generates "happy path" specs with insufficient error coverage
- AI makes technology choices incompatible with existing architecture
- AI omits edge cases that a human would catch immediately
Review is NOT required for:
- Tier 0 changes: one-line fixes, typos, log lines, comments (no artifacts to review)
Tier-Based Review Depth
Not every change needs the same review depth. Classify by scope:
| Tier | Applies To | Review Scope | Time |
|---|---|---|---|
| Tier 0 — Skip | Typo fixes, log lines, comments, one-line changes | No artifacts exist. Verify with build/lint directly. | 0 min |
| Tier 1 — Light | Single-field additions, simple validation changes, config tweaks | Review proposal.md scope + tasks.md executability. Skim design.md and specs/. | 5-10 min |
| Tier 2 — Full | New features, cross-package refactors, architecture changes, API additions | Full review of all 4 artifacts with complete checklist. | 15-30 min |
If unsure, default to Tier 2. Over-reviewing is cheaper than missing a critical issue.
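The tier table can be read as a lookup with a conservative fallback. The following is a minimal sketch, not part of OpenSpec or any real tool: the change-kind labels and the `classify` helper are hypothetical names derived from the "Applies To" column, and anything unrecognized falls through to Tier 2.

```python
from enum import IntEnum


class Tier(IntEnum):
    SKIP = 0
    LIGHT = 1
    FULL = 2


# Hypothetical labels for kinds of change, taken from the "Applies To" column.
TIER_BY_CHANGE_KIND = {
    "typo_fix": Tier.SKIP,
    "log_line": Tier.SKIP,
    "comment": Tier.SKIP,
    "one_line_change": Tier.SKIP,
    "single_field_addition": Tier.LIGHT,
    "simple_validation_change": Tier.LIGHT,
    "config_tweak": Tier.LIGHT,
    "new_feature": Tier.FULL,
    "cross_package_refactor": Tier.FULL,
    "architecture_change": Tier.FULL,
    "api_addition": Tier.FULL,
}


def classify(change_kinds):
    """Return the deepest tier required by any listed kind of change.

    Unknown kinds, and an empty list, count as "unsure" and map to Tier 2.
    """
    if not change_kinds:
        return Tier.FULL
    return max(TIER_BY_CHANGE_KIND.get(kind, Tier.FULL) for kind in change_kinds)
```

A change that mixes kinds takes the deepest tier among them, so `classify(["typo_fix", "config_tweak"])` yields `Tier.LIGHT`.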
The Gate Function
BEFORE proceeding to any implementation:
1. IDENTIFY tier — Tier 0, 1, or 2?
2. READ each artifact in order: proposal → specs/ → design → tasks
3. CHECK against the checklist for your tier (below)
4. FEEDBACK — communicate every issue found, with suggested fix
5. ITERATE — after AI fixes, re-check affected artifacts
6. ONLY THEN — declare review passed
7. ROUTE — to superpowers:writing-plans
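The gate steps above can be sketched as a function that reports the next required action. This is an illustration under stated assumptions: `Review`, `required_artifacts`, and `gate` are hypothetical names, and the Tier 0 branch follows the tier table (no artifacts, verify with build/lint) rather than any documented tool behavior.

```python
from dataclasses import dataclass, field

# Reading order from step 2 of the gate.
ARTIFACT_ORDER = ["proposal.md", "specs/", "design.md", "tasks.md"]


@dataclass
class Review:
    tier: int
    checked: set = field(default_factory=set)        # artifacts checked against the checklist
    open_issues: list = field(default_factory=list)  # (artifact, issue, suggested_fix) tuples


def required_artifacts(tier):
    # Tier 0 produces no artifacts. Tier 1 and Tier 2 both touch all four,
    # at different depth (light checklist vs full checklist).
    return [] if tier == 0 else list(ARTIFACT_ORDER)


def gate(review):
    """Return the next action; routing is only reached when nothing is pending."""
    for artifact in required_artifacts(review.tier):
        if artifact not in review.checked:
            return f"READ and CHECK {artifact}"
    if review.open_issues:
        return "ITERATE: send feedback, then re-check affected artifacts"
    if review.tier == 0:
        return "No artifacts: verify with build/lint"
    return "ROUTE to superpowers:writing-plans"
```

The function never returns the routing step while an artifact is unchecked or an issue is still open, which is the property the gate exists to enforce.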
Tier 2 Full Review Checklist
Create a task for each artifact. Complete in order.
1. proposal.md — Scope & Motivation
- Motivation clear: Can a new team member understand WHY this change exists by reading the first 3 sentences?
- Scope explicit: Is every in-scope item concretely named (not "and related features")?
- Out of scope explicit: Are excluded items listed with reasons ("deferred to future change" vs "not applicable")?
- No scope creep signals: No phrases like "and similar improvements" or "as needed"
- Boundary contract: If someone later asks "can we add X to this change?", can you answer by pointing to in/out scope?
2. specs/ — Behavioral Completeness
- Happy paths covered: Every in-scope item in proposal.md maps to ≥1 behavioral requirement in specs/
- Error paths covered: For every happy path, are there corresponding error/edge case requirements?
- Input validation: Are invalid inputs, boundary values, and null/empty states addressed?
- State transitions: If the feature involves state changes, are all transitions defined?
- No vague language: No "should handle errors appropriately" without specifying HOW
- Interface signatures: If the spec defines functions/APIs, are signatures concrete (types, parameters, return values)?
- Consistency with proposal.md: Does every spec requirement fall within proposal.md's scope?
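Three of the checks above (happy paths covered, error paths covered, consistency with proposal.md) amount to a traceability comparison between the proposal's in-scope list and the spec requirements. A minimal sketch follows, assuming the reviewer has already extracted both by hand; the `trace_coverage` helper and the dict keys `id`, `scope_item`, and `kind` are hypothetical, not an OpenSpec format.

```python
def trace_coverage(in_scope, requirements):
    """Compare in-scope items against spec requirements.

    requirements: list of dicts with 'id', 'scope_item', and 'kind',
    where 'kind' is either 'happy' or 'error'.
    """
    scope = set(in_scope)
    happy = {r["scope_item"] for r in requirements if r["kind"] == "happy"}
    error = {r["scope_item"] for r in requirements if r["kind"] == "error"}
    return {
        # In-scope items with no behavioral requirement at all.
        "missing_happy_path": sorted(scope - happy),
        # Items with a happy path but no matching error/edge requirement.
        "missing_error_path": sorted((scope & happy) - error),
        # Requirements that fall outside the proposal's scope.
        "out_of_scope": sorted(r["id"] for r in requirements if r["scope_item"] not in scope),
    }


report = trace_coverage(
    ["export csv", "import csv"],
    [
        {"id": "R1", "scope_item": "export csv", "kind": "happy"},
        {"id": "R2", "scope_item": "export csv", "kind": "error"},
        {"id": "R3", "scope_item": "import csv", "kind": "happy"},
        {"id": "R4", "scope_item": "bulk delete", "kind": "happy"},
    ],
)
```

Here `report` flags "import csv" as lacking an error path and `R4` as outside the proposal's scope. Any non-empty list in the report is a finding to send back as feedback.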
3. design.md — Technical Soundness
- Approach justified: Does it explain WHY this approach over alternatives?
- Alternatives documented: Are rejected alternatives listed with rejection reasons?
- Dependency check: Do new dependencies conflict with existing ones? (e.g., "introduce Redis" but project already uses a different cache)
- Existing patterns respected: Does the design follow established project conventions?
- Integration points: Are touch points with existing modules explicitly named?
- Risks identified: Are there risk items ("what if the external API is down?") and mitigation strategies?
- Concurrency/performance: If relevant, are goroutine/thread safety, memory, and latency considered?
4. tasks.md — Executability
- Complete coverage: Does every in-scope item from proposal.md map to ≥1 checkbox?
- Test tasks included: Does every implementation task have a corresponding test task?
- No vague tasks: No "handle edge cases" or "add appropriate error handling" — each task is concrete
- Correct ordering: Do dependent tasks come after their dependencies?
- Independently executable: Can a developer (or AI) pick up any single task and complete it without reading others?
- Verifiable per task: Is there a clear "done" signal for each checkbox (test passes, file exists, etc.)?
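The ordering and test-pairing checks lend themselves to a mechanical pass once the tasks are written down as data. The sketch below assumes a hand-built task list; `check_tasks` and the keys `id`, `depends_on`, and `tests` are hypothetical and do not describe how tasks.md is actually structured.

```python
def check_tasks(tasks):
    """Return a list of problems found in an ordered task list.

    tasks: list of dicts with 'id', optional 'depends_on' (list of ids),
    and optional 'tests' (id of the implementation task a test task covers).
    """
    problems = []
    seen = set()
    for task in tasks:
        # Correct ordering: every dependency must appear earlier in the list.
        for dep in task.get("depends_on", []):
            if dep not in seen:
                problems.append(
                    f"{task['id']} depends on {dep}, which does not appear earlier in the list"
                )
        seen.add(task["id"])

    # Test tasks included: every implementation task needs a test task pointing at it.
    tested = {t["tests"] for t in tasks if t.get("tests")}
    for task in tasks:
        if not task.get("tests") and task["id"] not in tested:
            problems.append(f"{task['id']} has no corresponding test task")
    return problems


problems = check_tasks([
    {"id": "T2", "depends_on": ["T1"]},
    {"id": "T1"},
    {"id": "T3", "depends_on": ["T1"], "tests": "T1"},
])
```

In this example `T2` is reported twice: once for preceding its dependency and once for lacking a test task. Vague task names and missing "done" signals still need a human read.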
Tier 1 Light Review Checklist
- proposal.md: Scope boundaries clear? In/out scope reasonable?
- tasks.md: Each task concrete and executable? Test tasks paired with implementation tasks?
- design.md (skim): Any obviously wrong technology choices? Any conflict with existing dependencies?
- specs/ (skim): Any obvious missing error paths?
Red Flags — STOP and Revise
These patterns in AI-generated artifacts mean the spec is NOT ready:
proposal.md Red Flags
- "and related features" / "as needed" / "etc." — scope creep by design
- In-scope list has 10+ items — change is too large, split it
- No out-of-scope section — AI avoided saying "no"
specs/ Red Flags
- All happy paths, zero error paths — AI only described the ideal case
- "should work correctly" / "should handle errors" — too vague to verify
- No edge cases (empty input, null, boundary values, concurrency) — incomplete
- Requirements that can't be tested — "the system should be fast"
design.md Red Flags
- "Use [technology]" without explaining WHY — missing decision record
- No alternatives section — AI didn't consider other approaches
- "Similar to [existing module]" without specifics — lazy design
- Introduces a new dependency without noting it — architectural drift
- Ignores existing project conventions — AI didn't read project.md/CLAUDE.md
tasks.md Red Flags
- Tasks with "and" in the name — should be split
- No test tasks — TDD impossible without them
- "Refactor as needed" / "Add error handling" — placeholder tasks
- Tasks ordered alphabetically rather than by dependency — wrong order
Any of these red flags means: the artifact is not ready. Give specific feedback and request revision.
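The phrase-level red flags can be pre-screened with a simple text scan before the human read. The sketch below is an assumption-laden aid, not a replacement for review: the `RED_FLAG_PATTERNS` table and the `scan` helper are hypothetical, the patterns cover only the quoted phrases listed above, and structural flags (a missing out-of-scope section, 10+ in-scope items, alphabetical ordering) are not detected.

```python
import re

# Quoted phrases from the red-flag lists, grouped by artifact.
RED_FLAG_PATTERNS = {
    "proposal.md": [
        r"and related features",
        r"\bas needed\b",
        r"\betc\.",
        r"and similar improvements",
    ],
    "specs/": [
        r"should work correctly",
        r"should handle errors",
        r"should be fast",
    ],
    "design.md": [r"\bsimilar to\b"],
    "tasks.md": [
        r"refactor as needed",
        r"add (appropriate )?error handling",
        r"handle edge cases",
    ],
}


def scan(artifact, text):
    """Return (line_number, matched_phrase) pairs for each red-flag phrase found."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern in RED_FLAG_PATTERNS.get(artifact, []):
            match = re.search(pattern, line, flags=re.IGNORECASE)
            if match:
                hits.append((line_no, match.group(0)))
    return hits
```

An empty result only means none of the listed phrases appear; the artifact still has to pass the checklist for its tier.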
Common Rationalizations
| Excuse | Reality |
|---|---|
| "The AI usually gets this right" | AI makes systematic mistakes. Review catches them. |
| "I'll catch issues during implementation" | Issues found during coding cost far more to fix than issues found in review. |
| "The spec looks reasonable at a glance" | Skimming is not reviewing. Use the checklist. |
| "This is too small to need review" | Only Tier 0 changes are exempt. Anything larger carries unexamined assumptions, however small it looks. |
| "I trust the AI's design decisions" | Trust but verify. The AI doesn't know your unwritten conventions. |
| "Review takes too long" | Rework from unreviewed specs takes longer. |
| "I already reviewed it during brainstorming" | Brainstorming explores. Propose formalizes. They're different artifacts. |
| "The tests will catch spec issues" | Tests verify implementation, not design quality. Wrong design + passing tests = wrong product. |
All of these mean: do the review. Follow the checklist.
Common Failures
| Claim | Requires | Not Sufficient |
|---|---|---|
| "Specs reviewed" | Every checklist item checked, issues documented | "Read through it", "looks fine" |
| "Scope is clear" | In/out scope sections are explicit and complete | "I know what they mean" |
| "Design is solid" | Alternatives documented, decisions justified, risks noted | "The approach makes sense" |
| "Tasks are executable" | Each task concrete, independently verifiable, correctly ordered | "The list looks complete" |
| "Ready for implementation" | All red flags resolved, tier-appropriate checklist passed | "Should be good enough" |
After Review Passes
Review passed → invoke superpowers:writing-plans to refine task granularity
REQUIRED SUB-SKILL: Use superpowers:writing-plans to convert the reviewed tasks.md into 2-5 minute bite-sized tasks. The reviewed artifacts are now the authoritative spec baseline — writing-plans works from them.
Why This Matters
Unreviewed AI-generated specs cause predictable failures:
- Scope creep: "While we're at it, let's also..." — adds unplanned work mid-implementation
- Wrong technology: AI proposes a library that conflicts with existing stack
- Missing error handling: Happy-path-only specs produce brittle implementations
- Untestable requirements: Vague specs can't be verified — "did we build the right thing?"
- Design drift: Without a decision record, 3 months later nobody remembers why approach A was chosen
Review is the human judgment gate. AI generates fast. Humans decide correctly.
The Bottom Line
No code before reviewed specs.
Read each artifact. Check against the checklist. Flag every issue. Only when all artifacts pass review does implementation begin.
This is non-negotiable.