name: spec
description: Write specifications for Pangea/pelago-aiml. Creates Epics, Issues, Bugs, Spikes, or PRDs with Linear integration. Uses /research_codebase for deep codebase analysis. Validates specs for ACs, diagrams, analytics, and test plans.
Spec Writer (Pangea)
Write well-structured specifications through guided conversation. Supports Epics, Issues/Stories, Bugs, Spikes, and PRDs.
When this skill is invoked, follow the flows below. They are carefully crafted.
Project Integration:
- Uses the `/research_codebase` command for codebase analysis
- Stores research in `docs/thoughts/shared/research/`
- Integrates with Pangea team workflows (Linear, Sparky, Amplitude)
Modes
| Command | Action |
|---|---|
| `/spec` | Show options: quick from context, fresh, or analyze existing |
| `/spec PAN-174` | Analyze & update existing ticket (skip to Step 0b) |
| `/spec PAN-174 --validate` | Validate only, show report |
| `/spec PAN-174 --rewrite` | Discard current, start fresh |
| `/spec --quick` | Skip options, go straight to quick mode from context |
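The mode table above can be read as a small argument grammar. The following is a minimal sketch of that parsing, assuming ticket IDs look like `PAN-174` (uppercase prefix, hyphen, number). `SpecCommand` and `parse_spec_command` are hypothetical names used for illustration only; they are not part of the skill.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: ticket IDs follow the PAN-174 shape shown in the table.
TICKET_RE = re.compile(r"^[A-Z]+-\d+$")


@dataclass
class SpecCommand:
    ticket_id: Optional[str] = None
    validate_only: bool = False
    rewrite: bool = False
    quick: bool = False


def parse_spec_command(text: str) -> SpecCommand:
    """Parse a '/spec ...' invocation into its ticket ID and flags."""
    tokens = text.split()
    if not tokens or tokens[0] != "/spec":
        raise ValueError("Expected a command starting with /spec")
    cmd = SpecCommand()
    for token in tokens[1:]:
        if token == "--validate":
            cmd.validate_only = True
        elif token == "--rewrite":
            cmd.rewrite = True
        elif token == "--quick":
            cmd.quick = True
        elif TICKET_RE.match(token):
            cmd.ticket_id = token
        else:
            raise ValueError(f"Unrecognized argument: {token}")
    return cmd
```

A bare `/spec` yields a command with no ticket and no flags, which corresponds to showing the three mode options.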
Workflow
Step 0: Check MCP Availability
Quick check - if MCPs are connected, proceed. No API calls needed.
Check for MCP tools in current session:
- `mcp__letta__*` tools available → Sparky ✓
- `mcp__linear-server__*` tools available → Linear ✓
- Sparky skill exists at `.claude/skills/sparky/` → Sparky script ✓
If any are missing, brief notice:
Available integrations: [Sparky ✓] [Linear ✓] [Exa ✓]
Missing: [list any missing]
→ I can still write specs, just can't [consult Sparky / create Linear tickets]
Proceed? (Y/n)
Degraded Mode:
- Sparky unavailable → Skip Step 3, don't offer consultation
- Linear unavailable → Skip Step 7, output markdown only
- Exa unavailable → Skip Step 3b, no best practices search
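One way to picture degraded mode is as a lookup from each missing integration to the step it disables. This is an illustrative sketch only: `SKIPPED_STEPS`, `status_line`, and `steps_to_skip` are hypothetical names, and the real availability check is done by inspecting the session's MCP tools.

```python
from typing import Dict, List, Set

# Integration name -> step that is skipped when it is unavailable.
SKIPPED_STEPS: Dict[str, str] = {
    "Sparky": "Step 3",
    "Linear": "Step 7",
    "Exa": "Step 3b",
}


def status_line(available: Set[str]) -> str:
    """Render the 'Available integrations' line with a check or cross per integration."""
    parts = [
        f"[{name} {'✓' if name in available else '✗'}]" for name in SKIPPED_STEPS
    ]
    return "Available integrations: " + " ".join(parts)


def steps_to_skip(available: Set[str]) -> List[str]:
    """List the workflow steps to skip given the available integrations."""
    return [step for name, step in SKIPPED_STEPS.items() if name not in available]
```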
Step 0a: Present Mode Options
Always present these three options after showing integrations:
Available integrations: [Sparky ✓/✗] [Linear ✓/✗] [Exa ✓/✗]
A) Create from conversation or plan - extract context, generate spec
B) Start fresh - new interactive flow
C) Analyze existing ticket - enter ticket ID (e.g., PAN-174)
Option A (Quick Mode):
- Extract context from recent conversation or existing plan
- Identify: problem, solution approach, scope, technical details
- Infer spec type from context
- Skip to Step 4 (Generate Spec) with extracted context
- Still run validation (Step 6c)
- Show abbreviated review before creating ticket
Option B (Fresh Start):
- Continue to Step 1 (normal interactive flow)
Option C (Existing Ticket):
- Prompt for ticket ID if not provided
- Go to Step 0b (fetch and analyze existing ticket)
Context Extraction for Quick Mode:
When user chooses A, extract from conversation:
- Problem/goal being solved
- Proposed solution or approach
- Technical details (files, functions, architecture)
- Scope boundaries mentioned
- Any acceptance criteria discussed
Generate spec directly, show for review:
Based on our conversation, here's the spec:
[GENERATED SPEC]
┌─────────────────────────────────────────────────────────────────┐
│ ✓ Validation: PASS (N warnings) │
├─────────────────────────────────────────────────────────────────┤
│ A) Create ticket in Linear │
│ B) Edit something first │
│ C) Go full interactive mode │
└─────────────────────────────────────────────────────────────────┘
Step 0b: Detect Existing Ticket Mode
Check if user provided a ticket ID (e.g., /spec PAN-174):
If a ticket ID is provided:
- Fetch ticket via `mcp__linear-server__get_issue` with `includeRelations: true`
- Extract: title, description, type, status, labels, parent, children
- Detect spec type from labels/content (Issue, Bug, Epic, Spike, PRD)
- Go to Step 0c (Existing Ticket Analysis)
If no ticket ID:
- Continue to Step 1 (normal create flow)
Step 0c: Existing Ticket Analysis
ALL existing-ticket analysis must go through this flow. Do not just vibe-check it; use this exact validation check. This is our core ticket spec schema, which all tickets must follow.
Run validation checks on the current ticket content:
┌─────────────────────────────────────────────────────────────────┐
│ TICKET ANALYSIS: [TICKET-ID] │
├─────────────────────────────────────────────────────────────────┤
│ Title: [title] │
│ Type: [detected type] │
│ Status: [status] │
│ URL: [linear URL] │
├─────────────────────────────────────────────────────────────────┤
│ VALIDATION: │
│ [✓/✗/⚠] Problem statement: [status] │
│ [✓/✗/⚠] Acceptance criteria: [count] found, [status] │
│ [✓/✗/⚠] Scope defined: [status] │
│ [✓/✗/⚠] Architecture diagram: [present/missing] │
│ [✓/✗/⚠] User flow diagram: [present/missing] │
│ [✓/✗/⚠] Test plan: [present/missing] │
│ [✓/✗/⚠] Analytics events: [present/missing] │
│ [✓/✗/⚠] Implementation hints: [status] │
│ [✓/✗/⚠] Ambiguous language: [status] │
│ [✓/✗/⚠] AC quality: [discrete stories / vague summary] │
├─────────────────────────────────────────────────────────────────┤
│ Result: [PASS/FAIL] ([N] blocking, [M] warnings) │
└─────────────────────────────────────────────────────────────────┘
Validation Checks:
| Check | Rule | Severity |
|---|---|---|
| `problem_statement` | "Problem" or "Bug Report" section exists, >50 chars | BLOCK |
| `acceptance_criteria` | At least 3 ACs, preferably in Gherkin format | BLOCK |
| `scope_defined` | Has "Goals"/"In Scope" AND "Non-Goals"/"Out of Scope" | BLOCK |
| `architecture_diagram` | Has ASCII diagram for multi-component work | BLOCK (Large/Epic/PRD) |
| `user_flow_diagram` | Has user flow diagram for all features that impact member experience in any way | BLOCK |
| `test_plan` | Lists E2E/Integration/Regression tests to write | BLOCK (Issue/Bug) |
| `analytics_events` | Has Analytics/Amplitude section with event definitions | BLOCK (Issue/Epic) |
| `ac_quality` | ACs are discrete testable stories, not vague summaries | BLOCK |
| `no_ambiguous_language` | No "should", "might", "could", "maybe" in ACs | WARN |
| `tldr_present` | TL;DR or summary section exists | WARN |
| `implementation_hints` | Has files/functions to change (Issues/Bugs only) | WARN |
| `gherkin_format` | ACs use Given/When/Then format | WARN |
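Three of the checks above are mechanical enough to sketch in code. The example below is a rough illustration, not the skill's actual validator: it assumes sections are introduced by `## ` headings and that ACs are labelled `AC1:`, `AC2:`, and so on, as in the templates. `section_body` and `validate` are hypothetical helper names.

```python
import re
from typing import Dict, Tuple

AMBIGUOUS = re.compile(r"\b(should|might|could|maybe)\b", re.IGNORECASE)
# Assumption: each AC starts a line as "AC<n>:" or "**AC<n>:".
AC_HEADING = re.compile(r"^\**AC\d+:", re.MULTILINE)


def section_body(spec: str, names: Tuple[str, ...]) -> str:
    """Return the text under the first '## <name>' heading found, or ''."""
    for name in names:
        match = re.search(
            rf"^## {re.escape(name)}[^\n]*\n(.*?)(?=^## |\Z)",
            spec,
            re.MULTILINE | re.DOTALL,
        )
        if match:
            return match.group(1).strip()
    return ""


def validate(spec: str) -> Dict[str, str]:
    """Run three of the checks; each value is PASS, WARN, or BLOCK."""
    problem = section_body(spec, ("Problem", "Bug Report"))
    acs = section_body(spec, ("Acceptance Criteria", "Fix Acceptance Criteria"))
    return {
        "problem_statement": "PASS" if len(problem) > 50 else "BLOCK",
        "acceptance_criteria": "PASS" if len(AC_HEADING.findall(acs)) >= 3 else "BLOCK",
        "no_ambiguous_language": "WARN" if AMBIGUOUS.search(acs) else "PASS",
    }
```

Checks such as `ac_quality` need judgment rather than pattern matching, so they are deliberately left out of this sketch.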
If --validate flag: Show report and stop. Do not offer to update.
If --rewrite flag: Skip to Step 1, but pre-fill context from existing ticket.
Otherwise, present options:
What would you like to do?
A) Fix issues interactively - I'll ask questions to fill gaps
B) Rewrite entire spec - start fresh, keeping context
C) View current description - show full content
D) Update specific section - choose what to change
Option A (Fix interactively):
- Go to Step 0d (Gap-Filling Mode)
Option B (Rewrite):
- Store existing context (title, any valid sections)
- Go to Step 1 with pre-filled context
- Mark as UPDATE mode for Step 7
Option C (View):
- Display full ticket description
- Return to options
Option D (Update section):
- Show list of sections
- Allow targeted edits
- Go to Step 7 (update mode)
Step 0d: Gap-Filling Mode
Only ask about what's missing. Do not re-ask about sections that pass validation.
The ticket is missing some key information. Let me ask about the gaps:
[Only show questions for FAILED or WARN checks]
Gap Questions by Check:
| Failed Check | Questions to Ask |
|---|---|
| `problem_statement` | "What problem does this solve? Who experiences it?" |
| `acceptance_criteria` | "What are the key acceptance criteria? (list 3-5)" |
| `scope_defined` | "What's explicitly OUT of scope for this work?" |
| `architecture_diagram` | "Can you describe the component interactions? I'll create a diagram." |
| `user_flow_diagram` | "What's the user journey? I'll create a flow diagram." |
| `test_plan` | "What E2E, Integration, or Regression tests should be written?" |
| `analytics_events` | "What events should be tracked? (e.g., feature_used, error_occurred)" |
| `ac_quality` | "Let's break down the ACs into discrete, testable user stories." |
| `implementation_hints` | "Which files/functions will likely need changes?" |
After gathering gaps:
- Merge new content with existing valid sections
- Generate updated spec
- Go to Step 5 (Review) then Step 7 (Update mode)
Step 1: Identify Spec Type & Concept
┌─────────────────────────────────────────────────────────────────┐
│ STEP 1: SPEC TYPE │
└─────────────────────────────────────────────────────────────────┘
What are we building?
┌────────────────┬──────────────────────────────────────────────────────────┐
│ Type │ Description │
├────────────────┼──────────────────────────────────────────────────────────┤
│ A) Epic │ Large feature or initiative spanning multiple PRs/issues │
│ │ Example: "Implement voice conversation analytics system" │
│ │ Example: "Add multi-language support across the app" │
├────────────────┼──────────────────────────────────────────────────────────┤
│ B) Issue/Story │ Single deliverable, one focused piece of work │
│ │ Example: "Add push notification scheduling for outreach" │
│ │ Example: "Create settings screen with dark mode toggle" │
├────────────────┼──────────────────────────────────────────────────────────┤
│ C) Bug │ Defect report with reproduction steps │
│ │ Example: "App crashes when tapping back button on iOS" │
│ │ Example: "Audio cuts out after 30 seconds of silence" │
├────────────────┼──────────────────────────────────────────────────────────┤
│ D) Spike │ Research/investigation with time-box │
│ │ Example: "Evaluate LiveKit vs Daily for WebRTC" │
│ │ Example: "Investigate memory leak in conversation view" │
├────────────────┼──────────────────────────────────────────────────────────┤
│ E) PRD │ Full Product Requirements Document for major features │
│ │ Example: "Proactive outreach system - full product spec" │
│ │ Example: "Member onboarding flow redesign" │
└────────────────┴──────────────────────────────────────────────────────────┘
Briefly describe what you're trying to do:
(e.g., "Add push notification scheduling for proactive outreach"
or "Fix crash when user taps back button on iOS")
Wait for user to provide spec type + basic concept.
Step 1b: Select Research
Once you know what they're building, ask what research to run:
Got it: [SPEC_TYPE] for "[BRIEF_CONCEPT]"
What research should I run? (e.g. "1,2,3 M" or "A" for all)
1) Codebase - scan related files, patterns
2) Linear - existing tickets, projects
3) Sparky - product/design context
4) Git - recent commits, branches
5) Docs - README, ADRs, specs
6) Exa - search web for best practices, examples
7) Other - [describe what you want researched]
A) All - run everything
N) None - skip research
Depth: Q (quick) / M (medium) / D (deep)
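Replies such as "1,2,3 M" or "A" can be decoded with a small tokenizer. This is a sketch under two assumptions that the menu does not state: "A" expands to the six named sources (excluding "Other", which needs a description), and a reply without a depth letter leaves the depth unset. `parse_selection` is a hypothetical name.

```python
import re
from typing import List, Optional, Tuple

SOURCES = {1: "Codebase", 2: "Linear", 3: "Sparky", 4: "Git", 5: "Docs", 6: "Exa", 7: "Other"}
DEPTHS = {"Q": "quick", "M": "medium", "D": "deep"}


def parse_selection(reply: str) -> Tuple[List[str], Optional[str]]:
    """Turn a reply like '1,2,3 M' into (source names, depth)."""
    chosen = set()
    depth = None
    for token in re.split(r"[,\s]+", reply.strip().upper()):
        if token == "A":
            chosen.update(range(1, 7))  # assumption: "All" means sources 1-6
        elif token == "N":
            chosen.clear()
        elif token in DEPTHS:
            depth = DEPTHS[token]
        elif token.isdigit() and int(token) in SOURCES:
            chosen.add(int(token))
        else:
            raise ValueError(f"Unrecognized selection: {token!r}")
    return [SOURCES[n] for n in sorted(chosen)], depth
```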
Step 1c: Run Parallel Research
CRITICAL: Launch ALL research in a SINGLE message with multiple Task tool calls.
Do NOT call Bash directly for Sparky. Do NOT run research sequentially.
For Deep Codebase Research: If the user needs comprehensive codebase understanding, recommend they run /research_codebase first, then return to /spec. The research document will provide thorough context.
For Quick Spec Research: Use the parallel Task approach below.
Send ONE message containing multiple Task tool invocations. Example:
[Single message with 6 Task tool calls - all launch simultaneously]
Task 1: Codebase (invoke /research_codebase command)
Task 2: Linear (subagent_type=general-purpose)
Task 3: Sparky (subagent_type=general-purpose)
Task 4: Git (subagent_type=Bash)
Task 5: Docs (subagent_type=Explore)
Task 6: Exa (subagent_type=general-purpose)
Each Task returns max 500 tokens. All run simultaneously.
Task Prompts (subagent_type for each):
| Research | subagent_type | Prompt |
|---|---|---|
| Codebase | general-purpose | Invoke the /research_codebase command with the topic. The command uses specialized agents (codebase-locator, codebase-analyzer, codebase-pattern-finder) to find files, understand code, and identify patterns. Return summary of findings relevant to the spec. |
| Linear | general-purpose | Search Linear for [TOPIC]: related tickets, team, project. Also review all open and pending tickets, as other in-flight work might impact this topic. |
| Sparky | general-purpose | [FROM SPEC-MAKER] + product context, key questions |
| Git | Bash | Git analysis for [TOPIC]: recent commits, active branches |
| Docs | Explore | Find docs for [TOPIC]: README, ADRs, thoughts/ directory |
| Exa | general-purpose | Web search for [TOPIC]: best practices, pitfalls (ALWAYS run) |
Codebase Research Details:
When invoking codebase research, the agent should:
- Use the `/research_codebase` command pattern with specialized sub-agents:
  - codebase-locator: Find WHERE files and components live
  - codebase-analyzer: Understand HOW specific code works
  - codebase-pattern-finder: Find examples of existing patterns
- Focus on documenting what EXISTS (not suggesting improvements)
- Return:
  - Related files with paths
  - Existing patterns to follow
  - Integration points
  - Dependencies
Important for Sparky:
- Use Task tool with `subagent_type: general-purpose`
- The subagent calls: `bash .claude/skills/sparky/scripts/ask_sparky.sh "prompt"`
- Do NOT call Bash directly from main conversation (blocks parallel execution)
- ALL messages to Sparky MUST begin with `[FROM SPEC-MAKER]`
Important for Exa:
- ALWAYS include Exa in research (unless user explicitly says "none")
- Search for: "[TOPIC] best practices", "[TOPIC] common pitfalls", "[TOPIC] architecture patterns"
- Include results in research summary under "Best Practices"
Context Window Management:
- Each subagent returns max 500 tokens
- Total research context: max 3500 tokens (7 sources × 500)
- If a source returns nothing useful, omit from summary
- Present combined research summary to user before questioning
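The budgeting rules above (cap each source, drop empty ones, present one combined summary) can be sketched as follows. Whitespace-separated words are used here as a crude stand-in for tokens, which is an assumption; a real implementation would use the model's tokenizer. `combine_research` is a hypothetical name.

```python
from typing import Dict

PER_SOURCE_BUDGET = 500  # per-subagent cap from the rules above


def combine_research(results: Dict[str, str], per_source: int = PER_SOURCE_BUDGET) -> str:
    """Trim each source to its budget and join the non-empty ones."""
    sections = []
    for source, text in results.items():
        words = text.split()
        if not words:
            continue  # omit sources that returned nothing useful
        trimmed = " ".join(words[:per_source])
        sections.append(f"**{source}:**\n{trimmed}")
    return "\n\n".join(sections)
```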
Research Summary Output:
📊 Research Complete
**Codebase:**
- Tech: [stack]
- Related files: [list]
- Patterns: [conventions]
**Linear:**
- Related tickets: [list with IDs]
- Team: [team name]
- Current epic: [if any]
**Sparky's Input:**
- Key context: [summary]
- (Questions moved to Step 2 below)
**Git:**
- Recent activity: [summary]
- Active branches: [list]
**Docs:**
- Relevant docs: [list]
**Exa (Best Practices):**
- Patterns: [list]
- Pitfalls: [list]
- Sources: [URLs]
Ready to gather specifics. [Proceed to Step 2]
Step 2: Gather Initial Context
Convert research findings into lettered questions.
Do NOT list Sparky's questions separately, then ask different questions. Instead, take Sparky's questions and format them WITH options.
Building the question batch:
- Take Sparky's key questions (if any)
- Add technical questions from codebase findings
- Format each with 3-4 lettered options based on common patterns
- Present as ONE unified batch
Example - Converting Sparky's questions:
Sparky asked: "Who are the primary annotators?"
Convert to:
1. Who are the primary users?
A) Just you / small eng team (< 5 people)
B) Mixed team (eng + clinicians/researchers)
C) Broader org access needed
D) Other: [describe]
Question Format:
- 2-4 options per question with letters (A, B, C, D)
- Always include D) Other for custom input
- User responds with shorthand: "1A, 2C, 3D"
- Ask in batches of 3-5 questions, wait for response
- All questions in ONE consistent format
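Shorthand replies like "1A, 2C, 3D" map naturally to a question-number-to-letter dictionary. The sketch below assumes options are limited to A through D, as in the format rules; `parse_answers` is a hypothetical helper, and free-text answers after "Other" are not captured by it.

```python
import re
from typing import Dict

# A question number followed by one option letter, e.g. "2C" or "2 c".
ANSWER_RE = re.compile(r"(\d+)\s*([A-Da-d])\b")


def parse_answers(reply: str) -> Dict[int, str]:
    """Extract {question_number: option_letter} from a shorthand reply."""
    return {int(number): letter.upper() for number, letter in ANSWER_RE.findall(reply)}
```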
Example unified batch:
Based on the research, I need to clarify a few things.
Answer with shorthand like "1B, 2A, 3C" or provide custom answers.
1. What type of user problem is this?
A) UX friction - users can do it but it's painful
B) Missing functionality - users can't do something they need
C) Performance issue - too slow or resource-intensive
D) Other: [describe]
2. How urgent is this?
A) Blocking - users are stuck, no workaround
B) High - significant pain, workaround exists
C) Medium - nice to have, not urgent
D) Low - minor improvement
3. Is there existing code/systems this touches?
A) New feature - mostly greenfield
B) Modification - changing existing behavior
C) Integration - connecting existing pieces
D) Refactor - restructuring without behavior change
Key Questions by Type (adapt with lettered options):
Epic
- What problem does this solve? (user pain / business need / technical debt)
- Who are the primary users/stakeholders?
- What does success look like? (metrics)
- What's in scope vs explicitly out of scope?
- What are the key user stories? (list 3-5)
- Dependencies on other teams/systems?
- Rough scope? (small/medium/large)
Issue/Story
- What user problem does this address?
- How urgent/important is this?
- What are the acceptance criteria? (list key ones)
- Are there technical constraints?
- Dependencies or blockers?
- Any design/UX requirements?
Bug
- What's the expected vs actual behavior?
- Steps to reproduce? (numbered list)
- Environment details (OS, browser, version)?
- Severity? (critical/high/medium/low)
- Impact - how many users affected?
- Any workarounds known?
Spike
- What question are we trying to answer?
- What's the time-box? (hours/days)
- What artifacts will be produced?
- What decision will this inform?
- What options are we evaluating?
PRD
- What problem are we solving?
- Who is the target user?
- What are the goals? (list 2-3)
- What are the non-goals? (explicitly out of scope)
- What does the user journey look like?
- Technical constraints or dependencies?
- How will we measure success?
Iteration Pattern:
Continue asking questions until you have:
- Clear problem statement
- Defined scope (in/out)
- Acceptance criteria (at least 3)
- Technical approach (if applicable)
- Dependencies identified
Then proceed to Step 3 (Sparky) or Step 4 (Generate Spec).
Step 3: Consult Sparky (Optional)
Skip this step if Sparky was marked unavailable in Step 0.
Once you have sufficient context about what the user is trying to build, ask if they want Sparky's input:
I have enough context to start drafting. Would you like me to consult Sparky first?
Sparky can provide product/design/user research perspective and identify
the most important questions you should answer before finalizing the spec.
A) Yes, consult Sparky
B) No, proceed to draft
If the user chooses Yes:
- Summarize the context gathered so far
- Call Sparky with a structured prompt:
bash .claude/skills/sparky/scripts/ask_sparky.sh "PROMPT"
Prompt Template for Sparky:
[FROM SPEC-MAKER] We're writing a [SPEC_TYPE] for: [BRIEF_DESCRIPTION]
Context gathered so far:
- Problem: [PROBLEM]
- Users: [TARGET_USERS]
- Scope: [IN_SCOPE / OUT_OF_SCOPE]
[OTHER RELEVANT CONTEXT]
Based on your product/design/user research knowledge:
1. What context or insights should inform this spec?
2. What are the 3-5 most important questions the human needs to answer before we finalize?
3. Any red flags or considerations we should address?
Using Sparky's Response:
- Present Sparky's insights to the user
- Ask the user to answer Sparky's key questions
- Incorporate both Sparky's context and the user's answers into the spec
Example Sparky Response:
Context: We've seen users struggle with [X] in the past. The design team
explored [Y] approach last quarter.
Key questions to answer:
1. How does this interact with the existing [feature]?
2. What's the rollout strategy - all users or phased?
3. Have we validated this solves the actual user pain point?
4. What's the fallback if [edge case] happens?
Red flags: Make sure to consider [Z] which caused issues before.
Step 3b: Search Best Practices (Exa)
Before generating the spec, search for industry best practices using Exa:
Searching for best practices related to: [TOPIC]...
Use mcp__exa__web_search_exa to find:
- Implementation patterns for similar features
- Common pitfalls and how to avoid them
- Security considerations
- Performance best practices
- UX patterns (if applicable)
Example Exa queries:
| Spec Topic | Search Query |
|---|---|
| Auth system | "OAuth 2.0 implementation best practices 2024" |
| API design | "REST API design patterns error handling" |
| Database migration | "zero downtime database migration patterns" |
| Caching | "caching strategies cache invalidation best practices" |
| File upload | "secure file upload validation best practices" |
Exa Search Call:
mcp__exa__web_search_exa:
query: "[TOPIC] implementation best practices"
numResults: 5
Extract and summarize (max 300 tokens):
📚 Best Practices Found:
**Implementation Patterns:**
- [Pattern 1]: [brief description]
- [Pattern 2]: [brief description]
**Common Pitfalls:**
- [Pitfall 1]: [how to avoid]
- [Pitfall 2]: [how to avoid]
**Security Considerations:**
- [Item 1]
- [Item 2]
Sources: [list URLs for references section]
Incorporate into spec:
- Add relevant patterns to Technical Notes section
- Include pitfalls in Edge Cases
- Add security items to Non-Functional Requirements
- Link sources in References section
Step 4: Generate Spec
When you have enough context (including Sparky's input if consulted), generate the complete spec in markdown format.
Output Requirements:
- Use proper markdown headers, lists, and formatting
- Include all relevant sections for the spec type
- Be specific and actionable
- Include acceptance criteria where applicable
Spec Templates:
#### Epic Template
````markdown
# Epic: [Title]
## TL;DR
> **What:** [One sentence describing the initiative]
> **Why:** [Business value / user impact]
> **Scope:** [X] issues across [Y] phases
> **Teams:** [Teams involved]
## Problem Statement
[Clear description of the problem being solved]
## Goals
- [Goal 1]
- [Goal 2]
## Non-Goals
- [Explicitly out of scope]
## Architecture
[High-level architecture description]
```
┌─────────────┐       ┌──────────────┐
│ Component A │──────▶│ Component B  │
└─────────────┘       └──────────────┘
```
## Acceptance Criteria (BDD)
<!--
IMPORTANT: Do NOT write vague summary ACs like "Given the epic is complete, Then everything works."
Each AC must be a discrete, testable user story. Group by functional area.
Aim for 10-20 ACs for an Epic, covering: core flows, edge cases, error handling, analytics.
-->
### [Functional Area 1]
**AC1: [Specific scenario name]**
```gherkin
Given [specific precondition]
When [specific action]
Then [specific observable outcome]
And [additional verifiable outcome]
```
**AC2: [Another specific scenario]**
```gherkin
Given [specific precondition]
When [specific action]
Then [specific observable outcome]
```
### [Functional Area 2]
**AC3: [Specific scenario name]**
```gherkin
Given [specific precondition]
When [specific action]
Then [specific observable outcome]
```
### Edge Cases & Error Handling
**AC4: [Edge case name]**
```gherkin
Given [edge case condition]
When [action occurs]
Then [graceful handling]
```
**AC5: [Error scenario]**
```gherkin
Given [error condition]
When [action fails]
Then [system recovers gracefully]
And [appropriate logging/alerting occurs]
```
## Analytics Events
| Event | Trigger | Properties |
|---|---|---|
| `[feature]_[action]` | [When this fires] | user_id, [key_prop], [key_prop] |
| `[feature]_error` | [When error occurs] | user_id, error_type, error_message |
| `[feature]_success` | [When flow completes] | user_id, duration_ms, [outcome] |
## Sub-Issues / Phases
### Phase 1: [Name]
| Issue | Description | BDD Summary |
|---|---|---|
| [Issue 1] | [Brief] | Given X, When Y, Then Z |
| [Issue 2] | [Brief] | Given X, When Y, Then Z |
### Phase 2: [Name]
| Issue | Description | BDD Summary |
|---|---|---|
| [Issue 3] | [Brief] | Given X, When Y, Then Z |
| [Issue 4] | [Brief] | Given X, When Y, Then Z |
## Success Metrics
| Metric | Current | Target | How to Measure |
|---|---|---|---|
| [Metric 1] | [Baseline] | [Goal] | [Method] |
## Dependencies
- [Team/System]: [What's needed]
## Open Questions
- [Question 1] - Owner: [Name]
- [Question 2] - Owner: [Name]
````
#### Issue/Story Template
````markdown
# [Issue Title]
## TL;DR
> **What:** [One sentence describing the change]
> **Why:** [One sentence on the value/impact]
> **How:** [One sentence on approach]
> **Scope:** [Small/Medium/Large] | **Risk:** [Low/Medium/High]
## Problem
[What problem does this solve?]
## Solution Approach
[Proposed approach]
## Architecture
[If applicable - component interactions, data flow]
[ASCII diagram if helpful]
## User Flow
[If applicable - user-impacting feature]
[User journey diagram if helpful]
## Acceptance Criteria (BDD)
### Core Functionality
**AC1: [Name]**
```gherkin
Given [precondition/context]
When [action/trigger]
Then [expected outcome]
And [additional outcome if needed]
```
**AC2: [Name]**
```gherkin
Given [precondition/context]
When [action/trigger]
Then [expected outcome]
```
### Edge Cases & Error Handling
**AC3: [Edge case name]**
```gherkin
Given [edge case condition]
When [action/trigger]
Then [graceful handling]
And [user feedback if applicable]
```
**AC4: [Error scenario]**
```gherkin
Given [error condition]
When [action fails]
Then [error is handled gracefully]
And [appropriate error message shown]
```
## Implementation Hints
**Files likely to change:**
- `path/to/file1.ts` - [what changes]
- `path/to/file2.ts` - [what changes]
**Key functions/classes:**
- `functionName()` - [modification needed]
- `ClassName` - [modification needed]
**Patterns to follow:**
- [Existing pattern in codebase to match]
## Test Plan
**E2E Tests:**
- [Test description - user journey covered]
**Integration Tests:**
- [Test description - components/APIs tested]
**Regression Tests:**
- [Test description - existing behavior preserved]
## Analytics Events
| Event | Trigger | Properties |
|---|---|---|
| `[feature]_started` | [When user begins flow] | user_id, [context] |
| `[feature]_completed` | [When flow succeeds] | user_id, duration_ms |
| `[feature]_failed` | [When flow fails] | user_id, error_type |
## Technical Notes
[Implementation details, constraints]
## Non-Functional Requirements
- Performance: [If applicable]
- Security: [If applicable]
## Dependencies
- [Dependencies if any]
## Out of Scope
- [What this does NOT include]
## References
- [Link to related docs/tickets]
````
#### Bug Template
````markdown
# Bug: [Title]
## TL;DR
> **Bug:** [One sentence description]
> **Impact:** [Who/what is affected]
> **Severity:** [Critical/High/Medium/Low]
> **Workaround:** [Yes/No] - [brief if yes]
## Bug Report (BDD)
```gherkin
Given [the preconditions/setup]
When [the action that triggers the bug]
Then [actual incorrect behavior]
But [expected correct behavior]
```
## Steps to Reproduce
1. [Step 1]
2. [Step 2]
3. [Step 3]
## Environment
- OS: [Operating system]
- Browser/Version: [If applicable]
- App Version: [Version number]
## Impact
[Who is affected and how many users]
## Workaround
[Any known workarounds, or "None"]
## Fix Acceptance Criteria (BDD)
**AC1: Bug is fixed**
```gherkin
Given [same preconditions as bug report]
When [same action that triggered bug]
Then [correct expected behavior]
And [no regression in related functionality]
```
**AC2: Regression test added**
```gherkin
Given the fix is implemented
Then a test exists that would catch this bug
And the test is included in CI pipeline
```
## Implementation Hints
**Likely root cause:**
- [Hypothesis about what's wrong]
**Files to investigate:**
- `path/to/file.ts` - [why]
**Related code:**
- [Function/class that likely contains bug]
## Test Plan
**Regression Tests:**
- [Test that would have caught this bug]
- [Test for related edge cases]
**Integration Tests:**
- [Test for component interactions if applicable]
## Screenshots/Logs
[Attach if available]
````
#### Spike Template
````markdown
# Spike: [Title]
## TL;DR
> **Question:** [Primary question to answer]
> **Time-box:** [Duration]
> **Output:** [What artifact will be produced]
> **Decision:** [What this will help decide]
## Research Questions
1. [Primary question this spike will answer]
2. [Secondary question]
3. [Tertiary question if applicable]
## Background
[Context on why this investigation is needed]
## Time-box
[Duration: e.g., 2 days]
## Approach
1. [Investigation step 1]
2. [Investigation step 2]
3. [Investigation step 3]
## Options to Evaluate
| Option | Pros | Cons | Effort |
|--------|------|------|--------|
| [Option A] | [Pros] | [Cons] | [S/M/L] |
| [Option B] | [Pros] | [Cons] | [S/M/L] |
| [Option C] | [Pros] | [Cons] | [S/M/L] |
## Decision Criteria
How we'll evaluate options:
| Criterion | Weight | Notes |
|-----------|--------|-------|
| [Criterion 1] | [High/Med/Low] | [Why important] |
| [Criterion 2] | [High/Med/Low] | [Why important] |
## Spike Complete When (BDD)
```gherkin
Given the spike time-box is complete
Then we have answered: [primary question]
And we have a recommendation with rationale
And we have documented trade-offs
And next steps are defined
```
## Artifacts
- [Decision document / ADR]
- [POC code if applicable]
- [Comparison matrix]
## Decision This Informs
[What decision will be made based on findings]
## References
- [Existing docs, prior art, relevant links]
````
#### PRD Template
````markdown
# PRD: [Feature Name]
## Overview
[2-3 sentence summary]
## Problem Statement
[Detailed problem description]
## Goals
1. [Goal 1]
2. [Goal 2]
## Non-Goals
- [What this will NOT address]
## Target Users
[User personas or segments]
## User Stories
1. As a [user], I want [action] so that [benefit]
## Architecture
[High-level system architecture]
```
┌─────────────┐       ┌──────────────┐
│ Component A │──────▶│ Component B  │
└─────────────┘       └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │ Component C  │
                      └──────────────┘
```
## User Flow
```
         START
           │
           ▼
┌─────────────────────────────┐
│ [Step 1]                    │
└──────────┬──────────────────┘
           │
           ▼
┌─────────────────────────────┐
│ [Step 2]                    │
└──────────┬──────────────────┘
           │
           ▼
          END
```
## Requirements
### Functional Requirements
- [FR-1]: [Requirement]
- [FR-2]: [Requirement]
### Non-Functional Requirements
- **Performance**: [Criteria]
- **Reliability**: [Criteria]
- **Security**: [Criteria]
## Acceptance Criteria
**Core Functionality:**
- [ ] [Criterion 1]
- [ ] [Criterion 2]
**Edge Cases & Fallback Behavior:**
- [ ] [Edge case 1]
- [ ] [Edge case 2]
## Testing Requirements
**Unit Tests:**
- [ ] [Test 1]
**Integration Tests:**
- [ ] [Test 1]
## Implementation Phases
### Phase 1: [Name]
**Dependencies:** None
- [Deliverable 1]
- [Deliverable 2]
### Phase 2: [Name]
**Dependencies:** Phase 1
- [Deliverable 3]
- [Deliverable 4]
## Success Metrics
| Metric | Current | Target |
|--------|---------|--------|
| [Metric 1] | [Baseline] | [Goal] |
## Open Questions
- [Question 1]
## References
- [Link to related docs]
- [Link to related tickets]
````
Step 5: Review and Refine
After generating the spec:
- Ask if any sections need adjustment
- Clarify any ambiguous points
- Iterate until the user is satisfied
Step 6: Evaluate Scope & Breakdown
Before creating tickets, analyze the spec and propose the right structure:
Based on the spec, here's my assessment:
**Scope Analysis:**
- Acceptance Criteria count: [X]
- Phases/milestones: [X]
- Estimated PRs: [X]
- Cross-team dependencies: [Yes/No]
**Recommended Structure:**
[One of the options below]
Classification Matrix:
| Indicators | Recommendation |
|---|---|
| < 5 ACs, single PR, no phases | Single Issue |
| 5-10 ACs, 2-3 PRs, phases mentioned | Epic (single ticket with sub-tasks) |
| 10-15 ACs, multiple PRs, clear phases | Epic + Sub-Issues (parent + children) |
| > 15 ACs, multi-team, roadmap item | Project + Epics + Issues (full hierarchy) |
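As a rough illustration, the matrix can be reduced to a function of the AC count alone. This sketch makes two simplifying assumptions: it ignores the other indicators (PR count, phases, teams), and it resolves the overlap at exactly 10 ACs in favor of the larger structure. `recommend_structure` is a hypothetical name.

```python
def recommend_structure(ac_count: int) -> str:
    """Map an acceptance-criteria count to a recommended Linear structure."""
    if ac_count < 5:
        return "Single Issue"
    if ac_count < 10:
        return "Epic (single ticket with sub-tasks)"
    if ac_count <= 15:
        # Assumption: exactly 10 ACs falls in this row, not the previous one.
        return "Epic + Sub-Issues (parent + children)"
    return "Project + Epics + Issues (full hierarchy)"
```

The result is only a starting recommendation; the user still chooses the final structure from the options presented.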
Present options to user:
How should we structure this in Linear?
A) Single Issue - create one ticket with full spec
B) Epic with Sub-Issues - create parent epic + [N] child issues
C) Project - create project with [N] epics/issues underneath
D) Multiple separate Issues - create [N] independent tickets
E) Let me decide - just output the markdown
If user chooses B, C, or D (multiple tickets):
1. Propose the breakdown:
I'll create the following tickets:
📁 [Epic/Project]: [Title]
├── 📋 Issue 1: [Title] - [brief scope]
├── 📋 Issue 2: [Title] - [brief scope]
└── 📋 Issue 3: [Title] - [brief scope]
Does this breakdown look right?
A) Yes, create all tickets
B) Adjust the breakdown
C) Just create the parent, I'll add children later
2. Generate individual specs for each ticket (condensed from main spec)
3. Create tickets with proper parent-child relationships
Step 6b: Notify Sparky (Automatic)
After spec is finalized, automatically send to Sparky as FYI.
No user prompt needed. No response expected. Just inform Sparky of the decision.
bash .claude/skills/sparky/scripts/ask_sparky.sh "[FROM SPEC-MAKER] FYI - Final spec created:
Title: [SPEC_TITLE]
Type: [Issue/Epic/PRD/etc.]
Summary: [TL;DR from spec]
Key decisions:
- [Decision 1]
- [Decision 2]
- [Decision 3]
This is for your records. No response needed."
Do NOT wait for Sparky's response. Run in background or fire-and-forget.
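For readers scripting this step, a fire-and-forget call could look like the sketch below. It assumes the `ask_sparky.sh` script path shown above and a working directory at the repository root; `build_fyi` and `notify_sparky` are hypothetical helper names, not part of the skill.

```python
import subprocess
from typing import List

SPARKY_SCRIPT = ".claude/skills/sparky/scripts/ask_sparky.sh"
PREFIX = "[FROM SPEC-MAKER]"


def build_fyi(title: str, spec_type: str, summary: str, decisions: List[str]) -> str:
    """Compose the FYI message in the layout shown above."""
    lines = [
        f"{PREFIX} FYI - Final spec created:",
        f"Title: {title}",
        f"Type: {spec_type}",
        f"Summary: {summary}",
        "Key decisions:",
    ]
    lines += [f"- {decision}" for decision in decisions]
    lines.append("This is for your records. No response needed.")
    return "\n".join(lines)


def notify_sparky(message: str) -> subprocess.Popen:
    """Start the script and return immediately without waiting for output."""
    return subprocess.Popen(
        ["bash", SPARKY_SCRIPT, message],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```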
Step 6c: Validation Gate
Before creating or updating any ticket, run validation checks.
This is a HARD GATE. Do not proceed to Step 7 if blocking checks fail.
Run these checks on the generated spec:
┌─────────────────────────────────────────────────────────────────┐
│ SPEC VALIDATION │
├─────────────────────────────────────────────────────────────────┤
│ [✓/✗] Problem statement: [present/missing] ([N] chars) │
│ [✓/✗] Acceptance criteria: [N] ACs [in Gherkin/plain format] │
│ [✓/✗] Scope defined: [goals + non-goals / missing] │
│ [✓/✗] Architecture diagram: [present/missing] │
│ [✓/✗] User flow diagram: [present/missing] │
│ [✓/✗] Test plan: [present/missing] │
│ [✓/✗] Analytics events: [present/missing] │
│ [✓/✗] AC quality: [discrete stories / vague summary] │
│ [✓/⚠] Ambiguous language: [none found / found on line X] │
│ [✓/⚠] TL;DR present: [yes/no] │
│ [✓/⚠] Implementation hints: [present/missing] │
├─────────────────────────────────────────────────────────────────┤
│ Result: [PASS/FAIL] ([N] blocking, [M] warnings) │
└─────────────────────────────────────────────────────────────────┘
Validation Rules:
| Check | Rule | Severity | Applies To |
|---|---|---|---|
| `problem_statement` | Section exists, >50 chars | BLOCK | All |
| `acceptance_criteria` | At least 3 ACs | BLOCK | Issue, Bug, Epic |
| `gherkin_format` | ACs use Given/When/Then | WARN | Issue, Bug |
| `scope_defined` | Has goals AND non-goals | BLOCK | Epic, PRD |
| `scope_defined` | Has "Out of Scope" section | WARN | Issue |
| `architecture_diagram` | ASCII diagram showing component interactions | BLOCK | Epic, PRD, Large Issue |
| `architecture_diagram` | Diagram present for multi-component work | WARN | Medium Issue |
| `user_flow_diagram` | User flow diagram for user-facing features | BLOCK | PRD |
| `user_flow_diagram` | User flow diagram if feature has UI/UX | WARN | Issue, Epic |
| `test_plan` | Lists E2E, Integration, or Regression tests to write | BLOCK | Issue, Bug |
| `test_plan` | Specifies what tests will be created | WARN | Epic |
| `analytics_events` | Analytics section with event name, trigger, properties | BLOCK | Issue, Epic |
| `analytics_events` | Analytics section exists | WARN | Bug |
| `ac_quality` | ACs are discrete, testable user stories (not vague summaries) | BLOCK | All |
| `no_ambiguous_language` | No "should/might/could/maybe" in ACs | WARN | All |
| `tldr_present` | TL;DR or summary exists | WARN | All |
| `implementation_hints` | Files/functions listed | WARN | Issue, Bug |
| `success_metrics` | Measurable metrics defined | BLOCK | Epic, PRD |
| `success_metrics` | Metrics section exists | WARN | Issue |
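The gate logic reduces to counting failed checks by severity. A minimal sketch, assuming each check yields a name, a pass/fail flag, and a severity; `CheckResult`, `gate`, and `check_problem_statement` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    severity: str  # "BLOCK" or "WARN", as in the rules table


def gate(results):
    """Return (status, blocking, warnings) for a list of CheckResult."""
    blocking = [r.name for r in results if not r.passed and r.severity == "BLOCK"]
    warnings = [r.name for r in results if not r.passed and r.severity == "WARN"]
    status = "PASS" if not blocking else "FAIL"
    return status, blocking, warnings


def check_problem_statement(text):
    """Rule from the table: section exists and is longer than 50 characters."""
    return CheckResult("problem_statement", len(text.strip()) > 50, "BLOCK")
```

Warnings never change the status; only a failed BLOCK check turns the result into FAIL.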
On PASS (0 blocking issues):
✓ Spec validation passed ([N] warnings)
[Show warnings if any]
Proceed to create ticket? (Y/n)
On FAIL (1+ blocking issues):
✗ Spec validation failed ([N] blocking issues)
Please fix before creating ticket:
- [List blocking issues]
[Also show warnings]
Which issues should we address?
A) Fix all - I'll ask questions for each
B) Let me edit manually - show the spec
C) Skip validation - create anyway (not recommended)
If user chooses C (skip):
- Add label `needs-spec-review` to the ticket
- Add comment noting validation was skipped
- Proceed to Step 7
Step 7: Create or Update Linear Ticket(s)
If Linear was marked unavailable in Step 0:
The spec is ready! Here's the final markdown:
[OUTPUT FULL SPEC]
[If multiple tickets were planned in Step 6, output each spec separately]
Note: Linear is not configured in this session. You can copy these specs
and create the tickets manually, or set up the Linear MCP for future sessions.
If Linear is available:
UPDATE MODE (existing ticket from Step 0b)
When updating an existing ticket:
- Use `mcp__linear-server__update_issue` with the ticket ID
- Replace the description with the updated spec
- If ticket was in "Needs Human" status, move back to previous state (usually "Backlog")
- Add a comment summarizing what changed
Update call:
mcp__linear-server__update_issue:
  id: [TICKET_ID]
  description: [FULL_UPDATED_SPEC]
  state: [previous state or "Backlog" if was "Needs Human"]
Add change summary comment:
mcp__linear-server__create_comment:
  issueId: [TICKET_ID]
  body: |
    ## Spec Updated
    Changes made:
    - [Added/Updated problem statement]
    - [Added X acceptance criteria]
    - [Added out-of-scope section]
    - [etc.]
    ---
    _Updated via /spec command_
Post-update summary:
✅ Updated ticket: [TICKET-ID]
Changes:
- [List what was added/changed]
URL: [LINEAR_URL]
[If was in "Needs Human", note it was moved back to Backlog]
CREATE MODE (new ticket)
Create based on Step 6 decision:
Single Issue:
- Use `mcp__linear-server__create_issue`
- Include full markdown spec in description
Epic with Sub-Issues:
- Create parent Epic issue first
- Create child issues with `parentId` set to Epic
- Return all ticket URLs
Project:
- Use `mcp__linear-server__create_project` for container
- Create issues/epics underneath with project association
- Return project URL + all ticket URLs
For all tickets:
- Ask for team assignment if not specified
- Set appropriate labels based on spec type (Bug, Feature, Spike, etc.)
- Link related tickets if creating multiple
- Copy final URLs to clipboard
Post-creation summary:
✅ Created [N] ticket(s) in Linear:
📁 [Project/Epic]: [URL]
├── 📋 [Issue 1]: [URL]
├── 📋 [Issue 2]: [URL]
└── 📋 [Issue 3]: [URL]
[URLs copied to clipboard]
ASCII Diagrams
Include ASCII diagrams when they help clarify the spec. Types to consider:
| Diagram Type | When to Use |
|---|---|
| Architecture | Component interactions, data flow, system boundaries |
| Sequence | API interactions, user flows with multiple steps |
| State Machine | Feature lifecycle, status transitions |
| Entity Relationship | Data models, database schemas |
Diagram Requirements:
- Must render correctly in terminal AND markdown (Linear, GitHub)
- Use UTF-8 box-drawing characters: `┌┐└┘├┤┬┴┼─│▶▼`
- Wrap in triple backticks for markdown code blocks
- Keep width reasonable (< 80 chars ideal)
Example Architecture Diagram:
┌─────────────┐       ┌──────────────┐
│   Client    │──────▶│  API Layer   │
└─────────────┘       └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │   Database   │
                      └──────────────┘
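The width guideline from the diagram requirements is easy to check mechanically. A minimal sketch; `overlong_lines` is a hypothetical helper, and it counts characters rather than rendered display width, which is a reasonable approximation for box-drawing characters.

```python
def overlong_lines(diagram, max_width=79):
    """Return 1-based numbers of lines longer than max_width characters."""
    return [
        n
        for n, line in enumerate(diagram.splitlines(), start=1)
        if len(line) > max_width
    ]
```

An empty list means every line fits within the "< 80 chars" target.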
Classification Logic
When creating Linear tickets, classify based on scope:
| Type | Indicators | Linear Entity |
|---|---|---|
| Issue | Single PR, isolated fix, < 5 ACs, no phases | Issue |
| Epic | Multi-PR, 5-15 ACs, explicit phases, coordinated work | Issue (Epic label) or Project |
| Project | Multi-team, > 15 ACs, initiative-level, roadmap item | Project |
Classification Signals:
- Scope keywords: "multiple PRs", "phases", "dependencies"
- Complexity: story points, time mentions
- Architecture diagram complexity
- Number of acceptance criteria
Acceptance Criteria Quality
The #1 spec failure mode is vague, untestable ACs.
❌ BAD: Vague Summary ACs
Given the feature is fully implemented
Then users can do the thing
And everything works correctly
And the system is reliable
This is useless. It's not testable, not specific, and doesn't capture discrete behaviors.
✅ GOOD: Discrete, Testable User Stories
Each AC should be:
- Specific: One scenario, one outcome
- Testable: Can write an automated test for it
- Independent: Doesn't depend on other ACs to make sense
- Observable: Outcome is verifiable (not "system is fast" but "response < 200ms")
Group ACs by functional area:
| Functional Area | Example ACs |
|---|---|
| Core Flow | Happy path user journeys |
| Edge Cases | Boundary conditions, empty states |
| Error Handling | What happens when things fail |
| Timing/Scheduling | Time-based behaviors |
| Analytics | Event tracking verification |
| Integration | Cross-component behaviors |
AC Count Guidelines:
| Spec Type | Target AC Count |
|---|---|
| Issue | 5-10 ACs |
| Epic | 15-25 ACs |
| PRD | 20-40 ACs |
Example: Decomposing a Feature
Feature: "Send push notifications"
❌ Vague:
Given the notification system is implemented
Then users receive notifications
✅ Decomposed:
# Core Flow
AC1: Given a notification is scheduled for 6pm
When the scheduler runs at 6pm
Then the notification is sent via Expo Push
And sent_at timestamp is recorded
# Edge Case
AC2: Given a notification is scheduled for 6pm
And the user's push token is invalid
When the scheduler runs
Then the notification is marked as failed
And no retry is attempted
# Error Handling
AC3: Given Expo Push API returns a 500 error
When sending a notification
Then the system retries with exponential backoff
And after 3 failures, marks as failed
# Analytics
AC4: Given a notification is sent successfully
Then a notification_sent event is logged
With properties: user_id, notification_id, scheduled_for, actual_send_time
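A decomposed AC should translate almost directly into an automated test. The sketch below shows one way AC3 could be exercised; `PushError`, `send_with_retry`, and the 1-second base delay are hypothetical and stand in for whatever the real sender uses.

```python
import time


class PushError(Exception):
    """Hypothetical error raised when the push provider returns a 5xx."""


def send_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(); on PushError, retry with exponential backoff.

    Returns "sent" on success, or "failed" after max_attempts failures.
    """
    for attempt in range(max_attempts):
        try:
            send()
            return "sent"
        except PushError:
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, ...
    return "failed"


def test_ac3_marks_failed_after_three_failures():
    delays, calls = [], []

    def always_500():
        calls.append(1)
        raise PushError("500")

    # Inject a fake sleep so the test records delays instead of waiting.
    status = send_with_retry(always_500, sleep=delays.append)
    assert status == "failed"
    assert len(calls) == 3
    assert delays == [1.0, 2.0]
```

If an AC cannot be turned into a test of roughly this shape, that is a sign it is still too vague.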
Analytics Requirements
Every Issue and Epic MUST include an Analytics section.
Why Analytics Matter
Without analytics, you cannot:
- Know if the feature is being used
- Measure success against goals
- Debug issues in production
- Make data-driven decisions about iteration
Event Naming Convention
Use snake_case with format: [domain]_[object]_[action]
| Pattern | Example | Use Case |
|---|---|---|
| `[feature]_[action]` | `ppo_scheduled` | Feature-level events |
| `[feature]_[object]_[action]` | `ppo_notification_sent` | When feature has sub-objects |
| `[screen]_viewed` | `settings_viewed` | Screen/page views |
| `[button]_tapped` | `submit_button_tapped` | UI interactions |
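The naming convention can be enforced with a simple pattern check. A minimal sketch, assuming an event name must be lowercase snake_case with at least two segments; `is_valid_event_name` is a hypothetical helper, and the exact pattern is an interpretation of the table above.

```python
import re

# Lowercase snake_case with at least two segments, e.g. "ppo_scheduled".
EVENT_NAME = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)+")


def is_valid_event_name(name):
    """Return True if name follows the snake_case event convention."""
    return EVENT_NAME.fullmatch(name) is not None
```

This only checks the shape of the name; whether the final segment is a sensible action verb still needs human review.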
Event Categories
Lifecycle Events - Track the full funnel:
| Event | When | Required Properties |
|---|---|---|
| `*_scheduled` | System queues an action | scheduled_for, trigger_type |
| `*_started` | User/system begins flow | source, entry_point |
| `*_completed` | Flow succeeds | duration_ms, outcome |
| `*_failed` | Flow fails | error_type, error_code, retry_count |
| `*_abandoned` | User exits mid-flow | last_step, time_in_flow_ms |
| `*_retried` | System/user retries | attempt_number, previous_error |
Engagement Events - Track user behavior:
| Event | When | Required Properties |
|---|---|---|
| `*_viewed` | User sees content | content_id, position, source |
| `*_tapped` | User interacts | element_id, context |
| `*_opened` | User opens notification/modal | time_since_sent, source |
| `*_dismissed` | User dismisses | time_visible_ms, action_taken |
Business Events - Track outcomes:
| Event | When | Required Properties |
|---|---|---|
| `*_converted` | User completes key action | conversion_type, value |
| `*_activated` | Feature first used | days_since_signup, activation_path |
| `*_retained` | User returns | days_since_last_active, return_trigger |
Property Standards
Always Include:
- `user_id` - For user-level analysis
- `session_id` - For session-level analysis
- `timestamp` - Usually automatic
- `platform` - `ios`/`android`/`web`
- `app_version` - For version-based debugging
Context Properties:
- `source` - Where did user come from? (`push`, `deeplink`, `organic`)
- `entry_point` - Which button/link triggered this?
- `experiment_variant` - If A/B testing
Outcome Properties:
- `duration_ms` - How long did it take?
- `success` - `true`/`false`
- `error_type` - If failed, what category?
- `error_message` - Human-readable error
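The "Always Include" list lends itself to an automated check on event payloads. A minimal sketch; `missing_properties` is a hypothetical helper, and passing event-specific requirements through `extra_required` is an assumption about how the per-event "Required Properties" columns might be combined with the global list.

```python
ALWAYS_REQUIRED = {"user_id", "session_id", "timestamp", "platform", "app_version"}


def missing_properties(event_props, extra_required=()):
    """Return a sorted list of required property names absent from the event."""
    required = ALWAYS_REQUIRED | set(extra_required)
    return sorted(required - set(event_props))
```

For example, a `*_failed` event could be checked with `extra_required=("error_type", "error_code", "retry_count")`.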
Analytics Table Format
## Analytics Events
### Funnel: [Feature Name]
| Event | Trigger | Properties | Notes |
|-------|---------|------------|-------|
| `feature_scheduled` | System schedules action | `user_id`, `scheduled_for`, `trigger_type`, `trigger_context` | Baseline for funnel |
| `feature_sent` | Action delivered | `user_id`, `feature_id`, `delay_from_scheduled_ms` | Delivery rate |
| `feature_opened` | User engages within 1hr | `user_id`, `feature_id`, `time_to_open_ms`, `source` | Open rate |
| `feature_converted` | User completes goal | `user_id`, `feature_id`, `conversion_type`, `time_to_convert_ms` | Conversion rate |
| `feature_failed` | Delivery/action failed | `user_id`, `feature_id`, `error_type`, `error_code`, `retry_count` | Error rate |
### Derived Metrics
| Metric | Calculation | Target |
|--------|-------------|--------|
| Delivery Rate | `feature_sent` / `feature_scheduled` | > 99% |
| Open Rate | `feature_opened` / `feature_sent` | > 30% |
| Conversion Rate | `feature_converted` / `feature_opened` | > 20% |
| Error Rate | `feature_failed` / `feature_scheduled` | < 1% |
Example: Complete Analytics Spec
Feature: PPO (Proactive Personalized Outreach)
## Analytics Events
### Funnel: PPO Lifecycle
| Event | Trigger | Properties | Notes |
|-------|---------|------------|-------|
| `ppo_analyzed` | Conversation ends, analyzer runs | `user_id`, `conversation_id`, `transcript_length` | Entry to funnel |
| `ppo_scheduled` | LLM decides to schedule PPO | `user_id`, `ppo_id`, `trigger_type`, `scheduled_for`, `message_preview` | Extraction rate |
| `ppo_skipped` | LLM decides NOT to schedule | `user_id`, `conversation_id`, `skip_reason` | Understand filtering |
| `ppo_sent` | Push notification delivered | `user_id`, `ppo_id`, `delay_from_scheduled_ms`, `expo_receipt_id` | Delivery rate |
| `ppo_send_failed` | Push delivery failed | `user_id`, `ppo_id`, `error_type`, `error_code`, `retry_count` | Debug delivery |
| `ppo_opened` | App opened within 1hr of send | `user_id`, `ppo_id`, `time_to_open_ms`, `opened_from` | Open rate |
| `ppo_conversation_started` | Conversation started within 1hr | `user_id`, `ppo_id`, `time_to_conversation_ms` | Engagement rate |
| `ppo_used` | PPO injected into Ember context | `user_id`, `ppo_id`, `ppo_count_injected` | Context usage |
| `ppo_expired` | PPO past 24hr grace period | `user_id`, `ppo_id`, `hours_overdue` | Stale PPO rate |
### Segment Properties
All events include:
- `user_id`, `session_id`, `platform`, `app_version`
- `ppo_trigger_type`: `commitment` | `event` | `emotional`
- `user_ppo_count_lifetime`: Total PPOs this user has received
### Derived Metrics
| Metric | Calculation | Target | Alert Threshold |
|--------|-------------|--------|-----------------|
| Extraction Rate | `ppo_scheduled` / `ppo_analyzed` | 15-25% | < 5% or > 50% |
| Delivery Rate | `ppo_sent` / `ppo_scheduled` | > 99% | < 95% |
| Open Rate | `ppo_opened` / `ppo_sent` | > 30% | < 15% |
| Conversation Rate | `ppo_conversation_started` / `ppo_opened` | > 50% | < 25% |
| Context Injection Rate | `ppo_used` / `ppo_opened` | > 95% | < 80% |
| Error Rate | `ppo_send_failed` / `ppo_scheduled` | < 1% | > 5% |
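Each derived metric is a ratio of two event counts, compared against its alert threshold. The sketch below covers four of the PPO metrics; `rate`, `alerts`, and the `METRICS` list are hypothetical names, and the counts in the usage note are illustrative numbers, not real data.

```python
def rate(counts, numerator, denominator):
    """Return numerator/denominator as a fraction, or None if undefined."""
    denom = counts.get(denominator, 0)
    if denom == 0:
        return None
    return counts.get(numerator, 0) / denom


# (metric, numerator event, denominator event, alert predicate from the table)
METRICS = [
    ("Extraction Rate", "ppo_scheduled", "ppo_analyzed", lambda r: r < 0.05 or r > 0.50),
    ("Delivery Rate", "ppo_sent", "ppo_scheduled", lambda r: r < 0.95),
    ("Open Rate", "ppo_opened", "ppo_sent", lambda r: r < 0.15),
    ("Error Rate", "ppo_send_failed", "ppo_scheduled", lambda r: r > 0.05),
]


def alerts(counts):
    """Return the names of metrics whose alert threshold is crossed."""
    fired = []
    for name, num, den, should_alert in METRICS:
        r = rate(counts, num, den)
        if r is not None and should_alert(r):
            fired.append(name)
    return fired
```

With illustrative counts of 1000 analyzed, 100 scheduled, 90 sent, 10 opened, and 10 failed, the delivery, open, and error thresholds would all be crossed while the extraction rate (10%) would not.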
Validation Checklist
Before finalizing any spec:
- Problem is clearly stated
- Scope is well-defined (in-scope and out-of-scope)
- Acceptance criteria are discrete, testable user stories (not vague summaries)
- ACs grouped by functional area (core flow, edge cases, errors, analytics)
- Architecture diagram included (for multi-component work)
- User flow diagram included (for user-facing features)
- Test plan specifies E2E/Integration/Regression tests to write
- Analytics events defined with triggers and properties
- Dependencies are identified
- Success metrics are measurable
- No ambiguous language ("should", "might", "could")
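The ambiguous-language item is the easiest one on this checklist to automate. A minimal sketch, assuming the word list from the validation rules ("should/might/could/maybe"); `find_ambiguous` is a hypothetical helper.

```python
import re

AMBIGUOUS = re.compile(r"\b(should|might|could|maybe)\b", re.IGNORECASE)


def find_ambiguous(ac_text):
    """Return (line_number, word) pairs for each ambiguous word found."""
    hits = []
    for n, line in enumerate(ac_text.splitlines(), start=1):
        for m in AMBIGUOUS.finditer(line):
            hits.append((n, m.group(1).lower()))
    return hits
```

The line numbers map directly onto the "found on line X" field in the validation report.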
Quick Reference
| Spec Type | Linear Entity | Key Sections |
|---|---|---|
| Epic | Project | Goals, Stories, Metrics |
| Issue | Issue | Problem, AC, Technical Notes |
| Bug | Issue + Bug label | Repro Steps, Expected/Actual |
| Spike | Issue + Spike label | Question, Time-box, Artifacts |
| PRD | Project + Doc | Requirements, User Journey, Metrics |
Sparky Integration
This skill optionally integrates with Sparky, a persistent AI agent with long-term memory.
When Sparky is Consulted:
- After gathering initial context from the user
- Before generating the final spec
What Sparky Provides:
- Product/design/user research perspective
- Historical context and prior decisions
- The 3-5 most important questions the human needs to answer
- Red flags or considerations to address
Flow with Sparky:
- User provides initial context
- Claude asks if user wants Sparky's input
- If yes, Claude sends context to Sparky
- Sparky returns insights + key questions
- Claude presents questions to user
- User answers the questions
- Claude generates spec with all context
Dependency:
- Requires the `sparky` skill at `.claude/skills/sparky/`
- Uses `.claude/skills/sparky/scripts/ask_sparky.sh`
Including Sparky's Input in Specs:
If Sparky was consulted, add a "Research Context" section:
## Research Context
_Insights from product/design review:_
- [Key insight 1]
- [Key insight 2]
_Key questions addressed:_
- Q: [Question from Sparky]
A: [User's answer]