spec - SKILL.md Agent Skill

name: spec
description: Write specifications for Pangea/pelago-aiml. Creates Epics, Issues, Bugs, Spikes, or PRDs with Linear integration. Uses /research_codebase for deep codebase analysis. Validates specs for ACs, diagrams, analytics, and test plans.

Spec Writer (Pangea)

Write well-structured specifications through guided conversation. Supports Epics, Issues/Stories, Bugs, Spikes, and PRDs.

When this skill is invoked, follow the flows below. They are carefully crafted.

Project Integration:

Uses /research_codebase command for codebase analysis
Stores research in docs/thoughts/shared/research/
Integrates with Pangea team workflows (Linear, Sparky, Amplitude)

Modes

| Command | Action |
|---------|--------|
| `/spec` | Show options: quick from context, fresh, or analyze existing |
| `/spec PAN-174` | Analyze & update existing ticket (skip to Step 0b) |
| `/spec PAN-174 --validate` | Validate only, show report |
| `/spec PAN-174 --rewrite` | Discard current, start fresh |
| `/spec --quick` | Skip options, go straight to quick mode from context |
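
As a rough illustration of how the commands in the table map to modes, here is a minimal Python sketch. The names `parse_spec_command` and `SpecInvocation`, the mode labels, and the ticket ID pattern (generalized from the `PAN-174` example) are assumptions for illustration and are not part of the skill itself.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed ticket ID shape, generalized from the "PAN-174" example.
TICKET_RE = re.compile(r"^[A-Z]+-\d+$")


@dataclass
class SpecInvocation:
    mode: str  # "options", "quick", "analyze", "validate", or "rewrite"
    ticket_id: Optional[str] = None


def parse_spec_command(command: str) -> SpecInvocation:
    """Map a /spec invocation to one of the modes in the table above."""
    parts = command.split()
    if not parts or parts[0] != "/spec":
        raise ValueError(f"not a /spec command: {command!r}")
    args = parts[1:]
    ticket = next((a for a in args if TICKET_RE.match(a)), None)
    flags = {a for a in args if a.startswith("--")}

    if ticket is None:
        # No ticket: either quick mode or the options menu.
        return SpecInvocation("quick" if "--quick" in flags else "options")
    if "--validate" in flags:
        return SpecInvocation("validate", ticket)
    if "--rewrite" in flags:
        return SpecInvocation("rewrite", ticket)
    return SpecInvocation("analyze", ticket)
```

For example, `parse_spec_command("/spec PAN-174 --validate")` yields the validate-only mode for ticket `PAN-174`, while a bare `/spec` yields the options menu.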

Workflow

Step 0: Check MCP Availability

Quick check - if MCPs are connected, proceed. No API calls needed.

Check for MCP tools in current session:

mcp__letta__* tools available → Sparky ✓
mcp__linear-server__* tools available → Linear ✓
mcp__exa__* tools available → Exa ✓
Sparky skill exists at .claude/skills/sparky/ → Sparky script ✓

If any are missing, brief notice:

Available integrations: [Sparky ✓] [Linear ✓] [Exa ✓]

Missing: [list any missing]
→ I can still write specs, just can't [consult Sparky / create Linear tickets]

Proceed? (Y/n)

Degraded Mode:

Sparky unavailable → Skip Step 3, don't offer consultation
Linear unavailable → Skip Step 7, output markdown only
Exa unavailable → Skip Step 3b, no best practices search
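
The degraded-mode rules above amount to a small lookup from missing integration to skipped step. A minimal sketch, assuming the integrations are tracked as a set of lowercase names (the function names and that representation are hypothetical):

```python
# Steps skipped when an integration is missing, per the Degraded Mode list.
SKIPPED_STEP = {
    "sparky": "Step 3",
    "linear": "Step 7",
    "exa": "Step 3b",
}


def steps_to_skip(available: set[str]) -> list[str]:
    """Return the workflow steps to skip given the connected integrations."""
    return [step for name, step in SKIPPED_STEP.items() if name not in available]


def integration_banner(available: set[str]) -> str:
    """Render the 'Available integrations' line with ✓/✗ markers."""
    marks = [
        f"[{name.capitalize()} {'✓' if name in available else '✗'}]"
        for name in SKIPPED_STEP
    ]
    return "Available integrations: " + " ".join(marks)
```

With only Linear connected, `steps_to_skip({"linear"})` returns the Sparky and Exa steps, and the spec is still written and ticketed.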

Step 0a: Present Mode Options

Always present these three options after showing integrations:

Available integrations: [Sparky ✓/✗] [Linear ✓/✗] [Exa ✓/✗]

A) Create from conversation or plan - extract context, generate spec
B) Start fresh - new interactive flow
C) Analyze existing ticket - enter ticket ID (e.g., PAN-174)

Option A (Quick Mode):

Extract context from recent conversation or existing plan
Identify: problem, solution approach, scope, technical details
Infer spec type from context
Skip to Step 4 (Generate Spec) with extracted context
Still run validation (Step 6c)
Show abbreviated review before creating ticket

Option B (Fresh Start):

Continue to Step 1 (normal interactive flow)

Option C (Existing Ticket):

Prompt for ticket ID if not provided
Go to Step 0b (fetch and analyze existing ticket)

Context Extraction for Quick Mode:

When user chooses A, extract from conversation:

Problem/goal being solved
Proposed solution or approach
Technical details (files, functions, architecture)
Scope boundaries mentioned
Any acceptance criteria discussed

Generate spec directly, show for review:

Based on our conversation, here's the spec:

[GENERATED SPEC]

┌─────────────────────────────────────────────────────────────────┐
│ ✓ Validation: PASS (N warnings)                                │
├─────────────────────────────────────────────────────────────────┤
│ A) Create ticket in Linear                                      │
│ B) Edit something first                                         │
│ C) Go full interactive mode                                     │
└─────────────────────────────────────────────────────────────────┘

Step 0b: Detect Existing Ticket Mode

Check if user provided a ticket ID (e.g., /spec PAN-174):

If a ticket ID is provided:

Fetch ticket via mcp__linear-server__get_issue with includeRelations: true
Extract: title, description, type, status, labels, parent, children
Detect spec type from labels/content (Issue, Bug, Epic, Spike, PRD)
Go to Step 0c (Existing Ticket Analysis)

If no ticket ID:

Continue to Step 1 (normal create flow)

Step 0c: Existing Ticket Analysis

ALL existing-ticket analysis must go through this flow. Do not just vibe-check it. Use this exact validation check. This is our core ticket spec schema that all tickets must follow.

Run validation checks on the current ticket content:

┌─────────────────────────────────────────────────────────────────┐
│ TICKET ANALYSIS: [TICKET-ID]                                    │
├─────────────────────────────────────────────────────────────────┤
│ Title: [title]                                                  │
│ Type: [detected type]                                           │
│ Status: [status]                                                │
│ URL: [linear URL]                                               │
├─────────────────────────────────────────────────────────────────┤
│ VALIDATION:                                                     │
│ [✓/✗/⚠] Problem statement: [status]                            │
│ [✓/✗/⚠] Acceptance criteria: [count] found, [status]           │
│ [✓/✗/⚠] Scope defined: [status]                                │
│ [✓/✗/⚠] Architecture diagram: [present/missing]                │
│ [✓/✗/⚠] User flow diagram: [present/missing]                   │
│ [✓/✗/⚠] Test plan: [present/missing]                           │
│ [✓/✗/⚠] Analytics events: [present/missing]                    │
│ [✓/✗/⚠] Implementation hints: [status]                         │
│ [✓/✗/⚠] Ambiguous language: [status]                           │
│ [✓/✗/⚠] AC quality: [discrete stories / vague summary]         │
├─────────────────────────────────────────────────────────────────┤
│ Result: [PASS/FAIL] ([N] blocking, [M] warnings)                │
└─────────────────────────────────────────────────────────────────┘

Validation Checks:

| Check | Rule | Severity |
|-------|------|----------|
| `problem_statement` | "Problem" or "Bug Report" section exists, >50 chars | BLOCK |
| `acceptance_criteria` | At least 3 ACs, preferably in Gherkin format | BLOCK |
| `scope_defined` | Has "Goals"/"In Scope" AND "Non-Goals"/"Out of Scope" | BLOCK |
| `architecture_diagram` | Has ASCII diagram for multi-component work | BLOCK (Large/Epic/PRD) |
| `user_flow_diagram` | Has a user flow diagram for every feature that impacts the member experience in any way | BLOCK |
| `test_plan` | Lists E2E/Integration/Regression tests to write | BLOCK (Issue/Bug) |
| `analytics_events` | Has Analytics/Amplitude section with event definitions | BLOCK (Issue/Epic) |
| `ac_quality` | ACs are discrete testable stories, not vague summaries | BLOCK |
| `no_ambiguous_language` | No "should", "might", "could", "maybe" in ACs | WARN |
| `tldr_present` | TL;DR or summary section exists | WARN |
| `implementation_hints` | Has files/functions to change (Issues/Bugs only) | WARN |
| `gherkin_format` | ACs use Given/When/Then format | WARN |
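
To make the BLOCK/WARN distinction concrete, the sketch below implements a subset of the checks in the table. It is illustrative only: the function signature, the pre-parsed inputs (a problem string, a list of AC strings, two scope booleans), and the result shape are assumptions, and the diagram, test plan, and analytics checks are left out.

```python
import re

# Words the no_ambiguous_language check flags inside ACs.
AMBIGUOUS = re.compile(r"\b(should|might|could|maybe)\b", re.IGNORECASE)
# Given ... When ... Then, in that order, possibly across lines.
GHERKIN = re.compile(r"\bGiven\b.*\bWhen\b.*\bThen\b", re.DOTALL)


def validate_spec(problem: str, acs: list[str],
                  has_goals: bool, has_non_goals: bool) -> dict:
    """Run a subset of the checks; returns blocking failures and warnings."""
    blocking, warnings = [], []
    if len(problem.strip()) <= 50:          # rule: >50 chars
        blocking.append("problem_statement")
    if len(acs) < 3:                        # rule: at least 3 ACs
        blocking.append("acceptance_criteria")
    if not (has_goals and has_non_goals):   # rule: both sides of scope
        blocking.append("scope_defined")
    if any(AMBIGUOUS.search(ac) for ac in acs):
        warnings.append("no_ambiguous_language")
    if not all(GHERKIN.search(ac) for ac in acs):
        warnings.append("gherkin_format")
    return {
        "result": "FAIL" if blocking else "PASS",
        "blocking": blocking,
        "warnings": warnings,
    }
```

Any entry in `blocking` fails the report; warnings are surfaced but do not change the result.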

If --validate flag: Show report and stop. Do not offer to update.

If --rewrite flag: Skip to Step 1, but pre-fill context from existing ticket.

Otherwise, present options:

What would you like to do?

A) Fix issues interactively - I'll ask questions to fill gaps
B) Rewrite entire spec - start fresh, keeping context
C) View current description - show full content
D) Update specific section - choose what to change

Option A (Fix interactively):

Go to Step 0d (Gap-Filling Mode)

Option B (Rewrite):

Store existing context (title, any valid sections)
Go to Step 1 with pre-filled context
Mark as UPDATE mode for Step 7

Option C (View):

Display full ticket description
Return to options

Option D (Update section):

Show list of sections
Allow targeted edits
Go to Step 7 (update mode)

Step 0d: Gap-Filling Mode

Only ask about what's missing. Do not re-ask about sections that pass validation.

The ticket is missing some key information. Let me ask about the gaps:

[Only show questions for FAILED or WARN checks]

Gap Questions by Check:

| Failed Check | Questions to Ask |
|--------------|------------------|
| `problem_statement` | "What problem does this solve? Who experiences it?" |
| `acceptance_criteria` | "What are the key acceptance criteria? (list 3-5)" |
| `scope_defined` | "What's explicitly OUT of scope for this work?" |
| `architecture_diagram` | "Can you describe the component interactions? I'll create a diagram." |
| `user_flow_diagram` | "What's the user journey? I'll create a flow diagram." |
| `test_plan` | "What E2E, Integration, or Regression tests should be written?" |
| `analytics_events` | "What events should be tracked? (e.g., feature_used, error_occurred)" |
| `ac_quality` | "Let's break down the ACs into discrete, testable user stories." |
| `implementation_hints` | "Which files/functions will likely need changes?" |
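
Gap-filling is essentially a filter over the validation report: only checks that did not pass produce a question. A minimal sketch, assuming the report is a mapping from check name to `"PASS"`, `"WARN"`, or `"FAIL"` (that shape and the function name are hypothetical):

```python
# Question text copied from the table above.
GAP_QUESTIONS = {
    "problem_statement": "What problem does this solve? Who experiences it?",
    "acceptance_criteria": "What are the key acceptance criteria? (list 3-5)",
    "scope_defined": "What's explicitly OUT of scope for this work?",
    "architecture_diagram": "Can you describe the component interactions? I'll create a diagram.",
    "user_flow_diagram": "What's the user journey? I'll create a flow diagram.",
    "test_plan": "What E2E, Integration, or Regression tests should be written?",
    "analytics_events": "What events should be tracked? (e.g., feature_used, error_occurred)",
    "ac_quality": "Let's break down the ACs into discrete, testable user stories.",
    "implementation_hints": "Which files/functions will likely need changes?",
}


def gap_questions(report: dict[str, str]) -> list[str]:
    """Return questions only for checks that did not pass."""
    return [
        GAP_QUESTIONS[check]
        for check, status in report.items()
        if status != "PASS" and check in GAP_QUESTIONS
    ]
```

Checks without a question in the table (for example `tldr_present`) are skipped here and would be fixed during generation instead of by asking the user.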

After gathering gaps:

Merge new content with existing valid sections
Generate updated spec
Go to Step 5 (Review) then Step 7 (Update mode)

Step 1: Identify Spec Type & Concept

┌─────────────────────────────────────────────────────────────────┐
│                    STEP 1: SPEC TYPE                            │
└─────────────────────────────────────────────────────────────────┘

What are we building?

┌────────────────┬──────────────────────────────────────────────────────────┐
│      Type      │                       Description                        │
├────────────────┼──────────────────────────────────────────────────────────┤
│ A) Epic        │ Large feature or initiative spanning multiple PRs/issues │
│                │ Example: "Implement voice conversation analytics system" │
│                │ Example: "Add multi-language support across the app"     │
├────────────────┼──────────────────────────────────────────────────────────┤
│ B) Issue/Story │ Single deliverable, one focused piece of work            │
│                │ Example: "Add push notification scheduling for outreach" │
│                │ Example: "Create settings screen with dark mode toggle"  │
├────────────────┼──────────────────────────────────────────────────────────┤
│ C) Bug         │ Defect report with reproduction steps                    │
│                │ Example: "App crashes when tapping back button on iOS"   │
│                │ Example: "Audio cuts out after 30 seconds of silence"    │
├────────────────┼──────────────────────────────────────────────────────────┤
│ D) Spike       │ Research/investigation with time-box                     │
│                │ Example: "Evaluate LiveKit vs Daily for WebRTC"          │
│                │ Example: "Investigate memory leak in conversation view"  │
├────────────────┼──────────────────────────────────────────────────────────┤
│ E) PRD         │ Full Product Requirements Document for major features    │
│                │ Example: "Proactive outreach system - full product spec" │
│                │ Example: "Member onboarding flow redesign"               │
└────────────────┴──────────────────────────────────────────────────────────┘

Briefly describe what you're trying to do:

(e.g., "Add push notification scheduling for proactive outreach"
 or "Fix crash when user taps back button on iOS")

Wait for user to provide spec type + basic concept.

Step 1b: Select Research

Once you know what they're building, ask what research to run:

Got it: [SPEC_TYPE] for "[BRIEF_CONCEPT]"

What research should I run? (e.g. "1,2,3 M" or "A" for all)

1) Codebase - scan related files, patterns
2) Linear - existing tickets, projects
3) Sparky - product/design context
4) Git - recent commits, branches
5) Docs - README, ADRs, specs
6) Exa - search web for best practices, examples
7) Other - [describe what you want researched]
A) All - run everything
N) None - skip research

Depth: Q (quick) / M (medium) / D (deep)
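
A reply such as "1,2,3 M" can be decoded with a small parser. The sketch below is one possible reading of the menu: the function name, the medium default depth, and the choice to leave "Other" out of "A" (since it needs a description) are assumptions.

```python
SOURCES = {"1": "codebase", "2": "linear", "3": "sparky", "4": "git",
           "5": "docs", "6": "exa", "7": "other"}
DEPTHS = {"Q": "quick", "M": "medium", "D": "deep"}
# Assumption: "All" excludes "other", which needs a user-supplied description.
ALL_SOURCES = [s for s in SOURCES.values() if s != "other"]


def parse_research_selection(reply: str, default_depth: str = "M") -> tuple[list[str], str]:
    """Parse replies such as '1,2,3 M', 'A', or 'N' into (sources, depth)."""
    tokens = reply.upper().replace(",", " ").split()
    depth = DEPTHS[default_depth]
    chosen: list[str] = []
    for tok in tokens:
        if tok in DEPTHS:
            depth = DEPTHS[tok]
        elif tok == "A":
            chosen = list(ALL_SOURCES)
        elif tok == "N":
            chosen = []
        elif tok in SOURCES:
            if SOURCES[tok] not in chosen:
                chosen.append(SOURCES[tok])
        else:
            raise ValueError(f"unrecognized selection: {tok!r}")
    return chosen, depth
```

The depth letters (Q, M, D) do not collide with the A and N shortcuts, so a single pass over the tokens is enough.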

Step 1c: Run Parallel Research

CRITICAL: Launch ALL research in a SINGLE message with multiple Task tool calls.

Do NOT call Bash directly for Sparky. Do NOT run research sequentially.

For Deep Codebase Research: If the user needs comprehensive codebase understanding, recommend they run /research_codebase first, then return to /spec. The research document will provide thorough context.

For Quick Spec Research: Use the parallel Task approach below.

Send ONE message containing multiple Task tool invocations. Example:

[Single message with 6 Task tool calls - all launch simultaneously]

Task 1: Codebase (invoke /research_codebase command)
Task 2: Linear (subagent_type=general-purpose)
Task 3: Sparky (subagent_type=general-purpose)
Task 4: Git (subagent_type=Bash)
Task 5: Docs (subagent_type=Explore)
Task 6: Exa (subagent_type=general-purpose)

Each Task returns max 500 tokens. All run simultaneously.

Task Prompts (subagent_type for each):

| Research | subagent_type | Prompt |
|----------|---------------|--------|
| Codebase | `general-purpose` | Invoke the `/research_codebase` command with the topic. The command uses specialized agents (codebase-locator, codebase-analyzer, codebase-pattern-finder) to find files, understand code, and identify patterns. Return summary of findings relevant to the spec. |
| Linear | `general-purpose` | Search Linear for [TOPIC]: related tickets, team, project. Also review all open and pending tickets, as other in-flight work might impact this topic. |
| Sparky | `general-purpose` | `[FROM SPEC-MAKER]` + product context, key questions |
| Git | `Bash` | Git analysis for [TOPIC]: recent commits, active branches |
| Docs | `Explore` | Find docs for [TOPIC]: README, ADRs, thoughts/ directory |
| Exa | `general-purpose` | Web search for [TOPIC]: best practices, pitfalls (ALWAYS run) |

Codebase Research Details:

When invoking codebase research, the agent should:

Use the /research_codebase command pattern with specialized sub-agents:
- codebase-locator: Find WHERE files and components live
- codebase-analyzer: Understand HOW specific code works
- codebase-pattern-finder: Find examples of existing patterns
Focus on documenting what EXISTS (not suggesting improvements)
Return:
- Related files with paths
- Existing patterns to follow
- Integration points
- Dependencies

Important for Sparky:

Use Task tool with subagent_type: general-purpose
The subagent calls: bash .claude/skills/sparky/scripts/ask_sparky.sh "prompt"
Do NOT call Bash directly from main conversation (blocks parallel execution)
ALL messages to Sparky MUST begin with [FROM SPEC-MAKER]

Important for Exa:

ALWAYS include Exa in research (unless user explicitly says "none")
Search for: "[TOPIC] best practices", "[TOPIC] common pitfalls", "[TOPIC] architecture patterns"
Include results in research summary under "Best Practices"

Context Window Management:

Each subagent returns max 500 tokens
Total research context: max 3500 tokens (7 sources × 500)
If a source returns nothing useful, omit from summary
Present combined research summary to user before questioning
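
The budget rules above can be sketched as a small combiner that drops empty sources and caps each summary. This is an approximation under stated assumptions: tokens are estimated by whitespace-separated words (a real tokenizer would count differently), and the function name and input shape are hypothetical.

```python
MAX_TOKENS_PER_SOURCE = 500  # per-subagent cap from the list above


def combine_research(results: dict[str, str]) -> str:
    """Drop empty sources and cap each summary, approximating tokens by words."""
    sections = []
    for source, text in results.items():
        words = text.split()
        if not words:
            continue  # nothing useful returned: omit from the summary
        capped = " ".join(words[:MAX_TOKENS_PER_SOURCE])
        sections.append(f"**{source}:**\n{capped}")
    return "\n\n".join(sections)
```

With seven sources at the cap, the combined body stays at or under 7 × 500 = 3500 approximate tokens, matching the total stated above.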

Research Summary Output:

📊 Research Complete

**Codebase:**
- Tech: [stack]
- Related files: [list]
- Patterns: [conventions]

**Linear:**
- Related tickets: [list with IDs]
- Team: [team name]
- Current epic: [if any]

**Sparky's Input:**
- Key context: [summary]
- (Questions moved to Step 2 below)

**Git:**
- Recent activity: [summary]
- Active branches: [list]

**Docs:**
- Relevant docs: [list]

**Exa (Best Practices):**
- Patterns: [list]
- Pitfalls: [list]
- Sources: [URLs]

Ready to gather specifics. [Proceed to Step 2]

Step 2: Gather Initial Context

Convert research findings into lettered questions.

Do NOT list Sparky's questions separately, then ask different questions. Instead, take Sparky's questions and format them WITH options.

Building the question batch:

Take Sparky's key questions (if any)
Add technical questions from codebase findings
Format each with 3-4 lettered options based on common patterns
Present as ONE unified batch

Example - Converting Sparky's questions:

Sparky asked: "Who are the primary annotators?"

Convert to:

1. Who are the primary users?
   A) Just you / small eng team (< 5 people)
   B) Mixed team (eng + clinicians/researchers)
   C) Broader org access needed
   D) Other: [describe]

Question Format:

2-4 options per question with letters (A, B, C, D)
Always include D) Other for custom input
User responds with shorthand: "1A, 2C, 3D"
Ask in batches of 3-5 questions, wait for response
All questions in ONE consistent format
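
The shorthand reply format is simple enough to parse mechanically. A minimal sketch, assuming answers are limited to letters A-D and that free-text custom answers are handled separately (the function name is hypothetical):

```python
import re


def parse_answers(reply: str) -> dict[int, str]:
    """Parse shorthand like '1A, 2C, 3D' into {question_number: option_letter}."""
    answers: dict[int, str] = {}
    for number, letter in re.findall(r"(\d+)\s*([A-Da-d])\b", reply):
        answers[int(number)] = letter.upper()
    return answers
```

Questions missing from the returned mapping can be re-asked in the next batch.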

Example unified batch:

Based on the research, I need to clarify a few things.
Answer with shorthand like "1B, 2A, 3C" or provide custom answers.

1. What type of user problem is this?
   A) UX friction - users can do it but it's painful
   B) Missing functionality - users can't do something they need
   C) Performance issue - too slow or resource-intensive
   D) Other: [describe]

2. How urgent is this?
   A) Blocking - users are stuck, no workaround
   B) High - significant pain, workaround exists
   C) Medium - nice to have, not urgent
   D) Low - minor improvement

3. Is there existing code/systems this touches?
   A) New feature - mostly greenfield
   B) Modification - changing existing behavior
   C) Integration - connecting existing pieces
   D) Refactor - restructuring without behavior change

Key Questions by Type (adapt with lettered options):

Epic

What problem does this solve? (user pain / business need / technical debt)
Who are the primary users/stakeholders?
What does success look like? (metrics)
What's in scope vs explicitly out of scope?
What are the key user stories? (list 3-5)
Dependencies on other teams/systems?
Rough scope? (small/medium/large)

Issue/Story

What user problem does this address?
How urgent/important is this?
What are the acceptance criteria? (list key ones)
Are there technical constraints?
Dependencies or blockers?
Any design/UX requirements?

Bug

What's the expected vs actual behavior?
Steps to reproduce? (numbered list)
Environment details (OS, browser, version)?
Severity? (critical/high/medium/low)
Impact - how many users affected?
Any workarounds known?

Spike

What question are we trying to answer?
What's the time-box? (hours/days)
What artifacts will be produced?
What decision will this inform?
What options are we evaluating?

PRD

What problem are we solving?
Who is the target user?
What are the goals? (list 2-3)
What are the non-goals? (explicitly out of scope)
What does the user journey look like?
Technical constraints or dependencies?
How will we measure success?

Iteration Pattern:

Continue asking questions until you have:

Clear problem statement
Defined scope (in/out)
Acceptance criteria (at least 3)
Technical approach (if applicable)
Dependencies identified

Then proceed to Step 3 (Sparky) or Step 4 (Generate Spec).

Step 3: Consult Sparky (Optional)

Skip this step if Sparky was marked unavailable in Step 0.

Once you have sufficient context about what the user is trying to build, ask if they want Sparky's input:

I have enough context to start drafting. Would you like me to consult Sparky first?

Sparky can provide product/design/user research perspective and identify
the most important questions you should answer before finalizing the spec.

A) Yes, consult Sparky
B) No, proceed to draft

If the user chooses Yes:

Summarize the context gathered so far

Call Sparky with a structured prompt:

bash .claude/skills/sparky/scripts/ask_sparky.sh "PROMPT"

Prompt Template for Sparky:

[FROM SPEC-MAKER] We're writing a [SPEC_TYPE] for: [BRIEF_DESCRIPTION]

Context gathered so far:
- Problem: [PROBLEM]
- Users: [TARGET_USERS]
- Scope: [IN_SCOPE / OUT_OF_SCOPE]
[OTHER RELEVANT CONTEXT]

Based on your product/design/user research knowledge:
1. What context or insights should inform this spec?
2. What are the 3-5 most important questions the human needs to answer before we finalize?
3. Any red flags or considerations we should address?
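
Filling the template programmatically helps guarantee the required `[FROM SPEC-MAKER]` prefix. The sketch below is illustrative: the function names and parameters are hypothetical, and passing the prompt as a single argument list entry (to avoid shell quoting problems) is a design suggestion, not something the skill prescribes.

```python
PREFIX = "[FROM SPEC-MAKER]"
SCRIPT = ".claude/skills/sparky/scripts/ask_sparky.sh"


def build_sparky_prompt(spec_type: str, description: str, problem: str,
                        users: str, scope: str) -> str:
    """Fill the template above; the prefix always comes first."""
    return (
        f"{PREFIX} We're writing a {spec_type} for: {description}\n\n"
        "Context gathered so far:\n"
        f"- Problem: {problem}\n"
        f"- Users: {users}\n"
        f"- Scope: {scope}\n\n"
        "Based on your product/design/user research knowledge:\n"
        "1. What context or insights should inform this spec?\n"
        "2. What are the 3-5 most important questions the human needs "
        "to answer before we finalize?\n"
        "3. Any red flags or considerations we should address?"
    )


def sparky_argv(prompt: str) -> list[str]:
    """Argument list for the script; the prompt is one argv entry, no shell quoting."""
    if not prompt.startswith(PREFIX):
        raise ValueError("Sparky messages must begin with [FROM SPEC-MAKER]")
    return ["bash", SCRIPT, prompt]
```

The resulting list can be handed to a process runner such as `subprocess.run` without going through a shell.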

Using Sparky's Response:

Present Sparky's insights to the user
Ask the user to answer Sparky's key questions
Incorporate both Sparky's context and the user's answers into the spec

Example Sparky Response:

Context: We've seen users struggle with [X] in the past. The design team
explored [Y] approach last quarter.

Key questions to answer:
1. How does this interact with the existing [feature]?
2. What's the rollout strategy - all users or phased?
3. Have we validated this solves the actual user pain point?
4. What's the fallback if [edge case] happens?

Red flags: Make sure to consider [Z] which caused issues before.

Step 3b: Search Best Practices (Exa)

Before generating the spec, search for industry best practices using Exa:

Searching for best practices related to: [TOPIC]...

Use mcp__exa__web_search_exa to find:

Implementation patterns for similar features
Common pitfalls and how to avoid them
Security considerations
Performance best practices
UX patterns (if applicable)

Example Exa queries:

| Spec Topic | Search Query |
|------------|--------------|
| Auth system | "OAuth 2.0 implementation best practices 2024" |
| API design | "REST API design patterns error handling" |
| Database migration | "zero downtime database migration patterns" |
| Caching | "caching strategies cache invalidation best practices" |
| File upload | "secure file upload validation best practices" |

Exa Search Call:

mcp__exa__web_search_exa:
  query: "[TOPIC] implementation best practices"
  numResults: 5
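
As a sketch of how the call arguments might be assembled, the helper below builds one payload per query. Only the `query` and `numResults` fields shown above are used; the helper name is hypothetical, and the suffixes combine the template above with the search phrases listed in Step 1c.

```python
# Query suffixes: the template above plus the phrases listed in Step 1c.
SUFFIXES = (
    "implementation best practices",
    "common pitfalls",
    "architecture patterns",
)


def exa_search_args(topic: str, num_results: int = 5) -> list[dict]:
    """Build one {query, numResults} argument dict per search phrase."""
    return [{"query": f"{topic} {suffix}", "numResults": num_results}
            for suffix in SUFFIXES]
```

Each dict would be passed as the arguments of one `mcp__exa__web_search_exa` call.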

Extract and summarize (max 300 tokens):

📚 Best Practices Found:

**Implementation Patterns:**
- [Pattern 1]: [brief description]
- [Pattern 2]: [brief description]

**Common Pitfalls:**
- [Pitfall 1]: [how to avoid]
- [Pitfall 2]: [how to avoid]

**Security Considerations:**
- [Item 1]
- [Item 2]

Sources: [list URLs for references section]

Incorporate into spec:

Add relevant patterns to Technical Notes section
Include pitfalls in Edge Cases
Add security items to Non-Functional Requirements
Link sources in References section

Step 4: Generate Spec

When you have enough context (including Sparky's input if consulted), generate the complete spec in markdown format.

Output Requirements:

Use proper markdown headers, lists, and formatting
Include all relevant sections for the spec type
Be specific and actionable
Include acceptance criteria where applicable

Spec Templates:

Epic Template

# Epic: [Title]

## TL;DR
> **What:** [One sentence describing the initiative]
> **Why:** [Business value / user impact]
> **Scope:** [X] issues across [Y] phases
> **Teams:** [Teams involved]

## Problem Statement
[Clear description of the problem being solved]

## Goals
- [Goal 1]
- [Goal 2]

## Non-Goals
- [Explicitly out of scope]

## Architecture
[High-level architecture description]

┌─────────────┐       ┌──────────────┐
│ Component A │──────▶│ Component B  │
└─────────────┘       └──────────────┘


## Acceptance Criteria (BDD)

<!--
IMPORTANT: Do NOT write vague summary ACs like "Given the epic is complete, Then everything works."
Each AC must be a discrete, testable user story. Group by functional area.
Aim for 10-20 ACs for an Epic, covering: core flows, edge cases, error handling, analytics.
-->

### [Functional Area 1]

**AC1: [Specific scenario name]**
```gherkin
Given [specific precondition]
When [specific action]
Then [specific observable outcome]
And [additional verifiable outcome]

AC2: [Another specific scenario]

Given [specific precondition]
When [specific action]
Then [specific observable outcome]

[Functional Area 2]

AC3: [Specific scenario name]

Given [specific precondition]
When [specific action]
Then [specific observable outcome]

Edge Cases & Error Handling

AC4: [Edge case name]

Given [edge case condition]
When [action occurs]
Then [graceful handling]

AC5: [Error scenario]

Given [error condition]
When [action fails]
Then [system recovers gracefully]
And [appropriate logging/alerting occurs]

Analytics Events

| Event | Trigger | Properties |
|-------|---------|------------|
| `[feature]_[action]` | [When this fires] | `user_id`, `[key_prop]`, `[key_prop]` |
| `[feature]_error` | [When error occurs] | `user_id`, `error_type`, `error_message` |
| `[feature]_success` | [When flow completes] | `user_id`, `duration_ms`, `[outcome]` |

Sub-Issues / Phases

Phase 1: [Name]

| Issue | Description | BDD Summary |
|-------|-------------|-------------|
| [Issue 1] | [Brief] | Given X, When Y, Then Z |
| [Issue 2] | [Brief] | Given X, When Y, Then Z |

Phase 2: [Name]

| Issue | Description | BDD Summary |
|-------|-------------|-------------|
| [Issue 3] | [Brief] | Given X, When Y, Then Z |
| [Issue 4] | [Brief] | Given X, When Y, Then Z |

Success Metrics

| Metric | Current | Target | How to Measure |
|--------|---------|--------|----------------|
| [Metric 1] | [Baseline] | [Goal] | [Method] |

Dependencies

[Team/System]: [What's needed]

Open Questions

[Question 1] - Owner: [Name]
[Question 2] - Owner: [Name]


#### Issue/Story Template
```markdown
# [Issue Title]

## TL;DR
> **What:** [One sentence describing the change]
> **Why:** [One sentence on the value/impact]
> **How:** [One sentence on approach]
> **Scope:** [Small/Medium/Large] | **Risk:** [Low/Medium/High]

## Problem
[What problem does this solve?]

## Solution Approach
[Proposed approach]

## Architecture
[If applicable - component interactions, data flow]

[ASCII diagram if helpful]


## User Flow
[If applicable - user-impacting feature]

[User journey diagram if helpful]


## Acceptance Criteria (BDD)

### Core Functionality

**AC1: [Name]**
```gherkin
Given [precondition/context]
When [action/trigger]
Then [expected outcome]
And [additional outcome if needed]

AC2: [Name]

Given [precondition/context]
When [action/trigger]
Then [expected outcome]

Edge Cases & Error Handling

AC3: [Edge case name]

Given [edge case condition]
When [action/trigger]
Then [graceful handling]
And [user feedback if applicable]

AC4: [Error scenario]

Given [error condition]
When [action fails]
Then [error is handled gracefully]
And [appropriate error message shown]

Implementation Hints

Files likely to change:

path/to/file1.ts - [what changes]
path/to/file2.ts - [what changes]

Key functions/classes:

functionName() - [modification needed]
ClassName - [modification needed]

Patterns to follow:

[Existing pattern in codebase to match]

Test Plan

E2E Tests:

[Test description - user journey covered]

Integration Tests:

[Test description - components/APIs tested]

Regression Tests:

[Test description - existing behavior preserved]

Analytics Events

| Event | Trigger | Properties |
|-------|---------|------------|
| `[feature]_started` | [When user begins flow] | `user_id`, `[context]` |
| `[feature]_completed` | [When flow succeeds] | `user_id`, `duration_ms` |
| `[feature]_failed` | [When flow fails] | `user_id`, `error_type` |

Technical Notes

[Implementation details, constraints]

Non-Functional Requirements

Performance: [If applicable]
Security: [If applicable]

Dependencies

[Dependencies if any]

Out of Scope

[What this does NOT include]

References

[Link to related docs/tickets]


#### Bug Template
```markdown
# Bug: [Title]

## TL;DR
> **Bug:** [One sentence description]
> **Impact:** [Who/what is affected]
> **Severity:** [Critical/High/Medium/Low]
> **Workaround:** [Yes/No] - [brief if yes]

## Bug Report (BDD)
```gherkin
Given [the preconditions/setup]
When [the action that triggers the bug]
Then [actual incorrect behavior]
But [expected correct behavior]

Steps to Reproduce

[Step 1]
[Step 2]
[Step 3]

Environment

OS: [Operating system]
Browser/Version: [If applicable]
App Version: [Version number]

Impact

[Who is affected and how many users]

Workaround

[Any known workarounds, or "None"]

Fix Acceptance Criteria (BDD)

AC1: Bug is fixed

Given [same preconditions as bug report]
When [same action that triggered bug]
Then [correct expected behavior]
And [no regression in related functionality]

AC2: Regression test added

Given the fix is implemented
Then a test exists that would catch this bug
And the test is included in CI pipeline

Implementation Hints

Likely root cause:

[Hypothesis about what's wrong]

Files to investigate:

path/to/file.ts - [why]

Related code:

[Function/class that likely contains bug]

Test Plan

Regression Tests:

[Test that would have caught this bug]
[Test for related edge cases]

Integration Tests:

[Test for component interactions if applicable]

Screenshots/Logs

[Attach if available]


#### Spike Template
```markdown
# Spike: [Title]

## TL;DR
> **Question:** [Primary question to answer]
> **Time-box:** [Duration]
> **Output:** [What artifact will be produced]
> **Decision:** [What this will help decide]

## Research Questions
1. [Primary question this spike will answer]
2. [Secondary question]
3. [Tertiary question if applicable]

## Background
[Context on why this investigation is needed]

## Time-box
[Duration: e.g., 2 days]

## Approach
1. [Investigation step 1]
2. [Investigation step 2]
3. [Investigation step 3]

## Options to Evaluate
| Option | Pros | Cons | Effort |
|--------|------|------|--------|
| [Option A] | [Pros] | [Cons] | [S/M/L] |
| [Option B] | [Pros] | [Cons] | [S/M/L] |
| [Option C] | [Pros] | [Cons] | [S/M/L] |

## Decision Criteria
How we'll evaluate options:
| Criterion | Weight | Notes |
|-----------|--------|-------|
| [Criterion 1] | [High/Med/Low] | [Why important] |
| [Criterion 2] | [High/Med/Low] | [Why important] |

## Spike Complete When (BDD)
```gherkin
Given the spike time-box is complete
Then we have answered: [primary question]
And we have a recommendation with rationale
And we have documented trade-offs
And next steps are defined

Artifacts

[Decision document / ADR]
[POC code if applicable]
[Comparison matrix]

Decision This Informs

[What decision will be made based on findings]

References

[Existing docs, prior art, relevant links]


#### PRD Template
```markdown
# PRD: [Feature Name]

## Overview
[2-3 sentence summary]

## Problem Statement
[Detailed problem description]

## Goals
1. [Goal 1]
2. [Goal 2]

## Non-Goals
- [What this will NOT address]

## Target Users
[User personas or segments]

## User Stories
1. As a [user], I want [action] so that [benefit]

## Architecture

[High-level system architecture]

┌─────────────┐       ┌──────────────┐
│ Component A │──────▶│ Component B  │
└─────────────┘       └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │ Component C  │
                      └──────────────┘


## User Flow

         START
           │
           ▼
┌─────────────────────────────┐
│ [Step 1]                    │
└──────────┬──────────────────┘
           │
           ▼
┌─────────────────────────────┐
│ [Step 2]                    │
└──────────┬──────────────────┘
           │
           ▼
          END


## Requirements

### Functional Requirements
- [FR-1]: [Requirement]
- [FR-2]: [Requirement]

### Non-Functional Requirements
- **Performance**: [Criteria]
- **Reliability**: [Criteria]
- **Security**: [Criteria]

## Acceptance Criteria

**Core Functionality:**
- [ ] [Criterion 1]
- [ ] [Criterion 2]

**Edge Cases & Fallback Behavior:**
- [ ] [Edge case 1]
- [ ] [Edge case 2]

## Testing Requirements

**Unit Tests:**
- [ ] [Test 1]

**Integration Tests:**
- [ ] [Test 1]

## Implementation Phases

### Phase 1: [Name]
**Dependencies:** None
- [Deliverable 1]
- [Deliverable 2]

### Phase 2: [Name]
**Dependencies:** Phase 1
- [Deliverable 3]
- [Deliverable 4]

## Success Metrics
| Metric | Current | Target |
|--------|---------|--------|
| [Metric 1] | [Baseline] | [Goal] |

## Open Questions
- [Question 1]

## References
- [Link to related docs]
- [Link to related tickets]

Step 5: Review and Refine

After generating the spec:

Ask if any sections need adjustment
Clarify any ambiguous points
Iterate until the user is satisfied

Step 6: Evaluate Scope & Breakdown

Before creating tickets, analyze the spec and propose the right structure:

Based on the spec, here's my assessment:

**Scope Analysis:**
- Acceptance Criteria count: [X]
- Phases/milestones: [X]
- Estimated PRs: [X]
- Cross-team dependencies: [Yes/No]

**Recommended Structure:**
[One of the options below]

Classification Matrix:

| Indicators | Recommendation |
|------------|----------------|
| < 5 ACs, single PR, no phases | Single Issue |
| 5-10 ACs, 2-3 PRs, phases mentioned | Epic (single ticket with sub-tasks) |
| 10-15 ACs, multiple PRs, clear phases | Epic + Sub-Issues (parent + children) |
| > 15 ACs, multi-team, roadmap item | Project + Epics + Issues (full hierarchy) |

Present options to user:

How should we structure this in Linear?

A) Single Issue - create one ticket with full spec
B) Epic with Sub-Issues - create parent epic + [N] child issues
C) Project - create project with [N] epics/issues underneath
D) Multiple separate Issues - create [N] independent tickets
E) Let me decide - just output the markdown
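
The classification matrix above can be approximated as a short decision function that produces the recommendation shown alongside these options. This is a sketch under assumptions: the matrix rows overlap at exactly 10 ACs, which is resolved here in favor of the smaller structure, and AC count is treated as the primary indicator with PR count and phases as tie-breakers. The function name and parameters are hypothetical.

```python
def recommend_structure(ac_count: int, pr_count: int, has_phases: bool,
                        multi_team: bool = False) -> str:
    """Map scope indicators to a recommended Linear structure."""
    if ac_count > 15 or multi_team:
        return "Project + Epics + Issues"
    if ac_count > 10:  # assumption: exactly 10 ACs falls in the 5-10 row
        return "Epic + Sub-Issues"
    if ac_count >= 5 or pr_count > 1 or has_phases:
        return "Epic"
    return "Single Issue"
```

The result is only a recommendation; the user still picks the final structure from the options above.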

If user chooses B, C, or D (multiple tickets):

Propose the breakdown:

I'll create the following tickets:

📁 [Epic/Project]: [Title]
├── 📋 Issue 1: [Title] - [brief scope]
├── 📋 Issue 2: [Title] - [brief scope]
└── 📋 Issue 3: [Title] - [brief scope]

Does this breakdown look right?
A) Yes, create all tickets
B) Adjust the breakdown
C) Just create the parent, I'll add children later

Generate individual specs for each ticket (condensed from main spec)
Create tickets with proper parent-child relationships

Step 6b: Notify Sparky (Automatic)

After the spec is finalized, automatically send it to Sparky as an FYI. Skip this step if Sparky was marked unavailable in Step 0.

No user prompt needed. No response expected. Just inform Sparky of the decision.

bash .claude/skills/sparky/scripts/ask_sparky.sh "[FROM SPEC-MAKER] FYI - Final spec created:

Title: [SPEC_TITLE]
Type: [Issue/Epic/PRD/etc.]
Summary: [TL;DR from spec]

Key decisions:
- [Decision 1]
- [Decision 2]
- [Decision 3]

This is for your records. No response needed."

Do NOT wait for Sparky's response. Run in background or fire-and-forget.
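If the notification is launched from a script rather than a shell, fire-and-forget could look roughly like the sketch below. The helper names `build_fyi_message` and `notify_in_background` are hypothetical; only the message layout and the script path come from this document.

```python
import subprocess


def build_fyi_message(title: str, spec_type: str, summary: str, decisions: list[str]) -> str:
    """Assemble the FYI text in the same layout as the template above."""
    decision_lines = "\n".join(f"- {d}" for d in decisions)
    return (
        "[FROM SPEC-MAKER] FYI - Final spec created:\n\n"
        f"Title: {title}\n"
        f"Type: {spec_type}\n"
        f"Summary: {summary}\n\n"
        f"Key decisions:\n{decision_lines}\n\n"
        "This is for your records. No response needed."
    )


def notify_in_background(command: list[str]) -> subprocess.Popen:
    """Start the command and return immediately, without reading its output."""
    return subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


# Example call (not executed here):
# notify_in_background(["bash", ".claude/skills/sparky/scripts/ask_sparky.sh", message])
```

`Popen` returns as soon as the process starts, so the spec flow continues without waiting on Sparky.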

Step 6c: Validation Gate

Before creating or updating any ticket, run validation checks.

This is a HARD GATE. Do not proceed to Step 7 if blocking checks fail.

Run these checks on the generated spec:

┌─────────────────────────────────────────────────────────────────┐
│ SPEC VALIDATION                                                 │
├─────────────────────────────────────────────────────────────────┤
│ [✓/✗] Problem statement: [present/missing] ([N] chars)         │
│ [✓/✗] Acceptance criteria: [N] ACs [in Gherkin/plain format]   │
│ [✓/✗] Scope defined: [goals + non-goals / missing]             │
│ [✓/✗] Architecture diagram: [present/missing]                  │
│ [✓/✗] User flow diagram: [present/missing]                     │
│ [✓/✗] Test plan: [present/missing]                             │
│ [✓/✗] Analytics events: [present/missing]                      │
│ [✓/✗] AC quality: [discrete stories / vague summary]           │
│ [✓/⚠] Ambiguous language: [none found / found on line X]       │
│ [✓/⚠] TL;DR present: [yes/no]                                  │
│ [✓/⚠] Implementation hints: [present/missing]                  │
├─────────────────────────────────────────────────────────────────┤
│ Result: [PASS/FAIL] ([N] blocking, [M] warnings)                │
└─────────────────────────────────────────────────────────────────┘

Validation Rules:

Check	Rule	Severity	Applies To
`problem_statement`	Section exists, >50 chars	BLOCK	All
`acceptance_criteria`	At least 3 ACs	BLOCK	Issue, Bug, Epic
`gherkin_format`	ACs use Given/When/Then	WARN	Issue, Bug
`scope_defined`	Has goals AND non-goals	BLOCK	Epic, PRD
`scope_defined`	Has "Out of Scope" section	WARN	Issue
`architecture_diagram`	ASCII diagram showing component interactions	BLOCK	Epic, PRD, Large Issue
`architecture_diagram`	Diagram present for multi-component work	WARN	Medium Issue
`user_flow_diagram`	User flow diagram for user-facing features	BLOCK	PRD
`user_flow_diagram`	User flow diagram if feature has UI/UX	WARN	Issue, Epic
`test_plan`	Lists E2E, Integration, or Regression tests to write	BLOCK	Issue, Bug
`test_plan`	Specifies what tests will be created	WARN	Epic
`analytics_events`	Analytics section with event name, trigger, properties	BLOCK	Issue, Epic
`analytics_events`	Analytics section exists	WARN	Bug
`ac_quality`	ACs are discrete, testable user stories (not vague summaries)	BLOCK	All
`no_ambiguous_language`	No "should/might/could/maybe" in ACs	WARN	All
`tldr_present`	TL;DR or summary exists	WARN	All
`implementation_hints`	Files/functions listed	WARN	Issue, Bug
`success_metrics`	Measurable metrics defined	BLOCK	Epic, PRD
`success_metrics`	Metrics section exists	WARN	Issue
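The rule table maps naturally onto data plus a small evaluator. The sketch below covers three of the rules to show how severity and "Applies To" interact. The `Rule` class, the `validate` function, and the spec dictionary keys (`problem`, `acs`, `tldr`) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the spec passes
    severity: str                  # "BLOCK" or "WARN"
    applies_to: frozenset          # spec types, or {"All"}


# A subset of the rules from the table above.
RULES = [
    Rule("problem_statement", lambda s: len(s.get("problem", "")) > 50,
         "BLOCK", frozenset({"All"})),
    Rule("acceptance_criteria", lambda s: len(s.get("acs", [])) >= 3,
         "BLOCK", frozenset({"Issue", "Bug", "Epic"})),
    Rule("tldr_present", lambda s: bool(s.get("tldr")),
         "WARN", frozenset({"All"})),
]


def validate(spec: dict, spec_type: str) -> dict:
    """Run every applicable rule and summarize blocking failures and warnings."""
    blocking, warnings = [], []
    for rule in RULES:
        if "All" not in rule.applies_to and spec_type not in rule.applies_to:
            continue  # rule does not apply to this spec type
        if not rule.check(spec):
            (blocking if rule.severity == "BLOCK" else warnings).append(rule.name)
    return {"result": "FAIL" if blocking else "PASS",
            "blocking": blocking, "warnings": warnings}
```

Warnings never change the result; only a non-empty `blocking` list turns the gate to FAIL.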

On PASS (0 blocking issues):

✓ Spec validation passed ([N] warnings)

[Show warnings if any]

Proceed to create ticket? (Y/n)

On FAIL (1+ blocking issues):

✗ Spec validation failed ([N] blocking issues)

Please fix before creating ticket:
- [List blocking issues]

[Also show warnings]

Which issues should we address?
A) Fix all - I'll ask questions for each
B) Let me edit manually - show the spec
C) Skip validation - create anyway (not recommended)

If user chooses C (skip):

Add label needs-spec-review to the ticket
Add comment noting validation was skipped
Proceed to Step 7

Step 7: Create or Update Linear Ticket(s)

If Linear was marked unavailable in Step 0:

The spec is ready! Here's the final markdown:

[OUTPUT FULL SPEC]

[If multiple tickets were planned in Step 6, output each spec separately]

Note: Linear is not configured in this session. You can copy these specs
and create the tickets manually, or set up the Linear MCP for future sessions.

If Linear is available:

UPDATE MODE (existing ticket from Step 0b)

When updating an existing ticket:

Use mcp__linear-server__update_issue with the ticket ID
Replace the description with the updated spec
If the ticket was in "Needs Human" status, move it back to its previous state (usually "Backlog")
Add a comment summarizing what changed

Update call:

mcp__linear-server__update_issue:
  id: [TICKET_ID]
  description: [FULL_UPDATED_SPEC]
  state: [previous state or "Backlog" if was "Needs Human"]
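One way to assemble the arguments for the update call, including the state rule, is sketched below. The helper name `build_update_payload` is hypothetical. The keys mirror the `id`, `description`, and `state` fields shown in the call, and the actual tool invocation is left out.

```python
from typing import Optional


def build_update_payload(
    ticket_id: str,
    spec_markdown: str,
    current_state: str,
    previous_state: Optional[str] = None,
) -> dict:
    """Build update arguments, moving "Needs Human" tickets back to a working state."""
    if current_state == "Needs Human":
        # Fall back to "Backlog" when the previous state is unknown.
        state = previous_state or "Backlog"
    else:
        state = current_state
    return {"id": ticket_id, "description": spec_markdown, "state": state}
```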

Add change summary comment:

mcp__linear-server__create_comment:
  issueId: [TICKET_ID]
  body: |
    ## Spec Updated

    Changes made:
    - [Added/Updated problem statement]
    - [Added X acceptance criteria]
    - [Added out-of-scope section]
    - [etc.]

    ---
    _Updated via /spec command_

Post-update summary:

✅ Updated ticket: [TICKET-ID]

Changes:
- [List what was added/changed]

URL: [LINEAR_URL]

[If the ticket was in "Needs Human", note that it was moved back to Backlog]

CREATE MODE (new ticket)

Create based on Step 6 decision:

Single Issue:

Use mcp__linear-server__create_issue
Include full markdown spec in description

Epic with Sub-Issues:

Create parent Epic issue first
Create child issues with parentId set to Epic
Return all ticket URLs
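The ordering matters because each child needs the parent's ID. The sketch below shows that sequencing under stated assumptions: `create_issue` is an injected stand-in for the `mcp__linear-server__create_issue` tool, and the response shape (a dict with `id` and `url`) is assumed, not taken from the Linear MCP documentation.

```python
from typing import Callable


def create_epic_with_children(
    create_issue: Callable[[dict], dict],
    epic: dict,
    children: list[dict],
) -> list[str]:
    """Create the parent first, then each child with parentId set to the parent's id."""
    parent = create_issue(epic)
    urls = [parent["url"]]
    for child in children:
        created = create_issue({**child, "parentId": parent["id"]})
        urls.append(created["url"])
    return urls  # parent URL first, then children in creation order
```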

Project:

Use mcp__linear-server__create_project for container
Create issues/epics underneath with project association
Return project URL + all ticket URLs

For all tickets:

Ask for team assignment if not specified
Set appropriate labels based on spec type (Bug, Feature, Spike, etc.)
Link related tickets if creating multiple
Copy final URLs to clipboard

Post-creation summary:

✅ Created [N] ticket(s) in Linear:

📁 [Project/Epic]: [URL]
├── 📋 [Issue 1]: [URL]
├── 📋 [Issue 2]: [URL]
└── 📋 [Issue 3]: [URL]

[URLs copied to clipboard]
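The tree in the summary can be generated from (title, URL) pairs. This is an illustrative sketch. `render_ticket_tree` is a hypothetical helper, not an existing part of the skill.

```python
def render_ticket_tree(parent: tuple[str, str], children: list[tuple[str, str]]) -> str:
    """Render (title, url) pairs as the tree shown in the summary template."""
    lines = [f"📁 {parent[0]}: {parent[1]}"]
    for index, (title, url) in enumerate(children):
        # The last child closes the tree with a corner instead of a tee.
        branch = "└──" if index == len(children) - 1 else "├──"
        lines.append(f"{branch} 📋 {title}: {url}")
    return "\n".join(lines)
```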

ASCII Diagrams

Include ASCII diagrams when they help clarify the spec. Types to consider:

Diagram Type	When to Use
Architecture	Component interactions, data flow, system boundaries
Sequence	API interactions, user flows with multiple steps
State Machine	Feature lifecycle, status transitions
Entity Relationship	Data models, database schemas

Diagram Requirements:

Must render correctly in terminal AND markdown (Linear, GitHub)
Use Unicode box-drawing characters and arrows: ┌┐└┘├┤┬┴┼─│▶▼
Wrap in triple backticks for markdown code blocks
Keep width reasonable (< 80 chars ideal)
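The width and character requirements can be checked mechanically. The sketch below is one possible check, with a hypothetical function name. It measures width with `len`, which counts code points rather than terminal columns, so treat the result as an approximation.

```python
ALLOWED_DRAWING_CHARS = set("┌┐└┘├┤┬┴┼─│▶▼")


def check_diagram(diagram: str, max_width: int = 80) -> list[str]:
    """Return a list of problems; an empty list means the diagram passes."""
    problems = []
    for number, line in enumerate(diagram.splitlines(), start=1):
        if len(line) >= max_width:
            problems.append(f"line {number} is {len(line)} chars wide")
        for char in line:
            # Non-ASCII characters must come from the allowed drawing set.
            if ord(char) > 127 and char not in ALLOWED_DRAWING_CHARS:
                problems.append(f"line {number} uses unexpected character {char!r}")
    return problems
```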

Example Architecture Diagram:

┌─────────────┐       ┌──────────────┐
│   Client    │──────▶│   API Layer  │
└─────────────┘       └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │   Database   │
                      └──────────────┘

Classification Logic

When creating Linear tickets, classify based on scope:

Type	Indicators	Linear Entity
Issue	Single PR, isolated fix, < 5 ACs, no phases	Issue
Epic	Multi-PR, 5-15 ACs, explicit phases, coordinated work	Issue (Epic label) or Project
Project	Multi-team, > 15 ACs, initiative-level, roadmap item	Project

Classification Signals:

Scope keywords: "multiple PRs", "phases", "dependencies"
Complexity: story points, time mentions
Architecture diagram complexity
Number of acceptance criteria

Acceptance Criteria Quality

The #1 spec failure mode is vague, untestable ACs.

❌ BAD: Vague Summary ACs

Given the feature is fully implemented
Then users can do the thing
And everything works correctly
And the system is reliable

This is useless. It's not testable, not specific, and doesn't capture discrete behaviors.

✅ GOOD: Discrete, Testable User Stories

Each AC should be:

Specific: One scenario, one outcome
Testable: Can write an automated test for it
Independent: Doesn't depend on other ACs to make sense
Observable: Outcome is verifiable (not "system is fast" but "response < 200ms")

Group ACs by functional area:

Functional Area	Example ACs
Core Flow	Happy path user journeys
Edge Cases	Boundary conditions, empty states
Error Handling	What happens when things fail
Timing/Scheduling	Time-based behaviors
Analytics	Event tracking verification
Integration	Cross-component behaviors

AC Count Guidelines:

Spec Type	Target AC Count
Issue	5-10 ACs
Epic	15-25 ACs
PRD	20-40 ACs

Example: Decomposing a Feature

Feature: "Send push notifications"

❌ Vague:

Given the notification system is implemented
Then users receive notifications

✅ Decomposed:

# Core Flow
AC1: Given a notification is scheduled for 6pm
     When the scheduler runs at 6pm
     Then the notification is sent via Expo Push
     And sent_at timestamp is recorded

# Edge Case
AC2: Given a notification is scheduled for 6pm
     And the user's push token is invalid
     When the scheduler runs
     Then the notification is marked as failed
     And no retry is attempted

# Error Handling
AC3: Given Expo Push API returns a 500 error
     When sending a notification
     Then the system retries with exponential backoff
     And after 3 failures, marks as failed

# Analytics
AC4: When a notification is sent successfully
     Then a notification_sent event is logged
     With properties: user_id, notification_id, scheduled_for, actual_send_time

Analytics Requirements

Every Issue and Epic MUST include an Analytics section.

Why Analytics Matter

Without analytics, you cannot:

Know if the feature is being used
Measure success against goals
Debug issues in production
Make data-driven decisions about iteration

Event Naming Convention

Use snake_case, following one of the patterns below. The general form is [feature]_[object]_[action], where the object is optional.

Pattern	Example	Use Case
`[feature]_[action]`	`ppo_scheduled`	Feature-level events
`[feature]_[object]_[action]`	`ppo_notification_sent`	When feature has sub-objects
`[screen]_viewed`	`settings_viewed`	Screen/page views
`[button]_tapped`	`submit_button_tapped`	UI interactions
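A simple pattern check can catch names that break the snake_case convention. The sketch below only checks the shape of the name (lowercase parts joined by single underscores, at least two parts). It cannot tell whether the parts are a meaningful feature, object, or action. The names `EVENT_NAME_PATTERN` and `is_valid_event_name` are hypothetical.

```python
import re

# Lowercase alphanumeric parts separated by single underscores, at least two parts.
EVENT_NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)+")


def is_valid_event_name(name: str) -> bool:
    """True when the whole name matches the snake_case event shape."""
    return EVENT_NAME_PATTERN.fullmatch(name) is not None
```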

Event Categories

Lifecycle Events - Track the full funnel:

Event	When	Required Properties
`*_scheduled`	System queues an action	`scheduled_for`, `trigger_type`
`*_started`	User/system begins flow	`source`, `entry_point`
`*_completed`	Flow succeeds	`duration_ms`, `outcome`
`*_failed`	Flow fails	`error_type`, `error_code`, `retry_count`
`*_abandoned`	User exits mid-flow	`last_step`, `time_in_flow_ms`
`*_retried`	System/user retries	`attempt_number`, `previous_error`

Engagement Events - Track user behavior:

Event	When	Required Properties
`*_viewed`	User sees content	`content_id`, `position`, `source`
`*_tapped`	User interacts	`element_id`, `context`
`*_opened`	User opens notification/modal	`time_since_sent`, `source`
`*_dismissed`	User dismisses	`time_visible_ms`, `action_taken`

Business Events - Track outcomes:

Event	When	Required Properties
`*_converted`	User completes key action	`conversion_type`, `value`
`*_activated`	Feature first used	`days_since_signup`, `activation_path`
`*_retained`	User returns	`days_since_last_active`, `return_trigger`
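The "Required Properties" columns can be enforced when events are emitted or reviewed. The sketch below encodes only the lifecycle rows and matches events by suffix. The mapping name and the `missing_properties` helper are hypothetical.

```python
REQUIRED_BY_SUFFIX = {
    "_scheduled": {"scheduled_for", "trigger_type"},
    "_started": {"source", "entry_point"},
    "_completed": {"duration_ms", "outcome"},
    "_failed": {"error_type", "error_code", "retry_count"},
    "_abandoned": {"last_step", "time_in_flow_ms"},
    "_retried": {"attempt_number", "previous_error"},
}


def missing_properties(event_name: str, properties: dict) -> set[str]:
    """Return required property names absent from the event payload."""
    for suffix, required in REQUIRED_BY_SUFFIX.items():
        if event_name.endswith(suffix):
            return required - set(properties)
    return set()  # no lifecycle rule applies to this event
```

Engagement and business events would extend the mapping in the same way.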

Property Standards

Always Include:

user_id - For user-level analysis
session_id - For session-level analysis
timestamp - Usually automatic
platform - ios / android / web
app_version - For version-based debugging

Context Properties:

source - Where did user come from? (push, deeplink, organic)
entry_point - Which button/link triggered this?
experiment_variant - If A/B testing

Outcome Properties:

duration_ms - How long did it take?
success - true / false
error_type - If failed, what category?
error_message - Human-readable error

Analytics Table Format

## Analytics Events

### Funnel: [Feature Name]

| Event | Trigger | Properties | Notes |
|-------|---------|------------|-------|
| `feature_scheduled` | System schedules action | `user_id`, `scheduled_for`, `trigger_type`, `trigger_context` | Baseline for funnel |
| `feature_sent` | Action delivered | `user_id`, `feature_id`, `delay_from_scheduled_ms` | Delivery rate |
| `feature_opened` | User engages within 1hr | `user_id`, `feature_id`, `time_to_open_ms`, `source` | Open rate |
| `feature_converted` | User completes goal | `user_id`, `feature_id`, `conversion_type`, `time_to_convert_ms` | Conversion rate |
| `feature_failed` | Delivery/action failed | `user_id`, `feature_id`, `error_type`, `error_code`, `retry_count` | Error rate |

### Derived Metrics

| Metric | Calculation | Target |
|--------|-------------|--------|
| Delivery Rate | `feature_sent` / `feature_scheduled` | > 99% |
| Open Rate | `feature_opened` / `feature_sent` | > 30% |
| Conversion Rate | `feature_converted` / `feature_opened` | > 20% |
| Error Rate | `feature_failed` / `feature_scheduled` | < 1% |

Example: Complete Analytics Spec

Feature: PPO (Proactive Personalized Outreach)

## Analytics Events

### Funnel: PPO Lifecycle

| Event | Trigger | Properties | Notes |
|-------|---------|------------|-------|
| `ppo_analyzed` | Conversation ends, analyzer runs | `user_id`, `conversation_id`, `transcript_length` | Entry to funnel |
| `ppo_scheduled` | LLM decides to schedule PPO | `user_id`, `ppo_id`, `trigger_type`, `scheduled_for`, `message_preview` | Extraction rate |
| `ppo_skipped` | LLM decides NOT to schedule | `user_id`, `conversation_id`, `skip_reason` | Understand filtering |
| `ppo_sent` | Push notification delivered | `user_id`, `ppo_id`, `delay_from_scheduled_ms`, `expo_receipt_id` | Delivery rate |
| `ppo_send_failed` | Push delivery failed | `user_id`, `ppo_id`, `error_type`, `error_code`, `retry_count` | Debug delivery |
| `ppo_opened` | App opened within 1hr of send | `user_id`, `ppo_id`, `time_to_open_ms`, `opened_from` | Open rate |
| `ppo_conversation_started` | Conversation started within 1hr | `user_id`, `ppo_id`, `time_to_conversation_ms` | Engagement rate |
| `ppo_used` | PPO injected into Ember context | `user_id`, `ppo_id`, `ppo_count_injected` | Context usage |
| `ppo_expired` | PPO past 24hr grace period | `user_id`, `ppo_id`, `hours_overdue` | Stale PPO rate |

### Segment Properties

All events include:
- `user_id`, `session_id`, `platform`, `app_version`
- `ppo_trigger_type`: `commitment` | `event` | `emotional`
- `user_ppo_count_lifetime`: Total PPOs this user has received

### Derived Metrics

| Metric | Calculation | Target | Alert Threshold |
|--------|-------------|--------|-----------------|
| Extraction Rate | `ppo_scheduled` / `ppo_analyzed` | 15-25% | < 5% or > 50% |
| Delivery Rate | `ppo_sent` / `ppo_scheduled` | > 99% | < 95% |
| Open Rate | `ppo_opened` / `ppo_sent` | > 30% | < 15% |
| Conversation Rate | `ppo_conversation_started` / `ppo_opened` | > 50% | < 25% |
| Context Injection Rate | `ppo_used` / `ppo_opened` | > 95% | < 80% |
| Error Rate | `ppo_send_failed` / `ppo_scheduled` | < 1% | > 5% |
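As a worked example of the derived metrics, the sketch below computes a few of the rates from raw event counts and flags those that cross the alert thresholds in the table. The function names and the sample counts are hypothetical and for illustration only.

```python
def rate(counts: dict, numerator: str, denominator: str) -> float:
    """Ratio of two event counts; 0.0 when the denominator has no events."""
    total = counts.get(denominator, 0)
    return counts.get(numerator, 0) / total if total else 0.0


def breached_alerts(counts: dict) -> list[str]:
    """Names of metrics whose value crosses the alert threshold from the table."""
    extraction = rate(counts, "ppo_scheduled", "ppo_analyzed")
    checks = {
        "Extraction Rate": extraction < 0.05 or extraction > 0.50,
        "Delivery Rate": rate(counts, "ppo_sent", "ppo_scheduled") < 0.95,
        "Open Rate": rate(counts, "ppo_opened", "ppo_sent") < 0.15,
        "Error Rate": rate(counts, "ppo_send_failed", "ppo_scheduled") > 0.05,
    }
    return [name for name, breached in checks.items() if breached]


# Hypothetical event counts, not real data.
sample = {"ppo_analyzed": 1000, "ppo_scheduled": 200, "ppo_sent": 198,
          "ppo_opened": 66, "ppo_send_failed": 2}
```

With these sample counts, the extraction rate is 200 / 1000 = 20% and the delivery rate is 198 / 200 = 99%, so no alert fires.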

Validation Checklist

Before finalizing any spec:

Problem is clearly stated
Scope is well-defined (in-scope and out-of-scope)
Acceptance criteria are discrete, testable user stories (not vague summaries)
ACs grouped by functional area (core flow, edge cases, errors, analytics)
Architecture diagram included (for multi-component work)
User flow diagram included (for user-facing features)
Test plan specifies E2E/Integration/Regression tests to write
Analytics events defined with triggers and properties
Dependencies are identified
Success metrics are measurable
No ambiguous language ("should", "might", "could", "maybe")
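The ambiguous-language item is the easiest one to automate. The sketch below scans ACs for the words named in the `no_ambiguous_language` rule. The function name is hypothetical, and a plain word match will also flag legitimate uses, so the output is a prompt for review rather than a verdict.

```python
import re

AMBIGUOUS = re.compile(r"\b(should|might|could|maybe)\b", re.IGNORECASE)


def find_ambiguous_terms(acceptance_criteria: list[str]) -> list[tuple[int, str]]:
    """Return (1-based AC number, matched word) for each ambiguous term found."""
    hits = []
    for number, text in enumerate(acceptance_criteria, start=1):
        for match in AMBIGUOUS.finditer(text):
            hits.append((number, match.group(0).lower()))
    return hits
```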

Quick Reference

Spec Type	Linear Entity	Key Sections
Epic	Project	Goals, Stories, Metrics
Issue	Issue	Problem, AC, Technical Notes
Bug	Issue + Bug label	Repro Steps, Expected/Actual
Spike	Issue + Spike label	Question, Time-box, Artifacts
PRD	Project + Doc	Requirements, User Journey, Metrics

Sparky Integration

This skill optionally integrates with Sparky, a persistent AI agent with long-term memory.

When Sparky is Consulted:

After gathering initial context from the user
Before generating the final spec

What Sparky Provides:

Product/design/user research perspective
Historical context and prior decisions
The 3-5 most important questions the human needs to answer
Red flags or considerations to address

Flow with Sparky:

User provides initial context
Claude asks if user wants Sparky's input
If yes, Claude sends context to Sparky
Sparky returns insights + key questions
Claude presents questions to user
User answers the questions
Claude generates spec with all context

Dependency:

Requires the sparky skill at .claude/skills/sparky/
Uses .claude/skills/sparky/scripts/ask_sparky.sh

Including Sparky's Input in Specs:

If Sparky was consulted, add a "Research Context" section:

## Research Context
_Insights from product/design review:_
- [Key insight 1]
- [Key insight 2]

_Key questions addressed:_
- Q: [Question from Sparky]
  A: [User's answer]