spec

Write specifications for Pangea/pelago-aiml. Creates Epics, Issues, Bugs, Spikes, or PRDs with Linear integration. Uses /research_codebase for deep codebase analysis. Validates specs for ACs, diagrams, analytics, and test plans.

By quitgenius, updated 1/30/2026

Spec Writer (Pangea)

Write well-structured specifications through guided conversation. Supports Epics, Issues/Stories, Bugs, Spikes, and PRDs.

When this skill is invoked, follow the flows below. They are carefully crafted.

Project Integration:

  • Uses /research_codebase command for codebase analysis
  • Stores research in docs/thoughts/shared/research/
  • Integrates with Pangea team workflows (Linear, Sparky, Amplitude)

Modes

| Command | Action |
|---------|--------|
| /spec | Show options: quick from context, fresh, or analyze existing |
| /spec PAN-174 | Analyze & update existing ticket (skip to Step 0b) |
| /spec PAN-174 --validate | Validate only, show report |
| /spec PAN-174 --rewrite | Discard current, start fresh |
| /spec --quick | Skip options, go straight to quick mode from context |
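
The mode table can be read as a small dispatch rule. The sketch below is illustrative only: `parse_spec_command`, `SpecInvocation`, and the ticket-ID pattern are hypothetical names and assumptions, not part of the skill itself.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed ticket-ID shape (e.g. PAN-174); the real pattern may differ.
TICKET_RE = re.compile(r"^[A-Z]+-\d+$")


@dataclass
class SpecInvocation:
    mode: str  # "menu", "quick", "analyze", "validate", or "rewrite"
    ticket_id: Optional[str] = None


def parse_spec_command(command: str) -> SpecInvocation:
    """Map a /spec invocation to one of the modes in the table above."""
    tokens = command.split()
    if not tokens or tokens[0] != "/spec":
        raise ValueError(f"not a /spec command: {command!r}")
    args = tokens[1:]
    ticket = next((a for a in args if TICKET_RE.match(a)), None)
    flags = {a for a in args if a.startswith("--")}
    if ticket is None:
        # No ticket: either jump to quick mode or show the option menu.
        return SpecInvocation("quick" if "--quick" in flags else "menu")
    if "--validate" in flags:
        return SpecInvocation("validate", ticket)
    if "--rewrite" in flags:
        return SpecInvocation("rewrite", ticket)
    return SpecInvocation("analyze", ticket)
```

For example, `parse_spec_command("/spec PAN-174 --validate")` yields the validate-only mode for PAN-174, while a bare `/spec` falls through to the option menu in Step 0a.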

Workflow

Step 0: Check MCP Availability

Quick check - if MCPs are connected, proceed. No API calls needed.

Check for MCP tools in current session:

  • mcp__letta__* tools available → Sparky ✓
  • mcp__linear-server__* tools available → Linear ✓
  • mcp__exa__* tools available → Exa ✓
  • Sparky skill exists at .claude/skills/sparky/ → Sparky script ✓

If any are missing, brief notice:

Available integrations: [Sparky ✓] [Linear ✓] [Exa ✓]

Missing: [list any missing]
→ I can still write specs, just can't [consult Sparky / create Linear tickets]

Proceed? (Y/n)

Degraded Mode:

  • Sparky unavailable → Skip Step 3, don't offer consultation
  • Linear unavailable → Skip Step 7, output markdown only
  • Exa unavailable → Skip Step 3b, no best practices search
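
As a rough sketch, the degraded-mode rules amount to a lookup from missing integration to skipped step. The function and dictionary names here are hypothetical; only the integration-to-step mapping comes from the list above.

```python
# Steps skipped when an integration is missing, per the Degraded Mode list.
SKIPPED_STEPS = {
    "Sparky": ["Step 3"],
    "Linear": ["Step 7"],
    "Exa": ["Step 3b"],
}


def plan_degraded_mode(available: set[str]) -> dict[str, list[str]]:
    """Return which integrations are missing and which steps to skip."""
    missing = [name for name in SKIPPED_STEPS if name not in available]
    skipped = [step for name in missing for step in SKIPPED_STEPS[name]]
    return {"missing": missing, "skipped_steps": skipped}
```

With only Linear connected, the plan lists Sparky and Exa as missing and skips Step 3 and Step 3b, so the flow still produces a spec and a ticket.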

Step 0a: Present Mode Options

Always present these three options after showing integrations:

Available integrations: [Sparky ✓/✗] [Linear ✓/✗] [Exa ✓/✗]

A) Create from conversation or plan - extract context, generate spec
B) Start fresh - new interactive flow
C) Analyze existing ticket - enter ticket ID (e.g., PAN-174)

Option A (Quick Mode):

  • Extract context from recent conversation or existing plan
  • Identify: problem, solution approach, scope, technical details
  • Infer spec type from context
  • Skip to Step 4 (Generate Spec) with extracted context
  • Still run validation (Step 6c)
  • Show abbreviated review before creating ticket

Option B (Fresh Start):

  • Continue to Step 1 (normal interactive flow)

Option C (Existing Ticket):

  • Prompt for ticket ID if not provided
  • Go to Step 0b (fetch and analyze existing ticket)

Context Extraction for Quick Mode:

When user chooses A, extract from conversation:

  • Problem/goal being solved
  • Proposed solution or approach
  • Technical details (files, functions, architecture)
  • Scope boundaries mentioned
  • Any acceptance criteria discussed

Generate spec directly, show for review:

Based on our conversation, here's the spec:

[GENERATED SPEC]

┌─────────────────────────────────────────────────────────────────┐
│ ✓ Validation: PASS (N warnings)                                │
├─────────────────────────────────────────────────────────────────┤
│ A) Create ticket in Linear                                      │
│ B) Edit something first                                         │
│ C) Go full interactive mode                                     │
└─────────────────────────────────────────────────────────────────┘

Step 0b: Detect Existing Ticket Mode

Check if user provided a ticket ID (e.g., /spec PAN-174):

If a ticket ID is provided:

  1. Fetch ticket via mcp__linear-server__get_issue with includeRelations: true
  2. Extract: title, description, type, status, labels, parent, children
  3. Detect spec type from labels/content (Issue, Bug, Epic, Spike, PRD)
  4. Go to Step 0c (Existing Ticket Analysis)

If no ticket ID:

  • Continue to Step 1 (normal create flow)

Step 0c: Existing Ticket Analysis

ALL existing-ticket analysis must go through this flow. Do not just vibe check it. Use this exact validation check. This is our core ticket spec schema that all tickets must follow.

Run validation checks on the current ticket content:

┌─────────────────────────────────────────────────────────────────┐
│ TICKET ANALYSIS: [TICKET-ID]                                    │
├─────────────────────────────────────────────────────────────────┤
│ Title: [title]                                                  │
│ Type: [detected type]                                           │
│ Status: [status]                                                │
│ URL: [linear URL]                                               │
├─────────────────────────────────────────────────────────────────┤
│ VALIDATION:                                                     │
│ [✓/✗/⚠] Problem statement: [status]                            │
│ [✓/✗/⚠] Acceptance criteria: [count] found, [status]           │
│ [✓/✗/⚠] Scope defined: [status]                                │
│ [✓/✗/⚠] Architecture diagram: [present/missing]                │
│ [✓/✗/⚠] User flow diagram: [present/missing]               │
│ [✓/✗/⚠] Test plan: [present/missing]                           │
│ [✓/✗/⚠] Analytics events: [present/missing]                    │
│ [✓/✗/⚠] Implementation hints: [status]                         │
│ [✓/✗/⚠] Ambiguous language: [status]                           │
│ [✓/✗/⚠] AC quality: [discrete stories / vague summary]         │
├─────────────────────────────────────────────────────────────────┤
│ Result: [PASS/FAIL] ([N] blocking, [M] warnings)                │
└─────────────────────────────────────────────────────────────────┘

Validation Checks:

| Check | Rule | Severity |
|-------|------|----------|
| problem_statement | "Problem" or "Bug Report" section exists, >50 chars | BLOCK |
| acceptance_criteria | At least 3 ACs, preferably in Gherkin format | BLOCK |
| scope_defined | Has "Goals"/"In Scope" AND "Non-Goals"/"Out of Scope" | BLOCK |
| architecture_diagram | Has ASCII diagram for multi-component work | BLOCK (Large/Epic/PRD) |
| user_flow_diagram | Has user flow diagram for all features that impact member experience in any way | BLOCK |
| test_plan | Lists E2E/Integration/Regression tests to write | BLOCK (Issue/Bug) |
| analytics_events | Has Analytics/Amplitude section with event definitions | BLOCK (Issue/Epic) |
| ac_quality | ACs are discrete testable stories, not vague summaries | BLOCK |
| no_ambiguous_language | No "should", "might", "could", "maybe" in ACs | WARN |
| tldr_present | TL;DR or summary section exists | WARN |
| implementation_hints | Has files/functions to change (Issues/Bugs only) | WARN |
| gherkin_format | ACs use Given/When/Then format | WARN |
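
A few of these checks are mechanical enough to sketch in code. The following is a minimal illustration, not the skill's actual validator: it covers only four checks, assumes the spec uses `## ` section headings and `**ACn:` markers as in the templates, and every function name is hypothetical. Checks such as ac_quality still need judgment.

```python
import re

AMBIGUOUS = re.compile(r"\b(should|might|could|maybe)\b", re.IGNORECASE)


def section(spec: str, *titles: str) -> str:
    """Return the body of the first '## <title...>' section found, else ''."""
    for title in titles:
        pattern = rf"^##\s+{re.escape(title)}[^\n]*\n(.*?)(?=^##\s|\Z)"
        match = re.search(pattern, spec, re.M | re.S)
        if match:
            return match.group(1).strip()
    return ""


def validate_spec(spec: str) -> dict[str, str]:
    results = {}
    # problem_statement: section exists and is longer than 50 characters.
    problem = section(spec, "Problem", "Bug Report")
    results["problem_statement"] = "PASS" if len(problem) > 50 else "BLOCK"
    # acceptance_criteria: at least 3 ACs (assumes "**AC1: ..." markers).
    acs = re.findall(r"^\*\*AC\d+:", spec, re.M)
    results["acceptance_criteria"] = "PASS" if len(acs) >= 3 else "BLOCK"
    # scope_defined: both an in-scope and an out-of-scope section.
    in_scope = section(spec, "Goals", "In Scope")
    out_scope = section(spec, "Non-Goals", "Out of Scope")
    results["scope_defined"] = "PASS" if in_scope and out_scope else "BLOCK"
    # no_ambiguous_language: scan only Gherkin step lines.
    steps = re.findall(r"^(?:Given|When|Then|And|But)\b.*$", spec, re.M)
    ambiguous = any(AMBIGUOUS.search(line) for line in steps)
    results["no_ambiguous_language"] = "WARN" if ambiguous else "PASS"
    blocking = [name for name, status in results.items() if status == "BLOCK"]
    results["result"] = "FAIL" if blocking else "PASS"
    return results
```

Warnings do not change the overall result here, which matches the report format above where only blocking checks produce a FAIL.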

If --validate flag: Show report and stop. Do not offer to update.

If --rewrite flag: Skip to Step 1, but pre-fill context from existing ticket.

Otherwise, present options:

What would you like to do?

A) Fix issues interactively - I'll ask questions to fill gaps
B) Rewrite entire spec - start fresh, keeping context
C) View current description - show full content
D) Update specific section - choose what to change

Option A (Fix interactively):

  • Go to Step 0d (Gap-Filling Mode)

Option B (Rewrite):

  • Store existing context (title, any valid sections)
  • Go to Step 1 with pre-filled context
  • Mark as UPDATE mode for Step 7

Option C (View):

  • Display full ticket description
  • Return to options

Option D (Update section):

  • Show list of sections
  • Allow targeted edits
  • Go to Step 7 (update mode)

Step 0d: Gap-Filling Mode

Only ask about what's missing. Do not re-ask about sections that pass validation.

The ticket is missing some key information. Let me ask about the gaps:

[Only show questions for FAILED or WARN checks]

Gap Questions by Check:

| Failed Check | Questions to Ask |
|--------------|------------------|
| problem_statement | "What problem does this solve? Who experiences it?" |
| acceptance_criteria | "What are the key acceptance criteria? (list 3-5)" |
| scope_defined | "What's explicitly OUT of scope for this work?" |
| architecture_diagram | "Can you describe the component interactions? I'll create a diagram." |
| user_flow_diagram | "What's the user journey? I'll create a flow diagram." |
| test_plan | "What E2E, Integration, or Regression tests should be written?" |
| analytics_events | "What events should be tracked? (e.g., feature_used, error_occurred)" |
| ac_quality | "Let's break down the ACs into discrete, testable user stories." |
| implementation_hints | "Which files/functions will likely need changes?" |

After gathering gaps:

  • Merge new content with existing valid sections
  • Generate updated spec
  • Go to Step 5 (Review) then Step 7 (Update mode)

Step 1: Identify Spec Type & Concept

┌─────────────────────────────────────────────────────────────────┐
│                    STEP 1: SPEC TYPE                            │
└─────────────────────────────────────────────────────────────────┘

What are we building?

┌────────────────┬──────────────────────────────────────────────────────────┐
│      Type      │                       Description                        │
├────────────────┼──────────────────────────────────────────────────────────┤
│ A) Epic        │ Large feature or initiative spanning multiple PRs/issues │
│                │ Example: "Implement voice conversation analytics system" │
│                │ Example: "Add multi-language support across the app"     │
├────────────────┼──────────────────────────────────────────────────────────┤
│ B) Issue/Story │ Single deliverable, one focused piece of work            │
│                │ Example: "Add push notification scheduling for outreach" │
│                │ Example: "Create settings screen with dark mode toggle"  │
├────────────────┼──────────────────────────────────────────────────────────┤
│ C) Bug         │ Defect report with reproduction steps                    │
│                │ Example: "App crashes when tapping back button on iOS"   │
│                │ Example: "Audio cuts out after 30 seconds of silence"    │
├────────────────┼──────────────────────────────────────────────────────────┤
│ D) Spike       │ Research/investigation with time-box                     │
│                │ Example: "Evaluate LiveKit vs Daily for WebRTC"          │
│                │ Example: "Investigate memory leak in conversation view"  │
├────────────────┼──────────────────────────────────────────────────────────┤
│ E) PRD         │ Full Product Requirements Document for major features    │
│                │ Example: "Proactive outreach system - full product spec" │
│                │ Example: "Member onboarding flow redesign"               │
└────────────────┴──────────────────────────────────────────────────────────┘

Briefly describe what you're trying to do:

(e.g., "Add push notification scheduling for proactive outreach"
 or "Fix crash when user taps back button on iOS")

Wait for user to provide spec type + basic concept.

Step 1b: Select Research

Once you know what they're building, ask what research to run:

Got it: [SPEC_TYPE] for "[BRIEF_CONCEPT]"

What research should I run? (e.g. "1,2,3 M" or "A" for all)

1) Codebase - scan related files, patterns
2) Linear - existing tickets, projects
3) Sparky - product/design context
4) Git - recent commits, branches
5) Docs - README, ADRs, specs
6) Exa - search web for best practices, examples
7) Other - [describe what you want researched]
A) All - run everything
N) None - skip research

Depth: Q (quick) / M (medium) / D (deep)
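
The shorthand reply (for example "1,2,3 M") can be interpreted roughly as follows. This is a sketch under assumptions: the default depth of M, the choice to leave "Other" out of "A" (it needs a description), and the function name are all hypothetical.

```python
SOURCES = {
    "1": "Codebase", "2": "Linear", "3": "Sparky", "4": "Git",
    "5": "Docs", "6": "Exa", "7": "Other",
}
DEPTHS = {"Q": "quick", "M": "medium", "D": "deep"}


def parse_research_selection(reply: str, default_depth: str = "M") -> tuple[list[str], str]:
    """Turn a reply such as '1,2,3 M' into (sources, depth)."""
    chosen: list[str] = []
    depth = default_depth
    for token in reply.replace(",", " ").upper().split():
        if token in DEPTHS:
            depth = token
        elif token == "A":
            # Assumption: "All" excludes "Other", which needs a description.
            chosen = [name for key, name in SOURCES.items() if key != "7"]
        elif token == "N":
            chosen = []
        elif token in SOURCES:
            if SOURCES[token] not in chosen:
                chosen.append(SOURCES[token])
        else:
            raise ValueError(f"unrecognized selection: {token!r}")
    return chosen, DEPTHS[depth]
```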

Step 1c: Run Parallel Research

CRITICAL: Launch ALL research in a SINGLE message with multiple Task tool calls.

Do NOT call Bash directly for Sparky. Do NOT run research sequentially.

For Deep Codebase Research: If the user needs comprehensive codebase understanding, recommend they run /research_codebase first, then return to /spec. The research document will provide thorough context.

For Quick Spec Research: Use the parallel Task approach below.

Send ONE message containing multiple Task tool invocations. Example:

[Single message with 6 Task tool calls - all launch simultaneously]

Task 1: Codebase (subagent_type=general-purpose, invokes /research_codebase)
Task 2: Linear (subagent_type=general-purpose)
Task 3: Sparky (subagent_type=general-purpose)
Task 4: Git (subagent_type=Bash)
Task 5: Docs (subagent_type=Explore)
Task 6: Exa (subagent_type=general-purpose)

Each Task returns max 500 tokens. All run simultaneously.

Task Prompts (subagent_type for each):

| Research | subagent_type | Prompt |
|----------|---------------|--------|
| Codebase | general-purpose | Invoke the /research_codebase command with the topic. The command uses specialized agents (codebase-locator, codebase-analyzer, codebase-pattern-finder) to find files, understand code, and identify patterns. Return summary of findings relevant to the spec. |
| Linear | general-purpose | Search Linear for [TOPIC]: related tickets, team, project. Also review all open and pending tickets, as other in-flight work might impact this topic. |
| Sparky | general-purpose | [FROM SPEC-MAKER] + product context, key questions |
| Git | Bash | Git analysis for [TOPIC]: recent commits, active branches |
| Docs | Explore | Find docs for [TOPIC]: README, ADRs, thoughts/ directory |
| Exa | general-purpose | Web search for [TOPIC]: best practices, pitfalls (ALWAYS run) |

Codebase Research Details:

When invoking codebase research, the agent should:

  1. Use the /research_codebase command pattern with specialized sub-agents:
    • codebase-locator: Find WHERE files and components live
    • codebase-analyzer: Understand HOW specific code works
    • codebase-pattern-finder: Find examples of existing patterns
  2. Focus on documenting what EXISTS (not suggesting improvements)
  3. Return:
    • Related files with paths
    • Existing patterns to follow
    • Integration points
    • Dependencies

Important for Sparky:

  • Use Task tool with subagent_type: general-purpose
  • The subagent calls: bash .claude/skills/sparky/scripts/ask_sparky.sh "prompt"
  • Do NOT call Bash directly from main conversation (blocks parallel execution)
  • ALL messages to Sparky MUST begin with [FROM SPEC-MAKER]

Important for Exa:

  • ALWAYS include Exa in research (unless user explicitly says "none")
  • Search for: "[TOPIC] best practices", "[TOPIC] common pitfalls", "[TOPIC] architecture patterns"
  • Include results in research summary under "Best Practices"

Context Window Management:

  • Each subagent returns max 500 tokens
  • Total research context: max 3500 tokens (7 sources × 500)
  • If a source returns nothing useful, omit from summary
  • Present combined research summary to user before questioning

Research Summary Output:

📊 Research Complete

**Codebase:**
- Tech: [stack]
- Related files: [list]
- Patterns: [conventions]

**Linear:**
- Related tickets: [list with IDs]
- Team: [team name]
- Current epic: [if any]

**Sparky's Input:**
- Key context: [summary]
- (Questions moved to Step 2 below)

**Git:**
- Recent activity: [summary]
- Active branches: [list]

**Docs:**
- Relevant docs: [list]

**Exa (Best Practices):**
- Patterns: [list]
- Pitfalls: [list]
- Sources: [URLs]

Ready to gather specifics. [Proceed to Step 2]

Step 2: Gather Initial Context

Convert research findings into lettered questions.

Do NOT list Sparky's questions separately, then ask different questions. Instead, take Sparky's questions and format them WITH options.

Building the question batch:

  1. Take Sparky's key questions (if any)
  2. Add technical questions from codebase findings
  3. Format each with 3-4 lettered options based on common patterns
  4. Present as ONE unified batch

Example - Converting Sparky's questions:

Sparky asked: "Who are the primary annotators?"

Convert to:

1. Who are the primary users?
   A) Just you / small eng team (< 5 people)
   B) Mixed team (eng + clinicians/researchers)
   C) Broader org access needed
   D) Other: [describe]

Question Format:

  • 3-4 options per question with letters (A, B, C, D)
  • Always include D) Other for custom input
  • User responds with shorthand: "1A, 2C, 3D"
  • Ask in batches of 3-5 questions, wait for response
  • All questions in ONE consistent format
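
Shorthand replies such as "1A, 2C, 3D" are easy to parse mechanically. The sketch below is illustrative, with a hypothetical function name; it handles only the number-plus-letter form, so free-text custom answers would need separate handling.

```python
import re


def parse_answers(reply: str) -> dict[int, str]:
    """Parse '1A, 2C, 3D' into {1: 'A', 2: 'C', 3: 'D'}."""
    pairs = re.findall(r"(\d+)\s*([A-Da-d])\b", reply)
    return {int(number): letter.upper() for number, letter in pairs}
```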

Example unified batch:

Based on the research, I need to clarify a few things.
Answer with shorthand like "1B, 2A, 3C" or provide custom answers.

1. What type of user problem is this?
   A) UX friction - users can do it but it's painful
   B) Missing functionality - users can't do something they need
   C) Performance issue - too slow or resource-intensive
   D) Other: [describe]

2. How urgent is this?
   A) Blocking - users are stuck, no workaround
   B) High - significant pain, workaround exists
   C) Medium - nice to have, not urgent
   D) Low - minor improvement

3. Is there existing code/systems this touches?
   A) New feature - mostly greenfield
   B) Modification - changing existing behavior
   C) Integration - connecting existing pieces
   D) Refactor - restructuring without behavior change

Key Questions by Type (adapt with lettered options):

Epic

  1. What problem does this solve? (user pain / business need / technical debt)
  2. Who are the primary users/stakeholders?
  3. What does success look like? (metrics)
  4. What's in scope vs explicitly out of scope?
  5. What are the key user stories? (list 3-5)
  6. Dependencies on other teams/systems?
  7. Rough scope? (small/medium/large)

Issue/Story

  1. What user problem does this address?
  2. How urgent/important is this?
  3. What are the acceptance criteria? (list key ones)
  4. Are there technical constraints?
  5. Dependencies or blockers?
  6. Any design/UX requirements?

Bug

  1. What's the expected vs actual behavior?
  2. Steps to reproduce? (numbered list)
  3. Environment details (OS, browser, version)?
  4. Severity? (critical/high/medium/low)
  5. Impact - how many users affected?
  6. Any workarounds known?

Spike

  1. What question are we trying to answer?
  2. What's the time-box? (hours/days)
  3. What artifacts will be produced?
  4. What decision will this inform?
  5. What options are we evaluating?

PRD

  1. What problem are we solving?
  2. Who is the target user?
  3. What are the goals? (list 2-3)
  4. What are the non-goals? (explicitly out of scope)
  5. What does the user journey look like?
  6. Technical constraints or dependencies?
  7. How will we measure success?

Iteration Pattern:

Continue asking questions until you have:

  • Clear problem statement
  • Defined scope (in/out)
  • Acceptance criteria (at least 3)
  • Technical approach (if applicable)
  • Dependencies identified

Then proceed to Step 3 (Sparky) or Step 4 (Generate Spec).

Step 3: Consult Sparky (Optional)

Skip this step if Sparky was marked unavailable in Step 0.

Once you have sufficient context about what the user is trying to build, ask if they want Sparky's input:

I have enough context to start drafting. Would you like me to consult Sparky first?

Sparky can provide product/design/user research perspective and identify
the most important questions you should answer before finalizing the spec.

A) Yes, consult Sparky
B) No, proceed to draft

If the user chooses Yes:

  1. Summarize the context gathered so far
  2. Call Sparky with a structured prompt:
    bash .claude/skills/sparky/scripts/ask_sparky.sh "PROMPT"
    

Prompt Template for Sparky:

[FROM SPEC-MAKER] We're writing a [SPEC_TYPE] for: [BRIEF_DESCRIPTION]

Context gathered so far:
- Problem: [PROBLEM]
- Users: [TARGET_USERS]
- Scope: [IN_SCOPE / OUT_OF_SCOPE]
[OTHER RELEVANT CONTEXT]

Based on your product/design/user research knowledge:
1. What context or insights should inform this spec?
2. What are the 3-5 most important questions the human needs to answer before we finalize?
3. Any red flags or considerations we should address?
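
One way to keep the required [FROM SPEC-MAKER] prefix from being dropped is to build the prompt programmatically. This is a sketch only: the helper names are hypothetical, and the command list simply mirrors the `ask_sparky.sh` invocation shown above without running it.

```python
PREFIX = "[FROM SPEC-MAKER]"
SPARKY_SCRIPT = ".claude/skills/sparky/scripts/ask_sparky.sh"


def build_sparky_prompt(spec_type, description, problem, users, scope, extra_context=()):
    """Fill the prompt template; extra_context is a sequence of extra lines."""
    lines = [
        f"{PREFIX} We're writing a {spec_type} for: {description}",
        "",
        "Context gathered so far:",
        f"- Problem: {problem}",
        f"- Users: {users}",
        f"- Scope: {scope}",
        *extra_context,
        "",
        "Based on your product/design/user research knowledge:",
        "1. What context or insights should inform this spec?",
        "2. What are the 3-5 most important questions the human needs to answer before we finalize?",
        "3. Any red flags or considerations we should address?",
    ]
    return "\n".join(lines)


def sparky_command(prompt: str) -> list[str]:
    """Return the argv for the Sparky script, refusing unprefixed prompts."""
    if not prompt.startswith(PREFIX):
        raise ValueError("messages to Sparky must begin with [FROM SPEC-MAKER]")
    return ["bash", SPARKY_SCRIPT, prompt]
```

Passing the prompt as a single list element avoids shell-quoting problems when the prompt spans several lines.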

Using Sparky's Response:

  1. Present Sparky's insights to the user
  2. Ask the user to answer Sparky's key questions
  3. Incorporate both Sparky's context and the user's answers into the spec

Example Sparky Response:

Context: We've seen users struggle with [X] in the past. The design team
explored [Y] approach last quarter.

Key questions to answer:
1. How does this interact with the existing [feature]?
2. What's the rollout strategy - all users or phased?
3. Have we validated this solves the actual user pain point?
4. What's the fallback if [edge case] happens?

Red flags: Make sure to consider [Z] which caused issues before.

Step 3b: Search Best Practices (Exa)

Before generating the spec, search for industry best practices using Exa:

Searching for best practices related to: [TOPIC]...

Use mcp__exa__web_search_exa to find:

  • Implementation patterns for similar features
  • Common pitfalls and how to avoid them
  • Security considerations
  • Performance best practices
  • UX patterns (if applicable)

Example Exa queries:

| Spec Topic | Search Query |
|------------|--------------|
| Auth system | "OAuth 2.0 implementation best practices 2024" |
| API design | "REST API design patterns error handling" |
| Database migration | "zero downtime database migration patterns" |
| Caching | "caching strategies cache invalidation best practices" |
| File upload | "secure file upload validation best practices" |

Exa Search Call:

mcp__exa__web_search_exa:
  query: "[TOPIC] implementation best practices"
  numResults: 5

Extract and summarize (max 300 tokens):

📚 Best Practices Found:

**Implementation Patterns:**
- [Pattern 1]: [brief description]
- [Pattern 2]: [brief description]

**Common Pitfalls:**
- [Pitfall 1]: [how to avoid]
- [Pitfall 2]: [how to avoid]

**Security Considerations:**
- [Item 1]
- [Item 2]

Sources: [list URLs for references section]

Incorporate into spec:

  • Add relevant patterns to Technical Notes section
  • Include pitfalls in Edge Cases
  • Add security items to Non-Functional Requirements
  • Link sources in References section

Step 4: Generate Spec

When you have enough context (including Sparky's input if consulted), generate the complete spec in markdown format.

Output Requirements:

  • Use proper markdown headers, lists, and formatting
  • Include all relevant sections for the spec type
  • Be specific and actionable
  • Include acceptance criteria where applicable

Spec Templates:

Epic Template

# Epic: [Title]

## TL;DR
> **What:** [One sentence describing the initiative]
> **Why:** [Business value / user impact]
> **Scope:** [X] issues across [Y] phases
> **Teams:** [Teams involved]

## Problem Statement
[Clear description of the problem being solved]

## Goals
- [Goal 1]
- [Goal 2]

## Non-Goals
- [Explicitly out of scope]

## Architecture
[High-level architecture description]

┌─────────────┐       ┌──────────────┐
│ Component A │──────▶│ Component B  │
└─────────────┘       └──────────────┘


## Acceptance Criteria (BDD)

<!--
IMPORTANT: Do NOT write vague summary ACs like "Given the epic is complete, Then everything works."
Each AC must be a discrete, testable user story. Group by functional area.
Aim for 10-20 ACs for an Epic, covering: core flows, edge cases, error handling, analytics.
-->

### [Functional Area 1]

**AC1: [Specific scenario name]**
```gherkin
Given [specific precondition]
When [specific action]
Then [specific observable outcome]
And [additional verifiable outcome]
```

**AC2: [Another specific scenario]**
```gherkin
Given [specific precondition]
When [specific action]
Then [specific observable outcome]
```

### [Functional Area 2]

**AC3: [Specific scenario name]**
```gherkin
Given [specific precondition]
When [specific action]
Then [specific observable outcome]
```

### Edge Cases & Error Handling

**AC4: [Edge case name]**
```gherkin
Given [edge case condition]
When [action occurs]
Then [graceful handling]
```

**AC5: [Error scenario]**
```gherkin
Given [error condition]
When [action fails]
Then [system recovers gracefully]
And [appropriate logging/alerting occurs]
```

## Analytics Events

| Event | Trigger | Properties |
|-------|---------|------------|
| [feature]_[action] | [When this fires] | user_id, [key_prop], [key_prop] |
| [feature]_error | [When error occurs] | user_id, error_type, error_message |
| [feature]_success | [When flow completes] | user_id, duration_ms, [outcome] |

## Sub-Issues / Phases

### Phase 1: [Name]

| Issue | Description | BDD Summary |
|-------|-------------|-------------|
| [Issue 1] | [Brief] | Given X, When Y, Then Z |
| [Issue 2] | [Brief] | Given X, When Y, Then Z |

### Phase 2: [Name]

| Issue | Description | BDD Summary |
|-------|-------------|-------------|
| [Issue 3] | [Brief] | Given X, When Y, Then Z |
| [Issue 4] | [Brief] | Given X, When Y, Then Z |

## Success Metrics

| Metric | Current | Target | How to Measure |
|--------|---------|--------|----------------|
| [Metric 1] | [Baseline] | [Goal] | [Method] |

## Dependencies
- [Team/System]: [What's needed]

## Open Questions
- [Question 1] - Owner: [Name]
- [Question 2] - Owner: [Name]

Issue/Story Template

# [Issue Title]

## TL;DR
> **What:** [One sentence describing the change]
> **Why:** [One sentence on the value/impact]
> **How:** [One sentence on approach]
> **Scope:** [Small/Medium/Large] | **Risk:** [Low/Medium/High]

## Problem
[What problem does this solve?]

## Solution Approach
[Proposed approach]

## Architecture
[If applicable - component interactions, data flow]

[ASCII diagram if helpful]


## User Flow
[If applicable - user-impacting feature]

[User journey diagram if helpful]


## Acceptance Criteria (BDD)

### Core Functionality

**AC1: [Name]**
```gherkin
Given [precondition/context]
When [action/trigger]
Then [expected outcome]
And [additional outcome if needed]
```

**AC2: [Name]**
```gherkin
Given [precondition/context]
When [action/trigger]
Then [expected outcome]
```

### Edge Cases & Error Handling

**AC3: [Edge case name]**
```gherkin
Given [edge case condition]
When [action/trigger]
Then [graceful handling]
And [user feedback if applicable]
```

**AC4: [Error scenario]**
```gherkin
Given [error condition]
When [action fails]
Then [error is handled gracefully]
And [appropriate error message shown]
```

## Implementation Hints

**Files likely to change:**
- path/to/file1.ts - [what changes]
- path/to/file2.ts - [what changes]

**Key functions/classes:**
- functionName() - [modification needed]
- ClassName - [modification needed]

**Patterns to follow:**
- [Existing pattern in codebase to match]

## Test Plan

**E2E Tests:**
- [Test description - user journey covered]

**Integration Tests:**
- [Test description - components/APIs tested]

**Regression Tests:**
- [Test description - existing behavior preserved]

## Analytics Events

| Event | Trigger | Properties |
|-------|---------|------------|
| [feature]_started | [When user begins flow] | user_id, [context] |
| [feature]_completed | [When flow succeeds] | user_id, duration_ms |
| [feature]_failed | [When flow fails] | user_id, error_type |

## Technical Notes
[Implementation details, constraints]

## Non-Functional Requirements
- **Performance**: [If applicable]
- **Security**: [If applicable]

## Dependencies
- [Dependencies if any]

## Out of Scope
- [What this does NOT include]

## References
- [Link to related docs/tickets]

Bug Template

# Bug: [Title]

## TL;DR
> **Bug:** [One sentence description]
> **Impact:** [Who/what is affected]
> **Severity:** [Critical/High/Medium/Low]
> **Workaround:** [Yes/No] - [brief if yes]

## Bug Report (BDD)
```gherkin
Given [the preconditions/setup]
When [the action that triggers the bug]
Then [actual incorrect behavior]
But [expected correct behavior]
```

## Steps to Reproduce
1. [Step 1]
2. [Step 2]
3. [Step 3]

## Environment
- **OS**: [Operating system]
- **Browser/Version**: [If applicable]
- **App Version**: [Version number]

## Impact
[Who is affected and how many users]

## Workaround
[Any known workarounds, or "None"]

## Fix Acceptance Criteria (BDD)

**AC1: Bug is fixed**
```gherkin
Given [same preconditions as bug report]
When [same action that triggered bug]
Then [correct expected behavior]
And [no regression in related functionality]
```

**AC2: Regression test added**
```gherkin
Given the fix is implemented
Then a test exists that would catch this bug
And the test is included in CI pipeline
```

## Implementation Hints

**Likely root cause:**
- [Hypothesis about what's wrong]

**Files to investigate:**
- path/to/file.ts - [why]

**Related code:**
- [Function/class that likely contains bug]

## Test Plan

**Regression Tests:**
- [Test that would have caught this bug]
- [Test for related edge cases]

**Integration Tests:**
- [Test for component interactions if applicable]

## Screenshots/Logs
[Attach if available]


Spike Template

# Spike: [Title]

## TL;DR
> **Question:** [Primary question to answer]
> **Time-box:** [Duration]
> **Output:** [What artifact will be produced]
> **Decision:** [What this will help decide]

## Research Questions
1. [Primary question this spike will answer]
2. [Secondary question]
3. [Tertiary question if applicable]

## Background
[Context on why this investigation is needed]

## Time-box
[Duration: e.g., 2 days]

## Approach
1. [Investigation step 1]
2. [Investigation step 2]
3. [Investigation step 3]

## Options to Evaluate
| Option | Pros | Cons | Effort |
|--------|------|------|--------|
| [Option A] | [Pros] | [Cons] | [S/M/L] |
| [Option B] | [Pros] | [Cons] | [S/M/L] |
| [Option C] | [Pros] | [Cons] | [S/M/L] |

## Decision Criteria
How we'll evaluate options:
| Criterion | Weight | Notes |
|-----------|--------|-------|
| [Criterion 1] | [High/Med/Low] | [Why important] |
| [Criterion 2] | [High/Med/Low] | [Why important] |

## Spike Complete When (BDD)
```gherkin
Given the spike time-box is complete
Then we have answered: [primary question]
And we have a recommendation with rationale
And we have documented trade-offs
And next steps are defined
```

## Artifacts
- [Decision document / ADR]
- [POC code if applicable]
- [Comparison matrix]

## Decision This Informs
[What decision will be made based on findings]

## References
- [Existing docs, prior art, relevant links]

PRD Template

# PRD: [Feature Name]

## Overview
[2-3 sentence summary]

## Problem Statement
[Detailed problem description]

## Goals
1. [Goal 1]
2. [Goal 2]

## Non-Goals
- [What this will NOT address]

## Target Users
[User personas or segments]

## User Stories
1. As a [user], I want [action] so that [benefit]

## Architecture

[High-level system architecture]

┌─────────────┐       ┌──────────────┐
│ Component A │──────▶│ Component B  │
└─────────────┘       └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │ Component C  │
                      └──────────────┘


## User Flow

         START
           │
           ▼
┌─────────────────────────────┐
│ [Step 1]                    │
└──────────┬──────────────────┘
           │
           ▼
┌─────────────────────────────┐
│ [Step 2]                    │
└──────────┬──────────────────┘
           │
           ▼
          END


## Requirements

### Functional Requirements
- [FR-1]: [Requirement]
- [FR-2]: [Requirement]

### Non-Functional Requirements
- **Performance**: [Criteria]
- **Reliability**: [Criteria]
- **Security**: [Criteria]

## Acceptance Criteria

**Core Functionality:**
- [ ] [Criterion 1]
- [ ] [Criterion 2]

**Edge Cases & Fallback Behavior:**
- [ ] [Edge case 1]
- [ ] [Edge case 2]

## Testing Requirements

**Unit Tests:**
- [ ] [Test 1]

**Integration Tests:**
- [ ] [Test 1]

## Implementation Phases

### Phase 1: [Name]
**Dependencies:** None
- [Deliverable 1]
- [Deliverable 2]

### Phase 2: [Name]
**Dependencies:** Phase 1
- [Deliverable 3]
- [Deliverable 4]

## Success Metrics
| Metric | Current | Target |
|--------|---------|--------|
| [Metric 1] | [Baseline] | [Goal] |

## Open Questions
- [Question 1]

## References
- [Link to related docs]
- [Link to related tickets]

Step 5: Review and Refine

After generating the spec:

  1. Ask if any sections need adjustment
  2. Clarify any ambiguous points
  3. Iterate until the user is satisfied

Step 6: Evaluate Scope & Breakdown

Before creating tickets, analyze the spec and propose the right structure:

Based on the spec, here's my assessment:

**Scope Analysis:**
- Acceptance Criteria count: [X]
- Phases/milestones: [X]
- Estimated PRs: [X]
- Cross-team dependencies: [Yes/No]

**Recommended Structure:**
[One of the options below]

Classification Matrix:

| Indicators | Recommendation |
|------------|----------------|
| < 5 ACs, single PR, no phases | Single Issue |
| 5-10 ACs, 2-3 PRs, phases mentioned | Epic (single ticket with sub-tasks) |
| 10-15 ACs, multiple PRs, clear phases | Epic + Sub-Issues (parent + children) |
| > 15 ACs, multi-team, roadmap item | Project + Epics + Issues (full hierarchy) |
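
The matrix can be sketched as a small lookup. This is a hypothetical helper, not part of the skill: the function name and the `multi_team` flag are assumptions, and because the matrix lists 10 ACs in two rows, the sketch assumes a count of exactly 10 falls in the lower tier.

```python
def recommend_structure(ac_count: int, multi_team: bool = False) -> str:
    """Map scope indicators from the matrix to a recommended Linear structure.

    Assumption: only AC count and a multi-team flag are considered here;
    PR count and phases would be weighed by the person reviewing the spec.
    """
    if multi_team or ac_count > 15:
        return "Project + Epics + Issues"
    if ac_count > 10:
        return "Epic + Sub-Issues"
    if ac_count >= 5:
        return "Epic (single ticket with sub-tasks)"
    return "Single Issue"


if __name__ == "__main__":
    for count in (3, 8, 12, 20):
        print(count, "->", recommend_structure(count))
```

The output is only a starting recommendation; the options are still presented to the user for the final choice.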

Present options to user:

How should we structure this in Linear?

A) Single Issue - create one ticket with full spec
B) Epic with Sub-Issues - create parent epic + [N] child issues
C) Project - create project with [N] epics/issues underneath
D) Multiple separate Issues - create [N] independent tickets
E) Let me decide - just output the markdown

If user chooses B, C, or D (multiple tickets):

  1. Propose the breakdown:

    I'll create the following tickets:
    
    📁 [Epic/Project]: [Title]
    ├── 📋 Issue 1: [Title] - [brief scope]
    ├── 📋 Issue 2: [Title] - [brief scope]
    └── 📋 Issue 3: [Title] - [brief scope]
    
    Does this breakdown look right?
    A) Yes, create all tickets
    B) Adjust the breakdown
    C) Just create the parent, I'll add children later
    
  2. Generate individual specs for each ticket (condensed from main spec)

  3. Create tickets with proper parent-child relationships

Step 6b: Notify Sparky (Automatic)

After spec is finalized, automatically send to Sparky as FYI.

No user prompt needed. No response expected. Just inform Sparky of the decision.

bash .claude/skills/sparky/scripts/ask_sparky.sh "[FROM SPEC-MAKER] FYI - Final spec created:

Title: [SPEC_TITLE]
Type: [Issue/Epic/PRD/etc.]
Summary: [TL;DR from spec]

Key decisions:
- [Decision 1]
- [Decision 2]
- [Decision 3]

This is for your records. No response needed."

Do NOT wait for Sparky's response. Run in background or fire-and-forget.
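
One way to get fire-and-forget behavior is to spawn the script without waiting on it. The sketch below assumes a POSIX shell environment and that `ask_sparky.sh` takes the message as its single argument, as in the command above; the function names are hypothetical.

```python
import subprocess

# Path taken from the command above.
SCRIPT = ".claude/skills/sparky/scripts/ask_sparky.sh"


def build_fyi_message(title: str, spec_type: str, summary: str, decisions: list[str]) -> str:
    """Assemble the FYI text in the same layout as the template above."""
    lines = [
        "[FROM SPEC-MAKER] FYI - Final spec created:",
        "",
        f"Title: {title}",
        f"Type: {spec_type}",
        f"Summary: {summary}",
        "",
        "Key decisions:",
        *[f"- {decision}" for decision in decisions],
        "",
        "This is for your records. No response needed.",
    ]
    return "\n".join(lines)


def notify_sparky(message: str) -> None:
    """Start the script and return immediately, discarding its output."""
    subprocess.Popen(
        ["bash", SCRIPT, message],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the caller's session (POSIX only)
    )
```

Because nothing reads the child's output or exit code, a failed notification is silent, which matches the "no response expected" intent.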

Step 6c: Validation Gate

Before creating or updating any ticket, run validation checks.

This is a HARD GATE. Do not proceed to Step 7 if blocking checks fail.

Run these checks on the generated spec:

┌─────────────────────────────────────────────────────────────────┐
│ SPEC VALIDATION                                                 │
├─────────────────────────────────────────────────────────────────┤
│ [✓/✗] Problem statement: [present/missing] ([N] chars)         │
│ [✓/✗] Acceptance criteria: [N] ACs [in Gherkin/plain format]   │
│ [✓/✗] Scope defined: [goals + non-goals / missing]             │
│ [✓/✗] Architecture diagram: [present/missing]                  │
│ [✓/✗] User flow diagram: [present/missing]                      │
│ [✓/✗] Test plan: [present/missing]                             │
│ [✓/✗] Analytics events: [present/missing]                      │
│ [✓/✗] AC quality: [discrete stories / vague summary]           │
│ [✓/⚠] Ambiguous language: [none found / found on line X]       │
│ [✓/⚠] TL;DR present: [yes/no]                                  │
│ [✓/⚠] Implementation hints: [present/missing]                  │
├─────────────────────────────────────────────────────────────────┤
│ Result: [PASS/FAIL] ([N] blocking, [M] warnings)                │
└─────────────────────────────────────────────────────────────────┘

Validation Rules:

| Check | Rule | Severity | Applies To |
|-------|------|----------|------------|
| problem_statement | Section exists, >50 chars | BLOCK | All |
| acceptance_criteria | At least 3 ACs | BLOCK | Issue, Bug, Epic |
| gherkin_format | ACs use Given/When/Then | WARN | Issue, Bug |
| scope_defined | Has goals AND non-goals | BLOCK | Epic, PRD |
| scope_defined | Has "Out of Scope" section | WARN | Issue |
| architecture_diagram | ASCII diagram showing component interactions | BLOCK | Epic, PRD, Large Issue |
| architecture_diagram | Diagram present for multi-component work | WARN | Medium Issue |
| user_flow_diagram | User flow diagram for user-facing features | BLOCK | PRD |
| user_flow_diagram | User flow diagram if feature has UI/UX | WARN | Issue, Epic |
| test_plan | Lists E2E, Integration, or Regression tests to write | BLOCK | Issue, Bug |
| test_plan | Specifies what tests will be created | WARN | Epic |
| analytics_events | Analytics section with event name, trigger, properties | BLOCK | Issue, Epic |
| analytics_events | Analytics section exists | WARN | Bug |
| ac_quality | ACs are discrete, testable user stories (not vague summaries) | BLOCK | All |
| no_ambiguous_language | No "should/might/could/maybe" in ACs | WARN | All |
| tldr_present | TL;DR or summary exists | WARN | All |
| implementation_hints | Files/functions listed | WARN | Issue, Bug |
| success_metrics | Measurable metrics defined | BLOCK | Epic, PRD |
| success_metrics | Metrics section exists | WARN | Issue |
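
A minimal sketch of how the gate could be evaluated, covering only three of the rules above. The spec shape (a dict with `type`, `problem_statement`, and `acceptance_criteria`) and all names are assumptions for illustration. The Gherkin check here requires only "Given" and "Then", since "When" is sometimes omitted in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    check: str
    severity: str  # "BLOCK" or "WARN"
    message: str


def validate(spec: dict) -> list[Finding]:
    """Apply a subset of the validation rules to a spec dict (hypothetical shape)."""
    findings: list[Finding] = []
    spec_type = spec["type"]
    acs = spec.get("acceptance_criteria", [])

    # problem_statement: section exists and is longer than 50 chars (all types).
    if len(spec.get("problem_statement", "").strip()) <= 50:
        findings.append(Finding("problem_statement", "BLOCK", "missing or too short"))

    # acceptance_criteria: at least 3 ACs for Issue, Bug, Epic.
    if spec_type in {"Issue", "Bug", "Epic"} and len(acs) < 3:
        findings.append(Finding("acceptance_criteria", "BLOCK", f"only {len(acs)} ACs"))

    # gherkin_format: warn for Issue and Bug when any AC lacks Given/Then.
    if spec_type in {"Issue", "Bug"}:
        plain = [ac for ac in acs if "Given" not in ac or "Then" not in ac]
        if plain:
            findings.append(Finding("gherkin_format", "WARN", f"{len(plain)} ACs not in Gherkin"))

    return findings


def summarize(findings: list[Finding]) -> str:
    """Produce the result line: any BLOCK finding fails the gate."""
    blocking = sum(f.severity == "BLOCK" for f in findings)
    warnings = sum(f.severity == "WARN" for f in findings)
    result = "FAIL" if blocking else "PASS"
    return f"Result: {result} ({blocking} blocking, {warnings} warnings)"
```

Keeping severity on each finding means warnings never change the PASS/FAIL outcome; only the blocking count does.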

On PASS (0 blocking issues):

✓ Spec validation passed ([N] warnings)

[Show warnings if any]

Proceed to create ticket? (Y/n)

On FAIL (1+ blocking issues):

✗ Spec validation failed ([N] blocking issues)

Please fix before creating ticket:
- [List blocking issues]

[Also show warnings]

Which issues should we address?
A) Fix all - I'll ask questions for each
B) Let me edit manually - show the spec
C) Skip validation - create anyway (not recommended)

If user chooses C (skip):

  • Add label needs-spec-review to the ticket
  • Add comment noting validation was skipped
  • Proceed to Step 7

Step 7: Create or Update Linear Ticket(s)

If Linear was marked unavailable in Step 0:

The spec is ready! Here's the final markdown:

[OUTPUT FULL SPEC]

[If multiple tickets were planned in Step 6, output each spec separately]

Note: Linear is not configured in this session. You can copy these specs
and create the tickets manually, or set up the Linear MCP for future sessions.

If Linear is available:


UPDATE MODE (existing ticket from Step 0b)

When updating an existing ticket:

  1. Use mcp__linear-server__update_issue with the ticket ID
  2. Replace the description with the updated spec
  3. If ticket was in "Needs Human" status, move back to previous state (usually "Backlog")
  4. Add a comment summarizing what changed

Update call:

mcp__linear-server__update_issue:
  id: [TICKET_ID]
  description: [FULL_UPDATED_SPEC]
  state: [previous state or "Backlog" if was "Needs Human"]

Add change summary comment:

mcp__linear-server__create_comment:
  issueId: [TICKET_ID]
  body: |
    ## Spec Updated

    Changes made:
    - [Added/Updated problem statement]
    - [Added X acceptance criteria]
    - [Added out-of-scope section]
    - [etc.]

    ---
    _Updated via /spec command_

Post-update summary:

✅ Updated ticket: [TICKET-ID]

Changes:
- [List what was added/changed]

URL: [LINEAR_URL]

[If was in "Needs Human", note it was moved back to Backlog]

CREATE MODE (new ticket)

Create based on Step 6 decision:

Single Issue:

  • Use mcp__linear-server__create_issue
  • Include full markdown spec in description

Epic with Sub-Issues:

  1. Create parent Epic issue first
  2. Create child issues with parentId set to Epic
  3. Return all ticket URLs
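
The ordering matters because each child needs the parent's ID. A minimal sketch, assuming a hypothetical `create_issue` callable that wraps the Linear tool call and returns a dict with `id` and `url`; that response shape is an assumption, not a documented contract.

```python
from typing import Callable


def create_epic_with_children(
    create_issue: Callable[[dict], dict],
    epic: dict,
    children: list[dict],
) -> list[str]:
    """Create the parent first, then each child with parentId set to the parent.

    `create_issue` is a stand-in for the real ticket-creation call.
    Returns the parent URL followed by the child URLs.
    """
    parent = create_issue(epic)
    urls = [parent["url"]]
    for child in children:
        created = create_issue({**child, "parentId": parent["id"]})
        urls.append(created["url"])
    return urls
```

Injecting `create_issue` keeps the ordering logic testable with a fake, without touching Linear.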

Project:

  1. Use mcp__linear-server__create_project for container
  2. Create issues/epics underneath with project association
  3. Return project URL + all ticket URLs

For all tickets:

  • Ask for team assignment if not specified
  • Set appropriate labels based on spec type (Bug, Feature, Spike, etc.)
  • Link related tickets if creating multiple
  • Copy final URLs to clipboard

Post-creation summary:

✅ Created [N] ticket(s) in Linear:

📁 [Project/Epic]: [URL]
├── 📋 [Issue 1]: [URL]
├── 📋 [Issue 2]: [URL]
└── 📋 [Issue 3]: [URL]

[URLs copied to clipboard]

ASCII Diagrams

Include ASCII diagrams when they help clarify the spec. Types to consider:

| Diagram Type | When to Use |
|--------------|-------------|
| Architecture | Component interactions, data flow, system boundaries |
| Sequence | API interactions, user flows with multiple steps |
| State Machine | Feature lifecycle, status transitions |
| Entity Relationship | Data models, database schemas |

Diagram Requirements:

  • Must render correctly in terminal AND markdown (Linear, GitHub)
  • Use UTF-8 box-drawing characters: ┌┐└┘├┤┬┴┼─│▶▼
  • Wrap in triple backticks for markdown code blocks
  • Keep width reasonable (< 80 chars ideal)

Example Architecture Diagram:

┌─────────────┐       ┌──────────────┐
│   Client    │──────▶│   API Layer  │
└─────────────┘       └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │   Database   │
                      └──────────────┘

Classification Logic

When creating Linear tickets, classify based on scope:

| Type | Indicators | Linear Entity |
|------|------------|---------------|
| Issue | Single PR, isolated fix, < 5 ACs, no phases | Issue |
| Epic | Multi-PR, 5-15 ACs, explicit phases, coordinated work | Issue (Epic label) or Project |
| Project | Multi-team, > 15 ACs, initiative-level, roadmap item | Project |

Classification Signals:

  • Scope keywords: "multiple PRs", "phases", "dependencies"
  • Complexity: story points, time mentions
  • Architecture diagram complexity
  • Number of acceptance criteria

Acceptance Criteria Quality

The #1 spec failure mode is vague, untestable ACs.

❌ BAD: Vague Summary ACs

Given the feature is fully implemented
Then users can do the thing
And everything works correctly
And the system is reliable

This is useless. It's not testable, not specific, and doesn't capture discrete behaviors.

✅ GOOD: Discrete, Testable User Stories

Each AC should be:

  • Specific: One scenario, one outcome
  • Testable: Can write an automated test for it
  • Independent: Doesn't depend on other ACs to make sense
  • Observable: Outcome is verifiable (not "system is fast" but "response < 200ms")

Group ACs by functional area:

| Functional Area | Example ACs |
|-----------------|-------------|
| Core Flow | Happy path user journeys |
| Edge Cases | Boundary conditions, empty states |
| Error Handling | What happens when things fail |
| Timing/Scheduling | Time-based behaviors |
| Analytics | Event tracking verification |
| Integration | Cross-component behaviors |

AC Count Guidelines:

| Spec Type | Target AC Count |
|-----------|-----------------|
| Issue | 5-10 ACs |
| Epic | 15-25 ACs |
| PRD | 20-40 ACs |

Example: Decomposing a Feature

Feature: "Send push notifications"

❌ Vague:

Given the notification system is implemented
Then users receive notifications

✅ Decomposed:

# Core Flow
AC1: Given a notification is scheduled for 6pm
     When the scheduler runs at 6pm
     Then the notification is sent via Expo Push
     And sent_at timestamp is recorded

# Edge Case
AC2: Given a notification is scheduled for 6pm
     And the user's push token is invalid
     When the scheduler runs
     Then the notification is marked as failed
     And no retry is attempted

# Error Handling
AC3: Given Expo Push API returns a 500 error
     When sending a notification
     Then the system retries with exponential backoff
     And after 3 failures, marks as failed

# Analytics
AC4: Given a notification is sent successfully
     Then a notification_sent event is logged
     With properties: user_id, notification_id, scheduled_for, actual_send_time
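
A discrete AC maps directly onto an automated test. The sketch below illustrates this for AC3 under stated assumptions: the `send_with_retry` helper is hypothetical, "3 failures" is read as three attempts in total, any status other than 200 is retried, and the base delay of one second is invented for the example.

```python
from typing import Callable


def send_with_retry(
    send: Callable[[], int],
    sleep: Callable[[float], None],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
) -> str:
    """Call `send` until it returns 200, doubling the delay between attempts."""
    for attempt in range(1, max_attempts + 1):
        if send() == 200:
            return "sent"
        if attempt < max_attempts:
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    return "failed"


def test_ac3_marks_failed_after_three_attempts() -> None:
    """AC3: repeated 500s trigger backoff, then the notification is marked failed."""
    delays: list[float] = []
    attempts: list[int] = []

    def always_500() -> int:
        attempts.append(1)
        return 500

    outcome = send_with_retry(always_500, delays.append)
    assert outcome == "failed"
    assert len(attempts) == 3
    assert delays == [1.0, 2.0]
```

Passing `sleep` in as a parameter lets the test record the delays instead of actually waiting.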

Analytics Requirements

Every Issue and Epic MUST include an Analytics section.

Why Analytics Matter

Without analytics, you cannot:

  • Know if the feature is being used
  • Measure success against goals
  • Debug issues in production
  • Make data-driven decisions about iteration

Event Naming Convention

Use snake_case with format: [domain]_[object]_[action]

| Pattern | Example | Use Case |
|---------|---------|----------|
| `[feature]_[action]` | `ppo_scheduled` | Feature-level events |
| `[feature]_[object]_[action]` | `ppo_notification_sent` | When feature has sub-objects |
| `[screen]_viewed` | `settings_viewed` | Screen/page views |
| `[button]_tapped` | `submit_button_tapped` | UI interactions |
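
The naming convention can be checked mechanically. This is a minimal sketch with a hypothetical helper; it only verifies the snake_case shape with at least two segments and cannot tell whether the last segment is really an action.

```python
import re

# Lowercase segments separated by single underscores, at least two segments.
EVENT_NAME = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)+")


def is_valid_event_name(name: str) -> bool:
    """Return True when `name` is snake_case with two or more segments."""
    return EVENT_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("ppo_notification_sent", "settings_viewed", "PpoSent", "sent"):
        print(candidate, is_valid_event_name(candidate))
```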

Event Categories

Lifecycle Events - Track the full funnel:

| Event | When | Required Properties |
|-------|------|---------------------|
| `*_scheduled` | System queues an action | `scheduled_for`, `trigger_type` |
| `*_started` | User/system begins flow | `source`, `entry_point` |
| `*_completed` | Flow succeeds | `duration_ms`, `outcome` |
| `*_failed` | Flow fails | `error_type`, `error_code`, `retry_count` |
| `*_abandoned` | User exits mid-flow | `last_step`, `time_in_flow_ms` |
| `*_retried` | System/user retries | `attempt_number`, `previous_error` |

Engagement Events - Track user behavior:

| Event | When | Required Properties |
|-------|------|---------------------|
| `*_viewed` | User sees content | `content_id`, `position`, `source` |
| `*_tapped` | User interacts | `element_id`, `context` |
| `*_opened` | User opens notification/modal | `time_since_sent`, `source` |
| `*_dismissed` | User dismisses | `time_visible_ms`, `action_taken` |

Business Events - Track outcomes:

| Event | When | Required Properties |
|-------|------|---------------------|
| `*_converted` | User completes key action | `conversion_type`, `value` |
| `*_activated` | Feature first used | `days_since_signup`, `activation_path` |
| `*_retained` | User returns | `days_since_last_active`, `return_trigger` |

Property Standards

Always Include:

  • user_id - For user-level analysis
  • session_id - For session-level analysis
  • timestamp - Usually automatic
  • platform - ios / android / web
  • app_version - For version-based debugging

Context Properties:

  • source - Where did user come from? (push, deeplink, organic)
  • entry_point - Which button/link triggered this?
  • experiment_variant - If A/B testing

Outcome Properties:

  • duration_ms - How long did it take?
  • success - true / false
  • error_type - If failed, what category?
  • error_message - Human-readable error
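
The property standards can be enforced where events are built. A minimal sketch, assuming a hypothetical `build_event` helper and payload shape; `timestamp` is left out because it is usually added automatically by the analytics client.

```python
REQUIRED_PROPERTIES = ("user_id", "session_id", "platform", "app_version")
PLATFORMS = {"ios", "android", "web"}


def build_event(name: str, base: dict, **properties) -> dict:
    """Merge always-included properties with event-specific ones.

    Raises ValueError when a required property is missing or the
    platform is not one of ios / android / web.
    """
    missing = [key for key in REQUIRED_PROPERTIES if key not in base]
    if missing:
        raise ValueError(f"missing required properties: {', '.join(missing)}")
    if base["platform"] not in PLATFORMS:
        raise ValueError(f"unknown platform: {base['platform']}")
    return {"event": name, "properties": {**base, **properties}}


if __name__ == "__main__":
    base = {"user_id": "u1", "session_id": "s1", "platform": "ios", "app_version": "1.2.0"}
    print(build_event("ppo_opened", base, source="push", duration_ms=420))
```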

Analytics Table Format

## Analytics Events

### Funnel: [Feature Name]

| Event | Trigger | Properties | Notes |
|-------|---------|------------|-------|
| `feature_scheduled` | System schedules action | `user_id`, `scheduled_for`, `trigger_type`, `trigger_context` | Baseline for funnel |
| `feature_sent` | Action delivered | `user_id`, `feature_id`, `delay_from_scheduled_ms` | Delivery rate |
| `feature_opened` | User engages within 1hr | `user_id`, `feature_id`, `time_to_open_ms`, `source` | Open rate |
| `feature_converted` | User completes goal | `user_id`, `feature_id`, `conversion_type`, `time_to_convert_ms` | Conversion rate |
| `feature_failed` | Delivery/action failed | `user_id`, `feature_id`, `error_type`, `error_code`, `retry_count` | Error rate |

### Derived Metrics

| Metric | Calculation | Target |
|--------|-------------|--------|
| Delivery Rate | `feature_sent` / `feature_scheduled` | > 99% |
| Open Rate | `feature_opened` / `feature_sent` | > 30% |
| Conversion Rate | `feature_converted` / `feature_opened` | > 20% |
| Error Rate | `feature_failed` / `feature_scheduled` | < 1% |

Example: Complete Analytics Spec

Feature: PPO (Proactive Personalized Outreach)

## Analytics Events

### Funnel: PPO Lifecycle

| Event | Trigger | Properties | Notes |
|-------|---------|------------|-------|
| `ppo_analyzed` | Conversation ends, analyzer runs | `user_id`, `conversation_id`, `transcript_length` | Entry to funnel |
| `ppo_scheduled` | LLM decides to schedule PPO | `user_id`, `ppo_id`, `trigger_type`, `scheduled_for`, `message_preview` | Extraction rate |
| `ppo_skipped` | LLM decides NOT to schedule | `user_id`, `conversation_id`, `skip_reason` | Understand filtering |
| `ppo_sent` | Push notification delivered | `user_id`, `ppo_id`, `delay_from_scheduled_ms`, `expo_receipt_id` | Delivery rate |
| `ppo_send_failed` | Push delivery failed | `user_id`, `ppo_id`, `error_type`, `error_code`, `retry_count` | Debug delivery |
| `ppo_opened` | App opened within 1hr of send | `user_id`, `ppo_id`, `time_to_open_ms`, `opened_from` | Open rate |
| `ppo_conversation_started` | Conversation started within 1hr | `user_id`, `ppo_id`, `time_to_conversation_ms` | Engagement rate |
| `ppo_used` | PPO injected into Ember context | `user_id`, `ppo_id`, `ppo_count_injected` | Context usage |
| `ppo_expired` | PPO past 24hr grace period | `user_id`, `ppo_id`, `hours_overdue` | Stale PPO rate |

### Segment Properties

All events include:
- `user_id`, `session_id`, `platform`, `app_version`
- `ppo_trigger_type`: `commitment` | `event` | `emotional`
- `user_ppo_count_lifetime`: Total PPOs this user has received

### Derived Metrics

| Metric | Calculation | Target | Alert Threshold |
|--------|-------------|--------|-----------------|
| Extraction Rate | `ppo_scheduled` / `ppo_analyzed` | 15-25% | < 5% or > 50% |
| Delivery Rate | `ppo_sent` / `ppo_scheduled` | > 99% | < 95% |
| Open Rate | `ppo_opened` / `ppo_sent` | > 30% | < 15% |
| Conversation Rate | `ppo_conversation_started` / `ppo_opened` | > 50% | < 25% |
| Context Injection Rate | `ppo_used` / `ppo_opened` | > 95% | < 80% |
| Error Rate | `ppo_send_failed` / `ppo_scheduled` | < 1% | > 5% |
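
Derived metrics are ratios of event counts, so the alert thresholds can be checked in a few lines. The sketch below covers three of the metrics above; the function names and the counts-dict input are assumptions, and a zero denominator is reported as `None` instead of raising.

```python
from typing import Callable, Optional

# metric -> (numerator event, denominator event, breach predicate from the table)
ALERT_RULES: dict[str, tuple[str, str, Callable[[float], bool]]] = {
    "Delivery Rate": ("ppo_sent", "ppo_scheduled", lambda r: r < 0.95),
    "Open Rate": ("ppo_opened", "ppo_sent", lambda r: r < 0.15),
    "Error Rate": ("ppo_send_failed", "ppo_scheduled", lambda r: r > 0.05),
}


def rate(counts: dict[str, int], numerator: str, denominator: str) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    denom = counts.get(denominator, 0)
    if denom == 0:
        return None
    return counts.get(numerator, 0) / denom


def breached_alerts(counts: dict[str, int]) -> list[str]:
    """List the metrics whose current value crosses its alert threshold."""
    breached = []
    for metric, (num, den, is_breach) in ALERT_RULES.items():
        value = rate(counts, num, den)
        if value is not None and is_breach(value):
            breached.append(metric)
    return breached


if __name__ == "__main__":
    sample = {"ppo_scheduled": 200, "ppo_sent": 180, "ppo_opened": 90, "ppo_send_failed": 12}
    print(breached_alerts(sample))
```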

Validation Checklist

Before finalizing any spec:

  • Problem is clearly stated
  • Scope is well-defined (in-scope and out-of-scope)
  • Acceptance criteria are discrete, testable user stories (not vague summaries)
  • ACs grouped by functional area (core flow, edge cases, errors, analytics)
  • Architecture diagram included (for multi-component work)
  • User flow diagram included (for user-facing features)
  • Test plan specifies E2E/Integration/Regression tests to write
  • Analytics events defined with triggers and properties
  • Dependencies are identified
  • Success metrics are measurable
  • No ambiguous language ("should", "might", "could")
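
The ambiguous-language item is the easiest one to automate. A minimal sketch with a hypothetical helper that reports the line number and word for each hit, using the word list from the validation rules ("should", "might", "could", "maybe"):

```python
import re

AMBIGUOUS = re.compile(r"\b(should|might|could|maybe)\b", re.IGNORECASE)


def find_ambiguous(ac_text: str) -> list[tuple[int, str]]:
    """Return (line_number, word) for each ambiguous word, 1-indexed by line."""
    hits = []
    for line_no, line in enumerate(ac_text.splitlines(), start=1):
        for match in AMBIGUOUS.finditer(line):
            hits.append((line_no, match.group(1).lower()))
    return hits


if __name__ == "__main__":
    sample = "Given a user is logged in\nThen the app should respond quickly"
    print(find_ambiguous(sample))
```

Word boundaries keep the check from flagging words such as "shoulder" that merely contain a listed word.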

Quick Reference

| Spec Type | Linear Entity | Key Sections |
|-----------|---------------|--------------|
| Epic | Project | Goals, Stories, Metrics |
| Issue | Issue | Problem, AC, Technical Notes |
| Bug | Issue + Bug label | Repro Steps, Expected/Actual |
| Spike | Issue + Spike label | Question, Time-box, Artifacts |
| PRD | Project + Doc | Requirements, User Journey, Metrics |

Sparky Integration

This skill optionally integrates with Sparky, a persistent AI agent with long-term memory.

When Sparky is Consulted:

  • After gathering initial context from the user
  • Before generating the final spec

What Sparky Provides:

  • Product/design/user research perspective
  • Historical context and prior decisions
  • The 3-5 most important questions the human needs to answer
  • Red flags or considerations to address

Flow with Sparky:

  1. User provides initial context
  2. Claude asks if user wants Sparky's input
  3. If yes, Claude sends context to Sparky
  4. Sparky returns insights + key questions
  5. Claude presents questions to user
  6. User answers the questions
  7. Claude generates spec with all context

Dependency:

  • Requires the sparky skill at .claude/skills/sparky/
  • Uses .claude/skills/sparky/scripts/ask_sparky.sh

Including Sparky's Input in Specs:

If Sparky was consulted, add a "Research Context" section:

## Research Context
_Insights from product/design review:_
- [Key insight 1]
- [Key insight 2]

_Key questions addressed:_
- Q: [Question from Sparky]
  A: [User's answer]