nw-spike-methodology - SKILL.md Agent Skill

name: nw-spike-methodology description: Teaches agents how to run a timeboxed spike - throwaway code that validates one assumption before DESIGN user-invocable: false disable-model-invocation: true

Spike Methodology

What a Spike IS

Throwaway code that validates exactly one assumption. Max 1 hour. Binary outcome: works or doesn't work. Code is deleted after validation; only findings persist.

What a Spike IS NOT

Not This	Why
Prototype	Prototypes evolve into production code. Spikes are deleted.
MVP	MVPs ship to users. Spikes never leave the developer's machine.
Walking Skeleton	Skeletons wire end-to-end architecture. Spikes test one mechanism.
POC	POCs demonstrate feasibility to stakeholders. Spikes answer a developer question.

The 4 Questions

Every spike answers exactly these:

Does the core mechanism work? Can we parse the output? Call the API? Does the algorithm produce correct results?
Does it meet the performance requirement? Is it under the time/memory/throughput budget?
What edge cases exist? Empty input, malformed data, missing dependencies, concurrent access.
What did we assume wrong? The design said X, but reality is Y. Document the delta.

When to Spike

New mechanism never tried before in this codebase
Performance requirement that cannot be validated by reasoning alone
External integration (API, library, OS feature) with unknown behavior
Algorithm where correctness is non-obvious

When to Skip

Pure refactoring (mechanism already proven)
Bug fix (behavior already understood)
Feature estimated at less than 1 day (spike overhead not justified)
Mechanism already validated by a prior spike or existing code

Rules

Max 1 hour wall clock. If it takes longer, the problem is bigger than expected. Stop and escalate.
No tests. No types. No error handling. No ports. No abstractions.
Code lives in /tmp/spike_{feature_id}/ or a scratch directory. Never in src/.
One file preferred. Two files maximum.
Use time.perf_counter() for timing, not time.time().
Print results to stdout. No logging frameworks.
After validation: delete the code, keep the findings.

Spike Code Structure

#!/usr/bin/env python3
"""SPIKE: {feature} -- {one-line question being validated}

Throwaway code. Not production. Will be deleted after validation.
"""
import time

# 1. Setup (minimal, no frameworks)
# ...

# 2. Core mechanism attempt
start = time.perf_counter()
# ... the thing you're testing ...
elapsed = time.perf_counter() - start

# 3. Print findings
print(f"Mechanism: {'WORKS' if success else 'FAILS'}")
print(f"Timing: {elapsed*1000:.1f}ms (budget: {budget}ms)")
print(f"Edge cases: {edge_cases}")

Findings Output Format

Output goes to docs/feature/{feature-id}/spike/findings.md:

# Spike Findings -- {feature-id}

## Verdict: WORKS / DOESN'T WORK

## Question tested
{the one assumption being validated}

## Core mechanism
- Tested: {what was tested}
- Result: {what happened}

## Timing
- {operation}: {time}ms
- Total: {time}ms (budget: {budget}ms) -- PASS / FAIL

## Edge cases discovered
1. {edge case}: {what happened}
2. {edge case}: {what happened}

## Design implications
- {assumption that was wrong}: {correct reality}
- {approach the spike validated or invalidated}

## Spike code
Deleted. Was at /tmp/spike_{feature_id}/

Escalation

If the spike reveals the problem is fundamentally different from what was assumed:

Stop the spike
Write findings with verdict "DOESN'T WORK" or "BIGGER THAN EXPECTED"
Return to DISCUSS to re-scope the feature
The spike findings become input to the revised DISCUSS wave