name: test-driven-development description: "Enforce strict Test-Driven Development workflow: write one test, make it pass, verify, then proceed. Prevents over-implementation and ensures code matches requirements exactly. Use when implementing new features, adding settings, or building functionality incrementally."
Test-Driven Development (TDD)
When to Use This Skill
Use this skill when the user explicitly requests:
- "Use TDD"
- "Test-driven development"
- "Write tests first"
- "Add tests before implementing"
- Implementing features incrementally with test verification at each step
Core TDD Principle
NEVER write implementation code before writing a test for it.
The TDD cycle is:
- Write ONE test for the next small piece of functionality
- Run the test - it should fail (red)
- Write minimal code to make that specific test pass
- Run the test - verify it passes (green)
- ONLY THEN proceed to the next test
Critical Rules
1. ONE Test at a Time
# WRONG: Adding multiple tests at once
it('should load panel1Text', () => { ... })
it('should load panel2Text', () => { ... })
it('should load panel3Text', () => { ... })
# Then implementing all at once
# RIGHT: One test, verify, next test
it('should load panel1Text', () => { ... })
# Run test → passes
# THEN add next test
it('should load panel2Text', () => { ... })
# Run test → passes
# THEN add next test
2. Test Drives Implementation, Not Vice Versa
// WRONG: Implementation first
const settings = {
panel1Text: "Welcome",
panel2Text: "I'm Theia",
panel3Text: "Your partner"
}
// Then write tests
// RIGHT: Test first
it('should have panel1Text', () => {
expect(settings).toHaveProperty('panel1Text')
expect(settings.panel1Text).toBe('Welcome')
})
// Test fails → add ONLY panel1Text → test passes
3. Verify After Each Step
Every test must be run and confirmed passing before writing the next test:
# After writing each test:
bun test your-test-file.test.ts
# Verify output shows:
✓ should have panel1Text
# 1 pass, 0 fail
# ONLY THEN proceed to next test
4. Minimal Implementation
Write the absolute minimum code to make the current test pass:
// Test: should have panel1Text with value "Welcome"
// WRONG: Adding everything at once
{
"panel1Text": "Welcome",
"panel2Text": "I'm Theia",
"panel3Text": "Your partner"
}
// RIGHT: Only what's needed for current test
{
"panel1Text": "Welcome"
}
5. Mocked Timers and Async Testing
CRITICAL RULE: Mocked processes with timers shall always add to a cumulative maxPotentialDelayTime so the assert timeout will always be sufficient because it will be based on that plus its own buffer.
When testing code that uses timers (setTimeout, setInterval, debounce):
Track Cumulative Time
// WRONG: Guessing how long to wait
setTimeout(async () => {
await operation1() // Unknown duration
await operation2() // Unknown duration
}, 1000)
await vi.advanceTimersByTimeAsync(2000) // Random guess
expect(mock).toHaveBeenCalled() // Might fail
// RIGHT: Track cumulative potential delay
let maxPotentialDelayTime = 0
// Timer adds 1000ms
maxPotentialDelayTime += 1000
// Mock operations add their durations
maxPotentialDelayTime += mockOperation1Duration // e.g., 50ms
maxPotentialDelayTime += mockOperation2Duration // e.g., 100ms
// Advance timers by tracked total + buffer
await vi.advanceTimersByTimeAsync(maxPotentialDelayTime + 100)
expect(mock).toHaveBeenCalled() // Will always succeed
Implementation Pattern
describe('async operation', () => {
beforeEach(() => {
vi.useFakeTimers()
})
it('completes within tracked time', async () => {
// Define all mock process timers at start
const mockProcessTimers = [
1000, // DEBOUNCE_DELAY
10, // MOCK_GET_DELAY
10, // MOCK_UPLOAD_DELAY
10, // MOCK_IMPORT_DELAY
10 // MOCK_SAVE_DELAY
]
// Define buffer for assertion safety
const assertDelayBuffer = 100
// Calculate total delay needed
const assertDelay = mockProcessTimers.reduce((sum, delay) => sum + delay, 0) + assertDelayBuffer
// Trigger operation
triggerDebouncedOperation()
// Wait for calculated delay
await vi.advanceTimersByTimeAsync(assertDelay)
expect(mockOperation).toHaveBeenCalled()
})
})
Why This Matters
Without time tracking:
- Tests guess arbitrary wait times
- Flaky tests that sometimes pass/fail
- No relationship between mock delays and test waits
- "House of cards" timing
With time tracking:
- Tests know exactly how long to wait
- Deterministic, reliable tests
- Clear relationship between delays and waits
- Tests adapt when delays change
Never use arbitrary timeouts like 2000ms without justification. Always track cumulative delays.
Dynamic Imports and Fake Timers
CRITICAL: Avoid dynamic imports in code paths that create timers when using fake timers for testing.
Generic Problem:
When code has an async operation (dynamic import, network call, etc.) that creates a timer AFTER the async work completes, fake timer APIs like runAllTimersAsync() will check for timers BEFORE the async operation completes, find none, and return immediately. The timer is created later and never executes.
Rule: If your code uses fake timers for testing, do not use dynamic imports (await import()) in the code path before creating the timer.
REQUIRED PROCEDURE FOR FIXING TEST ISSUES
When a test fails unexpectedly, you MUST follow this procedure:
- List dependencies: Identify every tool/function the test uses
- Create proof file: New test file (e.g.,
featureProofTest.test.ts) to prove each dependency works - Start minimal: Simplest possible test (tool exists and runs)
- Add one thing: Each new test adds exactly one complexity
- Run after each: Test fails = found the broken assumption
- Continue to completion: Build proof tests all the way until they replicate ALL aspects of the problematic test (mocked modules, dynamic imports, exact async patterns, etc.)
- Compare patterns: If all pass, issue is test setup not tools
Result: First failing proof test shows exactly what's broken. All passing = real test has environment/mocking issue.
Do NOT:
- Guess at solutions without proving tool behavior
- Search for fixes without understanding the problem
- Modify real tests without isolating the issue
- Repeat failed approaches expecting different results
TDD Workflow Example
User Request: "Add three panel text settings to the config file"
Step 1: First Test
// baseline-test-settings.test.ts
it('should have panel1Text', () => {
expect(settings.panel1Text).toBe('Welcome')
})
Run test:
bun test baseline-test-settings.test.ts
# ✗ Test fails - property doesn't exist
Step 2: Minimal Implementation
// baseline-test-settings.json
{
"panel1Text": "Welcome"
}
Step 3: Verify
bun test baseline-test-settings.test.ts
# ✓ 1 pass
STOP. Report success. Ask if user wants to proceed to next test.
Step 4: Second Test
it('should have panel2Text', () => {
expect(settings.panel2Text).toBe("I'm Theia")
})
Run test → Fails → Add only panel2Text → Test passes → Report → Repeat
Critical: Test Functionality, Not Configuration Data
FUNDAMENTAL PRINCIPLE: Tests should verify that code works correctly, not that configuration has specific values.
✅ Test Structure and Types
// CORRECT: Test that property exists and has right type
it('should have timerVisible setting', () => {
expect(settings).toHaveProperty('timerVisible')
expect(typeof settings.timerVisible).toBe('boolean')
})
// CORRECT: Test that numeric property exists
it('should have questionTimeout setting', () => {
expect(settings).toHaveProperty('questionTimeout')
expect(typeof settings.questionTimeout).toBe('number')
})
❌ Don't Test Configuration Values
// WRONG: Testing specific config value
it('should have timerVisible setting', () => {
expect(settings.timerVisible).toBe(false) // ❌ BAD
})
// Problem: If user changes config to true, test fails
// But the loading functionality still works!
// WRONG: Testing specific text content
it('should have welcome text', () => {
expect(settings.welcomeText).toBe('Welcome') // ❌ BAD
})
// Problem: If user changes text to "Hello", test fails
// But the loading functionality still works!
Why This Matters
Configuration data changes frequently:
- User preferences
- Business requirements
- A/B testing
- Localization
Functionality rarely changes:
- Property exists
- Type is correct
- Structure is valid
If changing a config value breaks tests, your tests are wrong.
Exception: Integration Tests
Only test specific values when verifying integration between systems:
// ACCEPTABLE in integration test:
it('should calculate timeout correctly', () => {
const timeout = calculateTimeout(settings.baseTime, settings.multiplier)
expect(timeout).toBe(100) // Testing calculation, not config
})
Anti-Patterns to Avoid
❌ Implementing Multiple Features Without Tests
// WRONG: Adding all settings at once
{
"panel1Text": "Welcome",
"panel2Text": "I'm Theia",
"panel3Text": "Your partner",
"fadeInTime": 0.8,
"delayTime": 2.2
}
// Then writing tests afterward
❌ Writing Multiple Tests Before Implementation
// WRONG: All tests written first
describe('settings', () => {
it('should have panel1Text', () => { ... })
it('should have panel2Text', () => { ... })
it('should have panel3Text', () => { ... })
})
// Then implementing everything
❌ Not Verifying Tests Pass
// Add test
it('should have panel1Text', () => { ... })
// Add implementation
{ "panel1Text": "Welcome" }
// WRONG: Immediately moving to next test without running current test
it('should have panel2Text', () => { ... }) // DON'T DO THIS YET
Correct TDD Pattern
Example: Adding Config Settings
User: "Add panel1Text, panel2Text, panel3Text to the config"
Response:
TDD Step 1: Adding panel1Text
1. Write test:
it('should have panel1Text', () => {
expect(settings.panel1Text).toBe('Welcome')
})
2. Run test:
bun test → FAILS (expected)
3. Add implementation:
{ "panel1Text": "Welcome" }
4. Run test:
bun test → PASSES ✓
✓ panel1Text complete. Ready for panel2Text.
Wait for user confirmation before proceeding to panel2Text.
Benefits of Strict TDD
Prevents Over-Implementation
- Can't add features not requested
- Each test represents an explicit requirement
- Implementation scope is controlled by tests
Ensures Requirements Match Code
- Tests define "what" before "how"
- No guessing about intended behavior
- Documentation through tests
Immediate Feedback
- Know exactly when something breaks
- Confidence that each piece works
- Easier debugging (last change broke last test)
Forces Incremental Progress
- Can't rush ahead
- User sees progress step-by-step
- Opportunity to course-correct early
Integration with Solid Developer Principles
TDD aligns perfectly with solid-developer rules:
Literal Interpretation
Tests make requirements explicit - no room for inference:
// Test EXPLICITLY states requirement
it('should have panel1Text with value "Welcome"', () => {
expect(settings.panel1Text).toBe('Welcome')
})
// Can only implement exactly this
Ask When Unclear
If test requirements are ambiguous, STOP and ask:
User: "Add panel text settings"
STOP - Ask before writing test:
"How many panel text settings? What are their names and values?"
No Anticipating Needs
Tests prevent adding "helpful" extras:
// Test only requires panel1Text
it('should have panel1Text', () => { ... })
// WRONG: Adding more than tested
{
"panel1Text": "Welcome",
"panel2Text": "I'm Theia" // NOT TESTED YET
}
// RIGHT: Only what test requires
{
"panel1Text": "Welcome"
}
When TDD is Not Appropriate
Don't use strict TDD for:
- Exploratory coding or prototyping
- One-line fixes or typo corrections
- Refactoring existing tested code
- Research tasks (reading code, analyzing)
TDD Checklist
Before proceeding to next test, verify:
- Current test is written
- Current test was run and failed initially
- Minimal implementation added
- Current test was run and passed
- User confirmed or acknowledged success
- Ready to proceed to next test
Common TDD Questions
Q: How small should each test be? A: Test ONE property, ONE behavior, ONE requirement at a time.
Q: What if multiple tests would pass with the same implementation? A: Still write them one at a time. Different tests document different requirements.
Q: Should I run all tests or just the current one? A: Run all tests to ensure no regressions, but focus verification on the new test.
Q: What if the user wants me to "implement everything"? A: Still use TDD - implement everything, but one test at a time, verifying each.
Example Session
User: "Add three timing settings to config: fadeIn, delay, and pause"
Claude: "TDD Step 1: Adding fadeIn setting
1. Writing test for fadeIn:
it('should have fadeIn timing', () => {
expect(settings.timing).toHaveProperty('fadeIn')
expect(typeof settings.timing.fadeIn).toBe('number')
})
2. Running test:
bun test settings.test.ts
✗ FAIL - property doesn't exist
3. Adding implementation:
{
"timing": {
"fadeIn": 0.8
}
}
4. Running test:
bun test settings.test.ts
✓ PASS
fadeIn setting complete. Ready to proceed to 'delay' setting."
User: "yes proceed"
Claude: "TDD Step 2: Adding delay setting
1. Writing test for delay:
it('should have delay timing', () => {
expect(settings.timing).toHaveProperty('delay')
expect(typeof settings.timing.delay).toBe('number')
})
2. Running test:
bun test settings.test.ts
✗ FAIL - delay property doesn't exist
3. Adding implementation:
{
"timing": {
"fadeIn": 0.8,
"delay": 2.2
}
}
4. Running test:
bun test settings.test.ts
✓ PASS (2 tests passing)
delay setting complete. Ready to proceed to 'pause' setting."
[Continues for pause...]
Summary
TDD ensures:
- ✅ Only requested features are implemented
- ✅ Every feature has a test
- ✅ Tests document requirements
- ✅ Progress is verifiable at each step
- ✅ No scope creep or "helpful" additions
- ✅ User maintains control over pace and direction
Use TDD when the user wants:
- Incremental, verifiable progress
- Confidence that code matches specs
- Ability to course-correct early
- Documentation through tests
- To prevent over-engineering
The TDD mantra: Red → Green → Next