tdd - SKILL.md Agent Skill

name: tdd
description: 'Guide test-driven development through red/green/refactor phases. Provides checklists and patterns for writing failing tests first, implementing minimal code to pass, and refactoring safely. Use when practicing TDD or when /speckit.implement requests test-first development.'
source: github/awesome-copilot
source_files: agents/tdd-red.agent.md, agents/tdd-green.agent.md, agents/tdd-refactor.agent.md
adopted: 2026-01-29
customizations: Consolidated three agents into single skill; adapted for TypeScript/React; integrated with Spec Kit workflow
license: MIT

TDD Skill — Red/Green/Refactor

Overview

Guide test-driven development through the classic red/green/refactor cycle. This skill provides phase-specific checklists and patterns for writing high-quality tests before implementation.

When to Use

Starting a new feature with /speckit.implement
User asks to "write tests first" or "use TDD"
Implementing code that requires high reliability
Working on bug fixes (write failing test to reproduce first)

Phase 1: Red — Write Failing Tests First

Goal

Write a clear, specific failing test that describes the desired behavior before any implementation exists.

Workflow

1. Extract requirements from task description or issue
2. Identify single behavior to test
3. Write test with descriptive name: should_[expected]_when_[condition]
4. Follow AAA pattern: Arrange → Act → Assert
5. Run test — verify it fails for the right reason

Test Naming Patterns

// TypeScript/Vitest examples
describe('UserService', () => {
  it('should throw ValidationError when email is invalid', () => { });
  it('should return user when credentials are valid', () => { });
  it('should emit event when user is created', () => { });
});

# Python/pytest examples
def test_should_raise_validation_error_when_email_invalid():
    pass

def test_should_return_user_when_credentials_valid():
    pass

Red Phase Checklist

Test clearly describes expected behavior
Test fails for the right reason (missing implementation, not syntax error)
Test name is descriptive and follows naming convention
Test follows AAA pattern (Arrange, Act, Assert)
Single assertion focus per test
Edge cases considered
No production code written yet

Phase 2: Green — Make Tests Pass Quickly

Goal

Write the minimal code necessary to make the failing test pass. Resist over-engineering.

Workflow

1. Run failing test — confirm what needs implementing
2. Write simplest code to make test pass
3. Use "fake it till you make it" — hard-coded values are OK initially
4. Run all tests — ensure nothing is broken
5. Resist refactoring — save that for Phase 3

Implementation Strategies

// Strategy 1: Fake it (simplest)
function isValidEmail(email: string): boolean {
  return email === 'test@example.com'; // Hard-coded for first test
}

// Strategy 2: Obvious implementation
function isValidEmail(email: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// Strategy 3: Triangulation (add more tests to force generalization)
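
Triangulation can be illustrated with a short Python sketch. It assumes a hypothetical `is_valid_email` that mirrors the TypeScript function above; `CASES` and `failing_cases` are illustrative helpers. The first example alone is satisfied by the hard-coded fake, while a second valid address makes the fake fail and forces the general rule.

```python
import re


# Step 1: a faked implementation that satisfies only the first test.
def is_valid_email_faked(email: str) -> bool:
    return email == "test@example.com"


# Step 2: the generalized implementation that the extra tests force.
# Same pattern as the TypeScript example: no whitespace or '@' inside each part.
_EMAIL_PATTERN = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")


def is_valid_email(email: str) -> bool:
    return _EMAIL_PATTERN.match(email) is not None


# Triangulating examples: (input, expected result)
CASES = [
    ("test@example.com", True),   # first test, passes even with the fake
    ("other@example.org", True),  # second valid address, breaks the fake
    ("not-an-email", False),      # invalid input
]


def failing_cases(validator) -> list[str]:
    """Return the inputs for which the validator disagrees with the expectation."""
    return [email for email, expected in CASES if validator(email) != expected]
```

Calling `failing_cases(is_valid_email_faked)` lists the inputs the fake gets wrong, which is the signal to generalize; the generalized version should leave that list empty.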

Green Phase Checklist

All tests are passing (green bar)
No more code written than necessary
Existing tests remain unbroken
Implementation is simple and direct
Did NOT refactor yet
Ready for refactoring phase

Phase 3: Refactor — Improve Quality

Goal

Clean up code and improve design while keeping all tests green. Make small changes and run tests frequently.

Workflow

1. Ensure all tests pass before starting
2. Identify code smells: duplication, long methods, unclear names
3. Make one small change at a time
4. Run tests after each change — stay green
5. Repeat until satisfied with code quality

Common Refactorings

Smell	Refactoring
Duplicated code	Extract function/method
Long function	Extract helper functions
Magic numbers	Extract constants
Unclear name	Rename variable/function
Deep nesting	Guard clauses / early return
Large class	Extract class
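
Two of the refactorings in the table (guard clauses and extracted constants) might look like this in practice. The sketch is illustrative only: the shipping rules, function names, and constants are invented for the example, and `behaves_the_same` stands in for the existing test suite that must stay green.

```python
# Before: deep nesting and magic numbers.
def shipping_cost_before(order_total: float, is_member: bool) -> float:
    if order_total > 0:
        if is_member:
            return 0.0
        else:
            if order_total >= 50:
                return 0.0
            else:
                return 4.99
    else:
        raise ValueError("order_total must be positive")


# After: guard clauses and named constants; behavior is unchanged.
FREE_SHIPPING_THRESHOLD = 50
STANDARD_SHIPPING_COST = 4.99


def shipping_cost(order_total: float, is_member: bool) -> float:
    if order_total <= 0:
        raise ValueError("order_total must be positive")
    if is_member or order_total >= FREE_SHIPPING_THRESHOLD:
        return 0.0
    return STANDARD_SHIPPING_COST


def behaves_the_same() -> bool:
    """Compare both versions over a few representative inputs."""
    samples = [(10.0, False), (10.0, True), (50.0, False), (75.5, True)]
    return all(shipping_cost_before(t, m) == shipping_cost(t, m) for t, m in samples)
```

The comparison helper plays the role of the safety net: each small refactoring step is acceptable only while it keeps returning `True`.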

Refactor Phase Checklist

All tests still passing
Code duplication eliminated
Names clearly express intent
Methods have single responsibility
No magic numbers/strings
Security considerations addressed
Code coverage maintained

Integration with Spec Kit

When using /speckit.implement:

1. Read task from tasks.md
2. @tdd Red Phase:
   - Write failing test for task requirements
   - Verify test fails correctly
3. @tdd Green Phase:
   - Implement minimal code
   - Verify all tests pass
4. @tdd Refactor Phase:
   - Clean up code
   - Verify tests still pass
5. Mark task [X] complete
6. Commit: test: add tests for {feature}
7. Commit: feat: implement {feature}

TypeScript/React Patterns

Component Testing (Vitest + Testing Library)

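// Red: Write failing test (Button.test.tsx)
import { describe, expect, it, vi } from 'vitest';
import { fireEvent, render, screen } from '@testing-library/react';
import { Button } from './Button'; // assumed module path

describe('Button', () => {
  it('should call onClick when clicked', () => {
    const handleClick = vi.fn();
    render(<Button onClick={handleClick}>Click me</Button>);

    fireEvent.click(screen.getByRole('button'));

    expect(handleClick).toHaveBeenCalledOnce();
  });
});

// Green: Implement minimal component (Button.tsx)
import type { ReactNode } from 'react';

type ButtonProps = { onClick: () => void; children: ReactNode };

export function Button({ onClick, children }: ButtonProps) {
  return <button onClick={onClick}>{children}</button>;
}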

Hook Testing

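import { describe, expect, it } from 'vitest';
import { act, renderHook } from '@testing-library/react';
import { useCounter } from './useCounter'; // assumed module path

describe('useCounter', () => {
  it('should increment count when increment is called', () => {
    const { result } = renderHook(() => useCounter());

    act(() => result.current.increment());

    expect(result.current.count).toBe(1);
  });
});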

Anti-Patterns to Avoid

Anti-Pattern	Problem	Solution
Writing tests after code	Loses TDD benefits	Always test first
Multiple unrelated assertions	Unclear failures	One behavior per test
Testing implementation	Brittle tests	Test behavior, not internals
Skipping refactor phase	Technical debt	Always refactor after green
Large test steps	Hard to debug	Small, incremental tests
Unit tests only	Gaps in coverage	Always include integration + E2E
Skipping test tasks	No safety net	Tests are NON-NEGOTIABLE
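
The "testing implementation" row can be made concrete with a small Python sketch. `Cart` is a hypothetical class whose internal storage is imagined to have changed from a private `_items` list to two dictionaries; the names are illustrative only. The behavior-focused test keeps passing after that change, while a check tied to the old private attribute no longer holds.

```python
class Cart:
    """Hypothetical cart; internal storage was refactored from a list to dicts."""

    def __init__(self) -> None:
        self._quantities: dict[str, int] = {}
        self._prices: dict[str, float] = {}

    def add(self, name: str, price: float) -> None:
        self._quantities[name] = self._quantities.get(name, 0) + 1
        self._prices[name] = price

    def total(self) -> float:
        return sum(self._prices[n] * q for n, q in self._quantities.items())


def test_should_sum_prices_when_items_are_added():
    # Behavior-focused: uses only the public API, so it survives internal changes.
    cart = Cart()
    cart.add("pen", 1.5)
    cart.add("pen", 1.5)
    cart.add("book", 10.0)
    assert cart.total() == 13.0


def brittle_check_on_internals() -> bool:
    # Implementation-focused: depends on a private `_items` list that no longer exists.
    cart = Cart()
    cart.add("pen", 1.5)
    return hasattr(cart, "_items")
```

A test written against `_items` would have failed after the refactoring even though the cart still computes the right total, which is the brittleness the table warns about.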

Test Pyramid — Required Test Levels

Every feature MUST include all three test levels. This is NON-NEGOTIABLE.

Unit Tests (Base of Pyramid)

Test individual classes, methods, and functions in isolation
Mock/stub all external dependencies
Fast, deterministic, no I/O
Highest count — cover all logic branches

Integration Tests (Middle of Pyramid)

Test cross-layer behavior with real dependencies
Use real database (in-memory or Testcontainers)
Test repository → database, handler → service → repository chains
Validate serialization, query correctness, middleware behavior
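
A repository-to-database integration test could be sketched as below. This is an assumption-laden illustration: `WorkRepository`, its schema, and its method names are hypothetical, and Python's standard-library `sqlite3` in-memory database stands in for whatever database the project actually uses.

```python
import sqlite3


class WorkRepository:
    """Hypothetical repository; the schema and names are illustrative only."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS works ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "title TEXT NOT NULL, author TEXT NOT NULL)"
        )

    def add(self, title: str, author: str) -> int:
        cursor = self._conn.execute(
            "INSERT INTO works (title, author) VALUES (?, ?)", (title, author)
        )
        self._conn.commit()
        return cursor.lastrowid

    def get(self, work_id: int):
        row = self._conn.execute(
            "SELECT id, title, author FROM works WHERE id = ?", (work_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "title": row[1], "author": row[2]}


def test_should_return_saved_work_when_queried_by_id():
    # Arrange: a fresh in-memory database per test keeps tests isolated
    connection = sqlite3.connect(":memory:")
    repository = WorkRepository(connection)

    # Act
    work_id = repository.add("Hamlet", "Shakespeare")
    found = repository.get(work_id)

    # Assert: the real SQL round trip is exercised, not a mock
    assert found == {"id": work_id, "title": "Hamlet", "author": "Shakespeare"}
    assert repository.get(work_id + 1) is None
    connection.close()
```

Unlike a unit test with a mocked repository, this exercises the actual SQL statements, so a typo in a column name or a wrong parameter order would surface here.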

E2E Tests (Top of Pyramid)

Test complete user workflows through the API/UI surface
Use WebApplicationFactory (.NET) or equivalent test server
Validate HTTP status codes, response shapes, auth flows
Fewer tests, focused on critical paths

.NET / C# Patterns

Unit Test (xUnit + FluentAssertions + NSubstitute)

// Red: Write failing test
using FluentAssertions;
using NSubstitute;
using Xunit;

public class CreateWorkHandlerTests
{
    private readonly IWorkRepository _repository = Substitute.For<IWorkRepository>();
    private readonly CreateWorkHandler _sut;

    public CreateWorkHandlerTests()
    {
        _sut = new CreateWorkHandler(_repository);
    }

    [Fact]
    public async Task Handle_ValidCommand_CreatesWork()
    {
        // Arrange
        var command = new CreateWorkCommand("Hamlet", "Shakespeare");

        // Act
        var result = await _sut.Handle(command, CancellationToken.None);

        // Assert
        result.Should().NotBeNull();
        await _repository.Received(1).AddAsync(Arg.Any<Work>(), Arg.Any<CancellationToken>());
    }
}

Integration Test (WebApplicationFactory + In-Memory DB)

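using System.Net;
using System.Net.Http.Json;
using FluentAssertions;
using Microsoft.AspNetCore.Mvc.Testing;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Xunit;

public class WorkEndpointTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public WorkEndpointTests(WebApplicationFactory<Program> factory)
    {
        _client = factory.WithWebHostBuilder(builder =>
        {
            builder.ConfigureServices(services =>
            {
                // Remove the app's own DbContext options so the test registration takes effect
                // (depending on the EF Core version, further registrations may need removing)
                services.RemoveAll<DbContextOptions<EtudeStoryDbContext>>();

                // Replace DB with in-memory for test isolation
                services.AddDbContext<EtudeStoryDbContext>(options =>
                    options.UseInMemoryDatabase("TestDb"));
            });
        }).CreateClient();
    }

    [Fact]
    public async Task GetWork_ExistingId_Returns200WithWork()
    {
        // Arrange: seed via the API (direct DB access would also work)
        var seedResponse = await _client.PostAsJsonAsync(
            "/api/works", new { title = "Hamlet", author = "Shakespeare" });
        var seeded = await seedResponse.Content.ReadFromJsonAsync<WorkResponse>();

        // Act: interpolate the seeded id instead of sending the literal "{id}"
        var response = await _client.GetAsync($"/api/works/{seeded!.Id}");

        // Assert
        response.StatusCode.Should().Be(HttpStatusCode.OK);
        var work = await response.Content.ReadFromJsonAsync<WorkResponse>();
        work.Should().NotBeNull();
    }
}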

E2E Test (Playwright for UI, or HttpClient for API)

// API E2E: Full workflow through HTTP
// (a method of a test class that provides _client, such as WorkEndpointTests above)
[Fact]
public async Task CreateAndRetrieveWork_FullWorkflow_Succeeds()
{
    // Arrange
    var createPayload = new { title = "Hamlet", author = "Shakespeare" };

    // Act — Create
    var createResponse = await _client.PostAsJsonAsync("/api/works", createPayload);
    createResponse.StatusCode.Should().Be(HttpStatusCode.Created);
    var created = await createResponse.Content.ReadFromJsonAsync<WorkResponse>();

    // Act — Retrieve
    var getResponse = await _client.GetAsync($"/api/works/{created!.Id}");

    // Assert
    getResponse.StatusCode.Should().Be(HttpStatusCode.OK);
    var retrieved = await getResponse.Content.ReadFromJsonAsync<WorkResponse>();
    retrieved!.Title.Should().Be("Hamlet");
}

Python / FastAPI Patterns

Unit Test (pytest + unittest.mock)

# Red: Write failing test
import pytest
from unittest.mock import AsyncMock
from app.models import Work  # assumed location of the Work model
from app.services.work_service import WorkService

class TestWorkService:
    def setup_method(self):
        self.repository = AsyncMock()
        self.sut = WorkService(repository=self.repository)

    @pytest.mark.asyncio
    async def test_create_work_valid_input_persists_to_repository(self):
        # Arrange
        self.repository.add.return_value = Work(id="1", title="Hamlet")

        # Act
        result = await self.sut.create(title="Hamlet", author="Shakespeare")

        # Assert
        assert result.title == "Hamlet"
        self.repository.add.assert_awaited_once()

Integration Test (pytest + httpx)

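import pytest
import pytest_asyncio
from httpx import AsyncClient, ASGITransport
from app.main import create_app

# pytest-asyncio's strict mode (the default) requires this decorator for async fixtures
@pytest_asyncio.fixture
async def client():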
    app = create_app(testing=True)
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as c:
        yield c

@pytest.mark.asyncio
async def test_create_work_endpoint_returns_201(client):
    # Act
    response = await client.post("/api/works", json={"title": "Hamlet", "author": "Shakespeare"})

    # Assert
    assert response.status_code == 201
    data = response.json()
    assert data["title"] == "Hamlet"

E2E Test (pytest + full workflow)

@pytest.mark.asyncio
async def test_create_and_retrieve_work_full_workflow(client):
    # Act — Create
    create_response = await client.post("/api/works", json={"title": "Hamlet", "author": "Shakespeare"})
    assert create_response.status_code == 201
    work_id = create_response.json()["id"]

    # Act — Retrieve
    get_response = await client.get(f"/api/works/{work_id}")

    # Assert
    assert get_response.status_code == 200
    assert get_response.json()["title"] == "Hamlet"

References

Test-Driven Development: By Example (Kent Beck)
github/awesome-copilot TDD agents
Vitest / Jest / pytest / xUnit documentation