name: tdd
version: 1.0.0
description: >
  Test-Driven Development methodology for Deno 2.x projects. Teaches the
  red-green-refactor cycle using Deno's built-in test runner and @std/assert.
  Use when writing new features, fixing bugs, or refactoring existing code in
  Triggerfish.
classification_ceiling: INTERNAL
Test-Driven Development with Deno
Write the test first. Watch it fail. Write the minimum code to pass. Refactor.
The TDD Cycle
RED Write a test for the behavior you want. Run it. It must fail.
GREEN Write the simplest code that makes the test pass. Nothing more.
REFACTOR Clean up duplication and improve structure. Tests stay green.
Never skip the red step. If the test passes before you write implementation code, either the test is wrong or the feature already exists.
When to Use TDD
- Adding a new function, module, or integration
- Fixing a bug (write a test that reproduces it first)
- Refactoring existing code (ensure tests exist before changing)
- Implementing a spec from PHASE_BREAKDOWN.md
Test Structure
Every test uses Deno.test(). Name tests as
"ComponentName: descriptive behavior":
import { assertEquals, assertExists } from "jsr:@std/assert";
Deno.test("PolicyEngine: evaluates allow rule for matching input", () => {
const engine = createPolicyEngine();
engine.addRule(allowRule);
const result = engine.evaluate(matchingInput);
assertEquals(result.action, "ALLOW");
});
For async tests:
Deno.test("SessionManager: create returns session with PUBLIC taint", async () => {
const mgr = await makeManager();
const session = await mgr.create({
userId: "u" as UserId,
channelId: "c" as ChannelId,
});
assertEquals(session.taint, "PUBLIC");
assertExists(session.id);
});
Assert Functions
Import from jsr:@std/assert:
import {
assert, // boolean truthiness
assertEquals, // deep equality (most common)
assertExists, // not null/undefined
assertMatch, // regex match
assertNotEquals, // deep inequality
assertRejects, // async function throws
assertStringIncludes, // substring match
} from "jsr:@std/assert";
| Function | Use When |
|---|---|
| assertEquals(actual, expected) | Comparing values, objects, arrays |
| assertExists(value) | Checking something is not null/undefined |
| assert(condition) | Simple boolean check |
| assertStringIncludes(str, sub) | Checking partial string content |
| assertRejects(fn, ErrorType?) | Testing async error paths |
| assertMatch(str, regex) | Pattern matching on strings |
Testing the Result Pattern
Triggerfish functions return Result<T, E> rather than throwing. Test both
paths:
// Success path
Deno.test("parseClassification: valid input returns ok Result", () => {
const result = parseClassification("RESTRICTED");
assertEquals(result.ok, true);
if (result.ok) {
assertEquals(result.value, "RESTRICTED");
}
});
// Error path
Deno.test("parseClassification: invalid input returns error Result", () => {
const result = parseClassification("INVALID");
assertEquals(result.ok, false);
if (!result.ok) {
assertStringIncludes(result.error, "Invalid classification");
}
});
Always narrow with if (result.ok) before accessing .value or .error.
TypeScript enforces this.
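For readers new to the pattern, a Result type can be modeled as a discriminated union on ok. The sketch below is an assumption about the shape, not Triggerfish's actual definition: the Result type, the LEVELS list, and the parseClassification body are all illustrative, and the level names are simply those that appear elsewhere in this document.

```typescript
// Hypothetical shape of a Result type; the real Triggerfish definition may differ.
type Result<T, E> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };

// Level names as they appear in this document; the list is assumed to be complete.
const LEVELS = ["PUBLIC", "INTERNAL", "CONFIDENTIAL", "RESTRICTED"] as const;
type ClassificationLevel = (typeof LEVELS)[number];

// Illustrative parser: returns an error Result instead of throwing.
function parseClassification(
  input: string,
): Result<ClassificationLevel, string> {
  const match = LEVELS.find((level) => level === input);
  return match !== undefined
    ? { ok: true, value: match }
    : { ok: false, error: `Invalid classification: ${input}` };
}

const parsed = parseClassification("RESTRICTED");
if (parsed.ok) {
  // Inside this branch the compiler knows `value` exists and `error` does not.
  console.log(parsed.value);
}
```

Because the two variants share only the ok field, reading .value or .error without checking ok first is a type error, which is why the tests above narrow before asserting.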
Test Helpers
Write local helper functions at the bottom of each test file. Common patterns:
Factory helper with overrideable defaults
function makeSession(taint: ClassificationLevel = "PUBLIC") {
let s = createSession({
userId: "u" as UserId,
channelId: "c" as ChannelId,
});
if (taint !== "PUBLIC") {
s = updateTaint(s, taint, "test setup");
}
return s;
}
Mock provider
function createMockProvider(
name: string,
response = "mock response",
): LlmProvider {
return {
name,
supportsStreaming: false,
async complete(_messages, _tools, _options) {
return {
content: response,
toolCalls: [],
usage: { inputTokens: 10, outputTokens: 5 },
};
},
};
}
Partial override helper
function makeAnswers(
overrides: Partial<WizardAnswers> = {},
): WizardAnswers {
return {
provider: "anthropic",
providerModel: "claude-sonnet-4-5",
apiKey: "",
agentName: "TestBot",
mission: "A test agent.",
...overrides,
};
}
Branded Type Casting
Triggerfish uses branded types for IDs. In tests, cast string literals:
const session = createSession({
userId: "u" as UserId,
channelId: "c" as ChannelId,
});
assertEquals(session.taint, "PUBLIC");
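As background, one common way to define branded types in TypeScript is to intersect the base type with a phantom property. The Brand helper and describeUser function below are hypothetical and only illustrate why the cast is needed; Triggerfish's own definitions of UserId and ChannelId may look different.

```typescript
// Hypothetical brand helper; Triggerfish's actual definitions may differ.
// `brand` exists only at the type level and is erased at runtime.
declare const brand: unique symbol;
type Brand<T, Name extends string> = T & { readonly [brand]: Name };

type UserId = Brand<string, "UserId">;
type ChannelId = Brand<string, "ChannelId">;

// Illustrative function that only accepts a UserId.
function describeUser(id: UserId): string {
  return `user:${id}`;
}

const userId = "u" as UserId; // the cast used in tests

console.log(describeUser(userId));
// describeUser("c" as ChannelId); // compile error: ChannelId is not a UserId
// describeUser("u");              // compile error: a plain string carries no brand
```

At runtime the branded value is still an ordinary string, so the cast costs nothing; it only tells the compiler which kind of ID the literal stands for.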
Temp Directory Cleanup
For tests that create files, use Deno.makeTempDir() with try/finally:
Deno.test("ExecTools: write creates file in workspace", async () => {
const tmpDir = await Deno.makeTempDir();
const ws = await createWorkspace({ agentId: "test", basePath: tmpDir });
try {
// `tools` is assumed to be an exec-tools instance bound to `ws`; its setup is not shown here.
const result = await tools.write("hello.txt", "world");
assertEquals(result.ok, true);
} finally {
await ws.destroy();
}
});
Never leave temp directories behind. The finally block runs even when
assertions fail.
Environment-Gated Tests
For integration tests requiring live credentials:
Deno.test({
name: "AnthropicProvider: real API call (integration)",
ignore: !Deno.env.get("ANTHROPIC_API_KEY"),
async fn() {
const provider = createAnthropicProvider({});
const result = await provider.complete(
[{ role: "user", content: "Say hello" }],
[],
{},
);
// Model output casing varies ("Hello" vs "hello"), so compare case-insensitively.
assertStringIncludes(result.content.toLowerCase(), "hello");
},
});
The ignore flag skips the test when the env var is missing. It runs in CI
where credentials are set.
Sanitizer Flags
Some SDKs (Slack, Discord) leak async ops on import. Disable sanitizers for those tests only:
Deno.test({
name: "Slack adapter: factory creates adapter",
sanitizeResources: false,
sanitizeOps: false,
async fn() {
const adapter = createSlackChannel({ botToken: "xoxb-fake", ... });
assertEquals(adapter.status().channelType, "slack");
},
});
Only use these flags when you understand why the leak occurs. Never use them to hide real bugs.
Running Tests
# Run all tests
deno task test
# Run tests for a specific module
deno task test tests/core/types/
# Run a single test file
deno task test tests/skills/skills_test.ts
# Watch mode (re-runs on file changes)
deno task test:watch
The test task passes all necessary permission flags, plus --no-check to skip type checking during test runs:
--allow-read --allow-write --allow-env --allow-ffi --allow-run --allow-net --allow-sys --no-check
Step-by-Step Walkthrough
Example: Adding a maxClassification() function
Step 1: RED -- Write the failing test
// tests/core/types/classification_test.ts
Deno.test("maxClassification: returns higher of two levels", () => {
assertEquals(maxClassification("PUBLIC", "CONFIDENTIAL"), "CONFIDENTIAL");
assertEquals(maxClassification("RESTRICTED", "INTERNAL"), "RESTRICTED");
assertEquals(maxClassification("PUBLIC", "PUBLIC"), "PUBLIC");
});
Run: deno task test tests/core/types/classification_test.ts
Result: FAIL -- maxClassification is not defined.
Step 2: GREEN -- Write minimal implementation
// src/core/types/classification.ts
export function maxClassification(
a: ClassificationLevel,
b: ClassificationLevel,
): ClassificationLevel {
return CLASSIFICATION_ORDER[a] >= CLASSIFICATION_ORDER[b] ? a : b;
}
Run the test again. Result: PASS.
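The GREEN step relies on a CLASSIFICATION_ORDER lookup that the walkthrough does not show. One way it could look is a record that maps each level to a numeric rank. The ranking below (PUBLIC lowest, RESTRICTED highest, INTERNAL below CONFIDENTIAL) is an assumption that is consistent with the assertions in Step 1, not a confirmed excerpt from the codebase.

```typescript
type ClassificationLevel =
  | "PUBLIC"
  | "INTERNAL"
  | "CONFIDENTIAL"
  | "RESTRICTED";

// Assumed ranking: a higher number means more sensitive.
const CLASSIFICATION_ORDER: Record<ClassificationLevel, number> = {
  PUBLIC: 0,
  INTERNAL: 1,
  CONFIDENTIAL: 2,
  RESTRICTED: 3,
};

// Same body as the GREEN step above.
function maxClassification(
  a: ClassificationLevel,
  b: ClassificationLevel,
): ClassificationLevel {
  return CLASSIFICATION_ORDER[a] >= CLASSIFICATION_ORDER[b] ? a : b;
}

console.log(maxClassification("PUBLIC", "CONFIDENTIAL")); // "CONFIDENTIAL"
```

Typing the lookup as Record<ClassificationLevel, number> means that adding a level to the union without giving it a rank fails type checking.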
Step 3: REFACTOR
The implementation is already minimal, so there is nothing to restructure. If
the function belongs to the module's public API, re-export it from the mod.ts
barrel. Run all tests to confirm nothing broke.
Deterministic Tests
All tests must produce the same result every time. Rules:
- No randomness (use deterministic test data)
- No external services (mock them)
- No time-dependent logic (inject clocks)
- No shared mutable state between tests
// Verify determinism explicitly
Deno.test("HookRunner: same input always produces same decision", async () => {
const runner = createHookRunner(engine);
const session = makeSession("CONFIDENTIAL");
const input = { target_classification: "PUBLIC" };
const r1 = await runner.run("PRE_OUTPUT", { session, input });
const r2 = await runner.run("PRE_OUTPUT", { session, input });
assertEquals(r1.allowed, r2.allowed);
assertEquals(r1.action, r2.action);
});
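The "inject clocks" rule can be sketched as follows: code that needs the current time takes a clock as a parameter, and tests pass one that never moves. The Clock interface, fixedClock, and isExpired are hypothetical names used only for illustration; none of them come from Triggerfish.

```typescript
// Hypothetical clock abstraction; none of these names come from Triggerfish.
interface Clock {
  now(): number; // milliseconds since the Unix epoch
}

// Production default: reads the real system time.
const systemClock: Clock = { now: () => Date.now() };

// Test double that always reports the same instant.
function fixedClock(ms: number): Clock {
  return { now: () => ms };
}

// Illustrative time-dependent logic that receives its clock as a parameter.
function isExpired(expiresAt: number, clock: Clock = systemClock): boolean {
  return clock.now() >= expiresAt;
}

const clock = fixedClock(1_000);
console.log(isExpired(999, clock)); // true
console.log(isExpired(1_001, clock)); // false
```

With the default parameter, production callers can omit the clock, while tests get the same answer on every run regardless of when they execute.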
Common Mistakes
| Mistake | Why It's Wrong | Fix |
|---|---|---|
| Writing code before the test | You don't know if your test actually catches failures | Write test, see it fail, then implement |
| Testing implementation details | Tests break when you refactor | Test behavior and outputs, not internals |
| Over-implementing in GREEN | Extra code has no test coverage | Write only what the current test requires |
| Using any in test helpers | Defeats TypeScript safety | Type your mocks with the real interfaces |
| Skipping the REFACTOR step | Technical debt accumulates | Always review after green; clean up |
| Ignoring failing tests | Broken tests erode trust | Fix immediately or delete if obsolete |