name: 02-test
description: Runs cabal test with streaming details and writes the output to the pipeline log.
kind: leaf
executor: script
model: claude-haiku-4-5-20251001
Test
Runs nix develop --command cabal test --test-show-details=streaming and writes the output to .pipeline/test.log.
Inputs
- None — the working directory must be the repo root.
Plan
- Confirm .pipeline/ exists → verify: directory exists.
- Run nix develop --command cabal test --test-show-details=streaming, redirecting stdout+stderr to .pipeline/test.log → verify: log file exists.
- Capture exit code → verify: integer captured.
- Propagate exit code → verify: non-zero exits surface to the caller.
Assumptions:
- The repo root is the current working directory.
- The build step has already passed; otherwise tests cannot link.
If any assumption fails, refuse — do not guess.
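The run, redirect, and exit-code steps of the plan can be sketched in POSIX shell roughly as follows. This is a minimal sketch, not the phase's actual script: run_logged is a hypothetical helper name, and the log path is passed as an argument so the helper can be tried with any command.

```shell
# Hypothetical helper (not part of the pipeline): run a command with
# stdout+stderr redirected to a log file and return the command's exit code.
run_logged() {
  run_logged_log=$1   # first argument: path of the log file
  shift               # remaining arguments: the command to run
  "$@" > "$run_logged_log" 2>&1
  return $?           # propagate the exit code of the command unchanged
}

# Intended use in this phase (shown as a comment; it needs nix and the repo):
#   run_logged .pipeline/test.log \
#     nix develop --command cabal test --test-show-details=streaming
#   status=$?
```

Returning the command's own status, rather than a fixed 0 or 1, is what lets non-zero exits surface to the caller.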
Steps
1. If .pipeline/ does not exist, refuse: "pipeline not initialised; run phase 01 first" and exit non-zero.
2. If nix is not on PATH (command -v nix), refuse: "nix not found; the repo's dev shell is required for cabal" and exit non-zero.
3. Run: nix develop --command cabal test all --test-show-details=streaming > .pipeline/test.log 2>&1.
4. Capture the exit code.
5. Snapshot rollover (deterministic). Before parsing the new run, if .pipeline/test-counts.json exists from the previous run, atomically rename it to .pipeline/test-counts.prev.json (overwriting any older snapshot). This guarantees the diff in step 7 always compares against a well-defined prior snapshot: no race conditions, no orphaned files.
6. Parse per-suite summary lines from .pipeline/test.log (the lines matching ^N examples, M failures(, K pending)?$ immediately above each Test suite <name>: PASS/FAIL marker). Emit a JSON object to .pipeline/test-counts.json keyed by suite name with fields {examples, failures, pending}.
7. If .pipeline/test-counts.prev.json exists (from step 5's rollover), diff against it and surface:
   - Any suite where pending increased → warning: "pending count grew for <name>; review whether real tests were demoted to pending without a fixture being lifted"
   - Any suite where pending decreased without a corresponding test-file diff → finding: "pending count dropped for <name> with no fixture work in this diff; suggests tests were deleted rather than satisfied"

   The check is a heuristic flag, not a refusal; the build-loop continues either way.
8. If exit (from step 4) is non-zero, surface the tail of .pipeline/test.log and exit non-zero.
9. If exit is zero, exit 0.
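The snapshot rollover and the per-suite parse could look something like the sketch below. It is illustrative only: rollover_counts and parse_counts are hypothetical names, the directory and log path are passed in as arguments, and the summary pattern is loosened to accept the singular forms "1 example" and "1 failure", which is an assumption beyond the pattern stated in the steps. Suite names are assumed to contain no characters that need JSON escaping.

```shell
# Hypothetical helper: move the previous counts file aside before a new parse.
# A rename within one directory is typically atomic on POSIX filesystems.
rollover_counts() {
  rollover_dir=$1
  if [ -f "$rollover_dir/test-counts.json" ]; then
    mv -f "$rollover_dir/test-counts.json" "$rollover_dir/test-counts.prev.json"
  fi
}

# Hypothetical helper: print a JSON object keyed by suite name, built from
# summary lines that sit immediately above a "Test suite <name>: PASS/FAIL" line.
parse_counts() {
  awk '
    # Summary line: remember its numbers and the line it appeared on.
    /^[0-9]+ examples?, [0-9]+ failures?(, [0-9]+ pending)?$/ {
      examples = $1
      failures = $3
      pending = (NF >= 6) ? $5 : 0
      summary_line = NR
      have_summary = 1
      next
    }
    # Suite marker: pair it with the summary only if that was the previous line.
    /^Test suite .+: (PASS|FAIL)$/ {
      if (have_summary && NR == summary_line + 1) {
        name = $0
        sub(/^Test suite /, "", name)
        sub(/: (PASS|FAIL)$/, "", name)
        entry = sprintf("  \"%s\": {\"examples\": %d, \"failures\": %d, \"pending\": %d}", name, examples, failures, pending)
        out = (out == "") ? entry : (out ",\n" entry)
      }
      have_summary = 0
    }
    END {
      if (out == "") print "{}"
      else printf "{\n%s\n}\n", out
    }
  ' "$1"
}

# Intended use in this phase (shown as a comment):
#   rollover_counts .pipeline
#   parse_counts .pipeline/test.log > .pipeline/test-counts.json
```

Running the rollover before the parse keeps the previous snapshot intact even if the new parse produces an empty object.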
Output
.pipeline/test.log and .pipeline/test-counts.json written; exit code propagated.
Refusals
- .pipeline/ missing → refuse: "pipeline not initialised; run phase 01 first".
- nix not on PATH → refuse: "nix not found; the repo's dev shell is required for cabal".
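As a rough illustration, the two refusals could be grouped into a single preflight check that runs before anything else. The function name preflight and its optional directory argument are hypothetical; writing the messages to stderr is also an assumption, since the spec does not say where refusals are printed.

```shell
# Hypothetical preflight check: print the refusal message and return non-zero
# if either precondition fails. The directory defaults to .pipeline.
preflight() {
  preflight_dir=${1:-.pipeline}
  if [ ! -d "$preflight_dir" ]; then
    echo "pipeline not initialised; run phase 01 first" >&2
    return 1
  fi
  if ! command -v nix >/dev/null 2>&1; then
    echo "nix not found; the repo's dev shell is required for cabal" >&2
    return 1
  fi
  return 0
}

# Intended use at the top of the phase (shown as a comment):
#   preflight || exit 1
```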