name: contract-test-writer description: Generates consumer + provider contract tests from an OpenAPI (2.x/3.x) or GraphQL SDL file, and diffs the current spec against the previous one to classify breaking changes (MAJOR/MINOR/PATCH). Emits contract.test.ts + contract-report.md. MAJOR diffs block the release gate. PIPELINE-1 step 4. allowed-tools: Read Grep Glob Write context: fork
Contract Test Writer
An L1 Truth-Validation skill. Its job is to turn an API specification into two things at once:
- Runnable contract tests — provider-side tests that assert the real implementation matches the spec, plus consumer-side request-shape snapshots for any declared consumer.
- A semantic diff of the spec itself — every change since the previous version classified as MAJOR, MINOR, or PATCH, with the rationale rendered in the standard explainability contract.
When You're Invoked
- During PIPELINE-1 step 4, in parallel with
component-test-writer,business-rule-validator, andtest-data-manager. - On demand as
/vibeflow:contract-test-writer <spec path>. - Automatically from
release-decision-enginewhen a spec file changed in the diff since the last release tag.
Input Contract
| Input | Required | Notes |
|---|---|---|
| Spec file | yes | OpenAPI 3.x (preferred), 2.x, or GraphQL SDL. Path arg or .vibeflow/artifacts/api-spec.yaml. |
| Previous spec | optional but preferred | .vibeflow/artifacts/contracts/previous/<name> — if absent, the skill skips breaking-change detection and reports "baseline established" instead. |
scenario-set.md |
optional | When present, scenario ids link to the specific operations they exercise (SC → operationId). |
| Consumer registry | optional | .vibeflow/artifacts/contracts/consumers.json — a list of named downstream consumers with their expected operation set. |
| Framework config | derived | Read from repo-fingerprint.json / ci_analyze_structure — reuses the same detection path as component-test-writer (vitest |
Hard preconditions — refuse to run with a single blocks-merge finding rather than producing wrong tests:
- The spec must parse cleanly. A malformed spec is not a test target — fix the spec first.
- Every referenced
$refmust resolve. Dangling refs are a hard block; we will not silently emitanyin their place. - Every operation must have an
operationId. No operationId → no stable test name → no traceability. Emit a blocker with remediation: "add operationId to".
Algorithm
Step 1 — Identify spec format
- If file is
.jsonor.yaml/.ymlwith a top-levelopenapi:orswagger:key → OpenAPI (version from the key). - If file is
.graphql/.graphqlsor containsschema { query ... }at top level → GraphQL SDL. - Otherwise refuse with "unsupported spec format".
See references/spec-parsers.md for per-format notes (how to resolve
$ref, how allOf/oneOf get normalized, how to walk a GraphQL type
graph without repeating visits).
Step 2 — Normalize operations
Flatten the spec into a canonical list of operations:
interface CanonicalOperation {
id: string; // operationId (OpenAPI) or Query/Mutation field name
verb: string; // HTTP verb for OpenAPI; "query"/"mutation" for GraphQL
path: string; // path template or GraphQL root field name
requestSchema: JsonSchema | null;
responseSchemas: Record<string /*status code*/, JsonSchema>;
requiredHeaders: string[];
auth: "none" | "bearer" | "basic" | "custom";
tags: string[]; // for grouping
}
For GraphQL, status codes map to {"200": responseType, "4xx": errorUnion}
so the downstream test shape is identical.
Step 3 — Generate provider contract tests
For each operation, emit one describe(operation.id, () => { ... })
block with the following cases:
- Happy path — valid request body (generated from the request
schema via
test-data-managerif it exists; otherwise inline a minimal example from the spec'sexamplefield; fall back toit.skipwith apending: "awaiting example"note). - Required-field omission — one case per required request field, asserting the response matches the 4xx schema.
- Auth missing — one case asserting the 401/403 schema when auth is required.
- Response shape — every declared status code gets a single assertion that the response body validates against its schema.
Each generated it(...) follows the Arrange-Act-Assert shape defined
in ../component-test-writer/references/test-patterns.md. The titles
prefix with the operationId and, when available, the scenario id:
it("SC-112 / getUserById: responds 200 with UserSummary", ...)
Step 4 — Generate consumer request snapshots
If consumers.json is present, for each consumer+operation pair emit
a JSON snapshot under
.vibeflow/artifacts/contracts/consumers/<consumerName>/<operationId>.json
capturing the minimal valid request the consumer MUST be able to send.
The snapshot includes the request body, path params, query params,
and required headers. Consumer-side tests (owned by the consumer
repo) load these snapshots and assert their own request builder can
reproduce them verbatim.
If consumers.json is absent, skip this step — do NOT invent
consumer names.
Step 5 — Breaking-change diff
If a previous spec exists at .vibeflow/artifacts/contracts/previous/<name>:
- Parse both, normalize via Step 2.
- Walk the cartesian product (old operations × new operations) keyed
by
operationId. - For each diff, classify via
references/breaking-change-rules.md. - Record every diff as a finding in the standard explainability
shape
{ finding, why, impact, confidence, evidence }.
Classification must come from the rules table — no ad-hoc severity. If a diff shape is not in the table, add it to the table first (and update the harness sentinel); do not silently downgrade to "unknown".
Step 6 — Compute the verdict
majorBreakingChanges = findings.filter(f.severity == "MAJOR").length
minorChanges = findings.filter(f.severity == "MINOR").length
patchChanges = findings.filter(f.severity == "PATCH").length
Verdict rules:
| Condition | Verdict |
|---|---|
majorBreakingChanges == 0 |
APPROVED |
majorBreakingChanges > 0 but every MAJOR has an explicit migration note in the spec (x-vibeflow-migration) |
NEEDS_REVISION |
majorBreakingChanges > 0 and at least one lacks a migration note |
BLOCKED |
The gate contract is MAJOR breaking changes block the release —
no other condition can produce BLOCKED and no condition can suppress it.
Step 7 — Write outputs
contract.test.ts— next to the spec file. Uses the same@generated-bybanner +@generated-start/@generated-endmarkers ascomponent-test-writer, so regeneration preserves any hand-written additions outside the skill-owned region..vibeflow/reports/contract-report.md— the breaking-change report (see output contract below)..vibeflow/artifacts/contracts/previous/<name>— copy the current spec here on successful APPROVED run, so the next invocation has a fresh baseline. NEVER overwrite the baseline on a BLOCKED verdict.
Output Contract
contract-report.md
# Contract Diff Report
## Summary
- Spec: <path>
- Previous baseline: <path or "none — first run">
- Verdict: [APPROVED|NEEDS_REVISION|BLOCKED]
- MAJOR: N
- MINOR: M
- PATCH: K
- Operations evaluated: X
- Operations added: A
- Operations removed: R
## MAJOR (gate-blocking)
For each:
- **finding**: <diff description>
- **why**: <which rule from breaking-change-rules.md>
- **impact**: <who breaks and how>
- **confidence**: <0..1>
- **evidence**: <old line → new line>
- **mitigation**: <concrete migration path; never "rewrite the API">
- **migration note present**: yes | no
## MINOR
<same shape>
## PATCH
<same shape>
## Added operations
- <verb> <path> (operationId: <id>)
## Removed operations
- <verb> <path> (operationId: <id>)
Explainability Contract
Every finding — MAJOR, MINOR, PATCH, added, removed — MUST carry
finding / why / impact / confidence / evidence. The classification
must cite a rule id from references/breaking-change-rules.md.
Undocumented classifications are forbidden.
Non-Goals
- Does NOT validate the spec against the actual running service — that's an integration test, not a contract test.
- Does NOT generate client SDKs — we emit tests, not stubs.
- Does NOT reorder operations or rename anything in the spec. Diffs are observed, never silently fixed.
Downstream Dependencies
release-decision-engine— readsmajorBreakingChangesas a hard blocker signaltraceability-engine— links operationId ↔ test ids ↔ scenario idstest-priority-engine— uses the list of changed operations to rank affected tests for the next test run