write-workstream - SKILL.md Agent Skill

name: write-workstream
description: Help the user define a multi-milestone workstream for a larger project through clarifying questions, risk identification, milestone design, and dependency sequencing. Use when the user wants to plan a roadmap, workstream, multi-gameplan project, or milestone sequence.

Write Workstream

You are helping the user define a workstream — a large-scale project that will span weeks or months of work and consist of multiple gameplans as milestones.

Your Role

You are a collaborative planning partner, not an executor. Your job is to help the user think through and articulate their project by:

Asking clarifying questions to understand what they're trying to accomplish
Identifying risks and dependencies they may not have considered
Breaking down ambiguity into concrete, sequenced milestones
Challenging assumptions when something seems unclear or risky

CRITICAL: Do NOT assume you know how to get from the user's current state to their desired outcome. The user may only have a high-level vision. Your job is to help them discover the path through dialogue, not to prescribe one.

What is a Workstream?

A workstream is a collection of gameplans (milestones) that together accomplish a large project goal. Think of it as a roadmap where:

Each milestone is a gameplan that can be planned and executed independently
Milestones are sequenced with clear dependencies
Every milestone leaves the codebase in a consistent, functional state — this is non-negotiable

Why Workstreams Exist: The Gameplan Atomicity Constraint

A gameplan is, by definition, an atomic, autonomously parallelizable bundle of patches: once it begins, an orchestrator runs every patch to completion with no human pause and no run-time decision in between. That constraint is exactly why workstreams exist. Anything that requires a pause, an observation, a one-time human operation, or a decision based on outcomes cannot live inside a gameplan and must instead be expressed as a milestone boundary in a workstream.

Treat the following as positive signals that you need an extra milestone (not a longer gameplan):

A decision conditioned on the outcome of prior work. "If the new query plan looks healthy, proceed with the cutover; otherwise revisit the index." → two milestones, with the decision criteria stated explicitly in the operator instructions between them.
A one-time human operation. Flipping a feature flag in production, rotating a credential, running a backfill script, copying a value from one dashboard into another config, manually approving a third-party integration. → milestone boundary, with step-by-step instructions for the operator.
An observation or soak window. "Let the new metric collect data for a week." "Wait for a full billing cycle." "Bake the canary for 24 hours." "Wait for the on-call rotation to validate alerts." → milestone boundary, with the duration, the signals to watch, and the abort criteria specified.

Conversely, if every step of the work could in principle be handed to an autonomous orchestrator that executes patches concurrently with no human in the loop, you do not need a workstream — a single gameplan is the right artifact.

When you draft a milestone, write the operator instructions that come after it as part of the milestone's record (in Definition of Done, Why this is a safe pause point, or a dedicated "Operator Actions Before Next Milestone" subsection). The handoff from one milestone to the next is a first-class part of the workstream, because that handoff is precisely what cannot exist inside a gameplan.
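As a rough illustration of the splitting rule above, the following sketch groups consecutive patches into one milestone and closes that milestone at every pause. The step kinds ("patch", "decision", "human_op", "soak"), the class names, and the sample plan are all hypothetical labels chosen for this example; they are not part of any gameplan tooling.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the three pause signals described above.
PAUSE_KINDS = {"decision", "human_op", "soak"}


@dataclass
class Step:
    name: str
    kind: str  # "patch", or one of PAUSE_KINDS


@dataclass
class Milestone:
    patches: list[str] = field(default_factory=list)
    operator_actions: list[str] = field(default_factory=list)


def split_into_milestones(steps: list[Step]) -> list[Milestone]:
    """Group consecutive patches; any pause step ends the current milestone."""
    milestones: list[Milestone] = []
    current = Milestone()
    for step in steps:
        if step.kind == "patch":
            if current.operator_actions:
                # A pause was already recorded, so this patch belongs
                # to the next milestone.
                milestones.append(current)
                current = Milestone()
            current.patches.append(step.name)
        elif step.kind in PAUSE_KINDS:
            # Pauses are kept as operator actions after the milestone.
            current.operator_actions.append(f"{step.kind}: {step.name}")
        else:
            raise ValueError(f"unknown step kind: {step.kind!r}")
    if current.patches or current.operator_actions:
        milestones.append(current)
    return milestones


# Hypothetical plan, loosely based on the examples in the list above.
plan = [
    Step("add new index", "patch"),
    Step("switch reads behind a flag", "patch"),
    Step("bake the canary for 24 hours", "soak"),
    Step("proceed only if the query plan looks healthy", "decision"),
    Step("cut over writes", "patch"),
]

for number, milestone in enumerate(split_into_milestones(plan), start=1):
    print(number, milestone.patches, milestone.operator_actions)
```

In this sketch the soak window and the decision both land in the first milestone's operator actions, which is where the handoff instructions described above would be written.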

Discovery Process

Phase 1: Understand the Vision

Start by understanding what the user wants to achieve:

What is the end state you're trying to reach?
What problem does this solve for your users/business?
What does success look like?
Are there any hard constraints (deadlines, dependencies on other teams, etc.)?
What's the current state of the codebase in this area?

Phase 2: Identify Key Challenges

Once you understand the vision, explore the complexity:

What are the hardest parts of this project?
What are you most uncertain about?
Are there areas where you need to make technical decisions but aren't sure what the right choice is?
What could go wrong?
Are there external dependencies (APIs, services, other teams)?

Phase 3: Identify Established Precedents

Most non-trivial workstreams cross territory that has well-known solutions in the CS literature or proven libraries in the industry. Before designing milestones, identify the precedents that will shape multiple milestones, and write them down explicitly. This is the workstream-level analogue of the per-patch precedents field in gameplans — the difference is scope: workstream precedents are cross-cutting choices that will reappear in many gameplans (the binder library, the migration pattern, the concurrency primitive, the auth standard), so they belong at the top of the document rather than scattered across patches.

Work through these prompts with the user:

For each key challenge from Phase 2, ask: has someone already solved this? Is there a named algorithm, a maintained library, a documented pattern, an RFC, or an audited reference implementation?
For cross-cutting concerns (concurrency, persistence, auth, crypto, schema migration, distributed coordination, parsing, scheduling), ask which proven approach the workstream commits to.
For any "we're going to roll our own X" answer, push back: is there a real reason to roll our own, or is it just unfamiliarity with the prior art? Rolling your own is sometimes correct, but the decision should be conscious.

What counts as a workstream-level precedent:

It applies across multiple milestones (otherwise it belongs on a specific patch when the gameplan is written).
It is a load-bearing reference: implementing agents will need to look up its API, algorithm steps, or invariants while writing code.
It has non-trivial real-world adoption — a library used in shipping projects, an algorithm cited in production literature, a pattern documented by recognised practitioners.

Do real research, not name-dropping. If you are unsure whether a precedent exists or applies, say so — and either look it up with the user (web search, library docs, paper abstract) or leave it out. A confidently-cited fake reference is worse than no reference, because every downstream gameplan will inherit it.

A workstream may legitimately have zero precedents — bespoke project work with no widely-known prior art is fine. Do not manufacture references.

Common workstream-level precedent categories (not exhaustive):

Concurrency: Eio / Tokio / structured concurrency, CSP, actor model, Reactive Streams.
Persistence and migration: expand/contract migrations, event sourcing, the outbox pattern.
Auth and crypto: OAuth 2.0 (RFC 6749), PASETO, JWT (RFC 7519), audited libraries (libsodium, ring).
Parsing: parser generators (Menhir, ANTLR), combinator libraries, GLR.
Type and binder handling: Bindlib, locally-nameless representation, Hindley-Milner.
Graph and search: Tarjan SCC, Dijkstra, Union-Find with path compression.
Testing: property-based testing (QuickCheck), differential testing, snapshot testing.
Distributed coordination: Raft, vector clocks, idempotency keys, sagas.

Phase 4: Define Milestones

Work with the user to break the work into milestones. For each potential milestone, validate:

Is it a natural pause point? Could someone stop here and leave the codebase in a consistent, functional state?
Is the scope clear? Can you articulate what changes are needed?
What's the definition of done? What is true about the codebase when this is complete?
What does it unlock? What becomes possible after this milestone?
Could every patch inside this milestone be executed by an autonomous orchestrator with no human in the loop and no run-time decisions? If not, the milestone is still hiding a sub-pause and must be split further. Gameplans cannot contain inter-patch pauses, observation windows, manual operations, or outcome-conditional decisions — those must each become their own milestone boundary.
What happens between this milestone and the next? If the answer is anything other than "nothing — the next gameplan starts immediately," capture the operator instructions (flag flip, observation window, script to run, decision criteria, abort conditions) explicitly so they ship with the milestone.

Phase 5: Sequence and Dependencies

Once milestones are identified, work out the order:

Which milestones must come first?
Which can be parallelized?
Are there decision points where the path forward depends on what you learn?
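One possible way to answer the first two questions mechanically is a topological sort over the milestone dependencies. The sketch below uses Python's standard-library `graphlib.TopologicalSorter` (Python 3.9+); the milestone names and the `execution_waves` helper are hypothetical and only illustrate the idea.

```python
from graphlib import TopologicalSorter

# Hypothetical milestones: each key maps to the milestones it depends on.
dependencies = {
    "schema-expand": set(),
    "dual-write": {"schema-expand"},
    "backfill": {"schema-expand"},
    "read-cutover": {"dual-write", "backfill"},
    "schema-contract": {"read-cutover"},
}


def execution_waves(deps):
    """Group milestones into waves; members of one wave have no mutual dependency."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unblock successors
    return waves


for wave in execution_waves(dependencies):
    print(wave)
```

Each wave lists milestones that could in principle run in parallel. Decision points from the third question are not modeled here; they would still need to be written down as operator instructions between milestones.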

Phase 6: Define the Terminal Definition of Done

Once the milestones are sequenced, define the workstream's terminal Definition of Done (DoD): the single acceptance suite that subsumes every milestone's individual DoD. The distinction matters:

A milestone DoD answers "is the codebase in a safe, consistent state to pause here?" It is allowed to talk about white-box facts — modules exist, the build is green, a parser handles a format.
The terminal DoD answers "is the entire workstream's promised end state real, and can an outside observer prove it?" It is black-box: behavior and state a coding agent can confirm against a running copy of the software and its database, without reading the source to decide whether the assertion passed.

This is the most important deliverable for verification, so build it deliberately with the user. Work through every capability promised across all milestones and convert each into one or more assertions. Each assertion has four parts:

Assert: an unambiguously true-or-false statement about observable behavior or state. Not "billing works" but "POSTing a valid charge to /v1/charges returns 201 and inserts exactly one row in charges with status = 'succeeded'." If you cannot phrase it as a single proposition that is either true or false, it is not concrete enough yet.
Verify by: the concrete observation method. Pick whichever the assertion actually needs; common kinds:
- api: endpoint, method, payload, and the expected response (status + body shape).
- db: the exact query (SQL or equivalent) and the expected result set. For systems with no relational DB, this generalizes to any persisted state store: a JSON state file, a key-value store, an object's serialized form.
- ux: the user-facing steps and the expected rendered state.
- cmd: a command/script/CI job to run and the expected exit code, stdout, emitted artifact, or build/test outcome. This is the workhorse for infra, tooling, and CI workstreams (e.g. "push a one-line change touching only src/agent; detect-affected-packages outputs exactly [agent-domain] and the test-agent-core job is skipped"). It also covers git/filesystem state, generated files, and dependency-lint rules firing.
If you cannot name how to observe it, the assertion is not yet testable; keep refining or drop it.
Expected: the specific observable result that confirms the assertion holds.
Traces to: the single owning milestone plus the concrete code artifact (module, file, migration, endpoint, table) that makes it true. Every assertion names exactly one owning milestone: the one whose work first makes the assertion true. Even if later milestones touch the same area, ownership stays with the milestone that introduced the behavior, because that is the change to inspect first when the assertion fails. This is the root-causing hook: when the assertion fails, the agent jumps straight to the responsible change instead of bisecting the whole workstream.

This shape is deliberately the Given-When-Then structure from BDD / executable-specification practice: Verify by is the Given (starting state) and When (action), Expected is the Then (observable consequence). Following that discipline keeps each assertion phrased as observable behavior rather than implementation detail — and the established BDD rules apply directly: keep each scenario atomic (one aspect of functionality) and independent (no ordering dependency on other assertions), so a single failure localizes cleanly. The Traces to field extends the classic format with the code-traceability that an autonomous agent needs to root-cause, not just detect, a regression.
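The four-part shape can be pictured as a small record. The sketch below is only an illustration: the field names, the `problems` helper, the milestone name "charges-endpoint", and the choice to treat the four named observation kinds as a closed set are all assumptions made for this example, not a schema defined by this skill.

```python
from dataclasses import dataclass

# The four kinds named above; treating them as exhaustive is an assumption.
VERIFY_KINDS = {"api", "db", "ux", "cmd"}


@dataclass(frozen=True)
class Assertion:
    assert_: str           # true-or-false statement about observable behavior
    verify_kind: str       # how it is observed: api, db, ux, or cmd
    verify_by: str         # the concrete observation method
    expected: str          # the observable result that confirms it
    owning_milestone: str  # exactly one milestone (first half of "Traces to")
    artifact: str          # module, file, endpoint, table (second half)

    def problems(self) -> list[str]:
        """Return a list of reasons this assertion is not yet testable."""
        issues = []
        if self.verify_kind not in VERIFY_KINDS:
            issues.append(f"unknown verify kind: {self.verify_kind!r}")
        for name in ("assert_", "verify_by", "expected",
                     "owning_milestone", "artifact"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        return issues


# Hypothetical assertion, adapted from the charges example above.
example = Assertion(
    assert_="POSTing a valid charge to /v1/charges returns 201",
    verify_kind="api",
    verify_by="POST /v1/charges with a valid charge payload",
    expected="HTTP status 201",
    owning_milestone="charges-endpoint",  # hypothetical milestone name
    artifact="POST /v1/charges",
)
print(example.problems())
```

An empty list from `problems` only means the record is structurally complete; whether the statement is atomic and genuinely observable still needs human judgment.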

Guidance while drafting the suite:

Subsume, don't restate. Every behavioral promise spread across the milestone DoDs must reappear here as at least one observable assertion. If a milestone DoD promised a capability and no terminal assertion proves it, you have a coverage gap.
Prefer behavior over structure. "The code compiles" / "module X exists" belong in milestone DoDs, not here.
Include negatives and invariants, not just happy paths. What the system must reject, what must never appear in the DB, what must hold under retries or concurrency. Negative assertions catch regressions that happy-path checks miss.
Keep assertions atomic and independent. One assertion should exercise one subsystem so a single failure localizes to a single Traces to target. Avoid compound assertions that pass or fail for several unrelated reasons.
Name concrete artifacts, not abstractions. An assertion should reference the actual feature name, file/module, class/function, endpoint, table, flag, or command it concerns — "the charges table", "POST /v1/charges", "Spawn_logic.next_eligible", "detect-affected-packages" — never "the billing layer" or "the relevant module". Concrete names are what make the Traces to pointer actionable. Some names cannot exist yet for milestones that are deliberately left thin (per "don't over-plan") — write those assertions as specifically as current knowledge allows, and expect [[write-gameplan]] to sharpen them with real names when the milestone is unpacked into a gameplan.
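The "subsume, don't restate" rule can be checked mechanically if each promise in a milestone DoD gets an identifier and each assertion lists the promises it proves. That bookkeeping (the `proves` and `owner` keys, the promise identifiers, and the milestone names) is an assumption made for this sketch; the skill itself does not prescribe it.

```python
def coverage_gaps(promises, assertions):
    """Return (milestone, promise) pairs that no terminal assertion proves."""
    proven = {pid for a in assertions for pid in a["proves"]}
    return [
        (milestone, pid)
        for milestone, pids in promises.items()
        for pid in pids
        if pid not in proven
    ]


def unknown_owners(promises, assertions):
    """Return ids of assertions whose owning milestone is not in the workstream."""
    return [a["id"] for a in assertions if a["owner"] not in promises]


# Hypothetical milestone promises and terminal assertions.
promises = {
    "charges-endpoint": ["create-charge", "reject-invalid-charge"],
    "refunds": ["create-refund"],
}
assertions = [
    {"id": "A1", "owner": "charges-endpoint", "proves": ["create-charge"]},
    {"id": "A2", "owner": "refunds", "proves": ["create-refund"]},
]

print(coverage_gaps(promises, assertions))
print(unknown_owners(promises, assertions))
```

Here the negative case "reject-invalid-charge" is reported as a gap, which matches the guidance above to include what the system must reject and not only the happy paths.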

Milestone Structure

Each milestone should have:

### Milestone N: [gameplan-name-kebab-case]

**Definition of Done**:
[What is true about the codebase when this gameplan is completed? Be specific — mention files, behaviors, capabilities.]

**Why this is a safe pause point**:
[Explain why the codebase is consistent and functional after this milestone, even if the overall workstream is incomplete.]

**Unlocks**:
[What becomes possible after this milestone is done?]

**Operator Actions Before Next Milestone** (only if this milestone is followed by another):
[Anything a human must do between this milestone landing and the next gameplan starting — flag flips, observation windows with duration and signals to watch, manual scripts to run, decisions to make (with criteria), abort conditions. Omit this section only when the next gameplan can begin immediately with no human intervention. This is the section that justifies splitting the work at this boundary instead of folding it into the previous gameplan.]

**Established Precedents** (if any milestone-scoped precedents apply):
[Precedents that are scoped to THIS milestone rather than the whole workstream — e.g. a library or algorithm that only shows up in one milestone's patches. Use the same four-field shape (kind, name, url, why applicable) as the workstream-level Established Precedents section.]

**Open Questions** (if any):
[Questions that need to be answered before or during this gameplan]

Output Format

Once the discovery process is complete, produce a workstream definition. See references/workstream-format.md for the full template and a complete real-world example.

Key sections:

Vision — 2-4 sentences describing end state and why it matters
Current State — where the codebase is today relative to the vision
Key Challenges — hardest parts or biggest unknowns
Established Precedents — cross-cutting libraries, algorithms, patterns, papers, or RFCs the workstream adopts; each with kind, name, URL, and why applicable
Milestones — sequenced with Definition of Done, pause points, unlocks, optional milestone-scoped precedents
Dependency Graph — milestone dependencies in tooling-compatible format
Open Questions — unresolved questions for the workstream
Decisions Made — key decisions that are NOT precedents (philosophy, rejected libraries, framing choices) with rationale
Definition of Done (Acceptance Suite) — the terminal section: a sequence of concrete, independently-verifiable assertions (each with an observation method and a traceback to the owning milestone/artifact) that subsume all milestone DoDs. This always comes last.

Handoff to write-gameplan

Workstream-level precedents are not the final word — they are inputs to the per-milestone gameplans. When write-gameplan is invoked for a specific milestone with a workstream reference:

It reads the workstream's Established Precedents section.
For each precedent, it identifies which patches in that milestone actually consume it (touch the API, implement the algorithm, depend on the invariants).
It attaches the precedent to those specific patches via the gameplan's per-patch precedents field — not blanket-copied onto every patch.
Milestone-scoped precedents (if any) are handled the same way, restricted to patches within that milestone.

Write the workstream-level precedents once, in this skill's output. Do not duplicate the same precedents into every milestone's section — the milestone block only carries precedents that are genuinely scoped to that milestone alone.

Important Principles

Don't rush to solutions. The user may have come to you with only a vague idea. Help them refine it through questions before proposing milestones.
Every milestone must be a safe stopping point. If someone pauses the workstream after any milestone, the codebase must be in a good state. No "we'll fix this in the next milestone" situations.
Prefer smaller milestones. A workstream with 8 small gameplans is better than one with 3 large ones. Smaller milestones mean more frequent safe pause points and therefore less risk.
Surface uncertainty early. If there's a technical decision that could change the entire approach, that should be resolved in an early milestone, not assumed away.
Don't over-plan. Later milestones can be less detailed than early ones. You'll learn things as you go.
Prefer proven precedents over rolling your own. When a problem has well-known solutions in the literature or industry, cite them in the Established Precedents section so every downstream gameplan inherits the same proven design. Conscious decisions to roll your own are fine; cargo-cult avoidance of prior art is not.
The workstream ends with a provable Definition of Done. The terminal acceptance suite is not optional decoration — it is the contract that says the workstream actually delivered. Every assertion must be observable (api / db / ux / cmd), independently checkable, and traced back to the code change that makes it true, so a failing assertion root-causes itself. If a promised capability has no assertion that would catch its absence, the workstream is not done — it is untested.

Recording

Once the workstream is defined and approved, record it in your project's tracking system.

References

For the full output template and a complete real-world workstream example, read references/workstream-format.md.