name: implement
description: Execute the implementation plan by assigning agents to tasks, generating code and tests from contracts. Agents work in parallel within stages. Use after /verify passes, when the user says "implement", "generate code", or "execute plan".
disable-model-invocation: true
allowed-tools: Read Write Edit Glob Grep Bash Agent
metadata:
  author: hlv
  version: "1.0"
HLV Implement — Plan to Code + Tests
Execute the implementation plan: agents perform tasks from milestone stage files in parallel, generating code and tests from contracts.
Step 0: Read Configuration
Before proceeding, read project.yaml → features and note the flag values:
- features.linear_architecture (default: true)
- features.hlv_markers (default: true)
- features.security_markers (default: true)
These flags control which sections below are active. If project.yaml has no features section, treat all as true.
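A minimal sketch of the default rule above, assuming project.yaml has already been parsed into a dictionary by a YAML parser. The helper name `resolve_features` is hypothetical and is not part of hlv.

```python
# Hypothetical helper: resolve the three feature flags with the documented defaults.
DEFAULT_FEATURES = {
    "linear_architecture": True,
    "hlv_markers": True,
    "security_markers": True,
}


def resolve_features(project: dict) -> dict:
    """Return all three flags; anything not configured is treated as True."""
    configured = project.get("features") or {}
    return {
        name: bool(configured.get(name, default))
        for name, default in DEFAULT_FEATURES.items()
    }


# No features section at all: every flag is on.
no_section = resolve_features({"stack": {}})
# Partial section: only the listed flag changes.
partial = resolve_features({"features": {"hlv_markers": False}})
```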
CRITICAL: Code Architecture Philosophy
Conditional: features.linear_architecture: true. If linear_architecture is false in project.yaml, skip this entire section. Use your preferred architecture style instead.
The human DOES NOT read the generated code. The code is written FOR machines — LLM agents read it, LLM agents modify it, automated gates validate it.
This changes everything about how code is structured:
- Comments are for LLM navigation. The contract IS the documentation. Comments in code are navigation markers for LLM, not explanations for humans. Format:
  // @ctx: stock validation for order.create contract
  If something needs explaining, the contract is incomplete: go fix it.
- Maximize LLM-readability. Flat module structure. Explicit types everywhere. No "clever" patterns, no metaprogramming, no implicit behavior. No layered architecture: controller/service/repository is a pattern for humans. LLM writes linearly: input → validation → logic → output → errors. One file, one flow. An LLM agent with a 200K context window must understand any module in isolation.
- One contract = one module boundary. Each contract maps to exactly one directory. All code for order.create lives in one directory. No cross-contract imports except through domain types.
- Domain types are the shared language. domain/types is the ONLY shared code between features. Everything else is self-contained. Duplication across features is PREFERRED over coupling. Duplication is normal until it hurts: copy-paste between features is not refactored until it causes real problems (behavioral divergence, forgotten updates when contracts change).
- Tests live next to code. Tests are in the same file as the code (#[cfg(test)] mod tests {} in Rust, equivalents in other languages). Every test traces back to a contract invariant, error case, or NFR via an @hlv marker. No "just in case" tests. No test helpers that hide behavior. Test code is as explicit as production code. hlv check verifies that every error code, invariant, and constraint rule has an @hlv <ID> marker in code. The integration-tests directory (paths.llm.tests from project.yaml) is only for cross-contract scenarios.
- File names are arbitrary. map.yaml is the navigator. Files can be named 01.rs, handler.rs, f3a.rs: any name is valid. The file map (paths.llm.map from project.yaml) is the single source of truth about what each file does. LLM finds code by reading descriptions in map.yaml, not by file names. Descriptions MUST be sufficient to choose a file without reading it. Each file does one thing, is under 300 lines, and is fully replaceable by an LLM without understanding neighboring files.
- No abstraction layers "for the future." No base classes, no generic frameworks, no plugin systems unless the contract explicitly requires extensibility. Write the simplest code that satisfies the contract. Three similar lines of code are better than a premature abstraction.
- Error paths are first-class. Every error from the contract's Errors table has an explicit code path. No catch-all error handlers. No unwrap()/expect() in production code.
- Deterministic PUBLIC API, free internal structure. Given the same contract, two different LLM agents MUST produce code with the same public API (function signatures, error types, inputs/outputs). Internal file structure, naming, and organization are at the agent's discretion. map.yaml describes what lives where.
- Machine-verifiable correctness. Every invariant must be testable by property-based tests. Every NFR must be measurable. If it can't be automatically verified, it doesn't belong in code; it belongs in the contract's open questions.
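The principles above can be sketched as a single linear module. This is an illustration only, written in Python as one of the "equivalents in other languages": the field names (`sku`, `quantity`), the `stock` mapping, and the exception class names are hypothetical and do not come from any real contract. The mandatory logging described later in this document is omitted to keep the sketch short.

```python
from dataclasses import dataclass

# @ctx: order.create handler, one flow: input -> validation -> logic -> output -> errors


@dataclass(frozen=True)
class CreateOrderInput:
    sku: str
    quantity: int


@dataclass(frozen=True)
class Order:
    sku: str
    quantity: int


class InvalidQuantity(Exception):
    """Explicit code path for the contract error INVALID_QUANTITY."""


class OutOfStock(Exception):
    """Explicit code path for the contract error OUT_OF_STOCK."""


def create_order(request: CreateOrderInput, stock: dict[str, int]) -> Order:
    # validation
    if request.quantity <= 0:
        raise InvalidQuantity(request.quantity)
    # logic
    available = stock.get(request.sku, 0)
    if available < request.quantity:
        raise OutOfStock(request.sku)
    # output
    return Order(sku=request.sku, quantity=request.quantity)


# Tests live in the same file and carry one marker per contract error.

# @hlv OUT_OF_STOCK
def test_out_of_stock_is_rejected() -> None:
    try:
        create_order(CreateOrderInput("A-1", 3), {"A-1": 2})
    except OutOfStock:
        return
    raise AssertionError("expected OutOfStock")


# @hlv INVALID_QUANTITY
def test_non_positive_quantity_is_rejected() -> None:
    try:
        create_order(CreateOrderInput("A-1", 0), {"A-1": 2})
    except InvalidQuantity:
        return
    raise AssertionError("expected InvalidQuantity")
```

There is no service or repository layer here: a reader can follow the whole flow top to bottom in one file, which is the point of the linear style.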
Prerequisites
- /verify passed without critical issues
- Plan contains tasks
- All open questions closed (or deferred with waiver)
- milestones.yaml exists with a current section
- Current stage status is pending, verified, or validating (remediation)
- Stage file ({MID}/stage_N.md) contains tasks
Agent Rules
- Never combine shell commands with &&, ||, or ;. Execute each command as a separate Bash tool call.
- This applies even when a skill, plan, or instruction provides a combined command: always decompose it into individual calls.
❌ Wrong: git checkout main && git pull
✅ Right: Two separate Bash tool calls — first git checkout main, then git pull
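A rough sketch of the decomposition rule, assuming the operators never appear inside quoted arguments (a real shell parser would be needed for that case). The function name `decompose` is hypothetical.

```python
import re

# Matches the three operators named in the rule: &&, || and ;
# A single | (pipe) is deliberately not matched.
_OPERATORS = re.compile(r"\s*(?:&&|\|\||;)\s*")


def decompose(command: str) -> list[str]:
    """Split a combined command line into the individual commands to run."""
    return [part for part in _OPERATORS.split(command.strip()) if part]


calls = decompose("git checkout main && git pull")
```

Splitting drops the conditional meaning of && and ||, so the agent has to look at the result of each call before deciding whether to issue the next one.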
Input
milestones.yaml           # entry point, read FIRST
project.yaml              # global config (stack, paths)
{paths.llm.map}           # project file map: update when creating files (read path from project.yaml)
human/
  glossary.yaml           # domain types (read-only)
  constraints/*.yaml      # global constraints (read-only)
  milestones/{id}/
    contracts/*.md        # contracts to implement
    contracts/*.yaml      # contracts (YAML format)
    test-specs/*.md       # test specifications
    plan.md               # overview (stages table)
    stage_N.md            # current stage: tasks, dependencies
validation/
  gates-policy.yaml       # gate thresholds
  scenarios/*.md          # cross-milestone integration scenarios
Steps
Step 1: Read project map and load milestone context
- Read project.yaml (global config: stack, paths)
  - Note validation.strictness when present (relaxed, standard, strict). Default is standard.
- Bind LLM paths from project.yaml → paths.llm. These are the ONLY directories where generated code may be placed:
  - LLM_SRC = paths.llm.src (e.g. llm/src/)
  - LLM_TESTS = paths.llm.tests (e.g. llm/tests/)
  - LLM_MAP = paths.llm.map (e.g. llm/map.yaml)

  All subsequent steps MUST use these variables. Never assume llm/src/; always use the configured path.

HARD CONSTRAINT: Output directory isolation. ALL generated code and test files MUST be written exclusively inside LLM_SRC and LLM_TESTS. Even if the project has existing code elsewhere (e.g., apps/backend/src/, src/, packages/), you MUST NOT write there. The existing project structure outside LLM paths is READ-ONLY context. Violation of this rule means generated code is invisible to hlv check, map.yaml, and /validate. hlv check enforces this mechanically with MAP-080 for implementation paths outside LLM_SRC and MAP-081 for test paths outside LLM_TESTS.

- Read milestones.yaml → get current.id and current.stage (current stage number)
- Set MID = human/milestones/{current.id}
- Find the current stage in current.stages[] by matching the stage number
- STATUS GATE (hard stop):
  - Read stage status
  - Allowed values to proceed: pending, verified, implementing, validating
    - pending: implementation without prior /verify
    - verified: normal implementation after /verify passed
    - implementing: re-run, continue from pending tasks
    - validating: remediation. /validate found gate failures and added FIX tasks to the stage_N.md Remediation section. Execute only pending remediation tasks.
    - implemented or validated: this stage is done. Check if there's a next stage to advance to, or inform user.
- Update stage status → implementing in milestones.yaml (schema: schema/milestones-schema.json)
- Read {MID}/stage_N.md: load tasks for the current stage
- Read project.yaml → stack.components: understand target languages, frameworks
- Read project.yaml → artifact_graph.code_ownership when present. New or changed implementation/test/doc paths must preserve ownership mappings and relation fields (implements, verifies, documents, requires) so hlv artifacts impact can route downstream review.
  - code-* ownership and implements paths must remain under LLM_SRC.
  - tests-* ownership and verifies paths must remain under LLM_TESTS.
- For every new or changed file under an artifact ownership path, add file-level evidence markers for the relevant relation, e.g. @hlv:artifact code-auth implements spec-auth, @hlv:artifact tests-auth verifies spec-auth, or @hlv:artifact docs-auth documents spec-auth. Use the native comment syntax for the file type.
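The status gate in this step can be read as a small decision table. The sketch below is one possible encoding; the function name `status_gate` and the wording of the returned messages are hypothetical, while the status values are the ones listed above.

```python
# Statuses that allow /implement to proceed, with what each one means.
PROCEED = {
    "pending": "implementation without prior /verify",
    "verified": "normal implementation after /verify passed",
    "implementing": "re-run: continue from pending tasks",
    "validating": "remediation: execute only pending remediation tasks",
}
# Statuses that mean the stage is already finished.
DONE = {"implemented", "validated"}


def status_gate(status: str) -> tuple[bool, str]:
    """Return (may_proceed, guidance) for the current stage status."""
    if status in PROCEED:
        return True, PROCEED[status]
    if status in DONE:
        return False, "stage is done: advance to the next stage or inform the user"
    return False, f"hard stop: unexpected stage status {status!r}"
```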
Step 2: Execute tasks
/implement works on ONE stage at a time. The current stage is determined by milestones.yaml → current.stage.
Tasks within a stage execute based on their dependency graph (topological sort):
- Tasks without unresolved depends_on → execute in parallel
- Tasks with depends_on → wait for predecessors
stage = read {MID}/stage_N.md
ready_tasks = tasks with no pending depends_on
while ready_tasks not empty:
  for task in ready_tasks (parallel):
    1. Load context: contract from {MID}/contracts/, glossary, test spec from {MID}/test-specs/, dependency outputs
    2. Generate code + tests within declared output paths (`LLM_SRC`, `LLM_TESTS`)
    3. Run local checks: compile, lint, unit tests
    4. Mark task completed in stage_N.md
  ready_tasks = recalculate from remaining pending tasks
Boundary: git commit after all stage tasks completed
Update milestones.yaml: stage status → implemented
After completing a stage, inform the user: "Stage N complete. Run /validate to check gates, or /implement for the next stage."
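The scheduling loop in this step amounts to a topological sort that emits tasks in waves. A minimal sketch, assuming each task is reduced to its id and its depends_on list; `execution_waves` is a hypothetical name, and the real work of each wave (running agents in parallel) is left out.

```python
def execution_waves(tasks: dict[str, list[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave can run in parallel.

    tasks maps a task id to the ids listed in its depends_on.
    """
    completed: set[str] = set()
    pending = dict(tasks)
    waves: list[list[str]] = []
    while pending:
        # A task is ready once all of its dependencies are completed.
        ready = sorted(
            task for task, deps in pending.items()
            if all(dep in completed for dep in deps)
        )
        if not ready:
            # Nothing can start: a cycle, or a dependency on an unknown task.
            raise ValueError(f"Dependency cycle in plan: {sorted(pending)}")
        waves.append(ready)
        completed.update(ready)
        for task in ready:
            del pending[task]
    return waves


plan = {
    "TASK-001": [],
    "TASK-002": ["TASK-001"],
    "TASK-003": ["TASK-001"],
    "TASK-004": ["TASK-002", "TASK-003"],
}
waves = execution_waves(plan)
```

With this plan, TASK-002 and TASK-003 land in the same wave and can be handed to separate agents, while TASK-004 waits for both.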
Step 3: Agent protocol
Each agent when executing a task:
- Read {MID}/stage_N.md → find assigned task
- Check depends_on → all dependencies completed
- Load context:
  - Contract (from task.contracts, in {MID}/contracts/)
  - Glossary (human/glossary.yaml)
  - Stack (project.yaml → stack.components): target language, framework, dependencies
  - Test spec ({MID}/test-specs/<contract>.md)
  - Dependent code (output of previous tasks)
- Generate (linear, inline, TDD). All files MUST be created inside LLM_SRC (code) or LLM_TESTS (integration tests). Do NOT write to any other directory, even if it already contains project code:
  - Code structure (when features.linear_architecture: true): write linearly: input → validation → logic → output → errors. No layers (controller/service/repository). One file per logical unit. File names are arbitrary (e.g., 01.rs, create.rs); describe each file in LLM_MAP. (When false: use your preferred architecture style, such as layered or hexagonal.)
  - Tests inline: unit tests go in the same file as code (#[cfg(test)] mod tests). Separate LLM_TESTS directory only for integration tests.
  - @ctx comments: add LLM navigation markers, e.g. // @ctx: stock check for order.create. Not human docs, but LLM orientation.
  - Tests first: write unit tests from contract test spec and property-based tests from invariants BEFORE implementation code. Tests must compile (with stubs/unimplemented markers) and clearly fail.
  - Then implement: write implementation code to make the failing tests pass. (When features.linear_architecture: true) No layered abstractions; write the simplest linear code.
  - Then refine: once tests are green, refactor if needed while keeping tests green. Duplication across features is OK; don't extract until it hurts.
  - @hlv markers (when features.hlv_markers: true, MANDATORY): every test MUST carry an @hlv <ID> comment linking it to a contract validation or constraint. See "Code Traceability Markers" below. (When false: skip @hlv markers entirely.)
- Validate locally:
  - cargo check / npm run build / equivalent
  - Unit tests pass
  - Lint is clean
- Update LLM_MAP (schema: schema/llm-map-schema.json):
  - Add entries for every new file and directory created during this task
  - Each entry: path, kind (file/dir), layer: llm, description (what the file does)
  - Do NOT add build artifacts, caches, or generated files; they should be covered by ignore patterns
  - If your stack produces new artifact types not yet ignored, add a pattern to the ignore list (e.g., __pycache__, *.pyc, node_modules, target/)
  - hlv check validates that all map entries exist. Missing entries are errors; LLM implementation/test entries outside LLM_SRC/LLM_TESTS are MAP-080/MAP-081 errors
- Update stage_N.md:
  - task.status → completed
  - task.agent → <agent_id>
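The LLM_MAP update can be sketched as follows. The entry fields (path, kind, layer, description) are the ones named in this step; everything else is an assumption: the helper names are hypothetical, ignore patterns are matched per path component with shell-style globbing, and directories are recognised by a trailing slash. How hlv itself matches ignore patterns is not specified here.

```python
from fnmatch import fnmatch


def is_ignored(path: str, ignore: list[str]) -> bool:
    """True when any path component matches any ignore pattern."""
    parts = path.strip("/").split("/")
    return any(
        fnmatch(part, pattern.rstrip("/"))
        for part in parts
        for pattern in ignore
    )


def map_entries(created: dict[str, str], ignore: list[str]) -> list[dict[str, str]]:
    """Build map entries for new paths; created maps path -> description."""
    entries = []
    for path, description in created.items():
        if is_ignored(path, ignore):
            continue  # build artifacts and caches stay out of the map
        if not description.strip():
            # A description must be good enough to pick the file unread.
            raise ValueError(f"missing description for {path}")
        entries.append({
            "path": path,
            "kind": "dir" if path.endswith("/") else "file",
            "layer": "llm",
            "description": description,
        })
    return entries
```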
Logging Protocol (mandatory for all agents)
Every agent MUST add structured logging to ALL generated code. This is not optional — observability is a first-class constraint.
Stack-specific instrumentation:
| Stack | Library | Entry/exit | Error | State change |
|---|---|---|---|---|
| Rust | tracing | #[instrument] on every pub fn | error!(error = %e, ctx = ?ctx) | info!(entity_id, old, new, "state changed") |
| Python | structlog | log.info("handler.enter", **params) | log.error("op.failed", error=str(e), ctx=ctx) | log.info("state.changed", entity=id, old=old, new=new) |
| Node | pino | log.info({ params }, 'handler.enter') | log.error({ err, ctx }, 'op.failed') | log.info({ entityId, old, new }, 'state.changed') |
Rules:
- Structured only: no println!, dbg!, bare console.log. All output goes through the logging library.
- Every pub fn gets a span: #[instrument] (Rust) or equivalent. Includes function args (excluding sensitive data).
- Every error path logs, with request_id, entity_id, input summary, and error details. No silent catches.
- Every state mutation logs: entity ID, old state, new state. DB writes, status transitions, cache ops.
- Every external call logs: target, duration, outcome. HTTP, DB, queue, gRPC.
- Request correlation: propagate request_id / trace_id through all spans. Set at entry point, flows down.
- Sensitive data masked: PII, tokens, passwords never appear in logs. Use #[instrument(skip(password))] or field redaction.
- Log levels correct: error for failures, warn for degraded/retries, info for business events, debug for diagnostics.
@hlv markers: tests for logging rules use markers from constraints/observability.yaml (e.g., @hlv structured_logging_only, @hlv log_all_errors, @hlv request_correlation).
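Two of the rules above, request correlation and masking of sensitive data, can be illustrated without any third-party dependency. This sketch uses the standard library's logging and json modules as a stand-in for structlog; the `SENSITIVE` field list and the helper names are hypothetical, and a real project would use the library from the table.

```python
import json
import logging

# Hypothetical list of field names that must never reach the logs.
SENSITIVE = {"password", "token", "api_key", "email"}


def redact(fields: dict) -> dict:
    """Replace the value of every sensitive field with a fixed mask."""
    return {key: "***" if key in SENSITIVE else value for key, value in fields.items()}


def event(name: str, request_id: str, **fields) -> str:
    """Render one structured log line that always carries the request_id."""
    payload = {"event": name, "request_id": request_id, **redact(fields)}
    return json.dumps(payload, sort_keys=True)


logger = logging.getLogger("order_create")
line = event("handler.enter", "req-1", user="u-7", password="hunter2")
# Handlers and levels are left to the application's logging configuration.
logger.info(line)
```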
Step 4: Coordination rules
File isolation: two agents NEVER write to the same file.
- Task output paths do NOT overlap.
- If overlap detected — block task, escalate to human.
Shared read-only context: agents READ shared files (glossary, contracts, domain types) but do NOT modify them.
Stage boundary commit: after all tasks in a stage complete, run git commit. Artifacts become available to the next stage through git.
Context budget: each task has a context_budget in stage_N.md. If actual context (contract + glossary + deps) exceeds the budget, split the task.
Conflict resolution: if two agents discover a conflict (both want to modify the same type), block the task and escalate to human.
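The file isolation rule can be checked before any agent starts. A sketch, assuming each task declares its output paths as project-relative strings and that a directory overlaps with everything beneath it; both the data shape and the function names are hypothetical.

```python
from itertools import combinations
from pathlib import PurePosixPath


def overlaps(a: str, b: str) -> bool:
    """Two paths overlap when they are equal or one contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def conflicting_tasks(outputs: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return every pair of tasks whose declared output paths overlap."""
    conflicts = []
    for (task_a, paths_a), (task_b, paths_b) in combinations(sorted(outputs.items()), 2):
        if any(overlaps(x, y) for x in paths_a for y in paths_b):
            conflicts.append((task_a, task_b))
    return conflicts
```

Any pair returned here corresponds to the "block task, escalate to human" case.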
Step 5: Output summary
After all tasks in the current stage complete:
=== /implement complete (Stage N) ===
Milestone: <milestone-id>
Stage: <N>/<total>
Tasks completed: <N>/<N>
Files generated: <N>
Tests generated: <N>
Next step: run /validate to check gates for this stage
Step 6: Update project files
Update milestones.yaml (schema: schema/milestones-schema.json):
# milestones.yaml updates:
current.stages[N].status: implementing → implemented
Step 7: Set gate commands
After implementation, update validation/gates-policy.yaml (schema: schema/gates-policy-schema.json) — set the command field for each gate so that hlv check and hlv gates run can execute them automatically.
Determine the correct command from project.yaml → stack (language, framework):
| Gate type | Rust | Python | Node |
|---|---|---|---|
| contract_tests | cargo test --lib | pytest tests/contract/ | npm test |
| integration_tests | cargo test --test integration | pytest tests/integration/ | npm run test:integration |
| property_based_tests | cargo test --lib -- pbt | pytest tests/pbt/ | npm run test:pbt |
| security | cargo audit | bandit -r src/ | npm audit |
| mutation_testing | cargo mutants | mutmut run | npx stryker run |
| performance | cargo bench | locust --headless | npx k6 run |
For each gate in gates-policy.yaml:
- If this stage produced test code covering this gate → set command and cwd, ensure enabled: true
- If the gate has no tests yet (will be covered in a later stage) → leave command: null
- Do NOT disable (enabled: false) gates that the user has enabled. Only the user controls enable/disable.
Also set the cwd field — the working directory relative to project root where the command should run. Derive this from LLM_SRC (e.g. if paths.llm.src is llm/src/, cwd is llm; if it's apps/backend/src/, cwd is apps/backend). Security gates may run from root.
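Both examples above follow the same pattern: cwd is the parent of the configured source directory. A sketch of that derivation, assuming the source directory is always a direct child of the directory the commands should run in; `gate_cwd` is a hypothetical name.

```python
from pathlib import PurePosixPath


def gate_cwd(llm_src: str) -> str:
    """Derive a gate's working directory from paths.llm.src."""
    # PurePosixPath drops the trailing slash, so .parent is the package root.
    return str(PurePosixPath(llm_src).parent)
```

A source directory sitting directly in the project root (for example src/) would yield ".", meaning the project root itself.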
Example update to gates-policy.yaml:
gates:
  - id: GATE-CONTRACT-001
    type: contract_tests
    mandatory: true
    enabled: true
    command: "cargo test --lib"
    cwd: llm
    pass_criteria:
      required_scenarios_pass_rate: 1.0
  - id: GATE-SECURITY-001
    type: security
    mandatory: true
    enabled: true
    command: "cargo audit"
    cwd: llm
    pass_criteria:
      max_open_critical: 0
The user can also manage gates manually via CLI or dashboard (hlv dashboard → Gates tab):
- hlv gates set-cmd <GATE-ID> "<command>"
- hlv gates set-cwd <GATE-ID> "<dir>"
- hlv gates clear-cmd/clear-cwd <GATE-ID>
- hlv gates enable/disable <GATE-ID>
Output
All generated code MUST go inside the paths configured in project.yaml → paths.llm:
- Source code → LLM_SRC (bound in Step 1 from paths.llm.src)
- Integration tests → LLM_TESTS (bound in Step 1 from paths.llm.tests)
- File map → LLM_MAP (bound in Step 1 from paths.llm.map)
Never hardcode llm/src/ or llm/tests/ — always use the configured paths. Never create src/ or tests/ in the project root.
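An agent can guard every write with a check like the one below before touching the file system. It is a sketch that mirrors the MAP-080/MAP-081 rule described in Step 1, not hlv's own implementation; the function names are hypothetical, and it assumes all paths are relative to the project root.

```python
from pathlib import PurePosixPath
from typing import Optional


def is_inside(path: str, root: str) -> bool:
    """True when path is root itself or lies beneath it."""
    candidate, base = PurePosixPath(path), PurePosixPath(root)
    # Assumption: absolute paths and '..' segments are never legitimate outputs.
    if candidate.is_absolute() or ".." in candidate.parts:
        return False
    return candidate == base or base in candidate.parents


def check_output_path(path: str, llm_src: str, llm_tests: str, is_test: bool) -> Optional[str]:
    """Return the diagnostic code a misplaced file would trigger, or None."""
    if is_test:
        return None if is_inside(path, llm_tests) else "MAP-081"
    return None if is_inside(path, llm_src) else "MAP-080"
```

Here is_test refers to integration test files; unit tests live inline in the source files under LLM_SRC.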
Example layout when paths.llm.src: llm/src/, paths.llm.tests: llm/tests/, paths.llm.map: llm/map.yaml:
llm/
  src/                      # LLM_SRC: generated code (unit tests inline via #[cfg(test)])
    domain/types.rs         # from TASK-001 (types + tests in same file)
    domain/errors.rs
    features/order_create/  # from TASK-002 (handler + tests in same file)
    features/order_cancel/  # from TASK-003
    middleware/             # from TASK-004
    observability/          # from TASK-006
  tests/                    # LLM_TESTS: integration tests ONLY (cross-contract scenarios)
    integration/            # from TASK-005
  map.yaml                  # LLM_MAP: updated with new entries
milestones.yaml             # updated stage status
Code Traceability Markers (@hlv)
Conditional: features.hlv_markers: true. If hlv_markers is false in project.yaml, skip this entire section. No @hlv markers are required and hlv check will not run CTR-010/CTR-001 checks.
Every contract validation and constraint rule MUST be traceable to test code. hlv check enforces this automatically.
What gets tracked
| Source | Field | Example ID |
|---|---|---|
| Contract errors | errors[].code | OUT_OF_STOCK, INVALID_QUANTITY |
| Contract invariants | invariants[].id | atomicity, non_negative_total |
| Constraint rules | rules[].id | prepared_statements_only, no_secrets_in_logs |
Marker format
Add @hlv <ID> as a comment next to the test that verifies this validation:
// @ctx: stock validation for order.create contract
// @hlv OUT_OF_STOCK
#[test]
fn test_out_of_stock_returns_409() {
    // ...
}

// @ctx: transactional write, 3 tables in one tx
// @hlv atomicity
#[test]
fn test_order_write_is_atomic() {
    // ...
}

// @hlv prepared_statements_only
#[test]
fn test_no_sql_injection() {
    // ...
}
Works with any language — the marker is matched by text search, not syntax:
# @ctx: user lookup in cancel flow
# @hlv USER_NOT_FOUND
def test_user_not_found():
    ...

// @hlv pii_masking_enabled
it('masks PII in logs', () => { ... });
@ctx comments are optional LLM navigation markers — they help LLM orient quickly without reading the full file. Not human documentation.
Rules
- One @hlv marker per validation/constraint per test. A test may carry multiple markers if it covers several validations.
- Every errors[].code from every contract YAML must appear as @hlv <code> somewhere in LLM_SRC or LLM_TESTS.
- Every invariants[].id must appear as @hlv <id>.
- Every constraint rules[].id must appear as @hlv <id>, except rules that have check_command (they are verified programmatically, not via markers).
- hlv check reports missing markers as warnings (CTR-010). At implemented phase and later, these become hard warnings that block /validate.
- hlv check also runs check_command for rules that define one (CST-050/CST-060).
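Because markers are matched by text search, an agent can run its own coverage check before handing off. The sketch below is an approximation under stated assumptions: the exact pattern hlv check uses is not documented here, so the regular expression is a guess, and the helper names are hypothetical. The caller is expected to leave rules with a check_command out of the required set.

```python
import re

# "@hlv" followed by whitespace and an ID. "@hlv:sec" and "@hlv:artifact"
# do not match because a colon, not whitespace, follows "@hlv".
MARKER = re.compile(r"@hlv[ \t]+([A-Za-z0-9_.\-]+)")


def found_markers(sources: dict[str, str]) -> set[str]:
    """Collect every marker ID found in the given file contents."""
    return {marker for text in sources.values() for marker in MARKER.findall(text)}


def missing_markers(required: set[str], sources: dict[str, str]) -> list[str]:
    """IDs from contracts and constraints that no file mentions."""
    return sorted(required - found_markers(sources))
```

Here sources maps a file path to its text, read from LLM_SRC and LLM_TESTS.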
Verification
$ hlv check
...
Code traceability
! WRN [CTR-010] error 'OUT_OF_STOCK' from order.create has no @hlv marker in code
! WRN [CTR-010] constraint 'no_secrets_in_logs' from security.global has no @hlv marker in code
· INF [CTR-001] Code traceability: 7/9 markers covered
Security Attention Markers (@hlv:sec)
Conditional: features.security_markers: true. If security_markers is false in project.yaml, skip this entire section. No @hlv:sec markers are required and hlv check will not run SEC-010 diagnostics.
When writing implementation code, mark security-sensitive spots with @hlv:sec markers. These are attention flags for heightened scrutiny during /validate.
Syntax
// @hlv:sec [CATEGORY] — free text reason
Categories
| Category | When to use |
|---|---|
| INPUT_VALIDATION | User input parsing, sanitization, boundary checks |
| DESERIALIZATION | Parsing external data (JSON, YAML, protobuf, etc.) |
| AUTH_BOUNDARY | Authentication/authorization checks, session validation |
| SECRET_HANDLING | API keys, tokens, passwords, PII in memory or logs |
| FILE_ACCESS | File reads/writes, path traversal risks |
| CRYPTO | Encryption, hashing, signing, random number generation |
| PRIVILEGE_ESCALATION | Role changes, sudo/admin operations, capability grants |
| NETWORK | HTTP requests, DNS, TLS, socket operations |
Examples
// @hlv:sec [INPUT_VALIDATION] — user-supplied email used in DB query
fn create_user(email: &str) -> Result<User> { ... }
// @hlv:sec [SECRET_HANDLING] — API key loaded from env, must not leak to logs
let api_key = std::env::var("API_KEY")?;
// @hlv:sec [AUTH_BOUNDARY] — session token validated before granting access
fn verify_session(token: &str) -> Result<Session> { ... }
Rules
- Place @hlv:sec markers in implementation code (not just tests) at the point where the security-sensitive operation happens.
- Use exactly one of the 8 categories above. hlv check warns on unknown categories (SEC-011).
- Add a brief reason after the dash explaining why this spot is security-sensitive.
- hlv check reports SEC-010 as Info: an aggregated summary table of markers by category and file count.
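A local approximation of the SEC-010 summary and the SEC-011 warning, useful for a quick self-check before running hlv check. The regular expression and the function name are assumptions; only the eight category names come from the table above.

```python
import re
from collections import Counter

CATEGORIES = {
    "INPUT_VALIDATION", "DESERIALIZATION", "AUTH_BOUNDARY", "SECRET_HANDLING",
    "FILE_ACCESS", "CRYPTO", "PRIVILEGE_ESCALATION", "NETWORK",
}
# Captures the bracketed category that follows "@hlv:sec".
SEC_MARKER = re.compile(r"@hlv:sec\s+\[([A-Za-z_]+)\]")


def summarize_sec_markers(sources: dict[str, str]) -> tuple[Counter, list[str]]:
    """Return (count per known category, list of unknown categories)."""
    counts: Counter = Counter()
    unknown: list[str] = []
    for text in sources.values():
        for category in SEC_MARKER.findall(text):
            if category in CATEGORIES:
                counts[category] += 1
            else:
                unknown.append(category)
    return counts, unknown
```

A non-empty unknown list corresponds to what hlv check would flag as SEC-011.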
Verification
$ hlv check
...
Security markers
· INF [SEC-010] Security markers: 5 total across 3 file(s) [AUTH_BOUNDARY=2, INPUT_VALIDATION=2, SECRET_HANDLING=1]
Error handling
- Stage status not in allowed set (pending, verified, implementing, validating) → hard stop with guidance
- Open questions remain → error: "Resolve open questions before /implement"
- Task dependency cycle detected → error: "Dependency cycle in plan: <details>"
- File conflict between agents → block task, escalate to human
- Context budget exceeded → warning: "Task <id> exceeds context budget. Consider splitting."
- Local checks fail → retry once, then block task with error details
Re-run
/implement can be run again:
- Skips tasks with status: completed
- Continues from first pending task
- On contract change, marks affected tasks as pending
Handoff integration
When a Handoff server is available:
- handoff_register: register each agent
- handoff_check: check for conflicts before writing a file
- handoff_done: signal task completion
- Change propagation: Handoff automatically notifies dependent agents
Commit hint
After all tasks in a stage are done, check for <!-- hlv:commit-hint --> in the stage_N.md file. If present, suggest the user commit with the provided message:
git commit -m "$(hlv commit-msg)"
Or show the hint text and let the user decide.
Cleanup
After the skill completes:
- Run hlv doctor to catch missing paths, invalid command strings, cwd problems, schema mismatch, and non-ASCII rendering issues.
- Run hlv check to validate the project structure. If there are errors, fix them before finishing. If validation.strictness: strict or CI parity is required, run hlv check --strict.
- Suggest the user run /clear to free up context window before the next skill.