iterate-on-implementation - SKILL.md Agent Skill

name: iterate-on-implementation description: Iteratively refine a feature implementation by identifying and fixing bugs, edge cases, and improvements category: Git Workflow tags: [openspec, refinement, iteration, quality] triggers: - "iterate on implementation" - "refine implementation" - "improve implementation" - "improve and iterate" - "linear iterate on implementation"

Iterate on Implementation

Iteratively refine a feature implementation after /implement-feature completes. Each iteration reviews the code, identifies improvements, implements fixes, and commits — repeating until only low-criticality findings remain or max iterations are reached.

Arguments

$ARGUMENTS - OpenSpec change-id (required), optionally followed by --max <N> (default: 5), --threshold <level> (default: "medium"; values: "critical", "high", "medium", "low"), and --vendor-review (dispatch multi-vendor review after iterate loop converges; automatic in coordinated tier)

Prerequisites

Feature branch openspec/<change-id> exists with implementation commits (or the operator-mandated branch when OPENSPEC_BRANCH_OVERRIDE is set)
Approved OpenSpec proposal exists at openspec/changes/<change-id>/
Run /implement-feature first if no implementation exists

Provider-Neutral Dispatch

When this skill delegates refinement or validation work, treat the provider-neutral dispatch adapter as the canonical cross-provider path. Claude Code, Codex, and Gemini/Jules are first-class providers when configured; Claude-style Task(...) or Agent(...) snippets are provider-specific examples, with inline execution as the fallback.

OpenSpec Execution Preference

Use OpenSpec-generated runtime assets first, then CLI fallback:

Claude: .claude/commands/opsx/*.md or .claude/skills/openspec-*/SKILL.md
Codex: .codex/skills/openspec-*/SKILL.md
Gemini: .gemini/commands/opsx/*.toml or .gemini/skills/openspec-*/SKILL.md
Fallback: direct openspec CLI commands

Coordinator Integration (Optional)

Use docs/coordination-detection-template.md as the shared detection preamble.

Detect transport and capability flags at skill start
Execute hooks only when the matching CAN_* flag is true
If coordinator is unavailable, continue with standalone behavior

Steps

0. Detect Coordinator, Read Handoff, Recall Memory

At skill start, run the coordination detection preamble and set:

COORDINATOR_AVAILABLE
COORDINATION_TRANSPORT (mcp|http|none)
CAN_LOCK, CAN_QUEUE_WORK, CAN_HANDOFF, CAN_MEMORY, CAN_GUARDRAILS

If CAN_HANDOFF=true, read recent handoff context:

MCP path: read_handoff
HTTP path: "<skill-base-dir>/../coordination-bridge/scripts/coordination_bridge.py" try_handoff_read(...)

If CAN_MEMORY=true, recall relevant implementation-iteration memories:

MCP path: recall
HTTP path: "<skill-base-dir>/../coordination-bridge/scripts/coordination_bridge.py" try_recall(...)

On recall/handoff failure, continue with standalone iteration and log informationally.

1. Determine Change ID and Configuration

# Parse change-id from argument or current branch
BRANCH=$(git branch --show-current)
CHANGE_ID=${ARGUMENTS%% *}  # First arg, or detect from branch
CHANGE_ID=${CHANGE_ID:-$(echo $BRANCH | sed 's/^openspec\///')}

# Defaults
MAX_ITERATIONS=5
THRESHOLD="medium"  # critical > high > medium > low

# Detect worktree context and resolve OpenSpec path
# Note: detect auto-discovers context from the working directory;
# agent-id information is available via the worktree registry if needed.
eval "$(python3 "<skill-base-dir>/../worktree/scripts/worktree.py" detect)"
if [[ "$IN_WORKTREE" == "true" ]]; then
  echo "Running in worktree. OpenSpec path: $OPENSPEC_PATH"
fi

Parse optional flags from $ARGUMENTS:

--max <N> overrides MAX_ITERATIONS
--threshold <level> overrides THRESHOLD
--vendor-review sets VENDOR_REVIEW=true

# Vendor review: explicit flag OR auto-enable in coordinated tier
VENDOR_REVIEW=false
if [[ "$ARGUMENTS" == *"--vendor-review"* ]] || [[ "$COORDINATOR_AVAILABLE" == "true" ]]; then
  VENDOR_REVIEW=true
fi

2. Verify Implementation Exists

# Resolve the expected feature branch. This reads from the worktree registry
# (preferred — reflects what plan/implement actually used) and falls back to
# OPENSPEC_BRANCH_OVERRIDE env var or the openspec/<change-id> default.
eval "$(python3 "<skill-base-dir>/../worktree/scripts/worktree.py" resolve-branch "$CHANGE_ID")"
FEATURE_BRANCH="$BRANCH"

# Verify on feature branch
CURRENT_BRANCH="$(git branch --show-current)"
if [[ "$CURRENT_BRANCH" != "$FEATURE_BRANCH" ]]; then
  echo "ERROR: on '$CURRENT_BRANCH' but expected '$FEATURE_BRANCH' (source: $BRANCH_SOURCE)" >&2
  echo "Hint: check out '$FEATURE_BRANCH' and re-run, or run /implement-feature first" >&2
  exit 1
fi

# Verify proposal exists
openspec show $CHANGE_ID

# Verify implementation commits exist
git log --oneline main..HEAD

If not on the feature branch, check out $FEATURE_BRANCH (which honors OPENSPEC_BRANCH_OVERRIDE). If no implementation commits exist, abort and recommend running /implement-feature first.

2.5. Prepare Findings Artifact

Preferred path:

Use the runtime-native continue/findings workflow (opsx:continue equivalent) to create or extend impl-findings.

CLI fallback path:

openspec instructions impl-findings --change "$CHANGE_ID"
openspec status --change "$CHANGE_ID"

Ensure openspec/changes/<change-id>/impl-findings.md exists and append each iteration's findings there.

2.6. Consume Rework Report (If Available)

If openspec/changes/<change-id>/rework-report.json exists, load it as the primary input for prioritizing fixes. The rework report provides machine-readable failure routing: scenario IDs, visibility, requirement refs, implicated files, and recommended actions.

REWORK_REPORT="openspec/changes/$CHANGE_ID/rework-report.json"
if [[ -f "$REWORK_REPORT" ]]; then
  echo "Rework report found — using as primary iteration input"
  # Parse failures with recommended_action == "iterate"
  # Prioritize those failures over code-review-discovered findings
fi

When a rework report is present:

Prioritize failures with recommended_action: "iterate" over self-discovered findings
Skip failures with recommended_action: "defer" unless the threshold is set to "low"
Flag failures with recommended_action: "revise-spec" for spec revision rather than code changes
Public scenario failures are soft signals — fix if possible, defer if not critical
Holdout scenario failures are not visible here (they route through /cleanup-feature)

3. Begin Iteration Loop

ITERATION=1

4. Review and Analyze

Read the following files to understand intent and current state:

$OPENSPEC_PATH/changes/<change-id>/proposal.md
$OPENSPEC_PATH/changes/<change-id>/design.md
$OPENSPEC_PATH/changes/<change-id>/tasks.md
All implementation source files changed on this branch (git diff --name-only main..HEAD)

Note: In worktree mode, OpenSpec files are in the main repository, not the worktree.

Produce a structured improvement analysis with findings in this format:

#	Type	Criticality	Description	Proposed Fix
1	bug/security/edge-case/workflow/performance/UX/observability/resilience	critical/high/medium/low	What the issue is	How to fix it

Type categories:

bug: Incorrect behavior, crashes, data corruption, logic errors
security: Authentication/authorization bypass, input validation gaps at system boundaries, secrets exposure, SQL injection, XSS, command injection, missing TLS, OWASP top-10 vulnerabilities
edge-case: Unhandled inputs, boundary conditions, error paths
workflow: Developer experience, tooling integration, process issues
performance: Unnecessary work, slow paths, resource waste, N+1 queries, unbounded loops, missing pagination
UX: Confusing output, missing feedback, poor error messages
observability: Missing structured logging for key operations, no error context in catch blocks, missing health/readiness endpoints for new services, no metrics for SLI-relevant paths, missing trace propagation
resilience: Missing retry with backoff for external calls, no timeout configuration, non-idempotent operations that should be idempotent, missing circuit breakers for external dependencies, no graceful degradation on dependency failure

Criticality levels:

critical: Authentication bypass, data loss, crashes, incorrect core behavior, secrets in code or logs, missing TLS for sensitive data
high: Unhandled error paths, missing validation at system boundaries, race conditions, no retry on critical external calls, missing health endpoint for new service
medium: Missing edge cases, suboptimal error messages, incomplete logging, missing structured log fields, no timeout on external calls
low: Code style, minor naming, documentation polish, minor performance, verbose logging that could be reduced

Schema type mapping (for translating implementation findings to review-findings.schema.json types at the dispatch/consensus boundary):

Impl Dimension	Schema Type(s)	Notes
bug	`correctness`	Logic errors and crashes
security	`security`	Direct mapping
edge-case	`correctness`, `resilience`	Unhandled error recovery → resilience; boundary conditions → correctness
workflow	`style`, `architecture`	DX/tooling concerns
performance	`performance`	Direct mapping
UX	`style`, `correctness`	Bad error messages = style; wrong output = correctness
observability	`observability`	Direct mapping
resilience	`resilience`	Direct mapping

Schema types spec_gap, contract_mismatch, and compatibility have no matching implementation dimension — these are evaluated by parallel-review-implementation (spec/contract compliance) and iterate-on-plan (compatibility) respectively.

5. Check Termination Conditions

Stop iterating if:

All findings are below the criticality threshold → present summary and list remaining low-criticality findings for optional manual review
ITERATION > MAX_ITERATIONS → present summary and list any unaddressed findings

If stopping, skip to the After Loop section below.

Otherwise, continue to step 6.

6. Implement Improvements

Fix all findings at or above the criticality threshold
For findings that require design changes beyond the current proposal scope:
- Flag as "out of scope"
- Recommend creating a new OpenSpec proposal
- Do NOT implement out-of-scope changes

Archetype Resolution (Phase 2)

Before dispatching fix agents, resolve the implementer archetype for escalation:

from src.agents_config import load_archetypes_config, resolve_model
archetypes = load_archetypes_config()
implementer = archetypes.get("implementer")
runner = archetypes.get("runner")
impl_model = resolve_model(implementer, package_metadata) if implementer else "sonnet"
runner_model = resolve_model(runner, {}) if runner else "haiku"

Parallel Fixes (for independent findings)

When multiple findings target different files, fix them concurrently:

# Spawn parallel agents for independent fixes
Task(
  subagent_type="general-purpose",
  model=impl_model,  # archetype: implementer (sonnet, or opus if escalated)
  description="Fix finding 1: <type> in <file>",
  prompt="Fix this issue in OpenSpec <change-id> implementation:

## Finding
Type: <type>
Criticality: <criticality>
Description: <description>
Proposed Fix: <fix>

## File Scope
You MAY modify: <specific file(s)>
You must NOT modify any other files.

## Process
1. Read the file and understand the issue
2. Implement the fix
3. Run relevant tests
4. Report changes made

Do NOT commit - the orchestrator handles commits.",
  run_in_background=true
)

Rules:

Only parallelize fixes targeting different files
Fixes to the same file must be sequential
Collect all results before running quality checks

7. Run Quality Checks (Parallel Execution)

Run all quality checks concurrently using Task() with run_in_background=true:

# Launch all checks in parallel (single message, multiple Task calls)
Task(subagent_type="Bash", model=runner_model, prompt="Run pytest and report pass/fail with summary", run_in_background=true)
Task(subagent_type="Bash", model=runner_model, prompt="Run mypy src/ and report any type errors", run_in_background=true)
Task(subagent_type="Bash", model=runner_model, prompt="Run ruff check . and report any linting issues", run_in_background=true)
Task(subagent_type="Bash", model=runner_model, prompt="Run openspec validate $CHANGE_ID --strict", run_in_background=true)

Result Aggregation:

Wait for all TaskOutput results
Collect pass/fail status from each check
Report ALL results together (don't fail-fast on first error)
Present failures with their check type for targeted fixes

Example output format:

Quality Check Results:
✓ pytest: 42 tests passed
✗ mypy: 3 type errors in src/auth.py
✓ ruff: No issues
✓ openspec validate: Valid

Failures to address this iteration:
- mypy: src/auth.py:15 - Missing return type annotation

Fix any failures before proceeding. If fixes introduce new issues, address them within this iteration.

8. Update Documentation

Review whether genuinely new patterns, lessons, or gotchas were discovered in this iteration. If so, update:

CLAUDE.md — project guidelines, workflow patterns, lessons learned
AGENTS.md — AI assistant instructions
docs/ — focused documentation files

Follow the existing convention:

Update CLAUDE.md or AGENTS.md directly if they are under 300 lines each
If either file exceeds 300 lines, refactor into focused documents in docs/ and reference them

Do NOT add redundant documentation for findings that are variations of already-documented patterns.

9. Update OpenSpec Documents

Review whether the current OpenSpec documents accurately reflect the refined implementation. When this iteration's findings reveal spec drift, incorrect assumptions, or missing requirements, update:

openspec/changes/<change-id>/proposal.md — if the proposal's described behavior no longer matches reality
openspec/changes/<change-id>/design.md — if design decisions or trade-offs changed during refinement
Spec deltas in openspec/changes/<change-id>/specs/ — if requirements or scenarios need correction
openspec/changes/<change-id>/change-context.md — if this iteration added new files, tests, or changed requirement mappings, update the Requirement Traceability Matrix rows (Files Changed, Test(s) columns). Update Coverage Summary if new tests were added or requirements were discovered. If a finding reveals a missing spec requirement, add a new row to the matrix and write the corresponding test before fixing.

Do NOT make unnecessary changes if the OpenSpec documents are still accurate after this iteration's fixes.

9.5. Append Session Log

Construct a PhaseRecord for the Implementation Iteration <N> phase and call write_both(). The iteration number is auto-computed from prior Implementation Iteration entries in the session-log.

Capture from this iteration:

Decisions — Decisions about which review findings to address, which to defer, and how to refactor.
Alternatives Considered — Refactor approaches considered and rejected.
Trade-offs — Trade-offs accepted (e.g., chose minimal change over full refactor).
Open Questions — Remaining questions for the next iteration or validation.
Completed Work — Findings addressed in this iteration.
In Progress — Findings being worked on but not yet resolved (only if any).
Summary — 2–3 sentences: which findings were addressed, what changed.

Persist via PhaseRecord.write_both():

This step MUST run BEFORE the git add . in Step 10 so the session-log entry is included in that commit.

python3 - <<'EOF'
import sys
sys.path.insert(0, "skills/session-log/scripts")
from phase_record import PhaseRecord, Decision, Alternative, TradeOff
from extract_session_log import count_phase_iterations

n = count_phase_iterations(
    "Implementation Iteration", "openspec/changes/<change-id>/session-log.md"
) + 1

record = PhaseRecord(
    change_id="<change-id>",
    phase_name=f"Implementation Iteration {n}",
    agent_type="<agent-type>",
    summary="<2-3 sentences: findings addressed, what changed>",
    decisions=[
        Decision(title="<title>", rationale="<rationale>"),
    ],
    alternatives=[Alternative(alternative="<approach>", reason="<rejection reason>")],
    trade_offs=[TradeOff(accepted="<X>", over="<Y>", reason="<reason>")],
    open_questions=["<question>"],
    completed_work=["<finding addressed>"],
)
result = record.write_both()
print(f"markdown_path={result.markdown_path}")
print(f"sanitized={result.sanitized}")
print(f"handoff_id={result.handoff_id or '(local fallback)'}")
print(f"handoff_local_path={result.handoff_local_path}")
for w in result.warnings:
    print(f"WARN: {w}", file=sys.stderr)
EOF

write_both() runs three best-effort steps internally: append rendered markdown → sanitize in-place → coordinator handoff (or local fallback at openspec/changes/<change-id>/handoffs/implementation-iteration-<n>-<N>.json). Each step logs warnings on failure but does not raise. The session-log.md is included in git add . in the existing commit step.

10. Commit Iteration

# Review all changes
git status
git diff

# Stage all changes
git add .

# Commit with structured message
git commit -m "$(cat <<'EOF'
refine(<scope>): iteration <N> - <summary of key changes>

Iterate-on-implementation: <change-id>, iteration <N>/<max>

Findings addressed:
- [<criticality>] <type>: <description>
- [<criticality>] <type>: <description>

Co-Authored-By: Claude <noreply@anthropic.com>
EOF
)"

# Increment and loop
ITERATION=$((ITERATION + 1))

Loop back to Step 4.

After Loop

11. Multi-Vendor Review (Conditional)

Skip this step if VENDOR_REVIEW=false.

After the iterate loop converges (all findings below threshold) or max iterations are reached, dispatch a multi-vendor review for a final independent validation pass.

11a. Dispatch Reviews

For implementations with work packages (work-packages.yaml exists), dispatch per-package reviews via /parallel-review-implementation. For simpler implementations, dispatch a whole-branch review.

Per-package dispatch (if work-packages.yaml exists):

# Dispatch per-package reviews to other vendors
for PKG_ID in $(python3 -c "
import yaml
pkgs = yaml.safe_load(open('openspec/changes/$CHANGE_ID/work-packages.yaml'))
for p in pkgs.get('packages', []): print(p['id'])
"); do
  python3 "<skill-base-dir>/../parallel-infrastructure/scripts/review_dispatcher.py" \
    --review-type implementation \
    --mode review \
    --prompt-file "openspec/changes/$CHANGE_ID/reviews/review-prompt-$PKG_ID.md" \
    --cwd "$(pwd)" \
    --output-dir "openspec/changes/$CHANGE_ID/reviews" \
    --exclude-vendor claude_code \
    --timeout 600
done

Whole-branch dispatch (no work packages):

# Create review prompt for the full implementation diff
mkdir -p openspec/changes/$CHANGE_ID/reviews

cat > openspec/changes/$CHANGE_ID/reviews/review-prompt.md <<'PROMPT'
Review the implementation on this branch against the OpenSpec proposal.
Run: git diff main..HEAD to see all changes.
Read openspec/changes/$CHANGE_ID/proposal.md and spec deltas for requirements.
Output ONLY valid JSON conforming to review-findings.schema.json.
Focus on: correctness, security, contract compliance, test coverage, and performance.
PROMPT

python3 "<skill-base-dir>/../parallel-infrastructure/scripts/review_dispatcher.py" \
  --review-type implementation \
  --mode review \
  --prompt-file "openspec/changes/$CHANGE_ID/reviews/review-prompt.md" \
  --cwd "$(pwd)" \
  --output-dir "openspec/changes/$CHANGE_ID/reviews" \
  --exclude-vendor claude_code \
  --timeout 600

Also produce your own findings as the primary reviewer: review the implementation diff against spec requirements and write findings to openspec/changes/$CHANGE_ID/review-findings-impl.json.

11b. Synthesize Consensus

python3 "<skill-base-dir>/../parallel-infrastructure/scripts/consensus_synthesizer.py" \
  --review-type implementation \
  --target "$CHANGE_ID" \
  --findings "openspec/changes/$CHANGE_ID/review-findings-impl.json" \
             "openspec/changes/$CHANGE_ID/reviews/findings-"*"-implementation.json" \
  --output "openspec/changes/$CHANGE_ID/reviews/consensus-impl.json"

Present consensus summary:

Confirmed findings (2+ vendors agree) — high confidence
Unconfirmed findings (single vendor) — lower confidence, warnings
Disagreements (vendors disagree on disposition) — escalate to human

If no other vendors are available (CLIs not installed), skip dispatch and proceed with single-vendor findings only.

11c. Feed Back Findings Above Remediation Threshold

The remediation threshold is the user's --threshold setting if provided, otherwise medium.

If the consensus or vendor review surfaces new findings at or above the remediation threshold:

Append the new findings to openspec/changes/$CHANGE_ID/impl-findings.md
Run one additional iterate cycle (Steps 4-10) to address them
Commit with message: refine(<scope>): vendor-review remediation - <summary>
Do NOT re-dispatch vendor review (prevents infinite recursion)

If all vendor review findings are below the remediation threshold, proceed to the summary.

12. Present Summary

Present a summary of all iterations:


If `CAN_MEMORY=true`, remember implementation iteration outcomes:

- MCP path: `remember`
- HTTP path: `"<skill-base-dir>/../coordination-bridge/scripts/coordination_bridge.py"` `try_remember(...)`

If `CAN_HANDOFF=true`, write a completion handoff containing:

- Fixes applied and critical findings resolved
- Remaining risks or manual follow-ups
- Validation status and recommended next command
## Iteration Summary

### Iteration 1
- Findings: <count> (<count by criticality>)
- Fixed: <list>

### Iteration 2
- Findings: <count> (<count by criticality>)
- Fixed: <list>

...

### Final State
- Total iterations: <N>
- Total findings addressed: <count>
- Remaining findings (below threshold): <list or "none">
- Termination reason: <threshold met | max iterations reached>

### Vendor Review (if dispatched)
- Vendors dispatched: <list or "skipped">
- Consensus findings: <confirmed count> confirmed, <unconfirmed count> unconfirmed, <disagreement count> disagreements
- Remediation cycle: <ran / not needed>
- New findings addressed in remediation: <count or "N/A">

Output

Iteration commits on the resolved feature branch (openspec/<change-id> by default, or the operator-mandated branch)
Structured findings summary for each iteration
Updated documentation (CLAUDE.md, AGENTS.md, docs/ as applicable)
Updated OpenSpec documents (proposal.md, design.md, spec deltas as applicable)
Final state assessment
Vendor review consensus (if --vendor-review or coordinated tier): openspec/changes/<change-id>/reviews/consensus-impl.json

Next Step

Validate the deployed feature (recommended):

/validate-feature <change-id>

Or skip validation and proceed to cleanup:

/cleanup-feature <change-id>