run-phase

Use when the user says 'run phase', 'start phase N', 'next phase', '继续开发', '跑下一阶段', '开始第N阶段', or when continuing development guided by a dev-guide. Orchestrates the plan-execute-review cycle for one phase of a development guide: write-plan → verify-plan → execute-plan → test-changes → review agents in parallel → fix issues. Produces: phase completion report + updated workflow state in .claude/dev-workflow-state.json. Not when: no dev-guide exists — run write-dev-guide first.

By n0rvyn. Updated 6/12/2026

Overview

This skill orchestrates one iteration of the development cycle. Opus handles judgment-intensive steps (planning, fixing) in main context; Sonnet handles mechanical execution as a dispatched agent; Opus reviews in dispatched agents for unbiased assessment.

Locate/Resume Phase
  → scope confirmation checkpoint (main context — opus)
  → write plan (main context — opus, full conversation context)
  → UX review checkpoint (main context — if design has UX Assertions)
  → verify plan (dispatch opus agent — unbiased review)
  → execute plan (dispatch sonnet agent — segmented via Workflow, hard-stop checkpoints)
  → test changes (dispatch sonnet agent — build/test/lint suite)
  → visual feedback loop (main context — render #Preview, diff vs design, fix ≤3x; skipped if non-UI or no design ref)
  → dispatch feature-spec + review agents in parallel (separate contexts)
  → fix all issues (main context — opus: execution + test + review failures)
  → Phase done

State File

Location: .claude/dev-workflow-state.json (JSON format, matches execute-plan-checkpoint.json for consistency).

This file tracks progress across sessions. Update it before starting each step (so crash-resume works). Read/write via the Read and Write tools.

Legacy migration: If .claude/dev-workflow-state.yml exists (from prior versions) and .json does not, read the YAML, write equivalent JSON to .claude/dev-workflow-state.json, delete the .yml, and surface ℹ️ Migrated dev-workflow state file from legacy YAML to JSON format to the user. After migration, continue with the JSON path.

{
  "project": "<name>",
  "current_phase": 2,
  "phase_name": "Phase Name",
  "phase_step": "plan",
  "_comment_phase_step": "one of: plan | ux-review | verify | execute | test | visual | review | fix | done",
  "dev_guide": "docs/06-plans/YYYY-MM-DD-project-dev-guide.md",
  "plan_file": null,
  "_comment_plan_file": "set to docs/06-plans/YYYY-MM-DD-<name>-plan.md after Step 2",
  "verification_report": null,
  "task_progress": null,
  "review_reports": [],
  "test_report": null,
  "_comment_test_report": "set to .claude/test-reports/test-run-*.md path after Step 5",
  "gaps_remaining": 0,
  "last_updated": "YYYY-MM-DDTHH:MM:SS"
}

The _comment_* keys are informational only. JSON has no native comment syntax, so drop them when writing real state files to keep the files lean.
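The update-before-each-step rule can be sketched as follows. This is a minimal illustration, not the skill's implementation (the skill itself uses the Read and Write tools): the helper name `update_state` and the write-to-temp-then-rename strategy are assumptions added here; only the field names come from the example above.

```python
import json
import os
import tempfile
from datetime import datetime


def update_state(path, **changes):
    """Merge changes into the state file, drop _comment_* keys, stamp last_updated."""
    state = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            state = json.load(f)
    state.update(changes)
    # _comment_* keys are informational only, so they are not written back.
    state = {k: v for k, v in state.items() if not k.startswith("_comment_")}
    state["last_updated"] = datetime.now().strftime("%Y-%m-%dT%H:%M:%S")
    # Assumption: write a temp file in the same directory, then rename it over
    # the target, so a crash mid-write cannot leave a half-written state file.
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2, ensure_ascii=False)
    os.replace(tmp, path)
    return state
```

Calling `update_state(".claude/dev-workflow-state.json", phase_step="verify")` before Step 3 would record the step first, so a resumed session knows where to pick up.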

Agent Dispatch Verification Gate

This skill dispatches sub-agents at multiple steps (Step 4 execute-plan, Step 5 test-changes, Step 6 feature-spec + 4 review agents). The dev-workflow/hooks/verify-agent-output.py hook intercepts every Agent return and surfaces "files NOT on disk" when a sub-agent's stdout claims it wrote files that don't actually exist.

Treat agent stdout as a claim, not a fact:

  • After every Agent return in Step 4/5/6, before recording the report path into state, verify the claimed report file actually exists on disk (ls or Read). If missing: do NOT advance phase_step; either re-dispatch the agent with explicit Write tool requirement, or surface the failure to the user.
  • For execute-plan (Step 4): the Workflow returns per-task structured results; spot-check by verifying every path in each result's files_written array exists on disk (defense-in-depth — the dev-workflow verify-agent-output.py hook also intercepts at agent return time). The execute-plan skill itself owns the segment loop and the checkpoint file, but the run-phase orchestrator should not blindly trust the final summary.
  • For test-changes / review agents: their report files (docs/06-plans/execution-report.md, .claude/test-reports/*.md, .claude/reviews/*.md) must exist before the next step.

This gate is non-negotiable: we have a confirmed past case of a sub-agent reporting file writes that never persisted, caught only because the verify-agent-output hook fired.
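A minimal sketch of the "claim, not fact" check described above. The function names `missing_files` and `may_advance` are hypothetical; the inputs would be the report paths or `files_written` entries an agent claims to have produced.

```python
from pathlib import Path


def missing_files(claimed_paths, root="."):
    """Return the claimed paths that do not exist as files under root."""
    base = Path(root)
    return [p for p in claimed_paths if not (base / p).is_file()]


def may_advance(claimed_paths, root="."):
    """Allow phase_step to advance only when at least one file was claimed
    and every claimed file is actually on disk."""
    return bool(claimed_paths) and not missing_files(claimed_paths, root)
```

Treating an empty claim list as a failure is a design choice in this sketch: an agent that was supposed to write a report and names no file has not proven anything.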

User-Visible Notifications

Long phases (30-60 min from plan → done) can outlast the user's attention. At three checkpoints, emit a PushNotification to pull the user back if they have walked away:

  1. Plan ready for approval (end of Step 2): when decision points need answering.
  2. All agents returned (end of Step 6): when consolidated review summary is ready and Step 7 fix-or-skip decision is needed.
  3. Phase done (end of Step 8): when the phase completes (success or with documented known issues).

Skip notifications when the user has been actively responding within the last minute (heuristic — when in doubt, send). Keep messages short (under 80 chars), lead with the decision needed.
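The skip heuristic and the length limit can be sketched like this. How the time of the user's last reply is obtained is an assumption (the skill does not say), and both function names are hypothetical.

```python
from datetime import datetime, timedelta

MAX_LEN = 80  # messages must stay under 80 characters


def should_notify(last_user_reply, now, active_window=timedelta(minutes=1)):
    """Skip only when the user replied within the last minute; when in doubt, send."""
    if last_user_reply is None:  # activity unknown: send
        return True
    return now - last_user_reply >= active_window


def fit_message(message):
    """Trim to under 80 characters, keeping the leading (decision-first) part."""
    if len(message) < MAX_LEN:
        return message
    return message[: MAX_LEN - 4].rstrip() + "..."
```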

Process

Step 1: Resume or Locate Phase

  1. Check for an existing state file via the Bash tool:
    cat .claude/dev-workflow-state.json 2>/dev/null || cat .claude/dev-workflow-state.yml 2>/dev/null || echo "NO_STATE_FILE"
    
    • If output is NO_STATE_FILE: proceed to step 2 (starting fresh).
    • If output is YAML (legacy format): apply the migration described in "## State File" above — read the YAML, write equivalent JSON, delete the .yml. Then parse the migrated JSON.
    • Otherwise: parse the JSON content from the output.
    • If phase_step is spec (legacy): treat as review and proceed to Step 6
    • If phase_step is build-test (legacy): treat as test and proceed to Step 5
    • If phase_step is not done:
      • Present: "Phase {N} ({name}) in progress — step: {phase_step}. Resume?"
      • If user accepts:
        • Scope drift check (only when phase_step is plan AND plan_file is not null): Read the Phase's current scope from the dev-guide and compare with the plan file's Scope: section. If they differ: "Dev-guide scope has changed since the plan was written. Re-run scope confirmation (Step 1.5)?" If user accepts: reset phase_step: plan, plan_file: null, and run Step 1.5. If user declines: proceed with existing plan. For steps after plan (verify/execute/review/fix): no check needed — the plan is the working document.
        • Skip to the step indicated by phase_step
      • If user declines: ask which Phase to start
  2. If no state file or starting fresh:
    • Find dev-guide: docs/06-plans/*-dev-guide.md (if multiple, prefer the file with current: true in frontmatter; if no file has a current: field in frontmatter, treat all as candidates and ask user)
    • Read the document and check each Phase's acceptance criteria
    • Phases with all criteria checked = completed
    • Identify the first incomplete Phase
    • Present Phase summary: Goal, Scope, Architecture decisions, Acceptance criteria
    • Ask: "Start Phase N?"
  3. Initialize state file (write to .claude/dev-workflow-state.json):
{
  "project": "<from dev-guide title>",
  "current_phase": <N>,
  "phase_name": "<Phase name>",
  "phase_step": "plan",
  "dev_guide": "<dev-guide path>",
  "plan_file": null,
  "verification_report": null,
  "task_progress": null,
  "review_reports": [],
  "test_report": null,
  "gaps_remaining": 0,
  "last_updated": "<now>"
}

If the user specifies a different Phase number, use that instead.
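The resume rules in item 1 can be sketched as a small lookup. The function names are hypothetical; the step names and the two legacy mappings are taken from the text above.

```python
LEGACY_STEPS = {"spec": "review", "build-test": "test"}
VALID_STEPS = ("plan", "ux-review", "verify", "execute", "test",
               "visual", "review", "fix", "done")


def resume_step(state):
    """Return the step to resume at, or None when there is nothing to resume."""
    if not state:
        return None  # NO_STATE_FILE: start fresh
    raw = state.get("phase_step")
    step = LEGACY_STEPS.get(raw, raw)  # map legacy names to current ones
    if step not in VALID_STEPS:
        raise ValueError(f"unknown phase_step: {step!r}")
    return None if step == "done" else step


def needs_scope_drift_check(state):
    """Scope drift is only checked at 'plan' when a plan file already exists."""
    return state.get("phase_step") == "plan" and state.get("plan_file") is not None
```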

Step 1.4: Project Health Preflight

Before Step 1.5, read .claude/dev-workflow-health.json if present.

  • If state is missing, OR last_health has any red signal, OR state's updated_at is older than 7 days, run dev-workflow/scripts/project_health_scan.py --mode full --reason plan --check-staleness 7 --max-ms 5000 --format markdown --write-state and use the fresh report.
  • Otherwise reuse cached last_health from state (no scan invocation).
  • Summarize red/yellow Project Health signals before planning.
  • Feed those signals into the write-plan context so the plan header includes **Project health:**.
  • Do not change orchestration order: Project Health adds context only; it cannot skip scope confirmation, verify-plan, execute-plan, test-changes, or review.

The generated plan must still include ## Impact Map and per-task Task Contract fields when contract_version: 1 is used.
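The scan-or-reuse decision in the first two bullets can be sketched as below. The internal shape of `last_health` is not specified in this document, so the `signals` list with a `level` field is purely an assumption for illustration, as is a timezone-free ISO timestamp in `updated_at`.

```python
from datetime import datetime, timedelta


def needs_health_scan(health_state, now, max_age_days=7):
    """Return True when a fresh project_health_scan.py run is required."""
    if not health_state or "last_health" not in health_state:
        return True  # state missing
    # Assumed shape: {"last_health": {"signals": [{"level": "red" | "yellow" | ...}]}}
    signals = health_state["last_health"].get("signals", [])
    if any(s.get("level") == "red" for s in signals):
        return True  # any red signal forces a rescan
    updated_at = datetime.fromisoformat(health_state["updated_at"])
    return now - updated_at > timedelta(days=max_age_days)
```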

Step 1.5: Scope & Visual Expectation Confirmation

Before writing the plan, present the Phase scope and visual expectations for explicit user confirmation.

Skip condition: When resuming from state file with phase_step not plan, skip this step — scope was already confirmed in a prior session.

Freshness check: Before presenting, read the dev-guide's YAML frontmatter for confirmed_at:. If the timestamp is within 60 minutes of now, use lightweight mode (step 1b). Otherwise, use full mode (step 1a).
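As a sketch of the freshness check, assuming `confirmed_at` is a timezone-free ISO-8601 string already extracted from the frontmatter (the function name is hypothetical, and treating a missing, unparseable, or future timestamp as "full" is a conservative choice made here, not something the skill states):

```python
from datetime import datetime, timedelta


def confirmation_mode(confirmed_at, now, window_minutes=60):
    """Return 'lightweight' when confirmed_at is within the window, else 'full'."""
    if not confirmed_at:
        return "full"
    try:
        age = now - datetime.fromisoformat(confirmed_at)
    except (ValueError, TypeError):
        return "full"  # unparseable or mismatched timestamp: fall back to full mode
    if timedelta(0) <= age <= timedelta(minutes=window_minutes):
        return "lightweight"
    return "full"
```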

1a. Full mode (default):

Read the Phase's scope items and **用户可见的变化:** section from the dev-guide. Present to the user:

Phase {N} — confirm before planning:

Scope:
1. {scope item 1}
2. {scope item 2}
...

User-visible changes:
- {visual expectation 1, from dev-guide}
- {visual expectation 2}
(If there are layout or interaction details to add, tell me at this step.)

Confirm the above, or add/correct before continuing.

If **用户可见的变化:** starts with "无" (infrastructure Phase, e.g., "无" or "无 — 纯基建阶段"), present scope only (omit the visual section).

1b. Lightweight mode (confirmed_at within 60 min):

Phase {N} scope has already been confirmed in the dev-guide.
Any new visual/interaction details to add? If not, planning starts right away.

User responds:

  • No additions → proceed to Step 2 (skip steps 3-4 below)
  • Adds visual/interaction details → proceed to step 4 (auto-crystal), same as full mode
  3. Wait for user response (full mode only):

    • User confirms without additions → proceed to Step 2
    • User corrects scope → edit the Phase's **Scope:** bulleted list in the dev-guide file to match user's corrections, then check acceptance criteria sync (see below), re-present for confirmation
    • User adds visual/interaction details → proceed to step 4 (auto-crystal)
    • Max 2 correction cycles; after that, proceed with last-confirmed content

    Acceptance criteria sync (after scope correction): Compare the Phase's **Acceptance criteria:** with the updated scope:

    • If any criterion references a removed scope item → flag: "The scope item behind acceptance criterion '{criterion}' was removed. Delete the criterion as well?"
    • If new scope items lack corresponding criteria → flag: "New scope item '{item}' has no acceptance criterion. Add one?"
    • Present flags to user. Apply user's decisions (delete/add criteria) to the dev-guide before re-presenting the Phase.
    • If no mismatches found, skip silently.
  4. Auto-crystal (conditional): If the user's response contains any new visual/interaction detail not already in the dev-guide's 用户可见的变化 section, treat as "adds details" regardless of whether they also said "confirmed."

    4a. Assemble decisions: Extract from the user's input:

    • Each confirmed visual/interaction detail → [D-xxx] in imperative form
    • Each explicitly rejected approach → ## Rejected Alternatives + ## Constraints
    • If no alternatives were discussed or rejected, write None. for those sections

    4b. Confirm with user: Present the assembled decisions in-line:

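    The following visual/interaction decisions will be recorded for later planning:
    - [D-001] {detail}
    - [D-002] {detail}
    Constraints: {constraints, or "none"}
    
    Confirm the record, or edit it before continuing.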
    

    Apply user edits if any, then proceed to write.

    4c. Write crystal file:

    • First, search docs/11-crystals/*-crystal.md for an existing crystal file
    • If an existing crystal file is found: append the visual decisions to it — add new [D-xxx] entries (continuing the existing numbering) to ## Decisions (machine-readable), merge new items into ## Constraints and ## Scope Boundaries. Do not overwrite existing content.
    • If no existing crystal file: create docs/11-crystals/YYYY-MM-DD-phase-{N}-visual-crystal.md using this format:
    # Decision Crystal: Phase {N} Visual Expectations
    
    Date: YYYY-MM-DD
    
    ## Initial Idea
    {User's original visual description, denoised but not rewritten}
    
    ## Discussion Points
    {Any back-and-forth from the confirmation, if applicable}
    
    ## Rejected Alternatives
    {Approaches the user explicitly rejected, or "None."}
    
    ## Decisions (machine-readable)
    - [D-001] {confirmed visual/interaction detail in imperative form}
    - [D-002] {detail}
    
    ## Constraints
    {Explicitly rejected visual approaches, or "None."}
    
    ## Scope Boundaries
    - IN: {items from user's visual additions}
    
    ## Source Context
    - Design doc: {path or "none"}
    - Dev-guide: {dev-guide path} Phase {N}
    
    • Do NOT invoke /crystallize as a separate skill — write the file directly

This checkpoint catches scope pollution and aligns visual expectations before writing the plan.
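When appending to an existing crystal file (step 4c), the new entries continue the existing numbering. A minimal sketch, assuming IDs are always written as `[D-NNN]` with three-digit zero padding as in the template; the function name is hypothetical:

```python
import re

DECISION_ID = re.compile(r"\[D-(\d+)\]")


def next_decision_ids(existing_text, count):
    """Return `count` new [D-xxx] ids continuing the highest number already used."""
    used = [int(n) for n in DECISION_ID.findall(existing_text)]
    start = max(used, default=0) + 1
    return [f"[D-{n:03d}]" for n in range(start, start + count)]
```

Taking the maximum rather than the count of existing entries keeps the numbering safe when earlier decisions were deleted and left gaps.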

Step 2: Plan (main context)

  1. Update state: phase_step: plan, last_updated: <now>
  2. Gather Phase context from dev-guide:
    • Goal: Phase N's goal
    • Scope: Phase N's scope items
    • Acceptance criteria: Phase N's acceptance criteria
    • Design doc reference: from dev-guide header (if exists)
    • Design analysis reference: search docs/06-plans/*-design-analysis.md; if exactly 1 file, use it; if multiple, use the one whose filename matches the Phase's feature topic; if still ambiguous, ask the user; if none, set to "none"
    • Crystal file reference: search docs/11-crystals/*-crystal.md; if exactly 1 file, use it; if multiple, ask the user which one applies; if none, set to "none"
    • If no crystal file found AND the Phase has architecture decisions marked as "resolved" in the dev-guide: suggest /crystallize to capture these decisions before planning. Do not block — user can decline and proceed without a crystal file.
  3. If a design doc path exists: read the design doc and check for a ## UX Assertions section. Note the result — it controls Step 2.5.
  4. Preload relevant lessons:
    • Extract keywords from Phase scope items and goal (component names, technology terms, domain terms)
    • Search docs/09-lessons-learned/ for entries matching these keywords: Grep(pattern="<keyword1>|<keyword2>|<keyword3>", path="docs/09-lessons-learned/", output_mode="content", context=5)
    • If matches found: note the matching lesson entries for reference during plan writing
    • If no matches or directory does not exist: skip silently
  5. Read source files: Read the design doc, design analysis, crystal file (if any), and key codebase files relevant to the Phase scope. This grounds the plan in actual code state.
  6. Write the plan following the Plan Writing Reference in ${CLAUDE_PLUGIN_ROOT}/skills/write-plan/SKILL.md. Use the gathered Phase context as inputs. Save to docs/06-plans/YYYY-MM-DD-<feature-name>-plan.md.
  7. Update state: plan_file: <path>, last_updated: <now>
  8. Present plan summary to user (task count, key files)
  9. Decision Points: Check the ## Decisions section of the plan file.
    • If Decisions > 0:
      • First time this session: Read ${CLAUDE_PLUGIN_ROOT}/references/decision-points.md
      • Apply the rules with parameters:
        • Source file: the plan file
        • Mode: full
        • Recording: default
  10. Auto-select verification speed: count tasks in the plan file. If task count < 5: mark --fast flag for Step 3 (use Sonnet for verification). If task count ≥ 5: no flag (use Opus default).
  11. PushNotification checkpoint 1: if the plan's ## Decisions section has > 0 unresolved DPs, emit a PushNotification with message like Phase {N} plan ready — {K} decisions await your input (see "## User-Visible Notifications" above).
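Two of the mechanical rules in this step (item 4's keyword pattern and item 10's speed selection) can be sketched as follows. Both function names are hypothetical, and how tasks are counted in the plan file is left to the caller because the plan format is defined elsewhere.

```python
import re


def verify_flags(task_count):
    """Fewer than 5 tasks: fast verification; 5 or more: default (no flag)."""
    return ["--fast"] if task_count < 5 else []


def lessons_pattern(keywords):
    """Join keywords into a regex alternation for the lessons-learned search.

    Keywords are de-duplicated (order preserved) and regex-escaped so that
    characters such as '.' or '+' in a technology name match literally.
    """
    unique = list(dict.fromkeys(k.strip() for k in keywords if k.strip()))
    return "|".join(re.escape(k) for k in unique)
```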

Step 2.5: UX Review (conditional)

Trigger condition: The design doc contains a ## UX Assertions section with at least one assertion row (not just the header). If no design doc, no UX Assertions section, or the table has zero assertion rows, skip to Step 3.

  1. Update state: phase_step: ux-review, last_updated: <now>
  2. Read the generated plan file
  3. Read the design doc's ## UX Assertions table and ## User Journeys section
  4. Build a mapping table:

For each UX assertion:

  • Find plan tasks with UX ref: UX-NNN matching this assertion
  • Extract the task's User interaction: line (if present)

For each UI-facing plan task without a UX ref: line:

  • Note as unmapped
  5. Present to user:
UX Assertion Coverage:

| UX ID | Assertion | Plan Task(s) | User Interaction | Status |
|-------|-----------|-------------|-----------------|--------|
| UX-001 | {assertion text} | Task 3, Task 5 | {from task's User interaction: line, or "—"} | ✅ Mapped |
| UX-002 | {assertion text} | — | — | ❌ No task |
| UX-003 | {assertion text} | Task 7 | {from task} | ✅ Mapped |

Unmapped UI tasks (no UX ref):
- Task 4: {task title} — {reason or "needs UX assertion?"}

Confirm this mapping is correct, or provide corrections.
  6. Wait for user response:

    • User confirms: proceed to Step 3
    • User provides corrections:
      • For minor fixes (add/correct UX ref: lines, adjust User interaction: text): edit the plan file directly
      • For structural changes (add missing tasks, redesign task scope): revise the plan directly in main context
      • Re-present the mapping for confirmation after corrections
    • Max 2 correction cycles; after that, proceed with noted gaps
  7. Update state: last_updated: <now>
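The mapping table built in this step can be sketched as a plain data transformation. The input shapes are assumptions for illustration: assertions as a dict of UX ID to text, and tasks as dicts with hypothetical `number`, `ui`, and `ux_refs` keys already parsed from the plan.

```python
def ux_coverage(assertions, tasks):
    """Build coverage rows per UX assertion and list unmapped UI-facing tasks."""
    rows = []
    for ux_id, text in assertions.items():
        matched = [t for t in tasks if ux_id in t.get("ux_refs", [])]
        rows.append({
            "ux_id": ux_id,
            "assertion": text,
            "tasks": [t["number"] for t in matched],
            "status": "Mapped" if matched else "No task",
        })
    # UI-facing tasks with no UX ref at all are reported separately.
    unmapped = [t["number"] for t in tasks if t.get("ui") and not t.get("ux_refs")]
    return rows, unmapped
```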

Step 3: Verify

Auto-approve condition: If ALL of the following are true:

  • Plan has 3 or fewer tasks
  • No design doc reference (or set to "none")
  • No crystal file reference (or set to "none")

Then skip full agent verification. Instead:

  1. Read the plan file
  2. Perform inline sanity check in main context:
    • Each task has **Files:** and **Steps:** sections
    • Task dependencies (if any) are ordered correctly
    • No obvious gaps (e.g., task references a file not listed in any task's Files)
  3. Update state: phase_step: verify, verification_report: "auto-approved (small plan)", last_updated: <now>
  4. Skip to Step 4

Otherwise: proceed with full verification below.

  1. Update state: phase_step: verify, last_updated: <now>
  2. Invoke dev-workflow:verify-plan with the plan from Step 2 (pass --fast flag if set in Step 2)
  3. Update state: verification_report: <summary>, last_updated: <now>

If still "Must revise" after 2 revision cycles: Present the remaining issues to the user:

"Plan verification failed after 2 revision attempts. Remaining issues: [list specific issues from verifier output]

Options:
A. Stop and manually revise the plan, then re-run this step
B. Proceed with imperfect plan (issues noted in execution; treat as extra caution points)"

Wait for user choice. If A: stop. If B: mark state verification_report: "partial" and continue.
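The auto-approve condition at the top of this step is a conjunction of three checks, sketched here with a hypothetical function name:

```python
def can_auto_approve(task_count, design_doc, crystal_file):
    """True only when the plan is small and has no design doc or crystal file."""
    def absent(ref):
        return ref is None or ref == "none"

    return task_count <= 3 and absent(design_doc) and absent(crystal_file)
```

Note that this threshold (3 or fewer tasks) is stricter than the `--fast` threshold from Step 2 (fewer than 5), so a 4-task plan still gets agent verification, just with the faster model.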

Step 4: Execute (segmented sonnet agent dispatch via Workflow)

  1. Update state: phase_step: execute, last_updated: <now>
  2. Invoke dev-workflow:execute-plan to handle execution. The skill manages the full segmented dispatch lifecycle:
    • Runs compute_checkpoints.py to derive batches + hard_stops
    • Creates .claude/execute-plan-checkpoint.json (segment metadata: plan_file, total, batch_size, k, hard_stops, status; plus the completed map owned by the task agents)
    • For each segment: invokes Workflow({scriptPath, args}) with the segment's batches, awaits completion, spot-checks files_written, then either auto-continues to the next segment or pauses at a hard-stop for the user to say "continue"
    • Cross-session resume: the on-disk checkpoint file's completed map is authoritative; Workflow({resumeFromRunId}) is same-session only
    • Returns an explicit completion signal (the Terminal-write step): Execution complete: complete or Execution complete: completed_with_failures, or Paused at hard-stop: waiting for "continue" — never a silent return
    • On complete: deletes the checkpoint file; on completed_with_failures: retains it with status: "completed_with_failures" (the cross-session source for the fix pass)
  3. Completion detection — trust execute-plan's explicit return (in-context), not a file grep. execute-plan runs as the same main agent following nested instructions, so its return is available directly:
    • Execution complete: complete or completed_with_failures → execution finished; proceed (route failures to Step 7). Read the report at docs/06-plans/execution-report.md for the summary.
    • Paused at hard-stop: waiting for "continue" → NOT finished (see item 4).
    • Durable backup only (cross-session / truncated return): read the report's **Status:** line under this plan's **Plan:** <path> section. complete/completed_with_failures = done; in-progress = interrupted. (Scope to the plan section; the report is a shared append-log with one Status line per plan.)
  4. Do NOT advance to Step 5 while execution is paused at a hard-stop. If execute-plan returns Paused at hard-stop, execution did NOT complete — surface to the user that it is waiting for "continue"; do not advance to Step 5 and do not route to Step 7. Only a completed_with_failures return routes failed/blocked tasks to Step 7 (Fix).
  5. Present summary: completed/blocked/failed task counts
  6. If blocked or failed tasks exist: note them for Step 7 (Fix)
  7. Update state: last_updated: <now>
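The completion-detection rules in items 3 and 4 can be sketched as a router over execute-plan's return text. The signal strings are quoted from this step; the function name and the returned action labels are hypothetical.

```python
def route_execution(return_text):
    """Map execute-plan's explicit return signal to the orchestrator's next action."""
    if "Paused at hard-stop" in return_text:
        return "wait_for_continue"  # not finished: do not advance to Step 5
    # Check the longer signal first: "complete" is a prefix of "completed_with_failures".
    if "Execution complete: completed_with_failures" in return_text:
        return "proceed_and_fix"  # go to Step 5, route failures to Step 7
    if "Execution complete: complete" in return_text:
        return "proceed"  # go to Step 5
    return "check_report"  # no explicit signal: fall back to the report's Status line
```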

Step 5: Test Changes (sonnet agent dispatch)

  1. Update state: phase_step: test, last_updated: <now>
  2. Invoke dev-workflow:test-changes with:
    • Project root
    • Plan file path (from state plan_file)
  3. When the skill returns: read the test report path from its output
  4. Present test summary: Build (pass/fail), Tests (X/Y passed), Lint (pass/fail)
  5. If test failures exist: note them for Step 7 (Fix)
  6. Update state: test_report: <report path>, last_updated: <now>

Step 5.5: Visual Feedback Loop (main context — opus)

This step closes the visual gap between implemented UI and design reference before human review (D-010: get the UI most of the way there and leave the last mile to a human; this is not fully automated pixel-level matching). It runs after test-changes and before the review agents, so reviewers see visually-corrected code.

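⚠️ Needs project validation: actually running this step in a real iOS project (render → diff → fix → converge/hand off) cannot be verified in this repo.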

  1. Update state: phase_step: visual, last_updated: <now>

  2. Gate ①: prerequisites + UI relevance — first verify apple-dev (which provides render-preview) is installed: ls ~/.claude/plugins/cache/*/apple-dev/ 2>/dev/null. If no output → skip this entire step: log Visual step skipped: apple-dev not installed, set phase_step: review, proceed to Step 6. (Step 6 guards apple-dev too, but it runs after this step, so the check is repeated here.) Then derive the list of modified SwiftUI view files independently in this step:

    • Primary: scan the plan's per-task **Files:** sections (always available since Step 4 has completed) for modified .swift files. A file counts as a view if EITHER its name matches a view suffix (*View / *Card / *Row / *Cell / *Tab / *Screen / *Sheet / *Banner) OR it contains a #Preview block or a : View conformance. Do NOT filter on *View.swift alone: SwiftUI views are frequently named Card/Row/Tab/Screen, and a name-only filter silently skips them.
    • Optional cross-check: git diff --name-only against the phase's starting commit (only if a baseline ref was recorded), filtered by the same view-detection rule
    • If the resulting list is empty → skip this entire step: log Visual step skipped: non-UI phase, set phase_step: review, proceed to Step 6.
    • Note: Step 6's ui-reviewer uses the same signal source, but computes it inline at that point. This step derives it independently — Step 6 has not run yet and its inline condition is not a stored artifact. (This step's own fixes may touch additional files, so Step 6's recomputed set can differ slightly — expected.)
  3. Gate ②: Design reference (only if Gate ① passes) — resolve a design reference image path in this order:

    • (a) /tmp/design-screenshot-*.png (understand-design output) — already an image path
    • (b) docs/06-plans/*-design-analysis.md — extract the first referenced image path from the markdown
    • (c) Design doc path from dev-guide header — extract the first referenced image path from the doc
    • If no actual image path resolves (none of a/b/c yields an image file) → skip (DP-002=A): log Visual step skipped: no design reference image, set phase_step: review, proceed to Step 6.
    • Do NOT attempt self-evaluation without a reference image.
  4. Render-diff-fix loop (both gates passed) — filter the Gate ① list to views that contain a #Preview block. If that filtered list is empty → skip: log Visual step skipped: no #Preview blocks in modified views, set phase_step: review, proceed to Step 6. Otherwise, for each such view:

    a. Invoke apple-dev:render-preview via the Skill tool, passing swiftFile: <absolute path to this view's .swift file>, outputDir: <a caller-controlled dir under the repo, e.g. .claude/visual-phase{N}/> (REQUIRED; do NOT omit, because render-preview's default system-temp dir is non-deterministic to this caller), and previewId when the file has multiple #Preview blocks. render-preview is context: fork, so its returned message is summarized; do NOT parse the returned message for the path. Instead read the authoritative result file it writes at <outputDir>/<name>.result.json (<name> = the .swift basename, plus -{previewId} when set) and parse {channel, pngPath, downsampled, error} from that file.

    • If the result file is missing OR error is not null: log Render failed for {ViewName}: {error or "no result file"} and skip this view; continue to next.

    b. Read pngPath (the value from the result file) in main context. Compare rendered output against the design reference image: identify visual differences (spacing, layer hierarchy, color, text truncation, overflow).

    c. If differences are material: fix the corresponding SwiftUI code in main context, then return to step (a) to re-render.

    d. Iteration cap: ≤3 rounds per view. If cap reached with remaining differences, stop and record them.
  5. Hand-off — after the loop:

    • Collect all remaining differences (cap-reached or skipped views) into a plain-language list using spatial language (screen position and appearance — no code identifiers).
    • Present as informational items (do NOT use AskUserQuestion — same nature as Step 6 human-verification items):

      Visual differences remain (needs a real device / human decision):

      • {item, spatial language}
    • Emit a PushNotification (< 80 chars, follow run-phase notification style), e.g.: Phase {N} visual loop done: {K} diffs remain for human review
  6. Set phase_step: review, proceed to Step 6.
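Gate ①'s view-detection rule can be sketched as a filename-or-content check. This is a heuristic illustration with hypothetical function names: the conformance regex only catches `: View` directly after the colon and will miss declarations such as `struct Foo: Identifiable, View`.

```python
import re
from pathlib import PurePath

VIEW_SUFFIXES = ("View", "Card", "Row", "Cell", "Tab", "Screen", "Sheet", "Banner")
# ": View" followed by a word boundary, so ": ViewModel" does not match.
VIEW_CONFORMANCE = re.compile(r":\s*View\b")


def has_preview(source):
    """True when the file contains a #Preview block (required for rendering)."""
    return "#Preview" in source


def is_view_file(path, source):
    """A .swift file counts as a view by name suffix OR by content."""
    p = PurePath(path)
    if p.suffix != ".swift":
        return False
    if p.stem.endswith(VIEW_SUFFIXES):
        return True
    return has_preview(source) or bool(VIEW_CONFORMANCE.search(source))
```

The render loop would then run only over files where both `is_view_file` and `has_preview` hold, matching the filter in item 4.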

Step 6: Document Features & Reviews (parallel agent dispatch)

  1. Update state: phase_step: review, last_updated: <now>

  2. Determine agents to dispatch:

    Feature spec agents (conditional):

    • Check the Phase scope for completed user journeys (a user journey is "completed" when all its acceptance criteria in the dev-guide are checked off)
    • If this is NOT an infrastructure-only Phase: confirm feature name and scope with the user, then prepare dev-workflow:feature-spec-writer dispatch for each completed feature
    • If infrastructure-only (no user journeys): no feature-spec-writer dispatch

    Review agents (always at least one):

    • Always: dev-workflow:implementation-reviewer agent
    • apple-dev reviewers (conditional): before adding any of the three below, verify apple-dev is installed via ls ~/.claude/plugins/cache/*/apple-dev/ 2>/dev/null. If no output, skip all three and add to the Step 6 summary table: "apple-dev not installed — UI/design/feature review coverage skipped". If installed:
      • If Phase modified UI files: apple-dev:ui-reviewer — pass list of modified *View.swift files
      • If Phase created new pages/components: apple-dev:design-reviewer — pass list of new View files
      • If Phase completed a full user journey: apple-dev:feature-reviewer — pass feature scope + key files
    • If this is the submission prep Phase: invoke /asc-submit-preview skill after agents complete (requires apple-dev installed; if not, note in summary and skip)

    Note: run-phase dispatches Apple reviewers via Phase-completion signals (UI files modified / new components / journey completed). /review-execution dispatches the SAME agents via git-diff signals (HAS_SWIFT / HAS_NEW_VIEW / user keywords). Both routes are intentional — run-phase serves orchestrated phases; review-execution is standalone. Running both back-to-back will dispatch agents twice with slightly different scopes.

  3. Dispatch ALL agents in parallel using the Task tool in a single message:

    For feature-spec-writer (if applicable):

    Generate a feature spec with the following inputs:
    
    Feature name: {name}
    Feature scope: {scope}
    Design doc paths:
    {relevant design doc paths and sections}
    Dev-guide: {dev-guide path} Phase {N}
    Key implementation files:
    {list of key files}
    Project root: {project root}
    

    For implementation-reviewer (always): pass plan file path, project root, and design doc path (from state or dev-guide; "none" if no design doc).

    For apple-dev agents (conditional):

    • apple-dev:ui-reviewer (if UI files modified): pass list of modified *View.swift files
    • apple-dev:design-reviewer (if new pages/components): pass list of new View files
    • apple-dev:feature-reviewer (if full user journey completed): pass feature scope + key files

    Each agent receives a fresh context — they have no memory of how the code was written. This removes confirmation bias from self-review.

  4. When all return: For each agent, check its report file:

    • Agent returned a Report: path → Read that file
    • Agent was truncated (no Report: in return) → search .claude/reviews/ for the agent's report file pattern (e.g., implementation-reviewer-*.md). If found and **Status:** in-progress, the agent was truncated — use the partial results.
    • No report file found at all → note as "❌ File not produced" in the summary — do not retry review agents (their output is informational, not blocking).
  5. Present a consolidated summary table:

| Agent | Verdict | Issues | Report/Spec |
|-------|---------|--------|-------------|
| Feature Spec: {name} | ✅/❌ | {user story counts} | {path} |
| Implementation | ✅/❌ | {gap counts}; Tests: {required}/{exist}/{covered} | {path} |
| UI | ✅/❌ | {counts} | {path} |
| Design | ✅/❌ | {counts} | {path} |
| Feature Review | ✅/❌ | {counts} | {path} |
  6. Feature spec decision points: If feature-spec-writer was dispatched, check its return for Decisions: count.

    • If Decisions > 0:
      • First time this session: Read ${CLAUDE_PLUGIN_ROOT}/references/decision-points.md
      • Apply the rules with parameters:
        • Source file: the spec file
        • Mode: full
        • Recording: default
  7. Surface human verification items: If any review report's compact summary shows 人工验证项 (manual verification items) > 0 or 设备验证项 (device verification items) > 0:

    • Read each report file that has verification items
    • Extract items from these sections:
      • ui-reviewer: ### Part C: 人工验证清单
      • design-reviewer: ### Part B: 设备验证清单
      • feature-reviewer: ### Part C: 设备验证清单
    • Consolidate, deduplicate, and present in plain language below the summary table:

    The following need confirmation on a device (from the review reports):

    • {item, translated to spatial language — use screen position and appearance, NO code identifiers}
    • {item}
    • ⚠️ Needs a real device: {animation/transition items}

    This is informational; do not block with AskUserQuestion. The user can raise issues during Step 7 (Fix Issues).

  3. Surface test coverage summary: If implementation-reviewer's compact return includes a Tests: line:

    • Extract: required, exist, pass, shell counts
    • If shell > 0 or pass < required: present warning below the human verification items:

      ⚠️ 测试覆盖不完整:{N} 个计划要求的测试中,{M} 个为空壳或未覆盖核心路径

  4. Update state: review_reports: [<report file paths from agent summaries>], last_updated: <now>

  5. PushNotification checkpoint 2: emit PushNotification with message like Phase {N} reviews complete — {N} gaps, {M} verifications need device (see "## User-Visible Notifications" above). Skip if all agents passed with zero issues AND the user has been responding within the last minute.
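
The three report-handling cases in Step 6 (a Report: path was returned, the agent was truncated but left a file, or no file exists) can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name `resolve_report`, the status labels, and the choice to pick the last matching file by name are hypothetical and not part of the skill.

```python
import re
from pathlib import Path
from typing import Optional, Tuple


def resolve_report(agent_return: str, reviews_dir: Path, pattern: str) -> Tuple[str, Optional[Path]]:
    """Classify one review agent's output as "complete", "partial", or "missing"."""
    # Case 1: the agent's return contains a "Report: <path>" line, so read that file.
    match = re.search(r"^Report:\s*(\S+)", agent_return, flags=re.MULTILINE)
    if match:
        return "complete", Path(match.group(1))

    # Case 2: no Report: line, so search the reviews directory for the agent's file pattern.
    candidates = sorted(reviews_dir.glob(pattern))
    if candidates:
        # Assumption: when several files match, the last one by name is the newest.
        candidate = candidates[-1]
        if "**Status:** in-progress" in candidate.read_text(encoding="utf-8"):
            return "partial", candidate  # truncated agent: use the partial results
        # Assumption: a file without the in-progress marker is a finished report.
        return "complete", candidate

    # Case 3: nothing on disk. The caller notes "File not produced" and does not retry.
    return "missing", None
```

A caller would run this once per dispatched agent, for example with `Path(".claude/reviews")` and a pattern such as `implementation-reviewer-*.md`, and use the returned status to fill the Verdict and Report/Spec columns of the summary table.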

Step 7: Fix Issues

If any of the following have issues: execution report (blocked/failed tasks), test report (build/test/lint failures), or review reports (gaps):

  1. Update state: phase_step: fix, last_updated: <now>

  2. Collect all issues from three sources:

    a. Execution failures (from Step 4): blocked/failed tasks from the execute-plan agent report
    b. Test failures (from Step 5): build errors, test failures, lint errors from the test-changes report
    c. Review gaps (from Step 6): plan-vs-code gaps, pre-existing issues

    Read the relevant report files for full details. Skip entries that are not file paths (e.g., "user-override" sentinel values).

  3. List all issues sorted by severity (critical first, then warnings), separated by origin:

    • Execution blockers ({N}):
      • {blocked/failed tasks with reasons}
    • Test failures ({N}):
      • Build: {errors if any}
      • Tests: {failing test names + assertion messages}
      • Lint: {errors if any}
    • Review issues ({N}):
      • {plan-vs-code gaps}
    • Pre-existing issues ({M}):
      • {pre-existing issues from implementation-reviewer}
  4. Ask the user: "Fix these issues before moving on, or mark as known issues?"

  5. If fixing:

    a. Separate design issues from code issues. If a design-reviewer report exists among review_reports:

    • Extract all 🔴 items from design-reviewer Part A
    • Group by category: Hierarchy (A1, A11), Spacing (A3, A12), Consistency (A5, A6), Color (A2)
    • Present design issues separately from other review issues:

      Design issues ({N} must be fixed):

      • {category}: {count}

      Code/UI issues ({M}):

      • {summary}

    b. Fix all issues (design + code), then re-run test-changes if it had failures, and re-run only the reviews that had failures.

    c. Design re-verification limit: If design-reviewer still fails after 1 fix cycle, report the remaining design issues and proceed; do not loop. Other reviewers (implementation, UI, feature) follow existing behavior.
  6. If skipping: note the known issues and proceed

  7. Update state: gaps_remaining: <count>, last_updated: <now>

  8. Decision Points: Check each review report for Decisions: count.

    • If any report has Decisions > 0:
      • First time this session: Read ${CLAUDE_PLUGIN_ROOT}/references/decision-points.md
      • Apply the rules with parameters:
        • Source file: the review report
        • Mode: mixed
        • Recording: default
    • Then proceed to Step 8
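
Collecting and listing issues in Step 7 amounts to filtering out sentinel entries, then grouping by origin and sorting by severity. The sketch below shows one way to do that. The `Issue` type, the origin and severity labels, and the "has a file suffix" test for recognizing a path are all assumptions made for illustration; the skill only specifies the ordering and the skip rule.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List

# Hypothetical labels; the skill only says "critical first, then warnings".
SEVERITY_ORDER = {"critical": 0, "warning": 1}
ORIGINS = ("execution", "test", "review", "pre-existing")


@dataclass(frozen=True)
class Issue:
    origin: str    # one of ORIGINS
    severity: str  # "critical" or "warning"
    summary: str


def readable_reports(entries: List[str]) -> List[str]:
    """Drop entries that are not file paths, such as the "user-override" sentinel."""
    # Assumption: a real report path has a file suffix (for example ".md").
    return [entry for entry in entries if Path(entry).suffix]


def group_issues(issues: List[Issue]) -> Dict[str, List[Issue]]:
    """Group issues by origin, each group sorted critical first, then warnings."""
    grouped: Dict[str, List[Issue]] = {origin: [] for origin in ORIGINS}
    for issue in issues:
        grouped.setdefault(issue.origin, []).append(issue)
    for bucket in grouped.values():
        # Unknown severities sort last; sort() is stable, so input order is kept within a level.
        bucket.sort(key=lambda i: SEVERITY_ORDER.get(i.severity, len(SEVERITY_ORDER)))
    return grouped
```

Keeping an empty list for every origin means the presentation step can always print all four headings with a count, including zero.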

Step 8: Phase Completion

  1. Pre-completion gate (structural enforcement):

    • Read review_reports and test_report from state file
    • If review_reports is empty AND test_report is null (no reports): BLOCK: "Cannot complete phase: no test or review reports found. Run Step 5 and Step 6 before marking phase as done." Do NOT proceed. Use AskUserQuestion:
      • Option A: "Run Step 5 now" → return to Step 5
      • Option B: "Skip test and review, complete phase" → add review_reports: ["user-override"], test_report: "user-override", log override, proceed
    • If any review report has verdict ❌ AND gaps_remaining > 0: BLOCK: "Cannot complete phase: {gaps_remaining} unresolved gaps." Do NOT proceed. Use AskUserQuestion:
      • Option A: "Fix gaps (Step 7)" → return to Step 7
      • Option B: "Mark as known issues and complete" → proceed with gaps noted
  2. Update state: phase_step: done, last_updated: <now>

  3. Update the dev-guide:

    • Check off this Phase's acceptance criteria
    • Add status line: **Status:** ✅ Completed — YYYY-MM-DD after the Phase heading
  4. Issue archival (conditional): If Step 7 left items marked as "known issues" or skipped gaps:

    • Ask: "Create GitHub Issues for {N} deferred items?"
    • If yes:
      • Check label existence: gh label list --json name -q '.[].name'
      • If deferred label doesn't exist, create it: gh label create "deferred" --color "FBCA04" --description "Deferred from phase review"
      • If phase-{N} label doesn't exist, create it: gh label create "phase-{N}" --color "0E8A16" --description "Phase {N}"
      • For each deferred item, run gh issue create with labels deferred and phase-{current phase number}. Use the item description as issue body under ### Symptom.
    • Display all created issue URLs.
  5. Remind the user to update project docs:

    • docs/07-changelog/ — record changes
    • docs/03-decisions/ — if architectural decisions were made
  6. Report:

    Detect if this is the last phase: After checking off this Phase's acceptance criteria (step 3 above), re-read the dev-guide. If ALL phases now have all acceptance criteria checked (- [x]), this is the last phase.

    If more phases remain:

    Phase N complete. Next: Phase N+1 — [name]: [goal]. Run /run-phase to continue, or /commit to save progress first.

    If all phases are complete:

    Phase N complete. All phases done. Run /finalize for cross-phase validation, or /commit to save progress.

  7. PushNotification checkpoint 3: emit PushNotification with message like Phase {N} done — {next-action} where next-action is either next phase N+1 ready or all phases complete — run /finalize. Always send this one (phase completion is a high-signal event regardless of activity).
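
The pre-completion gate and the last-phase check in Step 8 are both simple predicates, sketched below. The function names and return labels are hypothetical, the verdicts are assumed to have been read from the review reports already, and the last-phase check assumes every checkbox in the dev-guide is an acceptance criterion.

```python
import re
from typing import Any, Dict, List

UNCHECKED = re.compile(r"^\s*- \[ \]", re.MULTILINE)
CHECKED = re.compile(r"^\s*- \[x\]", re.MULTILINE | re.IGNORECASE)


def completion_gate(state: Dict[str, Any], verdicts: List[str]) -> str:
    """Return "proceed" or a block reason, following the two gate conditions."""
    reports = state.get("review_reports") or []
    # Block 1: no review reports AND no test report were recorded.
    if not reports and state.get("test_report") is None:
        return "block:no-reports"
    # Block 2: at least one failing verdict AND unresolved gaps remain.
    if "❌" in verdicts and state.get("gaps_remaining", 0) > 0:
        return "block:unresolved-gaps"
    return "proceed"


def all_phases_complete(dev_guide_text: str) -> bool:
    """True when the guide has checked criteria and no unchecked ones remain."""
    # Assumption: every "- [ ]" / "- [x]" line in the guide is an acceptance criterion.
    has_checked = CHECKED.search(dev_guide_text) is not None
    has_unchecked = UNCHECKED.search(dev_guide_text) is not None
    return has_checked and not has_unchecked
```

Note that the "user-override" sentinel written by Option B makes `review_reports` non-empty and `test_report` non-null, so the first block no longer fires after an explicit override.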

Rules

  • Never skip Step 5 or Step 6. Testing and reviews are not optional.
  • Never skip verification. Step 3 must run before Step 4.
  • Phase order matters. Don't start Phase N+1 if Phase N has unchecked acceptance criteria (unless user explicitly overrides).
  • Consolidate review output. Merge all review results into one summary with sections.
  • State before action. Update state file before starting each step, not after.
  • Visual feedback (Step 5.5) is conditional — skipped for non-UI phases or when no design reference exists; it never blocks (remaining diffs surface as informational human-verification items).
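
The "state before action" rule can be illustrated with a small helper that merges new fields into the state file and stamps `last_updated` before a step starts. This is a sketch, not the skill's actual mechanism: the helper name `update_state` is hypothetical, and the ISO 8601 UTC timestamp format is an assumption, since the skill only writes `last_updated: <now>`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def update_state(state_path: Path, **fields: Any) -> Dict[str, Any]:
    """Merge fields into the workflow state file and refresh last_updated."""
    # Start from the existing state so earlier fields survive the update.
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    else:
        state = {}
    state.update(fields)
    # Assumption: timestamps are stored as ISO 8601 strings in UTC.
    state["last_updated"] = datetime.now(timezone.utc).isoformat()
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8")
    return state
```

For example, `update_state(Path(".claude/dev-workflow-state.json"), phase_step="fix")` would be called at the top of Step 7, before any issue is collected, so that an interrupted session can resume at the right step.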

Completion Criteria

  • Phase acceptance criteria checked off in dev-guide (Step 8)
  • State file phase_step set to done
  • Next phase communicated to user