name: implementation-plan-review description: Expert review of an implementation plan against a GitHub issue/enhancement description (provided as a local file or a GitHub issue URL) and the current repository codebase. Use when asked to critique a plan for correctness, completeness, codebase alignment, risks, and test/rollout readiness (do not implement). disable-model-invocation: true
Implementation Plan Review (Expert)
Review an implementation plan for coverage, correctness, and fit with the current codebase. Do not implement.
Inputs
- Implementation plan: a local file path.
- Issue/requirements: either (a) a local file path, or (b) a GitHub issue number (run from the target repo so
ghresolves it).
If the user provides a GitHub issue number, prefer fetching it into a local file using the bundled script (requires gh auth and network access; otherwise ask for a local issue description file):
bash scripts/fetch_github_issue.sh <issue-number> --out /tmp/issue.md
Run this command from the skill directory when that directory is inside the target repository. If it is not, keep the
command agent-agnostic by resolving the script path relative to this skill directory while running gh from the target
repo so the issue number resolves against the correct repository.
If the script reports that gh is not authenticated (exit code 3), ask the user to run:
gh auth login
Workflow
- Prepare clean scratchpads:
- Before reading or listing any scratchpad files, run from this skill directory:
bash scripts/prepare_scratchpads.sh <plan-file>
- Use the printed
OK <scratchpad-path>directory for all scratchpads in this review. - Existing scratchpads are stale working artifacts; never read or preserve them for a new review.
- If the script reports
ERROR, stop and report the failure.
Read the two inputs in full:
- Plan file
- Issue description file (or the fetched
/tmp/issue.md)
Apply versioning neutrality policy:
- Do not request a missing version bump (package version,
server.json, manifests, etc.) unless a repo rule, user instruction, release plan, or issue text explicitly requires one. - Do not suggest removing version bump steps merely because the issue does not mention versioning. Issues usually describe the problem, motivation, or code-level improvement; they are not expected to spell out release mechanics.
- If the plan already includes version bump steps, review them only for correctness and consistency with applicable repo rules: required files, matching version strings, valid version format, and no unrelated version/manifests changed.
- Raise a versioning finding only when the plan's versioning steps are internally inconsistent, contradict explicit requirements, or are objectively attached to the wrong files/surfaces.
- Do not request a missing version bump (package version,
Apply review-noise policy:
- Do not raise findings only because an implementation plan omits developer execution mechanics such as checking
/.dockerenv, choosing host vs devcontainer command prefixes, or spelling out both command variants. - Do not teach command invocation mechanics in recommendations.
- Review verification semantically: required test/lint/integration categories, targets, and coverage, not how a developer invokes commands in their environment.
- Still flag objectively wrong verification scope, such as requiring only a narrow test subset when repo rules require the full default suite.
- Do not raise findings only because an implementation plan omits developer execution mechanics such as checking
Validate codebase reality (start targeted, expand as needed):
- Start by finding referenced modules/configs/env vars/tests with
rg(fast and low-noise). - Prefer opening the minimal set of files first to confirm patterns and naming, but broaden freely if you suspect hidden coupling or cross-cutting behavior (e.g., shared helpers, config loading, response models, pagination, truncation).
- If the plan touches MCP tools, REST API, docs, or tests, cross-check relevant
.cursor/rules/*.mdcguidance. - If it improves confidence, use any other repo investigation strategy (e.g., inspect docs like
SPEC.md/API.md, check tests, usegit blame, or run unit tests/lint locally).
- Start by finding referenced modules/configs/env vars/tests with
Suggested commands (adapt as needed):
rg -n "name_in_plan|function_in_plan|ENV_VAR_IN_PLAN" -S .
rg -n "ToolResponse\\[|@mcp\\.tool\\(|log_tool_invocation" blockscout_mcp_server -S
rg -n "ServerConfig\\(|BaseSettings\\(|BLOCKSCOUT_" blockscout_mcp_server/config.py -S
rg -n "pytest\\.mark\\.integration|tests/integration|tests/tools" tests -S
- Ground findings with scratchpads:
- Use only the clean scratchpad directory prepared in step 1.
- For each actionable candidate finding, create one deterministic scratchpad file in final report order:
finding-01-short-slug.mdfinding-02-short-slug.md- Continue numbering in the same order used in §4.
- Each scratchpad must include these sections:
Grounded context: exact code, docs, tests, configs, or rules inspected, with paths/functions/classes where relevant.Variants: at least 2 meaningful solution options; use 3-4 for non-trivial findings.Rubric: criteria for choosing between variants.Evaluation: a score table or concise comparison of variants against the rubric.Best recommendation: the concrete change to make to the implementation plan.Plain-language rationale: why the chosen recommendation is best.
- Use the scratchpad result to rewrite the final §4 recommendation. If scratchpad investigation disproves or weakens a finding, remove it or downgrade it before final output.
- Findings that cannot be grounded in a scratchpad should normally be omitted. Keep them only when they are genuinely product/intent questions and mark them as
Question.
Scratchpad discipline:
- Scratchpads are working artifacts created by this review skill to make final recommendations grounded.
- Do not create scratchpads for pure summary text, obvious nits, or questions that require user/product input rather than code investigation.
- Do not use scratchpads to pad the report; use them only to make actionable recommendations more accurate.
- Produce the review in the required format (next section).
Required output format
Produce a review with these sections:
1) Understanding
- Issue summary
- Acceptance criteria (bulleted)
2) Plan ↔ Requirements coverage
- What is covered well
- What is missing / ambiguous
3) Codebase alignment
- Key files/modules you inspected (with paths)
- Assumptions in the plan that match the codebase
- Assumptions that don’t match (explain and suggest correction)
4) Review comments (actionable)
Provide comments as a list. Each comment must include:
- Severity:
Blocker | Major | Minor | Question | Nit - Location: plan section/step + (when relevant) repo file/function/class
- Problem: what’s wrong / missing
- Recommendation: concrete change to the plan
- Rationale: why it matters (bug risk / security / perf / maintainability)
- Scratchpad: path to the backing scratchpad file, when the comment is actionable and not a pure
Question
Testing gaps rule:
- List every specific missing/incorrect test as an actionable comment in §4.
- In §6, provide a consolidated checklist that references those items without repeating full explanations.
5) Junior-dev readiness check
- Missing task-specific prerequisites, step ordering, and verification coverage
- Do not flag omitted environment-specific command invocation details
- Where the plan needs more explicit detail
6) Test & rollout strategy
- Consolidated test checklist (Unit / Integration / E2E / Negative & security / Performance & regression), referencing §4 test comments
- Migration/rollback plan if applicable
- Feature flags / safe rollout suggestions if applicable
Review focus checklist (use as prompts, not new requirements)
- Coverage: every acceptance criterion mapped to plan steps.
- Codebase alignment: paths, module structure, naming, existing helpers and patterns.
- Edge cases & compatibility: pagination, timeouts, empty results, truncation limits, backward compatibility.
- Security: input validation, SSRF/DNS rebinding boundaries, secrets handling, logging redaction, auth assumptions.
- Performance/scale: API call counts, caching, pagination strategy, long-running tasks/progress updates.
- Ops/observability: error handling, logs, metrics/telemetry/analytics implications, rollout/rollback.
- Versioning: only comment if explicitly required by the issue description; otherwise assume omission is intentional.