deep-review-task - SKILL.md Agent Skill

name: deep-review-task description: Run the sandboxed deep-review tool against a benchmark task PR — launches Claude Code in Docker with pre-fetched PR artifacts and writes review-summary.md / issues-found.md allowed-tools: Bash argument-hint: " [--review]"

Run the tools/deep-review sandboxed reviewer against a benchmark task PR. The tool spins up Claude Code inside a locked-down Docker container, pre-fetches all PR artifacts, and produces a full GUIDE.md-style review.

Prerequisites: Docker Desktop running, gh CLI authenticated, and either claude CLI logged in or ANTHROPIC_API_KEY set in tools/deep-review/.env. macOS or Linux/WSL.

Phase 0: Parse arguments

$ARGUMENTS is the PR URL, optionally followed by --review to auto-post a REQUEST_CHANGES review when the run finishes.

If no PR URL is given, stop and ask the user for one. Do not invent a URL.

Phase 1: Locate the tool

Find the repo root and verify tools/deep-review/run.sh exists:

REPO_ROOT=$(git rev-parse --show-toplevel)
RUN_SH="$REPO_ROOT/tools/deep-review/run.sh"
test -x "$RUN_SH" || { echo "deep-review not found at $RUN_SH — pull latest main"; exit 1; }

If missing, tell the user to git pull origin main (the tool landed in PR #605, commit 29b1f353) and stop.

Phase 2: Run it

Invoke run.sh directly with the PR URL. The default built-in prompt drives the full GUIDE.md review end-to-end — do not pass a custom prompt unless the user asked for one. Stream output live so the user sees thinking, tool calls, and the final cost line.

cd "$REPO_ROOT/tools/deep-review"
./run.sh --pr "<pr-url>"         # or add --review to auto-post

Pass --review only if the user included it in $ARGUMENTS or explicitly asked. Posting to GitHub is visible to others — don't decide on your own.

Phase 3: Surface the outputs

When the run completes, the task directory is tools/deep-review/tasks/<owner>/<repo>/pr-<N>/. Tell the user the paths to:

review-summary.md — full review
issues-found.md — auto-extracted ## Issues Found section
natural-difficulty-extensions.md — if the agent populated that section

Offer to open review-summary.md with open -g <path> so the user can read it in their default viewer.

If --review was used, the final Review posted: <url> line from run.sh is the posted review. Surface that URL to the user.

Notes

Concurrent runs against different PRs work — each PR has its own namespaced directory.
Re-running the same PR refreshes pre-fetched inputs in place; prior review-summary.md is overwritten by the agent.
Custom prompts (advanced): ./run.sh "your prompt" --pr <url> — only use this if the user asked for a narrower focus.
For configuration (model, effort, API key), see tools/deep-review/README.md.