name: deep-review-task
description: Run the sandboxed deep-review tool against a benchmark task PR — launches Claude Code in Docker with pre-fetched PR artifacts and writes review-summary.md / issues-found.md
allowed-tools: Bash
argument-hint: " [--review]"
Run the tools/deep-review sandboxed reviewer against a benchmark task PR. The tool spins up Claude Code inside a locked-down Docker container, pre-fetches all PR artifacts, and produces a full GUIDE.md-style review.
Prerequisites: Docker Desktop running, gh CLI authenticated, and either claude CLI logged in or ANTHROPIC_API_KEY set in tools/deep-review/.env. macOS or Linux/WSL.
Phase 0: Parse arguments
$ARGUMENTS is the PR URL, optionally followed by --review to auto-post a REQUEST_CHANGES review when the run finishes.
If no PR URL is given, stop and ask the user for one. Do not invent a URL.
Phase 1: Locate the tool
Find the repo root and verify tools/deep-review/run.sh exists:
REPO_ROOT=$(git rev-parse --show-toplevel)
RUN_SH="$REPO_ROOT/tools/deep-review/run.sh"
test -x "$RUN_SH" || { echo "deep-review not found at $RUN_SH — pull latest main"; exit 1; }
If missing, tell the user to git pull origin main (the tool landed in PR #605, commit 29b1f353) and stop.
Phase 2: Run it
Invoke run.sh directly with the PR URL. The default built-in prompt drives the full GUIDE.md review end-to-end — do not pass a custom prompt unless the user asked for one. Stream output live so the user sees thinking, tool calls, and the final cost line.
cd "$REPO_ROOT/tools/deep-review"
./run.sh --pr "<pr-url>" # or add --review to auto-post
Pass --review only if the user included it in $ARGUMENTS or explicitly asked. Posting to GitHub is visible to others — don't decide on your own.
Phase 3: Surface the outputs
When the run completes, the task directory is tools/deep-review/tasks/<owner>/<repo>/pr-<N>/. Tell the user the paths to:
review-summary.md— full reviewissues-found.md— auto-extracted## Issues Foundsectionnatural-difficulty-extensions.md— if the agent populated that section
Offer to open review-summary.md with open -g <path> so the user can read it in their default viewer.
If --review was used, the final Review posted: <url> line from run.sh is the posted review. Surface that URL to the user.
Notes
- Concurrent runs against different PRs work — each PR has its own namespaced directory.
- Re-running the same PR refreshes pre-fetched inputs in place; prior
review-summary.mdis overwritten by the agent. - Custom prompts (advanced):
./run.sh "your prompt" --pr <url>— only use this if the user asked for a narrower focus. - For configuration (model, effort, API key), see
tools/deep-review/README.md.