name: interrogate description: "Use for "interrogate", "adversarial review", "multi-model review", "challenge this", "stress test this code", "find blind spots", or "tear this apart". Multiple LLM reviewers challenge changes from independent angles." disable-model-invocation: true
Interrogate
Spawn one reviewer per configured model to adversarially review code changes. Each model gets the same prompt and rubric. The adversarial signal comes from model diversity, not assigned personas. Models differ in blind spots, priors, and reasoning patterns. Agreement across models is high-confidence signal; lone-model findings are worth reading but lower confidence.
The deliverable is a synthesized verdict. Do NOT auto-apply changes.
Step 1, Determine Scope
Identify what to review from context:
- If the user points at specific files or a diff, use that
- If on a feature branch, run
git diff main...HEAD(or the appropriate base branch) for the full changeset - If the user's message references recent work, gather the relevant files
Package the diff (or file contents) plus any surrounding context files the reviewers need to understand the code.
Step 2, State the Intent
Before spawning reviewers, state the intent explicitly. What is this code trying to accomplish? Derive this from:
- The user's message
- Commit messages
- PR description if one exists
- The code itself
Write one clear paragraph. Reviewers challenge whether the work achieves the intent well, not whether the intent itself is correct. If you're unsure about the intent, ask the user before proceeding.
Step 3, Spawn Reviewers
Launch one reviewer per model in your configured interrogate list (defaults claude-opus-4-8-thinking-xhigh, gpt-5.5-high-fast, composer-2.5-fast), all in a single message.
For each reviewer:
subagent_type:generalPurposemodel: one model from the configured interrogate listreadonly:true
If a configured model slug is rejected as unresolvable when you try to spawn the subagent, check the valid slugs in the Task tool's error message, pick the closest equivalent (prefer the highest-reasoning tier of the same family), spawn with the valid slug, and open a separate PR to update the configured defaults. Do not block the review on the slug issue.
Read references/reviewer-prompt.md and fill in the template with:
- The stated intent
- The diff or file contents
- The review rubric from
references/rubric.md - The code-quality lens from
references/code-quality-review.md
The same filled template goes to all reviewers, so every model applies the code-quality lens.
Each reviewer produces structured findings as described in the prompt template.
Step 4, Synthesize
As results come back, build a unified picture:
- Parse all findings from the reviewers
- Identify consensus. Findings raised by 2+ models independently are highest signal.
- Identify lone-model findings. Still worth reading, but weight accordingly.
- Deduplicate. Different models may describe the same issue differently. Merge these and note which models raised it.
- Note disagreements. If one model flags something and another explicitly says the opposite, that's useful context for the verdict.
Step 5, Lead Judgment
You are the lead reviewer, a pragmatic senior engineer, not a neutral aggregator.
Read references/lead-judgment.md for the full framework. Reviewers only see a slice of the codebase. You have the full context (the goal, the constraints, the timeline, which tradeoffs were already considered). Use that context aggressively.
Categorize every finding using these buckets:
- Act on. Real issues affecting correctness, security, or maintainability given the actual goals. These would block a real PR.
- Consider. Legitimate points, but you're not sure they outweigh the cost of addressing them right now. Worth the user's attention.
- Noted. Technically valid but not actionable. Context-dependent, premature optimization, or low-impact given the current stage.
- Dismissed. Wrong, nitpicky, or missing context. Brief explanation why.
For each finding, include:
- Which model(s) raised it
- The category (act on / consider / noted / dismissed)
- A one-line rationale for the categorization
Output Format
Present the verdict in this structure:
Intent
[The stated intent paragraph from Step 2]
Reviewers
List each reviewer on its own line like - <model name>: [N findings]
Act On
[Findings that should be addressed. For each: description, which models raised it, why it matters.]
Consider
[Findings worth thinking about. For each: description, which models raised it, tradeoff involved.]
Noted
[Valid but low-priority. Brief list.]
Dismissed
[Rejected findings with brief rationale. This shows the user what was filtered out and why, so they can override your judgment if they disagree.]
Agreement Map
[Where did models agree, where did they diverge, and what does the pattern of agreement/disagreement tell us?]