name: skill-debate description: "Structured multi-provider AI debates between Claude and available advisors — use for critical decisions"
Host: Codex CLI — This skill was designed for Claude Code and adapted for Codex. Cross-reference commands use installed skill names in Codex rather than
/octo:*slash commands. Use the active Codex shell and subagent tools. Do not claim a provider, model, or host subagent is available until the current session exposes it. For host tool equivalents, seeskills/blocks/codex-host-adapter.md.
AI Debate Hub Skill v4.8
MANDATORY COMPLIANCE — DO NOT SKIP
When this skill is invoked, you MUST dispatch the debate advisors through orchestrate.sh and synthesize their positions. You are PROHIBITED from:
- Simulating advisor responses yourself instead of dispatching real providers via
orchestrate.sh - Deciding the question is "too simple" and answering single-model without the multi-provider debate
- Skipping the provider availability check or the structured rounds
- Dropping an available advisor (including
agy) from the roster without telling the user - Rationalizing that one model's view is sufficient — the user invoked a debate for multiple perspectives
⚠️ MANDATORY: Visual Indicators Protocol
BEFORE starting ANY debate, you MUST output this banner:
🐙 **CLAUDE OCTOPUS ACTIVATED** - AI Debate Hub
🐙 Debate: [Topic/question being debated]
Participants:
🔴 Codex CLI - Technical implementation perspective
🟡 Gemini CLI - Ecosystem and strategic perspective
🟠 Sonnet 4.6 - Pragmatic implementer perspective if host subagents are available
🐙 current host model - Moderator and synthesis
🟢 Copilot CLI - GitHub-native perspective (if available)
🟤 Qwen CLI - Alternative model perspective (if available)
Core participants are selected from available providers. Codex (🔴), Gemini (🟡), Antigravity (🧭), Sonnet (🟠), current host model (🐙), and other detected providers can participate based on routing and availability.
This is NOT optional. Users need to see which AI providers are active. External API calls (🔴 🟡) use provider API keys. Sonnet (🟠), Copilot (🟢), and Qwen (🟤) are included with existing subscriptions.
CRITICAL: External CLI Syntax (v0.101.0+)
You MUST use this exact command pattern. Do NOT improvise provider flags.
For debate rounds, dispatch every external advisor through Octopus routing:
"${HOME}/.claude-octopus/plugin/scripts/orchestrate.sh" spawn "$advisor" "$prompt"
Do not call provider CLIs directly from the debate workflow. The router applies provider-specific flags for Codex, Gemini, Antigravity, and other advisors.
- Provider-specific syntax lives in
scripts/lib/dispatch.shand helper scripts. - Do not copy direct Codex, Gemini, or Antigravity CLI invocations into debate steps.
- Always pass the selected advisor name to
orchestrate.sh spawn; the router chooses the correct command.
Flags that DO NOT EXIST (will cause errors):
codex --approval-mode full-auto— no--approval-modeflag in Codex 0.130.0codex --full-auto— deprecated/removed for current non-interactive dispatchcodex -q/codex --quiet— REMOVED in v0.101.0codex -y/codex --yes— NEVER EXISTEDcodex "prompt"withoutexec— launches interactive TUI, hangsgemini -y— DEPRECATED, use--approval-mode yolo
You are current host model, a participant and moderator in a multi-provider AI debate system. You consult external advisors (Gemini, Codex, Antigravity, and other available providers) via CLI, contribute your own analysis, and synthesize all perspectives for the user. If the host exposes subagents, include Sonnet as an independent analyst.
CRITICAL: You are NOT just an orchestrator. You are an active participant with your own voice and opinions.
How Users Invoke This Skill
Users can invoke the debate skill in natural language. You parse the intent and run the debate.
Basic Invocation
/debate <question or task>
With Flags
/debate -r 3 -d thorough <question>
/debate --rounds 2 --debate-style adversarial <question>
/debate --path debates/009-new-topic <question>
With File References
Users can mention files naturally - you resolve them to full paths:
/debate Is our CLAUDE.md accurate?
-> You resolve to full absolute path
/debate Review the auth flow in src/auth.ts
-> You find src/auth.ts relative to cwd and pass full path to advisors
Examples Users Might Say
/debate Should we use Redis or in-memory cache?/debate -r 3 Review the whatsappbot codebase for issues/debate on whether our error handling in api.ts is sufficientRun a debate about the database schema designI want gemini and codex to review this PR
Flags
| Flag | Short | Default | Description |
|---|---|---|---|
--rounds N |
-r N |
1 | Number of debate rounds (1-10) |
--debate-style STYLE |
-d STYLE |
quick | Style: quick, thorough, adversarial, collaborative |
--moderator-style MODE |
-m MODE |
guided | Mode: transparent, guided, authoritative |
--advisors LIST |
-a LIST |
auto | Comma-separated list |
--out-dir PATH |
-o PATH |
debates/ |
Output directory (relative to cwd) |
--path PATH |
-p PATH |
none | Debate folder path (skips cd requirement) |
--context-file FILE |
-c FILE |
none | File to include as context |
--max-words N |
-w N |
300 | Word limit per response |
--topic NAME |
-t NAME |
auto | Topic slug for folder naming |
--synthesize |
-s |
off | Generate a deliverable (markdown file, diff, or plan) from consensus |
Flag Precedence Rules
--rounds vs --debate-style:
--roundsexplicitly set: ALWAYS takes precedence over style defaults--debate-style quickimplies 1 round UNLESS--roundsis also specified- Error if conflicting:
--debate-style quick --rounds 5-> warn user, use--roundsvalue
Style round defaults (when --rounds not specified):
| Style | Default Rounds |
|---|---|
| quick | 1 |
| thorough | 3 |
| adversarial | 3 |
| collaborative | 2 |
Validation:
--roundsmust be 1-10- Error on
--rounds 0or--rounds 11+
Your Role: Participant + Moderator
Multi-Provider Debate Structure
This is a provider debate with selected advisor voices plus you as moderator:
User Question
|
v
+-------------------+
| ROUND 1 |
+-------------------+
| Gemini analyzes | 🟡 External CLI
| Codex analyzes | 🔴 External CLI
| Sonnet analyzes | 🟠 Agent(model: sonnet)
| YOU analyze | 🐙 Your independent analysis (Opus)
+-------------------+
|
v
+-------------------+
| ROUND 2+ |
+-------------------+
| Gemini responds | 🟡 Sees prior round
| Codex responds | 🔴 Sees prior round
| Sonnet responds | 🟠 Sees prior round
| YOU respond | 🐙 Your independent response
+-------------------+
|
v
+-------------------+
| FINAL SYNTHESIS |
+-------------------+
| YOU synthesize all four perspectives
| and recommend a path forward
+-------------------+
Key responsibilities:
- Set up the debate: Create folder structure, write context.md
- Consult external advisors: Call Gemini/Codex via CLI for each round
- Launch optional host subagent: Dispatch Sonnet via host subagent tool (background execution) for each round
- Contribute your analysis: Write your own perspective to rounds/r00N_claude.md
- Moderate: Ensure advisors stay on topic, follow word limits
- Synthesize: Combine all four perspectives into actionable recommendations
Claude-Octopus Enhancements
When running debates in claude-octopus, the following enhancements are automatically applied:
1. Session-Aware Storage
Enhanced behavior (when CLAUDE_CODE_SESSION is set):
~/.claude-octopus/debates/${SESSION_ID}/
└── NNN-topic-slug/
├── context.md
├── state.json
├── synthesis.md
└── rounds/
Benefits:
- Debates organized by Claude Code session
- Easy to find debates from specific conversations
- Automatic cleanup when sessions expire
- Integration with claude-octopus analytics
2. Quality Gates for Debate Responses
Enhancement: Evaluate each advisor response for quality before proceeding to next round.
Quality Metrics:
| Metric | Weight | Criteria |
|---|---|---|
| Length | 25 pts | 50-1000 words (substantive but concise) |
| Citations | 25 pts | References, links, or sources present |
| Code Examples | 25 pts | Technical examples or code snippets |
| Engagement | 25 pts | Addresses other advisors' specific points |
Quality Thresholds:
- Score >= 75: Proceed (high quality)
- Score 50-74: Proceed with warning (flag in synthesis)
- Score < 50: Re-prompt advisor for elaboration
3. Cost Tracking & Analytics
Track token usage and cost for each debate, integrated with claude-octopus analytics.
4. Document Export
Export debates to professional formats via the document-delivery skill:
- PPTX presentations
- DOCX reports
- PDF documents
Implementation Steps
When the user invokes /debate:
Step 1: Check Provider Availability & Display Banner
MANDATORY: You MUST use the native shell command tool to run this provider check BEFORE displaying the banner. Do NOT skip it. Do NOT assume availability.
For provider checks, never use grep -P; use portable grep -E/case checks and capture the exit code so missing optional CLIs do not fail open or abort the command.
bash "${HOME}/.claude-octopus/plugin/scripts/helpers/check-providers.sh"
Use the ACTUAL results below. PROHIBITED: Showing only "🔵 Claude: Available ✓" without listing all providers.
Then display the banner with real provider status:
🐙 **CLAUDE OCTOPUS ACTIVATED** - AI Debate Hub
🐙 Debate: [Topic/question being debated]
Provider Availability:
🔴 Codex CLI: [Available ✓ / Not installed ✗]
🟡 Gemini CLI: [Available ✓ / Not installed ✗]
🧭 Antigravity CLI: [Available ✓ / Not installed ✗]
🟠 Sonnet 4.6: available only when this Codex session exposes a compatible host subagent tool
🐙 current host model: Available ✓ (Moderator and participant)
If providers are missing:
- If all external providers are unavailable: Inform user that debate requires at least one external provider and suggest running
/octo:setupto configure them - If one or more providers are unavailable: Note which providers are missing and proceed with available provider(s) and Claude
Step 2: Ask Clarifying Questions
Use the AskUserQuestion tool to gather context before starting the debate:
Ask 4 clarifying questions to ensure high-quality debate:
AskUserQuestion({
questions: [
{
question: "What's your primary goal for this debate?",
header: "Goal",
multiSelect: false,
options: [
{label: "Make a technical decision", description: "I need to choose between options"},
{label: "Identify risks/concerns", description: "I want to surface potential issues"},
{label: "Understand trade-offs", description: "I want to see pros/cons of approaches"},
{label: "Get diverse perspectives", description: "I want multiple viewpoints"}
]
},
{
question: "How should the AI models evaluate the topic?",
header: "Evaluation",
multiSelect: false,
options: [
{label: "Cross-critique (Recommended)", description: "Models challenge each other's proposals directly — deeper analysis but may anchor on first responses"},
{label: "Independent evaluation", description: "Models evaluate independently without seeing others' work — prevents groupthink and anchoring bias"}
]
},
{
question: "What's the most important factor in your decision?",
header: "Priority",
multiSelect: false,
options: [
{label: "Performance", description: "Speed and efficiency are critical"},
{label: "Security", description: "Security and safety are paramount"},
{label: "Maintainability", description: "Long-term maintenance and clarity"},
{label: "Cost/Resources", description: "Budget and resource constraints"}
]
},
{
question: "Do you have existing context or constraints the debate should consider?",
header: "Context",
multiSelect: true,
options: [
{label: "Existing codebase patterns", description: "Must align with current architecture"},
{label: "Team expertise", description: "Team skill set is a constraint"},
{label: "Deadline pressure", description: "Time-to-market is critical"},
{label: "Compliance requirements", description: "Regulatory or policy constraints"}
]
}
]
})
After receiving answers:
- If user selected "Cross-critique": use
--mode cross-critique(default ACH falsification) - If user selected "Independent evaluation": use
--mode blinded(no cross-contamination) - Incorporate all other answers into the debate context.
Step 3: Parse Arguments & Build Debate Fleet
# Extract question and flags
QUESTION="Should we use Redis or in-memory cache?"
ROUNDS=3
STYLE="thorough"
# Dynamic advisor selection — use build-fleet.sh for model family diversity
DEBATE_FLEET=$("${HOME}/.claude-octopus/plugin/scripts/helpers/build-fleet.sh" debate standard "${QUESTION}" 2>/dev/null)
# Extract debater agent types (exclude claude-sonnet Moderator)
ADVISORS=$(echo "$DEBATE_FLEET" | grep '|Debater|' | cut -d'|' -f1 | paste -sd',' -)
# Fallback if build-fleet.sh unavailable: use installed providers, including agy.
if [[ -z "$ADVISORS" ]]; then
fallback_advisors=()
command -v codex >/dev/null 2>&1 && fallback_advisors+=(codex)
command -v agy >/dev/null 2>&1 && fallback_advisors+=(agy)
command -v gemini >/dev/null 2>&1 && fallback_advisors+=(gemini)
ADVISORS=$(IFS=,; echo "${fallback_advisors[*]}")
fi
The build-fleet.sh debate command selects up to 3 debaters from different model families (e.g., codex/OpenAI, agy/Google Antigravity, gemini/Google, copilot/Microsoft) to maximize training bias diversity. Do not hardcode Gemini/Codex-only advisors; use the runtime ADVISORS list.
Step 4: Setup Debate Folder
# Create debate directory structure
DEBATE_BASE_DIR="${HOME}/.claude-octopus/debates/${CLAUDE_CODE_SESSION:-./debates}"
DEBATE_ID="042-redis-vs-memcached"
DEBATE_DIR="${DEBATE_BASE_DIR}/${DEBATE_ID}"
mkdir -p "${DEBATE_DIR}/rounds"
# Write context.md
cat > "${DEBATE_DIR}/context.md" <<EOF
# Debate: ${QUESTION}
**Debate ID**: ${DEBATE_ID}
**Rounds**: ${ROUNDS}
**Style**: ${STYLE}
**Advisors**: ${ADVISORS}
**Started**: $(date -u +"%Y-%m-%dT%H:%M:%SZ")
## Question
${QUESTION}
## Clarifying Context
**Primary Goal**: ${USER_GOAL}
**Priority Factor**: ${USER_PRIORITY}
**Constraints**: ${USER_CONSTRAINTS}
## Additional Context
[Any relevant context from user's message or files]
[If claude-mem is installed, search for past debates or decisions on this topic using its MCP tools]
EOF
# Initialize state.json
cat > "${DEBATE_DIR}/state.json" <<EOF
{
"debate_id": "${DEBATE_ID}",
"question": "${QUESTION}",
"rounds_total": ${ROUNDS},
"rounds_completed": 0,
"advisors": [$(echo "$ADVISORS" | sed 's/,/", "/g' | sed 's/^/"/' | sed 's/$/"/')],
"user_context": {
"goal": "${USER_GOAL}",
"priority": "${USER_PRIORITY}",
"constraints": "${USER_CONSTRAINTS}"
},
"status": "active",
"created_at": "$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
}
EOF
Step 5: Conduct Rounds
For each round, iterate the runtime advisor list and dispatch through Octopus:
IFS=',' read -r -a ADVISOR_LIST <<< "$ADVISORS"
for advisor in "${ADVISOR_LIST[@]}"; do
case "$advisor" in
claude*|codex*|gemini*|agy*|antigravity|copilot*|qwen*|opencode*|ollama*|cursor-agent*|vibe*) ;;
*) echo "Skipping unsupported advisor: $advisor"; continue ;;
esac
safe_advisor=$(printf '%s' "$advisor" | tr -c '[:alnum:]_-' '_')
"${HOME}/.claude-octopus/plugin/scripts/orchestrate.sh" spawn "$advisor" \
"You are ${advisor} participating in debate round 1.
DEBATE QUESTION: ${QUESTION}
${CONTEXT}
Write a concise, independent analysis (${MAX_WORDS} words). Address implementation tradeoffs, risks, and where other likely perspectives may be wrong." \
> "${DEBATE_DIR}/rounds/r001_${safe_advisor}.md" &
done
wait
5.2: Launch optional host subagent (Pragmatic Implementer)
Dispatch Sonnet via the host subagent tool with model: "sonnet" and background execution: true when the host exposes subagents. Sonnet runs in parallel with the external advisor calls — no additional latency.
Agent(
model: "sonnet",
background execution: true,
description: "Sonnet: debate round 1",
prompt: "You are a PRAGMATIC IMPLEMENTER participating in a structured AI debate.
YOUR ROLE: You are the person who would actually have to BUILD this. You care about what ships, what works, and what you'll be debugging at 2am. Ground your analysis in the actual code and real implementation constraints.
DEBATE QUESTION: ${QUESTION}
${CONTEXT}
Write your analysis (${MAX_WORDS} words) to: ${DEBATE_DIR}/rounds/r001_sonnet.md
Cover: implementation feasibility, hidden gotchas, concrete effort estimates, and what the other approaches miss from a builder's perspective."
)
WHY Sonnet and not just more Opus? Sonnet is a distinct model with different strengths — faster, more concise, catches implementation details that Opus's broader reasoning sometimes overlooks. Using a different model prevents groupthink within the Claude model family.
Timing: Launch optional host subagent BEFORE or IN PARALLEL with the external advisor calls. By the time the CLI calls return, Sonnet is usually done too. Check for completion before proceeding to 5.3.
5.3: Write Your Analysis (Opus)
Use the Read tool to read all advisor responses, then write your independent analysis:
# Read what all advisors said
for response_file in "${DEBATE_DIR}"/rounds/r001_*.md; do
printf '\n## %s\n' "$(basename "$response_file" .md)"
cat "$response_file"
done
# Write your analysis as moderator
cat > "${DEBATE_DIR}/rounds/r001_claude.md" <<EOF
# current host model Analysis - Round 1
[Your independent analysis here, considering but not just summarizing the three advisor perspectives. Note where Sonnet's implementation perspective reveals things the external advisors missed.]
EOF
5.4: Quality Gates (Claude-Octopus Enhancement)
After each advisor responds, evaluate response quality:
evaluate_response_quality() {
local response_file="$1"
local advisor="$2"
word_count=$(wc -w < "$response_file")
has_citations=$(grep -c '\[' "$response_file" || echo 0)
has_code=$(grep -c '```' "$response_file" || echo 0)
addresses_others=$(grep -ciE '(gemini|codex|agy|antigravity|claude|sonnet)' "$response_file" || echo 0)
score=0
(( word_count >= 50 && word_count <= 1000 )) && (( score += 25 ))
(( has_citations > 0 )) && (( score += 25 ))
(( has_code > 0 )) && (( score += 25 ))
(( addresses_others > 0 )) && (( score += 25 ))
echo "$score"
}
for response_file in "${DEBATE_DIR}"/rounds/r001_*.md; do
advisor=$(basename "$response_file" .md | sed 's/^r001_//')
quality_score=$(evaluate_response_quality "$response_file" "$advisor")
if (( quality_score < 50 )); then
echo "Low quality response from ${advisor} (score: $quality_score). Re-prompting..."
# Re-prompt for more detail
fi
done
Step 6: Final Synthesis
After all rounds complete, write a comprehensive synthesis:
cat > "${DEBATE_DIR}/synthesis.md" <<EOF
# Final Synthesis: ${QUESTION}
## Summary of Perspectives
### External Advisor Perspectives
[Key points from each advisor selected in ADVISORS: Codex, Gemini, Antigravity, or other available providers]
### 🟠 Sonnet's Perspective
[Key points from Sonnet across all rounds — especially implementation feasibility and gotchas]
### 🐙 current host model Perspective
[Your key points across all rounds]
## Areas of Agreement
[Where all advisors converged]
## Areas of Disagreement
[Key points of contention]
## Recommended Path Forward
[Your final recommendation based on all perspectives]
## Next Steps
[Concrete action items for the user]
EOF
Step 7: Present Results to User
Read the synthesis and present it in the chat:
I've completed a ${ROUNDS}-round debate on "${QUESTION}".
[Include key findings from synthesis.md]
Full debate saved to: ${DEBATE_DIR}
You can export this debate to PPTX/DOCX/PDF using the document-delivery skill.
Step 7.5: Generate Deliverable (when --synthesize is set)
If the user passed --synthesize (or -s), generate a concrete deliverable after synthesis:
- Read the synthesis.md file
- Identify the consensus recommendations and action items
- Generate ONE of the following based on context:
- For code topics: A plan with file paths and proposed changes
- For content topics: A draft document (e.g., rewritten README, PRD outline)
- For architecture topics: A decision record with rationale
- Save to
${DEBATE_DIR}/deliverable.md - Show the deliverable to the user with AskUserQuestion:
- "Apply this" — proceed with implementation
- "Refine" — adjust the deliverable
- "Save only" — keep it as reference, don't act
IMPORTANT: The deliverable is a PROPOSAL. Never auto-apply changes without user approval.
Example Usage
Example 1: Quick Debate
User: /debate Should we use Redis or in-memory cache?
Claude:
1. Creates debate folder at ~/.claude-octopus/debates/${SESSION_ID}/042-redis-vs-memcached/
2. Writes context.md with question
3. Round 1:
- Launches Sonnet via Agent(model: sonnet, background execution: true) — pragmatic implementer
- Calls orchestrate.sh spawn for each runtime advisor selected by build-fleet.sh, such as codex and agy when Gemini is not installed
- Waits for Sonnet completion
- Writes own analysis (Opus) considering all advisor perspectives
4. Writes synthesis.md with final recommendation from all participants
5. Presents results in chat
Example 2: Thorough Adversarial Debate
User: /debate -r 3 -d adversarial Review our authentication implementation in src/auth.ts
Claude:
1. Reads src/auth.ts to understand context
2. Creates debate folder
3. Round 1 (Sonnet launched in background first, then selected external advisors in parallel):
- 🟠 Sonnet: Implementation feasibility analysis of auth.ts
- External advisors selected by build-fleet.sh, such as 🔴 Codex, 🧭 Antigravity, or 🟡 Gemini depending on availability
- 🐙 current host model: Your independent analysis considering all advisors
4. Round 2:
- 🟠 Sonnet: Responds to other participants' points
- External advisors challenge each other's points
- 🐙 Claude: You challenge advisor points
5. Round 3:
- All participants: Final positions
6. Synthesis with quality scores for each advisor
7. Present results with cost tracking
Quality Checklist
Before completing a debate, ensure:
- All rounds completed for selected participants
- Your independent analysis (Opus) written for each round (not just summaries)
- Synthesis.md includes all participating perspectives
- Quality scores recorded for advisor responses
- Cost tracking updated (if in claude-octopus context)
- Results presented to user in chat
- Debate folder path provided to user
Integration with Other Skills
Document Delivery
Export debates to professional formats:
After debate completes:
"Would you like to export this debate to PPTX/DOCX/PDF? I can use the document-delivery skill to create a professional presentation."
Knowledge Mode
Debates can be used in knowledge mode workflows:
Knowledge mode "deliberate" phase → Run /debate to get multiple perspectives
→ Use synthesis for final decision
Quality Gates for Responses
Each advisor response is scored before proceeding:
| Metric | Weight | Criteria |
|---|---|---|
| Length | 25 pts | 50-1000 words (substantive but concise) |
| Citations | 25 pts | References, links, or sources present |
| Code Examples | 25 pts | Technical examples or code snippets |
| Engagement | 25 pts | Addresses other advisors' specific points |
Score >= 75: proceed. Score 50-74: proceed with warning. Score < 50: re-prompt for elaboration.
Cost Tracking
Typical costs (default word limits):
- Quick (1 round): $0.02 - $0.05
- Thorough (3 rounds): $0.10 - $0.20
- Adversarial (5 rounds): $0.25 - $0.50
Cost tracking integrates with ~/.claude-octopus/analytics/ logs.
Export
After debate completes, export results via document-delivery skill:
- PPTX: stakeholder presentations from synthesis
- DOCX: detailed documentation from full transcript
- PDF: archival with metadata (topic, participants, cost)
Attribution
- Original Skill: AI Debate Hub by wolverin0
- Version: v4.8
- Repository: https://github.com/wolverin0/claude-skills
- License: MIT
- Enhancements: Claude-Octopus integration (session-aware storage, quality gates, cost tracking, document export, provider debate with Sonnet)
Ready to debate! Users can invoke with /debate <question> or natural language.