adversarial-review

name: adversarial_review
description: Decide when and how to invoke Forge (GPT-5) or Mycelium (Gemini 2.5) for adversarial technical review and system consistency checks. Use when the user says "/debate", asks for "adversarial review", "critical feedback", or "architectural validation", or when there is a high-stakes technical decision to make before building.

Adversarial Review (Debate Invocation)

Purpose: Guide Mentat's decisions on when and how to invoke Forge (GPT-5) or Mycelium (Gemini 2.5) for technical review, adversarial feedback, and system-wide consistency checks.

Scope: Technical review workflows, architectural validation, and collaborative decision-making with external AI agents.

Reference: .claude/commands/debate.md - Always follow the debate command protocol.

Forge (GPT-5) Invocation Protocol

Mentat invokes Forge (via /debate or directly) according to the tiers below:

Mandatory Invocation (High-Stakes) ✅

Always invoke Forge for:

✅ Finalizing PRDs for projects >$5K
✅ Complex architectural decisions (microservices, distributed systems, etc.)
✅ Changes affecting critical performance or scalability
✅ Complex algorithm design or heavy mathematical calculations
✅ Security decisions (auth, encryption, data handling)
✅ Database schema changes affecting multiple systems
✅ Breaking changes to public APIs
✅ Production-critical bugs (P0)
✅ Data corruption or security vulnerabilities

Rationale: High-stakes decisions require adversarial review to catch risks Mentat might miss.

Recommended Invocation (Complex Decisions) 🟡

Strongly consider invoking Forge for:

🟡 New architectures Mentat hasn't implemented before
🟡 Technical trade-offs with multiple viable options
🟡 Complex debugging requiring deep analysis
🟡 Critical code optimization (hot paths, N+1 queries, etc.)
🟡 Implementation plans >2 weeks
🟡 Refactoring large codebases (>1000 LOC)
🟡 Performance-critical features (real-time, high-throughput)
🟡 Payment flows and financial transactions
🟡 Real-time features (WebSockets, polling, SSE)

Rationale: Complex decisions benefit from Forge's execution-focused perspective.

Optional Invocation (Nice to Have) 🔵

Consider invoking Forge for:

🔵 Code review before merge to production
🔵 Second opinion when Mentat is unsure
🔵 Exploration of technical alternatives
🔵 Validation of implementation approach
🔵 Review of test coverage strategy

Rationale: Optional but valuable for quality assurance.

Do NOT Invoke (Waste of Time) ❌

Never invoke Forge for:

❌ Simple decisions where industry consensus already exists
❌ Typos, formatting, simple documentation
❌ Early drafts (invoke when document is >70% complete)
❌ Business strategy without substantial technical component
❌ Routine CRUD operations
❌ Simple bug fixes (<1 hour work)
❌ Copy-paste implementations from well-documented libraries
❌ Aesthetic preferences (color palette, font choices)
❌ Easily reversible decisions

Rationale: Forge's time is valuable; use it for decisions that matter.
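The four tiers above can be sketched as a small classifier. This is a minimal illustration, not part of the debate protocol: the `Decision` fields and the `forge_tier` function are hypothetical names, only the criteria with numeric thresholds or clear yes/no answers are modeled, and the precedence (mandatory triggers win over skip conditions) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    MANDATORY = "mandatory"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"
    SKIP = "skip"


@dataclass
class Decision:
    """Hypothetical summary of a decision; every field is an illustrative assumption."""
    prd_budget_usd: float = 0.0        # project budget behind the PRD
    touches_security: bool = False     # auth, encryption, data handling
    breaks_public_api: bool = False
    is_p0_bug: bool = False
    plan_weeks: float = 0.0            # length of the implementation plan
    refactor_loc: int = 0              # lines of code being refactored
    fix_hours: Optional[float] = None  # estimated effort for a bug fix
    easily_reversible: bool = False


def forge_tier(d: Decision) -> Tier:
    # Mandatory: PRDs for projects >$5K, security, breaking API changes, P0 bugs.
    if (d.prd_budget_usd > 5_000 or d.touches_security
            or d.breaks_public_api or d.is_p0_bug):
        return Tier.MANDATORY
    # Do not invoke: easily reversible decisions and bug fixes under one hour.
    if d.easily_reversible or (d.fix_hours is not None and d.fix_hours < 1):
        return Tier.SKIP
    # Recommended: plans longer than 2 weeks, refactors over 1000 LOC.
    if d.plan_weeks > 2 or d.refactor_loc > 1000:
        return Tier.RECOMMENDED
    return Tier.OPTIONAL


print(forge_tier(Decision(prd_budget_usd=8_000)))  # Tier.MANDATORY
print(forge_tier(Decision(fix_hours=0.5)))         # Tier.SKIP
```

Criteria that need judgment, such as "complex architectural decisions", are deliberately left out of the sketch; Mentat still has to assess those itself.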

Mycelium (Gemini 2.5) Invocation Protocol

Mentat should invoke Mycelium (via /debate with the mycelium agent) in the cases below:

When to Invoke Mycelium 🍄

Best for:

🍄 System Architecture & Integration - cross-project consistency
🍄 Workflow Design - holistic system analysis
🍄 Cross-project dependencies and integration points
🍄 System-wide consistency checks

Rationale: Mycelium focuses on system architecture, integration and cross-project consistency.

When NOT to Invoke Mycelium ❌

❌ Code review (use Forge)
❌ Implementation details (use Forge)
❌ Performance optimization (use Forge)
❌ Simple features without system impact

Invocation Methods

Via `/debate` Command (Preferred)

# Default (Forge) - for execution/technical review
/debate path/to/document.md

# Explicit Forge invocation
/debate path/to/document.md forge

# Mycelium - for PRD or System Architecture review
/debate path/to/document.md mycelium

# Code Review mode (LLM-as-Judge) - for agent output review
/debate --code-review SKY-123       # review changes from a Linear issue
/debate --code-review HEAD~3        # review last 3 commits
/debate --code-review path/to/file  # review specific file

Process (document mode):

Mentat prepares document for review (must be >70% complete)
Determines target agent (Forge or Mycelium) based on document type
Invokes agent via /debate command
Agent generates structured feedback following feedback template
Mentat incorporates feedback into document
Mentat deletes temporary feedback file
Mentat presents refined document to user

Process (code-review mode):

Mentat collects diff (from Linear issue, commits, or file)
Mentat collects spec/AC from Linear issue (if available)
Invokes Forge with diff + spec context
Forge reviews: security, quality, spec adherence, architecture
Mentat presents findings and recommends fix or approve
If issues found → agent re-delegated with improved spec (Rewind > Fix)

Use for:

Forge: PRDs, architecture docs, implementation plans, technical proposals, code review
Mycelium: PRDs, system architecture docs, workflow design, integration planning

Via Direct Script Invocation

# Forge
python3 scripts/invoke_forge.py "path/to/document.md" "PRD"

# Mycelium
python3 scripts/invoke_mycelium.py "path/to/document.md" "PRD"

Use for: Custom scenarios or when /debate command is not available
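For custom scenarios, the direct invocation can be wrapped in a small helper. The sketch below assumes only what the commands above show: each script takes a document path followed by a second argument such as "PRD" (treated here as a document type, which is an assumption). The names `SCRIPTS`, `build_command`, and `invoke` are hypothetical.

```python
import subprocess
from typing import List

# Script paths as shown in the direct-invocation commands.
SCRIPTS = {
    "forge": "scripts/invoke_forge.py",
    "mycelium": "scripts/invoke_mycelium.py",
}


def build_command(agent: str, document: str, doc_type: str) -> List[str]:
    """Return the argument list for a direct script invocation."""
    try:
        script = SCRIPTS[agent.lower()]
    except KeyError:
        raise ValueError(f"unknown agent: {agent!r}") from None
    # Passing a list (not a shell string) avoids quoting problems in paths.
    return ["python3", script, document, doc_type]


def invoke(agent: str, document: str, doc_type: str) -> subprocess.CompletedProcess:
    """Run the script and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_command(agent, document, doc_type), check=True)


print(build_command("forge", "path/to/document.md", "PRD"))
```

Separating `build_command` from `invoke` keeps the command construction easy to inspect without actually calling an external agent.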

Expected Behavior

Step 1: Mentat Prepares Document

Document requirements:

Must be >70% complete
Clear problem statement
Proposed solution with rationale
Trade-offs identified
Options considered (at least 2, ideally 3, for Forge)
Tentative recommendation from Mentat

Document types:

PRDs (Product Requirements Documents)
Architecture documents
Implementation plans
Technical proposals
Code files (for review)
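The document requirements in Step 1 amount to a pre-flight checklist. One possible way to express it is sketched below; the `ReviewDocument` fields and `missing_requirements` are hypothetical names, and the completeness value is assumed to be Mentat's own estimate on a 0.0 to 1.0 scale.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewDocument:
    """Hypothetical view of a document about to be sent for review."""
    completeness: float                 # Mentat's estimate, 0.0 to 1.0
    problem_statement: str = ""
    proposed_solution: str = ""
    trade_offs: List[str] = field(default_factory=list)
    options: List[str] = field(default_factory=list)
    recommendation: str = ""


def missing_requirements(doc: ReviewDocument, min_options: int = 2) -> List[str]:
    """Return the unmet requirements; an empty list means ready for review."""
    missing = []
    if doc.completeness <= 0.70:
        missing.append("document must be >70% complete")
    if not doc.problem_statement.strip():
        missing.append("clear problem statement")
    if not doc.proposed_solution.strip():
        missing.append("proposed solution with rationale")
    if not doc.trade_offs:
        missing.append("trade-offs identified")
    if len(doc.options) < min_options:
        missing.append(f"at least {min_options} options considered")
    if not doc.recommendation.strip():
        missing.append("tentative recommendation")
    return missing


draft = ReviewDocument(completeness=0.5, problem_statement="Checkout is slow")
for item in missing_requirements(draft):
    print("missing:", item)
```

Returning the list of gaps, rather than a bare boolean, tells Mentat what to fix before invoking an agent.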

Step 2: Mentat Invokes Agent

For Forge:

Provides clear context
Specifies what needs review
Highlights areas of concern
Focuses on execution, code quality, performance, risks

For Mycelium:

Provides system context
Highlights integration points
Focuses on consistency and completeness
Emphasizes cross-project considerations

Step 3: Agent Generates Feedback

Forge feedback structure:

Executive summary
Strengths
Concerns (high/medium/low priority)
Alternative approaches with pros/cons
Specific recommendations
Technical notes

Mycelium feedback structure:

System-wide consistency check
Integration points analysis
PRD completeness assessment
Cross-project dependencies
Workflow design validation

Step 4: Mentat Incorporates Feedback

Updates original document
Addresses high-priority concerns first
Documents decisions made
Maintains original document style
Preserves valuable insights
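Addressing high-priority concerns first can be sketched as a stable sort over the agent's concerns. The `Concern` type and `triage` function are hypothetical; the real feedback template may carry more fields than a priority and a summary.

```python
from dataclasses import dataclass
from typing import List

# Lower rank is handled first; labels follow the high/medium/low scale
# used in the Forge feedback structure.
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Concern:
    priority: str   # "high", "medium", or "low"
    summary: str


def triage(concerns: List[Concern]) -> List[Concern]:
    """Order concerns high to low; sorted() is stable, so ties keep their order."""
    return sorted(concerns, key=lambda c: PRIORITY_RANK[c.priority])


feedback = [
    Concern("low", "Rename helper for clarity"),
    Concern("high", "Retry logic can double-charge"),
    Concern("medium", "Missing index on lookup column"),
]
for concern in triage(feedback):
    print(f"[{concern.priority}] {concern.summary}")
```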

Step 5: Mentat Presents Refined Document

Shows changes made
Explains rationale for decisions
Highlights remaining open questions
Summarizes debate/review results

Decision Framework

When to Invoke Which Agent

Decision tree:

What type of document?
- PRD or System Architecture → Consider Mycelium
- Implementation Plan or Code → Consider Forge
What's the primary concern?
- System consistency, integration → Mycelium
- Execution, performance, risks → Forge
Is this high-stakes AND system-wide?
- Yes → Invoke both (Forge first, then Mycelium)
- No → Continue evaluation
Is this high-stakes?
- Yes → Mandatory invocation (usually Forge)
- No → Continue evaluation
Is this complex?
- Yes → Recommended invocation
- No → Optional invocation or skip
Is document >70% complete?
- No → Wait, don't invoke yet
- Yes → Proceed
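The decision tree above can be condensed into one function. This is a sketch under stated assumptions: `choose_agents` and its parameters are hypothetical, the completeness check is moved to the front (the outcome is the same as checking it last), and a high-stakes decision that is not system-wide goes to the agent matching the primary concern, whereas the tree only says "usually Forge".

```python
from typing import List, Tuple


def choose_agents(primary_concern: str, high_stakes: bool, system_wide: bool,
                  is_complex: bool, completeness: float) -> Tuple[str, List[str]]:
    """Return (invocation level, agents in invocation order).

    primary_concern is "system" (consistency, integration) or
    "execution" (execution, performance, risks).
    """
    # Documents at or below 70% completeness are not reviewed yet.
    if completeness <= 0.70:
        return ("wait", [])
    # High-stakes AND system-wide: both agents, Forge first.
    if high_stakes and system_wide:
        return ("mandatory", ["forge", "mycelium"])
    agent = "mycelium" if primary_concern == "system" else "forge"
    if high_stakes:
        return ("mandatory", [agent])
    if is_complex:
        return ("recommended", [agent])
    return ("optional", [agent])


# A payment architecture that affects several projects.
print(choose_agents("execution", True, True, True, 0.9))
# A cross-system workflow design without high-stakes execution risk.
print(choose_agents("system", False, False, True, 0.8))
```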

When to Invoke Both Agents

Invoke both Forge and Mycelium when:

✅ High-stakes system architecture (e.g., payment systems, distributed systems)
✅ Architecture decisions affecting multiple projects
✅ Complex integrations requiring both execution review and system consistency
✅ Critical workflows with cross-project dependencies

Process:

Invoke Forge first (execution and risk focus)
Incorporate Forge feedback
Invoke Mycelium second (system consistency and integration)
Incorporate Mycelium feedback
Present refined document with both perspectives addressed

When to Invoke Forge vs Mycelium

Scenario	Agent	Reason
Architecture design	Forge	Execution and risk focus
Cross-project integration	Mycelium	Holistic system view
Implementation plan	Forge	Execution quality
Workflow design	Mycelium	System consistency
Code review	Forge	Code quality and performance
Performance optimization	Forge	Execution expertise

Examples

Example 1: Dual Invocation (Forge + Mycelium)

User: "I need to design the architecture of a distributed system for processing payments"
→ Mentat identifies: 
  - Complex architectural decision (high-stakes, payment system) → Forge needed
  - System architecture with cross-project impact → Mycelium needed
→ Prepares architecture document (>70% complete)
→ Invokes Forge first: /debate docs/projects/payment/ARCH.md forge
  → Forge reviews: Execution risks, performance, security, scalability
  → Provides feedback on: Payment flow risks, database design, API contracts
→ Mentat incorporates Forge feedback
→ Invokes Mycelium second: /debate docs/projects/payment/ARCH.md mycelium
  → Mycelium reviews: System integration, cross-project consistency, workflow design
  → Provides feedback on: Integration points with existing systems, consistency patterns
→ Mentat incorporates Mycelium feedback
→ Presents refined architecture to user with both perspectives addressed

Rationale: High-stakes payment system requires both execution review (Forge) and system-wide consistency (Mycelium).

Example 2: Mycelium Invocation (System Integration)

User: "I need to design the integration workflow between the e-commerce system and the ERP"
→ Mentat identifies: 
  - Workflow design with cross-system integration → Mycelium needed
  - System-wide consistency check required
  - Not high-stakes execution risk → Forge not needed
→ Prepares workflow design document (>70% complete)
→ Invokes Mycelium via /debate docs/projects/integration/WORKFLOW.md mycelium
→ Mycelium reviews workflow for:
  - Cross-project consistency (e-commerce ↔ ERP patterns)
  - Integration points and data flow
  - System-wide impact and dependencies
  - Workflow design alignment with existing patterns
→ Provides feedback on:
  - Integration patterns consistency
  - Data synchronization strategy
  - Error handling across systems
  - Workflow orchestration approach
→ Mentat incorporates feedback
→ Presents refined workflow design to user

Rationale: Workflow design focuses on system integration and consistency, which is Mycelium's strength.

Example 3: Forge Invocation (Code Review)

User: "Review this code before merging to production"
→ Mentat identifies: Code review, production merge
→ Invokes Forge via /debate src/api/endpoints.py forge
→ Forge reviews code quality, performance, security
→ Provides feedback on optimizations, edge cases, risks
→ Mentat incorporates feedback
→ Presents refined code to user

Example 4: LLM-as-Judge (Post-Agent Code Review)

Pixel completes SKY-47: WhatsAppButton component
→ Mentat validates acceptance criteria: 3/3 met
→ Feature is critical (customer-facing) → triggers code review
→ Invokes: /debate --code-review SKY-47
→ Mentat collects:
  - Git diff from Pixel's branch
  - Acceptance criteria from SKY-47 description
  - Relevant contracts (DESIGN_CONTRACT, TEST_ID_CONTRACT)
→ Forge reviews:
  - Security: No XSS vectors, no hardcoded secrets ✅
  - Quality: No duplicated logic, clean component structure ✅
  - Spec adherence: All AC met, test IDs present ✅
  - Architecture: Follows existing component patterns ✅
→ Forge verdict: APPROVE with minor suggestions
→ Mentat marks SKY-47 Done, proceeds to Sentinela

When to trigger code-review mode:

Features touching auth, payments, or user data → mandatory
Agent output with >500 LOC changed → recommended
Standard UI/CRUD → optional (Sentinela E2E is sufficient)
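The three trigger rules above map to a short check. The sketch assumes Mentat can label the areas a change touches and count the changed lines; `SENSITIVE_AREAS`, the area labels, and `code_review_level` are hypothetical names.

```python
from typing import Set

# Hypothetical labels for the areas named in the trigger rules.
SENSITIVE_AREAS = {"auth", "payments", "user_data"}


def code_review_level(touched_areas: Set[str], loc_changed: int) -> str:
    """Map a change to 'mandatory', 'recommended', or 'optional' code review."""
    # Any overlap with auth, payments, or user data makes review mandatory.
    if touched_areas & SENSITIVE_AREAS:
        return "mandatory"
    # Large agent output (>500 LOC changed) makes review recommended.
    if loc_changed > 500:
        return "recommended"
    # Standard UI/CRUD: optional.
    return "optional"


print(code_review_level({"payments", "ui"}, 40))   # mandatory
print(code_review_level({"ui"}, 800))              # recommended
print(code_review_level({"ui"}, 120))              # optional
```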

Anti-patterns

❌ Invoking for incomplete documents - Document must be >70% complete
❌ Invoking for simple decisions - Use agents' time wisely
❌ Skipping invocation for high-stakes decisions - Always get adversarial review
❌ Invoking wrong agent - Forge for execution, Mycelium for system consistency
❌ Not preparing document properly - Must include context, options, trade-offs
❌ Ignoring feedback - Always incorporate high-priority concerns
❌ Invoking too early - Wait until document is substantial enough for meaningful review

Key Principles

Use Forge for execution - Code quality, performance, risks, implementation
Use Mycelium for systems - PRD completeness, integration, consistency
Prepare documents properly - >70% complete, clear options, trade-offs
Incorporate feedback - Address high-priority concerns, document decisions
Respect agent time - Only invoke for decisions that matter
Follow debate protocol - Use /debate command, clean up temp files
Document decisions - Always explain why invocation was made and what changed

Ready. Use this skill when deciding whether to invoke Forge or Mycelium for technical review. Always follow the decision framework and respect the debate command protocol.