name: deep-research description: "Multi-agent iterative research pipeline using Eric Jang methodology. 10-18 specialized agents across 3 cycles produce consensus references, interaction maps, personalized syntheses, and action plans with domain-specific stress-testing, iterative deepening, and anti-pattern guards." version: "4.2" license: MIT
Deep Research Skill
A structured multi-agent research pipeline that turns any topic into a comprehensive, fact-checked, bilingual (EN+RU) synthesis with actionable recommendations. Built on Eric Jang's iterative methodology from "As Rocks May Think".
How It Works
The skill orchestrates 10-19 specialized AI agents across 3 mandatory cycles:
Cycle 1: Broad Search → 4-5 parallel SCOUTs explore the landscape
Quality Gates → CRITIC + METHODOLOGIST cross-check findings
Reflection 1 → Identify gaps, generate competing hypotheses
Cycle 2: Deep Dives → 2-3 targeted agents test hypotheses + stress-test questions
Iterative Deep. → Auto-resolve CONTESTED claims (WEAK → follow-up DD)
Reflection 2 → Convergence analysis, hypothesis verdicts
Cycle 3: Execute → Python scripts (analysis, models, visualizations)
Synthesize → SYNTHESIZER creates integrated document
Verify → FACT-CHECKER + CITATION_VERIFIER + DOMAIN_REVIEWER
Apply → ACTION MAPPER updates user's protocols/goals
Research Modes
| Mode | Output | When to use |
|---|---|---|
| personalized (default) | synthesis.md |
Specific question for your context |
| consensus | consensus_reference.md |
Building knowledge base (population-level truth) |
| consensus+interactions | consensus + interaction_map.md |
Cross-effects matter |
| full | All three documents | Deep investigation |
Agent Roles
| Role | Count | Purpose |
|---|---|---|
| SCOUT | 4-5 | Broad literature search, each with unique reasoning style |
| CRITIC | 1 | Cross-stream contradictions, bias audit, weak evidence |
| METHODOLOGIST | 1 | Methodological quality — domain-specific evidence hierarchy (GRADE for health, forecast audit for macro, reproducibility for science) |
| DEEP DIVER | 2-3+ | Hypothesis testing + domain stress-test questions. Auto-spawns on CONTESTED claims (iterative deepening) |
| SYNTHESIZER | 1 | Integration across all sources into coherent document |
| INTERACTION MAPPER | 1 | Cross-domain interactions that change recommendations |
| DOMAIN_REVIEWER | 1 | Domain-specific review: MEDICAL (health), MACRO (markets), MARKET (company), METHODOLOGY (science) |
| FACT-CHECKER | 1 | Top-15 numerical claims verification |
| CITATION_VERIFIER | script | Python API check against Semantic Scholar/PubMed/CrossRef |
| TEMPORAL DIFF | 0-1 | Compares new consensus with previous version (UPDATE mode only) |
| ACTION MAPPER | 1 | Converts findings into TODO blocks in user's files |
v4.1 Features
Domain Stress-Test Questions
Each Deep Diver must answer ≥2 mandatory adversarial questions from their domain adapter. Examples:
- Health: "Under what conditions does this intervention become harmful for MY profile?"
- Macro: "What single event makes this consensus irrelevant within 18 months?"
- Company: "What does every successful player understand that customers never say out loud?"
- Science: "What replication failure would collapse this finding?"
Iterative Deepening on CONTESTED Claims
When Cycle 2 convergence finds CONTESTED claims (confidence <0.5), the pipeline auto-spawns targeted Deep Divers to find tiebreaker evidence. Max 2 rounds, 15% budget cap.
Anti-Pattern Guards
Each domain adapter includes a "Common Anti-Patterns" table — pre-flight checklist of domain-specific mistakes (e.g., "citing mouse studies as human evidence" for health, "extrapolating 3-year trend as permanent" for macro).
Company Domain Enhancements
- Raw language preservation: SCOUTs preserve verbatim customer/user quotes with tags
- Opportunity taxonomy: Action Mapper classifies opportunities as CONTRARIAN / TIMING_PLAY / SAFE_BET
v4.2 Features
Temporal Diff (UPDATE mode)
When re-researching a topic that already has a consensus reference, the pipeline auto-detects the previous research and offers UPDATE mode. In UPDATE mode:
- Full research runs identically to fresh mode (SCOUTs do NOT see previous consensus — no anchoring bias)
- TEMPORAL DIFF agent runs after SYNTHESIZER, comparing old and new consensus claim-by-claim
- Produces
_temporal_diff.mdwith 5 categories: CONFIRMED / REVISED / CONTRADICTED / OBSOLETE / NEW - Adds a
## Temporal Diff Summaryto the new consensus_reference.md - Archives previous version as
_previous_consensus_[YYYY_MM].md - Stability score: CONFIRMED / total — tells you how much the field moved
CLI flags:
--update— force UPDATE mode (auto-detect previous consensus)--fresh— force FRESH mode (ignore previous consensus)- No flag — auto-detect and ask user
Quick Start
Copy the template:
cp .claude/commands/research/context_template.md .claude/commands/research/context.mdFill in your data in
context.md:- File paths to your profiles, lab results, protocols
- Key biomarkers (if health/nutrition research)
- Genetic variants (if relevant)
Run:
/research creatine safety for athletes # → health domain (auto) /research global electricity demand consensus high 6h # → macro domain (auto) /research AI agents in financial services consensus high 4h # → company domain (auto) /research transformer architectures consensus high 4h # → science domain (auto)
File Structure
.claude/commands/research/
├── SKILL.md ← You are here (overview + quick start)
├── README.md ← Brief quick start
├── context_template.md ← Copy to context.md, fill in your data
├── context.md ← YOUR private config (gitignored)
│
├── research.md ← Main entry point (pipeline + architecture)
├── agents.md ← Detailed agent role descriptions
├── prompts.md ← Full prompt templates for each agent
├── cycle1.md ← Cycle 1 instructions (SCOUTs + quality gates)
├── cycle2.md ← Cycle 2 instructions (Deep Dives + convergence)
├── cycle3.md ← Cycle 3 instructions (Synthesis + verification)
├── finalize.md ← Finalization checklist + git/notification
│
├── domains/
│ ├── health.md ← METHODOLOGIST + MEDICAL_REVIEWER + health consensus template
│ ├── macro.md ← METHODOLOGIST + MACRO_REVIEWER + macro consensus template
│ ├── company.md ← METHODOLOGIST + MARKET_REVIEWER + company consensus template
│ └── science.md ← METHODOLOGIST + METHODOLOGY_REVIEWER + science template
│
└── examples/
├── example_output_tree.txt
└── example_synthesis_excerpt.md
Output Structure
A completed research produces:
YYYY_MM_topic_name/
├── _PROGRESS_LOG.md # Log + 2 reflections + hypotheses
├── synthesis.md # Final synthesis EN (10 sections)
├── synthesis_ru.md # Final synthesis RU (full translation)
├── consensus_reference.md # Population-level truth (if consensus/full)
├── interaction_map.md # Cross-domain interactions (if full)
├── unknowns_and_next.md # Known unknowns + next experiments
├── _critic_review.md # CRITIC output
├── _methods_review.md # METHODOLOGIST output (domain-specific)
├── _domain_review.md # Domain reviewer output (medical/macro/market/methodology)
├── _fact_check.md # FACT-CHECKER output
├── _citation_audit.md # CITATION_VERIFIER output
├── _temporal_diff.md # TEMPORAL DIFF output (UPDATE mode only)
├── _previous_consensus_*.md # Archived previous version (UPDATE mode only)
├── _action_map.md # ACTION MAPPER output
├── stream_a_*.md ... stream_e_*.md # Cycle 1 streams
├── deep_dive_a_*.md ... # Cycle 2 deep dives
├── data/*.csv # Structured data (snake_case)
├── figures/*.png # Visualizations (300 DPI)
└── scripts/*.py # Analysis scripts
Configuration (context.md)
| Section | Purpose |
|---|---|
| User Files | Paths to your profiles, lab results, genetics |
| Key Biomarkers | Latest lab values (health/nutrition research) |
| Genetics | Relevant SNPs and clinical implications |
| Domains | Default stream configurations per domain |
| Action Mapper | Per-domain output targets (protocols, research_queue, blog) |
| Finalization | Git commit + notification settings |
Dependencies
Required:
- Claude Code CLI (
claude)
Optional (for Cycle 3 scripts):
- Python 3.10+ with
matplotlib,pandas,numpy markdownlibrary (fortools/view_research.pystyled viewer)
Optional (for notifications):
- Telegram bot token (for
tools/notify_research.py) tools/finalize_research.shautomates: git commit + push + notify + viewer
Known Caveats
- WebSearch often unavailable in subagents — agents synthesize from training data and mark limitations
- Context window limits — use
/compactduring long research; all intermediate results saved to disk - Agents may not write files — always verify with
lsafter each agent completes - CSV column names vary — Python scripts must include column normalization
- PEP 668 on macOS — use venv for pip install:
python3 -m venv scripts/.venv
Language
All instructions and prompts are in English. The synthesis output is always in English (synthesis.md). If you set preferred_language in your context.md, the pipeline also produces a full translation in your language (e.g., synthesis_ru.md).