name: deep-research
description: "Multi-tiered deep research methodology — from raw data gathering through evaluation to synthesis. Use when conducting comprehensive research on any topic, building evidence-based arguments, or investigating complex questions."
Deep Research Skill
Overview
A structured multi-tier research pipeline that transforms raw information into evaluated, synthesized knowledge. Each tier has explicit quality gates that must be satisfied before proceeding. Every session explores five mandatory research dimensions and produces multiple structured output documents.
When to Use This Skill
- Researching a new topic comprehensively (not just quick lookups)
- Building evidence-based arguments or positions
- Investigating complex, multi-faceted questions
- Preparing content that requires authoritative sourcing
- Conducting due diligence on technologies, methods, or ideas
Mandatory Research Dimensions
Every research session MUST explore all five dimensions regardless of topic:
| # | Dimension | What to Explore |
|---|---|---|
| 1 | Field Landscape | What exists around this field? Key terms, subfields, adjacent areas, topology of the domain |
| 2 | Key People & Publishers | Who is publishing? What are they saying? Key voices, institutions, conferences, journals |
| 3 | Alternative Approaches | Competing methods, frameworks, tools — how do they compare? What are the trade-offs? |
| 4 | Best Practices | What is considered best practice? Consensus positions, standards, established guidelines |
| 5 | Critical Essentials | What is the most important thing to know? Non-negotiables, foundational concepts, common pitfalls |
The Five Tiers
Tier 1: GATHER — Cast the Net
Objective: Collect a broad, diverse set of raw sources covering all five mandatory dimensions.
Method:
- Formulate 15-20 search queries covering all five dimensions:
  - Field Landscape queries (3-4): overview, subfields, adjacent areas, taxonomy
  - Key People queries (2-3): researchers, authors, institutions, conferences
  - Alternative Approaches queries (3-4): competing methods, comparisons, trade-offs
  - Best Practices queries (2-3): standards, guidelines, established methods
  - Critical Essentials queries (2-3): fundamentals, common mistakes, prerequisites
  - Cross-dimension queries (3-4): synthesis, debates, state-of-the-art
- Check internal knowledge — search Zettelkasten for existing notes
- Check curated bookmarks — search Raindrop.io for previously saved sources
- Academic source discovery — target arXiv, Google Scholar, and academic databases; flag PDFs for Tier 3b
- Execute searches using Tavily (web search, crawl, extract) and Brave
- Record everything in the Research Log with dimension tags and quality ratings
Search Strategy Template:
# Field Landscape (dimension 1)
Query 1: "[topic] overview landscape" → Broad field survey
Query 2: "[topic] taxonomy classification" → Field structure
Query 3: "[topic] related fields adjacent" → Adjacent areas
Query 4: "[topic] history evolution" → How the field developed
# Key People & Publishers (dimension 2)
Query 5: "[topic] leading researchers authors" → Key voices
Query 6: "[topic] conference journal publications" → Academic venues
Query 7: "[topic] influential papers cited" → Seminal works
# Alternative Approaches (dimension 3)
Query 8: "[topic] vs [alternative]" → Head-to-head comparison
Query 9: "[topic] alternative methods approaches" → Competing frameworks
Query 10: "[topic] comparison trade-offs" → Trade-off analysis
# Best Practices (dimension 4)
Query 11: "[topic] best practices guidelines" → Consensus methods
Query 12: "[topic] standards recommendations" → Official guidance
# Critical Essentials (dimension 5)
Query 13: "[topic] fundamentals essentials" → Core knowledge
Query 14: "[topic] common mistakes pitfalls" → What not to do
Query 15: "[topic] prerequisites requirements" → Foundations
# Cross-dimension
Query 16: "[topic] state of the art 2025" → Current frontier
Query 17: "[topic] debate controversy" → Open disputes
Query 18: "[topic] research evidence" → Academic angle
Query 19: "[topic] critique limitations" → Counter-arguments
Query 20: "[topic] case study" → Real-world evidence
Academic Source Discovery:
Query A1: "arxiv [topic] survey" → arXiv survey papers
Query A2: "[topic] systematic review meta-analysis" → Systematic reviews
Query A3: "[key_author] [topic]" → Track key researchers
For each academic source found: note PDF URL, DOI, format (arXiv/journal/preprint).
Curated Bookmarks Strategy (Raindrop.io):
Search 1: search_bookmarks_by_text("[topic]") → Direct matches
Search 2: search_bookmarks_by_tags(["<domain-tag>"]) → Tag-based discovery
Search 3: search_bookmarks(query, collection_id=<id>) → Collection-scoped
Bookmark hits are first-class sources — they represent previously curated, human-vetted material.
Sources tagged aic-processed or with ZK links in excerpts have already been partially evaluated.
Quality Gate: Minimum 15 unique sources from at least 3 source categories. All 5 dimensions searched. At least 3 dimensions have sources found.
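The Tier 1 gate is mechanical enough to check in code. The following is a minimal sketch, not the skill's actual implementation: the `Source` record, the dimension slugs, and the function name `tier1_gate` are all hypothetical, and uniqueness is approximated by URL.

```python
from dataclasses import dataclass

# Hypothetical slugs for the five mandatory dimensions.
DIMENSIONS = (
    "field-landscape",
    "key-people",
    "alternative-approaches",
    "best-practices",
    "critical-essentials",
)


@dataclass(frozen=True)
class Source:
    url: str
    category: str   # assumed labels, e.g. "web", "bookmark", "academic"
    dimension: str  # one of DIMENSIONS


def tier1_gate(sources, searched_dimensions):
    """Return (passed, reasons) for the Tier 1 quality gate."""
    # Assumption: two records with the same URL are the same source.
    unique = list({s.url: s for s in sources}.values())
    reasons = []
    if len(unique) < 15:
        reasons.append(f"only {len(unique)} unique sources (need 15)")
    categories = {s.category for s in unique}
    if len(categories) < 3:
        reasons.append(f"only {len(categories)} source categories (need 3)")
    missing = set(DIMENSIONS) - set(searched_dimensions)
    if missing:
        reasons.append(f"dimensions not searched: {sorted(missing)}")
    covered = {s.dimension for s in unique} & set(DIMENSIONS)
    if len(covered) < 3:
        reasons.append(f"only {len(covered)} dimensions have sources (need 3)")
    return (not reasons, reasons)
```

Returning the list of reasons alongside the boolean gives the orchestrator something concrete to pass back as refined guidance when the gate fails.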
Tier 2: PROCESS — Triage and Classify
Objective: Filter, deduplicate, and classify raw sources.
Source Quality Matrix:
| Tier | Type | Examples | Reliability |
|---|---|---|---|
| 1 | Peer-reviewed research | Journals, meta-analyses | ✅ Highest |
| 2 | Official documentation | Standards bodies, official docs | ✅ High |
| 3 | Expert analysis | Conference proceedings, preprints, books | ⚠️ Good |
| 4 | Industry content | Reports, white papers, expert blogs | ⚠️ Moderate |
| 5 | General content | News, tutorials, social media | ❌ Low |
Processing Steps:
- Deduplicate: Remove sources covering identical information
- Assess quality: Rate each source using the matrix above
- Check recency: Flag sources older than 3 years for verification
- Check authorship: Identify author credentials where possible
- Classify relevance: Core (directly answers question), Supporting (provides context), Peripheral (tangential but useful)
Output: Source Register artifact (see template)
Quality Gate: Source Register completed. At least 3 Tier 1-2 sources identified.
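The deduplication, recency, and gate checks above could be sketched as follows. This is an illustrative sketch under stated assumptions: sources are plain dicts with hypothetical keys `url`, `quality_tier`, and `published`, and "identical information" is approximated by a normalized URL, which is a weaker test than the content comparison the step describes.

```python
from datetime import date
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return f"{host}{parts.path.rstrip('/')}"


def older_than(published, today, years=3):
    """True when `published` is strictly more than `years` years before `today`."""
    shifted = (published.year + years, published.month, published.day)
    return shifted < (today.year, today.month, today.day)


def process(sources, today):
    """Deduplicate, flag stale sources, and evaluate the Tier 2 gate."""
    seen, register = set(), []
    for src in sources:
        key = normalize_url(src["url"])
        if key in seen:
            continue  # duplicate of a source already in the register
        seen.add(key)
        flagged = older_than(src["published"], today)
        register.append({**src, "verify_recency": flagged})
    # Gate: at least 3 sources rated Tier 1 or Tier 2.
    strong = sum(1 for s in register if s["quality_tier"] <= 2)
    return register, strong >= 3
```

Comparing `(year, month, day)` tuples avoids constructing a shifted `date`, which would fail for sources published on 29 February.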
Tier 3: READ & UNDERSTAND — Deep Extraction
Objective: Extract structured knowledge from each vetted source.
Per-Source Extraction Protocol:
For each source rated Tier 1-3 (full extraction):
- Core claims: What specific claims does this source make?
- Evidence: What evidence supports those claims?
- Methodology: How was the evidence gathered/analyzed?
- Limitations: What does the source acknowledge it doesn't cover?
- Connections: How does this relate to other sources in the register?
For each source rated Tier 4-5 (light extraction):
- Key data points: Specific facts, statistics, or quotes worth keeping
- Perspective: What angle does this source add?
Output: Extraction notes (inline in Research Log or as separate artifacts)
Quality Gate: Every Tier 1-3 source has completed extraction notes.
Tier 3b: PDF ANALYSIS — Academic Deep Reading
Objective: Download available PDFs and perform full Pacheco-Vega AIC analysis per paper.
Activation: Whenever Tier 1 identifies academic papers with available PDFs (arXiv, open-access journals, preprints). Skipped only when zero PDFs are available.
Per-PDF Protocol (Pacheco-Vega AIC method):
- PDF Acquisition: Download PDFs. For arXiv, convert /abs/ URLs to /pdf/ format. For paywalled papers, extract abstract + metadata only.
- Pass 1 — Analytical Reading: Structure map, data/evidence inventory, key terms defined
- Pass 2 — Interpretive Reading: Core argument in own words, theoretical framework, implications, what's missing
- Pass 3 — Critical Reading: Methodological assessment (rigor, sample quality, reproducibility), bias/conflict check, position in field, relevance to current research
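The arXiv URL rewrite in the acquisition step can be sketched as a small helper. The function name `arxiv_pdf_url` is hypothetical, and the paper identifier in the usage note is a placeholder rather than a real reference.

```python
import re

# Matches arXiv abstract pages, with or without "www." and a trailing slash.
_ARXIV_ABS = re.compile(r"^(https?://(?:www\.)?arxiv\.org)/abs/(.+?)/?$")


def arxiv_pdf_url(url):
    """Rewrite an arXiv /abs/ URL to its /pdf/ form; leave other URLs unchanged."""
    match = _ARXIV_ABS.match(url.strip())
    if not match:
        return url
    return f"{match.group(1)}/pdf/{match.group(2)}"
```

For example, `arxiv_pdf_url("https://arxiv.org/abs/2301.00001v2")` returns `"https://arxiv.org/pdf/2301.00001v2"`, while a publisher URL passes through untouched so the caller can fall back to abstract-only extraction.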
Per-PDF Output: One pdf-analysis-{{slug}}.md file per paper (see template).
Archival: Download PDFs to notes/papers/ for persistent access.
Quality Gate: Every available PDF has a completed AIC analysis file. Papers archived to notes/papers/.
Tier 4: EVALUATE — Cross-Reference and Assess
Objective: Build a coherent evidence picture with explicit confidence levels.
Evaluation Methods:
Triangulation Matrix: For each major finding, list which sources support/contradict it:
| Finding | Supporting Sources | Contradicting Sources | Confidence |
|---|---|---|---|
| [claim] | Source A, B, D | Source C | High/Med/Low |

Conflict Analysis: For each contradiction:
- Why do sources disagree? (different timeframes, methodologies, contexts)
- Which side has stronger evidence?
- Can the contradiction be resolved?
Gap Identification: What questions remain unanswered?
- Questions raised but not addressed by any source
- Areas where only 1 source exists (needs more evidence)
- Temporal gaps (no recent data)
Preliminary Hypotheses (3-7 required):
- What the evidence suggests but doesn't prove
- Each with: supporting evidence, counter-evidence, confidence, testability
- Types: explanatory, predictive, prescriptive, comparative
Open Questions Identification (across 5 categories):
- Evidence Gaps: questions the evidence doesn't answer
- Unresolved Contradictions: conflicts without clear resolution
- Emerging Questions: new questions raised by the findings
- Practical Questions: how findings apply in practice
- Definitional Questions: terms or boundaries that remain unclear
Further Research Directions (3 categories):
- Deep Research Follow-ups: sub-topics for another research session
- Academic Research Questions: questions requiring primary literature
- Empirical Studies Needed: questions requiring original data collection
Confidence Scoring:
- High: 3+ independent Tier 1-2 sources agree, no contradictions
- Medium: 2+ sources agree, minor contradictions explained
- Low: Single source, or significant unresolved contradictions
- Speculative: No direct evidence, but logical inference from related findings
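One way to read these four levels as a decision rule is sketched below. It is an interpretation, not a definitive implementation: the function name, the `contradictions` labels, and the handling of edge cases the rubric leaves open (for example, two uncontradicted sources mapping to Medium) are assumptions.

```python
def confidence(agreeing_tiers, contradictions="none", direct_evidence=True):
    """Map the evidence for one finding to a confidence label.

    agreeing_tiers: quality tiers (1-5) of the independent sources that agree.
    contradictions: "none", "minor-explained", or "unresolved" (assumed labels).
    """
    if not direct_evidence or not agreeing_tiers:
        return "Speculative"
    if contradictions == "unresolved" or len(agreeing_tiers) < 2:
        return "Low"
    # Count only Tier 1-2 sources toward the High threshold.
    strong = sum(1 for tier in agreeing_tiers if tier <= 2)
    if strong >= 3 and contradictions == "none":
        return "High"
    return "Medium"
```

Checking the Low conditions before High keeps a well-sourced finding from scoring High while a significant contradiction is still unresolved.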
Quality Gate: Triangulation matrix and gap analysis completed. Minimum 3 hypotheses generated. Open questions identified across all 5 categories.
Tier 5: SYNTHESIZE — Produce Knowledge
Objective: Transform evaluated evidence into structured, actionable outputs.
Synthesis Outputs (all four are ALWAYS produced):
Research Brief (ALWAYS):
- Executive summary of findings
- Field landscape overview: key terms, subfields, adjacent areas
- Key people and publishers identified
- Alternative approaches with trade-offs
- Key findings with confidence levels
- Best practices and critical essentials
- Recommendations or conclusions
Open Questions (ALWAYS):
- Evidence Gaps, Unresolved Contradictions, Emerging Questions, Practical Questions, Definitional Questions
- Priority ranking by impact and tractability
Hypotheses (ALWAYS):
- 3-7 hypotheses with supporting/counter-evidence, confidence, testability
- Priority matrix (confidence × testability × impact)
Further Research (ALWAYS):
- Deep Research Follow-ups with specific search parameters
- Academic Research Questions with databases and search terms
- Empirical Studies Needed with study type and effort estimates
Zettelkasten Integration (when findings are worth keeping):
- Create permanent notes for key insights
- Create literature notes for important sources
- Link to existing knowledge network
- Create structure note if cluster forms
Detailed Synthesis Report (for complex topics):
- Full findings with evidence chains
- Source-by-source contribution mapping
- Methodology assessment
- Limitations statement
Quality Gate: Research Brief, Open Questions, Hypotheses, and Further Research all completed.
Tier 5b: DIAGRAM — Field Map Visualization
Objective: Create a draw.io XML diagram mapping the field's structure for complex topics.
Activation: When the research reveals 3+ interrelated subfields, competing approaches, or a complex landscape of actors and institutions. Skipped for narrowly focused topics.
Diagram Types:
- Field Map: Nodes for concepts, people, institutions, approaches; edges for relationships
- Concept Map: Hierarchical breakdown of a domain
- Comparison Map: Side-by-side alternative approaches with trade-offs
Node Types:
| Type | Shape | Color | Use For |
|---|---|---|---|
| Core Concept | Rounded rectangle | #dae8fc blue | Central topic and key terms |
| Person/Group | Ellipse | #d5e8d4 green | Key researchers, institutions |
| Approach | Hexagon | #fff2cc yellow | Methods, frameworks, tools |
| Subfield | Rectangle | #e1d5e7 purple | Subfields, adjacent areas |
| Institution | Diamond | #f8cecc red | Organizations, conferences |
| Open Question | Cloud | #f5f5f5 gray | Unresolved questions |
Output: field-map.drawio in the session folder (draw.io XML format).
Quality Gate: All key entities from the research represented. Relationships labeled.
Templates
All templates are in .github/skills/deep-research/templates/:
| Template | Purpose | When to Use |
|---|---|---|
| research-brief.md | Executive summary of research | Always — final output |
| open-questions.md | Structured open questions across 5 categories | Always — identifies gaps |
| hypotheses.md | Research hypotheses with evidence and testability | Always — what the evidence suggests |
| further-research.md | Follow-up research directions (deep, academic, empirical) | Always — next steps |
| source-assessment.md | Source register with quality ratings | Tier 2 — tracking sources |
| synthesis-report.md | Detailed findings report | Complex topics |
| research-log.md | Process log for research session | Always — running log |
| pdf-analysis.md | Per-PDF Pacheco-Vega AIC analysis | Tier 3b — academic papers |
| fact-check-report.md | Per-claim verdicts with balanced evidence | Fact-check mode — auditing existing content |
Agent Architecture
The deep research pipeline is implemented as an orchestrator + subagent system. The orchestrator (deep-researcher) is a manager only — it delegates all research work to dedicated subagents that communicate through the research log.
Subagents
| Agent | Tier | Responsibility | Key Tools |
|---|---|---|---|
| deep-research.gather | 1 | Broad data collection across 5 dimensions (web, bookmarks, ZK, academic) | tavily-search, brave-search, raindrop, zettelkasten |
| deep-research.process | 2 | Source triage, quality rating, deduplication | file operations only |
| deep-research.bookmark | Cross-cutting | Raindrop archival of ALL non-discarded sources | raindrop |
| deep-research.extract | 3 | Deep reading and structured knowledge extraction | tavily_extract, tavily_crawl |
| deep-research.analyze-pdf | 3b | PDF download + Pacheco-Vega AIC analysis per paper | fetch_webpage, file operations |
| deep-research.evaluate | 4 | Cross-referencing, triangulation, hypotheses, open questions | file operations only |
| deep-research.synthesize | 5 | Research Brief, Open Questions, Hypotheses, Further Research, ZK notes | zettelkasten |
| deep-research.diagram | 5b | draw.io field map diagram for complex topics | file operations only |
IPC: Research Log as Shared State
All subagents communicate through the research log (research-log.md):
- Each subagent reads the log for its inputs (previous tier's outputs)
- Each subagent writes its results back to the log
- The orchestrator reads the log between tiers to check quality gates
- Status protocol: **Status**: pending | in-progress | completed | failed
- Gate protocol: **Gate**: pending | passed | failed | <reason>
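Reading the status and gate markers back out of the log could look like the sketch below. It assumes each marker sits on its own line exactly as written in the protocol, that a failed gate carries its reason after a `|`, and that the orchestrator passes in the text of a single tier's section; the function name `read_tier_state` and the section heading in the example are hypothetical.

```python
import re

STATUS_RE = re.compile(
    r"^\*\*Status\*\*:[ \t]*(pending|in-progress|completed|failed)[ \t]*$",
    re.MULTILINE,
)
GATE_RE = re.compile(
    r"^\*\*Gate\*\*:[ \t]*(pending|passed|failed)(?:[ \t]*\|[ \t]*(.+?))?[ \t]*$",
    re.MULTILINE,
)


def read_tier_state(section):
    """Return the latest status, gate result, and failure reason in a log section."""
    statuses = STATUS_RE.findall(section)
    gates = GATE_RE.findall(section)
    status = statuses[-1] if statuses else "pending"
    # findall yields (gate, reason) tuples; reason is "" when absent.
    gate, reason = gates[-1] if gates else ("pending", "")
    return {"status": status, "gate": gate, "reason": reason}


example = (
    "## Tier 1: GATHER\n"
    "**Status**: completed\n"
    "**Gate**: failed | only 9 unique sources\n"
)
state = read_tier_state(example)
```

Taking the last match in the section means a subagent can append a new marker on retry instead of rewriting earlier entries.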
Pipeline Flow
Orchestrator: Setup → create research-log.md
↓
[Gather Agent] → 15-20 queries across 5 dimensions → log sources → Gate check
↓
[Process Agent] → triage, classify, flag PDFs → log decisions → Gate check
↓
[Bookmark Agent] → archives ALL non-discarded sources to Raindrop
↓
[Extract Agent] → deep reading per source → log extraction notes → Gate check
↓
[Analyze-PDF Agent] → download PDFs → AIC analysis per paper → Gate check
↓
[Evaluate Agent] → triangulation, hypotheses, open questions, further research → Gate check
↓
[Synthesize Agent] → Research Brief + Open Questions + Hypotheses + Further Research + ZK notes → Gate check
↓
[Diagram Agent] → field map draw.io (if complex topic) → Gate check
↓
[Bookmark Agent] → enriches bookmarks with ZK IDs
↓
Orchestrator: Wrap-up → verify all documents produced → report to user
Session Output Documents
Every session produces these documents in the session folder:
| Document | Required | Template |
|---|---|---|
| Research Log | Always | research-log.md |
| Research Brief | Always | research-brief.md |
| Open Questions | Always | open-questions.md |
| Hypotheses | Always | hypotheses.md |
| Further Research | Always | further-research.md |
| Source Assessment | Always | source-assessment.md |
| PDF Analysis (per paper) | When PDFs available | pdf-analysis.md |
| Synthesis Report | Complex topics | synthesis-report.md |
| Field Map Diagram | Complex topics (3+ subfields) | draw.io XML |
Gate Failure Recovery
The orchestrator handles failures by re-invoking upstream agents with refined guidance. Max 2 retries per tier before escalating to the user.
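The retry policy can be sketched as a bounded loop. This is a simplified illustration: `run_agent` and `check_gate` are hypothetical callables standing in for subagent invocation and log inspection, and the sketch re-invokes a single agent, whereas the real orchestrator may choose which upstream agent to re-run.

```python
def run_tier(tier, run_agent, check_gate, max_retries=2):
    """Run one tier, retrying with refined guidance before escalating."""
    guidance = None
    reason = ""
    # One initial attempt plus up to `max_retries` retries.
    for attempt in range(max_retries + 1):
        run_agent(tier, guidance)
        passed, reason = check_gate(tier)
        if passed:
            return {"tier": tier, "outcome": "passed", "attempts": attempt + 1}
        # Feed the gate failure back as guidance for the next attempt.
        guidance = f"Previous attempt failed the gate: {reason}"
    return {
        "tier": tier,
        "outcome": "escalate-to-user",
        "attempts": max_retries + 1,
        "reason": reason,
    }
```

With the default of two retries, a tier is attempted at most three times before the result is handed to the user.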
Integration with Other Agents
- Literature Reviewer → Hand off academic sources for Pacheco-Vega AIC analysis
- Zettelkasten Orchestrator → Route permanent notes for integration into knowledge graph
Mode: Fact-Check — Verify Claims in Existing Content
The deep research pipeline supports a fact-check mode for auditing existing content (blog posts, articles, proposals) against evidence. This mode reuses the same tiered pipeline but with a different entry point and output format.
When to Use Fact-Check Mode
- Auditing blog posts or articles for factual accuracy before publication
- Validating claims made in proposals, pitches, or community communications
- Periodic evidence review of published content (facts go stale)
- After any content generation session — AI-generated content is especially prone to plausible-sounding but unsourced claims
Fact-Check Pipeline
Step 0: EXTRACT CLAIMS — Scan content, identify every verifiable claim
↓
Step 1: CATEGORIZE — Classify claims by type and boldness
↓
Step 2: GATHER — Search for evidence (both supporting AND contradicting)
↓
Step 3: EVALUATE — Verdict per claim with balanced evidence
↓
Step 4: REPORT — Produce fact-check report with corrections and enrichments
↓
Step 5: APPLY — Propagate corrections back into the content
Step 0: Claim Extraction
Scan the content and extract every statement a skeptical reader could ask "source?" about.
Priority targets: Any <!-- NEEDS SOURCE: ... --> markers in the content are pre-flagged claims from the writing phase. These are the author's own acknowledgment that evidence is needed — extract and prioritize them first.
Also extract: Assertions without markers that a skeptical reader could challenge — especially opinions, polemic statements, or emotional arguments presented as facts without evidence backing.
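Collecting the pre-flagged markers is a simple pattern match. The sketch below assumes markers follow the `<!-- NEEDS SOURCE: ... -->` form shown above and may span several lines; the function name `flagged_claims` is hypothetical. Unmarked assertions still need a human or model pass, which this does not attempt.

```python
import re

# DOTALL lets a marker's note span multiple lines.
MARKER_RE = re.compile(r"<!--\s*NEEDS SOURCE:\s*(.*?)\s*-->", re.DOTALL)


def flagged_claims(content):
    """Return the note text of every NEEDS SOURCE marker, in document order."""
    # Collapse internal whitespace so multi-line notes read as one line.
    return [" ".join(note.split()) for note in MARKER_RE.findall(content)]
```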
Claim types to extract:
| Type | Example | Boldness |
|---|---|---|
| Statistic | "organic reach dropped to 2%" | 🔴 High — must have Tier A/B source |
| Historical event | "MySpace lost 50 million songs in 2019" | 🔴 High — must have contemporaneous source |
| Technical fact | "ActivityPub is a W3C standard" | 🟡 Medium — link to spec suffices |
| Attributed concept | "Doctorow coined enshittification" | 🟡 Medium — credit originator with date |
| Trend assertion | "organic reach has been declining" | 🟡 Medium — directional evidence needed |
| Platform behavior | "the algorithm decides who sees your posts" | 🟢 Low — widely understood, light sourcing |
| Legal/policy | "Meta's ToS grants them a worldwide license" | 🔴 High — must quote actual document |
Boldness determines search depth:
- 🔴 High-boldness claims: actively search for BOTH supporting and contradicting evidence. Require 2+ independent sources.
- 🟡 Medium-boldness claims: search for supporting evidence; note contradictions if found.
- 🟢 Low-boldness claims: verify with at least one source; skip deep counter-evidence search.
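The mapping from claim type to search depth can be captured as two lookup tables. This is a sketch with hypothetical key names; the minimum of one source for medium-boldness claims is an assumption, since the rules above state a minimum only for high-boldness (2+) and low-boldness (at least one) claims.

```python
# Boldness per claim type, following the claim-type table.
BOLDNESS_BY_TYPE = {
    "statistic": "high",
    "historical-event": "high",
    "legal-policy": "high",
    "technical-fact": "medium",
    "attributed-concept": "medium",
    "trend-assertion": "medium",
    "platform-behavior": "low",
}

# Search depth per boldness level.
SEARCH_PLAN = {
    "high": {"min_sources": 2, "search_contradicting": True},
    # Assumption: medium claims need one source; contradictions are noted
    # if encountered but not actively searched for.
    "medium": {"min_sources": 1, "search_contradicting": False},
    "low": {"min_sources": 1, "search_contradicting": False},
}


def plan_for(claim_type):
    """Return the boldness level and search plan for a claim type."""
    boldness = BOLDNESS_BY_TYPE[claim_type]
    return {"boldness": boldness, **SEARCH_PLAN[boldness]}
```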
Step 1: Balanced Evidence Gathering
For each claim, the Gather agent searches for evidence in both directions:
Supporting queries: "[claim topic] evidence", "[claim topic] data", "[claim topic] confirmed"
Contradicting queries: "[claim topic] myth", "[claim topic] debunked", "[claim topic] criticism", "[claim topic] incorrect"
This is the critical difference from standard research mode. Standard mode explores a topic. Fact-check mode stress-tests specific assertions.
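Generating both query sets from one claim topic keeps the search symmetric. A minimal sketch, using the suffixes listed above; the function name `balanced_queries` is hypothetical.

```python
SUPPORTING_SUFFIXES = ("evidence", "data", "confirmed")
CONTRADICTING_SUFFIXES = ("myth", "debunked", "criticism", "incorrect")


def balanced_queries(claim_topic):
    """Build supporting and contradicting search queries for one claim."""
    return {
        "supporting": [f"{claim_topic} {s}" for s in SUPPORTING_SUFFIXES],
        "contradicting": [f"{claim_topic} {s}" for s in CONTRADICTING_SUFFIXES],
    }
```

Returning the two sets under separate keys lets the gate check later confirm that both directions were actually searched.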
Step 2: Verdict Assignment
Each claim receives a verdict:
| Verdict | Criteria |
|---|---|
| ✅ Verified | 2+ independent sources confirm; no credible contradictions |
| ⚠️ Partially correct | Core claim directionally right but details wrong (date, number, scope) |
| ❌ Incorrect | Credible sources contradict the claim; correction needed |
| 🔍 Unverifiable | No sources found confirming or denying; recommend reframing |
| 🆕 Enrichment | Claim is correct but additional context/evidence strengthens it |
For ⚠️ and ❌ verdicts, the report MUST include:
- What the content currently says (exact quote)
- What the evidence says (with citations)
- A suggested corrected text ready for insertion
- The rationale for the correction
Step 3: Balance Assessment
For each claim (especially 🔴 high-boldness), document the evidence balance:
Claim: "Facebook's organic reach dropped from 16% to 2% by 2024"
├── Supporting: CampaignPros (2025), Statista (2024) — Tier B
├── Contradicting: None found
├── Qualifying: AdSchoolMaster reports 5.2% in 2020, suggesting the decline was more gradual than implied
├── Balance: Supporting > Contradicting
└── Nuance: The 16% → 2% trajectory is correct but compressed multiple intermediate steps
This balance assessment appears in the fact-check report and informs how the claim should be presented in the content — not just whether it's "true" but whether the framing is fair.
Step 4: Enrichment Discovery
During fact-checking, the research often uncovers relevant facts not present in the original content. These are captured as enrichment opportunities:
- Additional statistics that strengthen an argument
- Historical context that adds depth
- Counter-arguments that should be acknowledged for intellectual honesty
- More recent data that updates an older claim
Enrichments are optional additions, not corrections. They're tracked separately in the report.
Step 5: Correction Propagation
The fact-check report includes ready-to-apply corrections:
- Exact current text — the verbatim string to find in the content file
- Corrected replacement — the new text with inline citations
- Sources section entries — formatted entries to add to the article's ## Sources section
This enables efficient batch application of corrections across multiple files.
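Because each correction carries the verbatim current text, applying it reduces to an exact string replacement. The sketch below is one cautious way to do that: it assumes corrections arrive as `(current_text, replacement)` pairs and, as a design choice not stated in the source, skips any correction whose current text does not occur exactly once.

```python
def apply_corrections(text, corrections):
    """Apply exact-match corrections; return (new_text, unmatched).

    corrections: iterable of (current_text, replacement) pairs.
    """
    unmatched = []
    for current, replacement in corrections:
        # Skip missing or ambiguous matches instead of guessing.
        if text.count(current) != 1:
            unmatched.append(current)
            continue
        text = text.replace(current, replacement)
    return text, unmatched
```

The `unmatched` list surfaces corrections that need manual review, for instance when the content was edited after the report was written.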
Fact-Check Output
Primary output: fact-check-report.md in the session folder, using the template at .github/skills/deep-research/templates/fact-check-report.md.
The report contains:
- Claim register with per-claim verdicts and balanced evidence
- Critical corrections table (ready-to-apply)
- Enrichment opportunities
- Unresolved claims with recommendations
- Source master list
Fact-Check Quality Gate
- Every extracted claim has a verdict
- Every 🔴 high-boldness claim has both supporting AND contradicting evidence searched
- Every ⚠️/❌ verdict includes a suggested correction
- No claim has verdict "pending" at completion
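These four conditions can be checked directly against the claim register. The sketch below assumes each claim is a dict with hypothetical keys (`id`, `boldness`, `verdict`, `searched_supporting`, `searched_contradicting`, `suggested_correction`) and hypothetical verdict slugs such as `"partially-correct"` and `"incorrect"`.

```python
NEEDS_CORRECTION = {"partially-correct", "incorrect"}


def fact_check_gate(claims):
    """Return (passed, problems) for the fact-check quality gate."""
    problems = []
    for claim in claims:
        cid = claim["id"]
        verdict = claim.get("verdict", "pending")
        if verdict == "pending":
            problems.append(f"{cid}: no verdict")
        if claim["boldness"] == "high" and not (
            claim.get("searched_supporting") and claim.get("searched_contradicting")
        ):
            problems.append(f"{cid}: high-boldness claim not searched in both directions")
        if verdict in NEEDS_CORRECTION and not claim.get("suggested_correction"):
            problems.append(f"{cid}: verdict {verdict} lacks a suggested correction")
    return (not problems, problems)
```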
Anti-Patterns
- Premature synthesis: Jumping to conclusions before completing evaluation
- Source monoculture: Relying on a single type of source
- Confirmation bias: Searching only for evidence that supports the initial hypothesis. In fact-check mode, actively search for counter-evidence, especially for claims you "know" are true.
- Citation-free claims: Making assertions without traceable evidence
- Depth without breadth: Going deep on first sources found without surveying the landscape
- Orchestrator doing work: The deep-researcher must ONLY delegate — never research directly
- Skipping gate checks: Always verify the research log status between tiers
- Verification bias: Accepting claims as correct because they "sound right" or appeared in AI-generated text. Every claim must face evidence, not assumptions.
- Missing counter-evidence: For bold claims, finding only supporting evidence is incomplete work. Actively seek contradictions, qualifications, and nuance.
- Binary verdicts: Claims aren't just "true" or "false." Most disputed claims are partially correct, correct but misleading, or correct but outdated. Capture the nuance.
- Opinions as facts: Polemic statements, emotional arguments, and value judgments presented without evidence must be flagged — they are claims that need sourcing. "Facebook is destroying communities" needs documented harms; "organizers report declining reach" does not.
- Ignoring source markers: <!-- NEEDS SOURCE: ... --> markers in content are the author's explicit request for evidence. Never skip them — they are the highest-priority targets for fact-check mode.