deep-research - SKILL.md Agent Skill

name: deep-research
description: "IMPORTANT: This skill contains a research methodology you MUST read before doing any web research. It defines a 4-phase protocol (dimension mapping → parallel subagent allocation → T1-T4 source credibility triage → completeness gate with saturation stopping criteria) that you do not have by default. Without reading this skill, you will skip dimension mapping, run searches sequentially instead of in parallel, miss the source tier system, and lack the completeness checkpoints. Read this skill FIRST whenever the user asks to research, investigate, compare, evaluate, explain, or look into any topic, or when they need current real-world information for any task (articles, presentations, reports, decisions). Also read it for 'what is X', 'what are the best practices for X', market/technology analysis, or any question requiring synthesis of multiple web sources."
user-invocable: true
allowed-tools: WebSearch, WebFetch, Read, Write, Bash, Task

Deep Research Skill

Before starting research, assess four factors: Scope (how many dimensions does this topic have?), Depth (overview or actionable detail?), Freshness (is this time-sensitive?), Stakes (will someone make a decision based on this?). These determine how much work is warranted — a casual explainer needs less rigor than a technology evaluation driving a purchase decision.

The Research Decision Tree

User request arrives
├── Simple factual question (single fact, well-known, no competing views)?
│   └── Single WebSearch + verify with WebFetch → Done
│       Smell test: Does this have competing schools of thought?
│                   Does the answer vary by context/industry?
│                   Has the answer changed in the last 2 years?
│                   If ANY = yes → treat as multi-faceted
├── Multi-faceted topic?
│   └── FULL PROTOCOL (Phase 1-4 below)
├── Comparison/evaluation?
│   └── FULL PROTOCOL + ensure EQUAL depth on each option
└── Pre-content-generation?
    └── FULL PROTOCOL + collect specific assets (data, quotes, examples)
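
The routing above can be written down as a small function. This is only a sketch: `RequestTraits`, its field names, and the returned labels are hypothetical, and the order in which the comparison and pre-content branches are checked is an assumption rather than something the tree specifies.

```python
from dataclasses import dataclass


@dataclass
class RequestTraits:
    """Answers to the triage questions (all field names are illustrative)."""
    single_fact: bool = False        # one well-known fact, no competing views
    competing_views: bool = False    # smell test: competing schools of thought
    varies_by_context: bool = False  # smell test: answer varies by context/industry
    changed_recently: bool = False   # smell test: answer changed in the last 2 years
    is_comparison: bool = False      # comparison or evaluation request
    feeds_content: bool = False      # research precedes an article, deck, or report


def choose_route(traits: RequestTraits) -> str:
    """Map a request onto one branch of the decision tree."""
    smell_test_hit = (
        traits.competing_views or traits.varies_by_context or traits.changed_recently
    )
    # The shortcut applies only when the question is simple AND passes the smell test.
    if traits.single_fact and not smell_test_hit:
        return "single search, verify with fetch"
    if traits.is_comparison:
        return "full protocol + equal depth on each option"
    if traits.feeds_content:
        return "full protocol + collect assets"
    return "full protocol"
```

A request flagged as a single fact still gets the full protocol as soon as any smell-test answer is yes, which is the point of the smell test.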

Phase 1: Dimension Mapping (3-5 searches)

Goal: Identify ALL angles before going deep on any single one.

Search the topic broadly, then list dimensions. A good dimension map looks like:

Topic: "X"
Dimensions: [technical, business, user, regulatory, competitive, historical]

CRITICAL: Spend roughly 20% of the total research time here. Getting the dimensions wrong means researching the wrong things deeply.

Phase 2: Parallel Deep Dive

Launch parallel research agents via the Task tool, one per dimension. Each dimension's searches are independent, so there is no reason to serialize them; on multi-faceted topics, sequential searching can take 3-5x as long.

Each agent gets ONE dimension with this prompt template:

Research "[TOPIC] — [DIMENSION]" thoroughly:
1. Search 3-5 queries with different phrasings
2. WebFetch the 2 most authoritative results
3. Return: key findings, data points, source URLs, and confidence level (high/medium/low)

Subagent allocation by topic type:

Topic Type	Agents	Dimensions
Technology evaluation	4	Technical specs, Real-world adoption, Limitations, Alternatives
Market research	4	Market size/trends, Key players, User needs, Regulatory
Concept explanation	3	Core mechanics, Applications, Criticisms/limitations
Comparison (A vs B)	4	A strengths, B strengths, Head-to-head data, User experiences
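
One way to keep subagent prompts consistent is to generate them from the allocation table. The sketch below assumes the table is stored as a plain dictionary; `ALLOCATION`, `PROMPT_TEMPLATE`, and `build_prompts` are hypothetical names, and the template separates topic and dimension with a colon as a stylistic choice. It only builds the prompt strings and does not launch anything via the Task tool.

```python
# Allocation table from above, keyed by lowercase topic type (illustrative structure).
ALLOCATION = {
    "technology evaluation": [
        "Technical specs", "Real-world adoption", "Limitations", "Alternatives",
    ],
    "market research": [
        "Market size/trends", "Key players", "User needs", "Regulatory",
    ],
    "concept explanation": [
        "Core mechanics", "Applications", "Criticisms/limitations",
    ],
    "comparison (a vs b)": [
        "A strengths", "B strengths", "Head-to-head data", "User experiences",
    ],
}

PROMPT_TEMPLATE = """Research "{topic}: {dimension}" thoroughly:
1. Search 3-5 queries with different phrasings
2. WebFetch the 2 most authoritative results
3. Return: key findings, data points, source URLs, and confidence level (high/medium/low)"""


def build_prompts(topic: str, topic_type: str) -> dict[str, str]:
    """Return one prompt per dimension, keyed by dimension name."""
    dimensions = ALLOCATION[topic_type.lower()]
    return {d: PROMPT_TEMPLATE.format(topic=topic, dimension=d) for d in dimensions}
```

For example, `build_prompts("vector databases", "Technology evaluation")` yields four prompts, one per agent, each ending with the same return schema.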

Search Query Craft

Expert query patterns that surface hidden results:

# Force authoritative sources
"[topic] site:arxiv.org OR site:nature.com OR site:acm.org"
"[topic] filetype:pdf"

# Find real experiences, not marketing
"[topic] postmortem" / "[topic] lessons learned" / "[topic] we switched from"
"[topic] reddit" / "[topic] hacker news discussion"

# Surface data, not opinions
"[topic] benchmark results 2025 2026"
"[topic] survey report statistics"

# Find contrarian views
"[topic] overrated" / "[topic] problems with" / "[topic] why not"
"[topic] alternative to"

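The patterns above can be expanded mechanically for any topic. This is a minimal sketch: the grouping keys, `QUERY_PATTERNS`, and `expand_queries` are hypothetical names, and passing the year qualifier as a parameter is an assumption so the patterns do not go stale.

```python
# Query patterns from the list above, grouped by what they are meant to surface.
QUERY_PATTERNS = {
    "authoritative": [
        "{topic} site:arxiv.org OR site:nature.com OR site:acm.org",
        "{topic} filetype:pdf",
    ],
    "experience": [
        "{topic} postmortem",
        "{topic} lessons learned",
        "{topic} we switched from",
        "{topic} reddit",
        "{topic} hacker news discussion",
    ],
    "data": [
        "{topic} benchmark results {years}",
        "{topic} survey report statistics",
    ],
    "contrarian": [
        "{topic} overrated",
        "{topic} problems with",
        "{topic} why not",
        "{topic} alternative to",
    ],
}


def expand_queries(topic: str, years: tuple[int, ...] = (2025, 2026)) -> dict[str, list[str]]:
    """Fill every pattern with the topic; `years` feeds the benchmark query only."""
    year_text = " ".join(str(y) for y in years)
    return {
        goal: [pattern.format(topic=topic, years=year_text) for pattern in patterns]
        for goal, patterns in QUERY_PATTERNS.items()
    }
```

A subagent would normally pick 3-5 of these per dimension rather than run all of them.
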
Handling Research Failures

Failure	Action
WebFetch returns 403/paywall	Cite abstract/snippet only; note "full text inaccessible" in source list
WebFetch times out	Retry once with a different URL; if it still fails, move to the next result
Subagent returns no useful findings	Re-run with 2 rephrased queries before declaring dimension dry
Dimension returns only T4 sources	Note "no high-quality sources found for [dimension]" — explicit gaps are more honest than silent omissions
All searches return outdated results	Add year qualifier ("2025 2026") and try "[topic] latest" / "[topic] recent"
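
The first two rows of the table can be sketched as a wrapper around whatever fetch call is available. Everything here is an assumption for illustration: `fetch` is a caller-supplied function, the built-in `PermissionError` and `TimeoutError` stand in for a 403/paywall and a timeout, and the returned dictionary shape is hypothetical. It does not describe how WebFetch itself reports errors.

```python
from typing import Callable, Optional


def fetch_with_fallback(
    url: str,
    snippet: str,
    fetch: Callable[[str], str],
    alt_url: Optional[str] = None,
) -> Optional[dict]:
    """Apply the 403/paywall and timeout rows of the failure table.

    Returns a source record, or None when the caller should move to the next result.
    """
    try:
        return {"url": url, "text": fetch(url), "note": ""}
    except PermissionError:
        # 403/paywall: keep only the snippet and record the access problem.
        return {"url": url, "text": snippet, "note": "full text inaccessible"}
    except TimeoutError:
        if alt_url is None:
            return None
        try:
            # Retry once, with a different URL.
            return {"url": alt_url, "text": fetch(alt_url), "note": ""}
        except (PermissionError, TimeoutError):
            return None
```

Recording the note on the source record means the access problem can be carried through to the final source list instead of being lost.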

Phase 3: Source Credibility Triage

After collecting results, triage EVERY source:

Tier	Source Type	Trust Level	Action
T1	Primary research, official docs, peer-reviewed	High	Cite directly
T2	Reputable journalism, industry analysts (Gartner, McKinsey)	Medium-High	Cite with attribution
T3	Blog posts, tutorials, Stack Overflow	Medium	Cross-check claims
T4	AI-generated summaries, content farms, undated articles	Low	Find original source — these introduce errors at each repackaging layer

Contradiction Resolution: When sources disagree, work through these checks in order:

1. Check publication dates: newer usually wins for factual claims
2. Check source tier: T1 overrides T3
3. Check specificity: specific data overrides general claims
4. If still unresolved: present both viewpoints with context
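
The resolution order can be sketched as a function over two conflicting claims. The `Claim` type and `resolve` are hypothetical, and `min_gap_days` is an assumption introduced here to model "newer usually wins": the date check is treated as decisive only when the sources are at least a year apart, which the text above does not specify.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Claim:
    text: str
    tier: int                # 1 (T1) through 4 (T4)
    published: date
    has_specific_data: bool  # cites concrete numbers rather than general statements


def resolve(a: Claim, b: Claim, min_gap_days: int = 365) -> list[Claim]:
    """Return the winning claim, or both when no check settles the disagreement."""
    # 1. Dates: the newer claim wins, but only when the gap is large (assumed threshold).
    if abs((a.published - b.published).days) >= min_gap_days:
        return [a if a.published > b.published else b]
    # 2. Tier: a lower tier number means a more trusted source.
    if a.tier != b.tier:
        return [a if a.tier < b.tier else b]
    # 3. Specificity: concrete data beats a general claim.
    if a.has_specific_data != b.has_specific_data:
        return [a if a.has_specific_data else b]
    # 4. Unresolved: present both viewpoints with context.
    return [a, b]
```

Returning both claims in the unresolved case keeps the disagreement visible in the final report.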

Phase 4: Completeness Gate

STOP and check before synthesizing:

Checkpoint	Minimum Bar
Distinct sources fetched (WebFetch, not just snippets)	≥ 3
Dimensions covered	≥ 3
Concrete data points (numbers, dates, names)	≥ 5
Counterarguments or limitations found	≥ 1
Source tiers represented	At least one T1 or T2

If any checkpoint fails → go back and search more.
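
The gate is easy to encode as a check that returns whatever is still missing. The function name, parameters, and messages below are hypothetical; the thresholds are the minimum bars from the table.

```python
def completeness_gaps(
    fetched_sources: int,
    dimensions_covered: int,
    data_points: int,
    counterarguments: int,
    tiers_seen: set[str],
) -> list[str]:
    """Return the failed checkpoints; an empty list means the gate is passed."""
    checks = [
        (fetched_sources >= 3, "fewer than 3 sources fetched in full"),
        (dimensions_covered >= 3, "fewer than 3 dimensions covered"),
        (data_points >= 5, "fewer than 5 concrete data points"),
        (counterarguments >= 1, "no counterargument or limitation found"),
        (bool(tiers_seen & {"T1", "T2"}), "no T1 or T2 source"),
    ]
    return [message for passed, message in checks if not passed]
```

Each returned message names a checkpoint to go back and search for, so the gate doubles as a to-do list.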

Stopping Criterion (Research Saturation): Stop when at least TWO of these are true:

New searches return sources already seen (≥50% overlap)
New findings no longer change the Executive Summary
All dimensions have at least one T1 or T2 source

Research beyond saturation is overhead, not depth. Stop and synthesize.
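
A sketch of the saturation test, under stated assumptions: `is_saturated` and its parameters are hypothetical, overlap is measured as the share of the latest batch of URLs already seen, an empty batch counts as no overlap, and "two" is read as two or more.

```python
def is_saturated(
    new_urls: list[str],
    seen_urls: set[str],
    summary_changed: bool,
    dimension_tiers: dict[str, set[str]],
) -> bool:
    """Return True when at least two of the three saturation signals hold."""
    new = set(new_urls)
    # Share of the latest results that were already seen (0.0 for an empty batch).
    overlap = len(new & seen_urls) / len(new) if new else 0.0
    signals = [
        overlap >= 0.5,
        not summary_changed,
        # Every dimension has at least one T1 or T2 source.
        bool(dimension_tiers)
        and all(tiers & {"T1", "T2"} for tiers in dimension_tiers.values()),
    ]
    return sum(signals) >= 2
```

Here `dimension_tiers` maps each dimension name to the set of tiers found for it, for example `{"technical": {"T1", "T3"}}`.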

Common Pitfalls

These are the failure modes that consistently degrade research quality:

Snippet trust. WebSearch snippets are truncated, mangled, and stripped of context. A snippet saying "revenue grew 40%" might be from 2019 or referring to a different company. WebFetch the source before citing any specific claim or data point — the 30 seconds it costs prevents attribution errors that undermine the entire output.

Single-angle research. Searching "React performance" and stopping is not research. The same topic searched as "React vs Vue benchmark", "React production issues", and "React performance optimization patterns" surfaces completely different sources. Rephrase queries to break out of the search engine's first-page bubble.

Citing AI-generated summaries. AI-generated summaries, Medium posts, and SEO content farms often repackage primary sources, with errors introduced at each layer. When a secondary article makes a claim, find the original it is summarizing; that original is the citable source.

Undated content. Technical articles without publication dates could be 5+ years old. In fast-moving fields, outdated information presented as current is worse than no information. Note "date unknown" when the publication date can't be determined.

Vague subagent prompts. Launching a subagent with "Research X" produces inconsistent results that cannot be merged. Specify the return schema: "Return key findings, data points, source URLs, and confidence level (high/medium/low)." Those few extra words in the prompt save minutes of post-processing.

Output Structure

1. Executive Summary (2-3 sentences, the headline finding)
2. Key Findings by Dimension (structured, with source attribution per finding)
3. Data Points (specific numbers, properly attributed, never from snippets)
4. Contrarian/Risk View (what could be wrong or overhyped)
5. Source List (URLs grouped by tier, with access notes for any paywalled sources)
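
The source list is the most mechanical part of the output, so it can be sketched in a few lines. `render_source_list` and the `(tier, url, access_note)` tuple shape are hypothetical, and the plain-text layout is only one possible rendering.

```python
from collections import defaultdict


def render_source_list(sources: list[tuple[str, str, str]]) -> str:
    """Group (tier, url, access_note) entries by tier and render them as plain text."""
    by_tier: dict[str, list[str]] = defaultdict(list)
    for tier, url, note in sources:
        # Append the access note only when there is one (e.g. a paywalled source).
        by_tier[tier].append(f"{url} ({note})" if note else url)
    lines = []
    for tier in sorted(by_tier):  # "T1" sorts before "T2", and so on
        lines.append(f"{tier}:")
        lines.extend(f"  {entry}" for entry in by_tier[tier])
    return "\n".join(lines)
```

A paywalled source then appears under its tier with the note attached, for example `https://example.org/a (full text inaccessible)`.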