deep-research - SKILL.md Agent Skill

name: deep-research description: Autonomous research pipeline - discover, extract, and integrate cutting-edge insights into knowledge base argument-hint: [optional topic or "auto" for autonomous selection] automation: gated allowed-tools: Task, Read, Bash, Glob, Grep

Deep Research & Knowledge Integration Pipeline

You are orchestrating a fully autonomous research → extraction → connection discovery workflow to expand the knowledge base with cutting-edge insights.

Input Processing

User Input: $ARGUMENTS

Execution Modes:

Directed Mode - User specifies topic(s): $ARGUMENTS = "neuroscience of habits" or $ARGUMENTS = "multi-agent systems, safety alignment"
Autonomous Mode - You select topics: $ARGUMENTS = "" or $ARGUMENTS = "auto"

Mission

Execute a complete 3-phase autonomous research pipeline:

RESEARCH - Gather cutting-edge papers and developments
EXTRACT - Pull unique insights from research findings
CONNECT - Map connections to existing knowledge base

Critical Requirement: ALL extracted insights MUST be stored in Document Insights folder structure to keep separate from main Brain.

Phase 1: Topic Selection & Research Planning

A. If User Provided Topic(s) (Directed Mode)

Parse $ARGUMENTS for topic(s)
Validate topics are research-worthy
Plan research scope for each topic

B. If Autonomous Mode

Analyze knowledge base to identify research opportunities:

Read knowledge base analysis:
```
cat knowledge-base-analysis.md
```

Check recent activity:

ls -lt Brain/Document\ Insights/ | head -10

Identify gaps based on:
- Underrepresented domains in knowledge-base-analysis.md
- Missing connections flagged in recent changelogs
- Emerging themes from existing insights
- User's recent work patterns
- CLAUDE.md priorities and future directions
Select 1-3 research topics that would:
- Fill identified gaps
- Build on existing strengths (e.g., Buddhism-Neuroscience-AI triangle)
- Connect underexplored domains
- Add empirical validation to intuitive frameworks
- Challenge or extend current thinking

Examples of Good Topic Selection:

"Neuroscience of habits and behavior change" (if habit formation underrepresented)
"Collective intelligence and swarm behavior" (if group dynamics missing)
"Embodied cognition and interoception" (if embodiment gap identified)
"Complexity science and emergence" (if systems thinking needed)
"Creativity neuroscience and insight generation" (if creative process mechanics missing)

Phase 2: Execute Research

Get Current Timestamp

date '+%Y-%m-%d %H:%M:%S %Z'

Save this for session folder naming: YYYY-MM-DD Topic Description

Launch Research Specialist Agent(s)

For Each Topic:

Use Task tool with subagent_type='research-specialist':

TOPIC: [Selected topic]

Conduct comprehensive research on [topic] focusing EXCLUSIVELY on the most recent research and developments.

⚠️ CRITICAL RECENCY REQUIREMENT:
Your training data may be outdated. The world changes rapidly, especially in fast-moving fields like AI, neuroscience, and technology. You MUST prioritize the most recent information available through web search, even if it contradicts what you think you know from training data.

SEARCH STRATEGY:
- Use Google Search grounding to find papers published in the last 12-18 months
- Explicitly search for "2024", "2025", "recent", "latest" in queries
- Check paper publication dates - reject anything older than 2023 unless foundational
- Look for preprints, conference proceedings, and recent journal publications
- Prioritize arXiv papers from last 6 months, conference papers from 2024-2025
- Search for "state of the art [topic] 2024" or "[topic] breakthrough 2025"

RESEARCH REQUIREMENTS:

1. **Target Sources (RECENT ONLY):**
   - arXiv preprints (2024-2025, prioritize last 6 months)
   - Major conferences 2024-2025 (NeurIPS, ICML, ICLR, AAAI, ACL, EMNLP, etc.)
   - Leading AI labs recent publications (OpenAI, Anthropic, Google DeepMind, Microsoft Research)
   - Top-tier journals (2024-2025 issues only)
   - Industry whitepapers and blog posts from major tech companies (last 12 months)
   - Recent preprints and working papers

2. **Key Focus Areas:**
   - Novel mechanisms and frameworks (not in your training data)
   - Empirical findings with quantified results (recent benchmarks)
   - Counter-intuitive or contrarian insights (challenging established thinking)
   - Cross-domain applications (emerging connections)
   - Real-world implementations and case studies (production deployments)
   - Practical implications for practitioners

3. **Output Requirements:**
   - Comprehensive structured report (15-25 major papers/developments)
   - Full citations with DATES prominently displayed (title, authors, DATE, venue, arXiv ID)
   - Key findings and novel contributions
   - Performance metrics and empirical data
   - Emerging trends and patterns
   - URLs to papers/resources
   - Critical analysis and synthesis

4. **Save Location:**
   resources/[Topic-Slug]-Research-Report-YYYY-MM-DD.md

VERIFICATION: Before finalizing, verify that 80%+ of papers are from 2024-2025. If not, search again with more explicit recency filters.

Use Gemini AI with Google Search grounding. Trust the search results over your training data.

Strategy Considerations:

Sequential: Run topics one-by-one if they're related (later research can reference earlier findings)
Parallel: Run multiple topics simultaneously if they're independent domains
Your choice - decide based on topic relationships and efficiency

Monitor Research Output

After each research agent completes:

Note the report file path
Verify comprehensive coverage (15-25+ papers)
Check for citations and empirical data
Confirm report saved in /resources/ directory

Phase 3: Extract Insights

Create Session Folder

Format: YYYY-MM-DD [Topic Description]

Example: 2025-11-20 Neuroscience of Habits and Behavior Change

Path: Brain/Document Insights/[Session-Folder]/

Launch Document Insight Extractor

For Each Research Report:

Use Task tool with subagent_type='document-insight-extractor':

Extract unique insights from the research report for the knowledge base.

SOURCE DOCUMENT: [Full path to research report]

SESSION FOLDER: [Session folder name]

EXTRACTION GUIDELINES:

1. **Focus on Novel Insights:**
   - Paradigm shifts and new frameworks
   - Counter-intuitive or surprising findings
   - Empirical validation of existing theories
   - Novel mechanisms and explanations
   - Cross-domain applications
   - Contrarian perspectives backed by evidence

2. **Bridge to Existing Knowledge Base:**
   - Connect to the 6 primary hubs: Consciousness, Dopamine, Decision-Making, Identity, AI Agents, Flow States
   - Reference Eugene's existing frameworks (Folder Paradigm, Mental Models Taxonomy, etc.)
   - Identify consilience opportunities (3+ domains converging)
   - Find validation or challenges to current thinking
   - Look for applications of Buddhist/neuroscience principles

3. **Prioritize:**
   - Research findings that extend current understanding
   - Empirical data that validates intuitive frameworks
   - Novel architectures or methodologies
   - Real-world implications and case studies
   - Philosophical or meta-level insights

4. **Quality Standards:**
   - 15-25 high-quality insights per report
   - Avoid redundancy with existing knowledge base (ALWAYS search for duplicates)
   - Include proper citations (paper title, authors, year)
   - Tag appropriately for discoverability
   - Create connections to existing permanent notes

5. **Output Requirements:**
   - Create permanent notes in session folder
   - Include full citations and sources
   - Add relevant tags
   - Note connections to existing insights
   - Create changelog: CHANGELOG - Document Analysis YYYY-MM-DD.md

CRITICAL:
- ALWAYS search for duplicates before creating notes
- Store ALL extracted notes in: Brain/Document Insights/[Session-Folder]/
- Create comprehensive changelog documenting extraction process

Monitor Extraction Output

After extraction completes:

Verify insights stored in correct Document Insights session folder
Check changelog was created
Note count of unique insights extracted
Confirm deduplication was performed

Phase 4: Insight Interview (Optional)

After extraction completes, present the top findings and offer to run an insight interview before connection discovery. This captures your personal perspective alongside the external research - making the final connection map richer because it maps both what the research says AND what you actually think about it.

Present Top Insights

Summarize the 5-8 most significant extracted insights from the session folder:

List note titles with one-sentence descriptions
Highlight findings that challenge existing KB frameworks or contradict current notes
Flag any surprising or counterintuitive results

[APPROVAL GATE] - Run Insight Interview?

Present to user:

"[N] insights extracted on [topic]. Before connection discovery, would you like to do a quick insight interview? I'll ask you 6-8 questions grounded in your existing notes and these new findings - to capture YOUR angles, reactions, and disagreements. Your responses save to Brain/AI Extracted Notes/ and the connection finder will map both sets together.

Say yes to run the interview, or skip to go straight to connection discovery."

If yes: Invoke the insight-interview skill for the current topic.

The dialogue runs here - one question at a time
User insights saved to Brain/AI Extracted Notes/
Note the session timestamp so connection discovery can include these new notes

If skip: Proceed directly to Phase 5.

Update Scope for Connection Discovery

If the interview ran, Phase 5 should map connections across both:

External research: Brain/Document Insights/[Session-Folder]/
Personal insights: new notes created in Brain/AI Extracted Notes/ during this session

Phase 5: Connection Discovery

Launch Connection Finder Agent(s)

Strategy Options:

Option A: Single Comprehensive Pass

Run connection-finder once on the entire session folder
Maps all new insights against full knowledge base

Option B: Multiple Targeted Passes

Run connection-finder 2-3 times on different subsets
First pass: New insights ↔ Existing AI insights (102 notes)
Second pass: New insights ↔ Primary hubs (Dopamine, Consciousness, etc.)
Third pass: Cross-domain bridges and synthesis opportunities

Your Choice - Select based on insight count and domain diversity.

Execute Connection Discovery

Use Task tool with subagent_type='connection-finder':

Discover connections between newly extracted insights and existing knowledge base.

STARTING POINTS:
All notes in session folder: Brain/Document Insights/[Session-Folder]/

Or specify individual notes if doing targeted passes.

CONNECTION DISCOVERY GOALS:

1. **Bridge to Existing Knowledge:**
   - Connect to 102 existing AI insights
   - Link to 6 primary thematic hubs (Consciousness, Dopamine, Decision-Making, Identity, AI Agents, Flow)
   - Find relationships to original frameworks (Folder Paradigm, Mental Models Taxonomy, etc.)
   - Map to MOCs and output content

2. **Cross-Domain Opportunities:**
   - Buddhism ↔ Neuroscience ↔ AI consilience
   - Decision Science ↔ Agent Architecture
   - Flow States ↔ Peak Performance ↔ AI Optimization
   - Identity/Belief Systems ↔ Agent Fitness Functions
   - Dopamine hub connections (universal bridge)

3. **Synthesis Identification:**
   - Clusters of insights ready for article development
   - Consilience zones (3+ domains converging)
   - Emergent patterns and meta-insights
   - Framework extension opportunities
   - New MOC candidates

4. **Analysis Parameters:**
   - Similarity thresholds: 0.65-0.85 (strong to moderate)
   - Depth: 2-3 levels from each new insight
   - Focus: Non-obvious, high-value connections

5. **Output Requirements:**
   - Map direct connections to existing permanent notes
   - Identify bridge notes connecting multiple domains
   - Highlight consilience zones and synthesis opportunities
   - Create dated changelog: CHANGELOG - Connection Discovery Session YYYY-MM-DD.md
   - Store changelog in: Brain/05-Meta/Changelogs/
   - Update master changelog: Brain/CHANGELOG.md
   - Suggest concrete article topics or framework extensions

Begin comprehensive connection mapping.

Monitor Connection Discovery

After connection-finder completes:

Verify changelog created in /Brain/05-Meta/Changelogs/
Check master CHANGELOG.md was updated
Note key findings: consilience zones, synthesis opportunities
Identify high-priority article topics

Phase 6: Final Summary & Recommendations

Consolidate Results

Generate a comprehensive session report including:

# Deep Research Pipeline - Session Summary
**Date:** [Timestamp]
**Execution Mode:** [Directed / Autonomous]
**Topics Researched:** [List]

---

## Phase 1: Research
**Topics Selected:**
1. [Topic 1] - Rationale: [Why chosen]
2. [Topic 2] - Rationale: [Why chosen]
...

**Research Reports Created:**
- [Report 1]: /resources/[filename] ([N] papers analyzed)
- [Report 2]: /resources/[filename] ([N] papers analyzed)

**Total Papers Analyzed:** [N]
**Research Coverage:** [Domains covered]

---

## Phase 2: Insight Extraction
**Session Folder:** /Brain/Document Insights/[Session-Folder]/

**Extraction Results:**
- Unique insights extracted: [N]
- Duplicates avoided: [N]
- Very similar (evaluated): [N]
- Changelogs created: [List paths]

**Insights by Type:**
- Research findings: [N]
- Theoretical frameworks: [N]
- Production insights: [N]
- Contrarian arguments: [N]

**Top Insights:**
1. [[Note Title]] - [Brief description]
2. [[Note Title]] - [Brief description]
...

---

## Phase 3: Connection Discovery
**Changelogs Created:**
- [Path to connection discovery changelog]

**Key Findings:**
- Strong connections discovered: [N]
- Emergent patterns identified: [N]
- Cross-domain bridges: [N]
- Consilience zones: [List]

**Major Cross-Domain Bridges:**
1. [Domain A] ↔ [Domain B] - Mechanism: [How connected]
2. [Domain A] ↔ [Domain C] - Mechanism: [How connected]

**Synthesis Opportunities Identified:**
1. **Article:** "[Title]" - Ready for development
2. **Framework:** "[Name]" - Extension of existing work
3. **MOC Candidate:** "[Topic]" - Needs organization hub

---

## Impact Assessment

**Knowledge Base Enhancement:**
- New research domains added: [List]
- Existing frameworks validated/extended: [List]
- Gaps filled: [List]
- New connections to core hubs: [N]

**Most Significant Discoveries:**
1. [Discovery 1] - Why significant: [Explanation]
2. [Discovery 2] - Why significant: [Explanation]
3. [Discovery 3] - Why significant: [Explanation]

**Contrarian Insights:**
- [Insight that challenges conventional wisdom]
- [Insight that challenges existing framework]

---

## Recommended Next Steps

**High-Priority Actions:**
1. **Write Article:** "[Suggested title]"
   - Sources: [[Note 1]], [[Note 2]], [[Note 3]]
   - Unique angle: [What makes this distinctive]
   - Target audience: [Who would benefit]

2. **Extend Framework:** "[Framework name]"
   - Current state: [What exists]
   - Enhancement: [What research adds]
   - Application: [How to use]

3. **Create MOC:** "[Topic]"
   - Notes to organize: [Count]
   - Structure: [Suggested organization]
   - Purpose: [Navigation goal]

**Medium-Priority:**
- [Additional recommendations]

**Long-Term Opportunities:**
- [Strategic synthesis possibilities]

---

## Session Files Created

**Research Reports:**
- [Path 1]
- [Path 2]

**Insight Notes:**
- [Session folder path] ([N] notes)

**Changelogs:**
- [Extraction changelog path]
- [Connection discovery changelog path]
- Master CHANGELOG.md updated

---

## Knowledge Base Statistics (Updated)

**Before Session:**
- Total permanent notes: [N]
- AI insights: [N]
- Document insights: [N]

**After Session:**
- Total permanent notes: [N] (+[N])
- AI insights: [N]
- Document insights: [N] (+[N])

**Growth:** +[N] notes, +[N] connections

---

## Meta-Analysis

**What Worked Well:**
- [Successes in topic selection, research, extraction, or connection]

**Challenges Encountered:**
- [Any difficulties or limitations]

**Lessons for Future Sessions:**
- [Improvements for next research pipeline run]

---

**End of Deep Research Pipeline Session**

Quality Standards & Best Practices

Topic Selection (Autonomous Mode)

Strategic alignment: Choose topics that build on existing strengths or fill critical gaps
Cross-domain potential: Prefer topics that bridge multiple knowledge base hubs
Empirical grounding: Select areas with active research (2024-2025 papers available)
Practical relevance: Topics should have real-world applications or implications

Research Quality

Recency: Prioritize 2024-2025 papers and developments
Rigor: Focus on peer-reviewed research and reputable sources
Depth: 15-25 major papers minimum per topic
Breadth: Cover multiple perspectives and approaches
Empirics: Include quantified results and performance metrics

Insight Extraction

Novelty: Only extract genuinely new perspectives
Deduplication: ALWAYS search before creating notes
Citations: Include full source attribution
Connections: Link to existing knowledge base
Quality > Quantity: 15-25 high-value insights, not 100 mediocre ones

Connection Discovery

Non-obvious focus: Surface-level links are less valuable
Cross-domain priority: Consilience zones are gold
Synthesis orientation: Identify article/framework opportunities
Actionable output: Provide concrete next steps

Documentation

Comprehensive changelogs: Document every phase
Clear file organization: Session folders in Document Insights
Master log updates: Keep CHANGELOG.md current
Audit trail: Future-you should understand what happened and why

Execution Protocol

Parse input → Determine directed vs. autonomous mode
Select topics → Either use provided topics or analyze knowledge base for gaps
Get timestamp → For session folder naming
Research phase → Launch research-specialist agent(s)
Extraction phase → Launch document-insight-extractor for each report
Insight interview → Optional gate: present top findings, offer /insight-interview to capture your angles before connection discovery
Connection phase → Launch connection-finder agent(s) across both document and personal insights
Generate summary → Comprehensive session report
Provide recommendations → Actionable next steps for content creation

Key Principle: Fully autonomous execution. No human intervention required between phases. All insights stored in Document Insights folder structure to maintain separation from main Brain.

Error Handling

If research finds insufficient papers:

Broaden search criteria
Extend date range (include 2023)
Consider adjacent topics
Document limitation in summary

If extraction finds too many duplicates:

Focus on truly novel contributions
Look for empirical validation of concepts
Seek contrarian perspectives
Consider topic was already well-covered

If connection-finder finds weak connections:

Topic may be genuinely novel (good!)
Increase similarity threshold range
Run additional passes on specific hubs
Document gap as synthesis opportunity

If any phase fails:

Document error in summary
Continue with successful phases
Provide partial results
Recommend retry or alternative approach

Remember: This is a knowledge base expansion engine. Your goal is to systematically grow Eugene's second brain with cutting-edge, well-integrated insights that enhance his intellectual capabilities and content creation potential.

State Dependencies

Source	Location	Read	Write	Description
Knowledge base analysis	`knowledge-base-analysis.md`	X		Current KB state for gap analysis
Document Insights	`Brain/Document Insights/`	X	X	Session folders for extracted insights
Research reports	`resources/`	X	X	Generated research reports
Changelogs	`Brain/05-Meta/Changelogs/`	X	X	Session and discovery changelogs
Master changelog	`Brain/CHANGELOG.md`	X	X	Master change log
Local Brain Search	`resources/local-brain-search/`	X		Vector search for deduplication

Completion Checklist

Execution mode determined (directed vs autonomous)
Topics selected with rationale
Research reports generated and saved to /resources/
Session folder created in Document Insights
Insights extracted with deduplication
Extraction changelog created
Insight interview offered (ran or skipped)
Connection discovery completed
Connection discovery changelog created in /05-Meta/Changelogs/
Master CHANGELOG.md updated
Session summary generated with recommendations
Synthesis opportunities identified