name: extract-knowledge description: Deep content analysis and insight extraction from articles, transcripts, documents, videos, PDFs. Prioritizes novelty and surprise over summaries. USE WHEN extract knowledge, extract insights, analyze this content, extract alpha, key ideas, what's important in this.
Extract Knowledge
Mode: Deep content analysis and insight extraction Input: User-provided content (articles, transcripts, documents, videos, PDFs) This is NOT web research. It analyzes content the user provides or points to.
When to Use
- User says "extract knowledge", "extract insights", "analyze this content", "extract alpha"
- User provides an article, transcript, video URL, PDF, or paste of text and wants the key ideas pulled out
- Deep analysis of a specific piece of content (not broad web research)
- User wants to capture the most important and surprising insights from something they consumed
Philosophy
Based on Claude Shannon's information theory: real information is what is different, not what is the same. This skill prioritizes novelty, surprise, and non-obvious insights over comprehensive summaries. Standard extraction catches the obvious points. This skill catches what standard extraction misses -- the subtle, the counterintuitive, the profound.
Available Tools
| Tool | Use For |
|---|---|
| WebFetch | Retrieve article/page content from a URL the user provides |
| claude-browser MCP | Retrieve content from auth-required or JS-heavy pages |
| Read | Read local files (PDFs, text files, documents) the user points to |
Workflow
Step 1: Acquire Content
Determine the content source and retrieve it:
Web URL provided:
WebFetch(url: "[user's URL]", prompt: "Extract the full article/transcript content")
Local file provided:
Read(file_path: "[user's file path]")
Text pasted directly: Use the pasted content as-is.
YouTube URL provided:
WebFetch(url: "[youtube URL]", prompt: "Extract the video transcript or description")
If WebFetch cannot get the transcript, use the claude-browser MCP to access the page and extract available content.
Auth-required or JS-heavy page: Use the claude-browser MCP server to navigate and extract content.
Step 2: Deep Thinking Analysis
Before extracting anything, engage in extended thinking across these dimensions:
- SURFACE SCAN -- What are the obvious main points?
- DEPTH PROBE -- What implications are not explicitly stated?
- CONNECTION MAP -- What unusual connections exist between ideas?
- What makes you stop and think "wait, how does THAT work?"
- What cross-domain patterns appear (same principle across biology/ML, physics/economics, human/AI)?
- What would feel personally relevant in a surprising way?
- ASSUMPTION CHALLENGE -- What conventional wisdom is being questioned?
- NOVELTY DETECTION -- What is genuinely new or surprising here?
- FRAMEWORK EXTRACTION -- What mental models or frameworks emerge?
- SUBTLE INSIGHTS -- What quiet observations carry profound weight?
- CONTRARIAN ANGLES -- What goes against common thinking?
- FUTURE IMPLICATIONS -- What does this suggest about what is coming?
- SYNTHESIS -- What are the highest-value ideas across all dimensions?
Allow thinking to wander and make unexpected connections. Question every assumption about what is "important." Prioritize novelty and surprise over comprehensiveness.
Step 3: Extract and Structure
Produce a structured extraction with these sections:
## Knowledge Extraction: [Content Title]
**Source:** [URL, file path, or "provided text"]
**Content Type:** [Article / Transcript / Video / PDF / Essay / Interview]
**Date Analyzed:** [YYYY-MM-DD]
### One-Paragraph Summary
[Dense paragraph capturing the essence -- what is this about, why does it matter, what is the core argument or contribution.]
### Key Ideas (5-10)
[The main substantive points, written clearly. These are the ideas someone would want if they only had 30 seconds.]
- [Key idea 1]
- [Key idea 2]
- ...
### Surprising Insights (5-15)
[The high-alpha extractions. Ideas that are counterintuitive, novel, or make you reconsider something. Written in approachable 8-15 word bullets.]
- [Surprising insight 1]
- [Surprising insight 2]
- ...
### Mental Models and Frameworks
[Any reusable thinking tools, frameworks, or mental models found in the content.]
- **[Framework name]:** [Brief description of the model and when to apply it]
- ...
### Actionable Recommendations
[What should someone DO based on this content? Specific, concrete actions.]
- [Action 1]
- [Action 2]
- ...
### Quotes Worth Saving
[Direct quotes that are exceptionally well-phrased, memorable, or capture a key idea perfectly. Include attribution.]
> "[Quote]" -- [Speaker/Author]
### Connections and Implications
[What does this connect to? What are the second-order effects? What domains does this apply to beyond the obvious?]
- [Connection 1]
- [Connection 2]
Step 4: Quality Check
Before delivering, verify:
- Surprising Insights are actually surprising -- if an insight is obvious or commonly known, remove it
- Key Ideas cover the substance -- someone reading only this section should understand the content
- Mental Models are reusable -- not just restated points, but genuine thinking tools
- Quotes are exact -- do not paraphrase in the quotes section
- Actionable Recommendations are specific -- "think more carefully" is not actionable; "spend 10 minutes writing down assumptions before starting" is
What Makes a Good Extraction
HIGH-VALUE signals (include these):
- Makes you stop and reconsider something you thought you knew
- Connects ideas from different domains unexpectedly
- Challenges industry consensus or common wisdom
- Reframes a familiar concept in a surprising way
- Has second-order implications not explicitly stated
- Represents a novel mental model or framework
- Captures a subtle observation with profound weight
LOW-VALUE signals (avoid these):
- Restates common knowledge everyone already knows
- Generic advice that could apply to anything ("work hard", "be curious")
- Surface-level observations without depth
- Purely factual information without insight
- Ideas that have been said many times before in the same way
Output Persistence
Extraction results are delivered inline. To persist, write to:
~/.augment/MEMORY/RESEARCH/{YYYY-MM-DD}_{content-slug}/
extraction.md # The full structured extraction
metadata.md # Source info, content type, date analyzed
Create and save automatically if the extraction exceeds 500 words or the user requests it.
Comparison to Standard Summarization
| Aspect | Standard Summary | Extract Knowledge |
|---|---|---|
| Focus | Comprehensiveness | Novelty and surprise |
| Thinking depth | Surface | Deep multi-dimensional analysis |
| Mental models | Rarely extracted | Explicitly identified |
| Surprising insights | Mixed with obvious | Separated and prioritized |
| Actionable items | Sometimes included | Always included and specific |
| Quotes | Rarely preserved | Best quotes captured |
| Quality bar | "Did I cover everything?" | "Did I find what others would miss?" |