name: research-task description: Execute a research task using compositional workflow planning. Characterizes the task, selects an approach from a library of research strategies, executes it, and self-evaluates output quality. argument-hint: [research question or topic] user-invocable: true allowed-tools: Read, Glob, Grep, Bash, WebFetch, WebSearch
Research this topic: $ARGUMENTS
Workflow
Phase 1: Task Characterization
Rate the research task on each dimension (1-5):
| Dimension | 1 (Low) | 5 (High) |
|---|---|---|
| Scope | Point question (single fact) | Frontier mapping (state of entire field) |
| Domain structure | Single field, established methods | Distant cross-domain analogy |
| Evidence type | Empirical data, measurements | Theoretical arguments, first-principles |
| Time horizon | What's true right now | What could become true (speculative) |
| Fidelity | Ballpark / directional | Rigorous / publication-grade |
Phase 1b: Workflow Library Query
Query the MAP-Elites workflow library for strategies matching this task's coordinates:
sqlite3 research-workflows/workflows.db "SELECT id, name, archetype, fitness_score FROM workflows WHERE status IN ('active', 'seed') ORDER BY fitness_score DESC LIMIT 10;"
For more targeted retrieval, filter by behavioral region:
sqlite3 research-workflows/workflows.db "SELECT w.id, w.name, ws.name as step_name, ws.description, ws.rationale FROM workflows w JOIN workflow_steps ws ON w.id = ws.workflow_id WHERE w.archetype = '<archetype>' AND w.status IN ('active', 'seed') ORDER BY w.fitness_score DESC, ws.step_order;"
Use retrieved workflows to inform Phase 2 strategy selection. Prefer workflows with high fitness scores and usage counts.
Phase 2: Strategy Selection
Based on the characterization and library query, select the most appropriate research archetype:
Exploratory (high scope, moderate fidelity)
- Quick landscape scan — 15-minute overview of a new area
- Deep literature review — comprehensive survey of a mature field
- Trend detection — what's gaining momentum
- White space identification — what isn't being worked on that should be
Confirmatory (low scope, high fidelity)
- Fact-checking pipeline — verify specific claims with primary sources
- Consensus mapping — what do experts agree/disagree on
- Replication check — has this finding held up under scrutiny
Analytical (moderate scope, high evidence)
- Compare and contrast — systematic comparison of N approaches
- Root cause analysis — why did X happen / why doesn't X work
- Cost-benefit analysis — should we do X given tradeoffs
- Forecasting — given current trends, what's likely
Generative (high domain structure, moderate time horizon)
- Cross-domain transfer — find solutions from field B for problems in field A
- First-principles derivation — reason from fundamentals, not literature
- Synthesis — combine N existing ideas into a novel framework
- Adversarial stress-test — find the strongest objections to X
Applied (low time horizon, high fidelity)
- Technical feasibility assessment — can X be built with current tools
- Build-vs-buy analysis — build or use existing solution
- Implementation planning — how to execute X given constraints
Phase 3: Compositional Planning
Don't just pick one strategy. Identify which sub-techniques from multiple strategies suit this specific task:
- Select 2-3 relevant archetypes from Phase 2
- For each, identify the steps that apply to this task
- Note why each step exists (what it optimizes for)
- Compose a novel workflow combining the best elements
- Record which elements were borrowed from which archetype
Phase 4: Execution
Execute the composed workflow using available tools:
- WebSearch — broad web queries
- Exa (
web_search_exa,web_search_advanced_exa) — semantic/neural search, cross-domain retrieval - WebFetch — deep read of specific URLs
- Firecrawl (
firecrawl_scrape,firecrawl_search) — structured web scraping - GitHub — repository search, code search, trending
- Read/Glob/Grep — local codebase and documentation
Phase 5: Self-Evaluation
Score your output on each dimension (0-1):
- Coherence — Does it tell a consistent story? Internal contradictions?
- Grounding — Are claims supported by retrieved evidence?
- Compression — Can key findings be stated concisely? (If not, thesis may be unclear)
- Surprise — Did anything non-obvious surface? (Confirming the known has low value)
- Actionability — Does the output enable a decision or next step?
Report the scores honestly. Low scores on surprise or actionability are useful signals — they indicate the research question may need reframing.
Phase 5b: Library Update
Log this execution and update the workflow library:
# Log the execution
sqlite3 research-workflows/workflows.db "INSERT INTO executions (task_description, task_scope, task_domain_structure, task_evidence_type, task_time_horizon, task_fidelity, score_coherence, score_grounding, score_compression, score_surprise, score_actionability, score_composite) VALUES ('<description>', <scope>, <domain>, <evidence>, <horizon>, <fidelity>, <coherence>, <grounding>, <compression>, <surprise>, <actionability>, <composite>);"
If composite score exceeds the current occupant of this behavioral region, the composed workflow becomes a candidate for library insertion.
Phase 6: Output
## Research: [Topic]
### Characterization
Scope: [1-5] | Domain: [1-5] | Evidence: [1-5] | Horizon: [1-5] | Fidelity: [1-5]
### Strategy
[Which archetypes were combined and why]
### Findings
[Structured findings — use headers, tables, or lists as appropriate]
### Quality Assessment
Coherence: [0-1] | Grounding: [0-1] | Compression: [0-1] | Surprise: [0-1] | Actionability: [0-1]
### Key Takeaway
[Single paragraph — the compressed thesis]
### Recommended Next Steps
[What to do with these findings]