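name: search-domain-paper
description: |
  Topic paper search and baseline comparison table generation. Use when the user says "搜一下 [主题]" ("search for [topic]"), "查找XX领域的基准论文" ("find baseline papers in field XX"), or "XX主题有哪些论文" ("what papers exist on topic XX").
  Internally uses the DeepXiv SDK to search arXiv papers, extracts open-source status, experiment datasets, and benchmark scores, and outputs a structured Markdown comparison table.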
metadata: { "openclaw": { "requires": { "bins": ["deepxiv"], "env": [] } } }
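Search Domain Paper - Topic Paper Search and Baseline Comparison Table
When to use: the user wants to search for papers on a research topic and generate a comparison table.
Trigger phrases:
- "搜一下 [主题]" ("search for [topic]")
- "查找 [主题] 的基准论文" ("find baseline papers on [topic]")
- "[主题] 有哪些论文" ("what papers are there on [topic]")
- "搜索 [主题] 论文" ("search [topic] papers")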
English requests:
- "Find recent baseline papers on [topic]"
- "Search papers on [topic]"
Goal
Turn a topic search into a structured markdown table:
- Search recent papers with `deepxiv search`
- Brief all candidates with `deepxiv paper <id> --brief`
- Keep the relevant papers, prioritizing papers with GitHub/code
- Inspect promising papers with `deepxiv paper <id> --head`
- Read experiment-related sections with `deepxiv paper <id> --section ...`
- Extract datasets, evaluation setup, and reported scores
- Write a markdown table summarizing the baselines
Default Workflow
1. Search by topic and date range
Use a broad search first. Default to the last 30 days if the user doesn't specify a date range.
deepxiv search "<topic>" --date-from <date> --limit 20
Example:
deepxiv search "continual learning robot" --date-from 2026-03-01 --limit 20
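The 30-day default can be computed rather than typed by hand. A minimal sketch, assuming a hypothetical helper `build_search_command` that only assembles the argument list (it does not run the CLI) and uses only the `--date-from` and `--limit` flags shown above:

```python
import shlex
from datetime import date, timedelta


def build_search_command(topic, today, date_from=None, limit=20, lookback_days=30):
    """Build the argv list for a `deepxiv search` call (hypothetical helper)."""
    if date_from is None:
        # No date given by the user: fall back to the last 30 days.
        date_from = today - timedelta(days=lookback_days)
    return [
        "deepxiv", "search", topic,
        "--date-from", date_from.isoformat(),
        "--limit", str(limit),
    ]


cmd = build_search_command("continual learning robot", today=date(2026, 3, 31))
print(shlex.join(cmd))  # shell-quoted form of the command
```

Passing `today` explicitly keeps the helper deterministic; in practice it would be `date.today()`.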
Default heuristics:
- Use the user’s exact topic phrase first
- Keep `--limit` high enough to avoid missing relevant papers
- If results are noisy, refine the query with close variants
Examples:
deepxiv search "agentic memory" --date-from 2026-03-01 --limit 100 --format json
deepxiv search "memory agents long-horizon" --date-from 2026-03-01 --limit 100 --format json
deepxiv search "agent memory benchmark" --date-from 2026-03-01 --limit 100 --format json
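Running several query variants tends to return overlapping papers, so candidates are worth deduplicating before the brief step. A small sketch, assuming the arXiv IDs have already been pulled out of each query's output (the JSON schema is not described here, and the IDs below are placeholders):

```python
def merge_candidate_ids(result_lists):
    """Merge ID lists from several queries, keeping first-seen order."""
    seen = set()
    merged = []
    for ids in result_lists:
        for arxiv_id in ids:
            if arxiv_id not in seen:
                seen.add(arxiv_id)
                merged.append(arxiv_id)
    return merged


# Placeholder IDs standing in for the results of three query variants.
variant_results = [
    ["2603.00001", "2603.00002"],
    ["2603.00002", "2603.00003"],
    [],
]
candidates = merge_candidate_ids(variant_results)
print(len(candidates), "unique candidates")
```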
2. Brief all candidates
For each arXiv ID, fetch:
deepxiv paper <arxiv_id> --brief
Capture:
- title
- arXiv ID
- publish date
- TLDR
- keywords
- GitHub URL
- PDF/source URL
This is the screening step. Do not read full sections yet.
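One way to hold the screening fields is a small record type. This is only a sketch: the class name `PaperBrief` and its field names are assumptions for illustration, not the shape of the actual `--brief` output, and the values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PaperBrief:
    """Screening record for one paper (hypothetical structure)."""
    arxiv_id: str
    title: str
    publish_date: str
    tldr: str = ""
    keywords: List[str] = field(default_factory=list)
    github_url: Optional[str] = None   # None when no code link was found
    source_url: Optional[str] = None   # PDF/source URL

    @property
    def has_code(self) -> bool:
        return bool(self.github_url)


# Placeholder values, not a real paper.
brief = PaperBrief(
    arxiv_id="0000.00000",
    title="Example Paper",
    publish_date="2026-03-15",
    keywords=["agent memory"],
)
print(brief.has_code)
```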
3. Filter and prioritize
Keep papers that are actually about the topic, not just adjacent terms.
Prioritize:
- papers directly centered on the topic
- empirical papers over purely conceptual ones
- papers with GitHub/code
- benchmark or comparison papers
- papers with clear experiment sections
De-prioritize:
- pure opinion or survey papers, unless the user asked for surveys
- papers with no clear evaluation evidence
- papers only loosely related to the topic
If the list is still large, keep a primary set and a secondary set:
- Primary: strongest and most relevant baselines
- Secondary: adjacent or weaker evidence
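The prioritization above could be sketched as a simple additive score. The weights, the threshold, and the `Candidate` fields are all assumptions chosen for illustration; the judgments behind each flag still come from reading the briefs.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    arxiv_id: str
    on_topic: bool        # directly centered on the topic, not just adjacent terms
    empirical: bool       # has clear experiments or evaluation evidence
    has_code: bool        # GitHub/code URL found in the brief
    is_benchmark: bool    # benchmark or comparison paper
    is_survey: bool = False


def priority(cand, surveys_requested=False):
    """Hypothetical additive score; the weights are illustrative only."""
    score = 3 * cand.on_topic + 2 * cand.empirical + 2 * cand.has_code + cand.is_benchmark
    if cand.is_survey and not surveys_requested:
        score -= 2
    return score


def split_primary_secondary(candidates, threshold=5, surveys_requested=False):
    """Primary: on-topic and at or above the threshold. Secondary: everything else."""
    ranked = sorted(candidates, key=lambda x: priority(x, surveys_requested), reverse=True)
    primary = [x for x in ranked if x.on_topic and priority(x, surveys_requested) >= threshold]
    secondary = [x for x in ranked if x not in primary]
    return primary, secondary


pool = [
    Candidate("A", on_topic=True, empirical=True, has_code=True, is_benchmark=False),
    Candidate("B", on_topic=True, empirical=False, has_code=False, is_benchmark=False, is_survey=True),
    Candidate("C", on_topic=False, empirical=True, has_code=True, is_benchmark=True),
    Candidate("D", on_topic=True, empirical=True, has_code=False, is_benchmark=True),
]
primary, secondary = split_primary_secondary(pool)
print([x.arxiv_id for x in primary], [x.arxiv_id for x in secondary])
```

Requiring `on_topic` for the primary set keeps a well-evaluated but adjacent paper (like `C` here) in the secondary set.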
4. Inspect paper structure
For retained papers:
deepxiv paper <arxiv_id> --head
Use --head to find experiment-bearing sections such as:
- Experiments
- Evaluation
- Results
- Benchmark
- Main Results
- Analysis
Also capture:
- abstract
- total token count
- section names
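Picking experiment-bearing sections from the section names can be approximated with a keyword match. A minimal sketch, assuming the section names are available as a plain list of strings; the keyword tuple mirrors the list above and the sample section names are made up:

```python
EXPERIMENT_KEYWORDS = ("experiment", "evaluation", "result", "benchmark", "analysis")


def experiment_sections(section_names):
    """Return the section names that look experiment-related, in original order."""
    return [
        name for name in section_names
        if any(keyword in name.lower() for keyword in EXPERIMENT_KEYWORDS)
    ]


sections = ["Introduction", "Related Work", "Method", "Experiments", "Main Results", "Conclusion"]
print(experiment_sections(sections))
```

Substring matching is deliberately loose so that titles such as "Main Results" or "Experimental Setup" are caught; it may also pick up false positives, so the list is a starting point rather than a final selection.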
5. Read only experiment-relevant sections
Once the right sections are known, read only those:
deepxiv paper <arxiv_id> --section Experiments
deepxiv paper <arxiv_id> --section Evaluation
deepxiv paper <arxiv_id> --section Results
Section selection guidance:
- Start with `Experiments` or `Evaluation`
- Read `Results` if the metrics are not clear
- Read `Introduction` only if the task setup is still ambiguous
- Read `Appendix` only if benchmark details are missing from the main paper
Avoid reading the entire paper unless the user explicitly asks for it.
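The section selection guidance can be read as a small decision procedure. The sketch below is one possible encoding, assuming exact section names and three hypothetical flags that record what is still unclear after each read; it only builds the command and does not execute it.

```python
def next_section(available, already_read, metrics_clear, setup_clear, details_complete):
    """Pick the next section to read, or None when nothing more is needed."""
    def unread(name):
        return name in available and name not in already_read

    # Start with Experiments or Evaluation.
    if not any(name in already_read for name in ("Experiments", "Evaluation")):
        for name in ("Experiments", "Evaluation"):
            if unread(name):
                return name
    if not metrics_clear and unread("Results"):
        return "Results"
    if not setup_clear and unread("Introduction"):
        return "Introduction"
    if not details_complete and unread("Appendix"):
        return "Appendix"
    return None


def section_command(arxiv_id, section):
    """Argv list for reading a single section."""
    return ["deepxiv", "paper", arxiv_id, "--section", section]


available = ["Introduction", "Experiments", "Results", "Appendix"]
first = next_section(available, [], metrics_clear=False, setup_clear=True, details_complete=True)
print(section_command("2603.21489", first))
```

Returning `None` once everything is clear is what keeps the agent from drifting into reading the whole paper.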
Extraction Targets
For each retained paper, try to extract:
- Title
- arXiv ID
- Paper URL
- GitHub/code URL
- Open-source status: `Yes`, `No`, or `Unknown`
- Main task
- Evaluation datasets / benchmarks
- Key metrics
- Best reported scores
- Notes on experimental setting
If exact scores are not clearly available from the inspected sections:
- leave the score field as `Not clearly stated`
- do not invent or infer a number
If datasets are only partially visible:
- include the datasets you can verify
- mention that the list may be incomplete
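These two fallback rules are easy to enforce in code when formatting table cells. A minimal sketch with hypothetical helpers `score_field` and `dataset_field`; the metric and dataset names are placeholders, and scores are kept as strings so that reported values are never reformatted:

```python
NOT_STATED = "Not clearly stated"


def score_field(scores):
    """Format reported scores, or fall back when none were verified.

    `scores` maps a metric name to the value exactly as reported (a string).
    """
    if not scores:
        return NOT_STATED
    return "; ".join(f"{metric}: {value}" for metric, value in scores.items())


def dataset_field(verified, may_be_incomplete=False):
    """List only verified datasets and flag a possibly partial list."""
    text = ", ".join(verified)
    if may_be_incomplete:
        text = (text + " " if text else "") + "(list may be incomplete)"
    return text


print(score_field({}))
print(dataset_field(["DatasetA", "DatasetB"], may_be_incomplete=True))
```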
Markdown Output
Output location: {SEARCH_RESULTS_PATH}
Recommended filename format:
<topic>-baseline-table-YYYY-MM-DD.md
For example:
continual-learning-robot-baseline-table-2026-04-12.md
If the directory does not exist, create it first:
mkdir -p {SEARCH_RESULTS_PATH}
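The filename convention and directory creation can be sketched together. The slug rule below (lowercase, runs of non-word characters collapsed to a single hyphen) is an assumption inferred from the one example above, and a temporary directory stands in for `{SEARCH_RESULTS_PATH}`:

```python
import re
import tempfile
from datetime import date
from pathlib import Path


def baseline_table_path(results_dir, topic, day):
    """Build <topic>-baseline-table-YYYY-MM-DD.md under results_dir."""
    # Assumed slug rule: lowercase, non-word runs become one hyphen.
    slug = re.sub(r"[\W_]+", "-", topic.strip().lower()).strip("-")
    return Path(results_dir) / f"{slug}-baseline-table-{day.isoformat()}.md"


with tempfile.TemporaryDirectory() as tmp:
    # `tmp` stands in for {SEARCH_RESULTS_PATH}.
    out_path = baseline_table_path(
        Path(tmp) / "search-results", "continual learning robot", date(2026, 4, 12)
    )
    out_path.parent.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    out_path.write_text("# Continual Learning Robot Baseline Table\n", encoding="utf-8")
    print(out_path.name)  # continual-learning-robot-baseline-table-2026-04-12.md
```

Because `\W` is Unicode-aware in Python 3, a topic written in Chinese keeps its characters in the slug instead of collapsing to an empty string.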
Recommended output structure:
# Agentic Memory Baseline Table
Topic: agentic memory
Date range: 2026-03-01 to 2026-04-01
Search source: deepxiv
## Summary
- Number of search results
- Number of relevant papers retained
- Number with public code
- Main recurring datasets or benchmark families
## Baseline Table
| Title | arXiv | URL | Open Source | Code URL | Datasets / Benchmarks | Metrics / Scores | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ... | ... | ... | ... | ... | ... | ... | ... |
## Inclusion Notes
- Which papers were excluded and why
- Which rows are based only on `--brief`
- Which rows were verified through `--head` and experiment/result sections
## Observations
- Common evaluation datasets
- Which papers appear strongest
- Where the benchmark story is still fragmented
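Rendering the baseline table from per-paper records keeps the columns consistent with the structure above. A minimal sketch, assuming each row is a dict keyed by column name (a hypothetical representation); the row values are placeholders, not a real paper:

```python
COLUMNS = [
    "Title", "arXiv", "URL", "Open Source", "Code URL",
    "Datasets / Benchmarks", "Metrics / Scores", "Notes",
]


def cell(value):
    """Escape characters that would break a markdown table cell."""
    return str(value).replace("|", "\\|").replace("\n", " ")


def render_table(rows):
    """Render rows (dicts keyed by column name) as a markdown table."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "| " + " | ".join("---" for _ in COLUMNS) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(cell(row.get(col, "")) for col in COLUMNS) + " |")
    return "\n".join(lines)


# Placeholder row for illustration only.
rows = [{
    "Title": "Example Paper",
    "arXiv": "0000.00000",
    "URL": "https://arxiv.org/abs/0000.00000",
    "Open Source": "Unknown",
    "Code URL": "",
    "Datasets / Benchmarks": "DatasetA (list may be incomplete)",
    "Metrics / Scores": "Not clearly stated",
    "Notes": "Based only on --brief",
}]
table = render_table(rows)
print(table)
```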
Writing Rules
- Prefer verified facts over broad summaries
- Separate "paper is relevant" from "paper has strong benchmark evidence"
- Be explicit when a row is missing score details
- Mark open-source status conservatively
- Keep the table compact but useful
- Add short notes when comparisons are not apples-to-apples
Decision Rules
- Always start with search and brief
- Prefer papers with GitHub when deciding which ones to inspect first
- Use `--head` before `--section`
- Read only the sections needed to recover datasets and scores
- If the topic is broad, tell the user when the table mixes multiple subtask types
Minimal Example
deepxiv search "agentic memory" --date-from 2026-03-01 --limit 100 --format json
deepxiv paper 2603.21489 --brief
deepxiv paper 2603.21489 --head
deepxiv paper 2603.21489 --section Experiments
Then write a markdown table with:
- title
- paper URL
- open-source status
- code URL
- datasets / benchmarks
- metrics / scores
- short notes on what was actually verified