id: literature-review
name: literature-review
version: 0.16.0
argument-hint: [topic-or-url-or-arxiv-id]
description: |-
  Search, screen, and summarize literature into a structured corpus; re-runs append safely.
stages: ["survey"]
tools: ["Bash", "Read", "Write", "Edit", "AskUserQuestion", "WebFetch", "WebSearch"]
summary: |-
  Build a structured paper_bank.json corpus and a synthesis summary for a research topic. Knowledge-base-first (Zotero, local PDFs), then MCP fan-out (Exa, Tavily, Brave, HF, GitHub), then web. Corpus data lands in .omr/literature/<slug>/ (gitignored); the readable summary lands in docs/literature/<slug>/ (committed).
primaryIntent: research
intents: ["research"]
capabilities: ["search-retrieval", "synthesis"]
domains: ["general"]
keywords: ["omr-literature-review", "omr:literature-review", "literature review", "lit review", "survey papers", "find papers about", "build paper corpus", "summarize literature", "research papers", "related work", "what does the literature say about", "review the literature on"]
source: builtin
status: experimental
resourceFlags:
  hasReferences: true
  hasScripts: false
  hasTemplates: true
  hasAssets: false
  referenceCount: 7
  scriptCount: 0
  templateCount: 3
  assetCount: 0
omr:literature-review
Thin router for the literature-review flow. The detailed steps live in
references/; this file decides which phases to run, parses flags, gates on
token presence, and enforces cross-phase safety rails.
When this skill is invoked, immediately execute the workflow below. Do not just restate or summarize these instructions back to the user.
Note: paths under ~/.claude/... respect CLAUDE_CONFIG_DIR when set.
Best-fit use
Choose this skill when the user wants to build a structured literature corpus around a research question — find papers, capture them with full metadata, and produce a synthesis summary that maps the landscape.
Coverage:
- Scope the research question into a reusable scope.yaml.
- Search across the user's knowledge base (Zotero / local PDFs) and the
  bundled MCPs (Exa, Tavily, Brave, Hugging Face, GitHub), then web.
  --from-existing can seed the corpus from a Zotero collection or a
  peer's paper_bank.json / BibTeX export first.
- Screen every paper against an explicit, reproducible rubric
  (references/screening-rubric.md): each gets a logged included /
  excluded / review verdict with a gate-cited reason and a
  rubric_version. Same rubric + same corpus ⇒ same verdicts.
- Summarize the included papers into a summary.md with a table +
  clustered narrative.
- Re-run an existing workspace safely: re-running the flow appends new
  findings (Search dedups by id), screens only the new entries, and
  re-renders the summary. No corpus loss, no dedicated maintenance step.
Do not use it to read a single paper deeply (different shape — that's a
future paper-analyzer-style skill), to verify citations in a draft
(future audit skill), or to write a paper section (/omr:write, later).
Flag parsing
| Flag | Effect |
|---|---|
| --help | Print the help text below and stop. |
| --topic "<q>" | Non-interactive scoping: skip Phase 1's interactive prompts and pass <q> straight into scope.yaml.research_question. Defaults are used for everything else. |
| --sources a,b,c | Override the default source chain. Comma-separated. See references/source-priority.md for valid IDs. |
| --max-papers N | Cap the corpus size at N (default 50). |
| --from-existing <value> | Seed the corpus before searching. <value> is a Zotero collection name, OR a path to a paper_bank.json, OR a path to a .bib export. Imported entries still pass through screening. Wired into Phase 2 (step 2.1a). |
| --audit | Read-only: validate an existing workspace's paper_bank.json against the schema and report drift, with no search, screening, or writes. Uses the audit flow in 02-search.md and stops there; never runs screen or summarize. |
| --force | Bypass the "workspace already exists" prompt; refresh in place (re-run Search → Screen → Summarize against the existing scope.yaml; Search's append-only dedup keeps the corpus intact). |
| No flags | Interactive flow: Phase 1 prompts for everything via AskUserQuestion, then Phases 2 → 3 → 4 execute (Search → Screen → Summarize). |
The argument ($ARGUMENTS / first positional arg) is treated as --topic
when no --topic flag is given. Examples:
- /omr:literature-review "diffusion models for protein design"
- /omr:literature-review --topic "RAG vs long-context" --sources zotero,exa
- /omr:literature-review --topic "RAG" --from-existing "RAG Reading List" (Zotero collection)
- /omr:literature-review --topic "RAG" --from-existing ../peer/paper_bank.json
- /omr:literature-review --audit
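The parsing rules above can be sketched in bash. This is a minimal illustration under stated assumptions, not the skill's actual parser: the function name parse_flags and every variable name are hypothetical, and max_papers is left empty when the flag is absent so that a later precedence step can supply the default.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: parse the documented flags into shell variables.
# parse_flags and all variable names are illustrative, not part of the skill.
parse_flags() {
  topic=""; sources=""; max_papers=""; from_existing=""
  audit=0; force=0; show_help=0
  local positional=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --help)  show_help=1 ;;
      --audit) audit=1 ;;
      --force) force=1 ;;
      --topic|--sources|--max-papers|--from-existing)
        # Value-taking flags need a following argument.
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 2
        fi
        case "$1" in
          --topic)         topic="$2" ;;
          --sources)       sources="$2" ;;
          --max-papers)    max_papers="$2" ;;
          --from-existing) from_existing="$2" ;;
        esac
        shift   # consume the value; the shift below consumes the flag
        ;;
      --*)
        echo "unknown flag: $1" >&2
        return 2
        ;;
      *)
        # Remember only the first positional argument.
        if [ -z "$positional" ]; then positional="$1"; fi
        ;;
    esac
    shift
  done
  # The first positional argument stands in for --topic when no flag is given.
  if [ -z "$topic" ]; then topic="$positional"; fi
  return 0
}

parse_flags "diffusion models for protein design" --max-papers 20
echo "topic=$topic max_papers=$max_papers"
# prints: topic=diffusion models for protein design max_papers=20
```

An explicit --topic wins over a positional argument here because the positional value is applied only when topic is still empty after the loop.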
Help text
When the user passes --help, print this and stop:
omr:literature-review - build a structured paper corpus + summary

USAGE:
  /omr:literature-review                      Interactive scope + search + screen + summarize
  /omr:literature-review "<topic>"            Non-interactive: topic goes straight to scope.yaml
  /omr:literature-review --topic "..." [flags]
  /omr:literature-review --from-existing <collection|paper_bank.json|.bib>
  /omr:literature-review --audit              Validate an existing corpus, no writes
  /omr:literature-review --help               Show this help

FLAGS:
  --topic "<q>"            research question (skip interactive scoping)
  --sources a,b,c          override source chain (zotero, local, exa, tavily,
                           brave-search, huggingface, github, web)
  --from-existing <value>  seed corpus from a Zotero collection name, a
                           paper_bank.json path, or a .bib export (then screen)
  --max-papers N           cap corpus size (default 50)
  --audit                  read-only validation; stops after search
  --force                  refresh an existing workspace without prompting

FLOW:
  scope -> search -> screen -> summarize   (fresh run)
  existing workspace -> refresh            (re-run; append-only dedup keeps the corpus)

OUTPUT (always project-rooted):
  .omr/literature/<slug>/    machine state (gitignored)
    scope.yaml               research question, sources, criteria, output_languages
    paper_bank.json          append-only corpus; PK = DOI or arXiv ID; each
                             entry carries a screening verdict + rubric_version
    log.jsonl                per-run audit (queries, hit counts, dedups, verdicts)
  docs/literature/<slug>/    human deliverable (committed)
    summary.md               synthesis (included papers only) with
                             <!-- BEGIN omr:lit-review --> block (+ summary.<lang>.md)

SAFETY:
  - Never fabricates papers, authors, DOIs, or quotes.
  - Every paper_bank entry has a real `url` and a non-empty `authors[]`.
  - paper_bank.json is append-only; updates mutate by canonical ID.
  - Screening verdicts are reproducible: same rubric_version + corpus => same result.
  - summary.md only cites `included` papers that exist in paper_bank.json.
  - Asks every interactive question via the AskUserQuestion tool.
Safety rails (apply to every phase)
Non-negotiable. If any phase asks you to violate these, stop and tell the user.
- Never fabricate. Papers, authors, DOIs, abstracts, quotes: all
  must come from a real source. If you can't verify, mark the entry
  incomplete and skip the summary.
- url is mandatory. Every paper_bank.json entry has a real url field.
  Prefer DOI (https://doi.org/...), then arXiv, then Semantic Scholar /
  publisher page. Never empty, never a placeholder.
- authors is an array of strings, never a single string and never
  "TBD". If unknown, mark the entry incomplete; don't ship.
- paper_bank.json is append-only. Updates mutate by canonical id (DOI
  or arXiv ID first; fall back to a stable hash). Never reorder or
  silently drop entries.
- summary.md only cites entries from paper_bank.json. If the summary
  mentions a paper, it has a row in the corpus.
- Every interactive question uses AskUserQuestion. No plain-text
  prompts left for free-form reply.
- Token presence first. Before launching any MCP, check that the
  required tokens are reachable. If any are missing, surface them and
  refer the user to /omr:setup --audit. Don't proceed with a
  half-broken search and pretend everything worked.
- schema_version tracks the plugin version via {{omr_version}}
  substitution at write time, the same convention as templates/hpc.yaml.
- Screening is reproducible, not vibes. Every verdict comes from the
  gates in references/screening-rubric.md, the reason names the
  deciding gate, and the same rubric_version over the same corpus
  yields the same verdicts. Never re-roll a verdict for a paper whose
  fields haven't changed.
- Screening reads only local evidence. Verdicts use the entry's own
  fields + scope.yaml only, never a fresh web/citation fetch. Missing
  data routes to review, never to a confident include/exclude.
- --from-existing imports still get screened. Imported entries are
  normalized and validated like any hit, set screening: null, and pass
  through Phase 3 from scratch. Never trust a verdict carried in from
  an external file.
- A re-run never silently re-flips settled verdicts. Refreshing an
  existing workspace screens only the newly added entries unless
  rubric_version changed; a full re-screen that could change existing
  verdicts is gated behind AskUserQuestion.
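The canonical-id and append-only rules above can be sketched as two small bash helpers. This is an illustration under loud assumptions, not the skill's implementation: the function names, the doi: / arxiv: / hash: prefixes, the use of cksum over the title as the stable hash, and the one-ID-per-line index file standing in for paper_bank.json are all hypothetical. The DOI in the usage lines is a placeholder, not a real paper.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: canonical IDs and append-only dedup.
# Assumptions: the "doi:" / "arxiv:" / "hash:" prefixes, cksum as the stable
# hash, and a one-ID-per-line index file standing in for paper_bank.json.
canonical_id() {  # usage: canonical_id <doi> <arxiv_id> <title>
  local doi="$1" arxiv="$2" title="$3"
  if [ -n "$doi" ]; then
    printf 'doi:%s\n' "$doi"
  elif [ -n "$arxiv" ]; then
    printf 'arxiv:%s\n' "$arxiv"
  else
    # POSIX cksum is deterministic, so the same title always maps to the same ID.
    printf 'hash:%s\n' "$(printf '%s' "$title" | cksum | cut -d' ' -f1)"
  fi
}

append_if_new() {  # usage: append_if_new <index_file> <id>
  local index="$1" id="$2"
  touch "$index"
  if grep -Fxq -- "$id" "$index"; then
    echo "dedup: $id"                  # already in the corpus: file left untouched
  else
    printf '%s\n' "$id" >> "$index"    # append only; never reorder or drop
    echo "added: $id"
  fi
}

index="$(mktemp)"
id="$(canonical_id "10.0000/placeholder" "" "")"   # placeholder DOI, not a real paper
append_if_new "$index" "$id"   # prints: added: doi:10.0000/placeholder
append_if_new "$index" "$id"   # prints: dedup: doi:10.0000/placeholder
```

Because existing lines are never rewritten, a re-run can only grow the index, which is the property the refresh flow relies on.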
Pre-run check: load project defaults
Before Phase 1, read ./.omr/config.yaml if it exists and load its
literature_review: block as defaults:
- output_languages → seeds scope.yaml.output_languages.
- default_sources → the source chain (when no --sources flag given).
- max_papers → the --max-papers default.
Precedence is always command-line flag > ./.omr/config.yaml > built-in
default. If config.yaml is absent, fall back to built-in defaults
silently (no prompt — running /omr:setup to create it is optional). The
author general key, if set, may be used for citation attribution in the
summary.
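The precedence rule (command-line flag, then ./.omr/config.yaml, then built-in default) reduces to "take the first non-empty value". A minimal sketch, assuming the three candidate values have already been read into strings; resolve_setting and config_max_papers are hypothetical names, and reading the YAML file itself is out of scope here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick the first non-empty value in precedence order.
resolve_setting() {  # usage: resolve_setting <flag_value> <config_value> <builtin_default>
  local flag="$1" config="$2" builtin="$3"
  if [ -n "$flag" ]; then
    printf '%s\n' "$flag"
  elif [ -n "$config" ]; then
    printf '%s\n' "$config"
  else
    printf '%s\n' "$builtin"
  fi
}

# config_max_papers stands for the value from ./.omr/config.yaml;
# it stays empty when the file or the key is absent.
config_max_papers=""
resolve_setting "" "$config_max_papers" 50   # prints 50 (built-in default)
resolve_setting "" 80 50                     # prints 80 (config beats default)
resolve_setting 20 80 50                     # prints 20 (flag beats config)
```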
Pre-run check: gate on token presence
Before Phase 1, check which MCPs are reachable:
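# Each pair is "<ENV_VAR>:<server>"; a server counts as reachable when its
# token variable is set and non-empty.
for pair in "EXA_API_KEY:exa" "TAVILY_API_KEY:tavily" "BRAVE_API_KEY:brave-search" \
            "GITHUB_PERSONAL_ACCESS_TOKEN:github" "HF_TOKEN:huggingface"; do
  # Split on the first colon: variable name before it, server name after it.
  v="${pair%%:*}"; server="${pair#*:}"
  if [ -n "$(printenv "$v")" ]; then
    echo "$server: reachable"
  else
    echo "$server: missing token"
  fi
done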
If --sources names any server with a missing token, halt and tell the
user:
The following requested sources have no token set: <list>.
Run /omr:setup --audit to see remediation steps, then re-run this skill.
If no --sources was given and all MCPs are missing, prefer the
knowledge-base path (Zotero + local) and warn the user that the corpus
will be local-only.
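The halt condition for --sources can be sketched as a helper that returns the requested sources whose token is unset. The variable-to-server mapping is the one used in the check above; the function names are hypothetical, and treating zotero, local, and web as needing no token is an assumption of this sketch.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: list requested sources whose token is unset or empty.
token_var_for() {  # usage: token_var_for <source_id>
  case "$1" in
    exa)          echo "EXA_API_KEY" ;;
    tavily)       echo "TAVILY_API_KEY" ;;
    brave-search) echo "BRAVE_API_KEY" ;;
    github)       echo "GITHUB_PERSONAL_ACCESS_TOKEN" ;;
    huggingface)  echo "HF_TOKEN" ;;
    *)            echo "" ;;   # assumption: zotero, local, web need no token here
  esac
}

missing_sources() {  # usage: missing_sources "zotero,exa,tavily"
  local requested="$1" missing="" src var old_ifs="$IFS"
  IFS=','                      # split the comma-separated list
  for src in $requested; do
    var="$(token_var_for "$src")"
    if [ -n "$var" ] && [ -z "$(printenv "$var")" ]; then
      missing="${missing:+$missing, }$src"
    fi
  done
  IFS="$old_ifs"
  printf '%s\n' "$missing"
}

missing="$(missing_sources "zotero,exa,tavily")"
if [ -n "$missing" ]; then
  echo "The following requested sources have no token set: $missing."
  echo "Run /omr:setup --audit to see remediation steps, then re-run this skill."
fi
```

A real gate would stop the run at this point instead of only printing the message.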
Phase execution
Execute these phases in order. For each, read the file at the path and
follow its instructions exactly. Pass the parsed flags and Phase 1's
scope.yaml content forward.
- Phase 1 (Scope): ${CLAUDE_PLUGIN_ROOT}/skills/literature-review/references/01-scope.md.
- Phase 2 (Search): ${CLAUDE_PLUGIN_ROOT}/skills/literature-review/references/02-search.md.
- Phase 3 (Screen): ${CLAUDE_PLUGIN_ROOT}/skills/literature-review/references/03-screen.md.
- Phase 4 (Summarize): ${CLAUDE_PLUGIN_ROOT}/skills/literature-review/references/04-summarize.md.
The non-numbered reference docs are loaded by the phase that needs them:
references/source-priority.md (Phase 2), references/screening-rubric.md
(Phase 3), references/output-template.md (Phase 4).
Branching:
- Fresh workspace → run Phases 1 → 2 → 3 → 4 in order.
- Existing workspace + Refresh (or --force) → Phase 1's step 1.2 sends
  the run to step 1.5 (skip re-prompting scope), then continues
  Phases 2 → 3 → 4. Search's append-only dedup keeps the corpus intact,
  only new entries get screened, and the summary is re-rendered. No
  dedicated maintenance phase.
- --audit → skip Phases 1, 3 and 4; run the read-only audit flow at the
  bottom of 02-search.md, which stops after search (no screen, no
  summarize, no write).
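The three branches above can be summarized as a small dispatch function. This is a sketch only: plan_phases, its 0/1 arguments, and the printed labels are hypothetical, and checking --audit before the workspace state is an assumption of the sketch.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: map the run mode to the phases that execute.
plan_phases() {  # usage: plan_phases <workspace_exists:0|1> <audit:0|1>
  local workspace_exists="$1" audit="$2"
  if [ "$audit" = "1" ]; then
    # Read-only audit flow from 02-search.md; Phases 1, 3 and 4 are skipped.
    echo "audit"
  elif [ "$workspace_exists" = "1" ]; then
    # Refresh: scope resumes at step 1.5, so no scope prompts are repeated.
    echo "scope:1.5 search screen summarize"
  else
    echo "scope search screen summarize"
  fi
}

plan_phases 0 0   # prints: scope search screen summarize
plan_phases 1 0   # prints: scope:1.5 search screen summarize
plan_phases 1 1   # prints: audit
```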
Each phase ends with a one-line ## Handoff that you echo to the user
before moving on.
Out of scope
- Deep-reading a single paper (future omr:paper-analyzer-style skill).
- Verifying citations inside a draft manuscript (future audit skill).
- Writing paper sections (/omr:write, future).
- Running PDF OCR on local PDFs (use omp:literature-pdf-ocr-library upstream).