search-papers

star 16

Search for academic papers using Semantic Scholar, OpenReview, OpenAlex, and CrossRef. Use when you need to find related work, check novelty, gather citations, download/read full papers, or get BibTeX.

findalexli By findalexli schedule Updated 2/22/2026

name: search-papers description: Search for academic papers using Semantic Scholar, OpenReview, OpenAlex, and CrossRef. Use when you need to find related work, check novelty, gather citations, download/read full papers, or get BibTeX. argument-hint: "[query or topic]"

Academic Paper Search

Search query: $ARGUMENTS

Four APIs, each with a clear role:

API Role Auth
Semantic Scholar (S2) Search, citation graph, sort by impact $S2_API_KEY header (optional but recommended)
OpenAlex Expanded search, OA PDF links, metadata $OPENALEX_API_KEY query param (free, $1/day budget)
OpenReview Peer reviews, BibTeX for ML venues None needed for public data
CrossRef BibTeX for any paper by DOI None (use mailto for polite pool)

Important: Source .env before API calls:

set -a; source .env; set +a  # loads S2_API_KEY, OPENALEX_API_KEY

For detailed API reference with all endpoints and parameters, see reference.md.


Decision Guide: Which API When?

I want to... Use
Find papers on a topic S2 keyword search (/paper/search) or OpenAlex (/works?search=...)
Find most-cited papers S2 bulk search (/paper/search/bulk?sort=citationCount:desc)
Look up a specific paper S2 paper lookup (by ArXiv ID, DOI, or S2 ID)
Find exact title match S2 title match (/paper/search/match)
Get citation graph S2 citations + references endpoints
Resolve multiple papers at once S2 batch POST (up to 500 IDs)
Read full text of a paper See "Reading Full Papers" below
Read peer reviews + scores OpenReview (/notes?forum=<id>)
Browse a venue's papers OpenReview (/notes?content.venueid=...)
Get BibTeX S2 citationStyles field (fastest), or CrossRef via dx.doi.org (most reliable)

Reading Full Papers

You must read the full text of the 3-5 most important papers. Abstracts alone are not sufficient for understanding methodology, experimental details, or making accurate comparisons.

Always download papers to literature/ and record notes in literature/README.md:

mkdir -p literature

Use this priority chain:

Priority 1: Paper has an arXiv ID → ar5iv HTML (best quality, most reliable)

Most ML/AI papers are on arXiv. Check the externalIds.ArXiv field from S2 results.

Use WebFetch on the ar5iv HTML version — this preserves tables, math, and structure far better than PDF-to-text extraction:

WebFetch(url="https://ar5iv.labs.arxiv.org/html/1706.03762", prompt="Extract the full methodology, experimental setup, and key results")

ar5iv converts arXiv LaTeX sources to HTML. The content is structured, tables are preserved, and WebFetch can summarize exactly what you need in a single call. You can make multiple WebFetch calls with different prompts to extract different sections.

Also download the PDF to literature/ for figures and as a local backup:

curl -s -L -o literature/1706.03762.pdf "https://arxiv.org/pdf/1706.03762"

You can read the PDF with the Read tool (pages parameter, max 20 pages per call) if you need to inspect figures or verify details. Rate limit: Wait 3 seconds between consecutive arXiv downloads.

Fallback — if ar5iv is unavailable for a paper, read the downloaded PDF directly:

Read(file_path="literature/1706.03762.pdf", pages="1-15")

Priority 2: Paper has no arXiv ID → OpenAlex OA links (free)

Query OpenAlex for the paper's open access PDF URL:

# Look up by DOI
curl -s "https://api.openalex.org/works/https://doi.org/10.1234/example?api_key=$OPENALEX_API_KEY&select=id,title,best_oa_location,open_access"

The response includes best_oa_location.pdf_url — a direct link to the PDF hosted by the publisher, PMC, or a repository. Download it:

curl -s -L -o literature/paper_name.pdf "$PDF_URL"

Then read with the Read tool as above.

Priority 3: No OA PDF available → WebFetch the publisher page

Use WebFetch to read the paper directly from the publisher's website:

WebFetch(url="https://doi.org/10.1234/example", prompt="Extract the full methodology, experimental setup, and key results")

Many publisher pages include the full text in HTML even when the PDF is paywalled. If the publisher page doesn't have full text, try searching for the paper title + "pdf" to find alternative sources.

After reading each paper

Update literature/README.md with the paper's filename, title, and key takeaways. This serves as a persistent index so you (and future sessions) can quickly see what's been read and what was learned.


Quick Start: Common Workflows

Literature search (find related work)

# 1. Relevance-ranked search (S2)
curl -s -H "x-api-key: $S2_API_KEY" \
  "https://api.semanticscholar.org/graph/v1/paper/search?query=YOUR+TOPIC&limit=10&fields=title,year,citationCount,abstract,externalIds,citationStyles"

# 2. Most-cited papers on the topic (S2)
curl -s -H "x-api-key: $S2_API_KEY" \
  "https://api.semanticscholar.org/graph/v1/paper/search/bulk?query=YOUR+TOPIC&sort=citationCount:desc&fields=title,year,citationCount&limit=10"

# 3. Search with OpenAlex (broader coverage, includes OA links)
curl -s "https://api.openalex.org/works?search=YOUR+TOPIC&filter=publication_year:>2020&sort=cited_by_count:desc&per-page=10&select=id,title,publication_year,cited_by_count,doi,open_access,best_oa_location&api_key=$OPENALEX_API_KEY"

Get BibTeX for a paper

# Option A: From S2 (include citationStyles field — has bibtex key)
curl -s -H "x-api-key: $S2_API_KEY" \
  "https://api.semanticscholar.org/graph/v1/paper/ArXiv:1706.03762?fields=title,citationStyles"

# Option B: From CrossRef via DOI (works for ALL DOI types)
curl -s -L -H "Accept: application/x-bibtex" "https://dx.doi.org/$DOI"

Deep dive on a paper's review on openreview

# Search OpenReview for the paper
curl -s "https://api2.openreview.net/notes/search?term=PAPER+TITLE&source=forum&limit=1"

# Get reviews (use forum ID from above)
curl -s "https://api2.openreview.net/notes?forum=$FORUM_ID&limit=50"

# Read full text — use arXiv if available (see "Reading Full Papers" above)

Search Tips

  • Use keyword phrases, not sentences — S2 natural language queries return 0 results
  • Start broad, then narrow: "regularization" → "L2 regularization neural networks"
  • Filter by year for recent work: &year=2023- (S2) or &filter=publication_year:>2022 (OpenAlex)
  • Use minCitationCount to skip low-impact: &minCitationCount=50 (S2) or &filter=cited_by_count:>50 (OpenAlex)
  • Multiple searches: 3-5 queries with different angles for coverage
  • Read full text: After identifying key papers, always download and read them — see "Reading Full Papers"

Rate limit

  • S2 abstracts may contain control chars (ESC/0x1b). Strip with: re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f]', '', text)
  • S2 openAccessPdf is often empty — use arXiv direct download or OpenAlex best_oa_location instead
  • S2 /paper/search does NOT support sort — use /paper/search/bulk instead
  • S2 /paper/{id}/citations does NOT support sort — newest first only
  • CrossRef /transform fails for arXiv DOIs — always use dx.doi.org
  • CrossRef references unreliable — use S2 for reference lists
  • OpenReview V2 wraps values: content.title.value, not content.title
  • OpenReview venue IDs have no trailing slash: ICLR.cc/2024/Conference
  • arXiv rate limit — wait 3 seconds between consecutive downloads
  • OpenAlex budget — $1/day free tier; list queries cost $0.0001, search costs $0.001; singleton lookups are free
Install via CLI
npx skills add https://github.com/findalexli/ai-scientist-v3 --skill search-papers
Repository Details
star Stars 16
call_split Forks 2
navigation Branch main
article Path SKILL.md
More from Creator