chem-paper-search

Stars: 26

Searches Semantic Scholar and the open web for chemistry / chemical engineering / materials science papers given a topic and keyword sets. Returns a triaged table with title, first author, year, citation count, PDF availability, and one-line summary. Used as a sub-skill by `paper-mentor` during Phase 2; can also be invoked directly. Triggers on phrases like "find papers on", "幫我找文獻", "search Semantic Scholar", "literature on X".

By DennisWei9898 · Updated 5/6/2026

name: chem-paper-search
description: Searches Semantic Scholar and the open web for chemistry / chemical engineering / materials science papers given a topic and keyword sets. Returns a triaged table with title, first author, year, citation count, PDF availability, and one-line summary. Used as a sub-skill by paper-mentor during Phase 2; can also be invoked directly. Triggers on phrases like "find papers on", "幫我找文獻", "search Semantic Scholar", "literature on X".
version: "0.1.0"

chem-paper-search — Chemistry / Materials Paper Discovery

What this does

Given a research topic + 3-5 keyword sets, this skill returns a curated list of 10-20 papers, prioritizing those with open-access PDFs.

This is a sub-skill called by paper-mentor Phase 2. When invoked directly, it produces the same output as the literature search step.

Inputs (when called)

topic: "narrow bandgap Sn-Pb perovskite stability"
keyword_sets:
  - english_broad: "Sn-Pb perovskite stability"
  - english_narrow: "GASCN additive Sn-Pb perovskite crystallization"
  - chinese_optional: "錫鉛鈣鈦礦 穩定性"
target_count: 15  # default
year_min: 2020    # default: about 5 years before the current year

Workflow

Step 1: Semantic Scholar search (primary)

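Use WebFetch to call the Semantic Scholar API:

https://api.semanticscholar.org/graph/v1/paper/search?query={keywords}
  &fields=title,authors,year,abstract,citationCount,openAccessPdf,tldr
  &limit={target_count}

Run the API call once per keyword set and merge the results. Deduplicate by title, ignoring case and punctuation. The relevance search endpoint returns results in relevance order and does not document a sort parameter (sorting by citationCount is offered on /paper/search/bulk), so order the merged list by citationCount after merging.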
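The per-keyword-set call and the merge step can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names `build_search_url`, `normalize_title`, and `merge_results` are hypothetical, the sketch leaves any sort parameter out of the URL and orders results locally instead, and matching titles by lowercased, punctuation-stripped text is an assumption about what "deduplicate by title" should mean.

```python
from urllib.parse import urlencode

API_URL = "https://api.semanticscholar.org/graph/v1/paper/search"
FIELDS = "title,authors,year,abstract,citationCount,openAccessPdf,tldr"


def build_search_url(keywords: str, target_count: int = 15) -> str:
    """Return the search URL for one keyword set (hypothetical helper)."""
    params = {"query": keywords, "fields": FIELDS, "limit": target_count}
    return f"{API_URL}?{urlencode(params)}"


def normalize_title(title: str) -> str:
    """Lowercase and drop punctuation so near-identical titles compare equal."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in title.lower())
    return " ".join(kept.split())


def merge_results(result_lists: list[list[dict]]) -> list[dict]:
    """Merge per-keyword-set results, keeping the first paper seen for each title."""
    seen: set[str] = set()
    merged: list[dict] = []
    for papers in result_lists:
        for paper in papers:
            key = normalize_title(paper.get("title") or "")
            if key and key not in seen:
                seen.add(key)
                merged.append(paper)
    # Order by citation count, highest first; a missing count is treated as 0.
    merged.sort(key=lambda p: p.get("citationCount") or 0, reverse=True)
    return merged
```

Each list passed to `merge_results` would be the `data` array from one API response. Keeping the first occurrence means the earlier keyword set wins when two sets return the same paper.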

Step 2: WebSearch fallback

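If the Semantic Scholar API rate-limits (429) or returns fewer than 5 results:

For a 429, first retry with increasing backoff (10s, 30s, 60s); most 429s clear within a minute. Retrying does not help when the query simply returns fewer than 5 results, so go straight to the fallback in that case.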

If still failing, fall back to WebSearch:

WebSearch: site:semanticscholar.org {topic} {year_min}..{current_year}

Then WebFetch the Semantic Scholar pages to extract metadata.
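The retry schedule described above (10s, 30s, 60s before giving up and falling back) can be sketched like this. The `fetch` callable is a hypothetical stand-in for whatever performs the request; it is assumed to return an HTTP status code and a parsed body. The `sleep` parameter exists only so the waits can be replaced in tests.

```python
import time
from typing import Callable, Optional

BACKOFF_SECONDS = (10, 30, 60)  # delays taken from the text above


def fetch_with_backoff(
    fetch: Callable[[], tuple[int, Optional[dict]]],
    delays: tuple[int, ...] = BACKOFF_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fetch(); on HTTP 429 wait and retry. Return None if it never succeeds."""
    status, body = fetch()
    for delay in delays:
        if status != 429:
            break  # success or a non-rate-limit error: stop retrying
        sleep(delay)
        status, body = fetch()
    return body if status == 200 else None
```

A `None` return is the signal to switch to the WebSearch fallback.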

⚠️ Known limitation: WebFetch-scraped metadata loses structured fields (citationCount, openAccessPdf, tldr). When fields are missing:

  • Mark cites = unknown, pdf = unknown, summary = unknown
  • Do NOT auto-filter these papers — show them all and let the user / orchestrator decide
  • Note in output: "⚠️ Step 3 filter ran with reduced metadata; manual review recommended"
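One way to apply the "mark as unknown" rule is to map every paper record to table fields and substitute a sentinel when a field is absent. This is a sketch under assumptions: `to_row` and `reduced_metadata` are hypothetical names, and the record layout (an `openAccessPdf` object with a `url` key, a `tldr` object with a `text` key) follows the Semantic Scholar response shape as assumed here.

```python
UNKNOWN = "unknown"


def to_row(paper: dict) -> dict:
    """Map one paper record to table fields, marking missing metadata as unknown."""
    if "openAccessPdf" not in paper:
        pdf = UNKNOWN  # field was never retrieved (e.g. scraped page)
    else:
        pdf = "yes" if (paper["openAccessPdf"] or {}).get("url") else "no"
    cites = paper.get("citationCount")
    summary = (paper.get("tldr") or {}).get("text")
    return {
        "title": paper.get("title", UNKNOWN),
        "cites": UNKNOWN if cites is None else cites,
        "pdf": pdf,
        "summary": summary or UNKNOWN,
    }


def reduced_metadata(rows: list[dict]) -> bool:
    """True when any row has an unknown field, so the warning note should be shown."""
    return any(UNKNOWN in (r["cites"], r["pdf"], r["summary"]) for r in rows)
```

A citation count of 0 is kept as 0 rather than treated as missing, which is why the check compares against `None`.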

Step 3: Filter

Apply filters in order:

  • ✅ Has openAccessPdf → PRIORITY
  • ✅ citationCount > 20 (or > 5 for papers <2 years old)
  • ✅ year >= year_min
  • ❌ Predatory journal flag (use Beall's list — check journal name)
  • ❌ Off-topic (manual review of abstract)
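The mechanical filters above (year, citation threshold, open-access priority) can be sketched as below; the predatory-journal and off-topic checks need human judgment and are left out. The names `passes_filters` and `filter_and_rank` are hypothetical, and reading "papers <2 years old" as `current_year - year < 2` is an assumption. Papers with missing metadata are kept, in line with the Step 2 rule not to auto-filter them.

```python
from datetime import date
from typing import Optional


def passes_filters(paper: dict, year_min: int, current_year: Optional[int] = None) -> bool:
    """Apply the year and citation thresholds; unknown metadata is never auto-rejected."""
    current_year = current_year or date.today().year
    year = paper.get("year")
    cites = paper.get("citationCount")
    if year is not None and year < year_min:
        return False
    if year is None or cites is None:
        return True  # leave for manual review
    threshold = 5 if current_year - year < 2 else 20
    return cites > threshold


def filter_and_rank(papers: list[dict], year_min: int, current_year: Optional[int] = None) -> list[dict]:
    """Keep papers that pass the filters, listing those with an open-access PDF first."""
    kept = [p for p in papers if passes_filters(p, year_min, current_year)]
    # sorted() is stable, so papers keep their relative order within each group.
    return sorted(kept, key=lambda p: not (p.get("openAccessPdf") or {}).get("url"))
```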

Step 4: Triage

Mark each paper as:

  • 🔴 Must-read (必讀): Direct competitor, foundational, or very high citations
  • 🟡 Reference (參考): Useful method or comparison data
  • Skippable (可略): Low rigor or off-topic borderline

Step 5: Output

# Literature Search Results — [topic]

**Date**: YYYY-MM-DD
**Searches run**: [list keyword sets]
**Total found**: N (after dedup + filter)

## Triaged table

| # | Tag | Title | First author | Year | Cites | PDF | Summary |
|---|-----|-------|--------------|------|-------|-----|---------|
| 1 | 🔴 | ... | ... | 2024 | 142 | ✅ | One-line tldr from Semantic Scholar |
| ... |

## Recommended reading order

1. [#1] — start here because [reason]
2. [#3] — read for method comparison
3. [#7] — read for competing mechanism interpretation

## Searches that returned nothing (if any)

Document any keyword sets that returned 0 results — this is **valuable evidence of a gap** for the gap report.
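The triaged table in the template above could be produced by a small renderer like the following. It is only a sketch: `render_table` is a hypothetical name, and the row keys (`tag`, `title`, `first_author`, `year`, `cites`, `pdf`, `summary`) are assumed names chosen to mirror the column headers.

```python
def render_table(rows: list[dict]) -> str:
    """Render triaged rows as a markdown table with the columns from the template."""
    header = "| # | Tag | Title | First author | Year | Cites | PDF | Summary |"
    divider = "|---|-----|-------|--------------|------|-------|-----|---------|"
    lines = [header, divider]
    for i, r in enumerate(rows, start=1):
        cells = [i, r["tag"], r["title"], r["first_author"],
                 r["year"], r["cites"], r["pdf"], r["summary"]]
        # Escape pipes so a title containing "|" cannot break the table layout.
        lines.append("| " + " | ".join(str(c).replace("|", "\\|") for c in cells) + " |")
    return "\n".join(lines)
```

An empty `rows` list yields only the header and divider, which keeps a zero-result search visible rather than hiding it.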

Backup search engines

If Semantic Scholar is down or insufficient:

| Engine | URL pattern | When to use |
|--------|-------------|-------------|
| Elicit | https://elicit.com/search?q={query} | When you need AI-extracted research questions |
| Connected Papers | https://www.connectedpapers.com/search?q={query} | When you want a citation graph view |
| Google Scholar | (manual via WebSearch) | Last resort, low metadata quality |
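Filling the `{query}` placeholder in the URL patterns above needs URL-encoding, since keyword sets contain spaces. A minimal sketch, assuming the patterns listed in this section are current; `backup_urls` is a hypothetical helper name.

```python
from urllib.parse import quote_plus

# URL patterns copied from the list above; Google Scholar has no pattern (manual search).
BACKUP_ENGINES = {
    "Elicit": "https://elicit.com/search?q={query}",
    "Connected Papers": "https://www.connectedpapers.com/search?q={query}",
}


def backup_urls(query: str) -> dict[str, str]:
    """Return one ready-to-open URL per backup engine for the given query."""
    encoded = quote_plus(query)
    return {name: pattern.format(query=encoded) for name, pattern in BACKUP_ENGINES.items()}
```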

Hard rules

  1. Never invent papers. If 0 results, say so explicitly.
  2. Always show searches that returned nothing — these are evidence of gaps.
  3. Always verify citation count with a second source if it seems high (>500).
  4. Filter out predatory journals: exclude journals whose names follow the pattern "International Journal of [Field] Research" when the publisher is on Beall's list or otherwise looks unreliable.
  5. Quote exact API responses when uncertain — do not paraphrase metadata.

Common issues

| Issue | Fix |
|-------|-----|
| Semantic Scholar returns 429 | Retry with backoff (10s, 30s, 60s). If it persists, switch to the WebSearch fallback. |
| All papers are too old | Lower year_min by 2 years and retry; if results are still old, the field is likely dormant. |
| No open PDFs | Include arXiv preprints in the search (they have open PDFs). |
| Topic is too narrow | Loosen by removing 1-2 keywords; if results stay sparse, the field may not have studied it. |
| Topic is too broad (100+ raw results) | Tighten by adding a specific technique or material. |
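The "lower year_min by 2 years and retry" fix can be sketched as a single relaxed retry. Everything here is illustrative: `search` is a hypothetical callable that runs the full search for a given `year_min`, and the `min_results=5` trigger is an assumption borrowed from the "<5 results" threshold in Step 2.

```python
from typing import Callable


def search_with_relaxed_year(
    search: Callable[[int], list[dict]],
    year_min: int,
    min_results: int = 5,
    step: int = 2,
) -> tuple[list[dict], int]:
    """Run search(year_min); if too few papers come back, retry once with an earlier year."""
    papers = search(year_min)
    if len(papers) < min_results:
        year_min -= step  # widen the window by `step` years, once
        papers = search(year_min)
    return papers, year_min
```

Returning the `year_min` actually used lets the output report that the window was widened.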

Output downstream

This output feeds into:

  • paper-mentor Phase 2 Step 3: passes the table to chem-nlm-helper for NotebookLM ingestion
  • Direct invocation: returned to user as-is for manual reading
Install via CLI
npx skills add https://github.com/DennisWei9898/paper-mentor --skill chem-paper-search
Repository Details
Forks: 9
Branch: main
Path: SKILL.md