save-to-academic-notion


Save a URL/arXiv link into Selina's Notion Academic database as a new entry if missing, and populate metadata (title, authors, abstract). Supports arXiv papers (auto-fetch), non-arXiv URLs (web scraping fallback), and manual mode for agent-assisted saves. Use when the user says "save to notion", "add this paper", "save this arxiv", "put this link in my Academic database", or asks to create a new paper entry from a URL.

By Animadversio. Updated 3/10/2026


Save to Academic Notion (Selina)

Quick Usage (Automated)

Preferred method - use the Python script:

cd ~/.openclaw/workspace/skills/save-to-academic-notion
./save_paper.py <arxiv_id_or_url>

Examples:

# arXiv papers (auto-fetch metadata)
./save_paper.py 2501.12345
./save_paper.py https://arxiv.org/abs/2501.12345
./save_paper.py https://arxiv.org/pdf/2501.12345.pdf

# Non-arXiv URLs (web scraping fallback)
./save_paper.py https://openreview.net/forum?id=abc123

# Manual mode (when auto-extraction fails)
./save_paper.py --manual \
  --title "Paper Title" \
  --url "https://example.com/paper" \
  --authors "Alice; Bob; Charlie" \
  --abstract "This paper presents..."

# JSON output (for automation)
./save_paper.py 2501.12345 --json

What it does:

  • Auto-fetches metadata from arXiv API (title, authors, abstract, date)
  • For non-arXiv URLs: attempts web scraping fallback
  • Checks for duplicates (via Link property)
  • Creates page with proper properties and abstract as quote block
  • Uses "Authors txt" field (semicolon-separated text, not multi-select)
  • Returns Notion page URL

Output:

Fetching metadata for 2501.12345...
Creating Notion page for: Paper Title...
Authors: Author One, Author Two, ...
✓ Created: https://www.notion.so/...

If paper already exists:

✓ Paper already exists: https://www.notion.so/...

JSON output (action is "created" or "exists"; source is "arxiv", "web_fetch", or "manual"):

{
  "ok": true,
  "action": "created",
  "page_id": "...",
  "page_url": "https://www.notion.so/...",
  "source": "arxiv"
}

Error handling (non-arXiv URLs): if web scraping fails, the script returns a structured error with a suggestion:

{
  "ok": false,
  "error": "Could not extract metadata from URL",
  "suggestion": "Use browser tools to extract metadata, then use --manual mode"
}
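
The two JSON shapes above can be consumed by a calling agent roughly as follows. This is a minimal sketch, not part of save_paper.py: the helper name next_step is hypothetical, and it assumes the script's --json output is exactly one JSON object with the fields shown above.

```python
import json


def next_step(raw_output):
    """Summarize what to do next from one --json result (hypothetical helper)."""
    result = json.loads(raw_output)
    if result.get("ok"):
        # "action" is "created" or "exists"; both cases carry the page URL.
        return f'{result["action"]}: {result["page_url"]}'
    # On failure, surface the error together with the script's own suggestion.
    error = result.get("error", "unknown error")
    suggestion = result.get("suggestion", "no suggestion")
    return f"failed: {error} ({suggestion})"


ok = (
    '{"ok": true, "action": "created", "page_id": "...", '
    '"page_url": "https://www.notion.so/...", "source": "arxiv"}'
)
bad = (
    '{"ok": false, "error": "Could not extract metadata from URL", '
    '"suggestion": "Use browser tools to extract metadata, then use --manual mode"}'
)
print(next_step(ok))
print(next_step(bad))
```

Branching on ok first, and only then reading action, keeps the failure path independent of fields that exist only on success.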

Non-arXiv Paper Fallback

arXiv papers: metadata is auto-fetched from the arXiv API, so no manual input is normally needed.

Non-arXiv URLs: Attempts web scraping via openclaw web_fetch

  • Success: Creates page with extracted metadata (may be incomplete)
  • Failure: Returns error with suggestion to use --manual mode

Manual mode workflow (for agent):

  1. Try auto-save first: ./save_paper.py <url> --json
  2. If ok: false:
    • Use browser tools to extract title, authors, abstract
    • Save with --manual mode:
      ./save_paper.py --manual \
        --title "Paper Title" \
        --url "https://..." \
        --authors "Author1; Author2; Author3" \
        --abstract "..." \
        --date "2026-03-10"
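
When an agent assembles the --manual call programmatically, building an argument list avoids shell-quoting problems with titles and abstracts that contain quotes or newlines. The sketch below is an assumption-laden illustration: manual_command is a hypothetical helper, and it uses only the flags shown above (--manual, --title, --url, --authors, --abstract, --date).

```python
import shlex


def manual_command(title, url, authors, abstract="", date=""):
    """Build the argv list for a --manual save (hypothetical helper)."""
    argv = [
        "./save_paper.py", "--manual",
        "--title", title,
        "--url", url,
        # The skill stores authors as semicolon-separated text.
        "--authors", "; ".join(authors),
    ]
    if abstract:
        argv += ["--abstract", abstract]
    if date:
        argv += ["--date", date]
    return argv


argv = manual_command(
    title='A "Quoted" Paper Title',
    url="https://example.com/paper",
    authors=["Alice", "Bob", "Charlie"],
    date="2026-03-10",
)
# shlex.join renders a copy-pasteable shell line with safe quoting.
print(shlex.join(argv))
```

The list can be passed directly to subprocess.run(argv, ...) from the skill directory, so no shell is involved and no escaping is needed.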
      

Supported URLs:

  • arXiv (most reliable): https://arxiv.org/abs/...
  • OpenReview, ACL Anthology, bioRxiv, journal sites (web scraping fallback)
  • Any URL (manual mode as last resort)

Targets

  • Academic paper database:
    • database_id: d3e3be7f-c96a-45de-8e7d-3a78298f9ccd
    • data_source_id (query): 73e9f7f8-c667-4279-a62f-2c16c1885d0f

Hard rules

  • Only write when Selina explicitly asks to save/add/log.
  • Deduplicate: do not create a new entry if it already exists.
  • Prefer using the database property Link as the canonical key.
  • Authors field: Uses "Authors txt" (rich text, semicolon-separated), not "Author" (multi-select)

Dedup workflow

  1. Normalize the URL (trim, remove tracking params when safe).
  2. Query the data source filtering:
    • Link equals the normalized URL
    • If arXiv: also try canonical https://arxiv.org/abs/<id> if the user provided a PDF link.
  3. If found: return the existing page URL; optionally offer to update metadata if missing.
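
The dedup steps above can be sketched as follows. The function names, the set of tracking parameters, and the choice to keep any arXiv version suffix are assumptions made for illustration; only the Link property and the canonical abs form come from this skill. The filter dictionary follows the Notion API shape for a url property and would be sent as the "filter" field of a query request.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: which query keys count as "tracking" is not defined by the skill.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}

# New-style arXiv IDs only (e.g. 2501.12345, optionally with a version suffix).
ARXIV_URL = re.compile(
    r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5}(?:v\d+)?)(?:\.pdf)?", re.IGNORECASE
)


def normalize_url(url):
    """Trim whitespace, lowercase the host, and drop tracking parameters."""
    parts = urlsplit(url.strip())
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_KEYS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), parts.fragment)
    )


def candidate_links(url):
    """Return the normalized URL plus the canonical arXiv abs URL, if different."""
    links = [normalize_url(url)]
    match = ARXIV_URL.search(links[0])
    if match:
        canonical = f"https://arxiv.org/abs/{match.group(1)}"
        if canonical not in links:
            links.append(canonical)
    return links


def dedup_filter(links):
    """Build a Notion query filter matching any candidate Link value."""
    clauses = [{"property": "Link", "url": {"equals": link}} for link in links]
    return clauses[0] if len(clauses) == 1 else {"or": clauses}


links = candidate_links("https://arxiv.org/pdf/2501.12345.pdf?utm_source=feed")
print(links)
print(dedup_filter(links))
```

Querying both forms in one "or" filter means a paper saved earlier from its abs link is still found when the user later pastes the PDF link.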

Create workflow

If not found:

  1. Create a new page in the database:
    • POST /v1/pages with parent: {"database_id": <database_id>}
    • Set properties:
      • Name (title): best available title (from arXiv/metadata)
      • Link (url): normalized URL
      • Optionally: Publisher, Type, Publishing/Release Date, Discipline, Topics, Status
  2. Populate authors:
    • Updated (Mar 10, 2026): authors go in the Authors txt property (rich text, semicolon-separated).
    • Format: "Author1; Author2; Author3"
    • The Author property (multi-select) is no longer used: with 1000+ unique authors it overflows the select schema.
    • Example: properties["Authors txt"] = {"rich_text": [{"text": {"content": "Alice; Bob; Charlie"}}]}
  3. Populate abstract:
    • There is no dedicated "Abstract" property.
    • Put the abstract into the page body as blocks (preferred):
      • quote (common in Selina's pages) OR heading_2: "Abstract"
      • paragraph (or quote content): abstract text
    • Optionally use TLDR for a short summary (not the full abstract), if Selina wants that later.
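
The create steps above can be sketched as a function that assembles the request body for POST /v1/pages. This is an illustrative sketch, not the script's actual code: build_page_payload and rich_text are hypothetical names, and only Name, Link, and Authors txt are set. The text is split into 2000-character pieces because the Notion API caps the content of a single rich text object at that length.

```python
def rich_text(text, size=2000):
    """Split text into Notion rich text objects of at most `size` characters."""
    pieces = [text[i:i + size] for i in range(0, len(text), size)] or [""]
    return [{"type": "text", "text": {"content": piece}} for piece in pieces]


def build_page_payload(database_id, title, link, authors, abstract=""):
    """Assemble the JSON body for creating one paper entry (hypothetical helper)."""
    properties = {
        "Name": {"title": rich_text(title)},
        "Link": {"url": link},
        # Semicolon-separated text, not the multi-select Author property.
        "Authors txt": {"rich_text": rich_text("; ".join(authors))},
    }
    children = []
    if abstract:
        # The abstract goes into the page body as a quote block.
        children.append(
            {"object": "block", "type": "quote", "quote": {"rich_text": rich_text(abstract)}}
        )
    return {
        "parent": {"database_id": database_id},
        "properties": properties,
        "children": children,
    }


payload = build_page_payload(
    database_id="d3e3be7f-c96a-45de-8e7d-3a78298f9ccd",
    title="Paper Title",
    link="https://arxiv.org/abs/2501.12345",
    authors=["Alice", "Bob", "Charlie"],
    abstract="This paper presents...",
)
print(payload["properties"]["Authors txt"])
```

Keeping payload construction separate from the HTTP call makes it easy to inspect or log what would be written before anything is sent to Notion.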

Metadata extraction

If URL is arXiv

  • Accept input forms:
    • https://arxiv.org/abs/<id>
    • https://arxiv.org/pdf/<id>.pdf
    • bare <id> like 2501.01234
  • Fetch metadata from one of:
    1. arXiv API: http://export.arxiv.org/api/query?id_list=<id> (Atom XML)
    2. Fallback: scrape the arXiv abs page.
  • Extract:
    • title
    • authors
    • abstract
    • published date (optional)
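
As a rough illustration of the extraction step, the Atom XML returned by the arXiv API can be parsed with the standard library. The function name parse_arxiv_atom and the sample feed are made up for this sketch; fetching the XML from the API URL above is left out so the example runs offline.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical, heavily trimmed feed in the shape the arXiv API returns.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <published>2025-01-21T18:59:59Z</published>
    <title>A Hypothetical
      Paper Title</title>
    <summary>  This paper presents a made-up result.
    </summary>
    <author><name>Author One</name></author>
    <author><name>Author Two</name></author>
  </entry>
</feed>"""


def squash(text):
    """Collapse line breaks and repeated spaces left by feed formatting."""
    return " ".join((text or "").split())


def parse_arxiv_atom(xml_text):
    """Return title, authors, abstract, and date from the first entry, or None."""
    entry = ET.fromstring(xml_text).find("atom:entry", ATOM)
    if entry is None:
        return None
    return {
        "title": squash(entry.findtext("atom:title", "", ATOM)),
        "authors": [
            squash(name.text) for name in entry.findall("atom:author/atom:name", ATOM)
        ],
        "abstract": squash(entry.findtext("atom:summary", "", ATOM)),
        # Keep only the YYYY-MM-DD part of the timestamp.
        "published": squash(entry.findtext("atom:published", "", ATOM))[:10],
    }


meta = parse_arxiv_atom(SAMPLE)
print(meta)
```

Returning None when the feed has no entry gives the caller a clear signal to fall back to scraping the abs page.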

If URL is not arXiv

  • Use web_fetch to get the page and extract:
    • title (best guess)
    • author list if present
    • abstract/summary if present
  • If not reliably available, create entry with just Name + Link and leave placeholders.
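
One plausible way to make a best guess from fetched HTML is to read common metadata tags. The sketch below assumes the page exposes citation_* meta tags (the scholarly convention many publisher sites follow) or Open Graph tags; whether the script's web_fetch path works this way is not stated in this document, and the names MetaExtractor and extract_metadata are hypothetical.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <meta> name/property values and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.page_title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta":
            key = (attributes.get("name") or attributes.get("property") or "").lower()
            content = attributes.get("content")
            if key and content:
                self.meta.setdefault(key, []).append(content.strip())
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.page_title += data


def extract_metadata(html):
    """Best-guess title, authors, and abstract; empty values mean 'not found'."""
    parser = MetaExtractor()
    parser.feed(html)

    def first(*keys):
        return next((parser.meta[k][0] for k in keys if k in parser.meta), "")

    return {
        "title": first("citation_title", "og:title") or parser.page_title.strip(),
        "authors": parser.meta.get("citation_author", []),
        "abstract": first("citation_abstract", "og:description", "description"),
    }


SAMPLE_HTML = """<html><head>
<title>Fallback Title | Some Site</title>
<meta name="citation_title" content="A Hypothetical Paper">
<meta name="citation_author" content="Alice Example">
<meta name="citation_author" content="Bob Example">
<meta property="og:description" content="A made-up summary.">
</head><body></body></html>"""

info = extract_metadata(SAMPLE_HTML)
print(info)
```

Empty fields map naturally onto the rule above: if title or authors come back empty, create the entry with just Name + Link or switch to --manual mode.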

Suggested property mapping (when confident)

  • Publisher: set to bioRxiv, arXiv, NeurIPS, etc. only if clearly indicated.
  • Type: default to Academic Journal for papers; Blog Post for posts.
  • Status: default to Ready to Start.
  • Discipline: add tags like ML, MechInterp, Geometry, ScienceofDL if strongly implied.

UX

Before writing:

  • Confirm the normalized link + the guessed title.

After writing:

  • Return the created/updated Notion page URL.
  • Briefly list what fields were populated.

Install via CLI
npx skills add https://github.com/Animadversio/mossy-skills --skill save-to-academic-notion