ideer-daily-paper-chatbot


Use iDeer as a daily paper-reading workflow for chatbot-first users such as Codex, Gemini, or ChatGPT. Keep the original iDeer paper-digest setup, source selection, history validation, and email/report/ideas workflow, but replace in-repo LLM API summarization and scoring with the current chatbot session. Suited to daily paper curation, reports, idea generation, and automation without separately configuring an OpenAI/SiliconFlow/Ollama API key.

By LiYu0524 · Updated 5/12/2026

name: ideer-daily-paper-chatbot
description: "Use iDeer as a daily paper-reading workflow for chatbot-first users such as Codex, Gemini, or ChatGPT. Keep the original iDeer paper-digest setup, source selection, history validation, and email/report/ideas workflow, but replace in-repo LLM API summarization and scoring with the current chatbot session. Suited to daily paper curation, reports, idea generation, and automation without separately configuring an OpenAI/SiliconFlow/Ollama API key."
allowed-tools: "read(), write(.env), write(.web_config.json), write(.client_config.json), write(profiles/), write(state/), write(history/), write(chatbot_test_outputs/), grep(), glob(), bash(), web_fetch(), web_search()"

iDeer Daily Paper Chatbot

Use this skill when the user wants the iDeer daily-paper workflow but does not want the repo to call its own LLM API. The chatbot should do the reading, scoring, grouping, report writing, and idea generation directly in the current conversation.

Constants

  • PROJECT_DIR: the current iDeer repository root. When installed by scripts/install_internshannon_skill.py, this becomes the absolute clone path.
  • SKILL_DIR: skills/ideer-daily-paper-chatbot inside the iDeer repository.
  • Default sources: arxiv semanticscholar huggingface rss
  • Default RSS feed: https://imjuya.github.io/juya-ai-daily/rss.xml
  • First-run schedule preference: Asia/Shanghai daily at 13:00, saved but not enabled.
  • First validation mode: dry run only, save local artifacts, do not send email, do not enable recurring schedules.

Core rule

Keep as much of the original iDeer workflow as possible:

  • reuse the repo layout
  • reuse the source fetchers when they work
  • reuse .env, profiles/description.txt, and profiles/researcher_profile.md
  • reuse history/ as the artifact destination when saving outputs

But do not rely on main.py for any step that requires MODEL_NAME, BASE_URL, API_KEY, or Ollama. Instead, fetch raw items and have the chatbot perform the intelligence layer.

Never use the Tinder/swipe product path for this skill. Do not call /api/swipe, read client/src/swipeView.tsx for workflow state, or use saved swipe queues as recommendation input.

What stays the same

  • source defaults and source-selection heuristics
  • profile-driven filtering using profiles/description.txt
  • optional stronger report/ideas guidance from profiles/researcher_profile.md
  • artifact validation in history/
  • optional SMTP sending when the user explicitly wants live email and SMTP config exists
  • Codex automation support for recurring runs

What changes

Replace these original in-repo LLM tasks with chatbot work in-session:

  • per-item Chinese summary
  • per-item relevance scoring
  • per-source daily summary
  • cross-source narrative report
  • research idea generation

Do not call python main.py or bash scripts/run_daily.sh unless the user explicitly wants to test the original API-based pipeline. For chatbot-first runs, fetch raw data with the repo's fetchers or with web browsing and continue in the conversation.

Files to inspect first

Always check:

  • .env
  • profiles/description.txt

Check when needed:

  • profiles/researcher_profile.md
  • profiles/x_accounts.txt

If .env does not exist, or if .env lacks SMTP_RECEIVER, or if profiles/description.txt is missing/empty, enter First-run setup before any digest run.
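
The setup gate above can be sketched as a small check. This is a minimal sketch, not code from the iDeer repo: the names `parse_env`, `setup_reasons`, and `check_project` are hypothetical, the parser only handles plain `KEY=VALUE` lines, and treating an empty `SMTP_RECEIVER` value as missing is an assumption.

```python
from pathlib import Path
from typing import Optional


def parse_env(text: str) -> dict:
    """Parse plain KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def setup_reasons(env_text: Optional[str], description_text: Optional[str]) -> list:
    """Return why first-run setup is needed; an empty list means it is not."""
    reasons = []
    if env_text is None:
        reasons.append(".env is missing")
    elif not parse_env(env_text).get("SMTP_RECEIVER"):
        # Assumption: an empty value counts the same as a missing key.
        reasons.append(".env lacks SMTP_RECEIVER")
    if description_text is None or not description_text.strip():
        reasons.append("profiles/description.txt is missing or empty")
    return reasons


def read_or_none(path: Path) -> Optional[str]:
    """Read a text file, or return None when it does not exist."""
    return path.read_text(encoding="utf-8") if path.is_file() else None


def check_project(project_dir: str) -> list:
    """Apply the gate to a repository root (PROJECT_DIR)."""
    root = Path(project_dir)
    return setup_reasons(
        read_or_none(root / ".env"),
        read_or_none(root / "profiles" / "description.txt"),
    )
```

If `check_project(".")` returns a non-empty list, the reasons can be shown to the user before entering First-run setup.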

Modes

Map the user request to one of these modes:

  • First-run setup: ask the user for required setup fields, write .env/profiles/UI config with the helper script, then run a small dry run
  • Chatbot dry run: fetch sources, summarize in-chat, save markdown/html/json artifacts, do not send email
  • Chatbot full digest: fetch sources, summarize in-chat, save artifacts, send email only if SMTP config is complete and the user asked for live send
  • Setup/fix: adjust .env, profiles, categories, or fetchers so source collection works
  • Recurring automation: create or update a Codex automation that performs a chatbot-first digest

First-run setup

Use this mode when the user is installing iDeer for the first time or when the config files are missing. If the client supports option boxes or structured follow-up questions, use them; otherwise ask concise numbered questions.

Required questions

Ask for:

  • receiver email address
  • research direction / interest description
  • information sources
  • preferred delivery time

Use these defaults when the user accepts defaults or gives an incomplete answer:

  • sources: arxiv semanticscholar huggingface rss
  • schedule: daily, 13:00, Asia/Shanghai
  • arXiv categories: cs.AI cs.CL cs.LG
  • Hugging Face content type: papers
  • Semantic Scholar field: Computer Science
  • report: enabled
  • ideas: disabled
  • email sending: disabled
  • recurring schedule: disabled
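
One way to apply these defaults is to merge the user's answers over a defaults dictionary before handing the result to the setup helper. The sketch below is an illustration under assumptions: `build_setup_payload` is a hypothetical name, and only the payload keys that appear in this document's helper example (`receiver`, `description`, `scholar_urls`, `sources`, `schedule`, `generate_ideas`) are covered, because key names for the other defaults are not documented here.

```python
import copy

# Defaults from the list above, limited to keys shown in the helper's
# example payload. Other defaults (arXiv categories, Hugging Face content
# type, and so on) are omitted because their key names are not documented.
DEFAULTS = {
    "scholar_urls": [],
    "sources": ["arxiv", "semanticscholar", "huggingface", "rss"],
    "schedule": {"frequency": "daily", "time": "13:00", "timezone": "Asia/Shanghai"},
    "generate_ideas": False,
}
# These two have no default and must come from the user.
REQUIRED = ("receiver", "description")


def build_setup_payload(answers: dict) -> dict:
    """Merge user answers over the defaults; skipped answers keep the default."""
    payload = copy.deepcopy(DEFAULTS)
    for key, value in answers.items():
        if value is None or value == "" or value == [] or value == {}:
            continue  # skipped or empty answer
        if key == "schedule":
            # Partial schedules (for example only a time) keep the other fields.
            payload["schedule"].update({k: v for k, v in value.items() if v})
        else:
            payload[key] = value
    missing = [k for k in REQUIRED if not payload.get(k)]
    if missing:
        raise ValueError("still need answers for: " + ", ".join(missing))
    return payload
```

Raising on a missing receiver or description keeps the helper from being called with invented values.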

Optional questions

Ask, but allow the user to skip:

  • Google Scholar or personal homepage URL
  • SMTP server, sender, and app password
  • whether to include GitHub
  • whether to generate research ideas

Only include twitter if the user explicitly chooses it and an X_RAPIDAPI_KEY is available. Do not ask for repo LLM API keys during chatbot-first setup.

Write setup files

After collecting answers, pass JSON to the helper:

cat <<'JSON' | .venv/bin/python skills/ideer-daily-paper-chatbot/scripts/setup_chatbot_config.py
{
  "receiver": "user@example.com",
  "description": "User research interests here",
  "scholar_urls": [],
  "sources": ["arxiv", "semanticscholar", "huggingface", "rss"],
  "schedule": {
    "frequency": "daily",
    "time": "13:00",
    "timezone": "Asia/Shanghai"
  },
  "generate_ideas": false
}
JSON

If .venv does not exist yet, use python3 skills/ideer-daily-paper-chatbot/scripts/setup_chatbot_config.py for this setup step. The helper writes .env, profiles/description.txt, optional profiles/researcher_profile.md, state/ideer_chatbot_setup.json, .web_config.json, and .client_config.json.

The helper must not invent SMTP passwords or API keys. If SMTP is incomplete, report that email is not configured and that the first run will only save local artifacts.

After setup, run a small chatbot-first dry run, such as arxiv and huggingface with low limits, then report the files created and that scheduling remains disabled.

Source defaults

  • Default paper/news sources: arxiv semanticscholar huggingface rss
  • RSS defaults to Juya AI Daily: https://imjuya.github.io/juya-ai-daily/rss.xml
  • Add github only when the user wants code/repo signals
  • Add twitter only when the user explicitly wants social signals and credentials exist
  • For Hugging Face, default to papers only
  • For CS users, start arXiv from cs.AI cs.CL cs.LG; expand to cs.CV cs.RO for embodied, spatial, or robotics interests
  • Prefer explicit Semantic Scholar queries when the profile is broad
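
The source rules above could be expressed as two small functions. This is a rough sketch: `choose_sources` and `choose_arxiv_categories` are hypothetical names, and the keyword list used to detect embodied, spatial, or robotics interests is an assumption; in practice the chatbot would judge the profile text itself.

```python
DEFAULT_SOURCES = ["arxiv", "semanticscholar", "huggingface", "rss"]
BASE_ARXIV = ["cs.AI", "cs.CL", "cs.LG"]
EXTRA_ARXIV = ["cs.CV", "cs.RO"]
# Hypothetical trigger words for the embodied/spatial/robotics expansion.
EXPANSION_HINTS = ("embodied", "spatial", "robot")


def choose_sources(wants_github=False, wants_twitter=False, has_x_key=False) -> list:
    """Start from the defaults; add github/twitter only under the stated conditions."""
    sources = list(DEFAULT_SOURCES)
    if wants_github:
        sources.append("github")
    # twitter needs both an explicit request and available credentials.
    if wants_twitter and has_x_key:
        sources.append("twitter")
    return sources


def choose_arxiv_categories(description: str) -> list:
    """Return the base CS categories, expanded when the profile hints at robotics."""
    text = description.lower()
    if any(hint in text for hint in EXPANSION_HINTS):
        return BASE_ARXIV + EXTRA_ARXIV
    return list(BASE_ARXIV)
```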

Chatbot-first pipeline

Step 1: Classify and configure

If first-run setup is needed, complete it before this step.

Read the profile and decide:

  • which sources to fetch
  • whether report and idea generation are requested
  • whether email is requested
  • whether the request is one-off or recurring

Use skills/ideer-daily-paper-chatbot/references/presets.md for presets.

Step 2: Fetch raw items

Prefer the repo fetchers first when the repo is available:

  • fetchers/arxiv_fetcher.py
  • fetchers/huggingface_fetcher.py
  • fetchers/semanticscholar_fetcher.py
  • fetchers/rss_fetcher.py
  • fetchers/github_fetcher.py
  • fetchers/twitter_fetcher.py

If the repo is not available or a fetcher is broken, use browsing and cite the public source pages.

Fetch raw candidates only. Do not call the repo's LLM scoring path.
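
A broken fetcher should not stop the whole run. The sketch below shows one possible wrapper that runs each source, records failures, and keeps the first blocker for the final report. `fetch_all` is a hypothetical helper, and the callables passed to it are stand-ins for whatever fetch functions the repo exposes.

```python
from typing import Callable, Optional


def fetch_all(fetchers: dict[str, Callable[[], object]]) -> tuple:
    """Run each zero-argument fetch callable and keep going when one fails.

    Returns (results, first_blocker). A failed source maps to None so the
    caller can switch that source to browsing instead.
    """
    results = {}
    first_blocker: Optional[str] = None
    for name, fetch in fetchers.items():
        try:
            results[name] = fetch()
        except Exception as exc:  # any fetcher failure is recorded, not raised
            results[name] = None
            if first_blocker is None:
                first_blocker = f"{name}: {type(exc).__name__}: {exc}"
    return results, first_blocker
```

Each callable can wrap a repo fetcher call with its arguments already bound, for example with a `lambda`, so the wrapper itself stays independent of fetcher signatures.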

Run commands from PROJECT_DIR. Prefer .venv/bin/python; if the virtualenv is missing, use Python 3.10+ to create it before fetching:

python3 -m venv .venv
.venv/bin/python -m pip install -r requirements.txt

Step 3: Deduplicate and curate

The chatbot should:

  • remove duplicates across sources when the same paper appears in both Hugging Face and arXiv
  • score relevance qualitatively or numerically in the conversation
  • organize results by the user's stated interest directions
  • write concise Chinese summaries and recommendation reasons

When the user gave explicit directions such as Agent / Spatial Intelligence / World Model, preserve those headings in the final digest.
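
Cross-source deduplication can be sketched by keying each item on its arXiv identifier when the URL contains one, and on a normalized title otherwise. The item shape (`title`, `url`, `source`) and the function names are assumptions for illustration; the repo's fetchers may return different fields.

```python
import re

# New-style arXiv identifiers such as 2401.12345, with an optional version suffix.
ARXIV_ID = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")


def item_key(item: dict) -> str:
    """Build a dedup key: arXiv id if present in the URL, else normalized title."""
    match = ARXIV_ID.search(item.get("url") or "")
    if match:
        return "arxiv:" + match.group(1)  # version suffix is ignored
    title = re.sub(r"[^a-z0-9]+", " ", (item.get("title") or "").lower()).strip()
    return "title:" + title


def dedupe(items: list) -> list:
    """Keep the first copy of each item and remember every source it came from."""
    merged = {}
    for item in items:
        key = item_key(item)
        if key not in merged:
            merged[key] = {**item, "sources": [item["source"]]}
        elif item["source"] not in merged[key]["sources"]:
            merged[key]["sources"].append(item["source"])
    return list(merged.values())
```

Keeping the list of sources on the merged item lets the digest say that a paper appeared in more than one place, which can itself be a relevance signal.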

Step 4: Save artifacts in iDeer-compatible places

Prefer these output shapes:

  • history/<source>/<date>/<date>.md for source-level markdown digests
  • history/reports/<date>/report.md for cross-source report
  • history/ideas/<date>/ideas.json for structured idea output
  • optional history/<source>/<date>/<source>_email.html if you render an HTML email body

It is acceptable for chatbot-first runs to write fewer files than the original pipeline, as long as you report exactly what was written.
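
The output shapes above map directly onto a small path builder. The following is a sketch with hypothetical function names; it only computes paths and does not create any files.

```python
from pathlib import Path


def artifact_paths(date: str, sources: list, base: str = "history") -> dict:
    """Return the iDeer-compatible artifact paths for one run date."""
    root = Path(base)
    paths = {f"{source}_digest": root / source / date / f"{date}.md" for source in sources}
    paths["report"] = root / "reports" / date / "report.md"
    paths["ideas"] = root / "ideas" / date / "ideas.json"
    return paths


def email_html_path(date: str, source: str, base: str = "history") -> Path:
    """Optional HTML email body for a single source."""
    return Path(base) / source / date / f"{source}_email.html"
```

After writing, the run report should list only the paths that actually exist on disk.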

If the user wants HTML artifacts without touching the main repo scripts, use the bundled renderer:

.venv/bin/python skills/ideer-daily-paper-chatbot/scripts/render_chatbot_artifacts.py \
  --date YYYY-MM-DD \
  --base-dir <artifact-dir>

This script should render report.html and digest_email.html from chatbot-written markdown/json outputs inside the chosen artifact directory.

Step 5: Email behavior

If SMTP is incomplete, do not claim that email was sent. Save the digest locally and tell the user what is missing.

If SMTP is complete and the user explicitly asked for sending, either:

  • reuse the repo's email templates/utilities if convenient, or
  • render a simple HTML body and send it through SMTP

Never send email on the first validation run unless the user clearly asked for a live send.
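
The email rules reduce to a short decision function. In this sketch only `SMTP_RECEIVER` is a key name taken from this document; `SMTP_SERVER`, `SMTP_SENDER`, and `SMTP_PASSWORD` are assumed names standing in for the SMTP server, sender, and app password, and should be replaced with whatever the repo's `.env` actually uses.

```python
# SMTP_RECEIVER is named in this document; the other three key names are
# assumptions for "SMTP server, sender, and app password".
SMTP_KEYS = ("SMTP_SERVER", "SMTP_SENDER", "SMTP_PASSWORD", "SMTP_RECEIVER")


def missing_smtp_keys(env: dict) -> list:
    """List SMTP settings that are absent or empty."""
    return [key for key in SMTP_KEYS if not env.get(key)]


def email_action(env: dict, asked_live_send: bool) -> tuple:
    """Return (action, missing_keys); action is 'send', 'skipped', or 'blocked'."""
    if not asked_live_send:
        # Covers the first validation run: no explicit request, no email.
        return "skipped", []
    missing = missing_smtp_keys(env)
    if missing:
        return "blocked", missing
    return "send", []
```

Returning the missing key names (never their values) gives the user a concrete list of what to configure.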

Step 6: Recurring automation

For chatbot-first automation, prefer the native agent/workflow scheduler when available. Use the repo root as the working directory and write the prompt so the chatbot fetches raw source items, performs summarization itself, saves artifacts, and sends email only if the SMTP configuration is complete.

First-run setup saves the user's schedule preference but does not enable it. Create or enable a recurring task only after the user confirms the dry-run artifacts look correct.

See skills/ideer-daily-paper-chatbot/references/automation.md.

Safe command patterns

Use small fetch/test commands instead of the full original pipeline.

Examples:

.venv/bin/python - <<'PY'
from fetchers.huggingface_fetcher import get_daily_papers
print(len(get_daily_papers(10)))
PY

.venv/bin/python - <<'PY'
from fetchers.arxiv_fetcher import fetch_papers_for_categories
print(fetch_papers_for_categories(['cs.AI','cs.LG'], max_entries=25, sleep_range=(0,0)).keys())
PY

Use bash scripts/run_daily.sh only to debug the legacy API-based path.

Validation checklist

After each run, report:

  • the date that actually ran
  • whether first-run setup was needed and which config files were written
  • which sources were fetched
  • whether summarization was done by the chatbot or by the repo pipeline
  • which files were created
  • whether email was sent, skipped, or blocked
  • whether recurring scheduling is enabled or still only saved as a preference
  • the first concrete blocker if anything failed
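
The checklist can be captured in a small record so that every run reports the same fields. This is an illustrative sketch: `RunReport`, `existing_only`, and `render_report` are hypothetical names, and the output wording is only one possible layout.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class RunReport:
    run_date: str
    sources_fetched: List[str]
    summarized_by: str            # "chatbot" or "repo pipeline"
    email_status: str             # "sent", "skipped", or "blocked"
    schedule_enabled: bool = False
    setup_needed: bool = False
    config_files_written: List[str] = field(default_factory=list)
    files_created: List[str] = field(default_factory=list)
    first_blocker: Optional[str] = None


def existing_only(paths, exists: Callable[[str], bool] = os.path.exists) -> list:
    """Keep only paths that really exist, so the report never claims missing files."""
    return [p for p in paths if exists(p)]


def render_report(report: RunReport) -> str:
    """Format the checklist as plain text, one item per line."""
    schedule = "enabled" if report.schedule_enabled else "saved as a preference only"
    lines = [
        f"Date: {report.run_date}",
        f"First-run setup: {'yes' if report.setup_needed else 'no'}",
        f"Config files written: {', '.join(report.config_files_written) or 'none'}",
        f"Sources: {', '.join(report.sources_fetched) or 'none'}",
        f"Summarized by: {report.summarized_by}",
        f"Files created: {', '.join(report.files_created) or 'none'}",
        f"Email: {report.email_status}",
        f"Schedule: {schedule}",
        f"First blocker: {report.first_blocker or 'none'}",
    ]
    return "\n".join(lines)
```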

Safety rules

  • Never print API keys, SMTP passwords, or tokens
  • Never claim files exist before checking them
  • Never claim email was sent before checking SMTP success
  • Do not overwrite user-authored profile files unless the user asked
  • Prefer writing additive chatbot-first artifacts over changing core repo code unless a fetcher is actually broken
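
When configuration has to be shown to the user, secrets can be masked first. The sketch below assumes that any key whose name contains `KEY`, `PASSWORD`, `TOKEN`, or `SECRET` is sensitive; that marker list and the name `redact_env` are illustrative choices, not part of the repo.

```python
# Assumed markers for sensitive setting names.
SENSITIVE_MARKERS = ("KEY", "PASSWORD", "TOKEN", "SECRET")


def redact_env(values: dict) -> dict:
    """Return a copy that is safe to print: non-empty secrets become '***'."""
    redacted = {}
    for key, value in values.items():
        is_sensitive = any(marker in key.upper() for marker in SENSITIVE_MARKERS)
        # Empty values stay empty so the user can see what is not configured.
        redacted[key] = "***" if (is_sensitive and value) else value
    return redacted
```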

Good default

For users who want paper digestion without API keys, start with:

  • raw fetch from arxiv and huggingface
  • chatbot-written markdown digest
  • optional chatbot-written cross-source report
  • no live email on the first pass

Install via CLI

npx skills add https://github.com/LiYu0524/iDeer --skill ideer-daily-paper-chatbot