---
name: ai-research-tracker
description: Track and analyze AI research from companies like OpenAI, Anthropic, Google DeepMind. Create bilingual (English/Chinese) structured notes in Obsidian with automated daily updates.
version: 1.0.0
author: Hermes Agent
license: MIT
metadata:
  hermes:
    tags: [research-tracking, ai-research, obsidian, bilingual, automation, openai]
    category: research
    related_skills: [obsidian, web-content-extraction, llm-wiki, arxiv]
---
AI Research Tracker
Track and analyze AI research publications from major labs (OpenAI, Anthropic, Google DeepMind, etc.) with structured bilingual notes and automated daily updates.
Overview
This skill provides a complete system for:
- Fetching research content from AI labs (even with Cloudflare protection)
- Creating structured bilingual (English/Chinese) notes
- Organizing content in Obsidian with templates
- Automating daily updates via cron jobs
When to Use
Use this skill when:
- User wants to track AI research from OpenAI, Anthropic, etc.
- Content is blocked by Cloudflare (use jina.ai proxy)
- Need bilingual documentation (English source + Chinese translation)
- Want structured analysis format (innovation, technical details, applications)
- Need automated daily fetching
Prerequisites
- Obsidian vault configured
- Proxy available (if needed for network access)
- Python 3.x for automation scripts
Directory Structure
OpenAI Research/ # Or "AI Research/" for multiple sources
├── README.md # System documentation
├── Index.md # Navigation hub
├── Papers/ # Detailed research notes
│ ├── o3-o4-mini.md
│ ├── gpt-5.md
│ └── ...
├── Daily Updates/ # Daily summaries
│ ├── 2025-04-09.md
│ └── ...
├── Insights/ # Trend analysis
│ ├── trends.md
│ └── ...
├── _templates/ # Note templates
│ ├── paper-template.md
│ └── daily-template.md
└── _scripts/ # Automation
    └── fetch_research.py
Content Fetching Strategy
Method 1: Direct Browser Navigation (Preferred for Anthropic)
For sites like Anthropic that don't block browser access:
# Navigate to research page
browser_navigate(url="https://www.anthropic.com/research")
# Extract full text content
browser_console(expression="document.body.innerText")
# For long articles, get content in chunks
browser_console(expression="document.body.innerText.substring(0, 15000)")
browser_console(expression="document.body.innerText.substring(15000, 30000)")
Advantages:
- Full content extraction (not just summary)
- Preserves article structure
- No proxy needed for many sites
- Can scroll and navigate sections
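The chunked reads shown for long articles can be generated rather than typed by hand. The following is a minimal sketch, assuming the page length is already known (for example from `document.body.innerText.length`); `chunk_expressions` is a hypothetical helper, not part of any browser tool.

```python
def chunk_expressions(total_length, chunk_size=15000):
    """Build JavaScript substring expressions that cover the whole page text."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    expressions = []
    for start in range(0, total_length, chunk_size):
        # The last chunk is clipped to the end of the text.
        end = min(start + chunk_size, total_length)
        expressions.append(f"document.body.innerText.substring({start}, {end})")
    return expressions
```

Each returned string can be passed as the `expression` argument of `browser_console`, and the results concatenated in order.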
Method 2: jina.ai Proxy (For Cloudflare-Protected Sites like OpenAI)
When direct access fails:
# Use jina.ai proxy
curl -sL --proxy "http://127.0.0.1:7890" \
"https://r.jina.ai/http://openai.com/research" 2>&1
# For specific articles
curl -sL --proxy "http://127.0.0.1:7890" \
"https://r.jina.ai/http://openai.com/index/article-slug/" 2>&1
When to use:
- Site blocks direct browser access
- Need quick text extraction
- Don't need interactive navigation
Response Format
jina.ai returns clean Markdown:
Title: Article Title
URL Source: http://original-url.com
Markdown Content:
# Article content...
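A response in this layout can be split into its parts with a few lines of Python. This is a sketch that assumes the three markers appear exactly as shown above; `parse_jina_response` is a hypothetical helper, and any other header lines a real response may carry are ignored.

```python
def parse_jina_response(text):
    """Split a jina.ai reader response into title, source URL, and Markdown body."""
    result = {"title": None, "url": None, "markdown": ""}
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("Title:"):
            result["title"] = line[len("Title:"):].strip()
        elif line.startswith("URL Source:"):
            result["url"] = line[len("URL Source:"):].strip()
        elif line.startswith("Markdown Content:"):
            # Everything after this marker is the article body.
            result["markdown"] = "\n".join(lines[index + 1:]).strip()
            break
    return result
```

An empty `markdown` value is a useful signal that the fetch should be retried or reviewed by hand.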
Note Structure (Bilingual - Enhanced Format)
Each research note follows this comprehensive structure:
# [Article Title in Chinese]
**原文标题**: [Original Title]
**发布日期**: [Date]
**分类**: [Category]
**原文链接**: [URL]
## 摘要 (Abstract)
[Chinese translation of abstract/summary]
## 核心内容翻译 (Full Translation)
[Complete Chinese translation of the article content]
## 深度解读 (Deep Analysis)
### 1. 研究背景与动机
[Research context and why this matters]
### 2. 方法论与创新点
[Methods used and what's novel]
### 3. 主要发现与结论
[Key findings and conclusions]
### 4. 技术细节剖析
[Technical deep dive - architecture, algorithms, benchmarks]
### 5. 实际应用与影响
[Practical implications for developers, businesses, policymakers]
### 6. 局限性与未来方向
[Limitations and future work]
## 思考与反思 (Personal Reflection)
[Your critical thinking about the research]
- What are the implications?
- What concerns does it raise?
- How does it connect to other work?
## 相关阅读 (Related Reading)
- [Links to related papers/articles]
- [Previous work from same lab]
- [Follow-up research]
---
*Generated on [date]*
Alternative Structure (for quick notes)
For faster processing of multiple articles:
# Article Title
**原文标题**: [Original]
**发布日期**: [Date]
**分类**: [Category]
**原文链接**: [URL]
## 核心发现 (Key Findings)
- Finding 1
- Finding 2
- Finding 3
## 中文翻译 (Translation)
[Key sections translated]
## 分析 (Analysis)
[Brief analysis]
## 影响 (Implications)
[Practical implications]
---
*Generated: [date]*
Setup Instructions
Step 1: Create Directory Structure
# "Daily Updates" is quoted so the space does not break brace expansion
mkdir -p "{OBSIDIAN_PATH}/OpenAI Research"/{Papers,"Daily Updates",Insights,_templates,_scripts}
Step 2: Create Templates
paper-template.md:
---
title: {{title}}
date: {{date}}
url: {{url}}
tags: [{{tags}}]
status: {{status}}
---
# {{title}}
## 基本信息
- **发布日期**: {{date}}
- **原文链接**: [{{url}}]({{url}})
- **研究类型**: {{type}}
- **重要性**: {{priority}}
---
## 原文摘要
{{original_summary}}
---
## 中文翻译
{{chinese_translation}}
---
## 深度解读
### 核心创新
{{core_innovation}}
### 技术细节
{{technical_details}}
### 性能提升
{{performance_gains}}
---
## 关键要点
1. {{key_point_1}}
2. {{key_point_2}}
3. {{key_point_3}}
---
## 实际应用
{{applications}}
---
## 局限性与风险
{{limitations}}
---
*创建于: {{created_date}}*
daily-template.md:
---
date: {{date}}
type: daily-update
---
# Daily Update - {{date}}
## 今日概览
- **新发布研究**: {{count}} 篇
- **重要更新**: {{important_count}} 篇
- **重点关注**: {{focus_area}}
---
## 新发布内容
{{articles_section}}
---
## 趋势观察
{{trends}}
---
*自动生成于: {{timestamp}}*
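Both templates use `{{name}}` placeholders, so a script can fill them with a small substitution routine. A minimal sketch, assuming placeholder names consist of letters, digits, and underscores; `render_template` is a hypothetical helper and is not tied to any Obsidian plugin.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template, values):
    """Replace {{name}} placeholders; unknown names are left in place for review."""
    def substitute(match):
        key = match.group(1)
        return str(values[key]) if key in values else match.group(0)

    return PLACEHOLDER.sub(substitute, template)
```

Leaving unknown placeholders untouched makes unfilled sections easy to spot when a draft note is reviewed.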
Step 3: Create Automation Script
fetch_research.py:
#!/usr/bin/env python3
"""Fetch AI research and create Obsidian notes."""
import os
import subprocess
from datetime import datetime

OBSIDIAN_PATH = "{path}/OpenAI Research"  # replace {path} with the vault location
PROXY = "http://127.0.0.1:7890"
JINA_BASE = "https://r.jina.ai/http://"


def fetch(url):
    """Return the jina.ai Markdown rendering of url, or None if curl fails."""
    # Pass arguments as a list so the URL is never interpreted by a shell.
    cmd = ["curl", "-sL", "--proxy", PROXY, f"{JINA_BASE}{url}"]
    result = subprocess.run(cmd, capture_output=True, text=True, encoding="utf-8")
    return result.stdout if result.returncode == 0 else None


def create_note(article):
    # Create structured note from article
    pass


def main():
    # Fetch research page
    content = fetch("openai.com/research")
    # Parse articles
    # Create notes
    # Generate daily update
    pass


if __name__ == "__main__":
    main()
Step 4: Setup Cron Job
# Copy script to hermes scripts directory
cp fetch_research.py ~/.hermes/scripts/
# Create cron job (runs daily at 9 AM)
cronjob create --name "AI Research Daily" \
--schedule "0 9 * * *" \
--script "fetch_research.py"
Workflow
Multi-Article Batch Processing (Anthropic Example)
For tracking multiple articles from a research page:
Navigate to research page:
browser_navigate(url="https://www.anthropic.com/research")
Extract article list:
browser_console(expression="document.body.innerText")
# Parse to find article URLs, titles, dates
Create todo list for batch processing:
todo(todos=[
    {"id": "1", "content": "Article 1: Title (Date)", "status": "in_progress"},
    {"id": "2", "content": "Article 2: Title (Date)", "status": "pending"},
    # ... etc
])
Process each article:
for article in articles:
    browser_navigate(url=article['url'])
    content = browser_console(expression="document.body.innerText")
    # Create comprehensive note
    write_file(path=f"{date} {title}.md", content=note_content)
    todo(todos=[...], merge=True)  # Mark as completed
Manual Research Tracking (Single Article)
Fetch content:
# Method 1: Direct browser (for Anthropic, etc.)
browser_navigate(url="https://www.anthropic.com/research/article-slug")
browser_console(expression="document.body.innerText")
# Method 2: jina.ai proxy (for OpenAI, etc.)
curl -sL --proxy "http://127.0.0.1:7890" \
"https://r.jina.ai/http://openai.com/index/article-slug/" 2>&1
Create note from template:
- Copy paper-template.md
- Fill in all sections
- Translate key content to Chinese
- Add deep analysis with 6 sections
- Include personal reflection
Update index:
- Add to Index.md
- Link related notes
Create daily update:
- Summarize new research
- Note trends
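The "Update index" step can be scripted so that a note is never listed twice. A minimal sketch, assuming `Index.md` holds one Obsidian wikilink per line and that notes live under `Papers/`; `add_to_index` and its parameters are hypothetical names.

```python
from pathlib import Path


def add_to_index(index_path, note_title, section="Papers"):
    """Append a wikilink for the note to the index unless it is already listed."""
    index = Path(index_path)
    existing = index.read_text(encoding="utf-8") if index.exists() else ""
    link = f"- [[{section}/{note_title}]]"
    if link in existing.splitlines():
        return False  # already indexed, nothing written
    # Make sure the new entry starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    index.write_text(existing + separator + link + "\n", encoding="utf-8")
    return True
```

The boolean return value lets a batch run report how many index entries were actually added.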
Automated Daily Updates
The cron job will:
- Fetch latest research
- Detect new articles
- Create draft notes
- Generate daily summary
- Deliver notification
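The "detect new articles" step needs some memory of what earlier runs already saw. One possible approach, sketched here under the assumption that each article is a dict with a `url` key and that a small JSON state file is acceptable; `detect_new_articles` and the state file are illustrative, not part of the shipped script.

```python
import json
from pathlib import Path


def detect_new_articles(articles, state_file):
    """Return articles whose URL is not yet recorded, then record them."""
    state_path = Path(state_file)
    seen = set()
    if state_path.exists():
        seen = set(json.loads(state_path.read_text(encoding="utf-8")))
    new_articles = [a for a in articles if a["url"] not in seen]
    seen.update(a["url"] for a in new_articles)
    # Persist the sorted list so the file stays stable and readable in diffs.
    state_path.write_text(json.dumps(sorted(seen), indent=2), encoding="utf-8")
    return new_articles
```

Only the returned articles need draft notes, which keeps repeated daily runs from recreating existing files.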
Best Practices
Content Quality
- Always translate key sections - Don't leave English-only notes
- Add personal insights - Don't just copy official content
- Include performance tables - Benchmarks are crucial for AI research
- Note limitations - All research has limitations
- Cross-reference - Link to related notes
Organization
- Use consistent tags - #o-series, #gpt-series, #multimodal, etc.
- Rate importance - ⭐ to ⭐⭐⭐⭐⭐
- Update status - draft → completed → archived
- Maintain index - Keep Index.md current
- Archive old content - Move outdated notes to _archive/
Automation
- Proxy auto-detection - Script automatically detects common proxy ports (7890, 7891, 7897, 1080, 1087, 9090)
- Handle failures gracefully - Script continues on errors and provides helpful diagnostics
- Limit API calls - Don't hammer jina.ai
- Log everything - Keep track of what was fetched
- Review drafts - Automated notes need human review
Proxy Configuration Priority:
- Environment variable HTTP_PROXY or HTTPS_PROXY
- Auto-detected working proxy on common ports
- Direct connection (fallback)
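The priority order above could be implemented roughly as follows. This is a sketch, not the script's actual detection code: `resolve_proxy` and `port_is_open` are hypothetical names, and an open local port is only assumed to be an HTTP proxy (7891, for instance, is listed under Troubleshooting as Clash's SOCKS5 port and would need a `socks5://` URL instead).

```python
import os
import socket

COMMON_PROXY_PORTS = (7890, 7891, 7897, 1080, 1087, 9090)


def port_is_open(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def resolve_proxy(environ=None, ports=COMMON_PROXY_PORTS, probe=port_is_open):
    """Pick a proxy URL by priority: environment, local port scan, then None."""
    environ = os.environ if environ is None else environ
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        if environ.get(name):
            return environ[name]
    for port in ports:
        if probe(port):
            # Assumption: the listener speaks the HTTP proxy protocol.
            return f"http://127.0.0.1:{port}"
    return None  # caller falls back to a direct connection
```

Passing `environ` and `probe` as parameters keeps the function testable without a running proxy.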
Common Patterns
Pattern 1: Model Release
For new model announcements (GPT-5, o3, etc.):
## 核心创新
- **Architecture**: What's new in the architecture
- **Training**: New training methods
- **Capabilities**: New abilities
## 性能提升
| Benchmark | New Model | Previous | Improvement |
|-----------|-----------|----------|-------------|
| MMLU | 87% | 82% | +5 pp |
Pattern 2: Technical Report
For research papers:
## 技术细节
### Method
[Detailed method description]
### Experiments
[Experimental setup]
### Results
[Key results with tables]
### Ablation Studies
[What matters most]
Pattern 3: Product Launch
For product announcements:
## 实际应用
### 适用场景
- [Specific use case 1]
- [Specific use case 2]
### 对开发者的影响
[API changes, new features]
### 定价与可用性
[Pricing tiers, rollout plan]
Troubleshooting
Issue: jina.ai returns empty
Cause: URL format issue or rate limiting
Solution:
- Check URL format (must include http://)
- Add delay between requests
- Try alternative: textise dot iitty
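The first two fixes can be built into the fetch path. A minimal sketch, assuming a `fetch` callable that returns text or `None`; `jina_url` and `fetch_with_retry` are hypothetical helpers, and the delay values are illustrative rather than documented jina.ai limits.

```python
import time


def jina_url(target):
    """Prefix the target with a scheme if missing, then wrap it for r.jina.ai."""
    if not target.startswith(("http://", "https://")):
        target = "http://" + target
    return "https://r.jina.ai/" + target


def fetch_with_retry(fetch, url, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url) until it returns non-empty text, backing off between tries."""
    for attempt in range(attempts):
        content = fetch(url)
        if content and content.strip():
            return content
        if attempt < attempts - 1:
            # Double the wait after every empty response.
            sleep(base_delay * (2 ** attempt))
    return None
```

The same `sleep` hook can be used to space out requests across articles so that batch runs do not hit the service in a tight loop.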
Issue: Proxy connection fails / "Failed to fetch research page"
Cause: Proxy service not running or network connectivity issues
Solution:
Check if proxy is running:
curl --proxy http://127.0.0.1:7890 https://httpbin.org/ip
Common proxy ports to check:
- Clash: 7890 (HTTP), 7891 (SOCKS5)
- V2Ray: 1080, 1087
- Surge: 9090
Start your proxy service:
- Clash Verge: Open app and click "Enable"
- V2Ray/Shadowsocks: Start the client
- Verify: lsof -i :7890 should show the proxy process
The script now auto-detects proxies - it will try common ports automatically
If no proxy available:
- The script will try direct connection as fallback
- Some networks may block jina.ai directly
- Consider using a VPN or different network
Issue: Cron job fails but manual run works
Cause: Environment variables or proxy not available in cron context
Solution:
- The script auto-detects proxies at runtime (added in v1.1)
- Ensure proxy service starts before cron job runs
- Check logs: cronjob list and cronjob log <job-id>
Issue: Chinese characters garbled
Cause: Encoding issue
Solution:
- Ensure files are UTF-8
- Use encoding='utf-8' in Python
- Check terminal encoding
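In practice this means never relying on the platform's default encoding when notes are written or read. A small sketch; `write_note` and `read_note` are hypothetical helper names.

```python
from pathlib import Path


def write_note(path, content):
    """Write a note as UTF-8 regardless of the platform's default encoding."""
    note_path = Path(path)
    note_path.parent.mkdir(parents=True, exist_ok=True)
    note_path.write_text(content, encoding="utf-8")
    return note_path


def read_note(path):
    """Read a note back as UTF-8."""
    return Path(path).read_text(encoding="utf-8")
```

Routing every file operation in the script through helpers like these keeps Chinese headings such as 深度解读 intact on systems whose default encoding is not UTF-8.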
Issue: Cron job not running
Cause: Path or permission issue
Solution:
- Use absolute paths in script
- Check script permissions: chmod +x script.py
- Check hermes logs: cronjob list
Integration with Other Skills
With obsidian skill
- Use skill_view("obsidian") for vault operations
- Link research notes to existing notes
- Use Dataview queries for research dashboard
With arxiv skill
- Combine with arxiv for academic papers
- Link blog posts to arxiv papers
- Track both industry and academic research
With llm-wiki skill
- Use llm-wiki structure for broader knowledge base
- AI Research Tracker as specialized module
- Cross-link between systems
Examples
Example 1: OpenAI o3/o4-mini
See the full example in the conversation history. Key sections:
- Agentic tool use breakthrough
- Multimodal reasoning details
- Performance benchmarks
- Safety considerations
Example 2: GPT-5
See the full example in the conversation history. Key sections:
- Unified system architecture
- Router mechanism
- Coding/writing/health capabilities
- Comparison with previous models
Resources
Skill version: 1.2.0
Last updated: 2026-04-12
Changelog
v1.2.0 (2026-04-12)
- Added browser-based content extraction method for sites like Anthropic
- Added batch processing workflow for multiple articles
- Enhanced note template with 6-section deep analysis format
- Added personal reflection section to template
- Support for both Anthropic and OpenAI research tracking
v1.1.0 (2026-04-11)
- Added automatic proxy detection for common ports (7890, 7891, 7897, 1080, 1087, 9090)
- Added fallback to direct connection if no proxy available
- Improved error messages with troubleshooting guidance
- Added multiple jina.ai service endpoints for redundancy
- Updated troubleshooting section with proxy debugging steps
Activation Keywords
- "ai-research-tracker"
- "ai research tracker"
- "use ai research tracker"
- "ai research tracker help"
- "ai research tracker tool"
Tools Used
- Read - Read existing files and documentation
- Write - Create new files and documentation
- Bash - Execute commands when needed
Instructions for Agents
- Identify user's intent and specific requirements
- Gather necessary context from files or user input
- Execute appropriate actions using available tools
- Provide clear results and suggest next steps