name: article-render-pdf
description: "Use when the user provides a URL to a web article, blog post, or X Article and wants LaTeX notes rendered as PDF. Triggers: article/blog URL, '整理这篇文章', '文章笔记'. Handles translation detection (traces back to original)."
Article Render PDF
Use this skill to turn a web article, blog post, or X/Twitter article into a complete, compilable .tex note and a rendered PDF.
This skill complements youtube-render-pdf and bilibili-render-pdf for text-based content sources.
Supported Sources
| Source | Fetch Method | Notes |
|---|---|---|
| X/Twitter Articles | api.fxtwitter.com/{user}/status/{id} | Full article JSON, zero auth |
| X/Twitter Threads | Same API, recursive via replying_to_status | Chase reply chain |
| Tech blogs | mcp__duckduckgo-search__fetch_content | Anthropic, OpenAI, etc. |
| Official docs | Same as blogs | OpenAI Codex docs, etc. |
| Medium / Substack | WebFetch or fetch_content | May need pagination |
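The "recursive via replying_to_status" strategy in the threads row can be sketched as a loop that walks from the last tweet up to the thread root. This is a minimal sketch under assumptions: `fetch` is a hypothetical callable that returns the parsed JSON for one status, and `replying_to_status` is assumed to hold the parent status id (or null at the root).

```python
from typing import Callable


def chase_thread(fetch: Callable[[str, str], dict], user: str, status_id: str,
                 max_depth: int = 50) -> list[dict]:
    """Follow replying_to_status upward and return the tweets oldest-first."""
    tweets, seen = [], set()
    while status_id and status_id not in seen and len(tweets) < max_depth:
        seen.add(status_id)  # guard against accidental loops
        tweet = fetch(user, status_id).get("tweet") or {}
        if not tweet:
            break
        tweets.append(tweet)
        # Assumption: replying_to_status is the parent status id, or None at the root.
        parent = tweet.get("replying_to_status")
        status_id = str(parent) if parent else None
    return list(reversed(tweets))
```

In practice `fetch` would wrap the same api.fxtwitter.com request used for single articles; injecting it keeps the traversal logic testable without network access.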
Goal
Produce professional Chinese-language article notes from a URL.
The output must:
- be written in Chinese with technical terms preserved in English
- faithfully represent the original author's content and arguments
- be structurally organized with \section{...} and \subsection{...}
- be a complete .tex document from \documentclass to \end{document}
- be compiled successfully to PDF as part of the final delivery
Translation Detection (Critical)
Before processing any article, check if it is a translation:
Detection Signals
- Title contains 【译】, [译], 翻译, or Translation
- Body mentions 原文链接, 原文作者, Original by, or Source:
- Author is a known translator (e.g., @dotey 宝玉, @MinLiBuilds 实践哥MinLi)
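The signals above can be checked mechanically before deciding how to proceed. The following is a minimal heuristic sketch, not a definitive detector: the function name and the returned labels are hypothetical, and plain substring matching will produce false positives (for example, a title about "machine translation").

```python
# Marker lists copied from the detection signals above.
TITLE_MARKERS = ("【译】", "[译]", "翻译", "translation")
BODY_MARKERS = ("原文链接", "原文作者", "original by", "source:")
# Handles named above as examples; extend as needed.
KNOWN_TRANSLATORS = {"dotey", "minlibuilds"}


def translation_signals(title: str, body: str, author_handle: str = "") -> list[str]:
    """Return the names of the signals that fired (empty list: no sign of translation)."""
    signals = []
    title_l, body_l = title.lower(), body.lower()
    if any(marker in title_l for marker in TITLE_MARKERS):
        signals.append("title")
    if any(marker in body_l for marker in BODY_MARKERS):
        signals.append("body")
    if author_handle.lstrip("@").lower() in KNOWN_TRANSLATORS:
        signals.append("author")
    return signals
```

Any non-empty result is a prompt to look for the original source, not proof that the article is a translation.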
When Translation Is Detected
- Extract the original source URL from the article body or entity links
- Check entityMap in the fxtwitter response for LINK entities
- Search the first few paragraphs for 原文链接 or Original:
- Fetch the original content using the appropriate method
- Generate notes from the original, not the translation
- On the front page, credit:
- Original author and link
- Translator as reference
- Note: "本笔记基于英文原文整理"
- The translation may be used as a terminology reference for Chinese technical terms
Finding Original X Articles
When the translation links to another X post:
# Extract entity URLs from fxtwitter response
article.content.entityMap → look for type: "LINK" entries
# The URL often points to the original author's X article
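The lookup described above can be sketched in Python. The exact shape of entityMap is an assumption here: the sketch accepts either a dict of entities or a list of {"key", "value"} records, and expects each LINK entity to carry its URL under data.url. Adjust it to whatever the actual response contains.

```python
def link_urls(entity_map) -> list[str]:
    """Collect the URLs of LINK entities, in order, without duplicates."""
    # Assumption: each entity looks like {"type": "LINK", "data": {"url": "..."}}.
    if isinstance(entity_map, dict):
        entities = list(entity_map.values())
    else:
        entities = [e.get("value", e) if isinstance(e, dict) else e
                    for e in (entity_map or [])]
    urls = []
    for entity in entities:
        if not isinstance(entity, dict) or entity.get("type") != "LINK":
            continue
        url = (entity.get("data") or {}).get("url")
        if url and url not in urls:
            urls.append(url)
    return urls
```

The returned list is only a set of candidates; the link to the original article still has to be picked out by reading the surrounding text.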
When the original is a podcast/video:
- Search YouTube: yt-dlp "ytsearch:SPEAKER PODCAST_NAME YEAR"
- Get transcript: youtube-transcript-api or auto-captions
Content Acquisition
X/Twitter Articles
# Fetch full article JSON (zero auth, no API key needed)
curl -s "https://api.fxtwitter.com/{user}/status/{id}" | python3 -c "
import sys, json

data = json.load(sys.stdin)
tweet = data['tweet']
# article may be missing or null on plain tweets
article = tweet.get('article') or {}
blocks = (article.get('content') or {}).get('blocks', [])

# Extract text with structure, emitting Markdown-style markers per block type
for block in blocks:
    btype = block.get('type', '')
    text = block.get('text', '')
    if 'header' in btype:
        print(f'\n## {text}\n')
    elif btype == 'unordered-list-item':
        print(f'- {text}')
    elif btype == 'ordered-list-item':
        print(f'1. {text}')
    elif btype == 'blockquote':
        print(f'> {text}')
    elif btype == 'code-block':
        print(f\"\`\`\`\n{text}\n\`\`\`\")
    else:
        print(text)
"
Key fields in the fxtwitter response:
- tweet.article.title: article title
- tweet.article.content.blocks: structured content blocks
- tweet.article.content.entityMap: links, media, embedded tweets
- tweet.author.name: author display name
- tweet.created_at: publish date
- tweet.views: view count
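These fields map naturally onto metadata.json. The sketch below is one possible mapping, under stated assumptions: the output key names and the "type" labels are hypothetical, and created_at is assumed to use the classic Twitter timestamp format. If parsing fails, the raw value is kept rather than guessed.

```python
from datetime import datetime


def build_metadata(data: dict, url: str) -> dict:
    """Map the fxtwitter fields listed above onto a metadata.json-style dict."""
    tweet = data.get("tweet") or {}
    article = tweet.get("article") or {}
    created = tweet.get("created_at") or ""
    try:
        # Assumption: created_at looks like "Sat Mar 15 08:30:00 +0000 2025".
        date = datetime.strptime(created, "%a %b %d %H:%M:%S %z %Y").date().isoformat()
    except ValueError:
        date = created  # keep the raw value rather than guess
    return {
        "title": article.get("title", ""),
        "author": (tweet.get("author") or {}).get("name", ""),
        "date": date,
        "url": url,
        "type": "x-article" if article else "tweet",  # hypothetical type labels
        "views": tweet.get("views"),
    }
```

The resulting dict can be written with json.dump(..., ensure_ascii=False) so that Chinese titles stay readable in the file.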
Blogs and Documentation
# Use fetch_content for full text (supports pagination)
mcp__duckduckgo-search__fetch_content(url, max_length=20000)
# If content exceeds 20K chars, use start_index for pagination
mcp__duckduckgo-search__fetch_content(url, start_index=20000, max_length=20000)
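The pagination pattern above generalizes to a loop that keeps requesting pages until one comes back short. This is a sketch only: `fetch_page` is a hypothetical wrapper around the fetch tool taking (url, start_index, max_length), and it is assumed to return plain text of at most max_length characters with no extra truncation notice.

```python
from typing import Callable


def fetch_all(fetch_page: Callable[[str, int, int], str], url: str,
              page_size: int = 20000, max_pages: int = 20) -> str:
    """Concatenate pages until a short (or empty) page signals the end."""
    parts = []
    for page in range(max_pages):
        chunk = fetch_page(url, page * page_size, page_size)
        parts.append(chunk)
        if len(chunk) < page_size:  # last page reached
            break
    return "".join(parts)
```

The max_pages cap is a safety limit so that a misbehaving source cannot cause an unbounded loop.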
Image Extraction
For articles with embedded images:
- X Articles: extract media URLs from entityMap entries with type MEDIA
- Blogs: extract <img> tags from fetched HTML
- Download images to the article directory for LaTeX inclusion
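For the blog case, the <img> extraction can be done with the standard library alone. A minimal sketch, assuming the raw HTML of the page is available (a text-only fetch result will contain no tags to parse); the class and function names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgCollector(HTMLParser):
    """Collect absolute image URLs from <img src="..."> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src and not src.startswith("data:"):  # skip inline data URIs
            absolute = urljoin(self.base_url, src)
            if absolute not in self.urls:
                self.urls.append(absolute)


def image_urls(html: str, base_url: str) -> list[str]:
    parser = ImgCollector(base_url)
    parser.feed(html)
    return parser.urls
```

Relative paths are resolved against the article URL so the results can be downloaded directly into the article directory.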
Writing Rules
Write the notes in Chinese unless the user explicitly requests another language.
Organize the document with \section{...} and \subsection{...}. Reconstruct the teaching flow when needed; do not blindly mirror the original article order.
Start from assets/notes-template.tex. Adapt the metadata block for articles:
- Change 课程笔记 to 文章笔记 on the front page
- \noteauthors{...} → "基于 [Author Name] 文章内容整理". Never use "XX & Codex", "XX & AI", or similar
- \notedate{...} → the article's actual publish date (e.g., "2025-03-15"). Never use \today
- \videochannel → article author / source platform
- \videopublishdate → publish date
- \videoduration → estimated reading time (e.g., "阅读时长:约 15 分钟")
- \videourl → article URL (always fill this in; you have the URL the user provided)
- \videocoverpath → leave empty (articles typically have no cover image)
- \repourl{https://github.com/hqhq1025/ai-course-notes} → keep the default; do not change or delete
For translated articles, add a note below the metadata box:
\vspace{0.3cm}
{\small 本笔记基于英文原文整理。原文作者:XXX,译者参考:YYY。\par}
Use highlight boxes deliberately:
- importantbox for core concepts, key arguments, design principles
- knowledgebox for background context, related work, terminology
- warningbox for common misconceptions, pitfalls, caveats the author raises
- No images inside boxes
- No quota per section; add as many as the content justifies
When a mathematical formula or data comparison appears:
- Show formulas in display math using $$...$$
- Use LaTeX tables (tabular / booktabs) for structured comparisons
When code examples appear:
- Wrap in lstlisting with a descriptive caption
Preserve notable quotes from the original author using \textit{...} or the quote environment.
End every major section with \subsection{本章小结}. Add \subsection{拓展阅读} when there are worthwhile external links.
End the document with \section{总结与延伸} containing:
- Structured distillation of core claims
- Your own synthesis and cross-links between sections
- Concrete takeaways or open questions
Do not emit [cite]-style placeholders anywhere in the LaTeX.
Directory Structure
articles/
├── anthropic-harness-design/ # kebab-case topic name
│ ├── content.txt # fetched article text
│ ├── metadata.json # author, date, url, type
│ ├── anthropic-harness-design-notes.tex # LaTeX source
│ └── anthropic-harness-design-notes.pdf # compiled PDF
├── dotey-karpathy-translation/ # translation example
│ ├── content.txt # translated text (reference)
│ ├── original_content.txt # original English text
│ ├── original_metadata.json # original source metadata
│ ├── metadata.json
│ ├── dotey-karpathy-translation-notes.tex
│ └── dotey-karpathy-translation-notes.pdf
└── ...
File Naming Rule
The .tex and .pdf files MUST be named <dirname>-notes.tex and <dirname>-notes.pdf, where <dirname> is the article directory name. For example, if the directory is anthropic-harness-design/, the files must be anthropic-harness-design-notes.tex and anthropic-harness-design-notes.pdf. Never use bare notes.tex or other ad-hoc names.
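The naming rule is simple enough to derive programmatically, which avoids ad-hoc names slipping in. A minimal sketch; the helper name is hypothetical.

```python
from pathlib import Path


def expected_outputs(article_dir: str) -> tuple[Path, Path]:
    """Return the required .tex and .pdf paths for an article directory."""
    directory = Path(article_dir)
    stem = f"{directory.name}-notes"  # <dirname>-notes
    return directory / f"{stem}.tex", directory / f"{stem}.pdf"
```

For example, expected_outputs("articles/anthropic-harness-design/") yields paths ending in anthropic-harness-design-notes.tex and anthropic-harness-design-notes.pdf.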
Delivery
Deliver all of the following:
- the fetched article content as content.txt
- article metadata as metadata.json
- for translations: original_content.txt and original_metadata.json
- the final .tex file
- the compiled PDF (two passes of xelatex -interaction=nonstopmode)
- evidence that the compiled PDF was visually inspected via rendered pages/contact sheet and corrected if needed
Compilation
cd <article-dir>
xelatex -interaction=nonstopmode <file>.tex # first pass (generate references)
xelatex -interaction=nonstopmode <file>.tex # second pass (resolve references)
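When compilation is driven from a script rather than typed by hand, the same two passes can be wrapped as follows. This is a sketch under assumptions: the function names are illustrative, and check=True treats any non-zero xelatex exit code as a failure, which may be stricter than desired if a PDF is still produced despite errors.

```python
import subprocess
from pathlib import Path


def xelatex_command(tex_name: str) -> list[str]:
    """Build the xelatex invocation used for each pass."""
    return ["xelatex", "-interaction=nonstopmode", tex_name]


def compile_twice(tex_path: str) -> Path:
    """Run xelatex twice in the .tex file's directory and return the PDF path."""
    tex = Path(tex_path)
    for _ in range(2):  # the second pass resolves references written by the first
        subprocess.run(xelatex_command(tex.name), cwd=tex.parent, check=True)
    pdf = tex.with_suffix(".pdf")
    if not pdf.exists():
        raise FileNotFoundError(pdf)
    return pdf
```

Running with cwd set to the article directory mirrors the cd step above, so relative image paths in the .tex file resolve the same way.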
After compilation, follow the PDF Visual QA steps in ../video-render-common/writing-and-figures.md: render the PDF pages, inspect a contact sheet and any suspicious pages at full size, fix layout or rendering problems, and recompile before delivery.
Rules
- Follow the ai-course-notes/CLAUDE.md conventions for LaTeX structure, boxes, and figures
- Always compile the PDF twice (two xelatex passes) to resolve references
- When the source is a translation, trace back to the original and use that as the primary source
- Keep technical terms in the original English even when the notes are in Chinese
- Do NOT use TikZ for any visualization — it causes compilation timeouts. Use LaTeX tables or pre-generated images instead
Asset
assets/notes-template.tex: default LaTeX template to copy and fill