name: x-article-retrieval description: Retrieve full content from X/Twitter's non-standard tweet types — Note Tweets (long-form truncated tweets) and X Articles. Covers detection, field requirements, the critical incompatibility between note_tweet and article fields, and fallback tiers. trigger: When asked to extract long-form tweet content (Note Tweet or X Article), when fetch_x_bookmarks.py or fetch_x_accounts.py hits _article_fetch_error, when a tweet's text appears truncated (ends with "..."), OR when the user provides an X Article/tweet URL and says "wikiに取り込んで."
X Tweet Content Retrieval — Note Tweets & X Articles
X has two non-standard tweet types that require special retrieval:
| Type | Detection | Full Content Location | Extra API Call? |
|---|---|---|---|
| Note Tweet | tweet.text is truncated (ends with ...); note_tweet.text exists |
note_tweet.text |
No — available inline with tweet.fields=note_tweet |
| X Article | tweet.article.title exists |
article.plain_text |
Yes — requires separate call with tweet.fields=article |
⚠️ CRITICAL: Field Incompatibility
Do NOT mix note_tweet and article in the same tweet.fields request.
# ❌ WRONG — article.plain_text may be silently dropped
xurl "/2/tweets/ID?tweet.fields=note_tweet,article"
# ✅ CORRECT — two separate requests
xurl "/2/tweets/ID?tweet.fields=note_tweet" # for Note Tweets
xurl "/2/tweets/ID?tweet.fields=article" # for X Articles
This is a known X API behavior. The fields interact poorly and article.plain_text is silently dropped from the response when both are requested. All scripts (fetch_x_bookmarks.py, fetch_x_accounts.py) intentionally request only note_tweet in their main fetch and make a separate tweet.fields=article call for article body retrieval.
Note Tweets (Long-Form Tweets)
Note Tweets are regular tweets whose body exceeds the standard display limit. The API returns a truncated text (typically ~140 chars with ...) and the full content in note_tweet.text.
Detection
def _has_note_tweet(tweet):
nt = tweet.get("note_tweet") or {}
return bool(nt.get("text"))
Retrieval
Add note_tweet to tweet.fields in ANY tweet endpoint:
xurl "/2/tweets/<TWEET_ID>?tweet.fields=note_tweet"
xurl "/2/users/<USER_ID>/tweets?tweet.fields=note_tweet"
xurl "/2/users/<USER_ID>/bookmarks?tweet.fields=note_tweet"
Response includes full text in data.note_tweet.text:
{
"data": {
"id": "2052100726608781363",
"text": "Strong Opinions, Loosely Held on... (truncated)",
"note_tweet": {
"text": "Strong Opinions, Loosely Held on Agent + Harness Engineering:\n\n1. You can outperform any default harness+model..."
}
}
}
Key Facts
- No extra API call needed —
note_tweet.textcomes inline when the field is requested - Works on all tweet endpoints — bookmarks, user timelines, search, single tweet lookup
xurl read <ID>returns TRUNCATED text — must use raw v2 endpoint withtweet.fields=note_tweet- Signal: High bookmark count (500+), author known for long-form analysis
X Articles (x.com/i/article/...)
X Articles are a separate long-form content type. The tweet that shares an article contains article.title in the basic response, but article.plain_text (the full body) requires an explicit tweet.fields=article request with OAuth2 user auth.
Detection
def _is_article_tweet(tweet):
article = tweet.get("article") or {}
return bool(article.get("title"))
Retrieval — Primary Method
xurl --auth oauth2 "/2/tweets/<TWEET_ID>?tweet.fields=article"
Returns:
{
"data": {
"article": {
"title": "...",
"plain_text": "full article body...",
"preview_text": "first few lines...",
"cover_media": "media_id",
"entities": { "mentions": [...], "urls": [...] }
},
"text": "https://t.co/...",
"id": "TWEET_ID"
}
}
❌ What Does NOT Work
# Gives only title, no body
xurl read <TWEET_ID>
# Returns 404 or empty
web_extract "https://x.com/i/article/<ARTICLE_ID>"
# Returns 453 (insufficient access) for app-only bearer tokens
curl "https://x.com/i/api/2/tweets/<TWEET_ID>/article?..."
Additional Response Fields
Beyond plain_text and title, the X Article response includes useful structured metadata in article.entities:
{
"article": {
"entities": {
"hashtags": [
{"start": 0, "end": 13, "text": "productivity"},
{"start": 14, "end": 22, "text": "machine"}
],
"mentions": [
{"start": 7, "end": 16, "username": "cyrilXBT"}
]
}
}
}
hashtags: All hashtags found in the article body with character offsets — useful for auto-tagging wiki pagesmentions: All @mentions with usernames — useful for cross-referencing entity pagescover_media: Media ID for the article cover imagepreview_text: First few lines, useful for summaries without fetching the full body
To also get created_at (publication date for wiki filenames), make a separate request with tweet.fields=created_at,author_id (not tweet.fields=article). When fetching metadata-only, the article.title is still present but article.plain_text is NOT — confirming the field incompatibility is real and consistent.
Confirmed Working Example (2026-05-27)
# Step 1: Fetch article body
xurl --auth oauth2 "/2/tweets/2058373087330959829?tweet.fields=article"
# → article.plain_text: full 5000+ word Obsidian vault guide
# Step 2: Fetch metadata (separate call, different tweet.fields)
xurl --auth oauth2 "/2/tweets/2058373087330959829?tweet.fields=created_at,author_id"
# → created_at: "2026-05-24T02:22:41.000Z", author_id: "1373665408193036293"
Error Categories (from fetch_x_bookmarks.py)
| Error Category | Meaning | Action |
|---|---|---|
http_5xx |
X API server error (500) | Retry once. If persistent, fall back to Tier 4. |
timeout |
Request timed out | Retry once with longer timeout. If persistent, fall back to Tier 4. |
no_plain_text |
Article field present but body missing | Article may be too new (not yet processed). Wait and retry. |
no_article_field |
Response has no article field at all | Not an X Article tweet. Check tweet type. |
xurl_error |
Other xurl error | Fall back to Tier 4. |
parse_error |
JSON parse failure | Fall back to Tier 4. |
Fallback Tiers
- Tier 2:
web_extractonhttps://x.com/i/status/<TWEET_ID>— may get partial body for embedded article tweets - Tier 3: GetXAPI —
curl "https://api.getxapi.com/twitter/tweet/article?id=<TWEET_ID>" -H "Authorization: Bearer $GETXAPI_KEY"(requires subscription) - Tier 4:
web_searchfor"<article title>" <author_handle>— secondary sources, mirrors, or summaries
Decision Flow: Note Tweet vs X Article
Tweet received
├── note_tweet.text present? → Note Tweet → use note_tweet.text (no extra call)
├── article.title present? → X Article → fetch with tweet.fields=article
└── neither → Regular tweet → use text field
Script Integration
fetch_x_bookmarks.py
- Requests
tweet.fields=created_at,entities,referenced_tweets,note_tweet - Note Tweets:
note_tweet.textcomes inline — counted asnote_tweets_found - X Articles: detected via
_is_x_article(), separatefetch_article_body()call - Counters in summary:
note_tweets_found,x_articles_fetched,x_articles_failed
fetch_x_accounts.py
- Requests
tweet.fields=created_at,entities,referenced_tweets,note_tweet - Note Tweets:
note_tweet.textused incompact_post()via_get_full_tweet_text() - X Articles: detected via
_is_article_tweet(), separatefetch_article_body()call content_typetag in compact post:"note_tweet","x_article", or"tweet"- Counters in scan_meta:
note_tweets_found,articles_fetched,articles_failed
Reference Implementation
Note Tweet extraction (no extra API call)
def _get_full_tweet_text(tweet):
if tweet.get("note_tweet", {}).get("text"):
return tweet["note_tweet"]["text"]
return tweet.get("text", "")
# Usage in compact_post / is_substantive_post:
text = " ".join((_get_full_tweet_text(tweet) or "").split())
X Article body fetching (separate API call)
def fetch_article_body(tweet_id):
"""Fetch full X Article body via tweet.fields=article."""
try:
resp = json.loads(
run(f"/2/tweets/{tweet_id}?tweet.fields=article")
)
article = resp.get("data", {}).get("article")
if article and article.get("plain_text"):
return article, None, None
if article:
return None, "no_plain_text", "article field present but plain_text missing"
return None, "no_article_field", "response has no article field at all"
except XurlError as e:
msg = str(e)
if "500" in msg:
return None, "http_5xx", msg
if "timeout" in msg.lower() or "timed out" in msg.lower():
return None, "timeout", msg
return None, "xurl_error", msg
except json.JSONDecodeError as e:
return None, "parse_error", str(e)
Pitfalls
- Mixing
note_tweet+articlein tweet.fields → silent data loss —article.plain_textis dropped. Always use separate requests. xurl readdoes NOT fetch Note Tweet full text or Article body — always use raw v2 endpoint with explicittweet.fields.- OAuth2 is mandatory for X Article body — app-only bearer tokens get 453.
- Article ID ≠ Tweet ID —
x.com/i/article/IDis the article resource, not the tweet. - New articles may not have plain_text yet — X processes asynchronously. Retry after a few minutes.
note_tweetfield works on ALL tweet endpoints — bookmarks, timelines, search, single lookup. No special endpoint needed.