fetching-web-content

star 8

Fetches web pages and returns clean, LLM-ready markdown with automatic anti-bot bypass. Handles Cloudflare, DataDome, PerimeterX, and Akamai by auto-escalating from HTTP to a stealth browser. Use when fetching URLs, scraping websites, reading web pages as markdown, or when a site blocks normal HTTP requests. Built-in SSRF protection.

leba01 By leba01 schedule Updated 2/25/2026

name: fetching-web-content description: > Fetches web pages and returns clean, LLM-ready markdown with automatic anti-bot bypass. Handles Cloudflare, DataDome, PerimeterX, and Akamai by auto-escalating from HTTP to a stealth browser. Use when fetching URLs, scraping websites, reading web pages as markdown, or when a site blocks normal HTTP requests. Built-in SSRF protection.

StealthFetch

Quick start

from stealthfetch import fetch_markdown

md = fetch_markdown("https://en.wikipedia.org/wiki/Web_scraping")

One function. No config. Returns clean markdown.

When to use what

fetch_markdown(url) — just need the text content as markdown.

fetch_result(url) — need page metadata (title, author, date, description) alongside the markdown. Returns a FetchResult dataclass.

method="browser" — force stealth browser for JavaScript-heavy SPAs that render content client-side. Default "auto" handles most cases (tries HTTP first, escalates on block detection).

Async

All functions have async variants: afetch_markdown, afetch_result. Same signatures.

from stealthfetch import afetch_markdown

md = await afetch_markdown("https://example.com")

Common parameters

Parameter Default When to change
include_links True Set False to strip hyperlinks
include_images False Set True to preserve image references
include_tables True Set False to strip tables
timeout 30 Increase for slow sites
headers None Pass cookies or auth headers
proxy None {"server": "http://proxy:8080"}

Error handling

from stealthfetch import StealthFetchError, FetchError, ExtractionError

try:
    md = fetch_markdown(url)
except FetchError:
    # Could not reach the URL (network, blocked, timeout)
except ExtractionError:
    # Page fetched but no main content found

BrowserNotAvailable is raised when browser mode is needed but no backend is installed.

Full API reference

See reference.md for complete function signatures, all parameters, and the FetchResult dataclass.

Install via CLI
npx skills add https://github.com/leba01/stealthfetch --skill fetching-web-content
Repository Details
star Stars 8
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator