fetching-web-content - SKILL.md Agent Skill

name: fetching-web-content
description: >
  Fetches web pages and returns clean, LLM-ready markdown with automatic anti-bot bypass. Handles Cloudflare, DataDome, PerimeterX, and Akamai by auto-escalating from HTTP to a stealth browser. Use when fetching URLs, scraping websites, reading web pages as markdown, or when a site blocks normal HTTP requests. Built-in SSRF protection.

StealthFetch

Quick start

from stealthfetch import fetch_markdown

md = fetch_markdown("https://en.wikipedia.org/wiki/Web_scraping")

One function. No config. Returns clean markdown.

When to use what

fetch_markdown(url) — just need the text content as markdown.

fetch_result(url) — need page metadata (title, author, date, description) alongside the markdown. Returns a FetchResult dataclass.

method="browser" — force stealth browser for JavaScript-heavy SPAs that render content client-side. Default "auto" handles most cases (tries HTTP first, escalates on block detection).
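The "auto" behaviour described above (HTTP first, browser only on block detection) can be pictured with a small sketch. This is not StealthFetch's implementation: `Response`, `looks_blocked`, `fetch_auto`, the status codes, and the marker strings are all assumptions chosen for illustration, and the two fetchers are passed in as plain callables so the example runs without network access.

```python
from typing import Callable, NamedTuple

class Response(NamedTuple):
    """Hypothetical minimal response shape used only in this sketch."""
    status: int
    body: str

# Assumed heuristics; real block detection is likely more involved.
BLOCK_STATUSES = {403, 429, 503}
BLOCK_MARKERS = ("just a moment", "captcha", "access denied")

def looks_blocked(resp: Response) -> bool:
    """Guess whether a response is an anti-bot challenge rather than content."""
    if resp.status in BLOCK_STATUSES:
        return True
    head = resp.body[:2000].lower()
    return any(marker in head for marker in BLOCK_MARKERS)

def fetch_auto(
    url: str,
    http_fetch: Callable[[str], Response],
    browser_fetch: Callable[[str], Response],
) -> tuple[str, Response]:
    """Try the cheap HTTP path first; escalate to the browser only if blocked."""
    resp = http_fetch(url)
    if not looks_blocked(resp):
        return "http", resp
    return "browser", browser_fetch(url)

# Stand-in fetchers so the sketch runs offline.
def fake_http(url: str) -> Response:
    return Response(403, "<title>Just a moment...</title>")

def fake_browser(url: str) -> Response:
    return Response(200, "<h1>Real content</h1>")

used, resp = fetch_auto("https://example.com", fake_http, fake_browser)
print(used, resp.status)  # browser 200
```

Trying the plain HTTP path first keeps the common case fast, and the browser cost is paid only for pages that appear to be blocked.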

Async

All functions have async variants: afetch_markdown, afetch_result. Same signatures.

import asyncio
from stealthfetch import afetch_markdown

# Inside a coroutine, use: md = await afetch_markdown(url)
md = asyncio.run(afetch_markdown("https://example.com"))
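Because the async variants are awaitable, several pages can be fetched concurrently. The sketch below shows one way to do that with `asyncio.gather` and a semaphore that caps concurrency. `fetch_many` and `max_concurrency` are hypothetical names, and the fetch coroutine is passed in as an argument (in real use that would presumably be `afetch_markdown`), so a stand-in keeps the example runnable offline.

```python
from __future__ import annotations

import asyncio
from typing import Awaitable, Callable

async def fetch_many(
    urls: list[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,
) -> dict[str, str | BaseException]:
    """Fetch all URLs concurrently, mapping each URL to markdown or its error."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> str:
        async with semaphore:  # at most max_concurrency fetches in flight
            return await fetch(url)

    # return_exceptions=True collects failures as values
    # instead of raising the first one.
    results = await asyncio.gather(
        *(fetch_one(u) for u in urls), return_exceptions=True
    )
    return dict(zip(urls, results))

# Stand-in for afetch_markdown so the sketch runs without network access.
async def fake_fetch(url: str) -> str:
    await asyncio.sleep(0)
    if "bad" in url:
        raise RuntimeError(f"could not fetch {url}")
    return f"# Page at {url}"

pages = asyncio.run(
    fetch_many(["https://example.com/a", "https://example.com/bad"], fake_fetch)
)
for url, result in pages.items():
    print(url, "->", "error" if isinstance(result, BaseException) else "ok")
```

Returning errors alongside results lets the caller decide per URL whether to retry or skip, and one failed page does not discard the others.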

Common parameters

| Parameter | Default | When to change |
|---|---|---|
| `include_links` | `True` | Set `False` to strip hyperlinks |
| `include_images` | `False` | Set `True` to preserve image references |
| `include_tables` | `True` | Set `False` to strip tables |
| `timeout` | `30` | Increase for slow sites |
| `headers` | `None` | Pass cookies or auth headers |
| `proxy` | `None` | Set to route requests through a proxy, e.g. `{"server": "http://proxy:8080"}` |
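To make the table concrete, here is a sketch of how these options might be combined for a slow site behind a proxy. It assumes the parameters are accepted as keyword arguments, which the table implies but does not state. `SLOW_SITE_OPTIONS` and `call_with_options` are hypothetical helpers, the values are illustrative, and the fetch function is injected so the example can run with a stand-in.

```python
from typing import Any, Callable

# Illustrative values only; parameter names come from the table above.
SLOW_SITE_OPTIONS: dict[str, Any] = {
    "include_links": False,   # drop hyperlinks from the markdown
    "include_images": True,   # keep image references
    "timeout": 60,            # double the default of 30
    "headers": {"Cookie": "session=PLACEHOLDER"},
    "proxy": {"server": "http://proxy:8080"},
}

def call_with_options(fetch: Callable[..., str], url: str, **overrides: Any) -> str:
    """Call `fetch` with the preset options, letting callers override any of them."""
    options = {**SLOW_SITE_OPTIONS, **overrides}
    return fetch(url, **options)

# Stand-in that reports what it was called with.
def fake_fetch(url: str, **kwargs: Any) -> str:
    return f"{url} timeout={kwargs['timeout']} links={kwargs['include_links']}"

print(call_with_options(fake_fetch, "https://example.com", timeout=90))
# https://example.com timeout=90 links=False
```

In real use, `fetch_markdown` would presumably take the place of `fake_fetch`.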

Error handling

from stealthfetch import fetch_markdown, StealthFetchError, FetchError, ExtractionError

url = "https://example.com"
try:
    md = fetch_markdown(url)
except FetchError:
    # Could not reach the URL (network, blocked, timeout)
    md = None
except ExtractionError:
    # Page fetched but no main content found
    md = None

BrowserNotAvailable is raised when browser mode is needed but no backend is installed.
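Since `FetchError` covers transient conditions such as timeouts, a caller may want to retry before giving up. The following is a minimal sketch of that pattern, not part of StealthFetch: `fetch_with_retry`, `retry_on`, `attempts`, and `base_delay` are hypothetical, and both the fetch function and the exception types to retry on are passed in (presumably `fetch_markdown` and `FetchError` in real use) so the example runs with stand-ins.

```python
import time
from typing import Callable

def fetch_with_retry(
    fetch: Callable[[str], str],
    url: str,
    retry_on: tuple[type[Exception], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> str:
    """Call fetch(url), retrying with exponential backoff on the given errors."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")

# Stand-ins: an error type and a fetcher that fails twice, then succeeds.
class FakeFetchError(Exception):
    pass

calls = {"count": 0}

def flaky_fetch(url: str) -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeFetchError("temporary failure")
    return "# Example Domain"

md = fetch_with_retry(
    flaky_fetch, "https://example.com", (FakeFetchError,), base_delay=0.01
)
print(calls["count"], md)  # 3 # Example Domain
```

Errors such as `ExtractionError` are unlikely to be fixed by retrying, so leaving them out of `retry_on` lets them propagate immediately.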

Full API reference

See reference.md for complete function signatures, all parameters, and the FetchResult dataclass.