firecrawl

star 12

Web scraping and crawling for AI agents via Firecrawl MCP. Scrape URLs to markdown, crawl websites, search the web, and extract structured data. Supports cloud API and self-hosted deployments including CRW (Rust alternative).

elasticdotventures By elasticdotventures schedule Updated 5/4/2026

name: firecrawl description: | Web scraping and crawling for AI agents via Firecrawl MCP. Scrape URLs to markdown, crawl websites, search the web, and extract structured data. Supports cloud API and self-hosted deployments including CRW (Rust alternative). version: 1.0.0 tags: [mcp, web, scraping, firecrawl, crawl, markdown, data-extraction] applies_to: [web scraping, URL to markdown, website crawling, structured data extraction, AI research] output_types: [.md, .json] allowed-tools: Read, Write, Edit, Grep, Glob, Bash, mcp_firecrawl_*

What This Skill Does

Enables AI agents to interact with web content through the Firecrawl MCP server. Core capabilities:

  • Scrape: Convert any URL to clean markdown, HTML, or structured JSON
  • Crawl: Multi-page extraction with BFS traversal
  • Search: Web search with optional content scraping
  • Map: Discover all URLs on a website
  • Extract: LLM-powered structured data extraction
  • Agent: Autonomous multi-source research

When It Activates

Activate this skill when you see phrases like:

  • "scrape this website" or "get content from this URL"
  • "crawl the documentation" or "extract all pages from..."
  • "search the web for..." with content extraction
  • "find all URLs on this site"
  • "extract structured data from this page"
  • "what does this website say about..."
  • "gather information from multiple pages"

Installation

Option A: Cloud (Fastest)

# Get API key from firecrawl.dev/app/api-keys
# Add to MCP config:
b00t mcp add firecrawl -- npx -y firecrawl-mcp
# Set env: FIRECRAWL_API_KEY=fc-YOUR_KEY

Option B: Self-Hosted (CRW - Recommended)

# Single binary, 6MB RAM, no server needed
curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | CRW_BINARY=crw sh

# Add to MCP config:
b00t mcp add crw -- npx crw-mcp
# No API key required in embedded mode

Available Tools

Core Scraping

Tool Best For Returns
firecrawl_scrape Single URL extraction markdown/JSON/HTML
firecrawl_batch_scrape Multiple known URLs markdown[]
firecrawl_map Discover URLs on site URL[]

Web Discovery

Tool Best For Returns
firecrawl_search Find info across web results[]
firecrawl_crawl Multi-page extraction markdown[]

Advanced

Tool Best For Returns
firecrawl_extract Structured data extraction JSON (schema-defined)
firecrawl_agent Autonomous research JSON (async)
firecrawl_interact Click/navigate pages execution result

Usage Patterns

Quick Scrape (Markdown)

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://docs.example.com/api",
    "formats": ["markdown"],
    "onlyMainContent": true
  }
}

Structured Extraction (JSON)

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com/product",
    "formats": [{
      "type": "json",
      "prompt": "Extract product details",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "price": {"type": "number"},
          "inStock": {"type": "boolean"}
        }
      }
    }]
  }
}

Web Search with Content

{
  "name": "firecrawl_search",
  "arguments": {
    "query": "best practices for async Rust 2025",
    "limit": 5,
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}

Site Crawling

{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://docs.example.com/*",
    "maxDepth": 2,
    "limit": 50
  }
}

Decision Guide

Know exact URL? → scrape (single) or batch_scrape (multiple)
Need to find URLs? → search (web) or map (site discovery)
Need all pages? → crawl (with limits!)
Want specific data? → scrape with JSON format + schema
Complex research? → agent (async, poll for results)

Self-Hosted Options

CRW (Recommended)

Metric CRW Firecrawl Self-Host
RAM 6 MB 4 GB+
Containers 0 5+
Cold Start 85ms 30-60s
Setup Single binary Docker Compose
# CRW embedded mode - no server, no config
npx crw-mcp

# CRW cloud mode - adds web search
CRW_API_URL=https://fastcrw.com/api CRW_API_KEY=xxx npx crw-mcp

Firecrawl Self-Hosted

git clone https://github.com/firecrawl/firecrawl
cd firecrawl
docker compose up -d

# MCP config:
FIRECRAWL_API_URL=http://localhost:3002 npx -y firecrawl-mcp

Important Notes

Format Selection

  • JSON format (preferred): Use with schema to extract only needed data
  • Markdown format: Only when full content is required (articles, summaries)

Token Management

  • Use onlyMainContent: true to skip navigation/footer
  • Set limit on crawls to avoid context overflow
  • Use JSON extraction instead of full markdown when possible

Rate Limits

  • Built-in exponential backoff (configurable)
  • Batch operations are async - poll for status

Troubleshooting

"API key required" - Set FIRECRAWL_API_KEY env var or use CRW in embedded mode

"Rate limited" - Increase FIRECRAWL_RETRY_MAX_ATTEMPTS, wait and retry

"Context too large" - Use JSON format with schema, reduce crawl depth

"Self-hosted connection refused" - Verify Docker containers are running, check FIRECRAWL_API_URL

Related Skills

  • context7: Live library documentation
  • browser-use: Interactive browser automation
  • playwright: UI testing and scraping

References

Install via CLI
npx skills add https://github.com/elasticdotventures/_b00t_ --skill firecrawl
Repository Details
star Stars 12
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator
elasticdotventures
elasticdotventures Explore all skills →