name: obul-zyte
description: "USE THIS SKILL WHEN: the user needs to scrape a website that blocks bots, requires JavaScript rendering, or needs anti-bot bypass. Provides heavy-duty web scraping via Zyte API through the Obul proxy."
homepage: https://www.zyte.com/zyte-api
metadata:
  obul-skill:
    emoji: "🛡️"
registries: {}
Zyte
Zyte API provides pay-per-use web scraping with anti-bot protection bypass, full browser rendering, automatic CAPTCHA solving, and AI-powered structured data extraction. Through the Obul proxy, each request is paid individually; no Zyte account or API key is required. Use Zyte when other scrapers fail due to bot detection.
Authentication
All requests use the obulx CLI, which handles proxy routing and authentication automatically.
Install and log in (one-time setup):
npm install -g @obul.ai/obulx
obulx login
Base URL: https://api-x402.zyte.com
Common Operations
Basic HTTP Scrape
Fetch the raw HTTP response body from a URL. This is the fastest and cheapest option because it skips browser rendering. The response body is returned Base64-encoded.
Pricing: $0.08 per 1000 URLs (~$0.00008 per request)
obulx -X POST -H "Content-Type: application/json" \
-d '{"url": "https://example.com/page", "httpResponseBody": true}' \
"https://api-x402.zyte.com/v1/extract"
Response: JSON object with the URL, status code, and httpResponseBody as a Base64-encoded string. Decode the
Base64 to get the raw HTML content.
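The decoding step can be sketched in Python as follows. This is a minimal illustration, not part of the obulx tooling: the function name `decode_http_response_body` is hypothetical, the sample response is made up, and UTF-8 is an assumed default that may not match every page.

```python
import base64
import json


def decode_http_response_body(response_text: str, encoding: str = "utf-8") -> str:
    """Return the decoded page body from a /v1/extract JSON response."""
    response = json.loads(response_text)
    raw_bytes = base64.b64decode(response["httpResponseBody"])
    # errors="replace" avoids a crash if the page is not really in `encoding`.
    return raw_bytes.decode(encoding, errors="replace")


# Made-up sample; a real response would come from the obulx command above.
sample = json.dumps({
    "url": "https://example.com/page",
    "httpResponseBody": base64.b64encode(b"<html><body>Hello</body></html>").decode("ascii"),
})
print(decode_http_response_body(sample))
```

Passing the encoding explicitly is a deliberate choice here, since the raw body is bytes and the correct charset depends on the target site.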
Browser-Rendered Scrape
Render the page in a full browser with JavaScript execution. Use this for SPAs, dynamic pages, and sites that load content via JavaScript.
Pricing: $0.28 per 1000 URLs (~$0.00028 per request)
obulx -X POST -H "Content-Type: application/json" \
-d '{"url": "https://example.com/spa-page", "browserHtml": true}' \
"https://api-x402.zyte.com/v1/extract"
Response: JSON object with the URL, status code, and browserHtml containing the fully rendered HTML including
JavaScript-generated content.
Screenshot Capture
Capture a screenshot of the rendered page. Requires browser rendering. Useful for visual verification or archiving.
Pricing: $0.28 per 1000 URLs (~$0.00028 per request)
obulx -X POST -H "Content-Type: application/json" \
-d '{"url": "https://example.com/page", "screenshot": true}' \
"https://api-x402.zyte.com/v1/extract"
Response: JSON object with a Base64-encoded screenshot image of the rendered page.
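Saving the image to disk might look like the sketch below. It assumes the response carries the image under a `screenshot` key (mirroring the request flag); the helper name `save_screenshot`, the file name, and the sample bytes are all hypothetical.

```python
import base64
import json
import tempfile
from pathlib import Path


def save_screenshot(response_text: str, out_path: Path) -> int:
    """Decode the Base64 screenshot, write it to out_path, return bytes written."""
    response = json.loads(response_text)
    # Assumption: the image is stored under the "screenshot" key.
    image_bytes = base64.b64decode(response["screenshot"])
    out_path.write_bytes(image_bytes)
    return len(image_bytes)


# Made-up stand-in for a real response; these bytes are not a valid image.
fake_image = b"not-a-real-image"
sample = json.dumps({"screenshot": base64.b64encode(fake_image).decode("ascii")})

with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "page-screenshot.bin"
    written = save_screenshot(sample, target)
    print(f"wrote {written} bytes to {target.name}")
```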
Product Data Extraction
Use Zyte's AI to automatically extract structured product data (name, price, currency, description, images) from e-commerce pages.
Pricing: $0.76 per 1000 URLs (~$0.00076 per request)
obulx -X POST -H "Content-Type: application/json" \
-d '{"url": "https://example.com/product/123", "product": true, "productOptions": {"extractFrom": "browserHtml"}}' \
"https://api-x402.zyte.com/v1/extract"
Response: JSON object with a product field containing structured data: name, price, currency,
description, images, and other product attributes.
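Reading the structured fields could be sketched like this. The helper `summarize_product` and the sample response are hypothetical, and the sketch assumes the price may arrive as either a string or a number, so it normalizes through `str` before converting to `Decimal`.

```python
import json
from decimal import Decimal


def summarize_product(response_text: str) -> dict:
    """Pick name, price, and currency out of the product field, if present."""
    product = json.loads(response_text).get("product", {})
    price = product.get("price")
    return {
        "name": product.get("name"),
        # Decimal avoids float rounding when prices are compared or summed.
        "price": Decimal(str(price)) if price is not None else None,
        "currency": product.get("currency"),
    }


# Made-up sample response for illustration only.
sample = json.dumps({
    "url": "https://example.com/product/123",
    "product": {"name": "Example Widget", "price": "19.99", "currency": "USD"},
})
print(summarize_product(sample))
```

Using `.get` throughout means a page where extraction found nothing yields `None` values instead of raising.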
Article Data Extraction
Use Zyte's AI to automatically extract structured article data (title, body, author, date) from news and blog pages.
Pricing: $0.76 per 1000 URLs (~$0.00076 per request)
obulx -X POST -H "Content-Type: application/json" \
-d '{"url": "https://example.com/blog/article", "article": true}' \
"https://api-x402.zyte.com/v1/extract"
Response: JSON object with an article field containing structured data: headline, articleBody, author,
datePublished, and other article attributes.
Endpoint Pricing Reference
| Endpoint | Price per 1000 URLs | Per-Request | Purpose |
|---|---|---|---|
| POST /v1/extract (HTTP) | $0.08 | ~$0.00008 | Raw HTTP response body (Base64) |
| POST /v1/extract (browser) | $0.28 | ~$0.00028 | Browser-rendered HTML or screenshot |
| POST /v1/extract (product) | $0.76 | ~$0.00076 | AI-powered product data extraction |
| POST /v1/extract (article) | $0.76 | ~$0.00076 | AI-powered article data extraction |
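To budget a batch, the per-1000 prices in the table can be turned into a small calculator. This is only an arithmetic sketch: the mode labels (`http`, `browser`, `product`, `article`) and the function name `estimate_cost` are made up for illustration, and the prices are copied from the table rather than fetched from any API.

```python
from decimal import Decimal

# Prices per 1000 URLs, copied from the pricing table (USD).
PRICE_PER_1000_URLS = {
    "http": Decimal("0.08"),
    "browser": Decimal("0.28"),
    "product": Decimal("0.76"),
    "article": Decimal("0.76"),
}


def estimate_cost(mode: str, url_count: int) -> Decimal:
    """Estimated cost in USD for scraping url_count URLs in the given mode."""
    return PRICE_PER_1000_URLS[mode] * url_count / 1000


for mode in PRICE_PER_1000_URLS:
    print(mode, estimate_cost(mode, 2500))
```

For example, 2,500 browser-rendered pages work out to 0.28 × 2.5 = $0.70, against $0.20 for the same pages fetched over plain HTTP.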
When to Use
- Anti-bot bypass: Target site uses Cloudflare, Akamai, DataDome, or other bot detection that blocks normal scrapers.
- JavaScript rendering: Page content loads via JavaScript (SPAs, React/Vue apps, infinite scroll).
- Structured extraction: Need product or article data extracted automatically by AI without writing custom parsers.
- Geo-targeted scraping: Need to scrape content as seen from a specific country or region.
- Fallback scraper: When Firecrawl or SimpleScraper fail on a URL, Zyte is the heavy-duty fallback.
Best Practices
- Start with httpResponseBody: It is the cheapest option. Only use browserHtml if the page requires JavaScript rendering.
- Use product/article extraction for structured data: Zyte's AI extraction is more reliable than parsing HTML yourself for common page types.
- Decode Base64 responses: The httpResponseBody field is Base64-encoded. Always decode it before processing.
- Use device emulation: Set "device": "mobile" to see mobile-specific content or bypass desktop-only bot checks.
- Combine features: You can request both browserHtml and screenshot in a single request to get rendered content and a visual snapshot together.
- Prefer cheaper scrapers first: Zyte is the most powerful but also the most expensive. Use Firecrawl or SimpleScraper for sites that do not block scrapers.
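Several of these practices come down to how the request body is assembled. The sketch below builds the JSON string passed to `-d`; the helper `build_extract_payload` and its keyword arguments are hypothetical, while the JSON keys are the ones used elsewhere in this document. It defaults to the cheap httpResponseBody output and only switches to browser outputs when asked.

```python
import json
from typing import Optional


def build_extract_payload(url: str, *, browser: bool = False,
                          screenshot: bool = False,
                          device: Optional[str] = None) -> str:
    """Build a /v1/extract request body, preferring the cheapest output."""
    payload = {"url": url}
    if browser:
        payload["browserHtml"] = True
    if screenshot:
        payload["screenshot"] = True
    if not (browser or screenshot):
        # No browser output requested: fall back to the raw HTTP body.
        payload["httpResponseBody"] = True
    if device is not None:
        payload["device"] = device
    return json.dumps(payload)


print(build_extract_payload("https://example.com/page"))
print(build_extract_payload("https://example.com/spa-page",
                            browser=True, screenshot=True, device="mobile"))
```

Each printed string can be pasted into the `-d '...'` argument of the obulx commands shown earlier.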
Error Handling
| Error | Cause | Solution |
|---|---|---|
| 402 Payment Required | Payment not processed or insufficient | Verify your account has sufficient balance at my.obul.ai. Run obulx login if not authenticated. |
| 400 Bad Request | Missing or invalid request body | Ensure url is present and at least one output flag is set (e.g., httpResponseBody). |
| 422 Unprocessable Entity | URL cannot be scraped | The target URL may be down or completely blocking all access. Try a different approach. |
| 429 Too Many Requests | Rate limit exceeded | Add a short delay between requests and avoid rapid-fire calls. |
| 500 Internal Server Error | Upstream Zyte service issue | Wait a few seconds and retry. If persistent, the service may be experiencing downtime. |
| 520 Anti-Bot Bypass Failed | Bot detection could not be bypassed | Try with browserHtml instead of httpResponseBody, or add a geolocation parameter. |
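The "wait and retry" advice for 429 and 500 can be sketched as an exponential backoff loop. Everything here is illustrative: `send` is a hypothetical callable that performs one request (for instance by invoking obulx) and returns a `(status, body)` pair, and the attempt count and delays are arbitrary defaults rather than documented limits.

```python
import time
from typing import Callable, Tuple

# Statuses the table above suggests retrying after a pause.
RETRYABLE_STATUSES = {429, 500}


def call_with_retries(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return status, body


# Demo with a stub in place of a real request, and no actual sleeping.
canned = iter([(429, ""), (500, ""), (200, "ok")])
print(call_with_retries(lambda: next(canned), sleep=lambda seconds: None))
```

Other statuses such as 400, 402, and 422 are returned immediately, since retrying the same request would not change the outcome.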