name: duckduckgo description: Searches DuckDuckGo via its HTML endpoint using scrapling. Returns results as markdown (default), raw HTML, compact JSON, or YAML. Use when performing web searches from within an agent workflow without requiring a search API key or external service.
DuckDuckGo Search
Overview
Searches DuckDuckGo via its HTML endpoint (html.duckduckgo.com/html/) using scrapling for fetching and parsing. Supports four output formats — all sourced from the same HTML API:
- Markdown (default) — clean, AI-targeted markdown from scrapling
- HTML — raw
.web-resultHTML elements - JSON — compact array of
{title, url, domain, snippet}objects - YAML — structured list of result mappings
No API key or external search service required. Uses uvx 'scrapling[shell]' for ephemeral execution.
When to Use
- Performing web searches from within an agent workflow
- Gathering search result snippets, titles, and URLs programmatically
- Research tasks that need structured search output (JSON/YAML) for downstream processing
- Situations where Google/Bing APIs are unavailable or require authentication
Usage Examples
Markdown (default)
Run search.sh with a query — no format flag needed:
bash scripts/search.sh "tangled group"
Output is clean markdown produced by scrapling's --ai-targeted mode, streamed directly to stdout.
HTML
Request raw HTML when you need the full DOM structure:
bash scripts/search.sh "tangled group" --format html
Returns .web-result divs with all CSS classes, links, and snippets intact.
JSON
Get compact JSON for programmatic processing:
bash scripts/search.sh "tangled group" --format json
Output:
[{"title":"Tangled Group, Inc.","url":"https://tangledgroup.com/","domain":"tangledgroup.com","snippet":"We are a software development company..."},...]
Each result object contains only fields present in the HTML: title, url, domain, snippet. Missing fields are omitted.
YAML
Get structured YAML output:
bash scripts/search.sh "tangled group" --format yaml
Output:
- title: 'Tangled Group, Inc.'
url: 'https://tangledgroup.com/'
domain: 'tangledgroup.com'
snippet: |
We are a software development company...
Core Concepts
Output Pipeline
All four formats use the same source — DuckDuckGo's HTML endpoint:
search.shconstructs the URL:https://html.duckduckgo.com/html/?q=<encoded-query>uvx 'scrapling[shell]' extract getfetches and isolates.web-resultelements- For markdown/json/yaml: scrapling's
--ai-targetedflag produces clean, AI-friendly output - For html: raw HTML passed through without
--ai-targetedto preserve original DOM structure - For json/yaml: HTML saved to a temp file, then
format.pyparses it using Python's built-inhtml.parser
Temp File Handling
All formats create temporary files in /tmp/ using mktemp with PID-based prefixes (ddg-search-${$}-XXXXXX) to ensure uniqueness under parallel execution. Files are automatically cleaned up on exit via a bash trap handler (covers normal exit, SIGINT, SIGTERM).
Query Encoding
Query is passed to Python via stdin (not shell argument expansion) to prevent shell injection. urllib.parse.quote URL-encodes the query. Multi-word queries are handled correctly — "tangled group" becomes tangled%20group.
Dependencies
- bash — shell scripting
- uvx — ephemeral Python package execution (runs
scrapling[shell]) - python3 — query encoding and HTML-to-JSON/YAML formatting (builtins only:
html.parser,json,sys,urllib.parse,re)