name: aii_owid_datasets description: Search and load datasets from Our World in Data catalog using BM25 search. Returns actual table data (default limit 3) with metadata, preview, and mini dataset. Use for global statistics on energy, health, COVID-19, economics, environment, demographics.
Contents
- Workflow (2-phase table discovery process)
- Scripts (Search, Download with full parameters)
IMPORTANT - Parallel execution: GNU parallel subshells do NOT inherit source activate. Use export for variables and single-quoted command templates so parallel's subshells can resolve them:
export PY="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/.venv/bin/python"
Workflow: 2-Phase Table Discovery
Phase 1: Search for Tables
Find tables with metadata (title, description, variables)
SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
$SKILL_DIR/scripts/.venv/bin/python $SKILL_DIR/scripts/aii_owid_search_datasets.py "renewable energy" --limit 5
Phase 2: Download Table (if suitable)
Download the table after reviewing the search results
SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
$SKILL_DIR/scripts/.venv/bin/python $SKILL_DIR/scripts/aii_owid_download_datasets.py "grapher/energy/2023-12-12/energy_mix"
Scripts
Search OWID tables (aii_owid_search_datasets.py)
Example input:
SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
$SKILL_DIR/scripts/.venv/bin/python $SKILL_DIR/scripts/aii_owid_search_datasets.py "climate change" --limit 3
Parallel execution (multiple queries):
IMPORTANT: When running multiple searches, use GNU parallel instead of separate Bash tool calls:
export PY="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/.venv/bin/python" && \
export S="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/aii_owid_search_datasets.py" && \
parallel -j 50 -k --group --will-cite '$PY $S {} --limit 3' ::: 'renewable energy' 'climate change' 'covid mortality'
Example output:
Found 3 OWID tables for 'climate change':
[1] Climate Change Impacts
Path: grapher/climate/2023-10-15/climate_impacts
Description: Global temperature anomalies and sea level rise...
Variables (42 total):
- Global temperature anomaly (°C): Annual global mean temperature anomaly
- Sea level rise (mm): Global mean sea level change
- Atmospheric CO2 concentration (ppm): Monthly CO2 concentration at Mauna Loa
- Arctic sea ice extent (million km²): Monthly Arctic sea ice extent
...
Parameters:
query (required, positional)
- Search query string
- Examples:
"covid","energy mix","climate change"
--limit (optional)
- Number of search results to return (default: 3)
- Higher values = more results to choose from
Tips:
- Search is fast (uses pre-built BM25 index, no network required)
- Returns metadata only - no data is downloaded
- Use the
pathfield from results to download specific tables - BM25 search ranks by relevance across table titles, descriptions, and variable metadata
- Search returns tables from all channels (garden=highest quality, meadow=raw, backport=legacy, open_numbers=Gapminder)
Download OWID table (aii_owid_download_datasets.py)
Download a table by path (from search results) and save to files.
Example input:
SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
$SKILL_DIR/scripts/.venv/bin/python $SKILL_DIR/scripts/aii_owid_download_datasets.py "grapher/energy/2023-12-12/energy_mix"
Parallel execution (multiple tables):
IMPORTANT: When downloading multiple tables, use GNU parallel instead of separate Bash tool calls:
export PY="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/.venv/bin/python" && \
export S="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/aii_owid_download_datasets.py" && \
parallel -j 50 -k --group --will-cite '$PY $S {}' ::: 'grapher/energy/2023-12-12/energy_mix' 'grapher/demography/2023-10-10/population' 'grapher/health/2023-08-01/life_expectancy'
Example output:
Downloaded OWID table: grapher/energy/2023-12-12/energy_mix
Dimensions: 15,420 rows x 12 columns
Columns: country, year, coal, oil, gas, nuclear, hydro, solar, wind, biofuels...
Files saved:
Mini (READ THIS for development/testing): /path/to/mini_grapher_energy_2023-12-12_energy_mix.json
Preview (DO NOT READ - for logging only): /path/to/preview_grapher_energy_2023-12-12_energy_mix.json
Full (DO NOT READ - for scripts only): /path/to/full_grapher_energy_2023-12-12_energy_mix.json
Sample data (first 3 rows):
Row 1:
country: Afghanistan
year: 2000
coal: 0.5
...
Parameters:
path (required, positional)
- Table path from search results
- Examples:
"grapher/energy/2023-12-12/energy_mix","garden/demography/2023-10-10/population"
Output files (auto-saved to temp/tables/):
- Mini:
mini_{path}.json- 3 full rows - READ THIS for development/testing - Preview:
preview_{path}.json- 3 truncated rows - DO NOT READ directly - for code you write to read - Full:
full_{path}.json- All rows - DO NOT READ directly - for code you write to read
Tips:
- Critical: Only read the mini file directly with Read tool. Preview and full are input paths for code you write
- Use the
pathfrom search results to download specific tables - Downloads directly from OWID catalog (network required)
- Files always saved to
temp/tables/(path included in response)