aii-owid-datasets


Search the Our World in Data (OWID) catalog with BM25, then download tables by path. Search returns table metadata (3 results by default); download saves the table as mini, preview, and full JSON files. Use for global statistics on energy, health, COVID-19, economics, environment, demographics.

By AMGrobelnik. Updated 2/28/2026

name: aii_owid_datasets
description: Search and load datasets from Our World in Data catalog using BM25 search. Returns actual table data (default limit 3) with metadata, preview, and mini dataset. Use for global statistics on energy, health, COVID-19, economics, environment, demographics.

Contents

  • Workflow (2-phase table discovery process)
  • Scripts (Search, Download with full parameters)

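IMPORTANT - Parallel execution: shells spawned by GNU parallel do NOT inherit an environment activated with source activate. Export every variable the command needs, and single-quote the command template so that the variables are expanded inside parallel's subshells. Point PY at the venv interpreter by absolute path: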

export PY="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/.venv/bin/python"
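The mechanism can be tried without GNU parallel. The following is a minimal Python sketch, not part of the skill: it starts a child shell with and without a PY variable in its environment and lets the child expand "$PY" itself, which is roughly what a single-quoted template defers to the subshell. The interpreter path is a hypothetical placeholder.

```python
import os
import subprocess

# Hypothetical interpreter path, for illustration only.
FAKE_PY = "/hypothetical/.venv/bin/python"


def child_sees(env: dict) -> str:
    """Ask a child shell to expand $PY and return what it printed."""
    # The template reaches the child unexpanded, like a single-quoted string in Bash.
    result = subprocess.run(
        ["sh", "-c", 'printf "%s" "$PY"'],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout


# Start from the current environment, making sure PY is not already set.
base_env = {k: v for k, v in os.environ.items() if k != "PY"}

not_exported = child_sees(base_env)                 # PY absent from the child's environment
exported = child_sees({**base_env, "PY": FAKE_PY})  # PY present, as after `export PY=...`

print(repr(not_exported))
print(repr(exported))
```

The child only resolves the variable in the second call, which is why the commands in this document export PY and S before invoking parallel.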

Workflow: 2-Phase Table Discovery

Phase 1: Search for Tables

Find tables and review their metadata (title, description, variables):

SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
"$SKILL_DIR/scripts/.venv/bin/python" "$SKILL_DIR/scripts/aii_owid_search_datasets.py" "renewable energy" --limit 5

Phase 2: Download Table (if suitable)

Download a table once the search results show it is suitable:

SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
"$SKILL_DIR/scripts/.venv/bin/python" "$SKILL_DIR/scripts/aii_owid_download_datasets.py" "grapher/energy/2023-12-12/energy_mix"

Scripts

Search OWID tables (aii_owid_search_datasets.py)

Example input:

SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
"$SKILL_DIR/scripts/.venv/bin/python" "$SKILL_DIR/scripts/aii_owid_search_datasets.py" "climate change" --limit 3

Parallel execution (multiple queries):

IMPORTANT: When running multiple searches, use GNU parallel instead of separate Bash tool calls:

export PY="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/.venv/bin/python" && \
export S="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/aii_owid_search_datasets.py" && \
parallel -j 50 -k --group --will-cite '"$PY" "$S" {} --limit 3' ::: 'renewable energy' 'climate change' 'covid mortality'

Example output:

Found 3 OWID tables for 'climate change':

[1] Climate Change Impacts
    Path: grapher/climate/2023-10-15/climate_impacts
    Description: Global temperature anomalies and sea level rise...
    Variables (42 total):
      - Global temperature anomaly (°C): Annual global mean temperature anomaly
      - Sea level rise (mm): Global mean sea level change
      - Atmospheric CO2 concentration (ppm): Monthly CO2 concentration at Mauna Loa
      - Arctic sea ice extent (million km²): Monthly Arctic sea ice extent
      ...
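
Output in this shape can be post-processed to collect the paths for the download step. The following is a minimal sketch, assuming the "Path:" line format shown in the example output stays stable; the function name extract_paths is hypothetical and not part of the skill.

```python
import re

# Sample text copied from the example output (truncated).
SAMPLE_OUTPUT = """\
Found 3 OWID tables for 'climate change':

[1] Climate Change Impacts
    Path: grapher/climate/2023-10-15/climate_impacts
    Description: Global temperature anomalies and sea level rise...
"""

# Matches an indented "Path: <value>" line and captures the value.
PATH_LINE = re.compile(r"^[ \t]*Path:[ \t]*(\S+)", re.MULTILINE)


def extract_paths(search_output: str) -> list[str]:
    """Return every table path listed on a 'Path:' line, in order of appearance."""
    return PATH_LINE.findall(search_output)


print(extract_paths(SAMPLE_OUTPUT))
```

Each returned string can be passed unchanged as the positional path argument of the download script.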

Parameters:

query (required, positional)

  • Search query string
  • Examples: "covid", "energy mix", "climate change"

--limit (optional)

  • Number of search results to return (default: 3)
  • Higher values = more results to choose from

Tips:

  • Search is fast (uses pre-built BM25 index, no network required)
  • Returns metadata only - no data is downloaded
  • Use the path field from results to download specific tables
  • BM25 search ranks by relevance across table titles, descriptions, and variable metadata
  • Search returns tables from all channels (garden=highest quality, meadow=raw, backport=legacy, open_numbers=Gapminder)
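
To give a feel for how BM25 ranking behaves, here is a small self-contained sketch of a common Okapi BM25 variant. It is not the skill's implementation: the tokenizer (lowercase whitespace split), the parameters k1 and b, and the toy table texts are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with a common Okapi BM25 variant."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF; the "+ 1" keeps it positive even for very common terms.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical table texts (title, description, and variable names joined together).
tables = [
    "energy mix coal oil gas nuclear hydro solar wind",
    "population by country and year",
    "renewable energy share solar wind hydro",
]
scores = bm25_scores("renewable energy", tables)
ranking = sorted(range(len(tables)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy corpus the third text ranks first because it matches both query terms, and the rarer term ("renewable") contributes more than the common one ("energy"). Queries built from distinctive words in a table's title or variable names therefore tend to surface that table.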

Download OWID table (aii_owid_download_datasets.py)

Download a table by path (from search results) and save to files.

Example input:

SKILL_DIR="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets" && \
"$SKILL_DIR/scripts/.venv/bin/python" "$SKILL_DIR/scripts/aii_owid_download_datasets.py" "grapher/energy/2023-12-12/energy_mix"

Parallel execution (multiple tables):

IMPORTANT: When downloading multiple tables, use GNU parallel instead of separate Bash tool calls:

export PY="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/.venv/bin/python" && \
export S="$(git rev-parse --show-toplevel)/.claude/skills/aii_owid_datasets/scripts/aii_owid_download_datasets.py" && \
parallel -j 50 -k --group --will-cite '"$PY" "$S" {}' ::: 'grapher/energy/2023-12-12/energy_mix' 'grapher/demography/2023-10-10/population' 'grapher/health/2023-08-01/life_expectancy'

Example output:

Downloaded OWID table: grapher/energy/2023-12-12/energy_mix

Dimensions: 15,420 rows x 12 columns
Columns: country, year, coal, oil, gas, nuclear, hydro, solar, wind, biofuels...

Files saved:
  Mini (READ THIS for development/testing): /path/to/mini_grapher_energy_2023-12-12_energy_mix.json
  Preview (DO NOT READ - for logging only): /path/to/preview_grapher_energy_2023-12-12_energy_mix.json
  Full (DO NOT READ - for scripts only):    /path/to/full_grapher_energy_2023-12-12_energy_mix.json

Sample data (first 3 rows):
  Row 1:
    country: Afghanistan
    year: 2000
    coal: 0.5
    ...

Parameters:

path (required, positional)

  • Table path from search results
  • Examples: "grapher/energy/2023-12-12/energy_mix", "garden/demography/2023-10-10/population"

Output files (auto-saved to temp/tables/; in file names, {path} is the table path with each / replaced by _, as in the example output above):

  1. Mini: mini_{path}.json - 3 full rows - READ THIS for development/testing
  2. Preview: preview_{path}.json - 3 truncated rows - DO NOT READ directly - for code you write to read
  3. Full: full_{path}.json - All rows - DO NOT READ directly - for code you write to read
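
The naming pattern can be reproduced in code when a script needs to locate the files. This is a sketch under assumptions: the slash-to-underscore flattening is inferred from the example output, the helper name output_files is hypothetical, and temp/tables is treated as relative to the working directory. The paths printed by the download script remain the authoritative source.

```python
from pathlib import Path


def output_files(table_path: str, out_dir: str = "temp/tables") -> dict[str, Path]:
    """Guess the three output file locations for a table path."""
    # Assumption: "/" becomes "_" in file names, as in the example output.
    flat = table_path.replace("/", "_")
    return {
        kind: Path(out_dir) / f"{kind}_{flat}.json"
        for kind in ("mini", "preview", "full")
    }


files = output_files("grapher/energy/2023-12-12/energy_mix")
print(files["mini"].name)
```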

Tips:

  • Critical: Read only the mini file directly with the Read tool. The preview and full files are input paths for code you write.
  • Use the path from search results to download specific tables
  • Downloads directly from OWID catalog (network required)
  • Files are always saved to temp/tables/ (the path is included in the response)
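
One way to work with the full file from code, without printing it in its entirety, is to load it and report only a compact summary. The sketch below is illustrative: the JSON layout of the full file is not documented here, so the function inspects the top-level type rather than assuming one, and the demo file (a list of row objects) is a stand-in built from the sample row above.

```python
import json
import tempfile
from pathlib import Path


def summarize_json(path: str, sample_size: int = 2) -> dict:
    """Load a JSON file and return a compact summary instead of its full contents."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    summary = {"type": type(data).__name__}
    if isinstance(data, list):
        summary["length"] = len(data)
        summary["sample"] = data[:sample_size]
    elif isinstance(data, dict):
        summary["keys"] = sorted(data)[:20]
    return summary


# Demo on a stand-in file; a real call would pass the "Full" path
# printed by the download script.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "full_demo.json"
    rows = [{"country": "Afghanistan", "year": 2000, "coal": 0.5}] * 5
    demo.write_text(json.dumps(rows), encoding="utf-8")
    summary = summarize_json(str(demo))

print(summary["type"], summary["length"])
```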

Install via CLI

npx skills add https://github.com/AMGrobelnik/ai-inventor-old3 --skill aii-owid-datasets