name: map-explorer-to-mdim
description: >-
  Suggest a redirect mapping from a (soon-to-sunset) OWID explorer's views to the
  views of one or more replacement MDIMs. Pulls explorer views and MDIM views from
  the grapher DB, writes a CSV per source/target plus a wide joint proposal that
  routes each explorer view to a target MDIM view, and flags when several explorer
  views land on the same MDIM view. Trigger when the user says "map explorer
  to mdim(s) <...>", "suggest explorer->MDIM redirects", "we're sunsetting the
  explorer, map its views to the new multidims", or similar.
metadata:
  internal: true
Map an explorer's views to MDIM views (redirect proposal)
When an explorer is being retired in favour of one or more MDIMs, every explorer view needs a redirect to the equivalent MDIM view. This skill produces the input for that: a CSV of explorer views, a CSV per target MDIM, and a joint proposal mapping each explorer view to a target MDIM view (the suggestion is for human review).
The mapping itself is explorer-specific (how the explorer's dimensions translate to MDIM dimension slugs, and — when there are multiple MDIMs — which MDIM each view routes to). The skill automates everything mechanical (pulling views, the join, the shared-target accounting, validation) and leaves only the per-explorer rules for you to write, seeded with auto-suggested matches.
Inputs
- Explorer slug: matches explorers.slug in the grapher DB (e.g. natural-disasters).
- One or more MDIM catalogPaths: as stored in multi_dim_data_pages.catalogPath, e.g. natural_disasters/latest/deaths#deaths. The MDIMs must be published in the DB you connect to (their fully-expanded views are read from multi_dim_data_pages.config).
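A quick way to sanity-check a catalogPath before passing it to --mdim is to split it into its parts. The sketch below assumes the ns/v/short#short shape used in the workflow command; the helper name and the field names are illustrative and are not part of the skill's scripts.

```python
from typing import NamedTuple


class CatalogPath(NamedTuple):
    # Field names are labels chosen for this sketch.
    namespace: str
    version: str
    dataset: str
    mdim: str


def parse_catalog_path(catalog_path: str) -> CatalogPath:
    """Split 'ns/v/short#short' into its four parts; raise on any other shape."""
    path, sep, mdim = catalog_path.partition("#")
    parts = path.split("/")
    if not sep or not mdim or len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'ns/v/short#short', got {catalog_path!r}")
    return CatalogPath(*parts, mdim)


print(parse_catalog_path("natural_disasters/latest/deaths#deaths"))
```

An explorer slug such as natural-disasters fails this check, which helps catch the two inputs being swapped.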
DB access (confirm this before running)
Both the explorer and the MDIMs are read from the grapher DB via OWID_ENV, so the
scripts only work where that DB actually contains both the explorer and the published
MDIMs. There are three ways to point OWID_ENV at such a DB — figure out which one
applies before running, and don't assume .env.prod exists:
- Staging branch (often easiest): if you're on a staging-site-<branch> branch, OWID_ENV already points at that prod-clone DB, so run the commands as-is, no prefix.
- Production, read-only, via .env.prod: prefix commands with ENV_FILE=.env.prod DATA_API_ENV=production. Only if .env.prod is present.
- Some other credentials file: the user may keep prod (or other) DB creds in a different env file; run with ENV_FILE=<their file> [DATA_API_ENV=production].
Preflight — check, then ask if needed:
# Is .env.prod available?
ls -la .env.prod 2>/dev/null && echo "found .env.prod" || echo "NO .env.prod"
# Connectivity test (swap the ENV_FILE prefix for whatever applies; drop it on a staging branch):
ENV_FILE=.env.prod DATA_API_ENV=production .venv/bin/python -c \
"from etl.config import OWID_ENV; print('DB OK:', OWID_ENV.read_sql('SELECT 1 AS x').iloc[0,0])"
If .env.prod is missing and you're not on a staging branch with the data, stop
and ask the user which credentials / env file to use (e.g. "I don't see .env.prod —
which env file holds DB credentials that can reach the explorer + MDIMs? Or should I run
this from a staging branch?"). Then use that file as the ENV_FILE= prefix for both
script invocations below. Don't hardcode credentials.
If the connection works but a query returns nothing, the scripts stop with a clear message (explorer slug not found, or MDIM not published in this DB) — that means the DB you reached doesn't have it, so re-check which DB you're pointed at.
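The choice between the three options can be written down as a small decision function. This is only a sketch of the reasoning above: the function name and parameters are hypothetical, the order of checks follows the order of the list, and it does not verify that the chosen DB actually contains the explorer and the MDIMs.

```python
from typing import Optional


def env_prefix(
    branch: str,
    has_env_prod: bool,
    other_env_file: Optional[str] = None,
    other_is_production: bool = False,
) -> Optional[str]:
    """Return the shell prefix for the commands, '' for no prefix, or None to stop and ask."""
    if branch.startswith("staging-site-"):
        return ""  # OWID_ENV already points at the staging DB
    if has_env_prod:
        return "ENV_FILE=.env.prod DATA_API_ENV=production"
    if other_env_file:
        prefix = f"ENV_FILE={other_env_file}"
        # DATA_API_ENV=production is optional; add it only for prod credentials
        return f"{prefix} DATA_API_ENV=production" if other_is_production else prefix
    return None  # no usable credentials known: ask the user


print(repr(env_prefix("staging-site-my-branch", has_env_prod=False)))
print(repr(env_prefix("master", has_env_prod=False)))
```

A None result corresponds to the "stop and ask the user" case; the connectivity test is still worth running whichever prefix is chosen.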
Workflow
1. Extract views + scaffold
.venv/bin/python .claude/skills/map-explorer-to-mdim/scripts/extract_views.py \
--explorer <slug> \
--mdim <ns/v/short#short> [--mdim <ns/v/short#short> ...] \
--out ai/<slug>-mdim-mapping
Writes into the out folder:
- explorer_views.csv: id (1..N) + dimension_1..M (explorer display values).
- multidim_<short>_views.csv: one per MDIM; id is letter-prefixed by --mdim order (A1…, B1…, C1…) so ids are unique across MDIMs; columns are the MDIM dimension slugs.
- _scaffold.md: the explorer dimension legend (which dimension_i is which name), the distinct values per dimension, each MDIM's dims/choices, auto-suggested value matches (where a slugified explorer value equals a real MDIM choice slug), and a ready-to-edit mapping_rules.py template.
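The letter-prefixed ids can be pictured with a few lines of Python. This is a sketch of the numbering scheme as described, not the code in extract_views.py; the function name is hypothetical, and what happens beyond 26 MDIMs is not covered by the description, so the sketch simply refuses that case.

```python
from string import ascii_uppercase


def view_ids(mdim_view_counts: list[int]) -> list[list[str]]:
    """Give the i-th MDIM (in --mdim order) the i-th capital letter; number its views from 1."""
    if len(mdim_view_counts) > len(ascii_uppercase):
        raise ValueError("this sketch only covers up to 26 MDIMs")
    return [
        [f"{ascii_uppercase[i]}{n}" for n in range(1, count + 1)]
        for i, count in enumerate(mdim_view_counts)
    ]


# Two MDIMs with 3 and 2 views respectively
print(view_ids([3, 2]))  # [['A1', 'A2', 'A3'], ['B1', 'B2']]
```

Because the letter encodes the MDIM, a target_view_id such as B2 is unambiguous even when every MDIM numbers its views from 1.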
2. Write mapping_rules.py
Open _scaffold.md, then write ai/<slug>-mdim-mapping/mapping_rules.py defining:
- EXPLORER_DIMENSIONS: list naming dimension_1..N (copy from the scaffold; keep order).
- MDIMS: MDIM short names in the same order as --mdim (= prefixes A, B, C, …).
- route(dims) -> str: given a view's {dimension name: value}, return the target MDIM short name. For a single MDIM this is just return "<short>". For several, it's a decision on some explorer dimension (e.g. natural-disasters routes on Impact: Deaths → deaths, Economic damages (% GDP) → economic_damages, the rest → affected).
- translate(dims, mdim) -> dict: return {mdim_dim_slug: choice_slug} for the target MDIM view, built from the *_MAP dicts. Only include slugs the MDIM actually has (e.g. economic_damages has no metric, because single-choice dims are pruned from MDIM views).
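To make the four pieces concrete, here is a minimal sketch of what such a mapping_rules.py could look like for a natural-disasters-like explorer. Only the routing on Impact and the four label-to-slug pairs marked "from the text" come from this document; the explorer dimension names, the MDIM dimension slugs (type, timespan, metric, impact) and every other choice slug are assumptions made up for illustration. The real values come from _scaffold.md.

```python
# Hypothetical mapping_rules.py; dimension names and most slugs are assumed.

EXPLORER_DIMENSIONS = ["Disaster type", "Impact", "Timespan", "Per capita"]  # dimension_1..4
MDIMS = ["deaths", "economic_damages", "affected"]  # --mdim order: prefixes A, B, C

TYPE_MAP = {
    "All disasters": "all_stacked",            # aggregate collapse (from the text)
    "All disasters (by type)": "all_stacked",  # aggregate collapse (from the text)
    "Volcanoes": "volcanic_activity",          # label differs from slug (from the text)
    "Floods": "flood",                         # assumed
}
TIMESPAN_MAP = {"Annual": "annual", "Decadal average": "decadal"}  # "annual" is assumed
METRIC_MAP = {"false": "absolute", "true": "per_capita"}           # assumed
IMPACT_MAP = {"Injuries": "injured", "Homeless": "homeless"}       # "homeless" is assumed


def route(dims: dict) -> str:
    """Pick the target MDIM from the explorer's Impact dimension."""
    impact = dims["Impact"]
    if impact == "Deaths":
        return "deaths"
    if impact == "Economic damages (% GDP)":
        return "economic_damages"
    return "affected"


def translate(dims: dict, mdim: str) -> dict:
    """Build {mdim_dim_slug: choice_slug}, including only dims the target MDIM has."""
    view = {
        "type": TYPE_MAP[dims["Disaster type"]],
        "timespan": TIMESPAN_MAP[dims["Timespan"]],
    }
    if mdim != "economic_damages":  # that MDIM has no metric dimension
        view["metric"] = METRIC_MAP[dims["Per capita"]]
    if mdim == "affected":          # assumed: only this MDIM distinguishes impacts
        view["impact"] = IMPACT_MAP[dims["Impact"]]
    return view


dims = {"Disaster type": "Volcanoes", "Impact": "Injuries",
        "Timespan": "Decadal average", "Per capita": "false"}
print(route(dims), translate(dims, route(dims)))
```

Indexing the maps with [] rather than .get() is a deliberate choice in this sketch: an unmapped explorer value raises a KeyError straight away instead of silently producing a view that does not exist.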
The scaffold seeds the *_MAP dicts with slugify(value) guesses. Verify every entry —
slugify won't catch label↔slug differences like Decadal average→decadal, Injuries→injured,
Volcanoes→volcanic_activity, or aggregate collapses like All disasters/All disasters (by type)→all_stacked.
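The following sketch shows why the guesses need checking. The slugify here is a simple stand-in written for this example (the scaffold's actual slugify may behave differently), and the set of MDIM choice slugs is assumed.

```python
import re
from typing import Optional


def slugify(value: str) -> str:
    """Stand-in slugify: lowercase, runs of non-alphanumerics become one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", value.lower()).strip("_")


def check_guesses(values: list[str], mdim_choices: set[str]) -> dict[str, Optional[str]]:
    """Map each explorer value to its slugified guess, or None when that guess is not a real choice."""
    return {v: (slugify(v) if slugify(v) in mdim_choices else None) for v in values}


choices = {"annual", "decadal", "injured", "volcanic_activity", "all_stacked"}  # assumed
guesses = check_guesses(["Annual", "Decadal average", "Injuries", "Volcanoes"], choices)
for value, slug in guesses.items():
    print(f"{value!r}: {slug if slug else 'needs a manual mapping'}")
```

Under these assumptions only Annual matches automatically; the other three slugify to decadal_average, injuries and volcanoes, none of which is a real choice slug, so each needs a hand-written entry.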
3. Build the proposal
.venv/bin/python .claude/skills/map-explorer-to-mdim/scripts/build_mapping.py --out ai/<slug>-mdim-mapping
Writes mapping_proposal.csv, one row per explorer view:
| columns | meaning |
|---|---|
| id, dimension_1..N | the explorer view (same as explorer_views.csv) |
| target_mdim, target_view_id | the resolved target (target_view_id is the A*/B*/C* id) |
| <mdim>_<dimslug> … | wide block; only the target MDIM's columns are filled with the translated slugs |
| shared_target_explorer_ids | when >1 explorer view lands on the same MDIM view, the comma-joined list of all those explorer ids (e.g. 1,12); empty when the target is unique |
The script prints a validation report: how many explorer views resolved, distinct MDIM views hit per MDIM, how many rows share a target, and FLAGS for any explorer view that didn't resolve to a real MDIM view (fix the rules and re-run until there are no flags).
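The shared-target column and the flags can be illustrated with plain dictionaries. This is a sketch of the accounting as described, not the implementation in build_mapping.py; the function name and the input shape (explorer id mapped to target_view_id, with None for unresolved views) are assumptions.

```python
from collections import defaultdict
from typing import Optional


def shared_targets(targets: dict[int, Optional[str]]) -> dict[int, str]:
    """Per explorer id: comma-joined ids of all views sharing its target, '' if unique or unresolved."""
    by_target: dict[str, list[int]] = defaultdict(list)
    for explorer_id, target in sorted(targets.items()):
        if target is not None:
            by_target[target].append(explorer_id)
    result = {}
    for explorer_id, target in targets.items():
        group = by_target[target] if target is not None else []
        result[explorer_id] = ",".join(map(str, group)) if len(group) > 1 else ""
    return result


# Views 1 and 12 land on the same MDIM view; view 13 did not resolve.
targets = {1: "A1", 2: "A2", 12: "A1", 13: None}
shared = shared_targets(targets)
flags = [explorer_id for explorer_id, target in targets.items() if target is None]
print(shared)
print("FLAGS:", flags)
```

Here views 1 and 12 both get 1,12, view 2 stays empty, and view 13 appears in the flags list until the rules are fixed.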
4. Review
Sanity-check the flagged rows and the judgment calls (approximate type matches, aggregate collapses, MDIM choices with no explorer source). The CSVs are typically pasted into a spreadsheet for a human reviewer / topic owner to confirm before redirects are created.
Notes & gotchas
- MDIM views come from multi_dim_data_pages.config (published, fully expanded). This already reflects code-generated views (group_views aggregates) and pruned single-choice dimensions, so e.g. a metric that has only one active choice won't appear as a column.
- Many explorer views can redirect to one MDIM view; that's expected (the explorer often splits a concept the MDIM merges, e.g. a single-line "All disasters" total and a stacked-by-type view both mapping to the all_stacked MDIM view). shared_target_explorer_ids surfaces these so the reviewer sees the collisions.
- An MDIM may have choices with no explorer source (e.g. an …_excluding_extreme_temperature aggregate). Nothing redirects to those; that's fine, just confirm.
- Explorer dimension columns stay dimension_1..N (compact, and joinable to the explorer CSV); the name legend lives in _scaffold.md and in EXPLORER_DIMENSIONS.
- Re-running extract_views.py overwrites the CSVs but not your mapping_rules.py.
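The effect of pruned single-choice dimensions can be reproduced on a toy list of views. The sketch below only mimics the outcome described in the first note; it is not how the published config is generated, and the dimension and choice slugs are invented for the example.

```python
def prune_single_choice(views: list[dict[str, str]]) -> list[dict[str, str]]:
    """Drop every dimension that takes only one value across all views.

    Assumes every view carries the same dimension keys.
    """
    if not views:
        return []
    keep = [dim for dim in views[0] if len({view[dim] for view in views}) > 1]
    return [{dim: view[dim] for dim in keep} for view in views]


# Hypothetical economic_damages views: 'metric' has a single choice everywhere.
views = [
    {"type": "all_stacked", "timespan": "annual", "metric": "share_of_gdp"},
    {"type": "flood", "timespan": "decadal", "metric": "share_of_gdp"},
]
print(prune_single_choice(views))
```

The metric key disappears from both views, which is why translate() must not emit a metric slug for an MDIM like this one.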