name: gtfs-data-build
description: >
  Run the GTFS data pipeline to build web app data files from GTFS sources. Use when the user asks to "update data", "rebuild data", "run the pipeline", "download GTFS", "refresh transit data", or wants to regenerate JSON files from GTFS sources.
GTFS Data Build
Build web app JSON data files from GTFS open data sources.
Pipeline Steps
Commands and execution order are documented in the "Data preparation" section of CLAUDE.md. Run steps 1-12 in order, because each step depends on the output of the previous one.
When to skip steps
- Steps 1-2 (download): Skip if source files are already up to date. GTFS files live in pipeline/workspace/data/gtfs/{outDir}/; ODPT JSON files in pipeline/workspace/data/odpt-json/{outDir}/.
- Step 7 (KSJ shapes): Skip if only bus data changed. Requires pipeline/workspace/data/mlit/N02-25_RailroadSection.geojson.
- Step 12 (data:sync): Always run last. This copies built data from pipeline/workspace/_build/data-v2/ to public/<PIPELINE_TRANSIT_DATA_DIR>/, where the app serves it. The default is public/data-v2/, but the destination may be overridden by PIPELINE_TRANSIT_DATA_DIR depending on the environment.
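
To make the step 12 destination rule concrete, here is a minimal sketch of how the sync target could be resolved. The helper name `resolveSyncPaths` is hypothetical and not part of the pipeline. Treating an empty or slash-only value as "use the default" is also an assumption.

```typescript
// Hypothetical helper for illustration; the real data:sync script may differ.
const BUILD_DIR = "pipeline/workspace/_build/data-v2/";
const DEFAULT_DATA_DIR = "data-v2";

interface SyncPaths {
  from: string; // where the build steps write JSON
  to: string; // where the app serves JSON from
}

function resolveSyncPaths(env: Record<string, string | undefined>): SyncPaths {
  // Strip surrounding whitespace and slashes so "data-x" and "/data-x/" behave the same.
  const cleaned = (env["PIPELINE_TRANSIT_DATA_DIR"] ?? "")
    .trim()
    .replace(/^\/+|\/+$/g, "");
  // Assumption: an unset or empty override falls back to the documented default.
  const dataDir = cleaned === "" ? DEFAULT_DATA_DIR : cleaned;
  return { from: BUILD_DIR, to: `public/${dataDir}/` };
}
```

In a Node script this would typically be called as `resolveSyncPaths(process.env)` to see which directory the current environment targets.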
Data flow
ODPT API (GTFS ZIP / ODPT JSON)
-> pipeline/workspace/data/{gtfs,odpt-json}/{outDir}/ (steps 1-2)
-> pipeline/workspace/_build/db/{outDir}.db (step 3)
-> pipeline/workspace/_build/data-v2/{prefix}/*.json (steps 4-10)
-> public/<PIPELINE_TRANSIT_DATA_DIR>/{prefix}/*.json (step 12)
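
The flow above can be sketched as a path mapping for a single source. The function name `stagePaths` and the sample values are hypothetical. `outDir` and `prefix` are treated as independent inputs because their relationship is not stated here.

```typescript
// Illustrative only: mirrors the directory layout in the data flow diagram.
interface StagePaths {
  gtfsDir: string; // steps 1-2 (GTFS ZIP contents)
  odptJsonDir: string; // steps 1-2 (ODPT JSON)
  dbFile: string; // step 3
  buildDir: string; // steps 4-10
  publicDir: string; // step 12
}

function stagePaths(outDir: string, prefix: string, dataDir = "data-v2"): StagePaths {
  const workspace = "pipeline/workspace";
  return {
    gtfsDir: `${workspace}/data/gtfs/${outDir}/`,
    odptJsonDir: `${workspace}/data/odpt-json/${outDir}/`,
    dbFile: `${workspace}/_build/db/${outDir}.db`,
    buildDir: `${workspace}/_build/data-v2/${prefix}/`,
    publicDir: `public/${dataDir}/${prefix}/`,
  };
}

// Hypothetical source: "example-bus" and "exbus" are placeholder values.
const example = stagePaths("example-bus", "exbus");
console.log(example.dbFile);
```

Comparing file timestamps across these paths is one way to see which stage last produced fresh output.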
Sources
Defined in pipeline/config/resources/{gtfs,odpt-json}/. Each .ts file is a single source definition. See each script's TSDoc header for detailed input/output paths.
Troubleshooting
- GTFS ZIP download does not require authentication (publicly accessible)
- ODPT JSON download requires the ODPT_ACCESS_TOKEN environment variable (set via pipeline/.env.pipeline.local)
- pipeline:build:db expects GTFS CSV files in pipeline/workspace/data/gtfs/{outDir}/
- pipeline:build:v2-shapes:ksj expects MLIT GeoJSON at pipeline/workspace/data/mlit/N02-XX_RailroadSection.geojson (year-suffixed)
- If JSON output looks stale, check that data:sync was run after the build steps and inspect the effective PIPELINE_TRANSIT_DATA_DIR for the current environment.
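
The checks above could be combined into a small preflight routine, sketched below. Everything here is hypothetical: the `preflight` function, its input shape, and the messages are not part of the pipeline. The two-digit year suffix is an assumption based on the N02-25 file name.

```typescript
// Hypothetical preflight check; file system access is injected so the logic stays testable.
interface PreflightInput {
  env: Record<string, string | undefined>;
  exists: (path: string) => boolean; // e.g. fs.existsSync in a real Node script
  outDir: string;
  needsOdptJson: boolean; // true when the source is downloaded as ODPT JSON
  mlitFiles: string[]; // file names found in pipeline/workspace/data/mlit/
}

// Assumption: the year suffix is two digits, as in N02-25_RailroadSection.geojson.
const MLIT_PATTERN = /^N02-\d{2}_RailroadSection\.geojson$/;

function preflight(input: PreflightInput): string[] {
  const problems: string[] = [];
  if (input.needsOdptJson && !input.env["ODPT_ACCESS_TOKEN"]) {
    problems.push("ODPT_ACCESS_TOKEN is not set (see pipeline/.env.pipeline.local)");
  }
  const gtfsDir = `pipeline/workspace/data/gtfs/${input.outDir}/`;
  if (!input.exists(gtfsDir)) {
    problems.push(`GTFS directory not found: ${gtfsDir}`);
  }
  if (!input.mlitFiles.some((name) => MLIT_PATTERN.test(name))) {
    problems.push("no N02-XX_RailroadSection.geojson in pipeline/workspace/data/mlit/");
  }
  return problems;
}
```

An empty result means none of the listed preconditions failed. It does not confirm that the downloaded data is current.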