tsa-throughput

star 3

Set up TSA checkpoint throughput ingestion: create PDF stages, throughput table, network rule, external access integration, Python PDF-fetch and AI_EXTRACT procedures, and weekly scheduled tasks. Optional subskill of aviation-installer. Use when: configuring TSA throughput data as part of installation. Do NOT use for: standalone execution, ADS-B ingestion (use adsb-ingestion), flight schedules (use flight-schedules). Triggers: TSA throughput, passenger throughput, checkpoint data, TSA pipeline.

Snowflake-Labs By Snowflake-Labs schedule Updated 6/4/2026

name: tsa-throughput description: "Set up TSA checkpoint throughput ingestion: create PDF stages, throughput table, network rule, external access integration, Python PDF-fetch and AI_EXTRACT procedures, and weekly scheduled tasks. Optional subskill of aviation-installer. Use when: configuring TSA throughput data as part of installation. Do NOT use for: standalone execution, ADS-B ingestion (use adsb-ingestion), flight schedules (use flight-schedules). Triggers: TSA throughput, passenger throughput, checkpoint data, TSA pipeline." depends_on: - aviation-installer - adsb-ingestion metadata: author: Snowflake SIT-IS version: 1.0.0 category: infrastructure

TSA Throughput Setup

This subskill cannot be run independently. It must be invoked from the aviation-installer router.

This subskill is optional -- only execute if the user wants TSA checkpoint throughput data. No API key is required (public FOIA data).

Creates the TSA checkpoint throughput ingestion pipeline: network rule, EAI, PDF stages, throughput table, Python procedures for PDF download and AI_EXTRACT-based data extraction, and a weekly task DAG (Monday 9am PT).

The pipeline fetches the latest weekly TSA throughput PDF from the TSA FOIA reading room, splits it into single-page PDFs, extracts structured tabular data via AI_EXTRACT, and loads it into the TSA_THROUGHPUT table. Data covers all US airports; the dashboard filters by airport IATA code at query time.

Prerequisites

  • adsb-ingestion completed (provides task DAG root)
  • Variables from router: {TARGET_DB}, {SCHEMA}, {IATA}, {WAREHOUSE}

Required Privileges

Privilege Scope Reason
CREATE NETWORK RULE Database Creates rule for tsa.gov
CREATE INTEGRATION Account Creates EAI for tsa.gov HTTPS access
CREATE STAGE Schema Creates 2 PDF stages (source + pages)
CREATE TABLE Schema Creates TSA_THROUGHPUT table
CREATE PROCEDURE Schema Creates 3 ingestion/extraction procedures
CREATE TASK Schema Creates 2-task DAG (fetch + extract)

Configuration

Parameter Default Description
CRON_SCHEDULE 0 9 * * 1 America/Los_Angeles Weekly Monday 9am PT
TASK_PARENT TASK_INGEST_ADSB Parent task for DAG chaining
EAI_TSA_GOV {TARGET_DB}_{SCHEMA}_TSA_GOV_EAI External access integration name

Friction Logging

Report all friction points (errors, warnings, workarounds, race conditions) back to the parent installer using the F1/F2/F3 format from .cortex/skills/logs/README.md. The parent writes the consolidated friction log. If executing standalone, write to .cortex/skills/logs/friction-log_{YYYY-MM-DD}_{HH-MM}.md.

Workflow

Read the references/ subfiles for complete SQL:

  • references/01-network-stages-tables.md -- Network rule, EAI, stages, table
  • references/02-procedures.md -- All 3 procedures (fetch PDF, find latest, process/extract)
  • references/03-tasks.md -- Task definitions + initial fetch/extract calls

Execute ALL SQL from each file in order. Do NOT skip or optimize away any queries.

Step 1: Set Query Tag

ALTER SESSION SET query_tag = '{"origin":"sf_sit-is-aviation","name":"oss-aviation-tsa-throughput","version":{"major":1,"minor":0},"attributes":{"is_quickstart":1,"source":"sql"}}';

Step 2: Create Network Rule and EAI

Create network rule for tsa.gov:443 (HTTPS) and EAI {EAI_TSA_GOV}. No secret needed -- the FOIA reading room is public.

See references/01-network-stages-tables.md for full SQL.

Step 3: Create Stages and Table

  • TSA_PDF_STAGE -- stores downloaded weekly PDF (SNOWFLAKE_SSE encryption, directory enabled)
  • TSA_PDF_PAGES_STAGE -- scratch space for 1-page PDF splits (cleaned after extraction)
  • TSA_THROUGHPUT -- destination table for extracted checkpoint data

See references/01-network-stages-tables.md for full SQL.

Step 4: Create Procedures

  • PROC_FETCH_PDF_TO_STAGE(url, stage_path) -- generic PDF download utility
  • PROC_FETCH_LATEST_TSA_PDF(stage_path) -- scrapes FOIA page, finds latest throughput PDF, deduplicates
  • PROC_PROCESS_TSA_PDF(pages_stage, target_table) -- splits PDF, AI_EXTRACT on each page, bulk insert, cleanup

See references/02-procedures.md for full SQL.

Step 5: Create Tasks

  • TASK_FETCH_TSA_PDF -- root TSA task, weekly CRON, calls PROC_FETCH_LATEST_TSA_PDF
  • TASK_EXTRACT_TSA_PDF -- child task (AFTER fetch), calls PROC_PROCESS_TSA_PDF

See references/03-tasks.md for full SQL.

Note: Tasks are created SUSPENDED. The router resumes them after all tasks are ready.

Step 6: Trigger Initial Fetch and Extract

CALL {TARGET_DB}.{SCHEMA}.PROC_FETCH_LATEST_TSA_PDF('@{TARGET_DB}.{SCHEMA}.TSA_PDF_STAGE');
CALL {TARGET_DB}.{SCHEMA}.PROC_PROCESS_TSA_PDF('@{TARGET_DB}.{SCHEMA}.TSA_PDF_PAGES_STAGE', '{TARGET_DB}.{SCHEMA}.TSA_THROUGHPUT');

Step 7: Verify

SELECT COUNT(*) AS TSA_ROWS FROM {TARGET_DB}.{SCHEMA}.TSA_THROUGHPUT;
SELECT COUNT(DISTINCT airport_code) AS AIRPORTS FROM {TARGET_DB}.{SCHEMA}.TSA_THROUGHPUT;
SELECT COUNT(*) AS LOCAL_ROWS FROM {TARGET_DB}.{SCHEMA}.TSA_THROUGHPUT WHERE UPPER(airport_code) = '{IATA}';

Expected: > 0 rows overall; local rows depend on whether the airport appears in TSA data.

Stopping Points

  • After Step 2: Confirm EAI exists (SHOW INTEGRATIONS LIKE '%TSA_GOV%')
  • After Step 6: Verify TSA_THROUGHPUT has rows

Troubleshooting

Error Cause Fix
EAI creation fails Insufficient privileges ACCOUNTADMIN required for CREATE INTEGRATION
PROC_FETCH fails with connection error Network rule misconfigured Check network rule allows tsa.gov:443 (HTTPS)
No PDF found on FOIA page Page structure changed TSA may have changed page structure; check https://www.tsa.gov/foia/readingroom/ manually
AI_EXTRACT returns empty results Wrong encryption type Ensure stages use SNOWFLAKE_SSE encryption (required for AI_EXTRACT)
0 rows for airport Airport not in TSA data Not all airports appear in TSA data; major US airports should have data

Cleanup

All objects carry the sf_sit-is-aviation COMMENT tracking tag and are auto-discovered by the aviation-cleanup skill. EAIs are matched by name pattern %TSA_GOV_EAI.

Return to Router

After completing all steps, return to the aviation-installer router and continue with derived-analytics.

Install via CLI
npx skills add https://github.com/Snowflake-Labs/sfguide-aviation-ops-intelligence --skill tsa-throughput
Repository Details
star Stars 3
call_split Forks 5
navigation Branch main
article Path SKILL.md
More from Creator
Snowflake-Labs
Snowflake-Labs Explore all skills →