openfda-database - SKILL.md Agent Skill

name: openfda-database
description: >
  Query, search, and download data from the openFDA API for drugs, devices, foods, tobacco, cosmetics, animal and veterinary products, substances, and transparency data. Use for FDA adverse events, recalls, labeling, approvals, shortages, 510(k) clearances, NDC lookups, and any FDA safety or regulatory data query across all 28 API endpoints.

openFDA Search and Query

Prerequisites

uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.
User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory then (1) prominently notify the user to check the terms at https://open.fda.gov/apis/ and https://open.fda.gov/license, then (2) create the file recording the notification text and timestamp.
.env file: Make sure the .env file exists in your home directory. Create one if it does not exist.
FDA_API_KEY (optional but recommended): Raises the daily request limit from 1,000 to 120,000. The skill works without it, but an agent can easily exhaust the keyless limit in a single session. The user can register for a free key at https://open.fda.gov/apis/authentication/. If the variable is missing from .env, do NOT ask the user to paste it into the chat (this would leak the key into the agent's context). Instead, give the user this command — substituting ENV_FILE with the resolved literal path to the .env file:
```
printf "Enter openFDA API key (typing hidden): " && read -s key && echo && echo "FDA_API_KEY=$key" >> "ENV_FILE" && echo "Saved."
```
The scripts load credentials automatically via dotenv. NEVER read, print, or inspect the .env file or its variables (e.g. no cat, grep, echo, printenv, or os.environ.get on keys). Credentials must stay out of the agent's context.

Core Rules

Use the Wrapper: ALWAYS run the provided helper script to query openFDA rather than calling the API directly. The script enforces the required rate limits automatically.
Rate Limiting: Respect openFDA rate limits. Without an API key: 240 requests/min and 1,000 requests/day per IP address. With an API key: 240 requests/min and 120,000 requests/day per key.

Warning: An automated agent can easily exhaust the 1,000-request daily limit in a single research session. Always set an API key before running multi-query workflows.

Instruct the user to register for a free key at https://open.fda.gov/apis/authentication/ and follow the prerequisite instructions above to add FDA_API_KEY to the .env file. The script will emit a warning to stderr if no API key is detected.

Always Use --output: Every subcommand requires --output <file> and writes the full results to that file, which keeps large result sets out of the agent's context. Use jq or code to read the output file.
Notification: Whenever this skill is used, state in the output that it was used.
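
To make the rate-limit arithmetic above concrete, here is a minimal sketch of client-side throttling: 240 requests/min means at least 0.25 s between calls, and the keyless daily budget is 1,000 calls. The class name `MinIntervalThrottle` and its parameters are hypothetical; this is not the helper script's actual implementation, only an illustration of the limits it has to respect.

```python
import time


class MinIntervalThrottle:
    """Hypothetical throttle: spaces calls 60/per_minute seconds apart
    and refuses calls once the daily budget is used up.

    Simplification: the daily counter is never reset in this sketch.
    """

    def __init__(self, per_minute=240, per_day=1000,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / per_minute  # 0.25 s at 240 requests/min
        self.per_day = per_day
        self.clock = clock
        self.sleep = sleep
        self.used_today = 0
        self._last = None  # time of the previous call, if any

    def acquire(self):
        """Block until the next request is allowed, then count it."""
        if self.used_today >= self.per_day:
            raise RuntimeError("daily request budget exhausted")
        now = self.clock()
        if self._last is not None:
            wait = self.min_interval - (now - self._last)
            if wait > 0:
                self.sleep(wait)
                now = self.clock()
        self._last = now
        self.used_today += 1
```

Calling `acquire()` before each request would keep a loop under both limits; with a key, the same sketch would be constructed with `per_day=120_000`.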

Utility Script

Single script for all operations:

uv run scripts/openfda_query.py {search,count,download} --output <file> [options]

1. Search

Search any of the 28 endpoints and save JSON results to a file.

uv run scripts/openfda_query.py search \
  --category drug --endpoint event \
  --search "patient.drug.medicinalproduct:aspirin" \
  --limit 5 --output /tmp/fda_results.json

Stdout prints a compact summary:

{"status": "success", "output": "/tmp/fda_results.json", "results_in_file": 5, "total_matching": 601477}

Options:

--output: Output file for full JSON results (required).
--category: API category — drug, device, food, tobacco, other, animalandveterinary, cosmetic, transparency.
--endpoint: Endpoint within the category (e.g., event, label, 510k). See references/api_endpoints.md for full list.
--search: Query string (e.g., patient.drug.medicinalproduct:aspirin+AND+serious:1).
--sort: Sort field and order (e.g., receivedate:desc).
--limit: Max results (default 10, max 1000).
--skip: Pagination offset (default 0).
--api_key: API key (also reads FDA_API_KEY env var).
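
For reading results back in code, the following sketch parses the stdout summary and loads the records from the output file. The layout of the output file is not specified here, so `load_records` assumes it is either the raw openFDA response (an object with a `results` array) or a bare JSON array; both function names are hypothetical.

```python
import json
from pathlib import Path


def parse_summary(stdout_line):
    """Parse the one-line JSON summary printed by the search subcommand."""
    summary = json.loads(stdout_line)
    if summary.get("status") != "success":
        raise RuntimeError(f"query failed: {summary}")
    return summary


def load_records(path):
    """Return the list of records stored in an output file.

    Assumption: the file holds either {"results": [...]} or a bare list.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return data
    if isinstance(data, dict) and isinstance(data.get("results"), list):
        return data["results"]
    raise ValueError(f"unrecognized output layout in {path}")
```

Comparing `results_in_file` with `total_matching` from the summary shows whether more pages remain to be fetched with --skip or the download subcommand.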

2. Count

Count unique values of a field within matching results.

uv run scripts/openfda_query.py count \
  --category drug --endpoint event \
  --search "patient.drug.medicinalproduct:aspirin" \
  --count_field "patient.reaction.reactionmeddrapt.exact" \
  --summary 10 --output /tmp/aspirin_reactions.json

Stdout prints a summary with the top 5 terms. Full data is in the output file.

Additional options:

--count_field: Field to count (append .exact for whole-phrase counting).
--summary N: Return only the top N most frequent terms. Use this to avoid flooding the context with hundreds of infrequent terms.
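
As an illustration of what a top-N summary does, the sketch below ranks count rows by frequency. It assumes rows shaped like openFDA count results, each an object with `term` and `count` keys; the function name `top_terms` is hypothetical and the script's own summary logic may differ.

```python
def top_terms(rows, n=10):
    """Return the n most frequent terms as (term, count, share) tuples.

    `share` is the fraction of the counts present in `rows`,
    not of all matching reports.
    """
    total = sum(row["count"] for row in rows)
    ranked = sorted(rows, key=lambda row: row["count"], reverse=True)[:n]
    return [
        (row["term"], row["count"], row["count"] / total if total else 0.0)
        for row in ranked
    ]
```

Because one adverse event report can list several reactions, the shares describe the distribution of terms, not the fraction of reports.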

3. Download

Download multiple pages of results to a file.

uv run scripts/openfda_query.py download \
  --category drug --endpoint event \
  --search "patient.drug.medicinalproduct:aspirin" \
  --limit 100 --max_pages 5 \
  --output /tmp/aspirin_events.json

Additional options:

--max_pages: Maximum pages to fetch (default 10).
--all_results: Automatically paginate through all matching results. A safety cap of 25,000 records per download prevents runaway downloads and excessive API usage.
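
Before starting a large download, it can help to estimate how many requests it will cost. The sketch below assumes that a paged download fetches at most limit × max_pages records and that --all_results stops at the 25,000-record cap; `plan_download` is a hypothetical helper, not part of the script.

```python
import math

SAFETY_CAP = 25_000  # record cap stated for --all_results


def plan_download(total_matching, limit=10, max_pages=10, all_results=False):
    """Estimate (pages, records) for a download; an approximation only."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    if all_results:
        records = min(total_matching, SAFETY_CAP)
    else:
        records = min(total_matching, limit * max_pages)
    pages = math.ceil(records / limit)  # one request per page
    return pages, records
```

For the aspirin example (601,477 matches, --limit 100 --max_pages 5) this gives 5 requests and 500 records, while --all_results with --limit 1000 would take 25 requests to reach the cap.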

Tip: Common drugs can have hundreds of thousands of reports. Add a date range (e.g., receivedate:[20250101+TO+20250131]) to limit the download volume.
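
A date-range clause like the one in the tip can be generated from `datetime.date` values, which avoids hand-typing YYYYMMDD strings. This is a small sketch with hypothetical helper names; it only builds the query string in the `+AND+` form shown under the --search option.

```python
from datetime import date


def date_range_clause(field, start, end):
    """Build a range clause such as receivedate:[20250101+TO+20250131]."""
    if start > end:
        raise ValueError("start must not be after end")
    return f"{field}:[{start:%Y%m%d}+TO+{end:%Y%m%d}]"


def and_join(*clauses):
    """Combine clauses with the +AND+ operator."""
    return "+AND+".join(clauses)


query = and_join(
    "patient.drug.medicinalproduct:aspirin",
    date_range_clause("receivedate", date(2025, 1, 1), date(2025, 1, 31)),
)
```

The resulting `query` string can be passed as the --search value.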

Entity Resolution: Using .exact for Precision

When searching for specific product names, drug names, or categorical terms, always use the .exact suffix on the field to get exact-match results. Without it, the API tokenizes multi-word values and returns noisy partial matches.

# Precise: matches only "ADVIL"
uv run scripts/openfda_query.py search --category drug --endpoint label \
  --search 'openfda.brand_name.exact:"ADVIL"' \
  --limit 5 --output /tmp/advil_label.json

Note: Many brand names in the FDA database include variant suffixes (e.g., "TYLENOL Extra Strength" rather than just "TYLENOL"). If an .exact search returns 0 results, try without .exact to see the available brand name variants, then re-query with the full exact name.

The .exact suffix is also required when using --count_field to aggregate whole phrases instead of individual words.
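
When the wrapper is driven from code, quoting an exact-match clause correctly is easier if the command is built as an argument list rather than a shell string. The sketch below uses hypothetical helper names and only the flags documented above; the list it returns could be handed to `subprocess.run` without a shell.

```python
def exact_clause(field, value):
    """Return field.exact:"value" for whole-phrase matching."""
    if '"' in value:
        raise ValueError("value must not contain double quotes")
    if not field.endswith(".exact"):
        field += ".exact"
    return f'{field}:"{value}"'


def search_argv(category, endpoint, search, output, limit=10):
    """Build the argument list for the search subcommand (no shell quoting needed)."""
    return [
        "uv", "run", "scripts/openfda_query.py", "search",
        "--category", category, "--endpoint", endpoint,
        "--search", search, "--limit", str(limit), "--output", output,
    ]


argv = search_argv(
    "drug", "label",
    exact_clause("openfda.brand_name", "ADVIL"),
    "/tmp/advil_label.json", limit=5,
)
```

Because each argument is a separate list element, the double quotes inside the clause reach the script unchanged.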

MedDRA Term Resolution

openFDA adverse event data uses MedDRA (Medical Dictionary for Regulatory Activities) terms for reactions. The API reports Preferred Terms (PTs) but does not provide the MedDRA hierarchy (System Organ Class, High Level Terms, etc.).

Note: MedDRA is a proprietary ontology and is not indexed in the EMBL-EBI OLS. To approximate MedDRA hierarchy lookups, use the Human Phenotype Ontology (HP) or NCI Thesaurus (NCIT) as proxy ontologies — they cross-reference MedDRA IDs and provide parent/ancestor relationships.

# Step 1: Get top reactions from openFDA
uv run scripts/openfda_query.py count \
  --category drug --endpoint event \
  --search "patient.drug.medicinalproduct:metformin" \
  --count_field "patient.reaction.reactionmeddrapt.exact" \
  --summary 5 --output /tmp/metformin_reactions.json

# Step 2: Look up the top reaction term using a biomedical ontology service
# skill (e.g. embl-ebi-ols skill).
# MedDRA is not available in OLS; use the Human Phenotype Ontology (HP) or
# NCI Thesaurus (NCIT) as a proxy to find the hierarchical classification of
# the reaction term.

Available Endpoints (28 total)

Category to endpoint mapping:

drug: event, label, ndc, enforcement, drugsfda, shortages
device: 510k, classification, enforcement, event, pma, recall, registrationlisting, udi, covid19serology
food: enforcement, event
tobacco: problem, researchpreventionads, researchdigitalads, researchsmokefree
other: historicaldocument, nsde, substance, unii
animalandveterinary: event
cosmetic: event
transparency: crl
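
The mapping above can be kept as a small lookup table so that a category/endpoint pair is validated before spending a request on it. This is a sketch only: `ENDPOINTS` copies the list above, and `check_endpoint` is a hypothetical helper, not part of the script.

```python
ENDPOINTS = {
    "drug": ["event", "label", "ndc", "enforcement", "drugsfda", "shortages"],
    "device": ["510k", "classification", "enforcement", "event", "pma",
               "recall", "registrationlisting", "udi", "covid19serology"],
    "food": ["enforcement", "event"],
    "tobacco": ["problem", "researchpreventionads", "researchdigitalads",
                "researchsmokefree"],
    "other": ["historicaldocument", "nsde", "substance", "unii"],
    "animalandveterinary": ["event"],
    "cosmetic": ["event"],
    "transparency": ["crl"],
}


def check_endpoint(category, endpoint):
    """Raise ValueError if the category/endpoint pair is not in the table."""
    if category not in ENDPOINTS:
        raise ValueError(f"unknown category: {category}")
    if endpoint not in ENDPOINTS[category]:
        raise ValueError(f"unknown endpoint for {category}: {endpoint}")


# Sanity check: the table covers all 28 endpoints.
total_endpoints = sum(len(v) for v in ENDPOINTS.values())
```

Note that endpoint names such as event and enforcement repeat across categories, so the category is always needed to identify an endpoint.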

Reference

Query syntax and all endpoints: See references/api_endpoints.md for field names, search syntax, date ranges, and boolean operators.

Recipes

Common query patterns for every category (drugs, devices, foods, tobacco, cosmetics, animal and veterinary products, substances, and transparency data), covering adverse events, recalls, labeling, approvals, shortages, 510(k) clearances, and NDC lookups. See references/recipes.md for the full recipes.

Workflow

Search for records using search with --output. Read the output file.
Use count with --summary 10 --output to summarize field distributions.
Use download (with --all_results for exhaustive pulls) to fetch larger datasets.
Read and analyze the output file using standard tools.
For MedDRA term hierarchy questions, use a biomedical ontology service skill (e.g. EMBL-EBI OLS skill with the HP or NCIT ontology) to look up the term.