usage-report - SKILL.md Agent Skill

name: usage-report description: Generate a usage report for MCP Gateway Registry by SSHing into the telemetry bastion host, exporting telemetry data from DocumentDB, and producing a formatted markdown report with deployment insights. license: Apache-2.0 metadata: author: mcp-gateway-registry version: "1.3"

Usage Report Skill

Export telemetry data from the MCP Gateway Registry's DocumentDB telemetry collector and generate a usage report showing deployment patterns, version adoption, and feature usage in the wild.

Two-pass report generation

Pass 1 (deterministic, Step 7): render_report.py substitutes a fixed template with values pulled directly from the analyzer's JSON/CSV outputs. The Recommendations section is generated by rule-based triggers in recommendations.py. Every number and table cell traces back to a source file. The LLM is not in this path.

Pass 2 (LLM commentary, Step 8): augment_with_commentary.py extracts a manifest of  markers from the rendered markdown. The skill hands the manifest to the LLM, which produces 2-4 sentence analyst paragraphs per section. The augmenter then substitutes the markers with the commentary text. The LLM only writes prose into pre-defined slots; it cannot modify numbers, tables, or charts.

Together: deterministic data for everything quantitative, LLM voice for synthesis. Hallucination is prevented because the LLM never writes numbers; flat reports are prevented because the commentary layer adds the analyst's interpretation.

To change report content: edit report_template.md (prose, layout, section ordering, commentary anchors), render_report.py (compute new placeholders), recommendations.py (rule-based bullets), or augment_with_commentary.py (commentary marker handling). See Steps 7 and 8 below for full details.

Visualization Guidelines

All charts in this skill follow Edward Tufte's principles documented in tufte-viz-guidelines.md: high data-ink ratio, no chartjunk, layered information, honest scales. The shared style module tufte_style.py provides apply_tufte_style() (rcParams) and tufte_axes(ax) (per-axes cleanup). When adding new chart generators, import from tufte_style and call apply_tufte_style() once before plotting and tufte_axes(ax) for each axes after plotting. Reference the Tufte checklist in tufte-viz-guidelines.md before merging any new chart.

Prerequisites

SSH key at ~/.ssh/id_ed25519 with access to the bastion host
Terraform state available in terraform/telemetry-collector/ (to read bastion IP)
Bastion host enabled (bastion_enabled = true in terraform/telemetry-collector/terraform.tfvars)
AWS credentials configured on the bastion host (for Secrets Manager access)
GitHub CLI (gh) authenticated with read access to the upstream repo (agentic-community/mcp-gateway-registry) for collecting stars, forks, and contributor counts

Input

The skill accepts optional parameters:

/usage-report [OUTPUT_DIR]

OUTPUT_DIR - Base directory for reports (default: .scratchpad/usage-reports/)

If OUTPUT_DIR is not provided, save to .scratchpad/usage-reports/.

All artifacts for a given run are placed in a dated subfolder: OUTPUT_DIR/YYYY-MM-DD/. This keeps each report self-contained and avoids a flat directory of hundreds of files. Previous metrics and CSV files are discovered by scanning both the base directory and all dated subdirectories.

Workflow

Step 1: Get Bastion IP

cd terraform/telemetry-collector && terraform output -raw bastion_public_ip

If the output is "Bastion not enabled", tell the user to set bastion_enabled = true in terraform/telemetry-collector/terraform.tfvars and run terraform apply.

Step 2: Copy Export Script to Bastion

scp -o StrictHostKeyChecking=no -i ~/.ssh/id_ed25519 \
  terraform/telemetry-collector/bastion-scripts/telemetry_db.py \
  ec2-user@$BASTION_IP:~/telemetry_db.py

Step 3: Run Export on Bastion

ssh -o StrictHostKeyChecking=no -i ~/.ssh/id_ed25519 \
  ec2-user@$BASTION_IP \
  'python3 telemetry_db.py export --output /tmp/registry_metrics.csv 2>&1'

Capture the full output -- it contains the summary statistics printed by telemetry_db.py.

Step 4: Create Dated Subfolder and Download the CSV

Create a dated subfolder for this run's artifacts, then download the CSV into it:

DATE_DIR=OUTPUT_DIR/YYYY-MM-DD
mkdir -p $DATE_DIR

scp -o StrictHostKeyChecking=no -i ~/.ssh/id_ed25519 \
  ec2-user@$BASTION_IP:/tmp/registry_metrics.csv \
  $DATE_DIR/registry_metrics.csv

Step 5: Install Python Dependencies and Generate Charts

First, ensure matplotlib and seaborn are available on the system Python:

/usr/bin/python3 -c "import matplotlib, seaborn" 2>/dev/null || pip install --break-system-packages matplotlib seaborn

Then generate the instance-based deployment distribution chart (counts unique registry instances, not events). Run it twice -- once for the cumulative install base, once filtered to the previous complete day -- so the report can show "everyone who ever installed" alongside "who is running it right now":

# Cumulative -- all customers ever
/usr/bin/python3 .claude/skills/usage-report/generate_instance_distribution_chart.py \
  --csv $DATE_DIR/registry_metrics.csv \
  --output $DATE_DIR/instance-distribution-YYYY-MM-DD.png

# Active-yesterday -- only customers that reported on the last complete day.
# Pass YYYY-MM-DD - 1 (the previous day relative to report date) so today's
# partial-day undercount doesn't bias the picture.
/usr/bin/python3 .claude/skills/usage-report/generate_instance_distribution_chart.py \
  --csv $DATE_DIR/registry_metrics.csv \
  --output $DATE_DIR/instance-distribution-active-PREVIOUS-YYYY-MM-DD.png \
  --active-on-date PREVIOUS-YYYY-MM-DD

Each invocation produces a faceted PNG with 6 subplots: Cloud Provider, Compute Platform, Storage Backend, Auth Provider, Architecture, and Deployment Mode. Each subplot shows unique instance counts and percentages. The --active-on-date form filters the row set to instances that had at least one event on the given date (heartbeat or startup), then runs the same six-panel breakdown on that subset; the chart title is annotated to make the filter explicit.

In the report, embed both PNGs in the "Deployment Distribution (by Unique Instances)" section and add a short narrative pointing out where the two views diverge -- typically the active-yesterday view shifts toward Kubernetes (vs Docker), enterprise IdP (vs the long-tail), and AWS dominance.

Step 5b: Generate Timeseries Chart

Generate a timeseries chart showing unique registry installs per cloud provider over time. This reads ALL CSV files in the base output directory and dated subdirectories to build a complete historical view:

/usr/bin/python3 .claude/skills/usage-report/generate_timeseries_chart.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/registry-installs-timeseries-YYYY-MM-DD.png \
  --exclude-incomplete-day YYYY-MM-DD

This produces a PNG with three subplots:

Cumulative Unique Registry Installs -- running total of unique registry_ids per cloud provider
Daily Active Registry Installs -- unique registry_ids seen each day per cloud provider (returning instances are re-counted)
Daily NEW Registry Installs (first-seen) -- unique registry_ids whose earliest-ever event lands on each day, per cloud provider. Each instance is counted exactly once across the entire history. Use this to track raw acquisition velocity per cloud, isolated from churn and re-engagement

--exclude-incomplete-day drops events on the given date (today's date, the in-progress day) before charting so the trailing data point doesn't show a misleading dip. Always pass today's YYYY-MM-DD. Snapshot tables and headline tallies still see the full data; only the chart series are trimmed.

Step 5b2: Generate Compute Platform Timeseries Chart

Generate a second timeseries chart, parallel to the cloud-provider one, showing unique registry installs per compute platform (docker, kubernetes, ecs, ec2, etc.) over time. Same data-sourcing behavior (scans all CSV files across dated subdirectories). Pass --snapshots-table to also emit a markdown per-snapshot table ready to embed in the report:

/usr/bin/python3 .claude/skills/usage-report/generate_compute_timeseries_chart.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/compute-installs-timeseries-YYYY-MM-DD.png \
  --snapshots-table $DATE_DIR/compute-platform-snapshots-YYYY-MM-DD.md \
  --exclude-incomplete-day YYYY-MM-DD

This produces:

A PNG with two subplots:
- Cumulative Unique Registry Installs per Compute Platform -- running total of unique registry_ids per platform
- Daily Active Registry Installs per Compute Platform -- unique registry_ids seen each day per platform
A markdown file with the Per-Platform Growth (Unique Installs) table, one row per dated CSV snapshot, sorted descending by date (newest first, bolded). The column order is docker | kubernetes | ecs | ec2 | unknown when present, plus any other platforms alphabetically. Unique-instance counts per snapshot are computed directly from each dated CSV using the compute column (not compute_platform -- that's the schema key but not the CSV column name).

Embed the chart in the report's "Compute Platform Growth" section and drop the contents of the snapshots-table markdown file in under the "Per-Platform Growth (Unique Installs)" subheading. Add a short narrative on which platforms are growing fastest in absolute and percentage terms; the newest (bolded) row is the current total for the report.

Step 5c: Generate Instance Lifetime Chart

Generate a density plot showing the distribution of instance lifetimes (age in days). This reads the metrics JSON produced by the analysis step, so it must run after Step 6. However, the SKILL.md lists it here for logical grouping with other charts:

/usr/bin/python3 .claude/skills/usage-report/generate_lifetime_chart.py \
  --metrics $DATE_DIR/metrics-YYYY-MM-DD.json \
  --output $DATE_DIR/instance-lifetime-YYYY-MM-DD.png

This produces a PNG with three panels:

Age Distribution -- histogram with KDE density overlay showing instance ages in days, with stats annotation (mean, max, multi-day vs single-day counts)
Age Spread -- boxplot with Q1/median/Q3/max annotated and individual points overlaid, useful for spotting outliers and the long-tail of long-lived deployments
Age Buckets -- horizontal bar chart grouping instances into age ranges (0 days, 1-2 days, 3-5 days, etc.) with counts and percentages

Note: Run this after Step 6 (telemetry analysis) since it reads the metrics JSON.

Step 5c-ts: Generate Lifetime-Bucket Retention Chart

Plot per-snapshot lifetime retention percentages (one-day wonders vs >=3 / >=7 / >=14 / >=30 day cohorts) over time. Reads every metrics-*.json file under the base output directory and recomputes the buckets retroactively, so it works on snapshots that predate the lifetime_bucket_pct field. Produces a PNG plus a per-snapshot CSV sidecar that future reports can diff against.

/usr/bin/python3 .claude/skills/usage-report/generate_lifetime_buckets_chart.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/lifetime-buckets-YYYY-MM-DD.png \
  --csv-out $DATE_DIR/lifetime-buckets-YYYY-MM-DD.csv

Embed the chart in the report directly below the Registry Instance Lifetime section, alongside a narrative that quotes the latest CSV row (the report-date snapshot) and contrasts it with the earliest snapshot to show whether the customer-retention curve is improving over time.

Note: Run this after Step 6 since it depends on instance_lifetime + internal_instance_ids keys in metrics-*.json.

Step 5c2: Generate Customer-Active-Instances Chart

Generate a chart of customer engagement over time, with three overlaid series:

Daily Active Instances (DAI) -- unique registry_ids that sent at least one event (startup OR heartbeat) on that day.
7-day moving average (MA7) -- trailing 7-day average of DAI.
7-day consistency streak (S7) -- unique registry_ids that sent at least one event on EACH of the 7 days in the window [D-6..D].

Customer-only: internal instances loaded from known-internal-instances.md are excluded so the numbers align with the Liveness section (which is also customer-only). A CSV sidecar of the per-day values is written alongside the PNG so the report narrative can quote exact numbers and future reports can diff against it.

The CSV also includes two DAI percentage columns derived from the same per-day registry-id sets:

cumulative_installs, dai_pct_of_total -- DAI / cumulative_installs through that day; the engagement rate of the full install funnel ever recorded
likely_alive_7d, dai_pct_of_likely_alive -- DAI / unique-active-in-trailing-7d; the engagement rate of the currently-active fleet (analog of a DAU/WAU ratio)

Both percentages should be quoted side-by-side in the report's "DAI as a percentage of total installs" subsection so the reader sees the funnel-engagement number (low, because of one-day wonders) and the active-fleet engagement number (the healthier B2B-style read) together.

/usr/bin/python3 .claude/skills/usage-report/generate_active_instances_chart.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/active-instances-YYYY-MM-DD.png \
  --internal-instances .claude/skills/usage-report/known-internal-instances.md \
  --csv-out $DATE_DIR/active-instances-YYYY-MM-DD.csv \
  --exclude-incomplete-day YYYY-MM-DD

Same data-sourcing behavior as the other historical charts (scans all CSVs across dated subdirectories). Embed in the Liveness section under an "Engagement over Time" subheading.

Step 5c3: Generate LTV Infra-Spend Chart and Summary

Compute daily and cumulative AWS customer infra spend (EC2 compute + Bedrock Titan embeddings). The script emits two cost numbers framed as a range so the report can be conservative about not over-estimating spend:

All-days (upper bound): charges every distinct (AWS customer instance, day) pair, including 1-day trial installs. Matches "every real AWS billing day this solution caused".
Proven-persistence (lower bound): charges an instance on day D only if it had events on D AND on any prior day. This excludes every instance's first-ever active day, so instances that phone home once and never again contribute $0. Gap-tolerant: if an instance is silent on day D-1 and comes back on D, it's still charged on D (because it had prior events on earlier days).

Customer-only (internal UUIDs excluded), AWS-only (GCP/Azure/unknown excluded because we can't attribute their AWS-side usage). On the current fleet ~59% of AWS customer instances are "one-day wonders" — they show up once and never return — so the proven-persistence number is typically ~30% lower than all-days.

Cost model (per-compute-platform, grounded in deployment artefacts):

Platform	Daily rate	Grounding
`docker`	$3.99	1 × t3.xlarge on-demand ($0.1664/hr), customer VM
`ecs`	$26.04	From a measured Cost Explorer day (May-30) for the `terraform/aws-ecs` deployment, excl shared-account overhead (CloudTrail, Others) and the standalone EC2-Instances box: Fargate/ECS ($7.66) + EC2-Other NAT/EBS/data ($7.39) + RDS Keycloak ($4.47) + DocumentDB ($2.03) + VPC ($1.80) + CloudWatch ($1.61) + ELB/ALBs ($1.08)
`kubernetes`	$18.58	From the measured EKS reference deployment: EKS control plane ($2.40) + 3 × m6i.xlarge nodes ($13.82) + EBS gp3 ~41Gi ($0.11) + 1 ALB ($0.92) + 1 NAT Gateway + 5 GB egress ($1.31) + ACM public cert ($0.00) + Route 53 hosted zone ($0.02)
`ec2` / `unknown` / other	$3.99	Docker-compose fallback (single VM)

Platform for a given instance is resolved via its most-recent non-empty compute field. If an instance migrates across platforms mid-window, it's billed at the latest platform's rate for the whole window.

Bedrock Titan embeddings: only for instances whose latest embeddings_backend_kind == "bedrock". Cost = delta(search_queries_total) on that day × 100 tokens/query × $0.00002 / 1K tokens. The delta is computed from the instance's own search_queries_total timeseries (monotonic counter), so we never double-count queries that were already charged on a previous day.

/usr/bin/python3 .claude/skills/usage-report/generate_ltv_spend.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/ltv-spend-YYYY-MM-DD.png \
  --internal-instances .claude/skills/usage-report/known-internal-instances.md \
  --csv-out $DATE_DIR/ltv-spend-YYYY-MM-DD.csv \
  --summary-json $DATE_DIR/ltv-spend-YYYY-MM-DD.json \
  --exclude-incomplete-day YYYY-MM-DD

When --exclude-incomplete-day is passed, the JSON summary's yesterday block refers to the last complete day (typically YYYY-MM-DD - 1), not today. Headline tables in the report should label this clearly (e.g. "Yesterday (2026-05-16, last complete day)").

Three counting rules are emitted side by side so the report can show a range and pick a headline:

all-days (upper bound): charge every distinct (AWS customer instance, day) pair, including 1-day trial installs.
persisted (HEADLINE): charge every day an instance reported, but only if it reported on >= 2 distinct days total (i.e. it came back on a separate day rather than installing and vanishing the same day). Unlike proven, the instance's first day IS charged once it qualifies. This is the "real running deployment, count every day it phoned home" definition the maintainer settled on: it excludes install-and-delete instances and double-counts nothing.
proven (lower bound): charge an instance on day D only if it had events on D AND any prior day. Excludes every instance's first-ever active day.

/usr/bin/python3 .claude/skills/usage-report/generate_ltv_spend.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/ltv-spend-YYYY-MM-DD.png \
  --internal-instances .claude/skills/usage-report/known-internal-instances.md \
  --csv-out $DATE_DIR/ltv-spend-YYYY-MM-DD.csv \
  --summary-json $DATE_DIR/ltv-spend-YYYY-MM-DD.json \
  --exclude-incomplete-day YYYY-MM-DD

Outputs:

PNG chart with three panels: daily EC2 compute USD (all-days bar + persisted overlay), daily Bedrock USD, cumulative LTV USD (all-days / persisted / proven lines with shaded range).
CSV sidecar per day with _persistent (proven) and _persisted (>= 2 distinct days) variants of every column alongside the all-days columns.
JSON summary with headline numbers under yesterday.{all_days, persisted, proven}, last_7_days.{all_days_total_usd, persisted_total_usd, proven_total_usd}, ltv.{all_days, persisted, proven}, and per-platform LTV breakdown for all three models.

Embed the chart in the report's Customer Infra Spend (AWS) section. Lead with the persisted number as the headline, and show all three as a range in the summary table. Include one short paragraph explaining the three counting rules and the per-platform LTV breakdown. Flag clearly that the cost model is hypothetical (we don't actually bill these customers; these are "what it would cost them at list price").

Formula clarification (mandatory): below the spend table, include the "How the three numbers relate" block (the template renders this from ltv.{all_days,persisted,proven}.total_instance_days). The key points the reader needs:

All three rules multiply the same per-platform daily rates by a count of instance-days (one (instance, day) pair = one charge). They differ ONLY in which instance-days qualify, so all-days >= persisted >= proven always holds.
all-days - persisted = the count of single-calendar-day (install-and-vanish) instance-days that persisted drops.
persisted - proven = exactly the number of persisted instances (proven frees each instance's first-ever day; persisted keeps it). This identity is a useful self-check: if it doesn't equal the persisted-instance count, something is wrong.
At the trailing edge of the window the persisted and proven daily figures converge (an instance active on the last complete day cannot still be on its first-ever day, so proven charges it too) -- this is why yesterday's persisted and proven numbers are typically identical while the cumulative ones diverge.

ARR projection (mandatory): below the cumulative LTV table, add a short subsection titled "Annualized Run Rate (ARR) Projection" that takes the 7-day daily-average spend and projects it forward 365 days. Compute as last_7_days.<model>_total_usd / 7 * 365, divided by 1,000,000, formatted to 2 decimal places in millions. Render all three models, with persisted as the headline row:

| Model | 7-day daily avg | x 365 = ARR |
|-------|----------------:|------------:|
| Persisted (headline) | $X | $Y.YYM |
| All-days | $X | $Y.YYM |
| Proven | $X | $Y.YYM |

Frame the ARR as "what the active customer fleet would cost AWS customers per year at list price if today's run rate held constant." Include the same hypothetical disclaimer (we do not bill these customers). The ARR is a useful complement to install-count growth: it tracks the real economic footprint of the customer fleet, not just the headcount of registry instances.

Step 5c3b: Generate Daily Reporters Chart

Plot the daily count of AWS customer instances reporting home, with two persistence-filtered overlays that exclude install-and-vanish-within-a-day instances. This is the visual companion to the persisted LTV counting rule: the green ">= 2 distinct days" line is exactly the daily instance count that drives the headline infra-spend number.

/usr/bin/python3 .claude/skills/usage-report/generate_daily_reporters_chart.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/daily-reporters-YYYY-MM-DD.png \
  --internal-instances .claude/skills/usage-report/known-internal-instances.md \
  --csv-out $DATE_DIR/daily-reporters-YYYY-MM-DD.csv \
  --exclude-incomplete-day YYYY-MM-DD

This produces:

A PNG with three overlaid daily series (AWS-only, customer-only): all reporters, persisted (>= 2 events ever), persisted (>= 2 distinct days).
A CSV sidecar per day: date, all_reporters, persisted_2events, persisted_2days.

Same data-sourcing behavior as the other historical charts (scans all CSVs across dated subdirectories). Embed in the report's Customer Infra Spend (AWS) section, directly above or beside the LTV chart, so the reader sees how the headline daily instance count (~the green line) is derived. Add a short narrative noting how little the install-and-vanish filter moves the line (the daily fleet is overwhelmingly persistent), and that the gap between "all reporters" and ">= 2 distinct days" is the single-calendar-day cohort.

Step 5d: Generate Install Forecast Chart

Project when the registry will reach 1,000 installs using two models: a 14-day OLS linear regression and a 7-day recent-pace extrapolation. Produces a PNG chart (cumulative installs with forecast line and confidence bands) and a JSON summary with ETAs.

/usr/bin/python3 .claude/skills/usage-report/generate_install_forecast.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/install-forecast-YYYY-MM-DD.png \
  --summary-json $DATE_DIR/install-forecast-YYYY-MM-DD.json

Outputs:

PNG chart showing cumulative installs, linear fit, and projected crossing of the 1,000-install target
JSON summary with today.installs, linear.eta (with 95% CI bounds), recent_pace.eta, and model parameters

Embed the chart in the report's Install Forecast section. Include a table showing both model ETAs and daily rates. This section should come after Version Adoption and before Customer Infra Spend.

Step 5e: Generate Adoption Funnel Chart

Visualize the conversion funnel from total installs through retention stages to confirmed-alive. Reads the metrics JSON (for lifetime buckets) and optionally the liveness JSON (for confirmed-alive count).

/usr/bin/python3 .claude/skills/usage-report/generate_adoption_funnel_chart.py \
  --metrics $DATE_DIR/metrics-YYYY-MM-DD.json \
  --liveness $DATE_DIR/liveness-YYYY-MM-DD.json \
  --output $DATE_DIR/adoption-funnel-YYYY-MM-DD.png

Note: Run after Step 6 and Step 6c since it reads both metrics-*.json and liveness-*.json.

Embed the chart in the report's Adoption Funnel section (placed after Most Engaged Operators, before Recommendations). Include a table showing each funnel stage, count, and percentage of the previous stage.

Step 5f: Generate Cloud-Detection-by-Version Chart

Plot how the cloud_detection_method outcome distributes per registry version. Each row is a version (top 12 by instance count plus a rolled-up "other"); each row is a stacked horizontal bar split by detection-method outcome (env, dmi, ecs_meta, k8s_heuristic, imds, unknown, "(field absent)" for pre-1.23.0).

This chart lets the report validate that fixes to cloud detection (issue #1093, PR #1106 in 1.24.2) actually moved the needle: the "unknown" red slice should shrink on the row for the version where the fix shipped, relative to older versions on the same chart.

/usr/bin/python3 .claude/skills/usage-report/generate_detection_by_version_chart.py \
  --csv $DATE_DIR/registry_metrics.csv \
  --output $DATE_DIR/detection-by-version-YYYY-MM-DD.png \
  --csv-out $DATE_DIR/detection-by-version-YYYY-MM-DD.csv \
  --snapshot-date YYYY-MM-DD

Outputs:

PNG chart with versions on the y-axis (sorted by instance count, largest at top), detection methods stacked on the x-axis with green-for-success / red-for-unknown / grey-for-field-absent colour coding.
CSV sidecar with one row per version, columns for each detection method count, plus a total. Useful for diffing across reports.

Embed the chart in a section titled "Cloud Detection Outcomes by Version" placed after Adoption Funnel and before Recommendations. Add a short narrative quoting the row for the latest release (1.24.2 and later) and contrasting it with 1.23.0 and 1.24.1 to show whether the fix is working in the wild on instances that adopted it.

Step 5g: Fetch GitHub Repository Stats

Collect community-growth signals for the upstream repo (agentic-community/mcp-gateway-registry) using the authenticated gh CLI. These numbers complement telemetry by showing project interest outside of deployed instances.

Run the helper script with the dated output directory as its only argument. It writes github_stats.json and github_contributors_count.txt into that directory and prints both for confirmation. Keep this as a single script call (do NOT inline the gh api pipelines into a larger && chain): one script invocation is one allow-listed command, so it runs without a permission prompt on every report date.

.claude/skills/usage-report/fetch_github_stats.sh $DATE_DIR

Record these numbers in the report and compare them against the previous report's github_stats.json (if present in the previous dated subfolder). Compute deltas for stars, forks, and contributors the same way telemetry metrics are compared.

Note: If gh is not authenticated or an API call fails, the script logs a short note and exits 0 (the file is omitted). Skip the GitHub section in the report in that case instead of failing the entire run.

Step 5h: Generate Community-vs-Internal Chart and Breakdown

Classify every registry instance as a community deployment or an internal/development deployment based on its reported version string, then produce a timeseries chart and a breakdown summary. This scans all CSV files across dated subdirectories like the other historical charts.

Classification rule:

Community: a clean release tag matching ^v?MAJOR.MINOR.PATCH(-pN)?$ (e.g. 1.0.0, 1.24.4, v1.0.20, v1.0.22-p1).
Internal: anything else (git-describe builds like 1.24.1-25-g5b4b2a30-main, bare commit hashes like f3d77e3, dev, sha-..., branch-suffixed builds). Instances in the known-internal allowlist are always counted as internal.
Unknown: an empty version string.

Each instance is attributed to the latest version it reported. Pass --yesterday as the previous complete day (YYYY-MM-DD - 1) so the headline breakdown reflects the last full day, and --exclude-incomplete-day as today's date so the chart series don't show a misleading trailing dip.

/usr/bin/python3 .claude/skills/usage-report/generate_prod_internal_chart.py \
  --csv-dir OUTPUT_DIR \
  --output $DATE_DIR/prod-internal-timeseries-YYYY-MM-DD.png \
  --summary-json $DATE_DIR/prod-internal-YYYY-MM-DD.json \
  --internal-instances .claude/skills/usage-report/known-internal-instances.md \
  --yesterday PREVIOUS-YYYY-MM-DD \
  --exclude-incomplete-day YYYY-MM-DD

Outputs:

A PNG with two panels: cumulative unique installs (community vs internal) and daily active installs (community vs internal) over time.
A JSON summary with yesterday (per-class active counts for the last complete day), cumulative (all-time per-class counts), per_version (cumulative unique instances per version within each class), and timeseries (per-day community/internal counts).

The renderer reads this JSON to populate the "Community vs Internal Deployments" section: the yesterday/cumulative headline numbers and the two per-version breakdown tables. Embed the chart in that section.

Step 6: Run Telemetry Analysis

Run the analysis script to compute all distributions, instance timelines, and metrics. This produces two files:

tables-YYYY-MM-DD.md -- pre-formatted markdown tables ready to embed in the report (with executive summary comparison at the top)
metrics-YYYY-MM-DD.json -- raw computed metrics as JSON (includes per_cloud_unique_installs)

The script automatically finds the most recent previous metrics-*.json file. Since output files are written to the dated subfolder ($DATE_DIR) but previous metrics live in sibling dated subfolders, you must pass --search-dir OUTPUT_DIR so the script searches the parent directory containing all dated subfolders:

INTERNAL_INSTANCES_FILE=".claude/skills/usage-report/known-internal-instances.md"
INTERNAL_FLAG=""
if [ -f "$INTERNAL_INSTANCES_FILE" ]; then
  INTERNAL_FLAG="--internal-instances $INTERNAL_INSTANCES_FILE"
fi

/usr/bin/python3 .claude/skills/usage-report/analyze_telemetry.py \
  --csv $DATE_DIR/registry_metrics.csv \
  --output-dir $DATE_DIR \
  --search-dir OUTPUT_DIR \
  --date YYYY-MM-DD \
  $INTERNAL_FLAG

--output-dir $DATE_DIR -- where to write tables-*.md and metrics-*.json
--search-dir OUTPUT_DIR -- where to search for previous metrics-*.json files (scans this directory and all subdirectories). If omitted, defaults to the parent of --output-dir.
--internal-instances -- path to known-internal-instances.md listing known internal registry instance IDs. When provided, internal instances are labeled "(internal)" in the Instance Lifetime and Identified Instances tables, a Most Active Instances table is generated with an Internal column, and stickiness metrics (3+ day non-internal count, longest-running non-internal instance) are computed and included in the JSON output.

Or with an explicit previous metrics file (skips auto-detection):

/usr/bin/python3 .claude/skills/usage-report/analyze_telemetry.py \
  --csv $DATE_DIR/registry_metrics.csv \
  --output-dir $DATE_DIR \
  --date YYYY-MM-DD \
  --previous-metrics OUTPUT_DIR/PREVIOUS-DATE/metrics-PREVIOUS-DATE.json \
  $INTERNAL_FLAG

Step 6b: Identify Internal vs Customer Instances

The --internal-instances flag passed in Step 6 handles internal instance identification automatically. The analysis script reads .claude/skills/usage-report/known-internal-instances.md (if it exists, since it is gitignored and may not be present on all machines) and:

Labels internal instances with "(internal)" in the Instance Lifetime and Identified Instances tables
Generates a "Most Active Instances" table ranked by activity score (max servers + agents + skills + search), with an Internal column and a Version column
Computes stickiness metrics (3+ day non-internal count, longest-running non-internal) and writes them to the JSON output under the stickiness key
Writes the list of internal instance IDs to the JSON output under internal_instance_ids

Step 6d: Internal Installs Section (telemetry-driven, issue #1216/#1217)

In addition to the manual known-internal-instances.md allowlist, the analysis emits a dedicated Internal Installs markdown section driven by the internal_only_deployment / internal_deployment_type telemetry fields (added in registry release 1.24.5). The section includes:

A total count of internal installs in the window, broken down by type (dev / workshop / other).
An "Internal Installs Over Time" monthly timeseries table across all internal types.
A capture-start note stating that classification is captured from release 1.24.5 onwards (pre-1.24.5 installs are undercounted and rely on the manual allowlist).

The computed data is written to the JSON output under the internal_installs key (total, by_type, timeseries, since_release). These fields come from the CSV export, so the bastion export script (telemetry_db.py) must include the internal_only_deployment and internal_deployment_type columns (it does as of #1216).

If the file does not exist, the script treats all instances as external (no internal labeling, stickiness counts all instances).

When writing the report:

Clearly label known internal instances in the Instance Lifetime table and Registry Instances table (e.g., add "(internal)" suffix or a dedicated column)
Separate metrics: Report total fleet numbers AND customer-only numbers (excluding internal instances). For example: "97 total instances (3 known internal + possibly more, ~94 potential customer instances)"
Flag unusual activity from internal instances: If internal instances show disproportionate activity (e.g., many registered servers/agents/skills, heavy search usage, frequent restarts/heartbeats), explicitly note this is internal testing activity and NOT indicative of customer usage patterns
Note that additional internal instances may exist beyond the known list -- short-lived CI/CD runs, developer local setups, etc. may not be in the known list

The known internal instances are typically the longest-running, highest-activity instances since they are always-on development environments.

Step 6c: Run Liveness Analysis

Classify customer (non-internal) instances into liveness tiers based on recent heartbeat activity. Registry heartbeats are emitted once per 24 hours by default (MCP_TELEMETRY_HEARTBEAT_INTERVAL_MINUTES=1440, see registry/core/telemetry.py and registry/core/config.py), which makes heartbeat counts a direct proxy for "is this deployment still running".

The script produces two files:

liveness-YYYY-MM-DD.md -- a pre-formatted markdown section (tier summary table, confirmed-alive instance list, cloud/compute/auth breakdowns) ready to embed in the report
liveness-YYYY-MM-DD.json -- raw counts and instance ID lists, used for delta tracking in future reports

/usr/bin/python3 .claude/skills/usage-report/analyze_liveness.py \
  --csv $DATE_DIR/registry_metrics.csv \
  --metrics-json $DATE_DIR/metrics-YYYY-MM-DD.json \
  --output-dir $DATE_DIR \
  --search-dir OUTPUT_DIR \
  --date YYYY-MM-DD \
  $INTERNAL_FLAG

Tiers defined:

Confirmed Alive (leading, revenue-countable): ≥ 5 heartbeats in the last 7 days -- a registry that has phoned home almost every day for a week
Stronger Alive (trailing): ≥ 10 heartbeats in the last 14 days -- durable two-week signal
Likely Alive: any event (startup or heartbeat) in the last 7 days
Silent-but-recent: event in last 7 days but < 5 heartbeats (new installs or heartbeat-disabled)
Dormant: no event in the last 14 days (probably deprovisioned)

If a previous liveness-*.json file is found in --search-dir, the "vs Previous" column in the tier summary table is populated with deltas. On first run, it shows "baseline".

Note: Run this after Step 6 since it reads metrics-YYYY-MM-DD.json for per-instance cloud/compute/auth metadata.

Step 7: Generate the Usage Report

The report is rendered deterministically from a template + structured data, not by an LLM. This eliminates hallucination, ensures every section is present, and keeps numbers reconciled with the analyzer outputs.

The renderer reads:

metrics-YYYY-MM-DD.json (key counts, distributions, version adoption, upgrade trajectories, sticky profiles, etc.)
liveness-YYYY-MM-DD.json (tier counts, confirmed-alive list)
ltv-spend-YYYY-MM-DD.json (yesterday/last-7-days/cumulative LTV)
install-forecast-YYYY-MM-DD.json (linear and recent-pace ETAs)
github_stats.json and github_contributors_count.txt
active-instances-YYYY-MM-DD.csv (DAI / MA7 / streak time series)
lifetime-buckets-YYYY-MM-DD.csv (retention-over-time)
compute-platform-snapshots-YYYY-MM-DD.md (per-platform-growth table)
detection-by-version-YYYY-MM-DD.csv (powers the dynamic narrative)
tables-YYYY-MM-DD.md (Most Active / Largest Catalogs / Most Engaged Operators tables, copied verbatim)

Plus the most recent prior dated subfolder's metrics-*.json, liveness-*.json, and github_stats.json (auto-discovered via --search-dir) for delta calculations.

/usr/bin/python3 .claude/skills/usage-report/render_report.py \
  --date YYYY-MM-DD \
  --output-dir $DATE_DIR \
  --search-dir OUTPUT_DIR

Outputs $DATE_DIR/ai-registry-usage-report-YYYY-MM-DD.md.

Architecture

The renderer is composed of three pieces, all under .claude/skills/usage-report/:

report_template.md -- the canonical report markdown with {placeholder} slots for every dynamic value. Static prose lives here verbatim. Section ordering, headings, and chart references are fixed.
render_report.py -- reads the JSON/CSV inputs, computes every placeholder value (including derived metrics like ARR projection, retention deltas vs earliest snapshot, and the dynamic Cloud-Detection-by-Version narrative), and substitutes them into the template via str.format_map.
recommendations.py -- rule-based generator for the Recommendations section. Each rule examines the rendered template-vars dict and returns a bullet when its trigger fires (e.g., "Confirmed Alive crossed a 10-multiple", "longest customer hit a week milestone", "ARR crossed a threshold"). No LLM in the loop.

Mandatory Charts Checklist

The report MUST embed all 13 charts (the template's ![...] references). If any chart file is missing the renderer will substitute the path verbatim and pandoc will render a broken-image placeholder, so generate all charts (Steps 5 through 5h) before invoking render_report.py.

registry-installs-timeseries-YYYY-MM-DD.png (cloud provider: cumulative + daily-active + daily-new)
instance-distribution-YYYY-MM-DD.png (6-panel faceted, all customers)
instance-distribution-active-PREVIOUS-YYYY-MM-DD.png (6-panel faceted, active yesterday)
instance-lifetime-YYYY-MM-DD.png (age histogram + boxplot + buckets)
lifetime-buckets-YYYY-MM-DD.png (retention % over time)
active-instances-YYYY-MM-DD.png (DAI + MA7 + streak)
compute-installs-timeseries-YYYY-MM-DD.png (compute platform cumulative + daily)
install-forecast-YYYY-MM-DD.png (OLS + recent-pace to 1,000)
daily-reporters-YYYY-MM-DD.png (daily AWS reporters: all + persisted >=2 events + persisted >=2 days)
ltv-spend-YYYY-MM-DD.png (daily compute + bedrock + cumulative)
adoption-funnel-YYYY-MM-DD.png (funnel from total to confirmed-alive)
detection-by-version-YYYY-MM-DD.png (cloud_detection_method outcomes per version)
prod-internal-timeseries-YYYY-MM-DD.png (community vs internal: cumulative + daily-active)

Editing the report

To change report content (add a section, reword prose, change a table layout): edit report_template.md. Any new dynamic values needed go in render_report.py under _build_template_vars. New rule-based recommendation triggers go in recommendations.py.

To change a recommendation's trigger threshold (e.g., milestone count, ARR threshold): edit recommendations.py only.

Do NOT have an LLM regenerate the report markdown -- that path is deprecated. The renderer is the source of truth.

Step 8: Augment with LLM Commentary

The deterministic render produces clean, factually accurate markdown but lacks the synthesizing analyst voice that makes a report feel rich. This step layers commentary onto the deterministic output without modifying any of its numbers, tables, or charts.

The pipeline is split so the LLM has a narrowly-scoped, hard-to-screw-up job:

# 1. Extract anchors: walks the rendered markdown, finds every
#    `<!-- COMMENTARY:section_id -->` marker, and emits a manifest
#    with one entry per marker (containing the section text + the
#    instructions the LLM must follow).
/usr/bin/python3 .claude/skills/usage-report/augment_with_commentary.py extract \
  --md $DATE_DIR/ai-registry-usage-report-YYYY-MM-DD.md \
  --date YYYY-MM-DD \
  --output $DATE_DIR/commentary-manifest.json

# 2. THE LLM STEP: read commentary-manifest.json and produce
#    commentary.json mapping section_id -> 2-4 sentence paragraph.
#    The manifest's `instructions_for_llm` field has the exact
#    constraints (no inventing numbers, no markdown headings, etc).
#    Save the resulting JSON as $DATE_DIR/commentary.json.

# 3. Apply: substitute every `<!-- COMMENTARY:section_id -->` marker
#    with `_Commentary: <text>_`. Markers with no entry are silently
#    dropped. The augmenter cannot modify any other content; only
#    marker positions are touched.
/usr/bin/python3 .claude/skills/usage-report/augment_with_commentary.py apply \
  --md $DATE_DIR/ai-registry-usage-report-YYYY-MM-DD.md \
  --commentary $DATE_DIR/commentary.json

Why this design

Append-only semantics. The augmenter is a pure marker-substitution. The original deterministic markdown is the source of truth for every number, table, and chart. The LLM cannot retcon the data.
Narrow LLM job. The LLM produces a flat JSON of short prose paragraphs, not the entire report. Easy to validate, easy to constrain in the prompt.
Drop-safe. If the LLM omits a section_id or returns an empty string, that section's commentary is dropped (rendered as no commentary). The deterministic output remains valid.
Section-context isolation. Each manifest entry includes only the section text the LLM needs to write commentary for that section. The LLM cannot see other sections' content as it generates each paragraph, which discourages cross-section invention.

Required commentary anchors (in template)

The current template has 21 commentary anchors covering: executive_summary, cloud_installs, deployment_distribution, lifetime_retention, liveness, engagement, compute_platform, version_adoption, prod_internal, upgrade_trajectories, feature_adoption, sticky_breakdown, most_active, install_forecast, daily_reporters, ltv_arr, adoption_funnel, cloud_detection, github, architecture, recommendations.

Adding/removing anchors: edit report_template.md to insert or delete  markers; the augmenter's extract phase will pick up the new set automatically.

Step 9: Generate Self-Contained HTML

Convert the markdown report to a single self-contained HTML file using pandoc. The chart PNG is base64-embedded so the HTML works standalone. Run from the DATE_DIR so relative image paths resolve:

cd $DATE_DIR && pandoc ai-registry-usage-report-YYYY-MM-DD.md \
  -o ai-registry-usage-report-YYYY-MM-DD.html \
  --embed-resources --standalone \
  --css=.claude/skills/usage-report/report-style.css \
  --metadata title="AI Registry - Usage Report YYYY-MM-DD"

The report-style.css file in the skill directory provides a clean, professional layout. Pandoc must be installed:

which pandoc >/dev/null || sudo apt-get install -y pandoc

Step 10: Present Results

After generating the report:

Display the Executive Summary (with comparison deltas, including GitHub stars delta) and Key Metrics directly in the conversation
Tell the user the full report path, HTML path, CSV path, and chart paths
Highlight the most interesting findings and notable changes from the previous report (telemetry + GitHub)

Error Handling

SSH connection fails: Check that the bastion IP is correct and security group allows your IP. The allowed CIDRs are in terraform/telemetry-collector/terraform.tfvars under bastion_allowed_cidrs.
Export returns 0 documents: The telemetry collector may not have received any events yet. Check that telemetry_enabled is true in registry settings and the collector endpoint is reachable.
Terraform output fails: Make sure you're in the right directory and have run terraform init.

Example Usage

User: /usage-report

Output:

Executive Summary: 31479 events from 562 unique registry instances over 55 days...
Compared to previous report (2026-05-20): +2299 events (+8%), +26 new instances (+5%)

Full report: .scratchpad/usage-reports/2026-05-22/ai-registry-usage-report-2026-05-22.md
HTML report: .scratchpad/usage-reports/2026-05-22/ai-registry-usage-report-2026-05-22.html
Charts (10):
  - registry-installs-timeseries-2026-05-22.png
  - compute-installs-timeseries-2026-05-22.png
  - instance-distribution-2026-05-22.png
  - instance-distribution-active-2026-05-21.png
  - instance-lifetime-2026-05-22.png
  - lifetime-buckets-2026-05-22.png
  - active-instances-2026-05-22.png
  - install-forecast-2026-05-22.png
  - ltv-spend-2026-05-22.png
  - adoption-funnel-2026-05-22.png
CSV data: .scratchpad/usage-reports/2026-05-22/registry_metrics.csv

User: /usage-report /tmp/reports

Output saved to /tmp/reports/2026-05-22/.

Output Directory Structure

.scratchpad/usage-reports/
  2026-05-22/
    # Report files
    ai-registry-usage-report-2026-05-22.md
    ai-registry-usage-report-2026-05-22.html

    # Charts (10 mandatory PNGs)
    registry-installs-timeseries-2026-05-22.png
    compute-installs-timeseries-2026-05-22.png
    instance-distribution-2026-05-22.png
    instance-distribution-active-2026-05-21.png
    instance-lifetime-2026-05-22.png
    lifetime-buckets-2026-05-22.png
    active-instances-2026-05-22.png
    install-forecast-2026-05-22.png
    ltv-spend-2026-05-22.png
    adoption-funnel-2026-05-22.png

    # Analysis outputs
    tables-2026-05-22.md
    metrics-2026-05-22.json
    liveness-2026-05-22.md
    liveness-2026-05-22.json
    compute-platform-snapshots-2026-05-22.md

    # CSV sidecars
    registry_metrics.csv
    active-instances-2026-05-22.csv
    ltv-spend-2026-05-22.csv
    lifetime-buckets-2026-05-22.csv

    # JSON summaries
    ltv-spend-2026-05-22.json
    install-forecast-2026-05-22.json
    github_stats.json
    github_contributors_count.txt