core-web-vitals - SKILL.md Agent Skill

name: core-web-vitals description: Run Google Core Web Vitals and PageSpeed audits against URLs. Use when asked to check site performance, CWV scores, LCP/CLS/INP/FCP/TTFB metrics, PageSpeed Insights, compare two sites, compare Shopify preview themes vs production, or bulk-audit URLs from a Google Sheet. Supports single URL, compare (two URLs side-by-side including Shopify theme QA), batch (multiple URLs), local measurement (Puppeteer, no API needed), and Google Sheet modes with CrUX field data preferred, lab fallback, and browser scraping for errors.

Core Web Vitals Skill

Audit website performance using Google's CrUX field data (real user metrics) and PageSpeed Insights API.

Prerequisites

GOOGLE_PAGESPEED_API_TOKEN env var (Google PageSpeed Insights API key)
gog CLI for Google Sheets operations (alternative to service account auth)
agent-browser CLI for web.dev scraping fallback

Environment Setup

The scripts auto-load .env files from the working directory or skill root. No manual export needed.

If GOOGLE_PAGESPEED_API_TOKEN is in a .env file, the scripts will find it automatically. Users can also pass --api-key inline.

⚠️ NEVER echo, print, or display the API key value.

Five Modes

1. Single URL

User provides one URL. Run the API, return formatted results inline.

2. Compare (two URLs)

User provides two URLs to compare. Run both, show side-by-side with winner highlighted per metric.

3. Batch (multiple URLs)

User provides URLs separated by line breaks. Run each, return formatted table.

4. Local Measurement (--local flag)

Measure CWV locally using Puppeteer + web-vitals library. No API key needed. Supports desktop and mobile (with throttling). Use for batch measurements without API quota limits, testing auth-protected sites, or measuring Shopify preview URLs.

5. Google Sheet

User provides a Sheet URL. Read URLs from column A, write results to columns B-N with conditional formatting.

Data Source Priority

CrUX Field Data (preferred) — Real user metrics from Chrome UX Report (28-day p75 values). This matches what web.dev shows prominently.
Lab Data (fallback) — Lighthouse synthetic test. Used when CrUX has insufficient traffic data.
Browser Scraping (error retry) — Load web.dev in agent-browser, wait for results, parse the page. Used when API times out.

Important: CrUX field data ≠ Lighthouse lab data. Field data reflects real users; lab data simulates a throttled mobile device. Always prefer field data — it's what Google uses for ranking signals.

Metrics Collected

Core Web Vitals (Field-Comparable)

Available in both API mode (CrUX field data) and local mode (lab measurement):

Metric	Field (CrUX)	Lab (Local)	Good	Needs Improvement	Poor
LCP (s)	✅ p75	✅	≤ 2.5	2.5–4.0	> 4.0
CLS	✅ p75	✅	≤ 0.1	0.1–0.25	> 0.25
FCP (s)	✅ p75	✅	≤ 1.8	1.8–3.0	> 3.0
TTFB (s)	✅ p75	✅	≤ 0.8	0.8–1.8	> 1.8

INP (API Mode Only)

Metric	Field (CrUX)	Lab (Local)	Good	Needs Improvement	Poor
INP (ms)	✅ p75	❌ Not available	≤ 200	200–500	> 500

Why is INP not available in local mode?

INP requires PerformanceEventTiming entries from real user interactions. Puppeteer's synthetic events (click, keyboard, pointer) do not generate these browser API entries. This is a fundamental limitation of headless testing, not a bug.

Even Google Lighthouse doesn't measure INP in lab tests — it uses TBT as a proxy instead. Local mode follows the same pattern.

Lab-Only Metrics (Local Mode)

Calculated from synthetic testing, not available from CrUX field data:

Metric	Description	Good	Needs Improvement	Poor
TBT (ms)	Total Blocking Time — time main thread blocked by long tasks (>50ms) between FCP and TTI	≤ 200	200–600	> 600
SI (ms)	Speed Index — how quickly page contents are visually populated	≤ 3400	3400–5800	> 5800
TTI (s)	Time to Interactive — when page becomes fully interactive (5s quiet window)	≤ 3.8	3.8–7.3	> 7.3

CWV Assessment: FAST / AVERAGE / SLOW (from CrUX overall_category, API mode only)

Modes 1-4: Single, Compare, Batch, and Local

All handled by scripts/pagespeed-single.py. The script auto-detects the mode based on URL count:

# Single URL (Google API)
python3 scripts/pagespeed-single.py example.com

# Compare two URLs (Google API)
python3 scripts/pagespeed-single.py site-a.com site-b.com

# Batch (3+ URLs, Google API)
python3 scripts/pagespeed-single.py site1.com site2.com site3.com

# Local measurement (Puppeteer, no API)
python3 scripts/pagespeed-single.py --local example.com

# Local with mobile emulation
python3 scripts/pagespeed-single.py --local --mobile example.com

# Local compare mode
python3 scripts/pagespeed-single.py --local site-a.com site-b.com

# With inline API key
python3 scripts/pagespeed-single.py --api-key YOUR_KEY example.com

Local Mode (--local flag)

When --local is passed, the script shells out to scripts/pagespeed-local.js which uses Puppeteer + web-vitals library to measure CWV locally.

Prerequisites:

npm install puppeteer web-vitals in the skill directory

What it does:

Launches headless Chromium via Puppeteer
Injects web-vitals IIFE library via evaluateOnNewDocument before page load
Collects CWV metrics: LCP, FCP, CLS, TTFB (NO INP — see above)
Calculates lab metrics: TBT, SI, TTI from long tasks and paint events (using PerformanceObserver)
Waits for metrics to stabilize (5-10s after load)
Forces LCP finalization by clicking body
Returns metrics in same format as API mode

Mobile mode (--mobile flag):

Viewport: 412x823 (Pixel 4)
Network throttling: 3G (1.5 Mbps down, 750 Kbps up, 40ms latency)
CPU throttling: 4x slowdown
Mobile user agent

Output format:

📱 Mobile (Local (Puppeteer)):
  CWV: LCP: 4.3s 🔴 | CLS: 0.31 🔴 | FCP: 1.6s 🟢 | TTFB: 0.1s 🟢
  Lab: TBT: 450ms 🟡 | SI: 2880ms 🟢 | TTI: 5.2s 🟡

Data source labeled as "Local (Puppeteer)" vs "CrUX field" or "Lab"

When to use:

No PageSpeed API key available
Measuring sites behind auth/firewall
Batch measurements without API quota limits
Testing Shopify preview URLs (production theme vs dev theme in preview mode)
Get bonus lab metrics (TBT, SI, TTI) not available in CrUX field data

Limitations:

INP not available — requires real user interactions (use API mode for INP)
Results may vary slightly between runs
Mobile throttling is simulated, not real device performance
Lab metrics measure synthetic load, not real-world field performance

Shopify Theme QA Detection

When comparing two URLs, detect if they share the same domain with ?preview_theme_id= parameters. If so, this is a Shopify theme QA comparison (not a competitor comparison). Adjust the output framing:

Competitor compare (different domains): "Who's faster?" — neutral winner per metric
Theme QA (same domain, both have preview_theme_id): "Did we regress?" — first URL is "Before" (production theme in preview mode), second is "After" (development theme in preview mode). Both must be preview URLs for apples-to-apples comparison since Shopify preview mode adds inherent overhead vs the live site.

For theme QA, use this framing:

Label URLs as "Before" (theme ID from first URL) and "After" (theme ID from second URL)
Show the delta for each metric (e.g., "+0.7s", "-5ms")
Flag regressions with ⚠️ ("LCP regressed 0.7s")
Flag improvements with 🎉 ("INP improved 5ms")
Summary: "X regressions, Y improvements, Z unchanged"
If ANY CWV metric crosses a threshold boundary (green→yellow, yellow→red), add a strong warning: "🚨 Do not publish — CWV regression detected"

Running Scripts

The script auto-loads .env from the working directory (searching upward). Run from the project root in a single Bash call:

cd /path/to/project && python3 /path/to/skills/core-web-vitals/scripts/pagespeed-single.py example.com

Or if the user's .env is elsewhere, source it first in the same command:

export $(grep -v '^#' /path/to/.env | xargs) && python3 scripts/pagespeed-single.py example.com

Do NOT:

Check for env vars in a separate Bash call (each call is a new shell)
Write your own curl commands or inline Python
Echo or verify the API key value

Mode 3: Google Sheet

Use scripts/pagespeed-bulk.py in this skill directory.

Prerequisites

GOOGLE_PAGESPEED_API_TOKEN env var (or pass --api-key)
Google Sheets auth: either a service account JSON key (--credentials) or gog CLI (--account)
The Google Sheet must be editable by the auth identity (share with service account email or gog account)
Column A must contain URLs (starting at A2, A1 is header)

Setup

Extract spreadsheet ID from the Google Sheet URL (the long string between /d/ and /edit)
Set headers B1:N1 via Sheets API or manually:
- B1: M-LCP (s), C1: M-CLS, D1: M-INP (ms), E1: M-FCP (s), F1: M-TTFB (s), G1: M-CWV
- H1: D-LCP (s), I1: D-CLS, J1: D-INP (ms), K1: D-FCP (s), L1: D-TTFB (s), M1: D-CWV
- N1: Data Source

Run

# With service account
python3 -u scripts/pagespeed-bulk.py SPREADSHEET_ID --credentials service-account.json

# With gog CLI
python3 -u scripts/pagespeed-bulk.py SPREADSHEET_ID --account EMAIL

# Resume from index N, custom workers
python3 -u scripts/pagespeed-bulk.py SPREADSHEET_ID --credentials sa.json --start N --workers 6

Monitor

tail -f /tmp/pagespeed.log

Retry Errors (browser scraping)

After main run completes, retry ERROR rows via web.dev:

python3 -u scripts/pagespeed-retry-browser.py SPREADSHEET_ID --credentials sa.json

Conditional Formatting

Apply via Google Sheets batchUpdate API (see references/conditional-formatting.md). Colors:

Green: good threshold
Yellow: needs improvement
Red: poor threshold

Applied to all metric columns for all data rows.

Key Learnings

gog CLI quirk: Multiple positional values get concatenated into one cell. Always write ONE cell per update call, or use the Sheets API directly for batch row writes.
API timeouts: Heavy Shopify sites often timeout (60-90s). Use browser scraping fallback.
CrUX vs Lab confusion: Users expect web.dev numbers. Web.dev defaults to showing CrUX field data prominently. Always use CrUX when available.
Rate limits: 25K requests/day, 400/100s with API key. Parallel workers (4) stay well under limits.
CLS values from CrUX: Returned as percentile × 100 (e.g., 26 = 0.26). Divide by 100.
Batch Sheets API writes: Use PUT to spreadsheets/{id}/values/{range}?valueInputOption=RAW for whole-row writes (much faster than per-cell gog CLI calls).
Read full sheet range: Always check actual row count. Don't assume 200 rows.