name: browser-dev-cycle
description: Full development cycle browser automation for viewing, debugging, testing, and visual inspection of web apps. Five-tier strategy: @playwright/mcp (interaction), @playwright/cli (AI agent disk-based), CDP CLI (scripts/cdp.mjs) (performance), Playwright-core (scripting), and Playwright CLI (e2e test suites). Use when users need to browse, take screenshots, debug network/performance, run e2e tests, record tests with codegen, or do visual QA. Triggers on "browser", "screenshot", "viewport", "performance trace", "network debug", "visual QA", "responsive test", "e2e test", "playwright test", "codegen", "test suite".
Browser Dev Cycle: Five-Tier Automation Strategy
Comprehensive browser automation covering interaction, performance analysis, and scripted testing. Use the lightest tier that gets the job done.
Companion agent: sentinel — when full multi-role E2E/UAT driving is needed (login as admin/editor/viewer, RBAC verification, console capture, accessibility audit), delegate to the sentinel agent which loads this skill before acting.
1. Decision Tree
Quick Selector
| Task | Tier | Tool |
|---|---|---|
| Navigate, click, fill forms, screenshot | Tier 1 | @playwright/mcp |
| Accessibility snapshot | Tier 1 | browser_snapshot |
| AI agent browser tasks (token-efficient) | Tier 1.5 | @playwright/cli |
| Disk-based YAML snapshots for agents | Tier 1.5 | @playwright/cli |
| Performance trace / profiling | Tier 2 | CDP CLI (scripts/cdp.mjs) |
| Core Web Vitals | Tier 2 | CDP CLI (scripts/cdp.mjs) |
| Network HAR detail / response bodies | Tier 2 | CDP CLI (scripts/cdp.mjs) |
| Console errors with stack traces | Tier 2 | CDP CLI (scripts/cdp.mjs) |
| CSS computed styles debugging | Tier 2 | CDP CLI (scripts/cdp.mjs) |
| Network mocking / interception | Tier 3 | Playwright-core scripts |
| Viewport matrix testing | Tier 3 | Playwright-core scripts |
| Video / trace recording | Tier 3 | Playwright-core scripts |
| State save / restore | Tier 3 | Playwright-core scripts |
| Multi-page orchestration | Tier 3 | Playwright-core scripts |
| Visual regression | Tier 1 + Tier 3 | Screenshot compare |
| Write/run E2E test suites | Tier 4 | Playwright CLI (npx playwright test) |
| Record test from user actions | Tier 4 | Codegen (npx playwright codegen) |
ASCII Decision Tree
What do you need?
|
+-- Basic interaction (nav, click, type, screenshot)
|   +-> Tier 1: @playwright/mcp
|       Direct MCP tool calls. No scripting needed.
|
+-- AI agent browser tasks (token-efficient, disk-based)
|   +-> Tier 1.5: @playwright/cli
|       Shell commands. YAML snapshots saved to disk. 4x fewer tokens than MCP.
|
+-- Performance profiling / network analysis
|   +-> Tier 2: CDP CLI (`scripts/cdp.mjs`)
|       CPU traces, Core Web Vitals, response bodies, computed styles.
|
+-- Scripted tests / network mocking / recording
|   +-> Tier 3: Playwright-core CDP scripts
|       Full Playwright API via CDP connection. Write and run .mjs scripts.
|
+-- Formal test suites / codegen / cross-browser
|   +-> Tier 4: Playwright CLI
|       Full test runner, codegen, traces, reports. Write and run .spec.ts files.
|
+-- Multiple of the above
    +-> Combine tiers as needed
Selection Rules
- Start with Tier 1 for any interactive task; it covers roughly 80% of browser automation needs.
- Use Tier 1.5 when an AI agent needs token-efficient browser automation with disk-based state: YAML snapshots and screenshots are written to disk, which uses roughly 4x fewer tokens than MCP.
- Escalate to Tier 2 when you need performance metrics, network bodies, or computed CSS.
- Escalate to Tier 3 when you need programmatic control (loops, conditionals, mocking, recording).
- Escalate to Tier 4 when you need repeatable test suites, codegen recording, trace capture, or cross-browser testing.
- Tier 1 + Tier 2 can run against the same browser instance simultaneously via separate CDP sessions.
- Tier 3 scripts run as standalone Node.js processes and manage their own connections.
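The selection rules above can be sketched as a small lookup. This is only an illustration: the function name `selectTier` and the keyword lists are assumptions chosen for the example, loosely following the Quick Selector table, and are not part of any tool in this skill.

```javascript
// Illustrative mapping from task descriptions to tiers.
// The keyword lists are assumptions made for this sketch, not an official taxonomy.
const TIER_RULES = [
  { tier: 'Tier 4', keywords: ['e2e', 'test suite', 'codegen', 'cross-browser'] },
  { tier: 'Tier 3', keywords: ['mock', 'viewport matrix', 'video', 'state save', 'multi-page'] },
  { tier: 'Tier 2', keywords: ['performance', 'web vitals', 'response body', 'stack trace', 'computed style'] },
  { tier: 'Tier 1.5', keywords: ['token', 'yaml', 'disk'] },
];

function selectTier(task) {
  const text = task.toLowerCase();
  // Check the most specialized tiers first so they win over the default.
  for (const rule of TIER_RULES) {
    if (rule.keywords.some((keyword) => text.includes(keyword))) {
      return rule.tier;
    }
  }
  // Default: start with Tier 1 for any interactive task.
  return 'Tier 1';
}

console.log(selectTier('Record a performance trace')); // Tier 2
console.log(selectTier('Click the login button'));     // Tier 1
```

Checking the heavier tiers first mirrors the escalation rules: a task only falls through to Tier 1 when nothing in its description calls for profiling, scripting, or a formal test suite.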
2. TIER 1: @playwright/mcp
Primary tool for browser interaction. Tools are called directly from Claude via MCP — no scripting needed.
Setup (add to .mcp.json):
{
  "mcpServers": {
    "playwright": {
      "command": "cmd",
      "args": ["/c", "npx", "-y", "@playwright/mcp@latest"],
      "type": "stdio"
    }
  }
}
Core workflow:
1. browser_navigate -> Load the page
2. browser_snapshot -> Get accessibility tree with element refs
3. browser_click / browser_type / browser_select_option -> Interact using refs
4. browser_snapshot -> Re-read after DOM changes (refs are invalidated)
Critical: After any navigation or DOM change, call browser_snapshot again — prior refs are stale.
Key tools: browser_navigate, browser_snapshot, browser_click, browser_type, browser_select_option, browser_press_key, browser_hover, browser_wait_for, browser_evaluate, browser_tab_new/select/close, browser_take_screenshot (vision mode), browser_handle_dialog
Full tool reference with all parameters: references/tier1-playwright-mcp.md
2.5. TIER 1.5: @playwright/cli (AI Agent Browser CLI)
Token-efficient browser automation for AI coding agents. Saves state to disk (YAML snapshots, screenshots) instead of streaming into context — ~4x fewer tokens than MCP per task.
Install: npm install -g @playwright/cli@latest
Key difference from MCP: State lives on disk, not in context. Agent reads snapshots selectively via Read tool.
Core commands:
| Command | Purpose |
|---|---|
| `playwright-cli open [url]` | Open browser and navigate |
| `playwright-cli snapshot` | Save YAML accessibility snapshot to disk |
| `playwright-cli screenshot` | Save screenshot to disk |
| `playwright-cli click REF` | Click element by ref from snapshot |
| `playwright-cli fill REF TEXT` | Fill input field |
| `playwright-cli type TEXT` | Type text |
| `playwright-cli select REF VAL` | Select dropdown option |
| `playwright-cli eval EXPR` | Evaluate JavaScript |
| `playwright-cli close` | Close page |
Workflow:
1. playwright-cli open <url> # Open browser
2. playwright-cli snapshot # Save YAML to disk
3. Read the YAML file # Agent reads selectively (low tokens)
4. playwright-cli click <ref> # Interact using refs from snapshot
5. playwright-cli snapshot # Re-snapshot after DOM changes
Options: --headed (show browser), --persistent (save profile)
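Step 3 of the workflow (reading the snapshot selectively) might look like the sketch below. It assumes each element line in the saved snapshot carries a marker of the form `[ref=e5]`; that format, the sample text, and the helper name `findRefs` are assumptions for illustration, so check a real snapshot file before relying on the pattern.

```javascript
// Return the refs of snapshot lines that mention the given text.
// Assumes element lines contain a marker like [ref=e5] (an assumption, see above).
function findRefs(snapshotText, query) {
  const needle = query.toLowerCase();
  const refs = [];
  for (const line of snapshotText.split('\n')) {
    if (!line.toLowerCase().includes(needle)) continue;
    const match = line.match(/\[ref=([^\]]+)\]/);
    if (match) refs.push(match[1]);
  }
  return refs;
}

// Hypothetical snapshot excerpt, used only to exercise the helper.
const sample = [
  '- heading "Settings" [ref=e2]',
  '- textbox "Email" [ref=e4]',
  '- button "Save" [ref=e5]',
].join('\n');

console.log(findRefs(sample, 'save')); // [ 'e5' ]
```

Filtering on disk like this keeps only the matching refs in context, which is the point of the disk-based approach.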
When to use Tier 1.5 over Tier 1:
- Agent-driven automation where token budget matters
- Long multi-step flows (token savings compound)
- When you need snapshots persisted to disk for later analysis
- Batch operations across multiple pages
When to use Tier 1 (MCP) instead:
- Interactive development (faster feedback loop)
- One-off inspections
- When you need the full 70+ MCP tool set (file upload, PDF, drag-drop)
3. TIER 2: CDP CLI (scripts/cdp.mjs)
For performance analysis, network debugging, and CSS inspection.
See references/tier2-devtools.md for the full CDP CLI command reference.
Quick usage:
node scripts/cdp.mjs perf # Performance metrics
node scripts/cdp.mjs network # Network requests
node scripts/cdp.mjs a11y # Accessibility audit
node scripts/cdp.mjs lighthouse <url> # Lighthouse scores
4. TIER 3: Playwright-core CDP Scripting
For programmatic control — loops, mocking, recording, viewport matrix.
Connect to Chrome:
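import { chromium } from 'playwright-core';

// Attach to a Chrome instance started with --remote-debugging-port=9222
const browser = await chromium.connectOverCDP('http://localhost:9222');
// Reuse the existing default context and its first tab, or open a new tab
const context = browser.contexts()[0];
const page = context.pages()[0] || await context.newPage();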
Unique capabilities: page.route() (network mock), page.routeFromHAR(), context.tracing, page.setViewportSize(), context.storageState(), page.waitForFunction()
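As one example of these capabilities, a network mock built on `page.route()` could be sketched as follows. The helper name `mockJson`, the URL pattern, and the payload are hypothetical; `page` is assumed to be a Playwright `Page` such as the one obtained from the CDP connection above.

```javascript
// Register a JSON mock for every request whose URL matches urlPattern.
// `page` is supplied by the caller; nothing here launches or connects to a browser.
function mockJson(page, urlPattern, payload, status = 200) {
  return page.route(urlPattern, (route) =>
    route.fulfill({
      status,
      contentType: 'application/json',
      body: JSON.stringify(payload),
    })
  );
}

// Usage inside a Tier 3 script (hypothetical endpoint and data):
//   await mockJson(page, '**/api/users', [{ id: 1, name: 'Test User' }]);
//   await page.reload();
```

Passing `page` in as a parameter keeps the helper independent of how the script connected, so the same function can be reused across scripts.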
Pre-built scripts in scripts/: browser-setup.ps1, viewport-test.mjs, playwright-helper.mjs, network-mock.mjs
Full patterns and code examples: references/tier3-scripting.md
5. TIER 4: Playwright CLI Testing
Full Playwright test runner for formal E2E test suites, interactive codegen, and trace debugging.
Setup: Already installed in continuous-claude (npm install -D playwright @playwright/test && npx playwright install chromium).
Key commands:
| Command | Use |
|---|---|
| `npm run test:e2e` | Run all tests |
| `npm run test:e2e:headed` | Watch tests run visually |
| `npm run test:e2e:debug` | Step through with inspector |
| `npm run test:e2e:codegen` | Record interactions interactively |
| `npm run test:e2e:report` | Open HTML test report |
Full reference: See references/tier4-cli-testing.md for codegen workflow, Ralph integration, and when to use CLI vs MCP.
6. Error Recovery
| Error | Cause | Recovery |
|---|---|---|
| "Element not found" / "No element with ref" | Stale ref after DOM change | Call browser_snapshot again, find element with new ref |
| "Target page closed" | Navigation changed page | Reconnect; get a new page reference |
| "Navigation timeout" | Slow page or server down | Increase timeout; check server is running |
| "Connection refused" on CDP | Chrome not running with debug port | Launch Chrome with --remote-debugging-port=9222 |
| "Protocol error" | CDP connection dropped | Reconnect to CDP endpoint |
| No MCP tools available | MCP server not started | Restart the Claude session |
| "Execution context destroyed" | SPA route change during evaluate | Get fresh page reference; re-snapshot |
| "Dialog is active" | Unhandled alert/confirm | Call browser_handle_dialog before other actions |
Stale ref recovery (most common):
Error: Element with ref "e5" not found
Recovery:
1. browser_snapshot -> get fresh accessibility tree
2. Find the element again -> it may have a new ref like "e17"
3. browser_click ref="e17" -> use the new ref
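For scripts that need to react to these errors automatically, the error table can be expressed as a lookup. This is a sketch only: the regular expressions are assumptions about how the messages are worded, and `recoveryFor` is a hypothetical helper name.

```javascript
// Recovery advice copied from the error table; the patterns are approximations.
const RECOVERIES = [
  { pattern: /element.*not found|no element with ref/i, action: 'Call browser_snapshot again and use the new ref' },
  { pattern: /target page closed/i, action: 'Reconnect and get a new page reference' },
  { pattern: /navigation timeout/i, action: 'Increase the timeout and check that the server is running' },
  { pattern: /connection refused/i, action: 'Launch Chrome with --remote-debugging-port=9222' },
  { pattern: /protocol error/i, action: 'Reconnect to the CDP endpoint' },
  { pattern: /execution context destroyed/i, action: 'Get a fresh page reference and re-snapshot' },
  { pattern: /dialog is active/i, action: 'Call browser_handle_dialog before other actions' },
];

// Return the suggested recovery for an error message, or null if none matches.
function recoveryFor(message) {
  const hit = RECOVERIES.find((entry) => entry.pattern.test(message));
  return hit ? hit.action : null;
}

console.log(recoveryFor('Error: Element with ref "e5" not found'));
// Call browser_snapshot again and use the new ref
```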
Windows-specific:
- Chrome profile lock: close other Chrome instances or use a different `--user-data-dir`
- Port 9222 in use: `taskkill /F /IM chrome.exe` (closes all Chrome)
- `cmd /c npx` fails: ensure npm is in PATH
Retry strategy: Re-snapshot and retry → wait 2s + retry → stop after 3rd failure and diagnose.
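In a Tier 3 script, the retry strategy could be wrapped in a helper like the one below. The names `withRetry`, `action`, `resnapshot`, and `sleep` are hypothetical callbacks supplied by the caller, and re-snapshotting again after the 2s wait is an assumption added for this sketch.

```javascript
// Try `action` up to three times:
//   failure 1 -> re-snapshot, retry
//   failure 2 -> wait 2s, re-snapshot, retry
//   failure 3 -> stop and surface all errors for diagnosis
async function withRetry(
  action,
  resnapshot,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  const errors = [];
  for (let attempt = 1; attempt <= 3; attempt++) {
    try {
      return await action();
    } catch (err) {
      errors.push(err);
      if (attempt === 3) break;
      if (attempt === 2) await sleep(2000); // second retry waits 2s first
      await resnapshot(); // refs may be stale, so refresh before retrying
    }
  }
  const details = errors.map((e) => e.message).join('; ');
  throw new Error(`Giving up after 3 failures: ${details}`);
}
```

Collecting every error message rather than only the last one makes the final diagnosis easier, since the first failure is often the most informative.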
7. Workbook Platform Patterns
Workbook-specific SPA patterns (Next.js on Railway):
- Sidebar nav: click item → wait 1-2s → re-snapshot (SPA changes all refs)
- Command palette: `Control+k` → snapshot → type query → click result → re-snapshot
- Data tables: find header/pagination refs → click → wait 500ms → re-snapshot
- Modals: click trigger → wait 500ms → snapshot → fill form → click submit → verify closed
Full patterns, breakpoints, known issues, and testing checklist: references/workbook-patterns.md
8. Workflow Recipes
Common multi-step workflows: Visual QA, API Debugging, Responsive Testing, Performance Profiling, State Management, End-to-End Feature Testing.
See: references/workflow-recipes.md
9. Deprecation Notes
| Deprecated Tool | Issue | Replacement |
|---|---|---|
| `agent-browser` CLI (`ab`) | Windows daemon startup broken (GitHub #89, #90) | Tier 1: @playwright/mcp |
| Claude-in-Chrome MCP | 6+ Windows 11 bugs. Extension approach is fragile. | Tier 1: @playwright/mcp |
| Puppeteer MCP | Deprecated upstream. ESM import errors. | Tier 1: @playwright/mcp |
The legacy agent-browser skill at .claude/skills/agent-browser/SKILL.md is preserved for historical reference only.
Reference Files
| File | Contents |
|---|---|
| `references/tier1-playwright-mcp.md` | Full tool tables, all parameters, modes, best practices |
| `references/tier2-devtools.md` | Full tool tables, combined workflow, CWV targets |
| `references/tier3-scripting.md` | Setup, code patterns, pre-built scripts |
| `references/tier4-cli-testing.md` | Playwright CLI testing, codegen, Ralph integration |
| `references/workflow-recipes.md` | Step-by-step recipes for common workflows |
| `references/workbook-patterns.md` | Workbook SPA patterns, breakpoints, known issues |
| `references/tool-comparison.md` | Capability matrix across all tools |