name: browser
description: Automate browser interactions using agent-browser CLI — navigate, click, fill forms, scrape content, take screenshots, and more.
argument-hint: [args...]
allowed-tools: Bash(npx agent-browser *)
Browser Automation with agent-browser
Control a headless Chromium browser via CLI commands. All commands are run with npx agent-browser <command>.
Setup (first time only)
npx agent-browser install
Navigation
| Command |
Description |
open <url> |
Navigate to a URL |
close |
Close browser |
tab |
List open tabs |
tab new [url] |
Open new tab |
tab <n> |
Switch to tab n |
tab close [n] |
Close tab |
window new |
Open new browser window |
frame <selector> |
Switch to iframe |
frame main |
Return to main frame |
Page Inspection (use these to understand page structure)
| Command |
Description |
snapshot |
Get accessibility tree with @e1, @e2 element refs (best for AI) |
screenshot [path] |
Capture screenshot |
screenshot --full |
Full page screenshot |
screenshot --annotate |
Screenshot with numbered element labels |
get text <sel> |
Extract text content |
get html <sel> |
Get innerHTML |
get value <sel> |
Get input value |
get attr <sel> <attr> |
Get element attribute |
get title |
Get page title |
get url |
Get current URL |
get count <sel> |
Count matching elements |
get box <sel> |
Get bounding box |
get styles <sel> |
Get computed styles |
is visible <sel> |
Check if element is visible |
is enabled <sel> |
Check if element is enabled |
is checked <sel> |
Check if checkbox is checked |
Clicking & Interaction
| Command |
Description |
click <sel> |
Click an element |
click <sel> --new-tab |
Click, opening in new tab |
dblclick <sel> |
Double-click |
hover <sel> |
Hover over element |
focus <sel> |
Focus element |
scroll <dir> [px] |
Scroll up/down/left/right |
scrollintoview <sel> |
Scroll element into view |
drag <src> <tgt> |
Drag and drop |
Text Input
| Command |
Description |
fill <sel> <text> |
Clear field and type text |
type <sel> <text> |
Type into focused element (appends) |
keyboard type <text> |
Type with real keystrokes |
keyboard inserttext <text> |
Insert text without key events |
press <key> |
Press a key (Enter, Tab, Control+a, etc.) |
keydown <key> |
Hold key down |
keyup <key> |
Release key |
Form Controls
| Command |
Description |
check <sel> |
Check a checkbox |
uncheck <sel> |
Uncheck a checkbox |
select <sel> <val> |
Select dropdown option |
upload <sel> <files> |
Upload file(s) to input |
Semantic Find (locate elements by meaning, not CSS)
find role <role> <action> [value] # By ARIA role
find text <text> <action> # By visible text
find label <label> <action> [value] # By label
find placeholder <ph> <action> [value] # By placeholder
find alt <text> <action> # By alt text
find title <text> <action> # By title attribute
find testid <id> <action> [value] # By data-testid
find first <sel> <action> [value] # First match
find last <sel> <action> [value] # Last match
find nth <n> <sel> <action> [value] # Nth match
Actions: click, fill, type, hover, focus, check, uncheck, text
Options: --name <name> (filter by accessible name), --exact (exact match)
Examples:
npx agent-browser find role button click --name "Submit"
npx agent-browser find text "Sign In" click
npx agent-browser find label "Email" fill "user@example.com"
npx agent-browser find placeholder "Search..." fill "vinyl records"
Waiting
| Command |
Description |
wait <selector> |
Wait for element to appear |
wait <ms> |
Wait N milliseconds |
wait --text "text" |
Wait for text to appear |
wait --url "**/path" |
Wait for URL pattern |
wait --load networkidle |
Wait for network idle |
wait --fn "window.ready" |
Wait for JS condition |
Cookies & Storage
| Command |
Description |
cookies |
Get all cookies |
cookies set <name> <val> |
Set a cookie |
cookies clear |
Clear cookies |
storage local |
Get all localStorage |
storage local <key> |
Get localStorage key |
storage local set <k> <v> |
Set localStorage value |
storage local clear |
Clear localStorage |
storage session |
Same commands for sessionStorage |
Network Interception
| Command |
Description |
network route <url> |
Intercept requests matching URL |
network route <url> --abort |
Block matching requests |
network route <url> --body <json> |
Mock response body |
network unroute [url] |
Remove intercept routes |
network requests |
View tracked requests |
network requests --filter <pat> |
Filter requests by pattern |
JavaScript Execution
| Command |
Description |
eval <js> |
Run JavaScript in page context |
eval -b <base64> |
Run base64-encoded JS |
Browser Configuration
| Command |
Description |
set viewport <w> <h> |
Set viewport size |
set device <name> |
Emulate device (e.g. "iPhone 14") |
set media dark |
Emulate dark color scheme |
set media light |
Emulate light color scheme |
set offline on |
Enable offline mode |
set offline off |
Disable offline mode |
set geo <lat> <lng> |
Set geolocation |
set headers <json> |
Set extra HTTP headers |
set credentials <user> <pass> |
Set HTTP basic auth |
Mouse Control
| Command |
Description |
mouse move <x> <y> |
Move cursor to coordinates |
mouse down [button] |
Press mouse button |
mouse up [button] |
Release mouse button |
mouse wheel <dy> [dx] |
Scroll wheel |
State Persistence
| Command |
Description |
state save <path> |
Save auth/session state |
state load <path> |
Load saved state |
state list |
List saved states |
state clear [name] |
Clear a saved state |
state clear --all |
Clear all saved states |
Debugging
| Command |
Description |
console |
View console messages |
errors |
View uncaught exceptions |
highlight <sel> |
Visually highlight element |
trace start [path] |
Start trace recording |
trace stop [path] |
Stop and save trace |
profiler start |
Start CPU profiling |
profiler stop [path] |
Save profile |
pdf <path> |
Save page as PDF |
Comparison & Diffing
| Command |
Description |
diff snapshot |
Compare current vs previous snapshot |
diff snapshot --baseline <file> |
Compare against saved file |
diff screenshot --baseline <file> |
Visual pixel comparison |
diff url <a> <b> |
Compare two URLs |
Typical Workflow
- Open a page:
open http://localhost:3000
- Inspect structure:
snapshot (returns element refs like @e1, @e2)
- Interact using refs:
click @e3 or fill @e5 "my text"
- Verify results:
get text @e7 or screenshot
- Close when done:
close
Selectors
Commands accept these selector types:
- Element refs from snapshot:
@e1, @e2, etc. (preferred for AI)
- CSS selectors:
#id, .class, div > span
- XPath:
//button[@type="submit"]
- Semantic find: Use the
find command family above
Default Login Flow
When testing locally, log in first before performing authenticated browser actions:
Project-Specific Pages
Key pages to test in the MVP boilerplate:
| Page |
URL |
Notes |
| Landing |
http://localhost:3000 |
Sections: Navbar, Hero, Logos, Items, Stats, Pricing, FAQ, Cta, Footer |
| Sign In / Sign Up |
http://localhost:3000/auth |
Supabase Auth UI |
| Account |
http://localhost:3000/account |
Authenticated - user profile & subscription |
| Pricing |
http://localhost:3000/#pricing |
Stripe checkout integration |
Testing Dark Mode
# Toggle dark mode via the ModeToggle button in the navbar
npx agent-browser find role button click --name "Toggle theme"
# Or force via browser emulation
npx agent-browser set media dark
npx agent-browser set media light
Testing Responsive Design
# Mobile
npx agent-browser set viewport 375 812
# Tablet
npx agent-browser set viewport 768 1024
# Desktop
npx agent-browser set viewport 1440 900