name: computer-dispatch description: "Control computers with mouse/keyboard — your VM desktop OR the user's personal Mac/PC via remote relay" metadata: triggers: keywords: [dispatch, desktop, screen, click, screenshot, gui, app, window, open, visit, show, dexscreener, website, url, browse, my computer, my screen, remote] phrases: ["take a screenshot", "open an app", "click on", "what is on screen", "open this website", "show me this site", "go to", "visit", "pull up", "on my computer", "on my desktop", "on my screen", "control my computer"]
Computer Dispatch Skill
You can control TWO computers: your own VM desktop AND the user's personal computer (when their relay is connected).
CRITICAL RULES (read first)
1. Use dispatch-remote-exec.sh for ALL shell commands on the user's computer.
This executes commands DIRECTLY on the user's Mac/PC — no Terminal window needed, no GUI, no screenshots required. The command runs through the relay and returns stdout/stderr.
bash ~/scripts/dispatch-remote-exec.sh "mkdir -p ~/Desktop/Screenshots && mv ~/Desktop/Screenshot*.png ~/Desktop/Screenshot*.jpg ~/Desktop/Screenshots/ 2>/dev/null; echo Done"
That's ONE command. It runs on the USER'S machine, not your VM. Output comes back as JSON with stdout, stderr, and exitCode.
File Operations on User's Computer (copy this pattern exactly)
Example: "organize my screenshots into a folder"
Step 1 — See what's on the desktop:
bash ~/scripts/dispatch-remote-exec.sh "ls ~/Desktop/"
Step 2 — Run the command:
bash ~/scripts/dispatch-remote-exec.sh "mkdir -p ~/Desktop/Screenshots && find ~/Desktop -maxdepth 1 -name 'Screenshot*' -type f -exec mv {} ~/Desktop/Screenshots/ \; && ls ~/Desktop/Screenshots/ | wc -l"
Note: macOS screenshot filenames have spaces ("Screenshot 2026-03-27 at 1.16 PM.png"). Use find -exec mv instead of mv Screenshot* to handle spaces correctly.
Step 3 — Verify and report:
bash ~/scripts/dispatch-remote-screenshot.sh
~/scripts/deliver_file.sh ~/.openclaw/workspace/dispatch-remote-screenshot.jpg "Done — here's your desktop now"
That's 3 steps. Under 15 seconds. No Terminal window, no clicking, no GUI.
Common commands via dispatch-remote-exec.sh:
- Create folder:
dispatch-remote-exec.sh "mkdir -p ~/Desktop/NewFolder" - Move files (with spaces in names):
dispatch-remote-exec.sh "find ~/Desktop -maxdepth 1 -name '*.png' -type f -exec mv {} ~/Desktop/Screenshots/ \;" - List files:
dispatch-remote-exec.sh "ls -la ~/Desktop/" - Delete files:
dispatch-remote-exec.sh "rm ~/Desktop/old-file.txt"(ask user first!) - Rename:
dispatch-remote-exec.sh "mv ~/Desktop/old.txt ~/Desktop/new.txt" - Find files:
dispatch-remote-exec.sh "find ~/Desktop -name '*.png' -type f" - Open app:
dispatch-remote-exec.sh "open -a 'Google Chrome'"(macOS) - Get system info:
dispatch-remote-exec.sh "sw_vers; uname -a"
NEVER type commands into the user's Terminal via dispatch-remote-type.sh for file operations. The relay's Terminal window captures focus and your commands end up in the wrong window. Always use dispatch-remote-exec.sh instead.
When to use GUI (screenshot/click/type) vs exec:
- Use exec: File operations, running commands, installing software, opening apps, any shell task
- Use GUI (screenshot + click): Interacting with app UIs (clicking buttons, filling forms, navigating websites)
After EVERY exec command, verify the result before telling the user it's done. Either check the command output (exitCode 0 + expected stdout) or take a screenshot. NEVER claim success without proof.
If the user attached a screenshot of their desktop, DO NOT take another dispatch-remote-screenshot.sh. Use the image they sent — it shows the same thing. Taking redundant screenshots wastes context tokens.
If the user needs to reconnect the relay: Run bash ~/scripts/dispatch-connection-info.sh to get the exact npx command with the real token and IP. Give this to the user — never use placeholder values like YOUR_TOKEN_HERE.
2. Save task state every 5 actions. During multi-step dispatch tasks, write your progress to ~/.openclaw/workspace/ACTIVE_TASK.md every 5 actions so you can resume after context resets. Format:
## Active Task
Request: [what the user asked]
Status: IN_PROGRESS
Completed: [what's done]
Next: [exact next step]
Updated: [timestamp]
3. Batch over single actions. Use dispatch-remote-batch.sh to combine multiple actions into one round-trip. See Batch Command section below.
4. Context budget limit. Remote dispatch tasks must complete in 10 messages or fewer. If you're past 10 messages without completing the task, STOP immediately and:
- Tell the user what went wrong
- Give them the exact shell command to run manually
- Do NOT try another approach — you're burning context
Max 10 screenshots per task. If you've taken more than 5 screenshots and the task isn't done, you're using GUI when you should be using shell commands. Switch to Terminal immediately.
5. If the first approach fails, go to shell commands. Do NOT try 3 different GUI approaches. If clicking doesn't work on the first try, open Terminal and type a shell command instead. No "let me try another approach" — go straight to the shell fallback.
Two Modes
Mode 1: Local Dispatch (Your VM Desktop)
Your VM has a virtual desktop (Xvfb at DISPLAY=:99, 1280x720, Openbox WM). Use this for:
- Opening websites with stealth Chrome (
dispatch-browser.sh) - Running GUI applications autonomously
- Tasks that don't need the user's computer
Scripts: dispatch-screenshot.sh, dispatch-click.sh, dispatch-type.sh, dispatch-press.sh, dispatch-scroll.sh, dispatch-browser.sh
Mode 2: Remote Dispatch (User's Personal Computer)
When the user runs instaclaw-dispatch on their Mac/PC, you can control their actual computer. Use this for:
- User asks "do this on MY computer"
- Tasks that require the user's installed apps (Figma, Excel, Slack, etc.)
- Interacting with the user's logged-in sessions
Scripts: dispatch-remote-screenshot.sh, dispatch-remote-click.sh, dispatch-remote-type.sh, dispatch-remote-press.sh, dispatch-remote-scroll.sh
Which Mode to Use
| User says... | Mode | Why |
|---|---|---|
| "open dexscreener" / "show me this site" | Local (dispatch-browser.sh) | You browse on your VM |
| "do this on my computer" / "on my screen" | Remote (dispatch-remote-*) | User's machine |
| "open Figma and edit the logo" | Remote | Figma is on user's Mac |
| "take a screenshot of your desktop" | Local (dispatch-screenshot.sh) | Your VM screen |
| "take a screenshot of my screen" | Remote (dispatch-remote-screenshot.sh) | User's screen |
| "click on this button" (in VM browser) | Local (dispatch-click.sh) | Your VM |
| Regular web browsing/scraping | Local browser tool or dispatch-browser.sh | No need for user's machine |
Default: Use Local dispatch unless the user explicitly asks you to act on THEIR computer.
Checking Remote Relay Status
Before using remote dispatch, check if the user's relay is connected:
bash ~/scripts/dispatch-remote-status.sh
Returns {"connected":true} or {"connected":false}. If not connected, tell the user:
"To let me control your computer, run npx @instaclaw/dispatch in your terminal."
Local Dispatch Commands (Your VM)
Open a Website (Stealth Chrome)
bash ~/scripts/dispatch-browser.sh "https://example.com"
sleep 5
bash ~/scripts/dispatch-screenshot.sh
~/scripts/deliver_file.sh ~/.openclaw/workspace/dispatch-screenshot.jpg "Screenshot"
Has anti-Cloudflare stealth. Use for ANY website visit.
Screenshot Your Desktop
bash ~/scripts/dispatch-screenshot.sh
Returns JSON with path, coordMap, image_base64. Send to user via deliver_file.sh.
Click / Type / Press / Scroll
bash ~/scripts/dispatch-click.sh <x> <y>
bash ~/scripts/dispatch-type.sh "text"
bash ~/scripts/dispatch-press.sh "Return"
bash ~/scripts/dispatch-scroll.sh down 3
Launch GUI Apps
DISPLAY=:99 xterm &
Remote Dispatch Commands (User's Computer)
Screenshot User's Screen
bash ~/scripts/dispatch-remote-screenshot.sh
Captures the user's actual screen. Returns JSON with path (saved to workspace) and coordMap. Send to user via deliver_file.sh.
Click on User's Screen
bash ~/scripts/dispatch-remote-click.sh <x> <y>
Type on User's Keyboard
bash ~/scripts/dispatch-remote-type.sh "text"
Press Key on User's Machine
bash ~/scripts/dispatch-remote-press.sh "Return"
Scroll on User's Machine
bash ~/scripts/dispatch-remote-scroll.sh down 3
Drag on User's Screen
bash ~/scripts/dispatch-remote-drag.sh <fromX> <fromY> <toX> <toY>
List Windows on User's Machine
bash ~/scripts/dispatch-remote-windows.sh
The Screenshot → Reason → Act Loop
Use batch commands to execute multiple actions per reasoning cycle. This is 2-3x faster than single actions.
Fast Loop (preferred — use batch):
- Screenshot — see what's on screen
- Plan multiple actions — identify the next 2-5 steps you can take without needing to re-check the screen
- Batch execute — run all planned actions in one call (includes auto-screenshot after)
- Analyze result — check the post-batch screenshot
- Repeat until done
Batch Command (Remote):
bash ~/scripts/dispatch-remote-batch.sh '{"actions":[{"type":"click","params":{"x":400,"y":300},"waitAfterMs":100},{"type":"type","params":{"text":"hello world"},"waitAfterMs":0},{"type":"press","params":{"key":"Return"},"waitAfterMs":1500}]}'
Returns JSON with both action results AND a screenshot (auto-captured after the batch). The screenshot is saved to ~/.openclaw/workspace/dispatch-remote-screenshot.jpg.
Batch Options:
screenshotAfter: true (default) — auto-screenshot after batchscreenshotFormat: "webp" (default) — smaller than JPEGscreenshotQuality: 55 (default) — good enough for GUI analysissettleMs: 300 (default) — wait for screen to settle before screenshotwaitAfterMsper action: milliseconds to wait after each action (default 50ms)
Wait Time Guide (for waitAfterMs):
| Action | waitAfterMs | Why |
|---|---|---|
| Click on UI element | 100 | OS redraws instantly |
| Type text | 0 | Characters appear immediately |
| Press Enter on form/search | 1500-3000 | Page navigation or API call |
| Click link / navigate | 2000-3000 | Page load |
| Scroll | 200 | Smooth scroll animation |
| Click dropdown/menu | 300 | Animation |
When to Batch vs Single Action:
- Batch: Click + type + Enter (search flow), fill multiple form fields, navigate menus
- Single: When you're unsure what's on screen, first action on a new page, after an error
Fallback: Single Actions
If you need precise control or are unsure of the screen state, use individual commands:
Max 50 actions per task. Max 20 actions per batch.
Verification Decision Tree — When to Screenshot
Not every action needs a verification screenshot. Use this decision tree:
ALWAYS screenshot after:
- Page navigation (clicked a link, submitted a form, pressed Enter in address bar)
- First action on a new screen or app
- Switching windows or tabs
- After a batch that includes navigation
- After any action that produced an error
- When you're unsure what's on screen
SKIP verification screenshot when:
- You just typed text into a field you already confirmed exists
- You pressed a single key (Tab, Escape) in a known context
- You scrolled in a page you've already screenshotted
- You're mid-batch — the batch auto-screenshots at the end
- You clicked a button and the next step is to type in the resulting dialog (batch these together)
Rule of Thumb:
If you can predict what the screen looks like after the action, skip the screenshot. A search flow (click search bar → type query → press Enter) needs ONE screenshot at the end, not three.
Cost Awareness:
Each screenshot costs 1,049 vision tokens ($0.003). A 20-step task with screenshots after every action: ~$0.12. With smart verification: ~$0.04-0.06. Prefer batching to cut costs by 50-70%.
User Takeover Detection
Before executing any dispatch command, check if the user has taken control:
[ -f ~/.openclaw/workspace/.user-takeover ] && echo "USER_IN_CONTROL" || echo "OK"
If .user-takeover exists, STOP all dispatch actions immediately. The user is controlling the desktop via live view. Wait and check again in 10 seconds. When the file is removed, resume your work.
Never fight the user for control. If the takeover file exists, do not click, type, press, scroll, or take screenshots.
Rate Limits
- Max 10 commands per second — the dispatch server enforces this. Batch commands count as 1 command.
- Max 60 screenshots per minute — each screenshot costs ~$0.003 in vision tokens.
- Max 500 commands per relay session — after 500 commands, the relay disconnects. Tell the user to reconnect if more work is needed.
- Max 20 actions per batch — individual batch actions are not rate-limited internally.
- 30-minute idle timeout — if no commands for 30 minutes, the relay auto-disconnects.
If a dispatch command returns an error containing "rate limit": Tell the user: "I'm being rate limited on dispatch commands. I'll wait 30 seconds and try again." Then wait 30 seconds before retrying.
Before EVERY remote dispatch command: Check relay status first:
bash ~/scripts/dispatch-remote-status.sh
If connected: false, tell the user: "Your dispatch relay isn't connected. Run npx @instaclaw/dispatch in your terminal to connect."
Sending Screenshots to Users (BOTH modes)
After taking a screenshot (local OR remote), ALWAYS send it to the user:
# Local screenshot:
bash ~/scripts/dispatch-screenshot.sh
~/scripts/deliver_file.sh ~/.openclaw/workspace/dispatch-screenshot.jpg "Desktop screenshot"
# Remote screenshot:
bash ~/scripts/dispatch-remote-screenshot.sh
~/scripts/deliver_file.sh ~/.openclaw/workspace/dispatch-remote-screenshot.jpg "Your Mac screenshot"
Token Cost Budget
Each dispatch screenshot costs 1,049 vision tokens ($0.003 at Sonnet pricing). A 20-step task costs ~$0.06-0.30. Be efficient:
- Don't take unnecessary screenshots — only when you need to see the screen
- Use the browser tool for data extraction (cheaper than vision-based dispatch)
- If a task needs >30 screenshots, warn the user about the cost
Safety Rules
- Never click blindly — screenshot first
- Never type passwords — ask the user to type credentials themselves
- Never delete files without user confirmation
- Never interact with banking/financial apps unless user explicitly requested
- Remote mode: the user sees every action in their terminal (supervised mode). Be descriptive about what you're doing.
- If something looks wrong, stop and describe what you see
- NEVER restart, kill, or modify dispatch-server — this is infrastructure managed by the system, not by you. Restarting it destroys the user's relay connection. If dispatch commands fail, tell the user the error. Do NOT try to fix the server, check ports, debug sockets, or restart processes.
Error Handling
| Error | Fix |
|---|---|
| "dispatch relay not connected" | User needs to run npx @instaclaw/dispatch |
| Screenshot fails (local) | Check Xvfb: ps aux | grep Xvfb |
| Screenshot fails (remote) | User may need to grant Screen Recording permission |
| Click doesn't work | Verify coordinates from latest screenshot |
| dispatch-browser.sh won't launch | Check RAM: free -m (needs 500MB+ available) |