pentest-task-tree

name: pentest-task-tree
description: "Iterative PTT (Pentesting Task Tree) session reasoning: build, update, and traverse a live numbered task tree to drive LLM-guided pentest decisions across a full session."
allowed-tools: Bash Read Write
metadata:
  subdomain: orchestration
  when_to_use: "PTT, task tree, what to do next, next task, session reasoning, iterative pentest, live pentest session, update task list, pentest tree, task selection"
  tags: ptt, task-tree, session-reasoning, iterative, decision-making, llm-pentest, pentestgpt
  upstream_ref: "PentestGPT (Gelei Deng et al., USENIX Security 2024): pentest task tree methodology spanning the full ATT&CK Enterprise kill chain"

Pentest Task Tree (PTT) — Session Reasoning Playbook

Authorized use only. This methodology is for certified penetration testers operating under a signed scope-of-work, rules of engagement (RoE), and explicit written authorization. Do not apply to systems you do not own or have written permission to test.

What This Skill Does

The PTT methodology converts a live pentest into a session-stateful, LLM-reasoned task graph. Unlike static checklists, the PTT starts minimal, expands only on discovered evidence, and always exposes a single ranked "next node" to execute — preventing scope creep, cognitive overload, and wasted effort on unconfirmed attack surfaces.

This skill is the live-session reasoning complement to orchestration (multi-agent delegation) and kill-chain-analysis (post-recon vector scoring). Use it when you want a single operator driving a session with a maintained task tree rather than delegating to sub-agents.

PTT Format

1. Reconnaissance                              [to-do]
   1.1 Passive information gathering           [completed]
   1.2 Active port scan (nmap -sV -sC -p-)     [to-do]
   1.3 Service fingerprinting                  [to-do]
2. Initial Access                              [to-do]
   2.1 Web application testing (port 80/443)   [to-do]
       2.1.1 Directory enumeration (gobuster)  [to-do]
       2.1.2 CMS/version identification        [to-do]
   2.2 SSH brute-force (port 22)               [not-applicable]
3. Privilege Escalation                        [to-do]

Rules:

Layer depth: 1, 1.1, 1.1.1 etc. Each child is a concrete sub-operation of its parent.
Status values: to-do, completed, not-applicable. Never leave a node status-less.
Do NOT pre-generate nodes for unknown ports/services. Expand only on confirmed evidence.
Remove/prune stale or invalidated nodes aggressively to control token budget.
The tree is written to disk (<engagement>/ptt.md) after every update.
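
The format rules above can be checked mechanically. The following is a minimal sketch, assuming every node line has the shape `<dotted id> <title> [<status>]` as in the example tree. The names `Node`, `NODE_RE`, and `parse_ptt` are illustrative and are not taken from PentestGPT.

```python
import re
from dataclasses import dataclass

# Assumed line shape: optional indent, dotted id, title, then a bracketed status.
NODE_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)*)\.?\s+(.*?)\s*\[(to-do|completed|not-applicable)\]"
)


@dataclass
class Node:
    node_id: str  # dotted layer id, e.g. "2.1.1"
    title: str
    status: str   # one of: to-do, completed, not-applicable

    @property
    def depth(self) -> int:
        # "1" is depth 1, "1.1" is depth 2, and so on.
        return self.node_id.count(".") + 1


def parse_ptt(text: str) -> list[Node]:
    """Return every status-bearing node line; all other lines are skipped."""
    nodes = []
    for line in text.splitlines():
        match = NODE_RE.match(line)
        if match:
            nodes.append(Node(*match.groups()))
    return nodes


sample = """\
1. Reconnaissance                              [to-do]
   1.1 Passive information gathering           [completed]
2. Initial Access                              [to-do]
   2.1 Web application testing (port 80/443)   [to-do]
       2.1.1 Directory enumeration (gobuster)  [to-do]
"""

for node in parse_ptt(sample):
    print(node.depth, node.node_id, node.status, node.title)
```

Lines without a bracketed status, such as findings or placeholder notes, are ignored by this sketch rather than rejected. A stricter validator could instead flag them to enforce the "never leave a node status-less" rule.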

Session Lifecycle

Phase 1 — Tree Initialization

Input: Target description (IP, URL, brief scope notes from RoE)

Action:

Generate root nodes only — typically reconnaissance tasks.
Do not generate exploitation nodes until recon confirms a surface.
Write initial PTT to <engagement>/ptt.md.

Starting template:

1. Reconnaissance                              [to-do]
   1.1 Passive information gathering           [to-do]
   1.2 Active port/service scan                [to-do]
2. (Expand after recon confirms attack surface)
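
As a rough illustration of the initialization step, the sketch below writes the starting template to `<engagement>/ptt.md`. The function name `init_ptt` and the engagement directory name `acme-external` are hypothetical, and a temporary directory stands in for the real engagement folder.

```python
import tempfile
from pathlib import Path

# Starting template copied from the section above.
TEMPLATE = """\
1. Reconnaissance                              [to-do]
   1.1 Passive information gathering           [to-do]
   1.2 Active port/service scan                [to-do]
2. (Expand after recon confirms attack surface)
"""


def init_ptt(engagement_dir: Path) -> Path:
    """Create the engagement directory if needed and write the initial tree."""
    engagement_dir.mkdir(parents=True, exist_ok=True)
    path = engagement_dir / "ptt.md"
    path.write_text(TEMPLATE, encoding="utf-8")
    return path


# Demo in a throwaway directory; "acme-external" is a made-up engagement name.
with tempfile.TemporaryDirectory() as tmp:
    ptt_path = init_ptt(Path(tmp) / "acme-external")
    content = ptt_path.read_text(encoding="utf-8")

print(content)
```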

Phase 2 — Task Selection Loop

Each iteration of the main loop:

Read current PTT from <engagement>/ptt.md.
Filter all to-do leaf nodes (leaf = no children, or all children also to-do).
Score each candidate:
```
Priority = (P_success × Impact) / Detection_risk
```
Prefer: confirmed vulns > misconfigs > spray/brute force > speculative nodes.
Select the single highest-priority leaf node.
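
The scoring step can be sketched as follows. The source does not define numeric scales, so this example assumes `p_success` in the range 0 to 1 and `impact` and `detection_risk` on a 1 to 10 scale. The `Candidate` class and the sample values are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    node_id: str
    title: str
    p_success: float       # assumed scale: 0.0 to 1.0
    impact: float          # assumed scale: 1 to 10
    detection_risk: float  # assumed scale: 1 to 10, must be positive


def priority(c: Candidate) -> float:
    """Priority = (P_success * Impact) / Detection_risk."""
    if c.detection_risk <= 0:
        raise ValueError("detection_risk must be positive")
    return (c.p_success * c.impact) / c.detection_risk


def select_next(candidates: list[Candidate]) -> Candidate:
    """Pick the single highest-priority to-do leaf."""
    return max(candidates, key=priority)


# Illustrative estimates only; real values depend on the engagement.
leaves = [
    Candidate("2.1.2", "/backup.zip download + analysis", 0.9, 8, 2),
    Candidate("2.1.3", "/admin/ authentication testing", 0.4, 7, 4),
    Candidate("2.2", "SSH (port 22)", 0.1, 9, 8),
]

best = select_next(leaves)
print(best.node_id, round(priority(best), 2))
```

With these assumed numbers the exposed download scores 3.6, the admin panel 0.7, and SSH roughly 0.11, so node 2.1.2 is selected. The preference order (confirmed vulns over misconfigs over brute force over speculative nodes) could serve as a tie-breaker when scores are close.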

Emit the next-task block in the canonical three-sentence format:

-----
Task: <what to do — one sentence>
Command: <exact command or GUI steps>
Expected outcome: <what success looks like>

Execute (or hand to operator), then proceed to Phase 3.
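
A small formatter keeps the next-task block consistent between iterations. This is a sketch under the assumption that the block is always the `-----` separator followed by the three labeled fields shown above; `format_next_task` is a hypothetical helper name.

```python
def format_next_task(task: str, command: str, expected: str) -> str:
    """Render the next-task block: separator line, then the three fields."""
    fields = {"Task": task, "Command": command, "Expected outcome": expected}
    for name, value in fields.items():
        if not value.strip():
            raise ValueError(f"{name} must not be empty")
    lines = ["-----"]
    lines += [f"{name}: {value.strip()}" for name, value in fields.items()]
    return "\n".join(lines)


print(
    format_next_task(
        "Run a full TCP port scan with service detection against the target.",
        "nmap -sV -sC -p- <target>",
        "A list of open ports with service names and versions.",
    )
)
```

Rejecting empty fields is a design choice of this sketch: a block with no command or no expected outcome cannot be verified in Phase 3.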

Phase 3 — Result Ingestion and Tree Update

Input processing (parse before reasoning):

Raw tool output is noisy. Before updating the PTT, distill the output:

| Input type | Distillation rule |
| --- | --- |
| `nmap` output | Keep: open ports, service/version, script results. Drop: closed/filtered noise. |
| Web page / Burp response | Keep: forms, parameters, comments, error messages, auth state. Drop: boilerplate HTML. |
| `gobuster` / `ffuf` | Keep: non-404 paths, redirect targets, interesting status codes (200/301/403/500). |
| `nikto` output | Keep: CVE references, misconfig findings. Drop: informational noise. |
| Exploit output | Keep: shell prompt, privilege level, hostname, error messages. |
| Arbitrary operator note | Rephrase to one concise sentence preserving all field:value pairs. |

Tree update rules (apply in order):

Mark the executed node completed if successful, not-applicable if confirmed inapplicable.
Add child nodes only for newly confirmed services/paths/vulns — do not speculate.
Delete nodes invalidated by findings (e.g., port closed → remove service sub-tree).
If a new attack surface opens, add a new top-level or mid-level node.
Write updated PTT back to <engagement>/ptt.md.
Return to Phase 2.
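
The first three update rules map onto three small operations on the tree. The sketch below assumes the tree is held in memory as a dictionary keyed by dotted node id; the `Tree` alias and the function names `mark`, `add_child`, and `prune` are illustrative.

```python
Tree = dict[str, dict[str, str]]  # node id -> {"title": ..., "status": ...}
VALID_STATUSES = {"to-do", "completed", "not-applicable"}


def mark(tree: Tree, node_id: str, status: str) -> None:
    """Rule 1: record the outcome of the executed node."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    tree[node_id]["status"] = status


def add_child(tree: Tree, parent_id: str, title: str) -> str:
    """Rule 2: append a to-do child under parent_id and return its new id."""
    if parent_id not in tree:
        raise KeyError(parent_id)
    prefix = parent_id + "."
    # Collect the numeric suffixes of direct children only.
    used = [
        int(key[len(prefix):])
        for key in tree
        if key.startswith(prefix) and "." not in key[len(prefix):]
    ]
    child_id = f"{prefix}{max(used, default=0) + 1}"
    tree[child_id] = {"title": title, "status": "to-do"}
    return child_id


def prune(tree: Tree, node_id: str) -> None:
    """Rule 3: delete a node together with its whole sub-tree."""
    doomed = [k for k in tree if k == node_id or k.startswith(node_id + ".")]
    for key in doomed:
        del tree[key]


tree: Tree = {
    "2": {"title": "Initial Access", "status": "to-do"},
    "2.1": {"title": "Web application (port 80/443)", "status": "to-do"},
    "2.1.1": {"title": "Directory enumeration", "status": "to-do"},
    "2.2": {"title": "SSH (port 22)", "status": "to-do"},
}

mark(tree, "2.1.1", "completed")
new_id = add_child(tree, "2.1", "/backup.zip download + analysis")
prune(tree, "2.2")  # hypothetical: a rescan showed port 22 closed
print(new_id, sorted(tree))
```

Persisting the result to `<engagement>/ptt.md` after each call is left out here to keep the example short.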

Decision Logic: Which Node Next?

Is there a confirmed vulnerability / credential / shell from the last step?
├── YES → Immediately prioritize exploitation or post-exploitation node
│         (Do not queue more recon when you have a live lead)
└── NO  → Continue down the recon/enumeration branch

Is the current branch exhausted (all leaves completed or not-applicable)?
├── YES → Expand to adjacent attack surface OR escalate to next kill-chain phase
└── NO  → Stay in current branch, pick highest-scored to-do leaf

Are all nodes completed or not-applicable?
├── YES → Session complete — generate PTT summary and hand off to reporting
└── NO  → Continue loop

Hard rule: A confirmed vuln/shell/cred overrides any pending enumeration node. Never run "one more scan" when exploitation is available.
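
The three questions and the hard rule can be collapsed into one function. This is only a sketch: the inputs are reduced to a flag and two counters, and the returned labels (`exploit`, `complete`, `expand`, `continue-branch`) are hypothetical names rather than terms from the source.

```python
def next_action(has_live_lead: bool, branch_todo: int, total_todo: int) -> str:
    """Decide the next move from the current tree state.

    has_live_lead: a confirmed vuln, credential, or shell came out of the last step
    branch_todo:   number of to-do leaves left in the current branch
    total_todo:    number of to-do leaves left in the whole tree
    """
    if has_live_lead:
        # Hard rule: a live lead overrides any pending enumeration.
        return "exploit"
    if total_todo == 0:
        return "complete"
    if branch_todo == 0:
        return "expand"
    return "continue-branch"


print(next_action(True, 3, 7))
print(next_action(False, 0, 4))
print(next_action(False, 2, 4))
print(next_action(False, 0, 0))
```

Checking the live lead first is what encodes the hard rule: the function never returns an enumeration action while exploitation is available.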

Tool-Output Distillation Examples

nmap raw → distilled

# Raw (noisy)
22/tcp  open  ssh     OpenSSH 8.9p1 Ubuntu
80/tcp  open  http    Apache httpd 2.4.52
443/tcp open  ssl/http Apache httpd 2.4.52
8080/tcp filtered http
...
(500 lines of script output)

# Distilled (PTT-ready)
Port 22: OpenSSH 8.9p1 Ubuntu (open)
Port 80: Apache 2.4.52 HTTP (open)
Port 443: Apache 2.4.52 HTTPS (open)
Port 8080: filtered

Expand PTT: add 2.1 Web (80) and 2.2 Web (443) as to-do nodes; keep the SSH node as to-do with a low priority score.
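
A rough sketch of the nmap distillation step follows. It handles only the port table of the normal output format, follows the example above in keeping filtered ports as one-line entries, and drops everything else, including script output that a fuller version would keep. `distill_nmap` and `PORT_RE` are illustrative names, and the wording of each distilled line is an approximation of the hand-written version above.

```python
import re

# Matches port-table rows such as "22/tcp  open  ssh  OpenSSH 8.9p1 Ubuntu".
PORT_RE = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)\s*(.*)$")


def distill_nmap(raw: str) -> list[str]:
    """Reduce nmap port-table rows to one short line per relevant port."""
    distilled = []
    for line in raw.splitlines():
        match = PORT_RE.match(line.strip())
        if not match:
            continue  # headers, script output, and ellipses are dropped here
        port, _proto, state, service, version = match.groups()
        if state == "closed":
            continue
        if state == "open":
            detail = " ".join(part for part in (service, version) if part)
            distilled.append(f"Port {port}: {detail} (open)")
        else:
            distilled.append(f"Port {port}: {state}")
    return distilled


raw_scan = """\
PORT     STATE    SERVICE  VERSION
22/tcp   open     ssh      OpenSSH 8.9p1 Ubuntu
80/tcp   open     http     Apache httpd 2.4.52
443/tcp  open     ssl/http Apache httpd 2.4.52
8080/tcp filtered http
| http-title: Example
"""

for entry in distill_nmap(raw_scan):
    print(entry)
```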

gobuster raw → distilled

# Raw
/index.php      (Status: 200) [Size: 4821]
/admin          (Status: 301) [Size: 312] [--> /admin/]
/config.php     (Status: 403) [Size: 277]
/backup.zip     (Status: 200) [Size: 1048576]

Distilled: /admin/ redirect (interesting), /config.php 403 (exists, access-controlled), /backup.zip 200 (high-value download). Expand PTT: add nodes for /admin/ auth bypass test, /backup.zip download and analysis.
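
The same distillation can be sketched for gobuster-style lines. The regular expression below assumes the `path (Status: NNN) [Size: N] [--> target]` layout shown in the raw sample; other tools or versions may print a different layout. `distill_gobuster` and `LINE_RE` are illustrative names.

```python
import re

# Path, status code, and an optional redirect target in "[--> ...]".
LINE_RE = re.compile(r"^(/\S*)\s+\(Status:\s*(\d{3})\)(?:.*\[-->\s*(\S+)\])?")


def distill_gobuster(raw: str) -> list[dict]:
    """Keep non-404 paths with their status code and redirect target."""
    findings = []
    for line in raw.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        path, status, redirect = match.group(1), int(match.group(2)), match.group(3)
        if status == 404:
            continue
        findings.append({"path": path, "status": status, "redirect": redirect})
    return findings


raw_dirs = """\
/index.php      (Status: 200) [Size: 4821]
/admin          (Status: 301) [Size: 312] [--> /admin/]
/config.php     (Status: 403) [Size: 277]
/backup.zip     (Status: 200) [Size: 1048576]
"""

for finding in distill_gobuster(raw_dirs):
    print(finding)
```

Deciding which findings are "interesting" enough to become PTT nodes remains a judgment call for the operator or the LLM; this sketch only normalizes the raw lines.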

PTT State File: `<engagement>/ptt.md`

# PTT — <engagement name>
Updated: <timestamp>

1. Reconnaissance                              [completed]
   1.1 Passive information gathering           [completed]
   1.2 Active port scan                        [completed]
       Findings: ports 22, 80, 443 open
   1.3 Service fingerprinting                  [completed]
       Findings: Apache 2.4.52, OpenSSH 8.9p1
2. Initial Access                              [to-do]
   2.1 Web application (port 80/443)           [to-do]
       2.1.1 Directory enumeration             [completed]
             Findings: /admin/ (301), /backup.zip (200)
       2.1.2 /backup.zip download + analysis   [to-do]  ← NEXT
       2.1.3 /admin/ authentication testing    [to-do]
   2.2 SSH (port 22)                           [to-do]
3. Privilege Escalation                        [to-do]  ← expand after foothold

Next task block (emitted to operator):

-----
Task: Download /backup.zip and inspect its contents for credentials, source code, or configuration files.
Command: wget http://<target>/backup.zip -O backup.zip && unzip -l backup.zip
Expected outcome: A file listing that reveals source code, database configs, or hardcoded credentials usable for further access.

Integration with Decepticon Skills

| Situation | Companion skill |
| --- | --- |
| Need to choose between multiple confirmed attack vectors | `kill-chain-analysis` |
| Foothold established, planning post-exploit | `post-exploit/workflow` |
| AD services confirmed on network | `ad/kerberoasting`, `ad/bloodhound-query` |
| WAF/EDR blocking technique | `defense-evasion` |
| Session complete, write report | `decepticon/final-report` |
| Multi-agent delegation preferred over single-session | `decepticon/orchestration` |

MITRE ATT&CK Mapping

| PTT Phase | ATT&CK Tactic | Key Techniques |
| --- | --- | --- |
| Reconnaissance | TA0043 | T1595 (Active Scan), T1592 (Host Info), T1589 (Identity) |
| Initial Access | TA0001 | T1190 (Exploit Public-Facing App), T1133 (External Remote), T1566 (Phishing) |
| Execution | TA0002 | T1059 (Command/Script Interpreter), T1203 (Exploit for Client Exec) |
| Persistence | TA0003 | T1505 (Server Software Component), T1078 (Valid Accounts) |
| Priv Esc | TA0004 | T1068 (Exploit for Priv Esc), T1548 (Abuse Elevation Control) |
| Defense Evasion | TA0005 | T1055 (Process Injection), T1070 (Indicator Removal) |
| Credential Access | TA0006 | T1003 (OS Credential Dumping), T1552 (Unsecured Credentials) |
| Discovery | TA0007 | T1082 (System Info), T1083 (File/Dir Discovery), T1046 (Net Service Scan) |
| Lateral Movement | TA0008 | T1021 (Remote Services), T1550 (Use Alt Auth Material) |

Detection Notes

Defenders should monitor for:

Sequential port scans from a single source (TA0043 / T1595)
Rapid HTTP 404/200/301 enumeration patterns (T1595.003)
Unusual downloads of large archive files from web roots
Login attempts against admin panels following enumeration

Reference

Based on the PTT (Pentesting Task Tree) framework introduced in:

Deng et al., "PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing", USENIX Security 2024. https://www.usenix.org/conference/usenixsecurity24/presentation/deng