pentest-task-tree


Iterative PTT (Penetration Testing Tree) session reasoning — build, update, and traverse a live numbered task tree to drive LLM-guided pentest decisions across a full session.

By PurpleAILAB · Updated 6/2/2026

---
name: pentest-task-tree
description: "Iterative PTT (Penetration Testing Tree) session reasoning: build, update, and traverse a live numbered task tree to drive LLM-guided pentest decisions across a full session."
allowed-tools: Bash Read Write
metadata:
  subdomain: orchestration
  when_to_use: "PTT, task tree, what to do next, next task, session reasoning, iterative pentest, live pentest session, update task list, pentest tree, task selection"
  tags: ptt, task-tree, session-reasoning, iterative, decision-making, llm-pentest, pentestgpt
  upstream_ref: "PentestGPT (Gelei Deng et al., USENIX Security 2024): pentest task tree methodology spanning the full ATT&CK Enterprise kill chain"
---

Pentest Task Tree (PTT) — Session Reasoning Playbook

Authorized use only. This methodology is for certified penetration testers operating under a signed scope-of-work, rules of engagement (RoE), and explicit written authorization. Do not apply to systems you do not own or have written permission to test.

What This Skill Does

The PTT methodology converts a live pentest into a session-stateful, LLM-reasoned task graph. Unlike static checklists, the PTT starts minimal, expands only on discovered evidence, and always exposes a single ranked "next node" to execute — preventing scope creep, cognitive overload, and wasted effort on unconfirmed attack surfaces.

This skill is the live-session reasoning complement to orchestration (multi-agent delegation) and kill-chain-analysis (post-recon vector scoring). Use it when you want a single operator driving a session with a maintained task tree rather than delegating to sub-agents.


PTT Format

1. Reconnaissance                              [to-do]
   1.1 Passive information gathering           [completed]
   1.2 Active port scan (nmap -sV -sC -p-)     [to-do]
   1.3 Service fingerprinting                  [to-do]
2. Initial Access                              [to-do]
   2.1 Web application testing (port 80/443)   [to-do]
       2.1.1 Directory enumeration (gobuster)  [to-do]
       2.1.2 CMS/version identification        [to-do]
   2.2 SSH brute-force (port 22)               [not-applicable]
3. Privilege Escalation                        [to-do]

Rules:

  • Layer depth: 1, 1.1, 1.1.1 etc. Each child is a concrete sub-operation of its parent.
  • Status values: to-do, completed, not-applicable. Never leave a node status-less.
  • Do NOT pre-generate nodes for unknown ports/services. Expand only on confirmed evidence.
  • Remove/prune stale or invalidated nodes aggressively to control token budget.
  • The tree is written to disk (<engagement>/ptt.md) after every update.
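The format rules above are regular enough to parse mechanically. The following is a minimal sketch, not part of the original skill: the names `Node`, `NODE_RE`, and `parse_ptt` are hypothetical, and it assumes each node line carries a dotted id, a title, and a bracketed status, as in the example tree.

```python
import re
from dataclasses import dataclass

# Matches lines such as "   2.1.1 Directory enumeration (gobuster)  [to-do]".
# The trailing dot after a top-level id ("1.") is optional.
NODE_RE = re.compile(
    r"^\s*(?P<id>\d+(?:\.\d+)*)\.?\s+(?P<title>.+?)\s+\[(?P<status>[a-z-]+)\]"
)
VALID_STATUSES = {"to-do", "completed", "not-applicable"}


@dataclass
class Node:
    node_id: str  # dotted id, e.g. "2.1.1"
    title: str
    status: str

    @property
    def depth(self) -> int:
        # "1" is depth 1, "1.1" is depth 2, and so on.
        return self.node_id.count(".") + 1


def parse_ptt(text: str) -> list[Node]:
    """Parse PTT text into nodes; lines without an id and a status are skipped."""
    nodes = []
    for line in text.splitlines():
        match = NODE_RE.match(line)
        if match is None:
            continue  # headers, "Findings:" notes, placeholders, blank lines
        if match["status"] not in VALID_STATUSES:
            raise ValueError(f"unknown status in line: {line!r}")
        nodes.append(Node(match["id"], match["title"].strip(), match["status"]))
    return nodes
```

Rejecting unknown statuses at parse time is one way to enforce the "never leave a node status-less" rule before the tree reaches the selection loop.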

Session Lifecycle

Phase 1 — Tree Initialization

Input: Target description (IP, URL, brief scope notes from RoE)

Action:

  1. Generate root nodes only — typically reconnaissance tasks.
  2. Do not generate exploitation nodes until recon confirms a surface.
  3. Write initial PTT to <engagement>/ptt.md.

Starting template:

1. Reconnaissance                              [to-do]
   1.1 Passive information gathering           [to-do]
   1.2 Active port/service scan                [to-do]
2. (Expand after recon confirms attack surface)
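Initialization can be sketched as a single file write. This is an illustrative helper only: `init_ptt` is a hypothetical name, the header lines of the state file are omitted for brevity, and refusing to overwrite an existing tree is an assumption made here to protect an in-progress session.

```python
from pathlib import Path

# The starting template from the playbook: recon nodes only, no exploitation nodes.
STARTING_TEMPLATE = """\
1. Reconnaissance                              [to-do]
   1.1 Passive information gathering           [to-do]
   1.2 Active port/service scan                [to-do]
2. (Expand after recon confirms attack surface)
"""


def init_ptt(engagement_dir: str) -> Path:
    """Write the starting template to <engagement>/ptt.md and return its path."""
    path = Path(engagement_dir) / "ptt.md"
    if path.exists():
        # Assumption: an existing file means a session is already under way.
        raise FileExistsError(f"{path} already exists; resume the session instead")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(STARTING_TEMPLATE, encoding="utf-8")
    return path
```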

Phase 2 — Task Selection Loop

Each iteration of the main loop:

  1. Read current PTT from <engagement>/ptt.md.
  2. Filter all to-do leaf nodes (a leaf is a node with no children; parent nodes are never selected directly).
  3. Score each candidate:
    Priority = (P_success × Impact) / Detection_risk
    
    Prefer: confirmed vulns > misconfigs > spray/brute force > speculative nodes.
  4. Select the single highest-priority leaf node.
  5. Emit the next-task block in the canonical three-field format:
    -----
    Task: <what to do — one sentence>
    Command: <exact command or GUI steps>
    Expected outcome: <what success looks like>
    
  6. Execute (or hand to operator), then proceed to Phase 3.
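The filter and scoring steps of the loop can be sketched as follows. The playbook does not define scales for the three scoring terms, so the ranges in the comments are assumptions, as are the names `Candidate`, `todo_leaves`, `priority`, and `select_next`. A leaf is treated here as a node with no children.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    node_id: str
    p_success: float       # assumed scale: 0.0 to 1.0
    impact: float          # assumed scale: 1 (low) to 10 (critical)
    detection_risk: float  # assumed scale: 1 (quiet) to 10 (loud)


def todo_leaves(statuses: dict[str, str]) -> list[str]:
    """Return ids of to-do nodes that have no children in the tree."""
    leaves = []
    for node_id, status in statuses.items():
        # "2.1" is a parent if any other id starts with "2.1."
        has_children = any(other.startswith(node_id + ".") for other in statuses)
        if status == "to-do" and not has_children:
            leaves.append(node_id)
    return leaves


def priority(c: Candidate) -> float:
    """Priority = (P_success x Impact) / Detection_risk."""
    if c.detection_risk <= 0:
        raise ValueError("detection_risk must be positive")
    return (c.p_success * c.impact) / c.detection_risk


def select_next(candidates: list[Candidate]) -> Candidate:
    """Return the single highest-priority candidate."""
    if not candidates:
        raise ValueError("no to-do leaf nodes left")
    return max(candidates, key=priority)
```

Keeping the detection-risk scale strictly positive avoids a division by zero; a near-silent action gets the minimum value rather than zero.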

Phase 3 — Result Ingestion and Tree Update

Input processing (parse before reasoning):

Raw tool output is noisy. Before updating the PTT, distill the output:

| Input type | Distillation rule |
| --- | --- |
| nmap output | Keep: open ports, service/version, script results. Drop: closed/filtered noise. |
| Web page / Burp response | Keep: forms, parameters, comments, error messages, auth state. Drop: boilerplate HTML. |
| gobuster / ffuf | Keep: non-404 paths, redirect targets, interesting status codes (200/301/403/500). |
| nikto output | Keep: CVE references, misconfig findings. Drop: informational noise. |
| Exploit output | Keep: shell prompt, privilege level, hostname, error messages. |
| Arbitrary operator note | Rephrase to one concise sentence preserving all field:value pairs. |

Tree update rules (apply in order):

  1. Mark the executed node completed if successful, not-applicable if confirmed inapplicable.
  2. Add child nodes only for newly confirmed services/paths/vulns — do not speculate.
  3. Delete nodes invalidated by findings (e.g., port closed → remove service sub-tree).
  4. If a new attack surface opens, add a new top-level or mid-level node.
  5. Write updated PTT back to <engagement>/ptt.md.
  6. Return to Phase 2.
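The first three update rules map onto three small tree operations. The sketch below assumes the tree is held in memory as a dict keyed by dotted node id; `Tree`, `mark`, `add_child`, and `delete_subtree` are hypothetical names, not part of the original skill.

```python
VALID_STATUSES = {"to-do", "completed", "not-applicable"}

# node id -> {"title": ..., "status": ...}
Tree = dict[str, dict[str, str]]


def mark(tree: Tree, node_id: str, status: str) -> None:
    """Rule 1: set the status of the executed node."""
    if status not in VALID_STATUSES:
        raise ValueError(f"invalid status: {status}")
    tree[node_id]["status"] = status


def add_child(tree: Tree, parent_id: str, title: str) -> str:
    """Rule 2: add a to-do child under parent_id and return the new id."""
    if parent_id not in tree:
        raise KeyError(parent_id)
    prefix = parent_id + "."
    child_dots = parent_id.count(".") + 1
    # Collect the last id component of every direct child of the parent.
    used = [
        int(node_id.rsplit(".", 1)[1])
        for node_id in tree
        if node_id.startswith(prefix) and node_id.count(".") == child_dots
    ]
    new_id = f"{prefix}{max(used, default=0) + 1}"
    tree[new_id] = {"title": title, "status": "to-do"}
    return new_id


def delete_subtree(tree: Tree, node_id: str) -> list[str]:
    """Rule 3: remove an invalidated node and everything beneath it."""
    doomed = [n for n in tree if n == node_id or n.startswith(node_id + ".")]
    for n in doomed:
        del tree[n]
    return doomed
```

Matching on the `node_id + "."` prefix rather than the bare id keeps `2.1` from swallowing an unrelated node such as `2.10`.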

Decision Logic: Which Node Next?

Is there a confirmed vulnerability / credential / shell from the last step?
├── YES → Immediately prioritize exploitation or post-exploitation node
│         (Do not queue more recon when you have a live lead)
└── NO  → Continue down the recon/enumeration branch

Is the current branch exhausted (all leaves completed or not-applicable)?
├── YES → Expand to adjacent attack surface OR escalate to next kill-chain phase
└── NO  → Stay in current branch, pick highest-scored to-do leaf

Are all nodes completed or not-applicable?
├── YES → Session complete — generate PTT summary and hand off to reporting
└── NO  → Continue loop

Hard rule: A confirmed vuln/shell/cred overrides any pending enumeration node. Never run "one more scan" when exploitation is available.
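The three questions and the hard rule can be folded into one function. This is a sketch under an assumption: the original lists the questions without an explicit order, so checking the live lead first is a reading of the hard rule, not something the source states. `next_action` and its return labels are hypothetical.

```python
DONE = {"completed", "not-applicable"}


def next_action(
    has_live_lead: bool,
    branch_statuses: list[str],
    all_statuses: list[str],
) -> str:
    """Pick the next move from the last result and the tree state.

    has_live_lead: a vuln, credential, or shell was confirmed in the last step.
    branch_statuses: statuses of the leaves in the current branch.
    all_statuses: statuses of every node in the tree.
    """
    if has_live_lead:
        # Hard rule: a confirmed lead overrides pending enumeration.
        return "exploit-or-post-exploit"
    if all(s in DONE for s in all_statuses):
        return "session-complete"
    if all(s in DONE for s in branch_statuses):
        return "expand-or-escalate"
    return "continue-branch"
```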


Tool-Output Distillation Examples

nmap raw → distilled

# Raw (noisy)
22/tcp  open  ssh     OpenSSH 8.9p1 Ubuntu
80/tcp  open  http    Apache httpd 2.4.52
443/tcp open  ssl/http Apache httpd 2.4.52
8080/tcp filtered http
...
(500 lines of script output)

# Distilled (PTT-ready)
Port 22: OpenSSH 8.9p1 Ubuntu (open)
Port 80: Apache 2.4.52 HTTP (open)
Port 443: Apache 2.4.52 HTTPS (open)
Port 8080: filtered

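Expand PTT: add web application nodes for ports 80 and 443 under Initial Access, and keep the SSH (port 22) node as to-do but score it low. Port 8080 is filtered rather than confirmed open, so it gets no node.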
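A distiller for the port table in this example might look like the sketch below. It only handles the plain `PORT STATE SERVICE VERSION` lines shown above, not nmap's XML or grepable formats, and `distill_nmap` is a hypothetical helper. Following the example, filtered ports are kept as a one-word note while closed ports and script output are dropped.

```python
import re

# Matches port-table lines such as "22/tcp  open  ssh     OpenSSH 8.9p1 Ubuntu".
PORT_LINE = re.compile(
    r"^(?P<port>\d+)/(?P<proto>tcp|udp)\s+(?P<state>\S+)\s+(?P<service>\S+)\s*(?P<version>.*)$"
)


def distill_nmap(raw: str) -> list[str]:
    """Reduce raw nmap port-table text to one short line per relevant port."""
    distilled = []
    for line in raw.splitlines():
        match = PORT_LINE.match(line.strip())
        if match is None:
            continue  # script output, headers, blank lines
        if match["state"] == "open":
            # Fall back to the service name when no version was detected.
            detail = match["version"].strip() or match["service"]
            distilled.append(f"Port {match['port']}: {detail} (open)")
        elif match["state"] == "filtered":
            distilled.append(f"Port {match['port']}: filtered")
        # Closed ports are dropped as noise.
    return distilled
```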

gobuster raw → distilled

# Raw
/index.php      (Status: 200) [Size: 4821]
/admin          (Status: 301) [Size: 312] [--> /admin/]
/config.php     (Status: 403) [Size: 277]
/backup.zip     (Status: 200) [Size: 1048576]

Distilled: /admin/ redirect (interesting), /config.php 403 (exists, access-controlled), /backup.zip 200 (high-value download). Expand PTT: add nodes for /admin/ auth bypass test, /backup.zip download and analysis.
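The same idea applies to directory-enumeration output. This sketch assumes the line layout shown in the raw example above (path, `Status`, optional `Size`, optional redirect target); `distill_gobuster` is a hypothetical helper, and deciding which findings are "interesting" is left to the operator or the LLM.

```python
import re

# Matches lines such as "/admin  (Status: 301) [Size: 312] [--> /admin/]".
LINE_RE = re.compile(
    r"^(?P<path>/\S*)\s+\(Status:\s*(?P<status>\d{3})\)"
    r"(?:\s+\[Size:\s*(?P<size>\d+)\])?"
    r"(?:\s+\[-->\s*(?P<redirect>[^\]]+)\])?"
)


def distill_gobuster(raw: str) -> list[dict]:
    """Keep non-404 paths with their status, size, and redirect target."""
    findings = []
    for line in raw.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None or match["status"] == "404":
            continue  # banner lines, progress output, and 404 noise
        findings.append(
            {
                "path": match["path"],
                "status": int(match["status"]),
                "size": int(match["size"]) if match["size"] else None,
                "redirect": match["redirect"],  # None when there is no redirect
            }
        )
    return findings
```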


PTT State File: <engagement>/ptt.md

# PTT — <engagement name>
Updated: <timestamp>

1. Reconnaissance                              [completed]
   1.1 Passive information gathering           [completed]
   1.2 Active port scan                        [completed]
       Findings: ports 22, 80, 443 open
   1.3 Service fingerprinting                  [completed]
       Findings: Apache 2.4.52, OpenSSH 8.9p1
2. Initial Access                              [to-do]
   2.1 Web application (port 80/443)           [to-do]
       2.1.1 Directory enumeration             [completed]
             Findings: /admin/ (301), /backup.zip (200)
       2.1.2 /backup.zip download + analysis   [to-do]  ← NEXT
       2.1.3 /admin/ authentication testing    [to-do]
   2.2 SSH (port 22)                           [to-do]
3. Privilege Escalation                        [to-do]  ← expand after foothold

Next task block (emitted to operator):

-----
Task: Download /backup.zip and inspect its contents for credentials, source code, or configuration files.
Command: wget http://<target>/backup.zip -O backup.zip && unzip -l backup.zip
Expected outcome: A file listing that reveals source code, database configs, or hardcoded credentials usable for further access.
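Emitting the block is simple string assembly. The sketch below uses a hypothetical `format_next_task` helper; the one-line-per-field check is an assumption made to keep the block easy to parse, not a rule stated in the playbook.

```python
def format_next_task(task: str, command: str, expected_outcome: str) -> str:
    """Render the next-task block: a delimiter line followed by three fields."""
    fields = {
        "Task": task,
        "Command": command,
        "Expected outcome": expected_outcome,
    }
    lines = ["-----"]
    for label, value in fields.items():
        value = value.strip()
        if not value or "\n" in value:
            # Assumption: each field must fit on a single non-empty line.
            raise ValueError(f"{label} must be a single non-empty line")
        lines.append(f"{label}: {value}")
    return "\n".join(lines)
```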

Integration with Decepticon Skills

| Situation | Companion skill |
| --- | --- |
| Need to choose between multiple confirmed attack vectors | kill-chain-analysis |
| Foothold established, planning post-exploit | post-exploit/workflow |
| AD services confirmed on network | ad/kerberoasting, ad/bloodhound-query |
| WAF/EDR blocking technique | defense-evasion |
| Session complete, write report | decepticon/final-report |
| Multi-agent delegation preferred over single-session | decepticon/orchestration |

MITRE ATT&CK Mapping

| PTT Phase | ATT&CK Tactic | Key Techniques |
| --- | --- | --- |
| Reconnaissance | TA0043 | T1595 (Active Scanning), T1592 (Host Info), T1589 (Identity Info) |
| Initial Access | TA0001 | T1190 (Exploit Public-Facing App), T1133 (External Remote Services), T1566 (Phishing) |
| Execution | TA0002 | T1059 (Command/Script Interpreter), T1203 (Exploit for Client Exec) |
| Persistence | TA0003 | T1505 (Server Software Component), T1078 (Valid Accounts) |
| Priv Esc | TA0004 | T1068 (Exploit for Priv Esc), T1548 (Abuse Elevation Control) |
| Defense Evasion | TA0005 | T1055 (Process Injection), T1070 (Indicator Removal) |
| Credential Access | TA0006 | T1003 (OS Credential Dumping), T1552 (Unsecured Credentials) |
| Discovery | TA0007 | T1082 (System Info), T1083 (File/Dir Discovery), T1046 (Net Service Discovery) |
| Lateral Movement | TA0008 | T1021 (Remote Services), T1550 (Use Alt Auth Material) |

Detection Notes

Defenders should monitor for:

  • Sequential port scans from a single source (TA0043 / T1595)
  • Rapid HTTP 404/200/301 enumeration patterns (T1595.003)
  • Unusual downloads of large archive files from web roots
  • Login attempts against admin panels following enumeration
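As one illustration of the enumeration signal, a defender could count distinct paths requested per source within a short sliding window. This is a sketch only: `flag_enumeration` is a hypothetical helper, the event tuples are assumed to be pre-parsed from web access logs and sorted by time, and the default thresholds are placeholders to be tuned against real traffic.

```python
from collections import defaultdict, deque


def flag_enumeration(
    events: list[tuple[float, str, str]],
    window_seconds: float = 10.0,
    max_distinct_paths: int = 50,
) -> set[str]:
    """Flag source IPs that request many distinct paths within a short window.

    events: (timestamp_seconds, source_ip, path) tuples sorted by timestamp.
    """
    recent: dict[str, deque] = defaultdict(deque)
    flagged: set[str] = set()
    for timestamp, source_ip, path in events:
        window = recent[source_ip]
        window.append((timestamp, path))
        # Drop requests that have aged out of the sliding window.
        while window and timestamp - window[0][0] > window_seconds:
            window.popleft()
        if len({p for _, p in window}) > max_distinct_paths:
            flagged.add(source_ip)
    return flagged
```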

Reference

Based on the Pentesting Task Tree (PTT) introduced in:

Deng et al., "PentestGPT: Evaluating and Harnessing Large Language Models for Automated Penetration Testing", USENIX Security 2024. https://www.usenix.org/conference/usenixsecurity24/presentation/deng

Install via CLI
npx skills add https://github.com/PurpleAILAB/Decepticon --skill pentest-task-tree
Repository Details
Stars: 4,393
Forks: 875
Branch: main
Path: SKILL.md