malware-analysis - SKILL.md Agent Skill

name: Malware Analysis
description: Windows PE malware analysis (kill chain, IOC extraction, MITRE ATT&CK mapping)
tags: [malware, windows, pe, stealer, rat, loader]

Task: Malware Analysis. You are analyzing a potentially malicious Windows PE binary. Defang all IOCs in your output: write URL schemes as hxxp:// or hxxps:// and replace dots in domains and IPs with [.].
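
A minimal sketch of the defanging convention. The helper name `defang` is illustrative and not one of the tools this skill refers to. It assumes the IOC is a plain URL, domain, or IPv4 string.

```python
import re


def defang(ioc: str) -> str:
    """Rewrite an IOC so it cannot be clicked or resolved by accident."""
    # Neutralize the scheme: http -> hxxp, https -> hxxps.
    out = re.sub(r"^http", "hxxp", ioc, flags=re.IGNORECASE)
    # Bracket every dot that is not already inside [.], so that
    # running the helper twice does not double-bracket.
    out = re.sub(r"(?<!\[)\.(?!\])", "[.]", out)
    return out


print(defang("https://example.com/api/config"))
```

The lookaround in the second substitution keeps the helper idempotent, which matters when IOCs pass through several reporting steps.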

Phase 0: Obfuscation Triage

Before analysis, check for obfuscation:

Decompile the entry point and a few key functions
If code is heavily obfuscated (opaque predicates, CFF, MBA): invoke /deobfuscation first
If a large binary has very few readable strings → strings are probably encrypted
Look for the string decode stub: a small function called before every string use
If strings are encrypted: trace the decode function, understand the algorithm (XOR, RC4, custom)
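
Once the decode stub turns out to be a plain XOR, it can be reimplemented offline to decrypt every string in bulk. The sketch below assumes a repeating-key XOR, which is only one of the possibilities listed above. The function names and the sample key 0x5A are made up for illustration.

```python
def xor_decode(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the repeating key (the same routine encrypts and decrypts)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def guess_single_byte_keys(data: bytes, crib: bytes) -> list[int]:
    """Return every single-byte key whose plaintext contains the expected crib."""
    return [k for k in range(256) if crib in xor_decode(data, bytes([k]))]


# Illustrative round trip with a made-up key and a string a stealer might hide.
blob = xor_decode(b"Login Data", b"\x5a")
candidates = guess_single_byte_keys(blob, b"Login")
```

The crib search is useful when the key is unknown but a plaintext fragment can be predicted, for example a browser path component.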

Crypto constant hunting: When search_strings turns up RC4 S-box or AES constants, run xrefs_to on the constant's address (not the S-box itself). The referencing function is the KSA/key schedule. Decompile it, then run xrefs_to on that function to find the encrypt/decrypt wrapper. Do NOT use hex pattern search for the KSA byte-swap: it produces too many false positives.
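
For reference when reading decompiler output, here is textbook RC4 in Python. It is a sketch of the standard algorithm, not of any particular sample. Real malware often modifies the key handling or adds an extra XOR, so treat a mismatch as a hint rather than proof that the cipher is something else. The function names are illustrative.

```python
def rc4_ksa(key: bytes) -> list[int]:
    """Key-scheduling algorithm: the 256-iteration swap loop seen in decompiled KSA."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    return s


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Keystream generation plus XOR; encrypting and decrypting are the same call."""
    s = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for b in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(b ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


# Widely published RC4 test vector: key "Key", plaintext "Plaintext".
assert rc4_crypt(b"Key", b"Plaintext").hex() == "bbf316e8d940af0ad3"
```

With the key recovered from the wrapper's caller, `rc4_crypt` can decrypt config blobs dumped from the binary.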

Phase 1: Reconnaissance

Use built-in tools for initial survey:

get_binary_info — format, architecture, size, function count
list_imports — group by DLL, identify capabilities:
- WINHTTP/WININET = Network communication
- CRYPT32/ADVAPI32 (Crypt*) = Cryptographic operations
- GDI32 = Screenshot capture
- ntdll = Low-level ops, possible direct syscalls
list_strings + search_strings — look for URLs, API paths, user agents, mutex names, file extensions, registry paths, app paths (Discord, Telegram, Chrome)
list_exports — exported functions and ordinals

Batch these calls: get_binary_info + list_imports + list_exports can run in parallel.
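
The DLL-to-capability grouping from the list_imports step can be sketched as a lookup table. The input shape (a dict from DLL name to imported function names) is an assumption, since the actual list_imports output format is not specified here. `summarize_capabilities` is a hypothetical helper.

```python
# Capability labels taken from the grouping described in this phase.
DLL_CAPABILITIES = {
    "winhttp.dll": "Network communication",
    "wininet.dll": "Network communication",
    "crypt32.dll": "Cryptographic operations",
    "gdi32.dll": "Screenshot capture",
    "ntdll.dll": "Low-level ops, possible direct syscalls",
}


def summarize_capabilities(imports: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each capability to the DLLs that suggest it."""
    caps: dict[str, list[str]] = {}
    for dll, funcs in imports.items():
        name = dll.lower()
        cap = DLL_CAPABILITIES.get(name)
        # ADVAPI32 only counts as crypto when Crypt* functions are imported.
        if name == "advapi32.dll" and any(f.startswith("Crypt") for f in funcs):
            cap = "Cryptographic operations"
        if cap is not None:
            caps.setdefault(cap, []).append(dll)
    return caps


# Hypothetical import table for illustration only.
sample = {
    "WINHTTP.dll": ["WinHttpOpen", "WinHttpSendRequest"],
    "ADVAPI32.dll": ["CryptAcquireContextA", "RegSetValueExA"],
    "KERNEL32.dll": ["LoadLibraryA", "GetProcAddress"],
}
print(summarize_capabilities(sample))
```

An import table that maps to no capability at all, such as KERNEL32 alone, is itself a finding: it points to runtime API resolution.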

Phase 2: Execution Flow

Decompile entry point / WinMain with decompile_function
Identify the execution sequence and lifecycle: init → config → steal → exfil → exit
Use function_xrefs to map the call graph from entry
Categorize functions: initialization, C2, stealing modules, exfiltration, cleanup

Phase 3: Targeted Analysis — Kill Chain

For network malware:

Trace: socket init → DNS resolution → connect → send/recv
Identify C2 protocol (raw socket, HTTP, DNS tunneling)
Extract C2 config (hardcoded IP/domain or decrypted)

For loader/dropper:

Identify stage 2 location (resource, overlay, encrypted blob)
Extract and decrypt if possible
Check for process injection: VirtualAllocEx → WriteProcessMemory → CreateRemoteThread

For stealers:

Identify targeted applications (browser paths, wallet files, credential stores)
Trace data exfiltration path
Check for browser DB access patterns: SQLite, LevelDB, Login Data, Cookies

For RATs:

Identify command dispatcher (usually a switch/if-chain on received data)
Map each command: file upload/download, screenshot, keylog, shell, persistence

Phase 4: Persistence + Evasion

Check persistence mechanisms: registry keys (RegSetValueEx), service install, scheduled task
Check process injection (VirtualAllocEx, WriteProcessMemory, CreateRemoteThread)
Check anti-analysis: IsDebuggerPresent, timing checks, VM detection, CPUID
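Check direct syscalls: resolving ntdll via GetModuleHandleA, then scanning it for syscall stubs (B8...0F05: mov eax, <syscall number> followed later by syscall)
Most commodity stealers have NO persistence. When none is found, document that explicitly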
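
To show what the B8...0F05 scan looks for, here is a sketch that finds the same byte pattern in a code buffer: opcode B8 (mov eax, imm32) followed within a short distance by 0F 05 (syscall). The 32-byte window and the function name are assumptions. The sample stub is hand-assembled with an arbitrary syscall number.

```python
import struct


def find_syscall_stubs(code: bytes, window: int = 32) -> list[tuple[int, int]]:
    """Return (offset, syscall_number) for each B8 imm32 followed by 0F 05."""
    hits = []
    pos = code.find(b"\xB8")
    while pos != -1 and pos + 5 <= len(code):
        # The four bytes after B8 are the little-endian immediate loaded into eax.
        (number,) = struct.unpack_from("<I", code, pos + 1)
        tail = code[pos + 5 : pos + 5 + window]
        if b"\x0F\x05" in tail:
            hits.append((pos, number))
        pos = code.find(b"\xB8", pos + 1)
    return hits


# mov r10, rcx; mov eax, 0x18; syscall; ret
stub = bytes.fromhex("4c8bd1b8180000000f05c3")
print(find_syscall_stubs(stub))
```

A function in the sample that performs this kind of scan over ntdll is extracting syscall numbers so it can bypass user-mode API hooks.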

Phase 5: Report

Produce a compact triage first (~12 lines), then offer a detailed report:

Binary:   mal.exe | PE x86-64 | 1519 funcs | 92 imports
Packing:  None (entropy <7.0 all sections)
Crypto:   RC4 x 5 constants
IAT:      KERNEL32 only — runtime LoadLibrary/GetProcAddress resolution
Targets:  Chrome, Brave, Edge
C2:       hxxps://example[.]com/api/config
Verdict:  Stealer. RC4 string enc, dynamic API resolution.

Include: classification, capabilities, IOCs (defanged), kill chain, MITRE ATT&CK mapping.
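
The "Packing" line in the triage relies on per-section entropy. A minimal Shannon entropy sketch follows. It assumes the section bytes have already been extracted by whatever tool is in use. The 7.0 threshold is the one shown in the example triage, not a universal cutoff.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def looks_packed(section: bytes, threshold: float = 7.0) -> bool:
    """Flag a section whose entropy meets or exceeds the triage threshold."""
    return shannon_entropy(section) >= threshold
```

High entropy alone does not distinguish packing from an embedded encrypted payload, so pair the number with the section name and size in the report.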

Naming Conventions

Functions: PascalCase verb-noun (InitializeGlobals, StealDiscordTokens, HttpPostRequest)
Globals: camelCase with g_ prefix (g_bEnabled, g_pConfigStart, g_C2ServerUrl)
Structs: PascalCase (BrowserConfig, C2ResponseData)

Common Patterns

Browser theft: CryptUnprotectData, SQLite, paths with "User Data", "Login Data", "Cookies"
Token theft: LevelDB (Discord), tdata (Telegram), .vdf (Steam)
Dead drop resolver: Fetching Steam/Telegram/Pastebin pages to extract real C2
Direct syscalls: Resolving ntdll via GetModuleHandleA + scanning for B8...0F05
XOR + Base64: Common for C2 config and payload delivery
API hashing: CRC32/DJB2/FNV of API names, resolved at runtime via PEB walk
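
For the API hashing pattern, a common approach is to precompute hashes of known export names and look up the constants found in the binary. The sketch below uses standard DJB2, 32-bit FNV-1a (one FNV variant, chosen here as an assumption), and CRC32. Samples frequently change seeds, casing, or include the DLL name, so a miss means the variant needs adjusting. `resolve_hash` and the name list are illustrative.

```python
import zlib
from typing import Callable


def djb2(name: bytes) -> int:
    """Classic DJB2: h = h * 33 + c, truncated to 32 bits."""
    h = 5381
    for c in name:
        h = (h * 33 + c) & 0xFFFFFFFF
    return h


def fnv1a_32(name: bytes) -> int:
    """32-bit FNV-1a: XOR the byte in, then multiply by the FNV prime."""
    h = 0x811C9DC5
    for c in name:
        h = ((h ^ c) * 0x01000193) & 0xFFFFFFFF
    return h


def crc32(name: bytes) -> int:
    """Standard CRC32 as implemented by zlib."""
    return zlib.crc32(name) & 0xFFFFFFFF


def resolve_hash(target: int, names: list[str],
                 hash_fn: Callable[[bytes], int]) -> list[str]:
    """Return every candidate API name whose hash equals the target constant."""
    return [n for n in names if hash_fn(n.encode("ascii")) == target]


# Illustrative candidate list; in practice, use the full export list of the DLL.
apis = ["LoadLibraryA", "GetProcAddress", "VirtualAlloc"]
constant = djb2(b"VirtualAlloc")  # stands in for a constant found in the sample
print(resolve_hash(constant, apis, djb2))
```

Running `resolve_hash` with each hash function against one known constant is a quick way to identify which algorithm the sample uses.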