linux-malware-analysis - SKILL.md Agent Skill

name: Linux Malware Analysis
description: Expert ELF malware analysis: packing, toolchain ID, kill chain, persistence, C2, rootkits, cryptominers, Go/Rust/Mirai patterns, MITRE ATT&CK mapping
tags: [malware, linux, elf, iot, botnet, rootkit, miner, go, rust]

Task: Linux Malware Analysis. You are a senior malware analyst examining a potentially malicious ELF binary. Defang ALL IOCs in output (hxxps://, [.] notation). Work methodically through each phase. Do not skip phases.

Phase 0: Binary Triage

Run these first — the answers change every subsequent decision.

get_binary_info: ELF class (32/64), architecture, endianness, linkage type, entry point, section count, function count
list_segments: check PT_INTERP (a non-standard interpreter path suggests a rogue loader or backdoor dropper), segment permissions (W+X = suspicious), entropy per segment

Determine compiler toolchain:

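search_strings for: runtime., go.itab → Go binary
search_strings for: core::panicking, panicked at, .L__unnamed, _ZN...17h<hash>E (legacy Rust mangling; plain _ZN also appears in C++) or _R (v0 mangling) → Rust binary
Import of pthread_create + libc.so → standard C/C++
search_strings for UPX magic and compare function count against binary size → packed if the magic is present or the count is implausibly low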
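
The string checks above can be sketched as a small classifier. This is a minimal illustration, assuming the strings have already been extracted into a list; the function name and labels are hypothetical, and `panicked at` is treated as a Rust marker because it is the prefix of Rust panic messages.

```python
from typing import Iterable, List

# Marker substrings; a hit is a hint, not proof.
GO_MARKERS = ("runtime.", "go.itab")
RUST_MARKERS = ("core::panicking", "panicked at", ".L__unnamed")


def guess_toolchain(strings: Iterable[str]) -> List[str]:
    """Return candidate toolchains whose markers appear in the string list."""
    pool = list(strings)
    hits = []
    if any(marker in s for s in pool for marker in GO_MARKERS):
        hits.append("go")
    if any(marker in s for s in pool for marker in RUST_MARKERS):
        hits.append("rust")
    # With no language runtime markers, fall back to the C/C++ assumption.
    return hits or ["c/c++ (default guess)"]
```

A binary can return both labels (for example a Go program embedding a Rust library), so treat the result as a starting hypothesis to confirm with imports and section names.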

Packing triage:

| Signal | Meaning |
|---|---|
| `"UPX!"` string present | UPX (may have corrupted header) |
| <5 functions in a binary >50KB | Packer stub only |
| Entry point at high address / near end of segment | Self-extracting loader |
| W+X segment | Runtime code decryption / shellcode loading |
| No `.symtab`, no section headers | Stripped (normal) but also packing indicator |
| `search_strings` returns very few results (<10) in large binary | String encryption |

UPX header corruption (common in IoT/Mirai variants): The UPX magic 0x55505821 is replaced with random bytes to prevent automated unpacking, but the binary still executes. Note in report, recommend dynamic analysis for full unpacking.
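
When the magic has been overwritten, the remaining triage signals still apply. The sketch below combines three of them; the function names and the 7.2 bits-per-byte entropy threshold are assumptions for illustration, not values taken from any tool.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte (0.0 for constant data, 8.0 for uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def packing_hints(data: bytes, function_count: int) -> List[str]:
    """Collect packing indicators that survive a corrupted UPX header."""
    hints = []
    if b"UPX!" in data:
        hints.append("UPX magic present")
    if function_count < 5 and len(data) > 50 * 1024:
        hints.append("packer stub only")
    if shannon_entropy(data) > 7.2:  # assumed threshold
        hints.append("high entropy")
    return hints
```

A sample that reports "packer stub only" and "high entropy" without the magic is a reasonable candidate for the corrupted-header case described above.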

Architecture-specific expectations:

x86_64: Desktop/server malware, containers, cloud; most tools work natively
aarch64 / arm: IoT, embedded, mobile-adjacent; check for ARM debug register checks
mipsel / mipseb: Router/DVR malware (Mirai family); MIPS delay slots complicate disassembly
ppc / ppc64: Rare; older NAS and router firmware (old Synology, QNAP)

Phase 1: ELF Structure Analysis

This phase reveals backdoors in the loader mechanism that pure string/import analysis misses.

PT_INTERP (Dynamic Linker)

Check the interpreter path via list_segments. Standard values:

x86_64: /lib64/ld-linux-x86-64.so.2
aarch64: /lib/ld-linux-aarch64.so.1
mipsel: /lib/ld-uClibc.so.0 (or musl on newer IoT)

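Non-standard interpreter = the binary loads through a rogue or custom dynamic linker. Treat it as a high-confidence malware indicator once legitimate custom toolchains (for example a Nix store path or a bundled musl loader) are ruled out.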
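
If the tool output is unavailable, the interpreter path can be read straight from the program headers. This is a minimal sketch using only the ELF header layout; `read_interp` and `is_standard_interp` are hypothetical helper names, and the allowlist contains only the three paths listed above.

```python
import struct
from typing import Optional

PT_INTERP = 3

# Interpreter paths listed in this section; extend for other targets.
STANDARD_INTERPRETERS = {
    "/lib64/ld-linux-x86-64.so.2",
    "/lib/ld-linux-aarch64.so.1",
    "/lib/ld-uClibc.so.0",
}


def read_interp(data: bytes) -> Optional[str]:
    """Return the PT_INTERP path of an ELF image, or None if there is none."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                    # EI_CLASS
    end = "<" if data[5] == 1 else ">"     # EI_DATA
    if is64:
        (e_phoff,) = struct.unpack_from(end + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(end + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 0x2A)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is64:   # Elf64_Phdr: p_offset at +8, p_filesz at +32
            (p_offset,) = struct.unpack_from(end + "Q", data, base + 8)
            (p_filesz,) = struct.unpack_from(end + "Q", data, base + 32)
        else:      # Elf32_Phdr: p_offset at +4, p_filesz at +16
            (p_offset,) = struct.unpack_from(end + "I", data, base + 4)
            (p_filesz,) = struct.unpack_from(end + "I", data, base + 16)
        raw = data[p_offset:p_offset + p_filesz]
        return raw.split(b"\x00", 1)[0].decode("ascii", "replace")
    return None


def is_standard_interp(path: Optional[str]) -> bool:
    """None means no PT_INTERP (statically linked), which is not flagged here."""
    return path is None or path in STANDARD_INTERPRETERS
```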

.init_array / .fini_array (Constructor/Destructor Chains)

Use get_sections or search_binary for .init_array. Decompile each function pointer in the array — malware hides initialization logic here (anti-debug setup, self-decryption, first-stage drops) that runs before main().

Pattern: Array at fixed VA with multiple function pointers → decompile each one individually.
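
Once the raw bytes of `.init_array` are dumped, the function pointers can be listed before decompiling each one. A minimal sketch, assuming the section bytes are already in hand; `read_pointer_array` is a hypothetical helper. In position-independent executables the on-disk slots may be filled in by relocations at load time, so an empty result does not prove the array is empty.

```python
import struct
from typing import List


def read_pointer_array(raw: bytes, is64: bool = True, little_endian: bool = True) -> List[int]:
    """Decode a constructor/destructor array into a list of function addresses."""
    fmt = ("<" if little_endian else ">") + ("Q" if is64 else "I")
    size = struct.calcsize(fmt)
    usable = len(raw) - len(raw) % size          # ignore a trailing partial slot
    pointers = [p[0] for p in struct.iter_unpack(fmt, raw[:usable])]
    all_ones = (1 << (size * 8)) - 1             # -1 sentinel used by legacy .ctors
    return [p for p in pointers if p not in (0, all_ones)]
```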

PLT/GOT Analysis

In stripped binaries, PLT entries are often the only way to identify library calls. Each PLT stub is a jmp [GOT_entry] — the target GOT slot tells you which library function.

When IDA fails to resolve GOT entries automatically:

Find .plt section start
Each stub is 16 bytes (x86_64): FF 25 XX XX XX XX = jmp [rip+offset]
Calculate GOT address = stub_addr + 6 + offset
Mark the GOT slot and add a comment with the function name
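
The address arithmetic in the steps above can be checked with a few lines. This sketch assumes a classic x86_64 stub that begins with `FF 25`; stubs that start with another instruction (for example in CET-enabled builds) would need the offset adjusted, and the function name is hypothetical.

```python
import struct


def got_slot_from_plt_stub(stub_addr: int, stub: bytes) -> int:
    """Resolve the GOT slot referenced by a `jmp [rip+disp32]` PLT stub."""
    if stub[:2] != b"\xff\x25":
        raise ValueError("stub does not start with FF 25")
    # disp32 is signed and relative to the end of the 6-byte instruction.
    (disp,) = struct.unpack_from("<i", stub, 2)
    return stub_addr + 6 + disp
```

For a stub at 0x401020 with bytes `FF 25 E2 2F 00 00`, the displacement is 0x2FE2 and the GOT slot is 0x401020 + 6 + 0x2FE2 = 0x404008.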

IFUNC Resolvers (GNU_IFUNC)

Symbol type STT_GNU_IFUNC — resolver function called at load time returns the actual implementation. XZ backdoor (CVE-2024-3094) abused this mechanism. If you see IFUNC symbols for unexpected functions, decompile the resolver immediately.
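
IFUNC symbols can be listed from a dumped symbol table by checking the low nibble of `st_info`. A minimal sketch, assuming little-endian `Elf64_Sym` entries and the matching string table have been extracted; the helper name is hypothetical.

```python
import struct
from typing import List, Tuple

STT_GNU_IFUNC = 10
# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size
SYM64 = struct.Struct("<IBBHQQ")


def ifunc_symbols(symtab: bytes, strtab: bytes) -> List[Tuple[str, int]]:
    """Return (name, resolver address) for every STT_GNU_IFUNC symbol."""
    found = []
    for off in range(0, len(symtab) - SYM64.size + 1, SYM64.size):
        st_name, st_info, _other, _shndx, st_value, _size = SYM64.unpack_from(symtab, off)
        if st_info & 0xF != STT_GNU_IFUNC:   # low nibble is the symbol type
            continue
        end = strtab.find(b"\x00", st_name)
        if end < 0:
            end = len(strtab)
        found.append((strtab[st_name:end].decode("ascii", "replace"), st_value))
    return found
```

Each returned address is a resolver to decompile; the function it returns at load time is the real implementation.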

Hidden Init Vectors

.ctors / .dtors (legacy): same purpose as init/fini_array, older GCC
preinit_array: Runs before init_array, even before _start in some configurations
DT_INIT entry in dynamic section: Single init function pointer, separate from array

Phase 2: Reconnaissance

Batch these calls — they can run in parallel:

list_imports — group by category:

| Import cluster | Capability |
|---|---|
| `socket`, `connect`, `bind`, `listen`, `accept`, `send`, `recv` | Network C2 |
| `execve`, `execvp`, `fork`, `clone`, `posix_spawn` | Process execution |
| `ptrace` | Anti-debug (`PTRACE_TRACEME`) or process injection (`PTRACE_ATTACH`, `PTRACE_POKETEXT`); genuine debugger use is rare in malware |
| `init_module`, `finit_module` | Kernel module loading = rootkit |
| `memfd_create` | Fileless execution |
| `mmap` + `mprotect` (both present) | Shellcode loading / runtime decryption |
| `dlopen`, `dlsym` | Dynamic library loading / hooking |
| `inotify_*` | File system monitoring |
| `prctl` | Process name spoofing (`PR_SET_NAME`) or capabilities |
| `capset`, `setuid`, `setreuid`, `setgid` | Privilege manipulation |
| `bpf` | eBPF rootkit or monitoring |
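
The grouping can be automated once the import list is available. This is a minimal sketch covering part of the table; the dictionary and function names are hypothetical, and the `mmap` + `mprotect` rule only fires when both are imported.

```python
from typing import Dict, Iterable, List

IMPORT_CLUSTERS = {
    "network C2": {"socket", "connect", "bind", "listen", "accept", "send", "recv"},
    "process execution": {"execve", "execvp", "fork", "clone", "posix_spawn"},
    "kernel module loading": {"init_module", "finit_module"},
    "fileless execution": {"memfd_create"},
    "dynamic loading / hooking": {"dlopen", "dlsym"},
    "privilege manipulation": {"capset", "setuid", "setreuid", "setgid"},
    "eBPF": {"bpf"},
}


def classify_imports(imports: Iterable[str]) -> Dict[str, List[str]]:
    """Map capability labels to the imports that triggered them."""
    seen = set(imports)
    result = {}
    for capability, names in IMPORT_CLUSTERS.items():
        hits = sorted(names & seen)
        if hits:
            result[capability] = hits
    # Pair rule: either import alone is unremarkable.
    if {"mmap", "mprotect"} <= seen:
        result["shellcode loading / runtime decryption"] = ["mmap", "mprotect"]
    return result
```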

list_exports — exported symbols reveal shared library role or hooking targets

search_strings — look for Linux-specific IOC clusters:

Persistence paths:

/etc/crontab  /var/spool/cron  /etc/cron.d
/etc/systemd/system  /etc/init.d  /etc/rc.local
/etc/profile.d  /etc/ld.so.preload
~/.bashrc  ~/.profile  ~/.ssh/authorized_keys
/etc/modules  /etc/udev/rules.d
/proc/sys/kernel/modules_disabled

Evasion and anti-analysis:

/proc/self/status    (TracerPid check)
/proc/self/exe       (self-path / unlink-self)
/proc/self/maps      (memory layout inspection)
/proc/1/cgroup       (container detection)
/sys/class/dmi/id    (VM detection: VirtualBox, VMware, QEMU, KVM)
/sys/hypervisor      (Xen detection)
LD_PRELOAD           (self-injection or hook evasion)

Credential targets:

/etc/passwd   /etc/shadow   /etc/sudoers
~/.ssh/id_rsa   ~/.ssh/id_ed25519   ~/.ssh/known_hosts
~/.aws/credentials   ~/.gcp   ~/.kube/config
.mozilla/firefox  google-chrome  Login Data  Cookies

Fileless execution signals:

memfd:    /proc/self/mem    /dev/shm    /tmp/

Rootkit signals:

kallsyms   /proc/kallsyms   sys_call_table
ftrace   kprobe   uprobe   /dev/kmem
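
The clusters above lend themselves to a simple substring sweep over the extracted strings. A minimal sketch with a subset of the listed paths; the names are hypothetical and matching is case-sensitive on purpose so that `/etc/ld.so.preload` and `LD_PRELOAD` stay distinct.

```python
from typing import Dict, Iterable, List

IOC_CLUSTERS = {
    "persistence": ("/etc/crontab", "/var/spool/cron", "/etc/cron.d", "/etc/systemd/system",
                    "/etc/init.d", "/etc/rc.local", "/etc/profile.d", "/etc/ld.so.preload",
                    ".ssh/authorized_keys", "/etc/udev/rules.d"),
    "evasion": ("/proc/self/status", "/proc/self/maps", "/proc/1/cgroup",
                "/sys/class/dmi/id", "/sys/hypervisor", "LD_PRELOAD"),
    "credentials": ("/etc/shadow", "/etc/sudoers", ".ssh/id_rsa", ".ssh/id_ed25519",
                    ".aws/credentials", ".kube/config"),
    "fileless": ("memfd:", "/proc/self/mem", "/dev/shm"),
    "rootkit": ("kallsyms", "sys_call_table", "/dev/kmem"),
}


def match_ioc_clusters(strings: Iterable[str]) -> Dict[str, List[str]]:
    """Return, per cluster, the sorted set of indicators found in any string."""
    hits: Dict[str, set] = {}
    for text in strings:
        for cluster, needles in IOC_CLUSTERS.items():
            for needle in needles:
                if needle in text:
                    hits.setdefault(cluster, set()).add(needle)
    return {cluster: sorted(found) for cluster, found in hits.items()}
```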

Phase 3: Malware Classification

Based on Phase 2 findings, determine the primary classification before deep-diving:

Mirai / IoT Botnet

Fingerprints: MIPS/ARM architecture, UPX-packed, table.c encrypted string pattern, process killer, scanner loop.

Key analysis steps:

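Find the string decryption function. Original Mirai uses the table key 0xDEADBEEF but applies it one byte at a time, so every byte is effectively XORed with 0x22; variants use other XOR keys, XXTEA, RC4, or TEA. Look for a loop with XOR and a fixed constant in .rodata.
Decrypt all table entries. Each is a null-terminated string XORed with the key, so with the original key the encrypted terminator appears as 0x22.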
Identify the three core modules by function pattern:
- Scanner: TCP SYN probe loop + credential dictionary lookup
- Killer: Process enumeration + SIGKILL loop (kills other malware / debuggers)
- Attacker: Switch on attack type (UDP flood=0, TCP flood, ACK flood, HTTP flood, DNS flood)
Extract C2 IP/domain from decrypted strings; look for connect() call chain from attacker init
Check for fork() + daemon() → background persistence without file
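
The table decryption step for the original key can be sketched as follows. In the leaked Mirai source the 32-bit table key is applied one byte at a time to each character, which collapses to a single-byte XOR; the Python function names here are illustrative, and variants with other ciphers need their own routine.

```python
TABLE_KEY = 0xDEADBEEF  # table key of the original Mirai source


def effective_xor_byte(key: int = TABLE_KEY) -> int:
    """Fold the four key bytes into the single byte actually applied."""
    folded = 0
    for shift in (0, 8, 16, 24):
        folded ^= (key >> shift) & 0xFF
    return folded


def toggle_obf(blob: bytes, key: int = TABLE_KEY) -> bytes:
    """XOR every byte with the folded key; the same call encrypts and decrypts."""
    k = effective_xor_byte(key)
    return bytes(c ^ k for c in blob)
```

Because the operation is symmetric, running it over a candidate table region and looking for readable strings is a quick way to confirm or reject the original key before reversing a variant's cipher.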

2024+ variants also add: SSH brute force (NoaBot), crypto mining (stratum beacon), RC4 string table, anti-VM checks.

Cryptominer

Fingerprints: stratum+tcp:// or stratum2+tcp:// strings, pool domain strings, nice() + sched_setaffinity() calls, XMRig-specific strings (RandomX, xmrig, donate.v2.xmrig.com).

Key analysis steps:

Search strings for stratum protocol: search_strings "stratum" + search_strings "pool"
Find the CPU affinity call chain: sched_setaffinity → check which cores are claimed
Check for nice(-20) — stealing maximum CPU priority
Embedded XMRig: May be base64-encoded or gzip'd in .rodata; look for large blob + decode routine
GPU miner: Look for cudaLaunchKernel / clEnqueueNDRangeKernel or libcuda.so / libOpenCL.so dlopen
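
Pool endpoints can be pulled out of the string list with a regular expression. A minimal sketch; the pattern accepts `tcp` and `ssl` transports as an assumption, and the hostname in the usage example is a placeholder.

```python
import re
from typing import Iterable, List

# scheme, then host, then optional port
STRATUM_RE = re.compile(r"stratum2?\+(?:tcp|ssl)://[A-Za-z0-9.\-]+(?::\d{1,5})?")


def find_pool_urls(strings: Iterable[str]) -> List[str]:
    """Return unique stratum URLs in first-seen order."""
    found: List[str] = []
    for text in strings:
        for url in STRATUM_RE.findall(text):
            if url not in found:
                found.append(url)
    return found
```

For example, `find_pool_urls(["-o stratum+tcp://pool.example.org:4444 -u wallet"])` returns `["stratum+tcp://pool.example.org:4444"]`. Remember to defang the result before it goes into the report.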

Linux Rootkit / LKM

Fingerprints: init_module / finit_module imports, kallsyms strings, sys_call_table, /proc/kallsyms, vermagic string.

Key analysis steps:

Find module_init function — entry point for kernel module, analogous to main()
Look for syscall hooking pattern: kallsyms_lookup_name("sys_call_table") → save original → replace pointer
If kernel ≥5.7 (kallsyms_lookup_name is no longer exported to modules): look for kprobe-based lookup or ftrace hooks
Identify which syscalls and kernel functions are hooked: getdents64 (process/file hiding), open/read (file content manipulation), tcp4_seq_show (the /proc/net/tcp seq_file handler, not a syscall; socket hiding)
eBPF rootkit: bpf() syscall + BPF_PROG_LOAD + uprobe/kprobe attach = kernel-level hook without .ko file

PUMAKIT pattern (2024): Two-stage — ELF dropper + LKM; dropper hides in legitimate-looking binary using .init_array to load the module before main() runs.

Fileless / Memory-Only Malware

Fingerprints: memfd_create import, /proc/self/exe + unlink() combo, /proc/self/mem write pattern.

Key analysis steps:

Trace memfd_create → write(fd, payload, size) → fexecve(fd, ...): This is the fileless execution chain. The payload written to the memfd IS the second-stage binary.
Look for payload extraction: XOR/AES decrypt → write to memfd → execute
If self-deletion: readlink("/proc/self/exe") → unlink(path) — binary removes itself after launch
/proc/self/mem write: Open own mem, seek to target address, write new code — runtime patching without file modification

Detection note for report: Process will show /proc/<pid>/exe as (deleted) or memfd:<name> (deleted).
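
The detection note can be turned into a small host-side check. This sketch classifies the target of a `/proc/<pid>/exe` link; the labels and function names are hypothetical, and `scan_proc` needs sufficient privileges to read other users' processes.

```python
import os
from typing import Iterator, Tuple


def classify_exe_link(target: str) -> str:
    """Classify the readlink() result of /proc/<pid>/exe."""
    if target.startswith("/memfd:"):
        return "memfd-backed"
    if target.endswith(" (deleted)"):
        return "deleted-on-disk"
    return "on-disk"


def scan_proc() -> Iterator[Tuple[int, str, str]]:
    """Yield (pid, link target, classification) for suspicious processes."""
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            target = os.readlink(f"/proc/{entry}/exe")
        except OSError:
            continue  # kernel thread, exited process, or permission denied
        label = classify_exe_link(target)
        if label != "on-disk":
            yield int(entry), target, label
```

A `deleted-on-disk` result also occurs after legitimate package upgrades, so it is a lead to investigate rather than a verdict.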

Backdoor / RAT

Fingerprints: socket + connect/bind + execve("/bin/sh") cluster, reverse shell pattern.

Key analysis steps:

Find the command dispatch function: usually a switch or if/else if chain on a received byte/string
Map commands to handlers: file read/write, shell exec, screenshot (X11), keylog, persistence install
Identify C2 protocol: plaintext? XOR-encrypted? TLS? IRC (look for PRIVMSG, JOIN # strings)
Check for SSH backdoor: PAM module (/etc/pam.d/sshd reference) or authorized_keys write

Credential Stealer / Info Stealer

Fingerprints: /etc/shadow access, ~/.ssh/ path strings, browser database paths, cloud credential paths.

Key analysis steps:

Trace each credential path to the exfiltration function — what encoding is used before send?
Look for HTTP POST to C2 with stolen data body
Check for /proc/[pid]/mem read targeting sshd or browser processes

Phase 4: Deep Dive — Kill Chain

Anti-Analysis (check before spending time on obfuscated code)

Anti-debug techniques — search for these call patterns:

ptrace(PTRACE_TRACEME, 0, ...)    → if returns -1, exit/behave differently
open("/proc/self/status")         → read TracerPid: field; non-zero = debugger
clock_gettime(CLOCK_MONOTONIC)    → double-call, delta check; >threshold = debugger
signal(SIGTRAP or SIGILL, handler) + raise()   → handler runs = clean; a debugger intercepts the signal first, so the handler never runs
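
To confirm what the sample sees when it performs the TracerPid check, the same parse can be reproduced in the analysis environment. A minimal sketch; `tracer_pid` is a hypothetical helper operating on the text of `/proc/<pid>/status`.

```python
def tracer_pid(status_text: str) -> int:
    """Return the TracerPid value from /proc/<pid>/status text (0 = not traced)."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return 0


def is_traced(pid: str = "self") -> bool:
    """Read the live status file and report whether a tracer is attached."""
    with open(f"/proc/{pid}/status", "r", encoding="ascii", errors="replace") as fh:
        return tracer_pid(fh.read()) != 0
```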

VM/sandbox detection:

open("/sys/class/dmi/id/product_name")  → "VirtualBox", "VMware Virtual Platform", "QEMU"
open("/sys/class/dmi/id/sys_vendor")    → "innotek GmbH", "VMware, Inc.", "QEMU"
open("/proc/cpuinfo")                   → check hypervisor flags
open("/proc/self/cgroup")               → read and parse for container cgroup paths (regular file, not a symlink)
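
The DMI comparison a sample performs usually reduces to substring matching, which makes it easy to predict how it will behave in a given sandbox. A minimal sketch using only the marker strings listed above; the function name and return labels are hypothetical.

```python
from typing import Optional

# (substring, label) pairs taken from the DMI values listed above
VM_MARKERS = (
    ("VirtualBox", "VirtualBox"),
    ("innotek GmbH", "VirtualBox"),
    ("VMware", "VMware"),
    ("QEMU", "QEMU"),
)


def detect_vm(product_name: str, sys_vendor: str) -> Optional[str]:
    """Return the hypervisor label matched by either DMI field, else None."""
    haystack = f"{product_name}\n{sys_vendor}"
    for needle, label in VM_MARKERS:
        if needle in haystack:
            return label
    return None
```

Running this against the analysis VM's own DMI values shows whether they need to be spoofed before detonation.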

Process name spoofing: prctl(PR_SET_NAME, "kworker/0:1", ...) — makes malware look like a kernel thread.

Persistence

Identify ALL persistence vectors — malware often installs multiple. Check string results from Phase 2 against this hierarchy (ordered by stealth):

LD_PRELOAD / /etc/ld.so.preload: most stealthy; affects every dynamically linked process
PAM module: credential capture + persistence
Kernel module / eBPF: survives user-space analysis
Systemd service: survives reboot, shown by systemctl list-units
Crontab: common in cryptominers; crontab -l reveals per-user entries, while /etc/crontab and /etc/cron.d must be read directly
SSH authorized_keys: opportunistic, targeted
.bashrc / .profile: user-scoped, less stealthy
Udev rule: triggered by device events

For each: identify the exact path being written and what binary/command is being persisted.

Phase 5: Language-Specific Analysis

Go Binaries

Identification: runtime., go.itab, sync.Mutex, runtime.morestack strings. Function count is high (the statically linked Go runtime inflates it). The .go.buildinfo section contains the Go version.

Critical difference: Go strings are NOT null-terminated — they are {ptr, len} pairs. IDA's built-in string detection misses them. Use search_strings and cross-reference from .rodata to usage sites.
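
The {ptr, len} layout can be illustrated with a short reader. This sketch assumes a 64-bit little-endian binary and a flat image in which file offset equals virtual address minus a single base, which is a simplification; real binaries need segment-aware address translation.

```python
import struct


def read_go_string(image: bytes, image_base: int, header_va: int) -> str:
    """Read a Go string from its 16-byte header: 8-byte pointer, 8-byte length."""
    ptr, length = struct.unpack_from("<QQ", image, header_va - image_base)
    start = ptr - image_base
    # The length comes from the header; there is no terminator to scan for.
    return image[start:start + length].decode("utf-8", "replace")
```

In the blob `b"helloworld"` a header of (base + 5, 5) yields `world`, even though no null byte separates it from `hello`. That is why null-terminated string scanners merge or miss Go strings.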

Function recovery in stripped binaries:

pclntab (Program Counter Line Table) is present even in stripped binaries. It maps every PC range to a function name string. Locate it by its 4-byte magic, shown here as little-endian bytes: \xFB\xFF\xFF\xFF (Go 1.2-1.15), \xFA\xFF\xFF\xFF (Go 1.16-1.17), \xF0\xFF\xFF\xFF (Go 1.18-1.19), \xF1\xFF\xFF\xFF (Go 1.20+).
goroutine stack check preamble: Most functions start with CMPQ SP, goroutine_stackguard; JBE morestack. Use this as a function start signature, but expect nosplit and small leaf functions to omit it.
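
A magic scan for pclntab can be sketched as below. The magic values follow Go's `debug/gosym` package (0xFFFFFFFB, 0xFFFFFFFA, 0xFFFFFFF0, 0xFFFFFFF1); the function name is hypothetical, and the header sanity check (two zero bytes, instruction quantum, pointer size) is there to reject coincidental byte matches.

```python
from typing import List, Tuple

PCLNTAB_MAGICS = {
    0xFFFFFFFB: "Go 1.2-1.15",
    0xFFFFFFFA: "Go 1.16-1.17",
    0xFFFFFFF0: "Go 1.18-1.19",
    0xFFFFFFF1: "Go 1.20+",
}


def find_pclntab(data: bytes, little_endian: bool = True) -> List[Tuple[int, str]]:
    """Return (file offset, version label) for each plausible pclntab header."""
    order = "little" if little_endian else "big"
    hits = []
    for magic, label in PCLNTAB_MAGICS.items():
        needle = magic.to_bytes(4, order)
        pos = data.find(needle)
        while pos != -1:
            header = data[pos:pos + 8]
            # Layout: magic, 0x00, 0x00, min instruction size, pointer size
            if (len(header) == 8 and header[4:6] == b"\x00\x00"
                    and header[6] in (1, 2, 4) and header[7] in (4, 8)):
                hits.append((pos, label))
            pos = data.find(needle, pos + 1)
    return sorted(hits)
```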

Common Go malware families: Chaos/Sparc (DDoS+SSH+module loader), Sliver C2, Gopuram backdoor.

Naming convention for Go functions:

main.funcName — package main
net.(*Dialer).DialContext — standard library method
Rename to: go_main_ConnectC2, go_net_ResolveDomain, etc.
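
Recovered Go symbol names contain characters that are awkward in disassembler identifiers. One possible normalisation, shown as a sketch; the `go_` prefix follows the convention above, and collapsing every non-alphanumeric run to an underscore is an assumption of this example.

```python
import re


def ida_safe_go_name(symbol: str) -> str:
    """Turn a Go symbol such as net.(*Dialer).DialContext into a flat identifier."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", symbol).strip("_")
    return "go_" + cleaned
```

With this rule `net.(*Dialer).DialContext` becomes `go_net_Dialer_DialContext` and `main.funcName` becomes `go_main_funcName`.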

Rust Binaries

Identification: _ZN...17h<hash>E legacy-mangled or _R v0-mangled names, core::panicking::panic, std::sys_common, Tokio / tokio_ strings. No true RTTI.

Key technique: Panic strings leak source code paths and line numbers even in release builds. search_strings "src/" often returns paths from the original Rust source tree, revealing project structure and function names.
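
Source paths can be harvested from the string list with a regular expression. A minimal sketch; the pattern and function name are assumptions of this example and only match paths that contain a `src/` component and end in `.rs`.

```python
import re
from typing import Iterable, List

# optional leading directories, a src/ component, then a .rs file
RUST_SRC_RE = re.compile(r"(?:[\w.\-]+/)*src/(?:[\w.\-]+/)*[\w.\-]+\.rs")


def rust_source_paths(strings: Iterable[str]) -> List[str]:
    """Return the sorted set of Rust source paths leaked in panic strings."""
    found = set()
    for text in strings:
        for match in RUST_SRC_RE.finditer(text):
            found.add(match.group(0))
    return sorted(found)
```

Paths under the sample's own crate usually carry the most useful module names; paths under a Cargo registry directory identify third-party dependencies and their versions.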

Async (Tokio) pattern: State machine in .await points — a single async fn compiles into a state enum. Recognize by the poll function pattern and Pin<Box<dyn Future>>.

Vtables: Look for data arrays with consecutive function pointers — these are trait vtables. Retype in IDA with correct function signatures to enable cross-references.

Phase 6: Architecture-Specific Notes

x86_64 (Desktop/Server/Container)

Container escape: /var/run/docker.sock access, cgroup release_agent write (CVE-2022-0492)
Cryptominer: SIMD instructions (AVX2/AVX512), high entropy memory loops = RandomX
Direct syscalls: syscall instruction with rax = syscall number (bypasses libc hooking)
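
A quick byte-level sweep can surface candidate direct syscalls before opening the disassembler. This is a heuristic sketch, not a disassembly: it only recognises `mov eax, imm32` (`B8 imm32`) placed immediately before `syscall` (`0F 05`), and it can match data bytes by coincidence. The function name is hypothetical.

```python
import struct
from typing import List, Tuple


def find_direct_syscalls(code: bytes) -> List[Tuple[int, int]]:
    """Return (offset of syscall instruction, syscall number) pairs."""
    hits = []
    pos = code.find(b"\x0f\x05")
    while pos != -1:
        # Look for B8 imm32 in the five bytes directly before the syscall.
        if pos >= 5 and code[pos - 5] == 0xB8:
            (number,) = struct.unpack_from("<I", code, pos - 4)
            hits.append((pos, number))
        pos = code.find(b"\x0f\x05", pos + 1)
    return hits
```

For example, `B8 3B 00 00 00 0F 05` is reported with number 59, which is `execve` on x86_64.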

AArch64 / ARM

IoT/embedded: Hardcoded default credentials for common devices in string table
Android-adjacent: Bionic C library patterns instead of glibc
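ARM debug registers: MRS X0, MDSCR_EL1 reads the debug control register, but it is only accessible at EL1 or higher, so expect this check in kernel-mode code (LKMs, firmware); a set debug enable bit is treated as debugger attached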

MIPS (Mirai-family routers/DVRs)

Big-endian vs little-endian: Check ELF header EI_DATA byte (0x01=LE, 0x02=BE)
MIPS delay slots: instruction after branch/jump always executes — IDA handles this but can confuse manual reading
MIPS syscall: syscall instruction with $v0 = syscall number
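
The class, byte order, and machine fields can be read directly from the first 20 bytes of the file. A minimal sketch; the function name is hypothetical and the machine table lists only the architectures discussed in this phase.

```python
from typing import Tuple

EI_CLASS = {1: "ELF32", 2: "ELF64"}
EI_DATA = {1: "little-endian", 2: "big-endian"}
E_MACHINE = {3: "x86", 8: "mips", 20: "ppc", 21: "ppc64",
             40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_identity(data: bytes) -> Tuple[str, str, str]:
    """Return (class, byte order, machine) from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = EI_CLASS.get(data[4], "unknown")
    byte_order = EI_DATA.get(data[5], "unknown")
    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    order = "big" if data[5] == 2 else "little"
    machine = int.from_bytes(data[18:20], order)
    return elf_class, byte_order, E_MACHINE.get(machine, hex(machine))
```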

Phase 7: Report

Triage summary first (fits in ~15 lines):

Binary:       sample | ELF x86_64 | dynamically linked | 847 funcs
Toolchain:    GCC / glibc 2.31
Packing:      None (entropy 5.8, all sections)
Architecture: x86_64 — standard server/container target
Classification: Linux backdoor + cryptominer (dual purpose)

Strings:      /etc/cron.d/, /proc/self/status, stratum+tcp://, memfd_create
Imports:      socket, connect, execve, memfd_create, prctl, sched_setaffinity
Persistence:  cron job (/etc/cron.d/update), SSH authorized_keys
C2:           hxxps://185.220.101[.]47/beacon (HTTPS), stratum+tcp://pool.minexmr[.]com:4444
Evasion:      ptrace PTRACE_TRACEME, /proc/self/status TracerPid, process name spoofing
Lateral:      SSH key harvesting from /home/*/.ssh/
Fileless:     memfd_create + fexecve for second-stage drop
Verdict:      Dual-purpose Linux implant: reverse shell + XMRig miner drop.
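
Defanging can be applied mechanically before the report is written. The sketch below reproduces the style used in the C2 line of the example summary (`hxxp` scheme, last dot of the host bracketed); the function name is hypothetical and other defanging conventions exist.

```python
import re

_IOC_RE = re.compile(r"^((?:[a-z0-9+]+://)?)([^/:]+)(.*)$", re.IGNORECASE)


def defang(ioc: str) -> str:
    """Rewrite http(s) to hxxp(s) and bracket the last dot of the host."""
    out = re.sub(r"^http", "hxxp", ioc, flags=re.IGNORECASE)
    match = _IOC_RE.match(out)
    if not match:
        return out
    scheme, host, rest = match.groups()
    if "." in host:
        head, tail = host.rsplit(".", 1)
        host = f"{head}[.]{tail}"
    return scheme + host + rest
```

Only the host part is altered, so ports and paths stay readable for whoever consumes the report.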