name: bn-re
description: "Reverse-engineering methodology for unknown binaries via the bn CLI. Highest-value moves over raw decompilation: a C++ class-lens triage that maps the type lattice (public API vs implementation, vtables, inheritance) before reading code, and a conditional hidden-surface sweep that recovers .init_array constructors, dispatch tables, and RTTI handlers only when the target's signatures warrant it (and deliberately skips them when they don't). Also covers function triage, iterative type/struct recovery, call-graph mapping, and naming."
bn-re — Reverse Engineering Methodology
Use this skill when the user wants to understand, reverse engineer, or analyze a binary. This is a methodology guide — it tells you what to do and why. For command syntax, see the bn skill.
Approaching an Unknown Binary
Start broad, then narrow:
1. Orient. Get architecture, platform, and entry point:
   bn target info
2. Survey imports and strings. These reveal libraries, APIs, and embedded literals that hint at functionality:
   bn imports
   bn strings
   Imports tell you what the binary does (network I/O, file ops, crypto, GUI). Strings reveal configuration keys, error messages, format strings, and embedded paths.
3. Scan the function list to get a sense of scope:
   bn function list
   Note the total count, address range, and whether symbols are stripped. A stripped binary with 2000 functions requires different tactics than a symbolicated one with 50. For vulnerability work on a stripped static target, see the "Stripped / static lane" in bn-vr, which inverts the import-first workflow (strings → string-xref → behavioral sink recovery).
4. Map the C++ type lattice (RTTI / symbolicated C++ targets). When the symbols show C++ (mangled _Z… names, RTTI), lead with the class lens instead of grepping symbols by hand:
   bn class list --no-stl        # domain classes, folding std/ABI noise
   bn class show <ClassName>     # one class: methods, vtable slots, bases, construction sites
   class list clusters functions by class and separates the public API surface from the implementation/engine classes at a glance; class show resolves a class's inheritance, vtable layout, and where it is constructed. On a rich C++ binary this is the fastest orientation available, so start here, then drill down with the sections below. (Plain-C / stripped targets have no classes to show; skip this step.)
Quick-loaded target? If the binary was opened with bn load --quick / bn session start --quick (fast, no analysis), bn imports and bn sections work, but bn strings errors until bn refresh (it refuses rather than return nothing) and bn function list is partial (only entry-point + symbol functions). Steps 2–3 would otherwise read as "no strings, almost no functions" and mislead the survey. Check analysis_state in bn target info ("quick" vs "full") and run bn refresh before surveying. See "Quick Load" in the bn skill (now reference/runtime.md).
Function Triage
Pipe trap: large bn read output (decompile, function list, etc.) spills to disk, and stdout carries only an envelope. Piping that into grep / jq / awk / c++filt makes the filter see the envelope, not the data, so a no-match misreads as "absent" (e.g. concluding a name is not mangled because | grep _Z matched nothing). Write to a file first and process it: bn function list --out /tmp/fns.json && jq '.items|length' /tmp/fns.json, or slice with --limit / --lines so it doesn't spill.
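The write-then-process pattern above can also be done in a script once the data is on disk. The following is a minimal Python sketch, assuming only what the jq example implies (a top-level "items" list); the per-item "name" field and the demo file are hypothetical stand-ins for the real schema and the file written by --out.

```python
import json
import tempfile
from pathlib import Path


def load_items(path):
    """Load a saved bn JSON result and return its "items" list."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    # Fail loudly: a missing key must not be mistaken for "no results".
    if "items" not in data:
        raise KeyError(f"{path} has no 'items' key; is this an envelope rather than the data?")
    return data["items"]


def mangled_names(items, name_key="name"):
    """Return names that look Itanium-mangled (they start with _Z)."""
    # "name" is an assumed field name; adjust it to the real schema.
    return [it[name_key] for it in items if str(it.get(name_key, "")).startswith("_Z")]


# Demo with a stand-in file; in practice this is the file written by --out.
sample = {"items": [{"name": "_ZN3Foo3barEv"}, {"name": "main"}]}
demo_path = Path(tempfile.gettempdir()) / "fns_demo.json"
demo_path.write_text(json.dumps(sample), encoding="utf-8")

items = load_items(demo_path)
print(len(items), mangled_names(items))
```

Raising on a missing "items" key keeps an envelope-only file from silently producing an empty result, which is the same failure mode the pipe trap describes.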
Not all functions matter equally. Prioritize:
- Entry point and exports: start with what the OS calls. bn target info gives the entry point; bn function search main or bn function search start may find the real main.
- Large functions: complex logic concentrates in big functions. Sort by size or instruction count.
- High xref count: functions called from many places are utilities or core abstractions:
  bn xrefs <function_name>
  Many inbound xrefs = widely used. Few xrefs + large body = likely a top-level handler.
- String references: functions containing interesting strings (error messages, protocol keywords, file paths) are high-value targets:
  bn strings --regex --query "error|fail|password|key|flag"
  Then use bn xrefs on the string address to find which functions reference it.
- Import callers: trace backward from interesting imports:
  bn xrefs malloc
  bn callsites recv --within <function>
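The prioritization heuristics above can be written down as a small ranking function. This is a sketch only: the FnStats record, the thresholds (512 and 10), and the sample numbers are hypothetical, and the statistics would have to be gathered separately from the bn output.

```python
from dataclasses import dataclass


@dataclass
class FnStats:
    name: str
    size: int           # body size in bytes (or instruction count)
    inbound_xrefs: int  # number of callers / references
    string_hits: int    # interesting strings the function references


def classify(fn, big=512, many=10):
    """Apply the triage heuristics to one function (thresholds are guesses)."""
    if fn.string_hits > 0:
        return "string-target"
    if fn.inbound_xrefs >= many:
        return "utility"
    if fn.size >= big and fn.inbound_xrefs <= 2:
        return "top-level-handler"
    return "low-priority"


def triage(functions):
    """Order functions by class first, then by size (largest first)."""
    order = ["string-target", "top-level-handler", "utility", "low-priority"]
    return sorted(functions, key=lambda f: (order.index(classify(f)), -f.size))


fns = [
    FnStats("sub_401000", 2048, 1, 0),
    FnStats("sub_402000", 64, 40, 0),
    FnStats("sub_403000", 300, 3, 2),
    FnStats("sub_404000", 32, 1, 0),
]
for fn in triage(fns):
    print(f"{fn.name:12} {classify(fn)}")
```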
Hidden Code Surfaces
Binary Ninja's auto-analysis follows direct calls. Two important categories of code don't sit on that graph and will be invisible until you go looking for them.
Pre-main code (.init_array, constructors)
Functions tagged with __attribute__((constructor)), C++ static initializers, and any code the linker registers in .init_array run before main. They commonly stage globals, derive keys, or wire up dispatch tables — exactly the kind of setup that breaks an analysis built only from main's call graph.
To find them:
- bn evidence init finds every constructor/destructor section (.init_array, .ctors, .fini_array, …), walks each one, and resolves every slot to its function. It's arch-aware (uses the view's pointer size + endianness) and ARM/Thumb-aware (clears the T bit so an odd 0x…1 pointer resolves to the even function entry, marked [thumb-adjusted]). Don't hand-roll bv.read + struct.unpack('<…Q'), which hardcodes 8-byte little-endian and is silently wrong off x86-64. See the command in the bn skill (reference/reading.md).
- Skip the toolchain stub frame_dummy: it's the first slot on most GCC builds and rarely interesting.
- Decompile each remaining entry. Anything that writes to .data / .bss is staging state for main to read; rename it stage1_<purpose> (or similar) so the relationship is visible from later analysis.
ELF entry-flow review, for when nothing in main makes sense: the nominal flow is entry_point → __libc_start_main → main, but _start and __libc_start_main invoke the .init_array callbacks before main runs. If a global "appears from nowhere" in main, the producer is almost certainly an .init_array entry.
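To make the warning about hardcoded 8-byte little-endian reads concrete, here is a minimal Python sketch of arch-aware slot decoding. It is not the implementation behind bn evidence init; the function name, parameters, and sample bytes are hypothetical, and it only illustrates why pointer size, endianness, and the Thumb bit all have to come from the target.

```python
import struct


def decode_slots(raw, pointer_size, little_endian, thumb=False):
    """Decode a pointer array (e.g. .init_array bytes) into (address, adjusted) pairs."""
    if pointer_size not in (4, 8):
        raise ValueError("unsupported pointer size")
    fmt = ("<" if little_endian else ">") + ("I" if pointer_size == 4 else "Q")
    slots = []
    for (value,) in struct.iter_unpack(fmt, raw):
        adjusted = thumb and (value & 1) == 1
        # On ARM, bit 0 selects Thumb mode; the function entry is the even address.
        addr = value & ~1 if adjusted else value
        slots.append((addr, adjusted))
    return slots


# 32-bit little-endian ARM: the odd pointer is a Thumb function at 0x8000.
arm_raw = struct.pack("<II", 0x8001, 0x9000)
print(decode_slots(arm_raw, 4, True, thumb=True))

# 32-bit big-endian target: the same call with the right byte order.
be_raw = struct.pack(">I", 0x10000)
print(decode_slots(be_raw, 4, False))
```

Decoding the big-endian sample with a little-endian format would yield 0x100 instead of 0x10000, which is the "silently wrong" case the text warns about.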
Data-only function references
If the binary has a dispatch table (an array of function pointers — common in VMs, FSAs, vtables, callback registries), Binary Ninja often won't identify the targets as functions because there's no direct call to them, only a data reference from the table.
Symptoms: bn decompile <addr> errors with Function not found; the bytes at <addr> look like a function prologue (endbr64, push rbp, push {…,lr} / stp x29,x30 on ARM) on disasm, but it's marked as data.
Recover and create the targets:
1. bn evidence table <table-addr> --entries N reads the dispatch/vtable table and resolves each slot to a function. Output is Thumb-normalized, with a status/plausible tag per entry and a warning when the address doesn't look like a table (so you can tell a real table from misread code). Slots that come back status: mapped/unmapped with no function are the ones BN missed.
2. bn function create <target> --preview creates and verifies a function at each missing slot (a previewed, revertible mutation; see reference/reading.md and reference/mutating.md in the bn skill). Save afterward to persist it.
After that, the normal bn decompile / bn xrefs flow works on the new function.
When this comes up most: VM opcode handler tables, FSA predicate tables, COM-style vtables, plugin registries. If you've recovered a struct of (tag, fn_ptr) rows and one of the fn_ptr targets is missing, this is almost always why.
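The "which slots did analysis miss" check can be sketched in a few lines of Python. This assumes you have already collected the slot targets, the known function starts, and the executable address ranges by other means; all three inputs and the sample addresses are hypothetical, not output of any bn command.

```python
def missing_targets(slots, known_function_starts, code_ranges):
    """Return slot targets that land in code but have no function defined there."""
    known = set(known_function_starts)
    missing = []
    for addr in slots:
        in_code = any(start <= addr < end for start, end in code_ranges)
        # Keep first-seen order and skip duplicates (tables often repeat a handler).
        if in_code and addr not in known and addr not in missing:
            missing.append(addr)
    return missing


table_slots = [0x401000, 0x401200, 0x401200, 0x401400, 0x0]
functions = [0x401000, 0x401400]
text_ranges = [(0x401000, 0x402000)]

for addr in missing_targets(table_slots, functions, text_ranges):
    print(hex(addr))
```

Each reported address is a candidate for function creation; a slot that falls outside every code range (such as the null entry here) is more likely a terminator or a sign that the table bounds are wrong.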
Stripped C++ / generated code (RTTI, vtables, protobuf)
Stripped C++ firmware leaks structure through RTTI type-name strings and vtables even when symbols are gone. To turn a type name into code:
- First, if the binary still has demangled C++ symbols, use the class lens. bn class list --no-stl clusters the recovered classes and bn class show <Name> gives one class's vtable + bases + construction sites. It correlates the symbols/RTTI/vtables BN already recovered, so reach for it before the lower-level evidence helpers below (which you still need on a fully stripped target where the class lens has no symbols to cluster).
- bn evidence message <TypeName> locates the type-name string (e.g. a mangled N…E typeinfo name or a pkg.Message proto string), lists its xrefs, and dumps the nearby metadata windows: the typeinfo table and the serializer/handler pointer slots sitting next to it. This is how you get from "I see the string common.HeadUnitInfo" to "its serializer is sub_…" without reading raw bytes by hand.
- bn evidence table <vtable-addr> lists a class vtable's methods (Thumb-normalized), so you can tell construction from dispatch from generated boilerplate.
- bn evidence function <fn> flags thunks (a j_* veneer or PLT/import trampoline → its target) and, for each call, shows the raw ABI argument evidence beside the pseudo-C, including the vtable offset for an indirect/virtual call ((*(*this + 0xNN))(...)). Reach for it when the decompiler's argument story is incomplete and you'd otherwise drop to MLIL/disassembly.
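When reading RTTI strings by eye, it helps to know how a mangled N…E typeinfo name maps to a qualified name. The sketch below decodes only the simplest Itanium ABI case (plain length-prefixed identifiers, optionally nested); templates, substitutions, and qualifiers are deliberately rejected. The sample name is an illustration, not a string taken from any particular binary.

```python
def decode_typeinfo_name(name):
    """Decode a simple Itanium typeinfo name into a ::-qualified name.

    Handles only length-prefixed identifiers, optionally wrapped in N...E.
    """
    nested = name.startswith("N")
    pos = 1 if nested else 0
    parts = []
    while pos < len(name) and name[pos].isdigit():
        end = pos
        while end < len(name) and name[end].isdigit():
            end += 1
        length = int(name[pos:end])
        ident = name[end:end + length]
        if len(ident) != length:
            raise ValueError("truncated identifier")
        parts.append(ident)
        pos = end + length
    rest = name[pos:]
    # A nested name must close with E; a non-nested name is exactly one identifier.
    if not parts or rest != ("E" if nested else "") or (not nested and len(parts) != 1):
        raise ValueError(f"unsupported typeinfo name: {name!r}")
    return "::".join(parts)


print(decode_typeinfo_name("N6common12HeadUnitInfoE"))
print(decode_typeinfo_name("3Foo"))
```

Anything the sketch rejects is better handed to a real demangler such as c++filt, with a _ZTS prefix added to make it a full symbol.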
Iterative Type Recovery
Type recovery is incremental. Don't try to get everything right at once.
Phase 1: Rename functions
Start with the easiest wins — rename functions whose purpose is obvious from strings, imports, or call patterns:
bn symbol rename sub_401000 parse_config --preview
Always preview first. Renaming propagates through decompilation and makes surrounding code easier to read.
Phase 2: Retype locals and parameters
Once a function's purpose is clear, fix the prototype and local types:
bn proto get parse_config
bn proto set parse_config "int32_t parse_config(char* buf, int32_t len)" --preview
bn local list parse_config
bn local retype parse_config arg1 "char*"
Correct prototypes propagate to all callers.
Phase 3: Struct reconstruction
When you see repeated field accesses at fixed offsets from a pointer, that pointer is a struct. See the Struct Reconstruction section below.
Batch mutations
When you have multiple renames or retypes queued up, use bn batch apply with a manifest instead of individual commands. This is faster and atomic. Pipe the manifest on stdin with a quoted heredoc — no temp file, and free-text comments need no escaping:
bn batch apply - <<'BN_EOF'
{"target": "active", "ops": [
{"op": "rename_symbol", "identifier": "sub_401000", "new_name": "parse_header"},
{"op": "set_comment", "address": "0x401040", "comment": "len isn't bounds-checked"}
]}
BN_EOF
A file path is also accepted (bn batch apply /tmp/manifest.json); --preview (before -) diffs without committing.
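When the queue of renames and comments is long, the manifest can be generated rather than typed. The following Python sketch emits the same JSON shape as the manifest shown above and uses only the two op kinds that appear there; the function name and the sample entries are hypothetical.

```python
import json


def build_manifest(renames, comments, target="active"):
    """Build a batch manifest from {old: new} renames and {address: text} comments."""
    ops = []
    for old, new in renames.items():
        ops.append({"op": "rename_symbol", "identifier": old, "new_name": new})
    for address, text in comments.items():
        ops.append({"op": "set_comment", "address": address, "comment": text})
    # json.dumps handles quoting, so free-text comments need no manual escaping.
    return json.dumps({"target": target, "ops": ops}, indent=2)


manifest = build_manifest(
    renames={"sub_401000": "parse_header", "sub_401200": "read_chunk"},
    comments={"0x401040": "len isn't bounds-checked"},
)
print(manifest)
```

Because the script prints the manifest to stdout, its output can be piped to bn batch apply - or saved to a file and passed by path.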
Call Graph Analysis
Understanding relationships between functions reveals architecture:
1. Trace callees: what does a function depend on?
   bn decompile <function>
   Read the decompilation and note every function call.
2. Trace callers: who calls this function?
   bn xrefs <function>
3. Detailed call context: when you need to understand how a function is called (what arguments, under what conditions):
   bn callsites <callee> --within <caller>
   This gives you the exact call site with surrounding HLIL context.
4. Trace argument origins: when you need to know where a specific call argument comes from (function parameter, global, heap allocation, previous call return):
   bn trace <caller> <call_address> --arg N
   bn trace <caller> <call_address> --arg N --interprocedural
   Each step shows the SSA variable, its defining instruction, and whether the chain terminates at a function parameter, memory load, or call boundary. Add --interprocedural to follow return values across internal call boundaries (works best on static/kernel binaries).
5. Build a mental call tree: for key functions, trace both up and down 2-3 layers. This reveals the flow: entry -> dispatch -> handler -> utility.
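The "2-3 layers up and down" walk is a bounded breadth-first search over the call graph. Here is a minimal Python sketch, assuming you have recorded caller-to-callee edges in a dictionary while reading decompilation; the graph and function names are hypothetical.

```python
from collections import deque


def neighbors_within(graph, root, depth):
    """Breadth-first walk from root; returns {function: layer} up to depth layers."""
    seen = {root: 0}
    queue = deque([root])
    while queue:
        fn = queue.popleft()
        if seen[fn] == depth:
            continue  # do not expand past the requested layer
        for nxt in graph.get(fn, ()):
            if nxt not in seen:
                seen[nxt] = seen[fn] + 1
                queue.append(nxt)
    return seen


def invert(graph):
    """Turn a caller -> callees map into a callee -> callers map."""
    callers = {}
    for caller, callees in graph.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)
    return callers


calls = {
    "entry": ["dispatch"],
    "dispatch": ["handle_a", "handle_b"],
    "handle_a": ["copy_checked"],
}
print("down:", neighbors_within(calls, "dispatch", 2))
print("up:  ", neighbors_within(invert(calls), "dispatch", 2))
```

Walking the inverted graph gives the caller side with the same routine, and the seen map prevents recursion cycles from looping forever.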
Commenting
Comment the why a name can't carry (assumptions, edge cases, cross-function relationships), and drop TODO: markers for deferred work so later passes resume via bn comment list --query TODO. If it fits in a name, rename instead.
bn comment set --address 0x401000 "len isn't bounds-checked; attacker-controlled"
bn comment set --address 0x402000 "TODO: arg2 looks like a callback — confirm signature"
Struct Reconstruction
Repeated fixed-offset accesses off a pointer (*(arg1 + 0x10), *(arg1 + 0x18)) mean it's a struct. Collect the offsets from decompilation, check for an existing type (bn struct show <T>), set fields at the observed offsets, retype the param, and re-decompile — iterate until it reads naturally. Complex structs: bn types declare or bn py exec + StructureBuilder (see the bn skill).
bn struct field set Player 0x10 health int32_t --preview
bn local retype <function> arg1 "Player*"
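Collecting offsets by hand gets tedious on large functions. The sketch below pulls them out of saved pseudo-C with a regular expression. It matches only the plain *(ptr + offset) form quoted above, not casts or array indexing, and the sample text is made up for illustration.

```python
import re

# Matches *(name + 0x10) or *(name + 16); casts and indexing are not handled.
FIELD_ACCESS = re.compile(r"\*\(\s*(\w+)\s*\+\s*(0x[0-9a-fA-F]+|\d+)\s*\)")


def collect_offsets(pseudo_c, pointer):
    """Return the sorted, de-duplicated offsets accessed through pointer."""
    offsets = set()
    for var, off in FIELD_ACCESS.findall(pseudo_c):
        if var == pointer:
            base = 16 if off.lower().startswith("0x") else 10
            offsets.add(int(off, base))
    return sorted(offsets)


text = """
if (*(arg1 + 0x10) > 0)
    *(arg1 + 0x18) = *(arg1 + 0x10) - 1;
*(arg2 + 8) = 0;
"""
print([hex(o) for o in collect_offsets(text, "arg1")])
```

The resulting offset list is the input to the field-setting step: one field per observed offset, with types refined as the decompilation improves.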