Reverse-engineering techniques

The methods maintainers use to observe and confirm protocol facts. Each technique id is part of the fixed provenance vocabulary. Generated from spec/techniques/.

Technique                      | Maturity     | Targets                             | Guide
Baileys instrumentation        | established  | signaling, keying                   | guide
Frida dynamic hooking          | emerging     | keying, media, transport            | guide
Process memory dump            | experimental | keying, media                       | -
TLS man-in-the-middle          | emerging     | signaling, transport                | -
Static smali / native analysis | emerging     | signaling, keying, media            | -
WhatsApp Web WASM analysis     | emerging     | signaling, keying, media, transport | guide
WebSocket / WABinary capture   | established  | signaling, keying                   | guide

Baileys instrumentation

id: baileys-instrumentation
maturity: established
status: stable
targets: signaling, keying

Add logging and introspection hooks inside the Baileys client to record parsed call nodes, session state, and the shape of the Signal-protocol material used to key a call. Builds on websocket-capture by giving structured, already-decoded access to nodes and to the client's view of the keying handshake.

Strengths

  • Works with already-parsed node objects rather than raw bytes.
  • Can correlate an offer with the Signal session state that produced its nodes.
  • Easy to script: emit findings straight into the capture intake format.
  • Deterministic and repeatable against your own test accounts.

Limitations

  • Constrained to what the library implements and exposes; gaps in Baileys are gaps here too.
  • Still does not reach the media engine or SRTP key derivation.
  • Risk of drift if WhatsApp changes node shapes faster than the library tracks them.

Tooling

  • Baileys (WhiskeySockets)
  • Custom event/middleware hooks around node send/receive
  • The capture Issue Form to upstream findings
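
The "hooks around node send/receive" idea can be sketched without touching any real library API. The snippet below is a minimal, library-agnostic illustration in Python: the node shape (`tag`, `attrs`, `content`), the record fields, and the tag name `key-envelope` are all assumptions made for the example, not the Baileys API or the capture intake format. The sample node is synthetic.

```python
import json
from typing import Any, Callable

# Hypothetical decoded-node shape: {"tag": str, "attrs": dict, "content": list | bytes | None}
Node = dict


def describe(node: Node) -> dict:
    """Reduce a node to its shape; byte payloads are replaced by their length."""
    content = node.get("content")
    if isinstance(content, (bytes, bytearray)):
        body: Any = {"bytes": len(content)}
    elif isinstance(content, list):
        body = [describe(child) for child in content]
    else:
        body = None
    return {"tag": node["tag"], "attrs": dict(node.get("attrs", {})), "content": body}


def instrument(handler: Callable[[Node], Any], direction: str, sink: list) -> Callable[[Node], Any]:
    """Wrap a send/receive handler so every node is recorded before it is handled."""
    def wrapped(node: Node) -> Any:
        record = {"direction": direction, "node": describe(node)}
        sink.append(json.dumps(record, sort_keys=True))
        return handler(node)
    return wrapped


records: list = []
# The lambda stands in for whatever the client normally does with a received node.
on_receive = instrument(lambda node: node["tag"], "recv", records)

# Synthetic node for illustration only; not a captured stanza.
sample = {
    "tag": "call",
    "attrs": {"id": "synthetic-1"},
    "content": [{
        "tag": "offer",
        "attrs": {},
        "content": [{
            "tag": "key-envelope",  # hypothetical tag name
            "attrs": {"type": "pkmsg", "version": "2"},
            "content": b"\x00" * 32,
        }],
    }],
}
result = on_receive(sample)
```

Each entry in `records` is one JSON line describing the node's shape. Recording payload lengths instead of payload bytes keeps the log safe to share while still showing where the keying material sits in the tree.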

Maintainers: @purpshell

Guide: docs/techniques/baileys-instrumentation.md

Frida dynamic hooking

id: frida-hooking
maturity: emerging
status: review
targets: keying, media, transport

Attach Frida to the WhatsApp app on a device you own and hook native functions around call setup and cryptography to observe values the WebSocket never carries, notably derived media keys and the SRTP/RTP path. Highest effort, highest reach into the keying and media planes.

Strengths

  • Can observe runtime values, including derived keys and SRTP parameters, at the moment they are computed.
  • Reaches the media engine and transport selection that signaling-only techniques cannot.
  • Lets you confirm (not just infer) how a key carried in a signaling node becomes an SRTP key.

Limitations

  • Requires a rooted/owned test device and ongoing effort to track app updates.
  • Function offsets and signatures shift between builds; hooks are brittle.
  • Sensitive: only ever run against your own accounts and devices, never third parties.
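
Because offsets shift between builds, one way to keep hooks from silently attaching to the wrong code is to key every hook point to the exact build it was verified on and refuse to run otherwise. The sketch below shows only that bookkeeping, in Python; it does not call Frida. The build identifiers, module name, label, and offsets are placeholders invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HookPoint:
    name: str    # analyst-chosen label, not a real symbol
    module: str  # native library the offset is relative to
    offset: int  # offset from the module base, valid for one build only


# Hypothetical table: every value here is a placeholder.
HOOKS_BY_BUILD = {
    "build-A": [HookPoint("derive_media_key", "libexample.so", 0x1A2B0)],
    "build-B": [HookPoint("derive_media_key", "libexample.so", 0x1A7F4)],
}


def hooks_for(build: str) -> list[HookPoint]:
    """Return the hook points verified for this build, or fail loudly.

    Reusing another build's offsets would hook unrelated code, so an
    unknown build is an error rather than a fallback.
    """
    try:
        return HOOKS_BY_BUILD[build]
    except KeyError:
        raise LookupError(f"no verified hook offsets for build {build!r}") from None


def absolute_address(module_base: int, hook: HookPoint) -> int:
    """Resolve a hook point against the module base observed at runtime."""
    return module_base + hook.offset


hook = hooks_for("build-A")[0]
address = absolute_address(0x7000000000, hook)
```

Failing on an unknown build turns "hooks are brittle" into an explicit maintenance step: a new app release means re-deriving and re-verifying offsets before any value it produces is recorded.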

Tooling

  • Frida (frida.re)
  • A rooted test device or emulator you control
  • Symbol/offset discovery aids (e.g. a disassembler) to locate hook points

Guide: docs/techniques/frida-hooking.md

Process memory dump

id: memory-dump
maturity: experimental
status: draft
targets: keying, media

Recover in-memory state and key material from the WhatsApp process on a device you own, for example the plaintext call/media key after it is decrypted from a signaling node, or the SRTP context. The most invasive technique; powerful for keying/media but fragile and highly sensitive.

Strengths

  • Can recover the actual plaintext key bytes that other techniques only see as ciphertext or derivations.
  • Provides ground-truth values to anchor speculative keying/media facts.

Limitations

  • Extremely build- and timing-dependent; layouts change constantly.
  • Requires a rooted/owned device and careful handling; never dump a device or account that is not yours.
  • Recovered material is sensitive; only synthetic/own-account values may ever be written to this repo, and never raw keys.
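
One possible way to honor the "never raw keys" rule while still being able to say that two observations saw the same key is to record a short fingerprint instead of the bytes. This is a sketch under assumptions: the field names and the 16-hex-digit truncation are choices made for the example, and whether fingerprints are acceptable in the repo is a policy question this page does not answer. The key below is synthetic.

```python
import hashlib


def key_fingerprint(key: bytes, label: str) -> dict:
    """Describe key material without reproducing it.

    Only the length and a truncated SHA-256 digest are kept, which is
    enough to compare two observations for equality.
    """
    digest = hashlib.sha256(key).hexdigest()
    return {"label": label, "length": len(key), "sha256_prefix": digest[:16]}


# Synthetic key for illustration; never a value recovered from a real call.
synthetic_key = bytes(range(32))
from_dump = key_fingerprint(synthetic_key, "memory-dump")
from_hook = key_fingerprint(synthetic_key, "frida-hooking")
same_key = from_dump["sha256_prefix"] == from_hook["sha256_prefix"]
```

Comparing fingerprints from two techniques gives a corroboration signal without either technique's raw output leaving the test device.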

Tooling

  • Process memory acquisition on a rooted test device you own
  • A debugger/instrumentation harness to snapshot at the right moment
  • Frida (to locate the moment a key is in memory)

TLS man-in-the-middle

id: mitm-tls
maturity: emerging
status: draft
targets: signaling, transport

Intercept the auxiliary HTTPS/TLS traffic around a call (provisioning, relay allocation, telemetry) on a device you own. Does not break WhatsApp's Noise transport or Signal-protocol encryption, but can surface server endpoints, relay/ICE configuration, and timing that contextualize the signaling plane.

Strengths

  • Reveals out-of-band HTTPS endpoints and relay/transport configuration that the WebSocket alone does not explain.
  • Useful for mapping which servers participate in call setup and media relaying.
  • Standard, well-understood tooling.

Limitations

  • Cannot decrypt the Noise WebSocket or Signal payloads; the core call crypto stays out of reach.
  • Certificate pinning blocks naive interception; needs a patched/owned client to bypass on your own device.
  • Often yields context rather than direct protocol facts; corroborate before raising confidence.
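
Since this technique mostly yields context, a useful first artifact is an inventory of which hosts and paths appeared around a call. The sketch below uses only the Python standard library and assumes the intercepted request URLs have already been exported as plain strings; the hostnames are reserved `.test` placeholders, not real endpoints. Query strings are dropped because they may carry tokens.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def endpoint_inventory(urls):
    """Group observed HTTPS request URLs into host -> sorted unique paths."""
    by_host = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme != "https" or not parts.hostname:
            continue  # ignore anything that is not an HTTPS request
        by_host[parts.hostname].add(parts.path or "/")
    return {host: sorted(paths) for host, paths in by_host.items()}


# Placeholder URLs for illustration only.
observed = [
    "https://relay.example.test/allocate?token=abc",
    "https://relay.example.test/allocate?token=def",
    "https://telemetry.example.test/v1/events",
    "http://ignored.example.test/",
]
inventory = endpoint_inventory(observed)
```

An inventory like this is a list of leads. Each entry still needs corroboration from another technique before it supports a protocol fact.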

Tooling

  • mitmproxy
  • A device/emulator you own with a trusted interception CA
  • Pinning-bypass on your own build where required

Static smali / native analysis

id: static-smali-analysis
maturity: emerging
status: draft
targets: signaling, keying, media

Decompile the WhatsApp Android app and read the disassembled smali and native code to map how call stanzas are built, which attributes and enum values exist, and how the keying and media paths are wired. This shows the intended behavior without running anything.

Strengths

  • Exposes the full vocabulary of attributes, node tags, and enum constants as the app defines them, including paths rarely seen live.
  • Great for discovering names/structure to look for with dynamic techniques.
  • No live account or device required once you have the APK.

Limitations

  • Shows intended logic, not observed runtime values; it is easy to misread dead or feature-flagged code paths as live.
  • Obfuscation and native code raise the effort substantially.
  • A finding here is a hypothesis; it needs a live technique to corroborate before reaching confirmed.
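
A cheap first pass at the "vocabulary" is to collect the string literals that smali loads with `const-string`. The sketch below is a minimal Python scan over smali text. The sample method, class name, and strings are synthetic, and a real APK would need obfuscated or runtime-built strings handled separately.

```python
import re
from collections import Counter

# Matches `const-string vN, "..."` and the `/jumbo` variant, allowing escaped quotes.
CONST_STRING = re.compile(
    r'^\s*const-string(?:/jumbo)?\s+[vp]\d+,\s*"((?:[^"\\]|\\.)*)"',
    re.MULTILINE,
)


def string_vocabulary(smali_text: str) -> Counter:
    """Count every string literal loaded by a const-string instruction."""
    return Counter(CONST_STRING.findall(smali_text))


# Synthetic smali for illustration; not taken from any real app.
sample = '''
.method public static build()V
    const-string v0, "offer"
    const-string v1, "call-id"
    const-string/jumbo v2, "offer"
    invoke-static {v0, v1}, Lexample/Builder;->add(Ljava/lang/String;Ljava/lang/String;)V
    return-void
.end method
'''
vocab = string_vocabulary(sample)
```

The resulting counts only say that a name exists somewhere in the code. Whether it is ever sent on the wire remains a hypothesis for a live technique to test.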

Tooling

  • apktool / baksmali for smali
  • jadx for readable decompilation
  • Ghidra / radare2 for native libraries

WhatsApp Web WASM analysis

id: wasm-analysis
maturity: emerging
status: review
targets: signaling, keying, media, transport

WhatsApp Web ships its calling engine as an Emscripten-compiled WebAssembly module. Statically and iteratively reverse-engineer that module (parse, fingerprint, auto-identify library code, lift to pseudo-C, and diff across releases) to recover how 1:1 calls are signaled, keyed, and carried. The web client is a cleaner, more stable surface than the mobile app, and findings can compound across WhatsApp's frequent rebuilds instead of resetting each release.

Strengths

  • WASM has explicit function boundaries, structured control flow, and typed signatures, which makes it far more legible than Android smali or native ARM.
  • The JS-to-WASM boundary (imports/exports) makes the interface to WebCrypto, WebRTC, and the socket explicit and easy to enumerate.
  • Open runtime: Emscripten/musl/libc++ are open source, so 40-80% of the module can be auto-identified, concentrating analyst effort on app-specific call logic.
  • Compounding RE: annotations are keyed to stable function identity, so they survive vendor rebuilds and knowledge accumulates across WhatsApp releases.
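
To illustrate why the import/export boundary is easy to enumerate, here is a minimal standard-library Python sketch that walks the sections of a version-1 WebAssembly binary and lists its imports and exports. It is not how any particular tool does it, and it handles only the four core import kinds (func, table, memory, global); the tiny module at the end is hand-assembled for the example.

```python
def read_u32(buf: bytes, pos: int):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_name(buf: bytes, pos: int):
    """Decode a length-prefixed UTF-8 name; return (name, next position)."""
    length, pos = read_u32(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def skip_limits(buf: bytes, pos: int) -> int:
    """Skip a limits record: flag byte, minimum, and maximum if flagged."""
    flags = buf[pos]
    _, pos = read_u32(buf, pos + 1)
    if flags & 0x01:
        _, pos = read_u32(buf, pos)
    return pos


KINDS = {0: "func", 1: "table", 2: "memory", 3: "global"}


def boundary(module: bytes):
    """Return (imports, exports) found in the import (2) and export (7) sections."""
    if module[:8] != b"\x00asm\x01\x00\x00\x00":
        raise ValueError("not a version-1 WebAssembly binary")
    imports, exports = [], []
    pos = 8
    while pos < len(module):
        section_id = module[pos]
        size, body = read_u32(module, pos + 1)
        if section_id == 2:
            count, p = read_u32(module, body)
            for _ in range(count):
                mod, p = read_name(module, p)
                field, p = read_name(module, p)
                kind = module[p]
                p += 1
                if kind == 0:
                    _, p = read_u32(module, p)      # type index
                elif kind == 1:
                    p = skip_limits(module, p + 1)  # reference type, then limits
                elif kind == 2:
                    p = skip_limits(module, p)
                elif kind == 3:
                    p += 2                          # value type, mutability
                else:
                    raise ValueError(f"unsupported import kind {kind}")
                imports.append((mod, field, KINDS[kind]))
        elif section_id == 7:
            count, p = read_u32(module, body)
            for _ in range(count):
                name, p = read_name(module, p)
                kind = module[p]
                _, p = read_u32(module, p + 1)      # index of the exported item
                exports.append((name, KINDS.get(kind, "other")))
        pos = body + size  # jump to the next section regardless of its type
    return imports, exports


# Hand-assembled module: one function type, import env.f, export it again as g.
tiny = (b"\x00asm\x01\x00\x00\x00"
        + bytes([1, 4, 1, 0x60, 0, 0])
        + bytes([2, 9, 1, 3]) + b"env" + bytes([1]) + b"f" + bytes([0, 0])
        + bytes([7, 5, 1, 1]) + b"g" + bytes([0, 0]))
imports, exports = boundary(tiny)
```

On a real module the import list is where calls out to the JavaScript glue show up, which is why it makes a good starting map before reading any function bodies.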

Limitations

  • Static surface only: it does not observe live call packets or real-time SRTP media; pair with frida-hooking or websocket-capture for runtime values.
  • LTO inlining, custom allocators, and heavy obfuscation degrade automated library identification and structural matching.
  • Reveals intended logic: a finding still needs runtime corroboration from an independent technique before it can reach confirmed.
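
The idea of annotations surviving a rebuild can be illustrated with the simplest possible notion of function identity: an exact hash of the function body. This is a deliberately naive stand-in chosen for the example, not the matching method of any tool listed here; exact hashing misses any function whose bytes changed at all. The function indices, bodies, and notes are synthetic.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Identity of a function body: a truncated SHA-256 of its bytes."""
    return hashlib.sha256(body).hexdigest()[:16]


def carry_annotations(old_funcs: dict, old_notes: dict, new_funcs: dict):
    """Move annotations from an old build to a new one by body fingerprint.

    old_funcs / new_funcs map function index -> body bytes.
    old_notes maps old function index -> annotation text.
    Returns (carried, needs_review): annotations re-keyed to new indices,
    and the new indices that received no annotation.
    """
    notes_by_print = {}
    for index, body in old_funcs.items():
        if index in old_notes:
            notes_by_print.setdefault(fingerprint(body), old_notes[index])
    carried, needs_review = {}, []
    for index, body in new_funcs.items():
        note = notes_by_print.get(fingerprint(body))
        if note is None:
            needs_review.append(index)
        else:
            carried[index] = note
    return carried, needs_review


# Synthetic builds: indices shift between releases, one body is unchanged.
old_build = {10: b"\x20\x00\x0b", 11: b"\x41\x2a\x0b"}
old_notes = {10: "returns its first argument"}
new_build = {12: b"\x20\x00\x0b", 13: b"\x41\x2b\x0b"}
carried, needs_review = carry_annotations(old_build, old_notes, new_build)
```

The `needs_review` list is where analyst effort goes after a rebuild. Heavier inlining or obfuscation makes that list longer, which is the degradation described in the limitations above.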

Tooling

  • warden (warden-re): a living RE knowledge base for Emscripten WASM (parse, fingerprint, Emscripten Oracle identification, pseudo-C lift, cross-version diff)
  • Ghidra with the ghidra-wasm-plugin; WABT / wasm-tools; Emscripten / emsdk
  • Browser devtools to locate and pull the call WASM module and its JS glue

Maintainers: @purpshell

Guide: docs/techniques/wasm-analysis.md

WebSocket / WABinary capture

id: websocket-capture
maturity: established
status: stable
targets: signaling, keying

Capture and decode the binary WABinary nodes that flow over the Noise-encrypted WhatsApp multi-device WebSocket. Because call signaling rides the same socket as messaging, the entire call stanza family can be observed at the framing boundary on an account you control, without touching the native media engine.

Strengths

  • Cheapest, lowest-risk way to see the call stanza family end to end.
  • Observes the full node tree (tags, attributes, children) after Noise decryption.
  • Reveals the structure of the keying envelope (type pkmsg/msg, version) even though the ciphertext payload stays opaque.
  • Reproducible from a plain library session; no rooted device or app patching needed.
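
What "full node tree, opaque ciphertext" looks like in practice can be sketched as a small outline printer over an already-decoded node. The node shape (`tag`, `attrs`, `content`) and the tag name `key-envelope` are assumptions for this Python example, and the sample is synthetic rather than captured.

```python
def outline(node: dict, depth: int = 0) -> list:
    """Render a decoded node tree as indented lines.

    Tags and attributes are shown in full; byte payloads are reported
    only by length because their ciphertext is opaque at this layer.
    """
    attrs = " ".join(f"{key}={value}" for key, value in sorted(node.get("attrs", {}).items()))
    head = "  " * depth + node["tag"] + (f" [{attrs}]" if attrs else "")
    content = node.get("content")
    if isinstance(content, (bytes, bytearray)):
        return [f"{head} <{len(content)} opaque bytes>"]
    lines = [head]
    for child in content or []:
        lines.extend(outline(child, depth + 1))
    return lines


# Synthetic node tree for illustration only.
sample = {
    "tag": "call",
    "attrs": {"id": "synthetic-1"},
    "content": [{
        "tag": "offer",
        "attrs": {},
        "content": [{
            "tag": "key-envelope",  # hypothetical tag name
            "attrs": {"type": "pkmsg", "version": "2"},
            "content": b"\x00" * 48,
        }],
    }],
}
lines = outline(sample)
```

The outline exposes the envelope metadata (type and version) and the payload size, which is exactly the boundary of what this technique can observe.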

Limitations

  • Completely blind to the media plane (SRTP, RTP, codecs); that traffic never crosses the WebSocket.
  • Sees the envelope but not the plaintext call/media key inside it.
  • Only as complete as your client's own session; multi-device fan-out beyond your devices is inferred, not observed.

Tooling

  • Baileys (WhiskeySockets) for a scriptable multi-device session
  • A WABinary decoder (Baileys exposes one)
  • Frame logging at the Noise transport boundary

Maintainers: @purpshell

Guide: docs/techniques/websocket-capture.md
