How-to: WhatsApp Web WASM analysis
Maturity: emerging · Reveals: signaling, keying, media, transport (static) · Risk: low (no live traffic required)
WhatsApp Web implements its calling engine as an Emscripten-compiled WebAssembly module. This changes the problem: instead of fighting obfuscated Android smali or native ARM, you get a standardized bytecode with explicit function boundaries, typed signatures, structured control flow, and an explicit JavaScript boundary (imports/exports) into WebCrypto, WebRTC, and the socket. This is a useful surface for the under-served keying and media layers.
Scope of consent: analyze the module shipped to your own browser session on your own account. Static analysis touches no one else's data. Document the protocol's structure and behavior. Never commit recovered secret values or key material. See DISCLAIMER and SECURITY.
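The explicit import/export boundary described in the introduction can be enumerated straight from the binary, before any decompiler is involved. The sketch below is a minimal standard-library reader for the WebAssembly import and export sections. The function and helper names are illustrative and are not part of warden. It assumes single-byte reference types in table imports, which holds for typical Emscripten output but not for every proposal.

```python
def read_uleb(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_name(buf, pos):
    """Read a length-prefixed UTF-8 name; return (name, next_pos)."""
    length, pos = read_uleb(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def skip_limits(buf, pos):
    """Skip a limits record: flags byte, min, and max if flag bit 0 is set."""
    flags = buf[pos]
    _, pos = read_uleb(buf, pos + 1)
    if flags & 1:
        _, pos = read_uleb(buf, pos)
    return pos


KINDS = {0: "func", 1: "table", 2: "memory", 3: "global", 4: "tag"}


def js_boundary(wasm: bytes):
    """Return (imports, exports) as lists of tuples read from a wasm binary."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    imports, exports = [], []
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb(wasm, pos + 1)
        end = pos + size
        if section_id == 2:  # import section
            count, p = read_uleb(wasm, pos)
            for _ in range(count):
                module, p = read_name(wasm, p)
                field, p = read_name(wasm, p)
                kind = wasm[p]
                p += 1
                if kind == 0:            # func: type index
                    _, p = read_uleb(wasm, p)
                elif kind == 1:          # table: reftype byte, then limits
                    p = skip_limits(wasm, p + 1)
                elif kind == 2:          # memory: limits
                    p = skip_limits(wasm, p)
                elif kind == 3:          # global: value type byte + mutability byte
                    p += 2
                elif kind == 4:          # tag: attribute byte, then type index
                    _, p = read_uleb(wasm, p + 1)
                else:
                    raise ValueError(f"unknown import kind {kind}")
                imports.append((module, field, KINDS[kind]))
        elif section_id == 7:  # export section
            count, p = read_uleb(wasm, pos)
            for _ in range(count):
                name, p = read_name(wasm, p)
                kind = wasm[p]
                _, p = read_uleb(wasm, p + 1)
                exports.append((name, KINDS.get(kind, "unknown")))
        pos = end  # jump to the next section regardless of what was parsed
    return imports, exports
```

To use it on a saved module, pass the file's bytes, for example `js_boundary(open("module.wasm", "rb").read())`, where `module.wasm` is a placeholder filename. The import list is a quick first map of which host services the module can reach.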
Why WASM-on-web beats the mobile app
| Mobile app (smali / native) | WhatsApp Web (WASM) |
|---|---|
| Obfuscated DEX + native ARM; every release churns | Standardized bytecode; structure stays comparable across builds |
| Library code looks like everything else | Open-source Emscripten runtime → 40–80% auto-identifiable |
| Annotations die on each update | Annotations can be keyed to stable function identity and carried forward |
| Hard for tools/agents to reason about | Lifts cleanly to pseudo-C; agent-friendly |
It is not a total replacement: WASM static analysis does not see live call packets or real-time SRTP media. Pair it with Frida or WebSocket capture for runtime corroboration.
warden: the living RE knowledge base
warden (pip install warden-re) is built for
exactly this surface. It treats reverse engineering as a versioned knowledge base
rather than a one-shot decompile, so work compounds across WhatsApp's frequent rebuilds.
It does three main things:
- Emscripten Oracle: compiles labelled reference modules and fingerprints them, so it can auto-name the libc/musl/allocator/runtime functions that make up most of any Emscripten binary.
- Persistent, versioned symbol KB: every name/type/comment is keyed to a stable function identity (structural hash + type signature + import call targets), not a table index, so annotations survive a rebuild.
- Cross-version carry-over: diffs a new module against the last and carries annotations forward, surfacing only what genuinely changed.
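To make the idea of a stable function identity concrete, here is a minimal sketch of a key built from structure, type signature, and import call targets, plus a carry-forward step that uses it. This is an illustration of the concept only, not warden's actual algorithm or data model. The `Func` record, the `identity` and `carry_forward` names, and the `env.now` import are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    index: int               # table index; deliberately NOT part of the identity
    signature: str           # e.g. "(i32) -> i32"
    opcodes: tuple           # opcode mnemonics only, immediates stripped
    import_calls: frozenset  # names of the imports this function calls


def identity(f: Func) -> str:
    """Hash the parts of a function that should survive a rebuild."""
    h = hashlib.sha256()
    h.update(f.signature.encode())
    h.update(b"\0")
    h.update(" ".join(f.opcodes).encode())
    h.update(b"\0")
    h.update(",".join(sorted(f.import_calls)).encode())
    return h.hexdigest()[:16]


def carry_forward(old_notes: dict, new_funcs: list):
    """Map old annotations (keyed by identity) onto the new build's indices.

    Returns (carried, unmatched): annotations re-attached by new index, and
    the indices of functions that need a fresh look.
    """
    carried, unmatched = {}, []
    for f in new_funcs:
        key = identity(f)
        if key in old_notes:
            carried[f.index] = old_notes[key]
        else:
            unmatched.append(f.index)
    return carried, unmatched
```

Because the index is excluded, a function that merely moves in the table keeps its annotation, while a function whose body changes falls into `unmatched`. A real tool would also need a tie-break for structurally identical functions, which this sketch does not attempt.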
warden's KB uses a provenance and confidence economy (human > oracle > export/import > string-xref > diff-carry > agent) that mirrors wacrg's own confidence/provenance model, which makes exporting a warden finding into a wacrg fact a natural step.
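The provenance ordering quoted above implies a simple conflict rule: when two sources propose different names for the same function, the higher-ranked source wins. A minimal sketch of that rule follows. It assumes "export import" is a single tier, and the `resolve` helper and the candidate names are hypothetical rather than warden's API.

```python
# Highest-trust source first, following the ordering given in the text.
RANK = ["human", "oracle", "export/import", "string-xref", "diff-carry", "agent"]
PRIORITY = {source: i for i, source in enumerate(RANK)}


def resolve(candidates):
    """Pick the (name, provenance) pair whose provenance ranks highest.

    candidates: list of (name, provenance) tuples for one function.
    Ties keep the first candidate listed.
    """
    return min(candidates, key=lambda c: PRIORITY[c[1]])
```

For example, an Oracle match for `malloc` would override both an agent guess and a carried-over name for the same function, and a human annotation would override all three.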
Workflow
- Locate the module. In a logged-in WhatsApp Web session (your own account), use browser devtools to find the call-related `.wasm` request and its Emscripten JS glue. Save both.
- Ingest + fingerprint with warden:
- Auto-identify runtime code via the Oracle so analyst effort lands on app logic:
- Lift app functions to pseudo-C and read the call-setup / crypto / media paths: Look for calls across the JS boundary to WebCrypto (keying), to the media/RTP engine (SRTP), and to the socket (signaling) to anchor which functions own which layer.
- Diff across releases to track what WhatsApp changed without redoing prior work:
- Translate findings into spec facts. Express what you learned as protocol structure (e.g. "an N-byte secret from the keying envelope feeds the SRTP context via ") with technique `wasm-analysis`, honest confidence (usually `probable` for a single static read), and provenance referencing the module version. No key material in the repo.
Pitfalls
- Whole-program LTO can inline library functions into callers, lowering Oracle hit rate and hiding boundaries. Manual annotation is the fallback.
- Static reads reveal intended logic; a runtime technique (Frida, capture) is what promotes a `wasm-analysis` finding toward `confirmed`, and the two are independent corroboration.
- Pin the exact module version (and hash) in `provenance.sources` so others can fetch the same binary.
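Pinning the module and recording a finding against it can be sketched together. The example below hashes the saved binary and builds a fact record that carries the technique, a confidence level, and the pinned source. The record layout and field names other than `provenance.sources` are assumptions made for illustration, not the wacrg schema. The hex-blob check is likewise a hypothetical guard against pasting key material into a statement.

```python
import hashlib
import re

# A long run of hex digits in a prose statement is treated as suspect.
HEX_BLOB = re.compile(r"\b[0-9a-fA-F]{32,}\b")


def pin_module(wasm: bytes, version_label: str) -> dict:
    """Describe the exact binary analyzed so others can fetch the same one."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    return {
        "version": version_label,
        "sha256": hashlib.sha256(wasm).hexdigest(),
        "bytes": len(wasm),
    }


def make_fact(statement: str, source: dict, confidence: str = "probable") -> dict:
    """Build a fact record; refuse statements that look like they embed secrets."""
    if HEX_BLOB.search(statement):
        raise ValueError("statement appears to contain raw key material")
    return {
        "statement": statement,
        "technique": "wasm-analysis",
        "confidence": confidence,
        "provenance": {"sources": [source]},
    }
```

The default confidence is `probable`, matching the guidance for a single static read. Raising it would be a decision made after independent runtime corroboration, not something this helper infers.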
See also: encryption & keying, media / SRTP, methodology, and warden.