SRTP key schedule
encryption-keying calls the SRTP key derivation "the
most speculative part of the whole spec." This page resolves its structure.
The call/media key delivered via Signal <enc> is expanded into SRTP master
keys, and those into SRTP/SRTCP session keys, in two layers.
Confidence: confirmed for the E2E SRTP derivation. Multiple independent paths now agree on it: static wasm-analysis of the binary (this page), a runtime WASM trace, and two independent reconstructions whose primitives are pinned to known-answer test vectors, zapo-caller (TypeScript) and whatsapp-rust (Rust). All derive byte-identical keys. The HBH two-stage schedule (below) is probable (recovered by the reconstructions; one technique class). Rekey policy stays speculative.

Provenance. Technique wasm-analysis · flavors zapo-caller, whatsapp-rust · contributors purpshell, jlucaso1, auties, sheiitear, edgard · sources: wacore/src/voip/e2e_srtp.rs:55, hbh_srtp.rs:51, the trace-verified Go reference, commit history. No key material is in this repo: structure only.
Layer 1: WAHKDF (call key to SRTP master)
masterKey(16) || masterSalt(14) || unused(16)
= HKDF-SHA256(IKM = callKey, salt = nil, info = participantLID, L = 46)
- IKM is the 32-byte call/media key (the secret unwrapped from the Signal <enc> offer; see encryption-keying).
- info is the participant's LID (the per-participant identifier), so each participant gets distinct master keying from the same call key.
- The first 16 bytes are the SRTP master key, the next 14 the master salt; the trailing 16 are unused.
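
The Layer 1 step can be sketched with the Python standard library alone. This is a minimal illustration, not the reconstructions' code: the function names are hypothetical, the LID string in the usage note is a placeholder, and passing the LID to HKDF as UTF-8 bytes is an assumption (the exact info encoding is listed under Open questions).

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    # RFC 5869: an absent salt is treated as HashLen (32) zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(n) = HMAC(PRK, T(n-1) || info || n)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_srtp_master(call_key: bytes, participant_lid: str) -> tuple[bytes, bytes]:
    """Return (masterKey, masterSalt) for one participant."""
    if len(call_key) != 32:
        raise ValueError("call key must be 32 bytes")
    # ASSUMPTION: the LID is fed to HKDF as its UTF-8 string bytes.
    okm = hkdf_sha256(call_key, b"", participant_lid.encode("utf-8"), 46)
    # Layout: masterKey(16) || masterSalt(14) || unused(16)
    return okm[:16], okm[16:30]
```

Calling `derive_srtp_master(call_key, "1234@lid")` with a 32-byte key yields a 16-byte master key and a 14-byte master salt; a different LID yields different output from the same call key, which is the per-participant separation described above.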
Static evidence (new): HKDF-SHA256 requires SHA-256, which is present and
named in the binary: sha256_transform_blocks (#1226, identified by the inline
round constants 0x428a2f98/0x71374491 and the full K[0..63] table at data
address 978512), with sha256_update (#1224) and sha256_finalize (#1223). A
runtime trace matched this HKDF step exactly.
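
The constant-based identification can be reproduced offline. The sketch below regenerates the SHA-256 round constants (the first 32 fractional bits of the cube roots of the first 64 primes) and searches a byte blob for the packed table. The helper names are hypothetical, and it assumes the table is stored as 64 contiguous little-endian 32-bit words; the offset it returns is relative to the blob you pass in, not necessarily a data address.

```python
import struct


def _icbrt(n: int) -> int:
    """Largest integer r with r**3 <= n (exact, no floating point)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def sha256_round_constants() -> list[int]:
    """K[0..63]: fractional bits of the cube roots of the first 64 primes."""
    primes, candidate = [], 2
    while len(primes) < 64:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    # floor(cbrt(p) * 2**32) == icbrt(p << 96); the low 32 bits are the fraction.
    return [_icbrt(p << 96) & 0xFFFFFFFF for p in primes]


def find_k_table(blob: bytes) -> int:
    """Offset of the packed K table inside blob, or -1 if absent."""
    # ASSUMPTION: contiguous little-endian u32 layout.
    needle = struct.pack("<64I", *sha256_round_constants())
    return blob.find(needle)
```

The first two generated values are 0x428a2f98 and 0x71374491, the inline constants named above.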
Layer 2: RFC 3711 KDF (master to session keys)
Standard SRTP key derivation (RFC 3711 section 4.3): an AES-128 counter-mode PRF keyed by the master key, with the master salt and a one-byte label building the IV.
rfc3711kdf(label, L):
    iv = masterSalt (14 bytes), zero-padded to 16
    iv[7] ^= label
    stream = AES-128-CM(masterKey, iv)   # counter in iv[14..15]
    return stream[:L]
| label | output | length | use |
|---|---|---|---|
| 0x00 | SRTP cipher key | 16 | media encryption |
| 0x01 | SRTP auth key | 20 | media auth (HMAC-SHA1) |
| 0x02 | SRTP salt | 14 | media IV salt |
| 0x03 | SRTCP cipher key | 16 | RTCP encryption |
| 0x04 | SRTCP auth key | 20 | RTCP auth |
| 0x05 | SRTCP salt | 14 | RTCP IV salt |
The 20-byte auth keys match HMAC-SHA1, and the 16-byte cipher keys and AES-CM PRF
match AES-128-CM, so the suite is consistent with AES_CM_128_HMAC_SHA1_80 (the
common SRTP profile). Key lengths alone do not distinguish the 80-bit tag from
the 32-bit variant.
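
The IV construction and counter handling of the Layer 2 KDF can be sketched as follows. The Python standard library has no AES, so the sketch takes the block cipher as a caller-supplied function: `encrypt_block` is a hypothetical parameter standing for "AES-128 encrypt one 16-byte block under masterKey", and the function and dictionary names are likewise illustrative. The labels and lengths are the ones in the table above.

```python
from typing import Callable

# (label, output length in bytes) for each session key.
SESSION_KEYS = {
    "srtp_cipher_key": (0x00, 16),
    "srtp_auth_key": (0x01, 20),
    "srtp_salt": (0x02, 14),
    "srtcp_cipher_key": (0x03, 16),
    "srtcp_auth_key": (0x04, 20),
    "srtcp_salt": (0x05, 14),
}


def kdf_iv(master_salt: bytes, label: int) -> bytearray:
    """Build the 16-byte counter-mode IV for one label."""
    if len(master_salt) != 14:
        raise ValueError("master salt must be 14 bytes")
    iv = bytearray(master_salt + b"\x00\x00")  # zero-pad 14 -> 16
    iv[7] ^= label                             # label lands on salt byte 7
    return iv


def rfc3711_kdf(encrypt_block: Callable[[bytes], bytes],
                master_salt: bytes, label: int, length: int) -> bytes:
    """AES-CM keystream over the IV; encrypt_block is assumed to be
    AES-128 single-block encryption keyed with masterKey."""
    iv = kdf_iv(master_salt, label)
    stream, counter = b"", 0
    while len(stream) < length:
        iv[14:16] = counter.to_bytes(2, "big")  # block counter in iv[14..15]
        stream += encrypt_block(bytes(iv))
        counter += 1
    return stream[:length]


def derive_session_keys(encrypt_block: Callable[[bytes], bytes],
                        master_salt: bytes) -> dict[str, bytes]:
    """Run the KDF once per label in SESSION_KEYS."""
    return {name: rfc3711_kdf(encrypt_block, master_salt, label, n)
            for name, (label, n) in SESSION_KEYS.items()}
```

With the master salt from RFC 3711 Appendix B.3 (0EC675AD498AFEEBB6960B3AABE6), `kdf_iv(salt, 0x01)` gives 0EC675AD498AFEEAB6960B3AABE60000, matching the AES-CM input listed there for the auth key. A 20-byte output consumes two keystream blocks, with counters 0 and 1.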
Static evidence (new): AES-128 counter mode is present and named:
aes_128_ctr_init (#430), configure_aes128_counter_mode (#459),
aes_ctr_init_context (#465), aes_ctr_cipher_stream_copy (#514). The
per-participant derivation is anchored by a log string in #11407:
"stored E2E key and derived SRTP/P2P for extension jid=%s".
Two SRTP layers: end-to-end and hop-by-hop
WhatsApp protects media twice (see media-srtp):
- End-to-end (E2E) SRTP uses the master derived above (Layer 1+2), so relays forward ciphertext they cannot read.
- Hop-by-hop (HBH) SRTP is keyed by a 30-byte key the relay supplies (<hbh_key>, base64 in the offer/accept), packed as masterKey(16) || masterSalt(14). Unlike E2E it does not start from the call key; instead the 30-byte relay key runs through a two-stage HKDF-SHA256 (wa_sfu_kdf) with directional string labels:
stage 1: srtcp_salt = HKDF-SHA256(salt = zeros(32), ikm = masterSalt,
info = "uplink hbh srtcp salt", L = 32)
stage 2: crypto_key = HKDF-SHA256(salt = srtcp_salt, ikm = masterKey,
info = "uplink hbh srtcp key", L = 30)
-> crypto_key(16) || crypto_salt(14)
with a matching "downlink hbh srtcp salt" / "downlink hbh srtcp key" pair for
the other direction. The binary references hbh_srtp_key / hbh_srtcp_key;
the labels and two-stage chain are recovered from the reconstructions
(wacore/src/voip/hbh_srtp.rs:51).
A frame is encrypted E2E first, then HBH on transmit; the reverse on receive.
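
The HBH two-stage chain described above can be sketched in the same style. This is an illustration under assumptions rather than the reconstructions' code: `derive_hbh` and `hkdf_sha256` are hypothetical names, the labels are assumed to be passed as plain ASCII bytes with no terminator, and since the HBH schedule is rated probable, the sketch inherits that confidence level.

```python
import base64
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_hbh(hbh_key_b64: str, direction: str = "uplink") -> tuple[bytes, bytes]:
    """Return (crypto_key, crypto_salt) for "uplink" or "downlink"."""
    relay_key = base64.b64decode(hbh_key_b64)
    if len(relay_key) != 30:
        raise ValueError("hbh_key must decode to 30 bytes")
    master_key, master_salt = relay_key[:16], relay_key[16:]

    # ASSUMPTION: labels are ASCII strings used verbatim as HKDF info.
    salt_info = f"{direction} hbh srtcp salt".encode("ascii")
    key_info = f"{direction} hbh srtcp key".encode("ascii")

    # Stage 1: expand the 14-byte master salt into a 32-byte HKDF salt.
    srtcp_salt = hkdf_sha256(master_salt, bytes(32), salt_info, 32)
    # Stage 2: derive 30 bytes from the master key under that salt.
    okm = hkdf_sha256(master_key, srtcp_salt, key_info, 30)
    return okm[:16], okm[16:]
```

Because the direction is part of both labels, `derive_hbh(key, "uplink")` and `derive_hbh(key, "downlink")` produce unrelated key/salt pairs from the same relay key.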
Relationship to SFrame
Distinct from SRTP, the binary also carries facebook::sframe /
wa::sframe, a per-frame media-encryption layer using AES-128-GCM (keyed by
a separate HKDF label, "e2e sframe key"). See
SFrame: per-frame media E2EE. Whether SFrame and the E2E
SRTP are one layer or two stacked layers is an open question.
Open questions
- The exact HKDF info beyond the participant LID (any fixed label/context prefix?), and the <encopt keygen> / <enc v> values that select it.
- Whether/when SRTP keys rotate during a long call.
- The SFrame key schedule and how it composes with E2E SRTP.
- Formalizing the runtime trace as a recorded capture to reach
confirmed.