Signaling: the <call> node family¶
Call control is expressed as WABinary nodes under a top-level <call> tag,
routed by the WhatsApp server between caller and callee devices over the
Noise-encrypted WebSocket. This page is the human-readable
overview. The precise, machine-generated definitions (attribute tables,
confidence levels, provenance, and examples) live in the
stanza catalog, and the end-to-end sequences live in the
flow catalog.
Everything here is a starting model. Each attribute and child in the generated pages carries its own
confidenceand, where we are unsure, anopen_questionsentry. Treat unhedged prose below asprobableunless noted.
The <call> envelope¶
A <call> node is the routing envelope. Its core attributes identify the
parties and (via the child stanza) the call:
from/to: JIDs of the sending and receiving devices.id: the stanza id used for server acknowledgement.
The child of <call> determines what the stanza means. Exactly one of the
control stanzas below appears as the child (an offer, an accept, a terminate, and
so on), most carrying a call-id that correlates every stanza belonging to the
same call, and often a call-creator naming the initiating device.
The control stanzas¶
| Child | Meaning | Direction (typical) |
|---|---|---|
<offer> |
Start a call; advertise media + deliver the media key | outgoing → callee |
<preaccept> |
Early/ringing acknowledgement | callee → caller |
<accept> |
Callee accepts the call | callee → caller |
<reject> |
Callee declines | callee → caller |
<terminate> |
End the call, with a reason |
either direction |
<transport> |
ICE/relay transport updates | either direction |
<relaylatency> |
Relay latency probe results | either direction |
| mute / video-toggle | In-call state updates | either direction |
The server also emits <ack> / receipt nodes to acknowledge call stanzas; these
are part of the shared messaging infrastructure rather than the <call> family
itself.
The offer in detail¶
The <offer> is the richest stanza. Inside
it you typically find:
- Media descriptors:
<audio enc rate>and, for video calls,<video>, describing the proposed codec(s) and parameters. <net>: network information hints (e.g. connection medium).<capability ver>: device call capabilities, usually a binary blob.<encopt keygen>: encryption and key-generation options.- One or more
<enc v type>: Signal-protocol ciphertext delivering the call/media key, one per peer device.type="pkmsg"establishes a new session;type="msg"uses an existing one. See encryption & keying. <destination>: transport endpoints (<te>/<endpoint>): relay candidates with latency hints, feeding ICE/relay selection.
A synthetic, sanitized offer is shown in
corpus/captures/example-sanitized-offer.yaml
and rendered on the generated offer stanza page.
Termination reasons¶
<terminate> (and <reject>) carry a reason. Known/likely values include
timeout, busy, declined, and connection_lost. The authoritative list is
the generated enums page.
How stanzas compose into flows¶
Individual stanzas only tell half the story; what matters is the sequence. A flow ties stanzas together into a realistic exchange between caller, server, and callee. For example, the outgoing-audio flow is roughly:
Each step in a flow references a stanza id, so the
generated flow pages render a Mermaid sequence diagram and
link every step back to its stanza definition. This is the most useful entry
point if you are trying to understand the order of operations rather than a
single node.
See also:
- Architecture for the two-plane picture and a lifecycle diagram.
- Transport over Noise for what carries these stanzas.
- Encryption & keying for what the
<enc>nodes do.
Open questions¶
- The full set of in-call state updates (mute, hold, video toggle, camera switch) and their exact node shapes.
- Whether
<preaccept>is always present or conditional on callee state. - Server-side vs. peer-side origination of
<transport>and<relaylatency>updates.
These are tracked as open_questions on the relevant
stanza entries.
For the WASM-side view of the same machinery, covering the engine's own offer/accept/
rekey handlers, the call state machine, and the group/call-link surface, recovered
by wasm-analysis as an independent second technique, see
call signaling from the WASM.