MLow range coder¶
Encodings - mlow-rangecoder
ENC-04 - status: review - audio
The Opus/CELT range entropy coder (RFC 6716 §4.1), reused verbatim by MLow: range-coded symbols packed from the buffer front, raw uniform bits from the back, sharing one byte buffer.
An MLow payload is one byte buffer carrying two interleaved bit streams from a single Opus/CELT range coder: range-coded symbols from the front (low offsets, ascending) and raw uniform bits from the back (high offsets, descending). A decoder MUST consume both streams from the same buffer with the algorithms below; the two cursors MUST NOT overlap in a well-formed payload.
All arithmetic is unsigned 32-bit with modular (wrapping) semantics where noted.
Constants¶
EC_SYM_BITS = 8 ; bits per renormalisation byte
EC_CODE_BITS = 32 ; range register width
EC_SYM_MAX = (1 << EC_SYM_BITS) - 1 ; 255
EC_CODE_TOP = 1 << (EC_CODE_BITS - 1) ; 0x80000000
EC_CODE_BOT = EC_CODE_TOP >> EC_SYM_BITS ; 0x00800000
EC_CODE_EXTRA = (EC_CODE_BITS - 2) % EC_SYM_BITS + 1 ; 7
EC_WINDOW_SIZE = 32 ; raw-bit window width
EC_UINT_BITS = 8 ; uint split threshold
EC_CODE_SHIFT = EC_CODE_BITS - EC_SYM_BITS - 1 ; 23
The helper ilog(x) MUST return floor(log2(x)) + 1 for x > 0 and 0 for
x == 0 (equivalently 32 - clz(x)). ec_mini(a, b) MUST return the smaller
of a and b.
Decoder¶
Initialisation (ec_dec_init)¶
offs = 0 ; front cursor (range bytes)
end_offs = 0 ; back cursor (raw bytes, counts from buffer end)
end_window = 0 ; nend_bits = 0
rng = 1 << EC_CODE_EXTRA ; 0x80
nbits_total = EC_CODE_BITS + 1
- ((EC_CODE_BITS - EC_CODE_EXTRA) / EC_SYM_BITS) * EC_SYM_BITS
rem = read_byte()
val = rng - 1 - (rem >> (EC_SYM_BITS - EC_CODE_EXTRA))
normalize()
read_byte() MUST return buf[offs] and advance offs while offs <
len(buf), and MUST return 0 once the front cursor reaches the end of the
buffer (reading past the end yields zero bytes, it is not an error).
Normalisation¶
After every symbol the decoder MUST renormalise so that rng > EC_CODE_BOT.
All val updates here MUST use wrapping 32-bit arithmetic.
while rng <= EC_CODE_BOT:
nbits_total += EC_SYM_BITS
rng <<= EC_SYM_BITS
sym0 = rem
rem = read_byte()
sym = (sym0 << EC_SYM_BITS | rem) >> (EC_SYM_BITS - EC_CODE_EXTRA)
val = ((val << EC_SYM_BITS) + (EC_SYM_MAX & ~sym)) & (EC_CODE_TOP - 1)
Decoding a symbol¶
decode(ft) MUST return a value in [0, ft):
if ft == 0: err = 1; ext = 1; return 0
ext = rng / ft
if ext == 0: err = 1; ext = 1; return 0
s = val / ext
return ft - ec_mini(s + 1, ft)
After the caller locates the symbol whose cumulative range is [fl, fh) out of
ft, it MUST call update(fl, fh, ft):
s = ext * (ft - fh) ; wrapping
val = val - s ; wrapping
if fl > 0: rng = ext * (fh - fl) ; wrapping
else: rng = rng - s ; wrapping
normalize()
CDF table decode¶
A symbol against a u16 cumulative CDF table cdf (length n >= 2) MUST be
decoded as follows. Base cdf[0] is subtracted from every entry; effective
total is cdf[n-1] - cdf[0]:
if n < 2: err = 1; return 0
base = cdf[0]
if cdf[n-1] <= base: err = 1; return 0
ft = cdf[n-1] - base
fs = decode(ft)
target = base + fs
k = 0
while k < n-1 and cdf[k+1] <= target: k += 1
update(cdf[k] - base, cdf[k+1] - base, ft)
return k
Inverse-CDF table decode¶
A symbol against a u8 inverse-CDF table icdf with ftb = log2(ft)
(RFC 6716 ec_dec_icdf) MUST be decoded as:
if icdf is empty: err = 1; return 0
r = rng >> ftb
ret = -1 ; s = rng
repeat:
t = s
ret += 1
s = r * icdf[ret]
until val >= s or ret == len(icdf) - 1
val = val - s
rng = t - s
normalize()
return ret
Binary / logp decode¶
A single bit with probability P(0) = 1 / 2^logp (ec_dec_bit_logp) MUST be
decoded as:
s = rng >> logp
ret = (val < s) ? 1 : 0
if ret == 0: val = val - s ; rng = rng - s
else: rng = s
normalize()
return ret
A uniform bits_n-bit symbol read directly off the range stream
(decode_raw_symbol) MUST be decoded by first computing
ext = rng >> bits_n (on ext == 0, set err = 1, ext = 1, return 0),
then s = val / ext, ft = 1 << bits_n, sym = ft - ec_mini(s + 1, ft), and
finally update(sym, sym + 1, ft), returning sym.
Uniform integer decode¶
An integer uniformly distributed in [0, ft0) for ft0 > 1
(ec_dec_uint) MUST be decoded as:
ft = ft0 - 1
ftb = ilog(ft)
if ftb > EC_UINT_BITS:
ftb -= EC_UINT_BITS
t = (ft >> ftb) + 1
s = decode(t) ; update(s, s+1, t)
v = (s << ftb) | bits_n(ftb)
if v <= ft: return v
err = 1 ; return ft
else:
s = decode(ft + 1) ; update(s, s+1, ft + 1)
return s
64-symbol fine-lag decode (decode_64_fine_sym)¶
ext = rng >> 6
if ext == 0: err = 1; ext = 1; return 0
s = val / ext
sym = clamp(63 - s, 0, 64)
update(sym, sym + 1, 64)
return sym
Raw bits from the back¶
Raw bits are pulled from the back, LSB-first, through a 32-bit window.
bits_n(n) MUST:
window = end_window ; available = nend_bits
if available < n:
repeat:
window |= read_byte_from_end() << available
available += EC_SYM_BITS
until available > EC_WINDOW_SIZE - EC_SYM_BITS
ret = window & ((1 << n) - 1)
window >>= n
available -= n
end_window = window ; nend_bits = available
nbits_total += n
return ret
read_byte_from_end() MUST return buf[storage - 1 - end_offs] (where
storage = len(buf)) and then increment end_offs, while end_offs <
storage; it MUST return 0 once exhausted.
Error state and tell¶
Any operation that detects a degenerate or exhausted input (zero ft, zero
ext, empty/invalid table) MUST set the sticky error flag err = 1. A
consumer SHOULD treat a set err as a hard decode failure rather than using
the synthesised values. ec_tell MUST be nbits_total - ilog(rng).
Encoder¶
The encoder is the exact inverse and produces a byte-identical buffer. It MUST
initialise: rng = EC_CODE_TOP, val = 0, rem = -1, offs = 0,
end_offs = 0, nbits_total = EC_CODE_BITS + 1, fixed-size output buffer.
Renormalisation and carry¶
normalize():
while rng <= EC_CODE_BOT:
carry_out(val >> EC_CODE_SHIFT)
val = (val << EC_SYM_BITS) & (EC_CODE_TOP - 1) ; wrapping
rng <<= EC_SYM_BITS
nbits_total += EC_SYM_BITS
carry_out(c):
if c != EC_SYM_MAX:
carry = c >> EC_SYM_BITS
if rem >= 0: write_byte(rem + carry)
while ext > 0:
write_byte((EC_SYM_MAX + carry) & EC_SYM_MAX)
ext -= 1
rem = c & EC_SYM_MAX
else:
ext += 1
write_byte(b) MUST store b at buf[offs] and advance offs while
offs + end_offs < storage, otherwise set err = -1. write_byte_at_end(b)
MUST store b at buf[storage - 1 - end_offs] and increment end_offs under
the same bound, otherwise set err = -1.
Encoding primitives¶
encode(fl, fh, ft) (the inverse of decode/update) MUST, with err = -1
on ft == 0:
r = rng / ft
if fl > 0:
val = val + (rng - r * (ft - fl)) ; wrapping
rng = r * (fh - fl) ; wrapping
else:
rng = rng - r * (ft - fh) ; wrapping
normalize()
The encoder MUST provide the matching inverses of every decode primitive:
bit_logp(val, logp), encode_icdf(s, icdf, ftb), encode_cdf(s, cdf)
(with the same cdf[0] base subtraction and validity checks as the decoder),
encode_uint(fl, ft0), encode_raw_symbol(sym, nbits) =
encode(sym, sym+1, 1 << nbits), and encode_64_fine_sym(sym) =
encode(sym, sym+1, 64). Raw bits are written toward the back with
bits_n(fl, n) through the same 32-bit window, flushing full bytes with
write_byte_at_end when the window would exceed EC_WINDOW_SIZE.
Flush¶
After all symbols are written the encoder MUST flush (ec_enc_done) to emit
the final range bytes, drain the back raw-bit window, and merge the two
streams into the single buffer:
l = EC_CODE_BITS - ilog(rng)
msk = (EC_CODE_TOP - 1) >> l
end = (val + msk) & ~msk
if (end | msk) >= val + rng:
l += 1 ; msk >>= 1 ; end = (val + msk) & ~msk
while l > 0:
carry_out(end >> EC_CODE_SHIFT)
end = (end << EC_SYM_BITS) & (EC_CODE_TOP - 1)
l -= EC_SYM_BITS
if rem >= 0 or ext > 0: carry_out(0)
; drain back window
while nend_bits >= EC_SYM_BITS:
write_byte_at_end(end_window & EC_SYM_MAX)
end_window >>= EC_SYM_BITS ; nend_bits -= EC_SYM_BITS
if err == 0:
zero-fill buf[offs .. storage - end_offs)
if remaining bits:
if end_offs >= storage - offs: err = -1
else: buf[storage - end_offs - 1] |= end_window
The finished payload is buf; the meaningful body length is offs + end_offs
bytes (front range bytes plus back raw-bit bytes), and the gap between the two
cursors MUST be zero-fill padding written by the flush. A successful flush MUST
have err == 0; a non-zero err means the buffer was too small and the output
MUST be discarded.
Notes. MLow's own decode path uses only decode_cdf/encode_cdf (u16 cumulative CDF),
the uniform nbits-bit raw symbol, and decode_64_fine_sym. The generic
decode_uint, decode_icdf, and bit_logp paths exist but are not on it.
Parent: mlow
Requires: mlow
Breakdown: mlow-decoder, mlow-encoder, mlow-excitation, mlow-frame, mlow-lsf-lpc
Implemented by
| Flavor | Status | Source | Notes |
|---|---|---|---|
whatsapp-rust |
working | history - blame - commits 674e851 |
full ec_dec + ec_enc port; round-trip vectors pass |
meowcaller |
partial | history - blame - commits b0fe93c e362783 8cbef54 bb84ff7 335a0ab |
encodings codec modules are partial |
Annotation wacrg:ENC-04 — a flavor marks its implementation site in source with this comment; a script clones the source, finds it, and attaches the commit blame/permalink.
Contributors
| Contributor | Role |
|---|---|
| wrote initial spec |
protocol history / diff - blame
Open questions - Which exact subset of primitives the MLow frame parser invokes, and in what order, is defined by the frame/excitation parts rather than the coder itself.
References - RFC 6716 — Definition of the Opus Audio Codec, §4.1 (Range Decoder) - libopus — celt/entdec.c / celt/entenc.c
Changelog¶
- 2026-06-21 — Initial spec entry.