Severity: ⚪ low • Category: correctness
Location: src/serde.rs : 99-110
What's wrong
String and bytes lengths are encoded as (bytes.len() as u32). On 64-bit, a len() >= 4 GiB truncates the stored length header, so deserialize would read the wrong number of bytes (and the encoded payload length no longer matches the header). In practice the value is rejected later by store_result's value_bytes.len() > self.max_value_size check, and the default max_value_size is 4096, so a >4 GiB value can never be stored. It only becomes reachable if a user sets max_value_size to a multi-GB value AND caches a >4 GiB str/bytes. Low severity given that, but the truncation happens in serde before any guard, so it is a latent correctness landmine if limits are ever relaxed or removed.
Trigger
max_value_size set to e.g. 5 GiB and a returned bytes/str object larger than 4 GiB; the u32 length header truncates and deserialize reads a wrong-length slice.
Suggested fix
Either reject bytes.len() > u32::MAX as usize explicitly in serialize_element (return Ok(false) to fall back to pickle), or widen the length field to u64. A defensive length check costs nothing and removes the landmine.
Adversarial verification note
The cited mechanism is real and confirmed in the actual code. src/serde.rs:99-101 (str) and 108-110 (bytes) encode the payload length as (bytes.len() as u32).to_le_bytes(), which truncates any length >= 2^32. On deserialize (src/serde.rs:177 for str, :190 for bytes), the truncated u32 is read back as the slice length, so a >4 GiB value would be read with a wrong (mod 2^32) length — a genuine round-trip corruption.
The reachability analysis in the finding is also accurate. With the default max_value_size=4096 (src/shared_store.rs:54), the guard at src/shared_store.rs:268 (value_bytes.len() > self.max_value_size) rejects any value over 4096 bytes long before the truncation matters, so the bug is unreachable in practice. It only becomes reachable if a user raises max_value_size to multi-GB AND caches a >4 GiB str/bytes.
One nuance: the finding actually slightly understates the consequence. self.max_value_size is stored as the un-truncated usize (src/shared_store.rs:62,99), so the guard at :268 compares against the real multi-GB limit and would pass a >4 GiB value. That same value then flows into shm insert_inner, where the slot was sized from max_value_size as u32 (src/shared_store.rs:79 -> shm/mod.rs:57, truncated) while slot.value_len = value_bytes.len() as u32 (shm/mod.rs:399) also truncates and copy_nonoverlapping(..., value_bytes.len()) copies the full >4 GiB into the truncated slot — a shared-memory buffer overflow, not just a wrong-length read. So the underlying landmine is more serious than the serde-only framing. But it shares the exact same gate (a user must deliberately set a multi-GB max_value_size and produce a >4 GiB cached value), so the practical severity remains low: not reachable from the safe API with default or any sane configuration. Confirming the finding as REAL at low severity.
Filed from a multi-agent code review (finder → adversarial verification → synthesis). Confirmed real after a skeptic re-read the code.
Severity: ⚪ low • Category: correctness
Location:
src/serde.rs: 99-110What's wrong
String and bytes lengths are encoded as
(bytes.len() as u32). On 64-bit, alen()>= 4 GiB truncates the stored length header, so deserialize would read the wrong number of bytes (and the encoded payload length no longer matches the header). In practice the value is rejected later bystore_result'svalue_bytes.len() > self.max_value_sizecheck, and the default max_value_size is 4096, so a >4 GiB value can never be stored. It only becomes reachable if a user sets max_value_size to a multi-GB value AND caches a >4 GiB str/bytes. Low severity given that, but the truncation happens in serde before any guard, so it is a latent correctness landmine if limits are ever relaxed or removed.Trigger
max_value_size set to e.g. 5 GiB and a returned bytes/str object larger than 4 GiB; the u32 length header truncates and deserialize reads a wrong-length slice.
Suggested fix
Either reject
bytes.len() > u32::MAX as usizeexplicitly in serialize_element (return Ok(false) to fall back to pickle), or widen the length field to u64. A defensive length check costs nothing and removes the landmine.Adversarial verification note
The cited mechanism is real and confirmed in the actual code. src/serde.rs:99-101 (str) and 108-110 (bytes) encode the payload length as
(bytes.len() as u32).to_le_bytes(), which truncates any length >= 2^32. On deserialize (src/serde.rs:177 for str, :190 for bytes), the truncated u32 is read back as the slice length, so a >4 GiB value would be read with a wrong (mod 2^32) length — a genuine round-trip corruption.The reachability analysis in the finding is also accurate. With the default
max_value_size=4096(src/shared_store.rs:54), the guard at src/shared_store.rs:268 (value_bytes.len() > self.max_value_size) rejects any value over 4096 bytes long before the truncation matters, so the bug is unreachable in practice. It only becomes reachable if a user raisesmax_value_sizeto multi-GB AND caches a >4 GiB str/bytes.One nuance: the finding actually slightly understates the consequence.
self.max_value_sizeis stored as the un-truncatedusize(src/shared_store.rs:62,99), so the guard at :268 compares against the real multi-GB limit and would pass a >4 GiB value. That same value then flows into shminsert_inner, where the slot was sized frommax_value_size as u32(src/shared_store.rs:79 -> shm/mod.rs:57, truncated) whileslot.value_len = value_bytes.len() as u32(shm/mod.rs:399) also truncates andcopy_nonoverlapping(..., value_bytes.len())copies the full >4 GiB into the truncated slot — a shared-memory buffer overflow, not just a wrong-length read. So the underlying landmine is more serious than the serde-only framing. But it shares the exact same gate (a user must deliberately set a multi-GB max_value_size and produce a >4 GiB cached value), so the practical severity remains low: not reachable from the safe API with default or any sane configuration. Confirming the finding as REAL at low severity.Filed from a multi-agent code review (finder → adversarial verification → synthesis). Confirmed real after a skeptic re-read the code.