Severity: ⚪ low • Category: correctness
Location: src/shm/region.rs : 176-201
What's wrong
create_or_open re-uses an existing file only when version, capacity, max_key_size and max_value_size match. It does not check that slot_size or ht_capacity in the on-disk header match what this process would compute, and all subsequent pointer arithmetic trusts the header's own capacity/ht_capacity fields (slab_offset uses header.ht_capacity). If two processes are built from revisions where SLOT_HEADER_SIZE, HEADER_SIZE or Bucket::SIZE differ but VERSION was not bumped, the compile-time size_of asserts pass in each build, yet the in-memory offsets diverge and process B indexes the slab using process A's geometry. Because slot_size is fully derived from max_key_size + max_value_size, which are validated, this is mostly mitigated today. However, ht_capacity is recomputed in create() and never cross-checked on open, and the HEADER_SIZE/SLOT_HEADER_SIZE constants are not encoded in the file beyond the version byte.
Trigger
Two processes built from source revisions with different SLOT_HEADER_SIZE/HEADER_SIZE/Bucket layout but the same VERSION constant open the same {name}.data file; pointer arithmetic uses mismatched geometry -> silent corruption.
Suggested fix
Store HEADER_SIZE, SLOT_HEADER_SIZE, Bucket::SIZE (or a single layout hash) in the header and validate them on open; bump/extend VERSION whenever any layout constant changes, and assert header.slot_size == expected and header.ht_capacity == next_power_of_two(capacity*2) on open.
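A minimal sketch of what the open-time check could look like, assuming the header is extended with the three layout constants. LayoutStamp, HeaderView, GeometryError and validate_geometry are hypothetical names, the constant values are placeholders rather than the real ones from layout.rs, and u64 is an assumed field width.

```rust
// Placeholder values for illustration; the real constants live in layout.rs.
pub const HEADER_SIZE: u64 = 128;
pub const SLOT_HEADER_SIZE: u64 = 16;
pub const BUCKET_SIZE: u64 = 8; // stands in for Bucket::SIZE

/// Compiled layout constants, meant to be written into the header on create.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LayoutStamp {
    pub header_size: u64,
    pub slot_header_size: u64,
    pub bucket_size: u64,
}

impl LayoutStamp {
    /// The stamp this binary was compiled with.
    pub const fn compiled() -> Self {
        LayoutStamp {
            header_size: HEADER_SIZE,
            slot_header_size: SLOT_HEADER_SIZE,
            bucket_size: BUCKET_SIZE,
        }
    }
}

/// Assumed subset of the on-disk header; field names mirror the finding.
#[derive(Debug, Clone, Copy)]
pub struct HeaderView {
    pub stamp: LayoutStamp,
    pub capacity: u64,
    pub max_key_size: u64,
    pub max_value_size: u64,
    pub slot_size: u64,
    pub ht_capacity: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GeometryError {
    LayoutMismatch,
    SlotSizeMismatch { on_disk: u64, expected: u64 },
    HtCapacityMismatch { on_disk: u64, expected: u64 },
}

/// Rejects a file whose stored geometry disagrees with this build.
pub fn validate_geometry(h: &HeaderView) -> Result<(), GeometryError> {
    // 1. The layout constants baked into the writer must equal ours.
    if h.stamp != LayoutStamp::compiled() {
        return Err(GeometryError::LayoutMismatch);
    }
    // 2. slot_size must equal what this build would compute.
    let expected_slot = SLOT_HEADER_SIZE + h.max_key_size + h.max_value_size;
    if h.slot_size != expected_slot {
        return Err(GeometryError::SlotSizeMismatch { on_disk: h.slot_size, expected: expected_slot });
    }
    // 3. ht_capacity must equal the derivation used in create().
    let expected_ht = (h.capacity * 2).next_power_of_two();
    if h.ht_capacity != expected_ht {
        return Err(GeometryError::HtCapacityMismatch { on_disk: h.ht_capacity, expected: expected_ht });
    }
    Ok(())
}
```

Comparing the explicit constants rather than a hash keeps the error actionable, since a mismatch can report which constant differs. A single layout hash would be more compact but is opaque when it fails.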
Adversarial verification note
Confirmed in the actual code. On open, create_or_open validates only version, capacity, max_key_size, max_value_size (region.rs:177-181). The compiled layout constants HEADER_SIZE, SLOT_HEADER_SIZE, and Bucket::SIZE are never encoded in the file beyond the single VERSION byte (header struct, layout.rs:28-52, stores slot_size/ht_capacity as data but not these constants; MAGIC is fixed "FCACHE01"). The corruption mechanism is genuine and a bit sharper than the finding states: slot stride uses header.slot_size (mod.rs:227,265,286,321) while intra-slot key/value access uses the compiled SLOT_HEADER_SIZE constant (mod.rs:189,200,295,342). slot_size is computed as SLOT_HEADER_SIZE + max_key_size + max_value_size (mod.rs:57), so if a build changes SLOT_HEADER_SIZE without bumping VERSION, the additive term diverges: process B reuses A's file (all validated fields match), reads stride from the header but applies its own SLOT_HEADER_SIZE offset to locate key/value bytes -> silent misread/corruption. Likewise ht_offset()/slab_offset() depend on HEADER_SIZE (layout.rs:104-111), so a HEADER_SIZE change without VERSION bump misaligns the whole region. So the claim that these constants are unvalidated and trusted via header.ht_capacity/slot_size in pointer arithmetic is accurate.
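The stride/offset mismatch described above can be illustrated with a small arithmetic sketch. It assumes the key bytes immediately follow the slot header inside each slot; key_offset and key_skew are hypothetical helpers, and the slab offset and sizes are made-up numbers, not values from the code base.

```rust
/// Byte offset of the key in slot `index`, assuming the key directly
/// follows the slot header. `slot_size` is the stride read from the file
/// header; `slot_header_size` is the constant compiled into the binary.
pub fn key_offset(slab_offset: u64, slot_size: u64, slot_header_size: u64, index: u64) -> u64 {
    slab_offset + index * slot_size + slot_header_size
}

/// Difference in bytes between where the reader looks for a key and where
/// the writer placed it, when both share the writer's on-disk stride.
pub fn key_skew(
    writer_slot_header: u64,
    reader_slot_header: u64,
    max_key_size: u64,
    max_value_size: u64,
    index: u64,
) -> i64 {
    let slab_offset: u64 = 4096; // hypothetical start of the slab
    // The writer stores this stride in the header; the reader trusts it.
    let slot_size = writer_slot_header + max_key_size + max_value_size;
    let written = key_offset(slab_offset, slot_size, writer_slot_header, index);
    let read = key_offset(slab_offset, slot_size, reader_slot_header, index);
    read as i64 - written as i64
}
```

Under these assumptions, a writer built with a 16-byte slot header and a reader built with a 24-byte one agree on every slot boundary but disagree by 8 bytes on where the key starts in each slot. The reader's key and value ranges also extend 8 bytes past the stride into the next slot.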
Correction to the finding's emphasis: it frames ht_capacity as the main residual risk, but ht_capacity is purely derived from the already-validated capacity via (capacity*2).next_power_of_two() (region.rs:47), so it cannot diverge unless the derivation formula itself changes; the real residual risk is the additive SLOT_HEADER_SIZE/HEADER_SIZE constants baked into the binary, which the finding lists as 'mostly mitigated' but are in fact the genuine hole.
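To make the determinism argument concrete, here is the derivation quoted above as a standalone function. derive_ht_capacity is a hypothetical name and u64 is an assumed type for capacity.

```rust
/// Hash-table capacity as derived in create(): twice the slot capacity,
/// rounded up to the next power of two. For a given `capacity` the result
/// is fixed, so two builds can disagree only if this formula changes.
pub fn derive_ht_capacity(capacity: u64) -> u64 {
    (capacity * 2).next_power_of_two()
}
```

For example, a capacity of 1000 gives 2048, and a capacity of 1025 gives 4096.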
Severity: low is correct. This is a defense-in-depth/robustness gap, not a bug reachable from the safe Python API with any single coherent build. With one build, all constants match and ht_capacity is deterministic, so no corruption occurs. Triggering it requires a developer mistake (mutating a layout constant without bumping VERSION) combined with two differently-built processes sharing the same {name}.data file. The VERSION constant exists precisely to guard this (documented at layout.rs:9-11), making it an operational-discipline issue. The finding itself rates it low and acknowledges it is mostly mitigated; that is accurate.
Filed from a multi-agent code review (finder → adversarial verification → synthesis). Confirmed real after a skeptic re-read the code.