ITER-0002: codec comparison benchmark + tradeoff guidance #13

Open
nnunley wants to merge 9 commits into forest-rs:main from nnunley:codec-benchmark

Conversation

Collaborator

@nnunley nnunley commented May 30, 2026

Summary

The measurement half of ITER-0002: a Criterion benchmark comparing the two postings codecs, plus tradeoff guidance grounded in the results.

Stacked on #12 (postings-codecs), which is stacked on #11. Until the ancestors merge, this PR's diff against main includes their commits; review only the benchmark + leit_index accessor + tradeoff doc here.

What's in it

  • crates/leit_wind_tunnel_index/benches/codec_compare.rs — Criterion benchmark measuring encode time, decode time, and compressed size for DeltaVarintCodec vs BlockDeltaCodec over the deterministic wind-tunnel corpus (1K + 10K docs, multi-field title+body, Zipfian), with a lossless sanity gate so a broken codec fails the bench.
  • leit_index: PostingEntry made public + InMemoryIndex::postings_by_term() accessor — the minimal surface to extract doc-sorted (SegmentLocalDocId, TermFreq) postings (the lowering stands in for the ITER-0004 segment-write boundary).
  • docs/2026-05-30-codec-tradeoffs.md — decode-cost vs memory guidance.
  • Criterion stays out of every primary crate (added only to leit_wind_tunnel_index dev-deps); the no-Criterion sentinels remain green.
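
The lossless sanity gate described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in codec_compare.rs: the `Codec` trait shape, the `RawCodec` stand-in, and the `lossless_gate` helper are all hypothetical names, and plain `(u32, u32)` pairs stand in for `(SegmentLocalDocId, TermFreq)`.

```rust
/// Hypothetical stand-in for the codec trait; the real signature may differ.
trait Codec {
    fn encode(&self, postings: &[(u32, u32)], out: &mut Vec<u8>);
    fn decode(&self, bytes: &[u8], out: &mut Vec<(u32, u32)>);
}

/// Trivial fixed-width codec used only to exercise the gate:
/// 4 little-endian bytes of doc id followed by 4 of term frequency.
struct RawCodec;

impl Codec for RawCodec {
    fn encode(&self, postings: &[(u32, u32)], out: &mut Vec<u8>) {
        for &(doc, tf) in postings {
            out.extend_from_slice(&doc.to_le_bytes());
            out.extend_from_slice(&tf.to_le_bytes());
        }
    }

    fn decode(&self, bytes: &[u8], out: &mut Vec<(u32, u32)>) {
        for chunk in bytes.chunks_exact(8) {
            let doc = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
            let tf = u32::from_le_bytes([chunk[4], chunk[5], chunk[6], chunk[7]]);
            out.push((doc, tf));
        }
    }
}

/// Encodes, decodes, and compares against the input. Panics if the
/// round trip is lossy, so a broken codec fails before any timing is
/// reported. Returns the encoded size in bytes.
fn lossless_gate<C: Codec>(codec: &C, postings: &[(u32, u32)]) -> usize {
    let mut bytes = Vec::new();
    codec.encode(postings, &mut bytes);

    let mut decoded = Vec::with_capacity(postings.len());
    codec.decode(&bytes, &mut decoded);

    assert_eq!(decoded.as_slice(), postings, "codec round trip was lossy");
    bytes.len()
}
```

Running the gate once per codec and corpus size, outside the timed region, keeps the correctness check from distorting the encode and decode measurements.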

Results (baseline)

| Codec | Compressed size | Decode |
|---|---|---|
| DeltaVarint | ~25.4–25.6% of the 8-byte/posting baseline (~2.03–2.05 B/posting) | fastest |
| BlockDelta | ~26.3–27.4% (~2.10–2.19 B/posting) | ~4–11% slower (per-block header) |
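
The two size figures in each row are the same measurement in different units: bytes per posting divided by the 8-byte baseline. A small sketch of that arithmetic, with hypothetical helper names and an illustrative input rather than a measured one:

```rust
/// Uncompressed cost per posting used as the baseline in the results above.
const BASELINE_BYTES_PER_POSTING: f64 = 8.0;

/// Average encoded bytes per posting.
fn bytes_per_posting(encoded_len: usize, postings: usize) -> f64 {
    encoded_len as f64 / postings as f64
}

/// Encoded size as a percentage of the 8-byte/posting baseline.
fn percent_of_baseline(encoded_len: usize, postings: usize) -> f64 {
    100.0 * bytes_per_posting(encoded_len, postings) / BASELINE_BYTES_PER_POSTING
}
```

For example, 2.03 B/posting gives 100 × 2.03 / 8 ≈ 25.4%, and 2.19 B/posting gives ≈ 27.4%, matching the ends of the reported ranges.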

Guidance: DeltaVarint for v1 simplicity/speed; BlockDelta when the per-block header earns its keep via selective/skip decode (ITER-0003) and Phase-3 WAND.
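
To illustrate why a per-block header can pay for itself under selective decode, here is a hedged sketch of seeking by header alone. The `BlockHeader` fields and `seek_block` are assumptions made for illustration; the actual block layout and the ITER-0003 cursor API are not shown in this PR description.

```rust
/// Hypothetical per-block header carrying the doc range of one block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockHeader {
    first_doc: u32,
    last_doc: u32,
    /// Byte offset of the block payload inside the encoded list.
    offset: usize,
}

/// Returns the index of the first block that can contain a doc id
/// greater than or equal to `target`, or `None` if every block ends
/// before it. Only headers are inspected; no payload is decoded.
fn seek_block(headers: &[BlockHeader], target: u32) -> Option<usize> {
    // Headers are sorted by doc range, so binary search applies.
    let idx = headers.partition_point(|h| h.last_doc < target);
    (idx < headers.len()).then_some(idx)
}
```

A caller advancing to a target doc would decode only the block returned here, which is the access pattern a pure delta stream cannot offer without decoding from the start of the list.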

Run: cargo bench -p leit_wind_tunnel_index --bench codec_compare

🤖 Generated with Claude Code

nnunley and others added 9 commits May 29, 2026 19:47
…ipfian vocab, query fixtures) using rapidhash::v1
…paths)

Completes ITER-0000 walking skeleton (T6-T9) atop the leit_wind_tunnel
harness:

- leit_wind_tunnel_index: index_build/{1k,10k} indexing-throughput benches
- leit_wind_tunnel_query: five execution paths (single/OR/AND/fielded +
  BM25F cross-field) x {1k,10k}, index built once outside the timed region,
  ExecutionWorkspace reused across iterations
- Criterion isolated to the two bench crates (dev-dependencies only);
  primary crates and leit_benchmark untouched
- CI: exclude the three wind-tunnel crates from the no_std/wasm jobs
  (std-only, mirroring leit_benchmark); no cargo bench step added
- harness docs: note the relationship to leit_benchmark (smoke test vs
  performance lab)
… (STORY-0096)

ITER-0001 dependency hygiene per the usage-site rule: the leit_wind_tunnel
harness uses only rapidhash in its library surface; leit_core/leit_index/
leit_text are used solely by its #[cfg(test)] integration tests, so they
move to [dev-dependencies] and no longer appear in the harness's production
dependency graph. The bench crates were already correct (empty lib; all deps
dev). Library build, 17 unit tests, and both bench crates verified green.
… (STORY-0112)

ITER-0001: BlockId, FilterExprId, SegmentOrd, SegmentLocalDocId in leit_core,
each a #[repr(transparent)] newtype over a [u8; 4] little-endian inner deriving
bytemuck Pod/Zeroable. The on-disk form is the in-memory form: a &[u8] slice
from an mmap'd buffer casts in place to &[Id] with no allocation or
deserialization (zero-copy), stable across host endianness; ordering is numeric.
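
A std-only sketch of the idea, for readers without the crate at hand. The commit uses bytemuck derives; the hand-written `unsafe` cast and the `new`/`get`/`cast_ids` names below are illustrative assumptions, not the leit_core API.

```rust
use std::cmp::Ordering;

/// Newtype over 4 little-endian bytes: size 4, alignment 1, and every
/// bit pattern is a valid value.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SegmentLocalDocId([u8; 4]);

impl SegmentLocalDocId {
    fn new(value: u32) -> Self {
        Self(value.to_le_bytes())
    }

    fn get(self) -> u32 {
        u32::from_le_bytes(self.0)
    }
}

// Ordering compares the decoded integers, not the raw bytes, so it is
// numeric even though the storage is little-endian.
impl Ord for SegmentLocalDocId {
    fn cmp(&self, other: &Self) -> Ordering {
        self.get().cmp(&other.get())
    }
}

impl PartialOrd for SegmentLocalDocId {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Views a byte buffer as a slice of ids without copying. Returns
/// `None` when the length is not a multiple of 4.
fn cast_ids(bytes: &[u8]) -> Option<&[SegmentLocalDocId]> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    // SAFETY: the type is repr(transparent) over [u8; 4], so it has
    // alignment 1 and no invalid bit patterns, and the length was
    // checked above. The returned slice borrows from `bytes`.
    let ids = unsafe {
        std::slice::from_raw_parts(bytes.as_ptr().cast::<SegmentLocalDocId>(), bytes.len() / 4)
    };
    Some(ids)
}
```

Because the inner type is a byte array, the cast works at any buffer offset, which is what makes an unaligned round trip possible.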

bytemuck chosen over zerocopy because zerocopy's derives emit internal
#[allow(non_ascii_idents)]/#[allow(non_local_definitions)] that conflict with the
workspace's forbid-level Linebender lints (E0453); bytemuck is no_std and
lint-clean under the same forbid set. Proven by SCENARIO-0005 (6 unit tests:
value + slice + unaligned round-trip, numeric ordering, LE byte layout).

Records the design-decidable decisions for the Phase 2 segment format
(DEC-01..10) with rationale, a Phase 3 forward-compatibility audit, and
decision->enforcement traceability. Human-confirmed key calls:

- DEC-01 segment offsets: u64 (no size cap; removes the only Phase 3
  format-migration risk)
- DEC-10 integrity: single footer checksum, verified in Full validation mode
- DEC-06 block-aware API: public dedicated BlockCursor trait (Phase 3 WAND
  consumes it without a format/API break)
- DEC-05 header: fixed-layout little-endian POD, absolute u64 section offsets,
  magic + version + format_flags, reserved stored-fields/columnar slots

Decision-documentation ACs of STORY-0078/0081-0084/0090/0043-0047 are satisfied
here (decided:ITER-0001); their code-enforcement ACs are deferred to
ITER-0003/0004. Forward constraint recorded for ITER-0005: block-metadata schema
must carry per-block max_score + doc-range for Phase 3 WAND/MaxScore.
…ORY-0112 AC-2)

ITER-0001 audit corrective: SCENARIO-0005 now also exercises try_from_bytes/
try_cast_slice (Ok on well-formed, Err on malformed) per AC-2's validated-read
obligation.
…elta) [ITER-0002]

Codec layer for ITER-0002. A Codec trait with two implementations over a stable v1
block format, plus the layout decisions (DEC-11 fixed 128-doc blocks, DEC-12 layout)
and a new TermFreq segment-resident type.

- DeltaVarint (CodecId 0) + BlockDelta (CodecId 1, 128-doc independently-decodable
  blocks with validated first/last-doc header range).
- Hand-rolled LEB128 varint into a type-enforced [u8;5]; no_std + alloc; no new deps.
- API speaks named segment-resident types SegmentLocalDocId + TermFreq (no anonymous
  u32 drift); EntityId stays the in-memory abstraction, lowered at the segment boundary.
- Decode into caller-provided &mut Vec<..> — scratch-ownership-agnostic (TODO(ITER-0003)
  / STORY-0079). Doc-sorted precondition enforced via checked_sub (deterministic panic).
- CodecId marker per list; segment-format reservation deferred:ITER-0004.
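
A minimal sketch of the delta + LEB128 scheme the bullets describe, assuming plain `u32` values in place of `SegmentLocalDocId`/`TermFreq` and assuming the first doc id is stored as a delta from zero. Function names are hypothetical and the byte layout is not claimed to match the v1 format.

```rust
/// LEB128-encodes a u32 into a fixed 5-byte buffer (the maximum a
/// u32 can need). Returns the buffer and the number of bytes used.
fn encode_varint(mut value: u32) -> ([u8; 5], usize) {
    let mut buf = [0u8; 5];
    let mut len = 0;
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            buf[len] = byte; // final byte: continuation bit clear
            return (buf, len + 1);
        }
        buf[len] = byte | 0x80; // more bytes follow
        len += 1;
    }
}

/// Decodes one varint. Returns the value and bytes consumed, or
/// `None` on truncated or overlong input.
fn decode_varint(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut value = 0u32;
    for (i, &byte) in bytes.iter().take(5).enumerate() {
        let payload = u32::from(byte & 0x7f);
        // The fifth byte may carry only the top 4 bits of a u32.
        if i == 4 && payload > 0x0f {
            return None;
        }
        value |= payload << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Encodes doc-sorted (doc, tf) pairs as (doc delta, tf) varints.
/// Panics deterministically if the input is not doc-sorted.
fn encode_postings(postings: &[(u32, u32)], out: &mut Vec<u8>) {
    let mut prev = 0u32;
    for &(doc, tf) in postings {
        let delta = doc.checked_sub(prev).expect("postings must be doc-sorted");
        let (buf, len) = encode_varint(delta);
        out.extend_from_slice(&buf[..len]);
        let (buf, len) = encode_varint(tf);
        out.extend_from_slice(&buf[..len]);
        prev = doc;
    }
}

/// Decodes into a caller-provided Vec; `None` on malformed input.
fn decode_postings(mut bytes: &[u8], out: &mut Vec<(u32, u32)>) -> Option<()> {
    let mut prev = 0u32;
    while !bytes.is_empty() {
        let (delta, used) = decode_varint(bytes)?;
        bytes = &bytes[used..];
        let (tf, used) = decode_varint(bytes)?;
        bytes = &bytes[used..];
        prev = prev.checked_add(delta)?;
        out.push((prev, tf));
    }
    Some(())
}
```

Small deltas and small term frequencies each fit in one byte, which is where the roughly 2 B/posting figure for a Zipfian corpus plausibly comes from.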

Stories: STORY-0002/0003/0004/0005(AC1-2)/0009 done; STORY-0087/0088 decided.
Proof: SCENARIO-0006 (36 leit_postings tests). PAR spec + quality reviewed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ER-0002 T6-T7]

SCENARIO-0070 (process-level): Criterion benchmark comparing DeltaVarint vs
BlockDelta over the deterministic wind-tunnel corpus (1K/10K, multi-field, Zipfian).
Measures encode time, decode time, and compressed size vs the 8-byte/posting
baseline, with a lossless sanity gate. Baseline: DeltaVarint ~25%, BlockDelta ~26-27%
of uncompressed; DeltaVarint decode ~4-11% faster.

- crates/leit_wind_tunnel_index/benches/codec_compare.rs (+ [[bench]], leit_postings/
  leit_core dev-deps). Criterion stays out of all primary crates (SCENARIO-0061/0069 pass).
- leit_index: PostingEntry made public + InMemoryIndex::postings_by_term() accessor,
  the minimal surface needed to extract doc-sorted (SegmentLocalDocId, TermFreq) postings
  (lowering stands in for the ITER-0004 segment-write boundary).
- docs/2026-05-30-codec-tradeoffs.md — STORY-0006 AC-3 decode-cost vs memory guidance.

Stories: STORY-0006 (benchmark + guidance). Proof: SCENARIO-0070. PAR reviewed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>