Avoid cloning index bitmaps for read-only query operands #1908
Merged
brharrington merged 1 commit on Jun 4, 2026
RoaringTagIndex query evaluation cloned the matching bitmap at every leaf (`equal`/`hasKey` via `withOffsetClone`, plus `True`) so the result could be mutated by an enclosing in-place and/or. A multi-term query therefore cloned once per term, and a standalone `:eq` cloned a bitmap that the caller only reads. Roaring bitmap intermediates were ~38% of allocation in an alloc profile, almost all on the query path.

Split evaluation into `findReadOnly` (may return a shared reference into the index -- read only) and `findOwned` (caller may mutate; copies only the Equal/HasKey/True cases that can be shared). and/or take the mutated accumulator via `findOwned(q1)` and the read-only operand via `findReadOnly(q2)`, so only the accumulator is copied. Shared bitmaps are safe because the index is immutable while querying.

Defer the paging offset to a single application on the final set instead of cloning it into every leaf (prefix removal commutes with and/or/andNot). This also fixes a latent paging bug: previously a `:not` query with an offset left `all` untrimmed and re-emitted the already-paged prefix.

Allocation (gc.alloc.rate.norm, limit=1): standalone `:eq` ~952 -> ~24 B/op; multi-term and/or roughly halved.

Adds adversarial ownership tests (mutation-verified: they fail if a shared bitmap is mutated), a `:not`+offset paging test, and a QueryAllocation benchmark.
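For readers who want the ownership rule in concrete form, here is a minimal sketch. It is not the code from this change: `java.util.BitSet` stands in for the roaring bitmap, the `OwnershipSketch` class, its tiny query types, `put`, and `applyOffset` are all hypothetical, and only the names `findReadOnly` and `findOwned` come from the description above. The exact boundary of the paging offset is also an assumption.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the read-only/owned split.
// BitSet stands in for a roaring bitmap; this is not the Atlas implementation.
final class OwnershipSketch {

  interface Query {}

  static final class Eq implements Query {
    final String key;
    final String value;
    Eq(String key, String value) { this.key = key; this.value = value; }
  }

  static final class And implements Query {
    final Query q1;
    final Query q2;
    And(Query q1, Query q2) { this.q1 = q1; this.q2 = q2; }
  }

  static final class Or implements Query {
    final Query q1;
    final Query q2;
    Or(Query q1, Query q2) { this.q1 = q1; this.q2 = q2; }
  }

  // "key=value" -> ids of matching items; treated as immutable while querying.
  private final Map<String, BitSet> index = new HashMap<>();

  void put(String key, String value, int... ids) {
    BitSet bits = new BitSet();
    for (int id : ids) bits.set(id);
    index.put(key + "=" + value, bits);
  }

  // May return a reference shared with the index: callers must not mutate it.
  BitSet findReadOnly(Query q) {
    if (q instanceof Eq) {
      Eq eq = (Eq) q;
      BitSet shared = index.get(eq.key + "=" + eq.value);
      return shared != null ? shared : new BitSet();
    }
    // and/or results are freshly built, so handing them out for reading is safe.
    return findOwned(q);
  }

  // Always returns a set the caller may mutate; only the leaf case copies.
  BitSet findOwned(Query q) {
    if (q instanceof Eq) {
      return (BitSet) findReadOnly(q).clone();
    }
    if (q instanceof And) {
      And a = (And) q;
      BitSet acc = findOwned(a.q1);  // accumulator: the only copy made
      acc.and(findReadOnly(a.q2));   // operand: read in place, not copied
      return acc;
    }
    if (q instanceof Or) {
      Or o = (Or) q;
      BitSet acc = findOwned(o.q1);
      acc.or(findReadOnly(o.q2));
      return acc;
    }
    throw new IllegalArgumentException("unsupported query: " + q);
  }

  // Paging applied once, to the final owned set: drops ids below `offset`.
  // Assumption of this sketch: the offset is an exclusive lower bound on ids.
  static BitSet applyOffset(BitSet owned, int offset) {
    owned.clear(0, offset);
    return owned;
  }
}
```

In this sketch a two-term `And` clones one leaf instead of two, and a caller that only reads a standalone `Eq` result can call `findReadOnly` and clone nothing. Because the offset is trimmed once from the final owned set, no leaf needs an offset-specific copy.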