Avoid cloning index bitmaps for read-only query operands #1908

Merged: brharrington merged 1 commit into Netflix:main from brharrington:query-bitmap-clone-reduction on Jun 4, 2026

@brharrington

Contributor

`RoaringTagIndex` query evaluation cloned the matching bitmap at every leaf (`equal`/`hasKey` via `withOffsetClone`, plus `True`) so that the result could be mutated by an enclosing in-place `and`/`or`. A multi-term query therefore cloned once per term, and a standalone `:eq` cloned a bitmap that the caller only reads. Roaring bitmap intermediates accounted for ~38% of allocation in an allocation profile, almost all on the query path.

Split evaluation into `findReadOnly`, which may return a shared reference into the index that the caller must treat as read-only, and `findOwned`, whose result the caller may mutate and which copies only in the `Equal`/`HasKey`/`True` cases, where the result could otherwise be shared. `and`/`or` take the mutated accumulator via `findOwned(q1)` and the read-only operand via `findReadOnly(q2)`, so only the accumulator is copied. Shared bitmaps are safe because the index is immutable while querying.
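The ownership split described above can be sketched roughly as follows. This is a minimal illustration, not the Atlas code: `java.util.BitSet` stands in for a Roaring bitmap, and the `Query` records, the `TinyIndex` class, and its `postings` map are hypothetical names invented for the example. Only `findReadOnly` and `findOwned` are taken from the description.

```java
import java.util.BitSet;
import java.util.Map;

// Hypothetical query model; the real query types are richer than this.
interface Query {}
record Equal(String key, String value) implements Query {}
record And(Query q1, Query q2) implements Query {}
record Or(Query q1, Query q2) implements Query {}

// Illustrative index: maps "key=value" to a bitmap of matching positions.
// Assumption: the index is never modified while queries are running.
final class TinyIndex {
  private static final BitSet EMPTY = new BitSet();
  private final Map<String, BitSet> postings;

  TinyIndex(Map<String, BitSet> postings) {
    this.postings = postings;
  }

  // May return a bitmap shared with the index. Callers must not mutate it.
  BitSet findReadOnly(Query q) {
    if (q instanceof Equal e) {
      return postings.getOrDefault(e.key() + "=" + e.value(), EMPTY);
    }
    // Composite queries build a fresh bitmap anyway, so owned is fine here.
    return findOwned(q);
  }

  // Always returns a bitmap that the caller is free to mutate.
  BitSet findOwned(Query q) {
    if (q instanceof Equal) {
      // Leaf results may be shared with the index, so copy here and only here.
      return (BitSet) findReadOnly(q).clone();
    }
    if (q instanceof And a) {
      BitSet acc = findOwned(a.q1());   // accumulator: copied at most once
      acc.and(findReadOnly(a.q2()));    // operand: read, never mutated
      return acc;
    }
    if (q instanceof Or o) {
      BitSet acc = findOwned(o.q1());
      acc.or(findReadOnly(o.q2()));
      return acc;
    }
    throw new IllegalArgumentException("unsupported query: " + q);
  }
}
```

In this sketch a caller that only iterates the result of a single `Equal` would call `findReadOnly` and allocate nothing, while a two-term `And` clones one bitmap instead of two. The contract is enforced only by convention, which is presumably why the PR adds tests that fail if a shared bitmap is mutated.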

Defer the paging offset to a single application on the final set instead of applying it, with a clone, at every leaf (prefix removal commutes with `and`/`or`/`andNot`). This also fixes a latent paging bug: previously a `:not` query with an offset left `all` untrimmed and re-emitted the already-paged prefix.
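To make the paging argument concrete, here is a small sketch of both shapes of `:not` evaluation. It assumes, purely for illustration, that applying an offset means dropping every position below `offset`; the exact boundary used by the index may differ. `java.util.BitSet` again stands in for a Roaring bitmap, and `PagingSketch` with all of its methods is hypothetical.

```java
import java.util.BitSet;

// Illustrative helpers only; none of these names come from the index itself.
final class PagingSketch {
  // Assumed paging rule: drop every position below `offset`.
  static BitSet dropPrefix(BitSet set, int offset) {
    BitSet s = (BitSet) set.clone();
    s.clear(0, offset); // clears the half-open range [0, offset)
    return s;
  }

  // Convenience constructor for small example sets.
  static BitSet of(int... positions) {
    BitSet s = new BitSet();
    for (int p : positions) {
      s.set(p);
    }
    return s;
  }

  // Old shape for :not as described above: the leaf is trimmed, `all` is not.
  static BitSet notTrimmedAtLeaf(BitSet all, BitSet matches, int offset) {
    BitSet result = (BitSet) all.clone();
    result.andNot(dropPrefix(matches, offset));
    return result;
  }

  // Deferred shape: evaluate the whole query first, then trim once.
  static BitSet notTrimmedAtEnd(BitSet all, BitSet matches, int offset) {
    BitSet result = (BitSet) all.clone();
    result.andNot(matches);
    return dropPrefix(result, offset);
  }
}
```

With `all = {0, 1, 2, 3, 4, 5}`, `matches = {1, 4}`, and an offset of 3, the leaf-trimmed form yields `{0, 1, 2, 3, 5}`, which re-emits the prefix, while the deferred form yields `{3, 5}`. Deferring is valid because `dropPrefix` is an intersection with a fixed set of remaining positions, and intersecting with a fixed set distributes over `and`, `or`, and `andNot`.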

Allocation (`gc.alloc.rate.norm`, limit=1): standalone `:eq` ~952 -> ~24 B/op; multi-term `and`/`or` roughly halved.

Adds adversarial ownership tests (mutation-verified: they fail if a shared bitmap is mutated), a `:not`+offset paging test, and a `QueryAllocation` benchmark.

brharrington added this to the 1.9.0 milestone on Jun 4, 2026
brharrington merged commit 4b99b45 into Netflix:main on Jun 4, 2026
5 checks passed
brharrington deleted the query-bitmap-clone-reduction branch on June 4, 2026 20:23