Derive in-memory index membership from the data map#1906
Merged
brharrington merged 1 commit intoJun 4, 2026
Conversation
The metadata index could retain a series no longer present in the data
map ("orphaned series"): a tag query returned it long after its data had
aged past the retention window. It happened when a concurrent update
revived a block store in the small window between rebuild()'s emptiness
check and removing it from the data map -- the store stayed reachable
from the index (via its BlockStoreItem) but was dropped from data, so
cleanup could never revisit it.
Fix the root cause -- two independent sources of liveness -- by making
the data map authoritative. data now holds BlockStoreItem, and
rebuildIndex() cleans up rolled-out/filtered stores and rebuilds the
index from the surviving entries in a single pass, so a store removed
during cleanup cannot linger in the index. The pending-queue feed
(index.update) is no longer used by MemoryDatabase.
To avoid losing the datapoint from the residual sub-instruction race,
update()/rollup() call ensurePresent() after writing: a lock-free get
that re-inserts the store only if it was concurrently removed, adding no
per-datapoint locking. hasData is made volatile so the rebuild thread
reliably observes ingest-side revivals now that membership depends on it.
Adds a deterministic regression test for the revive-during-cleanup race
and a test for the rebuild() throttle.
brharrington
added a commit
that referenced
this pull request
Jun 4, 2026
Deriving index membership from the data map (#1906) stopped update() from feeding the index pending queue, so the synchronous data->index sync now happens only in rebuildIndex(). Callers that write via update() and previously forced a sync with index.rebuildIndex() no longer see their data: the no-arg BatchUpdateTagIndex.rebuildIndex() just drains an empty pending queue. Expose the unthrottled MemoryDatabase.rebuildIndex() so callers can force a synchronous rebuild after a bulk load or in tests. rebuild() remains the throttled variant used by the background thread.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The metadata index could retain a series no longer present in the data map ("orphaned series"): a tag query returned it long after its data had aged past the retention window. It happened when a concurrent update revived a block store in the small window between rebuild()'s emptiness check and removing it from the data map -- the store stayed reachable from the index (via its BlockStoreItem) but was dropped from data, so cleanup could never revisit it.
Fix the root cause -- two independent sources of liveness -- by making the data map authoritative. data now holds BlockStoreItem, and rebuildIndex() cleans up rolled-out/filtered stores and rebuilds the index from the surviving entries in a single pass, so a store removed during cleanup cannot linger in the index. The pending-queue feed (index.update) is no longer used by MemoryDatabase.
To avoid losing the datapoint from the residual sub-instruction race, update()/rollup() call ensurePresent() after writing: a lock-free get that re-inserts the store only if it was concurrently removed, adding no per-datapoint locking. hasData is made volatile so the rebuild thread reliably observes ingest-side revivals now that membership depends on it.
Adds a deterministic regression test for the revive-during-cleanup race and a test for the rebuild() throttle.