Skip to content

MB-72240: use the vector cache for fast merge#420

Merged
Thejas-bhat merged 9 commits into
masterfrom
trained_index
Jun 18, 2026
Merged

MB-72240: use the vector cache for fast merge#420
Thejas-bhat merged 9 commits into
masterfrom
trained_index

Conversation

@Thejas-bhat

@Thejas-bhat Thejas-bhat commented May 28, 2026

Copy link
Copy Markdown
Member
  • keep the trained index in the trained index cache to reduce the garbage
    generated during fast merge.
    • the index is always kept in the cache so that we don't prematurely close the index entry especially when the merge operations are sporadic
  • fast merge requires only the vector index, so we skip constructing the
    id mapping

@Thejas-bhat Thejas-bhat force-pushed the trained_index branch 2 times, most recently from ea72515 to 9e5fc63 Compare June 4, 2026 16:46
@Thejas-bhat Thejas-bhat changed the title MB-72082: use the vector cache for fast merge MB-72249: use the vector cache for fast merge Jun 8, 2026
@Thejas-bhat Thejas-bhat changed the title MB-72249: use the vector cache for fast merge MB-72240: use the vector cache for fast merge Jun 8, 2026
@Thejas-bhat Thejas-bhat marked this pull request as ready for review June 8, 2026 17:27
Comment thread centroid_index.go Outdated
Comment thread faiss_vector_cache.go Outdated
Comment thread faiss_vector_cache.go Outdated
Comment thread section_faiss_vector_index.go Outdated
Comment thread trained_index.go Outdated
Likith101
Likith101 previously approved these changes Jun 16, 2026
capemox
capemox previously approved these changes Jun 17, 2026
@CascadingRadium

CascadingRadium commented Jun 17, 2026

Copy link
Copy Markdown
Member

Here is my opinion, From what I can gather, we load the trained index into the cache the first time we hit the API and then reuse it for every merge operation to enable fast merge.

My concern with this approach is that the vector index cache is designed for the query path, with the EWMA mechanism keeping the index alive for a short window after its reference count drops to zero.

Reusing this same struct for indexing is not ideal. We can easily end up evicting the trained index the moment closeTrainedIndex() is called.

Since we incRef and decRef the index within a single merge cycle, and merges are both infrequent and happen at highly variable intervals, we can easily end up thrashing the cache.

Additionally, modifying a cache that was purpose-built for the query path introduces fragility, with fast merge logic interfering into query path code, making it harder to reason about and more prone to breakage as new features are added.

My proposal: add a dedicated always-alive cache in trained_index.go. This would be a new type that holds only the trained index and whatever state is required exclusively for the fast merge path. The EWMA eviction mechanism would not apply to it, the trained index would remain available for the lifetime of the parent index, similar to the nested index cache, which permanently persists the ancestry chain. This way throughout the index lifetime, irrespective of when a merge occurred, we always have the trained index at hand.

Since we memory-map the faiss index anyway, we are not losing out on performance there. This always-alive cache will be applicable only for the trained segment, and the regular cache will be applicable for the regular segments.

func newTrainedIndexCache() *trainedIndexCache {
	return &trainedIndexCache{
		cache: make(map[uint16]faissIndex),
	}
}

type trainedIndexCache struct {
	m     sync.RWMutex
	cache map[uint16]faissIndex
        /// No EWMA or eviction
}

// loadOrCreate obtains the trained faiss index from the cache or creates it
// from the provided segment memory if not already present.
func (tc *trainedIndexCache) loadOrCreate(fieldID uint16) (faissIndex, error) {
	tc.m.RLock()
	if idx, ok := tc.cache[fieldID]; ok {
		tc.m.RUnlock()
		return idx, nil
	}
	tc.m.RUnlock()

	tc.m.Lock()
	defer tc.m.Unlock()

	if idx, ok := tc.cache[fieldID]; ok {
		return idx, nil
	}

	idx, err := tc.createFromOptsLOCKED(opts)
	if err != nil {
		return nil, err
	}
	tc.cache[fieldID] = idx
	return idx, nil
}

// Clear closes all cached trained indices and clears the cache.
func (tc *trainedIndexCache) Clear() {
	tc.m.Lock()
	defer tc.m.Unlock()
	for _, idx := range tc.cache {
		idx.close()
	}
	tc.cache = nil
}

// call clear when we close the segment, along with the other cache clear ops.
	s.nstIndexCache.Clear()
	s.trainedIndexCache.Clear()

@Thejas-bhat

Thejas-bhat commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

modifying a cache that was purpose-built for the query path introduces fragility, with fast merge logic interfering into query path code, making it harder to reason about and more prone to breakage as new features are added.

yeah makes sense, thanks. i'll make the changes

@Thejas-bhat Thejas-bhat requested review from Likith101 and capemox June 17, 2026 17:54
Comment thread faiss_vector_cache.go
Comment thread segment.go
Comment thread trained_index.go Outdated
Comment thread faiss_vector_cache.go Outdated

@Thejas-bhat Thejas-bhat left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah my bad, i forgot to push the patch that was there on my system

Comment thread faiss_vector_cache.go Outdated
Comment thread faiss_vector_cache.go Outdated
Comment thread trained_index.go Outdated
Comment thread faiss_vector_posting.go Outdated
@Thejas-bhat Thejas-bhat merged commit f4c7e8c into master Jun 18, 2026
9 checks passed
@Thejas-bhat Thejas-bhat deleted the trained_index branch June 18, 2026 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants