[WIP] Introduce CuVS resource manager by narangvivek10 · Pull Request #138 · NVIDIA/cuvs-lucene

narangvivek10 · 2026-04-28T23:44:27Z

Until now, indexing threads have been allowed to build indexes on the GPU as soon as they are ready. On a GPU with ample resources, and depending on the dataset size and the indexing pattern (for example, when flushes happen), this may not be a problem. Under tighter conditions, however, with fewer resources and a large dataset, GPU resources can run out, resulting in OOM situations.

I am introducing a CuvsResourcesManager-based approach. It keeps a finite number of ManagedCuVSResources in a pool and actively monitors the available device memory. Requesting threads submit their requests in a controlled fashion based on resource availability: they acquire a resource, do their work, and release it when finished. Once this approach is also rolled out to the GPU index and search API, the ThreadLocalCuVSResourcesProvider-based approach will be retired.
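To make the acquire/release flow described above concrete, here is a minimal sketch of a bounded pool that also gates admission on free device memory. It is not the CuvsResourcesManager from this PR: the class `ResourcePoolSketch`, the `Lease` type, the `acquire` signature, and the `freeDeviceBytes` probe are all hypothetical names chosen for illustration, and the memory check is a deliberately conservative simplification (the requested estimate plus all outstanding reservations must fit in the currently reported free memory).

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a pool that limits concurrent GPU work by slots and memory. */
final class ResourcePoolSketch {

    /** Handle returned by acquire(); closing it frees the slot and the reservation. */
    final class Lease implements AutoCloseable {
        private final long reservedBytes;
        private boolean closed;

        private Lease(long reservedBytes) {
            this.reservedBytes = reservedBytes;
        }

        @Override
        public void close() {
            synchronized (ResourcePoolSketch.this) {
                if (closed) {
                    return; // closing twice is a no-op
                }
                closed = true;
                reserved -= reservedBytes;
                ResourcePoolSketch.this.notifyAll(); // wake threads waiting for memory
            }
            slots.release();
        }
    }

    private final Semaphore slots;              // finite number of pooled resources
    private final LongSupplier freeDeviceBytes; // assumed probe for free device memory
    private long reserved;                      // bytes promised to active leases

    ResourcePoolSketch(int poolSize, LongSupplier freeDeviceBytes) {
        this.slots = new Semaphore(poolSize, true);
        this.freeDeviceBytes = freeDeviceBytes;
    }

    /**
     * Waits up to timeoutMillis for a free slot and for enough free memory.
     * Returns null if the request cannot be admitted in time.
     */
    Lease acquire(long estimatedPeakBytes, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        if (!slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return null; // no pooled resource became available
        }
        boolean admitted = false;
        try {
            synchronized (this) {
                while (reserved + estimatedPeakBytes > freeDeviceBytes.getAsLong()) {
                    long remainingMs =
                        TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remainingMs <= 0) {
                        return null; // not enough memory before the deadline
                    }
                    wait(remainingMs);
                }
                reserved += estimatedPeakBytes;
                admitted = true;
            }
        } finally {
            if (!admitted) {
                slots.release(); // give the slot back on timeout or interrupt
            }
        }
        return new Lease(estimatedPeakBytes);
    }
}
```

Because `Lease` is `AutoCloseable`, a caller could wrap an index build in try-with-resources so the slot and reservation are returned even if the build throws. The fair semaphore is a design choice in this sketch so that waiting indexing threads are served in arrival order.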

Summary of changes:

Introduce CuvsResourcesManager.
Replace the prior ThreadLocalCuVSResourcesProvider-based implementation in the CAGRA->HNSW APIs with CuvsResourcesManager.
Cleanup: remove the CuVSProvider classes, which are redundant because the one available in cuvs-java can be used directly instead.

copy-pr-bot · 2026-04-28T23:44:31Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

narangvivek10 · 2026-04-29T02:20:15Z

/ok to test f7ba116

narangvivek10 · 2026-04-30T22:39:39Z

/ok to test 5d331d9

narangvivek10 · 2026-06-01T17:28:35Z

Below is a summary of a subset of test runs that compare the peak memory estimates against what we observe in reality, with segment sizes ranging from 100K to 20M vectors.

Initial work

399426c

narangvivek10 self-assigned this Apr 28, 2026

narangvivek10 added the improvement and non-breaking labels Apr 28, 2026

Variable renaming and simplification in CuvsResourcesManager

f7ba116

narangvivek10 added 2 commits April 28, 2026 23:43

Cleanup

0cf52db

Add javadocs and rename variables

5d331d9

narangvivek10 marked this pull request as ready for review April 30, 2026 23:24

narangvivek10 requested a review from a team as a code owner April 30, 2026 23:24

narangvivek10 requested review from cjnolet and mythrocks April 30, 2026 23:24

narangvivek10 and others added 2 commits May 11, 2026 15:56

Merge branch 'main' into vivek/implement-managed-resources

8841dbc

Add peak device memory estimation methods for NN_DESCENT and IVF_PQ

89d3991

narangvivek10 changed the title ~~Introduce CuVS resource manager~~ [WIP] Introduce CuVS resource manager May 11, 2026

narangvivek10 marked this pull request as draft May 11, 2026 21:34

narangvivek10 added 3 commits May 12, 2026 00:05

Add logic to check actual current free device memory, change algo threshold value to 1M, and tweak estimations

a146fdc

Update utils - allow to switch between building matrix on device and host

64e0e52

Adjust peak memory estimation for both NN_DESCENT and IVF_PQ

8a6c6ed

Merge branch 'main' into vivek/implement-managed-resources

fe6f73a

narangvivek10 wants to merge 10 commits into NVIDIA:main from SearchScale:vivek/implement-managed-resources