[FLINK-39753][state/rocksdb] Close ColumnFamilyOptions from getDescriptor() in Compactor #28251

Open
leekeiabstraction wants to merge 1 commit into apache:master from leekeiabstraction:FLINK-39753

leekeiabstraction commented May 25, 2026

What is the purpose of the change

Fixes a native memory leak in the RocksDB SST merge Compactor. ColumnFamilyHandle.getDescriptor() copies the column family's options across JNI and returns a fresh native ColumnFamilyOptions on every call. Compactor.compact() read numLevels() from it but never closed it, so the native object leaked on every compaction. Because the leaked options retain a reference to the shared block cache (via BlockBasedTableFactory -> BlockBasedTableOptions -> LRUCache), the cache's shared_ptr is never released, preventing the block cache from being freed even after all tasks stop. This causes task manager RSS to grow and eventually OOM.
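The pinning effect described above can be modelled with a toy reference count. This is a sketch only: `Cache`, `OptionsCopy`, and `simulate` are hypothetical stand-ins written for illustration, not RocksDB or Flink classes, and the counter merely imitates how a C++ shared_ptr keeps its target alive while any holder remains.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedCacheSketch {
    /** Toy stand-in for a reference-counted block cache (hypothetical, not RocksDB's LRUCache). */
    static final class Cache {
        final AtomicInteger refs = new AtomicInteger();
    }

    /** Toy stand-in for an options copy that holds a reference to the cache until closed. */
    static final class OptionsCopy implements AutoCloseable {
        private final Cache cache;
        private boolean closed;

        OptionsCopy(Cache cache) {
            this.cache = cache;
            cache.refs.incrementAndGet();
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                cache.refs.decrementAndGet();
            }
        }
    }

    /** Simulates compactions that each copy the options; returns the references left on the cache. */
    static int simulate(int compactions, boolean closeCopies) {
        Cache cache = new Cache();
        OptionsCopy owner = new OptionsCopy(cache); // the database's own reference
        for (int i = 0; i < compactions; i++) {
            OptionsCopy copy = new OptionsCopy(cache); // one fresh copy per compaction
            if (closeCopies) {
                copy.close();
            }
        }
        owner.close(); // the task stops and the database releases its reference
        return cache.refs.get(); // anything above zero means the cache cannot be freed
    }

    public static void main(String[] args) {
        System.out.println("references left, copies never closed: " + simulate(3, false));
        System.out.println("references left, copies closed: " + simulate(3, true));
    }
}
```

In this model, a single unclosed copy is enough to keep the count above zero after the owner has gone, which mirrors why the block cache outlived the stopped tasks.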

Brief change log

  • Wrap cfName.getDescriptor().getOptions() in a try-with-resources block in Compactor.compact() so the native ColumnFamilyOptions is closed after numLevels() is read.
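A minimal sketch of the pattern this change applies. It uses hypothetical stand-ins (`FakeOptions`, `FakeHandle`, `freshOptions()`) instead of the real RocksDB classes so that it runs without the native library; `freshOptions()` stands in for the `getDescriptor().getOptions()` chain, and the value returned by `numLevels()` is an arbitrary placeholder.

```java
import java.util.concurrent.atomic.AtomicInteger;

class OptionsCloseSketch {
    /** Number of stand-in "native" objects that were allocated and not yet closed. */
    static final AtomicInteger LIVE = new AtomicInteger();

    /** Hypothetical stand-in for a native-backed options object. */
    static final class FakeOptions implements AutoCloseable {
        private boolean closed;

        FakeOptions() {
            LIVE.incrementAndGet();
        }

        int numLevels() {
            return 7; // placeholder value for illustration
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                LIVE.decrementAndGet();
            }
        }
    }

    /** Hypothetical stand-in for a handle that returns a fresh options copy on every call. */
    static final class FakeHandle {
        FakeOptions freshOptions() {
            return new FakeOptions();
        }
    }

    /** Shape of the code before the change: the copy is read and then dropped unclosed. */
    static int numLevelsLeaky(FakeHandle handle) {
        return handle.freshOptions().numLevels();
    }

    /** Shape of the code after the change: the copy is closed once the value is read. */
    static int numLevelsClosed(FakeHandle handle) {
        try (FakeOptions options = handle.freshOptions()) {
            return options.numLevels();
        }
    }

    public static void main(String[] args) {
        FakeHandle handle = new FakeHandle();
        numLevelsClosed(handle);
        System.out.println("live after closed variant: " + LIVE.get());
        numLevelsLeaky(handle);
        System.out.println("live after leaky variant: " + LIVE.get());
    }
}
```

Both variants return the same value, so the computed output level is unaffected; only the lifetime of the temporary object differs.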

Verifying this change

The leak and the fix were verified with jemalloc profiling (jeprof). Flink ran in session mode, and jobs were repeatedly started and stopped to trigger the compactor while tracking the rocksdb::BlockFetcher::ReadBlockContents call stack, which dominates block-cache allocations. The configured block cache capacity was 833MB.

  • Before the fix: after jobs were stopped and resubmitted, ReadBlockContents grew to ~1.54GB, far exceeding the 833MB cache capacity; the jemalloc heap profile reported a total of 2,280,777,636 bytes.
  • After the fix: the task manager with the highest RSS held ~800MB in ReadBlockContents, consistent with the 833MB capacity; the jemalloc heap profile reported a total of 1,416,132,765 bytes.

This is a ~38% reduction in native memory usage (864,644,871 bytes less in the jemalloc heap profile total) and eliminates the cache-capacity overage, confirming that the LRUCache leak caused by the unclosed ColumnFamilyOptions is resolved. The behavior (output level computation) is unchanged and is covered by existing tests; the only difference is that the previously leaked native handle is now closed.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

flinkbot (Collaborator) commented May 25, 2026

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

davidradl (Contributor) left a comment

Approving if you add the comment

och5351 (Contributor) left a comment

Hi @leekeiabstraction!

disposeInternal is implemented, but I couldn't find the corresponding close call.

LGTM

…ptor() in Compactor

ColumnFamilyHandle.getDescriptor() allocates a new native ColumnFamilyOptions
on every call, and Compactor never closed it, preventing the shared block cache
from being freed. Wrap the call in try-with-resources so the options are closed
after reading numLevels().

leekeiabstraction (Author) commented

@davidradl Thank you for the review. I added the comment just before the try-with-resources block. PTAL.

github-actions bot added the community-reviewed label May 26, 2026

Labels

community-reviewed: PR has been reviewed by the community.

4 participants