Add builtin ZXC compression support #14786
Summary:
- Add kZXC as a builtin compression type across C/C++/Java option parsing, CLI tools, db_stress, auto-tuning, and release notes.
- Use zxc block data rather than file framing, with RocksDB-encoded uncompressed sizes and exact-sized output buffers via zxc_decompress_block_safe(); this avoids zxc file header/footer overhead and extra decompression copies.
- Use a bound-sized zxc compression scratch buffer when RocksDB provides a smaller ratio-capped output buffer; this avoids zxc 0.11.0 small-destination overwrites at the cost of an extra memcpy on accepted blocks in that path.
- Leave RocksDB blocks larger than the zxc block API 2 MiB limit uncompressed rather than adding custom multi-block framing and metadata.
- Disable zxc-embedded checksums for kZXC because zxc block data does not self-describe checksum presence; adding RocksDB metadata to every block would tax the read hot path for a rarely used feature, so a future distinct CompressionType can define a checksum-bearing format if needed.
- Use caller-provided zxc decompression working areas when available; otherwise allocate a temporary dctx on demand to avoid retained per-thread/per-core memory, with ROCKSDB_ZXC_THREAD_LOCAL_DCTX as an opt-in fallback for experiments.
- Add Makefile/CMake/build_detect_platform integration, including WITH_ZXC=1 S3 download support and RocksJava static-build coverage for zxc >= 0.11.0.

Test Plan:
- Added DBCompressionTest.ZXCUsesRocksDBEncodedSize for zxc block round trips, RocksDB size-prefix decoding, checksum-option format stability, and ratio-capped output handling.
- Updated compression/options unit tests for kZXC enum parsing, failure handling, and recommended parallel-thread behavior.
- Added zxc to rocksdb crash-test randomized compression coverage.
- Updated PR CI coverage to enable zxc in build-linux, the non-atomic build-linux-mini-crashtest lane, build-linux-cmake-with-folly-coroutines, and build-linux-static_lib-alt_namespace-status_checked.

RocksJava static coverage is provided by the existing static Java job bundled compression-library path.
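As a rough illustration of the scratch-buffer and size-limit bullets in the summary, the following is a minimal sketch and not the code in this PR. `CompressFn`, `BoundFn`, `ToyRleCompress`, `ToyRleBound`, `CompressIntoCappedBuffer`, and `kMaxBlockBytes` are hypothetical names, and a toy run-length encoder stands in for zxc so the example does not assume any zxc function signature. A return value of 0 is assumed to mean "store the block uncompressed".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical signatures for illustration; not the zxc or RocksDB API.
// A compressor returns the number of bytes written, or 0 on failure.
using CompressFn = std::size_t (*)(const char* src, std::size_t src_len,
                                   char* dst, std::size_t dst_cap);
using BoundFn = std::size_t (*)(std::size_t src_len);

// Block-size limit quoted in the summary (2 MiB).
constexpr std::size_t kMaxBlockBytes = std::size_t{2} << 20;

// Toy run-length encoder standing in for the real codec.
// Output is a sequence of (run length, byte) pairs, runs capped at 255.
std::size_t ToyRleCompress(const char* src, std::size_t src_len, char* dst,
                           std::size_t dst_cap) {
  std::size_t out = 0;
  for (std::size_t i = 0; i < src_len;) {
    std::size_t run = 1;
    while (i + run < src_len && src[i + run] == src[i] && run < 255) ++run;
    if (out + 2 > dst_cap) return 0;  // destination too small
    dst[out++] = static_cast<char>(static_cast<unsigned char>(run));
    dst[out++] = src[i];
    i += run;
  }
  return out;
}

// Worst case for the toy encoder: every byte becomes its own pair.
std::size_t ToyRleBound(std::size_t src_len) { return 2 * src_len; }

// Returns the compressed size written to `out`, or 0 when the block should
// be stored uncompressed (too large, codec failure, or over the ratio cap).
std::size_t CompressIntoCappedBuffer(CompressFn compress, BoundFn bound,
                                     const char* src, std::size_t src_len,
                                     char* out, std::size_t out_cap) {
  assert(out != nullptr || out_cap == 0);
  if (src_len > kMaxBlockBytes) return 0;  // beyond the block limit
  const std::size_t need = bound(src_len);
  if (out_cap >= need) {
    // The caller's buffer already covers the worst case: compress in place.
    return compress(src, src_len, out, out_cap);
  }
  // Ratio-capped buffer: compress into a bound-sized scratch buffer so the
  // codec never sees a destination smaller than its worst case.
  std::vector<char> scratch(need);
  const std::size_t n = compress(src, src_len, scratch.data(), scratch.size());
  if (n == 0 || n > out_cap) return 0;  // failed or missed the ratio cap
  std::memcpy(out, scratch.data(), n);  // the extra copy on accepted blocks
  return n;
}
```

In this sketch the extra `memcpy` only happens when the caller's buffer is smaller than the codec's bound and the result is accepted, which mirrors the tradeoff described in the summary.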
| Check | Count |
|---|---|
| cppcoreguidelines-special-member-functions | 1 |
| Total | 1 |
Details
util/compression.cc (1 warning(s))
util/compression.cc:1596:7: warning: class 'BuiltinUncompressionContext' defines a non-default destructor but does not define a move constructor or a move assignment operator [cppcoreguidelines-special-member-functions]
On my AMD-based workstation, I was not able to find any real benefit to zxc over LZ4HC. Its best performance showing comes from a server-class ARM machine, which I also tested using sst_dump --command=recompress over real RocksDB SSTs from real Meta workloads.

Although ZXC looks fast at decompressing compressed bytes on ARM, that is not the same thing as winning for RocksDB block reads. On these SSTs, LZ4HC consistently achieves better compression ratios, and that changes the overall economics. For example, on one large SST, ZXC level 3 decompressed slightly faster than LZ4HC level 3, but produced about 166.6MB of compressed data versus 152.1MB for LZ4HC, with ratios of 4.75x vs. 5.20x. On another SST, ZXC level 3 was again close on read CPU, but compressed to 45.6MB versus 43.6MB for LZ4HC. Across the data set, the pattern was similar: ZXC’s decompressor is efficient per compressed byte, but LZ4HC gives it fewer bytes to process, ending up with similar or slightly higher (<+10%) decompression CPU cost.

That matters beyond the decompression loop itself. Better compression ratio reduces storage footprint, checksum verification work, cache and memory bandwidth pressure, and any read-path overhead proportional to compressed bytes. Remote storage particularly has CPU and network overheads proportional to the number of bytes read. So even where ZXC’s decompression CPU is nominally faster, LZ4HC’s ratio likely brings the total CPU/read-path cost back to parity or better while also saving space.

The write-side tradeoff is also important. ZXC’s low levels write quickly but have noticeably worse ratios. The higher ZXC levels can approach LZ4HC ratios on some files, but with a sharp write performance cliff. In these runs, ZXC level 6 took roughly 13s, 19s, or even 110s on some SSTs where comparable LZ4HC levels were around 1-4s. ZXC level 5 is more comparable and could provide a slight benefit for bulk load, read-heavy workloads less sensitive to storage size, but I'm not convinced it's currently worth the cost of proceeding with the integration.

My takeaway is that ZXC is technically interesting, especially as a very fast ARM decompressor, but the mixed results do not yet show clear value over LZ4/LZ4HC for RocksDB. I intend to keep this enhancement PR in the backlog, unless/until that landscape changes.
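The size figures quoted above can be sanity-checked with a few lines of arithmetic. This is only a sketch using the numbers from the comment; `ImpliedUncompressedMB` and `ExtraBytesFraction` are hypothetical helper names, and no new measurements are introduced.

```cpp
#include <cassert>

// Uncompressed size implied by a compressed size and a compression ratio.
double ImpliedUncompressedMB(double compressed_mb, double ratio) {
  assert(ratio > 0.0);
  return compressed_mb * ratio;
}

// Fraction of extra compressed bytes that codec A produces relative to B.
double ExtraBytesFraction(double a_compressed_mb, double b_compressed_mb) {
  assert(b_compressed_mb > 0.0);
  return (a_compressed_mb - b_compressed_mb) / b_compressed_mb;
}
```

With the quoted figures, `ImpliedUncompressedMB(166.6, 4.75)` and `ImpliedUncompressedMB(152.1, 5.20)` both come out at roughly 791MB, so the two ratios describe the same input. `ExtraBytesFraction(166.6, 152.1)` is about 0.095 and `ExtraBytesFraction(45.6, 43.6)` is about 0.046, meaning ZXC level 3 left roughly 9.5% and 4.6% more compressed bytes to read on those two SSTs.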
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D106421979.