Add read-scoped block buffers for scan reads#14806
Conversation
✅ clang-tidy: No findings on changed linesCompleted in 432.1s. |
Introduce the experimental ReadScopedBlockBufferProvider API and ReadOptions::read_scoped_block_buffer_provider so supported block-based table iterator and MultiScan data-block reads can use caller-provided read-scoped storage. Provider-backed scan reads bypass block cache, attach provider cleanup to uncompressed read buffers without copying, and decompress compressed blocks directly into provider-backed output. Add an independent MultiScanArgs::bypass_block_cache / IODispatcher JobOptions bypass knob for scan reads that should skip data-block cache lookup and insertion while preserving fill_cache=false behavior. Plumb the option through MultiScan, db_bench, db_stress, and IODispatcher. Extend AlignedBuffer and RandomAccessFileReader direct-I/O paths to support external aligned allocations, then use that support for read-scoped iterator, async I/O, and MultiRead scratch buffers. Centralize read-scoped I/O policy, keep coalesced async reads safe when blocks are released before completion, and validate provider lease contracts. Add focused coverage for read-scoped ownership, compressed and uncompressed blocks, direct I/O, cache bypass behavior, invalid provider leases, async release handling, and stress-test provider invariants. Add public API release notes for read-scoped block buffers.
0772171 to
4ae4eab
Compare
CreateAndOpenSST returns BlockBasedTable instances whose Rep keeps a reference to the InternalKeyComparator passed through TableReaderOptions. Keep that comparator in fixture-owned storage alongside the existing ImmutableOptions and EnvOptions storage so iterator creation does not read a stack object after the helper returns.
RandomAccessFileReader::Read tracks direct-I/O progress in a local current_size variable after adding caller-owned aligned buffers. When the read uses the internal temporary AlignedBuffer and copies into caller scratch, keep the buffer's CurrentSize in sync before calling AlignedBuffer::Read(). This avoids a debug assertion in checksum verification with use_direct_reads when no caller-owned direct I/O buffer is supplied.
Delete copy and move operations for the stress read-scoped provider, whose destructor validates allocation lifetime invariants. Also split the debug-only cleanup pointer selection in CreateAndPinBlockFromBuffer so clang-tidy no longer sees the initial value overwritten before use.
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit c5f5998 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D106999951. |
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit c5f5998 SummaryThis PR adds a well-structured High-severity findings (3):
Full review (click to expand)Findings🔴 HIGHH1.
|
| Context | Code executes? | Assumptions hold? | Action needed? |
|---|---|---|---|
| WritePreparedTxnDB | YES (scans) | YES - provider is read-path only | Safe |
| ReadOnly DB | YES | YES | Safe |
| CompactionService | NO | N/A | Safe |
| User-defined timestamps | YES | YES - orthogonal | Safe |
| BlobDB | NO - separate path | N/A | Safe |
| Concurrent iterators | YES | Provider must be thread-safe | Per API contract |
| Direct IO + non-aligned reads | YES | Alignment contract covers this | Verify with tests |
| Immortal tables (mmap) | YES | Provider bypasses cache, mmap may still be used | Verify block_contents_pinned |
Positive Observations
- Well-designed
ReadScopedBlockBufferProviderAPI with clear ownership semantics viaSharedCleanablePtr AllocateReadScopedBlockBuffervalidation provides defense-in-depth against misbehaving providersStressReadScopedBlockBufferProviderin db_stress provides comprehensive invariant checkingAlignedBuffer::AllocateNewBufferWithStatusis a clean extension for external allocationLookupAndPinBlocksInCachegraceful null handling is the right approach- Good test coverage for direct IO external buffer paths
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit cbf3237 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit cbf3237 SummaryLarge, well-structured PR that introduces read-scoped block buffer providers and cache bypass for scan reads. The core concept is sound and the implementation covers many paths (direct I/O, async I/O, compressed/uncompressed blocks, stress testing). However, several correctness gaps exist where the new bypass and provider logic is not consistently applied across all read paths, and the IODispatcher sync fallback ignores the new settings entirely. High-severity findings (3):
The full review with 5 medium and 5 low findings, cross-component analysis, and positive observations is in ℹ️ About this responseGenerated by Claude Code. Limitations:
Commands:
|
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit dc9c2e6 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit dc9c2e6 SummaryLarge, well-structured PR that introduces read-scoped block buffer providers for scan reads, allowing caller-provided storage to bypass the block cache. The refactoring of High-severity findings (2):
Full review (click to expand)Findings🔴 HIGHH1. ReadSet::SyncRead hardcodes use_cache=true, bypassing cache bypass logic --
|
| Context | Affected? | Analysis |
|---|---|---|
| WritePreparedTxnDB | No | Provider only affects read paths, not visibility or transactions |
| ReadOnly DB / SecondaryInstance | Safe | Provider-backed reads work identically to normal reads |
| CompactionService / Remote | No | Provider is on ReadOptions which is per-read, not serialized |
| User-defined timestamps | Safe | Provider is orthogonal to key ordering |
| BlobDB | Safe | Blob reads use separate BlobFileReader::ReadFromFile, which was correctly updated |
| FIFO / Universal compaction | No | Provider only affects scan reads, not compaction |
| Prefix seek | Safe | Provider is block-level, independent of seek mode |
| Concurrent writers | Safe | Provider allocations use per-provider synchronization |
| Snapshots | Safe | Provider is memory management, not visibility |
Positive Observations
- The
StressReadScopedBlockBufferProvideris well-designed with comprehensive invariant checking (alignment, size, lifecycle tracking, abort on violation). - The
AlignedBufferexternal allocator design cleanly separates allocation policy from the buffer management, enabling provider-backed direct I/O without duplicating the read loop. - The
AllocateReadScopedBlockBuffer()validation function properly checks all lease contract requirements (non-null data, sufficient size, alignment, non-null cleanup). - The refactoring from
AlignedBuf(type alias forunique_ptr<void>) toAlignedBuffer(full class) for direct I/O paths is a good cleanup that makes the ownership model more explicit. - The
ShouldUseDataBlockCacheForIterator()centralization is a good pattern for consistent cache bypass decisions. - Compressed block decompression into provider-backed memory avoids an extra copy that would otherwise be needed.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
Summary
Add read-scoped block buffers for scan reads. Introduce an experimental ReadScopedBlockBufferProvider API and ReadOptions::read_scoped_block_buffer_provider so block-based table iterator and MultiScan data-block reads can use caller-provided read-scoped storage. Provider-backed scan reads bypass block cache, attach provider cleanup to uncompressed read buffers without copying, and decompress compressed blocks directly into provider-backed output.
Add an independent MultiScanArgs::bypass_block_cache / IODispatcher JobOptions bypass knob for scan reads that should skip data-block cache lookup and insertion while preserving fill_cache=false behavior.
Extend AlignedBuffer and RandomAccessFileReader direct-I/O paths to support external aligned allocations, then use that support for read-scoped iterator, async I/O, and MultiRead scratch buffers. Centralize read-scoped I/O policy, keep coalesced async reads safe when blocks are released before completion, and validate provider lease contracts.
Add focused coverage for read-scoped ownership, compressed and uncompressed blocks, direct I/O, cache bypass behavior, invalid provider leases, async release handling, and stress-test provider invariants. Add public API release notes for read-scoped block buffers.
Testing