Optimize MultiScan dispatch for sorted blocks#14783
Conversation
|
| Check | Count |
|---|---|
cppcoreguidelines-special-member-functions |
1 |
| Total | 1 |
Details
db/db_iterator_test.cc (1 warning(s))
db/db_iterator_test.cc:4421:10: warning: class 'SyncPointGuard' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator [cppcoreguidelines-special-member-functions]
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D106301516. |
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit ae391b5 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit ae391b5 SummarySound optimization that eliminates redundant O(n log n) sorts in the MultiScan IO dispatch path by propagating a "pre-sorted" flag from the validation layer. The core invariant (sorted scan ranges produce offset-sorted block handles in block-based tables) is correct. One significant gap undermines effectiveness in production. High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1.
|
| Context | Affected? | Analysis |
|---|---|---|
| Standard block-based table | YES | Key order = offset order. Invariant holds. |
| Partitioned indexes | YES | Each partition still maintains key=offset order. Safe. |
| WritePreparedTxnDB | NO | Same iterator path, no effect on block ordering. |
| ReadOnly DB | NO | Same read path. |
| User-defined timestamps | NO | Timestamps don't affect block physical ordering. |
| LevelIterator (multi-file) | YES | H1 applies — flag lost via CopyConfigFrom(). |
| Direct BBTI::Prepare (tests) | YES | Works correctly when flag is propagated directly. |
Sorted-order invariant verification:
DBIter::ValidateScanOptions()rejects overlapping ranges (start[i] must be >= end[i-1])CollectBlockHandles()iterates the index forward viaSeek()+Next(), producing block handles in offset order- Overlap dedup (
check_overlapflag) only applies to the first block of each new scan range, correctly handling the case where adjacent non-overlapping ranges share a boundary block - The invariant holds: validated sorted non-overlapping scan ranges → offset-sorted block handles
Assumption stress-test:
- Claim: "scan ranges sorted by key → block handles sorted by offset"
- Precondition: block-based table stores data blocks in key order (true by design)
- Precondition: non-overlapping ranges processed in order produce monotonically advancing index positions (true: Seek starts at/after prior range's end)
- No counterexample found for standard block-based tables
Positive Observations
- The optimization is fail-safe:
block_handles_are_sorteddefaults tofalse, so all existing code paths and new code paths that don't explicitly set it continue to sort defensively. - The
block_indices_to_read.reserve()addition is a good micro-optimization. - Iterating cache lookups in offset order (the new behavior even for the unsorted path) is actually better for cache locality than the previous arbitrary-order iteration.
- The
BlockIndicesAreSortedByOffsetdebug assertion function is a clean way to validate the invariant without release overhead. - The TEST_SYNC_POINT placement enables precise testing of the sort-skip behavior.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
MultiScan collects data block handles in scan order after DBIter validates that the public scan ranges are sorted and non-overlapping. Propagate that validation through MultiScanArgs and IOJob so IODispatcher can skip its public-boundary sort when the submitted block handle vector is already ordered by file offset. SubmitJob remains the only dispatcher boundary that normalizes block order. When callers do not set block_handles_are_sorted, SubmitJob sorts the ReadSet index once and cache-miss collection walks that sorted index. Private helpers such as PreCoalesceBlocks and PrepareIORequests now consume already-sorted block index vectors instead of performing their own defensive sorts; debug assertions validate that precondition for future private-call changes. Keep the scan-ranges-sorted bit internal to MultiScanArgs: DBIter is the only setter after validation, while internal readers use a getter. CopyConfigFrom copies the bit so LevelIterator per-file scan arguments preserve the optimization in the production path. Tests: built db_iter_test, db_iterator_test, and io_dispatcher_test; ran timeout 60 ./db_iterator_test --gtest_filter=DBMultiScanIteratorTest/DBMultiScanIteratorTest.SortedRangesSkipIODispatcherSort/*; ran timeout 60 ./db_iter_test --gtest_filter=DBIteratorTest.PrepareMarksValidatedScanRangesSorted:DBIteratorTest.PrepareDoesNotForwardInvalidScanRanges; ran timeout 60 ./io_dispatcher_test --gtest_filter=IODispatcherTest.SortedBlockHandlesSkipDispatcherSorts:IODispatcherTest.VerifyReadRequestDetails:IODispatcherTest.VerifyCoalescing:IODispatcherTest.MemoryAccountingWithPartialGroupConsumption:IODispatcherTest.CoalescedGroupsSplitByMemoryBudget:IODispatcherTest.MemoryReleasedAfterReadIndexThenReleaseBlock; ran git diff --check.
ae391b5 to
38e616d
Compare
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D106301516. |
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit 38e616d ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit 38e616d SummaryThis PR eliminates three redundant O(N log N) sorts in the MultiScan IO dispatch pipeline by propagating a "sorted" flag from DBIter::Prepare() through to IODispatcher. The core invariant -- sorted, validated scan ranges imply sorted block handles within a single SST file -- has been verified as sound. The implementation is clean with appropriate encapsulation (private setter, friend class). One medium-severity concern about debug-only assertion safety in internal helpers, and a few low-severity suggestions. High-severity findings (0): No high-severity findings. Full review (click to expand)Findings🟡 MEDIUMM1. Debug-only assertions replacing sorts in internal helpers --
|
| Context | Does code execute? | Assumptions hold? | Action needed? |
|---|---|---|---|
| WritePreparedTxnDB | YES (same iterator path) | YES | None |
| ReadOnly DB | YES | YES | None |
| User-defined timestamps | YES | YES (CollectBlockHandles handles timestamps) | None |
| Direct BBT::Prepare (tests) | YES | Safe - default false |
None |
| Partitioned index | YES | YES (partitions maintain offset order) | None |
| Async IO fallback | YES | YES (fallback indices from sorted coalesced groups) | None |
| DrainPendingPrefetchGroups | YES | YES (sequential group processing preserves order) | None |
Assumption stress-test:
- Claim: "Sorted scan ranges imply sorted block handles" -- Verified:
CollectBlockHandlesseeks forward through the index iterator for each sorted range. Within a single SST file, the index iterator returns blocks in offset order. Boundary block deduplication (skip ifback() == current) preserves sort order. Confirmed sound. - Claim: "block_indices_to_read is built in sorted order after SubmitJob change" -- Verified: The loop now iterates
rs->sorted_block_indices_(which are raw indices sorted by offset). Cache misses push the raw indexiin sorted order. - Claim: "Fallback indices maintain sorted order" -- Verified:
ExecuteAsyncIOiterates coalesced groups in order, pushing block indices from each group sequentially.
Positive Observations
- Good encapsulation:
MarkScanRangesAsSorted()is private withfriend class DBIter, preventing incorrect external usage. - Conservative default:
block_handles_are_sorteddefaults tofalse, ensuring no change for existing callers. - Complete copy/move semantics:
scan_ranges_are_sorted_is correctly propagated in copy constructor, move constructor, copy assignment, move assignment, andCopyConfigFrom(). - Original MultiScanArgs unmodified: The test
PrepareMarksValidatedScanRangesSortedverifies the user'sMultiScanArgsobject is not mutated -- only the internal copy gets the flag. - Good test coverage: Three new tests cover: (1) flag propagation through Prepare, (2) rejection of invalid ranges, (3) end-to-end verification that sort is skipped via sync point, and (4) unit-level IODispatcher test with both sorted and unsorted paths.
- Debug assertion helper:
BlockIndicesAreSortedByOffsetis well-implemented with bounds checking and is cleanly guarded by#ifndef NDEBUG. - Performance improvement: Eliminates up to 3 O(N log N) sorts and 2 vector copies per SST file per MultiScan dispatch.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D106301516. |
| ASSERT_OK(db_->CompactRange({}, nullptr, nullptr)); | ||
| ASSERT_EQ(2, NumTableFilesAtLevel(49)); | ||
|
|
||
| struct SyncPointGuard { |
There was a problem hiding this comment.
nit: is this necessary, can't we just add it to end of test assuming test always passes
| // and non-overlapping at the public Iterator::Prepare API boundary. | ||
| // Internal iterators use this to preserve sorted block handle order and avoid | ||
| // repeated dispatcher-side sorting. | ||
| bool scan_ranges_are_sorted_ = false; |
There was a problem hiding this comment.
Is there a scenario when this is actually false? Seems like DBIter always sets it to true
Summary
DBIter::Prepare()throughMultiScanArgs.IODispatcher::SubmitJob()as the normalization boundary for unsorted callers, while allowing sorted callers to skip the defensive block-handle sort.Testing
CI, new unit test