perf: DirectIndexAgg for low-cardinality integer GROUP BY#157
Conversation
Key changes: - query_executor.cpp: Remove parallel_ requirement from use_vectorized gate (Fix 3: enable vectorized path for large scans without parallel mode) - query_executor.cpp: Resolve input_col_idx from aggregate function arguments (Fix 5: enable SUM/AVG to work on specific columns, not just COUNT(*)) - vectorized_operator.hpp: Add AVG support in update_accumulators and produce_output_batch - operator.cpp: Replace std::map with std::unordered_map + binary key encoding (Fix 1+2: O(log n) → O(1) hash, string concat → length-prefixed binary) Benchmark baseline (Q1 @ 100k rows): - Before fixes: ~152k rows/sec - After fixes: ~166k rows/sec - DuckDB: ~197M rows/sec The vectorized path is now enabled but VectorizedGroupByOperator needs further optimization (batch-oriented key construction) to close gap.
This commits the full parallel scan implementation: - VectorizedSeqScanOperator now accepts optional ThreadPool parameter - When enabled (>50k rows, >1 thread), splits table into range chunks and reads each range in parallel via thread pool - Added steal() to ColumnVector/NumericVector/StringVector for O(1) batch transfer via vector swap (avoids per-element copy) - Parallel results are collected and returned via steal() to out_batch The slow benchmark (2.6M/s) vs fast (598M/s for Q6/10k) suggests the GROUP BY aggregation is the bottleneck, not I/O. The parallel scan overhead (file opens/closes per task) likely exceeds I/O benefit at 100k row scale.
Implements direct array indexing for integer GROUP BY keys instead of string-based hashing. When GROUP BY is on a single integer column, uses slot = key - min_key for O(1) lookup without hash computation. Also adds steal() method to ColumnVector/NumericVector/StringVector for O(1) batch transfers via vector swap. Note: Benchmark still shows ~2.6M/s suggesting the optimization path may not be triggered, or there's another bottleneck in the execution path that needs investigation.
The estimated_rows variable was being redeclared inside a nested block, causing the GROUP BY heuristic check to always see stale (0) values. Also adds the condition to enable vectorized path when GROUP BY exists but num_rows==0 (ANALYZE not run), allowing DirectIndexAgg optimization to activate.
📝 WalkthroughWalkthroughThis PR accelerates vectorized query execution by adding parallel scan support with column stealing, introducing memory-mapped storage for zero-copy reads, and implementing a direct-index aggregation fast path for low-cardinality integer GROUP BY operations. Plan selection now uses estimated row cardinality to drive vectorization decisions and provisions a ThreadPool for parallel scans. ChangesVectorized Execution and Parallel Processing
Sequence Diagram(s)sequenceDiagram
participant SeqScan as VectorizedSeqScanOperator
participant Pool as ThreadPool
participant Table as ColumnarTable
participant Out as OutputBatch
SeqScan->>SeqScan: partition remaining rows per thread
SeqScan->>Pool: submit read_batch tasks (per-range)
Pool->>Table: ColumnarTable::read_batch(range)
Table-->>Pool: per-thread result batches
Pool-->>SeqScan: tasks complete
SeqScan->>Out: steal/move columns from per-thread batches
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (3)
src/executor/operator.cpp (1)
577-605: ⚡ Quick winHoist
keyoutside the loop and prefertry_emplaceto avoid per-row allocations.
std::string key;is default-constructed inside the loop and thenreserve(64)is called every iteration, which forces a heap allocation per input row (SSO capacity is well under 64). Hoisting the buffer above thewhileand usingclear()to reset preserves capacity across rows and removes that per-row malloc/free pair from this hot path. While here,findfollowed byemplacedoes two hash probes on the miss path;try_emplacecollapses it to one and avoids constructing the key string twice.♻️ Proposed refactor
Tuple tuple; auto child_schema = child_->output_schema(); + std::string key; + key.reserve(64); // reused across rows; capacity is preserved by clear() while (child_->next(tuple)) { - std::string key; - key.reserve(64); // Pre-reserve to avoid repeated allocations + key.clear(); std::vector<common::Value> gb_vals; @@ - auto it = groups_map.find(key); - if (it == groups_map.end()) { - it = groups_map.emplace(key, GroupState(aggregates_.size())).first; - it->second.group_values = std::move(gb_vals); + auto [it, inserted] = groups_map.try_emplace(key, GroupState(aggregates_.size())); + if (inserted) { + it->second.group_values = std::move(gb_vals); }Also note the comment on Line 585 ("no string allocation") is slightly misleading —
val.to_string()still allocates for non-null values; only the per-iterationkeybuffer can be made allocation-free.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/executor/operator.cpp` around lines 577 - 605, Hoist the std::string key buffer out of the per-row loop and reserve its capacity once (e.g., key.reserve(64) before the loop) then call key.clear() each iteration to preserve capacity and avoid per-row heap allocations; keep gb_vals as a local per-row vector but reuse it similarly if needed. Replace the find+emplace pattern on groups_map with a single try_emplace call (using key as the lookup/insert key and constructing GroupState(aggregates_.size()) only on miss) to eliminate the double hash probe and duplicate key construction; after insertion/update, assign it->second.group_values = std::move(gb_vals) as before. Apply these changes around the existing symbols key, group_by_, gb_vals, groups_map, try_emplace, GroupState, and aggregates_.src/executor/query_executor.cpp (1)
1515-1517: ⚡ Quick winThread pool recreation on every vectorized query.
A new
ThreadPoolis created for each vectorized query execution (line 1515). Thread pool creation incurs overhead from spawning worker threads. For workloads with many short queries, this repeated construction/destruction degrades throughput. Consider caching a sharedThreadPoolat theQueryExecutorinstance level for reuse across queries.♻️ Suggested approach
Add a member to
QueryExecutor:std::shared_ptr<ThreadPool> thread_pool_;Initialize once in the constructor:
QueryExecutor::QueryExecutor(...) : ..., thread_pool_(std::make_shared<ThreadPool>(std::thread::hardware_concurrency())) {}Reuse in
build_vectorized_plan:-auto thread_pool = std::make_shared<executor::ThreadPool>(std::thread::hardware_concurrency()); std::unique_ptr<VectorizedOperator> current_root = - std::make_unique<VectorizedSeqScanOperator>(base_table_name, col_table, thread_pool); + std::make_unique<VectorizedSeqScanOperator>(base_table_name, col_table, thread_pool_);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/executor/query_executor.cpp` around lines 1515 - 1517, The vectorized path currently creates a new ThreadPool per query; add a std::shared_ptr<executor::ThreadPool> thread_pool_ member to QueryExecutor, initialize it once in QueryExecutor's constructor (using std::thread::hardware_concurrency() or configured size), and update build_vectorized_plan (where ThreadPool is currently created and used to construct VectorizedSeqScanOperator) to reuse thread_pool_ instead of making a fresh pool; ensure lifetime is managed by the QueryExecutor instance and remove the local std::make_shared<executor::ThreadPool> creation in build_vectorized_plan.include/executor/vectorized_operator.hpp (1)
80-83: 💤 Low valueConsider making parallel scan threshold configurable.
The hardcoded 50,000 row threshold for enabling parallel scan may not be optimal for all workloads. The crossover point depends on thread pool size, column count, column width, and system characteristics. Consider making this threshold configurable or using a more sophisticated cost model.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/executor/vectorized_operator.hpp` around lines 80 - 83, The code currently hardcodes the parallel-scan enable threshold (50000) in the block that sets num_threads_ and parallel_enabled_; replace this magic number with a configurable parameter (e.g., parallel_scan_threshold_) that is read from operator configuration or a global runtime config and fall back to 50000 if unset, expose it via the VectorizedOperator constructor or a setter so callers can tune it, and update the check to set parallel_enabled_ = table_->row_count() > parallel_scan_threshold_; optionally document the new parameter where operator options are defined.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/executor/types.hpp`:
- Around line 247-252: The doc for virtual void steal(ColumnVector&& other)
promises that 'other' is emptied after stealing but current implementations only
swap buffers; update each steal implementation (e.g., ColumnVector::steal and
StringVector::steal) to perform the swap and then explicitly clear or reset the
moved-from 'other' (call its clear/reset method, set size/length to zero, and
release/empty any auxiliary buffers) so that after the call 'other' is empty
while 'this' holds the original data; ensure you use the class-specific
clear/reset APIs to leave 'other' in a valid empty state.
In `@include/executor/vectorized_operator.hpp`:
- Around line 385-392: The GroupSlot struct has sums_float64 sized as
MAX_AGGREGATES / 2 which can be indexed up to aggregates_.size() in
process_input_batch_direct(), causing OOB access; change sums_float64 to match
sums_int64 (size MAX_AGGREGATES) so all aggregate indexes
(0..aggregates_.size()-1) are valid, and update any related assumptions/comments
referencing half-size to reflect the new full-size array; ensure GroupSlot,
sums_float64, MAX_AGGREGATES and process_input_batch_direct usage remain
consistent.
- Around line 417-431: The find_or_insert method currently grows slots_ to cover
min_key_/max_key_ which can allocate huge memory for sparse/wide keys; modify
find_or_insert(int64_t key1, int64_t key2) to check the computed new_size
(max_key_ - min_key_ + 1) against a configured safe threshold (e.g.,
MAX_SLOT_RANGE) before resizing, and if it exceeds the threshold switch to or
fallback on a sparse container (e.g., an unordered_map or existing hash-based
path) or return an error/limit-hit indicator instead of resizing; ensure you
reference and update min_key_, max_key_, and slots_ only when within limits and
add clear handling in callers of find_or_insert for the fallback/limit
condition.
In `@include/storage/memory_mapped_column.hpp`:
- Around line 45-54: The helpers data_at() and null_at() currently do pointer
arithmetic without validating row_idx; update both functions to return nullptr
if data_region_.addr or null_region_.addr is null OR if row_idx >= row_count_,
ensuring you check bounds before computing static_cast<char*>(data_region_.addr)
+ row_idx * element_size_ (and similarly for null_region_); this prevents
out-of-bounds pointers — use the existing members row_count_, element_size_,
data_region_.addr and null_region_.addr to perform the checks and only perform
the pointer arithmetic when within range.
In `@src/executor/query_executor.cpp`:
- Around line 1709-1711: The code unsafely casts the size_t from
current_root->output_schema().find_column(col->name()) to int; first capture the
result into a size_t (e.g., auto idx =
current_root->output_schema().find_column(col->name())), check explicitly for
the not-found sentinel (idx == size_t(-1)) and handle that case (set
info.input_col_idx = -1 or raise/log an error), otherwise safely
static_cast<int>(idx) into info.input_col_idx; update references to
info.input_col_idx and use col->name() in the check to preserve current
behavior.
---
Nitpick comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 80-83: The code currently hardcodes the parallel-scan enable
threshold (50000) in the block that sets num_threads_ and parallel_enabled_;
replace this magic number with a configurable parameter (e.g.,
parallel_scan_threshold_) that is read from operator configuration or a global
runtime config and fall back to 50000 if unset, expose it via the
VectorizedOperator constructor or a setter so callers can tune it, and update
the check to set parallel_enabled_ = table_->row_count() >
parallel_scan_threshold_; optionally document the new parameter where operator
options are defined.
In `@src/executor/operator.cpp`:
- Around line 577-605: Hoist the std::string key buffer out of the per-row loop
and reserve its capacity once (e.g., key.reserve(64) before the loop) then call
key.clear() each iteration to preserve capacity and avoid per-row heap
allocations; keep gb_vals as a local per-row vector but reuse it similarly if
needed. Replace the find+emplace pattern on groups_map with a single try_emplace
call (using key as the lookup/insert key and constructing
GroupState(aggregates_.size()) only on miss) to eliminate the double hash probe
and duplicate key construction; after insertion/update, assign
it->second.group_values = std::move(gb_vals) as before. Apply these changes
around the existing symbols key, group_by_, gb_vals, groups_map, try_emplace,
GroupState, and aggregates_.
In `@src/executor/query_executor.cpp`:
- Around line 1515-1517: The vectorized path currently creates a new ThreadPool
per query; add a std::shared_ptr<executor::ThreadPool> thread_pool_ member to
QueryExecutor, initialize it once in QueryExecutor's constructor (using
std::thread::hardware_concurrency() or configured size), and update
build_vectorized_plan (where ThreadPool is currently created and used to
construct VectorizedSeqScanOperator) to reuse thread_pool_ instead of making a
fresh pool; ensure lifetime is managed by the QueryExecutor instance and remove
the local std::make_shared<executor::ThreadPool> creation in
build_vectorized_plan.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b5a44b60-77fa-4086-ab79-c50bf70f0427
📒 Files selected for processing (7)
benchmarks/duckdb_comparison_bench.cppinclude/executor/types.hppinclude/executor/vectorized_operator.hppinclude/storage/memory_mapped_column.hppsrc/executor/operator.cppsrc/executor/query_executor.cppsrc/storage/memory_mapped_column.cpp
| /** | ||
| * @brief Steals data from another column vector by swapping internal buffers. | ||
| * After steal(), 'other' is emptied and 'this' holds the original data from 'other'. | ||
| * Throws std::runtime_error if types are incompatible. | ||
| */ | ||
| virtual void steal(ColumnVector&& other) = 0; |
There was a problem hiding this comment.
Semantic contract mismatch between documentation and implementation.
The documentation states "After steal(), 'other' is emptied," but the implementations (lines 339-345, 411-417) only swap internal buffers without clearing other. After steal(), both vectors remain valid with swapped contents rather than other being empty. This violates the documented contract and could mislead callers who rely on other being emptied.
🔄 Proposed fix: clear 'other' after swapping
Update implementations to explicitly clear other after swapping:
void steal(ColumnVector&& other) override {
auto* other_num = dynamic_cast<NumericVector<T>*>(&other);
if (!other_num) throw std::runtime_error("NumericVector::steal: type mismatch");
data_.swap(other_num->data_);
null_bitmap_.swap(other_num->null_bitmap_);
std::swap(size_, other_num->size_);
+ other_num->data_.clear();
+ other_num->null_bitmap_.clear();
+ other_num->size_ = 0;
}Apply the same pattern to StringVector::steal at lines 411-417.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@include/executor/types.hpp` around lines 247 - 252, The doc for virtual void
steal(ColumnVector&& other) promises that 'other' is emptied after stealing but
current implementations only swap buffers; update each steal implementation
(e.g., ColumnVector::steal and StringVector::steal) to perform the swap and then
explicitly clear or reset the moved-from 'other' (call its clear/reset method,
set size/length to zero, and release/empty any auxiliary buffers) so that after
the call 'other' is empty while 'this' holds the original data; ensure you use
the class-specific clear/reset APIs to leave 'other' in a valid empty state.
| const void* data_at(size_t row_idx) const { | ||
| if (!data_region_.addr) return nullptr; | ||
| return static_cast<char*>(data_region_.addr) + row_idx * element_size_; | ||
| } | ||
|
|
||
| // Direct pointer to null bit at row index | ||
| const uint8_t* null_at(size_t row_idx) const { | ||
| if (!null_region_.addr) return nullptr; | ||
| return static_cast<uint8_t*>(null_region_.addr) + row_idx; | ||
| } |
There was a problem hiding this comment.
Add bounds validation to pointer arithmetic helpers.
data_at() and null_at() perform pointer arithmetic without validating that row_idx < row_count_. Returning out-of-bounds pointers can lead to undefined behavior when dereferenced by callers. Even if the caller is expected to validate, defensive bounds checking here prevents subtle bugs.
🛡️ Proposed fix: add bounds checking
const void* data_at(size_t row_idx) const {
- if (!data_region_.addr) return nullptr;
+ if (!data_region_.addr || row_idx >= row_count_) return nullptr;
return static_cast<char*>(data_region_.addr) + row_idx * element_size_;
}
const uint8_t* null_at(size_t row_idx) const {
- if (!null_region_.addr) return nullptr;
+ if (!null_region_.addr || row_idx >= row_count_) return nullptr;
return static_cast<uint8_t*>(null_region_.addr) + row_idx;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const void* data_at(size_t row_idx) const { | |
| if (!data_region_.addr) return nullptr; | |
| return static_cast<char*>(data_region_.addr) + row_idx * element_size_; | |
| } | |
| // Direct pointer to null bit at row index | |
| const uint8_t* null_at(size_t row_idx) const { | |
| if (!null_region_.addr) return nullptr; | |
| return static_cast<uint8_t*>(null_region_.addr) + row_idx; | |
| } | |
| const void* data_at(size_t row_idx) const { | |
| if (!data_region_.addr || row_idx >= row_count_) return nullptr; | |
| return static_cast<char*>(data_region_.addr) + row_idx * element_size_; | |
| } | |
| // Direct pointer to null bit at row index | |
| const uint8_t* null_at(size_t row_idx) const { | |
| if (!null_region_.addr || row_idx >= row_count_) return nullptr; | |
| return static_cast<uint8_t*>(null_region_.addr) + row_idx; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@include/storage/memory_mapped_column.hpp` around lines 45 - 54, The helpers
data_at() and null_at() currently do pointer arithmetic without validating
row_idx; update both functions to return nullptr if data_region_.addr or
null_region_.addr is null OR if row_idx >= row_count_, ensuring you check bounds
before computing static_cast<char*>(data_region_.addr) + row_idx * element_size_
(and similarly for null_region_); this prevents out-of-bounds pointers — use the
existing members row_count_, element_size_, data_region_.addr and
null_region_.addr to perform the checks and only perform the pointer arithmetic
when within range.
| info.input_col_idx = static_cast<int>( | ||
| current_root->output_schema().find_column(col->name())); | ||
| } |
There was a problem hiding this comment.
Unsafe cast of size_t to int for column index.
find_column() returns size_t(-1) to indicate "not found" (line 1710). Casting size_t(-1) directly to int is implementation-defined and may not reliably produce -1. On platforms where sizeof(size_t) > sizeof(int), the cast can produce unexpected values. Check for "not found" explicitly before casting.
🔧 Proposed fix: explicit not-found check
if (!func->args().empty()) {
const auto& arg = func->args()[0];
if (arg->type() == parser::ExprType::Column) {
const auto* col = dynamic_cast<const parser::ColumnExpr*>(arg.get());
if (col != nullptr) {
- info.input_col_idx = static_cast<int>(
- current_root->output_schema().find_column(col->name()));
+ size_t col_idx = current_root->output_schema().find_column(col->name());
+ info.input_col_idx = (col_idx != static_cast<size_t>(-1))
+ ? static_cast<int>(col_idx)
+ : -1;
}
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| info.input_col_idx = static_cast<int>( | |
| current_root->output_schema().find_column(col->name())); | |
| } | |
| size_t col_idx = current_root->output_schema().find_column(col->name()); | |
| info.input_col_idx = (col_idx != static_cast<size_t>(-1)) | |
| ? static_cast<int>(col_idx) | |
| : -1; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/executor/query_executor.cpp` around lines 1709 - 1711, The code unsafely
casts the size_t from current_root->output_schema().find_column(col->name()) to
int; first capture the result into a size_t (e.g., auto idx =
current_root->output_schema().find_column(col->name())), check explicitly for
the not-found sentinel (idx == size_t(-1)) and handle that case (set
info.input_col_idx = -1 or raise/log an error), otherwise safely
static_cast<int>(idx) into info.input_col_idx; update references to
info.input_col_idx and use col->name() in the check to preserve current
behavior.
…_insert - Remove unused valid_slot_indices_ member (produces_output_batch_direct iterates direct_group_keys_ instead) - Remove orphaned valid_slots() accessor - Fix find_or_insert to mark slot.valid=true on first insertion (was relying on accidental post-condition in caller)
- Open-addressing hash table with linear probing and FNV-1a 64-bit hash - Binary key encoding: [type_tag (1B)][len (4B)][data...] - avoids string allocation - HashBucket stores key_bytes inline for fast comparison, is_new flag for iteration - Added to VectorizedGroupByOperator as fallback when DirectIndexAgg doesn't apply - Fixed grow() to use key_data instead of casting key_int64 Note: process_input_batch_open_addressing() is written but not yet wired into process_input_batch() - the old hash path is still being used. This is intentional to allow incremental testing.
Instead of returning an error for GROUP BY keys larger than 56 bytes, use std::vector<uint8_t> heap_key when stack buffer is too small. Also fix hash_agg_.find_or_insert call to use key_ptr instead of key_buf when heap allocation was used.
- Add slot() accessor to OpenAddressHashAgg for bucket iteration - Add produce_output_batch_open_addressing() to iterate hash_agg_ buckets - Wire it into produce_output_batch() instead of the old hash path - Keep process_input_batch_hash as dead code for now (for reference)
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
include/executor/vectorized_operator.hpp (1)
68-72:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winBuild failure:
ThreadPoolis undeclared.The pipeline failures confirm that
ThreadPoolis used but never included or forward-declared. This breaks the entire CI build across all configurations.Add the missing include or forward declaration at the top of the file:
🐛 Proposed fix
`#include` "executor/types.hpp" `#include` "parser/expression.hpp" `#include` "storage/columnar_table.hpp" +#include "concurrency/thread_pool.hpp" // or appropriate header for ThreadPoolIf
ThreadPoolis defined elsewhere, ensure the correct header path is used.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/executor/vectorized_operator.hpp` around lines 68 - 72, The build fails because ThreadPool is referenced by the member thread_pool_ in vectorized_operator.hpp but ThreadPool is neither included nor forward-declared; fix by either adding the proper header include that defines ThreadPool at the top of include/executor/vectorized_operator.hpp or, if ThreadPool is a class/struct declared in another header, add a forward declaration (e.g., class ThreadPool;) before it is used; ensure the chosen fix resolves the reference for the member std::shared_ptr<ThreadPool> thread_pool_ and compiles across all configurations.
♻️ Duplicate comments (1)
include/executor/vectorized_operator.hpp (1)
394-394:⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
sums_float64array is undersized forMAX_AGGREGATES.Same issue as identified in
DirectIndexAgg::GroupSlot:sums_float64is declared with sizeMAX_AGGREGATES / 2(4 elements) while aggregate indices can range from 0 toaggregates_.size() - 1. If more than 4 aggregates are used,bucket.sums_float64[i]at lines 824-825 causes out-of-bounds access.🐛 Proposed fix: match array sizes
- double sums_float64[MAX_AGGREGATES / 2] = {0.0}; + double sums_float64[MAX_AGGREGATES] = {0.0};🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/executor/vectorized_operator.hpp` at line 394, The local array sums_float64 is declared too small (double sums_float64[MAX_AGGREGATES / 2]) which can overflow when aggregates_.size() > MAX_AGGREGATES/2; change its declaration to use the full MAX_AGGREGATES (e.g. double sums_float64[MAX_AGGREGATES]) so accesses like bucket.sums_float64[i] (used in VectorizedOperator / GroupSlot logic) cannot go out of bounds and match the other aggregate arrays.
🧹 Nitpick comments (1)
include/executor/vectorized_operator.hpp (1)
728-733: 💤 Low valueDead code: slot validity check is always true.
After
find_or_insert()returns,slot.validis alwaystrue(set at line 579), so the conditionif (!slot.valid)at line 729 is never entered. This makes lines 730-732 unreachable.♻️ Proposed cleanup
size_t slot_idx = agg_.find_or_insert(key, 0); auto& slot = agg_.slot(slot_idx); - if (!slot.valid) { - slot.valid = true; - slot.key1 = key; + if (slot.key1 != key) { // First time seeing this key + slot.key1 = key; direct_group_keys_.push_back(key); }Or simply track new keys differently (e.g., via a return value from
find_or_insert).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/executor/vectorized_operator.hpp` around lines 728 - 733, The conditional checking slot.valid after calling find_or_insert() is dead because find_or_insert() always sets slot.valid to true; update the API so the caller can tell if the slot was newly inserted and only then push to direct_group_keys_. Concretely, change find_or_insert(...) to return a boolean (or a pair/struct) indicating "was_inserted" (or return an enum) alongside the slot reference, then replace the if (!slot.valid) branch with a check on that returned flag and push key to direct_group_keys_ only when was_inserted is true; update call sites that use find_or_insert() (the caller around direct_group_keys_ and the find_or_insert() implementation) accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 804-807: The code is pushing bucket.key_int64 into
hash_group_keys_ even for non-int64 GROUP BY keys, causing uninitialized values
to be emitted; update the open-addressing path in vectorized_operator.hpp so
that when a new bucket is created you either store the full key (decoded from
bucket.key_data) instead of key_int64, or only allow this path for direct int64
keys (use the same is_direct_indexable_ guard) and fall back to the hash-based
path for string/complex keys; ensure produce_output_batch_open_addressing() can
retrieve the correct key form you store (reference bucket.key_data,
bucket.key_int64, hash_group_keys_, produce_output_batch_open_addressing(), and
is_direct_indexable_ to locate the change).
- Around line 444-446: find_or_insert() copies key_len bytes into
bucket.key_data (a fixed 64-byte buffer) without bounds checking, causing a
buffer overflow when key_len > 64; update find_or_insert() to guard the copy: if
key_len <= sizeof(bucket.key_data) copy into bucket.key_data as now, otherwise
set bucket.key_len and bucket.key_type appropriately and store the full key into
the existing dynamic storage (heap_key / key_string path) instead of copying
into key_data; also update key comparison logic and grow()'s rehashing to
consult key_string/heap_key when bucket.key_len > 64 so long keys are compared
and moved safely; ensure process_input_batch_open_addressing() still routes long
keys to the heap path but that find_or_insert()/grow() use the same overflow
branch.
---
Outside diff comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 68-72: The build fails because ThreadPool is referenced by the
member thread_pool_ in vectorized_operator.hpp but ThreadPool is neither
included nor forward-declared; fix by either adding the proper header include
that defines ThreadPool at the top of include/executor/vectorized_operator.hpp
or, if ThreadPool is a class/struct declared in another header, add a forward
declaration (e.g., class ThreadPool;) before it is used; ensure the chosen fix
resolves the reference for the member std::shared_ptr<ThreadPool> thread_pool_
and compiles across all configurations.
---
Duplicate comments:
In `@include/executor/vectorized_operator.hpp`:
- Line 394: The local array sums_float64 is declared too small (double
sums_float64[MAX_AGGREGATES / 2]) which can overflow when aggregates_.size() >
MAX_AGGREGATES/2; change its declaration to use the full MAX_AGGREGATES (e.g.
double sums_float64[MAX_AGGREGATES]) so accesses like bucket.sums_float64[i]
(used in VectorizedOperator / GroupSlot logic) cannot go out of bounds and match
the other aggregate arrays.
---
Nitpick comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 728-733: The conditional checking slot.valid after calling
find_or_insert() is dead because find_or_insert() always sets slot.valid to
true; update the API so the caller can tell if the slot was newly inserted and
only then push to direct_group_keys_. Concretely, change find_or_insert(...) to
return a boolean (or a pair/struct) indicating "was_inserted" (or return an
enum) alongside the slot reference, then replace the if (!slot.valid) branch
with a check on that returned flag and push key to direct_group_keys_ only when
was_inserted is true; update call sites that use find_or_insert() (the caller
around direct_group_keys_ and the find_or_insert() implementation) accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 53c94c16-99dc-4b16-8045-561114c03eb8
📒 Files selected for processing (1)
include/executor/vectorized_operator.hpp
| bucket.key_len = static_cast<uint32_t>(key_len); | ||
| bucket.key_type = key[0]; | ||
| std::memcpy(bucket.key_data, key, key_len); |
There was a problem hiding this comment.
Buffer overflow when key_len > 64.
key_data is a fixed 64-byte buffer, but find_or_insert() copies key_len bytes without bounds checking. While process_input_batch_open_addressing() switches to heap_key when the encoded key exceeds 64 bytes (lines 789-793), that larger key is still passed to find_or_insert() which blindly copies it into the 64-byte key_data buffer.
🐛 Proposed fix: guard the copy or use dynamic storage
Option 1 - Guard the copy and use key_string for overflow:
HashBucket& find_or_insert(const uint8_t* key, size_t key_len, uint64_t hash) {
// ... existing code ...
if (!bucket.occupied) {
bucket.occupied = true;
bucket.is_new = true;
bucket.key_hash = hash;
bucket.key_len = static_cast<uint32_t>(key_len);
bucket.key_type = key[0];
- std::memcpy(bucket.key_data, key, key_len);
+ if (key_len <= sizeof(bucket.key_data)) {
+ std::memcpy(bucket.key_data, key, key_len);
+ } else {
+ bucket.key_string.assign(reinterpret_cast<const char*>(key), key_len);
+ }Then update the key comparison and grow() rehashing to check key_string when key_len > 64.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@include/executor/vectorized_operator.hpp` around lines 444 - 446,
find_or_insert() copies key_len bytes into bucket.key_data (a fixed 64-byte
buffer) without bounds checking, causing a buffer overflow when key_len > 64;
update find_or_insert() to guard the copy: if key_len <= sizeof(bucket.key_data)
copy into bucket.key_data as now, otherwise set bucket.key_len and
bucket.key_type appropriately and store the full key into the existing dynamic
storage (heap_key / key_string path) instead of copying into key_data; also
update key comparison logic and grow()'s rehashing to consult
key_string/heap_key when bucket.key_len > 64 so long keys are compared and moved
safely; ensure process_input_batch_open_addressing() still routes long keys to
the heap path but that find_or_insert()/grow() use the same overflow branch.
- Remove unused key_string field from HashBucket (saves 32 bytes/bucket) - Delete process_input_batch_hash dead code (never called) - Fix self-assignment warning (resumed_bucket_idx_ = resumed_bucket_idx_) - Add trailing newlines to MemoryMappedColumn files
…Operator The constructor signature changed to accept an optional thread_pool parameter, but the test constructions weren't updated.
…canOperator construction Line 57 uses direct construction, not make_unique, so wasn't caught by the previous nullptr fix.
g++ was treating ThreadPool as int in forward declaration resolution. Include the header to ensure the type is fully defined.
Store actual common::Value objects in hash_group_keys_ instead of just int64_t keys, preserving column type (TEXT, INT64, etc.) for correct output emission.
- Add mins/maxes/has_mins fields to HashBucket for MIN/MAX tracking - Wire MIN/MAX into process_input_batch_open_addressing accumulator loop - Emit MIN/MAX in produce_output_batch_open_addressing - Fix VectorizedSeqScanOperator construction in vectorized_operator_tests
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/storage/memory_mapped_column.cpp (2)
17-35:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClean up existing mappings before remapping.
Calling
map()twice on the same instance leaks the previous data mapping:data_region_is overwritten on Line 35 before the old region is released. Eitherunmap()at the top ofmap()or reject remapping while already mapped.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/storage/memory_mapped_column.cpp` around lines 17 - 35, MemoryMappedColumn::map currently overwrites data_region_ (and similarly null_region_ if present) when called a second time, leaking the previous mmap and file descriptor; before creating new mappings in map(), call the class's unmap() helper (or implement unmap() if missing) to munmap existing regions and close their fds, or alternatively detect an existing mapping and return false to reject remapping; update MemoryMappedColumn::map to invoke unmap() at the start (or guard with a check on data_region_.ptr/file descriptor) so the old mapping is properly released before assigning data_region_ and null_region_.
23-57:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate file sizes before exposing row-based access.
map()never checks that the mapped lengths are consistent withrow_count/element_size. A truncated column can map successfully and thendata_at()/null_at()will hand out pointers beyond the mapped region; empty columns also hitmmap(..., 0, ...), which fails. Please add overflow-safe size checks before mapping and special-case the zero-row path instead of callingmmapwith a zero length.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/storage/memory_mapped_column.cpp` around lines 23 - 57, The map() implementation maps data and null files without validating that st.st_size matches the expected sizes derived from row_count and element_size, and it calls mmap with length 0 for empty columns; update map() to compute expected_data_size = row_count * element_size and expected_null_size = (row_count + 7) / 8 using overflow-safe multiplication/checks, verify st.st_size == expected size (or at least >= expected size) before calling mmap, and treat zero-row cases specially by not calling mmap (set data_region_ / null_region_ to an empty sentinel and keep their fds closed) and return false/cleanup if sizes mismatch; ensure unmap()/::close() are called on any early failures to avoid leaks.include/storage/memory_mapped_column.hpp (1)
19-35:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake
MemoryMappedColumnnon-copyable (and explicitly handle move semantics).
MemoryMappedColumnowns raw mmap pointers and file descriptors, but it’s implicitly copyable; a copy would duplicate the handles and both destructors would latermunmap/closethe same resources. Delete the copy constructor and copy assignment (and explicitly decide move ctor/assignment, since current destructor suppresses implicit moves).Proposed fix
class MemoryMappedColumn { public: + MemoryMappedColumn(const MemoryMappedColumn&) = delete; + MemoryMappedColumn& operator=(const MemoryMappedColumn&) = delete; ~MemoryMappedColumn() { unmap(); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/storage/memory_mapped_column.hpp` around lines 19 - 35, The class MemoryMappedColumn must be made non-copyable and support move semantics: delete the copy constructor and copy assignment operator, and add a noexcept move constructor and move assignment that transfer ownership of MappedRegion fields (data_region_ and null_region_) and scalar state (element_size_, is_fixed_width_, row_count_) to the destination, leaving the source regions with addr=nullptr, size=0, fd=-1 and resetting scalar state as appropriate; ensure unmap() and the destructor continue to work and that moved-from objects are safe to destroy. Use the existing MappedRegion struct and unmap() helper when implementing moves so file descriptors and mmap pointers are not duplicated or closed twice.
♻️ Duplicate comments (2)
include/executor/vectorized_operator.hpp (2)
391-398:⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
sums_float64can still overflow for later aggregate positions.
bucket.sums_float64[i]is indexed by the aggregate slot, but the backing array is still onlyMAX_AGGREGATES / 2. A fifth FLOATSUM/AVGwill write past the end of the bucket.DirectIndexAgg::GroupSlothas the same layout, so the direct path still has the same memory corruption risk.Also applies to: 821-832
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/executor/vectorized_operator.hpp` around lines 391 - 398, The sums_float64 array is too small (MAX_AGGREGATES/2) causing out-of-bounds writes when more than half the aggregates are FLOAT SUM/AVG; change its declaration and any matching layout in DirectIndexAgg::GroupSlot (and the other occurrence around the 821-832 region) to allocate space for MAX_AGGREGATES (or otherwise compute exact per-aggregate float count) and adjust initializers (e.g., {0.0}) accordingly so each aggregate slot maps safely to bucket.sums_float64[i] without overflow.
399-401:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winFixed-size
key_data[64]still breaks wide GROUP BY keys.
hash_group_keys_fixes output reconstruction, but insertion/rehash still stores the encoded key inbucket.key_data[64]. Any composite key longer than 64 bytes will still overflow the bucket duringfind_or_insert()/grow(), so the open-addressing path is not safe for wide keys yet.Also applies to: 804-805
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/executor/vectorized_operator.hpp` around lines 399 - 401, The bucket's fixed-size key_data[64] overflows for wide composite GROUP BY keys; replace the in-bucket fixed buffer with an indirection (e.g., a key_ptr / key_index plus key_len and key_type) so buckets store either a pointer to heap-allocated key bytes or an index into a separate contiguous key storage (small-buffer-optimization can still be used in that external storage). Update all code paths that read/write bucket.key_data: change bucket layout (remove key_data[64], add key_ptr/key_index and key_len/key_type), then modify find_or_insert(), grow(), and any rehash/move logic to allocate/copy keys into the external storage and move pointers/indices instead of memcpy’ing into key_data; ensure proper allocation, deallocation, and copy semantics and that hash_group_keys_ reconstruction uses the same external storage semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 835-846: DirectIndexAgg’s fast path currently only updates
COUNT/SUM/AVG, so add handling for AggregateType::Min and AggregateType::Max in
the DirectIndexAgg update and emit logic (mirror the OpenAddressHashAgg
behavior): when agg.input_col_idx >= 0 read the value from
batch.get_column(agg.input_col_idx), check nulls, and update the per-bucket
fields bucket.mins[i], bucket.maxes[i] and bucket.has_mins[i] (and initialize
both min/max on first seen) just like the OpenAddressHashAgg branch; also ensure
the emit/finalize code for DirectIndexAgg outputs MIN/MAX correctly (use the has
flag to decide NULL vs value). Apply the same changes in the second occurrence
noted around the other block (the region referenced by lines 1027-1041).
In `@tests/vectorized_operator_tests.cpp`:
- Line 117: The test suite uses inconsistent calls to VectorizedSeqScanOperator;
inspect the constructor declaration for VectorizedSeqScanOperator to determine
whether the third parameter is required or has a default, then make a consistent
fix: either add a default value to the constructor signature in the class (e.g.,
give the third parameter a default nullptr) or update every test call site (all
uses of VectorizedSeqScanOperator(...) across tests) to pass the third argument
(nullptr) so signatures match; search for all call sites and adjust them, and
update the header/implementation where the constructor is declared/defined to
keep tests compiling.
---
Outside diff comments:
In `@include/storage/memory_mapped_column.hpp`:
- Around line 19-35: The class MemoryMappedColumn must be made non-copyable and
support move semantics: delete the copy constructor and copy assignment
operator, and add a noexcept move constructor and move assignment that transfer
ownership of MappedRegion fields (data_region_ and null_region_) and scalar
state (element_size_, is_fixed_width_, row_count_) to the destination, leaving
the source regions with addr=nullptr, size=0, fd=-1 and resetting scalar state
as appropriate; ensure unmap() and the destructor continue to work and that
moved-from objects are safe to destroy. Use the existing MappedRegion struct and
unmap() helper when implementing moves so file descriptors and mmap pointers are
not duplicated or closed twice.
In `@src/storage/memory_mapped_column.cpp`:
- Around line 17-35: MemoryMappedColumn::map currently overwrites data_region_
(and similarly null_region_ if present) when called a second time, leaking the
previous mmap and file descriptor; before creating new mappings in map(), call
the class's unmap() helper (or implement unmap() if missing) to munmap existing
regions and close their fds, or alternatively detect an existing mapping and
return false to reject remapping; update MemoryMappedColumn::map to invoke
unmap() at the start (or guard with a check on data_region_.ptr/file descriptor)
so the old mapping is properly released before assigning data_region_ and
null_region_.
- Around line 23-57: The map() implementation maps data and null files without
validating that st.st_size matches the expected sizes derived from row_count and
element_size, and it calls mmap with length 0 for empty columns; update map() to
compute expected_data_size = row_count * element_size and expected_null_size =
(row_count + 7) / 8 using overflow-safe multiplication/checks, verify st.st_size
== expected size (or at least >= expected size) before calling mmap, and treat
zero-row cases specially by not calling mmap (set data_region_ / null_region_ to
an empty sentinel and keep their fds closed) and return false/cleanup if sizes
mismatch; ensure unmap()/::close() are called on any early failures to avoid
leaks.
---
Duplicate comments:
In `@include/executor/vectorized_operator.hpp`:
- Around line 391-398: The sums_float64 array is too small (MAX_AGGREGATES/2)
causing out-of-bounds writes when more than half the aggregates are FLOAT
SUM/AVG; change its declaration and any matching layout in
DirectIndexAgg::GroupSlot (and the other occurrence around the 821-832 region)
to allocate space for MAX_AGGREGATES (or otherwise compute exact per-aggregate
float count) and adjust initializers (e.g., {0.0}) accordingly so each aggregate
slot maps safely to bucket.sums_float64[i] without overflow.
- Around line 399-401: The bucket's fixed-size key_data[64] overflows for wide
composite GROUP BY keys; replace the in-bucket fixed buffer with an indirection
(e.g., a key_ptr / key_index plus key_len and key_type) so buckets store either
a pointer to heap-allocated key bytes or an index into a separate contiguous key
storage (small-buffer-optimization can still be used in that external storage).
Update all code paths that read/write bucket.key_data: change bucket layout
(remove key_data[64], add key_ptr/key_index and key_len/key_type), then modify
find_or_insert(), grow(), and any rehash/move logic to allocate/copy keys into
the external storage and move pointers/indices instead of memcpy’ing into
key_data; ensure proper allocation, deallocation, and copy semantics and that
hash_group_keys_ reconstruction uses the same external storage semantics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 92252dce-f6c8-4554-9418-f723006711b9
📒 Files selected for processing (5)
include/executor/vectorized_operator.hppinclude/storage/memory_mapped_column.hppsrc/storage/memory_mapped_column.cpptests/analytics_tests.cpptests/vectorized_operator_tests.cpp
| int64_t mins[MAX_AGGREGATES] = {0}; | ||
| int64_t maxes[MAX_AGGREGATES] = {0}; | ||
| bool has_mins[MAX_AGGREGATES] = {false}; // Track if initialized |
There was a problem hiding this comment.
MIN/MAX in the open-addressing path is hard-coded to int64_t.
This path forces every MIN/MAX input through to_int64() and always emits make_int64(...). That changes both the value and the result type for non-INT64 aggregates, while the legacy hash path still preserves the original common::Value type. Either store typed common::Values here as well, or decline the open-addressing optimization for non-INT64 MIN/MAX.
Also applies to: 835-846, 1027-1038
| } else if ((agg.type == AggregateType::Min || agg.type == AggregateType::Max) && | ||
| agg.input_col_idx >= 0) { | ||
| const auto& col = batch.get_column(agg.input_col_idx); | ||
| if (!col.is_null(r)) { | ||
| auto val = col.get(r).to_int64(); | ||
| if (!bucket.has_mins[i]) { | ||
| bucket.mins[i] = val; | ||
| bucket.maxes[i] = val; | ||
| bucket.has_mins[i] = true; | ||
| } else { | ||
| bucket.mins[i] = std::min(bucket.mins[i], val); | ||
| bucket.maxes[i] = std::max(bucket.maxes[i], val); |
There was a problem hiding this comment.
This only wires MIN/MAX for one aggregation path.
The new cases update and emit extrema for OpenAddressHashAgg, but DirectIndexAgg still only handles COUNT/SUM/AVG. Integer GROUP BY queries that take the direct-index fast path can therefore return NULL/default results for MIN/MAX unless that path is updated too, or explicitly opted out of the optimization.
Also applies to: 1027-1041
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@include/executor/vectorized_operator.hpp` around lines 835 - 846,
DirectIndexAgg’s fast path currently only updates COUNT/SUM/AVG, so add handling
for AggregateType::Min and AggregateType::Max in the DirectIndexAgg update and
emit logic (mirror the OpenAddressHashAgg behavior): when agg.input_col_idx >= 0
read the value from batch.get_column(agg.input_col_idx), check nulls, and update
the per-bucket fields bucket.mins[i], bucket.maxes[i] and bucket.has_mins[i]
(and initialize both min/max on first seen) just like the OpenAddressHashAgg
branch; also ensure the emit/finalize code for DirectIndexAgg outputs MIN/MAX
correctly (use the has flag to decide NULL vs value). Apply the same changes in
the second occurrence noted around the other block (the region referenced by
lines 1027-1041).
|
|
||
| auto table_ptr = std::make_shared<ColumnarTable>(table); | ||
| VectorizedSeqScanOperator scan("sequential_scan", table_ptr); | ||
| VectorizedSeqScanOperator scan("sequential_scan", table_ptr, nullptr); |
There was a problem hiding this comment.
Verify constructor signature and update all call sites consistently.
Only one VectorizedSeqScanOperator constructor call in this file is updated with the third nullptr argument (Line 117), but dozens of other call sites throughout the file (Lines 41, 63, 91, 154, 184, 227, 267, 297, 330, 367, 395, 426, and many more in GROUP BY and JOIN tests) still use the 2-parameter constructor. This inconsistency suggests either:
- If the third parameter is required: The update is incomplete and will cause compilation failures.
- If the third parameter has a default value: The selective update is confusing and inconsistent.
Run the following script to check the VectorizedSeqScanOperator constructor signature and identify all unconverted call sites:
#!/bin/bash
# Description: Verify VectorizedSeqScanOperator constructor signature and find all call sites
# Check constructor declarations in header file
echo "=== Constructor signature(s) ==="
rg -n -A2 'VectorizedSeqScanOperator\s*\(' include/executor/vectorized_operator.hpp
echo -e "\n=== 2-parameter call sites in tests ==="
# Find all 2-parameter constructor calls (table_ptr); or table));
rg -n 'VectorizedSeqScanOperator\([^,]+,\s*[^,)]+\);' tests/
echo -e "\n=== 3-parameter call sites in tests ==="
# Find all 3-parameter constructor calls with nullptr
rg -n 'VectorizedSeqScanOperator\([^,]+,\s*[^,]+,\s*nullptr\);' tests/🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/vectorized_operator_tests.cpp` at line 117, The test suite uses
inconsistent calls to VectorizedSeqScanOperator; inspect the constructor
declaration for VectorizedSeqScanOperator to determine whether the third
parameter is required or has a default, then make a consistent fix: either add a
default value to the constructor signature in the class (e.g., give the third
parameter a default nullptr) or update every test call site (all uses of
VectorizedSeqScanOperator(...) across tests) to pass the third argument
(nullptr) so signatures match; search for all call sites and adjust them, and
update the header/implementation where the constructor is declared/defined to
keep tests compiling.
Summary
Performance
Test plan
make -j4 duckdb_comparison_bench./build/duckdb_comparison_bench --benchmark_filter=BM_CloudSQL_Q[16]/100000 --benchmark_repetitions=3Summary by CodeRabbit
New Features
Performance
Bug Fixes
Tests