perf: search + snapshot I/O optimizations #498
- Pre-lowercase query needle in searchInContent: eliminates redundant per-byte case conversion in the matchAtCaseInsensitive inner loop
- Reuse Tier 0 word_hits in Tier 4 instead of re-running word_index.search()
- Pre-compute is_doc language flag in Tier 0 sort: detectLanguage is called once per file during collection, not O(n log n) times during sort
- Bulk freq table I/O: read/write the 128KB frequency table in 1 syscall instead of 256 (both loadSnapshotValidated and loadSnapshotFast paths, plus writeSnapshot)
- Add test_bench.zig with fuzzyScore and detectLanguage microbenchmarks
- Note: fuzzyScore pointer-swap was benchmarked and disproven. @memcpy is faster in ReleaseFast (362ms vs 670ms for 300K calls) because the compiler optimizes it to SIMD moves

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
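The is_doc change can be illustrated with a minimal sketch, assuming Zig 0.11 or later. The `Hit` struct, `collect`, `isDocPath` (a stand-in for the project's `detectLanguage`, which is not shown here), and the "code before docs, then by score" ordering are all hypothetical. The point is only that the comparator reads a cached field instead of re-classifying the path on every comparison.

```zig
const std = @import("std");

// Counter used only to make the number of classification calls visible.
var detect_calls: usize = 0;

/// Hypothetical stand-in for detectLanguage: treats ".md" paths as docs.
fn isDocPath(path: []const u8) bool {
    detect_calls += 1;
    return std.mem.endsWith(u8, path, ".md");
}

/// Hypothetical search hit; is_doc is filled in once, at collection time.
const Hit = struct {
    path: []const u8,
    score: u32,
    is_doc: bool,
};

fn collect(path: []const u8, score: u32) Hit {
    return .{ .path = path, .score = score, .is_doc = isDocPath(path) };
}

/// Assumed ordering: code before docs, then higher score first.
/// Only cached fields are read, so no classification happens during the sort.
fn hitLessThan(_: void, a: Hit, b: Hit) bool {
    if (a.is_doc != b.is_doc) return !a.is_doc;
    return a.score > b.score;
}

test "classification runs once per hit, not once per comparison" {
    detect_calls = 0;
    var hits = [_]Hit{
        collect("README.md", 9),
        collect("src/a.zig", 3),
        collect("docs/x.md", 5),
        collect("src/b.zig", 7),
    };
    std.mem.sort(Hit, &hits, {}, hitLessThan);
    try std.testing.expectEqualStrings("src/b.zig", hits[0].path);
    try std.testing.expectEqual(@as(usize, 4), detect_calls);
}
```

With the flag cached, the classification cost is linear in the number of hits regardless of how many comparisons the sort performs.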
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d628a0756a
    const first_lower: u8 = if (query[0] >= 'A' and query[0] <= 'Z') query[0] + 32 else query[0];
    const first_upper: u8 = if (query[0] >= 'a' and query[0] <= 'z') query[0] - 32 else query[0];
    var query_lower_buf: [4096]u8 = undefined;
    if (query.len > query_lower_buf.len) return;
Remove fixed 4096-byte cutoff in content search
The new query_lower_buf guard returns immediately when query.len > 4096, which drops valid matches instead of searching. searchContent is used by CLI and server paths without a 4096-byte query cap, so long queries that previously worked (bounded only by content.len) now always return no results. This is a functional regression in search accuracy for long-input scenarios introduced by the optimization.
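One way to keep the pre-lowercased fast path without dropping long queries is to fall back to per-byte folding when the query does not fit the stack buffer. The following is a minimal sketch under stated assumptions, not the project's code: `findCaseInsensitive`, `matchAtPreLowered`, `matchAtFoldBoth`, and `lowerAscii` are hypothetical names, folding is ASCII-only as in the diff, and the syntax assumes Zig 0.11 or later.

```zig
const std = @import("std");

/// ASCII-only lowercase, mirroring the `+ 32` folding in the diff.
fn lowerAscii(c: u8) u8 {
    return if (c >= 'A' and c <= 'Z') c + 32 else c;
}

/// Fast path: the needle is already lowercase, so only the haystack byte is folded.
/// Caller guarantees pos <= haystack.len.
fn matchAtPreLowered(haystack: []const u8, pos: usize, needle_lower: []const u8) bool {
    if (needle_lower.len > haystack.len - pos) return false;
    for (needle_lower, 0..) |n, i| {
        if (lowerAscii(haystack[pos + i]) != n) return false;
    }
    return true;
}

/// Slow path: folds both sides per byte, so it needs no scratch buffer.
fn matchAtFoldBoth(haystack: []const u8, pos: usize, needle: []const u8) bool {
    if (needle.len > haystack.len - pos) return false;
    for (needle, 0..) |n, i| {
        if (lowerAscii(haystack[pos + i]) != lowerAscii(n)) return false;
    }
    return true;
}

/// Returns the index of the first case-insensitive match, or null.
/// Queries longer than the buffer take the slow path instead of returning early.
fn findCaseInsensitive(haystack: []const u8, query: []const u8) ?usize {
    if (query.len == 0 or query.len > haystack.len) return null;

    var buf: [4096]u8 = undefined;
    const fits = query.len <= buf.len;
    if (fits) {
        for (query, 0..) |c, i| buf[i] = lowerAscii(c);
    }

    const last = haystack.len - query.len;
    var pos: usize = 0;
    while (pos <= last) : (pos += 1) {
        const hit = if (fits)
            matchAtPreLowered(haystack, pos, buf[0..query.len])
        else
            matchAtFoldBoth(haystack, pos, query);
        if (hit) return pos;
    }
    return null;
}

test "finds a match regardless of case" {
    try std.testing.expectEqual(@as(?usize, 6), findCaseInsensitive("Hello World", "WORLD"));
}
```

The slow path costs one extra fold per byte, but only for queries above 4096 bytes, so the common case keeps the optimization while long queries keep their previous behavior.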
Benchmark Regression Report
Thresholds: 10.00% and 50,000 ns absolute delta
Summary
- Pre-lowercase query needle in searchInContent: eliminates redundant per-byte case conversion in the matchAtCaseInsensitive inner loop
- Reuse Tier 0 word_hits in Tier 4 instead of re-running word_index.search(query) (identical call, wasted work)
- Pre-compute is_doc flag in Tier 0 sort: detectLanguage is called once per file during collection, not O(n log n) times in the sort comparator
- Bulk freq table I/O: read/write the 128KB frequency table in 1 syscall instead of 256 (both loadSnapshotValidated and loadSnapshotFast paths, plus writeSnapshot)
- Add test_bench.zig with fuzzyScore and detectLanguage microbenchmarks (zig build test-bench -Doptimize=ReleaseFast)

All optimizations were verified against the actual code (not theoretical). The fuzzyScore pointer-swap was benchmarked and disproven: @memcpy is faster in ReleaseFast (362ms vs 670ms) because the compiler optimizes it to SIMD moves.

Test plan
- zig build test: all 8 test binaries pass
- zig build -Doptimize=ReleaseFast: clean build
- zig build test-bench -Doptimize=ReleaseFast: benchmarks run

🤖 Generated with Claude Code
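The bulk frequency-table I/O change can be sketched as follows, assuming Zig 0.11 or later. The `[256][256]u16` layout is an assumption inferred from the stated sizes (256 reads, 128KB total), and `CountingSource` is a hypothetical in-memory stand-in for a file that counts read calls. The real code would issue actual file reads instead.

```zig
const std = @import("std");

// Assumed layout: 256 rows * 256 entries * 2 bytes = 131072 bytes (128 KiB).
const Table = [256][256]u16;

/// Hypothetical stand-in for a file: copies from memory and counts calls,
/// where each call models one read syscall.
const CountingSource = struct {
    data: []const u8,
    pos: usize = 0,
    calls: usize = 0,

    fn readExact(self: *CountingSource, dest: []u8) void {
        @memcpy(dest, self.data[self.pos..][0..dest.len]);
        self.pos += dest.len;
        self.calls += 1;
    }
};

/// Row-at-a-time version: one read per 512-byte row, 256 reads in total.
fn loadRowByRow(src: *CountingSource, table: *Table) void {
    for (table) |*row| src.readExact(std.mem.asBytes(row));
}

/// Bulk version: view the whole table as bytes and fill it with one read.
fn loadBulk(src: *CountingSource, table: *Table) void {
    src.readExact(std.mem.asBytes(table));
}

test "bulk load yields the same table with a single read" {
    var bytes: [@sizeOf(Table)]u8 = undefined;
    for (&bytes, 0..) |*byte, i| byte.* = @truncate(i);

    var row_wise: Table = undefined;
    var bulk: Table = undefined;
    var s1 = CountingSource{ .data = &bytes };
    var s2 = CountingSource{ .data = &bytes };
    loadRowByRow(&s1, &row_wise);
    loadBulk(&s2, &bulk);

    try std.testing.expectEqual(@as(usize, 256), s1.calls);
    try std.testing.expectEqual(@as(usize, 1), s2.calls);
    try std.testing.expect(std.mem.eql(u8, std.mem.asBytes(&row_wise), std.mem.asBytes(&bulk)));
}
```

This works because a nested fixed-size array is contiguous in memory, so the byte view of the whole table is the concatenation of the byte views of its rows.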