test(bench): sustained-duration measurement for MemWAL HNSW parity bench by touch-of-grey · Pull Request #7010 · lance-format/lance

touch-of-grey · 2026-05-30T19:49:45Z

Summary

Adds sustained-duration measurement to the MemWAL HNSW parity bench so it reports steady-state throughput under continuous load rather than over a short burst. This is a follow-up to the AVX-512 distance work in #7009.

Changes (bench-only, no library changes):

--insert-seconds / --query-seconds: run the write (graph build) and read (query) workloads in a loop for a fixed wall-clock duration; report aggregate throughput over all passes (insert_passes / query_passes).
insert_core breakdown: times the insertion itself separately from per-build graph allocation + teardown.
Both knobs added to the Lance bench and the hnswlib reference bench; run_parity_suite.sh gains INSERT_SECONDS / QUERY_SECONDS.
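
To make the loop-for-a-duration idea concrete, here is a minimal sketch of how such a measurement could be structured. It is not the code from this PR: `Sustained`, `run_sustained`, and `build_pass` are hypothetical names, and the "graph" is a plain `Vec` stand-in. The point is only the shape: repeat whole passes until a wall-clock budget is spent, sum operations across all passes, and keep a separate timer for the core section so allocation and teardown can be reported apart from it.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Aggregate result of a sustained run (hypothetical type, not from the PR).
#[derive(Debug)]
pub struct Sustained {
    pub passes: u64,    // number of complete passes executed
    pub ops: u64,       // operations summed over all passes
    pub wall: Duration, // total wall-clock time, including alloc + teardown
    pub core: Duration, // time spent inside the timed core section only
}

impl Sustained {
    /// End-to-end throughput: counts allocation and teardown time.
    pub fn ops_per_sec_wall(&self) -> f64 {
        self.ops as f64 / self.wall.as_secs_f64()
    }
}

/// Repeats `pass` until at least `min_duration` has elapsed.
/// Always runs one full pass, so a pass longer than the budget still counts.
/// `pass` returns (operations done, time spent in its core section).
pub fn run_sustained<F>(min_duration: Duration, mut pass: F) -> Sustained
where
    F: FnMut() -> (u64, Duration),
{
    let start = Instant::now();
    let mut passes = 0;
    let mut ops = 0;
    let mut core = Duration::ZERO;
    loop {
        let (n, c) = pass();
        passes += 1;
        ops += n;
        core += c;
        if start.elapsed() >= min_duration {
            break;
        }
    }
    Sustained { passes, ops, wall: start.elapsed(), core }
}

/// Stand-in for one graph build: allocate, run the timed "insert" loop, drop.
/// Only the loop in the middle is charged to the core timer.
pub fn build_pass(rows: usize) -> (u64, Duration) {
    let mut graph = vec![0u32; rows]; // allocation: outside the core timer
    let t = Instant::now();
    for (i, slot) in graph.iter_mut().enumerate() {
        *slot = i as u32; // placeholder for a real insert
    }
    let core = t.elapsed();
    black_box(&graph); // keep the writes from being optimized away
    drop(graph); // teardown: outside the core timer
    (rows as u64, core)
}
```

Calling `run_sustained(Duration::from_secs(30), || build_pass(rows))` would correspond to a 30 s write window under these assumptions. Comparing `ops / core` against `ops / wall` is what separates the insertion-only figure from the end-to-end figure.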

Motivation: a sub-second query window gave noisy/optimistic numbers and hid AVX-512 frequency throttling. Measuring 30 s of continuous load makes read/write parity (and where it doesn't hold) reproducible.

Latest perf results (merged main, c7i.12xlarge, 48 threads, dim=1024, m=12, ef=64, k=10)

Sustained 30 s read + 30 s write per size; AVX-512 throttles 3.78 GHz → ~2.5 GHz under all-core load (affects both impls).

Read (query_qps), Lance / hnswlib:

rows	ratio
100k	1.01
500k	0.995
1M	0.996

Write — insertion compute only (insert_core), Lance / hnswlib:

rows	ratio
100k	0.99
500k	0.98
1M	0.96

Write — end-to-end incl. per-build graph alloc + teardown:

rows	ratio
100k	0.96
500k	0.89
1M	0.87

Takeaways the improved bench makes visible:

Read is at parity under sustained throttled load. This confirms that the result of #7009 (perf(mem_wal): match hnswlib throughput via runtime AVX-512 f32 distance) holds; the burst window wasn't hiding a regression.
Insertion compute is at parity — AVX-512 distance keeps pace even while downclocked.
The end-to-end write gap at scale is entirely graph allocation/teardown (Lance's per-node Vec/Mutex/Arc vs hnswlib's flat arrays), not the algorithm — and it's allocator-sensitive: with mimalloc/jemalloc as the global allocator Lance is actually faster than hnswlib (≈1.08–1.25×). No in-tree change is warranted; using a modern allocator for the memtable workload closes it.
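
The allocation difference named in the last takeaway can be illustrated with a small sketch. This is not Lance's or hnswlib's actual data structure: `PerNodeGraph` and `FlatGraph` are hypothetical types that only mirror the two shapes described above (a per-node `Vec` behind `Mutex`/`Arc`, versus one flat array with a fixed stride per node). The per-node form costs at least one heap allocation and one deallocation per node on every build; the flat form costs one of each per graph.

```rust
use std::sync::{Arc, Mutex};

/// Per-node layout (illustrative): every node owns its own heap-allocated
/// neighbor list, wrapped in Arc + Mutex for shared concurrent access.
pub struct PerNodeGraph {
    nodes: Vec<Arc<Mutex<Vec<u32>>>>,
}

impl PerNodeGraph {
    pub fn new(n: usize, m: usize) -> Self {
        // n separate Arc allocations, plus a Vec buffer each when m > 0.
        let nodes = (0..n)
            .map(|_| Arc::new(Mutex::new(Vec::with_capacity(m))))
            .collect();
        Self { nodes }
    }

    pub fn add_edge(&self, from: usize, to: u32) {
        self.nodes[from].lock().unwrap().push(to);
    }

    pub fn neighbors(&self, node: usize) -> Vec<u32> {
        self.nodes[node].lock().unwrap().clone()
    }
}

/// Flat layout (illustrative): one allocation for the whole graph.
/// Node i owns m + 1 consecutive slots; the first slot stores the count.
pub struct FlatGraph {
    m: usize,
    slots: Vec<u32>,
}

impl FlatGraph {
    pub fn new(n: usize, m: usize) -> Self {
        Self { m, slots: vec![0; n * (m + 1)] }
    }

    /// Returns false when the node already has m neighbors.
    pub fn add_edge(&mut self, from: usize, to: u32) -> bool {
        let base = from * (self.m + 1);
        let len = self.slots[base] as usize;
        if len == self.m {
            return false;
        }
        self.slots[base + 1 + len] = to;
        self.slots[base] += 1;
        true
    }

    pub fn neighbors(&self, node: usize) -> &[u32] {
        let base = node * (self.m + 1);
        let len = self.slots[base] as usize;
        &self.slots[base + 1..base + 1 + len]
    }
}
```

Because the per-node form issues many small allocations and frees per build, its build and teardown cost depends heavily on how the global allocator handles them, which is consistent with the allocator sensitivity reported above.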

cc @jackye1995 — please review.

Add --insert-seconds (rebuild the graph in a loop) and --query-seconds (loop the query workload) to both the Lance and hnswlib HNSW benches so throughput reflects steady-state under continuous AVX-512 load rather than a short burst, plus an insert_core breakdown that times insertion separately from per-build graph allocation/teardown. The parity-suite driver gains INSERT_SECONDS/QUERY_SECONDS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

touch-of-grey · 2026-05-30T19:50:06Z

@jackye1995 ping for review. Bench-only follow-up to #7009: adds 30s sustained read/write measurement + insert_core breakdown. Latest results in the description — sustained read at parity (0.99-1.01x), insertion compute at parity (0.96-0.99x); the end-to-end write gap at scale is purely graph alloc/teardown (allocator-sensitive: mimalloc/jemalloc make Lance faster), not the algorithm.

codecov · 2026-05-30T20:30:03Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


jackye1995

Looks good to me

claude Bot reviewed May 30, 2026


github-actions Bot added the chore label May 30, 2026

jackye1995 approved these changes May 31, 2026


jackye1995 merged commit 5ed51d3 into lance-format:main May 31, 2026
29 checks passed

touch-of-grey deleted the SustainedHnsw branch May 31, 2026 00:17

jackye1995 merged 1 commit into lance-format:main from touch-of-grey:SustainedHnsw
