bench: tensorstore CPU vs damacy GPU read+decode comparison by nclack · Pull Request #155 · nclack/damacy

nclack · 2026-06-12T23:56:34Z

Adds bench/tensorstore_bench.py, a scenario-driven CPU read+decode benchmark
using tensorstore, so the GPU (damacy) vs CPU (tensorstore) tradeoff for zarr v3
read+decode can be measured on identical data, chunking, and patch sampling.

- Consumes the existing Scenario JSON schema via bench/scenario.py (same as
  run.py), handling both the synthetic path (uris is None, arrays enumerated
  from uri_fmt/array_path/n_zarrs) and explicit uris.
- Opens arrays with tensorstore's zarr3 driver; skips wrong-rank/too-small arrays
  with the same filtering as the bench so array counts line up.
- Ports bench/main.c's xorshift64* RNG and draw order (array index, then
  per-axis start), so with a shared seed the sampled patches match damacy's
  bit-for-bit and both read the same bytes.
- --threads concurrency sweep (default 1,2,4,8,16,32) via tensorstore context
  limits + a bounded in-flight read window, reporting samples/s and GB/s per
  thread count and the best point. The CPU thread pool is the real comparison
  point against a single GPU decode stream.
- --drop-cache mirrors run.py's page-cache drop for cold-read measurement;
  --compare-with <damacy results.json> prints a head-to-head line.
- Emits a table plus a JSON summary shaped like the existing bench output.
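
For readers unfamiliar with the RNG port, here is a minimal sketch of a xorshift64* generator and the draw order described above (array index first, then one start per axis). It assumes the widely used 12/25/27 shift variant with multiplier 0x2545F4914F6CDD1D and a plain modulo reduction; bench/main.c may differ in either, and `XorShift64Star`, `sample_patch`, `shapes`, and `patch` are hypothetical names for illustration only.

```python
MASK64 = (1 << 64) - 1  # emulate uint64_t wraparound with Python ints


class XorShift64Star:
    """xorshift64* (12/25/27 variant). Assumed, not taken from bench/main.c."""

    def __init__(self, seed: int):
        # The state must be nonzero; fall back to an arbitrary constant.
        self.state = (seed & MASK64) or 0x9E3779B97F4A7C15

    def next_u64(self) -> int:
        x = self.state
        x ^= x >> 12
        x ^= (x << 25) & MASK64
        x ^= x >> 27
        self.state = x
        return (x * 0x2545F4914F6CDD1D) & MASK64


def sample_patch(rng, shapes, patch):
    """Draw the array index first, then one start per axis, in axis order."""
    i = rng.next_u64() % len(shapes)  # modulo reduction is an assumption
    starts = tuple(
        rng.next_u64() % (dim - p + 1) for dim, p in zip(shapes[i], patch)
    )
    return i, starts


# Two generators with the same seed yield the same patch sequence.
shapes = [(64, 128, 128), (32, 256, 256)]
patch = (16, 64, 64)
a, b = XorShift64Star(42), XorShift64Star(42)
assert [sample_patch(a, shapes, patch) for _ in range(8)] == [
    sample_patch(b, shapes, patch) for _ in range(8)
]
```

Matching patches bit-for-bit depends on both sides consuming draws in the same order, which is why the array index is drawn before any per-axis start.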

tensorstore fuses read+decode, so only total throughput is reported (no per-stage
split). Smoke-tested on a synthetic scenario and a uris scenario on a login node;
the full cold sweep runs on a compute node.
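
A rough sketch of how a thread sweep with a bounded in-flight window can produce the total samples/s and GB/s figures. This is not the code in bench/tensorstore_bench.py: a `ThreadPoolExecutor` stands in for tensorstore, `read_fn` is a hypothetical stand-in for a read+decode call, the window size of `2 * n` is an arbitrary assumption, and GB is taken as 1e9 bytes. With tensorstore, `submit` would instead return the future from something like `arr[slices].read()`.

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def run_windowed(submit, requests, window):
    """Issue requests with at most `window` futures in flight; return bytes read."""
    in_flight = deque()
    total_bytes = 0
    for req in requests:
        if len(in_flight) >= window:
            # Retire the oldest request before issuing a new one.
            total_bytes += len(in_flight.popleft().result())
        in_flight.append(submit(req))
    while in_flight:  # drain the tail
        total_bytes += len(in_flight.popleft().result())
    return total_bytes


def sweep(read_fn, requests, thread_counts):
    """Time one pass per thread count; report total throughput only."""
    rows = []
    for n in thread_counts:
        with ThreadPoolExecutor(max_workers=n) as pool:
            t0 = time.perf_counter()
            nbytes = run_windowed(
                lambda r: pool.submit(read_fn, r), requests, window=2 * n
            )
            dt = time.perf_counter() - t0
        rows.append(
            {
                "threads": n,
                "samples_per_s": len(requests) / dt,
                "GB_per_s": nbytes / dt / 1e9,
            }
        )
    best = max(rows, key=lambda r: r["samples_per_s"])
    return rows, best


# Stand-in workload: each "read" just allocates n bytes.
rows, best = sweep(lambda n: bytes(n), [1 << 16] * 256, [1, 2, 4])
for row in rows:
    print(row)
print("best:", best["threads"], "threads")
```

The window keeps memory bounded while still giving the pool enough queued work to stay busy; because read and decode are fused, only the end-to-end byte count and elapsed time are available to measure.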

Closes #153.

codecov · 2026-06-12T23:58:48Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.73%. Comparing base (17239ae) to head (537d29b).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #155   +/-   ##
=======================================
  Coverage   57.72%   57.73%           
=======================================
  Files          64       64           
  Lines       10055    10055           
  Branches     1750     1750           
=======================================
+ Hits         5804     5805    +1     
+ Misses       3501     3500    -1     
  Partials      750      750

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `57.73% <ø> (+<0.01%)` | ⬆️ |

see 2 files with indirect coverage changes


Small follow-up rolling up the remaining working-tree changes left out of #155. Two independent changes, one commit each:

- **bench: per-stage in/out throughput + load**: `bench/report.py`'s stage table now reports `GB/s_out` and `load%` (stage `ms_total` / wall) alongside the existing `GB/s_in`, so each pipeline stage shows input and output throughput and its share of the wall, making it obvious which stage bounds a run.
- **chore: pixi workspace config**: `[tool.pixi.*]` workspace/environments in `pyproject.toml`, `.pixi/*` ignored in `.gitignore`, and `pixi.lock` merge attributes in `.gitattributes`.

Co-authored-by: Nathan Clack <nclack@biohub.org>

96d6be8
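
As an illustration of the stage-table columns named in that commit message, the sketch below computes `GB/s_in`, `GB/s_out`, and `load%` from per-stage totals. Only the column names and the `ms_total` / wall definition come from the message; the `stage_row` helper, its `bytes_in`/`bytes_out` inputs, the stage names, and the sample numbers are hypothetical, and GB is assumed to mean 1e9 bytes.

```python
def stage_row(name, bytes_in, bytes_out, ms_total, wall_ms):
    """One row of a per-stage table: throughput in/out and share of wall time."""
    secs = ms_total / 1e3
    return {
        "stage": name,
        "GB/s_in": bytes_in / secs / 1e9,
        "GB/s_out": bytes_out / secs / 1e9,
        "load%": 100.0 * ms_total / wall_ms,  # stage ms_total / wall
    }


# Made-up numbers: a 1000 ms run with two stages that overlap in time.
wall_ms = 1000.0
rows = [
    stage_row("read", bytes_in=2e9, bytes_out=2e9, ms_total=500.0, wall_ms=wall_ms),
    stage_row("decode", bytes_in=2e9, bytes_out=8e9, ms_total=900.0, wall_ms=wall_ms),
]
bounding = max(rows, key=lambda r: r["load%"])
print(bounding["stage"])  # the stage with the largest share of wall time
```

If stages run concurrently, the `load%` values can sum to more than 100; the stage closest to 100% is the one that bounds the run.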

bench: tensorstore comparison harness

0652ec4

nclack mentioned this pull request Jun 12, 2026

bench report stage in/out + load, and pixi workspace config #156

Merged

Merge branch 'main' into bench-tensorstore-comparison

dacacb9

Nathan Clack added 2 commits June 15, 2026 04:32

bench: fair tensorstore vs damacy GB/s

6b5724d

bench: simplify tensorstore bench

537d29b

nclack merged commit dee3e66 into main Jun 15, 2026
6 checks passed

nclack deleted the bench-tensorstore-comparison branch June 15, 2026 04:47

nclack merged 4 commits into main from bench-tensorstore-comparison
