Small bundled fixes: #153, #94, #134, partial #64 #155
Merged
added 4 commits
June 13, 2026 14:43
stream_flush_body divided by layout.epoch_elements, which is 0 when tile_stream_gpu_create fails before binding the array (engine_limits or stream_engine_init failure, both before engine_array_state_init sets ctx->layout at stream.init.c:221). The destroy auto-flush then hit a SIGFPE. Nothing was sized, so there is nothing to flush. Closes acquire-project#153
fill_value is a per-array zarr property (written to zarr array metadata, src/zarr/zarr_metadata.c:201) and already lives on zarr_array_config. The ngff multiscale layer exposed a redundant copy. ngff-created arrays now use the zarr default; all callers passed 0, so behavior is unchanged. Closes acquire-project#134
Codecov Report✅ All modified and coverable lines are covered by tests.
Small, contained fixes for several open issues. One commit per issue so the
PR stays reviewable.
Commits
Guard flush against unsized layout — Closes Divide-by-zero in stream_flush_body when destroying a half-created GPU stream #153.
stream_flush_body divided by ctx->layout.epoch_elements, which is 0 when tile_stream_gpu_create fails before binding the array (an engine_limits or
stream_engine_init failure, both of which run before engine_array_state_init sets ctx->layout at src/gpu/stream.init.c:221). The destroy auto-flush then hit a SIGFPE. The guard early-returns when
nothing was ever sized. (Justified by code-walk in the commit; a clean unit
test would need to force a GPU init failure deterministically, which is
GPU-dependent and fragile.)
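The shape of the guard can be sketched in plain C as follows. This is only an illustration under assumptions: sketch_layout, sketch_stream, and sketch_flush_epochs are hypothetical stand-ins, not the project's real types or the actual body of stream_flush_body, and the floor division is a simplification of whatever the real flush computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the layout; the real struct is not shown here. */
struct sketch_layout {
    uint64_t epoch_elements; /* stays 0 until the array is bound and sized */
};

/* Hypothetical stand-in for the stream context. */
struct sketch_stream {
    struct sketch_layout layout;
    uint64_t cursor; /* elements appended so far */
};

/* Returns the number of complete epochs covered by the cursor,
 * or 0 when the layout was never sized. */
static uint64_t sketch_flush_epochs(const struct sketch_stream *ctx)
{
    assert(ctx != NULL);
    /* Guard: integer division by zero is undefined behavior in C and
     * typically raises SIGFPE on x86, so bail out before dividing. */
    if (ctx->layout.epoch_elements == 0)
        return 0;
    return ctx->cursor / ctx->layout.epoch_elements;
}
```

With the guard in place, a zero-initialized context (the state a failed create would plausibly leave behind) flushes as a no-op instead of trapping.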
Test uniform chunk bytes across LODs — references benchmark compression ratios for multiscale are not accurate #94.
Adds a CPU-only test (test_uniform_chunk_bytes_across_levels in tests/test_lod_cpu.c) that builds a multiscale layout via compute_stream_layouts and asserts every level's agg_layout.max_comp_chunk_bytes is identical and equals the L0 chunk byte size (chunk_stride * bpe) for CODEC_NONE. Locks in the invariant the runtime relies on (batch_aggregate_layout_init asserts it; compute_stream_layouts feeds one max_output_size to every level).
Drop fill_value from ngff config — Closes fill_value is a zarr thing not an ngff thing #134.
fill_value is a per-array zarr property: it is written to zarr array metadata (src/zarr/zarr_metadata.c:201) and already lives on zarr_array_config. The OME-NGFF multiscale config exposed a redundant copy. Removed it from ngff_multiscale_config; ngff-created arrays now take the zarr default. Every caller passed 0, so behavior is unchanged. The zarr
metadata test (test_json_writer.c) asserts fill_value in zarr metadata (correct placement) and needed no change.
Emit memory estimate in bench JSON — references benchmark sweep: record memory usage #64 (estimate half only).
The per-stream memory estimate is already computed and printed by the bench;
this emits it into the JSON as two additive fields, memory_estimate_total_bytes (device bytes on GPU, heap bytes on CPU) and memory_estimate_pinned_bytes (pinned host bytes on GPU, 0 on CPU), so the sweep records it. Additive only: scripts/sweep/models.py (RunResult(extra="allow")) validates both old result files and freshly emitted ones (verified against both). The "actual RSS vs baseline" half of
benchmark sweep: record memory usage #64 is out of scope, so #64 stays open.
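As a rough illustration of what "additive" means here, the two fields can be appended as extra key/value pairs next to the existing ones. The helper below is a hypothetical sketch, not the bench's actual JSON writer; only the two field names come from the description above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats the two memory-estimate fields as a JSON
 * fragment (no surrounding braces) into buf.
 * Returns the number of characters written, or -1 if buf is too small. */
static int sketch_emit_memory_fields(char *buf, size_t cap,
                                     uint64_t total_bytes,
                                     uint64_t pinned_bytes)
{
    assert(buf != NULL && cap > 0);
    int n = snprintf(buf, cap,
                     "\"memory_estimate_total_bytes\": %" PRIu64
                     ", \"memory_estimate_pinned_bytes\": %" PRIu64,
                     total_bytes, pinned_bytes);
    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

On a CPU run the caller would pass 0 for pinned_bytes. Because the fragment only adds keys, a reader that tolerates unknown fields keeps accepting result files written before the change.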
Dropped from the bundle
The issue assumes a fix that snapshots/double-buffers the per-fc LOD timing
"the same way the aggregate timing is already handled." On inspection that
pattern does not transfer: the aggregate timing events are recorded at kick
time and are safe only because the drain-before-rekick host rule keeps the
same fc's slot drained before it is re-kicked. The LOD timing events are
recorded at epoch-fill time (lod_run_epoch), which is not gated against the prior same-fc drain, so the identical handle snapshot would be a no-op
against the value race (the snapshot copies a handle to the same event object
the producer then re-records). Worse, t_end is dual-purpose: it is also the GPU_EDGE_LOD_DONE synchronization edge (bound at src/gpu/stream.init.c:177), so relocating or re-keying these events touches load-bearing ordering, not just metrics. A correct fix needs extra
timing generations or a new drain-before-refill gate, i.e. a new mechanism
rather than a handful of lines. Since the bug is cosmetic (metrics-quality
only, never wrong output bytes), that cost is not worth risking this PR.
Left open with this rationale.
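The reason a handle snapshot does not help can be modeled in plain C, with no GPU involved. Everything here is a hypothetical simplification: sketch_event stands in for a timing event, a pointer stands in for its handle, and the two-slot array is only one reading of "extra timing generations". It ignores the GPU_EDGE_LOD_DONE ordering role entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a timing event; the handle is just a pointer. */
struct sketch_event {
    double timestamp_ms;
};
typedef struct sketch_event *sketch_event_handle;

/* Producer side: (re-)records a timestamp into the event object. */
static void sketch_record(sketch_event_handle ev, double now_ms)
{
    assert(ev != NULL);
    ev->timestamp_ms = now_ms;
}

/* Handle snapshot: the copy still aliases the one event object, so a
 * re-record before the drain overwrites the value the consumer wanted. */
static double sketch_read_with_handle_snapshot(void)
{
    struct sketch_event ev = { 0.0 };
    sketch_record(&ev, 10.0);        /* epoch N */
    sketch_event_handle snap = &ev;  /* consumer snapshots the handle */
    sketch_record(&ev, 20.0);        /* epoch N+1 re-records first */
    return snap->timestamp_ms;       /* epoch N's value is lost */
}

/* Separate generations: the producer alternates event objects, so the
 * snapshot keeps pointing at an object that is not re-recorded yet. */
static double sketch_read_with_generations(void)
{
    struct sketch_event gen[2] = { { 0.0 }, { 0.0 } };
    sketch_record(&gen[0], 10.0);       /* epoch N into slot 0 */
    sketch_event_handle snap = &gen[0];
    sketch_record(&gen[1], 20.0);       /* epoch N+1 into slot 1 */
    return snap->timestamp_ms;          /* epoch N's value survives */
}
```

In this model the first function returns the later timestamp and the second returns the earlier one, which is the difference between copying a handle and adding a generation.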
Validation
Built RelWithDebInfo with GPU on (sm_89, L40);
ctest -E "(s3)" green; no new build warnings; the new invariant test passes; the bench JSON emits the new
fields on both GPU and CPU runs.