Skip to content

Small bundled fixes: #153, #94, #134, partial #64#155

Merged
nclack merged 5 commits into
acquire-project:mainfrom
nclack:small-fixes
Jun 17, 2026
Merged

Small bundled fixes: #153, #94, #134, partial #64#155
nclack merged 5 commits into
acquire-project:mainfrom
nclack:small-fixes

Conversation

@nclack

@nclack nclack commented Jun 13, 2026

Copy link
Copy Markdown
Member

Small, contained fixes for several open issues. One commit per issue so the
PR stays reviewable.

Commits

  • Guard flush against unsized layout — Closes Divide-by-zero in stream_flush_body when destroying a half-created GPU stream #153.
    stream_flush_body divided by ctx->layout.epoch_elements, which is 0 when
    tile_stream_gpu_create fails before binding the array (an engine_limits
    or stream_engine_init failure, both of which run before
    engine_array_state_init sets ctx->layout at src/gpu/stream.init.c:221).
    The destroy auto-flush then hit a SIGFPE. The guard early-returns when
    nothing was ever sized. (Justified by code-walk in the commit; a clean unit
    test would need to force a GPU init failure deterministically, which is
    GPU-dependent and fragile.)

  • Test uniform chunk bytes across LODs — references benchmark compression ratios for multiscale are not accurate #94.
    Adds a CPU-only test (test_uniform_chunk_bytes_across_levels in
    tests/test_lod_cpu.c) that builds a multiscale layout via
    compute_stream_layouts and asserts every level's
    agg_layout.max_comp_chunk_bytes is identical and equals the L0 chunk byte
    size (chunk_stride * bpe) for CODEC_NONE. Locks in the invariant the
    runtime relies on (batch_aggregate_layout_init asserts it;
    compute_stream_layouts feeds one max_output_size to every level).

  • Drop fill_value from ngff config — Closes fill_value is a zarr thing not an ngff thing #134.
    fill_value is a per-array zarr property: it is written to zarr array
    metadata (src/zarr/zarr_metadata.c:201) and already lives on
    zarr_array_config. The OME-NGFF multiscale config exposed a redundant
    copy. Removed it from ngff_multiscale_config; ngff-created arrays now take
    the zarr default. Every caller passed 0, so behavior is unchanged. The zarr
    metadata test (test_json_writer.c) asserts fill_value in zarr metadata
    (correct placement) and needed no change.

  • Emit memory estimate in bench JSON — references benchmark sweep: record memory usage #64 (estimate half only).
    The per-stream memory estimate is already computed and printed by the bench;
    this emits it into the JSON as two additive fields,
    memory_estimate_total_bytes (device bytes on GPU, heap bytes on CPU) and
    memory_estimate_pinned_bytes (pinned host bytes on GPU, 0 on CPU), so the
    sweep records it. Additive only: scripts/sweep/models.py
    (RunResult(extra="allow")) validates both old result files and freshly
    emitted ones — verified against both. The "actual RSS vs baseline" half of
    benchmark sweep: record memory usage #64 is out of scope, so benchmark sweep: record memory usage #64 stays open.

Dropped from the bundle

  • Per-LOD timing metrics can race the next batch's events (skew only, no data impact) #154 (per-LOD timing-metric skew) — DROPPED, not contained.
    The issue assumes a fix that snapshots/double-buffers the per-fc LOD timing
    "the same way the aggregate timing is already handled." On inspection that
    pattern does not transfer: the aggregate timing events are recorded at kick
    time and are safe only because the drain-before-rekick host rule keeps the
    same fc's slot drained before it is re-kicked. The LOD timing events are
    recorded at epoch-fill time (lod_run_epoch), which is not gated against
    the prior same-fc drain, so the identical handle snapshot would be a no-op
    against the value race (the snapshot copies a handle to the same event object
    the producer then re-records). Worse, t_end is dual-purpose — it is also
    the GPU_EDGE_LOD_DONE synchronization edge (bound at
    src/gpu/stream.init.c:177) — so relocating or re-keying these events
    touches load-bearing ordering, not just metrics. A correct fix needs extra
    timing generations or a new drain-before-refill gate, i.e. a new mechanism
    rather than a handful of lines. Since the bug is cosmetic (metrics-quality
    only, never wrong output bytes), that cost is not worth risking this PR.
    Left open with this rationale.

Validation

Built RelWithDebInfo with GPU on (sm_89, L40); ctest -E "(s3)" green; no new
build warnings; the new invariant test passes; the bench JSON emits the new
fields on both GPU and CPU runs.

Nathan Clack added 4 commits June 13, 2026 14:43
stream_flush_body divided by layout.epoch_elements, which is 0 when
tile_stream_gpu_create fails before binding the array (engine_limits or
stream_engine_init failure, both before engine_array_state_init sets
ctx->layout at stream.init.c:221). The destroy auto-flush then hit a
SIGFPE. Nothing was sized, so there is nothing to flush.

Closes acquire-project#153
fill_value is a per-array zarr property (written to zarr array metadata,
src/zarr/zarr_metadata.c:201) and already lives on zarr_array_config. The
ngff multiscale layer exposed a redundant copy. ngff-created arrays now use
the zarr default; all callers passed 0, so behavior is unchanged.

Closes acquire-project#134
@codecov

codecov Bot commented Jun 13, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
src/ngff/ngff_multiscale.c 58.79% <ø> (-0.18%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@nclack nclack merged commit 1fb07b8 into acquire-project:main Jun 17, 2026
8 checks passed
@nclack nclack deleted the small-fixes branch June 17, 2026 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Divide-by-zero in stream_flush_body when destroying a half-created GPU stream fill_value is a zarr thing not an ngff thing

1 participant