🤖 from Claude
Problem
The multi-chunk-per-worker writer (#30 item 3, landed in #84) accumulates all K chunk outputs in memory before any are written. `process_shard` (`src/zagg/processing/worker.py`) reads the shard's granules once, then loops `iter_chunks`, appending each chunk's `(block_index, carrier, ragged)` to the `chunk_results` sink (`worker.py:344`), and returns; the write happens afterward in a separate loop (runner `_process_and_write`; `deployment/aws/lambda_handler.py:280-283`). So peak memory holds all K carriers + all K ragged payloads at once, on top of the pooled shard reads. For K>1 the worker could instead write each chunk and free it, capping the output-side footprint at one chunk.

Scope: what this does NOT fix
This only helps K>1. At K=1 (shard == chunk; e.g. `parent_order: 13` with `chunk_inner` dropped) there is exactly one chunk, so there is nothing to accumulate and the change is a no-op. It also does not touch the read-phase memory (pooled photons + the #66 h5coro per-granule cache leak), which is the dominant OOM driver for dense ATL03 AOIs regardless of K. So this is an efficiency win for the multi-chunk path, complementary to #66, not a substitute: the current order-13 OOMs are #66, not this.

Proposed fix
Invert the compute/write boundary so the worker streams: compute a chunk → write it → drop its refs → next, instead of materializing all K.
- Add a per-chunk write-callback seam to `process_shard`: `process_shard(..., write_chunk: Callable | None = None)`. When provided, after computing each chunk's `(block_index, carrier, ragged)`, call `write_chunk(...)` immediately and do not append to `chunk_results` (drop the locals so they're collectible). When `None`, keep today's `chunk_results`-append behavior for back-compat (the 2-tuple return + existing tests).
- Rewire both callers to the callback: the runner `_process_and_write`, and `deployment/aws/lambda_handler.py` (re-touches the handler → needs the §1 "named" authorization, as in phase 7 of #84, "Vector/ragged chunk companions + multi-chunk-per-worker (Closes #82, Refs #30)").
- Free each chunk's `carrier`/`ragged` after its write.

Phases
1. `write_chunk` callback seam in `process_shard` + free-after-write; `chunk_results` path preserved when callback is `None`. Tests: with a callback, `chunk_results` stays empty and each chunk is written-then-dropped; K=1 byte-identical; multi-chunk output identical to today.
2. Rewire runner `_process_and_write` to the callback.
3. Rewire `lambda_handler.py` to the callback (authorized edit).

Acceptance
- For K>1, peak output-side memory holds ~1 chunk instead of K (assert `chunk_results` is not accumulated when streaming).

Relation / notes
Filed as a tracking issue (`enhancement`); add `implement` to queue it. For the current OOM, #66 is the fix.
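
Sketch of the seam

To make the proposed callback seam concrete, here is a minimal toy sketch of the control flow described under "Proposed fix". It is not the real zagg code: the `iter_chunks` body, the list-based `carrier`/`ragged` payloads, the argument order, and the integer return value are all assumptions made for illustration. Only the `write_chunk: Callable | None` parameter and the "write, then drop, else append" branching come from the issue text.

```python
from typing import Callable, Iterable, Optional

# Hypothetical stand-in types; the real carrier/ragged objects are not shown in the issue.
Carrier = list[float]
Ragged = list[list[float]]
WriteChunk = Callable[[int, Carrier, Ragged], None]


def iter_chunks(pooled: list[float], k: int) -> Iterable[tuple[int, list[float]]]:
    """Toy chunker (assumption): split the pooled reads into k contiguous blocks."""
    size = -(-len(pooled) // k)  # ceiling division
    for block_index in range(k):
        yield block_index, pooled[block_index * size:(block_index + 1) * size]


def process_shard(
    pooled: list[float],
    k: int,
    chunk_results: list,
    write_chunk: Optional[WriteChunk] = None,
) -> int:
    """Compute k chunks; stream them if write_chunk is given, else accumulate.

    Returns the number of chunks processed (a simplification for this sketch).
    """
    n_chunks = 0
    for block_index, values in iter_chunks(pooled, k):
        carrier = [sum(values)]  # toy aggregate standing in for the carrier
        ragged = [list(values)]  # toy variable-length payload
        if write_chunk is not None:
            # Streaming path: write now, never touch chunk_results.
            write_chunk(block_index, carrier, ragged)
        else:
            # Back-compat path: accumulate, caller writes afterward.
            chunk_results.append((block_index, carrier, ragged))
        # Drop the local names so a streamed chunk becomes collectible.
        del carrier, ragged
        n_chunks += 1
    return n_chunks


# Streaming run: the callback is the only place a chunk is observed.
written: dict[int, tuple[Carrier, Ragged]] = {}


def record_chunk(block_index: int, carrier: Carrier, ragged: Ragged) -> None:
    written[block_index] = (carrier, ragged)


streamed_sink: list = []
process_shard([1.0, 2.0, 3.0, 4.0], 2, streamed_sink, write_chunk=record_chunk)

# Legacy run: no callback, so results accumulate as they do today.
legacy_sink: list = []
process_shard([1.0, 2.0, 3.0, 4.0], 2, legacy_sink)
```

In the streaming run `streamed_sink` stays empty, which is the property the acceptance criterion asks the tests to assert; the legacy run yields the same per-chunk values through `legacy_sink`. A real writer callback would persist the chunk rather than keep it in a dict, otherwise the memory saving is lost.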