Skip to content

Stream-and-free chunk writes in the multi-chunk worker (don't accumulate all K) #91

Description

@espg

🤖 from Claude

Problem

The multi-chunk-per-worker writer (#30 item 3, landed in #84) accumulates all K chunk outputs in memory before any are written. process_shard (src/zagg/processing/worker.py) reads the shard's granules once, then loops iter_chunks appending each chunk's (block_index, carrier, ragged) to the chunk_results sink (worker.py:344) and returns; the write happens afterward in a separate loop (runner _process_and_write; deployment/aws/lambda_handler.py:280-283). So peak memory holds all K carriers + all K ragged payloads at once, on top of the pooled shard reads. For K>1 the worker could instead write each chunk and free it, capping the output-side footprint at one chunk.

Scope — what this does NOT fix

This only helps K>1. At K=1 (shard == chunk; e.g. parent_order: 13 with chunk_inner dropped) there is exactly one chunk, so there's nothing to accumulate — the change is a no-op at K=1. It also does not touch the read-phase memory (pooled photons + the #66 h5coro per-granule cache leak), which is the dominant OOM driver for dense ATL03 AOIs regardless of K. So this is an efficiency win for the multi-chunk path, complementary to #66, not a substitute — the current order-13 OOMs are #66, not this.

Proposed fix

Invert the compute/write boundary so the worker streams: compute a chunk → write it → drop its refs → next, instead of materializing all K.

  • Add a per-chunk write-callback seam to process_shard: process_shard(..., write_chunk: Callable | None = None). When provided, after computing each chunk's (block_index, carrier, ragged), call write_chunk(...) immediately and do not append to chunk_results (drop the locals so they're collectible). When None, keep today's chunk_results-append behavior for back-compat (the 2-tuple return + existing tests).
  • Rewire the consumers to pass the callback (their existing per-chunk write-loop body): runner _process_and_write, and deployment/aws/lambda_handler.py (re-touches the handler → needs the §1 "named" authorization, as in Vector/ragged chunk companions + multi-chunk-per-worker (Closes #82, Refs #30) #84 phase 7).
  • Free each carrier/ragged after its write.

Phases

  1. write_chunk callback seam in process_shard + free-after-write; chunk_results path preserved when callback is None. Tests: with a callback, chunk_results stays empty and each chunk is written-then-dropped; K=1 byte-identical; multi-chunk output identical to today.
  2. Rewire runner _process_and_write to the callback.
  3. Rewire lambda_handler.py to the callback (authorized edit).

Acceptance

  • For K>1, peak output-side memory holds ~1 chunk instead of K (assert chunk_results is not accumulated when streaming).
  • Output byte-identical (K=1 and K>1).
  • No behavior change at K=1.

Relation / notes

Filed as a tracking issue (enhancement); add implement to queue it. For the current OOM, #66 is the fix.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions