
perf(s1-rtc): shard the conditions arrays (gamma_area/LIA) like the vv/vh pyramid#197

Merged
lhoupert merged 4 commits into feat/s1-rtc-stac-builder from feat/s1-gamma-area-sharding
Jun 19, 2026

@lhoupert lhoupert commented Jun 19, 2026

Contributor

Tracking issue: EOPF-Explorer/data-pipeline#288 · Sibling PR: EOPF-Explorer/data-pipeline#287 (T1–T4 — incremental + concurrent transfer) · Base: feat/s1-rtc-stac-builder (#180)

Stacked on #180 (feat/s1-rtc-stac-builder) — this PR merges into #180, not main.

Task T5 of the data-pipeline S1 ingest upload-bottleneck plan
(data-pipeline/claude-docs/plans/s1_ingest_upload_perf.md). The biggest absolute lever in that work.

Problem

A real S1 RTC cube (s1-rtc-31TEG, staging) is 3807 objects / 3.5 GB, of which 3604 (94.7%)
are the conditions/gamma_area_<relorbit> arrays: [10980,10980] float32, inner chunk 366², no
sharding_indexed → ~900 tiny chunk objects per array. They are time-invariant yet dominate the
object count, which dominated the ingest's S3 upload wall-time (a live pod sat ~34 min in
"Uploading store" at 9 millicores — pure object-count latency).

The vv/vh/border_mask display pyramid is already sharded (one shard per time slice over the
full extent, inner 366²). This applies that same existing layout to the one array family left out.
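
The ~900 figure follows directly from the array shape and the inner chunk size. A minimal sketch of that arithmetic, assuming a regular chunk grid with one stored object per chunk (the helper name `chunk_object_count` is illustrative, not part of the codebase):

```python
import math


def chunk_object_count(shape, chunks):
    """Number of chunk objects an unsharded, regular-grid array stores."""
    return math.prod(math.ceil(dim / c) for dim, c in zip(shape, chunks))


# Shape and inner chunk size quoted in the problem statement.
per_array = chunk_object_count((10980, 10980), (366, 366))
print(per_array)  # 900 (a 30 x 30 grid of 366 x 366 chunks)

# Share of the cube's objects that belong to the condition arrays.
print(round(100 * 3604 / 3807, 1))  # 94.7
```

The 3604 figure is consistent with four such arrays at 900 chunks plus one metadata object each, though that breakdown is an inference rather than something the measurements state.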

Change

Add shards=(h, w) to the single condition create_array in ingest_s1tiling_conditions. All
condition arrays (gamma_area, lia, incidence_angle) share that write path and the same 2D
full-resolution shape, so all collapse from ~900 chunk objects to 1 shard object (cube
~3807 → ~210). calculate_aligned_chunk_size returns a divisor of the dimension, so (h, w) is a
clean multiple of the inner chunk (Zarr v3 shard-divisibility). Condition arrays are not in
TiTiler's render path, so the web-render layout is untouched; values are byte-identical.
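
The shard-divisibility argument can be checked in isolation. The sketch below is not the project's `calculate_aligned_chunk_size`; `aligned_chunk_size` is a hypothetical stand-in that assumes "aligned" means the largest divisor of the dimension not exceeding a target (512 here is an assumed target, chosen only because it reproduces the 366 quoted above).

```python
def aligned_chunk_size(dim, target):
    """Hypothetical stand-in: largest divisor of `dim` that is <= `target`."""
    for size in range(min(dim, target), 0, -1):
        if dim % size == 0:
            return size
    raise ValueError("dim must be a positive integer")


def inner_chunks_per_shard(shards, chunks):
    """Validate Zarr v3 shard-divisibility and return the inner chunk grid."""
    for s, c in zip(shards, chunks):
        if s % c != 0:
            raise ValueError(f"shard extent {s} is not a multiple of chunk {c}")
    return tuple(s // c for s, c in zip(shards, chunks))


h = w = 10980
inner = aligned_chunk_size(h, 512)  # 366 under the assumed target
grid = inner_chunks_per_shard((h, w), (inner, inner))
print(inner, grid)  # 366 (30, 30)
```

Because the inner chunk is a divisor of the dimension by construction, a shard spanning the full `(h, w)` extent always passes the check, which is the invariant the change relies on.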

Validation (real OVH S3, laptop→DE)

| layout | S3 objects | PUT (best of 3) | full-array GET | byte-identical |
|---|---|---|---|---|
| unsharded | 100 | 4.43 s | 1.38 s | |
| sharded | 1 | 2.55 s (1.7×) | 1.61 s | |
  • Object collapse 100 → 1 (production ~900 → 1 per array; condition-array objects across the cube 3604 → ~4).
  • PUT 1.7× faster even with batched concurrency on; ratio grows with object count.
  • Divisibility valid at the production 10980² (aligned=366, 10980 % 366 == 0).
  • Honest caveat: a full-array read is not faster sharded (same bytes, one un-parallelizable
    object) — the win is object-count (upload + listing) and windowed/partial cloud reads.
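
To make the windowed-read point concrete, the sketch below counts how many inner chunks a read window touches inside a single shard; those are the byte ranges a client would need instead of the whole object. The helper names are illustrative. The index-size formula assumes the sharding codec's usual index layout (two unsigned 64-bit integers per inner chunk plus a 4-byte crc32c checksum), which has not been verified against this cube's actual codec configuration.

```python
def inner_chunks_touched(window, chunk):
    """Count inner chunks intersecting a half-open window.

    `window` is ((row_start, row_stop), (col_start, col_stop)).
    """
    count = 1
    for (start, stop), size in zip(window, chunk):
        first = start // size
        last = (stop - 1) // size
        count *= last - first + 1
    return count


def shard_index_nbytes(chunks_per_shard):
    """Assumed index size: (offset, nbytes) as uint64 per chunk + crc32c."""
    return 16 * chunks_per_shard + 4


# A 1000 x 1000 window at the array origin, inner chunk 366 x 366.
print(inner_chunks_touched(((0, 1000), (0, 1000)), (366, 366)))  # 9
# The full array touches every inner chunk of the shard.
print(inner_chunks_touched(((0, 10980), (0, 10980)), (366, 366)))  # 900
print(shard_index_nbytes(900))  # 14404
```

Under these assumptions a small window needs the shard index plus nine ranged reads rather than the full object, while a full-array read still moves the same bytes, which matches the caveat above.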

Tests

+2 targeted (test_gamma_area_is_sharded, test_sharding_collapses_chunk_objects_to_one — 9 inner
chunks → 1 on-disk shard object + byte-identical roundtrip). 102 passed across ingest + STAC +
per-acquisition + data_api (no regression in consolidation/STAC).

Migration

Old cubes stay unsharded until re-ingested (Zarr doesn't re-chunk in place) — sequence a per-tile
re-ingest after the rebuilt image deploys. Documented in the migration note above and tracking issue EOPF-Explorer/data-pipeline#288.
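
One way to find which cubes still need a re-ingest is to look for the `sharding_indexed` codec in each condition array's Zarr v3 metadata. This is a sketch only: the function names are hypothetical, the metadata fragments are hand-written examples rather than copies from a real cube, and fetching `zarr.json` from the bucket is left out.

```python
import json


def sharding_config(array_metadata):
    """Return the sharding_indexed configuration, or None if unsharded."""
    for codec in array_metadata.get("codecs", []):
        if isinstance(codec, dict) and codec.get("name") == "sharding_indexed":
            return codec.get("configuration", {})
    return None


def needs_reingest(zarr_json_text):
    """True when the array metadata carries no sharding codec."""
    return sharding_config(json.loads(zarr_json_text)) is None


# Illustrative metadata fragments (not taken from a real cube).
old_layout = json.dumps({"codecs": [{"name": "bytes"}, {"name": "zstd"}]})
new_layout = json.dumps(
    {
        "codecs": [
            {
                "name": "sharding_indexed",
                "configuration": {"chunk_shape": [366, 366]},
            }
        ]
    }
)

print(needs_reingest(old_layout))  # True
print(needs_reingest(new_layout))  # False
```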

🤖 Generated with Claude Code

lhoupert and others added 3 commits June 19, 2026 09:19
A real S1 RTC cube is 3807 objects / 3.5 GB, of which 3604 (94.7%) are the
conditions/gamma_area_<relorbit> arrays: [10980,10980] float32, inner chunk
366², NO sharding_indexed -> ~900 tiny chunk objects each. They are
time-invariant yet dominate the object count, which dominated the ingest's S3
upload wall-time (a live pod sat ~34 min in "Uploading store" at 9 millicores).

The vv/vh/border_mask display pyramid is already sharded (one shard per time
slice over the full spatial extent, inner 366²). Apply that same existing
layout to the condition arrays: add shards=(h, w) to the one create_array in
ingest_s1tiling_conditions. All condition arrays (gamma_area, lia,
incidence_angle) share that write path and the same 2D full-resolution shape,
so all collapse from ~900 chunk objects to 1 shard object (cube ~3807 -> ~210).
calculate_aligned_chunk_size returns a divisor of the dimension, so (h, w) is a
clean multiple of the inner chunk (Zarr v3 shard-divisibility).

conditions arrays are NOT in TiTiler's render path (vv/vh/border_mask), so this
does not touch the web-render layout; it only makes a client read a condition
array in one ranged GET instead of ~900. Values are byte-identical.

Tests: +2 (sharding codec present; 9 inner chunks -> 1 on-disk shard object +
byte-identical roundtrip). 57 passed. Spec:
claude-docs/specs/s1_gamma_area_sharding.md. Cross-repo Task T5 of
data-pipeline/claude-docs/plans/s1_ingest_upload_perf.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP
Validated on the live OVH bucket (laptop->DE): object collapse 100->1 (prod
~900->1), PUT 1.7x faster even with batched concurrency on, divisibility valid
at the production 10980² (aligned 366 divides 10980), reads byte-identical.
Honest caveat recorded: a full-array sequential read is NOT faster sharded
(same bytes, one un-parallelizable object) — the win is object-count (upload +
listing) and windowed/partial cloud reads, not full-read throughput.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP
… style

Comment-only. The surrounding vv/vh sharding has no inline explainer; the long
cloud-access rationale now lives in claude-docs/specs/s1_gamma_area_sharding.md.
Keep only the non-obvious bits: why one shard, and the Zarr v3 shard-divisibility
invariant. No behavior change (20 condition/shard tests green).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP
@lhoupert

Contributor Author

Context / problem statement: EOPF-Explorer/data-pipeline#288 (umbrella issue covering this PR + the sibling data-pipeline transfer PR #287).

@lhoupert

Contributor Author

@d-v-b do you approve?

The data-model repo has no claude-docs/specs convention; the spec was noise for
this PR's reviewers. The problem statement, real-S3 benchmark and migration note
live in the data-pipeline plan + tracking issue EOPF-Explorer/data-pipeline#288
and PR #197's description. Also drop the now-dangling spec path from the code
comment (the rationale stays inline). No behavior change (20 condition/shard
tests green).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP
@lhoupert lhoupert requested a review from emmanuelmathot June 19, 2026 09:08
@d-v-b

d-v-b commented Jun 19, 2026

Contributor

looks good!

@lhoupert

Contributor Author

fyi @emmanuelmathot

@lhoupert lhoupert merged commit 817e2b9 into feat/s1-rtc-stac-builder Jun 19, 2026
lhoupert added a commit that referenced this pull request Jun 19, 2026
… writer) (#200)

* perf(s1-rtc): shard the conditions arrays like the vv/vh pyramid


* docs(s1-rtc): record real-S3 sharding benchmark in the T5 spec


* refactor(s1-rtc): trim the conditions-sharding comment to match vv/vh style


* chore(s1-rtc): drop the claude-docs spec from this PR


* fix(s1-rtc): heal a multiscale level missing `time` on append (robust writer)

`ingest_s1tiling_acquisition` resized `level["time"]` on every multiscale level,
assuming a fresh build created `time` at each level (#192). A cube built before
#192 -- or left half-built by an interrupted append -- can carry `r10m/time` yet
lack it at a coarser level, so the resize raised `KeyError: 'time'` and, because
the append consistency check validated only CRS + shape, the ingest was
non-convergent (observed on 30TWM).

Before the per-level write loop, recreate any missing-level `time` from
`r10m/time` (backfilling the existing slices so prior timestamps are preserved),
or raise a clear error when the cube is inconsistent in a way a backfill cannot
fix (a level's length disagrees with `r10m/time`, or `r10m` has slices but no
`time`). This is the durable upstream counterpart to the data-pipeline guard
(data-pipeline #294), making that orchestration-side mitigation belt-and-suspenders.

Tests: 4 new cases (heal, no-op when healthy, raise on half-built, raise on
missing r10m/time); full s1_ingest suite 61 passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01V3qS75byrUuCSHFqcWi26B

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
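
The heal-or-raise rule described in the fix commit above can be sketched with plain dictionaries standing in for the multiscale levels. This is an interpretation, not the code in `ingest_s1tiling_acquisition`: `n_slices` (the length of a level's data along the time axis), the coarser level names `r20m` and `r60m`, and the exception class are all assumptions made for illustration.

```python
class InconsistentCubeError(ValueError):
    """Raised when a backfill cannot repair the cube."""


def heal_missing_time(levels, base="r10m"):
    """Backfill a missing `time` on coarser levels from the base level.

    Returns the names of the levels that were healed.
    """
    base_level = levels[base]
    base_time = base_level.get("time")
    if base_time is None:
        if base_level["n_slices"] > 0:
            raise InconsistentCubeError(f"{base} has slices but no time")
        return []  # empty cube: nothing to backfill

    # Validate every level first so a failure leaves the cube untouched.
    for name, level in levels.items():
        if name != base and level["n_slices"] != len(base_time):
            raise InconsistentCubeError(
                f"{name} has {level['n_slices']} slices, "
                f"{base}/time has {len(base_time)}"
            )

    healed = []
    for name, level in levels.items():
        if name != base and level.get("time") is None:
            level["time"] = list(base_time)  # preserve prior timestamps
            healed.append(name)
    return healed


cube = {
    "r10m": {"n_slices": 2, "time": [100, 200]},
    "r20m": {"n_slices": 2},  # missing `time`, as on a pre-#192 cube
    "r60m": {"n_slices": 2, "time": [100, 200]},
}
print(heal_missing_time(cube))  # ['r20m']
print(heal_missing_time(cube))  # [] (no-op once healthy)
```

Validating before mutating keeps the operation all-or-nothing, so a cube that cannot be healed is reported without being partially rewritten.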