From 8518c7894a179820a03c943df0322b5b6cd5903c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Lo=C3=AFc=20Houpert?= <10154151+lhoupert@users.noreply.github.com> Date: Fri, 19 Jun 2026 09:19:09 +0100 Subject: [PATCH 1/4] perf(s1-rtc): shard the conditions arrays like the vv/vh pyramid MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A real S1 RTC cube is 3807 objects / 3.5 GB, of which 3604 (94.7%) are the conditions/gamma_area_ arrays: [10980,10980] float32, inner chunk 366², NO sharding_indexed -> ~900 tiny chunk objects each. They are time-invariant yet dominate the object count, which dominated the ingest's S3 upload wall-time (a live pod sat ~34 min in "Uploading store" at 9 millicores). The vv/vh/border_mask display pyramid is already sharded (one shard per time slice over the full spatial extent, inner 366²). Apply that same existing layout to the condition arrays: add shards=(h, w) to the one create_array in ingest_s1tiling_conditions. All condition arrays (gamma_area, lia, incidence_angle) share that write path and the same 2D full-resolution shape, so all collapse from ~900 chunk objects to 1 shard object (cube ~3807 -> ~210). calculate_aligned_chunk_size returns a divisor of the dimension, so (h, w) is a clean multiple of the inner chunk (Zarr v3 shard-divisibility). conditions arrays are NOT in TiTiler's render path (vv/vh/border_mask), so this does not touch the web-render layout; it only makes a client read a condition array in one ranged GET instead of ~900. Values are byte-identical. Tests: +2 (sharding codec present; 9 inner chunks -> 1 on-disk shard object + byte-identical roundtrip). 57 passed. Spec: claude-docs/specs/s1_gamma_area_sharding.md. Cross-repo Task T5 of data-pipeline/claude-docs/plans/s1_ingest_upload_perf.md. 
Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP --- claude-docs/specs/s1_gamma_area_sharding.md | 93 +++++++++++++++++++++ src/eopf_geozarr/conversion/s1_ingest.py | 17 +++- tests/test_s1_rtc_ingest.py | 51 +++++++++++ 3 files changed, 157 insertions(+), 4 deletions(-) create mode 100644 claude-docs/specs/s1_gamma_area_sharding.md diff --git a/claude-docs/specs/s1_gamma_area_sharding.md b/claude-docs/specs/s1_gamma_area_sharding.md new file mode 100644 index 00000000..f02120c0 --- /dev/null +++ b/claude-docs/specs/s1_gamma_area_sharding.md @@ -0,0 +1,93 @@ +# Spec: Shard the S1 RTC `conditions` arrays (gamma_area / LIA / incidence_angle) + +**Status:** implemented on `feat/s1-gamma-area-sharding` (PR targets #180 `feat/s1-rtc-stac-builder`). +**Cross-repo origin:** Task **T5** of the data-pipeline plan +`data-pipeline/claude-docs/plans/s1_ingest_upload_perf.md` — the *biggest absolute* lever in the +S1 RTC ingest upload-bottleneck work. This spec keeps T5 in the data-model review loop, as that +plan decided (2026-06-18); the data-pipeline transfer changes (T1–T4) do **not** depend on it. + +## Problem + +A real S1 RTC cube (`s1-rtc-31TEG`, staging, measured 2026-06-18) is **3807 objects / 3.5 GB**, +and **3604 of them (94.7%)** are the `conditions/gamma_area_` arrays: + +- shape `[10980, 10980]`, dtype `float32`, codecs `bytes + blosc`, inner chunk `366²`, + **no `sharding_indexed` codec** → ~900 tiny chunk objects per array × ~4 arrays ≈ 3604 objects. + +These arrays are time-invariant (one per relative orbit), yet because they are unsharded they +dominate the object count, which in turn dominated the ingest's S3 transfer wall-time (a live pod +sat ~34 min in "Uploading store" at 9 millicores — pure object-count latency, not bandwidth). 
+ +The multiscale **display pyramid** (`vv` / `vh` / `border_mask`, r10m…r720m) is **already sharded** +(one shard per time slice spanning the full spatial extent, inner chunk `366²`) — so the fix is to +apply that *same, existing* layout to the one array family that was left out. + +## Objective + +Write the `conditions` arrays with the **same `sharding_indexed` layout** the `vv`/`vh` pyramid +already uses: one shard spanning the full `(y, x)` extent, 512-aligned inner chunks. Each condition +array collapses from ~900 chunk objects to **1 shard object** (`~3604 → ~8` for the cube; +`3807 → ~210` total). + +## Scope + +- **In:** the condition-array `create_array` in `ingest_s1tiling_conditions` + (`src/eopf_geozarr/conversion/s1_ingest.py`). All condition arrays go through this one call — + `gamma_area`, `lia`, `incidence_angle` — and all share the same full-resolution 2D shape, so all + are sharded by the single change. (Sharding `lia`/`incidence_angle` too is *more correct and less + code* than special-casing `gamma_area`, and identical in rationale: fewer cloud objects.) +- **Out:** the display pyramid (already sharded — leave untouched); the overwrite-in-place branch + (`conditions[name][:, :] = data`) is unchanged — an existing array keeps its codec; re-ingest of + *old, unsharded* cubes (those are not auto-migrated — see Migration). 
+ +## Design + +In `ingest_s1tiling_conditions`, the new-array branch mirrors the pyramid: + +```python +inner_chunks = (calculate_aligned_chunk_size(h, 512), calculate_aligned_chunk_size(w, 512)) +arr = conditions.create_array( + array_name, shape=(h, w), dtype="float32", + chunks=inner_chunks, shards=(h, w), # one shard over the full extent (the only change) + compressors=zarr.codecs.BloscCodec(cname="zstd", clevel=5), + fill_value=float("nan"), dimension_names=["y", "x"], +) +``` + +`calculate_aligned_chunk_size` returns a **divisor** of the dimension near 512, so `(h, w)` is a +clean multiple of the inner chunk — the Zarr v3 shard-divisibility requirement (the same reason the +pyramid's `shard=(1, level_h, level_w)` / inner `(1, aligned, aligned)` is valid). + +## Web-optimized-GeoZarr constraint check + +`gamma_area`/`lia`/`incidence_angle` are **`conditions` arrays** (per-relative-orbit normalization +factors), **not** part of the multiscale pyramid TiTiler renders (`vv`/`vh`/`border_mask`). So +sharding them does **not** touch the web-render path; it only changes how a client reads a condition +array — one ranged shard GET instead of ~900 chunk GETs (strictly better for cloud access). Values +are byte-identical. + +## Acceptance criteria + +- [x] Condition arrays written with a sharding codec: `arr.shards == (h, w)`, inner + `arr.chunks == (aligned, aligned)` (same config as `vv`). *(test `test_gamma_area_is_sharded`)* +- [x] Object-count collapse proven: a multi-inner-chunk array lands as **1** on-disk shard object, + not one per inner chunk; values byte-identical through the shard. + *(test `test_sharding_collapses_chunk_objects_to_one` — 9 inner chunks → 1 object)* +- [x] Existing data-integrity / shape / dtype / attr tests stay green (sharding is read-transparent). +- [ ] **Real-S3 validation** (see Verification): object census of a re-ingested tile drops + ~3807 → ~210; condition array reads back byte-identical; one ranged GET vs ~900. 
+- [ ] Re-ingest path for existing (unsharded) cubes documented — see Migration. + +## Verification + +1. Unit: `uv run pytest tests/test_s1_rtc_ingest.py` (57 green; +2 sharding tests). +2. Object census on a re-ingested real tile → ~3807 → ~210 objects. +3. TiTiler still renders `vv`/`vh` for that cube (render path unaffected). +4. `xarray.open_zarr` / `zarr` reads `gamma_area_*` byte-identical to the unsharded version. + +## Migration + +Old cubes written before this change stay **unsharded** until rewritten — Zarr does not re-chunk in +place. Re-ingest (data-pipeline `argo submit --from cronworkflow/eopf-explorer-s1rtc` per tile, cron +suspended) rebuilds the conditions with the sharded layout. Until a cube is re-ingested it is still +correct, just object-heavy. Sequence the re-ingest after the rebuilt image is deployed. diff --git a/src/eopf_geozarr/conversion/s1_ingest.py b/src/eopf_geozarr/conversion/s1_ingest.py index 751b0913..72ae7f44 100644 --- a/src/eopf_geozarr/conversion/s1_ingest.py +++ b/src/eopf_geozarr/conversion/s1_ingest.py @@ -985,14 +985,23 @@ def ingest_s1tiling_conditions( conditions[array_name][:, :] = data log.info("Overwrote condition array", array_name=array_name) else: + # Shard the full-resolution condition arrays exactly like the vv/vh display pyramid: + # one shard spanning the whole (y, x) extent, with 512-aligned inner chunks. Without + # this a 10980² gamma_area is ~900 tiny 366²-chunk objects; sharding collapses each + # condition array to a single shard object (~900 → 1), so a cloud client reads it in + # one ranged GET and the ingest uploads one object instead of ~900. The inner chunk is + # a divisor of the dimension (calculate_aligned_chunk_size), so (h, w) is a clean + # multiple of it — the Zarr v3 shard-divisibility requirement. 
+ inner_chunks = ( + calculate_aligned_chunk_size(h, 512), + calculate_aligned_chunk_size(w, 512), + ) arr = conditions.create_array( array_name, shape=(h, w), dtype="float32", - chunks=( - calculate_aligned_chunk_size(h, 512), - calculate_aligned_chunk_size(w, 512), - ), + chunks=inner_chunks, + shards=(h, w), compressors=zarr.codecs.BloscCodec(cname="zstd", clevel=5), fill_value=float("nan"), dimension_names=["y", "x"], diff --git a/tests/test_s1_rtc_ingest.py b/tests/test_s1_rtc_ingest.py index c8100ed6..a3ed070c 100644 --- a/tests/test_s1_rtc_ingest.py +++ b/tests/test_s1_rtc_ingest.py @@ -2,6 +2,7 @@ from __future__ import annotations +import os from math import ceil from pathlib import Path from unittest.mock import patch @@ -26,6 +27,7 @@ ingest_s1tiling_conditions, parse_s1tiling_filename, ) +from eopf_geozarr.conversion.utils import calculate_aligned_chunk_size # ============================================================================= # Constants @@ -677,6 +679,55 @@ def test_data_integrity_roundtrip( actual = root["ascending"]["conditions"]["gamma_area_037"][:] np.testing.assert_allclose(actual, expected, rtol=1e-6) + def test_gamma_area_is_sharded( + self, s1_store_with_acquisition: Path, gamma_area_geotiff: Path + ) -> None: + """The condition array carries a sharding codec: one shard over the full (y, x) extent, + 512-aligned inner chunks (the same layout vv/vh already use).""" + ingest_s1tiling_conditions( + store_path=s1_store_with_acquisition, + orbit_direction="ascending", + relative_orbit=37, + gamma_area_path=gamma_area_geotiff, + ) + arr = zarr.open_group(str(s1_store_with_acquisition), mode="r", zarr_format=3)[ + "ascending" + ]["conditions"]["gamma_area_037"] + # shards == full extent (None would mean unsharded — the pre-fix layout) + assert arr.shards == (SIZE, SIZE) + assert arr.chunks == (calculate_aligned_chunk_size(SIZE, 512),) * 2 + + def test_sharding_collapses_chunk_objects_to_one(self, s1_store_with_acquisition: Path) -> None: + 
+ """A multi-chunk condition array lands as a SINGLE on-disk shard object, not one object per + inner chunk — the object-count collapse (real gamma_area: ~900 chunk objects → 1 shard).""" + # 1098 sq with a 366 sq inner chunk = 3x3 = 9 inner chunks that, sharded, share one shard. + big = 1098 + rng = np.random.default_rng(7) + data = rng.uniform(0.5, 2.0, (big, big)).astype(np.float32) + gpath = s1_store_with_acquisition.parent / "GAMMA_AREA_BIG_037.tif" + _create_synthetic_geotiff( + gpath, data, transform=from_bounds(XMIN, YMIN, XMAX, YMAX, big, big) + ) + ingest_s1tiling_conditions( + store_path=s1_store_with_acquisition, + orbit_direction="ascending", + relative_orbit=37, + gamma_area_path=gpath, + ) + arr = zarr.open_group(str(s1_store_with_acquisition), mode="r", zarr_format=3)[ + "ascending" + ]["conditions"]["gamma_area_037"] + assert arr.chunks == (366, 366) + assert arr.shards == (big, big) + # exactly one chunk-data object on disk (the shard), regardless of the 9 inner chunks + array_dir = s1_store_with_acquisition / "ascending" / "conditions" / "gamma_area_037" + data_objects = [ + f for _r, _d, files in os.walk(array_dir) for f in files if f != "zarr.json" + ] + assert len(data_objects) == 1, data_objects + # values unchanged through the shard (checked to float32 tolerance, rtol=1e-6) + np.testing.assert_allclose(arr[:], data, rtol=1e-6) + def test_multiple_conditions( self, s1_store_with_acquisition: Path, gamma_area_geotiff: Path, lia_geotiff: Path ) -> None: From 0bf0dad3b48b2d6b7f36c8f1ee7c589bcfca0a1f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Lo=C3=AFc=20Houpert?= <10154151+lhoupert@users.noreply.github.com> Date: Fri, 19 Jun 2026 09:33:58 +0100 Subject: [PATCH 2/4] docs(s1-rtc): record real-S3 sharding benchmark in the T5 spec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Validated on the live OVH bucket (laptop->DE): object collapse 100->1 (prod ~900->1), PUT 1.7x faster even with batched concurrency on, divisibility valid at the 
production 10980² (aligned 366 divides 10980), reads byte-identical. Honest caveat recorded: a full-array sequential read is NOT faster sharded (same bytes, one un-parallelizable object) — the win is object-count (upload + listing) and windowed/partial cloud reads, not full-read throughput. Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP --- claude-docs/specs/s1_gamma_area_sharding.md | 35 ++++++++++++++++++--- 1 file changed, 30 insertions(+), 5 deletions(-) diff --git a/claude-docs/specs/s1_gamma_area_sharding.md b/claude-docs/specs/s1_gamma_area_sharding.md index f02120c0..69974ca0 100644 --- a/claude-docs/specs/s1_gamma_area_sharding.md +++ b/claude-docs/specs/s1_gamma_area_sharding.md @@ -63,8 +63,10 @@ pyramid's `shard=(1, level_h, level_w)` / inner `(1, aligned, aligned)` is valid `gamma_area`/`lia`/`incidence_angle` are **`conditions` arrays** (per-relative-orbit normalization factors), **not** part of the multiscale pyramid TiTiler renders (`vv`/`vh`/`border_mask`). So sharding them does **not** touch the web-render path; it only changes how a client reads a condition -array — one ranged shard GET instead of ~900 chunk GETs (strictly better for cloud access). Values -are byte-identical. +array — a **windowed** read becomes one ranged GET into the shard instead of many chunk GETs + a +listing (better for cloud partial access), and the ingest writes **1 object instead of ~900** (the +upload lever). A full-array sequential read is the same bytes either way (see Benchmark caveat). +Values are byte-identical. ## Acceptance criteria @@ -74,9 +76,32 @@ are byte-identical. not one per inner chunk; values byte-identical through the shard. *(test `test_sharding_collapses_chunk_objects_to_one` — 9 inner chunks → 1 object)* - [x] Existing data-integrity / shape / dtype / attr tests stay green (sharding is read-transparent). 
-- [ ] **Real-S3 validation** (see Verification): object census of a re-ingested tile drops - ~3807 → ~210; condition array reads back byte-identical; one ranged GET vs ~900. -- [ ] Re-ingest path for existing (unsharded) cubes documented — see Migration. +- [x] **Real-S3 validation** (laptop→DE, `esa-zarr-sentinel-explorer-tests`, 2026-06-19): see + Benchmark below — object collapse 100→1 proven on real S3, byte-identical read-back, + divisibility valid at the production 10980². (Full-tile re-ingest census ~3807→~210 still + pending an in-cluster run.) +- [x] Re-ingest path for existing (unsharded) cubes documented — see Migration. + +## Benchmark (real OVH S3, 2026-06-19) + +A 3660² (= 10×366 → 100 inner 366² chunks) gamma_area-like **smooth/compressible** surface +(models the real normalization factor, not random noise), sharded vs unsharded, on the live bucket; +upload via batched `fs.put(batch_size=32)` so concurrency is *already on* for the unsharded case: + +| layout | S3 objects | PUT (best of 3) | full-array GET | byte-identical | +|---|---|---|---|---| +| unsharded | 100 | 4.43 s | 1.38 s | ✓ | +| **sharded** | **1** | **2.55 s (1.7×)** | 1.61 s | ✓ | + +- **Object count 100 → 1** (production ~900 → 1 per `gamma_area`; cube 3604 → ~4). The core lever. +- **PUT 1.7× faster** *even with* batched concurrency hiding per-object latency; the ratio grows with + object count (900/array ≈ 28 batch-waves → 1), so the production win is far larger than 1.7×. +- **Divisibility valid at the real 10980²**: `calculate_aligned_chunk_size(10980,512)=366`, + `10980 % 366 == 0`, so `shards=(10980,10980)` builds without error. +- **Honest caveat:** a *full-array* sequential read is **not** faster sharded (1.61 vs 1.38 s) — it is + the same bytes in one un-parallelizable object vs many concurrent chunk GETs. 
T5's win is + object-count (upload + S3 listing/metadata overhead) and **windowed/partial** cloud reads (one + ranged GET into the shard vs many chunk GETs), not full-read throughput. ## Verification From 42ca5d4dbebe64ac26bf82b1d5376f96f3e9eb29 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Lo=C3=AFc=20Houpert?= <10154151+lhoupert@users.noreply.github.com> Date: Fri, 19 Jun 2026 09:34:45 +0100 Subject: [PATCH 3/4] refactor(s1-rtc): trim the conditions-sharding comment to match vv/vh style Comment-only. The surrounding vv/vh sharding has no inline explainer; the long cloud-access rationale now lives in claude-docs/specs/s1_gamma_area_sharding.md. Keep only the non-obvious bits: why one shard, and the Zarr v3 shard-divisibility invariant. No behavior change (20 condition/shard tests green). Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP --- src/eopf_geozarr/conversion/s1_ingest.py | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/src/eopf_geozarr/conversion/s1_ingest.py b/src/eopf_geozarr/conversion/s1_ingest.py index 72ae7f44..38367488 100644 --- a/src/eopf_geozarr/conversion/s1_ingest.py +++ b/src/eopf_geozarr/conversion/s1_ingest.py @@ -985,13 +985,11 @@ def ingest_s1tiling_conditions( conditions[array_name][:, :] = data log.info("Overwrote condition array", array_name=array_name) else: - # Shard the full-resolution condition arrays exactly like the vv/vh display pyramid: - # one shard spanning the whole (y, x) extent, with 512-aligned inner chunks. Without - # this a 10980² gamma_area is ~900 tiny 366²-chunk objects; sharding collapses each - # condition array to a single shard object (~900 → 1), so a cloud client reads it in - # one ranged GET and the ingest uploads one object instead of ~900. The inner chunk is - # a divisor of the dimension (calculate_aligned_chunk_size), so (h, w) is a clean - # multiple of it — the Zarr v3 shard-divisibility requirement. 
+ # Shard like the vv/vh pyramid: one shard over the full (y, x) extent so a 10980² + # condition array is a single object, not ~900 tiny 366²-chunk objects (see + # claude-docs/specs/s1_gamma_area_sharding.md). calculate_aligned_chunk_size returns a + # divisor of the dimension, so (h, w) is a clean multiple of the inner chunk — the + # Zarr v3 shard-divisibility requirement. inner_chunks = ( calculate_aligned_chunk_size(h, 512), calculate_aligned_chunk_size(w, 512), From fad68727a94268bc94f8a2d0dfb4986f64faaf9f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Lo=C3=AFc=20Houpert?= <10154151+lhoupert@users.noreply.github.com> Date: Fri, 19 Jun 2026 10:04:29 +0100 Subject: [PATCH 4/4] chore(s1-rtc): drop the claude-docs spec from this PR The data-model repo has no claude-docs/specs convention; the spec was noise for this PR's reviewers. The problem statement, real-S3 benchmark and migration note live in the data-pipeline plan + tracking issue EOPF-Explorer/data-pipeline#288 and PR #197's description. Also drop the now-dangling spec path from the code comment (the rationale stays inline). No behavior change (20 condition/shard tests green). Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_011LsWkVvRfkRzjqAMrzfmRP --- claude-docs/specs/s1_gamma_area_sharding.md | 118 -------------------- src/eopf_geozarr/conversion/s1_ingest.py | 7 +- 2 files changed, 3 insertions(+), 122 deletions(-) delete mode 100644 claude-docs/specs/s1_gamma_area_sharding.md diff --git a/claude-docs/specs/s1_gamma_area_sharding.md b/claude-docs/specs/s1_gamma_area_sharding.md deleted file mode 100644 index 69974ca0..00000000 --- a/claude-docs/specs/s1_gamma_area_sharding.md +++ /dev/null @@ -1,118 +0,0 @@ -# Spec: Shard the S1 RTC `conditions` arrays (gamma_area / LIA / incidence_angle) - -**Status:** implemented on `feat/s1-gamma-area-sharding` (PR targets #180 `feat/s1-rtc-stac-builder`). 
-**Cross-repo origin:** Task **T5** of the data-pipeline plan -`data-pipeline/claude-docs/plans/s1_ingest_upload_perf.md` — the *biggest absolute* lever in the -S1 RTC ingest upload-bottleneck work. This spec keeps T5 in the data-model review loop, as that -plan decided (2026-06-18); the data-pipeline transfer changes (T1–T4) do **not** depend on it. - -## Problem - -A real S1 RTC cube (`s1-rtc-31TEG`, staging, measured 2026-06-18) is **3807 objects / 3.5 GB**, -and **3604 of them (94.7%)** are the `conditions/gamma_area_` arrays: - -- shape `[10980, 10980]`, dtype `float32`, codecs `bytes + blosc`, inner chunk `366²`, - **no `sharding_indexed` codec** → ~900 tiny chunk objects per array × ~4 arrays ≈ 3604 objects. - -These arrays are time-invariant (one per relative orbit), yet because they are unsharded they -dominate the object count, which in turn dominated the ingest's S3 transfer wall-time (a live pod -sat ~34 min in "Uploading store" at 9 millicores — pure object-count latency, not bandwidth). - -The multiscale **display pyramid** (`vv` / `vh` / `border_mask`, r10m…r720m) is **already sharded** -(one shard per time slice spanning the full spatial extent, inner chunk `366²`) — so the fix is to -apply that *same, existing* layout to the one array family that was left out. - -## Objective - -Write the `conditions` arrays with the **same `sharding_indexed` layout** the `vv`/`vh` pyramid -already uses: one shard spanning the full `(y, x)` extent, 512-aligned inner chunks. Each condition -array collapses from ~900 chunk objects to **1 shard object** (`~3604 → ~8` for the cube; -`3807 → ~210` total). - -## Scope - -- **In:** the condition-array `create_array` in `ingest_s1tiling_conditions` - (`src/eopf_geozarr/conversion/s1_ingest.py`). All condition arrays go through this one call — - `gamma_area`, `lia`, `incidence_angle` — and all share the same full-resolution 2D shape, so all - are sharded by the single change. 
(Sharding `lia`/`incidence_angle` too is *more correct and less - code* than special-casing `gamma_area`, and identical in rationale: fewer cloud objects.) -- **Out:** the display pyramid (already sharded — leave untouched); the overwrite-in-place branch - (`conditions[name][:, :] = data`) is unchanged — an existing array keeps its codec; re-ingest of - *old, unsharded* cubes (those are not auto-migrated — see Migration). - -## Design - -In `ingest_s1tiling_conditions`, the new-array branch mirrors the pyramid: - -```python -inner_chunks = (calculate_aligned_chunk_size(h, 512), calculate_aligned_chunk_size(w, 512)) -arr = conditions.create_array( - array_name, shape=(h, w), dtype="float32", - chunks=inner_chunks, shards=(h, w), # one shard over the full extent (the only change) - compressors=zarr.codecs.BloscCodec(cname="zstd", clevel=5), - fill_value=float("nan"), dimension_names=["y", "x"], -) -``` - -`calculate_aligned_chunk_size` returns a **divisor** of the dimension near 512, so `(h, w)` is a -clean multiple of the inner chunk — the Zarr v3 shard-divisibility requirement (the same reason the -pyramid's `shard=(1, level_h, level_w)` / inner `(1, aligned, aligned)` is valid). - -## Web-optimized-GeoZarr constraint check - -`gamma_area`/`lia`/`incidence_angle` are **`conditions` arrays** (per-relative-orbit normalization -factors), **not** part of the multiscale pyramid TiTiler renders (`vv`/`vh`/`border_mask`). So -sharding them does **not** touch the web-render path; it only changes how a client reads a condition -array — a **windowed** read becomes one ranged GET into the shard instead of many chunk GETs + a -listing (better for cloud partial access), and the ingest writes **1 object instead of ~900** (the -upload lever). A full-array sequential read is the same bytes either way (see Benchmark caveat). -Values are byte-identical. 
- -## Acceptance criteria - -- [x] Condition arrays written with a sharding codec: `arr.shards == (h, w)`, inner - `arr.chunks == (aligned, aligned)` (same config as `vv`). *(test `test_gamma_area_is_sharded`)* -- [x] Object-count collapse proven: a multi-inner-chunk array lands as **1** on-disk shard object, - not one per inner chunk; values byte-identical through the shard. - *(test `test_sharding_collapses_chunk_objects_to_one` — 9 inner chunks → 1 object)* -- [x] Existing data-integrity / shape / dtype / attr tests stay green (sharding is read-transparent). -- [x] **Real-S3 validation** (laptop→DE, `esa-zarr-sentinel-explorer-tests`, 2026-06-19): see - Benchmark below — object collapse 100→1 proven on real S3, byte-identical read-back, - divisibility valid at the production 10980². (Full-tile re-ingest census ~3807→~210 still - pending an in-cluster run.) -- [x] Re-ingest path for existing (unsharded) cubes documented — see Migration. - -## Benchmark (real OVH S3, 2026-06-19) - -A 3660² (= 10×366 → 100 inner 366² chunks) gamma_area-like **smooth/compressible** surface -(models the real normalization factor, not random noise), sharded vs unsharded, on the live bucket; -upload via batched `fs.put(batch_size=32)` so concurrency is *already on* for the unsharded case: - -| layout | S3 objects | PUT (best of 3) | full-array GET | byte-identical | -|---|---|---|---|---| -| unsharded | 100 | 4.43 s | 1.38 s | ✓ | -| **sharded** | **1** | **2.55 s (1.7×)** | 1.61 s | ✓ | - -- **Object count 100 → 1** (production ~900 → 1 per `gamma_area`; cube 3604 → ~4). The core lever. -- **PUT 1.7× faster** *even with* batched concurrency hiding per-object latency; the ratio grows with - object count (900/array ≈ 28 batch-waves → 1), so the production win is far larger than 1.7×. -- **Divisibility valid at the real 10980²**: `calculate_aligned_chunk_size(10980,512)=366`, - `10980 % 366 == 0`, so `shards=(10980,10980)` builds without error. 
-- **Honest caveat:** a *full-array* sequential read is **not** faster sharded (1.61 vs 1.38 s) — it is - the same bytes in one un-parallelizable object vs many concurrent chunk GETs. T5's win is - object-count (upload + S3 listing/metadata overhead) and **windowed/partial** cloud reads (one - ranged GET into the shard vs many chunk GETs), not full-read throughput. - -## Verification - -1. Unit: `uv run pytest tests/test_s1_rtc_ingest.py` (57 green; +2 sharding tests). -2. Object census on a re-ingested real tile → ~3807 → ~210 objects. -3. TiTiler still renders `vv`/`vh` for that cube (render path unaffected). -4. `xarray.open_zarr` / `zarr` reads `gamma_area_*` byte-identical to the unsharded version. - -## Migration - -Old cubes written before this change stay **unsharded** until rewritten — Zarr does not re-chunk in -place. Re-ingest (data-pipeline `argo submit --from cronworkflow/eopf-explorer-s1rtc` per tile, cron -suspended) rebuilds the conditions with the sharded layout. Until a cube is re-ingested it is still -correct, just object-heavy. Sequence the re-ingest after the rebuilt image is deployed. diff --git a/src/eopf_geozarr/conversion/s1_ingest.py b/src/eopf_geozarr/conversion/s1_ingest.py index 38367488..62d723e0 100644 --- a/src/eopf_geozarr/conversion/s1_ingest.py +++ b/src/eopf_geozarr/conversion/s1_ingest.py @@ -986,10 +986,9 @@ def ingest_s1tiling_conditions( log.info("Overwrote condition array", array_name=array_name) else: # Shard like the vv/vh pyramid: one shard over the full (y, x) extent so a 10980² - # condition array is a single object, not ~900 tiny 366²-chunk objects (see - # claude-docs/specs/s1_gamma_area_sharding.md). calculate_aligned_chunk_size returns a - # divisor of the dimension, so (h, w) is a clean multiple of the inner chunk — the - # Zarr v3 shard-divisibility requirement. + # condition array is a single object, not ~900 tiny 366²-chunk objects. 
+ # calculate_aligned_chunk_size returns a divisor of the dimension, so (h, w) is a clean + # multiple of the inner chunk — the Zarr v3 shard-divisibility requirement. inner_chunks = ( calculate_aligned_chunk_size(h, 512), calculate_aligned_chunk_size(w, 512),
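The divisibility and object-count figures quoted in this series can be checked with a short sketch. It is not the repository's code: `largest_divisor_at_most` is a hypothetical stand-in which assumes that `calculate_aligned_chunk_size` picks the largest divisor of the dimension not exceeding the target (the patch only states that it returns a divisor of the dimension near 512). `layout_summary` is likewise an illustrative helper, and its `batch_size=32` default is taken from the benchmark's `fs.put(batch_size=32)`.

```python
from math import ceil


def largest_divisor_at_most(dim: int, target: int) -> int:
    """Hypothetical stand-in for calculate_aligned_chunk_size.

    Assumption: the real helper returns the largest divisor of `dim`
    that does not exceed `target`. The patch only guarantees "a divisor".
    """
    if dim < 1 or target < 1:
        raise ValueError("dim and target must be positive")
    for candidate in range(min(dim, target), 0, -1):
        if dim % candidate == 0:
            return candidate
    return 1  # not reached: 1 divides every positive integer


def layout_summary(dim: int, target: int = 512, batch_size: int = 32) -> dict[str, int]:
    """Object counts for a square (dim, dim) array, unsharded vs one full-extent shard."""
    inner = largest_divisor_at_most(dim, target)
    per_axis = dim // inner  # exact, because inner divides dim
    chunk_objects = per_axis * per_axis
    return {
        "inner_chunk": inner,
        "chunks_per_axis": per_axis,
        "unsharded_objects": chunk_objects,  # one stored object per inner chunk
        "sharded_objects": 1,  # shards=(dim, dim): a single shard object
        "upload_waves_unsharded": ceil(chunk_objects / batch_size),
    }


if __name__ == "__main__":
    # Production tile, benchmark array, and unit-test array sizes from the patch.
    for dim in (10980, 3660, 1098):
        print(dim, layout_summary(dim))
```

Under that assumption a 10980-pixel axis gets a 366-pixel inner chunk, so the array has 30 × 30 = 900 chunk objects unsharded versus one shard object, and the shard shape `(10980, 10980)` is an exact multiple of the inner chunk. The 3660² benchmark array and the 1098² test array give 100 and 9 inner chunks respectively.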