Lift HEALPix assign reference order to 29 + order-19 t-digest / gain-bias templates#86
Conversation
espg
left a comment
There was a problem hiding this comment.
🤖 from Claude (review) — fresh-context adversarial review of the reference-order bump + the two order-19 templates. No correctness bugs found; the change is safe to merge. Findings below are documentation drift and optional test hardening only.
Verified (independently, against mortie 0.8.1)
- Byte-identity of the bump:
clip2order(child, geo2mort(.., 29)) == clip2order(child, geo2mort(.., 18))for child ∈ {6, 12, 18} across 8 seeds incl. exact poles (±90) and antimeridian (±180). Shipped configs (atl06 order 12, atl03 order 18) do not drift. HIGH-risk claim → holds. - mortie cap: orders 30/31 raise
ValueError: Max order is 29; 29 is genuinely the ceiling. - Dtype safety: order-29
assignoutput reaches1.27e19(exceeds int64 max), but it's transient — never serialized. The persistedmorton/cell_idscoords come fromchildren()/cells_ofand are carried asuint64(configs declare it;write.pyroutes through the uint64 boundary; #71/#75). No int64 overflow/sign hazard is reintroduced. - Regression is locked: forcing
HEALPIX_REF_ORDER=18makestest_child_order_19_refines_order_18fail and leavestest_existing_order_assignment_unchangedgreen — the suite fences the fix in both directions. Fulltests/test_grids.py(56 tests) green on this branch. - tdigest template runnable end-to-end:
resolve_function→build_tdigest(h_ph, delta=256)→(k,2)float32 matchesinner_shape:[2];calculate_cell_statisticsyieldsh_tdigest (290,2);emit_templatecorrectly omits the dense array (CSR group prefix). - gain/bias template runnable: sandbox claims hold (
np.max/np.maximumwork,h_ph.max()raises under empty__builtins__);resolution: chunkcompanionsoffset_h/gain_hemit at the chunk grid(12·4^13,)chunked(1,),waveform_countsat(3.3e12,128)chunked(4096,128). K=16 multi-chunk-per-worker genuinely exercised; nominal 3.3e12 array is metadata-only and emits in ~10 ms.
Findings (all LOW; none block merge)
- Stale doc —
cells_ofdocstring (healpix.py:233) still says "order-18 leaf morton IDs"; now order-29. Alsosrc/zagg/grids/base.py:10(not in this diff) says "HEALPix: order-18 morton". The PR body's "single usage" is true for code but the literal18survives in prose. Pure drift —clip2orderis order-agnostic on input. - Test gap (optional hardening) —
TestReferenceOrdernever asserts the order-29 morton is uint64 / exceeds int64, and the suite never emits an order-19 template (the K=16 companion/CSR-skip behavior). Adding both would lock the two contracts this PR most depends on.
Recommendation: fix the two doc references (trivial, single-source the reference order in prose); the test additions are nice-to-have. Otherwise ready.
Generated by Claude Code
|
|
||
| def shards_of(self, leaf_ids) -> np.ndarray: | ||
| """Vectorized parent-morton lookup. ``leaf_ids`` must be at order 18.""" | ||
| """Vectorized parent-morton lookup. ``leaf_ids`` at :data:`HEALPIX_REF_ORDER`.""" |
There was a problem hiding this comment.
🤖 from Claude (review) — LOW (stale doc, anchored here for the hunk): You updated this shards_of docstring and assign above to reference HEALPIX_REF_ORDER, but two sibling references to the old "order-18" were left behind:
cells_ofdocstring (line 233 just below): "Coarsen order-18 leaf morton IDs tochild_ordercell IDs." — now order-29 leaves.src/zagg/grids/base.py:10(not in this diff): leaf-id terminology says "HEALPix: order-18 morton".
Also the PR body states "The constant has a single usage (assign, healpix.py:215)" — accurate for code usage, but the value 18 is still referenced in these two doc strings. No behavioral impact (clip2order coarsens order-29 input identically); pure doc drift. Worth a one-line fix to each so the reference order is single-sourced in prose too.
Generated by Claude Code
| # orders up to 29, so this is the cap. Coarsening from a deeper reference is | ||
| # byte-identical for any order <= the old value, so raising it leaves existing | ||
| # grids' cell/shard assignments unchanged (verified for the shipped configs). | ||
| HEALPIX_REF_ORDER: int = 29 |
There was a problem hiding this comment.
🤖 from Claude (review) — Confirmed safe (no change requested), one note for the record.
I independently verified the core claims of this bump against mortie 0.8.1:
- Byte-identity:
clip2order(child, geo2mort(..., 29))==clip2order(child, geo2mort(..., 18))for child ∈ {6, 12, 18} across 8 seeds incl. exact poles (±90) and the antimeridian (±180). Shipped configs do not drift. ✓ - mortie cap: order 30/31 raise
ValueError: Max order is 29, so 29 is genuinely the ceiling. ✓ - Dtype: order-29
assignoutput reaches1.27e19(bit 63 set → exceeds int64 max 9.22e18). This is fine only because that uncoarsened leaf morton is never stored — it's consumed in-process bycells_of/shards_of. The persistedmortoncoord comes fromchildren()/cells_of, which for full-sphere base cells 7–11 also exceeds int64 (saw1.50e19at order 19), and that path is correctly carried asuint64(configs declareuint64;write.pyroutes through the uint64 boundary; Adopt mortie v1 MortonIndexDtype for themortoncoordinate (keep NESTEDcell_ids) #71/Adopt mortie MortonIndexDtype for themortoncoordinate (keep NESTEDcell_ids) #75). So no int64 overflow/sign hazard is introduced. ✓
Worth one explicit line in the comment block here noting that the reference order must stay ≤ 29 and that the order-29 leaf morton is intentionally transient (never serialized) — it's the load-bearing reason the int64→uint64 #71 work isn't re-triggered by this bump.
Generated by Claude Code
| kind: ragged | ||
| function: "zagg.stats.tdigest.build_tdigest" | ||
| source: h_ph | ||
| inner_shape: [2] |
There was a problem hiding this comment.
🤖 from Claude (review) — Confirmed runnable (no change requested). I exercised this template end-to-end:
resolve_function("zagg.stats.tdigest.build_tdigest")resolves, and is called asbuild_tdigest(h_ph, delta=256).- Return shape is
(k, 2)float32, which matchesinner_shape: [2]exactly —_coerce_ragged_valueaccepts it (arr.ndim == len(inner)+1,arr.shape[1:] == (2,)), no reshape needed. calculate_cell_statistics(cell_data={'h_ph': ...}, config=cfg)returns{'count', 'h_tdigest'}withh_tdigestshaped(290, 2)float32 for 800 photons (saturates near δ).emit_templateto aMemoryStoreproduces19/{cell_ids,count,morton}/zarr.jsonand correctly omits a denseh_tdigestarray (CSR group prefix, the Tier-2 vectors: CSR ragged (values/offsets/cell_ids) + t-digest as List<FixedSizeList<2>> #48/Vector/ragged chunk companions + multi-chunk-per-worker (Closes #82, Refs #30) #84 skip in_spec). Grid resolves K=16, chunk_shape 4096. ✓
No action needed — flagging the verification so it's on record.
Generated by Claude Code
| # raises but the functional numpy form is fine. The range is floored at 1e-6 m | ||
| # BEFORE log2 so an all-equal-height chunk (range == 0) does not hit log2(0). | ||
| chunk_gain: | ||
| expression: "np.float32(np.maximum(0.5, 2.0 ** np.ceil(np.log2(np.maximum(1e-6, np.max(h_ph) - np.min(h_ph)) / 128.0))))" |
There was a problem hiding this comment.
🤖 from Claude (review) — Sandbox claim verified (no change requested). I ran this chunk_gain expression and the comment's claims in the actual sandbox namespace ({"__builtins__": {}, "np": np}, matching _eval_chunk_precompute aggregate.py:417 and the param path at :245):
np.float32(np.maximum(0.5, 2.0 ** np.ceil(np.log2(np.maximum(1e-6, np.max(h_ph) - np.min(h_ph)) / 128.0))))evaluates to anp.float32scalar. ✓chunk_offset=np.float32(np.floor(np.min(dem_h)))evaluates. ✓- The
waveform_countshistogram expr referencingchunk_offset/chunk_gainproduces a(128,)uint32. ✓ - Confirmed the comment's central caveat:
h_ph.max()(ndarray method) raisesKeyErrorunder empty__builtins__, while the functionalnp.max(h_ph)form works — so the "usenp.max/np.maximum, not ndarray methods" guidance is correct and load-bearing. ✓
Generated by Claude Code
| expression: "chunk_gain" | ||
| source: h_ph | ||
| dtype: float32 | ||
| resolution: chunk |
There was a problem hiding this comment.
🤖 from Claude (review) — resolution: chunk companions emit correctly (no change requested). emit_template to a MemoryStore produces:
waveform_counts: shape[3298534883328, 128], chunks[4096, 128], uint32 (vector dim chunked whole). ✓offset_h/gain_h: shape[805306368](= 12·4^13, the chunk grid atchunk_inner=13, not the parent grid), chunks[1], float32 — i.e. one value per inner chunk, K=16 per shard, indexed by the fineriter_chunksblock, consistent with the per-chunk (not shard-pooled) precompute. ✓cell_ids/mortoncoords present asuint64per the post-Adopt mortie v1 MortonIndexDtype for themortoncoordinate (keep NESTEDcell_ids) #71/Adopt mortie MortonIndexDtype for themortoncoordinate (keep NESTEDcell_ids) #75 convention. ✓
So the K>1 multi-chunk-per-worker path and the chunk-grid companion sizing are genuinely exercised end-to-end at child_order 19; the nominal 3.3e12 array is metadata-only and Zarr emits it in ~10 ms. No action needed.
Generated by Claude Code
| leaves = g.assign(np.array([38.89, -45.0]), np.array([-76.5, 30.0])) | ||
| for shard in np.unique(g.shards_of(leaves)): | ||
| children = g.children(int(shard)) | ||
| assert len(children) == 4 ** (19 - 11) |
There was a problem hiding this comment.
🤖 from Claude (review) — LOW (test-coverage gap, optional). TestReferenceOrder is solid and I confirmed it locks the regression: I temporarily forced HEALPIX_REF_ORDER = 18 and test_child_order_19_refines_order_18 fails (c19 == c18), while test_existing_order_assignment_unchanged passes at both 18 and 29 — so the suite genuinely fences the fix in both directions. Two things it does not assert, both load-bearing for the bump:
- No order-29 morton fits-uint64 / not-int64 check. The whole reason this is safe is that the order-29 leaf morton (which reaches
1.27e19, bit 63 set) is never serialized and the persisted coord is uint64. A test likeassert g.assign(...).dtype == np.uint64andassert assigned.max() > np.iinfo(np.int64).maxfor a full-sphere sample would catch any future regression to an int64 morton path (Adopt mortie v1 MortonIndexDtype for themortoncoordinate (keep NESTEDcell_ids) #71/Adopt mortie MortonIndexDtype for themortoncoordinate (keep NESTEDcell_ids) #75) silently truncating order-29 values. - No template is emitted at child_order 19 in the test suite. The PR body says the two configs
load_config/validate_config/from_configclean, but nothing emits their order-19 template (the K=16 companion/CSR-skip behavior). Anemit_templatesmoke test to aMemoryStoreasserting the companion shapes (offset_h=(12·4^13,)) would lock the chunk-grid sizing that this PR's templates rely on.
Neither blocks merge; the existing tests cover the actual constant change. These would harden the surrounding contract.
Generated by Claude Code
…/template tests
What
Fix: lift the HEALPix assign reference order so
child_order> 18 actually resolves.HealpixGrid.assign()pinned point resolution atHEALPIX_REF_ORDER = 18(commented "do not change"), thencells_of/shards_ofcoarsen down. mortie 0.8.1 resolves up to order 29, but with the reference pinned at 18 anychild_order > 18was silently collapsed to order 18 (verified: order-19 cells were byte-identical to order-18 for 5000 random points). This raises the constant to 29 (mortie's max). Adopt mortie MortonIndexDtype for themortoncoordinate (keep NESTEDcell_ids) #75 fixed the morton dtype but never lifted this pin.Two HEALPix order-19 test templates (multi-chunk-per-worker), for exercising the Vector (and ragged?) chunk companions:
resolution: chunkfor non-scalar kinds #82/Refactor the per-cell aggregation handoff: sort/hash grouping + Arrow path (additive, benchmarked) #30/Tier-2 vectors: CSR ragged (values/offsets/cell_ids) + t-digest as List<FixedSizeList<2>> #48 work end-to-end:configs/atl03_tdigest_healpix.yaml— per-cell t-digest of photon heights (kind: ragged,function: zagg.stats.tdigest.build_tdigest,inner_shape: [2],delta: 256), stored as CSR.configs/atl03_gain_bias_healpix.yaml— per-chunk gain/bias:chunk_precomputeDEM-anchoredchunk_offset+ power-of-twochunk_gain, withoffset_h/gain_hasresolution: chunkcompanions.Both grids:
parent_order: 11(shard = 256×256 cells),chunk_inner: 13(inner Zarr chunk = 64×64 cells),child_order: 19(~10 m leaf) ⇒ K = 16 inner chunks per shard.Safety of the reference-order bump
Coarsening from a deeper reference is byte-identical for any order ≤ the old value, so existing configs are unaffected. Verified for the shipped configs: atl06 (order 12) and atl03 (order 18) produce identical cells and shards at reference 18 vs 29. The constant has a single usage (
assign, healpix.py:215).What stays at
child_order(not 29)The bump only changes the intermediate resolution in
assign; the storedmorton/cell_idscoordinates (and the DGGSrefinement_level) remain at the grid'schild_order(19 here), since they come fromchildren() = generate_morton_children(shard_key, child_order). (See the thread for the more general per-observation high-resolution location-morton case, which this bump now makes computable — tracked separately.)Deploy note
The fix lives in the function code (
src/zagg/grids/healpix.py, shipped bybuild_function.sh), andassign/cells_ofrun insideprocess_shardon the worker — so Lambda runs need a function redeploy (and the deployed layer must carrymortie ≥ 0.8.1forgeo2mort(order=29)). The catalog/shardmap (orchestrator) is unaffected. Local runs pick it up directly. If deploying viastand_up.sh(which reads zips from source.coop), that means a fresh Lambda Build →publish_mirror.sh→stand_up.sh.Testing
tests/test_grids.py::TestReferenceOrder(new): reference order ≥ 19; order-19 genuinely refines order-18 (the regression); existing order-12 assignment byte-identical from the deeper reference; order-19 children (65536 = 256×256) nest back to their order-11 shard.uv run pytest tests/test_grids.py tests/test_config.py -q→ 225 passed. Both templatesload_config+validate_config+from_configclean (K=16, 4096 cells/chunk).ruff check/format --checkclean on touched files.Questions for review
max(child_order, …). Keeps a single fixed reference and is forward-compatible with high-resolution morton location encoding (order 26/27); flag if you'd prefer it derived fromchild_order.Generated by Claude Code