
Lift HEALPix assign reference order to 29 + order-19 t-digest / gain-bias templates#86

Merged
espg merged 2 commits into main from claude/healpix-order29-templates on Jun 23, 2026

@espg espg commented Jun 23, 2026

What

  1. Fix: lift the HEALPix assign reference order so child_order > 18 actually resolves. HealpixGrid.assign() pinned point resolution at HEALPIX_REF_ORDER = 18 (commented "do not change"); cells_of/shards_of then coarsen down from that reference. mortie 0.8.1 resolves up to order 29, but with the reference pinned at 18 any child_order > 18 was silently collapsed to order 18 (verified: order-19 cells were byte-identical to order-18 cells for 5000 random points). This PR raises the constant to 29 (mortie's max). #75 (Adopt mortie MortonIndexDtype for the morton coordinate (keep NESTED cell_ids)) fixed the morton dtype but never lifted this pin.

  2. Two HEALPix order-19 test templates (multi-chunk-per-worker), for exercising the work from #82 (Vector (and ragged?) chunk companions: resolution: chunk for non-scalar kinds), #30 (Refactor the per-cell aggregation handoff: sort/hash grouping + Arrow path (additive, benchmarked)), and #48 (Tier-2 vectors: CSR ragged (values/offsets/cell_ids) + t-digest as List<FixedSizeList<2>>) end-to-end:

    • configs/atl03_tdigest_healpix.yaml — per-cell t-digest of photon heights (kind: ragged, function: zagg.stats.tdigest.build_tdigest, inner_shape: [2], delta: 256), stored as CSR.
    • configs/atl03_gain_bias_healpix.yaml: per-chunk gain/bias, with a chunk_precompute DEM-anchored chunk_offset + power-of-two chunk_gain and offset_h/gain_h as resolution: chunk companions.

Both grids: parent_order: 11 (shard = 256×256 cells), chunk_inner: 13 (inner Zarr chunk = 64×64 cells), child_order: 19 (~10 m leaf) ⇒ K = 16 inner chunks per shard.
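The counts above follow from each HEALPix order step splitting a cell into 4 children. A minimal sketch of that arithmetic (the helper name `cells_between` is hypothetical and not part of zagg):

```python
from math import isqrt


def cells_between(coarse_order: int, fine_order: int) -> int:
    """Number of fine-order cells nested inside one coarse-order cell."""
    if fine_order < coarse_order:
        raise ValueError("fine_order must be >= coarse_order")
    return 4 ** (fine_order - coarse_order)


# Orders taken from the two templates in this PR.
parent_order, chunk_inner, child_order = 11, 13, 19

cells_per_shard = cells_between(parent_order, child_order)   # leaves per shard
cells_per_chunk = cells_between(chunk_inner, child_order)    # leaves per inner chunk
chunks_per_shard = cells_between(parent_order, chunk_inner)  # K

print(cells_per_shard, isqrt(cells_per_shard))  # shard size and its side length
print(cells_per_chunk, isqrt(cells_per_chunk))  # chunk size and its side length
print(chunks_per_shard)                         # inner chunks per shard
```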

Safety of the reference-order bump

Coarsening from a deeper reference is byte-identical for any order ≤ the old value, so existing configs are unaffected. Verified for the shipped configs: atl06 (order 12) and atl03 (order 18) produce identical cells and shards at reference 18 vs 29. The constant has a single usage (assign, healpix.py:215).
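Why coarsening from a deeper reference cannot change coarser cells can be illustrated with a toy model. The sketch below is an assumption-laden stand-in: it uses plain nested integer indices where each order adds two bits, and the helper `coarsen_nested` is hypothetical. It is not mortie's morton encoding and does not call clip2order or geo2mort; it only shows the nesting property the claim relies on.

```python
import numpy as np


def coarsen_nested(idx: np.ndarray, from_order: int, to_order: int) -> np.ndarray:
    """Toy coarsening: drop two bits per order (NOT mortie's encoding)."""
    if to_order > from_order:
        raise ValueError("coarsening cannot increase the order")
    return idx >> np.uint64(2 * (from_order - to_order))


rng = np.random.default_rng(0)
# Toy leaf indices at order 29 inside a single base cell (58 bits).
leaves_29 = rng.integers(0, 4**29, size=5000, dtype=np.uint64)
# What a reference pinned at order 18 would have kept.
leaves_18 = coarsen_nested(leaves_29, 29, 18)

# Any order <= 18 is identical whether it is reached from 29 or from 18.
for child_order in (6, 12, 18):
    direct = coarsen_nested(leaves_29, 29, child_order)
    via_18 = coarsen_nested(leaves_18, 18, child_order)
    assert np.array_equal(direct, via_18)

# Order 19 is only recoverable from the deeper reference; it nests into order 18.
cells_19 = coarsen_nested(leaves_29, 29, 19)
assert np.array_equal(coarsen_nested(cells_19, 19, 18), leaves_18)
```

In this toy model the order-18 reference has already discarded the bits that distinguish order-19 siblings, which mirrors the collapse described in the PR body.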

What stays at child_order (not 29)

The bump only changes the intermediate resolution in assign; the stored morton/cell_ids coordinates (and the DGGS refinement_level) remain at the grid's child_order (19 here), since they come from children() = generate_morton_children(shard_key, child_order). (See the thread for the more general per-observation high-resolution location-morton case, which this bump now makes computable — tracked separately.)

Deploy note

The fix lives in the function code (src/zagg/grids/healpix.py, shipped by build_function.sh), and assign/cells_of run inside process_shard on the worker, so Lambda runs need a function redeploy (and the deployed layer must carry mortie ≥ 0.8.1 for geo2mort(order=29)). The catalog/shardmap (orchestrator) is unaffected. Local runs pick it up directly. If deploying via stand_up.sh (which reads zips from source.coop), that means a fresh Lambda Build → publish_mirror.sh → stand_up.sh.

Testing

  • tests/test_grids.py::TestReferenceOrder (new): reference order ≥ 19; order-19 genuinely refines order-18 (the regression); existing order-12 assignment byte-identical from the deeper reference; order-19 children (65536 = 256×256) nest back to their order-11 shard.
  • uv run pytest tests/test_grids.py tests/test_config.py -q → 225 passed. Both templates load_config + validate_config + from_config clean (K=16, 4096 cells/chunk). ruff check/format --check clean on touched files.

Questions for review

  • Reference order set to 29 (mortie's max) rather than a per-grid max(child_order, …). Keeps a single fixed reference and is forward-compatible with high-resolution morton location encoding (order 26/27); flag if you'd prefer it derived from child_order.

Generated by Claude Code

@espg espg left a comment


🤖 from Claude (review) — fresh-context adversarial review of the reference-order bump + the two order-19 templates. No correctness bugs found; the change is safe to merge. Findings below are documentation drift and optional test hardening only.

Verified (independently, against mortie 0.8.1)

  • Byte-identity of the bump: clip2order(child, geo2mort(.., 29)) == clip2order(child, geo2mort(.., 18)) for child ∈ {6, 12, 18} across 8 seeds incl. exact poles (±90) and antimeridian (±180). Shipped configs (atl06 order 12, atl03 order 18) do not drift. HIGH-risk claim → holds.
  • mortie cap: orders 30/31 raise ValueError: Max order is 29; 29 is genuinely the ceiling.
  • Dtype safety: order-29 assign output reaches 1.27e19 (exceeds int64 max), but it's transient — never serialized. The persisted morton/cell_ids coords come from children()/cells_of and are carried as uint64 (configs declare it; write.py routes through the uint64 boundary; #71/#75). No int64 overflow/sign hazard is reintroduced.
  • Regression is locked: forcing HEALPIX_REF_ORDER=18 makes test_child_order_19_refines_order_18 fail and leaves test_existing_order_assignment_unchanged green — the suite fences the fix in both directions. Full tests/test_grids.py (56 tests) green on this branch.
  • tdigest template runnable end-to-end: resolve_function → build_tdigest(h_ph, delta=256) → (k,2) float32 matches inner_shape: [2]; calculate_cell_statistics yields h_tdigest (290,2); emit_template correctly omits the dense array (CSR group prefix).
  • gain/bias template runnable: sandbox claims hold (np.max/np.maximum work, h_ph.max() raises under empty __builtins__); resolution: chunk companions offset_h/gain_h emit at the chunk grid (12·4^13,) chunked (1,), waveform_counts at (3.3e12,128) chunked (4096,128). K=16 multi-chunk-per-worker genuinely exercised; nominal 3.3e12 array is metadata-only and emits in ~10 ms.

Findings (all LOW; none block merge)

  1. Stale doc — cells_of docstring (healpix.py:233) still says "order-18 leaf morton IDs"; now order-29. Also src/zagg/grids/base.py:10 (not in this diff) says "HEALPix: order-18 morton". The PR body's "single usage" is true for code but the literal 18 survives in prose. Pure drift — clip2order is order-agnostic on input.
  2. Test gap (optional hardening): TestReferenceOrder never asserts the order-29 morton is uint64 / exceeds int64, and the suite never emits an order-19 template (the K=16 companion/CSR-skip behavior). Adding both would lock the two contracts this PR most depends on.

Recommendation: fix the two doc references (trivial, single-source the reference order in prose); the test additions are nice-to-have. Otherwise ready.


Generated by Claude Code

Comment thread src/zagg/grids/healpix.py

 def shards_of(self, leaf_ids) -> np.ndarray:
-    """Vectorized parent-morton lookup. ``leaf_ids`` must be at order 18."""
+    """Vectorized parent-morton lookup. ``leaf_ids`` at :data:`HEALPIX_REF_ORDER`."""


🤖 from Claude (review): LOW (stale doc, anchored here for the hunk). You updated this shards_of docstring and assign above to reference HEALPIX_REF_ORDER, but two sibling references to the old "order-18" were left behind:

  • cells_of docstring (line 233 just below): "Coarsen order-18 leaf morton IDs to child_order cell IDs." — now order-29 leaves.
  • src/zagg/grids/base.py:10 (not in this diff): leaf-id terminology says "HEALPix: order-18 morton".

Also the PR body states "The constant has a single usage (assign, healpix.py:215)" — accurate for code usage, but the value 18 is still referenced in these two doc strings. No behavioral impact (clip2order coarsens order-29 input identically); pure doc drift. Worth a one-line fix to each so the reference order is single-sourced in prose too.


Generated by Claude Code

Comment thread src/zagg/grids/healpix.py
# orders up to 29, so this is the cap. Coarsening from a deeper reference is
# byte-identical for any order <= the old value, so raising it leaves existing
# grids' cell/shard assignments unchanged (verified for the shipped configs).
HEALPIX_REF_ORDER: int = 29


🤖 from Claude (review): Confirmed safe (no change requested), one note for the record.

I independently verified the core claims of this bump against mortie 0.8.1:

  • Byte-identity: clip2order(child, geo2mort(..., 29)) == clip2order(child, geo2mort(..., 18)) for child ∈ {6, 12, 18} across 8 seeds incl. exact poles (±90) and the antimeridian (±180). Shipped configs do not drift. ✓
  • mortie cap: order 30/31 raise ValueError: Max order is 29, so 29 is genuinely the ceiling. ✓
  • Dtype: order-29 assign output reaches 1.27e19 (bit 63 set → exceeds int64 max 9.22e18). This is fine only because that uncoarsened leaf morton is never stored; it is consumed in-process by cells_of/shards_of. The persisted morton coord comes from children()/cells_of, which for full-sphere base cells 7–11 also exceeds int64 (saw 1.50e19 at order 19), and that path is correctly carried as uint64 (configs declare uint64; write.py routes through the uint64 boundary; #71/#75). So no int64 overflow/sign hazard is introduced. ✓

Worth one explicit line in the comment block here noting that the reference order must stay ≤ 29 and that the order-29 leaf morton is intentionally transient (never serialized) — it's the load-bearing reason the int64→uint64 #71 work isn't re-triggered by this bump.
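The int64 hazard mentioned here can be shown with NumPy alone. This is a sketch with an illustrative value of the quoted magnitude (about 1.27e19), not an actual morton index produced by mortie or zagg:

```python
import numpy as np

INT64_MAX = np.iinfo(np.int64).max
UINT64_MAX = np.iinfo(np.uint64).max

# Illustrative value only: same magnitude as the order-29 output quoted above.
transient = np.array([12_700_000_000_000_000_000], dtype=np.uint64)

value = int(transient[0])
exceeds_int64 = value > INT64_MAX   # bit 63 is set
fits_uint64 = value <= UINT64_MAX   # still representable unsigned

# Forcing the value through int64 wraps modulo 2**64 and flips the sign,
# which is the kind of silent corruption a uint64 boundary avoids.
wrapped = transient.astype(np.int64)

print(exceeds_int64, fits_uint64, int(wrapped[0]))
```

A regression test along the lines suggested in this review could assert both the dtype and the "exceeds int64" condition on a full-sphere sample.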


Generated by Claude Code

kind: ragged
function: "zagg.stats.tdigest.build_tdigest"
source: h_ph
inner_shape: [2]


🤖 from Claude (review): Confirmed runnable (no change requested). I exercised this template end-to-end.

No action needed — flagging the verification so it's on record.


Generated by Claude Code

# raises but the functional numpy form is fine. The range is floored at 1e-6 m
# BEFORE log2 so an all-equal-height chunk (range == 0) does not hit log2(0).
chunk_gain:
  expression: "np.float32(np.maximum(0.5, 2.0 ** np.ceil(np.log2(np.maximum(1e-6, np.max(h_ph) - np.min(h_ph)) / 128.0))))"


🤖 from Claude (review): Sandbox claim verified (no change requested). I ran this chunk_gain expression and the comment's claims in the actual sandbox namespace ({"__builtins__": {}, "np": np}, matching _eval_chunk_precompute at aggregate.py:417 and the param path at :245):

  • np.float32(np.maximum(0.5, 2.0 ** np.ceil(np.log2(np.maximum(1e-6, np.max(h_ph) - np.min(h_ph)) / 128.0)))) evaluates to a np.float32 scalar. ✓
  • chunk_offset = np.float32(np.floor(np.min(dem_h))) evaluates. ✓
  • The waveform_counts histogram expr referencing chunk_offset/chunk_gain produces a (128,) uint32. ✓
  • Confirmed the comment's central caveat: h_ph.max() (ndarray method) raises KeyError under empty __builtins__, while the functional np.max(h_ph) form works — so the "use np.max/np.maximum, not ndarray methods" guidance is correct and load-bearing. ✓
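For readers who want to reproduce the functional-form check, here is a minimal sketch that evaluates the same chunk_gain expression under an empty `__builtins__`. The helper `eval_chunk_gain` is hypothetical, and binding `h_ph` through the globals dict is an assumption; zagg's `_eval_chunk_precompute` may pass variables differently.

```python
import numpy as np

# Expression copied from the template under review.
CHUNK_GAIN_EXPR = (
    "np.float32(np.maximum(0.5, 2.0 ** np.ceil(np.log2("
    "np.maximum(1e-6, np.max(h_ph) - np.min(h_ph)) / 128.0))))"
)


def eval_chunk_gain(h_ph: np.ndarray) -> np.float32:
    """Evaluate the expression with no builtins and only numpy in scope."""
    namespace = {"__builtins__": {}, "np": np, "h_ph": h_ph}
    return eval(CHUNK_GAIN_EXPR, namespace)


flat = eval_chunk_gain(np.array([10.0, 10.0, 10.0]))  # range 0: floored before log2
moderate = eval_chunk_gain(np.array([0.0, 100.0]))    # range 100 m
tall = eval_chunk_gain(np.array([0.0, 1000.0]))       # range 1000 m

print(flat, moderate, tall)
```

The gain is the smallest power of two that lets 128 bins span the height range, clamped below at 0.5, so a zero-range chunk lands on the 0.5 floor instead of hitting log2(0).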

Generated by Claude Code

expression: "chunk_gain"
source: h_ph
dtype: float32
resolution: chunk


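🤖 from Claude (review): resolution: chunk companions emit correctly (no change requested). emit_template to a MemoryStore produces offset_h/gain_h at the chunk grid (12·4^13,) chunked (1,), and waveform_counts at (3.3e12, 128) chunked (4096, 128).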

So the K>1 multi-chunk-per-worker path and the chunk-grid companion sizing are genuinely exercised end-to-end at child_order 19; the nominal 3.3e12 array is metadata-only and Zarr emits it in ~10 ms. No action needed.
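The nominal sizes quoted in this review follow from the full-sphere HEALPix pixel count, 12·4^order. A short sketch of that arithmetic; the helper name `healpix_npix` is hypothetical, and the uint32 item size for waveform_counts is an assumption based on the (128,) uint32 histogram mentioned in the sandbox check:

```python
def healpix_npix(order: int) -> int:
    """Full-sphere HEALPix pixel count at a given order (12 base cells)."""
    return 12 * 4**order


chunk_inner, child_order, n_bins = 13, 19, 128

n_chunks = healpix_npix(chunk_inner)  # length of offset_h / gain_h
n_cells = healpix_npix(child_order)   # first axis of waveform_counts

companion_shape = (n_chunks,)
waveform_shape = (n_cells, n_bins)

# Nominal (never materialized) size, assuming 4-byte uint32 counts.
nominal_pib = n_cells * n_bins * 4 / 2**50

print(companion_shape, waveform_shape, nominal_pib)
```

At that nominal size the array can only be described, not filled, which is consistent with the template emit being a metadata-only operation.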


Generated by Claude Code

Comment thread tests/test_grids.py
leaves = g.assign(np.array([38.89, -45.0]), np.array([-76.5, 30.0]))
for shard in np.unique(g.shards_of(leaves)):
    children = g.children(int(shard))
    assert len(children) == 4 ** (19 - 11)


🤖 from Claude (review): LOW (test-coverage gap, optional). TestReferenceOrder is solid and I confirmed it locks the regression: I temporarily forced HEALPIX_REF_ORDER = 18 and test_child_order_19_refines_order_18 fails (c19 == c18), while test_existing_order_assignment_unchanged passes at both 18 and 29, so the suite genuinely fences the fix in both directions. Two things it does not assert, both load-bearing for the bump:

  1. No order-29 morton fits-uint64 / not-int64 check. The whole reason this is safe is that the order-29 leaf morton (which reaches 1.27e19, bit 63 set) is never serialized and the persisted coord is uint64. A test like assert g.assign(...).dtype == np.uint64 and assert assigned.max() > np.iinfo(np.int64).max for a full-sphere sample would catch any future regression to an int64 morton path (#71/#75) silently truncating order-29 values.
  2. No template is emitted at child_order 19 in the test suite. The PR body says the two configs load_config/validate_config/from_config clean, but nothing emits their order-19 template (the K=16 companion/CSR-skip behavior). An emit_template smoke test to a MemoryStore asserting the companion shapes (offset_h = (12·4^13,)) would lock the chunk-grid sizing that this PR's templates rely on.

Neither blocks merge; the existing tests cover the actual constant change. These would harden the surrounding contract.


Generated by Claude Code

@espg espg marked this pull request as ready for review June 23, 2026 05:07
@espg espg merged commit fa42d8d into main Jun 23, 2026
8 checks passed
@espg espg deleted the claude/healpix-order29-templates branch June 29, 2026 02:00