shardmap viewer: headless render core + ipyleaflet wrapper#44
Conversation
|
🤖 from Claude Opened this draft PR for #38 — phases 1 and 2 both landed; phase 3 (docs/notebook) is deferred to the next run per the plan. Landed
Tested: Self-review folded (commit Two items raised under Questions for review in the PR body: deferred browser verification, and the Generated by Claude Code |
espg
left a comment
There was a problem hiding this comment.
🤖 from Claude (review)
Fresh adversarial self-review of the phase-1/phase-2 diff. Two real defects were found and folded before opening:
_split_antimeridianoriginally gated on total longitude span (max-min > 180), which over-split a wide-but-continuous swath (e.g. a granule footprint stepping -170 → 0 → +170) into ±180 slivers, discarding the real interior. Now keyed on an actual consecutive-vertex seam jump (_crosses_antimeridian). Live exposure wasgranule_footprints, since catalog footprints carry no narrow-cell guarantee. Covered bytest_wide_non_crossing_polygon_kept_intact.- Interior rings (holes) were dropped on the exterior-only path. Now carried through the unwrap/clip. Covered by
test_holes_preserved.
Verified fine (no change needed): import safety with ipyleaflet absent, the <= max_shards gate at its boundary, grid_from_signature rectilinear affine→bounds math (incl. anisotropic resolution), and the MultiPolygon assembly nesting.
Standing (not blocking, for @espg): viewport_cells drops a clip that degenerates to a point/line on a shared shard edge (touch-only intersection). That's the right default for a viewer, but flagging in case you want edge-touch shards surfaced in the grid layer.
Generated by Claude Code
...instructions on to verify this? Are we doing an example ipython notebook (would be the natural place for instructions). |
espg
left a comment
There was a problem hiding this comment.
🤖 from Claude (review)
Fresh-context adversarial review of phase 3 only (commit 69b5301): the new notebooks/shardmap_viewer.ipynb and the 7-line docs/quickstart.md pointer. Findings are in the summary because the notebook is JSON and inline anchoring would be noisy; cell numbers are 0-indexed as in the file.
Verdict: looks good. No blocking or diff-scoped findings. The notebook runs offline on synthetic fixtures, the public API calls are all correct, and the verification instructions match the code. Two trivial nits below, both optional.
API usage — all correct (checked against src/zagg/viz/):
render_shardmap(shardmap, catalog)(cell 5) → returns keysshards/granules/cells; cell 6 readslayers["shards"]and cell 7 readslayers["granules"]— both matchshardmap.py:346. With nobbox,cellsisNone, and the cell-5 dict-comp guards withfc and len(...), so it won'tNoneType-subscript. Good.shard_outlines,granule_footprints,viewport_cells,grid_from_signature,show_shardmap(cells 6/7/9/11/13) — signatures and import paths (from zagg.viz import ...,from zagg.viz.shardmap import grid_from_signature) all match__init__.py's__all__.ShardMap(grid.signature(), shard_keys, granules, {...})(cell 2) andCatalog(pa.table(...), {...})(cell 3) match the dataclass field order inshardmap.py:243/sources.py:222.RectilinearGrid(crs, res, bounds, chunk_shape)andHealpixGrid(2, 6, layout=...)match the constructors. These mirrortests/test_viz.pyfixtures exactly, which is the right thing to copy.viewport_cells(shardmap, (lon0, lat0, lon1, lat1), max_shards=...)(cell 9) usesshard_footprint(0).bounds→(minx, miny, maxx, maxy)=(lon_min, lat_min, lon_max, lat_max), the orderviewport_cellsexpects. Correct.
Offline / no-network (req 1): confirmed. Everything is built in-memory from RectilinearGrid + hand-written STAC dicts → stac_geoparquet.arrow; the only disk I/O is a tempfile.mkdtemp() round-trip in cell 13. No S3/Earthdata/CMR calls.
Verification instructions (req 3): accurate. Layer names in cell 14 — shards, granule footprints, grid (shard cells) — match leaflet.py:93/99/109. The grid-on-zoom description matches the max_shards gate wired to bounds (leaflet.py:113-123), and "layer control (top-right)" matches LayersControl(position="topright") (leaflet.py:125). The styling words ("blue" shards / "red" footprints / "grey" grid) match _SHARD_STYLE/_FOOTPRINT_STYLE/_GRID_STYLE.
Leanness (req 4): clean — all 8 code cells have execution_count: null and zero outputs, 12 KB, no embedded images/binaries.
Docs pointer (req 5): the github.com/englacial/zagg/blob/main/notebooks/shardmap_viewer.ipynb link matches the existing custom_aggregations.ipynb link style in the same file, the zagg[viz] extra exists in pyproject.toml, and python -m zagg.catalog (cited in cells 1/14) matches the command documented at quickstart.md:56.
Nits (optional, non-blocking):
- Cell 13 markdown: "We round-trip the synthetic objects to disk first, since the viewer accepts either in-memory objects or file paths." The because is slightly backwards — since the viewer accepts in-memory objects too, the disk round-trip isn't required; it's a demonstration of the saved-file path. Consider "...to demonstrate the saved-file path (the viewer also accepts in-memory objects)."
- Cell 13's
print("layers:", [getattr(layer, "name", ...) ...])will list the basemap TileLayer (noname) asTileLayeralongside the three GeoJSON layer names — fine, just slightly noisier than the cell-14 prose implies. No change needed.
Nothing here blocks the next step; fold the nits or leave them at your discretion.
Generated by Claude Code
|
🤖 from Claude Phase 3 — example notebook + verification instructions (commits This directly answers your question in #44 (comment) ("instructions on how to verify this? Are we doing an example ipython notebook…"): yes — a notebook is the natural home, so that's what this phase adds. Added
Verification (also written as a checklist in the notebook's final markdown cell):
Validation done here: notebook is valid JSON / Self-review: a fresh-context adversarial review of this diff found no blocking/diff-scoped issues (API calls, layer names, gate behavior, and offline-ness all verified). I folded its one wording nit in Phases 1–3 are now complete, so I'm setting this PR to Generated by Claude Code |
nope. we want to run it for real, and actually generate a real stack catalog hitting the nasa CMR. Note that doing this still won't require any creds (anonymous cmr query is allowed). We're assuming / requiring an internet connection for the notebook to work-- which is fine. Update the notebook so we have a (or a a few) real stac and shardmaps to visualize in the notebook. |
|
🤖 from Claude Done — the notebook now queries real ATL06 granules from NASA CMR-STAC anonymously (no Earthdata credentials required). What changed:
How to run: pip install zagg[viz] stac-geoparquet
jupyter lab notebooks/shardmap_viewer.ipynbRequires an internet connection for the anonymous CMR-STAC query in section 1; no credentials needed. Note on headless execution: The build environment has no outbound network access to Generated by Claude Code |
espg
left a comment
There was a problem hiding this comment.
🤖 from Claude (review)
Four findings on commit e9db108. Two are correctness issues that will cause runtime failures; two are lower-severity.
1. version="006" is wrong for ATL06 — will return zero granules (cells 2 and 18)
The CLI in src/zagg/catalog/__main__.py line 187 defaults to --version "007", which forms the CMR collection ID ATL06_007. The notebook uses version="006" forming ATL06_006. If ATL06_006 is not a live collection on NSIDC_CPRD's CMR-STAC endpoint, CMRSource.fetch will raise ValueError: No granules for ATL06_006 … and the notebook will fail at cell 2 (and again at cell 18). This needs to be confirmed against the live CMR endpoint — if "006" is intentional and verified, add a comment explaining why the notebook diverges from the CLI default. Otherwise change both version="006" to version="007" to match the CLI.
2. Cell 18 uses render_shardmap without importing it (NameError on partial run)
Cell 18 calls render_shardmap(shardmap2, catalog2) but does not import it. The only import is in cell 10 (from zagg.viz import render_shardmap). A linear top-to-bottom run works because Jupyter carries the name forward, but if a user restarts the kernel and runs cell 18 in isolation (the "Second AOI" section is self-contained in intent), they get a NameError. Add from zagg.viz import render_shardmap at the top of cell 18, or hoist it to the first code cell alongside the other catalog imports.
3. Cell 3: str(rec['https'])[:70] prints "None..." silently when https is absent
granule_records() (sources.py line 280) sets "https": data.get("href"), which is None when no HTTPS asset is present. str(None)[:70] does not crash, but it prints None... — indistinguishable from a valid URL that happens to start with "None". A guard makes intent explicit:
https_preview = rec["https"][:70] if rec["https"] else "(no HTTPS asset)"
print(f" {rec['id']} | https: {https_preview}...")4. Cell 6 shadows granules in the kernel namespace
for key, granules in zip(shardmap.shard_keys[:4], shardmap.granules[:4]):The loop variable granules shadows any prior binding. It does not break anything today, but it is a hazard if cells are reordered or a later cell expects granules as a list. Renaming to shard_granules or grp avoids the ambiguity.
Generated by Claude Code
|
🤖 from Claude Self-review of commit
On the Generated by Claude Code |
|
minor nit -- version="007" is correct. larger nit -- the current example doesn't showcase the grid display. Zooming in is supposed to display the grid, but we only get the shard layout (which is useful, but incomplete). major nit-- we aren't supporting polar areas correctly. The base projection is web mercator, and does horribly for polar regions. Can we talk about options that project on a globe and are usable? Have a look at englacial/xopr#64 for one discussion on this in the stac catalog space. Detail you thoughts and potential options for us. |
|
🤖 from Claude Taking the three nits in order; the projection one is the real design fork, so most of this is there. 1.
|
|
I think the correct answer here is (C) and (B) -- with (C) being an earlier phase, and (B) as a later phase. I'll have you Plan this, since (B) probably needs a planning pass. The reasoning for above is as follows-- we want the inline ipyleaflet functionality, and want it to work. (C) is something that runs within a notebook cell, and displays there too. Option (B) is heavier, and doesn't live within a notebook-- although it can live within a jupyterhub. The idea for (B) is as a more full featured hub extension; something that takes in either just a shardmap.json, or that and a stac catalog as well. But it's menu driven, and outside of the notebook.
GIBS is great for this. |
espg
left a comment
There was a problem hiding this comment.
🤖 from Claude (review)
Fresh-context adversarial self-review of the grid-on-zoom fix. One material finding, since folded into commit 8886321; the rest of the change checked out clean.
Material (addressed): HEALPix child cells crossing ±180 were clipped before the seam split. A child cell straddling the antimeridian comes back from mort2polygon as a flat ring spanning ~360° (-179.98 … 180). The first pass clipped that band with .intersection(view) directly, so a seam-hugging viewport produced a wrong, oversized geometry (the band filled the whole view strip, ~0.19 deg² vs the true ~0.009 deg² cell) or dropped the cell. The working shard_outlines layer never hit this because it runs the seam split on the unclipped footprint; the bug was specific to the new HEALPix viewport_cells path. Fix: _seam_safe_polygon cuts the cell at ±180 (unwrap +360 / clip / rewrap, returning shapely) before the viewport clip; the new regression test test_seam_cell_clipped_to_true_sliver_not_global_band fails on the pre-fix behavior and passes after.
Checked and OK (no action):
_debouncethread-safety:loop.call_laterruns on the kernel's main/loop thread; theboundsobserver and the callback share that single thread, so no race on the pending handle. This genuinely moves thegrid_layer.datamutation off the oldthreading.Timerthread.- No
boundsfeedback loop: settinggrid_layer.datamutates a separate widget; it does not re-firem.bounds. np.isin(parents, visible_keys)parent filter,max_cellsgate placement (before polygon construction), and empty-coverage / empty-visible edge cases all correct.- Non-HEALPix path and the public
viewport_cellssignature are unchanged (max_cellsis a new keyword with a default).
Generated by Claude Code
|
🤖 from Claude Deep diagnostic pass on the two issues reported after the STRtree+debounce commits ( (B) "the grid doesn't NEST the shards" — root cause
Fix: for a HEALPix grid, (A) "still unusable / crashes the kernel when zoomed in" — root causeThe debounce mutated the widget from a background thread. Fix: Before / after (headless,
|
| zoom (frac of shard) | cells emitted | order | time |
|---|---|---|---|
| 0.4 | 1404 | 12 | ~1.0 s |
| 0.2 | 364 | 12 | 0.27 s |
| 0.08 | 84 | 12 | 0.07 s |
| 0.03 | 12 | 12 | 0.02 s |
Cell count scales with viewport area, not shard fan-out, and stays far below the 16384-cell full enumeration.
Tests added (tests/test_viz.py, headless)
TestViewportCellsHealpix: emitted cells are atchild_order(infer_order_from_morton), each cell's parent is a visible shard, the union of a shard's child cells reconstructs its footprint (nesting), cell count shrinks as you zoom in, a guard that a zoomed-in query does not callgenerate_morton_children(no full enumeration), and themax_cellsgate.- Antimeridian: child cells crossing ±180 split into hemisphere-local slivers (the self-review caught a clip-before-split defect on the seam, folded in
8886321— see the review comment). TestDebounce: the burst coalesces to one call on the loop thread under a running event loop, plus the no-loop synchronous fallback.
Status
uv run ruff check --select=E,F,W,I --ignore=E501 src tests, ruff format --check (touched files), and pytest tests/test_viz.py (53 passing) are green; PR CI ruff / test (3.12) / test (3.13) all green.
Question for review
I added a max_cells gate (default 2000) so a single very dense viewport returns empty rather than emitting tens of thousands of child polygons — the analog of the existing max_shards gate. At child_order=12 a whole order-6 shard in view is ~44k cells, so an upper bound is needed somewhere. Options if you'd prefer different behavior for dense viewports: (1) keep the empty-on-too-dense gate (current); (2) render a coarser intermediate order (e.g. halfway between parent and child) until the viewport tightens; (3) cap to the N cells nearest the view center instead of all-or-nothing. Happy to switch — flag a preference.
Generated by Claude Code
|
This has been a disaster in terms of producing any kind of viewable grid at the cell level. We should remove any attempt at viewing grids and just focus on displaying the shardmap, data granules, and the context background map (i.e., the auto polar projection switching). Have a close look and see where we can tighten up the code on this to have a minimal and fast viewer for the above. (i.e., think about what we can delete from this module). |
espg
left a comment
There was a problem hiding this comment.
🤖 from Claude (review)
Fresh-context adversarial review of the cell-grid removal (HEAD 9a7bfd8). The removal is clean and matches the directive: viewport_cells, _healpix_child_cells/_child_cell_polygon, morton_coverage viewport gen, max_shards/max_cells, the bounds observer / _debounce / cancel_grid_refresh / _refresh_grid, the grid layer in show_shardmap, ShardIndex/shard_index/_INDEX_CACHE, _seam_safe_polygon, and _polygonal are all gone. Kept: shard_outlines, granule_footprints, render_shardmap (no grid), crs.py, show_shardmap (basemap + shards + footprint toggle + LayersControl + auto CRS), _split_antimeridian. __all__ lists in shardmap.py, leaflet.py, and __init__.py are consistent with what remains, imports all resolve, and asyncio (only used by _debounce) was correctly dropped from leaflet.py. Tests: every removed test was grid/viewport-specific; the kept suite still covers shard outlines, granule footprints, render_shardmap, antimeridian split, CRS picking, and the ipyleaflet wrapper — no references to removed symbols remain. Notebook: outputs cleared (all execution_count: None, no outputs), version="007", both AOIs (Antarctic Peninsula + Jakobshavn) and the polar EPSG:3031/3413 demos retained; the verification checklist is correctly down to 3 items (basemap / shards / footprints).
One orphaned doc reference to flag (not blocking):
notebooks/shardmap_viewer.ipynb, cell 0 (intro markdown): the ipyleaflet-wrapper bullet still reads "basemap + shard layer + toggleable footprint layer + zoom-gated grid." The zoom-gated grid no longer exists — this lone phrase contradicts the rest of the notebook (the §5 checklist already dropped the grid item). Trim to "basemap + shard layer + toggleable footprint layer" so the notebook has no dangling reference to the removed behavior.
Nothing else: no leftover machinery, no dead helpers, no broad noqa/type: ignore added, no scope creep. The # noqa: ARG001 that previously sat on _refresh_grid went away with the function. Modules are well under the ~1000-line limit and read as genuinely minimal now.
Generated by Claude Code
|
🤖 from Claude Narrowed the viewer to a minimal, fast surface per the directive: removed all cell-level grid rendering and kept only shard outlines, granule footprints, and the context basemap with auto polar-projection switching. Removed
Kept
Net: Tested
A fresh-context adversarial self-review confirmed the removal is complete and flagged one doc orphan (a "zoom-gated grid" phrase in the notebook intro), folded in the follow-up commit. The pre-existing Generated by Claude Code |
|
...is the example ipython notebook updated too? |
|
🤖 from Claude Yes — What the notebook contains now (18 cells, all code-cell outputs cleared,
No residual staleness. Grepped the notebook for every removed symbol/behavior and found none: It runs anonymously against CMR-STAC (no Earthdata Login) — version Generated by Claude Code |
|
this needs to be updated given that #105 and #106 have now merged ahead ; see #104 (comment) also, nasa cmr has been added to egress whitelist, and earthdata doesn't need creds for metadata (so this should be executable now) |
espg
left a comment
There was a problem hiding this comment.
🤖 from Claude (review)
Adversarial review of fc50ea4 (Binder-alignment for the shardmap viewer notebook), diff range 89131c5..fc50ea4. Docs/notebook/infra only — no code behavior changed, so no test is required. Overall the change is sound and matches the #105 Binder convention; one diff-scoped nit and one out-of-scope note below.
Binder convention fidelity — good. The badge (mybinder.org/badge_logo.svg), the ?urlpath=lab/tree/notebooks/shardmap_viewer.ipynb form, and the italic _Runs end-to-end on Binder ..._ intro line all match the merged notebooks (custom_aggregations / rasterized_zarr / tdigest_reader_example). Badge points to the correct notebook.
Data-anonymity claim — accurate. Every executable cell operates on CMR-STAC granule metadata + footprint geometry (CMRSource().fetch, ShardMap.build on the mortie HEALPix backend, render_shardmap/show_shardmap on saved JSON/parquet). No cell reads granule HDF5 pixels or needs an Earthdata login, so "runs end-to-end on Binder ... no Earthdata Login" is not an overclaim. ipyleaflet (viz extra) is a pure-Python widget, so adding it to .binder/postBuild won't break the other Binder notebooks' import paths, and the environment.yml comment edit (jupyterhub_example / shardmap_viewer read anonymous CMR-STAC metadata) is correct.
Finding (diff-scoped) — stale/contradictory install note. The intro and the ## Install section were updated to the canonical pip install "zagg[analysis,catalog,viz]", but the section-5 markdown cell still reads:
Run in JupyterLab with
pip install "zagg[viz,catalog]"to see the live map.
This zagg[viz,catalog] form is now stale relative to the rest of the notebook (drops analysis, reorders extras) and is exactly the kind of duplicate/contradictory install instruction this commit set out to eliminate. Suggest updating it to pip install "zagg[analysis,catalog,viz]" (or just dropping the per-section reminder, since Install already covers it) for consistency. The notebook source is single-line JSON, so this couldn't be pinned inline.
Note (out-of-scope, pre-existing — not introduced by this commit). The Greenland code cell comments "Imports also available from earlier cells; repeated here for standalone re-runs" but the cell still references Query, CMRSource, ShardMap, and grid defined only in earlier cells — a true standalone re-run would NameError. This predates fc50ea4 and isn't in its diff; flagging only so it's on record, not as a blocker for this PR.
No stale references to removed viewer features (cell-grid / viewport_cells / debounce / ShardIndex / grid-on-zoom) remain in the notebook.
Generated by Claude Code
|
🤖 from Claude Rebased onto current Rebase approach — merge, not rebase (push-safe)This branch had already been pushed, and A welcome side effect of merging current Binder alignment (#105 convention)Notebook now matches the merged Binder notebooks (
How Binder gets
|
Closes #38
A minimal, fast shard-map viewer: a headless GeoJSON render core plus a thin
ipyleafletwrapper. Both inputs (ShardMap JSON and the STAC-geoparquetCatalog) are supported from day one, and the geometry is reused straight off the existinggrid.shard_footprint/signaturesurface — no viewer-specific file type or second tessellation.Scope narrowed (per @espg, 2026-06-23)
The cell-level grid rendering never produced a usable grid and was the source of the kernel hangs/complexity. It has been removed entirely. The viewer now focuses on three things only:
pick_crs→ EPSG:3031/3413/3857, GIBS polar basemaps)Removed surface
viewport_cells, the HEALPix child-cell enumeration (_healpix_child_cells,_child_cell_polygon, viewport-boundedmorton_coverage), themax_shards/max_cellsgrid-on-zoom gate.boundsobserver + debounce machinery inleaflet.py(_debounce,loop.call_later,cancel_grid_refresh,_refresh_grid) and the grid layer inshow_shardmap— all of it existed only to drive the now-deleted grid.STRtree-backedShardIndex/shard_index/_INDEX_CACHEmachinery (only used to speed up the grid refresh; shard outlines build footprints directly)._seam_safe_polygon/_polygonal(only used by the deleted cell-clip path)._split_antimeridianis kept —shard_outlinesandgranule_footprintsstill need it.Net:
src/zagg/viz/shardmap.pyandleaflet.pyare substantially smaller; the module is minimal and fast (no per-pan recompute, no background-thread comm mutation).What this does
Headless render core (
src/zagg/viz/__init__.py,src/zagg/viz/shardmap.py) — pure Python, no browser/ipyleaflet import:shard_outlines(shardmap)— one feature per shard, offgrid.shard_footprint(key); grid rebuilt from the map's owngrid_signatureviagrid_from_signature(HEALPix + rectilinear round-trip exactly).granule_footprints(catalog)— one feature per granule footprint, viaCatalog.granule_records().render_shardmap(...)— assembles{"shards", "granules"}.MultiPolygonby_split_antimeridian(unwrap +360 / clip / rewrap), shapely-only; skipped under a polar CRS (no seam there).ipyleaflet wrapper (
src/zagg/viz/leaflet.py, exposed aszagg.viz.show_shardmap) — context basemap (auto polar projection + GIBS, else OSM/Web Mercator) + shard layer + a toggleable granule-footprint layer + aLayersControl. Allipyleafletimports are lazy/local so importingzagg.vizand the test suite never require the widget stack.CRS picker (
src/zagg/viz/crs.py) — the valued part, unchanged:pick_crsselects EPSG:3031 (Antarctic), EPSG:3413 (Arctic), or EPSG:3857 (mid-latitude/global) from the map extent;crs_infocarries the proj4leaflet definition + GIBS WMTS basemap per polar code.Phases
How tested
uv run pytest tests/test_viz.py -q— 31 passed, 5 skipped (the 5 skips are thepytest.importorskip("ipyleaflet")wrapper tests;ipyleafletisn't installed in this env). Tests for the removed grid/viewport/debounce/ShardIndexcode were deleted; the kept tests covergrid_from_signatureround-trips, shard outlines (count/props/WGS84/JSON), granule footprints, antimeridian splitting, CRS picking, the seam flag, and the wrapper smoke tests.uv run ruff check --select=E,F,W,I --ignore=E501 src tests— clean (matches the PR lint bot).ruff format --checkclean on the files I touched.notebooks/shardmap_viewer.ipynb) updated: grid-on-zoom and zoom-to-shard demo cells removed; the real anonymous CMR-STAC ATL06 queries (version="007", two AOIs) and the polar-projection demo are kept; outputs cleared.Note:
tests/test_integration.py/tests/test_processing.pyhave 3 pre-existing failures (morton/mortie, fixed onmainafter this branch's base) — unrelated to this change, not touched.Questions for review