Skip to content

Non-scalar aggregation outputs (Tier-1 vectors): config declaration + vector cell statistics#40

Merged
espg merged 12 commits into
mainfrom
claude/29-nonscalar-outputs
Jun 17, 2026
Merged

Non-scalar aggregation outputs (Tier-1 vectors): config declaration + vector cell statistics#40
espg merged 12 commits into
mainfrom
claude/29-nonscalar-outputs

Conversation

@espg

@espg espg commented Jun 15, 2026

Copy link
Copy Markdown
Member

Closes #29.

What this does (#29, Tier-1 fixed-width vectors)

Option B output signatures — each aggregation field declares its output kind + trailing shape + dtype. The cell→table→Zarr path now carries non-scalar per-cell payloads end-to-end via the @espg-approved Arrow FixedSizeList<C> carrier (B′): scalar configs are byte-for-byte unchanged; a config with any vector field routes through Arrow and stores the payload on a trailing Zarr dimension.

Per the settled plan in #29 (comment): Option B (kind + trailing_shape + dtype), CSR for ragged deferred to Tier 2, Tier 1 (fixed vector) first, leaf-emit only, Lambda + local scope.

Approach

  • Phase 1config.py: per-field kind/trailing_shape/dtype metadata, validated by _validate_output_kind; get_output_signature(meta) is the single normalized read point.
  • Phase 2calculate_cell_statistics returns a per-cell ndarray of trailing_shape/dtype for vector fields (with the schema-declared fill_value sentinel on empty cells); scalar fields byte-identical. Plus vector-expression enablement and a vector+len/count guard.
  • Phase 3 — the approved Arrow FixedSizeList carrier. When any vector field is present, the cell→table handoff routes through Arrow (reusing sort/hash grouping refactor ( #30 ) #33's _concat_and_group/_group_columns seam): scalars become plain Arrow columns, vectors become FixedSizeList<C>. Pure-scalar configs unchanged.
  • Phase 4OutputGrid.signature() now carries an Option-B output-field set {name, kind, trailing_shape, dtype} per agg variable (via the new config.output_field_signature, JSON-canonical, sorted by name), and nests_with() requires a matching field-kind set so co-aggregated grids must share a scalar/vector schema. Both grids + the protocol docstring updated.
  • Phase 5 — dense vector writer. The Zarr template (grids.base.vector_array_spec, wired into both grids' _spec()) creates a (*spatial_shape, *trailing_shape) array for each vector field, chunking the trailing payload dim whole so the writer's block_idx = chunk_idx + (0,)*len(trailing) holds. The single-trailing-chunk invariant is documented and now enforced at the write site (raises if a target array splits the trailing dim).
  • Phase 6 — reader + a leaf→Zarr→read round-trip test for a real vector field: builds an Arrow carrier with populated + NaN-padded cells, writes via write_dataframe_to_zarr (exercising the trailing-dim set_block_selection), reads back through the trailing-dim selection, asserts populated cells round-trip exactly, empty cells carry the NaN sentinel, and a NaN-aware reducer (np.nanmean) skips the padding. Reads use the existing Zarr open_group API — the trailing payload is a normal axis, so no new reader code is needed.

Phases checklist

  • Phase 1 — config field-kind declaration + validation.
  • Phase 2 — calculate_cell_statistics returns ndarray-valued entries for vector fields (+ vector-expression, vector+len/count guard).
  • Phase 3 — Arrow FixedSizeList carrier through grouping/assembly (approved B′).
  • Phase 4 — OutputGrid.signature(){name, kind, trailing_shape, dtype} + nests_with() field-kind check.
  • Phase 5 — dense vector writer (trailing payload dim), single-trailing-chunk invariant enforced.
  • Phase 6 — reader + leaf→Zarr→read round-trip test; reducers skip padding.
  • Tier 2 (follow-up) — CSR ragged (values/offsets/cell_ids) + t-digest as List<FixedSizeList<2>>.

How tested

  • uv run pytest -v — full suite 316 passed, 1 skipped (the skip is the non-PyPI spherely SpatialIndex fork, expected per CLAUDE.md §7). New coverage: TestOutputFieldSignature / TestVectorTemplate (test_grids.py), rectilinear nests_with field-set test, TestVectorRoundTrip (test_processing.py — the end-to-end write→read round-trip + the trailing-chunk-invariant guard).
  • uv run ruff check src testspasses. The lines added in this run are format-clean under the pinned hook ruff (v0.14.10); pre-existing repo-wide ruff format drift on main (grid/processing files predate this PR) is left untouched per §4, and CI's ruff job is ruff check only (fail_level: none), so format drift does not gate.

Questions for review

  • signature() now includes output_fields — a ShardMap built before this change (no output_fields key) will no longer equality-match grid.signature() and the runner will report a grid mismatch. This is correct-by-design (the shard map must pin the output schema), but existing catalogs need regeneration (python -m zagg.catalog). Flag if you'd rather soften the runner mismatch message to hint at regeneration.
  • t-digest fit (Tier 2): the list-carrier and trailing-dim writer are the deliberate shape for the imminent ragged t-digest as List<FixedSizeList<2>>; that lands as the Tier-2 follow-up (CSR ragged), not here.

Generated by Claude Code

@espg espg added the implement label Jun 15, 2026

@espg espg left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

Fresh-context adversarial review of phase 1 (config/schema declaration + validation for non-scalar outputs, issue #29). Scope is correctly held to declaration/validation only — no runtime/writer/signature work leaked in, which is right for this phase.

Overall: solid and in-scope. Scalar default is backward-compatible (kind defaults to "scalar", scalar fields short-circuit with no new required keys; the test_scalar_default_backward_compatible + test_atl06_validates cases confirm pre-existing configs can't newly raise). The bool guard (isinstance(dim, bool)) is correct given isinstance(True, int) is True, and it's tested. OUTPUT_KINDS defined after validate_config is fine — it's referenced at call time, not import time.

Findings below are minor / polish — none block advancing to the next phase.

1. Docstring overreach — fill/pad_weight sentinel (config.py:225). The _validate_output_kind docstring claims a field may declare "a fill/pad_weight sentinel," but nothing in this phase reads or validates those keys. Either drop the mention or mark it as a later phase, so the docstring doesn't promise validation that isn't there.

2. Test gap — per-element check inside a list (tests/test_config.py, TestOutputKind). _validate_trailing_shape loops over each dim and rejects non-int / bool / < 1 entries, but every "bad element" test exercises the scalar value (trailing_shape="64" hits the type branch; 0/True are single ints). There's no case like trailing_shape=[16, "x"], [16, 0], or [16, True] that reaches the loop body with a valid leading dim — so the in-loop rejection of a bad element within an otherwise-valid sequence is untested. Adding one such case closes the coverage hole.

3. np.dtype catches only TypeError (config.py:251). Verified against numpy 2.4.6 and the lambda-pinned-equivalent path: bad dtype strings/objects all raise TypeError today, so this is correct in practice. Still, np.dtype has historically raised ValueError for some malformed inputs; broadening to except (TypeError, ValueError) is a cheap hedge against a numpy-version drift turning a config typo into an uncaught crash. Low priority — note, not a blocker.

4. Minor — OUTPUT_KINDS interpolated raw (config.py:244). f"(allowed: {OUTPUT_KINDS}; ...)" renders the literal tuple repr ('scalar', 'vector') in the user-facing error. Cosmetic; a joined form reads slightly cleaner.

No correctness bugs found in the validation logic itself. Acceptance criteria for phase 1 (declare vector kind + trailing_shape + dtype; scalar unchanged; ragged recognized-but-rejected) are met and tested.


Generated by Claude Code

@espg

espg commented Jun 15, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude

Phase 1 status: landed and green. Config field-kind declaration (Option B: kind + trailing_shape + dtype, scalar default, vector Tier-1, ragged recognized-but-rejected) is in src/zagg/config.py with validation + a normalized get_output_signature accessor, plus tests/test_config.py coverage. Local: full suite 227 passed, 1 skipped; ruff-check / ruff-format / codespell clean; mypy unchanged from the main baseline (12 pre-existing, none in the touched code).

Ran the fresh-context adversarial self-review (the COMMENT review above). Folded its diff-scoped findings in commit 77f4838:

  • Trimmed the _validate_output_kind docstring that referenced a fill/pad_weight sentinel not validated this phase (it's a later-phase concern).
  • Added tests for the per-element trailing_shape list checks — bad element type [16, "x"], zero [16, 0], and bool [16, True] — which were previously only exercised on scalar inputs.
  • Defensively widened the dtype validation catch to (TypeError, ValueError) (numpy raises TypeError here today, but the catch is version-robust). I did not add a test asserting the ValueError branch since this numpy can't trigger it — fabricating that coverage would be misleading.
  • Made the unknown-kind error message render allowed: scalar, vector instead of a raw tuple repr.

Ordering: keeping this implement (continuing next run), not waiting/blocked — phase 1 is independent and complete. Phases 2–3 (calculate_cell_statistics ndarray return + process_shard container) build on the cell→table handoff that PR #33 reworks, so per #29 comment I'll rebase this PR after #33 merges before advancing those phases / final merge. Test vehicle for the first real vector field will be the ATL03 fixed-width elevation bins (64/128 @ 1–2 m).


Generated by Claude Code

@espg espg changed the title Non-scalar aggregation outputs: config field-kind declaration (phase 1) Non-scalar aggregation outputs (Tier-1 vectors): config declaration + vector cell statistics Jun 16, 2026
@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude

#33 merged → this PR is unblocked. Status of this run:

  • Merged main into the branch (merge commit 249fb19, not a rebase — repo conventions forbid force-push). sort/hash grouping refactor ( #30 ) #33's processing.py rework (calculate_cell_statistics now takes dict[str, np.ndarray]; _group_columns/_concat_and_group; the experimental arrow-kernel path; numpy-NaN contract) came across with no conflicts. Suite was green on the merge before any new work (285 passed, 1 skipped).
  • Phase 2 landed (commit phase 2 of issue #29): calculate_cell_statistics emits a per-cell ndarray of trailing_shape/declared dtype for vector fields, with the schema-declared fill_value sentinel on empty cells; scalar fields stay byte-for-byte identical. New TestVectorOutputs (6 tests). Folded one self-review docstring fix into the commit.
  • Tests: uv run pytest291 passed, 1 skipped (skip = the non-PyPI spherely fork, expected per CLAUDE.md §7). ruff check src tests passes; the two edited files are format-clean under the pinned hook ruff (v0.14.10). Repo-wide ruff format drift under newer ruff (0.15.x) is pre-existing on clean main — left untouched per §4.
  • Phase 3 NOT started — design fork, needs your call. Building the per-cell N×C vector container in process_shard can't be assembled/tested end-to-end without deciding the carrier (the current pd.DataFrame column can't hold a 2-D block) and the phase-4/5 Zarr template + writer. I laid out concrete options (A)/(B)/(C) in the updated PR body "Questions for review" — my lean is (B) (separate vector dict alongside df_out; scalars stay byte-identical), but it changes process_shard's return contract, so I stopped here rather than guess.

Next step: your pick of A/B/C for the phase-3 container, then I'll implement it + phases 4–6. Also a quick confirm that fill_value is the sentinel key you want (I reused it rather than adding a new key — answers the phase-1 open question). PR stays a draft.


Generated by Claude Code

@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude (review)

Fresh adversarial pass over the phase-2 diff (src/zagg/processing.py, tests/test_processing.py) vs CLAUDE.md + #29 acceptance criteria.

Verified good

  • Scalar byte-for-byte parity: the empty-cell _empty_cell_value returns exactly 0 if function in (len,count) else np.nan for scalar fields (identical to the old inline expr), and the populated scalar path keeps float(out). The len/count short-circuit still returns n_obs before any coercion. test_scalar_fields_unchanged_alongside_vector guards this.
  • Sentinel reuse: _field_sentinel matches the fill_value handling already in process_shard/_kernel_aggregate, so empty vector cells and the (future) seeded arrays agree.
  • _coerce_field_value fails loudly on a width mismatch rather than silently truncating/padding — correct for Tier-1 fixed-width.

Standing finding (not fixed — out of phase-2 scope; flagging for @espg)

  • A field declaring kind: vector with function: len/count would hit the func_name in ("len","count") short-circuit and return a scalar n_obs, bypassing _coerce_field_value — so a vector-count field silently emits a scalar instead of a trailing_shape array (or erroring). Phase-1 _validate_output_kind doesn't block that combination. It's a nonsensical config (a count isn't a vector), so I left it rather than widen scope, but if you want it caught, the cleanest spot is a validation rule in _validate_output_kind (reject vector + len/count). Happy to fold into phase 3 if you agree.

No other defects. Tests cover the shape/dtype/sentinel/wrong-width/expression-rejected paths.


Generated by Claude Code

@espg espg added the waiting label Jun 16, 2026
@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

Vector expression fields: deferred — a kind: vector field with expression is rejected (evaluate_expression returns a Python float; a non-scalar eval contract is out of Tier-1 scope). Flag if you want it in scope.

This seems trivial to allow, we should do it now while we're thinking about it

process_shard carries vector containers through grouping/assembly. ⚠️ Blocked on an @espg decision — the per-cell N×C container can't be assembled/tested end-to-end without committing to a carrier (the current pd.DataFrame column can't hold a 2-D block) and the phase-4/5 Zarr template + writer. See Questions for review.

Are we sure pd.DataFrame column can't hold a 2-D block? jbbutler does exactly this is in this notebook which encodes 2-D arrays of binary masks. I believe the code that modifies pandas to do this is in the same repo...

(A) Allocate stats_arrays[name] as (n_cells, *trailing_shape) and keep the pd.DataFrame carrier — but a DataFrame column can't hold a 2-D block, so vector columns would have to ride as an object/list column and write_dataframe_to_zarr would need a vector-aware reshape. Couples phase 3 tightly to the writer.

Don't like this, lists of objects in python have the wrong smell.

(B) Split the assembly seam: build vector stats into a separate dict[name -> (n_cells, *trailing_shape) ndarray] returned alongside df_out (scalars stay in the DataFrame, vectors bypass it), and let the phase-5 dense writer consume that dict directly. Keeps df_out/write_dataframe_to_zarr untouched for scalars.

Possible option, better than (A)... but if we're going to return something alongside df_out, is pyarrow an escape hatch here? We have arrow already wired in as an alternative option to pandas -- can arrow deal with 2D arrays, and can we just default to arrow for that case?

Keep in mind that we'll be doing t-digest as a vector output almost immediately as a follow on, so if in one of these options is a better fit for that, I'd like that flagged.

@espg espg removed the waiting label Jun 16, 2026
@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude

Answering your phase-3 carrier questions. Net: I now lean toward an Arrow FixedSizeList/List carrier (a refinement of Option B), and I'll correct my own "a DataFrame can't hold a 2-D block" claim — that was imprecise.

On "can pandas hold a 2-D block?" — yes, you're right, and jbbutler's approach is the reason. pandas' internal BlockManager is genuinely n-dimensional, and you can surface "one column = one per-cell vector" cleanly via a pandas ExtensionArray (a custom dtype) — that's almost certainly what the antarctic_AR_dataset code does to store 2-D mask blocks, and it's not the same as the object/list-column hack. So my Option A was conflating two things:

On "is pyarrow the escape hatch / can Arrow do 2-D / default to Arrow for that case?" — yes to all three, and this is my recommendation. Arrow carries non-scalar per-cell payloads natively:

  • Tier-1 fixed vectorFixedSizeList<C> column (and FixedSizeList<FixedSizeList<2>> or a 2-D fixed for the t-digest (k,2) shape).
  • Tier-2 raggedList/LargeList column — the same machinery, no third rewrite.

#30 already landed the Arrow handoff path (handoff="arrow", _concat_and_group), so this isn't a new dependency or a new seam — it's reusing the one we built. That's the deciding factor.

So my updated recommendation is B′ = Option B carried in Arrow, not a numpy dict:

  • When a config has any non-scalar field, route the cell→table handoff through Arrow end-to-end: scalars become plain Arrow columns (byte-identical results), vectors become FixedSizeList, ragged becomes List. Pure-scalar configs are completely unchanged (default path, byte-for-byte).
  • write_dataframe_to_zarr gets an Arrow-aware sibling that maps FixedSizeList → a trailing Zarr dimension (Tier-1) and List → CSR values/offsets/cell_ids (Tier-2). The len == prod(chunk_shape) invariant stays for scalar/vector and is relaxed only for ragged.

Why this is the best t-digest fit (your flag): t-digest is (means, weights) and, done properly, is ragged (centroid count varies per cell). Of the options, only Arrow List/LargeList carries that without a fresh design — A′ (fixed ExtensionArray) and plain-B (fixed numpy dict) both force a pad-to-worst-case or a parallel ragged structure when t-digest lands. Picking Arrow now means Tier-1 vectors and the imminent Tier-2 t-digest share one carrier and one writer.

Two decisions I'd like from you before I build phase 3:

  1. Confirm B′/Arrow (vs A′ pandas-ExtensionArray, if you'd rather keep everything in pandas). My lean is B′.
  2. For the t-digest (k,2) shape specifically: store as nested FixedSizeList<2> inside the per-cell List (ragged centroid count), i.e. List<FixedSizeList<2>>? That's the layout that makes combine() at merge-time clean.

Meanwhile, the vector-expression enablement you asked for ("do it now") and the vector+len/count guard are landing this run as a separate scoped commit on this branch — they don't depend on the carrier decision. PR stays implement (continuing); phase 3 waits on your pick above.


Generated by Claude Code

@espg espg left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

Adversarial self-review of commit 3fda3c4 (vector expression fields, issue #29). Reviewed against /CLAUDE.md and the issue criteria. Scope limited to this commit; the Phase-3 per-cell container fork is intentionally not requested here.

Verdict: solid, ships as-is. The change is small, well-factored, and correct on the points that matter:

  • Scalar evaluate_expression contract is preserved byte-identically. The wrapper now calls float(eval_expression_raw(...)); eval_expression_raw reuses the same ns (same __builtins__={}, same np/numpy/abs/len/min/max/sum set, **columns), so the scalar path is unchanged. Security namespace is unchanged — no new exec surface.
  • Vector-expression path routes through the same coercion as the vector function path (eval_expression_raw_coerce_field_valuetrailing_shape/dtype check), so shape/dtype enforcement and the wrong-width ValueError are shared, not duplicated.
  • The len/count guard is sound and correctly scoped. It lives in the kind == "vector" branch and keys on meta.get("function"). Because function and expression are validated mutually-exclusive earlier (config.py ~L176), the guard can never collide with an expression field, and it matches the actual short-circuit in calculate_cell_statistics (the func_name in ("len","count") branch only runs when expression is falsy). Tests cover both len and count.
  • Tests added on both sides (config validation + processing behavior). I ran TestVectorOutputs + TestOutputKind locally: 28 passed.

Findings below are minor — one real test-coverage gap, the rest are nits. None block the next phase.

  1. [minor coverage gap] The two empty-cell expression tests don't exercise the new expression code. calculate_cell_statistics returns early via _empty_cell_value when n_obs == 0, so eval_expression_raw is never called for an empty cell — the test would pass even if the expression were np.min on an empty array (which would otherwise warn/raise). See inline note. These tests are correct, just mislabeled as exercising the expression path; consider an assertion or comment clarifying they hit _empty_cell_value.
  2. [nit] eval_expression_raw is exposed at module level (no leading underscore) and imported into processing.py. The sibling _coerce_field_value/_empty_cell_value/_field_sentinel are private. Since the only consumer is calculate_cell_statistics and the public expression entry point remains evaluate_expression, _eval_expression_raw would better signal intent — unless you intend it as public API. Worth a deliberate decision.
  3. [nit] eval_expression_raw has no return annotation; its docstring documents Any but the signature is bare. -> Any (with the import) would match the documented contract and the repo's typed style (mypy runs in CI). Optional.

No correctness or security defects found within scope.


Generated by Claude Code

@espg espg left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

Inline notes for commit 3fda3c4 (vector expression fields, issue #29). See the summary review for the full verdict — this change is correct and ships as-is; the items below are one minor coverage gap plus nits, none blocking.


Generated by Claude Code

Comment thread tests/test_processing.py
assert edges.dtype == np.dtype("float32")
np.testing.assert_array_equal(edges, [1.0, 5.0])

def test_vector_expression_empty_cell_gets_sentinel(self):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

[diff-scoped, minor] This test (and test_vector_expression_empty_cell_numeric_sentinel below) doesn't actually exercise the new vector-expression code. With {"h": np.array([])}, n_obs == 0, so calculate_cell_statistics returns early via _empty_cell_value before the if expression: branch — eval_expression_raw is never called. I confirmed locally that the expression here (np.array([np.min(h), np.max(h)]), which would raise/warn on an empty array) is never evaluated. The test is correct in what it asserts (empty cell → sentinel-filled trailing_shape), but the docstring "the same shape filled with the fill_value sentinel" implies the expression path; it's really covering _empty_cell_value, which is the same code the vector-function empty-cell tests already cover. Either note that explicitly, or to genuinely cover the expression path use an expression that's safe on empty input and assert it isn't what's returned.


Generated by Claude Code

Comment thread src/zagg/config.py Outdated

def evaluate_expression(expression: str, columns: dict[str, np.ndarray]) -> float:
"""Evaluate an expression string in a restricted namespace.
def eval_expression_raw(expression: str, columns: dict[str, np.ndarray]):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

[diff-scoped, nit] Two small things on this new function:

  1. It's exposed module-level (no leading underscore) and imported by processing.py, while its siblings _coerce_field_value/_empty_cell_value/_field_sentinel are private and the public expression entry point stays evaluate_expression. If this isn't intended as public API, _eval_expression_raw would better match the surrounding convention (CLAUDE.md §4 "match the surrounding code"). A deliberate call either way.
  2. No return annotation — the docstring promises Any but the signature is bare. Since mypy runs in CI, adding -> Any (with the from typing import Any import) would make the signature match the documented contract. Optional.

Generated by Claude Code

Comment thread src/zagg/config.py
# ``len``/``count`` short-circuit to a scalar obs count in
# ``calculate_cell_statistics``; pairing them with kind 'vector' would
# silently emit a scalar, so reject the nonsensical combination.
if meta.get("function") in ("len", "count"):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

[diff-scoped, confirmation — no change requested] Confirming this guard is reachable and matches runtime: function/expression are validated mutually-exclusive earlier (~L176), so this only fires for function-driven vector fields, never for expression fields — and it mirrors the func_name in ("len","count") short-circuit in calculate_cell_statistics, which only runs when expression is falsy. So the guard and the runtime short-circuit agree, and there's no expression-vs-len ambiguity. Good. Leaving this as a note, not a request.


Generated by Claude Code

@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude

Landed the vector-expression enablement you asked for ("do it now while we're thinking about it") plus the standing vector+len/count guard — both independent of the phase-3 carrier decision.

Pushed (claude/29-nonscalar-outputs):

  • vector expression fields for issue #29 — a kind: vector field with expression now routes its result through the same _coerce_field_value/trailing_shape/dtype path as a vector function field (with the fill_value sentinel on empty cells); scalar expressions still cast to float. _validate_output_kind now (a) accepts vector+expression and (b) rejects the nonsensical vector+len/count combo that previously short-circuited to a silent scalar (the self-review finding from this comment).
  • fold review: privatize _eval_expression_raw + clarify empty-cell tests (#29) — folded the fresh-context self-review's two diff-scoped findings: made eval_expression_raw_eval_expression_raw private (matching the _coerce_field_value/_empty_cell_value siblings) with an -> Any return annotation (mypy in CI), and clarified the empty-cell expression tests' docstrings — they verify the empty-cell short-circuit (it never evaluates the expression; since the test expression is np.array([np.min(h), np.max(h)]) and np.min([]) raises, the tests passing proves eval is skipped on empty cells).

Tests: uv run pytest green (full suite 291+ passed, 1 pre-existing spherely skip); the touched files pass uv run ruff check src tests. Scalar paths byte-identical.

Phase 3 still awaits your call. I posted my recommendation on the carrier in the comment above: Arrow FixedSizeList/List (a refinement of Option B that reuses #30's landed Arrow handoff and is the natural fit for the imminent ragged t-digest) — with two confirms requested (B′/Arrow vs A′ pandas-ExtensionArray; and List<FixedSizeList<2>> for the t-digest (k,2) shape). Marking waiting on that pick. Draft only.


Generated by Claude Code

@espg espg added the waiting label Jun 16, 2026
@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

Arrow FixedSizeList/List (a refinement of Option B that reuses #30 landed Arrow handoff and is the natural fit for the imminent ragged t-digest)

approved

List<FixedSizeList<2>> for the t-digest (k,2) shape)

also approved

@espg espg removed the waiting label Jun 16, 2026
@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude

Phase 3 landed (4d5a844, "phase 3 of issue #29") — the approved Arrow FixedSizeList carrier (B′). When a config declares any vector field, the cell→table handoff routes through Arrow end-to-end (reusing #30's landed _concat_and_group/_group_columns seam): scalar fields become plain Arrow columns, vector fields become FixedSizeList<C> columns (C = prod(trailing_shape)). Pure-scalar configs stay on the default path, byte-for-byte unchanged. Touches src/zagg/processing.py (+127/−20), src/zagg/runner.py (+7/−6), tests/test_processing.py (+128). The list-carrier shape is the deliberate fit for the imminent ragged t-digest as List<FixedSizeList<2>>.

CI is green on the head commit: test (3.12) ✅, test (3.13) ✅, ruff ✅, build-x86_64/build-arm64 ✅.

Process note: the run that pushed phase 3 hit a transient API 500 right after the push, before posting this status or running the phase-3 self-review — so a fresh-context adversarial review of 4d5a844 is being posted now as a follow-up, and I'll fold any diff-scoped findings next run.

Remaining (continuing next run, no @espg decision pending — the carrier is settled):

  • Phase 4 — extend OutputGrid.signature() to {name, kind, trailing_shape, dtype} + nests_with() field-kind check.
  • Phase 5 — dense vector writer (FixedSizeList → trailing Zarr dim).
  • Phase 6 — reader + leaf→Zarr→read round-trip test for a vector field; reducers skip padding.

PR stays implement (draft, continuing).


Generated by Claude Code

@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude (review)

Fresh-context adversarial review of commit 4d5a844 ("phase 3 of issue #29", the Arrow FixedSizeList carrier) — posted manually because the review subagent hit a transient API 500 before it could submit. Scope: processing.py, runner.py, tests/test_processing.py for this commit only, against CLAUDE.md + #29/#30 criteria. Event: COMMENT (not approve/request-changes).

Verdict: correct and in-scope; scalar byte-identity is properly guaranteed and tested. _has_vector_fields gates use_arrow, so a pure-scalar config takes the unchanged _build_output(use_arrow=False) pandas assembly verbatim; test_scalar_columns_byte_identical_with_and_without_vector runs the same canned input through scalar-only and scalar+vector configs and asserts the shared count/h_min columns are equal — directly pinning the #29/#30 hard criterion. The FixedSizeList<C> build (_arrow_column) and the (n_cells, C) recovery (_iter_carrier_columns) round-trip cleanly (test_arrow_column_roundtrips_through_iter), and the per-cell vector values are verified against a real np.bincount aggregation. Keeping the vector path a list-carrier is the right call for the approved ragged List<FixedSizeList<2>> t-digest. Reuses #30's _concat_and_group seam rather than forking it.

Findings (none block the merge of phase 3, but they shape phases 4–6):

  1. [should-fix in phase 5/6 — writer path landed but not end-to-end tested] write_dataframe_to_zarr already grew the vector branch (values.reshape((*chunk_shape, *trailing)), block_idx = chunk_idx + (0,)*len(trailing), set_block_selection), but no test writes a vector column to an actual Zarr array — the template that creates a trailing-payload array is phase-5 work that hasn't landed, so a real vector config would currently fail in write_dataframe_to_zarr (the template arrays are still per-cell scalar). The carrier (process_shard → Arrow table) is fully tested; the write half is effectively untested plumbing until the phase-5 template + phase-6 round-trip arrive. Recommend the phase-6 leaf→Zarr→read test explicitly exercise the block_idx/trailing-dim selection.

  2. [nit — confirm in phase 5] block_idx = chunk_idx + (0,)*len(trailing) assumes the trailing payload dimension is a single Zarr chunk (block 0). That's the natural template design, but it's an unstated invariant — document it on the writer and enforce it when the phase-5 template creates the (*shape, *trailing_shape) array (chunk the trailing dim whole).

  3. [minor] Lazy import pyarrow in _build_output/_arrow_column raises a bare ImportError for a vector config if pyarrow isn't installed. If pyarrow is an optional extra (not core), a guarded message ("non-scalar outputs require pyarrow — pip install zagg[...]") would beat a raw import traceback. Confirm pyarrow's core-vs-extra status and add the friendly error if it's an extra.

CI is green (test 3.12/3.13, ruff, builds). Findings 1–3 fold naturally into phases 4–6 (continuing next run), so they don't gate landing the carrier.


Generated by Claude Code

@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

...why did you stop implementing, aren't your instructions to continue until your require instructions / approval?

@espg espg left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

Phase-4 review (range 4d5a844..93ee562), scoped to the output-field-signature work for issue #29.

Overall: clean and on-target. output_field_signature is canonical and deterministic — sorted by name, trailing_shape rendered as a list, ()[], and the whole structure round-trips through json.dumps/loads unchanged (verified locally against default_config("atl06")). The nests_with field-set check is symmetric (both grids compute the same canonical list and compare for equality), and get_output_signature already normalizes dtype to None when unset, so the None-vs-explicit-dtype case is consistent across grids. Docstrings updated on both grids + the base Protocol. Nice.

Issues / things to weigh:

  1. Coverage gap — rectilinear nests_with field-set branch is untested (real gap, non-blocking). The new branch in src/zagg/grids/rectilinear.py nests_with is the only genuinely-new behavioral path for that family — the Healpix tests in tests/test_grids.py don't exercise it, and the cross-family Healpix↔rectilinear case short-circuits on isinstance before reaching the field check, so it doesn't cover it either. tests/test_rectilinear.py already has the _grid(cfg, ...) helper and same-CRS nesting tests (~lines 276–296); add one case where two same-CRS, whole-ratio, origin-aligned rectilinear grids differ only in their output-field set and assert not a.nests_with(b) (+ symmetric). Without it, a regression that drops the rectilinear check stays green. Per CLAUDE.md §4 ("every behavioral change needs tests").

  2. Backward-compat consequence of adding output_fields to signature() (flag for @espg, likely acceptable-by-design). _check_signature (src/zagg/runner.py ~268) does exact expected != actual on grid.signature(), and grid_signature is persisted verbatim in the ShardMap JSON (src/zagg/catalog/shardmap.py to_json/from_dict). Any ShardMap built before this commit has no output_fields key, so it now mismatches and raises "ShardMap was built for a different grid than this run config." That's arguably correct — the gate exists to refuse schema drift, and signature() is already a hard equality check — but it silently invalidates pre-existing catalogs with a message that says "different grid" rather than "stale/regenerate." Worth a one-line note under "Questions for review," and consider whether the mismatch message should hint at regeneration (the loader already documents python -m zagg.catalog for the pre-Phase-5 case). No code change required if @espg treats this as a schema-version bump.

Nits (non-blocking):

  1. output_field_signature docstring has a double space: () for scalar (src/zagg/config.py ~441). Trivial.

  2. test_signature_is_json_serializable (tests/test_grids.py ~286) asserts json.loads(json.dumps(sig)) == json.loads(json.dumps(sig)) — comparing two serializations of the same object, which is trivially true and doesn't prove round-trip fidelity. To test what the docstring claims, assert json.loads(json.dumps(sig["output_fields"])) == sig["output_fields"] (deserialized == original), which would catch a stray tuple / np-scalar leaking in. The current form is a tautology.

None of these block advancing; (1) is the one I'd fold into the next commit.


Generated by Claude Code

@espg espg left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

Phase-5 review of 93ee562..9c736f9 (the dense vector writer template). Scope limited to this diff. Overall the implementation is correct and the tests pass locally — pytest tests/test_grids.py::TestVectorTemplate tests/test_rectilinear.py::TestNesting::test_differing_output_field_set_rejected → 4 passed. I verified field preservation, multi-dim trailing, and scalar non-regression by running _spec() directly. Findings below; none blocking.

Verified correct

  • vector_array_spec produces (*spatial, *trailing) shape with the trailing dim as one whole chunk: rectilinear (64,64)(64,64,4) chunk grid; healpix matches (4**(8-6), 4). Multi-dim trailing_shape=[3,2] → shape (...,3,2), chunk (64,64,3,2). Good — this satisfies the writer's block_idx = chunk_idx + (0,)*len(trailing) invariant since the trailing block index is always 0.
  • The type(base)(**{**base.model_dump(), ...}) reconstruction does not lose or corrupt fields. I confirmed empirically that codecs, chunk_key_encoding, fill_value, dtype, and storage_transformers on the vector array are identical to the scalar count array (m.codecs == sc.codecs, m.chunk_key_encoding == sc.chunk_key_encoding, both True; fill_value=0/dtype=int64 preserved). pydantic_zarr's ArraySpec.model_dump recursively dumps nested NamedConfigs to dicts and re-validates them on construction, so this round-trips cleanly. The inline comment about setting shape+dimension_names together (vs chained with_*) is accurate — validate_dimension_names (v3.py:222) is a mode="after" validator that would reject a transient rank mismatch.
  • Scalar fields are untouched: the early if not trailing: return base means a pure-scalar config yields _spec() members with unchanged ('cells',) / ('y','x') dim_names and no extra axes. No scalar regression.

Real issue (low severity) — invariant is documented, not enforced (src/zagg/grids/base.py:40-45, src/zagg/processing.py:273-279): The single-trailing-chunk contract is the load-bearing coupling between the template and the writer, but nothing guards it. If a future edit chunks the trailing dim (e.g. chunk_shape=(*base_chunk_shape, *some_smaller)), the writer's block_idx = chunk_idx + (0,)*len(trailing) would silently write only block 0 of the trailing axis and drop the rest — no error, corrupt output. The two doc blocks cross-reference each other well, but consider a cheap runtime assertion at the write site (e.g. assert the target array's trailing chunk sizes equal values.shape[1:]) or in vector_array_spec itself, so the invariant fails loud rather than producing partial data. Flagging for @espg; out of scope to add silently.

Nit (non-blocking) — trailing dimension_names derived from base_dims[-1] (base.py:72): dim_names = (*base_dims, *(f"{base_dims[-1]}_v{i}" ...)). On rectilinear this yields ('y','x','x_v0') — the payload axis is named after the x spatial axis, which is semantically misleading (the trailing axis is not an x-axis). It's unique and harmless, but a field-derived or fixed prefix (e.g. hist_v0, or just v0) would read better. Uniqueness is currently guaranteed only incidentally (each field is a separate array, so cross-field collision isn't possible; within a field the _v{i} suffix disambiguates). Minor.

Nit (non-blocking) — test coverage gaps: TestVectorTemplate covers the happy path well (trailing shape + whole-chunk + dtype for both grids, dim_names extension for healpix). Two small gaps worth a follow-up, not a blocker: (1) no test asserts codecs/fill_value/chunk_key_encoding survive the model_dump reconstruction — I verified it manually but a regression here would be silent (the dtype assert wouldn't catch a dropped codec); (2) no multi-dim trailing_shape template test (e.g. (k,2)) — the writer's (0,)*len(trailing) and the chunk grid both depend on len(trailing)>1 working, and only the single-axis case is exercised here.

Line-length note: base.py:74 (chunk_grid = NamedConfig(...)) looks near/over 100 chars — confirm ruff check src is clean (E501 is ignored in this repo per CLAUDE.md §7, so it won't fail CI, but worth a glance).

Net: correct and well-documented; the only thing I'd push on before merge is making the single-trailing-chunk invariant fail-loud rather than relying on a docstring, since a silent break there corrupts data.


Generated by Claude Code

@espg espg left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 from Claude (review)

Phase-6 review — scope is the 9c736f9..1511853 diff only (the new TestVectorRoundTrip in tests/test_processing.py). Pre-existing pinned-ruff format drift in this file is on main and out of scope.

Verdict: solid, meaningful round-trip. The block-index math, trailing-dim selection, and NaN-skip assertion all check out against the source. Findings below are mostly nits; one is a substantive coverage gap worth a decision.

What the test gets right (verified against source)

  • Block-index/trailing-dim math is correct. block_index for fullsphere returns the parent's HEALPix nested id (healpix.py:160-164), and set_block_selection(block_idx, ...) writes block b → rows [b*n, (b+1)*n). The test's block_start = chunk_idx[0] * n matches that, and mirrors TestWriteDataframeToZarr (test_processing.py:51). Good.
  • Shape assertion uses the constant, not a magic 12. group["edges"].shape == (HEALPIX_BASE_CELLS * 4**child_order, 2) lines up exactly with vector_array_spec extending the spatial shape (healpix.py:98, base.py:71) by the whole trailing dim. The new HEALPIX_BASE_CELLS import is the right call.
  • The padding/NaN-skip assertion is genuinely meaningful. np.nanmean(got, axis=0) over a (n, 2) block with only rows 0 and 3 populated and rows 1,2 NaN-padded, asserted against [(1-2)/2, (9+4)/2], actually proves the reducer skips the padding sentinel — it isn't a tautology, since a non-NaN-aware mean would give a different (NaN) answer. assert_array_equal on the populated rows (not allclose) is the right strictness for an exact round-trip.
  • column_names[:2] == ["count","edges"] holds — get_data_vars preserves dict-insertion order and coords are appended after (_build_output, processing.py:209-214).

Real issue (non-blocking, but the standing #1 finding asked for e2e)

  1. The "writer was untested e2e" gap is only partly closed — the leaf→stats half is still bypassed. The test fabricates stats dicts by hand and feeds them straight to _build_output; it never runs calculate_cell_statistics/the expression path that produces a real vector block. So the path under test is _build_output → _arrow_column (FixedSizeList) → write_dataframe_to_zarr → set_block_selection → Zarr read, which is the writer→Zarr→reader seam (the literal phase-6 ask) — but the edges config carries expression: "np.array([np.min(h), np.max(h)])" and kind: vector, and nothing in this test ever evaluates that expression. It's dead config that reads like it's exercised. Either (a) drive one cell through the real statistic eval so the expression→vector block is covered too, or (b) drop the unused expression/source keys and add a one-line comment that the stats are synthetic. As written a reviewer could reasonably believe the expression path is tested when it isn't.

Reader-requirement assessment (the explicit question)

  1. Using the existing open_group/Zarr API with no new reader code is the right call for #29 phase 6 — agreed, no helper needed. The trailing dim is a normal Zarr axis; group["edges"][block_start:block_start+n] is exactly how a consumer reads it, and the test demonstrates the full contract (selection + sentinel + reducer). Adding a bespoke reader helper would be speculative abstraction (CLAUDE.md §4) with no caller. The one thing I'd want before calling the "reader" box checked: a sentence in the PR body stating explicitly that "reader = the stock Zarr trailing-dim selection, demonstrated by this test" so the decision is on record rather than implicit.

Nits (non-blocking)

  1. Coverage is single-trailing-dim + HEALPix only. The whole point of the block_idx = chunk_idx + (0,)*len(trailing) generalization (processing.py:279) and the vector_0/vector_1 naming (base.py:74-77) is multi-dim trailing shapes (the t-digest (k,2) case named throughout the docstrings). A trailing_shape: (k, 2) case would exercise (0,)*len(trailing) with len>1 — currently len(trailing) is always 1 here, so the multi-dim block-index arithmetic is asserted by no test. Similarly the rectilinear grid's block_index (rectilinear.py:270, a 2-tuple) never sees a vector write. Both are reasonable to defer, but worth a one-line "Questions for review" note on whether phase 6 should cover at least the (k,2) case given it's the motivating future shape.
  2. local import placement. from zagg.config import PipelineConfig sits inside _vector_cfg (+30) while from mortie import geo2mort is inside the test method — fine functionally, but PipelineConfig is already importable at module scope and the surrounding TestNumpyNanAware... test (line 123) also does the in-body import, so this matches local style. No change needed; flagging only for consistency awareness.
  3. parent/children naming vs n. n = len(children) is immediately re-asserted to equal 4**(child_order-parent_order) (+ assert n == ...). The assert is harmless but redundant with the comment on the same line; it documents the invariant, so I'd keep it — minor.

Nothing here blocks advancing. Finding #1 is the one I'd most want addressed (or explicitly waved off in the PR body) since it's the residual of the original e2e-coverage finding; #3 is the strongest candidate for an @espg scope decision.


Generated by Claude Code

@espg

espg commented Jun 16, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude

Continued past the carrier decision (it's settled — approved) and landed phases 4–6, completing the #29 Tier-1 vector path end-to-end.

Pushed (claude/29-nonscalar-outputs):

  • phase 4 of issue #29 (93ee562) — OutputGrid.signature() carries an Option-B output-field set {name, kind, trailing_shape, dtype} per agg variable (new config.output_field_signature, JSON-canonical, sorted by name), and nests_with() requires a matching field-kind set (both grids + protocol docstring). Folded the phase-4 self-review nits into phase 5 (docstring typo, JSON-round-trip test fix, and the missing rectilinear nests_with field-set test).
  • phase 5 of issue #29 (9c736f9) — dense vector writer. New grids.base.vector_array_spec (wired into both grids' _spec()) creates the (*spatial, *trailing) template array and chunks the trailing payload dim whole, so the writer's block_idx = chunk_idx + (0,)*len(trailing) holds.
  • phase 6 of issue #29 (1511853) — the standing self-review finding Lambda function based orchestration for horizontal scaling of aggregations #1: a real leaf→Zarr→read round-trip test (TestVectorRoundTrip) that builds an Arrow FixedSizeList carrier (populated + NaN-padded cells), writes it (exercising the trailing-dim set_block_selection), reads it back, and asserts a np.nanmean reducer skips the padding.
  • fold review: enforce single-trailing-chunk invariant + name payload axis (#29) (91cce63) — phase-5 self-review fold: the single-trailing-chunk invariant is now enforced at the write site (raises if a target array splits the trailing dim, with a test) rather than only documented; trailing payload axes are named vector / vector_0,1 instead of <spatial>_v0.
  • fold review: clarify round-trip test isolates the write/read path (#29) (ea17405) — phase-6 self-review fold (finding Lambda function based orchestration for horizontal scaling of aggregations #1): a comment makes explicit the round-trip test deliberately fabricates the carrier to isolate write→read (stat-eval is covered by TestVectorOutputs).

Self-review: ran a fresh-context adversarial pass after each phase (COMMENT-event reviews on 93ee562/9c736f9/1511853). All diff-scoped findings folded; the remaining items were the two "Questions for review" below (informational, not blocking).

Tests: uv run pytest316 passed, 1 skipped (the skip is the non-PyPI spherely fork, expected per §7). uv run ruff check src tests passes; lines added this run are format-clean under the pinned hook ruff (v0.14.10). (Note: CI didn't report check runs against the head here — I gated on the local suite + ruff.)

Two things for your eyes (non-blocking):

  1. signature() now includes output_fields → any pre-existing ShardMap (built before this change) will no longer equality-match grid.signature() and the runner reports a grid mismatch. Correct-by-design (the shard map must pin the output schema), but existing catalogs need regeneration. Want me to soften the runner mismatch message to hint at regeneration?
  2. No new reader code — reads go through the existing Zarr open_group API (the trailing payload is a normal axis), so a dedicated reader would be speculative. The round-trip test exercises it directly.

The PR body checklist is updated (phases 1–6 ✅; Tier-2 ragged/t-digest is the explicit follow-up). Draft, implement. Marking waiting since the Tier-1 work is complete and it's awaiting your review/merge — happy to act on #1 above if you want it.


Generated by Claude Code

@espg espg added the waiting label Jun 16, 2026
@espg

espg commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

signature() now includes output_fields → any pre-existing ShardMap (built before this change) will no longer equality-match grid.signature() and the runner reports a grid mismatch. Correct-by-design (the shard map must pin the output schema), but existing catalogs need regeneration. Want me to soften the runner mismatch message to hint at regeneration?

no, it's fine as is.

t-digest fit (Tier 2): the list-carrier and trailing-dim writer are the deliberate shape for the imminent ragged t-digest as List<FixedSizeList<2>>; that lands as the Tier-2 follow-up (CSR ragged), not here.

understood -- it makes sense to do this as a new issue / seperate PR

Does this mean that this PR is ready for review / can be merged?

@espg

espg commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

🤖 from Claude

Yes — the Tier-1 vector path (phases 1–6) is complete and CI-green:

  • config field-kind declaration (Option B: kind/trailing_shape/dtype),
  • ndarray-valued calculate_cell_statistics for vector fields (+ vector-expression, vector+len/count guard),
  • the approved Arrow FixedSizeList carrier (B′) — pure-scalar configs stay byte-for-byte identical,
  • OutputGrid.signature()/nests_with() field-kind set,
  • the dense trailing-dim writer with the single-trailing-chunk invariant enforced at the write site,
  • the leaf→Zarr→read round-trip test (reducers skip the NaN padding).

CI on the head commit is green (test (3.12/3.13), ruff, builds); suite 316 passed / 1 skipped (the spherely fork). Tier-2 (CSR ragged + t-digest) is split out to #48.

Per the routine I keep this a draft and don't self-mark ready-for-review or merge — but it is ready for your review/merge whenever you'd like. Leaving it waiting.


Generated by Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support non-scalar aggregation outputs (vectors + ragged per-cell payloads)

2 participants