(fix): _CachedRange boundary OOB misclassification on x86_64#153
Conversation
On x86_64 the _CachedRange fast path stores lo/hi that can round 1 ULP inward of the true endpoint, with domain_lo/domain_hi as the safe cushion. Forward-eval OOB classification read raw first(x)/last(x), so a query at the true endpoint (e.g. range[end]) was misclassified out-of-bounds: FillExtrap returned the fill value (KernelDensity.jl pdf returned an all-zero boundary row), ClampExtrap gave a flat derivative, WrapExtrap wrapped to the far edge. The batch path was already correct (it dispatched _is_all_inbounds on axis type). Centralize every forward OOB decision through one axis-aware classifier: - _domain_bounds(x): safe bounds (widened for _CachedRange, exact otherwise) - _oob_state(x, xq): shared 3-way IN_DOMAIN/OOB_LEFT/OOB_RIGHT classifier - _is_inbounds(x, xq): boolean wrapper; _is_all_inbounds unified onto _domain_bounds Routed through it: _anchor_loc, all 1D scalar/one-shot evals (linear, quadratic, constant, cubic, hermite x2), the _wrap_to_domain chokepoint, ND _try_fill_oob/ _is_fill_oob/_handle_axis_extrap, and the cubic-periodic hoist. first/last stay the true grid endpoints, so geometry and wrap-period are unchanged. test_fillextrap_domain_boundary.jl injects a synthetic widened _CachedRange to reproduce the x86 bug deterministically on any arch, including the KernelDensity.jl 2D fill=0 boundary-row pattern.
Companion to the forward-eval fix (fadac05). The adjoints derived OOB state from raw first/last instead of the widened _CachedRange domain bounds, so a query at the true endpoint was misclassified: FillExtrap produced an all-zero sensitivity row (the boundary cell's weights were skipped/zeroed) and NoExtrap threw a DomainError one ULP early — both inconsistent with the (now-fixed) forward eval. Route every adjoint boundary classification through _domain_bounds: - ClampExtrap/FillExtrap clamp+state sites: linear, constant, cubic, _bake_quadratic_adjoint_anchors, and _bake_hermite_adjoint_anchors (the shared helper covers hermite + pchip + akima + cardinal). - NoExtrap validation loops: linear, constant, quadratic, hermite, pchip, akima, cardinal. Clamp targets and wrap periods stay on the actual endpoints; only the in/out classification widens, mirroring the forward _oob_state. test/test_fillextrap_adjoint_boundary.jl: at an in-domain boundary the adjoint row must sum to 1 (value interpolation reproduces constants) and NoExtrap must not throw, across linear/cubic/quadratic/constant/cardinal/pchip/akima, both boundaries.
Companion to the 1D adjoint fix (1e48c87). The ND adjoint per-axis OOB flag was computed from raw first(grids[d])/last(grids[d]), so a query at a true _CachedRange endpoint was flagged OOB and its weights zeroed — an all-zero sensitivity row for FillExtrap at a boundary corner. Route the per-axis classification through the shared _oob_state: - nd_adjoint_scatter.jl _bake_nd_anchors_generic (cubic / quadratic / hetero ND) - linear_nd_adjoint.jl - constant_nd_adjoint.jl (state_flag) ND NoExtrap already validated via the shared _validate_nd_domain (_check_domain on _CachedRange uses domain bounds), so it needed no change. test_fillextrap_adjoint_boundary.jl gains the ND corner row-sum=1 guard across linear/cubic/constant/quadratic + hetero, both corners.
Code review of the adjoint boundary fix found the widened _domain_bounds cushion leaking into evaluation geometry: ClampExtrap clamped OOB queries to domain_hi (1 ULP past the true endpoint) instead of last(x), and quadratic_adjoint's wrap folded against the widened period. Classification must use the widened bounds; clamp/wrap/search geometry must use the actual endpoints — the two were mixed. Separate the two concepts behind helpers so call sites never touch the widened bounds directly: - _clamp_to_grid(xq, x): clamp to the physical span [first, last] (geometry) - _is_inbounds / _oob_state: widened in-domain test / 3-way classify - _validate_domain(axis, queries): 1D NoExtrap check via _check_domain (replaces seven duplicated _domain_bounds validation loops) Adjoint anchor builders (linear/constant/cubic clamp+fixup; quadratic/hermite bake) now clamp via _clamp_to_grid and classify via _is_inbounds/_oob_state; the *_fixup_* helpers take the axis and classify internally. quadratic wrap routes through the 2-arg _wrap_to_domain(xq, x) (actual period). Behavior is unchanged within the accepted +-1 ULP cushion; full adjoint suite green.
The _ExclusivePeriodicAxis classification used first(g)=inner[1]/last(g)=_x_max directly, so the inner _CachedRange's widened domain bounds never reached it. A query at the inner's true left endpoint (lo rounded inward on the x86_64 fast path) was classified OOB and folded to the seam cell — the value still came out y[1], but the derivative used the seam-cell slope instead of the first-cell slope (observable: -20 vs +20). - _domain_bounds(g::_ExclusivePeriodicAxis) propagates the inner's widened left bound (fixes _anchor_loc / _oob_state / adjoint paths). - _wrap_to_domain(xq, g::_ExclusivePeriodicAxis) gains the _is_inbounds guard so an in-domain query (incl. the widened endpoint) is not folded; genuinely-OOB queries still fold against the actual [inner[1], _x_max] span. Zero-alloc preserved on the exclusive eval path; periodic + wrap suites green. Adds a Series boundary regression guard (Series inherits the _anchor_loc fix) and the exclusive-periodic first-cell-vs-seam derivative test.
The shared locator now classifies domain state first (_oob_state), wraps only OOB queries when wrap is set (re-classifying after), then searches — the docstring's old 'wrap -> classification -> search' order was inverted.
…e pass The linear/constant/cubic adjoint constructors built ClampExtrap/FillExtrap anchors in three passes: a _clamp_to_grid.() broadcast (allocating a temp clamped-query array), an _anchor_query loop, then a _fixup_* loop. Replace each with a single _bake_*_clampfill_anchors loop that clamps, anchors, and applies the OOB fixup per query — dropping the temp array and fusing the passes (matching the Hermite/quadratic adjoints' existing single-loop style). Behavior is unchanged: the same scalar anchor impl runs on the same clamped query, and OOB queries get the same state flag (linear/constant) or zeroed weights (cubic) via the same widened (_oob_state/_is_inbounds) classification. The apply hot path is untouched; only construction-time work is reduced. Adds a golden-rule (transpose identity) test on the synthetic widened _CachedRange: dot(itp.(xq), y_bar) == dot(f, adj(y_bar)) for f-linear methods at a query set including the true-endpoint sliver.
Comment/docstring-only: trim the verbose narrative added with the boundary-OOB fix to its core (the widened-classification vs actual-geometry distinction), and remove the KernelDensity.jl references from the test comments/test name. No code change; the genuine AD/perf rationale (ForwardDiff partial-sign, extrema/SIMD note) is kept.
FastInterpolations.jl BenchmarksAll benchmarks (50 total, click to expand)
This comment was automatically generated by Benchmark workflow. |
There was a problem hiding this comment.
Pull request overview
Fixes a boundary-classification bug where scalar/oneshot/ND/adjoint paths could misclassify a query at the true endpoint of a _CachedRange as out-of-bounds (especially on x86_64), causing FillExtrap to return the fill value instead of the boundary value. The change centralizes domain classification via widened axis-aware bounds while keeping search/geometry on the physical grid endpoints, and adds regression tests covering forward + adjoint behavior.
Changes:
- Introduce a single axis-dispatched domain-bounds source (
_domain_bounds) and route OOB decisions through_oob_state/_is_inbounds. - Update multiple eval/oneshot/ND/adjoint paths to use the shared classifier and to separate “classification bounds” from “geometry/clamp/wrap bounds”.
- Add targeted regression tests reproducing the inward-rounded
_CachedRangeendpoint issue (forward and adjoint).
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/core/utils.jl |
Adds _domain_bounds, _is_inbounds, _clamp_to_grid; updates _is_all_inbounds to use widened bounds consistently. |
src/core/anchor_common.jl |
Adds _oob_state and reworks _anchor_loc to classify via widened bounds and only wrap when truly OOB. |
src/core/query_protocol.jl |
Adds _validate_domain helper for 1D NoExtrap validation via _check_domain. |
src/core/periodic.jl |
Makes _wrap_to_domain(xq, x) axis-aware with an in-domain guard using widened bounds. |
src/core/periodic_axis.jl |
Extends _domain_bounds for _ExclusivePeriodicAxis and applies widened in-domain guards for wrapping. |
src/core/nd_utils.jl |
Routes FillExtrap ND OOB checks and clamp/fill axis handling through _oob_state. |
src/core/nd_adjoint_scatter.jl |
Uses _oob_state for ND adjoint per-axis OOB classification. |
src/linear/linear_oneshot.jl |
Uses _oob_state for scalar OOB detection instead of raw first/last. |
src/quadratic/quadratic_oneshot.jl |
Uses _oob_state for scalar OOB detection instead of raw first/last. |
src/cubic/cubic_eval.jl |
Uses _oob_state for scalar OOB detection instead of raw first/last. |
src/cubic/cubic_oneshot.jl |
Switches in-domain check to _is_inbounds (widened bounds). |
src/hermite/hermite_eval.jl |
Uses _oob_state for scalar OOB detection instead of raw first/last. |
src/linear/linear_adjoint.jl |
Refactors Clamp/Fill anchor baking and NoExtrap validation to shared helpers. |
src/constant/constant_adjoint.jl |
Refactors Clamp/Fill/Extend anchor baking and NoExtrap validation to shared helpers. |
src/cubic/cubic_adjoint.jl |
Refactors Clamp/Fill anchor baking into a fused loop with widened-bound classification. |
src/quadratic/quadratic_adjoint.jl |
Uses _clamp_to_grid, _wrap_to_domain(xq,x), _is_inbounds, _validate_domain for consistent boundary handling. |
src/hermite/hermite_adjoint.jl |
Uses _clamp_to_grid, _wrap_to_domain(xq,x), _is_inbounds, _validate_domain for consistent boundary handling. |
src/pchip/pchip_adjoint.jl |
Replaces manual NoExtrap validation with _validate_domain. |
src/cardinal/cardinal_adjoint.jl |
Replaces manual NoExtrap validation with _validate_domain. |
src/akima/akima_adjoint.jl |
Replaces manual NoExtrap validation with _validate_domain. |
src/linear/nd/linear_nd_adjoint.jl |
Uses _oob_state for ND adjoint OOB flags (matches forward). |
src/constant/nd/constant_nd_adjoint.jl |
Uses _oob_state for ND adjoint side flags (matches forward). |
test/test_fillextrap_domain_boundary.jl |
New forward regression tests ensuring true-endpoint queries are in-domain across scalar/oneshot/ND/Series/Wrap/deriv/exclusive-periodic. |
test/test_fillextrap_adjoint_boundary.jl |
New adjoint regression tests ensuring boundary queries scatter correctly (row sums, NoExtrap acceptance, transpose identity). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #153 +/- ##
==========================================
+ Coverage 96.57% 96.61% +0.03%
==========================================
Files 149 149
Lines 12293 12276 -17
==========================================
- Hits 11872 11860 -12
+ Misses 421 416 -5
🚀 New features to boost your workflow:
|
_check_domain extracted the primal of the bounds but not the query, so a ForwardDiff.Dual query exactly at the domain boundary flipped in/out of domain by its partial sign alone (NoExtrap one-shot threw a spurious DomainError). Extract the query primal too — partial-independent, matching _is_all_inbounds/ _oob_state. Float queries unchanged. Adds duck-type tests pinning the new boundary helpers (_oob_state/_is_inbounds/ _clamp_to_grid) and forward-eval Dual-query preservation at the widened _CachedRange endpoint. Documents the pre-existing adjoint-constructor Dual-query limitation at _promote_adjoint_inputs (use forward eval for query-position AD).
* (fix): ND ClampExtrap promotes OOB endpoint instead of coercing query type _handle_axis_extrap(::_ClampOrFill) clamped an OOB query by returning oftype(q, first(axis)), coercing the float grid endpoint into the query's type. An Int query against a fractional-endpoint Float grid hit convert(Int, 0.5) -> InexactError, breaking every ND method under ClampExtrap (linear/constant/cubic/quadratic/pchip/cardinal/akima all share the helper). Latent since 3562593 (AbstractExtrapMode refactor); the area was last reshaped by #153. Replace the coercion with _promote_extrap_val(first(axis), q) -- the carrier-propagating idiom already used for the ND FillExtrap result two lines up. It promotes the endpoint to the natural query-grid type (Int->Float, Float32->Float64 without narrowing), preserves ForwardDiff Dual carriers with a zeroed partial (correct flat-region gradient), and keeps the widened-classify / actual-clamp sliver split intact. 1D was already correct (value-clamp via _eval_extrapolation), so no 1D change. Pin: test/test_query_grid_promotion.jl -- 1D/ND x range/vector x all extraps x {linear,constant,cubic} x both type-mismatch directions, plus exact-boundary corners, non-half fractional endpoints, and a ForwardDiff AD carrier/zero-gradient lock. * (perf): type-stable ND extrap coordinate handling for mismatched query eltypes ND coordinate-clamp (Clamp/Fill/Wrap) returned Union{Int,Float} for Int/Float32 queries on a Float grid: in-domain returned the raw query, OOB the promoted endpoint. That Union costs union-split per query, and for the Hermite family (pchip/cardinal/akima) ND it leaked to a public Any -> dynamic dispatch + allocation in user code. Fix: promote each axis query to the grid float type ONCE in _handle_all_extraps -- the extrap-independent chokepoint every ND path (persistent + one-shot, scalar + batch) funnels through. No-op for matched Float64, int->float for Int/Float32, Dual/GridIdx-safe. The per-extrap handlers no longer promote. A function barrier at _eval_nd_at_point promotes before _try_fill_oob as well, removing Fill's Int/Float two-views cost on the scalar path. imresize 16->128 (scalar Int queries): Clamp 80->55us, Fill 60.8->51.8us (Int now equals Float); Float hot paths unchanged (compile-time no-op). 1D is unaffected -- it value-clamps, so it was already type-stable. Tests: type-stability pinned at the _handle_all_extraps (internal), public (all 7 methods, incl. Hermite Any), and 1D levels via return_types + @inferred/isa; plus exact-boundary, fractional-endpoint, and AD carrier locks. * (perf): type-stable N=2 batch coordinate path for mismatched query eltypes The N=2 _locate_cell specialization routes through _locate_cell_2d_preamble, which — unlike the generic-N _handle_all_extraps chokepoint — did not promote the query. The scalar path promotes in _eval_nd_at_point, but the batch path (_interp_nd_batch!) hands the raw per-query tuple straight to the preamble, so a mismatched-eltype query (e.g. Int on a Float64 grid) under ClampExtrap left the OOB branch returning Float64 and the in-domain branch returning Int — a Union that union-splits per batch query. It does not escape _locate_cell's own return type, so only the preamble exposes it. Promote each axis query to grid-float at the preamble, symmetric with _handle_all_extraps and idempotent on the already-promoted scalar path. Also correct the _handle_axis_extrap / _handle_all_extraps / _eval_nd_at_point comments to name all promotion sites accurately. Test: pin _locate_cell_2d_preamble as concrete for Int queries (RED before the preamble promotion, green after). * Runic formatting * (fix): promote ND query at the _extrap_axis gateway (mirror 1D) — fix Int-grid regression + Hermite Any leak Supersedes the per-eval query chokepoint from 44beacf/c01fce91b, which over-promoted non-float grids: promoting via float(eltype(grid)) upcast an Int query on an Int/Rational grid to Float64, breaking the constant-interp Int->Int / Rational->Rational eltype contracts (test_constant_eltype.jl, test_duck_tv_dual_tq.jl) on x86 + LTS CI. Unified fix mirroring the 1D anchor path: promote the query toward the grid ELTYPE via _promote_for_anchor(q, eltype(grid)) at the single per-axis gateway _extrap_axis, and route the N=2 _locate_cell_2d_preamble through it too. Promotion is a no-op for a matched/wider float query and for non-float grids (Int/Rational stay put; Dual via the duck catchall), firing only for a narrower query on a float grid — exactly the case that hit the ClampExtrap oftype InexactError and the per-axis Union{Int,Float64}. That Union leaked to a public Any for Hermite ND (pchip/cardinal/akima); promoting at the gateway makes every coordinate concrete, so no method/extrap/path leaks. interpolant_protocol.jl returns to its master form (no chokepoint). Tests (test_query_grid_promotion.jl): RED->GREEN dedicated Hermite-ND Any-leak pin (pchip/cardinal/akima x all extraps x Int -> Float64); internal _handle_all_extraps concrete for every (extrap x eltype) with Float-grid->Float64 and Int-grid->Int (pins the no-op the regression broke); N=2 preamble concrete for mismatched Int.
Summary
A query at the true endpoint of a range grid could be misclassified out-of-bounds, so
FillExtrapreturned the fill value instead of the boundary value. This routes every out-of-bounds decision through one axis-aware classifier and adds boundary regression tests.Root cause
On x86_64,
_CachedRangecollapses aStepRangeLen{TwicePrecision}into plain scalarlo/hi, which can land ~1 ULP inward of the true endpoint; the struct already stored widened bounds (domain_lo/domain_hi) for this. The batch path used them and was correct, but the scalar / one-shot / ND / adjoint paths classified against rawfirst/last, so a query at the true endpoint was flagged OOB — leaking the fill (and giving a flat boundary derivative underClampExtrap, wrapping toy[1]underWrapExtrap, or an all-zero adjoint row).Fix
A single classifier for "in domain, and if not which side?":
_domain_bounds(widened for_CachedRange, exact otherwise), with_oob_state/_is_inboundsreading it and_clamp_to_gridclamping to the actual endpoints. Classification uses the widened bounds; all geometry (search, weights, clamp target, wrap period) stays on the actual endpoints, so the cushion never leaks into a coordinate. Every scalar / one-shot / ND / adjoint site routes through these.Refactor
The linear/constant/cubic adjoint constructors built ClampExtrap/FillExtrap anchors in three passes (clamp broadcast + anchor + fixup); these are fused into one
_bake_*_clampfill_anchorsloop, dropping the temp array. Behavior is unchanged; only construction-time work is reduced.Tests
test/test_fillextrap_domain_boundary.jl(forward scalar/one-shot/ND/Series, derivative, Wrap, exclusive-periodic) andtest/test_fillextrap_adjoint_boundary.jl(sensitivity-row sum, NoExtrap acceptance, reverse-matches-forward, transpose-identity golden rule), both on a synthetic inward-rounded_CachedRange.