muon_calibration: ONNX-backed shift+smear reweight pipeline by bendavid · Pull Request #692 · WMass/WRemnants

bendavid · 2026-05-22T15:29:23Z

Summary

End-to-end pipeline for the muon-calibration shift+smear reweight model — from snapshot creation through training, ONNX/AOTI export, C++ helper integration, and histmaker wiring. Routes the J/ψ-stats, Z-non-closure, and closure-uncertainty helpers through the trained network by default, with the analytic Splines / Gaussian / massWeights paths still available via --muonScaleVariation.

Built on top of #691 (CI container v53 + narf TBB task-arena fix), which provided the ONNX-runtime / threading prerequisites.

Pipeline pieces

- Snapshot (J/ψ + W/Z → per-muon Arrow IPC shards). New: flow_training_snapshot.py, arrow_shard_export.py; structured-record intermediate dtype; train/val/holdout split via per-shard record-batch ranges (default 80/10/10).
- Training (BCE shift+smear-reweight head). New: train_shift_smear_reweight.py, train_muon_response_flow.py, arrow_shard_loader.py. MLP-factored architecture (trunk + shift / smear heads composing an inner-product reweight), bf16 trainer, muon_source conditioning ({W/Z prompt, τ-decay, J/ψ leg} → {-1, 0, +1}).
- Export (shift_smear_reweight_export.py): single-file ONNX with weights inlined; muon_source remap baked into the graph; AOTI/ONNX with dynamic batch + N_var.
- C++ helpers (wremnants/production/include/shift_smear_reweight_helpers.hpp):
  - shift_smear_reweight::ReweightModel (narf onnx wrapper) + ReweightEvaluator<NVar> (shared per-muon kinematics → ONNX → exp(log_r) core).
  - JpsiCorrectionsUncReweightHelper<T> and SmearingUncertaintyReweightHelper<...> as drop-ins for the analytic Splines variants.
  - ZNonClosureParametrizedReweightHelper{Corl,} and ZNonClosureBinnedReweightHelper{Corl,} (new) replace the splines linearisation with the trained reweight; shared shift-source code with the existing Splines helpers via two free z_non_closure_*_delta_r_kappa helpers.
  - SmearingHelperSimpleReweight and ScaleHelperSimpleReweight for the w_z_muonresponse.py --testHelpers closure path.
- Bundled ONNX (wremnants-data submodule bump): two variants matching the resolution-smearing setting, shift_smear_reweight_mlp_factored_combined_{smearing,nosmearing}.onnx. The factories pick the matching one based on --noSmearing.
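The muon_source conditioning listed above can be sketched in NumPy. The raw codes {1, 15, 443} are the ones quoted in the export commit message of this PR, paired with {-1, 0, +1} in the order given there; the function name `remap_muon_source` and the decision to raise on unknown codes are assumptions for illustration, not the op that is baked into the exported graph.

```python
import numpy as np

# Raw per-muon source code -> network conditioning value, in the order quoted
# in this PR: 1 (W/Z prompt) -> -1, 15 (tau decay) -> 0, 443 (J/psi leg) -> +1.
_SOURCE_REMAP = {1: -1, 15: 0, 443: 1}


def remap_muon_source(raw_codes):
    """Map raw source codes to {-1, 0, +1}.

    Hypothetical helper for illustration; not the exported ONNX graph op.
    """
    raw = np.asarray(raw_codes, dtype=np.int64)
    out = np.empty_like(raw)
    known = np.zeros(raw.shape, dtype=bool)
    for code, value in _SOURCE_REMAP.items():
        mask = raw == code
        out[mask] = value
        known |= mask
    if not known.all():
        # Assumption: unknown codes are treated as an error here; the real
        # graph may handle them differently.
        raise ValueError(f"unmapped muon_source codes: {np.unique(raw[~known])}")
    return out
```

Doing the remap inside the graph means callers can pass the raw code straight through, which is what the histmaker-side passthrough column relies on.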

Histmaker integration

- New CLI choice --muonScaleVariation onnxReweight (default). The Splines / Gaussian / massWeights paths are unchanged.
- make_muon_smearing_helpers, make_jpsi_crctn_unc_helper, make_closure_uncertainty_helper, make_uniform_closure_uncertainty_helper, make_Z_non_closure_parametrized_helper, make_Z_non_closure_binned_helper all accept smearing=True and resolve the bundled ONNX path automatically; histmakers (mw_with_mu_eta_pt, mw_with_mu_eta_pt_VETOEFFI, mz_dilepton, mz_wlike_with_mu_eta_pt, w_z_muonresponse) pass smearing=not args.noSmearing.
- New muon_calibration.jpsi_style_cols(df, helper, reco_sel_GF, response_weight_col) centralises the per-helper column-list selection (9-col with muon_source for ONNX, 7-col with response_weight for analytic Splines).
- w_z_muonresponse.py --testHelpers adds ONNX reweight references (hist_qopr_smeared_weight_onnx, hist_qopr_scaled_weight_onnx) alongside the existing Splines / MC-sample references.
- New plotter scripts/corrections/muon_calibration/plot_muonresponse_reweight_closure.py overlays the variants with a variant/MC-truth ratio panel (auto-rescaling the multi-replica reference, auto-zooming the ratio y-range).

Drive-by fix

SmearingHelperSimple{,Multi} were passing the raw signed sigmarel * qop to std::normal_distribution, which trips a libstdc++ stddev>0 assertion for q<0 muons. Fixed with std::abs plus a zero-stddev guard.

Test plan

Verified on Wminustaunu_2016PostVFP and a Z prompt dataset (--maxFiles 1 -j1):

- mz_dilepton.py (default --muonScaleVariation onnxReweight) runs end-to-end and produces output.
- mz_dilepton.py --muonScaleVariation smearingWeightsSplines still works (no regression on the analytic path).
- mz_dilepton.py --noSmearing runs end-to-end (auto-selects the _nosmearing ONNX).
- w_z_muonresponse.py --testHelpers produces the closure histograms; plotter reproduces the expected per-permille ONNX scale closure and ~1% smear closure consistent with the per-model diagnostics.
- CI on the v53 container.

🤖 Generated with Claude Code

…+train+export)

Adds the minimum file set needed to run the shift_smear_reweight pipeline end-to-end on top of main:

Snapshot (J/psi + W/Z -> per-muon Arrow IPC shards):
- scripts/corrections/muon_calibration/flow_training_snapshot.py
- wremnants/production/arrow_shard_export.py

Training (BCE shift+smear reweight head):
- scripts/corrections/muon_calibration/train_shift_smear_reweight.py
- scripts/corrections/muon_calibration/train_muon_response_flow.py
- scripts/corrections/muon_calibration/arrow_shard_loader.py

Export + diagnostics:
- scripts/corrections/muon_calibration/shift_smear_reweight_export.py
- scripts/corrections/muon_calibration/shift_smear_reweight_diagnostics.py

Plumbing:
- wremnants/production/datasets/dataset_tools.py (WREMNANTS_DATA_PATH override + FQDN host match)
- scripts/tests/testenv.py (env verification)
- CLAUDE.md (container setup notes)
- narf / rabbit / wums (submodule pointers)

wremnants/production/muon_calibration.py is left identical to main; the misc technical changes to it (the dead make_parameterized_scale_shift_helper and the flexible_define / globalidxv variants) ride along with a module_corrections.hpp header that isn't part of this minimal set, so sticking with main's helper code keeps the snapshot self-consistent.

Deliberately excluded (not needed for this pipeline): the polyhead / score / flow-onnx variants, the J/psi calibration-tensor workflow (aggregategrads, bake_couplings, make_jpsi_calibration_tensor, fitresults_to_correctionResults), the C++ AOTI/ORT inference benchmarks, the new muon_calibration.hpp / module_corrections.hpp additions (J/psi tensor only), and the older mlp-only reweight export.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… ONNX path

Production follow-up to 046226b9 (minimal pipeline). The trained ONNX model is now bundled in wremnants-data and used by default in the muon-calibration uncertainty helpers.

Sharded snapshots and dataset splits
- arrow_shard_export: structured-record intermediate dtype (mixed int/float) for the bucket-shuffle writer.
- arrow_shard_loader: train/val/holdout partition via split_batch_range() over per-shard record batches (default 80/10/10); each split reads only its own slice, avoiding epoch-edge full-shard scans.
- flow_training_snapshot: drop per-muon ``event`` column -- split is decided at consumer time, not snapshot time.
- train_shift_smear_reweight / train_muon_response_flow: CLIs and stats computation use the new split scheme.
- shift_smear_reweight_diagnostics: --split (default holdout), plus per-charge and per-source closure breakdowns.

Model export
- shift_smear_reweight_export: single-file ONNX with weights inlined by default (--inline-weights), muon_source {1,15,443} -> {-1,0,+1} remap baked into the graph, and AOTI export fixes (dynamic_shapes positional tuple, N-D linear decomposition for the factored heads).

C++ helpers and integration
- shift_smear_reweight_helpers.hpp (new): JpsiCorrectionsUncReweightHelper and SmearingUncertaintyReweightHelper. NCond bumped 5 -> 6; muon_source_from_gen_part_flav() maps the Muon_genPartFlav input to the network's expected source code.
- muon_calibration: bundled default ONNX path (wremnants-data/data/calibration/shift_smear_reweight_mlp_factored_combined.onnx); --muonScaleVariation default switched to ``onnxReweight``; a ``{reco_sel_GF}_muon_source`` RVec<int> column is injected when an ONNX reweight helper is active.

Submodule bumps:
- narf: TBB-task-arena thread index for the ONNX session pool (prevents slot collisions across RDataFrame loops in the same arena).
- wremnants-data: bundles the trained ONNX artifact.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
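As a rough illustration of the per-shard split described in this commit, the sketch below maps a split name to a half-open range of record-batch indices, so each consumer only opens its own slice. The signature, the split names as strings, and the rounding rule are assumptions; the actual split_batch_range() in arrow_shard_loader may differ.

```python
def split_batch_range(n_batches, split, fractions=(0.8, 0.1, 0.1)):
    """Return the half-open [start, stop) record-batch range for one split.

    Hypothetical signature for illustration; the real helper may differ.
    """
    names = ("train", "val", "holdout")
    if split not in names:
        raise ValueError(f"unknown split {split!r}")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    # Cumulative edges, rounded so that the three ranges tile
    # [0, n_batches) exactly with no gaps or overlaps.
    edges = [0]
    acc = 0.0
    for frac in fractions[:-1]:
        acc += frac
        edges.append(int(round(acc * n_batches)))
    edges.append(n_batches)
    i = names.index(split)
    return edges[i], edges[i + 1]
```

Because the ranges are derived from the batch count alone, the split is decided at consumer time and the snapshot needs no per-muon split column.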

…pton)

Three coupled fixes that the previous commit's defaults exposed:

shift_smear_reweight_helpers.hpp -- cling JIT crash
- narf::onnx_helper_alloc holds Ort::Env / std::vector<Ort::Session> (non-copyable). RDataFrame's Define / narf::DefineWrapper need to copy the callable into internal storage, so the helpers must be copyable too. Wrap onnx_ inside ReweightModel, and model_ inside each helper, in std::shared_ptr.
- ReweightModel::run was wrapping its tensor refs in std::cref before forwarding to narf::onnx_helper_alloc::operator(). narf::tensor_traits has no reference_wrapper<> specialisation -- ::get_sizes() then fails during cling's lazy instantiation, surfacing later as a recursive descent / illegal instruction during graph build. Switch to std::tie.
- Ort::Value::CreateTensor<T>(... T* p_data ...) is non-const even for input tensors, so run()'s input tensor refs must be non-const (the caller's stack-allocated scratch buffers).

muon_calibration.py -- aux closure helper routing
- The "...Splines..." vs analytic variants of the parametrised / binned Z non-closure helpers differ in whether they consume the per-muon response_weight column. dilepton with --muonScaleVariation onnxReweight also ships that column (network ignores its value, but it is present in the DataFrame), so route onnxReweight through the Splines variant too in make_Z_non_closure_parametrized_helper and make_Z_non_closure_binned_helper.

mz_dilepton.py -- ONNX-helper Define integration
- Build the SplinesDifferentialWeightsHelper (diff_weights_helper) whenever --muonScaleVariation is "smearingWeightsSplines" or "onnxReweight" (was only the former). This keeps input_kinematics consistent (always with response_weight) for the auxiliary closure helpers downstream.
- For the J/psi stats-uncertainty Define, if the helper is the ONNX reweight type, use a local 10-element column list that includes recoPhi/genPhi/muon_source/response_weight, leaving the 7-element input_kinematics intact for the analytic-style closure helpers below.

Verified end-to-end on Wminustaunu_2016PostVFP --maxFiles 1 -j1 for both --muonScaleVariation onnxReweight (new default) and smearingWeightsSplines (no regression).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… evaluator

The J/psi-stats and smearing-uncertainty ONNX helpers used to inline the per-muon kinematics packing, u_buf fill, model run and exp/clip step. The Z non-closure scale-uncertainty helpers were still splines-only, which forced the histmakers to keep building the SplinesDifferentialWeightsHelper and the response_weight column even when --muonScaleVariation=onnxReweight.

shift_smear_reweight_helpers.hpp
- New shift_smear_reweight::ShiftReweightEvaluator<NVar>: per-muon raw kinematics + caller-supplied delta_r_kappa[NVar] -> alt_weights[NVar]. Encapsulates the y_raw/c_raw build, the u_buf fill, the ONNX call and the exp/clip step. Owns the ReweightModel via shared_ptr so the host class stays copyable.
- JpsiCorrectionsUncReweightHelper refactored to compose ShiftReweightEvaluator; ~120 LOC of per-muon code collapses to ~25. Also drops response_weights from the column list (the network gives the full reweight directly, so the splines linearisation isn't consulted).
- SmearingUncertaintyReweightHelper drops response_weights for consistency (it still has its own σ-build logic since resolution variations are smear-only).
- Four new helpers: ZNonClosureParametrizedReweightHelper{Corl,}<T, N> and ZNonClosureBinnedReweightHelper{Corl,}<T, N, M>. NVar = 2 down/up per muon. The per-muon shift-source calculation (the bit that mirrors the existing Splines helpers' recoQopUnc loop) lives in two free functions z_non_closure_{param,binned}_delta_r_kappa, so the four classes share both the shift source AND the ShiftReweightEvaluator.

muon_calibration.py
- make_Z_non_closure_parametrized_helper / make_Z_non_closure_binned_helper: default scale_var_method to "onnxReweight"; add onnx_path / onnx_nslots kwargs. Route to the four new C++ classes for the ONNX path.
- _is_onnx_reweight_helper matches any of the six reweight-helper prefixes.
- jpsi_style_cols: ONNX branch returns a 9-col list (no response_weight) -- the column is genuinely unused now.
- add_resolution_uncertainty: column list for ONNX no longer includes response_weight.
- add_jpsi_crctn_stats_unc_hists / add_jpsi_crctn_Z_non_closure_hists: use jpsi_style_cols where possible; the W histmaker path now handles the ONNX helpers correctly.

mz_dilepton.py
- diff_weights_helper is no longer built for --muonScaleVariation=onnxReweight (splines response_weight is completely bypassed in this mode).

Verified end-to-end on Wminustaunu_2016PostVFP and a Z prompt dataset with both --muonScaleVariation=onnxReweight (default) and smearingWeightsSplines, --maxFiles 1 -j1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

``SmearingHelperSimple::operator()`` was passing ``dsigma = sigmarel_ * qop`` straight to ``std::normal_distribution{qop, dsigma}``. For q<0 muons ``qop < 0`` so ``dsigma < 0``, which trips libstdc++'s assertion ``_M_stddev > _RealType(0)`` and aborts (seen in ``scripts/histmakers/w_z_muonresponse.py``). Wrap in ``std::abs`` to match what ``SmearingHelperSimpleMulti`` already did. Also add a ``dsigma > 0`` guard in both helpers so a legitimate ``sigmarel == 0`` (no smearing) falls through cleanly -- the assertion is strict greater-than and ``std::abs(0) == 0`` would still trip it. For the Multi variant the δ-function case emits N copies of ``qop``. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
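The same guard can be illustrated in Python. NumPy's `Generator.normal` also rejects a negative width (it raises `ValueError`), which plays the role of the libstdc++ assertion here, although unlike libstdc++ it does accept a width of exactly zero. The function name `smear_qop` is made up for this sketch; it mirrors the logic described above rather than reproducing the C++ helper.

```python
import numpy as np


def smear_qop(qop, sigmarel, rng):
    """Draw a smeared q/p value; sketch of the guard, not the C++ helper."""
    # The width must be non-negative even for q < 0 muons, where qop < 0.
    dsigma = abs(sigmarel * qop)
    if dsigma > 0.0:
        return rng.normal(loc=qop, scale=dsigma)
    # sigmarel == 0 (no smearing): delta-function limit, return the input.
    return qop


rng = np.random.default_rng(12345)
unsmeared = smear_qop(-0.025, 0.0, rng)   # falls through the guard
smeared = smear_qop(-0.025, 5e-3, rng)    # negative-charge muon, valid width
```

The explicit `dsigma > 0` branch matters in the C++ version because the assertion there is strict, so taking the absolute value alone would still fail for a zero width.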

shift_smear_reweight_helpers.hpp
- Rename shift_smear_reweight::ShiftReweightEvaluator -> ReweightEvaluator. The namespace already carries the shift+smear context; the bare class name should just say what the type does, especially now that it handles both u and σ inputs. Touches the typedefs in J/psi-stats, Z non-closure (parametrized / binned, Corl & uncorrelated), and the new SimpleReweight helpers below.
- Extend ReweightEvaluator::evaluate with a (delta_r_kappa, sigma_r_kappa) overload; the shift-only signature now forwards to it with a zero σ array (existing callers unchanged).
- Refactor SmearingUncertaintyReweightHelper to compose ReweightEvaluator<NVar> instead of inlining the y/c/u/σ packing + ONNX call (~85 LOC -> ~40).
- Add SmearingHelperSimpleReweight and ScaleHelperSimpleReweight as ONNX drop-ins for wrem::SmearingHelperSimpleWeight / ScaleHelperSimpleWeight: same scalar reweight semantics, computed via the trained network instead of the analytic dweightd[mu|sigmasq] linearisation.

scripts/histmakers/w_z_muonresponse.py
- In --testHelpers, construct SmearingHelperSimpleReweight and ScaleHelperSimpleReweight using muon_calibration.default_shift_smear_reweight_onnx, Define selMuons_muon_source (per-muon Muon_genPartFlav passthrough, remapped to {-1, 0, +1} inside the ORT graph), compute weight_smear_onnx / weight_scale_onnx, and book the matching hist_qopr_*_weight_onnx histograms on the same axes as the splines counterparts.

scripts/corrections/muon_calibration/plot_muonresponse_reweight_closure.py
- New closure plotter. Loads the histmaker output, aggregates over MC processes, projects to qopr, and overlays nominal / MC-truth / splines reweight / ONNX reweight / splines transform with a variant-over-truth ratio panel. Emits both log-y and linear-y versions (4 files per closure type × 2 = 8 files / run).
- The MC-smeared reference (hist_qopr_smearedmulti) is filled with nreps=100 replicas per muon, so its integral is N_reps × the nominal integral. The plotter detects this empirically (integral ratio) and rescales so all curves share a common normalisation; for single-sample reference hists this is a no-op.
- Ratio-panel y-range auto-zooms to max |ratio - 1| in the visible x-range, restricted to bins with >=1% of the truth peak (filters noise tails); padded by 1.3×, clamped to [±0.5%, ±50%]. Manual override: --ratio-ylim LO HI.

Closure level verified consistent with the per-source diagnostic from direct20/splitmlp_bf16_bce_exp_largetrunk_factshift: ~1% wavy residual at sigmarel=5e-3 (= 0.3 σ_y), ~few permille on splines and ~0 on ONNX at scalerel=5e-4 (= 0.03 σ_y).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
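The ratio-panel auto-zoom rule quoted in this commit can be sketched as follows. The function name `ratio_ylim`, its parameters, and the handling of empty selections are assumptions for illustration and are not taken from the plotter script; only the thresholds (1% of the truth peak, 1.3× padding, clamp to ±0.5% and ±50%) come from the text above.

```python
import numpy as np


def ratio_ylim(variant, truth, visible=None, pad=1.3, lo=0.005, hi=0.5,
               min_frac=0.01):
    """Symmetric y-range around 1 for a variant/truth ratio panel.

    Hypothetical helper; mirrors the rule described above, not the plotter code.
    """
    variant = np.asarray(variant, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Keep only bins with at least min_frac of the truth peak (drops noisy tails).
    keep = (truth > 0.0) & (truth >= min_frac * truth.max())
    if visible is not None:
        keep &= np.asarray(visible, dtype=bool)  # restrict to the visible x-range
    if not keep.any():
        half = lo  # nothing to zoom on: fall back to the minimum half-width
    else:
        dev = np.abs(variant[keep] / truth[keep] - 1.0).max()
        half = float(np.clip(pad * dev, lo, hi))
    return 1.0 - half, 1.0 + half
```

For example, a 1% maximum deviation in the well-populated bins gives a half-width of 1.3%, while a perfect closure is shown with the ±0.5% floor.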

The shift+smear-reweight network is conditioned on the reco pt's treatment of the data/MC resolution-matching smearing helper, so a model trained on a smeared snapshot must not be applied to un-smeared reco kinematics (and vice versa). wremnants-data now ships two bundled variants:

- shift_smear_reweight_mlp_factored_combined_smearing.onnx (default)
- shift_smear_reweight_mlp_factored_combined_nosmearing.onnx

Replace the module-level ``default_shift_smear_reweight_onnx`` string with a function ``default_shift_smear_reweight_onnx(smearing=True)`` that returns the matching path. Every factory that takes an ONNX path (``make_muon_smearing_helpers``, ``make_jpsi_crctn_unc_helper``, ``make_closure_uncertainty_helper``, ``make_uniform_closure_uncertainty_helper``, ``make_Z_non_closure_parametrized_helper``, ``make_Z_non_closure_binned_helper``) gains a ``smearing=True`` kwarg and falls back to the matching default path when ``onnx_path`` is None. Wrapper factories ``make_jpsi_crctn_helpers`` and ``make_Z_non_closure_helpers`` thread ``smearing`` through to the per-helper builders (the latter reads ``args.noSmearing`` directly).

Histmaker call sites updated to pass ``smearing=not args.noSmearing``: mw_with_mu_eta_pt, mw_with_mu_eta_pt_VETOEFFI, mz_dilepton, mz_wlike_with_mu_eta_pt. ``w_z_muonresponse.py --testHelpers`` picks the matching variant for its SmearingHelperSimpleReweight / ScaleHelperSimpleReweight constructions too.

Also bump the wremnants-data submodule pointer to the ``add no-smearing ONNX; rename existing to _smearing suffix`` commit that introduced the two-file bundle.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
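A minimal sketch of the path selection and fallback described in this commit is shown below. The file names and the data/calibration directory are the ones quoted in this PR; the `data_dir` parameter and the `resolve_onnx_path` helper are assumptions added for illustration, since the PR text does not show how the real function locates wremnants-data.

```python
from pathlib import Path


def default_shift_smear_reweight_onnx(
    smearing=True, data_dir="wremnants-data/data/calibration"
):
    """Return the bundled ONNX path matching the smearing setting.

    Sketch only: `data_dir` is an assumed parameter, not part of the real API.
    """
    suffix = "smearing" if smearing else "nosmearing"
    name = f"shift_smear_reweight_mlp_factored_combined_{suffix}.onnx"
    return str(Path(data_dir) / name)


def resolve_onnx_path(onnx_path=None, smearing=True):
    """Mimic the factories' fallback: an explicit path wins, else the default.

    Hypothetical helper name; the factories do this inline in the real code.
    """
    if onnx_path is not None:
        return onnx_path
    return default_shift_smear_reweight_onnx(smearing=smearing)
```

A histmaker would then call the factory with `smearing=not args.noSmearing`, so a `--noSmearing` run picks up the `_nosmearing` file without any explicit path.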

CI linting on PR #692 failed on isort / flake8 / black. Apply the same commands the workflow runs:

* ``isort . --skip narf --skip rabbit --skip wremnants-data --skip wums --profile black --line-length 88``
* ``black --exclude '(^\.git|\.github|\.ipynb|narf|rabbit|wremnants-data|wums)' .``
* drop F401 unused imports (``typing.Tuple``, ``typing.List``, ``hist``, ``import zuko`` next to ``from zuko.flows import ...`` blocks, ``PreprocStats`` in ``train_shift_smear_reweight.py``).
* mark ``scripts/tests/testenv.py`` imports with ``# noqa: F401`` (the file exists exactly to verify those imports work).
* fix a real F821 in ``train_muon_response_flow.py``: ``_detach_pure_coefs_in_joint`` takes a ``polyhead`` argument but read ``head.is_pure_u | head.is_pure_sigma`` instead of ``polyhead.*``. Would crash at the first JOINT-mode batch when ``--detach-pure-{shift,smear}-in-joint`` is on.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…mpilable

CI's ``check C++ Files`` step runs ``clang++ -fsyntax-only`` on every header alone. shift_smear_reweight_helpers.hpp depended on symbols that muon_calibration.hpp brings in (``wrem::clip_tensor``, ``wrem::calculateQopUnc``, ``wrem::SmearingHelperParametrized``, the ``using ROOT::VecOps::RVec`` pulled into namespace wrem, ``narf::get_value``), and the runtime cling load order made it work, but the standalone check failed.

Add a header guard to muon_calibration.hpp and #include it from shift_smear_reweight_helpers.hpp. With the guard, re-includes in cling become no-ops, so the runtime ``narf.clingutils.Declare`` of both headers still works.

Verified standalone:

    clang++ -I./narf/narf/include/ -I./wremnants/include/ \
        -I .../ROOT/include -std=c++20 -fsyntax-only \
        wremnants/production/include/shift_smear_reweight_helpers.hpp

passes (exit 0) for both headers. mz_dilepton and w_z_muonresponse --testHelpers still produce output.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The W (``mw_with_mu_eta_pt.py``) and W-like (``mz_wlike_with_mu_eta_pt.py``) histmakers crashed under the new ``--muonScaleVariation=onnxReweight`` default with ``Column "nominal_muonScaleSyst_responseWeights_tensor" is not in a dataset and is not a custom column been defined`` (``add_jpsi_crctn_stats_unc_hists``) and ``10 column names are required but 7 were provided`` for ``JpsiCorrectionsUncReweightHelper`` (``muonScaleClosSyst_responseWeights_tensor_splines``).

Two pieces:

* ``muon_calibration.add_jpsi_crctn_stats_unc_hists`` gains an ``onnxReweight`` branch parallel to ``smearingWeightsSplines``: build the ONNX column list via ``jpsi_style_cols``, Define ``muonScaleSyst_responseWeights_tensor_onnx``, and alias ``nominal_muonScaleSyst_responseWeights_tensor`` to it.
* ``mw_with_mu_eta_pt`` and ``mz_wlike_with_mu_eta_pt`` route every inline Define that consumes ``closure_unc_helper{,_A,_M}``, ``z_non_closure_parametrized_helper``, and (in wlike) ``data_jpsi_crctn_unc_helper`` through ``jpsi_style_cols`` so each helper sees the matching column list -- 10-col with ``muon_source`` for ONNX, 7-col with ``response_weight`` for analytic Splines.

Verified end-to-end:

    python scripts/histmakers/mw_with_mu_eta_pt.py --filterProcs Wminusmunu_2016PostVFP --maxFiles 1
    python scripts/histmakers/mz_wlike_with_mu_eta_pt.py --filterProcs Zmumu_2016PostVFP --maxFiles 1

both produce output. isort / flake8 / black all pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

setupRabbit smoothing of the ``muonResolutionSyst_responseWeights`` systematic crashed with ``numpy.linalg.LinAlgError: Singular matrix`` in ``solve_leastsquare`` (and ``solve_nonnegative_leastsquare``). The fake-rate ABCD smoothing fits a small-window polynomial per bin. When too many bins in a window are 0 / negative (the ~10% rate we see across most systematics), the X^T X matrix for that window is rank-deficient. ``np.linalg.inv`` raises; ``np.linalg.pinv`` (Moore-Penrose pseudo-inverse) returns the minimum-norm solution and is numerically equivalent for full-rank matrices. The pinv also serves as the parameter covariance returned by both functions: in the rank-deficient direction it returns 0, which is the right downstream behaviour -- no spurious large uncertainty in a direction the data couldn't constrain. Triggered downstream of WMass#692 (ONNX-backed ``SmearingUncertaintyReweightHelper`` produces ~74 eigenvariations with wider per-event tails than the analytic helper; the smoothing hits a degenerate window for one of them and crashes). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
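The difference between the two inverses can be seen on a toy window where one regressor is identically zero. The helper name `solve_leastsquare_pinv` and the toy matrices are made up for this sketch; it is not the WRemnants ``solve_leastsquare`` implementation, only an instance of the pinv-based normal-equation solve the commit describes.

```python
import numpy as np


def solve_leastsquare_pinv(X, y):
    """Least squares via the pseudo-inverse of X^T X (illustrative sketch)."""
    XtX = X.T @ X
    cov = np.linalg.pinv(XtX)  # agrees with inv(XtX) when XtX has full rank
    beta = cov @ (X.T @ y)
    return beta, cov


y = np.array([1.0, 2.0, 3.0])

# Degenerate window: the second column is all zeros, so X^T X is singular.
X_bad = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
try:
    np.linalg.inv(X_bad.T @ X_bad)
    inv_failed = False
except np.linalg.LinAlgError:
    inv_failed = True
beta_bad, cov_bad = solve_leastsquare_pinv(X_bad, y)

# Full-rank window: pinv and inv give the same covariance.
X_ok = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
beta_ok, cov_ok = solve_leastsquare_pinv(X_ok, y)
```

In the degenerate case the unconstrained coefficient and its covariance entry come back as zero, which is the minimum-norm behaviour the commit message relies on.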

bendavid · 2026-05-23T07:27:12Z

Modest differences in the CI for the impacts, which could be expected. Will run the full stats CI for a more robust comparison.

bendavid · 2026-05-23T14:35:10Z

There are some general problems that prevent the long CI from finishing (also for the reference). I'll have a look at this.

In the meantime, the comparison is available for mZ (dilepton mass) and W-like. The muon calibration uncertainty for W-like decreases from 5.39 to 4.61 MeV, which is probably larger than justified (this would imply that the splines had a pretty big spurious contribution to the uncertainty). More importantly, for the dilepton mass the calibration impact decreases from 4.62 to 1.38 MeV, which clearly makes no sense, so I will follow up on what is wrong there.

Empirically the deployed shift+smear reweight network's effective first-order gradient is calibrated for u in qop space, not r_κ space. Per-eigenvariation half-diff comparison on Wmunu/Zmumu with --validationHists showed the ONNX variations ~100× LARGER than the analytic-Splines path when u was supplied as δr_κ = δqop · p_gen · sign(q_gen). Dropping the ``p_gen · sign(q_gen)`` factor (i.e. passing δqop in GeV⁻¹) brings ONNX and Splines magnitudes into ~1× agreement across all three processes.

ReweightEvaluator::evaluate docstring updated to reflect that u_raw / sigma_raw are in physical units (GeV⁻¹, radians, radians) matching y_raw component-by-component, NOT r_κ.

Callers updated to drop the p_gen factor:

* JpsiCorrectionsUncReweightHelper: ``dr = recoQopUnc`` (was ``recoQopUnc · pgen · sign_qgen``).
* SmearingUncertaintyReweightHelper: σ_raw = sqrt(dsigmarelsq · qop_reco²) (was sqrt(... · pgen²)).
* z_non_closure_param_delta_u_raw / z_non_closure_binned_delta_u_raw (renamed from ``*_delta_r_kappa``): drop p_gen · sign_qgen, no longer take gen kinematics.
* SmearingHelperSimpleReweight: σ_raw = sigmarel / p_reco (was sigmarel · pgen / preco).
* ScaleHelperSimpleReweight: u_raw = scalerel · q_reco / p_reco (was scalerel · q_reco · sign_qgen · pgen / preco).

Verified per-eigenvariation half-diff RMS over 144 J/ψ-stats variations, mw_with_mu_eta_pt --validationHists --maxFiles 5:

    Process    | RMS splines | RMS ONNX | ratio (was)
    Wplusmunu  | 2.21        | 2.00     | 0.91 (120×)
    Wminusmunu | 2.52        | 1.50     | 0.59 ( 54×)
    Zmumu      | 0.88        | 0.94     | 1.06 (180×)

mz_dilepton, w_z_muonresponse --testHelpers still produce output.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…w r_κ" This reverts commit f3ba94b.

… reweight ReweightModel::run and ReweightEvaluator::evaluate declared their y / c / u / σ / log_r buffers with Eigen's default ColMajor layout, but onnx_helper_alloc forwards .data() to ORT, which interprets the buffer as row-major. For shape (1, NVar, F) with NVar ≥ 2 the two layouts disagree on how the (NVar, F) plane is packed in memory, so the per-variation u and σ inputs were silently scrambled — variation 0 saw a non-physical mixed-axis shift and variation 1 came back as exactly exp(0) = 1.0. NVar = 1 escaped because length-1 axes are layout-invariant, which is why ScaleHelperSimpleReweight and SmearingHelperSimpleReweight (used by w_z_muonresponse --testHelpers) kept matching the splines reference; the broken path was every NVar ≥ 2 helper — JpsiCorrectionsUncReweightHelper (NVar = 2·nUnc), all four Z-non-closure variants (NVar = 2), SmearingUncertaintyReweightHelper, and the closureA / closureM helpers (NVar = 2). Declaring the buffers Eigen::RowMajor fixes the byte order with no call-site changes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
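The layout mismatch can be reproduced in NumPy by storing a (1, NVar, F) array in column-major order and then reading the same flat buffer as row-major, which is what a row-major consumer such as ORT would do. The feature count F = 3 below is an arbitrary illustrative value, not the model's actual width, and the helper name is made up for this sketch.

```python
import numpy as np


def reinterpret_colmajor_as_rowmajor(logical):
    """Store `logical` column-major, then read the same buffer as row-major."""
    buf = logical.ravel(order="F")      # memory order of a ColMajor tensor
    return buf.reshape(logical.shape)   # what a row-major consumer sees


F = 3  # illustrative feature count only
nvar2 = np.arange(2 * F, dtype=np.float32).reshape(1, 2, F)
nvar1 = np.arange(1 * F, dtype=np.float32).reshape(1, 1, F)

seen2 = reinterpret_colmajor_as_rowmajor(nvar2)  # scrambled across variations
seen1 = reinterpret_colmajor_as_rowmajor(nvar1)  # unchanged: length-1 axes
```

With NVar = 2 the two variations' features are interleaved in the reinterpreted array, while the NVar = 1 case is untouched, which matches why only the NVar ≥ 2 helpers were affected.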

… call sites The signature no longer takes ``tflite_file`` as the second positional arg or ``dummy_mu_scale_var`` / ``dummy_var_mag`` as kwargs; three call sites in ``add_jpsi_crctn_stats_unc_hists`` still passed them, which would have errored at runtime (and was silently feeding ``tflite_file`` in as ``scale_A``). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ference For shifts the small-magnitude rows of the closure plot are dominated by h_shifted's per-bin Poisson noise — too few events cross any bin edge for the literal reference to be informative. Add an ``h_lin`` curve computed directly from h_raw via centered finite differences, h_lin(y_j; δy) = h_raw[j] − δy · (∂h/∂y)[j], which matches h(y − δy) at linear order with no extra sampling noise. For smears the analog is the σ² Taylor term, h_lin(y_j; σ) = h_raw[j] + ½ σ² · (∂²h/∂y²)[j]. Below ``--lin-noise-thresh`` average per-bin event crossings (default 100, S/N ≈ √100 ≈ 10) the literal reference is considered noise-dominated and the lin reference takes over as the ratio-panel denominator and y-range setter; otherwise the literal stays primary. Both curves are always drawn — lin in red, the literal in black, the pred ratio (blue for shift / green for smear) colored by numerator. Pad floor on the ratio panel y-range is also removed so the auto-zoom follows the actual pred/ref ratio spread. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
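The two linearised references can be sketched with `numpy.gradient`, which uses centred differences in the interior of the array. Treating the histogram as samples at bin centres `y`, using `edge_order=2` at the boundaries, and obtaining the second derivative by applying the gradient twice are choices made for this sketch, not necessarily what the plotter does.

```python
import numpy as np


def h_lin_shift(h_raw, y, dy):
    """First-order shifted histogram: h_raw - dy * dh/dy."""
    dh = np.gradient(h_raw, y, edge_order=2)
    return h_raw - dy * dh


def h_lin_smear(h_raw, y, sigma):
    """Leading-order smeared histogram: h_raw + 0.5 * sigma^2 * d2h/dy2."""
    dh = np.gradient(h_raw, y, edge_order=2)
    d2h = np.gradient(dh, y, edge_order=2)
    return h_raw + 0.5 * sigma**2 * d2h


y = np.linspace(-1.0, 1.0, 21)             # bin centres (illustrative)
shifted = h_lin_shift(2.0 * y + 1.0, y, dy=0.1)
smeared = h_lin_smear(y**2, y, sigma=0.2)
```

For a linear shape the shift reference reproduces h(y − δy) exactly, and for a quadratic shape the smear reference adds exactly σ², the same offset a Gaussian convolution of y² would give, which makes both functions easy to sanity-check.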

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…bins" This reverts commit 752fd2e.

``make_muon_smearing_helpers`` now takes ``scale_var_method`` and only builds the ONNX-backed ``SmearingUncertaintyReweightHelper`` when the caller selected ``onnxReweight``; for ``smearingWeightsSplines`` / ``smearingWeightsGaus`` / ``massWeights`` it falls back to the analytic ``SmearingUncertaintyHelperParametrized`` that consumes the splines ``response_weight`` column. Previously the smearing helper always picked the ONNX path regardless of the flag, so a side-by-side ONNX-vs-splines comparison on the resolution syst impact was impossible from CLI alone. The five histmaker call sites (mz_dilepton, mw_with_mu_eta_pt, mw_with_mu_eta_pt_VETOEFFI, mz_wlike_with_mu_eta_pt, w_z_muonresponse) now forward ``scale_var_method=args.muonScaleVariation``. The two ``flow_training_snapshot.py`` callers keep the default (always-ONNX) since they run from the training pipeline, not the analysis-side histmakers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The network parameterises σ as a positive magnitude (and is even in σ since Gaussian smearing is symmetric under σ→−σ), so the previous helper clamped any eigenvariation with ``dsigmarelsq<0`` (a reduce- resolution direction) to σ=0 and returned exp(0)=1 — a no-op. The analytic ``SmearingUncertaintyHelperParametrized`` instead applies the signed linearisation 1+∂_σ²·δσ², so the two diverged exactly on the "shrink the resolution" half of every Hessian eigendirection. Recover the signed response without retraining by feeding |δσ²| to ONNX and applying the leading-order odd symmetry of log_r in δσ² afterwards: log_r(−|δσ²|) ≈ −log_r(+|δσ²|), i.e. alt_weight → 1/alt_weight when dsigmarelsq<0. Exact to first order in δσ²; higher-order curvature terms get the wrong sign in this approximation, but those are O((δσ²)²) and negligible for the 1σ eigenvariations the helper is ever called with. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
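The sign handling described here amounts to evaluating the model at |δσ²| and inverting the weight for negative δσ². In the sketch below the callable `log_r_of_abs` stands in for the ONNX evaluation and `toy_log_r` is an arbitrary linear toy; both are hypothetical and only serve to show the symmetry.

```python
import numpy as np


def signed_smear_weight(log_r_of_abs, dsigmarelsq):
    """alt_weight for a signed delta sigma^2, given a model that is even in sigma.

    `log_r_of_abs` is a stand-in for the network call (hypothetical).
    """
    log_r = log_r_of_abs(abs(dsigmarelsq))
    weight = np.exp(log_r)
    # Leading-order odd symmetry: log_r(-x) ~ -log_r(+x), i.e. invert the weight.
    return 1.0 / weight if dsigmarelsq < 0 else weight


def toy_log_r(x):
    """Toy stand-in: log_r linear in delta sigma^2 with an arbitrary slope."""
    return 0.3 * x


w_up = signed_smear_weight(toy_log_r, +0.02)
w_dn = signed_smear_weight(toy_log_r, -0.02)
```

By construction the up and down weights multiply to one, and for small variations the down weight agrees with the signed linearisation 1 + slope · δσ² up to second-order terms.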

Pair with bendavid/narf#44. The narf submodule is bumped to the ``onnx-helper-iobinding`` tip so ``narf::onnx_helper`` exposes the new persistent-buffer + IoBinding + TensorMap-based fill interface and ``narf::onnx_helper_alloc`` is gone. ReweightModel becomes a class template on ``NVar`` and constructs ``narf::onnx_helper`` once with the explicit input/output shapes [{1,F}, {1,NCond}, {1,NVar,F}, {1,NVar,F}] / [{1,NVar}], pinning the model's dynamic axes at instantiation. ``run`` is no longer a function template — it takes the fixed-shape RowMajor Eigen tensors directly, inputs as ``const &``. ``ReweightEvaluator<NVar>`` holds a ``std::shared_ptr<ReweightModel<NVar>>`` and calls ``model_->run(...)`` without a template argument. Per-call we no longer pay for Ort::Value construction or name array setup — the IoBinding is one-shot at construction; ``operator()`` copies in/out through ``Eigen::TensorMap<…, RowMajor>`` views over the persistent ORT-owned buffers. The RowMajor declarations on the caller-side Eigen tensors still match the model shape for the copy to be a simple memcpy, but they no longer need to match the network's storage order for correctness — Eigen's cross-layout assignment handles that. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…pton externalPostfit step The high-stats workflow (``setup 1:1 data:mc events`` branch of ``setenv``) was timing out at NTHREADS=64; bump to 128 to keep under the per-step time budget. The ``dilepton ptll from wlike`` step invokes ``rabbit_fit.py`` with ``--externalPostfit`` (loading the wlike fit's ``uncorr`` postfit values) and no local re-minimisation. ``rabbit_fit.py`` then unconditionally tries to compute an EDM + covariance from the local Hessian at that externally-supplied point, which is generically indefinite off-minimum — Cholesky failed at the 4th leading minor and the step exited non-zero, even though the fit-results file had already been written. Add ``--noHessian`` (already wired in ``rabbit_fit.py`` at line 298) to skip that local Hessian / EDM computation; the downstream plotting step reads the saved ``fitresults_from_ZMassWLike_eta_pt_charge.hdf5`` and the externally-loaded postfit covariance, neither of which depended on the indefinite local Hessian. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…MassDilepton externalPostfit step" This reverts commit 24ddf17. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bendavid · 2026-05-25T11:07:44Z

There were some technical problems (e.g. mismatched memory order between the Eigen tensors and ONNX) which led to nonsense when evaluating the model. These are fixed now.

The standard CI now matches extremely well against the reference.

The high stats CI reference is currently broken for the mW fit, but the wlike and dilepton mass agree extremely well.

For the high stats dilepton mass comparison, which is the most sensitive, the numbers now look like this:

reference (https://github.com/WMass/WRemnants/actions/runs/26323635314/job/77497221066):

   resolutionCrctn: 0.00684
   binByBinStat: 0.00741
   theory: 0.00766
   stat: 0.01034
   scaleClosCrctn: 0.0127
   scaleClosACrctn: 0.02085
   scaleCrctn: 0.03891
   muonCalibration: 0.04622
   expNoLumi: 0.04626
   experiment: 0.04626
   massShift: 0.04886
   Total: 0.04886

this PR (https://github.com/WMass/WRemnants/actions/runs/26368527418/job/77616529258):


   resolutionCrctn: 0.00609
   theory: 0.00722
   binByBinStat: 0.00741
   stat: 0.01034
   scaleClosCrctn: 0.01263
   scaleClosACrctn: 0.02071
   scaleCrctn: 0.0386
   muonCalibration: 0.04594
   expNoLumi: 0.04598
   experiment: 0.04598
   massShift: 0.04859
   Total: 0.04859

So there are small differences as can be expected, but group by group the impacts match very well.
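To make the group-by-group comparison explicit, the relative change of each impact can be computed directly from the two lists quoted above. This is only arithmetic on those numbers; the variable names are made up for the sketch.

```python
# Impacts as quoted above, in the units printed by the CI.
reference = {
    "resolutionCrctn": 0.00684, "binByBinStat": 0.00741, "theory": 0.00766,
    "stat": 0.01034, "scaleClosCrctn": 0.0127, "scaleClosACrctn": 0.02085,
    "scaleCrctn": 0.03891, "muonCalibration": 0.04622, "expNoLumi": 0.04626,
    "experiment": 0.04626, "massShift": 0.04886, "Total": 0.04886,
}
this_pr = {
    "resolutionCrctn": 0.00609, "binByBinStat": 0.00741, "theory": 0.00722,
    "stat": 0.01034, "scaleClosCrctn": 0.01263, "scaleClosACrctn": 0.02071,
    "scaleCrctn": 0.0386, "muonCalibration": 0.04594, "expNoLumi": 0.04598,
    "experiment": 0.04598, "massShift": 0.04859, "Total": 0.04859,
}

# Relative change of each group with respect to the reference.
rel_change = {name: this_pr[name] / reference[name] - 1.0 for name in reference}

# Print the groups ordered by the size of the change, largest first.
for name, change in sorted(rel_change.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:>16s}: {100.0 * change:+6.2f} %")
```

On these numbers the largest relative change is in resolutionCrctn (about −11%), while the overall muonCalibration impact moves by about −0.6%.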

I've reverted the changes to the high stats CI, since this can and should be discussed separately in #693.

In principle once the standard CI runs again (runners are currently down) this should be good to merge (pending review of course)

bendavid · 2026-05-25T11:18:38Z

For some more details, here are variation-by-variation comparisons with respect to the splines for scale and resolution

bendavid and others added 7 commits May 22, 2026 15:35

bendavid force-pushed the muon_reweight branch from 12c560d to 3a4ade0 Compare May 22, 2026 15:36

bendavid and others added 4 commits May 22, 2026 16:55

bendavid and others added 13 commits May 23, 2026 16:20

Revert "muon_calibration: pass u/σ to ONNX in raw qop (GeV⁻¹), not raw r_κ" (e4253cc). This reverts commit f3ba94b.

CLAUDE.md: document podman/docker container detection via $container (e8add0e). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

shift_smear_reweight_diagnostics: black (f39e31d). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Revert "regression: pinv instead of inv for rank-deficient smoothing bins" (7893d23). This reverts commit 752fd2e.

Revert "ci: bump high-stats NTHREADS to 128 and pass --noHessian to ZMassDilepton externalPostfit step" (40e034c). This reverts commit 24ddf17. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bendavid wants to merge 24 commits into WMass:main from bendavid:muon_reweight
