Bin-by-bin (BBB) statistical treatment: numerical fixes + modularization by davidwalter2 · Pull Request #132 · WMass/rabbit

davidwalter2 · 2026-04-29T01:16:36Z

This PR groups three lines of work on the bin-by-bin statistical treatment, all preserving the existing CLI and giving identical results to the pre-change baseline on tests/make_tensor.py (6/6 binByBinStatType × binByBinStatMode combinations) to all reported digits.

Numerical fixes

--binByBinStatType gamma --binByBinStatMode full postfit covariance was wrong. The Newton iteration over nbeta uses tf.Variable.assign_sub, which is not a differentiable op. After the loop, self.nbeta was treated as a leaf constant by the outer tape, so the implicit dependence dβ_profile/dx was dropped from the postfit Hessian — covariance and POI σ came out too small. Fix: take one more Newton step at the converged value with autodiff enabled, restoring the implicit-function-theorem gradient. On tests/make_tensor.py the sig POI σ goes from 0.00835 (buggy) to 0.01437, matching all five other type×mode combinations.
New --minBBKstat option. Extends betamask with kstat = sumw² / sumw2 < threshold so degenerate per-process bins (e.g. mixed-sign-weight cancellations) get their β nuisance fixed at β0 instead of being profiled. Useful in full mode when individual processes have very low effective stats per bin and otherwise produce a non-PD postfit Hessian. Default 0.0 preserves existing behaviour.
lite mode no longer scales zero-variance processes by β. For processes with sumw2 == 0 (e.g. data-driven backgrounds whose yield is determined by a paramModel), β is passed through unchanged. The merged kstat in lite mode is computed from finite-variance contributions only (data-driven processes were previously inflating sumw_total without contributing to sumw2_total, over-tightening β). Each lite-mode profile formula was re-derived to handle nexp(β) = β·n_active + n_zero_var; when no zero-variance process is present each formula reduces exactly to the original closed form, so the regression is exact. On a W-mass fit with one data-driven process (Fake), POI σ in lite matches full to 5 sig figs (0.09753 vs 0.09752), where previously lite was inflated by ~1%.

Modularization

The ~700–900 BBB-related lines previously interleaved through fitter.py move into a new rabbit/bbstat/ package:

  rabbit/bbstat/  
    __init__.py                               
    bbstat.py     # BinByBinStat: Variables, masks, kstat, profile + apply,
                  # constraint NLL, toy randomization, Newton helper                        
    formulas.py   # solve_quad_eq

Fitter constructs a single self.bbstat = BinByBinStat(...) and delegates all β-related operations to it. _compute_yields_with_beta shrinks from ~614 lines to a 10-line wrapper. The legacy attribute / method shims on Fitter (binByBinStat, beta, kstat, betamask, set_beta0, …) have been removed; the two external callers (workspace.py, bin/rabbit_fit.py) now read fitter.bbstat. directly.

Net diff in fitter.py: 3118 → 2337 lines. New module: 942 lines.

The selective-lite naming uses physics-neutral terms: proc_zero_var_mask / n_active / n_zero_var (rather than "data-driven" / "MC", which were specific to one use case).

Tests + CI

New tests/test_bbstat.py (15 tests, runs as a script and under pytest): solve_quad_eq, default β0 per type, lbeta = 0 at β = β0, normal-additive / normal-multiplicative constraint values, betamask construction, --minBBKstat extension, proc_zero_var_mask detection, full-mode skip, invalid type validation.
unit-tests matrix in .github/workflows/main.yml gains the new test.
All fit-running jobs (fitting, sparse-fits, alternative-fits, regularization, bsm, likelihoodscans, abcd-fits) now declare needs: [setenv, unit-tests, …] so a regression in the bbstat module short-circuits the pipeline before the long fits.

The Newton iteration over `nbeta` uses `tf.Variable.assign_sub`, which is not a differentiable op. After the loop, `self.nbeta` was treated as a leaf constant by the outer tape, so the implicit dependence `dbeta_profile/dx` was dropped from the postfit Hessian, giving an incorrect (too small) covariance and POI uncertainty. Apply the implicit-function-theorem trick: take one more Newton step at the converged value. Since `grad ≈ 0` at convergence the value is unchanged, but `u_diff = u* - F(u*, z)/H(u*, z)` carries the correct `du*/dz = -(∂F/∂z)/H` through the outer tape. On `tests/make_tensor.py`, sig POI σ for `--binByBinStatMode full --binByBinStatType gamma` now matches lite gamma and full normal-multiplicative (0.01437 vs 0.01436 / 0.01438; previously 0.00835). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds a CLI option that extends the BBB mask to (bin, process) entries with effective MC statistics kstat = sumw**2 / sumw2 below the given threshold. These entries get their bin-by-bin nuisance fixed at beta0 instead of being profiled. Useful for `--binByBinStatMode full` when individual processes have very low effective stats per bin (e.g. mixed-sign-weight cancellations) which otherwise produce ill-conditioned per-process profiles and a non-PD postfit Hessian. Default 0.0 preserves the existing behaviour. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Previously, in binByBinStatMode lite the per-bin β scalar scaled all processes uniformly, including data-driven processes (e.g. ABCD- estimated QCD with sumw2==0). This had two consequences: 1. The merged kstat = (Σ_p sumw_p)² / Σ_p sumw²_p was inflated by the data-driven yield contribution, over-constraining β toward 1. 2. β movements perturbed the data-driven prediction, even though that prediction has no MC stat uncertainty and is determined by the param model. This change detects (bin, process) entries with sumw²==0 (proc_data_ driven_mask), excludes them from the merged sumw used to compute kstat, and applies β only to MC processes when reconstructing nexp. The closed-form profile β formulas are generalized to nexp(β) = β·N_MC + N_data; each lite branch now solves a quadratic (or scaled linear, for normal-multiplicative chisqFit) instead of the original simpler form. With N_data == 0 each new formula reduces exactly to the old one, so behaviour is unchanged for fits without data-driven processes — verified on tests/make_tensor.py where all 6 type×mode combinations reproduce the pre-change POI σ to 4 sig figs. For normal-additive lite, the β formula is unchanged (β cancels in the additive math) but the application step distributes the additive correction sbeta·β only over MC processes, weighted by their share of N_MC. On a W-mass fit with one data-driven process (Fake), POI σ in lite and full modes now agree to 4 sig figs (0.09753 vs 0.09752), as expected when the BBB nuisance only couples to MC stat. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

First step of extracting the bin-by-bin statistical treatment from fitter.py into its own module. Creates the rabbit/bbstat/ package and moves the standalone solve_quad_eq helper into bbstat/formulas.py. No behaviour change. The existing 6 binByBinStatType × binByBinStatMode combinations on tests/make_tensor.py reproduce the previous σ values. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Step 2 of the bbstat refactor. Lifts the bin-by-bin statistical initialization (~150 lines: type/mode/threshold validation, β/β0/log β0 TF Variables, sumw/varbeta/kstat reduction, betamask, gamma+full nbeta Variable, and the covarianceFit + normal-additive LU pre-decomposition) out of Fitter.__init__ and into BinByBinStat.__init__. Fitter now constructs self.bbstat once and exposes the legacy attribute names (binByBinStat, binByBinStatType, beta, kstat, betamask, etc.) via @Property shims so external callers (parsing.py, workspace.py, scripts in bin/) continue to work without changes. Helpers _default_beta0, set_beta0, beta0defaultassign, betadefaultassign delegate one line each to bbstat. The 6 binByBinStatType × binByBinStatMode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Step 3 of the bbstat refactor. Lifts the constraint NLL term (~50 lines: gamma + normal-multiplicative + normal-additive variants, with optional full_nll constants) out of Fitter._compute_lbeta and into BinByBinStat.lbeta. Fitter._compute_lbeta is now a one-liner that delegates. The 6 type × mode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Step 4 of the bbstat refactor — the largest. Lifts the entire β profile and application logic out of Fitter._compute_yields_with_beta (~600 lines covering 8 lite + 8 full × 3 likelihood-form branches, including the gamma+full Newton iteration with the implicit-function-theorem trick) into BinByBinStat.profile_and_apply. Fitter._compute_yields_with_beta is now a 10-line delegator that calls self._compute_yields_noBBB to get the model prediction then hands off to self.bbstat.profile_and_apply. The Newton-loop helper (_newton_solve_nbeta) lives on BinByBinStat and is shared between the chisqFit and Poisson gamma+full paths. The 6 binByBinStatType × binByBinStatMode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Step 5 of the bbstat refactor. The β branches of bayesassign / frequentistassign now delegate to BinByBinStat.randomize_bayes / randomize_frequentist. Fitter keeps the θ orchestration but the gamma/normal randomizers no longer touch β-related state directly. The 6 binByBinStatType × binByBinStatMode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Step 6 of the bbstat refactor. Adds tests/test_bbstat.py covering the new BinByBinStat module without spinning up the full Fitter: - solve_quad_eq known roots - default_beta0 per stat type (gamma=1, normal-mult=1, normal-add=0) - lbeta=0 at β=β0 for all three types - normal-additive quadratic constraint value - normal-multiplicative kstat-weighted quadratic value - betamask construction (sumw==0 or sumw²==0) - --minBBKstat extension of betamask - proc_data_driven_mask detection in lite mode - full mode skips the proc_data_driven_mask path - invalid binByBinStatType raises 15/15 passing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds tests/test_bbstat.py to the unit-tests matrix in the GitHub CI and makes the fit-running jobs (fitting, sparse-fits, alternative-fits, regularization, bsm, likelihoodscans, abcd-fits) depend on the unit-tests job, so a broken bbstat module fails the pipeline before spending CI cycles on the actual fits. The new test file picks up an `if __name__ == "__main__"` entrypoint that calls pytest, matching the script-style invocation used by the existing unit tests in the same matrix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The selective lite-mode β treatment is general — it applies to any (bin, process) entry with sumw2 == 0, not just data-driven QCD-style backgrounds. Rename the internal names to reflect that: proc_data_driven_mask → proc_zero_var_mask n_mc → n_active n_data → n_zero_var mc_norm → active_norm sumw_mc → sumw_active pdd → zv_mask Comments now talk about "finite-variance" / "zero-variance" entries rather than "MC" / "data-driven" processes. The Fitter property shim is renamed in step (proc_zero_var_mask), and the unit-test names follow. The 6 binByBinStatType × binByBinStatMode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits; tests/test_bbstat.py 15/15 still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The forwarding @Property shims (binByBinStat, binByBinStatType, binByBinStatMode, beta0, logbeta0, beta, ubeta, nbeta, varbeta, sumw, betamask, kstat, betaauxlu, beta_shape, minBBKstat, proc_zero_var_mask, plus the four wrapper methods _default_beta0, set_beta0, beta0defaultassign, betadefaultassign) are removed; every internal call site now reads / calls the BinByBinStat instance directly. Two external callers (workspace.py, bin/rabbit_fit.py) are updated to fitter.bbstat.enabled / fitter.bbstat.binByBinStatMode. Net: -90 lines of forwarding boilerplate. The 6 binByBinStatType × binByBinStatMode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits and the 15 bbstat unit tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The profile-step branches each repeated kstat = self.kstat[: self.indata.nbins] betamask = self.betamask[: self.indata.nbins] varbeta = self.varbeta[: self.indata.nbins] 13 times across the 8 type×mode×likelihood combinations. Hoist them once at the top of the ``if profile:`` block; the application step at the bottom still needs ``[: nexp.shape[0]]`` slices because nexp may omit masked bins, so those stay separate. The 6 binByBinStatType × binByBinStatMode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Polish round on top of the bbstat refactor: * `bbstat/__init__.py` is empty, matching the other rabbit subpackages (`impacts/`, `param_models/`, `mappings/`, `regularization/`). Callers import from the explicit submodule (`from rabbit.bbstat.bbstat import BinByBinStat`, `from rabbit.bbstat.formulas import solve_quad_eq`). * `bbstat/formulas.py` docstring drops the stale reference to `rabbit.bbstat.newton` (the planned helper file was never created; the Newton iteration lives as `BinByBinStat._newton_solve_nbeta`). * New `BinByBinStat.is_linear` property mirrors `param_model.is_linear` and replaces the inline `(not self.bbstat.enabled) or self.bbstat.binByBinStatType == "normal-additive"` check inside `Fitter.is_linear`. * New `BinByBinStat.randomize_postfit` encapsulates the β-toy that `Fitter.toyassign(..., randomize_parameters=True)` used to do inline. The 6 binByBinStatType × binByBinStatMode regression on tests/make_tensor.py reproduces the previous σ values to all reported digits and the 15 bbstat unit tests still pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

bendavid · 2026-04-29T09:22:56Z

Does 3 only handle the zero variance case, or does it also "distribute" the beta impact across processes with different (but non-zero) relative statistical uncertainty?

davidwalter2 · 2026-04-29T12:50:42Z

Does 3 only handle the zero variance case, or does it also "distribute" the beta impact across processes with different (but non-zero) relative statistical uncertainty?

It distributes the beta impact across all processes with finite statistical uncertainty relative to their contribution, not considering their statistical uncertainty. This is the same way as it was done before just leaving out the zero variance contribution.

Intuitively I would expect that distributing the beta impact weighted with the relative statistical uncertainty of each process is more precise but I think it needs further study to quantify that. The BB full treats this exact.

davidwalter2 and others added 14 commits April 28, 2026 15:56

Add option to plot prefit impacts (only for mappings)

a6c2920

davidwalter2 merged commit 8d9693f into WMass:main May 11, 2026
61 of 67 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bin-by-bin (BBB) statistical treatment: numerical fixes + modularization#132

Bin-by-bin (BBB) statistical treatment: numerical fixes + modularization#132
davidwalter2 merged 15 commits into
WMass:mainfrom
davidwalter2:260428_bbstat

davidwalter2 commented Apr 29, 2026

Uh oh!

bendavid commented Apr 29, 2026

Uh oh!

davidwalter2 commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

davidwalter2 commented Apr 29, 2026

Uh oh!

bendavid commented Apr 29, 2026

Uh oh!

davidwalter2 commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants