Skip to content

Bin-by-bin (BBB) statistical treatment: numerical fixes + modularization#132

Merged
davidwalter2 merged 15 commits into
WMass:mainfrom
davidwalter2:260428_bbstat
May 11, 2026
Merged

Bin-by-bin (BBB) statistical treatment: numerical fixes + modularization#132
davidwalter2 merged 15 commits into
WMass:mainfrom
davidwalter2:260428_bbstat

Conversation

@davidwalter2
Copy link
Copy Markdown
Collaborator

This PR groups three lines of work on the bin-by-bin statistical treatment, all preserving the existing CLI and giving identical results to the pre-change baseline on tests/make_tensor.py (6/6 binByBinStatType × binByBinStatMode combinations) to all reported digits.

Numerical fixes

  1. --binByBinStatType gamma --binByBinStatMode full postfit covariance was wrong. The Newton iteration over nbeta uses tf.Variable.assign_sub, which is not a differentiable op. After the loop, self.nbeta was treated as a leaf constant by the outer tape, so the implicit dependence dβ_profile/dx was dropped from the postfit Hessian — covariance and POI σ came out too small. Fix: take one more Newton step at the converged value with autodiff enabled, restoring the implicit-function-theorem gradient. On tests/make_tensor.py the sig POI σ goes from 0.00835 (buggy) to 0.01437, matching all five other type×mode combinations.

  2. New --minBBKstat option. Extends betamask with kstat = sumw² / sumw2 < threshold so degenerate per-process bins (e.g. mixed-sign-weight cancellations) get their β nuisance fixed at β0 instead of being profiled. Useful in full mode when individual processes have very low effective stats per bin and otherwise produce a non-PD postfit Hessian. Default 0.0 preserves existing behaviour.

  3. lite mode no longer scales zero-variance processes by β. For processes with sumw2 == 0 (e.g. data-driven backgrounds whose yield is determined by a paramModel), β is passed through unchanged. The merged kstat in lite mode is computed from finite-variance contributions only (data-driven processes were previously inflating sumw_total without contributing to sumw2_total, over-tightening β). Each lite-mode profile formula was re-derived to handle nexp(β) = β·n_active + n_zero_var; when no zero-variance process is present each formula reduces exactly to the original closed form, so the regression is exact. On a W-mass fit with one data-driven process (Fake), POI σ in lite matches full to 5 sig figs (0.09753 vs 0.09752), where previously lite was inflated by ~1%.

Modularization

The ~700–900 BBB-related lines previously interleaved through fitter.py move into a new rabbit/bbstat/ package:

  rabbit/bbstat/  
    __init__.py                               
    bbstat.py     # BinByBinStat: Variables, masks, kstat, profile + apply,
                  # constraint NLL, toy randomization, Newton helper                        
    formulas.py   # solve_quad_eq             

Fitter constructs a single self.bbstat = BinByBinStat(...) and delegates all β-related operations to it. _compute_yields_with_beta shrinks from ~614 lines to a 10-line wrapper. The legacy attribute / method shims on Fitter (binByBinStat, beta, kstat, betamask, set_beta0, …) have been removed; the two external callers (workspace.py, bin/rabbit_fit.py) now read fitter.bbstat. directly.

Net diff in fitter.py: 3118 → 2337 lines. New module: 942 lines.

The selective-lite naming uses physics-neutral terms: proc_zero_var_mask / n_active / n_zero_var (rather than "data-driven" / "MC", which were specific to one use case).

Tests + CI

  • New tests/test_bbstat.py (15 tests, runs as a script and under pytest): solve_quad_eq, default β0 per type, lbeta = 0 at β = β0, normal-additive / normal-multiplicative constraint values, betamask construction, --minBBKstat extension, proc_zero_var_mask detection, full-mode skip, invalid type validation.
  • unit-tests matrix in .github/workflows/main.yml gains the new test.
  • All fit-running jobs (fitting, sparse-fits, alternative-fits, regularization, bsm, likelihoodscans, abcd-fits) now declare needs: [setenv, unit-tests, …] so a regression in the bbstat module short-circuits the pipeline before the long fits.

davidwalter2 and others added 14 commits April 28, 2026 15:56
The Newton iteration over `nbeta` uses `tf.Variable.assign_sub`, which is
not a differentiable op. After the loop, `self.nbeta` was treated as a
leaf constant by the outer tape, so the implicit dependence
`dbeta_profile/dx` was dropped from the postfit Hessian, giving an
incorrect (too small) covariance and POI uncertainty.

Apply the implicit-function-theorem trick: take one more Newton step at
the converged value. Since `grad ≈ 0` at convergence the value is
unchanged, but `u_diff = u* - F(u*, z)/H(u*, z)` carries the correct
`du*/dz = -(∂F/∂z)/H` through the outer tape.

On `tests/make_tensor.py`, sig POI σ for `--binByBinStatMode full
--binByBinStatType gamma` now matches lite gamma and full
normal-multiplicative (0.01437 vs 0.01436 / 0.01438; previously 0.00835).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a CLI option that extends the BBB mask to (bin, process) entries
with effective MC statistics kstat = sumw**2 / sumw2 below the given
threshold. These entries get their bin-by-bin nuisance fixed at beta0
instead of being profiled.

Useful for `--binByBinStatMode full` when individual processes have very
low effective stats per bin (e.g. mixed-sign-weight cancellations) which
otherwise produce ill-conditioned per-process profiles and a non-PD
postfit Hessian.

Default 0.0 preserves the existing behaviour.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previously, in binByBinStatMode lite the per-bin β scalar scaled all
processes uniformly, including data-driven processes (e.g. ABCD-
estimated QCD with sumw2==0). This had two consequences:

1. The merged kstat = (Σ_p sumw_p)² / Σ_p sumw²_p was inflated by the
   data-driven yield contribution, over-constraining β toward 1.
2. β movements perturbed the data-driven prediction, even though that
   prediction has no MC stat uncertainty and is determined by the
   param model.

This change detects (bin, process) entries with sumw²==0 (proc_data_
driven_mask), excludes them from the merged sumw used to compute
kstat, and applies β only to MC processes when reconstructing nexp.

The closed-form profile β formulas are generalized to nexp(β) =
β·N_MC + N_data; each lite branch now solves a quadratic (or scaled
linear, for normal-multiplicative chisqFit) instead of the original
simpler form. With N_data == 0 each new formula reduces exactly to
the old one, so behaviour is unchanged for fits without data-driven
processes — verified on tests/make_tensor.py where all 6 type×mode
combinations reproduce the pre-change POI σ to 4 sig figs.

For normal-additive lite, the β formula is unchanged (β cancels in the
additive math) but the application step distributes the additive
correction sbeta·β only over MC processes, weighted by their share of
N_MC.

On a W-mass fit with one data-driven process (Fake), POI σ in lite and
full modes now agree to 4 sig figs (0.09753 vs 0.09752), as expected
when the BBB nuisance only couples to MC stat.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First step of extracting the bin-by-bin statistical treatment from
fitter.py into its own module. Creates the rabbit/bbstat/ package and
moves the standalone solve_quad_eq helper into bbstat/formulas.py.

No behaviour change. The existing 6 binByBinStatType × binByBinStatMode
combinations on tests/make_tensor.py reproduce the previous σ values.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 2 of the bbstat refactor. Lifts the bin-by-bin statistical
initialization (~150 lines: type/mode/threshold validation, β/β0/log β0
TF Variables, sumw/varbeta/kstat reduction, betamask, gamma+full nbeta
Variable, and the covarianceFit + normal-additive LU pre-decomposition)
out of Fitter.__init__ and into BinByBinStat.__init__.

Fitter now constructs self.bbstat once and exposes the legacy attribute
names (binByBinStat, binByBinStatType, beta, kstat, betamask, etc.) via
@Property shims so external callers (parsing.py, workspace.py, scripts
in bin/) continue to work without changes.

Helpers _default_beta0, set_beta0, beta0defaultassign, betadefaultassign
delegate one line each to bbstat.

The 6 binByBinStatType × binByBinStatMode regression on
tests/make_tensor.py reproduces the previous σ values to all reported
digits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 3 of the bbstat refactor. Lifts the constraint NLL term (~50 lines:
gamma + normal-multiplicative + normal-additive variants, with optional
full_nll constants) out of Fitter._compute_lbeta and into
BinByBinStat.lbeta. Fitter._compute_lbeta is now a one-liner that
delegates.

The 6 type × mode regression on tests/make_tensor.py reproduces
the previous σ values to all reported digits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 4 of the bbstat refactor — the largest. Lifts the entire β profile
and application logic out of Fitter._compute_yields_with_beta (~600
lines covering 8 lite + 8 full × 3 likelihood-form branches, including
the gamma+full Newton iteration with the implicit-function-theorem
trick) into BinByBinStat.profile_and_apply.

Fitter._compute_yields_with_beta is now a 10-line delegator that calls
self._compute_yields_noBBB to get the model prediction then hands off
to self.bbstat.profile_and_apply. The Newton-loop helper
(_newton_solve_nbeta) lives on BinByBinStat and is shared between the
chisqFit and Poisson gamma+full paths.

The 6 binByBinStatType × binByBinStatMode regression on
tests/make_tensor.py reproduces the previous σ values to all reported
digits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 5 of the bbstat refactor. The β branches of bayesassign /
frequentistassign now delegate to BinByBinStat.randomize_bayes /
randomize_frequentist. Fitter keeps the θ orchestration but the
gamma/normal randomizers no longer touch β-related state directly.

The 6 binByBinStatType × binByBinStatMode regression on
tests/make_tensor.py reproduces the previous σ values to all reported
digits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 6 of the bbstat refactor. Adds tests/test_bbstat.py covering the
new BinByBinStat module without spinning up the full Fitter:

- solve_quad_eq known roots
- default_beta0 per stat type (gamma=1, normal-mult=1, normal-add=0)
- lbeta=0 at β=β0 for all three types
- normal-additive quadratic constraint value
- normal-multiplicative kstat-weighted quadratic value
- betamask construction (sumw==0 or sumw²==0)
- --minBBKstat extension of betamask
- proc_data_driven_mask detection in lite mode
- full mode skips the proc_data_driven_mask path
- invalid binByBinStatType raises

15/15 passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds tests/test_bbstat.py to the unit-tests matrix in the GitHub CI
and makes the fit-running jobs (fitting, sparse-fits, alternative-fits,
regularization, bsm, likelihoodscans, abcd-fits) depend on the
unit-tests job, so a broken bbstat module fails the pipeline before
spending CI cycles on the actual fits.

The new test file picks up an `if __name__ == "__main__"` entrypoint
that calls pytest, matching the script-style invocation used by the
existing unit tests in the same matrix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The selective lite-mode β treatment is general — it applies to any
(bin, process) entry with sumw2 == 0, not just data-driven QCD-style
backgrounds. Rename the internal names to reflect that:

  proc_data_driven_mask  →  proc_zero_var_mask
  n_mc                   →  n_active
  n_data                 →  n_zero_var
  mc_norm                →  active_norm
  sumw_mc                →  sumw_active
  pdd                    →  zv_mask

Comments now talk about "finite-variance" / "zero-variance" entries
rather than "MC" / "data-driven" processes.

The Fitter property shim is renamed in step (proc_zero_var_mask), and
the unit-test names follow. The 6 binByBinStatType × binByBinStatMode
regression on tests/make_tensor.py reproduces the previous σ values to
all reported digits; tests/test_bbstat.py 15/15 still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The forwarding @Property shims (binByBinStat, binByBinStatType,
binByBinStatMode, beta0, logbeta0, beta, ubeta, nbeta, varbeta, sumw,
betamask, kstat, betaauxlu, beta_shape, minBBKstat,
proc_zero_var_mask, plus the four wrapper methods _default_beta0,
set_beta0, beta0defaultassign, betadefaultassign) are removed; every
internal call site now reads / calls the BinByBinStat instance
directly. Two external callers (workspace.py, bin/rabbit_fit.py) are
updated to fitter.bbstat.enabled / fitter.bbstat.binByBinStatMode.

Net: -90 lines of forwarding boilerplate. The 6 binByBinStatType ×
binByBinStatMode regression on tests/make_tensor.py reproduces the
previous σ values to all reported digits and the 15 bbstat unit tests
still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The profile-step branches each repeated
  kstat = self.kstat[: self.indata.nbins]
  betamask = self.betamask[: self.indata.nbins]
  varbeta = self.varbeta[: self.indata.nbins]
13 times across the 8 type×mode×likelihood combinations. Hoist them
once at the top of the ``if profile:`` block; the application step at
the bottom still needs ``[: nexp.shape[0]]`` slices because nexp may
omit masked bins, so those stay separate.

The 6 binByBinStatType × binByBinStatMode regression on
tests/make_tensor.py reproduces the previous σ values to all reported
digits.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Polish round on top of the bbstat refactor:

* `bbstat/__init__.py` is empty, matching the other rabbit subpackages
  (`impacts/`, `param_models/`, `mappings/`, `regularization/`).
  Callers import from the explicit submodule
  (`from rabbit.bbstat.bbstat import BinByBinStat`,
   `from rabbit.bbstat.formulas import solve_quad_eq`).
* `bbstat/formulas.py` docstring drops the stale reference to
  `rabbit.bbstat.newton` (the planned helper file was never created;
  the Newton iteration lives as `BinByBinStat._newton_solve_nbeta`).
* New `BinByBinStat.is_linear` property mirrors `param_model.is_linear`
  and replaces the inline
  `(not self.bbstat.enabled) or self.bbstat.binByBinStatType == "normal-additive"`
  check inside `Fitter.is_linear`.
* New `BinByBinStat.randomize_postfit` encapsulates the β-toy that
  `Fitter.toyassign(..., randomize_parameters=True)` used to do inline.

The 6 binByBinStatType × binByBinStatMode regression on
tests/make_tensor.py reproduces the previous σ values to all reported
digits and the 15 bbstat unit tests still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@bendavid
Copy link
Copy Markdown
Collaborator

Does 3 only handle the zero variance case, or does it also "distribute" the beta impact across processes with different (but non-zero) relative statistical uncertainty?

@davidwalter2
Copy link
Copy Markdown
Collaborator Author

Does 3 only handle the zero variance case, or does it also "distribute" the beta impact across processes with different (but non-zero) relative statistical uncertainty?

It distributes the beta impact across all processes with finite statistical uncertainty relative to their contribution, not considering their statistical uncertainty. This is the same way as it was done before just leaving out the zero variance contribution.

Intuitively I would expect that distributing the beta impact weighted with the relative statistical uncertainty of each process is more precise but I think it needs further study to quantify that. The BB full treats this exact.

@davidwalter2 davidwalter2 merged commit 8d9693f into WMass:main May 11, 2026
61 of 67 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants