Round-2 audit fixes: parallel bootstrap RNG reproducibility + chunk guard, calendar max_e warning#268
Conversation
…ax_e warning From the second deep audit (no estimand/bias/calibration defects found; these are robustness/plumbing on non-default paths): - Parallel multiplier bootstrap (bstrap=TRUE, pl=TRUE, cores>1, n>2500) is now reproducible under a fixed set.seed(): use L'Ecuyer-CMRG parallel RNG streams in the mclapply branch (forked Mersenne-Twister workers re-seed non-deterministically), and restore the caller's RNGkind on exit. Point estimates were always unaffected; only bootstrap SE / uniform-band crit.val drifted run-to-run. - Fixed a crash in the parallel bootstrap when biters < cores: the per-core work split could go negative (e.g. biters=2, cores=4 -> [-1,1,1,1]). It is now a non-negative split that drops empty chunks. - aggte(type="calendar") now warns when min_e/max_e/balance_e are supplied (event-study-only options with no effect on calendar aggregation; the correct unrestricted result is returned, as before). Regression tests added to test-audit-fixes.R (parallel-seed reproducibility, non-negative chunking incl. biters<cores, calendar warning). Bug #1 from the audit (scale-dependent absolute SE threshold) intentionally not addressed. Full suite passes (0 failures).
There was a problem hiding this comment.
Pull request overview
This PR implements three robustness fixes in non-default code paths: making the parallel multiplier bootstrap reproducible under a fixed seed, preventing a parallel chunking edge-case crash when biters < cores, and warning when dynamic-only options are provided to calendar-time aggregation.
Changes:
- Switch parallel multiplier bootstrap RNG handling to use L’Ecuyer-CMRG streams (and restore the caller’s
RNGkind). - Replace per-core bootstrap chunk splitting logic to guarantee non-negative chunks that sum to
biters. - Add a warning in
aggte(type = "calendar")whenmin_e/max_e/balance_eare supplied, since they’re ignored for calendar aggregation; document changes inNEWS.mdand add regression tests.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
R/mboot.R |
Makes parallel multiplier bootstrap reproducible under fixed seeds and fixes chunk splitting when biters < cores. |
tests/testthat/test-audit-fixes.R |
Adds regression tests for parallel RNG reproducibility, chunk guard, and calendar aggregation warning behavior. |
R/compute.aggte.R |
Warns when dynamic-only windowing options are passed to type = "calendar". |
NEWS.md |
Documents the audit-driven robustness fixes under the existing 2.5.1 section. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -194,6 +197,13 @@ run_multiplier_bootstrap <- function(inf.func, biters, pl = FALSE, cores = 1) { | |||
| pl <- FALSE | |||
| } | |||
| if(n > 2500 & pl == TRUE & cores > 1) { | |||
There was a problem hiding this comment.
Done in 329493c — condition is now if (n > 2500 && pl && cores > 1) (scalar &&, dropped == TRUE), matching the Windows guard above.
| old_kind = RNGkind("L'Ecuyer-CMRG") | ||
| on.exit(RNGkind(old_kind[1L], old_kind[2L], old_kind[3L]), add = TRUE) | ||
| results = parallel::mclapply( | ||
| chunks, | ||
| FUN = parallel.function, |
There was a problem hiding this comment.
Done in 329493c — <- for the assignments and mc.cores = min(cores, length(chunks)) so biters < cores doesn't spawn idle workers.
| set.seed(7); a <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2) | ||
| set.seed(7); b <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2) | ||
| expect_equal(a, b) # same seed -> identical draws |
There was a problem hiding this comment.
Done in 329493c — switched to expect_identical(a, b); verified the two same-seed runs are bit-identical (max abs diff 0).
…ect_identical - mboot.R: use <- for the new assignments and && / drop `== TRUE` in the parallel-branch condition (scalar, matches the Windows guard above). - mboot.R: cap mc.cores at the number of non-empty chunks so biters < cores does not spawn idle workers. - test-audit-fixes.R: assert exact reproducibility with expect_identical() (verified bit-identical) instead of the tolerant expect_equal().
| set.seed(7); a <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2) | ||
| set.seed(7); b <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2) | ||
| expect_identical(a, b) # same seed -> bit-identical draws | ||
| expect_identical(RNGkind()[1], "Mersenne-Twister") # caller's RNGkind restored |
There was a problem hiding this comment.
Fixed — the test now captures RNGkind() before the call and asserts it's restored to that value, instead of a hardcoded 'Mersenne-Twister'.
…to-prior, <- consistency - compute.aggte.R: the calendar warning no longer claims the options "apply only to type='dynamic'" (max_e is also honored by simple/group); it just states they are ignored for type="calendar". - mboot.R: serial branch now uses <- for consistency with the rest of the function. - test-audit-fixes.R: assert RNGkind() is restored to its pre-call value (captured at runtime) rather than a hardcoded "Mersenne-Twister".
| * The parallel multiplier bootstrap (`bstrap = TRUE, pl = TRUE, cores > 1`, on more than 2500 cross-sectional units) is now reproducible under a fixed `set.seed()`. It previously used the default Mersenne-Twister RNG inside `parallel::mclapply()`, whose forked workers re-seed non-deterministically, so bootstrap standard errors and uniform-band critical values drifted between identical-seed runs. It now uses L'Ecuyer-CMRG parallel RNG streams (and restores the caller's RNG kind on exit). Point estimates were always unaffected. | ||
|
|
||
| * Fixed a crash in the parallel bootstrap when `biters < cores` (with `pl = TRUE`, `cores > 1`, and more than 2500 units): the per-core work split could produce a negative chunk size. The split is now always non-negative. |
There was a problem hiding this comment.
Good catch — fixed. Both NEWS bullets now say the >2500 threshold is on influence-function rows (units, or clusters when clustered standard errors are used, since mboot() bootstraps the cluster-summed influence function).
…ws (units or clusters) Address Copilot review: the >2500 threshold is nrow(inf.func), which is clusters (not units) when clustered standard errors are used, since mboot() bootstraps the cluster-summed influence function.
Three robustness fixes from the second deep audit. That audit found no estimand/bias/SE-calibration/coverage defects in the core estimator (verified across 60+ Monte Carlo configs and independent re-implementations to machine precision); these are plumbing on non-default paths.
1. Parallel bootstrap reproducible under a fixed seed (medium)
With
bstrap=TRUE, pl=TRUE, cores>1onn>2500units,parallel::mclapplyran under the default Mersenne-Twister, whose forked workers re-seed non-deterministically — so a fixedset.seed()did not pin the bootstrap, and SEs / uniform-band critical values drifted run-to-run (~0.5% / ~5%). Point estimates and influence functions were always bit-identical. Now uses L'Ecuyer-CMRG parallel RNG streams and restores the caller'sRNGkindon exit.max|se diff| = 0(was ~0.004); identical uniform crit.val;RNGkindrestored to the caller's.2. Parallel bootstrap crash when
biters < cores(low)The per-core work split
rep(ceiling(biters/cores), cores)+ correction could go negative (e.g.biters=2, cores=4 → [-1,1,1,1]), throwing a cryptic Armadillo error. Replaced with a split that is always non-negative and sums tobiters.3.
aggte(type="calendar")warns onmin_e/max_e/balance_e(low)These are event-study (dynamic) options with no effect on calendar-time aggregation; calendar silently ignored them. It now warns (and still returns the correct unrestricted calendar effects, as before — matching prior behavior).
Tests & scope
tests/testthat/test-audit-fixes.R: parallel-seed reproducibility +RNGkindrestoration, non-negative chunking (incl.biters<cores), and the calendar warning. Full suite: 0 failures, 0 warnings, 1612 PASS.se <= sqrt(.Machine$double.eps)*10is scale-dependent and can NA valid SEs for very small-magnitude outcomes). Left as-is.Builds on the merged #266/#267; version stays 2.5.1 (NEWS under the
2.5.1section).