Skip to content

Round-2 audit fixes: parallel bootstrap RNG reproducibility + chunk guard, calendar max_e warning#268

Merged
pedrohcgs merged 4 commits into
masterfrom
fix-parallel-bootstrap-calendar-warn
Jun 19, 2026
Merged

Round-2 audit fixes: parallel bootstrap RNG reproducibility + chunk guard, calendar max_e warning#268
pedrohcgs merged 4 commits into
masterfrom
fix-parallel-bootstrap-calendar-warn

Conversation

@pedrohcgs

Copy link
Copy Markdown
Collaborator

Three robustness fixes from the second deep audit. That audit found no estimand/bias/SE-calibration/coverage defects in the core estimator (verified across 60+ Monte Carlo configs and independent re-implementations to machine precision); these are plumbing on non-default paths.

1. Parallel bootstrap reproducible under a fixed seed (medium)

With bstrap=TRUE, pl=TRUE, cores>1 on n>2500 units, parallel::mclapply ran under the default Mersenne-Twister, whose forked workers re-seed non-deterministically — so a fixed set.seed() did not pin the bootstrap, and SEs / uniform-band critical values drifted run-to-run (~0.5% / ~5%). Point estimates and influence functions were always bit-identical. Now uses L'Ecuyer-CMRG parallel RNG streams and restores the caller's RNGkind on exit.

  • Verified: two same-seed runs now give max|se diff| = 0 (was ~0.004); identical uniform crit.val; RNGkind restored to the caller's.

2. Parallel bootstrap crash when biters < cores (low)

The per-core work split rep(ceiling(biters/cores), cores) + correction could go negative (e.g. biters=2, cores=4 → [-1,1,1,1]), throwing a cryptic Armadillo error. Replaced with a split that is always non-negative and sums to biters.

3. aggte(type="calendar") warns on min_e/max_e/balance_e (low)

These are event-study (dynamic) options with no effect on calendar-time aggregation; calendar silently ignored them. It now warns (and still returns the correct unrestricted calendar effects, as before — matching prior behavior).

Tests & scope

  • Regression tests in tests/testthat/test-audit-fixes.R: parallel-seed reproducibility + RNGkind restoration, non-negative chunking (incl. biters<cores), and the calendar warning. Full suite: 0 failures, 0 warnings, 1612 PASS.
  • Not addressed (maintainer's decision): the audit's bug unable to find an inherited method for function ‘lhs<-’ for signature ‘"NULL"’ #1 (the absolute SE-degeneracy threshold se <= sqrt(.Machine$double.eps)*10 is scale-dependent and can NA valid SEs for very small-magnitude outcomes). Left as-is.

Builds on the merged #266/#267; version stays 2.5.1 (NEWS under the 2.5.1 section).

…ax_e warning

From the second deep audit (no estimand/bias/calibration defects found; these
are robustness/plumbing on non-default paths):

- Parallel multiplier bootstrap (bstrap=TRUE, pl=TRUE, cores>1, n>2500) is now
  reproducible under a fixed set.seed(): use L'Ecuyer-CMRG parallel RNG streams
  in the mclapply branch (forked Mersenne-Twister workers re-seed
  non-deterministically), and restore the caller's RNGkind on exit. Point
  estimates were always unaffected; only bootstrap SE / uniform-band crit.val
  drifted run-to-run.

- Fixed a crash in the parallel bootstrap when biters < cores: the per-core work
  split could go negative (e.g. biters=2, cores=4 -> [-1,1,1,1]). It is now a
  non-negative split that drops empty chunks.

- aggte(type="calendar") now warns when min_e/max_e/balance_e are supplied
  (event-study-only options with no effect on calendar aggregation; the correct
  unrestricted result is returned, as before).

Regression tests added to test-audit-fixes.R (parallel-seed reproducibility,
non-negative chunking incl. biters<cores, calendar warning). Bug #1 from the
audit (scale-dependent absolute SE threshold) intentionally not addressed.
Full suite passes (0 failures).
Copilot AI review requested due to automatic review settings June 19, 2026 05:21
@pedrohcgs pedrohcgs added the release CRAN release PR: skip the dev-version bump on merge label Jun 19, 2026

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements three robustness fixes in non-default code paths: making the parallel multiplier bootstrap reproducible under a fixed seed, preventing a parallel chunking edge-case crash when biters < cores, and warning when dynamic-only options are provided to calendar-time aggregation.

Changes:

  • Switch parallel multiplier bootstrap RNG handling to use L’Ecuyer-CMRG streams (and restore the caller’s RNGkind).
  • Replace per-core bootstrap chunk splitting logic to guarantee non-negative chunks that sum to biters.
  • Add a warning in aggte(type = "calendar") when min_e/max_e/balance_e are supplied, since they’re ignored for calendar aggregation; document changes in NEWS.md and add regression tests.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
R/mboot.R Makes parallel multiplier bootstrap reproducible under fixed seeds and fixes chunk splitting when biters < cores.
tests/testthat/test-audit-fixes.R Adds regression tests for parallel RNG reproducibility, chunk guard, and calendar aggregation warning behavior.
R/compute.aggte.R Warns when dynamic-only windowing options are passed to type = "calendar".
NEWS.md Documents the audit-driven robustness fixes under the existing 2.5.1 section.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread R/mboot.R Outdated
Comment thread R/mboot.R Outdated
@@ -194,6 +197,13 @@ run_multiplier_bootstrap <- function(inf.func, biters, pl = FALSE, cores = 1) {
pl <- FALSE
}
if(n > 2500 & pl == TRUE & cores > 1) {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 329493c — condition is now if (n > 2500 && pl && cores > 1) (scalar &&, dropped == TRUE), matching the Windows guard above.

Comment thread R/mboot.R Outdated
Comment on lines 205 to 209
old_kind = RNGkind("L'Ecuyer-CMRG")
on.exit(RNGkind(old_kind[1L], old_kind[2L], old_kind[3L]), add = TRUE)
results = parallel::mclapply(
chunks,
FUN = parallel.function,

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 329493c<- for the assignments and mc.cores = min(cores, length(chunks)) so biters < cores doesn't spawn idle workers.

Comment thread tests/testthat/test-audit-fixes.R Outdated
Comment on lines +124 to +126
set.seed(7); a <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2)
set.seed(7); b <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2)
expect_equal(a, b) # same seed -> identical draws

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 329493c — switched to expect_identical(a, b); verified the two same-seed runs are bit-identical (max abs diff 0).

…ect_identical

- mboot.R: use <- for the new assignments and && / drop `== TRUE` in the
  parallel-branch condition (scalar, matches the Windows guard above).
- mboot.R: cap mc.cores at the number of non-empty chunks so biters < cores
  does not spawn idle workers.
- test-audit-fixes.R: assert exact reproducibility with expect_identical()
  (verified bit-identical) instead of the tolerant expect_equal().

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread R/compute.aggte.R Outdated
Comment thread R/mboot.R Outdated
Comment thread tests/testthat/test-audit-fixes.R Outdated
Comment on lines +124 to +127
set.seed(7); a <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2)
set.seed(7); b <- did:::run_multiplier_bootstrap(inf, 300, pl = TRUE, cores = 2)
expect_identical(a, b) # same seed -> bit-identical draws
expect_identical(RNGkind()[1], "Mersenne-Twister") # caller's RNGkind restored

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — the test now captures RNGkind() before the call and asserts it's restored to that value, instead of a hardcoded 'Mersenne-Twister'.

…to-prior, <- consistency

- compute.aggte.R: the calendar warning no longer claims the options "apply only
  to type='dynamic'" (max_e is also honored by simple/group); it just states they
  are ignored for type="calendar".
- mboot.R: serial branch now uses <- for consistency with the rest of the function.
- test-audit-fixes.R: assert RNGkind() is restored to its pre-call value (captured
  at runtime) rather than a hardcoded "Mersenne-Twister".

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread NEWS.md Outdated
Comment on lines +11 to +13
* The parallel multiplier bootstrap (`bstrap = TRUE, pl = TRUE, cores > 1`, on more than 2500 cross-sectional units) is now reproducible under a fixed `set.seed()`. It previously used the default Mersenne-Twister RNG inside `parallel::mclapply()`, whose forked workers re-seed non-deterministically, so bootstrap standard errors and uniform-band critical values drifted between identical-seed runs. It now uses L'Ecuyer-CMRG parallel RNG streams (and restores the caller's RNG kind on exit). Point estimates were always unaffected.

* Fixed a crash in the parallel bootstrap when `biters < cores` (with `pl = TRUE`, `cores > 1`, and more than 2500 units): the per-core work split could produce a negative chunk size. The split is now always non-negative.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed. Both NEWS bullets now say the >2500 threshold is on influence-function rows (units, or clusters when clustered standard errors are used, since mboot() bootstraps the cluster-summed influence function).

…ws (units or clusters)

Address Copilot review: the >2500 threshold is nrow(inf.func), which is clusters
(not units) when clustered standard errors are used, since mboot() bootstraps the
cluster-summed influence function.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

@pedrohcgs pedrohcgs merged commit d66f832 into master Jun 19, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release CRAN release PR: skip the dev-version bump on merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants