builder: implement edid() -- Efficient DiD estimator (no-covariate path) #1
Open
marcelortizv wants to merge 112 commits into
Adds 14 new R source files implementing the Chen, Sant'Anna & Xie (2025) Efficient DiD estimator for staggered-adoption balanced panels.

Key features:
- PT-All and PT-Post regimes via enumerate_valid_pairs_edid()
- Omega* covariance matrix and optimal inverse-covariance weights
- Cluster-robust SE via EIF sandwich formula
- Multiplier bootstrap (Rademacher, Mammen, Webb)
- WIF correction in overall/event-study/group aggregations
- edid_fit S3 class with print/summary/coef/vcov/as.data.frame methods
- Covariate and survey paths are clean stubs (stop with message)
- No new package dependencies (svd() replaces MASS::ginv)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…p field names)

- BUG-1: remove spurious `eif - att_gt` subtraction in compute_eif_nocov_edid(); the score is already zero-mean by construction (each group contribution is demeaned)
- BUG-2: safe_inference_edid() now returns inference_valid=FALSE (with non-NA se but NA CIs/p-value) when att is non-finite, fixing the valid=TRUE/NA-CI inconsistency
- BUG-3: resolves automatically from BUG-1 fix
- BUG-4: rename overall_draws→overall_b, event_study_draws→event_study_b, group_draws→group_b in run_multiplier_bootstrap_edid() and update callers in edid.R

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Appended edid() feature bullet to NEWS.md under the 2.3.1.904 development header. Produced ARCHITECTURE.md and log-entry.md in the run directory documenting the EDiD module structure, panel_obj/edid_fit schemas, (g,t) cell loop, PT-All vs PT-Post pair enumeration, Omega* construction, aggregation/bootstrap flows, and 4 bug resolutions (BUG-1 through BUG-4). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the efficient DiD estimator from Chen, Sant'Anna & Xie (2025) supporting PT-All and PT-Post parallel trends assumptions.

Features (Priorities 1-6):
- Data validation and balanced-panel preprocessing
- Valid-pair enumeration for PT-All and PT-Post regimes
- Closed-form efficient DiD via inverse-covariance weights (no-cov path)
- Analytical EIF-based SEs (iid and cluster-robust)
- Overall, event-study, and group aggregation with WIF correction
- Multiplier bootstrap (Rademacher, Mammen, Webb) with cluster expansion

Deferred to follow-up: DR covariate path, survey support, Hausman pretest.

207 new edid tests pass; 777 full-suite pass; 0 regressions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
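As an illustration of the "inverse-covariance weights" idea mentioned above, here is a minimal sketch, not the package's code. It assumes the standard minimum-variance combination w = Omega^+ 1 / (1' Omega^+ 1) and an SVD-based pseudoinverse in the spirit of the "svd() replaces MASS::ginv" note; the helper names `pinv_svd_sketch` and `efficient_weights_sketch` are hypothetical.

```r
# Hypothetical helper: Moore-Penrose pseudoinverse via base R svd().
pinv_svd_sketch <- function(A, tol = sqrt(.Machine$double.eps)) {
  s <- svd(A)
  keep <- s$d > tol * max(s$d)            # drop near-zero singular values
  u <- s$u[, keep, drop = FALSE]
  v <- s$v[, keep, drop = FALSE]
  v %*% (t(u) / s$d[keep])                # V diag(1/d) U'
}

# Hypothetical helper: weights proportional to Omega^+ 1, normalized to sum to one.
efficient_weights_sketch <- function(Omega) {
  w <- pinv_svd_sketch(Omega) %*% rep(1, nrow(Omega))
  as.numeric(w / sum(w))
}

# Three uncorrelated moments with variances 1, 2, 4:
# the least noisy moment receives the largest weight.
w <- efficient_weights_sketch(diag(c(1, 2, 4)))
print(w)
```

With a diagonal Omega the weights reduce to the familiar inverse-variance weights, which is a quick sanity check on any implementation of this step.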
- Rename edid() args: outcome->yname, unit->idname, time->tname, first_treat->gname, alpha->alp, cluster->clustervars, n_bootstrap->bstrap+biters; control_group "never_treated"->"nevertreated"
- Add G=0->Inf auto-conversion so att_gt() datasets work directly
- Rewrite print.edid_fit() to MP-style ATT(g,t) table with sig codes
- summary.edid_fit() delegates to print then appends overall/ES/group
- Add bstrap field to edid_fit object for CI label selection
- Create R/edid-aggte.R: aggte_edid(), print/summary.AGGTEobj_edid
- Update compare_att_gt_edid.R: new arg names, drop G_edid column
- Update all edid test files to use renamed arguments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rison cohort Replaces the PT-All loop in enumerate_valid_pairs_edid() to use only treated cohorts as comparison cohorts (never-treated appears only as the time control inside each moment). Self-pairs (gp==target_g) include period_1 as a valid tpre (degenerate CS DiD); cross-pairs exclude it. This eliminates T-1 redundant gp=Inf rows, resolves near-singular Omega, and produces a correctly-specified analytical Omega for PT-All. Also removes dead gp=Inf branches in compute_omega_star_nocov_edid(), compute_generated_outcomes_nocov_edid(), and compute_eif_nocov_edid() that handled the now-impossible Inf comparison cohort case. PT-Post paths are unchanged. ATT estimates match author's reference to < 1e-10; pair count for cell (g=3, t=any) on 10-period 3-cohort data is 13 (was: 10 finite + 9 redundant Inf). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace edid-covariates.R stub (stop() functions) with empty placeholder
- Add xformla argument to edid(), validate_edid_inputs(), prepare_edid_panel(), fit_edid_cells()
- Extract covariate_matrix in prepare_edid_panel() from xformla formula
- Add xformla/covariates validation in edid-validate.R (covariates now deprecated-errors)
- Dispatch to covariate EIF path (edid-cov.R + edid-cov-eif.R) when xformla is non-trivial
- Fix bs_objects NULL-sentinel bug in build_basis_matrix_edid/predict_basis_edid that broke cross-fitting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pass Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Untrack personal working files (PDFs, audit/spec/plan markdown notes) and move compare_att_gt_edid.R into benchmark/. Add the personal files to .gitignore so they remain on disk locally but aren't shared, and to .Rbuildignore (along with benchmark/) so they don't ship in the R package build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The R CMD check failed because:
- Roxygen plain-text math like (g', t_pre), [Y_s - Y_1 | G=g', X], and
r_{g,Inf} confused the Rd parser (apostrophes interpreted as quoted
strings, brackets as link targets, _{...} as markdown emphasis). Wrap
these in \eqn{} or rename g' -> gp to match the variable name in code.
- as.data.frame.edid_fit() signature didn't match the as.data.frame()
generic; add row.names/optional and move which after ... .
Also regenerate stale Rd files for the covariate-path functions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three bugs fixed:

1. aggte_edid(type="dynamic") and aggte_edid(type="group") reused the pre-baked "simple" overall instead of computing type-specific overalls. Now:
   - dynamic overall = mean(ES(e)) for e >= 0 (equal weights, no WIF)
   - group overall = pi_g-weighted average of per-group ATTs (with WIF)
   Both match the formulas in compute.aggte.R (did::aggte).
2. enumerate_valid_pairs_edid() for PT-Post excluded tpre == period_1, which caused edid(pt_assumption="post") to return NA for the earliest-treated cohort (e.g., group 8 in the Dobkin application). The standard 2x2 DiD with period_1 as base is valid.
3. EIF index tracking through balance_e/min_e/max_e/na.rm filtering in aggte_edid: original indices into es_list/gr_list are now preserved so that the correct EIFs are used for SE computation.

Also stores cohort_fractions and unit_cohorts in edid_fit so that aggte_edid can compute pi_g-weighted group overalls with WIF.

Verified against:
- did::aggte() on the same data (formula match)
- did::att_gt(base_period="varying") for PT-Post (diff ~1e-12)
- Paper equations (3.8), (3.9), Theorem 3 SE formula
- All 519 existing edid tests pass
Two corrections to the doubly-robust covariate path so the implementation
matches Chen, Sant'Anna & Xie (2025):
1. Cross-cohort term1 in compute_generated_outcomes_cov_edid() now uses
the full Eq. (4.4) augmentation:
(G_g / pi_g) * (Y_t - Y_1 - m_{Inf,t,t'}(X) - m_{g',t',1}(X))
replacing the partial form that subtracted only m_{Inf,t,1}(X).
The previous form was consistent but less efficient than the EIF.
2. Omega* now uses sieve-estimated conditional inverse propensities
1/p_g(X) per Eq. (3.12), rather than the unconditional 1/pi_g
approximation. A new estimate_inverse_propensity_edid() (B-spline
sieve) and estimate_all_inverse_propensities() (K-fold cross-fitting)
produce the per-unit weights threaded into compute_omega_star_cov_edid().
Verification (R=5000, n=500, 10 cores, 3 DGPs with pre/post comparison,
benchmark/edid_cov_fix_verification.R):
- Both pre- and post-fix versions are consistent (|bias| < 0.005 in all
cells across all DGPs); the term1 correction does not remove bias but
delivers the efficient EIF.
- mc_sd of point estimates drops 4-12% (DGP-dependent), matching the
expected efficiency gain from the EIF correction.
- All 519 base tests and 66 covariate tests continue to pass.
…stics

The covariate path of edid() now uses plug-in (no sample-splitting) nuisance estimation, with train = test = full sample. This matches the paper's main-text proposal (Sec. 5.2, footnote 601, which explicitly warns that "sample-splitting [...] can lead to a loss of precision in DiD and ES estimation in small samples").

Why: Monte Carlo experiments at R=500, n=500 on three overlap-respecting DGPs (max 1/p_g(X) <= 16, well within Assumption O) showed that K=5 cross-fitting substantially inflates the point estimator's variance at moderate sample sizes:

| DGP | mc_sd K=5 | mc_sd K=1 | reduction | edid vs att_gt |
|-----|-----------|-----------|-----------|----------------|
| A   | 0.117     | 0.080     | -32%      | K=5 22% WORSE  |
| B   | 0.131     | 0.083     | -36%      | K=5 35% WORSE  |
| C   | 0.139     | 0.085     | -39%      | K=5 42% WORSE  |

Under K=5 the efficient EDiD estimator is actually less efficient than the just-identified DR estimator (att_gt) -- the cross-fitting noise exceeds the efficiency gain. Under K=1 the efficient estimator beats att_gt by 13-17% as the paper's theory predicts.

Implementation notes:
- fit_edid_cells() hardcodes K=1L; the K>1 branch in the three estimate_all_* loops is preserved (gated by `if (K_folds == 1L)`) so cross-fitting can be re-enabled by changing one line once the finite-sample SE behavior of cross-fitted plug-in variance is better understood.
- The existing analytical SE estimator (sum(eif^2)/n^2) is conservative even under K=1 (over-coverage ~0.99 at n=500); this is a separate issue tracked for future investigation.

Tests: 1090 PASS, 0 FAIL. Also fixed three pre-existing test helpers that passed the obsolete `covariates = NULL` argument to fit_edid_cells() (test-edid-aggregate.R lines 10/70, test-edid-bootstrap.R line 123).

Supporting evidence: benchmark/edid_cov_fix_verification.R (uncommitted working-tree script).
pmax(.,0) on the fitted ratio breaks the sieve first-order condition that zeroes the outcome-regression estimation effect; removing it keeps the estimator orthogonal and improves coverage under strong selection. Original line kept commented for easy restore.
… function) compute_eif_cov_edid used w'Ytilde - ATT; the correct first-order IF for the ratio estimator ATT_hat = E_n[w'Ytilde]/E_n[G_g] is w'Ytilde - (G_g/pi_g)*ATT. The constant centering inflated the variance by ATT^2(1/pi_g - 1) with no asymptotic shrinkage (verified: SE/mc_sd 1.33-1.54 -> 1.02-1.12 in good overlap). No-covariate path uses the correct group-demeaned IF and is unaffected.
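The variance gap quoted above can be checked numerically in a stripped-down setting. The sketch below is an assumption-laden toy, not compute_eif_cov_edid(): it replaces w'Ytilde with the simplest ratio-estimator score (G/pi) * d, where d is a simulated per-unit outcome change, and compares the constant-centered score with the ratio-form score. All variable names are hypothetical.

```r
set.seed(42)
n <- 2000
G <- rbinom(n, 1, 0.3)          # treated-cohort indicator G_g
d <- 2 + rnorm(n)               # toy generated outcome (true ATT = 2)

pi_hat  <- mean(G)              # estimated cohort share pi_g
att_hat <- mean(d[G == 1])      # ratio estimator E_n[G d] / E_n[G]

# Ratio-form influence function: centering scales with G / pi.
psi_ratio <- (G / pi_hat) * (d - att_hat)
# Constant-centered version: subtracts ATT from every unit.
psi_const <- (G / pi_hat) * d - att_hat

var_gap   <- mean(psi_const^2) - mean(psi_ratio^2)
predicted <- att_hat^2 * (1 / pi_hat - 1)
print(c(var_gap = var_gap, predicted = predicted))
```

In this toy the gap equals ATT^2 (1/pi_g - 1) exactly in-sample, because the extra term ATT * (G/pi - 1) is orthogonal to the ratio-form score; it does not shrink with n, which is consistent with the over-wide standard errors described above.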
Author-comparison and original-simulation scripts plus their generated CSVs; superseded auxiliary material, removed to keep the package source focused. Recoverable from git history if needed.
- Aggregation: cohort-share weight-influence (WIF) contribution added to the event-study, group, and calendar overalls (compute_wif_contribution_edid, aggregate_calendar_edid), so the aggregate influence functions account for the estimated cohort shares.
- Inverse-propensity helpers (estimate_all_inverse_propensities, estimate_inverse_propensity_edid) supply the conditional 1/p scalings of Omega*(X) (Eq. 3.12) in place of the unconditional fallback.
- vcov.edid_fit: cluster-robust branch matching the reported cluster-robust standard errors; docstring no longer claims to return bootstrap variances.
- Edge guards: inverse-propensity n_gp == 0, NA weighted generated outcomes, tname finiteness, at least one finite cohort, and balanced cross-fit folds now raise informative errors instead of returning Inf/NA.
- Documentation: edid() and compute_*_edid docstrings aligned with Eqs. (3.12)/(4.4) and the aggregation definitions; pointwise-weight docstring states the shrinkage + eigenvalue-floor regularization.
- Tests: new test-edid-paper-faithfulness.R; extended cov-eif / integration / pairs-validation tests (580 edid tests pass, 0 failures). New man pages for the helpers above.

Note: a full document() + R CMD check pass is still pending before this branch becomes a PR.
as_MP_edid() builds a did::MP object from an edid fit -- group-time estimates, influence functions, and a DIDparams -- so did::aggte() and the rest of the did ecosystem can aggregate edid output unchanged, with no edits to aggte.R / compute.aggte.R. edid() now stores the metadata the MP needs (idname, tname, gname, time_periods, all_units, panel). Verified: aggte() runs for simple/group/dynamic/calendar on the edid MP, with analytical SEs, and the dynamic event-time ATT reproduces edid's per-cell estimates. Requires store_eif = TRUE. First step toward routing edid aggregation through did::aggte instead of edid's own aggregation code. Tests: test-edid-mp.R.
aggte_edid() now builds a did::MP (as_MP_edid) and delegates to did::aggte(), returning a standard did::AGGTEobj, so edid aggregation uses the published CS2021 definitions and inherits did's print / summary / tidy methods. Verified numerically identical to the previous edid-native aggregation (att ~4e-16, se ~6e-17, 4 seeds x 4 types). edid() now always retains the influence functions (as att_gt() always returns $inffunc); 'store_eif' is kept for backward compatibility but no longer gates storage. Removed the now-dead AGGTEobj_edid print/summary methods (did's AGGTEobj methods apply) and their man pages. 589 edid tests pass.
…lete edid-aggregate.R edid() now stores its aggregations as standard did::AGGTEobj objects ($overall/$event_study/$group/ $calendar/$simple), built via aggte_edid() -> did::aggte() on the edid MP. The edid_fit methods (summary/coef/vcov/as.data.frame) read AGGTEobj fields (overall.att/se, att.egt/se.egt/egt) and the per-element influence functions in $inf.function; summary() delegates event-study/group/calendar display to did's print.AGGTEobj. So summary()/coef()/vcov() now follow did's format and definitions. Deletes edid-aggregate.R (487 lines: aggregate_*_edid + compute_wif_contribution_edid), its test, and 5 man pages -- aggregation is now entirely did::aggte-driven (verified numerically identical to the prior edid-native aggregation: att ~4e-16, se ~6e-17). Migrated the public-contract assertions in test-edid-cov-basic / integration / paper-faithfulness from the old scalar-list fields ($overall$att, ...) to the AGGTEobj fields ($overall$overall.att, ...); the underlying values are unchanged. 569 edid tests pass.
…GTEobj aggregations Regenerate the man pages with roxygen2 8.0.0 (new package baseline; RoxygenNote bumped). Update edid()'s @return to describe the did::AGGTEobj aggregation slots ($overall / $simple / $event_study / $group / $calendar) and the always-stored $eif. Remove the now-vestigial store_eif argument: the influence functions are always retained (as att_gt() always returns $inffunc), since the aggregations are built from them. Dropped store_eif from edid()'s signature/@param, pass store_eif = TRUE internally to fit_edid_cells(), updated as_MP_edid() docs, and removed store_eif usages from the tests (the moot "requires store_eif" test deleted). 568 edid tests pass.
…na 2021, Remark 10) The clustered multiplier bootstrap now aggregates the influence function to cluster SUMS rather than cluster means: cluster_sum_if = rowsum(inf.func, cluster); bres = sqrt(n_clusters) * mb(cluster_sum_if); se = bSigma * sqrt(n_clusters) / n. For equal-sized clusters this matches the previous cluster-mean aggregation; for unbalanced clusters and repeated cross-sections it is the correct cluster-sum form. The no-clustering path is unchanged (n_clusters = n reduces se to bSigma / sqrt(n)). Aligns this fork with the fix merged in bcallaway11/did PR bcallaway11#261. 338 cluster/bootstrap/aggte tests pass.
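A small sketch of the cluster-sum scaling described above, using base R's rowsum(). It is an analytic stand-in only: the root-mean-square of the cluster sums takes the place of the bootstrap scale bSigma, which is an assumption made to keep the example deterministic. It shows that with equal-sized clusters the cluster-sum and cluster-mean forms coincide.

```r
set.seed(1)
n_clusters <- 5
m <- 4                                    # equal cluster size
n <- n_clusters * m
cluster  <- rep(seq_len(n_clusters), each = m)
inf_func <- matrix(rnorm(n), ncol = 1)    # toy influence function, one column

# Cluster-sum form: sum the influence function within clusters,
# then rescale by sqrt(n_clusters) / n.
cluster_sum_if <- rowsum(inf_func, cluster)
se_sum <- sqrt(mean(cluster_sum_if^2)) * sqrt(n_clusters) / n

# Cluster-mean form (the previous aggregation): average within clusters,
# then divide by sqrt(n_clusters).
cluster_mean_if <- cluster_sum_if / m
se_mean <- sqrt(mean(cluster_mean_if^2)) / sqrt(n_clusters)

print(c(se_sum = se_sum, se_mean = se_mean))
```

With unequal cluster sizes the two quantities generally differ, which is the case the commit targets.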
… edid-bootstrap.R Under bstrap = TRUE, the cell-level SEs and simultaneous critical value now come from did::mboot on the influence-function matrix, and the aggregations bootstrap through aggte_edid() -> did::aggte(bstrap = TRUE), so $att_gt, $overall, $event_study, ... all carry multiplier-bootstrap inference consistent with att_gt. Verified: bootstrap SEs match the analytical SEs up to Monte-Carlo error (linear influence function) for both clustered and unclustered designs. Deletes edid-bootstrap.R (run_multiplier_bootstrap_edid / generate_multiplier_weights_edid / compute_bootstrap_stats_edid), its test, and 3 man pages. Removes the bootstrap_weights argument (did's mboot uses BMisc's multiplier; att_gt-consistent). Builds on the mboot cluster-sum fix so edid's clustered bootstrap uses the correct cluster-sum form. 500 edid tests pass.
Fold edid-linalg.R (pseudoinverse / condition-number / weighted-OLS helpers) and the edid-imports.R @importFrom block into edid-utils.R, and delete the intentionally-empty edid-covariates.R placeholder. No behavior change (the merged helpers are unexported internals; NAMESPACE imports unchanged). Reduces the edid file count 16 -> 13. 500 edid tests pass.
- Qualify stats::.lm.fit in solve_ols_edid (no visible global function NOTE).
- Replace raw braces with parens in roxygen titles/text (compute_pointwise_weights_edid, estimate_all/inverse_propensity_edid) -- fixes the checkRd 'Lost braces' NOTEs.
- Add CLAUDE.md to .Rbuildignore (non-standard top-level file NOTE).

R CMD check: 0 errors, 0 warnings, 0 NOTEs.
…efficient-DiD-estimator Brings the branch up to date with master (8 commits, including the merged PR bcallaway11#261 cluster-robust inference). Resolves the only conflict, R/mboot.R, by taking master's version -- the reviewed PR bcallaway11#261 Remark-10 cluster-sum implementation supersedes the local port (0867d6c), which made the same change. Regenerated docs with roxygen2 8.0.0. The edid consolidation routes aggregation/bootstrap through master's aggte/mboot/getSE; verified against the merged infrastructure: edid suite 500 pass, full did suite 1135 pass, 0 failures.
The kernel routine emitted a warning() whenever n > 1000, which is an ordinary sample size for DiD -- so it fired on essentially every covariate-path call (42 of 43 test-suite warnings). It signals nothing about result correctness, only that the O(n^2) kernel may be slow at large n. Demote it to an informational message(), emitted only in interactive sessions for n > 5000 and silenceable via options(edid.quiet = TRUE); the inner conditional-covariance step was already optimized to matrix-vector form, so the n x n cost only bites in the tens of thousands. Also mark the small-n nonlinear-DGP covariate test, which deliberately stresses overlap, as expecting the extreme-propensity diagnostic. edid suite: 500 pass, 0 warnings.
…-coverage CI)

The test-coverage workflow runs the suite under covr, where the package's own `did::`/`did:::` self-references do not resolve against the instrumented namespace (e.g. `did::reset.sim`, `did:::prepare_edid_panel`), and the test-inference.R callr sub-processes that load the did 2.1.2 reference build are unreliable. Two changes, both no-ops outside coverage:

- Reference package symbols bareword in tests (test_check resolves both exports and internals); this is the standard idiom and removes 21 of the coverage-only failures.
- Gate the did 2.1.2 reference install on R_COVR != "true"; under covr `old_did_available` stays FALSE so the SE-stability comparisons hit their existing skip_if() guard (and the install warnings go away).

Verified: covr now runs clean (coverage 76.9%); the full suite still passes under load_all (1135 pass, 0 fail), so R CMD check and R-Package-Test are unaffected.
… build CI

FEATURE -- no never-treated group. When every unit is eventually treated (no gname == Inf and no gname == 0), edid() now follows the att_gt() control_group = "nevertreated" convention instead of erroring: it drops every period at/after the last cohort's effective onset (g_max - anticipation) and recasts that cohort as never-treated, so it anchors the comparison over the retained pre-onset window (with a warning). Implemented as a shared internal helper (.edid_coerce_no_never_treated) at the edid() boundary; the SAME helper is reused by edid_perturbation_bootstrap()'s inline panel rebuild so both produce an identical panel. Guards error on a single treated cohort, or when fewer than two pre-onset periods remain. Validated byte-identical (AUTO==MANUAL) to running edid() on the hand-transformed panel across the no-cov / covariate paths, all weight schemes, observation weights, every aggregation, clustering, and all three bootstraps (multiplier, refit, perturbation). New tests in test-edid-no-never-treated.R.

FIX (R CMD check) -- thin-cohort fingerprint fragility. The legacy byte-identity fingerprints in test-edid-thin-cohort.R were compared with expect_identical, which fails under R CMD check's byte-compiled / installed package (and cross-platform BLAS) even though load_all reproduces them exactly. Relax to a tight tolerance under covr AND R CMD check; keep exact identity in interactive dev.

FIX (build-check) -- declare badger in Suggests so devtools::build_readme() (the README badge chunk) succeeds. Dead library(ggpubr)/library(gridExtra) were already removed.

Also: gate the no-never-treated coercion on !anyNA(gname) so NA cohorts defer to validation (no spurious warning / base max() warning); make the per-cell is_pre flag anticipation-aware (t < g - anticipation) for the post-cell diagnostics (reported att/se and aggregations unaffected).

Full testthat suite: 3609 pass / 0 fail / 0 error.
…ger dep Declaring badger in Suggests (prev commit) let build-check install it, but badger's badge_*() functions query git/GitHub at render time (git_default_branch_(github_remote_config())), which fails in build-check's CI sandbox. Replace the dynamic badger chunk in README.Rmd with the equivalent STATIC badge markdown (identical rendered output, no render-time dependency) and drop badger from Suggests (no longer used anywhere). Verified: README.Rmd renders cleanly end to end.
…heck pkgdown fix)
pkgdown::build_site() (the build-check job) requires every exported, documented
topic to appear in the reference index; PR260's 16 new edid exports were missing,
so build_reference_index() failed ("Reference metadata not ok ... 16 topics
missing from index"). Add them in three sections (equivalent to the independent
Copilot fix in PR bcallaway11#265): edid / aggte_edid / as_MP_edid alongside att_gt / aggte;
an "Efficient DiD: Plotting, Summarizing, and Methods" section (print/summary/
coef/vcov/as.data.frame methods + edid_weights + edid_weight_plot); and an
"Efficient DiD: Specification Testing and Robustness" section (hausman / sargan /
adaptive / frontier / perturbation_bootstrap / refit_bootstrap).
…nization, efficient plug-in toolkit Over-identification / inference overhaul for the efficient DiD estimator (see NEWS.md for full detail). Over-id / Hausman test: finite-sample AHT effective-df F reference (m = G_eff - 1) replacing the chi-square (m -> inf) corner; fixes the few-cluster / dispersed-weight over-rejection. The dispersed-weight eigen-ridge is removed (the F is the sole finite-sample correction). One change in the shared chokepoint .edid_if_diff_quadform, inherited by edid_hausman/sargan/frontier/adaptive; surfaces m_eff / df2 / m_sat (fragility flag). -> chi-square as G_eff -> inf (no-op for balanced i.i.d. large n). Omega-EE harmonization: the first-order misspecification weight-estimation influence function is now ON by default for any non-uniform weight_scheme on BOTH paths. Covariate: psi_Omega(X). No-covariate (new): psi_omega = D %*% mbar, the IF of the weighted pseudo-estimand theta_w, folded into the EIF; it is zero under correct spec (optimal-weight FOC, D %*% 1 = 0) and restores coverage of theta_w under misspecification. It composes with the existing second-order var_add (Bessel + optimization-optimism), which is also default-on for non-uniform no-covariate fits; the var_add cross-cell increment is built from the pure pre-psi EIF. FD-oracled to ~1e-9; Monte-Carlo nominal under correct spec (no double-counting) and a fix under misspecification (mean-SE/MC-SD 0.90 -> 1.00), incl. the aggregate and dispersed weights. vcov.edid_fit(): carries the combined second-order increment (sigma_quad + sigma_nocov_ee) on the att_gt AND aggregation covariances, so sqrt(diag(vcov())) equals the reported SE on every path. 
Over-identification toolkit uses the EFFICIENT plug-in influence function: edid_hausman / edid_sargan / edid_frontier / edid_adaptive refit the legs in the plug-in configuration (all estimation-effect channels off), so the over-id object lives on the efficient inverse-variance variance (Andrews, Chen & Tecchio 2025, Sec 5), invariant to the fits' SE convention. edid_sargan's 'inference' (match_fit/plugin_fast) argument removed; edid_hausman/frontier/adaptive gain a 'data' argument (recovered from the call, with a guard that errors on a data mismatch). Validation: full testthat 693 blocks / 0 failures (NOT_CRAN); option-matrix smoke clean; roxygen + NEWS updated; man/ regenerated.
The over-identification toolkit (edid_hausman / edid_sargan / edid_frontier /
edid_adaptive) puts each leg in the efficient plug-in configuration via
.edid_plugin_refit(), which is a pure function of (fit, data). One over-id
operation previously refit the SAME legs in edid_hausman(event_study), again in
edid_hausman(overall), again in edid_sargan, and once per window-grow step; on a
large covariate panel each refit re-estimates the full Omega*(X) nuisance.
Memoize the plug-in refit (session cache keyed by an att-vector + refit-args
fingerprint; bounded), so each unique leg is fit exactly once and window-grow
steps add no refits. Bit-identical to recomputing (the data-reproduction guard
runs on the cache miss); validated identical() across {no-cov,cov}x{unw,wt} for
every Hausman statistic/df/p, the Sargan table, and the full window-grow ladder
with options(edid_plugin_cache=FALSE) vs TRUE. ~2.3-2.5x on a moderate op; more
when the refit dominates. New edid_clear_plugin_cache(); full testthat 4212/0.
With control_group = "notyettreated" and no never-treated group, the last-treated cohort is kept in the data as a not-yet-treated comparison group but removed from glist. The step that drops units already treated in the first period filtered the data by `gname %in% c(0, glist)`, which -- because the last cohort was no longer in glist -- also deleted that entire comparison cohort. The mere presence of always-treated units (or, via anticipation, the earliest cohort being reclassified as first-period treated, or a late cohort being un-coerced from never-treated) therefore silently shrank the not-yet-treated control pool, turning affected ATT(g,t) cells into NA or biasing them. Drop the first-period-treated units by row identity (the existing treated_first_period mask) instead of by glist membership, and re-apply the latest-cohort glist exclusion when no never-treated group remains, so the cohort stays in the data as a control but gets no ATT of its own. Fixed identically in pre_process_did (slow) and pre_process_did2 (fast/default). control_group = "nevertreated" was unaffected. Add tests/testthat/test-always-treated-invariance.R covering the three trigger pathways, the outcome-scaling oracle, fast/slow parity, structural retention of the comparison cohort, and nevertreated invariance. Bump to 2.5.1.
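The difference between the two filters can be seen on a toy panel. This is a hypothetical sketch, not pre_process_did(): the data frame, the first period (taken to be 1), and the variable names other than `glist` and `treated_first_period` are illustrative assumptions.

```r
# Toy unit-level data: cohort 1 is already treated in the first period,
# cohort 4 is the last-treated cohort (kept only as a not-yet-treated control).
dat <- data.frame(id = 1:6, gname = c(1, 2, 2, 3, 4, 4))
first_period <- 1

# glist after the last cohort has been excluded (it gets no ATT of its own).
glist <- c(1, 2, 3)
treated_first_period <- dat$gname <= first_period & dat$gname > 0

# Old behaviour: filter by glist membership. Cohort 4 is not in glist,
# so the whole comparison cohort disappears along with cohort 1.
glist_kept  <- glist[glist > first_period]
old_filter  <- dat[dat$gname %in% c(0, glist_kept), ]

# New behaviour: drop only the rows flagged as treated in the first period.
new_filter <- dat[!treated_first_period, ]

print(old_filter$gname)
print(new_filter$gname)
```

Filtering by row identity leaves cohort 4 available as a control, while the membership filter silently removes it.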
Address Copilot review: T is a reassignable global alias for TRUE, so use n_periods as the period-count argument in .mk_design() (matches the package convention of never using T/F).
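A short demonstration of why `T` is unsafe as an identifier or as shorthand for TRUE. The example is generic base R and does not reproduce .mk_design(); the variable names are illustrative.

```r
# T is an ordinary variable bound to TRUE, so it can be shadowed.
shadowed <- local({
  T <- 0            # e.g. a "number of periods" variable named T
  isTRUE(T)         # no longer TRUE inside this scope
})

# TRUE itself is a reserved word and cannot be reassigned.
reserved_is_protected <- inherits(
  tryCatch(eval(parse(text = "TRUE <- 0")), error = function(e) e),
  "error"
)

print(c(shadowed = shadowed, reserved_is_protected = reserved_is_protected))
```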
Address Copilot review nitpicks in test-always-treated-invariance.R: spell out the 0 (slow) / Inf (fast) never-treated sentinels instead of the ambiguous `c(0/Inf, glist)`, and give the P2/structural section dividers a consistent trailing dash run.
…eck) The build-check workflow renders README.Rmd via devtools::build_readme(), which failed at library(ggpubr) -- ggpubr (and gridExtra) are loaded but never used and are not declared dependencies, so CI cannot install them. The badger::badge_*() chunk is also render-time fragile (it queries git/GitHub, which fails in CI sandboxes). Drop the two unused library() calls and replace the badger chunk with the equivalent static badge markdown (identical rendered output, no render-time dependency). Mirrors the same fix on the edid feature branch. README.Rmd now renders cleanly with only declared deps (did, BMisc, ggplot2); badges are unchanged image URLs and still load. Devel-version badge set to 2.5.1.
…l-cohort-deletion Fix: notyettreated control cohort silently deleted when no never-treated group exists
…ive gname
Three correctness/robustness fixes found by the post-2.5.0 deep audit:
1. fix_weights = "varying" reported standard errors exactly 2x too large on a
balanced panel (analytic and bootstrap, propagating through all aggte()
aggregations). Point estimates were correct. The repeated-cross-section
influence function is normalized over the 2*n_units stacked observations;
folding the pre/post halves to the unit level was missing the 1/2. Fixed in
both compute paths (compute.att_gt2.R force_rc fold, compute.att_gt.R
varying branch). Repeated cross sections and unbalanced panels were
unaffected. Verified against Monte Carlo: the corrected SE matches the
empirical sampling SD and equals the panel-estimator SE under time-invariant
weights.
2. aggte() crashed with cryptic errors on empty / all-NA selections:
- type = "calendar" with na.rm = TRUE now drops calendar periods whose
post-treatment cells are all NA (mirrors the type = "group" guard).
- type = "simple"/"dynamic" with a min_e/max_e window that excludes every
post-treatment period now return a clear message instead of an internal
"report this as a bug" / "non-numeric argument to binary operator" error
(seq_len() in the dynamic overall, an empty-keepers guard in wif(), and an
empty-eseq guard).
3. att_gt() rejects a negative gname up front with a clear message in both
faster_mode = TRUE and FALSE; previously the fast path accepted negative
cohort codes silently while the slow path errored.
Add tests/testthat/test-audit-fixes.R (Monte-Carlo-anchored SE equivalence,
aggte empty/all-NA guard behavior, negative-gname rejection). Full suite passes
with 0 failures.
Master was auto-bumped to a 2.5.1.1 dev version after PR bcallaway11#266 merged (the 'release' label was applied after that merge, so the dev-bump correctly fired). This PR is the 2.5.1 release PR; reset the version to 2.5.1 and label it 'release' so the post-merge dev bump is skipped and master HEAD keeps the exact 2.5.1 release version for the CRAN tarball.
…hts-aggte-validation Fix: fix_weights='varying' 2x SE, aggte empty-selection crashes, negative gname (post-2.5.0 audit)
…ax_e warning

From the second deep audit (no estimand/bias/calibration defects found; these are robustness/plumbing on non-default paths):

- Parallel multiplier bootstrap (bstrap=TRUE, pl=TRUE, cores>1, n>2500) is now reproducible under a fixed set.seed(): use L'Ecuyer-CMRG parallel RNG streams in the mclapply branch (forked Mersenne-Twister workers re-seed non-deterministically), and restore the caller's RNGkind on exit. Point estimates were always unaffected; only bootstrap SE / uniform-band crit.val drifted run-to-run.
- Fixed a crash in the parallel bootstrap when biters < cores: the per-core work split could go negative (e.g. biters=2, cores=4 -> [-1,1,1,1]). It is now a non-negative split that drops empty chunks.
- aggte(type="calendar") now warns when min_e/max_e/balance_e are supplied (event-study-only options with no effect on calendar aggregation; the correct unrestricted result is returned, as before).

Regression tests added to test-audit-fixes.R (parallel-seed reproducibility, non-negative chunking incl. biters<cores, calendar warning). Bug #1 from the audit (scale-dependent absolute SE threshold) intentionally not addressed. Full suite passes (0 failures).
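One way to obtain a non-negative split that drops empty chunks is sketched below. The function name `split_biters_sketch` is hypothetical and the code is not taken from mboot.R; it only illustrates the property described above.

```r
# Hypothetical helper: divide `biters` bootstrap draws across `cores` workers.
split_biters_sketch <- function(biters, cores) {
  biters <- as.integer(biters)
  cores  <- as.integer(cores)
  base   <- biters %/% cores                 # draws every worker gets
  extra  <- biters %% cores                  # leftover draws, one each
  chunks <- rep(base, cores) + c(rep(1L, extra), rep(0L, cores - extra))
  chunks[chunks > 0L]                        # drop workers with nothing to do
}

small <- split_biters_sketch(2, 4)    # fewer draws than cores
large <- split_biters_sketch(10, 4)   # ordinary case
print(small)
print(large)
```

The number of non-empty chunks also gives a natural upper bound for the worker count, so no idle workers need to be spawned when biters < cores.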
…es cap, expect_identical
- mboot.R: use <- for the new assignments and && / drop `== TRUE` in the parallel-branch condition (scalar, matches the Windows guard above).
- mboot.R: cap mc.cores at the number of non-empty chunks so biters < cores does not spawn idle workers.
- test-audit-fixes.R: assert exact reproducibility with expect_identical() (verified bit-identical) instead of the tolerant expect_equal().
…to-prior, <- consistency
- compute.aggte.R: the calendar warning no longer claims the options "apply only to type='dynamic'" (max_e is also honored by simple/group); it just states they are ignored for type="calendar".
- mboot.R: serial branch now uses <- for consistency with the rest of the function.
- test-audit-fixes.R: assert RNGkind() is restored to its pre-call value (captured at runtime) rather than a hardcoded "Mersenne-Twister".
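A minimal sketch of the save-and-restore pattern for the RNG kind, assuming a hypothetical wrapper `with_lecuyer()`; it illustrates the idea only and is not the code in mboot.R. The caller's RNGkind() is captured at runtime and restored with on.exit(), which is why a test should compare against the captured value rather than a hardcoded "Mersenne-Twister".

```r
# Hypothetical wrapper: run `f()` under L'Ecuyer-CMRG, then restore the
# caller's RNG configuration even if `f()` errors.
with_lecuyer <- function(f) {
  old_kind <- RNGkind()  # character vector: kind, normal.kind, sample.kind
  on.exit(do.call(RNGkind, as.list(old_kind)), add = TRUE)
  RNGkind("L'Ecuyer-CMRG")
  f()
}

before <- RNGkind()
inside <- with_lecuyer(function() RNGkind()[1])
after  <- RNGkind()
```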
…ws (units or clusters)
Address Copilot review: the >2500 threshold is nrow(inf.func), which counts clusters (not units) when clustered standard errors are used, since mboot() bootstraps the cluster-summed influence function.
…strap-calendar-warn Round-2 audit fixes: parallel bootstrap RNG reproducibility + chunk guard, calendar max_e warning
The hardening pass replaced complete.cases() with complete_finite_cases()
in both preprocessing paths, which dropped any row with a non-finite value
in a numeric column. gname == Inf is a documented never-treated code
("group status 0 or Inf"), so that filter silently deleted every
never-treated unit: under control_group = "notyettreated" it warned and
dropped them, and under control_group = "nevertreated" it returned
plausible-looking but wrong ATTs with no error.
Add a finite_exclude argument to complete_finite_cases() and pass gname in
both pre_process_did() and pre_process_did2(). Excluded columns still get
the NA/NaN check via complete.cases(); only legitimate Inf is preserved.
Restores parity with master, where gname = Inf is bit-identical to gname = 0.
Add regression tests: gname = Inf equals gname = 0 (ATT and influence
functions) across both code paths and both control groups, plus a
complete_finite_cases() unit test confirming NA/NaN gname is still dropped.
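A minimal sketch of how a `finite_exclude` argument could work, written as a hypothetical `complete_finite_cases_sketch()`; the return type (a logical row mask, like complete.cases()) and the internals are assumptions, not the package's implementation. Every column still gets the NA/NaN check, while the finiteness check skips the excluded columns so that gname = Inf survives.

```r
# Hypothetical re-implementation for illustration only.
complete_finite_cases_sketch <- function(df, finite_exclude = character(0)) {
  keep <- stats::complete.cases(df)  # NA/NaN check on all columns
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  for (col in setdiff(numeric_cols, finite_exclude)) {
    keep <- keep & is.finite(df[[col]])  # drop +/-Inf outside excluded columns
  }
  keep
}

df <- data.frame(y = c(1, 2, Inf, 4, 5),
                 G = c(Inf, 2004, 2006, NA, NaN))
complete_finite_cases_sketch(df, finite_exclude = "G")  # keeps rows 1 and 2
complete_finite_cases_sketch(df)                        # keeps row 2 only
```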
…gument-validation Harden validation and preprocessing edge cases
…CT "report the J")
Model-level over-identification statistic following Andrews, Chen & Tecchio (2025): refit every admissible elementary comparison pair as a just-identified estimator, contrast them within each cell, and stack into the rank-aware IF-difference quadratic form (.edid_if_diff_quadform). The DiD over-identification is intrinsically LOW-RANK, so the reference df is the rank of the contrast covariance, not the nominal Q-p (per-cell / per-horizon / overall coincide up to the cells in scope).
Engine (.edid_if_diff_quadform, shared with edid_hausman/edid_sargan):
- Effective-rank relative floor rel_tol = "auto" = r_bare/n_eff calibrates the covariate decaying spectrum; a true no-op on the exact-zero no-covariate spectrum. Division of labor with the AHT effective-df F: the floor sets the rank (spectral scale n_eff), the F sets the few-cluster reference (m = G_eff - 1). Gated on auto/rel_tol>0, so edid_hausman/edid_sargan stay byte-identical.
- Cluster-rank saturation guard: when the bare numerical rank reaches the cluster ceiling (r_bare = G_eff - 1) the joint is not estimable and is reported as NA (rank_deficient), routing to the per-cell breakdown ($cells)/edid_sargan. Fires only under coarse clustering with a large over-id, not under the default unit-level clustering.
edid_overid() returns the scoped J table (overall/event_study/att_gt), the per-cell breakdown, and $rank_deficient. edid_frontier() now reports a dual frontier: the directed Hausman radius and the full-J worst-case radius (ACT Prop 5.2), with a fragile flag.
Validation: full testthat 4239/0 (existing baseline byte-identical for edid_hausman/edid_sargan/edid_adaptive + new edid_overid structural / regression / cluster-rank-saturation tests). MC size + power on i.i.d./AR(1)/clustered no-covariate, covariate (effective-rank floor), weighted, and genuinely clustered designs.
Real data: mpdta (no-cov df 5; +lpop df 6 from bare-cut 15) and Bailey-Goodman-Bacon (county/unit clustering per the original paper: J = 46.3, df 15, p ~ 5.6e-5; over-id rejects).
New files: R/edid-overid.R, tests/testthat/test-edid-overid.R, man/edid_overid.Rd.
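A minimal sketch of a generic rank-aware quadratic form, to illustrate why the reference df is the rank of the contrast covariance rather than its dimension. The function `rank_aware_quadform()` and its fixed relative tolerance are assumptions for illustration; this is not .edid_if_diff_quadform, and it does not implement the "auto" floor or the few-cluster F reference described above.

```r
# Hypothetical helper: J = d' V^+ d over the eigen-directions of V whose
# eigenvalue exceeds a relative floor; df is the number of retained directions.
rank_aware_quadform <- function(d, V, rel_tol = sqrt(.Machine$double.eps)) {
  e <- eigen(V, symmetric = TRUE)
  keep <- e$values > rel_tol * max(e$values)
  r <- sum(keep)
  proj <- crossprod(e$vectors[, keep, drop = FALSE], d)  # contrasts in eigenbasis
  J <- sum(proj^2 / e$values[keep])
  list(J = J, df = r, p = stats::pchisq(J, df = r, lower.tail = FALSE))
}

# Three contrasts but only two independent directions: df is 2, not 3.
res <- rank_aware_quadform(d = c(1, 2, 5), V = diag(c(1, 1, 0)))
res$J   # 5
res$df  # 2
```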
…luence function)
Under clustering the efficient weights are now formed from the cluster moment covariance Sig_cl = crossprod(rowsum(psi, cluster))/n^2 (the covariance of cluster sums), not the independent-unit covariance Omega*, aligning the weight metric with the cluster-robust SE. This is the efficient influence function within the generated-moment model under cluster sampling: it restores the efficient <= just-identified inequality that independent-unit weights violate, and is byte-identical when there is no clustering or one unit per cluster.
- No-covariate PT-All: invert Sig_cl for the weights; few-cluster/rank guard (fall back to Omega* when H >= number of active clusters); ridge intensity on the cluster effective count (not unit n).
- Cluster weight-estimation corrections re-derived through Sig_cl: first-order misspecification IF (per-unit, cluster-broadcast EIF) and the second-order estimation_effect var_add (per-cluster; (G/(G-1))*2Q, no separate Bessel) plus the cross-cell increment. FD-oracled; MC-calibrated (restores coverage in the few-cluster regime).
- Covariate averaged/gmm: invert the cluster covariance of the generated-outcome moments (the two coincide under clustering); first-order misspecification IF folds into the EIF.
- Covariate pointwise (efficient): unchanged; conditionally (not cluster) efficient under clustering with an honest cluster-robust SE. The cluster-efficient covariate influence function (per-cluster block with the within-cluster cross-unit moment covariance) is a separate planned addition.
- Rewrote the "clustering changes only SEs" regression to the new invariant (efficient point is clustering-dependent; PT-Post and uniform are not).
Full test suite passes; option-matrix smoke sweep clean; the over-identification toolkit (edid_overid/edid_sargan) is byte-identical (just-identified elementary moments).
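A minimal sketch of the cluster moment covariance quoted in this commit, Sig_cl = crossprod(rowsum(psi, cluster))/n^2, together with inverse-covariance weights. The helper names `cluster_moment_cov()` and `inverse_cov_weights()` are hypothetical, psi is assumed to be an n x H matrix of already-centered per-unit moment contributions, and the weight formula is the textbook minimum-variance combination rather than a transcription of the package code.

```r
# Hypothetical helpers for illustration.
cluster_moment_cov <- function(psi, cluster) {
  n <- nrow(psi)
  crossprod(rowsum(psi, cluster)) / n^2   # covariance of cluster sums
}

inverse_cov_weights <- function(Sig) {
  w <- solve(Sig, rep(1, nrow(Sig)))      # Sig^{-1} 1
  w / sum(w)                              # normalize to sum to one
}

set.seed(1)
psi <- matrix(rnorm(60), nrow = 20, ncol = 3)
cluster <- rep(1:10, each = 2)            # 10 clusters of 2 units
Sig_cl <- cluster_moment_cov(psi, cluster)
w <- inverse_cov_weights(Sig_cl)

# With one unit per cluster the cluster metric reduces to the unit-level one.
Sig_unit <- cluster_moment_cov(psi, seq_len(nrow(psi)))
```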
…ght recovery)
The overall ES_avg / vcov(which="overall") second-order weight-estimation increment (var_add) was silently dropped for single-cohort designs: the least-squares recovery of the overall weights from the event-study influence columns degenerates when the egt block is collinear (a single cohort with a long pre-window), so the increment was skipped and the reported ES_avg SE was too small.
Fix: recover the overall-aggregation weights from the known design map (.edid_overall_att_map, finite-differencing the overall estimand) as a fallback when the exact LS recovery fails; route vcov(which="overall") through the same path so the headline SE and vcov stay in parity. The genuine group/calendar skip (estimated cohort-share weights outside the egt span) is retained. The hybrid recovery (exact LS where the egt is full-rank, known-weights fallback only where LS fails) preserves byte-identity for staggered designs and corrects the single-cohort case.
Audit: dobkin staggered overall SE byte-identical (absDelta=0); vcov<->headline parity <=3.5e-18; full testthat 0 failures / 3667 pass; 20-row option-matrix smoke clean; regression test added. Only caochen's ES_avg SE moves (+3.46%, 0.0167592 -> 0.0173382), bootstrap-corroborated.
…); cluster ESS for LW shrink
Two clustered no-cov fixes (both latent: 0 current apps trigger them).
B1: when the per-cell moment count H >= G_act (active clusters), the clustered moment covariance Sig_cl is rank-deficient and the prior code fell back to an IID (non-clustered) Sig, which lets the efficient weights exploit Sig_cl's noise null space and report a sub-floor SE (illusory precision -> a spurious ARE<1 violating the efficiency bound). Replace the IID fallback with a per-cell eigen-floor on Sig_cl (the Andrews-1987 noise-floor already used by the Hausman eigen-ridge): lift the noise null space to max_eig * max(sqrt(eps), sqrt(disp_cl/cl_n_eff)), disp_cl = max(0, H/cl_n_eff - 1), gated on H >= G_act. The efficient clustered SE then collapses to the honest equal-weight read (SE/uniform-floor 0.12 -> 1.00 on a rank-1 fixture), consistent with the rank-deficient illusory-precision regime.
LW: the Ledoit-Wolf shrinkage intensity used the unit Kish ESS even when shrinking the cluster metric Sig_cl; switch to the cluster ESS (cl_n_eff = G_act) when cl_metric_on, mirroring the ridge. Default (cl_metric_on = FALSE) reproduces the legacy intensity exactly.
Audit: all 5 invariant fingerprints byte-identical (ridge block/staggered/post, LW unit-metric no-cluster, none block); the H<G_act and no-cluster paths unchanged; the 4 real clustered apps (tva, kresch, xu, axbarddeng) byte-identical (B1 does not fire, LW not their config) -> no reported number moves. Full testthat 0 failures; 36-row option-matrix smoke (incl few-cluster H>=G_act) clean; regression test added (12 assertions).
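A minimal sketch of the eigen-floor formula quoted in B1, written as a hypothetical `eigen_floor_sketch()`. Two points are assumptions rather than facts from the commit: `eps` is read as machine epsilon, and "lifting" is implemented as raising every eigenvalue below the floor up to the floor. The gate uses cl_n_eff in place of G_act, which the commit states are equal in the clustered metric.

```r
# Hypothetical illustration of the per-cell eigen-floor; not the package code.
eigen_floor_sketch <- function(Sig_cl, cl_n_eff) {
  H <- nrow(Sig_cl)
  if (H < cl_n_eff) return(Sig_cl)        # gate: only when H >= active clusters
  e <- eigen(Sig_cl, symmetric = TRUE)
  disp_cl <- max(0, H / cl_n_eff - 1)
  floor_val <- max(e$values) *
    max(sqrt(.Machine$double.eps), sqrt(disp_cl / cl_n_eff))
  vals <- pmax(e$values, floor_val)       # lift the noise null space
  e$vectors %*% (vals * t(e$vectors))     # V diag(vals) V'
}

# Rank-1 fixture: H = 3 moments, 2 active clusters.
Sig <- tcrossprod(c(1, 2, 3))
Sig_floored <- eigen_floor_sketch(Sig, cl_n_eff = 2)
eigen(Sig_floored, symmetric = TRUE)$values  # all strictly positive
```

On this fixture disp_cl = 0.5, so the floor is max_eig * sqrt(0.25) = 7 against a top eigenvalue of 14, and the floored matrix is invertible.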
…iD-estimator
# Conflicts:
# NEWS.md
# README.Rmd
…c & pkgdown CI
Three no-cov helpers gained args (cluster_indices, unit_weights, cl_metric_on, cl_n_eff) without re-running document(); the stale \usage tripped the codoc check under error_on=warning. Add the two missing @param tags, regenerate the Rd, and add edid_overid to the _pkgdown.yml reference index. Docs-only.