Fix deprecated group_by_() and enhance CI with binary builds #15

Open
AlanHuang99 wants to merge 5 commits into RTIInternational:master from AlanHuang99:master

Conversation

@AlanHuang99

No description provided.

- Replace group_by_(treat) with group_by(.data[[treat]]) (dplyr >= 1.0)
- Include treat column in scored_data[, c(treat, vars)] subsetting
  to prevent 'Column treat not found' error
… numpy

Fast rolling entry matching for staggered adoption studies.
Handles 90K+ treated units that crash R's rollmatch (dplyr 2.1B row limit).

Key features:
- Block-vectorized numpy matching (no full cross-product materialization)
- 10K treated × 30K controls in 1.7 seconds
- Polars DataFrames throughout
- Post-matching diagnostics: SMD, t-test, variance ratio, KS test, TOST equivalence
- Alpha caliper sweep with automatic best-alpha selection
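
For reviewers, the block-vectorized idea in the first bullet can be sketched roughly as below. This is a minimal illustration and not the code in this PR: the function name `block_match`, its parameters, and the `-1` sentinel are hypothetical, and it shows only nearest-score matching with replacement inside a fixed caliper, leaving out the rolling-entry (per-period) logic.

```python
import numpy as np


def block_match(treated_scores, control_scores, caliper, block_size=1024):
    """Hypothetical helper: nearest control per treated unit, within a caliper.

    Returns an int array with the index of the matched control for each
    treated unit, or -1 when no control lies within the caliper.
    Matching is with replacement (a control can be reused).
    """
    treated_scores = np.asarray(treated_scores, dtype=float)
    control_scores = np.asarray(control_scores, dtype=float)
    matches = np.full(treated_scores.shape[0], -1, dtype=np.int64)
    if control_scores.size == 0:
        return matches

    for start in range(0, treated_scores.shape[0], block_size):
        block = treated_scores[start:start + block_size]
        # Distance matrix for this block only: (len(block), n_controls).
        # The full treated x control cross product is never held in memory.
        dist = np.abs(block[:, None] - control_scores[None, :])
        nearest = dist.argmin(axis=1)
        nearest_dist = dist[np.arange(block.shape[0]), nearest]
        # Keep the nearest control only if it falls inside the caliper.
        matches[start:start + block.shape[0]] = np.where(
            nearest_dist <= caliper, nearest, -1
        )
    return matches


if __name__ == "__main__":
    result = block_match([0.10, 0.50, 0.90], [0.12, 0.48, 0.30],
                         caliper=0.05, block_size=2)
    print(result.tolist())  # [0, 1, -1]
```

Peak memory in this sketch scales with `block_size × n_controls` rather than `n_treated × n_controls`, which is the property the bullet describes.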

Modules: reduce, score, match, balance, diagnostics, core
Tests: 26 tests (smoke, stress 10K scale, robustness/edge cases)
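
Two of the balance diagnostics listed above, SMD and variance ratio, can be illustrated as follows. This is a sketch under assumptions, not the `diagnostics` module from this PR: the function names are hypothetical, the SMD uses the common pooled-SD form sqrt((var_t + var_c) / 2), and it sticks to the Python standard library to stay self-contained.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Hypothetical helper: (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0.0:
        # Both groups are constant: balanced only if the constants agree.
        return 0.0 if mean(treated) == mean(control) else float("inf")
    return (mean(treated) - mean(control)) / pooled_sd


def variance_ratio(treated, control):
    """Hypothetical helper: sample variance of treated over control.

    Raises ZeroDivisionError if the control group has zero variance.
    """
    return variance(treated) / variance(control)


if __name__ == "__main__":
    treated = [1.0, 2.0, 3.0]
    control = [2.0, 3.0, 4.0]
    print(standardized_mean_difference(treated, control))  # -1.0
    print(variance_ratio(treated, control))                # 1.0
```

An SMD near 0 and a variance ratio near 1 indicate that a covariate is similarly distributed in the matched treated and control groups.
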
Python reimplementation moved to separate repo: pyrollmatch


2 participants