IPCW sandwich for multinomial-outcome g-computation (23a-2c) + g-formula direct-censoring-term fix #15
Merged
…23a-2c) + g-formula direct-censoring-term fix
Lift the last transitional gate (causatr_categorical_outcome_sandwich): a
multinomial point g-computation under IPCW (missing Y) now supports
ci_method = "sandwich". The censoring model's estimation uncertainty enters the
per-class influence function through two channels carried by its own IF: an
indirect path (gamma -> IPCW-weighted outcome model -> marginal mean, the
A_{beta,gamma}^T h cross-term) and a direct path (the IPCW-weighted average
carries gamma in its weights, contributing d mu / d gamma). Validated to ~1e-7
against a Python delicatessen-style M-estimation stack that stacks the logistic
censoring score alongside the IPCW-weighted multinomial score and marginal-mean
equations (multinom_gcomp_ipcw_sandwich.py), on per-class means and every
diff/ratio/OR contrast.
Building that first tight IPCW oracle exposed a pre-existing conservatism in the
scalar gcomp IPCW sandwich: it captured only the indirect path, so per-class
marginal-mean SEs ran ~1-1.5% above the truth (the efficiency gain from
estimating the censoring model; Robins, Rotnitzky & Zhao 1994). The direct
d mu / d gamma term is added there too via shared ipcw_direct_grad_setup() /
ipcw_direct_grad() helpers. The term cancels in contrasts, so reported ATE
difference/ratio/OR SEs are unchanged. Point IPW and point AIPW were
Monte-Carlo-verified to already be well-calibrated (their marginal mean is the
model parameter, so the gamma-dependence is already captured) -- no change there.
Also fix a benign "non-integer #successes" warning from the survey-weighted
binomial censoring model by switching to quasibinomial() when weights are
present (identical coefficients/SEs/predictions; mirrors the propensity-model
convention in fit_bernoulli_density()).
Testing: multinomial IPCW M-estimation oracle (~1e-7), scalar gcomp IPCW oracle
(inline stacked-EE, exact), survey-weighted IPCW oracle, two direct-term-active
guards, a multinomial IPCW SE-vs-empirical-SD calibration MC, a weighted-IPCW
multinomial bootstrap-parity test, and an IPW-no-gap MC regression guard. The
directly-affected test files (test-gcomp-categorical-outcome.R,
test-ipcw-variance.R) pass green; the full suite's IPCW/weighted surface
(aipw-ipcw, aipw-point-extra, aipw-transport, ipcw-lmtp-oracle, ipcw-transport)
passes with 0 failures.
Summary
Completes the analytic sandwich for multinomial-outcome point g-computation by adding the IPCW (missing-Y) path (Phase 23a-2c), and fixes a pre-existing conservatism the new tight oracle exposed in the scalar gcomp IPCW sandwich.
Under IPCW the censoring model's estimation uncertainty enters the per-class influence function through two channels, both carried through the censoring model's own IF:
- Indirect path: γ → IPCW-weighted outcome model → marginal mean (the A_{β,γ}ᵀh cross-term).
- Direct path: the IPCW-weighted average carries γ in its weights, contributing ∂μ/∂γ = (1/Σw) Σ (dwᵢ/dγ)(p_{k,i} − μ), with dwᵢ/dγ = −wᵢ(1−p_unc,ᵢ)X_cens,ᵢ, summed over target rows with positive weight.

The direct term recovers the efficiency gain from estimating the censoring model (Robins, Rotnitzky & Zhao 1994) and cancels in contrasts: reported ATE difference/ratio/OR SEs are unchanged; only the per-class marginal-mean SEs move.
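The direct-path gradient can be checked numerically. The sketch below is a standalone illustration, not the package's ipcw_direct_grad() helper: it assumes a logistic censoring model with weights wᵢ = 1/p_unc,ᵢ, holds the predicted class probabilities p_{k,i} fixed in γ (so only the direct path is active), and uses made-up toy data. All function names are hypothetical.

```python
import math


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def ipcw_weights(gamma, x_cens):
    """Return (w, p_unc) with p_unc_i = expit(x_i . gamma) and w_i = 1 / p_unc_i."""
    p_unc = [expit(sum(g * x for g, x in zip(gamma, row))) for row in x_cens]
    return [1.0 / p for p in p_unc], p_unc


def weighted_mean(gamma, x_cens, p_k):
    """IPCW-weighted (Hajek) average of the class-k predictions."""
    w, _ = ipcw_weights(gamma, x_cens)
    return sum(wi * pi for wi, pi in zip(w, p_k)) / sum(w)


def direct_grad(gamma, x_cens, p_k):
    """Analytic d mu / d gamma = (1 / sum w) * sum_i (dw_i/dgamma) (p_ki - mu)."""
    w, p_unc = ipcw_weights(gamma, x_cens)
    sw = sum(w)
    mu = sum(wi * pi for wi, pi in zip(w, p_k)) / sw
    grad = [0.0] * len(gamma)
    for wi, pu, row, pk in zip(w, p_unc, x_cens, p_k):
        for j, x in enumerate(row):
            dw = -wi * (1.0 - pu) * x  # dw_i/dgamma_j for w_i = 1 / expit(x_i . gamma)
            grad[j] += dw * (pk - mu) / sw
    return grad


def finite_diff_grad(gamma, x_cens, p_k, eps=1e-6):
    """Central finite-difference gradient of the weighted mean in gamma."""
    grad = []
    for j in range(len(gamma)):
        up, dn = list(gamma), list(gamma)
        up[j] += eps
        dn[j] -= eps
        diff = weighted_mean(up, x_cens, p_k) - weighted_mean(dn, x_cens, p_k)
        grad.append(diff / (2.0 * eps))
    return grad


# Toy inputs (illustrative only): intercept plus one censoring covariate.
x_cens = [[1.0, 0.2], [1.0, -1.0], [1.0, 0.7], [1.0, 1.5]]
p_k = [0.10, 0.45, 0.30, 0.80]
gamma = [0.5, -0.3]

analytic = direct_grad(gamma, x_cens, p_k)
numeric = finite_diff_grad(gamma, x_cens, p_k)
print(analytic, numeric)
```

Here every toy row has positive weight, so the sum runs over all rows. In the sandwich this gradient is combined with the censoring model's own influence function, which is not shown.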
What this exposed (and fixed)
Building the first tight IPCW M-estimation oracle (a delicatessen-style stack that carries the censoring γ block) revealed that the scalar gcomp IPCW sandwich captured only the indirect path, so its per-class marginal-mean SEs ran ~1–1.5% above the truth. The direct term is now added there too via the shared ipcw_direct_grad_setup() / ipcw_direct_grad() helpers.

Point IPW and point AIPW were Monte-Carlo-verified to need no such term: their marginal mean is the model parameter, so the γ-dependence is already captured by the indirect cross-term. The gap is specific to the plug-in g-formula's separate covariate-averaging step.
Also fixes a benign "non-integer #successes" warning from the survey-weighted binomial censoring model by switching to quasibinomial() when weights are present (identical coefficients/SEs/predictions; mirrors fit_bernoulli_density()).
Validation
multinom_gcomp_ipcw_sandwich.py (stacks logistic censoring score + IPCW-weighted multinomial score + marginal-mean equations): matched to ~1e-7 on per-class means and every diff/ratio/OR contrast.
Known follow-up
Longitudinal ICE IPCW likely shares the same plug-in direct-term gap (its coverage test only asserts the conservative [0.85, 2.0] band). Not addressed here: it needs the stacked-cascade derivation and a longitudinal oracle. Tracked separately.