A focused, runnable treatment of the six workhorse causal-inference techniques used in applied econometrics and marketing analytics: IV / 2SLS, DiD, RDD, PSM, Synthetic Control, and Double / Debiased Machine Learning. Designed as a standalone deep-dive into Stage 4 of the
`econometrics-deep-research` parent curriculum.
The full curriculum lives in SKILL.md (~440 lines,
PhD-depth). This README is the front door: it tells you what you'll
learn, how to study it, and what the five runnable demos produce.
- A 440-line curriculum in `SKILL.md` covering:
  - §1 IV / 2SLS: endogeneity, instruments, 2SLS mechanics, first-stage F, Anderson-Rubin CI, Sargan J-test, LATE
  - §2 DiD: canonical 2x2, TWFE pathology (Goodman-Bacon 2021), event-study pre-trends, threats to parallel trends, modern estimators (Callaway-Sant'Anna, Sun-Abraham, de Chaisemartin-D'Haultfœuille)
  - §3 RDD: sharp vs fuzzy, continuity, local-linear estimator, Imbens-Kalyanaraman bandwidth, McCrary density test, donut-hole
  - §4 PSM: selection on observables, matching, IPW, Abadie-Imbens SEs, doubly-robust, what PSM doesn't do
  - §5 Synthetic Control: ADH estimator, in-space placebos, when SC is the right tool
  - §6 DML: partially-linear, cross-fitting, IRM, when DML is the right tool
  - §7 Self-assessment rubric (8 items)
  - §8 Reading list (15 papers)
  - §9 Refinement log (versioned)
- Five self-contained Python demos in `demos/`, each ~270–310 lines, that simulate the relevant DGP and exercise the methodology end-to-end. Each demo plants a true effect so you can verify the estimator recovers it.
- A statsmodels / linearmodels / dowhy / econml gotcha bank in `references/statsmodels-linearmodels-api-quirks.md` (8 documented gotchas).
- Helper utilities in `demos/statsmodels_helpers.py` (first-stage F, Sargan J, Abadie-Imbens SE; robust across versions).
```bash
git clone https://github.com/Pouyasharp/causal-inference.git
cd causal-inference
pip install -r requirements.txt

# Run any demo (each is self-contained, ~30-90s)
python3 demos/demo_01_iv_2sls.py
python3 demos/demo_02_did.py
python3 demos/demo_03_rdd.py
python3 demos/demo_04_psm.py
python3 demos/demo_05_dml_synthetic.py
```
Each demo prints a step-by-step narrative to stdout and writes a
figure to figures/:
| Demo | Topic | Figure |
|---|---|---|
| 01 | IV / 2SLS: endogeneity, 2SLS, first-stage F, AR CI, Sargan | figures/demo_01_figure.png |
| 02 | DiD: canonical 2x2, TWFE, event study, TWFE pathology | figures/demo_02_figure.png |
| 03 | RDD: local-linear with IK bandwidth, donut-hole | figures/demo_03_figure.png |
| 04 | PSM: propensity, matching, balance, IPW, ATT | figures/demo_04_figure.png |
| 05 | DML + Synthetic Control: 5-fold DML, SC + placebos | figures/demo_05_figure.png |
figures/demo_01_figure.png shows the data, the OLS fit (biased by
endogeneity), the 2SLS fit (recovers the true effect), and the
first-stage scatter (z1 → D). The planted effect is β = 0.5, the
OLS bias from omitting u is ~0.27, and the 2SLS estimate is within
one standard error of the truth.
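The OLS-versus-2SLS contrast described above can be sketched in a few lines of NumPy. This is not the code of demo 01: the data-generating process, the sample size, and the coefficients below are assumptions, chosen only so that the planted effect is β = 0.5 and the population OLS bias works out to 0.54 / 2.0 = 0.27.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
beta_true = 0.5                                  # planted causal effect

# Assumed DGP (illustrative only, not the demo's exact one)
u = rng.normal(size=n)                           # unobserved confounder
z = rng.normal(size=n)                           # instrument: shifts D, independent of u
d = 0.8 * z + 0.6 * u + rng.normal(size=n)       # endogenous treatment
y = beta_true * d + 0.9 * u + rng.normal(size=n)


def ols(X, target):
    """Least-squares coefficients for target ~ X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]


const = np.ones(n)

# Naive OLS of Y on D: biased upward because D is correlated with u.
beta_ols = ols(np.column_stack([const, d]), y)[1]

# Stage 1: project D on the instrument. Stage 2: regress Y on the fitted D.
Z = np.column_stack([const, z])
d_hat = Z @ ols(Z, d)
beta_2sls = ols(np.column_stack([const, d_hat]), y)[1]

print(f"OLS  estimate: {beta_ols:.3f}")
print(f"2SLS estimate: {beta_2sls:.3f}")
```

Running the second stage by hand gives the right point estimate but the wrong standard errors, which is one reason to use a dedicated estimator such as `IV2SLS` for inference.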
figures/demo_02_figure.png shows the staggered-treatment panel
(10 early-treated, 10 late-treated, 10 never-treated) and the
event-study coefficients. The TWFE pathology is exposed by comparing
the canonical 2x2 DiD to the TWFE estimate under heterogeneous effects.
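A noise-free toy panel makes the pathology easy to see by hand. The sketch below keeps the 10 / 10 / 10 group sizes, but everything else is an assumption made for illustration: nine periods, treatment starting at t = 3 and t = 6, and an effect that grows by 1 per period of exposure. It is not the demo's DGP.

```python
import numpy as np

T = 9                                              # periods 0..8 (assumed)
# 10 early-treated (from t=3), 10 late-treated (from t=6), 10 never-treated
first_treated = np.repeat([3.0, 6.0, np.inf], 10)
t = np.arange(T)

rng = np.random.default_rng(0)
unit_fe = rng.normal(size=(first_treated.size, 1))
time_fe = 0.3 * t[None, :]                         # common trend: parallel trends hold

D = (t[None, :] >= first_treated[:, None]).astype(float)
exposure = t[None, :] - first_treated[:, None] + 1
effect = np.where(D == 1, exposure, 0.0)           # dynamic, heterogeneous effect
Y = unit_fe + time_fe + effect                     # no noise, so the algebra is exact

true_att = effect[D == 1].mean()                   # average over treated unit-periods


def did_2x2(Y, treated, control, pre, post):
    """(post - pre) change for treated units minus the same change for controls."""
    change = Y[:, post].mean(axis=1) - Y[:, pre].mean(axis=1)
    return change[treated].mean() - change[control].mean()


early = first_treated == 3
never = np.isinf(first_treated)
att_early_2x2 = did_2x2(Y, early, never, pre=t < 3, post=t >= 3)


def twfe(Y, D):
    """TWFE coefficient on D for a balanced panel, via two-way demeaning."""
    def demean(A):
        return A - A.mean(axis=1, keepdims=True) - A.mean(axis=0, keepdims=True) + A.mean()
    Dd, Yd = demean(D), demean(Y)
    return (Dd * Yd).sum() / (Dd ** 2).sum()


beta_twfe = twfe(Y, D)

print(f"true ATT (all treated cells): {true_att:.2f}")
print(f"2x2 DiD, early vs never:      {att_early_2x2:.2f}")
print(f"TWFE coefficient:             {beta_twfe:.2f}")
```

In this design the clean 2x2 comparison returns the early cohort's ATT of 3.5 exactly, while TWFE returns 2.0 against a true average of 3.0, even though parallel trends hold by construction. The shortfall arises because the two-way demeaned treatment puts zero weight on the early cohort's later, larger effects.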
figures/demo_03_figure.png shows the data, the local-linear
fit on each side of the cutoff, and a zoom at the cutoff. The planted
effect is τ = 2.0 at c = 0; the local-linear estimate recovers it,
while the degree-4 global-polynomial estimate is biased.
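As a rough sketch of the local-linear estimator, the function below fits a triangular-kernel weighted line on each side of the cutoff and takes the difference of the two intercepts. The function name, the simulated outcome curve, and the hand-picked bandwidth of 0.3 are assumptions made for illustration; the demo selects its bandwidth with the IK rule, which is not reproduced here.

```python
import numpy as np


def local_linear_rdd(x, y, cutoff=0.0, bandwidth=1.0):
    """Sharp-RDD jump at `cutoff`: right-side minus left-side intercept of
    triangular-kernel weighted linear fits within `bandwidth`."""
    xc = np.asarray(x, dtype=float) - cutoff
    y = np.asarray(y, dtype=float)

    def intercept_at_cutoff(side):
        # Triangular weights: 1 at the cutoff, falling linearly to 0 at the bandwidth
        w = np.clip(1.0 - np.abs(xc[side]) / bandwidth, 0.0, None)
        keep = w > 0
        X = np.column_stack([np.ones(keep.sum()), xc[side][keep]])
        sw = np.sqrt(w[keep])                      # WLS via sqrt-weight rescaling
        coef = np.linalg.lstsq(X * sw[:, None], y[side][keep] * sw, rcond=None)[0]
        return coef[0]

    return intercept_at_cutoff(xc >= 0) - intercept_at_cutoff(xc < 0)


# Assumed DGP: smooth curve plus a jump of tau = 2.0 at c = 0
rng = np.random.default_rng(0)
n = 4000
tau_true = 2.0
x = rng.uniform(-1, 1, n)
y = 1.0 + 0.8 * x + 0.5 * x**2 + tau_true * (x >= 0) + rng.normal(scale=0.3, size=n)

tau_hat = local_linear_rdd(x, y, cutoff=0.0, bandwidth=0.3)
print(f"local-linear estimate of tau: {tau_hat:.3f}")
```

The estimate depends on the bandwidth, so in practice it is worth re-running with a few values around the chosen one and reporting how much the jump moves.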
figures/demo_04_figure.png shows the propensity score overlap
(common support) and the love plot (covariate balance before/after
matching). The standardized mean difference (SMD) of every covariate
drops below 0.1 after matching.
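The balance check can be sketched as follows. The `smd` helper, the single-covariate DGP, and the use of the true propensity are assumptions for illustration; the demo estimates the propensity score and matches on it, whereas here the score is monotone in `x`, so matching on `x` directly is equivalent.

```python
import numpy as np


def smd(x_treated, x_control):
    """Standardized mean difference using the average of the two group variances."""
    pooled_sd = np.sqrt((np.var(x_treated, ddof=1) + np.var(x_control, ddof=1)) / 2.0)
    return (np.mean(x_treated) - np.mean(x_control)) / pooled_sd


# Assumed DGP: one confounder that raises the probability of treatment
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
propensity = 1.0 / (1.0 + np.exp(-x))
d = rng.random(n) < propensity

x_t, x_c = x[d], x[~d]

# 1-nearest-neighbour matching with replacement: one control per treated unit
nearest = np.abs(x_t[:, None] - x_c[None, :]).argmin(axis=1)
x_c_matched = x_c[nearest]

smd_before = smd(x_t, x_c)
smd_after = smd(x_t, x_c_matched)
print(f"SMD before matching: {smd_before:.3f}")
print(f"SMD after matching:  {smd_after:.3f}")
```

With several covariates the same function is applied column by column, and the love plot is simply these before/after values plotted side by side.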
figures/demo_05_figure.png shows the SC time series (treated vs
synthetic, with the treatment effect shaded) + the DML CI (the
plug-in estimator is biased; DML with cross-fitting recovers the
true θ).
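The cross-fitting step on the DML side can be sketched with NumPy alone. The assumptions here are substantial: a single control, a partially-linear DGP with θ = 1.0, and a cubic-polynomial least-squares fit standing in for whatever ML learner the demo uses. The names `dml_plr` and `fit_predict` are hypothetical. The sketch shows the mechanics of out-of-fold residualisation; it does not reproduce the plug-in bias the figure illustrates.

```python
import numpy as np


def poly_features(x, degree=3):
    """Columns 1, x, ..., x^degree for a 1-D control x."""
    return np.vander(x, degree + 1, increasing=True)


def fit_predict(x_train, target_train, x_test):
    """Stand-in nuisance learner: cubic polynomial least squares."""
    coef = np.linalg.lstsq(poly_features(x_train), target_train, rcond=None)[0]
    return poly_features(x_test) @ coef


def dml_plr(y, d, x, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in Y = theta*D + g(X) + e."""
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    y_res, d_res = np.empty(n), np.empty(n)
    for k in range(n_folds):
        test, train = folds == k, folds != k
        # Nuisances are fit on the other folds, then predicted out-of-fold
        y_res[test] = y[test] - fit_predict(x[train], y[train], x[test])
        d_res[test] = d[test] - fit_predict(x[train], d[train], x[test])
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Closed-form robust SE of the residual-on-residual regression
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi**2) / n) / np.mean(d_res**2)
    return theta, se


# Assumed DGP with a nonlinear confounding path through x
rng = np.random.default_rng(1)
n = 5000
theta_true = 1.0
x = rng.uniform(-2, 2, n)
d = np.sin(x) + rng.normal(size=n)
y = theta_true * d + x**2 + rng.normal(size=n)

theta_hat, se = dml_plr(y, d, x)
print(f"theta_hat = {theta_hat:.3f}, 95% CI half-width = {1.96 * se:.3f}")
```

Swapping in a flexible learner only requires replacing `fit_predict`; the fold logic and the final residual-on-residual regression stay the same.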
- Read `SKILL.md` §0: get oriented, set up the tool stack.
- Read §1 (IV/2SLS) and run `demo_01`. The first-stage F is the single most important diagnostic.
- Read §2 (DiD) and run `demo_02`. The TWFE pathology is the most important practical lesson in this whole skill.
- Read §3 (RDD) and run `demo_03`. The local-linear estimator with a triangular kernel is the modern default.
- Read §4 (PSM) and run `demo_04`. Balance is the goal, not prediction accuracy.
- Read §5 (SC) and §6 (DML), run `demo_05` (which covers both).
- Self-assess with the §7 rubric.
- Read the papers in §8 (15 entries, organized by topic).
- Open `REFINEMENT.md` to see what's still missing (Callaway-Sant'Anna demo, McCrary test, AIPW, mediation, sensitivity analysis, etc.).
A useful study cadence: 1 § per evening, run the corresponding demo, then close the laptop and write down — without looking — what the central estimator was and what its failure modes are.
For a marketing A/B test with endogeneity concerns:
```python
# Placeholders to replace with your own names:
#   df, controls, instrument, confounders, high_dim_controls

# 1. Diagnose the endogeneity source
#    Is D randomly assigned? If yes → OLS is fine. If no → continue.

# 2. IV / 2SLS
from linearmodels.iv import IV2SLS
formula = "Y ~ 1 + controls + [D ~ instrument]"
fit = IV2SLS.from_formula(formula, df).fit(cov_type="robust")
# REPORT first-stage F (Staiger-Stock: F > 10); see fit.first_stage.diagnostics

# 3. DiD (if staggered rollout)
from linearmodels.panel import PanelOLS
panel = df.set_index(["geo", "date"])  # (entity, time) MultiIndex
fit = PanelOLS(
    panel["Y"], panel[["D"]], entity_effects=True, time_effects=True
).fit(cov_type="clustered", cluster_entity=True)
# BEWARE: TWFE pathology under heterogeneous effects; use csdid for staggered

# 4. PSM (if selection on observables)
from sklearn.linear_model import LogisticRegression
pscore = LogisticRegression().fit(df[confounders], df["D"]).predict_proba(df[confounders])[:, 1]
# Match, check SMD < 0.1, estimate ATT with Abadie-Imbens SE

# 5. DML (if high-dimensional X)
from econml.dml import LinearDML
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
est = LinearDML(
    model_y=GradientBoostingRegressor(),
    model_t=GradientBoostingClassifier(),
    discrete_treatment=True,  # needed when model_t is a classifier (binary D)
    cv=5,
)
# Controls go in W; reserve X for features that drive effect heterogeneity
est.fit(df["Y"], df["D"], W=df[high_dim_controls])
# Standard errors are NOT bootstrapped; use the closed-form from OLS on residuals
```
For what to do when each assumption fails, see SKILL.md §1.5
(LATE interpretation), §2.4 (threats to parallel trends), §3.5
(manipulation), §4.5 (PSM doesn't solve unobserved confounding),
§6.5 (DML doesn't find instruments).
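Step 2 of the recipe asks for the first-stage F. For readers who want to see what that number is, here is a minimal sketch of the classical (homoskedastic) version, computed as a restricted-versus-unrestricted F-test on the excluded instruments. The function name `first_stage_F` and the simulated data are hypothetical; this is not the helper shipped in demos/statsmodels_helpers.py, and it does not adjust for heteroskedasticity or clustering.

```python
import numpy as np


def first_stage_F(d, Z, W=None):
    """Homoskedastic F-statistic for the excluded instruments Z in the
    first stage D ~ 1 + W + Z (W = included exogenous controls, optional)."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    Z = np.asarray(Z, dtype=float).reshape(n, -1)
    base = np.ones((n, 1)) if W is None else np.column_stack([np.ones(n), W])
    full = np.column_stack([base, Z])

    def rss(X):
        resid = d - X @ np.linalg.lstsq(X, d, rcond=None)[0]
        return resid @ resid

    q = Z.shape[1]                       # number of excluded instruments
    dof = n - full.shape[1]              # residual degrees of freedom
    return ((rss(base) - rss(full)) / q) / (rss(full) / dof)


# Assumed data: the same noise with a strong and a nearly irrelevant instrument
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
v = rng.normal(size=n)

F_strong = first_stage_F(0.5 * z + v, z)
F_weak = first_stage_F(0.02 * z + v, z)
print(f"strong instrument F: {F_strong:.1f}")
print(f"weak instrument F:   {F_weak:.1f}")
```

When the main regression uses robust or clustered errors, the first-stage statistic should be computed under the same error assumptions, so treat this version as a quick check only.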
```
causal-inference/
├── README.md                 # this file
├── SKILL.md                  # 440-line PhD curriculum
├── INDEX.md                  # topic-to-section navigation
├── REFINEMENT.md             # known gaps, versioned checklist
├── LICENSE                   # MIT
├── requirements.txt
├── .gitignore
├── references/
│   └── statsmodels-linearmodels-api-quirks.md
├── demos/
│   ├── demo_01_iv_2sls.py
│   ├── demo_02_did.py
│   ├── demo_03_rdd.py
│   ├── demo_04_psm.py
│   ├── demo_05_dml_synthetic.py
│   └── statsmodels_helpers.py
└── figures/
    ├── demo_01_figure.png
    ├── demo_02_figure.png
    ├── demo_03_figure.png
    ├── demo_04_figure.png
    └── demo_05_figure.png
```
A working causal-inference curriculum needs four things: precise mathematical statements, runnable code that exercises the math, a list of the failure modes that don't show up in textbooks, and the synthesis skill of selecting the right estimator for a given problem. This repo has all four. It is intentionally a curriculum, not a library — the demos are designed to be read and modified, not imported as a package.
The parent curriculum (econometrics-deep-research) covers all five
stages, top to bottom, at survey depth; this deep-dive
is the natural next stop for anyone applying the techniques to a
real dataset with endogeneity concerns.
MIT. See LICENSE. Demos may be reused as starting points for your
own analyses; attribution appreciated.