Pouyasharp/causal-inference

Causal Inference — A PhD-Level Deep Dive

A focused, runnable treatment of the six workhorse causal-inference techniques used in applied econometrics and marketing analytics: IV / 2SLS, DiD, RDD, PSM, Synthetic Control, and Double / Debiased Machine Learning. Designed as a standalone deep-dive of Stage 4 of the econometrics-deep-research parent curriculum.

The full curriculum lives in SKILL.md (~440 lines, PhD-depth). This README is the front door: it tells you what you'll learn, how to study it, and what the five runnable demos produce.


What this repo gives you

  • A 440-line curriculum in SKILL.md covering:
    • §1 IV / 2SLS — endogeneity, instruments, 2SLS mechanics, first-stage F, Anderson-Rubin CI, Sargan J-test, LATE
    • §2 DiD — canonical 2x2, TWFE pathology (Goodman-Bacon 2021), event-study pre-trends, threats to parallel trends, modern estimators (Callaway-Sant'Anna, Sun-Abraham, de Chaisemartin-D'Haultfœuille)
    • §3 RDD — sharp vs fuzzy, continuity, local-linear estimator, Imbens-Kalyanaraman bandwidth, McCrary density test, donut-hole
    • §4 PSM — selection on observables, matching, IPW, Abadie-Imbens SEs, doubly-robust, what PSM doesn't do
    • §5 Synthetic Control — ADH estimator, in-space placebos, when SC is the right tool
    • §6 DML — partially-linear, cross-fitting, IRM, when DML is the right tool
    • §7 Self-assessment rubric (8 items)
    • §8 Reading list (15 papers)
    • §9 Refinement log (versioned)
  • Five self-contained Python demos in demos/, each ~270–310 lines, that simulate the relevant DGP and exercise the methodology end-to-end. Each demo plants a true effect so you can verify the estimator recovers it.
  • A statsmodels / linearmodels / dowhy / econml gotcha bank in references/statsmodels-linearmodels-api-quirks.md (8 documented gotchas).
  • Helper utilities in demos/statsmodels_helpers.py (first-stage F, Sargan J, Abadie-Imbens SE — robust across versions).

Quick start

git clone https://github.com/Pouyasharp/causal-inference.git
cd causal-inference
pip install -r requirements.txt

# Run any demo (each is self-contained, ~30-90s)
python3 demos/demo_01_iv_2sls.py
python3 demos/demo_02_did.py
python3 demos/demo_03_rdd.py
python3 demos/demo_04_psm.py
python3 demos/demo_05_dml_synthetic.py

Each demo prints a step-by-step narrative to stdout and writes a figure to figures/:

| Demo | Topic | Figure |
| --- | --- | --- |
| 01 | IV / 2SLS: endogeneity, 2SLS, first-stage F, AR CI, Sargan | figures/demo_01_figure.png |
| 02 | DiD: canonical 2x2, TWFE, event study, TWFE pathology | figures/demo_02_figure.png |
| 03 | RDD: local-linear with IK bandwidth, donut-hole | figures/demo_03_figure.png |
| 04 | PSM: propensity, matching, balance, IPW, ATT | figures/demo_04_figure.png |
| 05 | DML + Synthetic Control: 5-fold DML, SC + placebos | figures/demo_05_figure.png |

What the figures look like

Demo 01 — IV / 2SLS

figures/demo_01_figure.png shows the data, the OLS fit (biased by endogeneity), the 2SLS fit (recovers the true effect), and the first-stage scatter (z1 → D). The planted effect is β = 0.5; the OLS bias from omitting u is about 0.27, and the 2SLS estimate lands within one standard error of the truth.
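To make the OLS-versus-IV contrast concrete, here is a minimal standard-library sketch. It is not the code in demos/demo_01_iv_2sls.py: the DGP coefficients, the helper names (`cov`, `simulate`, `first_stage_f`), and the sample size are all assumptions chosen for illustration, so the bias it produces will differ from the ~0.27 reported above. With one instrument and no controls, 2SLS reduces to the ratio cov(z, Y) / cov(z, D), and the first-stage F equals the squared t-statistic on the instrument (computed here under homoskedasticity).

```python
import math
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, beta=0.5, seed=0):
    """Hypothetical DGP: u confounds D and Y; z shifts D but never enters Y directly."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                      # instrument
        ui = rng.gauss(0, 1)                      # unobserved confounder
        di = 0.8 * zi + ui + rng.gauss(0, 1)      # first stage
        yi = beta * di + ui + rng.gauss(0, 1)     # structural equation
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y


def first_stage_f(z, d):
    """Homoskedastic F for a single excluded instrument (the squared t-stat)."""
    n = len(z)
    mz, md = sum(z) / n, sum(d) / n
    pi_hat = cov(z, d) / cov(z, z)
    resid = [di - md - pi_hat * (zi - mz) for zi, di in zip(z, d)]
    s2 = sum(r * r for r in resid) / (n - 2)
    se = math.sqrt(s2 / sum((zi - mz) ** 2 for zi in z))
    return (pi_hat / se) ** 2


z, d, y = simulate()
beta_ols = cov(d, y) / cov(d, d)   # biased: D is correlated with u
beta_iv = cov(z, y) / cov(z, d)    # just-identified 2SLS (Wald ratio)
f_stat = first_stage_f(z, d)
print(f"OLS={beta_ols:.3f}  IV={beta_iv:.3f}  first-stage F={f_stat:.0f}")
```

In this assumed DGP the OLS slope overshoots the planted 0.5 because cov(D, u) > 0, while the IV ratio does not. With controls or several instruments the ratio form no longer applies and a full 2SLS routine is needed.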

Demo 02 — DiD

figures/demo_02_figure.png shows the staggered-treatment panel (10 early-treated, 10 late-treated, 10 never-treated units) and the event-study coefficients. The TWFE pathology is exposed by comparing the canonical 2x2 DiD to the TWFE estimate under heterogeneous effects.
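The canonical 2x2 estimator that serves as the baseline in that comparison can be sketched in a few lines. This is an illustrative simulation, not the demo's staggered panel: the group gap, the common time shock, the effect size of 1.5, and the function names are all assumptions. Parallel trends holds by construction because both groups receive the same time shock.

```python
import random
from statistics import mean


def simulate_2x2(n_per_cell=2000, effect=1.5, seed=1):
    """Hypothetical two-group, two-period data keyed by (treated, post)."""
    rng = random.Random(seed)
    cells = {}
    for treated in (0, 1):
        for post in (0, 1):
            level = 2.0 * treated            # fixed gap between the groups
            shock = 1.0 * post               # time shock common to both groups
            te = effect * treated * post     # planted treatment effect
            cells[(treated, post)] = [
                level + shock + te + rng.gauss(0, 1) for _ in range(n_per_cell)
            ]
    return cells


def did_2x2(cells):
    """(treated post - treated pre) - (control post - control pre)."""
    treated_change = mean(cells[(1, 1)]) - mean(cells[(1, 0)])
    control_change = mean(cells[(0, 1)]) - mean(cells[(0, 0)])
    return treated_change - control_change


cells = simulate_2x2()
did_hat = did_2x2(cells)
# Naive post-period comparison absorbs the fixed group gap and is biased.
naive_post_gap = mean(cells[(1, 1)]) - mean(cells[(0, 1)])
print(f"DiD={did_hat:.3f}  naive post-period gap={naive_post_gap:.3f}")
```

The double difference removes both the fixed group gap and the common shock; the naive post-period gap keeps the group gap and therefore overstates the effect.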

Demo 03 — RDD

figures/demo_03_figure.png shows the data, the local-linear fit on each side of the cutoff, and a zoom at the cutoff. The planted effect is τ = 2.0 at the cutoff c = 0. The local-linear estimate recovers it, while the degree-4 global-polynomial estimate is biased.
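A sharp-RDD local-linear estimate amounts to two kernel-weighted regressions, one per side of the cutoff, compared at the cutoff itself. The sketch below assumes a hand-picked bandwidth h = 0.3 rather than the Imbens-Kalyanaraman bandwidth the demo uses, and the DGP and function names are hypothetical.

```python
import random


def triangular_weight(x, cutoff, h):
    """Triangular kernel: weight falls linearly to zero at distance h."""
    return max(0.0, 1.0 - abs(x - cutoff) / h)


def boundary_intercept(xs, ys, cutoff, h):
    """Weighted least squares of y on (x - cutoff); returns the fitted value at the cutoff."""
    pts = [(x - cutoff, y, triangular_weight(x, cutoff, h)) for x, y in zip(xs, ys)]
    pts = [(x, y, w) for x, y, w in pts if w > 0]
    sw = sum(w for _, _, w in pts)
    mx = sum(w * x for x, _, w in pts) / sw
    my = sum(w * y for _, y, w in pts) / sw
    sxy = sum(w * (x - mx) * (y - my) for x, y, w in pts)
    sxx = sum(w * (x - mx) ** 2 for x, _, w in pts)
    return my - (sxy / sxx) * mx


def rdd_local_linear(xs, ys, cutoff=0.0, h=0.3):
    """Jump at the cutoff: right-side intercept minus left-side intercept."""
    left = [(x, y) for x, y in zip(xs, ys) if x < cutoff]
    right = [(x, y) for x, y in zip(xs, ys) if x >= cutoff]
    a_left = boundary_intercept([x for x, _ in left], [y for _, y in left], cutoff, h)
    a_right = boundary_intercept([x for x, _ in right], [y for _, y in right], cutoff, h)
    return a_right - a_left


# Hypothetical DGP: smooth quadratic trend plus a planted jump of 2.0 at x = 0.
rng = random.Random(2)
xs = [rng.uniform(-1, 1) for _ in range(5000)]
ys = [1 + 0.8 * x + 0.5 * x * x + 2.0 * (x >= 0) + rng.gauss(0, 0.5) for x in xs]
tau_hat = rdd_local_linear(xs, ys)
print(f"local-linear RDD estimate: {tau_hat:.3f}")
```

Only observations within h of the cutoff receive weight, which is why curvature far from the cutoff cannot distort the estimate the way it can in a global polynomial fit.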

Demo 04 — PSM

figures/demo_04_figure.png shows the propensity score overlap (common support) and the Love plot (covariate balance before and after matching). The standardized mean difference (SMD) for every covariate drops below 0.1 after matching.
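The balance metric behind that plot is easy to compute by hand. The sketch below uses one common definition of the standardized mean difference (difference in means over the square root of the average of the two group variances); the toy ages and the "matched" subset are made-up values for illustration, not output from the demo.

```python
import math
from statistics import mean, variance


def smd(treated, control):
    """Standardized mean difference with the averaged-variance denominator."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical covariate (age) for the treated group and the full control pool.
treated_age = [45, 50, 55, 60]
control_pool_age = [25, 30, 35, 40, 44, 51, 55, 61]
# Hypothetical result of matching: the four controls closest to the treated units.
matched_control_age = [44, 51, 55, 61]

smd_before = smd(treated_age, control_pool_age)
smd_after = smd(treated_age, matched_control_age)
print(f"SMD before={smd_before:.3f}  after={smd_after:.3f}")
```

A balance check would apply `smd` to every covariate and flag any whose absolute value stays above 0.1 after matching.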

Demo 05 — DML + Synthetic Control

figures/demo_05_figure.png shows the SC time series (treated vs synthetic, with the treatment effect shaded) + the DML CI (the plug-in estimator is biased; DML with cross-fitting recovers the true θ).
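The cross-fitting step on the DML side can be sketched without any ML library. The code below is a deliberately simplified stand-in, not the demo's implementation: it assumes a single covariate X on [0, 1), uses a binned-mean regressor (`fit_binned_mean`, a hypothetical helper) in place of a real learner, and plants θ = 1.0 in a partially linear model. Nuisance functions are fit on the other folds, residuals are formed on the held-out fold, and θ comes from regressing outcome residuals on treatment residuals.

```python
import math
import random


def fit_binned_mean(x, target, n_bins=20):
    """Toy nuisance learner: predict the training mean of `target` within each X bin."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for xi, ti in zip(x, target):
        b = min(int(xi * n_bins), n_bins - 1)
        sums[b] += ti
        counts[b] += 1
    overall = sum(target) / len(target)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda xi: means[min(int(xi * n_bins), n_bins - 1)]


def cross_fit_theta(x, d, y, n_folds=5):
    """Partialling-out estimate of theta with out-of-fold nuisance predictions."""
    n = len(x)
    d_res, y_res = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        held_out = [i for i in range(n) if i % n_folds == k]
        x_train = [x[i] for i in train]
        m_hat = fit_binned_mean(x_train, [d[i] for i in train])  # E[D | X]
        l_hat = fit_binned_mean(x_train, [y[i] for i in train])  # E[Y | X]
        for i in held_out:
            d_res[i] = d[i] - m_hat(x[i])
            y_res[i] = y[i] - l_hat(x[i])
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)


# Hypothetical DGP: Y = theta * D + g(X) + eps, D = m(X) + v, with theta = 1.0.
rng = random.Random(3)
x = [rng.random() for _ in range(4000)]
d = [xi ** 2 + rng.gauss(0, 1) for xi in x]
y = [1.0 * di + math.sin(2 * math.pi * xi) + rng.gauss(0, 1) for xi, di in zip(x, d)]
theta_hat = cross_fit_theta(x, d, y)
print(f"cross-fitted theta estimate: {theta_hat:.3f}")
```

Swapping the binned-mean learner for a flexible model is where cross-fitting matters most, since in-sample predictions from an overfit learner would contaminate the residuals.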


How to study this

  1. Read SKILL.md §0 — get oriented, set up the tool stack.
  2. Read §1 (IV/2SLS) and run demo_01. The first-stage F is the single most important diagnostic.
  3. Read §2 (DiD) and run demo_02. The TWFE pathology is the most important practical lesson in this whole skill.
  4. Read §3 (RDD) and run demo_03. The local-linear estimator with a triangular kernel is the modern default.
  5. Read §4 (PSM) and run demo_04. Balance is the goal, not prediction accuracy.
  6. Read §5 (SC) and §6 (DML), run demo_05 (which covers both).
  7. Self-assess with §7 rubric.
  8. Read the papers in §8 (15 entries, organized by topic).
  9. Open REFINEMENT.md to see what's still missing (Callaway-Sant'Anna demo, McCrary test, AIPW, mediation, sensitivity analysis, etc.).

A useful study cadence: 1 § per evening, run the corresponding demo, then close the laptop and write down — without looking — what the central estimator was and what its failure modes are.


How to apply to a real project

For a marketing A/B test with endogeneity concerns:

# 1. Diagnose the endogeneity source
# Is D randomly assigned? If yes → OLS is fine. If no → continue.

# 2. IV / 2SLS
from linearmodels.iv import IV2SLS
formula = "Y ~ 1 + controls + [D ~ instrument]"
fit = IV2SLS.from_formula(formula, df).fit(cov_type="robust")
# REPORT first-stage F (Staiger-Stock: F > 10)

# 3. DiD (if staggered rollout)
from linearmodels.panel import PanelOLS
panel = df.set_index(["geo", "date"])
fit = PanelOLS(panel["Y"], panel[["D"]], entity_effects=True, time_effects=True).fit(cov_type="clustered", cluster_entity=True)
# BEWARE: TWFE pathology under heterogeneous effects — use csdid for staggered

# 4. PSM (if selection on observables)
from sklearn.linear_model import LogisticRegression
pscore = LogisticRegression().fit(df[confounders], df["D"]).predict_proba(df[confounders])[:, 1]
# Match, check SMD < 0.1, estimate ATT with Abadie-Imbens SE

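# 5. DML (if high-dimensional controls)
from econml.dml import LinearDML
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
est = LinearDML(
    model_y=GradientBoostingRegressor(),
    model_t=GradientBoostingClassifier(),
    discrete_treatment=True,  # needed because model_t is a classifier (binary D)
    cv=5,
)
# Nuisance controls go in W; X is reserved for effect-heterogeneity features
est.fit(df["Y"], df["D"], X=None, W=df[high_dim_controls])
# Standard errors are NOT bootstrapped: use the closed-form from OLS on residuals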

For what to do when each assumption fails, see SKILL.md §1.5 (LATE interpretation), §2.4 (threats to parallel trends), §3.5 (manipulation), §4.5 (PSM doesn't solve unobserved confounding), §6.5 (DML doesn't find instruments).


Repository layout

causal-inference/
├── README.md             # this file
├── SKILL.md              # 440-line PhD curriculum
├── INDEX.md              # topic-to-section navigation
├── REFINEMENT.md         # known gaps, versioned checklist
├── LICENSE               # MIT
├── requirements.txt
├── .gitignore
├── references/
│   └── statsmodels-linearmodels-api-quirks.md
├── demos/
│   ├── demo_01_iv_2sls.py
│   ├── demo_02_did.py
│   ├── demo_03_rdd.py
│   ├── demo_04_psm.py
│   ├── demo_05_dml_synthetic.py
│   └── statsmodels_helpers.py
└── figures/
    ├── demo_01_figure.png
    ├── demo_02_figure.png
    ├── demo_03_figure.png
    ├── demo_04_figure.png
    └── demo_05_figure.png

Why this exists

A working causal-inference curriculum needs four things: precise mathematical statements, runnable code that exercises the math, a list of the failure modes that don't show up in textbooks, and the synthesis skill of selecting the right estimator for a given problem. This repo has all four. It is intentionally a curriculum, not a library — the demos are designed to be read and modified, not imported as a package.

The parent curriculum (econometrics-deep-research) covers all five stages end to end at survey depth; this deep-dive is the natural next stop for anyone applying the techniques to a real dataset with endogeneity concerns.


License

MIT. See LICENSE. Demos may be reused as starting points for your own analyses; attribution appreciated.

