StatsOtter

Heterogeneous treatment effects with a causal forest (GRF recipe)

The full GRF HTE playbook: cross-fit nuisances → causal forest → calibration → AIPW ATE → BLP → RATE → policy.

Difference-in-differences with multiple periods (did)

Staggered-adoption DiD done right: group-time ATT(g,t) → event-study / group / calendar aggregations, with honest pre-trends.

Double machine learning for the 401(k) effect (DoubleML)

Effect of 401(k) eligibility on net assets via PLR / IRM / IIVM with cross-fit ML nuisances — four learners, one honest comparison.

Model, identify, estimate, refute — the DoWhy four-step recipe (DoWhy)

Make your assumptions explicit: draw a causal graph, identify the estimand by the backdoor criterion, estimate it, then actively try to refute it with placebo and confounding tests.

Synthetic control, the tidy way — weights, gaps and placebo inference (tidysynth)

Build a synthetic version of the treated unit from a convex blend of donors, read the treated-minus-synthetic gap, and test it against placebos run on every donor.

Mendelian randomization: genes as instruments for a causal effect (TwoSampleMR)

Use genetic variants as instruments to estimate the causal effect of an exposure on an outcome from GWAS summary data — with IVW plus pleiotropy-robust MR-Egger and weighted-median checks.

An introduction to GRF (getting started)

A minimal first-contact recipe: regression forest, quantile forest, and a causal forest on the same data.

Assessing heterogeneity with RATE (AUTOC & Qini)

Causal forest → train/eval split → RATE with both AUTOC and Qini → TOC plot.

Heterogeneous effects with causal-forest double ML (EconML)

Double machine learning with a forest final stage: partial out nuisance with flexible learners, then read the conditional effect τ(x) — with valid confidence intervals.

Honest sensitivity bounds for parallel-trends violations (HonestDiD)

Stop betting everything on a pre-trends test. Allow the post-treatment trend to deviate within a transparent class, and report the confidence set — and the breakdown value where the effect would vanish.

Matching for causal inference (MatchIt)

Preprocesses observational data by matching treated and control units on covariates, so downstream models depend less on modeling assumptions.

Bayesian regression discontinuity with credible intervals (CausalPy)

Fit a model on each side of the cutoff, put a posterior on the jump, and report a credible interval for the discontinuity — plus an honest look at how it moves with the bandwidth.

Event-study DiD with Sun & Abraham (fixest)

Fast fixed-effects event study that survives staggered timing — sunab() vs naive TWFE, plotted against the truth.

Design-based estimators done fast (estimatr)

Lin's covariate-adjusted estimator and Neyman/HC2 robust standard errors for randomized experiments — one fast function, design-based inference.

Design & analyze randomized experiments (experiment)

Randomize treatment (complete, blocked, cluster) and estimate average effects with design-based variance — including cluster-randomized trials.

Draw the DAG, find the adjustment set (ggdag & dagitty)

Before any estimation: encode your assumptions as a causal graph, enumerate the backdoor paths from treatment to outcome, and let the graph hand you the minimal set of covariates to adjust for.

Sensitive questions, protected answers (rr)

Regression for randomized-response surveys — recover predictors of a sensitive behavior while every respondent's individual answer stays private.

Two-stage difference-in-differences (did2s)

Gardner's 2-stage estimator for staggered DiD: residualize on the untreated, then estimate the event study — fast and timing-robust.

Uplift modelling with S-, T-, X- and R-learners (CausalML)

Estimate who responds, not just the average: fit a family of meta-learners for the CATE, pick the best by validation error, then rank and target with an uplift curve.

Did someone manipulate the cutoff? (rddensity)

The standard manipulation test for RD designs: checks whether the running variable's density jumps at the cutoff (a sign that units sorted around it).

Causal forest with time-to-event data (survival)

Censoring check → causal survival forest → RMST-scale AIPW ATE → calibration → report.

Causal mediation analysis (mediation)

Decomposes a treatment effect into the part transmitted through a mediator (ACME) and the rest (direct effect), with sensitivity analysis.

Who responds to treatment? (FindIt)

Find which subgroups respond to a treatment and estimate causal interactions in factorial / conjoint experiments via a LASSO-regularized search.

Bayesian principal stratification (PStrata)

Bayesian mixture modeling for principal stratification: causal effects within latent strata (e.g. compliers) under post-treatment confounding.

Simulation-based inference for any model (clarify)

Turn any fitted model into interpretable quantities of interest — average marginal effects, predictions, contrasts — with simulation-based confidence intervals.

PS weighting for survival outcomes (PSsurvival)

Propensity-score balancing weights for time-to-event outcomes: counterfactual survival curves, survival differences, and marginal hazard ratios.

Is your counterfactual an extrapolation? (WhatIf)

Flags when a counterfactual question is a safe interpolation versus a model-dependent extrapolation far from your data.

Propensity-score weighting (PSweight)

A full design-and-analysis platform for causal effects via balancing weights (overlap, IPW, ATT, matching, entropy) for binary and multiple treatments.

Confounder-adjusted survival curves for a treatment (adjustedCurves)

Compare survival between treatment groups after removing confounding — via IPTW, the g-formula or AIPW — instead of a raw Kaplan-Meier that quietly bakes in selection.

Synthetic control for comparative case studies (Synth)

Build a weighted 'synthetic' control from untreated units to estimate the effect of a single treated case over time.

Regression discontinuity, done right (rdrobust)

Estimate causal effects at a cutoff with data-driven optimal bandwidths and bias-corrected, robust confidence intervals.

Random assignment by design (randomizr)

Reproducible random assignment — simple, complete, block, cluster, stratified — with the exact assignment probabilities design-based inference needs.

Sensitivity analysis for unobserved confounding (sensemakr)

Don't just assume no unobserved confounding — quantify it: robustness value + contour plots benchmarked against your real covariates.

The balance–sample-size frontier (MatchingFrontier)

Trace the whole tradeoff curve between covariate balance and how many units you keep, then estimate the effect at every point on the frontier.

Predicting race/ethnicity from name and geography (wru)

"Who Are You?" predicts an individual's probable race/ethnicity from surname, first/middle name, and geolocation using Bayesian (BISG) updating.
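
The Bayesian updating step behind BISG can be sketched in a few lines. This is a minimal illustration, not the wru API: the function name `bisg_posterior`, the group labels, and the probabilities are all hypothetical placeholders, and the sketch assumes surname and location are independent given group membership.

```python
def bisg_posterior(p_group_given_surname, p_geo_given_group):
    """Combine a surname-based prior with a geography likelihood via Bayes' rule.

    p_group_given_surname: dict group -> P(group | surname)   (the prior)
    p_geo_given_group:     dict group -> P(location | group)  (the likelihood)
    Assumes surname and location are conditionally independent given group.
    """
    unnormalized = {
        g: p_group_given_surname[g] * p_geo_given_group[g]
        for g in p_group_given_surname
    }
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("posterior is undefined: all products are zero")
    return {g: v / total for g, v in unnormalized.items()}


# Hypothetical toy inputs, made up for illustration only.
prior = {"group_a": 0.5, "group_b": 0.5}
likelihood = {"group_a": 0.2, "group_b": 0.6}
posterior = bisg_posterior(prior, likelihood)
print(posterior)
```

The geography term shifts the surname-only prior toward whichever group is more concentrated in the person's location.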

Sensitivity analysis & the E-value (Ding & VanderWeele)

Reports how strong an unmeasured confounder would have to be, on the risk-ratio scale, to fully explain away an observed association.
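
As a rough sketch of the calculation, the E-value for a risk ratio RR > 1 is RR + sqrt(RR × (RR − 1)), with protective estimates inverted first. The function names below are illustrative, not taken from any package.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)) after orienting RR >= 1."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1.0:
        rr = 1.0 / rr  # protective associations are inverted first
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_ci_limit(rr: float, lower: float, upper: float) -> float:
    """E-value for the confidence limit closest to the null (1 if the CI covers 1)."""
    if lower <= 1.0 <= upper:
        return 1.0
    return e_value(lower) if rr > 1.0 else e_value(upper)


print(round(e_value(2.0), 3))  # 2 + sqrt(2)
```

Read the result as the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both treatment and outcome to move the estimate to the null.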

Multiple imputation of missing data (Amelia)

Fills in missing values via fast bootstrap-EM multiple imputation, producing several complete datasets you analyze and combine.
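
The "analyze and combine" step is usually done with Rubin's rules. The sketch below shows only that combining step, under the assumption that each completed dataset has already produced a point estimate and its squared standard error; `pool_rubin` is an illustrative name, not an Amelia function.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their sampling variances.

    Returns (pooled estimate, total variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)      # average sampling variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what carries the extra uncertainty due to the missing data; dropping it would understate the standard error.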

Goodman-Bacon decomposition: what your TWFE estimate is averaging (bacondecomp)

A two-way fixed-effects DiD is a weighted average of all possible 2×2 comparisons — including 'forbidden' ones that use already-treated units as controls. This shows you the weights.

Bias-corrected nearest-neighbor matching (Matching)

Match treated and control units on covariates, then bias-correct and get the correct large-sample standard errors.

Entropy balancing for covariate overlap (ebal)

Reweight controls to exactly match the treated group's covariate moments, achieving balance without iterative propensity-score tweaking.

IV with heterogeneous effects: the LATE (ivreg)

Two-stage least squares for instrumental-variables regression, with the modern LATE interpretation and rich diagnostics.

Coarsened exact matching (CEM)

Temporarily coarsens each covariate into bins, exact-matches treated and controls within bins, then estimates effects on the matched data.
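
The coarsen, match, estimate sequence can be sketched as follows. This is a simplified illustration that assumes numeric covariates and user-supplied cutpoints; the helper names are hypothetical, and the CEM package's automatic binning and weighting options are not reproduced.

```python
from collections import defaultdict
from statistics import mean


def coarsen(value, cutpoints):
    """Bin index of value: the number of cutpoints it reaches or exceeds."""
    return sum(value >= c for c in cutpoints)


def cem_att(rows, cutpoints):
    """rows: (treated flag, outcome, covariate tuple); cutpoints: one list per covariate.

    Returns (ATT on the matched sample, matched treated count, matched control count).
    """
    strata = defaultdict(lambda: {"t": [], "c": []})
    for treated, y, x in rows:
        key = tuple(coarsen(v, cuts) for v, cuts in zip(x, cutpoints))
        strata[key]["t" if treated else "c"].append(y)
    # Keep only strata containing both treated and control units.
    matched = [s for s in strata.values() if s["t"] and s["c"]]
    n_t = sum(len(s["t"]) for s in matched)
    n_c = sum(len(s["c"]) for s in matched)
    if n_t == 0:
        raise ValueError("no stratum contains both treated and control units")
    # Stratum-level differences, weighted by each stratum's share of matched treated units.
    att = sum(len(s["t"]) / n_t * (mean(s["t"]) - mean(s["c"])) for s in matched)
    return att, n_t, n_c


# Hypothetical toy data: one covariate (age) with cutpoints at 30 and 50.
rows = [
    (1, 10, (25,)), (0, 8, (28,)),
    (1, 20, (40,)), (0, 15, (45,)), (0, 17, (35,)),
    (1, 30, (70,)),  # no control in this bin, so the unit is pruned
]
att, n_t, n_c = cem_att(rows, [[30, 50]])
print(att, n_t, n_c)
```

The pruned treated unit changes the estimand to the effect among matched treated units, which is worth reporting alongside the estimate.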

Evaluating a causal forest fit

Did the forest actually capture treatment-effect heterogeneity? Calibration → variable importance → BLP → omnibus tests.

Y_i = Z_i\,Y_i(1) + (1-Z_i)\,Y_i(0),\qquad \tau=\mathbb{E}[\,Y_i(1)-Y_i(0)\,]

Design & diagnose a randomized experiment (DeclareDesign)

Specify a study as model–inquiry–data–answer, simulate it, and read its diagnosands — bias, power, coverage — before you run it.

Analyzing list / item-count experiments (list)

Multivariate regression for list (item-count) experiments, recovering the prevalence and predictors of a sensitive attitude without asking about it directly.
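
Before any regression, the basic prevalence estimate in a list experiment is a difference in mean item counts: the treatment group sees the baseline items plus the sensitive item, the control group sees only the baseline items. A minimal sketch, with an illustrative function name and made-up counts (this is the simple estimator, not the multivariate model the list package fits):

```python
from statistics import mean, variance


def list_prevalence(control_counts, treatment_counts):
    """Difference-in-means prevalence estimate and its standard error."""
    estimate = mean(treatment_counts) - mean(control_counts)
    se = (
        variance(treatment_counts) / len(treatment_counts)
        + variance(control_counts) / len(control_counts)
    ) ** 0.5
    return estimate, se


# Hypothetical toy counts, made up for illustration only.
estimate, se = list_prevalence([1, 2, 2, 1], [2, 2, 3, 2])
print(estimate, se)
```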

Group & conditional effects with DoubleML (GATE / CATE)

Slice the average effect: Group Average Treatment Effects and a CATE surface from a debiased IRM, with simultaneous confidence bands.

Matching for panel / time-series cross-sectional data (PanelMatch)

Matches treated unit-periods to controls with identical recent treatment histories, then applies a difference-in-differences estimator for TSCS data.

Randomization inference, packaged (ri2)

Exact Fisher randomization tests and sharp-null confidence intervals for any randomization scheme — the packaged version of the design-based test.
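
For a small completely randomized experiment, the exact test can be written by enumerating every possible assignment. The sketch below assumes complete randomization with a fixed number of treated units and uses the difference in means as the test statistic; it illustrates the idea and does not use the ri2 interface.

```python
from itertools import combinations


def diff_in_means(y, treated_idx):
    """Mean outcome of the treated indices minus mean outcome of the rest."""
    t = [y[i] for i in treated_idx]
    c = [y[i] for i in range(len(y)) if i not in treated_idx]
    return sum(t) / len(t) - sum(c) / len(c)


def fisher_exact_p(y, z):
    """Two-sided exact p-value under the sharp null of no effect for any unit."""
    n = len(y)
    treated = frozenset(i for i in range(n) if z[i] == 1)
    observed = diff_in_means(y, treated)
    # Under the sharp null the outcomes are fixed, so only the labels change.
    draws = [
        diff_in_means(y, frozenset(combo))
        for combo in combinations(range(n), len(treated))
    ]
    extreme = sum(abs(d) >= abs(observed) - 1e-12 for d in draws)
    return observed, extreme / len(draws)


# Hypothetical toy data: 6 units, 3 treated.
observed, p_value = fisher_exact_p([5, 6, 7, 1, 2, 3], [1, 1, 1, 0, 0, 0])
print(observed, p_value)
```

Full enumeration grows combinatorially, so larger experiments typically sample assignments from the randomization scheme instead.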

Causal forests for heterogeneous effects (grf)

Generalized random forests that estimate conditional average treatment effects τ(x) non-parametrically, with valid confidence intervals.

Ecological inference (ei)

Infers individual-level behavior from aggregate (district-level) data, the classic example being voting rates by race from precinct totals.
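
A useful starting point is the deterministic method of bounds, which King's ecological inference approach combines with a statistical model. The sketch below computes only those bounds for one precinct; the function and argument names are illustrative.

```python
def group_rate_bounds(group_share, overall_rate):
    """Bounds on a group's rate implied by one precinct's aggregate totals.

    group_share:  fraction of the precinct's population in the group (0 < share <= 1)
    overall_rate: fraction of the whole precinct showing the behavior (e.g. turnout)
    """
    if not 0.0 < group_share <= 1.0 or not 0.0 <= overall_rate <= 1.0:
        raise ValueError("shares and rates must be valid proportions")
    # Lowest rate: everyone outside the group shows the behavior.
    lower = max(0.0, (overall_rate - (1.0 - group_share)) / group_share)
    # Highest rate: all of the behavior comes from the group.
    upper = min(1.0, overall_rate / group_share)
    return lower, upper


# Hypothetical precinct: 60% of residents in the group, 50% overall turnout.
lower, upper = group_rate_bounds(0.6, 0.5)
print(lower, upper)
```

Bounds are tight when a precinct is nearly homogeneous and wide otherwise, which is why a model is needed to say more.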

Quantitative Social Science: data and code (qss)

The companion R package for Imai's textbook Quantitative Social Science, bundling every dataset and chapter vignette for hands-on data-analysis teaching.

Double machine learning in Python (econml)

A Python toolkit for heterogeneous treatment effects from observational data — double ML, doubly-robust, orthogonal forests, meta-learners.

Quantile treatment effects of 401(k) eligibility (DoubleML)

Beyond the average: how 401(k) eligibility shifts net financial assets across the whole wealth distribution, estimated orthogonally.

\tau_{\mathrm{RD}}=\lim_{x\downarrow c}\mathbb{E}[Y\mid X=x]-\lim_{x\uparrow c}\mathbb{E}[Y\mid X=x]

Sharp regression discontinuity with robust bias correction (rdrobust)

Identify the effect at a cutoff: a local-polynomial RD with an MSE-optimal bandwidth and robust, bias-corrected confidence intervals.
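
The estimand τ_RD is the gap between two boundary fits. The sketch below estimates it with a triangular-kernel local linear regression on each side of the cutoff. It assumes a bandwidth `h` fixed by hand and applies no bias correction or robust inference, so it illustrates the point estimate only and is not a substitute for rdrobust. All function names are illustrative.

```python
def local_linear_intercept(xs, ys, h):
    """Triangular-kernel weighted least squares of y on x; returns the fit at x = 0."""
    sw = sx = sxx = sy = sxy = 0.0
    for x, y in zip(xs, ys):
        w = max(0.0, 1.0 - abs(x) / h)  # triangular kernel weight
        sw += w
        sx += w * x
        sxx += w * x * x
        sy += w * y
        sxy += w * x * y
    denom = sw * sxx - sx * sx
    if denom <= 1e-12:
        raise ValueError("need at least two distinct points inside the bandwidth")
    slope = (sw * sxy - sx * sy) / denom
    return (sy - slope * sx) / sw


def sharp_rd(x, y, cutoff, h):
    """Right-limit fit minus left-limit fit at the cutoff (units at the cutoff are treated)."""
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if xi >= cutoff]
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if xi < cutoff]
    fit_right = local_linear_intercept(*zip(*right), h)
    fit_left = local_linear_intercept(*zip(*left), h)
    return fit_right - fit_left


# Hypothetical noiseless data with a jump of 2 at x = 0.
x = [i / 10 for i in range(-10, 11)]
y = [1 + 0.5 * xi + (2.0 if xi >= 0 else 0.0) for xi in x]
tau_rd = sharp_rd(x, y, cutoff=0.0, h=0.5)
print(tau_rd)
```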

Cross-fold validation of heterogeneity

K-fold cross-fitted CATEs → RATE on out-of-fold priorities → honest verdict on heterogeneity strength.

Policy learning via optimal decision trees

Causal forest → doubly-robust scores → policytree → evaluate policy value → plot the tree.

Covariate balance for matching & weighting (cobalt)

Before you trust an observational estimate, prove balance: SMDs, overlap, and a Love plot before vs after adjustment.

Smooth signals with a local linear forest

When the conditional mean is smooth: regression forest baseline → ll_regression_forest → tuning → diagnostics.

\big(\hat\tau^{\mathrm{sdid}},\hat\mu,\hat\alpha,\hat\beta\big)=\arg\min_{\tau,\mu,\alpha,\beta}\textstyle\sum_{i,t}\hat\omega_i\hat\lambda_t\big(Y_{it}-\mu-\alpha_i-\beta_t-W_{it}\tau\big)^2

Synthetic difference-in-differences (synthdid)

Reweight both control units and pre-periods to build a synthetic control, then apply a DiD correction — robust where plain SC or TWFE struggle.
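
With a block design (all treated units adopt at the same time), the weighted regression reduces to a weighted double difference once the weights are known. The sketch below takes the unit weights `omega` and pre-period weights `lam` as given inputs; estimating them is the job of synthdid and is not shown. The function name and toy panel are hypothetical.

```python
def sdid_given_weights(control, treated, n_pre, omega, lam):
    """Weighted double difference for a block-treatment panel.

    control, treated: lists of per-unit outcome series (pre periods first)
    n_pre:            number of pre-treatment periods
    omega:            weights over control units, summing to 1
    lam:              weights over pre-treatment periods, summing to 1
    """
    def delta(series):
        # Post-period average minus lambda-weighted pre-period average.
        post = series[n_pre:]
        pre = sum(l * v for l, v in zip(lam, series[:n_pre]))
        return sum(post) / len(post) - pre

    treated_change = sum(delta(s) for s in treated) / len(treated)
    control_change = sum(w * delta(s) for w, s in zip(omega, control))
    return treated_change - control_change


# Hypothetical toy panel: 2 controls, 1 treated unit, 3 pre periods, 1 post period.
control = [[1, 2, 3, 4], [2, 3, 4, 5]]
treated = [[3, 4, 5, 9]]
tau_sdid = sdid_given_weights(control, treated, 3, [0.5, 0.5], [0.0, 0.0, 1.0])
print(tau_sdid)
```

Setting `omega` and `lam` to uniform weights recovers plain DiD, which makes the role of the two reweightings easy to see.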

Learn an interpretable treatment policy (DoubleML policy tree)

Turn debiased CATEs into a rule: fit a shallow, readable decision tree that maximises the doubly-robust policy value.

Estimating ATEs on a new target population

Train a causal forest on the source sample → reweight AIPW to a target population → report transported ATE.

Causal mediation: natural direct & indirect effects (CMAverse)

Split a total effect into what flows through a mediator (indirect) and what doesn't (direct) — with a sensitivity analysis for mediator–outcome confounding.

Dose–response with average potential outcomes (DoubleML APO)

For a multi-valued or continuous treatment: estimate E[Y(d)] at each dose and the contrasts between them, all cross-fitted.

\tau_{\mathrm{LATE}}=\dfrac{\mathrm{Cov}(Y,Z)}{\mathrm{Cov}(D,Z)}
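
A minimal sketch of this ratio on toy data follows. With a binary instrument Z, Cov(Y,Z)/Cov(D,Z) equals the Wald ratio: the difference in mean outcomes across instrument arms divided by the difference in mean treatment take-up. The function names and the data are illustrative assumptions.

```python
def sample_cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def late_cov_ratio(y, d, z):
    """Cov(Y, Z) / Cov(D, Z)."""
    return sample_cov(y, z) / sample_cov(d, z)


def late_wald(y, d, z):
    """(E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0]) for a binary instrument."""
    def arm_mean(values, arm):
        picked = [v for v, zi in zip(values, z) if zi == arm]
        return sum(picked) / len(picked)

    reduced_form = arm_mean(y, 1) - arm_mean(y, 0)
    first_stage = arm_mean(d, 1) - arm_mean(d, 0)
    return reduced_form / first_stage


# Hypothetical toy data in which y = 1 + 2 * d exactly.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 0, 1, 1, 1, 1, 0]
y = [1, 1, 1, 3, 3, 3, 3, 1]
print(late_cov_ratio(y, d, z), late_wald(y, d, z))
```

Both functions divide by the first stage, so a weak instrument (a first stage near zero) makes the ratio unstable.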