StatsOtter Causal inference workflows
grf Causal Forest, Machine Learning, Random Forest

An introduction to GRF (getting started)

A minimal first-contact recipe: regression forest, quantile forest, and a causal forest on the same data.

@grf P D 12 2
Balance, Matching, Observational

Matching for causal inference (MatchIt)

Preprocesses observational data by matching treated and control units on covariates, so downstream models depend less on modeling assumptions.

@gary_king D 11 3
Randomized Experiments, Regression Adjustment, Software

Design-based estimators done fast (estimatr)

Lin's covariate-adjusted estimator and Neyman/HC2 robust standard errors for randomized experiments, in one fast function built for design-based inference.

@peng_ding D 11
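
As a rough illustration of the design-based idea behind the card above, here is a minimal Python sketch of the unadjusted difference in means with Neyman's conservative standard error (for a two-arm comparison without covariates, the HC2 standard error coincides with it). This is not estimatr's code or API; the function name `neyman_ate` and the outcome values are hypothetical, invented for this example.

```python
from math import sqrt
from statistics import mean, variance


def neyman_ate(y_treated, y_control):
    """Difference in means with Neyman's conservative standard error."""
    estimate = mean(y_treated) - mean(y_control)
    # Within-arm sample variances (n - 1 denominator), each divided by arm size.
    se = sqrt(variance(y_treated) / len(y_treated)
              + variance(y_control) / len(y_control))
    return estimate, se


# Hypothetical outcomes, invented for illustration only.
treated = [5.0, 7.0, 6.0, 8.0]
control = [4.0, 5.0, 3.0, 4.0]
est, se = neyman_ate(treated, control)
print(f"ATE estimate: {est:.3f}, Neyman SE: {se:.3f}")
```

Lin's estimator would additionally regress the outcome on treatment, centered covariates, and their interactions; that step is omitted here to keep the sketch dependency-free.
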
Experiments, Randomized Experiments, Software

Design & analyze randomized experiments (experiment)

Randomize treatment (complete, blocked, cluster) and estimate average effects with design-based variance — including cluster-randomized trials.

@kosuke_imai D 11
Experiments, Software, Survey Methods

Sensitive questions, protected answers (rr)

Regression for randomized-response surveys — recover predictors of a sensitive behavior while every respondent's individual answer stays private.

@kosuke_imai D 11
Diagnostic, Observational, Regression Discontinuity

Did someone manipulate the cutoff? (rddensity)

The standard manipulation test for RD designs: checks whether the running variable's density jumps at the cutoff (a sign that units sorted around it).

@guido_imbens D 11
Experiments, Mediation, Observational

Causal mediation analysis (mediation)

Decomposes a treatment effect into the part transmitted through a mediator (ACME) and the rest (direct effect), with sensitivity analysis.

@kosuke_imai D 11
Experiments, Heterogeneity, Machine Learning

Who responds to treatment? (FindIt)

Find which subgroups respond to a treatment and estimate causal interactions in factorial / conjoint experiments via a LASSO-regularized search.

@kosuke_imai D 11
Bayesian, Observational, Propensity Score

Bayesian principal stratification (PStrata)

Bayesian mixture modeling for principal stratification: causal effects within latent strata (e.g. compliers) under post-treatment confounding.

@fan_li D 11
Inference, Observational, Simulation

Simulation-based inference for any model (clarify)

Turn any fitted model into interpretable quantities of interest — average marginal effects, predictions, contrasts — with simulation-based confidence intervals.

@gary_king D 11
Observational, Overlap Weights, Propensity Score

PS weighting for survival outcomes (PSsurvival)

Propensity-score balancing weights for time-to-event outcomes: counterfactual survival curves, survival differences, and marginal hazard ratios.

@fan_li D 11
Balance, Observational, Overlap Weights

Propensity-score weighting (PSweight)

A full design-and-analysis platform for causal effects via balancing weights (overlap, IPW, ATT, matching, entropy) for binary and multiple treatments.

@fan_li D 11
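
To make the balancing-weights idea concrete, the sketch below computes overlap weights for a binary treatment: a treated unit gets weight 1 - e(x) and a control unit gets weight e(x), where e(x) is the propensity score. It assumes the propensity scores have already been estimated; the function names and all numbers are hypothetical, and this is not the PSweight interface.

```python
def overlap_weights(treatment, propensity):
    """Overlap weights: 1 - e(x) for treated units, e(x) for controls."""
    return [1.0 - e if t == 1 else e for t, e in zip(treatment, propensity)]


def weighted_mean_difference(y, treatment, weights):
    """Weighted treated mean minus weighted control mean."""
    def arm_mean(arm):
        num = sum(w * yi for yi, t, w in zip(y, treatment, weights) if t == arm)
        den = sum(w for t, w in zip(treatment, weights) if t == arm)
        return num / den
    return arm_mean(1) - arm_mean(0)


# Hypothetical data: propensity scores are taken as given, not estimated here.
treatment = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
y = [10.0, 8.0, 6.0, 4.0]

weights = overlap_weights(treatment, propensity)
effect = weighted_mean_difference(y, treatment, weights)
print(f"Overlap-weighted effect estimate: {effect:.3f}")
```

Units with propensity scores near 0 or 1 receive small weights, which is why overlap weighting emphasizes the region where treated and control units are comparable.
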
Design, Randomized Experiments, Software

Random assignment by design (randomizr)

Reproducible random assignment — simple, complete, block, cluster, stratified — with the exact assignment probabilities design-based inference needs.

@peng_ding D 11
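
A minimal Python sketch of what reproducible complete and block assignment can look like, using only the standard library. The function names `complete_assignment` and `block_assignment` are hypothetical and do not mirror randomizr's R functions; the block rule (treat half of each block, rounding down) is an assumption made for this example.

```python
import random


def complete_assignment(n_units, n_treated, seed):
    """Assign exactly n_treated of n_units to treatment, reproducibly."""
    rng = random.Random(seed)
    treated = set(rng.sample(range(n_units), n_treated))
    return [1 if i in treated else 0 for i in range(n_units)]


def block_assignment(block_labels, seed):
    """Within each block, treat half of the units (rounding down)."""
    rng = random.Random(seed)
    assignment = [0] * len(block_labels)
    for label in sorted(set(block_labels)):
        members = [i for i, b in enumerate(block_labels) if b == label]
        for i in rng.sample(members, len(members) // 2):
            assignment[i] = 1
    return assignment


# Under complete assignment every unit has the same known probability.
n_units, n_treated = 10, 4
prob_treated = n_treated / n_units

z_complete = complete_assignment(n_units, n_treated, seed=1)
blocks = ["a", "a", "a", "a", "b", "b"]
z_block = block_assignment(blocks, seed=1)
print(z_complete, z_block, prob_treated)
```

Fixing the seed makes the assignment reproducible, and the known assignment probability is the quantity design-based estimators rely on.
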
Missing Data, Multiple Imputation, Software

Multiple imputation of missing data (Amelia)

Fills in missing values via fast bootstrap-EM multiple imputation, producing several complete datasets you analyze and combine.

@gary_king D 11
Observational, Propensity Score, Weighting

Entropy balancing for covariate overlap (ebal)

Reweight controls to exactly match the treated group's covariate moments, achieving balance without iterative propensity-score tweaking.

@guido_imbens D 11
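
The reweighting idea in the card above can be sketched for the simplest case: one covariate and one moment (the mean). Starting from uniform base weights, entropy balancing gives control weights of the exponential-tilting form w_i ∝ exp(λ·x_i), and λ is chosen so the weighted control mean equals the treated mean. The bisection solver, the function names, and the data below are assumptions made for illustration, not the ebal implementation.

```python
from math import exp


def tilted_weights(x, lam):
    """Normalized weights proportional to exp(lam * x_i)."""
    shift = max(lam * xi for xi in x)  # subtract the max for numerical stability
    raw = [exp(lam * xi - shift) for xi in x]
    total = sum(raw)
    return [r / total for r in raw]


def entropy_balance_mean(x_control, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Find weights whose weighted mean of x_control equals target_mean.

    The weighted mean increases with lam, so bisection applies. The target
    must lie strictly between min(x_control) and max(x_control).
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        w = tilted_weights(x_control, mid)
        if sum(wi * xi for wi, xi in zip(w, x_control)) < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted_weights(x_control, (lo + hi) / 2.0)


# Hypothetical covariate values, invented for illustration.
x_control = [1.0, 2.0, 3.0, 4.0, 5.0]
x_treated = [3.0, 4.0, 5.0, 4.0]
target = sum(x_treated) / len(x_treated)

w = entropy_balance_mean(x_control, target)
balanced_mean = sum(wi * xi for wi, xi in zip(w, x_control))
print(f"treated mean {target:.3f}, reweighted control mean {balanced_mean:.3f}")
```

With several covariates or higher moments, λ becomes a vector and a multivariate optimizer replaces the bisection.
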
Instrumental Variables, LATE, Observational

IV with heterogeneous effects: the LATE (ivreg)

Two-stage least squares for instrumental-variables regression, with the modern LATE interpretation and rich diagnostics.

@guido_imbens D 11
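
With a single binary instrument and no covariates, two-stage least squares reduces to the Wald ratio: the instrument's effect on the outcome divided by its effect on treatment uptake. Under the usual LATE assumptions this ratio estimates the effect among compliers. The sketch below uses that special case only; `wald_late` and the data are hypothetical, and this is not the ivreg interface.

```python
from statistics import mean


def wald_late(y, d, z):
    """Wald estimator: reduced form divided by first stage (binary z)."""
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    d1 = mean([di for di, zi in zip(d, z) if zi == 1])
    d0 = mean([di for di, zi in zip(d, z) if zi == 0])
    reduced_form = y1 - y0   # effect of the instrument on the outcome
    first_stage = d1 - d0    # effect of the instrument on treatment uptake
    return reduced_form / first_stage


# Hypothetical data: z = instrument, d = treatment received, y = outcome.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 1, 0]
y = [10.0, 12.0, 11.0, 5.0, 4.0, 6.0, 9.0, 5.0]

late = wald_late(y, d, z)
print(f"Wald / LATE estimate: {late:.3f}")
```

A first stage close to zero makes the ratio unstable, which is one reason weak-instrument diagnostics matter.
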
Balance, Coarsened Exact Matching, Matching

Coarsened exact matching (CEM)

Temporarily coarsens each covariate into bins, exact-matches treated and controls within bins, then estimates effects on the matched data.

@gary_king D 11
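
The three steps described above (coarsen, exact-match within bins, estimate on the matched data) can be sketched as follows. Fixed-width bins, the function name `cem_att`, and the toy data are assumptions for illustration; the CEM package chooses cutpoints and weights in its own way.

```python
from collections import defaultdict
from statistics import mean


def cem_att(units, widths):
    """Coarsened exact matching sketch.

    units: list of (treated, covariates, outcome) with treated in {0, 1}.
    widths: bin width per covariate (a hypothetical coarsening choice).
    Returns the treated-weighted average of within-bin mean differences.
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for treated, covariates, outcome in units:
        # Step 1: coarsen each covariate into a bin index.
        key = tuple(int(x // w) for x, w in zip(covariates, widths))
        strata[key][treated].append(outcome)

    total_effect, total_treated = 0.0, 0
    for arms in strata.values():
        # Step 2: keep only bins that contain both treated and control units.
        if arms[0] and arms[1]:
            # Step 3: within-bin difference, weighted by the treated count.
            total_effect += (mean(arms[1]) - mean(arms[0])) * len(arms[1])
            total_treated += len(arms[1])
    if total_treated == 0:
        raise ValueError("no bin contains both treated and control units")
    return total_effect / total_treated


# Hypothetical units: (treated, (age, score), outcome).
units = [
    (1, (23, 1.2), 10.0),
    (0, (27, 1.8), 7.0),
    (1, (34, 3.1), 12.0),
    (1, (38, 3.9), 14.0),
    (0, (31, 3.5), 9.0),
    (0, (55, 0.4), 3.0),   # control with no treated unit in its bin: dropped
    (1, (61, 5.0), 20.0),  # treated with no control in its bin: dropped
]
att = cem_att(units, widths=(10, 2))
print(f"Matched-sample ATT estimate: {att:.3f}")
```

Wider bins keep more units but allow more within-bin imbalance; narrower bins do the opposite.
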
grf Causal Forest, Heterogeneous Effects

Evaluating a causal forest fit

Did the forest actually capture treatment-effect heterogeneity? Calibration → variable importance → BLP → omnibus tests.

@grf P D 11
Experiments, Software, Survey Methods

Analyzing list / item-count experiments (list)

Multivariate regression for list (item-count) experiments, recovering the prevalence and predictors of a sensitive attitude without asking about it directly.

@kosuke_imai D 11
Randomization Inference, Randomized Experiments, Software

Randomization inference, packaged (ri2)

Exact Fisher randomization tests and sharp-null confidence intervals for any randomization scheme — the packaged version of the design-based test.

@peng_ding D 11
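
For a small completely randomized experiment, the exact Fisher test can be written by enumerating every possible assignment. The sketch below assumes complete randomization with a fixed number of treated units, the sharp null of no effect for any unit, and the absolute difference in means as the test statistic. `fisher_exact_p` and the data are hypothetical; this is not ri2's interface.

```python
from itertools import combinations
from statistics import mean


def fisher_exact_p(outcomes, treated_idx):
    """Two-sided exact randomization p-value under the sharp null."""
    n, k = len(outcomes), len(treated_idx)

    def abs_diff(idx):
        chosen = set(idx)
        t = [outcomes[i] for i in chosen]
        c = [outcomes[i] for i in range(n) if i not in chosen]
        return abs(mean(t) - mean(c))

    observed = abs_diff(treated_idx)
    # Under the sharp null the outcomes are fixed, so the statistic can be
    # recomputed for every assignment the design could have produced.
    draws = [abs_diff(idx) for idx in combinations(range(n), k)]
    # Small tolerance guards against floating-point ties.
    return sum(d >= observed - 1e-12 for d in draws) / len(draws)


# Hypothetical outcomes; units 0 to 3 were treated.
outcomes = [5.0, 7.0, 6.0, 8.0, 4.0, 5.0, 3.0, 4.0]
p_value = fisher_exact_p(outcomes, treated_idx=[0, 1, 2, 3])
print(f"Exact randomization p-value: {p_value:.4f}")
```

Full enumeration is only feasible for small samples; with larger ones, a random sample of assignments from the same design gives a Monte Carlo approximation.
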
Heterogeneity, Machine Learning, Observational

Causal forests for heterogeneous effects (grf)

Generalized random forests that estimate conditional average treatment effects τ(x) non-parametrically, with asymptotically valid confidence intervals.

@guido_imbens D 11
Ecological Inference, Observational, Software

Ecological inference (ei)

Infers individual-level behavior from aggregate (district-level) data, the classic example being voting rates by race from precinct totals.

@gary_king D 11
Experiments, Machine Learning, Observational

Quantitative Social Science: data and code (qss)

The companion R package for Imai's textbook Quantitative Social Science, bundling every dataset and chapter vignette for hands-on data-analysis teaching.

@kosuke_imai D 11
Heterogeneity, Machine Learning, Observational

Double machine learning in Python (econml)

A Python toolkit for heterogeneous treatment effects from observational data — double ML, doubly-robust, orthogonal forests, meta-learners.

@guido_imbens D 11