StatsOtter Causal inference workflows
Workflow·4 steps

Bias-corrected nearest-neighbor matching (Matching)

Summary by StatsOtter

Match treated and control units on covariates, apply the Abadie-Imbens bias correction for inexact matches, and report analytic large-sample standard errors that are valid for matching estimators.

1

Input · what goes in

A treatment indicator Tr, an outcome Y, and a matrix of covariates X (or estimated propensity scores).

Y     Tr  age  educ
12.0  1   34   12
9.5   0   31   11
14.2  1   45   16
8.8   0   29   10
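As a minimal sketch of how the example rows above map onto the three inputs, the snippet below builds Y, Tr, and X in base R. The data frame name `dat` is only an illustration; the four rows are copied from the table and are far too few for a real analysis.

```r
# Toy rows copied from the example table; `dat` is an illustrative name
dat <- data.frame(
  Y    = c(12.0, 9.5, 14.2, 8.8),
  Tr   = c(1, 0, 1, 0),
  age  = c(34, 31, 45, 29),
  educ = c(12, 11, 16, 10)
)

Y  <- dat$Y                               # outcome vector
Tr <- dat$Tr                              # treatment indicator (1 = treated)
X  <- as.matrix(dat[, c("age", "educ")])  # covariate matrix, one row per unit

# Basic shape checks before handing the objects to an estimator
stopifnot(length(Y) == nrow(X), length(Tr) == nrow(X), all(Tr %in% c(0, 1)))
```

Keeping the three objects row-aligned matters because units are identified purely by position.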
2

Pipeline · the recipe


1
Data prep

Fit the propensity score model

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Estimate the propensity score with a logit of treatment on the LaLonde covariates (Dehejia-Wahba specification).

Formula
e(X_i) = \Pr(T_i = 1 \mid X_i)
Reads from the input data · Feeds into #2
Key code
# Install:  install.packages("Matching")
library(Matching)
data(lalonde)
glm1 <- glm(treat ~ age + I(age^2) + educ + I(educ^2) + black +
              hisp + married + nodegr + re74 + I(re74^2) +
              re75 + I(re75^2) + u74 + u75,
            family = binomial, data = lalonde)
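A quick way to see what the fitted model gives you is to pull out the estimated scores and compare their distribution across the two arms. The sketch below refits the same logit so it runs on its own; the `common` range is an informal overlap check added here for illustration, not something the package computes for you.

```r
library(Matching)   # provides the lalonde data used on this page
data(lalonde)

# Same logit specification as in the step above
glm1 <- glm(treat ~ age + I(age^2) + educ + I(educ^2) + black +
              hisp + married + nodegr + re74 + I(re74^2) +
              re75 + I(re75^2) + u74 + u75,
            family = binomial, data = lalonde)

ps <- fitted(glm1)   # estimated e(X_i), one value per row of lalonde

# Score distribution by treatment arm; little shared range would
# signal an overlap problem
print(tapply(ps, lalonde$treat, summary))

# Informal common-support range: scores covered by both arms
common <- c(max(tapply(ps, lalonde$treat, min)),
            min(tapply(ps, lalonde$treat, max)))
print(common)
```

If the matching step is meant to use the score rather than the raw covariates, `ps` is the object to pass as X.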

2
Estimation

Bias-adjusted nearest-neighbor matching

The core estimate — where the causal quantity itself is computed.

What happens here

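Match each treated unit to its nearest control (M = 1, with replacement, which is the Match default) and turn on the Abadie-Imbens bias adjustment (BiasAdjust = TRUE). Note that the code below matches on the raw covariate matrix, not on the fitted propensity score from step 1; to match on the score instead, pass X = glm1$fitted.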

Formula
\hat\tau_{ATT} = \frac{1}{N_1}\sum_{i:T_i=1}\big(Y_i - \hat Y_i(0)\big)
Reads from #1 · Feeds into #3
Key code
X  <- cbind(lalonde$age, lalonde$educ, lalonde$black, lalonde$hisp,
            lalonde$married, lalonde$nodegr, lalonde$re74, lalonde$re75)
Y  <- lalonde$re78
Tr <- lalonde$treat
m  <- Match(Y = Y, Tr = Tr, X = X, M = 1, BiasAdjust = TRUE, estimand = "ATT")
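To make the formula concrete, here is a hand-rolled base-R sketch of one-nearest-neighbor matching with a regression bias adjustment on simulated data. It is an illustration of the idea only, under simplifying assumptions (a single covariate, a linear outcome model fitted on the matched controls, ties broken by first match); it is not the package's internal code, and all variable names are made up for the example.

```r
set.seed(42)
n <- 400
d <- data.frame(x = rnorm(n))
d$tr <- rbinom(n, 1, plogis(0.5 * d$x))     # treatment probability depends on x
d$y  <- 1 + 2 * d$x + 3 * d$tr + rnorm(n)   # simulated true effect is 3

treated  <- which(d$tr == 1)
controls <- which(d$tr == 0)

# For each treated unit, the row index of the closest control on x
# (controls can be reused, i.e. matching with replacement)
nn <- vapply(treated, function(i) {
  controls[which.min(abs(d$x[controls] - d$x[i]))]
}, integer(1))

# Plain matching estimate: Y_i minus the matched control's outcome
att_raw <- mean(d$y[treated] - d$y[nn])

# Bias adjustment: fit mu_0(x) on the matched controls, then shift each
# matched outcome by mu_0(x_i) - mu_0(x_j) to account for the inexact match
mu0 <- lm(y ~ x, data = d[nn, ])
gap <- predict(mu0, newdata = d[treated, ]) - predict(mu0, newdata = d[nn, ])
att_adj <- mean(d$y[treated] - (d$y[nn] + gap))

print(c(raw = att_raw, adjusted = att_adj))
```

The adjustment matters most when matches are poor: the further x_j is from x_i, the larger the correction applied to that pair.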

3
Diagnostic / pre-tests

Check covariate balance after matching

A pre-flight check — run this before trusting any estimate downstream.

What happens here

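MatchBalance reports, for every covariate, the treated and control means before and after matching, the standardized mean difference, and t-test p-values; with nboots = 500 it also reports bootstrap Kolmogorov-Smirnov p-values based on 500 resamples.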

Reads from #2 · Feeds into #4
Key code
mb <- MatchBalance(treat ~ age + educ + black + hisp + married +
                     nodegr + re74 + re75,
                   match.out = m, nboots = 500, data = lalonde)
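The "std mean diff" row in the balance table can be read as a difference in means expressed in percent of a standard deviation. The sketch below assumes the scaling is by the treated-group standard deviation, which appears consistent with the age row in the example output on this page; `std_mean_diff` is a hypothetical helper written for illustration, not a function from the package.

```r
# Hypothetical helper (not part of Matching): difference in means,
# scaled by the treated-group SD and expressed in percent
std_mean_diff <- function(x, tr) {
  100 * (mean(x[tr == 1]) - mean(x[tr == 0])) / sd(x[tr == 1])
}

# Toy values taken from the input example at the top of the page
age <- c(34, 31, 45, 29)
tr  <- c(1, 0, 1, 0)

smd_age <- std_mean_diff(age, tr)
print(smd_age)
```

Values near zero after matching indicate that the matched controls resemble the treated units on that covariate.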

4
Inference

Report the ATT with Abadie-Imbens SE

Uncertainty quantification — standard errors, intervals, and aggregation.

What happens here

summary(m) prints the ATT estimate, the analytic Abadie-Imbens standard error, the t-statistic and its p-value, and the original and matched sample sizes.

Formula
\hat\tau_{ATT}=\frac{1}{N_1}\sum_{i:T_i=1}\big(Y_i-\hat Y_i(0)\big),\quad \hat Y_i(0)=\frac1M\!\sum_{j\in\mathcal J_M(i)}\! Y_j
Reads from #3 · Feeds into the final output
Key code
summary(m)
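The printed t-statistic, p-value, and a 95% confidence interval all follow from the estimate and its standard error. As a worked check in base R, the sketch below plugs in the two numbers from the example output further down this page (1824.2 and 802.3) and assumes a normal approximation.

```r
est <- 1824.2   # "Estimate" from the example output on this page
se  <- 802.3    # "AI SE" from the same output

z  <- est / se                              # t-statistic
p  <- 2 * pnorm(-abs(z))                    # two-sided p-value, normal approximation
ci <- est + c(-1, 1) * qnorm(0.975) * se    # 95% confidence interval

print(round(c(z = z, p = p, lower = ci[1], upper = ci[2]), 4))
```

The resulting z of about 2.27 and p of about 0.023 line up with the printed T-stat and p.val, and the interval runs from roughly 250 to roughly 3,400.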

3

Output · what you get

Fig 1. Bias-adjusted ATT with the Abadie–Imbens standard error (95% confidence interval).

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

\hat\tau_{ATT}=\frac{1}{N_1}\sum_{i:T_i=1}\big(Y_i-\hat Y_i(0)\big),\quad \hat Y_i(0)=\frac1M\!\sum_{j\in\mathcal J_M(i)}\! Y_j

⚠️ Unofficial community showcase of Matching (docs). Not affiliated with the authors. All credit to the package author, Jasjeet S. Sekhon, and to Alberto Abadie and Guido Imbens, whose estimators it implements; this summarizes public documentation.

What it does. Estimates treatment effects (ATE/ATT) in observational data by pairing each treated unit with similar control units on covariates or a propensity score.

How it works. It performs multivariate nearest-neighbor matching (with replacement by default), optionally applies a regression-based bias correction for inexact matches (BiasAdjust = TRUE), and reports the Abadie-Imbens analytic variance estimator rather than the naive difference-in-means standard error. Sekhon's package also supports genetic matching (GenMatch) to optimize covariate balance.

Assumptions. Unconfoundedness (selection on observables) and overlap (common support) between the treated and control covariate distributions.

Imbens's contribution. Abadie & Imbens (2006, Econometrica) derived the large-sample properties of matching estimators and proposed an analytic variance estimator. Abadie & Imbens (2008, Econometrica) showed that the standard bootstrap is in general invalid for nearest-neighbor matching, and Abadie & Imbens (2011, Journal of Business & Economic Statistics) developed the bias-corrected estimator. Matching implements the bias correction via BiasAdjust and the analytic variance in its reported standard errors.
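To see the two ingredients side by side, the fitted Match object can be inspected directly. The sketch below reruns the matching call from the pipeline and reads four components documented in the package's return value (est, est.noadj, se, se.standard); the labels in `comparison` are illustrative names chosen here.

```r
library(Matching)
data(lalonde)

X <- cbind(lalonde$age, lalonde$educ, lalonde$black, lalonde$hisp,
           lalonde$married, lalonde$nodegr, lalonde$re74, lalonde$re75)
m <- Match(Y = lalonde$re78, Tr = lalonde$treat, X = X,
           M = 1, BiasAdjust = TRUE, estimand = "ATT")

# Bias-adjusted vs unadjusted point estimate, and the Abadie-Imbens
# standard error vs the usual difference-in-means standard error
comparison <- c(
  est_bias_adjusted = as.numeric(m$est),
  est_unadjusted    = as.numeric(m$est.noadj),
  se_abadie_imbens  = as.numeric(m$se),
  se_naive          = as.numeric(m$se.standard)
)
print(round(comparison, 1))
```

How far the two estimates and the two standard errors differ depends on match quality and on how often controls are reused.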

What you get — Estimated ATT/ATE with Abadie-Imbens standard errors and covariate balance diagnostics.

Example output

Estimate...  1824.2
AI SE......  802.3
T-stat....   2.2738
p.val.....   0.022972

Original number of observations..............  445
Original number of treated obs...............  185
Matched number of observations...............  185
Matched number of observations  (unweighted).  314

***** (V1) age *****
                       Before Matching    After Matching
mean treatment........    25.816            25.816
mean control..........    25.054            25.692
std mean diff.........    10.655             1.7370
T-test p-value........   0.26594            0.81080

