Workflow·5 steps

Entropy balancing for covariate overlap (ebal)

Summary by StatsOtter

Reweight controls to exactly match the treated group's covariate moments, achieving balance without iterative propensity-score tweaking.

1

Input · what goes in

A binary treatment indicator and a matrix of covariates to balance.

treat age income educ
1 40 52000 16
0 38 47000 12
1 45 61000 18
0 33 41000 11
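
A minimal sketch of how a table like the one above could be split into the two inputs, assuming it is stored as whitespace-delimited text. The names `raw`, `treat`, and `X` are illustrative and not part of any package.

```r
# Illustrative only: the four example rows shown above
raw <- read.table(header = TRUE, text = "
treat age income educ
1 40 52000 16
0 38 47000 12
1 45 61000 18
0 33 41000 11
")

# Binary treatment indicator (1 = treated, 0 = control)
treat <- raw$treat
stopifnot(all(treat %in% c(0, 1)))

# Numeric covariate matrix: one row per unit, one column per covariate
X <- as.matrix(raw[, c("age", "income", "educ")])

dim(X)        # rows = units, columns = covariates
table(treat)  # counts of controls and treated units
```

Keeping the covariates as a numeric matrix, in the same row order as the treatment vector, means the control weights returned later can be mapped back to the control rows.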
2

Pipeline · the recipe


1
Data prep

Assemble treatment indicator and covariates

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Build a binary treatment vector and the set of covariates whose treated-group means the reweighted controls should match.

Reads from: the input data · Feeds into: #2
Key code
# Install:  install.packages("ebal")
library(ebal)
set.seed(42)
n_t <- 75; n_c <- 250
df <- data.frame(
  treat  = c(rep(1,n_t), rep(0,n_c)),
  age    = c(rnorm(n_t,45,8),  rnorm(n_c,38,10)),
  educ   = c(rnorm(n_t,16,2.5),rnorm(n_c,13,3)),
  income = c(rnorm(n_t,65,12), rnorm(n_c,50,15)))
df$y <- 0.1*df$age + 0.3*df$educ + 0.05*df$income + 5*df$treat + rnorm(nrow(df),0,3)

2
Estimation

Run entropy balancing

The core estimate — where the causal quantity itself is computed.

What happens here

ebalance() solves for control weights that exactly reproduce the treated-group covariate moments under maximum entropy.

Formula
\min_{w_i}\sum_{i\in C} w_i\log\frac{w_i}{q_i}\;\;\text{s.t.}\;\sum_{i\in C} w_i c_{ri}(X_i)=m_r,\;\sum_{i\in C} w_i=1
Reads from: #1 · Feeds into: #3
Key code
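# ebal's ebalance() takes a treatment vector and a covariate matrix, not a formula
fit <- ebalance(Treatment = df$treat, X = as.matrix(df[, c("age", "educ", "income")]))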
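
To make the optimization concrete, here is a base-R sketch of the same idea, assuming uniform base weights and mean constraints only. It is not the package's own solver: it minimizes the convex dual of the problem above with `optim()`, using the fact that the solution has the exponential-tilting form w_i ∝ q_i exp(λ'd_i), where d_i are the control covariates centered at the treated means (the sign convention for λ is a choice made here). The toy data and the helper names `dual`, `w_of`, and `grad` are hypothetical.

```r
set.seed(1)
# Hypothetical toy data: two covariates, controls and treated partly overlapping
X_c <- cbind(age = rnorm(200, 38, 10), educ = rnorm(200, 13, 3))
X_t <- cbind(age = rnorm(60, 42, 8),   educ = rnorm(60, 14, 2.5))

m <- colMeans(X_t)                          # target moments m_r (treated means)
D <- sweep(X_c, 2, m)                       # control covariates centered at the targets
D <- sweep(D, 2, apply(X_c, 2, sd), "/")    # rescale columns for numerical stability
q <- rep(1 / nrow(X_c), nrow(X_c))          # uniform base weights q_i

# Dual objective: log of the sum of tilted base weights (computed stably)
dual <- function(lambda) {
  eta <- drop(D %*% lambda)
  mx <- max(eta)
  mx + log(sum(q * exp(eta - mx)))
}

# Weights implied by a given lambda, normalized to sum to 1
w_of <- function(lambda) {
  eta <- drop(D %*% lambda)
  w <- q * exp(eta - max(eta))
  w / sum(w)
}

# Gradient of the dual: the weighted mean of the centered covariates
grad <- function(lambda) drop(crossprod(D, w_of(lambda)))

opt <- optim(c(0, 0), dual, grad, method = "BFGS",
             control = list(reltol = 1e-14, maxit = 500))
w <- w_of(opt$par)

# Reweighted control means next to the treated targets
rbind(target = m, reweighted = drop(crossprod(X_c, w)))
```

The gradient is zero exactly when the weighted control means hit the targets, which is why minimizing the dual enforces the balance constraints.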

3
Diagnostic / pre-tests

Check exact moment balance

A pre-flight check — run this before trusting any estimate downstream.

What happens here

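The object returned by ebalance() is a plain list, and, as far as the ebal reference manual shows, the package has no summary() method that prints a balance table. Compare covariate means directly instead: treated units, controls before weighting, and controls after weighting with fit$w. The post-weighting control means should coincide with the treated means up to the solver's tolerance.

Reads from: #2 · Feeds into: #4
Key code
X_c <- as.matrix(df[df$treat == 0, c("age", "educ", "income")])
rbind(treated = colMeans(df[df$treat == 1, c("age", "educ", "income")]), control_pre = colMeans(X_c), control_post = apply(X_c, 2, weighted.mean, w = fit$w))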

4
Estimation

Weighted ATT estimate

The core estimate — where the causal quantity itself is computed.

What happens here

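Give treated units weight 1 and control units their balancing weights, then regress the outcome on treatment by weighted least squares. The coefficient on treat estimates the ATT (the simulation in step 1 sets the true effect to 5).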

Formula
\hat\tau_{ATT} = \bar Y_{1} - \sum_{i\in C} w_i\, Y_i
Reads from: #3 · Feeds into: #5
Key code
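df$w <- 1                      # treated units keep weight 1
df$w[df$treat == 0] <- fit$w   # controls get their entropy-balancing weights (same row order as X)
coef(lm(y ~ treat, data = df, weights = w))["treat"]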
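
The link between the weighted regression and the formula above can be checked on a small example. This sketch uses made-up outcomes and arbitrary positive control weights as a stand-in for balancing weights (all names are hypothetical) and confirms that the regression coefficient equals the treated mean minus the normalized weighted control mean.

```r
set.seed(7)
# Hypothetical data: 5 treated units, 8 controls
treat <- c(rep(1, 5), rep(0, 8))
y     <- c(rnorm(5, 10, 1), rnorm(8, 6, 1))
w_c   <- runif(8, 0.2, 2)          # stand-in for balancing weights on controls

# Full weight vector: 1 for treated units, w_c for controls
w <- rep(1, length(treat))
w[treat == 0] <- w_c

# (a) coefficient on treat from the weighted regression
att_lm <- unname(coef(lm(y ~ treat, weights = w))["treat"])

# (b) treated mean minus weighted control mean, weights normalized to sum to 1
att_dm <- mean(y[treat == 1]) - sum(w_c / sum(w_c) * y[treat == 0])

all.equal(att_lm, att_dm)
```

The two quantities agree up to floating-point error because weighted least squares rescales the control weights implicitly. The standard errors that lm() reports treat the weights as fixed, so they should not be read as accounting for the estimation of the weights.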

5
Reporting

Love plot of standardized differences

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

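Compute standardized mean differences (treated mean minus control mean, divided by the treated-group standard deviation) before and after weighting, then draw them as a base-R dot chart in the style of a Love plot, one group per covariate. ebalance objects have no plot() method of their own, as far as the ebal reference manual shows.

Reads from: #4 · Feeds into: the final output
Key code
smd <- sapply(c("age", "educ", "income"), function(v) (mean(df[df$treat == 1, v]) - c(pre = mean(df[df$treat == 0, v]), post = weighted.mean(df[df$treat == 0, v], fit$w))) / sd(df[df$treat == 1, v]))
dotchart(smd, xlab = "Standardized difference (treated minus control)")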

3

Output · what you get

Fig 1. Standardized covariate differences before vs after entropy balancing: exact moment balance.

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

\min_{w}\ \sum_{i\in C} w_i\log\frac{w_i}{q_i}\quad\text{s.t.}\quad \sum_{i\in C} w_i\,c_{ri}(X_i)=m_r,\quad \sum_{i\in C} w_i=1

⚠️ Unofficial community showcase of ebal. Not affiliated with the authors; all credit to Jens Hainmueller, the package's author. This summarizes public documentation.

What it does. ebal computes control-unit weights that make the reweighted control group's covariate means (and any higher moments supplied as extra columns, such as squared terms) exactly equal the treated group's, so effects can be estimated on a balanced sample.

How it works. It solves a maximum-entropy optimization that keeps the weights as close as possible to the base weights (uniform by default) while satisfying the user's balance constraints. This replaces the usual 'estimate propensity score, check balance, respecify' loop with a single calibration step.

Assumptions. Unconfoundedness and overlap; the balance constraints must be feasible given the data.

Imbens's contribution. Entropy balancing (Hainmueller 2012) sits squarely in the propensity-score / overlap-weighting tradition that Imbens systematized in his influential review, Nonparametric Estimation of Average Treatment Effects Under Exogeneity (Imbens 2004, Review of Economics and Statistics), which framed balancing and overlap as central to observational causal estimation.

What you get — A vector of balancing weights for control units that equalize covariate moments with the treated group.

Example output

Entropy balancing

Means on covariates: treatment group vs. weighted control group

          Treated Control(pre) Control(post) StdDiff(pre) StdDiff(post)
age        45.213       37.842        45.213        0.793         0.000
educ       15.987       12.948        15.987        1.064         0.000
income     64.512       49.871        64.512        0.987         0.000

Constraints converged: TRUE
Max. moment deviation: 1.21e-07
N treated = 75   N control = 250   (effective = 213.4)
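
The page does not define the "effective" control count in the output above. One common definition is Kish's effective sample size, (Σw)² / Σw², sketched here as an assumption rather than as what the tool actually computes; `ess` is a hypothetical helper name.

```r
# Kish's effective sample size for a weight vector (assumed definition)
ess <- function(w) sum(w)^2 / sum(w^2)

ess(rep(1, 250))          # uniform weights: equals the number of units, 250
ess(c(rep(1, 249), 50))   # one dominant weight: far fewer effective units
```

A value well below the raw control count signals that a few controls carry most of the weight, which usually points to limited overlap.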

