Workflow · 3 steps

The balance–sample-size frontier (MatchingFrontier)

Summary by StatsOtter

Trace the whole tradeoff curve between covariate balance and how many units you keep, then estimate the effect at every point on the frontier.

1

Input · what goes in

A data frame with a binary treatment and the covariates to balance on.

treat age educ married re74 re75
1 37 11 1 0 0
0 22 9 0 2300 1500
1 30 12 1 0 0
0 45 14 0 9100 8700
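
As a rough illustration of the expected input shape, the four example rows can be rebuilt as a base R data frame and given a quick sanity check. The object names (`df`, `covariates`, and so on) are hypothetical, and the checks are generic good practice rather than documented requirements of the package.

```r
# Hypothetical mini-sample: the four example rows shown above
df <- data.frame(
  treat   = c(1, 0, 1, 0),
  age     = c(37, 22, 30, 45),
  educ    = c(11, 9, 12, 14),
  married = c(1, 0, 1, 0),
  re74    = c(0, 2300, 0, 9100),
  re75    = c(0, 1500, 0, 8700)
)

# Generic sanity checks (assumed good practice, not a package requirement)
is_binary  <- all(df$treat %in% c(0, 1))    # treatment coded 0/1
covariates <- setdiff(names(df), "treat")   # everything except treatment
no_missing <- !anyNA(df[covariates])        # no NA among the covariates
n_by_group <- table(df$treat)               # units per treatment arm
```

With both arms present and no missing covariate values, the frame is in the shape the formula interface `treat ~ age + educ + ...` expects.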
2

Pipeline · the recipe


1
Data prep

Load MatchIt's lalonde data

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Use the canonical Lalonde job-training data and a treatment ~ covariates specification.

Reads from the input data Feeds into #2
Key code
# Install once from GitHub (requires the remotes package):
# remotes::install_github("IQSS/MatchingFrontier")
library(MatchingFrontier)
# The lalonde data ships with MatchIt, which must be installed as well
data("lalonde", package = "MatchIt")

2
Estimation

Compute the matching frontier

The core estimate — where the causal quantity itself is computed.

What happens here

makeFrontier() orders units by their marginal contribution to imbalance and returns the resulting sequence of nested matched subsets, one per retained-sample size.

Formula
\mathrm{Imbalance}(S)=L_1\ \text{or Mahalanobis distance}
Reads from #1 Feeds into #3
Key code
f <- makeFrontier(treat ~ age + educ + married + re74 + re75,
                  data = lalonde, QOI = "FSATT")
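
To make the imbalance metric concrete, here is a minimal base R sketch of one common Mahalanobis-based definition: the average distance from each treated unit to its nearest control. This definition, the simulated data, and the helper name `avg_mahal_imbalance` are assumptions for illustration; they are not taken from the package source.

```r
set.seed(1)
n <- 40
# Simulated covariates and treatment indicator (illustrative only)
X <- cbind(age = rnorm(n, 30, 8), educ = rnorm(n, 11, 2))
treat <- rep(c(1, 0), times = c(10, 30))

# Hypothetical helper: mean Mahalanobis distance from each treated unit
# to its closest control unit
avg_mahal_imbalance <- function(X, treat) {
  S  <- cov(X)                               # covariance of the full sample
  Xt <- X[treat == 1, , drop = FALSE]
  Xc <- X[treat == 0, , drop = FALSE]
  nearest <- apply(Xt, 1, function(x) {
    # mahalanobis() returns squared distances, hence the sqrt
    sqrt(min(mahalanobis(Xc, center = x, cov = S)))
  })
  mean(nearest)
}

imb <- avg_mahal_imbalance(X, treat)

# If every treated unit has an identical control, the imbalance is zero
X_dup   <- rbind(X[1:3, ], X[1:3, ])
imb_dup <- avg_mahal_imbalance(X_dup, c(1, 1, 1, 0, 0, 0))
```

Smaller values mean each treated unit has a closer counterpart among the controls.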

3
Reporting

Estimate effects along the frontier

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

Fit the outcome model at each point and plot how the estimate moves as balance improves and N shrinks.

Reads from #2 Feeds into the final output
Key code
est <- estimateEffects(f, base.form = re78 ~ treat)
plot(est)
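
The idea of re-estimating the effect as the sample shrinks can be sketched in base R without the package. The simulated data, the pruning rule (drop units with the most extreme `x` first), and all object names below are hypothetical stand-ins. A real frontier would supply the pruning order, and this is not how estimateEffects() is implemented.

```r
set.seed(3)
n <- 60
# Simulated data with a true treatment effect of 2 (illustrative only)
d <- data.frame(treat = rep(c(1, 0), each = n / 2), x = rnorm(n))
d$y <- 2 * d$treat + d$x + rnorm(n)

# Hypothetical pruning order: most extreme covariate values are dropped first
drop_order <- order(abs(d$x), decreasing = TRUE)

# Refit the outcome model on each nested subset
sizes <- seq(n, 20, by = -10)             # retained sample sizes to evaluate
est <- sapply(sizes, function(m) {
  keep <- setdiff(seq_len(n), drop_order[seq_len(n - m)])
  coef(lm(y ~ treat, data = d[keep, ]))[["treat"]]
})

path <- data.frame(n_retained = sizes, estimate = est)
```

Plotting `estimate` against `n_retained` gives the same kind of picture as the package's plot: how the point estimate drifts as units are pruned.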

3

Output · what you get

Fig 1. The frontier: average imbalance falls as more units are pruned; the ATT estimate is read off at each retained-sample size.

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

\text{frontier: }\ \min_{|S|=n}\ \mathrm{Imbalance}(S)\quad\text{for every }n,\ \ \hat\tau(n)=\widehat{\mathrm{ATT}}\big(S^\star(n)\big)

⚠️ Unofficial community showcase of MatchingFrontier (docs). Not affiliated with the authors — all credit to Gary King, Christopher Lucas, Richard Nielsen & Noah Greifer; this summarizes public documentation.

What it does. Matching forces a choice: prune more units for better balance, or keep more for precision. MatchingFrontier computes the entire frontier of that tradeoff at once (King, Lucas & Nielsen 2017), so you can see balance and effect estimates as a function of the sample retained instead of guessing one caliper.

How it works. makeFrontier() orders units so that dropping the next-worst one maximally improves an imbalance metric (e.g. Mahalanobis or L1), producing every nested matched subset from all units down to a few. estimateEffects() then fits the outcome model along the frontier and plot() shows how the estimate moves as you trade sample size for balance.
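
The pruning logic can be sketched in a few lines of base R. The sketch assumes the imbalance metric is the average distance from each retained treated unit to its nearest control, and that only treated units are pruned; under those assumptions, dropping the worst-matched treated unit first yields nested subsets with non-increasing imbalance. The data are simulated and the names (`nearest`, `drop_order`, `frontier`) are hypothetical; this is a simplified illustration, not the package's algorithm.

```r
set.seed(2)
n_t <- 8
n_c <- 20
Xt <- cbind(rnorm(n_t, 1), rnorm(n_t, 1))   # treated covariates (simulated)
Xc <- cbind(rnorm(n_c, 0), rnorm(n_c, 0))   # control covariates (simulated)
S  <- cov(rbind(Xt, Xc))                    # covariance of the full sample

# Mahalanobis distance from each treated unit to its closest control
nearest <- apply(Xt, 1, function(x) sqrt(min(mahalanobis(Xc, x, S))))

# Drop the worst-matched treated unit first, so retained sets are nested
drop_order <- order(nearest, decreasing = TRUE)

frontier <- data.frame(
  n_treated = n_t:1,
  imbalance = sapply(0:(n_t - 1), function(k) {
    keep <- setdiff(seq_len(n_t), drop_order[seq_len(k)])
    mean(nearest[keep])                     # average over retained units
  })
)
```

Each row of `frontier` is one point on the curve: fewer treated units retained, and an average imbalance no larger than in the row before.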

Assumptions. Unconfoundedness given the covariates, and overlap. The frontier characterizes the bias–variance tradeoff; it does not remove unmeasured confounding.

What you get. A nested family of matched samples, with an imbalance value and an effect estimate at each point on the balance–N frontier.

Example output

A matching frontier with 614 points.
QOI: FSATT   Metric: dist (Mahalanobis)
N ranges from 614 down to 2 units retained.
Use estimateEffects() and plot() to view the effect along the frontier.

