Workflow·5 steps

Matching for panel / time-series cross-sectional data (PanelMatch)

Summary by StatsOtter

Matches each treated unit-period to control units with an identical treatment history over a chosen number of lags, then applies a difference-in-differences estimator to time-series cross-sectional (TSCS) data.

1

Input · what goes in

A long-format panel with unit ID, time ID, a binary treatment, an outcome, and time-varying covariates.

unit year treat outcome gdp
A 2001 0 12.4 3.1
A 2002 1 13.0 3.4
B 2001 0 9.8 2.2
B 2002 0 10.1 2.5
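
The example table can be typed in as a data frame and checked before any matching is attempted. The sketch below is one way to do that in base R; the `unit_id` column is a hypothetical helper added here, on the assumption that the matching functions expect a numeric unit identifier and an integer time identifier (check the package documentation for your version).

```r
# Toy long-format panel; values copied from the example table above.
panel <- data.frame(
  unit    = c("A", "A", "B", "B"),
  year    = c(2001L, 2002L, 2001L, 2002L),
  treat   = c(0L, 1L, 0L, 0L),
  outcome = c(12.4, 13.0, 9.8, 10.1),
  gdp     = c(3.1, 3.4, 2.2, 2.5)
)

# Hypothetical helper column: an integer id derived from the unit labels.
panel$unit_id <- as.integer(factor(panel$unit))

# Structural checks: one row per unit-year, binary treatment, integer time.
stopifnot(
  !anyDuplicated(panel[, c("unit_id", "year")]),
  all(panel$treat %in% c(0L, 1L)),
  is.integer(panel$year)
)
```

If any check fails, `stopifnot()` halts with a message naming the failed condition, which is usually easier to debug than an error raised later inside a matching call.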
2

Pipeline · the recipe


1
Diagnostic / pre-tests

Inspect treatment distribution

A pre-flight check — run this before trusting any estimate downstream.

What happens here

Use DisplayTreatment() to visualize when the democratization indicator (the dem column of the package's dem example dataset) switches on and off across countries and years.

Reads from the input data · Feeds into #2
Key code
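# Install:  install.packages("PanelMatch")
# Note: these calls use the data = ... interface of earlier PanelMatch releases;
# check the documentation of your installed version, since arguments may differ.
library(PanelMatch)
# dem is the example dataset shipped with the package; "dem" is also its treatment column.
DisplayTreatment(unit.id = "wbcode2", time.id = "year",
                 treatment = "dem", data = dem,
                 legend.position = "none", xlab = "year", ylab = "Country Code")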
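
The same information the plot conveys can be tabulated by hand. The following is a minimal base R sketch on a hypothetical three-unit panel (the `toy` data and its values are made up for illustration, not taken from `dem`); it builds the unit-by-year treatment grid and counts how often each unit switches into or out of treatment.

```r
# Hypothetical toy panel: 3 units observed over 4 years.
toy <- data.frame(
  unit  = rep(c("A", "B", "C"), each = 4),
  year  = rep(2001:2004, times = 3),
  treat = c(0, 1, 1, 1,    # A switches on in 2002 and stays treated
            0, 0, 0, 0,    # B is never treated
            0, 0, 1, 0)    # C is treated in 2003 only (a reversal)
)

# Unit-by-year grid of treatment status: the table behind a treatment plot.
grid <- xtabs(treat ~ unit + year, data = toy)

# Count 0 -> 1 switches (onsets) and 1 -> 0 switches (reversals) per unit.
# Rows are already sorted by year within unit, so diff() follows time order.
by_unit   <- split(toy$treat, toy$unit)
onsets    <- vapply(by_unit, function(x) sum(diff(x) == 1), integer(1))
reversals <- vapply(by_unit, function(x) sum(diff(x) == -1), integer(1))
```

Units with no onsets contribute only as potential controls, and reversals matter when deciding how to set `forbid.treatment.reversal` in the matching step.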

2
Estimation

Build matched sets

The core estimate — where the causal quantity itself is computed.

What happens here

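Run PanelMatch() with lag = 4 so that each treated observation is matched to control units with the same treatment history over the previous four years, then refine the matched sets by propensity-score weighting on the lagged covariates tradewb and y. Setting lead = 0:4 requests effects from t+0 through t+4.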

Formula
\hat\delta_F=\frac{1}{|\mathcal A|}\sum_{(i,t)\in\mathcal A}\!\Big\{(Y_{i,t+F}-Y_{i,t-1})-\sum_{j\in\mathcal M_{it}}w_{ij}(Y_{j,t+F}-Y_{j,t-1})\Big\}
Reads from #1 · Feeds into #3
Key code
PM.results <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2",
                         treatment = "dem", refinement.method = "ps.weight",
                         data = dem, match.missing = FALSE, listwise.delete = TRUE,
                         covs.formula = ~ I(lag(tradewb, 1:4)) + I(lag(y, 1:4)),
                         size.match = 5, qoi = "att", outcome.var = "y",
                         lead = 0:4, forbid.treatment.reversal = FALSE)
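
To make the matching rule concrete, here is a minimal base R sketch of exact treatment-history matching on a hypothetical toy panel. It is not the package's implementation: `history()` and `matched_set()` are illustrative helpers defined here, the data are made up, and refinement (the propensity-score weighting step) is left out.

```r
# Hypothetical toy panel: 4 units, 5 years, binary treatment.
toy <- data.frame(
  unit  = rep(1:4, each = 5),
  year  = rep(2001:2005, times = 4),
  treat = c(0, 0, 0, 1, 1,    # unit 1 switches on in 2004
            0, 0, 0, 0, 0,    # unit 2 is never treated
            0, 1, 0, 0, 0,    # unit 3 is treated in 2002 only
            0, 0, 0, 0, 1)    # unit 4 switches on in 2005
)

# Treatment history of unit u over the `lag` periods before year t.
history <- function(data, u, t, lag) {
  rows <- data[data$unit == u & data$year %in% (t - lag):(t - 1), ]
  rows$treat[order(rows$year)]
}

# Controls for treated observation (u, t): untreated at t, same history before t.
matched_set <- function(data, u, t, lag) {
  target <- history(data, u, t, lag)
  others <- setdiff(unique(data$unit), u)
  keep <- vapply(others, function(j) {
    untreated_now <- data$treat[data$unit == j & data$year == t] == 0
    untreated_now && identical(history(data, j, t, lag), target)
  }, logical(1))
  others[keep]
}

matched_set(toy, u = 1, t = 2004, lag = 3)  # units 2 and 4
```

Unit 3 is excluded because its history over 2001 to 2003 contains a treated year, even though it is untreated in 2004. A longer lag makes this filter stricter and typically shrinks the matched sets.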

3
Diagnostic / pre-tests

Check covariate balance

A pre-flight check — run this before trusting any estimate downstream.

What happens here

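Use get_covariate_balance() to check whether the refined matched sets balance tradewb and y over the pre-treatment lags. Balance is reported as standardized mean differences between treated observations and their matched controls, so values close to zero indicate good balance; plot = FALSE returns the numbers instead of drawing a figure.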

Reads from #2 · Feeds into #4
Key code
get_covariate_balance(PM.results$att, data = dem,
                      covariates = c("tradewb", "y"), plot = FALSE)
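
As a rough illustration of what a balance number summarizes, the sketch below computes a standardized mean difference for one covariate at one pre-treatment period. It is a simplified stand-in under stated assumptions, not the package's exact calculation: all values are hypothetical, the weights are equal within each matched set, and the gap is scaled by the standard deviation of the treated values.

```r
# Hypothetical covariate values at one pre-treatment period.
treated_x <- c(3.1, 2.8, 3.5)              # one value per treated observation
control_x <- list(c(2.9, 3.3),             # matched controls for treated obs 1
                  c(2.7),                  # ... for treated obs 2
                  c(3.6, 3.0, 3.3))        # ... for treated obs 3

# Equal weights within each matched set (an assumption of this sketch).
weights <- lapply(control_x, function(x) rep(1 / length(x), length(x)))

# Gap between each treated value and the weighted mean of its controls.
gaps <- mapply(function(xt, xc, w) xt - sum(w * xc),
               treated_x, control_x, weights)

# Average gap, scaled by the spread of the treated values.
smd <- mean(gaps) / sd(treated_x)
smd
```

Repeating this for every covariate and every lag gives a grid of values like the one a balance diagnostic reports; comparing the grid before and after refinement shows whether the weighting helped.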

4
Estimation

Estimate the ATT (DiD)

The core estimate — where the causal quantity itself is computed.

What happens here

PanelEstimate() applies a difference-in-differences estimator to the matched sets built in step 2 and reports bootstrap standard errors (1000 bootstrap samples in the example output).

Formula
\hat{\delta}(F, L) = \frac{1}{\sum_{i}\sum_{t} D_{it}} \sum_{i}\sum_{t} D_{it} \Big\{ (Y_{i,t+F} - Y_{i,t-1}) - \sum_{j\in\mathcal M_{it}} w_{ij} (Y_{j,t+F} - Y_{j,t-1}) \Big\}
Reads from #3 · Feeds into #5
Key code
PE.results <- PanelEstimate(sets = PM.results, data = dem)
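
The formula can be worked through by hand on the four-row input example. The sketch below assumes a lag of one period, so that unit B (untreated in 2001 and 2002) is the only matched control for unit A's treatment in 2002 and receives weight 1; it computes the F = 0 term for that single treated observation.

```r
# Values from the input example: unit A is treated in 2002, unit B is not.
y_treated <- c(pre = 12.4, post = 13.0)   # A in 2001 (t-1) and 2002 (t+0)
y_control <- c(pre = 9.8,  post = 10.1)   # B in the same two years
w <- 1                                     # single matched control, weight 1

# Change in the treated unit minus the weighted change in its control.
did <- (y_treated[["post"]] - y_treated[["pre"]]) -
  w * (y_control[["post"]] - y_control[["pre"]])
did  # approximately 0.3
```

With more treated observations, the estimator averages this quantity over all of them, and with several controls per matched set the second term becomes a weighted sum.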

5
Reporting

Summarize and plot leads

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

summary() and plot() report the dynamic ATT for each lead period with confidence intervals.

Reads from #4 · Feeds into the final output
Key code
summary(PE.results)
plot(PE.results)
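
For readers who want to restyle the figure, the point estimates and interval bounds can also be plotted directly. This is a base R sketch that retypes the numbers from the Example output table on this page into a data frame named `att` (a name chosen here); it does not read them from `PE.results`.

```r
# Estimates and 95% bounds copied from the Example output table.
att <- data.frame(
  lead     = 0:4,
  estimate = c(0.2475, -0.7800, -1.4290, -1.7770, -1.3450),
  lower    = c(-0.9410, -3.0426, -4.7917, -6.0908, -6.4096),
  upper    = c(1.554, 1.430, 1.832, 2.448, 3.715)
)

# Point estimates with vertical interval bars and a zero reference line.
plot(att$lead, att$estimate, pch = 19,
     ylim = range(att$lower, att$upper),
     xlab = "Periods after treatment (t + F)", ylab = "Estimated ATT")
arrows(att$lead, att$lower, att$lead, att$upper,
       angle = 90, code = 3, length = 0.05)
abline(h = 0, lty = 2)

# Flag which intervals include zero.
covers_zero <- att$lower < 0 & att$upper > 0
```

In this example every interval includes zero, so none of the five lead-period estimates is distinguishable from no effect at the 95% level.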

3

Output · what you get

Fig 1. Dynamic ATT across lead periods t+0…t+4 with a 95% confidence band.

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

\hat\delta_F=\frac{1}{|\mathcal A|}\sum_{(i,t)\in\mathcal A}\!\Big\{(Y_{i,t+F}-Y_{i,t-1})-\sum_{j\in\mathcal M_{it}}w_{ij}(Y_{j,t+F}-Y_{j,t-1})\Big\}

⚠️ Unofficial community showcase of PanelMatch (docs). Not affiliated with the authors — all credit to Kosuke Imai & coauthors; this summarizes public documentation.

What it does: PanelMatch implements the Imai-Kim-Wang matching framework for causal inference with time-series cross-sectional (panel) data and binary treatments.

How it works: For each treated observation it builds a matched set of control units that share an identical treatment history over a user-specified number of lags. The matched set is then refined by Mahalanobis distance, propensity-score matching, or propensity-score weighting on time-varying covariates. Treatment effects are estimated with a difference-in-differences estimator that nets out unit and time effects, yielding short- and long-term ATTs with block-bootstrap standard errors.

Assumptions: a parallel-trends/limited-carryover assumption conditional on treatment history and covariates, plus no unobserved time-varying confounding.

The package also provides covariate-balance diagnostics and visualizations of treatment variation and matched sets so users can assess match quality before estimation.

What you get — Short- and long-term ATTs (point estimates at multiple lead periods) with bootstrapped confidence intervals, plus covariate-balance and matched-set diagnostics.

Example output

Weighted Difference-in-Differences with Propensity Score
Matches created with 4 lags

Standard errors computed with 1000 Weighted bootstrap samples

Estimate of Average Treatment Effect on the Treated (ATT) by Period:
     estimate std.error      2.5%    97.5%
t+0    0.2475     0.633   -0.9410    1.554
t+1   -0.7800     1.144   -3.0426    1.430
t+2   -1.4290     1.700   -4.7917    1.832
t+3   -1.7770     2.187   -6.0908    2.448
t+4   -1.3450     2.567   -6.4096    3.715
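
The 2.5% and 97.5% columns are not symmetric around the estimate (at t+0 the lower bound is about 1.19 below it and the upper bound about 1.31 above), which is what percentile-style bootstrap intervals tend to look like. The sketch below shows how a standard error and a percentile interval can be read off a set of bootstrap draws. It is a simplified illustration, not the package's block bootstrap: the unit-level contributions are hypothetical and are resampled directly.

```r
set.seed(1)  # make the resampling reproducible

# Hypothetical unit-level DiD contributions (not PanelMatch output).
contrib <- c(0.3, -0.5, 1.2, 0.1, -0.8, 0.6, 0.0, 0.9)

# Resample units with replacement and recompute the average each time.
n_boot <- 1000
boot <- replicate(n_boot, mean(sample(contrib, replace = TRUE)))

# Standard error: spread of the bootstrap estimates.
std_error <- sd(boot)

# Percentile interval: the 2.5% and 97.5% quantiles of the draws.
ci <- quantile(boot, c(0.025, 0.975))
```

Resampling whole units rather than individual rows is what keeps each unit's observations over time together, which is the idea behind a block bootstrap for panel data.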

Links: package · paper
