StatsOtter Causal inference workflows
Workflow · 5 steps

Regression discontinuity, done right (rdrobust)

Summary by StatsOtter

Estimate causal effects at a cutoff with data-driven optimal bandwidths and bias-corrected, robust confidence intervals.

1

Input · what goes in

An outcome vector y and a running/forcing variable x (and a cutoff c, default 0).

y (outcome)   x (running var)
4.2           -0.8
5.1           -0.1
6.9            0.2
7.3            0.9
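
To make the input format concrete, here is a minimal sketch in plain Python (standard library only) that splits the four example rows at the cutoff. It assumes the usual sharp-RD convention that units with x >= c count as treated; the helper name split_at_cutoff is hypothetical and is not part of rdrobust.

```python
# Rows copied from the example table: (y, x).
rows = [(4.2, -0.8), (5.1, -0.1), (6.9, 0.2), (7.3, 0.9)]
c = 0.0  # cutoff (the documented default)


def split_at_cutoff(rows, c=0.0):
    """Return (control, treated); treated means x >= c (assumed convention)."""
    control = [(y, x) for y, x in rows if x < c]
    treated = [(y, x) for y, x in rows if x >= c]
    return control, treated


control, treated = split_at_cutoff(rows, c)
print(len(control), "below the cutoff,", len(treated), "at or above it")
```

Only the sign of x - c decides which side a unit falls on, which is why the running variable must be recorded on the same scale as the cutoff.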
2

Pipeline · the recipe


1
Data prep

Load senate data and inspect the running variable

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Load the Cattaneo-Frandsen-Titiunik U.S. Senate data and look at the Democratic margin of victory that determines incumbency.

Reads from the input data · Feeds into #2
Key code
# Install:  install.packages("rdrobust")   # R  ·  pip install rdrobust  (Python)
library(rdrobust)
data(rdrobust_RDsenate)
vote   <- rdrobust_RDsenate$vote     # Dem vote share next election (outcome)
margin <- rdrobust_RDsenate$margin   # Dem margin this election (running var)
summary(margin)

2
Diagnostic / pre-tests

RD plot around the cutoff

A pre-flight check — run this before trusting any estimate downstream.

What happens here

Bin the data and overlay local polynomial fits on each side of the c = 0 cutoff to visualize the discontinuity.

Formula
\tau_{\mathrm{RD}}=\lim_{x\downarrow c}\mathbb E[Y\mid X=x]-\lim_{x\uparrow c}\mathbb E[Y\mid X=x]
Reads from #1 · Feeds into #3
Key code
rdplot(y = vote, x = margin, c = 0,
       title = "RD Plot: U.S. Senate",
       x.label = "Dem. margin of victory",
       y.label = "Vote share, next election")
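
The binning behind an RD plot can be sketched in plain Python. This is an illustration of the idea only, not rdplot's implementation: the number of bins is a fixed, hypothetical choice here, whereas rdplot selects it from the data, and the toy values are made up.

```python
def binned_means(y, x, c=0.0, n_bins=2):
    """Mean of y in evenly spaced bins of x, computed separately on each side of c.

    Returns (left, right), each a list of (bin_midpoint, mean_y) for non-empty bins.
    """
    lo, hi = min(x), max(x)
    if not (lo < c < hi):
        raise ValueError("need observations strictly on both sides of the cutoff")

    def one_side(points, start, stop):
        width = (stop - start) / n_bins
        sums, counts = [0.0] * n_bins, [0] * n_bins
        for yi, xi in points:
            k = min(int((xi - start) / width), n_bins - 1)  # clamp the right edge
            sums[k] += yi
            counts[k] += 1
        return [(start + (k + 0.5) * width, sums[k] / counts[k])
                for k in range(n_bins) if counts[k] > 0]

    pairs = list(zip(y, x))
    left = [(yi, xi) for yi, xi in pairs if xi < c]
    right = [(yi, xi) for yi, xi in pairs if xi >= c]
    return one_side(left, lo, c), one_side(right, c, hi)


# Hypothetical toy data with a visible jump at 0.
x_toy = [-0.8, -0.6, -0.3, -0.1, 0.2, 0.4, 0.7, 0.9]
y_toy = [4.0, 4.4, 4.8, 5.2, 6.8, 7.0, 7.2, 7.4]
left, right = binned_means(y_toy, x_toy, c=0.0, n_bins=2)
print("left bins: ", left)
print("right bins:", right)
```

Bins never straddle the cutoff, so the gap between the last left-hand mean and the first right-hand mean is the visual counterpart of the discontinuity.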

3
Diagnostic / pre-tests

Data-driven optimal bandwidth

A pre-flight check — run this before trusting any estimate downstream.

What happens here

Select the MSE-optimal bandwidth with rdbwselect before estimating the effect.

Reads from #2 · Feeds into #4
Key code
bw <- rdbwselect(y = vote, x = margin, c = 0, kernel = "triangular", bwselect = "mserd")
summary(bw)
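
To see what a chosen bandwidth does once it is in hand, the sketch below (plain Python, not the rdbwselect algorithm) applies a triangular kernel and counts the observations that receive positive weight. The value h = 17.754 is the bandwidth reported in this page's example output; the margins are hypothetical.

```python
def triangular_weight(x, c, h):
    """Triangular kernel weight: 1 at the cutoff, falling linearly to 0 at distance h."""
    return max(0.0, 1.0 - abs(x - c) / h)


def effective_counts(xs, c, h):
    """Number of observations strictly within h of the cutoff, by side."""
    left = sum(1 for x in xs if x < c and abs(x - c) < h)
    right = sum(1 for x in xs if x >= c and abs(x - c) < h)
    return left, right


h = 17.754                                   # bandwidth from the example output
margins = [-30, -15, -5, -1, 2, 10, 17, 25]  # hypothetical margins of victory
n_left, n_right = effective_counts(margins, c=0.0, h=h)
weights = [round(triangular_weight(m, 0.0, h), 3) for m in margins]
print(n_left, n_right)
print(weights)
```

Observations farther than h from the cutoff get weight zero and drop out of the fit, which is why the example output lists an effective number of observations smaller than the full sample.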

4
Estimation

Robust bias-corrected RD estimate

The core estimate — where the causal quantity itself is computed.

What happens here

Estimate the local-linear treatment effect at the cutoff with robust bias correction; all = TRUE reports all three inference rows (Conventional, Bias-Corrected, Robust).

Formula
\tau_{\mathrm{RD}}=\lim_{x\downarrow c}\mathbb E[Y\mid X=x]-\lim_{x\uparrow c}\mathbb E[Y\mid X=x]
Reads from #3 · Feeds into #5
Key code
rd <- rdrobust(y = vote, x = margin, c = 0,
               kernel = "triangular", bwselect = "mserd", all = TRUE)
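
The conventional point estimate is the difference between two kernel-weighted linear fits evaluated at the cutoff. The sketch below shows that mechanic in plain Python under simplifying assumptions: a fixed, hypothetical bandwidth, a triangular kernel, and no bias correction or standard errors. It does not reproduce rdrobust's bias-corrected rows, and the function names are illustrative.

```python
def wls_intercept(points, c, h):
    """Triangular-weighted least squares of y on (x - c); returns the fitted value at c."""
    sw = swx = swy = swxx = swxy = 0.0
    for y, x in points:
        d = x - c
        w = max(0.0, 1.0 - abs(d) / h)  # triangular kernel weight
        sw += w
        swx += w * d
        swy += w * y
        swxx += w * d * d
        swxy += w * d * y
    det = sw * swxx - swx * swx
    if det == 0:
        raise ValueError("not enough distinct points inside the bandwidth")
    slope = (sw * swxy - swx * swy) / det
    return (swy - slope * swx) / sw


def sharp_rd(y, x, c, h):
    """Right-hand fit at c minus left-hand fit at c."""
    pairs = list(zip(y, x))
    left = [(yi, xi) for yi, xi in pairs if xi < c]
    right = [(yi, xi) for yi, xi in pairs if xi >= c]
    return wls_intercept(right, c, h) - wls_intercept(left, c, h)


# Hypothetical data: same slope on both sides, with a jump of 2 at x = 0.
x_toy = [-4, -3, -2, -1, 1, 2, 3, 4]
y_toy = [1 + 0.5 * v if v < 0 else 3 + 0.5 * v for v in x_toy]
tau_hat = sharp_rd(y_toy, x_toy, c=0.0, h=5.0)
print(round(tau_hat, 6))
```

Because the toy outcome is exactly linear on each side, the two intercepts are recovered exactly and their difference equals the built-in jump; with real data the fit is only approximately linear near the cutoff, which is the bias that the robust rows address.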

5
Inference

Read the three inference rows

Uncertainty quantification — standard errors, intervals, and aggregation.

What happens here

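summary() reports three rows: Conventional, Bias-Corrected, and Robust, each with a point estimate, standard error, and confidence interval. In the example output, the Bias-Corrected and Robust rows share one point estimate (7.507) and differ only in the standard error (1.459 vs. 1.741), so the Robust interval is the widest of the three.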

Reads from #4 · Feeds into the final output
Key code
summary(rd)

3

Output · what you get

Fig 1. The jump in the outcome at the cutoff: the local RD treatment effect (τ ≈ 7.4).

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

\tau_{\mathrm{RD}}=\lim_{x\downarrow c}\mathbb E[Y\mid X=x]-\lim_{x\uparrow c}\mathbb E[Y\mid X=x]

⚠️ Unofficial community showcase of rdrobust. Not affiliated with the authors. All credit to the package authors (Calonico, Cattaneo, Farrell, Titiunik) and, for the bandwidth lineage, to Guido Imbens & coauthors; this summarizes public documentation.

What it does. In a regression discontinuity (RD) design, units just above and just below a cutoff on a running variable are treated as if randomized; rdrobust estimates the jump in the outcome at the cutoff.

How it works. It fits local-linear (or local-quadratic) regressions on each side of the threshold, selects a data-driven bandwidth, and, critically, reports bias-corrected point estimates with robust confidence intervals that account for the bias introduced by the smoothing.

Assumptions. Continuity of potential outcomes at the cutoff and no precise manipulation of the running variable; the estimate is a local effect at the threshold.

Imbens's contribution. The mean-squared-error-optimal bandwidth selector originates with Imbens & Kalyanaraman (2012, Review of Economic Studies); rdrobust (Calonico, Cattaneo, Farrell, Titiunik) builds robust inference on top of that bandwidth lineage.
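
The no-manipulation assumption can be probed informally by comparing how many observations fall just below and just above the cutoff. The sketch below is one simple way to do that, assuming an exact two-sided binomial test with success probability 0.5 inside a hand-picked window. The window width and the data are hypothetical, and this check is not something rdrobust itself performs.

```python
from math import comb


def counts_near_cutoff(xs, c, w):
    """Observations in [c - w, c) and in [c, c + w]."""
    below = sum(1 for x in xs if c - w <= x < c)
    above = sum(1 for x in xs if c <= x <= c + w)
    return below, above


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n fair coin flips."""
    probs = [comb(n, i) * 0.5 ** n for i in range(n + 1)]
    observed = probs[k]
    # Sum the probabilities of all outcomes at most as likely as the observed one.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


xs = [-0.9, -0.4, -0.3, -0.1, 0.1, 0.2, 0.4, 0.8]  # hypothetical running variable
below, above = counts_near_cutoff(xs, c=0.0, w=0.5)
p_value = binomial_two_sided_p(above, below + above)
print(below, above, p_value)
```

A very small p-value would indicate that units pile up on one side of the threshold, which is the pattern sorting around the cutoff tends to produce; a balanced count is consistent with the assumption but does not prove it.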

What you get — RD point estimate at the cutoff with conventional, bias-corrected, and robust standard errors / CIs.

Example output

Sharp RD estimates using local polynomial regression.

Number of Obs.                 1297
BW type                        mserd
Kernel                         Triangular
VCE method                     NN

Number of Obs.                  595          702
Eff. Number of Obs.             360          323
Order est. (p)                    1            1
BW est. (h)                  17.754       17.754

=============================================================================
        Method     Coef. Std. Err.    z     P>|z|      [ 95% C.I. ]
=============================================================================
  Conventional     7.414     1.459  5.083   0.000    [ 4.555 , 10.273 ]
Bias-Corrected     7.507     1.459  5.146   0.000    [ 4.647 , 10.366 ]
        Robust     7.507     1.741  4.311   0.000    [ 4.094 , 10.919 ]
=============================================================================
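
Each row of the table follows the usual normal-approximation recipe: z is the coefficient divided by its standard error, and the interval is the coefficient plus or minus the critical value times the standard error. The sketch below recomputes those columns from the rounded coefficients and standard errors printed above; because the inputs are already rounded to three decimals, the results agree with the table only to within about 0.002.

```python
from statistics import NormalDist


def wald_row(coef, se, level=0.95):
    """Return (z, two-sided p-value, (ci_low, ci_high)) under a normal approximation."""
    std_normal = NormalDist()
    z = coef / se
    crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p, (coef - crit * se, coef + crit * se)


# (Coef., Std. Err.) copied from the example output.
table = {
    "Conventional": (7.414, 1.459),
    "Bias-Corrected": (7.507, 1.459),
    "Robust": (7.507, 1.741),
}
results = {name: wald_row(coef, se) for name, (coef, se) in table.items()}
for name, (z, p, (lo, hi)) in results.items():
    print(f"{name:>14}  z = {z:.3f}  CI = [{lo:.3f}, {hi:.3f}]")
```

The Robust row is centered on the bias-corrected coefficient but uses the larger standard error, so its interval is wider than the Bias-Corrected one and shifted relative to the Conventional one.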
