Bayesian mixture modeling for principal stratification: causal effects within latent strata (e.g. compliers) under post-treatment confounding.
Input · what goes in
One row per unit: treatment Z, a post-treatment intermediate D, outcome Y, and any covariates.
| Z | D | Y |
|---|---|---|
| 1 | 1 | 2.31 |
| 0 | 0 | -0.4 |
| 1 | 0 | 0.88 |
| 0 | 0 | 1.05 |
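
To make the format concrete, here is a minimal base-R sketch that simulates rows of this shape. The stratum shares, outcome means, and the names `n_obs` and `sim` are illustrative assumptions; this is not the package's `sim_data_normal`.

```r
set.seed(1)
n_obs <- 1000

# Latent principal stratum: never-taker (n), complier (c), always-taker (a).
# The shares below are assumed for illustration.
S <- sample(c("n", "c", "a"), n_obs, replace = TRUE, prob = c(0.3, 0.4, 0.3))

# Randomized binary treatment.
Z <- rbinom(n_obs, 1, 0.5)

# Intermediate variable implied by the stratum: compliers follow Z,
# always-takers always have D = 1, never-takers always have D = 0.
D <- ifelse(S == "a", 1L, ifelse(S == "c", Z, 0L))

# Outcome: stratum-specific baseline, and only compliers respond to
# treatment (an assumed effect of 1).
baseline <- c(n = 0, c = 0.5, a = 1)
Y <- rnorm(n_obs, mean = baseline[S] + (S == "c") * Z, sd = 1)

sim <- data.frame(Z = Z, D = D, Y = Y)
head(sim)
```

The stratum label `S` is deliberately left out of `sim`: in real data it is never observed, which is why the model treats it as latent.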
Pipeline · the recipe
Specify the principal stratification model
Data preparation — shapes the raw inputs into what the estimator expects.
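PStrataModel() declares the principal strata as pairs of potential values D(0)D(1) of the intermediate variable: never-takers '00', compliers '01', and always-takers '11'. Leaving out the defier stratum '10' imposes monotonicity. The call also sets the outcome family, the priors, and which strata obey the exclusion restriction (here never-takers and always-takers).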
# Install: install.packages("PStrata")
library(PStrata)
model <- PStrataModel(
  S.formula = Z + D ~ 1,
  Y.formula = Y ~ 1,
  Y.family = gaussian(link = "identity"),
  strata = c(n = "00", c = "01", a = "11"),
  ER = c("n", "a"),
  prior_intercept = prior_normal(0, 1),
  prior_sigma = prior_inv_gamma(1)
)
summary(model)
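
The strata declaration determines which latent strata are compatible with each observed (Z, D) cell. The base-R sketch below derives that mapping from the same two-character codes, reading the first digit as D(0) and the second as D(1). The helper `compatible_strata` is hypothetical and not part of PStrata.

```r
# Stratum codes as in the model: first digit is D(0), second is D(1).
strata <- c(n = "00", c = "01", a = "11")

# Hypothetical helper: strata whose potential value D(z) equals the observed d.
compatible_strata <- function(z, d, strata) {
  d_z <- as.integer(substr(strata, z + 1, z + 1))  # D(z) for each stratum
  names(strata)[d_z == d]
}

# Enumerate the four observed cells and list the strata each one can contain.
cells <- expand.grid(Z = 0:1, D = 0:1)
cells$strata <- mapply(
  function(z, d) paste(compatible_strata(z, d, strata), collapse = ","),
  cells$Z, cells$D
)
cells
```

Two cells, (Z = 0, D = 0) and (Z = 1, D = 1), contain more than one stratum, which is why stratum membership has to be modeled as a mixture rather than read off the data.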
Fit the model with MCMC
The core estimate — where the causal quantity itself is computed.
fit() compiles the model to Stan and samples the posterior over stratum membership probabilities and outcome parameters across multiple chains.
ps_fit <- fit(model, data = sim_data_normal,
              chains = 4, warmup = 500, iter = 1000)
ps_fit
diagnostics(ps_fit)
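
As a rough cross-check on the posterior stratum probabilities, the shares can also be estimated by simple moments when treatment is randomized and monotonicity holds (no defiers). This base-R sketch uses simulated data with assumed shares; it is a sanity check alongside the fit, not part of the PStrata workflow.

```r
set.seed(2)
n_obs <- 5000

# Simulated data with assumed stratum shares (n = 0.3, c = 0.4, a = 0.3).
S <- sample(c("n", "c", "a"), n_obs, replace = TRUE, prob = c(0.3, 0.4, 0.3))
Z <- rbinom(n_obs, 1, 0.5)
D <- ifelse(S == "a", 1L, ifelse(S == "c", Z, 0L))

# Under monotonicity, only always-takers have D = 1 when Z = 0,
# and only never-takers have D = 0 when Z = 1.
p_a <- mean(D[Z == 0] == 1)
p_n <- mean(D[Z == 1] == 0)
p_c <- 1 - p_a - p_n

round(c(n = p_n, c = p_c, a = p_a), 3)
```

If the posterior means of the stratum probabilities sit far from these moment estimates, that is worth investigating before interpreting any effects.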
Estimate stratum-specific potential outcomes
Uncertainty quantification — standard errors, intervals, and aggregation.
estimate() summarizes the posterior mean potential outcomes E[Y(z)] within each principal stratum as a tidy data frame.
est <- estimate(ps_fit)
summary(est, "data.frame")
plot(est)
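
The columns of such a summary are ordinary functionals of posterior draws. The sketch below shows the computation in base R on hypothetical draws generated with `rnorm()`; with a real fit the draws would come from the sampler. The names `draws`, `summarize_one`, and `post` are illustrative.

```r
set.seed(3)

# Hypothetical posterior draws of the complier mean outcome under control
# and under treatment (stand-ins for real MCMC output).
draws <- cbind(`c:Y(0)` = rnorm(2000, 0.5, 0.10),
               `c:Y(1)` = rnorm(2000, 1.5, 0.11))

# Posterior mean, sd, median, and 95% equal-tailed interval for one quantity.
summarize_one <- function(x) {
  c(mean = mean(x), sd = sd(x), quantile(x, c(0.025, 0.5, 0.975)))
}

# One row per quantity, one column per summary statistic.
post <- as.data.frame(t(apply(draws, 2, summarize_one)))
post
```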
Contrast treatment effects within strata
Uncertainty quantification — standard errors, intervals, and aggregation.
contrast() with Z=TRUE forms the within-stratum average causal effect E[Y(1) - Y(0)] for each stratum, notably the complier average causal effect (CACE), with full posterior summaries.
ctr <- contrast(ps_fit, Z = TRUE)
summary(ctr, "data.frame")
plot(ctr)
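
When treatment is randomized and both monotonicity and the exclusion restriction hold, the complier effect can be cross-checked against the classical Wald (instrumental-variable) ratio. This base-R sketch uses simulated data in which the true complier effect is assumed to be 1; it is a frequentist sanity check, not what contrast() computes.

```r
set.seed(4)
n_obs <- 5000

# Simulated data; treatment shifts the outcome by 1 for compliers only.
S <- sample(c("n", "c", "a"), n_obs, replace = TRUE, prob = c(0.3, 0.4, 0.3))
Z <- rbinom(n_obs, 1, 0.5)
D <- ifelse(S == "a", 1L, ifelse(S == "c", Z, 0L))
Y <- rnorm(n_obs, mean = (S == "a") + (S == "c") * Z, sd = 1)

# Wald ratio: intention-to-treat effect on Y divided by the
# intention-to-treat effect on D (the estimated complier share).
itt_y <- mean(Y[Z == 1]) - mean(Y[Z == 0])
itt_d <- mean(D[Z == 1]) - mean(D[Z == 0])
cace_wald <- itt_y / itt_d
cace_wald
```

A large gap between this ratio and the posterior complier effect would suggest checking the outcome model, the priors, or the structural assumptions.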
Output · what you get
Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.
Result · the numbers
⚠️ Unofficial community showcase of PStrata (docs). Not affiliated with the authors — all credit to Fan Li & coauthors; this summarizes public documentation.
What it does. PStrata estimates principal causal effects — effects defined within latent principal strata such as compliers, always-takers and never-takers — when an intermediate (post-treatment) variable confounds the treatment-outcome relationship. It handles continuous, binary, count, and time-to-event outcomes (Liu & Li, 2023).
How it works. Units are modeled as a finite mixture over principal strata defined by joint potential values of the intermediate variable. A Bayesian model (compiled to Stan) jointly estimates stratum membership probabilities and outcome models within each stratum via MCMC. The workflow is PStrataModel() to specify the strata, the exclusion-restriction (ER) and monotonicity assumptions, and the priors; fit() to run MCMC; then estimate() and contrast() for stratum-specific potential outcomes and effects. Users can toggle assumptions (e.g. drop ER) to probe sensitivity.
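
The finite-mixture structure can be written as a generic observed-data likelihood. The notation here is an assumption for illustration rather than the package's own: $\pi_s$ is the probability of stratum $s$, $f_{s,z}$ is the outcome density in stratum $s$ under treatment $z$, and $\mathcal{S}(z, d)$ is the set of strata compatible with an observed cell.

```latex
L(\theta) \;=\; \prod_{i=1}^{N} \; \sum_{s \in \mathcal{S}(Z_i, D_i)}
  \pi_s(X_i)\, f_{s, Z_i}(Y_i \mid X_i),
\qquad
\mathcal{S}(z, d) = \{\, s : D_s(z) = d \,\}.

% With strata n = 00, c = 01, a = 11 (monotonicity):
\mathcal{S}(0,0) = \{n, c\}, \quad
\mathcal{S}(0,1) = \{a\}, \quad
\mathcal{S}(1,0) = \{n\}, \quad
\mathcal{S}(1,1) = \{c, a\}.

% Exclusion restriction for stratum s:
f_{s,0} = f_{s,1}.
```

Dropping the exclusion restriction for a stratum simply frees $f_{s,0}$ and $f_{s,1}$ to differ, which is what makes the sensitivity checks mentioned above a small change to the model.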
Assumptions. SUTVA, ignorable treatment assignment, plus user-chosen structural assumptions (monotonicity, exclusion restriction) and outcome-model/prior specification. Where the structural assumptions are relaxed, identification leans on the parametric mixture model and the priors.
What you get. Posterior stratum membership probabilities and stratum-specific causal contrasts (e.g. the complier average causal effect, also called the LATE) with credible intervals.
Example output
# Posterior summary of stratum effects (contrast Z=1 vs Z=0)
  stratum   mean     sd  2.5% median  97.5% Rhat
1       n 0.0000  0.000 0.000 0.0000 0.0000 1.00
2       c 1.0382 0.1471 0.751 1.0376 1.3271 1.00
3       a 0.0000  0.000 0.000 0.0000 0.0000 1.00
# Stratum proportions
  stratum   mean     sd  2.5% 97.5%
1       n 0.3013 0.0205 0.262 0.342
2       c 0.4021 0.0231 0.357 0.448
3       a 0.2966 0.0198 0.259 0.336
