StatsOtter Causal inference workflows
Workflow·4 steps

Analyzing list / item-count experiments (list)

Summary by StatsOtter

Multivariate regression for list (item-count) experiments, recovering the prevalence and predictors of a sensitive attitude without asking about it directly.

1

Input · what goes in

Survey data with the respondent's count response, a treatment/control indicator, and covariates.

id  treat  count  age  educ
1   1      3      40   2
2   0      2      55   3
3   1      2      31   1
4   0      1      62   2
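
As a rough illustration of the format above, the following base-R sketch reads the four example rows and runs a few sanity checks. The number of control items (`J = 3`) and the helper name `check_list_data` are assumptions made for illustration; neither comes from the list package.

```r
# Toy rows copied from the example above
raw <- "id treat count age educ
1 1 3 40 2
2 0 2 55 3
3 1 2 31 1
4 0 1 62 2"
dat <- read.table(text = raw, header = TRUE)

# Hypothetical helper: basic checks a list-experiment data set should pass.
# J is the number of control items (assumed to be 3 here).
check_list_data <- function(d, J) {
  # Treatment indicator must be coded 0/1
  stopifnot(all(d$treat %in% c(0, 1)))
  # Control respondents can report 0..J items, treated respondents 0..J+1
  max_count <- ifelse(d$treat == 1, J + 1, J)
  stopifnot(all(d$count >= 0 & d$count <= max_count))
  invisible(TRUE)
}

check_list_data(dat, J = 3)
table(dat$treat)  # group sizes by treatment status
```

Checks like these catch miscoded treatment indicators or out-of-range counts before any model is fitted.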
2

Pipeline · the recipe


1
Data prep

Load the list package and race data

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Load the list package and the 1991 National Race and Politics Survey list experiment on racial attitudes.

Reads from the input data · Feeds into #2
Key code
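# Install once if needed: install.packages("list")
library(list)
data(race)    # 1991 National Race and Politics Survey list experiment
set.seed(1)   # fix the RNG seed for reproducibility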

2
Diagnostic / pre-tests

Difference-in-means prevalence

A pre-flight check — run this before trusting any estimate downstream.

What happens here

Fit an intercept-only ictreg with method='lm' to recover the basic difference-in-means estimate of the sensitive-item prevalence.

Formula
\hat{\tau} = \bar{Y}_{treat} - \bar{Y}_{control}
Reads from #1 · Feeds into #3
Key code
diff.in.means <- ictreg(y ~ 1, data = race, treat = "treat",
                        J = 3, method = "lm")
summary(diff.in.means)
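
The difference-in-means formula can also be worked by hand. The sketch below is a minimal base-R illustration on the four toy rows from the input example (not the race data); the variable names are illustrative, and with so few rows the resulting number carries no substantive meaning.

```r
# Toy responses and treatment indicator from the input example
count <- c(3, 2, 2, 1)
treat <- c(1, 0, 1, 0)

y_t <- count[treat == 1]  # treatment-list counts
y_c <- count[treat == 0]  # control-list counts

# Difference in mean counts estimates the sensitive-item prevalence
tau_hat <- mean(y_t) - mean(y_c)

# Standard error for a difference of two independent group means
se_hat <- sqrt(var(y_t) / length(y_t) + var(y_c) / length(y_c))

# The same point estimate is the OLS coefficient on the treatment indicator
ols <- coef(lm(count ~ treat))[["treat"]]

c(tau_hat = tau_hat, se_hat = se_hat, ols = ols)
```

The agreement between `tau_hat` and the OLS coefficient is why an intercept-only linear fit reproduces the difference in means.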

3
Estimation

Multivariate ML regression

The core estimate — where the causal quantity itself is computed.

What happens here

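Fit the constrained maximum-likelihood model, in which the control-item coefficients are the same whether or not a respondent holds the sensitive trait, relating covariates to the latent sensitive-item response (Imai 2011, Table 1).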

Formula
g(x_i^\top \delta) = \Pr(Z_i^* = 1 \mid x_i)
Reads from #2 · Feeds into #4
Key code
ml.results <- ictreg(y ~ south + age + male + college,
                     data = race, treat = "treat", J = 3,
                     method = "ml", constrained = TRUE)
summary(ml.results)
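
To make the constrained likelihood more concrete, here is a hedged base-R sketch on simulated data. It assumes a logistic model for the sensitive item and a binomial model for the control-item count. The function `loglik_constrained`, the simulated coefficients, and the use of `optim()` are all illustrative assumptions: direct numerical optimisation is swapped in for brevity and is not claimed to be how ictreg() fits the model.

```r
set.seed(42)
J <- 3                                       # number of control items
n <- 2000
x <- rnorm(n)
X <- cbind(1, x)                             # design matrix with intercept
treat <- rbinom(n, 1, 0.5)
z  <- rbinom(n, 1, plogis(-1 + 0.8 * x))     # latent sensitive-item answer
y0 <- rbinom(n, J, plogis(0.2 + 0.3 * x))    # control-item count
y  <- y0 + treat * z                         # observed count

# Hypothetical helper: constrained list-experiment log-likelihood
loglik_constrained <- function(par, y, treat, X, J) {
  k <- ncol(X)
  delta <- par[seq_len(k)]                   # sensitive-item coefficients
  psi   <- par[k + seq_len(k)]               # control-item coefficients
  g <- plogis(drop(X %*% delta))             # Pr(Z* = 1 | x)
  p <- plogis(drop(X %*% psi))               # per-item control probability
  h <- function(v) dbinom(v, size = J, prob = p)  # zero outside 0..J
  # Treated: mixture over the latent answer; control: control-item model only
  lik <- ifelse(treat == 1, g * h(y - 1) + (1 - g) * h(y), h(y))
  sum(log(pmax(lik, 1e-300)))                # guard against log(0)
}

start <- rep(0, 2 * ncol(X))
fit <- optim(start, function(p) -loglik_constrained(p, y, treat, X, J),
             method = "BFGS")
delta_hat <- fit$par[1:2]                    # sensitive-item estimates
delta_hat
```

In the treated group an observed count of y can arise either from y - 1 control items plus the sensitive item or from y control items alone, which is why the likelihood is a two-component mixture there.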

4
Reporting

Predict sensitive-item prevalence

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

Use predict() to obtain the model-based estimated proportion holding the sensitive view, with uncertainty.

Reads from #3 · Feeds into the final output
Key code
predict(ml.results, newdata = race, avg = TRUE,
        interval = "confidence", se.fit = TRUE)
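
As a sketch of what an averaged prediction amounts to for the point estimate, assuming a logistic sensitive-item model: each row's predicted probability is computed from the coefficients and then averaged over rows. The coefficients and the tiny covariate matrix below are hypothetical, and the sketch deliberately leaves the standard error and interval to predict().

```r
# Hypothetical sensitive-item coefficients: intercept and one binary covariate
delta_hat <- c(-1, 0.5)

# Hypothetical new data: four respondents, two with the covariate equal to 1
X_new <- cbind(1, c(0, 1, 0, 1))

# Per-respondent predicted probability of holding the sensitive trait
p_i <- plogis(drop(X_new %*% delta_hat))

# Averaged prediction: estimated proportion in this (toy) population
avg_prev <- mean(p_i)
avg_prev
```

Averaging over the rows of the data weights the covariate profiles by how often they occur, so the result reads as a population-level proportion rather than a prediction for one profile.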

3

Output · what you get

Fig 1. Estimated predictors of holding the sensitive attitude, with 95% confidence intervals.

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

\hat{\tau} = \bar{Y}_{treat} - \bar{Y}_{control}

⚠️ Unofficial community showcase of list (docs). Not affiliated with the authors — all credit to Kosuke Imai & coauthors; this summarizes public documentation.

What it does: list implements the Blair-Imai statistical toolkit for list experiments (a.k.a. the item-count technique), a survey design that elicits truthful answers to sensitive questions (prejudice, corruption, illegal behavior) by hiding the sensitive item among innocuous ones.

How it works: respondents are randomized to a control list or a treatment list that adds the sensitive item, and report only how many items apply, not which ones, which preserves privacy. The package's ictreg() fits maximum-likelihood (via EM) multivariate regressions that model the probability of holding the sensitive trait as a function of covariates, with estimators for standard, multiple-sensitive-item, and design-effect cases; Bayesian counterparts are available through ictregBayes().

Assumptions: identification relies on no design effect (adding the sensitive item doesn't change responses to control items) and no liars (truthful answering under the protected format). The package supplies diagnostic tests for these assumptions and methods to combine list experiments with direct or endorsement questions.
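
The identification logic can be checked with a small simulation. The base-R sketch below generates a list experiment in which both assumptions hold by construction and compares the difference in means with the true prevalence. The prevalence of 0.3, the three control items, and the per-item probability of 0.5 are arbitrary illustrative choices.

```r
set.seed(123)
n <- 100000
J <- 3                    # number of control items
prevalence <- 0.3         # true share holding the sensitive trait

treat <- rbinom(n, 1, 0.5)                # random assignment to the longer list
control_count <- rbinom(n, J, 0.5)        # unaffected by treatment: no design effect
sensitive <- rbinom(n, 1, prevalence)     # answered truthfully: no liars

# Respondents report only the total; the sensitive item counts only if treated
y <- control_count + treat * sensitive

est <- mean(y[treat == 1]) - mean(y[treat == 0])
c(estimate = est, truth = prevalence)
```

Because randomization balances the control-item counts across arms, the only systematic difference between the two group means is the sensitive item, which is what the difference in means picks up.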

What you get — Estimated population prevalence of the sensitive trait and regression coefficients linking covariates to it, with standard errors/credible intervals.

Example output

Item Count Technique Regression 

Call: ictreg(formula = y ~ south + age + male + college, data = race,
    treat = "treat", J = 3, method = "ml", constrained = TRUE)

Sensitive item 
            Est. S.E.
(Intercept) -7.343 1.837
south        2.580 0.713
age          0.422 0.474
male         0.328 0.477
college     -0.831 0.522

Control items 
            Est. S.E.
(Intercept)  0.700 0.111
south        0.310 0.072
age          0.181 0.030

Log-likelihood: -1947.085

Links: package · paper
