Regression for randomized-response surveys — recover predictors of a sensitive behavior while every respondent's individual answer stays private.
Input · what goes in
Survey data with the randomized-response item and predictor covariates.
| rr.q1 | cov.asset.index | cov.married | cov.age |
|---|---|---|---|
| 1 | 0.4 | 1 | 41 |
| 0 | -1.1 | 0 | 33 |
| 1 | 0.9 | 1 | 52 |
| 0 | 0.2 | 0 | 29 |
Pipeline · the recipe
Load the survey data
Data preparation — shapes the raw inputs into what the estimator expects.
Load the Nigeria randomized-response survey that ships with rr. The design probabilities for the forced-known design are passed in the fitting step.
# Install: install.packages("rr")
library(rr)
data(nigeria)
set.seed(1)
Fit the randomized-response regression
The core estimate, where the quantity of interest itself is computed.
rrreg deconvolves the known randomization noise and fits a logistic model for the latent sensitive trait.
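# p = P(answer truthfully), p1 = P(forced "yes"), p0 = P(forced "no")
# Covariates in the nigeria data carry a "cov." prefix, including cov.age
out <- rrreg(rr.q1 ~ cov.asset.index + cov.married + cov.age,
             data = nigeria, p = 2/3, p1 = 1/6, p0 = 1/6,
             design = "forced-known")
summary(out)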
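To make the "deconvolving" idea concrete, here is a minimal base-R sketch of a forced-response logistic likelihood fitted to simulated data. It is an illustration under stated assumptions, not the package's internal algorithm: the data are simulated, the optimizer is a plain `optim()` call, and names such as `x`, `beta_true`, and `negloglik` are hypothetical.

```r
# Sketch of a forced-response logistic likelihood, fitted with optim (base R).
# Simulated data; x, beta_true, negloglik are hypothetical names.
set.seed(7)

n  <- 20000
p  <- 2/3                                 # probability of a truthful answer
p1 <- 1/6                                 # probability of a forced "yes"

x         <- rnorm(n)                     # one simulated covariate
beta_true <- c(-1, 0.5)                   # assumed intercept and slope
truth     <- rbinom(n, 1, plogis(beta_true[1] + beta_true[2] * x))

# Randomization: truthful with prob p, forced "yes" with prob p1, else forced "no"
u        <- runif(n)
observed <- ifelse(u < p, truth, ifelse(u < p + p1, 1L, 0L))

# P(observed = 1 | x) = p1 + p * logistic(b0 + b1 * x)
negloglik <- function(beta) {
  pr <- p1 + p * plogis(beta[1] + beta[2] * x)
  -sum(observed * log(pr) + (1 - observed) * log(1 - pr))
}

fit <- optim(c(0, 0), negloglik, method = "BFGS", hessian = TRUE)
se  <- sqrt(diag(solve(fit$hessian)))     # SEs from the inverse observed information
print(round(cbind(estimate = fit$par, se = se), 3))
```

With a sample this large the estimates should land close to the assumed `beta_true`, while the standard errors are wider than a direct-question logistic regression would give, which is the price of the privacy protection.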
Output · what you get
Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.
Result · the numbers
⚠️ Unofficial community showcase of rr (docs). Not affiliated with the authors — all credit to Graeme Blair, Yang-Yang Zhou & Kosuke Imai; this summarizes public documentation.
What it does. The randomized-response technique lets people answer a sensitive yes/no question truthfully without revealing their answer: a randomizing device with known probabilities (a coin or die, for example) sometimes dictates the response, so no single answer can be traced back to the respondent's true status. rr (Blair, Imai & Zhou 2015) provides the multivariate regression that recovers how covariates predict the latent truthful response.
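A short simulation can illustrate why the aggregate is recoverable even though individual answers are masked. This is a sketch in base R under assumed values: the prevalence `true_prev`, the sample size, and all variable names are hypothetical, and the design probabilities simply reuse the forced-known values from the pipeline.

```r
# Illustrative simulation of a forced-response design (base R only).
# All names and values here are hypothetical.
set.seed(42)

n  <- 100000   # simulated respondents
p  <- 2/3      # probability the device lets the respondent answer truthfully
p1 <- 1/6      # probability the device forces a "yes"
p0 <- 1/6      # probability the device forces a "no"

true_prev <- 0.20                      # assumed prevalence of the sensitive trait
truth     <- rbinom(n, 1, true_prev)   # latent truthful answers (never observed)

# The device decides what each respondent reports
device   <- sample(c("truth", "yes", "no"), n, replace = TRUE, prob = c(p, p1, p0))
observed <- ifelse(device == "truth", truth, ifelse(device == "yes", 1L, 0L))

# P(observed = 1) = p1 + p * prevalence, so invert for the prevalence
est_prev <- (mean(observed) - p1) / p
print(round(c(observed_share = mean(observed), estimated_prevalence = est_prev), 3))
```

The observed share of "yes" answers overstates the prevalence, but subtracting the forced-"yes" probability and rescaling by `p` brings the estimate back near the assumed value.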
How it works. Given the design (mirrored, forced-known, or unrelated-question) and its known probabilities, rrreg() fits a logistic model for the latent sensitive trait by maximum likelihood, accounting for the noise the randomization adds. Companion functions predict prevalence and combine RR with direct questions.
Assumptions. Respondents follow the randomization device and answer truthfully under its protection; the design probabilities are known.
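A small arithmetic sketch shows why the "known probabilities" assumption matters. The numbers are hypothetical, and `estimate_prev` is an illustrative helper, not a function from rr.

```r
# How a wrong design probability biases the prevalence estimate (pure arithmetic).
# Hypothetical numbers throughout; estimate_prev is an illustrative helper.
true_prev <- 0.20
p_true    <- 2/3    # actual probability of a truthful answer
p1_true   <- 1/6    # actual probability of a forced "yes"

# Expected share of "yes" answers the survey would observe
observed_share <- p1_true + p_true * true_prev

# Moment estimator of prevalence under the analyst's assumed design
estimate_prev <- function(share, p, p1) (share - p1) / p

correct <- estimate_prev(observed_share, p = 2/3, p1 = 1/6)
wrong   <- estimate_prev(observed_share, p = 1/2, p1 = 1/4)  # mistaken design
print(c(correct = correct, wrong = wrong))
```

Under the correct design the estimator returns the assumed prevalence of 0.20; under the mistaken design the same observed share yields 0.10, so a misstated device halves the estimate in this example.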
What you get — Logistic-regression coefficients (with SEs) linking covariates to the latent sensitive response, despite the privacy noise.
Example output
Randomized Response Technique Regression
Estimated coefficients (logistic):
                  est    se      z     p
(Intercept)     -0.91  0.27  -3.37  0.00
cov.asset.index  0.14  0.06   2.33  0.02
cov.married      0.32  0.18   1.78  0.08
cov.age          0.01  0.01   1.02  0.31
Design: forced-known N = 2400
