Workflow·4 steps

Quantitative Social Science: data and code (qss)

Summary by StatsOtter

The companion R package for Imai's textbook Quantitative Social Science, bundling every dataset and chapter vignette for hands-on data-analysis teaching.

1

Input · what goes in

No user data required—the package supplies the textbook's bundled datasets and chapter vignettes.

dataset    description                  rows
elections  U.S. election returns        1000+
afghan     Afghanistan survey           2754
resume     resume audit experiment      4870
social     social-pressure experiment   305866
2

Pipeline · the recipe


1
Data prep

Install and load qss

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Install the teaching package from GitHub and load the resume audit-experiment dataset from the Causality chapter.

Reads from the input data Feeds into #2
Key code
# One-time install from GitHub (either line works; the second also builds the chapter vignettes):
# remotes::install_github("kosukeimai/qss-package")
# devtools::install_github("kosukeimai/qss-package", build_vignettes = TRUE)
library(qss)
data(resume, package = "qss")
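As a quick sanity check after loading, the sketch below inspects the dataset only if qss is already installed, so it runs either way. The flag name `loaded` is illustrative, and the expectation that `resume` contains `race` and `call` columns is taken from how the later steps on this page use it.

```r
# Guarded load: only touches qss if it is already installed, so the
# snippet runs (and prints a hint) with or without the package.
if (requireNamespace("qss", quietly = TRUE)) {
  data(resume, package = "qss")
  str(resume)  # overview of variables and their types
  # The later steps rely on these two columns being present.
  stopifnot(all(c("race", "call") %in% names(resume)))
  loaded <- TRUE
} else {
  message("qss is not installed; see the install lines above.")
  loaded <- FALSE
}
```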

2
Diagnostic / pre-tests

Cross-tabulate callbacks by race

A pre-flight check — run this before trusting any estimate downstream.

What happens here

Tabulate callback (0/1) against the randomly assigned applicant race implied by the name.

Reads from #1 Feeds into #3
Key code
race.call.tab <- table(race = resume$race, call = resume$call)
race.call.tab
prop.table(race.call.tab, margin = 1)
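To make the `margin = 1` argument concrete, here is a minimal sketch that rebuilds the cross-tabulation from the counts shown in this page's example output, so it runs without qss installed. The object names `counts` and `row.props` are illustrative.

```r
# Rebuild the 2x2 table from the counts in the documented example output.
# With qss installed, table(race = resume$race, call = resume$call)
# should give the same table.
counts <- matrix(c(2278, 157,
                   2200, 235),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(race = c("black", "white"),
                                 call = c("0", "1")))
race.call.tab <- as.table(counts)

# margin = 1 divides each cell by its row total, so every row is the
# distribution of call (0/1) within one race group.
row.props <- prop.table(race.call.tab, margin = 1)
print(row.props)

# Each row of proportions sums to 1.
print(rowSums(row.props))

# Group sizes: the number of résumés in each arm.
print(rowSums(race.call.tab))
```

Reading the table by row is what makes the two callback rates directly comparable; `margin = 2` would instead condition on the outcome.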

3
Estimation

Difference-in-means in callback rates

The core estimate — where the causal quantity itself is computed.

What happens here

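Compute the callback rate for white-sounding and black-sounding names and take the difference. Because names were randomly assigned to résumés, this difference in means estimates the average causal effect of the race signaled by the name on callbacks.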

Formula
\widehat{ATE} = \bar{Y}_{white} - \bar{Y}_{black}
Reads from #2 Feeds into #4
Key code
callback.rate <- prop.table(race.call.tab, margin = 1)[, 2]
ate <- callback.rate["white"] - callback.rate["black"]
ate
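The same estimate can be reached from row-level data. The sketch below is an illustration rather than part of the page's workflow: it reconstructs a stand-in data frame (the hypothetical `resume.sim`) from the counts in the example output, then computes the difference in means directly and as the slope of a linear regression on a race indicator, a standard equivalence for a binary treatment.

```r
# Stand-in row-level data rebuilt from the documented counts
# (resume.sim is a hypothetical name, not an object shipped by qss).
resume.sim <- data.frame(
  race = rep(c("black", "white"), each = 2435),
  call = c(rep(c(0, 1), times = c(2278, 157)),
           rep(c(0, 1), times = c(2200, 235)))
)

# Difference in means of the 0/1 outcome between the two groups.
rates <- tapply(resume.sim$call, resume.sim$race, mean)
ate.means <- unname(rates["white"] - rates["black"])

# With a single binary regressor, the OLS slope equals that difference.
fit <- lm(call ~ race, data = resume.sim)
ate.lm <- unname(coef(fit)["racewhite"])

print(c(diff.in.means = ate.means, lm.slope = ate.lm))
```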

4
Reporting

Interpret the effect

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

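White-sounding names receive callbacks at a higher rate than black-sounding names (about 9.7% versus 6.4%, a gap of roughly 3.2 percentage points). Because names were randomly assigned to résumés, this gap is evidence of racial discrimination at the callback stage of hiring.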

Reads from #3 Feeds into the final output
Key code
round(callback.rate, 3)
round(ate, 3)
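For a reader-facing table, the rounded proportions can be restated in percent and percentage points. This is one possible presentation, assuming the rates from the example output; the `report` data frame and its column names are illustrative.

```r
# Callback rates taken from the counts in the documented example output.
callback.rate <- c(black = 157 / 2435, white = 235 / 2435)
ate <- callback.rate["white"] - callback.rate["black"]

# One row per group plus the gap, expressed in percent / percentage points.
report <- data.frame(
  group   = c(names(callback.rate), "white - black"),
  percent = round(100 * unname(c(callback.rate, ate)), 1)
)
print(report)
```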

3

Output · what you get

Fig 1. Résumé-callback rates by perceived race: the 3.2-percentage-point gap.

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

\tau=\mathbb E\big[Y(1)-Y(0)\big]
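The link between this estimand and the reported number can be written out as a short worked derivation. The treatment indicator T (1 for a white-sounding name, 0 for a black-sounding name) is notation assumed here for illustration; the counts come from the example output on this page.

```latex
\begin{aligned}
\tau &= \mathbb{E}\big[Y(1)\big] - \mathbb{E}\big[Y(0)\big] \\
     &= \mathbb{E}\big[Y \mid T = 1\big] - \mathbb{E}\big[Y \mid T = 0\big]
        \quad \text{(names are randomly assigned)} \\
\widehat{\tau} &= \bar{Y}_{white} - \bar{Y}_{black}
  = \frac{235}{2435} - \frac{157}{2435}
  = \frac{78}{2435} \approx 0.032
\end{aligned}
```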

⚠️ Unofficial community showcase of qss (docs). Not affiliated with the authors — all credit to Kosuke Imai & coauthors; this summarizes public documentation.

What it does: qss is the official companion R package to Kosuke Imai's textbook Quantitative Social Science: An Introduction (Princeton University Press, 2017). It packages all of the book's datasets and reproducible code so students and instructors can follow along chapter by chapter.

How it works: the package ships ready-to-load datasets (elections, civil war, social media, intervention studies, and more) and a set of vignettes, one per chapter (causality, measurement, prediction, probability, uncertainty), that walk through the analyses using base R. Rather than introducing new estimators, it provides a curated, classroom-tested foundation for learning causal inference, measurement, prediction, and statistical inference through real social-science data.

Assumptions: none statistical; it is a pedagogical resource. It is installed from GitHub (not CRAN), typically with build_vignettes = TRUE to access the chapter tutorials. It anchors a wider teaching ecosystem, including tidyverse and learnr adaptations of the same material.
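To see what an installed copy actually ships, base R can list a package's datasets and vignettes. The sketch below is guarded so that it runs whether or not qss is present; the variable names are illustrative, and the vignette list will be empty if the package was installed without build_vignettes = TRUE.

```r
# Check for the package without attaching it.
qss.available <- requireNamespace("qss", quietly = TRUE)

if (qss.available) {
  # data(package = ...) returns an index whose $results matrix has
  # Item (dataset name) and Title (short description) columns.
  datasets <- data(package = "qss")$results[, c("Item", "Title"), drop = FALSE]
  print(head(datasets))

  # vignette(package = ...) returns the same kind of index for vignettes.
  vignettes <- vignette(package = "qss")$results[, c("Item", "Title"), drop = FALSE]
  print(vignettes)
} else {
  message("qss is not installed; nothing to list.")
}
```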

What you get — Loadable datasets and per-chapter vignettes reproducing the book's analyses (causality, measurement, prediction, probability, uncertainty).

Example output

       call
race       0    1
  black 2278  157
  white 2200  235

prop.table(race.call.tab, margin = 1)[, 2]
     black      white 
0.06447639 0.09650924 

callback.rate["white"] - callback.rate["black"]
     white 
0.03203285 

Links: package · paper
