Double machine learning in Python (econml)

1

Input · what goes in

Outcome Y, treatment T, effect-modifier features X, and controls W (arrays or DataFrames).

Y	T	X1	X2
1.0	1	0.5	-0.3
0.2	0	-0.8	0.1
1.7	1	0.9	1.2
0.0	0	0.1	-0.6
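
As a minimal sketch, the four example rows above can be split into the arrays an estimator expects. The column order (Y, T, X1, X2) follows the table; the table shows no control columns, so this sketch assumes W is simply omitted.

```python
import numpy as np

# The four example rows from the table above: columns are Y, T, X1, X2.
rows = np.array([
    [1.0, 1, 0.5, -0.3],
    [0.2, 0, -0.8, 0.1],
    [1.7, 1, 0.9, 1.2],
    [0.0, 0, 0.1, -0.6],
])

Y = rows[:, 0]               # outcome, shape (n,)
T = rows[:, 1].astype(int)   # binary treatment, shape (n,)
X = rows[:, 2:]              # effect modifiers, shape (n, 2)

print(Y.shape, T.shape, X.shape)
```

Keeping Y and T one-dimensional and X two-dimensional mirrors the shapes used in the pipeline code further down.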

2

Pipeline · the recipe

1

Data prep

Build the data

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

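Provide an outcome, a (discrete) treatment, and effect-modifier features. The example simulates n = 1000 units whose true effect is 1 when X[:, 0] > 0 and 0 otherwise, so the true average effect is about 0.5.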

Reads from the input data · Feeds into #2

Key code

# Install:  pip install econml
import numpy as np
from econml.dml import LinearDML
from sklearn.ensemble import GradientBoostingRegressor

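n = 1000
X = np.random.normal(size=(n, 5))           # five effect-modifier features
T = np.random.binomial(1, 0.5, size=n)      # randomized binary treatment
# Treatment raises Y by 1 only when X[:, 0] > 0; X[:, 1] shifts the baseline.
Y = (X[:, 0] > 0) * T + X[:, 1] + np.random.normal(size=n)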

2

Estimation

Fit a Linear Double-ML model

The core estimate — where the causal quantity itself is computed.

What happens here

LinearDML fits flexible, cross-fitted ML models for Y and T, subtracts their predictions to form residuals, and regresses the Y residuals on the T residuals to recover the effect.

Formula

\tau(x)=\mathbb E[Y(1)-Y(0)\mid X=x];\quad \tilde Y=\tilde T\,\theta(x)+\varepsilon,\quad \tilde Y=Y-\mathbb E[Y\mid X,W],\ \ \tilde T=T-\mathbb E[T\mid X,W]\ \ (\text{double ML residuals})
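
The residual-on-residual regression in the formula can be sketched by hand with NumPy alone. This is an illustration under stated assumptions, not econml's implementation: the treatment is continuous, the effect is a constant 2.0, the nuisance models are plain least squares instead of gradient boosting, and the helper `ols_predict` is a hypothetical name introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
T = X[:, 0] + rng.normal(size=n)                       # treatment depends on X (confounding)
Y = 2.0 * T + X[:, 0] + X[:, 1] + rng.normal(size=n)   # true effect is 2.0 (assumed)

def ols_predict(X_train, y_train, X_test):
    """Fit least squares with an intercept on one fold, predict on another."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef

# Two-fold cross-fitting: each unit's nuisance predictions come from
# models trained on the other fold.
folds = np.array_split(rng.permutation(n), 2)
Y_res = np.empty(n)
T_res = np.empty(n)
for k, test_idx in enumerate(folds):
    train_idx = folds[1 - k]
    Y_res[test_idx] = Y[test_idx] - ols_predict(X[train_idx], Y[train_idx], X[test_idx])
    T_res[test_idx] = T[test_idx] - ols_predict(X[train_idx], T[train_idx], X[test_idx])

# Final stage: regress outcome residuals on treatment residuals (no intercept).
theta_hat = (T_res @ Y_res) / (T_res @ T_res)
print(theta_hat)
```

With this simulated design the printed estimate should land near the assumed true effect of 2.0; the exact value depends on the seed.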

Reads from #1 · Feeds into #3

Key code

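from sklearn.ensemble import GradientBoostingClassifier

# With discrete_treatment=True, model_t must be a classifier (it supplies treatment probabilities).
est = LinearDML(model_y=GradientBoostingRegressor(),
                model_t=GradientBoostingClassifier(),
                discrete_treatment=True)
est.fit(Y, T, X=X)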

3

Inference

Read the average effect

Uncertainty quantification — standard errors, intervals, and aggregation.

What happens here

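const_marginal_ate(X) averages the estimated CATE over the rows of X. Here X is the full sample, so the call returns a point estimate of the average treatment effect.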
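
The averaging step can be illustrated without econml. The sketch below is a hand-rolled stand-in, not the library's code: it assumes a randomized binary treatment with known propensity 0.5, a CATE that is linear in X, and in-sample least squares for the outcome model (no cross-fitting, to keep it short).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 2))
T = rng.binomial(1, 0.5, size=n)           # randomized binary treatment
cate_true = 1.0 + 0.5 * X[:, 0]            # assumed effect, linear in X[:, 0]
Y = cate_true * T + X[:, 1] + rng.normal(size=n)

# Residualize: T against its known propensity, Y against a linear fit on X.
A = np.column_stack([np.ones(n), X])
T_res = T - 0.5
Y_res = Y - A @ np.linalg.lstsq(A, Y, rcond=None)[0]

# Final stage: Y_res ~ T_res * (b0 + b1*X1 + b2*X2), i.e. a linear CATE model.
beta, *_ = np.linalg.lstsq(A * T_res[:, None], Y_res, rcond=None)

cate_hat = A @ beta          # per-unit CATE estimates
ate_hat = cate_hat.mean()    # the ATE is the sample average of the CATE
print(beta, ate_hat)
```

In this simulation the average should come out near 1.0, the mean of the assumed true CATE. For uncertainty, econml documents interval counterparts such as `const_marginal_ate_interval`; check the docs for the version you have installed.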

Reads from #2 · Feeds into the final output

Key code

print(est.const_marginal_ate(X))

3

Output · what you get

Fig 1. Double-ML CATE estimates across units from econml's LinearDML, centered on the average treatment effect.

Result figure rendered by StatsOtter from the package's documented example — unofficial community showcase; all credit to the original authors.

Result · the numbers

⚠️ Unofficial community showcase of econml (docs). Not affiliated with the authors — all credit to Microsoft Research / PyWhy (Lewis, Syrgkanis & collaborators); this summarizes public documentation.

What it does. econml brings the Athey–Imbens-style econometrics-meets-ML estimators to Python: it estimates conditional average treatment effects with machine-learning nuisance models while keeping valid inference on the causal parameter.

How it works. Estimators like LinearDML fit flexible models for the outcome and the treatment, partial both out (Neyman-orthogonal/double ML), and regress the residuals to recover the effect — optionally as a function of effect-modifiers X for a CATE. Cross-fitting removes overfitting bias.
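
To see why regressing residuals on residuals recovers the effect, here is a short derivation under the partially linear model. The baseline function g(X, W) is a name introduced for this sketch; the remaining symbols follow the formula used on this page.

```latex
\begin{aligned}
Y &= \theta(X)\,T + g(X,W) + \varepsilon, \qquad \mathbb E[\varepsilon \mid X, W, T] = 0 \\
\mathbb E[Y \mid X,W] &= \theta(X)\,\mathbb E[T \mid X,W] + g(X,W) \\
Y - \mathbb E[Y \mid X,W] &= \theta(X)\,\bigl(T - \mathbb E[T \mid X,W]\bigr) + \varepsilon \\
\tilde Y &= \tilde T\,\theta(X) + \varepsilon
\end{aligned}
```

Subtracting the second line from the first cancels g(X, W), so the final regression only involves the two residuals.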

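Assumptions. Unconfoundedness given X and W (no unobserved confounders), and overlap (every unit has a nonzero chance of each treatment level). Neyman orthogonalization makes the estimate first-order insensitive to errors in the first-stage ML models.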

econml packages the double-ML and heterogeneous-effects methods from the causal-ML agenda Imbens helped build; it is authored by the PyWhy/Microsoft team.

What you get — A CATE model you can evaluate at any X, and the average treatment effect implied by it.

Example output

0.5123847716
