CUPED Variance Reduction
CUPED (Controlled-experiment Using Pre-Experiment Data) reduces the variance of your metric estimates by adjusting for pre-experiment behavior. Lower variance means an experiment can detect the same effect with fewer users, typically 20–40% fewer, depending on how strongly the covariate correlates with the metric.
Reference: Deng, Xu, Kohavi & Walker, "Improving the sensitivity of online controlled experiments by utilizing pre-experiment data" (WSDM 2013).
How It Works
CUPED adjusts each user's observed metric by subtracting a term proportional to their pre-experiment value:
Y_cuped = Y - θ × (X - E[X])
Where:
- Y = observed metric (e.g. revenue in the experiment window)
- X = pre-experiment covariate (e.g. revenue in the 2 weeks before the experiment)
- θ = OLS coefficient, Cov(Y, X) / Var(X), estimated from the control group
- E[X] = population mean of the covariate
The adjustment removes the variance explained by X, leaving a lower-noise estimate of the treatment effect.
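The adjustment above can be sketched in a few lines of Python. This is a minimal illustration on made-up data, not the service's implementation: the function names (`cuped_theta`, `cuped_adjust`) and the toy values are hypothetical, and E[X] is assumed to be the covariate mean over all users in the experiment.

```python
from statistics import fmean, pvariance


def cuped_theta(y_control, x_control):
    """OLS slope Cov(Y, X) / Var(X), estimated from control-group users."""
    my, mx = fmean(y_control), fmean(x_control)
    cov = sum((y - my) * (x - mx) for y, x in zip(y_control, x_control))
    var = sum((x - mx) ** 2 for x in x_control)
    return cov / var


def cuped_adjust(y, x, theta, x_mean):
    """Apply Y_cuped = Y - theta * (X - E[X]) to each user."""
    return [yi - theta * (xi - x_mean) for yi, xi in zip(y, x)]


# Toy data (hypothetical): pre-experiment revenue x, in-experiment revenue y.
x_control, y_control = [10.0, 20.0, 30.0, 40.0], [12.0, 19.0, 33.0, 40.0]
x_treat, y_treat = [15.0, 25.0, 35.0, 45.0], [18.0, 27.0, 36.0, 51.0]

theta = cuped_theta(y_control, x_control)
x_mean = fmean(x_control + x_treat)  # assumption: E[X] over all users

adj_control = cuped_adjust(y_control, x_control, theta, x_mean)
adj_treat = cuped_adjust(y_treat, x_treat, theta, x_mean)

adjusted_effect = fmean(adj_treat) - fmean(adj_control)
# Share of control-group variance removed by the adjustment.
variance_reduction = 1 - pvariance(adj_control) / pvariance(y_control)
```

In this toy data the treatment group happens to have higher pre-experiment revenue, so part of the raw difference in means is attributed to the covariate and the adjusted effect is smaller than the unadjusted one.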
Winsorization
Extreme outliers can inflate variance and destabilize θ. The service supports optional winsorization: values outside the [lower_pct, upper_pct] percentile range are clamped before the CUPED adjustment is applied.
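As a rough sketch of the clamping step, the snippet below winsorizes a list of values at the default percentiles. The percentile method (linear interpolation between order statistics) is an assumption; the service may compute percentiles differently, and the `percentile` and `winsorize` helpers are hypothetical names.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] on pre-sorted data."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def winsorize(values, lower_pct=0.01, upper_pct=0.99):
    """Clamp values to the [lower_pct, upper_pct] percentile range."""
    ordered = sorted(values)
    low = percentile(ordered, lower_pct)
    high = percentile(ordered, upper_pct)
    # Values inside the range are untouched; order of the input is preserved.
    return [min(max(v, low), high) for v in values]


# Hypothetical revenue data: 99 ordinary users plus one extreme outlier.
revenue = [float(v) for v in range(1, 100)] + [10_000.0]
clamped = winsorize(revenue)
```

Winsorizing clamps outliers rather than dropping them, so every user still contributes to the estimate.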
API Reference
GET /api/v1/results/{experiment_id}/cuped
Returns CUPED-adjusted treatment effect estimates alongside the standard (unadjusted) results.
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| winsorize | bool | false | Apply winsorization before adjustment |
| lower_pct | float | 0.01 | Lower winsorization percentile (1st percentile) |
| upper_pct | float | 0.99 | Upper winsorization percentile (99th percentile) |
Example Request
curl -X GET "http://localhost:8000/api/v1/results/exp-uuid/cuped?winsorize=true" \
-H "Authorization: Bearer $TOKEN"
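The same request can be prepared from Python with the standard library. This sketch only builds the URL and request object and does not send anything; the `cuped_url` helper, the host, and the token value are placeholders, and it assumes the endpoint accepts `true`/`false` for the boolean parameter, as in the curl example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def cuped_url(base_url, experiment_id, winsorize=False,
              lower_pct=0.01, upper_pct=0.99):
    """Build the CUPED results URL with explicit query parameters."""
    query = urlencode({
        "winsorize": str(winsorize).lower(),  # serialize bool as true/false
        "lower_pct": lower_pct,
        "upper_pct": upper_pct,
    })
    return f"{base_url}/api/v1/results/{experiment_id}/cuped?{query}"


token = "example-token"  # placeholder; use a real bearer token
url = cuped_url("http://localhost:8000", "exp-uuid",
                winsorize=True, lower_pct=0.05, upper_pct=0.95)
request = Request(url, headers={"Authorization": f"Bearer {token}"})
# To send it: urllib.request.urlopen(request), then json.load() the response.
```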
Example Response
{
  "experiment_id": "exp-uuid",
  "cuped_results": [
    {
      "variant_id": "variant-uuid",
      "variant_name": "Checkout v2",
      "adjusted_control_mean": 48.21,
      "adjusted_treatment_mean": 51.84,
      "adjusted_effect": 3.63,
      "adjusted_se": 0.91,
      "adjusted_p_value": 0.0002,
      "adjusted_ci": [1.85, 5.41],
      "variance_reduction_pct": 31.4,
      "theta": 0.78
    }
  ],
  "winsorized": true,
  "lower_pct": 0.01,
  "upper_pct": 0.99
}
Response Fields
| Field | Description |
|---|---|
| adjusted_effect | CUPED-adjusted treatment effect (treatment mean − control mean) |
| adjusted_p_value | Two-tailed p-value from z-test on the adjusted effect |
| adjusted_ci | 95% confidence interval for the adjusted effect |
| variance_reduction_pct | % of control variance removed by CUPED. Higher = more sensitive test |
| theta | OLS coefficient. Values near 0 mean the covariate is a poor predictor; consider a different covariate |
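To show how these fields relate, the sketch below recomputes the effect, a normal-approximation 95% interval, and a two-tailed z-test p-value from the rounded values in the example response. It assumes the interval is effect ± 1.96 × SE, which the documentation does not state explicitly. Because the inputs are rounded sample values, the recomputed p-value need not match `adjusted_p_value` digit for digit.

```python
import math

# Rounded values copied from the example response.
variant = {
    "adjusted_control_mean": 48.21,
    "adjusted_treatment_mean": 51.84,
    "adjusted_effect": 3.63,
    "adjusted_se": 0.91,
}

# Effect is the difference of the adjusted means.
effect = variant["adjusted_treatment_mean"] - variant["adjusted_control_mean"]

# Two-tailed z-test: p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
z = variant["adjusted_effect"] / variant["adjusted_se"]
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))

# Assumed 95% interval: effect +/- 1.96 standard errors.
half_width = 1.96 * variant["adjusted_se"]
ci = (variant["adjusted_effect"] - half_width,
      variant["adjusted_effect"] + half_width)
```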
Choosing a Covariate
The covariate X should be:
- Measured before the experiment starts — prevents contamination
- Correlated with the outcome metric — higher correlation → more variance reduction
- Available for all users — users without covariate data are excluded from CUPED analysis
Good covariate examples:
| Outcome Metric | Covariate |
|---|---|
| Revenue in experiment window | Revenue in prior 2–4 weeks |
| Session duration | Average session duration (pre-experiment) |
| Conversion rate | Prior conversion rate |
| Click-through rate | Prior CTR on similar content |
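One way to screen a candidate covariate before launching is to check its correlation with the outcome on historical data. With the OLS θ, CUPED removes a fraction ρ² of the variance, where ρ is the Pearson correlation between X and Y. The helper names and data below are hypothetical, and users with a missing covariate (`None`) are dropped to mirror the exclusion rule above.

```python
from statistics import fmean


def complete_pairs(pre, outcome):
    """Keep only users that have a pre-experiment covariate value."""
    return [(x, y) for x, y in zip(pre, outcome) if x is not None]


def pearson_r(pairs):
    """Pearson correlation of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def expected_variance_reduction_pct(pre, outcome):
    """Expected CUPED variance reduction: 100 * rho^2."""
    return 100 * pearson_r(complete_pairs(pre, outcome)) ** 2


# Hypothetical historical data; the last user has no pre-period revenue.
pre_revenue = [10.0, 20.0, 30.0, 40.0, None]
exp_revenue = [12.0, 19.0, 33.0, 40.0, 25.0]
pct = expected_variance_reduction_pct(pre_revenue, exp_revenue)
```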
Interpreting variance_reduction_pct
| Value | Interpretation |
|---|---|
| < 10% | Covariate is a weak predictor. Little benefit from CUPED. |
| 10–30% | Moderate benefit. Worth using. |
| 30–60% | Strong benefit. Experiment is significantly more sensitive. |
| > 60% | Excellent covariate. Verify that the covariate contains no data from the experiment window (check for data leakage). |
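If you surface these bands in a dashboard or alert, the table can be encoded as a small lookup. This is an illustrative helper, not part of the API. The table does not say which band owns the exact boundary values, so the handling here (10 and 30 go to the higher band, 60 stays in "strong") is an assumption.

```python
def interpret_variance_reduction(pct):
    """Map variance_reduction_pct to the interpretation bands in the table."""
    if pct < 10:
        return "weak"
    if pct < 30:
        return "moderate"
    if pct <= 60:
        return "strong"
    # Above 60%: excellent, but worth a data-leakage check.
    return "excellent (check for data leakage)"


label = interpret_variance_reduction(31.4)  # value from the example response
```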
Permissions
- VIEWER: ✅ Read access
- ANALYST: ✅ Read access
- DEVELOPER: ✅ Read access
- ADMIN: ✅ Full access