CUPED Variance Reduction

CUPED (Controlled-experiment Using Pre-Experiment Data) reduces the variance of your metric estimates by adjusting for pre-experiment behavior. Lower variance means an experiment reaches the same statistical power with fewer users: typically 20–40% fewer, depending on how predictive the covariate is.

Reference: Deng, Xu, Kohavi & Walker — Improving the sensitivity of online controlled experiments by utilizing pre-experiment data (WSDM 2013).


How It Works

CUPED adjusts each user's observed metric by subtracting a term proportional to their pre-experiment value:

Y_cuped = Y - θ × (X - E[X])

Where:

  • Y = observed metric (e.g. revenue in the experiment window)
  • X = pre-experiment covariate (e.g. revenue in the 2 weeks before experiment)
  • θ = OLS coefficient (Cov(Y, X) / Var(X)) estimated from the control group
  • E[X] = population mean of the covariate

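The adjustment removes the variance explained by X, leaving a lower-noise estimate of the treatment effect. Because X is measured before users are assigned to a variant, its expected value is the same in every variant, so the correction term averages out and the estimated effect is not biased.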
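
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the service's implementation: the function name cuped_adjust and the control_mask argument are hypothetical, and E[X] is approximated by the sample mean of X over all users.

```python
from statistics import fmean


def cuped_adjust(y, x, control_mask):
    """Return (theta, adjusted values) for paired outcome/covariate lists.

    theta is estimated from control-group users only, as described above.
    Assumption: E[X] is approximated by the mean of x over all users.
    """
    xc = [xi for xi, c in zip(x, control_mask) if c]
    yc = [yi for yi, c in zip(y, control_mask) if c]
    mx, my = fmean(xc), fmean(yc)
    # Sample covariance and variance on the control group (n - 1 denominator).
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(xc, yc)) / (len(xc) - 1)
    var_x = sum((a - mx) ** 2 for a in xc) / (len(xc) - 1)
    theta = cov_xy / var_x
    x_bar = fmean(x)
    return theta, [yi - theta * (xi - x_bar) for yi, xi in zip(y, x)]


# Toy data: control follows y = 2x, treatment follows y = 2x + 1.
x = [1, 2, 3, 4, 5, 6]
y = [2.0, 4.0, 6.0, 9.0, 11.0, 13.0]
is_control = [True, True, True, False, False, False]
theta, y_cuped = cuped_adjust(y, x, is_control)
```

In this toy data the covariate explains all of the within-group spread, so every adjusted control value collapses to one number and every adjusted treatment value to another, leaving only the treatment effect of 1.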

Winsorization

Extreme outliers can inflate variance and destabilize θ. The service supports optional winsorization: values outside the [lower_pct, upper_pct] percentile range are clamped before the CUPED adjustment is applied.
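
As a rough sketch of the clamping step, the snippet below winsorizes a list of values at two percentiles. The helper names are hypothetical, and the linear-interpolation percentile used here is an assumption; the service may compute percentiles differently.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated q-quantile (0 <= q <= 1) of a sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def winsorize(values, lower_pct=0.01, upper_pct=0.99):
    """Clamp values to the [lower_pct, upper_pct] percentile range."""
    ordered = sorted(values)
    low = percentile(ordered, lower_pct)
    high = percentile(ordered, upper_pct)
    # Values are clamped, not dropped, so the sample size is unchanged.
    return [min(max(v, low), high) for v in values]


# The single extreme value (1000) is pulled down to the upper bound.
clamped = winsorize([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1000], 0.1, 0.9)
```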


API Reference

GET /api/v1/results/{experiment_id}/cuped

Returns CUPED-adjusted treatment effect estimates alongside the standard (unadjusted) results.

Query Parameters

Parameter   Type    Default   Description
winsorize   bool    false     Apply winsorization before adjustment
lower_pct   float   0.01      Lower winsorization percentile (1st percentile)
upper_pct   float   0.99      Upper winsorization percentile (99th percentile)

Example Request

curl -X GET "http://localhost:8000/api/v1/results/exp-uuid/cuped?winsorize=true" \
  -H "Authorization: Bearer $TOKEN"
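
The same request can be built from Python with the standard library. This is a sketch only: build_cuped_request is a hypothetical helper, and sending the percentile bounds only when winsorize is true is a design choice made here, not documented behavior.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_cuped_request(base_url, experiment_id, token, winsorize=False,
                        lower_pct=0.01, upper_pct=0.99):
    """Build (but do not send) a GET request for the CUPED results endpoint."""
    params = {"winsorize": "true" if winsorize else "false"}
    if winsorize:
        # Assumption: percentile bounds only matter when winsorization is on.
        params.update(lower_pct=lower_pct, upper_pct=upper_pct)
    url = f"{base_url}/api/v1/results/{experiment_id}/cuped?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = build_cuped_request("http://localhost:8000", "exp-uuid", "TOKEN", winsorize=True)
print(req.full_url)
```

Passing the returned object to urllib.request.urlopen sends the request; the sketch stops short of that so it can be run without a server.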

Example Response

{
  "experiment_id": "exp-uuid",
  "cuped_results": [
    {
      "variant_id": "variant-uuid",
      "variant_name": "Checkout v2",
      "adjusted_control_mean": 48.21,
      "adjusted_treatment_mean": 51.84,
      "adjusted_effect": 3.63,
      "adjusted_se": 0.91,
      "adjusted_p_value": 0.0001,
      "adjusted_ci": [1.85, 5.41],
      "variance_reduction_pct": 31.4,
      "theta": 0.78
    }
  ],
  "winsorized": true,
  "lower_pct": 0.01,
  "upper_pct": 0.99
}

Response Fields

Field                    Description
adjusted_effect          CUPED-adjusted treatment effect (treatment mean − control mean)
adjusted_p_value         Two-tailed p-value from z-test on the adjusted effect
adjusted_ci              95% confidence interval for the adjusted effect
variance_reduction_pct   % of control variance removed by CUPED. Higher = more sensitive test
theta                    OLS coefficient. Values near 0 mean the covariate is a poor predictor; consider a different covariate
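
To see how the interval and p-value relate to the effect and its standard error, the sketch below recomputes both under a standard normal approximation. The function name z_test_summary is hypothetical, and the service's exact computation is not specified beyond "z-test", so treat this as an illustration of the relationship rather than a reproduction of the backend.

```python
from statistics import NormalDist


def z_test_summary(effect, se, confidence=0.95):
    """Two-tailed z-test p-value and confidence interval for an effect estimate."""
    std_normal = NormalDist()
    z = effect / se
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Critical value for a symmetric interval (about 1.96 at 95%).
    half_width = std_normal.inv_cdf(0.5 + confidence / 2) * se
    return p_value, (effect - half_width, effect + half_width)


# adjusted_effect and adjusted_se taken from the example response.
p_value, (ci_low, ci_high) = z_test_summary(3.63, 0.91)
print(round(ci_low, 2), round(ci_high, 2))  # 1.85 5.41
```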

Choosing a Covariate

The covariate X should be:

  • Measured before the experiment starts — prevents contamination
  • Correlated with the outcome metric — higher correlation → more variance reduction
  • Available for all users — users without covariate data are excluded from CUPED analysis
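
The link between correlation and variance reduction can be made concrete: when θ is the OLS coefficient, the fraction of Var(Y) that CUPED removes equals the squared Pearson correlation between Y and X. The helper below (a hypothetical name, for illustration only) estimates that fraction from historical data, which is one way to compare candidate covariates before an experiment starts.

```python
from math import sqrt
from statistics import fmean


def expected_variance_reduction(y, x):
    """Fraction of Var(Y) removed by CUPED: the squared Pearson correlation."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    rho = sxy / sqrt(sxx * syy)
    return rho ** 2


# A covariate that is only weakly related to the outcome.
weak = expected_variance_reduction([1, -1, 1, -1], [1, 2, 3, 4])
```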

Good covariate examples:

Outcome Metric                 Covariate
Revenue in experiment window   Revenue in prior 2–4 weeks
Session duration               Average session duration (pre-experiment)
Conversion rate                Prior conversion rate
Click-through rate             Prior CTR on similar content

Interpreting variance_reduction_pct

Value    Interpretation
< 10%    Covariate is a weak predictor. Little benefit from CUPED.
10–30%   Moderate benefit. Worth using.
30–60%   Strong benefit. Experiment is significantly more sensitive.
> 60%    Excellent covariate, but unusually high. Check for data leakage, for example a covariate window that overlaps the experiment period.
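
A client could apply these bands programmatically, for instance to flag possible leakage in a dashboard. The sketch below uses hypothetical function names and labels; how values exactly on a band boundary are assigned is an assumption. The second helper relies on the standard result that, at fixed power and effect size, the required sample size scales with the variance, and it assumes the reported reduction applies equally to every variant.

```python
def interpret_variance_reduction(pct):
    """Map variance_reduction_pct to one of the interpretation bands."""
    if pct < 10:
        return "weak"
    if pct < 30:
        return "moderate"
    if pct <= 60:
        return "strong"
    return "excellent (check for leakage)"


def relative_sample_size(pct):
    """Users needed relative to an unadjusted test at equal power."""
    return 1 - pct / 100


# variance_reduction_pct from the example response.
band = interpret_variance_reduction(31.4)
ratio = relative_sample_size(31.4)
```

With the 31.4% reduction from the example response, the adjusted analysis would need roughly 69% of the users an unadjusted test needs.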

Permissions

  • VIEWER: ✅ Read access
  • ANALYST: ✅ Read access
  • DEVELOPER: ✅ Read access
  • ADMIN: ✅ Full access