What is CUPED?

CUPED (Controlled-experiment Using Pre-Experiment Data) is a statistical method that reduces variance in A/B tests, enhancing their sensitivity and making it easier to detect differences between groups.

CUPED helps you run smarter experiments that reach conclusions faster and with greater confidence. It involves:

  • Using pre-experiment data to create a more precise measurement framework
  • Reducing statistical noise to reveal true treatment effects
  • Achieving statistical significance with less data when there's a genuine effect
  • Transforming experimentation with higher precision and faster results

CUPED in action

Image source: Optimizely

Leading companies like Netflix, Meta, and Airbnb use CUPED to boost experiment sensitivity and speed up learning cycles.

Why is CUPED important?

By implementing CUPED in your experimentation program, you can:

  • Reach statistical significance faster with smaller sample sizes
  • Detect smaller effects traditional methods might miss
  • Run more experiments with your existing traffic
  • Make data-driven decisions with greater confidence
  • Overcome common challenges like high metric variance and insufficient traffic

CUPED example

In digital experimentation, there's always room for improvement in statistical efficiency, and the best companies are constantly enhancing their testing capabilities to get faster, more reliable results.

Suppose we're running an experiment with the following setup:

  • Testing if a new ad campaign increases customer revenue
  • 1,000 customers split between control and treatment groups
  • Pre-experiment revenue data available for all customers

Here's how the results compare.

Traditional A/B testing:

  • 8% revenue lift observed
  • p-value: 0.09 (not statistically significant at the conventional 0.05 threshold)

With CUPED:

  • Same 8% revenue lift
  • p-value: 0.03 (statistically significant)
  • Variance reduced by 41%

By accounting for each customer's pre-experiment spending patterns, CUPED lets the team detect the same effect with statistical significance at the same sample size.
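To see how the numbers in this example hang together, here is a minimal sketch in Python. It rests on assumptions that are not stated in the example: a two-sided test under a normal approximation, an unchanged lift estimate, and a 41% variance reduction that applies directly to the variance of that estimate.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution

p_raw = 0.09               # two-sided p-value without CUPED (from the example)
variance_reduction = 0.41  # variance reduction reported with CUPED

# z-statistic implied by the unadjusted two-sided p-value
z_raw = std_normal.inv_cdf(1 - p_raw / 2)

# The standard error scales with the square root of the variance, so the
# same lift yields a z-statistic larger by a factor of 1 / sqrt(1 - reduction).
z_cuped = z_raw / (1 - variance_reduction) ** 0.5

# Convert the larger z-statistic back into a two-sided p-value
p_cuped = 2 * (1 - std_normal.cdf(z_cuped))

print(f"z without CUPED: {z_raw:.2f}")
print(f"z with CUPED:    {z_cuped:.2f}")
print(f"p with CUPED:    {p_cuped:.3f}")
```

Under these assumptions the adjusted p-value lands close to the 0.03 quoted above. The lift itself is unchanged; only the noise around it shrinks.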

Traditional A/B testing vs. CUPED

                      | Traditional A/B testing    | CUPED-enhanced testing
Pre-experiment data   | Not used                   | Used as a covariate
Metric variance       | Higher                     | Reduced
Sample size           | Larger required            | Smaller sufficient
Speed to significance | Slower                     | Faster
Effect detection      | May miss small differences | Can uncover subtle differences

How CUPED works

CUPED enhances your test results by using pre-existing data to reduce variance. Here's how it works:

  1. Collect historical data: Gather past performance data for selected metrics (requires at least two weeks of pre-experiment data).
  2. Build a predictive model: Estimate what results would look like if no changes were made.
  3. Adjust experiment results: Subtract the predicted baseline from the observed results in both the control and variant groups.
  4. Get more precise insights: Decreasing pre-existing variance increases statistical sensitivity, tightening confidence intervals.

The technical mechanics involve covariance calculations and linear regression, but in practice, CUPED automatically adjusts for pre-existing differences between users, focusing the analysis on changes that occur after the treatment.
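The steps above can be sketched in a few lines of Python. This is the commonly described single-covariate form of CUPED, not necessarily the exact estimator any given platform uses. The function name `cuped_adjust` and the simulated data are hypothetical and for illustration only.

```python
import random
from statistics import fmean, variance


def cuped_adjust(post, pre):
    """Return CUPED-adjusted values: post_i - theta * (pre_i - mean(pre))."""
    pre_mean = fmean(pre)
    post_mean = fmean(post)
    # Sample covariance between the pre-experiment covariate and the metric
    cov = sum((x - pre_mean) * (y - post_mean) for x, y in zip(pre, post)) / (len(pre) - 1)
    theta = cov / variance(pre)  # regression slope of post on pre
    return [y - theta * (x - pre_mean) for x, y in zip(pre, post)]


# Simulated users whose in-experiment metric correlates with their history
rng = random.Random(42)
pre = [rng.gauss(100, 30) for _ in range(1000)]
post = [0.8 * x + rng.gauss(20, 15) for x in pre]

adjusted = cuped_adjust(post, pre)

print(f"mean raw:          {fmean(post):.2f}")
print(f"mean adjusted:     {fmean(adjusted):.2f}")
print(f"variance raw:      {variance(post):.1f}")
print(f"variance adjusted: {variance(adjusted):.1f}")
```

The adjustment leaves the mean unchanged and removes the part of the variance that the pre-experiment values explain. In an actual experiment, theta would typically be estimated on the pooled control and variant data. The group means of the adjusted values are then compared as usual.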

CUPED use cases

CUPED can benefit many types of experiments, especially those that measure high-variance metrics:

  • Revenue metrics improvements: More accurately measure the impact of changes on high-variance metrics like average order value or revenue per user.
  • Engagement optimization: Detect meaningful differences in user engagement metrics like session time or page views with less data.
  • Ratio metrics handling: Improve the precision of metrics with a numerator/denominator structure, like items per order or clicks per user.
  • Low-traffic segments analysis: Boost statistical power when analyzing user segments with limited data, making it possible to run meaningful experiments on specific customer cohorts that would otherwise require prohibitively large sample sizes.

CUPED implementation

Optimizely makes it easy to enable CUPED for your experiments.

When implementing CUPED:

  • Compatible metrics: Works only with numeric metrics (revenue, engagement counts, etc.) rather than binary conversion metrics.
  • Pre-experiment data: Only pre-experiment values of the primary and secondary target metrics are used as covariates.
  • Platform support: Functions on major data warehouses including Snowflake, BigQuery, and Databricks.
  • Implementation steps: Typically activated through a simple toggle within experiment configuration interfaces, requiring no complex statistical calculations from the user.
  • Data requirements: Requires pre-experiment data for the metrics being analyzed; has no effect on newly created metrics with no historical data.
  • Expected outcome: Can significantly reduce variance in experiment results, potentially cutting sample size requirements by 30-50% for metrics with a strong correlation to historical behavior.
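As a rough guide to where a figure like 30-50% can come from, consider the commonly described single-covariate form of CUPED. This is an assumption here, since the exact estimator behind the toggle is not spelled out above. In that form, the remaining variance depends only on the correlation ρ between the pre-experiment covariate X and the in-experiment metric Y:

```latex
Y_{\text{cuped}} = Y - \theta\,\bigl(X - \mathbb{E}[X]\bigr), \qquad
\theta = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}, \qquad
\operatorname{Var}\bigl(Y_{\text{cuped}}\bigr) = \operatorname{Var}(Y)\,\bigl(1 - \rho^{2}\bigr)
```

The sample size needed for a given power scales with the metric's variance. A 30% reduction therefore corresponds to ρ² = 0.30 (ρ ≈ 0.55), and a 50% reduction to ρ² = 0.50 (ρ ≈ 0.71).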

Here's how it looks with and without CUPED.

Without CUPED

Image source: Optimizely

With CUPED

Image source: Optimizely

CUPED best practices

When implementing CUPED, follow these best practices:

  1. Choose the right metrics: CUPED works best with high-variance numeric metrics that show a strong correlation between the pre-experiment and experiment periods.
  2. Ensure sufficient historical data: At least two weeks of pre-experiment data is needed for effective variance reduction.
  3. Monitor data quality: Ensure consistent tracking before and during the experiment.
  4. Know the limits: CUPED may not help with new features, metrics with low pre/post correlation, or insufficient historical data.
  5. Combine with other techniques: For maximum benefit, use CUPED alongside proper experimental design and sample size calculations.
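For planning purposes, the pre/post correlation can be folded into a standard sample size calculation. The sketch below assumes a two-sided, two-sample z-test with equal group sizes and a single pre-experiment covariate. The function name `sample_size_per_group` and the input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(sigma, delta, rho=0.0, alpha=0.05, power=0.8):
    """Approximate users per group for a two-sided, two-sample z-test.

    sigma: standard deviation of the metric
    delta: minimum detectable absolute difference in means
    rho:   assumed pre/post correlation (0 means no CUPED benefit)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    # CUPED with one covariate scales the metric variance by (1 - rho^2)
    adjusted_variance = sigma ** 2 * (1 - rho ** 2)
    return ceil(2 * (z_alpha + z_power) ** 2 * adjusted_variance / delta ** 2)


baseline = sample_size_per_group(sigma=50, delta=5)
with_cuped = sample_size_per_group(sigma=50, delta=5, rho=0.7)

print(f"users per group without CUPED: {baseline}")
print(f"users per group with CUPED:    {with_cuped}")
```

With an assumed correlation of 0.7, the required sample roughly halves. With a correlation near zero the two numbers are nearly identical, which is why metrics with low pre/post correlation see little benefit.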

However, not all metrics benefit equally from CUPED...

While CUPED is powerful, its effectiveness varies by metric type:

  • Most effective for: Revenue per user, session duration, items per order, and other numeric metrics with high variance
  • Less effective for: Binary conversion metrics (currently not supported in Optimizely's implementation)
  • Requires consideration: Metrics that experience seasonal fluctuations or are affected by external factors

Wrapping up...

By reducing variance and increasing statistical power, CUPED helps unlock the full potential of experimentation programs.

Takeaways:

  • Detect results faster: Filter out noise, and uncover key differences between test groups.
  • Account for pre-existing differences: Adjust for chance imbalances between test groups using pre-experiment data.
  • Run smarter tests: Reach conclusions faster with less data.
  • Trust your results: Get clearer, data-backed insights.

Improve your experimentation with Optimizely's CUPED capabilities and join the ranks of data-driven organizations making smarter decisions in less time.