Bonferroni correction is a statistical adjustment applied when multiple hypothesis tests are performed on the same dataset or experiment. It counteracts the multiple comparisons problem: when you run many tests at once, the probability of getting at least one false positive by chance rises with every test you add. Bonferroni correction controls this by dividing the significance threshold (α) by the number of tests. This makes each individual test harder to pass and keeps the family-wise error rate (the probability of at least one false positive across all tests) at or below the original α.
Adjusted α = Original α / Number of tests
Example: Running 5 simultaneous comparisons at the standard 0.05 significance level:
- Adjusted α = 0.05 / 5 = 0.01
- Each individual test must achieve p < 0.01 to be considered significant.
In practice:
- 3 tests: adjusted α = 0.0167
- 5 tests: adjusted α = 0.0100
- 10 tests: adjusted α = 0.0050
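The thresholds listed above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `bonferroni_alpha` is chosen here for illustration and does not come from any particular library.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Reproduce the thresholds for 3, 5, and 10 tests at the 0.05 level.
for n in (3, 5, 10):
    print(f"{n} tests: adjusted alpha = {bonferroni_alpha(0.05, n):.4f}")
# 3 tests: adjusted alpha = 0.0167
# 5 tests: adjusted alpha = 0.0100
# 10 tests: adjusted alpha = 0.0050
```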
Why Bonferroni Correction Matters for Ecommerce
The multiple comparisons problem is most acute in multivariate testing and in situations where teams track many secondary metrics alongside the primary one. Suppose a team runs a test and checks 10 metrics: conversion rate, AOV, bounce rate, add-to-cart rate, checkout rate, and so on. At α = 0.05, the expected number of falsely "significant" metrics is 10 × 0.05 = 0.5 even if the variant has no real effect on anything. If the metrics were independent, the chance of seeing at least one false positive would be 1 − 0.95^10, or roughly 40%. Without Bonferroni correction (or an equivalent adjustment), teams can convince themselves a variant improved several things when in reality it improved nothing.
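To make the 10-metric scenario concrete, the sketch below computes the expected number of false positives and the chance of at least one. It assumes every null hypothesis is true and that the tests are independent. Real ecommerce metrics are usually correlated, so treat the family-wise figure as an illustration, not an exact value. The function names are hypothetical.

```python
def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected count of false positives when every null hypothesis is true."""
    return alpha * n_tests


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """P(at least one false positive), ASSUMING independent tests and true nulls."""
    return 1 - (1 - alpha) ** n_tests


alpha, n_metrics = 0.05, 10

print(f"expected false positives: {expected_false_positives(alpha, n_metrics):.2f}")
print(f"uncorrected FWER: {familywise_error_rate(alpha, n_metrics):.3f}")
print(f"Bonferroni FWER:  {familywise_error_rate(alpha / n_metrics, n_metrics):.3f}")
# expected false positives: 0.50
# uncorrected FWER: 0.401
# Bonferroni FWER:  0.049
```

With the corrected per-test threshold of 0.005, the family-wise rate drops back below the original 0.05.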
Real-World Example
Nykaa's analytics team ran a multivariate test with four variants on their category page, testing different combinations of filter placement and sort-order defaults. That's four comparisons (Variant A vs. Control, B vs. Control, C vs. Control, D vs. Control), so they applied Bonferroni correction: adjusted α = 0.05 / 4 = 0.0125. Two variants initially appeared significant at p = 0.03 (which would pass the uncorrected threshold). After applying the correction, neither passed. Rather than shipping a likely false positive, the team ran the two most promising variants in a focused follow-up experiment with more traffic and a single comparison — no correction needed.
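The arithmetic in this example can be checked directly. The snippet below only uses the numbers quoted above (four comparisons, α = 0.05, p = 0.03); the variable names are illustrative.

```python
ALPHA = 0.05
N_COMPARISONS = 4   # variants A-D, each compared against control
P_VALUE = 0.03      # p-value quoted for the two promising variants

adjusted_alpha = ALPHA / N_COMPARISONS

passes_uncorrected = P_VALUE < ALPHA          # 0.03 < 0.05
passes_corrected = P_VALUE < adjusted_alpha   # 0.03 < 0.0125

print(f"adjusted alpha = {adjusted_alpha}")                # adjusted alpha = 0.0125
print(f"significant uncorrected: {passes_uncorrected}")    # True
print(f"significant after Bonferroni: {passes_corrected}") # False
```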
How to Apply Bonferroni Correction
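- Count the number of comparisons you're making (each variant vs. control counts as one), not the total number of test conditions. The comparisons do not have to be statistically independent for Bonferroni to hold.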
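- Apply the correction exactly once. The usual way is to divide α by the number of comparisons and compare the raw p-values against that adjusted threshold. Multiplying each p-value by the number of comparisons (capped at 1) and comparing against the original α is equivalent, but never do both.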
- Consider Holm-Bonferroni (a step-down procedure) as a less conservative alternative that maintains better statistical power than standard Bonferroni.
- Distinguish between primary and secondary metrics — apply correction when testing many metrics, but let one pre-designated primary metric guide the go/no-go decision.
- Note that Bonferroni is conservative: it increases the risk of false negatives (Type II errors). Weigh this against your false positive risk tolerance.
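The difference between standard Bonferroni and Holm-Bonferroni is easiest to see side by side. The sketch below is a plain-Python illustration, not a library implementation, and the p-values are made up for demonstration. Holm sorts the p-values, compares the smallest against α/m, the next against α/(m−1), and so on, stopping at the first one that fails.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is below alpha / m."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down: thresholds loosen as hypotheses are rejected."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] < alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Hypothetical p-values for four variant-vs-control comparisons.
p_values = [0.010, 0.013, 0.030, 0.200]

print(bonferroni_reject(p_values))  # [True, False, False, False]
print(holm_reject(p_values))        # [True, True, False, False]
```

Here Bonferroni holds every test to 0.0125 and rejects only the first hypothesis. Holm compares the second-smallest p-value against 0.05 / 3 ≈ 0.0167 and rejects it as well, which is the extra power the step-down procedure buys while still controlling the family-wise error rate.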
Bonferroni Correction in A/B Testing
In a standard two-variant A/B test with a single primary metric, Bonferroni correction is not needed because there is only one comparison. It becomes important in multivariate tests, tests with many variants, or experiments where teams track and report on many metrics. Not every A/B testing platform applies a multiple-comparison correction automatically, so check what yours does and apply the correction manually when it does not.