#1 | Jeanbaptiste Alarcon, Warrior Member
Join Date: 2016 | Location: Paris, France | Posts: 4 | Thanks: 0 | Thanked 1 Time in 1 Post
Title: What are your criteria to stop an A/B Test?

Well met, CRO Warriors! Stopping A/B tests too early is the most common, and most damaging, mistake people make. And there is no cookie-cutter answer that works for every statistical method of A/B testing. So I've got one question and an article for you:
I mulled over this topic for quite a while. Tackling it from a purely statistical point of view wasn't a good idea because, well, it's complicated. Few people actually care whether their A/B testing tool is frequentist or Bayesian, and if you dig around a bit, you'll find the different tools never use exactly the same statistical engine anyway. I also didn't like the idea of just giving you numbers to apply blindly (or to discard because they don't feel right to you). Instead, I decided to explain the concepts that will help you stop tests safely, rather than end up with useless, most likely imaginary results. We'll cover the following: significance level, sample size, duration, and variability of the data.
Note: None of these elements is a stopping rule on its own, but having a better grasp of them will let you make better decisions.

I. Significance level
When your A/B testing tool tells you something along the lines of « your variation has a 95% chance of beating your control », it is giving you the significance level. Take it the other way around and it means: « there is a 5% chance (1 in 20) that the result you see is completely random, a fluke ». You want 95% at the very minimum. Think about what that actually means: if you stop at 80%, there is a 20% chance (1 in 5) that your result is a false positive. You're testing to make data-driven decisions, not slightly-better-than-flipping-a-coin ones. BUT reaching a 95% significance level is NOT a sufficient condition to stop an A/B test. (A minimal sketch of how such a figure can be computed follows after section III.)

II. Sample size
Unless you're testing a particular segment of visitors, make sure your sample is representative of your overall audience in both composition and proportions. Be wary of unusual traffic sources that could skew your data. Example: sending your newsletter during the test creates a spike of visitors who are more likely to react positively to any change you make, since they already trusted and appreciated you enough to subscribe. Your sample must also be large enough that it isn't vulnerable to the natural variability of the data: if you don't have enough measurements, outliers will have a strong impact on your overall results.

III. Duration
Test for full weeks at a time. We recommend 2-3 weeks, or 1-2 business cycles. Why? You already know that social networks and emails, for example, have optimal days (even hours) for sending, which means time and day of the week influence people's behaviour. The same goes for your conversion rates: break conversions down by day in Google Analytics and you'll see that Mondays convert differently than Thursdays, for example. Testing over full weeks and whole business cycles also ensures your sample includes people who just discovered you as well as people who already know you.
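Since sections I to III keep coming back to the same few numbers, here is a minimal sketch of how they fit together. It assumes the common frequentist two-proportion z-test that many (but not all) tools use under the hood, at 95% significance and 80% power; every figure in it (baseline rate, expected lift, daily traffic, conversion counts) and every function name is a made-up example for illustration, not something from this post or from any specific tool.

```python
# Sketch only: a plain two-proportion z-test plus the usual sample-size
# formula, with hypothetical numbers. Real tools use different engines
# (frequentist or Bayesian), so treat this as an illustration of the idea.
import math

def significance(conv_a, n_a, conv_b, n_b):
    """Rough 'chance that B beats A' from a one-sided two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

def required_sample_size(baseline, relative_lift, z_alpha=1.96, z_beta=0.84):
    """Visitors needed PER VARIATION to detect `relative_lift` over `baseline`
    at ~95% significance (z_alpha) and ~80% power (z_beta)."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

if __name__ == "__main__":
    # II. Sample size: 3% baseline, hoping to detect a 10% relative lift.
    n = required_sample_size(baseline=0.03, relative_lift=0.10)
    print(f"Plan for about {n} visitors per variation")

    # III. Duration: round UP to full weeks, given hypothetical daily traffic.
    daily_visitors_per_variation = 1500
    weeks = math.ceil(n / daily_visitors_per_variation / 7)
    print(f"Run the test for at least {weeks} full week(s)")

    # I. Significance of an observed result (hypothetical counts).
    conf = significance(conv_a=300, n_a=10000, conv_b=345, n_b=10000)
    print(f"Confidence that B beats A: {conf:.1%}")
```

With these made-up inputs you get roughly 53,000 visitors per variation, a duration of 6 full weeks, and a confidence of about 96%. The point of the post still stands: that 96% on its own is not a reason to stop before the planned sample size and the full weeks are reached.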
IV. Variability of data
If your significance level and/or the conversion rates of your variations are still fluctuating notably, let your test keep running. Two phenomena to consider here:

- During a test, you will most likely cross the 95% mark several times before you can actually stop. This is also why the significance level isn't enough on its own.
- At the beginning of a test, you'll see large fluctuations because outliers have a strong impact on the overall conversion rate while you don't yet have enough data to approach the « true » value.
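To see the first of these phenomena in action, here is a small A/A simulation. The setup is assumed, not data from the article: both variations share the exact same 3% true conversion rate, you peek at the result every 200 visitors per variation, and you count how often the 95% line gets crossed anyway. It reuses the two-proportion test from the earlier sketch, with a guard for the no-conversions edge case.

```python
# A/A simulation sketch: both variations are identical by construction,
# so every "95% significant" reading below is pure noise from peeking.
import math
import random

def significance(conv_a, n_a, conv_b, n_b):
    """One-sided two-proportion z-test, as in the earlier sketch."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:  # no conversions yet on either side
        return 0.5
    z = (p_b - p_a) / se
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

random.seed(42)
TRUE_RATE = 0.03   # identical for A and B: there is nothing to find
BATCH = 200        # visitors per variation between two peeks
PEEKS = 100

conv_a = conv_b = n = 0
looked_significant = 0
for _ in range(PEEKS):
    conv_a += sum(random.random() < TRUE_RATE for _ in range(BATCH))
    conv_b += sum(random.random() < TRUE_RATE for _ in range(BATCH))
    n += BATCH
    conf = significance(conv_a, n, conv_b, n)
    if conf >= 0.95 or conf <= 0.05:  # either side looks like a "winner"
        looked_significant += 1

print(f"Out of {PEEKS} peeks, {looked_significant} looked '95% significant'")
```

Run it with a few different seeds and you will regularly see the threshold crossed at some point even though there is literally nothing to find, which is why a single 95% reading, especially early on, is not a stopping rule.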
To sum up, before stopping an A/B test, check the significance level, the sample size, the duration, and the variability of your data. Only after taking all of those into account can you stop a test. Don't skip them, don't lose money ... (A rough checklist pulling these criteria together follows at the end of this post.)

>> As promised, here's the link to the original article: longer, more detailed, and with a silly GIF. <<

Alright, back to you now!

PS: Tell me whether this content was useful for you, and if not, why, and what you would have needed instead.
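Finally, here is the rough checklist mentioned in the summary above. It is purely illustrative: the function, its parameters, and the exact thresholds it encodes (95% significance, the planned sample size, full weeks and at least one business cycle, stable results) are my shorthand for sections I to IV, not any tool's real API.

```python
# Illustrative only: a pre-stop checklist that mirrors sections I-IV.
def safe_to_stop(confidence, visitors_per_variation, planned_sample_size,
                 days_running, results_stable):
    checks = {
        "I.   significance level at 95% or above": confidence >= 0.95,
        "II.  planned sample size reached": visitors_per_variation >= planned_sample_size,
        "III. full weeks, and at least two of them": days_running % 7 == 0 and days_running >= 14,
        "IV.  significance and conversion rates no longer fluctuating": results_stable,
    }
    for label, passed in checks.items():
        print(("OK   " if passed else "WAIT ") + label)
    return all(checks.values())

# Example with made-up numbers:
if safe_to_stop(confidence=0.96, visitors_per_variation=54000,
                planned_sample_size=53148, days_running=42, results_stable=True):
    print("All four criteria met: you can stop this test.")
else:
    print("Keep the test running.")
```

The design point is simply that stopping is an AND over all four criteria, never a decision based on the significance level alone.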
#2 | viptraffictraining.com
Join Date: 2012 | Location: Cyberspace | Posts: 359 | Thanks: 8 | Thanked 54 Times in 51 Posts
Great approach. Will bookmark this for further research. Thanks.
#3 | Jeanbaptiste Alarcon, Warrior Member
Join Date: 2016 | Location: Paris, France | Posts: 4 | Thanks: 0 | Thanked 1 Time in 1 Post
Thank you for the kind words, Hearn! (If you're interested in this topic, I wrote a 10,000-word ebook on A/B testing mistakes that I can PM you for further research ^^)
Tags: a or b, ab testing, conversion optimization, criteria, cro, growth hacking, stop