• Resolved cutu234

    (@cutu234)


    Sometimes, we run very specific tests that need a relatively small number of page views. Right now, we are working on optimizing the checkout. The test scope is set to the cart and checkout pages, and the goal is the purchase. For every 100 page views we get 30–40 conversions. In contrast, when testing other elements (before anything is put in the cart) we get only 3–4 conversions per 100 page views. The question now is: how many page views do we need? What would be a good rule of thumb?

    Unfortunately, information on this topic doesn't always seem to be very reliable. This online tool, for example, calculates a test duration of a whopping 30 days for a test that aims to improve the conversion rate from 30% to 35%, with 1 variation and 1,000 visits per day.

    In other words, we need almost 10,000 conversions for this pretty simple test (30 days at 1,000 visits per day is roughly 30,000 visitors, and at a ~33% conversion rate that's close to 10,000 conversions)? That seems like an insanely high number to me. So most real-life shops would never get accurate results, since it would take them years to reach 10,000 conversions? Doesn't make sense to me, to be honest. The company that created the online calculator offers A/B testing. Could there be a conflict of interest?

  • Plugin Author David Aguilera

    (@davilera)

    That’s a great question, and one that’s by no means easy to answer. So let’s ignore for a moment the math and briefly discuss the idea behind A/B testing.

    As you already know, A/B testing is all about comparing an existing page with an alternative design, copy, etc., and seeing if the new variant is “better” than the original page. To discover which variant is better, you simply track the number of page views each variant gets and the number of conversions (e.g. purchases) each variant generates, and then compute and compare their conversion rates. If you think about it, there are only three possible outcomes:

    1. Variant A is better (i.e. its conversion rate is higher)
    2. Variant B is better
    3. Both variants perform roughly the same

    The problem you face when running an A/B test is that you won't be able to tell with absolute certainty whether it's (1), (2), or (3), because you aren't tracking every visitor who will ever reach the page; you're only tracking a sample of visitors. And that's where sample size comes in.
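    To make that concrete, here's a minimal Python sketch of a standard two-proportion z-test (this is just an illustration, not the exact math our plugin or any specific calculator uses). With only 100 visitors per variant, an observed 30% vs. 35% difference is still perfectly compatible with outcome (3):

        import math
        from scipy.stats import norm

        def two_proportion_p_value(conv_a, views_a, conv_b, views_b):
            # Two-sided p-value for the difference between two observed conversion rates.
            rate_a, rate_b = conv_a / views_a, conv_b / views_b
            pooled = (conv_a + conv_b) / (views_a + views_b)
            std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
            z = (rate_b - rate_a) / std_err
            return 2 * (1 - norm.cdf(abs(z)))

        # 30 conversions from 100 views vs. 35 from 100 views looks like a 5-point lift,
        # but the p-value comes out around 0.45, far above the usual 0.05 threshold.
        print(two_proportion_p_value(30, 100, 35, 100))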

    If you expect one variant to outperform the other by an extremely large margin, the sample size you’ll need to prove that will be pretty small. After all, you’re assuming (let’s say) variant B is way, way better, so you should see the improvements pretty quickly.

    Conversely, if you expect one variant to be slightly better than the other (let’s say, 0.5% better), the required sample size will be way larger. And this also makes sense: the margin is so small that every new user that participates in the test may change the outcome of the test one way or the other — so you need a larger sample to make sure that the results you got are “stable.”
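    To put rough numbers on that intuition, here's a minimal sketch of the classic two-proportion sample-size formula. I'm assuming a 95% confidence level and 80% power here; calculators such as VWO's may use different assumptions or a different statistical approach entirely, so their figures will not match exactly:

        import math
        from scipy.stats import norm

        def visitors_per_variant(p1, p2, alpha=0.05, power=0.80):
            # Approximate visitors needed per variant to detect a change in
            # conversion rate from p1 to p2 with a two-sided test.
            z_alpha = norm.ppf(1 - alpha / 2)   # ~1.96 for 95% confidence
            z_power = norm.ppf(power)           # ~0.84 for 80% power
            p_bar = (p1 + p2) / 2
            top = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                   + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
            return math.ceil(top / (p1 - p2) ** 2)

        print(visitors_per_variant(0.30, 0.40))   # large lift:    ~360 per variant
        print(visitors_per_variant(0.30, 0.35))   # moderate lift: ~1,400 per variant
        print(visitors_per_variant(0.30, 0.305))  # tiny lift:     ~132,000 per variant

    Note how shrinking the expected lift by a factor of ten pushes the required sample up by roughly a factor of one hundred. That's the trade-off every calculator is expressing in one way or another.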

    Now, back to the numbers offered by VWO. As far as I can tell, they're accurate. If you want to be rigorous and make sure the data you collect and the conclusions you draw from your tests are actually true (rather than the result of random chance), you need larger samples when the expected improvement is small.

    However, this is also a balancing act. Smaller shops should avoid analysis paralysis at all costs. You know you need a large sample, but you can't afford one… so you settle for a smaller one. You know the results aren't 100% trustworthy (meaning any improvement you see might be the result of pure chance), but you do the best you can, collect some data, and interpret the results with a grain of salt.

    As long as you’re aware of the limitations your tests have when the sample size is not large enough, I’d say you shouldn’t worry too much. Some data is better than no data, if you keep in mind it’s not perfect data.

    Thread Starter cutu234

    (@cutu234)

    Hi David,
    thank you very much for the great explanation. I will have a closer look and compare small- and large-sample tests to get a better feeling for this.

    Well, SOME data might sometimes be worse than NO data. Imagine a test that shows a false-positive impact of the variation. That could lead to modifications that actually make things worse.

    Will keep this in mind!

    Great support, by the way. Thanks again!

    Plugin Author David Aguilera

    (@davilera)

    SOME data might sometimes be worse than NO data. Imagine a test that shows a false-positive impact of the variation. That could lead to modifications that actually make things worse.

    I absolutely agree with you — that’s why I said “some data is better than no data, if you keep in mind it’s not perfect data.”

    Here’s how I see it. If your sample is not large enough, your test might result in a false-positive. So now the question is: what do you do about the result of a test? Do you apply the variant or not? Well, that’s up to you! But what’s the alternative? Run no test at all and just implement the variant blindly? I think that’s clearly worse.
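    If you want to get a feeling for how often a too-small sample produces exactly that kind of false positive, here's a quick simulation sketch; the 5% conversion rate and the 100 visitors per variant are made-up numbers purely for illustration:

        import random

        # Run many "A/A tests": both variants are identical, converting at 5%.
        # With only 100 visitors per variant, count how often one of them appears
        # to beat the other by 4 or more conversions purely by chance.
        random.seed(1)
        runs, lucky = 10_000, 0
        for _ in range(runs):
            conv_a = sum(random.random() < 0.05 for _ in range(100))
            conv_b = sum(random.random() < 0.05 for _ in range(100))
            if abs(conv_a - conv_b) >= 4:  # e.g. 9% vs. 5% looks like a huge lift
                lucky += 1
        print(f"{lucky / runs:.0%} of identical-variant tests showed a 'clear winner'")

    On the order of a quarter of those runs report a sizeable “winner” even though both variants are identical, which is precisely the risk you're describing.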

    To me, here’s the important bit: as long as you’re 100% aware that your test has some limitations and, therefore, you might end up applying a variant that’s worse, you’re good to go. You’re doing your best to improve your site by making decisions based on the best data you were able to collect.

    Thread Starter cutu234

    (@cutu234)

    Thanks, David, for the great explanation!

    Mike

    Plugin Author David Aguilera

    (@davilera)

    Glad I could help!

  • The topic ‘How many page views are enough?’ is closed to new replies.