Statistics for A/B testing-Review

Dhivya Priya Anbazhagan
5 min read · Dec 29, 2020

Proper data leads to better business decisions. Understanding statistics will help you interpret the results correctly.

Image source: Crazyegg.com

Why do you need to know Statistics for A/B testing?

Nobody wants to spend their time and money on something that turns out to be useless in the end. Understanding how to use A/B testing efficiently and effectively to achieve the desired outcome is therefore very important. Even though most of the tools out there today do the calculations for you, understanding the basic concepts of statistics will help an optimizer better evaluate the results.

A/B testing statistics aren’t that complicated — but they are that essential to running tests correctly.

There are 3 core aspects you should know before learning statistics.

Mean: It is the average. For conversion data, the expected number of conversions is the number of visitors multiplied by the probability of success (n*p).

Variance: It measures how spread out the data is around the mean (the average squared deviation). The higher the variability, the less precise the mean will be.

Sampling: We have to select a sample that is statistically representative of the whole population. In general, the larger the sample size, the more accurate your results will be.
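
These three quantities can be illustrated for conversion data, which follows a binomial distribution. A stdlib-only sketch with hypothetical numbers:

```python
import math

# Hypothetical example: 1,000 visitors (n) with an 8% conversion rate (p).
n, p = 1000, 0.08

mean = n * p                   # expected number of conversions
variance = n * p * (1 - p)     # binomial variance
std_dev = math.sqrt(variance)  # typical spread around the mean

print(mean, round(variance, 2), round(std_dev, 2))  # 80.0 73.6 8.58
```

Note how the standard deviation shrinks relative to the mean as n grows, which is why larger samples give more precise estimates.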

Before beginning with the statistics, you need to understand the processes involved in A/B testing.

  1. Creating a hypothesis (claim) based on heuristic analysis and business requirements.
  2. Running the test to gather enough evidence for accepting or rejecting the claim.
  3. Analyzing the final data to see which of your hypotheses wins.

There are two types: Null hypothesis and Alternate hypothesis.

The null hypothesis is the status quo or the default original state. The alternate hypothesis is the one you create that challenges the status quo. It is basically the hypothesis that you hope the A/B test will prove to be true.

For example, heuristic research shows that your landing page doesn’t address the trust factor, which makes users drop off. Your null hypothesis here will be: no trust factors will lead to a conversion rate equal to 8% (the status quo). The alternate hypothesis will be: adding social proof and testimonials will lead to an increased conversion rate of more than 8%. Now the optimizer will test both hypotheses to see which one is true.
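
Under the hood, most testing tools compare the two conversion rates with something like a two-proportion z-test. A minimal stdlib-only sketch of that comparison, using made-up visitor counts for the 8% example above:

```python
import math

# Hypothetical numbers: control (no trust factors) vs. variant (with social proof)
n_a, conv_a = 5000, 400   # control: 8.0% conversion
n_b, conv_b = 5000, 460   # variant: 9.2% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null hypothesis
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se

# One-sided p-value (H1: variant rate > control rate), normal approximation
p_value = 0.5 * (1 - math.erf(z / math.sqrt(2)))
print(round(z, 2), round(p_value, 4))
```

With these invented numbers the p-value comes out below 0.05, so at a 95% significance level the null hypothesis would be rejected.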

Once you have started your hypothesis testing, there are 2 types of errors you might encounter.

Type I error: A type I error is a false positive: rejecting a null hypothesis that is actually true. Your test measures a difference between your original version and the variation that in reality doesn’t exist. The probability of a type I error, denoted by the Greek letter alpha (α), is the significance level of your A/B test. If your test has a 95% confidence level, then you have a 5% probability of a type I error. If 5% is too high, you can decrease the probability of a false positive by increasing the confidence level.

Type II error: A type II error is a false negative: failing to reject a null hypothesis that is actually false. Denoted by beta (β), the type II error has an inverse relationship with statistical power (power = 1 − β), which is used to lower the possibility of false negatives: the higher the power level, the lower the probability of a type II error. For a fixed sample size, alpha and beta also have an inverse relationship with each other, so lowering one will increase the other.
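
One practical way to strike that balance is to solve for the sample size that achieves a chosen alpha and power before the test starts. A rough stdlib-only sketch using the standard normal-approximation formula for comparing two proportions (all numbers hypothetical):

```python
import math

def z_quantile(q):
    """Inverse of the standard normal CDF, via bisection on math.erf."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical test: baseline 8%, hoping to detect a lift to 9.2%
p1, p2 = 0.08, 0.092
alpha, power = 0.05, 0.80        # Type I risk and desired power (1 - beta)

z_a = z_quantile(1 - alpha)      # critical value for a one-sided test
z_b = z_quantile(power)
pbar = (p1 + p2) / 2
n = ((z_a * math.sqrt(2 * pbar * (1 - pbar))
      + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) / (p2 - p1)) ** 2
print(math.ceil(n), "visitors needed per variation")
```

Raising the power (or lowering alpha) pushes the required sample size up, which is exactly the trade-off the quote below describes.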

As Paul D. Ellis says, “A well thought out research design is one that assesses the relative risk of making each type of error, then strikes an appropriate balance between them.”

Now that we have a fair understanding of the hypothesis and errors, let’s start with the obvious question: What is Statistical significance?

Statistical significance is a way of mathematically establishing that a certain statistic is reliable. It measures how likely it would be to observe what we observed, assuming the null hypothesis is true. A result of an experiment is said to be statistically significant if it is unlikely to be caused by chance at a given significance level. For example, if you run an A/B testing experiment with a significance level of 95%, this means that if you determine a winner, you can be 95% confident that the observed results are real and not an error caused by randomness. It also means that there is a 5% chance that you could be wrong.
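
That 5% figure can be checked empirically with an A/A simulation: when both variations share the same true conversion rate, a test run at a 95% significance level should declare a false winner roughly 5% of the time. A stdlib-only sketch with made-up numbers:

```python
import math
import random

random.seed(0)

def one_sided_p(n_a, conv_a, n_b, conv_b):
    """Normal-approximation p-value for H1: rate of B > rate of A."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.5
    z = (conv_b / n_b - conv_a / n_a) / se
    return 0.5 * (1 - math.erf(z / math.sqrt(2)))

# 1,000 simulated A/A tests: both arms share the same true 8% rate,
# so every declared "winner" is a false positive.
n, p_true, alpha = 2000, 0.08, 0.05
tests, false_positives = 1000, 0
for _ in range(tests):
    conv_a = sum(random.random() < p_true for _ in range(n))
    conv_b = sum(random.random() < p_true for _ in range(n))
    if one_sided_p(n, conv_a, n, conv_b) < alpha:
        false_positives += 1

rate = false_positives / tests
print(f"False-positive rate: {rate:.1%}")  # hovers near alpha = 5%
```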

When you run a statistical hypothesis testing, there are two values that you should be paying attention to:

P-value: It is the probability of observing a result at least as extreme as the one measured, assuming the null hypothesis is true — a measure of evidence against the null hypothesis. It does not tell us the probability that B is better than A.

Confidence interval: It is the range within which the true value of your metric is likely to fall. When your testing tool reports the conversion rate as X% +/- Y% with a confidence level of 95%, you need to treat the +/- Y% as the margin of error. The lower the margin of error, the better, and it can be lowered by having a larger sample size.
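
As a rough illustration (hypothetical numbers), the 95% margin of error for an observed conversion rate can be computed with the normal approximation:

```python
import math

# Hypothetical result: 450 conversions out of 5,000 visitors
n, conversions = 5000, 450
p_hat = conversions / n

z = 1.96  # two-sided 95% confidence level
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)

print(f"{p_hat:.1%} +/- {margin:.2%}")  # 9.0% +/- 0.79%
```

Because the margin shrinks with the square root of n, quadrupling the sample size only halves the margin of error.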

There are two different approaches to analyzing statistical data and making decisions based on it.

Frequentist approach: It tells you how often, across all possible samples, you would expect to see a result like yours if the null hypothesis were true (the chance of the challenger appearing to beat the control by accident). The hypothesis itself is tested without being assigned a probability. This approach is usually beneficial for more mature companies.

Bayesian approach: Here a probability is assigned to a hypothesis. The result is a probability that B outperforms A. It’s a different way of presenting the data so that it makes more sense to the managers.
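
A common way to get that "probability that B outperforms A" is Monte Carlo sampling from Beta posteriors. A stdlib-only sketch with hypothetical data:

```python
import random

random.seed(42)

# Hypothetical data: control A vs. variant B
n_a, conv_a = 5000, 400
n_b, conv_b = 5000, 460

def posterior_sample(conv, n):
    # Beta(1, 1) uniform prior; posterior is Beta(1 + successes, 1 + failures)
    return random.betavariate(1 + conv, 1 + n - conv)

draws = 20000
b_wins = sum(posterior_sample(conv_b, n_b) > posterior_sample(conv_a, n_a)
             for _ in range(draws))
prob_b_beats_a = b_wins / draws
print(f"P(B beats A) ~ {prob_b_beats_a:.1%}")
```

The output reads directly as "B has an X% chance of being better than A", which is the framing managers tend to find more intuitive than a p-value.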

That was a short summary of my learnings for this week from the CXL Institute’s Conversion Optimization Mini-degree on statistics for A/B testing.

This post reviewed why we need to know statistics, basic aspects of Statistics, types of hypothesis, types of errors, and different approaches to analyzing statistical data. I will be covering other topics in statistics relevant to A/B testing in the next few posts.

Thank you for reading, and do share your feedback in the comments!
