In this explainer, we will learn how to approximate a binomial distribution with a normal distribution.
Recall that if a discrete random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, written $X \sim B(n, p)$, provided the experiment satisfies all of the following conditions:
- The number of trials, $n$, is fixed.
- Each trial has two possible outcomes (success or failure).
- The probability of success, $p$, is fixed.
- The trials are independent, so the outcome of one trial does not affect the outcome of any other.
The probability mass function of the binomial distribution is given by $$P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r}.$$ When the number of trials is large, the number $\binom{n}{r}$ is very big, and this calculation can become unwieldy. For this reason, it is often useful to approximate a binomial distribution with a normal distribution. In order for a normal approximation to be appropriate for a binomially distributed random variable $X \sim B(n, p)$, two conditions must be met:
- The probability must be close to 0.5. For us, this means .
- The number of trials must be sufficiently large. For us, this means .
The normal distribution takes two parameters, the mean $\mu$ and the standard deviation $\sigma$ (or variance $\sigma^2$). When approximating a binomial distribution with a normal distribution, these parameters are calculated as follows.
Formula: Approximating a Binomial Distribution with a Normal Distribution
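Suppose $X \sim B(n, p)$ is a binomially distributed discrete random variable whose parameters $n$ and $p$ satisfy the two conditions above. Then, $X$ may be approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$, where $$\mu = np \quad \text{and} \quad \sigma^2 = np(1 - p).$$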
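As a minimal sketch of this formula, the parameters can be computed directly from $n$ and $p$. The function name `normal_approx_params` is a hypothetical helper introduced here, and the inputs in the usage lines are illustrative values only.

```python
from math import sqrt


def normal_approx_params(n: int, p: float) -> tuple[float, float]:
    """Return (mu, sigma) of the normal approximation to B(n, p).

    Hypothetical helper: it only evaluates mu = np and
    sigma = sqrt(np(1 - p)); it does not check whether the
    approximation is appropriate.
    """
    if n <= 0 or not 0.0 < p < 1.0:
        raise ValueError("need n > 0 and 0 < p < 1")
    mu = n * p                     # mean: np
    sigma = sqrt(n * p * (1 - p))  # standard deviation: sqrt(np(1 - p))
    return mu, sigma


# Illustrative values only: 84 trials with success probability 0.5.
mu, sigma = normal_approx_params(84, 0.5)
print(mu, sigma ** 2)  # the mean and (up to rounding) the variance
```

Returning the standard deviation rather than the variance is a design choice made here because most software parameterizes the normal distribution by $\sigma$.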
An important point to remember when approximating a binomial distribution with a normal distribution is that the binomial distribution is a discrete probability distribution, while the normal distribution is continuous. For this reason, it is necessary to apply a continuity correction when calculating probabilities.
For example, consider a discrete random variable $X \sim B(n, p)$. Suppose we approximate this variable with the normal distribution $Y \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$. If we use this approximation to calculate the probability $P(X = 200)$ directly as $P(Y = 200)$, then we will get an answer of 0 because the normal distribution (being continuous) has $P(Y = y) = 0$ for any $y$. The continuity correction we need to make is to instead calculate the probability on a unit interval centered on 200; that is, $P(199.5 < Y < 200.5)$. Similarly, if we want to calculate the probability that $X$ is strictly less than some value, say 230, then we should calculate $P(Y < 229.5)$ in order to exclude 230. However, if we want the probability that $X$ is less than or equal to 230, we should calculate $P(Y < 230.5)$.
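The three continuity corrections just described can be sketched with Python's standard library. The values `n = 400` and `p = 0.5` are assumptions chosen only so that the mean is 200; they are not taken from the text, and the helper names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical parameters for illustration (not from the text).
n, p = 400, 0.5
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))


def prob_equal(dist: NormalDist, x: int) -> float:
    """Approximate P(X = x) by P(x - 0.5 < Y < x + 0.5)."""
    return dist.cdf(x + 0.5) - dist.cdf(x - 0.5)


def prob_less_than(dist: NormalDist, x: int) -> float:
    """Approximate P(X < x) by P(Y < x - 0.5), which excludes x."""
    return dist.cdf(x - 0.5)


def prob_at_most(dist: NormalDist, x: int) -> float:
    """Approximate P(X <= x) by P(Y < x + 0.5), which includes x."""
    return dist.cdf(x + 0.5)


p_eq_200 = prob_equal(Y, 200)
p_lt_230 = prob_less_than(Y, 230)
p_le_230 = prob_at_most(Y, 230)
print(p_eq_200, p_lt_230, p_le_230)
```

Note that the strict and non-strict cases differ only in which side of 230 the half-unit shift lands on, so the second probability is always slightly smaller than the third.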
Let us have a look at an example of some variables that can be approximated by a normal distribution and some that cannot.
Example 1: Recognizing a Valid Normal Approximation
Which of the following four binomially distributed random variables may be approximated by a normal distribution?
Answer
The criteria for whether a normal approximation is appropriate for a binomial random variable are as follows:
- The probability is close to 0.5, meaning .
- The number of trials is sufficiently large, meaning .
By these criteria, we can see that variables a and d may be approximated by a normal distribution. Variables b and c cannot be approximated by a normal distribution since variable b fails to satisfy criterion 2 and variable c fails to satisfy criterion 1.
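A check of the two criteria might be sketched as follows. The cut-offs are passed in by the caller, because the function itself does not assume any particular thresholds; the values `min_trials=50` and `max_p_distance=0.1` in the usage lines, and the candidate variables `U`, `V`, and `W`, are made-up placeholders rather than the values used in this example.

```python
def normal_approx_is_appropriate(
    n: int, p: float, min_trials: int, max_p_distance: float
) -> bool:
    """Check both criteria using caller-supplied (hypothetical) cut-offs."""
    p_close_to_half = abs(p - 0.5) <= max_p_distance  # criterion 1
    n_large_enough = n >= min_trials                  # criterion 2
    return p_close_to_half and n_large_enough


# Made-up candidates (n, p) and made-up cut-offs, purely to show the call.
candidates = {"U": (120, 0.48), "V": (12, 0.5), "W": (150, 0.05)}
results = {
    name: normal_approx_is_appropriate(n, p, min_trials=50, max_p_distance=0.1)
    for name, (n, p) in candidates.items()
}
print(results)
```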
Let us now calculate the mean and variance of a normal approximation.
Example 2: Calculating the Mean and Variance of a Normal Approximation
is a binomially distributed random variable, and . Write down a normal approximation of , stating the values of and .
Answer
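Recall that a binomial random variable $X \sim B(n, p)$ with sufficiently large $n$ and $p$ close to 0.5 may be approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$.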
Therefore, our binomial random variable may be approximated by a normal distribution with and
Let us now use a normal approximation to estimate a probability.
Example 3: Using a Normal Approximation to Estimate Probabilities
A discrete random variable is binomially distributed, and . Using a normal approximation, estimate .
Answer
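Recall that a binomial random variable $X \sim B(n, p)$ with sufficiently large $n$ and $p$ close to 0.5 may be approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$.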
Since the normal distribution is a continuous probability distribution while the binomial distribution is discrete, when calculating a probability using a normal approximation, we need to make continuity corrections.
We approximate with a normal distribution where is given by and Our approximate model is therefore .
Notice that the question asks us to estimate the probability that $X$ is less than or equal to 96. We therefore need to make a continuity correction to include the value 96 and calculate the probability $P(Y < 96.5)$ in our continuous normal approximation. We get a probability of 0.6384.
Remember that if we want to use a normal approximation to estimate the probability that a discrete random variable $X$ takes a particular value $x$, then we have to make the continuity correction $P(x - 0.5 < Y < x + 0.5)$. Here is an example to test this skill.
Example 4: Using a Normal Approximation to Estimate Probabilities
A discrete random variable is binomially distributed, and . Using a normal approximation, estimate .
Answer
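Recall that a binomial random variable $X \sim B(n, p)$ with sufficiently large $n$ and $p$ close to 0.5 may be approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$.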
Since the normal distribution is a continuous probability distribution while the binomial distribution is discrete, when calculating a probability using a normal approximation, we need to make continuity corrections.
We approximate with a normal distribution , where and Our approximate model is therefore .
The question asks us to estimate the probability that $X$ is equal to 106. We therefore need to make a continuity correction and calculate the probability $P(105.5 < Y < 106.5)$ over the unit interval centered on 106. We get a probability of 0.0289.
Let us now make a normal approximation in a real-life context.
Example 5: Solving a Binomial Distribution Approximation Problem In a Real-Life Context
Karim flips a fair coin 84 times. By using a suitable normal approximation, calculate the estimated probability that the coin lands on heads more than 50 times.
Answer
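Recall that a binomial random variable $X \sim B(n, p)$ with sufficiently large $n$ and $p$ close to 0.5 may be approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$.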
Since the normal distribution is a continuous probability distribution while the binomial distribution is discrete, when calculating a probability using a normal approximation, we need to make continuity corrections.
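Let $X$ represent the number of times the coin lands on heads. This is a discrete random variable that follows the binomial distribution $X \sim B(n, p)$ with $n = 84$ and $p = 0.5$. Since the number of trials, $n = 84$, is sufficiently large and the probability of getting heads is exactly $p = 0.5$, we may approximate $X$ with a normal distribution $Y \sim N(\mu, \sigma^2)$, where $$\mu = np = 84 \times 0.5 = 42$$ and $$\sigma^2 = np(1 - p) = 84 \times 0.5 \times 0.5 = 21.$$ Our approximate model is therefore $Y \sim N(42, 21)$. We need to estimate the probability $P(X > 50)$. Since the inequality is strict, we make the continuity correction $P(Y > 50.5)$ to exclude the case $X = 50$, for an estimate of 0.0318.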
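The coin-flip estimate can be reproduced with a short sketch using Python's `statistics.NormalDist`. The variable names are introduced here for illustration; the numbers come from the example (84 flips of a fair coin, more than 50 heads).

```python
from math import sqrt
from statistics import NormalDist

n, p = 84, 0.5                 # 84 flips of a fair coin
mu = n * p                     # mean of the approximation
sigma = sqrt(n * p * (1 - p))  # standard deviation of the approximation
Y = NormalDist(mu=mu, sigma=sigma)

# P(X > 50) with the continuity correction: P(Y > 50.5) = 1 - P(Y < 50.5).
estimate = 1 - Y.cdf(50.5)
print(round(estimate, 4))
```

The rounded result should agree with the estimate of 0.0318 quoted in the answer.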
It can sometimes be instructive to compare the probabilities obtained from a binomial distribution and a normal approximation of it, as shown in the next example.
Example 6: Comparing a Binomial Distribution and a Normal Approximation
A farm produces eggs. The farmer claims that of the eggs weigh more than 64 grams.
- A random sample of 20 eggs is taken. Find the exact probability that at least 15 of the eggs weigh more than 64 grams.
- A random sample of 1 000 eggs is taken. Use a normal approximation to estimate the probability that more than 550 and less than 600 eggs weigh more than 64 grams.
Answer
Part 1
Recall that if a discrete random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, written $X \sim B(n, p)$, provided the experiment satisfies all of the following conditions:
- The number of trials, $n$, is fixed.
- Each trial has two possible outcomes (success or failure).
- The probability of success, $p$, is fixed.
- The trials are independent, so the outcome of one trial does not affect the outcome of any other.
Let $X$ be the number of eggs in the sample that weigh more than 64 grams. A sample of 20 eggs may be modeled by the binomial distribution $X \sim B(20, p)$. Here, $n = 20$ and we take our fixed probability $p$ to be the proportion stated in the farmer’s claim.
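Recall that the probability mass function of the binomial distribution is given by $$P(X = r) = \binom{n}{r} p^r (1 - p)^{n - r}.$$ Therefore, we can calculate $P(X \geq 15)$ as $$P(X \geq 15) = \sum_{r = 15}^{20} \binom{20}{r} p^r (1 - p)^{20 - r}.$$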
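An exact tail probability of this kind is a finite sum of probability mass function values, which can be sketched as below. The helper names are hypothetical, and `claimed_p = 0.5` is a placeholder only: substitute the proportion from the farmer's claim to reproduce Part 1.

```python
from math import comb


def binomial_pmf(n: int, p: float, r: int) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


def prob_at_least(n: int, p: float, k: int) -> float:
    """Exact P(X >= k), summing the pmf from r = k up to r = n."""
    return sum(binomial_pmf(n, p, r) for r in range(k, n + 1))


claimed_p = 0.5  # placeholder value, NOT the proportion from the example
tail = prob_at_least(20, claimed_p, 15)
print(tail)
```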
Part 2
As we can see, this calculation is already rather cumbersome! We certainly would not want to have to do this with a sample size of 1 000. For this reason, we will approximate $X$ with a normal distribution $Y \sim N(\mu, \sigma^2)$.
Recall that a binomial random variable $X \sim B(n, p)$ with sufficiently large $n$ and $p$ close to 0.5 may be approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$.
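So, in our approximation, with $n = 1\,000$ and $p$ the proportion stated in the farmer’s claim, we take $$\mu = 1\,000p$$ and $$\sigma^2 = 1\,000p(1 - p).$$ Our approximate model is therefore $Y \sim N(1\,000p, 1\,000p(1 - p))$.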
We can now use our calculators to calculate the probability that $X$ lies strictly between 550 and 600 in this model.
In order to exclude the cases of 550 and 600, we make a continuity correction and calculate $P(550.5 < Y < 599.5)$.
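The two-sided correction for a strict inequality can be sketched as a small function. The name `prob_strictly_between` is hypothetical, and `claimed_p = 0.6` is a placeholder chosen for illustration; it is not the proportion from the farmer's claim.

```python
from math import sqrt
from statistics import NormalDist


def prob_strictly_between(n: int, p: float, low: int, high: int) -> float:
    """Approximate P(low < X < high) for X ~ B(n, p).

    Both endpoints are excluded, so the corrected interval is
    (low + 0.5, high - 0.5).
    """
    Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return Y.cdf(high - 0.5) - Y.cdf(low + 0.5)


claimed_p = 0.6  # placeholder value, NOT taken from the example
estimate_between = prob_strictly_between(1000, claimed_p, 550, 600)
print(estimate_between)
```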
We can evaluate the accuracy of a normal approximation by comparing it to the binomial distribution it is approximating. Suppose we have a binomial random variable $X \sim B(n, p)$ that is approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$. We can evaluate a particular probability in the binomial model, say $P(X = x)$, and in the normal approximation, $P(x - 0.5 < Y < x + 0.5)$. The magnitude of the difference between these two numbers, $|P(X = x) - P(x - 0.5 < Y < x + 0.5)|$, is the absolute error of the approximation. A more useful statistic is the percentage error, which is calculated as $$\text{percentage error} = \frac{|P(X = x) - P(x - 0.5 < Y < x + 0.5)|}{P(X = x)} \times 100\%.$$ In our final example, we apply this idea.
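The percentage error calculation can be sketched as follows, taking the exact binomial probability as the reference value in the denominator. The function name is hypothetical, and the parameters `n = 100`, `p = 0.6`, `x = 60` are assumptions chosen for illustration rather than values taken from the text.

```python
from math import comb, sqrt
from statistics import NormalDist


def percentage_error(exact: float, approx: float) -> float:
    """Absolute error as a percentage of the exact value."""
    return abs(exact - approx) / exact * 100


# Hypothetical parameters for illustration only.
n, p, x = 100, 0.6, 60

# Exact binomial probability P(X = x).
exact = comb(n, x) * p ** x * (1 - p) ** (n - x)

# Normal estimate with continuity correction: P(x - 0.5 < Y < x + 0.5).
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = Y.cdf(x + 0.5) - Y.cdf(x - 0.5)

error = percentage_error(exact, approx)
print(exact, approx, error)
```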
Example 7: Calculating the Percentage Error of a Normal Approximation
A population of frogs contains female and male frogs in the ratio . A random sample of 100 frogs is taken. Find the percentage error when using a normal approximation to calculate the probability that exactly 60 of the frogs are female.
Answer
Recall that if a discrete random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, written $X \sim B(n, p)$, provided the experiment satisfies all of the following conditions:
- The number of trials, $n$, is fixed.
- Each trial has two possible outcomes (success or failure).
- The probability of success, $p$, is fixed.
- The trials are independent, so the outcome of one trial does not affect the outcome of any other.
Let be the number of female frogs in the sample. This is a binomial random variable with and .
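Recall that a binomial random variable $X \sim B(n, p)$ with sufficiently large $n$ and $p$ close to 0.5 may be approximated by a normal distribution $Y \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$.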
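We will approximate $X$ with a normal distribution $Y \sim N(\mu, \sigma^2)$, where $$\mu = 100p$$ and $$\sigma^2 = 100p(1 - p),$$ with $p$ denoting the probability that a sampled frog is female. Our approximate model is therefore $Y \sim N(100p, 100p(1 - p))$. In order to calculate the percentage error of this approximation, we will first calculate the probability of exactly 60 female frogs in the binomial model, $P(X = 60)$, and then calculate the same probability with a continuity correction in the normal model, $P(59.5 < Y < 60.5)$.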
The probability mass function of the binomial distribution is given by and so we can calculate as On the other hand, our calculators give us a value of for the normal approximation. We can now calculate the percentage error of the approximation according to the formula This shows that with a percentage error of less than , the normal distribution is a reasonably good approximation in this case.
Let us finish by recapping a few important concepts from this explainer.
Key Points
- It is appropriate to approximate a binomial random variable $X \sim B(n, p)$ with a normal distribution $Y \sim N(\mu, \sigma^2)$ when $n$ is sufficiently large and $p$ is close to 0.5.
- The parameters of a normal approximation are given by $\mu = np$ and $\sigma^2 = np(1 - p)$.
- When using a normal approximation to estimate probabilities, we need to make continuity corrections as follows:
  - When estimating a probability of the form $P(X = x)$, we should calculate the probability over the unit interval centered on $x$; that is, $P(x - 0.5 < Y < x + 0.5)$.
  - When estimating a probability of the form $P(X < x)$, we should evaluate the probability $P(Y < x - 0.5)$ to exclude $x$.
  - When estimating a probability of the form $P(X \leq x)$, we should evaluate the probability $P(Y < x + 0.5)$ to include $x$.
- We can evaluate the accuracy of a normal approximation by calculating its percentage error according to the formula $$\text{percentage error} = \frac{|\text{exact binomial probability} - \text{normal estimate}|}{\text{exact binomial probability}} \times 100\%.$$