Lesson Explainer: Approximating a Binomial Distribution Mathematics

In this explainer, we will learn how to approximate a binomial distribution with a normal distribution.

Recall that if a discrete random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, written $X \sim B(n, p)$, provided the experiment satisfies all the following conditions:

  1. The number of trials, $n$, is fixed.
  2. Each trial has two possible outcomes (success or failure).
  3. The probability of success, $p$, is fixed.
  4. The trials are independent, so the outcome of one trial does not affect the outcome of any other.

The probability mass function of the binomial distribution is given by $$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r} = \frac{n!}{r!\,(n-r)!}\, p^r (1-p)^{n-r}.$$ When the number of trials $n$ is large, the number $n!$ is very big, and this calculation can become unwieldy. For this reason, it is often useful to approximate a binomial distribution with a normal distribution. In order for a normal approximation to be appropriate for a binomially distributed random variable $X \sim B(n, p)$, two conditions must be met:

  1. The probability $p$ must be close to 0.5. For us, this means $0.4 \le p \le 0.6$.
  2. The number of trials $n$ must be sufficiently large. For us, this means $n \ge 50$.
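
To get a feel for why the exact calculation becomes unwieldy, here is a minimal Python sketch that evaluates the binomial probability mass function directly. The function name `binomial_pmf` is our own choice for illustration; `math.comb` is the standard-library binomial coefficient.

```python
import math

def binomial_pmf(r, n, p):
    """Exact P(X = r) for X ~ B(n, p)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# A small case is easy to evaluate exactly.
print(binomial_pmf(3, 10, 0.5))  # 120 / 1024 = 0.1171875

# For a large number of trials, the binomial coefficient alone is enormous.
coefficient = math.comb(1000, 500)
print(len(str(coefficient)))  # number of digits in the coefficient
```

Python handles the large integers without complaint, but the size of these numbers is what makes the calculation impractical by hand or on a basic calculator.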

The normal distribution $N(\mu, \sigma^2)$ takes two parameters, the mean $\mu$ and the standard deviation $\sigma$ (or variance $\sigma^2$). When approximating a binomial distribution with a normal distribution, these parameters are calculated as follows.

Formula: Approximating a Binomial Distribution with a Normal Distribution

Suppose $X \sim B(n, p)$ is a binomially distributed discrete random variable with $n \ge 50$ and $0.4 \le p \le 0.6$. Then, $X$ may be approximated by a normal distribution $N(\mu, \sigma^2)$, where $$\mu = np, \qquad \sigma^2 = np(1-p).$$
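
As a rough sketch, the formula can be wrapped in a small Python helper. The function name `normal_approximation_parameters` is hypothetical, and the check simply encodes the two conditions used in this explainer.

```python
import math

def normal_approximation_parameters(n, p):
    """Return (mu, variance) of the normal approximation to B(n, p)."""
    # Conditions adopted in this explainer: n >= 50 and 0.4 <= p <= 0.6.
    if not (n >= 50 and 0.4 <= p <= 0.6):
        raise ValueError("conditions n >= 50 and 0.4 <= p <= 0.6 are not met")
    mu = n * p
    variance = n * p * (1 - p)
    return mu, variance

mu, variance = normal_approximation_parameters(400, 0.55)
sigma = math.sqrt(variance)  # standard deviation is the square root of the variance
print(mu, variance, sigma)
```

Note that the second parameter of $N(\mu, \sigma^2)$ is the variance, so the standard deviation has to be obtained by taking a square root before it is used in a calculator or library.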

An important point to remember when approximating a binomial distribution with a normal distribution is that the binomial distribution is a discrete probability distribution, while the normal distribution is continuous. For this reason, it is necessary to apply a continuity correction when calculating probabilities.

For example, consider a discrete random variable $X \sim B(400, 0.55)$. Suppose we approximate this variable with the normal distribution $N(\mu, \sigma^2)$, where $\mu = np = 220$ and $\sigma^2 = np(1-p) = 99$. If we use this approximation to calculate the probability $P(X = 200)$, then we will get an answer of 0 because the normal distribution (being continuous) has $P(X = a) = 0$ for any $a$. The continuity correction we need to make is to instead calculate the probability on a unit interval centered on 200; that is, $P(199.5 < X < 200.5) = 0.0053$. Similarly, if we want to calculate the probability that $X$ is strictly less than some value, say 230, then we should calculate $P(X < 229.5) = 0.8302$ in order to exclude 230. However, if we want the probability that $X$ is less than or equal to 230, we should calculate $P(X < 230.5) = 0.8544$.
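
The three corrected probabilities above can be reproduced with Python's standard library, as in the following sketch. It assumes `statistics.NormalDist` (Python 3.8 or later), which takes the standard deviation rather than the variance; the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

n, p = 400, 0.55
# NormalDist expects the standard deviation, so take the square root of np(1 - p).
model = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# P(X = 200) becomes P(199.5 < X < 200.5).
p_equal_200 = model.cdf(200.5) - model.cdf(199.5)
# P(X < 230) becomes P(X < 229.5), which excludes 230.
p_less_than_230 = model.cdf(229.5)
# P(X <= 230) becomes P(X < 230.5), which includes 230.
p_at_most_230 = model.cdf(230.5)

print(round(p_equal_200, 4), round(p_less_than_230, 4), round(p_at_most_230, 4))
```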

Let us have a look at an example of some variables that can be approximated by a normal distribution and some that cannot.

Example 1: Recognizing a Valid Normal Approximation

Which of the following four binomially distributed random variables may be approximated by a normal distribution?

  1. $X \sim B(98, 0.42)$
  2. $X \sim B(30, 0.55)$
  3. $X \sim B(700, 0.75)$
  4. $X \sim B(200, 0.59)$

Answer

The criteria for whether a normal approximation is appropriate for a binomial random variable $X \sim B(n, p)$ are as follows:

  1. The probability $p$ is close to 0.5, meaning $0.4 \le p \le 0.6$.
  2. The number of trials $n$ is sufficiently large, meaning $n \ge 50$.

By these criteria, we can see that variables a and d may be approximated by a normal distribution. Variables b and c cannot be approximated by a normal distribution since variable b fails to satisfy criterion 2 and variable c fails to satisfy criterion 1.
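
The same check can be sketched in a few lines of Python. The function name and the labels a to d are our own, chosen to match the wording of the answer.

```python
def normal_approximation_is_appropriate(n, p):
    """Apply the two criteria used in this explainer."""
    return n >= 50 and 0.4 <= p <= 0.6

# The four variables from the question, labelled a to d.
candidates = {
    "a": (98, 0.42),
    "b": (30, 0.55),
    "c": (700, 0.75),
    "d": (200, 0.59),
}

suitable = [
    label
    for label, (n, p) in candidates.items()
    if normal_approximation_is_appropriate(n, p)
]
print(suitable)  # ['a', 'd']
```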

Let us now calculate the mean and variance of a normal approximation.

Example 2: Calculating the Mean and Variance of a Normal Approximation

$X$ is a binomially distributed random variable, and $X \sim B(250, 0.47)$. Write down a normal approximation of $X \sim N(\mu, \sigma^2)$, stating the values of $\mu$ and $\sigma^2$.

Answer

Recall that a binomial random variable $X \sim B(n, p)$ with $n \ge 50$ and $0.4 \le p \le 0.6$ may be approximated by a normal distribution $N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1-p)$.

Therefore, our binomial random variable $X \sim B(250, 0.47)$ may be approximated by a normal distribution with $$\mu = np = 250 \times 0.47 = 117.5$$ and $$\sigma^2 = np(1-p) = 117.5 \times 0.53 = 62.275.$$

Let us now use a normal approximation to estimate a probability.

Example 3: Using a Normal Approximation to Estimate Probabilities

A discrete random variable $X$ is binomially distributed, and $X \sim B(200, 0.47)$. Using a normal approximation, estimate $P(X \le 96)$.

Answer

Recall that a binomial random variable $X \sim B(n, p)$ with $n \ge 50$ and $0.4 \le p \le 0.6$ may be approximated by a normal distribution $N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1-p)$.

Since the normal distribution is a continuous probability distribution while the binomial distribution is discrete, when calculating a probability using a normal approximation, we need to make continuity corrections.

We approximate $X \sim B(200, 0.47)$ with a normal distribution $N(\mu, \sigma^2)$, where $\mu = 200 \times 0.47 = 94$ and $\sigma^2 = 94 \times 0.53 = 49.82$. Our approximate model is therefore $N(94, 49.82)$.

Notice that the question asks us to estimate the probability that $X$ is less than or equal to 96. We therefore need to make a continuity correction to include the value 96 and calculate the probability $P(X < 96.5)$ in our continuous normal approximation. We get a probability of 0.6384.
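
For readers who want to check this with software instead of a calculator, the sketch below computes the corrected normal estimate and, for comparison, the exact binomial probability obtained by summing the probability mass function. The helper names are hypothetical and `statistics.NormalDist` is assumed to be available (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def approx_at_most(a, n, p):
    """Normal estimate of P(X <= a) with continuity correction P(X < a + 0.5)."""
    model = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return model.cdf(a + 0.5)

def exact_at_most(a, n, p):
    """Exact P(X <= a), summing the binomial probability mass function."""
    return sum(math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(a + 1))

estimate = approx_at_most(96, 200, 0.47)
exact = exact_at_most(96, 200, 0.47)
print(round(estimate, 4))  # normal approximation, 0.6384
print(exact)               # exact binomial value, for comparison
```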

Remember that if we want to use a normal approximation to estimate the probability that a discrete random variable takes a particular value $P(X = a)$, then we have to make the continuity correction $P(a - 0.5 < X < a + 0.5)$. Here is an example to test this skill.

Example 4: Using a Normal Approximation to Estimate Probabilities

A discrete random variable $X$ is binomially distributed, and $X \sim B(175, 0.56)$. Using a normal approximation, estimate $P(X = 106)$.

Answer

Recall that a binomial random variable $X \sim B(n, p)$ with $n \ge 50$ and $0.4 \le p \le 0.6$ may be approximated by a normal distribution $N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1-p)$.

Since the normal distribution is a continuous probability distribution while the binomial distribution is discrete, when calculating a probability using a normal approximation, we need to make continuity corrections.

We approximate $X \sim B(175, 0.56)$ with a normal distribution $N(\mu, \sigma^2)$, where $\mu = 175 \times 0.56 = 98$ and $\sigma^2 = 98 \times 0.44 = 43.12$. Our approximate model is therefore $N(98, 43.12)$.

The question asks us to estimate the probability that $X$ is equal to 106. We therefore need to make a continuity correction and calculate the probability $P(105.5 < X < 106.5)$ over the unit interval centered on 106. We get a probability of 0.0289.
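
If a calculator or table only offers the standard normal distribution, the same estimate can be reached by standardizing the two endpoints first. The following Python sketch shows that route; it is one possible way to carry out the calculation, not necessarily how a given calculator does it, and the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

n, p = 175, 0.56
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Standardize the endpoints of the unit interval centered on 106.
z_lower = (105.5 - mu) / sigma
z_upper = (106.5 - mu) / sigma

# NormalDist() with no arguments is the standard normal N(0, 1).
standard_normal = NormalDist()
estimate = standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)
print(round(z_lower, 4), round(z_upper, 4), round(estimate, 4))
```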

Let us now make a normal approximation in a real-life context.

Example 5: Solving a Binomial Distribution Approximation Problem In a Real-Life Context

Karim flips a fair coin 84 times. By using a suitable normal approximation, calculate the estimated probability that the coin lands on heads more than 50 times.

Answer

Recall that a binomial random variable $X \sim B(n, p)$ with $n \ge 50$ and $0.4 \le p \le 0.6$ may be approximated by a normal distribution $N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1-p)$.

Since the normal distribution is a continuous probability distribution while the binomial distribution is discrete, when calculating a probability using a normal approximation, we need to make continuity corrections.

Let $X$ represent the number of times the coin lands on heads. This is a discrete random variable that follows the binomial distribution $B(n, p)$ with $n = 84$ and $p = 0.5$. Since the number of trials is $n = 84 > 50$ and the probability of getting heads is $p = 0.5$, we may approximate $X$ with a normal distribution $N(\mu, \sigma^2)$, where $\mu = 84 \times 0.5 = 42$ and $\sigma^2 = 42 \times 0.5 = 21$. Our approximate model is therefore $N(42, 21)$. We need to estimate the probability $P(X > 50)$. Since the inequality is strict, we make the continuity correction $P(X > 50.5)$ to exclude the case $X = 50$ for an estimate of 0.0318.
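
An upper-tail probability such as this one can be computed as one minus the cumulative probability. A minimal Python sketch, assuming `statistics.NormalDist` from the standard library:

```python
from math import sqrt
from statistics import NormalDist

n, p = 84, 0.5
model = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# P(X > 50) for the discrete variable becomes P(X > 50.5) in the normal model,
# which is the complement of P(X < 50.5).
estimate = 1 - model.cdf(50.5)
print(round(estimate, 4))  # 0.0318
```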

It can sometimes be instructive to compare the probabilities obtained from a binomial distribution and a normal approximation of it, as shown in the next example.

Example 6: Comparing a Binomial Distribution and a Normal Approximation

A farm produces eggs. The farmer claims that 60% of the eggs weigh more than 64 grams.

  1. A random sample of 20 eggs is taken. Find the exact probability that at least 15 of the eggs weigh more than 64 grams.
  2. A random sample of 1000 eggs is taken. Use a normal approximation to estimate the probability that more than 550 and fewer than 600 eggs weigh more than 64 grams.

Answer

Part 1

Recall that if a discrete random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, written $X \sim B(n, p)$, provided the experiment satisfies all the following conditions:

  1. The number of trials, $n$, is fixed.
  2. Each trial has two possible outcomes (success or failure).
  3. The probability of success, $p$, is fixed.
  4. The trials are independent, so the outcome of one trial does not affect the outcome of any other.

Let $X$ be the number of eggs that weigh more than 64 grams. A sample of 20 eggs may be modeled by the binomial distribution $X \sim B(n, p)$. Here, $n = 20$ and we take our fixed probability $p = 0.6$ according to the farmer's claim that 60% of the eggs weigh more than 64 grams.

Recall that the probability mass function of the binomial distribution is given by $$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}.$$ Therefore, we can calculate $P(X \ge 15)$ as
$$\binom{20}{15} 0.6^{15} \times 0.4^{5} + \binom{20}{16} 0.6^{16} \times 0.4^{4} + \binom{20}{17} 0.6^{17} \times 0.4^{3} + \binom{20}{18} 0.6^{18} \times 0.4^{2} + \binom{20}{19} 0.6^{19} \times 0.4^{1} + \binom{20}{20} 0.6^{20} \times 0.4^{0}$$
$$= 15504 \times 0.0004702 \times 0.01024 + 4845 \times 0.0002821 \times 0.0256 + 1140 \times 0.0001693 \times 0.064 + 190 \times 0.0001016 \times 0.16 + 20 \times 0.00006094 \times 0.4 + 0.00003656$$
$$= 0.1256.$$
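
The six-term sum is tedious by hand but short in code. The following Python sketch evaluates it with `math.comb`; the variable names are our own.

```python
import math

n, p = 20, 0.6

# One term of the sum for each r from 15 to 20 inclusive.
terms = [math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(15, n + 1)]
probability = sum(terms)

print(len(terms))             # 6 terms
print(round(probability, 4))  # 0.1256
```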

Part 2

As we can see, this calculation is already rather cumbersome! We certainly would not want to have to do this with a sample size of 1000. For this reason, we will approximate $X$ with a normal distribution $N(\mu, \sigma^2)$.

Recall that a binomial random variable $X \sim B(n, p)$ with $n \ge 50$ and $0.4 \le p \le 0.6$ may be approximated by a normal distribution $N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1-p)$.

So, in our approximation, we take $\mu = 1000 \times 0.6 = 600$ and $\sigma^2 = 600 \times 0.4 = 240$. Our approximate model is therefore $N(600, 240)$.

We can now use our calculators to calculate the probability that $X$ lies strictly between 550 and 600 in this model.

In order to exclude the cases of 550 and 600, we make a continuity correction and calculate $P(550.5 < X < 599.5) = 0.4864$.
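
A strict two-sided inequality shrinks the interval by 0.5 at each end. The sketch below wraps that rule in a hypothetical helper, `approx_strictly_between`, and applies it to the egg sample; it assumes `statistics.NormalDist` from the standard library.

```python
from math import sqrt
from statistics import NormalDist

def approx_strictly_between(low, high, n, p):
    """Normal estimate of P(low < X < high) for X ~ B(n, p).

    Both endpoints are excluded, so the corrected interval is
    (low + 0.5, high - 0.5).
    """
    model = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return model.cdf(high - 0.5) - model.cdf(low + 0.5)

estimate = approx_strictly_between(550, 600, 1000, 0.6)
print(round(estimate, 4))  # 0.4864
```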

We can evaluate the accuracy of a normal approximation by comparing it to the binomial distribution it is approximating. Suppose we have a binomial random variable $X \sim B(n, p)$ that is approximated by a normal distribution $N(\mu, \sigma^2)$. We can evaluate a particular probability in the binomial model, say $P(X = a)_B$, and in the normal approximation, $P(X = a)_N$. The magnitude of the difference between these two numbers, $|P(X = a)_B - P(X = a)_N|$, is the absolute error of the approximation. A more useful statistic is the percentage error, which is calculated as $$\text{percentage error} = \frac{|P(X = a)_B - P(X = a)_N|}{P(X = a)_B} \times 100.$$ In our final example, we apply this idea.

Example 7: Calculating the Percentage Error of a Normal Approximation

A population of frogs contains female and male frogs in the ratio 53:47. A random sample of 100 frogs is taken. Find the percentage error when using a normal approximation to calculate the probability that exactly 60 of the frogs are female.

Answer

Recall that if a discrete random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, written $X \sim B(n, p)$, provided the experiment satisfies all the following conditions:

  1. The number of trials, $n$, is fixed.
  2. Each trial has two possible outcomes (success or failure).
  3. The probability of success, $p$, is fixed.
  4. The trials are independent, so the outcome of one trial does not affect the outcome of any other.

Let $X$ be the number of female frogs in the sample. This is a binomial random variable $X \sim B(n, p)$ with $n = 100$ and $p = 0.53$.

Recall that a binomial random variable $X \sim B(n, p)$ with $n \ge 50$ and $0.4 \le p \le 0.6$ may be approximated by a normal distribution $N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1-p)$.

We will approximate $X$ with a normal distribution $N(\mu, \sigma^2)$, where $\mu = 100 \times 0.53 = 53$ and $\sigma^2 = 53 \times 0.47 = 24.91$. Our approximate model is therefore $N(53, 24.91)$. In order to calculate the percentage error of this approximation, we will first calculate the probability of exactly 60 female frogs in the binomial model, $P(X = 60)_B$, and then calculate the same probability with a continuity correction in the normal model, $P(59.5 < X < 60.5)_N$.

The probability mass function of the binomial distribution is given by $$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r},$$ and so we can calculate $P(X = 60)_B$ as $$P(X = 60)_B = \binom{100}{60} 0.53^{60} \times 0.47^{40} = 0.0301.$$ On the other hand, our calculators give us a value of $P(59.5 < X < 60.5)_N = 0.0299$ for the normal approximation. We can now calculate the percentage error of the approximation according to the formula $$\text{percentage error} = \frac{|P(X = a)_B - P(X = a)_N|}{P(X = a)_B} \times 100 = \frac{0.0301 - 0.0299}{0.0301} \times 100 = 0.66\%.$$ This shows that with a percentage error of less than 1%, the normal distribution is a reasonably good approximation in this case.
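
The whole comparison can be sketched in Python as follows. The function `percentage_error` is our own illustrative helper. The figure of 0.66% comes from the two probabilities rounded to four decimal places; feeding in the unrounded probabilities gives a slightly different value that is still below 1%.

```python
import math
from statistics import NormalDist

def percentage_error(exact, approximate):
    """Percentage error of an approximation relative to the exact value."""
    return abs(exact - approximate) / exact * 100

n, p, a = 100, 0.53, 60

# Exact binomial probability P(X = 60).
exact = math.comb(n, a) * p**a * (1 - p)**(n - a)

# Normal approximation with continuity correction: P(59.5 < X < 60.5).
model = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approximate = model.cdf(a + 0.5) - model.cdf(a - 0.5)

print(round(exact, 4), round(approximate, 4))

# Percentage error from the four-decimal values quoted in the text.
print(round(percentage_error(0.0301, 0.0299), 2))  # 0.66

# Percentage error from the unrounded probabilities.
print(percentage_error(exact, approximate))
```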

Let us finish by recapping a few important concepts from this explainer.

Key Points

  • It is appropriate to approximate a binomial random variable $X \sim B(n, p)$ with a normal distribution when $n$ is sufficiently large ($n \ge 50$) and $p$ is close to 0.5 ($0.4 \le p \le 0.6$).
  • The parameters of a normal approximation $N(\mu, \sigma^2)$ are given by $\mu = np$ and $\sigma^2 = np(1-p)$.
  • When using a normal approximation to estimate probabilities, we need to make continuity corrections as follows:
    • When estimating a probability of the form $P(X = a)$, we should calculate the probability over the unit interval centered on $a$; that is, $P(a - 0.5 < X < a + 0.5)$.
    • When estimating a probability of the form $P(X < a)$, we should evaluate the probability $P(X < a - 0.5)$ to exclude $a$.
    • When estimating a probability of the form $P(X \le a)$, we should evaluate the probability $P(X < a + 0.5)$ to include $a$.
  • We can evaluate the accuracy of a normal approximation by calculating its percentage error according to the formula $$\text{percentage error} = \frac{|P(X = a)_B - P(X = a)_N|}{P(X = a)_B} \times 100.$$