In this explainer, we will learn how to identify binomial experiments and solve probability problems of binomial random variables.
Suppose we have an experiment that involves flipping a fair coin 3 times. Each flip of the coin is called a trial, and these trials are independent events because the outcome of one coin flip has no effect on the outcome of any other.
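We write $H$ for “heads,” $T$ for “tails,” and $P(H)$ and $P(T)$ for their respective probabilities. As the coin is fair, $P(H) = 0.5$ and $P(T) = 0.5$. Moreover, as each trial has 2 possible outcomes, the overall experiment has $2^3 = 8$ possible outcomes:
$$HHH,\ HHT,\ HTH,\ THH,\ HTT,\ THT,\ TTH,\ TTT.$$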
Recall that a random variable is a variable whose value depends on the outcome of a random process; a discrete random variable has the further property that it can take only certain specific values. The above experiment defines a random variable, say, $X$, which can take any of the integer values 0, 1, 2, or 3 and represents the number of times that heads is thrown. If we deem a single flip of the coin with the outcome heads to be a successful trial, then $X$ represents the number of successful trials in the experiment. (Note that in theory, there would be nothing to stop us defining $X$ in relation to tails instead and treating those trials as successful.)
Suppose we then wish to calculate the probability of throwing heads 0, 1, 2, or 3 times. In this example, we could work out these values directly because the experiment involves simple probabilities and has only 8 possible outcomes. However, it would be useful if there were some mathematical theory that enabled us to model $X$ for a more general class of problems of this kind, because many experiments involve less convenient decimal probabilities and/or have large numbers of possible outcomes. In fact, we have an ideal tool at our disposal: the binomial distribution.
Properties: Binomial Distribution
If a random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, provided the experiment satisfies all the following conditions:
- I. The number of trials, $n$, is fixed.
- II. Each trial has two possible outcomes (success or failure).
- III. The probability of success, $p$, is fixed.
- IV. The trials are independent, so the outcome of one trial does not affect the outcome of any other.
For any experiment, the probabilities of all its possible outcomes must sum to 1. For a binomial experiment, the binomial distribution tells us exactly how these probabilities are distributed across the different types of outcomes. Before learning more details about calculating these probabilities, let us look at an example to test our understanding of the sorts of experiments that are binomial.
Example 1: Identifying a Binomial Experiment
Which of the following scenarios is a binomial experiment?
- A. Flipping a fair coin until heads is thrown three successive times
- B. Rolling an unbiased die 100 times and recording the score from 1 to 6
- C. Drawing 10 cards from a standard deck of 52 playing cards without replacement and recording whether a card is a diamond or not
- D. Flipping a biased coin with a fixed probability of heads, $P(H)$, and recording the number of times that tails comes up in 20 flips
- E. Drawing 10 cards from a standard deck of 52 playing cards with replacement and recording how many times each of the four suits (clubs, diamonds, hearts, and spades) appears
Answer
Recall that if a random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, provided the experiment satisfies all the following conditions:
- I. The number of trials, $n$, is fixed.
- II. Each trial has two possible outcomes (success or failure).
- III. The probability of success, $p$, is fixed.
- IV. The trials are independent, so the outcome of one trial does not affect the outcome of any other.
Our strategy is to work through these conditions systematically for each of the experiments described above. If a given experiment fails at least one of these conditions, it cannot be binomial.
- A. Since this experiment involves flipping a fair coin, it sounds like it could be binomial. However, as it involves an open-ended process of carrying out trials until we get heads three times in a row, the number of trials, $n$, is not fixed in advance. Therefore, this scenario fails condition I, so it cannot be a binomial experiment.
- B. This experiment involves rolling an unbiased (or fair) die $n = 100$ times, so the number of trials is fixed and thus condition I is satisfied. However, as we must record the score from 1 to 6, there are six possible outcomes, not two. Therefore, this scenario fails condition II, so it cannot be a binomial experiment.
- C. In this experiment, we draw cards from a standard deck of 52 playing cards a total of $n = 10$ times, so the number of trials is fixed and thus condition I is satisfied. Furthermore, there are only two possible outcomes (diamond or not), so condition II is also satisfied. However, we are told that these cards are drawn without replacement. When we choose the first card, the probability of success is the proportion of the complete deck that is diamonds. Recalling that there are four suits in a standard deck (clubs, diamonds, hearts, and spades) with 13 cards of each suit, the proportion of diamonds is therefore $\frac{13}{52} = \frac{1}{4}$. After choosing the first card, we do not replace it, so there will be only 51 cards remaining. This means that the probability of success will be different for the second card: either $\frac{12}{51}$ if the first card was a diamond or $\frac{13}{51}$ if it was not. Similarly, once we have chosen the second card, there will be only 50 cards remaining, which means that the probability of success will be different for the third card, and so on. Hence, this experiment fails condition III because there is not a fixed probability $p$. Note that it also fails condition IV because the above observations about probabilities show that the trials are not independent. We conclude that this scenario cannot be a binomial experiment.
- D. For this experiment, the number of trials is fixed at $n = 20$, so condition I is satisfied. Although the coin we are flipping is biased, there are still only two possible outcomes (heads or tails), so condition II is also satisfied. We need to record the number of tails that are thrown; since $P(H)$ is fixed, the probability of success is $p = P(T) = 1 - P(H)$. Therefore, as the probability of success is fixed, condition III is satisfied. Finally, the trials are independent because the outcome of one coin flip does not affect the outcome of any other, so condition IV is satisfied. We conclude that this scenario satisfies all four conditions and thus is a binomial experiment.
- E. This final experiment is similar to scenario B for the following reasons: the number of trials is fixed, in this case at $n = 10$, so condition I is satisfied. However, there are four possible outcomes (clubs, diamonds, hearts, or spades), not two. Therefore, this scenario fails condition II, so it cannot be a binomial experiment.
We conclude that the correct answer is D.
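To see concretely why drawing cards without replacement breaks the fixed-probability and independence conditions, the following minimal sketch computes the chance of a diamond on the first and second draws using exact fractions. The constant and variable names are illustrative assumptions, not part of the original explainer.

```python
from fractions import Fraction

DECK_SIZE = 52  # cards in a standard deck
DIAMONDS = 13   # diamonds in a standard deck

# First draw: every card is still in the deck.
p_first = Fraction(DIAMONDS, DECK_SIZE)

# Second draw without replacement: 51 cards remain, and the number of
# diamonds left depends on what the first card was.
p_second_if_diamond = Fraction(DIAMONDS - 1, DECK_SIZE - 1)
p_second_if_not_diamond = Fraction(DIAMONDS, DECK_SIZE - 1)

# Fractions print in lowest terms, so 12/51 appears as 4/17.
print(p_first, p_second_if_diamond, p_second_if_not_diamond)
```

The three values differ, so the probability of success is neither fixed nor independent of earlier draws. With replacement, every draw would have the same probability as the first.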
Now, let us consider how we might calculate some probabilities for binomial experiments. Returning to the coin-flipping experiment we introduced at the start, note that as the probabilities of success and failure are the same, each of the 8 possible outcomes has the same probability; namely, $0.5^3 = 0.125$. Moreover, these outcomes fit into four separate sets; these sets correspond to the different values of $X$, depending on whether heads is thrown 0, 1, 2, or 3 times. Therefore, to calculate $P(X = k)$ for each value of $k$, it is sufficient to count the number of outcomes in the set and multiply that value by 0.125, as follows.
If $X = 0$, there is only 1 possible outcome ($TTT$), so $P(X = 0) = 1 \times 0.125 = 0.125$.
If $X = 1$, there are 3 possible outcomes ($HTT$, $THT$, or $TTH$), so $P(X = 1) = 3 \times 0.125 = 0.375$.
If $X = 2$, there are 3 possible outcomes ($HHT$, $HTH$, or $THH$), so $P(X = 2) = 3 \times 0.125 = 0.375$.
If $X = 3$, there is only 1 possible outcome ($HHH$), so $P(X = 3) = 1 \times 0.125 = 0.125$.
Moreover, notice that $0.125 + 0.375 + 0.375 + 0.125 = 1$, so the probabilities of all the possible outcomes of the experiment sum to 1.
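The counting argument above can be checked by brute force. The sketch below lists every outcome of three fair coin flips, groups the outcomes by number of heads, and divides by the total number of outcomes; the variable names are chosen here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely outcomes of three fair coin flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# Count how many outcomes contain 0, 1, 2, or 3 heads.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Each outcome has probability 1/8, so P(X = k) is (number of outcomes) / 8.
distribution = {k: Fraction(heads_counts[k], len(outcomes)) for k in range(4)}

for k, prob in distribution.items():
    print(f"P(X = {k}) = {prob} = {float(prob)}")
```

This approach only works because all outcomes are equally likely, which is the case for a fair coin but not for a biased one.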
Although we were able to calculate these probabilities by hand for such a simple experiment, it would not be feasible to do this in general. Instead, we use the probability mass function formula, which is defined below.
Formula: Binomial Probability Mass Function
If a random variable $X$ follows the binomial distribution $B(n, p)$, written $X \sim B(n, p)$, then its probability mass function is given by the formula
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k},$$
where $\binom{n}{k}$ is the number of ways of choosing $k$ items from $n$ items.
This formula enables us to calculate the probability of the set of outcomes corresponding to each value of $k$. The sum of these probabilities, which is the sum of the probabilities of all possible outcomes of the experiment, will always be 1.
Note also that the binomial coefficient $\binom{n}{k}$ is sometimes written as ${}^{n}C_{k}$ or ${}_{n}C_{k}$, and we refer to it as “$n$ choose $k$.” Most calculators have a button for computing this quantity for given values of $n$ and $k$.
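As a rough illustration of the probability mass function, here is a minimal Python sketch. It relies on the standard-library function `math.comb` for the binomial coefficient; the function name `binomial_pmf` is a label chosen here, not something defined in the explainer.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ B(n, p)."""
    # comb(n, k) counts the ways of choosing k successes from n trials.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The fair-coin experiment: n = 3 trials, probability of heads p = 0.5.
coin = [binomial_pmf(k, 3, 0.5) for k in range(4)]
print(coin)
print(sum(coin))
```

For the fair coin, the list reproduces the probabilities 0.125, 0.375, 0.375, and 0.125 found by counting outcomes.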
The binomial coefficients follow a symmetrical pattern that we can recognize from Pascal’s triangle; the first few rows are shown below.
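The rows of Pascal's triangle can be generated directly from the binomial coefficients. The short sketch below prints rows 0 to 5; `pascal_row` is a helper name introduced here for illustration.

```python
from math import comb


def pascal_row(n: int) -> list[int]:
    """Return row n of Pascal's triangle: n choose 0, ..., n choose n."""
    return [comb(n, k) for k in range(n + 1)]


for n in range(6):
    print(pascal_row(n))
```

Each row reads the same forward and backward, which reflects the fact that choosing $k$ items from $n$ is equivalent to choosing the $n - k$ items to leave out.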
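To apply the formula, we need to know the values of $n$ and $p$ (from which we can also calculate $q = 1 - p$, the probability of failure) as well as the required value for $k$. In our coin-flipping experiment, we have $n = 3$ and $p = 0.5$, so $q = 0.5$. Therefore, applying the formula for $k = 0$, 1, 2, and 3, we get the four probabilities
$$\begin{aligned}
P(X = 0) &= \binom{3}{0} (0.5)^0 (0.5)^3 = 0.125, \\
P(X = 1) &= \binom{3}{1} (0.5)^1 (0.5)^2 = 0.375, \\
P(X = 2) &= \binom{3}{2} (0.5)^2 (0.5)^1 = 0.375, \\
P(X = 3) &= \binom{3}{3} (0.5)^3 (0.5)^0 = 0.125.
\end{aligned}$$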
As expected, these values agree exactly with those we obtained by hand. We can also represent this discrete probability distribution visually, as shown below.
Notice that the distribution is symmetrical. This is not surprising because in this experiment, we have $p = q = 0.5$ (i.e., the probability of success is the same as the probability of failure). Therefore, every individual outcome has the same probability, so the sizes of the probabilities corresponding to each value of $X$ are determined by the binomial coefficients. The symmetry follows from the fact that for a binomial experiment with $n$ trials, for every outcome featuring $k$ successes and $n - k$ failures, there will be a complementary outcome featuring $k$ failures and $n - k$ successes. For instance, in this example, the outcome $HHT$ has the complementary outcome $TTH$. The only exception to this general rule occurs when $n$ is even and $k = \frac{n}{2}$; in this case, there is a unique type of outcome “in the middle” comprising $\frac{n}{2}$ successes and $\frac{n}{2}$ failures, but this still results in a symmetrical distribution overall.
We also observe that the most likely numbers of heads to be thrown in this experiment are 1 and 2, with equal probabilities. This is what we would expect from 3 trials, each with a probability of success of 0.5. The number of trials is odd, but the number of successes must be an integer, so the two most likely numbers of heads are the integers on either side of $3 \times 0.5 = 1.5$.
In general, however, the binomial probability mass function does not produce a symmetrical distribution. For instance, suppose we rerun the experiment with a biased coin, where $p = P(H)$ is smaller than $q = P(T) = 1 - p$. The associated distribution is skewed toward the lower values of $X$, as shown below.
As always, the probabilities sum to 1, but this time, the most likely number of heads being thrown is 1 (alongside 2 tails). This is intuitively obvious from the fact that in this particular experiment, the probability of throwing heads is less than half that of throwing tails.
Similarly, we could rerun the experiment with $p = P(H)$ larger than $q = P(T) = 1 - p$. In this case, the associated distribution is skewed toward the upper values of $X$, as shown below.
The probabilities sum to 1, but this time, the most likely number of heads being thrown is 2 (alongside 1 tail). Again, this is intuitively obvious from the fact that in this particular experiment, the probability of throwing heads is more than twice that of throwing tails.
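The effect of a biased coin on the shape of the distribution can be explored numerically. In the sketch below, the values $p = 0.3$ and $p = 0.7$ are assumptions chosen purely for illustration (they are consistent with the description above but are not taken from it), and the helper names are likewise hypothetical.

```python
from math import comb


def binomial_distribution(n: int, p: float) -> list[float]:
    """Return [P(X = 0), ..., P(X = n)] for X ~ B(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def most_likely(probs: list[float]) -> int:
    """Return the value of k with the largest probability (first one if tied)."""
    return max(range(len(probs)), key=lambda k: probs[k])


# p = 0.3 and p = 0.7 are assumed example values for the biased coins.
for p in (0.3, 0.5, 0.7):
    probs = binomial_distribution(3, p)
    print(p, [round(x, 3) for x in probs], "most likely k:", most_likely(probs))
```

Under these assumed values, the distribution for $p = 0.7$ is the mirror image of the one for $p = 0.3$, since swapping $p$ and $q$ swaps the roles of heads and tails.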
Another useful property of the binomial distribution is that if we want to find, for example, the probability that at most 2 heads are thrown, then we just sum the probabilities of the outcomes where this occurs; that is, $P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$. This is an example of a cumulative probability, which we often meet in questions on this topic.
Let us now try an example where we compute some probabilities using the probability mass function formula. Recall that if the random variable $X$ follows a binomial distribution $B(n, p)$, we use the notation $X \sim B(n, p)$.
Example 2: Computing Probabilities of Binomial Random Variables
The random variable $X \sim B(5, 0.75)$. Giving your answer to 5 decimal places, work out the following probabilities.
- $P(X = 0)$
- $P(X = 1)$
- $P(X \le 2)$
Answer
Recall that in a binomial experiment, $n$ represents the number of trials and $p$ the probability of success. If a random variable $X \sim B(n, p)$, then its probability mass function is given by the formula
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k},$$
where $\binom{n}{k}$ is the number of ways of choosing $k$ items from $n$ items.
Here, we have $n = 5$ and $p = 0.75$ and must calculate probabilities for three different values of $k$.
Part 1
To calculate $P(X = 0)$, we apply the formula with $n = 5$, $p = 0.75$, and $k = 0$. This gives
$$P(X = 0) = \binom{5}{0} (0.75)^0 (0.25)^5 = 0.0009765625,$$
which is 0.00098 to 5 decimal places.
Part 2
To calculate $P(X = 1)$, we apply the formula with $n = 5$, $p = 0.75$, and $k = 1$. This gives
$$P(X = 1) = \binom{5}{1} (0.75)^1 (0.25)^4 = 0.0146484375,$$
which is 0.01465 to 5 decimal places.
Part 3
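To calculate $P(X \le 2)$, first note that an expression of the form $P(X \le k)$ is called a cumulative probability, which means it is the sum of the probabilities for all values of $X$ up to and including $k$. Therefore, we have
$$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2).$$
Applying the formula to the right-hand side with $n = 5$ and $p = 0.75$ for $k = 0$, 1, and 2, this becomes
$$P(X \le 2) = \binom{5}{0} (0.75)^0 (0.25)^5 + \binom{5}{1} (0.75)^1 (0.25)^4 + \binom{5}{2} (0.75)^2 (0.25)^3.$$
Notice that instead of calculating the values of all three terms on the right-hand side, we can substitute directly for the first two using the values obtained for $P(X = 0)$ and $P(X = 1)$ in parts 1 and 2 and then evaluate the binomial expression for $P(X = 2)$ separately. Thus, we have
$$P(X \le 2) = 0.0009765625 + 0.0146484375 + 0.087890625 = 0.103515625,$$
which is 0.10352 to 5 decimal places.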
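The three answers in this example can be double-checked with exact arithmetic. The sketch below assumes $X \sim B(5, 0.75)$, which are the parameters that reproduce the stated answers 0.00098, 0.01465, and 0.10352; the helper name `pmf` is chosen here for illustration.

```python
from fractions import Fraction
from math import comb

# Assumed parameters: X ~ B(5, 0.75), written as an exact fraction.
n, p = 5, Fraction(3, 4)


def pmf(k: int) -> Fraction:
    """Return P(X = k) exactly for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p0, p1, p2 = pmf(0), pmf(1), pmf(2)
p_at_most_2 = p0 + p1 + p2  # cumulative probability P(X <= 2)

for label, value in [("P(X = 0)", p0), ("P(X = 1)", p1), ("P(X <= 2)", p_at_most_2)]:
    print(label, "=", value, "=", round(float(value), 5))
```

Working with `Fraction` avoids floating-point rounding until the final conversion to 5 decimal places.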
Let us now look at an example where we must work out the values of and for a given binomial experiment before we can calculate some probabilities.
Example 3: Solving Real-World Problems with Binomial Random Variables
In a binomial experiment, this spinner is spun 10 times and the result is recorded as a success if the top score is achieved.
Let $X$ be the number of successes.
- Determine $P(X = 2)$ to 5 decimal places.
- Determine $P(X = 9)$ to 5 decimal places.
Answer
Recall that in a binomial experiment, $n$ represents the number of trials and $p$ the probability of success. If a random variable $X \sim B(n, p)$, then its probability mass function is given by the formula
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k},$$
where $\binom{n}{k}$ is the number of ways of choosing $k$ items from $n$ items.
Before we can answer parts 1 and 2, we must first work out the values of $n$ and $p$.
We are told that the spinner is spun 10 times, so the number of trials is $n = 10$. Furthermore, the result of a trial is recorded as a success if the top score is achieved. Checking the spinner, we see that it comprises 8 equal-sized sectors. The top score, 100, appears on 2 of those sectors. Therefore, the probability of success, $p$, is given by $\frac{2}{8} = \frac{1}{4}$, or 0.25 as a decimal. Since $X$ represents the number of successes, we conclude that $X \sim B(10, 0.25)$.
Part 1
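To determine $P(X = 2)$, we apply the probability mass function formula with $n = 10$, $p = 0.25$, and $k = 2$. This gives
$$P(X = 2) = \binom{10}{2} (0.25)^2 (0.75)^8 = 0.281567\ldots,$$
which is 0.28157 to 5 decimal places.
Part 2
To determine $P(X = 9)$, we apply the probability mass function formula with $n = 10$, $p = 0.25$, and $k = 9$. This gives
$$P(X = 9) = \binom{10}{9} (0.25)^9 (0.75)^1 = 0.0000286\ldots,$$
which is 0.00003 to 5 decimal places.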
Clearly, the answer to part 2 is a tiny probability. If we reflect on the real-world experiment that gave rise to it, this is to be expected. After all, if there was only a 0.25 chance of getting the top score on a spinner, then we would be very surprised to get this score on 9 out of 10 spins. It is always worth considering if our answer makes sense in the context of the given problem.
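As a quick numerical check of the spinner example, the sketch below evaluates the probability mass function for 10 spins with success probability 0.25. The choice of 2 successes for the first part is an assumption (it is the value that reproduces the stated answer 0.28157); 9 successes for the second part follows from the discussion above.

```python
from math import comb

n, p = 10, 0.25  # 10 spins; the top score covers 2 of 8 equal sectors


def binomial_pmf(k: int) -> float:
    """Return P(X = k) for X ~ B(10, 0.25)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_two = binomial_pmf(2)   # assumed value of k for part 1
p_nine = binomial_pmf(9)  # top score on 9 of the 10 spins

print(round(p_two, 5), round(p_nine, 5))
```

The second probability is several orders of magnitude smaller than the first, in line with the sense check above.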
Earlier, we touched upon cumulative probabilities, which are expressions of the form $P(X \le k)$ representing the sum of the probabilities for all values of $X$ up to and including $k$; that is,
$$P(X \le k) = P(X = 0) + P(X = 1) + \cdots + P(X = k).$$
For a given random variable $X \sim B(n, p)$, we could calculate a simple cumulative probability such as $P(X \le 1)$, which is just $P(X = 0) + P(X = 1)$, by applying the probability mass function formula separately to $k = 0$ and $k = 1$ and then adding the results together. However, in general, it is not practical to use this method when calculating cumulative probabilities for larger values of $k$. Fortunately, we can look up these probabilities in the binomial cumulative distribution function tables that appear at the back of our textbooks (or sometimes in separate booklets). These tables give the cumulative probability values to 4 decimal places. We show an example below, which is the table for a single value of $n$.
To find the cumulative probability $P(X \le k)$ for given values of $p$ and $k$, we read along the top row to find the required, or closest, value of $p$ and then read down that column until we reach the row corresponding to our value of $k$. For example, if $p = 0.45$, we can find $P(X \le 3)$ by reading along the top row until we reach 0.45 and then reading down that column to the row labeled 3. We show this process below; the entry we arrive at is $P(X \le 3)$ to 4 decimal places.
These tables are very useful because they enable us to find cumulative probabilities expressed in terms of other inequalities, such as $P(X < k)$ or $P(X > k)$. The fact that $X$ can take only integer values means that it is straightforward to rewrite these cumulative probabilities in terms of $P(X \le k)$. Then, we look up the required values in the tables, followed by further computations with our calculator if necessary.
There are three main cases we will meet.
- Calculating $P(X < k)$: as the largest integer less than $k$ is $k - 1$, $P(X < k)$ is equivalent to $P(X \le k - 1)$, which we can look up in the tables.
- Calculating $P(X > k)$: as the sum of all probabilities from $X = 0$ to $X = n$ is 1, this implies that $P(X \le k) + P(X > k) = 1$, from which it follows that $P(X > k) = 1 - P(X \le k)$. So, we look up $P(X \le k)$ in the tables and then use a calculator to subtract this value from 1.
- Calculating $P(X \ge k)$: again, as the sum of all probabilities from $X = 0$ to $X = n$ is 1, this implies that $P(X \le k - 1) + P(X \ge k) = 1$, from which it follows that $P(X \ge k) = 1 - P(X \le k - 1)$. So, we look up $P(X \le k - 1)$ in the tables and then use a calculator to subtract this value from 1.
As an aside, note that the individual probability $P(X = k)$ could be rewritten as $P(X \le k) - P(X \le k - 1)$. This means that we could look up both of these cumulative probabilities in the tables and then subtract the second from the first to obtain $P(X = k)$. Usually, however, to calculate $P(X = k)$, we apply the formula directly, especially since the table only gives us answers to 4 decimal places and many questions require our answers to be given to more decimal places than this.
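The three rewriting rules above translate directly into code. In the sketch below, a computed cumulative sum plays the role of the printed table; all function names are hypothetical labels introduced for this illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k); this is 0 when k < 0 and covers all values when k >= n."""
    return sum(binomial_pmf(i, n, p) for i in range(0, min(k, n) + 1))


def prob_less_than(k: int, n: int, p: float) -> float:
    return binomial_cdf(k - 1, n, p)      # P(X < k) = P(X <= k - 1)


def prob_greater_than(k: int, n: int, p: float) -> float:
    return 1 - binomial_cdf(k, n, p)      # P(X > k) = 1 - P(X <= k)


def prob_at_least(k: int, n: int, p: float) -> float:
    return 1 - binomial_cdf(k - 1, n, p)  # P(X >= k) = 1 - P(X <= k - 1)


# Fair-coin experiment with n = 3 and p = 0.5.
print(binomial_cdf(2, 3, 0.5), prob_less_than(2, 3, 0.5))
print(prob_greater_than(2, 3, 0.5), prob_at_least(2, 3, 0.5))
```

Unlike the printed tables, these functions are not limited to 4 decimal places or to the tabulated values of $p$.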
Let us summarize what we have learned about how to calculate different probabilities arising from binomial experiments.
How To: Using the Binomial Distribution
- Identify the number of trials; this number is $n$.
- Identify the probability of success; this number is $p$.
- Identify the relevant number of successes; this number is $k$.
- Identify the required individual or cumulative probability and find it via the following methods:
  - $P(X = k)$: use the formula and a calculator.
  - $P(X \le k)$: read off directly from tables.
  - $P(X < k)$: as $P(X < k) = P(X \le k - 1)$, read off directly from tables.
  - $P(X > k)$: as $P(X > k) = 1 - P(X \le k)$, read off $P(X \le k)$ from tables and then use a calculator to subtract this value from 1.
  - $P(X \ge k)$: as $P(X \ge k) = 1 - P(X \le k - 1)$, read off $P(X \le k - 1)$ from tables and then use a calculator to subtract this value from 1.
In our next example, we must interpret a word problem so that we can solve it by using binomial cumulative distribution function tables.
Example 4: Solving Real-World Problems with Binomial Random Variables
When a biased coin with $P(H) = 0.35$ is flipped 5 times, what is the probability of getting more heads than tails? Give your answer to 4 decimal places.
Answer
Recall that in a binomial experiment, $n$ represents the number of trials and $p$ the probability of success. If a random variable $X \sim B(n, p)$, then its probability mass function is given by the formula
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k},$$
where $\binom{n}{k}$ is the number of ways of choosing $k$ items from $n$ items.
Before we can calculate the required probability, we must first work out the values of $n$ and $p$.
We are told that a biased coin with $P(H) = 0.35$ is flipped 5 times. Since we need to work out a probability relating to the number of heads thrown, we thus have $n = 5$ and $p = 0.35$. Writing $X$ to represent the number of successes, which is the number of heads thrown, we deduce that $X \sim B(5, 0.35)$.
Next, we must translate the wording of the second sentence of the question into a mathematical statement about probabilities. The probability of getting more heads than tails is the same as the probability of getting at least 3 heads in our 5 coin flips. Therefore, the probability we require is $P(X \ge 3)$.
As the sum of all probabilities from $X = 0$ to $X = 5$ is 1, this implies that $P(X \le 2) + P(X \ge 3) = 1$, from which it follows that $P(X \ge 3) = 1 - P(X \le 2)$.
Now, recall that expressions of the form $P(X \le k)$ are called cumulative probabilities. For given values of $n$, $p$, and $k$, we can look up cumulative probabilities in binomial cumulative distribution function tables. In this case, we must look up $P(X \le 2)$ in the relevant table and then use a calculator to subtract this value from 1 in order to get $P(X \ge 3)$. We show the table below.
Since $p = 0.35$, we find $P(X \le 2)$ by reading along the top row until we reach 0.35 and then reading down that column until we reach the row labeled 2; this tells us that $P(X \le 2) = 0.7648$ to 4 decimal places. Therefore, we have
$$P(X \ge 3) = 1 - 0.7648 = 0.2352,$$
which is already given to 4 decimal places. We conclude that the probability of getting more heads than tails is 0.2352.
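The table lookup in this example can be reproduced by summing the probability mass function for 5 flips with a probability of heads of 0.35. The sketch below computes the answer both via the complement and by adding the upper probabilities directly; the variable names are illustrative.

```python
from math import comb

n, p = 5, 0.35  # 5 flips, probability of heads 0.35

# P(X = k) for k = 0, 1, ..., 5.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

p_at_most_2 = sum(pmf[:3])                     # the value read from the table
p_at_least_3_via_complement = 1 - p_at_most_2  # 1 - P(X <= 2)
p_at_least_3_direct = sum(pmf[3:])             # P(X = 3) + P(X = 4) + P(X = 5)

print(round(p_at_most_2, 4), round(p_at_least_3_via_complement, 4))
```

Both routes agree up to floating-point error, which confirms that the complement rule was applied to the correct cumulative probability.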
Suppose we have a binomial experiment with $X \sim B(n, p)$ for specific values of $n$ and $p$. In some questions, we are told the actual value of a cumulative probability, or an upper or lower bound for it, and must work backward from the tables to find an unknown value of $k$. Here is an example of this type.
Example 5: Solving Real-World Problems with Binomial Random Variables
A driving school has a pass rate of $85\%$. In one particular month, 20 candidates take their driving test. If the probability that more than $k$ candidates fail their test is less than $1\%$, what is the smallest possible value of $k$?
Answer
Recall that in a binomial experiment, $n$ represents the number of trials and $p$ the probability of success. If a random variable $X$ follows the binomial distribution $B(n, p)$, we write $X \sim B(n, p)$.
To find the smallest possible value of $k$, first we must identify the values of $n$ and $p$. As 20 candidates take their driving test in one particular month, $n = 20$. Regarding $p$, the probability of success, we are told that the driving school has an $85\%$ pass rate, which corresponds to a probability of 0.85. However, as we are asked a question about the number of candidates that fail in one particular month, it is simpler to make $p$ the probability of failing the test, which is $1 - 0.85 = 0.15$.
Note that it is not contradictory for us to label a test failure as a success because in the context of an experiment, the term success really means the outcome that we are looking for, not the one that would necessarily be desirable in the real world. Hence, in this case, we let the random variable $X$ represent the number of candidates that fail the test in one particular month, so $X \sim B(20, 0.15)$.
In addition, we know the probability that more than $k$ candidates fail their test is less than $1\%$ (which corresponds to a probability of 0.01). Written as a mathematical statement, this is $P(X > k) < 0.01$, for some unknown integer $k$.
As the sum of all probabilities from $X = 0$ to $X = 20$ is 1, this implies that $P(X \le k) + P(X > k) = 1$, from which it follows that $P(X > k) = 1 - P(X \le k)$. Moreover, since $P(X > k) < 0.01$, then $1 - P(X \le k) < 0.01$, which can easily be rearranged to give $P(X \le k) > 0.99$.
Now recall that expressions of the form $P(X \le k)$ are called cumulative probabilities. We can look up these probabilities in binomial cumulative distribution function tables and can also work backward from those tables to find an unknown value of $k$.
In this case, we need the table for $n = 20$. As $p = 0.15$, we read along the top row until we reach 0.15. Then, since we want $k$, the smallest integer for which $P(X \le k) > 0.99$, we read down that column until we come to the first number greater than 0.99. Finally, we read off the corresponding value of $k$.
As shown above, since the probability 0.9941 is the first value in the column headed $p = 0.15$ that is greater than 0.99, we read across from this to find the corresponding value for $k$. The required cumulative probability is $P(X \le 7) = 0.9941$, so $k = 7$. We know that 7 is the smallest possible value of $k$ because $P(X \le 6) = 0.9781$, which is less than 0.99.
Therefore, if the probability that more than $k$ candidates fail their test is less than $1\%$, the smallest possible value of $k$ is 7.
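Working backward from the table amounts to scanning cumulative probabilities until the condition is first met. The sketch below does this for 20 trials with a failure probability of 0.15 and a bound of 0.01 on the upper tail; the function name `smallest_k_with_small_tail` is a hypothetical label, not standard terminology.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def smallest_k_with_small_tail(n: int, p: float, bound: float) -> int:
    """Return the smallest k for which P(X > k) < bound, where X ~ B(n, p)."""
    cumulative = 0.0  # running value of P(X <= k)
    for k in range(n + 1):
        cumulative += binomial_pmf(k, n, p)
        if 1 - cumulative < bound:
            return k
    return n  # P(X > n) = 0, so k = n satisfies any positive bound


# 20 candidates, probability of failing 0.15, tail probability below 0.01.
k = smallest_k_with_small_tail(20, 0.15, 0.01)
print(k)
```

The loop stops at the first qualifying value, which mirrors reading down the table column until the first entry greater than 0.99 appears.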
Let us finish by recapping some key concepts from this explainer.
Key Points
- If a random variable $X$ represents the number of successful trials in an experiment, we can model $X$ with a binomial distribution $B(n, p)$, provided the experiment satisfies all the following conditions:
  - The number of trials, $n$, is fixed.
  - Each trial has two possible outcomes (success or failure).
  - The probability of success, $p$, is fixed.
  - The trials are independent, so the outcome of one trial does not affect the outcome of any other.
- If a random variable $X$ follows the binomial distribution $B(n, p)$, written $X \sim B(n, p)$, then its probability mass function is given by the formula $$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k},$$ where $\binom{n}{k}$ is the number of ways of choosing $k$ items from $n$ items.
- Cumulative probabilities are expressions of the form $P(X \le k)$ representing the sum of the probabilities for all values of $X$ up to and including $k$; that is, $$P(X \le k) = P(X = 0) + P(X = 1) + \cdots + P(X = k).$$ For given values of $n$, $p$, and $k$, we can look up these probabilities in binomial cumulative distribution function tables.
- Once we have identified the values of $n$, $p$, and $k$ for a binomial experiment, we can find individual or cumulative probabilities via the following methods:
  - $P(X = k)$: use the formula and a calculator.
  - $P(X \le k)$: read off directly from tables.
  - $P(X < k)$: as $P(X < k) = P(X \le k - 1)$, read off directly from tables.
  - $P(X > k)$: as $P(X > k) = 1 - P(X \le k)$, read off $P(X \le k)$ from tables and then use a calculator to subtract this value from 1.
  - $P(X \ge k)$: as $P(X \ge k) = 1 - P(X \le k - 1)$, read off $P(X \le k - 1)$ from tables and then use a calculator to subtract this value from 1.