Video Transcript
In this video, we’ll learn how to
deal with the concept of conditional probability using joint frequencies presented
in two-way tables.
But what do we mean by conditional
probability? It’s defined as the likelihood of
an event or outcome occurring based on the occurrence of a previous event or
outcome. For example, suppose event A is that it’s
raining outside and event B is that your train is delayed. In conditional probability, we look
at these events together and ask questions like, what is the probability that your
train will be delayed given that it’s raining outside? We’d write this as 𝑃 of B given A, with a
vertical line between the B and the A that means “given that.”
One way we have to record the
outcomes of such events is in a two-way table. It’s a table that shows the
frequency of two variables. We need to be able to complete
two-way tables and calculate probabilities from them. So let’s see what this might look
like.
A fanzine website for the TV show A
Maze in Space collects data on the number of new alien species encountered by two
starships on each season of the show. The data for seasons one, two, and
seven are shown in the table below, split between the two starships Zeta and
Geoda. Find the probability that a new
alien species chosen at random was encountered by starship Geoda. Give your answer to three decimal
places.
We want to find the probability
that a new alien species chosen at random was encountered by starship Geoda. This doesn’t specify which season
we’re interested in, so we’re going to use the total. And so we begin by recalling that
the probability of an event occurring is found by dividing the number of ways that
event can occur by the total number of possible outcomes. So in this case, we’re going to
find the number of alien species found in total by starship Geoda. And then we’ll divide that by the
total number of alien species encountered altogether.
Looking at the totals, we see that
starship Geoda discovered a total of 72 alien species. And the total was 83. So the probability that a new alien
species chosen at random was encountered by starship Geoda is found by dividing 72
by 83, which is 0.8674 and so on. Correct to three decimal places,
that’s 0.867.
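The calculation above can be sketched in a few lines of Python. This is only an illustration using the two totals read from the table; the helper name `probability` is hypothetical and not something from the video.

```python
def probability(favourable: int, total: int) -> float:
    """Return favourable / total, the probability of an event
    when every outcome is equally likely."""
    if total <= 0:
        raise ValueError("total must be positive")
    return favourable / total

# Totals read from the two-way table in the video.
geoda_total = 72   # new species encountered by starship Geoda
grand_total = 83   # new species encountered by both starships

p_geoda = probability(geoda_total, grand_total)
print(round(p_geoda, 3))  # 0.867
```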
So that’s how we calculate a simple
probability from a two-way table. Let’s now have a look at how to
find conditional probabilities from these tables.
Given that a new alien species was
encountered in season seven, find the probability that they were encountered by
starship Geoda. Give your answer to three decimal
places.
The phrase “given that” is an
indication that we’re going to be calculating conditional probability. If we let A be the event that the
alien species was encountered by starship Geoda and B be the event that this
happened in season seven, then we use this vertical line to
show “given that.” We’re finding the probability that
A occurs given that B has occurred. Now, the phrase “given that”
essentially narrows down the criteria. We’re told that the alien species
was encountered in season seven. So we’re only interested in these
three pieces of data. Out of this list, we see that the
number of alien species encountered by starship Geoda is eight.
Of course, the probability is found
by dividing the number of ways that event can occur by the total number of
outcomes. And the total number here is
13. So the probability that A happens
given that B has happened is eight divided by 13, which is 0.61538 and so on. Correct to three decimal places, we
see that given that a new alien species was encountered in season seven, the
probability that they were encountered by starship Geoda is 0.615.
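As a rough sketch, narrowing the table down to the season seven row might look like this in Python. Only Geoda’s count of eight and the row total of 13 are quoted in the video; the Zeta count of five is inferred here as 13 minus eight, and the variable names are illustrative.

```python
# Season seven row of the table: once we are told the encounter
# happened in season seven, only these entries matter.
season_seven = {"Zeta": 5, "Geoda": 8}  # Zeta inferred as 13 - 8

row_total = sum(season_seven.values())  # 13 encounters in season seven
p_geoda_given_s7 = season_seven["Geoda"] / row_total

print(round(p_geoda_given_s7, 3))  # 0.615
```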
Let’s have a look at another
example of finding probabilities from two-way tables.
The table below contains data from
a survey of core gamers who were asked whether their preferred gaming platform was
the smartphone, the console, or the PC. The gamers are split by gender. Find the probability that a core
gamer chosen at random prefers using a console. Give your answer to three decimal
places. Given that a core gamer prefers to
play using a console, find the probability that they are male. Give your answer to three decimal
places.
Now, firstly, we recall that if
we’re trying to find the probability of an event occurring, we divide the number of
ways that event can occur by the total number of outcomes. And the first part of this question
asks us to find the probability that a gamer chosen at random prefers to use a
console. Now, they don’t specify whether
we’re interested in male or female gamers. So, in fact, we’re going to
calculate the totals.
We begin by calculating the total
number of gamers who prefer to use a smartphone. That’s 52 plus 48, which is
100. Similarly, to calculate the total
number of gamers who preferred the console, we add 37 and 23, to give us 60. Finally, the total number of gamers
who prefer to use the PC is 48 plus 35, which is 83. The total number of gamers
questioned is found by adding all of the values in this column. That’s 100 plus 60 plus 83, which
is 243.
Now, remember, we’re looking to
find the probability that the gamer chosen at random prefers to use a console. So that’s this second row. The total number of outcomes or the
total number of gamers here we calculated to be 243. So the probability that a core
gamer chosen at random prefers to use a console is 60 divided by 243, which is
0.2469 and so on. That’s 0.247.
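The totals worked out above can be reproduced with a small Python sketch. The counts are the ones quoted in the video; which count belongs to which gender is only stated for the console row, so the gender labels on the smartphone and PC rows are an assumption made for illustration.

```python
# Counts quoted in the video. The male/female labels are confirmed
# only for the console row; the other two rows are assumed.
table = {
    "smartphone": {"male": 52, "female": 48},
    "console":    {"male": 37, "female": 23},
    "PC":         {"male": 48, "female": 35},
}

# One total per platform, then the grand total of gamers surveyed.
row_totals = {platform: sum(counts.values()) for platform, counts in table.items()}
grand_total = sum(row_totals.values())

p_console = row_totals["console"] / grand_total

print(row_totals)           # {'smartphone': 100, 'console': 60, 'PC': 83}
print(grand_total)          # 243
print(round(p_console, 3))  # 0.247
```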
The second part of this question
states that given that a core gamer prefers to play using a console, find the
probability that they are male. This phrase “given that” is an
indication that we’re going to use conditional probability. If we let event A be the event that
the gamer chosen is male and event B be the event that they prefer to use a console,
we use the bar notation to show that we’re trying to find the probability of A
occurring given that B has occurred. And what this does is narrow down
the data somewhat.
We’re told that the gamer prefers
to play using a console. So we can narrow our data down into
just those people who prefer to play using a console. And we want to find the probability
that they are male. So we’re going to divide the number
of male gamers who said they preferred using a console by the total number of gamers
who said they preferred using a console. That’s 37 divided by 60. That’s 0.61666 and so on, which
correct to three decimal places is 0.617. So the probability that a core
gamer is male given that they prefer to play using a console is 0.617.
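One way to express this “narrowing down” step in code is a small helper that works on a single row of the table. The function name `conditional_from_row` is hypothetical; the counts are the console row from the video.

```python
def conditional_from_row(row: dict, category: str) -> float:
    """P(category | row): the share of one row's total that falls
    in the given category."""
    row_total = sum(row.values())
    if row_total == 0:
        raise ValueError("cannot condition on a row with no observations")
    return row[category] / row_total

# Console row from the video: 37 male gamers and 23 female gamers.
console_row = {"male": 37, "female": 23}

p_male_given_console = conditional_from_row(console_row, "male")
print(round(p_male_given_console, 3))  # 0.617
```

Because the denominator is the row total rather than the grand total, the same helper would work for any other row of a two-way table.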
In our next example, we’ll state
and learn how to use the conditional probability formula.
Daniel and Jennifer are running for
the presidency of the Students’ Union at their school. The votes they received from each
of three classes are shown in the table. What is the probability that a
student voted for Jennifer given that they are in class B?
Remember, the phrase “given that”
indicates that we’re working with conditional probability. We’ll say that event A is that a
student voted for Jennifer. And event B is that the student is
in class B. Then this vertical line means given
that. We’re finding the probability that
A occurs given that B has already occurred. And one way we can do this is to
narrow down the table based on the information that we’ve been given.
We’re told that the student is in
class B, so we narrow it down to everyone in class B. In this case, we’re interested in
the number of students that voted for Jennifer. That’s 195. And remember, to find the
probability of an event occurring, we divide the number of ways that event can occur
by the total number of outcomes. Here, the total number of possible
outcomes is the total number of students in class B. That’s 169 plus 195, which is
364. And so the probability that a
student voted for Jennifer given that they’re in class B is 195 over 364, which
simplifies to 15 over 28.
Now, in fact, this is a perfectly
valid method for answering this question. But there is a formula we can
use. We say that to find the probability
of A given B, we divide the probability of A intersection B — in other words, A and
B — by the probability of B. So in this case, what’s A
intersection B? Well, A was the event that a student
voted for Jennifer and B was the event that the student is in class B. We’re looking for the intersection,
the students that voted for Jennifer and are in class B. There are 195 of them.
The probability of choosing one of
these at random is found by dividing 195 by the total number of students asked
altogether. That’s 507 plus 494. That gives us a total of 1001. So the probability of A
intersection B, in other words, the probability that a student voted for Jennifer
and is in class B, is 195 out of 1001. And what about the probability of
B, in other words, the probability that they’re in class B? Well, we already saw that there are
364 students in class B. So the probability of B occurring
is 364 divided by 1001. And so, if we apply the formula, we
get 195 over 1001 divided by 364 over 1001. Notice that this gives us the exact
same answer as earlier, 195 over 364, which simplifies to 15 over 28.
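To check that the formula and the table method agree exactly, here is a minimal sketch using Python’s standard `fractions` module, which keeps the arithmetic exact. The variable names are illustrative; the counts are the ones quoted in the video.

```python
from fractions import Fraction

total_students = 507 + 494    # 1001 students asked altogether
class_b = 169 + 195           # 364 students in class B
jennifer_and_class_b = 195    # voted for Jennifer and are in class B

p_a_and_b = Fraction(jennifer_and_class_b, total_students)  # P(A and B)
p_b = Fraction(class_b, total_students)                     # P(B)

via_formula = p_a_and_b / p_b                        # P(A and B) / P(B)
via_table = Fraction(jennifer_and_class_b, class_b)  # read from the class B column

print(via_formula)               # 15/28
print(via_formula == via_table)  # True
```

The common denominator of 1001 cancels in the division, which is why both methods land on 195 over 364.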
In our final example, we’ll look at
how we can use information from two-way tables to help us decide whether two
events are independent.
Data is collected from the TV show
A Maze in Space on the number of new alien species with which first contact is made. The data for starship Zeta in
seasons one, two, and seven are shown in the table below. The data have also been categorized
by whether the crew member who made first contact was male or female. From the table, find the
probability that first contact was made with a new alien species by a female crew
member. Give your answer to three decimal
places.
We want to find the probability
that first contact was made by a female crew member. Let’s call that 𝑃 of F, where F is
the event that a female crew member was the person who made first contact. And we know that to find the
probability of an event occurring, we divide the number of ways that event can occur
by the total number of outcomes. The information on female crew
members is this second row. And we know there are a total of 37
first contacts made by female crew members.
The total number of outcomes is
the total number of first contacts made with a new alien species; that’s 72. So the probability then that first
contact was made with a new alien species by a female crew member is 37 divided by
72. That’s 0.5138 and so on, which
correct to three decimal places is 0.514.
We’re now going to move on to the
second part of this question.
The second part of this question
says, find the probability that first contact was made in season one and by a female
crew member. Give your answer to three decimal
places.
This time, not only are we
interested in first contact being made by a female crew member, but this must occur
in season one. If F is the event that the crew
member is female and S one is the event that first contact was made in season one,
we want to find the probability of S one intersection F. Remember, that just means S one and
F. So let’s begin by finding the
number of first contacts made in season one by a female crew member. Well, that’s 16. The total number of first contacts
made is still 72. So the probability the first
contact is made in season one and by a female crew member is 16 divided by 72. That’s 0.2 recurring, which is
0.222 correct to three decimal places.
Let’s now have a look at the third
part of this question.
Given that first contact was made
with an alien species chosen at random, in season one, find the probability that
first contact was made by a female crew member. Give your answer to three decimal
places.
This phrase “given that” is really
useful because it tells us that we’re working with conditional probability. And we can narrow down our data
set. We’re told the first contact was
made with an alien species chosen at random from season one. And so we narrow the data down to
just the results from season one. We use this vertical line to
represent “given that.” And we see that we want to find the
probability that first contact was made by a female crew member given that it
happened in season one.
Well, in season one, 16 first contacts
were made by female crew members. That’s out of a total of 28. So the probability that this
happens then is 16 divided by 28, which is 0.5714 and so on. Correct to three decimal places,
that’s 0.571.
We’ll now consider the fourth and
final part of this question.
Are the events S one, that first
contact was made in season one, and F, that it was made by a female crew member, independent?
Remember, two events are
independent if one occurring doesn’t affect the probability of the other
occurring. And whilst we could probably try
and use a bit of common sense, there are some formulae we can use. The first is for two events A and
B. And this says that if these events
are independent, then the probability of A intersection B is equal to the
probability of A times the probability of B. In other words, the probability of
A and B will be equal to the product of their two respective probabilities.
Now, alternatively, we can say that
if two events A and B are independent, then the probability of A occurring given
that B has occurred must be equal to the probability of A occurring. And so we can say that if our
events are independent, then the probability that the crew member is female given
that first contact is made in season one will be equal to the probability that they
are female. So let’s see if this is true.
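Written out in symbols, the two tests for independence described above are as follows. The condition that the probability of B is greater than zero is added here so that the conditional probability is defined; it is not stated in the video.

```latex
\[
P(A \cap B) = P(A)\,P(B)
\qquad\text{or equivalently}\qquad
P(A \mid B) = P(A), \quad P(B) > 0 .
\]
```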
We worked out that the probability
that they’re female given that first contact was made in season one is 0.571. And we worked out the probability
of them being female in general was 0.514. Well, these are not equal. And so we can say no, these events
are not independent. And similarly, we could’ve used the
alternative formula. We calculated that the probability that
they were female and that first contact was made in
season one was 0.222.
We calculated the probability of
them being female to be 0.514. And we could calculate the
probability that first contact was made in season one. It would be 28 divided by 72;
that’s 0.389. Now, in fact, when we find the
product of 𝑃 of F and 𝑃 of S one, we get 0.200 correct to three decimal places. That’s not equal to 0.222. So that’s an alternative way we
could show that these events are not independent.
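Both independence checks can be run side by side in a short Python sketch. Exact fractions are used here so that rounding cannot affect the comparison; the variable names are illustrative and the counts are those quoted in the video.

```python
from fractions import Fraction

total = 72              # all first contacts made by starship Zeta
female = 37             # first contacts made by female crew members
season_one = 28         # first contacts made in season one
female_season_one = 16  # made in season one by a female crew member

p_f = Fraction(female, total)                           # P(F)
p_s1 = Fraction(season_one, total)                      # P(S1)
p_f_and_s1 = Fraction(female_season_one, total)         # P(S1 and F)
p_f_given_s1 = Fraction(female_season_one, season_one)  # P(F | S1)

# Test 1: does P(F | S1) equal P(F)?
print(p_f_given_s1 == p_f)             # False
# Test 2: does P(S1 and F) equal P(S1) * P(F)?
print(p_f_and_s1 == p_s1 * p_f)        # False

print(round(float(p_s1 * p_f), 3))     # 0.2
print(round(float(p_f_and_s1), 3))     # 0.222
```

Either test failing is enough to conclude that the events are not independent, and here both fail.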
In this video, we’ve learned that
in a two-way table, we organize the frequencies for the categories of two
categorical variables. We saw that we can calculate
conditional probabilities by reading directly from a two-way table. And the use of the phrase “given
that” is an indication that we can narrow our results down. Finally, we saw that for two events
A and B, we can determine whether they are independent by checking whether the probability of A given
B is equal to the probability of A. If this is not true, that’s an
indication that A and B are dependent events.