In this explainer, we will learn how to calculate conditional probabilities using joint frequencies presented in two-way tables.

When collecting data on nonnumerical variables, we count how many times a particular characteristic occurs. We can then put our results in a table.
For example, a fanzine site for the TV
show *AMaze in Space* collects data on the number of new alien species encountered each season. The data for seasons 1, 2, and 7
is shown in the table below.

Our population here is “new alien species” and the variable is “season,” which is a categorical (i.e., nonnumerical) variable. In this data set, the variable has 3 categories: season 1, season 2, and season 7. We count the number of new species encountered in each season.

We can delve deeper into the data by splitting it according to which of the two Starships Zeta and Geoda the new species were encountered by. So, the data varies not only across the three seasons, but also with respect to the two starships, as shown in the table below.

Our data is now displayed in a “two-way table” (which is sometimes also called a “contingency table”). The “two” in the two-way table refers to the two variables (which, in our case, are “season” and “starship”). Looking at the table, we can see that, for example, in season 1, 28 new species were encountered by the Starship Geoda, but only 3 by the Starship Zeta.

### Example 1: Calculating Probabilities Using a Two-Way Table

A fanzine website for the TV show *AMaze in Space* collects data on the number of new alien species encountered by two starships in
each season of the show. The data for seasons 1, 2, and 7 is shown in the table below, split by the two starships Zeta and Geoda.

Find the probability that a new alien species chosen at random was encountered by Starship Geoda. Give your answer to three decimal places.

### Answer

To find the probability that a new species chosen at random was encountered by Starship Geoda, we need to know how many of the new species were encountered by Starship Geoda and how many new species were encountered in total across the three seasons.

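The probability that a new species was encountered by Starship Geoda is then

$$P(\text{Geoda}) = \frac{\text{number of new species encountered by Geoda}}{\text{total number of new species encountered}}.$$

Hence, the probability that a new alien species was encountered by Starship Geoda is 0.867 to three decimal places.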

### Note

We can also say that, overall, there is approximately an 86.7% chance that a new species was encountered by Geoda.

We can also use the two-way table to examine the relationship between the two variables and to work out conditional probabilities. Let us look at an example of this.

### Example 2: Using Two-Way Tables to Examine Relationships between Categorical Variables and Calculate Conditional Probabilities

A fanzine website for the TV show *AMaze in Space* collects data on the number of new alien species encountered by two
starships in each season of the show. The data for seasons 1, 2, and 7 is shown in the table below, split by the two Starships Zeta and Geoda.

Given that a new alien species was encountered in season 7, find the probability that it was encountered by Starship Geoda. Give your answer to three decimal places.

### Answer

If we know that a new species was encountered in season 7, we only need to look at the total number of new species encountered in season 7 and work out the proportion of those that were encountered by Starship Geoda. So, we look only at the “season 7” column in our table.

Given that a new species appeared in season 7, the probability that it was encountered by Starship Geoda is then

$$P(\text{Geoda} \mid \text{season 7}) = \frac{\text{number of new species encountered by Geoda in season 7}}{\text{total number of new species encountered in season 7}}.$$

That is, if we know a new species appeared in season 7, there is approximately a 62% chance it was encountered by Starship Geoda. In this example, we have worked out the conditional probability that a new alien species chosen at random was encountered by Starship Geoda, given that it appeared in season 7.

In our next example, we will calculate conditional probabilities using a two-way table.

### Example 3: Conditional Probability from a Two-Way Table

The table below contains data from a survey of “core gamers” who were asked whether their preferred gaming platform is “smart phone,” “console,” or “PC.” The gamers are also split by gender.

- Find the probability that a core gamer chosen at random prefers using a console. Give your answer to three decimal places.
- Given that a core gamer prefers to play using a console, find the probability that they are male. Give your answer to three decimal places.

### Answer

Let us first work out the totals for the rows and columns of our table.

**Part 1**

To find the probability that a core gamer chosen at random prefers using a console, we find the number of gamers who prefer a console and divide by the total number of gamers.

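Let $C$ be the event that a gamer chosen at random prefers a console; then,

$$P(C) = \frac{\text{number of gamers who prefer a console}}{\text{total number of gamers}}.$$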

As a percentage, this is . Hence, approximately of gamers prefer to use a console.

**Part 2**

Given that a core gamer prefers to play using a console, we want to find the probability that they are male. Because we are now interested only in gamers who prefer a console, those who prefer to use smart phones or PCs do not figure in this calculation. So we only need to look at the “console” row in the table (highlighted in blue).

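Our conditional probability, $P(M \mid C)$, the probability that a gamer chosen at random is male ($M$) given that they prefer a console ($C$), is then

$$P(M \mid C) = \frac{\text{number of male gamers who prefer a console}}{\text{total number of gamers who prefer a console}}.$$

We can say that, given that a gamer chosen at random prefers a console, there is approximately a 62% chance that they are male.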

In our next example, we will calculate conditional probability using a two-way table and the conditional probability formula.

### Example 4: Comparing Conditional Probability from a Two-Way Table and Using the Conditional Probability Formula

Bassem and Samar are running for the presidency of the Students’ Union at their school. The votes they received from each of 3 classes are shown in the table.

| | Class A | Class B | Class C | Total |
|---|---|---|---|---|
| Bassem | 161 | 169 | 177 | 507 |
| Samar | 147 | 195 | 152 | 494 |

What is the probability that a student voted for Samar given that they are in class B?

### Answer

Given that the student was in class B, to find the probability that they voted for Samar, let us first work out the total number of students who voted in each class.

| | Class A | Class B | Class C | Total |
|---|---|---|---|---|
| Bassem | 161 | 169 | 177 | 507 |
| Samar | 147 | 195 | 152 | 494 |
| Total | 308 | 364 | 329 | 1 001 |

We can see that the number of students who voted in class B was 364 and that, of those 364, 195 voted for Samar.

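We know already that the student was in class B, so we are not concerned with the students in classes A and C. The conditional probability that a student voted for Samar given that they are in class B is therefore

$$P(S \mid B) = \frac{195}{364} \approx 0.536,$$

where $S$ is the event that a student voted for Samar and $B$ is the event that they are in class B.

Hence, we can say that, given a student from those who voted was in class B, there is approximately a 54% chance they voted for Samar (since $0.536 \times 100\% = 53.6\%$).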
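The same table lookup can be sketched in Python. This is only an illustration under stated assumptions, not part of the original worked answer: the dictionary `votes` and the helper `conditional_probability` are hypothetical names, and the counts are those given in the table above.

```python
# Vote counts from the table: votes[candidate][class_name].
votes = {
    "Bassem": {"A": 161, "B": 169, "C": 177},
    "Samar": {"A": 147, "B": 195, "C": 152},
}


def conditional_probability(table, row, column):
    """Return P(row | column): the cell count divided by the column total."""
    column_total = sum(counts[column] for counts in table.values())
    return table[row][column] / column_total


# Column total for class B (169 + 195).
class_b_total = sum(counts["B"] for counts in votes.values())
p_samar_given_b = conditional_probability(votes, "Samar", "B")

print(class_b_total)              # 364
print(round(p_samar_given_b, 3))  # 0.536
```

Only the class B column enters the calculation, which mirrors the way we ignored classes A and C when reading the table.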

Note that to work this out we could also have used the conditional probability formula

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

For the probability that a student voted for Samar given that they were in class B, $P(S \mid B)$, we need to know

- $P(S \cap B)$, that is, the probability that a student was in class B **and** voted for Samar;
- $P(B)$, the probability that a voting student chosen at random is in class B.

From our table, 195 of the 1 001 voting students were in class B **and** voted for Samar.

Hence,

$$P(S \cap B) = \frac{195}{1\,001}.$$

Next, $P(B)$ is given by the number of voting students in class B, out of the total number of voting students.

There are 364 voting students in class B and a total of 1 001 voting students. Hence,

$$P(B) = \frac{364}{1\,001}.$$

We can now work out the conditional probability:

$$P(S \mid B) = \frac{P(S \cap B)}{P(B)} = \frac{195/1\,001}{364/1\,001} = \frac{195}{364} \approx 0.536.$$

As you can see, two-way tables give us a useful way of working out conditional probabilities without using the formula. Although we could use the conditional probability formula, it is quicker and simpler to use the values directly from the table.
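To check that the two routes agree, here is a short Python sketch that uses exact fractions from the standard library's `fractions` module. The variable names are illustrative only; the three counts are taken from the table in this example.

```python
from fractions import Fraction

total_voters = 1001      # grand total from the table
class_b_voters = 364     # column total for class B
samar_and_class_b = 195  # table entry for Samar and class B

# Formula route: P(S | B) = P(S and B) / P(B).
p_s_and_b = Fraction(samar_and_class_b, total_voters)
p_b = Fraction(class_b_voters, total_voters)
via_formula = p_s_and_b / p_b

# Direct route: read the two counts straight from the class B column.
via_table = Fraction(samar_and_class_b, class_b_voters)

print(via_formula == via_table)      # True
print(via_formula)                   # 15/28
print(round(float(via_formula), 3))  # 0.536
```

The grand total of 1 001 appears in both the numerator and the denominator of the formula and cancels, which is why reading the counts from the table gives the same answer with less work.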

In our final example, we examine further how a two-way table can be used to analyze the relationship between two categorical variables.

### Example 5: Two-Way Tables, Conditional Probability, and the Relationship between Categorical Variables

Data is collected from the TV show *AMaze in Space* on the number of new alien species with which first contact is made. The data for Starship Zeta in seasons 1, 2, and 7 is shown in the table below. The data has also been categorized by whether the crew member who made first contact was male or female.

- From the table, find the probability that first contact was made with a new alien species by a female crew member. Give your answer to three decimal places.
- Find the probability that first contact was made in season 1 and by a female crew member. Give your answer to three decimal places.
- Given that first contact with an alien species chosen at random was made in season 1, find the probability that first contact was made by a female crew member. Give your answer to three decimal places.
- Are the events “S1 = first contact made in season 1” and “female” independent?

### Answer

**Part 1**

To find the probability that first contact was made with a new alien species by a female crew member, we need to know the total number of alien species with which first contact was made and how many of those first contacts were made by a female crew member.

The total number of first contacts made by female crew members was 37, and the total number of first contacts was 72. Therefore, letting “F” stand for “female,”

$$P(F) = \frac{37}{72} \approx 0.514.$$

The probability that a first contact was made by a female crew member is 0.514 to three decimal places, and we can say that there is approximately a 51% chance that a first contact was made by a female crew member.

**Part 2**

The easiest way to find the probability that a first contact was in season 1 (S1) and was by a female crew member (F) is by looking at the table entry for “season 1” and “female.”

As we can see, there were 16 female first contacts in season 1. Our required probability, which we can write as $P(\text{S1} \cap F)$, is therefore

$$P(\text{S1} \cap F) = \frac{16}{72} \approx 0.222.$$

There is approximately a 22% chance that a first contact was in season 1 and by a female crew member.
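The arithmetic in parts 1 and 2 can be checked with a few lines of Python. This sketch uses only the three counts quoted in the answer above (37, 16, and 72); the variable names are illustrative and not part of the original example.

```python
female_first_contacts = 37     # all first contacts made by female crew members
female_season_1_contacts = 16  # season 1 first contacts made by female crew members
total_first_contacts = 72      # all first contacts in the table

# Part 1: P(F), a row total divided by the grand total.
p_female = female_first_contacts / total_first_contacts

# Part 2: P(S1 and F), a single table entry divided by the grand total.
p_season_1_and_female = female_season_1_contacts / total_first_contacts

print(round(p_female, 3))               # 0.514
print(round(p_season_1_and_female, 3))  # 0.222
```

Both probabilities share the grand total as their denominator because neither one is conditional on a particular row or column.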

**Part 3**

Given that first contact was made in season 1, to find the probability that the crew member was female, we only need to look at the column for season 1 in the table. This is because we know already that the contact was made in season 1, so there is zero probability that it was made in either season 2 or season 7. We want to find the conditional probability $P(F \mid \text{S1})$.

We can simply read off the numbers from the table, so

$$P(F \mid \text{S1}) = \frac{16}{n(\text{S1})},$$

where $n(\text{S1})$ is the total number of first contacts in the season 1 column.

Given that a first contact was made in season 1, then there is approximately a chance that the first contact was made by a female crew member .

### Note

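To work this out, we could also have used the formula for conditional probability:

$$P(F \mid \text{S1}) = \frac{P(\text{S1} \cap F)}{P(\text{S1})}.$$

We know already from part 2 that $P(\text{S1} \cap F) = \frac{16}{72}$. And the probability that a first contact was made in season 1, $P(\text{S1})$, is the season 1 column total divided by 72.

Hence, the conditional probability we require is

$$P(F \mid \text{S1}) = \frac{16/72}{P(\text{S1})},$$

in which the common denominator 72 cancels to give the same value as reading the counts directly from the table.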

**Part 4**

For the events “S1 = first contact made in season 1” and “F = female” to be independent, it must be true that

$$P(F \mid \text{S1}) = P(F).$$

That is, $P(F \mid \text{S1})$ must be equal to $P(F)$. From our previous calculations, we know that $P(F) = \frac{37}{72} \approx 0.514$, and $P(F \mid \text{S1})$ is the value found in part 3.

Since $P(F \mid \text{S1}) \neq P(F)$, the events “S1 = first contact made in season 1” and “F = female” are **not** independent. This means that the season in which first contact was made has an effect on the probability that the first contact was made by a female crew member.

Let us now recap the key points and probability rules associated with conditional probability and two-way tables.

### Key Points

The “two” in “two-way table” refers to the fact that there are two variables under consideration.

- In a two-way table, we organize the counts, or frequencies, for the categories of two categorical variables.
- Values (or categories) of the row variable label the rows running across the table, and values (or categories) of the column variable label the columns running down the table.
- We use two-way tables to examine the relationship between two categorical variables. In particular, we can look at conditional probabilities.

Conditional probabilities can be read directly from two-way tables. The probability of event $A$ given that event $B$ has occurred, $P(A \mid B)$, is a fraction where the denominator is the total for event $B$ and the numerator is the number of occurrences of $A \cap B$:

$$P(A \mid B) = \frac{n(A \cap B)}{n(B)},$$

where $n(\cdot)$ denotes a count taken from the table.

- We can also use the conditional probability formula, $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, where $P(A \cap B)$ is the probability of both $A$ and $B$ occurring at the same time.
- We can use the probabilities worked out from a two-way table to determine whether or not the two categorical variables are independent. So, for example, if events $A$ and $B$ are independent, then $P(A \mid B) = P(A)$.
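As an illustration of the independence check, the following Python sketch applies it to the vote counts from Example 4. It is a sketch under stated assumptions rather than part of the original examples: the function name `is_independent` is hypothetical, and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

# Counts taken from the table in Example 4.
samar_and_class_b = 195  # voted for Samar and in class B
class_b_total = 364      # all voters in class B
samar_total = 494        # all votes for Samar
grand_total = 1001       # all voters


def is_independent(n_a_and_b, n_a, n_b, n_total):
    """Return True when P(A | B) equals P(A) exactly."""
    p_a_given_b = Fraction(n_a_and_b, n_b)
    p_a = Fraction(n_a, n_total)
    return p_a_given_b == p_a


p_samar_given_b = Fraction(samar_and_class_b, class_b_total)
p_samar = Fraction(samar_total, grand_total)

print(round(float(p_samar_given_b), 3))  # 0.536
print(round(float(p_samar), 3))          # 0.494
print(is_independent(samar_and_class_b, samar_total, class_b_total, grand_total))  # False
```

Here the two probabilities differ, so in this table the events “voted for Samar” and “is in class B” are not independent.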