### Video Transcript

In this video we’re gonna look at some data and construct two-way frequency
tables. Then we’ll use them to calculate estimated or approximate probabilities of events
occurring. We’re also going to use the tables to help us to decide if the events are
independent of each other. Firstly though let’s just recap some notation and terminology that
we’ll be using in this topic.

A “sample space” is the set of all possible outcomes for an experiment or a random
trial. For example, here’s a six-sided dice or number cube. And each side contains one of the
numbers from one to six. And no two sides contain the same number. If we set up an experiment that
involves rolling it on the table and seeing which side is facing upwards when it stops rolling,
then there are six possible outcomes. If it’s a fair dice, then each of the outcomes is equally
likely. It could land with a one facing up or a two or a three or a four or a five or a six. But
that’s rather a tedious way of writing it all down, so we abbreviate it a bit and use set notation
to help us to record the sample space efficiently. So we list all the possible outcomes: one, two, three, four, five, and six inside
these curly braces here. In fact we can give that set a name so it’s easy to refer to. Commonly people
call it 𝑆 for sample space or 𝑈 for universal set or even the Greek
letter 𝛺, but 𝑆 is my favorite name.
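As a small illustration of the set notation just described, here is a minimal Python sketch. The variable names `S` and `p_each` are chosen for this example only; the set simply mirrors the six outcomes listed in curly braces.

```python
from fractions import Fraction

# Sample space for one roll of a six-sided dice, written as a Python set.
# A set holds each outcome exactly once, just like the curly-brace notation.
S = set(range(1, 7))

print(S)       # the six outcomes, one through six
print(len(S))  # how many possible outcomes there are

# For a fair dice every outcome is equally likely,
# so each one has probability 1 / (number of outcomes).
p_each = Fraction(1, len(S))
print(p_each)
```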

Right let’s dive in and see how to construct a two-way frequency table and
we’ll learn a bit more terminology and notation as we go along. So what is a two-way frequency
table then? Well one way of describing it is as a way of writing down the sample space.
Rather than just writing down the set of possible outcomes, you write down the frequencies with
which they occur too. Another way is that it’s just a table that contains frequencies, so it’s a
table with numbers in it. And each number tells us how many times a certain thing has occurred — its
frequency. But more than that it’s a two-way table which means that we’re categorizing the
different things that have occurred in two different ways.
For example, here’s a table which tells us how many boys and girls there are in each grade at a particular school.

Now that two-way table contains more information than a sample space
presented using the set notation 𝑆 equals B nine, G nine, B ten, G ten et cetera for boys in
grade nine, girls in grade nine, boys in grade ten, girls in grade ten, and so on. So the school has got five hundred and seventy-nine students in total in grades nine to twelve. And from this
table we can see how many students there are in each grade, so a hundred and fifty-two in grade nine, a hundred and fifty-two in grade ten,
and so on. And we can also see how many boys and girls there are. So we’ve broken it down in
two ways: by gender and by grade. And the numbers in the table are frequencies. For example, there are seventy boys in
grade nine; there are sixty-eight girls in grade eleven. So that’s the number of times that grade nine boys occur in the dataset and
the number of times that grade eleven girls occur in the dataset. Now we can use the frequencies in the table as a proxy for probability. There
are two hundred and ninety-five girls out of the five hundred and seventy-nine students in total. So the proportion of the students who are girls
is two hundred and ninety-five five hundred and seventy-ninths. And if we picked a student from the school completely at random, then the
probability of them being a girl must be two hundred and ninety-five five hundred and seventy-ninths. So that’s the number of possible ways that your random selection could be a
girl, divided by the total number of students that there are to randomly select from. But what if we said we picked a grade eleven student at random rather than any
student from the whole school; well now we’ve narrowed the field down a bit, just to grade eleven
students.
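The whole-school calculation can be sketched in a few lines of Python. This uses only the two totals quoted in the transcript (295 girls, 579 students); the variable names are made up for the example.

```python
from fractions import Fraction

# Totals quoted in the transcript for the first school.
total_students = 579
total_girls = 295

# Probability that a student picked at random from the whole school is a girl:
# favourable cases divided by all cases.
p_girl = Fraction(total_girls, total_students)

print(p_girl)         # exact fraction
print(float(p_girl))  # decimal approximation
```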

Now we know that we’re choosing from these hundred and thirty-three grade eleven students rather than
the whole five hundred and seventy-nine in the whole school. And within that subset of the sample space, there are only sixty-eight girls. So the
probability of selecting a girl given that we now know that we’re choosing a grade eleven student
is
sixty-eight out of one hundred and thirty-three.
This idea of setting up a precondition and then working out a probability is
called “conditional probability.” The precondition was the student has been picked from grade eleven
so that cuts down the pool of students that we’re choosing from. We talk about conditions like
this using the phrase “given that.” And there’s some special notation to make it quicker to write
down as well. The probability that we choose a girl — and this vertical line means
given that we chose somebody from grade eleven — is sixty-eight out of a hundred and thirty-three. So this vertical line
means given that.
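Written out in symbols, the notation just described might look like this. The event names are spelled out in words here purely for illustration.

```latex
P(\text{girl} \mid \text{grade 11})
  = \frac{\text{number of grade 11 girls}}{\text{number of grade 11 students}}
  = \frac{68}{133}
```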

Okay now you try a question. Listen to the question, press pause, and then try to
answer it. Here’s the table. We select a boy at random from the school. What’s the probability
that they’re in grade nine? Okay let’s write the question down first. So we’re trying to find the probability that it’s a grade nine student given
that (that’s this vertical line) we know that we’re dealing with a boy. Now if we said given that it’s a boy, we know we’re dealing with one of these
cases circled here. And that means that there are seventy grade nine students out of a total of
two hundred and eighty-four boys. So the proportion of boys that are grade nine students is seventy two hundred and eighty-fourths.
And we can simplify that by cancelling the fraction if we want to, but we don’t have to; either of those answers would be correct.
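To see the cancelled-down form of that answer, here is a short sketch using Python's `fractions` module, which reduces fractions to lowest terms automatically. The variable names are illustrative only.

```python
from fractions import Fraction

# Counts quoted in the transcript.
grade9_boys = 70
total_boys = 284

# P(grade nine | boy): restrict attention to the boys,
# then count how many of them are in grade nine.
p_grade9_given_boy = Fraction(grade9_boys, total_boys)

print(p_grade9_given_boy)  # shown in lowest terms
```

Both 70/284 and the reduced fraction describe the same probability, which is why either answer is acceptable.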

Right now we know what a two-way table is, how to construct one, how to use it
to calculate probabilities and conditional probabilities, and we’ve learned a bit of notation
for conditional probability. Let’s move on to talk about dependent and independent events and
see if we can spot some in a two-way frequency table. So two events are said to be independent if the probability of either one of
them occurring is unaffected by whether or not the other one has occurred. For example, if we
roll a dice and flip a coin, then it doesn’t matter what happened to the dice. It’s always equally likely
that the coin will land head side up or tail side up. Also no matter how the coin lands, the probabilities of getting each number on
the dice will be the same; the two are independent. But if the probability of one event happening is different depending on
whether or not another one has occurred, then the events are said to be dependent. For example, let’s say a bag contains one chocolate and one strawberry sweet.
First, one person picks a sweet at random from the bag and eats it. Then another person takes
and eats the remaining sweet. Now the probability that the second person will pick chocolate depends very
much on which sweet the first person took. If the first one took strawberry, then the second
must get chocolate. But if the first one took chocolate, then the second cannot get chocolate.

Now we can use our ideas and notation for conditional probability to describe
dependent and independent events. Let’s say we’ve got two events: 𝐴 and 𝐵. Maybe event 𝐴 is a fair coin landing
head side up when we flip it and event 𝐵 is a fair dice landing with a multiple of three
facing upwards when we roll it. So the probability of event 𝐴 occurring: well it’s a fair coin; that means
heads or tails are equally likely to come up. So the probability of 𝐴 is a
half. The probability of 𝐵 occurring: well, which numbers on a dice are multiples of three? One and
two aren’t, three is, four and five aren’t, but six is. So there are two ways of getting a multiple of three out of
six or we could simplify that down to a third. So 𝐴 and 𝐵 are independent if the probability that 𝐴 occurs is the same
whether or not 𝐵 occurs and the probability that 𝐵 occurs is the same whether or not 𝐴 occurs. So we can write this as 𝐴 and 𝐵 are independent if the probability of 𝐴
is equal to the probability of 𝐴 given 𝐵. So if 𝐵 has happened, the probability of 𝐴 is
exactly the same as if 𝐵 hadn’t happened. And the probability of 𝐵 is equal to the probability of 𝐵 given
𝐴. So in other words the probability of 𝐵 is not affected by whether or not 𝐴 has
occurred. Well we know that the probability of 𝐴 occurring is a half. And
clearly that doesn’t change if we roll a dice as well regardless of how the dice lands. So the
probability of 𝐴 given 𝐵 is also a half. And we know that the probability that 𝐵 occurs is a third. And
clearly that doesn’t change if we flip a coin alongside it regardless of how that lands. So
the probability of 𝐵 given 𝐴 is also a third. So these things are true and therefore 𝐴 and 𝐵 are independent.
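One way to check these four values by counting, rather than by reasoning about coins and dice, is to list every (coin, dice) pair and count. The sketch below assumes a fair coin and a fair dice, so all twelve pairs are equally likely; the helper `prob` and the event functions are hypothetical names for this example.

```python
from fractions import Fraction
from itertools import product

# Joint sample space: every (coin, dice) pair, assumed equally likely.
outcomes = list(product(["H", "T"], range(1, 7)))

def is_heads(outcome):
    """Event A: the coin lands head side up."""
    return outcome[0] == "H"

def is_multiple_of_three(outcome):
    """Event B: the dice shows a multiple of three."""
    return outcome[1] % 3 == 0

def prob(event, given=None):
    """P(event), or P(event | given), found by counting outcomes."""
    pool = [o for o in outcomes if given is None or given(o)]
    favourable = [o for o in pool if event(o)]
    return Fraction(len(favourable), len(pool))

print(prob(is_heads), prob(is_heads, given=is_multiple_of_three))
print(prob(is_multiple_of_three), prob(is_multiple_of_three, given=is_heads))
```

Each printed pair shows the plain probability next to the conditional one, so matching values on a line indicate that the condition made no difference.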

Okay we’ve sort of cheated there a bit because we used the fact that we knew they
were independent to work out the probability of 𝐴 given 𝐵 and 𝐵 given 𝐴. But let’s look at
another example. Let’s look at a two-way frequency table for a different school. And let’s define event 𝐴 as choosing a student at random and it being a girl
and event 𝐵 as choosing a student at random and it being a grade ten student. Well there are four hundred and fifty students in total and two hundred and seventy of them are girls. So the probability of event 𝐴 occurring is two hundred and seventy out of four hundred and fifty, which
cancels down to three-fifths. And the probability of getting a grade ten student if we just choose a student
at random, well there are a hundred and twenty-five grade ten students out of the four hundred and fifty. So the probability of event 𝐵 occurring is a hundred and twenty-five out of four hundred and fifty, which
simplifies down to five-eighteenths.

Now what’s the probability of 𝐴 given 𝐵? What’s the probability that we get a
girl given that it’s a grade ten student? So if that’s the case, if we know it’s a grade ten
student, we’re limiting ourselves to this row of the table. And in this case how many of those are
girls? Well there are seventy-five out of a hundred and twenty-five students who are girls and that simplifies to three over five. So the probability of event 𝐴 occurring was three-fifths and the
probability of 𝐴 given 𝐵 occurring was also three-fifths.

Okay let’s work out what the probability of 𝐵 given 𝐴 is. So what’s the probability that it’s a grade ten student given that it’s a girl? Well we’ve now
restricted ourselves to this column here. And in this column seventy-five out of the two hundred and seventy students are grade
ten and that simplifies to five-eighteenths. So we said down here that the probability of getting a grade ten student
was five-eighteenths. And we’re now saying that the probability of getting a grade
ten student given that we know we’ve picked a girl is also five-eighteenths. So the probability of 𝐴 is equal to the probability of 𝐴 given 𝐵
and the probability of 𝐵 is equal to the probability of 𝐵 given 𝐴. In other words,
the effect of the other event occurring is nil; it doesn’t have any effect. That’s the definition of independence.
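The same check can be written as a small function that works from four counts in a two-way table. This is a sketch: `is_independent` and its parameter names are invented for illustration, and it uses exact fractions so that equal probabilities compare as equal.

```python
from fractions import Fraction

def is_independent(count_a, count_b, count_both, total):
    """Return True if P(A) == P(A | B) and P(B) == P(B | A)."""
    p_a = Fraction(count_a, total)
    p_b = Fraction(count_b, total)
    p_a_given_b = Fraction(count_both, count_b)  # restrict to B, count A
    p_b_given_a = Fraction(count_both, count_a)  # restrict to A, count B
    return p_a == p_a_given_b and p_b == p_b_given_a

# Counts quoted for the second school: A = girl, B = grade ten student.
print(is_independent(count_a=270, count_b=125, count_both=75, total=450))
```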

Right let’s look at one final example where the events turn out to be
dependent. In a class of students we asked everyone if they preferred math or English lessons.
And we also asked them if they would rather have a cat or a dog if they were to have a pet.
These are the results in a two-way frequency table. Are the students’ preferences for math
independent of their preferences for cats? That’s the question. So what we need to do is to find out the probability that they prefer math
and the probability that they prefer cats and then work out if the probability they prefer
math is the same as the probability that they prefer math given that they prefer cats and work
out if the probability that they prefer cats is the same as the probability that they prefer
cats given that they prefer math. Okay let’s have a look at that. Sixteen out of the thirty-two students prefer math. And that simplifies to a half. Now if we know that they prefer cats — that’s
this row here — what’s the probability that they prefer math? Well ten out of those
seventeen prefer math. So those two things are not equal. So it doesn’t look like they’re independent.
Let’s do the other side as well. So the probability that they prefer cats, well there are seventeen out of the
thirty-two who prefer cats and that doesn’t simplify. And given that they prefer math, we know we’re in
this column. How many of them, what proportion, prefer cats? Ten out of those sixteen
prefer cats and that simplifies to five-eighths. So again we can see that the
probability of preferring a cat is not the same as the probability of preferring a cat if you
know that they definitely prefer math. So because those conditions aren’t satisfied, those two events are not
independent.
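Here is a sketch of that last example as code. Only four numbers are quoted in the transcript (32 students, 16 who prefer math, 17 who prefer cats, 10 who prefer both); the other three cells are worked out by subtraction, and the dictionary layout is an assumption made for this example.

```python
from fractions import Fraction

# Quoted in the transcript.
total, math_total, cats_total, math_and_cats = 32, 16, 17, 10

# Rows are pets, columns are subjects; remaining cells follow by subtraction.
table = {
    "cat": {"math": math_and_cats, "english": cats_total - math_and_cats},
    "dog": {"math": math_total - math_and_cats},
}
table["dog"]["english"] = total - cats_total - table["dog"]["math"]

p_math = Fraction(math_total, total)
p_math_given_cat = Fraction(table["cat"]["math"], sum(table["cat"].values()))
p_cat = Fraction(cats_total, total)
p_cat_given_math = Fraction(
    table["cat"]["math"], table["cat"]["math"] + table["dog"]["math"]
)

print(table)
print(p_math, p_math_given_cat)  # these differ, so the events are dependent
print(p_cat, p_cat_given_math)   # these differ too
```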

So let’s summarize what we think we’ve learned then. A two-way frequency table
summarizes the number of cases in two ways: so here we’ve broken it down by boys and girls and by
which grade they’re in. We can use the totals in the frequency table to work out the proportion of
cases that match specified criteria and that proportion can be used as a measure of
probability. And in this case we had two hundred and eighty-four boys in total out of five hundred and seventy-nine students. So
the probability of selecting a boy at random from that population is two hundred and eighty-four over five hundred and seventy-nine.
And we can look at single rows or columns or even groups of rows or columns
to work out conditional probabilities from two-way frequency tables. For example, the probability that the student is grade nine given that they’re a
girl: if we look at the column for girls, we know there are two hundred and ninety-five girls in total and eighty-two of those
girls are grade nine. So the probability that it’s a grade nine student given that we know it’s
a girl is eighty-two out of two hundred and ninety-five.

And finally we can say that events 𝐴 and 𝐵 are independent if the probability
of 𝐴 occurring is the same as the probability of 𝐴 occurring given that 𝐵 has occurred. And the
probability of 𝐵 occurring is the same as the probability of 𝐵 occurring if 𝐴 has occurred. In
other words knowing the outcome of one event does not affect the probability of the other.
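In symbols, that summary of independence can be written as follows, assuming both events have nonzero probability so that the conditional probabilities are defined.

```latex
A \text{ and } B \text{ are independent}
  \iff P(A) = P(A \mid B) \ \text{ and } \ P(B) = P(B \mid A)
```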