Video Transcript
In this lesson, we’ll learn how to
choose a simple random sample from a population.
Statistics is the area of maths
concerned with the collection, organization, analysis, and presentation of data. For instance, we may want to
collect data to inform us of the number of students that prefer salad to soup to
help us determine a lunch menu. Or we may want to know the spending
habits of customers at a store to help us determine the price of items.
In these cases though, it can be
difficult to collect information from absolutely everyone. Thinking back to the idea of
determining a lunch menu in school, it might be difficult to ask every single
student about their preferred meals. And so we can get around this problem
by asking only a portion of the students and using the smaller group to gain an
idea of the preferences of the larger group. This smaller group is called a
sample, so let’s define that formally.
The entire set of objects that we
are looking to analyze is called the population. In our example of the lunch menu in
the school, the population was all the students in that school. Then, the smaller subset that we
actually analyze and use to determine the preferences of the larger group is called
the sample. Finally, the size of the sample
group is fairly intuitively called the sample size.
Now, before we look at any
examples, let’s consider a handful of pros and cons of sampling. Firstly, it’s much quicker to ask
the views of a smaller sample than the views of the entire population. For this reason, using a smaller
subset of a population can be cheaper.
However, we need to be careful. We may come to the wrong conclusion
from only looking at a small sample. Considering the example about the
lunch menu, if we asked 20 students and they all said they preferred soup, we must
consider that it could be possible that they were the only 20 students with that
opinion.
Secondly, we do need to be aware
that there could be bias in the sample group. For instance, suppose we only spoke
to the first 20 students in the lunch queue. It might seem like a fair way to
sample. However, what if the soup is made
at the start of the lunch break and gets colder over time, and the salad is always
fresh? In this case, the students who
prefer soup may want to queue up earlier, introducing bias to the sample. For these reasons, it’s worth
noting that the larger the sample size, the more likely the sample is to be an
accurate representation of the population. So, with this in mind, how do we
choose a sample?
Suppose our school contained 500
students. This is the population. Let’s also imagine our sample’s
going to contain 50 students. We could begin by listing all of
our students alphabetically by surname and then choosing every 10th student to give
us a sample size of 50. The name of this technique is
systematic sampling. With this, however, there is a
small potential for bias. That is, members of the same family who share
a surname sit next to each other in the list, so they could never all be chosen, since this method would skip over some of these
students.
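To make the idea of choosing every 10th student concrete, here is a minimal sketch in Python. The roster of placeholder names and the function name `systematic_sample` are made up for illustration; only the numbers 500, 50, and 10 come from the example above.

```python
def systematic_sample(items, step, start=0):
    """Return every `step`-th item, beginning at index `start`."""
    if step <= 0:
        raise ValueError("step must be positive")
    return items[start::step]

# Hypothetical roster: 500 placeholder names, assumed to be sorted by surname.
roster = [f"student_{i:03d}" for i in range(1, 501)]

# start=9 picks the 10th, 20th, ..., 500th names on the list.
sample = systematic_sample(roster, step=10, start=9)

print(len(sample))   # 50
print(sample[:3])    # ['student_010', 'student_020', 'student_030']
```

Notice that two neighbors on the list, such as the 10th and 11th names, can never both appear in the result, which is the source of the bias described above.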
So let’s find a way to choose them
randomly. One way to do this would be to
assign each student a unique number between one and 500 and then to use a random
number generator to choose 50 of these. By applying this technique, we
guarantee that every single member of the population has an equal chance of being
chosen. And so, of the methods we have seen, a random sample is the best
at removing bias from the selection.
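The numbering approach just described could be sketched in Python as follows. This is only an illustration: the function name `simple_random_sample` and the optional `seed` parameter are assumptions, and Python's built-in `random.sample` stands in for the random number generator.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick `sample_size` distinct student numbers from 1..population_size."""
    rng = random.Random(seed)  # pass a seed only if a repeatable draw is wanted
    student_numbers = range(1, population_size + 1)
    # random.sample draws without replacement, so no student is picked twice.
    return rng.sample(student_numbers, sample_size)

# 500 students in the school, 50 in the sample, as in the example above.
chosen = simple_random_sample(500, 50)
print(sorted(chosen))
```

Each run prints a different set of 50 student numbers unless a seed is supplied.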
This leads us to a formal
definition. A simple random sample is a sample
in which every member of the population has an equal chance of being chosen to be in
the sample. Another way of thinking about this
is that any two members must have an equal probability of being chosen to be in the
sample. And it must be possible for any
group not larger than the sample size to all be chosen.
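One way to see this definition in action is to draw many samples and count how often each member turns up. The sketch below does this in Python. The population of 10 members, the sample size of 3, the 20,000 trials, and the name `inclusion_frequencies` are all illustrative assumptions, not part of the lesson.

```python
import random
from collections import Counter

def inclusion_frequencies(population, sample_size, trials, seed=0):
    """Estimate how often each member ends up in a simple random sample."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        # Each trial draws one simple random sample without replacement.
        counts.update(rng.sample(population, sample_size))
    return {member: counts[member] / trials for member in population}

# Hypothetical small population so the output is easy to read.
population = list(range(1, 11))
freqs = inclusion_frequencies(population, sample_size=3, trials=20000)

for member, freq in freqs.items():
    print(member, round(freq, 3))
```

If the sampling is fair, every frequency should land close to 3 divided by 10, that is 0.3, with no member noticeably favored.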
Let’s look at an example of how to
determine why a given sampling method might not be a simple random sample.
Why does the statement “All the
clothing produced by a factory to measure the quality of that factory” not
describe a simple random sample? (A) Because a sample is always
larger than the parent population. Option (B) because a sample has
to be part of the whole population and not the population itself. Or is it (C) because this is a
sample but not a random sample?
Remember, a sample is a simple
random sample when every member of the population has an equal chance of being
chosen to be in that sample. Of course, when we select a
sample, we’re choosing a sample from a population. A population is every possible
member that could be selected. If we visualize that, we can
see that a sample must be a subset of the population. In this lesson, that means
the sample must be smaller than the population. Here, all the clothing produced by the factory
is the population itself, so nothing has actually been sampled. That gives us the correct
answer to our question. The answer is (B), because a
sample has to be part of the whole population and not the population itself.
Let’s now look at an example where
we will determine whether a given sampling method gives a simple random sample.
Suppose your school has 500
students and you need to conduct a short survey on the quality of the food
served in the cafeteria. You decide that a sample of 10
students would be sufficient for your purposes. So you choose 10 students by
assigning them each a number and then using the random button on your calculator
to choose 10 students randomly out of the 500 and conduct the survey on
them. Is that considered a simple
random sample?
Remember, a simple random
sample is a nonempty subset of the population where every member has an equal
chance of being in that subset. In this case, we want to choose
10 students from 500. So, at first glance, it does
look like we have a simple random sample. But let’s check that this
selection is indeed random.
In fact, the word “random” is
in the wording of the question. We’re told that each person is
assigned a number, and then the random number button on a calculator is
used. In this method, any two
students will have the same probability of being chosen. So the correct answer is
yes. This is indeed considered a
simple random sample.
In our next example, we’ll perform
some calculations to determine the percentage size of a sample.
A garden consists of 200
trees. We want to take a sample of 20
trees. Express the sample size chosen
as a percentage.
Remember, the sample size is
the number of members in the sample, the subset of the population. In this case, we’re taking a
sample of 20 trees, so the sample size is 20. The garden consists of 200
trees, so this is the total population size. So we can express the sample
size as a percentage of the population by dividing the sample size by the
population size and multiplying by 100. That’s 20 divided by 200, which is 0.1, multiplied by
100, which is equal to 10. So we’ve expressed the sample
size as a percentage. It’s 10 percent.
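The same calculation can be written as a short Python function. The name `sample_size_percent` is just an illustrative choice.

```python
def sample_size_percent(sample_size, population_size):
    """Express the sample size as a percentage of the population size."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    # Divide the sample size by the population size, then multiply by 100.
    return sample_size / population_size * 100

# 20 trees sampled from a garden of 200 trees.
percent = sample_size_percent(20, 200)
print(percent)  # 10.0
```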
In our final example, we’ll look at
how to determine which of a number of sampling methods is a simple random
sample.
An actor in a theater wants to
choose random people to go up on stage and participate in the play with him. Which choice is considered a
random sampling method? (A) He chooses those who are
taller than 190 centimeters. (B) He chooses women only. Option (C) he chooses those who
have seats with numbers that were picked from a bowl full of seat numbers. (D) He chooses a third of the
sample to be women and two-thirds to be men. Or option (E) he chooses those
who wear glasses.
Remember, in a random sampling
method, every member of the population has an equal chance of being chosen for
the sample. This instantly rules out option
(A). Anyone shorter than a height of
190 centimeters has a zero percent chance of being chosen. Similarly, if we look at option
(B), we can see that any male would have a zero percent chance of being in the
sample.
So what about our third
option? If we put every seat number in
a bowl and draw some at random, this should ensure that every seat number has an equal
chance of being chosen. So let’s double-check our final
two options.
In option (D), only a third of
the sample are women, whilst two-thirds are men. Because this split is fixed in advance rather than left to chance, men and women generally do not have an
equal chance of being chosen. Finally, if we look at option
(E), we see that anyone who doesn’t wear glasses has a zero percent chance of
being chosen.
So the correct answer is option
(C). A random sampling method is
choosing those who have seats with numbers that were picked from a bowl full of
seat numbers.
Let’s recap the key points from
this video. We learnt that the entire set of
people, elements, or objects that we are analyzing is called a population. We learnt that a sample is a
smaller, nonempty subset of the population. And the number of elements that are
in this subset is called the sample size. A simple random sample is one where
each member of the population has an equal chance of being chosen. We saw that a larger sample size is
likely to increase the accuracy of any results. But this might be at the expense of
more difficult and expensive data collection.
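The point about larger samples can be explored with a small simulation. This is only a sketch under made-up assumptions: a school of 500 students in which 300 prefer soup, sample sizes of 10 and 100, and 2,000 repeated draws. The function name `estimate_spread` is also hypothetical.

```python
import random
import statistics

def estimate_spread(population, sample_size, trials, seed=0):
    """Measure how much the sample proportion varies across repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        # Proportion of the sample that prefers soup (coded as 1).
        estimates.append(sum(sample) / sample_size)
    return statistics.pstdev(estimates)

# Hypothetical school: 1 means "prefers soup", 0 means "prefers salad".
population = [1] * 300 + [0] * 200

spread_small = estimate_spread(population, sample_size=10, trials=2000)
spread_large = estimate_spread(population, sample_size=100, trials=2000)

print(round(spread_small, 3), round(spread_large, 3))
```

The estimates from samples of 100 should vary much less from draw to draw than those from samples of 10, which is what we mean by a larger sample being more likely to represent the population accurately.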