Video Transcript
In this video, we will learn how to
determine whether a sample is biased or unbiased. We will begin by defining these two
terms.
A biased sample is one in which
some members of the population have a higher or lower sampling probability than
others. This happens, for example, when we select
based on age, gender, or interests. An unbiased or fair sample must,
therefore, be representative of the overall population being studied. Limited time and resources often mean that
we can’t ask every single member of the population. In order to ensure that our sample
is a fair reflection, we must ensure it is unbiased. Every member of the population must
have an equal chance of being selected. When talking about population, we
don’t necessarily mean every person in a country or in the world. We could be talking about the
population of a school or a sports club.
For our sample to be unbiased, each
of these people has to have an equal chance of being chosen. We will now look at some questions
involving biased and unbiased sampling. In each case, we need to ask
ourselves the question, does each person in the population have an equal chance of
being chosen? If the answer is yes, our sample is
unbiased. However, if the answer is no, we’re
dealing with a biased sample.
Jennifer is doing a research
project on whether or not students in her school eat healthy food. She decides to interview her
friends who do gymnastics with her. Is her sample biased?
In order to work out if any
sample is biased, we need to ask ourselves one question. The question is this: does
every member of the population have an equal chance of being selected? In this question, the students
in Jennifer’s school are the population. She’s researching whether they
eat healthy food or not. In Jennifer’s sample, she’s
only selecting friends from gymnastics. This means that any students in
the school who do not do gymnastics cannot be selected.
The answer to the question
“Does every member of the population have an equal chance of being selected?” is
therefore no. If this is the case, we know
that the sample is biased or unfair. The correct overall answer is,
therefore, yes, Jennifer’s sample is biased. This is because the only
students that can be selected in her sample are those that do gymnastics. It is also quite possible —
although not certain — that many of the students who do gymnastics eat healthy
food. This means that the sample that
she has chosen could skew the results of her research project. The results could potentially suggest
that more students eat healthy food than is really the case.
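To see how a sample like Jennifer’s could skew the results, here is a minimal Python sketch. The numbers are entirely made up for illustration (they are not from the video): we assume a school of 200 students, 40 of whom are gymnasts, and we assume gymnasts are more likely to eat healthily. The function name `healthy_share` is also just an illustrative choice.

```python
def healthy_share(students):
    """Return the fraction of the given students who eat healthy food."""
    return sum(s["eats_healthy"] for s in students) / len(students)

# Hypothetical school, with invented numbers purely for illustration:
# 40 gymnasts (30 eat healthily) and 160 other students (64 eat healthily).
school = (
    [{"gymnast": True, "eats_healthy": True}] * 30
    + [{"gymnast": True, "eats_healthy": False}] * 10
    + [{"gymnast": False, "eats_healthy": True}] * 64
    + [{"gymnast": False, "eats_healthy": False}] * 96
)

# Jennifer's approach: only students who do gymnastics can be selected.
gymnasts_only = [s for s in school if s["gymnast"]]

print(healthy_share(school))         # share across the whole population
print(healthy_share(gymnasts_only))  # share among gymnasts only
```

Under these assumed numbers, the gymnast-only share is noticeably higher than the share for the whole school, which is the kind of skew a biased sample can produce.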
In our next question, we need to
select the unbiased sample.
A school principal wants to
find out what the students think about the teaching quality in the school. Which of these samples is
unbiased? Is it (A) all ninth-grade
students are interviewed, (B) a list of female students
to interview is randomly generated, (C) a list of male students to
interview is randomly generated, (D) a list of students to
interview is randomly generated, or (E) a questionnaire is
available at the library for anyone who wants to take part in the survey?
In order to decide whether a
sample is biased or unbiased, we need to ask ourselves one question, does every
member of the population have an equal chance of being selected? In this question, the students
in the school are the population. If each of these has an equal
chance of being selected, we can say, yes, the sample is unbiased. If the answer to the question
is no, then the sample is biased. A biased sample would mean that
some students have a greater chance of being selected than others. Let’s now look at our five
options.
In option (A), all ninth-grade
students are being interviewed. This means that no students in
any other grade will be interviewed. This form of
sampling is therefore biased, as not every member of the population has an equal
chance of being selected. Options (B), (C), and (D) all
talk about randomly generated lists. This suggests that they could
be unbiased as each student could have an equal chance of being selected. However, option (B) is just a
list of female students. As no male students can be
selected in this sample, this is a biased sample. The same is true of option
(C). This time, we’re selecting only
male students. So, this too is biased.
Option (D), on the other hand,
is an unbiased sample. A list of the students is being
randomly generated from the whole population. This could be done using a
random number generator or a raffle. As long as the list is randomly
generated, the sample is unbiased. Option (E) involves leaving a
questionnaire at the library for anyone who wants to take part. The fact that anyone can take
part suggests it could be unbiased. However, as it is left in the
library, not every student will have an equal chance of being in the sample. There is also an element of
choice here, which also indicates that the sample is biased. The correct answer is option
(D). Randomly generating a list of students to
interview will create an unbiased sample for the principal.
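A randomly generated list like the one in option (D) could be produced with a random number generator. The following is a minimal Python sketch, not the principal’s actual method. It assumes the school roster is available as a list of student IDs; the roster of 500 IDs, the sample size of 30, and the function name `simple_random_sample` are all hypothetical.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # the seed is optional; it only makes a run repeatable
    return rng.sample(list(population), sample_size)

# Hypothetical roster: student IDs 1 to 500 stand in for the whole school.
roster = range(1, 501)

# Draw 30 students to interview from the whole population.
interview_list = simple_random_sample(roster, 30, seed=1)
print(sorted(interview_list))
```

Because `random.sample` draws without replacement from the full roster, no student appears twice and no group of students is excluded in advance.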
In our third question, we’ll look
at what we mean by a representative sample.
A student wants to research the
amount of pocket money students in his middle school receive. Which of the following would be
the best way to get a representative sample of the population? Is it (A) asking all the
students in the library on a Monday lunchtime how much pocket money they
receive, (B) asking a random sample of
50 students from his grade how much pocket money they receive, (C) asking the teachers of each
class how much pocket money they think the students in their class receive, or (D) asking a random sample
of 20 students from each grade how much pocket money they receive?
As we’re trying to get a
representative sample, we want our sample to be unbiased. In order to do this, we ask
ourselves a question, does each member of the population have an equal chance of
being selected? If our answer to this question
is yes, the sample is unbiased. In this question, the
population is the students in the middle school. Each of these students needs to
have an equal chance of being selected for the sample to be unbiased. The best
representative sample is the one that comes closest to this.
In option (A), all the students
in the library on a Monday lunchtime are being asked. This is not very representative
of the whole school, as the students will have to be in the library on Monday
lunchtime. If they’re not there at this
time, they will not be in the sample. So, we can rule out option
(A). Option (B) talks about a random
sample which suggests that it could be representative of the whole
population. However, these students are
only selected from the student’s own grade. This means that any student in
a different grade cannot be selected. We can, therefore, say that the
sample is biased and will not be a good representation of the
population.
It is pretty obvious that
option (C) will not get a good representative sample as we’re not asking the
students but the teachers. They are also being asked for
their opinion as opposed to the actual money that the students receive. An opinion can be skewed by
people’s perceptions and is, therefore, biased. Option (D), like option (B),
talks about a random sample. The key here is that we are
selecting students from each grade. This means that it will give a
good representation of the whole population. Students in each grade will
have an equal chance of being selected. Therefore, this is the best way
to get a representative sample of the students.
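Option (D), a random sample of 20 students from each grade, could be sketched in Python as follows. This is only an illustration under stated assumptions: the three grades, their sizes, the ID format, and the function name `sample_per_grade` are all hypothetical and do not come from the question.

```python
import random

def sample_per_grade(students_by_grade, per_grade, seed=None):
    """Randomly pick per_grade students from every grade."""
    rng = random.Random(seed)  # optional seed, used only for repeatable runs
    return {
        grade: rng.sample(students, per_grade)
        for grade, students in students_by_grade.items()
    }

# Hypothetical middle school with three grades of different sizes.
students_by_grade = {
    6: [f"g6-{i}" for i in range(120)],
    7: [f"g7-{i}" for i in range(100)],
    8: [f"g8-{i}" for i in range(90)],
}

chosen = sample_per_grade(students_by_grade, 20, seed=7)
for grade, students in chosen.items():
    print(grade, len(students))
```

In this sketch, every student within a grade has the same chance of being picked, and every grade is guaranteed to appear in the sample. Note that if the grades differ in size, as they do in this made-up school, a student in a smaller grade is slightly more likely to be chosen than a student in a larger one.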
We’ll now look at two further
questions in different scenarios.
Which of the following is a
representative sample? (A) To find out how
students travel to school, student representatives from each grade ask a random
sample of 20 students from across their grade. (B) A hospital wants to
investigate the reasons why people go to the emergency room, so questionnaires
are handed out to a random sample of people waiting in the emergency room on a
Monday morning. (C) A market research company
wants to find out how much waste people recycle, so they survey 100 people at
the city recycling drop-off location. (D) A student wants to find out
how much students at their school enjoy math classes, so they give a
questionnaire to everyone at the math club.
A representative sample is a
subset of the population that seeks to accurately reflect the characteristics of
the larger group. This means that, where
possible, it needs to be unbiased, such that each member of the population has
an equal chance of being selected. In option (A), the population
is the students in the school. As they are selecting a random
sample of 20 students from each grade, this will represent the whole school. Option (A) is, therefore, a
representative sample of the whole school as students are not selected based on
gender, age, or interest.
In option (B), the population
is the people visiting the hospital’s emergency room. Whilst they are asking people
in the emergency room, they’re only asking on Monday mornings. This means that the sample is
not representative, as people visiting the emergency room at any other time
cannot be selected. In option (C), the market
research company wants to look at the whole population to see how much they
recycle. As they are only serving people
at the city recycling drop-off location, the survey is biased. The results would be skewed as
these people are more likely to recycle more waste than the general public. This means that option (C) is
not a representative sample.
In option (D), the population
is the students in the school. As they are only asking
students at math club, the questionnaire will be biased. Once again, this is not a
representative sample, as those students at math club are more likely to enjoy
math classes. Once again, this will skew the
results. The correct answer is option
(A).
A doctor wants to find out
about some possible side effects of a common drug they have prescribed. Which of these samples is
unbiased? Is it (A) sending a survey to a
select group of patients, (B) interviewing patients who
suffer from side effects of the drug, (C) interviewing all the
patients who come for an appointment on a Saturday, (D) interviewing, at random, patients who
come for an appointment during the week, or (E) randomly generating a list of
patients to interview by phone from the patient registry?
In order to identify whether a
sample is unbiased or biased, we need to ask ourselves a question, does each
member of the population have an equal chance of being selected? If the answer to this question
is yes, then our sample is unbiased. In this question, the
population is the patients that have been prescribed the drug. We want each of these patients
to have an equal chance of being selected. Option (A) does not satisfy
this criterion, as only a select group of patients has been surveyed. Option (B) is also incorrect,
as this time we’re interviewing patients who have suffered from side effects of
a drug. This means that our results
will be skewed. Each member of the population,
those who have and those who haven’t suffered side effects, must be able to be
selected.
Options (C) and (D) are
incorrect as we’re not asking the correct members of the population. In option (C), we’re asking all
the patients who came for an appointment on a Saturday. Many of these might not have
been prescribed the drug. Also using this sample, we’re
not interviewing any patients from any other day. Option (D) has a similar
problem to option (C) in that we have no way of knowing if these patients have
been prescribed the drug.
Option (E), on the other hand,
is the correct answer. We are generating a list at
random from the patient registry. This will include all the
patients that have been prescribed the drug by the doctor. Each member of the population
has an equal chance of being selected.
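One way to picture option (E) is to take the patient registry, keep only the patients who were prescribed the drug, and then draw names at random. The Python sketch below is an assumption about how this might be done, not the doctor’s actual procedure. The registry layout, the `prescribed_drug` field, the registry of 300 records, and the function name `sample_prescribed_patients` are all hypothetical.

```python
import random

def sample_prescribed_patients(registry, sample_size, seed=None):
    """Randomly pick patients from those who were prescribed the drug."""
    # The population is only the patients who were prescribed the drug.
    eligible = [p for p in registry if p["prescribed_drug"]]
    rng = random.Random(seed)  # optional seed, used only for repeatable runs
    # Never ask for more patients than are eligible.
    return rng.sample(eligible, min(sample_size, len(eligible)))

# Hypothetical registry: 300 patients, every third one prescribed the drug.
registry = [
    {"patient_id": i, "prescribed_drug": i % 3 == 0}
    for i in range(300)
]

call_list = sample_prescribed_patients(registry, 25, seed=3)
print(len(call_list))
```

Filtering first keeps the sample inside the population being studied, and the random draw then gives each of those patients the same chance of being phoned, whether or not they experienced side effects.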
We will now summarize the key
points from this video. We began this video by defining
what we mean by a biased and unbiased sample. In a biased sample, one or more
parts of the population are favored over others, whereas in an unbiased sample, each
member of the population has an equal chance of being selected. We also saw that a representative
sample is a subset of the population that reflects the characteristics of the larger
group. In order for our sample to be fair
and our results accurate, we want an unbiased and representative sample.