Video Transcript
In this video, we will learn how to
decide whether to take a sample or use the whole population. We will begin by defining what we
mean by these terms when dealing with statistics. The study of statistics revolves
around the study of data sets. In this video, we will discuss two
important types of data sets, populations and samples. A population includes all of the
elements from a set of data. A sample, on the other hand,
consists of one or more observations drawn from the population.
Whilst we will not focus on them in
this video, there are many different ways of obtaining a sample: for example, random
sampling, systematic sampling, and stratified sampling. In this video, we will only be
looking at whether we should choose the whole population or a sample of the
population. A sample usually has fewer
observations than the population. We use a sample when constraints
make it impractical or impossible to study the whole population. The most common constraints are
time and money. However, there are other
constraints that could also impact our ability to study the whole population. We will now look at some specific
questions in context.
Which of the following data sets
would be suitable to check the education level in the poor villages in Africa? Is it (A) mass population or (B)
samples?
When deciding which data set to
use, we need to factor in any constraints. Two of the biggest constraints when
collecting data are time and money. In this particular question, we
need to ask ourselves whether it is possible to check the education level of every
child in the poor villages in Africa. If this were a practical method, we
could use the mass population. However, as it is not realistic to
visit every village in Africa, we need to choose samples.
We could choose a sample of
different villages and then a sample of children from each of the villages
chosen. This would be the most suitable way
to check the education level in the villages in Africa, so the correct answer is option (B), samples.
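The two-step approach just described, choosing some villages and then some children within each chosen village, can be sketched in code. This is only a minimal illustration: the village names, the child identifiers, the sample sizes, and the function name `two_stage_sample` are all made up for the example.

```python
import random

def two_stage_sample(villages, n_villages, n_children, seed=0):
    """Pick n_villages at random, then up to n_children from each chosen village."""
    rng = random.Random(seed)
    # Stage 1: choose which villages to visit.
    chosen = rng.sample(sorted(villages), n_villages)
    # Stage 2: choose children within each chosen village.
    return {
        name: rng.sample(villages[name], min(n_children, len(villages[name])))
        for name in chosen
    }

# Hypothetical data: 20 villages, each with 30 children.
villages = {
    f"village_{i}": [f"child_{i}_{j}" for j in range(30)] for i in range(20)
}

sample = two_stage_sample(villages, n_villages=4, n_children=5)
for name, children in sample.items():
    print(name, children)
```

Here only 4 of the 20 villages are visited and only 5 children are checked in each, which is where the saving in time and money comes from.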
Which of the following data sets is
suitable to calculate how many hospitals there are in a city? Is it (A) mass population or (B)
samples?
When deciding which type of data
set to choose, we need to consider any constraints. These include time and money, but
we also need to consider what the question asks us to find. In this question, we need to
calculate the number of hospitals in a city. This means that we want an exact
answer. As a result, taking a sample would
not be suitable: it could only give an estimate, as there could be more hospitals in some areas of the city than
in others. In order to calculate how many
hospitals are in a city, we would need to count each individual hospital. This means that we need to use the
whole population. The correct answer is therefore
option (A). The data set that is most suitable
is mass population.
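To see why a sample does not give an exact count, we can try a small sketch. The district counts below are invented for illustration, and `estimate_total` is a hypothetical helper that scales up the count from a few randomly chosen districts.

```python
import random

# Hypothetical city: number of hospitals in each of its 10 districts.
hospitals_per_district = [0, 0, 1, 0, 5, 0, 2, 0, 0, 1]

# Whole population: count every hospital to get the exact total.
exact_total = sum(hospitals_per_district)

def estimate_total(counts, n_districts, seed):
    """Scale up the hospital count from a random sample of districts."""
    rng = random.Random(seed)
    sampled = rng.sample(counts, n_districts)
    return sum(sampled) * len(counts) / n_districts

# Five different samples of 3 districts each.
estimates = [estimate_total(hospitals_per_district, 3, seed) for seed in range(5)]

print("exact total:", exact_total)
print("estimates from samples:", estimates)
```

In this made-up city, the exact total is 9, but an estimate from three districts depends on which districts happen to be chosen, so it cannot replace counting every hospital.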
In the next two questions, we need
to identify whether the data collected is a population characteristic or a sample
statistic.
Olivia knows all the families
living in her area quite well. She says that she has found out
that the average number of children per family is 2.3. Is this figure a sample statistic
or a population characteristic?
We recall that a population
includes all the elements from a data set. A sample, on the other hand,
consists of one or more observations drawn from the population. The key word in this question is
“all,” as it states that Olivia knows all the families in her area. She has found out the average
number of children per family using the entire population of her area. The correct answer is therefore a
population characteristic.
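As a small worked example of the difference, suppose Olivia's area had just 10 families. The numbers below are invented so that the average matches the 2.3 in the question; they are not data from the question itself.

```python
# Hypothetical data: number of children in each of the 10 families in the area.
children_per_family = [2, 3, 1, 4, 2, 3, 2, 0, 3, 3]

# Population characteristic: uses every family in the area.
population_mean = sum(children_per_family) / len(children_per_family)

# Sample statistic: uses only some of the families (here, the first four).
sample = children_per_family[:4]
sample_mean = sum(sample) / len(sample)

print(population_mean)  # 2.3
print(sample_mean)      # 2.5
```

The first figure describes the whole population, while the second describes only the families that were picked and would change if different families were chosen.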
A study claims that 96 percent of
people aged 16 to 24 in a certain country own a smartphone. Is this a sample statistic or a
population characteristic?
We recall that a population
includes all the elements from a data set. In this question, this would be all
the people aged 16 to 24 in a country. A sample, on the other hand,
consists of one or more observations drawn from the population. Due to the constraints of time and
money, it would be very difficult to ask every 16- to 24-year-old in a country. Typically, this would only happen
when conducting a census. This means that the 96 percent that
the study claims is almost certainly based on a sample of the population. The correct answer is therefore a
sample statistic.
A study of this type will usually not be
able to ask the entire population but will instead focus on a sample. This sample could have been
obtained using a variety of methods, such as random sampling, systematic
sampling, or stratified sampling.
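Although these methods are not the focus of this video, a rough sketch may help to tell them apart. Everything here is an assumption made for illustration: the population of 1,000 ID numbers, the three age groups, the sample size of 50, and the function names.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance of being picked."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Take every k-th member after a random start, where k = len(population) // n."""
    step = len(population) // n
    start = rng.randrange(step)
    return population[start::step][:n]

def stratified_sample(strata, fraction, rng):
    """Take a random sample of the same fraction from each group (stratum)."""
    picked = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        picked.extend(rng.sample(members, k))
    return picked

rng = random.Random(1)
people = list(range(1000))  # hypothetical ID numbers
by_age = {"16-18": people[:300], "19-21": people[300:700], "22-24": people[700:]}

random_s = simple_random_sample(people, 50, rng)
systematic_s = systematic_sample(people, 50, rng)
stratified_s = stratified_sample(by_age, 0.05, rng)

print(len(random_s), len(systematic_s), len(stratified_s))
```

All three calls return 50 people here. What differs is how they are chosen: purely at random, at a fixed interval of 20, or in proportion to the size of each age group.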
In our final example, we will
look at a key term involved in sampling.
Which of these makes an inference
in statistics? Is it (A) computing a statistic
from the sample, (B) generating a random sample from
a given population, (C) applying conclusions drawn from
a sample to a whole population, or (D) working out the percentage
of the population that exhibits a certain characteristic?
Statistical inference is the
process of using data analysis to infer properties of a population. This means that we are looking to
draw conclusions from a sample that could apply to the whole population. The correct answer is therefore
option (C). An inference applies conclusions
drawn from a sample to a whole population.
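We can sketch this idea with a made-up population. Here we assume 100,000 people, of whom exactly 96 percent own a smartphone, and a random sample of 500. In a real study the population figure would be unknown, which is exactly why the sample is used to estimate it.

```python
import random

rng = random.Random(42)

# Hypothetical population: True means the person owns a smartphone.
population = [True] * 96_000 + [False] * 4_000

# Ask only 500 people, chosen at random.
sample = rng.sample(population, 500)

# Sample statistic: computed from the 500 people we asked.
sample_proportion = sum(sample) / len(sample)

# Population characteristic: known here only because we built the data ourselves.
population_proportion = sum(population) / len(population)

print(f"sample: {sample_proportion:.3f}, population: {population_proportion:.3f}")
```

The inference is the step where we take the proportion found in the sample and use it as our estimate for the whole population.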
We will now summarize the key
points from this video. We found out in this video that a
population contains all elements of a data set. As a sample consists of one or more
observations from the population, it is a subset of the population. This can be shown in the given
diagram where the sample is a selection from the larger group or population. All elements of the sample must be
contained within the population. We also found out that we can
analyze a sample to infer properties of an entire population. This allows us to form
hypotheses or draw conclusions without studying the entire population.