Video Transcript
In this video, we will learn how to
find the interquartile range given different representations of data. The interquartile range of a data
set is a measure of how the data values are spread out around the center. We will begin by recalling how to
calculate the median as well as the lower and upper quartiles. Once we have done this, we will
show how we can use these values to calculate the interquartile range.
The median, otherwise known as 𝑄
two or the second quartile, marks the middle of a data set. 50 percent of the data is below the
median, and 50 percent is above it. The lower or first quartile, known
as 𝑄 one, marks the center of the bottom half of a data set. 25 percent of the data is below 𝑄
one. The upper or third quartile, 𝑄
three, marks the center of the top half of a data set. 25 percent of the data is above the
upper quartile, and 75 percent is below it.
When dealing with a small data set,
it is easy to calculate the median and quartiles by inspection. However, when dealing with larger
data sets, we can use formulas to find the positions of the median, 𝑄 one, and 𝑄 three
in our data set. The median position is equal to 𝑛
plus one divided by two, where 𝑛 is the number of data values in our set. The lower quartile position is
equal to a quarter of 𝑛 plus one or 𝑛 plus one divided by four. The upper quartile position is
equal to three-quarters of 𝑛 plus one or three multiplied by 𝑛 plus one divided by
four.
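As a quick illustration of these three position formulas, here is a minimal Python sketch. The function names are our own and do not come from the video, and positions are counted from one, as in the transcript.

```python
def median_position(n: int) -> float:
    """Position of the median in an ordered list of n values: (n + 1) / 2."""
    return (n + 1) / 2


def lower_quartile_position(n: int) -> float:
    """Position of Q1 in an ordered list of n values: (n + 1) / 4."""
    return (n + 1) / 4


def upper_quartile_position(n: int) -> float:
    """Position of Q3 in an ordered list of n values: 3(n + 1) / 4."""
    return 3 * (n + 1) / 4


# With 27 values, all three positions come out as whole numbers.
print(median_position(27), lower_quartile_position(27), upper_quartile_position(27))

# With 14 values, they do not, so the positions fall between two data values.
print(median_position(14), lower_quartile_position(14), upper_quartile_position(14))
```

A position such as 7.5 tells us that the value we want lies between the seventh and eighth data values.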
In some cases, particularly when
dealing with an even number of data values, we need to round these answers up or
down. We’ll now look at a quick example
where we need to calculate the median as well as the lower and upper quartile.
A group of 14 students were asked
to log the number of friends they each had on the new social media site Nosebook,
one month after joining the site, with the following results. Find the median number of
friends. Find the quartiles 𝑄 one and 𝑄
three of the data.
Before being able to calculate the
median or quartiles from any data set, we need to list the values in ascending
order. In this case, the smallest value is
32. The next smallest is 44. The rest of the values are listed
in order as shown up to 95. The median is the middle value. Whilst we could cross off a number
from either end to find this middle value, we can also use the formula 𝑛 plus one
divided by two to find the middle position.
As there are 14 values in this
question, we need to add one to 14 and then divide by two. This is equal to 7.5. Therefore, the median value is
between the seventh and eighth value. The seventh value is 68, and the
eighth value is 70. The median is the midpoint or
average of these two values. The midpoint of 68 and 70 is
69. Therefore, the median number of
friends is 69.
𝑄 one or the lower quartile is the
center of the bottom half of the data set. We have seven values that are below
the median. The middle or center of these will
be the fourth value as it has three values on either side. As the fourth value is 53, 𝑄 one
or the lower quartile is 53. In a similar way, we can find 𝑄
three or the upper quartile by finding the center of the top half of data. Once again, there are seven values
that are greater than the median. These range from the eighth value
to the 14th, the middle of which is the 11th, which has three values on either
side. As this is equal to 82, the upper
quartile or 𝑄 three is 82.
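The method used here, taking the median of each half of the ordered data, can be sketched in Python as follows. This is only one possible way to write it, and the function name is our own. Note that the transcript states only seven of the 14 values (32, 44, 53, 68, 70, 82, and 95), so the other seven values in the list below are made-up placeholders chosen purely to keep the list in ascending order.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    The data is sorted and split into a bottom half and a top half,
    and the median of each half is taken. When the number of values
    is odd, the middle value is left out of both halves.
    """
    if len(values) < 2:
        raise ValueError("need at least two values to split into halves")
    ordered = sorted(values)
    half = len(ordered) // 2
    bottom = ordered[:half]
    top = ordered[len(ordered) - half:]
    return median(bottom), median(ordered), median(top)


# Only 32, 44, 53, 68, 70, 82, and 95 are stated in the transcript.
# The remaining values are hypothetical placeholders, NOT the real data.
friends = [32, 44, 49, 53, 58, 63, 68, 70, 75, 79, 82, 87, 91, 95]

q1, med, q3 = quartiles_by_halves(friends)
print(q1, med, q3)
```

Because the fourth, seventh, eighth, and 11th values match the transcript, this sketch reproduces the same lower quartile, median, and upper quartile whatever the placeholder values are, as long as the order is kept.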
We will now look at the definition
of the interquartile range and how we can use these values to calculate it. The interquartile range is a
measure of the middle 50 percent of the data and gives us an indication of how
spread out the data is. We can calculate the interquartile
range or IQR of any data set by subtracting our 𝑄 one value from 𝑄 three; we
subtract the lower quartile from the upper quartile.
In our previous question, 𝑄 one or
the lower quartile was equal to 53, and 𝑄 three, the upper quartile, was equal to
82. The interquartile range would,
therefore, be equal to 82 minus 53. This is equal to 29.
We will now look at some questions
where we need to calculate the interquartile range.
A set of data’s minimum is 3.0, its
lower quartile is 4.5, its median is 6.4, its upper quartile is 7.9, and its maximum
is 10.1. Determine its interquartile
range.
We know that the interquartile
range of any data set is equal to the upper quartile minus the lower quartile. We are told that the lower quartile
is equal to 4.5. The upper quartile is equal to
7.9. This means that the IQR or
interquartile range is equal to 7.9 minus 4.5. This is equal to 3.4. The interquartile range of the data
is 3.4.
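The same subtraction can be written as a short Python sketch. The function name is our own choice for illustration. The quartile values are the ones stated in the question and in the earlier friends example.

```python
def interquartile_range(q1: float, q3: float) -> float:
    """IQR is the upper quartile minus the lower quartile."""
    return q3 - q1


# Values stated in this question.
lower_quartile = 4.5
upper_quartile = 7.9
iqr = interquartile_range(lower_quartile, upper_quartile)

# Subtracting decimals on a computer can leave a tiny rounding error,
# so we round to one decimal place before displaying the result.
print(round(iqr, 1))

# The earlier friends example, with Q1 = 53 and Q3 = 82.
print(interquartile_range(53, 82))
```

The minimum, median, and maximum given in the question are not needed for this calculation.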
We will now look at a question
where we need to compare the interquartile range for two data sets.
In this question, we’re given two
data sets.
Calculate the interquartile range
for each data set. What do the interquartile ranges
reveal about the two data sets? Is it (A) the spread of the middle
50 percent of the values is similar for both data sets? (B) The difference between the
minimum and maximum values is similar for both data sets? (C) The median of the two data sets
should be the same? (D) The mean of the two data sets
should be the same? Or (E) the mode of the two data
sets should be the same?
We will begin by clearing some
space to calculate the interquartile range for each data set. Let’s begin by considering data set
one. We begin by writing our seven
values in ascending order, starting with 22 and ending with 51. The median of any data set is the
middle value. In this case, this will be the
fourth value as there are three values on either side of this. The median of data set one is
28.
The lower quartile or 𝑄 one is the
center of the bottom half of our data set. The bottom half of the data set
contains three values, 22, 25, and 26. The middle one of these is 25. This means that the lower quartile
of data set one is 25. The upper quartile or 𝑄 three is
the center of the top half of our data set. This contains the numbers 28, 29,
and 51. The middle one of these is equal to
29. Therefore, the upper quartile is
29.
The interquartile range or IQR is
equal to 𝑄 three minus 𝑄 one. We subtract the lower quartile
value from the upper quartile value. 29 minus 25 is equal to four. The interquartile range of data set
one is equal to four. We will now repeat this method for
data set two.
As there are also seven values in
data set two, the position of the quartiles and median will remain the same. The lowest value of data set two is
19, and the highest value is 28. We can see from our list that the
median is equal to 24; the lower quartile, 21; and the upper quartile, 27. This means that the interquartile
range is equal to 27 minus 21, which is equal to six. The interquartile range of data set
two is six.
We will now move on to the second
part of the question. In the second part of the question,
we are asked to consider what the interquartile ranges reveal about the two data
sets. The interquartile range does not
rely on the median, mean, or mode. Therefore, we know that options
(C), (D), and (E) are all incorrect. The maximum and minimum values also
have no impact on the interquartile range as these are used to calculate the range
of the entire data set, so option (B) is also incorrect.
The interquartile range does
contain the middle 50 percent of the values from the lower quartile to the upper
quartile. As our values of four and six are
quite close, we can conclude that the spread of the middle 50 percent of the values
is similar for both data sets. The interquartile range only gives
us information about those middle values.
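To see why the interquartile range tells us about the middle values but not about the minimum and maximum, we can compare both measures side by side. The sketch below uses only the summary values mentioned in the transcript for each data set. The dictionary layout and function name are our own assumptions.

```python
# Summary values read from the transcript for each data set.
data_set_one = {"min": 22, "q1": 25, "q3": 29, "max": 51}
data_set_two = {"min": 19, "q1": 21, "q3": 27, "max": 28}


def spread_summary(summary):
    """Return (range, IQR) for a dict holding min, q1, q3, and max."""
    data_range = summary["max"] - summary["min"]
    iqr = summary["q3"] - summary["q1"]
    return data_range, iqr


range_one, iqr_one = spread_summary(data_set_one)
range_two, iqr_two = spread_summary(data_set_two)

print(range_one, iqr_one)  # range and IQR of data set one
print(range_two, iqr_two)  # range and IQR of data set two
```

The interquartile ranges come out as four and six, which are close, whereas the ranges come out as 29 and nine, which are not. This is why the similar interquartile ranges do not let us say anything about the difference between the minimum and maximum values.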
Our next question involves
calculating the range and interquartile range from a frequency table.
The table shows some non-English
languages spoken by some of the U.S. population. Determine the range and
interquartile range of the data.
The range of any data set can be
calculated by subtracting the minimum value from the maximum, whereas the interquartile range or
IQR is equal to the upper quartile minus the lower quartile, also known as 𝑄 three
minus 𝑄 one. Our first step is to sort our eight
values into ascending order. The smallest value is equal to
216,300. This is the number of people that
speak Hebrew. Next, we have 246,900 people that
speak Armenian. We can continue to list these in
order all the way up to the number of Spanish speakers, which is 37,580,000.
This is the maximum value. We can now calculate the range by
subtracting the minimum value from the maximum one. This is equal to 37,363,700. This is the range of the data in
the frequency table. As we have eight values in total,
and the median is the middle number, this will lie halfway between the fourth and
fifth value. Whilst we don’t need the median to
calculate the interquartile range, it makes it easier to find the lower and upper
quartiles.
The lower quartile is the center of
the bottom half of our data. As there are four values that are
less than the median, the lower quartile will lie halfway between 246,900 and
304,900. We can find the midpoint of these
two values by adding them and then dividing by two. This gives us 275,900. We can repeat this process for the
upper quartile or 𝑄 three. As there are four values above the
median, the center of this will lie halfway between 800,000 and 1,410,000. This is equal to 1,105,000. We can then calculate the
interquartile range by subtracting 275,900 from this. This is equal to 829,100, which is
the interquartile range of the data.
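The arithmetic in this question can be checked with a short Python sketch. It uses only the six values read out in the transcript, since the fourth and fifth values in the ordered list are not stated and are not needed. The helper name midpoint is our own.

```python
def midpoint(a: float, b: float) -> float:
    """Return the value halfway between a and b."""
    return (a + b) / 2


minimum = 216_300       # Hebrew, the smallest value
maximum = 37_580_000    # Spanish, the largest value
data_range = maximum - minimum

# With eight ordered values, Q1 lies halfway between the 2nd and 3rd values
# and Q3 lies halfway between the 6th and 7th values.
q1 = midpoint(246_900, 304_900)
q3 = midpoint(800_000, 1_410_000)
iqr = q3 - q1

print(data_range, q1, q3, iqr)
```

Notice how much smaller the interquartile range is than the range here, because the very large number of Spanish speakers affects the maximum but not the quartiles.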
The final question in this video
involves calculating the interquartile range from a line plot.
The given line plot shows the
magnitudes of the earthquakes that recently took place around the world. Determine the range and
interquartile range of the data.
One way of approaching this
question would be to write out all of the values in order, two, 2.1, 2.6, 2.6, 2.8,
and so on. This would be very time consuming,
so it is easier to work out how many earthquakes of each magnitude we have
first. There was one earthquake of
magnitude two. There was also one earthquake of
magnitude 2.1. There were two earthquakes of
magnitude 2.6, four of 2.8, all the way up to four of 3.5.
We can also calculate a running
total or cumulative frequency of these to calculate the total number of
earthquakes. This gives us values of one, two,
four, eight, 11, 15, 17, 21, 23, and 27. There were 27 earthquakes that took
place altogether. When dealing with a large data set,
we can calculate the position of the median and the quartiles as follows.
The median position can be
calculated by dividing 𝑛 plus one by two, where 𝑛 is the total number of data
values. In this question, we have 27 plus
one divided by two. This is equal to 14. So, the median is the 14th
number. Whilst we don’t need to calculate
the median in this case, it helps us work out the position of the quartiles. The 12th to 15th values all had a
magnitude of three. This means that the median equals
three.
The lower quartile or 𝑄 one
position will be half of this, which is the seventh value. As the seventh number is 2.8, the
lower quartile or 𝑄 one is 2.8. The upper quartile or 𝑄 three
position will be the 21st value. This means that 𝑄 three is equal
to 3.3. The range of values is calculated
by subtracting the minimum from the maximum. 3.5 minus two is equal to 1.5. So, this is the range. The interquartile range or IQR is
equal to 𝑄 three minus 𝑄 one. 3.3 minus 2.8 is 0.5. So, the interquartile range is
0.5.
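The running-total approach can be sketched in Python as shown below. This is a rough illustration under some assumptions. The frequencies are worked out from the running totals given in the transcript. Three of the ten magnitudes are never stated, so they are marked as None rather than guessed. The function name is our own.

```python
from itertools import accumulate

# Magnitudes in ascending order. None marks a magnitude that the
# transcript does not state, so we do not guess it here.
magnitudes = [2.0, 2.1, 2.6, 2.8, None, 3.0, None, 3.3, None, 3.5]

# Frequencies implied by the running totals 1, 2, 4, 8, 11, 15, 17, 21, 23, 27.
frequencies = [1, 1, 2, 4, 3, 4, 2, 4, 2, 4]

# Running total (cumulative frequency) of the number of earthquakes.
cumulative = list(accumulate(frequencies))


def value_at_position(position):
    """Return the magnitude at a position (counted from one) in the ordered data."""
    for magnitude, running_total in zip(magnitudes, cumulative):
        if position <= running_total:
            return magnitude
    raise ValueError("position is beyond the end of the data")


n = cumulative[-1]  # total number of earthquakes

# n + 1 is 28 here, which divides exactly by 4, so whole-number division is safe.
q1 = value_at_position((n + 1) // 4)
med = value_at_position((n + 1) // 2)
q3 = value_at_position(3 * (n + 1) // 4)

data_range = magnitudes[-1] - magnitudes[0]
iqr = q3 - q1

print(cumulative)
print(q1, med, q3)
print(data_range, round(iqr, 1))
```

The seventh, 14th, and 21st positions all land on magnitudes that are stated in the transcript, so the unknown magnitudes do not affect the result.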
We will now summarize the key
points from this video. The median, lower quartile, and
upper quartile of a data set can all be calculated from a list of data, a frequency
table, or a line plot. The interquartile range or IQR is a
measure of the spread of the middle 50 percent of the data. It is equal to the upper quartile
or 𝑄 three minus the lower quartile, 𝑄 one. We also use the fact that the range
is equal to the minimum value subtracted from the maximum value of the data set.