In this explainer, we will learn how to find the interquartile range given different representations of data.
The interquartile range (or IQR) of a data set is a measure of how the data values are spread around the center of the data set. It is sometimes called the “midspread,” because it tells us where the middle 50% of the data sit.
You will remember that the median is the middle value of a data set, so it splits the data set into two halves. The quartiles divide a data set into four quarters. Let us remind ourselves, with an example, of how the median and the quartiles divide a data set.
Example 1: The Quartiles of a Data Set
A group of fourteen students were asked to log the number of friends they each had on the new social media site “Nosebook,” one month after joining the site, with the following results.
- Find the median number of friends.
- Find the quartiles Q1 and Q3 of the data.
To find the median and the quartiles of the data, we need first to put the data in ascending order, that is, with the smallest value, 32, first; the next smallest value, 44, second; and so on, until the largest value, 95, sits at the end.
There were fourteen students in the group, so we have fourteen numbers in our data set. To find the median, we want to split our data set in half. Since fourteen is an even number, we can split the data into two sets of seven values. The median is then the average of the two middle values. The middle values are the 7th and 8th values, which are 68 and 70, so the median is $\frac{68 + 70}{2} = 69$.
The median number of friends after one month on the site was, therefore, 69.
Note that the median is actually the second quartile, Q2.
To find the quartiles of the data, we must split both the lower half and the upper half of the data in two.
You can see that the lower half of the data is split in two by the value 53—there are three data values on either side of 53. So 53 is our first quartile, Q1. The second quartile is Q2, the median, which we have already found. The third quartile, Q3, is the middle value of the upper half of the data set. You can see that the data value 82 splits the upper half of the data in half. So, Q3 is 82.
To summarize, we found that the median number of friends after one month on the site was 69; the first quartile, Q1, was 53 friends; and the third quartile, Q3, was 82 friends.
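The splitting procedure used in this example can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `quartiles` and the fourteen values in `sample` are made up for the sketch (they are not the Nosebook figures), and it assumes the convention used above, in which each quartile is the median of one half of the ordered data, with the middle value left out of both halves when the number of values is odd.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) by taking the median of each half of the data."""
    ordered = sorted(values)      # ascending order, smallest value first
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]        # lower half of the data
    # For an odd count, skip the middle value so both halves have equal size.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


# Hypothetical data: fourteen invented values, NOT the data from Example 1.
sample = [3, 5, 8, 10, 12, 15, 17, 19, 22, 24, 27, 30, 33, 36]
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)
```

As in the example, each half of these fourteen values contains seven values, so Q1 is the 4th value, Q3 is the 11th value, and the median is the average of the 7th and 8th values.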
Now, let us remind ourselves of what the quartiles signify for a data set and define the interquartile range.
Definition: The Quartiles and Interquartile Range (IQR) of a Data Set
- The first (or lower) quartile (Q1) marks the center of the lower half of a data set. So, 25% of the data sit below Q1, and 75% of the data sit above Q1.
- The second quartile (Q2), which is the median, marks the middle of a data set. So, 50% of the data set is below the median and 50% is above the median.
- The third (or upper) quartile (Q3) marks the center of the upper half of a data set. So, 75% of the data set is below Q3 and 25% is above it.
- The interquartile range, IQR, is given by $\text{IQR} = Q_3 - Q_1$. The interquartile range is a measure of the spread of the middle 50% of the data.
This next example shows how the interquartile range can be calculated.
Example 2: The Interquartile Range
A set of data’s minimum is 3.0, its lower quartile is 4.5, its median is 6.4, its upper quartile is 7.9, and its maximum is 10.1. Determine its interquartile range.
Let us illustrate the information we have in a diagram. The figures we have—the minimum, Q1, the median, Q3, and the maximum value—fit well in a box-and-whisker plot.
The interquartile range is the distance between Q1 and Q3, and to calculate this we use the formula $\text{IQR} = Q_3 - Q_1$.
We are given that, for this data set, the lower quartile, Q1, is 4.5, and the upper quartile, Q3, is 7.9. So, for this data set, the interquartile range is $\text{IQR} = 7.9 - 4.5 = 3.4$.
The interquartile range gives us the range of the middle 50% of the data, which in this case is 3.4.
Note that the interquartile range measures the range of the middle 50% of the data, whereas the range of a data set, given by the maximum value minus the minimum value, gives the range of the whole data set. For this data set, the maximum value is 10.1 and the minimum is 3.0. So, the range of the data is $10.1 - 3.0 = 7.1$. This is, of course, larger than the interquartile range, which is 3.4.
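The comparison between the range and the interquartile range can be checked with a short Python sketch. The class name `FiveNumberSummary` and its method names are hypothetical, introduced only for this illustration; the five values are those given in Example 2. The results are rounded to one decimal place because decimal fractions such as 7.9 are not stored exactly in floating point.

```python
from dataclasses import dataclass


@dataclass
class FiveNumberSummary:
    """Hypothetical container for the five values drawn on a box-and-whisker plot."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    def iqr(self) -> float:
        # Interquartile range: upper quartile minus lower quartile.
        return self.q3 - self.q1

    def data_range(self) -> float:
        # Range: maximum value minus minimum value.
        return self.maximum - self.minimum


# Values from Example 2.
summary = FiveNumberSummary(minimum=3.0, q1=4.5, median=6.4, q3=7.9, maximum=10.1)
print(round(summary.iqr(), 1), round(summary.data_range(), 1))
```

Note that the median is stored but not used: neither the range nor the interquartile range depends on it.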
We will calculate the interquartile range using information from another data set in the next example.
Example 3: Interquartile Range
A set of data’s minimum is 1.71, its lower quartile is 2.05, its median is 6.86, its upper quartile is 7.99, and its maximum is 14.16. Determine its interquartile range.
To calculate the interquartile range, we use the formula $\text{IQR} = Q_3 - Q_1$.
We are given that, for this data set, the lower quartile, Q1, is 2.05 and the upper quartile, Q3, is 7.99. So, the interquartile range is $\text{IQR} = 7.99 - 2.05 = 5.94$.
In our next example, we calculate the range and interquartile range for a data set with large values.
Example 4: Range and Interquartile Range
The table shows some non-English languages spoken by some of the US population. Determine the range and interquartile range of the data.
|Language|Number of Speakers|
Our first step in finding the range and interquartile range of the data set is to put the data in ascending order. We can see from the table that Hebrew is the language with the smallest number of speakers, Armenian has the second smallest number of speakers, Greek has the third smallest number, and so on. So, we reorder our table in this way.
Now that we have the data in order of size, we can find the range quite easily by subtracting the smallest value (Hebrew, with 216,300 speakers) from the largest (Spanish, with 37,580,000 speakers): $37{,}580{,}000 - 216{,}300 = 37{,}363{,}700$.
The range of the language data is therefore 37,363,700.
Now, we wish to find the interquartile range of the data. To do this, we must first find the quartiles Q1 and Q3 of the data by splitting the ordered data set into quarters. There are eight languages, so the middle of the data set sits between the 4th and 5th values. This is halfway between the numbers of Urdu and Hindi speakers.
Next, we split the data into quarters. The lower half of the data has four values (for Hebrew, Armenian, Greek, and Urdu). So the first, or lower, quartile Q1 will be between Armenian and Greek. Similarly, the third, or upper, quartile Q3 will be between Bengali and Korean, at the center of the upper half of the data.
To find Q1, the lower quartile, we calculate the number that sits halfway between the number of Armenian speakers and the number of Greek speakers. This is given by $Q_1 = \frac{\text{Armenian speakers} + \text{Greek speakers}}{2}$.
To find Q3, the upper quartile, we calculate the number that sits halfway between the number of Bengali speakers and the number of Korean speakers. This is given by $Q_3 = \frac{\text{Bengali speakers} + \text{Korean speakers}}{2}$.
Now, we can determine the value of the interquartile range (IQR), which is $Q_3 - Q_1 = 829{,}100$.
Our range is therefore 37,363,700, and our interquartile range is 829,100.
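The positions used to locate the quartiles in this example can be worked out from the number of values alone. The following Python sketch is one way to do this, assuming the same median-of-halves convention as above; the function names `middle_positions` and `quartile_positions` are hypothetical. Positions are counted from 1, matching phrases such as "between the 4th and 5th values."

```python
def middle_positions(start, count):
    """Positions (counted from 1) at the middle of `count` consecutive
    ordered values, the first of which sits at position `start`."""
    mid = start + (count - 1) // 2
    if count % 2 == 1:
        return (mid,)          # odd count: a single middle value
    return (mid, mid + 1)      # even count: average these two values


def quartile_positions(n):
    """Positions that determine Q1, Q2 and Q3 for n ordered values."""
    half = n // 2
    # For an odd n, the middle value belongs to neither half.
    upper_start = half + 1 if n % 2 == 0 else half + 2
    return {
        "Q1": middle_positions(1, half),
        "Q2": middle_positions(1, n),
        "Q3": middle_positions(upper_start, half),
    }


print(quartile_positions(8))
```

For the eight languages, this places Q1 between the 2nd and 3rd values, the median between the 4th and 5th, and Q3 between the 6th and 7th. For the fourteen values in Example 1, `quartile_positions(14)` gives the 4th value for Q1 and the 11th value for Q3.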
In this next example, we will see how to calculate the range and interquartile range of a grouped data set.
Example 5: Interquartile Range with Grouped Data
The given line plot shows the magnitudes of earthquakes that recently took place around the world. Determine the range and interquartile range of the data.
On the line plot, each cross above a data value on the axis represents an item of data in our data set. We can find the range of the data by searching the plot for the lowest and highest earthquake magnitudes with a cross, or crosses, above them.
The highest earthquake magnitude with a cross above it is 3.5, and the lowest is 2. To calculate the range, we then subtract the lowest magnitude from the highest: $3.5 - 2 = 1.5$.
This gives us a range of 1.5 on the earthquake magnitude scale.
Now, we wish to find the interquartile range for the earthquake magnitude data. To do this, we first need to find the quartiles Q1 and Q3. And to do this, we need to know how many data values we have in our set.
Each data value in our set is represented by a cross on the line plot, and as there are some earthquake magnitudes with more than one data point (or cross) above them on the plot (i.e., the data has been grouped), we will need to count all of these. You can see, for example, that the magnitude 2 has one cross above it, so one earthquake had a magnitude of 2, whereas the magnitude 2.6 has two crosses above it, so two earthquakes had a magnitude of 2.6. The total number of crosses is the number of earthquakes that occurred.
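Counting the crosses amounts to expanding a tally into a full list of data values. The Python sketch below shows the idea under loud assumptions: the dictionary `crosses` is hypothetical, and only the entries for magnitudes 2 and 2.6 and the extreme magnitudes 2 and 3.5 come from the text; the remaining magnitudes and counts are invented for illustration and are not read from the line plot.

```python
# Hypothetical tally: magnitude -> number of crosses above it.
# Only the counts for 2.0 and 2.6 are taken from the text; the rest are invented.
crosses = {2.0: 1, 2.2: 2, 2.6: 2, 3.0: 3, 3.5: 1}

# Repeat each magnitude once per cross to recover the ordered data set.
values = sorted(m for m, count in crosses.items() for _ in range(count))

total = len(values)                  # number of earthquakes recorded
data_range = values[-1] - values[0]  # highest magnitude minus lowest
print(total, data_range)
```

Once the expanded list is available, the quartiles can be located in it in the same way as for any other ordered data set.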