Video Transcript
In this video, we will learn how to
work out the median and upper and lower quartiles of a data set. The quartiles allow us to split a
data set into quarters such that 25 percent of the data lies in each
quarter. We will begin by defining the
median, lower quartile, and upper quartile and look at how we can calculate these
for a small data set. Once we’ve done this, we will look
at some more complicated problems in context.
The median or second quartile marks
the middle of a data set. This means that 50 percent of our
data is below the median and 50 percent is above the median. The lower quartile or first
quartile marks the center of the bottom half of a data set. This means that 25 percent of the
data is below the lower quartile and 75 percent is above it.
The upper quartile or third
quartile marks the center of the top half of a data set. This means that 25 percent of the
data is above the upper quartile, whereas 75 percent is below it. The lower quartile, median, and
upper quartile are sometimes referred to as Q one, Q two, and Q three,
respectively.
We will now look at how we can
calculate the median and quartiles from a small data set. We will begin by looking at an odd
number of pieces of data.
Determine the median and quartiles
of these values. 10, 17, 21, 25, 29, 32, and 37.
We notice that our values are
already in ascending order. If this were not the case, we would
need to arrange them in this way first. There are seven pieces of data
here. And we know that the median is the
middle value. One way of calculating the median
is to cross off values from either end. Firstly, we cross off 10 and
37. Next, we would cross off 17 and
32. Finally, we would cross off 21 and
29. This leaves us with a median value
of 25. The fourth number in the list is
the median.
As there were seven numbers in the
list, we might think that the median position is half of this. This is not the case, however. Instead, we can calculate the median
position for any data set using the following formula: 𝑛 plus one, all divided by
two. 𝑛 is the number of values, in this
case seven. So we have seven plus one divided
by two. As eight divided by two is equal to
four, the fourth value in our list will be the median.
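
As a minimal sketch of this calculation in Python, the snippet below applies the median position formula to the seven values. The function name `median_position` is our own label rather than anything from the video, and the code assumes the list is already sorted and has an odd number of values, so that the position is a whole number.

```python
def median_position(n):
    """Return the 1-based position of the median: (n + 1) / 2."""
    return (n + 1) / 2


values = [10, 17, 21, 25, 29, 32, 37]  # already in ascending order

position = median_position(len(values))  # (7 + 1) / 2
median = values[int(position) - 1]       # lists are 0-based, positions are 1-based

print(position, median)  # 4.0 25
```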
We know from our previous
definition that the lower quartile is the center of the bottom half. In this question, the bottom half
of the data set has three values: the first, second, and third. The value in the middle of these is
17. Therefore, the lower quartile is
17. We can calculate the lower quartile
or Q one position using a similar formula to the median. This time, it is 𝑛 plus one
divided by four. Dividing by four is the same as
finding a quarter. Seven plus one is equal to eight,
and dividing this by four gives us two. Therefore, the second number in our
data set, in this case 17, is the lower quartile.
The upper quartile is the center of
the top half of our data set, in this case the fifth, sixth, and seventh numbers. The center of these is the sixth
number, 32. This is the upper quartile. Once again, we could find the
position of the upper quartile or Q three by finding three-quarters of 𝑛 plus one
or multiplying 𝑛 plus one by three and then dividing by four. This is equal to six. So the sixth number in our list
will be the upper quartile. A quicker way of calculating this
would be to multiply the position of the lower quartile by three, as three-quarters
is one-quarter multiplied by three. The lower quartile, median, and
upper quartile of our values are 17, 25, and 32, respectively.
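
The "center of each half" description can also be sketched in Python. The helper names `middle` and `quartiles_by_halves` are hypothetical. The sketch assumes, as in this example, that the median itself is left out of both halves when the number of values is odd, and that there are at least two values.

```python
def middle(sorted_values):
    """Return the middle value, or the midpoint of the two middle values."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) by taking the middle of each half."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Assumption: with an odd count, the median is excluded from both halves.
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return middle(lower), middle(data), middle(upper)


print(quartiles_by_halves([10, 17, 21, 25, 29, 32, 37]))  # (17, 25, 32)
```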
We will now look at a question
where we have an even number of data points.
David’s history test scores are 74,
96, 85, 90, 71, and 98. Determine the upper and lower
quartiles of his scores.
In order to calculate the upper and
lower quartiles for a data set, we firstly need to sort the data into ascending
order. In this case, the lowest score was
71. The next lowest score was 74. The remainder of David’s scores in
ascending order were 85, 90, 96, and 98. We have six test scores in total,
and we know that the median is the middle value.
One way to calculate the median
with a small data set is to cross off numbers from either end. We cross off the smallest number
and the largest number. We then cross off 74 and 96. This means we’re left with two
middle numbers, 85 and 90. The median will be the midpoint of
these two numbers. We could work this out on a number
line. Alternatively, we can find the
average or midpoint of two numbers by finding their sum and dividing by two. This is equal to 87.5. The median of David’s test scores
is 87.5.
An alternative way of finding the
median, which is useful if we have a large data set, is by using the formula 𝑛 plus
one divided by two. This gives us the median position
on the list. As there were six values in this
question, 𝑛 is equal to six. Six plus one is equal to seven, and
dividing by two gives us 3.5. This means that the median will be
halfway between the third and fourth value. This confirms that our answer of
87.5 was correct.
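
The same steps can be sketched in Python for David's scores. This is only an illustration under the assumption that a position ending in .5 means "take the midpoint of the two neighbouring values", as described above. The variable names are our own.

```python
scores = [74, 96, 85, 90, 71, 98]
ordered = sorted(scores)             # [71, 74, 85, 90, 96, 98]

position = (len(ordered) + 1) / 2    # (6 + 1) / 2 = 3.5
lower_index = int(position) - 1      # 0-based index of the third value

if position.is_integer():
    # Odd number of values: the position lands exactly on one value.
    median = ordered[lower_index]
else:
    # Even number of values: midpoint of the third and fourth values.
    median = (ordered[lower_index] + ordered[lower_index + 1]) / 2

print(position, median)  # 3.5 87.5
```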
As we had six values in total,
there are three values less than the median and three values greater than the
median. We know that the lower quartile is
the center of the bottom half of our data set. As there are three values here, the
lower quartile, or Q one, will be the middle one. This is equal to 74. The upper quartile will be the
center of the top half of our data set. Once again, we have three numbers
above the median: 90, 96, and 98. The upper quartile, or Q three, will be the
middle one. This is equal to 96.
We can therefore conclude that the
upper quartile of David’s history scores was 96 and the lower quartile was 74. Before moving on from this
question, let’s consider how we could find the lower quartile and upper quartile
position. The position of the lower quartile
can be calculated using the formula 𝑛 plus one divided by four or a quarter of 𝑛
plus one. As 𝑛 is equal to six, 𝑛 plus one is seven. Seven divided by four is equal to
1.75. As this is more than halfway
between one and two, we round up to two. The lower quartile will be the
second value in our list.
We can calculate the position of
the upper quartile using a similar method. Three-quarters of 𝑛 plus one, or
three multiplied by 𝑛 plus one divided by four. This is equal to 5.25, which we
notice is three times 1.75. As this is less than halfway
between five and six, we round down to five. The fifth number in our list will
be the upper quartile. This method is particularly useful
if we have a large data set.
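
The rounding step described here can be sketched in Python as follows. The function name `nearest_position` is hypothetical, and the choice to round an exactly halfway position upward is an assumption, since the video does not cover that case. Other textbooks and software interpolate between neighbouring values instead of rounding, so library functions may give different quartiles for the same data.

```python
import math


def nearest_position(position):
    """Round a fractional position to the nearest whole position.

    Assumption: an exactly halfway position (x.5) rounds up; the video
    does not say what to do in that case.
    """
    return math.floor(position + 0.5)


ordered = sorted([74, 96, 85, 90, 71, 98])
n = len(ordered)

q1_position = (n + 1) / 4        # 7 / 4 = 1.75, rounds up to 2
q3_position = 3 * (n + 1) / 4    # 21 / 4 = 5.25, rounds down to 5

q1 = ordered[nearest_position(q1_position) - 1]
q3 = ordered[nearest_position(q3_position) - 1]

print(q1, q3)  # 74 96
```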
We will now look at a couple of
more complicated questions in context.
The number of Bonus Bugs won by
each of 15 students in the first level of a computer game tournament was
recorded. The results are in the table
below. Find the median, Q two, and the
lower and upper quartiles, Q one and Q three, for the number of Bonus Bugs won. If the organizers of the tournament
decide that the top 25 percent of students can compete in level two, above what
number of Bonus Bugs must a student win to go on to the next level?
In order to calculate the median
and quartiles of any data set, we firstly need to sort the data into ascending
order. The lowest number of Bonus Bugs
that a student won was 14. The next lowest was 15. The completed list in ascending
order is as shown. Once our data is in order, we can
calculate the median by crossing off one number from either end until we reach the
middle. We would cross off 14 and 35. We would then cross off 15 and 32
and repeat this process until we arrived at the middle. If there were two middle numbers,
we would find the midpoint of these two.
When dealing with a large data set,
there is a quicker way of finding the median position. We do this using the formula 𝑛
plus one divided by two, where 𝑛 is the number of data values. In this question, there are 15 data
values. We add one to 15 and then divide by
two. This is equal to eight. Therefore, the median will be the
eighth number in our list. This is equal to 22. So the median number of Bonus Bugs
is 22.
We can work out the lower quartile
and upper quartile positions in a similar way. The lower quartile or Q one
position is calculated by dividing 𝑛 plus one by four. 15 plus one is equal to 16, and
dividing this by four gives us four. We can therefore say that the
fourth number in our list, in this case 17, is the lower quartile.
An alternative way to find the
lower quartile would be to find the center of the bottom half of our list. There are seven values below the
median, and the middle one of these is 17, the fourth value. To find the upper quartile or Q
three position, we multiply 𝑛 plus one by three-quarters or multiply 𝑛 plus one by
three and then divide by four. This is equal to 12. Notice that this is three times the
Q one position. The 12th number in our list is 29,
so this is the upper quartile.
As the upper quartile is the center
of the top half of data values, we could once again have found this by finding the
middle of the seven values above the median. The median number of Bonus Bugs is
22, the lower quartile is 17, and the upper quartile is 29.
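
As a quick check of the three positions used in this question, here is a small Python sketch. The function name `quartile_positions` is our own, and because the table of results is not reproduced in this transcript, the sketch only computes positions, not the values found at them.

```python
def quartile_positions(n):
    """Return the 1-based positions of Q1, the median, and Q3."""
    base = n + 1
    return base / 4, base / 2, 3 * base / 4


q1_pos, median_pos, q3_pos = quartile_positions(15)

print(q1_pos, median_pos, q3_pos)  # 4.0 8.0 12.0
print(q3_pos == 3 * q1_pos)        # True: Q3 position is three times Q1 position
```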
We will now clear some space to
work out the second part of the question.
The second part of the question was
interested in the top 25 percent of students. We recall that one of the reasons
for calculating the quartiles is to split our data into quarters. One-quarter is the same as 25
percent. This means that the top 25 percent
of students will lie between Q three and the maximum, inclusive. As Q three or the upper quartile
was equal to 29, any student will be in the top 25 percent if they achieve 29 Bonus
Bugs or more.
Our final question builds on this
one.
In the second year of a computer
game tournament, there were 42 participants and the number of Bonus Bugs each one
won in level one was recorded. The data is shown in the graph
below where each bug represents one participant. Find the median number of Bonus
Bugs won and the lower and upper quartiles, Q one and Q three.
There is also a second part to this
question that we will look at later. We can see from the graph that
there was one student who achieved 13 Bonus Bugs. There was also one student who
achieved 15 Bonus Bugs. Two students achieved 19 Bonus
Bugs, and two students achieved 20. The maximum number of Bonus Bugs
achieved by any student was 38.
In order to calculate the median
and quartiles, we could write all of these numbers out in ascending order. 13, 15, 19, 19, 20, 20, and so
on. This would be very
time-consuming. So a quicker method is to work out
which position the median and quartiles would be in. The median position can be
calculated using the formula 𝑛 plus one divided by two. 𝑛 is the number of data values, in
this case 42. Substituting this into the formula
gives us an answer of 21.5. This means that the median position
is between the 21st and 22nd number.
By calculating the running total or
cumulative frequency, we can see that the 19th, 20th, 21st, and 22nd numbers are all
26. This means that the median number
of bugs is 26. We can calculate the Q one or lower
quartile position using a similar method. This time, we divide 𝑛 plus one by
four, giving us an answer of 10.75. As this is past halfway between 10
and 11, we round up to the 11th number. The 11th and 12th numbers are equal
to 23. Therefore, Q one equals 23.
To calculate the Q three or upper
quartile position, we multiply the lower quartile position by three. This gives us 32.25. As this is less than halfway
between 32 and 33, we round down. We’re looking for the 32nd
number. This is equal to 29.
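
The running total idea can be sketched in Python. The function name `value_at_position` is hypothetical. Because the graph is not reproduced in this transcript, the sketch only uses the counts that were read out earlier (one student on 13, one on 15, two on 19, and two on 20), so it can only look up the first six positions.

```python
def value_at_position(frequencies, position):
    """Return the value at a 1-based position using a running total.

    `frequencies` is a list of (value, count) pairs in ascending order
    of value.
    """
    running_total = 0
    for value, count in frequencies:
        running_total += count
        if position <= running_total:
            return value
    raise ValueError("position is beyond the end of the data")


# Only the counts read out in the transcript; the rest of the graph is unknown.
partial_counts = [(13, 1), (15, 1), (19, 2), (20, 2)]

print(value_at_position(partial_counts, 3))  # 19
print(value_at_position(partial_counts, 6))  # 20

# Positions for all 42 participants, before any rounding.
n = 42
print((n + 1) / 2, (n + 1) / 4, 3 * (n + 1) / 4)  # 21.5 10.75 32.25
```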
The second part of the question
asks what score the top 25 percent of participants achieved. The quartiles split our data into
quarters or 25 percent. This means that 25 percent of the
scores will go from the upper quartile to the maximum. A score of 29 or more would put a
student in the top 25 percent.
We will now summarize the key
points from this video. The median marks the middle of a
data set. The lower quartile marks the center
of the bottom half of a data set. And the upper quartile marks the
center of the top half of a data set. We can calculate the median and
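quartile positions using the following formulas: 𝑛 plus one divided by two for the median, 𝑛 plus one divided by four for the lower quartile, and three multiplied by 𝑛 plus one divided by four for the upper quartile, where 𝑛 is the number of data values. The quartiles split our data into
quarters, each containing 25 percent of the data.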