Video Transcript
Have you ever found out that two people in your class, or perhaps in your sports team, have the same birthday and thought it was a bit surprising? I mean, what are the chances of that happening? Ignoring the year that somebody was born in, there are 365 possible birthdays, or 366 if we include those poor people born on the 29th of February in a leap year. And perhaps there are only 30 people in your class. So, it seems incredibly unlikely that two people would share the same birthday. But actually, if we delve a little bit more into the mathematics behind this, you may see that it’s not quite as unlikely as you first thought.
This problem is known as the birthday paradox, a paradox meaning something which seems counterintuitive or illogical at first. And it’s often phrased as follows. How many people do there need to be in a room for there to be at least a 50 percent chance that at least two of them share a birthday? The part of the statement where we said at least two of them share a birthday just means that more than two people could share the same birthday, or there could be more than one shared birthday in the group. We’re going to look at the maths of this, but before we do, we need to make a couple of assumptions.
Firstly, that there are no twins or triplets or quadruplets and so on in the room. So, all of the people’s birthdays are independent of one another. And secondly, we need to assume that every possible date in the year is equally likely as a birthday, which means we’ll be ignoring people born on the 29th of February. What’s your intuition telling you about this problem? Well, for many people, their intuition may be telling them that if we need at least a 50 percent chance that we’ll have a shared birthday in the group, perhaps we need half of 365 people. That’s half of all the possible birthdays. Half of 365 is 182.5. So if we round this up, perhaps we need 183 people in a room in order for there to be at least a 50 percent chance that we have a shared birthday. Instead of trying to guess, though, let’s look at some mathematical methods that we could use.
Here’s our first method. Suppose we have a group of people in a room. We’ll begin by finding the probability that everybody in that group has a unique birthday. And then, to find the probability that there is at least one shared birthday in the group, we can subtract the probability we’ve already calculated from one. Let’s consider the first person; they can have whatever birthday they like. So, there’s a 365 out of 365 chance, that is, a probability of one or 100 percent, that this person has a unique birthday at this stage. Then, we introduce person two; they can have any of the 364 remaining birthdays. So, the probability that they’ll have a unique birthday at this stage is 364 out of 365. Remember, we’re ignoring the 29th of February as a possible birthday.
When we introduce the third person, they can’t have either of the two birthdays already used. So, the probability that they have a unique birthday at this stage is 363 out of 365. And we can continue like this, subtracting one in the numerator of our probabilities for as many people as there are in the room. By the time we get to the 𝑛th person in the room, we’ll have used up 𝑛 minus one possible birthdays. So, the probability that person 𝑛 has a unique birthday will be 365 minus (𝑛 minus one), all over 365.
As we assumed that we have no twins or triplets or so on in the room, all the birthdays are independent of one another. So, to find the probability that they’re all unique, we multiply the individual probabilities together, giving 365 over 365 times 364 over 365 all the way down to 365 minus (𝑛 minus one), all over 365. Remember, though, that we don’t really want to know the probability that all the birthdays are unique; we want to know the probability they’re not all unique. So, to find the probability that there’s at least one shared birthday in the group, we can subtract this product from one.
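If you’d like to see that product written out, here is a minimal Python sketch of the first method. The function name `prob_shared_birthday` and the `days` parameter are just illustrative choices, and it relies on the same assumptions as above: 365 equally likely, independent birthdays.

```python
def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes every one of `days` birthdays is equally likely and that
    birthdays are independent (no twins, no 29th of February).
    """
    if n > days:
        return 1.0  # more people than days, so a match is certain
    prob_all_unique = 1.0
    for k in range(n):
        # Person k + 1 must avoid the k birthdays already taken.
        prob_all_unique *= (days - k) / days
    return 1.0 - prob_all_unique


# With two people, the only way to avoid a match is the 364/365 case.
print(prob_shared_birthday(2))  # 1 - 364/365, roughly 0.0027
```

The loop multiplies the fractions in the same order as we built them up: 365 over 365, then 364 over 365, and so on.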
Let’s try this out then with a couple of different values of 𝑛. That’s different numbers of people in the group. Let’s start with 100 people first of all. That’s a nice round number to begin with. The probability for the 100th person having a unique birthday will be 266 over 365. Remember, that’s 365 minus (𝑛 minus one), or 365 minus 99 in this case. When we multiply all of these probabilities together, we get approximately 0.0000003. Remember that gives the probability of everybody having a unique birthday. So, when we subtract from one to find the probability of there being at least one shared birthday, we get 0.9999997. That’s a greater than 99.9999 percent chance that there’s at least one shared birthday in the group. What! That may not have been what you were expecting.
Let’s try something a little smaller then; let’s try a group of 60 people. This time, when we multiply the probabilities of distinct birthdays together, we get 0.005877. And when we subtract from one, we get 0.994122, so still a greater than 99 percent chance that we’ll have at least one shared birthday in the group. It looks like the answer to this problem is going to be much lower than we may have thought. You could set up a spreadsheet to calculate these probabilities for whatever number of people you like. And if you do so, you’ll find a surprising answer to this problem. It turns out that we only need 23 people in a room in order for the probability of there being at least one shared birthday to be over 50 percent. For 23 people, the probability that they all have distinct birthdays is 0.492702. And when we subtract this from one, we get 0.507297. So, there’s a 50.7 percent chance that there will be at least one shared birthday in a group of 23 people.
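If a spreadsheet isn’t to hand, a few lines of Python can do the same job. This is only a sketch under the same assumptions (365 equally likely, independent birthdays); the names `prob_all_unique` and `smallest_group` are made up for illustration.

```python
def prob_all_unique(n, days=365):
    """Probability that n people all have different birthdays."""
    p = 1.0
    for k in range(n):
        p *= (days - k) / days
    return p


def smallest_group(target, days=365):
    """Smallest group size whose chance of a shared birthday is at least `target`.

    Intended for targets below 1; floating-point rounding makes a target
    of exactly 1 unreliable.
    """
    n = 1
    while 1.0 - prob_all_unique(n, days) < target:
        n += 1
    return n


# The group sizes tried in the video.
for n in (23, 60, 100):
    print(n, 1.0 - prob_all_unique(n))

print(smallest_group(0.5))  # 23
```

Each row printed by the loop plays the role of one row of the spreadsheet.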
Wow! If you’re not convinced by this method, let’s have a look at another way that we could prove this. This time, we’ll start with the answer that we think we know of 23 people. And in this method, we’ll consider the number of possible pairings that there are in a group of 23 people. When pairing people up to compare their birthdays, we have 23 choices for the first person in each pair, and we then have 22 choices for the second person in each pair. Overall, then, we have 23 times 22. That’s 506 possible pairings. But we’ll have counted each pair twice. For example, we’ll have counted Sophie and Tom and then Tom and Sophie. So, we actually need to divide this number by two to give the total number of unique pairings. That gives us 253.
Now, calculating the probability that at least two people share a birthday is incredibly complicated. So instead, as we did in our first method, let’s consider the probability that none of these pairs share a birthday and then subtract from one. Whatever the first person in each pair’s birthday is, there’s a 364 out of 365 chance that the second person doesn’t share it. Assuming the pairs to be independent, which we’ll discuss more in a moment, we can find the probability that none of our pairs share a birthday by multiplying this probability together 253 times or, in other words, raising this probability to the power of 253. This gives 0.4995. And to find the probability that there is at least one shared birthday in the group, we can subtract this probability from one, giving 0.5004. So once again, we’ve shown that with 23 people in a room, the probability of there being at least one shared birthday is greater than 50 percent.
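The pairs method is short enough to check with a few lines of Python. This sketch treats every pair as independent, exactly as the method does, so it is an approximation rather than the exact probability; `pair_approximation` is an illustrative name.

```python
from math import comb


def pair_approximation(n, days=365):
    """Approximate chance of a shared birthday, treating all pairs as independent."""
    pairs = comb(n, 2)  # number of distinct pairs, n * (n - 1) / 2
    prob_no_pair_matches = ((days - 1) / days) ** pairs
    return 1.0 - prob_no_pair_matches


print(comb(23, 2))             # 253
print(pair_approximation(23))  # just over one half
```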
Now, if this seems counterintuitive to you and you thought the answer should be much bigger, let’s have a little think about why this might be the case. Although we might not like to admit it, most people are a little bit self-centered and we often think of this problem from our own perspective. Instead of just asking what’s the probability that there’s a shared birthday somewhere in the group, we instead think of the question “what’s the probability that someone else in the room has the same birthday as me?” This is an entirely different question. In the case of 23 people in a room, it would lead to only 22 comparisons instead of the full 253. And so, we see that the probability of a particular individual sharing a birthday with any of the others is, of course, much lower. It’s equal to 0.05857, so just over 5 percent.
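The “same birthday as me” question can be sketched in Python too. Here the only comparisons are between you and each of the other people in the room; the function name `prob_someone_matches_me` is an illustrative choice, and birthdays are again assumed to be equally likely and independent.

```python
def prob_someone_matches_me(n, days=365):
    """Chance that at least one of the other n - 1 people shares MY birthday."""
    others = n - 1
    # Each other person misses my birthday with probability 364/365.
    prob_nobody_matches = ((days - 1) / days) ** others
    return 1.0 - prob_nobody_matches


print(round(prob_someone_matches_me(23), 5))  # 0.05857
```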
The other reason we find it hard to grasp is that the number of possible pairs grows in a nonlinear way. And our brains are generally pretty bad at grasping how nonlinear functions grow. For example, if we only had four people in a room, the number of distinct possible pairs would be six. That’s four times three over two. If we double the number of people in the room to eight, the number of distinct possible pairs grows to 28. If we increase the number of people in the room to 17, there are now 136 distinct pairs. And we know that when we have 23 people in the room, it’s 253 pairs. The more people we add to the room, the faster the number of possible pairs grows and, therefore, the quicker the chance of a shared birthday increases.
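To see that nonlinear growth for yourself, here is a tiny Python sketch that counts the pairs for the group sizes just mentioned. The helper name `count_pairs` is made up for this example.

```python
def count_pairs(n):
    """Number of distinct pairs in a group of n people: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (4, 8, 17, 23):
    print(n, "people ->", count_pairs(n), "pairs")
```

Doubling the group from four to eight takes us from 6 pairs to 28, more than four times as many.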
In fact, we can plot a graph of this, using the data from the spreadsheet that I suggested you create earlier. We see that as the number of people in the group grows, the probability of there being at least one shared birthday grows really, really quickly. That probability we were looking for of greater than 50 percent is achieved as soon as we have 23 people in the group. And by the time we get to 60 people in a group, the probability of there being at least one shared birthday is greater than 99 percent. We’re not going to get to a probability of 100 percent until we have 366 people in the room. But we’re certainly getting very close and a lot sooner than that guess of 183 people that we may have started off with.
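The data behind a graph like this can be generated in one pass by keeping a running product. The sketch below prints a rough text version of the curve; the function name `shared_birthday_curve` and the bar-chart layout are illustrative choices, not the graph from the video.

```python
def shared_birthday_curve(max_people, days=365):
    """Entry n of the returned list is the chance of a shared birthday among n people."""
    curve = [0.0]  # with nobody in the room there is nothing to share
    p_unique = 1.0
    for n in range(1, max_people + 1):
        # Person n must avoid the n - 1 birthdays already taken.
        p_unique *= (days - (n - 1)) / days
        curve.append(1.0 - p_unique)
    return curve


curve = shared_birthday_curve(70)
for n in range(10, 71, 10):
    bar = "#" * round(curve[n] * 40)  # crude horizontal bar, 40 characters = 100 percent
    print(f"{n:2d} {curve[n]:.3f} {bar}")
```

Under these assumptions the curve first passes 50 percent at 23 people and first passes 99 percent at 57 people.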
Now, you may have spotted that the two methods we used gave slightly different answers for the exact probability, although in both cases the probability is indeed greater than 50 percent when we have 23 people in the room. The reason for this is that in our second method, we assumed all of the pairs were independent of one another. And this isn’t strictly true as each person is included in 22 pairs. We multiplied all of the probabilities together, which is an application of the “and” rule of probability. The probability of two independent events 𝐴 and 𝐵 both occurring is equal to the probability of 𝐴 multiplied by the probability of 𝐵.
But our pairs aren’t strictly independent. If we know that person one shares a birthday with person two and we also know that person one shares a birthday with person three, then it necessarily follows that persons two and three also share a birthday. So, strictly speaking, we should make a correction to this formula and use its extension for when the events are not independent. However, in reality, the correction we need is so small that we can see it hasn’t made much difference to our calculation of the probability.
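To get a feel for how small that difference is, this Python sketch puts the two methods side by side. The names `exact_shared` and `pairs_shared` are illustrative, and the group sizes other than 23 are just extra examples chosen here, not figures from the video.

```python
from math import comb


def exact_shared(n, days=365):
    """First method: one minus the product of the 'unique birthday' probabilities."""
    p_unique = 1.0
    for k in range(n):
        p_unique *= (days - k) / days
    return 1.0 - p_unique


def pairs_shared(n, days=365):
    """Second method: treat all n * (n - 1) / 2 pairs as if they were independent."""
    return 1.0 - ((days - 1) / days) ** comb(n, 2)


for n in (10, 23, 40):
    gap = exact_shared(n) - pairs_shared(n)
    print(n, round(exact_shared(n), 4), round(pairs_shared(n), 4), round(gap, 4))
```

For 23 people the two answers differ by less than one percentage point.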
One of the other assumptions we made was that, ignoring the 29th of February, all birthdays are equally likely. And actually, birth records show that some birthdays occur more frequently in the general population than others. But this doesn’t really matter because an uneven spread of birthdays only makes a match more likely, so the number of people needed to reach that magic probability of 50 percent is, if anything, fewer than 23. If you’re still not convinced of the maths of this, perhaps you could do a survey of some of the classes in your school. Remember, we’re not saying that it’s certain that each class will have a shared birthday. But we’re saying that for groups of 23 or more people, you’d expect there to be a shared birthday in at least 50 percent of cases. In fact, if the classes in your school have 30 students, the probability of there being at least one shared birthday is actually just over 70 percent. So, you’d expect approximately this proportion of the classes in your school to contain at least one shared birthday. Why not try it out?
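If surveying real classes isn’t practical, you could survey imaginary ones instead. The sketch below is a simple simulation, not part of the original video: it invents many classes with random birthdays (assumed equally likely over 365 days) and counts how often at least two students match. The function name, the number of trials, and the seed are all arbitrary choices.

```python
import random


def simulate_shared_fraction(group_size, trials, days=365, seed=0):
    """Fraction of randomly generated groups containing at least one shared birthday."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    groups_with_match = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(group_size)]
        # A set drops duplicates, so a shorter set means a repeated birthday.
        if len(set(birthdays)) < group_size:
            groups_with_match += 1
    return groups_with_match / trials


# 20,000 imaginary classes of 30 students each.
print(simulate_shared_fraction(30, 20_000))
```

The printed fraction should land close to the calculated probability for a class of 30, though a simulation will never match it exactly.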
If you don’t want to carry out your own survey, you could have a look at the results of one already carried out during the 2014 Football World Cup in Brazil. Conveniently, there are 23 players in a full World Cup squad, and this gives us the perfect opportunity to test out the theory. Some interested maths types did the research for us, and they found that out of the 32 squads at the 2014 World Cup, there were indeed 16 teams with at least one shared birthday. And in fact, five of those teams had two pairs of shared birthdays.
If you’re wondering whether the sample size may be too small to be reliable, the researchers also looked back at the teams from the 2010 Football World Cup, giving us 64 teams to look at in total. This time, out of the 32 teams in 2010, 15 of them contained at least one shared birthday, making a total of 31 out of 64 teams when we combine the data from the two years together, giving a probability of 0.484 of there being at least one shared birthday in a team. That’s pretty close to 50 percent. So, the data demonstrate the theory pretty well. What a result!
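For a rough sense of how close “pretty close” is, the sketch below compares the observed count with what the theory predicts for 64 squads of 23 players. The binomial standard deviation used here is an extra check added for illustration, not something worked out in the video, and it assumes the squads are independent of one another.

```python
from math import sqrt

squads_with_match = 16 + 15   # 2014 and 2010 World Cups combined
total_squads = 32 + 32
p_theory = 0.507297           # chance of a shared birthday among 23 people, from earlier

observed_fraction = squads_with_match / total_squads
expected_count = p_theory * total_squads
# Typical spread of the count if each squad independently matches with p_theory.
spread = sqrt(total_squads * p_theory * (1 - p_theory))

print(round(observed_fraction, 3))  # 0.484
print(round(expected_count, 1), round(spread, 1))
```

The observed 31 squads sits within one standard deviation of the expected count, which is about as good an agreement as a sample of 64 can offer.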
So, there you have it: the surprising, but perhaps not so surprising when you really think about it, solution to the birthday paradox. Why not ask your friends what their thoughts are and then impress them with your simple explanation of this intriguing problem?