Pop Video: Pi Hiding in Prime Regularities

Grant Sanderson • 3Blue1Brown • Boclips

Pi Hiding in Prime Regularities

29:25

Video Transcript

This is a video I’ve been excited to make for a while now. The story here braids together prime numbers, complex numbers, and 𝜋 in a very pleasing trio. Quite often in modern math, especially that which flirts with the Riemann zeta function, these three seemingly unrelated objects show up in unison. And I wanna give you a little peek at one instance where this happens, one of the few that doesn’t require too heavy a technical background.

That’s not to say that this is easy. In fact, this is probably one of the most intricate videos I’ve ever done, but the culmination is worth it. What we’ll end up with is a formula for 𝜋, a certain alternating infinite sum. This formula is actually written on the mug that I’m drinking coffee from right now as I write this. And a fun, but almost certainly apocryphal story is that the beauty of this formula is what inspired Leibniz to quit being a lawyer and instead pursue math.

Now, whenever you see 𝜋 show up in math, there’s always gonna be a circle hiding somewhere, sometimes very sneakily. So the goal here is not just to discover this sum, but to really understand the circle hiding behind it. You see, there is another way that you can prove the same result that you and I are gonna spend some meaningful time building up to, but with just a few lines of calculus. And this is one of those proofs that leaves you thinking, “okay, I suppose that’s true,” but not really getting a sense for why or for where the hidden circle is. On the path that you and I will take though, what you’ll see is that the fundamental truth behind this sum and the circle that it hides is a certain regularity in the way that prime numbers behave when you put them inside the complex numbers.

To start the story, imagine yourself with nothing more than a pencil, some paper, and a desire to find a formula for computing 𝜋. There are countless ways that you could approach this. But as a broad outline for the plotline here, you’ll start by asking how many lattice points of the plane sit inside a big circle. And then that question is gonna lead to asking about how to express numbers as the sum of two squares, which in turn is gonna lead us to factoring integers inside the complex plane. From there, we’ll bring in the special function named chi, which is gonna give us a formula for 𝜋 that at first seems to involve a crazy complicated pattern dependent on the distribution of primes. But a slight shift in perspective is gonna simplify it dramatically and expose the ultimate gold nugget. It’s a lot, but good math takes time. And we’ll take it step by step.

When I say lattice point, what I mean is a point 𝑎, 𝑏 on the plane, where 𝑎 and 𝑏 are both integers, a spot where the grid lines here cross. If you draw a circle centered at the origin, let’s say with radius 10, how many lattice points would you guess are inside that circle? Well, there’s one lattice point for each unit of area. So the answer should be approximately equal to the area of the circle, 𝜋𝑟 squared, which in this case is 𝜋 times 10 squared. And if it was a really big circle, like radius 1000000, you would expect this to be a much more accurate estimate, in the sense that the percent error between the estimate 𝜋𝑟 squared and the actual count of lattice points should get smaller.

What we’re gonna try to do is find a second way to answer the same question: how many lattice points are inside the circle. Because that can lead to another way to express the area of a circle and hence another way to express 𝜋. And so, you play and you wonder. And maybe, especially if you just watched a certain calculus video, you might try looking through every possible ring that a lattice point could sit on.

Now if you think about it, for each one of these lattice points, 𝑎, 𝑏, its distance from the origin is the square root of 𝑎 squared plus 𝑏 squared. And since 𝑎 and 𝑏 are both integers, 𝑎 squared plus 𝑏 squared is also some integer. So you only have to consider rings whose radii are the square roots of some whole number. A radius of zero just gives you that single origin point. If you look at the radius one, that hits four different lattice points. Radius square root of two, well that also hits four lattice points. A radius square root of three doesn’t actually hit anything. Square root of four, again, hits four lattice points. A radius square root of five actually hits eight lattice points. And what we want is a systematic way to count how many lattice points are on a given one of these rings, a given distance from the origin, and then to tally them all up.

And if you pause and try this for a moment, what you’ll find is that the pattern seems really chaotic, just very hard to find order under here. And that’s a good sign that some very interesting math is about to come into play. In fact, as you’ll see, this pattern is rooted in the distribution of primes. As an example, let’s look at the ring with radius square root of 25. It hits the point five, zero since five squared plus zero squared is 25. It also hits four, three since four squared plus three squared gives 25. And likewise, it hits three, four and also zero, five. And what’s really happening here is that you’re counting how many pairs of integers 𝑎, 𝑏 have the property that 𝑎 squared plus 𝑏 squared equals 25. And looking at the circle, it looks like there’s a total of 12 of them.

As another example, take a look at the ring with radius square root 11. It doesn’t hit any lattice points. And that corresponds to the fact that you cannot find two integers whose squares add up to 11. Try it! Now, many times in math, when you see a question that has to do with the 2D plane, it can be surprisingly fruitful to just ask what it looks like when you think of this plane as the set of all complex numbers. So instead of thinking of this lattice point here as the pair of integer coordinates three, four, instead think of it as the single complex number three plus four 𝑖. That way, another way to think about the sum of the squares of its coordinates, three squared plus four squared, is to multiply this number by three minus four 𝑖. This is called its complex conjugate. It’s what you get by reflecting over the real axis, replacing 𝑖 with negative 𝑖.

And this might seem like a strange step if you don’t have much of a history with complex numbers. But describing this distance as a product can be unexpectedly useful. It turns our question into a factoring problem, which is ultimately why patterns among prime numbers are gonna come into play. Algebraically, this relation is straightforward enough to verify. You get a three squared and then the three times minus four 𝑖 cancels out with the four 𝑖 times three. And then, you have negative four 𝑖 squared, which because 𝑖 squared is negative one becomes plus four squared.

This is also quite nice to see geometrically. And if you’re a little rusty with how complex multiplication works, I do have another video that goes more into detail about why complex multiplication looks the way that it does. The way that you might think about a case like this is that the number three plus four 𝑖 has a magnitude of five and some angle off of the horizontal. And what it means to multiply it by three minus four 𝑖 is to rotate by that same angle in the opposite direction, putting it on the positive real axis, and then to stretch out by a factor of five, which in this case lands you on the output 25, the square of the magnitude.

The collection of all of these lattice points 𝑎 plus 𝑏𝑖, where 𝑎 and 𝑏 are integers, has a special name. They’re called the Gaussian integers, named after Martin Sheen. Geometrically, you’ll still be asking the same question. How many of these lattice points, Gaussian integers, are a given distance away from the origin, like square root of 25? But we’ll be phrasing it in a slightly more algebraic way. How many Gaussian integers have the property that multiplying by their complex conjugate gives you 25? This might seem needlessly complex. But it’s the key to understanding the seemingly random pattern for how many lattice points are a given distance from the origin. To see why, we first need to understand how numbers factor inside the Gaussian integers.

As a refresher, among ordinary integers, every number can be factored as some unique collection of prime numbers. For example, 2250 can be factored as two times three squared times five cubed. And there is no other collection of prime numbers that also multiplies to make 2250, unless you let negative numbers into the picture, in which case you could just make some of the primes in this factorization negative. So really, within the integers, factorization is not perfectly unique. It’s almost unique with the exception that you can get a different-looking product by multiplying some of the factors by negative one.

The reason I bring that up is that factoring works very similarly inside the Gaussian integers. Some numbers, like five, can be factored into smaller Gaussian integers, which in this case is two plus 𝑖 times two minus 𝑖. This Gaussian integer here, two plus 𝑖, cannot be factored into anything smaller, so we call it a Gaussian prime. Again, this factorization is almost unique. But this time, not only can you multiply each one of those factors by negative one to get a factorization that looks different. You can also be extra sneaky and multiply one of these factors by 𝑖 and then the other one by negative 𝑖. This will give you a different way to factor five into two distinct Gaussian primes.

But other than the things that you can get by multiplying some of these factors by negative one or 𝑖 or negative 𝑖, factorization within the Gaussian integers is unique. And if you can figure out how ordinary prime numbers factor inside the Gaussian integers, that will be enough to tell us how any other natural number factors inside these Gaussian integers. And so here, we pull in a crucial and pretty surprising fact. Prime numbers that are one above a multiple of four, like five or 13 or 17, these guys can always be factored into exactly two distinct Gaussian primes. This corresponds with the fact that rings with a radius equal to the square root of one of these prime numbers always hit some lattice points. In fact, they always hit exactly eight lattice points, as you’ll see in just a moment.

On the other hand, prime numbers that are three above a multiple of four, like three or seven or 11, these guys cannot be factored further inside the Gaussian integers. Not only are they primes in the normal numbers, but they are also Gaussian primes, unsplittable, even when 𝑖 is in the picture. And this corresponds with the fact that a ring whose radius is the square root of one of those primes will never hit any lattice points.

And this pattern right here is the regularity within prime numbers that we’re gonna ultimately explain. And in a later video, I might explain why on earth this is true. Why a prime number’s remainder when divided by four has anything to do with whether or not it factors inside the Gaussian integers, or, said differently, whether or not it can be expressed as the sum of two squares? But here and now, we’ll just have to take it as a given. The prime number two, by the way, is a little special because it does factor. You can write it as one plus 𝑖 times one minus 𝑖. But these two Gaussian primes are a 90-degree rotation away from each other. So you can multiply one of them by 𝑖 to get the other. And that fact is gonna make us wanna treat the prime number two a little bit differently for where all of this stuff is going. So just keep that in the back of your mind.

Remember, our goal here is to count how many lattice points are a given distance away from the origin. And doing this systematically for all distances, square root of 𝑛, can lead us to a formula for 𝜋. And counting the number of lattice points with a given magnitude, like square root of 25, is the same as asking how many Gaussian integers have the special property that multiplying them by their complex conjugate gives you 25. So here’s the recipe for finding all Gaussian integers that have this property.

Step one, factor 25, which inside the ordinary integers looks like five squared. But since five factors even further as two plus 𝑖 times two minus 𝑖, 25 breaks down as these four Gaussian primes. Step two, organize these into two different columns, with conjugate pairs sitting right next to each other. Then, once you do that, multiply what’s in each column. And you’ll come out with two different Gaussian integers on the bottom. And because everything on the right is a conjugate with everything on the left, what comes out is gonna be a complex conjugate pair, which multiplies to 25. Picking an arbitrary standard, let’s say that the product from that left column is the output of our recipe.

Now, notice there are three choices for how you can divvy up the primes that can affect this output. Pictured right here, both copies of two plus 𝑖 are in the left column. And that gives us the product three plus four 𝑖. You could also have chosen to have only one copy of two plus 𝑖 in this left column, in which case the product would be five. Or, you could have both copies of two plus 𝑖 in that right column, in which case the output of our recipe would’ve been three minus four 𝑖. And those three possible outputs are all different lattice points on a circle with radius square root of 25. But why does this recipe not yet capture all 12 of the lattice points?

Remember how I mentioned that a factorization into Gaussian primes can look different if you multiply some of them by 𝑖 or negative one, negative 𝑖. In this case, you could write the factorization of 25 differently, maybe splitting up one of those fives as negative one plus two 𝑖 times negative one minus two 𝑖. And if you do that, running through the same recipe, it can affect the result. You’ll get a different product out of that left column. But the only effect that this is gonna have is to multiply that total output by 𝑖 or negative one or negative 𝑖. So as a final step for our recipe, let’s say you have to make one of four choices. Take that product from the left column and choose to multiply it by one, 𝑖, negative one, or negative 𝑖, corresponding to rotations that are some multiple of 90 degrees. That will account for all 12 different ways of constructing a Gaussian integer whose product with its own conjugate is 25.

This process is a little complicated. So I think the best way to get a feel for it is to just try it out with more examples. Let’s say, instead, we were looking at 125, which is five cubed. In that case, we would have four different choices for how to divvy up the prime conjugate pairs into these two columns. You can either have zero copies of two plus 𝑖 in the left column, one copy in there, two copies in there, or all three of them in that left column. Those four choices multiplied by the final four choices of multiplying the product from the left column by one or by 𝑖 or negative one or negative 𝑖 would suggest that there are a total of 16 lattice points at distance square root of 125 away from the origin.

And indeed, if you draw that circle out and count, what you’ll find is that it hits exactly 16 lattice points. But what if you introduce a factor like three, which doesn’t break down as the product of two conjugate Gaussian primes? Well, that really mucks up the whole system. When you’re divvying up the primes between the two columns, there’s no way that you can split up this three. No matter where you put it, it leaves the columns imbalanced. And what that means is that when you take the product of all of the numbers in each column, you’re not gonna end up with the conjugate pair. So for a number like this, three times five cubed, which is 375, there’s actually no lattice point that you’ll hit. No Gaussian integer whose product with its own conjugate gives you 375.

However, if you introduce a second factor of three, then you have an option. You can throw one three in the left column and the other three in the right column. Since three is its own complex conjugate, this leaves things balanced, in the sense that the products of the left and right columns will indeed be a complex conjugate pair. But it doesn’t add any new options. There’s still gonna be a total of four choices for how to divvy up those factors of five, multiplied by the final four choices of multiplying by one, 𝑖, negative one, or negative 𝑖. So just like the square root of 125 circle, this guy is also gonna end up hitting exactly 16 lattice points.

Let’s just sum up where we are. When you’re counting up how many lattice points lie on a circle with a radius square root of 𝑁, the first step is to factor 𝑁. And for prime numbers like five or 13 or 17 which factor further into a complex conjugate pair of Gaussian primes, the number of choices they give you will always be one more than the exponent that shows up with that factor. On the other hand, for prime factors like three or seven or 11, which are already Gaussian primes and cannot be split, if they show up with an even power, you have one and only one choice with what to do with them. But if it’s an odd exponent, you’re screwed. And you just have zero choices. And always, no matter what, you have those final four choices at the end.

By the way, I do think that this process right here is the most complicated part of the video. It took me a couple times to think through that, “Yes! this is a valid way to count lattice points.” So don’t be shy if you wanna pause and scribble things down to get a feel for it.

The one last thing to mention about this recipe is how factors of two affect the count. If your number is even, then that factor of two breaks down as one plus 𝑖 times one minus 𝑖. So you can divvy up that complex conjugate pair between the two columns. And at first, it might look like this doubles your options, depending on how you choose to place those two Gaussian primes between the columns. However, since multiplying one of these guys by 𝑖 gives you the other one, when you swap them between the columns, the effect that that has on the output from the left column is to just multiply it by 𝑖 or by negative 𝑖. So that’s actually redundant with the final step, where we take the product of this left column and choose to multiply it either by one, 𝑖, negative one, or negative 𝑖. What this means is that a factor of two, or any power of two, doesn’t actually change the count at all. It doesn’t hurt, but it doesn’t help.

For example, a circle with radius square root of five hits eight lattice points. And if you grow this radius to square root of 10, then you also hit eight lattice points. And square root of 20 also hits eight lattice points, as does square root of 40. Factors of two just don’t make a difference. Now what’s about to happen is number theory at its best. We have this complicated recipe telling us how many lattice points sit on a circle with radius square root of 𝑁. And it depends on the prime factorization of 𝑁. To turn this into something simpler, something we can actually deal with, we’re gonna exploit the regularity of primes that those which are one above a multiple of four split into distinct Gaussian prime factors, while those that are three above a multiple of four cannot be split.

To do this, let’s introduce a simple function, one which I’ll label with the Greek letter 𝜒. For inputs that are one above a multiple of four, the output of 𝜒 is just one. If it takes in an input three above a multiple of four, then the output of 𝜒 is negative one. And then on all even numbers, it gives zero. So if you evaluate 𝜒 on the natural numbers, it gives this very nice cyclic pattern: one, zero, negative one, zero, and then repeat indefinitely. And this cyclic function 𝜒 has a very special property. It’s what’s called a multiplicative function. If you evaluate it on two different numbers and multiply the results, like 𝜒 of three times 𝜒 of five, it’s the same as if you evaluate 𝜒 on the product of those two numbers, in this case 𝜒 of 15. Likewise, 𝜒 of five times 𝜒 of five is equal to 𝜒 of 25. And no matter what two natural numbers you put in there, this property will hold. Go ahead; try it if you want.

So for our central question of counting lattice points in this way that involves factoring a number, what I’m gonna do is write down the number of choices we have, but using 𝜒 in what at first seems like a much more complicated way. But this has the benefit of treating all prime factors equally. For each prime power, like five cubed, what you write down is 𝜒 of one plus 𝜒 of five plus 𝜒 of five squared plus 𝜒 of five cubed. You add up the value of 𝜒 on all the powers of this prime up to the one that shows up inside the factorization.

In this case, since five is one above a multiple of four, all of these are just one. So this sum comes out to be four, which reflects the fact that a factor of five cubed gives you four options for how to divvy up the two Gaussian prime factors between the columns. For a factor like three to the fourth, what you write down looks totally similar: 𝜒 of one plus 𝜒 of three, on and on up to 𝜒 of three to the fourth. But in this case, since 𝜒 of three is negative one, this sum oscillates. It goes one minus one plus one minus one plus one. And if it’s an even power, like four in this case, the total sum comes out to be one, which encapsulates the fact that there is only one choice for what to do with those unsplittable threes. But if it’s an odd power, that sum comes out to zero, indicating that you’re screwed. You can’t place that unsplittable three.

When you do this for a power of two, what it looks like is one plus zero plus zero plus zero, on and on, since 𝜒 is always zero on even numbers. And this reflects the fact that a factor of two doesn’t help and it doesn’t hurt. You always have just one option for what to do with it. And as always, we keep a four in front to indicate that final choice of multiplying by one, 𝑖, negative one, or negative 𝑖. We’re getting close to the culmination now. Things are starting to look organized. So take a moment, pause and ponder. Make sure everything feels good up to this point.

Take the number 45 as an example. This guy factors as three squared times five. So the expression for the total number of lattice points is four times 𝜒 of one plus 𝜒 of three plus 𝜒 of three squared times 𝜒 of one plus 𝜒 of five. You can think about this as four times the one choice for what to do with the threes times two choices for how to divvy up the Gaussian prime factors of five. It might seem like expanding out this sum is really complicated, because it involves all possible combinations of these prime factors, and it kind of is. However, because 𝜒 is multiplicative, each one of those combinations corresponds to a divisor of 45. I mean, in this case, what we get is four times 𝜒 of one plus 𝜒 of three plus 𝜒 of five plus 𝜒 of nine plus 𝜒 of 15 plus 𝜒 of 45.

And what you’ll notice is that this covers every number that divides evenly into 45, once and only once. And it works like this for any number. There’s nothing special about 45. And that to me is pretty interesting, and I think wholly unexpected. This question of counting the number of lattice points at distance square root of 𝑁 away from the origin involves adding up the value of this relatively simple function over all the divisors of 𝑁.

To bring it altogether, remember why we’re doing this. The total number of lattice points inside a big circle with radius 𝑅 should be about 𝜋 times 𝑅 squared. But on the other hand, we can count those same lattice points by looking through all of the numbers 𝑁 between zero and 𝑅 squared and counting how many lattice points are at distance square root of 𝑁 from the origin. Let’s go ahead and just ignore that origin dot with radius zero. It doesn’t really follow the pattern of the rest. And one little dot isn’t gonna make a difference as we let 𝑅 grow towards infinity.

Now, from all of this Gaussian integer and factoring and 𝜒 function stuff that we’ve been doing, the answer for each 𝑁 looks like adding up the value of 𝜒 on every divisor of 𝑁 and then multiplying by four. And for now, let’s just take that four and put it in the corner and remember to bring it back later. At first, adding up the values for each one of these rows seems super random, right? I mean numbers with a lot of factors have a lot of divisors, whereas prime numbers will always only have two divisors.

So it initially seems like you would have to have perfect knowledge of the distribution of primes to get anything useful out of this. But if, instead, you organize these into columns, the puzzle starts to fit together. How many numbers between one and 𝑅 squared have one as a divisor? Well, all of them. So our sum should include 𝑅 squared times 𝜒 of one. How many of them have two as a divisor? Well, about half of them, so that would account for about 𝑅 squared over two times 𝜒 of two. About a third of these rows have 𝜒 of three. So we can put in 𝑅 squared divided by three times 𝜒 of three.

And keep in mind we’re being approximate, since 𝑅 squared might not perfectly divide two or three. But as 𝑅 grows towards infinity, this approximation will get better. And when you keep going like this, you get a pretty organized expression for the total number of lattice points. And if you factor out that 𝑅 squared and then bring back the four that needs to be multiplied in, what it means is that the total number of lattice points inside this big circle is approximately four times 𝑅 squared times this sum. And because 𝜒 is zero on every even number and it oscillates between one and negative one for odd numbers, this sum looks like one minus one-third plus a fifth minus one-seventh, and so on.

And this is exactly what we wanted! What we have here is an alternate expression for the total number of lattice points inside a big circle, which we know should be around 𝜋 times 𝑅 squared. And the bigger 𝑅 is, the more accurate both of these estimates are. So the percent error between the left-hand side and the right-hand side can get arbitrarily small. So divide out by that 𝑅 squared, and this gives us an infinite sum that should converge to 𝜋. And keep in mind, I just think this is really cool. The reason that this sum came out to be so simple, requiring relatively low information to describe, ultimately stems from the regular pattern and how prime numbers factor inside the Gaussian integers.

If you’re curious, there are two main branches of number theory: algebraic number theory and analytic number theory. Very loosely speaking, the former deals with new number systems, things like these Gaussian integers that you and I looked at and a lot more. And the latter deals with things like the Riemann zeta function or its cousins, called 𝐿 functions, which involve multiplicative functions like this central character 𝜒 from our story. And the path that we just walked is a little glimpse at where those two fields intersect. And both of these are pretty heavy-duty fields with a lot of active research and unsolved problems. So if all this feels like something that takes time to mentally digest, like there’s more patterns to be uncovered and understood, it’s because it is and there are!

Pop Video: Pi Hiding in Prime Regularities

Video Transcript

Join Nagwa Classes