This is a video I’ve been excited to make for a while now. The story here braids together prime numbers, complex numbers, and 𝜋 in a very
pleasing trio. Quite often in modern math, especially that which flirts with the Riemann zeta
function, these three seemingly unrelated objects show up in unison. And I wanna give you a little peek at one instance where this happens, one of the few
that doesn’t require too heavy a technical background.
That’s not to say that this is easy. In fact, this is probably one of the most intricate videos I’ve ever done, but the
culmination is worth it. What we’ll end up with is a formula for 𝜋, a certain alternating infinite sum. This formula is actually written on the mug that I’m drinking coffee from right now
as I write this. And a fun, but almost certainly apocryphal story is that the beauty of this formula
is what inspired Leibniz to quit being a lawyer and instead pursue math.
Now, whenever you see 𝜋 show up in math, there’s always gonna be a circle hiding
somewhere, sometimes very sneakily. So the goal here is not just to discover this sum, but to really understand the
circle hiding behind it. You see, there is another way that you can prove the same result that you and I are
gonna spend some meaningful time building up to, but with just a few lines of
calculus. And this is one of those proofs that leaves you thinking, “okay, I suppose that’s
true,” but not really getting a sense for why or for where the hidden circle is. On the path that you and I will take though, what you’ll see is that the fundamental
truth behind this sum and the circle that it hides is a certain regularity in the
way that prime numbers behave when you put them inside the complex numbers.
To start the story, imagine yourself with nothing more than a pencil, some paper, and
a desire to find a formula for computing 𝜋. There are countless ways that you could approach this. But as a broad outline for the plotline here, you’ll start by asking how many lattice
points of the plane sit inside a big circle. And then that question is gonna lead to asking about how to express numbers as the
sum of two squares, which in turn is gonna lead us to factoring integers inside the
complex plane. From there, we’ll bring in the special function named chi, which is gonna give us a
formula for 𝜋 that at first seems to involve a crazy complicated pattern dependent
on the distribution of primes. But a slight shift in perspective is gonna simplify it dramatically and expose the
ultimate gold nugget. It’s a lot, but good math takes time. And we’ll take it step by step.
When I say lattice point, what I mean is a point 𝑎, 𝑏 on the plane, where 𝑎 and 𝑏
are both integers, a spot where the grid lines here cross. If you draw a circle centered at the origin, let’s say with radius 10, how many
lattice points would you guess are inside that circle? Well, there’s one lattice point for each unit of area. So the answer should be approximately equal to the area of the circle, 𝜋𝑟 squared,
which in this case is 𝜋 times 10 squared. And if it was a really big circle, like radius 1000000, you would expect this to be a
much more accurate estimate, in the sense that the percent error between the
estimate 𝜋𝑟 squared and the actual count of lattice points should get smaller.
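If you want to see this estimate improve for yourself, here is a minimal Python sketch that counts lattice points directly and compares the count to 𝜋𝑟 squared. The function name `lattice_points_in_circle` is made up for illustration, and points exactly on the boundary are counted as inside, which is an assumption the narration doesn't spell out.

```python
import math

def lattice_points_in_circle(radius: int) -> int:
    """Count integer pairs (a, b) with a^2 + b^2 <= radius^2."""
    r_squared = radius * radius
    count = 0
    for a in range(-radius, radius + 1):
        # For this a, b can range from -max_b to +max_b.
        max_b = math.isqrt(r_squared - a * a)
        count += 2 * max_b + 1
    return count

for radius in (10, 100, 1000):
    count = lattice_points_in_circle(radius)
    estimate = math.pi * radius ** 2
    relative_error = abs(count - estimate) / estimate
    print(radius, count, round(estimate, 1), relative_error)
```

For radius 10 the count is 317, against an area of about 314.2, and the relative error shrinks as the radius grows.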
What we’re gonna try to do is find a second way to answer the same question: how many
lattice points are inside the circle. Because that can lead to another way to express the area of a circle and hence
another way to express 𝜋. And so, you play and you wonder. And maybe, especially if you just watched a certain calculus video, you might try
looking through every possible ring that a lattice point could sit on.
Now if you think about it, for each one of these lattice points, 𝑎, 𝑏, its distance
from the origin is the square root of 𝑎 squared plus 𝑏 squared. And since 𝑎 and 𝑏 are both integers, 𝑎 squared plus 𝑏 squared is also some
integer. So you only have to consider rings whose radii are the square roots of some whole
number. A radius of zero just gives you that single origin point. If you look at the radius one, that hits four different lattice points. Radius square root of two, well that also hits four lattice points. A radius square root of three doesn’t actually hit anything. Square root of four, again, hits four lattice points. A radius square root of five actually hits eight lattice points. And what we want is a systematic way to count how many lattice points are on a given
one of these rings, a given distance from the origin, and then to tally them all up.
And if you pause and try this for a moment, what you’ll find is that the pattern
seems really chaotic, just very hard to find order under here. And that’s a good sign that some very interesting math is about to come into
play. In fact, as you’ll see, this pattern is rooted in the distribution of primes. As an example, let’s look at the ring with radius square root of 25. It hits the point five, zero since five squared plus zero squared is 25. It also hits four, three since four squared plus three squared gives 25. And likewise, it hits three, four and also zero, five. And what’s really happening here is that you’re counting how many pairs of integers
𝑎, 𝑏 have the property that 𝑎 squared plus 𝑏 squared equals 25. And looking at the circle, it looks like there’s a total of 12 of them.
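Counting the points on a single ring is easy to brute-force, even if the pattern itself looks chaotic. The sketch below lists every integer pair whose squares add up to a given number; `points_on_ring` is a hypothetical helper name, not something from the video.

```python
import math

def points_on_ring(n: int) -> list[tuple[int, int]]:
    """All integer pairs (a, b) with a^2 + b^2 == n."""
    points = []
    limit = math.isqrt(n)
    for a in range(-limit, limit + 1):
        b_squared = n - a * a
        b = math.isqrt(b_squared)
        if b * b == b_squared:
            points.append((a, b))
            if b != 0:
                # The mirror image below the horizontal axis.
                points.append((a, -b))
    return points

for n in (0, 1, 2, 3, 4, 5, 25):
    print(n, len(points_on_ring(n)))
```

This reproduces the counts mentioned so far: 1, 4, 4, 0, 4, 8 for the first six rings, and 12 for the ring with radius square root of 25.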
As another example, take a look at the ring with radius square root 11. It doesn’t hit any lattice points. And that corresponds to the fact that you cannot find two integers whose squares add
up to 11. Try it! Now, many times in math, when you see a question that has to do with the 2D plane, it
can be surprisingly fruitful to just ask what it looks like when you think of this
plane as the set of all complex numbers. So instead of thinking of this lattice point here as the pair of integer coordinates
three, four, instead think of it as the single complex number three plus four
𝑖. That way, another way to think about the sum of the squares of its coordinates, three
squared plus four squared, is to multiply this number by three minus four 𝑖. This is called its complex conjugate. It’s what you get by reflecting over the real axis, replacing 𝑖 with negative 𝑖.
And this might seem like a strange step if you don’t have much of a history with
complex numbers. But describing this distance as a product can be unexpectedly useful. It turns our question into a factoring problem, which is ultimately why patterns
among prime numbers are gonna come into play. Algebraically, this relation is straightforward enough to verify. You get a three squared and then the three times minus four 𝑖 cancels out with the
four 𝑖 times three. And then, you have four 𝑖 times negative four 𝑖, which because 𝑖 squared is negative one
becomes plus four squared.
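The same check can be done in a couple of lines of Python, using its built-in complex type. This is only a quick sanity check, and `conjugate_product` is an illustrative name that expands the product by hand with integers.

```python
def conjugate_product(a: int, b: int) -> tuple[int, int]:
    """Expand (a + bi)(a - bi) and return (real part, imaginary part)."""
    real = a * a - (b * -b)      # a*a minus (b)(-b)(i^2 = -1 flips the sign)
    imag = a * -b + b * a        # the two cross terms cancel
    return (real, imag)

print(conjugate_product(3, 4))   # (25, 0)

z = complex(3, 4)
print(z * z.conjugate())         # (25+0j)
print(abs(z) ** 2)               # 25.0
```

Whatever integers you plug in, the imaginary part comes out to zero and the real part is the sum of the two squares.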
This is also quite nice to see geometrically. And if you’re a little rusty with how complex multiplication works, I do have another
video that goes more into detail about why complex multiplication looks the way that
it does. The way that you might think about a case like this is that the number three plus
four 𝑖 has a magnitude of five and some angle off of the horizontal. And what it means to multiply it by three minus four 𝑖 is to rotate by that same
angle in the opposite direction, putting it on the positive real axis, and then to
stretch out by a factor of five, which in this case lands you on the output 25, the
square of the magnitude.
The collection of all of these lattice points 𝑎 plus 𝑏𝑖, where 𝑎 and 𝑏 are
integers, has a special name. They’re called the Gaussian integers, named after Carl Friedrich Gauss. Geometrically, you’ll still be asking the same question. How many of these lattice points, Gaussian integers, are a given distance away from
the origin, like square root of 25? But we’ll be phrasing it in a slightly more algebraic way. How many Gaussian integers have the property that multiplying by their complex
conjugate gives you 25? This might seem needlessly complex. But it’s the key to understanding the seemingly random pattern for how many lattice
points are a given distance from the origin. To see why, we first need to understand how numbers factor inside the Gaussian integers.
As a refresher, among ordinary integers, every number can be factored as some unique
collection of prime numbers. For example, 2250 can be factored as two times three squared times five cubed. And there is no other collection of prime numbers that also multiplies to make 2250,
unless you let negative numbers into the picture, in which case you could just make
some of the primes in this factorization negative. So really, within the integers, factorization is not perfectly unique. It’s almost unique with the exception that you can get a different-looking product by
multiplying some of the factors by negative one.
The reason I bring that up is that factoring works very similarly inside the Gaussian
integers. Some numbers, like five, can be factored into smaller Gaussian integers, which in
this case is two plus 𝑖 times two minus 𝑖. This Gaussian integer here, two plus 𝑖, cannot be factored into anything smaller, so
we call it a Gaussian prime. Again, this factorization is almost unique. But this time, not only can you multiply each one of those factors by negative one to
get a factorization that looks different. You can also be extra sneaky and multiply one of these factors by 𝑖 and then the
other one by negative 𝑖. This will give you a different way to factor five into two distinct Gaussian primes.
But other than the things that you can get by multiplying some of these factors by
negative one or 𝑖 or negative 𝑖, factorization within the Gaussian integers is
unique. And if you can figure out how ordinary prime numbers factor inside the Gaussian
integers, that will be enough to tell us how any other natural number factors inside
these Gaussian integers. And so here, we pull in a crucial and pretty surprising fact. Prime numbers that are one above a multiple of four, like five or 13 or 17, these
guys can always be factored into exactly two distinct Gaussian primes. This corresponds with the fact that rings with a radius equal to the square root of
one of these prime numbers always hit some lattice points. In fact, they always hit exactly eight lattice points, as you’ll see in just a moment.
On the other hand, prime numbers that are three above a multiple of four, like three
or seven or 11, these guys cannot be factored further inside the Gaussian
integers. Not only are they primes in the normal numbers, but they are also Gaussian primes,
unsplittable, even when 𝑖 is in the picture. And this corresponds with the fact that a ring whose radius is the square root of one
of those primes will never hit any lattice points.
And this pattern right here is the regularity within prime numbers that we’re gonna
ultimately exploit. And in a later video, I might explain why on earth this is true. Why a prime number’s remainder when divided by four has anything to do with whether
or not it factors inside the Gaussian integers, or, said differently, whether or not
it can be expressed as the sum of two squares? But here and now, we’ll just have to take it as a given. The prime number two, by the way, is a little special because it does factor. You can write it as one plus 𝑖 times one minus 𝑖. But these two Gaussian primes are a 90-degree rotation away from each other. So you can multiply one of them by 𝑖 to get the other. And that fact is gonna make us wanna treat the prime number two a little bit
differently for where all of this stuff is going. So just keep that in the back of your mind.
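Since we’re taking this fact as a given, it’s at least reassuring to test it. Here is a small Python sketch that checks, for each small prime, whether it can be written as a sum of two squares and what its remainder is when divided by four. The helper names `is_prime` and `is_sum_of_two_squares` are just illustrative, and this is a brute-force check, not a proof.

```python
import math

def is_prime(n: int) -> bool:
    """Trial division; fine for small n."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def is_sum_of_two_squares(n: int) -> bool:
    """True if n == a^2 + b^2 for some integers a, b."""
    for a in range(math.isqrt(n) + 1):
        b_squared = n - a * a
        b = math.isqrt(b_squared)
        if b * b == b_squared:
            return True
    return False

for p in range(2, 60):
    if is_prime(p):
        print(p, "remainder mod 4:", p % 4, "sum of two squares:", is_sum_of_two_squares(p))
```

In this range, every prime that is one above a multiple of four shows up as a sum of two squares, every prime that is three above a multiple of four does not, and two is the odd one out.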
Remember, our goal here is to count how many lattice points are a given distance away
from the origin. And doing this systematically for all distances, square root of 𝑛, can lead us to a
formula for 𝜋. And counting the number of lattice points with a given magnitude, like square root of
25, is the same as asking how many Gaussian integers have the special property that
multiplying them by their complex conjugate gives you 25. So here’s the recipe for finding all Gaussian integers that have this property.
Step one, factor 25, which inside the ordinary integers looks like five squared. But since five factors even further as two plus 𝑖 times two minus 𝑖, 25 breaks down
as these four Gaussian primes. Step two, organize these into two different columns, with conjugate pairs sitting
right next to each other. Then, once you do that, multiply what’s in each column. And you’ll come out with two different Gaussian integers on the bottom. And because everything on the right is a conjugate with everything on the left, what
comes out is gonna be a complex conjugate pair, which multiplies to 25. Picking an arbitrary standard, let’s say that the product from that left column is
the output of our recipe.
Now, notice there are three choices for how you can divvy up the primes that can
affect this output. Pictured right here, both copies of two plus 𝑖 are in the left column. And that gives us the product three plus four 𝑖. You could also have chosen to have only one copy of two plus 𝑖 in this left column,
in which case the product would be five. Or, you could have both copies of two plus 𝑖 in that right column, in which case the
output of our recipe would’ve been three minus four 𝑖. And those three possible outputs are all different lattice points on a circle with
radius square root of 25. But why does this recipe not yet capture all 12 of the lattice points?
Remember how I mentioned that a factorization into Gaussian primes can look different
if you multiply some of them by 𝑖 or negative one, negative 𝑖. In this case, you could write the factorization of 25 differently, maybe splitting up
one of those fives as negative one plus two 𝑖 times negative one minus two 𝑖. And if you do that, running through the same recipe, it can affect the result. You’ll get a different product out of that left column. But the only effect that this is gonna have is to multiply that total output by 𝑖 or
negative one or negative 𝑖. So as a final step for our recipe, let’s say you have to make one of four
choices. Take that product from the left column and choose to multiply it by one, 𝑖, negative
one, or negative 𝑖, corresponding to rotations that are some multiple of 90
degrees. That will account for all 12 different ways of constructing a Gaussian integer whose
product with its own conjugate is 25.
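To make the recipe concrete, here is a rough Python sketch of it for powers of five. Gaussian integers are stored as pairs of ordinary integers, and the names `gmul`, `gpow`, and `recipe_outputs` are all hypothetical, chosen just for this illustration.

```python
def gmul(z: tuple[int, int], w: tuple[int, int]) -> tuple[int, int]:
    """Multiply two Gaussian integers given as (real, imaginary) pairs."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def gpow(z: tuple[int, int], k: int) -> tuple[int, int]:
    """Raise a Gaussian integer to a non-negative integer power."""
    result = (1, 0)
    for _ in range(k):
        result = gmul(result, z)
    return result

P = (2, 1)        # 2 + i
P_BAR = (2, -1)   # 2 - i, its conjugate
UNITS = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # 1, i, -1, -i

def recipe_outputs(exponent: int) -> set[tuple[int, int]]:
    """Run the recipe on 5**exponent and collect every possible output."""
    outputs = set()
    for k in range(exponent + 1):
        # Left column: k copies of 2 + i and (exponent - k) copies of 2 - i.
        left = gmul(gpow(P, k), gpow(P_BAR, exponent - k))
        for unit in UNITS:
            outputs.add(gmul(left, unit))
    return outputs

points = recipe_outputs(2)   # the number 25 is 5 squared
print(sorted(points))
print(len(points))
```

Running it with exponent two gives 12 distinct points, each of which satisfies 𝑎 squared plus 𝑏 squared equals 25.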
This process is a little complicated. So I think the best way to get a feel for it is to just try it out with more
examples. Let’s say, instead, we were looking at 125, which is five cubed. In that case, we would have four different choices for how to divvy up the prime
conjugate pairs into these two columns. You can either have zero copies of two plus 𝑖 in the left column, one copy in there,
two copies in there, or all three of them in that left column. Those four choices multiplied by the final four choices of multiplying the product
from the left column by one or by 𝑖 or negative one or negative 𝑖 would suggest
that there are a total of 16 lattice points at distance square root of 125 away from the origin.
And indeed, if you draw that circle out and count, what you’ll find is that it hits
exactly 16 lattice points. But what if you introduce a factor like three, which doesn’t break down as the
product of two conjugate Gaussian primes? Well, that really mucks up the whole system. When you’re divvying up the primes between the two columns, there’s no way that you
can split up this three. No matter where you put it, it leaves the columns imbalanced. And what that means is that when you take the product of all of the numbers in each
column, you’re not gonna end up with the conjugate pair. So for a number like this, three times five cubed, which is 375, there’s actually no
lattice point that you’ll hit. No Gaussian integer whose product with its own conjugate gives you 375.
However, if you introduce a second factor of three, then you have an option. You can throw one three in the left column and the other three in the right
column. Since three is its own complex conjugate, this leaves things balanced, in the sense
that the products of the left and right columns will indeed be a complex conjugate
pair. But it doesn’t add any new options. There’s still gonna be a total of four choices for how to divvy up those factors of
five, multiplied by the final four choices of multiplying by one, 𝑖, negative one,
or negative 𝑖. So just like the square root of 125 circle, this guy is also gonna end up hitting
exactly 16 lattice points.
Let’s just sum up where we are. When you’re counting up how many lattice points lie on a circle with a radius square
root of 𝑁, the first step is to factor 𝑁. And for prime numbers like five or 13 or 17 which factor further into a complex
conjugate pair of Gaussian primes, the number of choices they give you will always
be one more than the exponent that shows up with that factor. On the other hand, for prime factors like three or seven or 11, which are already
Gaussian primes and cannot be split, if they show up with an even power, you have
one and only one choice with what to do with them. But if it’s an odd exponent, you’re screwed. And you just have zero choices. And always, no matter what, you have those final four choices at the end.
By the way, I do think that this process right here is the most complicated part of
the video. It took me a couple times to think through that, “Yes! this is a valid way to count
lattice points.” So don’t be shy if you wanna pause and scribble things down to get a feel for it.
The one last thing to mention about this recipe is how factors of two affect the
count. If your number is even, then that factor of two breaks down as one plus 𝑖 times one
minus 𝑖. So you can divvy up that complex conjugate pair between the two columns. And at first, it might look like this doubles your options, depending on how you
choose to place those two Gaussian primes between the columns. However, since multiplying one of these guys by 𝑖 gives you the other one, when you
swap them between the columns, the effect that that has on the output from the left
column is to just multiply it by 𝑖 or by negative 𝑖. So that’s actually redundant with the final step, where we take the product of this
left column and choose to multiply it either by one, 𝑖, negative one, or negative
𝑖. What this means is that a factor of two, or any power of two, doesn’t actually change
the count at all. It doesn’t hurt, but it doesn’t help.
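Putting the whole counting rule together, including the fact that twos don’t matter, might look something like the following Python sketch. The helpers `factorize` and `ring_count` are invented names, and the factorization is plain trial division, which is only sensible for small numbers.

```python
def factorize(n: int) -> dict[int, int]:
    """Prime factorization by trial division, as {prime: exponent}."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def ring_count(n: int) -> int:
    """Lattice points at distance sqrt(n) from the origin, for n >= 1."""
    count = 4   # the final choice of multiplying by 1, i, -1, or -i
    for prime, exponent in factorize(n).items():
        if prime == 2:
            continue                  # twos neither help nor hurt
        if prime % 4 == 1:
            count *= exponent + 1     # splittable prime: exponent + 1 choices
        elif exponent % 2 == 1:
            return 0                  # unsplittable prime to an odd power
    return count

for n in (25, 125, 375, 1125, 5, 10):
    print(n, ring_count(n))
```

This gives 12 for 25, 16 for 125, zero for 375, and 16 again once a second factor of three is included, matching the examples above.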
For example, a circle with radius square root of five hits eight lattice points. And if you grow this radius to square root of 10, then you also hit eight lattice
points. And square root of 20 also hits eight lattice points, as does square root of 40. Factors of two just don’t make a difference. Now what’s about to happen is number theory at its best. We have this complicated recipe telling us how many lattice points sit on a circle
with radius square root of 𝑁. And it depends on the prime factorization of 𝑁. To turn this into something simpler, something we can actually deal with, we’re gonna
exploit the regularity of primes that those which are one above a multiple of four
split into distinct Gaussian prime factors, while those that are three above a
multiple of four cannot be split.
To do this, let’s introduce a simple function, one which I’ll label with the Greek
letter 𝜒. For inputs that are one above a multiple of four, the output of 𝜒 is just one. If it takes in an input three above a multiple of four, then the output of 𝜒 is
negative one. And then on all even numbers, it gives zero. So if you evaluate 𝜒 on the natural numbers, it gives this very nice cyclic pattern:
one, zero, negative one, zero, and then repeat indefinitely. And this cyclic function 𝜒 has a very special property. It’s what’s called a multiplicative function. If you evaluate it on two different numbers and multiply the results, like 𝜒 of
three times 𝜒 of five, it’s the same as if you evaluate 𝜒 on the product of those
two numbers, in this case 𝜒 of 15. Likewise, 𝜒 of five times 𝜒 of five is equal to 𝜒 of 25. And no matter what two natural numbers you put in there, this property will hold. Go ahead; try it if you want.
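If you’d rather let a computer do the trying, here is a minimal Python version of 𝜒 along with a brute-force check of the multiplicative property on small inputs. The name `chi` is just a spelled-out stand-in for the Greek letter.

```python
def chi(n: int) -> int:
    """1 if n is one above a multiple of four, -1 if three above, 0 if even."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

print([chi(n) for n in range(1, 13)])   # the cyclic pattern 1, 0, -1, 0, ...

# Check chi(a) * chi(b) == chi(a * b) on all small pairs.
all_pairs_ok = all(
    chi(a) * chi(b) == chi(a * b)
    for a in range(1, 101)
    for b in range(1, 101)
)
print(all_pairs_ok)
```

A finite check like this isn’t a proof, but it does cover every pair of natural numbers up to 100.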
So for our central question of counting lattice points in this way that involves
factoring a number, what I’m gonna do is write down the number of choices we have,
but using 𝜒 in what at first seems like a much more complicated way. But this has the benefit of treating all prime factors equally. For each prime power, like five cubed, what you write down is 𝜒 of one plus 𝜒 of
five plus 𝜒 of five squared plus 𝜒 of five cubed. You add up the value of 𝜒 on all the powers of this prime up to the one that shows
up inside the factorization.
In this case, since five is one above a multiple of four, all of these are just
one. So this sum comes out to be four, which reflects the fact that a factor of five cubed
gives you four options for how to divvy up the two Gaussian prime factors between
the columns. For a factor like three to the fourth, what you write down looks totally similar: 𝜒
of one plus 𝜒 of three, on and on up to 𝜒 of three to the fourth. But in this case, since 𝜒 of three is negative one, this sum oscillates. It goes one minus one plus one minus one plus one. And if it’s an even power, like four in this case, the total sum comes out to be one,
which encapsulates the fact that there is only one choice for what to do with those
unsplittable threes. But if it’s an odd power, that sum comes out to zero, indicating that you’re
screwed. You can’t place that unsplittable three.
When you do this for a power of two, what it looks like is one plus zero plus zero
plus zero, on and on, since 𝜒 is always zero on even numbers. And this reflects the fact that a factor of two doesn’t help and it doesn’t hurt. You always have just one option for what to do with it. And as always, we keep a four in front to indicate that final choice of multiplying
by one, 𝑖, negative one, or negative 𝑖. We’re getting close to the culmination now. Things are starting to look organized. So take a moment, pause and ponder. Make sure everything feels good up to this point.
Take the number 45 as an example. This guy factors as three squared times five. So the expression for the total number of lattice points is four, times the sum 𝜒 of one plus
𝜒 of three plus 𝜒 of three squared, times the sum 𝜒 of one plus 𝜒 of five. You can think about this as four times the one choice for what to do with the threes
times two choices for how to divvy up the Gaussian prime factors of five. It might seem like expanding out this sum is really complicated, because it involves
all possible combinations of these prime factors, and it kind of is. However, because 𝜒 is multiplicative, each one of those combinations corresponds to
a divisor of 45. I mean, in this case, what we get is four times the whole sum 𝜒 of one plus 𝜒 of three plus 𝜒 of
five plus 𝜒 of nine plus 𝜒 of 15 plus 𝜒 of 45.
And what you’ll notice is that this covers every number that divides evenly into 45,
once and only once. And it works like this for any number. There’s nothing special about 45. And that to me is pretty interesting, and I think wholly unexpected. This question of counting the number of lattice points at distance square root of 𝑁
away from the origin involves adding up the value of this relatively simple function
over all the divisors of 𝑁.
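This divisor formula is easy to test against a direct count. The Python sketch below does that for a range of values; `count_via_divisors` and `count_by_brute_force` are illustrative names, and the divisor search is deliberately naive.

```python
import math

def chi(n: int) -> int:
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def count_via_divisors(n: int) -> int:
    """Four times the sum of chi over every divisor of n."""
    return 4 * sum(chi(d) for d in range(1, n + 1) if n % d == 0)

def count_by_brute_force(n: int) -> int:
    """Directly count integer pairs (a, b) with a^2 + b^2 == n."""
    limit = math.isqrt(n)
    return sum(
        1
        for a in range(-limit, limit + 1)
        for b in range(-limit, limit + 1)
        if a * a + b * b == n
    )

print(count_via_divisors(45), count_by_brute_force(45))
print(all(count_via_divisors(n) == count_by_brute_force(n) for n in range(1, 301)))
```

For 45, both approaches give eight: four times the one choice for the threes times the two choices for the five.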
To bring it all together, remember why we’re doing this. The total number of lattice points inside a big circle with radius 𝑅 should be about
𝜋 times 𝑅 squared. But on the other hand, we can count those same lattice points by looking through all
of the numbers 𝑁 between zero and 𝑅 squared and counting how many lattice points
are at distance square root of 𝑁 from the origin. Let’s go ahead and just ignore that origin dot with radius zero. It doesn’t really follow the pattern of the rest. And one little dot isn’t gonna make a difference as we let 𝑅 grow towards infinity.
Now, from all of this Gaussian integer and factoring and 𝜒 function stuff that we’ve
been doing, the answer for each 𝑁 looks like adding up the value of 𝜒 on every
divisor of 𝑁 and then multiplying by four. And for now, let’s just take that four and put it in the corner and remember to bring
it back later. At first, adding up the values for each one of these rows seems super random,
right? I mean numbers with a lot of factors have a lot of divisors, whereas prime numbers
will always only have two divisors.
So it initially seems like you would have to have perfect knowledge of the
distribution of primes to get anything useful out of this. But if, instead, you organize these into columns, the puzzle starts to fit
together. How many numbers between one and 𝑅 squared have one as a divisor? Well, all of them. So our sum should include 𝑅 squared times 𝜒 of one. How many of them have two as a divisor? Well, about half of them, so that would account for about 𝑅 squared over two times
𝜒 of two. About a third of these rows have 𝜒 of three. So we can put in 𝑅 squared divided by three times 𝜒 of three.
And keep in mind we’re being approximate, since two or three might not perfectly divide
𝑅 squared. But as 𝑅 grows towards infinity, this approximation will get better. And when you keep going like this, you get a pretty organized expression for the
total number of lattice points. And if you factor out that 𝑅 squared and then bring back the four that needs to be
multiplied in, what it means is that the total number of lattice points inside this
big circle is approximately four times 𝑅 squared times this sum. And because 𝜒 is zero on every even number and it oscillates between one and
negative one for odd numbers, this sum looks like one minus one-third plus a fifth
minus one-seventh, and so on.
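The column-by-column reorganization can be checked numerically too. In the Python sketch below, “about 𝑅 squared over 𝑑” is replaced by the exact number of multiples of 𝑑 up to 𝑅 squared, which is 𝑅 squared divided by 𝑑 and rounded down; with that substitution the column sum should match the direct count exactly, once the origin is left out. The function names are made up for this illustration.

```python
import math

def chi(n: int) -> int:
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def count_by_columns(radius: int) -> int:
    """4 * sum over d of chi(d) * (how many N in 1..R^2 have d as a divisor)."""
    r_squared = radius * radius
    return 4 * sum(chi(d) * (r_squared // d) for d in range(1, r_squared + 1))

def count_directly(radius: int) -> int:
    """Lattice points with 0 < a^2 + b^2 <= R^2, so the origin is excluded."""
    r_squared = radius * radius
    total = sum(
        2 * math.isqrt(r_squared - a * a) + 1
        for a in range(-radius, radius + 1)
    )
    return total - 1

for radius in (5, 10, 50):
    columns = count_by_columns(radius)
    print(radius, columns, count_directly(radius), columns / radius ** 2)
```

The last printed value is the count divided by 𝑅 squared, which is the quantity that the alternating sum, multiplied by four, is approximating.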
And this is exactly what we wanted! What we have here is an alternate expression for the total number of lattice points
inside a big circle, which we know should be around 𝜋 times 𝑅 squared. And the bigger 𝑅 is, the more accurate both of these estimates are. So the percent error between the left-hand side and the right-hand side can get
arbitrarily small. So divide out by that 𝑅 squared, and this gives us an infinite sum that should
converge to 𝜋. And keep in mind, I just think this is really cool. The reason that this sum came out to be so simple, requiring relatively low
information to describe, ultimately stems from the regular pattern in how prime
numbers factor inside the Gaussian integers.
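As a final sanity check, you can add up the series numerically and watch it creep towards 𝜋. This is a minimal Python sketch; `four_times_chi_series` is a made-up name, and the convergence is slow, so don’t expect many digits.

```python
import math

def chi(n: int) -> int:
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def four_times_chi_series(limit: int) -> float:
    """4 * (chi(1)/1 + chi(2)/2 + ... + chi(limit)/limit)."""
    return 4 * sum(chi(n) / n for n in range(1, limit + 1))

for limit in (10, 1000, 200000):
    approximation = four_times_chi_series(limit)
    print(limit, approximation, abs(approximation - math.pi))
```

Because the terms alternate in sign and shrink, the error after stopping is smaller than four times the first term left out, which is why each additional digit needs roughly ten times as many terms.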
If you’re curious, there are two main branches of number theory: algebraic number
theory and analytic number theory. Very loosely speaking, the former deals with new number systems, things like these
Gaussian integers that you and I looked at and a lot more. And the latter deals with things like the Riemann zeta function or its cousins,
called 𝐿 functions, which involve multiplicative functions like this central
character 𝜒 from our story. And the path that we just walked is a little glimpse at where those two fields
intersect. And both of these are pretty heavy-duty fields with a lot of active research and
unsolved problems. So if all this feels like something that takes time to mentally digest, like there’s
more patterns to be uncovered and understood, it’s because it is and there are!