Suppose you love math. And you had to choose just one proof to show someone to explain why it is that math
is beautiful. Something that can be appreciated by anyone from a wide range of backgrounds while
still capturing the spirit of progress and cleverness in math. What would you choose?
Well, after I put out a video on Feynman’s lost lecture, about why planets orbit in
ellipses, published as a guest video over on MinutePhysics, someone on Reddit asked about why the definition of an ellipse given in that video,
the classic two thumbtacks and a piece of string construction, is the same as the
definition involving slicing a cone. Well, my friend, you’ve asked about one of my all-time favorite proofs, a lovely bit of 3D geometry which, despite requiring almost no background, still
captures the spirit of mathematical inventiveness.
For context and to make sure we’re all on the same page, there are at least three
main ways that you could define an ellipse geometrically. One is to say you take a circle and you just stretch it out in one dimension. For example, maybe you consider all of the points as (𝑥, 𝑦) coordinates, and what you do is multiply just the 𝑥-coordinate by some special factor for all the
points. Another is the classic two thumbtacks and a piece of string construction, where you loop a string around two thumbtacks stuck into a piece of paper, pull it
taut with a pencil, and then trace around, keeping the string taut the whole time.
What you’re drawing by doing this is the set of all points such that the sum of the distances from each pencil point to the two thumbtack points
stays constant. Those two thumbtack points are each called a focus of the ellipse. And what we’re saying here is that this constant-focal-sum property can be used to
define what an ellipse even is. And yet another way to define an ellipse is to slice a cone with a plane at an
angle, an angle that’s smaller than the slope of the cone itself. The curve of points where this plane and the cone intersect forms an ellipse, which is why you’ll often hear an ellipse referred to as a conic section.
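If you’d like to poke at the first two definitions numerically before seeing any proof, here is a minimal Python sketch. It takes points from a stretched circle and measures their focal sums. The function names are made up for illustration, and the focus positions at (±c, 0) with c = √(a² − b²) are a standard fact that I’m assuming here rather than something derived above.

```python
import math

def stretched_circle_point(a, b, t):
    """Point of the unit circle at angle t after scaling x by a and y by b."""
    return (a * math.cos(t), b * math.sin(t))

def focal_sum(point, a, b):
    """Sum of distances from `point` to the two assumed foci (+c, 0) and (-c, 0)."""
    c = math.sqrt(a * a - b * b)  # assumed focus offset; requires a >= b
    x, y = point
    return math.hypot(x - c, y) + math.hypot(x + c, y)

def focal_sums(a, b, samples=12):
    """Focal sums at evenly spaced angles around the stretched circle."""
    return [
        focal_sum(stretched_circle_point(a, b, 2 * math.pi * i / samples), a, b)
        for i in range(samples)
    ]

sums = focal_sums(a=5.0, b=3.0)
print(max(sums) - min(sums))  # spread should be at floating-point noise level
print(sums[0])                # each sum should be close to 2 * a
```

This is only a spot check at a handful of points, not a proof, but it does suggest that the stretched circle and the thumbtack curve are the same shape.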
Now, of course, an ellipse is not just one curve. It’s a family of curves, ranging from a perfect circle up to something that’s
infinitely stretched. The specific shape of an ellipse is typically quantified with a number called its
eccentricity, which I sometimes just read in my head as squishification. A circle has eccentricity zero. And the more squished the ellipse is, the closer its eccentricity is to the number
one. For example, Earth’s orbit has an eccentricity of 0.0167, very low squishification, meaning it’s really close to just being a circle. Halley’s comet, meanwhile, has an orbit with eccentricity 0.9671, very high
squishification.
In the thumbtack definition of an ellipse, based on the constant sum of the distances
from each point to the two foci, this eccentricity is determined by how far apart the two thumbtacks are. Specifically, it’s the distance between the foci divided by the length of the longest
axis of the ellipse. For slicing a cone, the eccentricity is determined by the slope of the plane that you
used for the slicing. And you might justifiably ask, especially if you’re a certain Reddit user, why on
earth should these three definitions have anything to do with each other? I mean, sure, it kind of makes sense that each should produce some vaguely
oval-looking, stretched-out loop. But why should the family of curves produced by these three totally different methods
be precisely the same shapes?
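To make the thumbtack description of eccentricity a bit more concrete, here is a small Python sketch. The function names are hypothetical. It also leans on two standard facts that are assumed rather than shown here: the constant focal sum equals the length of the longest axis, and the ratio of the shortest axis to the longest is √(1 − e²).

```python
import math

def eccentricity_from_thumbtacks(focus_distance, focal_sum):
    """Distance between the foci divided by the length of the longest axis.

    Assumes the constant focal sum equals the length of the longest axis.
    """
    return focus_distance / focal_sum

def axis_ratio(eccentricity):
    """Shortest axis divided by longest axis (assumed formula sqrt(1 - e^2))."""
    return math.sqrt(1.0 - eccentricity ** 2)

print(eccentricity_from_thumbtacks(8.0, 10.0))  # thumbtacks 8 apart, focal sum 10
print(axis_ratio(0.0167))  # Earth's orbit: axes nearly equal
print(axis_ratio(0.9671))  # Halley's comet: short axis far smaller than long axis
```

Under those assumptions, Earth’s orbit comes out with axes that differ by only a tiny fraction of a percent, while the short axis of Halley’s orbit is roughly a quarter of the long one.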
In particular, when I was younger, I remember feeling really surprised that slicing a
cone would produce such a symmetric shape. You might think that the part of the intersection farther down would kind of bulge
out and produce a more lopsided-egg shape. But nope, the intersection curve is an ellipse, the same evidently symmetric curve
you’d get by just stretching a circle or tracing around two thumbtacks. But of course, math is all about proofs. So how do you give an airtight demonstration that these three families of curves are
actually the same? For example, let’s focus our attention on just one of these equivalences, namely that slicing a cone will give us a curve that could also be drawn using the thumbtack construction.
What you need to show here is that there exist two thumbtack points somewhere inside
that slicing plane such that the sum of the distances from any point of the intersection curve to those
two points remains constant, no matter where you are on that intersection curve. I first saw the trick to showing why this is true in Paul Lockhart’s magnificent
book, Measurement, which I would highly recommend to anyone young or old who needs a reminder of the
fact that math is a form of art. The stroke of genius comes in the very first step, which is to introduce two spheres
into this picture, one above the plane and one below it, each one of them sized just right so as to be tangent to the cone along a circle of points and tangent to the plane at
just a single point. Why you would think to do this, of all things, is a tricky question to answer, and
one that we’ll turn back to.
Right now, let’s just say that you have a particularly playful mind that loves
engaging with how different geometric objects all fit together. But once these spheres are sitting here, I actually bet that you could prove the target
result yourself. Here, I’ll help you step through it. But at any point, if you feel inspired, please do pause and just try to carry on
without me. First off, these spheres have introduced two special points inside the curve, the
points where they’re tangent to the plane. So a reasonable guess might be that these two tangency points are the focus points. That means that you’re gonna wanna draw lines from these foci to some point along the
ellipse. And ultimately, the goal is to understand what the sum of the lengths of those two
lines is. Or, at the very least, to understand why that sum doesn’t depend on where you are
along the ellipse.
Keep in mind, what makes these lines special is that each one does not simply touch
one of the spheres. It’s actually tangent to that sphere at the point where it touches. And in general, for any math problem, you want to use the defining features of all of
the objects involved. Another example here is what even defines the spheres. It’s not just the fact that they’re tangent to the plane, but that they’re also tangent to the cone, each one at some circle of tangency
points. So you’re gonna need to use those two circles of tangency points in some way. But how exactly? One thing you might do is just draw a line straight from the top circle down to the
bottom one along the cone.
And there’s something about doing this that feels vaguely reminiscent of the
constant-sum thumbtack property and hence promising. You see, it passes through the ellipse. And so, by snipping that line at the point where it crosses the ellipse, you can
think of it as the sum of two line segments, each one hitting the same point on the ellipse. And you can do this through various different points of the ellipse, depending on
where you are around the cone, always getting two line segments with a constant sum, namely, whatever the straight-line distance from the top circle to the bottom circle
is. So you see what I mean about it being vaguely analogous to the thumbtack
property, in that every point of the ellipse gives us two distances whose sum is a
constant. Granted, these lengths are not to the focal points. They’re to the big and the little circle. But maybe that leads you to making the following conjecture. The distance from a given point on this ellipse, this intersection curve, straight
down to the big circle is, you conjecture, equal to the distance to the point where
that big sphere is tangent to the plane, our first proposed focus point. Likewise, perhaps the distance from that point on the ellipse to the small circle is
equal to the distance from that point to the second proposed focus point where the
small sphere touches the plane.
So is that true? Well, yes. Here, let’s give a name to that point that we have on the ellipse: 𝑄. The key is that the line from 𝑄 to the first proposed focus is tangent to the big
sphere. And the line from 𝑄 straight down along the cone is also tangent to the big
sphere. Here, let’s look at a different picture for some clarity. If you have multiple lines drawn from a common point to a sphere, all of which are
tangent to that sphere, you can probably see just from the symmetry of the setup that all of these lines have
to have the same length. And in fact, I encourage you to try proving this yourself or to, otherwise, pause and
ponder on the proof that I’ve left on the screen.
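If you’d rather check the equal-tangent-lengths claim numerically first, here is a rough Python sketch. The helper names are invented for illustration. The way the tangent points are built (a circle of points on the sphere, facing the outside point) is my own assumed parametrization, not the proof shown on screen.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in a))

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tangent_points(center, radius, q, samples=8):
    """Sample points T on the sphere where the line from q to T is tangent.

    Assumes q lies strictly outside the sphere.
    """
    d = norm(sub(q, center))
    u = tuple(x / d for x in sub(q, center))      # unit vector from center to q
    helper = (1.0, 0.0, 0.0) if abs(u[0]) < 0.9 else (0.0, 1.0, 0.0)
    w1 = cross(u, helper)
    w1 = tuple(x / norm(w1) for x in w1)          # unit vector perpendicular to u
    w2 = cross(u, w1)                             # completes the orthonormal frame
    offset = radius ** 2 / d                      # how far the tangency circle sits toward q
    rho = radius * math.sqrt(d * d - radius * radius) / d  # radius of that circle
    points = []
    for i in range(samples):
        ang = 2 * math.pi * i / samples
        points.append(tuple(
            center[j] + offset * u[j]
            + rho * (math.cos(ang) * w1[j] + math.sin(ang) * w2[j])
            for j in range(3)
        ))
    return points

def tangent_lengths(center, radius, q, samples=8):
    """Lengths of the tangent segments from q to each sampled tangent point."""
    return [norm(sub(q, t)) for t in tangent_points(center, radius, q, samples)]

print(tangent_lengths((0.0, 0.0, 0.0), 1.0, (3.0, 4.0, 0.0)))  # entries should all agree
```

Every length should come out the same, which is what the Pythagorean theorem predicts: each tangent segment is a leg of a right triangle whose other leg is a radius and whose hypotenuse runs from the outside point to the center.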
But looking back at our cone-slicing setup, your conjecture would be correct. The two lines extending from the point 𝑄 on the ellipse tangent to the big sphere
have the same length. Similarly, the line from 𝑄 to the second proposed focus point is tangent to the
little sphere, as is the line from 𝑄 straight up along the cone. So those two also have the same length. And so, the sum of the distances from 𝑄 to the two proposed focus points is the same
as the straight line distance from the little circle down to the big circle along
the cone, passing through 𝑄. And clearly, that does not depend on which point of the ellipse you chose for 𝑄. Bada boom, bada bing, slicing the cone is the same as the thumbtack construction, since the resulting curve has the constant-focal-sum property.
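For anyone who wants to see the whole argument hold up with actual numbers, here is a hedged Python sketch of the two-sphere setup. The coordinate system, the parameters `k`, `m`, and `h`, and the function name are all assumptions of mine for illustration: the cone is x² + y² = (k·z)² with its apex at the origin, and the slicing plane is z = m·x + h with 0 ≤ m < 1/k.

```python
import math

def dandelin_check(k=0.5, m=0.8, h=1.0, samples=12):
    """Compare focal sums on a sliced cone with the slant distance between circles.

    Assumed setup (hypothetical coordinates):
      cone:  x^2 + y^2 = (k z)^2, z > 0, apex at the origin
      plane: z = m x + h, with h > 0 and 0 <= m < 1/k so the cut is a closed curve
    Returns (list of focal sums, slant distance between the tangency circles).
    """
    s = k / math.sqrt(1 + k * k)   # sine of the cone's half-angle
    t = s * math.sqrt(1 + m * m)   # less than 1 whenever m < 1/k
    # A sphere centered at (0, 0, c) with radius c * s is tangent to the cone.
    # Tangency to the plane additionally requires |h - c| = c * t.
    centers = [h / (1 + t), h / (1 - t)]   # small sphere, then big sphere
    foci = []
    for c in centers:
        shift = (h - c) / (1 + m * m)
        # Foot of the perpendicular from the center to the plane.
        foci.append((-shift * m, 0.0, c + shift))
    sums = []
    for i in range(samples):
        phi = 2 * math.pi * i / samples
        z = h / (1 - m * k * math.cos(phi))   # height where this cone line meets the plane
        q = (k * z * math.cos(phi), k * z * math.sin(phi), z)
        sums.append(sum(math.dist(q, f) for f in foci))
    # Each tangency circle sits at slant distance c / sqrt(1 + k^2) from the apex.
    slant = (centers[1] - centers[0]) / math.sqrt(1 + k * k)
    return sums, slant

sums, slant = dandelin_check()
print(max(abs(v - slant) for v in sums))  # should be at floating-point noise level
```

If the argument above is right, every focal sum should match the slant distance between the two circles, whichever point of the curve you sample.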
Now, this proof was first found by Germinal-G-Germinal-Germa-, who cares, Dandelin, a
guy named Dandelin in 1822. So these two spheres are sometimes called Dandelin spheres. You can also use the same trick to show why slicing a cylinder at an angle will give
you an ellipse. And if you’re comfortable with the claim that projecting a shape from one plane onto
another tilted plane has the effect of simply stretching out that shape, this also shows why the definition of an ellipse as a stretched circle is the same as
the other two. More homework. So why do I think that this proof is such a good representative for math itself? Why, if you had to show just one thing to explain to a non-math enthusiast why you
love the subject, would this be a good candidate?
The obvious reason is that it’s substantive and beautiful without requiring too much
background. But more than that, it reflects a common feature of math: sometimes there is no
single, most fundamental way of defining something, and what matters more is showing equivalences. And even more than that, the proof itself involves one key moment of creative
construction, adding the two spheres, while most of it leaves room for a nice, systematic, and principled approach. And this kind of creative construction is, I think, one of the most thought-provoking
aspects of mathematical discovery. And you might understandably ask where such an idea comes from.
In fact, talking about this particular proof, here’s what Paul Lockhart says in
Measurement: “How do people come up with such ingenious arguments? It’s the same as the way people come up with Madame Bovary or Mona Lisa. I have no idea how it happens. I only know that when it happens to me, I feel very fortunate.” I agree, but I do think we can say at least a little something more about this. While it is ingenious, we can perhaps decompose how someone who has immersed
themselves in a number of other geometry problems might be particularly primed to
think of adding these specific spheres.
First, a common tactic in geometry is to relate one length to another. And in this problem, you know from the outset that being able to relate these two
lengths to the foci to some other two lengths, especially ones that line up, would be
a useful thing, even though, at the start, you don’t even know where the focus points are. And even if it’s not clear exactly how you do that, throwing spheres into the picture
isn’t all that crazy.
Again, if you’ve built up a relationship with geometry through practice, you would be well-acquainted with how relating one length to another happens all the
time when circles and spheres are in the picture, because it cuts straight to the defining feature of what it even means to be a circle
or a sphere. And this is obviously a very specific example. But the point I wanna make is that you can often view glimpses of ingeniousness not as inexplicable miracles but as the residue of experience. And when you do, the idea of genius goes from being mesmerizing to instead being