### Video Transcript

Let’s talk about space-filling curves. They are incredibly fun to animate, and they also give a chance to address a certain philosophical question. Math often deals with infinite quantities, sometimes so intimately that the very substance of a result only actually makes sense in an infinite world. So the question is, how can these results ever be useful in a finite context?

As with all philosophizing, this is best left to discuss until after we look at the concrete case in the real math. So I’ll begin by laying down an application of something called a “Hilbert Curve,” followed by a description of some of its origins in infinite math.

Let’s say that you wanted to write some software that would enable people to see with their ears. It would take in data from a camera and then somehow translate that into a sound in a meaningful way. The thought here is that brains are plastic enough to build an intuition from sight, even when the raw data is scrambled into a different format. I’ve left a few links in the description to studies to this effect. To make initial experiments easier, you might start by treating incoming images with a low resolution, maybe 256 by 256 pixels. And to make my own animation efforts easier, let’s represent one of these images with a square grid, each cell corresponding with a pixel.

One approach to this sound-to-sight software would be to find a nice way to associate each one of those pixels with a unique frequency value. Then when that pixel is brighter, the frequency associated with it would be played louder; and if the pixel were darker, the frequency would be quiet. Listening to all of the pixels all at once would then sound like a bunch of frequencies overlaid on top of one another with dominant frequencies corresponding to the brighter regions of the image, sounding like some cacophonous mess until your brain learns to make sense out of the information that it contains.

Let’s temporarily set aside worries about whether or not this would actually work. And instead, think about what function, from pixel space down to frequency space, gives this software the best chance of working. The tricky part is that pixel space is two-dimensional, but frequency space is one-dimensional. You could, of course, try doing this with a random mapping. After all, we’re hoping that people’s brains make sense out of pretty wonky data anyway. However, it might be nice to leverage some of the intuitions that a given human brain already has about sound. For example, if we think in terms of the reverse mapping from frequency space to pixel space, frequencies that are close together should stay close together in the pixel space. That way, even if an ear has a hard time distinguishing between two nearby frequencies, they will at least refer to the same basic point in space.

To ensure that this happens, you could first describe a way to weave a line through each one of these pixels. Then if you fix each pixel to a spot on that line and unravel the whole thread to make it straight, you could interpret this line as a frequency space, and you have an association from pixels to frequencies, which is what we want. Now, one weaving method would be to just go one row at a time, alternating between left and right as it moves up that pixel space. This is like a well-played game of Snake, so let’s call this a “snake curve.”

When you tell your mathematician friend about this idea, she says: “Why not use a Hilbert curve?” When you ask her what that is, she stumbles for a moment. “So, it’s not a curve, but an infinite family of curves,” she starts. “Well, no, it actually is just one thing, but I need to tell you about a certain infinite family first.” She pulls out a piece of paper and starts explaining what she decides to call “Pseudo-Hilbert Curves,” for lack of a better term.

For an order one Pseudo-Hilbert curve, you divide a square into a two-by-two grid and connect the center of the lower-left quadrant to the center of the upper-left, over to the upper-right and then down in the lower-right.

For an order two Pseudo-Hilbert curve, rather than just going straight from one quadrant to another, we let our curve do a little work to fill out each quadrant while it does so. Specifically, subdivide the square further into a four-by-four grid, and we have our curve trace out a miniature order one Pseudo-Hilbert curve inside each quadrant before it moves on to the next. If we left those many curves oriented as they are, going from the end of the mini curve in the lower-left to the start of the mini curve in the upper-left requires this kind of awkward jump. Same deal with going from the upper-right down to the lower-right. So we flip the curves in the lower-left and the lower-right to make that connection shorter.

Going from an order two to an order three Pseudo-Hilbert curve is completely similar. You divide the square into an eight-by-eight grid, then you put an order two Pseudo-Hilbert curve in each quadrant, flip the lower-left and the lower-right ones appropriately, and then connect them all tip-to-tail. And the pattern continues like that for higher orders.

For the 256 by 256 pixel array, your mathematician friend explains, you would use an order eight Pseudo-Hilbert curve. And remember, defining a curve which weaves through each pixel is basically the same as defining a function from pixel space to frequency space, since you’re associating each pixel with a point on the line.

Now this is nice as a piece of art, but why would these Pseudo-Hilbert curves be any better than just the snake curve? Well, here’s one very important reason. Imagine that you go through with this project, you integrate the software with real cameras and headphones, and it works. People around the world are using the device, building intuitions for vision via sound. What happens when you issue an upgrade that increases the resolution of the camera’s image from 256 by 256 to 512 by 512? If you are using the snake curve, as you transition to a higher resolution, many points on this frequency line would have to go to completely different parts of pixel space.

For example, let’s follow a point about halfway along the frequency line. It’ll end up about halfway up the pixel space no matter the resolution, but where it is, left-to-right, can differ wildly as you go from 256 up to 512. This means everyone using your software would have to relearn how to see with their ears, since the original intuitions of which points in space correspond to which frequencies no longer apply.

However, with the Hilbert curve technique, as you increase the order of a Pseudo-Hilbert curve, a given point on the line moves around less and less. It just approaches a more specific point in space. That way, you’ve given your users the opportunity to fine-tune their intuitions rather than relearning everything.

So for this sound-to-sight application, the Hilbert curve approach turns out to be exactly what you want. In fact, given how specific the goal is, it seems almost weirdly perfect. So you go back to your mathematician friend, and you ask her, “Hey, what was the original motivation for defining one of these curves?” She explains that near the end of the 19th century, in the aftershock of Cantor’s research on infinity, mathematicians were interested in finding a mapping from a one-dimensional line into two-dimensional space, in such a way that the line runs through every single point in space. To be clear, we’re not talking about a finite, bounded grid of pixels, like we had in the sound-to-sight application. This is continuous space, which is very infinite, and the goal is to have a line which is as thin as thin can be and has zero area somehow pass through every single one of those infinitely many points that makes up the infinite area of space.

Before 1890, a lot of people thought that this was obviously impossible. But then Peano discovered the first of what would come to be known as “space-filling curves.” In 1891, Hilbert followed with his own slightly simpler space-filling curve. Technically, each one fills a square, not all of space, but I’ll show you later on how, once you filled a square with a line, filling all of space is not an issue. By the way, mathematicians use this word “curve” to talk about a line running through space, even if it has jagged corners. This is especially counterintuitive terminology in the context of a space-filling curve, which in a sense consists of nothing but sharp corners. A better name might be something like “space-filling fractal,” which some people do use. But hey, it’s math, so we live with bad terminology!

None of the Pseudo-Hilbert curves that you use to fill pixelated space would count as a space-filling curve, no matter how high the order. Just zoom in on one of the pixels. When this pixel is considered part of infinite continuous space, the curve only passes through the tiniest, zero area slice of it. And it certainly doesn’t hit every single point. Your mathematician friend explains that an actual, bona fide Hilbert curve is not any one of these Pseudo-Hilbert curves; instead, it’s the limit of all of them.

Now defining this limit rigorously is delicate. You first have to formalize what these curves are as functions. Specifically, functions which take in a single number somewhere between zero and one as their input and output a pair of numbers. This input can be thought of as a point on the line and the output can be thought of as coordinates in 2D space. But in principle, it’s just an association between a single number and pairs of numbers.

For example, an order two Pseudo-Hilbert curve as a function maps the input 0.3 to the output pair 0.125, 0.75. An order three Pseudo-Hilbert curve maps that same input 0.3 to the output pair 0.0758, 0.6875.

Now the core property that makes a function like this a curve, and not just any old association between single numbers and pairs of numbers, is continuity. The intuition behind continuity is that you don’t want the output of your function to suddenly jump at any point when the input is only changing smoothly. And the way that this is made rigorous in math is, well it’s actually pretty clever. And fully appreciating space-filling curves really does require digesting the formal idea of continuity. So it’s definitely worth taking a brief sidestep to go over it now.

Consider a particular input point, 𝐴, and the corresponding output of the function, 𝐵. Draw a circle centered around 𝐴 and look at all of the other input points inside that circle, and then consider where the function takes all of those points in the output space. Now draw the smallest circle that you can, centered at 𝐵, that contains those outputs. Different choices for the size of the input circle might result in larger or smaller circles in the output space. But notice what happens when we go through this process at a point where the function jumps. Drawing a circle around 𝐴 and looking at the input points within the circle, seeing where they map and drawing the smallest possible circle centered at 𝐵 containing those points, no matter how small the circle around 𝐴, the corresponding circle around 𝐵 just cannot be smaller than that jump.

For this reason, we say that the function is “discontinuous at 𝐴” if there’s any lower bound on the size of this circle that surrounds 𝐵. If, on the other hand, the circle around 𝐵 can be made as small as you want with sufficiently small choices for circles around 𝐴, you say that the function is “continuous at 𝐴.” The function as a whole is called “continuous” if it’s continuous at every possible input point.

Now with that as a formal definition of curves, you’re ready to define what an actual Hilbert curve is. Doing this relies on a wonderful property of the sequence of Pseudo-Hilbert curves, which should feel familiar. Take a given input point like 0.3, and apply each successive Pseudo-Hilbert curve function to this point. The corresponding outputs, as we increase the order of the curve, approaches some particular point in space. It doesn’t matter what input you start with. This sequence of outputs you get by applying each successive Pseudo-Hilbert curve to this point always stabilizes and approaches some particular point in 2D space. This is absolutely not true, by the way, for snake curves. Or, for that matter, most sequences of curves filling pixelated space of higher and higher resolutions. The outputs associated with the given input become wildly erratic as the resolution increases, always jumping from left to right and never actually approaching anything.

Now because of this property, we can define a Hilbert curve function like this. For a given input value between zero and one, consider the sequence of points in 2D space you get by applying each successive Pseudo-Hilbert curve function at that point. The output of the Hilbert curve function evaluated on this input is just defined to be the limit of those points. Because the sequence of Pseudo-Hilbert curve outputs always converges no matter what input you start with, this is actually a well-defined function in a way that it never could have been, had we used snake curves.

Now, I’m not gonna go through the proof for why this gives a space-filling curve, but let’s at least see what needs to be proved. First, verify that this is a well-defined function by proving that the outputs of the Pseudo-Hilbert curve functions really do converge the way that I’m telling you they do. Second, show that this function gives a curve, meaning it’s continuous. Third and most important, show that it fills space, in the sense that every single point in the unit square is an output of this function. I really do encourage anyone watching this to take a stab at each one of these. Spoiler alert: all three of these facts turn out to be true!

You can extend this to a curve that fills all of space just by tiling space with squares, and then chaining a bunch of Hilbert curves together in a spiraling pattern of tiles, connecting the end of one tile to the start of a new tile with an added little stretch of line if you need to. You can think of the first tile as coming from the interval from zero to one, the second tile as coming from the interval from one to two, and so on. So the entire positive real number line is getting mapped into all of 2D space. Take a moment to let that fact sink in. A line, the platonic form of thinness itself, can wander through an infinitely-extending and richly dense space and hit every single point.

Notice the core property that made Pseudo-Hilbert curves useful in both the sound-to-sight application and in their infinite origins, is that points on the curve move around less and less as you increase the order of those curves. While translating images to sound, this was useful because it means upgrading to higher resolutions doesn’t require retraining your senses all over again. For mathematicians interested in filling continuous space, this property is what ensured that talking about the limit of a sequence of curves was actually a meaningful thing to do. And this connection here between the infinite and the finite worlds seems to be more of a rule in math than an exception.

Another example that several astute commenters on the “Inventing Math” video pointed out, is the connection between the divergent sum of all powers of two and the way that the number negative one is represented in computers with bits. It’s not so much that the infinite result is directly useful. But instead, the same patterns and constructs that are used to define and prove infinite facts have finite analogs, and these finite analogs are directly useful. But the connection is often deeper than a mere analogy. Many theorems about an infinite object are often equivalent to some theorem regarding a family of finite objects.

For example, if during your sound-to-sight project, you were to sit down and really formalize what it means for your curve to stay stable as you increase camera resolution, you would end up effectively writing the definition of what it means for a sequence of curves to have a limit. In fact, a statement about some infinite object, whether that’s a sequence or a fractal, can usually be viewed as a particularly clean way to encapsulate a truth about a family of finite objects.

The lesson to take away here is that even when a statement seems very far removed from reality, you should always be willing to look under the hood and at the nuts and bolts of what’s really being said. Who knows? You might find insights for representing numbers from divergent sums or for seeing with your ears from filling space.