### Video Transcript

Today, I wanna share with you a neat way to solve the Towers of Hanoi puzzle, just by
counting in a different number system. And surprisingly, this stuff relates to finding a curve that fills Sierpinski’s
triangle.

I learned about this from a former CS lecturer of mine. His name is Keith Schwarz. And I’ve gotta say, this man is one of the best educators that I’ve ever met. I actually recorded a bit of the conversation where he showed me this stuff. So you guys can hear some of what he described directly.

KEITH: It’s weird. I’m not normally the sort of person who likes little puzzles and games. But I just love looking at the analysis of puzzles and games. And I love just looking at mathematical patterns and — where did that come from?

In case you aren’t familiar, let’s just lay down what the Towers of Hanoi puzzle
actually is.

KEITH: So you have a collection of three pegs. And you have these discs of descending size.

You think of these disks as having a hole in the middle, so that you can fit them
onto a peg. The set I pictured here has five discs, which I’ll label zero, one, two, three,
four. But in principle, you could have as many discs as you want.

KEITH: So they all start up stacked up from biggest to smallest on one spindle. And the goal is to move the entire tower from one spindle to another. The rule is you can only move one disc at a time. And you can’t move a bigger disc on top of a smaller disk.

For example, your first move must involve moving disk zero, since any other disk has
stuff on top of it that needs to get out of the way before it can move. After that, you can move disc one. But it has to go on whatever peg doesn’t currently have disk zero. Since, otherwise, you’d be putting a bigger disk on a smaller one, which isn’t
allowed. If you’ve never seen this before, I highly encourage you to pause and pull out some
books of varying sizes and try it out for yourself. Just kinda get a feel for what the puzzle is. If it’s hard, why it’s hard. If it’s not, why it’s not, that kinda stuff.

Now, Keith showed me something truly surprising about this puzzle, which is that you
can solve it just by counting up in binary and associating the rhythm of that
counting with a certain rhythm of disc movements. For anyone unfamiliar with binary, I’m gonna take a moment to do a quick overview
here first. Actually, even if you are familiar with binary, I wanna explain it with a focus on
the rhythm of counting, which you may or may not have thought about before.

Any description of binary typically starts off with an introspection about our usual
way to represent numbers, what we call base 10, since we use 10 separate digits. Zero, one, two, three, four, five, six, seven, eight, nine. The rhythm of counting begins by walking through all 10 of these digits. Then, having run out of new digits, you express the next number, 10, with two digits:
one, zero. You say that one is in the tens place, since it’s meant to encapsulate the group of
10 that you’ve already counted up to so far, while freeing the ones place to reset
to zero.

The rhythm of counting repeats like this: counting up nine, rolling over to the tens
place, counting up nine more, rolling over to the tens place, etcetera. Until after repeating that process nine times, you roll over to a hundreds place. A digit that keeps track of how many groups of 100 you’ve hit, freeing up the other
two digits to reset to zero. In this way, the rhythm of counting is kind of self-similar. Even if you zoom out to a larger scale, the process looks like doing something,
rolling over, doing that same thing, rolling over, and repeating nine times before
an even larger rollover.

In binary, also known as base two, you limit yourself to two digits, zero and one,
commonly called bits, which is short for binary digits. The result is that when you’re counting, you have to roll over all the time. After counting zero, one, you’ve already run out of bits. So you need to roll over to a two’s place, writing one, zero and resisting every urge
in your base-10-trained brain to read this as 10. And instead understand it to mean one group of two plus zero. Then, increment up to one, one, which represents three. And already, you have to roll over again. And since there’s a one in that two’s place, that has to roll over as well, giving
you one, zero, zero, which represents one group of four plus zero groups of two plus
zero.

In the same way that digits in base 10 represent powers of 10, bits in base two
represent different powers of two. So instead of talking about a tens place, a hundreds place, a thousands place, things
like that, you talk about a two’s place, a four’s place, and an eight’s place. The rhythm of counting is now a lot faster. But that almost makes it more noticeable. Flip the last, roll over once, flip the last, roll over twice, flip the last, roll
over once, flip the last, roll over three times. Again, there’s a certain self-similarity to this pattern.

At every scale, the process is to do something, roll over, then do that same thing
again. At the small scale, say counting up to three — which is one, one in binary — this
means flip the last bit, roll over to the two’s, then flip the last bit. At a larger scale, like counting up to 15 — which is one, one, one, one in binary —
the process is to let the last three count up to seven. Roll over to the eight’s place. Then let the last three bits count up again. Counting up to 255, which is eight successive ones, this looks like letting the last
seven bits count up till they’re full, rolling over to the 128’s place. Then letting the last seven bits count up again.

Alright, so with that mini introduction, the surprising fact that Keith showed me is
that we can use this rhythm to solve the towers of Hanoi. You start by counting from zero. Whenever you’re only flipping that last bit from a zero to a one, move disc zero one
peg to the right. If it was already on the rightmost peg, you just loop it back to the first peg. If in your binary counting, you roll over once to the two’s place, meaning you flip
the last two bits, you move disc number one. “Where do you move it?”, you might ask.

Well, you have no choice. You can’t put it on top of disk zero. And there’s only one other peg. So you move it where you’re forced to move it. So after this, counting up to one, one, that involves just flipping the last bit. So you move disk zero again. Then, when your binary counting rolls over twice to the four’s place, move disc
number two. And the pattern continues like this. Flip the last, move disk zero. Flip the last two, move disc one. Flip the last, move disk zero. And here, we’re gonna have to roll over three times to the eight’s place. And that corresponds to moving disc number three.

KEITH: There’s something magical about it. Like when I first saw this like, “This can’t work.” Like I-I don’t know how this works. I don’t know why this works. Now I know, but it’s just magical when you see it. And I remember putting the other animation for this for- when I was teaching
this. And just like, you know, I know how this works. I know all the things in it. It’s still fun to just sit and just like, you know,...

GRANT: Watch it play out?

KEITH: Oh yeah.

I mean, it’s not even clear at first that this is always gonna give legal moves. For example, how do you know that every time you’re rolling over to the eight’s
place, that disc three is necessarily gonna be freed up to move?

KEITH: At the same time, the solution just immediately raised these questions. Like, where does this come from? Why does this work? And is there a better way of doing this then, having to do two to the 𝑛 minus one
steps?

It turns out, not only does this solve Towers of Hanoi. But it does it in the most efficient way possible. Understanding why this works and how it works and what the heck is going on comes
down to a certain perspective on the puzzle, what CS folk might call a recursive
perspective.

KEITH: Disc three is thinking, “Okay, two, one, and zero!” Like, “You have to get-You have to get off of me!” Like, “I can’t-I can’t really function under this much weight and pressure.” And so, just from disc three’s perspective, if you wanna figure out how is disc three
gonna get over here, somehow, I don’t care how, discs two, one, and zero have to get
to this spindle B. That’s the only way they can move. If any of these discs are on top of three, it can’t move. If any of them are in spindle C, it can’t move there. So, somehow, we have to get two, one, and zero off. Having done that, then we can move disc three over there. And then disc three says, “I’m set. You never need to move me again. Everyone else just figure out how to get here.”

KEITH: And in a sense, you now have a smaller version of the same problem. Now you’ve got discs zero, one, and two sitting on spindle B. We gotta get them to C. So the idea is that if I just focus on one disc and I think about what am I gonna
have to do to get this disc to work, I can turn my bigger problem into something
slightly smaller. And then how do I solve that? Well, it’s exactly the same thing. Disc two is gonna say, “Disc one, disc zero, you need to, you know. It’s not you; it’s me. I just need some space. Get off!”

KEITH: They need to move somewhere. Then disc two can move to where it needs to go. Then discs one and zero can do this. But the interesting point is that every single disc pretty much has the exact same
strategy. They all say, “Everybody above me, get off! And then I’m going to move. Okay, everyone, pile back on.” When you know that insight, you can code up something that will solve Towers of Hanoi
in like five or six lines of code, which probably has the highest ratio of like
intellectual investment to lines of code ever.

And if you think about it for a bit, it becomes clear that this has to be the most
efficient solution. At every step, you’re only doing what’s forced upon you. You have to get discs zero through two off before you can move disc three. And you have to move disc three. And then you have to move discs zero through two back on to it. There’s just not any room for inefficiency from this perspective. So why does counting in binary capture this algorithm?

Well, what’s going on here is that this pattern of solving a subproblem, moving a big
disk, then solving a subproblem again, is perfectly paralleled by the pattern of
binary counting. Count up some amount, roll over, count up to that same amount again. And this Towers of Hanoi algorithm and binary counting are both self-similar
processes, in the sense that if you zoom out and count up to a larger power of two,
or solve Towers of Hanoi with more discs, they both still have that same structure:
subproblem, do a thing, subproblem.

For example, at a pretty small scale, solving Towers of Hanoi for two discs — move
disc zero, move disc one, move disc zero — is reflected by counting up to three in
binary. Flip the last bit, roll over once, flip the last bit. At a slightly larger scale, solving Towers of Hanoi for three discs looks like doing
whatever it takes to solve two discs, move disc number two. Then do whatever it takes to solve two discs again. Analogously, counting up to one, one, one in binary involves counting up to three,
rolling over all three bits, then counting up three more. At all scales, both processes have this same breakdown.

KEITH: So in a sense, the reason that this binary solution works or at least an
explanation — I feel like, you know, there’s no one explanation. But, I think the most natural one is that the pattern you would use to generate these
binary numbers has exactly the same structure as the pattern you would use for
Towers of Hanoi. Which is why if you look at the bits flipping, you’re effectively reversing this
process. You’re saying what process generated these. Like if I were trying to understand how these bits were flipped to give me this
thing, you’re effectively reverse engineering the recursive algorithm for Towers of
Hanoi, which is why it works out.

That’s pretty cool, right? But it actually gets cooler! I haven’t even gotten to how this relates to Sierpinski’s triangle. And that’s exactly what I’m gonna do in the follow-on video, part two. Alright, so with that, I’ll see you guys next video. And I think you’re really gonna like where this is going.