Video Transcript
This right here is what we’re gonna build to in this video: a certain animated approach
to thinking about a super important idea from math, the Fourier transform. For anyone unfamiliar with what that is, my number one goal here is just for the
video to be an introduction to that topic. But even for those of you who are already familiar with it, I still think that
there’s something fun and enriching about seeing what all of its components actually
look like.
The central example, to start, is gonna be the classic one, decomposing frequencies
from sound. But after that, I also really wanna show a glimpse of how this idea extends well
beyond sound and frequency into many seemingly disparate areas of math and even
physics. Really, it is crazy just how ubiquitous this idea is. Let’s dive in.
This sound right here is a pure A, 440 beats per second. Meaning, if you were to measure the air pressure right next to your headphones or
your speaker as a function of time, it would oscillate up and down around its usual equilibrium in this wave, making 440
oscillations each second. A lower-pitch note, like a D, has the same structure, just fewer beats per
second. And when both of them are played at once, what do you think the resulting pressure
versus time graph looks like? Well, at any point in time, this pressure difference is gonna be the sum of what it
would be for each of those notes individually. Which, let’s face it, is kind of a complicated thing to think about.
At some points, the peaks match up with each other, resulting in a really high
pressure. At other points, they tend to cancel out. And all in all, what you get is a wave-ish pressure-versus-time graph that is not a
pure sine wave. It’s something more complicated. And as you add in other notes, the wave gets more and more complicated. But right now, all it is is a combination of four pure frequencies. So it seems needlessly complicated given the low amount of information put into
it. A microphone recording any sound just picks up on the air pressure at many different
points in time. It only sees the final sum. So our central question is gonna be: how can you take a signal like this and
decompose it into the pure frequencies that make it up? Pretty interesting, right?
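As a rough sketch of that "sum of the individual notes" idea, here is a small Python example. The frequency used for the D (293.66 beats per second), the sample rate, and the function names are assumptions made for illustration; the transcript only gives the 440 for the A.

```python
import math

A_HZ = 440.0   # the pure A from the transcript
D_HZ = 293.66  # assumption: the D below that A; the transcript gives no number

def pure_tone(freq_hz, t):
    """Pressure offset from equilibrium for a pure tone at time t (seconds)."""
    return math.cos(2 * math.pi * freq_hz * t)

def combined(t):
    """Both notes at once: the pressure offsets simply add."""
    return pure_tone(A_HZ, t) + pure_tone(D_HZ, t)

SAMPLE_RATE = 8000  # samples per second, an illustrative choice
samples = [combined(n / SAMPLE_RATE) for n in range(80)]  # 10 milliseconds

# Both peaks line up at t = 0, so the sum starts at 2; later samples partly cancel.
print(samples[0], min(samples))
```

A microphone would only ever see something like `samples`, never the two `pure_tone` pieces separately.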
Adding up those signals really mixes them all together. So pulling them back apart feels akin to unmixing multiple paint colors that have all
been stirred up together. The general strategy is gonna be to build for ourselves a mathematical machine that
treats signals with a given frequency differently from how it treats other
signals. To start, consider simply taking a pure signal, say with a lowly three beats per
second, so that we can plot it easily. And let’s limit ourselves to looking at a finite portion of this graph. In this case, the portion between zero seconds and 4.5 seconds. The key idea is gonna be to take this graph and sort of wrap it up around a
circle.
Concretely, here’s what I mean by that. Imagine a little rotating vector where, at each point in time, its length is equal to the
height of our graph for that time. So, high points of the graph correspond to a greater distance from the origin. And low points end up closer to the origin. And right now, I’m drawing it in such a way that moving forward two seconds in time
corresponds to a single rotation around the circle. Our little vector drawing this wound-up graph is rotating at half a cycle per
second. So, this is important. There are two different frequencies at play here. There’s the frequency of our signal, which goes up and down, three times per
second. And then, separately, there’s the frequency with which we’re wrapping the graph
around the circle. Which, at the moment, is half of a rotation per second.
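One way to make the wrapping concrete is the small sketch below. It assumes the signal is a cosine shifted up by one so it never goes negative, and it assumes the vector turns clockwise; the function names are made up for this example.

```python
import math

def g(t):
    """Assumed signal: a cosine with three beats per second, shifted up so it is never negative."""
    return 1 + math.cos(2 * math.pi * 3 * t)

def wound_point(t, winding_freq):
    """Where time t lands on the wound-up graph.

    The vector's length is the graph height g(t), and by time t it has made
    winding_freq * t full turns. Clockwise turning is an assumption here,
    which is why the angle carries a minus sign.
    """
    angle = -2 * math.pi * winding_freq * t
    length = g(t)
    return (length * math.cos(angle), length * math.sin(angle))

winding_freq = 0.5  # half a rotation per second, so two seconds is one full turn
for t in (0.0, 1.0, 2.0):
    print(t, wound_point(t, winding_freq))
```

With half a rotation per second, the points for zero seconds and two seconds coincide on the right, and the point for one second sits on the opposite side of the circle.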
But we can adjust that second frequency however we want. Maybe we wanna wrap it around faster or maybe we go and wrap it around slower. And that choice of winding frequency determines what the wound-up graph looks
like. Some of the diagrams that come out of this can be pretty complicated, although they
are very pretty. But it’s important to keep in mind that all that’s happening here is that we’re
wrapping the signal around a circle. The vertical lines that I’m drawing up top, by the way, are just a way to keep track
of the distance on the original graph that corresponds to a full rotation around the
circle. So, lines spaced out by 1.5 seconds would mean it takes 1.5 seconds to make one full
revolution.
And at this point, we might have some sort of vague sense that something special will
happen when the winding frequency matches the frequency of our signal, three beats
per second. All of the high points on the graph happen on the right side of the circle. And all of the low points happen on the left. But how precisely can we take advantage of that in our attempt to build a
frequency-unmixing machine? Well, imagine this graph as having some kind of mass to it, like a metal wire. This little dot is gonna represent the center of mass of that wire. As we change the frequency and the graph winds up differently, that center of mass
kind of wobbles around a bit. And for most of the winding frequencies, the peaks and the valleys are all spaced out
around the circle in such a way that the center of mass stays pretty close to the
origin.
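A minimal numerical sketch of that center of mass follows. It treats the center of mass as the plain average of many wound-up points sampled at equal time steps, uses Python complex numbers as 2D points, and assumes the shifted-up three-beat cosine on the 0 to 4.5 second window; the names and sample count are illustrative.

```python
import cmath
import math

def g(t):
    """Assumed signal: three beats per second, shifted up by one."""
    return 1 + math.cos(2 * math.pi * 3 * t)

def center_of_mass(signal, winding_freq, t_end=4.5, n=9000):
    """Average of n wound-up points, sampled at equal time steps on [0, t_end]."""
    dt = t_end / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        # A complex number stands in for a 2D point: length signal(t), clockwise angle.
        total += signal(t) * cmath.exp(-2j * math.pi * winding_freq * t)
    return total / n

for f in (0.5, 1.7, 2.4):
    print(f, abs(center_of_mass(g, f)))  # distance of the center of mass from the origin
```

For these mismatched winding frequencies the printed distances stay small compared with the signal's height, which is the "stays pretty close to the origin" behavior.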
But, when the winding frequency is the same as the frequency of our signal, in this
case three cycles per second, all of the peaks are on the right and all of the
valleys are on the left. So the center of mass is unusually far to the right. Here, to capture this, let’s draw some kind of plot that keeps track of where that
center of mass is for each winding frequency. Of course, the center of mass is a two-dimensional thing. It requires two coordinates to fully keep track of. But for the moment, let’s only keep track of the 𝑥-coordinate. So for a frequency of zero, when everything is bunched up on the right, this
𝑥-coordinate is relatively high. And then, as you increase that winding frequency and the graph balances out around
the circle, the 𝑥-coordinate of that center of mass goes closer to zero. And it just kinda wobbles around a bit.
But then, at three beats per second, there’s a spike as everything lines up to the
right. This right here is the central construct. So let’s sum up what we have so far. We have that original intensity-versus-time graph. And then, we have the wound-up version of that in some two-dimensional plane. And then, as a third thing, we have a plot for how the winding frequency influences
the center of mass of that graph. And by the way, let’s look back at those really low frequencies near zero. This big spike around zero in our new frequency plot just corresponds to the fact
that the whole cosine wave is shifted up. If I had chosen a signal that oscillates around zero, dipping into negative
values, then, as we play around with various winding frequencies, this plot of the winding
frequency versus center of mass would only have a spike at the value of three.
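To see the effect of the shift numerically, the sketch below compares a shifted-up cosine with one centered on zero. Both signals, the 4.5 second window, and the helper name are assumptions chosen to match the running example.

```python
import cmath
import math

def shifted(t):
    """Cosine at three beats per second, shifted up so it stays non-negative."""
    return 1 + math.cos(2 * math.pi * 3 * t)

def centered(t):
    """Same cosine, oscillating around zero and dipping into negative values."""
    return math.cos(2 * math.pi * 3 * t)

def x_coordinate(signal, winding_freq, t_end=4.5, n=9000):
    """x-coordinate (real part) of the average of the wound-up points."""
    dt = t_end / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        total += signal(t) * cmath.exp(-2j * math.pi * winding_freq * t)
    return (total / n).real

for f in (0.0, 3.0):
    print(f, round(x_coordinate(shifted, f), 3), round(x_coordinate(centered, f), 3))
```

Only the shifted signal has a large value at a winding frequency of zero; both have the bump at three.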
But, negative values are a little bit weird and messy to think about, especially for
a first example. So let’s just continue thinking in terms of the shifted-up graph. I just want you to understand that that spike around zero only corresponds to the
shift. Our main focus, as far as frequency decomposition is concerned, is that bump at
three. This whole plot is what I’ll call the “Almost-Fourier Transform” of the original
signal. There’s a couple small distinctions between this and the actual Fourier transform,
which I’ll get to in a couple of minutes. But already, you might be able to see how this machine lets us pick out the frequency
of a signal.
Just to play around with it a little bit more, take a different pure signal, let’s
say with a lower frequency of two beats per second, and do the same thing. Wind it around a circle. Imagine different potential winding frequencies. And as you do that, keep track of where the center of mass of that graph is. And then, plot the 𝑥-coordinate of that center of mass as you adjust the winding
frequency. Just like before, we get a spike when the winding frequency is the same as the signal
frequency, which in this case is when it equals two cycles per second. But the real key point, the thing that makes this machine so delightful is how it
enables us to take a signal consisting of multiple frequencies and pick out what
they are.
Imagine taking the two signals we just looked at, the wave with three beats per
second and the wave with two beats per second, and add them up. Like I said earlier, what you get is no longer a nice pure cosine wave. It’s something a little more complicated. But imagine throwing this into our winding-frequency machine. It is certainly the case that as you wrap this thing around, it looks a lot more
complicated. You have this chaos and chaos and chaos and chaos and then WOOP! Things seem to line up really nicely at two cycles per second. And as you continue on, it’s more chaos and more chaos and more chaos, chaos, chaos,
chaos, WOOP! Things nicely align again at three cycles per second. And, like I said before, the wound-up graph can look kind of busy and
complicated. But all it is is the relatively simple idea of wrapping the graph around the
circle. It’s just a more complicated graph and a pretty quick-winding frequency.
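Here is a hedged sketch of throwing the summed signal into the winding machine. It assumes both waves are shifted-up cosines and reuses the 4.5 second window; the function names are invented for the example.

```python
import cmath
import math

def two_beats(t):
    return 1 + math.cos(2 * math.pi * 2 * t)

def three_beats(t):
    return 1 + math.cos(2 * math.pi * 3 * t)

def mixed(t):
    """The two waves added together."""
    return two_beats(t) + three_beats(t)

def almost_fourier_x(signal, winding_freq, t_end=4.5, n=9000):
    """x-coordinate of the center of mass of the wound-up graph."""
    dt = t_end / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        total += signal(t) * cmath.exp(-2j * math.pi * winding_freq * t)
    return (total / n).real

for f in (1.0, 2.0, 2.5, 3.0, 4.0):
    print(f, round(almost_fourier_x(mixed, f), 3))
```

The values at two and three cycles per second stand out clearly above the ones in between and on either side.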
Now what’s going on here with the two different spikes is that if you were to take
two signals and then apply this Almost-Fourier transform to each of them
individually and then add up the results, what you get is the same as if you first added up the signals and then applied this
Almost-Fourier transform. And the attentive viewers among you might wanna pause and ponder and convince
yourself that what I just said is actually true. It’s a pretty good test to verify for yourself that it’s clear what exactly is being
measured inside this winding machine. Now this property makes things really useful to us, because the transform of a pure frequency is close to zero everywhere except for a
spike around that frequency. So when you add together two pure frequencies, the transform graph just has these
little peaks above the frequencies that went into it.
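For anyone who wants to check that claim numerically rather than by pondering, here is a small sketch. It assumes the center-of-mass average described so far as the "Almost-Fourier transform", with illustrative signals and names.

```python
import cmath
import math

def two_beats(t):
    return 1 + math.cos(2 * math.pi * 2 * t)

def three_beats(t):
    return 1 + math.cos(2 * math.pi * 3 * t)

def almost_fourier(signal, winding_freq, t_end=4.5, n=9000):
    """Center of mass of the wound-up graph, as a complex number."""
    dt = t_end / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        total += signal(t) * cmath.exp(-2j * math.pi * winding_freq * t)
    return total / n

max_gap = 0.0
for f in (0.7, 2.0, 3.0):
    # Transform each signal, then add the results...
    separately = almost_fourier(two_beats, f) + almost_fourier(three_beats, f)
    # ...versus add the signals, then transform.
    together = almost_fourier(lambda t: two_beats(t) + three_beats(t), f)
    max_gap = max(max_gap, abs(separately - together))

print(max_gap)  # only floating-point rounding separates the two
```

The reason is that every wound-up point of the sum is just the sum of the two individual wound-up points, and averaging respects addition.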
So this little mathematical machine does exactly what we wanted. It pulls out the original frequencies from their jumbled up sums, unmixing the mixed
bucket of paint. And before continuing into the full math that describes this operation, let’s just
get a quick glimpse of one context where this thing is useful, sound editing. Let’s say that you have some recording. And it’s got an annoying high pitch that you would like to filter out. Well, at first, your signal is coming in as a function of various intensities over
time, different voltages given to your speaker from one millisecond to the next. But we want to think of this in terms of frequencies. So, when you take the Fourier transform of that signal, the annoying high pitch is
gonna show up just as a spike at some high frequency. Filtering that out by just smushing the spike down, what you’d be looking at is the
Fourier transform of a sound that’s just like your recording, only without that high
frequency.
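As a toy version of that filtering idea, the sketch below uses a discrete Fourier transform on sampled data, which is a stand-in assumed here rather than the continuous transform the transcript is building toward. The recording, its two frequencies, and the sample rate are all hypothetical.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform (slow, but needs no libraries)."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def inverse_dft(spectrum):
    """Naive inverse: same sum with the rotation reversed, divided by n."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

SAMPLE_RATE = 200        # samples per second (illustrative)
N = 200                  # one second of samples, so bin m means m cycles per second
LOW_HZ, HIGH_HZ = 5, 40  # hypothetical: the sound we want, plus the annoying pitch

low = [math.cos(2 * math.pi * LOW_HZ * k / SAMPLE_RATE) for k in range(N)]
recording = [low[k] + 0.5 * math.cos(2 * math.pi * HIGH_HZ * k / SAMPLE_RATE)
             for k in range(N)]

spectrum = dft(recording)
# Smush the spike down. For a real-valued signal the spike has a mirror image
# at bin N - HIGH_HZ, which has to go as well.
spectrum[HIGH_HZ] = 0
spectrum[N - HIGH_HZ] = 0

filtered = [z.real for z in inverse_dft(spectrum)]
worst_error = max(abs(filtered[k] - low[k]) for k in range(N))
print(worst_error)  # tiny: what is left is the low tone alone
```

This works out so cleanly only because both tones fit a whole number of cycles into the one-second window; real recordings are messier.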
Luckily, there’s a notion of an inverse Fourier transform that tells you which signal
would have produced this as its Fourier transform. I’ll be talking about that inverse much more fully in the next video. But long story short, applying the Fourier transform to the Fourier transform gives
you back something close to the original function, kind of. This is a little bit of a lie, but it’s in the direction of truth. And most of the reason that it’s a lie is that I still have yet to tell you what the
actual Fourier transform is, since it’s a little more complex than this 𝑥-coordinate of the center-of-mass
idea.
First off, bringing back this wound-up graph and looking at its center of mass, the
𝑥-coordinate is really only half the story, right? I mean, this thing is in two dimensions. It’s got a 𝑦-coordinate as well. And, as is typical in math, whenever you’re dealing with something two-dimensional,
it’s elegant to think of it as the complex plane, where this center of mass is gonna be a complex number that has both a real and an
imaginary part. And the reason for talking in terms of complex numbers rather than just saying it has
two coordinates is that complex numbers lend themselves to really nice descriptions
of things that have to do with winding and rotation.
For example, Euler’s formula famously tells us that if you take 𝑒 to some number
times 𝑖, you’re gonna land on the point that you get if you were to walk that number of units
around a circle with radius one counter-clockwise, starting on the right. So, imagine you wanted to describe rotating at a rate of one cycle per second. One thing that you could do is take the expression 𝑒 to the two 𝜋 times 𝑖 times
𝑡, where 𝑡 is the amount of time that has passed. Since, for a circle with radius one, two 𝜋 describes the full length of its
circumference. And this is a little bit dizzying to look at. So maybe you wanna describe a different frequency, something lower and more
reasonable. And for that, you would just multiply that time 𝑡 in the exponent by the frequency,
𝑓.
For example, if 𝑓 was one-tenth, then this vector makes one full turn every 10
seconds, since the time 𝑡 has to increase all the way to 10 before the full
exponent looks like two 𝜋𝑖. I have another video giving some intuition on why this is the behavior of 𝑒 to the
𝑥 for imaginary inputs, if you’re curious. But for right now, we’re just gonna take it as a given. Now why am I telling you this, you might ask. Well, it gives us a really nice way to write down the idea of winding up the graph
into a single tight little formula. First off, the convention in the context of Fourier transforms is to think about
rotating in the clockwise direction. So let’s go ahead and throw a negative sign up into that exponent.
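A quick sketch with Python's built-in complex numbers can confirm how this exponential behaves. The function name and the `clockwise` flag are made up for illustration.

```python
import cmath
import math

def rotating_vector(freq, t, clockwise=True):
    """Point on the unit circle after rotating at freq cycles per second for t seconds."""
    sign = -1 if clockwise else 1  # the negative sign in the exponent means clockwise
    return cmath.exp(sign * 2j * math.pi * freq * t)

f = 0.1  # one-tenth of a cycle per second

# After a quarter of the 10-second period, counter-clockwise lands at the top (i),
# while the clockwise convention lands at the bottom (-i).
print(rotating_vector(f, 2.5, clockwise=False))
print(rotating_vector(f, 2.5, clockwise=True))

# After the full 10 seconds the exponent is a whole multiple of 2*pi*i: back at 1.
print(rotating_vector(f, 10.0))
```

The printed values carry tiny rounding errors, so expect something like 6e-17 where an exact zero would be.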
Now, take some function describing a signal intensity versus time, like this pure
cosine wave we had before, and call it 𝑔 of 𝑡. If you multiply this exponential expression times 𝑔 of 𝑡, it means that the
rotating complex number is getting scaled up and down according to the value of this
function. So you can think of this little rotating vector with its changing length as drawing
the wound-up graph. So think about it. This is awesome! This really small expression is a super elegant way to encapsulate the whole idea of
winding a graph around a circle with a variable frequency, 𝑓. And remember, the thing we want to do with this wound-up graph is to track its center
of mass. So think about what formula is gonna capture that.
Well, to approximate it at least, you might sample a whole bunch of times from the
original signal, see where those points end up on the wound-up graph, and then just
take an average. That is, add them all together as complex numbers, and then divide by the number of
points that you’ve sampled. This will become more accurate if you sample more points which are closer
together. And in the limit, rather than looking at the sum of a whole bunch of points divided
by the number of points, you take an integral of this function divided by the size
of the time interval that we’re looking at. Now the idea of integrating a complex-valued function might seem weird, and to anyone
who’s shaky with calculus, maybe even intimidating. But the underlying meaning here really doesn’t require any calculus knowledge. The whole expression is just the center of mass of the wound-up graph.
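Written out, the step from the sampled average to the integral looks roughly like this. The symbols N (number of samples), t_k (the sample times), and t_1, t_2 (the ends of the time interval) are labels assumed here; the transcript describes them only in words.

```latex
% Average of N sampled points on the wound-up graph:
\frac{1}{N}\sum_{k=1}^{N} g(t_k)\, e^{-2\pi i f t_k}

% In the limit of more and more, closer and closer samples:
\frac{1}{t_2 - t_1}\int_{t_1}^{t_2} g(t)\, e^{-2\pi i f t}\, dt
```

Both expressions describe the same thing: the center of mass of the graph of g wound around the circle at winding frequency f.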
So, great! Step-by-step, we have built up this kind of complicated but, let’s face it,
surprisingly small expression for the whole winding-machine idea that I talked
about. And now, there is only one final distinction to point out between this and the actual
honest-to-goodness Fourier transform. Namely, just don’t divide out by the time interval. The Fourier transform is just the integral part of this. What that means is that instead of looking at the center of mass, you would scale it
up by some amount. If the portion of the original graph you were using spanned three seconds, you would
multiply the center of mass by three. If it was spanning six seconds, you would multiply the center of mass by six. Physically, this has the effect that when a certain frequency persists for a long
time, then the magnitude of the Fourier transform at that frequency is scaled up
more and more.
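The effect of dropping the division can be sketched numerically. This assumes a shifted-up cosine at two beats per second and approximates the integral with a simple midpoint sum; names and step sizes are illustrative.

```python
import cmath
import math

def g(t):
    """Assumed signal: two beats per second, shifted up by one."""
    return 1 + math.cos(2 * math.pi * 2 * t)

def winding_integral(signal, freq, duration, n_per_second=2000):
    """Midpoint-rule approximation of the integral of signal(t) * e^(-2*pi*i*freq*t) over [0, duration]."""
    n = int(duration * n_per_second)
    dt = duration / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        total += signal(t) * cmath.exp(-2j * math.pi * freq * t) * dt
    return total

for duration in (3.0, 6.0):
    integral = winding_integral(g, 2.0, duration)  # no division: the Fourier-style value
    center = integral / duration                   # dividing gives the center of mass
    print(duration, round(center.real, 3), round(integral.real, 3))
```

The center of mass comes out near 0.5 for both durations, while the undivided integral doubles from about 1.5 to about 3.0 when the signal lasts twice as long.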
For example, what we’re looking at right here is how when you have a pure frequency
of two beats per second and you wind it around the circle at two cycles per second,
the center of mass stays in the same spot, right? It’s just tracing out the same shape. But the longer that signal persists, the larger the value of the Fourier transform at
that frequency. For other frequencies though, even if you just increase it by a bit, this is
cancelled out by the fact that for longer time intervals you’re giving the wound-up
graph more of a chance to balance itself around the circle. That is a lot of different moving parts. So let’s step back and summarize what we have so far.
The Fourier transform of an intensity-versus-time function, like 𝑔 of 𝑡, is a new
function which doesn’t have time as an input, but instead takes in a frequency, what I’ve been calling the winding frequency. In terms of notation, by the way, the common convention is to call this new function
𝑔-hat, with a little circumflex on top of it. Now the output of this function is a complex number, some point in the 2D plane that
corresponds to the strength of a given frequency in the original signal. The plot that I’ve been graphing for the Fourier transform is just the real component
of that output, the 𝑥-coordinate. But you could also graph the imaginary component, separately, if you wanted a fuller
description. And all of this is encapsulated inside that formula that we built up.
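For reference, the formula being summarized can be written as below. The bounds t_1 and t_2 are assumed labels for the ends of the finite time interval; the split into real and imaginary parts follows from Euler's formula for a real-valued g.

```latex
\hat{g}(f) = \int_{t_1}^{t_2} g(t)\, e^{-2\pi i f t}\, dt

% Real part (the x-coordinate that has been plotted) and imaginary part:
\operatorname{Re}\hat{g}(f) = \int_{t_1}^{t_2} g(t)\cos(2\pi f t)\, dt
\qquad
\operatorname{Im}\hat{g}(f) = -\int_{t_1}^{t_2} g(t)\sin(2\pi f t)\, dt
```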
And out of context, you can imagine how seeing this formula would seem sort of
daunting. But if you understand how exponentials correspond to rotation, how multiplying that by the function 𝑔 of 𝑡 means drawing a wound-up version of the
graph, and how an integral of a complex-valued function can be interpreted in terms of a
center-of-mass idea, you can see how this whole thing carries with it a very rich, intuitive meaning. And, by the way, one quick small note before we can call this wrapped up. Even though in practice, with things like sound editing, you’ll be integrating over a
finite time interval, the theory of Fourier transforms is often phrased where the
bounds of this integral are negative infinity and infinity.
Concretely, what that means is that you consider this expression for all possible
finite time intervals. And you just ask, what is its limit as that time interval grows to infinity? And man oh man, there is so much more to say, so much! I don’t wanna call it done here. This transform extends to corners of math well beyond the idea of extracting
frequencies from signals. So the next video I put out is gonna go through a couple of these. And that’s really where things start getting interesting.