### Video Transcript

I’ve introduced a few derivative formulas. But a really important one that I left out was exponentials. So here, I wanna talk about the derivatives of functions like two to the 𝑥, seven to
the 𝑥, and also to show why 𝑒 to the 𝑥 is arguably the most important of the
exponentials.

First of all, to get an intuition, let’s just focus on the function two to the
𝑥. And let’s think of that input as a time, 𝑡, maybe in days, and the output, two to
the 𝑡, as a population size, perhaps of a particularly fertile band of 𝜋 creatures which doubles every single
day. And actually, instead of population size, which grows in discrete little jumps with
each new baby 𝜋 creature, let’s think of two to the 𝑡 as the total mass of the population. I think that better reflects the continuity of this function, don’t you? So, for example, at time 𝑡 equals zero, the total mass is two to the zero equals
one, for the mass of one creature. At 𝑡 equals one day, the population has grown to two to the one equals two creature
masses. At day 𝑡 equals two, it’s two squared, or four. And in general, it just keeps doubling every day.

For the derivative, we want d𝑀/d𝑡, the rate at which this population mass is
growing, thought of as a tiny change in the mass divided by a tiny change in time. And let’s start by thinking of the rate of change over a full day, say between day
three and day four. Well, in this case, it grows from eight to 16. So that’s eight new creature masses added over the course of one day. And notice, that rate of growth equals the population size at the start of the
day. Between day four and day five, it grows from 16 to 32. So that’s a rate of 16 new creature masses per day. Which again equals the population size at the start of the day. And in general, this rate of growth over a full day equals the population size at the
start of that day. So it might be tempting to say that this means the derivative of two to the 𝑡 equals
itself. That the rate of change of this function at a given time 𝑡 is equal to, well, the
value of that function. And this is definitely in the right direction, but it’s not quite correct.

What we’re doing here is making comparisons over a full day, considering the difference between two to the 𝑡 plus one and two to the 𝑡. But for the derivative, we need to ask what happens for smaller and smaller
changes. What’s the growth over the course of a tenth of a day? A hundredth of a day? One one-billionth of a day? This is why I had us think of the function as representing population mass, since it makes sense to ask about a tiny change in mass over a tiny fraction of a
day, but it doesn’t make as much sense to ask about the tiny change in a discrete
population size per second. More abstractly, for a tiny change in time, d𝑡, we wanna understand the difference
between two to the power 𝑡 plus d𝑡 and two to the 𝑡, all divided by d𝑡: a change in the function per unit time. But now, we’re looking very narrowly around a given point in time rather than over
the course of a full day.
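
To make that shrinking window concrete, here is a minimal Python sketch (not from the video; the function name `average_rate` and the choice of day three are just for illustration). It computes the change in two to the 𝑡 per unit time over shorter and shorter intervals starting at 𝑡 equals three.

```python
def average_rate(t, dt):
    """Change in 2**t per unit time over the interval from t to t + dt."""
    return (2 ** (t + dt) - 2 ** t) / dt


t = 3.0  # start of day three, where the mass is 2**3 = 8
for dt in (1.0, 0.1, 0.01, 0.001):
    print(f"dt = {dt:<6} average rate = {average_rate(t, dt):.4f}")
```

With a full day, d𝑡 equals one, this gives the eight creature masses per day from before. For smaller d𝑡 the rate settles toward a smaller value, which is why the full-day comparison is not quite the derivative.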

And here’s the thing. I would love it if there were some very clear geometric picture that made everything
that’s about to follow just pop out. Some diagram where you could point to one value and say, “See! That part! That is the derivative of two to the 𝑡.” And if you know of one, please let me know. And while the goal here, as with the rest of the series, is to maintain a playful
spirit of discovery, the type of play that follows will have more to do with finding numerical patterns
than visual ones.

So start by just taking a very close look at this term, two to the 𝑡 plus d𝑡. A core property of exponentials is that you can break this up as two to the 𝑡 times
two to the d𝑡. That really is the most important property of exponents. If you add two values in that exponent, you can break up the output as a product of
some kind. This is what lets you relate additive ideas, things like tiny steps in time, to
multiplicative ideas, things like rates and ratios. I mean, just look at what happens here. After that move, we can factor out the term two to the 𝑡, which is now just
multiplied by two to the d𝑡 minus one all divided by d𝑡. And remember, the derivative of two to the 𝑡 is whatever this whole expression
approaches as d𝑡 approaches zero. And at first glance, that might seem like an unimportant manipulation. But a tremendously important fact is that this term on the right, where all of the
d𝑡 stuff lives, is completely separate from the 𝑡 term itself. It doesn’t depend on the actual time where we started.
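
Written out in symbols, the manipulation just described looks like this (a restatement of the spoken steps, with d𝑡 written as $dt$):

$$
\frac{2^{t+dt} - 2^{t}}{dt} \;=\; \frac{2^{t}\cdot 2^{dt} - 2^{t}}{dt} \;=\; 2^{t}\left(\frac{2^{dt} - 1}{dt}\right)
$$

The factor in parentheses contains only $dt$, not $t$, which is the separation the argument relies on.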

You can go off to a calculator and plug in very small values for d𝑡 here. For example, maybe typing in two to the 0.001 minus one divided by 0.001. What you’ll find is that for smaller and smaller choices of d𝑡, this value
approaches a very specific number, around 0.6931. Don’t worry if that number seems mysterious. The central point is that this is some kind of constant. Unlike derivatives of other functions, all of the stuff that depends on d𝑡 is
separate from the value of 𝑡 itself. So the derivative of two to the 𝑡 is just itself but multiplied by some
constant. And that should kinda make sense. Because, earlier, it felt like the derivative for two to the 𝑡 should be itself. At least, when we were looking at changes over the course of a full day. And evidently, the rate of change for this function over much smaller time scales is
not quite equal to itself. But it’s proportional to itself, with this very peculiar proportionality constant of
0.6931.
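
If you would rather let a script do the calculator work, a minimal Python sketch of that experiment might look like the following. The function name `constant_estimate` and the particular list of d𝑡 values are assumptions made for illustration.

```python
def constant_estimate(base, dt):
    """The factor (base**dt - 1) / dt that multiplies base**t in the difference quotient."""
    return (base ** dt - 1) / dt


# Shrink dt and watch the estimate settle down.
for dt in (1e-1, 1e-3, 1e-5, 1e-7):
    print(f"dt = {dt:g}  ->  {constant_estimate(2, dt):.6f}")
```

The printed values should approach roughly 0.6931 as d𝑡 shrinks, and none of them depend on 𝑡.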

And there’s not too much special about the number two here. If instead we had dealt with the function three to the 𝑡, the exponential property would also have led us to the conclusion that the derivative
of three to the 𝑡 is proportional to itself. But this time, it would have had a proportionality constant of around 1.0986. And for other bases to your exponent, you can have fun trying to see what the various
proportionality constants are, maybe seeing if you can find a pattern in them. For example, if you plug in eight to the power of a very tiny number minus one and
divide by that same tiny number, what you’d find is that the relevant proportionality constant is around 2.079. And maybe, just maybe, you would notice that this number happens to be exactly three
times the constant associated with the base two. So these numbers certainly aren’t random. There is some kind of pattern, but what is it? What does two have to do with the number 0.6931? And what does eight have to do with the number 2.079?
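
Here is a small Python sketch for playing with those constants numerically. The helper name `growth_constant` and the step size `1e-8` are illustrative choices, not anything from the video, and the results are approximations rather than exact values.

```python
def growth_constant(base, dt=1e-8):
    """Approximate the proportionality constant for base**t using a tiny dt."""
    return (base ** dt - 1) / dt


c2 = growth_constant(2)
c3 = growth_constant(3)
c8 = growth_constant(8)

print(f"base 2: {c2:.4f}")
print(f"base 3: {c3:.4f}")
print(f"base 8: {c8:.4f}")
print(f"ratio of base-8 constant to base-2 constant: {c8 / c2:.4f}")
```

The ratio in the last line should come out very close to three, which is the pattern hinted at above.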

Well, a second question that is ultimately gonna explain these mystery constants is
whether there’s some base where that proportionality constant is one, where the derivative of 𝑎 to the power 𝑡 is not just proportional to itself, but
actually equal to itself. And there is! It’s the special constant 𝑒, around 2.71828. In fact, it’s not just that the number 𝑒 happens to show up here. This is, in a sense, what defines the number 𝑒. If you ask why 𝑒, of all numbers, has this property, it’s a little like asking why 𝜋, of all numbers, happens to be the ratio of the
circumference of a circle to its diameter. This is, at its heart, what defines this value. All exponential functions are proportional to their own derivative. But 𝑒 alone is the special number for which that proportionality constant is one, meaning 𝑒 to the 𝑡 actually equals its own derivative.
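
One way to hunt for that special base yourself is a simple bisection search: the constant for base two is below one and the constant for base three is above one, so the base we want sits somewhere in between. The sketch below is only an illustration under that assumption; `growth_constant` and `find_special_base` are made-up names, and the answer is a numerical approximation.

```python
def growth_constant(base, dt=1e-8):
    """Approximate the proportionality constant for base**t using a tiny dt."""
    return (base ** dt - 1) / dt


def find_special_base(lo=2.0, hi=3.0, tol=1e-6):
    """Bisect for the base whose proportionality constant is one."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if growth_constant(mid) < 1:
            lo = mid  # constant too small, so the base must be larger
        else:
            hi = mid  # constant too large, so the base must be smaller
    return (lo + hi) / 2


print(f"base with constant one: {find_special_base():.5f}")
```

The result should land close to 2.71828.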

One way to think of that is that if you look at the graph of 𝑒 to the 𝑡, it has the peculiar property that the slope of a tangent line to any point on this
graph equals the height of that point above the horizontal axis. The existence of a function like this answers the question of the mystery
constants, because it gives a different way to think about functions that are
proportional to their own derivative. The key is to use the chain rule. For example, what is the derivative of 𝑒 to the three 𝑡? Well, you take the derivative of the outermost function, which due to this special
nature of 𝑒 is just itself. And then multiply by the derivative of that inner function, three 𝑡, which is the
constant, three. Or, rather than just applying a rule blindly, you could take this moment to practice
the intuition for the chain rule that I talked through last video. Thinking about how a slight nudge to 𝑡 changes the value of three 𝑡. And how that intermediate change nudges the final value of 𝑒 to the three 𝑡.
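
As a quick sanity check on that chain rule result, here is a hedged Python sketch that nudges 𝑡 slightly and compares the observed slope of 𝑒 to the three 𝑡 with three times the function’s value. The central-difference helper `numerical_derivative`, the step `h`, and the sample point 𝑡 equals 0.5 are my own choices for illustration.

```python
import math


def f(t):
    """The function e^(3t)."""
    return math.exp(3 * t)


def numerical_derivative(func, t, h=1e-6):
    """Estimate the slope of func at t from a small nudge on either side."""
    return (func(t + h) - func(t - h)) / (2 * h)


t = 0.5
approx = numerical_derivative(f, t)
chain_rule = 3 * f(t)  # derivative of the outer function times the inner derivative, 3

print(f"numerical slope:   {approx:.6f}")
print(f"chain rule result: {chain_rule:.6f}")
```

The two printed numbers should agree to several decimal places.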

Either way, the point is, the derivative of 𝑒 to the power of some constant times 𝑡 is equal to that
same constant times itself. And from here, the question of those mystery constants really just comes down to a
certain algebraic manipulation. The number two can also be written as 𝑒 to the natural log of two. There’s nothing fancy here. This is just the definition of the natural log. It asks the question: 𝑒 to the what equals two? So, the function two to the 𝑡 is the same as the function 𝑒 to the power of the
natural log of two times 𝑡. And from what we just saw, combining the fact that 𝑒 to the 𝑡 is its own
derivative with the chain rule, the derivative of this function is proportional to itself, with a proportionality
constant equal to the natural log of two. And indeed, if you go plug the natural log of two into a calculator, you’ll find
that it’s about 0.6931, the mystery constant that we ran into earlier.
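
In symbols, the steps just described can be written as follows (a restatement of the spoken argument, nothing new):

$$
2^{t} = \left(e^{\ln 2}\right)^{t} = e^{(\ln 2)\,t},
\qquad
\frac{d}{dt}\,2^{t} = \frac{d}{dt}\,e^{(\ln 2)\,t} = (\ln 2)\,e^{(\ln 2)\,t} = (\ln 2)\,2^{t},
\qquad
\ln 2 \approx 0.6931
$$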

And the same goes for all of the other bases. The mystery proportionality constant that pops up when taking derivatives is just the
natural log of the base, the answer to the question: 𝑒 to the what equals that base? In fact, throughout applications of calculus, you rarely see exponentials written as
some base to a power 𝑡. Instead, you almost always write the exponential as 𝑒 to the power of some constant
times 𝑡. It’s all equivalent. I mean, any function like two to the 𝑡 or three to the 𝑡 can also be written as 𝑒
to some constant times 𝑡. At the risk of staying overfocused on the symbols here, I really wanna emphasize that
there are many, many ways to write down any particular exponential function. And when you see something written as 𝑒 to some constant times 𝑡, that’s a choice
that we make to write it that way. And the number 𝑒 is not fundamental to that function itself. What is special about writing exponentials in terms of 𝑒 like this is that it gives
that constant in the exponent a nice, readable meaning.

Here, let me show you what I mean. All sorts of natural phenomena involve some rate of change that’s proportional to the
thing that’s changing. For example, the rate of growth of a population actually does tend to be proportional
to the size of the population itself, assuming there isn’t some limited resource slowing things down. And if you put a cup of hot water in a cool room, the rate at which the water cools is proportional to the difference in temperature
between the room and the water. Or, said a little differently, the rate at which that difference changes is
proportional to itself. If you invest your money, the rate at which it grows is proportional to the amount of
money there at any time.

In all of these cases, where some variable’s rate of change is proportional to
itself, the function describing that variable over time is gonna look like some kind of
exponential. And even though there are lots of ways to write any exponential function, it’s very natural to choose to express these functions as 𝑒 to the power of some
constant times 𝑡, since that constant carries a very natural meaning. It’s the same as the proportionality constant between the size of the changing
variable and the rate of change.
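
To see that connection play out numerically, here is a rough Python sketch, under the assumption that we approximate continuous growth by many tiny time steps. In each step the quantity grows by its current size times a constant `k` times the step length. The function name `simulate_growth`, the step count, and the choice of `k` as the natural log of two are illustrative assumptions.

```python
import math


def simulate_growth(k, y0, t_end, steps):
    """Step forward in time, always growing at a rate of k times the current size."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += k * y * dt  # tiny change = (rate proportional to y) * (tiny time)
    return y


k = math.log(2)  # proportionality constant for something that doubles each day
approx = simulate_growth(k, y0=1.0, t_end=3.0, steps=100_000)
exact = math.exp(k * 3.0)  # e^(k t) at t = 3, which is the same as 2**3

print(f"step-by-step simulation: {approx:.4f}")
print(f"e^(k t):                 {exact:.4f}")
```

The simulated value should come out close to eight, matching 𝑒 to the power 𝑘 times 𝑡, with the constant in the exponent being exactly the proportionality constant used in each step.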