Video: Change of Basis

Grant Sanderson • 3Blue1Brown • Boclips


Video Transcript

If I have a vector sitting here in 2D space, we have a standard way to describe it with coordinates. In this case, the vector has coordinates three, two, which means going from its tail to its tip, involves moving three units to the right and two units up.

Now, the more linear-algebra-oriented way to describe coordinates is to think of each of these numbers as a scalar, a thing that stretches or squishes vectors. You think of that first coordinate as scaling 𝑖-hat, the vector with length one pointing to the right, while the second coordinate scales 𝑗-hat, the vector with length one pointing straight up. The tip-to-tail sum of those two scaled vectors is what the coordinates are meant to describe.
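Written out as a formula, the sentence above says the following. The column-vector notation is an assumption about how the on-screen math would be typeset, not something the transcript states.

```latex
\begin{bmatrix} 3 \\ 2 \end{bmatrix}
  = 3\,\hat{\imath} + 2\,\hat{\jmath}
  = 3\begin{bmatrix} 1 \\ 0 \end{bmatrix}
  + 2\begin{bmatrix} 0 \\ 1 \end{bmatrix}
```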

You can think of these two special vectors as encapsulating all of the implicit assumptions of our coordinate system: the fact that the first number indicates rightward motion, that the second one indicates upward motion, exactly how far a unit of distance is. All of that is tied up in the choice of 𝑖-hat and 𝑗-hat as the vectors which our scalar coordinates are meant to actually scale. Any way to translate between vectors and sets of numbers is called a coordinate system, and the two special vectors, 𝑖-hat and 𝑗-hat, are called the basis vectors of our standard coordinate system.

What I’d like to talk about here is the idea of using a different set of basis vectors. For example, let’s say you have a friend, Jennifer, who uses a different set of basis vectors which I’ll call 𝐛 one and 𝐛 two. Her first basis vector, 𝐛 one, points up and to the right a little bit, and her second vector, 𝐛 two, points left and up. Now, take another look at that vector that I showed earlier, the one that you and I would describe using the coordinates three, two using our basis vectors 𝑖-hat and 𝑗-hat. Jennifer would actually describe this vector with the coordinates five-thirds and one-third. What this means is that the particular way to get to that vector using her two basis vectors is to scale 𝐛 one by five-thirds, scale 𝐛 two by one-third, then add them both together. In a little bit, I’ll show you how you could have figured out those two numbers five-thirds and one-third.

In general, whenever Jennifer uses coordinates to describe a vector, she thinks of her first coordinate as scaling 𝐛 one, the second coordinate as scaling 𝐛 two, and she adds the results. What she gets will typically be completely different from the vector that you and I would think of as having those coordinates. To be a little more precise about the set-up here, her first basis vector, 𝐛 one, is something that we would describe with the coordinates two, one, and her second basis vector, 𝐛 two, is something that we would describe as negative one, one.
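As a quick check of the numbers so far, here is a minimal Python sketch that scales Jennifer's basis vectors by her coordinates and adds the results. The helper name `combine` and the use of `Fraction` for exact thirds are illustrative choices, not something from the video.

```python
from fractions import Fraction

# Jennifer's basis vectors, written in our standard coordinates.
b1 = (Fraction(2), Fraction(1))
b2 = (Fraction(-1), Fraction(1))

def combine(c1, c2, v1, v2):
    """Return c1*v1 + c2*v2 for 2D vectors given as tuples (hypothetical helper)."""
    return (c1 * v1[0] + c2 * v2[0], c1 * v1[1] + c2 * v2[1])

# Jennifer's coordinates (5/3, 1/3) should describe the vector we call (3, 2).
ours = combine(Fraction(5, 3), Fraction(1, 3), b1, b2)
print(ours == (3, 2))
```

Scaling 𝑖-hat and 𝑗-hat by those same two numbers would instead give the vector five-thirds, one-third in our system, which is the sense in which the same coordinates name different vectors.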

But it’s important to realize from her perspective, in her system, those vectors have coordinates one, zero and zero, one. They are what define the meaning of the coordinates one, zero and zero, one in her world. So, in effect, we’re speaking different languages. We’re all looking at the same vectors in space, but Jennifer uses different words and numbers to describe them.

Let me say a quick word about how I’m representing things here. When I animate 2D space I typically use this square grid, but that grid is just a construct, a way to visualize our coordinate system, and so it depends on our choice of basis. Space itself has no intrinsic grid. Jennifer might draw her own grid, which would be an equally made-up construct meant as nothing more than a visual tool to help follow the meaning of her coordinates. Her origin, though, would actually line up with ours since everybody agrees on what the coordinates zero, zero should mean. It’s the thing that you get when you scale any vector by zero. But the direction of her axes and the spacing of her grid lines will be different, depending on her choice of basis vectors.

So, after all this is set up, a pretty natural question to ask is how we translate between coordinate systems. If, for example, Jennifer describes a vector with coordinates negative one, two, what would that be in our coordinate system? How do you translate from her language to ours? Well, what her coordinates are saying is that this vector is negative one times 𝐛 one plus two times 𝐛 two. And from our perspective, 𝐛 one has coordinates two, one and 𝐛 two has coordinates negative one, one, so we can actually compute negative one times 𝐛 one plus two times 𝐛 two as they’re represented in our coordinate system. And working this out, you get a vector with coordinates negative four, one. So, that’s how we would describe the vector that she thinks of as negative one, two.
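Worked out step by step, with every vector written in our coordinates (the typesetting is an assumption; the numbers are the ones given above):

```latex
(-1)\,\mathbf{b}_1 + 2\,\mathbf{b}_2
  = (-1)\begin{bmatrix} 2 \\ 1 \end{bmatrix}
  + 2\begin{bmatrix} -1 \\ 1 \end{bmatrix}
  = \begin{bmatrix} -2 \\ -1 \end{bmatrix}
  + \begin{bmatrix} -2 \\ 2 \end{bmatrix}
  = \begin{bmatrix} -4 \\ 1 \end{bmatrix}
```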

This process here of scaling each of her basis vectors by the corresponding coordinates of some vector then adding them together might feel somewhat familiar. It’s matrix-vector multiplication with a matrix whose columns represent Jennifer’s basis vectors in our language. In fact, once you understand matrix-vector multiplication as applying a certain linear transformation (say, by watching what I view to be the most important video in this series, chapter three), there’s a pretty intuitive way to think about what’s going on here. A matrix whose columns represent Jennifer’s basis vectors can be thought of as a transformation that moves our basis vectors, 𝑖-hat and 𝑗-hat, the things we think of when we say one, zero and zero, one, to Jennifer’s basis vectors, the things she thinks of when she says one, zero and zero, one.
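To make the connection concrete, here is a small Python sketch comparing ordinary matrix-vector multiplication with the "scale each column and add" recipe. The helper names `mat_vec` and `column` are hypothetical, and the matrix is stored as a list of rows purely for convenience.

```python
# Change of basis matrix: its columns are Jennifer's basis vectors in our coordinates.
A = [[2, -1],
     [1,  1]]

def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a 2D vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def column(m, k):
    """Return column k of a 2x2 matrix stored as a list of rows."""
    return [m[0][k], m[1][k]]

hers = [-1, 2]           # the vector in Jennifer's coordinates
ours = mat_vec(A, hers)  # the same vector in our coordinates

# Same computation, phrased as scaling each column and adding the results.
by_columns = [hers[0] * column(A, 0)[i] + hers[1] * column(A, 1)[i]
              for i in range(2)]

print(ours, by_columns)
```

Both lists come out as negative four, one, matching the hand computation.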

To show how this works, let’s walk through what it would mean to take the vector that we think of as having coordinates negative one, two and apply that transformation. Before the linear transformation, we’re thinking of this vector as a certain linear combination of our basis vectors, negative one times 𝑖-hat plus two times 𝑗-hat. And the key feature of a linear transformation is that the resulting vector will be that same linear combination, but of the new basis vectors, negative one times the place where 𝑖-hat lands plus two times the place where 𝑗-hat lands. So what this matrix does is transform our misconception of what Jennifer means into the actual vector that she’s referring to. I remember that when I was first learning this, it always felt kind of backwards to me. Geometrically, this matrix transforms our grid into Jennifer’s grid. But numerically, it’s translating a vector described in her language to our language. What made it finally click for me was thinking about how it takes our misconception of what Jennifer means, the vector we get using the same coordinates but in our system, then it transforms it into the vector that she really meant.

What about going the other way around? In the example I used earlier in this video, when I have the vector with coordinates three, two in our system, how did I compute that it would have coordinates five-thirds and one-third in Jennifer’s system? You start with that change of basis matrix that translates Jennifer’s language into ours, then you take its inverse. Remember, the inverse of a transformation is a new transformation that corresponds to playing that first one backwards. In practice, especially when you’re working in more than two dimensions, you’d use a computer to compute the matrix that actually represents this inverse. In this case, the inverse of the change of basis matrix that has Jennifer’s basis as its columns ends up working out to have columns one-third, negative one-third and one-third, two-thirds. So, for example, to see what the vector three, two looks like in Jennifer’s system, we multiply this inverse change of basis matrix by the vector three, two, which works out to be five-thirds, one-third.
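For a two-by-two matrix the inverse is small enough to compute by hand with the usual determinant formula. The sketch below does that in Python with exact fractions; the function names `inverse_2x2` and `mat_vec` are hypothetical, and in higher dimensions you would more likely reach for a linear algebra library.

```python
from fractions import Fraction

# Change of basis matrix (columns are Jennifer's basis vectors), as a list of rows.
A = [[Fraction(2), Fraction(-1)],
     [Fraction(1), Fraction(1)]]

def inverse_2x2(m):
    """Invert a 2x2 matrix using the adjugate-over-determinant formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a 2D vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

A_inv = inverse_2x2(A)         # translates our language into Jennifer's
hers = mat_vec(A_inv, [3, 2])  # our vector (3, 2) in her coordinates

print(A_inv)
print(hers)
```

The columns of `A_inv` are one-third, negative one-third and one-third, two-thirds, and `hers` is five-thirds, one-third, as stated in the transcript.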

So that, in a nutshell, is how to translate the description of individual vectors back and forth between coordinate systems. The matrix whose columns represent Jennifer’s basis vectors, but written in our coordinates, translates vectors from her language into our language. And the inverse matrix does the opposite. But vectors aren’t the only thing that we describe using coordinates.

For this next part, it’s important that you’re all comfortable representing transformations with matrices and that you know how matrix multiplication corresponds to composing successive transformations. Definitely pause and take a look at chapters three and four if any of that feels uneasy.

Consider some linear transformation, like a 90-degree counterclockwise rotation. When you and I represent this with a matrix, we follow where the basis vectors 𝑖-hat and 𝑗-hat each go. 𝑖-hat ends up at the spot with coordinates zero, one, and 𝑗-hat ends up at the spot with coordinates negative one, zero, so those coordinates become the columns of our matrix. But this representation is heavily tied up in our choice of basis vectors, from the fact that we’re following 𝑖-hat and 𝑗-hat in the first place to the fact that we’re recording their landing spots in our own coordinate system.
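In symbols, placing those two landing spots side by side as columns gives the rotation matrix in our coordinates. The name 𝑀 is borrowed from the closing remark of this video; the typesetting is an assumption.

```latex
M = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},
\qquad
M\,\hat{\imath} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},
\qquad
M\,\hat{\jmath} = \begin{bmatrix} -1 \\ 0 \end{bmatrix}
```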

How would Jennifer describe this same 90-degree rotation of space? You might be tempted to just translate the columns of our rotation matrix into Jennifer’s language, but that’s not quite right. Those columns represent where our basis vectors 𝑖-hat and 𝑗-hat go. But the matrix that Jennifer wants should represent where her basis vectors land, and it needs to describe those landing spots in her language. Here’s a common way to think of how this is done. Start with any vector written in Jennifer’s language. Rather than trying to follow what happens to it in terms of her language, first we’re going to translate it into our language using the change of basis matrix, the one whose columns represent her basis vectors in our language. This gives us the same vector but now written in our language. Then, apply the transformation matrix to what you get by multiplying it on the left. This tells us where that vector lands but still in our language. So, as a last step, apply the inverse change of basis matrix, multiplied on the left as usual, to get the transformed vector but now in Jennifer’s language.
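The three steps can be sketched in Python for the 90-degree rotation and Jennifer's basis from earlier. The function name `rotate_in_her_language` is hypothetical, and the inverse matrix is typed in directly from the values quoted above rather than computed.

```python
from fractions import Fraction

A = [[2, -1],   # change of basis: her language -> our language
     [1,  1]]
A_inv = [[Fraction(1, 3),  Fraction(1, 3)],   # inverse: our language -> hers
         [Fraction(-1, 3), Fraction(2, 3)]]
M = [[0, -1],   # 90-degree counterclockwise rotation, in our language
     [1,  0]]

def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a 2D vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def rotate_in_her_language(v_hers):
    v_ours = mat_vec(A, v_hers)          # step 1: translate into our language
    rotated_ours = mat_vec(M, v_ours)    # step 2: apply the transformation
    return mat_vec(A_inv, rotated_ours)  # step 3: translate back into hers

# Where do Jennifer's own basis vectors land, described in her coordinates?
first = rotate_in_her_language([1, 0])
second = rotate_in_her_language([0, 1])
print(first, second)
```

Feeding in her basis vectors one, zero and zero, one gives exactly the landing spots that should become the columns of the rotation matrix in her language.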

Since we could do this with any vector written in her language, first applying the change of basis, then the transformation, then the inverse change of basis, that composition of three matrices gives us the transformation matrix in Jennifer’s language. It takes in a vector in her language and spits out the transformed version of that vector in her language. For this specific example, when Jennifer’s basis vectors look like two, one and negative one, one in our language and when the transformation is a 90-degree rotation, the product of these three matrices, if you work through it, has columns one-third, five-thirds and negative two-thirds, negative one-third. So if Jennifer multiplies that matrix by the coordinates of a vector in her system, it will return the 90-degree rotated version of that vector expressed in her coordinate system.
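If you do want to work through it, the product can be multiplied out right to left. Here 𝐴 stands for the change of basis matrix and 𝑀 for the rotation; the typesetting is an assumption.

```latex
A^{-1} M A
  = \frac{1}{3}\begin{bmatrix} 1 & 1 \\ -1 & 2 \end{bmatrix}
    \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
    \begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix}
  = \frac{1}{3}\begin{bmatrix} 1 & 1 \\ -1 & 2 \end{bmatrix}
    \begin{bmatrix} -1 & -1 \\ 2 & -1 \end{bmatrix}
  = \begin{bmatrix} \tfrac{1}{3} & -\tfrac{2}{3} \\[2pt] \tfrac{5}{3} & -\tfrac{1}{3} \end{bmatrix}
```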

In general, whenever you see an expression like 𝐴 inverse times 𝑀 times 𝐴, it suggests a mathematical sort of empathy. That middle matrix represents a transformation of some kind, as you see it, and the outer two matrices represent the empathy, the shift in perspective, and the full matrix product represents that same transformation but as someone else sees it. For those of you wondering why we care about alternate coordinate systems, the next video on eigenvectors and eigenvalues will give a really important example of this. See you then!
