Video Transcript
If I have a vector sitting here in
2D space, we have a standard way to describe it with coordinates. In this case, the vector has
coordinates three, two, which means going from its tail to its tip, involves moving
three units to the right and two units up.
Now, the more
linear-algebra-oriented way to describe coordinates is to think of each of these
numbers as a scalar, a thing that stretches or squishes vectors. You think of that first coordinate
as scaling 𝑖-hat, the vector with length one pointing to the right, while the
second coordinate scales 𝑗-hat, the vector with length one pointing straight
up. The tip-to-tail sum of those two
scaled vectors is what the coordinates are meant to describe.
You can think of these two special
vectors as encapsulating all of the implicit assumptions of our coordinate system,
the fact that the first number indicates rightward motion, that the second one
indicates upward motion, exactly how far a unit of distance is; all of that is tied up
in the choice of 𝑖-hat and 𝑗-hat as the vectors which our scalar coordinates are
meant to actually scale. Any way to translate between
vectors and sets of numbers is called a coordinate system, and the two special
vectors, 𝑖-hat and 𝑗-hat, are called the basis vectors of our standard coordinate
system.
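To make that concrete, here is a minimal Python sketch of reading the coordinates three, two as scalars on the basis vectors. The helper names `scale` and `add` are made up for illustration; they are not from the video.

```python
def scale(c, v):
    """Scale the vector v (a tuple of numbers) by the scalar c."""
    return tuple(c * x for x in v)

def add(u, v):
    """Tip-to-tail sum of two vectors, computed component by component."""
    return tuple(a + b for a, b in zip(u, v))

i_hat = (1, 0)  # length-one vector pointing to the right
j_hat = (0, 1)  # length-one vector pointing straight up

# The coordinates (3, 2) mean: 3 times i-hat plus 2 times j-hat.
v = add(scale(3, i_hat), scale(2, j_hat))
print(v)  # (3, 2)
```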
What I’d like to talk about here is
the idea of using a different set of basis vectors. For example, let’s say you have a
friend, Jennifer, who uses a different set of basis vectors, which I’ll call 𝐛 one
and 𝐛 two. Her first basis vector, 𝐛 one,
points up and to the right a little bit, and her second vector, 𝐛 two, points left
and up. Now, take another look at that
vector that I showed earlier, the one that you and I would describe using the
coordinates three, two using our basis vectors 𝑖-hat and 𝑗-hat. Jennifer would actually describe
this vector with the coordinates five-thirds and one-third. What this means is that the
particular way to get to that vector using her two basis vectors is to scale 𝐛 one
by five-thirds, scale 𝐛 two by one-third, then add them both together. In a little bit, I’ll show you how
you could have figured out those two numbers five-thirds and one-third.
In general, whenever Jennifer uses
coordinates to describe a vector, she thinks of her first coordinate as scaling 𝐛
one, the second coordinate as scaling 𝐛 two, and she adds the results. What she gets will typically be
completely different from the vector that you and I would think of as having those
coordinates. To be a little more precise about
the set-up here, her first basis vector, 𝐛 one, is something that we would describe
with the coordinates two, one, and her second basis vector, 𝐛 two, is something
that we would describe as negative one, one.
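As a quick check of the numbers so far, the sketch below scales 𝐛 one by five-thirds and 𝐛 two by one-third and adds the results, with both basis vectors written in our coordinates. The function name `combine` is hypothetical, and `fractions.Fraction` is used only to keep the thirds exact.

```python
from fractions import Fraction

# Jennifer's basis vectors, written in our coordinates.
b1 = (2, 1)
b2 = (-1, 1)

def combine(c1, c2, u, v):
    """Return c1*u + c2*v for two 2D vectors u and v."""
    return (c1 * u[0] + c2 * v[0], c1 * u[1] + c2 * v[1])

# Jennifer's coordinates (5/3, 1/3) should name the vector we call (3, 2).
result = combine(Fraction(5, 3), Fraction(1, 3), b1, b2)
print(result == (3, 2))  # True
```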
But it’s important to realize that, from
her perspective, in her system, those vectors have coordinates one, zero and zero,
one. They are what define the meaning of
the coordinates one, zero and zero, one in her world. So, in effect, we’re speaking
different languages. We’re all looking at the same
vectors in space, but Jennifer uses different words and numbers to describe
them.
Let me say a quick word about how
I’m representing things here. When I animate 2D space I typically
use this square grid, but that grid is just a construct, a way to visualize our
coordinate system, and so it depends on our choice of basis. Space itself has no intrinsic
grid. Jennifer might draw her own grid,
which would be an equally made-up construct meant as nothing more than a visual tool
to help follow the meaning of her coordinates. Her origin, though, would actually
line up with ours since everybody agrees on what the coordinates zero, zero should
mean. It’s the thing that you get when
you scale any vector by zero. But the direction of her axes and
the spacing of her grid lines will be different, depending on her choice of basis
vectors.
So, after all this is set up, a
pretty natural question to ask is how we translate between coordinate systems. If, for example, Jennifer describes
a vector with coordinates negative one, two, what would that be in our coordinate
system? How do you translate from her
language to ours? Well, what her coordinates are
saying is that this vector is negative one times 𝐛 one plus two times 𝐛 two. And from our perspective, 𝐛 one
has coordinates two, one and 𝐛 two has coordinates negative one, one, so we can
actually compute negative one times 𝐛 one plus two times 𝐛 two as they’re
represented in our coordinate system. And working this out, you get a
vector with coordinates negative four, one. So, that’s how we would describe
the vector that she thinks of as negative one, two.
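That arithmetic can be sketched in a few lines of Python. The function name `to_our_coords` is an assumption made for this example, not something from the video.

```python
# Jennifer's basis vectors, written in our coordinates.
b1 = (2, 1)
b2 = (-1, 1)

def to_our_coords(her_coords):
    """Translate Jennifer's coordinates (c1, c2) into ours as c1*b1 + c2*b2."""
    c1, c2 = her_coords
    return (c1 * b1[0] + c2 * b2[0], c1 * b1[1] + c2 * b2[1])

ours = to_our_coords((-1, 2))
print(ours)  # (-4, 1)
```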
This process here of scaling each
of her basis vectors by the corresponding coordinates of some vector then adding
them together might feel somewhat familiar. It’s matrix-vector multiplication
with a matrix whose columns represent Jennifer’s basis vectors in our language. In fact, once you understand
matrix-vector multiplication as applying a certain linear transformation — say by
watching what I view to be the most important video in this series, chapter three
— there’s a pretty intuitive way to think about what’s going on here. A matrix whose columns represent
Jennifer’s basis vectors can be thought of as a transformation that moves our basis
vectors — 𝑖-hat and 𝑗-hat, the things we think of when we say one, zero and zero,
one — to Jennifer’s basis vectors, the things she thinks of when she says one, zero
and zero, one.
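Here is one way to sketch that matrix in Python. It is stored as a list of rows, so Jennifer's basis vectors appear as its columns. The names `A` and `matvec` are illustrative choices, not notation from the video at this point.

```python
# Change of basis matrix, stored row by row.
# Its columns are Jennifer's basis vectors in our language: (2, 1) and (-1, 1).
A = [[2, -1],
     [1,  1]]

def matvec(M, v):
    """Multiply the 2x2 matrix M (a list of rows) by the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

print(matvec(A, (1, 0)))   # (2, 1): where i-hat lands, Jennifer's b1
print(matvec(A, (0, 1)))   # (-1, 1): where j-hat lands, Jennifer's b2
print(matvec(A, (-1, 2)))  # (-4, 1): her (-1, 2) translated into our language
```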
To show how this works, let’s walk
through what it would mean to take the vector that we think of as having coordinates
negative one, two and apply that transformation. Before the linear transformation,
we’re thinking of this vector as a certain linear combination of our basis vectors,
negative one times 𝑖-hat plus two times 𝑗-hat. And the key feature of a linear
transformation is that the resulting vector will be that same linear combination,
but of the new basis vectors, negative one times the place where 𝑖-hat lands plus
two times the place where 𝑗-hat lands. So what this matrix does is
transform our misconception of what Jennifer means into the actual vector that she’s
referring to. I remember that when I was first
learning this, it always felt kind of backwards to me. Geometrically, this matrix
transforms our grid into Jennifer’s grid. But numerically, it’s translating a
vector described in her language to our language. What made it finally click for me
was thinking about how it takes our misconception of what Jennifer means, the vector
we get using the same coordinates but in our system, then it transforms it into the
vector that she really meant.
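The "same linear combination" claim can be checked directly. This sketch transforms our vector negative one, two in one go, then rebuilds the result from the places where 𝑖-hat and 𝑗-hat land. The variable names are assumptions made for the example.

```python
# Change of basis matrix with Jennifer's basis vectors as columns.
A = [[2, -1],
     [1,  1]]

def matvec(M, v):
    """Multiply the 2x2 matrix M (a list of rows) by the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

i_lands = matvec(A, (1, 0))  # where i-hat lands
j_lands = matvec(A, (0, 1))  # where j-hat lands

# Transform our "misconception" (-1, 2) directly...
direct = matvec(A, (-1, 2))

# ...and compare with -1 times where i-hat lands plus 2 times where j-hat lands.
recombined = tuple(-1 * a + 2 * b for a, b in zip(i_lands, j_lands))

print(direct == recombined)  # True
```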
What about going the other way
around? In the example I used earlier in this
video, when I have the vector with coordinates three, two in our system, how did I
compute that it would have coordinates five-thirds and one-third in Jennifer’s
system? You start with that change of basis
matrix that translates Jennifer’s language into ours, then you take its inverse. Remember, the inverse of a
transformation is a new transformation that corresponds to playing that first one
backwards. In practice, especially when you’re
working in more than two dimensions, you’d use a computer to compute the matrix that
actually represents this inverse. In this case, the inverse of the
change of basis matrix that has Jennifer’s basis as its columns ends up working out
to have columns one-third, negative one-third and one-third, two-thirds. So, for example, to see what the
vector three, two looks like in Jennifer’s system, we multiply this inverse change
of basis matrix by the vector three, two, which works out to be five-thirds,
one-third.
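For a two-by-two matrix the inverse can be written out by hand, so here is a small sketch that does that instead of calling a linear algebra library. It uses the standard two-by-two inverse formula, which is a choice made for this example rather than something the video walks through. The helper names are hypothetical.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Invert a 2x2 matrix [[a, b], [c, d]] using the standard formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det],
            [-c / det, a / det]]

def matvec(M, v):
    """Multiply the 2x2 matrix M (a list of rows) by the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

# Change of basis matrix with Jennifer's basis vectors as columns.
A = [[2, -1],
     [1,  1]]

A_inv = inverse_2x2(A)
# Rows of A_inv are (1/3, 1/3) and (-1/3, 2/3),
# so its columns are (1/3, -1/3) and (1/3, 2/3).

hers = matvec(A_inv, (3, 2))
print(hers == (Fraction(5, 3), Fraction(1, 3)))  # True
```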
So that, in a nutshell, is how to
translate the description of individual vectors back and forth between coordinate
systems. The matrix whose columns represent
Jennifer’s basis vectors, but written in our coordinates, translates vectors from
her language into our language. And the inverse matrix does the
opposite. But vectors aren’t the only thing
that we describe using coordinates.
For this next part, it’s important
that you’re all comfortable representing transformations with matrices and that you
know how matrix multiplication corresponds to composing successive
transformations. Definitely pause and take a look at
chapters three and four if any of that feels uneasy.
Consider some linear
transformation, like a 90-degree counterclockwise rotation. When you and I represent this with
a matrix, we follow where the basis vectors 𝑖-hat and 𝑗-hat each go. 𝑖-hat ends up at the spot with
coordinates zero, one, and 𝑗-hat ends up at the spot with coordinates negative one,
zero, so those coordinates become the columns of our matrix. But this representation is heavily
tied up in our choice of basis vectors, from the fact that we’re following 𝑖-hat
and 𝑗-hat in the first place to the fact that we’re recording their landing spots
in our own coordinate system.
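As a sketch of that recipe, the code below builds the matrix by recording where 𝑖-hat and 𝑗-hat land and using those landing spots as columns. The function names are invented for the example, and the rotation is written as the rule (x, y) goes to (-y, x).

```python
def rotate_90_ccw(v):
    """Rotate a 2D vector 90 degrees counterclockwise."""
    x, y = v
    return (-y, x)

def matrix_from_transformation(f):
    """Build the 2x2 matrix of f (as a list of rows) from where i-hat and j-hat land."""
    col1 = f((1, 0))  # landing spot of i-hat
    col2 = f((0, 1))  # landing spot of j-hat
    return [[col1[0], col2[0]],
            [col1[1], col2[1]]]

M = matrix_from_transformation(rotate_90_ccw)
print(M)  # [[0, -1], [1, 0]]
```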
How would Jennifer describe this
same 90-degree rotation of space? You might be tempted to just
translate the columns of our rotation matrix into Jennifer’s language, but that’s
not quite right. Those columns represent where our
basis vectors 𝑖-hat and 𝑗-hat go. But the matrix that Jennifer wants
should represent where her basis vectors land, and it needs to describe those
landing spots in her language. Here’s a common way to think of how
this is done. Start with any vector written in
Jennifer’s language. Rather than trying to follow what
happens to it in terms of her language, first we’re going to translate it into our
language using the change of basis matrix, the one whose columns represent her basis
vectors in our language. This gives us the same vector but
now written in our language. Then, apply the transformation
matrix to what you get by multiplying it on the left. This tells us where that vector
lands but still in our language. So, as a last step, apply the
inverse change of basis matrix, multiplied on the left as usual, to get the
transformed vector but now in Jennifer’s language.
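Those three steps can be followed for a single vector. The sketch below starts with the vector Jennifer calls one, zero and reuses the inverse matrix quoted earlier in the transcript. The variable names are assumptions made for the example.

```python
from fractions import Fraction

def matvec(M, v):
    """Multiply the 2x2 matrix M (a list of rows) by the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

A = [[2, -1], [1, 1]]        # change of basis: her language -> ours
M = [[0, -1], [1, 0]]        # 90-degree counterclockwise rotation, in our language
A_inv = [[Fraction(1, 3), Fraction(1, 3)],
         [Fraction(-1, 3), Fraction(2, 3)]]  # inverse change of basis: ours -> hers

v_hers = (1, 0)                        # a vector written in Jennifer's language
v_ours = matvec(A, v_hers)             # step 1: same vector, in our language
rotated_ours = matvec(M, v_ours)       # step 2: where it lands, in our language
rotated_hers = matvec(A_inv, rotated_ours)  # step 3: the landing spot, in her language

print(v_ours)        # (2, 1)
print(rotated_ours)  # (-1, 2)
print(rotated_hers == (Fraction(1, 3), Fraction(5, 3)))  # True
```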
Since we could do this with any
vector written in her language, first applying the change of basis, then the
transformation, then the inverse change of basis, that composition of three matrices
gives us the transformation matrix in Jennifer’s language. It takes in a vector written in her
language and spits out the transformed version of that vector in her language. For this specific example, when
Jennifer’s basis vectors look like two, one and negative one, one in our language
and when the transformation is a 90-degree rotation, the product of these three
matrices, if you work through it, has columns one-third, five-thirds and negative
two-thirds, negative one-third. So if Jennifer multiplies that
matrix by the coordinates of a vector in her system, it will return the 90-degree
rotated version of that vector expressed in her coordinate system.
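If you'd rather not work through the product by hand, here is a short sketch that multiplies the three matrices. The helper `matmul` is hypothetical and handles only two-by-two matrices stored as lists of rows.

```python
from fractions import Fraction

def matmul(P, Q):
    """Multiply two 2x2 matrices, each stored as a list of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, -1], [1, 1]]        # Jennifer's basis vectors as columns
M = [[0, -1], [1, 0]]        # 90-degree counterclockwise rotation, in our language
A_inv = [[Fraction(1, 3), Fraction(1, 3)],
         [Fraction(-1, 3), Fraction(2, 3)]]  # inverse of A

# Apply A first, then M, then A_inv; the rightmost matrix acts first.
rotation_in_her_language = matmul(A_inv, matmul(M, A))

# Rows are (1/3, -2/3) and (5/3, -1/3),
# so the columns are (1/3, 5/3) and (-2/3, -1/3).
expected = [[Fraction(1, 3), Fraction(-2, 3)],
            [Fraction(5, 3), Fraction(-1, 3)]]
print(rotation_in_her_language == expected)  # True
```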
In general, whenever you see an
expression like 𝐴 inverse times 𝑀 times 𝐴, it suggests a mathematical sort of
empathy. That middle matrix represents a
transformation of some kind, as you see it, and the outer two matrices represent the
empathy, the shift in perspective, and the full matrix product represents that same
transformation but as someone else sees it. For those of you wondering why we
care about alternate coordinate systems, the next video on eigenvectors and
eigenvalues will give a really important example of this. See you then!