Explainer: Matrix of Linear Transformation

In this explainer, we will learn how to find the matrix of a linear transformation and the image of a vector under such a transformation.

Linear algebra provides an invaluable tool kit for computational tasks such as solving a system of linear equations or finding the inverse of a square matrix. Given that these tasks usually involve a large number of calculations, it is often the case that the geometric interpretation is underappreciated. Linear algebra is a genuine oddity in that it is possible to understand this entire discipline from multiple, distinct perspectives, each of which has its own merits and its own way of illuminating this vast and elegant subject. A criminally overlooked perspective for new students of linear algebra is the one in which we think of matrices as a way of transforming vectors, hence offering us a well-developed tool kit to start describing (linear) spatial geometry.

Vectors are special types of matrices that can be split into two categories: row vectors and column vectors. A row vector is a matrix of order 1×𝑛, which has 1 row and 𝑛 columns, whereas a column vector is a matrix of order π‘šΓ—1, which has π‘š rows and 1 column. Although convention differs from source to source, it is arguably most sensible to think only in terms of column vectors. There are two reasons for this: column vectors are used more often and can be related to row vectors by transposition if needed; and a column vector can be multiplied on the left by a matrix, which is again simply a matter of convention, but it is the one that most sources follow.

In geometry, a (column) vector having π‘š entries would be referred to as an π‘š-dimensional vector, and these objects are used to represent points in π‘š-dimensional space. For example, consider the two-dimensional vectors
\[
a = \begin{pmatrix} 3 \\ 2 \end{pmatrix}, \qquad b = \begin{pmatrix} -5 \\ 2 \end{pmatrix}.
\]

We can represent these vectors in diagrammatic form as shown in the figure, where the first entry in each vector corresponds to the π‘₯-coordinate and the final entry corresponds to the 𝑦-coordinate. This is all fine and well and, in all likelihood, a familiar concept to those who have already become interested in the higher art of linear algebra. We will not spend any time revising what a vector is or how it can be represented diagrammatically, but instead we will focus on what it is that we can do with vectors within the construct of linear algebra.

Suppose that we had decided that the previous vectors π‘Ž and 𝑏, whilst interesting, were something that we wanted to modify. Any previous study of mathematics will give some clues as to what options we have available: rotation, reflection, translation, dilation, and so on. In fact, most of these options are simply special cases of a much more powerful route: using matrix multiplication to change the vectors. To demonstrate what we mean, we consider the vectors π‘Ž and 𝑏 as stated above, and additionally we also take the matrix
\[
M = \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}.
\]

The matrix multiplication 𝐴𝐡 is well defined so long as 𝐴 has order π‘šΓ—π‘› and 𝐡 has order 𝑛×𝑝. The resulting matrix is of order π‘šΓ—π‘. Suppose now that we were to consider the matrix products π‘Žβ€²=π‘€π‘Ž and 𝑏′=𝑀𝑏; then, from the dimensionality of the matrix and both vectors, we know that this will result in two new vectors π‘Žβ€² and 𝑏′ that will be of order 2Γ—1, hence having the same dimension as the original two vectors. Specifically, by completing the correct matrix multiplication, we would obtain
\[
a' = Ma = \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} 7 \\ 3 \end{pmatrix}
\qquad \text{and} \qquad
b' = Mb = \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} -5 \\ 2 \end{pmatrix} = \begin{pmatrix} -1 \\ 11 \end{pmatrix}.
\]
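As an aside, arithmetic like this is easy to check mechanically. The following Python sketch (the function name `matvec` is my own, not part of the explainer) multiplies a 2×2 matrix by a two-dimensional column vector, with both represented as plain lists:

```python
def matvec(M, v):
    """Multiply a 2x2 matrix M (given as a list of rows) by a 2D column vector v."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

# The matrix and vectors used in the text.
M = [[1, 2], [-1, 3]]
a = [3, 2]
b = [-5, 2]

print(matvec(M, a))  # [7, 3]
print(matvec(M, b))  # [-1, 11]
```

Representing a column vector as a flat list keeps the sketch short; a fuller implementation would track orders explicitly so that ill-defined products are rejected.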

To better understand how the original vectors have changed, we now plot everything on the same axis as shown in the figure.

As we can see, the original vectors have been transformed by the matrix 𝑀, and to notably different effect. In taking the original vectors π‘Ž and 𝑏 and multiplying them on the left-hand side by the matrix 𝑀, we say that we have performed a β€œlinear transformation” on the original vectors to achieve the new vectors π‘Žβ€² and 𝑏′. Given that 𝑀 is a 2Γ—2 matrix, there are 4 entries whose values we are free to choose, meaning that there are many possible linear transformations available to us. For example, suppose that we considered the new matrix
\[
M = \begin{pmatrix} 2 & -1 \\ 1 & 1 \end{pmatrix}
\]
and the same vectors that we used above:
\[
a = \begin{pmatrix} 3 \\ 2 \end{pmatrix}, \qquad b = \begin{pmatrix} -5 \\ 2 \end{pmatrix}.
\]

Then, by defining the new vectors π‘Žβ€²=π‘€π‘Ž and 𝑏′=𝑀𝑏, we obtain the following:
\[
a' = \begin{pmatrix} 4 \\ 5 \end{pmatrix}, \qquad b' = \begin{pmatrix} -12 \\ -3 \end{pmatrix}.
\]

These are the two vectors after having performed the linear transformation represented by the matrix 𝑀. We have plotted these new vectors in the figure.

As we can clearly see, the new vectors π‘Žβ€² and 𝑏′ are different from those that were obtained using the previous linear transformation, as shown in the figure. The two output vectors π‘Žβ€² and 𝑏′ differ between the two diagrams despite the original vectors being the same, and this is because of the differences between the two matrices that were used to define the linear transformations.

At this stage, it should be becoming clear that linear transformations can be used to describe a large number of ways that we might wish to transform a set of vectors. When attempting to describe these in a visual sense, the result can quickly become confusing and intractable due to the large number of vectors that might be involved. There is a simple visual trick that can help enormously with this: instead of connecting every vector to the origin, we connect every vector only to one of the other vectors (in a particularly helpful order). Suppose that we took the three vectors
\[
a = \begin{pmatrix} 3 \\ 4 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 7 \end{pmatrix}, \qquad c = \begin{pmatrix} -2 \\ 3 \end{pmatrix}
\]
and the linear transformation that is represented by the matrix
\[
M = \begin{pmatrix} 1 & 2 \\ 3 & -1 \end{pmatrix}.
\]

Then, the vectors after this transformation would be
\[
a' = \begin{pmatrix} 11 \\ 5 \end{pmatrix}, \qquad b' = \begin{pmatrix} 15 \\ -4 \end{pmatrix}, \qquad c' = \begin{pmatrix} 4 \\ -9 \end{pmatrix}.
\]
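These three products can be confirmed with a short Python sketch (the helper name is mine, assumed rather than taken from the text):

```python
def matvec(M, v):
    """Multiply a 2x2 matrix (list of rows) by a 2D column vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

M = [[1, 2], [3, -1]]
vectors = {"a": [3, 4], "b": [1, 7], "c": [-2, 3]}

for name, v in vectors.items():
    print(name + "' =", matvec(M, v))
# a' = [11, 5], b' = [15, -4], c' = [4, -9]
```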

In the figure, we have shown how to better represent the linear transformation by connecting the three given vectors as a quadrilateral where the fourth vertex is the origin.

The orange quadrilateral is prior to the linear transformation and the blue quadrilateral is after this transformation has been performed. In this sense we can obtain a better visualization of how the linear transformation acts on the two-dimensional space that it alters. This method of describing the linear transformation is clearly superior to the previous diagrammatic method, where every vector was connected to the origin.

Although we could have defined any set of three vectors π‘Ž, 𝑏, and 𝑐 to generate a diagram of the above type, in practice, we often find ourselves returning to a standard set of input vectors that will allow us to get a snapshot of the linear transformation. Suppose that we kept the linear transformation as defined by the previous matrix but instead chose the set of vectors
\[
a = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad c = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]

These vectors describe a square with side length 1 that has one vertex at the origin and is contained within the upper-right quadrant of the plane. After applying the linear transformation, we find the modified vectors
\[
a' = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \qquad b' = \begin{pmatrix} 3 \\ 2 \end{pmatrix}, \qquad c' = \begin{pmatrix} 2 \\ -1 \end{pmatrix}.
\]

As shown in the figure, the unit square has been stretched and rotated in a way that is difficult to describe precisely but is easier to understand in a visual sense.

From this, we can deduce that the effect of the linear transformation is that, in some way, the order of the corners π‘Ž and 𝑐 is β€œflipped” while the effect on 𝑏 is more akin to a dilation of the original vector. This is still by no means a perfect description of the linear transformation, and for a complete description there is no alternative to stating the matrix 𝑀 that defined it.

In being able to choose any values for each of the 4 entries that comprise a 2Γ—2 matrix, we have access to an infinite number of linear transformations on two-dimensional vectors. There are of course many subcategories as to the type of linear transformation that we can perform, for example, rotations, dilations, and reflections. Before these can be understood fully and before the significance of the underlying algebraic structure can be revealed, it is necessary that we practice this idea to ensure that we can first perform the linear transformation that is defined by any 2Γ—2 matrix.

Example 1: Linear Transformation of Two-Dimensional Vectors

Consider the linear transformation described by the matrix
\[
M = \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix}.
\]

Let us also define the quadrilateral with one vertex at the origin and the three remaining corners described by the vectors
\[
a = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad c = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]

Plot a diagram to show the effect of the given linear transformation on the given quadrilateral.

Answer

We complete the matrix multiplications π‘Žβ€²=π‘€π‘Ž, 𝑏′=𝑀𝑏, and 𝑐′=𝑀𝑐 to produce the vectors that we need to complete the diagram. The calculation for π‘Žβ€² is
\[
a' = Ma = \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
\]

The calculations for 𝑏′ and 𝑐′ are completed in the same way:
\[
b' = Mb = \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}
\qquad \text{and} \qquad
c' = Mc = \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \end{pmatrix}.
\]

The resulting quadrilateral is shown in the figure, wherein the effect of the given linear transformation is demonstrated by connecting the three vectors π‘Žβ€², 𝑏′, and 𝑐′ to the origin in order.

In choosing the origin and the 3 particular vectors
\[
a = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad c = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\]
we have actually made life much simpler for ourselves. As we can verify from every previous example where the vectors take these values, the vector π‘Žβ€² is simply the first column of the 2Γ—2 matrix and the vector 𝑐′ is just the second column of this matrix. The vector 𝑏′ can then always be written as 𝑏′=π‘Žβ€²+𝑐′. Not only does this approach allow us to describe the effect of the linear transformation on the given quadrilateral without having to perform any matrix multiplication, it also shows how we can generate a 2Γ—2 matrix that achieves any desired effect on π‘Ž and 𝑐, with the effect on 𝑏 being a consequence of this.
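This observation is easy to confirm for Example 1 above with a brief sketch (the function name is my own):

```python
def matvec(M, v):
    """Multiply a 2x2 matrix (list of rows) by a 2D column vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

M = [[2, 1], [1, 4]]  # the matrix from Example 1

a_prime = matvec(M, [1, 0])  # [2, 1] -- the first column of M
c_prime = matvec(M, [0, 1])  # [1, 4] -- the second column of M
b_prime = matvec(M, [1, 1])  # [3, 5] -- equal to a' + c'

print(a_prime, b_prime, c_prime)
```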

For example, suppose for some very good reason that we wanted to take the vectors defined above and apply a linear transformation to π‘Ž and 𝑐 to give the new vectors
\[
a' = \begin{pmatrix} -1 \\ -3 \end{pmatrix}, \qquad c' = \begin{pmatrix} 2 \\ -2 \end{pmatrix}.
\]

This will place π‘Žβ€² in the lower-left quadrant of the plane and 𝑐′ in the lower-right quadrant of the plane. There is only one option for the matrix that achieves this, given how we have defined π‘Ž and 𝑐:
\[
M = \begin{pmatrix} -1 & 2 \\ -3 & -2 \end{pmatrix}.
\]

We could either use the relationship 𝑏′=π‘Žβ€²+𝑐′ or directly complete the calculation 𝑏′=𝑀𝑏 to find that
\[
b' = \begin{pmatrix} 1 \\ -5 \end{pmatrix}.
\]
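The construction just described can also be sketched in code: place the desired images π‘Žβ€² and 𝑐′ into the columns of 𝑀, then read off 𝑏′ as their sum. (The helper name below is my own.)

```python
def matrix_from_columns(col1, col2):
    """Build the 2x2 matrix whose columns are the images of (1, 0) and (0, 1)."""
    return [[col1[0], col2[0]],
            [col1[1], col2[1]]]

a_prime = [-1, -3]
c_prime = [2, -2]

M = matrix_from_columns(a_prime, c_prime)
b_prime = [a_prime[0] + c_prime[0], a_prime[1] + c_prime[1]]

print(M)        # [[-1, 2], [-3, -2]]
print(b_prime)  # [1, -5]
```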

Given that we now have π‘Žβ€², 𝑏′, and 𝑐′ without having performed any calculation other than addition, we can plot the graph as shown in the figure. The effect is exactly as we had expected, with π‘Žβ€² and 𝑐′ appearing in the quadrants that we predicted. Please note that we could have picked any two of the three original vectors π‘Ž, 𝑏, and 𝑐 and then used the relationship 𝑏′=π‘Žβ€²+𝑐′ to find the third vector after the linear transformation. We only initially considered the vectors π‘Ž and 𝑐 because this meant that we could immediately populate the two columns of the 2Γ—2 matrix without having to perform any calculations. Had we instead chosen 𝑏 as one of the two initial vectors, then generally we would have had no option but to first complete the matrix multiplication 𝑏′=𝑀𝑏.

Example 2: Linear Transformation of Two-Dimensional Vectors

What is the matrix 𝑀 that sends the points 𝐴, 𝐡, and 𝐢 to the points 𝐴′, 𝐡′, and 𝐢′ as shown?

Answer

By looking at the figure, we can see that the three initial vectors are
\[
A = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad C = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]

After the linear transformation has been applied, we have the three resultant vectors
\[
A' = \begin{pmatrix} 1 \\ 4 \end{pmatrix}, \qquad B' = \begin{pmatrix} 4 \\ 5 \end{pmatrix}, \qquad C' = \begin{pmatrix} 3 \\ 1 \end{pmatrix}.
\]

Given our choice of the vectors 𝐴 and 𝐢, it is immediately apparent that the matrix 𝑀 must be the concatenation of the two column vectors 𝐴′ and 𝐢′:
\[
M = \begin{pmatrix} 1 & 3 \\ 4 & 1 \end{pmatrix}.
\]

We can check that this is the case for any of the vectors 𝐴′, 𝐡′, and 𝐢′. We will choose only the vector 𝐡′, for which we can check that
\[
B' = MB = \begin{pmatrix} 1 & 3 \\ 4 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \\ 5 \end{pmatrix}.
\]

We can check that 𝐴′=𝑀𝐴 and 𝐢′=𝑀𝐢 using the equivalent calculation as given above.
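All three checks in this example can be bundled into a few lines of Python (a sketch; the helper name is mine):

```python
def matvec(M, v):
    """Multiply a 2x2 matrix (list of rows) by a 2D column vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

M = [[1, 3], [4, 1]]

# Each input point should map to the corresponding image point.
assert matvec(M, [1, 0]) == [1, 4]  # A maps to A'
assert matvec(M, [1, 1]) == [4, 5]  # B maps to B'
assert matvec(M, [0, 1]) == [3, 1]  # C maps to C'
print("M reproduces A', B', and C'")
```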

Normally, the standard vectors
\[
a = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad c = \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]
are used for questions of this type because the transformation of these points, especially π‘Ž and 𝑐, is straightforward to understand. This removes the need to complete any matrix multiplication or any other arithmetic beyond the two instances of simple addition that are used to find 𝑏′. The vectors π‘Ž and 𝑐 each have one entry equal to 1 and the other entry equal to 0, meaning that it takes little effort to infer the form of the matrix 𝑀 that encodes the specified linear transformation. If the given initial vectors are not this simple, then more work will be needed to determine the linear transformation.

Example 3: Linear Transformation of Two-Dimensional Vectors

Find the matrix of the transformation that maps the points π‘Ž, 𝑏, and 𝑐 onto π‘Žβ€², 𝑏′, and 𝑐′ as shown in the figure.

Answer

By examining the figure, we see that the original set of points, in orange, are as follows:
\[
a = \begin{pmatrix} 3 \\ 2 \end{pmatrix}, \qquad b = \begin{pmatrix} 5 \\ -1 \end{pmatrix}, \qquad c = \begin{pmatrix} 3 \\ -2 \end{pmatrix}.
\]

After the linear transformation has been applied, we have the points colored blue on the graph:
\[
a' = \begin{pmatrix} 7 \\ -1 \end{pmatrix}, \qquad b' = \begin{pmatrix} 16 \\ -6 \end{pmatrix}, \qquad c' = \begin{pmatrix} 11 \\ -5 \end{pmatrix}.
\]

The matrix that represents this linear transformation has order 2Γ—2, and to this we will assign the variable 𝑀. Given that we do not yet know the form of 𝑀, we express the matrix with unknown entries:
\[
M = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix}.
\]

We can then use any of the equations that link together the initial set of points with the set of points after the linear transformation has been applied. We could choose to use any two of the three relations π‘Žβ€²=π‘€π‘Ž, 𝑏′=𝑀𝑏, and 𝑐′=𝑀𝑐. With no obvious reason to select any of the three points as favorable, we begin with the relationship π‘Žβ€²=π‘€π‘Ž. Writing this out in full gives the matrix equation
\[
\begin{pmatrix} 7 \\ -1 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} 3 \\ 2 \end{pmatrix}.
\]

We can check that the matrix multiplication in this equation is well defined, so we complete it to obtain the two equations:

7=3π‘š11+2π‘š12,βˆ’1=3π‘š21+2π‘š22.(1)

This is half of the information needed to answer the problem by finding the values of π‘š11, π‘š12, π‘š21, and π‘š22. To complete this process, we need to use one of the two remaining relationships: 𝑏′=𝑀𝑏 and 𝑐′=𝑀𝑐. With no obvious advantage to choosing either of these, we select 𝑏′=𝑀𝑏. Writing this out in full gives the matrix equation
\[
\begin{pmatrix} 16 \\ -6 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} 5 \\ -1 \end{pmatrix}.
\]

Completing the matrix multiplication gives

16=5π‘š11βˆ’π‘š12,βˆ’6=5π‘š21βˆ’π‘š22.(2)

Now we compare equations (1) and (2). Notice that the first equation of each features the entries π‘š11 and π‘š12, which means that we can solve the simultaneous equations 7 = 3π‘š11 + 2π‘š12 and 16 = 5π‘š11 βˆ’ π‘š12 to give π‘š11 = 3 and π‘š12 = βˆ’1. By then comparing the second equation of each of (1) and (2), we get the two expressions βˆ’1 = 3π‘š21 + 2π‘š22 and βˆ’6 = 5π‘š21 βˆ’ π‘š22. These give π‘š21 = βˆ’1 and π‘š22 = 1, meaning that the full matrix 𝑀 can be written as
\[
M = \begin{pmatrix} 3 & -1 \\ -1 & 1 \end{pmatrix}.
\]

We can check that this is the correct matrix by ensuring that 𝑐′=𝑀𝑐, which is the only one of the three given relationships that we have not yet used. We do indeed find that this equation is honored by the transformation matrix 𝑀, since it is the case that
\[
\begin{pmatrix} 11 \\ -5 \end{pmatrix} = \begin{pmatrix} 3 & -1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ -2 \end{pmatrix}.
\]
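Each pair of simultaneous equations above is a 2×2 linear system, so the whole procedure can be sketched with Cramer's rule (a standard technique, though not the one spelled out in the text; the function name is my own):

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

# First halves of equations (1) and (2): unknowns m11, m12.
m11, m12 = solve_2x2(3, 2, 7, 5, -1, 16)
# Second halves of equations (1) and (2): unknowns m21, m22.
m21, m22 = solve_2x2(3, 2, -1, 5, -1, -6)

print(m11, m12, m21, m22)  # 3 -1 -1 1
```

Using `Fraction` is overkill here, where every entry is an integer, but it means the same helper works unchanged when the answer is not integral.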

Example 4: Linear Transformation of Two-Dimensional Vectors

Find the matrix of the transformation that maps the points 𝐴, 𝐡, and 𝐢 onto 𝐴′, 𝐡′, and 𝐢′ as shown in the figure.

Answer

The initial set of points are
\[
A = \begin{pmatrix} 4 \\ 3 \end{pmatrix}, \qquad B = \begin{pmatrix} 7 \\ 9 \end{pmatrix}, \qquad C = \begin{pmatrix} 3 \\ 6 \end{pmatrix},
\]
and after the linear transformation these points have been mapped to the following points:
\[
A' = \begin{pmatrix} -5 \\ 1 \end{pmatrix}, \qquad B' = \begin{pmatrix} -6 \\ 6 \end{pmatrix}, \qquad C' = \begin{pmatrix} -1 \\ 5 \end{pmatrix}.
\]

We will represent the demonstrated linear transformation by the 2Γ—2 matrix
\[
M = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix},
\]
and we must determine the values of the π‘šπ‘–π‘— entries such that 𝐴′=𝑀𝐴, 𝐡′=𝑀𝐡, and 𝐢′=𝑀𝐢. We will only need two of these relationships, and we begin with the equation 𝐴′=𝑀𝐴. This requires solving the equation
\[
\begin{pmatrix} -5 \\ 1 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} 4 \\ 3 \end{pmatrix},
\]
which can be expanded as two equations by the definition of matrix multiplication:

βˆ’5=4π‘š11+3π‘š12,1=4π‘š21+π‘š22.(3)

Then, we use the relationship 𝐡′=𝑀𝐡 to give the matrix equation
\[
\begin{pmatrix} -6 \\ 6 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \begin{pmatrix} 7 \\ 9 \end{pmatrix}.
\]

Completing the matrix multiplication gives the two equations:

βˆ’6=7π‘š11+9π‘š12,6=7π‘š21+9π‘š22.(4)

There are several techniques that can be used to find the values of the π‘šπ‘–π‘— entries in a way that is more succinct, but for the moment we can take the first equations from (3) and (4) to find π‘š11 and π‘š12, as well as taking the second equations to calculate π‘š21 and π‘š22. Whatever the method, the values that we find are π‘š11 = βˆ’9/5, π‘š12 = 11/15, π‘š21 = βˆ’3/5, and π‘š22 = 17/15, meaning that the matrix of the transformation is
\[
M = \begin{pmatrix} -\dfrac{9}{5} & \dfrac{11}{15} \\[6pt] -\dfrac{3}{5} & \dfrac{17}{15} \end{pmatrix}.
\]
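The same Cramer's-rule sketch handles this example, where `Fraction` keeps the non-integer entries exact (again, the solver is my own illustrative helper, not part of the explainer):

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    return (Fraction(c1 * b2 - c2 * b1, det),
            Fraction(a1 * c2 - a2 * c1, det))

# Equations (3) and (4), first halves: unknowns m11, m12.
m11, m12 = solve_2x2(4, 3, -5, 7, 9, -6)
# Equations (3) and (4), second halves: unknowns m21, m22.
m21, m22 = solve_2x2(4, 3, 1, 7, 9, 6)

print(m11, m12, m21, m22)  # -9/5 11/15 -3/5 17/15
```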