In this explainer, we will learn how to identify elementary matrices, understand their relationship to row operations, and find the inverse of an elementary matrix.

Most of the time, when working with systems of linear equations, our first approach is to create the corresponding augmented coefficient matrix and then manipulate the matrix into reduced echelon form using row operations, in a process most commonly referred to as “Gauss–Jordan elimination.” This approach can be very efficient and is normally the method that is programmed into any computer algebra package by default for solving a system of linear equations. One matter that is often neglected when talking about elementary row operations, or Gauss–Jordan elimination in general, is the fact that every elementary row operation can be encoded by a very simple matrix that is strikingly similar to the identity matrix of the same order. Although such matrices might be considered unnecessary, being able to operate in this way is actually vitally important when looking to complete algorithms such as the LU or PLU decomposition of a matrix. For these methods, it is necessary for us to understand the multiplicative inverse of each of the elementary matrices, as well as to take account of the noncommutative property of matrix multiplication by defining the elementary matrices as only ever acting on the left-hand side.

By taking the elementary row operations and phrasing each of these in terms of an equivalent elementary matrix, we are then able to use the algebraic properties of matrices, including the fact that matrix multiplication is associative. If nothing else, then by phrasing elementary row operations as elementary matrices, we afford ourselves the opportunity to perform row operations in two ways: by completing Gauss–Jordan elimination and as the product of elementary matrices. Although, undeniably, the first situation is more common, there is a substantial minority of cases where the latter approach is preferable. Before we begin demonstrating the versatility of this approach, we will remind ourselves of the elementary row operations.

### Definition: Elementary Row Operations

Consider a matrix of order $m \times n$ with rows labeled $r_1, r_2, \ldots, r_m$. Then, the three elementary row operations that we can perform are as follows:

- Switching of row $r_i$ with row $r_j$, denoted $r_i \leftrightarrow r_j$;
- Scaling of row $r_i$ by a nonzero constant $c$, denoted $r_i \to c\,r_i$;
- Adding a scaled version of row $r_j$ to row $r_i$, denoted $r_i \to r_i + c\,r_j$.

If an elementary row operation is used to transform the matrix $A$ into a new matrix $B$, then we say that these two matrices are “row equivalent.”

We will not recap the effects of the elementary row operations at this stage, as we will do so later in tandem with their corresponding elementary matrices. At this stage, we might begin to question how it could possibly be the case that elementary matrices can emulate the effect of the elementary row operations without being absurdly complicated or intractable. Instead, the opposite is true: the elementary matrices are extremely simple, each being obtained by applying its row operation directly to the identity matrix.

### Definition: The First Type of Elementary Row Operation and the Corresponding Elementary Matrix

Consider an $m \times n$ matrix $A$ and the first type of elementary row operation $r_i \leftrightarrow r_j$, giving the row-equivalent matrix $B$. Then, we can define the corresponding “elementary” matrix $P_{ij}$, which is essentially the identity matrix but with the $i$th and $j$th rows having been swapped. Then, the row-equivalent matrix can be written as the matrix multiplication $B = P_{ij}A$.

This matrix is actually a particular type of permutation matrix, and its effect is easily observed. Suppose that we had the matrix $A$ and felt that it would be advantageous to perform the operation $r_2 \leftrightarrow r_3$ in order to find the row-equivalent matrix $B$. This row operation involves swapping the second and third rows, which means that the corresponding elementary matrix $P_{23}$ will be the identity matrix after having swapped the second and third rows:

As a matter of convention, we multiply the elementary matrix on the left-hand side of $A$. It is important that we set this convention now, ahead of the third type of elementary row operation later in this explainer. Given that matrix multiplication is generally noncommutative, it will usually be the case that $MN \neq NM$ for any two matrices $M$ and $N$ with compatible orders. Therefore, we fix this convention at this stage and reassert that any elementary matrix will always multiply the matrix of interest on the left-hand side. To obtain the row-equivalent matrix $B$, we now perform the matrix multiplication $B = P_{23}A$:

As we can see, the result is exactly what we expected: the second and third rows of the matrix have been swapped.
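This behavior is easy to verify numerically. The following sketch (in Python with NumPy, purely for illustration; the helper name `swap_matrix` and the sample matrix are our own) builds the elementary matrix by swapping two rows of the identity and multiplies it on the left:

```python
import numpy as np

def swap_matrix(n, i, j):
    """Elementary matrix P_ij of order n: the identity with rows i and j
    swapped (0-indexed here, unlike the 1-indexed rows in the text)."""
    P = np.eye(n)
    P[[i, j]] = P[[j, i]]  # fancy indexing swaps the two rows in place
    return P

# A hypothetical 3x3 matrix; multiplying on the LEFT by P_23 swaps
# the second and third rows, mirroring the row operation r2 <-> r3.
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = swap_matrix(3, 1, 2) @ A
print(B)  # rows 2 and 3 of A exchanged
```

Note that multiplying on the right instead would swap columns, which is why the left-hand convention matters.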

The example above treated a square matrix. Recalling that the matrix product $AB$ is defined so long as $A$ has order $m \times n$ and $B$ has order $n \times p$, it is clear that we did not need to work with a square matrix in the previous example. In some ways, this is obvious: row operations are used as part of Gauss–Jordan elimination, and there is no reason why this method must apply only to systems of linear equations that produce a square coefficient matrix. In summary, there is no reason at all why the elementary matrices cannot apply to nonsquare matrices, as we will see in the following example.

### Example 1: Using the First Type of Elementary Matrix

Consider the matrix $A$.

- Write the elementary matrix $P_{13}$ corresponding to the row operation $r_1 \leftrightarrow r_3$.
- Derive the subsequent row-equivalent matrix $B = P_{13}A$.

### Answer

The matrices $A$ and $B$ are row equivalent and must therefore have the same order, meaning that the elementary matrix $P_{13}$ must be square, with as many rows as $A$, for the matrix multiplication $B = P_{13}A$ to be well defined. The matrix $P_{13}$ must switch the first and the third rows of $A$, which means that it must be equal to the identity matrix after having swapped the first and third rows. In other words,

We can check that this elementary matrix does indeed produce the desired effect by calculating $B = P_{13}A$:

As we can see, the matrix $B$ is the same as the matrix $A$, only after having the first and third rows switched.

At this stage, after defining the elementary matrices $P_{ij}$, it would normally occur to a mathematician to ask how we might calculate the inverse matrix. There is a risk that this could quickly become an overly computational exercise, wherein we attempt to derive $P_{ij}^{-1}$ using one of the standard methods. The necessary rationale is actually much less painful than this. Suppose that we performed the row-swap operation $r_i \leftrightarrow r_j$ and wanted to undo this change; the obvious thing to do would be to perform the same row-swap operation again, returning the matrix to its original state. In short, this means that the inverse of $P_{ij}$ is the matrix $P_{ij}$ itself. We formalize this understanding in the following theorem.

### Theorem: Inverse of the Elementary Matrix 𝑃_{𝑖𝑗}

The inverse of the matrix $P_{ij}$ is given by the formula $P_{ij}^{-1} = P_{ij}$.

Few matrices are equal to their own inverse, with these being referred to as “involutory” matrices (which are elusive sightings in the wilderness of linear algebra). The result about their inverses is nearly trivial when considering the effect of their corresponding elementary row operations. Purely for the sake of demonstration, we will give one example that requires the calculation and use of the inverse of the first type of elementary matrix.
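A quick numerical check of the involutory property, using NumPy purely for illustration (the order 4 and the choice of rows are arbitrary):

```python
import numpy as np

# The identity of order 4 with the second and fourth rows swapped
# (0-indexed rows 1 and 3): a sketch of the matrix P_24.
P = np.eye(4)
P[[1, 3]] = P[[3, 1]]

# Applying the same swap twice restores the identity, so P is involutory
# and therefore equal to its own inverse.
assert np.array_equal(P @ P, np.eye(4))
assert np.allclose(np.linalg.inv(P), P)
```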

### Example 2: Using the First Type of Elementary Matrix and the Corresponding Inverse Matrix

Consider the matrix $A$.

- Write the elementary matrix $P_{24}$ corresponding to the row operation $r_2 \leftrightarrow r_4$.
- Derive the subsequent row-equivalent matrix $B = P_{24}A$.
- Is it true that multiplying $B$ by the inverse elementary matrix on the left side will return the original matrix $A$?

### Answer

The elementary row-swap operation $r_2 \leftrightarrow r_4$ means that we require the elementary matrix $P_{24}$ that swaps the second and fourth rows of any compatible matrix. The appropriate elementary matrix is equivalent to the identity matrix, only with the second and fourth rows having been swapped:

Multiplying the matrix $A$ on the left by this matrix, we obtain the row-equivalent matrix $B = P_{24}A$:

The effect has been exactly what we desired: the second and fourth rows of the matrix have been swapped.

We should find that multiplying $B$ on the left by the inverse matrix $P_{24}^{-1}$ will return the original matrix $A$. Since the first type of elementary matrix is equal to its own inverse, we have $P_{24}^{-1} = P_{24}$ and hence $A = P_{24}B$:

So, yes, it is true that multiplying by the inverse elementary matrix on the left side will return the original matrix .

This exceptionally helpful property of the first type of elementary matrix is useful when attempting to calculate the PLU decomposition of a matrix, as well as many other scenarios. Although the inverses of the second and third types of elementary matrices are slightly more complicated, their similarity to the identity matrix means that their inverse is relatively simple (at least when compared to the average level of complexity that is involved in calculating the inverse of a square matrix).

We will now move on and consider the second type of elementary row operation: the row-scaling operation. Although this type of operation is not quite as simple as the first type of elementary matrices that defined the row-swap operations, the elementary matrix of the second type of row operation has the extremely helpful feature that it is a diagonal matrix, which is among the easiest types of matrices to work with in an algebraic sense. Although the inverses of these matrices will not be involutory like the first type of elementary matrix, they are still diagonal matrices, which means that the inverse is nearly trivial to calculate.

### Definition: The Second Type of Elementary Row Operation and the Corresponding Elementary Matrix

Consider a matrix and the second type of elementary row operation , where , giving the row-equivalent matrix . Then, we can define the corresponding “elementary” matrix which is essentially the identity matrix but with the entry in the th row and th column being equal to instead of 1. Then, the row-equivalent matrix can be written as the matrix multiplication

In some ways, the second type of elementary matrix is a little more complicated than the first type, and in other ways it is simpler. All we have done is take the $i$th row of the identity matrix and multiply every entry by the constant $c$. Given that the only nonzero entry in this row appears in the $i$th position and has a value of 1, only the value of this entry is altered. We will demonstrate this by example. Suppose that we take the matrix $A$ and want to perform the row operation $r_2 \to 3r_2$ to achieve the row-equivalent matrix $B$. This row-scaling operation is of the second type, which requires the equivalent elementary matrix $D_2(3)$: the same as the identity matrix, except that the entry in the second row and second column is equal to 3 instead of 1:

We then multiply the original matrix $A$ on the left side by the elementary matrix $D_2(3)$. This has exactly the effect that we were aiming for:

The row-equivalent matrix $B$ is the same as the original matrix $A$, only with every entry in the second row multiplied by 3.
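The scaling matrix can likewise be sketched numerically. Here `scale_matrix` is our own illustrative helper name, and the sample matrix is hypothetical; the example mirrors the operation of multiplying the second row by 3:

```python
import numpy as np

def scale_matrix(n, i, c):
    """Elementary matrix D_i(c): the identity of order n with the (i, i)
    entry set to c (0-indexed); on the left it scales row i by c."""
    D = np.eye(n)
    D[i, i] = c
    return D

# Mirroring r2 -> 3*r2 on a hypothetical 2x2 matrix.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = scale_matrix(2, 1, 3) @ A  # the second row becomes [9, 12]
```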

### Example 3: Using the Second Type of Elementary Matrix

Consider the matrix $A$.

- Write the elementary matrix $D_3\left(\frac{1}{2}\right)$ corresponding to the row operation $r_3 \to \frac{1}{2}r_3$.
- Derive the subsequent row-equivalent matrix $B = D_3\left(\frac{1}{2}\right)A$.

### Answer

Every entry in the third row of $A$ is divisible by 2, meaning that the row operation $r_3 \to \frac{1}{2}r_3$ should only produce integer entries in the row-equivalent matrix $B$. To obtain the corresponding elementary matrix $D_3\left(\frac{1}{2}\right)$, we take the identity matrix and change the entry in the third row and third column to have the value $\frac{1}{2}$. This gives the matrix

By multiplying the matrix $A$ on the left side by the elementary matrix $D_3\left(\frac{1}{2}\right)$, we find

The resulting matrix $B$ is the same as the original matrix $A$ but with every entry in the third row divided by 2, which exactly mimics the row operation $r_3 \to \frac{1}{2}r_3$.

As with the first type of elementary matrix, we will be interested in finding the inverse of the second type of elementary matrix. The rationale for the structure of the inverse matrix is as follows: if we use the second type of row operation to scale every entry in row $r_i$ by a nonzero constant $c$, then this can be undone by scaling row $r_i$ again by a factor of $\frac{1}{c}$, returning every entry in this row to its original value. Therefore, if the original elementary matrix is $D_i(c)$, then the inverse matrix is $D_i\left(\frac{1}{c}\right)$. This is summarized in the following theorem.

### Theorem: Inverse of the Elementary Matrix 𝐷_{𝑖}(𝑐)

The inverse of the matrix $D_i(c)$, where $c \neq 0$, is given by the formula $D_i(c)^{-1} = D_i\left(\frac{1}{c}\right)$.

Although this result is not quite as convenient as the analogous result for the first type of elementary matrix (which was equal to its own inverse), it is still very useful. Diagonal matrices are one of the easiest types of matrices to work with, commuting with each other and having simplified algebraic properties with regard to multiplication and exponentiation as well as inversion. Calculating the inverse of a diagonal matrix is essentially a trivial task, involving none of the complexities that can arise when working with a nondiagonal square matrix, hence the brevity of the above theorem.
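As a quick illustration, the inverse of a diagonal scaling matrix is obtained by inverting the one altered diagonal entry (a NumPy sketch; the value of $c$ here is arbitrary):

```python
import numpy as np

# D_2(c) for a 3-row matrix, and its inverse D_2(1/c): since both are
# diagonal, their product just multiplies corresponding diagonal entries.
c = 3.0
D = np.diag([1.0, c, 1.0])
D_inv = np.diag([1.0, 1.0 / c, 1.0])

assert np.allclose(D @ D_inv, np.eye(3))  # scaling then unscaling row 2
```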

### Example 4: Using the Second Type of Elementary Matrix and the Corresponding Inverse Matrix

Consider the matrix $A$.

- Write the elementary matrix $D_2(c)$ corresponding to the row operation $r_2 \to c\,r_2$, where $c$ is a nonzero constant.
- Derive the subsequent row-equivalent matrix $B = D_2(c)A$.
- Is it true that multiplying $B$ by the inverse elementary matrix on the left side will return the original matrix $A$?

### Answer

We would like to take the matrix $A$ above and implement the elementary row operation $r_2 \to c\,r_2$. This is a row operation of the second type and can be equivalently expressed by the elementary matrix $D_2(c)$, which can be thought of as a copy of the identity matrix with the entry in the second row and second column having been replaced by $c$. By multiplying this matrix on the left side of the original $A$, we find the row-equivalent matrix

The effect is precisely what we were expecting: the matrix $B$ is the same as the original matrix $A$ except that every entry in the second row has been multiplied by $c$. To return to the original matrix $A$, we could take $B$ and then use the inverse row operation $r_2 \to \frac{1}{c}\,r_2$. The corresponding elementary matrix is $D_2\left(\frac{1}{c}\right)$:

We can check that this is true by multiplying $B$ by $D_2\left(\frac{1}{c}\right)$ on the left-hand side:

So, yes, it is true that multiplying by the inverse elementary matrix on the left side will return the original matrix .

In practice, the third type of elementary row operation is the one that is used most frequently, and it is rightly seen as being more complicated than the other two types of row operation. The first type of row operation simply involves switching two rows, whereas the second type of row operation will scale only one row at a time. In contrast, the third type of elementary row operation involves two different rows of the matrix under the combined operation of both scaling and addition, which frequently causes errors to occur due to the more sophisticated nature of the operation. Perhaps surprisingly, the third type of elementary row operation has a corresponding elementary matrix that is still very similar to the identity matrix (much like the elementary matrices for the first and second types of row operations). The third type of elementary matrix will not be equal to its own inverse (like the first type of elementary matrix) and neither will it be diagonal (as in the second type of elementary matrix). We must therefore be mindful that the third type of elementary matrix can only act on the left-hand side and will not have an inverse that is equal to itself.

### Definition: The Third Type of Elementary Row Operation and the Corresponding Elementary Matrix

Consider an $m \times n$ matrix $A$ and the third type of elementary row operation $r_i \to r_i + c\,r_j$, giving the row-equivalent matrix $B$. Then, we can define the corresponding “elementary” matrix $E_{ij}(c)$, which is essentially the identity matrix but with the entry in the $i$th row and $j$th column being equal to $c$ instead of 0. Then, the row-equivalent matrix can be written as the matrix multiplication $B = E_{ij}(c)A$.

As ever, the quickest way to demonstrate the efficacy of this technique is with a suitable example. Suppose that we had the matrix $A$ and we wished to implement the row operation $r_3 \to r_3 - 2r_1$ to obtain the row-equivalent matrix $B$. Then, using the definition above, the corresponding elementary matrix $E_{31}(-2)$ must be a copy of the identity matrix, except that the entry in the third row and first column must be equal to $-2$. The correct elementary matrix is therefore

Multiplying $A$ by $E_{31}(-2)$ on the left-hand side should return the same matrix, only with twice the first row subtracted from the third row. Therefore, the first and second rows should be unchanged, as we find when we complete the multiplication:

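A numerical sketch of the third type of elementary matrix (the helper name `add_matrix` and the sample matrix are our own illustrations, mirroring $r_3 \to r_3 - 2r_1$):

```python
import numpy as np

def add_matrix(n, i, j, c):
    """Elementary matrix E_ij(c): the identity of order n with the entry c
    in row i, column j (i != j, 0-indexed); multiplying on the left
    performs the row operation r_i -> r_i + c * r_j."""
    E = np.eye(n)
    E[i, j] = c
    return E

# Mirroring r3 -> r3 - 2*r1 on a hypothetical 3x3 matrix.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [2.0, 1.0, 0.0]])
B = add_matrix(3, 2, 0, -2.0) @ A  # third row minus twice the first
```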
With the third and final type of elementary matrix having been classified and demonstrated, we will soon be in a position to begin combining the various types of elementary matrix together in sequence, much like we would do when applying a series of row operations in sequence (e.g., when completing Gauss–Jordan elimination or manipulating a square matrix into a specific form). Before doing so, we will briefly practice applying the third type of elementary matrix in isolation, and then we will define the corresponding inverse matrix.

### Example 5: Using the Third Type of Elementary Matrix

Consider the matrix $A$.

- Write the elementary matrix $E_{21}(-3)$ corresponding to the row operation $r_2 \to r_2 - 3r_1$.
- Derive the subsequent row-equivalent matrix $B = E_{21}(-3)A$.

### Answer

The row operation $r_2 \to r_2 - 3r_1$ is of the third type and involves taking every entry in the second row and subtracting three times the entry in the same column of the first row. The correct elementary matrix is therefore a copy of the identity matrix, only with the entry $-3$ appearing in the second row and first column:

Multiplying the original matrix $A$ on the left by this matrix gives

The row-equivalent matrix $B$ is exactly the same as if we had taken the original matrix $A$ and performed the row operation $r_2 \to r_2 - 3r_1$.

We now turn our attention to the inverse of the third type of elementary matrix. If we had taken a general matrix $A$ and performed the row operation $r_i \to r_i + c\,r_j$, then the quickest way to undo this is to apply the inverse row operation $r_i \to r_i - c\,r_j$, which will recover the original matrix $A$. Therefore, the inverse of the elementary matrix $E_{ij}(c)$ is also of the third type: $E_{ij}(c)^{-1} = E_{ij}(-c)$. This is summarized in the following theorem:

### Theorem: Inverse of the Elementary Matrix 𝐸_{𝑖𝑗}(𝑐)

The inverse of the matrix $E_{ij}(c)$ is given by the formula $E_{ij}(c)^{-1} = E_{ij}(-c)$.
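Again, a quick numerical check (NumPy, with an arbitrary value of $c$ and order 3, purely for illustration):

```python
import numpy as np

# E_12(c) and its claimed inverse E_12(-c): adding c*r2 to r1 is undone
# by subtracting it again, so the product should be the identity.
c = 0.5
E = np.eye(3)
E[0, 1] = c        # E_12(c)
E_inv = np.eye(3)
E_inv[0, 1] = -c   # E_12(-c)

assert np.array_equal(E @ E_inv, np.eye(3))
assert np.array_equal(E_inv @ E, np.eye(3))
```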

### Example 6: Using the Third Type of Elementary Matrix and the Corresponding Inverse Matrix

Consider the matrix $A$.

- Write the elementary matrix $E_{12}\left(\frac{1}{2}\right)$ corresponding to the row operation $r_1 \to r_1 + \frac{1}{2}r_2$.
- Derive the subsequent row-equivalent matrix $B = E_{12}\left(\frac{1}{2}\right)A$.
- Is it true that multiplying $B$ by the inverse elementary matrix on the left side will return the original matrix $A$?

### Answer

The row operation $r_1 \to r_1 + \frac{1}{2}r_2$ is of the third type. The effect will be that we take every entry in the first row and to this we add half the value of the entry in the same column of the second row. The corresponding elementary matrix $E_{12}\left(\frac{1}{2}\right)$ will thus be a copy of the identity matrix, only with the value $\frac{1}{2}$ in the first row and second column:

By multiplying the original matrix $A$ on the left-hand side by this matrix, we obtain the row-equivalent matrix $B$, which gives the desired result. The inverse row operation is $r_1 \to r_1 - \frac{1}{2}r_2$, and if this is applied to the row-equivalent matrix $B$, then we will simply return the original matrix $A$. The elementary matrix for this row operation is $E_{12}\left(-\frac{1}{2}\right)$:

When multiplying $B$ on the left-hand side by $E_{12}\left(-\frac{1}{2}\right)$, we should find the original matrix $A$. We can check this as follows:

So, yes, it is true that multiplying by the inverse elementary matrix on the left side will return the original matrix .

When working with elementary row operations, it will seldom be the case that we only need a single such operation to solve whatever problem we are working with. For processes such as Gauss–Jordan elimination or LU decomposition, there are usually many elementary row operations that we must apply in sequence before the overall outcome can be achieved. All of our work above would therefore be pretty pointless if the elementary matrices were not accommodating of this frequent requirement!

It is actually quite straightforward to verify that we can chain together elementary matrices in the same way that we can with elementary row operations. Given that we have checked that each of the three types of elementary matrix has the correct effect, we only need to check that it is possible to combine these matrices in sequence. At this point, it is helpful to remember that, when applied to a general matrix $A$, any of the elementary matrices must return a row-equivalent matrix $B$ of the same order as the original matrix $A$. Suppose that we have a matrix $A$ with order $m \times n$ and an elementary matrix $E$ with order $p \times q$ that is multiplied on the left. In order for the matrix multiplication $EA$ to be well defined, there must be the same number of columns in $E$ as there are rows in $A$. Therefore, we must have $q = m$, meaning that $E$ has order $p \times m$ and $A$ has order $m \times n$, so the matrix $EA$ has order $p \times n$. However, we require that $EA$ have the same order as $A$, which means that $p = m$ and therefore the order of the elementary matrix is necessarily $m \times m$.

Since any elementary matrix must have order $m \times m$, we know that we can multiply two of these matrices together, with the output being another matrix of order $m \times m$. This logic extends to the product of many such elementary matrices, all having the same order, and we therefore conclude that we can chain together any number of these matrices without changing the order of the output row-equivalent matrix $B$. Please note that this is not to say that we can combine these matrices in any order, given that matrix multiplication is not commutative and therefore, generally, $E_1E_2 \neq E_2E_1$ for two elementary matrices $E_1$ and $E_2$. This makes perfect sense, as we cannot generally apply a series of row operations in an arbitrary order and expect to get the same result.

After having given many examples of how to use each of the three types of elementary matrix, it is easy enough to apply these in series, thus encoding a series of row operations applied to a matrix. That being said, it is best to standardize the way in which we approach this in order to avoid confusion. We will consider the matrix $A$.

Suppose that we then chose to apply three row operations in order. We will not show the calculations here, but the outcome is the row-equivalent matrix $B$:

The first row operation, $r_1 \leftrightarrow r_2$, is of the first type, and the elementary matrix $P_{12}$ is identical to the identity matrix, but with the first and second rows swapped:

We can then multiply the matrix $A$ by this elementary matrix on the left-hand side, giving the row-equivalent matrix $A_1 = P_{12}A$:

For the second row operation, we have the matrix

Rather than apply this to $A$, we will instead apply this to $A_1$, calling the result $A_2$ to avoid confusion. We find

The final row operation has the elementary matrix

We can derive the new matrix $B$ by multiplying this matrix by $A_2$:

This returns the row-equivalent matrix $B$ that we gave in equation (1). In the above working, we chose to demonstrate each step, and we actually created more work for ourselves than we needed to. Instead of completing every single step, we can derive the answer in a way that is more succinct, without losing any information about which row operations we performed.

Recall that, in the example above, we took the original matrix $A$ and multiplied it on the left by the first elementary matrix, $E_1 = P_{12}$, to obtain $A_1 = E_1A$.

We then took the resulting matrix $A_1$ and multiplied it on the left-hand side by the second elementary matrix, $E_2$, using the equation immediately above to give $A_2 = E_2A_1 = E_2(E_1A)$.

The final elementary matrix, $E_3$, was multiplied by $A_2$ to give the matrix $B$. Using the equation above, we would find $B = E_3A_2 = E_3(E_2(E_1A))$.

This expression for $B$ may look as though it has not provided any visual improvement, and that is because, at this stage, it has not. For this improvement to be realized, we recall that matrix multiplication is associative, meaning that we can alternatively write $B = (E_3E_2E_1)A$.

As the equation above indicates, we can combine all of the elementary matrices into a single matrix before multiplying by the original matrix $A$. This can be completed by multiplying the matrices together, this time starting from the right-hand side (although this choice is arbitrary). We collect all of the matrices together as a new matrix $E$, defined as $E = E_3E_2E_1$.

Using equation (2), we can then write the entire series of row operations as the highly simplified equation $B = EA$, which can be checked manually if needed. Such concise, algebraic expressions are sought after in many areas of linear algebra, as they can then be thought of in a more abstract sense, allowing the full power of linear algebra to be utilized.
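The equality of the step-by-step product and the combined matrix can be checked numerically. The three row operations below are an arbitrary illustration (not those of the example above): $r_1 \leftrightarrow r_2$, then $r_1 \to 3r_1$, then $r_2 \to r_2 - r_1$.

```python
import numpy as np

# Three elementary matrices of order 2, one of each type:
E1 = np.array([[0.0, 1.0], [1.0, 0.0]])   # P_12: swap rows 1 and 2
E2 = np.array([[3.0, 0.0], [0.0, 1.0]])   # D_1(3): scale row 1 by 3
E3 = np.array([[1.0, 0.0], [-1.0, 1.0]])  # E_21(-1): r2 -> r2 - r1

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Applying the operations one at a time...
step_by_step = E3 @ (E2 @ (E1 @ A))
# ...gives the same result as forming the single combined matrix first,
# by associativity of matrix multiplication.
E = E3 @ E2 @ E1
assert np.array_equal(E @ A, step_by_step)
```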

### Example 7: Writing a Series of Elementary Row Operations as a Single Matrix

Consider the matrix $A$ and the following five row operations, performed in the order given, to produce the row-equivalent matrix $B$.

- Write a single matrix $E$ corresponding to the combined row operations taking $A$ into $B$.
- Use $E$ to calculate $B$.

### Answer

The quickest way to achieve this is to consider each of the elementary row operations in turn and write down the corresponding elementary matrices. For the first three row operations, the elementary matrices are

The final two row operations have the elementary matrices

We can write the row-equivalent matrix $B$ as a product of these elementary matrices applied to $A$, where we have used the associative property of matrix multiplication to group the product of the elementary matrices into the single matrix $E$, which we must now find. The most error-proof way of doing this is to take the product of the two rightmost terms and then move leftward to complete the calculation. To reduce the visual clutter that would be incurred by writing out all of the elementary matrices at once, we do not write each elementary matrix out in full until it is required for the next stage of the calculation. We begin with the two rightmost elementary matrices (which correspond to the first two row operations that were applied, in order):

The rightmost matrix is no longer an elementary matrix of any type, but it still has a reasonably simple form. We continue the process, now absorbing the third row operation performed into the calculation:

There are now only two steps remaining to complete the calculation of . For the fourth row operation, we perform the next stage of the method:

For the final step, we bring in the final row operation, giving the answer

Now that we have calculated the matrix $E$, we can use equation (3) to give $B = EA$. We find that

This is exactly the matrix that we would have found if we had performed the given elementary row operations in order.

There are many possible applications of viewing a series of row operations through the lens of elementary matrices. A common application is found when working with a system of linear equations. Suppose that we began with the square coefficient matrix $A$ and applied a series of elementary row operations to achieve the reduced echelon form $R$. Then, by replicating the previous method, we could form the corresponding elementary matrices and combine them into an overall matrix $E$, such that the reduced echelon matrix could be written as $R = EA$.

Even though the matrix $E$ will generally not be an elementary matrix of any type, it is constructed as the product of elementary matrices, each of which is invertible. This means that $E$ itself must be invertible. Additionally, it is a known result from linear algebra that if a square matrix $A$ of order $n \times n$ is invertible, then its reduced echelon form will be equal to the identity matrix $I_n$. If this is the case, then equation (4) simplifies to $EA = I_n$.

The beauty of this technique is now revealed in full: the matrix $E$ is actually the inverse of $A$, since by definition the inverse matrix $A^{-1}$ is the matrix that satisfies the equation $A^{-1}A = AA^{-1} = I_n$.

Furthermore, the multiplicative inverse of a matrix is unique. In having found the matrix $E$, we have therefore found the inverse $A^{-1} = E$ as a product of elementary matrices.
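This observation suggests a procedure: row reduce $A$ to the identity while applying the same elementary row operations to a copy of the identity, accumulating their product. The sketch below is our own illustrative implementation of this idea (assuming $A$ is square and invertible, with no safeguards for singular input); the accumulated product is then $A^{-1}$:

```python
import numpy as np

def inverse_by_elementary(A):
    """Reduce A to the identity with elementary row operations, applying
    each operation to a running copy of the identity. The accumulated
    product E of the elementary matrices satisfies E @ A = I, so E = A^-1.
    Illustrative sketch only: assumes A is square and invertible."""
    n = A.shape[0]
    A = A.astype(float).copy()
    E = np.eye(n)  # running product of the elementary matrices used
    for col in range(n):
        # Type 1: swap in a nonzero pivot if needed.
        pivot = next(r for r in range(col, n) if abs(A[r, col]) > 1e-12)
        if pivot != col:
            A[[col, pivot]] = A[[pivot, col]]
            E[[col, pivot]] = E[[pivot, col]]
        # Type 2: scale the pivot row so the pivot entry becomes 1.
        c = A[col, col]
        A[col] /= c
        E[col] /= c
        # Type 3: eliminate every other entry in this column.
        for r in range(n):
            if r != col and A[r, col] != 0.0:
                factor = A[r, col]
                A[r] -= factor * A[col]
                E[r] -= factor * E[col]
    return E

M = np.array([[2.0, 1.0],
              [1.0, 1.0]])
Minv = inverse_by_elementary(M)
```

Checking `Minv @ M` against the identity confirms that the product of the elementary matrices is indeed the inverse.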

### Key Points

- There are three types of elementary row operations, and each of these can be written in terms of a square matrix that is obtained by applying the same row operation to the identity matrix of the appropriate order.
- By definition, each of the elementary matrices is assumed to act on the left-hand side.
- Every elementary matrix is invertible, with the inverse matrix being straightforward to derive and express.
- Matrix multiplication is associative, which means that chains of elementary matrices can be multiplied together to represent a sequence of row operations.