In Section 1.1 we sketched a procedure for solving a linear system \(L\text{.}\) That procedure involved applying a sequence of row operations to \(L\) to obtain a “simpler” system \(L'\text{.}\)
We will fill out this sketch in the next two sections. Specifically, we will
describe precisely what we mean by a “simpler” system,
provide an algorithm (or recipe) that decides exactly what sequence of row operations to apply to obtain this simpler system,
explain how to find all solutions of the resulting simpler system.
Subsection1.2.1Row echelon matrices
Our first step in this direction will be to introduce a notational convenience. As you may have noticed, when performing row operations on a system of equations, we essentially treat the unknowns, as well as the plus and equals symbols, as placeholders; the only things that actually change in a given step are the coefficients in the equations. The augmented matrix associated to a linear system is a formal way of extracting just the information of the coefficients from a linear system.
Definition1.2.1.Augmented matrix.
Let \(L\) be the linear system
\begin{equation*}
\eqsys\text{.}
\end{equation*}
The augmented matrix associated to \(L\) is the matrix
As defined more thoroughly in Definition 2.1.2, a matrix is just a rectangular array of numbers.
Here is our precise formulation of a “simple” linear system; it is a system whose associated augmented matrix is in row echelon form, as described below.
Definition1.2.3.Row echelon form.
A zero row of a matrix, is a row whose entries are all equal to zero; a nonzero row is a row that contains at least one nonzero entry.
A matrix is in row echelon form if the following conditions hold.
(i)
In any nonzero row the first (i.e., leftmost) nonzero entry is equal to one. A leading one of a matrix is such an entry: i.e., an entry of a row that is equal to one, and that is also the first nonzero entry of that row.
(ii)
All zero rows are grouped together at the bottom of the matrix.
(iii)
Given any two nonzero rows in the matrix, the leading one of the lower row occurs strictly to the right of the leading one of the row above it.
A matrix is in reduced row echelon form if in addition to conditions (i)-(iii) it also satisfies the following condition.
(iv)
Any column of the matrix that contains a leading one has zeros elsewhere. In other words, if a column contains a leading one, then that is the only nonzero entry of that column.
A linear system \(L\) is in row echelon form (resp. reduced row echelon form) if its augmented matrix is in row echelon form (resp. reduced row echelon form).
In practice to decide whether a matrix is in in (reduced) row echelon form, follow these steps:
First verify whether all zero rows are at the bottom.
For each nonzero row, determine whether the first nonzero entry is a 1, and put a box around it.
Make sure your boxes make a staircase pattern.
(For reduced row echelon form only.) Decide whether each column with a box has 0’s everywhere else.
Example1.2.4.Row echelon versus reduced row echelon form.
For each matrix decide (a) whether it is in row echelon form, and (b) whether it is in reduced row echelon form.
Below you find the matrix with leading ones boxed. This matrix fails to be in row echelon form (and hence also reduced row echelon form) for a variety of reasons: the zero rows are not all grouped at the bottom; the first row is nonzero, but does not have a leading one; the leading one of the fourth row is to the left of the leading one of the leading one in the row above it.
Below you find the matrix with leading ones boxed. This matrix is in row echelon form: the zero rows (rows 4 and 5) are grouped at the bottom; each nonzero row has a leading one (boxed in the matrix below); the leading ones step strictly to the right as we move down the rows.
The matrix is not in reduced row echelon form, as the last column contains a leading one in its third row, and a nonzero entry in its first row.
Subsection1.2.2Gaussian elimination
We will now describe a systematic procedure, called Gaussian elimination, that allows us to reduce a given linear system \(L\) to a system \(L'\)in row echelon form. In keeping with the foregoing discussion, we will identify a system \(L\) with its augmented matrix \(A\text{.}\) Furthermore, reducing a linear system using elementary operations on equations is now cast as performing elementary row operations on matrices. At the risk of redundancy we now officially translate a number of our former notions regarding reduction of linear systems to the setting of matrices.
Definition1.2.5.Elementary row operations on matrices.
An elementary row operation is one of the three following types of matrix operations. Let \(A\) be a given \(m\times n\) matrix, and denote by \(r_i\) the \(i\)-th row of \(A\text{.}\)
Scalar multplication
Multiply a row by a nonzero number \(c\ne 0\text{:}\) i.e., replace \(r_i\) with \(c\,r_i\text{,}\) the result of multiplying all entries of the row by \(c\text{.}\)
Row swap
Swap two rows of \(A\text{.}\)
Row addition
Add a multiple of one row to another: i.e., replace \(r_i\) with \(r_i+cr_j\) for some \(c\text{,}\)\(i\text{,}\) and \(j\text{.}\)
The act of transforming a matrix using elementary row operations is called row reduction.
Two matrices are row equivalent if the one can be obtained from the other by performing a finite sequence of elementary row operations.
Remark1.2.6.Notation.
Our former elementary operation notation easily transfers to row operations on matrices. The expressions
\begin{align*}
A\amp \xrightarrow{c\,r_i} B \amp A\amp \xrightarrow{r_i\leftrightarrow r_j} B\amp A\amp\xrightarrow{r_i+c\,r_j} B
\end{align*}
denote the operations that replace row \(r_i\) in \(A\) with \(c\, r_i\text{,}\) swap rows \(r_i\) and \(r_j\) in \(A\text{,}\) and replace \(r_i\) in \(A\) with \(r_i+c\, r_j\text{,}\) respectively.
At last we are ready to define Gaussian elimination. In its essence this is simply a procedure, or algorithm, that takes an input matrix \(A\) and row reduces it to a matrix \(B\) in row echelon form. It is important to note that although we employ Gaussian elimination in this chapter primarily to the end of simplifying and solving linear systems, this is not its only application. Indeed, we will come back to the procedure again and again, in a great variety of contexts and to greatly diverse ends. Gaussian elimination is one of linear algebra’s most important tools. We memorialize this here as an official principle.
Mantra1.2.7.Gaussian elimination mantra.
Gaussian elimination is the workhorse of linear algebra.
Definition1.2.8.Gaussian elimination.
Gaussian elimination is the algorithm described below. It takes as an input a matrix \(A\) and returns as an output a row equivalent matrix \(B\) in row echelon form.
Step 1
Find the leftmost nonzero column and perform a row swap to move the row with this nonzero entry to the top of the matrix.
Step 2
Scale the new top row to produce a leading one in the row. Call this new row \(r\text{.}\)
Step 3
For each row \(r_i\) below \(r\) perform a row operation of the form \(r_i+c\,r\) to replace all entries below the leading one of \(r\) with zeros.
Step 4
Begin again with Step 1 applied to the matrix consisting of all rows below \(r\text{.}\) Continue until the matrix is in row echelon form.
SubsectionModel example
Use the following example as a model of how to both perform and annotate the steps in Gaussian elimination. When first learning this procedure, make sure to follow it to the letter. In particular, in the spirit of the mandate issued in Remark 1.1.16, you should only perform one row operation at a time, and only in the sequence prescribed in Steps 1-4 of Definition 1.2.8.
Example1.2.9.Row echelon form.
We use Gaussian elimination to reduce the linear system
The matrix outputted by Gaussian elimination is guaranteed to be in row echelon form, but it may not be in reduced row echelon form. Gauss-Jordan elimination describes a systematic way to continue reducing to this even simpler state.
Definition1.2.10.Gauss-Jordan elimination.
Gauss-Jordan elimination is the algorithm described below. It takes as an input a matrix \(A\) and returns a row equivalent matrix \(B\) in reduced row echelon form.
Steps 1-4
Apply Gaussian elimination to transform \(A\) to a matrix in row echelon form.
Step 5
Find the rightmost column of the matrix containing a leading one. Let \(r_i\) be the row containing this leading one. For each row \(r_j\) above \(r_i\) perform a row operation of the form \(r_i+c\,r_j\) to replace all entries above the leading one with zeros.
Step 6
Begin again with Step 5 applied to the next column to the left that contains a leading one. Continue until the matrix is in reduced row echelon form.
Example1.2.11.Reduced row echelon form.
We continue our work in Example 1.2.9 to reach a matrix in reduced row echelon form.
Any matrix \(A\) is row equivalent to a matrix in row echelon form. Indeed, Gaussian elimination row reduces any matrix to a matrix in row echelon form.
Reduced row echelon forms exist.
Any matrix \(A\) is row equivalent to a matrix in reduced row echelon form. Indeed, Gauss-Jordan elimination row reduces any matrix to a matrix in reduced row echelon form.
Reduced row echelon forms are unique.
Any matrix \(A\) is row equivalent to a unique matrix in reduced row echelon form.
We will make heavy use of the first two results of Theorem 1.2.12. The proofs of these statements are not difficult, but not especially illuminating. Accordingly we omit them here, and point the interested reader to Robert Beezer’s A First Course in Linear Algebra 1 . (See Theorem REMEF 2 .)
The third statement of Theorem 1.2.12, that every matrix is row equivalent to a unique matrix in reduced row echelon form, does in fact have an enlightening proof. We will postpone this proof, however, until we have a little more theory at our disposal. (See Corollary 2.4.18.) Until then we will conscientiously not make use of this fact when developing any of our further theory.
Example1.2.13.Row echelon form is not unique.
Show that a matrix \(A\) may be row equivalent to two or more matrices in row echelon form.
Take \(A=\begin{bmatrix} 1 \amp 1 \\
1 \amp 2 \end{bmatrix}\text{.}\) This row reduces to \(B=\begin{bmatrix} 1 \amp 1 \\
0 \amp 1 \end{bmatrix}\) using Gaussian elimination; and it row reduces further to \(C=\begin{bmatrix}1\amp 0\\ 0\amp 1\end{bmatrix}\) using Gauss-Jordan elimination. Thus we see that \(A\) is row equivalent to two different matrices in row echelon form. (According to Theorem 1.2.12, the matrix \(C\) is the only matrix in reduced row echelon that is row equivalent to \(A\text{.}\))
In the first Sage cell below you find a recursive implementation of Gaussian elimination in Sage that includes explanatory comments. Evaluate this cell in order to load the row_echelon_form function. The second cell allows you to apply the Gaussian elimination algorithm to a matrix of your choosing. As you can see, the show function provides a nice LaTeX version of the output.
Sage has its own row reduction method, rref, which transforms a matrix to reduced row echelon form. Let’s compare the outputs of these two algorithms.
The following activities could be useful for implementing Gaussian elimination in a manner that shows all intervening steps. Use the empty Sage cell below to experiment.
Modify the row_echelon_form code to make a non-recursive algorithm.
Add show commands to your non-recursive version of row_echelon_form to show steps in the row reduction.
On the augmented matrix \(A\) below , perform all three row operations in the order given, ((a) followed by (b) followed by (c)) and then write the resulting augmented matrix.
For each of the given linear systems, find an equivalent system in row echelon form. Use augmented matrices and follow the Gaussian elimination algorithm to the letter.