
Euclidean vector spaces

Section 3.1 Matrix arithmetic

Matrices played a small supporting role in our discussion of linear systems in Chapter 2. In this chapter we bring them to center stage and give them a full-blown treatment as independent mathematical objects in their own right.
Like any mathematical entity worth its salt, matrices can be employed in a vast multitude of ways. As such it is important to allow matrices to transcend their humble beginnings in this course as boiled down systems of linear equations. We record this observation as another principle.

Subsection 3.1.1 The basics

Definition 3.1.2. Matrix.

A (real) matrix is a rectangular array of real numbers
\begin{equation} A=\genmatrix\text{.}\tag{3.1} \end{equation}
The number \(a_{ij}\) located in the \(i\)-th row and \(j\)-th column of \(A\) is called the \((i,j)\)-entry (or \(ij\)-th entry) of \(A\text{.}\)
A matrix with \(m\) rows and \(n\) columns is said to have size (or dimension) \(m\times n\text{.}\) The set of all \(m\times n\) matrices is denoted \(M_{mn}\text{.}\)
The displayed matrix in (3.1) is costly both in the space it takes up on the page and in the time it takes to write down or typeset. Accordingly we introduce two somewhat complementary forms of notation to help describe matrices.

Definition 3.1.3. Matrix notation.

Matrix-building notation
The notation \([a_{ij}]_{m\times n}\) denotes the \(m\times n\) matrix whose \(ij\)-th entry (\(i\)-th row, \(j\)-th column) is \(a_{ij}\text{.}\) When there is no danger of confusion, this notation is often shortened to \([a_{ij}]\text{.}\)
Matrix entry notation
Given a matrix \(A\text{,}\) the notation \([A]_{ij}\) denotes the \(ij\)-th entry of \(A\text{.}\)
Thus if \(A=[a_{ij}]_{m\times n}\text{,}\) then \([A]_{ij}=a_{ij}\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

Remark 3.1.4.

The matrix-building notation is often used simply to give names to the entries of an arbitrary matrix. However, it can also be used to describe a matrix whose \(ij\)-th entry is given by a specified rule or formula.
For example, let \(A=[a_{ij}]_{2\times 3}\text{,}\) where \(a_{ij}=(i-j)j\text{.}\) This is the \(2\times 3\) matrix whose \(ij\)-th entry is \((i-j)j\text{.}\) Thus
\begin{equation*} A=\begin{bmatrix}(1-1)1 \amp (1-2)2 \amp (1-3)3\\ (2-1)1 \amp (2-2)2 \amp (2-3)3 \end{bmatrix}=\begin{bmatrix}0 \amp -2 \amp -6\\ 1 \amp 0 \amp -3 \end{bmatrix}\text{.} \end{equation*}
In this example we have \([A]_{23}=-3\) and \([A]_{ii}=0\) for \(i=1,2\text{.}\)
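The matrix-building construction above translates directly into code. Below is a plain Python sketch (not Sage) that builds a matrix as a list of rows from a rule on 1-based indices; the helper name `build_matrix` is ours, chosen for illustration.

```python
def build_matrix(m, n, rule):
    """Return the m x n matrix (as a list of rows) whose (i, j) entry is
    rule(i, j), with 1-based indices i and j as in the text."""
    return [[rule(i, j) for j in range(1, n + 1)] for i in range(1, m + 1)]

# The matrix A = [a_ij] of Remark 3.1.4, with a_ij = (i - j) * j.
A = build_matrix(2, 3, lambda i, j: (i - j) * j)
print(A)  # [[0, -2, -6], [1, 0, -3]]
```

Note that the inner comprehension runs over columns and the outer over rows, matching the row-by-row layout of the displayed matrix.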
Using matrix notation we can now precisely define what the rows and columns of a matrix are.

Definition 3.1.5. Rows and columns of a matrix.

Let \(A\) be an \(m\times n\) matrix. For each \(1\leq i\leq m\text{,}\) the \(i\)-th row of \(A\) is the \(n\)-tuple
\begin{equation*} (a_{i 1}, a_{i2}, \dots, a_{i n})\in \R^n\text{.} \end{equation*}
Similarly, for each \(1\leq j\leq n\text{,}\) the \(j\)-th column of \(A\) is the \(m\)-tuple
\begin{equation*} (a_{1 j}, a_{2 j}, \dots, a_{m j})\in \R^m\text{.} \end{equation*}
Given \(n\)-tuples \(\boldr_1, \boldr_2,\dots, \boldr_m\in \R^n\text{,}\) we denote by
\begin{equation*} \begin{bmatrix}\ -\boldr_{1}- \ \\ \ -\boldr_{2}- \ \\ \vdots \\ \ -\boldr_{m}- \ \\ \end{bmatrix} \end{equation*}
the \(m\times n\) matrix whose \(i\)-th row is \(\boldr_i\text{.}\)
Similarly, given \(m\)-tuples \(\boldc_1,\boldc_2,\dots, \boldc_n\in \R^m\text{,}\) we denote by
\begin{equation*} \begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ \boldc_1 \amp \boldc_2\amp \cdots \amp \boldc_n \\ \vert \amp \vert \amp \amp \vert \end{bmatrix} \end{equation*}
the \(m\times n\) matrix whose \(j\)-th column is \(\boldc_j\text{.}\)
In everyday language the notion of equality is taken as self-evident. Two things are equal if they are the same. What more is there to say? In mathematics, each time we introduce a new type of mathematical object (e.g., sets, functions, \(n\)-tuples, etc.) we need to spell out exactly what we mean for two objects to be considered equal. We do so now with matrices.

Definition 3.1.6. Matrix equality.

Let \(A\) and \(B\) be matrices of dimension \(m\times n\) and \(m'\times n'\text{,}\) respectively. The two matrices are equal if
  1. \(m=m'\) and \(n=n'\text{;}\)
  2. \([A]_{ij}=[B]_{ij}\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)
In other words, we have \(A=B\) if and only if \(A\) and \(B\) have the same shape, and each entry of \(A\) is equal to the corresponding entry of \(B\text{.}\)

Example 3.1.7. Matrix equality.

The matrices
\begin{align*} A \amp =\begin{bmatrix}1\amp 2\amp 3\amp 4 \end{bmatrix} \amp B\amp = \begin{bmatrix} 1\\ 2\\ 3\\ 4\end{bmatrix} \end{align*}
are not equal to one another, despite their having the same entries that appear roughly in the same order. In this case equality does not hold as \(A\) and \(B\) have different shapes: \(A\) is \(1\times 4\text{,}\) and \(B\) is \(4\times 1\text{.}\)
The matrices \(A=\begin{bmatrix}1\amp 2 \\3\amp 4 \end{bmatrix}\) and \(B=\begin{bmatrix}1\amp 2\\ 5\amp 4\end{bmatrix}\) have the same dimension, but are not equal since \([A]_{21}=3\ne 5=[B]_{21}\text{.}\)
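Definition 3.1.6 is easy to implement: check shapes first, then compare entries. A plain Python sketch (not Sage), with matrices represented as lists of rows; the function name `matrices_equal` is ours.

```python
def matrices_equal(A, B):
    # Same shape first: same number of rows, and rows of matching lengths.
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        return False
    # Then entrywise comparison.
    return all(a == b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

print(matrices_equal([[1, 2, 3, 4]], [[1], [2], [3], [4]]))  # False: 1x4 vs 4x1
print(matrices_equal([[1, 2], [3, 4]], [[1, 2], [5, 4]]))    # False: (2,1) entries differ
print(matrices_equal([[1, 2], [3, 4]], [[1, 2], [3, 4]]))    # True
```

The first call reproduces Example 3.1.7: the row vector and column vector fail the shape test before any entries are compared.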

Definition 3.1.8. Matrices of particular shape.

A matrix \(A\) is square if its dimension is \(n\times n\text{:}\) i.e., if \(A\in M_{nn}\text{.}\) The diagonal of a square matrix \(A\in M_{nn}\) consists of the entries of \(A\) of the form \([A]_{ii}\) for \(1\leq i\leq n\text{.}\)
A \(1\times n\) matrix
\begin{equation*} \bolda=\begin{bmatrix}a_1\amp a_2\amp \cdots \amp a_n \end{bmatrix} \end{equation*}
is called a row vector. The \(j\)-th entry of a row vector \(\bolda\) is denoted \([\bolda]_j\text{.}\)
An \(n\times 1\) matrix
\begin{equation*} \boldb=\begin{bmatrix}b_1\\ b_2\\\vdots \\ b_n \end{bmatrix}\text{,} \end{equation*}
is called a column vector. The \(i\)-th entry of a column vector \(\boldb\) is denoted \([\boldb]_i\text{.}\)

Remark 3.1.9. Tuples, row vectors, and column vectors.

You are perhaps wondering why we make a distinction between \(n\)-tuples, \(1\times n\) row vectors, and \(n\times 1\) column vectors. One answer is that a matrix is not simply an ordered sequence: it is an ordered sequence arranged in a very particular way. This subtlety is baked into the very definition of matrix equality, and allows us to say that
\begin{equation*} \begin{amatrix}[rr]1\amp 2 \end{amatrix}\ne \begin{amatrix}[r]1\\ 2 \end{amatrix}\text{.} \end{equation*}
There are situations, however, where we don’t need this extra layer of structure, where we want to treat an ordered sequence simply as an ordered sequence. In such situations tuples are preferred to row or column vectors.
Of course there will be times where we wish to treat an ordered sequence now as a tuple and now as a row or column vector. In these situations we will clarify what is meant by using the phrase “treated as a tuple”, “treated as a row vector”, or “treated as a column vector”. For example, the tuple \((1,2)\text{,}\) treated as a row vector, is the \(1\times 2\) matrix \(\begin{bmatrix}1 \amp 2\end{bmatrix}\text{.}\)
That said, the close connection between linear systems and matrix equations makes it very convenient to be able to treat an \(n\)-tuple \((c_1,c_2,\dots, c_n)\) as if it were the column vector
\begin{equation*} \colvec{c_1\\ c_2\\ \vdots \\ c_n}\text{,} \end{equation*}
and vice versa. This conflation is so convenient, in fact, that we will simply declare it to be true by fiat! This means that going forward we are permitted to treat tuples as column vectors and vice versa without further comment.

Sage example 4. Matrix entries, rows, and columns.

Sage syntax for accessing specific entries of a matrix is similar in spirit to our matrix entry notation. However, as with all things Python, we always count from 0. Thus if A is assigned to a matrix in Sage, A[i,j] is its \((i+1),(j+1)\)-th entry.
Prescribed subsets of matrix entries are obtained via slicing methods: for example, A[a:b, c:d] returns the collection of entries \([A]_{ij}\) with \(a+1\leq i\lt b+1\) and \(c+1\leq j\lt d+1\text{,}\) arranged as a matrix.
Leaving the left or right side of : blank in this notation removes the corresponding restriction bound (left or right) from the index in question. Thus A[2, :] returns the third row of \(A\text{,}\) and A[1:, 3] returns the portion of the fourth column of \(A\) beginning with its second entry.
Alternatively, we can obtain a list of all rows or columns of \(A\) using the methods rows() and columns().
Use the empty cell below to try out some of these commands.
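For readers without a Sage session at hand, the same 0-based indexing and slicing conventions can be imitated in plain Python with a list of rows. The helper `submatrix` below is our own stand-in for Sage's `A[a:b, c:d]` syntax.

```python
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

def submatrix(A, a, b, c, d):
    # Mimics Sage's A[a:b, c:d]: rows a..b-1 and columns c..d-1 (0-based).
    return [row[c:d] for row in A[a:b]]

print(submatrix(A, 0, 2, 1, 3))   # [[2, 3], [6, 7]]
print(A[2])                       # third row: [9, 10, 11, 12]
print([row[3] for row in A[1:]])  # fourth column from its second entry on: [8, 12]
```

As in Sage, counting starts at 0, so `A[2]` is the third row in the 1-based notation of the text.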

Subsection 3.1.2 Vector space structure of \(M_{mn}\)

We now lay out the various algebraic operations we will use to combine and transform matrices; we refer to the use of these operations loosely as matrix arithmetic. Some of these operations resemble familiar operations from real arithmetic in terms of their notation and definition. Do not be lulled into complacency! These are new operations defined for a new class of mathematical objects, and must be treated carefully. In particular, pay close attention to (a) exactly what type of mathematical objects serve as inputs for each operation (the ingredients of the operation), and (b) what type of mathematical object is outputted.

Definition 3.1.11. Matrix addition and scalar multiplication.

Let \(m\) and \(n\) be positive integers.
  • Matrix addition.
    Given \(m\times n\) matrices \(A,B\in M_{mn}\text{,}\) we define their matrix sum \(A+B\) to be the \(m\times n\) matrix satisfying
    \begin{equation*} [A+B]_{ij}=[A]_{ij}+[B]_{ij} \end{equation*}
    for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\) Equivalently, if \(A=[a_{ij}]\) and \(B=[b_{ij}]\text{,}\) then
    \begin{equation*} A+B=[a_{ij}+b_{ij}]\text{.} \end{equation*}
    The operation
    \begin{align*} M_{mn}\times M_{mn} \amp \rightarrow M_{mn}\\ (A,B) \amp \mapsto A+B \end{align*}
    is called matrix addition.
  • Given an \(m\times n\) matrix \(A\in M_{mn}\) and scalar \(c\in \R\text{,}\) the scalar multiple of \(A\) by \(c\) is the matrix \(cA\) satisfying
    \begin{equation*} [cA]_{ij}=c[A]_{ij} \end{equation*}
    for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\) Equivalently, if \(A=[a_{ij}]\text{,}\) then
    \begin{equation*} cA=[ca_{ij}]\text{.} \end{equation*}
    The operation
    \begin{align*} \R\times M_{mn} \amp \rightarrow M_{mn}\\ (c,A) \amp \mapsto cA \end{align*}
    is called matrix scalar multiplication.
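Both operations of Definition 3.1.11 are entrywise, which makes them one-liners in code. A plain Python sketch (not Sage) over lists of rows; the function names are ours.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same dimension (lists of rows)."""
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "dimension mismatch"
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mult(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

print(mat_add([[1, 2], [3, 4]], [[0, 1], [1, 0]]))  # [[1, 3], [4, 4]]
print(scalar_mult(2, [[1, 2], [3, 4]]))             # [[2, 4], [6, 8]]
```

The `assert` enforces the point of Remark 3.1.12: the sum is only defined when the two inputs share a dimension.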

Remark 3.1.12.

Observe that matrix addition is not defined for every pair of matrices. The ingredients of matrix addition are two matrices of the same dimension, and the output is a third matrix of this common dimension.
Not surprisingly, as the names of our matrix operations suggest, the set \(M_{mn}\) of all \(m\times n\) matrices constitutes a vector space with respect to matrix addition and scalar multiplication. Before proving this fact, we introduce what will be the zero vectors and vector inverses of these vector spaces.

Definition 3.1.13. Zero matrices.

The \(m\times n\) zero matrix, denoted \(\boldzero_{m\times n}\text{,}\) is the \(m\times n\) matrix, all of whose entries are equal to zero: i.e., \([\boldzero_{m\times n}]_{ij}=0\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)
When there is no confusion about which zero matrix is intended, we will often write simply \(\boldzero\) instead of \(\boldzero_{m\times n}\text{.}\)

Definition 3.1.14. Additive inverse matrix.

The additive inverse of an \(m\times n\) matrix \(A=[a_{ij}]\) is the \(m\times n\) matrix \(-A\) defined as \(-A=[-a_{ij}]\text{.}\)

Remark 3.1.15. Additive inverse matrix.

‘Additive inverse matrix’ is admittedly a bit clunky. We are taking pains here not to call \(-A\) simply the inverse of \(A\text{,}\) as this term is reserved for multiplicative inverses of matrices. (See Definition 3.3.1.)


Remark 3.1.17. Matrix difference and matrix linear combination.

Having established that \(M_{mn}\) is a vector space under matrix addition and scalar multiplication, this set of matrices automatically inherits the various features and properties enjoyed by general vector spaces. For example, the vector difference operation (defined for any vector space) gives rise in the case of \(M_{mn}\) to a matrix difference operation. Namely, for any \(A,B\in M_{mn}\) we have
\begin{equation*} A-B=A+(-B)=A+(-1)B\text{.} \end{equation*}
Similarly, the general notion of a linear combination of vectors gives rise to the notion of matrix linear combinations. Namely, given \(A_1,A_2,\dots, A_k\in M_{mn}\text{,}\) and scalars \(c_1,c_2,\dots, c_k\in \R\text{,}\) we have the matrix linear combination
\begin{equation*} c_1A_1+c_2A_2+\cdots +c_kA_k\text{.} \end{equation*}

Example 3.1.18. Matrix linear combinations.

Let \(A=\begin{amatrix}[rrr]1\amp -1\amp 2\\ 0\amp 0\amp 1\end{amatrix}\) and \(B=\begin{amatrix}[rrr]0\amp 1\amp 1\\ -1\amp -1\amp 1\end{amatrix}\text{.}\) Compute \(2A+(-3)B\text{.}\)
Solution.
\begin{align*} 2A+(-3)B\amp= \begin{amatrix}[rrr]2\amp -2\amp 4\\ 0\amp 0\amp 2\end{amatrix}+\begin{amatrix}[rrr]0\amp -3\amp -3\\ 3\amp 3\amp -3\end{amatrix}\\ \amp=\begin{amatrix}[rrr]2\amp -5\amp 1\\ 3\amp 3\amp -1\end{amatrix} \text{.} \end{align*}
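As a quick sanity check on arithmetic of this kind, the entrywise combination \(2A+(-3)B\) can be computed in a line of plain Python (not Sage):

```python
A = [[1, -1, 2], [0, 0, 1]]
B = [[0, 1, 1], [-1, -1, 1]]

# Entrywise combination 2*A + (-3)*B.
C = [[2 * a + (-3) * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
print(C)  # [[2, -5, 1], [3, 3, -1]]
```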

Example 3.1.19. Expressing matrix as a linear combination.

Show that \(B=\begin{amatrix}[rrr]3\amp -3\amp 3 \end{amatrix}\) can be expressed as a linear combination of the matrices
\begin{equation*} A_1=\begin{amatrix}[rrr]1\amp 1\amp 1\end{amatrix}, \ A_2=\begin{amatrix}[rrr]1\amp -1\amp 0\end{amatrix}, \ A_3=\begin{amatrix}[rrr]1\amp 1\amp -2\end{amatrix}\text{.} \end{equation*}
Solution.
We must solve the matrix (or row vector) equation
\begin{equation*} aA_1+bA_2+cA_3=B \end{equation*}
for the scalars \(a,b,c\text{.}\) Computing the linear combination on the left yields the matrix equation
\begin{equation*} \begin{amatrix}[rrr]a+b+c\amp a-b+c\amp a-2c\end{amatrix}=\begin{amatrix}[rrr]3\amp -3\amp 3\end{amatrix}\text{.} \end{equation*}
Using the definition of matrix equality (Definition 3.1.6), we get the system of equations
\begin{equation*} \begin{linsys}{3} a \amp +\amp b \amp + \amp c \amp = \amp 3\\ a \amp-\amp b\amp +\amp c\amp =\amp -3\\ a \amp \amp \amp -\amp 2c\amp =\amp 3 \end{linsys}\text{.} \end{equation*}
Using Gaussian elimination we find that there is a unique solution to this system: namely, \((a,b,c)=(1,3,-1)\text{.}\) We conclude that \(B=A_1+3A_2+(-1)A_3=A_1+3A_2-A_3\text{.}\)
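The claimed solution is easy to verify by computing the linear combination directly. A plain Python check (not Sage), treating the three row vectors as lists:

```python
A1, A2, A3 = [1, 1, 1], [1, -1, 0], [1, 1, -2]

# Coefficients found by Gaussian elimination: (a, b, c) = (1, 3, -1).
a, b, c = 1, 3, -1
combo = [a * x + b * y + c * z for x, y, z in zip(A1, A2, A3)]
print(combo)  # [3, -3, 3], which is B
```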

Remark 3.1.20.

Let \(A_1, A_2,\dots, A_r\) be \(m\times n\) matrices. An easy induction argument on \(r\) shows that for any scalars \(c_1,c_2,\dots, c_r\) we have
\begin{equation*} [c_1A_1+c_2A_2+\cdots +c_rA_r]_{ij} =c_1[A_1]_{ij}+c_2[A_2]_{ij}+\cdots +c_r[A_r]_{ij} \end{equation*}
for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{.}\) (See Exercise 3.1.6.11.)

Subsection 3.1.3 Matrix multiplication

So how do we define the product of two matrices? Looking at the previous operations, you might have guessed that we should define the product of two \(m\times n\) matrices by taking the product of their corresponding entries. Not so!

Definition 3.1.21. Matrix multiplication.

Let \(m,n,r\) be positive integers. Given an \(m\times n\) matrix \(A\in M_{mn}\) and an \(n\times r\) matrix \(B\in M_{nr}\text{,}\) we define their product to be the \(m\times r\) matrix \(AB\) satisfying
\begin{equation} [AB]_{ij}=[A]_{i1}[B]_{1j}+[A]_{i2}[B]_{2j}+\cdots +[A]_{in}[B]_{nj}=\sum_{k=1}^n[A]_{ik}[B]_{kj}\tag{3.2} \end{equation}
for all \(1\leq i\leq m\) and \(1\leq j\leq r\text{.}\) Equivalently, if \(A=[a_{ij}]\) and \(B=[b_{ij}]\text{,}\) then \(AB=[c_{ij}]\text{,}\) where
\begin{equation*} c_{ij}=\sum_{k=1}^na_{ik}b_{kj}\text{.} \end{equation*}
The operation
\begin{align*} M_{mn}\times M_{nr} \amp \rightarrow M_{mr}\\ (A,B) \amp \mapsto AB \end{align*}
is called matrix multiplication.
Figure 3.1.22. In \(C=AB\text{,}\) the \(ij\)-th entry \(c_{ij}=\sum_{k=1}^na_{ik}b_{kj}\) is computed by moving across the \(i\)-th row of \(A\) and down the \(j\)-th column of \(B\text{.}\)

Remark 3.1.23. Size and matrix multiplication.

Observe how, like addition, matrix multiplication is not defined for every pair of matrices: there must be a certain agreement in their dimensions.
In more detail, for the product of an \(m\times n\) matrix \(A\) and a \(p\times r\) matrix \(B\) to be defined, we need \(n=p\text{.}\) In other words we need the “inner” dimensions of \(A\) and \(B\) to be equal:
\begin{equation*} \underset{m\times \boxed{n}}{A}\hspace{5pt} \underset{\boxed{n}\times r}{B}\text{.} \end{equation*}
If this condition is met, the dimension of the resulting matrix \(AB\) is determined by the “outer” dimensions of \(A\) and \(B\text{.}\) Schematically, you can think of the inner dimensions as being “canceled out”:
\begin{equation*} \underset{\boxed{m}\times\cancel{n}}{A}\hspace{5pt}\underset{\cancel{n}\times\boxed{r}}{B}=\underset{m\times r}{AB}. \end{equation*}

Example 3.1.24. Matrix multiplication.

Consider the matrices
\begin{align*} A\amp =\begin{amatrix}[rrr] 1\amp 0\amp -3 \\ -2\amp 1\amp 1 \end{amatrix} \amp B\amp =\begin{amatrix}[rr] 0\amp -1 \\ -1\amp 2 \\ 3\amp 1 \end{amatrix}\text{.} \end{align*}
Since the “inner dimensions” of \(A\) and \(B\) agree, we can form the product matrix \(C=AB\text{,}\) which has dimension \(2\times 2\text{.}\) Let \(c_{ij}=[C]_{ij}\) for all \(1\leq i,j\leq 2\text{.}\) Using Definition 3.1.21, we compute
\begin{align*} c_{11}\amp =1\cdot 0+0\cdot(-1)+(-3)\cdot 3=-9 \\ c_{12}\amp =1\cdot(-1)+0\cdot 2+(-3)\cdot 1=-4 \\ c_{21} \amp =(-2)\cdot 0+1\cdot (-1)+1\cdot 3=2 \\ c_{22} \amp =(-2)\cdot( -1)+1\cdot 2+1\cdot 1=5\text{.} \end{align*}
We conclude that
\begin{equation*} C=\begin{bmatrix}c_{11} \amp c_{12} \\ c_{21} \amp c_{22}\end{bmatrix}= \begin{amatrix}[rr]-9\amp -4 \\ 2\amp 5 \end{amatrix} \text{.} \end{equation*}
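Formula (3.2) translates directly into a triple loop (or, here, nested comprehensions). A plain Python sketch (not Sage) over lists of rows; the function name `mat_mult` is ours. It reproduces the product just computed.

```python
def mat_mult(A, B):
    """Product of an m x n matrix A and an n x r matrix B (lists of rows),
    with [AB]_ij = sum_k A_ik * B_kj."""
    m, n, r = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "inner dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(m)]

A = [[1, 0, -3], [-2, 1, 1]]
B = [[0, -1], [-1, 2], [3, 1]]
print(mat_mult(A, B))  # [[-9, -4], [2, 5]]
```

The `assert` enforces the inner-dimension condition of Remark 3.1.23 before any entries are computed.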
Formula (3.2) for the \(ij\)-th entry of a matrix product \(AB\) is easily identified as the dot product of the \(i\)-th row of \(A\) with the \(j\)-th column of \(B\text{.}\) This gives us a succinct way of describing the entries of the product \(AB\) in terms of the dot product.

Proof.

Let \(A=[a_{ij}]\) and \(B=[b_{ij}]\text{,}\) so that
\begin{align*} \bolda_i \amp = (a_{i1},a_{i2},\dots, a_{in})\\ \boldb_j \amp = (b_{1j},b_{2j},\dots, b_{nj}) \end{align*}
for all \(1\leq i\leq m\) and \(1\leq j\leq r\text{.}\) Given any pair \((i,j)\) with \(1\leq i\leq m\) and \(1\leq j\leq r\text{,}\) we have
\begin{align*} [AB]_{ij} \amp = \sum_{k=1}^na_{ik}b_{kj} \amp( \knowl{./knowl/xref/d_matrix_mult.html}{\text{3.1.21}}) \\ \amp =\bolda_i\cdot \boldb_j \amp (\knowl{./knowl/xref/d_dot_product.html}{\text{1.2.1}}) \text{,} \end{align*}
as claimed.

Example 3.1.26. Matrix multiplication via dot product.

Consider the matrices
\begin{align*} A \amp =\begin{amatrix}[rrrr]1\amp 1\amp 1\amp 1\\ 1\amp 2\amp 1\amp 2\end{amatrix} \amp B\amp =\begin{amatrix}[rr] 1\amp -1 \\ 0\amp 1\\ 1\amp 0 \\ 0\amp 0 \end{amatrix}\text{.} \end{align*}
The two rows of \(A\) are
\begin{align*} \bolda_1 \amp =(1,1,1,1) \amp \bolda_2=(1,2,1,2)\text{.} \end{align*}
The two columns of \(B\) are
\begin{align*} \boldb_1 \amp =(1,0,1,0) \amp \boldb_2=(-1,1,0,0)\text{.} \end{align*}
Using the dot product description of matrix multiplication, we compute
\begin{align*} AB \amp =\begin{bmatrix}\bolda_1\cdot \boldb_1 \amp \bolda_1\cdot \boldb_2 \\ \bolda_2\cdot \boldb_1 \amp \bolda_2\cdot \boldb_2 \end{bmatrix}\\ \amp = \begin{amatrix}[rr] 2 \amp 0 \\ 2\amp 1\end{amatrix}\text{.} \end{align*}
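The dot-product description also codes up cleanly: extract the columns of \(B\), then fill in each entry as a dot product of a row of \(A\) with a column of \(B\). A plain Python sketch (not Sage) reproducing Example 3.1.26:

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, 1, 1, 1], [1, 2, 1, 2]]
B = [[1, -1], [0, 1], [1, 0], [0, 0]]

cols_B = list(zip(*B))  # columns of B as tuples
AB = [[dot(row, col) for col in cols_B] for row in A]
print(AB)  # [[2, 0], [2, 1]]
```

The idiom `zip(*B)` transposes a list of rows into its columns, which is exactly what the \(j\)-th column tuples \(\boldb_j\) are.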
The definition of a matrix product \(AB\) is undoubtedly more complicated than you expected, and seems to come completely out of the blue. All of this will make more sense once we begin thinking of matrices \(A\) as defining certain functions \(T_A\text{.}\) Our formula for the entries of \(AB\) is chosen precisely so that this new matrix corresponds to the composition of the functions \(T_A\) and \(T_B\text{:}\) i.e. so that
\begin{equation*} T_{AB}=T_A\circ T_B\text{.} \end{equation*}
(See Theorem 5.1.41.) Under this interpretation, the ponderous restriction on the dimensions of the ingredient matrices ensures that the two functions \(T_A\) and \(T_B\) can be composed.

Sage example 5. Matrix arithmetic.

We use + and * for matrix addition and multiplication.
As evidence of Sage’s flexibility, the same symbol * is also used for scalar multiplication.
Edit the cell below to practice these operations.

Subsection 3.1.4 Alternative methods of multiplication

In addition to the given definition of matrix multiplication, we will make heavy use of two further ways of computing matrix products, called the column and row methods of matrix multiplication.

Proof.

We prove the equalities in both steps separately.
Proof of Step 1.
We must show \(AB=C\text{,}\) where
\begin{equation*} C=\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ A\boldb_1\amp A\boldb_2\amp \cdots\amp A\boldb_r\\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{.} \end{equation*}
First we show \(AB\) and \(C\) have the same size. By definition of matrix multiplication, \(AB\) is \(m\times r\text{.}\) By construction \(C\) has \(r\) columns and its \(j\)-th column is \(A\boldb_j\text{.}\) Since \(A\) and \(\boldb_j\) have size \(m\times n\) and \(n\times 1\text{,}\) respectively, \(A\boldb_j\) has size \(m\times 1\text{.}\) Thus each of the \(r\) columns of \(C\) is an \(m\times 1\) column vector. It follows that \(C\) is \(m\times r\text{,}\) as desired.
Next we show that \([AB]_{ij}=[C]_{ij}\) for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq r\text{.}\) Since the \(ij\)-th entry of \(C\) is the \(i\)-th entry of the \(j\)-th column of \(C\text{,}\) we have
\begin{align*} [C]_{ij} \amp= [A\boldb_j]_{i} \\ \amp=\sum_{k=1}^n a_{ik}b_{kj} \\ \amp =[AB]_{ij}\text{.} \end{align*}
Proof of Step 2.
We must show that \(A\boldb=\boldc\text{,}\) where
\begin{equation*} \boldc=b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n\text{.} \end{equation*}
The usual argument shows that both \(A\boldb\) and \(\boldc\) are \(m\times 1\) column vectors. It remains only to show that the \(i\)-th entry \([A\boldb]_i\) of the column \(A\boldb\) is equal to the \(i\)-th entry \([\boldc]_i\) of \(\boldc\) for all \(1\leq i\leq m\text{.}\) For any such \(i\) we have
\begin{align*} [\boldc]_i \amp = [b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n]_i\\ \amp= b_1[\bolda_1]_i+b_2[\bolda_2]_i+\cdots +b_n[\bolda_n]_i \amp (\knowl{./knowl/xref/rm_entry_lin_comb.html}{\text{Remark 3.1.20}})\\ \amp= b_1a_{i1}+b_2a_{i2}+\cdots +b_na_{in}\amp (\text{def. of } \bolda_j) \\ \amp= a_{i1}b_1+a_{i2}b_2+\cdots+a_{in}b_n \\ \amp =[A\boldb]_i \amp (\knowl{./knowl/xref/d_matrix_mult.html}{\text{Definition 3.1.21}})\text{.} \end{align*}

Remark 3.1.28.

Theorem 3.1.27 amounts to a two-step process for computing an arbitrary matrix product \(AB\text{.}\)
The first statement (Step 1) tells us that the \(j\)-th column of the matrix \(AB\) can be obtained by computing the product \(A\,\boldb_j\) of \(A\) with the \(j\)-th column of \(B\text{.}\)
The second statement (Step 2) tells us that each product \(A\,\boldb_j\) can itself be computed as a certain linear combination of the columns of \(A\) with coefficients drawn from \(\boldb_j\text{.}\)
A similar remark applies to computing matrix products using the row method, as described below in Theorem 3.1.29.
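The two-step column method above can be sketched as follows in plain Python (not Sage); the function name `column_method` is ours. Step 1 builds \(AB\) one column at a time, and Step 2 computes each column \(A\boldb_j\) as a linear combination of the columns of \(A\text{.}\)

```python
def column_method(A, B):
    """Compute AB column by column (lists of rows).
    Step 1: column j of AB is A * b_j, where b_j is column j of B.
    Step 2: A * b_j is the combination sum_k B[k][j] * (column k of A)."""
    m, n, r = len(A), len(B), len(B[0])
    cols = []
    for j in range(r):
        cols.append([sum(B[k][j] * A[i][k] for k in range(n)) for i in range(m)])
    # Reassemble the matrix row by row from the computed columns.
    return [[cols[j][i] for j in range(r)] for i in range(m)]

A = [[1, 0, -3], [-2, 1, 1]]
B = [[0, -1], [-1, 2], [3, 1]]
print(column_method(A, B))  # [[-9, -4], [2, 5]]
```

The output agrees with the entry-by-entry computation of Example 3.1.24, as the theorem guarantees.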


Example 3.1.30. Column and row methods.

Let \(A=\begin{amatrix}[rrr] 1\amp 1 \amp -2 \\ 1\amp 3\amp 2\end{amatrix}\) and \(B=\begin{amatrix}[rr]1\amp 1\\ 0\amp 1 \\ -2\amp 1 \end{amatrix}\text{.}\)
Compute \(AB\) using (a) the definition of matrix multiplication, (b) the column method, (c) the row method.
Solution.
  1. Using the definition, we see easily that
    \begin{equation*} AB=\begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \end{equation*}
  2. Let \(\bolda_1, \bolda_2, \bolda_3\) be the columns of \(A\text{,}\) and let \(\boldb_1, \boldb_2\) be the columns of \(B\text{.}\) We have
    \begin{align*} AB \amp= \begin{amatrix}[cc]\vert \amp \vert \\ A\boldb_1\amp A\boldb_2 \\ \vert\amp \vert\end{amatrix} \amp \text{(Step 1)} \\ \amp= \begin{amatrix}[cc]\vert \amp \vert \\ (1\bolda_1+0\bolda_2-2\bolda_3)\amp (\bolda_1+\bolda_2+\bolda_3) \\ \vert\amp \vert\end{amatrix} \amp \text{(Step 2)}\\ \amp= \begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \amp \text{(arithmetic)} \end{align*}
  3. Now let \(\bolda_1, \bolda_2\) be the rows of \(A\text{,}\) and let \(\boldb_1, \boldb_2, \boldb_3\) be the rows of \(B\text{.}\) We have
    \begin{align*} AB \amp= \begin{amatrix}[c]--\bolda_1\, B--\\ --\bolda_2\, B-- \end{amatrix}\amp \text{(Step 1)}\\ \amp= \begin{amatrix}[c]--(1\boldb_1+1\boldb_2-2\boldb_3)-- \\ --(1\boldb_1+3\boldb_2+2\boldb_3)-- \end{amatrix} \amp \text{(Step 2)} \\ \amp=\begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \amp \text{(arithmetic)} \end{align*}
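The row method of part (c) can likewise be checked in plain Python (not Sage): each row of \(AB\) is the linear combination of the rows of \(B\) whose coefficients come from the corresponding row of \(A\text{.}\) The helper name `lin_comb_rows` is ours.

```python
def lin_comb_rows(coeffs, rows):
    """Linear combination sum_k coeffs[k] * rows[k] of equal-length rows."""
    return [sum(c * r[j] for c, r in zip(coeffs, rows)) for j in range(len(rows[0]))]

A = [[1, 1, -2], [1, 3, 2]]
B = [[1, 1], [0, 1], [-2, 1]]

# Row i of AB is the combination of the rows of B with coefficients from row i of A.
AB = [lin_comb_rows(row, B) for row in A]
print(AB)  # [[5, 0], [-3, 6]]
```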

Sage example 6. Column and row methods.

Let’s verify the validity of the column and row methods using Sage in some specific examples. Below we generate random integer matrices \(A\) and \(B\) of dimension \(3\times 5\) and \(5\times 4\text{,}\) respectively, and compute their product \(C=AB\text{.}\)
Let’s check that the \(j\)-th column of \(C\) is equal to the product of \(A\) with the \(j\)-th column of \(B\text{.}\)
Alternatively, we can visually confirm these equalities using the display of \(C\) in the first cell above. Observe that the result of A*colsB[i] is displayed by Sage as a tuple, though technically for us this is a column vector.
Next, let’s verify that the result of multiplying \(A\) and the \(j\)-th column of \(B\) is the corresponding linear combination of the columns of \(A\) given by the coefficients of this column.
Now use the Sage cells below to demonstrate the validity of the row method for the product \(C=AB\text{.}\) Simply modify the code in the two cells above to reflect the row method, as opposed to the column method.

Example 3.1.31. Video example of matrix multiplication.

Figure 3.1.32. Video: three methods of matrix multiplication

Subsection 3.1.5 Transpose of a matrix

We end this section with one last operation, matrix transposition. We will not make much use of this operation until later, but this is as good a place as any to introduce it.

Definition 3.1.33. Matrix transposition.

Given an \(m\times n\) matrix \(A=[a_{ij}]\) its transpose \(A^T\) is the matrix whose \(ij\)-entry is the \(ji\)-th entry of \(A\text{.}\) In other words, \(A^T\) is the \(n\times m\) matrix satisfying \([A^T]_{ij}=[A]_{ji}\) for all \(1\leq i\leq n\) and \(1\leq j\leq m\text{.}\)

Remark 3.1.34.

Given a matrix \(A\) we can give a column- or row-based description of \(A^T\) as follows:
  • \(A^T\) is the matrix whose \(i\)-th row is the \(i\)-th column of \(A\text{.}\)
  • \(A^T\) is the matrix whose \(j\)-th column is the \(j\)-th row of \(A\text{.}\)

Example 3.1.35. Transpose.

Let \(A=\begin{bmatrix}1\amp 2\amp 3\\4\amp 5\amp 6 \end{bmatrix}\text{;}\) then \(A^T=\begin{bmatrix}1\amp 4\\2\amp 5\\3\amp 6 \end{bmatrix}\text{.}\)
Let \(B=\begin{bmatrix}1\\0\\3 \end{bmatrix}\text{;}\) then \(B^T=\begin{bmatrix}1\amp 0\amp 3 \end{bmatrix}\text{.}\)
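The row/column description of Remark 3.1.34 is precisely what a one-line transpose in plain Python (not Sage) computes: the rows of the result are the columns of the input.

```python
def transpose(A):
    """The i-th row of the result is the i-th column of A (lists of rows)."""
    return [list(col) for col in zip(*A)]

print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
print(transpose([[1], [0], [3]]))         # [[1, 0, 3]]
```

Both calls reproduce Example 3.1.35; note the second turns a column vector into a row vector, as expected.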

Sage example 7. Matrix transposition.

Matrix transposition is implemented in Sage as the transpose() method. In the cell below we (a) choose random integers \(1\leq m,n\leq 6\text{,}\) (b) choose a random \(m\times n\) matrix \(A\) with integer entries, and (c) compute the transpose of \(A\text{.}\)
As usual, experiment with the Sage cell below.

Exercises 3.1.6 Exercises

WeBWork Exercises

1.
Enter T or F depending on whether the statement is true or false. (You must enter T or F -- True and False will not work.)
  1. If A is a square matrix such that AA equals the 0 matrix, then A must equal the 0 matrix.
  2. If A has dimensions \(m \times r\) and B has dimensions \(r \times n\) , then AB has dimensions \(r \times n\text{.}\)
2.
Matrix Products: Consider the matrices
\begin{equation*} A = \begin{pmatrix}4\amp 7\amp 3\\9\amp 6\amp 9\end{pmatrix}, B = \begin{pmatrix}1\amp 7\amp 10\amp 8\\6\amp 6\amp 10\amp 6\\ 3\amp 8\amp 7\amp 5\end{pmatrix},\textrm{ and } C = \begin{pmatrix}9\amp 9\\6\amp 1\\1\amp 7\\3\amp 7\end{pmatrix} \end{equation*}
Of the possible matrix products \(ABC, ACB, BAC, BCA, CAB, CBA\text{,}\)
which make sense?
Answer.
\(ABC, BCA, CAB\)
4.
Determine \(x\) and \(y\) such that
\begin{equation*} \left[\begin{array}{ccc} 0\amp -4\amp 0 \cr -2\amp 1\amp 3 \end{array}\right] + \left[\begin{array}{ccc} x-y \amp -2 \amp 3 \\ 0 \amp x \amp 0 \end{array} \right] = \left[\begin{array}{ccc} 4 \amp -6 \amp 3 \\ -2 \amp 2 x +y \amp 3 \end{array}\right] \end{equation*}
\(x =\)
\(y =\)
Answer 1.
\(2.5\)
Answer 2.
\(-1.5\)
5.
Determine the value(s) of \(x\) such that
\(\left[\begin{array}{ccc} x \amp 2 \amp 1\cr \end{array}\right] \left[\begin{array}{ccc} 2 \amp 0 \amp -2\cr 0 \amp -3 \amp 0\cr -2 \amp 5 \amp -1 \end{array}\right] \left[\begin{array}{c} x\cr -1\cr 1\cr \end{array}\right] = [0]\)
\(x\) =
Note: If there is more than one value separate them by commas.
Answer.
\(0, 2\)

Written Exercises

6.
For each part below write down the most general \(3\times 3\) matrix \(A=[a_{ij}]\) satisfying the given condition (use letter names \(a,b,c\text{,}\) etc. for entries).
  1. \(a_{ij}=a_{ji}\) for all \(i,j\text{.}\)
  2. \(a_{ij}=-a_{ji}\) for all \(i,j\text{.}\)
  3. \(a_{ij}=0\) for \(i\ne j\text{.}\)
7.
Let
\begin{equation*} A = \begin{bmatrix}3\amp 0\\ -1\amp 2\\ 1\amp 1 \end{bmatrix} , \hspace{5pt} B = \begin{bmatrix}4\amp -1\\ 0\amp 2 \end{bmatrix} , \hspace{5pt} C = \begin{bmatrix}1\amp 4\amp 2\\ 3\amp 1\amp 5 \end{bmatrix} \end{equation*}
\begin{equation*} D = \begin{bmatrix}1\amp 5\amp 2\\ -1\amp 0\amp 1\\ 3\amp 2\amp 4 \end{bmatrix} , \hspace{5pt} E = \begin{bmatrix}6\amp 1\amp 3\\ -1\amp 1\amp 2\\ 4\amp 1\amp 3 \end{bmatrix}\text{.} \end{equation*}
Compute the following matrices, or else explain why the given expression is not well defined.
  1. \(\displaystyle (2D^T-E)A\)
  2. \(\displaystyle (4B)C+2B\)
  3. \(\displaystyle B^T(CC^T-A^TA)\)
8.
Let
\begin{equation*} A = \begin{bmatrix}3\amp -2\amp 7\\ 6\amp 5\amp 4\\ 0\amp 4\amp 9 \end{bmatrix} , \hspace{5pt} B = \begin{bmatrix}6\amp -2\amp 4\\ 0\amp 1\amp 3\\ 7\amp 7\amp 5 \end{bmatrix}\text{.} \end{equation*}
Compute the following using either the row or column method of matrix multiplication. Make sure to show how you are using the relevant method.
  1. the first column of \(AB\text{;}\)
  2. the second row of \(BB\text{;}\)
  3. the third column of \(AA\text{.}\)
Solution.
  1. Using expansion by columns, the first column of \(AB\) is given by \(A\) times the first column of \(B\text{.}\) We compute
    \begin{equation*} \begin{bmatrix}3\amp -2\amp 7\\ 6\amp 5\amp 4\\ 0\amp 4\amp 9 \end{bmatrix} \begin{bmatrix}6\\ 0\\ 7 \end{bmatrix} = 6 \begin{amatrix}[r]3 \\ 6 \\ 0 \end{amatrix}+0 \begin{amatrix}[r]-2 \\ 5 \\ 4 \end{amatrix}+7\begin{amatrix}[r]7 \\ 4 \\ 9 \end{amatrix}= \begin{bmatrix}67\\ 64\\ 63 \end{bmatrix} \end{equation*}
9.
Use the row or column method to quickly compute the following product:
\begin{equation*} \begin{amatrix}[rrrrr]1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1 \end{amatrix} \begin{amatrix}[rrrr]1\amp 1\amp 1\amp 1\\ -1\amp 0\amp 0\amp 0\\ 0\amp 1\amp 0\amp 0\\ 0\amp 0\amp 2\amp 0\\ 0\amp 0\amp 0\amp 3 \end{amatrix} \end{equation*}
Solution.
I’ll just describe the row method here.
Note that the rows of \(A\) are all identical, and equal to \(\begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix}\text{.}\) From the row method it follows that each row of \(AB\) is given by
\begin{equation*} \begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix} B\text{.} \end{equation*}
Thus the rows of \(AB\) are all identical, and the row method computes the product above by taking the corresponding alternating sum of the rows of \(B\text{:}\)
\begin{equation*} \begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix} B=\begin{bmatrix}2\amp 2\amp -1\amp 4 \end{bmatrix}\text{.} \end{equation*}
Thus \(AB\) is the \(5\times 4\) matrix, all of whose rows are \(\begin{bmatrix}2\amp 2\amp -1\amp 4 \end{bmatrix}\text{.}\)
10.
Each of the \(3\times 3\) matrices \(B_i\) below performs a specific row operation when multiplying a \(3\times n\) matrix \(A=\begin{bmatrix}-\boldr_1-\\ -\boldr_2-\\ -\boldr_3- \end{bmatrix}\) on the left; i.e., the matrix \(B_iA\) is the result of performing a certain row operation on the matrix \(A\text{.}\) Use the row method of matrix multiplication to decide what row operation each \(B_i\) performs.
\begin{equation*} B_1=\begin{bmatrix}1\amp 0\amp 0\\ 0\amp 1\amp 0\\ -2\amp 0\amp 1 \end{bmatrix} , B_2=\begin{bmatrix}1\amp 0\amp 0\\ 0\amp \frac{1}{2}\amp 0\\ 0\amp 0\amp 1 \end{bmatrix} , B_3=\begin{bmatrix}0\amp 0\amp 1\\ 0\amp 1\amp 0\\ 1\amp 0\amp 0 \end{bmatrix}\text{.} \end{equation*}
11.
Let \(r\geq 2\) be an integer. Prove, by induction on \(r\text{,}\) that for any \(m\times n\) matrices \(A_1, A_2,\dots, A_r\) and scalars \(c_1,c_2,\dots, c_r\text{,}\) we have
\begin{equation*} [c_1A_1+c_2A_2+\cdots +c_rA_r]_{ij} =c_1[A_1]_{ij}+c_2[A_2]_{ij}+\cdots +c_r[A_r]_{ij} \end{equation*}
for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{.}\)