
Section 3.2 Linear transformations

As detailed in Definition 3.2.1, a linear transformation is a special type of function between two vector spaces: one that respects, in some sense, the vector operations of both spaces.
This manner of theorizing is typical in mathematics: first we introduce a special class of objects defined axiomatically, then we introduce special functions or maps between these objects. Since the original objects of study (e.g. vector spaces) come equipped with special structural properties (e.g. vector operations), the functions we wish to study are the ones that somehow acknowledge this structure.
You have already seen this principle at work in your study of calculus. First we give \(\R\) some structure by defining a notion of proximity (i.e., \(x\) is close to \(y\) if \(\val{x-y}\) is small), then we introduce a special family of functions that somehow respects this structure: these are precisely the continuous functions!
As you will see, linear transformations are not just interesting objects of study in their own right, they also serve as invaluable tools in our continued exploration of the intrinsic properties of vector spaces.
In the meantime rejoice in the fact that we can now give a succinct definition of linear algebra: it is the theory of vector spaces and the linear transformations between them. Go shout it from the rooftops!

Subsection 3.2.1 Linear transformations

First and foremost, a linear transformation is a function. Before continuing on in this section, you may want to reacquaint yourself with the basic function concepts and notation outlined in Section 0.2.

Definition 3.2.1. Linear transformations.

Let \(V\) and \(W\) be vector spaces. A function \(T\colon V\rightarrow W\) is a linear transformation (or linear) if it satisfies the following properties:
  1. For all \(\boldv_1, \boldv_2\in V\text{,}\) we have \(T(\boldv_1+\boldv_2)=T(\boldv_1)+T(\boldv_2)\text{.}\)
  2. For all \(c\in \R\) and \(\boldv\in V\) we have \(T(c\boldv)=cT(\boldv)\text{.}\)
A function between vector spaces is nonlinear if it is not a linear transformation.

Remark 3.2.2.

How precisely does a linear transformation “respect” vector space structure? In plain English: the image of a sum is the sum of the images, and the image of a scalar multiple is the scalar multiple of the image.
As our first examples of linear transformations, we define the zero transformation and identity transformation on a vector space.

Definition 3.2.3. Zero and identity transformation.

Let \(V\) and \(W\) be vector spaces.
The zero transformation from \(V\) to \(W\), denoted \(T_0\text{,}\) is defined as follows:
\begin{align*} T_0\colon V \amp\rightarrow W \\ \boldv \amp\mapsto T_0(\boldv)=\boldzero_W\text{,} \end{align*}
where \(\boldzero_W\) is the zero vector of \(W\text{.}\) In other words, \(T_0\) is the function that maps all elements of \(V\) to the zero vector of \(W\text{.}\)
The identity transformation of \(V\), denoted \(\id_V\text{,}\) is defined as follows:
\begin{align*} \id_V\colon V \amp\rightarrow V \\ \boldv \amp\mapsto \id_V(\boldv)=\boldv\text{.} \end{align*}
In other words, \(\id_V(\boldv)=\boldv\) for all \(\boldv\in V\text{.}\)

Example 3.2.4. Model of linear transformation proof.

Let’s show that the zero and identity functions are indeed linear transformations.
Let \(V\) and \(W\) be vector spaces, and let \(T_0\colon V\rightarrow W\) be the zero function. We verify each defining property separately.
  1. Given \(\boldv_1, \boldv_2\in V\text{,}\) we have
    \begin{align*} T_0(\boldv_1+\boldv_2)\amp =\boldzero_W \amp (\text{by def.}) \\ \amp =\boldzero_W+\boldzero_W \\ \amp = T_0(\boldv_1)+T_0(\boldv_2) \amp (\text{by def.})\text{.} \end{align*}
  2. Given \(c\in \R\) and \(\boldv\in V\text{,}\) we have
    \begin{align*} T_0(c\boldv) \amp = \boldzero_W \amp (\text{by def.})\\ \amp = c\boldzero_W \amp (\knowl{./knowl/th_vectorspace_props.html}{\text{3.1.16}}) \\ \amp = cT_0(\boldv) \amp (\text{by def.})\text{.} \end{align*}
This proves that \(T_0\colon V\rightarrow W\) is a linear transformation.
Now let \(V\) be a vector space, and let \(\id_V\colon V\rightarrow V\) be the identity function.
  1. Given \(\boldv_1, \boldv_2\in V\text{,}\) we have
    \begin{align*} \id_V(\boldv_1+\boldv_2)\amp =\boldv_1+\boldv_2 \amp (\text{by def.})\\ \amp = \id_V(\boldv_1)+\id_V(\boldv_2) \amp (\text{by def.})\text{.} \end{align*}
  2. Given \(c\in \R\) and \(\boldv\in V\text{,}\) we have
    \begin{align*} \id_V(c\boldv) \amp = c\boldv \amp (\text{by def.})\\ \amp = c\id_V(\boldv) \amp (\text{by def.}) \text{.} \end{align*}
This proves that \(\id_V\colon V\rightarrow V\) is a linear transformation.

Theorem 3.2.5. Properties of linear transformations.

Let \(T\colon V\rightarrow W\) be a linear transformation. Then:
  1. \(T(\boldzero_V)=\boldzero_W\text{.}\)
  2. \(T(-\boldv)=-T(\boldv)\) for all \(\boldv\in V\text{.}\)
  3. \(T(c_1\boldv_1+c_2\boldv_2+\cdots +c_n\boldv_n)=c_1T(\boldv_1)+c_2T(\boldv_2)+\cdots +c_nT(\boldv_n)\) for all \(c_1,c_2,\dots, c_n\in \R\) and \(\boldv_1,\boldv_2,\dots, \boldv_n\in V\text{.}\)

Proof.
  1. We employ trickery similar to that used in the proof of Theorem 3.1.16. Assuming \(T\) is linear:
    \begin{align*} T(\boldzero_V) \amp= T(\boldzero_V+\boldzero_V)\\ \amp =T(\boldzero_V)+T(\boldzero_V) \amp (\knowl{./knowl/d_linear_transform.html}{\text{Definition 3.2.1}}) \text{.} \end{align*}
    Thus, whatever \(T(\boldzero_V)\in W\) may be, it satisfies
    \begin{equation*} T(\boldzero_V)=T(\boldzero_V)+T(\boldzero_V)\text{.} \end{equation*}
    Canceling \(T(\boldzero_V)\) on both sides using \(-T(\boldzero_V)\text{,}\) we conclude
    \begin{equation*} \boldzero_W=T(\boldzero_V)\text{.} \end{equation*}
  2. Given \(\boldv\in V\text{,}\) the argument is similar:
    \begin{align*} \boldzero_W \amp= T(\boldzero_V) \amp (\text{by (1)})\\ \amp =T(-\boldv+\boldv)\\ \amp = T(-\boldv)+T(\boldv)\text{.} \end{align*}
    Since \(\boldzero_W=T(-\boldv)+T(\boldv)\text{,}\) adding \(-T(\boldv)\) to both sides of the equation yields
    \begin{equation*} -T(\boldv)=T(-\boldv)\text{.} \end{equation*}
  3. This is an easy proof by induction using the two defining properties of a linear transformation in tandem.

Remark 3.2.6.

Statement (3) of Theorem 3.2.5 provides a more algebraic interpretation of how linear transformations preserve vector space structure: namely, they distribute over linear combinations of vectors.

Subsection 3.2.2 Matrix transformations

We now describe what turns out to be an entire family of examples of linear transformations: so-called matrix transformations of the form \(T_A\colon \R^n\rightarrow \R^m\text{,}\) where \(A\) is a given \(m\times n\) matrix. This is a good place to recall the matrix mantra. Not only can a matrix represent a system of linear equations, it can represent a linear transformation. These are two very different concepts, and the matrix mantra helps us to not confuse the two. In the end a matrix is just a matrix: a mathematical tool that can be employed to diverse ends. Observe that the definition of matrix multiplication marks the first point where Fiat 3.1.8 comes into play.

Definition 3.2.8. Matrix transformations.

Let \(A\) be an \(m\times n\) matrix. The matrix transformation associated to \(A\) is the function \(T_A\) defined as follows:
\begin{align*} T_A\colon \R^n \amp\rightarrow \R^m \\ \boldx\amp\mapsto T_A(\boldx)=A\boldx \text{.} \end{align*}
In other words, given input \(\boldx\in \R^n\text{,}\) the output \(T_A(\boldx)\) is defined as \(A\boldx\text{.}\)

Theorem 3.2.9. Matrix transformations are linear (first of two).

Let \(A\) be an \(m\times n\) matrix. Then the matrix transformation \(T_A\colon \R^n\rightarrow \R^m\) is a linear transformation.

Proof.
We use the one-step technique. For any \(c,d\in \R\) and \(\boldx_1, \boldx_2\in \R^n\text{,}\) we have
\begin{align*} T_A(c\boldx_1+d\boldx_2) \amp =A(c\boldx_1+d\boldx_2)\\ \amp =A(c\boldx_1)+A(d\boldx_2) \amp (\knowl{./knowl/th_matrix_alg_props.html}{\text{Theorem 2.2.1}}) \\ \amp =cA\boldx_1+dA\boldx_2 \amp (\knowl{./knowl/th_matrix_alg_props.html}{\text{Theorem 2.2.1}})\\ \amp =cT_A(\boldx_1)+dT_A(\boldx_2)\text{.} \end{align*}
This proves \(T_A\) is a linear transformation.

Remark 3.2.10.

As the title of Theorem 3.2.9 suggests, there is a follow-up result (Corollary 3.6.18), which states that in fact any linear transformation \(T\colon\R^n\rightarrow\R^m \) is of the form \(T=T_A\) for some \(m\times n\) matrix \(A\text{.}\) In other words, all linear transformations from \(\R^n\) to \(\R^m\) are matrix transformations.
As general as these two results are, mark well the restriction that remains: they apply only to functions whose domain and codomain are vector spaces of tuples. They say nothing, for example, about functions from \(\R^\infty\) to \(F([0,1],\R)\text{.}\)

Remark 3.2.11.

Theorem 3.2.9 gives rise to an alternative technique for showing a function \(T\colon \R^n\rightarrow \R^m\) is a linear transformation: show that \(T=T_A\) for some matrix \(A\text{.}\)
As an example, consider the function
\begin{align*} T\colon \R^2 \amp\rightarrow \R^3 \\ (x,y)\amp\mapsto (7x+2y, -y, x) \text{.} \end{align*}
Conflating tuples with column vectors as described in Definition 3.2.8, we see that \(T=T_A\) where
\begin{equation*} A=\begin{amatrix}[rr]7\amp 2\\ 0\amp -1\\ 1\amp 0 \end{amatrix}\text{.} \end{equation*}
In other words, the original formula is just a description in terms of tuples of the function
\begin{equation*} \begin{amatrix}[c]x\\ y \end{amatrix}\mapsto A\begin{amatrix}[c]x\\ y \end{amatrix}=\begin{amatrix}[c]7x+2y\\ -y\\ x \end{amatrix}\text{.} \end{equation*}
It follows from Theorem 3.2.9 that \(T=T_A\) is linear.
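For the skeptical: here is a minimal Python sketch (NumPy and the sample input are our own assumptions; Python stands in for the Sage cells used elsewhere in this text) comparing the tuple formula for \(T\) with the matrix product \(A\boldx\text{:}\)

    import numpy as np

    A = np.array([[7, 2],
                  [0, -1],
                  [1, 0]])

    def T(x, y):
        # the tuple formula for T
        return (7*x + 2*y, -y, x)

    x, y = 3.0, -5.0
    print(T(x, y))                # (11.0, 5.0, 3.0)
    print(A @ np.array([x, y]))   # [11.  5.  3.] -- same output as the tuple formula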

Subsection 3.2.3 Rotations, reflections, and orthogonal projections

We now introduce a number of geometric examples of linear transformations of \(\R^2\) and \(\R^3\text{:}\) namely, rotations, reflections, and orthogonal projections. These operations are described in detail below; we will use Theorem 3.2.9 to prove that they are in fact linear transformations.
Our definitions of these operations will be very geometric in nature. As such we will go back and forth between point and arrow interpretations of elements of \(\R^n\text{.}\) (See Remark 3.1.5.) In particular, we will interpret an \(n\)-tuple \((a_1,a_2,\dots, a_n)\) both as the point \(P=(a_1,a_2,\dots, a_n)\) and as the position vector \(\overrightarrow{OP}\text{.}\)

Definition 3.2.12. Rotation in the plane.

Fix an angle \(\alpha\) and define
\begin{equation*} \rho_\alpha\colon \R^2\rightarrow \R^2 \end{equation*}
to be the function that takes an input vector \(\boldx=(x_1,x_2)\text{,}\) considered as the position vector \(\overrightarrow{OP}\) of the point \(P=(x_1,x_2)\text{,}\) and returns the output \(\boldy=(y_1,y_2)\) obtained by rotating the vector \(\boldx\) by an angle of \(\alpha\) about the origin. The function \(\rho_\alpha\) is called rotation about the origin by the angle \(\alpha\text{.}\)
We can extract a formula from the rule defining \(\rho_\alpha\) by using polar coordinates: if \(\boldx\) has polar coordinates \((r,\theta)\text{,}\) then \(\boldy=\rho_\alpha(\boldx)\) has polar coordinates \((r,\theta+\alpha)\text{.}\)

Theorem 3.2.13. Matrix formula for rotation.

Fix an angle \(\alpha\text{.}\) The rotation \(\rho_\alpha\colon \R^2\rightarrow \R^2\) satisfies \(\rho_\alpha=T_A\text{,}\) where
\begin{equation*} A=\begin{amatrix}[rr] \cos\alpha\amp -\sin\alpha\\ \sin\alpha \amp \cos\alpha \end{amatrix}\text{.} \end{equation*}
In particular, \(\rho_\alpha\) is a linear transformation.

Proof.
By Remark 3.2.11, we need only show that \(\rho_\alpha=T_A\) for the matrix indicated.
If the vector \(\boldx=(x_1,x_2)\) has polar coordinates \((r,\theta)\) (so that \(x_1=r\cos\theta\) and \(x_2=r\sin\theta\)), then its image \(\boldy=\rho_{\alpha}(\boldx)\) under our rotation has polar coordinates \((r,\theta+\alpha)\text{.}\) Translating back to rectangular coordinates, we see that
\begin{align*} \rho_\alpha(\boldx)\amp= \boldy \\ \amp =\left(r\cos(\theta+\alpha),r\sin(\theta+\alpha)\right)\\ \amp =(r\cos\theta\cos\alpha-r\sin\theta\sin\alpha, r\sin\theta\cos\alpha+r\cos\theta\sin\alpha) \amp (\text{trig. identities}) \\ \amp=(\cos\alpha\, x_1-\sin\alpha\, x_2, \sin\alpha\, x_1+\cos\alpha\, x_2) \amp (\text{since } x_1=r\cos\theta, x_2=r\sin\theta) \text{.} \end{align*}
It follows that \(\rho_{\alpha}=T_A\text{,}\) where
\begin{equation*} A=\begin{amatrix}[rr] \cos\alpha\amp -\sin\alpha\\ \sin\alpha \amp \cos\alpha \end{amatrix}\text{,} \end{equation*}
as claimed.

Remark 3.2.14.

Observe that it is not at all obvious geometrically that the rotation operation is linear: i.e., that it preserves addition and scalar multiplication of vectors in \(\R^2\text{.}\) Indeed, our proof does not even show this directly, but instead first gives a matrix formula for rotation and then uses Theorem 3.2.9.
Since matrices of the form
\begin{equation*} \begin{amatrix}[rr] \cos\alpha\amp -\sin\alpha\\ \sin\alpha \amp \cos\alpha \end{amatrix} \end{equation*}
can be understood as defining rotations of the plane, we call them rotation matrices.

Example 3.2.15. Rotation matrices.

Find formulas for \(\rho_\pi\colon \R^2\rightarrow \R^2\) and \(\rho_{2\pi/3}\colon \R^2\rightarrow \R^2\text{,}\) expressing your answer in terms of pairs (as opposed to column vectors).
Solution.
The rotation matrix corresponding to \(\alpha=\pi\) is
\begin{equation*} A=\begin{amatrix}[rr]\cos\pi\amp -\sin\pi\\ \sin\pi \amp \cos\pi \end{amatrix}= \begin{amatrix}[rr]-1\amp 0\\ 0 \amp -1 \end{amatrix}\text{.} \end{equation*}
Thus \(\rho_\pi=T_A\) has formula
\begin{equation*} \rho_{\pi}(x,y)=(-x,-y)=-(x,y)\text{.} \end{equation*}
Note: this is as expected! Rotating a vector by 180 degrees yields its additive inverse.
The rotation matrix corresponding to \(\alpha=2\pi/3\) is
\begin{equation*} B=\begin{amatrix}[rr]\cos(2\pi/3)\amp -\sin(2\pi/3)\\ \sin(2\pi/3) \amp \cos(2\pi/3) \end{amatrix}= \begin{amatrix}[rr]-\frac{1}{2}\amp -\frac{\sqrt{3}}{2}\\ \frac{\sqrt{3}}{2} \amp -\frac{1}{2} \end{amatrix}\text{.} \end{equation*}
Thus \(\rho_{2\pi/3}=T_B\) has formula
\begin{equation*} \rho_{2\pi/3}(x,y)=\frac{1}{2}(-x-\sqrt{3}y, \sqrt{3}x-y)\text{.} \end{equation*}
Let’s check our formula for \(\rho_{2\pi/3}\) for the vectors \((1,0)\) and \((0,1)\text{:}\)
\begin{align*} \rho_{2\pi/3}(1,0) \amp =(-1/2, \sqrt{3}/2) \\ \rho_{2\pi/3}(0,1) \amp =(-\sqrt{3}/2, -1/2) \text{.} \end{align*}
Confirm for yourself geometrically that these are the vectors you get by rotating the vectors \((1,0)\) and \((0,1)\) by an angle of \(2\pi/3\) about the origin.
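If you prefer a numerical check to a geometric one, here is a minimal Python sketch (NumPy assumed) that applies the rotation matrix \(B\) to these two vectors:

    import numpy as np

    alpha = 2*np.pi/3
    B = np.array([[np.cos(alpha), -np.sin(alpha)],
                  [np.sin(alpha),  np.cos(alpha)]])

    print(B @ np.array([1, 0]))   # [-0.5        0.8660254], i.e., (-1/2, sqrt(3)/2)
    print(B @ np.array([0, 1]))   # [-0.8660254 -0.5      ], i.e., (-sqrt(3)/2, -1/2)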
A second example of a geometric linear transformation is furnished by reflection through a line in \(\R^2\text{.}\)

Definition 3.2.16. Reflection through a line.

Fix an angle \(\alpha\) with \(0\leq \alpha \leq \pi \text{,}\) and let \(\ell_\alpha\) be the line through the origin that makes an angle of \(\alpha\) with the positive \(x\)-axis.
Define \(r_\alpha\colon \R^2\rightarrow \R^2\) to be the function that takes an input \(\boldx=(x_1,x_2)\text{,}\) considered as a point \(P\text{,}\) and returns the coordinates \(\boldy=(y_1,y_2)\) of the point \(P'\) obtained by reflecting \(P\) through the line \(\ell_\alpha\text{.}\) In more detail: if \(P\) lies on \(\ell_\alpha\text{,}\) then \(P'=P\text{;}\) otherwise, \(P'\) is the point on the line through \(P\) perpendicular to \(\ell_\alpha\) lying on the other side of \(\ell_\alpha\) whose distance to \(\ell_\alpha\) is equal to the distance from \(P\) to \(\ell_\alpha\text{.}\)
The function \(r_{\alpha}\) is called reflection through the line \(\ell_\alpha\text{.}\)

Theorem 3.2.17. Matrix formula for reflection.

Fix an angle \(\alpha\) with \(0\leq \alpha\leq \pi\text{.}\) The reflection \(r_\alpha\colon \R^2\rightarrow \R^2\) satisfies \(r_\alpha=T_A\text{,}\) where
\begin{equation*} A=\begin{amatrix}[rr] \cos 2\alpha\amp \sin 2\alpha\\ \sin 2\alpha \amp -\cos 2\alpha \end{amatrix}\text{.} \end{equation*}
In particular, \(r_\alpha\) is a linear transformation. The proof is left as Exercise 3.2.6.20.

Example 3.2.18. Visualizing reflection and rotation.

The GeoGebra interactive below helps visualize rotations and reflections in \(\R^2\) (thought of as operations on points) by showing how they act on the triangle \(\triangle ABC\text{.}\)
  • Move or alter the triangle as you see fit.
  • Check the box of the desired operation, rotation or reflection.
  • If rotation is selected, the slider adjusts the angle \(\alpha\) of rotation.
  • If reflection is selected, the slider adjusts the angle \(\alpha\) determining the line \(\ell_\alpha\) of reflection. Click the “Draw perps” box to see the perpendicular lines used to define the reflections of vertices \(A, B, C\text{.}\)
Figure 3.2.19. Visualizing reflection and rotation. Made with GeoGebra.
Next we consider the operation of orthogonally projecting a vector onto a line (in \(\R^2\) or \(\R^3\)) or a plane in \(\R^3\text{.}\) See Example 1.1.7 for a refresher on lines and planes in \(\R^2\) and \(\R^3\text{.}\)

Definition 3.2.20. Projection onto a line.

Let \(\boldv=(a_1,a_2,\dots, a_n)\) be a fixed nonzero vector in \(\R^n\text{,}\) where \(n=2\) or \(n=3\text{.}\) The set of all scalar multiples of \(\boldv\) defines a line \(\ell\) in \(\R^n\) passing through the origin: we call \(\boldv\) the direction vector of this line. Given a point \(P=(x_1,x_2,\dots, x_n)\in \R^n\text{,}\) there is a unique point \(Q=(y_1,y_2,\dots, y_n)\in \ell\) such that the vector
\begin{equation*} \overrightarrow{QP}=(x_1-y_1,x_2-y_2,\dots, x_n-y_n) \end{equation*}
is orthogonal to \(\boldv\text{:}\) i.e., there is a unique \(Q\in \ell\) such that
\begin{equation*} \overrightarrow{QP}\cdot \boldv=0\text{.} \end{equation*}
The point \(Q\) is called the orthogonal projection of \(P\) onto the line \(\ell\text{.}\) We define orthogonal projection onto \(\ell\) to be the function \(\operatorname{proj}_{\ell}\colon \R^n\rightarrow \R^n\) that maps a point \(P\) to its orthogonal projection \(Q=\proj{P}{\ell}\) onto \(\ell\text{.}\)

Theorem 3.2.21. Matrix formula for orthogonal projection onto a line.

Let \(\ell\) be the line in \(\R^n\) (\(n=2\) or \(n=3\)) passing through the origin with direction vector \(\boldv\text{.}\) Then \(\operatorname{proj}_\ell=T_A\text{,}\) where for \(n=2\) and \(\boldv=(a,b)\) we have
\begin{equation*} A=\frac{1}{a^2+b^2}\begin{bmatrix}a^2\amp ab\\ ab \amp b^2 \end{bmatrix}\text{,} \end{equation*}
and for \(n=3\) and \(\boldv=(a,b,c)\) we have
\begin{equation*} A=\frac{1}{a^2+b^2+c^2}\begin{bmatrix}a^2\amp ab\amp ac\\ ab \amp b^2 \amp bc \\ ac\amp bc\amp c^2 \end{bmatrix}\text{.} \end{equation*}
In particular, \(\operatorname{proj}_\ell\) is a linear transformation.

Proof.
We prove the matrix formula in the case \(n=3\text{.}\) (The case \(n=2\) is exactly similar.) Let \(\boldv=(a,b,c)\text{.}\) In multivariable calculus we learn that given a point with position vector \(\boldx=(x,y,z)\text{,}\) its orthogonal projection onto \(\ell\) is the point \(Q\) whose position vector is
\begin{align*} \overrightarrow{OQ}\amp=\left(\frac{\boldx\cdot \boldv}{\boldv\cdot \boldv}\right)\boldv \\ \amp = \frac{ax+by+cz}{a^2+b^2+c^2}(a,b,c)\\ \amp = \frac{1}{a^2+b^2+c^2}(a^2x+aby+acz, abx+b^2y+bcz, acx+bcy+c^2z)\\ \amp = \frac{1}{a^2+b^2+c^2}\begin{bmatrix}a^2\amp ab\amp ac\\ ab \amp b^2 \amp bc \\ ac\amp bc\amp c^2 \end{bmatrix}\colvec{x\\ y\\ z}\text{.} \end{align*}
This proves that
\begin{equation*} \proj{\boldx}{\ell}=A\boldx\text{,} \end{equation*}
where
\begin{equation*} A=\frac{1}{a^2+b^2+c^2}\begin{bmatrix}a^2\amp ab\amp ac\\ ab \amp b^2 \amp bc \\ ac\amp bc\amp c^2 \end{bmatrix}\text{,} \end{equation*}
as desired.

Example 3.2.22. Orthogonal projection onto line.

Let \(T\colon \R^3\rightarrow \R^3\) be orthogonal projection onto the line \(\ell\) passing through the origin with direction vector \(\boldv=(1,1,1)\text{.}\) Find the matrix \(A\) such that \(T=T_A\text{.}\) Use \(A\) to compute the orthogonal projection of \((-1,3,2)\) onto \(\ell\text{.}\)
Solution.
Using the formula for \(A\) in Theorem 3.2.21, where \((a,b,c)=(1,1,1)\text{,}\) we see that
\begin{equation*} A=\frac{1}{3}\begin{bmatrix}1\amp 1 \amp 1 \\ 1\amp 1\amp 1\\ 1\amp 1\amp 1\end{bmatrix} \end{equation*}
and hence for any \(\boldx=(x,y,z)\) we have
\begin{equation*} \proj{\boldx}{\ell}=A\boldx=\frac{1}{3}(x+y+z,x+y+z, x+y+z)\text{.} \end{equation*}
In particular, we have \(\proj{(-1,3,2)}{\ell}=(4/3, 4/3, 4/3)\text{.}\) Let’s check that this truly is the orthogonal projection of \(P=(-1,3,2)\) onto \(\ell\text{.}\) Letting \(Q=(4/3,4/3,4/3)\text{,}\) we have \(\overrightarrow{QP}=(-7/3,5/3, 2/3)\text{,}\) which is indeed orthogonal to \(\boldv=(1,1,1)\text{:}\)
\begin{equation*} \boldv\cdot \overrightarrow{QP}=\frac{1}{3}(-7+5+2)=0\text{.} \end{equation*}
The formula really works! In case you need more convincing, here is a Sage Cell that computes the projections and produces a diagram.
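For instance, here is a minimal, non-interactive Python sketch of the same computation (NumPy assumed; the original cell uses Sage, and the data match Example 3.2.22):

    import numpy as np

    v = np.array([1.0, 1.0, 1.0])      # direction vector of ell
    A = np.outer(v, v) / v.dot(v)      # the matrix from Theorem 3.2.21

    P = np.array([-1.0, 3.0, 2.0])
    Q = A @ P                          # orthogonal projection of P onto ell
    print(Q)                           # [1.33333333 1.33333333 1.33333333], i.e., (4/3, 4/3, 4/3)
    print((P - Q).dot(v))              # ~0: QP is orthogonal to v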
Figure 3.2.23. Orthogonal projection onto the line passing through the origin with direction vector \(\boldv=(1,1,1)\)

Definition 3.2.24. Orthogonal projection onto a plane.

Let \(\boldn=(a,b,c)\) be a nonzero vector in \(\R^3\text{,}\) and let \(W\) be the plane passing through the origin with normal vector \(\boldn\text{:}\) i.e., \(W\) is the plane with equation
\begin{equation*} ax+by+cz=0\text{.} \end{equation*}
Given a point \(P=(x,y,z)\in \R^3\text{,}\) there is a unique point \(Q\in W\) such that \(\overrightarrow{QP}\) is orthogonal to \(W\text{.}\) We call \(Q\) the orthogonal projection of \(P\) onto \(W\text{.}\) We define orthogonal projection onto \(W\) to be the function \(\operatorname{proj}_W\colon \R^3\rightarrow \R^3\) that maps a point \(P\in \R^3\) to its orthogonal projection \(Q=\proj{P}{W}\) in \(W\text{.}\)

Theorem 3.2.25. Matrix formula for orthogonal projection onto a plane.

Let \(W\) be the plane in \(\R^3\) passing through the origin with normal vector \(\boldn=(a,b,c)\text{.}\) Then \(\operatorname{proj}_W=T_A\text{,}\) where
\begin{equation*} A=\frac{1}{a^2+b^2+c^2}\begin{amatrix}[rrr] b^2+c^2\amp -ab \amp -ac \\ -ab \amp a^2+c^2 \amp -bc \\ -ac \amp -bc \amp a^2+b^2 \end{amatrix}\text{.} \end{equation*}
In particular, \(\operatorname{proj}_W\) is a linear transformation.

Proof.
Let \(\ell\) be the line passing through the origin with direction vector \(\boldn\text{.}\) Given any \(\boldx=(x,y,z)\in \R^3\text{,}\) let \(P\) be the point with coordinates \((x,y,z)\text{.}\) The orthogonal projection \(R=\proj{P}{\ell}\) of \(P\) onto \(\ell\) satisfies
\begin{equation*} \overrightarrow{RP}\cdot \boldn=0\text{.} \end{equation*}
Let \(Q\) be the point of \(\R^3\) with position vector \(\overrightarrow{OQ}\) satisfying
\begin{align} \overrightarrow{OQ}\amp = \overrightarrow{RP} \tag{3.2.6}\\ \amp=\overrightarrow{OP}-\overrightarrow{OR}\tag{3.2.7}\\ \amp=(x,y,z)-\proj{P}{\ell} \tag{3.2.8} \end{align}
so that
\begin{equation} \overrightarrow{OP}=\overrightarrow{OQ}+\overrightarrow{OR}\text{.}\tag{3.2.9} \end{equation}
Since
\begin{align*} \overrightarrow{OQ}\cdot \boldn\amp = \overrightarrow{RP}\cdot \boldn \amp \knowl{./knowl/eq_vec_formula.html}{\text{(3.2.6)}} \\ \amp = 0 \amp (\text{def. of } R=\proj{P}{\ell})\text{,} \end{align*}
the point \(Q\) lies in the plane \(W\text{.}\) (See Example 1.1.7.) Furthermore, we have
\begin{align*} \overrightarrow{QP} \amp =\overrightarrow{OP}-\overrightarrow{OQ}\\ \amp = \overrightarrow{OR} \amp \knowl{./knowl/eq_vec_diff.html}{\text{(3.2.7)}}\text{.} \end{align*}
Since \(R\) is the orthogonal projection of \(P\) onto \(\ell\text{,}\) we have \(R\in \ell\) by definition, which means \(\overrightarrow{OR}\) is a scalar multiple of \(\boldn\text{.}\) Since \(\boldn\) is a normal vector to \(W\text{,}\) we conclude that \(\overrightarrow{QP}=\overrightarrow{OR}\) is orthogonal to \(W\text{.}\) We have shown that \(Q\) lies in \(W\) and that \(\overrightarrow{QP}\) is orthogonal to \(W\text{.}\) We conclude that \(Q\) is the orthogonal projection of \(P\) onto \(W\text{.}\) Thus, using (3.2.8) we have
\begin{equation*} \proj{\boldx}{W}=\boldx-\proj{\boldx}{\ell}\text{.} \end{equation*}
Since \(\proj{\boldx}{\ell}=B\boldx\text{,}\) where
\begin{equation*} B=\frac{1}{a^2+b^2+c^2}\begin{bmatrix} a^2\amp ab\amp ac \\ ab\amp b^2\amp bc \\ ac\amp bc\amp c^2\end{bmatrix} \end{equation*}
by Theorem 3.2.21, we have
\begin{align*} \proj{\boldx}{W} \amp = \boldx-B\boldx\\ \amp = I\boldx-B\boldx\\ \amp = (I-B)\boldx\\ \amp = \frac{1}{a^2+b^2+c^2}\begin{amatrix}[rrr] b^2+c^2\amp -ab \amp -ac \\ -ab \amp a^2+c^2 \amp -bc \\ -ac \amp -bc \amp a^2+b^2 \end{amatrix}\boldx\text{.} \end{align*}
We conclude that \(\operatorname{proj}_W=T_A\) where
\begin{equation*} A=\frac{1}{a^2+b^2+c^2}\begin{amatrix}[rrr] b^2+c^2\amp -ab \amp -ac \\ -ab \amp a^2+c^2 \amp -bc \\ -ac \amp -bc \amp a^2+b^2 \end{amatrix}\text{,} \end{equation*}
as desired.
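The construction \(A=I-B\) from the proof is easy to experiment with. Here is a minimal Python sketch (NumPy assumed; the normal vector and input are sample choices of our own):

    import numpy as np

    n = np.array([1.0, 2.0, 2.0])          # sample normal vector for W
    B = np.outer(n, n) / n.dot(n)          # projection onto the normal line ell (Theorem 3.2.21)
    A = np.eye(3) - B                      # projection onto the plane W

    x = np.array([3.0, -1.0, 4.0])
    print((A @ x).dot(n))                  # ~0: the image of x lies in W
    print(np.allclose(A @ x + B @ x, x))   # True: x decomposes as proj_W(x) + proj_ell(x)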

Example 3.2.26. Visualizing orthogonal projection.

In the course of the proof of Theorem 3.2.25 we discovered an illuminating relationship between orthogonal projection onto a line and orthogonal projection onto the plane orthogonal to this line. In more detail, let \(\boldn=(a,b,c)\) be a nonzero vector, \(\ell\) the line passing through the origin with \(\boldn\) as a direction vector, and \(W\) the plane passing through the origin with normal vector \(\boldn\text{.}\) From our argument in the proof of Theorem 3.2.25 we see that
\begin{equation} \proj{\boldx}{W}=\boldx-\proj{\boldx}{\ell}\text{,}\tag{3.2.10} \end{equation}
or
\begin{equation} \boldx=\proj{\boldx}{\ell}+\proj{\boldx}{W}\text{.}\tag{3.2.11} \end{equation}
Equation (3.2.10) indicates how we can derive the orthogonal projection onto \(W\) from the orthogonal projection onto \(\ell\) (and conversely). Equation (3.2.11) shows how every vector \(\boldx\) can be “decomposed” as a sum of two orthogonal vectors: one pointing parallel to \(\ell\) and the other pointing parallel to \(W\text{.}\)
The GeoGebra interactive below helps visualize these two orthogonal projections, understood as operations on \(\R^3\text{.}\)
  • Drag the point \(Q\) to change the normal vector \(\boldn\text{,}\) and hence also the plane \(W\text{.}\)
  • Drag the point \(P\) to change the input of the transformations \(\operatorname{proj}_\ell\) and \(\operatorname{proj}_W\text{.}\)
  • In keeping with our dual interpretation of vectors in \(\R^3\text{,}\) all the relevant vectors (\(P\text{,}\) \(\proj{P}{\ell}\text{,}\) \(\proj{P}{W}\)) are rendered here both as points and as the corresponding position vectors of these points.
Figure 3.2.27. Orthogonal projection onto plane and normal line. Made with GeoGebra.

Subsection 3.2.4 Additional examples

We now proceed to some examples involving our more exotic vector spaces.

Example 3.2.28. Transposition is linear.

Fix \(m,n\geq 1\text{.}\) Define the function \(f\colon M_{mn}\rightarrow M_{nm}\) as follows:
\begin{equation*} f(A)=A^T\text{.} \end{equation*}
In other words, \(f\) maps a matrix to its transpose.
Show that \(f\) is a linear transformation.
Solution.
We must show \(f(cA+dB)=cf(A)+df(B)\) for all scalars \(c,d\in\R\) and all matrices \(A, B\in M_{mn}\text{.}\) This follows easily from properties of transpose:
\begin{align*} f(cA+dB)\amp =(cA+dB)^T \amp \text{ (by def.) }\\ \amp =(cA)^T+(dB)^T \amp (\knowl{./knowl/th_trans_props.html}{\text{Theorem 2.2.11}})\\ \amp =cA^T+dB^T \amp (\knowl{./knowl/th_trans_props.html}{\text{Theorem 2.2.11}})\\ \amp =cf(A)+df(B) \end{align*}
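A quick numerical spot check of this identity, sketched in Python (NumPy and the random sample matrices are assumptions, not part of the proof):

    import numpy as np

    A, B = np.random.rand(2, 3), np.random.rand(2, 3)
    c, d = 5.0, -2.0
    # transpose of a linear combination equals the linear combination of transposes
    print(np.allclose((c*A + d*B).T, c*A.T + d*B.T))   # True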

Example 3.2.29. Left-shift transformation.

Define the left-shift operation, \(T_\ell\colon \R^\infty \rightarrow \R^{\infty}\text{,}\) as follows:
\begin{equation*} T_\ell\left( (a_{i})_{i=1}^\infty\right)= (a_{i+1})_{i=1}^\infty\text{.} \end{equation*}
In other words, we have
\begin{equation*} T_\ell \left( (a_1,a_2,a_3,\dots)\right)=(a_2,a_3,\dots)\text{.} \end{equation*}
Show that \(T_\ell\) is a linear transformation.
Solution.
Let \(\boldv=(a_i)_{i=1}^\infty\) and \(\boldw=(b_i)_{i=1}^\infty\) be two infinite sequences in \(\R^\infty\text{.}\) For any \(c,d\in\R\) we have
\begin{align*} T_\ell(c\boldv+d\boldw) \amp=T_\ell\left((ca_i+db_i)_{i=1}^\infty \right)\amp (\knowl{./knowl/ex_vs_infinitesequences.html}{\text{Example 3.1.10}}) \\ \amp= (ca_{i+1}+db_{i+1})_{i=1}^\infty \amp (\text{by def.})\\ \amp=c(a_{i+1})_{i=1}^\infty+d(b_{i+1})_{i=1}^\infty \amp (\knowl{./knowl/ex_vs_infinitesequences.html}{\text{Example 3.1.10}})\\ \amp=cT_\ell(\boldv)+dT_\ell(\boldw)\amp (\text{by def.}) \text{.} \end{align*}
This proves \(T_\ell\) is a linear transformation.
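A computer cannot store an infinite sequence, but a Python sketch on truncated sequences (a finite stand-in for \(\R^\infty\text{,}\) our own device) illustrates the same computation:

    def left_shift(seq):
        # (a1, a2, a3, ...) -> (a2, a3, ...)
        return seq[1:]

    v = [1, 2, 3, 4, 5]          # truncations of two sequences
    w = [10, 20, 30, 40, 50]
    c, d = 2, -3
    lhs = left_shift([c*a + d*b for a, b in zip(v, w)])
    rhs = [c*a + d*b for a, b in zip(left_shift(v), left_shift(w))]
    print(lhs == rhs)            # True: shifting a linear combination = combining the shifts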

Video examples: deciding if \(T\) is linear.

Figure 3.2.30. Video: deciding if \(T\) is linear
Figure 3.2.31. Video: deciding if \(T\) is linear

Subsection 3.2.5 Composition of linear transformations and matrix multiplication

We end by making good on a promise we made long ago to retroactively make sense of the definition of matrix multiplication. The key connecting concept, as it turns out, is composition of functions. We first need a result showing that composition preserves linearity.

Theorem 3.2.32. Composition of linear transformations.

Suppose \(T\colon U\rightarrow V\) and \(S\colon V\rightarrow W\) are linear transformations. Then the composition \(S\circ T\colon U\rightarrow W\) is a linear transformation.

Proof.
Exercise.
Turning now to matrix multiplication, suppose \(A\) is \(m\times n\) and \(B\) is \(n\times r\text{.}\) Let \(C=AB\) be their product. These matrices give rise to linear transformations
\begin{align*} T_A\colon \R^n \amp\rightarrow \R^m \amp T_B\colon \R^r \amp\rightarrow \R^n \amp T_C\colon \R^r \amp\rightarrow \R^m \text{.} \end{align*}
According to Theorem 3.2.32 the composition \(T_A\circ T_B\) is a linear transformation from \(\R^r\) (the domain of \(T_B\)) to \(\R^m\) (the codomain of \(T_A\)). We claim that \(T_A\circ T_B=T_C\text{.}\) Indeed, identifying elements of \(\R^r\) with column vectors, for all \(\boldx\in \R^r\) we have
\begin{align*} T_A\circ T_B(\boldx) \amp = T_A(T_B(\boldx)) \amp (\knowl{./knowl/d_function_composition.html}{\text{Definition 0.2.9}}) \\ \amp =T_A(B\boldx) \amp (\knowl{./knowl/d_matrix_transform.html}{\text{Definition 3.2.8}})\\ \amp= A(B\boldx) \amp (\knowl{./knowl/d_matrix_transform.html}{\text{Definition 3.2.8}})\\ \amp = (AB)\boldx \amp (\text{assoc.})\\ \amp = T_C(\boldx) \amp (\text{since } C=AB)\text{.} \end{align*}
Thus, we can now understand the definition of matrix multiplication as being chosen precisely to encode how to compute the composition of two matrix transformations. The restriction on the dimension of the ingredient matrices is now understood as guaranteeing that the corresponding matrix transformations can be composed!
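A minimal Python sketch (NumPy assumed; the dimensions and random data are sample choices) testing the claim \(T_A\circ T_B=T_{AB}\text{:}\)

    import numpy as np

    m, n, r = 2, 3, 4
    A = np.random.rand(m, n)     # T_A : R^n -> R^m
    B = np.random.rand(n, r)     # T_B : R^r -> R^n
    x = np.random.rand(r)

    # composing the transformations agrees with multiplying the matrices first
    print(np.allclose(A @ (B @ x), (A @ B) @ x))   # True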

Exercises 3.2.6 Exercises

WeBWork Exercises

1.
Let \(T:{\mathbb R}^2 \rightarrow {\mathbb R}^2\) be a linear transformation that sends the vector \(\vec{u} =(5,2)\) into \((2,1)\) and maps \(\vec{v}= (1,3)\) into \((-1, 3)\text{.}\) Use properties of a linear transformation to calculate the following. (Enter your answers as ordered pairs, such as (1,2), including the parentheses.)
\(T(-4 \vec{u}) =\) ,
\(T(6 \vec{v}) =\) ,
\(T(-4 \vec{u} + 6 \vec{v}) =\) .
Answer 1.
\(\left(-8,-4\right)\)
Answer 2.
\(\left(-6,18\right)\)
Answer 3.
\(\left(-14,14\right)\)
2.
Let \(V\) be a vector space, and \(T:V \rightarrow V\) a linear transformation such that \(T(2 \vec{v}_1 - 3 \vec{v}_2)= -4 \vec{v}_1 - 2 \vec{v}_2\) and \(T(-3 \vec{v}_1 + 5 \vec{v}_2)= 3 \vec{v}_1 - 2 \vec{v}_2\text{.}\) Then
\(T(\vec{v}_1)=\) \(\vec{v}_1+\) \(\vec{v}_2\text{,}\)
\(T(\vec{v}_2)=\) \(\vec{v}_1+\) \(\vec{v}_2\text{,}\)
\(T(4 \vec{v}_1 - 4 \vec{v}_2)=\) \(\vec{v}_1+\) \(\vec{v}_2\text{.}\)
Answer 1.
\(-11\)
Answer 2.
\(-16\)
Answer 3.
\(-6\)
Answer 4.
\(-10\)
Answer 5.
\(-20\)
Answer 6.
\(-24\)
3.
Let \(T:P_3 \rightarrow P_3\) be the linear transformation such that
\begin{equation*} T(-2 x^2)= -3 x^2 - 3 x, \ \ \ T(0.5 x - 2)= 2 x^2 + 4 x - 4, \ \ \ T(4 x^2 + 1) = -4 x + 4 . \end{equation*}
Find \(T(1)\text{,}\) \(T(x)\text{,}\) \(T(x^2)\text{,}\) and \(T(a x^2 + b x + c)\text{,}\) where \(a\text{,}\) \(b\text{,}\) and \(c\) are arbitrary real numbers.
\(T(1)=\) ,
\(T(x)=\) ,
\(T(x^2)=\) ,
\(T(a x^2 + b x + c)=\) .
Answer 1.
\(-6x^{2}+-10x+4\)
Answer 2.
\(-20x^{2}+-32x+8\)
Answer 3.
\(1.5x^{2}+1.5x+0\)
Answer 4.
\(a\!\left(1.5x^{2}+1.5x+0\right)+b\!\left(-20x^{2}+-32x+8\right)+c\!\left(-6x^{2}+-10x+4\right)\)
4.
If \(T: P_1 \rightarrow P_1\) is a linear transformation such that \(T(1+4 x) = -1 - 4 x \ \) and \(\ T(4 + 15 x) = -2 - 3 x, \ \) then
\(T(3 - 5 x) =\).
Answer.
\(31+209x\)
5.
Let
\begin{equation*} \vec{v}_1= \left[\begin{array}{c} -3\cr -2 \end{array}\right] \ \mbox{ and } \ \vec{v}_2=\left[\begin{array}{c} 2\cr 1 \end{array}\right] . \end{equation*}
Let \(T:{\mathbb R}^2 \rightarrow {\mathbb R}^2\) be the linear transformation satisfying
\begin{equation*} T(\vec{v}_1)=\left[\begin{array}{c} 1\cr -17 \end{array}\right] \ \mbox{ and } \ T(\vec{v}_2)=\left[\begin{array}{c} 1\cr 11 \end{array}\right] . \end{equation*}
Find the image of an arbitrary vector \(\left[\begin{array}{c} x\cr y\cr \end{array}\right] .\)
\(T \left(\left[\begin{array}{c} x\cr y\cr \end{array}\right]\right) =\) (2 × 1 array)
6.
Let
\begin{equation*} A = \left[\begin{array}{ccc} 5 \amp 8 \amp -6\cr -4 \amp -7 \amp -5 \end{array}\right]. \end{equation*}
Define the linear transformation \(T: {\mathbb R}^3 \rightarrow {\mathbb R}^2\) by \(T(\vec{x}) = A\vec{x}\text{.}\) Find the images of \(\vec{u} = \left[\begin{array}{c} 3\cr -3\cr -1 \end{array}\right]\) and \(\vec{v} =\left[\begin{array}{c} a\cr b\cr c\cr \end{array}\right]\) under \(T\text{.}\)
\(T(\vec{u}) =\) (2 × 1 array)
\(T(\vec{v}) =\) (2 × 1 array)
7.
Let \(V\) be a vector space, \(v, u \in V\text{,}\) and let \(T_1: V \rightarrow V\) and \(T_2: V \rightarrow V\) be linear transformations such that
\begin{equation*} T_1(v) = 7 v + 5 u, \ \ \ T_1(u) = -4 v + 2 u, \end{equation*}
\begin{equation*} T_2(v) = 4 v + 6 u, \ \ \ T_2(u) = -7 v - 4 u. \end{equation*}
Find the images of \(v\) and \(u\) under the composite of \(T_1\) and \(T_2\text{.}\)
\((T_2 T_1)(v) =\) ,
\((T_2 T_1)(u) =\) .
Answer 1.
\(-7v+22u\)
Answer 2.
\(-30v+-32u\)

8.

For each of the following functions \(T\text{,}\) show that \(T\) is nonlinear by providing an explicit counterexample to one of the defining axioms or a consequence thereof.
  1. \(T\colon \R^2\rightarrow \R^2\text{,}\) \(T((x,y))=(x,y)+(1,1)\)
  2. \(T\colon M_{nn}\rightarrow M_{nn}\text{,}\) \(T(A)=A^2\)
  3. \(T\colon M_{nn}\rightarrow \R\text{,}\) \(T(A)=\det A\)
  4. \(T\colon F(\R,\R)\rightarrow F(\R,\R)\text{,}\) \(T(f)=1+f\)
  5. \(T\colon\R^3\rightarrow \R^2\text{,}\) \(T(x,y,z)=(xy,yz)\)

9. Transposition.

Define \(T\colon M_{mn}\rightarrow M_{nm}\) as \(T(A)=A^T\text{:}\) i.e., the function \(T\) takes as input an \(m\times n\) matrix and returns as output an \(n\times m\) matrix. Show that \(T\) is a linear transformation.

10. Scalar multiplication.

Let \(V\) be a vector space. Fix \(c\in \R\) and define \(T\colon V\rightarrow V\) as \(T(\boldv)=c\boldv\text{:}\) i.e., \(T\) is scalar multiplication by \(c\text{.}\) Show that \(T\) is a linear transformation.

11. Trace.

Fix an integer \(n\geq 1\text{.}\) The trace function is the function \(\tr\colon M_{nn}\rightarrow \R\) defined as
\begin{equation*} \tr A=\sum_{i=1}^n a_{ii}=a_{11}+a_{22}+\cdots +a_{nn}\text{.} \end{equation*}
Show that the trace function is a linear transformation.

12. Left/right matrix multiplication.

Let \(B\) be an \(r\times m\) matrix, and let \(C\) be an \(n\times s\) matrix. Define the functions \(T\) and \(S\) as follows:
\begin{align*} T\colon M_{mn} \amp\rightarrow M_{rn} \\ A \amp\mapsto T(A)=BA \\ \amp \\ S\colon M_{mn} \amp\rightarrow M_{ms} \\ A\amp\mapsto S(A)=AC \text{.} \end{align*}
In other words, \(T\) is the “multiply on the left by \(B\)” operation, and \(S\) is the “multiply on the right by \(C\)” operation. Show that \(T\) and \(S\) are linear transformations.

13. Conjugation.

Fix an invertible matrix \(Q\in M_{nn}\text{.}\) Define \(T\colon M_{nn}\rightarrow M_{nn}\) as \(T(A)=QAQ^{-1}\text{.}\) Show that \(T\) is a linear transformation. This operation is called conjugation by \(Q\).

14. Sequence shift operators.

Let \(V=\R^\infty=\{(a_1,a_2,\dots, )\colon a_i\in\R\}\text{,}\) the space of all infinite sequences. Define the shift left function, \(T_\ell\text{,}\) and shift right function, \(T_r\text{,}\) as follows:
\begin{align*} T_\ell\colon \R^\infty\amp \rightarrow \R^\infty \amp T_r\colon \R^\infty\amp \rightarrow \R^\infty\\ s=(a_1,a_2, a_3,\dots )\amp \longmapsto T_\ell(s)=(a_2, a_3,\dots) \amp s=(a_1,a_2, a_3,\dots )\amp \longmapsto T_r(s)=(0,a_1,a_2,\dots) \end{align*}
Prove that \(T_\ell\) and \(T_r\) are linear transformations.

15. Function shift operators.

Fix \(a\in \R\text{.}\) Define \(T\colon F(\R,\R)\rightarrow F(\R,\R)\) as \(T(f)=g\text{,}\) where \(g(x)=f(x+a)\text{.}\) Show that \(T\) is a linear transformation.

16. Function scaling operators.

Fix \(c\in \R\) and define the functions \(T,S \colon F(\R,\R)\rightarrow F(\R,\R)\) as follows:
\begin{align*} T(f) \amp =g, \text{ where } g(x)=f(cx) \\ S(f)\amp =g, \text{ where } g(x)=cf(x) \text{.} \end{align*}
Show that \(T\) and \(S\) are linear transformations.

17. Adding and scaling linear transformations.

Suppose that \(T\colon V\rightarrow W\) and \(S\colon V\rightarrow W\) are linear transformations.
  1. Define the function \(T+S\colon V\rightarrow W\) as \((T+S)(\boldv)=T(\boldv)+S(\boldv)\text{.}\) Show that \(T+S\) is a linear transformation.
  2. Define the function \(cT\colon V\rightarrow W\) as \(cT(\boldv)=c(T(\boldv))\text{.}\) Show that \(cT\) is a linear transformation.

18.

Let \(T\colon F(\R,\R)\rightarrow F(\R,\R)\) be defined as \(T(f)=g\text{,}\) where \(g(x)=f(x)+f(-x)\text{.}\) Show that \(T\) is linear. You may use the results of Exercise 3.2.6.16 and Exercise 3.2.6.17.

20. Reflection through a line.

Fix an angle \(\alpha\) with \(0\leq \alpha \leq \pi \text{,}\) let \(\ell_\alpha\) be the line through the origin that makes an angle of \(\alpha\) with the positive \(x\)-axis, and let \(r_\alpha\colon\R^2\rightarrow \R^2\) be the reflection operation as described in Definition 3.2.16. Prove that \(r_\alpha\) is a linear transformation following the steps below.
  1. In a manner similar to Theorem 3.2.13, describe \(P'=r_\alpha(P)\) in terms of the polar coordinates \((r,\theta)\) of \(P\text{.}\) Additionally, it helps to write \(\theta=\alpha+\phi\text{,}\) where \(\phi\) is the angle the line segment from the origin to \(P\) makes with the line \(\ell_\alpha\text{.}\) Include a drawing to support your explanation.
  2. Use your description in (a), along with some trigonometric identities, to show \(r_\alpha=T_A\) where
    \begin{equation*} A=\begin{bmatrix}\cos 2\alpha \amp \sin 2\alpha\\ \sin 2\alpha \amp -\cos 2\alpha \end{bmatrix}\text{.} \end{equation*}

21. Compositions of rotations and reflections.

In this exercise we will show that if we compose a rotation or reflection with another rotation or reflection, as defined in Definition 3.2.12 and Definition 3.2.16, the result is yet another rotation or reflection. For each part, express the given composition either as a rotation \(\rho_\theta\) or reflection \(r_\theta\text{,}\) where \(\theta\) is expressed in terms of \(\alpha\) and \(\beta\text{.}\)
  1. \(\displaystyle \rho_\alpha\circ\rho_\beta\)
  2. \(\displaystyle r_\alpha\circ r_\beta\)
  3. \(\displaystyle \rho_\alpha\circ r_\beta\)
  4. \(\displaystyle r_\alpha\circ \rho_\beta\)
Hint.
Use Theorem 3.2.13 and Theorem 3.2.17, along with some trigonometric identities.