
Section 5.3 Orthogonal projection

A trick we learn early on in physics -- specifically, in dynamics problems in R^2 -- is to pick a convenient axis and then decompose any relevant vectors (force, acceleration, velocity, position, etc.) into a sum of two components: one that points along the chosen axis, and one that points perpendicularly to it. As we will see in this section, this technique can be vastly generalized. Namely, instead of R^2 we can take any inner product space (V, ⟨ , ⟩); and instead of a chosen axis in R^2, we can choose any finite-dimensional subspace W ⊆ V; then any v ∈ V can be decomposed in the form
v = w + w^⊥,
where w ∈ W and w^⊥ is a vector orthogonal to W, in a sense we will make precise below. Just as in our toy physics example, this manner of decomposing vectors helps simplify computations in problems where the chosen subspace W is of central importance.

Subsection 5.3.1 Orthogonal complement

We begin by making sense of what it means for a vector to be orthogonal to a subspace.

Definition 5.3.1. Orthogonal complement.

Let (V, ⟨ , ⟩) be an inner product space, and let W ⊆ V be a subspace.
A vector v ∈ V is orthogonal to W if it is orthogonal to every element of W: i.e., if ⟨v, w⟩ = 0 for all w ∈ W.
The orthogonal complement of W, denoted W^⊥, is the set of all elements of V orthogonal to W: i.e.,
W^⊥ = {v ∈ V : ⟨v, w⟩ = 0 for all w ∈ W}.

Remark 5.3.2. Computing W.

According to Definition 5.3.1, to verify that a vector v lies in W^⊥, we must show that ⟨v, w⟩ = 0 for all w ∈ W. The “for all” quantifier here can potentially make this an onerous task: there are in principle infinitely many w to check! In the special case where W has a finite spanning set, so that W = span{w_1, w_2, ..., w_r} for some vectors w_i, deciding whether v ∈ W^⊥ reduces to checking whether ⟨v, w_i⟩ = 0 for all 1 ≤ i ≤ r. In other words, we have
v ∈ W^⊥ ⟺ ⟨v, w_i⟩ = 0 for all 1 ≤ i ≤ r.
The forward implication of this equivalence is clear: if v is orthogonal to all elements of W, then clearly it is orthogonal to each w_i. The reverse implication is left as an exercise. (See Exercise 5.3.6.15.)
We illustrate this computational technique in the next examples.

Example 5.3.3.

Consider the inner product space R^2 together with the dot product. Let W = span{(1,1)} = {(t, t) : t ∈ R}: the line ℓ ⊆ R^2 with equation y = x. Compute W^⊥ and identify it as a familiar geometric object in R^2.
Solution.
According to Remark 5.3.2, since W = span{(1,1)}, we have
x ∈ W^⊥ ⟺ x · (1,1) = 0.
Letting x = (x, y), we see that x · (1,1) = 0 if and only if x + y = 0, if and only if y = -x. Thus W^⊥ = {(x, y) : y = -x} is the line ℓ^⊥ ⊆ R^2 with equation y = -x. Observe that the lines ℓ and ℓ^⊥ are indeed perpendicular to one another. (Graph them!)

Example 5.3.4.

Consider the inner product space R^3 together with the dot product. Let W ⊆ R^3 be the plane with equation x - 2y - z = 0. Compute W^⊥ and identify this as a familiar geometric object in R^3.
Solution.
First, parametrizing the solutions of x - 2y - z = 0, we see that
W = {(2s + t, s, t) : s, t ∈ R} = span{(2, 1, 0), (1, 0, 1)}.
Next, according to Remark 5.3.2 we have
x ∈ W^⊥ ⟺ x · (2, 1, 0) = 0 and x · (1, 0, 1) = 0.
It follows that W^⊥ is the set of vectors x = (x, y, z) satisfying the linear system
2x + y = 0
x + z = 0.
Solving this system using Gaussian elimination, we conclude that
W^⊥ = {(t, -2t, -t) : t ∈ R} = span{(1, -2, -1)},
which we recognize as the line ℓ ⊆ R^3 passing through the origin with direction vector (1, -2, -1). This is none other than the normal line to the plane W passing through the origin.
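For readers who like to confirm such computations numerically, here is a short sketch (not part of the original text) using NumPy and SciPy; the helper scipy.linalg.null_space returns an orthonormal basis of a matrix's null space.
    import numpy as np
    from scipy.linalg import null_space

    # Rows span W, the plane x - 2y - z = 0 of Example 5.3.4.
    W_rows = np.array([[2.0, 1.0, 0.0],
                       [1.0, 0.0, 1.0]])

    # x lies in W-perp exactly when W_rows @ x = 0, so W-perp = null(W_rows).
    perp_basis = null_space(W_rows)   # columns form an orthonormal basis of W-perp
    print(perp_basis.ravel())         # proportional to (1, -2, -1)

    # Sanity check: each spanning vector of W is orthogonal to the basis of W-perp.
    assert np.allclose(W_rows @ perp_basis, 0.0)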

Example 5.3.6.

Consider the inner product space R^3 with the dot product. Let W = span{(1, 1, 1)} ⊆ R^3, the line passing through the origin with direction vector (1, 1, 1). The orthogonal complement W^⊥ is the set of vectors orthogonal to (1, 1, 1). Using the definition of the dot product, this is the set of solutions (x, y, z) to the equation
x + y + z = 0,
which we recognize as the plane passing through the origin with normal vector (1, 1, 1). Note that we have
dim W + dim W^⊥ = 1 + 2 = 3,
as predicted in Theorem 5.3.5.
The notion of orthogonal complement gives us a more conceptual way of understanding the relationship between the various fundamental spaces of a matrix.
  1. Using the dot product method of matrix multiplication, we see that a vector x ∈ null(A) if and only if x · r_i = 0 for each row r_i of A, if and only if x · w = 0 for all w ∈ span{r_1, r_2, ..., r_m} = row A (see Remark 5.3.2), if and only if x ∈ (row A)^⊥. This shows null A = (row A)^⊥.
    We can use Corollary 5.3.13 to conclude row A = (null A)^⊥. Alternatively, and more directly, the argument above shows that w ∈ row A ⟹ w ∈ (null A)^⊥, proving row A ⊆ (null A)^⊥. Next, by the rank-nullity theorem we have dim row A = n - dim null A; and by Theorem 5.3.5 we have dim (null A)^⊥ = n - dim null A. It follows that dim row A = dim (null A)^⊥. Since row A ⊆ (null A)^⊥ and dim row A = dim (null A)^⊥, we conclude by Corollary 3.7.13 that row A = (null A)^⊥.
  2. This follows from (1) and the fact that col(A) = row(A^T).

Example 5.3.8.

Understanding the orthogonal relationship between null A and row A allows us in many cases to quickly determine/visualize the one from the other. As an example, consider A = [ 1 -1 1 ; 1 -1 -1 ]. Looking at the columns, we see easily that rank A = 2, which implies that nullity A = 3 - 2 = 1. Since (1, 1, 0) is an element of null(A) and dim null A = 1, we must have null A = span{(1, 1, 0)}, a line. By orthogonality, we conclude that
row A = (null A)^⊥,
which is the plane with normal vector (1, 1, 0) passing through the origin.
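The relation null A = (row A)^⊥ is easy to confirm numerically; the following sketch (an added illustration, not from the text) uses the 2 x 3 matrix of Example 5.3.8, though any matrix would do.
    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[1.0, -1.0,  1.0],
                  [1.0, -1.0, -1.0]])

    N = null_space(A)       # columns span null(A); here proportional to (1, 1, 0)
    print(N.ravel())

    # Every row of A (the rows span row A) is orthogonal to every null-space vector.
    assert np.allclose(A @ N, 0.0)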

Subsection 5.3.2 Orthogonal Projection

Let B = {w_1, w_2, ..., w_r} be an orthogonal basis of W. We first show that the vectors
(5.3.4)  w = Σ_{i=1}^r (⟨v, w_i⟩ / ⟨w_i, w_i⟩) w_i
and w^⊥ = v - w satisfy the conditions in (5.3.1). It is clear that the w defined in (5.3.4) is an element of W, since it is a linear combination of the w_i. Furthermore, we see easily that our choice w^⊥ = v - w satisfies
w + w^⊥ = w + (v - w) = v.
It remains only to show that w^⊥ = v - w ∈ W^⊥. Since B is a basis of W, it suffices to show that ⟨w^⊥, w_j⟩ = 0 for all 1 ≤ j ≤ r. We compute:
⟨w^⊥, w_j⟩ = ⟨v - proj_W(v), w_j⟩ = ⟨v - Σ_{i=1}^r (⟨v, w_i⟩ / ⟨w_i, w_i⟩) w_i, w_j⟩ = ⟨v, w_j⟩ - Σ_{i=1}^r (⟨v, w_i⟩ / ⟨w_i, w_i⟩) ⟨w_i, w_j⟩ = ⟨v, w_j⟩ - (⟨v, w_j⟩ / ⟨w_j, w_j⟩) ⟨w_j, w_j⟩ = 0,
where the second-to-last equality uses the fact that B is orthogonal, so that ⟨w_i, w_j⟩ = 0 for i ≠ j,
as desired.
Having shown that a decomposition of v of the form (5.3.1) exists, we now show it is unique in the sense specified. Suppose we have
v = w + w^⊥ = u + u^⊥,
where w, u ∈ W and w^⊥, u^⊥ ∈ W^⊥. Rearranging, we see that
w - u = u^⊥ - w^⊥.
We now claim that w - u = u^⊥ - w^⊥ = 0, in which case w = u and w^⊥ = u^⊥, as desired. To see why the claim is true, consider the vector z = w - u = u^⊥ - w^⊥. Since z = w - u, and w, u ∈ W, we have z ∈ W. On the other hand, since z = u^⊥ - w^⊥, and u^⊥, w^⊥ ∈ W^⊥, we have z ∈ W^⊥. Thus z ∈ W ∩ W^⊥. Since W ∩ W^⊥ = {0} (Theorem 5.3.5), we conclude z = w - u = u^⊥ - w^⊥ = 0, as claimed.
At this point we have proved both (1) and (2), and it remains only to show that (5.3.3) holds for all w ∈ W. To this end we compute:
‖v - w‖^2 = ‖w^⊥ + (proj_W(v) - w)‖^2 = ‖w^⊥‖^2 + ‖proj_W(v) - w‖^2 ≥ ‖w^⊥‖^2 = ‖v - proj_W(v)‖^2,
where the middle equality holds by Exercise 5.2.4.18, since w^⊥ ∈ W^⊥ and proj_W(v) - w ∈ W are orthogonal.
This shows ‖v - proj_W(v)‖^2 ≤ ‖v - w‖^2. Taking square roots now proves the desired inequality.

Remark 5.3.10. Orthogonal projection formula.

The formula (5.3.2) is very convenient for computing an orthogonal projection projW(v), but mark well this important detail: to apply the formula we must first provide an orthogonal basis of W. Thus unless one is provided, our first step in an orthogonal projection computation is to produce an orthogonal basis of W. In some simple cases (e.g., when W is 1- or 2-dimensional) this can be done by inspection. Otherwise, we use the Gram-Schmidt procedure.
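As a computational companion to this remark (an illustrative sketch added here, not part of the text), the following NumPy function implements formula (5.3.2) given an orthogonal basis; a reduced QR factorization is used as a stand-in for the Gram-Schmidt procedure when only a non-orthogonal spanning set is at hand.
    import numpy as np

    def proj(v, orth_basis):
        """Orthogonal projection of v onto span(orth_basis), via formula (5.3.2).

        orth_basis must consist of mutually orthogonal, nonzero vectors."""
        v = np.asarray(v, dtype=float)
        out = np.zeros_like(v)
        for w in orth_basis:
            w = np.asarray(w, dtype=float)
            out += (v @ w) / (w @ w) * w
        return out

    def orthogonalize(spanning_set):
        """Replace a linearly independent spanning set by an orthonormal basis.

        A reduced QR factorization plays the role of Gram-Schmidt here."""
        Q, _ = np.linalg.qr(np.column_stack(spanning_set))
        return list(Q.T)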

Example 5.3.11.

Consider the inner product space R^3 with the dot product. Let W ⊆ R^3 be the plane with equation x + y + z = 0. Compute proj_W(v) for each v below.
  1. v = (3, -2, 2)
  2. v = (2, 1, -3)
  3. v = (7, 7, 7)
Solution.
According to Remark 5.3.10 our first step is to produce an orthogonal basis of W. We do so by inspection. Since dim W = 2, we simply need to find two solutions to x + y + z = 0 that are orthogonal to one another: e.g., w_1 = (1, -1, 0) and w_2 = (1, 1, -2). Thus we choose B = {(1, -1, 0), (1, 1, -2)} as our orthogonal basis, and our computations become a matter of applying (5.3.2), which in this case becomes
proj_W(v) = ((v · w_1)/(w_1 · w_1)) w_1 + ((v · w_2)/(w_2 · w_2)) w_2 = ((v · w_1)/2) w_1 + ((v · w_2)/6) w_2.
Now compute:
proj_W((3, -2, 2)) = (5/2)(1, -1, 0) + (-3/6)(1, 1, -2) = (2, -3, 1)
proj_W((2, 1, -3)) = (1/2)(1, -1, 0) + (9/6)(1, 1, -2) = (2, 1, -3)
proj_W((7, 7, 7)) = (0/2)(1, -1, 0) + (0/6)(1, 1, -2) = (0, 0, 0).
The last two computations might give you pause. Why do we have proj_W((2, 1, -3)) = (2, 1, -3) and proj_W((7, 7, 7)) = (0, 0, 0)? The answer is that (2, 1, -3) is already an element of W, so it stands to reason that its projection is itself; and (7, 7, 7) is already orthogonal to W (it is a scalar multiple of (1, 1, 1)), so it stands to reason that its projection is equal to 0. See Exercise 5.3.6.20 for a rigorous proof of these claims.
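The three projections above are easy to double-check numerically; here is a brief sketch (illustrative only, not part of the text) using the orthogonal basis chosen in the solution.
    import numpy as np

    w1 = np.array([1.0, -1.0,  0.0])   # orthogonal basis of the plane x + y + z = 0
    w2 = np.array([1.0,  1.0, -2.0])

    def proj_W(v):
        v = np.asarray(v, dtype=float)
        return (v @ w1) / (w1 @ w1) * w1 + (v @ w2) / (w2 @ w2) * w2

    print(proj_W([3, -2, 2]))    # [ 2. -3.  1.]
    print(proj_W([2, 1, -3]))    # [ 2.  1. -3.]  (already in W)
    print(proj_W([7, 7, 7]))     # [ 0.  0.  0.]  (orthogonal to W)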

Video example: orthogonal projection in function space.

Figure 5.3.12. Video: orthogonal projection in function space
Clearly W ⊆ (W^⊥)^⊥. For the other direction, take v ∈ (W^⊥)^⊥. Using the orthogonal projection theorem, we can write v = w + w^⊥ with w ∈ W and w^⊥ ∈ W^⊥. We will show w^⊥ = 0.
Since v ∈ (W^⊥)^⊥ we have ⟨v, w^⊥⟩ = 0. Then we have
0 = ⟨v, w^⊥⟩ = ⟨w + w^⊥, w^⊥⟩ = ⟨w, w^⊥⟩ + ⟨w^⊥, w^⊥⟩ = 0 + ⟨w^⊥, w^⊥⟩ (since w ∈ W and w^⊥ ∈ W^⊥).
Thus ⟨w^⊥, w^⊥⟩ = 0. It follows that w^⊥ = 0, and hence v = w + 0 = w ∈ W.
  1. We must show that proj_W(cv + dw) = c proj_W(v) + d proj_W(w) for all c, d ∈ R and v, w ∈ V. We pick an orthogonal basis B = {v_1, v_2, ..., v_r} of W and compute, using formula (5.3.2):
    proj_W(cv + dw) = Σ_{i=1}^r (⟨cv + dw, v_i⟩ / ⟨v_i, v_i⟩) v_i = Σ_{i=1}^r ((c⟨v, v_i⟩ + d⟨w, v_i⟩) / ⟨v_i, v_i⟩) v_i = c Σ_{i=1}^r (⟨v, v_i⟩ / ⟨v_i, v_i⟩) v_i + d Σ_{i=1}^r (⟨w, v_i⟩ / ⟨v_i, v_i⟩) v_i = c proj_W(v) + d proj_W(w).
  2. By definition we have proj_W(v) ∈ W for all v ∈ V, and thus im proj_W ⊆ W. For the other direction, if w ∈ W, then w = proj_W(w) (Exercise 5.3.6.20), and thus w ∈ im proj_W. This proves im proj_W = W.
    The fact that null proj_W = W^⊥ follows from the equivalence stated in (b) of Exercise 5.3.6.20.

Subsection 5.3.3 Orthogonal projection in R^2 and R^3

For this subsection we will always work within Euclidean space: i.e., V = R^n with the dot product. In applications we often want to compute the projection of a point onto a line (in R^2 or R^3) or a plane (in R^3). According to Corollary 5.3.14 the operation of projecting onto any subspace W ⊆ R^n is in fact a linear transformation proj_W : R^n → R^n. By Corollary 3.6.18 we have proj_W = T_A, where
A = [ proj_W(e_1) | proj_W(e_2) | ... | proj_W(e_n) ].
Lastly, (5.3.2) gives us an easy formula for computing proj_W(e_j) for all j, once we have selected an orthogonal basis for W. As a result we can easily derive matrix formulas for projection onto any subspace W of any Euclidean space R^n. We illustrate this with some examples in R^2 and R^3 below.
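The following sketch (not from the text) builds the standard matrix of proj_W column by column, exactly as described above, assuming an orthogonal basis of W is given.
    import numpy as np

    def projection_matrix(orth_basis, n):
        """Return the n x n matrix A with A @ x = proj_W(x), where W = span(orth_basis)."""
        A = np.zeros((n, n))
        for j in range(n):
            e_j = np.zeros(n)
            e_j[j] = 1.0
            # column j of A is proj_W(e_j), computed with formula (5.3.2)
            for w in orth_basis:
                w = np.asarray(w, dtype=float)
                A[:, j] += (e_j @ w) / (w @ w) * w
        return A

    # Example: the plane x + y + z = 0, with the orthogonal basis of Example 5.3.11.
    P = projection_matrix([[1, -1, 0], [1, 1, -2]], 3)
    print(P)
    print(np.allclose(P @ P, P))   # orthogonal projection matrices are idempotent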

Example 5.3.15. Projection onto a line in R^3.

Any line ℓ in R^3 passing through the origin can be described as ℓ = span{v}, for some v = (a, b, c) ≠ 0. The set {(a, b, c)} is trivially an orthogonal basis of ℓ. Using (5.3.2), we have
proj_ℓ(x) = ((x · v)/(v · v)) v = ((ax + by + cz)/(a^2 + b^2 + c^2)) (a, b, c).
It follows that proj_ℓ = T_A, where
A = [ proj_ℓ((1,0,0)) | proj_ℓ((0,1,0)) | proj_ℓ((0,0,1)) ] = (1/(a^2 + b^2 + c^2)) [ a^2 ab ac ; ab b^2 bc ; ac bc c^2 ].
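In NumPy the matrix above is just the outer product v v^T scaled by 1/(v · v); the snippet below (an added illustration) reproduces it for the line considered in the next example.
    import numpy as np

    def line_projection_matrix(v):
        v = np.asarray(v, dtype=float)
        return np.outer(v, v) / (v @ v)   # (1/(a^2+b^2+c^2)) [ a^2 ab ac ; ab b^2 bc ; ac bc c^2 ]

    A = line_projection_matrix([1, -2, 1])
    print(6 * A)    # compare with the matrix computed in Example 5.3.16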

Example 5.3.16.

Consider the line ℓ = span{(1, -2, 1)} ⊆ R^3.
  1. Find the matrix A such that proj_ℓ = T_A.
  2. Use your matrix formula from (a) to compute proj_ℓ((2, 3, -1)), proj_ℓ((2, -4, 2)), and proj_ℓ((1, 1, 1)).
  3. Compute d((2, 3, -1), ℓ) and d((2, -4, 2), ℓ).
Solution.
  1. Using the general formula described in Example 5.3.15, we have
    A = (1/6) [ 1 -2 1 ; -2 4 -2 ; 1 -2 1 ].
  2. Now compute
    proj_ℓ((2, 3, -1)) = A (2, 3, -1) = (1/6)(-5, 10, -5)
    proj_ℓ((2, -4, 2)) = A (2, -4, 2) = (2, -4, 2)
    proj_ℓ((1, 1, 1)) = A (1, 1, 1) = (0, 0, 0).
    The last two computations, proj_ℓ((2, -4, 2)) = (2, -4, 2) and proj_ℓ((1, 1, 1)) = (0, 0, 0), should come as no surprise, since (2, -4, 2) ∈ ℓ and (1, 1, 1) ⊥ ℓ. (See Exercise 5.3.6.20.)
  3. We have
    d((2, 3, -1), ℓ) = ‖(2, 3, -1) - proj_ℓ((2, 3, -1))‖ = ‖(1/6)(17, 8, -1)‖ = √354/6
    d((2, -4, 2), ℓ) = ‖(2, -4, 2) - proj_ℓ((2, -4, 2))‖ = ‖(0, 0, 0)‖ = 0.
    Again, the second computation should come as no surprise. Since (2, -4, 2) is itself an element of ℓ, it stands to reason that its distance to ℓ is equal to zero.

Example 5.3.17. Projection onto planes in R^3.

Any plane W ⊆ R^3 passing through the origin can be described as W = {(x, y, z) ∈ R^3 : ax + by + cz = 0}. Equivalently, W is the set of all x ∈ R^3 satisfying x · (a, b, c) = 0: i.e., W = ℓ^⊥, where ℓ = span{(a, b, c)}. Consider the orthogonal decomposition with respect to ℓ:
x = proj_ℓ(x) + (x - proj_ℓ(x)).
Since x - proj_ℓ(x) ∈ ℓ^⊥ = W and proj_ℓ(x) ∈ ℓ = W^⊥, we see that this is also an orthogonal decomposition with respect to W! Using the matrix formula for proj_ℓ from Example 5.3.15, we have
proj_W(x) = x - proj_ℓ(x) = Ix - Ax = (I - A)x, where A = (1/(a^2 + b^2 + c^2)) [ a^2 ab ac ; ab b^2 bc ; ac bc c^2 ].
We conclude that proj_W = T_B, where
B = I - A = (1/(a^2 + b^2 + c^2)) [ b^2 + c^2  -ab  -ac ; -ab  a^2 + c^2  -bc ; -ac  -bc  a^2 + b^2 ].
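A quick numerical sketch of this identity (added here for illustration, not from the text): the plane's projection matrix is the identity minus the projection matrix of its normal line.
    import numpy as np

    def plane_projection_matrix(normal):
        n = np.asarray(normal, dtype=float)
        return np.eye(len(n)) - np.outer(n, n) / (n @ n)   # B = I - A

    B = plane_projection_matrix([1, -2, 1])   # the plane x - 2y + z = 0
    print(6 * B)                              # compare with Example 5.3.18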

Example 5.3.18.

Consider the plane W = {(x, y, z) ∈ R^3 : x - 2y + z = 0}.
  1. Find the matrix A such that proj_W = T_A.
  2. Use your matrix formula from (a) to compute proj_W((2, 1, 1)) and proj_W((1, 1, 1)).
  3. Compute d((2, 1, 1), W) and d((1, 1, 1), W).
Solution.
  1. Using the general formula described in Example 5.3.17, we have
    A = (1/6) [ 5 2 -1 ; 2 2 2 ; -1 2 5 ].
  2. Now compute
    proj_W((2, 1, 1)) = A (2, 1, 1) = (1/6)(11, 8, 5)
    proj_W((1, 1, 1)) = A (1, 1, 1) = (1/6)(6, 6, 6) = (1, 1, 1).
  3. We have
    d((2, 1, 1), W) = ‖(2, 1, 1) - proj_W((2, 1, 1))‖ = ‖(1/6)(1, -2, 1)‖ = √6/6
    d((1, 1, 1), W) = ‖(1, 1, 1) - proj_W((1, 1, 1))‖ = ‖(0, 0, 0)‖ = 0.
    The second computation should come as no surprise: since (1, 1, 1) lies in W, it is its own projection, and its distance to W is zero.

Subsection 5.3.4 Trigonometric polynomial approximation

Consider the inner product space consisting of C([0, 2π]) along with the integral inner product ⟨f, g⟩ = ∫_0^{2π} f(x) g(x) dx. In Example 5.2.4 we saw that the set
B = {1, cos(x), sin(x), cos(2x), sin(2x), ..., cos(nx), sin(nx)}
is orthogonal with respect to this inner product. Thus B is an orthogonal basis of
span B = {g ∈ C([0, 2π]) : g(x) = a_0 + Σ_{k=1}^n (a_k cos kx + b_k sin kx) for some a_i, b_i ∈ R}.
We call W = span B the space of trigonometric polynomials of degree at most n.
Since B is an orthogonal basis of W, given an arbitrary function f(x) ∈ C([0, 2π]), its orthogonal projection f̂ = proj_W(f) is given by
f̂(x) = a_0 + a_1 cos(x) + b_1 sin(x) + a_2 cos(2x) + b_2 sin(2x) + ... + a_n cos(nx) + b_n sin(nx),
where
a_0 = (1/(2π)) ∫_0^{2π} f(x) dx,   a_j = (1/π) ∫_0^{2π} f(x) cos(jx) dx,   b_k = (1/π) ∫_0^{2π} f(x) sin(kx) dx.
Here we are using (5.3.2), as well as the inner product formulas ⟨1, 1⟩ = 2π and ⟨cos nx, cos nx⟩ = ⟨sin nx, sin nx⟩ = π from Example 5.2.4.
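As an added illustration (not in the original text), the coefficient formulas above can be evaluated numerically with scipy.integrate.quad for any continuous f; the sample function and degree below are arbitrary choices.
    import numpy as np
    from scipy.integrate import quad

    f = lambda x: x * (2 * np.pi - x)   # an arbitrary continuous function on [0, 2*pi]
    n = 3                               # degree of the approximating trigonometric polynomial

    a0 = quad(f, 0, 2 * np.pi)[0] / (2 * np.pi)
    a = [quad(lambda x, j=j: f(x) * np.cos(j * x), 0, 2 * np.pi)[0] / np.pi
         for j in range(1, n + 1)]
    b = [quad(lambda x, k=k: f(x) * np.sin(k * x), 0, 2 * np.pi)[0] / np.pi
         for k in range(1, n + 1)]

    def f_hat(x):
        """The orthogonal projection of f onto trigonometric polynomials of degree <= n."""
        return a0 + sum(a[j - 1] * np.cos(j * x) + b[j - 1] * np.sin(j * x)
                        for j in range(1, n + 1))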
What is the relationship between f and f̂? Theorem 5.3.9 tells us that f̂ is the “best” trigonometric polynomial approximation of f(x) of degree at most n in the following sense: given any other trigonometric polynomial g ∈ W, we have
‖f - f̂‖ ≤ ‖f - g‖.
Unpacking the definition of the norm in this inner product space, we conclude that
∫_0^{2π} (f - f̂)^2 dx ≤ ∫_0^{2π} (f - g)^2 dx
for all g ∈ W.
Thus, given a continuous function f on [0, 2π], linear algebra shows us how to find its best trigonometric polynomial approximation of the form
g(x) = a_0 + Σ_{k=1}^n (a_k cos kx + b_k sin kx).
However, linear algebra does not tell us just how good this approximation is. This question, among others, is tackled by another mathematical theory: Fourier analysis. There we learn that the trigonometric polynomial approximations get arbitrarily close to f as we let n increase. More precisely, letting f̂_n be the orthogonal projection of f onto the space of trigonometric polynomials of degree at most n, we have
lim_{n→∞} ‖f - f̂_n‖ = 0.

Subsection 5.3.5 Least-squares solution to linear systems

In statistics we often wish to approximate a scatter plot of points P_i = (X_i, Y_i), 1 ≤ i ≤ m, with a line ℓ : y = mx + b that “best fits” the data. “Finding” this line amounts to finding the appropriate slope m and y-intercept b: i.e., in this setup, the points P_i = (X_i, Y_i) are given, and m and b are the unknowns we wish to find. For the line to perfectly fit the data, we would want
Y_i = m X_i + b for all 1 ≤ i ≤ m.
In other words, (m, b) would be a solution to the matrix equation Ax = y, where
x = (m, b),   A = [ X_1 1 ; X_2 1 ; ... ; X_m 1 ],   y = (Y_1, Y_2, ..., Y_m).
Of course in most situations the provided points do not lie on a line, and thus there is no solution x to the given matrix equation Ax=y. When this is the case we can use the theory of orthogonal projection to find what is called a least-squares solution, which we now describe in detail.
The least-squares method applies to any matrix equation
(5.3.5)  A_{m×n} x_{n×1} = y_{m×1},
where A and y are given, and x is treated as an unknown vector. Recall that
Ax = y has a solution ⟺ y ∈ col A (Theorem 3.8.6).
When y ∉ col A, and hence (5.3.5) does not have a solution, the least-squares method proceeds by replacing y with the element of W = col A closest to it: that is, with its orthogonal projection onto W. Let ŷ = proj_W(y), where the orthogonal projection is taken with respect to the dot product on R^m, and consider the adjusted matrix equation
(5.3.6)  Ax = ŷ.
By definition of proj_W, we have ŷ ∈ W = col A, and thus there is a solution x̂ to (5.3.6). We call x̂ a least-squares solution to (5.3.5). Observe that x̂ does not necessarily satisfy A x̂ = y; rather, it satisfies A x̂ = ŷ. What makes this a “least-squares” solution is that A x̂ = ŷ is the element of W = col A closest to y. With respect to the dot product, this means that a least-squares solution x̂ minimizes the quantity
‖y - Ax‖ = √((y_1 - (Ax)_1)^2 + (y_2 - (Ax)_2)^2 + ... + (y_m - (Ax)_m)^2)
among all x ∈ R^n.
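As a minimal computational sketch (with made-up data points, not taken from the text), NumPy's least-squares routine returns exactly such a minimizer x̂:
    import numpy as np

    X = np.array([0.0, 1.0, 2.0, 3.0])        # hypothetical x-coordinates
    Y = np.array([1.1, 1.9, 3.2, 3.8])        # hypothetical y-coordinates

    A = np.column_stack([X, np.ones_like(X)]) # rows (X_i, 1); unknowns are (m, b)
    (m, b), residual, rank, _ = np.linalg.lstsq(A, Y, rcond=None)
    print(m, b)                               # slope and intercept of the best-fit line
    print(np.linalg.norm(Y - A @ np.array([m, b])))   # the minimized error ||y - Ax||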

Example 5.3.19. Best fitting line.

Suppose we wish to find a line ℓ : y = mx + b that best fits (in the least-squares sense) the following data points: P_1 = (-3, 1), P_2 = (1, 2), P_3 = (2, 3). Following the discussion above, we seek a solution x = (m, b) to the matrix equation Ax = y, where
x = (m, b),   A = [ -3 1 ; 1 1 ; 2 1 ],   y = (1, 2, 3).
Using Gaussian elimination, we see easily that this equation has no solution: equivalently, y ∉ W = col A. Accordingly, we compute ŷ = proj_W(y) and find a solution to A x̂ = ŷ. Conveniently, the set B = {(-3, 1, 2), (1, 1, 1)} is already an orthogonal basis of W = col A, allowing us to use (5.3.2):
ŷ = ((y · (-3, 1, 2))/((-3, 1, 2) · (-3, 1, 2))) (-3, 1, 2) + ((y · (1, 1, 1))/((1, 1, 1) · (1, 1, 1))) (1, 1, 1) = (1/14)(13, 33, 38).
Lastly, solving A x̂ = ŷ yields (m, b) = x̂ = (5/14, 2), and we conclude the line ℓ : y = (5/14)x + 2 is the one that best fits the data in the least-squares sense.
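Here is a short numerical re-run of this example (illustrative only), forming ŷ from the orthogonal basis of col A and then solving for x̂:
    import numpy as np

    A = np.array([[-3.0, 1.0],
                  [ 1.0, 1.0],
                  [ 2.0, 1.0]])
    y = np.array([1.0, 2.0, 3.0])
    c1, c2 = A[:, 0], A[:, 1]       # the columns are already orthogonal: c1 @ c2 == 0

    y_hat = (y @ c1) / (c1 @ c1) * c1 + (y @ c2) / (c2 @ c2) * c2
    print(14 * y_hat)               # [13. 33. 38.]

    x_hat = np.linalg.lstsq(A, y_hat, rcond=None)[0]
    print(x_hat)                    # [0.35714...  2.], i.e. (m, b) = (5/14, 2)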

Remark 5.3.20. Visualizing least-squares.

Figure 5.3.21 helps us give a graphical interpretation of how the line ℓ : y = (5/14)x + 2 best approximates the points P_1 = (-3, 1), P_2 = (1, 2), P_3 = (2, 3).
Figure 5.3.21. Least-squares visualization
Let y = (1, 2, 3) = (y_1, y_2, y_3) be the given y-values of the points, and let ŷ = (ŷ_1, ŷ_2, ŷ_3) be the orthogonal projection of y onto col A. In the graph the values ε_i denote the vertical differences ε_i = y_i - ŷ_i between the data points and our fitting line. The projection ŷ makes the error ‖y - ŷ‖ = √(ε_1^2 + ε_2^2 + ε_3^2) as small as possible. This means that if we draw any other line and compute the corresponding differences ε_i′ at the x-values -3, 1, and 2, then
ε_1^2 + ε_2^2 + ε_3^2 ≤ (ε_1′)^2 + (ε_2′)^2 + (ε_3′)^2.
To compute a least-squares solution to Ax = y we must first compute the orthogonal projection of y onto W = col A; and this in turn requires first producing an orthogonal basis of col A, which may require using the Gram-Schmidt procedure. The following result bypasses these potentially onerous steps by characterizing a least-squares solution to Ax = y as a solution to the matrix equation
A^T A x = A^T y.
Let W = col A, and let ŷ = proj_W(y). The key observation is that a vector x̂ satisfies A x̂ = ŷ if and only if
y = A x̂ + (y - A x̂)
is an orthogonal decomposition of y with respect to W = col A; and this is true if and only if y - A x̂ ∈ (col A)^⊥. Thus we have
A x̂ = ŷ ⟺ y - A x̂ ∈ (col A)^⊥ ⟺ y - A x̂ ∈ null A^T (since (col A)^⊥ = null A^T, Theorem 5.3.7) ⟺ A^T (y - A x̂) = 0 ⟺ A^T y - A^T A x̂ = 0 ⟺ A^T A x̂ = A^T y.

Example 5.3.23.

Consider again the matrix equation Ax = y from Example 5.3.19. According to Theorem 5.3.22 the least-squares solution can be found by solving the equation A^T A x = A^T y for x. We compute
A^T A = [ 14 0 ; 0 3 ],   A^T y = (5, 6),
and solve
[ 14 0 ; 0 3 ] x = (5, 6) ⟹ x = (5/14, 2),
just as before.
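The same computation in NumPy (an added check, not part of the text): form A^T A and A^T y and solve the normal equation.
    import numpy as np

    A = np.array([[-3.0, 1.0],
                  [ 1.0, 1.0],
                  [ 2.0, 1.0]])
    y = np.array([1.0, 2.0, 3.0])

    x_hat = np.linalg.solve(A.T @ A, A.T @ y)   # solves the normal equation
    print(A.T @ A)         # [[14.  0.] [ 0.  3.]]
    print(A.T @ y)         # [5. 6.]
    print(x_hat)           # [0.35714286 2.        ]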

Exercises 5.3.6 Exercises

WeBWork Exercises

1.
Compute the orthogonal projection of v=[891] onto the line L through [564] and the origin.
projL(v)= (3 × 1 array).
2.
Let y=[973] and u=[363]. Write y as the sum of two orthogonal vectors, x1 in Span{u} and x2 orthogonal to u.
x1= (3 × 1 array), x2= (3 × 1 array).
3.
(a) Find the distance from the point P0=(2,7,6) to the plane 2x+4y7z=5. Use sqrt() to enter square roots.
Distance:
(b) Find the equation of the plane that passes through the points P=(2,0,5), Q=(5,1,2), and R=(1,2,2). Write your answer in terms of the variables x, y, z.
Answer:
Answer 1.
1.56502
Answer 2.
13x+46y+15z=49
Solution.
Solution: (a) The idea is to find a convenient point lying in the given plane. For example, let us pick P1=(1,0,1), and let v be the vector from P1 to P0. Then the distance from P0 to the plane is the absolute value of the scalar projection of v onto the normal vector
N=[247]
So, we get that the distance is 13/√69.
(b) To find an equation of the plane in question, we can work with the point P and the normal vector N given by the cross product of the vector from P to Q, and the vector from P to R. It turns out that
N=[134615]
So, an equation of the plane is 13x46y15z=49.
4.
Find bases of the kernel and image of the orthogonal projection onto the plane 6x3y+z=0 in R3.
A basis for the kernel is { (3 × 1 array) }.
A basis for the image is { (3 × 1 array), (3 × 1 array) }.
5.
Let
x=[1340], y=[3351], z=[111716.536.5].
Use the Gram-Schmidt process to determine an orthonormal basis for the subspace of R4 spanned by x, y, and z.
{ (4 × 1 array), (4 × 1 array), (4 × 1 array) }.
6.
Find the least-squares solution x of the system
[100100]x=[459].
x= (2 × 1 array)
7.
By using the method of least squares, find the best line through the points:
(2,3), (1,3), (0,0).
Step 1. The general equation of a line is c_0 + c_1 x = y. Plugging the data points into this formula gives a matrix equation Ac = y.
(3 × 2 array) [c_0; c_1] = (3 × 1 array)
Step 2. The matrix equation Ac = y has no solution, so instead we use the normal equation A^T A ĉ = A^T y.
A^T A = (2 × 2 array)
A^T y = (2 × 1 array)
Step 3. Solving the normal equation gives the answer
ĉ = (2 × 1 array)
which corresponds to the formula
y=
Analysis. Compute the predicted y values: ŷ = A ĉ.
ŷ = (3 × 1 array)
Compute the error vector: e = y - ŷ.
e = (3 × 1 array)
Compute the total error: SSE = e_1^2 + e_2^2 + e_3^2.
SSE =
Answer 1.
0.642857+1.92857x
Answer 2.
0.642857
8.
By using the method of least squares, find the best parabola through the points:
(1,0), (2,2), (0,3), (1,1)
Step 1. The general equation of a parabola is c_0 + c_1 x + c_2 x^2 = y. Plugging the data points into this formula gives a matrix equation Ac = y.
(4 × 3 array) [c_0; c_1; c_2] = (4 × 1 array)
Step 2. The matrix equation Ac = y has no solution, so instead we use the normal equation A^T A ĉ = A^T y.
A^T A = (3 × 3 array)
A^T y = (3 × 1 array)
Step 3. Solving the normal equation gives the answer
ĉ = (3 × 1 array)
which corresponds to the formula
y=
Answer.
2.1+0.2x+(1)x2


In each exercise below you are given an inner product space V, a subspace W = span B, where B is orthogonal, and a vector v ∈ V. Compute proj_W(v).
9.
V=R4 with the dot product; W=span{(1,1,1,1),(1,1,1,1),(1,1,1,1)}; v=(2,3,1,1)
10.
V = R^3 with the dot product with weights k_1 = 1, k_2 = 2, k_3 = 1; W = span{(1, 1, 1), (1, -1, 1), (1, 0, -1)}; v = (1, 2, 3)
11.
V=C([0,2π]) with the integral inner product; W=span{cosx,cos2x,sinx}; f(x)=3 for all x[0,2π]

12.

Let P ⊆ R^3 be the plane passing through the origin with normal vector n = (1, 2, 1). Find the orthogonal projection of (1, 1, 1) onto P with respect to the dot product.

13.

Recall that the trace tr A of a square matrix is the sum of its diagonal entries. Let V = M_{22} with inner product ⟨A, B⟩ = tr(A^T B). (You may take for granted that this operation is indeed an inner product on M_{22}.) Define W = {A ∈ M_{22} : tr A = 0}.
  1. Compute an orthogonal basis for W. You can do this either by inspection (the space is manageable), or by starting with any basis of W and applying the Gram-Schmidt procedure.
  2. Compute projW(A), where
    A=[1211].

14.

Let V = C([0, 1]) with the integral inner product, and let f(x) = x. Find the function of the form g(x) = a + b cos(2πx) + c sin(2πx) that “best approximates” f(x) in terms of this inner product: i.e., find the g(x) of this form that minimizes d(g, f).
Hint.
The set S={f(x)=1,g(x)=cos(2πx),h(x)=sin(2πx)} is orthogonal with respect to the given inner product.

15.

Let (V, ⟨ , ⟩) be an inner product space, let S = {w_1, w_2, ..., w_r} ⊆ V, and let W = span S. Prove:
v ∈ W^⊥ if and only if ⟨v, w_i⟩ = 0 for all 1 ≤ i ≤ r.
In other words, to check whether an element is in W^⊥, it suffices to check that it is orthogonal to each element of its spanning set S.

16.

Consider the inner product space R4 together with the dot product. Let
W={(x1,x2,x3,x4)R4:x1=x3 and x2=x4}.
Provide orthogonal bases for W and W^⊥.

18. Dimension of W^⊥.

Prove statement (3) of Theorem 5.3.5: if (V, ⟨ , ⟩) is an inner product space of dimension n, and W is a subspace of V, then
dim W + dim W^⊥ = n.
Hint.
By Corollary 5.2.7 there is an orthogonal basis B = {v_1, ..., v_r} of W, and furthermore, we can extend B to an orthogonal basis B′ = {v_1, v_2, ..., v_r, u_1, ..., u_{n-r}} of all of V. Show that the u_i form a basis of W^⊥.

20.

Let V be an inner product space, and let W ⊆ V be a finite-dimensional subspace. Prove the following statements:
  1. v ∈ W if and only if proj_W(v) = v;
  2. v ∈ W^⊥ if and only if proj_W(v) = 0.

21.

We consider the problem of fitting a collection of data points (x,y) with a quadratic curve of the form y=f(x)=ax2+bx+c. Thus we are given some collection of points (x,y), and we seek parameters a,b,c for which the graph of f(x)=ax2+bx+c “best fits” the points in some way.
  1. Show, using linear algebra, that if we are given any three points (x,y)=(r1,s1),(r2,s2),(r3,s3), where the x-coordinates ri are all distinct, then there is a unique choice of a,b,c such that the corresponding quadratic function agrees precisely with the data. In other words, given just about any three points in the plane, there is a unique quadratic curve connecting them.
  2. Now suppose we are given the four data points
    P1=(0,2),P2=(1,0),P3=(2,2),P4=(3,6).
    1. Use the least-squares method described in the lecture notes to come up with a quadratic function y=f(x) that “best fits” the data.
    2. Graph the function f you found, along with the points Pi. (You may want to use technology.) Use your graph to explain precisely in what sense f “best fits” the data.