There are many situations in mathematics where we want to describe an infinite set in a concise manner. We saw this at work already in Section 2.3, where infinite sets of solutions to linear systems were neatly described with parametric expressions.
A similar issue arises when describing vector spaces and their subspaces. As we know, any vector space is either the zero space or infinite (Exercise 1.1.3.6). If we happen to be dealing with a subspace of \(\R^n\text{,}\) then there is the possibility of giving a parametric description; but how do we proceed when working in one of our more exotic vector spaces like \(C^1(\R)\text{?}\)
As we will see in Section 4.3, the relevant linear algebraic tool for this purpose is the concept of a basis. Loosely speaking, a basis for a vector space \(V\) is a set of vectors that is large enough to generate the entire space, and small enough to contain no redundancies. What exactly we mean by “generate” is captured by the rigorous notion of span; and what we mean by “no redundancies” is captured by linear independence.
Let \(V\) be a vector space, and let \(S\subseteq V\) be any subset of \(V\text{.}\) The span of \(S\), denoted \(\Span S\text{,}\) is the subset of \(V\) defined as follows:
If \(S=\emptyset\text{,}\) then \(\Span S=\{\boldzero_V\}\text{.}\)
Otherwise we define \(\Span S\) to be the set of all linear combinations of elements of \(S\text{:}\) i.e.,
\begin{equation*}
\Span S=\{\boldv\in V\colon \boldv=c_1\boldv_1+c_2\boldv_2+\cdots +c_n\boldv_n \text{ for some } \boldv_i\in S \text{ and } c_i\in \R\}\text{.}
\end{equation*}
Let \(S\) be a subset of \(V\text{.}\) Some simple observations:
The zero vector is always an element of \(\Span S\text{.}\) Indeed, if \(S=\emptyset\text{,}\) then \(\Span S=\{\boldzero\}\) by definition. Otherwise, given any \(\boldv\in S\text{,}\) the linear combination \(0\boldv=\boldzero\) is an element of \(\Span S\text{.}\)
We have \(S\subseteq \Span S\text{:}\) i.e., \(\Span S\) includes \(S\) itself. Indeed, given any \(\boldv\in S\text{,}\) the linear combination \(1\boldv=\boldv\) is an element of \(\Span S\text{.}\)
If \(S=\{\boldv\}\) contains exactly one element, then \(\Span S=\{c\boldv\colon c\in \R\}\) is simply the set of all scalar multiples of \(\boldv\text{.}\)
If \(\boldv\ne \boldzero\text{,}\) then we know that this set is infinite (Exercise 1.1.3.6). Thus even when \(S\) is finite, \(\Span S\) will be infinite, as long as \(S\) contains at least one nonzero vector.
\(\Span S\) is the set of all scalar multiples of the nonzero vector \((a,b)\text{.}\) Geometrically, this is the line that passes through the origin and the point \((a,b)\text{.}\)
By Remark 4.2.2, we have \(S\subseteq \Span S\text{.}\) Thus \(\R^2\subseteq \Span \R^2\text{.}\) Since \(\Span \R^2\subseteq \R^2\) by definition, we conclude that \(\Span S=\R^2\text{.}\)
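Deciding whether a given vector lies in the span of a finite set amounts to solving a linear system, so such questions can be checked computationally. Here is a minimal sketch using sympy (an assumption on tooling; any computer algebra system would do), with hypothetical vectors chosen for illustration:

```python
# Is b = (3, 5) in Span{(1, 1), (1, 2)}?  Equivalently: does the system
# c1*(1, 1) + c2*(1, 2) = (3, 5) have a solution in c1, c2?
from sympy import Matrix, symbols, linsolve

v1, v2, b = Matrix([1, 1]), Matrix([1, 2]), Matrix([3, 5])
c1, c2 = symbols('c1 c2')

# Assemble the vectors as columns of a coefficient matrix and solve.
M = v1.row_join(v2)
solutions = linsolve((M, b), c1, c2)
print(solutions)  # nonempty solution set means b lies in the span
```

Here the solution set is \(\{(1,2)\}\text{,}\) reflecting the fact that \(b = 1\cdot v_1 + 2\cdot v_2\text{.}\)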
Let \(W\subseteq V\) be a subspace that contains all elements of \(S\text{.}\) Since \(W\) is closed under arbitrary linear combinations, it must contain any linear combination of elements of \(S\text{,}\) and thus \(\Span S\subseteq W\text{.}\)
Let \(S\) be a subset of the vector space \(V\text{.}\) We call \(W=\Span S\) the subspace of \(V\) generated by \(S\text{,}\) and we call \(S\) a spanning set for \(W\text{.}\)
For most of the vector spaces we've met, a natural spanning set springs to mind. We will refer to these loosely as standard spanning sets. Some examples:
Zero space.
Let \(V=\{\boldzero\}\text{.}\) By definition the empty set \(S=\emptyset=\{ \}\) is a spanning set of \(V\text{.}\)
Let \(V=\R^n\text{.}\) For \(1\leq i\leq n\text{,}\) define \(\bolde_i\) to be the \(n\)-tuple with a one in the \(i\)-th entry, and zeros elsewhere. Then \(S=\{\bolde_1, \bolde_2,\dots, \bolde_n\}\) is a spanning set for \(\R^n\text{.}\)
Let \(V=M_{mn}\text{.}\) For each \((i,j)\) with \(1\leq i\leq m\) and \(1\leq j\leq n\text{,}\) define \(E_{ij}\) to be the \(m\times n\) matrix with a one in the \(ij\)-th entry, and zeros elsewhere. Then \(S=\{E_{ij}\colon 1\leq i\leq m, 1\leq j\leq n\}\) is a spanning set for \(M_{mn}\text{.}\)
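For instance, in \(M_{22}\) the decomposition of an arbitrary matrix in terms of the standard spanning set \(\{E_{11}, E_{12}, E_{21}, E_{22}\}\) is immediate:

```latex
\begin{equation*}
\begin{bmatrix} a \amp b \\ c \amp d \end{bmatrix}
= a\,E_{11} + b\,E_{12} + c\,E_{21} + d\,E_{22}\text{,}
\end{equation*}
```

showing that every \(2\times 2\) matrix is a linear combination of the \(E_{ij}\text{.}\)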
It is important to observe that spanning sets for vector spaces are not unique. Far from it! In general, for any nonzero vector space there are infinitely many choices of spanning sets.
Since the last column will never contain a leading one, we conclude that the system is consistent for any choice of \(a,b,c,d\text{,}\) and thus that \(\Span S=M_{22}\text{,}\) as claimed.
Let \(V\) be a vector space. A subset \(S\subseteq V\) is linearly independent if for any collection \(\boldv_1,\boldv_2,\dots, \boldv_n\) of distinct vectors of \(S\) (i.e., \(\boldv_i\ne \boldv_j\) for \(i\ne j\)), and any scalars \(c_1,c_2,\dots, c_n\in \R\text{,}\) the following implication holds:
A subset \(S\) is linearly dependent if it is not linearly independent: i.e., if we can find distinct vectors \(\boldv_1,\boldv_2,\dots, \boldv_n\in S\text{,}\) and scalars \(c_1, c_2,\dots, c_n\) with \(c_i\ne 0\) for some \(i\text{,}\) such that
Recalling the notion of trivial and nontrivial linear combinations from Definition 1.1.13, we can summarize Definition 4.2.10 in plain English as follows:
A set \(S\) is linearly independent if there is no nontrivial linear combination of distinct elements of \(S\) equal to the zero vector.
As stated, our definition of linear independence is pleasingly general in that it places no restriction on the subset in question; in particular, the definition applies to both finite and infinite subsets of vector spaces. That said, one drawback to this definition is that in order to determine whether \(S\) is linearly independent, we must look at each finite subset of elements of \(S\) and determine for this collection whether or not there is a nontrivial linear combination equal to the zero vector. To do so directly would be quite time consuming. Thankfully, we will focus on finite sets \(S\text{,}\) and in this case it turns out that the only subset we need to consider is \(S\) itself.
Let \(V\) be a vector space, and let \(S=\{\boldv_1,\boldv_2,\dots, \boldv_n\}\subseteq V\text{,}\) where the \(\boldv_i\) are distinct. The following are equivalent.
If \(S=\{\boldv\}\text{,}\) then \(S\) is linearly independent if and only if \(\boldv\ne \boldzero\text{.}\) The previous comment shows why \(\boldv\ne \boldzero\) is a necessary condition. Let's see why it is sufficient.
Suppose \(\boldv\ne\boldzero\text{,}\) and suppose we have \(c\boldv=\boldzero\text{.}\) By Theorem 1.1.9 we have \(c=0\) or \(\boldv=\boldzero\text{.}\) Since \(\boldv\ne \boldzero\text{,}\) we conclude \(c=0\text{.}\) This shows that the only linear combination of \(S\) yielding \(\boldzero\) is the trivial one.
In both cases, we see that one of the vectors is a scalar multiple of the other. Conversely, if one of the two vectors is a scalar multiple of the other, then it is easy to see that there is a nontrivial linear combination equal to \(\boldzero\text{:}\) e.g., if \(\boldv=c\boldw\text{,}\) then \(\boldv-c\boldw=\boldzero\text{.}\) We conclude that \(S\) is linearly dependent if and only if one of the vectors is a scalar multiple of the other, and linearly independent if and only if neither vector is a scalar multiple of the other.
The simple test in Example 4.2.13 for linear independence of a set of two vectors unfortunately does not extend to larger sets. For example, the set \(S=\{(1,1),(1,0), (0,1)\}\) can be shown to be linearly dependent, and yet no element of \(S\) is a scalar multiple of any other element. What is true in these cases is that some element of \(S\) can be written as a linear combination of the others, as articulated in Remark 4.2.14.
Let \(S=\{\boldv_1,\boldv_2, \dots, \boldv_n\}\) be a subset of the vector space \(V\text{,}\) where the \(\boldv_i\) are distinct. If \(n\geq 2\text{,}\) then \(S\) is linearly dependent if and only if we can express some element of \(S\) as a linear combination of the others: i.e., if and only if we have
Indeed, assume we have a vector equation of the form (4.8) for some \(1\leq i\leq n\text{.}\) If \(\boldv_i=\boldzero\text{,}\) then \(S\) is automatically dependent. (See (1) from Example 4.2.13.) Otherwise the linear combination on the right side of (4.8) must be nontrivial, in which case
By Theorem 4.2.12, to decide whether a finite set \(S\) is linearly independent we need only determine whether there is a nontrivial linear combination of its elements equal to the zero vector. As described in Procedure 4.2.15, this boils down to a question about the solutions to a certain system of linear equations.
Translate this vector equation into a homogeneous linear system in the unknowns \(c_1,c_2,\dots, c_n \text{,}\) using the definition of equality for your vector space.
Decide, using Gaussian elimination, whether this system has any nonzero (i.e., nontrivial) solutions. If it has no nontrivial solution, \(S\) is linearly independent; if it has a nontrivial solution, \(S\) is linearly dependent.
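The steps above can be sketched computationally. Assembling the vectors of \(S\) as the columns of a matrix \(A\text{,}\) the nontrivial solutions of \(A\boldc=\boldzero\) are exactly the nontrivial linear combinations of \(S\) equal to the zero vector. A minimal sketch using sympy (an assumption on tooling; sympy's row reduction plays the role of Gaussian elimination here), applied to the set \(S=\{(1,1),(1,0),(0,1)\}\) discussed earlier:

```python
# Test S = {(1,1), (1,0), (0,1)} for linear independence: the vectors of S
# become the columns of A, and we look for nontrivial solutions of A*c = 0.
from sympy import Matrix

S = [Matrix([1, 1]), Matrix([1, 0]), Matrix([0, 1])]
A = Matrix.hstack(*S)

null_basis = A.nullspace()  # basis for the solution space of A*c = 0

if null_basis:
    # Any nullspace basis vector gives a nontrivial dependence relation.
    print("linearly dependent; e.g. c =", list(null_basis[0]))
else:
    print("linearly independent")
```

Here the nullspace is nonzero, so \(S\) is linearly dependent; for instance, \(-(1,1)+(1,0)+(0,1)=(0,0)\text{.}\)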
This is a fitting point to recall our Gaussian elimination mantra. As you can see, even as we move into more and more abstract realms of linear algebra (linear independence, span, etc.), Gaussian elimination remains our most important tool.
Row reduction reveals that this last linear system has a free variable, and hence that there are infinitely many solutions to this system: e.g., \((a,b,c)=(2,1,3)\text{.}\) We conclude that \(S\) is linearly dependent.
Statements 8–9 can be shown to be equivalent to one of the previous statements of the invertibility theorem using the column method 3.1.27 of matrix multiplication. Indeed, given \(\boldb\in \R^n\) the matrix equation \(A\boldx=\boldb\) has a solution if and only if \(\boldb\) can be written as a linear combination of the columns of \(A\text{.}\) Thus, the matrix equation \(A\boldx=\boldb\) has a solution for all \(\boldb\in \R^n\) if and only if all vectors \(\boldb\in \R^n\) lie in the span of the columns of \(A\text{.}\) This proves statement 8 is equivalent to statement 3. Similarly, the matrix equation \(A\boldx=\boldzero\) has only the trivial solution if and only if the only linear combination of the columns of \(A\) equal to \(\boldzero\) is the trivial one, if and only if the columns of \(A\) are linearly independent. This proves statement 9 is equivalent to statement 4.
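These equivalences can be illustrated numerically. The sketch below (a hypothetical \(2\times 2\) example, using sympy as an assumed tool) checks that for an invertible matrix \(A\text{,}\) the equation \(A\boldx=\boldb\) is solvable for a given \(\boldb\) and \(A\boldx=\boldzero\) has only the trivial solution:

```python
# For an invertible A: A*x = b is solvable (columns span R^n), and
# A*x = 0 has only the trivial solution (columns are linearly independent).
from sympy import Matrix

A = Matrix([[1, 2], [3, 4]])  # invertible: det = -2 != 0
assert A.det() != 0

b = Matrix([5, 6])
x = A.solve(b)                # unique solution since A is invertible
assert A * x == b             # b lies in the span of the columns of A

assert A.nullspace() == []    # only the trivial solution to A*x = 0
```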
The analogous statements about the rows of \(A\) are easily seen to be equivalent to statements 8–10 using the fact that \(A\) is invertible if and only if \(A^T\) is invertible.
We only know that \(\mathrm{span} \lbrace\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3 \rbrace\subseteq \mathrm{span} \lbrace\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3, \mathbf{u}_4\rbrace\text{.}\)
There is no obvious relationship between \(\mathrm{span} \lbrace\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3 \rbrace\) and \(\mathrm{span} \lbrace\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3, \mathbf{u}_4 \rbrace\text{.}\)
\(\mathrm{span} \lbrace\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3 \rbrace= \mathrm{span} \lbrace\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3, \mathbf{u}_4\rbrace\) when \(\mathbf{u}_4\) is a scalar multiple of one of \(\lbrace\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3 \rbrace\text{.}\)
\(\lbrace{{\bf u}_1, {\bf u}_2, {\bf u}_3, {\bf u}_4}\rbrace\) could be a linearly independent or linearly dependent set of vectors depending on the vectors chosen.
\(\lbrace{{\bf u}_1, {\bf u}_2, {\bf u}_3, {\bf u}_4}\rbrace\) is also a linearly independent set of vectors unless \({\bf u}_4\) is a scalar multiple of another vector in the set.
\(\lbrace{{\bf u}_1, {\bf u}_2, {\bf u}_3, {\bf u}_4}\rbrace\) is a linearly independent set of vectors unless \({\bf u}_4\) is a linear combination of other vectors in the set.
\(\lbrace{{\bf u}_1, {\bf u}_2, {\bf u}_3, {\bf u}_4}\rbrace\) could be a linearly independent or linearly dependent set of vectors depending on the vectors chosen.
If they are linearly dependent, find scalars that are not all zero such that the equation below is true. If they are linearly independent, find the only scalars that will make the equation below true.
Let \(V\) be a vector space, and let \(S=\{\boldv_1, \boldv_2, \dots, \boldv_n\}\) be a subset of distinct vectors. Assume \(S\) is linearly independent. Show that any subset \(T\subseteq S\) is linearly independent.
Let \(S=\{\boldv_1,\boldv_2,\dots, \boldv_n\}\) be a subset of \(\R^n\text{,}\) and let \(A\) be the \(n\times n\) matrix whose \(j\)-th column is \(\boldv_j\text{:}\) i.e.,