When discussing matrix algebra we saw that operations from real number arithmetic have natural analogues in the world of matrices. Furthermore, the act of comparing these two different algebraic systems brought to light many interesting features of matrix algebra.
Why stop at matrices? Are there other interesting algebraic systems that admit analogous operations? If so, to what degree do these systems agree with or differ from real number or matrix algebra?
A common technique in mathematics for such investigations is to distill the important properties of the motivating operations into a list of axioms, and then to prove statements that apply to any system that satisfies these axioms.
We now embark on just such an axiomatic approach. The notion of a vector space arises from focusing on just two operations from matrix algebra: matrix addition and matrix scalar multiplication. As we saw in Section 2.2, these two operations satisfy many useful properties: e.g., commutativity, associativity, distributivity, etc. Whereas earlier we showed directly that matrix algebra satisfies these properties, now we will come at things the other way: we record these various properties as a list of axioms, and declare any system that satisfies these axioms to be a vector space.
Once we've established the definition of a vector space, when we go on to investigate the properties enjoyed by vector spaces we make no assumptions beyond the fact that the basic axioms are satisfied. This approach comes off as somewhat abstract, but has the advantage that our conclusions now apply to any vector space you can think of. You don't have to reinvent the wheel each time you stumble across a new vector space.
This operation takes as input any real number \(c\in \R\) and any element \(\boldv\in V\text{,}\) and outputs another element of \(V\text{,}\) denoted \(c\boldv\text{.}\) We describe this operation using function notation as follows:
\begin{equation*}
\R\times V\rightarrow V\colon (c,\boldv)\mapsto c\boldv\text{.}
\end{equation*}
This operation takes as input any pair of elements \(\boldv, \boldw\in V\) and returns another element of \(V\text{,}\) denoted \(\boldv+\boldw\text{.}\) In function notation:
\begin{equation*}
V\times V\rightarrow V\colon (\boldv,\boldw)\mapsto \boldv+\boldw\text{.}
\end{equation*}
What's the deal with the “real” modifier? The reals are one example of a type of number system called a field. Other examples of fields are given by the complex numbers (\(\C\)) and the rational numbers (\(\Q\)). If \(K\) is a field, and if we replace each mention of \(\R\) in Definition 3.1.1 with a mention of \(K\text{,}\) then we are left with the definition of a vector space over \(K\text{.}\) Setting \(K=\C\text{,}\) for example, we get the definition of a complex vector space.
In our treatment of linear algebra we will largely focus on real vector spaces, and as such will often drop this modifier: hence the parentheses in the definition.
When introducing a new vector space there are many details in Definition 3.1.1 that must be verified. To help organize this task, follow this checklist:
Make explicit the underlying set \(V\) of the vector space.
Think of items (1)-(3) of our checklist as official declarations about the makeup of our vector space: “The underlying set shall be as stated”; “We declare the vector operations thusly”; “The zero vector shall be this element here, and vector inverses shall be assigned in this manner”. Item (4) is where we get down to the nitty gritty of showing that our proposed vector space structure articulated in (1)-(3) does indeed satisfy all the necessary properties.
In each of the examples below we carefully lay out the details of items (1)-(3) while often leaving much of the work of item (4) to you. You will meet these vector spaces frequently throughout the rest of your life. Each time you do, it will be helpful for orientation purposes to mentally run through items (1)-(3). Ask yourself: What is the underlying set? What are the vector operations? What acts as the zero vector, and how do I assign vector inverses?
We showed in Theorem 2.2.1 that matrix addition and matrix scalar multiplication satisfy axioms (i), (ii), (v)-(viii). Theorem 2.2.4 implies that our choice of zero vector (\(\boldzero_{m\times n}\)) and vector inverses (\(-A\)) satisfies axioms (iii)-(iv).
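By way of illustration, here is one of the distributivity axioms checked in a concrete \(2\times 2\) instance; the general verifications in Theorem 2.2.1 are entry-wise computations of exactly this kind:
\begin{equation*}
2\left(\begin{bmatrix}1\amp 0\\ 2\amp 1\end{bmatrix}+\begin{bmatrix}0\amp 3\\ 1\amp 1\end{bmatrix}\right)=\begin{bmatrix}2\amp 6\\ 6\amp 4\end{bmatrix}=2\begin{bmatrix}1\amp 0\\ 2\amp 1\end{bmatrix}+2\begin{bmatrix}0\amp 3\\ 1\amp 1\end{bmatrix}\text{.}
\end{equation*}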
It is clear that structurally speaking \(\R^n\) behaves exactly like \(M_{1n}\text{,}\) the vector space of \(1\times n\) row vectors: we have essentially just replaced brackets with parentheses. As such it follows from the previous example that \(\R^n\text{,}\) along with the given operations, constitutes a vector space.
Remark 3.1.5. Visualizing \(\R^n\text{:}\) points and arrows.
Fix \(n\in\{2,3\}\text{.}\) Once we choose a coordinate system for \(\R^n\) (complete with origin and coordinate axes), we can visually represent an element \((a_1,a_2,\dots, a_n)\) of \(\R^n\) either as a point \(P=(a_1,a_2,\dots, a_n)\) or as an arrow (or directed line segment) \(\overrightarrow{QR}\) that begins at a point \(Q=(b_1,b_2,\dots, b_n)\) of our choosing and ends at the point \(R=(a_1+b_1,a_2+b_2,\dots, a_n+b_n)\text{.}\) When we choose the initial point to be the origin \(O=(0,0,\dots, 0)\text{,}\) the corresponding arrow is just \(\overrightarrow{OP}\text{,}\) called the position vector of \(P=(a_1,a_2,\dots, a_n)\text{.}\) Figure 3.1.6 illustrates a variety of visual representations of the element \((1,2)\) of \(\R^2\text{.}\)
As a general rule of thumb, when trying to visualize subsets of \(\R^n\) (e.g., lines and planes), it helps to think of \(n\)-tuples as points; and when trying to visualize vector arithmetic in \(\R^n\text{,}\) it helps to think of \(n\)-tuples as arrows. Indeed, when using the arrow representation of \(n\)-tuples, vector addition can be visualized using the familiar “tip to tail” method, and vector scalar multiplication can be understood as scaling arrows. Figure 3.1.7 summarizes these visualization techniques in the case \(n=3\text{.}\)
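Concretely, the arithmetic being visualized is entry-wise. In \(\R^3\text{,}\) for example,
\begin{equation*}
(1,2,3)+(4,5,6)=(5,7,9), \qquad 2(1,2,3)=(2,4,6)\text{,}
\end{equation*}
and the “tip to tail” picture of the first computation has the arrow for \((4,5,6)\) beginning where the arrow for \((1,2,3)\) ends.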
Why introduce a new vector space, \(\R^n\text{,}\) if it is essentially the same thing as \(M_{1n}\text{,}\) or even \(M_{n1}\) for that matter? Recall that a matrix is not simply an ordered sequence: it is an ordered sequence arranged in a very particular way. This subtlety is baked into the very definition of matrix equality, and allows us to say that
There are situations, however, where we don't need this extra layer of structure, where we want to treat an ordered sequence simply as an ordered sequence. In such situations tuples are preferred to row or column vectors.
That said, the close connection between linear systems and matrix equations makes it very convenient to treat an \(n\)-tuple \((c_1,c_2,\dots, c_n)\) as if it were the column vector
It is clear that \(V=\{\boldzero\}\) satisfies the axioms of Definition 3.1.1: for axioms (i)-(ii) and (v)-(viii) both sides of the desired equality are equal to \(\boldzero\text{;}\) axioms (iii)-(iv) boil down to the fact that \(\boldv+\boldv=\boldv\) by definition.
Example 3.1.10. The vector space of infinite real sequences.
Underlying set.
The vector space of infinite real sequences, denoted \(\R^\infty\text{,}\) is the set of all infinite sequences \((a_i)_{i=1}^\infty=(a_1,a_2,\dots)\text{,}\) where \(a_i\in \R\) for all \(i\text{:}\) i.e.,
\begin{equation*}
\R^\infty=\{(a_1,a_2,\dots)\colon a_i\in\R \text{ for all } i\}\text{.}
\end{equation*}
See Exercise 4. Observe that since the vector operations are defined entry-wise, the vector arithmetic in \(\R^\infty\) is not so different from that of \(\R^n\text{.}\)
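For instance, the entry-wise operations take the form
\begin{equation*}
(a_1,a_2,\dots)+(b_1,b_2,\dots)=(a_1+b_1,a_2+b_2,\dots), \qquad c(a_1,a_2,\dots)=(ca_1,ca_2,\dots)\text{,}
\end{equation*}
so that, e.g., \(2(1,\frac{1}{2},\frac{1}{3},\dots)=(2,1,\frac{2}{3},\dots)\text{.}\)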
Let \(I\) be an interval in the real line. The vector space of functions from \(I\) to \(\R\), denoted \(F(I,\R)\text{,}\) is the set of all real-valued functions \(f\colon I\rightarrow \R\text{:}\) i.e., the set of all functions with domain \(I\) and codomain \(\R\text{.}\)
The vector operations on \(F(I,\R)\) defined below are generalizations of operations you may have seen before when learning about function transformations.
The zero vector of \(F(I,\R)\) is the constant function \(0_I\) that assigns the value 0 to all elements of \(I\text{:}\) i.e., \(0_I(x)=0\) for all \(x\in I\text{.}\)
Consider \(F(\R,\R)\text{.}\) A vector of \(F(\R,\R)\) is a function \(f\colon \R\rightarrow \R\text{:}\) a rule that assigns to any input \(x\in \R\) a unique output \(y\in \R\text{.}\) Thus the functions \(f\) and \(g\) defined as \(f(x)=x^2+1\) and \(g(x)=\sin x-x\) are both vectors of \(F(\R,\R)\text{,}\) as is any function given by a formula involving familiar mathematical functions and operations (as long as the formula is defined for all \(x\in \R\)). That's a lot of vectors! And yet we are only beginning to scratch the surface, since a function of \(F(\R,\R)\) need not be given by a nice formula; it simply has to be a well-defined rule. For example, the function \(h\) defined as
\begin{equation*}
h(x)=\begin{cases}
1\amp \text{if } x \text{ is rational}\\
0\amp \text{if } x \text{ is not rational}
\end{cases}
\end{equation*}
is a perfectly well-defined rule, and hence a vector of \(F(\R,\R)\text{,}\) even though it is given by no simple formula.
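To get a feel for this arithmetic, take \(f(x)=x^2+1\) and \(g(x)=\sin x-x\) as above. Assuming the usual pointwise definitions of the vector operations, we have
\begin{equation*}
(f+g)(x)=x^2+1+\sin x-x, \qquad (2f)(x)=2x^2+2
\end{equation*}
for all \(x\in\R\text{.}\)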
Hopefully this discussion gives some indication of how a vector space like \(F(\R,\R)\) is in some sense much larger than spaces like \(\R^n\) or \(M_{mn}\text{,}\) whose general elements can be described in a finite manner. This vague intuition can be made precise with the notion of the dimension of a vector space, which we develop in Section 3.7.
We end with an example that illustrates how we can define the vector operations to be anything we like, as long as they satisfy the axioms of Definition 3.1.1. In this case scalar multiplication will be defined as real number exponentiation, and vector addition will be defined as real number multiplication.
Exercise. We point out, however, that in this case the fact that the operations are actually well-defined should be justified. This is where the positivity of elements of \(\R_{>0}\) comes into play: since \(\boldv=a\) is a positive number, the power \(a^c\) is defined for any \(c\in \R\) and is again positive. Thus \(c\boldv=a^c\) is indeed an element of \(\R_{>0}\text{.}\) Similarly, if \(\boldv=a\) and \(\boldw=b\) are both positive numbers, then so is \(\boldv+\boldw=ab\text{.}\)
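As a sample verification of one of the algebraic axioms, consider distributivity of scalar multiplication over vector addition: for any \(c\in\R\) and \(\boldv=a\text{,}\) \(\boldw=b\) in \(\R_{>0}\text{,}\) we have
\begin{equation*}
c(\boldv+\boldw)=(ab)^c=a^cb^c=c\boldv+c\boldw\text{,}
\end{equation*}
where the outer operations are the exotic vector operations just defined, and the middle equality is the usual law of exponents.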
The notion of a linear combination of matrices (DefinitionΒ 2.1.13) generalizes easily to any vector space, and will be an important concept in the further development of our theory.
An expression of the form
\begin{equation*}
c_1\boldv_1+c_2\boldv_2+\cdots +c_n\boldv_n\text{,}
\end{equation*}
where \(c_i\in\R\) and \(\boldv_i\in V\) for all \(i\text{,}\) is called a linear combination. The scalars \(c_i\) are called the coefficients of the linear combination.
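For example, in \(\R^3\) we have
\begin{equation*}
2(1,0,1)+(-1)(0,1,1)=(2,0,2)+(0,-1,-1)=(2,-1,1)\text{,}
\end{equation*}
exhibiting \((2,-1,1)\) as a linear combination of \((1,0,1)\) and \((0,1,1)\) with coefficients \(2\) and \(-1\text{.}\)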
When proving a general fact about vector spaces we can only invoke the defining axioms; we cannot assume the vectors of the space take any particular form. For example, we cannot assume vectors of \(V\) are \(n\)-tuples, or matrices, etc. We end with an example of such an axiomatic proof.
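To give a first taste of the style, here is a quick sketch of one such argument, for the standard fact that \(0\boldv=\boldzero\) for every \(\boldv\in V\text{:}\) by distributivity,
\begin{equation*}
0\boldv=(0+0)\boldv=0\boldv+0\boldv\text{,}
\end{equation*}
and adding the vector inverse \(-(0\boldv)\) to both sides yields \(\boldzero=0\boldv\text{.}\) Note that every step invokes an axiom, not any particular form of the vectors involved.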
Let \(V={\mathbb R}\text{.}\) For \(u,v \in V\) and \(a\in{\mathbb R}\) define vector addition by \(u \boxplus v := u+v-3\) and scalar multiplication by \(a \boxdot u := au-3a+3\text{.}\) It can be shown that \((V,\boxplus,\boxdot)\) is a vector space over the scalar field \(\mathbb R\text{.}\) Find the following:
Let \(V={\mathbb R}^2\text{.}\) For \((u_1,u_2),(v_1,v_2) \in V\) and \(a\in{\mathbb R}\) define vector addition by \((u_1,u_2) \boxplus (v_1,v_2) := (u_1+v_1-3,u_2+v_2 + 1)\) and scalar multiplication by \(a \boxdot (u_1,u_2) := (au_1-3a+3,au_2 + a - 1)\text{.}\) It can be shown that \((V,\boxplus,\boxdot)\) is a vector space over the scalar field \(\mathbb R\text{.}\) Find the following:
Let \(V=(-8,\infty)\text{.}\) For \(u,v \in V\) and \(a\in{\mathbb R}\) define vector addition by \(u \boxplus v := uv + 8(u+v)+56\) and scalar multiplication by \(a \boxdot u := (u + 8)^a - 8\text{.}\) It can be shown that \((V,\boxplus,\boxdot)\) is a vector space over the scalar field \(\mathbb R\text{.}\) Find the following:
Let \(I\) be an interval in the real line. Verify that \(F(I,\R)\text{,}\) along with the vector operations defined as in Example 3.1.11, satisfies the axioms of a vector space.
Note: we use the funny symbols \(\odot\) and \(\oplus\) for scalar multiplication and vector addition to prevent confusion between the vector operations of \(\R_{>0}\) and real number arithmetic operations.
In each exercise below, the provided set, along with proposed vector operations, does not constitute a vector space. Identify all details of the vector space definition that fail to be satisfied. In addition to checking the axioms, you should also ask whether the proposed vector operations are well-defined. Provide explicit counterexamples for each failed property.
Let \(x\) be a variable. Define \(V=\{ax+1\colon a\in\R\}\text{,}\) the set of all linear polynomials \(f(x)=ax+1\) with constant coefficient equal to one. Define the vector operations as follows:
Let \(V=\{(a,b)\in\R^2\colon a>0, \ b\lt 0\}\text{:}\) i.e., \(V\) is the set of pairs whose first entry is positive and whose second entry is negative.
Prove statements (2)-(4) of Theorem 3.1.16. When treating a specific part you may assume the results of any part that has already been proven, including statement (1).
Show that the zero vector of \(V\) is unique: i.e., show that if \(\boldw\in V\) satisfies \(\boldw+\boldv=\boldv\) for all \(\boldv\in V\text{,}\) then \(\boldw=\boldzero\text{.}\)
Fix \(\boldv\in V\text{.}\) Show that the vector inverse of \(\boldv\) is unique: i.e., show that if \(\boldw+\boldv=\boldzero\text{,}\) then \(\boldw=-\boldv\text{.}\)
Let \(V\) be a vector space. Prove that either \(V=\{\boldzero\}\) (i.e., \(V\) is the zero space) or \(V\) is infinite. In other words, a vector space contains either exactly one element or infinitely many elements.
Assume \(V\) contains a nonzero vector \(\boldv\ne\boldzero\text{.}\) Show that if \(c\ne d\text{,}\) then \(c\boldv\ne d\boldv\text{.}\) You may assume the results of Theorem 3.1.16.