W W L CHEN

© W W L Chen, 1997, 2008.

This chapter is available free to all individuals, on the understanding that it is not to be used for financial gain, and may be downloaded and/or photocopied, with or without permission from the author. However, this document may not be kept on any information storage and retrieval system without permission from the author, unless such system is not accessible to any individuals other than its owners.

Chapter 8

LINEAR TRANSFORMATIONS

8.1. Euclidean Linear Transformations

By a transformation from $\mathbb{R}^n$ into $\mathbb{R}^m$, we mean a function of the type $T : \mathbb{R}^n \to \mathbb{R}^m$, with domain $\mathbb{R}^n$ and codomain $\mathbb{R}^m$. For every vector $\mathbf{x} \in \mathbb{R}^n$, the vector $T(\mathbf{x}) \in \mathbb{R}^m$ is called the image of $\mathbf{x}$ under the transformation $T$, and the set
$$R(T) = \{T(\mathbf{x}) : \mathbf{x} \in \mathbb{R}^n\},$$
of all images under $T$, is called the range of the transformation $T$.

Remark. For our convenience later, we have chosen to use $R(T)$ instead of the usual $T(\mathbb{R}^n)$ to denote the range of the transformation $T$.

For every $\mathbf{x} = (x_1, \ldots, x_n) \in \mathbb{R}^n$, we can write
$$T(\mathbf{x}) = T(x_1, \ldots, x_n) = (y_1, \ldots, y_m).$$
Here, for every $i = 1, \ldots, m$, we have
$$y_i = T_i(x_1, \ldots, x_n), \tag{1}$$
where $T_i : \mathbb{R}^n \to \mathbb{R}$ is a real valued function.

Definition. A transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is called a linear transformation if there exists a real matrix
$$A = \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \ldots & a_{mn} \end{pmatrix}$$
such that for every $\mathbf{x} = (x_1, \ldots, x_n) \in \mathbb{R}^n$, we have $T(x_1, \ldots, x_n) = (y_1, \ldots, y_m)$, where
$$\begin{aligned} y_1 &= a_{11}x_1 + \ldots + a_{1n}x_n, \\ &\;\;\vdots \\ y_m &= a_{m1}x_1 + \ldots + a_{mn}x_n, \end{aligned}$$
or, in matrix notation,
$$\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \ldots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \tag{2}$$
The matrix $A$ is called the standard matrix for the linear transformation $T$.

Remarks. (1) In other words, a transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is linear if the equation (1) is linear for every $i = 1, \ldots, m$.

(2) If we write $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^m$ as column matrices, then (2) can be written in the form $\mathbf{y} = A\mathbf{x}$, and so the linear transformation $T$ can be interpreted as multiplication of $\mathbf{x} \in \mathbb{R}^n$ by the standard matrix $A$.

Definition. A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is said to be a linear operator if $n = m$. In this case, we say that $T$ is a linear operator on $\mathbb{R}^n$.

Example 8.1.1. The linear transformation $T : \mathbb{R}^5 \to \mathbb{R}^3$, defined by the equations
$$\begin{aligned} y_1 &= 2x_1 + 3x_2 + 5x_3 + 7x_4 - 9x_5, \\ y_2 &= 3x_2 + 4x_3 + 2x_5, \\ y_3 &= x_1 + 3x_3 - 2x_4, \end{aligned}$$
can be expressed in matrix form as
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 2 & 3 & 5 & 7 & -9 \\ 0 & 3 & 4 & 0 & 2 \\ 1 & 0 & 3 & -2 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}.$$
If $(x_1, x_2, x_3, x_4, x_5) = (1, 0, 1, 0, 1)$, then
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 2 & 3 & 5 & 7 & -9 \\ 0 & 3 & 4 & 0 & 2 \\ 1 & 0 & 3 & -2 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -2 \\ 6 \\ 4 \end{pmatrix},$$
so that $T(1, 0, 1, 0, 1) = (-2, 6, 4)$.
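The computation above is easy to verify numerically. A minimal sketch in Python with numpy, with the matrix entries copied from this example:

```python
import numpy as np

# Standard matrix for T : R^5 -> R^3 from Example 8.1.1.
A = np.array([[2, 3, 5, 7, -9],
              [0, 3, 4, 0,  2],
              [1, 0, 3, -2,  0]])

x = np.array([1, 0, 1, 0, 1])
print(A @ x)  # [-2  6  4], so T(1, 0, 1, 0, 1) = (-2, 6, 4)
```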

Example 8.1.2. Suppose that $A$ is the zero $m \times n$ matrix. The linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$, where $T(\mathbf{x}) = A\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^n$, is the zero transformation from $\mathbb{R}^n$ into $\mathbb{R}^m$. Clearly $T(\mathbf{x}) = \mathbf{0}$ for every $\mathbf{x} \in \mathbb{R}^n$.

Example 8.1.3. Suppose that $I$ is the identity $n \times n$ matrix. The linear operator $T : \mathbb{R}^n \to \mathbb{R}^n$, where $T(\mathbf{x}) = I\mathbf{x} = \mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^n$, is the identity operator on $\mathbb{R}^n$.

PROPOSITION 8A. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, and that $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ is the standard basis for $\mathbb{R}^n$. Then the standard matrix for $T$ is given by
$$A = (\, T(\mathbf{e}_1) \ \ldots \ T(\mathbf{e}_n) \,),$$
where $T(\mathbf{e}_j)$ is a column matrix for every $j = 1, \ldots, n$.

Proof. This follows immediately from (2).

8.2. Linear Operators on $\mathbb{R}^2$

In this section, we consider the special case when $n = m = 2$, and study linear operators on $\mathbb{R}^2$. For every $\mathbf{x} \in \mathbb{R}^2$, we shall write $\mathbf{x} = (x_1, x_2)$.

Example 8.2.1. Consider reflection across the $x_2$-axis, so that $T(x_1, x_2) = (-x_1, x_2)$. Clearly we have
$$T(\mathbf{e}_1) = \begin{pmatrix} -1 \\ 0 \end{pmatrix} \quad \text{and} \quad T(\mathbf{e}_2) = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$
and so it follows from Proposition 8A that the standard matrix is given by
$$A = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.$$
It is not difficult to see that the standard matrices for reflection across the $x_1$-axis and across the line $x_1 = x_2$ are given respectively by
$$A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Also, the standard matrix for reflection across the origin is given by
$$A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}.$$
We give a summary in the table below:

Reflection across $x_2$-axis: $y_1 = -x_1$, $y_2 = x_2$; standard matrix $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$
Reflection across $x_1$-axis: $y_1 = x_1$, $y_2 = -x_2$; standard matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
Reflection across $x_1 = x_2$: $y_1 = x_2$, $y_2 = x_1$; standard matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Reflection across origin: $y_1 = -x_1$, $y_2 = -x_2$; standard matrix $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$

Example 8.2.2. For orthogonal projection onto the $x_1$-axis, we have $T(x_1, x_2) = (x_1, 0)$, with standard matrix
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$
Similarly, the standard matrix for orthogonal projection onto the $x_2$-axis is given by
$$A = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$
We give a summary in the table below:

Orthogonal projection onto $x_1$-axis: $y_1 = x_1$, $y_2 = 0$; standard matrix $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$
Orthogonal projection onto $x_2$-axis: $y_1 = 0$, $y_2 = x_2$; standard matrix $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$

Example 8.2.3. For anticlockwise rotation by an angle $\theta$, we have $T(x_1, x_2) = (y_1, y_2)$, where
$$y_1 + \mathrm{i}y_2 = (x_1 + \mathrm{i}x_2)(\cos\theta + \mathrm{i}\sin\theta),$$
and so
$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
It follows that the standard matrix is given by
$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
We give a summary in the table below:

Anticlockwise rotation by angle $\theta$: $y_1 = x_1\cos\theta - x_2\sin\theta$, $y_2 = x_1\sin\theta + x_2\cos\theta$; standard matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
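As a numerical sanity check of the rotation matrix, here is a minimal sketch in Python with numpy; the helper name `rotation_matrix` is ours, not from the notes:

```python
import numpy as np

def rotation_matrix(theta: float) -> np.ndarray:
    """Standard matrix for anticlockwise rotation by angle theta."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# Rotating e1 = (1, 0) anticlockwise by pi/2 should give e2 = (0, 1).
print(np.allclose(rotation_matrix(np.pi / 2) @ np.array([1, 0]),
                  np.array([0, 1])))  # True
```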

Example 8.2.4. For contraction or dilation by a non-negative scalar $k$, we have $T(x_1, x_2) = (kx_1, kx_2)$, with standard matrix
$$A = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}.$$
The operator is called a contraction if $0 < k < 1$ and a dilation if $k > 1$, and can be extended to negative values of $k$ by noting that for $k < 0$, we have
$$\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} -k & 0 \\ 0 & -k \end{pmatrix}.$$
This describes contraction or dilation by non-negative scalar $-k$ followed by reflection across the origin. We give a summary in the table below:

Contraction or dilation by factor $k$: $y_1 = kx_1$, $y_2 = kx_2$; standard matrix $\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}$

Example 8.2.5. For expansion or compression in the $x_1$-direction by a positive factor $k$, we have $T(x_1, x_2) = (kx_1, x_2)$, with standard matrix
$$A = \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}.$$
This can be extended to negative values of $k$ by noting that for $k < 0$, we have
$$\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} -k & 0 \\ 0 & 1 \end{pmatrix}.$$
This describes expansion or compression in the $x_1$-direction by positive factor $-k$ followed by reflection across the $x_2$-axis. Similarly, for expansion or compression in the $x_2$-direction by a non-zero factor $k$, we have the standard matrix
$$A = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}.$$
We give a summary in the table below:

Expansion or compression in $x_1$-direction: $y_1 = kx_1$, $y_2 = x_2$; standard matrix $\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$
Expansion or compression in $x_2$-direction: $y_1 = x_1$, $y_2 = kx_2$; standard matrix $\begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}$

Example 8.2.6. For shears in the $x_1$-direction with factor $k$, we have $T(x_1, x_2) = (x_1 + kx_2, x_2)$, with standard matrix
$$A = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}.$$
[Figures illustrating the effect of the shear $T$ on a grid of points, for the cases $k = 1$ and $k = -1$, are omitted here.]
Similarly, for shears in the $x_2$-direction with factor $k$, we have standard matrix
$$A = \begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}.$$
We give a summary in the table below:

Shear in $x_1$-direction: $y_1 = x_1 + kx_2$, $y_2 = x_2$; standard matrix $\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$
Shear in $x_2$-direction: $y_1 = x_1$, $y_2 = kx_1 + x_2$; standard matrix $\begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}$

Example 8.2.7. Consider a linear operator $T : \mathbb{R}^2 \to \mathbb{R}^2$ which consists of a reflection across the $x_2$-axis, followed by a shear in the $x_1$-direction with factor 3 and then reflection across the $x_1$-axis. To find the standard matrix, consider the effect of $T$ on a standard basis $\{\mathbf{e}_1, \mathbf{e}_2\}$ of $\mathbb{R}^2$. Note that
$$\mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \mapsto \begin{pmatrix} -1 \\ 0 \end{pmatrix} \mapsto \begin{pmatrix} -1 \\ 0 \end{pmatrix} \mapsto \begin{pmatrix} -1 \\ 0 \end{pmatrix} = T(\mathbf{e}_1),$$
$$\mathbf{e}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 0 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 3 \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} 3 \\ -1 \end{pmatrix} = T(\mathbf{e}_2),$$
so it follows from Proposition 8A that the standard matrix for $T$ is
$$A = \begin{pmatrix} -1 & 3 \\ 0 & -1 \end{pmatrix}.$$
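The same standard matrix can be obtained by multiplying the three matrices from the tables above; note that the operator applied first appears rightmost in the product. A minimal sketch in Python with numpy:

```python
import numpy as np

reflect_x2 = np.array([[-1, 0], [0,  1]])  # reflection across the x2-axis
shear_x1   = np.array([[ 1, 3], [0,  1]])  # shear in x1-direction, factor 3
reflect_x1 = np.array([[ 1, 0], [0, -1]])  # reflection across the x1-axis

# Applied in the order: reflect_x2, then shear_x1, then reflect_x1.
A = reflect_x1 @ shear_x1 @ reflect_x2
print(A)  # [[-1  3]
          #  [ 0 -1]]
```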

Let us summarize the above and consider a few special cases. We have the following table of invertible linear operators with $k \neq 0$. Clearly, if $A$ is the standard matrix for an invertible linear operator $T$, then the inverse matrix $A^{-1}$ is the standard matrix for the inverse linear operator $T^{-1}$.

Reflection across line $x_1 = x_2$: $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $A^{-1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; $T^{-1}$ is reflection across line $x_1 = x_2$
Expansion or compression in $x_1$-direction: $A = \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$, $A^{-1} = \begin{pmatrix} k^{-1} & 0 \\ 0 & 1 \end{pmatrix}$; $T^{-1}$ is expansion or compression in $x_1$-direction
Expansion or compression in $x_2$-direction: $A = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}$, $A^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & k^{-1} \end{pmatrix}$; $T^{-1}$ is expansion or compression in $x_2$-direction
Shear in $x_1$-direction: $A = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$, $A^{-1} = \begin{pmatrix} 1 & -k \\ 0 & 1 \end{pmatrix}$; $T^{-1}$ is shear in $x_1$-direction
Shear in $x_2$-direction: $A = \begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}$, $A^{-1} = \begin{pmatrix} 1 & 0 \\ -k & 1 \end{pmatrix}$; $T^{-1}$ is shear in $x_2$-direction

Next, we relate these operators to elementary row operations. An elementary row operation performed on a $2 \times 2$ matrix $A$ has the effect of multiplying the matrix $A$ by some elementary matrix $E$ to give the product $EA$. We have the following table.

Interchanging the two rows: $E = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
Multiplying row 1 by non-zero factor $k$: $E = \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$
Multiplying row 2 by non-zero factor $k$: $E = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}$
Adding $k$ times row 2 to row 1: $E = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$
Adding $k$ times row 1 to row 2: $E = \begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}$

Now, we know that any invertible matrix $A$ can be reduced to the identity matrix by a finite number of elementary row operations. In other words, there exist finitely many elementary matrices $E_1, \ldots, E_s$ of the types above, with various non-zero values of $k$, such that
$$E_s \cdots E_1 A = I,$$
so that
$$A = E_1^{-1} \cdots E_s^{-1}.$$
We have proved the following result.

PROPOSITION 8B. Suppose that the linear operator $T : \mathbb{R}^2 \to \mathbb{R}^2$ has standard matrix $A$, where $A$ is invertible. Then $T$ is the product of a succession of finitely many reflections, expansions, compressions and shears.

In fact, we can prove the following result concerning images of straight lines.

PROPOSITION 8C. Suppose that the linear operator $T : \mathbb{R}^2 \to \mathbb{R}^2$ has standard matrix $A$, where $A$ is invertible. Then
(a) the image under $T$ of a straight line is a straight line;
(b) the image under $T$ of a straight line through the origin is a straight line through the origin; and
(c) the images under $T$ of parallel straight lines are parallel straight lines.

Proof. Suppose that $T(x_1, x_2) = (y_1, y_2)$. Since $A$ is invertible, we have $\mathbf{x} = A^{-1}\mathbf{y}$, where
$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad \text{and} \quad \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$
The equation of a straight line is given by $\alpha x_1 + \beta x_2 = \gamma$ or, in matrix form, by
$$(\alpha \ \ \beta) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = (\gamma).$$
Hence
$$(\alpha \ \ \beta) A^{-1} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = (\gamma).$$
Let
$$(\alpha' \ \ \beta') = (\alpha \ \ \beta) A^{-1}.$$
Then
$$(\alpha' \ \ \beta') \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = (\gamma).$$
In other words, the image under $T$ of the straight line $\alpha x_1 + \beta x_2 = \gamma$ is $\alpha' y_1 + \beta' y_2 = \gamma$, clearly another straight line. This proves (a). To prove (b), note that straight lines through the origin correspond to $\gamma = 0$. To prove (c), note that parallel straight lines correspond to different values of $\gamma$ for the same values of $\alpha$ and $\beta$.

8.3. Elementary Properties of Euclidean Linear Transformations

In this section, we establish a number of simple properties of euclidean linear transformations.

PROPOSITION 8D. Suppose that $T_1 : \mathbb{R}^n \to \mathbb{R}^m$ and $T_2 : \mathbb{R}^m \to \mathbb{R}^k$ are linear transformations. Then $T = T_2 \circ T_1 : \mathbb{R}^n \to \mathbb{R}^k$ is also a linear transformation.

Proof. Since $T_1$ and $T_2$ are linear transformations, they have standard matrices $A_1$ and $A_2$ respectively. In other words, we have $T_1(\mathbf{x}) = A_1\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^n$ and $T_2(\mathbf{y}) = A_2\mathbf{y}$ for every $\mathbf{y} \in \mathbb{R}^m$. It follows that $T(\mathbf{x}) = T_2(T_1(\mathbf{x})) = A_2A_1\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^n$, so that $T$ has standard matrix $A_2A_1$.

Example 8.3.1. Suppose that $T_1 : \mathbb{R}^2 \to \mathbb{R}^2$ is anticlockwise rotation by $\pi/2$ and $T_2 : \mathbb{R}^2 \to \mathbb{R}^2$ is orthogonal projection onto the $x_1$-axis. Then the respective standard matrices are
$$A_1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$
It follows that the standard matrices for $T_2 \circ T_1$ and $T_1 \circ T_2$ are respectively
$$A_2A_1 = \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad A_1A_2 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$
Hence $T_2 \circ T_1$ and $T_1 \circ T_2$ are not equal.
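A quick numerical confirmation that the two compositions differ, as a minimal sketch in Python with numpy:

```python
import numpy as np

A1 = np.array([[0, -1], [1, 0]])  # anticlockwise rotation by pi/2
A2 = np.array([[1,  0], [0, 0]])  # orthogonal projection onto the x1-axis

print(A2 @ A1)  # standard matrix for T2 o T1
print(A1 @ A2)  # standard matrix for T1 o T2 -- a different matrix
```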

Example 8.3.2. Suppose that $T_1 : \mathbb{R}^2 \to \mathbb{R}^2$ is anticlockwise rotation by $\theta$ and $T_2 : \mathbb{R}^2 \to \mathbb{R}^2$ is anticlockwise rotation by $\phi$. Then the respective standard matrices are
$$A_1 = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}.$$
It follows that the standard matrix for $T_2 \circ T_1$ is
$$A_2A_1 = \begin{pmatrix} \cos\phi\cos\theta - \sin\phi\sin\theta & -\cos\phi\sin\theta - \sin\phi\cos\theta \\ \sin\phi\cos\theta + \cos\phi\sin\theta & \cos\phi\cos\theta - \sin\phi\sin\theta \end{pmatrix} = \begin{pmatrix} \cos(\phi + \theta) & -\sin(\phi + \theta) \\ \sin(\phi + \theta) & \cos(\phi + \theta) \end{pmatrix}.$$
Hence $T_2 \circ T_1$ is anticlockwise rotation by $\phi + \theta$.
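This angle-addition identity can likewise be checked numerically for arbitrary test angles; a minimal sketch in Python with numpy:

```python
import numpy as np

def rotation(theta: float) -> np.ndarray:
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta, phi = 0.4, 1.1  # arbitrary test angles
print(np.allclose(rotation(phi) @ rotation(theta),
                  rotation(phi + theta)))  # True
```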

Example 8.3.3. The reader should check that in $\mathbb{R}^2$, reflection across the $x_1$-axis followed by reflection across the $x_2$-axis gives reflection across the origin.

Definition. A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is said to be one-to-one if for every $\mathbf{x}', \mathbf{x}'' \in \mathbb{R}^n$, we have $\mathbf{x}' = \mathbf{x}''$ whenever $T(\mathbf{x}') = T(\mathbf{x}'')$.

Example 8.3.4. If we consider linear operators $T : \mathbb{R}^2 \to \mathbb{R}^2$, then $T$ is one-to-one precisely when the standard matrix $A$ is invertible. To see this, suppose first of all that $A$ is invertible. If $T(\mathbf{x}') = T(\mathbf{x}'')$, then $A\mathbf{x}' = A\mathbf{x}''$. Multiplying on the left by $A^{-1}$, we obtain $\mathbf{x}' = \mathbf{x}''$. Suppose next that $A$ is not invertible. Then there exists $\mathbf{x} \in \mathbb{R}^2$ such that $\mathbf{x} \neq \mathbf{0}$ and $A\mathbf{x} = \mathbf{0}$. On the other hand, we clearly have $A\mathbf{0} = \mathbf{0}$. It follows that $T(\mathbf{x}) = T(\mathbf{0})$, so that $T$ is not one-to-one.

PROPOSITION 8E. Suppose that the linear operator $T : \mathbb{R}^n \to \mathbb{R}^n$ has standard matrix $A$. Then the following statements are equivalent:
(a) The matrix $A$ is invertible.
(b) The linear operator $T$ is one-to-one.
(c) The range of $T$ is $\mathbb{R}^n$; in other words, $R(T) = \mathbb{R}^n$.

Proof. ((a)⇒(b)) Suppose that $T(\mathbf{x}') = T(\mathbf{x}'')$. Then $A\mathbf{x}' = A\mathbf{x}''$. Multiplying on the left by $A^{-1}$ gives $\mathbf{x}' = \mathbf{x}''$.

((b)⇒(a)) Suppose that $T$ is one-to-one. Then the system $A\mathbf{x} = \mathbf{0}$ has unique solution $\mathbf{x} = \mathbf{0}$ in $\mathbb{R}^n$. It follows that $A$ can be reduced by elementary row operations to the identity matrix $I$, and is therefore invertible.

((a)⇒(c)) For any $\mathbf{y} \in \mathbb{R}^n$, clearly $\mathbf{x} = A^{-1}\mathbf{y}$ satisfies $A\mathbf{x} = \mathbf{y}$, so that $T(\mathbf{x}) = \mathbf{y}$.

((c)⇒(a)) Suppose that $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ is the standard basis for $\mathbb{R}^n$. Let $\mathbf{x}_1, \ldots, \mathbf{x}_n \in \mathbb{R}^n$ be chosen to satisfy $T(\mathbf{x}_j) = \mathbf{e}_j$, so that $A\mathbf{x}_j = \mathbf{e}_j$, for every $j = 1, \ldots, n$. Write
$$C = (\, \mathbf{x}_1 \ \ldots \ \mathbf{x}_n \,).$$
Then $AC = I$, so that $A$ is invertible.

Definition. Suppose that the linear operator $T : \mathbb{R}^n \to \mathbb{R}^n$ has standard matrix $A$, where $A$ is invertible. Then the linear operator $T^{-1} : \mathbb{R}^n \to \mathbb{R}^n$, defined by $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^n$, is called the inverse of the linear operator $T$.

Remark. Clearly $T^{-1}(T(\mathbf{x})) = \mathbf{x}$ and $T(T^{-1}(\mathbf{x})) = \mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^n$.

Example 8.3.5. Consider the linear operator $T : \mathbb{R}^2 \to \mathbb{R}^2$, defined by $T(\mathbf{x}) = A\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^2$, where
$$A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.$$
Clearly $A$ is invertible, and
$$A^{-1} = \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}.$$
Hence the inverse linear operator is $T^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$, defined by $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^2$.

Example 8.3.6. Suppose that $T : \mathbb{R}^2 \to \mathbb{R}^2$ is anticlockwise rotation by angle $\theta$. The reader should check that $T^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$ is anticlockwise rotation by angle $2\pi - \theta$.

PROPOSITION 8F. A transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is linear if and only if the following two conditions are satisfied:
(a) For every $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$, we have $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$.
(b) For every $\mathbf{u} \in \mathbb{R}^n$ and $c \in \mathbb{R}$, we have $T(c\mathbf{u}) = cT(\mathbf{u})$.

Proof. Suppose first of all that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation. Let $A$ be the standard matrix for $T$. Then for every $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ and $c \in \mathbb{R}$, we have
$$T(\mathbf{u} + \mathbf{v}) = A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v} = T(\mathbf{u}) + T(\mathbf{v})$$
and
$$T(c\mathbf{u}) = A(c\mathbf{u}) = c(A\mathbf{u}) = cT(\mathbf{u}).$$
Suppose now that (a) and (b) hold. To show that $T$ is linear, we need to find a matrix $A$ such that $T(\mathbf{x}) = A\mathbf{x}$ for every $\mathbf{x} \in \mathbb{R}^n$. Suppose that $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ is the standard basis for $\mathbb{R}^n$. As suggested by Proposition 8A, we write
$$A = (\, T(\mathbf{e}_1) \ \ldots \ T(\mathbf{e}_n) \,),$$
where $T(\mathbf{e}_j)$ is a column matrix for every $j = 1, \ldots, n$. For any vector
$$\mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$
in $\mathbb{R}^n$, we have
$$A\mathbf{x} = (\, T(\mathbf{e}_1) \ \ldots \ T(\mathbf{e}_n) \,) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = x_1T(\mathbf{e}_1) + \ldots + x_nT(\mathbf{e}_n).$$
Using (b) on each summand and then using (a) inductively, we obtain
$$A\mathbf{x} = T(x_1\mathbf{e}_1) + \ldots + T(x_n\mathbf{e}_n) = T(x_1\mathbf{e}_1 + \ldots + x_n\mathbf{e}_n) = T(\mathbf{x})$$
as required.

To conclude our study of euclidean linear transformations, we briefly mention the problem of eigenvalues and eigenvectors of euclidean linear operators.

Definition. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^n$ is a linear operator. Then any real number $\lambda \in \mathbb{R}$ is called an eigenvalue of $T$ if there exists a non-zero vector $\mathbf{x} \in \mathbb{R}^n$ such that $T(\mathbf{x}) = \lambda\mathbf{x}$. This non-zero vector $\mathbf{x} \in \mathbb{R}^n$ is called an eigenvector of $T$ corresponding to the eigenvalue $\lambda$.

Remark. Note that the equation $T(\mathbf{x}) = \lambda\mathbf{x}$ is equivalent to the equation $A\mathbf{x} = \lambda\mathbf{x}$. It follows that there is no distinction between eigenvalues and eigenvectors of $T$ and those of the standard matrix $A$. We therefore do not need to discuss this problem any further.

8.4. General Linear Transformations

By a transformation from $V$ into $W$, we mean a function of the type $T : V \to W$, with domain $V$ and codomain $W$. For every vector $\mathbf{u} \in V$, the vector $T(\mathbf{u}) \in W$ is called the image of $\mathbf{u}$ under the transformation $T$.

Definition. A transformation $T : V \to W$ from a real vector space $V$ into a real vector space $W$ is called a linear transformation if the following two conditions are satisfied:
(LT1) For every $\mathbf{u}, \mathbf{v} \in V$, we have $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$.
(LT2) For every $\mathbf{u} \in V$ and $c \in \mathbb{R}$, we have $T(c\mathbf{u}) = cT(\mathbf{u})$.

Definition. A linear transformation $T : V \to V$ from a real vector space $V$ into itself is called a linear operator on $V$.

Example 8.4.1. Suppose that $V$ and $W$ are two real vector spaces. The transformation $T : V \to W$, where $T(\mathbf{u}) = \mathbf{0}$ for every $\mathbf{u} \in V$, is clearly linear, and is called the zero transformation from $V$ to $W$.

Example 8.4.2. Suppose that $V$ is a real vector space. The transformation $I : V \to V$, where $I(\mathbf{u}) = \mathbf{u}$ for every $\mathbf{u} \in V$, is clearly linear, and is called the identity operator on $V$.

Example 8.4.3. Suppose that $V$ is a real vector space, and that $k \in \mathbb{R}$ is fixed. The transformation $T : V \to V$, where $T(\mathbf{u}) = k\mathbf{u}$ for every $\mathbf{u} \in V$, is clearly linear. This operator is called a dilation if $k > 1$ and a contraction if $0 < k < 1$.

Example 8.4.4. Suppose that $V$ is a finite dimensional vector space, with basis $\{\mathbf{w}_1, \ldots, \mathbf{w}_n\}$. Define a transformation $T : V \to \mathbb{R}^n$ as follows. For every $\mathbf{u} \in V$, there exists a unique vector $(\beta_1, \ldots, \beta_n) \in \mathbb{R}^n$ such that $\mathbf{u} = \beta_1\mathbf{w}_1 + \ldots + \beta_n\mathbf{w}_n$. We let $T(\mathbf{u}) = (\beta_1, \ldots, \beta_n)$. In other words, the transformation $T$ gives the coordinates of any vector $\mathbf{u} \in V$ with respect to the given basis $\{\mathbf{w}_1, \ldots, \mathbf{w}_n\}$. Suppose now that $\mathbf{v} = \gamma_1\mathbf{w}_1 + \ldots + \gamma_n\mathbf{w}_n$ is another vector in $V$. Then $\mathbf{u} + \mathbf{v} = (\beta_1 + \gamma_1)\mathbf{w}_1 + \ldots + (\beta_n + \gamma_n)\mathbf{w}_n$, so that
$$T(\mathbf{u} + \mathbf{v}) = (\beta_1 + \gamma_1, \ldots, \beta_n + \gamma_n) = (\beta_1, \ldots, \beta_n) + (\gamma_1, \ldots, \gamma_n) = T(\mathbf{u}) + T(\mathbf{v}).$$
Also, if $c \in \mathbb{R}$, then $c\mathbf{u} = c\beta_1\mathbf{w}_1 + \ldots + c\beta_n\mathbf{w}_n$, so that
$$T(c\mathbf{u}) = (c\beta_1, \ldots, c\beta_n) = c(\beta_1, \ldots, \beta_n) = cT(\mathbf{u}).$$
Hence $T$ is a linear transformation. We shall return to this in greater detail in the next section.

Example 8.4.5. Suppose that $P_n$ denotes the vector space of all polynomials with real coefficients and degree at most $n$. Define a transformation $T : P_n \to P_n$ as follows. For every polynomial
$$p = p_0 + p_1x + \ldots + p_nx^n$$
in $P_n$, we let
$$T(p) = p_n + p_{n-1}x + \ldots + p_0x^n.$$
Suppose now that $q = q_0 + q_1x + \ldots + q_nx^n$ is another polynomial in $P_n$. Then
$$p + q = (p_0 + q_0) + (p_1 + q_1)x + \ldots + (p_n + q_n)x^n,$$
so that
$$T(p + q) = (p_n + q_n) + (p_{n-1} + q_{n-1})x + \ldots + (p_0 + q_0)x^n = T(p) + T(q).$$
Also, for any $c \in \mathbb{R}$, we have $cp = cp_0 + cp_1x + \ldots + cp_nx^n$, so that
$$T(cp) = cp_n + cp_{n-1}x + \ldots + cp_0x^n = c(p_n + p_{n-1}x + \ldots + p_0x^n) = cT(p).$$
Hence $T$ is a linear transformation.

Example 8.4.6. Let $V$ denote the vector space of all real valued functions differentiable everywhere in $\mathbb{R}$, and let $W$ denote the vector space of all real valued functions defined on $\mathbb{R}$. Consider the transformation $T : V \to W$, where $T(f) = f'$ for every $f \in V$. It is easy to check from properties of derivatives that $T$ is a linear transformation.

Example 8.4.7. Let $V$ denote the vector space of all real valued functions that are Riemann integrable over the interval $[0, 1]$. Consider the transformation $T : V \to \mathbb{R}$, where
$$T(f) = \int_0^1 f(x) \, \mathrm{d}x$$
for every $f \in V$. It is easy to check from properties of the Riemann integral that $T$ is a linear transformation.

Consider a linear transformation $T : V \to W$ from a finite dimensional real vector space $V$ into a real vector space $W$. Suppose that $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis of $V$. Then every $\mathbf{u} \in V$ can be written uniquely in the form $\mathbf{u} = \beta_1\mathbf{v}_1 + \ldots + \beta_n\mathbf{v}_n$, where $\beta_1, \ldots, \beta_n \in \mathbb{R}$. It follows that
$$T(\mathbf{u}) = T(\beta_1\mathbf{v}_1 + \ldots + \beta_n\mathbf{v}_n) = T(\beta_1\mathbf{v}_1) + \ldots + T(\beta_n\mathbf{v}_n) = \beta_1T(\mathbf{v}_1) + \ldots + \beta_nT(\mathbf{v}_n).$$
We have therefore proved the following generalization of Proposition 8A.

PROPOSITION 8G. Suppose that $T : V \to W$ is a linear transformation from a finite dimensional real vector space $V$ into a real vector space $W$. Suppose further that $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis of $V$. Then $T$ is completely determined by $T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)$.

Example 8.4.8. Consider a linear transformation $T : P_2 \to \mathbb{R}$, where $T(1) = 1$, $T(x) = 2$ and $T(x^2) = 3$. Since $\{1, x, x^2\}$ is a basis of $P_2$, this linear transformation is completely determined. In particular, we have, for example,
$$T(5 - 3x + 2x^2) = 5T(1) - 3T(x) + 2T(x^2) = 5.$$

Example 8.4.9. Consider a linear transformation $T : \mathbb{R}^4 \to \mathbb{R}$, where $T(1,0,0,0) = 1$, $T(1,1,0,0) = 2$, $T(1,1,1,0) = 3$ and $T(1,1,1,1) = 4$. Since $\{(1,0,0,0), (1,1,0,0), (1,1,1,0), (1,1,1,1)\}$ is a basis of $\mathbb{R}^4$, this linear transformation is completely determined. In particular, we have, for example,
$$T(6,4,3,1) = T(2(1,0,0,0) + (1,1,0,0) + 2(1,1,1,0) + (1,1,1,1)) = 2T(1,0,0,0) + T(1,1,0,0) + 2T(1,1,1,0) + T(1,1,1,1) = 14.$$

We also have the following generalization of Proposition 8D.

PROPOSITION 8H. Suppose that $V, W, U$ are real vector spaces. Suppose further that $T_1 : V \to W$ and $T_2 : W \to U$ are linear transformations. Then $T = T_2 \circ T_1 : V \to U$ is also a linear transformation.

Proof. Suppose that $\mathbf{u}, \mathbf{v} \in V$. Then
$$T(\mathbf{u} + \mathbf{v}) = T_2(T_1(\mathbf{u} + \mathbf{v})) = T_2(T_1(\mathbf{u}) + T_1(\mathbf{v})) = T_2(T_1(\mathbf{u})) + T_2(T_1(\mathbf{v})) = T(\mathbf{u}) + T(\mathbf{v}).$$
Also, if $c \in \mathbb{R}$, then
$$T(c\mathbf{u}) = T_2(T_1(c\mathbf{u})) = T_2(cT_1(\mathbf{u})) = cT_2(T_1(\mathbf{u})) = cT(\mathbf{u}).$$


8.5. Change of Basis

Suppose that $V$ is a real vector space, with basis $B = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$. Then every vector $\mathbf{u} \in V$ can be written uniquely as a linear combination
$$\mathbf{u} = \beta_1\mathbf{u}_1 + \ldots + \beta_n\mathbf{u}_n, \quad \text{where } \beta_1, \ldots, \beta_n \in \mathbb{R}. \tag{3}$$
It follows that the vector $\mathbf{u}$ can be identified with the vector $(\beta_1, \ldots, \beta_n) \in \mathbb{R}^n$.

Definition. Suppose that $\mathbf{u} \in V$ and (3) holds. Then the matrix
$$[\mathbf{u}]_B = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix}$$
is called the coordinate matrix of $\mathbf{u}$ relative to the basis $B = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$.

Example 8.5.1. The vectors
$$\mathbf{u}_1 = (1,2,1,0), \quad \mathbf{u}_2 = (3,3,3,0), \quad \mathbf{u}_3 = (2,-10,0,0), \quad \mathbf{u}_4 = (-2,1,-6,2)$$
are linearly independent in $\mathbb{R}^4$, and so $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\}$ is a basis of $\mathbb{R}^4$. It follows that for any $\mathbf{u} = (x, y, z, w) \in \mathbb{R}^4$, we can write
$$\mathbf{u} = \beta_1\mathbf{u}_1 + \beta_2\mathbf{u}_2 + \beta_3\mathbf{u}_3 + \beta_4\mathbf{u}_4.$$
In matrix notation, this becomes
$$\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} 1 & 3 & 2 & -2 \\ 2 & 3 & -10 & 1 \\ 1 & 3 & 0 & -6 \\ 0 & 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \end{pmatrix},$$
so that
$$[\mathbf{u}]_B = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 2 & -2 \\ 2 & 3 & -10 & 1 \\ 1 & 3 & 0 & -6 \\ 0 & 0 & 0 & 2 \end{pmatrix}^{-1} \begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix}.$$

Remark. Consider a function $\phi : V \to \mathbb{R}^n$, where $\phi(\mathbf{u}) = [\mathbf{u}]_B$ for every $\mathbf{u} \in V$. It is not difficult to see that this function gives rise to a one-to-one correspondence between the elements of $V$ and the elements of $\mathbb{R}^n$. Furthermore, note that
$$[\mathbf{u} + \mathbf{v}]_B = [\mathbf{u}]_B + [\mathbf{v}]_B \quad \text{and} \quad [c\mathbf{u}]_B = c[\mathbf{u}]_B,$$
so that $\phi(\mathbf{u} + \mathbf{v}) = \phi(\mathbf{u}) + \phi(\mathbf{v})$ and $\phi(c\mathbf{u}) = c\phi(\mathbf{u})$ for every $\mathbf{u}, \mathbf{v} \in V$ and $c \in \mathbb{R}$. Thus $\phi$ is a linear transformation, and preserves much of the structure of $V$. We also say that $V$ is isomorphic to $\mathbb{R}^n$. In practice, once we have made this identification between vectors and their coordinate matrices, then we can basically forget about the basis $B$ and imagine that we are working in $\mathbb{R}^n$ with the standard basis.

Clearly, if we change from one basis $B = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ to another basis $C = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ of $V$, then we also need to find a way of calculating $[\mathbf{u}]_C$ in terms of $[\mathbf{u}]_B$ for every vector $\mathbf{u} \in V$. To do this, note that each of the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ can be written uniquely as a linear combination of the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_n$. Suppose that for $i = 1, \ldots, n$, we have

$$\mathbf{v}_i = a_{1i}\mathbf{u}_1 + \ldots + a_{ni}\mathbf{u}_n,$$
so that
$$[\mathbf{v}_i]_B = \begin{pmatrix} a_{1i} \\ \vdots \\ a_{ni} \end{pmatrix}.$$

For every $\mathbf{u} \in V$, we can write
$$\mathbf{u} = \beta_1\mathbf{u}_1 + \ldots + \beta_n\mathbf{u}_n = \gamma_1\mathbf{v}_1 + \ldots + \gamma_n\mathbf{v}_n, \quad \text{where } \beta_1, \ldots, \beta_n, \gamma_1, \ldots, \gamma_n \in \mathbb{R},$$
so that
$$[\mathbf{u}]_B = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix} \quad \text{and} \quad [\mathbf{u}]_C = \begin{pmatrix} \gamma_1 \\ \vdots \\ \gamma_n \end{pmatrix}.$$
Clearly
$$\mathbf{u} = \gamma_1\mathbf{v}_1 + \ldots + \gamma_n\mathbf{v}_n = \gamma_1(a_{11}\mathbf{u}_1 + \ldots + a_{n1}\mathbf{u}_n) + \ldots + \gamma_n(a_{1n}\mathbf{u}_1 + \ldots + a_{nn}\mathbf{u}_n) = (\gamma_1a_{11} + \ldots + \gamma_na_{1n})\mathbf{u}_1 + \ldots + (\gamma_1a_{n1} + \ldots + \gamma_na_{nn})\mathbf{u}_n = \beta_1\mathbf{u}_1 + \ldots + \beta_n\mathbf{u}_n.$$
Hence
$$\begin{aligned} \beta_1 &= \gamma_1a_{11} + \ldots + \gamma_na_{1n}, \\ &\;\;\vdots \\ \beta_n &= \gamma_1a_{n1} + \ldots + \gamma_na_{nn}. \end{aligned}$$
Written in matrix notation, we have
$$\begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix} = \begin{pmatrix} a_{11} & \ldots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \ldots & a_{nn} \end{pmatrix} \begin{pmatrix} \gamma_1 \\ \vdots \\ \gamma_n \end{pmatrix}.$$

We have proved the following result.

PROPOSITION 8J. Suppose that $B = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ and $C = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ are two bases of a real vector space $V$. Then for every $\mathbf{u} \in V$, we have
$$[\mathbf{u}]_B = P[\mathbf{u}]_C,$$
where the columns of the matrix
$$P = (\, [\mathbf{v}_1]_B \ \ldots \ [\mathbf{v}_n]_B \,)$$
are precisely the coordinate matrices of the elements of $C$ relative to the basis $B$.

Remark. Strictly speaking, Proposition 8J gives $[\mathbf{u}]_B$ in terms of $[\mathbf{u}]_C$. However, note that the matrix $P$ is invertible (why?), so that $[\mathbf{u}]_C = P^{-1}[\mathbf{u}]_B$.

Definition. The matrix $P$ in Proposition 8J is sometimes called the transition matrix from the basis $C$ to the basis $B$.

Example 8.5.2. We know that with
$$\mathbf{u}_1 = (1,2,1,0), \quad \mathbf{u}_2 = (3,3,3,0), \quad \mathbf{u}_3 = (2,-10,0,0), \quad \mathbf{u}_4 = (-2,1,-6,2),$$
and with
$$\mathbf{v}_1 = (1,2,1,0), \quad \mathbf{v}_2 = (1,-1,1,0), \quad \mathbf{v}_3 = (1,0,-1,0), \quad \mathbf{v}_4 = (0,0,0,2),$$
both $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\}$ and $C = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ are bases of $\mathbb{R}^4$. It is easy to check that
$$\begin{aligned} \mathbf{v}_1 &= \mathbf{u}_1, \\ \mathbf{v}_2 &= -2\mathbf{u}_1 + \mathbf{u}_2, \\ \mathbf{v}_3 &= 11\mathbf{u}_1 - 4\mathbf{u}_2 + \mathbf{u}_3, \\ \mathbf{v}_4 &= -27\mathbf{u}_1 + 11\mathbf{u}_2 - 2\mathbf{u}_3 + \mathbf{u}_4, \end{aligned}$$
so that
$$P = (\, [\mathbf{v}_1]_B \ [\mathbf{v}_2]_B \ [\mathbf{v}_3]_B \ [\mathbf{v}_4]_B \,) = \begin{pmatrix} 1 & -2 & 11 & -27 \\ 0 & 1 & -4 & 11 \\ 0 & 0 & 1 & -2 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Hence $[\mathbf{u}]_B = P[\mathbf{u}]_C$ for every $\mathbf{u} \in \mathbb{R}^4$. It is also easy to check that
$$\begin{aligned} \mathbf{u}_1 &= \mathbf{v}_1, \\ \mathbf{u}_2 &= 2\mathbf{v}_1 + \mathbf{v}_2, \\ \mathbf{u}_3 &= -3\mathbf{v}_1 + 4\mathbf{v}_2 + \mathbf{v}_3, \\ \mathbf{u}_4 &= -\mathbf{v}_1 - 3\mathbf{v}_2 + 2\mathbf{v}_3 + \mathbf{v}_4, \end{aligned}$$
so that
$$Q = (\, [\mathbf{u}_1]_C \ [\mathbf{u}_2]_C \ [\mathbf{u}_3]_C \ [\mathbf{u}_4]_C \,) = \begin{pmatrix} 1 & 2 & -3 & -1 \\ 0 & 1 & 4 & -3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Hence $[\mathbf{u}]_C = Q[\mathbf{u}]_B$ for every $\mathbf{u} \in \mathbb{R}^4$. Note that $PQ = I$. Now let $\mathbf{u} = (6,-1,2,2)$. We can check that $\mathbf{u} = \mathbf{v}_1 + 3\mathbf{v}_2 + 2\mathbf{v}_3 + \mathbf{v}_4$, so that
$$[\mathbf{u}]_C = \begin{pmatrix} 1 \\ 3 \\ 2 \\ 1 \end{pmatrix}.$$
Then
$$[\mathbf{u}]_B = \begin{pmatrix} 1 & -2 & 11 & -27 \\ 0 & 1 & -4 & 11 \\ 0 & 0 & 1 & -2 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -10 \\ 6 \\ 0 \\ 1 \end{pmatrix}.$$


Example 8.5.3. Consider the vector space $P_2$. It is not too difficult to check that
$$u_1 = 1 + x, \quad u_2 = 1 + x^2, \quad u_3 = x + x^2$$
form a basis of $P_2$. Let $u = 1 + 4x - x^2$. Then $u = \beta_1u_1 + \beta_2u_2 + \beta_3u_3$, where
$$1 + 4x - x^2 = \beta_1(1 + x) + \beta_2(1 + x^2) + \beta_3(x + x^2) = (\beta_1 + \beta_2) + (\beta_1 + \beta_3)x + (\beta_2 + \beta_3)x^2,$$
so that $\beta_1 + \beta_2 = 1$, $\beta_1 + \beta_3 = 4$ and $\beta_2 + \beta_3 = -1$. Hence $(\beta_1, \beta_2, \beta_3) = (3, -2, 1)$. If we write $B = \{u_1, u_2, u_3\}$, then
$$[u]_B = \begin{pmatrix} 3 \\ -2 \\ 1 \end{pmatrix}.$$
On the other hand, it is also not too difficult to check that
$$v_1 = 1, \quad v_2 = 1 + x, \quad v_3 = 1 + x + x^2$$
form a basis of $P_2$. Also $u = \gamma_1v_1 + \gamma_2v_2 + \gamma_3v_3$, where
$$1 + 4x - x^2 = \gamma_1 + \gamma_2(1 + x) + \gamma_3(1 + x + x^2) = (\gamma_1 + \gamma_2 + \gamma_3) + (\gamma_2 + \gamma_3)x + \gamma_3x^2,$$
so that $\gamma_1 + \gamma_2 + \gamma_3 = 1$, $\gamma_2 + \gamma_3 = 4$ and $\gamma_3 = -1$. Hence $(\gamma_1, \gamma_2, \gamma_3) = (-3, 5, -1)$. If we write $C = \{v_1, v_2, v_3\}$, then
$$[u]_C = \begin{pmatrix} -3 \\ 5 \\ -1 \end{pmatrix}.$$
Next, note that
$$v_1 = \tfrac{1}{2}u_1 + \tfrac{1}{2}u_2 - \tfrac{1}{2}u_3, \quad v_2 = u_1, \quad v_3 = \tfrac{1}{2}u_1 + \tfrac{1}{2}u_2 + \tfrac{1}{2}u_3.$$
Hence
$$P = (\, [v_1]_B \ [v_2]_B \ [v_3]_B \,) = \begin{pmatrix} 1/2 & 1 & 1/2 \\ 1/2 & 0 & 1/2 \\ -1/2 & 0 & 1/2 \end{pmatrix}.$$
To verify that $[u]_B = P[u]_C$, note that
$$\begin{pmatrix} 3 \\ -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1/2 & 1 & 1/2 \\ 1/2 & 0 & 1/2 \\ -1/2 & 0 & 1/2 \end{pmatrix} \begin{pmatrix} -3 \\ 5 \\ -1 \end{pmatrix}.$$

8.6. Kernel and Range

Consider first of all a euclidean linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$. Suppose that $A$ is the standard matrix for $T$. Then the range of the transformation $T$ is given by
$$R(T) = \{T(\mathbf{x}) : \mathbf{x} \in \mathbb{R}^n\} = \{A\mathbf{x} : \mathbf{x} \in \mathbb{R}^n\}.$$
It follows that $R(T)$ is the set of all linear combinations of the columns of the matrix $A$, and is therefore the column space of $A$. On the other hand, the set
$$\{\mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0}\}$$
is the nullspace of $A$.

Recall that the sum of the dimension of the nullspace of $A$ and the dimension of the column space of $A$ is equal to the number of columns of $A$. This is known as the Rank-nullity theorem. The purpose of this section is to extend this result to the setting of linear transformations. To do this, we need the following generalization of the idea of the nullspace and the column space.

Definition. Suppose that $T : V \to W$ is a linear transformation from a real vector space $V$ into a real vector space $W$. Then the set
$$\ker(T) = \{\mathbf{u} \in V : T(\mathbf{u}) = \mathbf{0}\}$$
is called the kernel of $T$, and the set
$$R(T) = \{T(\mathbf{u}) : \mathbf{u} \in V\}$$
is called the range of $T$.

Example 8.6.1. For a euclidean linear transformation $T$ with standard matrix $A$, we have shown that $\ker(T)$ is the nullspace of $A$, while $R(T)$ is the column space of $A$.

Example 8.6.2. Suppose that $T : V \to W$ is the zero transformation. Clearly we have $\ker(T) = V$ and $R(T) = \{\mathbf{0}\}$.

Example 8.6.3. Suppose that $T : V \to V$ is the identity operator on $V$. Clearly we have $\ker(T) = \{\mathbf{0}\}$ and $R(T) = V$.

Example 8.6.4. Suppose that $T : \mathbb{R}^2 \to \mathbb{R}^2$ is orthogonal projection onto the $x_1$-axis. Then $\ker(T)$ is the $x_2$-axis, while $R(T)$ is the $x_1$-axis.

Example 8.6.5. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one. Then $\ker(T) = \{\mathbf{0}\}$ and $R(T) = \mathbb{R}^n$, in view of Proposition 8E.

Example 8.6.6. Consider the linear transformation $T : V \to W$, where $V$ denotes the vector space of all real valued functions differentiable everywhere in $\mathbb{R}$, where $W$ denotes the space of all real valued functions defined on $\mathbb{R}$, and where $T(f) = f'$ for every $f \in V$. Then $\ker(T)$ is the set of all differentiable functions with derivative 0, and so is the set of all constant functions on $\mathbb{R}$.

Example 8.6.7. Consider the linear transformation $T : V \to \mathbb{R}$, where $V$ denotes the vector space of all real valued functions Riemann integrable over the interval $[0, 1]$, and where
$$T(f) = \int_0^1 f(x) \, \mathrm{d}x$$
for every $f \in V$. Then $\ker(T)$ is the set of all Riemann integrable functions on $[0, 1]$ with zero mean, while $R(T) = \mathbb{R}$.

PROPOSITION 8K. Suppose that $T : V \to W$ is a linear transformation from a real vector space $V$ into a real vector space $W$. Then $\ker(T)$ is a subspace of $V$, while $R(T)$ is a subspace of $W$.

Proof. Since $T(\mathbf{0}) = \mathbf{0}$, it follows that $\mathbf{0} \in \ker(T) \subseteq V$ and $\mathbf{0} \in R(T) \subseteq W$. For any $\mathbf{u}, \mathbf{v} \in \ker(T)$, we have
$$T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) = \mathbf{0} + \mathbf{0} = \mathbf{0},$$
so that $\mathbf{u} + \mathbf{v} \in \ker(T)$. Suppose further that $c \in \mathbb{R}$. Then
$$T(c\mathbf{u}) = cT(\mathbf{u}) = c\mathbf{0} = \mathbf{0},$$
so that $c\mathbf{u} \in \ker(T)$. Hence $\ker(T)$ is a subspace of $V$. Suppose next that $\mathbf{w}, \mathbf{z} \in R(T)$. Then there exist $\mathbf{u}, \mathbf{v} \in V$ such that $T(\mathbf{u}) = \mathbf{w}$ and $T(\mathbf{v}) = \mathbf{z}$. Hence
$$T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) = \mathbf{w} + \mathbf{z},$$
so that $\mathbf{w} + \mathbf{z} \in R(T)$. Suppose further that $c \in \mathbb{R}$. Then
$$T(c\mathbf{u}) = cT(\mathbf{u}) = c\mathbf{w},$$
so that $c\mathbf{w} \in R(T)$. Hence $R(T)$ is a subspace of $W$.

To complete this section, we prove the following generalization of the Rank-nullity theorem.

PROPOSITION 8L. Suppose that $T : V \to W$ is a linear transformation from an $n$-dimensional real vector space $V$ into a real vector space $W$. Then
$$\dim\ker(T) + \dim R(T) = n.$$

Proof. Suppose first of all that $\dim\ker(T) = n$. Then $\ker(T) = V$, and so $R(T) = \{\mathbf{0}\}$, and the result follows immediately. Suppose next that $\dim\ker(T) = 0$, so that $\ker(T) = \{\mathbf{0}\}$. If $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis of $V$, then it follows that $T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)$ are linearly independent in $W$, for otherwise there exist $c_1, \ldots, c_n \in \mathbb{R}$, not all zero, such that
$$c_1T(\mathbf{v}_1) + \ldots + c_nT(\mathbf{v}_n) = \mathbf{0},$$
so that $T(c_1\mathbf{v}_1 + \ldots + c_n\mathbf{v}_n) = \mathbf{0}$, a contradiction since $c_1\mathbf{v}_1 + \ldots + c_n\mathbf{v}_n \neq \mathbf{0}$. On the other hand, elements of $R(T)$ are linear combinations of $T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)$. Hence $\dim R(T) = n$, and the result again follows immediately. We may therefore assume that $\dim\ker(T) = r$, where $1 \leq r < n$. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_r\}$ be a basis of $\ker(T)$. This basis can be extended to a basis $\{\mathbf{v}_1, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ of $V$. It suffices to show that
$$\{T(\mathbf{v}_{r+1}), \ldots, T(\mathbf{v}_n)\} \tag{4}$$
is a basis of $R(T)$. Suppose that $\mathbf{u} \in V$. Then there exist $\beta_1, \ldots, \beta_n \in \mathbb{R}$ such that
$$\mathbf{u} = \beta_1\mathbf{v}_1 + \ldots + \beta_r\mathbf{v}_r + \beta_{r+1}\mathbf{v}_{r+1} + \ldots + \beta_n\mathbf{v}_n,$$
so that
$$T(\mathbf{u}) = \beta_1T(\mathbf{v}_1) + \ldots + \beta_rT(\mathbf{v}_r) + \beta_{r+1}T(\mathbf{v}_{r+1}) + \ldots + \beta_nT(\mathbf{v}_n) = \beta_{r+1}T(\mathbf{v}_{r+1}) + \ldots + \beta_nT(\mathbf{v}_n).$$
It follows that (4) spans $R(T)$. It remains to prove that its elements are linearly independent. Suppose that $c_{r+1}, \ldots, c_n \in \mathbb{R}$ and

$$c_{r+1}T(\mathbf{v}_{r+1}) + \ldots + c_nT(\mathbf{v}_n) = \mathbf{0}. \tag{5}$$
We need to show that
$$c_{r+1} = \ldots = c_n = 0. \tag{6}$$
By linearity, it follows from (5) that $T(c_{r+1}\mathbf{v}_{r+1} + \ldots + c_n\mathbf{v}_n) = \mathbf{0}$, so that
$$c_{r+1}\mathbf{v}_{r+1} + \ldots + c_n\mathbf{v}_n \in \ker(T).$$
Hence there exist $c_1, \ldots, c_r \in \mathbb{R}$ such that
$$c_{r+1}\mathbf{v}_{r+1} + \ldots + c_n\mathbf{v}_n = c_1\mathbf{v}_1 + \ldots + c_r\mathbf{v}_r,$$
so that
$$c_1\mathbf{v}_1 + \ldots + c_r\mathbf{v}_r - c_{r+1}\mathbf{v}_{r+1} - \ldots - c_n\mathbf{v}_n = \mathbf{0}.$$
Since $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis of $V$, it follows that $c_1 = \ldots = c_r = c_{r+1} = \ldots = c_n = 0$, so that (6) holds. This completes the proof.
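In the euclidean case $T(\mathbf{x}) = A\mathbf{x}$, Proposition 8L can be observed numerically: the rank of $A$ is $\dim R(T)$ and the nullity is $n$ minus the rank. A minimal sketch in Python with numpy, reusing the $3 \times 5$ matrix of Example 8.1.1:

```python
import numpy as np

A = np.array([[2, 3, 5, 7, -9],
              [0, 3, 4, 0,  2],
              [1, 0, 3, -2,  0]])

n = A.shape[1]                   # dimension of the domain R^5
rank = np.linalg.matrix_rank(A)  # dim R(T)
nullity = n - rank               # dim ker(T)
print(rank, nullity, rank + nullity == n)  # 3 2 True
```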

Remark. We sometimes say that $\dim R(T)$ and $\dim\ker(T)$ are respectively the rank and the nullity of the linear transformation $T$.

8.7. Inverse Linear Transformations

In this section, we generalize some of the ideas first discussed in Section 8.3.

Definition. A linear transformation $T : V \to W$ from a real vector space $V$ into a real vector space $W$ is said to be one-to-one if for every $\mathbf{u}', \mathbf{u}'' \in V$, we have $\mathbf{u}' = \mathbf{u}''$ whenever $T(\mathbf{u}') = T(\mathbf{u}'')$.

The result below follows immediately from our definition.

PROPOSITION 8M. Suppose that $T : V \to W$ is a linear transformation from a real vector space $V$ into a real vector space $W$. Then $T$ is one-to-one if and only if $\ker(T) = \{\mathbf{0}\}$.

Proof. (⇒) Clearly $\mathbf{0} \in \ker(T)$. Suppose that $\ker(T) \neq \{\mathbf{0}\}$. Then there exists a non-zero $\mathbf{v} \in \ker(T)$. It follows that $T(\mathbf{v}) = T(\mathbf{0})$, and so $T$ is not one-to-one.

(⇐) Suppose that $\ker(T) = \{\mathbf{0}\}$. Given any $\mathbf{u}', \mathbf{u}'' \in V$, we have
$$T(\mathbf{u}') - T(\mathbf{u}'') = T(\mathbf{u}' - \mathbf{u}'') = \mathbf{0}$$
if and only if $\mathbf{u}' - \mathbf{u}'' = \mathbf{0}$; in other words, if and only if $\mathbf{u}' = \mathbf{u}''$.

We have the following generalization of Proposition 8E.

PROPOSITION 8N. Suppose that $T : V \to V$ is a linear operator on a finite-dimensional real vector space $V$. Then the following statements are equivalent:
(a) The linear operator $T$ is one-to-one.
(b) We have $\ker(T) = \{\mathbf{0}\}$.
(c) The range of $T$ is $V$; in other words, $R(T) = V$.

Proof. The equivalence of (a) and (b) is established by Proposition 8M. The equivalence of (b) and (c) follows immediately from Proposition 8L.

Suppose that $T : V \to W$ is a one-to-one linear transformation from a real vector space $V$ into a real vector space $W$. Then for every $\mathbf{w} \in R(T)$, there exists exactly one $\mathbf{u} \in V$ such that $T(\mathbf{u}) = \mathbf{w}$. We can therefore define a transformation $T^{-1} : R(T) \to V$ by writing $T^{-1}(\mathbf{w}) = \mathbf{u}$, where $\mathbf{u} \in V$ is the unique vector satisfying $T(\mathbf{u}) = \mathbf{w}$.

PROPOSITION 8P. Suppose that $T : V \to W$ is a one-to-one linear transformation from a real vector space $V$ into a real vector space $W$. Then $T^{-1} : R(T) \to V$ is a linear transformation.

Proof. Suppose that $\mathbf{w}, \mathbf{z} \in R(T)$. Then there exist $\mathbf{u}, \mathbf{v} \in V$ such that $T^{-1}(\mathbf{w}) = \mathbf{u}$ and $T^{-1}(\mathbf{z}) = \mathbf{v}$. It follows that $T(\mathbf{u}) = \mathbf{w}$ and $T(\mathbf{v}) = \mathbf{z}$, so that $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) = \mathbf{w} + \mathbf{z}$, whence
$$T^{-1}(\mathbf{w} + \mathbf{z}) = \mathbf{u} + \mathbf{v} = T^{-1}(\mathbf{w}) + T^{-1}(\mathbf{z}).$$
Suppose further that $c \in \mathbb{R}$. Then $T(c\mathbf{u}) = c\mathbf{w}$, so that
$$T^{-1}(c\mathbf{w}) = c\mathbf{u} = cT^{-1}(\mathbf{w}).$$
This completes the proof.

We also have the following result concerning compositions of linear transformations, which requires no further proof, in view of our knowledge concerning inverse functions.

PROPOSITION 8Q. Suppose that $V, W, U$ are real vector spaces. Suppose further that $T_1 : V \to W$ and $T_2 : W \to U$ are one-to-one linear transformations. Then
(a) the linear transformation $T_2 \circ T_1 : V \to U$ is one-to-one; and
(b) $(T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1}$.

8.8. Matrices of General Linear Transformations

Suppose that $T : V \to W$ is a linear transformation from a real vector space $V$ to a real vector space $W$. Suppose further that the vector spaces $V$ and $W$ are finite dimensional, with $\dim V = n$ and $\dim W = m$. We shall show that if we make use of a basis $B$ of $V$ and a basis $C$ of $W$, then it is possible to describe $T$ indirectly in terms of some matrix $A$. The main idea is to make use of coordinate matrices relative to the bases $B$ and $C$.

Let us recall some discussion in Section 8.5. Suppose that $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis of $V$. Then every vector $\mathbf{v} \in V$ can be written uniquely as a linear combination
$$\mathbf{v} = \beta_1\mathbf{v}_1 + \ldots + \beta_n\mathbf{v}_n, \quad \text{where } \beta_1, \ldots, \beta_n \in \mathbb{R}. \tag{7}$$
The matrix
$$[\mathbf{v}]_B = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix} \tag{8}$$
is the coordinate matrix of $\mathbf{v}$ relative to the basis $B$.

Consider now a transformation $\phi : V \to \mathbb{R}^n$, where $\phi(\mathbf{v}) = [\mathbf{v}]_B$ for every $\mathbf{v} \in V$. The proof of the following result is straightforward.

PROPOSITION 8R. Suppose that the real vector space $V$ has basis $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$. Then the transformation $\phi : V \to \mathbb{R}^n$, where $\phi(\mathbf{v}) = [\mathbf{v}]_B$ satisfies (7) and (8) for every $\mathbf{v} \in V$, is a one-to-one linear transformation, with range $R(\phi) = \mathbb{R}^n$. Furthermore, the inverse linear transformation $\phi^{-1} : \mathbb{R}^n \to V$ is also one-to-one, with range $R(\phi^{-1}) = V$.

Suppose next that $C = \{\mathbf{w}_1, \ldots, \mathbf{w}_m\}$ is a basis of $W$. Then we can define a linear transformation $\psi : W \to \mathbb{R}^m$, where $\psi(\mathbf{w}) = [\mathbf{w}]_C$ for every $\mathbf{w} \in W$, in a similar way. We now have the following diagram of linear transformations.

[Diagram: $T : V \to W$, together with $\phi : V \to \mathbb{R}^n$, $\psi : W \to \mathbb{R}^m$ and their inverses $\phi^{-1}$, $\psi^{-1}$.]

Clearly the composition
$$S = \psi \circ T \circ \phi^{-1} : \mathbb{R}^n \to \mathbb{R}^m$$
is a euclidean linear transformation, and can therefore be described in terms of a standard matrix $A$. Our task is to determine this matrix $A$ in terms of $T$ and the bases $B$ and $C$.

We know from Proposition 8A that
$$A = (\, S(\mathbf{e}_1) \ \ldots \ S(\mathbf{e}_n) \,),$$
where $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ is the standard basis for $\mathbb{R}^n$. For every $j = 1, \ldots, n$, we have
$$S(\mathbf{e}_j) = (\psi \circ T \circ \phi^{-1})(\mathbf{e}_j) = \psi(T(\phi^{-1}(\mathbf{e}_j))) = \psi(T(\mathbf{v}_j)) = [T(\mathbf{v}_j)]_C.$$
It follows that
$$A = (\, [T(\mathbf{v}_1)]_C \ \ldots \ [T(\mathbf{v}_n)]_C \,). \tag{9}$$

Definition. The matrix $A$ given by (9) is called the matrix for the linear transformation $T$ with respect to the bases $B$ and $C$.

We now have the following diagram of linear transformations.

[Diagram: as before, together with $S : \mathbb{R}^n \to \mathbb{R}^m$.]

Hence we can write $T$ as the composition
$$T = \psi^{-1} \circ S \circ \phi : V \to W.$$

For every $\mathbf{v} \in V$, we have
$$T(\mathbf{v}) = (\psi^{-1} \circ S \circ \phi)(\mathbf{v}) = \psi^{-1}(A[\mathbf{v}]_B).$$
More precisely, if $\mathbf{v} = \beta_1\mathbf{v}_1 + \ldots + \beta_n\mathbf{v}_n$, then
$$[\mathbf{v}]_B = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix} \quad \text{and} \quad A[\mathbf{v}]_B = A\begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix} = \begin{pmatrix} \gamma_1 \\ \vdots \\ \gamma_m \end{pmatrix},$$
say, and so $T(\mathbf{v}) = \psi^{-1}(A[\mathbf{v}]_B) = \gamma_1\mathbf{w}_1 + \ldots + \gamma_m\mathbf{w}_m$. We have proved the following result.

PROPOSITION 8S. Suppose that $T : V \to W$ is a linear transformation from a real vector space $V$ into a real vector space $W$. Suppose further that $V$ and $W$ are finite dimensional, with bases $B$ and $C$ respectively, and that $A$ is the matrix for the linear transformation $T$ with respect to the bases $B$ and $C$. Then for every $\mathbf{v} \in V$, we have $T(\mathbf{v}) = \mathbf{w}$, where $\mathbf{w} \in W$ is the unique vector satisfying $[\mathbf{w}]_C = A[\mathbf{v}]_B$.

Remark. In the special case when $V = W$, the linear transformation $T : V \to V$ is a linear operator on $V$. Of course, we may choose a basis $B$ for the domain $V$ of $T$ and a basis $C$ for the codomain $V$ of $T$. In the case when $T$ is the identity linear operator, we often choose $B \neq C$, since this represents a change of basis. In the case when $T$ is not the identity operator, we often choose $B = C$ for the sake of convenience; we then say that $A$ is the matrix for the linear operator $T$ with respect to the basis $B$.

Example 8.8.1. Consider an operator $T : P_3 \to P_3$ on the real vector space $P_3$ of all polynomials with real coefficients and degree at most 3, where for every polynomial $p(x)$ in $P_3$, we have $T(p(x)) = xp'(x)$, the product of $x$ with the formal derivative $p'(x)$ of $p(x)$. The reader is invited to check that $T$ is a linear operator. Now consider the basis $B = \{1, x, x^2, x^3\}$ of $P_3$. The matrix for $T$ with respect to $B$ is given by
$$A = (\, [T(1)]_B \ [T(x)]_B \ [T(x^2)]_B \ [T(x^3)]_B \,) = (\, [0]_B \ [x]_B \ [2x^2]_B \ [3x^3]_B \,) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$
Suppose that $p(x) = 1 + 2x + 4x^2 + 3x^3$. Then
$$[p(x)]_B = \begin{pmatrix} 1 \\ 2 \\ 4 \\ 3 \end{pmatrix} \quad \text{and} \quad A[p(x)]_B = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 4 \\ 3 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \\ 8 \\ 9 \end{pmatrix},$$
so that $T(p(x)) = 2x + 8x^2 + 9x^3$. This can be easily verified by noting that
$$T(p(x)) = xp'(x) = x(2 + 8x + 9x^2) = 2x + 8x^2 + 9x^3.$$
In general, if $p(x) = p_0 + p_1x + p_2x^2 + p_3x^3$, then
$$[p(x)]_B = \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} \quad \text{and} \quad A[p(x)]_B = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} = \begin{pmatrix} 0 \\ p_1 \\ 2p_2 \\ 3p_3 \end{pmatrix},$$
so that $T(p(x)) = p_1x + 2p_2x^2 + 3p_3x^3$. Observe that
$$T(p(x)) = xp'(x) = x(p_1 + 2p_2x + 3p_3x^2) = p_1x + 2p_2x^2 + 3p_3x^3,$$
as expected.
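The action of $T$ on coefficient vectors is a plain matrix-vector product, so the example can be replayed numerically. A minimal sketch in Python with numpy, representing a polynomial by its coefficient column relative to $B$:

```python
import numpy as np

# Matrix for T(p(x)) = x p'(x) with respect to B = {1, x, x^2, x^3}.
A = np.array([[0, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]])

p = np.array([1, 2, 4, 3])  # p(x) = 1 + 2x + 4x^2 + 3x^3
print(A @ p)                # [0 2 8 9]: T(p(x)) = 2x + 8x^2 + 9x^3
```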

Example 8.8.2. Consider the linear operator $T : \mathbb{R}^2 \to \mathbb{R}^2$, given by $T(x_1, x_2) = (2x_1 + x_2, x_1 + 3x_2)$ for every $(x_1, x_2) \in \mathbb{R}^2$. Consider also the basis $B = \{(1,0), (1,1)\}$ of $\mathbb{R}^2$. Then the matrix for $T$ with respect to $B$ is given by
$$A = (\, [T(1,0)]_B \ [T(1,1)]_B \,) = (\, [(2,1)]_B \ [(3,4)]_B \,) = \begin{pmatrix} 1 & -1 \\ 1 & 4 \end{pmatrix}.$$
Suppose that $(x_1, x_2) = (3, 2)$. Then
$$[(3,2)]_B = \begin{pmatrix} 1 \\ 2 \end{pmatrix} \quad \text{and} \quad A[(3,2)]_B = \begin{pmatrix} 1 & -1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -1 \\ 9 \end{pmatrix},$$
so that $T(3,2) = -(1,0) + 9(1,1) = (8,9)$. This can be easily verified directly. In general, we have
$$[(x_1, x_2)]_B = \begin{pmatrix} x_1 - x_2 \\ x_2 \end{pmatrix} \quad \text{and} \quad A[(x_1, x_2)]_B = \begin{pmatrix} 1 & -1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 - x_2 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 - 2x_2 \\ x_1 + 3x_2 \end{pmatrix},$$
so that $T(x_1, x_2) = (x_1 - 2x_2)(1,0) + (x_1 + 3x_2)(1,1) = (2x_1 + x_2, x_1 + 3x_2)$.

Example 8.8.3. Suppose that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation. Suppose further that $B$ and $C$ are the standard bases for $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively. Then the matrix for $T$ with respect to $B$ and $C$ is given by
$$A = (\, [T(\mathbf{e}_1)]_C \ \ldots \ [T(\mathbf{e}_n)]_C \,) = (\, T(\mathbf{e}_1) \ \ldots \ T(\mathbf{e}_n) \,),$$
so it follows from Proposition 8A that $A$ is simply the standard matrix for $T$.

Suppose now that $T_1 : V \to W$ and $T_2 : W \to U$ are linear transformations, where the real vector spaces $V, W, U$ are finite dimensional, with respective bases $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$, $C = \{\mathbf{w}_1, \ldots, \mathbf{w}_m\}$ and $D = \{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$. We then have the following diagram of linear transformations.

[Diagram: $T_1 : V \to W$ and $T_2 : W \to U$, together with $\phi : V \to \mathbb{R}^n$, $\psi : W \to \mathbb{R}^m$, $\eta : U \to \mathbb{R}^k$, $S_1 : \mathbb{R}^n \to \mathbb{R}^m$ and $S_2 : \mathbb{R}^m \to \mathbb{R}^k$.]

Here $\eta : U \to \mathbb{R}^k$, where $\eta(\mathbf{u}) = [\mathbf{u}]_D$ for every $\mathbf{u} \in U$, is a linear transformation, and
$$S_1 = \psi \circ T_1 \circ \phi^{-1} : \mathbb{R}^n \to \mathbb{R}^m \quad \text{and} \quad S_2 = \eta \circ T_2 \circ \psi^{-1} : \mathbb{R}^m \to \mathbb{R}^k$$
are euclidean linear transformations. Suppose that $A_1$ and $A_2$ are respectively the standard matrices for $S_1$ and $S_2$, so that they are respectively the matrix for $T_1$ with respect to $B$ and $C$ and the matrix for $T_2$ with respect to $C$ and $D$. Clearly
$$S_2 \circ S_1 = \eta \circ T_2 \circ T_1 \circ \phi^{-1} : \mathbb{R}^n \to \mathbb{R}^k.$$
It follows that the standard matrix for $S_2 \circ S_1$ is $A_2A_1$. We have therefore proved the following result.

PROPOSITION 8T. Suppose that $T_1 : V \to W$ and $T_2 : W \to U$ are linear transformations, where the real vector spaces $V, W, U$ are finite dimensional, with bases $B, C, D$ respectively. Suppose further that $A_1$ is the matrix for the linear transformation $T_1$ with respect to the bases $B$ and $C$, and that $A_2$ is the matrix for the linear transformation $T_2$ with respect to the bases $C$ and $D$. Then $A_2A_1$ is the matrix for the linear transformation $T_2 \circ T_1$ with respect to the bases $B$ and $D$.

Example 8.8.4. Consider the linear operator $T_1 : P_3 \to P_3$, where for every polynomial $p(x)$ in $P_3$, we have $T_1(p(x)) = xp'(x)$. We have already shown that the matrix for $T_1$ with respect to the basis $B = \{1, x, x^2, x^3\}$ of $P_3$ is given by
$$A_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$
Consider next the linear operator $T_2 : P_3 \to P_3$, where for every polynomial $q(x) = q_0 + q_1x + q_2x^2 + q_3x^3$ in $P_3$, we have
$$T_2(q(x)) = q(1 + x) = q_0 + q_1(1 + x) + q_2(1 + x)^2 + q_3(1 + x)^3.$$
We have $T_2(1) = 1$, $T_2(x) = 1 + x$, $T_2(x^2) = 1 + 2x + x^2$ and $T_2(x^3) = 1 + 3x + 3x^2 + x^3$, so that the matrix for $T_2$ with respect to $B$ is given by
$$A_2 = (\, [T_2(1)]_B \ [T_2(x)]_B \ [T_2(x^2)]_B \ [T_2(x^3)]_B \,) = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Consider now the composition $T = T_2 \circ T_1 : P_3 \to P_3$. Let $A$ denote the matrix for $T$ with respect to $B$. By Proposition 8T, we have
$$A = A_2A_1 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \\ 0 & 0 & 2 & 9 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$
Suppose that $p(x) = p_0 + p_1x + p_2x^2 + p_3x^3$. Then
$$[p(x)]_B = \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} \quad \text{and} \quad A[p(x)]_B = \begin{pmatrix} 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \\ 0 & 0 & 2 & 9 \\ 0 & 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} = \begin{pmatrix} p_1 + 2p_2 + 3p_3 \\ p_1 + 4p_2 + 9p_3 \\ 2p_2 + 9p_3 \\ 3p_3 \end{pmatrix},$$
so that $T(p(x)) = (p_1 + 2p_2 + 3p_3) + (p_1 + 4p_2 + 9p_3)x + (2p_2 + 9p_3)x^2 + 3p_3x^3$. We can check this directly by noting that
$$T(p(x)) = T_2(T_1(p(x))) = T_2(p_1x + 2p_2x^2 + 3p_3x^3) = p_1(1 + x) + 2p_2(1 + x)^2 + 3p_3(1 + x)^3 = (p_1 + 2p_2 + 3p_3) + (p_1 + 4p_2 + 9p_3)x + (2p_2 + 9p_3)x^2 + 3p_3x^3.$$
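Proposition 8T reduces the composition to one matrix product, which is easy to check numerically. A minimal sketch in Python with numpy:

```python
import numpy as np

A1 = np.array([[0, 0, 0, 0],   # matrix for T1(p(x)) = x p'(x) w.r.t. B
               [0, 1, 0, 0],
               [0, 0, 2, 0],
               [0, 0, 0, 3]])
A2 = np.array([[1, 1, 1, 1],   # matrix for T2(q(x)) = q(1 + x) w.r.t. B
               [0, 1, 2, 3],
               [0, 0, 1, 3],
               [0, 0, 0, 1]])

print(A2 @ A1)  # matrix for T2 o T1 w.r.t. B, as computed above
```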

Example 8.8.5. Consider the linear operator $T : \mathbb{R}^2 \to \mathbb{R}^2$, given by $T(x_1, x_2) = (2x_1 + x_2, x_1 + 3x_2)$ for every $(x_1, x_2) \in \mathbb{R}^2$. We have already shown that the matrix for $T$ with respect to the basis $B = \{(1,0), (1,1)\}$ of $\mathbb{R}^2$ is given by
$$A = \begin{pmatrix} 1 & -1 \\ 1 & 4 \end{pmatrix}.$$
Consider now the composition $T^2 = T \circ T : \mathbb{R}^2 \to \mathbb{R}^2$. By Proposition 8T, the matrix for $T^2$ with respect to $B$ is given by
$$A_2 = \begin{pmatrix} 1 & -1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & 4 \end{pmatrix} = \begin{pmatrix} 0 & -5 \\ 5 & 15 \end{pmatrix}.$$
Suppose that $(x_1, x_2) \in \mathbb{R}^2$. Then
$$[(x_1, x_2)]_B = \begin{pmatrix} x_1 - x_2 \\ x_2 \end{pmatrix} \quad \text{and} \quad A_2[(x_1, x_2)]_B = \begin{pmatrix} 0 & -5 \\ 5 & 15 \end{pmatrix} \begin{pmatrix} x_1 - x_2 \\ x_2 \end{pmatrix} = \begin{pmatrix} -5x_2 \\ 5x_1 + 10x_2 \end{pmatrix},$$
so that $T^2(x_1, x_2) = -5x_2(1,0) + (5x_1 + 10x_2)(1,1) = (5x_1 + 5x_2, 5x_1 + 10x_2)$. The reader is invited to check this directly.

A simple consequence of Propositions 8N and 8T is the following result concerning inverse linear transformations.

PROPOSITION 8U. Suppose that $T : V \to V$ is a linear operator on a finite dimensional real vector space $V$ with basis $B$. Suppose further that $A$ is the matrix for the linear operator $T$ with respect to the basis $B$. Then $T$ is one-to-one if and only if $A$ is invertible. Furthermore, if $T$ is one-to-one, then $A^{-1}$ is the matrix for the inverse linear operator $T^{-1} : V \to V$ with respect to the basis $B$.

Proof. Simply note that $T$ is one-to-one if and only if the system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution $\mathbf{x} = \mathbf{0}$. The last assertion follows easily from Proposition 8T, since if $A'$ denotes the matrix for the inverse linear operator $T^{-1}$ with respect to $B$, then we must have $A'A = I$, the matrix for the identity operator $T^{-1} \circ T$ with respect to $B$.

Example 8.8.6. Consider the linear operator $T : P_3 \to P_3$, where for every $q(x) = q_0 + q_1x + q_2x^2 + q_3x^3$ in $P_3$, we have
$$T(q(x)) = q(1 + x) = q_0 + q_1(1 + x) + q_2(1 + x)^2 + q_3(1 + x)^3.$$
We have already shown that the matrix for $T$ with respect to the basis $B = \{1, x, x^2, x^3\}$ is given by
$$A = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
This matrix is invertible, so it follows that $T$ is one-to-one. Furthermore, it can be checked that
$$A^{-1} = \begin{pmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -2 & 3 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Suppose that $p(x) = p_0 + p_1x + p_2x^2 + p_3x^3$. Then
$$[p(x)]_B = \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} \quad \text{and} \quad A^{-1}[p(x)]_B = \begin{pmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -2 & 3 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix} = \begin{pmatrix} p_0 - p_1 + p_2 - p_3 \\ p_1 - 2p_2 + 3p_3 \\ p_2 - 3p_3 \\ p_3 \end{pmatrix},$$
so that
$$T^{-1}(p(x)) = (p_0 - p_1 + p_2 - p_3) + (p_1 - 2p_2 + 3p_3)x + (p_2 - 3p_3)x^2 + p_3x^3 = p_0 + p_1(x - 1) + p_2(x^2 - 2x + 1) + p_3(x^3 - 3x^2 + 3x - 1) = p(x - 1).$$

8.9. Change of Basis

Suppose that $V$ is a finite dimensional real vector space, with one basis $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ and another basis $B' = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$. Suppose that $T : V \to V$ is a linear operator on $V$. Let $A$ denote the matrix for $T$ with respect to the basis $B$, and let $A'$ denote the matrix for $T$ with respect to the basis $B'$. If $\mathbf{v} \in V$ and $T(\mathbf{v}) = \mathbf{w}$, then
$$[\mathbf{w}]_B = A[\mathbf{v}]_B \tag{10}$$
and
$$[\mathbf{w}]_{B'} = A'[\mathbf{v}]_{B'}. \tag{11}$$
We wish to find the relationship between $A'$ and $A$.

Recall Proposition 8J, that if
$$P = (\, [\mathbf{u}_1]_B \ \ldots \ [\mathbf{u}_n]_B \,)$$
denotes the transition matrix from the basis $B'$ to the basis $B$, then
$$[\mathbf{v}]_B = P[\mathbf{v}]_{B'} \quad \text{and} \quad [\mathbf{w}]_B = P[\mathbf{w}]_{B'}. \tag{12}$$
Note that the matrix $P$ can also be interpreted as the matrix for the identity operator $I : V \to V$ with respect to the bases $B'$ and $B$. It is easy to see that the matrix $P$ is invertible, and
$$P^{-1} = (\, [\mathbf{v}_1]_{B'} \ \ldots \ [\mathbf{v}_n]_{B'} \,)$$
denotes the transition matrix from the basis $B$ to the basis $B'$, and can also be interpreted as the matrix for the identity operator $I : V \to V$ with respect to the bases $B$ and $B'$.

Combining (10) and (12), we conclude that
$$[\mathbf{w}]_{B'} = P^{-1}[\mathbf{w}]_B = P^{-1}A[\mathbf{v}]_B = P^{-1}AP[\mathbf{v}]_{B'}.$$
Comparing this with (11), we conclude that
$$P^{-1}AP = A'. \tag{13}$$
This implies that
$$A = PA'P^{-1}. \tag{14}$$

Remark. We can use the notation
$$A = [T]_B \quad \text{and} \quad A' = [T]_{B'}$$
to denote that $A$ and $A'$ are the matrices for $T$ with respect to the basis $B$ and with respect to the basis $B'$ respectively. We can also write
$$P = [I]_{B,B'} \quad \text{and} \quad P^{-1} = [I]_{B',B}$$
to denote that $P$ is the transition matrix from the basis $B'$ to the basis $B$, and $P^{-1}$ the transition matrix from the basis $B$ to the basis $B'$. Then (13) and (14) become respectively
$$[I]_{B',B}[T]_B[I]_{B,B'} = [T]_{B'} \quad \text{and} \quad [I]_{B,B'}[T]_{B'}[I]_{B',B} = [T]_B.$$
We have proved the following result.

PROPOSITION 8V. Suppose that $T : V \to V$ is a linear operator on a finite dimensional real vector space $V$, with bases $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ and $B' = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$. Suppose further that $A$ and $A'$ are the matrices for $T$ with respect to the basis $B$ and with respect to the basis $B'$ respectively. Then
$$P^{-1}AP = A' \quad \text{and} \quad A = PA'P^{-1},$$
where
$$P = (\, [\mathbf{u}_1]_B \ \ldots \ [\mathbf{u}_n]_B \,)$$
denotes the transition matrix from the basis $B'$ to the basis $B$.

Remarks. (1) We have the following picture.

[Diagram: $T$ sends $\mathbf{v}$ to $\mathbf{w}$ in $V$; $A'$ sends $[\mathbf{v}]_{B'}$ to $[\mathbf{w}]_{B'}$; $A$ sends $[\mathbf{v}]_B$ to $[\mathbf{w}]_B$; $P$ and $P^{-1}$ convert between the two coordinate systems.]

(2) The idea can be extended to the case of linear transformations $T : V \to W$ from a finite dimensional real vector space into another, with a change of basis in $V$ and a change of basis in $W$.

Example 8.9.1. Consider the vector space $P_3$ of all polynomials with real coefficients and degree at most 3, with bases $B = \{1, x, x^2, x^3\}$ and $B' = \{1, 1+x, 1+x+x^2, 1+x+x^2+x^3\}$. Consider also the linear operator $T : P_3 \to P_3$, where for every polynomial $p(x) = p_0 + p_1x + p_2x^2 + p_3x^3$, we have $T(p(x)) = (p_0 + p_1) + (p_1 + p_2)x + (p_2 + p_3)x^2 + (p_0 + p_3)x^3$. Let $A$ denote the matrix for $T$ with respect to the basis $B$. Then $T(1) = 1 + x^3$, $T(x) = 1 + x$, $T(x^2) = x + x^2$ and $T(x^3) = x^2 + x^3$, and so
$$A = (\, [T(1)]_B \ [T(x)]_B \ [T(x^2)]_B \ [T(x^3)]_B \,) = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix}.$$
Next, note that the transition matrix from the basis $B'$ to the basis $B$ is given by
$$P = (\, [1]_B \ [1+x]_B \ [1+x+x^2]_B \ [1+x+x^2+x^3]_B \,) = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
It can be checked that
$$P^{-1} = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
and so
$$A' = P^{-1}AP = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ -1 & -1 & 0 & 0 \\ 1 & 1 & 1 & 2 \end{pmatrix}$$
is the matrix for $T$ with respect to the basis $B'$. It follows that
$$\begin{aligned} T(1) &= 1 - (1 + x + x^2) + (1 + x + x^2 + x^3) = 1 + x^3, \\ T(1 + x) &= 1 + (1 + x) - (1 + x + x^2) + (1 + x + x^2 + x^3) = 2 + x + x^3, \\ T(1 + x + x^2) &= (1 + x) + (1 + x + x^2 + x^3) = 2 + 2x + x^2 + x^3, \\ T(1 + x + x^2 + x^3) &= 2(1 + x + x^2 + x^3) = 2 + 2x + 2x^2 + 2x^3. \end{aligned}$$
These can be verified directly.
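The change-of-basis identity $A' = P^{-1}AP$ from this example can be confirmed numerically. A minimal sketch in Python with numpy:

```python
import numpy as np

A = np.array([[1, 1, 0, 0],   # matrix for T w.r.t. B = {1, x, x^2, x^3}
              [0, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1]])
P = np.array([[1, 1, 1, 1],   # transition matrix from B' to B
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1]])

A_prime = np.linalg.inv(P) @ A @ P
print(np.round(A_prime).astype(int))  # matches the matrix A' above
```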

8.10. Eigenvalues and Eigenvectors

Definition. Suppose that $T : V \to V$ is a linear operator on a finite dimensional real vector space $V$. Then any real number $\lambda \in \mathbb{R}$ is called an eigenvalue of $T$ if there exists a non-zero vector $\mathbf{v} \in V$ such that $T(\mathbf{v}) = \lambda\mathbf{v}$. This non-zero vector $\mathbf{v} \in V$ is called an eigenvector of $T$ corresponding to the eigenvalue $\lambda$.

The purpose of this section is to show that the problem of eigenvalues and eigenvectors of the linear operator $T$ can be reduced to the problem of eigenvalues and eigenvectors of the matrix for $T$ with respect to any basis $B$ of $V$. The starting point of our argument is the following theorem, the proof of which is left as an exercise.

PROPOSITION 8W. Suppose that $T : V \to V$ is a linear operator on a finite dimensional real vector space $V$, with bases $B$ and $B'$. Suppose further that $A$ and $A'$ are the matrices for $T$ with respect to the basis $B$ and with respect to the basis $B'$ respectively. Then
(a) $\det A = \det A'$;
(b) $A$ and $A'$ have the same rank;
(c) $A$ and $A'$ have the same characteristic polynomial;
(d) $A$ and $A'$ have the same eigenvalues; and
(e) the dimension of the eigenspace of $A$ corresponding to an eigenvalue $\lambda$ is equal to the dimension of the eigenspace of $A'$ corresponding to $\lambda$.

We also state without proof the following result.

PROPOSITION 8X. Suppose that $T : V \to V$ is a linear operator on a finite dimensional real vector space $V$. Suppose further that $A$ is the matrix for $T$ with respect to a basis $B$ of $V$. Then
(a) the eigenvalues of $T$ are precisely the eigenvalues of $A$; and
(b) a vector $\mathbf{u} \in V$ is an eigenvector of $T$ corresponding to an eigenvalue $\lambda$ if and only if its coordinate matrix $[\mathbf{u}]_B$ is an eigenvector of $A$ corresponding to $\lambda$.

Suppose now that $A$ is the matrix for a linear operator $T : V \to V$ on a finite dimensional real vector space $V$ with respect to a basis $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$. If $A$ can be diagonalized, then there exists an invertible matrix $P$ such that
$$P^{-1}AP = D$$
is a diagonal matrix. Furthermore, the columns of $P$ are eigenvectors of $A$, and so are the coordinate matrices of eigenvectors of $T$ with respect to the basis $B$. In other words,
$$P = (\, [\mathbf{u}_1]_B \ \ldots \ [\mathbf{u}_n]_B \,),$$
where $B' = \{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ is a basis of $V$ consisting of eigenvectors of $T$. Furthermore, $P$ is the transition matrix from the basis $B'$ to the basis $B$. It follows that the matrix for $T$ with respect to the basis $B'$ is given by
$$D = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},$$
where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $T$.

Example 8.10.1. Consider the vector space $P_2$ of all polynomials with real coefficients and degree at most 2, with basis $B = \{1, x, x^2\}$. Consider also the linear operator $T : P_2 \to P_2$, where for every polynomial $p(x) = p_0 + p_1x + p_2x^2$, we have $T(p(x)) = (5p_0 - 2p_1) + (-2p_0 + 6p_1 + 2p_2)x + (2p_1 + 7p_2)x^2$. Then $T(1) = 5 - 2x$, $T(x) = -2 + 6x + 2x^2$ and $T(x^2) = 2x + 7x^2$, so that the matrix for $T$ with respect to the basis $B$ is given by
$$A = (\, [T(1)]_B \ [T(x)]_B \ [T(x^2)]_B \,) = \begin{pmatrix} 5 & -2 & 0 \\ -2 & 6 & 2 \\ 0 & 2 & 7 \end{pmatrix}.$$
It is a simple exercise to show that the matrix $A$ has eigenvalues 3, 6, 9, with corresponding eigenvectors
$$\mathbf{x}_1 = \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}, \quad \mathbf{x}_2 = \begin{pmatrix} 2 \\ -1 \\ 2 \end{pmatrix}, \quad \mathbf{x}_3 = \begin{pmatrix} -1 \\ 2 \\ 2 \end{pmatrix},$$
so that writing
$$P = \begin{pmatrix} 2 & 2 & -1 \\ 2 & -1 & 2 \\ -1 & 2 & 2 \end{pmatrix},$$
we have
$$P^{-1}AP = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 9 \end{pmatrix}.$$
Now let $B' = \{p_1(x), p_2(x), p_3(x)\}$, where
$$[p_1(x)]_B = \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}, \quad [p_2(x)]_B = \begin{pmatrix} 2 \\ -1 \\ 2 \end{pmatrix}, \quad [p_3(x)]_B = \begin{pmatrix} -1 \\ 2 \\ 2 \end{pmatrix}.$$

Then $P$ is the transition matrix from the basis $B'$ to the basis $B$, and $D = P^{-1}AP$ is the matrix for $T$ with respect to the basis $B'$. Clearly $p_1(x) = 2 + 2x - x^2$, $p_2(x) = 2 - x + 2x^2$ and $p_3(x) = -1 + 2x + 2x^2$.
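The eigenvalue computation in this final example can be reproduced numerically. A minimal sketch in Python with numpy:

```python
import numpy as np

A = np.array([[ 5, -2, 0],
              [-2,  6, 2],
              [ 0,  2, 7]])

eigenvalues, _ = np.linalg.eig(A)
print(np.sort(eigenvalues))     # [3. 6. 9.]

P = np.array([[ 2,  2, -1],     # columns: eigenvectors for 3, 6, 9
              [ 2, -1,  2],
              [-1,  2,  2]])
D = np.linalg.inv(P) @ A @ P
print(np.round(D).astype(int))  # diag(3, 6, 9)
```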
