W W L CHEN
© W W L Chen, 1997, 2008.
This chapter is available free to all individuals, on the understanding that it is not to be used for financial gain,
and may be downloaded and/or photocopied, with or without permission from the author.
However, this document may not be kept on any information storage and retrieval system without permission
from the author, unless such system is not accessible to any individuals other than its owners.
Chapter 10
ORTHOGONAL MATRICES
10.1. Introduction
Definition. A square matrix A with real entries satisfying the condition A^{-1} = A^t is called an orthogonal matrix.
Example 10.1.1. Consider the euclidean space R^2 with the euclidean inner product. The vectors u_1 = (1,0) and u_2 = (0,1) form an orthonormal basis B = {u_1, u_2}. Let us now rotate u_1 and u_2 anticlockwise by an angle θ to obtain v_1 = (cos θ, sin θ) and v_2 = (−sin θ, cos θ). Then C = {v_1, v_2} is also an orthonormal basis.
[Figure: the basis vectors u_1, u_2 and their images v_1, v_2 under an anticlockwise rotation through the angle θ]
The transition matrix from the basis C to the basis B is given by

    P = ( [v_1]_B  [v_2]_B ) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.

Clearly

    P^{-1} = P^t = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.
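As a quick numerical check of this example (my own illustration, not part of the text), the following Python sketch builds the rotation matrix P for an arbitrary angle and verifies that P P^t is the identity matrix, i.e. that P^{-1} = P^t.

```python
import math

theta = 0.7  # an arbitrary angle, chosen for illustration
P = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    # plain triple-loop matrix product, enough for these small examples
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

PPt = matmul(P, transpose(P))
for i in range(2):
    for j in range(2):
        expected = 1.0 if i == j else 0.0
        assert abs(PPt[i][j] - expected) < 1e-12
print("P * P^t is the identity")
```

The check succeeds for any value of theta, since cos²θ + sin²θ = 1 and the off-diagonal entries cancel.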
In fact, our example is a special case of the following general result.
PROPOSITION 10A. Suppose that B = {u1, . . . ,un} and C = {v1, . . . ,vn} are two orthonormal bases of a real inner product space V. Then the transition matrix P from the basis C to the basis B is an orthogonal matrix.
Example 10.1.2. The matrix

    A = \begin{pmatrix} 1/3 & -2/3 & 2/3 \\ 2/3 & -1/3 & -2/3 \\ 2/3 & 2/3 & 1/3 \end{pmatrix}

is orthogonal, since

    A^t A = \begin{pmatrix} 1/3 & 2/3 & 2/3 \\ -2/3 & -1/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \end{pmatrix}
            \begin{pmatrix} 1/3 & -2/3 & 2/3 \\ 2/3 & -1/3 & -2/3 \\ 2/3 & 2/3 & 1/3 \end{pmatrix}
          = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
Note also that the row vectors of A, namely (1/3, −2/3, 2/3), (2/3, −1/3, −2/3) and (2/3, 2/3, 1/3), are orthonormal. So are the column vectors of A.
In fact, our last observation is not a coincidence.
PROPOSITION 10B. Suppose that A is an n×n matrix with real entries. Then
(a) A is orthogonal if and only if the row vectors of A form an orthonormal basis of R^n under the euclidean inner product; and
(b) A is orthogonal if and only if the column vectors of A form an orthonormal basis of R^n under the euclidean inner product.
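Proposition 10B can be checked on the matrix of Example 10.1.2. This sketch (my own, not from the text) uses exact rational arithmetic, so the orthonormality conditions hold exactly rather than up to rounding.

```python
from fractions import Fraction as F

third = F(1, 3)
# the matrix A of Example 10.1.2, with exact 1/3 entries
A = [[ 1*third, -2*third,  2*third],
     [ 2*third, -1*third, -2*third],
     [ 2*third,  2*third,  1*third]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

rows = A
cols = [list(c) for c in zip(*A)]
for vectors in (rows, cols):
    for i in range(3):
        for j in range(3):
            # r_i . r_j = 1 if i = j, and 0 otherwise
            assert dot(vectors[i], vectors[j]) == (1 if i == j else 0)
print("rows and columns of A are orthonormal")
```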
Proof. We shall only prove (a), since the proof of (b) is almost identical. Let r_1, ..., r_n denote the row vectors of A. Then

    A A^t = \begin{pmatrix} r_1 \cdot r_1 & \dots & r_1 \cdot r_n \\ \vdots & \ddots & \vdots \\ r_n \cdot r_1 & \dots & r_n \cdot r_n \end{pmatrix}.

It follows that A A^t = I if and only if for every i, j = 1, ..., n, we have

    r_i \cdot r_j = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases}

if and only if r_1, ..., r_n are orthonormal.
PROPOSITION 10C. Suppose that A is an n×n matrix with real entries. Suppose further that the inner product in R^n is the euclidean inner product. Then the following are equivalent:
(a) A is orthogonal.
(b) ‖Ax‖ = ‖x‖ for every x ∈ R^n.
(c) Au · Av = u · v for every u, v ∈ R^n.
Proof. ((a)⇒(b)) Suppose that A is orthogonal, so that A^t A = I. It follows that for every x ∈ R^n, we have

    ‖Ax‖² = Ax · Ax = x^t A^t A x = x^t I x = x^t x = x · x = ‖x‖².

((b)⇒(c)) Suppose that ‖Ax‖ = ‖x‖ for every x ∈ R^n. Then for every u, v ∈ R^n, we have

    Au · Av = (1/4)‖Au + Av‖² − (1/4)‖Au − Av‖²
            = (1/4)‖A(u + v)‖² − (1/4)‖A(u − v)‖²
            = (1/4)‖u + v‖² − (1/4)‖u − v‖² = u · v.

((c)⇒(a)) Suppose that Au · Av = u · v for every u, v ∈ R^n. Then

    Iu · v = u · v = Au · Av = v^t A^t A u = A^t A u · v,

so that

    (A^t A − I)u · v = 0.

In particular, this holds when v = (A^t A − I)u, so that

    (A^t A − I)u · (A^t A − I)u = 0,

whence

    (A^t A − I)u = 0,    (1)

in view of Proposition 9A(d). But then (1) is a system of n homogeneous linear equations in n unknowns satisfied by every u ∈ R^n. Hence the coefficient matrix A^t A − I must be the zero matrix, and so A^t A = I.
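The equivalences of Proposition 10C can be illustrated numerically (a sketch of my own, not a proof): with the orthogonal matrix of Example 10.1.2 and two arbitrary vectors, exact rational arithmetic confirms that norms and inner products are preserved.

```python
from fractions import Fraction as F

third = F(1, 3)
# the orthogonal matrix of Example 10.1.2
A = [[ 1*third, -2*third,  2*third],
     [ 2*third, -1*third, -2*third],
     [ 2*third,  2*third,  1*third]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def apply(M, x):
    # matrix-vector product: each entry is a row of M dotted with x
    return [dot(row, x) for row in M]

# two arbitrary test vectors (my own choice)
u = [F(1), F(2), F(3)]
v = [F(-1), F(0), F(5)]

# (b): ||Au||^2 = ||u||^2
assert dot(apply(A, u), apply(A, u)) == dot(u, u)
# (c): Au . Av = u . v
assert dot(apply(A, u), apply(A, v)) == dot(u, v)
print("A preserves norms and inner products")
```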
Proof of Proposition 10A. For every u ∈ V, we can write

    u = β_1 u_1 + ... + β_n u_n = γ_1 v_1 + ... + γ_n v_n,  where β_1, ..., β_n, γ_1, ..., γ_n ∈ R,

and where B = {u_1, ..., u_n} and C = {v_1, ..., v_n} are two orthonormal bases of V. Then

    ‖u‖² = ⟨u, u⟩ = ⟨β_1 u_1 + ... + β_n u_n, β_1 u_1 + ... + β_n u_n⟩
         = \sum_{i=1}^n \sum_{j=1}^n β_i β_j ⟨u_i, u_j⟩ = \sum_{i=1}^n β_i² = (β_1, ..., β_n) · (β_1, ..., β_n).

Similarly,

    ‖u‖² = ⟨u, u⟩ = ⟨γ_1 v_1 + ... + γ_n v_n, γ_1 v_1 + ... + γ_n v_n⟩
         = \sum_{i=1}^n \sum_{j=1}^n γ_i γ_j ⟨v_i, v_j⟩ = \sum_{i=1}^n γ_i² = (γ_1, ..., γ_n) · (γ_1, ..., γ_n).

It follows that in R^n with the euclidean norm, we have ‖[u]_B‖ = ‖[u]_C‖, and so ‖P[u]_C‖ = ‖[u]_C‖ for every u ∈ V. Since every vector in R^n is of the form [u]_C for some u ∈ V, it follows from Proposition 10C that P is orthogonal.
10.2. Eigenvalues and Eigenvectors
In this section, we give a brief review of eigenvalues and eigenvectors, first discussed in Chapter 7.
Suppose that

    A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix}

is an n×n matrix with real entries. Suppose further that there exist a number λ ∈ R and a non-zero vector v ∈ R^n such that Av = λv. Then we say that λ is an eigenvalue of the matrix A, and that v is an eigenvector corresponding to the eigenvalue λ. In this case, we have Av = λv = λIv, where I is the n×n identity matrix, so that (A − λI)v = 0. Since v ∈ R^n is non-zero, it follows that we must have

    det(A − λI) = 0.    (2)

In other words, we must have

    det \begin{pmatrix} a_{11}-\lambda & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22}-\lambda & \dots & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn}-\lambda \end{pmatrix} = 0.

Note that (2) is a polynomial equation. The polynomial det(A − λI) is called the characteristic polynomial of the matrix A. Solving equation (2) gives the eigenvalues of the matrix A.
On the other hand, for any eigenvalue λ of the matrix A, the set

    {v ∈ R^n : (A − λI)v = 0}    (3)

is the nullspace of the matrix A − λI, and forms a subspace of R^n. This space (3) is called the eigenspace corresponding to the eigenvalue λ.
Suppose now that A has eigenvalues λ_1, ..., λ_n ∈ R, not necessarily distinct, with corresponding eigenvectors v_1, ..., v_n ∈ R^n, and that v_1, ..., v_n are linearly independent. Then it can be shown that

    P^{-1} A P = D,

where

    P = ( v_1  ...  v_n )  and  D = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}.

In fact, we say that A is diagonalizable if there exists an invertible matrix P with real entries such that P^{-1}AP is a diagonal matrix with real entries. It follows that A is diagonalizable if its eigenvectors form a basis of R^n. In the opposite direction, one can show that if A is diagonalizable, then it has n linearly independent eigenvectors in R^n. It therefore follows that the question of diagonalizing a matrix A with real entries is reduced to one of linear independence of its eigenvectors.
DIAGONALIZATION PROCESS. Suppose that A is an n×n matrix with real entries.
(1) Determine whether the n roots of the characteristic polynomial det(A − λI) are real.
(2) If not, then A is not diagonalizable. If so, then find the eigenvectors corresponding to these eigenvalues. Determine whether we can find n linearly independent eigenvectors.
(3) If not, then A is not diagonalizable. If so, then write

    P = ( v_1  ...  v_n )  and  D = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},

where λ_1, ..., λ_n ∈ R are the eigenvalues of A and where v_1, ..., v_n ∈ R^n are respectively their corresponding eigenvectors. Then P^{-1}AP = D.

In particular, it can be shown that if A has distinct eigenvalues λ_1, ..., λ_n ∈ R, with corresponding eigenvectors v_1, ..., v_n ∈ R^n, then v_1, ..., v_n are linearly independent. It follows that all such matrices A are diagonalizable.
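The Diagonalization process can be sketched for a 2×2 matrix small enough to work by hand. The matrix below and its eigenvectors are my own illustration, not from the text: its characteristic polynomial is λ² − 7λ + 10 = (λ − 5)(λ − 2), so the eigenvalues are 5 and 2, and exact rational arithmetic lets us verify P^{-1}AP = D without rounding.

```python
from fractions import Fraction as F

# an illustrative 2x2 matrix with characteristic polynomial (l - 5)(l - 2)
A = [[F(4), F(1)],
     [F(2), F(3)]]
v1 = [F(1), F(1)]    # eigenvector for eigenvalue 5:  A v1 = (5, 5)
v2 = [F(1), F(-2)]   # eigenvector for eigenvalue 2:  A v2 = (2, -4)

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# P has the eigenvectors as columns
P = [[v1[0], v2[0]],
     [v1[1], v2[1]]]
detP = P[0][0] * P[1][1] - P[0][1] * P[1][0]
# explicit 2x2 inverse via the adjugate formula
Pinv = [[ P[1][1] / detP, -P[0][1] / detP],
        [-P[1][0] / detP,  P[0][0] / detP]]

D = matmul(Pinv, matmul(A, P))
assert D == [[F(5), F(0)], [F(0), F(2)]]
print("P^{-1} A P =", D)
```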
10.3. Orthonormal Diagonalization
We now consider the euclidean space R^n as an inner product space with the euclidean inner product. Given any n×n matrix A with real entries, we wish to find out whether there exists an orthonormal basis of R^n consisting of eigenvectors of A.
Recall that in the Diagonalization process discussed in the last section, the columns of the matrix P are eigenvectors of A, and these vectors form a basis of R^n. It follows from Proposition 10B that this basis is orthonormal if and only if the matrix P is orthogonal.
Definition. An n×n matrix A with real entries is said to be orthogonally diagonalizable if there exists an orthogonal matrix P with real entries such that P^{-1}AP = P^t A P is a diagonal matrix with real entries.
First of all, we would like to determine which matrices are orthogonally diagonalizable. For those that are, we then need to discuss how we may find an orthogonal matrix P to carry out the diagonalization.
To study the first question, we have the following result, which gives a restriction on those matrices that are orthogonally diagonalizable.
PROPOSITION 10D. Suppose that A is an orthogonally diagonalizable matrix with real entries. Then A is symmetric.
Proof. Suppose that A is orthogonally diagonalizable. Then there exist an orthogonal matrix P and a diagonal matrix D, both with real entries, such that P^t A P = D. Since P P^t = P^t P = I and D^t = D, we have

    A = P D P^t = P D^t P^t,

so that

    A^t = (P D^t P^t)^t = (P^t)^t (D^t)^t P^t = P D P^t = A,

whence A is symmetric.
PROPOSITION 10E. Suppose that A is an n×n matrix with real entries. Then it is orthogonally diagonalizable if and only if it is symmetric.
The remainder of this section is devoted to finding a way to orthogonally diagonalize a symmetric matrix with real entries. We begin by stating without proof the following result. The proof requires results from the theory of complex vector spaces.
PROPOSITION 10F. Suppose that A is a symmetric matrix with real entries. Then all the eigenvalues of A are real.
Our idea here is to follow the Diagonalization process discussed in the last section, knowing that since A is diagonalizable, we shall find a basis of Rn consisting of eigenvectors of A. We may then wish to orthogonalize this basis by the Gram-Schmidt process. This last step is considerably simplified in view of the following result.
PROPOSITION 10G. Suppose that u_1 and u_2 are eigenvectors of a symmetric matrix A with real entries, corresponding to distinct eigenvalues λ_1 and λ_2 respectively. Then u_1 · u_2 = 0. In other words, eigenvectors of a symmetric real matrix corresponding to distinct eigenvalues are orthogonal.
Proof. Note that if we write u_1 and u_2 as column matrices, then since A is symmetric, we have

    Au_1 · u_2 = u_2^t A u_1 = u_2^t A^t u_1 = (A u_2)^t u_1 = u_1 · A u_2.

It follows that

    λ_1 u_1 · u_2 = A u_1 · u_2 = u_1 · A u_2 = u_1 · λ_2 u_2,

so that (λ_1 − λ_2)(u_1 · u_2) = 0. Since λ_1 ≠ λ_2, we must have u_1 · u_2 = 0.
We can now follow the procedure below.
ORTHOGONAL DIAGONALIZATION PROCESS. Suppose that A is a symmetric n×n matrix with real entries.
(1) Determine the n real roots λ_1, ..., λ_n of the characteristic polynomial det(A − λI), and find n linearly independent eigenvectors u_1, ..., u_n of A corresponding to these eigenvalues as in the Diagonalization process.
(2) Apply the Gram-Schmidt orthogonalization process to the eigenvectors u_1, ..., u_n to obtain orthogonal eigenvectors v_1, ..., v_n of A, noting that eigenvectors corresponding to distinct eigenvalues are already orthogonal.
(3) Normalize the orthogonal eigenvectors v_1, ..., v_n to obtain orthonormal eigenvectors w_1, ..., w_n of A. These form an orthonormal basis of R^n. Furthermore, write

    P = ( w_1  ...  w_n )  and  D = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},

where λ_1, ..., λ_n ∈ R are the eigenvalues of A and where w_1, ..., w_n ∈ R^n are respectively their orthogonalized and normalized eigenvectors. Then P^t A P = D.
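Step (2) of the process can be sketched as follows. This is my own minimal Gram-Schmidt implementation (not from the text), applied to two sample non-orthogonal vectors; exact rational arithmetic keeps the projection coefficients exact.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Return pairwise-orthogonal vectors spanning the same subspace."""
    ortho = []
    for u in vectors:
        w = list(u)
        for v in ortho:
            # subtract the projection of u onto each earlier vector
            c = dot(u, v) / dot(v, v)
            w = [wi - c * vi for wi, vi in zip(w, v)]
        ortho.append(w)
    return ortho

# two sample linearly independent but non-orthogonal vectors
u2 = [F(1), F(0), F(-1)]
u3 = [F(2), F(-1), F(0)]
v2, v3 = gram_schmidt([u2, u3])
assert dot(v2, v3) == 0
print("orthogonalized:", v2, v3)
```

Note that normalization (Step (3)) is deliberately left out here, since it introduces square roots that cannot be represented exactly as rationals.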
Example 10.3.1. Consider the matrix

    A = \begin{pmatrix} 2 & 2 & 1 \\ 2 & 5 & 2 \\ 1 & 2 & 2 \end{pmatrix}.

To find the eigenvalues of A, we need to find the roots of

    det \begin{pmatrix} 2-\lambda & 2 & 1 \\ 2 & 5-\lambda & 2 \\ 1 & 2 & 2-\lambda \end{pmatrix} = 0;

in other words, (λ − 7)(λ − 1)² = 0. The eigenvalues are therefore λ_1 = 7 and (double root) λ_2 = λ_3 = 1. An eigenvector corresponding to λ_1 = 7 is a solution of the system

    (A − 7I)u = \begin{pmatrix} -5 & 2 & 1 \\ 2 & -2 & 2 \\ 1 & 2 & -5 \end{pmatrix} u = 0,  with root  u_1 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}.

Eigenvectors corresponding to λ_2 = λ_3 = 1 are solutions of the system

    (A − I)u = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix} u = 0,  with roots  u_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}  and  u_3 = \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix},

which are linearly independent. Next, we apply the Gram-Schmidt orthogonalization process to u_2 and u_3, and obtain

    v_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}  and  v_3 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix},

which are now orthogonal to each other. Note that we do not have to do anything to u_1 at this stage, in view of Proposition 10G. We now conclude that

    v_1 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix},  v_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix},  v_3 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}

form an orthogonal basis of R^3. Normalizing each of these, we obtain respectively

    w_1 = \begin{pmatrix} 1/\sqrt{6} \\ 2/\sqrt{6} \\ 1/\sqrt{6} \end{pmatrix},  w_2 = \begin{pmatrix} 1/\sqrt{2} \\ 0 \\ -1/\sqrt{2} \end{pmatrix},  w_3 = \begin{pmatrix} 1/\sqrt{3} \\ -1/\sqrt{3} \\ 1/\sqrt{3} \end{pmatrix}.

We now take

    P = ( w_1  w_2  w_3 ) = \begin{pmatrix} 1/\sqrt{6} & 1/\sqrt{2} & 1/\sqrt{3} \\ 2/\sqrt{6} & 0 & -1/\sqrt{3} \\ 1/\sqrt{6} & -1/\sqrt{2} & 1/\sqrt{3} \end{pmatrix}.

Then

    P^{-1} = P^t = \begin{pmatrix} 1/\sqrt{6} & 2/\sqrt{6} & 1/\sqrt{6} \\ 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ 1/\sqrt{3} & -1/\sqrt{3} & 1/\sqrt{3} \end{pmatrix}  and  P^t A P = \begin{pmatrix} 7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
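The conclusion of Example 10.3.1 can be verified numerically (a sketch of my own): assembling P from the normalized eigenvectors and computing P^t A P should reproduce diag(7, 1, 1) up to floating-point rounding.

```python
import math

A = [[2, 2, 1],
     [2, 5, 2],
     [1, 2, 2]]
s6, s2, s3 = math.sqrt(6), math.sqrt(2), math.sqrt(3)
# columns are the orthonormal eigenvectors w1, w2, w3
P = [[1/s6,  1/s2,  1/s3],
     [2/s6,  0.0,  -1/s3],
     [1/s6, -1/s2,  1/s3]]

def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

D = matmul(transpose(P), matmul(A, P))
expected = [[7, 0, 0], [0, 1, 0], [0, 0, 1]]
for i in range(3):
    for j in range(3):
        assert abs(D[i][j] - expected[i][j]) < 1e-12
print("P^t A P = diag(7, 1, 1)")
```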
Example 10.3.2. Consider the matrix

    A = \begin{pmatrix} -1 & 6 & -12 \\ 0 & -13 & 30 \\ 0 & -9 & 20 \end{pmatrix}.

To find the eigenvalues of A, we need to find the roots of

    det \begin{pmatrix} -1-\lambda & 6 & -12 \\ 0 & -13-\lambda & 30 \\ 0 & -9 & 20-\lambda \end{pmatrix} = 0;

in other words, (λ + 1)(λ − 2)(λ − 5) = 0. The eigenvalues are therefore λ_1 = −1, λ_2 = 2 and λ_3 = 5. An eigenvector corresponding to λ_1 = −1 is a solution of the system

    (A + I)u = \begin{pmatrix} 0 & 6 & -12 \\ 0 & -12 & 30 \\ 0 & -9 & 21 \end{pmatrix} u = 0,  with root  u_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.

An eigenvector corresponding to λ_2 = 2 is a solution of the system

    (A − 2I)u = \begin{pmatrix} -3 & 6 & -12 \\ 0 & -15 & 30 \\ 0 & -9 & 18 \end{pmatrix} u = 0,  with root  u_2 = \begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}.

An eigenvector corresponding to λ_3 = 5 is a solution of the system

    (A − 5I)u = \begin{pmatrix} -6 & 6 & -12 \\ 0 & -18 & 30 \\ 0 & -9 & 15 \end{pmatrix} u = 0,  with root  u_3 = \begin{pmatrix} 1 \\ -5 \\ -3 \end{pmatrix}.

Note that while u_1, u_2, u_3 correspond to distinct eigenvalues of A, they are not orthogonal. The matrix A is not symmetric, and so Proposition 10G does not apply in this case.
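This failure of orthogonality is easy to exhibit directly (a check of my own, not from the text): the dot products between eigenvectors for distinct eigenvalues of this non-symmetric matrix are not all zero.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

u1 = [1, 0, 0]    # eigenvalue -1
u2 = [0, 2, 1]    # eigenvalue  2
u3 = [1, -5, -3]  # eigenvalue  5

# u2 . u3 = 0*1 + 2*(-5) + 1*(-3) = -13, not zero
assert dot(u2, u3) == -13
# u1 . u3 = 1, also not zero
assert dot(u1, u3) == 1
print("dot(u2, u3) =", dot(u2, u3))
```

(The pair u_1, u_2 happens to be orthogonal here, but Proposition 10G would require orthogonality for every pair.)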
Example 10.3.3. Consider the matrix

    A = \begin{pmatrix} 5 & -2 & 0 \\ -2 & 6 & 2 \\ 0 & 2 & 7 \end{pmatrix}.

To find the eigenvalues of A, we need to find the roots of

    det \begin{pmatrix} 5-\lambda & -2 & 0 \\ -2 & 6-\lambda & 2 \\ 0 & 2 & 7-\lambda \end{pmatrix} = 0;

in other words, (λ − 3)(λ − 6)(λ − 9) = 0. The eigenvalues are therefore λ_1 = 3, λ_2 = 6 and λ_3 = 9. An eigenvector corresponding to λ_1 = 3 is a solution of the system

    (A − 3I)u = \begin{pmatrix} 2 & -2 & 0 \\ -2 & 3 & 2 \\ 0 & 2 & 4 \end{pmatrix} u = 0,  with root  u_1 = \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}.

An eigenvector corresponding to λ_2 = 6 is a solution of the system

    (A − 6I)u = \begin{pmatrix} -1 & -2 & 0 \\ -2 & 0 & 2 \\ 0 & 2 & 1 \end{pmatrix} u = 0,  with root  u_2 = \begin{pmatrix} 2 \\ -1 \\ 2 \end{pmatrix}.

An eigenvector corresponding to λ_3 = 9 is a solution of the system

    (A − 9I)u = \begin{pmatrix} -4 & -2 & 0 \\ -2 & -3 & 2 \\ 0 & 2 & -2 \end{pmatrix} u = 0,  with root  u_3 = \begin{pmatrix} -1 \\ 2 \\ 2 \end{pmatrix}.

Note now that the eigenvalues are distinct, so it follows from Proposition 10G that u_1, u_2, u_3 are orthogonal, and we do not have to apply Step (2) of the Orthogonal diagonalization process. Normalizing each of these vectors, we obtain respectively

    w_1 = \begin{pmatrix} 2/3 \\ 2/3 \\ -1/3 \end{pmatrix},  w_2 = \begin{pmatrix} 2/3 \\ -1/3 \\ 2/3 \end{pmatrix},  w_3 = \begin{pmatrix} -1/3 \\ 2/3 \\ 2/3 \end{pmatrix}.

We now take

    P = ( w_1  w_2  w_3 ) = \begin{pmatrix} 2/3 & 2/3 & -1/3 \\ 2/3 & -1/3 & 2/3 \\ -1/3 & 2/3 & 2/3 \end{pmatrix}.

Then

    P^{-1} = P^t = \begin{pmatrix} 2/3 & 2/3 & -1/3 \\ 2/3 & -1/3 & 2/3 \\ -1/3 & 2/3 & 2/3 \end{pmatrix}  and  P^t A P = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 9 \end{pmatrix}.
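Since every entry of P in Example 10.3.3 is a multiple of 1/3, the diagonalization can be verified exactly with rational arithmetic (a sketch of my own): P^t A P should equal diag(3, 6, 9) with no rounding at all.

```python
from fractions import Fraction as F

third = F(1, 3)
A = [[F(5), F(-2), F(0)],
     [F(-2), F(6), F(2)],
     [F(0), F(2), F(7)]]
# columns are the orthonormal eigenvectors w1, w2, w3
P = [[ 2*third,  2*third, -1*third],
     [ 2*third, -1*third,  2*third],
     [-1*third,  2*third,  2*third]]

def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

D = matmul(transpose(P), matmul(A, P))
assert D == [[3, 0, 0], [0, 6, 0], [0, 0, 9]]
print("P^t A P = diag(3, 6, 9)")
```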
Problems for Chapter 10
1. Prove Proposition 10B(b).
2. Let

    A = \begin{pmatrix} a+b & b-a \\ a-b & b+a \end{pmatrix},

where a, b ∈ R. Determine when A is orthogonal.

3. Suppose that A is an orthogonal matrix with real entries. Prove that
a) A^{-1} is an orthogonal matrix; and
b) det A = ±1.

4. Suppose that A and B are orthogonal matrices with real entries. Prove that AB is orthogonal.
5. Verify that for every a ∈ R, the matrix

    A = \frac{1}{1+2a^2} \begin{pmatrix} 1 & -2a & 2a^2 \\ 2a & 1-2a^2 & -2a \\ 2a^2 & 2a & 1 \end{pmatrix}

is orthogonal.

6. Suppose that λ is an eigenvalue of an orthogonal matrix A with real entries. Prove that 1/λ is also an eigenvalue of A.
7. Suppose that

    A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}

is an orthogonal matrix with real entries. Explain why a² + b² = c² + d² = 1 and ac + bd = 0, and quote clearly any result that you use. Deduce that A has one of the two possible forms

    A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}  or  A = \begin{pmatrix} \cos\theta & -\sin\theta \\ -\sin\theta & -\cos\theta \end{pmatrix},

where θ ∈ [0, 2π).
8. Consider the matrix

    A = \begin{pmatrix} 1 & -\sqrt{6} & \sqrt{3} \\ -\sqrt{6} & 2 & \sqrt{2} \\ \sqrt{3} & \sqrt{2} & 3 \end{pmatrix}.

a) Find the characteristic polynomial of A and show that A has eigenvalues 4 (twice) and −2.
b) Find an eigenvector of A corresponding to the eigenvalue −2.
c) Find two orthogonal eigenvectors of A corresponding to the eigenvalue 4.
d) Find an orthonormal basis of R^3 consisting of eigenvectors of A.
9. Apply the Orthogonal diagonalization process to each of the following matrices:

a) A = \begin{pmatrix} 5 & 0 & 6 \\ 0 & 11 & 6 \\ 6 & 6 & -2 \end{pmatrix}
b) A = \begin{pmatrix} 0 & 2 & 0 \\ 2 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
c) A = \begin{pmatrix} 1 & -4 & 2 \\ -4 & 1 & -2 \\ 2 & -2 & -2 \end{pmatrix}
d) A = \begin{pmatrix} 2 & 0 & 36 \\ 0 & 3 & 0 \\ 36 & 0 & 23 \end{pmatrix}
e) A = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
f) A = \begin{pmatrix} -7 & 24 & 0 & 0 \\ 24 & 7 & 0 & 0 \\ 0 & 0 & -7 & 24 \\ 0 & 0 & 24 & 7 \end{pmatrix}