
W W L CHEN

© W W L Chen, 1982, 2008.

This chapter originates from material used by the author at Imperial College, University of London, between 1981 and 1990.

It is available free to all individuals, on the understanding that it is not to be used for financial gain,

and may be downloaded and/or photocopied, with or without permission from the author.

However, this document may not be kept on any information storage and retrieval system without permission

from the author, unless such system is not accessible to any individuals other than its owners.

Chapter 2

MATRICES

2.1. Introduction

A rectangular array of numbers of the form
\[
\begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\tag{1}
\]
is called an m×n matrix, with m rows and n columns. We count rows from the top and columns from the left. Hence
\[
( a_{i1} \ \dots \ a_{in} )
\quad\text{and}\quad
\begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}
\]
represent respectively the i-th row and the j-th column of the matrix (1), and a_{ij} represents the entry in the matrix (1) on the i-th row and j-th column.

Example 2.1.1. Consider the 3×4 matrix
\[
\begin{pmatrix} 2 & 4 & 3 & -1 \\ 3 & 1 & 5 & 2 \\ -1 & 0 & 7 & 6 \end{pmatrix}.
\]
Here
\[
( 3 \ 1 \ 5 \ 2 )
\quad\text{and}\quad
\begin{pmatrix} 3 \\ 5 \\ 7 \end{pmatrix}
\]
represent respectively the 2-nd row and the 3-rd column of the matrix, and 5 represents the entry in the matrix on the 2-nd row and 3-rd column.

We now consider the question of arithmetic involving matrices. First of all, let us study the problem of addition. A reasonable theory can be derived from the following definition.

Definition. Suppose that the two matrices
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \dots & b_{mn} \end{pmatrix}
\]
both have m rows and n columns. Then we write
\[
A + B = \begin{pmatrix} a_{11}+b_{11} & \dots & a_{1n}+b_{1n} \\ \vdots & & \vdots \\ a_{m1}+b_{m1} & \dots & a_{mn}+b_{mn} \end{pmatrix}
\]
and call this the sum of the two matrices A and B.

Example 2.1.2. Suppose that
\[
A = \begin{pmatrix} 2 & 4 & 3 & -1 \\ 3 & 1 & 5 & 2 \\ -1 & 0 & 7 & 6 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 1 & 2 & -2 & 7 \\ 0 & 2 & 4 & -1 \\ -2 & 1 & 3 & 3 \end{pmatrix}.
\]
Then
\[
A + B = \begin{pmatrix} 2+1 & 4+2 & 3-2 & -1+7 \\ 3+0 & 1+2 & 5+4 & 2-1 \\ -1-2 & 0+1 & 7+3 & 6+3 \end{pmatrix}
= \begin{pmatrix} 3 & 6 & 1 & 6 \\ 3 & 3 & 9 & 1 \\ -3 & 1 & 10 & 9 \end{pmatrix}.
\]

Example 2.1.3. We do not have a definition for “adding” the matrices
\[
\begin{pmatrix} 2 & 4 & 3 & -1 \\ -1 & 0 & 7 & 6 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 2 & 4 & 3 \\ 3 & 1 & 5 \\ -1 & 0 & 7 \end{pmatrix}.
\]

PROPOSITION 2A. (MATRIX ADDITION) Suppose that A, B, C are m×n matrices. Suppose further that O represents the m×n matrix with all entries zero. Then
(a) A + B = B + A;
(b) A + (B + C) = (A + B) + C;
(c) A + O = A; and
(d) there is an m×n matrix A′ such that A + A′ = O.

Proof. Parts (a)–(c) are easy consequences of ordinary addition, as matrix addition is simply entry-wise addition. For part (d), we can consider the matrix A′ obtained from A by multiplying each entry of A by −1.

The theory of multiplication is rather more complicated, and includes multiplication of a matrix by a scalar as well as multiplication of two matrices.


Definition. Suppose that the matrix
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\]
has m rows and n columns, and that c ∈ R. Then we write
\[
cA = \begin{pmatrix} ca_{11} & \dots & ca_{1n} \\ \vdots & & \vdots \\ ca_{m1} & \dots & ca_{mn} \end{pmatrix}
\]
and call this the product of the matrix A by the scalar c.

Example 2.1.4. Suppose that
\[
A = \begin{pmatrix} 2 & 4 & 3 & -1 \\ 3 & 1 & 5 & 2 \\ -1 & 0 & 7 & 6 \end{pmatrix}.
\]
Then
\[
2A = \begin{pmatrix} 4 & 8 & 6 & -2 \\ 6 & 2 & 10 & 4 \\ -2 & 0 & 14 & 12 \end{pmatrix}.
\]

PROPOSITION 2B. (MULTIPLICATION BY SCALAR) Suppose that A, B are m×n matrices, and that c, d ∈ R. Suppose further that O represents the m×n matrix with all entries zero. Then
(a) c(A + B) = cA + cB;
(b) (c + d)A = cA + dA;
(c) 0A = O; and
(d) c(dA) = (cd)A.

Proof. These are all easy consequences of ordinary multiplication, as multiplication by the scalar c is simply entry-wise multiplication by the number c.

The question of multiplication of two matrices is rather more complicated. To motivate this, let us consider the representation of a system of linear equations
\[
\begin{aligned}
a_{11}x_1 + \dots + a_{1n}x_n &= b_1, \\
&\ \ \vdots \\
a_{m1}x_1 + \dots + a_{mn}x_n &= b_m,
\end{aligned}
\tag{2}
\]
in the form Ax = b, where
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}
\tag{3}
\]
represent the coefficients and
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
\tag{4}
\]
represents the variables. This can be written in full matrix notation by
\[
\begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}.
\]
Can you work out the meaning of this representation?

Now let us define matrix multiplication more formally.

Definition. Suppose that
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} b_{11} & \dots & b_{1p} \\ \vdots & & \vdots \\ b_{n1} & \dots & b_{np} \end{pmatrix}
\]
are respectively an m×n matrix and an n×p matrix. Then the matrix product AB is given by the m×p matrix
\[
AB = \begin{pmatrix} q_{11} & \dots & q_{1p} \\ \vdots & & \vdots \\ q_{m1} & \dots & q_{mp} \end{pmatrix},
\]
where for every i = 1, ..., m and j = 1, ..., p, we have
\[
q_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = a_{i1}b_{1j} + \dots + a_{in}b_{nj}.
\]

Remark. Note first of all that the number of columns of the first matrix must be equal to the number of rows of the second matrix. On the other hand, for a simple way to work out q_{ij}, the entry in the i-th row and j-th column of AB, we observe that the i-th row of A and the j-th column of B are respectively
\[
( a_{i1} \ \dots \ a_{in} )
\quad\text{and}\quad
\begin{pmatrix} b_{1j} \\ \vdots \\ b_{nj} \end{pmatrix}.
\]
We now multiply the corresponding entries – from a_{i1} with b_{1j}, and so on, until a_{in} with b_{nj} – and then add these products to obtain q_{ij}.
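The formula for q_{ij} translates directly into code. A minimal sketch in plain Python (the function name matmul is ours, not the text's):

```python
def matmul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B, both given as lists
    of rows, using q_ij = a_i1*b_1j + ... + a_in*b_nj."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "columns of A must equal rows of B"
    # Build the m x p result entry by entry, exactly as in the definition.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

# The matrices of Example 2.1.5 below.
A = [[2, 4, 3, -1], [3, 1, 5, 2], [-1, 0, 7, 6]]
B = [[1, 4], [2, 3], [0, -2], [3, 1]]
print(matmul(A, B))
```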

Example 2.1.5. Consider the matrices
\[
A = \begin{pmatrix} 2 & 4 & 3 & -1 \\ 3 & 1 & 5 & 2 \\ -1 & 0 & 7 & 6 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 1 & 4 \\ 2 & 3 \\ 0 & -2 \\ 3 & 1 \end{pmatrix}.
\]
Note that A is a 3×4 matrix and B is a 4×2 matrix, so that the product AB is a 3×2 matrix. Let us calculate the product
\[
AB = \begin{pmatrix} q_{11} & q_{12} \\ q_{21} & q_{22} \\ q_{31} & q_{32} \end{pmatrix}.
\]

Consider first of all q_{11}. To calculate this, we need the 1-st row of A and the 1-st column of B, so let us cover up all unnecessary information, so that
\[
\begin{pmatrix} 2 & 4 & 3 & -1 \\ \times & \times & \times & \times \\ \times & \times & \times & \times \end{pmatrix}
\begin{pmatrix} 1 & \times \\ 2 & \times \\ 0 & \times \\ 3 & \times \end{pmatrix}
= \begin{pmatrix} q_{11} & \times \\ \times & \times \\ \times & \times \end{pmatrix}.
\]
From the definition, we have
\[
q_{11} = 2\cdot1 + 4\cdot2 + 3\cdot0 + (-1)\cdot3 = 2 + 8 + 0 - 3 = 7.
\]

Consider next q_{12}. To calculate this, we need the 1-st row of A and the 2-nd column of B, so let us cover up all unnecessary information, so that
\[
\begin{pmatrix} 2 & 4 & 3 & -1 \\ \times & \times & \times & \times \\ \times & \times & \times & \times \end{pmatrix}
\begin{pmatrix} \times & 4 \\ \times & 3 \\ \times & -2 \\ \times & 1 \end{pmatrix}
= \begin{pmatrix} \times & q_{12} \\ \times & \times \\ \times & \times \end{pmatrix}.
\]
From the definition, we have
\[
q_{12} = 2\cdot4 + 4\cdot3 + 3\cdot(-2) + (-1)\cdot1 = 8 + 12 - 6 - 1 = 13.
\]

Consider next q_{21}. To calculate this, we need the 2-nd row of A and the 1-st column of B, so let us cover up all unnecessary information, so that
\[
\begin{pmatrix} \times & \times & \times & \times \\ 3 & 1 & 5 & 2 \\ \times & \times & \times & \times \end{pmatrix}
\begin{pmatrix} 1 & \times \\ 2 & \times \\ 0 & \times \\ 3 & \times \end{pmatrix}
= \begin{pmatrix} \times & \times \\ q_{21} & \times \\ \times & \times \end{pmatrix}.
\]
From the definition, we have
\[
q_{21} = 3\cdot1 + 1\cdot2 + 5\cdot0 + 2\cdot3 = 3 + 2 + 0 + 6 = 11.
\]

Consider next q_{22}. To calculate this, we need the 2-nd row of A and the 2-nd column of B, so let us cover up all unnecessary information, so that
\[
\begin{pmatrix} \times & \times & \times & \times \\ 3 & 1 & 5 & 2 \\ \times & \times & \times & \times \end{pmatrix}
\begin{pmatrix} \times & 4 \\ \times & 3 \\ \times & -2 \\ \times & 1 \end{pmatrix}
= \begin{pmatrix} \times & \times \\ \times & q_{22} \\ \times & \times \end{pmatrix}.
\]
From the definition, we have
\[
q_{22} = 3\cdot4 + 1\cdot3 + 5\cdot(-2) + 2\cdot1 = 12 + 3 - 10 + 2 = 7.
\]

Consider next q_{31}. To calculate this, we need the 3-rd row of A and the 1-st column of B, so let us cover up all unnecessary information, so that
\[
\begin{pmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ -1 & 0 & 7 & 6 \end{pmatrix}
\begin{pmatrix} 1 & \times \\ 2 & \times \\ 0 & \times \\ 3 & \times \end{pmatrix}
= \begin{pmatrix} \times & \times \\ \times & \times \\ q_{31} & \times \end{pmatrix}.
\]
From the definition, we have
\[
q_{31} = (-1)\cdot1 + 0\cdot2 + 7\cdot0 + 6\cdot3 = -1 + 0 + 0 + 18 = 17.
\]

Consider finally q_{32}. To calculate this, we need the 3-rd row of A and the 2-nd column of B, so let us cover up all unnecessary information, so that
\[
\begin{pmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ -1 & 0 & 7 & 6 \end{pmatrix}
\begin{pmatrix} \times & 4 \\ \times & 3 \\ \times & -2 \\ \times & 1 \end{pmatrix}
= \begin{pmatrix} \times & \times \\ \times & \times \\ \times & q_{32} \end{pmatrix}.
\]
From the definition, we have
\[
q_{32} = (-1)\cdot4 + 0\cdot3 + 7\cdot(-2) + 6\cdot1 = -4 + 0 - 14 + 6 = -12.
\]

We therefore conclude that
\[
AB = \begin{pmatrix} 2 & 4 & 3 & -1 \\ 3 & 1 & 5 & 2 \\ -1 & 0 & 7 & 6 \end{pmatrix}
\begin{pmatrix} 1 & 4 \\ 2 & 3 \\ 0 & -2 \\ 3 & 1 \end{pmatrix}
= \begin{pmatrix} 7 & 13 \\ 11 & 7 \\ 17 & -12 \end{pmatrix}.
\]

Example 2.1.6. Consider again the matrices
\[
A = \begin{pmatrix} 2 & 4 & 3 & -1 \\ 3 & 1 & 5 & 2 \\ -1 & 0 & 7 & 6 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 1 & 4 \\ 2 & 3 \\ 0 & -2 \\ 3 & 1 \end{pmatrix}.
\]
Note that B is a 4×2 matrix and A is a 3×4 matrix, so that we do not have a definition for the “product” BA.
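Examples 2.1.5 and 2.1.6 can be checked with NumPy (our choice of tool): the product AB is defined, while the shape mismatch makes BA raise an error.

```python
import numpy as np

A = np.array([[2, 4, 3, -1], [3, 1, 5, 2], [-1, 0, 7, 6]])  # 3 x 4
B = np.array([[1, 4], [2, 3], [0, -2], [3, 1]])              # 4 x 2

print(A @ B)  # defined: 3 x 4 times 4 x 2 gives a 3 x 2 matrix

# The "product" BA is undefined: B has 2 columns but A has 3 rows.
try:
    B @ A
except ValueError as e:
    print("BA is undefined:", e)
```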

We leave the proofs of the following results as exercises for the interested reader.

PROPOSITION 2C. (ASSOCIATIVE LAW) Suppose that A is an m×n matrix, B is an n×p matrix and C is a p×r matrix. Then A(BC) = (AB)C.

PROPOSITION 2D. (DISTRIBUTIVE LAWS)
(a) Suppose that A is an m×n matrix and B and C are n×p matrices. Then A(B + C) = AB + AC.
(b) Suppose that A and B are m×n matrices and C is an n×p matrix. Then (A + B)C = AC + BC.

PROPOSITION 2E. Suppose that A is an m×n matrix, B is an n×p matrix, and that c ∈ R. Then c(AB) = (cA)B = A(cB).

2.2. Systems of Linear Equations

Note that the system (2) of linear equations can be written in matrix form as
\[
Ax = b,
\]
where the matrices A, x and b are given by (3) and (4). In this section, we shall establish the following important result.

PROPOSITION 2F. Every system of linear equations of the form (2) has either no solution, exactly one solution, or infinitely many solutions.

Proof. Clearly the system (2) has either no solution, exactly one solution, or more than one solution. It remains to show that if the system (2) has two distinct solutions, then it must have infinitely many solutions. Suppose that x = u and x = v represent two distinct solutions. Then

\[
Au = b \quad\text{and}\quad Av = b,
\]
so that
\[
A(u - v) = Au - Av = b - b = 0,
\]
where 0 is the zero m×1 matrix. It now follows that for every c ∈ R, we have
\[
A(u + c(u - v)) = Au + A(c(u - v)) = Au + c(A(u - v)) = b + c0 = b,
\]
so that x = u + c(u − v) is a solution for every c ∈ R. Clearly we have infinitely many solutions.
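The argument can be illustrated numerically. A sketch with a hypothetical system that has two distinct solutions (the matrix and solutions below are our own, chosen singular on purpose):

```python
import numpy as np

# A hypothetical system Ax = b with two distinct solutions u and v.
A = np.array([[1.0, 1.0], [2.0, 2.0]])
b = np.array([3.0, 6.0])
u = np.array([1.0, 2.0])
v = np.array([3.0, 0.0])
assert np.allclose(A @ u, b) and np.allclose(A @ v, b)

# Every u + c(u - v) is then also a solution, giving infinitely many.
for c in [-1.0, 0.5, 10.0]:
    x = u + c * (u - v)
    print(c, np.allclose(A @ x, b))
```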

2.3. Inversion of Matrices

For the remainder of this chapter, we shall deal with square matrices, those where the number of rows equals the number of columns.

Definition. The n×n matrix
\[
I_n = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix},
\]
where
\[
a_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j, \end{cases}
\]
is called the identity matrix of order n.

Remark. Note that
\[
I_1 = (\,1\,)
\quad\text{and}\quad
I_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]

The following result is relatively easy to check. It shows that the identity matrix I_n acts as the identity for multiplication of n×n matrices.

PROPOSITION 2G. For every n×n matrix A, we have AI_n = I_nA = A.

This raises the following question: Given an n×n matrix A, is it possible to find another n×n matrix B such that AB = BA = I_n?


Definition. An n×n matrix A is said to be invertible if there exists an n×n matrix B such that AB = BA = I_n. In this case, we say that B is the inverse of A and write B = A^{-1}.

PROPOSITION 2H. Suppose that A is an invertible n×n matrix. Then its inverse A^{-1} is unique.

Proof. Suppose that B satisfies the requirements for being the inverse of A. Then AB = BA = I_n. It follows that
\[
A^{-1} = A^{-1}I_n = A^{-1}(AB) = (A^{-1}A)B = I_nB = B.
\]
Hence the inverse A^{-1} is unique.

PROPOSITION 2J. Suppose that A and B are invertible n×n matrices. Then (AB)^{-1} = B^{-1}A^{-1}.

Proof. In view of the uniqueness of inverse, it is sufficient to show that B^{-1}A^{-1} satisfies the requirements for being the inverse of AB. Note that
\[
(AB)(B^{-1}A^{-1}) = A(B(B^{-1}A^{-1})) = A((BB^{-1})A^{-1}) = A(I_nA^{-1}) = AA^{-1} = I_n
\]
and
\[
(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}(AB)) = B^{-1}((A^{-1}A)B) = B^{-1}(I_nB) = B^{-1}B = I_n,
\]
as required.

PROPOSITION 2K. Suppose that A is an invertible n×n matrix. Then (A^{-1})^{-1} = A.

Proof. Note that both (A^{-1})^{-1} and A satisfy the requirements for being the inverse of A^{-1}. Equality follows from the uniqueness of inverse.
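Propositions 2J and 2K are easy to check numerically on a pair of invertible matrices (the 2×2 matrices below are our own examples):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])  # invertible: determinant 1
B = np.array([[1.0, 3.0], [0.0, 1.0]])  # invertible: determinant 1

# Proposition 2J: (AB)^{-1} = B^{-1} A^{-1}, with the order reversed.
inv_AB = np.linalg.inv(A @ B)
print(np.allclose(inv_AB, np.linalg.inv(B) @ np.linalg.inv(A)))

# Proposition 2K: (A^{-1})^{-1} = A.
print(np.allclose(np.linalg.inv(np.linalg.inv(A)), A))
```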

2.4. Application to Matrix Multiplication

In this section, we shall discuss an application of invertible matrices. Detailed discussion of the technique involved will be covered in Chapter 7.

Definition. An n×n matrix
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix},
\]
where a_{ij} = 0 whenever i ≠ j, is called a diagonal matrix of order n.

Example 2.4.1. The 3×3 matrices
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]
are both diagonal.

Given an n×n matrix A, it is usually rather complicated to calculate
\[
A^k = \underbrace{A \cdots A}_{k}.
\]

Example 2.4.2. Consider the 3×3 matrix
\[
A = \begin{pmatrix} 17 & -10 & -5 \\ 45 & -28 & -15 \\ -30 & 20 & 12 \end{pmatrix}.
\]
Suppose that we wish to calculate A^{98}. It can be checked that if we take
\[
P = \begin{pmatrix} 1 & 1 & 2 \\ 3 & 0 & 3 \\ -2 & 3 & 0 \end{pmatrix},
\]
then
\[
P^{-1} = \begin{pmatrix} -3 & 2 & 1 \\ -2 & 4/3 & 1 \\ 3 & -5/3 & -1 \end{pmatrix}.
\]
Furthermore, if we write
\[
D = \begin{pmatrix} -3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix},
\]
then it can be checked that A = PDP^{-1}, so that
\[
A^{98} = \underbrace{(PDP^{-1}) \cdots (PDP^{-1})}_{98} = PD^{98}P^{-1}
= P \begin{pmatrix} 3^{98} & 0 & 0 \\ 0 & 2^{98} & 0 \\ 0 & 0 & 2^{98} \end{pmatrix} P^{-1}.
\]
This is much simpler than calculating A^{98} directly. Note that this example is only an illustration. We have not discussed here how the matrices P and D are found.
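The claims of Example 2.4.2 can be verified numerically. The sketch below uses a small power k = 5 rather than 98, since for k = 98 one would want exact integer arithmetic rather than floats:

```python
import numpy as np

A = np.array([[17.0, -10.0, -5.0], [45.0, -28.0, -15.0], [-30.0, 20.0, 12.0]])
P = np.array([[1.0, 1.0, 2.0], [3.0, 0.0, 3.0], [-2.0, 3.0, 0.0]])
P_inv = np.array([[-3.0, 2.0, 1.0],
                  [-2.0, 4.0 / 3.0, 1.0],
                  [3.0, -5.0 / 3.0, -1.0]])
D = np.diag([-3.0, 2.0, 2.0])

assert np.allclose(P @ P_inv, np.eye(3))  # P_inv really is the inverse of P
assert np.allclose(P @ D @ P_inv, A)      # A = P D P^{-1}

# The shortcut: only the diagonal entries get raised to the power.
A5 = P @ np.diag(np.diag(D) ** 5) @ P_inv
print(np.allclose(A5, np.linalg.matrix_power(A, 5)))
```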

2.5. Finding Inverses by Elementary Row Operations

In this section, we shall discuss a technique by which we can find the inverse of a square matrix, if the inverse exists. Before we discuss this technique, let us recall the three elementary row operations we discussed in the previous chapter. These are: (1) interchanging two rows; (2) adding a multiple of one row to another row; and (3) multiplying one row by a non-zero constant.

Let us now consider the following example.

Example 2.5.1. Consider the matrices
\[
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\quad\text{and}\quad
I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]

• Let us interchange rows 1 and 2 of A and do likewise for I_3. We obtain respectively
\[
\begin{pmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Note that
\[
\begin{pmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
\]

• Let us interchange rows 2 and 3 of A and do likewise for I_3. We obtain respectively
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.
\]
Note that
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
\]

• Let us add 3 times row 1 to row 2 of A and do likewise for I_3. We obtain respectively
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 3a_{11}+a_{21} & 3a_{12}+a_{22} & 3a_{13}+a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Note that
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 3a_{11}+a_{21} & 3a_{12}+a_{22} & 3a_{13}+a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
\]

• Let us add −2 times row 3 to row 1 of A and do likewise for I_3. We obtain respectively
\[
\begin{pmatrix} -2a_{31}+a_{11} & -2a_{32}+a_{12} & -2a_{33}+a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 0 & -2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Note that
\[
\begin{pmatrix} -2a_{31}+a_{11} & -2a_{32}+a_{12} & -2a_{33}+a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
= \begin{pmatrix} 1 & 0 & -2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
\]

• Let us multiply row 2 of A by 5 and do likewise for I_3. We obtain respectively
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 5a_{21} & 5a_{22} & 5a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Note that
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 5a_{21} & 5a_{22} & 5a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
\]

• Let us multiply row 3 of A by −1 and do likewise for I_3. We obtain respectively
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ -a_{31} & -a_{32} & -a_{33} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.
\]
Note that
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ -a_{31} & -a_{32} & -a_{33} \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.
\]

Let us now consider the problem in general.

Definition. By an elementary n×n matrix, we mean an n×n matrix obtained from I_n by an elementary row operation.

We state without proof the following important result. The interested reader may wish to construct a proof, taking into account the different types of elementary row operations.

PROPOSITION 2L. Suppose that A is an n×n matrix, and suppose that B is obtained from A by an elementary row operation. Suppose further that E is an elementary matrix obtained from I_n by the same elementary row operation. Then B = EA.
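Proposition 2L is easy to test: apply a row operation to A directly, and compare with multiplication by the corresponding elementary matrix (the 3×3 matrix below is our own example):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])

# Row operation: add 3 times row 1 to row 2, applied directly to A.
B = A.copy()
B[1] += 3 * B[0]

# The same operation applied to I_3 gives the elementary matrix E.
E = np.eye(3)
E[1] += 3 * E[0]

print(np.allclose(B, E @ A))  # Proposition 2L: B = EA
```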

We now adopt the following strategy. Consider an n×n matrix A. Suppose that it is possible to reduce the matrix A by a sequence α_1, α_2, ..., α_k of elementary row operations to the identity matrix I_n. If E_1, E_2, ..., E_k are respectively the elementary n×n matrices obtained from I_n by the same elementary row operations α_1, α_2, ..., α_k, then
\[
I_n = E_k \cdots E_2E_1A.
\]
We therefore must have
\[
A^{-1} = E_k \cdots E_2E_1 = E_k \cdots E_2E_1I_n.
\]
It follows that the inverse A^{-1} can be obtained from I_n by performing the same elementary row operations α_1, α_2, ..., α_k. Since we are performing the same elementary row operations on A and I_n, it makes sense to put them side by side. The process can then be described pictorially by
\[
(A \mid I_n) \xrightarrow{\alpha_1} (E_1A \mid E_1I_n) \xrightarrow{\alpha_2} (E_2E_1A \mid E_2E_1I_n) \xrightarrow{\alpha_3} \cdots \xrightarrow{\alpha_k} (E_k \cdots E_2E_1A \mid E_k \cdots E_2E_1I_n) = (I_n \mid A^{-1}).
\]
In other words, we consider an array with the matrix A on the left and the matrix I_n on the right. We now perform elementary row operations on the array and try to reduce the left hand half to the matrix I_n. If we succeed in doing so, then the right hand half of the array gives the inverse A^{-1}.

Example 2.5.2. Consider the matrix
\[
A = \begin{pmatrix} 1 & 1 & 2 \\ 3 & 0 & 3 \\ -2 & 3 & 0 \end{pmatrix}.
\]
To find A^{-1}, we consider the array
\[
(A \mid I_3) = \left(\begin{array}{ccc|ccc} 1 & 1 & 2 & 1 & 0 & 0 \\ 3 & 0 & 3 & 0 & 1 & 0 \\ -2 & 3 & 0 & 0 & 0 & 1 \end{array}\right).
\]
We now perform elementary row operations on this array and try to reduce the left hand half to the matrix I_3. Note that if we succeed, then the final array is clearly in reduced row echelon form. We therefore follow the same procedure as reducing an array to reduced row echelon form. Adding −3 times row 1 to row 2, we obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 1 & 2 & 1 & 0 & 0 \\ 0 & -3 & -3 & -3 & 1 & 0 \\ -2 & 3 & 0 & 0 & 0 & 1 \end{array}\right).
\]
Adding 2 times row 1 to row 3, we obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 1 & 2 & 1 & 0 & 0 \\ 0 & -3 & -3 & -3 & 1 & 0 \\ 0 & 5 & 4 & 2 & 0 & 1 \end{array}\right).
\]
Multiplying row 3 by 3, we obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 1 & 2 & 1 & 0 & 0 \\ 0 & -3 & -3 & -3 & 1 & 0 \\ 0 & 15 & 12 & 6 & 0 & 3 \end{array}\right).
\]
Adding 5 times row 2 to row 3, we obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 1 & 2 & 1 & 0 & 0 \\ 0 & -3 & -3 & -3 & 1 & 0 \\ 0 & 0 & -3 & -9 & 5 & 3 \end{array}\right).
\]
Multiplying row 1 by 3, we obtain
\[
\left(\begin{array}{ccc|ccc} 3 & 3 & 6 & 3 & 0 & 0 \\ 0 & -3 & -3 & -3 & 1 & 0 \\ 0 & 0 & -3 & -9 & 5 & 3 \end{array}\right).
\]
Adding 2 times row 3 to row 1, we obtain
\[
\left(\begin{array}{ccc|ccc} 3 & 3 & 0 & -15 & 10 & 6 \\ 0 & -3 & -3 & -3 & 1 & 0 \\ 0 & 0 & -3 & -9 & 5 & 3 \end{array}\right).
\]
Adding −1 times row 3 to row 2, we obtain
\[
\left(\begin{array}{ccc|ccc} 3 & 3 & 0 & -15 & 10 & 6 \\ 0 & -3 & 0 & 6 & -4 & -3 \\ 0 & 0 & -3 & -9 & 5 & 3 \end{array}\right).
\]
Adding 1 times row 2 to row 1, we obtain
\[
\left(\begin{array}{ccc|ccc} 3 & 0 & 0 & -9 & 6 & 3 \\ 0 & -3 & 0 & 6 & -4 & -3 \\ 0 & 0 & -3 & -9 & 5 & 3 \end{array}\right).
\]
Multiplying row 1 by 1/3, we obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -3 & 2 & 1 \\ 0 & -3 & 0 & 6 & -4 & -3 \\ 0 & 0 & -3 & -9 & 5 & 3 \end{array}\right).
\]
Multiplying row 2 by −1/3, we obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -3 & 2 & 1 \\ 0 & 1 & 0 & -2 & 4/3 & 1 \\ 0 & 0 & -3 & -9 & 5 & 3 \end{array}\right).
\]
Multiplying row 3 by −1/3, we obtain
\[
\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -3 & 2 & 1 \\ 0 & 1 & 0 & -2 & 4/3 & 1 \\ 0 & 0 & 1 & 3 & -5/3 & -1 \end{array}\right).
\]
Note now that the array is in reduced row echelon form, and that the left hand half is the identity matrix I_3. It follows that the right hand half of the array represents the inverse A^{-1}. Hence
\[
A^{-1} = \begin{pmatrix} -3 & 2 & 1 \\ -2 & 4/3 & 1 \\ 3 & -5/3 & -1 \end{pmatrix}.
\]

Example 2.5.3. Consider the matrix
\[
A = \begin{pmatrix} 1 & 1 & 2 & 3 \\ 2 & 2 & 4 & 5 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]
To find A^{-1}, we consider the array
\[
(A \mid I_4) = \left(\begin{array}{cccc|cccc} 1 & 1 & 2 & 3 & 1 & 0 & 0 & 0 \\ 2 & 2 & 4 & 5 & 0 & 1 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right).
\]
We now perform elementary row operations on this array and try to reduce the left hand half to the matrix I_4. Adding −2 times row 1 to row 2, we obtain
\[
\left(\begin{array}{cccc|cccc} 1 & 1 & 2 & 3 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & -2 & 1 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right).
\]
Adding 1 times row 2 to row 4, we obtain
\[
\left(\begin{array}{cccc|cccc} 1 & 1 & 2 & 3 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & -2 & 1 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -2 & 1 & 0 & 1 \end{array}\right).
\]
Interchanging rows 2 and 3, we obtain
\[
\left(\begin{array}{cccc|cccc} 1 & 1 & 2 & 3 & 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & -2 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & -2 & 1 & 0 & 1 \end{array}\right).
\]
At this point, we observe that it is impossible to reduce the left hand half of the array to I_4. For those who remain unconvinced, let us continue. Adding 3 times row 3 to row 1, we obtain
\[
\left(\begin{array}{cccc|cccc} 1 & 1 & 2 & 0 & -5 & 3 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & -2 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & -2 & 1 & 0 & 1 \end{array}\right).
\]
Adding −1 times row 4 to row 3, we obtain
\[
\left(\begin{array}{cccc|cccc} 1 & 1 & 2 & 0 & -5 & 3 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & -2 & 1 & 0 & 1 \end{array}\right).
\]
Multiplying row 1 by 6 (here we want to avoid fractions in the next two steps), we obtain
\[
\left(\begin{array}{cccc|cccc} 6 & 6 & 12 & 0 & -30 & 18 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & -2 & 1 & 0 & 1 \end{array}\right).
\]
Adding −15 times row 4 to row 1, we obtain
\[
\left(\begin{array}{cccc|cccc} 6 & 6 & 12 & 0 & 0 & 3 & 0 & -15 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & -2 & 1 & 0 & 1 \end{array}\right).
\]
Adding −2 times row 2 to row 1, we obtain
\[
\left(\begin{array}{cccc|cccc} 6 & 0 & 12 & 0 & 0 & 3 & -2 & -15 \\ 0 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & -2 & 1 & 0 & 1 \end{array}\right).
\]
Multiplying row 1 by 1/6, multiplying row 2 by 1/3, multiplying row 3 by −1 and multiplying row 4 by −1/2, we obtain
\[
\left(\begin{array}{cccc|cccc} 1 & 0 & 2 & 0 & 0 & 1/2 & -1/3 & -5/2 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1/3 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & -1/2 & 0 & -1/2 \end{array}\right).
\]
Note now that the array is in reduced row echelon form, and that the left hand half is not the identity matrix I_4. Our technique has failed. In fact, the matrix A is not invertible.

2.6. Criteria for Invertibility

Examples 2.5.2–2.5.3 raise the question of when a given matrix is invertible. In this section, we shall obtain some partial answers to this question. Our first step here is the following simple observation.

PROPOSITION 2M. Every elementary matrix is invertible.

Proof. Let us consider elementary row operations. Recall that these are: (1) interchanging two rows; (2) adding a multiple of one row to another row; and (3) multiplying one row by a non-zero constant.

These elementary row operations can clearly be reversed by elementary row operations. For (1), we interchange the two rows again. For (2), if we have originally added c times row i to row j, then we can reverse this by adding −c times row i to row j. For (3), if we have multiplied any row by a non-zero constant c, we can reverse this by multiplying the same row by the constant 1/c. Note now that each elementary matrix is obtained from I_n by an elementary row operation. The inverse of this elementary matrix is clearly the elementary matrix obtained from I_n by the elementary row operation that reverses the original elementary row operation.

Suppose that an n×n matrix B can be obtained from an n×n matrix A by a finite sequence of elementary row operations. Then since these elementary row operations can be reversed, the matrix A can be obtained from the matrix B by a finite sequence of elementary row operations.

Definition. An n×n matrix A is said to be row equivalent to an n×n matrix B if there exist a finite number of elementary n×n matrices E_1, ..., E_k such that B = E_k ⋯ E_1A.

Remark. Note that B = E_k ⋯ E_1A implies that A = E_1^{-1} ⋯ E_k^{-1}B. It follows that if A is row equivalent to B, then B is row equivalent to A. We usually say that A and B are row equivalent.

The following result gives conditions equivalent to the invertibility of an n×n matrix A.

PROPOSITION 2N. Suppose that
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix},
\]
and that
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
\quad\text{and}\quad
0 = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}
\]
are n×1 matrices, where x_1, ..., x_n are variables.
(a) Suppose that the matrix A is invertible. Then the system Ax = 0 of linear equations has only the trivial solution.
(b) Suppose that the system Ax = 0 of linear equations has only the trivial solution. Then the matrices A and I_n are row equivalent.
(c) Suppose that the matrices A and I_n are row equivalent. Then A is invertible.

Proof. (a) Suppose that x_0 is a solution of the system Ax = 0. Then since A is invertible, we have
\[
x_0 = I_nx_0 = (A^{-1}A)x_0 = A^{-1}(Ax_0) = A^{-1}0 = 0.
\]
It follows that the trivial solution is the only solution.

(b) Note that if the system Ax = 0 of linear equations has only the trivial solution, then it can be reduced by elementary row operations to the system
\[
x_1 = 0, \quad \dots, \quad x_n = 0.
\]
This is equivalent to saying that the array
\[
\left(\begin{array}{ccc|c} a_{11} & \dots & a_{1n} & 0 \\ \vdots & & \vdots & \vdots \\ a_{n1} & \dots & a_{nn} & 0 \end{array}\right)
\]
can be reduced by elementary row operations to the reduced row echelon form
\[
\left(\begin{array}{ccc|c} 1 & \dots & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \dots & 1 & 0 \end{array}\right).
\]
Hence the matrices A and I_n are row equivalent.

(c) Suppose that the matrices A and I_n are row equivalent. Then there exist elementary n×n matrices E_1, ..., E_k such that I_n = E_k ⋯ E_1A. By Proposition 2M, the matrices E_1, ..., E_k are all invertible, so that
\[
A = E_1^{-1} \cdots E_k^{-1}I_n = E_1^{-1} \cdots E_k^{-1}
\]
is a product of invertible matrices, and is therefore itself invertible.

2.7. Consequences of Invertibility

Suppose that the matrix
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix}
\]
is invertible. Consider the system Ax = b, where
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}
\]
are n×1 matrices, where x_1, ..., x_n are variables and b_1, ..., b_n ∈ R are arbitrary. Since A is invertible, let us consider x = A^{-1}b. Clearly
\[
Ax = A(A^{-1}b) = (AA^{-1})b = I_nb = b,
\]
so that x = A^{-1}b is a solution of the system. On the other hand, let x_0 be any solution of the system. Then Ax_0 = b, so that
\[
x_0 = I_nx_0 = (A^{-1}A)x_0 = A^{-1}(Ax_0) = A^{-1}b.
\]
It follows that the system has a unique solution. We have proved the following important result.

PROPOSITION 2P. Suppose that
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix},
\]
and that
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}
\]
are n×1 matrices, where x_1, ..., x_n are variables and b_1, ..., b_n ∈ R are arbitrary. Suppose further that the matrix A is invertible. Then the system Ax = b of linear equations has the unique solution x = A^{-1}b.

We next attempt to study the question in the opposite direction.

PROPOSITION 2Q. Suppose that
\[
A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix},
\]
and that
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}
\]
are n×1 matrices, where x_1, ..., x_n are variables. Suppose further that for every b_1, ..., b_n ∈ R, the system Ax = b of linear equations is soluble. Then the matrix A is invertible.

Proof. Suppose that
\[
b_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}, \quad \dots, \quad b_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.
\]
In other words, for every j = 1, ..., n, b_j is an n×1 matrix with entry 1 on row j and entry 0 elsewhere. Now let
\[
x_1 = \begin{pmatrix} x_{11} \\ \vdots \\ x_{n1} \end{pmatrix}, \quad \dots, \quad x_n = \begin{pmatrix} x_{1n} \\ \vdots \\ x_{nn} \end{pmatrix}
\]
denote respectively solutions of the systems of linear equations
\[
Ax = b_1, \quad \dots, \quad Ax = b_n.
\]
It is easy to check that
\[
A(x_1 \ \dots \ x_n) = (b_1 \ \dots \ b_n);
\]
in other words,
\[
A \begin{pmatrix} x_{11} & \dots & x_{1n} \\ \vdots & & \vdots \\ x_{n1} & \dots & x_{nn} \end{pmatrix} = I_n,
\]
so that A is invertible.

We can now summarize Propositions 2N, 2P and 2Q as follows.

PROPOSITION 2R. In the notation of Proposition 2N, the following four statements are equivalent:
(a) The matrix A is invertible.
(b) The system Ax = 0 of linear equations has only the trivial solution.
(c) The matrices A and I_n are row equivalent.
(d) For every b_1, ..., b_n ∈ R, the system Ax = b of linear equations is soluble.

2.8. Application to Economics

In this section, we describe briefly the Leontief input-output model, where an economy is divided into n sectors.

For every i = 1, ..., n, let x_i denote the monetary value of the total output of sector i over a fixed period, and let d_i denote the output of sector i needed to satisfy outside demand over the same fixed period. Collecting together x_i and d_i for i = 1, ..., n, we obtain the vectors
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{R}^n
\quad\text{and}\quad
d = \begin{pmatrix} d_1 \\ \vdots \\ d_n \end{pmatrix} \in \mathbb{R}^n,
\]
known respectively as the production vector and demand vector of the economy.

On the other hand, each of the n sectors requires material from some or all of the sectors to produce its output. For i, j = 1, ..., n, let c_{ij} denote the monetary value of the output of sector i needed by sector j to produce one unit of monetary value of output. For every j = 1, ..., n, the vector
\[
c_j = \begin{pmatrix} c_{1j} \\ \vdots \\ c_{nj} \end{pmatrix} \in \mathbb{R}^n
\]
is known as the unit consumption vector of sector j. Note that the column sum
\[
c_{1j} + \dots + c_{nj} \le 1
\tag{5}
\]
in order to ensure that sector j does not make a loss. Collecting together the unit consumption vectors, we obtain the matrix
\[
C = (c_1 \ \dots \ c_n) = \begin{pmatrix} c_{11} & \dots & c_{1n} \\ \vdots & & \vdots \\ c_{n1} & \dots & c_{nn} \end{pmatrix},
\]
known as the consumption matrix of the economy.

Consider the matrix product
\[
Cx = \begin{pmatrix} c_{11}x_1 + \dots + c_{1n}x_n \\ \vdots \\ c_{n1}x_1 + \dots + c_{nn}x_n \end{pmatrix}.
\]
For every i = 1, ..., n, the entry c_{i1}x_1 + ... + c_{in}x_n represents the monetary value of the output of sector i needed by all the sectors to produce their output. This leads to the production equation
\[
x = Cx + d.
\tag{6}
\]
Here Cx represents the part of the total output that is required by the various sectors of the economy to produce the output in the first place, and d represents the part of the total output that is available to satisfy outside demand.

Clearly (I − C)x = d. If the matrix I − C is invertible, then
\[
x = (I - C)^{-1}d.
\]

PROPOSITION 2S. Suppose that the entries of the consumption matrix C and the demand vector d are non-negative. Suppose further that the inequality (5) holds for each column of C. Then the inverse matrix (I − C)^{-1} exists, and the production vector x = (I − C)^{-1}d has non-negative entries and is the unique solution of the production equation (6).

Let us indulge in some heuristics. Initially, we have demand d. To produce d, we need Cd as input. To produce this extra Cd, we need C(Cd) = C^2d as input. To produce this extra C^2d, we need C(C^2d) = C^3d as input. And so on. Hence we need to produce
\[
d + Cd + C^2d + C^3d + \dots = (I + C + C^2 + C^3 + \dots)d
\]
in total. Now it is not difficult to check that for every positive integer k, we have
\[
(I - C)(I + C + C^2 + C^3 + \dots + C^k) = I - C^{k+1}.
\]
If the entries of C^{k+1} are all very small, then
\[
(I - C)(I + C + C^2 + C^3 + \dots + C^k) \approx I,
\]
so that
\[
(I - C)^{-1} \approx I + C + C^2 + C^3 + \dots + C^k.
\]
This gives a practical way of approximating (I − C)^{-1}, and also suggests that
\[
(I - C)^{-1} = I + C + C^2 + C^3 + \dots.
\]

Example 2.8.1. An economy consists of three sectors. Their dependence on each other is summarized in the table below:

                                                  To produce one unit of monetary
                                                  value of output in sector
                                                        1      2      3
  monetary value of output required from sector 1      0.3    0.2    0.1
  monetary value of output required from sector 2      0.4    0.5    0.2
  monetary value of output required from sector 3      0.1    0.1    0.3

Suppose that the final demand from sectors 1, 2 and 3 are respectively 30, 50 and 20. Then the production vector and demand vector are respectively
\[
x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
\quad\text{and}\quad
d = \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix} = \begin{pmatrix} 30 \\ 50 \\ 20 \end{pmatrix},
\]
while the consumption matrix is given by
\[
C = \begin{pmatrix} 0.3 & 0.2 & 0.1 \\ 0.4 & 0.5 & 0.2 \\ 0.1 & 0.1 & 0.3 \end{pmatrix},
\quad\text{so that}\quad
I - C = \begin{pmatrix} 0.7 & -0.2 & -0.1 \\ -0.4 & 0.5 & -0.2 \\ -0.1 & -0.1 & 0.7 \end{pmatrix}.
\]
The production equation (I − C)x = d has augmented matrix
\[
\left(\begin{array}{ccc|c} 0.7 & -0.2 & -0.1 & 30 \\ -0.4 & 0.5 & -0.2 & 50 \\ -0.1 & -0.1 & 0.7 & 20 \end{array}\right),
\quad\text{equivalent to}\quad
\left(\begin{array}{ccc|c} 7 & -2 & -1 & 300 \\ -4 & 5 & -2 & 500 \\ -1 & -1 & 7 & 200 \end{array}\right),
\]
and which can be converted to reduced row echelon form
\[
\left(\begin{array}{ccc|c} 1 & 0 & 0 & 3200/27 \\ 0 & 1 & 0 & 6100/27 \\ 0 & 0 & 1 & 700/9 \end{array}\right).
\]
This gives x_1 ≈ 119, x_2 ≈ 226 and x_3 ≈ 78, to the nearest integers.
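Both the computation of Example 2.8.1 and the series heuristic for (I − C)^{-1} can be checked numerically (NumPy is our choice of tool; the truncation point 200 is an arbitrary large value):

```python
import numpy as np

C = np.array([[0.3, 0.2, 0.1],
              [0.4, 0.5, 0.2],
              [0.1, 0.1, 0.3]])
d = np.array([30.0, 50.0, 20.0])
I = np.eye(3)

# Solve the production equation (I - C)x = d directly.
x = np.linalg.solve(I - C, d)
print(np.round(x))  # approximately (119, 226, 78)

# Heuristic: (I - C)^{-1} is approximated by I + C + C^2 + ... + C^k.
approx = np.zeros((3, 3))
term = np.eye(3)
for _ in range(200):
    approx += term
    term = term @ C  # next power of C
print(np.allclose(approx, np.linalg.inv(I - C)))
```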

2.9. Matrix Transformation on the Plane

Let A be a 2×2 matrix with real entries. A matrix transformation T : R^2 → R^2 can be defined as follows: for every x = (x_1, x_2) ∈ R^2, we write T(x) = y, where y = (y_1, y_2) ∈ R^2 satisfies
\[
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.
\]
Such a transformation is linear, in the sense that T(x′ + x′′) = T(x′) + T(x′′) for every x′, x′′ ∈ R^2 and T(cx) = cT(x) for every x ∈ R^2 and every c ∈ R. To see this, simply observe that
\[
A \begin{pmatrix} x_1' + x_1'' \\ x_2' + x_2'' \end{pmatrix}
= A \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} + A \begin{pmatrix} x_1'' \\ x_2'' \end{pmatrix}
\quad\text{and}\quad
A \begin{pmatrix} cx_1 \\ cx_2 \end{pmatrix} = cA \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.
\]

We shall study linear transformations in greater detail in Chapter 8. Here we confine ourselves to looking at a few simple matrix transformations on the plane.

Example 2.9.1. The matrix
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ -x_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents reflection across the x_1-axis, whereas the matrix
\[
A = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -x_1 \\ x_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents reflection across the x_2-axis. On the other hand, the matrix
\[
A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -x_1 \\ -x_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents reflection across the origin, whereas the matrix
\[
A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_2 \\ x_1 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents reflection across the line x_1 = x_2. We give a summary in the table below:

  Transformation                 Equations                   Matrix
  Reflection across x_1-axis     y_1 = x_1,  y_2 = -x_2      (1 0; 0 -1)
  Reflection across x_2-axis     y_1 = -x_1, y_2 = x_2       (-1 0; 0 1)
  Reflection across origin       y_1 = -x_1, y_2 = -x_2      (-1 0; 0 -1)
  Reflection across x_1 = x_2    y_1 = x_2,  y_2 = x_1       (0 1; 1 0)


Example 2.9.2. Let k be a fixed positive real number. The matrix
\[
A = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} kx_1 \\ kx_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents a dilation if k > 1 and a contraction if 0 < k < 1. On the other hand, the matrix
\[
A = \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} kx_1 \\ x_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents an expansion in the x_1-direction if k > 1 and a compression in the x_1-direction if 0 < k < 1, whereas the matrix
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ kx_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents an expansion in the x_2-direction if k > 1 and a compression in the x_2-direction if 0 < k < 1. We give a summary in the table below:

  Transformation                                              Equations                 Matrix
  Dilation or contraction by factor k > 0                     y_1 = kx_1, y_2 = kx_2    (k 0; 0 k)
  Expansion or compression in x_1-direction by factor k > 0   y_1 = kx_1, y_2 = x_2     (k 0; 0 1)
  Expansion or compression in x_2-direction by factor k > 0   y_1 = x_1,  y_2 = kx_2    (1 0; 0 k)

Example 2.9.3. Let k be a fixed real number. The matrix
\[
A = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 + kx_2 \\ x_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents a shear in the x_1-direction.

[Diagram: the image of a rectangular array of points under the shear T, shown for the cases k = 1 and k = −1.]

Similarly, the matrix
\[
A = \begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}
\quad\text{satisfies}\quad
A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ kx_1 + x_2 \end{pmatrix}
\]
for every (x_1, x_2) ∈ R^2, and so represents a shear in the x_2-direction. We give a summary in the table below:

  Transformation            Equations                      Matrix
  Shear in x_1-direction    y_1 = x_1 + kx_2, y_2 = x_2    (1 k; 0 1)
  Shear in x_2-direction    y_1 = x_1, y_2 = kx_1 + x_2    (1 0; k 1)

Example 2.9.4. For anticlockwise rotation by an angle θ, we have T(x1, x2) = (y1, y2), where

y1 + i y2 = (x1 + i x2)(cos θ + i sin θ),

and so

(y1, y2)^T = [ cos θ  −sin θ ; sin θ  cos θ ] (x1, x2)^T.

It follows that the matrix in question is given by

A = [ cos θ  −sin θ ]
    [ sin θ   cos θ ].

We give a summary in the table below:

Transformation                      Equations                                             Matrix
Anticlockwise rotation by angle θ   y1 = x1 cos θ − x2 sin θ, y2 = x1 sin θ + x2 cos θ    [ cos θ  −sin θ ; sin θ  cos θ ]
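As a quick numerical check, the rotation and shear matrices above can be applied to points directly. A minimal Python sketch using plain tuples; the helper names are our own:

```python
import math

def apply2(A, x):
    """Multiply a 2x2 matrix A by a column vector x = (x1, x2)."""
    return (A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1])

def rotation(theta):
    """Anticlockwise rotation by angle theta."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta),  math.cos(theta)))

def shear_x1(k):
    """Shear in the x1-direction with factor k."""
    return ((1, k), (0, 1))

# Rotating (1, 0) anticlockwise by 90 degrees gives (0, 1).
y1, y2 = apply2(rotation(math.pi / 2), (1, 0))
assert abs(y1) < 1e-12 and abs(y2 - 1) < 1e-12

# The shear with k = 1 maps (x1, x2) to (x1 + x2, x2).
assert apply2(shear_x1(1), (3, 2)) == (5, 2)
```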

We conclude this section by establishing the following result which reinforces the linearity of matrix transformations on the plane.

PROPOSITION 2T. Suppose that a matrix transformation T : R^2 → R^2 is given by an invertible matrix A. Then
(a) the image under T of a straight line is a straight line;
(b) the image under T of a straight line through the origin is a straight line through the origin; and
(c) the images under T of parallel straight lines are parallel straight lines.

Proof. Suppose that T(x1, x2) = (y1, y2). Since A is invertible, we have x = A^{-1} y, where

x = (x1, x2)^T and y = (y1, y2)^T.

The equation of a straight line is given by α x1 + β x2 = γ or, in matrix form, by

( α  β ) (x1, x2)^T = ( γ ).

Hence

( α  β ) A^{-1} (y1, y2)^T = ( γ ).

Let

( α′  β′ ) = ( α  β ) A^{-1}.

Then

( α′  β′ ) (y1, y2)^T = ( γ ).

In other words, the image under T of the straight line α x1 + β x2 = γ is α′ y1 + β′ y2 = γ, clearly another straight line. This proves (a). To prove (b), note that straight lines through the origin correspond to γ = 0. To prove (c), note that parallel straight lines correspond to different values of γ for the same values of α and β.
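Proposition 2T can be illustrated numerically: the image line has coefficients (α′ β′) = (α β)A^{-1}. A small Python check with a sample shear and a sample line (the particular values are ours):

```python
# Invertible matrix A (a shear), a line, and a few points on it.
A = ((1.0, 1.0), (0.0, 1.0))          # shear in the x1-direction, k = 1
alpha, beta, gamma = 2.0, -1.0, 3.0   # the line 2*x1 - x2 = 3

# Inverse of the 2x2 matrix A.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = (( A[1][1] / det, -A[0][1] / det),
        (-A[1][0] / det,  A[0][0] / det))

# (alpha', beta') = (alpha, beta) A^{-1}, a row vector times a matrix.
alpha_p = alpha * Ainv[0][0] + beta * Ainv[1][0]
beta_p  = alpha * Ainv[0][1] + beta * Ainv[1][1]

# Every image T(x1, x2) = (x1 + x2, x2) of a point on the original line
# satisfies alpha' * y1 + beta' * y2 = gamma.
for x1 in (-1.0, 0.0, 2.5):
    x2 = 2 * x1 - 3                   # point on the original line
    y1, y2 = x1 + x2, x2              # its image under A
    assert abs(alpha_p * y1 + beta_p * y2 - gamma) < 1e-12
```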

2.10. Application to Computer Graphics

Example 2.10.1. Consider the letter M in the diagram below:

[diagram: the letter M]

Following the boundary in the anticlockwise direction starting at the origin, the 12 vertices can be represented by the coordinates

(0, 0), (1, 0), (1, 6), (4, 0), (7, 6), (7, 0), (8, 0), (8, 8), (7, 8), (4, 2), (1, 8), (0, 8).

Let us apply a matrix transformation to these vertices, using the matrix

A = [ 1  1/2 ]
    [ 0   1  ],

representing a shear in the x1-direction with factor 1/2, so that

A (x1, x2)^T = (x1 + x2/2, x2)^T for every (x1, x2) ∈ R^2.

Then the images of the 12 vertices are respectively

(0, 0), (1, 0), (4, 6), (4, 0), (10, 6), (7, 0), (8, 0), (12, 8), (11, 8), (5, 2), (5, 8), (4, 8),

noting that

[ 1  1/2 ] [ 0 1 1 4 7 7 8 8 7 4 1 0 ]   [ 0 1 4 4 10 7 8 12 11 5 5 4 ]
[ 0   1  ] [ 0 0 6 0 6 0 0 8 8 2 8 8 ] = [ 0 0 6 0  6 0 0  8  8 2 8 8 ].

In view of Proposition 2T, the image of any line segment that joins two vertices is a line segment that joins the images of the two vertices. Hence the image of the letter M under the shear looks like the following:

[diagram: the sheared letter M]

Next, we may wish to translate this image. However, a translation by a vector h = (h1, h2) ∈ R^2 is of the form

(y1, y2)^T = (x1, x2)^T + (h1, h2)^T for every (x1, x2) ∈ R^2,

and this cannot be described by a matrix transformation on the plane. To overcome this deficiency, we introduce homogeneous coordinates. For every point (x1, x2) ∈ R^2, we identify it with the point (x1, x2, 1) ∈ R^3. Now we wish to translate a point (x1, x2) to (x1, x2) + (h1, h2) = (x1 + h1, x2 + h2), so we attempt to find a 3×3 matrix A* such that

(x1 + h1, x2 + h2, 1)^T = A* (x1, x2, 1)^T for every (x1, x2) ∈ R^2.

It is easy to check that

[ x1 + h1 ]   [ 1 0 h1 ] [ x1 ]
[ x2 + h2 ] = [ 0 1 h2 ] [ x2 ]   for every (x1, x2) ∈ R^2.
[    1    ]   [ 0 0 1  ] [ 1  ]

It follows that using homogeneous coordinates, translation by a vector h = (h1, h2) ∈ R^2 can be described by the matrix

A* = [ 1 0 h1 ]
     [ 0 1 h2 ]
     [ 0 0 1  ].
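With homogeneous coordinates, a translation becomes an ordinary matrix product. A minimal Python sketch; the helper names are our own:

```python
def hom_translation(h1, h2):
    """3x3 matrix describing translation by (h1, h2) in homogeneous coordinates."""
    return ((1, 0, h1),
            (0, 1, h2),
            (0, 0, 1))

def apply3(M, p):
    """Multiply a 3x3 matrix M by the homogeneous point p = (x1, x2, 1)."""
    return tuple(sum(M[i][j] * p[j] for j in range(3)) for i in range(3))

# Translating (5, 7) by (2, 3) gives (7, 10); the third coordinate stays 1.
assert apply3(hom_translation(2, 3), (5, 7, 1)) == (7, 10, 1)
```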

Remark. Consider a matrix transformation T : R^2 → R^2 on the plane given by a matrix

A = [ a11 a12 ]
    [ a21 a22 ].

Suppose that T(x1, x2) = (y1, y2). Then

(y1, y2)^T = A (x1, x2)^T = [ a11 a12 ; a21 a22 ] (x1, x2)^T.

Under homogeneous coordinates, the image of the point (x1, x2, 1) is now (y1, y2, 1). Note that

[ y1 ]   [ a11 a12 0 ] [ x1 ]
[ y2 ] = [ a21 a22 0 ] [ x2 ].
[ 1  ]   [ 0   0   1 ] [ 1  ]

It follows that homogeneous coordinates can also be used to study all the matrix transformations we have discussed in Section 2.9. By moving over to homogeneous coordinates, we simply replace the 2×2 matrix A by the 3×3 matrix

A* = [ A 0 ]
     [ 0 1 ].

Example 2.10.2. Returning to Example 2.10.1 of the letter M, the 12 vertices are now represented by homogeneous coordinates, put in an array in the form

[ 0 1 1 4 7 7 8 8 7 4 1 0 ]
[ 0 0 6 0 6 0 0 8 8 2 8 8 ]
[ 1 1 1 1 1 1 1 1 1 1 1 1 ].

Then the 2×2 matrix

A = [ 1  1/2 ]
    [ 0   1  ]

is now replaced by the 3×3 matrix

A* = [ 1  1/2  0 ]
     [ 0   1   0 ]
     [ 0   0   1 ].

Note that

A* [ 0 1 1 4 7 7 8 8 7 4 1 0 ]   [ 0 1 4 4 10 7 8 12 11 5 5 4 ]
   [ 0 0 6 0 6 0 0 8 8 2 8 8 ] = [ 0 0 6 0  6 0 0  8  8 2 8 8 ]
   [ 1 1 1 1 1 1 1 1 1 1 1 1 ]   [ 1 1 1 1  1 1 1  1  1 1 1 1 ].

Next, let us consider a translation by the vector (2, 3). The matrix under homogeneous coordinates for this translation is given by

B* = [ 1 0 2 ]
     [ 0 1 3 ]
     [ 0 0 1 ].

Note that

B* A* [ 0 1 1 4 7 7 8 8 7 4 1 0 ]   [ 2 3 6 6 12 9 10 14 13 7  7  6 ]
      [ 0 0 6 0 6 0 0 8 8 2 8 8 ] = [ 3 3 9 3  9 3  3 11 11 5 11 11 ]
      [ 1 1 1 1 1 1 1 1 1 1 1 1 ]   [ 1 1 1 1  1 1  1  1  1 1  1  1 ],

giving rise to coordinates in R^2, displayed as an array

[ 2 3 6 6 12 9 10 14 13 7  7  6 ]
[ 3 3 9 3  9 3  3 11 11 5 11 11 ].

Hence the image of the letter M under the shear followed by translation looks like the following:

[diagram: the sheared and translated letter M]

Example 2.10.3. Under homogeneous coordinates, the transformation representing a reflection across the x1-axis, followed by a shear by factor 2 in the x1-direction, followed by anticlockwise rotation by 90°, and followed by translation by vector (2, −1), has matrix

[ 1 0  2 ] [ 0 −1 0 ] [ 1 2 0 ] [ 1  0 0 ]   [ 0  1  2 ]
[ 0 1 −1 ] [ 1  0 0 ] [ 0 1 0 ] [ 0 −1 0 ] = [ 1 −2 −1 ].
[ 0 0  1 ] [ 0  0 1 ] [ 0 0 1 ] [ 0  0 1 ]   [ 0  0  1 ]

2.11. Complexity of a Non-Homogeneous System

Consider a system Ax = b of n linear equations in n variables, with an n×n coefficient matrix A. One way of solving the system Ax = b is to write down the augmented matrix

[ a11 . . . a1n | b1 ]
[  :         :  |  : ]     (7)
[ an1 . . . ann | bn ],

and then convert it to reduced row echelon form by elementary row operations.

The first step is to reduce it to row echelon form:

(I) First of all, we may need to interchange two rows in order to ensure that the top left entry in the array is non-zero. This requires n + 1 operations.

(II) Next, we need to multiply the new first row by a constant in order to make the top left pivot entry equal to 1. This requires n + 1 operations, and the array now looks like

[ 1   a12 . . . a1n | b1 ]
[ a21 a22 . . . a2n | b2 ]
[  :   :         :  |  : ]
[ an1 an2 . . . ann | bn ].

Note that we are abusing notation somewhat, as the entry a12 here, for example, may well be different from the entry a12 in the augmented matrix (7).

(III) For each row i = 2, . . . , n, we now multiply the first row by −ai1 and then add to row i. This requires 2(n−1)(n+1) operations, and the array now looks like

[ 1  a12 . . . a1n | b1 ]
[ 0  a22 . . . a2n | b2 ]     (8)
[ :   :         :  |  : ]
[ 0  an2 . . . ann | bn ].

(IV) In summary, to proceed from the form (7) to the form (8), the number of operations required is at most 2(n+1) + 2(n−1)(n+1) = 2n(n+1).

(V) Our next task is to convert the smaller array

[ a22 . . . a2n | b2 ]
[  :         :  |  : ]
[ an2 . . . ann | bn ]

to an array that looks like

[ 1  a23 . . . a2n | b2 ]
[ 0  a33 . . . a3n | b3 ]
[ :   :         :  |  : ]
[ 0  an3 . . . ann | bn ].

These have one row and one column fewer than the arrays (7) and (8), and the number of operations required is at most 2m(m+1), where m = n − 1. We continue in this way systematically to reach row echelon form, and conclude that the number of operations required to convert the augmented matrix (7) to row echelon form is at most

Σ_{m=1}^{n} 2m(m+1) ≈ (2/3)n^3.

The next step is to convert the row echelon form to reduced row echelon form. This is simpler, as many entries are now zero. It can be shown that the number of operations required is bounded by something like 2n^2; indeed, by something like n^2 if one analyzes the problem more carefully. In any case, these estimates are insignificant compared to the estimate (2/3)n^3 earlier.

We therefore conclude that the number of operations required to solve the system Ax = b by reducing the augmented matrix to reduced row echelon form is bounded by something like (2/3)n^3 when n is large.
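The estimate can be checked numerically: a short Python computation comparing the exact count Σ 2m(m+1) with the asymptotic bound (2/3)n^3 (the choice n = 100 is ours):

```python
# Exact operation count for reduction to row echelon form, versus the
# asymptotic estimate (2/3) n^3, for a moderately large n.
n = 100
exact = sum(2 * m * (m + 1) for m in range(1, n + 1))
estimate = 2 * n**3 / 3

# The ratio approaches 1 as n grows.
assert abs(exact / estimate - 1) < 0.05
```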

Another way of solving the system Ax = b is to first find the inverse matrix A^{-1}. This may involve converting the array

[ a11 . . . a1n | 1         ]
[  :         :  |   .   .   ]
[ an1 . . . ann |         1 ]

to reduced row echelon form by elementary row operations. It can be shown that the number of operations required is something like 2n^3, so this is less efficient than our first method.

2.12. Matrix Factorization

In some situations, we may need to solve systems of linear equations of the form Ax = b, with the same coefficient matrix A but for many different vectors b. If A is an invertible square matrix, then we can find its inverse A^{-1} and then compute A^{-1}b for each vector b. However, the matrix A may not be a square matrix, and we may have to convert the augmented matrix to reduced row echelon form.

In this section, we describe a more efficient way of solving this problem. To describe this, we first need a definition.

Definition. A rectangular array of numbers is said to be in quasi row echelon form if the following

conditions are satisfied:

(1) The left-most non-zero entry of any non-zero row is called a pivot entry. It is not necessary for its value to be equal to 1.

(2) All zero rows are grouped together at the bottom of the array.

(3) The pivot entry of a non-zero row occurring lower in the array is to the right of the pivot entry of a non-zero row occurring higher in the array.

In other words, the array looks like row echelon form in shape, except that the pivot entries do not have to be equal to 1.

We consider first of all a special case.

PROPOSITION 2U. Suppose that an m×n matrix A can be converted to quasi row echelon form by elementary row operations but without interchanging any two rows. Then A = LU, where L is an m×m lower triangular matrix with diagonal entries all equal to 1 and U is a quasi row echelon form of A.

Sketch of Proof. Recall that applying an elementary row operation to an m×n matrix corresponds to multiplying the matrix on the left by an elementary m×m matrix. For the operation of adding a multiple of a row higher in the array to a row lower in the array, the corresponding elementary matrix is lower triangular with diagonal entries all equal to 1; we call such elementary matrices unit lower triangular. If an m×n matrix A can be reduced in this way to quasi row echelon form U, then

U = Ek . . . E2 E1 A,

where the elementary matrices E1, E2, . . . , Ek are all unit lower triangular. Let L = (Ek . . . E2 E1)^{-1}. Then A = LU. It can be shown that products and inverses of unit lower triangular matrices are also unit lower triangular. Hence L is a unit lower triangular matrix as required.

If Ax = b and A = LU, then L(Ux) = b. Writing y = Ux, we have

Ly = b and Ux = y.

It follows that the problem of solving the system Ax = b corresponds to first solving the system Ly = b and then solving the system Ux = y. Both of these systems are easy to solve since both L and U have many zero entries. It remains to find L and U.
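Solving Ly = b, with L unit lower triangular, is forward substitution: each row determines one more variable. A minimal Python sketch (function name ours):

```python
def forward_sub(L, b):
    """Solve L y = b where L is unit lower triangular (diagonal entries 1)."""
    n = len(b)
    y = [0] * n
    for i in range(n):
        # Row i reads y_i + sum_{j<i} L[i][j] * y_j = b_i.
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

# y1 = 4, then y2 = 5 - 2*4 = -3.
assert forward_sub([[1, 0], [2, 1]], [4, 5]) == [4, -3]
```

Solving Ux = y is the analogous back substitution, working from the last non-zero row upwards.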

If we reduce the matrix A to quasi row echelon form by only performing the elementary row operation of adding a multiple of a row higher in the array to another row lower in the array, then U can be taken as the quasi row echelon form resulting from this. It remains to find L. However, note that L = (Ek . . . E2 E1)^{-1}, where U = Ek . . . E2 E1 A, and so

I = Ek . . . E2 E1 L.

This means that the very elementary row operations that convert A to U will convert L to I. We therefore wish to create a matrix L such that this is satisfied. It is simplest to illustrate the technique by an example.

Example 2.12.1. Consider the matrix

A = [ 2  −1   2  −2   3 ]
    [ 4   1   6  −5   8 ]
    [ 2 −10  −4   8  −5 ]
    [ 2 −13  −6  16  −5 ].

The entry 2 in row 1 and column 1 is a pivot entry, and column 1 is a pivot column. Adding −2 times row 1 to row 2, adding −1 times row 1 to row 3, and adding −1 times row 1 to row 4, we obtain

[ 2  −1   2  −2   3 ]
[ 0   3   2  −1   2 ]
[ 0  −9  −6  10  −8 ]
[ 0 −12  −8  18  −8 ].

Note that the same three elementary row operations convert

[ 1 0 0 0 ]      [ 1 0 0 0 ]
[ 2 1 0 0 ]  to  [ 0 1 0 0 ]
[ 1 * 1 0 ]      [ 0 * 1 0 ]
[ 1 * * 1 ]      [ 0 * * 1 ].

Next, the entry 3 in row 2 and column 2 is a pivot entry, and column 2 is a pivot column. Adding 3 times row 2 to row 3, and adding 4 times row 2 to row 4, we obtain

[ 2 −1  2 −2  3 ]
[ 0  3  2 −1  2 ]
[ 0  0  0  7 −2 ]
[ 0  0  0 14  0 ].

Note that the same two elementary row operations convert

[ 1  0 0 0 ]      [ 1 0 0 0 ]
[ 0  1 0 0 ]  to  [ 0 1 0 0 ]
[ 0 −3 1 0 ]      [ 0 0 1 0 ]
[ 0 −4 * 1 ]      [ 0 0 * 1 ].

Next, the entry 7 in row 3 and column 4 is a pivot entry, and column 4 is a pivot column. Adding −2 times row 3 to row 4, we obtain the quasi row echelon form

U = [ 2 −1  2 −2  3 ]
    [ 0  3  2 −1  2 ]
    [ 0  0  0  7 −2 ]
    [ 0  0  0  0  4 ],

where the entry 4 in row 4 and column 5 is a pivot entry, and column 5 is a pivot column. Note that the same elementary row operation converts

[ 1 0 0 0 ]      [ 1 0 0 0 ]
[ 0 1 0 0 ]  to  [ 0 1 0 0 ]
[ 0 0 1 0 ]      [ 0 0 1 0 ]
[ 0 0 2 1 ]      [ 0 0 0 1 ].

Now observe that if we take

L = [ 1  0 0 0 ]
    [ 2  1 0 0 ]
    [ 1 −3 1 0 ]
    [ 1 −4 2 1 ],

then L can be converted to I4 by the same elementary row operations that convert A to U.

The strategy is now clear. Every time we find a new pivot, we note its value and the entries below it. The lower triangular entries of L are formed by these columns, with each column divided by the value of the pivot entry in that column.

Example 2.12.2. Let us examine our last example again. The pivot columns at the time of establishing the pivot entries are respectively

[ 2 ]   [  *  ]   [  * ]   [ * ]
[ 4 ]   [  3  ]   [  * ]   [ * ]
[ 2 ]   [ −9  ]   [  7 ]   [ * ]
[ 2 ]   [ −12 ]   [ 14 ]   [ 4 ].

Dividing them respectively by the pivot entries 2, 3, 7 and 4, we obtain respectively the columns

[ 1 ]   [  * ]   [ * ]   [ * ]
[ 2 ]   [  1 ]   [ * ]   [ * ]
[ 1 ]   [ −3 ]   [ 1 ]   [ * ]
[ 1 ]   [ −4 ]   [ 2 ]   [ 1 ].

Note that the lower triangular entries of the matrix

L = [ 1  0 0 0 ]
    [ 2  1 0 0 ]
    [ 1 −3 1 0 ]
    [ 1 −4 2 1 ]

are given precisely by the entries of these columns.

LU FACTORIZATION ALGORITHM.
(1) Reduce the matrix A to quasi row echelon form by only performing the elementary row operation of adding a multiple of a row higher in the array to another row lower in the array. Let U be the quasi row echelon form obtained.
(2) Record any new pivot column at the time of its first recognition, and modify it by replacing any entry above the pivot entry by zero and dividing every other entry by the value of the pivot entry.
(3) Let L denote the square matrix obtained by letting the columns be the pivot columns as modified in step (2).
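The algorithm can be sketched in code. A minimal Python version using plain lists (no row interchanges, so it reports when one would be needed); the function name and error handling are ours, and the result is checked against the matrix A of Example 2.12.1:

```python
def lu_factor(A):
    """LU factorization without row interchanges: returns (L, U) with
    L unit lower triangular (m x m) and U a quasi row echelon form of A."""
    m, n = len(A), len(A[0])
    U = [list(row) for row in A]
    L = [[float(i == j) for j in range(m)] for i in range(m)]
    row = 0
    for col in range(n):
        if row >= m:
            break
        if U[row][col] == 0:
            if any(U[i][col] != 0 for i in range(row + 1, m)):
                raise ValueError("row interchange needed")
            continue                        # no pivot in this column
        for i in range(row + 1, m):
            mult = U[i][col] / U[row][col]  # entry of the modified pivot column
            L[i][row] = mult
            for j in range(col, n):
                U[i][j] -= mult * U[row][j]
        row += 1
    return L, U

A = [[2, -1, 2, -2, 3],
     [4, 1, 6, -5, 8],
     [2, -10, -4, 8, -5],
     [2, -13, -6, 16, -5]]
L, U = lu_factor(A)
assert L == [[1, 0, 0, 0], [2, 1, 0, 0], [1, -3, 1, 0], [1, -4, 2, 1]]
assert U == [[2, -1, 2, -2, 3], [0, 3, 2, -1, 2],
             [0, 0, 0, 7, -2], [0, 0, 0, 0, 4]]
```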

Example 2.12.3. We wish to solve the system of linear equations Ax = b, where

A = [  3 −1   2  −4  1 ]
    [ −3  3  −5   5 −2 ]
    [  6 −4  11 −10  6 ]
    [ −6  8 −21  13 −9 ]

and

b = [  1  ]
    [ −2  ]
    [  9  ]
    [ −15 ].

Let us first apply LU factorization to the matrix A. The first pivot column is column 1, with modified version

[  1 ]
[ −1 ]
[  2 ]
[ −2 ].

Adding row 1 to row 2, adding −2 times row 1 to row 3, and adding 2 times row 1 to row 4, we obtain

[ 3 −1   2 −4  1 ]
[ 0  2  −3  1 −1 ]
[ 0 −2   7 −2  4 ]
[ 0  6 −17  5 −7 ].

The second pivot column is column 2, with modified version

[  0 ]
[  1 ]
[ −1 ]
[  3 ].

Adding row 2 to row 3, and adding −3 times row 2 to row 4, we obtain

[ 3 −1  2 −4  1 ]
[ 0  2 −3  1 −1 ]
[ 0  0  4 −1  3 ]
[ 0  0 −8  2 −4 ].

The third pivot column is column 3, with modified version

[  0 ]
[  0 ]
[  1 ]
[ −2 ].

Adding 2 times row 3 to row 4, we obtain the quasi row echelon form

[ 3 −1  2 −4  1 ]
[ 0  2 −3  1 −1 ]
[ 0  0  4 −1  3 ]
[ 0  0  0  0  2 ].

The last pivot column is column 5, with modified version

[ 0 ]
[ 0 ]
[ 0 ]
[ 1 ].

It follows that

L = [  1  0  0 0 ]
    [ −1  1  0 0 ]
    [  2 −1  1 0 ]
    [ −2  3 −2 1 ]

and

U = [ 3 −1  2 −4  1 ]
    [ 0  2 −3  1 −1 ]
    [ 0  0  4 −1  3 ]
    [ 0  0  0  0  2 ].

We now consider the system Ly = b, with augmented matrix

[  1  0  0 0 |  1  ]
[ −1  1  0 0 | −2  ]
[  2 −1  1 0 |  9  ]
[ −2  3 −2 1 | −15 ].

Using row 1, we obtain y1 = 1. Using row 2, we obtain y2 − y1 = −2, so that y2 = −1. Using row 3, we obtain y3 + 2y1 − y2 = 9, so that y3 = 6. Using row 4, we obtain y4 − 2y1 + 3y2 − 2y3 = −15, so that y4 = 2. Hence

y = (1, −1, 6, 2)^T.

We next consider the system Ux = y, with augmented matrix

[ 3 −1  2 −4  1 |  1 ]
[ 0  2 −3  1 −1 | −1 ]
[ 0  0  4 −1  3 |  6 ]
[ 0  0  0  0  2 |  2 ].

Here the free variable is x4. Let x4 = t. Using row 4, we obtain 2x5 = 2, so that x5 = 1. Using row 3, we obtain 4x3 = 6 + x4 − 3x5 = 3 + t, so that x3 = 3/4 + t/4. Using row 2, we obtain

2x2 = −1 + 3x3 − x4 + x5 = 9/4 − t/4,

so that x2 = 9/8 − t/8. Using row 1, we obtain 3x1 = 1 + x2 − 2x3 + 4x4 − x5 = 27t/8 − 3/8, so that x1 = 9t/8 − 1/8. Hence

(x1, x2, x3, x4, x5) = ( (9t − 1)/8, (9 − t)/8, (3 + t)/4, t, 1 ), where t ∈ R.
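As a sanity check, the general solution can be substituted back into the original system Ax = b. A small Python verification (the sample values of t are ours):

```python
A = [[3, -1, 2, -4, 1],
     [-3, 3, -5, 5, -2],
     [6, -4, 11, -10, 6],
     [-6, 8, -21, 13, -9]]
b = [1, -2, 9, -15]

def solution(t):
    """The general solution found above, as a function of the parameter t."""
    return [(9 * t - 1) / 8, (9 - t) / 8, (3 + t) / 4, t, 1]

# A x = b holds for every value of the parameter t.
for t in (-2, 0, 1, 7):
    x = solution(t)
    for i in range(4):
        lhs = sum(A[i][j] * x[j] for j in range(5))
        assert abs(lhs - b[i]) < 1e-9
```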

Remarks. (1) In practical situations, interchanging rows is usually necessary to convert a matrix A to quasi row echelon form. The technique here can be modified to produce a matrix L which is not unit lower triangular, but which can be made unit lower triangular by interchanging rows.

(2) Computing an LU factorization of an n×n matrix takes approximately (2/3)n^3 operations. Solving the systems Ly = b and Ux = y requires approximately 2n^2 operations.

2.13. Application to Games of Strategy

Consider a game with two players. Player R, usually known as the row player, has m possible moves, denoted by i = 1, 2, 3, . . . , m, while player C, usually known as the column player, has n possible moves, denoted by j = 1, 2, 3, . . . , n. For every i = 1, 2, 3, . . . , m and j = 1, 2, 3, . . . , n, let aij denote the payoff that player C has to make to player R if player R makes move i and player C makes move j. These numbers give rise to the payoff matrix

A = [ a11 . . . a1n ]
    [  :         :  ]
    [ am1 . . . amn ].

The entries can be positive, negative or zero.

Suppose that for every i = 1, 2, 3, . . . , m, player R makes move i with probability pi, and that for every j = 1, 2, 3, . . . , n, player C makes move j with probability qj. Then

p1 + . . . + pm = 1 and q1 + . . . + qn = 1.

Assume that the players make moves independently of each other. Then for every i = 1, 2, 3, . . . , m and j = 1, 2, 3, . . . , n, the number pi qj represents the probability that player R makes move i and player C makes move j. Then the double sum

E_A(p, q) = Σ_{i=1}^{m} Σ_{j=1}^{n} aij pi qj

represents the expected payoff that player C has to make to player R.

The matrices

p = ( p1 . . . pm ) and q = ( q1, . . . , qn )^T

are known as the strategies of player R and player C respectively. Clearly the expected payoff

E_A(p, q) = Σ_{i=1}^{m} Σ_{j=1}^{n} aij pi qj = ( p1 . . . pm ) [ a11 . . . a1n ; . . . ; am1 . . . amn ] ( q1, . . . , qn )^T = pAq.

Here we have slightly abused notation. The right hand side is a 1×1 matrix!
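The expected payoff pAq can be computed directly from the double sum. A minimal Python sketch with a hypothetical payoff matrix and strategies (all values ours):

```python
def expected_payoff(p, A, q):
    """E_A(p, q) = p A q for a row strategy p and a column strategy q."""
    m, n = len(p), len(q)
    return sum(p[i] * A[i][j] * q[j] for i in range(m) for j in range(n))

# A hypothetical 2x3 payoff matrix and a pair of mixed strategies.
A = [[4, -1, 0],
     [2, 3, -2]]
p = [0.5, 0.5]          # player R mixes both moves equally
q = [0.2, 0.3, 0.5]

# E = 0.5*(4*0.2 - 1*0.3 + 0) + 0.5*(2*0.2 + 3*0.3 - 2*0.5) = 0.4
assert abs(expected_payoff(p, A, q) - 0.4) < 1e-12
```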

We now consider the following problem: Suppose that A is fixed. Is it possible for player R to choose a strategy p to try to maximize the expected payoff E_A(p, q)? Is it possible for player C to choose a strategy q to try to minimize the expected payoff E_A(p, q)?

FUNDAMENTAL THEOREM OF ZERO SUM GAMES. There exist strategies p* and q* such that

E_A(p, q*) ≤ E_A(p*, q*) ≤ E_A(p*, q)

for every strategy p of player R and every strategy q of player C.

Remark. The strategy p* is known as an optimal strategy for player R, and the strategy q* is known as an optimal strategy for player C. The quantity E_A(p*, q*) is known as the value of the game. Optimal strategies are not necessarily unique. However, if p** and q** are another pair of optimal strategies, then E_A(p**, q**) = E_A(p*, q*).
