
Numerical Mathematics

Alfio Quarteroni

Riccardo Sacco

Fausto Saleri

Texts in Applied Mathematics 37

Springer

New York Berlin Heidelberg Barcelona Hong Kong London Milan Paris Singapore Tokyo


Alfio Quarteroni

Department of Mathematics Ecole Polytechnique Fédérale de Lausanne CH-1015 Lausanne Switzerland

alfio.quarteroni@epfl.ch

Riccardo Sacco

Dipartimento di Matematica Politecnico di Milano Piazza Leonardo da Vinci 32 20133 Milan

Italy

ricsac@mate.polimi.it

Fausto Saleri

Dipartimento di Matematica “F. Enriques” Università degli Studi di Milano Via Saldini 50 20133 Milan Italy

fausto.saleri@unimi.it

Series Editors

J.E. Marsden

Control and Dynamical Systems, 107–81 California Institute of Technology Pasadena, CA 91125 USA

M. Golubitsky

Department of Mathematics University of Houston Houston, TX 77204-3476 USA

L. Sirovich

Division of Applied Mathematics Brown University

Providence, RI 02912 USA

W. Jäger

Department of Applied Mathematics Universität Heidelberg

Im Neuenheimer Feld 294 69120 Heidelberg Germany

Library of Congress Cataloging-in-Publication Data Quarteroni, Alfio.

Numerical mathematics/Alfio Quarteroni, Riccardo Sacco, Fausto Saleri. p. cm. (Texts in applied mathematics; 37)

Includes bibliographical references and index. ISBN 0-387-98959-5 (alk. paper)

1. Numerical analysis. I. Sacco, Riccardo. II. Saleri, Fausto. III. Title. IV. Series.

QA297.Q83 2000

519.4—dc21 99-059414

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

ISBN 0-387-98959-5 Springer-Verlag New York Berlin Heidelberg SPIN 10747955

Preface

Numerical mathematics is the branch of mathematics that proposes, develops, analyzes and applies methods from scientific computing to several fields including analysis, linear algebra, geometry, approximation theory, functional equations, optimization and differential equations. Other disciplines such as physics, the natural and biological sciences, engineering, and economics and the financial sciences frequently give rise to problems that need scientific computing for their solutions.

As such, numerical mathematics is the crossroad of several disciplines of great relevance in modern applied sciences, and can become a crucial tool for their qualitative and quantitative analysis. This role is also emphasized by the continual development of computers and algorithms, which make it possible nowadays, using scientific computing, to tackle problems of such a large size that real-life phenomena can be simulated providing accurate responses at affordable computational cost.

The corresponding spread of numerical software represents an enrichment for the scientific community. However, the user has to make the correct choice of the method (or the algorithm) which best suits the problem at hand. As a matter of fact, no black-box methods or algorithms exist that can effectively and accurately solve all kinds of problems.

and cons. This is done using the MATLAB software environment. This choice satisfies the two fundamental needs of user-friendliness and widespread diffusion, making it available on virtually every computer.

Every chapter is supplied with examples, exercises and applications of the discussed theory to the solution of real-life problems. The reader is thus in the ideal condition for acquiring the theoretical knowledge that is required to make the right choice among the numerical methodologies and make use of the related computer programs.

This book is primarily addressed to undergraduate students, with particular focus on the degree courses in Engineering, Mathematics, Physics and Computer Science. The attention which is paid to the applications and the related development of software makes it valuable also for graduate students, researchers and users of scientific computing in the most widespread professional fields.

The content of the volume is organized into four parts and 13 chapters. Part I comprises two chapters in which we review basic linear algebra and introduce the general concepts of consistency, stability and convergence of a numerical method as well as the basic elements of computer arithmetic.

Part II is on numerical linear algebra, and is devoted to the solution of linear systems (Chapters 3 and 4) and the computation of eigenvalues and eigenvectors (Chapter 5).

We continue with Part III, where we address several issues about functions and their approximation. Specifically, we are interested in the solution of nonlinear equations (Chapter 6), the solution of nonlinear systems and optimization problems (Chapter 7), polynomial approximation (Chapter 8) and numerical integration (Chapter 9).

Part IV, which is the most demanding in terms of mathematical background, is concerned with approximation, integration and transforms based on orthogonal polynomials (Chapter 10), the solution of initial value problems (Chapter 11), boundary value problems (Chapter 12) and initial-boundary value problems for parabolic and hyperbolic equations (Chapter 13).

Part I provides the indispensable background. Each of the remaining Parts has a size and a content that make it well suited for a semester course.

A guideline index to the use of the numerous MATLAB programs developed in the book is reported at the end of the volume. These programs are also available at the web site address:

http://www1.mate.polimi.it/~calnum/programs.html

For the reader's convenience, every code is accompanied by a brief description of its input/output parameters.

We express our thanks to the staff at Springer-Verlag New York for their expert guidance and assistance with editorial aspects, as well as to Dr. Martin Peters from Springer-Verlag Heidelberg and Dr. Francesca Bonadei from Springer-Italia for their advice and friendly collaboration all along this project.

We gratefully thank Professors L. Gastaldi and A. Valli for their useful comments on Chapters 12 and 13.

We also wish to express our gratitude to our families for their forbearance and understanding, and dedicate this book to them.


1 Foundations of Matrix Analysis

In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text. For most of the proofs as well as for the details, the reader is referred to [Bra75], [Nob69], [Hal58]. Further results on eigenvalues can be found in [Hou75] and [Wil65].

1.1 Vector Spaces

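Definition 1.1 A vector space over the numeric field $K$ ($K = \mathbb{R}$ or $K = \mathbb{C}$) is a nonempty set $V$, whose elements are called vectors and in which two operations are defined, called addition and scalar multiplication, that enjoy the following properties:

1. addition is commutative and associative;

2. there exists an element $\mathbf{0} \in V$ (the zero vector or null vector) such that $\mathbf{v} + \mathbf{0} = \mathbf{v}$ for each $\mathbf{v} \in V$;

3. $0 \cdot \mathbf{v} = \mathbf{0}$, $1 \cdot \mathbf{v} = \mathbf{v}$, where 0 and 1 are respectively the zero and the unity of $K$;

4. for each element $\mathbf{v} \in V$ there exists its opposite, $-\mathbf{v}$, in $V$ such that $\mathbf{v} + (-\mathbf{v}) = \mathbf{0}$;

5. the following distributive properties hold

$$\forall \alpha \in K,\ \forall \mathbf{v}, \mathbf{w} \in V, \quad \alpha(\mathbf{v} + \mathbf{w}) = \alpha\mathbf{v} + \alpha\mathbf{w},$$

$$\forall \alpha, \beta \in K,\ \forall \mathbf{v} \in V, \quad (\alpha + \beta)\mathbf{v} = \alpha\mathbf{v} + \beta\mathbf{v};$$

6. the following associative property holds

$$\forall \alpha, \beta \in K,\ \forall \mathbf{v} \in V, \quad (\alpha\beta)\mathbf{v} = \alpha(\beta\mathbf{v}).$$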

Example 1.1 Remarkable instances of vector spaces are:

- $V = \mathbb{R}^n$ (respectively $V = \mathbb{C}^n$): the set of the $n$-tuples of real (respectively complex) numbers, $n \geq 1$;

- $V = \mathbb{P}_n$: the set of polynomials $p_n(x) = \sum_{k=0}^{n} a_k x^k$ with real (or complex) coefficients $a_k$ having degree less than or equal to $n$, $n \geq 0$;

- $V = C^p([a, b])$: the set of real (or complex)-valued functions which are continuous on $[a, b]$ up to their $p$-th derivative, $0 \leq p < \infty$. •

Definition 1.2 We say that a nonempty subset $W$ of $V$ is a vector subspace of $V$ iff $W$ is a vector space over $K$.

Example 1.2 The vector space $\mathbb{P}_n$ is a vector subspace of $C^\infty(\mathbb{R})$, which is the space of infinitely continuously differentiable functions on the real line. A trivial subspace of any vector space is the one containing only the zero vector. •

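In particular, the set $W$ of the linear combinations of a system of $p$ vectors of $V$, $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$, is a vector subspace of $V$, called the generated subspace or span of the vector system, and is denoted by

$$W = \mathrm{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_p\} = \{\mathbf{v} = \alpha_1\mathbf{v}_1 + \ldots + \alpha_p\mathbf{v}_p \ \text{with}\ \alpha_i \in K,\ i = 1, \ldots, p\}.$$

The system $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ is called a system of generators for $W$. If $W_1, \ldots, W_m$ are vector subspaces of $V$, then the set

$$S = \{\mathbf{w} : \mathbf{w} = \mathbf{v}_1 + \ldots + \mathbf{v}_m \ \text{with}\ \mathbf{v}_i \in W_i,\ i = 1, \ldots, m\}$$

is also a vector subspace of $V$.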

Definition 1.3 A system of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_m\}$ of a vector space $V$ is called linearly independent if the relation

$$\alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \ldots + \alpha_m\mathbf{v}_m = \mathbf{0}$$

with $\alpha_1, \alpha_2, \ldots, \alpha_m \in K$ implies that $\alpha_1 = \alpha_2 = \ldots = \alpha_m = 0$. Otherwise, the system will be called linearly dependent.

We call a basis of $V$ any system of linearly independent generators of $V$. If $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ is a basis of $V$, the expression $\mathbf{v} = v_1\mathbf{u}_1 + \ldots + v_n\mathbf{u}_n$ is called the decomposition of $\mathbf{v}$ with respect to the basis and the scalars $v_1, \ldots, v_n \in K$ are the components of $\mathbf{v}$ with respect to the given basis. Moreover, the following property holds.

Property 1.1 Let $V$ be a vector space which admits a basis of $n$ vectors. Then every system of linearly independent vectors of $V$ has at most $n$ elements and any other basis of $V$ has $n$ elements. The number $n$ is called the dimension of $V$ and we write $\dim(V) = n$.

If, instead, for any $n$ there always exist $n$ linearly independent vectors of $V$, the vector space is called infinite dimensional.

Example 1.3 For any integer $p$ the space $C^p([a, b])$ is infinite dimensional. The spaces $\mathbb{R}^n$ and $\mathbb{C}^n$ have dimension equal to $n$. The usual basis for $\mathbb{R}^n$ is the set of unit vectors $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$ where $(\mathbf{e}_i)_j = \delta_{ij}$ for $i, j = 1, \ldots, n$, where $\delta_{ij}$ denotes the Kronecker symbol, equal to 0 if $i \neq j$ and 1 if $i = j$. This choice is of course not the only one that is possible (see Exercise 2). •

1.2 Matrices

Let $m$ and $n$ be two positive integers. We call a matrix having $m$ rows and $n$ columns, or a matrix $m \times n$, or a matrix $(m, n)$, with elements in $K$, a set of $mn$ scalars $a_{ij} \in K$, with $i = 1, \ldots, m$ and $j = 1, \ldots, n$, represented in the following rectangular array

$$A = \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{bmatrix}. \tag{1.1}$$

We shall abbreviate (1.1) as $A = (a_{ij})$ with $i = 1, \ldots, m$ and $j = 1, \ldots, n$. The index $i$ is called row index, while $j$ is the column index. The set $(a_{i1}, a_{i2}, \ldots, a_{in})$ is called the $i$-th row of A; likewise, $(a_{1j}, a_{2j}, \ldots, a_{mj})$ is the $j$-th column of A.

If $n = m$ the matrix is called square or having order $n$, and the set of the entries $(a_{11}, a_{22}, \ldots, a_{nn})$ is called its main diagonal.

A matrix having one row or one column is called a row vector or column vector respectively. Unless otherwise specified, we shall always assume that a vector is a column vector. In the case $n = m = 1$, the matrix will simply denote a scalar of $K$.

Sometimes it turns out to be useful to distinguish within a matrix the set made up by specified rows and columns. This prompts us to introduce the following definition.

Definition 1.4 Let A be a matrix $m \times n$. Let $1 \leq i_1 < i_2 < \ldots < i_k \leq m$ and $1 \leq j_1 < j_2 < \ldots < j_l \leq n$ be two sets of contiguous indexes. The matrix $S(k \times l)$ of entries $s_{pq} = a_{i_p j_q}$ with $p = 1, \ldots, k$, $q = 1, \ldots, l$ is called a submatrix of A. If $k = l$ and $i_r = j_r$ for $r = 1, \ldots, k$, S is called a principal submatrix of A.

Definition 1.5 A matrix $A(m \times n)$ is called block partitioned or said to be partitioned into submatrices if

$$A = \begin{bmatrix} A_{11} & A_{12} & \ldots & A_{1l} \\ A_{21} & A_{22} & \ldots & A_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ A_{k1} & A_{k2} & \ldots & A_{kl} \end{bmatrix},$$

where $A_{ij}$ are submatrices of A.

Among the possible partitions of A, we recall in particular the partition by columns

$$A = (\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n),$$

$\mathbf{a}_i$ being the $i$-th column vector of A. In a similar way the partition by rows of A can be defined. To fix the notation, if A is a matrix $m \times n$, we shall denote by

$$A(i_1 : i_2, j_1 : j_2) = (a_{ij}), \quad i_1 \leq i \leq i_2,\ j_1 \leq j \leq j_2$$

the submatrix of A of size $(i_2 - i_1 + 1) \times (j_2 - j_1 + 1)$ that lies between the rows $i_1$ and $i_2$ and the columns $j_1$ and $j_2$. Likewise, if $\mathbf{v}$ is a vector of size $n$, we shall denote by $\mathbf{v}(i_1 : i_2)$ the vector of size $i_2 - i_1 + 1$ made up by the $i_1$-th to the $i_2$-th components of $\mathbf{v}$.
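The colon notation above can be mimicked in a few lines of Python. This is only an illustrative sketch: the helper name `submatrix` and the list-of-rows storage are assumptions made here, and the only subtlety is translating the text's 1-based, inclusive indices into Python's 0-based, half-open slices.

```python
def submatrix(A, i1, i2, j1, j2):
    """Return A(i1:i2, j1:j2) using 1-based, inclusive row and column indices."""
    # Rows i1..i2 become the slice [i1-1:i2]; likewise for the columns.
    return [row[j1 - 1:j2] for row in A[i1 - 1:i2]]


A = [[11, 12, 13, 14],
     [21, 22, 23, 24],
     [31, 32, 33, 34]]

S = submatrix(A, 2, 3, 1, 2)  # rows 2..3, columns 1..2
print(S)
```

The result has size $(i_2 - i_1 + 1) \times (j_2 - j_1 + 1)$, here $2 \times 2$, in agreement with the formula in the text.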


1.3 Operations with Matrices

Let $A = (a_{ij})$ and $B = (b_{ij})$ be two matrices $m \times n$ over $K$. We say that A is equal to B, if $a_{ij} = b_{ij}$ for $i = 1, \ldots, m$, $j = 1, \ldots, n$. Moreover, we define the following operations:

- matrix sum: the matrix sum is the matrix $A + B = (a_{ij} + b_{ij})$. The neutral element in a matrix sum is the null matrix, still denoted by 0 and made up only by null entries;

- matrix multiplication by a scalar: the multiplication of A by $\lambda \in K$ is a matrix $\lambda A = (\lambda a_{ij})$;

- matrix product: the product of two matrices A and B of sizes $(m, p)$ and $(p, n)$ respectively, is a matrix $C(m, n)$ whose entries are $c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}$, for $i = 1, \ldots, m$, $j = 1, \ldots, n$.

The matrix product is associative and distributive with respect to the matrix sum, but it is not in general commutative. Square matrices for which the property AB = BA holds will be called commutative.
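As a small illustration of the product formula $c_{ij} = \sum_k a_{ik} b_{kj}$ and of the lack of commutativity, the following sketch multiplies two $2 \times 2$ matrices in both orders. The function name `matmul` and the list-of-rows representation are choices made for this example only.

```python
def matmul(A, B):
    """Product C = AB with c_ij = sum_k a_ik * b_kj; A is (m, p), B is (p, n)."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("inner dimensions do not match")
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)  # post-multiplying by B exchanges the columns of A
BA = matmul(B, A)  # pre-multiplying by B exchanges the rows of A
print(AB, BA)
```

Here AB and BA differ, so this particular pair of matrices is not commutative.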

In the case of square matrices, the neutral element in the matrix product is a square matrix of order $n$ called the unit matrix of order $n$ or, more frequently, the identity matrix given by $I_n = (\delta_{ij})$. The identity matrix is, by definition, the only matrix $n \times n$ such that $A I_n = I_n A = A$ for all square matrices A. In the following we shall omit the subscript $n$ unless it is strictly necessary. The identity matrix is a special instance of a diagonal matrix of order $n$, that is, a square matrix of the type $D = (d_{ii}\delta_{ij})$. We will use in the following the notation $D = \mathrm{diag}(d_{11}, d_{22}, \ldots, d_{nn})$.

Finally, if A is a square matrix of order $n$ and $p$ is an integer, we define $A^p$ as the product of A with itself iterated $p$ times. We let $A^0 = I$.

Let us now address the so-called elementary row operations that can be performed on a matrix. They consist of:

- multiplying the $i$-th row of a matrix by a scalar $\alpha$; this operation is equivalent to pre-multiplying A by the matrix $D = \mathrm{diag}(1, \ldots, 1, \alpha, 1, \ldots, 1)$, where $\alpha$ occupies the $i$-th position;

- exchanging the $i$-th and $j$-th rows of a matrix; this can be done by pre-multiplying A by the matrix $P^{(i,j)}$ of elements

$$p^{(i,j)}_{rs} = \begin{cases} 1 & \text{if } r = s = 1, \ldots, i-1, i+1, \ldots, j-1, j+1, \ldots, n, \\ 1 & \text{if } r = j,\ s = i \ \text{or}\ r = i,\ s = j, \\ 0 & \text{otherwise}, \end{cases} \tag{1.2}$$

where $I_r$ denotes the identity matrix of order $r = j - i - 1$ if $j > i$ (henceforth, matrices with size equal to zero will correspond to the empty set). Matrices like (1.2) are called elementary permutation matrices. The product of elementary permutation matrices is called a permutation matrix, and it performs the row exchanges associated with each elementary permutation matrix. In practice, a permutation matrix is a reordering by rows of the identity matrix;

- adding $\alpha$ times the $j$-th row of a matrix to its $i$-th row. This operation can also be performed by pre-multiplying A by the matrix $I + N^{(i,j)}_\alpha$, where $N^{(i,j)}_\alpha$ is a matrix having null entries except the one in position $i, j$ whose value is $\alpha$.
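The three elementary row operations can be checked numerically by building the corresponding pre-multiplying matrices. The sketch below is illustrative only: the helper names (`scaling`, `exchange`, `row_addition`) are hypothetical, and indices are 1-based to match the text.

```python
def identity(n):
    return [[1 if r == s else 0 for s in range(n)] for r in range(n)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def scaling(n, i, alpha):
    """D = diag(1, ..., 1, alpha, 1, ..., 1) with alpha in position i."""
    D = identity(n)
    D[i - 1][i - 1] = alpha
    return D


def exchange(n, i, j):
    """Elementary permutation matrix: the identity with rows i and j swapped."""
    P = identity(n)
    P[i - 1], P[j - 1] = P[j - 1], P[i - 1]
    return P


def row_addition(n, i, j, alpha):
    """I + N, where N is zero except for the entry alpha in position (i, j)."""
    E = identity(n)
    E[i - 1][j - 1] += alpha
    return E


A = [[1, 2], [3, 4], [5, 6]]
scaled = matmul(scaling(3, 2, 10), A)            # row 2 multiplied by 10
swapped = matmul(exchange(3, 1, 3), A)           # rows 1 and 3 exchanged
combined = matmul(row_addition(3, 3, 1, -5), A)  # row 3 minus 5 times row 1
print(scaled, swapped, combined)
```

In each case the operation acts on the rows of A because the elementary matrix multiplies A from the left.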

1.3.1 Inverse of a Matrix

Definition 1.6 A square matrix A of order $n$ is called invertible (or regular or nonsingular) if there exists a square matrix B of order $n$ such that $AB = BA = I$. B is called the inverse matrix of A and is denoted by $A^{-1}$. A matrix which is not invertible is called singular.

If A is invertible its inverse is also invertible, with $(A^{-1})^{-1} = A$. Moreover, if A and B are two invertible matrices of order $n$, their product AB is also invertible, with $(AB)^{-1} = B^{-1}A^{-1}$. The following property holds.

Property 1.2 A square matrix is invertible iff its column vectors are linearly independent.

Definition 1.7 We call the transpose of a matrix $A \in \mathbb{R}^{m \times n}$ the matrix $n \times m$, denoted by $A^T$, that is obtained by exchanging the rows of A with the columns of A.

Clearly, $(A^T)^T = A$, $(A + B)^T = A^T + B^T$, $(AB)^T = B^T A^T$ and $(\alpha A)^T = \alpha A^T$ $\forall \alpha \in \mathbb{R}$. If A is invertible, then also $(A^T)^{-1} = (A^{-1})^T = A^{-T}$.

Definition 1.8 Let $A \in \mathbb{C}^{m \times n}$; the matrix $B = A^H \in \mathbb{C}^{n \times m}$ is called the conjugate transpose (or adjoint) of A if $b_{ij} = \bar{a}_{ji}$, where $\bar{a}_{ji}$ is the complex conjugate of $a_{ji}$.

In analogy with the case of the real matrices, it turns out that $(A + B)^H = A^H + B^H$, $(AB)^H = B^H A^H$ and $(\alpha A)^H = \bar{\alpha} A^H$ $\forall \alpha \in \mathbb{C}$.

Definition 1.10 A matrix $A \in \mathbb{C}^{n \times n}$ is called hermitian or self-adjoint if $A^T = \bar{A}$, that is, if $A^H = A$, while it is called unitary if $A^H A = A A^H = I$. Finally, if $A A^H = A^H A$, A is called normal. As a consequence, a unitary matrix is one such that $A^{-1} = A^H$.

Of course, a unitary matrix is also normal, but it is not in general hermitian. For instance, the matrix of Example 1.4 is unitary, although not symmetric (if $s \neq 0$). We finally notice that the diagonal entries of a hermitian matrix must necessarily be real (see also Exercise 5).

1.3.2 Matrices and Linear Mappings

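Definition 1.11 A linear map from $\mathbb{C}^n$ into $\mathbb{C}^m$ is a function $f : \mathbb{C}^n \to \mathbb{C}^m$ such that $f(\alpha\mathbf{x} + \beta\mathbf{y}) = \alpha f(\mathbf{x}) + \beta f(\mathbf{y})$, $\forall \alpha, \beta \in K$ and $\forall \mathbf{x}, \mathbf{y} \in \mathbb{C}^n$.

The following result links matrices and linear maps.

Property 1.3 Let $f : \mathbb{C}^n \to \mathbb{C}^m$ be a linear map. Then, there exists a unique matrix $A_f \in \mathbb{C}^{m \times n}$ such that

$$f(\mathbf{x}) = A_f \mathbf{x} \quad \forall \mathbf{x} \in \mathbb{C}^n. \tag{1.3}$$

Conversely, if $A_f \in \mathbb{C}^{m \times n}$ then the function defined in (1.3) is a linear map from $\mathbb{C}^n$ into $\mathbb{C}^m$.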

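Example 1.4 An important example of a linear map is the counterclockwise rotation by an angle $\vartheta$ in the plane $(x_1, x_2)$. With vectors taken as columns, the matrix associated with such a map is given by

$$G(\vartheta) = \begin{bmatrix} c & -s \\ s & c \end{bmatrix}, \quad c = \cos(\vartheta),\ s = \sin(\vartheta)$$

and it is called a rotation matrix. •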
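A quick numerical check of the rotation map can be sketched as follows. The convention assumed here is that vectors are columns and the map is $\mathbf{x} \mapsto G\mathbf{x}$, so that a counterclockwise rotation by $\vartheta$ has rows $(c, -s)$ and $(s, c)$; the function names are illustrative only.

```python
import math


def rotation(theta):
    """Matrix of the counterclockwise rotation by theta (column-vector convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def apply(G, x):
    return [G[0][0] * x[0] + G[0][1] * x[1],
            G[1][0] * x[0] + G[1][1] * x[1]]


G = rotation(math.pi / 2)
y = apply(G, [1.0, 0.0])  # the first unit vector is sent (up to rounding) to the second

# G^T G should be the identity: the rotation matrix is orthogonal, hence unitary.
GtG = [[sum(G[k][i] * G[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]
print(y, GtG)
```

The check of $G^T G = I$ holds for either sign convention of the off-diagonal entries, and the matrix is not symmetric whenever $s \neq 0$.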

1.3.3 Operations with Block-Partitioned Matrices

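All the operations that have been previously introduced can be extended to the case of a block-partitioned matrix A, provided that the size of each single block is such that any single matrix operation is well-defined. Indeed, the following result can be shown (see, e.g., [Ste73]).

Property 1.4 Let A and B be the block matrices

$$A = \begin{bmatrix} A_{11} & \ldots & A_{1l} \\ \vdots & \ddots & \vdots \\ A_{k1} & \ldots & A_{kl} \end{bmatrix}, \quad B = \begin{bmatrix} B_{11} & \ldots & B_{1n} \\ \vdots & \ddots & \vdots \\ B_{m1} & \ldots & B_{mn} \end{bmatrix},$$

where $A_{ij}$ and $B_{ij}$ are matrices of size $(k_i \times l_j)$ and $(m_i \times n_j)$ respectively. Then we have: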

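1.

$$\lambda A = \begin{bmatrix} \lambda A_{11} & \ldots & \lambda A_{1l} \\ \vdots & \ddots & \vdots \\ \lambda A_{k1} & \ldots & \lambda A_{kl} \end{bmatrix}, \ \lambda \in \mathbb{C}; \quad A^T = \begin{bmatrix} A_{11}^T & \ldots & A_{k1}^T \\ \vdots & \ddots & \vdots \\ A_{1l}^T & \ldots & A_{kl}^T \end{bmatrix};$$

2. if $k = m$, $l = n$, $m_i = k_i$ and $n_j = l_j$, then

$$A + B = \begin{bmatrix} A_{11} + B_{11} & \ldots & A_{1l} + B_{1l} \\ \vdots & \ddots & \vdots \\ A_{k1} + B_{k1} & \ldots & A_{kl} + B_{kl} \end{bmatrix};$$

3. if $l = m$, $l_i = m_i$ and $k_i = n_i$, then, letting $C_{ij} = \sum_{s=1}^{m} A_{is} B_{sj}$,

$$AB = \begin{bmatrix} C_{11} & \ldots & C_{1l} \\ \vdots & \ddots & \vdots \\ C_{k1} & \ldots & C_{kl} \end{bmatrix}.$$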

1.4 Trace and Determinant of a Matrix

Let us consider a square matrix A of order $n$. The trace of a matrix is the sum of the diagonal entries of A, that is $\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}$.

We call the determinant of A the scalar defined through the following formula

$$\det(A) = \sum_{\pi \in P} \mathrm{sign}(\pi)\, a_{1\pi_1} a_{2\pi_2} \cdots a_{n\pi_n},$$

where $P = \{\pi = (\pi_1, \ldots, \pi_n)^T\}$ is the set of the $n!$ vectors that are obtained by permuting the index vector $\mathbf{i} = (1, \ldots, n)^T$ and $\mathrm{sign}(\pi)$ is equal to 1 (respectively, $-1$) if an even (respectively, odd) number of exchanges is needed to obtain $\pi$ from $\mathbf{i}$.

The following properties hold

$$\det(A) = \det(A^T), \quad \det(AB) = \det(A)\det(B), \quad \det(A^{-1}) = 1/\det(A),$$

$$\det(A^H) = \overline{\det(A)}, \quad \det(\alpha A) = \alpha^n \det(A), \quad \forall \alpha \in K.$$

Moreover, exchanging two rows (or two columns) of a matrix produces a change of sign in the determinant. Of course, the determinant of a diagonal matrix is the product of the diagonal entries.

Denoting by $A_{ij}$ the matrix of order $n - 1$ obtained from A by eliminating the $i$-th row and the $j$-th column, we call the complementary minor associated with the entry $a_{ij}$ the determinant of the matrix $A_{ij}$. We call the $k$-th principal (dominating) minor of A, $d_k$, the determinant of the principal submatrix of order $k$, $A_k = A(1 : k, 1 : k)$. If we denote by $\Delta_{ij} = (-1)^{i+j} \det(A_{ij})$ the cofactor of the entry $a_{ij}$, the actual computation of the determinant of A can be performed using the following recursive relation

$$\det(A) = \begin{cases} a_{11} & \text{if } n = 1, \\ \displaystyle\sum_{j=1}^{n} \Delta_{ij} a_{ij}, & \text{for } n > 1, \end{cases} \tag{1.4}$$

which is known as the Laplace rule. If A is a square invertible matrix of order $n$, then

$$A^{-1} = \frac{1}{\det(A)} C,$$

where C is the matrix having entries $\Delta_{ji}$, $i, j = 1, \ldots, n$.

As a consequence, a square matrix is invertible iff its determinant is nonvanishing. In the case of nonsingular diagonal matrices the inverse is still a diagonal matrix having entries given by the reciprocals of the diagonal entries of the matrix.

Every orthogonal matrix is invertible, its inverse is given by $A^T$; moreover, $\det(A) = \pm 1$.
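The Laplace rule (1.4) translates directly into a recursive function. The following is a minimal sketch, expanding along the first row ($i = 1$); the names `minor` and `det` are chosen for this example. Since the recursion visits on the order of $n!$ terms, it is only meant to illustrate the formula on very small matrices.

```python
def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # With 0-based j, the sign (-1)^(i+j) for the first row reduces to (-1)^j.
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))


A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
At = [list(col) for col in zip(*A)]  # transpose of A

print(det(A), det(At))
```

Both calls return the same value, in agreement with the property $\det(A) = \det(A^T)$ stated earlier.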

1.5 Rank and Kernel of a Matrix

Let A be a rectangular matrix $m \times n$. We call the determinant of order $q$ (with $q \geq 1$) extracted from matrix A, the determinant of any square matrix of order $q$ obtained from A by eliminating $m - q$ rows and $n - q$ columns.

Definition 1.12 The rank of A (denoted by $\mathrm{rank}(A)$) is the maximum order of the nonvanishing determinants extracted from A. A matrix has complete or full rank if $\mathrm{rank}(A) = \min(m, n)$.

Notice that the rank of A represents the maximum number of linearly independent column vectors of A, that is, the dimension of the range of A, defined as

$$\mathrm{range}(A) = \{\mathbf{y} \in \mathbb{R}^m : \mathbf{y} = A\mathbf{x} \ \text{for}\ \mathbf{x} \in \mathbb{R}^n\}. \tag{1.5}$$

Rigorously speaking, one should distinguish between the column rank of A and the row rank of A, the latter being the maximum number of linearly independent row vectors of A. Nevertheless, it can be shown that the row rank and column rank do actually coincide.

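The kernel of A is defined as the subspace

$$\ker(A) = \{\mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0}\}.$$

The following relations hold

1. $\mathrm{rank}(A) = \mathrm{rank}(A^T)$ (if $A \in \mathbb{C}^{m \times n}$, $\mathrm{rank}(A) = \mathrm{rank}(A^H)$);

2. $\mathrm{rank}(A) + \dim(\ker(A)) = n$.

In general, $\dim(\ker(A)) \neq \dim(\ker(A^T))$. If A is a nonsingular square matrix, then $\mathrm{rank}(A) = n$ and $\dim(\ker(A)) = 0$.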

Example 1.5 Let

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 1 \end{bmatrix}.$$

Then, $\mathrm{rank}(A) = 2$, $\dim(\ker(A)) = 1$ and $\dim(\ker(A^T)) = 0$. •
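The numbers in Example 1.5 can be reproduced with a short script. Note that the sketch below does not use the determinant-based definition of rank given above; it swaps in row reduction (Gaussian elimination in exact rational arithmetic), which yields the same rank. The function name `rank` is an assumption of this example.

```python
from fractions import Fraction


def rank(A):
    """Rank of A computed as the number of pivots found by row reduction."""
    M = [[Fraction(x) for x in row] for row in A]
    m, n = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for col in range(n):
        pivot = next((i for i in range(r, m) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, m):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == m:
            break
    return r


A = [[1, 1, 0],
     [1, -1, 1]]
At = [list(col) for col in zip(*A)]

rank_A = rank(A)
dim_ker_A = len(A[0]) - rank_A       # rank(A) + dim(ker(A)) = n
dim_ker_At = len(At[0]) - rank(At)   # same relation applied to the transpose
print(rank_A, dim_ker_A, dim_ker_At)
```

The kernel dimensions follow from the relation $\mathrm{rank}(A) + \dim(\ker(A)) = n$, applied once to A and once to $A^T$.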

We finally notice that for a matrix $A \in \mathbb{C}^{n \times n}$ the following properties are equivalent:

1. A is nonsingular;

2. $\det(A) \neq 0$;

3. $\ker(A) = \{\mathbf{0}\}$;

4. $\mathrm{rank}(A) = n$;

5. A has linearly independent rows and columns.

1.6 Special Matrices

1.6.1 Block Diagonal Matrices

These are matrices of the form $D = \mathrm{diag}(D_1, \ldots, D_n)$, where $D_i$ are square matrices with $i = 1, \ldots, n$. Clearly, each single diagonal block can be of different size. We shall say that a block diagonal matrix has size $n$ if $n$ is the number of its diagonal blocks.

1.6.2 Trapezoidal and Triangular Matrices

A matrix $A(m \times n)$ is called upper trapezoidal if $a_{ij} = 0$ for $i > j$, while it is lower trapezoidal if $a_{ij} = 0$ for $i < j$. The name is due to the fact that, in the case of upper trapezoidal matrices, with $m < n$, the nonzero entries of the matrix form a trapezoid.

A triangular matrix is a square trapezoidal matrix of order $n$ of the form

$$L = \begin{bmatrix} l_{11} & 0 & \ldots & 0 \\ l_{21} & l_{22} & \ldots & 0 \\ \vdots & \vdots & & \vdots \\ l_{n1} & l_{n2} & \ldots & l_{nn} \end{bmatrix} \quad \text{or} \quad U = \begin{bmatrix} u_{11} & u_{12} & \ldots & u_{1n} \\ 0 & u_{22} & \ldots & u_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \ldots & u_{nn} \end{bmatrix}.$$

The matrix L is called lower triangular while U is upper triangular. Let us recall some algebraic properties of triangular matrices that are easy to check.

- The determinant of a triangular matrix is the product of the diagonal entries;

- the inverse of a lower (respectively, upper) triangular matrix is still lower (respectively, upper) triangular;

- the product of two lower triangular (respectively, upper trapezoidal) matrices is still lower triangular (respectively, upper trapezoidal);

- if we call unit triangular matrix a triangular matrix that has diagonal entries equal to 1, then the product of lower (respectively, upper) unit triangular matrices is still lower (respectively, upper) unit triangular.

1.6.3 Banded Matrices

The matrices introduced in the previous section are a special instance of banded matrices. Indeed, we say that a matrix $A \in \mathbb{R}^{m \times n}$ (or in $\mathbb{C}^{m \times n}$) has lower band $p$ if $a_{ij} = 0$ when $i > j + p$ and upper band $q$ if $a_{ij} = 0$ when $j > i + q$. Diagonal matrices are banded matrices for which $p = q = 0$, while trapezoidal matrices have $p = m - 1$, $q = 0$ (lower trapezoidal), $p = 0$, $q = n - 1$ (upper trapezoidal).

Other banded matrices of relevant interest are the tridiagonal matrices for which $p = q = 1$ and the upper bidiagonal ($p = 0$, $q = 1$) or lower bidiagonal ($p = 1$, $q = 0$). In the following, $\mathrm{tridiag}_n(\mathbf{b}, \mathbf{d}, \mathbf{c})$ will denote the tridiagonal matrix of size $n$ having respectively on the lower and upper principal diagonals the vectors $\mathbf{b} = (b_1, \ldots, b_{n-1})^T$ and $\mathbf{c} = (c_1, \ldots, c_{n-1})^T$, and on the principal diagonal the vector $\mathbf{d} = (d_1, \ldots, d_n)^T$. If $b_i = \beta$, $d_i = \delta$ and $c_i = \gamma$, with $\beta$, $\delta$ and $\gamma$ given constants, the matrix will be denoted by $\mathrm{tridiag}_n(\beta, \delta, \gamma)$.

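We also mention the so-called lower Hessenberg matrices ($p = m - 1$, $q = 1$) and upper Hessenberg matrices ($p = 1$, $q = n - 1$) that have the following structure

$$H = \begin{bmatrix} h_{11} & h_{12} & & 0 \\ h_{21} & h_{22} & \ddots & \\ \vdots & & \ddots & h_{m-1\,n} \\ h_{m1} & \ldots & \ldots & h_{mn} \end{bmatrix} \quad \text{or} \quad H = \begin{bmatrix} h_{11} & h_{12} & \ldots & h_{1n} \\ h_{21} & h_{22} & & h_{2n} \\ & \ddots & \ddots & \vdots \\ 0 & & h_{m\,n-1} & h_{mn} \end{bmatrix}.$$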

Matrices of similar shape can obviously be set up in the block-like format.

1.7 Eigenvalues and Eigenvectors

Let A be a square matrix of order $n$ with real or complex entries; the number $\lambda \in \mathbb{C}$ is called an eigenvalue of A if there exists a nonnull vector $\mathbf{x} \in \mathbb{C}^n$ such that $A\mathbf{x} = \lambda\mathbf{x}$. The vector $\mathbf{x}$ is the eigenvector associated with the eigenvalue $\lambda$ and the set of the eigenvalues of A is called the spectrum of A, denoted by $\sigma(A)$. We say that $\mathbf{x}$ and $\mathbf{y}$ are respectively a right eigenvector and a left eigenvector of A, associated with the eigenvalue $\lambda$, if

$$A\mathbf{x} = \lambda\mathbf{x}, \quad \mathbf{y}^H A = \lambda\mathbf{y}^H.$$

The eigenvalue $\lambda$ corresponding to the eigenvector $\mathbf{x}$ can be determined by computing the Rayleigh quotient $\lambda = \mathbf{x}^H A \mathbf{x} / (\mathbf{x}^H \mathbf{x})$. The number $\lambda$ is the solution of the characteristic equation

$$p_A(\lambda) = \det(A - \lambda I) = 0,$$

where $p_A(\lambda)$ is the characteristic polynomial. Since the latter is a polynomial of degree $n$ with respect to $\lambda$, there certainly exist $n$ eigenvalues of A, not necessarily distinct. The following properties can be proved

$$\det(A) = \prod_{i=1}^{n} \lambda_i, \quad \mathrm{tr}(A) = \sum_{i=1}^{n} \lambda_i, \tag{1.6}$$

and since $\det(A^T - \lambda I) = \det((A - \lambda I)^T) = \det(A - \lambda I)$ one concludes that $\sigma(A) = \sigma(A^T)$ and, in an analogous way, that $\sigma(A^H) = \sigma(\bar{A})$.

From the first relation in (1.6) it can be concluded that a matrix is singular iff it has at least one null eigenvalue, since $p_A(0) = \det(A) = \prod_{i=1}^{n} \lambda_i$.

Secondly, if A has real entries, $p_A(\lambda)$ turns out to be a real-coefficient polynomial, so that complex eigenvalues necessarily occur in complex conjugate pairs.

Finally, due to the Cayley-Hamilton Theorem, if $p_A(\lambda)$ is the characteristic polynomial of A, then $p_A(A) = 0$, where $p_A(A)$ denotes a matrix polynomial (for the proof see, e.g., [Axe94], p. 51).

The maximum modulus of the eigenvalues of A is called the spectral radius of A and is denoted by

$$\rho(A) = \max_{\lambda \in \sigma(A)} |\lambda|. \tag{1.7}$$

Characterizing the eigenvalues of a matrix as the roots of a polynomial implies in particular that $\lambda$ is an eigenvalue of $A \in \mathbb{C}^{n \times n}$ iff $\bar{\lambda}$ is an eigenvalue of $A^H$. An immediate consequence is that $\rho(A) = \rho(A^H)$. Moreover, $\forall A \in \mathbb{C}^{n \times n}$, $\forall \alpha \in \mathbb{C}$, $\rho(\alpha A) = |\alpha|\rho(A)$, and $\rho(A^k) = [\rho(A)]^k$ $\forall k \in \mathbb{N}$.
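For a $2 \times 2$ matrix the characteristic equation reduces to $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0$, so the relations (1.6) and the spectral radius (1.7) can be checked by hand or with a few lines of code. The sketch below is illustrative only; the helper name `eig2` is hypothetical and the approach does not extend beyond order 2.

```python
import cmath


def eig2(A):
    """Eigenvalues of a 2x2 matrix as roots of lambda^2 - tr(A) lambda + det(A)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


A = [[0.0, -2.0],
     [1.0, 0.0]]

l1, l2 = eig2(A)
rho = max(abs(l1), abs(l2))  # spectral radius

print(l1, l2, rho)
print(l1 + l2, l1 * l2)      # compare with tr(A) = 0 and det(A) = 2
```

This real matrix has a pair of complex conjugate eigenvalues, whose sum and product match the trace and the determinant.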

Finally, assume that A is a block triangular matrix

$$A = \begin{bmatrix} A_{11} & A_{12} & \ldots & A_{1k} \\ 0 & A_{22} & \ldots & A_{2k} \\ \vdots & & \ddots & \vdots \\ 0 & \ldots & 0 & A_{kk} \end{bmatrix}.$$

As $p_A(\lambda) = p_{A_{11}}(\lambda)\, p_{A_{22}}(\lambda) \cdots p_{A_{kk}}(\lambda)$, the spectrum of A is given by the union of the spectra of each single diagonal block. As a consequence, if A is triangular, the eigenvalues of A are its diagonal entries.

For each eigenvalue $\lambda$ of a matrix A the set of the eigenvectors associated with $\lambda$, together with the null vector, identifies a subspace of $\mathbb{C}^n$ which is called the eigenspace associated with $\lambda$ and corresponds by definition to $\ker(A - \lambda I)$. The dimension of the eigenspace is

$$\dim[\ker(A - \lambda I)] = n - \mathrm{rank}(A - \lambda I),$$

and is called geometric multiplicity of the eigenvalue $\lambda$. It can never be greater than the algebraic multiplicity of $\lambda$, which is the multiplicity of $\lambda$ as a root of the characteristic polynomial. Eigenvalues having geometric multiplicity strictly less than the algebraic one are called defective. A matrix having at least one defective eigenvalue is called defective.

The eigenspace associated with an eigenvalue of a matrix A is invariant with respect to A in the sense of the following definition.

Definition 1.13 A subspace $S$ in $\mathbb{C}^n$ is called invariant with respect to a square matrix A if $AS \subset S$, where $AS$ is the image of $S$ through A.

1.8 Similarity Transformations

Definition 1.14 Let C be a square nonsingular matrix having the same order as the matrix A. We say that the matrices A and $C^{-1}AC$ are similar, and the transformation from A to $C^{-1}AC$ is called a similarity transformation. Moreover, we say that the two matrices are unitarily similar if C is unitary.

Two similar matrices share the same spectrum and the same characteristic polynomial. Indeed, it is easy to check that if $(\lambda, \mathbf{x})$ is an eigenvalue-eigenvector pair of A, $(\lambda, C^{-1}\mathbf{x})$ is the same for the matrix $C^{-1}AC$ since

$$(C^{-1}AC)\,C^{-1}\mathbf{x} = C^{-1}A\mathbf{x} = \lambda C^{-1}\mathbf{x}.$$

We notice in particular that the product matrices AB and BA, with $A \in \mathbb{C}^{n \times m}$ and $B \in \mathbb{C}^{m \times n}$, are not similar but satisfy the following property (see [Hac94], p. 18, Theorem 2.4.6)

$$\sigma(AB) \setminus \{0\} = \sigma(BA) \setminus \{0\},$$

that is, AB and BA share the same spectrum apart from null eigenvalues, so that $\rho(AB) = \rho(BA)$.

The use of similarity transformations aims at reducing the complexity of the problem of evaluating the eigenvalues of a matrix. Indeed, if a given matrix could be transformed into a similar matrix in diagonal or triangular form, the computation of the eigenvalues would be immediate. The main result in this direction is the following theorem (for the proof, see [Dem97], Theorem 4.2).

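Property 1.5 (Schur decomposition) Given $A \in \mathbb{C}^{n \times n}$, there exists U unitary such that

$$U^{-1}AU = U^H A U = \begin{bmatrix} \lambda_1 & b_{12} & \ldots & b_{1n} \\ 0 & \lambda_2 & & b_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \ldots & 0 & \lambda_n \end{bmatrix} = T,$$

where $\lambda_i$ are the eigenvalues of A.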

It thus turns out that every matrix A is unitarily similar to an upper triangular matrix. The matrices T and U are not necessarily unique [Hac94]. The Schur decomposition theorem gives rise to several important results; among them, we recall:

1. every hermitian matrix is unitarily similar to a diagonal real matrix, that is, when A is hermitian every Schur decomposition of A is diagonal. In such an event, since

$$U^{-1}AU = \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n),$$

it turns out that $AU = U\Lambda$, that is, $A\mathbf{u}_i = \lambda_i\mathbf{u}_i$ for $i = 1, \ldots, n$, so that the column vectors of U are the eigenvectors of A. Moreover, since the eigenvectors are mutually orthogonal, it turns out that a hermitian matrix has a system of orthonormal eigenvectors that generates the whole space $\mathbb{C}^n$. Finally, it can be shown that a matrix A of order $n$ is similar to a diagonal matrix D iff the eigenvectors of A form a basis for $\mathbb{C}^n$ [Axe94];

2. a matrix $A \in \mathbb{C}^{n \times n}$ is normal iff it is unitarily similar to a diagonal matrix. As a consequence, a normal matrix $A \in \mathbb{C}^{n \times n}$ admits the following spectral decomposition: $A = U\Lambda U^H = \sum_{i=1}^{n} \lambda_i \mathbf{u}_i \mathbf{u}_i^H$, U being unitary and $\Lambda$ diagonal [SS90];

3. let A and B be two normal and commutative matrices; then, the generic eigenvalue $\mu_i$ of A+B is given by the sum $\lambda_i + \xi_i$, where $\lambda_i$ and $\xi_i$ are the eigenvalues of A and B associated with the same eigenvector.

There are, of course, nonsymmetric matrices that are similar to diagonal matrices, but these are not unitarily similar (see, e.g., Exercise 7).

The Schur decomposition can be improved as follows (for the proof see, e.g., [Str80], [God66]).

Property 1.6 (Canonical Jordan Form) Let A be any square matrix. Then, there exists a nonsingular matrix X which transforms A into a block diagonal matrix J such that

$$X^{-1}AX = J = \mathrm{diag}(J_{k_1}(\lambda_1), J_{k_2}(\lambda_2), \ldots, J_{k_l}(\lambda_l)),$$

which is called canonical Jordan form, $\lambda_j$ being the eigenvalues of A and $J_k(\lambda) \in \mathbb{C}^{k \times k}$ a Jordan block of the form $J_1(\lambda) = \lambda$ if $k = 1$ and

$$J_k(\lambda) = \begin{bmatrix} \lambda & 1 & 0 & \ldots & 0 \\ 0 & \lambda & 1 & \cdots & \vdots \\ \vdots & \ddots & \ddots & 1 & 0 \\ \vdots & & \ddots & \lambda & 1 \\ 0 & \ldots & \ldots & 0 & \lambda \end{bmatrix}, \quad \text{for } k > 1.$$

Partitioning X by columns, $X = (x_1, \dots, x_n)$, it can be seen that the $k_i$ vectors associated with the Jordan block $J_{k_i}(\lambda_i)$ satisfy the following recursive relation

\[
\begin{aligned}
A x_l &= \lambda_i x_l, & l &= \sum_{j=1}^{i-1} k_j + 1, \\
A x_j &= \lambda_i x_j + x_{j-1}, & j &= l+1, \dots, l-1+k_i, \quad \text{if } k_i \neq 1.
\end{aligned}
\tag{1.8}
\]

The vectors $x_i$ are called principal vectors or generalized eigenvectors of A.

Example 1.6 Let us consider the following matrix

\[
A =
\begin{bmatrix}
7/4 & 3/4 & -1/4 & -1/4 & -1/4 & 1/4 \\
0 & 2 & 0 & 0 & 0 & 0 \\
-1/2 & -1/2 & 5/2 & 1/2 & -1/2 & 1/2 \\
-1/2 & -1/2 & -1/2 & 5/2 & 1/2 & 1/2 \\
-1/4 & -1/4 & -1/4 & -1/4 & 11/4 & 1/4 \\
-3/2 & -1/2 & -1/2 & 1/2 & 1/2 & 7/2
\end{bmatrix}.
\]

The Jordan canonical form of A and its associated matrix X are given by

\[
J =
\begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 3 & 1 & 0 & 0 \\
0 & 0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 0 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 2
\end{bmatrix},
\quad
X =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.
\]

Notice that two different Jordan blocks are related to the same eigenvalue ($\lambda = 2$). It is easy to check property (1.8). Consider, for example, the Jordan block associated with the eigenvalue $\lambda_2 = 3$; we have

\[
\begin{aligned}
A x_3 &= [0\ 0\ 3\ 0\ 0\ 3]^T = 3\,[0\ 0\ 1\ 0\ 0\ 1]^T = \lambda_2 x_3, \\
A x_4 &= [0\ 0\ 1\ 3\ 0\ 4]^T = 3\,[0\ 0\ 0\ 1\ 0\ 1]^T + [0\ 0\ 1\ 0\ 0\ 1]^T = \lambda_2 x_4 + x_3, \\
A x_5 &= [0\ 0\ 0\ 1\ 3\ 4]^T = 3\,[0\ 0\ 0\ 0\ 1\ 1]^T + [0\ 0\ 0\ 1\ 0\ 1]^T = \lambda_2 x_5 + x_4.
\end{aligned}
\]

•
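The relation $X^{-1}AX = J$ of Example 1.6 is equivalent to $AX = XJ$, which can be checked in exact rational arithmetic without inverting X. The sketch below uses only the Python standard library; the helper `matmul` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction as F

# Matrices of Example 1.6, entered row by row with exact rational arithmetic.
A = [[F(7, 4), F(3, 4), F(-1, 4), F(-1, 4), F(-1, 4), F(1, 4)],
     [0, 2, 0, 0, 0, 0],
     [F(-1, 2), F(-1, 2), F(5, 2), F(1, 2), F(-1, 2), F(1, 2)],
     [F(-1, 2), F(-1, 2), F(-1, 2), F(5, 2), F(1, 2), F(1, 2)],
     [F(-1, 4), F(-1, 4), F(-1, 4), F(-1, 4), F(11, 4), F(1, 4)],
     [F(-3, 2), F(-1, 2), F(-1, 2), F(1, 2), F(1, 2), F(7, 2)]]
J = [[2, 1, 0, 0, 0, 0],
     [0, 2, 0, 0, 0, 0],
     [0, 0, 3, 1, 0, 0],
     [0, 0, 0, 3, 1, 0],
     [0, 0, 0, 0, 3, 0],
     [0, 0, 0, 0, 0, 2]]
X = [[1, 0, 0, 0, 0, 1],
     [0, 1, 0, 0, 0, 1],
     [0, 0, 1, 0, 0, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 0, 1, 1],
     [1, 1, 1, 1, 1, 1]]

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

AX = matmul(A, X)
XJ = matmul(X, J)
jordan_ok = AX == XJ                 # A X = X J, entry by entry

# Column with index 3 of AX is the vector A x_4 of the example.
Ax4 = [row[3] for row in AX]
```

Because the comparison is exact, `jordan_ok` does not depend on any rounding tolerance.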

1.9 The Singular Value Decomposition (SVD)

Any matrix can be reduced to diagonal form by a suitable pre- and post-multiplication by unitary matrices. Precisely, the following result holds.

Property 1.7 Let $A \in \mathbb{C}^{m\times n}$. There exist two unitary matrices $U \in \mathbb{C}^{m\times m}$ and $V \in \mathbb{C}^{n\times n}$ such that

\[
U^{H}AV = \Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_p) \in \mathbb{C}^{m\times n} \quad \text{with } p = \min(m, n)
\tag{1.9}
\]

and $\sigma_1 \ge \dots \ge \sigma_p \ge 0$. Formula (1.9) is called the Singular Value Decomposition (SVD) of A and the numbers $\sigma_i$ (or $\sigma_i(A)$) are called the singular values of A.

If A is a real-valued matrix, U and V will also be real-valued and in (1.9) $U^{T}$ must be written instead of $U^{H}$. The following characterization of the singular values holds

\[
\sigma_i(A) = \sqrt{\lambda_i(A^{H}A)}, \quad i = 1, \dots, p.
\tag{1.10}
\]

Indeed, from (1.9) it follows that $A = U\Sigma V^{H}$, $A^{H} = V\Sigma^{H}U^{H}$ so that, U and V being unitary, $A^{H}A = V\Sigma^{H}\Sigma V^{H}$, that is, $\lambda_i(A^{H}A) = \lambda_i(\Sigma^{H}\Sigma) = (\sigma_i(A))^2$. Since $AA^{H}$ and $A^{H}A$ are hermitian matrices, the columns of U, called the left singular vectors of A, turn out to be the eigenvectors of $AA^{H}$ (see Section 1.8) and, therefore, they are not uniquely defined. The same holds for the columns of V, which are the right singular vectors of A.

Relation (1.10) implies that if $A \in \mathbb{C}^{n\times n}$ is hermitian with eigenvalues given by $\lambda_1, \lambda_2, \dots, \lambda_n$, then the singular values of A coincide with the moduli of the eigenvalues of A. Indeed, because $AA^{H} = A^2$, $\sigma_i = \sqrt{\lambda_i^2} = |\lambda_i|$ for $i = 1, \dots, n$. As far as the rank is concerned, if

\[
\sigma_1 \ge \dots \ge \sigma_r > \sigma_{r+1} = \dots = \sigma_p = 0,
\]

then the rank of A is r, the kernel of A is the span of the column vectors $\{v_{r+1}, \dots, v_n\}$ of V, and the range of A is the span of the column vectors $\{u_1, \dots, u_r\}$ of U.
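The characterization (1.10) and the statements on rank and kernel can be illustrated numerically. This is a minimal sketch, assuming NumPy is available; the rank-deficient matrix and the tolerance `tol` used to decide which singular values count as zero are assumptions made for the example.

```python
import numpy as np

# Hypothetical 4x3 real matrix of rank 2 (third column = first + second).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])

# full_matrices=True gives U (4x4) and V^H (3x3); s holds sigma_1 >= ... >= sigma_p.
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# (1.10): the singular values are the square roots of the eigenvalues of A^H A.
mu = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
sigma_from_eig = np.sqrt(np.clip(mu, 0.0, None))   # clip tiny negative rounding errors

# Numerical rank: number of singular values above a tolerance (an assumption).
tol = 1e-10
r = int(np.sum(s > tol))

# Columns v_{r+1}, ..., v_n of V span the kernel of A.
kernel = Vh[r:].conj().T
```

In floating point the zero singular values are only zero up to rounding, which is why a tolerance is needed to read off the rank.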

Definition 1.15 Suppose that $A \in \mathbb{C}^{m\times n}$ has rank equal to r and that it admits a SVD of the type $U^{H}AV = \Sigma$. The matrix $A^{\dagger} = V\Sigma^{\dagger}U^{H}$ is called the Moore-Penrose pseudo-inverse matrix, being

\[
\Sigma^{\dagger} = \mathrm{diag}\left(\frac{1}{\sigma_1}, \dots, \frac{1}{\sigma_r}, 0, \dots, 0\right).
\tag{1.11}
\]

The matrix $A^{\dagger}$ is also called the generalized inverse of A (see Exercise 13). Indeed, if rank(A) = n < m, then $A^{\dagger} = (A^{H}A)^{-1}A^{H}$ (that is, $(A^{T}A)^{-1}A^{T}$ when A is real), while if n = m = rank(A), $A^{\dagger} = A^{-1}$. For further properties of $A^{\dagger}$, see also Exercise 12.
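The construction of the pseudo-inverse from the SVD can be sketched as follows, assuming NumPy is available. The real 3x2 matrix with full column rank is a hypothetical example, chosen so that the result can be compared with $(A^{T}A)^{-1}A^{T}$; the rank tolerance is likewise an assumption.

```python
import numpy as np

# Hypothetical 3x2 real matrix with full column rank (rank(A) = n = 2 < m = 3).
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
m, n = A.shape

U, s, Vh = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-12))            # numerical rank (tolerance is an assumption)

# Sigma^dagger is n x m with 1/sigma_i in its first r diagonal entries, as in (1.11).
Sigma_pinv = np.zeros((n, m))
Sigma_pinv[:r, :r] = np.diag(1.0 / s[:r])

# A^dagger = V Sigma^dagger U^H
A_pinv = Vh.conj().T @ Sigma_pinv @ U.conj().T

# Full column rank: A^dagger coincides with (A^T A)^{-1} A^T.
normal_eq = np.linalg.solve(A.T @ A, A.T)
```

For comparison, NumPy's own `np.linalg.pinv(A)` should return the same matrix up to rounding.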

1.10 Scalar Product and Norms in Vector Spaces


Definition 1.16 A scalar product on a vector space V defined over K is any map $(\cdot,\cdot)$ acting from $V \times V$ into K which enjoys the following properties:

1. it is linear with respect to the vectors of V, that is

\[
(\gamma x + \lambda z, y) = \gamma (x, y) + \lambda (z, y), \quad \forall x, z \in V, \ \forall \gamma, \lambda \in K;
\]

2. it is hermitian, that is, $(y, x) = \overline{(x, y)}$, $\forall x, y \in V$;

3. it is positive definite, that is, $(x, x) > 0$, $\forall x \neq 0$ (in other words, $(x, x) \ge 0$, and $(x, x) = 0$ if and only if $x = 0$).

In the case $V = \mathbb{C}^n$ (or $\mathbb{R}^n$), an example is provided by the classical Euclidean scalar product given by

\[
(x, y) = y^{H}x = \sum_{i=1}^{n} x_i \bar{y}_i,
\]

where $\bar{z}$ denotes the complex conjugate of z.

Moreover, for any given square matrix A of order n and for any $x, y \in \mathbb{C}^n$ the following relation holds

\[
(Ax, y) = (x, A^{H}y).
\tag{1.12}
\]

In particular, since for any matrix $Q \in \mathbb{C}^{n\times n}$, $(Qx, Qy) = (x, Q^{H}Qy)$, one gets

Property 1.8 Unitary matrices preserve the Euclidean scalar product, that is, $(Qx, Qy) = (x, y)$ for any unitary matrix Q and for any pair of vectors x and y.

Definition 1.17 Let V be a vector space over K. We say that the map $\|\cdot\|$ from V into $\mathbb{R}$ is a norm on V if the following axioms are satisfied:

1. (i) $\|v\| \ge 0$ $\forall v \in V$ and (ii) $\|v\| = 0$ if and only if $v = 0$;

2. $\|\alpha v\| = |\alpha| \, \|v\|$ $\forall \alpha \in K$, $\forall v \in V$ (homogeneity property);

3. $\|v + w\| \le \|v\| + \|w\|$ $\forall v, w \in V$ (triangular inequality),

where $|\alpha|$ denotes the absolute value of $\alpha$ if $K = \mathbb{R}$, the modulus of $\alpha$ if $K = \mathbb{C}$.

The pair $(V, \|\cdot\|)$ is called a normed space. We shall distinguish among norms by a suitable subscript at the margin of the double bar symbol. In the case the map $|\cdot|$ from V into $\mathbb{R}$ enjoys only the properties 1(i), 2 and 3 we shall call such a map a seminorm. Finally, we shall call a unit vector any vector of V having unit norm.

An example of a normed space is $\mathbb{R}^n$, equipped for instance with the p-norm (or Hölder norm); this latter is defined for a vector x of components $\{x_i\}$ as

\[
\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}, \quad \text{for } 1 \le p < \infty.
\tag{1.13}
\]

Notice that the limit as p goes to infinity of $\|x\|_p$ exists, is finite, and equals the maximum modulus of the components of x. Such a limit defines in turn a norm, called the infinity norm (or maximum norm), given by

\[
\|x\|_\infty = \max_{1 \le i \le n} |x_i|.
\]

When p = 2, from (1.13) the standard definition of Euclidean norm is recovered

\[
\|x\|_2 = (x, x)^{1/2} = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2} = \left( x^{T}x \right)^{1/2},
\]

for which the following property holds.

Property 1.9 (Cauchy-Schwarz inequality) For any pair $x, y \in \mathbb{R}^n$,

\[
|(x, y)| = |x^{T}y| \le \|x\|_2 \, \|y\|_2,
\tag{1.14}
\]

where strict equality holds iff $y = \alpha x$ for some $\alpha \in \mathbb{R}$.

We recall that the scalar product in $\mathbb{R}^n$ can be related to the p-norms introduced over $\mathbb{R}^n$ in (1.13) by the Hölder inequality

\[
|(x, y)| \le \|x\|_p \, \|y\|_q, \quad \text{with } \frac{1}{p} + \frac{1}{q} = 1.
\]
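The p-norms, their limit as p grows, and the Cauchy-Schwarz and Hölder inequalities can be explored with a few lines of standard-library Python. The vectors below and the function names `p_norm`, `inf_norm` and `dot` are hypothetical choices made for this sketch.

```python
def p_norm(x, p):
    """Hoelder p-norm (1.13) of a list of real numbers, for 1 <= p < infinity."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def inf_norm(x):
    """Infinity (maximum) norm."""
    return max(abs(xi) for xi in x)

def dot(x, y):
    """Euclidean scalar product in R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

x = [3.0, -4.0, 1.0]
y = [1.0, 2.0, -2.0]

# The p-norm decreases towards the infinity norm as p grows.
approach = [p_norm(x, p) for p in (1, 2, 8, 64)]

# Cauchy-Schwarz (p = q = 2) and Hoelder with p = 3, q = 3/2 (1/p + 1/q = 1).
cauchy_schwarz_ok = abs(dot(x, y)) <= p_norm(x, 2) * p_norm(y, 2)
holder_ok = abs(dot(x, y)) <= p_norm(x, 3) * p_norm(y, 1.5)
```

Already for p = 64 the value of `p_norm(x, p)` is very close to `inf_norm(x)`, which illustrates the limit stated above.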

In the case where V is a ﬁnite-dimensional space the following property holds (for a sketch of the proof, see Exercise 14).

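Property 1.10 Any vector norm $\|\cdot\|$ defined on V is a continuous function of its argument, namely, $\forall \varepsilon > 0$, $\exists C > 0$ such that if $\|x - \hat{x}\| \le \varepsilon$ then $\left| \, \|x\| - \|\hat{x}\| \, \right| \le C\varepsilon$, for any $x, \hat{x} \in V$.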


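Property 1.11 Let $\|\cdot\|$ be a norm of $\mathbb{R}^n$ and $A \in \mathbb{R}^{n\times n}$ be a matrix with n linearly independent columns. Then, the function $\|\cdot\|_{A^2}$ acting from $\mathbb{R}^n$ into $\mathbb{R}$ defined as

\[
\|x\|_{A^2} = \|Ax\| \quad \forall x \in \mathbb{R}^n,
\]

is a norm of $\mathbb{R}^n$.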

Two vectors x, y in V are said to be orthogonal if $(x, y) = 0$. This statement has an immediate geometric interpretation when $V = \mathbb{R}^2$ since in such a case

\[
(x, y) = \|x\|_2 \, \|y\|_2 \cos(\vartheta),
\]

where $\vartheta$ is the angle between the vectors x and y. As a consequence, if $(x, y) = 0$ then $\vartheta$ is a right angle and the two vectors are orthogonal in the geometric sense.

Definition 1.18 Two norms $\|\cdot\|_p$ and $\|\cdot\|_q$ on V are equivalent if there exist two positive constants $c_{pq}$ and $C_{pq}$ such that

\[
c_{pq} \|x\|_q \le \|x\|_p \le C_{pq} \|x\|_q \quad \forall x \in V.
\]

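In a finite-dimensional normed space all norms are equivalent. In particular, if $V = \mathbb{R}^n$ it can be shown that for the p-norms, with p = 1, 2, and $\infty$, the constants $c_{pq}$ and $C_{pq}$ take the value reported in Table 1.1.

\[
\begin{array}{c|ccc}
c_{pq} & q = 1 & q = 2 & q = \infty \\ \hline
p = 1 & 1 & 1 & 1 \\
p = 2 & n^{-1/2} & 1 & 1 \\
p = \infty & n^{-1} & n^{-1/2} & 1
\end{array}
\qquad
\begin{array}{c|ccc}
C_{pq} & q = 1 & q = 2 & q = \infty \\ \hline
p = 1 & 1 & n^{1/2} & n \\
p = 2 & 1 & 1 & n^{1/2} \\
p = \infty & 1 & 1 & 1
\end{array}
\]

TABLE 1.1. Equivalence constants for the main norms of $\mathbb{R}^n$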
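The constants of Table 1.1 can be sanity-checked on randomly generated vectors. The sketch below uses only the standard library; the dimension n = 5, the sample size, the fixed seed and the small slack `eps` that absorbs rounding errors are all assumptions of the example, and a random test of this kind illustrates the inequalities without proving them.

```python
import random

def norm(x, p):
    """p-norm for p in {1, 2, 'inf'} of a list of real numbers."""
    if p == "inf":
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def lower(p, q, n):
    """Constant c_pq of Table 1.1."""
    table = {(1, 1): 1, (1, 2): 1, (1, "inf"): 1,
             (2, 1): n ** -0.5, (2, 2): 1, (2, "inf"): 1,
             ("inf", 1): 1 / n, ("inf", 2): n ** -0.5, ("inf", "inf"): 1}
    return table[(p, q)]

def upper(p, q, n):
    """Constant C_pq of Table 1.1."""
    table = {(1, 1): 1, (1, 2): n ** 0.5, (1, "inf"): n,
             (2, 1): 1, (2, 2): 1, (2, "inf"): n ** 0.5,
             ("inf", 1): 1, ("inf", 2): 1, ("inf", "inf"): 1}
    return table[(p, q)]

random.seed(0)                      # fixed seed so the run is reproducible
n, eps = 5, 1e-12                   # eps absorbs rounding in the comparisons
all_ok = True
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(n)]
    for p in (1, 2, "inf"):
        for q in (1, 2, "inf"):
            nq, np_ = norm(x, q), norm(x, p)
            all_ok &= lower(p, q, n) * nq <= np_ + eps
            all_ok &= np_ <= upper(p, q, n) * nq + eps
```

The bounds are attained for particular vectors: for instance, the vector with all components equal to 1 gives $\|x\|_1 = n\|x\|_\infty$.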

In this book we shall often deal with sequences of vectors and with their convergence. For this purpose, we recall that a sequence of vectors $\{x^{(k)}\}$ in a vector space V having finite dimension n converges to a vector x, and we write $\lim_{k\to\infty} x^{(k)} = x$, if

\[
\lim_{k\to\infty} x_i^{(k)} = x_i, \quad i = 1, \dots, n,
\tag{1.15}
\]

where $x_i^{(k)}$ and $x_i$ are the components of the corresponding vectors with respect to a basis of V. If $V = \mathbb{R}^n$, due to the uniqueness of the limit of a sequence of real numbers, (1.15) implies also the uniqueness of the limit, if existing, of a sequence of vectors.

We further notice that in a finite-dimensional space all the norms are topologically equivalent in the sense of convergence, namely, given a sequence of vectors $x^{(k)}$,

\[
|||x^{(k)}||| \to 0 \iff \|x^{(k)}\| \to 0 \quad \text{as } k \to \infty,
\]

where $|||\cdot|||$ and $\|\cdot\|$ are any two vector norms. As a consequence, we can establish the following link between norms and limits.

Property 1.12 Let $\|\cdot\|$ be a norm in a finite-dimensional space V. Then

\[
\lim_{k\to\infty} x^{(k)} = x \iff \lim_{k\to\infty} \|x - x^{(k)}\| = 0,
\]

where $x \in V$ and $\{x^{(k)}\}$ is a sequence of elements of V.

1.11 Matrix Norms

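Definition 1.19 A matrix norm is a mapping $\|\cdot\| : \mathbb{R}^{m\times n} \to \mathbb{R}$ such that:

1. $\|A\| \ge 0$ $\forall A \in \mathbb{R}^{m\times n}$ and $\|A\| = 0$ if and only if A = 0;

2. $\|\alpha A\| = |\alpha| \, \|A\|$ $\forall \alpha \in \mathbb{R}$, $\forall A \in \mathbb{R}^{m\times n}$ (homogeneity);

3. $\|A + B\| \le \|A\| + \|B\|$ $\forall A, B \in \mathbb{R}^{m\times n}$ (triangular inequality).

Unless otherwise specified we shall employ the same symbol $\|\cdot\|$ to denote matrix norms and vector norms.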

We can better characterize the matrix norms by introducing the concepts of compatible norm and norm induced by a vector norm.

Definition 1.20 We say that a matrix norm $\|\cdot\|$ is compatible or consistent with a vector norm $\|\cdot\|$ if

\[
\|Ax\| \le \|A\| \, \|x\|, \quad \forall x \in \mathbb{R}^n.
\tag{1.16}
\]


Definition 1.21 We say that a matrix norm $\|\cdot\|$ is sub-multiplicative if $\forall A \in \mathbb{R}^{n\times m}$, $\forall B \in \mathbb{R}^{m\times q}$

\[
\|AB\| \le \|A\| \, \|B\|.
\tag{1.17}
\]

This property is not satisfied by every matrix norm. For example (taken from [GL89]), the norm $\|A\|_\Delta = \max |a_{ij}|$ for $i = 1, \dots, n$, $j = 1, \dots, m$ does not satisfy (1.17) if applied to the matrices

\[
A = B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix},
\]

since $2 = \|AB\|_\Delta > \|A\|_\Delta \, \|B\|_\Delta = 1$.
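The counterexample can be reproduced directly. In this standard-library sketch the helper names `matmul` and `max_norm` are hypothetical, introduced only for illustration.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def max_norm(M):
    """The norm ||M||_Delta = max |m_ij| from the text."""
    return max(abs(entry) for row in M for entry in row)

A = [[1, 1], [1, 1]]
B = [[1, 1], [1, 1]]

lhs = max_norm(matmul(A, B))          # ||AB||_Delta
rhs = max_norm(A) * max_norm(B)       # ||A||_Delta ||B||_Delta
violates_submultiplicativity = lhs > rhs
```

Here `lhs` is 2 and `rhs` is 1, so (1.17) fails for this norm.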

Notice that, given a certain sub-multiplicative matrix norm $\|\cdot\|_\alpha$, there always exists a consistent vector norm. For instance, given any fixed vector $y \neq 0$ in $\mathbb{C}^n$, it suffices to define the consistent vector norm as

\[
\|x\| = \|x y^{H}\|_\alpha, \quad x \in \mathbb{C}^n.
\]

As a consequence, in the case of sub-multiplicative matrix norms it is no longer necessary to explicitly specify the vector norm with respect to which the matrix norm is consistent.

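Example 1.7 The norm

\[
\|A\|_F = \sqrt{\sum_{i,j=1}^{n} |a_{ij}|^2} = \sqrt{\mathrm{tr}(AA^{H})}
\tag{1.18}
\]

is a matrix norm called the Frobenius norm (or Euclidean norm in $\mathbb{C}^{n^2}$) and is compatible with the Euclidean vector norm $\|\cdot\|_2$. Indeed,

\[
\|Ax\|_2^2 = \sum_{i=1}^{n} \left| \sum_{j=1}^{n} a_{ij}x_j \right|^2
\le \sum_{i=1}^{n} \left( \sum_{j=1}^{n} |a_{ij}|^2 \sum_{j=1}^{n} |x_j|^2 \right)
= \|A\|_F^2 \, \|x\|_2^2.
\]

Notice that for such a norm $\|I_n\|_F = \sqrt{n}$. •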
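As a small numerical illustration of the Frobenius norm and of its compatibility with the Euclidean vector norm, consider the following standard-library sketch; the matrix, the vector and the function names are hypothetical choices made for the example.

```python
import math

def frobenius(M):
    """Frobenius norm (1.18) of a matrix stored as a list of rows."""
    return math.sqrt(sum(abs(entry) ** 2 for row in M for entry in row))

def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def norm2(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

A = [[1.0, -2.0, 0.0],
     [3.0, 1.0, 1.0],
     [0.0, 4.0, -1.0]]
x = [1.0, 2.0, -1.0]

# Compatibility: ||Ax||_2 <= ||A||_F ||x||_2.
compatible = norm2(matvec(A, x)) <= frobenius(A) * norm2(x)

# Frobenius norm of the identity of order n is sqrt(n).
n = 4
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
identity_norm = frobenius(identity)
```

For n = 4 the value of `identity_norm` is 2, in agreement with $\|I_n\|_F = \sqrt{n}$.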

In view of the deﬁnition of a natural norm, we recall the following theorem.

Theorem 1.1 Let $\|\cdot\|$ be a vector norm. The function

\[
\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|}
\tag{1.19}
\]

is a matrix norm, called the induced matrix norm or natural matrix norm.

Proof. We start by noticing that (1.19) is equivalent to

\[
\|A\| = \sup_{\|x\| = 1} \|Ax\|.
\tag{1.20}
\]

Indeed, one can define for any $x \neq 0$ the unit vector $u = x/\|x\|$, so that (1.19) becomes

\[
\|A\| = \sup_{\|u\| = 1} \|Au\| = \|Aw\| \quad \text{with } \|w\| = 1.
\]

This being taken as given, let us check that (1.19) (or, equivalently, (1.20)) is actually a norm, making direct use of Deﬁnition 1.19.

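1. Since $\|Ax\| \ge 0$, it follows that $\|A\| = \sup_{\|x\| = 1} \|Ax\| \ge 0$. Moreover

\[
\|A\| = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = 0 \iff \|Ax\| = 0 \quad \forall x \neq 0
\]

and $Ax = 0$ $\forall x \neq 0$ if and only if A = 0; therefore $\|A\| = 0 \iff A = 0$.

2. Given a scalar $\alpha$,

\[
\|\alpha A\| = \sup_{\|x\| = 1} \|\alpha Ax\| = |\alpha| \sup_{\|x\| = 1} \|Ax\| = |\alpha| \, \|A\|.
\]

3. Finally, triangular inequality holds. Indeed, by definition of supremum, if $x \neq 0$ then

\[
\frac{\|Ax\|}{\|x\|} \le \|A\| \ \Rightarrow \ \|Ax\| \le \|A\| \, \|x\|,
\]

so that, taking x with unit norm, one gets

\[
\|(A + B)x\| \le \|Ax\| + \|Bx\| \le \|A\| + \|B\|,
\]

from which it follows that $\|A + B\| = \sup_{\|x\| = 1} \|(A + B)x\| \le \|A\| + \|B\|$.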

✸

Relevant instances of induced matrix norms are the so-called p-norms defined as

\[
\|A\|_p = \sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}.
\]

The 1-norm and the infinity norm are easily computable since

\[
\|A\|_1 = \max_{j=1,\dots,n} \sum_{i=1}^{m} |a_{ij}|, \qquad
\|A\|_\infty = \max_{i=1,\dots,m} \sum_{j=1}^{n} |a_{ij}|
\]

and they are called the column sum norm and the row sum norm, respectively.

Moreover, we have $\|A\|_1 = \|A^{T}\|_\infty$ and, if A is self-adjoint or real symmetric, $\|A\|_1 = \|A\|_\infty$.
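The column sum and row sum formulas translate directly into code. The following standard-library sketch uses a hypothetical 2x3 matrix and a hypothetical symmetric matrix S; the function names are introduced here for illustration only.

```python
def norm_1(M):
    """Column sum norm: largest sum of absolute values over the columns."""
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))

def norm_inf(M):
    """Row sum norm: largest sum of absolute values over the rows."""
    return max(sum(abs(entry) for entry in row) for row in M)

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

A = [[1, -2, 3],
     [4, 0, -1]]          # hypothetical 2x3 example

a1 = norm_1(A)            # column sums of |a_ij|: 5, 2, 4
ainf = norm_inf(A)        # row sums of |a_ij|: 6, 5

# ||A||_1 = ||A^T||_inf, and both norms coincide for a symmetric matrix.
transpose_ok = a1 == norm_inf(transpose(A))
S = [[2, -1], [-1, 3]]
symmetric_ok = norm_1(S) == norm_inf(S)
```

Both norms cost a single pass over the entries, in contrast with the 2-norm discussed next.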


Theorem 1.2 Let $\sigma_1(A)$ be the largest singular value of A. Then

\[
\|A\|_2 = \sqrt{\rho(A^{H}A)} = \sqrt{\rho(AA^{H})} = \sigma_1(A).
\tag{1.21}
\]

In particular, if A is hermitian (or real and symmetric), then

\[
\|A\|_2 = \rho(A),
\tag{1.22}
\]

while, if A is unitary, $\|A\|_2 = 1$.

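Proof. Since $A^{H}A$ is hermitian, there exists a unitary matrix U such that

\[
U^{H}A^{H}AU = \mathrm{diag}(\mu_1, \dots, \mu_n),
\]

where $\mu_i$ are the (nonnegative) eigenvalues of $A^{H}A$. Let $y = U^{H}x$, then

\[
\begin{aligned}
\|A\|_2 &= \sup_{x \neq 0} \sqrt{\frac{(A^{H}Ax, x)}{(x, x)}} = \sup_{y \neq 0} \sqrt{\frac{(U^{H}A^{H}AUy, y)}{(y, y)}} \\
&= \sup_{y \neq 0} \sqrt{\sum_{i=1}^{n} \mu_i |y_i|^2 \Big/ \sum_{i=1}^{n} |y_i|^2} = \sqrt{\max_{i=1,\dots,n} |\mu_i|},
\end{aligned}
\]

from which (1.21) follows, thanks to (1.10).

If A is hermitian, the same considerations as above apply directly to A. Finally, if A is unitary

\[
\|Ax\|_2^2 = (Ax, Ax) = (x, A^{H}Ax) = \|x\|_2^2
\]

so that $\|A\|_2 = 1$. ✸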

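As a consequence, the computation of $\|A\|_2$ is much more expensive than that of $\|A\|_\infty$ or $\|A\|_1$. However, if only an estimate of $\|A\|_2$ is required, the following relations can be profitably employed in the case of square matrices

\[
\begin{aligned}
& \max_{i,j} |a_{ij}| \le \|A\|_2 \le n \max_{i,j} |a_{ij}|, \\
& \frac{1}{\sqrt{n}} \|A\|_\infty \le \|A\|_2 \le \sqrt{n} \, \|A\|_\infty, \\
& \frac{1}{\sqrt{n}} \|A\|_1 \le \|A\|_2 \le \sqrt{n} \, \|A\|_1, \\
& \|A\|_2 \le \sqrt{\|A\|_1 \, \|A\|_\infty}.
\end{aligned}
\]

For other estimates of similar type we refer to Exercise 17. Moreover, if A is normal then $\|A\|_2 \le \|A\|_p$ for any n and all $p \ge 2$.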
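Relation (1.21) and the estimates of $\|A\|_2$ in terms of cheaper quantities can be checked on a concrete matrix. This is a minimal sketch, assuming NumPy is available; the 3x3 matrix is a hypothetical test case, and verifying the bounds on one example only illustrates them.

```python
import numpy as np

A = np.array([[4.0, -1.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, -2.0]])   # hypothetical square test matrix
n = A.shape[0]

# (1.21): ||A||_2 = sqrt(rho(A^H A)) = sigma_1(A).
rho = np.max(np.abs(np.linalg.eigvalsh(A.conj().T @ A)))
two_norm = np.sqrt(rho)
sigma_1 = np.linalg.svd(A, compute_uv=False)[0]

norm_1 = np.abs(A).sum(axis=0).max()      # column sum norm
norm_inf = np.abs(A).sum(axis=1).max()    # row sum norm
max_entry = np.abs(A).max()

# The four estimates for square matrices stated above.
bounds_ok = (
    max_entry <= two_norm <= n * max_entry
    and norm_inf / np.sqrt(n) <= two_norm <= np.sqrt(n) * norm_inf
    and norm_1 / np.sqrt(n) <= two_norm <= np.sqrt(n) * norm_1
    and two_norm <= np.sqrt(norm_1 * norm_inf)
)
```

The quantities `norm_1`, `norm_inf` and `max_entry` need only the entries of A, whereas `two_norm` requires an eigenvalue or singular value computation.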