
Numerical Mathematics

Alfio Quarteroni

Riccardo Sacco

Fausto Saleri


Texts in Applied Mathematics 37

Springer

New York Berlin Heidelberg Barcelona Hong Kong London Milan Paris Singapore Tokyo


Alfio Quarteroni
Riccardo Sacco
Fausto Saleri

Numerical Mathematics

Springer


Alfio Quarteroni
Department of Mathematics, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
alfio.quarteroni@epfl.ch

Riccardo Sacco
Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milan, Italy
ricsac@mate.polimi.it

Fausto Saleri
Dipartimento di Matematica "F. Enriques", Università degli Studi di Milano, Via Saldini 50, 20133 Milan, Italy
fausto.saleri@unimi.it

Series Editors

J.E. Marsden
Control and Dynamical Systems, 107-81, California Institute of Technology, Pasadena, CA 91125, USA

M. Golubitsky
Department of Mathematics, University of Houston, Houston, TX 77204-3476, USA

L. Sirovich
Division of Applied Mathematics, Brown University, Providence, RI 02912, USA

W. Jäger
Department of Applied Mathematics, Universität Heidelberg, Im Neuenheimer Feld 294, 69120 Heidelberg, Germany

Library of Congress Cataloging-in-Publication Data
Quarteroni, Alfio.
Numerical mathematics / Alfio Quarteroni, Riccardo Sacco, Fausto Saleri.
p. cm. — (Texts in applied mathematics; 37)
Includes bibliographical references and index.
ISBN 0-387-98959-5 (alk. paper)
1. Numerical analysis. I. Sacco, Riccardo. II. Saleri, Fausto. III. Title. IV. Series.
QA297.Q83 2000
519.4—dc21 99-059414

© 2000 Springer-Verlag New York, Inc.

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

ISBN 0-387-98959-5  Springer-Verlag New York Berlin Heidelberg  SPIN 10747955


Preface

Numerical mathematics is the branch of mathematics that proposes, develops, analyzes and applies methods from scientific computing to several fields including analysis, linear algebra, geometry, approximation theory, functional equations, optimization and differential equations. Other disciplines such as physics, the natural and biological sciences, engineering, and economics and the financial sciences frequently give rise to problems that need scientific computing for their solutions.

As such, numerical mathematics is the crossroad of several disciplines of great relevance in modern applied sciences, and can become a crucial tool for their qualitative and quantitative analysis. This role is also emphasized by the continual development of computers and algorithms, which make it possible nowadays, using scientific computing, to tackle problems of such a large size that real-life phenomena can be simulated providing accurate responses at affordable computational cost.

The corresponding spread of numerical software represents an enrichment for the scientific community. However, the user has to make the correct choice of the method (or the algorithm) which best suits the problem at hand. As a matter of fact, no black-box methods or algorithms exist that can effectively and accurately solve all kinds of problems.


and cons. This is done using the MATLAB software environment. This choice satisfies the two fundamental needs of user-friendliness and widespread diffusion, making it available on virtually every computer.

Every chapter is supplied with examples, exercises and applications of the discussed theory to the solution of real-life problems. The reader is thus in the ideal condition for acquiring the theoretical knowledge that is required to make the right choice among the numerical methodologies and make use of the related computer programs.

This book is primarily addressed to undergraduate students, with particular focus on the degree courses in Engineering, Mathematics, Physics and Computer Science. The attention which is paid to the applications and the related development of software makes it valuable also for graduate students, researchers and users of scientific computing in the most widespread professional fields.

The content of the volume is organized into four parts and 13 chapters. Part I comprises two chapters in which we review basic linear algebra and introduce the general concepts of consistency, stability and convergence of a numerical method as well as the basic elements of computer arithmetic.

Part II is on numerical linear algebra, and is devoted to the solution of linear systems (Chapters 3 and 4) and to the computation of eigenvalues and eigenvectors (Chapter 5).

We continue with Part III, where we face several issues about functions and their approximation. Specifically, we are interested in the solution of nonlinear equations (Chapter 6), the solution of nonlinear systems and optimization problems (Chapter 7), polynomial approximation (Chapter 8) and numerical integration (Chapter 9).

Part IV, which is the most demanding in terms of mathematical background, is concerned with approximation, integration and transforms based on orthogonal polynomials (Chapter 10), the solution of initial value problems (Chapter 11), boundary value problems (Chapter 12) and initial-boundary value problems for parabolic and hyperbolic equations (Chapter 13).

Part I provides the indispensable background. Each of the remaining Parts has a size and a content that make it well suited for a semester course.

A guideline index to the use of the numerous MATLAB Programs developed in the book is reported at the end of the volume. These programs are also available at the web site address:

http://www1.mate.polimi.it/˜calnum/programs.html

For the reader’s ease, any code is accompanied by a brief description of its input/output parameters.

We express our thanks to the staff at Springer-Verlag New York for their expert guidance and assistance with editorial aspects, as well as to Dr.


Martin Peters from Springer-Verlag Heidelberg and Dr. Francesca Bonadei from Springer-Italia for their advice and friendly collaboration all along this project.

We gratefully thank Professors L. Gastaldi and A. Valli for their useful comments on Chapters 12 and 13.

We also wish to express our gratitude to our families for their forbearance and understanding, and dedicate this book to them.


Contents

Series Preface v

Preface vii

PART I: Getting Started

1. Foundations of Matrix Analysis 1

1.1 Vector Spaces . . . 1

1.2 Matrices . . . 3

1.3 Operations with Matrices . . . 5

1.3.1 Inverse of a Matrix . . . 6

1.3.2 Matrices and Linear Mappings . . . 7

1.3.3 Operations with Block-Partitioned Matrices . . . . 7

1.4 Trace and Determinant of a Matrix . . . 8

1.5 Rank and Kernel of a Matrix . . . 9

1.6 Special Matrices . . . 10

1.6.1 Block Diagonal Matrices . . . 10

1.6.2 Trapezoidal and Triangular Matrices . . . 11

1.6.3 Banded Matrices . . . 11

1.7 Eigenvalues and Eigenvectors . . . 12

1.8 Similarity Transformations . . . 14

1.9 The Singular Value Decomposition (SVD) . . . 16

1.10 Scalar Product and Norms in Vector Spaces . . . 17


1.11.1 Relation Between Norms and the

Spectral Radius of a Matrix . . . 25

1.11.2 Sequences and Series of Matrices . . . 26

1.12 Positive Definite, Diagonally Dominant and M-Matrices . . . 27

1.13 Exercises . . . 30

2. Principles of Numerical Mathematics 33

2.1 Well-Posedness and Condition Number of a Problem . . . 33

2.2 Stability of Numerical Methods . . . 37

2.2.1 Relations Between Stability and Convergence . . . 40

2.3 A priori and a posteriori Analysis . . . 41

2.4 Sources of Error in Computational Models . . . 43

2.5 Machine Representation of Numbers . . . 45

2.5.1 The Positional System . . . 45

2.5.2 The Floating-Point Number System . . . 46

2.5.3 Distribution of Floating-Point Numbers . . . 49

2.5.4 IEC/IEEE Arithmetic . . . 49

2.5.5 Rounding of a Real Number in Its Machine Representation . . . 50

2.5.6 Machine Floating-Point Operations . . . 52

2.6 Exercises . . . 54

PART II: Numerical Linear Algebra

3. Direct Methods for the Solution of Linear Systems 57

3.1 Stability Analysis of Linear Systems . . . 58

3.1.1 The Condition Number of a Matrix . . . 58

3.1.2 Forward a priori Analysis . . . 60

3.1.3 Backward a priori Analysis . . . 63

3.1.4 A posteriori Analysis . . . 64

3.2 Solution of Triangular Systems . . . 65

3.2.1 Implementation of Substitution Methods . . . 65

3.2.2 Rounding Error Analysis . . . 67

3.2.3 Inverse of a Triangular Matrix . . . 67

3.3 The Gaussian Elimination Method (GEM) and LU Factorization . . . 68

3.3.1 GEM as a Factorization Method . . . 72

3.3.2 The Effect of Rounding Errors . . . 76

3.3.3 Implementation of LU Factorization . . . 77

3.3.4 Compact Forms of Factorization . . . 78

3.4 Other Types of Factorization . . . 79

3.4.1 LDMT Factorization . . . . 79

3.4.2 Symmetric and Positive Definite Matrices: The Cholesky Factorization . . . 80


3.5 Pivoting . . . 85

3.6 Computing the Inverse of a Matrix . . . 89

3.7 Banded Systems . . . 90

3.7.1 Tridiagonal Matrices . . . 91

3.7.2 Implementation Issues . . . 92

3.8 Block Systems . . . 93

3.8.1 Block LU Factorization . . . 94

3.8.2 Inverse of a Block-Partitioned Matrix . . . 95

3.8.3 Block Tridiagonal Systems . . . 95

3.9 Sparse Matrices . . . 97

3.9.1 The Cuthill-McKee Algorithm . . . 98

3.9.2 Decomposition into Substructures . . . 100

3.9.3 Nested Dissection . . . 103

3.10 Accuracy of the Solution Achieved Using GEM . . . 103

3.11 An Approximate Computation of K(A) . . . 106

3.12 Improving the Accuracy of GEM . . . 109

3.12.1 Scaling . . . 110

3.12.2 Iterative Refinement . . . 111

3.13 Undetermined Systems . . . 112

3.14 Applications . . . 115

3.14.1 Nodal Analysis of a Structured Frame . . . 115

3.14.2 Regularization of a Triangular Grid . . . 118

3.15 Exercises . . . 121

4. Iterative Methods for Solving Linear Systems 123

4.1 On the Convergence of Iterative Methods . . . 123

4.2 Linear Iterative Methods . . . 126

4.2.1 Jacobi, Gauss-Seidel and Relaxation Methods . . . 127

4.2.2 Convergence Results for Jacobi and Gauss-Seidel Methods . . . 129

4.2.3 Convergence Results for the Relaxation Method . . . 131

4.2.4 A priori Forward Analysis . . . 132

4.2.5 Block Matrices . . . 133

4.2.6 Symmetric Form of the Gauss-Seidel and SOR Methods . . . 133

4.2.7 Implementation Issues . . . 135

4.3 Stationary and Nonstationary Iterative Methods . . . 136

4.3.1 Convergence Analysis of the Richardson Method . . . 137

4.3.2 Preconditioning Matrices . . . 139

4.3.3 The Gradient Method . . . 146

4.3.4 The Conjugate Gradient Method . . . 150

4.3.5 The Preconditioned Conjugate Gradient Method . . . 156

4.3.6 The Alternating-Direction Method . . . 158

4.4 Methods Based on Krylov Subspace Iterations . . . 159


4.4.2 The GMRES Method . . . 165

4.4.3 The Lanczos Method for Symmetric Systems . . . 167

4.5 The Lanczos Method for Unsymmetric Systems . . . 168

4.6 Stopping Criteria . . . 171

4.6.1 A Stopping Test Based on the Increment . . . 172

4.6.2 A Stopping Test Based on the Residual . . . 174

4.7 Applications . . . 174

4.7.1 Analysis of an Electric Network . . . 174

4.7.2 Finite Difference Analysis of Beam Bending . . . . 177

4.8 Exercises . . . 179

5. Approximation of Eigenvalues and Eigenvectors 183

5.1 Geometrical Location of the Eigenvalues . . . 183

5.2 Stability and Conditioning Analysis . . . 186

5.2.1 A priori Estimates . . . 186

5.2.2 A posteriori Estimates . . . 190

5.3 The Power Method . . . 192

5.3.1 Approximation of the Eigenvalue of Largest Module . . . 192

5.3.2 Inverse Iteration . . . 195

5.3.3 Implementation Issues . . . 196

5.4 The QR Iteration . . . 200

5.5 The Basic QR Iteration . . . 201

5.6 The QR Method for Matrices in Hessenberg Form . . . 203

5.6.1 Householder and Givens Transformation Matrices . . . 204

5.6.2 Reducing a Matrix in Hessenberg Form . . . 207

5.6.3 QR Factorization of a Matrix in Hessenberg Form . . . 209

5.6.4 The Basic QR Iteration Starting from Upper Hessenberg Form . . . 210

5.6.5 Implementation of Transformation Matrices . . . . 212

5.7 The QR Iteration with Shifting Techniques . . . 215

5.7.1 The QR Method with Single Shift . . . 215

5.7.2 The QR Method with Double Shift . . . 218

5.8 Computing the Eigenvectors and the SVD of a Matrix . . 221

5.8.1 The Hessenberg Inverse Iteration . . . 221

5.8.2 Computing the Eigenvectors from the Schur Form of a Matrix . . . 221

5.8.3 Approximate Computation of the SVD of a Matrix . . . 222

5.9 The Generalized Eigenvalue Problem . . . 224

5.9.1 Computing the Generalized Real Schur Form . . . 225

5.9.2 Generalized Real Schur Form of Symmetric-Definite Pencils . . . 226

5.10 Methods for Eigenvalues of Symmetric Matrices . . . 227

5.10.1 The Jacobi Method . . . 227


5.11 The Lanczos Method . . . 233

5.12 Applications . . . 235

5.12.1 Analysis of the Buckling of a Beam . . . 236

5.12.2 Free Dynamic Vibration of a Bridge . . . 238

5.13 Exercises . . . 240

PART III: Around Functions and Functionals

6. Rootfinding for Nonlinear Equations 245

6.1 Conditioning of a Nonlinear Equation . . . 246

6.2 A Geometric Approach to Rootfinding . . . 248

6.2.1 The Bisection Method . . . 248

6.2.2 The Methods of Chord, Secant and Regula Falsi and Newton’s Method . . . 251

6.2.3 The Dekker-Brent Method . . . 256

6.3 Fixed-Point Iterations for Nonlinear Equations . . . 257

6.3.1 Convergence Results for Some Fixed-Point Methods . . . 260

6.4 Zeros of Algebraic Equations . . . 261

6.4.1 The Horner Method and Deflation . . . 262

6.4.2 The Newton-Horner Method . . . 263

6.4.3 The Muller Method . . . 267

6.5 Stopping Criteria . . . 269

6.6 Post-Processing Techniques for Iterative Methods . . . 272

6.6.1 Aitken’s Acceleration . . . 272

6.6.2 Techniques for Multiple Roots . . . 275

6.7 Applications . . . 276

6.7.1 Analysis of the State Equation for a Real Gas . . 276

6.7.2 Analysis of a Nonlinear Electrical Circuit . . . 277

6.8 Exercises . . . 279

7. Nonlinear Systems and Numerical Optimization 281

7.1 Solution of Systems of Nonlinear Equations . . . 282

7.1.1 Newton’s Method and Its Variants . . . 283

7.1.2 Modified Newton’s Methods . . . 284

7.1.3 Quasi-Newton Methods . . . 288

7.1.4 Secant-Like Methods . . . 288

7.1.5 Fixed-Point Methods . . . 290

7.2 Unconstrained Optimization . . . 294

7.2.1 Direct Search Methods . . . 295

7.2.2 Descent Methods . . . 300

7.2.3 Line Search Techniques . . . 302

7.2.4 Descent Methods for Quadratic Functions . . . 304


7.2.7 Secant-Like Methods . . . 309

7.3 Constrained Optimization . . . 311

7.3.1 Kuhn-Tucker Necessary Conditions for Nonlinear Programming . . . 313

7.3.2 The Penalty Method . . . 315

7.3.3 The Method of Lagrange Multipliers . . . 317

7.4 Applications . . . 319

7.4.1 Solution of a Nonlinear System Arising from Semiconductor Device Simulation . . . 320

7.4.2 Nonlinear Regularization of a Discretization Grid . . . 323

7.5 Exercises . . . 325

8. Polynomial Interpolation 327

8.1 Polynomial Interpolation . . . 328

8.1.1 The Interpolation Error . . . 329

8.1.2 Drawbacks of Polynomial Interpolation on Equally Spaced Nodes and Runge’s Counterexample . . . . 330

8.1.3 Stability of Polynomial Interpolation . . . 332

8.2 Newton Form of the Interpolating Polynomial . . . 333

8.2.1 Some Properties of Newton Divided Differences . . 335

8.2.2 The Interpolation Error Using Divided Differences . . . 337

8.3 Piecewise Lagrange Interpolation . . . 338

8.4 Hermite-Birkoff Interpolation . . . 341

8.5 Extension to the Two-Dimensional Case . . . 343

8.5.1 Polynomial Interpolation . . . 343

8.5.2 Piecewise Polynomial Interpolation . . . 344

8.6 Approximation by Splines . . . 348

8.6.1 Interpolatory Cubic Splines . . . 349

8.6.2 B-Splines . . . 353

8.7 Splines in Parametric Form . . . 357

8.7.1 Bézier Curves and Parametric B-Splines . . . 359

8.8 Applications . . . 362

8.8.1 Finite Element Analysis of a Clamped Beam . . . 363

8.8.2 Geometric Reconstruction Based on Computer Tomographies . . . 366

8.9 Exercises . . . 368

9. Numerical Integration 371

9.1 Quadrature Formulae . . . 371

9.2 Interpolatory Quadratures . . . 373

9.2.1 The Midpoint or Rectangle Formula . . . 373

9.2.2 The Trapezoidal Formula . . . 375

9.2.3 The Cavalieri-Simpson Formula . . . 377

9.3 Newton-Cotes Formulae . . . 378


9.5 Hermite Quadrature Formulae . . . 386

9.6 Richardson Extrapolation . . . 387

9.6.1 Romberg Integration . . . 389

9.7 Automatic Integration . . . 391

9.7.1 Non Adaptive Integration Algorithms . . . 392

9.7.2 Adaptive Integration Algorithms . . . 394

9.8 Singular Integrals . . . 398

9.8.1 Integrals of Functions with Finite Jump Discontinuities . . . 398

9.8.2 Integrals of Infinite Functions . . . 398

9.8.3 Integrals over Unbounded Intervals . . . 401

9.9 Multidimensional Numerical Integration . . . 402

9.9.1 The Method of Reduction Formula . . . 403

9.9.2 Two-Dimensional Composite Quadratures . . . 404

9.9.3 Monte Carlo Methods for Numerical Integration . . . 407

9.10 Applications . . . 408

9.10.1 Computation of an Ellipsoid Surface . . . 408

9.10.2 Computation of the Wind Action on a Sailboat Mast . . . 410

9.11 Exercises . . . 412

PART IV: Transforms, Differentiation and Problem Discretization

10. Orthogonal Polynomials in Approximation Theory 415

10.1 Approximation of Functions by Generalized Fourier Series . . . 415

10.1.1 The Chebyshev Polynomials . . . 417

10.1.2 The Legendre Polynomials . . . 419

10.2 Gaussian Integration and Interpolation . . . 419

10.3 Chebyshev Integration and Interpolation . . . 424

10.4 Legendre Integration and Interpolation . . . 426

10.5 Gaussian Integration over Unbounded Intervals . . . 428

10.6 Programs for the Implementation of Gaussian Quadratures . . . 429

10.7 Approximation of a Function in the Least-Squares Sense . . . 431

10.7.1 Discrete Least-Squares Approximation . . . 431

10.8 The Polynomial of Best Approximation . . . 433

10.9 Fourier Trigonometric Polynomials . . . 435

10.9.1 The Gibbs Phenomenon . . . 439

10.9.2 The Fast Fourier Transform . . . 440

10.10 Approximation of Function Derivatives . . . 442

10.10.1 Classical Finite Difference Methods . . . 442

10.10.2 Compact Finite Differences . . . 444

10.10.3 Pseudo-Spectral Derivative . . . 448


10.11.1 The Fourier Transform . . . 450

10.11.2 (Physical) Linear Systems and Fourier Transform . . . 453

10.11.3 The Laplace Transform . . . 455

10.11.4 The Z-Transform . . . 457

10.12 The Wavelet Transform . . . 458

10.12.1 The Continuous Wavelet Transform . . . 458

10.12.2 Discrete and Orthonormal Wavelets . . . 461

10.13 Applications . . . 463

10.13.1 Numerical Computation of Blackbody Radiation . . . 463

10.13.2 Numerical Solution of Schrödinger Equation . . . 464

10.14 Exercises . . . 467

11. Numerical Solution of Ordinary Differential Equations 469

11.1 The Cauchy Problem . . . 469

11.2 One-Step Numerical Methods . . . 472

11.3 Analysis of One-Step Methods . . . 473

11.3.1 The Zero-Stability . . . 475

11.3.2 Convergence Analysis . . . 477

11.3.3 The Absolute Stability . . . 479

11.4 Difference Equations . . . 482

11.5 Multistep Methods . . . 487

11.5.1 Adams Methods . . . 490

11.5.2 BDF Methods . . . 492

11.6 Analysis of Multistep Methods . . . 492

11.6.1 Consistency . . . 493

11.6.2 The Root Conditions . . . 494

11.6.3 Stability and Convergence Analysis for Multistep Methods . . . 495

11.6.4 Absolute Stability of Multistep Methods . . . 499

11.7 Predictor-Corrector Methods . . . 502

11.8 Runge-Kutta Methods . . . 508

11.8.1 Derivation of an Explicit RK Method . . . 511

11.8.2 Stepsize Adaptivity for RK Methods . . . 512

11.8.3 Implicit RK Methods . . . 514

11.8.4 Regions of Absolute Stability for RK Methods . . 516

11.9 Systems of ODEs . . . 517

11.10 Stiff Problems . . . 519

11.11 Applications . . . 521

11.11.1 Analysis of the Motion of a Frictionless Pendulum . . . 522

11.11.2 Compliance of Arterial Walls . . . 523

11.12 Exercises . . . 527

12. Two-Point Boundary Value Problems 531

12.1 A Model Problem . . . 531


12.2.1 Stability Analysis by the Energy Method . . . 534

12.2.2 Convergence Analysis . . . 538

12.2.3 Finite Differences for Two-Point Boundary Value Problems with Variable Coefficients . . . 540

12.3 The Spectral Collocation Method . . . 542

12.4 The Galerkin Method . . . 544

12.4.1 Integral Formulation of Boundary-Value Problems . . . 544

12.4.2 A Quick Introduction to Distributions . . . 546

12.4.3 Formulation and Properties of the Galerkin Method . . . 547

12.4.4 Analysis of the Galerkin Method . . . 548

12.4.5 The Finite Element Method . . . 550

12.4.6 Implementation Issues . . . 556

12.4.7 Spectral Methods . . . 559

12.5 Advection-Diffusion Equations . . . 560

12.5.1 Galerkin Finite Element Approximation . . . 561

12.5.2 The Relationship Between Finite Elements and Finite Differences; the Numerical Viscosity . . . . 563

12.5.3 Stabilized Finite Element Methods . . . 567

12.6 A Quick Glance to the Two-Dimensional Case . . . 572

12.7 Applications . . . 575

12.7.1 Lubrication of a Slider . . . 575

12.7.2 Vertical Distribution of Spore Concentration over Wide Regions . . . 576

12.8 Exercises . . . 578

13. Parabolic and Hyperbolic Initial Boundary Value Problems 581

13.1 The Heat Equation . . . 581

13.2 Finite Difference Approximation of the Heat Equation . . 584

13.3 Finite Element Approximation of the Heat Equation . . . 586

13.3.1 Stability Analysis of the θ-Method . . . 588

13.4 Space-Time Finite Element Methods for the Heat Equation . . . 593

13.5 Hyperbolic Equations: A Scalar Transport Problem . . . . 597

13.6 Systems of Linear Hyperbolic Equations . . . 599

13.6.1 The Wave Equation . . . 601

13.7 The Finite Difference Method for Hyperbolic Equations . . 602

13.7.1 Discretization of the Scalar Equation . . . 602

13.8 Analysis of Finite Difference Methods . . . 605

13.8.1 Consistency . . . 605

13.8.2 Stability . . . 605

13.8.3 The CFL Condition . . . 606

13.8.4 Von Neumann Stability Analysis . . . 608


13.9.1 Equivalent Equations . . . 614

13.10 Finite Element Approximation of Hyperbolic Equations . . 618

13.10.1 Space Discretization with Continuous and Discontinuous Finite Elements . . . 618

13.10.2 Time Discretization . . . 620

13.11 Applications . . . 623

13.11.1 Heat Conduction in a Bar . . . 623

13.11.2 A Hyperbolic Model for Blood Flow Interaction with Arterial Walls . . . 623

13.12 Exercises . . . 625

References 627

Index of MATLAB Programs 643


1

Foundations of Matrix Analysis

In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text. For most of the proofs as well as for the details, the reader is referred to [Bra75], [Nob69], [Hal58]. Further results on eigenvalues can be found in [Hou75] and [Wil65].

1.1 Vector Spaces

Definition 1.1 A vector space over the numeric field K (K = R or K = C) is a nonempty set V, whose elements are called vectors and in which two operations are defined, called addition and scalar multiplication, that enjoy the following properties:

1. addition is commutative and associative;

2. there exists an element 0 ∈ V (the zero vector or null vector) such that v + 0 = v for each v ∈ V;

3. 0·v = 0, 1·v = v, where 0 and 1 are respectively the zero and the unity of K;

4. for each element v ∈ V there exists its opposite, −v, in V such that v + (−v) = 0;

5. the following distributive properties hold

∀α ∈ K, ∀v, w ∈ V, α(v + w) = αv + αw,

∀α, β ∈ K, ∀v ∈ V, (α + β)v = αv + βv;

6. the following associative property holds

∀α, β ∈ K, ∀v ∈ V, (αβ)v = α(βv).

Example 1.1 Remarkable instances of vector spaces are:

- V = R^n (respectively V = C^n): the set of the n-tuples of real (respectively complex) numbers, n ≥ 1;

- V = P_n: the set of polynomials p_n(x) = Σ_{k=0}^{n} a_k x^k with real (or complex) coefficients a_k having degree less than or equal to n, n ≥ 0;

- V = C^p([a, b]): the set of real (or complex)-valued functions which are continuous on [a, b] up to their p-th derivative, 0 ≤ p < ∞. •

Definition 1.2 We say that a nonempty part W of V is a vector subspace of V iff W is a vector space over K.

Example 1.2 The vector space P_n is a vector subspace of C^∞(R), which is the space of infinitely continuously differentiable functions on the real line. A trivial subspace of any vector space is the one containing only the zero vector. •

In particular, the set W of the linear combinations of a system of p vectors of V, {v_1, . . . , v_p}, is a vector subspace of V, called the generated subspace or span of the vector system, and is denoted by

W = span{v_1, . . . , v_p} = {v = α_1 v_1 + . . . + α_p v_p with α_i ∈ K, i = 1, . . . , p}.

The system {v_1, . . . , v_p} is called a system of generators for W. If W_1, . . . , W_m are vector subspaces of V, then the set

S = {w : w = v_1 + . . . + v_m with v_i ∈ W_i, i = 1, . . . , m}

is also a vector subspace of V.

Definition 1.3 A system of vectors {v_1, . . . , v_m} of a vector space V is called linearly independent if the relation

α_1 v_1 + α_2 v_2 + . . . + α_m v_m = 0

with α_1, α_2, . . . , α_m ∈ K implies that α_1 = α_2 = . . . = α_m = 0. Otherwise, the system will be called linearly dependent.

We call a basis of V any system of linearly independent generators of V. If {u_1, . . . , u_n} is a basis of V, the expression v = v_1 u_1 + . . . + v_n u_n is called the decomposition of v with respect to the basis and the scalars v_1, . . . , v_n ∈ K are the components of v with respect to the given basis. Moreover, the following property holds.

Property 1.1 Let V be a vector space which admits a basis of n vectors. Then every system of linearly independent vectors of V has at most n elements and any other basis of V has n elements. The number n is called the dimension of V and we write dim(V) = n.

If, instead, for any n there always exist n linearly independent vectors of V, the vector space is called infinite dimensional.

Example 1.3 For any integer p the space C^p([a, b]) is infinite dimensional. The spaces R^n and C^n have dimension equal to n. The usual basis for R^n is the set of unit vectors {e_1, . . . , e_n} where (e_i)_j = δ_ij for i, j = 1, . . . , n, δ_ij denoting the Kronecker symbol equal to 0 if i ≠ j and 1 if i = j. This choice is of course not the only one that is possible (see Exercise 2). •

1.2 Matrices

Let m and n be two positive integers. We call a matrix having m rows and n columns, or a matrix m × n, or a matrix (m, n), with elements in K, a set of mn scalars a_ij ∈ K, with i = 1, . . . , m and j = 1, . . . , n, represented in the following rectangular array

A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}.    (1.1)


We shall abbreviate (1.1) as A = (a_ij) with i = 1, . . . , m and j = 1, . . . , n. The index i is called the row index, while j is the column index. The set (a_i1, a_i2, . . . , a_in) is called the i-th row of A; likewise, (a_1j, a_2j, . . . , a_mj) is the j-th column of A.

If n = m the matrix is called square, or of order n, and the set of the entries (a_11, a_22, . . . , a_nn) is called its main diagonal.

A matrix having one row or one column is called a row vector or column vector respectively. Unless otherwise specified, we shall always assume that a vector is a column vector. In the case n = m = 1, the matrix will simply denote a scalar of K.

Sometimes it turns out to be useful to distinguish within a matrix the set made up by specified rows and columns. This prompts us to introduce the following definition.

Definition 1.4 Let A be a matrix m × n. Let 1 ≤ i_1 < i_2 < . . . < i_k ≤ m and 1 ≤ j_1 < j_2 < . . . < j_l ≤ n be two sets of contiguous indexes. The matrix S (k × l) of entries s_pq = a_{i_p j_q} with p = 1, . . . , k, q = 1, . . . , l is called a submatrix of A. If k = l and i_r = j_r for r = 1, . . . , k, S is called a principal submatrix of A.

Definition 1.5 A matrix A (m × n) is called block partitioned or said to be partitioned into submatrices if

A = \begin{pmatrix} A_{11} & A_{12} & \dots & A_{1l} \\ A_{21} & A_{22} & \dots & A_{2l} \\ \vdots & \vdots & \ddots & \vdots \\ A_{k1} & A_{k2} & \dots & A_{kl} \end{pmatrix},

where A_ij are submatrices of A.

Among the possible partitions of A, we recall in particular the partition by columns

A = (a_1, a_2, . . . , a_n),

a_i being the i-th column vector of A. In a similar way the partition by rows of A can be defined. To fix the notations, if A is a matrix m × n, we shall denote by

A(i_1 : i_2, j_1 : j_2) = (a_ij),  i_1 ≤ i ≤ i_2,  j_1 ≤ j ≤ j_2,

the submatrix of A of size (i_2 − i_1 + 1) × (j_2 − j_1 + 1) that lies between the rows i_1 and i_2 and the columns j_1 and j_2. Likewise, if v is a vector of size n, we shall denote by v(i_1 : i_2) the vector of size i_2 − i_1 + 1 made up by the i_1-th to the i_2-th components of v.
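The colon notation just introduced is the same one used by MATLAB, the environment adopted for all programs in this book; a minimal sketch (the entries below are arbitrary and only serve as an illustration):

% extract the submatrix A(i1:i2, j1:j2) and the subvector v(i1:i2)
A = [1 2 3 4; 5 6 7 8; 9 10 11 12];   % a 3x4 matrix
S = A(2:3, 1:2);                      % 2x2 submatrix between rows 2,3 and columns 1,2
v = [10 20 30 40]';                   % column vector of size 4
w = v(2:3);                           % components v_2 to v_3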


1.3 Operations with Matrices

Let A = (a_ij) and B = (b_ij) be two matrices m × n over K. We say that A is equal to B if a_ij = b_ij for i = 1, . . . , m, j = 1, . . . , n. Moreover, we define the following operations:

- matrix sum: the matrix sum is the matrix A + B = (a_ij + b_ij). The neutral element in a matrix sum is the null matrix, still denoted by 0 and made up only by null entries;

- matrix multiplication by a scalar: the multiplication of A by λ ∈ K is a matrix λA = (λ a_ij);

- matrix product: the product of two matrices A and B of sizes (m, p) and (p, n) respectively, is a matrix C(m, n) whose entries are c_ij = Σ_{k=1}^{p} a_ik b_kj, for i = 1, . . . , m, j = 1, . . . , n.

The matrix product is associative and distributive with respect to the matrix sum, but it is not in general commutative. The square matrices for which the property AB = BA holds will be called commutative.

In the case of square matrices, the neutral element in the matrix product is a square matrix of order n called the unit matrix of order n or, more frequently, the identity matrix given by I_n = (δ_ij). The identity matrix is, by definition, the only matrix n × n such that AI_n = I_nA = A for all square matrices A. In the following we shall omit the subscript n unless it is strictly necessary. The identity matrix is a special instance of a diagonal matrix of order n, that is, a square matrix of the type D = (d_ii δ_ij). We will use in the following the notation D = diag(d_11, d_22, . . . , d_nn).

Finally, if A is a square matrix of order n and p is an integer, we define A^p as the product of A with itself iterated p times. We let A^0 = I.

Let us now address the so-called elementary row operations that can be performed on a matrix. They consist of:

- multiplying the i-th row of a matrix by a scalar α; this operation is equivalent to pre-multiplying A by the matrix D = diag(1, . . . , 1, α, 1, . . . , 1), where α occupies the i-th position;

- exchanging the i-th and j-th rows of a matrix; this can be done by pre-multiplying A by the matrix P^{(i,j)} of elements

p^{(i,j)}_{rs} = \begin{cases} 1 & \text{if } r = s = 1, \dots, i-1, i+1, \dots, j-1, j+1, \dots, n, \\ 1 & \text{if } r = j, s = i \text{ or } r = i, s = j, \\ 0 & \text{otherwise,} \end{cases}    (1.2)

where I_r denotes the identity matrix of order r = j − i − 1 if j > i (henceforth, matrices with size equal to zero will correspond to the empty set). Matrices like (1.2) are called elementary permutation matrices. The product of elementary permutation matrices is called a permutation matrix, and it performs the row exchanges associated with each elementary permutation matrix. In practice, a permutation matrix is a reordering by rows of the identity matrix;

- adding α times the j-th row of a matrix to its i-th row. This operation can also be performed by pre-multiplying A by the matrix I + N^{(i,j)}_α, where N^{(i,j)}_α is a matrix having null entries except the one in position i, j whose value is α.

1.3.1 Inverse of a Matrix

Definition 1.6 A square matrix A of order n is called invertible (or regular or nonsingular) if there exists a square matrix B of order n such that AB = BA = I. B is called the inverse matrix of A and is denoted by A^{-1}. A matrix which is not invertible is called singular.

If A is invertible its inverse is also invertible, with (A^{-1})^{-1} = A. Moreover, if A and B are two invertible matrices of order n, their product AB is also invertible, with (AB)^{-1} = B^{-1} A^{-1}. The following property holds.

Property 1.2 A square matrix is invertible iff its column vectors are linearly independent.

Definition 1.7 We call the transpose of a matrix A ∈ R^{m×n} the matrix n × m, denoted by A^T, that is obtained by exchanging the rows of A with the columns of A.

Clearly, (A^T)^T = A, (A + B)^T = A^T + B^T, (AB)^T = B^T A^T and (αA)^T = α A^T ∀α ∈ R. If A is invertible, then also (A^T)^{-1} = (A^{-1})^T = A^{-T}.

Definition 1.8 Let A ∈ C^{m×n}; the matrix B = A^H ∈ C^{n×m} is called the conjugate transpose (or adjoint) of A if b_ij = ā_ji, where ā_ji is the complex conjugate of a_ji.

In analogy with the case of the real matrices, it turns out that (A + B)^H = A^H + B^H, (AB)^H = B^H A^H and (αA)^H = ᾱ A^H ∀α ∈ C.


Definition 1.10 A matrix A ∈ C^{n×n} is called hermitian or self-adjoint if A^T = Ā, that is, if A^H = A, while it is called unitary if A^H A = A A^H = I. Finally, if A A^H = A^H A, A is called normal. As a consequence, a unitary matrix is one such that A^{-1} = A^H.

Of course, a unitary matrix is also normal, but it is not in general hermitian. For instance, the matrix of the Example 1.4 is unitary, although not symmetric (if s ≠ 0). We finally notice that the diagonal entries of an hermitian matrix must necessarily be real (see also Exercise 5).

1.3.2 Matrices and Linear Mappings

Definition 1.11 A linear map from C^n into C^m is a function f : C^n → C^m such that f(αx + βy) = α f(x) + β f(y), ∀α, β ∈ K and ∀x, y ∈ C^n.

The following result links matrices and linear maps.

Property 1.3 Let f : C^n → C^m be a linear map. Then, there exists a unique matrix A_f ∈ C^{m×n} such that

f(x) = A_f x  ∀x ∈ C^n.    (1.3)

Conversely, if A_f ∈ C^{m×n} then the function defined in (1.3) is a linear map from C^n into C^m.

Example 1.4 An important example of a linear map is the counterclockwise rotation by an angle ϑ in the plane (x_1, x_2). The matrix associated with such a map is given by

G(ϑ) = \begin{pmatrix} c & s \\ -s & c \end{pmatrix},  c = cos(ϑ), s = sin(ϑ),

and it is called a rotation matrix. •

1.3.3 Operations with Block-Partitioned Matrices

All the operations that have been previously introduced can be extended to the case of a block-partitioned matrix A, provided that the size of each single block is such that any single matrix operation is well-defined. Indeed, the following result can be shown (see, e.g., [Ste73]).

Property 1.4 Let A and B be the block matrices

A = \begin{pmatrix} A_{11} & \dots & A_{1l} \\ \vdots & \ddots & \vdots \\ A_{k1} & \dots & A_{kl} \end{pmatrix},  B = \begin{pmatrix} B_{11} & \dots & B_{1n} \\ \vdots & \ddots & \vdots \\ B_{m1} & \dots & B_{mn} \end{pmatrix}.

Then:

1. λA = \begin{pmatrix} λA_{11} & \dots & λA_{1l} \\ \vdots & \ddots & \vdots \\ λA_{k1} & \dots & λA_{kl} \end{pmatrix}, λ ∈ C;  A^T = \begin{pmatrix} A_{11}^T & \dots & A_{k1}^T \\ \vdots & \ddots & \vdots \\ A_{1l}^T & \dots & A_{kl}^T \end{pmatrix};

2. if k = m, l = n, m_i = k_i and n_j = l_j, then

A + B = \begin{pmatrix} A_{11} + B_{11} & \dots & A_{1l} + B_{1l} \\ \vdots & \ddots & \vdots \\ A_{k1} + B_{k1} & \dots & A_{kl} + B_{kl} \end{pmatrix};

3. if l = m, l_i = m_i and k_i = n_i, then, letting C_{ij} = Σ_{s=1}^{m} A_{is} B_{sj},

AB = \begin{pmatrix} C_{11} & \dots & C_{1l} \\ \vdots & \ddots & \vdots \\ C_{k1} & \dots & C_{kl} \end{pmatrix}.

1.4 Trace and Determinant of a Matrix

Let us consider a square matrix A of order n. The trace of a matrix is the sum of the diagonal entries of A, that is

tr(A) = Σ_{i=1}^{n} a_ii.

We call the determinant of A the scalar defined through the following formula

det(A) = Σ_{π∈P} sign(π) a_{1π_1} a_{2π_2} . . . a_{nπ_n},

where P = {π = (π_1, . . . , π_n)^T} is the set of the n! vectors that are obtained by permuting the index vector i = (1, . . . , n)^T and sign(π) is equal to 1 (respectively, −1) if an even (respectively, odd) number of exchanges is needed to obtain π from i.

The following properties hold

det(A) = det(A^T), det(AB) = det(A) det(B), det(A^{-1}) = 1/det(A),

det(A^H) = \overline{det(A)}, det(αA) = α^n det(A), ∀α ∈ K.

Moreover, exchanging two rows (or two columns) of a matrix produces a change of sign in the determinant. Of course, the determinant of a diagonal matrix is the product of the diagonal entries.

Denoting by A_ij the matrix of order n − 1 obtained from A by eliminating the i-th row and the j-th column, we call the complementary minor associated with the entry a_ij the determinant of the matrix A_ij. We call the k-th principal (dominating) minor of A, d_k, the determinant of the principal submatrix of order k, A_k = A(1 : k, 1 : k). If we denote by Δ_ij = (−1)^{i+j} det(A_ij) the cofactor of the entry a_ij, the actual computation of the determinant of A can be performed using the following recursive relation

det(A) = \begin{cases} a_{11} & \text{if } n = 1, \\ \sum_{j=1}^{n} \Delta_{ij} a_{ij} & \text{for } n > 1, \end{cases}    (1.4)

which is known as the Laplace rule. If A is a square invertible matrix of order n, then

A^{-1} = (1/det(A)) C,

where C is the matrix having entries Δ_ji, i, j = 1, . . . , n.

As a consequence, a square matrix is invertible iff its determinant is nonvanishing. In the case of nonsingular diagonal matrices the inverse is still a diagonal matrix having entries given by the reciprocals of the diagonal entries of the matrix.

Every orthogonal matrix is invertible, its inverse is given by A^T, and moreover det(A) = ±1.
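As an illustration of the Laplace rule (1.4), the following MATLAB sketch expands the determinant along the first row, recursing on the complementary minors. It merely mirrors formula (1.4); its cost grows like n!, so it is not meant as a practical algorithm, and the function name is ours:

function d = laplacedet(A)
% LAPLACEDET determinant via the Laplace rule (1.4), expansion along row 1
n = size(A, 1);
if n == 1
  d = A(1,1);
else
  d = 0;
  for j = 1:n
    Aij = A(2:n, [1:j-1, j+1:n]);              % eliminate row 1 and column j
    d = d + (-1)^(1+j) * A(1,j) * laplacedet(Aij);
  end
end

For instance, laplacedet([2 1; 1 3]) returns 5, the same value produced by MATLAB's built-in det.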

1.5 Rank and Kernel of a Matrix

Let A be a rectangular matrix m × n. We call the determinant of order q (with q ≥ 1) extracted from matrix A, the determinant of any square matrix of order q obtained from A by eliminating m − q rows and n − q columns.

Definition 1.12 The rank of A (denoted by rank(A)) is the maximum order of the nonvanishing determinants extracted from A. A matrix has complete or full rank if rank(A) = min(m, n).

Notice that the rank of A represents the maximum number of linearly independent column vectors of A, that is, the dimension of the range of A, defined as

range(A) = {y ∈ R^m : y = Ax for x ∈ R^n}.    (1.5)

Rigorously speaking, one should distinguish between the column rank of A and the row rank of A, the latter being the maximum number of linearly independent row vectors of A. Nevertheless, it can be shown that the row rank and column rank do actually coincide.

The kernel of A is defined as the subspace

ker(A) = {x ∈ R^n : Ax = 0}.

The following relations hold:

1. rank(A) = rank(A^T) (if A ∈ C^{m×n}, rank(A) = rank(A^H));

2. rank(A) + dim(ker(A)) = n.

In general, dim(ker(A)) ≠ dim(ker(A^T)). If A is a nonsingular square matrix, then rank(A) = n and dim(ker(A)) = 0.

Example 1.5 Let

A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & -1 & 1 \end{pmatrix}.

Then, rank(A) = 2, dim(ker(A)) = 1 and dim(ker(A^T)) = 0.

We finally notice that for a matrix A ∈ C^{n×n} the following properties are equivalent:

1. A is nonsingular;

2. det(A) ≠ 0;

3. ker(A) = {0};

4. rank(A) = n;

5. A has linearly independent rows and columns.
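These quantities are easy to probe numerically. A short MATLAB sketch on the matrix of Example 1.5, using the built-in functions rank and null:

A = [1 1 0; 1 -1 1];
r  = rank(A);            % returns 2
k  = size(null(A), 2);   % dim(ker(A))   = n - rank(A) = 1
kt = size(null(A'), 2);  % dim(ker(A^T)) = m - rank(A^T) = 0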

1.6 Special Matrices

1.6.1 Block Diagonal Matrices

These are matrices of the form D = diag(D_1, . . . , D_n), where D_i are square matrices with i = 1, . . . , n. Clearly, each single diagonal block can be of different size. We shall say that a block diagonal matrix has size n if n


1.6.2 Trapezoidal and Triangular Matrices

A matrix A (m × n) is called upper trapezoidal if a_ij = 0 for i > j, while it is lower trapezoidal if a_ij = 0 for i < j. The name is due to the fact that, in the case of upper trapezoidal matrices, with m < n, the nonzero entries of the matrix form a trapezoid.

A triangular matrix is a square trapezoidal matrix of order n of the form

L = \begin{pmatrix} l_{11} & 0 & \dots & 0 \\ l_{21} & l_{22} & \dots & 0 \\ \vdots & \vdots & & \vdots \\ l_{n1} & l_{n2} & \dots & l_{nn} \end{pmatrix}  or  U = \begin{pmatrix} u_{11} & u_{12} & \dots & u_{1n} \\ 0 & u_{22} & \dots & u_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & u_{nn} \end{pmatrix}.

The matrix L is called lower triangular while U is upper triangular. Let us recall some algebraic properties of triangular matrices that are easy to check.

- The determinant of a triangular matrix is the product of the diagonal entries;

- the inverse of a lower (respectively, upper) triangular matrix is still lower (respectively, upper) triangular;

- the product of two lower triangular (respectively, upper trapezoidal) matrices is still lower triangular (respectively, upper trapezoidal);

- if we call unit triangular matrix a triangular matrix that has diagonal entries equal to 1, then the product of lower (respectively, upper) unit triangular matrices is still lower (respectively, upper) unit triangular.

1.6.3 Banded Matrices

The matrices introduced in the previous section are a special instance of banded matrices. Indeed, we say that a matrix A ∈ R^{m×n} (or in C^{m×n}) has lower band p if a_ij = 0 when i > j + p and upper band q if a_ij = 0 when j > i + q. Diagonal matrices are banded matrices for which p = q = 0, while trapezoidal matrices have p = m − 1, q = 0 (lower trapezoidal), p = 0, q = n − 1 (upper trapezoidal).

Other banded matrices of relevant interest are the tridiagonal matrices, for which p = q = 1, and the upper bidiagonal (p = 0, q = 1) or lower bidiagonal (p = 1, q = 0) ones. In the following, tridiag_n(b, d, c) will denote the tridiagonal matrix of size n having respectively on the lower and upper principal diagonals the vectors b = (b_1, . . . , b_{n−1})^T and c = (c_1, . . . , c_{n−1})^T, and on the principal diagonal the vector d = (d_1, . . . , d_n)^T. If b_i = β, d_i = δ and c_i = γ, β, δ and γ being given constants, the matrix will simply be denoted by tridiag_n(β, δ, γ).
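A matrix of the form tridiag_n(b, d, c) can be assembled in MATLAB with the built-in diag (or, for large n, with a sparse storage via spdiags); a minimal sketch with arbitrary entries:

n = 5;
b = -ones(n-1,1);  d = 2*ones(n,1);  c = -ones(n-1,1);
T = diag(d) + diag(b,-1) + diag(c,1);        % dense tridiag_n(b,d,c)
Ts = spdiags([[b;0] d [0;c]], -1:1, n, n);    % same matrix in sparse format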


We also mention the so-called lower Hessenberg matrices (p = m − 1, q = 1) and upper Hessenberg matrices (p = 1, q = n − 1) that have the following structure

H = \begin{pmatrix} h_{11} & h_{12} & & 0 \\ h_{21} & h_{22} & \ddots & \\ \vdots & & \ddots & h_{m-1,n} \\ h_{m1} & \dots & \dots & h_{mn} \end{pmatrix}  or  H = \begin{pmatrix} h_{11} & h_{12} & \dots & h_{1n} \\ h_{21} & h_{22} & & h_{2n} \\ & \ddots & \ddots & \vdots \\ 0 & & h_{m,n-1} & h_{mn} \end{pmatrix}.

Matrices of similar shape can obviously be set up in the block-like format.

1.7 Eigenvalues and Eigenvectors

Let A be a square matrix of order n with real or complex entries; the number λ ∈ C is called an eigenvalue of A if there exists a nonnull vector x ∈ C^n such that Ax = λx. The vector x is the eigenvector associated with the eigenvalue λ and the set of the eigenvalues of A is called the spectrum of A, denoted by σ(A). We say that x and y are respectively a right eigenvector and a left eigenvector of A, associated with the eigenvalue λ, if

Ax = λx,  y^H A = λ y^H.

The eigenvalue λ corresponding to the eigenvector x can be determined by computing the Rayleigh quotient λ = x^H Ax/(x^H x). The number λ is the solution of the characteristic equation

p_A(λ) = det(A − λI) = 0,

where p_A(λ) is the characteristic polynomial. Since this latter is a polynomial of degree n with respect to λ, there certainly exist n eigenvalues of A, not necessarily distinct. The following properties can be proved

det(A) = Π_{i=1}^{n} λ_i,  tr(A) = Σ_{i=1}^{n} λ_i,    (1.6)

and since det(A^T − λI) = det((A − λI)^T) = det(A − λI) one concludes that σ(A) = σ(A^T) and, in an analogous way, that σ(A^H) = σ(Ā).

From the first relation in (1.6) it can be concluded that a matrix is singular iff it has at least one null eigenvalue, since p_A(0) = det(A) = Π_{i=1}^{n} λ_i.

Secondly, if A has real entries, p_A(λ) turns out to be a real-coefficient polynomial, so that complex eigenvalues of A necessarily occur in complex conjugate pairs.


Finally, due to the Cayley-Hamilton Theorem, if p_A(λ) is the characteristic polynomial of A, then p_A(A) = 0, where p_A(A) denotes a matrix polynomial (for the proof see, e.g., [Axe94], p. 51).

The maximum module of the eigenvalues of A is called the spectral radius of A and is denoted by

ρ(A) = max_{λ∈σ(A)} |λ|.    (1.7)

Characterizing the eigenvalues of a matrix as the roots of a polynomial implies in particular that λ is an eigenvalue of A ∈ C^{n×n} iff λ̄ is an eigenvalue of A^H. An immediate consequence is that ρ(A) = ρ(A^H). Moreover, ∀A ∈ C^{n×n}, ∀α ∈ C, ρ(αA) = |α| ρ(A), and ρ(A^k) = [ρ(A)]^k ∀k ∈ N.
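Both relations in (1.6) and the definition (1.7) are easy to check numerically. A minimal MATLAB sketch on an arbitrarily chosen matrix (the matrix itself is not taken from the text):

A = [4 1 0; 1 3 1; 0 1 2];
lam = eig(A);                          % eigenvalues of A
err_det = abs(det(A) - prod(lam));     % ~0, first relation in (1.6)
err_tr  = abs(trace(A) - sum(lam));    % ~0, second relation in (1.6)
rho = max(abs(lam));                   % spectral radius (1.7)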

Finally, assume that A is a block triangular matrix

A = \begin{pmatrix} A_{11} & A_{12} & \dots & A_{1k} \\ 0 & A_{22} & \dots & A_{2k} \\ \vdots & & \ddots & \vdots \\ 0 & \dots & 0 & A_{kk} \end{pmatrix}.

As p_A(λ) = p_{A_{11}}(λ) p_{A_{22}}(λ) · · · p_{A_{kk}}(λ), the spectrum of A is given by the union of the spectra of each single diagonal block. As a consequence, if A is triangular, the eigenvalues of A are its diagonal entries.

For each eigenvalue λ of a matrix A, the set of the eigenvectors associated with λ, together with the null vector, identifies a subspace of C^n which is called the eigenspace associated with λ and corresponds by definition to ker(A − λI). The dimension of the eigenspace is

dim[ker(A − λI)] = n − rank(A − λI),

and is called the geometric multiplicity of the eigenvalue λ. It can never be greater than the algebraic multiplicity of λ, which is the multiplicity of λ as a root of the characteristic polynomial. Eigenvalues having geometric multiplicity strictly less than the algebraic one are called defective. A matrix having at least one defective eigenvalue is called defective.

The eigenspace associated with an eigenvalue of a matrix A is invariant with respect to A in the sense of the following definition.

Definition 1.13 A subspace S in C^n is called invariant with respect to a square matrix A if AS ⊆ S, where AS is the transformed of S through A.


1.8 Similarity Transformations

Definition 1.14 Let C be a square nonsingular matrix having the same order as the matrix A. We say that the matrices A and C^{-1}AC are similar, and the transformation from A to C^{-1}AC is called a similarity transformation. Moreover, we say that the two matrices are unitarily similar if C is unitary.

Two similar matrices share the same spectrum and the same characteristic polynomial. Indeed, it is easy to check that if (λ, x) is an eigenvalue-eigenvector pair of A, (λ, C^{-1}x) is the same for the matrix C^{-1}AC since

(C^{-1}AC) C^{-1}x = C^{-1}Ax = λ C^{-1}x.

We notice in particular that the product matrices AB and BA, with A ∈ C^{n×m} and B ∈ C^{m×n}, are not similar but satisfy the following property (see [Hac94], p. 18, Theorem 2.4.6)

σ(AB) \ {0} = σ(BA) \ {0},

that is, AB and BA share the same spectrum apart from null eigenvalues so that ρ(AB) = ρ(BA).

The use of similarity transformations aims at reducing the complexity of the problem of evaluating the eigenvalues of a matrix. Indeed, if a given matrix could be transformed into a similar matrix in diagonal or triangular form, the computation of the eigenvalues would be immediate. The main result in this direction is the following theorem (for the proof, see [Dem97], Theorem 4.2).

Property 1.5 (Schur decomposition) Given A ∈ C^{n×n}, there exists U unitary such that

U^{-1}AU = U^H A U = \begin{pmatrix} λ_1 & b_{12} & \dots & b_{1n} \\ 0 & λ_2 & & b_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \dots & 0 & λ_n \end{pmatrix} = T,

where λ_i are the eigenvalues of A.

It thus turns out that every matrix A is unitarily similar to an upper triangular matrix. The matrices T and U are not necessarily unique [Hac94]. The Schur decomposition theorem gives rise to several important results; among them, we recall:

1. every hermitian matrix is unitarily similar to a diagonal real matrix, that is, when A is hermitian every Schur decomposition of A is diagonal. In such an event, since U^{-1}AU = Λ = diag(λ_1, . . . , λ_n), it turns out that AU = UΛ, that is, Au_i = λ_i u_i for i = 1, . . . , n, so that the column vectors of U are the eigenvectors of A. Moreover, since the eigenvectors are orthogonal two by two, it turns out that an hermitian matrix has a system of orthonormal eigenvectors that generates the whole space C^n. Finally, it can be shown that a matrix A of order n is similar to a diagonal matrix D iff the eigenvectors of A form a basis for C^n [Axe94];

2. a matrix A ∈ C^{n×n} is normal iff it is unitarily similar to a diagonal matrix. As a consequence, a normal matrix A ∈ C^{n×n} admits the following spectral decomposition: A = UΛU^H = Σ_{i=1}^{n} λ_i u_i u_i^H, U being unitary and Λ diagonal [SS90];

3. let A and B be two normal and commutative matrices; then, the generic eigenvalue µ_i of A + B is given by the sum λ_i + ξ_i, where λ_i and ξ_i are the eigenvalues of A and B associated with the same eigenvector.

There are, of course, nonsymmetric matrices that are similar to diagonal matrices, but these are not unitarily similar (see, e.g., Exercise 7).

The Schur decomposition can be improved as follows (for the proof see, e.g., [Str80], [God66]).

Property 1.6 (Canonical Jordan Form) Let A be any square matrix. Then, there exists a nonsingular matrix X which transforms A into a block diagonal matrix J such that

X^{-1}AX = J = diag(J_{k_1}(λ_1), J_{k_2}(λ_2), . . . , J_{k_l}(λ_l)),

which is called canonical Jordan form, λ_j being the eigenvalues of A and J_k(λ) ∈ C^{k×k} a Jordan block of the form J_1(λ) = λ if k = 1 and

J_k(λ) = \begin{pmatrix} λ & 1 & 0 & \dots & 0 \\ 0 & λ & 1 & & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ \vdots & & & λ & 1 \\ 0 & \dots & \dots & 0 & λ \end{pmatrix},  for k > 1.


Partitioning X by columns, X = (x_1, . . . , x_n), it can be seen that the k_i vectors associated with the Jordan block J_{k_i}(λ_i) satisfy the following recursive relation

A x_l = λ_i x_l,  l = Σ_{j=1}^{i−1} m_j + 1,
A x_j = λ_i x_j + x_{j−1},  j = l + 1, . . . , l − 1 + k_i,  if k_i ≠ 1.    (1.8)

The vectors x_i are called principal vectors or generalized eigenvectors of A.

Example 1.6 Let us consider the following matrix

A = \begin{pmatrix} 7/4 & 3/4 & -1/4 & -1/4 & -1/4 & 1/4 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ -1/2 & -1/2 & 5/2 & 1/2 & -1/2 & 1/2 \\ -1/2 & -1/2 & -1/2 & 5/2 & 1/2 & 1/2 \\ -1/4 & -1/4 & -1/4 & -1/4 & 11/4 & 1/4 \\ -3/2 & -1/2 & -1/2 & 1/2 & 1/2 & 7/2 \end{pmatrix}.

The Jordan canonical form of A and its associated matrix X are given by

J = \begin{pmatrix} 2 & 1 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 & 0 \\ 0 & 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 & 2 \end{pmatrix},  X = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \end{pmatrix}.

Notice that two different Jordan blocks are related to the same eigenvalue (λ = 2). It is easy to check property (1.8). Consider, for example, the Jordan block associated with the eigenvalue λ_2 = 3; we have

A x_3 = [0 0 3 0 0 3]^T = 3 [0 0 1 0 0 1]^T = λ_2 x_3,
A x_4 = [0 0 1 3 0 4]^T = 3 [0 0 0 1 0 1]^T + [0 0 1 0 0 1]^T = λ_2 x_4 + x_3,
A x_5 = [0 0 0 1 3 4]^T = 3 [0 0 0 0 1 1]^T + [0 0 0 1 0 1]^T = λ_2 x_5 + x_4.

1.9 The Singular Value Decomposition (SVD)

Any matrix can be reduced in diagonal form by a suitable pre and post-multiplication by unitary matrices. Precisely, the following result holds.

Property 1.7 Let A ∈ C^{m×n}. There exist two unitary matrices U ∈ C^{m×m} and V ∈ C^{n×n} such that

U^H A V = Σ = diag(σ_1, . . . , σ_p) ∈ C^{m×n}  with p = min(m, n)    (1.9)

and σ_1 ≥ . . . ≥ σ_p ≥ 0. Formula (1.9) is called the Singular Value Decomposition (SVD) of A and the numbers σ_i (or σ_i(A)) are called the singular values of A.

If A is a real-valued matrix, U and V will also be real-valued and in (1.9) U^T must be written instead of U^H. The following characterization of the singular values holds

σ_i(A) = \sqrt{λ_i(A^H A)},  i = 1, . . . , n.    (1.10)

Indeed, from (1.9) it follows that A = UΣV^H, A^H = VΣU^H so that, U and V being unitary, A^H A = VΣ^2 V^H, that is, λ_i(A^H A) = λ_i(Σ^2) = (σ_i(A))^2.

Since AA^H and A^H A are hermitian matrices, the columns of U, called the left singular vectors of A, turn out to be the eigenvectors of AA^H (see Section 1.8) and, therefore, they are not uniquely defined. The same holds for the columns of V, which are the right singular vectors of A.

Relation (1.10) implies that if A ∈ C^{n×n} is hermitian with eigenvalues given by λ_1, λ_2, . . . , λ_n, then the singular values of A coincide with the modules of the eigenvalues of A. Indeed, because AA^H = A^2, σ_i = \sqrt{λ_i^2} = |λ_i| for i = 1, . . . , n. As far as the rank is concerned, if

σ_1 ≥ . . . ≥ σ_r > σ_{r+1} = . . . = σ_p = 0,

then the rank of A is r, the kernel of A is the span of the column vectors of V, {v_{r+1}, . . . , v_n}, and the range of A is the span of the column vectors of U, {u_1, . . . , u_r}.

Definition 1.15 Suppose that A ∈ C^{m×n} has rank equal to r and that it admits a SVD of the type U^H A V = Σ. The matrix A^† = V Σ^† U^H is called the Moore-Penrose pseudo-inverse matrix, being

Σ^† = diag(1/σ_1, . . . , 1/σ_r, 0, . . . , 0).    (1.11)

The matrix A^† is also called the generalized inverse of A (see Exercise 13). Indeed, if rank(A) = n < m, then A^† = (A^T A)^{-1} A^T, while if n = m = rank(A), A^† = A^{-1}. For further properties of A^†, see also Exercise 12.
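The decomposition (1.9) and the pseudo-inverse (1.11) can be explored with MATLAB's built-in svd and pinv; a minimal sketch on an arbitrary full-column-rank matrix:

A = [1 2; 3 4; 5 6];               % m = 3, n = 2, rank(A) = n < m
[U, S, V] = svd(A);                % A = U*S*V'
sigma = diag(S);                   % singular values, sigma(1) >= sigma(2) >= 0
Adag = pinv(A);                    % Moore-Penrose pseudo-inverse
err = norm(Adag - inv(A'*A)*A');   % ~0, as stated in the text for rank(A) = n < m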

1.10 Scalar Product and Norms in Vector Spaces


Definition 1.16 A scalar product on a vector space V defined over K is any map (·,·) acting from V × V into K which enjoys the following properties:

1. it is linear with respect to the vectors of V, that is

(γx + λz, y) = γ(x, y) + λ(z, y),  ∀x, z ∈ V, ∀γ, λ ∈ K;

2. it is hermitian, that is, (y, x) = \overline{(x, y)}, ∀x, y ∈ V;

3. it is positive definite, that is, (x, x) > 0 ∀x ≠ 0 (in other words, (x, x) ≥ 0, and (x, x) = 0 if and only if x = 0).

In the case V = C^n (or R^n), an example is provided by the classical Euclidean scalar product given by

(x, y) = y^H x = Σ_{i=1}^{n} x_i ȳ_i,

where z̄ denotes the complex conjugate of z.

Moreover, for any given square matrix A of order n and for any x, y ∈ C^n the following relation holds

(Ax, y) = (x, A^H y).    (1.12)

In particular, since for any matrix Q ∈ C^{n×n}, (Qx, Qy) = (x, Q^H Qy), one gets

Property 1.8 Unitary matrices preserve the Euclidean scalar product, that is, (Qx, Qy) = (x, y) for any unitary matrix Q and for any pair of vectors x and y.

Definition 1.17 Let V be a vector space over K. We say that the map ‖·‖ from V into R is a norm on V if the following axioms are satisfied:

1. (i) ‖v‖ ≥ 0 ∀v ∈ V and (ii) ‖v‖ = 0 if and only if v = 0;

2. ‖αv‖ = |α| ‖v‖ ∀α ∈ K, ∀v ∈ V (homogeneity property);

3. ‖v + w‖ ≤ ‖v‖ + ‖w‖ ∀v, w ∈ V (triangular inequality),

where |α| denotes the absolute value of α if K = R, the module of α if K = C.

The pair (V, ‖·‖) is called a normed space. We shall distinguish among norms by a suitable subscript at the margin of the double bar symbol. In the case the map |·| from V into R enjoys only the properties 1(i), 2 and 3 we shall call such a map a seminorm. Finally, we shall call a unit vector any vector of V having unit norm.

An example of a normed space is R^n, equipped for instance with the p-norm (or Hölder norm); this latter is defined for a vector x of components {x_i} as

‖x‖_p = (Σ_{i=1}^{n} |x_i|^p)^{1/p},  for 1 ≤ p < ∞.    (1.13)

Notice that the limit as p goes to infinity of ‖x‖_p exists, is finite, and equals the maximum module of the components of x. Such a limit defines in turn a norm, called the infinity norm (or maximum norm), given by

‖x‖_∞ = max_{1≤i≤n} |x_i|.

When p = 2, from (1.13) the standard definition of Euclidean norm is recovered

‖x‖_2 = (x, x)^{1/2} = (Σ_{i=1}^{n} |x_i|^2)^{1/2} = (x^T x)^{1/2},

for which the following property holds.
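In MATLAB the p-norms (1.13) and the infinity norm are all provided by the built-in function norm; a short sketch:

x = [3; -4; 12];
n1   = norm(x, 1);     % |3| + |4| + |12| = 19
n2   = norm(x, 2);     % sqrt(9 + 16 + 144) = 13
ninf = norm(x, Inf);   % maximum module of the components = 12
np   = norm(x, 50);    % as p grows, the p-norm approaches the infinity norm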

Property 1.9 (Cauchy-Schwarz inequality) For any pair x, y ∈ R^n,

|(x, y)| = |x^T y| ≤ ‖x‖_2 ‖y‖_2,    (1.14)

where strict equality holds iff y = αx for some α ∈ R.

We recall that the scalar product in R^n can be related to the p-norms introduced over R^n in (1.13) by the Hölder inequality

|(x, y)| ≤ ‖x‖_p ‖y‖_q,  with 1/p + 1/q = 1.

In the case where V is a finite-dimensional space the following property holds (for a sketch of the proof, see Exercise 14).

Property 1.10 Any vector norm ‖·‖ defined on V is a continuous function of its argument, namely, ∀ε > 0, ∃C > 0 such that if ‖x − x̂‖ ≤ ε then |‖x‖ − ‖x̂‖| ≤ Cε, for any x, x̂ ∈ V.


Property 1.11 Let ‖·‖ be a norm of R^n and A ∈ R^{n×n} be a matrix with n linearly independent columns. Then, the function ‖·‖_{A^2} acting from R^n into R defined as

‖x‖_{A^2} = ‖Ax‖  ∀x ∈ R^n,

is a norm of R^n.

Two vectors x, y in V are said to be orthogonal if (x, y) = 0. This statement has an immediate geometric interpretation when V = R^2 since in such a case

(x, y) = ‖x‖_2 ‖y‖_2 cos(ϑ),

where ϑ is the angle between the vectors x and y. As a consequence, if (x, y) = 0 then ϑ is a right angle and the two vectors are orthogonal in the geometric sense.

Definition 1.18 Two norms ‖·‖_p and ‖·‖_q on V are equivalent if there exist two positive constants c_pq and C_pq such that

c_pq ‖x‖_q ≤ ‖x‖_p ≤ C_pq ‖x‖_q  ∀x ∈ V.

In a finite-dimensional normed space all norms are equivalent. In particular, if V = R^n it can be shown that for the p-norms, with p = 1, 2 and ∞, the constants c_pq and C_pq take the values reported in Table 1.1.

c_pq:            q = 1      q = 2      q = ∞
p = 1:           1          1          1
p = 2:           n^{-1/2}   1          1
p = ∞:           n^{-1}     n^{-1/2}   1

C_pq:            q = 1      q = 2      q = ∞
p = 1:           1          n^{1/2}    n
p = 2:           1          1          n^{1/2}
p = ∞:           1          1          1

TABLE 1.1. Equivalence constants for the main norms of R^n
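The constants of Table 1.1 can be spot-checked on random vectors; a minimal MATLAB sketch for the single pair p = 2, q = ∞, for which c = 1 and C = n^{1/2}:

n = 8;  x = randn(n, 1);
lower = ( norm(x, Inf) <= norm(x, 2) );            % c_{2,inf} = 1
upper = ( norm(x, 2) <= sqrt(n)*norm(x, Inf) );    % C_{2,inf} = n^{1/2}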

In this book we shall often deal with sequences of vectors and with their convergence. For this purpose, we recall that a sequence of vectors {x^(k)} in a vector space V having finite dimension n converges to a vector x, and we write lim_{k→∞} x^(k) = x, if

lim_{k→∞} x_i^(k) = x_i,  i = 1, . . . , n.    (1.15)


sequence of real numbers, (1.15) implies also the uniqueness of the limit, if existing, of a sequence of vectors.

We further notice that in a finite-dimensional space all the norms are topologically equivalent in the sense of convergence, namely, given a sequence of vectors {x^(k)},

|||x^(k)||| → 0  ⇔  ‖x^(k)‖ → 0  as k → ∞,

where ||| · ||| and ‖·‖ are any two vector norms. As a consequence, we can establish the following link between norms and limits.

Property 1.12 Let ‖·‖ be a norm in a finite dimensional space V. Then

lim_{k→∞} x^(k) = x  ⇔  lim_{k→∞} ‖x − x^(k)‖ = 0,

where x ∈ V and {x^(k)} is a sequence of elements of V.

1.11 Matrix Norms

Definition 1.19 A matrix norm is a mapping ‖·‖ : R^{m×n} → R such that:

1. ‖A‖ ≥ 0 ∀A ∈ R^{m×n} and ‖A‖ = 0 if and only if A = 0;

2. ‖αA‖ = |α| ‖A‖ ∀α ∈ R, ∀A ∈ R^{m×n} (homogeneity);

3. ‖A + B‖ ≤ ‖A‖ + ‖B‖ ∀A, B ∈ R^{m×n} (triangular inequality).

Unless otherwise specified we shall employ the same symbol ‖·‖ to denote matrix norms and vector norms.

We can better characterize the matrix norms by introducing the concepts of compatible norm and norm induced by a vector norm.

Definition 1.20 We say that a matrix norm ‖·‖ is compatible or consistent with a vector norm ‖·‖ if

‖Ax‖ ≤ ‖A‖ ‖x‖,  ∀x ∈ R^n.    (1.16)


Definition 1.21 We say that a matrix norm ‖·‖ is sub-multiplicative if ∀A ∈ R^{n×m}, ∀B ∈ R^{m×q}

‖AB‖ ≤ ‖A‖ ‖B‖.    (1.17)

This property is not satisfied by any matrix norm. For example (taken from [GL89]), the norm ‖A‖_Δ = max |a_ij| for i = 1, . . . , n, j = 1, . . . , m does not satisfy (1.17) if applied to the matrices

A = B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},

since 2 = ‖AB‖_Δ > ‖A‖_Δ ‖B‖_Δ = 1.

Notice that, given a certain sub-multiplicative matrix norm ‖·‖_α, there always exists a consistent vector norm. For instance, given any fixed vector y ≠ 0 in C^n, it suffices to define the consistent vector norm as

‖x‖ = ‖x y^H‖_α,  x ∈ C^n.

As a consequence, in the case of sub-multiplicative matrix norms it is no longer necessary to explicitly specify the vector norm with respect to which the matrix norm is consistent.

Example 1.7 The norm

‖A‖_F = \sqrt{Σ_{i,j=1}^{n} |a_ij|^2} = \sqrt{tr(A A^H)}    (1.18)

is a matrix norm called the Frobenius norm (or Euclidean norm in C^{n^2}) and is compatible with the Euclidean vector norm ‖·‖_2. Indeed,

‖Ax‖_2^2 = Σ_{i=1}^{n} |Σ_{j=1}^{n} a_ij x_j|^2 ≤ Σ_{i=1}^{n} (Σ_{j=1}^{n} |a_ij|^2)(Σ_{j=1}^{n} |x_j|^2) = ‖A‖_F^2 ‖x‖_2^2.

Notice that for such a norm ‖I_n‖_F = √n. •
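A quick MATLAB check of definition (1.18) and of the compatibility property (1.16) with the Euclidean vector norm (the matrix and vector are random):

A = randn(4);  x = randn(4, 1);
f1 = norm(A, 'fro');
f2 = sqrt(trace(A*A'));                    % same value as f1, definition (1.18)
compat = ( norm(A*x, 2) <= f1*norm(x, 2) ); % compatibility (1.16) holds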

In view of the definition of a natural norm, we recall the following theorem.

Theorem 1.1 Let ‖·‖ be a vector norm. The function

‖A‖ = sup_{x≠0} ‖Ax‖/‖x‖    (1.19)

is a matrix norm, called the induced matrix norm or natural matrix norm.


Proof. We start by noticing that (1.19) is equivalent to

‖A‖ = sup_{‖x‖=1} ‖Ax‖.    (1.20)

Indeed, one can define for any x ≠ 0 the unit vector u = x/‖x‖, so that (1.19) becomes

‖A‖ = sup_{‖u‖=1} ‖Au‖ = ‖Aw‖  with ‖w‖ = 1.

This being taken as given, let us check that (1.19) (or, equivalently, (1.20)) is actually a norm, making direct use of Definition 1.19.

1. Since ‖Ax‖ ≥ 0, it follows that ‖A‖ = sup_{‖x‖=1} ‖Ax‖ ≥ 0. Moreover

‖A‖ = sup_{x≠0} ‖Ax‖/‖x‖ = 0  ⇔  ‖Ax‖ = 0 ∀x ≠ 0,

and Ax = 0 ∀x ≠ 0 if and only if A = 0; therefore ‖A‖ = 0 ⇔ A = 0.

2. Given a scalar α,

‖αA‖ = sup_{‖x‖=1} ‖αAx‖ = |α| sup_{‖x‖=1} ‖Ax‖ = |α| ‖A‖.

3. Finally, the triangular inequality holds. Indeed, by definition of supremum, if x ≠ 0 then

‖Ax‖/‖x‖ ≤ ‖A‖  ⇒  ‖Ax‖ ≤ ‖A‖ ‖x‖,

so that, taking x with unit norm, one gets

‖(A + B)x‖ ≤ ‖Ax‖ + ‖Bx‖ ≤ ‖A‖ + ‖B‖,

from which it follows that ‖A + B‖ = sup_{‖x‖=1} ‖(A + B)x‖ ≤ ‖A‖ + ‖B‖. ✸

Relevant instances of induced matrix norms are the so-called p-norms, defined as

‖A‖_p = sup_{x≠0} ‖Ax‖_p / ‖x‖_p.

The 1-norm and the infinity norm are easily computable since

‖A‖_1 = max_{j=1,...,n} Σ_{i=1}^{m} |a_ij|,  ‖A‖_∞ = max_{i=1,...,m} Σ_{j=1}^{n} |a_ij|,

and they are called the column sum norm and the row sum norm, respectively.

Moreover, we have ‖A‖_1 = ‖A^T‖_∞ and, if A is self-adjoint or real symmetric, ‖A‖_1 = ‖A‖_∞.
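The column sum and row sum formulas can be reproduced directly and compared with MATLAB's norm; a minimal sketch on an arbitrary 2 × 3 matrix:

A = [1 -2 3; -4 5 -6];
n1   = max(sum(abs(A), 1));   % column sum norm, equals norm(A, 1)
ninf = max(sum(abs(A), 2));   % row sum norm, equals norm(A, Inf)
check = [n1 - norm(A, 1), ninf - norm(A, Inf)];   % both entries are zero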


Theorem 1.2 Let σ_1(A) be the largest singular value of A. Then

‖A‖_2 = \sqrt{ρ(A^H A)} = \sqrt{ρ(A A^H)} = σ_1(A).    (1.21)

In particular, if A is hermitian (or real and symmetric), then

‖A‖_2 = ρ(A),    (1.22)

while, if A is unitary, ‖A‖_2 = 1.

Proof. Since A^H A is hermitian, there exists a unitary matrix U such that

U^H A^H A U = diag(µ_1, . . . , µ_n),

where µ_i are the (positive) eigenvalues of A^H A. Let y = U^H x; then

‖A‖_2^2 = sup_{x≠0} (A^H Ax, x)/(x, x) = sup_{y≠0} (U^H A^H A U y, y)/(y, y)
        = sup_{y≠0} Σ_{i=1}^{n} µ_i |y_i|^2 / Σ_{i=1}^{n} |y_i|^2 = max_{i=1,...,n} |µ_i|,

from which (1.21) follows, thanks to (1.10).

If A is hermitian, the same considerations as above apply directly to A. Finally, if A is unitary

‖Ax‖_2^2 = (Ax, Ax) = (x, A^H Ax) = ‖x‖_2^2,

so that ‖A‖_2 = 1. ✸

As a consequence, the computation of ‖A‖_2 is much more expensive than that of ‖A‖_∞ or ‖A‖_1. However, if only an estimate of ‖A‖_2 is required, the following relations can be profitably employed in the case of square matrices

max_{i,j} |a_ij| ≤ ‖A‖_2 ≤ n max_{i,j} |a_ij|,

(1/√n) ‖A‖_∞ ≤ ‖A‖_2 ≤ √n ‖A‖_∞,

(1/√n) ‖A‖_1 ≤ ‖A‖_2 ≤ √n ‖A‖_1,

‖A‖_2 ≤ \sqrt{‖A‖_1 ‖A‖_∞}.

For other estimates of similar type we refer to Exercise 17. Moreover, if A is normal then ‖A‖_2 ≤ ‖A‖_p for any n and all p ≥ 2.
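Relation (1.21) and the last estimate above can be spot-checked in MATLAB; a short sketch on a random matrix:

A = randn(5);
n2 = norm(A, 2);                              % 2-norm of A
s1 = max(svd(A));                             % largest singular value, equals n2
r  = sqrt(max(abs(eig(A'*A))));               % sqrt(rho(A^H A)), equals n2 as well
bound = ( n2 <= sqrt(norm(A,1)*norm(A,Inf)) );% last estimate listed above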

Theorem 1.3 Let ||| · ||| be a matrix norm induced by a vector norm ‖·‖. Then
