Topics in Advanced Linear Algebra
9.3 Reducible Matrices
Nonnegative Matrices and Stochastic Matrices 9-7
With D = I, a relaxation of this bound on τ(P) yields the expression
\[
\tau(P) \le \min\left\{ \rho - \sum_{j=1}^{n} \min_i \frac{P_{ij} v_j}{v_i},\ \ \sum_{j=1}^{n} \max_i \frac{P_{ij} v_j}{v_i} - \rho \right\}.
\]
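5. [RT85, Theorem 4.3] For a positive vector $u \in \mathbb{R}^n$, consider the function $M_u : \mathbb{R}^n \to \mathbb{R}$ defined for $a \in \mathbb{R}^n$ by
\[
M_u(a) = \max\{x^T a : x \in \mathbb{R}^n,\ \|x\|_\infty \le 1,\ x^T u = 0\}.
\]
This function has a simple explicit representation obtained by sorting the ratios $a_j/u_j$, i.e., identifying a permutation j(1), . . ., j(n) of 1, . . ., n such that
\[
\frac{a_{j(1)}}{u_{j(1)}} \ge \frac{a_{j(2)}}{u_{j(2)}} \ge \cdots \ge \frac{a_{j(n)}}{u_{j(n)}}.
\]
With k as the smallest integer in {1, . . ., n} such that $2\sum_{p=1}^{k} u_{j(p)} > \sum_{t=1}^{n} u_t$ and
\[
\mu \equiv 1 + \frac{1}{u_{j(k)}}\left( \sum_{t=1}^{n} u_t - 2\sum_{p=1}^{k} u_{j(p)} \right),
\]
we have that
\[
M_u(a) = \sum_{p=1}^{k-1} a_{j(p)} + \mu\, a_{j(k)} - \sum_{p=k+1}^{n} a_{j(p)}.
\]
The maximum is attained at the vector x with $x_{j(p)} = 1$ for p < k, $x_{j(k)} = \mu$, and $x_{j(p)} = -1$ for p > k; the choice of k and μ gives −1 ≤ μ < 1 and $x^T u = 0$.
With ‖·‖ as the ∞-norm on $\mathbb{R}^n$ and $(D^{-1}PD)_1, \ldots, (D^{-1}PD)_n$ as the columns of $D^{-1}PD$, the bound in Fact 11 on the coefficient of ergodicity τ(P) of P becomes
\[
\max_{r=1,\ldots,n} M_{D^{-1}w}\left[(D^{-1}PD)_r\right].
\]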
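The sorting representation of $M_u(a)$ can be sketched in a few lines of Python. This is an illustrative sketch only, not code from [RT85]: the function name `m_u` is hypothetical, the ∞-norm is assumed, the ratios $a_j/u_j$ are ordered from largest to smallest, and μ is taken as $1 + (\sum_t u_t - 2\sum_{p \le k} u_{j(p)})/u_{j(k)}$, the value that makes the returned maximizer satisfy $x^T u = 0$. Exact rational arithmetic is used so the constraint can be checked without rounding.

```python
from fractions import Fraction


def m_u(a, u):
    """Sketch: evaluate max{x.a : |x_i| <= 1, x.u = 0} for a positive vector u.

    Returns the optimal value and one maximizer x.
    """
    n = len(a)
    a = [Fraction(t) for t in a]
    u = [Fraction(t) for t in u]
    if any(t <= 0 for t in u):
        raise ValueError("u must be a positive vector")

    # Order the indices by nonincreasing ratio a_j / u_j.
    order = sorted(range(n), key=lambda j: a[j] / u[j], reverse=True)

    # Find the first position k with 2 * (u_{j(1)} + ... + u_{j(k)}) > sum(u).
    total = sum(u)
    running = Fraction(0)
    k = n - 1
    for pos, j in enumerate(order):
        running += u[j]
        if 2 * running > total:
            k = pos
            break

    # Coefficient of the pivot entry; it lies in [-1, 1).
    mu = 1 + (total - 2 * running) / u[order[k]]

    # Maximizer: +1 before the pivot, mu at the pivot, -1 after it.
    x = [Fraction(0)] * n
    for pos, j in enumerate(order):
        if pos < k:
            x[j] = Fraction(1)
        elif pos == k:
            x[j] = mu
        else:
            x[j] = Fraction(-1)

    value = sum(xi * ai for xi, ai in zip(x, a))
    return value, x


value, x = m_u([3, 1, 2], [1, 2, 3])
print(value, x)
```

For a = (3, 1, 2) and u = (1, 2, 3) the ratios are 3, 1/2, 2/3, so the order is j = 1, 3, 2, the pivot is the second position, and μ = 1/3; the returned x is feasible because 1·1 − 1·2 + (1/3)·3 = 0.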
P is convergent or transient if $\lim_{m\to\infty} P^m = 0$.
P is semiconvergent if $\lim_{m\to\infty} P^m$ exists.
P is weakly expanding if Pu ≥ u for some u > 0.
P is expanding if Pu > u for some u > 0.
An n×n matrix polynomial of degree d in the (integer) variable m is a polynomial in m with coefficients that are n×n matrices (expressible as $S(m) = \sum_{t=0}^{d} m^t B_t$ with $B_0, \ldots, B_d$ as n×n matrices and $B_d \ne 0$).
Facts:
Facts requiring proofs for which no specific reference is given can be found in [BP94, Chap. 2].
1. The set of basic classes of a nonnegative matrix is always nonempty.
2. (Spectral Properties of the Perron Value) Let P be a nonnegative n×n matrix with spectral radius ρ and index ν.
(a) [Fro12] ρ is an eigenvalue of P.
(b) [Fro12] There exist semipositive right and left eigenvectors of P corresponding to ρ, i.e., ρ is a distinguished eigenvalue of both P and $P^T$.
(c) [Rot75] ν is the largest number of vertices on a simple walk in $R^*(P)$.
(d) [Rot75] For each basic class B having height h, there exists a generalized eigenvector $v^B$ in $N_\rho^h(P)$, with $(v^B)_i > 0$ if i → B and $(v^B)_i = 0$ otherwise.
(e) [Rot75] The dimension of $N_\rho^\nu(P)$ is the number of basic classes of P. Further, if $B_1, \ldots, B_p$ are the basic classes of P and $v^{B_1}, \ldots, v^{B_p}$ are generalized eigenvectors of P at ρ that satisfy the conclusions of Fact 2(d) with respect to $B_1, \ldots, B_p$, respectively, then $v^{B_1}, \ldots, v^{B_p}$ form a basis of $N_\rho^\nu(P)$.
(f) [RiSc78, Sch86] If $B_1, \ldots, B_p$ is an enumeration of the basic classes of P with nondecreasing heights (in particular, s < t assures that we do not have $B_t \to B_s$), then there exist generalized eigenvectors $v^{B_1}, \ldots, v^{B_p}$ of P at ρ that satisfy the assumptions and conclusions of Fact 2(e) and a nonnegative p×p upper triangular matrix M with all diagonal elements equal to ρ, such that
\[
P[v^{B_1}, \ldots, v^{B_p}] = [v^{B_1}, \ldots, v^{B_p}]M
\]
(in particular, $v^{B_1}, \ldots, v^{B_p}$ is a basis of $N_\rho^\nu(P)$). Relationships between the matrix M and the Jordan Canonical Form of P are beyond the scope of the current review; see [Sch56], [Sch86], [HS89], [HS91a], [HS91b], [HRS89], and [NS94].
(g) [Vic85], [Sch86], [Tam04] If $B_1, \ldots, B_r$ are the basic classes of P having height 1 and $v^{B_1}, \ldots, v^{B_r}$ are generalized eigenvectors of P at ρ that satisfy the conclusions of Fact 2(d) with respect to $B_1, \ldots, B_r$, respectively, then $v^{B_1}, \ldots, v^{B_r}$ are linearly independent, nonnegative eigenvectors of P at ρ that span the cone $(\mathbb{R}_0^+)^n \cap N_\rho^1(P)$; that is, each vector in the cone $(\mathbb{R}_0^+)^n \cap N_\rho^1(P)$ is a linear combination with nonnegative coefficients of $v^{B_1}, \ldots, v^{B_r}$ (in fact, the sets $\{\alpha v^{B_s} : \alpha \ge 0\}$ for s = 1, . . ., r are the extreme rays of the cone $(\mathbb{R}_0^+)^n \cap N_\rho^1(P)$).
3. (Spectral Properties of Eigenvalues λ ≠ ρ(P) with |λ| = ρ(P)) Let P be a nonnegative n×n matrix with spectral radius ρ, index ν, co-index $\bar{\nu}$, period q, and coefficient of ergodicity τ.
(a) [Rot81a] The following are equivalent:
i. {λ ∈ σ(P) \ {ρ} : |λ| = ρ} = ∅.
ii. $\bar{\nu} = 0$.
iii. P is aperiodic (q = 1).
(b) [Rot81a] If λ ∈ σ(P) \ {ρ} and |λ| = ρ, then $(\lambda/\rho)^h = 1$ for some h ∈ {2, . . ., n}; further, q = min{h = 2, . . ., n : $(\lambda/\rho)^h = 1$ for each λ ∈ σ(P) \ {ρ} with |λ| = ρ} ≤ n (here the minimum over the empty set is taken to be 1).
(c) [Rot80] If λ ∈ σ(P) \ {ρ} and |λ| = ρ, then $\nu_P(\lambda)$ is bounded by the largest number of vertices on a simple walk in $R^*(P)$ with each vertex corresponding to a (basic) access equivalence class C that has λ ∈ σ(P[C]); in particular, $\bar{\nu} \le \nu$.
4. (Distinguished Eigenvalues) Let P be a nonnegative n×n matrix.
(a) [Vic85] λ is a distinguished eigenvalue of P if and only if there is a final set C with ρ(P[C]) = λ.
It is noted that the sets of distinguished eigenvalues of P and $P^T$ need not coincide (and the above characterization of distinguished eigenvalues is not invariant under the application of the transpose operator). (See Example 1 below.)
(b) [HS88b] If λ is a distinguished eigenvalue, $\nu_P(\lambda)$ is the largest number of vertices on a simple walk in $R^*(P[\lambda])$.
(c) [HS88b] If μ > 0, then μ ≤ min{λ : λ is a distinguished eigenvalue of P} if and only if there exists a vector u > 0 with Pu ≥ μu.
(For additional characterizations of the minimal distinguished eigenvalue, see the concluding remarks of Facts 12(h) and 12(i).)
Additional properties of distinguished eigenvalues λ of P that depend on P[λ] can be found in [HS88b] and [Tam04].
5. (Convergence Properties of Powers) Let P be a nonnegative n×n matrix with positive spectral radius ρ, index ν, co-index $\bar{\nu}$, period q, and coefficient of ergodicity τ (for the case where ρ = 0, see Fact 12(j) below).
(a) [Rot81a] There exists an n×n matrix polynomial S(m) of degree ν−1 in the (integer) variable m such that $\lim_{m\to\infty}[(P/\rho)^m - S(m)] = 0$ (C, p) for every $p \ge \bar{\nu}$; further, if P is aperiodic, this limit holds as a regular limit and the convergence is geometric with rate τ/ρ < 1.
(b) [Rot81a] There exist matrix polynomials $S_0(m), \ldots, S_{q-1}(m)$ of degree ν−1 in the (integer) variable m, such that for each k = 0, . . ., q−1, $\lim_{m\to\infty}[(P/\rho)^{mq+k} - S_k(m)] = 0$ and the convergence of these sequences to their limit is geometric with rate $(\tau/\rho)^q < 1$.
(c) [Rot81a] There exists a matrix polynomial T(m) of degree ν in the (integer) variable m with $\lim_{m\to\infty}\left[\sum_{s=0}^{m-1}(P/\rho)^s - T(m)\right] = 0$ (C, p) for every $p \ge \bar{\nu}$; further, if P is aperiodic, this limit holds as a regular limit and the convergence is geometric with rate τ/ρ < 1.
(d) [FrSc80] The limit of $\frac{P^m}{\rho^m m^{\nu-1}}\left[I + \frac{P}{\rho} + \cdots + \left(\frac{P}{\rho}\right)^{q-1}\right]$ as m → ∞ exists and is semipositive.
(e) [Rot81b] Let $x = [x_i]$ be a nonnegative vector in $\mathbb{R}^n$ and let i ∈ ⟨n⟩ ≡ {1, . . ., n}. With K(i, x) ≡ {j ∈ ⟨n⟩ : j → i} ∩ {j ∈ ⟨n⟩ : u → j for some u ∈ ⟨n⟩ with $x_u > 0$},
\[
r(i|x, P) \equiv \inf\{\alpha > 0 : \lim_{m\to\infty} \alpha^{-m}(P^m x)_i = 0\} = \rho(P[K(i, x)])
\]
and if r ≡ r(i|x, P) > 0,
\[
k(i|x, P) \equiv \inf\{k = 0, 1, \ldots : \lim_{m\to\infty} m^{-k} r^{-m}(P^m x)_i = 0\} = \nu_{P[K(i,x)]}(r).
\]
Explicit expressions for the polynomials mentioned in Facts 5(a) to 5(d) in terms of characteristics of the underlying matrix P are available in Fact 12(a)ii for the case where ν = 1 and in [Rot81a] for the general case. In fact, [Rot81a] provides (explicit) polynomial approximations of additional high-order partial sums of normalized powers of nonnegative matrices.
6. (Bounds on the Perron Value) Let P be a nonnegative n×n matrix with spectral radius ρ and let μ be a nonnegative scalar. Below, x ≩ 0 denotes a semipositive vector (x ≥ 0 and x ≠ 0), and ≨ is the reverse relation.
(a) For ⊲ ∈ {<, ≤, =, ≥, >},
[Pu ⊲ μu for some vector u > 0] ⇒ [ρ ⊲ μ];
further, the inverse implication holds for ⊲ as <, implying that
\[
\rho = \max_{x \gneq 0}\ \min_{\{i : x_i > 0\}} \frac{(Px)_i}{x_i}.
\]
(b) For ⊲ ∈ {≨, ≤, =, ≥, ≩},
[ρ ⊲ μ] ⇒ [Pu ⊲ μu for some vector u ≩ 0];
further, the inverse implication holds for ⊲ as ≥.
(c) ρ < μ if and only if Pu < μu for some vector u ≥ 0.
Since $\rho(P^T) = \rho(P)$, the above properties (and characterizations) of ρ can be expressed by applying the above conditions to $P^T$. (See Example 3 below.)
Some of the above results can be expressed in terms of the Collatz–Wielandt sets. (See Fact 7 of Section 9.2 and Chapter 26.)
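Fact 6(a) with ⊲ taken as ≤ and as ≥ says that any positive vector u brackets the Perron value: $\min_i (Pu)_i/u_i \le \rho \le \max_i (Pu)_i/u_i$. The sketch below illustrates this on a small reducible matrix chosen here purely for illustration (it does not appear in the text); the helper name `perron_bounds` is likewise hypothetical. The matrix has classes {1} and {2, 3}, and its spectral radius is 2.

```python
from fractions import Fraction


def perron_bounds(P, u):
    """Return (min_i (Pu)_i/u_i, max_i (Pu)_i/u_i) for a positive vector u.

    By Fact 6(a) these two numbers are lower and upper bounds on rho(P).
    """
    if any(x <= 0 for x in u):
        raise ValueError("u must be a positive vector")
    Pu = [sum(Fraction(p) * x for p, x in zip(row, u)) for row in P]
    ratios = [y / x for y, x in zip(Pu, u)]
    return min(ratios), max(ratios)


# Illustrative reducible matrix (an assumption of this sketch): rho(P) = 2.
P = [[2, 1, 0],
     [0, 1, 1],
     [0, 1, 1]]

lo1, hi1 = perron_bounds(P, [1, 1, 1])
lo2, hi2 = perron_bounds(P, [2, 1, 1])
print(lo1, hi1)
print(lo2, hi2)
```

Different choices of u give different brackets; the second vector tightens the upper bound while the lower bound already equals ρ.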
7. (Bounds on the Spectral Radius) Let P be a nonnegative n×n matrix and let A be a complex n×n matrix such that |A| ≤ P. Then ρ(A) ≤ ρ(P).
8. (Functional Inequalities) Consider the function ρ(·) mapping nonnegative n×n matrices to their spectral radius.
(a) ρ(·) is nondecreasing in each element (of the domain matrices); that is, if A and B are nonnegative n×n matrices with A ≥ B ≥ 0, then ρ(A) ≥ ρ(B).
(b) [Coh78] ρ(·) is (jointly) convex in the diagonal elements; that is, if A and D are n×n matrices, with D diagonal, A and A+D nonnegative, and if 0 < α < 1, then ρ[αA + (1−α)(A+D)] ≤ αρ(A) + (1−α)ρ(A+D).
(c) [EJD88] If $A = [a_{ij}]$ and $B = [b_{ij}]$ are nonnegative n×n matrices, 0 < α < 1 and $C = [c_{ij}]$ with $c_{ij} = a_{ij}^{\alpha} b_{ij}^{1-\alpha}$ for each i, j = 1, . . ., n, then $\rho(C) \le \rho(A)^{\alpha}\rho(B)^{1-\alpha}$.
Further functional inequalities about ρ(·) can be found in [EJD88] and [EHP90].
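The three inequalities of Fact 8 can be checked numerically on 2×2 matrices, for which the spectral radius of a nonnegative matrix [[a, b], [c, d]] has the closed form $(a + d + \sqrt{(a-d)^2 + 4bc})/2$. The matrices, the diagonal perturbation, and the value of α below are arbitrary choices made for this sketch, and `rho2` is a hypothetical helper name.

```python
import math


def rho2(M):
    """Spectral radius of a nonnegative 2x2 matrix via the quadratic formula."""
    (a, b), (c, d) = M
    return ((a + d) + math.sqrt((a - d) ** 2 + 4 * b * c)) / 2


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[2.0, 1.0], [1.0, 3.0]]
alpha = 0.3

# Fact 8(a): raising one entry of A cannot lower the spectral radius.
A_bigger = [[1.0, 3.0], [3.0, 4.0]]
monotone_gap = rho2(A_bigger) - rho2(A)

# Fact 8(b): convexity along a diagonal perturbation D = diag(1, 2).
A_plus_D = [[2.0, 2.0], [3.0, 6.0]]
mix = [[alpha * A[i][j] + (1 - alpha) * A_plus_D[i][j] for j in range(2)]
       for i in range(2)]
convex_gap = alpha * rho2(A) + (1 - alpha) * rho2(A_plus_D) - rho2(mix)

# Fact 8(c): entrywise weighted geometric mean C of A and B.
C = [[A[i][j] ** alpha * B[i][j] ** (1 - alpha) for j in range(2)]
     for i in range(2)]
geometric_gap = rho2(A) ** alpha * rho2(B) ** (1 - alpha) - rho2(C)

print(monotone_gap, convex_gap, geometric_gap)
```

Each printed gap is the right-hand side minus the left-hand side of the corresponding inequality, so all three should be nonnegative up to rounding.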
9. (Resolvent Expansions) Let P be a nonnegative square matrix with spectral radius ρ and let μ > ρ. Then μI − P is invertible and
\[
(\mu I - P)^{-1} = \sum_{t=0}^{\infty} \frac{P^t}{\mu^{t+1}} \ \ge\ \frac{I}{\mu} + \frac{P}{\mu^2} \ \ge\ \frac{I}{\mu} \ \ge\ 0
\]
(the invertibility of μI − P and the power series expansion of its inverse do not require nonnegativity of P).
For explicit expansions of the resolvent about the spectral radius, that is, for explicit power series representations of $[(z+\rho)I - P]^{-1}$ with |z| positive and sufficiently small, see [Rot81c] and [HNR90] (the latter uses such expansions to prove Perron–Frobenius-type spectral results for nonnegative matrices).
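The series expansion in Fact 9 can be watched converging on a small example. The sketch below uses an illustrative 2×2 matrix with ρ = 1/2 and takes μ = 1 (both are assumptions of the sketch, not values from the text); it compares the exact inverse with a 60-term partial sum in exact rational arithmetic and checks the stated lower bound entrywise. The helper names `matmul` and `inverse2` are hypothetical.

```python
from fractions import Fraction as F


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Illustrative data: P is upper triangular with eigenvalues 1/2 and 1/4.
P = [[F(1, 2), F(1, 4)], [F(0), F(1, 4)]]
mu = F(1)
I2 = [[F(1), F(0)], [F(0), F(1)]]

resolvent = inverse2([[mu * I2[i][j] - P[i][j] for j in range(2)]
                      for i in range(2)])

# Partial sum of P^t / mu^(t+1) for t = 0, ..., 59.
partial = [[F(0), F(0)], [F(0), F(0)]]
power = I2
for t in range(60):
    for i in range(2):
        for j in range(2):
            partial[i][j] += power[i][j] / mu ** (t + 1)
    power = matmul(power, P)

# The two-term lower bound I/mu + P/mu^2 from the displayed chain.
lower = [[I2[i][j] / mu + P[i][j] / mu ** 2 for j in range(2)]
         for i in range(2)]

max_gap = max(abs(resolvent[i][j] - partial[i][j])
              for i in range(2) for j in range(2))
print(resolvent, float(max_gap))
```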
10. (Puiseux Expansions of the Perron Value) [ERS95] The function ρ(·) mapping irreducible nonnegative n×n matrices $X = [x_{ij}]$ to their spectral radius has a converging Puiseux (fractional power series) expansion at each point; i.e., if P is a nonnegative n×n matrix and if F is an n×n matrix with P + εF ≥ 0 for all sufficiently small positive ε, then ρ(P + εF) has a representation $\sum_{k=0}^{\infty} \rho_k \varepsilon^{k/q}$ with $\rho_0 = \rho(P)$ and q as a positive integer.
11. (Bounds on the Ergodicity Coefficient) [RT85, extension of Theorem 3.1] Let P be a nonnegative n×n matrix with spectral radius ρ, corresponding semipositive right eigenvector v, and ergodicity coefficient τ, let D be a diagonal n×n matrix with positive diagonal elements, and let ‖·‖ be a norm on $\mathbb{R}^n$. Then
\[
\tau \le \max_{x \in \mathbb{R}^n,\ \|x\| \le 1,\ x^T D^{-1} v = 0} \left\| x^T D^{-1} P D \right\|.
\]
12. (Special Cases) Let P be a nonnegative n×n matrix with spectral radius ρ, index ν, and period q.
(a) (Index 1) Suppose ν = 1.
i. ρI − P has a group inverse.
ii. [Rot81a] With $P \equiv I - (\rho I - P)(\rho I - P)^{\#}$, all of the convergence properties stated in Fact 6 of Section 9.2 apply.
iii. If ρ > 0, then $P^m/\rho^m$ is bounded in m (element-wise).
iv. ρ = 0 if and only if P = 0.
(b) (Positive eigenvector) The following are equivalent:
i. P has a positive right eigenvector corresponding to ρ.
ii. The final classes of P are precisely its basic classes.
iii. There is no vector w satisfying $w^T P \lneq \rho w^T$.
Further, when the above conditions hold:
i. ν = 1 and the conclusions of Fact 12(a) hold.
ii. If P satisfies the above conditions and P ≠ 0, then ρ > 0 and there exists a diagonal matrix D having positive diagonal elements such that $S \equiv \frac{1}{\rho} D^{-1} P D$ is stochastic (that is, S ≥ 0 and S1 = 1; see Chapter 4).
(c) [Sch53] There exists a vector x > 0 with Px ≤ ρx if and only if every basic class of P is final.
(d) (Positive generalized eigenvector) [Rot75], [Sch86], [HS88a] The following are equivalent:
i. P has a positive right generalized eigenvector at ρ.
ii. Each final class of P is basic.
iii. Pu ≥ ρu for some u > 0.
iv. Every vector w ≥ 0 with $w^T P \le \rho w^T$ must satisfy $w^T P = \rho w^T$.
v. ρ is the only distinguished eigenvalue of P.
(e) (Convergent/Transient) The following are equivalent:
i. P is convergent.
ii. ρ < 1.
iii. I − P is invertible and $(I - P)^{-1} \ge 0$.
iv. There exists a positive vector $u \in \mathbb{R}^n$ with Pu < u.
Further, when the above conditions hold, $(I - P)^{-1} = \sum_{t=0}^{\infty} P^t \ge I$.
(f) (Semiconvergent) The following are equivalent:
i. P is semiconvergent.
ii. Either ρ < 1 or ρ = ν = 1 and 1 is the only eigenvalue λ of P with |λ| = 1.
(g) (Bounded) $P^m$ is bounded in m (element-wise) if and only if either ρ < 1 or ρ = 1 and ν = 1.
(h) (Weakly Expanding) [HS88a], [TW89], [DR05] The following are equivalent:
i. P is weakly expanding.
ii. There is no vector $w \in \mathbb{R}^n$ with w ≥ 0 and $w^T P \lneq w^T$.
iii. Every distinguished eigenvalue λ of P satisfies λ ≥ 1.
iv. Every final class C of P has ρ(P[C]) ≥ 1.
v. If C is a final set of P, then ρ(P[C]) ≥ 1.
Given μ > 0, the application of the above equivalence to P/μ yields characterizations of instances where each distinguished eigenvalue of P is bigger than or equal to μ.
(i) (Expanding) [HS88a], [TW89], [DR05] The following are equivalent:
i. P is expanding.
ii. There exists a vector $u \in \mathbb{R}^n$ with u ≥ 0 and Pu > u.
iii. There is no vector $w \in \mathbb{R}^n$ with $w \gneq 0$ and $w^T P \le w^T$.
iv. Every distinguished eigenvalue λ of P satisfies λ > 1.
v. Every final class C of P has ρ(P[C]) > 1.
vi. If C is a final set of P, then ρ(P[C]) > 1.
Given μ > 0, the application of the above equivalence to P/μ yields characterizations of instances where each distinguished eigenvalue of P is bigger than μ.
(j) (Nilpotent) The following are equivalent conditions:
i. P is nilpotent; that is, $P^m = 0$ for some positive integer m.
ii. P is permutation similar to an upper triangular matrix all of whose diagonal elements are 0.
iii. ρ = 0.
iv. $P^n = 0$.
v. $P^\nu = 0$.
(k) (Symmetric) Suppose P is symmetric.
i. $\rho = \max_{u \gneq 0} \dfrac{u^T P u}{u^T u}$.
ii. $\rho = \dfrac{u^T P u}{u^T u}$ for $u \gneq 0$ if and only if u is an eigenvector of P corresponding to ρ.
iii. [CHR97, Theorem 1] For $u, w \gneq 0$ with $w_i = \sqrt{u_i (Pu)_i}$ for i = 1, . . ., n,
\[
\frac{u^T P u}{u^T u} \le \frac{w^T P w}{w^T w}
\]
and equality holds if and only if u[S] is an eigenvector of P[S] corresponding to ρ, where S ≡ {i : $u_i > 0$}.
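Fact 12(k)iii suggests a simple experiment: starting from a semipositive u, replacing u by the vector w with $w_i = \sqrt{u_i (Pu)_i}$ should never decrease the Rayleigh quotient, which in turn never exceeds ρ by Fact 12(k)i. The sketch below repeats that step a few times on a symmetric nonnegative 2×2 matrix chosen here for illustration; `rayleigh` and `chr_step` are hypothetical helper names, and the iteration is only an illustration of the inequality, not a method taken from [CHR97].

```python
import math


def matvec(P, u):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(p * x for p, x in zip(row, u)) for row in P]


def rayleigh(P, u):
    """Rayleigh quotient u^T P u / u^T u."""
    Pu = matvec(P, u)
    return sum(x * y for x, y in zip(u, Pu)) / sum(x * x for x in u)


def chr_step(P, u):
    """Map u to w with w_i = sqrt(u_i * (Pu)_i)."""
    return [math.sqrt(x * y) for x, y in zip(u, matvec(P, u))]


# Illustrative symmetric nonnegative matrix; rho = (5 + sqrt(5)) / 2.
P = [[2.0, 1.0], [1.0, 3.0]]
rho = (5 + math.sqrt(5)) / 2

u = [1.0, 1.0]
quotients = [rayleigh(P, u)]
for _ in range(5):
    u = chr_step(P, u)
    quotients.append(rayleigh(P, u))

print(quotients, rho)
```

The first quotient is 7/2 and the second is $(18 + 4\sqrt{3})/7$; the later ones continue upward while staying below ρ.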
Examples:
1. We illustrate parts of Fact 2 using the matrix
\[
P = \begin{bmatrix}
2 & 2 & 2 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
\]
The eigenvalues of P are 2, 1, and 0; so, ρ(P) = 2 ∈ σ(P), as is implied by Fact 2(a). The vectors $v = [1, 0, 0, 0, 0, 0]^T$ and w = [0, 0, 0, 1, 1, 1] are semipositive right and left eigenvectors corresponding to the eigenvalue 2; their existence is implied by Fact 2(b).
The basic classes are $B_1 = \{1\}$, $B_2 = \{2\}$, and $B_3 = \{4, 5\}$. The digraph corresponding to P, its reduced digraph, and the basic reduced digraph of P are illustrated in Figure 9.1. From Figure 9.1(c), the largest number of vertices in a simple walk in the basic reduced digraph of P is 2 (going from $B_1$ to either $B_2$ or $B_3$); hence, Fact 2(c) implies that $\nu_P(2) = 2$.
FIGURE 9.1 (a) The digraph of P, (b) the reduced digraph of P, and (c) the basic reduced digraph $R^*(P)$.
The height of basic class $B_1$ is 1 and the height of basic classes $B_2$ and $B_3$ is 2. Semipositive generalized eigenvectors of P at (the eigenvalue) 2 that satisfy the assumptions of Fact 2(f) are $u^{B_1} = [1, 0, 0, 0, 0, 0]^T$, $u^{B_2} = [1, 1, 0, 0, 0, 0]^T$, and $u^{B_3} = [1, 0, 2, 1, 1, 0]^T$. The implied equality
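$P[u^{B_1}, \ldots, u^{B_p}] = [u^{B_1}, \ldots, u^{B_p}]M$ of Fact 2(f) holds as
\[
\begin{bmatrix}
2 & 2 & 2 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 1 & 1 \\
0 & 1 & 0 \\
0 & 0 & 2 \\
0 & 0 & 1 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{bmatrix}
=
\begin{bmatrix}
2 & 4 & 6 \\
0 & 2 & 0 \\
0 & 0 & 4 \\
0 & 0 & 2 \\
0 & 0 & 2 \\
0 & 0 & 0
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 \\
0 & 1 & 0 \\
0 & 0 & 2 \\
0 & 0 & 1 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
2 & 2 & 4 \\
0 & 2 & 0 \\
0 & 0 & 2
\end{bmatrix}.
\]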
In particular, Fact 2(e) implies that $u^{B_1}, u^{B_2}, u^{B_3}$ form a basis of $N_{\rho(P)}^{\nu}(P) = N_2^2(P)$. We note that while there is only a single basic class of height 1, $\dim[N_\rho^1(P)] = 2$ and $u^{B_1}$, $2u^{B_2} - u^{B_3} = [1, 2, -2, -1, -1, 0]^T$ form a basis of $N_\rho^1(P)$. Still, Fact 2(g) assures that $(\mathbb{R}_0^+)^n \cap N_\rho^1(P)$ is the cone $\{\alpha u^{B_1} : \alpha \ge 0\}$ (consisting of its single ray).
Fact 4(a) and Figure 9.1 imply that the distinguished eigenvalues of P are 1 and 2, while 2 is the only distinguished eigenvalue of $P^T$.
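The arithmetic in Example 1 is easy to confirm mechanically. The following sketch multiplies out both sides of the identity of Fact 2(f) for the matrices given in the example and checks the left eigenvector w and the combination $2u^{B_2} - u^{B_3}$; the helper `matmul` and the variable names are hypothetical, and only integer arithmetic is used.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


P = [[2, 2, 2, 0, 0, 0],
     [0, 2, 0, 0, 0, 0],
     [0, 0, 1, 2, 0, 0],
     [0, 0, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1],
     [0, 0, 0, 0, 0, 1]]

# Columns of U are the generalized eigenvectors u^{B1}, u^{B2}, u^{B3}.
U = [[1, 1, 1],
     [0, 1, 0],
     [0, 0, 2],
     [0, 0, 1],
     [0, 0, 1],
     [0, 0, 0]]

M = [[2, 2, 4],
     [0, 2, 0],
     [0, 0, 2]]

PU = matmul(P, U)
UM = matmul(U, M)

# Left eigenvector from the example: w^T P should equal 2 w^T.
w = [0, 0, 0, 1, 1, 1]
wP = matmul([w], P)[0]

# The combination 2 u^{B2} - u^{B3} should be an eigenvector for 2.
z = [2 * row[1] - row[2] for row in U]
Pz = [sum(p * x for p, x in zip(row, z)) for row in P]

print(PU == UM, wP, z, Pz)
```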
2. Let $H = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$; properties of H were demonstrated in Example 2 of Section 9.2. We will demonstrate Facts 2(c), 5(b), and 5(a) on the matrix
\[
P \equiv \begin{bmatrix} H & I \\ 0 & H \end{bmatrix}.
\]
The spectral radius of P is 1 and its basic classes are $B_1 = \{1, 2\}$ and $B_2 = \{3, 4\}$, with $B_1$ having access to $B_2$. Thus, the index of 1 with respect to P, as the largest number of vertices on a walk of the marked reduced graph of P, is 2 (Fact 2(c)). Also, as the period of each of the two basic classes of P is 2, the period of P is 2. To verify the convergence properties of P, note that
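\[
P^m = \begin{cases}
\begin{bmatrix} I & mH \\ 0 & I \end{bmatrix} & \text{if } m \text{ is even,} \\[2ex]
\begin{bmatrix} H & mI \\ 0 & H \end{bmatrix} & \text{if } m \text{ is odd,}
\end{cases}
\]
immediately providing matrix polynomials $S_0(m)$ and $S_1(m)$ of degree 1 such that $\lim_{m\to\infty}[P^{2m} - S_0(m)] = 0$ and $\lim_{m\to\infty}[P^{2m+1} - S_1(m)] = 0$. In this example, τ(P) is 0 (as the maximum over the empty set) and the convergence of the above sequences is geometric with rate 0.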
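The above representation of $P^m$ shows that
\[
P^m = \begin{bmatrix} H^m & mH^{m+1} \\ 0 & H^m \end{bmatrix}
\]
and Example 2 of Section 9.2 shows that
\[
\lim_{m\to\infty} H^m = \frac{I+H}{2} = \begin{bmatrix} .5 & .5 \\ .5 & .5 \end{bmatrix} \quad (C,1).
\]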
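We next consider the upper-right blocks of $P^m$. We observe that
\[
\frac{1}{m}\sum_{t=0}^{m-1} P^t[B_1, B_2] =
\begin{cases}
\dfrac{mI}{4} + \dfrac{(m-2)H}{4} & \text{if } m \text{ is even} \\[2ex]
\dfrac{(m-1)^2 I}{4m} + \dfrac{(m^2-1)H}{4m} & \text{if } m \text{ is odd}
\end{cases}
=
\begin{cases}
\dfrac{m(I+H)}{4} - \dfrac{H}{2} & \text{if } m \text{ is even} \\[2ex]
\dfrac{m(I+H)}{4} - \dfrac{I}{2} + \dfrac{I-H}{4m} & \text{if } m \text{ is odd,}
\end{cases}
\]
implying that
\[
\lim_{m\to\infty}\left[ \frac{1}{m}\sum_{t=0}^{m-1} P^t[B_1, B_2] - m\,\frac{I+H}{4} + \frac{I+H}{4} \right] = 0 \quad (C,1).
\]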
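As $\frac{m-1}{2} = \frac{1}{m}\sum_{t=0}^{m-1} t$ for each m = 1, 2, . . ., the above shows that
\[
\lim_{m\to\infty} \frac{1}{m}\sum_{t=0}^{m-1}\left[ P^t[B_1, B_2] - t\,\frac{I+H}{2} \right] = 0 \quad (C,1),
\]
and, therefore (recalling that (C,1)-convergence implies (C,2)-convergence),
\[
\lim_{m\to\infty}\left\{ P^m -
\begin{bmatrix}
.5 & .5 & .5m & .5m \\
.5 & .5 & .5m & .5m \\
0 & 0 & .5 & .5 \\
0 & 0 & .5 & .5
\end{bmatrix}
\right\} = 0 \quad (C,2).
\]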
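The computations in Example 2 can be reproduced with exact arithmetic. The sketch below builds the powers of the 4×4 matrix P, compares them with the even/odd closed form, evaluates the averages of the upper-right block (its (1,3) entry carries the coefficient of I and its (1,4) entry the coefficient of H), and finally averages the first-order averages of the sequence (1,3)-entry of $P^t$ minus t/2 to illustrate the second-order averaging behind the (C,2) statement. The cut-off N = 400, the tolerance used in the check, and all helper names are assumptions of this sketch.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# P = [[H, I], [0, H]] with H = [[0, 1], [1, 0]].
P = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]


def closed_form(m):
    """[[I, mH], [0, I]] for even m and [[H, mI], [0, H]] for odd m."""
    if m % 2 == 0:
        return [[1, 0, 0, m], [0, 1, m, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    return [[0, 1, m, 0], [1, 0, 0, m], [0, 0, 0, 1], [0, 0, 1, 0]]


N = 400
powers = []
power = [[int(i == j) for j in range(4)] for i in range(4)]
for m in range(N):
    powers.append(power)          # powers[m] holds P^m
    power = matmul(power, P)

closed_form_ok = all(powers[m] == closed_form(m) for m in range(N))


def block_average(m):
    """(1/m) * sum_{t<m} of the I- and H-coefficients of P^t[B1, B2]."""
    return (Fraction(sum(powers[t][0][2] for t in range(m)), m),
            Fraction(sum(powers[t][0][3] for t in range(m)), m))


# b_t = (1,3) entry of P^t minus t/2; average its running means.
b = [Fraction(powers[t][0][2]) - Fraction(t, 2) for t in range(N)]
first_means = []
running = Fraction(0)
for m in range(1, N):
    running += b[m - 1]
    first_means.append(running / m)   # (1/m) * sum_{t<m} b_t
second_mean = sum(first_means) / len(first_means)

print(closed_form_ok, block_average(10), block_average(9), float(second_mean))
```

For m = 10 the block averages are m/4 = 5/2 and (m − 2)/4 = 2, and for m = 9 they are $(m-1)^2/(4m) = 16/9$ and $(m^2-1)/(4m) = 20/9$, matching the displayed case formulas; the first-order means oscillate near ±1/4 while their average tends to 0.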
3. Fact 6 implies many equivalences, in particular because the spectral radius of a matrix equals that of its transpose. For example, for a nonnegative n×n matrix P with spectral radius ρ and nonnegative scalar μ, the following are equivalent:
(a) ρ < μ.
(b) Pu < μu for some vector u > 0.
(c) $w^T P < \mu w^T$ for some vector w > 0.
(d) Pu < μu for some vector u ≥ 0.
(e) $w^T P < \mu w^T$ for some vector w ≥ 0.
(f) There is no vector $u \gneq 0$ satisfying Pu ≥ μu.
(g) There is no vector $w \gneq 0$ satisfying $w^T P \ge \mu w^T$.