Chapter II. Markov Chains
A. NONHOMOGENEOUS MARKOV CHAINS AND SYSTEMS

1. Functionals over Stochastic Matrices
The matrices to be considered in this subsection are countably infinite unless
otherwise specified.
Definition 1.1: Given a stochastic matrix $P = [p_{ij}]$ and an arbitrary vector $\eta = (\eta_i)$ we define

\[ d(P) = \sup_j \sup_{i_1,i_2} |p_{i_1 j} - p_{i_2 j}|, \qquad d(\eta) = \sup_{i_1,i_2} |\eta_{i_1} - \eta_{i_2}|, \]

\[ \delta(P) = \sup_{i_1,i_2} \sup_{\{n'\}} \sum_{j \in \{n'\}} (p_{i_1 j} - p_{i_2 j}), \]

where $\{n'\}$ denotes a subsequence of the sequence of natural numbers (to be denoted by $\{n\}$). If $P$ is a finite matrix, then "sup" is to be replaced by "max" and "inf" by "min."

Notation: If $a$ is a real number then $a^+ = \max(a, 0)$ and $a^- = \min(a, 0)$.
Proposition 1.1: $\delta(P) = \sup_{i_1,i_2} \sum_j (p_{i_1 j} - p_{i_2 j})^+$.

The proof is left as an exercise.
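For a finite matrix the suprema become maxima, so Definition 1.1 and Proposition 1.1 can be checked by brute force. The Python sketch below is only an illustration: the function names `d`, `delta_by_subsets`, and `delta_by_positive_parts`, as well as the sample matrix, are assumptions introduced here and do not come from the text.

```python
from itertools import combinations

def d(P):
    """d(P): largest gap between two entries lying in the same column."""
    return max(max(col) - min(col) for col in zip(*P))

def delta_by_subsets(P):
    """delta(P) straight from Definition 1.1: maximize over ordered row
    pairs and over every subset of column indices (small matrices only)."""
    n = len(P[0])
    subsets = [s for r in range(n + 1) for s in combinations(range(n), r)]
    return max(sum(r1[j] - r2[j] for j in s)
               for r1 in P for r2 in P for s in subsets)

def delta_by_positive_parts(P):
    """delta(P) in the form of Proposition 1.1: sum of positive parts."""
    return max(sum(max(a - b, 0.0) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

# Hypothetical 2 x 4 stochastic matrix (each row sums to 1).
P = [[0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5]]

print(d(P), delta_by_subsets(P), delta_by_positive_parts(P))
```

The two rows of this sample matrix have disjoint supports, so both computations of $\delta(P)$ give 1 while $d(P)$ is only 0.5; the subset search and the positive-part formula agree, as Proposition 1.1 asserts.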
Proposition 1.2: $0 \le d(P) \le \delta(P) \le 1$.
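Proof: It is a trivial consequence of the definition that $0 \le d(P)$. For any fixed $j$, $i_1$, and $i_2$ it is clear that

\[ (p_{i_1 j} - p_{i_2 j})^+ \le \sup_{\{n'\}} \sum_{k \in \{n'\}} (p_{i_1 k} - p_{i_2 k}) = \sum_k (p_{i_1 k} - p_{i_2 k})^+ . \]

But $\sup_{i_1,i_2} |p_{i_1 j} - p_{i_2 j}| = \sup_{i_1,i_2} (p_{i_1 j} - p_{i_2 j})^+$ since the indices $i_1$ and $i_2$ are interchangeable, so that

\[ \sup_{i_1,i_2} |p_{i_1 j} - p_{i_2 j}| = \sup_{i_1,i_2} (p_{i_1 j} - p_{i_2 j})^+ \le \sup_{i_1,i_2} \sup_{\{n'\}} \sum_{k \in \{n'\}} (p_{i_1 k} - p_{i_2 k}) \]

for any fixed $j$, and therefore $d(P) = \sup_j \sup_{i_1,i_2} |p_{i_1 j} - p_{i_2 j}| \le \delta(P)$. Finally, for fixed $i_1$ and $i_2$ we have that

\[ \sum_j (p_{i_1 j} - p_{i_2 j})^+ \le \sum_{j \in \{n'\}} p_{i_1 j} - \sum_{j \in \{n'\}} p_{i_2 j} \le 1, \]

where $\{n'\}$ is the set of indices $j$ with $p_{i_1 j} > p_{i_2 j}$, and using Proposition 1.1 we get that $\delta(P) \le 1$. ∎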
Proposition 1.3: If $P = [p_{ij}]$ and $Q = [q_{ij}]$ are stochastic matrices, then $\delta(PQ) \le \delta(P)\,\delta(Q)$.
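Proof: Fix $i_1$ and $i_2$. We show first that

\[ \sum_k (p_{i_1 k} - p_{i_2 k})^+ + \sum_k (p_{i_1 k} - p_{i_2 k})^- = \sum_k (p_{i_1 k} - p_{i_2 k}) = \sum_k p_{i_1 k} - \sum_k p_{i_2 k} = 1 - 1 = 0 \]

and, therefore,

\[ \sum_k (p_{i_1 k} - p_{i_2 k})^+ = -\sum_k (p_{i_1 k} - p_{i_2 k})^- . \tag{1} \]

Denoting by $\sum'$ summation over a subset of the set of natural numbers we have

\begin{align*}
\sum_j \Big( \sum_k (p_{i_1 k} - p_{i_2 k})\, q_{kj} \Big)^+
  &= {\sum_j}' \sum_k (p_{i_1 k} - p_{i_2 k})\, q_{kj}
   = \sum_k (p_{i_1 k} - p_{i_2 k}) {\sum_j}' q_{kj} \\
  &\le \sum_k (p_{i_1 k} - p_{i_2 k})^+ \sup_k {\sum_j}' q_{kj}
     + \sum_k (p_{i_1 k} - p_{i_2 k})^- \inf_k {\sum_j}' q_{kj} \\
  &= \sum_k (p_{i_1 k} - p_{i_2 k})^+ \Big( \sup_k {\sum_j}' q_{kj} - \inf_k {\sum_j}' q_{kj} \Big),
\end{align*}

where the indices involved in the summation $\sum'$ may depend on $i_1$ and $i_2$, and the last step uses formula (1). But

\[ \sup_k {\sum_j}' q_{kj} - \inf_k {\sum_j}' q_{kj} = \sup_{k_1,k_2} {\sum_j}' (q_{k_1 j} - q_{k_2 j}) \le \delta(Q), \]

and the bound $\delta(Q)$ is independent of $i_1$ and $i_2$. Thus,

\[ \sum_j \Big( \sum_k (p_{i_1 k} - p_{i_2 k})\, q_{kj} \Big)^+ \le \sum_k (p_{i_1 k} - p_{i_2 k})^+ \, \delta(Q), \]

so that

\[ \delta(PQ) = \sup_{i_1,i_2} \Big[ \sum_j \Big( \sum_k (p_{i_1 k} - p_{i_2 k})\, q_{kj} \Big)^+ \Big] \le \sup_{i_1,i_2} \Big[ \sum_k (p_{i_1 k} - p_{i_2 k})^+ \, \delta(Q) \Big] \le \delta(Q) \sup_{i_1,i_2} \sum_k (p_{i_1 k} - p_{i_2 k})^+ = \delta(P)\,\delta(Q). \] ∎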
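The submultiplicative bound of Proposition 1.3 can be spot-checked numerically on small finite matrices. The following Python sketch is an illustration only, not a proof; the helper names `delta` and `matmul` and the two sample matrices are assumptions made here.

```python
def delta(P):
    """delta(P) via Proposition 1.1: max over row pairs of the summed positive parts."""
    return max(sum(max(a - b, 0.0) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical 3 x 3 stochastic matrices.
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]
Q = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]

lhs = delta(matmul(P, Q))      # delta(PQ)
rhs = delta(P) * delta(Q)      # delta(P) * delta(Q)
print(lhs, rhs, lhs <= rhs + 1e-12)
```

The small tolerance in the comparison only guards against floating-point rounding; it is not part of the proposition.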
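Definition 1.2: If $\xi = (\xi_i)$ is an arbitrary vector and $P$ an arbitrary matrix, then $|\xi| = \sup_i |\xi_i|$, $|P| = \sup_{i,j} |p_{ij}|$; $\|\xi\| = \sum_i |\xi_i|$ provided that $\sum_i |\xi_i| < \infty$, and $\|\xi\| = \infty$ otherwise; $\|P\| = \sup_i \sum_j |p_{ij}|$ provided that $\sum_j |p_{ij}| < \infty$ for all $i$, and $\|P\| = \infty$ otherwise.

Proposition 1.4: Let $P = [p_{ij}]$ be a stochastic matrix and let $\xi = (\xi_i)$ be a nonzero vector of the same dimension as $P$ such that $\|\xi\| < \infty$ and $\sum_i \xi_i = 0$; then $\|\xi P\| \le \|\xi\|\,\delta(P)$.

Proof: Define the vectors $\zeta^1 = (\zeta^1_i)$ and $\zeta^2 = (\zeta^2_i)$ as

\[ \zeta^1_i = \frac{2\,\xi_i^+}{\|\xi\|} \quad\text{and}\quad \zeta^2_i = \frac{2\,|\xi_i^-|}{\|\xi\|}. \]

Then using an argument similar to the one used in proving formula (1) we have that both $\zeta^1$ and $\zeta^2$ are stochastic vectors and $\zeta^1 - \zeta^2 = 2(\xi/\|\xi\|)$.

Let $Q$ be a matrix such that its first row is $\zeta^1$, all the other rows being equal to $\zeta^2$. Then

\[ 2\,\delta(QP) = 2 \sum_j \Big( \sum_k (\zeta^1_k - \zeta^2_k)\, p_{kj} \Big)^+ = \sum_j \Big| \sum_k (\zeta^1_k - \zeta^2_k)\, p_{kj} \Big| , \]

again using the formula (1). By the definition of $\zeta^1$ and $\zeta^2$ we have that

\[ \sum_j \Big| \sum_k (\zeta^1_k - \zeta^2_k)\, p_{kj} \Big| = 2 \sum_j \Big| \sum_k \frac{\xi_k}{\|\xi\|}\, p_{kj} \Big| = \frac{2}{\|\xi\|} \sum_j \Big| \sum_k \xi_k p_{kj} \Big| = \frac{2\,\|\xi P\|}{\|\xi\|}. \]

Thus, $\|\xi P\|/\|\xi\| = \delta(QP) \le \delta(Q)\,\delta(P) \le \delta(P)$ by Propositions 1.2 and 1.3, so that $\|\xi P\| \le \|\xi\|\,\delta(P)$. ∎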
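The construction used in this proof, splitting a zero-sum vector into two stochastic vectors, can be traced on a small example. The Python sketch below is a hedged illustration: the names `norm1`, `vec_mat`, `zeta1`, `zeta2`, the sample matrix, and the sample vector are all assumptions chosen here.

```python
def delta(P):
    """delta(P) via Proposition 1.1."""
    return max(sum(max(a - b, 0.0) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

def norm1(x):
    """||x||: sum of absolute values of the entries."""
    return sum(abs(v) for v in x)

def vec_mat(x, P):
    """Row vector times matrix."""
    return [sum(xk * row[j] for xk, row in zip(x, P))
            for j in range(len(P[0]))]

# Hypothetical stochastic matrix and zero-sum vector.
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]
xi = [0.1, 0.2, -0.3]

n = norm1(xi)
zeta1 = [2 * max(v, 0.0) / n for v in xi]        # scaled positive part
zeta2 = [2 * abs(min(v, 0.0)) / n for v in xi]   # scaled negative part

lhs = norm1(vec_mat(xi, P))    # ||xi P||
rhs = n * delta(P)             # ||xi|| * delta(P)
print(sum(zeta1), sum(zeta2), lhs, rhs)
```

Both `zeta1` and `zeta2` sum to 1 up to rounding, their difference equals `2 * xi / n` entrywise, and `lhs` stays below `rhs`, as the proposition predicts.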
Corollary 1.5: If $P$ is a matrix such that all its rows have the properties of the vector $\xi$ in Proposition 1.4 and $Q$ is a stochastic matrix, then $\|PQ\| \le \|P\|\,\delta(Q)$.

Corollary 1.6: If $P$ and $Q$ are stochastic matrices, then $\|PQ - Q\| \le 2\,\delta(Q)$. In particular, if $\pi$ is a stochastic vector and $\rho$ is a row of $Q$, then $\|\pi Q - \rho\| \le 2\,\delta(Q)$.

Proof: $\|PQ - Q\| = \|(P - I)Q\| \le \|P - I\|\,\delta(Q) \le (\|P\| + \|I\|)\,\delta(Q) = 2\,\delta(Q)$. [See Exercise 8 at the end of this section.] ∎
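A quick numerical check of Corollary 1.6 on finite matrices may help fix the meaning of the norm $\|\cdot\|$ (the largest absolute row sum). This Python sketch is illustrative only; the helper names and both sample matrices are assumptions made here.

```python
def delta(P):
    """delta(P) via Proposition 1.1."""
    return max(sum(max(a - b, 0.0) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def norm(A):
    """||A||: largest absolute row sum (Definition 1.2, finite case)."""
    return max(sum(abs(v) for v in row) for row in A)

# Hypothetical 3 x 3 stochastic matrices.
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]
Q = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]

PQ = matmul(P, Q)
diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(PQ, Q)]

lhs = norm(diff)          # ||PQ - Q||
rhs = 2 * delta(Q)        # 2 * delta(Q)
print(lhs, rhs, lhs <= rhs + 1e-12)
```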
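Definition 1.3: Given a stochastic matrix $P = [p_{ij}]$, $\gamma(P)$ is defined as

\[ \gamma(P) = \inf_{i_1,i_2} \sum_j \min(p_{i_1 j}, p_{i_2 j}). \]

Proposition 1.7: Let $P$ be a stochastic matrix; then $\delta(P) = 1 - \gamma(P)$.

Proof: Denote

\[ \delta_{i_1 i_2}(P) = \sum_j (p_{i_1 j} - p_{i_2 j})^+, \qquad \gamma_{i_1 i_2}(P) = \sum_j \min(p_{i_1 j}, p_{i_2 j}), \]

and then

\[ \delta_{i_1 i_2}(P) = \sum_j (p_{i_1 j} - p_{i_2 j})^+ = \sum_j \big( p_{i_1 j} - \min(p_{i_1 j}, p_{i_2 j}) \big) = 1 - \sum_j \min(p_{i_1 j}, p_{i_2 j}) = 1 - \gamma_{i_1 i_2}(P). \]

Therefore, $\delta(P) \ge \delta_{i_1 i_2}(P) = 1 - \gamma_{i_1 i_2}(P)$, which implies that $\delta(P) \ge 1 - \gamma(P)$. Similarly, $\delta_{i_1 i_2}(P) = 1 - \gamma_{i_1 i_2}(P) \le 1 - \gamma(P)$, which implies that $\delta(P) \le 1 - \gamma(P)$. Combining the two inequalities we have that $\delta(P) = 1 - \gamma(P)$. ∎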
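The identity of Proposition 1.7 is easy to observe on a finite matrix. In the Python sketch below, $\gamma(P)$ is taken to be the smallest value, over pairs of rows, of the sum of the entrywise minima, which is how the quantity is used in the proof above; that reading, the function names, and the sample matrix are assumptions of this illustration.

```python
def delta(P):
    """delta(P) via Proposition 1.1."""
    return max(sum(max(a - b, 0.0) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

def gamma(P):
    """gamma(P): min over row pairs of the summed entrywise minima
    (assumed reading of Definition 1.3, finite case)."""
    return min(sum(min(a, b) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

# Hypothetical 3 x 3 stochastic matrix.
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]

print(delta(P), gamma(P), delta(P) + gamma(P))
```

For this matrix the extreme pair is formed by the first and third rows, and the two functionals add up to 1 up to rounding.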
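Proposition 1.8: If $P$ and $Q$ are stochastic matrices and $\eta$ a column vector such that $|\eta_i| \le 1$ for all $i$, then $d(PQ) \le \delta(P)\,d(Q)$ and $d(P\eta) \le \delta(P)\,d(\eta)$.

Proof: It suffices to prove the second inequality. Let $i_1$ and $i_2$ be two arbitrary rows in $P$. Since $\sum_j |p_{i_1 j} - p_{i_2 j}| \le 2$, we can find for any given $\varepsilon$ a number $k_0$ such that

\[ \sum_j p_{i_1 j}\eta_j - \sum_j p_{i_2 j}\eta_j = \sum_{j \ne j_0} (p_{i_1 j} - p_{i_2 j})(\eta_j - \eta_{j_0}) \le \sum_{j=1}^{k_0} (p_{i_1 j} - p_{i_2 j})(\eta_j - \eta_{j_0}) + 2\varepsilon, \]

where $\eta_{j_0}$ is the smallest of $\eta_1, \ldots, \eta_{k_0}$. By the definition of $\eta_{j_0}$ all the terms of the form $\eta_j - \eta_{j_0}$ are nonnegative in the above sum, so that by omitting the terms such that $p_{i_1 j} < p_{i_2 j}$ the sum is increased. Also $(\eta_j - \eta_{j_0}) \le d(\eta)$, with the result that

\[ \Big| \sum_j p_{i_1 j}\eta_j - \sum_j p_{i_2 j}\eta_j \Big| \le \sum_j (p_{i_1 j} - p_{i_2 j})^+ \, d(\eta) + 2\varepsilon \le \delta(P)\,d(\eta) + 2\varepsilon. \]

Since $\varepsilon > 0$ is arbitrarily small and $i_1$, $i_2$ are arbitrary, the proposition follows. ∎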
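Both inequalities of Proposition 1.8 can be tried out on small finite data. The Python sketch below is a hedged illustration, not part of the proof; the helper names, the sample matrices, and the sample vector (chosen with entries of absolute value at most 1) are assumptions made here.

```python
def delta(P):
    """delta(P) via Proposition 1.1."""
    return max(sum(max(a - b, 0.0) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

def d(P):
    """d(P): largest gap between two entries of the same column."""
    return max(max(col) - min(col) for col in zip(*P))

def d_vec(eta):
    """d(eta): largest gap between two entries of a vector."""
    return max(eta) - min(eta)

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_vec(P, eta):
    """Matrix times column vector."""
    return [sum(p * e for p, e in zip(row, eta)) for row in P]

# Hypothetical data.
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]
Q = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]
eta = [1.0, -0.5, 0.25]

vec_ok = d_vec(mat_vec(P, eta)) <= delta(P) * d_vec(eta) + 1e-12
mat_ok = d(matmul(P, Q)) <= delta(P) * d(Q) + 1e-12
print(vec_ok, mat_ok)
```

The matrix inequality is the vector inequality applied to each column of $Q$, which is why the proof only treats $d(P\eta)$.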
Proposition 1.9: If $P$ and $Q$ are stochastic matrices and $\eta$ is a vector as in Proposition 1.8, then $|P\eta - \eta| \le d(\eta)$ and $|PQ - Q| \le d(Q)$.

Proof: The same method used in the proof of the previous proposition can be used here, beginning with the inequality $0 \le \big| \sum_j p_{i_1 j}\eta_j - \sum_j \varepsilon_{i_1 j}\eta_j \big|$, where $\varepsilon_{i_1 j}$ is equal to 0 except for a unique, but arbitrary, entry which is equal to 1, and continuing the same way as in the previous proof. The details are left to the reader. ∎
Example: Let $P$ be the matrix
(H f)
then $\|P\| = 1$ [this is true for any stochastic matrix];
IPI= i;
$d(P)$ = the maximal distance between two elements in the same column $= \tfrac{1}{2}$ [$= |p_{11} - p_{31}|$], $\delta(P) = \tfrac{1}{2}$ [$= \sum_i (p_{1i} - p_{3i})^+$], and $\gamma(P) = \tfrac{1}{2}$ [$= \sum_i \min(p_{1i}, p_{3i})$]. The inequalities proved in this section are easily verified.
EXERCISES
1. Prove Proposition 1.1.
2. Prove Proposition 1.9.
3. Illustrate by examples all the inequalities proved in this section.
4. If $P$ is a finite stochastic matrix of order $n$, then
a. $d(P) \ge (1/n)\,\delta(P)$.
b. It is possible that $d(P) < 1$ and $\delta(P) = 1$.
c. $d(P) = 0$ if and only if $\delta(P) = 0$.
If $P$ is an infinite stochastic matrix, then
d. For any $\varepsilon$, there is $P$ such that $d(P) < \varepsilon$ but $\delta(P) = 1$.
e. $d(P) = 0$ if and only if $\delta(P) = 0$.
5. Prove: If $\gamma(P) \ne 0$ for a stochastic matrix $P$, then $\gamma(P)$ is not smaller than the minimal nonzero entry in $P$ and is not smaller than the sum of the minimal elements in the columns of $P$.
6. Prove that every stochastic matrix $P$ can be expressed in the form $P = E + Q$ where $E$ is a stochastic constant matrix and $\|Q\| \le 2\,\delta(P)$.
7. Prove: If $P$ is a constant stochastic matrix, then $\delta(P) = d(P) = 0$ [$\gamma(P) = 1$]; if $P$ is a degenerate nonconstant stochastic matrix, then $\delta(P) = d(P) = 1$ [$\gamma(P) = 0$].
8. Prove that the functionals "$\|\ \|$" and "$|\ |$" have the following properties: For any matrices $P$, $Q$ and real number $\alpha$ it is true that $\|P\| \ge 0$, $\|P + Q\| \le \|P\| + \|Q\|$, $\|P\| = 0$ if and only if $P = 0$, $\|\alpha P\| = |\alpha|\,\|P\|$ [defining $0 \cdot \infty = 0$], and similarly for "$|\ |$".
9. Let $P$ be a Markov matrix representing a given Markov process. Let $t_{ij}$ be the probability that the process will pass from both states $i$ and $j$ to some common consequent state in the first step. Prove that $t_{ij} > 0$ for any two states $i$ and $j$ if and only if $\gamma(P) > 0$.
10. Prove that for arbitrary matrices $A$ and $B$, $\|AB\| \le \|A\|\,\|B\|$.
11. Let $A_1, \ldots, A_n$ and $\bar{A}_1, \ldots, \bar{A}_n$ be two sets of $n$ matrices such that $\|A_i - \bar{A}_i\| \le \varepsilon$ for $i = 1, 2, \ldots, n$; then $\big\| \prod_{i=1}^{n} A_i - \prod_{i=1}^{n} \bar{A}_i \big\| \le n\varepsilon$.
12. Let $P$ be a Markov matrix and let $P_i$ be the Markov matrix such that all its rows are equal to the $i$th row of $P$. Prove that $\delta(P) \ge \tfrac{1}{2}\|P - P_i\|$, but for every $\varepsilon$ there is an index $i_0$ such that $\delta(P) \le \tfrac{1}{2}\|P - P_{i_0}\| + \tfrac{1}{2}\varepsilon$.
13. A doubly stochastic matrix is a stochastic matrix $P = [p_{ij}]$ such that both $\sum_j p_{ij} = 1$, $i = 1, 2, \ldots$ and $\sum_i p_{ij} = 1$, $j = 1, 2, \ldots$, i.e., the sum of the entries in any column is also equal to 1. Prove:
a. If $P$ is doubly stochastic and $\delta(P) = 0$, then $P$ is of finite order, say $n$, and all the entries of $P$ are equal to $1/n$.
b. The set of doubly stochastic matrices is closed under multiplication [since $I$ is doubly stochastic this implies that the set of doubly stochastic matrices is a monoid].
c. If $P$ is a doubly stochastic matrix of finite order $n$ such that $\delta(P) < 1$, and $E$ is a matrix all the entries of which are equal to $1/n$, then $\lim_{m \to \infty} \|P^m - E\| = 0$ [$\lim P^m = E$].
d. If $P$ is a countable doubly stochastic matrix, then $\delta(P) = 1$.
14. Consider the following Markov matrix

\[ P = \begin{bmatrix} 1-p & p \\ q & 1-q \end{bmatrix}, \qquad p + q > 0, \quad p, q > 0. \]

Prove that

\[ \lim_{n \to \infty} P^n = \begin{bmatrix} \dfrac{q}{p+q} & \dfrac{p}{p+q} \\[2ex] \dfrac{q}{p+q} & \dfrac{p}{p+q} \end{bmatrix}. \]
15. Prove that $\sup(\|\xi P\|/\|\xi\|) = \delta(P)$, where $\xi$ ranges over vectors such that $\|\xi\| < \infty$ and $\sum_i \xi_i = 0$.
16. Prove that any vector $\xi$ such that $\|\xi\| < \infty$ and $\sum_i \xi_i = 0$ can be expressed in the form $\xi = \sum_{i=1}^{\infty} \zeta_i$, where the vectors $\zeta_i = (\zeta_{ij})$ have only two nonzero entries, $\|\zeta_i\| < \infty$, $\sum_j \zeta_{ij} = 0$, and $\|\xi\| = \sum_i \|\zeta_i\|$.
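The limit stated in Exercise 14 can be observed numerically before it is proved. The Python sketch below is only an illustration under assumed values: the parameters `p = 0.3` and `q = 0.1` (chosen so that $0 < p + q < 2$), the exponent 200, and the helper names are all choices made here.

```python
def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def two_state(p, q):
    """The 2 x 2 Markov matrix of Exercise 14."""
    return [[1 - p, p],
            [q, 1 - q]]

def power(P, n):
    """P multiplied by itself n times, starting from the identity."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        R = matmul(R, P)
    return R

p, q = 0.3, 0.1                            # assumed sample parameters
Pn = power(two_state(p, q), 200)
limit_row = [q / (p + q), p / (p + q)]     # common row of the claimed limit

print(Pn)
print(limit_row)
```

Both rows of the computed power agree with `limit_row` to many decimal places, which is consistent with the claim but of course does not replace the proof asked for in the exercise.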
2. Nonhomogeneous Markov Chains
The different functionals $d$, $\delta$, $\gamma$ introduced in the previous section provide, in a certain sense, a measure of the "distance" between two arbitrary rows of a given stochastic matrix. Thus if the matrix $P$ is constant, then $\delta(P) = d(P) = 0$ and $\gamma(P) = 1$ [see Exercise 7 in the previous section]. These functionals will be used subsequently for studying the long-range behavior of Markov chains. As mentioned before, a nonhomogeneous Markov chain can be represented by an infinite sequence of Markov matrices $\{P_i\}$ such that the matrix $P_i$ represents the transition probabilities of the system from state to state at time $t = i$. Let $H_{mn}$ be defined as the product of the matrices $P_i$ that carry the system from time $t = m$ to time $t = n$; then the $ij$ entry in $H_{mn}$ is the probability that the system will enter the state $j$ at time $t = n$ if it was at state $i$ at time $t = m$. We shall now distinguish between two cases for the long-range behavior of a given Markov chain.
Case 1: $\lim_{n \to \infty} \delta(H_{mn}) = 0$, $m = 0, 1, 2, \ldots$. In this case the chain is called weakly ergodic.

Case 2: For any given $m$ there exists a constant stochastic matrix $Q$ such that $\lim_{n \to \infty} \|H_{mn} - Q\| = 0$. In this case the chain is called strongly ergodic.

In addition to the two above distinctions, there may be other distinctions as well (e.g., the matrix $Q$ in the second case may not be constant, or the limit, in both cases, may exist only for some $m$ but not for all $m$, etc.), but because of their restrictive nature those distinctions will not be considered here. We shall now give some characterizations of the above-defined properties.
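Weak ergodicity can be watched on a toy nonhomogeneous chain. The Python sketch below is an illustration under stated assumptions: it takes $H_{mn}$ to be the product $P_{m+1} P_{m+2} \cdots P_n$ (an indexing convention assumed here), and the two-state family `P_step(i)` is a hypothetical example, not one from the text.

```python
def delta(P):
    """delta(P) via Proposition 1.1."""
    return max(sum(max(a - b, 0.0) for a, b in zip(r1, r2))
               for r1 in P for r2 in P)

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def P_step(i):
    """Hypothetical transition matrix at time i: switch state with probability 1/(i+2)."""
    a = 1.0 / (i + 2)
    return [[1 - a, a],
            [a, 1 - a]]

def H(m, n):
    """Assumed convention: H_{mn} = P_{m+1} P_{m+2} ... P_n."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    for i in range(m + 1, n + 1):
        R = matmul(R, P_step(i))
    return R

deltas = [delta(H(0, n)) for n in (1, 10, 100)]
print(deltas)
```

For this family $\delta(P_i) = i/(i+2)$, and the printed values of $\delta(H_{0n})$ shrink toward 0 as $n$ grows, which is the behavior required in Case 1 for $m = 0$.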
Theorem 2.1: A Markov chain is weakly ergodic if and only if there exists a subdivision of the chain into blocks of matrices $\{H_{i_j i_{j+1}}\}$ such that $\sum_{j=1}^{\infty} \gamma(H_{i_j i_{j+1}})$ diverges, [$i_1 = 0$].
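Proof: The condition is sufficient, since the divergence of $\sum_{j=1}^{\infty} \gamma(H_{i_j i_{j+1}})$ implies that for any $j_0$, $\lim_{n \to \infty} \prod_{j=j_0}^{n} \big(1 - \gamma(H_{i_j i_{j+1}})\big) = 0$, and using Propositions 1.3, 1.2, and 1.7, we have that

\[ \delta\Big( \prod_{i=m}^{m+n} P_i \Big) \le \delta\Big( \prod_{i_j \ge m}^{m+n} H_{i_j i_{j+1}} \Big) \le \prod_{i_j \ge m}^{m+n} \delta(H_{i_j i_{j+1}}) = \prod_{i_j \ge m}^{m+n} \big( 1 - \gamma(H_{i_j i_{j+1}}) \big), \]

where $i_j \ge m$ means that the product begins with the first index $i_j \ge m$. Taking limits on both sides, we get that

\[ \lim_{n \to \infty} \delta\Big( \prod_{i=m}^{m+n} P_i \Big) \le \lim_{N \to \infty} \delta\Big( \prod_{i_j \ge m}^{N} H_{i_j i_{j+1}} \Big) = 0 \]

with $N = m + n$. If $\lim_{n \to \infty} \delta\big( \prod_{i=m}^{n} P_i \big) = 0$, $m = 1, 2, \ldots$, then by Proposition 1.7,

\[ \lim_{n \to \infty} \gamma\Big( \prod_{i=m}^{n} P_i \Big) = \lim_{n \to \infty} \Big( 1 - \delta\Big( \prod_{i=m}^{n} P_i \Big) \Big) = 1 - \lim_{n \to \infty} \delta\Big( \prod_{i=m}^{n} P_i \Big) = 1. \]

Let $0 < \varepsilon < 1$ be a small constant; then it follows from the above that a sequence of blocks $H_{i_j i_{j+1}}$ can be found such that $\gamma(H_{i_j i_{j+1}}) > \varepsilon$, so that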
Lj~1