7.2 Chains on General State Spaces
7.2.3 Invariant Measures and Stationarity
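We prove this inequality by induction. For $n = 1$ we can write $\mu(A) = \mu Q(A) \geq \mu(\alpha) Q(\alpha, A) = Q(\alpha, A) = \mathrm{P}_\alpha(X_1 \in A)$. Now assume that the inequality holds for some $n \geq 1$. Then
\[
\begin{aligned}
\mu(A) &= Q(\alpha, A) + \int_{\alpha^c} \mu(dy)\, Q(y, A) \\
&\geq Q(\alpha, A) + \sum_{k=1}^{n} \mathrm{E}_\alpha\left[ Q(X_k, A)\, 1\{\tau_\alpha \geq k\}\, 1\{X_k \notin \alpha\} \right] \\
&\geq Q(\alpha, A) + \sum_{k=1}^{n} \mathrm{E}_\alpha\left[ Q(X_k, A)\, 1\{\tau_\alpha \geq k+1\} \right] .
\end{aligned}
\]
Because $\{\tau_\alpha \geq k+1\} \in \mathcal{F}_k^X$, the Markov property yields
\[
\mathrm{E}_\alpha\left[ Q(X_k, A)\, 1\{\tau_\alpha \geq k+1\} \right] = \mathrm{P}_\alpha(X_{k+1} \in A,\ \tau_\alpha \geq k+1) ,
\]
whence
\[
\mu(A) \geq Q(\alpha, A) + \sum_{k=2}^{n+1} \mathrm{P}_\alpha(X_k \in A,\ \tau_\alpha \geq k) = \sum_{k=1}^{n+1} \mathrm{P}_\alpha(X_k \in A,\ \tau_\alpha \geq k) .
\]
This completes the induction, and we conclude that $\mu \geq \mu_\alpha$.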
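Assume that there exists a set $A$ such that $\mu(A) > \mu_\alpha(A)$. It is straightforward that $\mu$ and $\mu_\alpha$ are both invariant for the resolvent kernel $K_\delta$ (see (7.17)), for any $\delta \in (0,1)$. Because $\alpha$ is accessible, $K_\delta(x, \alpha) > 0$ for all $x \in \mathsf{X}$. Hence $\int_A \mu(dx)\, K_\delta(x, \alpha) > \int_A \mu_\alpha(dx)\, K_\delta(x, \alpha)$, which implies that
\[
\begin{aligned}
1 = \mu(\alpha) = \mu K_\delta(\alpha) &= \int_A \mu(dx)\, K_\delta(x, \alpha) + \int_{A^c} \mu(dx)\, K_\delta(x, \alpha) \\
&> \int_A \mu_\alpha(dx)\, K_\delta(x, \alpha) + \int_{A^c} \mu_\alpha(dx)\, K_\delta(x, \alpha) = \mu_\alpha K_\delta(\alpha) = \mu_\alpha(\alpha) = 1 .
\end{aligned}
\]
This contradiction shows that $\mu = \mu_\alpha$.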
We finally prove that $\mu_\alpha$ is a maximal irreducibility measure. Let $\psi$ be a maximal irreducibility measure and assume that $\psi(A) = 0$. Then $\mathrm{P}_x(\tau_A < \infty) = 0$ for $\psi$-almost all $x \in \mathsf{X}$. This obviously implies that $\mathrm{P}_x(\tau_A < \infty) = 0$ for $\psi$-almost all $x \in \alpha$. Because $\mathrm{P}_x(\tau_A < \infty)$ is constant over $\alpha$, we find that $\mathrm{P}_x(\tau_A < \infty) = 0$ for all $x \in \alpha$, and this yields $\mu_\alpha(A) = 0$. Thus $\mu_\alpha$ is absolutely continuous with respect to $\psi$, hence an irreducibility measure. Let again $K_\delta$ be the resolvent kernel.
By Theorem 147, $\mu_\alpha K_\delta$ is a maximal irreducibility measure. But, as noted above, $\mu_\alpha K_\delta = \mu_\alpha$, and therefore $\mu_\alpha$ is a maximal irreducibility measure.
Proposition 173. Let $Q$ be a recurrent phi-irreducible transition kernel that admits an accessible $(1, \epsilon, \nu)$-small set $C$. Then it admits a non-trivial invariant measure $\pi$, unique up to multiplication by a constant and such that $0 < \pi(C) < \infty$, and any invariant measure is a maximal irreducibility measure.
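Proof. By (7.26), $(\mu Q)^\star = \mu^\star \check{Q}$, so that $\mu$ is $Q$-invariant if and only if $\mu^\star$ is $\check{Q}$-invariant. Let $\check{\mu}$ be a $\check{Q}$-invariant measure and define
\[
\mu = \int_{C \times \{0\}} \check{\mu}(d\check{x})\, R(x, \cdot) + \int_{C^c \times \{0\}} \check{\mu}(d\check{x})\, Q(x, \cdot) + \check{\mu}(\mathsf{X} \times \{1\})\, \nu .
\]
By application of the definition of the split kernel and measures, it can be checked that $\check{\mu} \check{Q} = \mu^\star$. Hence $\mu^\star = \check{\mu} \check{Q} = \check{\mu}$. We thus see that $\mu^\star$ is $\check{Q}$-invariant, which, as noted above, implies that $\mu$ is $Q$-invariant. Hence we have shown that there exists a $Q$-invariant measure if and only if there exists a $\check{Q}$-invariant one.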
If $Q$ is recurrent then $C$ is recurrent, and as appears in the proof of Proposition 173 this implies that the atom $\check{\alpha}$ is recurrent for the split chain $\check{Q}$. Thus, by Proposition 154 the kernel $\check{Q}$ is recurrent, and by Proposition 172 it admits an invariant measure that is unique up to a scaling factor. Hence $Q$ also admits an invariant measure, unique up to a scaling factor and such that $0 < \pi(C) < \infty$.
Let $\mu$ be $Q$-invariant. Then $\mu^\star$ is $\check{Q}$-invariant and hence, by Proposition 172, a maximal irreducibility measure. If $\mu(A) > 0$, then $\mu^\star(A \times \{0,1\}) = \mu(A) > 0$. Thus $A \times \{0,1\}$ is accessible, and this implies that $A$ is accessible. We conclude that $\mu$ is an irreducibility measure, and it is maximal because it is $K_\eta$-invariant.
If the kernel $Q$ is phi-irreducible and admits an accessible $(m, \epsilon, \nu)$-small set $C$, then, by Proposition 165, for any $\eta \in (0,1)$ the set $C$ is an accessible $(1, \epsilon', \nu)$-small set for the resolvent kernel $K_\eta$. If $C$ is recurrent for $Q$, it is also recurrent for $K_\eta$ and therefore, by Proposition 164, $K_\eta$ has a unique invariant probability measure. The following result shows that this probability measure is invariant also for $Q$.
Lemma 174. A measure $\mu$ on $(\mathsf{X}, \mathcal{X})$ is $Q$-invariant if and only if $\mu$ is $K_\eta$-invariant for some (hence for all) $\eta \in (0,1)$.
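Proof. If $\mu Q = \mu$, then obviously $\mu Q^n = \mu$ for all $n \geq 0$, so that $\mu K_\eta = \mu$. Conversely, assume that $\mu K_\eta = \mu$. Because $K_\eta = \eta Q K_\eta + (1-\eta) Q^0$ and $Q K_\eta = K_\eta Q$, it holds that
\[
\mu = \mu K_\eta = \eta \mu Q K_\eta + (1-\eta)\mu = \eta \mu K_\eta Q + (1-\eta)\mu = \eta \mu Q + (1-\eta)\mu .
\]
Hence $\eta \mu Q = \eta \mu$, which concludes the proof.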
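Lemma 174 can be illustrated numerically on a finite state space. The sketch below assumes that the resolvent is $K_\eta = (1-\eta)\sum_{n \geq 0} \eta^n Q^n = (1-\eta)(I - \eta Q)^{-1}$, which is consistent with the identity $K_\eta = \eta Q K_\eta + (1-\eta) Q^0$ used in the proof. The 3-state matrix and the helper names `resolvent` and `invariant_distribution` are hypothetical choices made only for this illustration.

```python
import numpy as np

# Hypothetical 3-state transition matrix, chosen only for illustration.
Q = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.1, 0.5]])


def resolvent(Q, eta):
    """Assumed form K_eta = (1 - eta) * (I - eta * Q)^{-1}."""
    d = Q.shape[0]
    return (1.0 - eta) * np.linalg.inv(np.eye(d) - eta * Q)


def invariant_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalised to sum to one."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()


pi_Q = invariant_distribution(Q)
for eta in (0.2, 0.5, 0.9):
    K = resolvent(Q, eta)
    pi_K = invariant_distribution(K)
    # Both directions of the lemma: pi_Q is K_eta-invariant, pi_K is Q-invariant.
    print(eta, np.allclose(pi_Q @ K, pi_Q), np.allclose(pi_K @ Q, pi_K))
```

In this finite setting the invariant probability vectors of $Q$ and of each $K_\eta$ should agree up to floating-point error, whatever value of $\eta \in (0,1)$ is used.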
Drift Conditions
We first give a sufficient condition for a chain to be positive, based on the expectation of the return time to an accessible small set.
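Proposition 175. Let $Q$ be a transition kernel that admits an accessible small set $C$ such that
\[
\sup_{x \in C} \mathrm{E}_x[\tau_C] < \infty . \tag{7.31}
\]
Then the chain is positive and the invariant probability measure $\pi$ satisfies, for all $A \in \mathcal{X}$,
\[
\pi(A) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] . \tag{7.32}
\]
If $f$ is a non-negative measurable function such that
\[
\sup_{x \in C} \mathrm{E}_x\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] < \infty , \tag{7.33}
\]
then $f$ is integrable with respect to $\pi$ and
\[
\pi(f) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} f(X_k) \right] .
\]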
Proof. First note that by Proposition 156, $Q$ is phi-irreducible. Equation (7.31) implies that $\mathrm{P}_x(\tau_C < \infty) = 1$ for all $x \in C$, that is, $C$ is Harris recurrent. By Proposition 167, $C$ is recurrent, and so, by Proposition 164, $Q$ is recurrent. Let $\pi$ be an invariant measure such that $0 < \pi(C) < \infty$, the existence of which is given by Proposition 173. Then define a measure $\mu_C$ on $\mathcal{X}$ by
\[
\mu_C(A) \stackrel{\mathrm{def}}{=} \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] .
\]
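Because $\tau_C < \infty$ $\mathrm{P}_y$-a.s. for all $y \in C$, it holds that $\mu_C(C) = \pi(C)$. Then we can show that $\mu_C(A) = \pi(A)$ for all $A \in \mathcal{X}$. The proof is along the same lines as the proof of Proposition 172 and is therefore omitted. Thus, $\mu_C$ is invariant. In addition, we obtain that for any measurable set $A$,
\[
\int_C \pi(dy)\, \mathrm{E}_y[1_A(X_0)] = \pi(A \cap C) = \mu_C(A \cap C) = \int_C \pi(dy)\, \mathrm{E}_y[1_A(X_{\tau_C})] ,
\]
and this yields
\[
\mu_C(A) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] .
\]
We thus obtain the following equivalent expressions for $\mu_C$:
\[
\begin{aligned}
\mu_C(A) &= \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] = \int_C \mu_C(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] \\
&= \int_C \mu_C(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] = \pi(A) .
\end{aligned}
\]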
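Hence
\[
\pi(\mathsf{X}) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_{\mathsf{X}}(X_k) \right] \leq \pi(C) \sup_{y \in C} \mathrm{E}_y[\tau_C] < \infty ,
\]
so that any invariant measure is finite and the chain is positive. Finally, under (7.33) we obtain that
\[
\pi(f) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] \leq \pi(C) \sup_{y \in C} \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] < \infty .
\]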
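On a finite state space the two expressions in (7.32) can be evaluated exactly with linear algebra, which gives a quick sanity check of Proposition 175. The following is a minimal sketch: the 5-state matrix, the sets `C` and `A`, and the helper `invariant_distribution` are hypothetical choices made for illustration. It uses the fact that, writing $D = C^c$, the matrix $(I - Q_{DD})^{-1}$ collects the expected numbers of visits to the states of $D$ before the chain enters $C$.

```python
import numpy as np

# Hypothetical 5-state chain; the matrix and the sets C and A are illustrative.
Q = np.array([
    [0.2, 0.3, 0.1, 0.2, 0.2],
    [0.1, 0.1, 0.4, 0.2, 0.2],
    [0.3, 0.2, 0.2, 0.2, 0.1],
    [0.2, 0.2, 0.2, 0.2, 0.2],
    [0.1, 0.4, 0.1, 0.1, 0.3],
])
C = [0, 1]
D = [2, 3, 4]  # complement of C
A = [1, 3]
ind_A = np.zeros(5)
ind_A[A] = 1.0


def invariant_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalised to sum to one."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()


pi = invariant_distribution(Q)

Q_CC, Q_CD = Q[np.ix_(C, C)], Q[np.ix_(C, D)]
Q_DC, Q_DD = Q[np.ix_(D, C)], Q[np.ix_(D, D)]
N = np.linalg.inv(np.eye(len(D)) - Q_DD)  # expected visits to D before entering C

# E_y[ sum_{k=0}^{tau_C - 1} 1_A(X_k) ] for y in C
first = ind_A[C] + Q_CD @ N @ ind_A[D]
# E_y[ sum_{k=1}^{tau_C} 1_A(X_k) ] for y in C
second = Q_CC @ ind_A[C] + Q_CD @ N @ (ind_A[D] + Q_DC @ ind_A[C])

print(pi[A].sum(), pi[C] @ first, pi[C] @ second)
```

The three printed numbers should coincide up to rounding. Replacing `ind_A` by the all-ones vector turns both expressions into $\sum_{y \in C} \pi(y)\, \mathrm{E}_y[\tau_C]$, which should then equal one.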
Except in specific examples (where, for example, the invariant distribution is known in advance), it may be difficult to decide if a chain is positive or null. To check such properties, it is convenient to use drift conditions.
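Proposition 176. Assume that there exist a set $C \in \mathcal{X}$, two measurable functions $1 \leq f \leq V$, and a constant $b > 0$ such that
\[
QV \leq V - f + b 1_C . \tag{7.34}
\]
Then
\[
\mathrm{E}_x[\tau_C] \leq V(x) + b 1_C(x) , \tag{7.35}
\]
\[
\mathrm{E}_x[V(X_{\tau_C})] + \mathrm{E}_x\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] \leq V(x) + b 1_C(x) . \tag{7.36}
\]
If $C$ is an accessible small set and $V$ is bounded on $C$, then the chain is positive recurrent and $\pi(f) < \infty$.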
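Proof. Set, for $n \geq 1$,
\[
M_n = \left[ V(X_n) + \sum_{k=0}^{n-1} f(X_k) \right] 1\{\tau_C \geq n\} .
\]
Then
\[
\begin{aligned}
\mathrm{E}[M_{n+1} \mid \mathcal{F}_n] &= \left[ QV(X_n) + \sum_{k=0}^{n} f(X_k) \right] 1\{\tau_C \geq n+1\} \\
&\leq \left[ V(X_n) - f(X_n) + b 1_C(X_n) + \sum_{k=0}^{n} f(X_k) \right] 1\{\tau_C \geq n+1\} \\
&= \left[ V(X_n) + \sum_{k=0}^{n-1} f(X_k) \right] 1\{\tau_C \geq n+1\} \leq M_n ,
\end{aligned}
\]
as $1_C(X_n)\, 1\{\tau_C \geq n+1\} = 0$. Hence $\{M_n\}_{n \geq 1}$ is a non-negative super-martingale. For any integer $n$, $\tau_C \wedge n$ is a bounded stopping time, and Doob's optional stopping theorem shows that for any $x \in \mathsf{X}$,
\[
\mathrm{E}_x[M_{\tau_C \wedge n}] \leq \mathrm{E}_x[M_1] \leq V(x) + b 1_C(x) . \tag{7.37}
\]
Applying this relation with $f \equiv 1$ yields, for any $x \in \mathsf{X}$ and $n \geq 0$,
\[
\mathrm{E}_x[\tau_C \wedge n] \leq V(x) + b 1_C(x) ,
\]
and (7.35) follows using monotone convergence. This implies in particular that $\mathrm{P}_x(\tau_C < \infty) = 1$ for any $x \in \mathsf{X}$. The proof of (7.36) follows similarly from (7.37) by letting $n \to \infty$. Finally, when $C$ is an accessible small set and $V$ is bounded on $C$, (7.35) and (7.36) show that (7.31) and (7.33) hold, so that the chain is positive and $\pi(f)$ is finite by Proposition 175.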
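The bound (7.35) can be checked exactly on a small chain. The sketch below is only an illustration under stated assumptions: it uses a nearest-neighbour random walk on $\{0, \dots, N\}$ with up-probability $p < 1/2$, reflected at $0$ and truncated at a hypothetical level $N$, together with the choices $C = \{0\}$, $f \equiv 1$, $V(x) = 1 + x/(q - p)$ and $b = 1 + p/(q - p)$, none of which come from the text. The expected return times are obtained by solving a linear system rather than by simulation.

```python
import numpy as np

p, N = 0.3, 50           # up-step probability and truncation level (illustrative)
q = 1.0 - p
drift = q - p            # minus the mean increment; positive because p < q

# Transition matrix of X_{n+1} = min(max(X_n + W, 0), N), W = +1 w.p. p, -1 w.p. q.
Q = np.zeros((N + 1, N + 1))
for x in range(N + 1):
    Q[x, min(x + 1, N)] += p
    Q[x, max(x - 1, 0)] += q

states = np.arange(N + 1)
V = 1.0 + states / drift             # assumed drift function
f = np.ones(N + 1)                   # f = 1, so that 1 <= f <= V
ind_C = (states == 0).astype(float)  # C = {0}
b = 1.0 + p / drift

# Drift condition (7.34): QV <= V - f + b 1_C (small tolerance for rounding).
drift_ok = np.all(Q @ V <= V - f + b * ind_C + 1e-9)

# h(z) = expected hitting time of 0 from z in D = {1, ..., N}: h = 1 + Q_DD h.
Q_DD = Q[1:, 1:]
h = np.linalg.solve(np.eye(N) - Q_DD, np.ones(N))
# E_x[tau_C] = 1 + sum_{z in D} Q(x, z) h(z) for every starting state x.
expected_return = 1.0 + Q[:, 1:] @ h

# Bound (7.35): E_x[tau_C] <= V(x) + b 1_C(x).
bound_ok = np.all(expected_return <= V + b * ind_C + 1e-9)
print(drift_ok, bound_ok)
```

Both flags should be true for any $p < 1/2$; the bound is not tight, since $V$ was chosen for convenience rather than optimality.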
Example 177 (Random Walk on the Half-Line, Continued). Consider again the model of Example 153. Previously we have seen that sets of the form $[0, c]$ are small. If $\Gamma((-\infty, -c]) > 0$, then for $x \in [0, c]$,
\[
Q(x, A) \geq \Gamma((-\infty, -c])\, 1_A(0) ;
\]
otherwise there exists an integer $m$ such that $\Gamma^{*m}((-\infty, -c]) > 0$, whence
\[
Q^m(x, A) \geq \Gamma^{*m}((-\infty, -c])\, 1_A(0) .
\]
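To prove recurrence for $\mu < 0$, we apply Proposition 176. Because $\mu < 0$, there exists $c > 0$ such that $\int_{-c}^{\infty} w\, \Gamma(dw) \leq \mu/2 < 0$. Thus, taking $V(x) = x$ for $x > c$,
\[
\begin{aligned}
QV(x) - V(x) &= \int_{-\infty}^{\infty} \left[ (x + w)^+ - x \right] \Gamma(dw) \\
&= -x\, \Gamma((-\infty, -x]) + \int_{-x}^{\infty} w\, \Gamma(dw) \leq \mu/2 .
\end{aligned}
\]
Hence the chain is positive recurrent.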
Consider now the case $\mu > 0$. In view of Proposition 154, we have to show that the atom $\{0\}$ is transient. For any $n$, $X_n \geq X_0 + \sum_{i=1}^{n} W_i$. Define $C_n = \left\{ \left| n^{-1} \sum_{i=1}^{n} W_i - \mu \right| \geq \mu/2 \right\}$ and write $D_n$ for $\{X_n = 0\}$. The strong law of large numbers implies that $\mathrm{P}_0(D_n \text{ i.o.}) \leq \mathrm{P}_0(C_n \text{ i.o.}) = 0$. Hence the atom $\{0\}$ is transient, and so is the chain.
When $\mu = 0$, additional assumptions on $\Gamma$ are needed to prove the recurrence of the RWHL (see for instance Meyn and Tweedie, 1993, Lemma 8.5.2).
Example 178 (Autoregressive Model, Continued). Consider again the model of Example 148 and assume that the noise process has zero mean and finite variance. Choosing $V(x) = x^2$ we have
\[
QV(x) = \mathrm{E}[(\phi x + U_1)^2] = \phi^2 V(x) + \mathrm{E}[U_1^2] ,
\]
so that (7.34) holds when $C = [-M, M]$ for some large enough $M$, provided $|\phi| < 1$. Because we know that every compact set is small if the noise process has an everywhere continuous positive density, Proposition 176 shows that the chain is positive recurrent. Note that this approach provides an existence result but does not help us to determine $\pi$. If $\{U_k\}$ are Gaussian with zero mean and variance $\sigma^2$, then one can check that the invariant distribution also is Gaussian with zero mean and variance $\sigma^2/(1 - \phi^2)$.
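The drift computation of Example 178 can be made concrete with explicit constants. The sketch below is an illustration under assumptions that are not in the text: it uses the shifted function $V(x) = 1 + x^2$ (so that $1 \leq f \leq V$ can hold near the origin), the choice $f(x) = 1 + \tfrac12 (1 - \phi^2) x^2$, the level $M = \sqrt{2(\sigma^2 + 1)/(1 - \phi^2)}$ and the constant $b = \sigma^2 + 1$, with hypothetical values $\phi = 0.8$ and $\sigma^2 = 1$. It checks (7.34) on a grid and iterates the variance recursion $v_{n+1} = \phi^2 v_n + \sigma^2$ to recover the stationary variance.

```python
import numpy as np

phi, sigma2 = 0.8, 1.0  # illustrative values with |phi| < 1


def V(x):
    # Shifted quadratic drift function (assumption: 1 + x^2 instead of x^2).
    return 1.0 + x**2


def QV(x):
    # E[V(phi * x + U)] when E[U] = 0 and E[U^2] = sigma2.
    return 1.0 + phi**2 * x**2 + sigma2


def f(x):
    # Assumed choice satisfying 1 <= f <= V.
    return 1.0 + 0.5 * (1.0 - phi**2) * x**2


M = np.sqrt(2.0 * (sigma2 + 1.0) / (1.0 - phi**2))
b = sigma2 + 1.0

x = np.linspace(-50.0, 50.0, 20001)
in_C = (np.abs(x) <= M).astype(float)  # indicator of C = [-M, M]

# Drift condition (7.34) on the grid, with a small tolerance for rounding.
drift_ok = np.all(QV(x) <= V(x) - f(x) + b * in_C + 1e-9)
sandwich_ok = np.all((1.0 <= f(x)) & (f(x) <= V(x)))

# Variance of X_n started from a fixed point: v_{n+1} = phi^2 v_n + sigma2.
v = 0.0
for _ in range(200):
    v = phi**2 * v + sigma2

print(drift_ok, sandwich_ok, v, sigma2 / (1.0 - phi**2))
```

With these choices, $V - f + b 1_C - QV = \tfrac12 (1 - \phi^2) x^2 - (1 + \sigma^2) + b 1_C$, which is non-negative both inside and outside $C$; the iterated variance should agree with $\sigma^2/(1 - \phi^2)$ to high precision.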
Theorem 170 shows that if a chain is phi-irreducible and recurrent then the chain is positive, that is, it admits a unique invariant probability measure $\pi$. In certain situations, and in particular when dealing with MCMC procedures, it is known that $Q$ admits an invariant probability measure, but it is not known, a priori, that the chain is recurrent. The following result shows that positivity implies recurrence.
Proposition 179. If the Markov kernel $Q$ is positive, then it is recurrent.
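Proof. Suppose that the chain is positive and let $\pi$ be an invariant probability measure. If $Q$ is transient, the state space $\mathsf{X}$ is covered by a countable family $\{A_j\}$ of uniformly transient subsets (see Theorem 151). For any $j$ and $k$,
\[
k \pi(A_j) = \sum_{n=1}^{k} \pi Q^n(A_j) \leq \int \pi(dx)\, \mathrm{E}_x[\eta_{A_j}] \leq \sup_{x \in \mathsf{X}} \mathrm{E}_x[\eta_{A_j}] . \tag{7.38}
\]
The strong Markov property implies that
\[
\begin{aligned}
\mathrm{E}_x[\eta_{A_j}] &= \mathrm{E}_x\left[ \eta_{A_j} 1\{\sigma_{A_j} < \infty\} \right] \\
&\leq \mathrm{E}_x\left\{ 1\{\sigma_{A_j} < \infty\}\, \mathrm{E}_{X_{\sigma_{A_j}}}[\eta_{A_j}] \right\} \leq \sup_{x \in A_j} \mathrm{E}_x[\eta_{A_j}]\, \mathrm{P}_x(\sigma_{A_j} < \infty) .
\end{aligned}
\]
Thus, the left-hand side of (7.38) is bounded as $k \to \infty$. This implies that $\pi(A_j) = 0$, and hence $\pi(\mathsf{X}) = 0$. This is a contradiction, so the chain cannot be transient.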