Invariant Measures and Stationarity - Chains on General State Spaces

7.2 Chains on General State Spaces

7.2.3 Invariant Measures and Stationarity

We prove this inequality by induction. For $n = 1$ we can write $\mu(A) = \mu Q(A) \geq \mu(\alpha) Q(\alpha, A) = Q(\alpha, A) = \mathrm{P}_\alpha(X_1 \in A)$. Now assume that the inequality holds for some $n \geq 1$. Then

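\[
\begin{aligned}
\mu(A) &= Q(\alpha, A) + \int_{\alpha^c} \mu(dy)\, Q(y, A) \\
&\geq Q(\alpha, A) + \sum_{k=1}^{n} \mathrm{E}_\alpha\left[ Q(X_k, A)\, 1\{\tau_\alpha \geq k\}\, 1\{X_k \notin \alpha\} \right] \\
&\geq Q(\alpha, A) + \sum_{k=1}^{n} \mathrm{E}_\alpha\left[ Q(X_k, A)\, 1\{\tau_\alpha \geq k+1\} \right] .
\end{aligned}
\]

Because $\{\tau_\alpha \geq k+1\} \in \mathcal{F}_k^X$, the Markov property yields

\[
\mathrm{E}_\alpha\left[ Q(X_k, A)\, 1\{\tau_\alpha \geq k+1\} \right] = \mathrm{P}_\alpha(X_{k+1} \in A,\ \tau_\alpha \geq k+1) ,
\]

whence

\[
\mu(A) \geq Q(\alpha, A) + \sum_{k=2}^{n+1} \mathrm{P}_\alpha(X_k \in A,\ \tau_\alpha \geq k) = \sum_{k=1}^{n+1} \mathrm{P}_\alpha(X_k \in A,\ \tau_\alpha \geq k) .
\]

This completes the induction, and we conclude that $\mu \geq \mu_\alpha$.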

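Assume that there exists a set $A$ such that $\mu(A) > \mu_\alpha(A)$. It is straightforward that $\mu$ and $\mu_\alpha$ are both invariant for the resolvent kernel $K_\delta$ (see (7.17)), for any $\delta \in (0,1)$. Because $\alpha$ is accessible, $K_\delta(x, \alpha) > 0$ for all $x \in \mathsf{X}$. Since $\mu \geq \mu_\alpha$, it follows that $\int_A \mu(dx)\, K_\delta(x, \alpha) > \int_A \mu_\alpha(dx)\, K_\delta(x, \alpha)$, which implies that

\[
\begin{aligned}
1 = \mu(\alpha) = \mu K_\delta(\alpha) &= \int_A \mu(dx)\, K_\delta(x, \alpha) + \int_{A^c} \mu(dx)\, K_\delta(x, \alpha) \\
&> \int_A \mu_\alpha(dx)\, K_\delta(x, \alpha) + \int_{A^c} \mu_\alpha(dx)\, K_\delta(x, \alpha) = \mu_\alpha K_\delta(\alpha) = \mu_\alpha(\alpha) = 1 .
\end{aligned}
\]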

This contradiction shows that µ=µα.
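The construction of $\mu_\alpha$ can be checked numerically on a finite state space, where a single state plays the role of the atom. The sketch below is only an illustration: the transition matrix `Q`, the choice of state `0` as the atom, and the helper names are assumptions made for this example, not objects from the text. It accumulates $\mu_\alpha(j) = \sum_{k \geq 1} \mathrm{P}_\alpha(X_k = j, \tau_\alpha \geq k)$ by propagating mass along paths that have not yet returned to the atom, then compares $\mu_\alpha Q$ with $\mu_\alpha$.

```python
# Finite-state sketch: the matrix Q and the atom index are illustrative assumptions.
Q = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]
ATOM = 0  # the singleton {0} plays the role of the atom alpha


def atom_measure(Q, atom, tol=1e-14, max_iter=10_000):
    """Return mu_alpha(j) = sum_{k>=1} P_alpha(X_k = j, tau_alpha >= k)."""
    n = len(Q)
    # v[j] = P_alpha(X_k = j, tau_alpha >= k); for k = 1 this is Q(alpha, j).
    v = list(Q[atom])
    mu = [0.0] * n
    for _ in range(max_iter):
        mu = [m + x for m, x in zip(mu, v)]
        # Advance one step along paths that have not yet returned to the atom.
        v = [sum(v[i] * Q[i][j] for i in range(n) if i != atom) for j in range(n)]
        if sum(v) < tol:
            break
    return mu


def left_multiply(mu, Q):
    """Row vector times matrix: (mu Q)(j) = sum_i mu(i) Q(i, j)."""
    n = len(Q)
    return [sum(mu[i] * Q[i][j] for i in range(n)) for j in range(n)]


mu_alpha = atom_measure(Q, ATOM)
mu_alpha_Q = left_multiply(mu_alpha, Q)
```

In this toy setting `mu_alpha[ATOM]` should be close to 1 (the normalization $\mu_\alpha(\alpha) = 1$ used in the proof) and `mu_alpha_Q` should agree with `mu_alpha` up to the truncation tolerance.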

We finally prove that $\mu_\alpha$ is a maximal irreducibility measure. Let $\psi$ be a maximal irreducibility measure and assume that $\psi(A) = 0$. Then $\mathrm{P}_x(\tau_A < \infty) = 0$ for $\psi$-almost all $x \in \mathsf{X}$. This obviously implies that $\mathrm{P}_x(\tau_A < \infty) = 0$ for $\psi$-almost all $x \in \alpha$. Because $\mathrm{P}_x(\tau_A < \infty)$ is constant over $\alpha$, we find that $\mathrm{P}_x(\tau_A < \infty) = 0$ for all $x \in \alpha$, and this yields $\mu_\alpha(A) = 0$. Thus $\mu_\alpha$ is absolutely continuous with respect to $\psi$, hence an irreducibility measure. Let $K_\delta$ again be the resolvent kernel. By Theorem 147, $\mu_\alpha K_\delta$ is a maximal irreducibility measure. But, as noted above, $\mu_\alpha K_\delta = \mu_\alpha$, and therefore $\mu_\alpha$ is a maximal irreducibility measure.

Proposition 173. Let $Q$ be a recurrent phi-irreducible transition kernel that admits an accessible $(1, \epsilon, \nu)$-small set $C$. Then it admits a non-trivial invariant measure $\pi$, unique up to multiplication by a constant and such that $0 < \pi(C) < \infty$, and any invariant measure is a maximal irreducibility measure.

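Proof. By (7.26), $(\mu Q)^\star = \mu^\star \check{Q}$, so that $\mu$ is $Q$-invariant if and only if $\mu^\star$ is $\check{Q}$-invariant. Let $\check{\mu}$ be a $\check{Q}$-invariant measure and define

\[
\mu = \int_{C \times \{0\}} \check{\mu}(d\check{x})\, R(x, \cdot) + \int_{C^c \times \{0\}} \check{\mu}(d\check{x})\, Q(x, \cdot) + \check{\mu}(\mathsf{X} \times \{1\})\, \nu .
\]

By application of the definition of the split kernel and measures, it can be checked that $\check{\mu} \check{Q} = \mu^\star$. Hence $\mu^\star = \check{\mu} \check{Q} = \check{\mu}$. We thus see that $\mu^\star$ is $\check{Q}$-invariant, which, as noted above, implies that $\mu$ is $Q$-invariant. Hence we have shown that there exists a $Q$-invariant measure if and only if there exists a $\check{Q}$-invariant one.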

If $Q$ is recurrent then $C$ is recurrent, and as appears in the proof of Proposition 173 this implies that the atom $\check{\alpha}$ is recurrent for the split chain $\check{Q}$. Thus, by Proposition 154 the kernel $\check{Q}$ is recurrent, and by Proposition 172 it admits an invariant measure that is unique up to a scaling factor. Hence $Q$ also admits an invariant measure, unique up to a scaling factor and such that $0 < \pi(C) < \infty$.

Let $\mu$ be $Q$-invariant. Then $\mu^\star$ is $\check{Q}$-invariant and hence, by Proposition 172, a maximal irreducibility measure. If $\mu(A) > 0$, then $\mu^\star(A \times \{0,1\}) = \mu(A) > 0$. Thus $A \times \{0,1\}$ is accessible, and this implies that $A$ is accessible. We conclude that $\mu$ is an irreducibility measure, and it is maximal because it is $K_\eta$-invariant.

If the kernel $Q$ is phi-irreducible and admits an accessible $(m, \epsilon, \nu)$-small set $C$, then, by Proposition 165, for any $\eta \in (0,1)$ the set $C$ is an accessible $(1, \epsilon', \nu)$-small set for the resolvent kernel $K_\eta$. If $C$ is recurrent for $Q$, it is also recurrent for $K_\eta$ and therefore, by Proposition 164, $K_\eta$ has a unique invariant probability measure. The following result shows that this probability measure is invariant also for $Q$.

Lemma 174. A measure $\mu$ on $(\mathsf{X}, \mathcal{X})$ is $Q$-invariant if and only if $\mu$ is $K_\eta$-invariant for some (hence for all) $\eta \in (0,1)$.

Proof. If µQ = µ, then obviously µQⁿ = µ for all n ≥ 0, so that µK_η = µ.

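Conversely, assume that $\mu K_\eta = \mu$. Because $K_\eta = \eta Q K_\eta + (1-\eta) Q^0$ and $Q K_\eta = K_\eta Q$, it holds that

\[
\mu = \mu K_\eta = \eta \mu Q K_\eta + (1-\eta) \mu = \eta \mu K_\eta Q + (1-\eta) \mu = \eta \mu Q + (1-\eta) \mu .
\]

Hence $\eta \mu Q = \eta \mu$, which concludes the proof.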
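Lemma 174 is easy to probe numerically on a finite chain. The sketch below assumes the series form $K_\eta = (1-\eta) \sum_{n \geq 0} \eta^n Q^n$ for the resolvent, which is consistent with the identity $K_\eta = \eta Q K_\eta + (1-\eta) Q^0$ used in the proof; the two-state matrix `Q`, the value of `ETA`, and the helper names are illustrative choices. It checks that the invariant probability of `Q` is also invariant for the truncated resolvent, while a non-invariant vector is moved by it.

```python
ETA = 0.5                        # illustrative value of eta in (0, 1)
Q = [[0.9, 0.1], [0.4, 0.6]]     # illustrative two-state kernel


def mat_mul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def vec_mat(mu, A):
    """Row vector times matrix: (mu A)(j) = sum_i mu(i) A(i, j)."""
    return [sum(mu[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def resolvent(Q, eta, n_terms=200):
    """Truncated series K_eta = (1 - eta) * sum_{n >= 0} eta^n Q^n."""
    n = len(Q)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # Q^0
    K = [[0.0] * n for _ in range(n)]
    weight = 1.0 - eta
    for _ in range(n_terms):
        K = [[K[i][j] + weight * power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, Q)
        weight *= eta
    return K


K = resolvent(Q, ETA)
pi = [0.8, 0.2]          # solves pi Q = pi for the matrix above
u = [0.5, 0.5]           # a probability vector that is not Q-invariant
pi_K = vec_mat(pi, K)    # expected to coincide with pi
u_K = vec_mat(u, K)      # expected to differ from u
```

Truncating the series at 200 terms leaves an error of order $\eta^{200}$, which is negligible here; for $\eta$ close to 1 more terms would be needed.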

Drift Conditions

We first give a sufficient condition for a chain to be positive, based on the expectation of the return time to an accessible small set.

Proposition 175. Let $Q$ be a transition kernel that admits an accessible small set $C$ such that

\[
\sup_{x \in C} \mathrm{E}_x[\tau_C] < \infty . \tag{7.31}
\]

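Then the chain is positive and the invariant probability measure $\pi$ satisfies, for all $A \in \mathcal{X}$,

\[
\pi(A) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] . \tag{7.32}
\]

If $f$ is a non-negative measurable function such that

\[
\sup_{x \in C} \mathrm{E}_x\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] < \infty , \tag{7.33}
\]

then $f$ is integrable with respect to $\pi$ and

\[
\pi(f) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} f(X_k) \right] .
\]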

Proof. First note that by Proposition 156, $Q$ is phi-irreducible. Equation (7.31) implies that $\mathrm{P}_x(\tau_C < \infty) = 1$ for all $x \in C$, that is, $C$ is Harris recurrent. By Proposition 167, $C$ is recurrent, and so, by Proposition 164, $Q$ is recurrent. Let $\pi$ be an invariant measure such that $0 < \pi(C) < \infty$, the existence of which is given by Proposition 173. Then define a measure $\mu_C$ on $\mathcal{X}$ by

\[
\mu_C(A) \stackrel{\mathrm{def}}{=} \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] .
\]

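Because $\tau_C < \infty$ $\mathrm{P}_y$-a.s. for all $y \in C$, it holds that $\mu_C(C) = \pi(C)$. Then we can show that $\mu_C(A) = \pi(A)$ for all $A \in \mathcal{X}$. The proof is along the same lines as the proof of Proposition 172 and is therefore omitted. Thus, $\mu_C$ is invariant. In addition, we obtain that for any measurable set $A$,

\[
\int_C \pi(dy)\, \mathrm{E}_y[1_A(X_0)] = \pi(A \cap C) = \mu_C(A \cap C) = \int_C \pi(dy)\, \mathrm{E}_y[1_A(X_{\tau_C})] ,
\]

and this yields

\[
\mu_C(A) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] .
\]

We thus obtain the following equivalent expressions for $\mu_C$:

\[
\begin{aligned}
\mu_C(A) &= \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] = \int_C \mu_C(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_A(X_k) \right] \\
&= \int_C \mu_C(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=1}^{\tau_C} 1_A(X_k) \right] = \pi(A) .
\end{aligned}
\]

Hence

\[
\pi(\mathsf{X}) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} 1_{\mathsf{X}}(X_k) \right] \leq \pi(C) \sup_{y \in C} \mathrm{E}_y[\tau_C] < \infty ,
\]

so that any invariant measure is finite and the chain is positive. Finally, under (7.33) we obtain that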

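\[
\pi(f) = \int_C \pi(dy)\, \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] \leq \pi(C) \sup_{y \in C} \mathrm{E}_y\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] < \infty .
\]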

Except in specific examples (where, for example, the invariant distribution is known in advance), it may be difficult to decide if a chain is positive or null. To check such properties, it is convenient to use drift conditions.

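Proposition 176. Assume that there exists a set $C \in \mathcal{X}$, two measurable functions $1 \leq f \leq V$, and a constant $b > 0$ such that

\[
QV \leq V - f + b 1_C . \tag{7.34}
\]

Then

\[
\mathrm{E}_x[\tau_C] \leq V(x) + b 1_C(x) , \tag{7.35}
\]

\[
\mathrm{E}_x[V(X_{\tau_C})] + \mathrm{E}_x\left[ \sum_{k=0}^{\tau_C - 1} f(X_k) \right] \leq V(x) + b 1_C(x) . \tag{7.36}
\]

If $C$ is an accessible small set and $V$ is bounded on $C$, then the chain is positive recurrent and $\pi(f) < \infty$.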

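Proof. For $n \geq 1$, set

\[
M_n = \left[ V(X_n) + \sum_{k=0}^{n-1} f(X_k) \right] 1\{\tau_C \geq n\} .
\]

Then

\[
\begin{aligned}
\mathrm{E}[M_{n+1} \mid \mathcal{F}_n] &= \left[ QV(X_n) + \sum_{k=0}^{n} f(X_k) \right] 1\{\tau_C \geq n+1\} \\
&\leq \left[ V(X_n) - f(X_n) + b 1_C(X_n) + \sum_{k=0}^{n} f(X_k) \right] 1\{\tau_C \geq n+1\} \\
&= \left[ V(X_n) + \sum_{k=0}^{n-1} f(X_k) \right] 1\{\tau_C \geq n+1\} \leq M_n ,
\end{aligned}
\]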

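as $1_C(X_n) 1\{\tau_C \geq n+1\} = 0$. Hence $\{M_n\}_{n \geq 1}$ is a non-negative super-martingale. For any integer $n$, $\tau_C \wedge n$ is a bounded stopping time, and Doob's optional stopping theorem shows that for any $x \in \mathsf{X}$,

\[
\mathrm{E}_x[M_{\tau_C \wedge n}] \leq \mathrm{E}_x[M_1] \leq V(x) + b 1_C(x) . \tag{7.37}
\]

Applying this relation with $f \equiv 1$ yields for any $x \in \mathsf{X}$ and $n \geq 0$,

\[
\mathrm{E}_x[\tau_C \wedge n] \leq V(x) + b 1_C(x) ,
\]

and (7.35) follows using monotone convergence. This implies in particular that $\mathrm{P}_x(\tau_C < \infty) = 1$ for any $x \in \mathsf{X}$. The proof of (7.36) follows similarly from (7.37) by letting $n \to \infty$. Finally, when $V$ is bounded on $C$, (7.35) and (7.36) imply (7.31) and (7.33), so that Proposition 175 applies: the chain is positive and $\pi(f)$ is finite.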

Example 177 (Random Walk on the Half-Line, Continued). Consider again the model of Example 153. Previously we have seen that sets of the form [0, c] are small.

If $\Gamma((-\infty, -c]) > 0$, then for $x \in [0, c]$,

\[
Q(x, A) \geq \Gamma((-\infty, -c])\, 1_A(0) ;
\]

otherwise there exists an integer $m$ such that $\Gamma^{*m}((-\infty, -c]) > 0$, whence $Q^m(x, A) \geq \Gamma^{*m}((-\infty, -c])\, 1_A(0)$.

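To prove recurrence for $\mu < 0$, we apply Proposition 176. Because $\mu < 0$, there exists $c > 0$ such that $\int_{-c}^{\infty} w\, \Gamma(dw) \leq \mu/2 < 0$. Thus, taking $V(x) = x$, we obtain for $x > c$

\[
\begin{aligned}
QV(x) - V(x) &= \int_{-\infty}^{\infty} \left[ (x+w)_+ - x \right] \Gamma(dw) \\
&= -x\, \Gamma((-\infty, -x]) + \int_{-x}^{\infty} w\, \Gamma(dw) \leq \mu/2 .
\end{aligned}
\]

Hence the chain is positive recurrent.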
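A concrete special case makes the drift argument tangible. The sketch below is an illustration under stated assumptions, not the general setting of the example: the increments take only the values $+1$ and $-1$ with $\mathrm{P}(W = +1) = 0.3$, the set is $C = \{0\}$, and the Lyapunov function is rescaled to $V(x) = 1 + x/|\mu|$ so that (7.34) holds with $f \equiv 1$. The names `P_UP`, `DRIFT`, `B`, `QV`, `drift_holds` and `return_time` are hypothetical helpers. The code checks the drift inequality state by state and compares a seeded Monte Carlo estimate of $\mathrm{E}_x[\tau_C]$ with the bound (7.35).

```python
import random

P_UP = 0.3                   # assumed P(W = +1); P(W = -1) = 0.7
DRIFT = 2 * P_UP - 1         # mean increment mu, negative here
B = 1.0 + P_UP / (-DRIFT)    # constant b in (7.34) for C = {0}


def V(x):
    """Lyapunov function V(x) = 1 + x / |mu|, scaled so that f = 1 works."""
    return 1.0 + x / (-DRIFT)


def QV(x):
    """One-step expectation E[V((x + W)_+)] for the +/-1 increments."""
    return P_UP * V(x + 1) + (1 - P_UP) * V(max(x - 1, 0))


def drift_holds(x):
    """Check QV(x) <= V(x) - 1 + b 1_C(x), up to floating-point rounding."""
    return QV(x) <= V(x) - 1.0 + (B if x == 0 else 0.0) + 1e-12


def return_time(x, rng):
    """Simulate tau_C = inf{n >= 1 : X_n = 0} for the chain started at x."""
    n = 0
    while True:
        x = max(x + (1 if rng.random() < P_UP else -1), 0)
        n += 1
        if x == 0:
            return n


rng = random.Random(0)       # fixed seed so the estimate is reproducible
x0 = 5
n_paths = 20_000
mean_tau = sum(return_time(x0, rng) for _ in range(n_paths)) / n_paths
bound = V(x0)                # right-hand side of (7.35) for x0 outside C
```

With these choices the drift inequality holds with equality at every state, so the bound (7.35) exceeds the exact expected return time only by the additive constant 1 built into `V`.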

Consider now the case $\mu > 0$. In view of Proposition 154, we have to show that the atom $\{0\}$ is transient. For any $n$, $X_n \geq X_0 + \sum_{i=1}^{n} W_i$. Define

\[
C_n = \left\{ \left| n^{-1} \sum_{i=1}^{n} W_i - \mu \right| \geq \mu/2 \right\}
\]

and write $D_n$ for $\{X_n = 0\}$. The strong law of large numbers implies that $\mathrm{P}_0(D_n \text{ i.o.}) \leq \mathrm{P}_0(C_n \text{ i.o.}) = 0$. Hence the atom $\{0\}$ is transient, and so is the chain.

When $\mu = 0$, additional assumptions on $\Gamma$ are needed to prove the recurrence of the RWHL (see for instance Meyn and Tweedie, 1993, Lemma 8.5.2).

Example 178 (Autoregressive Model, Continued). Consider again the model of Example 148 and assume that the noise process has zero mean and finite variance.

Choosing V(x) =x² we have

\[
QV(x) = \mathrm{E}[(\phi x + U_1)^2] = \phi^2 V(x) + \mathrm{E}[U_1^2] ,
\]

so that (7.34) holds when $C = [-M, M]$ for some large enough $M$, provided $|\phi| < 1$. Because we know that every compact set is small if the noise process has an everywhere continuous positive density, Proposition 176 shows that the chain is positive recurrent. Note that this approach provides an existence result but does not help us to determine $\pi$. If $\{U_k\}$ are Gaussian with zero mean and variance $\sigma^2$, then one can check that the invariant distribution also is Gaussian with zero mean and variance $\sigma^2/(1-\phi^2)$.
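The choice of $M$ and $b$ in this example can be made explicit. The sketch below is one possible instantiation under assumptions that are not in the text: it uses $V(x) = 1 + x^2$ rather than $x^2$ so that the requirement $1 \leq f \leq V$ of Proposition 176 holds with $f \equiv 1$, and it fixes illustrative values `PHI = 0.8` and `SIGMA2 = 1.0`. It checks (7.34) on a grid and iterates the variance recursion $v \mapsto \phi^2 v + \sigma^2$, whose fixed point is the stationary variance $\sigma^2/(1-\phi^2)$.

```python
import math

PHI = 0.8        # illustrative autoregression coefficient, |PHI| < 1
SIGMA2 = 1.0     # illustrative noise variance E[U_1^2]

# With V(x) = 1 + x^2 and f = 1, (7.34) outside C reduces to
# 1 + SIGMA2 <= (1 - PHI^2) x^2, which fixes the radius M of C = [-M, M].
M = math.sqrt((1.0 + SIGMA2) / (1.0 - PHI ** 2))
B = 1.0 + SIGMA2  # constant b: enough to absorb the excess inside C


def V(x):
    """Assumed Lyapunov function V(x) = 1 + x^2."""
    return 1.0 + x * x


def QV(x):
    """E[V(PHI x + U_1)] = 1 + PHI^2 x^2 + SIGMA2 for zero-mean noise."""
    return 1.0 + PHI ** 2 * x * x + SIGMA2


def drift_holds(x):
    """Check QV(x) <= V(x) - 1 + b 1_C(x), up to floating-point rounding."""
    in_C = abs(x) <= M
    return QV(x) <= V(x) - 1.0 + (B if in_C else 0.0) + 1e-9


grid = [i / 10.0 for i in range(-200, 201)]
drift_ok = all(drift_holds(x) for x in grid)

# Variance recursion of the AR(1) model started from a point mass at 0.
v = 0.0
for _ in range(200):
    v = PHI ** 2 * v + SIGMA2
stationary_variance = SIGMA2 / (1.0 - PHI ** 2)
```

Other pairs $(f, M, b)$ work equally well; taking $f$ proportional to $V$ instead of constant would additionally give $\pi(V) < \infty$ through Proposition 176.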

Theorem 170 shows that if a chain is phi-irreducible and recurrent then the chain is positive, that is, it admits a unique invariant probability measure $\pi$. In certain situations, and in particular when dealing with MCMC procedures, it is known that $Q$ admits an invariant probability measure, but it is not known, a priori, that the chain is recurrent. The following result shows that positivity implies recurrence.

Proposition 179. If the Markov kernel $Q$ is positive, then it is recurrent.

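Proof. Suppose that the chain is positive and let $\pi$ be an invariant probability measure. If $Q$ is transient, the state space $\mathsf{X}$ is covered by a countable family $\{A_j\}$ of uniformly transient subsets (see Theorem 151). For any $j$ and $k$,

\[
k \pi(A_j) = \sum_{n=1}^{k} \pi Q^n(A_j) \leq \int \pi(dx)\, \mathrm{E}_x[\eta_{A_j}] \leq \sup_{x \in \mathsf{X}} \mathrm{E}_x[\eta_{A_j}] . \tag{7.38}
\]

The strong Markov property implies that

\[
\begin{aligned}
\mathrm{E}_x[\eta_{A_j}] &= \mathrm{E}_x\left[ \eta_{A_j} 1\{\sigma_{A_j} < \infty\} \right] \\
&\leq \mathrm{E}_x\left\{ 1\{\sigma_{A_j} < \infty\}\, \mathrm{E}_{X_{\sigma_{A_j}}}[\eta_{A_j}] \right\} \leq \sup_{x \in A_j} \mathrm{E}_x[\eta_{A_j}]\, \mathrm{P}_x(\sigma_{A_j} < \infty) .
\end{aligned}
\]

Thus, the left-hand side of (7.38) is bounded as $k \to \infty$. This implies that $\pi(A_j) = 0$, and hence $\pi(\mathsf{X}) = 0$. This is a contradiction, so the chain cannot be transient.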

From the document Inference in Hidden Markov Models (pages 170-175)