37.05.07 · probability / 05-markov-chains

The Ergodic Theorem for Markov Chains and Detailed Balance

shipped · 3 tiers · Lean: none

Anchor (Master): Norris 1997 *Markov Chains* (Cambridge) §1.9-1.10; Levin-Peres 2017 *Markov Chains and Mixing Times* 2e Ch. 3, §12.1-12.2 (spectral consequences of reversibility); Meyn-Tweedie 2009 *Markov Chains and Stochastic Stability* 2e Ch. 17 (the law of large numbers for Markov chains); Robert-Casella 2004 *Monte Carlo Statistical Methods* 2e Ch. 6-7 (Metropolis-Hastings)

Intuition Beginner

Run a Markov chain for a very long time and keep a tally: of all the steps so far, what fraction landed on each state? The ergodic theorem says this fraction settles down to the equilibrium weight of that state, no matter where you started. The chain's travel diary — the long-run share of time spent in each place — reproduces the equilibrium distribution exactly. More than that: if you attach a number to each state, say a payoff, then the average payoff per step over a long run converges to the payoff you would get by drawing a single state from equilibrium. Time-averages turn into equilibrium-averages.

Why care? Because it lets you compute hard equilibrium quantities by simply running the chain and averaging. You do not need to solve for the equilibrium weights by hand; you let the chain do the work and read the answer off the diary. This single idea powers a huge family of practical algorithms.

The second theme is reversibility. Some chains run the same backwards as forwards: if you filmed a long equilibrium run and played the film in reverse, you could not tell which way time was flowing. The precise condition for this is a balance rule — for every pair of states, the equilibrium traffic flowing from the first to the second equals the traffic flowing back. This rule is called detailed balance. It is a stronger, more local condition than mere equilibrium, and chains that obey it are especially well-behaved.

Detailed balance is also a design tool. Suppose you have a target distribution you want to sample from — maybe the shape of some complicated probability law you can only evaluate up to an overall constant. You can build a chain, by hand, whose equilibrium is exactly that target, by arranging its moves to satisfy detailed balance. Then you run the chain, average along the way, and the ergodic theorem hands you estimates of the target. This recipe is called the Metropolis-Hastings algorithm, and it is one of the most-used computational ideas in all of science.

The takeaway: the ergodic theorem says long-run time-averages of a chain equal equilibrium-averages; detailed balance is the local balance rule for chains that look the same run backwards; and together they let you design a chain to sample any target distribution you like.

Visual Beginner

Picture a long run of a chain, the tally of time spent in each state, and the backwards-run symmetry of detailed balance.

Top: one long run of the chain. Counting the fraction of steps spent in each state and comparing it with the equilibrium bars on the right shows that they match; this is the ergodic theorem. Bottom: the detailed-balance picture. The flow of equilibrium probability from state $i$ to state $j$ in one step is the height $π_{i}$ times the move-chance $p_{ij}$; detailed balance says this equals the reverse flow $π_{j}$ times $p_{ji}$, so every individual pair of states is in balance, not just the totals.

Worked example Beginner

Take a chain on three states $\{1, 2, 3\}$ where from any state you stay put with probability $\frac{1}{2}$, and otherwise move to one of the other two states, each with probability $\frac{1}{4}$. By symmetry the equilibrium gives each state weight $\frac{1}{3}$. We will see two things: that the long-run time-share is $\frac{1}{3}$ for each state, and that the chain satisfies detailed balance.

Step 1. Check detailed balance against the equal-weight equilibrium. Pick states $1$ and $2$. The forward flow is the equilibrium weight of state $1$ times the chance of moving $1 \to 2$, which is $\frac{1}{3} \times \frac{1}{4} = \frac{1}{12}$. The backward flow is the weight of state $2$ times the chance of moving $2 \to 1$, which is $\frac{1}{3} \times \frac{1}{4} = \frac{1}{12}$. They are equal. The same holds for every pair, so detailed balance holds and the equal-weight distribution is the equilibrium.

Step 2. Predict the long-run time-share. The ergodic theorem says the fraction of steps spent in each state converges to its equilibrium weight $\frac{1}{3}$.

Step 3. Average a payoff. Suppose state $1$ pays $6$ dollars, state $2$ pays $0$, and state $3$ pays $3$. The equilibrium average payoff is $\frac{1}{3} \times 6 + \frac{1}{3} \times 0 + \frac{1}{3} \times 3 = 2 + 0 + 1 = 3$ dollars.

Step 4. Read the ergodic conclusion. Over a long run, the average payoff per step converges to $3$ dollars — the same number you would get by drawing one state from equilibrium and collecting its payoff. You did not need to know the equilibrium in advance to estimate this: a long enough run, averaged, would have revealed it.

What this tells us: detailed balance is a quick local check that a guessed distribution is the equilibrium, and once you have the equilibrium, the long-run average of any per-state payoff equals the equilibrium average of that payoff. Running and averaging replaces solving.
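The "run and average" idea of the worked example can be tried directly. The following is a minimal simulation sketch, not a definitive implementation: the helper names `step` and `run`, the seed, and the run length of 200,000 steps are illustrative assumptions; the transition rule and payoffs are those of the worked example.

```python
import random

# Worked-example chain: stay with probability 1/2, otherwise move to one
# of the two other states (probability 1/4 each).
STATES = [1, 2, 3]
PAYOFF = {1: 6.0, 2: 0.0, 3: 3.0}

def step(state, rng):
    """Advance the chain by one step from `state`."""
    if rng.random() < 0.5:
        return state
    return rng.choice([s for s in STATES if s != state])

def run(n_steps, start=1, seed=0):
    """Return (occupation fractions, average payoff per step)."""
    rng = random.Random(seed)  # seed chosen arbitrarily for reproducibility
    counts = {s: 0 for s in STATES}
    total = 0.0
    state = start
    for _ in range(n_steps):
        counts[state] += 1
        total += PAYOFF[state]
        state = step(state, rng)
    fractions = {s: counts[s] / n_steps for s in STATES}
    return fractions, total / n_steps

fractions, avg_payoff = run(200_000)
print(fractions)   # each entry should be near 1/3
print(avg_payoff)  # should be near the equilibrium average 3
```

Changing `start` should not change the limiting values, only the early part of the run, which is the "no matter where you started" clause of the theorem.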

Check your understanding Beginner

Exercise (easy, multiple choice).

The ergodic theorem for a Markov chain says that, over a long run, the average value of a per-state payoff converges to:

A. The largest payoff among the states
B. The payoff of the starting state
C. The equilibrium average of the payoff (each state's payoff weighted by its equilibrium weight)
D. Zero

Hint

Long-run time-shares match the equilibrium weights, so a time-average becomes an equilibrium-weighted average.

Answer

C. The equilibrium average of the payoff. The long-run fraction of time at each state equals its equilibrium weight, so averaging a per-state payoff over time converges to the equilibrium-weighted average of those payoffs. Feedback-correct: time-averages turn into equilibrium-averages — that is the whole content of the theorem. Feedback-wrong: the starting state is forgotten (so B fails), and the limit is a weighted blend, not the maximum (A) or zero (D).

Formal definition Intermediate+

Throughout, $(X_{n})_{n \geq 0}$ is a time-homogeneous Markov chain on a countable state space $I$ with stochastic transition matrix $P = (p_{ij})$, as in 37.05.01. Communication, irreducibility, and period are as in 37.05.02; the strong Markov property, the first-return time $T_{i} := \inf \{n \geq 1 : X_{n} = i\}$, and recurrence are as in 37.05.04. The invariant distribution theory (existence of a nonzero invariant measure for an irreducible recurrent chain, positive recurrence as finiteness of the mean return time $m_{i} := E_{i} [T_{i}]$, and the Kac identity $π_{i} = 1/m_{i}$) is 37.05.05, whose results are used freely. We write $E_{π} [f] := \sum_{i} π_{i} f(i)$ for a function $f : I \to \mathbb{R}$ and an invariant distribution $π$.

Definition (additive functional and time-average). Given $f : I \to \mathbb{R}$, the additive functional of $f$ along the chain is $S_{n}(f) := \sum_{k=0}^{n-1} f(X_{k})$, and the time-average (or empirical average) is $\frac{1}{n} S_{n}(f) = \frac{1}{n} \sum_{k=0}^{n-1} f(X_{k})$. The special case $f = 1_{\{i\}}$ gives $\frac{1}{n} S_{n}(1_{\{i\}}) = V_{i}(n)/n$, the fraction of the first $n$ steps spent at state $i$, where $V_{i}(n) := \sum_{k=0}^{n-1} 1_{\{X_{k} = i\}}$ is the occupation count.

Definition (reversibility and detailed balance). A measure $ν = (ν_{i})_{i \in I}$ with $ν_{i} \geq 0$ is in detailed balance with $P$ if $ν_{i} p_{ij} = ν_{j} p_{ji} \quad \text{for all } i, j \in I.$ A chain is reversible with respect to $ν$ if $ν$ is in detailed balance with $P$. When $ν = π$ is a probability distribution, reversibility is equivalent to the statement that the stationary chain run backwards is again a Markov chain with the same law: for $X_{0} \sim π$, the reversed sequence $(X_{n}, X_{n-1}, \dots, X_{0})$ has the same distribution as $(X_{0}, X_{1}, \dots, X_{n})$. The reversed transition matrix is $\hat{P} = (\hat{p}_{ij})$ with $\hat{p}_{ij} := π_{j} p_{ji} / π_{i}$ (for $π_{i} > 0$); reversibility is exactly $\hat{P} = P$.

Definition (Metropolis-Hastings chain). Fix a target distribution $π$ on $I$ with $π_{i} > 0$, known possibly only up to a normalising constant, and a proposal matrix $Q = (q_{ij})$ that is irreducible. The Metropolis-Hastings chain proposes a move $i \to j$ with probability $q_{ij}$ and accepts it with probability $α_{ij} := \min \left(1, \frac{π_{j} q_{ji}}{π_{i} q_{ij}}\right),$ otherwise staying at $i$. Its transition matrix is $p_{ij} = q_{ij} α_{ij}$ for $j \neq i$ and $p_{ii} = 1 - \sum_{j \neq i} q_{ij} α_{ij}$. Only the ratios $π_{j} / π_{i}$ enter $α_{ij}$, so the normalising constant of $π$ is never needed.

The chain on $\{1, 2, 3\}$ with stay-probability $\frac{1}{2}$ and equal $\frac{1}{4}$ moves to the other two states is reversible with respect to $π = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$, since $π_{i} p_{ij} = \frac{1}{3} \cdot \frac{1}{4} = π_{j} p_{ji}$ for $i \neq j$. The simple symmetric random walk on $\mathbb{Z}$ is reversible with respect to the counting measure $ν_{i} \equiv 1$, exhibiting a reversible measure without a reversible distribution (the chain is null recurrent, 37.05.05).
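The detailed-balance and invariance conditions are finite systems of equalities on a finite state space, so they can be checked mechanically. The sketch below does this in exact rational arithmetic for the three-state lazy chain; the function names `in_detailed_balance` and `is_invariant` are illustrative choices, not part of any library.

```python
from fractions import Fraction as F

def in_detailed_balance(nu, P):
    """True if nu[i] * P[i][j] == nu[j] * P[j][i] for every pair (i, j)."""
    n = len(nu)
    return all(nu[i] * P[i][j] == nu[j] * P[j][i]
               for i in range(n) for j in range(n))

def is_invariant(nu, P):
    """True if nu P == nu, with nu treated as a row vector."""
    n = len(nu)
    return all(sum(nu[i] * P[i][j] for i in range(n)) == nu[j]
               for j in range(n))

half, quarter = F(1, 2), F(1, 4)
lazy = [[half, quarter, quarter],
        [quarter, half, quarter],
        [quarter, quarter, half]]
uniform = [F(1, 3)] * 3

balanced = in_detailed_balance(uniform, lazy)
invariant = is_invariant(uniform, lazy)
print(balanced, invariant)
```

Using `Fraction` rather than floats keeps the equality tests exact, so a `True` here is a genuine verification for this matrix rather than an approximate one.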

Counterexamples to common slips Intermediate+

Detailed balance is sufficient for invariance but not necessary. Summing $ν_{i} p_{ij} = ν_{j} p_{ji}$ over $i$ gives $\sum_{i} ν_{i} p_{ij} = ν_{j} \sum_{i} p_{ji} = ν_{j}$, so $ν = ν P$. The converse fails: a directed three-cycle $1 \to 2 \to 3 \to 1$ has $π = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$ invariant, yet $π_{1} p_{12} = \frac{1}{3} \neq 0 = π_{2} p_{21}$, so it is stationary but not reversible.
The ergodic theorem needs positive recurrence, not merely recurrence. On a null recurrent chain the mean return time is infinite, the denominator in the renewal-reward ratio diverges, and the occupation fraction $V_{i}(n)/n$ tends to $0$ for every state $i$; more generally the time-average of any $f$ integrable against an invariant measure tends to $0$. The limit cannot be $E_{π} [f]$, which is undefined because no invariant distribution exists.
The ergodic theorem does not require aperiodicity. Unlike convergence to equilibrium 37.05.06, the law of large numbers for time-averages holds for periodic chains too: averaging over time washes out the cyclic rotation. Aperiodicity governs convergence of the marginal law $p_{ij}^{(n)}$ , a separate question.
The Metropolis acceptance ratio carries the proposal asymmetry. Omitting the $q_{ji} / q_{ij}$ factor (the Hastings correction) gives a chain reversible for the wrong distribution when the proposal is asymmetric. With a symmetric proposal $q_{ij} = q_{ji}$ the ratio collapses to $\min (1, π_{j} / π_{i})$, the original Metropolis rule.
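Two of the slips above can be seen on the directed three-cycle in a few lines. The sketch below (states relabelled $0, 1, 2$ for indexing; the helper name `occupation_fractions` is an illustrative assumption) checks that the uniform distribution is invariant but not in detailed balance, and that the occupation fractions of this period-3 chain still equal $\frac{1}{3}$ over a run whose length is a multiple of 3.

```python
from fractions import Fraction as F

# Directed cycle 0 -> 1 -> 2 -> 0 (the text's 1 -> 2 -> 3 -> 1, relabelled).
cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
pi = [F(1, 3)] * 3

# Invariance: (pi P)_j == pi_j for every j.
invariant = all(sum(pi[i] * cycle[i][j] for i in range(3)) == pi[j]
                for j in range(3))
# Detailed balance: pi_i p_ij == pi_j p_ji for every pair.
reversible = all(pi[i] * cycle[i][j] == pi[j] * cycle[j][i]
                 for i in range(3) for j in range(3))

def occupation_fractions(n_steps, start=0):
    """Exact occupation fractions of the deterministic cycle."""
    counts = [0, 0, 0]
    state = start
    for _ in range(n_steps):
        counts[state] += 1
        state = cycle[state].index(1)  # the unique successor of `state`
    return [F(c, n_steps) for c in counts]

fractions = occupation_fractions(3000)
print(invariant, reversible, fractions)
```

The chain started at a fixed state never converges in law (it is at a known state at every time), yet its time-averages settle immediately, which is the distinction the third item draws.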

Key theorem with proof Intermediate+

Theorem (ergodic theorem for Markov chains). Let $P$ be irreducible and positive recurrent with invariant distribution $π$, and let $f : I \to \mathbb{R}$ satisfy $\sum_{i} π_{i} |f(i)| < \infty$. Then for every initial distribution, almost surely $\frac{1}{n} \sum_{k=0}^{n-1} f(X_{k}) \longrightarrow \sum_{i} π_{i} f(i) = E_{π} [f] \quad (n \to \infty).$ In particular, taking $f = 1_{\{i\}}$, the long-run fraction of time at state $i$ satisfies $V_{i}(n)/n \to π_{i} = 1/m_{i}$ almost surely.

Proof. Fix a reference state $k$. By irreducibility and recurrence (37.05.04) the chain visits $k$ infinitely often, so the successive visit times $0 \leq τ_{0} < τ_{1} < τ_{2} < \dots$ to $k$ are almost surely finite; here $τ_{0} := \inf \{n \geq 0 : X_{n} = k\}$ and $τ_{r} := \inf \{n > τ_{r-1} : X_{n} = k\}$. Define the excursion sums and excursion lengths for $r \geq 1$, $W_{r} := \sum_{n = τ_{r-1}}^{τ_{r} - 1} f(X_{n}), \qquad L_{r} := τ_{r} - τ_{r-1}.$ By the strong Markov property at the stopping times $τ_{r-1}$ (37.05.04), the blocks $(W_{r}, L_{r})$ for $r \geq 1$ are independent and identically distributed, each distributed as a single excursion from $k$ back to $k$ started afresh at $k$.

Compute their means. The length has $E [L_{1}] = E_{k} [T_{k}] = m_{k}$, finite by positive recurrence. For the sum, using the excursion measure $γ_{i}^{k} = E_{k} \left[\sum_{n=0}^{T_{k} - 1} 1_{\{X_{n} = i\}}\right]$ of 37.05.05, which satisfies $γ^{k} = m_{k} π$ (since $π = γ^{k} / m_{k}$), $E [W_{1}] = E_{k} \left[\sum_{n=0}^{T_{k} - 1} f(X_{n})\right] = \sum_{i} f(i) γ_{i}^{k} = m_{k} \sum_{i} π_{i} f(i) = m_{k} E_{π} [f],$ the interchange of sum and expectation justified by $\sum_{i} |f(i)| γ_{i}^{k} = m_{k} \sum_{i} π_{i} |f(i)| < \infty$ ($π$-integrability) and absolute convergence. So $E [|W_{1}|] < \infty$ and $E [L_{1}] < \infty$.

Now squeeze. For $n \geq τ_{0}$, let $R = R(n) := \max \{r : τ_{r} \leq n\}$ be the number of completed excursions by time $n$; since the $τ_{r}$ are finite, $R(n) \to \infty$ a.s. Writing $S_{n}(f) = \sum_{n'=0}^{n-1} f(X_{n'})$ (the summation index is $n'$ because $k$ now denotes the reference state) and isolating the initial segment before $τ_{0}$ and the incomplete final excursion, $S_{n}(f) = \underbrace{\sum_{n' < τ_{0}} f(X_{n'})}_{\text{fixed, finite}} + \sum_{r=1}^{R} W_{r} + \underbrace{\sum_{n' = τ_{R}}^{n-1} f(X_{n'})}_{\text{partial block}}.$ Apply the strong law of large numbers 37.02.02 to the i.i.d. sequences $(W_{r})$ and $(L_{r})$: $\frac{1}{R} \sum_{r=1}^{R} W_{r} \to E [W_{1}] = m_{k} E_{π} [f], \qquad \frac{1}{R} \sum_{r=1}^{R} L_{r} = \frac{τ_{R} - τ_{0}}{R} \to E [L_{1}] = m_{k} \quad \text{a.s.}$ Because $τ_{R} \leq n < τ_{R+1}$, we have $τ_{R} / R \to m_{k}$ and $τ_{R+1} / R \to m_{k}$, hence $n / R \to m_{k}$ as well. The fixed initial segment contributes $o(n)$. The partial final block is bounded in absolute value by the full-block sum $\bar{W}_{R+1} := \sum_{n' = τ_{R}}^{τ_{R+1} - 1} |f(X_{n'})|$; the $\bar{W}_{r}$ are i.i.d. with finite mean $m_{k} E_{π} [|f|]$, so the SLLN applied to them forces $\bar{W}_{R+1} / R \to 0$, and the partial block is also $o(n)$. Therefore $\frac{S_{n}(f)}{n} = \frac{\frac{1}{R} \sum_{r \leq R} W_{r} + o(1)}{n / R} \longrightarrow \frac{m_{k} E_{π} [f]}{m_{k}} = E_{π} [f] \quad \text{a.s.}$ Setting $f = 1_{\{i\}}$ gives $V_{i}(n)/n \to π_{i} = 1/m_{i}$. $□$
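The excursion decomposition used in the proof can be observed numerically. The sketch below cuts one simulated path of the three-state lazy chain (states relabelled $0, 1, 2$, payoffs $6, 0, 3$ as in the worked example) into excursions from the reference state $k = 0$ and compares the block means with $m_{k} = 3$ and $m_{k} E_{π}[f] = 9$. The function names, the seed and the run length are illustrative assumptions; the chain is started at $k$ so that $τ_{0} = 0$.

```python
import random

STATES = [0, 1, 2]
F_VALUES = [6.0, 0.0, 3.0]  # f(0), f(1), f(2)

def step(state, rng):
    """Lazy chain: stay w.p. 1/2, else move to one of the other two states."""
    if rng.random() < 0.5:
        return state
    return rng.choice([s for s in STATES if s != state])

def excursion_blocks(n_steps, ref=0, seed=1):
    """Return lists (W, L) of excursion sums and lengths from `ref`."""
    rng = random.Random(seed)
    state = ref              # start at the reference state, so tau_0 = 0
    W, L = [], []
    w, length = 0.0, 0
    for _ in range(n_steps):
        w += F_VALUES[state]
        length += 1
        state = step(state, rng)
        if state == ref:     # excursion completed: record the block
            W.append(w)
            L.append(length)
            w, length = 0.0, 0
    return W, L              # the incomplete final block is discarded

W, L = excursion_blocks(200_000)
mean_W = sum(W) / len(W)
mean_L = sum(L) / len(L)
ratio = sum(W) / sum(L)
print(mean_L, mean_W, ratio)  # expected near 3, 9 and 3 respectively
```

The ratio `sum(W) / sum(L)` is the time-average over the completed excursions, which is how the proof turns two applications of the SLLN into one limit.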

Bridge. This theorem underlies every Monte Carlo estimate in the chapter, because it certifies that running one long path and averaging recovers an equilibrium expectation. The foundational reason it holds is the strong Markov restart of 37.05.04: each visit to $k$ launches an independent excursion, so the path decomposes into i.i.d. blocks and the SLLN of 37.02.02 applies to the block sums and block lengths separately. The ratio $E [W_{1}] / E [L_{1}] = E_{π} [f]$ is exactly the renewal-reward identity, and the Kac formula $π_{i} = 1/m_{i}$ of 37.05.05 is its special case $f = 1_{\{i\}}$ with reference state $k = i$, read as a reward of one unit per visit. Putting these together, the time-average is the reward rate of a renewal-reward process whose cycles are excursions, and the central insight is that positive recurrence, that is, finiteness of $m_{k}$, is exactly what makes the denominator finite and the rate well defined. This pathwise law of large numbers complements the marginal convergence $p_{ij}^{(n)} \to π_{j}$ of 37.05.06, replacing a statement about one expectation by almost-sure path behaviour, and it is the statement that turns designing a chain by detailed balance into a sampling algorithm.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Show that the Metropolis-Hastings chain with target $π$ and proposal $Q$ satisfies detailed balance with respect to $π$ , hence has $π$ as a stationary distribution.

Hint

For $i \neq j$, the transition probability is $p_{ij} = q_{ij} \min (1, π_{j} q_{ji} / (π_{i} q_{ij}))$. Compute $π_{i} p_{ij}$ and show it is symmetric in $i, j$.

Answer

For $i \neq j$, $π_{i} p_{ij} = π_{i} q_{ij} \min \left(1, \frac{π_{j} q_{ji}}{π_{i} q_{ij}}\right) = \min (π_{i} q_{ij}, π_{j} q_{ji})$, using $a \min (1, b / a) = \min (a, b)$ for $a > 0$. The right-hand side is symmetric under swapping $i \leftrightarrow j$: $\min (π_{i} q_{ij}, π_{j} q_{ji}) = \min (π_{j} q_{ji}, π_{i} q_{ij}) = π_{j} p_{ji}$. Hence $π_{i} p_{ij} = π_{j} p_{ji}$ for all $i \neq j$ (the case $i = j$ is automatic), so $π$ is in detailed balance with $P$. By Exercise 2, $π = π P$: the target is stationary. This is the design principle of Metropolis-Hastings: the acceptance ratio is chosen precisely so that the edge flows symmetrise.
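The symmetrisation argument can be confirmed on a concrete instance. The following sketch builds the Metropolis-Hastings transition matrix exactly from unnormalised weights and an asymmetric proposal, then checks detailed balance and stationarity. The weights `w`, the proposal `Q` and the function name `metropolis_hastings_matrix` are hypothetical choices made for illustration; every off-diagonal entry of `Q` is positive, so each acceptance ratio is well defined.

```python
from fractions import Fraction as F

def metropolis_hastings_matrix(w, Q):
    """Transition matrix p_ij = q_ij * min(1, w_j q_ji / (w_i q_ij)), j != i."""
    n = len(w)
    P = [[F(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and Q[i][j] > 0:
                accept = min(F(1), (w[j] * Q[j][i]) / (w[i] * Q[i][j]))
                P[i][j] = Q[i][j] * accept
        # Rejected proposals keep the chain at i.
        P[i][i] = 1 - sum(P[i][j] for j in range(n) if j != i)
    return P

w = [F(1), F(2), F(3), F(4)]           # unnormalised target weights
Q = [[F(0), F(1, 2), F(1, 4), F(1, 4)],  # asymmetric proposal, rows sum to 1
     [F(1, 3), F(0), F(1, 3), F(1, 3)],
     [F(1, 4), F(1, 4), F(0), F(1, 2)],
     [F(1, 2), F(1, 4), F(1, 4), F(0)]]

P = metropolis_hastings_matrix(w, Q)
n = len(w)
detailed_balance = all(w[i] * P[i][j] == w[j] * P[j][i]
                       for i in range(n) for j in range(n))
stationary = all(sum(w[i] * P[i][j] for i in range(n)) == w[j]
                 for j in range(n))
print(detailed_balance, stationary)
```

The checks use the unnormalised weights directly: both identities are homogeneous in $π$, so the constant $Z = \sum_{i} w_{i}$ never has to be computed.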

Exercise 6 (medium, multiple choice).

Which of the following hypotheses is not needed for the ergodic theorem $\frac{1}{n} \sum_{k < n} f(X_{k}) \to E_{π} [f]$ (the conclusion holds with or without it), even though it is essential for convergence of the marginal law?

A. Irreducibility
B. Positive recurrence
C. Aperiodicity
D. $π$-integrability of $f$

Hint

Time-averaging washes out the cyclic rotation; it is the marginal-law convergence that needs aperiodicity.

Answer

C. Aperiodicity is not needed. The excursion decomposition and the SLLN make no use of aperiodicity: a periodic positive-recurrent irreducible chain still has finite mean return times and i.i.d. excursions, so the time-average converges. Aperiodicity is required only for the marginal convergence $p_{ij}^{(n)} \to π_{j}$ of 37.05.06. Irreducibility (A), positive recurrence (B), and $π$ -integrability (D) are all genuinely needed.

Exercise 7 (hard, symbolic).

Let $P$ be irreducible and reversible with respect to a distribution $π$ ( $π_{i} > 0$ for all $i$ ). Show that $P$ is self-adjoint on the weighted space $ℓ^{2} (π)$ with inner product $⟨ f, g ⟩_{π} = \sum_{i} π_{i} f (i) g (i)$ , i.e. $⟨ P f, g ⟩_{π} = ⟨ f, P g ⟩_{π}$ where $(P f) (i) = \sum_{j} p_{ij} f (j)$ . Deduce that the eigenvalues of $P$ (on a finite state space) are real.

Hint

Expand $⟨ P f, g ⟩_{π} = \sum_{i} π_{i} (P f) (i) g (i) = \sum_{i, j} π_{i} p_{ij} f (j) g (i)$ and use detailed balance to symmetrise.

Answer

Expand $⟨ P f, g ⟩_{π} = \sum_{i} π_{i} (\sum_{j} p_{ij} f(j)) g(i) = \sum_{i, j} π_{i} p_{ij} f(j) g(i)$. Detailed balance $π_{i} p_{ij} = π_{j} p_{ji}$ rewrites the weight, giving $\sum_{i, j} π_{j} p_{ji} f(j) g(i)$. Relabelling $i \leftrightarrow j$ in the double sum turns this into $\sum_{i, j} π_{i} p_{ij} f(i) g(j) = \sum_{i} π_{i} f(i) (\sum_{j} p_{ij} g(j)) = ⟨ f, P g ⟩_{π}$. Hence $P$ is self-adjoint on $ℓ^{2} (π)$. On a finite state space $ℓ^{2} (π)$ is a finite-dimensional real inner-product space and $P$ a self-adjoint operator, so by the spectral theorem its eigenvalues are real (equivalently, $D^{1/2} P D^{-1/2}$ with $D = \operatorname{diag} (π)$ is a symmetric matrix similar to $P$). The spectral gap $1 - λ_{2}$ then controls the rate of convergence to equilibrium.
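A small numerical check of the self-adjointness identity and of the symmetrised matrix $D^{1/2} P D^{-1/2}$ may help. The birth-death chain, its reversible distribution $π = (\frac{1}{6}, \frac{1}{3}, \frac{1}{2})$ and the test functions below are hypothetical examples chosen for this sketch; they are not taken from the exercise.

```python
from fractions import Fraction as F
import math

# A birth-death chain on {0, 1, 2}; PI satisfies PI[i] P[i][j] == PI[j] P[j][i].
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 4), F(1, 2)],
     [F(0), F(1, 3), F(2, 3)]]
PI = [F(1, 6), F(1, 3), F(1, 2)]

def apply(P, f):
    """(P f)(i) = sum_j p_ij f(j)."""
    return [sum(P[i][j] * f[j] for j in range(len(f))) for i in range(len(f))]

def inner(f, g):
    """Weighted inner product <f, g>_pi = sum_i pi_i f(i) g(i)."""
    return sum(p * a * b for p, a, b in zip(PI, f, g))

f = [F(1), F(-2), F(5)]
g = [F(3), F(0), F(-1)]
lhs = inner(apply(P, f), g)   # <P f, g>_pi
rhs = inner(f, apply(P, g))   # <f, P g>_pi

# Symmetrised matrix A = D^{1/2} P D^{-1/2}, in floating point.
A = [[math.sqrt(float(PI[i])) * float(P[i][j]) / math.sqrt(float(PI[j]))
      for j in range(3)] for i in range(3)]
symmetric = all(math.isclose(A[i][j], A[j][i], abs_tol=1e-12)
                for i in range(3) for j in range(3))
print(lhs, rhs, symmetric)
```

The inner products are compared exactly with `Fraction`; only the square roots in `A` force floating point, hence the tolerance in the symmetry test.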

Exercise 8 (hard, symbolic).

Use the ergodic theorem to prove the ratio limit form of MCMC estimation: for an irreducible positive-recurrent chain with stationary $π$ and two $π$-integrable functions $f, g$ with $E_{π} [g] \neq 0$, $\frac{\sum_{k=0}^{n-1} f(X_{k})}{\sum_{k=0}^{n-1} g(X_{k})} \longrightarrow \frac{E_{π} [f]}{E_{π} [g]} \quad \text{a.s.}$ Explain why this lets one estimate $E_{π} [f]$ when $π$ is known only up to a normalising constant.

Hint

Apply the ergodic theorem to $f$ and to $g$ separately, then take the ratio.

Answer

By the ergodic theorem applied to $f$ and to $g$, almost surely $\frac{1}{n} \sum_{k < n} f(X_{k}) \to E_{π} [f]$ and $\frac{1}{n} \sum_{k < n} g(X_{k}) \to E_{π} [g]$. On the almost-sure event where both limits hold and $E_{π} [g] \neq 0$, the ratio of the two sums (after dividing top and bottom by $n$) converges to $E_{π} [f] / E_{π} [g]$ by the algebra of limits, the denominator being eventually nonzero. For unnormalised sampling: suppose $π_{i} = w_{i} / Z$ with known weights $w_{i}$ but unknown $Z = \sum_{i} w_{i}$. Build a chain (for instance by Metropolis-Hastings, which needs only the ratios $w_{j} / w_{i}$) stationary for $π$, and take $g \equiv 1$ so $E_{π} [g] = 1$; then $\frac{1}{n} \sum_{k < n} f(X_{k}) \to E_{π} [f]$ directly. More generally the ratio form lets the unknown $Z$ cancel between numerator and denominator, so expectations under $π$ are estimable from a single trajectory without ever computing $Z$, which is the operational heart of Markov chain Monte Carlo.
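As an illustration of estimating $E_{π}[f]$ from unnormalised weights, here is a minimal Metropolis sketch on four states. The weights, the uniform symmetric proposal, the seed and the run length are assumptions made for the example; the exact value is computed only to compare against, and the sampler itself never uses $Z$.

```python
import random

WEIGHTS = [1.0, 2.0, 3.0, 4.0]  # hypothetical unnormalised target w_i

def mh_step(i, rng):
    """One Metropolis step with a symmetric proposal (uniform over other states)."""
    j = rng.choice([s for s in range(len(WEIGHTS)) if s != i])
    # Symmetric proposal, so the acceptance probability is min(1, w_j / w_i).
    if rng.random() < min(1.0, WEIGHTS[j] / WEIGHTS[i]):
        return j
    return i

def estimate(f, n_steps, seed=2):
    """Time-average of f along one Metropolis trajectory started at state 0."""
    rng = random.Random(seed)
    state = 0
    total = 0.0
    for _ in range(n_steps):
        total += f(state)
        state = mh_step(state, rng)
    return total / n_steps

est = estimate(lambda i: float(i), 200_000)
# Reference value, computed with Z only for comparison: (0*1 + 1*2 + 2*3 + 3*4) / 10.
exact = sum(i * w for i, w in enumerate(WEIGHTS)) / sum(WEIGHTS)
print(est, exact)
```

Only the ratio `WEIGHTS[j] / WEIGHTS[i]` appears inside `mh_step`, which is the practical meaning of "known up to a normalising constant".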

Advanced results Master

The excursion decomposition that proves the ergodic theorem and the detailed-balance condition that defines reversibility meet in the Metropolis-Hastings construction: detailed balance fixes a prescribed stationary distribution, and the ergodic theorem turns one trajectory into a consistent estimator of every $π$ -expectation. Reversibility additionally makes the transition operator self-adjoint, so the convergence rate becomes a spectral gap.

Theorem 1 (ergodic theorem; central limit refinement). Let $P$ be irreducible positive recurrent with stationary $π$ and $f$ $π$-integrable. Then $\frac{1}{n} S_{n}(f) \to E_{π} [f]$ a.s. If moreover the excursion sums and lengths have finite variance, $\operatorname{Var} (W_{1}) < \infty$ and $\operatorname{Var} (L_{1}) < \infty$, then a central limit theorem holds: $\sqrt{n} \left(\frac{1}{n} S_{n}(f) - E_{π} [f]\right) \Rightarrow N (0, σ_{f}^{2})$ with asymptotic variance $σ_{f}^{2} = \operatorname{Var}_{π} (f) + 2 \sum_{k \geq 1} \operatorname{Cov}_{π} (f(X_{0}), f(X_{k}))$, the sum of the stationary autocovariances. The renewal-reward variance formula $σ_{f}^{2} = E [(W_{1} - E_{π} [f] L_{1})^{2}] / m_{k}$ exhibits $σ_{f}^{2}$ through the centred excursion reward, and its finiteness is the regeneration condition that makes Monte Carlo error bars meaningful.
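For a small chain the autocovariance series for $σ_{f}^{2}$ can be summed exactly. The sketch below does so for the three-state lazy chain with payoffs $(6, 0, 3)$, truncating the series at 40 terms (an arbitrary cutoff chosen for illustration). For this particular chain the centred payoff is an eigenvector of $P$ with eigenvalue $\frac{1}{4}$, so the series is geometric and sums to $6 \cdot (1 + \frac{2}{3}) = 10$.

```python
from fractions import Fraction as F

H, Q = F(1, 2), F(1, 4)
P = [[H, Q, Q], [Q, H, Q], [Q, Q, H]]   # lazy chain of the worked example
PI = [F(1, 3)] * 3
PAYOFF = [F(6), F(0), F(3)]

def apply(P, v):
    """(P v)(i) = sum_j p_ij v(j)."""
    return [sum(P[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

def asymptotic_variance(f, n_terms=40):
    """Var_pi(f) + 2 * sum_{k=1}^{n_terms} Cov_pi(f(X_0), f(X_k))."""
    mean = sum(p * x for p, x in zip(PI, f))
    centred = [x - mean for x in f]
    sigma2 = sum(p * c * c for p, c in zip(PI, centred))   # Var_pi(f)
    v = centred
    for _ in range(n_terms):
        v = apply(P, v)                                    # P^k applied to centred f
        sigma2 += 2 * sum(p * c * u for p, c, u in zip(PI, centred, v))
    return sigma2

sigma2 = asymptotic_variance(PAYOFF)
print(float(sigma2))  # close to 10 for this chain
```

The stationary variance alone is 6, so the positive autocorrelation of the lazy chain inflates the Monte Carlo variance by the factor $\frac{5}{3}$ relative to independent draws from $π$.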

Theorem 2 (reversibility, self-adjointness, and the spectral gap). If $P$ is irreducible and reversible with respect to $π$, then $P$ is self-adjoint on $ℓ^{2} (π)$; its spectrum is real and contained in $[-1, 1]$, with $1$ a simple eigenvalue (eigenvector the constants). On a finite state space order the eigenvalues $1 = λ_{1} > λ_{2} \geq \dots \geq λ_{|I|} \geq -1$ and set $λ_{⋆} = \max (λ_{2}, |λ_{|I|}|)$. Then $\| p_{i \cdot}^{(n)} - π \|_{TV} \leq \frac{1}{2} \sqrt{\frac{1 - π_{i}}{π_{i}}} \, λ_{⋆}^{n},$ so the absolute spectral gap $1 - λ_{⋆}$ governs the geometric rate of convergence. Reversibility is what makes this spectral analysis available: a non-reversible chain has a generally non-normal transition matrix whose convergence is not read off real eigenvalues alone. The Dirichlet form $\mathcal{E} (f, f) = \frac{1}{2} \sum_{i, j} π_{i} p_{ij} (f(i) - f(j))^{2}$ and the variational characterisation $1 - λ_{2} = \min \{\mathcal{E} (f, f) / \operatorname{Var}_{π} (f) : \operatorname{Var}_{π} (f) > 0\}$ make the gap estimable by test functions.
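The geometric decay can be checked exactly on the three-state lazy chain, whose transition matrix is $\frac{1}{4} I + \frac{1}{4} J$ (with $J$ the all-ones matrix) and so has eigenvalues $1, \frac{1}{4}, \frac{1}{4}$, giving $λ_{⋆} = \frac{1}{4}$. The sketch below compares the exact total-variation distance from state $0$ with the envelope $\frac{1}{2} \sqrt{(1 - π_{i}) / π_{i}} \, λ_{⋆}^{n}$; taking the prefactor in this square-root form is an assumption of the sketch, and the function names are illustrative.

```python
from fractions import Fraction as F
import math

H, Q = F(1, 2), F(1, 4)
P = [[H, Q, Q], [Q, H, Q], [Q, Q, H]]
PI = [F(1, 3)] * 3
LAMBDA_STAR = F(1, 4)   # second-largest eigenvalue modulus of this P

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tv_from_state(i, n_steps):
    """Exact total-variation distance between row i of P^n and PI (n >= 1)."""
    Pn = P
    for _ in range(n_steps - 1):
        Pn = mat_mul(Pn, P)
    return sum(abs(Pn[i][j] - PI[j]) for j in range(3)) / 2

def spectral_bound(i, n_steps):
    """Envelope (1/2) * sqrt((1 - pi_i) / pi_i) * lambda_star^n."""
    prefactor = 0.5 * math.sqrt(float((1 - PI[i]) / PI[i]))
    return prefactor * float(LAMBDA_STAR) ** n_steps

rows = [(n, float(tv_from_state(0, n)), spectral_bound(0, n))
        for n in range(1, 9)]
for n, tv, bound in rows:
    print(n, tv, bound)
```

For this chain the exact distance works out to $\frac{2}{3} (\frac{1}{4})^{n}$, so the envelope is off only by a constant factor; on less symmetric chains the gap between the two is typically larger.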

Theorem 3 (Metropolis-Hastings: reversibility for an arbitrary target). Given a target $π$ with $π_{i} > 0$ and an irreducible proposal $Q$, the Metropolis-Hastings chain with acceptance $α_{ij} = \min (1, π_{j} q_{ji} / (π_{i} q_{ij}))$ is reversible with respect to $π$. If moreover $q_{ij} > 0$ implies $q_{ji} > 0$, so that every proposed move is accepted with positive probability and the chain stays irreducible, then $π$ is its unique stationary distribution; if additionally the chain is aperiodic (for instance whenever some proposal is rejected with positive probability), it converges to $π$ in total variation, and the ergodic theorem yields $\frac{1}{n} \sum_{k < n} f(X_{k}) \to E_{π} [f]$ for $π$-integrable $f$. The acceptance rule depends on $π$ only through ratios $π_{j} / π_{i}$, so an unnormalised target $π_{i} \propto w_{i}$ is sampled with no knowledge of the partition function $Z = \sum_{i} w_{i}$. Special cases: the original Metropolis rule $α_{ij} = \min (1, π_{j} / π_{i})$ for symmetric proposals; the Gibbs sampler, where coordinate-wise conditional proposals are accepted with probability one; and Glauber dynamics for spin systems.

Theorem 4 (Birkhoff's pointwise ergodic theorem as the measure-theoretic parent). The Markov-chain ergodic theorem is the special case, for the shift on path space, of Birkhoff's theorem: for a measure-preserving transformation $Θ$ on a probability space $(Ω, \mathcal{F}, \mathbb{P})$ and $h \in L^{1}$, the averages $\frac{1}{n} \sum_{k < n} h (Θ^{k} ω)$ converge a.s. to the conditional expectation of $h$ given the invariant $σ$-field, which equals $E [h]$ when $Θ$ is ergodic. Taking $Ω = I^{\mathbb{N}}$ with the stationary law $P_{π}$, $Θ$ the shift, and $h (ω) = f (ω_{0})$, the shift is ergodic precisely when the transition matrix $P$ is irreducible (positive recurrent), recovering $\frac{1}{n} \sum_{k < n} f(X_{k}) \to E_{π} [f]$. The Markov excursion proof is the constructive, regeneration-based route to the same limit, and it additionally delivers the rate-and-variance information that the abstract theorem suppresses.

Synthesis. The foundational reason one trajectory suffices to compute every equilibrium expectation is the strong Markov restart of 37.05.04: each return to a fixed state regenerates the chain, so the path is a renewal-reward process whose i.i.d. cycles are excursions, and the SLLN of 37.02.02 on the cycle sums and cycle lengths gives the ratio $E_{π} [f]$. This is exactly the Kac formula $π_{i} = 1/m_{i}$ of 37.05.05 read for the reward $f = 1_{\{i\}}$, and it is dual to the marginal convergence $p_{ij}^{(n)} \to π_{j}$ of 37.05.06: the ergodic theorem is the pathwise (almost-sure) twin of that distributional limit, the two together asserting that both the time-average and the ensemble-average collapse onto $π$. Putting these together with detailed balance, reversibility makes $P$ self-adjoint, so the same positive-recurrence equilibrium is now equipped with a real spectrum whose gap quantifies the rate; the central insight is that designing $α_{ij}$ to symmetrise the edge flow forces $π$ to be the reversible stationary law of a chain we can simulate, after which the ergodic theorem converts simulation into estimation. The bridge is that the abstract Birkhoff theorem and the concrete excursion argument generalise the elementary $π_{i} = 1/m_{i}$ to arbitrary observables, and the Metropolis-Hastings construction is the engineering inverse: prescribe $π$, build the reversible chain, and let the ergodic theorem read $E_{π} [f]$ off a single run.

Full proof set Master

Proposition 1 (detailed balance implies stationarity). If $ν \geq 0$ satisfies $ν_{i} p_{ij} = ν_{j} p_{j i}$ for all $i, j$ , then $ν = ν P$ .

Proof. Fix $j$ . Summing the detailed-balance identity over $i$ , $\sum_{i} ν_{i} p_{ij} = \sum_{i} ν_{j} p_{j i} = ν_{j} \sum_{i} p_{j i} = ν_{j}$ , the last step by row-stochasticity $\sum_{i} p_{j i} = 1$ . Thus $(ν P)_{j} = ν_{j}$ for every $j$ , i.e. $ν = ν P$ . $□$

Proposition 2 (reversal characterisation). Let $π$ be a stationary distribution with $π_{i} > 0$ and set $\hat{p}_{ij} := π_{j} p_{ji} / π_{i}$. Then $\hat{P} = (\hat{p}_{ij})$ is stochastic and is the transition matrix of the time-reversed stationary chain; $P$ is reversible with respect to $π$ iff $\hat{P} = P$, equivalently iff for $X_{0} \sim π$ the vector $(X_{0}, \dots, X_{n})$ has the same law as its reversal $(X_{n}, \dots, X_{0})$ for every $n$.

Proof. Stochasticity: $\sum_{j} \hat{p}_{ij} = \sum_{j} π_{j} p_{ji} / π_{i} = (π P)_{i} / π_{i} = π_{i} / π_{i} = 1$ by stationarity. For the reversal, under $P_{π}$ the finite-dimensional law is $P_{π} (X_{0} = i_{0}, \dots, X_{n} = i_{n}) = π_{i_{0}} p_{i_{0} i_{1}} \cdots p_{i_{n-1} i_{n}}$. Reading the same path backwards and inserting $π_{i_{m}} p_{i_{m} i_{m+1}} = π_{i_{m+1}} \hat{p}_{i_{m+1} i_{m}}$ repeatedly gives $π_{i_{n}} \hat{p}_{i_{n} i_{n-1}} \cdots \hat{p}_{i_{1} i_{0}}$, the law of the $\hat{P}$-chain started from $π$ run over the reversed indices. Hence the reversed process is Markov with matrix $\hat{P}$, and equality of forward and reversed laws is $\hat{P} = P$, i.e. $π_{j} p_{ji} = π_{i} p_{ij}$ for all $i, j$, which is detailed balance. $□$

Proposition 3 (Metropolis-Hastings satisfies detailed balance). For a target $π$ with $π_{i} > 0$ and irreducible proposal $Q$, the chain $p_{ij} = q_{ij} \min (1, π_{j} q_{ji} / (π_{i} q_{ij}))$ ($j \neq i$) is reversible with respect to $π$.

Proof. For $i \neq j$ with $q_{ij} > 0$, using $a \min (1, b / a) = \min (a, b)$ with $a = π_{i} q_{ij} > 0$ and $b = π_{j} q_{ji}$, $π_{i} p_{ij} = π_{i} q_{ij} \min \left(1, \frac{π_{j} q_{ji}}{π_{i} q_{ij}}\right) = \min (π_{i} q_{ij}, π_{j} q_{ji}).$ If $q_{ij} = 0$ the move is never proposed, so $p_{ij} = 0$, which again equals $\min (π_{i} q_{ij}, π_{j} q_{ji})$. The right-hand expression is symmetric in $(i, j)$, so it equals $\min (π_{j} q_{ji}, π_{i} q_{ij}) = π_{j} p_{ji}$. The case $i = j$ is an identity (both sides equal $π_{i} p_{ii}$). Thus $π_{i} p_{ij} = π_{j} p_{ji}$ for all $i, j$, which is detailed balance; by Proposition 1, $π = π P$. Irreducibility of the resulting $P$ follows from irreducibility of $Q$ provided every proposed edge is accepted with positive probability, that is, provided $q_{ij} > 0$ implies $q_{ji} > 0$; this gives uniqueness of $π$ as the stationary distribution. $□$

Proposition 4 (self-adjointness on $ℓ^{2} (π)$ ). If $P$ is reversible with respect to $π$ ( $π_{i} > 0$ ), then $⟨ P f, g ⟩_{π} = ⟨ f, P g ⟩_{π}$ for all $f, g \in ℓ^{2} (π)$ , where $⟨ f, g ⟩_{π} = \sum_{i} π_{i} f (i) g (i)$ .

Proof. $⟨ P f, g ⟩_{π} = \sum_{i} π_{i} (\sum_{j} p_{ij} f (j)) g (i) = \sum_{i, j} π_{i} p_{ij} f (j) g (i)$ . Detailed balance replaces $π_{i} p_{ij}$ by $π_{j} p_{j i}$ , giving $\sum_{i, j} π_{j} p_{j i} f (j) g (i)$ ; relabelling the summation indices $i \leftrightarrow j$ yields $\sum_{i, j} π_{i} p_{ij} f (i) g (j) = \sum_{i} π_{i} f (i) (\sum_{j} p_{ij} g (j)) = ⟨ f, P g ⟩_{π}$ . Absolute convergence of the double sums for $f, g \in ℓ^{2} (π)$ (Cauchy-Schwarz against the probability weights $π_{i} p_{ij}$ ) justifies the rearrangement. $□$

Proposition 5 (ergodic theorem: occupation form). For irreducible positive-recurrent $P$ with stationary $π$ , the occupation fraction satisfies $V_{i} (n) / n \to π_{i}$ a.s. for every $i$ .

Proof. Apply the Key theorem with $f = 1_{\{i\}}$, which is $π$-integrable ($\sum_{j} π_{j} |f(j)| = π_{i} \leq 1$). The excursion sum $W_{r} = \sum_{n = τ_{r-1}}^{τ_{r} - 1} 1_{\{X_{n} = i\}}$ counts the visits to $i$ in one excursion from the reference $k$, with $E [W_{1}] = γ_{i}^{k} = m_{k} π_{i}$, while $E [L_{1}] = m_{k}$. The SLLN-ratio gives $V_{i}(n)/n = \frac{1}{n} S_{n}(1_{\{i\}}) \to E [W_{1}] / E [L_{1}] = π_{i}$. Taking $k = i$ specialises to $π_{i} = 1/m_{i}$, recovering the Kac identity of 37.05.05 as the long-run visit frequency. $□$

Proposition 6 (the directed cycle is stationary but not reversible). The chain on $\{1, 2, 3\}$ with $p_{12} = p_{23} = p_{31} = 1$ has stationary $π = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$ but admits no distribution in detailed balance with $P$.

Proof. Stationarity: $(π P)_{1} = \sum_{i} π_{i} p_{i 1} = π_{3} p_{31} = \frac{1}{3} = π_{1}$ , and symmetrically for the other coordinates, so $π P = π$ . For detailed balance, any $ν$ would need $ν_{1} p_{12} = ν_{2} p_{21}$ ; but $p_{12} = 1$ and $p_{21} = 0$ , forcing $ν_{1} = 0$ , and cyclically $ν_{2} = ν_{3} = 0$ . The only solution is $ν \equiv 0$ , so no nonzero reversible measure exists; the chain runs strictly forward and its reversal is the opposite cycle, a different matrix. $□$

Connections Master

Invariant measures and positive/null recurrence 37.05.05 supplies the equilibrium $π$ and the mean return time $m_{i}$ that the ergodic theorem averages onto: the excursion measure $γ^{k} = m_{k} π$ constructed there is exactly the expected excursion reward in the renewal-reward decomposition here, and the Kac identity $π_{i} = 1/m_{i}$ is the ergodic theorem evaluated at the indicator $f = 1_{\{i\}}$, so that unit's static equilibrium becomes this unit's dynamic time-average.
The strong law of large numbers 37.02.02 is the analytic engine: the i.i.d. excursion sums $W_{r}$ and lengths $L_{r}$ produced by the strong Markov restart are summed by the SLLN, and the ratio of the two limits is the whole content of the Markov ergodic theorem, so the classical law of large numbers for independent summands lifts, through regeneration, to dependent chains.
Convergence to equilibrium via coupling 37.05.06 is the distributional twin: that unit proves $p_{ij}^{(n)} \to π_{j}$ (an ensemble statement requiring aperiodicity), while this unit proves the pathwise $\frac{1}{n} \sum_{k < n} 1_{\{X_{k} = j\}} \to π_{j}$ (a time-average holding even with periodicity), and for a reversible chain the spectral gap bounds the rate of convergence in the former.
The strong Markov property and recurrence/transience dichotomy 37.05.04 licenses the excursion decomposition by restarting the chain at each visit to the reference state, the same stopping-time machinery that there established the visit-with-certainty of a recurrent class; without it the path could not be cut into independent identically distributed blocks.
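
The contrast between the distributional limit and the pathwise time-average, noted in the comparison with 37.05.06 above, shows up already on the smallest periodic chain. The sketch below is an assumed toy example: the two-state flip chain and the helper names `n_step_prob` and `occupation_fraction` are illustrative choices, not taken from either unit. The $n$-step probability $p_{00}^{(n)}$ alternates between $0$ and $1$ and so has no limit, while the fraction of time spent in state 0 still settles at $π_{0} = \frac{1}{2}$.

```python
import random

# Two-state flip chain with period 2 (an assumed toy example): 0 -> 1 -> 0 -> ...
P = [[0, 1],
     [1, 0]]


def n_step_prob(P, i, j, n):
    """Return p_ij^(n) by pushing the point mass at i through P n times."""
    size = len(P)
    row = [1 if s == i else 0 for s in range(size)]
    for _ in range(n):
        row = [sum(row[a] * P[a][b] for a in range(size)) for b in range(size)]
    return row[j]


def occupation_fraction(P, start, target, n_steps, seed=0):
    """Fraction of the first n_steps times at which the simulated path sits at target."""
    rng = random.Random(seed)
    states = list(range(len(P)))
    x = start
    visits = 0
    for _ in range(n_steps):
        visits += (x == target)
        x = rng.choices(states, weights=P[x])[0]
    return visits / n_steps


# Ensemble view: p_00^(n) for n = 1..6 oscillates and never converges.
print([n_step_prob(P, 0, 0, n) for n in range(1, 7)])
# Pathwise view: the occupation fraction of state 0 approaches 1/2.
print(occupation_fraction(P, start=0, target=0, n_steps=1001))
```

Because this particular chain is deterministic, the seed has no effect on the path; the random draw is kept only so the same helper works for any transition matrix.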

Historical & philosophical context Master

The pointwise ergodic theorem in its general measure-preserving form is due to George Birkhoff in 1931, who proved that time-averages of an integrable observable along the orbits of a measure-preserving transformation converge almost everywhere; the mean-square version is John von Neumann's, proved slightly earlier in 1931 and published in 1932. For Markov chains the specialisation — that the long-run fraction of time in a state equals its stationary mass, and more generally that empirical averages converge to stationary expectations — was developed through the renewal-theoretic and excursion viewpoint codified in the textbook tradition, presented cleanly by Norris ^{[Norris 1997]} from the regeneration structure of returns to a fixed state. The general-state-space law of large numbers, run off a small set rather than a single recurrent point, is the Harris-chain theory of Meyn and Tweedie.

Reversibility and the detailed-balance condition entered probability from statistical physics, where microscopic reversibility (the time-symmetry of equilibrium fluctuations) is a foundational principle; Lars Onsager's reciprocal relations are its macroscopic shadow. The algorithmic exploitation of detailed balance is the Markov chain Monte Carlo method, initiated by Nicholas Metropolis and collaborators in 1953 ^{[Metropolis 1953]} for simulating equilibrium configurations of interacting particles, where the acceptance rule $\min(1, π_{j}/π_{i})$ was introduced to drive a chain to a prescribed Boltzmann distribution. W. K. Hastings generalised the construction in 1970 ^{[Hastings 1970]} to arbitrary proposal kernels by inserting the correction ratio $q_{ji}/q_{ij}$, giving the Metropolis-Hastings algorithm now standard across Bayesian statistics and computational physics, with the systematic statistical development given by Robert and Casella ^{[Robert-Casella 2004]}.

Bibliography Master

@book{Norris1997,
  author    = {Norris, James R.},
  title     = {Markov Chains},
  series    = {Cambridge Series in Statistical and Probabilistic Mathematics},
  publisher = {Cambridge University Press},
  year      = {1997}
}

@article{Birkhoff1931,
  author  = {Birkhoff, George D.},
  title   = {Proof of the ergodic theorem},
  journal = {Proceedings of the National Academy of Sciences},
  volume  = {17},
  number  = {12},
  year    = {1931},
  pages   = {656--660}
}

@article{Metropolis1953,
  author  = {Metropolis, Nicholas and Rosenbluth, Arianna W. and Rosenbluth, Marshall N. and Teller, Augusta H. and Teller, Edward},
  title   = {Equation of state calculations by fast computing machines},
  journal = {Journal of Chemical Physics},
  volume  = {21},
  number  = {6},
  year    = {1953},
  pages   = {1087--1092}
}

@article{Hastings1970,
  author  = {Hastings, W. Keith},
  title   = {Monte Carlo sampling methods using Markov chains and their applications},
  journal = {Biometrika},
  volume  = {57},
  number  = {1},
  year    = {1970},
  pages   = {97--109}
}

@book{RobertCasella2004,
  author    = {Robert, Christian P. and Casella, George},
  title     = {Monte Carlo Statistical Methods},
  edition   = {2},
  publisher = {Springer},
  year      = {2004}
}

@book{LevinPeres2017,
  author    = {Levin, David A. and Peres, Yuval},
  title     = {Markov Chains and Mixing Times},
  edition   = {2},
  publisher = {American Mathematical Society},
  year      = {2017}
}

Prerequisites

37.05.05
37.02.02

Tier anchors

beginner: Norris 1997 *Markov Chains* (Cambridge) §1.10; informal picture of the long-run share of time a wandering chain spends in each state matching the equilibrium weights, and of a chain that runs the same forwards as backwards
intermediate: Norris 1997 *Markov Chains* (Cambridge) §1.10 (ergodic theorem), §1.9 (reversibility, detailed balance); Levin-Peres 2017 *Markov Chains and Mixing Times* 2e §1.6, §3.1-3.2 (reversibility, Metropolis chains)
master: Norris 1997 *Markov Chains* (Cambridge) §1.9-1.10; Levin-Peres 2017 *Markov Chains and Mixing Times* 2e Ch. 3, §12.1-12.2 (spectral consequences of reversibility); Meyn-Tweedie 2009 *Markov Chains and Stochastic Stability* 2e Ch. 17 (the law of large numbers for Markov chains); Robert-Casella 2004 *Monte Carlo Statistical Methods* 2e Ch. 6-7 (Metropolis-Hastings)

References

Norris — Markov Chains · Cambridge University Press 1997, §1.9 (reversibility and detailed balance), §1.10 (the ergodic theorem)
Levin-Peres — Markov Chains and Mixing Times, 2e · American Mathematical Society 2017, §1.6 (reversibility), §3.1-3.2 (Metropolis chains and Glauber dynamics), §12.1-12.2 (the spectral gap of a reversible chain)
Metropolis-Rosenbluth-Rosenbluth-Teller-Teller — Equation of state calculations by fast computing machines · Journal of Chemical Physics 21 (1953), 1087-1092 (the original Metropolis algorithm)
Hastings — Monte Carlo sampling methods using Markov chains and their applications · Biometrika 57 (1970), 97-109 (the Metropolis-Hastings generalisation)
Robert-Casella — Monte Carlo Statistical Methods, 2e · Springer 2004, Ch. 6-7 (the Metropolis-Hastings algorithm, detailed-balance design)

Estimated time

beginner: 18m
intermediate: 54m
master: 90m