37.05.07 · probability / 05-markov-chains

The Ergodic Theorem for Markov Chains and Detailed Balance

shipped3 tiersLean: none

Anchor (Master): Norris 1997 *Markov Chains* (Cambridge) §1.9-1.10; Levin-Peres 2017 *Markov Chains and Mixing Times* 2e Ch. 3, §12.1-12.2 (spectral consequences of reversibility); Meyn-Tweedie 2009 *Markov Chains and Stochastic Stability* 2e Ch. 17 (the law of large numbers for Markov chains); Robert-Casella 2004 *Monte Carlo Statistical Methods* 2e Ch. 6-7 (Metropolis-Hastings)

Intuition Beginner

Run a Markov chain for a very long time and keep a tally: of all the steps so far, what fraction landed on each state? The ergodic theorem says this fraction settles down to the equilibrium weight of that state, no matter where you started. The chain's travel diary — the long-run share of time spent in each place — reproduces the equilibrium distribution exactly. More than that: if you attach a number to each state, say a payoff, then the average payoff per step over a long run converges to the payoff you would get by drawing a single state from equilibrium. Time-averages turn into equilibrium-averages.

Why care? Because it lets you compute hard equilibrium quantities by simply running the chain and averaging. You do not need to solve for the equilibrium weights by hand; you let the chain do the work and read the answer off the diary. This single idea powers a huge family of practical algorithms.

The second theme is reversibility. Some chains run the same backwards as forwards: if you filmed a long equilibrium run and played the film in reverse, you could not tell which way time was flowing. The precise condition for this is a balance rule — for every pair of states, the equilibrium traffic flowing from the first to the second equals the traffic flowing back. This rule is called detailed balance. It is a stronger, more local condition than mere equilibrium, and chains that obey it are especially well-behaved.

Detailed balance is also a design tool. Suppose you have a target distribution you want to sample from — maybe the shape of some complicated probability law you can only evaluate up to an overall constant. You can build a chain, by hand, whose equilibrium is exactly that target, by arranging its moves to satisfy detailed balance. Then you run the chain, average along the way, and the ergodic theorem hands you estimates of the target. This recipe is called the Metropolis-Hastings algorithm, and it is one of the most-used computational ideas in all of science.

The takeaway: the ergodic theorem says long-run time-averages of a chain equal equilibrium-averages; detailed balance is the local balance rule for chains that look the same run backwards; and together they let you design a chain to sample any target distribution you like.

Visual Beginner

Picture a long run of a chain, the tally of time spent in each state, and the backwards-run symmetry of detailed balance.

Top: one long run of the chain. Counting the fraction of steps spent in each state and comparing to the equilibrium bars on the right shows them matching — this is the ergodic theorem. Bottom: the detailed-balance picture. The flow of equilibrium probability from state to state in one step is the height times the move-chance ; detailed balance says this equals the reverse flow times , so every individual pair of states is in balance, not just the totals.

Worked example Beginner

Take a chain on three states where from any state you stay put with probability , and otherwise move to one of the other two states, each with probability . By symmetry the equilibrium gives each state weight . We will see two things: that the long-run time-share is for each state, and that the chain satisfies detailed balance.

Step 1. Check detailed balance against the equal-weight equilibrium. Pick states and . The forward flow is the equilibrium weight of state times the chance of moving , which is . The backward flow is the weight of state times the chance of moving , which is . They are equal. The same holds for every pair, so detailed balance holds and the equal-weight distribution is the equilibrium.

Step 2. Predict the long-run time-share. The ergodic theorem says the fraction of steps spent in each state converges to its equilibrium weight .

Step 3. Average a payoff. Suppose state pays dollars, state pays , and state pays . The equilibrium average payoff is dollars.

Step 4. Read the ergodic conclusion. Over a long run, the average payoff per step converges to dollars — the same number you would get by drawing one state from equilibrium and collecting its payoff. You did not need to know the equilibrium in advance to estimate this: a long enough run, averaged, would have revealed it.

What this tells us: detailed balance is a quick local check that a guessed distribution is the equilibrium, and once you have the equilibrium, the long-run average of any per-state payoff equals the equilibrium average of that payoff. Running and averaging replaces solving.

Check your understanding Beginner

Formal definition Intermediate+

Throughout, is a time-homogeneous Markov chain on a countable state space with stochastic transition matrix , as in 37.05.01. Communication, irreducibility, and period are as in 37.05.02; the strong Markov property, the first-return time , and recurrence are as in 37.05.04. The invariant distribution theory — existence of a nonzero invariant measure for an irreducible recurrent chain, positive recurrence as finiteness of the mean return time , and the Kac identity — is 37.05.05, whose results are used freely. We write for a function and an invariant distribution .

Definition (additive functional and time-average). Given , the additive functional of along the chain is , and the time-average (or empirical average) is . The special case gives , the fraction of the first steps spent at state , where is the occupation count.

Definition (reversibility and detailed balance). A measure with is in detailed balance with if A chain is reversible with respect to if is in detailed balance with . When is a probability distribution, reversibility is equivalent to the statement that the stationary chain run backwards is again a Markov chain with the same law: for , the reversed sequence has the same distribution as . The reversed transition matrix is (for ); reversibility is exactly .

Definition (Metropolis-Hastings chain). Fix a target distribution on with , known possibly only up to a normalising constant, and a proposal matrix that is irreducible. The Metropolis-Hastings chain proposes a move with probability and accepts it with probability otherwise staying at . Its transition matrix is for and . Only the ratios enter , so the normalising constant of is never needed.

The chain on with stay-probability and equal moves to the other two states is reversible with respect to , since for . The simple symmetric random walk on is reversible with respect to the counting measure , exhibiting a reversible measure without a reversible distribution (the chain is null recurrent, 37.05.05).

Counterexamples to common slips Intermediate+

  • Detailed balance is sufficient for invariance but not necessary. Summing over gives , so . The converse fails: a directed three-cycle has invariant, yet , so it is stationary but not reversible.

  • The ergodic theorem needs positive recurrence, not merely recurrence. On a null recurrent chain the mean return time is infinite, the denominator in the renewal-reward ratio diverges, and the time-average of a -integrable tends to , not to (which is undefined, as no invariant distribution exists).

  • The ergodic theorem does not require aperiodicity. Unlike convergence to equilibrium 37.05.06, the law of large numbers for time-averages holds for periodic chains too: averaging over time washes out the cyclic rotation. Aperiodicity governs convergence of the marginal law , a separate question.

  • The Metropolis acceptance ratio carries the proposal asymmetry. Omitting the factor (the Hastings correction) gives a chain reversible for the wrong distribution when the proposal is asymmetric. With a symmetric proposal the ratio collapses to , the original Metropolis rule.

Key theorem with proof Intermediate+

Theorem (ergodic theorem for Markov chains). Let be irreducible and positive recurrent with invariant distribution , and let satisfy . Then for every initial distribution, almost surely In particular, taking , the long-run fraction of time at state satisfies almost surely.

Proof. Fix a reference state . By irreducibility and recurrence (37.05.04) the chain visits infinitely often, so the successive visit times to are almost surely finite; here and . Define the excursion sums and excursion lengths for , By the strong Markov property at the stopping times (37.05.04), the blocks for are independent and identically distributed, each distributed as a single excursion from back to started afresh at .

Compute their means. The length has , finite by positive recurrence. For the sum, using the excursion measure of 37.05.05, which satisfies (since ), the interchange of sum and expectation justified by (-integrability) and absolute convergence. So and .

Now squeeze. For , let be the number of completed excursions by time ; since the are finite, a.s. Writing and isolating the initial segment before and the incomplete final excursion, Apply the strong law of large numbers 37.02.02 to the i.i.d. sequences and : Because , we have and , hence as well. The fixed initial segment contributes , and the partial final block is bounded in expectation per excursion and contributes along the recurrence times by the standard renewal-reward estimate (its length satisfies , and the SLLN for over a block controls the partial sum). Therefore Setting gives .

Bridge. This theorem builds toward the entire computational and ergodic-theoretic life of the chapter and appears again in every Monte Carlo estimate, because it certifies that running one long path and averaging recovers an equilibrium expectation. The foundational reason it holds is the strong Markov restart of 37.05.04: each visit to launches an independent excursion, so the path decomposes into i.i.d. blocks and the SLLN of 37.02.02 applies to the block sums and block lengths separately. The ratio is exactly the renewal-reward identity, and this is dual to the Kac formula of 37.05.05, which is the case read as a reward of one unit per visit. Putting these together, the time-average is the reward rate of a renewal-reward process whose cycles are excursions, and the central insight is that positive recurrence — finiteness of — is exactly what makes the denominator finite and the rate well defined. The bridge is that this pathwise law of large numbers generalises the marginal convergence of 37.05.06 from one expectation to almost-sure path behaviour, and it is the statement that makes designing a chain by detailed balance into a sampling algorithm.

Exercises Intermediate+

Advanced results Master

The excursion decomposition that proves the ergodic theorem and the detailed-balance condition that defines reversibility meet in the Metropolis-Hastings construction: detailed balance fixes a prescribed stationary distribution, and the ergodic theorem turns one trajectory into a consistent estimator of every -expectation. Reversibility additionally makes the transition operator self-adjoint, so the convergence rate becomes a spectral gap.

Theorem 1 (ergodic theorem; central limit refinement). Let be irreducible positive recurrent with stationary and -integrable. Then a.s. If moreover the excursion sums have finite variance, and , then a central limit theorem holds: with asymptotic variance , the sum of the stationary autocovariances. The renewal-reward variance formula exhibits through the centred excursion reward, and its finiteness is the regeneration condition that makes Monte Carlo error bars meaningful.

Theorem 2 (reversibility, self-adjointness, and the spectral gap). If is irreducible and reversible with respect to , then is self-adjoint on ; its spectrum is real and contained in , with a simple eigenvalue (eigenvector the constants). On a finite state space order the eigenvalues and set . Then so the absolute spectral gap governs the geometric rate of convergence. Reversibility is what makes this spectral analysis available: a non-reversible chain has a generally non-normal transition matrix whose convergence is not read off real eigenvalues alone. The Dirichlet form and the variational characterisation make the gap estimable by test functions.

Theorem 3 (Metropolis-Hastings: reversibility for an arbitrary target). Given a target with and an irreducible proposal , the Metropolis-Hastings chain with acceptance is reversible with respect to , hence has as its unique stationary distribution; if additionally it is aperiodic (for instance whenever some proposal is rejected with positive probability), it converges to in total variation, and the ergodic theorem yields for -integrable . The acceptance rule depends on only through ratios , so an unnormalised target is sampled with no knowledge of the partition function . Special cases: the original Metropolis rule for symmetric proposals; the Gibbs sampler, where coordinate-wise conditional proposals are accepted with probability one; and Glauber dynamics for spin systems.

Theorem 4 (Birkhoff's pointwise ergodic theorem as the measure-theoretic parent). The Markov-chain ergodic theorem is the special case, for the shift on path space, of Birkhoff's theorem: for a measure-preserving transformation on a probability space and , the averages converge a.s. to the conditional expectation of given the invariant -field, which equals when is ergodic. Taking with the stationary law , the shift, and , the shift is ergodic precisely when is irreducible (positive recurrent), recovering . The Markov excursion proof is the constructive, regeneration-based route to the same limit, and it additionally delivers the rate-and-variance information that the abstract theorem suppresses.

Synthesis. The foundational reason one trajectory suffices to compute every equilibrium expectation is the strong Markov restart of 37.05.04: each return to a fixed state regenerates the chain, so the path is a renewal-reward process whose i.i.d. cycles are excursions, and the SLLN of 37.02.02 on the cycle sums and cycle lengths gives the ratio . This is exactly the Kac formula of 37.05.05 read for the reward , and it is dual to the marginal convergence of 37.05.06: the ergodic theorem is the pathwise (almost-sure) twin of that distributional limit, the two together asserting that both the time-average and the ensemble-average collapse onto . Putting these together with detailed balance, reversibility makes self-adjoint, so the same positive-recurrence equilibrium is now equipped with a real spectrum whose gap quantifies the rate; the central insight is that designing to symmetrise the edge flow forces to be the reversible stationary law of a chain we can simulate, after which the ergodic theorem converts simulation into estimation. The bridge is that the abstract Birkhoff theorem and the concrete excursion argument generalise the elementary to arbitrary observables, and the Metropolis-Hastings construction is the engineering inverse: prescribe , build the reversible chain, and let the ergodic theorem read off a single run.

Full proof set Master

Proposition 1 (detailed balance implies stationarity). If satisfies for all , then .

Proof. Fix . Summing the detailed-balance identity over , , the last step by row-stochasticity . Thus for every , i.e. .

Proposition 2 (reversal characterisation). Let be a stationary distribution with and set . Then is stochastic and is the transition matrix of the time-reversed stationary chain; is reversible with respect to iff , equivalently iff for the vector has the same law as its reversal for every .

Proof. Stochasticity: by stationarity. For the reversal, under the finite-dimensional law is . Reading the same path backwards and inserting repeatedly gives , the law of the -chain started from run over the reversed indices. Hence the reversed process is Markov with matrix , and equality of forward and reversed laws is , i.e. for all — detailed balance.

Proposition 3 (Metropolis-Hastings satisfies detailed balance). For a target with and irreducible proposal , the chain () is reversible with respect to .

Proof. For , using with and , The right-hand expression is symmetric in , so it equals . The case is an identity (both sides equal ). Thus for all , which is detailed balance; by Proposition 1, . Irreducibility of the resulting on the support of follows from irreducibility of and positivity of the acceptance on at least the proposed edges, giving uniqueness of as the stationary distribution.

Proposition 4 (self-adjointness on ). If is reversible with respect to (), then for all , where .

Proof. . Detailed balance replaces by , giving ; relabelling the summation indices yields . Absolute convergence of the double sums for (Cauchy-Schwarz against the probability weights ) justifies the rearrangement.

Proposition 5 (ergodic theorem: occupation form). For irreducible positive-recurrent with stationary , the occupation fraction satisfies a.s. for every .

Proof. Apply the Key theorem with , which is -integrable (). The excursion sum counts the visits to in one excursion from the reference , with , while . The SLLN-ratio gives . Taking specialises to , recovering the Kac identity of 37.05.05 as the long-run visit frequency.

Proposition 6 (the directed cycle is stationary but not reversible). The chain on with has stationary but admits no distribution in detailed balance with .

Proof. Stationarity: , and symmetrically for the other coordinates, so . For detailed balance, any would need ; but and , forcing , and cyclically . The only solution is , so no nonzero reversible measure exists; the chain runs strictly forward and its reversal is the opposite cycle, a different matrix.

Connections Master

  • Invariant measures and positive/null recurrence 37.05.05 supplies the equilibrium and the mean return time that the ergodic theorem averages onto: the excursion measure constructed there is exactly the expected excursion reward in the renewal-reward decomposition here, and the Kac identity is the ergodic theorem evaluated at the indicator , so that unit's static equilibrium becomes this unit's dynamic time-average.

  • The strong law of large numbers 37.02.02 is the analytic engine: the i.i.d. excursion sums and lengths produced by the strong Markov restart are summed by the SLLN, and the ratio of the two limits is the whole content of the Markov ergodic theorem, so the classical law of large numbers for independent summands lifts, through regeneration, to dependent chains.

  • Convergence to equilibrium via coupling 37.05.06 is the distributional twin: that unit proves (an ensemble statement requiring aperiodicity), while this unit proves the pathwise (a time-average holding even with periodicity), and the spectral gap of a reversible chain bounds the coupling-time tail that controls the former.

  • The strong Markov property and recurrence/transience dichotomy 37.05.04 licenses the excursion decomposition by restarting the chain at each visit to the reference state, the same stopping-time machinery that there established the visit-with-certainty of a recurrent class; without it the path could not be cut into independent identically distributed blocks.

Historical & philosophical context Master

The pointwise ergodic theorem in its general measure-preserving form is due to George Birkhoff in 1931, who proved that time-averages of an integrable observable along the orbits of a measure-preserving transformation converge almost everywhere; the mean-square version is John von Neumann's, proved slightly earlier in 1931 and published in 1932. For Markov chains the specialisation — that the long-run fraction of time in a state equals its stationary mass, and more generally that empirical averages converge to stationary expectations — was developed through the renewal-theoretic and excursion viewpoint codified in the textbook tradition, presented cleanly by Norris [Norris 1997] from the regeneration structure of returns to a fixed state. The general-state-space law of large numbers, run off a small set rather than a single recurrent point, is the Harris-chain theory of Meyn and Tweedie.

Reversibility and the detailed-balance condition entered probability from statistical physics, where microscopic reversibility — the time-symmetry of equilibrium fluctuations — is a foundational principle; Lars Onsager's reciprocal relations are its macroscopic shadow. The algorithmic exploitation of detailed balance is the Markov chain Monte Carlo method, initiated by Nicholas Metropolis and collaborators in 1953 [Metropolis 1953] for simulating equilibrium configurations of interacting particles, where the acceptance rule was introduced to drive a chain to a prescribed Boltzmann distribution. W. K. Hastings generalised the construction in 1970 [Hastings 1970] to arbitrary proposal kernels by inserting the correction ratio , giving the Metropolis-Hastings algorithm now standard across Bayesian statistics and computational physics, with the systematic statistical development given by Robert and Casella [Robert-Casella 2004].

Bibliography Master

@book{Norris1997,
  author    = {Norris, James R.},
  title     = {Markov Chains},
  series    = {Cambridge Series in Statistical and Probabilistic Mathematics},
  publisher = {Cambridge University Press},
  year      = {1997}
}

@article{Birkhoff1931,
  author  = {Birkhoff, George D.},
  title   = {Proof of the ergodic theorem},
  journal = {Proceedings of the National Academy of Sciences},
  volume  = {17},
  number  = {12},
  year    = {1931},
  pages   = {656--660}
}

@article{Metropolis1953,
  author  = {Metropolis, Nicholas and Rosenbluth, Arianna W. and Rosenbluth, Marshall N. and Teller, Augusta H. and Teller, Edward},
  title   = {Equation of state calculations by fast computing machines},
  journal = {Journal of Chemical Physics},
  volume  = {21},
  number  = {6},
  year    = {1953},
  pages   = {1087--1092}
}

@article{Hastings1970,
  author  = {Hastings, W. Keith},
  title   = {Monte Carlo sampling methods using Markov chains and their applications},
  journal = {Biometrika},
  volume  = {57},
  number  = {1},
  year    = {1970},
  pages   = {97--109}
}

@book{RobertCasella2004,
  author    = {Robert, Christian P. and Casella, George},
  title     = {Monte Carlo Statistical Methods},
  edition   = {2},
  publisher = {Springer},
  year      = {2004}
}

@book{LevinPeres2017,
  author    = {Levin, David A. and Peres, Yuval},
  title     = {Markov Chains and Mixing Times},
  edition   = {2},
  publisher = {American Mathematical Society},
  year      = {2017}
}