07.05.15 · representation-theory / symmetric

The Bernoulli-Laplace and Ehrenfest urn diffusion models

shipped3 tiersLean: nonepending prereqs

Anchor (Master): Diaconis-Shahshahani 1987 SIAM J. Math. Anal. 18 (Time to reach stationarity in the Bernoulli-Laplace diffusion model); Diaconis 1988 IMS Lecture Notes 11 Ch. 3F + Ch. 7; Ceccherini-Silberstein-Scarabotti-Tolli 2008 Harmonic Analysis on Finite Groups (CUP) Ch. 6

Intuition Beginner

Imagine two boxes and a pile of balls. The Ehrenfest urn has balls split between the two boxes. At each step you pick one ball at random and move it to the other box. Over time the count in the left box wanders up and down, but it spends most of its time near half-full. The question is: starting from all balls on the left, how many steps until you can no longer tell which box things began in?

The Bernoulli-Laplace model is the swapping cousin. Now each box holds exactly balls, one box red and one box white at the start. At each step you grab one ball from each box and swap them. The number of red balls in the left box slowly settles toward an even mix.

Both models ask the same thing: how long until the system forgets where it started? The remarkable fact is that the exact answer comes from a classical family of polynomials.

Visual Beginner

Picture the count of balls in the left urn as a marker hopping along a line of positions from to . Near the ends the marker is strongly pulled back toward the middle; near the middle it drifts almost freely. The distribution of the marker starts as a spike at one end and spreads into a bell shape centered at .

The arrows of the restoring pull get shorter as you approach the center. That shrinking pull is what the eigenvalues record: small wiggles relax slowly, big swings relax fast.

Worked example Beginner

Take the Ehrenfest urn with balls. The left urn can hold or balls. If the left urn has balls, then with probability you move one out (count drops to ) and with probability you move one in (count rises to ).

Start with all on the left. Step 1: a ball must leave, so the count goes to for certain. Step 2: from , the count drops to with probability or returns to with probability .

The long-run distribution is the binomial: position has probability , giving . The center is six times as likely as either end.

The smallest eigenvalue gap is . So the leading correction to stationarity shrinks by a factor of each step, and after a handful of steps the memory of starting at is mostly gone.

Check your understanding Beginner

Formal definition Intermediate+

The Ehrenfest urn. Identify a configuration of labelled balls across two urns with a point of the hypercube : coordinate is or according to which urn holds ball . One step picks a coordinate uniformly and flips it. This is the (lazy-free) random walk on driven by the measure uniform on the standard generators . The Hamming-weight projection collapses the walk to a birth-death chain on with transition kernel

This projected chain is the Ehrenfest chain. Its stationary distribution is , and it is reversible with respect to .

The Bernoulli-Laplace diffusion. Fix and . Take two urns of balls each (so balls in all), with marked balls distributed between them. One step draws a ball from each urn uniformly and swaps them. The number of marked balls in the left urn is a birth-death chain on with

and the holding probability accounts for swaps that exchange like-colored balls. For unequal urn sizes the rates carry the obvious mixed denominators. The chain lives on the Johnson scheme : states are the -subsets of an -set, and the walk is the natural -symmetric chain on . Its stationary distribution is hypergeometric, .

Eigenfunctions. The Ehrenfest chain is diagonalized by the Krawtchouk polynomials , orthogonal with respect to the binomial weight; the Bernoulli-Laplace chain is diagonalized by the dual-Hahn polynomials, orthogonal with respect to the hypergeometric weight. In both cases these are the spherical functions of the underlying association scheme 07.05.06, and the eigenvalue attached to the degree- polynomial is

The sign convention is the standard probabilist's one: eigenvalues lie in , with for the constant eigenfunction.

Counterexamples to common slips

  • Confusing the hypercube walk with its projection. The walk on has states; the Ehrenfest chain is its lumping to states by Hamming weight. They share eigenvalues but the multiplicities differ — the hypercube has eigenfunctions at level , the projected chain has one.
  • Laziness changes the spectrum, not the eigenfunctions. Adding a holding probability replaces by a convex combination with ; the Krawtchouk and dual-Hahn eigenvectors are unchanged. Cutoff statements depend on which version is used.
  • The Bernoulli-Laplace eigenvalues are not . The swap dynamics give a quadratic-in- gap structure (dual Hahn), unlike the linear Krawtchouk gaps of Ehrenfest. Reading off the Ehrenfest formula for Bernoulli-Laplace gives the wrong cutoff constant.

Key theorem with proof Intermediate+

Theorem (Krawtchouk diagonalization of the Ehrenfest chain). Let be the Ehrenfest transition kernel on . For the Krawtchouk polynomial

is a right eigenfunction of with eigenvalue :

Proof. Lift to the hypercube. On the walk averages a function over single-coordinate flips: . The characters of are for , and they are an eigenbasis: flipping coordinate multiplies by when and by otherwise, so

Thus every character of weight is an eigenfunction with eigenvalue . Project: an eigenfunction of on the hypercube that depends only on the Hamming weight descends to an eigenfunction of the lumped chain with the same eigenvalue. Summing over all of size produces a weight-only function,

since the number of of size meeting the support of in points contributes the alternating count above. This is the degree- Krawtchouk polynomial, and it inherits eigenvalue .

Bridge. This computation builds toward the upper-bound-lemma estimate of 07.05.05 specialized to a commutative walk: because is abelian, the character sum over irreducible representations collapses to a single sum over Krawtchouk polynomials, and this is exactly the abelian face of the non-abelian Fourier program. The foundational reason the orthogonal polynomials appear is that they are the spherical functions of the Hamming association scheme 07.05.06, so this is dual to the Gelfand-pair zonal-spherical theory of 07.05.13: the central insight is that diagonalizing a symmetric Markov chain is computing the spherical functions of its scheme, and it appears again in the dual-Hahn analysis of Bernoulli-Laplace, where the same projection from generalises the hypercube case.

Exercises Intermediate+

Advanced results Master

Theorem 1 (Bernoulli-Laplace cutoff — Diaconis-Shahshahani 1987). For the balanced Bernoulli-Laplace diffusion on with , the total variation distance to stationarity exhibits a cutoff at : for any fixed ,

a fixed profile decreasing from to , so the walk is mixed at and not before. The proof diagonalizes the chain by dual-Hahn polynomials 07.05.06, identifies the leading eigenvalue , and feeds the eigenvalue spectrum into the upper-bound lemma 07.05.05; the matching lower bound uses the variance of the slowest dual-Hahn eigenfunction (a second-moment / Wilson argument).

Theorem 2 (Ehrenfest cutoff). The lazy Ehrenfest walk on , equivalently the random walk that flips a uniform coordinate, mixes with a cutoff at in total variation. The Krawtchouk eigenvalues make the upper-bound sum a single elementary expression whose transition occurs at ; the same constant governs the projected birth-death chain on Hamming weights. This is the canonical hypercube cutoff of 07.05.08.

Theorem 3 (spectral gap and relaxation time). The Ehrenfest gap is with relaxation time ; the balanced Bernoulli-Laplace gap is with relaxation time . In both cases the relaxation time is order while the mixing time is order , the logarithmic factor being the signature of cutoff: many eigenvalues near the spectral edge conspire so that mixing takes a multiple of the relaxation time.

Theorem 4 (Ornstein-Uhlenbeck scaling limit). Center and rescale the Ehrenfest count: . As , converges weakly to the stationary Ornstein-Uhlenbeck process , whose generator has Hermite-polynomial eigenfunctions with eigenvalues . The Krawtchouk polynomials converge to Hermite polynomials under this scaling, and the discrete eigenvalues become the continuous , matching the OU spectrum.

Synthesis. These two models are the foundational reason representation theory and classical orthogonal polynomials are the same subject viewed from two sides: the Ehrenfest chain is diagonalized by Krawtchouk polynomials and the Bernoulli-Laplace chain by dual-Hahn polynomials, and this is exactly the statement that the spherical functions of the Hamming and Johnson schemes 07.05.06 are these polynomial families. Putting these together with the upper-bound lemma 07.05.05, the entire mixing analysis reduces to estimating one elementary sum, and the central insight is that the abelian (or Gelfand-pair) symmetry forces the matrix Fourier transform to collapse to a single orthogonal-polynomial eigenvalue at each level. The pattern generalises: every reversible chain with a transitive symmetry group is diagonalized by the spherical functions of its association scheme, so Ehrenfest and Bernoulli-Laplace are the worked prototypes that build toward the general theory, and the Ornstein-Uhlenbeck limit shows the bridge is to continuous diffusions, where Krawtchouk degenerates to Hermite and the cutoff constant persists as the relaxation-to-mixing logarithmic gap. The Bernoulli-Laplace quadratic-gap structure is dual to the Ehrenfest linear-gap structure exactly as the Johnson scheme is dual to the Hamming scheme.

Full proof set Master

Proposition 1 (stationarity and reversibility of the Ehrenfest chain). The binomial law is reversible for the Ehrenfest kernel , .

Proof. Reversibility is the detailed-balance identity . The left side is and the right side is . Using ,

Both sides agree, so is reversible, hence stationary.

Proposition 2 (the abelian upper-bound lemma is exact at the leading order for Ehrenfest). For the Ehrenfest walk on started at , the total variation distance satisfies, for with slowly,

Proof sketch. The upper bound of Exercise 7 gives . With the term is and the tail is , so the bound is , giving . The matching lower bound applies the second-moment method to the linear eigenfunction : under its mean is while under it has mean and standard deviation , so the two laws are separated by a constant multiple of standard deviations, which Chebyshev converts into the lower bound. The two bounds match to leading order.

Proposition 3 (Bernoulli-Laplace eigenvalues are the dual-Hahn spectrum). For the balanced model , the eigenvalues of the swap chain are , , with the degree- dual-Hahn polynomial as eigenfunction.

Proof. The swap operator is -invariant on , and is a Gelfand pair, so decomposes multiplicity-free into -irreducibles with indexed by the two-row partition . The swap operator acts as the scalar on by Schur's lemma. Evaluating the operator on the spherical function (the zonal eigenvector), which is the degree- dual-Hahn polynomial in the overlap variable , computes the scalar: the action of swapping one pair changes the overlap by with the stated quadratic rates, and the dual-Hahn three-term recurrence forces the eigenvalue . The multiplicity-free decomposition guarantees these are all the eigenvalues.

Connections Master

  • Association schemes and Krawtchouk/Hahn polynomials 07.05.06. The eigenfunctions of both urn chains are the spherical functions of the Hamming scheme (Krawtchouk) and the Johnson scheme (dual Hahn). That unit supplies the Bose-Mesner algebra and the orthogonal-polynomial families; this unit is its first dynamical payoff, turning the eigenvalue matrices into Markov-chain spectra.

  • Random walk on a finite group; Upper Bound Lemma 07.05.05. The mixing bounds here are the commutative specialization of the upper-bound lemma: because is abelian and is a Gelfand pair, the matrix character sum collapses to a single orthogonal-polynomial eigenvalue per level, making the bound an elementary sum.

  • Cutoff phenomenon 07.05.08. The transitions of Ehrenfest and Bernoulli-Laplace are two of the original cutoff examples; this unit supplies the orthogonal-polynomial diagonalization that the abstract cutoff machinery of that unit consumes, and the Ehrenfest hypercube walk is the cutoff prototype stated there.

  • Partially ranked data and Gelfand pairs 07.05.13. Bernoulli-Laplace lives on the same coset space as partially ranked data; the zonal spherical functions used there to analyze ranking statistics are exactly the dual-Hahn eigenfunctions diagonalizing the swap chain here, so the two units share their harmonic analysis.

  • De Finetti exchangeability 07.05.14. The Bernoulli-Laplace stationary law is exchangeable on the -subset, and the urn dynamics are the canonical finite exchangeable Markov chain; this connects the equilibrium analysis here to the exchangeability theory of that unit.

Historical & philosophical context Master

Paul and Tatiana Ehrenfest introduced their urn model in 1907 [Ehrenfest 1907] to answer Loschmidt's and Zermelo's objections to Boltzmann's -theorem: a finite, fully reversible, recurrent Markov chain can nonetheless display apparently irreversible relaxation toward equilibrium, with Poincaré recurrence times so long as to be physically unobservable. The model made the statistical-mechanical reconciliation of microscopic reversibility and macroscopic irreversibility concrete and computable. Mark Kac later popularized it as the standard pedagogical model for approach to equilibrium. The exact spectral analysis via Krawtchouk polynomials was carried out by Karlin and McGregor in 1965 [Karlin McGregor 1965] within their broader program of diagonalizing birth-death chains by orthogonal polynomials.

The Bernoulli-Laplace model descends from Daniel Bernoulli's and Laplace's eighteenth-century urn schemes for diffusion and mixing. Diaconis and Shahshahani's 1987 paper [Diaconis Shahshahani 1987] computed its exact mixing time, proving the sharp cutoff using the dual-Hahn polynomials as the spherical functions of the Johnson scheme, and Diaconis's 1988 monograph [Diaconis 1988] set both urn models within the representation-theoretic framework, presenting them as the commutative worked examples that parallel the non-abelian random-transposition and riffle-shuffle analyses.

Bibliography Master

@article{Ehrenfest1907,
  author = {Ehrenfest, Paul and Ehrenfest, Tatiana},
  title = {Über zwei bekannte Einwände gegen das Boltzmannsche H-Theorem},
  journal = {Physikalische Zeitschrift},
  volume = {8},
  year = {1907},
  pages = {311--314},
}

@article{DiaconisShahshahani1987,
  author = {Diaconis, Persi and Shahshahani, Mehrdad},
  title = {Time to reach stationarity in the {Bernoulli}-{Laplace} diffusion model},
  journal = {SIAM Journal on Mathematical Analysis},
  volume = {18},
  year = {1987},
  pages = {208--218},
}

@book{Diaconis1988,
  author = {Diaconis, Persi},
  title = {Group Representations in Probability and Statistics},
  publisher = {Institute of Mathematical Statistics},
  year = {1988},
  series = {IMS Lecture Notes--Monograph Series},
  volume = {11},
}

@article{KarlinMcGregor1965,
  author = {Karlin, Samuel and McGregor, James},
  title = {Ehrenfest urn models},
  journal = {Journal of Applied Probability},
  volume = {2},
  year = {1965},
  pages = {352--376},
}

@book{LevinPeres2017,
  author = {Levin, David A. and Peres, Yuval},
  title = {Markov Chains and Mixing Times},
  edition = {2nd},
  publisher = {American Mathematical Society},
  year = {2017},
}

@book{CeccheriniSilberstein2008,
  author = {Ceccherini-Silberstein, Tullio and Scarabotti, Fabio and Tolli, Filippo},
  title = {Harmonic Analysis on Finite Groups},
  publisher = {Cambridge University Press},
  year = {2008},
}