38.04.02 · dynamics / ergodic-theorems

Ergodicity, Unique Ergodicity, and Equidistribution

shipped · 3 tiers · Lean: none

Anchor (Master): Walters 1982 *An Introduction to Ergodic Theory* (Springer GTM 79) Ch. 1, Ch. 6 (unique ergodicity and uniform convergence; Weyl's theorem); Einsiedler-Ward 2011 *Ergodic Theory with a View towards Number Theory* (Springer GTM 259) Ch. 4, Ch. 6; Cornfeld-Fomin-Sinai 1982 *Ergodic Theory* (Springer Grundlehren 245) Ch. 3 (toral automorphisms, spectral criterion); Glasner 2003 *Ergodic Theory via Joinings* (AMS) Ch. 4 (ergodic decomposition)

Intuition Beginner

A measure-preserving rule keeps the total volume of states fixed, but that alone does not stop the space from splitting into separate compartments that never talk to each other. Imagine a race track split into two lanes by a wall: a runner started in the inner lane circles there forever and never enters the outer one. The rule preserves length in each lane, yet a runner's long-run behaviour depends entirely on which lane you dropped them into. Ergodicity is the condition that forbids such walls: the only regions the rule leaves completely unchanged are the empty region and the whole space. There are no invariant compartments, so a single trajectory is free to explore everything.

When there are no walls, a beautiful thing happens. Track how often one long trajectory visits a given region. Because the trajectory cannot get trapped anywhere, the fraction of time it spends in a region settles down to the size of that region, the same answer no matter which starting point you chose (apart from a negligible set of exceptions). The average measured along one orbit over time equals the average taken over space all at once. This is the headline slogan of the subject: time average equals space average.

Now sharpen the question. Sometimes a rule preserves not just one notion of volume but exactly one — there is a single way to assign sizes that the rule respects, and no other. Such a rule is called uniquely ergodic, and it is even better behaved: the fraction-of-time answer converges for every starting point, with no exceptions at all, and it converges evenly across all starting points at once. The cleanest example is rotating a circle by an angle that is not a simple fraction of a full turn: the orbit of any point, with no exceptions, spreads perfectly evenly around the circle.

The takeaway: ergodicity rules out invariant walls and makes time-average equal space-average for almost every orbit; unique ergodicity is the stronger, single-invariant-measure version that makes the equality hold for every orbit and evenly. Its most famous payoff is that the points $α, 2 α, 3 α, \dots$ wrapped around the circle land everywhere with perfectly uniform density when $α$ is irrational.

Visual Beginner

Picture the circle drawn as a clock face and mark the points you get by repeatedly adding a fixed irrational step, wrapping around each time. Early on the marks look scattered; after many steps they fill the circle with even spacing.

The three panels show the same orbit at $5$ , $30$ , and $300$ steps: scattered, then spreading, then perfectly even. The shaded arc illustrates the punchline — the share of marks falling in any arc approaches that arc's length, identically for every starting point. That even filling is equidistribution, and it is the visible signature of unique ergodicity.

Worked example Beginner

We watch the points $α, 2 α, 3 α, \dots$ fill the circle, using the rational step $α = 0.4$ as a hand-computable stand-in for an irrational one, and check that an arc collects its fair share.

Step 1. The system. Take the circle as the numbers from $0$ up to $1$ , wrapping so $1$ is the same point as $0$ . The rule adds $α = 0.4$ and wraps. Starting from $0$ , the orbit is $0, 0.4, 0.8, 0.2, 0.6, 0.0, 0.4, \dots$ — it cycles through five evenly spaced points $0, 0.2, 0.4, 0.6, 0.8$ . (A truly irrational step never repeats; $0.4 = 2/5$ is rational, so it repeats every five steps, which is enough to show the counting.)

Step 2. The arc to watch. Let $A$ be the arc from $0$ up to $0.4$ , of length $0.4$ , that is two fifths of the circle. The five visited points $0, 0.2, 0.4, 0.6, 0.8$ include $0$ and $0.2$ inside $A$ (taking the arc as $[0, 0.4)$ ), so two of the five points sit in $A$ .

Step 3. The fraction of time. Over one full cycle of five steps the orbit lands in $A$ on $2$ of them, a fraction $2/5 = 0.4$ . Because the orbit just repeats this cycle forever, the long-run fraction of visits to $A$ stays exactly $0.4$ .

Step 4. Compare to the size of $A$ . The arc $A$ has length $0.4$ , which is exactly the long-run fraction of visits we found. The time-average ( $0.4$ of the steps land in $A$ ) equals the space-average (the arc is $0.4$ of the circle).

What this tells us: even in this repeating rational example, the share of visits to the arc $A$ matches its length, because the endpoints of $A$ sit on the five-point grid the orbit visits (an arc such as $[0, 0.1)$ would collect one visit in five, more than its length). For a genuinely irrational step the orbit never repeats and is dense in the circle, and the same equality (fraction of visits equals length of arc) holds in the long run for every starting point and every arc. That is exactly Weyl's equidistribution statement, and it is what unique ergodicity of the irrational rotation delivers.
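The counting in Steps 1 to 4 can be replayed numerically. The following is a minimal sketch, not part of the original unit: the helper name `visit_fraction` is hypothetical, the rational step is handled exactly with `Fraction`, and $\sqrt{2} - 1$ is used as an assumed floating-point stand-in for an irrational step.

```python
from fractions import Fraction
import math

def visit_fraction(alpha, a, b, steps, start=0):
    """Share of the first `steps` orbit points start + n*alpha (mod 1) lying in [a, b)."""
    x = start % 1
    hits = 0
    for _ in range(steps):
        if a <= x < b:
            hits += 1
        x = (x + alpha) % 1  # add the step and wrap around the circle
    return hits / steps

# Exact arithmetic for the rational step 2/5 and the arc [0, 2/5) of the worked example.
rational = visit_fraction(Fraction(2, 5), Fraction(0), Fraction(2, 5), steps=5)

# Floating-point stand-in for an irrational step (an assumption of this sketch),
# tried from two different starting points.
alpha = math.sqrt(2) - 1
irrational_from_0 = visit_fraction(alpha, 0.0, 0.4, steps=10_000)
irrational_from_03 = visit_fraction(alpha, 0.0, 0.4, steps=10_000, start=0.3)

print(rational, irrational_from_0, irrational_from_03)
```

For the rational step the fraction is exactly two fifths over one cycle; for the irrational stand-in both starting points give a value close to the arc length $0.4$, which is the behaviour equidistribution predicts.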

Check your understanding Beginner

Exercise (easy, multiple choice).

A measure-preserving rule is called ergodic when:

A. Every starting point returns to where it began after finitely many steps
B. The only regions left completely unchanged by the rule are the empty region and the whole space
C. The rule moves every point to a brand-new location each step
D. The space can be split into two equal halves that the rule keeps separate

Hint

Ergodicity forbids invariant "walls"; think about what an invariant region of in-between size would represent.

Answer

B. The only unchanged regions are the empty region and the whole space.

Feedback-correct: ergodicity is the no-invariant-walls condition: any region the rule leaves fixed must be negligible or everything, which is what lets a single orbit explore the whole space. Feedback-wrong: A describes periodicity, a different property that neither implies nor follows from ergodicity; C is neither necessary nor sufficient; D describes a non-ergodic system, the opposite of what ergodicity asserts.

Formal definition Intermediate+

Throughout, $(X, \mathcal{B}, μ)$ is a probability space and $T : X \to X$ is measure-preserving, in the sense of 38.04.01: $μ (T^{- 1} B) = μ (B)$ for all $B \in \mathcal{B}$ . The Koopman operator $U_{T} f = f \circ T$ is the associated $L^{2} (μ)$ isometry.

Definition (ergodicity). The system $(X, \mathcal{B}, μ, T)$ is ergodic if every almost-invariant set $B$ , meaning one with $μ (T^{- 1} B △ B) = 0$ , has $μ (B) \in \{0, 1\}$ . Equivalently (Proposition below), the invariant $σ$ -algebra $I = \{B : μ (T^{- 1} B △ B) = 0\}$ is $μ$ -degenerate, every set in it having measure $0$ or $1$ .

Proposition (characterisations of ergodicity). The following are equivalent for a measure-preserving system:

(1) $T$ is ergodic (every almost-invariant set is null or conull).
(2) Every measurable $g$ with $g \circ T = g$ a.e. is a.e. constant.
(3) Every $f \in L^{2} (μ)$ with $U_{T} f = f$ is a.e. constant; i.e. the eigenvalue $1$ of $U_{T}$ is simple, its eigenspace being the constants.
(4) For all $A, B \in \mathcal{B}$ , $\frac{1}{n} \sum_{k = 0}^{n - 1} μ (T^{- k} A \cap B) \to μ (A) μ (B)$ .
(5) For every $f \in L^{1} (μ)$ the Birkhoff averages $\frac{1}{n} \sum_{k = 0}^{n - 1} f \circ T^{k}$ converge a.e. to the constant $\int_{X} f d μ$ .

Definition (unique ergodicity). Let $X$ be a compact metric space and $T : X \to X$ continuous. Write $M (X)$ for the Borel probability measures on $X$ and $M_{T} (X)$ for the $T$ -invariant ones. The set $M_{T} (X)$ is nonempty (Krylov-Bogolyubov, below), convex, and weak- $*$ compact. The system $(X, T)$ is uniquely ergodic if $M_{T} (X)$ is a singleton: there is exactly one $T$ -invariant Borel probability measure $μ$ . That unique $μ$ is automatically ergodic, since the extreme points of $M_{T} (X)$ are precisely the ergodic measures and a one-point convex set is its own unique extreme point.

Definition (equidistribution). A sequence $(x_{n})_{n \geq 0}$ in a compact metric space $X$ carrying a probability measure $μ$ is equidistributed (or $μ$ -uniformly distributed) if for every continuous $f : X \to \mathbb{R}$ , $\frac{1}{N} \sum_{n = 0}^{N - 1} f (x_{n}) \longrightarrow \int_{X} f d μ \quad (N \to \infty) .$ For $X = [0, 1)$ with Lebesgue measure this is equivalent, by 21.15.02, to the Weyl criterion: $\frac{1}{N} \sum_{n = 0}^{N - 1} e^{2 π ik x_{n}} \to 0$ for every nonzero integer $k$ .
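The Weyl criterion is easy to probe numerically. The sketch below is illustrative only: the helper name `weyl_average`, the sample size, and the choice of $\sqrt{2}$ as the irrational step are assumptions, and floating-point arithmetic stands in for exact reals. It evaluates the modulus of $\frac{1}{N} \sum_{n < N} e^{2 π ik x_{n}}$ for the sequence $x_{n} = n α \bmod 1$ with one irrational and one rational $α$.

```python
import cmath
import math

def weyl_average(seq, k):
    """Modulus of (1/N) * sum_n exp(2*pi*i*k*x_n) over the finite sequence `seq`."""
    total = sum(cmath.exp(2j * math.pi * k * x) for x in seq)
    return abs(total) / len(seq)

N = 5000
irrational_seq = [(n * math.sqrt(2)) % 1 for n in range(N)]  # alpha = sqrt(2)
rational_seq = [(n * 2 / 5) % 1 for n in range(N)]           # alpha = 2/5

# For the irrational step every tested frequency k averages out to nearly zero.
irrational_averages = [weyl_average(irrational_seq, k) for k in range(1, 6)]

# For alpha = 2/5 the frequency k = 5 does not average out: each term is (nearly) 1.
rational_average_k5 = weyl_average(rational_seq, 5)

print(irrational_averages, rational_average_k5)
```

A finite computation cannot prove equidistribution; it only shows the averages behaving as the criterion requires for the irrational step and failing at $k = 5$ for the step $2/5$.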

Canonical examples. (i) Irrational rotation $T x = x + α \bmod 1$ , $α \notin \mathbb{Q}$ : ergodic for Lebesgue measure and uniquely ergodic. (ii) Rational rotation $T x = x + p / q \bmod 1$ with $p / q$ in lowest terms: not ergodic for Lebesgue measure, since each orbit is a $q$ -point set and $e^{2 π i q x}$ is a non-constant invariant function. (iii) Doubling map $T x = 2 x \bmod 1$ : ergodic for Lebesgue measure but not uniquely ergodic, since Dirac masses on periodic orbits (e.g. $δ_{0}$ ) are also invariant. (iv) Toral automorphism $T = A \bmod 1$ on $\mathbb{T}^{d}$ with $A \in SL_{d} (\mathbb{Z})$ : ergodic for Lebesgue measure iff no eigenvalue of $A$ is a root of unity. (v) Bernoulli shift $(p^{\otimes \mathbb{N}}, \text{shift})$ : ergodic (indeed mixing) but not uniquely ergodic.

Counterexamples to common slips Intermediate+

Ergodic does not mean uniquely ergodic. The doubling map and every Bernoulli shift are ergodic for their natural measure yet carry many other invariant measures (Dirac masses on periodic points, other Bernoulli laws). Unique ergodicity is the much rarer property that the invariant measure is the only one; it forces the dynamics to be, in a sense, minimal and rigid.
Ergodicity is a property of the pair $(T, μ)$ , not of $T$ . The doubling map is ergodic for Lebesgue measure and also "ergodic" for the Dirac mass $δ_{0}$ (vacuously, as $δ_{0}$ has no invariant sets of intermediate measure), but these are different systems. Naming a map ergodic without naming the measure is the standard ambiguity.
The averaging criterion is Cesàro, not termwise. Condition (4) requires the averaged correlations $\frac{1}{n} \sum_{k} μ (T^{- k} A \cap B)$ to converge to $μ (A) μ (B)$ ; the stronger termwise convergence $μ (T^{- n} A \cap B) \to μ (A) μ (B)$ is mixing, a strictly stronger property than ergodicity. An irrational rotation is ergodic but not mixing.
Equidistribution of $(n α)$ needs $α$ irrational. For rational $α = p / q$ in lowest terms the sequence $(n α \bmod 1)$ visits only the $q$ points $\{0, 1 / q, \dots, (q - 1) / q\}$ and is not equidistributed for Lebesgue measure; it is equidistributed for the uniform measure on those $q$ points. The Weyl criterion fails at $k = q$ , where $e^{2 π i q n α} = 1$ for all $n$ .
Unique ergodicity gives convergence at every $x$ for continuous $f$ , not for arbitrary bounded measurable $f$ . The uniform-convergence theorem is stated for continuous test functions; for an indicator $f = 1_{A}$ the convergence $\frac{1}{N} \sum_{n} f (T^{n} x) \to μ (A)$ still holds at every $x$ provided $μ (\partial A) = 0$ (a $μ$ -continuity set), by approximating $1_{A}$ above and below by continuous functions.
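The gap between the Cesàro condition (4) and termwise mixing can be seen on an irrational rotation. For $A = B = [0, 1/2)$ the overlap $λ (T^{- n} A \cap A)$ equals $∣ 1/2 - t_{n} ∣$ with $t_{n} = n α \bmod 1$ , since two half-circles shifted by $t$ overlap in length $∣ 1/2 - t ∣$ . The sketch below (helper name `overlap` and the step $\sqrt{2} - 1$ are assumptions of this illustration) compares the Cesàro average with the spread of individual terms.

```python
import math

def overlap(n, alpha):
    """Lebesgue measure of T^{-n}A ∩ A for A = [0, 1/2) and T x = x + alpha mod 1."""
    t = (n * alpha) % 1
    return abs(0.5 - t)

alpha = math.sqrt(2) - 1  # floating-point stand-in for an irrational angle
N = 20_000
terms = [overlap(n, alpha) for n in range(N)]

# Cesàro average of the correlations: should approach mu(A) * mu(A) = 1/4.
cesaro_average = sum(terms) / N

# Individual terms keep oscillating between roughly 0 and 1/2, so there is no termwise limit.
tail = terms[-1000:]
tail_spread = max(tail) - min(tail)

print(cesaro_average, tail_spread)
```

The average settles near $1/4$ while the last thousand terms still sweep almost the whole interval from $0$ to $1/2$ , which is the numerical face of "ergodic but not mixing".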

Key theorem with proof Intermediate+

Theorem (Weyl equidistribution via unique ergodicity; Weyl 1916). Let $α \in \mathbb{R}$ be irrational and $T x = x + α \bmod 1$ the rotation of $X = [0, 1) \cong \mathbb{R} / \mathbb{Z}$ . Then $T$ is uniquely ergodic, its unique invariant measure being Lebesgue measure $λ$ , and consequently for every $x \in X$ the orbit $(x + n α \bmod 1)_{n \geq 0}$ is equidistributed: for every continuous $f$ , $\frac{1}{N} \sum_{n = 0}^{N - 1} f (x + n α) \longrightarrow \int_{0}^{1} f d λ \quad \text{uniformly in } x .$ In particular $(n α \bmod 1)_{n \geq 1}$ is equidistributed in $[0, 1)$ .

Proof. First, $T$ preserves $λ$ by translation invariance, and $T$ is ergodic for $λ$ . To see ergodicity, let $f \in L^{2} (λ)$ satisfy $U_{T} f = f$ , i.e. $f (x + α) = f (x)$ a.e. Expand in the Fourier basis $f = \sum_{k \in \mathbb{Z}} \hat{f} (k) e^{2 π ik x}$ with $\hat{f} (k) = \int_{0}^{1} f (x) e^{- 2 π ik x} d x$ . Then $f \circ T$ has Fourier coefficients $\widehat{f \circ T} (k) = e^{2 π ik α} \hat{f} (k)$ , so invariance forces $\hat{f} (k) (e^{2 π ik α} - 1) = 0$ for every $k$ . For $k \neq 0$ , irrationality of $α$ gives $k α \notin \mathbb{Z}$ , hence $e^{2 π ik α} \neq 1$ and $\hat{f} (k) = 0$ . Thus $f = \hat{f} (0)$ is a.e. constant, and by characterisation (3) above $T$ is ergodic.

Now we upgrade ergodicity to unique ergodicity together with the uniform convergence. Let $ν$ be any $T$ -invariant Borel probability measure. Its Fourier coefficients $\hat{ν} (k) = \int e^{- 2 π ik x} d ν (x)$ satisfy, by invariance, $\hat{ν} (k) = \int e^{- 2 π ik (x + α)} d ν (x) = e^{- 2 π ik α} \hat{ν} (k)$ . For $k \neq 0$ , $e^{- 2 π ik α} \neq 1$ forces $\hat{ν} (k) = 0$ , so $ν$ has the same Fourier coefficients as $λ$ ( $\hat{ν} (0) = 1 = \hat{λ} (0)$ , $\hat{ν} (k) = 0 = \hat{λ} (k)$ for $k \neq 0$ ). By uniqueness of Fourier coefficients on the circle, $ν = λ$ . Hence $λ$ is the only invariant measure: $T$ is uniquely ergodic.

For the uniform convergence, verify it first on trigonometric polynomials. For $f (x) = e^{2 π ik x}$ with $k \neq 0$ , a geometric sum gives $\frac{1}{N} \sum_{n = 0}^{N - 1} e^{2 π ik (x + n α)} = \frac{e^{2 π ik x}}{N} \cdot \frac{e^{2 π ik N α} - 1}{e^{2 π ik α} - 1},$ whose modulus is at most $\frac{1}{N} \cdot \frac{2}{∣ e^{2 π ik α} - 1 ∣} \to 0$ as $N \to \infty$ , uniformly in $x$ , while $\int e^{2 π ik x} d x = 0$ . For $k = 0$ the average is identically $1 = \int 1 d λ$ . By linearity the uniform convergence holds for all trigonometric polynomials. Given a continuous $f$ and $ε > 0$ , Weierstrass approximation supplies a trigonometric polynomial $p$ with $∥ f - p ∥_{\infty} < ε$ ; since the Birkhoff averaging operator $f \mapsto \frac{1}{N} \sum_{n} f \circ T^{n}$ has operator norm $1$ on $C (X)$ and $\int (\cdot) d λ$ likewise, the triangle inequality gives $\left∥ \frac{1}{N} \sum_{n} f \circ T^{n} - \int f d λ \right∥_{\infty} \leq 2 ε + \left∥ \frac{1}{N} \sum_{n} p \circ T^{n} - \int p d λ \right∥_{\infty} .$ The last term tends to $0$ , so the limsup is at most $2 ε$ for every $ε$ , giving uniform convergence for $f$ . Taking $f$ to approximate $1_{[a, b)}$ from above and below (a Lebesgue-continuity set) yields $\frac{1}{N} \# \{n < N : n α \bmod 1 \in [a, b)\} \to b - a$ , the equidistribution of $(n α)$ . $□$
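The geometric-sum step of the proof can be checked directly. The sketch below is a numerical sanity check under stated assumptions (function names are hypothetical, $α = \sqrt{2}$ in floating point): it compares the averaged character with the closed form and with the bound $\frac{2}{N ∣ e^{2 π ik α} - 1 ∣}$ , at several base points $x$ .

```python
import cmath
import math

def character_average(k, alpha, x, N):
    """(1/N) * sum_{n<N} exp(2*pi*i*k*(x + n*alpha)), summed term by term."""
    return sum(cmath.exp(2j * math.pi * k * (x + n * alpha)) for n in range(N)) / N

def closed_form(k, alpha, x, N):
    """Geometric-sum formula for the same average (valid when exp(2*pi*i*k*alpha) != 1)."""
    numerator = cmath.exp(2j * math.pi * k * N * alpha) - 1
    denominator = cmath.exp(2j * math.pi * k * alpha) - 1
    return cmath.exp(2j * math.pi * k * x) / N * numerator / denominator

def modulus_bound(k, alpha, N):
    """Bound 2 / (N * |exp(2*pi*i*k*alpha) - 1|), independent of the base point x."""
    return 2 / (N * abs(cmath.exp(2j * math.pi * k * alpha) - 1))

alpha = math.sqrt(2)
N = 1000
for k in (1, 2, 3):
    for x in (0.0, 0.25, 0.5, 0.9):
        direct = character_average(k, alpha, x, N)
        print(k, x, abs(direct), modulus_bound(k, alpha, N))
```

The bound does not involve $x$ , which is exactly why the convergence in the theorem is uniform.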

Bridge. This theorem builds toward the general principle that unique ergodicity is the dynamical engine of equidistribution, and it appears again in the Master-tier equivalence between unique ergodicity and uniform convergence of Birkhoff averages for all continuous $f$ . The foundational reason the proof works is that an invariant measure is pinned down by its action on the characters $e^{2 π ik x}$ , and irrationality kills every nonzero character — this is exactly the Fourier-analytic Weyl criterion 21.15.02 read as a statement about invariant measures, so the number-theoretic and the dynamical proofs of Weyl's theorem are dual to one another. The central insight is that ergodicity for $λ$ gives a.e. convergence by Birkhoff 37.02.03, but unique ergodicity upgrades "almost every $x$ " to "every $x$ , uniformly," because there is no second invariant measure to host an exceptional orbit. Putting these together, recurrence and the Kac geometry of 38.04.01 guarantee orbits return, ergodicity makes the return frequencies equal to measure, and unique ergodicity makes those frequencies hold pointwise; the bridge is that minimality plus a unique invariant measure forces every orbit to be a faithful sampler of that measure.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Prove characterisation (3): a measure-preserving $T$ is ergodic iff every $f \in L^{2} (μ)$ with $U_{T} f = f$ is a.e. constant.

Hint

For the forward direction approximate an invariant $f$ by indicators of super-level sets $\{f > c\}$ , which are almost invariant. For the reverse, apply the hypothesis to $f = 1_{B}$ for an almost-invariant $B$ .

Answer

Suppose $T$ is ergodic and $f \in L^{2}$ with $f \circ T = f$ a.e. For each real $c$ the super-level set $B_{c} = \{f > c\}$ satisfies $T^{- 1} B_{c} = \{f \circ T > c\} = \{f > c\} = B_{c}$ (mod $μ$ ), so $B_{c}$ is almost invariant and $μ (B_{c}) \in \{0, 1\}$ by ergodicity. The function $c \mapsto μ (B_{c})$ is non-increasing, right-continuous, takes only the values $0$ and $1$ , and falls from $1$ to $0$ ; let $c_{0} = \sup \{c : μ (B_{c}) = 1\}$ . Then $μ (f > c_{0} - ε) = 1$ and $μ (f > c_{0} + ε) = 0$ for every $ε > 0$ , forcing $f = c_{0}$ a.e. Conversely, if every invariant $L^{2}$ function is a.e. constant and $B$ is almost invariant, then $1_{B} \in L^{2}$ satisfies $1_{B} \circ T = 1_{T^{- 1} B} = 1_{B}$ a.e., so $1_{B}$ is a.e. constant, i.e. $μ (B) \in \{0, 1\}$ . Hence $T$ is ergodic.

Exercise 4 (medium, symbolic).

Using the Fourier method, show that the rotation $T x = x + α mod 1$ is ergodic for Lebesgue measure iff $α$ is irrational.

Hint

An invariant $f = \sum_{k} \hat{f} (k) e^{2 π ik x}$ forces $\hat{f} (k) (e^{2 π ik α} - 1) = 0$ . Ask when a nonzero coefficient with $k \neq 0$ can survive.

Answer

If $f \circ T = f$ then matching Fourier coefficients gives $\hat{f} (k) e^{2 π ik α} = \hat{f} (k)$ , i.e. $\hat{f} (k) (e^{2 π ik α} - 1) = 0$ for all $k$ . When $α$ is irrational, $k α \notin \mathbb{Z}$ for every $k \neq 0$ , so $e^{2 π ik α} \neq 1$ and $\hat{f} (k) = 0$ ; thus $f$ is constant and $T$ is ergodic by Exercise 3. When $α = p / q$ is rational (in lowest terms), $e^{2 π i q α} = 1$ , so $f (x) = e^{2 π i q x}$ is a non-constant invariant function, and the super-level sets of its real part are almost-invariant sets of intermediate measure; $T$ is not ergodic. Hence ergodicity holds exactly for irrational $α$ .

Exercise 5 (medium, symbolic).

Show that the toral automorphism $T = A \bmod 1$ on $\mathbb{T}^{2}$ with $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \in SL_{2} (\mathbb{Z})$ is ergodic for Lebesgue measure, by checking the no-root-of-unity eigenvalue criterion.

Hint

An invariant $f = \sum_{m \in \mathbb{Z}^{2}} \hat{f} (m) e^{2 π i ⟨ m, x ⟩}$ forces $\hat{f} (m) = \hat{f} (A^{⊤} m)$ . The orbit of any nonzero $m$ under $A^{⊤}$ must be infinite unless an eigenvalue is a root of unity.

Answer

Characters on $\mathbb{T}^{2}$ are $χ_{m} (x) = e^{2 π i ⟨ m, x ⟩}$ , $m \in \mathbb{Z}^{2}$ , and $χ_{m} \circ T = χ_{A^{⊤} m}$ . Invariance of $f = \sum_{m} \hat{f} (m) χ_{m}$ gives $\hat{f} (m) = \hat{f} (A^{⊤} m)$ for all $m$ , so $\hat{f}$ is constant along each forward orbit of $A^{⊤}$ on $\mathbb{Z}^{2}$ . The eigenvalues of $A$ are $\frac{3 \pm \sqrt{5}}{2}$ , real, irrational, and not roots of unity (neither lies on the unit circle: their product is $\det A = 1$ but one exceeds $1$ ). If the $A^{⊤}$ -orbit of some $m \neq 0$ were finite, injectivity of $A^{⊤}$ would give $(A^{⊤})^{n} m = m$ for some $n \geq 1$ , making $1$ an eigenvalue of $(A^{⊤})^{n}$ and hence some eigenvalue of $A$ an $n$ -th root of unity; so the $A^{⊤}$ -orbit of every $m \neq 0$ is infinite. Since $\sum_{m} ∣ \hat{f} (m) ∣^{2} < \infty$ , a coefficient constant along an infinite orbit must vanish, so $\hat{f} (m) = 0$ for all $m \neq 0$ and $f$ is constant. By Exercise 3, $T$ is ergodic. The general criterion: $A$ is ergodic iff no eigenvalue is a root of unity, exactly when no $A^{⊤}$ -orbit on $\mathbb{Z}^{2} \setminus \{0\}$ is finite.
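The dual-orbit criterion can be explored with exact integer arithmetic. The following sketch is illustrative: the helper names are hypothetical, and the quarter-turn matrix is an added contrast case (its eigenvalues $\pm i$ are roots of unity), not an example from the exercise.

```python
def apply_transpose(A, m):
    """Return A^T m for a 2x2 integer matrix A (given as a tuple of rows) and integer vector m."""
    (a, b), (c, d) = A
    return (a * m[0] + c * m[1], b * m[0] + d * m[1])

def dual_orbit(A, m, steps):
    """Forward orbit m, A^T m, (A^T)^2 m, ... with `steps` applications of A^T."""
    orbit = [m]
    for _ in range(steps):
        m = apply_transpose(A, m)
        orbit.append(m)
    return orbit

cat_map = ((2, 1), (1, 1))        # the matrix A of the exercise
quarter_turn = ((0, -1), (1, 0))  # contrast case with eigenvalues +i and -i

print(dual_orbit(cat_map, (1, 0), 3))
print(dual_orbit(quarter_turn, (1, 0), 4))
```

For the matrix of the exercise the frequencies run off to infinity without repeating, so a square-summable invariant coefficient along the orbit must vanish; for the quarter turn the orbit closes up after four steps, and the sum of the four characters on that orbit is a non-constant invariant function.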

Exercise 7 (hard, symbolic).

Prove the Oxtoby criterion: a continuous map $T$ of a compact metric space $X$ is uniquely ergodic iff for every $f \in C (X)$ the Birkhoff averages $A_{N} f = \frac{1}{N} \sum_{n < N} f \circ T^{n}$ converge uniformly to a constant. Show the constant is $\int f d μ$ for the unique invariant $μ$ .

Hint

For necessity, suppose $A_{N} f$ did not converge uniformly to $\int f d μ$ ; extract points $x_{N}$ and a weak- $*$ limit of the empirical measures $\frac{1}{N} \sum_{n} δ_{T^{n} x_{N}}$ , which is invariant, to build a second invariant measure. For sufficiency, any invariant $ν$ integrates $A_{N} f$ to $\int f d ν$ .

Answer

( $\Rightarrow$ ) Suppose $T$ is uniquely ergodic with invariant measure $μ$ . Fix $f \in C (X)$ and set $c_{N} = \sup_{x} ∣ A_{N} f (x) - \int f d μ ∣$ . If $c_{N} \not\to 0$ , choose $ε > 0$ , a subsequence, and points $x_{N}$ with $∣ A_{N} f (x_{N}) - \int f d μ ∣ \geq ε$ . The empirical measures $μ_{N} = \frac{1}{N} \sum_{n < N} δ_{T^{n} x_{N}}$ live in the weak- $*$ compact set $M (X)$ ; pass to a convergent subsequence $μ_{N_{j}} \to ν$ . For any $g \in C (X)$ , $\int g d μ_{N} - \int g \circ T d μ_{N} = \frac{1}{N} (g (x_{N}) - g (T^{N} x_{N})) \to 0$ , so $ν$ is $T$ -invariant, hence $ν = μ$ . But $\int f d μ_{N_{j}} = A_{N_{j}} f (x_{N_{j}})$ stays $\geq ε$ away from $\int f d μ = \int f d ν$ , a contradiction. So $A_{N} f \to \int f d μ$ uniformly.

( $\Leftarrow$ ) Suppose $A_{N} f \to c (f)$ uniformly (a constant) for each $f \in C (X)$ . The map $f \mapsto c (f)$ is a positive normalised linear functional, hence $c (f) = \int f d μ$ for a probability measure $μ$ by Riesz representation, and $μ$ is invariant since $c (f \circ T) = c (f)$ . If $ν$ is any invariant measure, integrating $A_{N} f$ against $ν$ gives $\int A_{N} f d ν = \int f d ν$ (invariance), and the uniform limit gives $\int A_{N} f d ν \to c (f) = \int f d μ$ ; thus $\int f d ν = \int f d μ$ for all $f \in C (X)$ , so $ν = μ$ . Unique ergodicity follows.
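The criterion also detects failure of unique ergodicity. As a hedged illustration (the doubling map is taken from the canonical examples; the helper name, the test function $f (x) = \cos (2 π x)$ , and the two base points are choices made for this sketch), the Birkhoff averages below converge to different constants at different points, so they cannot converge uniformly to a single constant.

```python
from fractions import Fraction
import math

def doubling_average(x0, N):
    """Birkhoff average of f(x) = cos(2*pi*x) along the doubling-map orbit of x0."""
    x = Fraction(x0)
    total = 0.0
    for _ in range(N):
        total += math.cos(2 * math.pi * float(x))
        x = (2 * x) % 1  # exact rational arithmetic keeps periodic orbits periodic
    return total / N

at_fixed_point = doubling_average(Fraction(0), 1000)     # orbit of 0 is {0}
at_period_two = doubling_average(Fraction(1, 3), 1000)   # orbit of 1/3 is {1/3, 2/3}

print(at_fixed_point, at_period_two)
```

The first average is $1$ and the second is $- 1/2$ , while the Lebesgue integral of $f$ is $0$ ; each value is the integral of $f$ against a different invariant measure (the Dirac mass at $0$ and the uniform measure on the period-two orbit).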

Exercise 8 (hard, symbolic).

State and prove the Krylov-Bogolyubov theorem: every continuous map $T$ of a compact metric space $X$ admits at least one invariant Borel probability measure.

Hint

Start from any $μ_{0} \in M (X)$ , form Cesàro averages $μ_{N} = \frac{1}{N} \sum_{n < N} T_{*}^{n} μ_{0}$ , and take a weak- $*$ limit; show invariance via $T_{*} μ_{N} - μ_{N} \to 0$ .

Answer

Pick any $μ_{0} \in M (X)$ (e.g. a point mass) and define $μ_{N} = \frac{1}{N} \sum_{n = 0}^{N - 1} T_{*}^{n} μ_{0}$ , where $T_{*} ν = ν \circ T^{- 1}$ is the pushforward. Each $μ_{N} \in M (X)$ , and $M (X)$ is weak- $*$ compact (Banach-Alaoglu, $X$ compact metric so $C (X)$ separable), so a subsequence $μ_{N_{j}} \to μ$ weak- $*$ . Compute $T_{*} μ_{N} - μ_{N} = \frac{1}{N} \sum_{n = 0}^{N - 1} (T_{*}^{n + 1} μ_{0} - T_{*}^{n} μ_{0}) = \frac{1}{N} (T_{*}^{N} μ_{0} - μ_{0}),$ a telescoping sum. For any $g \in C (X)$ , $∣ \int g d (T_{*} μ_{N}) - \int g d μ_{N} ∣ = \frac{1}{N} ∣ \int g d (T_{*}^{N} μ_{0}) - \int g d μ_{0} ∣ \leq \frac{2 ∥ g ∥_{\infty}}{N} \to 0$ . Since $T_{*}$ is weak- $*$ continuous (as $T$ is continuous), passing to the limit along $N_{j}$ gives $\int g d (T_{*} μ) = \int g d μ$ for all $g \in C (X)$ , i.e. $T_{*} μ = μ$ . Hence $μ$ is $T$ -invariant, and $M_{T} (X) \neq \emptyset$ .
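The Cesàro construction can be run exactly on a toy model. The sketch below assumes a finite state space (a compact metric space with the discrete metric), with measures stored as lists of `Fraction` weights; the map `T` and the helper names are hypothetical choices for illustration. It checks the telescoping estimate in total variation, $\sum_{x} ∣ T_{*} μ_{N} (x) - μ_{N} (x) ∣ \leq 2 / N$ .

```python
from fractions import Fraction

def pushforward(T, mu):
    """T_* mu on the finite set {0, ..., len(mu)-1}; T[x] is the image of x."""
    nu = [Fraction(0)] * len(mu)
    for x, weight in enumerate(mu):
        nu[T[x]] += weight
    return nu

def cesaro(T, mu0, N):
    """mu_N = (1/N) * sum_{n<N} T_*^n mu0."""
    total = [Fraction(0)] * len(mu0)
    mu = list(mu0)
    for _ in range(N):
        total = [t + w for t, w in zip(total, mu)]
        mu = pushforward(T, mu)
    return [t / N for t in total]

def invariance_defect(T, mu):
    """Total variation distance sum_x |T_* mu(x) - mu(x)|."""
    return sum(abs(a - b) for a, b in zip(pushforward(T, mu), mu))

T = [1, 2, 3, 1]                  # 0 -> 1 -> 2 -> 3 -> 1: a transient point feeding a 3-cycle
delta_0 = [Fraction(1), Fraction(0), Fraction(0), Fraction(0)]
N = 300
mu_N = cesaro(T, delta_0, N)

print(mu_N, invariance_defect(T, mu_N))
```

Here the averages approach the uniform measure on the cycle $\{1, 2, 3\}$ , which is invariant, even though the starting point mass at $0$ is not.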

Advanced results Master

Theorem 1 (characterisations of ergodicity). For a measure-preserving system $(X, \mathcal{B}, μ, T)$ the conditions are equivalent: ( $a$ ) every almost-invariant set is null or conull; ( $b$ ) every $T$ -invariant measurable function is a.e. constant; ( $c$ ) $1$ is a simple eigenvalue of the Koopman operator $U_{T}$ on $L^{2} (μ)$ , with the constants as eigenspace; ( $d$ ) for all $A, B$ , $\frac{1}{n} \sum_{k < n} μ (T^{- k} A \cap B) \to μ (A) μ (B)$ ; ( $e$ ) the Birkhoff averages of every $f \in L^{1}$ converge a.e. to $\int f d μ$ . The spectral condition (c) places ergodicity at the bottom of the spectral hierarchy: weak mixing is continuous spectrum on the orthocomplement of the constants, and (strong) mixing strengthens the Cesàro condition (d) to a termwise limit ^{[Birkhoff 1931]}.

Theorem 2 (ergodicity of the irrational rotation and toral automorphisms). The rotation $x \mapsto x + α$ on $\mathbb{T}^{d}$ is ergodic for Haar measure iff $1, α_{1}, \dots, α_{d}$ are rationally independent. A toral automorphism $A \in GL_{d} (\mathbb{Z})$ is ergodic for Haar measure iff no eigenvalue of $A$ is a root of unity, equivalently iff the dual action $A^{⊤}$ on $\mathbb{Z}^{d} \setminus \{0\}$ has no finite orbit. Every hyperbolic automorphism (no eigenvalue on the unit circle) is therefore ergodic, while for $d \geq 4$ there are ergodic automorphisms with some eigenvalues on the unit circle; the hyperbolic ones are mixing of all orders and Bernoulli. The character-orbit proof is the Fourier shadow of the dynamical statement ^{[Walters 1982]}.

Theorem 3 (unique ergodicity $\Leftrightarrow$ uniform convergence; Oxtoby 1952). A continuous map $T$ of a compact metric space $X$ is uniquely ergodic iff for every $f \in C (X)$ the Birkhoff averages $A_{N} f$ converge uniformly to the constant $\int f d μ$ . Equivalently, the empirical measures $\frac{1}{N} \sum_{n < N} δ_{T^{n} x}$ converge weak- $*$ to $μ$ uniformly in $x$ . Unique ergodicity therefore removes the a.e.-exceptional set of Birkhoff's theorem entirely: every orbit equidistributes, and it does so at a rate independent of the basepoint ^{[Oxtoby 1952]}.

Theorem 4 (Weyl equidistribution; Weyl 1916). For irrational $α$ , the sequence $(n α mod 1)$ is equidistributed in $[0, 1)$ ; more generally, for a polynomial $p (n) = α_{d} n^{d} + \dots + α_{1} n + α_{0}$ with at least one of $α_{1}, \dots, α_{d}$ irrational, $(p (n) mod 1)$ is equidistributed. The linear case is unique ergodicity of the rotation; the polynomial case is unique ergodicity of a unipotent affine map on $T^{d}$ (the skew product / Anzai skew shift) together with Weyl differencing 21.15.02. Equidistribution is the analytic face of unique ergodicity, and the Fourier-analytic Weyl criterion is its number-theoretic mirror ^{[Weyl 1916]}.
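The polynomial case can be probed numerically through the Weyl criterion. The sketch below is an illustration only: the helper name, the choice $p (n) = \sqrt{2} \, n^{2}$ , the sample size, and the frequencies tested are assumptions, and double-precision arithmetic limits how large $N$ can usefully be.

```python
import cmath
import math

def quadratic_weyl_average(alpha, k, N):
    """Modulus of (1/N) * sum_{n<N} exp(2*pi*i*k*alpha*n^2)."""
    total = 0j
    for n in range(N):
        phase = (k * alpha * n * n) % 1.0  # reduce mod 1 before exponentiating
        total += cmath.exp(2j * math.pi * phase)
    return abs(total) / N

averages = {k: quadratic_weyl_average(math.sqrt(2), k, 20_000) for k in (1, 2)}
print(averages)
```

Small values of these averages are consistent with equidistribution of $(\sqrt{2} \, n^{2} \bmod 1)$ ; they do not replace the differencing argument, which is what controls every frequency $k$ as $N \to \infty$ .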

Theorem 5 (ergodic decomposition; Krylov-Bogolyubov 1937). Let $T$ be a continuous (or Borel) map of a standard Borel space and $μ \in M_{T} (X)$ . Then $M_{T} (X)$ is a Choquet simplex whose extreme points are exactly the ergodic measures, and $μ$ has a unique representation as a barycentre $μ = \int_{E} e d ρ (e)$ over the set $E$ of ergodic measures, where $ρ$ is a probability measure on $E$ . Concretely there is a measurable map $x \mapsto μ_{x}^{I}$ (the conditional measures on the invariant $σ$ -algebra) with each $μ_{x}^{I}$ ergodic and $μ = \int μ_{x}^{I} d μ (x)$ , so that the Birkhoff limit $E [f ∣ I] (x) = \int f d μ_{x}^{I}$ samples the ergodic component through $x$ . The non-ergodic theory thereby reduces to the ergodic theory fibre by fibre ^{[Krylov-Bogolyubov 1937]}.

Synthesis. The five results are one statement at successive depths, and the foundational reason they cohere is that each measures how badly the invariant $σ$ -algebra can fail to be degenerate. Ergodicity is the demand that $I$ be $μ$ -degenerate, which is exactly the simplicity of the Koopman eigenvalue $1$ and exactly the collapse of the Birkhoff limit to a constant; this is the central insight that an invariant set of intermediate measure, an invariant non-constant function, a surviving nonzero Fourier mode, and a non-extreme invariant measure are four names for one obstruction. Unique ergodicity is the case where $M_{T} (X)$ is a single point, so the simplex of Theorem 5 degenerates and the a.e. statement of Birkhoff becomes the everywhere-and-uniform statement of Oxtoby; this is dual to the spectral picture, with the absence of nontrivial invariant sets replaced by the absence of any second invariant measure. Putting these together, Weyl equidistribution generalises from a fact about $(n α)$ to the assertion that every orbit of a uniquely ergodic system faithfully samples the unique measure, and the bridge is that the character orbits the Fourier criterion 21.15.02 tracks are precisely the obstructions a unique invariant measure annihilates. The ergodic decomposition is the foundational reason the subject reduces to its ergodic case: any invariant measure is built from ergodic bricks, so theorems for ergodic systems propagate by integration, and the recurrence-and-tower technology of 38.04.01 together with Birkhoff 37.02.03 assembles into the structure of stationary dynamics.

Full proof set Master

Proposition 1 (ergodicity $\Leftrightarrow$ invariant $L^{2}$ functions constant). A measure-preserving $T$ is ergodic iff every $f \in L^{2} (μ)$ with $U_{T} f = f$ a.e. is a.e. constant.

Proof. ( $\Rightarrow$ ) Let $f \in L^{2}$ be invariant. The super-level sets $B_{c} = \{f > c\}$ satisfy $T^{- 1} B_{c} = B_{c}$ (mod $μ$ ), so each is almost invariant; ergodicity gives $μ (B_{c}) \in \{0, 1\}$ . The non-increasing right-continuous $\{0, 1\}$ -valued function $c \mapsto μ (B_{c})$ has a unique jump at $c_{0} = \sup \{c : μ (B_{c}) = 1\}$ , and $μ (\{f > c_{0} - ε\} \setminus \{f > c_{0} + ε\}) = 1$ for all $ε > 0$ forces $f = c_{0}$ a.e. ( $\Leftarrow$ ) If $B$ is almost invariant, $1_{B} \in L^{2}$ is invariant, hence a.e. constant, so $μ (B) \in \{0, 1\}$ . $□$

Proposition 2 (irrational rotation is uniquely ergodic). For irrational $α$ , Lebesgue measure $λ$ is the unique $T$ -invariant Borel probability measure of $T x = x + α mod 1$ .

Proof. Let $ν \in M_{T} ([0, 1))$ . For $k \in \mathbb{Z}$ , invariance gives $\hat{ν} (k) = \int e^{- 2 π ik x} d ν = \int e^{- 2 π ik (x + α)} d ν = e^{- 2 π ik α} \hat{ν} (k)$ . For $k \neq 0$ , $e^{- 2 π ik α} \neq 1$ (irrationality), so $\hat{ν} (k) = 0$ ; and $\hat{ν} (0) = 1$ . These coincide with the Fourier coefficients of $λ$ , and a finite Borel measure on the circle is determined by its Fourier coefficients (the linear span of the characters is dense in $C (\mathbb{T})$ , so equality of $\int g d ν$ and $\int g d λ$ on characters extends to all $g \in C (\mathbb{T})$ ). Hence $ν = λ$ . $□$

Proposition 3 (unique ergodicity forces uniform convergence on $C (X)$ ). If $(X, T)$ is uniquely ergodic with invariant measure $μ$ , then $A_{N} f \to \int f d μ$ uniformly for every $f \in C (X)$ .

Proof. Suppose not: there are $f$ , $ε > 0$ , $N_{j} \to \infty$ , and $x_{j}$ with $∣ A_{N_{j}} f (x_{j}) - \int f d μ ∣ \geq ε$ . The empirical measures $μ_{j} = \frac{1}{N _{j}} \sum_{n < N_{j}} δ_{T^{n} x_{j}}$ have a weak- $*$ cluster point $ν$ . For $g \in C (X)$ , $\int (g \circ T - g) d μ_{j} = \frac{1}{N _{j}} (g (T^{N_{j}} x_{j}) - g (x_{j})) \to 0$ , so $ν$ is invariant, hence $ν = μ$ by unique ergodicity. But $\int f d μ_{j} = A_{N_{j}} f (x_{j})$ stays $\geq ε$ from $\int f d μ = \int f d ν$ along the subsequence converging to $ν$ , contradicting weak- $*$ convergence. $□$

Proposition 4 (existence of invariant measures; Krylov-Bogolyubov). Every continuous map of a compact metric space has an invariant Borel probability measure.

Proof. For arbitrary $μ_{0} \in M (X)$ set $μ_{N} = \frac{1}{N} \sum_{n < N} T_{*}^{n} μ_{0}$ . By weak- $*$ compactness of $M (X)$ choose $μ_{N_{j}} \to μ$ . The telescoping identity $T_{*} μ_{N} - μ_{N} = \frac{1}{N} (T_{*}^{N} μ_{0} - μ_{0})$ gives $∣ \int g d (T_{*} μ_{N}) - \int g d μ_{N} ∣ \leq 2∥ g ∥_{\infty} / N \to 0$ for every $g \in C (X)$ ; weak- $*$ continuity of $T_{*}$ then yields $T_{*} μ = μ$ . $□$

Proposition 5 (ergodic measures are the extreme points of $M_{T} (X)$ ). A measure $μ \in M_{T} (X)$ is ergodic iff it is an extreme point of the convex set $M_{T} (X)$ .

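Proof. ( $\Rightarrow$ ) Let $μ$ be ergodic and $μ = t μ_{1} + (1 - t) μ_{2}$ with $t \in (0, 1)$ , $μ_{i} \in M_{T} (X)$ . Then $μ_{1} \leq t^{- 1} μ$ , so $μ_{1} ≪ μ$ with Radon-Nikodym derivative $h = d μ_{1} / d μ$ . Put $E = \{h < 1\}$ . Invariance of $μ_{1}$ and of $μ$ gives $\int_{E \setminus T^{- 1} E} h d μ = \int_{T^{- 1} E \setminus E} h d μ$ and $μ (E \setminus T^{- 1} E) = μ (T^{- 1} E \setminus E)$ ; since $h < 1$ on the first set and $h \geq 1$ on the second, both sets are $μ$ -null, so $E$ is almost invariant and $μ (E) \in \{0, 1\}$ by ergodicity. If $μ (E) = 1$ then $μ_{1} (X) = \int h d μ < 1$ , which is impossible, so $h \geq 1$ a.e.; the same argument applied to $\{h > 1\}$ gives $h \leq 1$ a.e. Hence $h = 1$ a.e. and $μ_{1} = μ$ ; thus $μ$ is extreme. ( $\Leftarrow$ ) If $μ$ is not ergodic there is an invariant set $B$ with $0 < μ (B) < 1$ ; the conditioned measures $μ_{B} (\cdot) = μ (\cdot \cap B) / μ (B)$ and $μ_{B^{c}}$ are invariant, distinct, and $μ = μ (B) μ_{B} + μ (B^{c}) μ_{B^{c}}$ is a proper convex combination of distinct measures, so $μ$ is not extreme. $□$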
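The ( $\Leftarrow$ ) construction can be carried out exactly in a finite toy model. In the sketch below (a permutation of five points with two cycles, chosen purely for illustration; all names are hypothetical) the uniform measure is invariant but not ergodic, and conditioning on the invariant set $B$ splits it into two distinct invariant measures.

```python
from fractions import Fraction

def pushforward(T, mu):
    """T_* mu on the finite set {0, ..., len(mu)-1}; T[x] is the image of x."""
    nu = [Fraction(0)] * len(mu)
    for x, weight in enumerate(mu):
        nu[T[x]] += weight
    return nu

def conditioned(mu, B):
    """The conditioned measure mu_B(.) = mu(. ∩ B) / mu(B)."""
    mass = sum(mu[x] for x in B)
    return [mu[x] / mass if x in B else Fraction(0) for x in range(len(mu))]

T = [1, 0, 3, 4, 2]            # two cycles: {0, 1} and {2, 3, 4}
mu = [Fraction(1, 5)] * 5      # uniform measure, invariant because T is a permutation
B = {0, 1}                     # an invariant set of intermediate measure
B_complement = {2, 3, 4}

mu_B = conditioned(mu, B)
mu_Bc = conditioned(mu, B_complement)
weight = sum(mu[x] for x in B)  # mu(B)

# mu = mu(B) * mu_B + mu(B^c) * mu_{B^c}
mixture = [weight * p + (1 - weight) * q for p, q in zip(mu_B, mu_Bc)]
print(weight, mu_B, mu_Bc, mixture == mu)
```

Both conditioned measures are invariant and the mixture reproduces the uniform measure with weights $2/5$ and $3/5$ , so the uniform measure is not an extreme point; each conditioned measure lives on a single cycle and is ergodic.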

Connections Master

Measure-preserving systems, Poincaré recurrence, and the Kac formula 38.04.01 supply the ambient framework: ergodicity is the indecomposability condition layered on top of measure-preservation, the Koopman operator and invariant-set language are imported directly, and the ergodic Kac formula $\int_{A} r_{A} d μ = 1$ is exactly the statement that an ergodic system's saturation of any positive-measure set is everything. This unit sharpens recurrence into equidistribution.
The ergodic theorems of Birkhoff, von Neumann, and Kingman 37.02.03 are the analytic engine: characterisation (e) of ergodicity is Birkhoff's theorem with a degenerate invariant $σ$ -algebra, and the mean ergodic theorem is the $L^{2}$ statement that the Koopman averages converge to the projection onto the constants. Unique ergodicity is precisely what upgrades Birkhoff's a.e. convergence to the uniform convergence of Oxtoby's theorem.
Weyl sums, Weyl differencing, and equidistribution 21.15.02 give the number-theoretic mirror: the Weyl criterion $\frac{1}{N} \sum_{n < N} e^{2 π i k x_{n}} \to 0$ for every nonzero integer $k$ is the Fourier-analytic test that this unit derives from unique ergodicity of the rotation, and Weyl differencing extends equidistribution from $(n α)$ to polynomial sequences $(p (n))$ , which correspond dynamically to unipotent skew products on the torus.
Mixing and the spectral theory of dynamical systems 38.05.01 sit one level above ergodicity: weak mixing is continuous Koopman spectrum on the orthocomplement of the constants, strong mixing strengthens the Cesàro correlation decay (d) to a termwise limit, and the Halmos-von Neumann theorem identifies discrete-spectrum systems with group rotations — the uniquely ergodic rotations of this unit being the prototype.
Topological dynamics and minimality 38.01.01 interlocks with unique ergodicity: a uniquely ergodic system whose unique measure has full support is strictly ergodic, and minimal isometries (such as irrational rotations) are uniquely ergodic, so the equidistribution proved here is the measurable counterpart of topological minimality.
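
The Weyl criterion mentioned among the connections above is easy to probe numerically. The following Python sketch computes normalised Weyl sums and visit frequencies for the sequence $(n α)$ ; the angle $α = \sqrt{2} - 1$ , the interval, and the cut-off $N$ are illustrative assumptions, and floating-point arithmetic makes this a sanity check rather than a proof.

```python
import cmath
import math

def weyl_sum(alpha, k, N):
    """Return |(1/N) * sum_{n<N} exp(2*pi*i*k*n*alpha)|."""
    # For integer k only the fractional part of n*alpha matters, so reduce mod 1 first.
    total = sum(cmath.exp(2j * math.pi * k * ((n * alpha) % 1.0)) for n in range(N))
    return abs(total) / N

def visit_frequency(alpha, a, b, N):
    """Fraction of n < N whose fractional part of n*alpha lies in [a, b)."""
    return sum(1 for n in range(N) if a <= (n * alpha) % 1.0 < b) / N

alpha = math.sqrt(2) - 1  # assumed irrational rotation angle

if __name__ == "__main__":
    N = 10000
    for k in (1, 2, 3):
        print("irrational angle, k =", k, weyl_sum(alpha, k, N))
    # Rational angle 1/4 with k = 4: every term equals 1, so the average does not decay.
    print("rational angle 1/4, k = 4:", weyl_sum(0.25, 4, N))
    # Visit frequency of [0.2, 0.5) compared with the interval length 0.3.
    print("visit frequency:", visit_frequency(alpha, 0.2, 0.5, N))
```

For irrational $α$ the geometric-series bound $∣ \sum_{n < N} e^{2 π i k n α} ∣ \leq 1 / ∣ \sin (π k α) ∣$ explains the $O (1 / N)$ decay of each normalised sum, while the rational case shows how the criterion fails at a resonant frequency.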

Historical & philosophical context Master

Hermann Weyl proved the equidistribution of $(n α)$ and the general polynomial case in his 1916 Mathematische Annalen paper ^{[Weyl 1916]}, introducing the exponential-sum criterion that bears his name. Weyl's argument was Fourier-analytic and predated the abstract ergodic theory by fifteen years; the recognition that his theorem is the unique ergodicity of the rotation came only after Birkhoff's 1931 pointwise theorem ^{[Birkhoff 1931]} and the operator-theoretic reframing of the early 1930s. The two proofs of Weyl's theorem — the direct estimation of exponential sums and the dynamical deduction from a single invariant measure — remain the standard illustration of the bridge between analytic number theory and ergodic theory, and the dynamical viewpoint generalised, through Furstenberg's work on the skew shift and later through Ratner's measure-classification theorems, into the modern theory of equidistribution on homogeneous spaces.

The notion of unique ergodicity and its equivalence with uniform convergence were isolated by John Oxtoby in 1952 ^{[Oxtoby 1952]}, building on the Krylov-Bogolyubov existence theorem of 1937 ^{[Krylov-Bogolyubov 1937]}, which guaranteed at least one invariant measure for any continuous map of a compact space and, with it, the convex-analytic setting in which the ergodic measures are the extreme points. The ergodic decomposition (every invariant measure is a barycentre of ergodic ones) was developed in the same circle and given its definitive Choquet-simplex formulation in the 1950s and 1960s. The toral-automorphism criterion, that ergodicity holds exactly when no eigenvalue of the defining integer matrix is a root of unity, was worked out by Halmos and others and connected the spectral theory of the Koopman operator to the arithmetic of that matrix, a thread that runs forward into the entropy theory of hyperbolic toral automorphisms.

Bibliography Master

@article{Weyl1916,
  author  = {Weyl, Hermann},
  title   = {\"Uber die Gleichverteilung von Zahlen mod. Eins},
  journal = {Mathematische Annalen},
  volume  = {77},
  number  = {3},
  year    = {1916},
  pages   = {313--352}
}

@article{Birkhoff1931,
  author  = {Birkhoff, George D.},
  title   = {Proof of the ergodic theorem},
  journal = {Proceedings of the National Academy of Sciences},
  volume  = {17},
  number  = {12},
  year    = {1931},
  pages   = {656--660}
}

@article{Oxtoby1952,
  author  = {Oxtoby, John C.},
  title   = {Ergodic sets},
  journal = {Bulletin of the American Mathematical Society},
  volume  = {58},
  number  = {2},
  year    = {1952},
  pages   = {116--136}
}

@article{KrylovBogolyubov1937,
  author  = {Krylov, Nikolai and Bogolyubov, Nikolai},
  title   = {La th\'eorie g\'en\'erale de la mesure dans son application \`a l'\'etude des syst\`emes dynamiques de la m\'ecanique non lin\'eaire},
  journal = {Annals of Mathematics},
  volume  = {38},
  number  = {1},
  year    = {1937},
  pages   = {65--113}
}

@book{Walters1982,
  author    = {Walters, Peter},
  title     = {An Introduction to Ergodic Theory},
  publisher = {Springer},
  series    = {Graduate Texts in Mathematics},
  volume    = {79},
  year      = {1982}
}

@book{EinsiedlerWard2011,
  author    = {Einsiedler, Manfred and Ward, Thomas},
  title     = {Ergodic Theory with a View towards Number Theory},
  publisher = {Springer},
  series    = {Graduate Texts in Mathematics},
  volume    = {259},
  year      = {2011}
}

@book{CornfeldFominSinai1982,
  author    = {Cornfeld, Isaac P. and Fomin, Sergei V. and Sinai, Yakov G.},
  title     = {Ergodic Theory},
  publisher = {Springer},
  series    = {Grundlehren der mathematischen Wissenschaften},
  volume    = {245},
  year      = {1982}
}

@book{Glasner2003,
  author    = {Glasner, Eli},
  title     = {Ergodic Theory via Joinings},
  publisher = {American Mathematical Society},
  series    = {Mathematical Surveys and Monographs},
  volume    = {101},
  year      = {2003}
}

@book{Petersen1983,
  author    = {Petersen, Karl},
  title     = {Ergodic Theory},
  publisher = {Cambridge University Press},
  year      = {1983}
}

Prerequisites

38.04.01
37.02.03
21.15.02

Tier anchors

beginner: Walters 1982 *An Introduction to Ergodic Theory* (Springer GTM 79) Ch. 1 (informal: the orbit that fills the circle evenly); Einsiedler-Ward 2011 *Ergodic Theory with a View towards Number Theory* (Springer GTM 259) Ch. 4 (equidistribution as the visible face of ergodicity)
intermediate: Walters 1982 *An Introduction to Ergodic Theory* (Springer GTM 79) §1.5-1.7, §6.2-6.5 (ergodicity, its characterisations, unique ergodicity, the irrational rotation); Petersen 1983 *Ergodic Theory* (Cambridge) §2.3-2.5
master: Walters 1982 *An Introduction to Ergodic Theory* (Springer GTM 79) Ch. 1, Ch. 6 (unique ergodicity and uniform convergence; Weyl's theorem); Einsiedler-Ward 2011 *Ergodic Theory with a View towards Number Theory* (Springer GTM 259) Ch. 4, Ch. 6; Cornfeld-Fomin-Sinai 1982 *Ergodic Theory* (Springer Grundlehren 245) Ch. 3 (toral automorphisms, spectral criterion); Glasner 2003 *Ergodic Theory via Joinings* (AMS) Ch. 4 (ergodic decomposition)

References

Weyl — Über die Gleichverteilung von Zahlen mod. Eins · Mathematische Annalen 77 (1916), 313-352 (equidistribution and the Weyl criterion)
Birkhoff — Proof of the ergodic theorem · Proceedings of the National Academy of Sciences 17 (1931), 656-660
Oxtoby — Ergodic sets · Bulletin of the American Mathematical Society 58 (1952), 116-136 (unique ergodicity and uniform convergence)
Krylov-Bogolyubov — La théorie générale de la mesure dans son application à l'étude des systèmes dynamiques de la mécanique non linéaire · Annals of Mathematics 38 (1937), 65-113 (existence of invariant measures; ergodic decomposition)
Walters — An Introduction to Ergodic Theory · Springer GTM 79, 1982, Ch. 1, Ch. 6 (ergodicity, unique ergodicity, equidistribution)
Einsiedler-Ward — Ergodic Theory with a View towards Number Theory · Springer GTM 259, 2011, Ch. 4, Ch. 6 (unique ergodicity, equidistribution)

Estimated time

beginner: 18m
intermediate: 56m
master: 95m