37.04.03 · probability / 04-conditional-expectation-martingales

Doob's Maximal and L^p Inequalities, Uniform Integrability, and L^p-Bounded Martingales

shipped · 3 tiers · Lean: none

Anchor (Master): Williams — Probability with Martingales Ch. 13-14; Durrett §4.4-4.7; Neveu — Discrete-Parameter Martingales (North-Holland, 1975) Ch. II, IV-V; Doob — Stochastic Processes (Wiley, 1953) Ch. VII

Intuition Beginner

A fair game keeps your expected fortune fixed, but it says nothing directly about how high your fortune might climb along the way. Suppose you watch a fair game for $100$ rounds and record the single highest balance you ever touched. How large can that running peak be? It feels like it could be enormous — a lucky streak might carry you far above where you started before the game pulls you back. The surprising answer is that the peak is controlled: the chance that your fortune ever crosses a high level is no worse than if you only looked at the very last round. The running maximum is tamed by the endpoint.

That control is the content of Doob's maximal inequality. It says the whole trajectory of a fair (or favourable) game cannot wander far above its final value without paying a probability price, and the price is exactly the same bookkeeping that governs a single fixed time. This is what lets us turn statements about one moment into statements about the entire history at once.

The second idea answers a nagging worry from the convergence story. A fair game whose fortune settles down to a limit might still misbehave: the average fortune could drift even while almost every individual path converges, if a vanishingly rare but enormous outcome carries weight. The repair is a condition called uniform integrability — a promise that no sliver of the probability, however thin, hides a runaway contribution to the average. When that promise holds, the average follows the paths: convergence of trajectories becomes convergence of expectations, and the limit closes the game like a final settlement everyone's running balance was forecasting all along.

The takeaway: the running peak of a fair game is no wilder than its endpoint, and once you forbid hidden runaway mass, the long-run limit is a genuine final value that every earlier fortune was the honest forecast of. These two facts together turn the law of averages itself into a statement about a single converging fair game.

Visual Beginner

Picture one jagged path of a fair game climbing and dipping across $100$ rounds, with a horizontal line drawn at some high level. Doob's maximal inequality compares two events: "the path crosses the high line at some point" versus "the final balance is large." The first looks far more likely — there are a hundred chances to cross versus one final reading — yet the inequality says the running-peak event is bounded by the endpoint reading, times a fixed factor.

The circles are the first crossing times. The right-hand bars show the inequality: the peak-crossing probability sits under a bound built only from the final value. The single endpoint reading governs the entire jagged history.

A second mental picture for uniform integrability: imagine the area under the tallest spikes of a sequence of fortunes. Uniform integrability says that if you chop off everything above some large height, the chopped-off area can be made small for every fortune in the sequence at once — no single one, and no tail of them, smuggles its mass off to infinity.

Worked example Beginner

We test the maximal inequality on the fair coin-flip walk. Start at $0$ ; each round move $+ 1$ or $- 1$ by a fair coin. After $n$ rounds the fortune is $S_{n}$ , the running sum. We look at the running maximum over the first four rounds and compare with a threshold.

Step 1. Set up the question. Let $S_{n}$ be the position after $n$ flips, $S_{0} = 0$ . Define the running peak $M_{4} = max (S_{0}, S_{1}, S_{2}, S_{3}, S_{4})$ . We ask: what is the chance the peak reaches $3$ or more, that is $M_{4} \geq 3$ ?

Step 2. Enumerate. List the four-flip sequences whose running peak hits $3$ . The walk sits at an odd height after an odd number of flips and at an even height after an even number, so within four rounds it can stand at $3$ only at step $3$ , and that requires the first three flips to be UUU. The fourth flip is then free, giving UUUU and UUUD. A sequence with two ups and one down among its first three flips, such as UUDU, stands at $1$ after step $3$ and peaks at $2$ , so it does not count. That is $2$ sequences out of $16$ .

Step 3. Read off the probability. Each four-flip sequence has probability $1/16$ , so $P (M_{4} \geq 3) = 2/16 = 0.125$ .

Step 4. Compare with the maximal bound. The submartingale form of Doob's inequality controls $P (M_{4} \geq 3)$ by $E [(S_{4})^{+}] /3$ , where $(S_{4})^{+}$ is the positive part of the final fortune. Computing the final-step distribution: $S_{4}$ takes values $- 4, - 2, 0, 2, 4$ with probabilities $1/16, 4/16, 6/16, 4/16, 1/16$ . So $E [(S_{4})^{+}] = (2) (4/16) + (4) (1/16) = 8/16 + 4/16 = 12/16 = 0.75$ . The bound is $0.75/3 = 0.25$ , and indeed $0.125 \leq 0.25$ .

Step 5. Sharpen the bound. The bound in Step 4 is the weaker of the two forms of Doob's inequality. The sharper form weighs the final value only on the crossing event: $λ P (M_{n} \geq λ) \leq E [S_{n} 1_{\{M_{n} \geq λ\}}]$ . Here the left side is $3 \cdot 0.125 = 0.375$ . On the crossing event the walk finishes at $4$ (UUUU) or at $2$ (UUUD), each with probability $1/16$ , so $E [S_{4} 1_{\{M_{4} \geq 3\}}] = 4/16 + 2/16 = 0.375$ . The sharper inequality holds with equality, because the walk is a martingale that lands exactly on the level $3$ when it first crosses; dropping the indicator and keeping only the positive part loosens the right side to $0.75$ .

What this tells us: the running peak of a fair game is pinned down by where the game ends, and the precise inequality weighs the final fortune only on the paths that actually crossed. A high peak forces a high conditional endpoint — you cannot climb without leaving a trace at the finish.
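The quantities in this example are small enough to check by brute force. The sketch below enumerates all $16$ paths with exact fractions; the function name `walk_stats` and its parameters are illustrative choices, not part of the unit.

```python
from fractions import Fraction
from itertools import accumulate, product


def walk_stats(n=4, level=3):
    """Exact P(M_n >= level), E[S_n^+] and E[S_n 1_{M_n >= level}] for the fair +/-1 walk."""
    paths = list(product((1, -1), repeat=n))
    weight = Fraction(1, len(paths))  # every path is equally likely
    p_cross = Fraction(0)
    e_pos = Fraction(0)
    e_on_cross = Fraction(0)
    for steps in paths:
        positions = [0, *accumulate(steps)]  # S_0, S_1, ..., S_n
        final = positions[-1]
        e_pos += weight * max(final, 0)
        if max(positions) >= level:  # the running peak reached the level
            p_cross += weight
            e_on_cross += weight * final
    return p_cross, e_pos, e_on_cross


p_cross, e_pos, e_on_cross = walk_stats(4, 3)
print(p_cross, e_pos, e_on_cross)
print(3 * p_cross <= e_on_cross <= e_pos)
```

Running it is a quick way to confirm the count in Step 2 and the two bounds in Steps 4 and 5; changing `n` and `level` tests the same chain of inequalities on other horizons.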

Check your understanding Beginner

Exercise (easy, multiple choice).

Doob's maximal inequality for a fair game says that the probability the running maximum over many rounds reaches a high level $λ$ is controlled by:

A. The starting fortune divided by $λ$
B. A quantity built from the final fortune (its average size weighed on the crossing paths), divided by $λ$
C. The number of rounds played
D. Nothing — the peak can be arbitrarily likely to be large

Hint

The whole point is that the endpoint of the game controls the entire earlier history of peaks.

Answer

B. Doob's inequality bounds the peak-crossing probability by the average size of the final fortune (restricted to the paths that crossed), divided by the level $λ$ . The final reading governs the whole trajectory. Option A names the wrong endpoint (the start, not the finish); a fair game starting at $0$ would then have a vacuous bound. Option C is irrelevant — the bound does not grow with the number of rounds. Option D is the false worry the inequality dispels.

Formal definition Intermediate+

Fix a probability space $(Ω, F, P)$ with a filtration $(F_{n})_{n \geq 0}$ . Martingales, submartingales, stopping times, and conditional expectation (with its tower, take-out, and Jensen properties) are taken as established in 37.04.01, where conditional expectation is built from the Radon-Nikodym theorem 02.07.08. The $L^{p}$ -norms $∥ X ∥_{p} = (E ∣ X ∣^{p})^{1/ p}$ and the completeness of $L^{p} (P)$ are taken from 02.07.06.

Definition (uniform integrability). A family $C \subseteq L^{1} (P)$ is uniformly integrable (UI) if $$ \lim_{K \to \infty} \sup_{X \in \mathcal{C}} \mathbb{E}\big[|X|\, \mathbf{1}_{\{|X| > K\}}\big] = 0. $$ Equivalently (under $\sup_{X} \mathbb{E}|X| < \infty$ , which UI implies) $\mathcal{C}$ is UI iff it is $L^{1}$ -bounded and uniformly absolutely continuous: for every $\varepsilon > 0$ there is $\delta > 0$ such that $\mathbb{P}(A) < \delta$ implies $\sup_{X \in \mathcal{C}} \mathbb{E}[|X|\mathbf{1}_A] < \varepsilon$ .

Definition (running maximum). For a process $(X_{n})$ write $X_{n}^{*} = max_{0 \leq k \leq n} ∣ X_{k} ∣$ and $X^{*} = sup_{n \geq 0} ∣ X_{n} ∣$ , the running and total maxima.

Definition (UI / closed martingale). A martingale $(M_{n})$ is uniformly integrable if the family ${M_{n} : n \geq 0}$ is UI, and closed by $M_{\infty} \in L^{1}$ if $M_{n} = E [M_{\infty} ∣ F_{n}]$ a.s. for every $n$ .

Definition ( $L^{p}$ -bounded; reversed martingale). A martingale is $L^{p}$ -bounded ( $p \geq 1$ ) if $sup_{n} ∥ M_{n} ∥_{p} < \infty$ . Given a decreasing family $G_{0} \supseteq G_{1} \supseteq \dots$ of sub- $σ$ -algebras, a process $(Y_{n})$ with $Y_{n}$ being $G_{n}$ -measurable and integrable is a reversed (backward) martingale if $E [Y_{n} ∣ G_{n + 1}] = Y_{n + 1}$ a.s. for all $n$ ; equivalently $Y_{n} = E [Y_{0} ∣ G_{n}]$ .

Counterexamples to common slips Intermediate+

$L^{1}$ -boundedness does not imply uniform integrability. On $([0, 1], Leb)$ the sequence $X_{n} = n 1_{(0, 1/ n)}$ has $E ∣ X_{n} ∣ = 1$ for all $n$ , yet $E [∣ X_{n} ∣ 1_{{∣ X_{n} ∣ > K}}] = 1$ whenever $n > K$ , so the family is not UI. The mass escapes to infinity on a shrinking set. UI is strictly stronger than $L^{1}$ -boundedness.
A single integrable variable is UI; an $L^{1}$ -bounded martingale need not be. The family ${E [Y ∣ G] : G \subseteq F}$ for fixed $Y \in L^{1}$ is UI, but a martingale that is merely $L^{1}$ -bounded (not closable) can fail UI and then converges a.s. without converging in $L^{1}$ — its mean is not transmitted to the limit.
$L^{p}$ -boundedness for $p > 1$ does force UI. If $sup_{n} E ∣ M_{n} ∣^{p} < C < \infty$ with $p > 1$ , then $E [∣ M_{n} ∣ 1_{{∣ M_{n} ∣ > K}}] \leq K^{1 - p} E ∣ M_{n} ∣^{p} \leq C K^{1 - p} \to 0$ uniformly. The endpoint $p = 1$ is exactly where this collapses, which is why $L^{1}$ convergence needs UI as a separate hypothesis.
Doob's $L^{p}$ inequality fails at $p = 1$ . The constant $p / (p - 1)$ blows up as $p ↓ 1$ , and indeed $∥ X^{*} ∥_{1}$ can be infinite for an $L^{1}$ -bounded martingale; the correct $p = 1$ statement is the weak-type bound $λ P (X_{n}^{*} \geq λ) \leq E [∣ X_{n} ∣ 1_{\{X_{n}^{*} \geq λ\}}]$ and an $L \log L$ control of $∥ X^{*} ∥_{1}$ .
Reversing the filtration matters. The convergence theorem for reversed martingales runs along a decreasing family $G_{n} ↓ G_{\infty}$ ; applying the forward convergence theorem to an increasing filtration here gives the wrong limit $σ$ -algebra and misses the SLLN application.
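The first and third items can be compared numerically. The sketch below computes the tail mass $E [∣ X ∣ 1_{\{∣ X ∣ > K\}}]$ for indicator spikes on $([0, 1], Leb)$ . The family $\sqrt{n}\, 1_{(0, 1/ n)}$ is a hypothetical comparison case added here (it has $E [X_{n}^{2}] = 1$ , so it is $L^{2}$ -bounded); the helper names and the finite cutoff `n_max` standing in for the supremum are likewise assumptions of the sketch.

```python
import math


def tail_mass(height, width, K):
    """E[|X| 1_{|X| > K}] for X = height * 1_{(0, width)} under Lebesgue measure on [0, 1]."""
    return height * width if height > K else 0.0


def sup_tail(family, K, n_max=20_000):
    """Largest tail mass over n = 1..n_max (a finite stand-in for the supremum)."""
    return max(tail_mass(*family(n), K) for n in range(1, n_max + 1))


def spike(n):
    # X_n = n * 1_{(0, 1/n)}: E|X_n| = 1 for every n, but the family is not UI.
    return n, 1.0 / n


def sqrt_spike(n):
    # Hypothetical comparison family sqrt(n) * 1_{(0, 1/n)}: L^2-bounded, hence UI.
    return math.sqrt(n), 1.0 / n


for K in (10, 100):
    print(K, sup_tail(spike, K), sup_tail(sqrt_spike, K))
```

For the first family the tail mass stays at $1$ however large $K$ is taken, while for the second it stays below $K^{1 - p}$ with $p = 2$ , matching the bound $C K^{1 - p}$ in the third item with $C = 1$ .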

Key theorem with proof Intermediate+

Theorem (Doob's maximal and $L^{p}$ inequalities). Let $(X_{n})$ be a non-negative submartingale (e.g. $X_{n} = ∣ M_{n} ∣$ for a martingale $M$ , by conditional Jensen). For every $λ > 0$ and $N \geq 0$ , $$ \lambda\, \mathbb{P}\big(X_N^* \ge \lambda\big) \;\le\; \mathbb{E}\big[X_N\, \mathbf{1}_{\{X_N^* \ge \lambda\}}\big] \;\le\; \mathbb{E}[X_N]. $$ Consequently, for $p \in (1, \infty)$ , $$ \big\|X_N^*\big\|_p \;\le\; \frac{p}{p-1}\, \|X_N\|_p . $$

Proof. The weak-type (maximal) inequality. Define the stopping time $τ = \inf \{n \geq 0 : X_{n} \geq λ\}$ and fix the horizon $N$ . The event $\{X_{N}^{*} \geq λ\} = \{τ \leq N\}$ partitions into the disjoint events $\{τ = k\}$ , $k = 0, \dots, N$ , each lying in $F_{k}$ . On $\{τ = k\}$ we have $X_{k} \geq λ$ . Since $(X_{n})$ is a submartingale, $E [X_{N} ∣ F_{k}] \geq X_{k}$ , so integrating over the $F_{k}$ -set $\{τ = k\}$ , $$ \mathbb{E}[X_N \mathbf{1}_{\{\tau = k\}}] \ge \mathbb{E}[X_k \mathbf{1}_{\{\tau = k\}}] \ge \lambda\, \mathbb{P}(\tau = k). $$ Summing over $k = 0, \dots, N$ , $$ \mathbb{E}[X_N \mathbf{1}_{\{X_N^* \ge \lambda\}}] = \sum_{k=0}^N \mathbb{E}[X_N \mathbf{1}_{\{\tau = k\}}] \ge \lambda \sum_{k=0}^N \mathbb{P}(\tau = k) = \lambda\, \mathbb{P}(X_N^* \ge \lambda). $$ The final bound $E [X_{N} 1_{\{X_{N}^{*} \geq λ\}}] \leq E [X_{N}]$ uses $X_{N} \geq 0$ .

The $L^{p}$ inequality. Write $X^{*} := X_{N}^{*}$ and assume $∥ X_{N} ∥_{p} < \infty$ (else there is nothing to prove). Each $X_{k}$ with $k \leq N$ then lies in $L^{p}$ , since $0 \leq X_{k} \leq E [X_{N} ∣ F_{k}]$ and conditional Jensen give $∥ X_{k} ∥_{p} \leq ∥ X_{N} ∥_{p}$ ; hence $∥ X^{*} ∥_{p} < \infty$ because $X^{*} \leq \sum_{k \leq N} X_{k}$ is a finite sum of $L^{p}$ variables. By the layer-cake formula and Fubini-Tonelli 02.07.07, using the weak-type bound $λ P (X^{*} \geq λ) \leq E [X_{N} 1_{\{X^{*} \geq λ\}}]$ , $$ \mathbb{E}[(X^*)^p] = \int_0^\infty p \lambda^{p-1}\, \mathbb{P}(X^* \ge \lambda)\, d\lambda \le \int_0^\infty p \lambda^{p-2}\, \mathbb{E}[X_N \mathbf{1}_{\{X^* \ge \lambda\}}]\, d\lambda. $$ Interchanging integration order (Tonelli, integrand non-negative), $$ \int_0^\infty p \lambda^{p-2} X_N \mathbf{1}_{\{X^* \ge \lambda\}}\, d\lambda = X_N \int_0^{X^*} p \lambda^{p-2}\, d\lambda = \frac{p}{p-1}\, X_N (X^*)^{p-1}. $$ Taking expectations and applying Hölder's inequality 02.07.06 with exponents $p$ and $q = p / (p - 1)$ , $$ \mathbb{E}[(X^*)^p] \le \frac{p}{p-1}\, \mathbb{E}\big[X_N (X^*)^{p-1}\big] \le \frac{p}{p-1}\, \|X_N\|_p\, \big\|(X^*)^{p-1}\big\|_q = \frac{p}{p-1}\, \|X_N\|_p\, \mathbb{E}[(X^*)^p]^{1/q}. $$ Since $E [(X^{*})^{p}] < \infty$ , divide by $E [(X^{*})^{p}]^{1/ q}$ (finite and, in the nondegenerate case, positive) and use $1 - 1/ q = 1/ p$ to obtain $∥ X^{*} ∥_{p} \leq \frac{p}{p - 1} ∥ X_{N} ∥_{p}$ . $□$
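As a sanity check on the statement just proved, the two sides of the $L^{p}$ inequality can be computed by enumeration for $X_{n} = ∣ S_{n} ∣$ , with $S_{n}$ the fair $\pm 1$ walk. This is a minimal sketch; the function name `doob_lp_check` and the default horizon are illustrative assumptions.

```python
from itertools import accumulate, product


def doob_lp_check(n_steps=10, p=2.0):
    """Return (||X_N^*||_p, p/(p-1) * ||X_N||_p) for X_n = |S_n|, S the fair +/-1 walk."""
    n_paths = 2 ** n_steps
    max_moment = 0.0
    end_moment = 0.0
    for steps in product((1, -1), repeat=n_steps):
        # |S_0|, |S_1|, ..., |S_N| along this path
        abs_positions = [abs(s) for s in accumulate(steps, initial=0)]
        max_moment += max(abs_positions) ** p
        end_moment += abs_positions[-1] ** p
    lhs = (max_moment / n_paths) ** (1 / p)
    rhs = p / (p - 1) * (end_moment / n_paths) ** (1 / p)
    return lhs, rhs


for p in (1.5, 2.0, 3.0):
    print(p, doob_lp_check(10, p))
```

For $p = 2$ the right side is $2 \sqrt{N}$ , since $E [S_{N}^{2}] = N$ . Letting $p$ approach $1$ in the loop shows the constant $p / (p - 1)$ growing without bound.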

Bridge. Doob's $L^{p}$ inequality builds toward the $L^{p}$ convergence of $L^{p}$ -bounded martingales and appears again in the continuous-time Burkholder-Davis-Gundy inequalities of stochastic analysis, where the running maximum of a martingale is comparable to its quadratic variation. The foundational reason the running maximum is controlled by the endpoint is the optional-stopping accounting of 37.04.01: stopping at the first crossing time and using the submartingale inequality $E [X_{N} ∣ F_{τ}] \geq X_{τ} \geq λ$ is exactly the trick that converts a whole-history event into a single-time conditional expectation. This is exactly the mechanism by which the weak-type bound feeds the layer-cake integral, and putting these together via Hölder is the central insight that a self-improving inequality — the maximum bounded by itself to a lower power — closes into the clean constant $p / (p - 1)$ . The maximal inequality is dual to the upcrossing inequality of 37.04.01: upcrossings control oscillation and deliver a.s. convergence, while the maximal inequality controls amplitude and delivers $L^{p}$ convergence, and the bridge is that both are predictable-stopping-time computations against a submartingale.

Exercises Intermediate+

Exercise 4 (medium, symbolic).

Let $Y \in L^{1} (P)$ . Prove that the family $\{E [Y ∣ G] : G \subseteq F\}$ , where $G$ ranges over all sub- $σ$ -algebras of $F$ , is uniformly integrable.

Hint

Use conditional Jensen for $∣ \cdot ∣$ , then split $E [∣ Y ∣; A]$ over ${∣ Y ∣ > c}$ and use absolute continuity of the integral on the event ${∣ E [Y ∣ G] ∣ > K}$ .

Answer

Write $Z_{G} = E [Y ∣ G]$ . Conditional Jensen gives $∣ Z_{G} ∣ \leq E [∣ Y ∣ ∣ G]$ , so $E ∣ Z_{G} ∣ \leq E ∣ Y ∣$ , an $L^{1}$ -bound uniform over $G$ . By Markov, $P (∣ Z_{G} ∣ > K) \leq E ∣ Y ∣/ K =: δ_{K} \to 0$ . Now on $A = {∣ Z_{G} ∣ > K} \in G$ , $$ \mathbb{E}[|Z_\mathcal{G}|\mathbf{1}_A] \le \mathbb{E}[\mathbb{E}[|Y|\mid\mathcal G]\mathbf{1}_A] = \mathbb{E}[|Y|\mathbf{1}_A], $$ using that $A \in G$ and the defining property of conditional expectation. Given $ε > 0$ , absolute continuity of the integral of the fixed $L^{1}$ function $∣ Y ∣$ supplies $δ > 0$ with $P (A) < δ \Rightarrow E [∣ Y ∣ 1_{A}] < ε$ . Choose $K$ large enough that $δ_{K} < δ$ ; then $E [∣ Z_{G} ∣ 1_{{∣ Z_{G} ∣ > K}}] < ε$ uniformly in $G$ . Rubric: full credit for the Jensen domination, the Markov-plus-absolute-continuity argument, and the use of $A \in G$ . This lemma is the engine that makes every closed martingale uniformly integrable.

Exercise 5 (medium, numeric).

Let $(M_{n})$ be the martingale $M_{n} = \prod_{k = 1}^{n} (1 + ξ_{k})$ where $ξ_{k}$ are i.i.d. taking values $+ 1$ and $- 1$ each with probability $1/2$ . Compute $E [M_{n}]$ for all $n$ , identify the a.s. limit $M_{\infty}$ , and decide whether $(M_{n})$ is uniformly integrable.

Hint

Each factor $1 + ξ_{k}$ is $0$ or $2$ . What happens the first time $ξ_{k} = - 1$ ?

Answer

Each factor is $1 + ξ_{k} \in \{0, 2\}$ with $E [1 + ξ_{k}] = 1$ , and by independence $E [M_{n}] = \prod_{k} E [1 + ξ_{k}] = 1$ for all $n$ ; $(M_{n})$ is a non-negative martingale. The first time some $ξ_{k} = - 1$ , a factor is $0$ and $M_{n} = 0$ thereafter; since a $- 1$ occurs eventually with probability $1$ , $M_{n} \to M_{\infty} = 0$ a.s. But $E [M_{\infty}] = 0 \neq 1 = \lim_{n} E [M_{n}]$ , so the mean is not transmitted to the limit. Hence $(M_{n})$ is not uniformly integrable (a UI martingale would satisfy $E [M_{\infty}] = E [M_{0}]$ ). This is the canonical example of an $L^{1}$ -bounded martingale that converges a.s. but not in $L^{1}$ .
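The escape of mass in this answer can be seen exactly for small $n$ . The sketch below tabulates the law of $M_{n}$ by enumerating all sign patterns; the helper names `product_martingale_law`, `mean` and `mass_above` are illustrative, not part of the exercise.

```python
from fractions import Fraction
from itertools import product
from math import prod


def product_martingale_law(n):
    """Exact law of M_n = prod_{k<=n} (1 + xi_k), returned as {value: probability}."""
    law = {}
    weight = Fraction(1, 2 ** n)  # each sign pattern is equally likely
    for xis in product((1, -1), repeat=n):
        value = prod(1 + x for x in xis)  # empty product is 1, so M_0 = 1
        law[value] = law.get(value, Fraction(0)) + weight
    return law


def mean(law):
    return sum(v * p for v, p in law.items())


def mass_above(law, K):
    """E[M_n 1_{M_n > K}]: the part of the mean carried above height K."""
    return sum(v * p for v, p in law.items() if v > K)


for n in (1, 5, 10):
    law = product_martingale_law(n)
    print(n, mean(law), law.get(0, Fraction(0)), mass_above(law, 1000))
```

The mean stays at $1$ while the probability of sitting at $0$ tends to $1$ ; once $2^{n}$ exceeds the cutoff, the entire mean is carried above it, which is the failure of uniform integrability in concrete form.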

Exercise 7 (hard, symbolic).

Prove Vitali's convergence theorem in the form used below: if $X_{n} \to X$ in probability and $(X_{n})$ is uniformly integrable, then $X \in L^{1}$ and $X_{n} \to X$ in $L^{1}$ .

Hint

Truncate at level $K$ : $φ_{K} (x) = (x \land K) \lor (- K)$ . Split $E ∣ X_{n} - X ∣$ into a truncated part (controlled by convergence in probability of bounded variables) and two tails (controlled by UI).

Answer

Let $φ_{K} (x) = (x \land K) \lor (- K)$ , the truncation to $[- K, K]$ , so $∣ x - φ_{K} (x) ∣ = (∣ x ∣ - K)^{+} \leq ∣ x ∣ 1_{{∣ x ∣ > K}}$ . First, $X \in L^{1}$ : by passing to an a.s.-convergent subsequence (convergence in probability yields one) and Fatou, $E ∣ X ∣ \leq lim inf E ∣ X_{n} ∣ < \infty$ since UI implies $L^{1}$ -boundedness. Now decompose $$ \mathbb{E}|X_n - X| \le \mathbb{E}|X_n - \varphi_K(X_n)| + \mathbb{E}|\varphi_K(X_n) - \varphi_K(X)| + \mathbb{E}|\varphi_K(X) - X|. $$ The first term is $\leq sup_{m} E [∣ X_{m} ∣ 1_{{∣ X_{m} ∣ > K}}]$ , made $< ε /3$ for large $K$ by UI. The third is $\leq E [∣ X ∣ 1_{{∣ X ∣ > K}}] < ε /3$ for large $K$ by dominated convergence (single integrable $∣ X ∣$ ). Fix such a $K$ . For the middle term, $φ_{K}$ is bounded and $1$ -Lipschitz, so $φ_{K} (X_{n}) \to φ_{K} (X)$ in probability; being uniformly bounded by $K$ , these are UI and converge in $L^{1}$ (bounded convergence in probability), giving $E ∣ φ_{K} (X_{n}) - φ_{K} (X) ∣ < ε /3$ for large $n$ . Hence $E ∣ X_{n} - X ∣ < ε$ eventually. Rubric: full credit for the truncation, the UI control of both tails, and the bounded middle term. The converse also holds: $L^{1}$ convergence implies UI, so UI is exactly the gap between convergence in probability and convergence in $L^{1}$ .

Exercise 8 (hard, symbolic).

Let $(ξ_{k})$ be i.i.d. with $E ∣ ξ_{1} ∣ < \infty$ , $S_{n} = \sum_{k = 1}^{n} ξ_{k}$ , and let $G_{n} = σ (S_{n}, S_{n + 1}, \dots) = σ (S_{n}, ξ_{n + 1}, ξ_{n + 2}, \dots)$ . Prove that $E [ξ_{1} ∣ G_{n}] = S_{n} / n$ , exhibiting $S_{n} / n$ as a reversed martingale.

Hint

By exchangeability the $ξ_{j}$ for $j \leq n$ have the same conditional law given $G_{n}$ ; sum $E [ξ_{j} ∣ G_{n}]$ over $j = 1, \dots, n$ .

Answer

The $σ$ -algebra $G_{n}$ is generated by $S_{n}$ together with $ξ_{n + 1}, ξ_{n + 2}, \dots$ . The random vector $(ξ_{1}, \dots, ξ_{n})$ is exchangeable, and $S_{n}$ together with the future increments is invariant under permuting the first $n$ coordinates; hence for any $i, j \leq n$ , $E [ξ_{i} ∣ G_{n}] = E [ξ_{j} ∣ G_{n}]$ a.s. (the conditional law of $(ξ_{1}, \dots, ξ_{n})$ given $G_{n}$ is exchangeable). Summing, $$ \sum_{j=1}^n \mathbb{E}[\xi_j \mid \mathcal{G}_n] = \mathbb{E}[S_n \mid \mathcal{G}_n] = S_n, $$ since $S_{n}$ is $G_{n}$ -measurable. As all $n$ summands are equal, each equals $S_{n} / n$ , so $E [ξ_{1} ∣ G_{n}] = S_{n} / n$ . The families $G_{n}$ decrease ( $G_{n + 1} \subseteq G_{n}$ because $S_{n + 1} = S_{n} + ξ_{n + 1}$ and $ξ_{n + 2}, \dots$ are all $G_{n}$ -measurable), and the tower property gives $\mathbb{E}[S_n/n \mid \mathcal{G}_{n+1}] = \mathbb{E}[\mathbb{E}[\xi_1\mid\mathcal{G}_n]\mid \mathcal{G}_{n+1}] = \mathbb{E}[\xi_1 \mid \mathcal{G}_{n+1}] = S_{n+1}/(n+1)$ , so $(S_n/n)$ is a reversed martingale. Rubric: full credit for the exchangeability symmetry, the summation to $S_n$ , and the decreasing-filtration check. This identity is the martingale engine of the strong law.
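The identity can be checked exactly on a small finite example. The sketch below conditions on $S_{n}$ alone, which suffices here because the future increments are independent of $(ξ_{1}, \dots, ξ_{n})$ . The three-point distribution and the function name `conditional_mean_first_step` are hypothetical choices made for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import prod


def conditional_mean_first_step(values, probs, n):
    """Return {s: E[xi_1 | S_n = s]} for i.i.d. xi_k with the given finite law."""
    mass = defaultdict(Fraction)   # P(S_n = s)
    first = defaultdict(Fraction)  # E[xi_1 1_{S_n = s}]
    for idx in product(range(len(values)), repeat=n):
        w = prod(probs[i] for i in idx)   # probability of this outcome
        s = sum(values[i] for i in idx)   # value of S_n on this outcome
        mass[s] += w
        first[s] += w * values[idx[0]]
    return {s: first[s] / mass[s] for s in mass}


values = (-1, 0, 2)
probs = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
cond = conditional_mean_first_step(values, probs, 4)
print(all(cond[s] == Fraction(s, 4) for s in cond))
```

The law is deliberately asymmetric: the identity $E [ξ_{1} ∣ S_{n}] = S_{n} / n$ rests on exchangeability of the first $n$ increments, not on any symmetry of the individual steps.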

Advanced results Master

The convergence theory of 37.04.01 separates cleanly by integrability class, and uniform integrability is the exact dividing line. The Vitali / Dunford-Pettis characterisation states that for a sequence $X_{n} \to X$ in probability, $L^{1}$ convergence holds if and only if $(X_{n})$ is uniformly integrable; Dunford and Pettis (1940 Trans. AMS 47, 323) further identify UI with relative weak compactness in $L^{1}$ , so the analytic notion of $L^{1}$ -convergence and the functional-analytic notion of weak $L^{1}$ -compactness coincide on the same hypothesis. Applied to martingales: an $L^{1}$ -bounded martingale converges a.s. to an integrable $M_{\infty}$ (upcrossing inequality, 37.04.01); the convergence is in $L^{1}$ iff the martingale is UI; and in that case the martingale closes, $M_{n} = E [M_{\infty} ∣ F_{n}]$ , with $M_{\infty}$ generating the family in the precise Radon-Nikodym sense of 02.07.08. The three statements — UI, $L^{1}$ -convergence, closure — are equivalent for a martingale, and the proof of "UI $\Rightarrow$ closure" runs through Exercise 4 in reverse: pass $n \to \infty$ in $\int_{A} M_{n} = \int_{A} M_{m}$ for $A \in F_{m}$ using $L^{1}$ -convergence to obtain $\int_{A} M_{\infty} = \int_{A} M_{m}$ , the defining identity of $E [M_{\infty} ∣ F_{m}]$ .

For $p > 1$ the picture sharpens. An $L^{p}$ -bounded martingale is automatically UI (Exercise 3), hence converges a.s. and in $L^{1}$ to $M_{\infty}$ with $M_{n} = E [M_{\infty} ∣ F_{n}]$ ; Doob's $L^{p}$ inequality then upgrades the convergence to $L^{p}$ . Indeed $∥ M^{*} ∥_{p} \leq \frac{p}{p - 1} sup_{n} ∥ M_{n} ∥_{p} < \infty$ , so $M^{*} \in L^{p}$ dominates $∣ M_{n} - M_{\infty} ∣^{p} \leq (2 M^{*})^{p} \in L^{1}$ , and dominated convergence delivers $∥ M_{n} - M_{\infty} ∥_{p} \to 0$ . Thus an $L^{p}$ -bounded martingale is precisely a closed martingale $M_{n} = E [M_{\infty} ∣ F_{n}]$ with $M_{\infty} \in L^{p}$ , and the map $M_{\infty} \mapsto (M_{n})$ is an isometric correspondence between $L^{p} (F_{\infty})$ and $L^{p}$ -bounded martingales.

The reversed (backward) martingale convergence theorem completes the toolkit. If $(Y_{n})$ is a reversed martingale along a decreasing family $G_{n} ↓ G_{\infty} = ⋂_{n} G_{n}$ , written $Y_{n} = E [Y_{0} ∣ G_{n}]$ , then $(Y_{n})$ is automatically uniformly integrable (Exercise 4) and converges a.s. and in $L^{1}$ to $Y_{\infty} = E [Y_{0} ∣ G_{\infty}]$ . The a.s. convergence is again an upcrossing argument: the finite segments $Y_{n}, Y_{n - 1}, \dots, Y_{0}$ form an ordinary martingale, so the number of upcrossings of any interval is bounded in expectation uniformly in $n$ , and the automatic UI removes any $L^{1}$ deficiency. No $L^{1}$ -boundedness hypothesis is needed — it is supplied for free by the closure $Y_{n} = E [Y_{0} ∣ G_{n}]$ .

This machinery yields the strong law of large numbers in one stroke. With i.i.d. $ξ_{k}$ , $E ∣ ξ_{1} ∣ < \infty$ , $S_{n} = \sum_{k \leq n} ξ_{k}$ , and $G_{n} = σ (S_{n}, S_{n + 1}, \dots)$ , Exercise 8 shows $S_{n} / n = E [ξ_{1} ∣ G_{n}]$ is a reversed martingale. The reversed convergence theorem gives $S_{n} / n \to E [ξ_{1} ∣ G_{\infty}]$ a.s. and in $L^{1}$ . The tail $σ$ -algebra $G_{\infty}$ is contained in the exchangeable $σ$ -algebra, which the Hewitt-Savage zero-one law makes $P$ -degenerate (every event has probability $0$ or $1$ ) for i.i.d. sequences; hence the limit is a.s. constant, and being the $L^{1}$ -limit of $S_{n} / n$ it equals $E [ξ_{1}]$ . So $S_{n} / n \to E [ξ_{1}]$ a.s. — Kolmogorov's SLLN, under only a first-moment assumption, with no truncation and no variance hypothesis. The Kolmogorov 0-1 law for the tail $σ$ -algebra $⋂_{n} σ (ξ_{n}, ξ_{n + 1}, \dots)$ falls out of the same circle of ideas: a tail event $A$ has $1_{A} = E [1_{A} ∣ T_{n}]$ asymptotically independent of itself, forcing $P (A) \in {0, 1}$ .

Synthesis. The foundational reason these results cohere is that a martingale is a consistent system of conditional-expectation densities and uniform integrability is the single hypothesis that makes the system close at infinity — this is exactly the Radon-Nikodym closure $M_{n} = E [M_{\infty} ∣ F_{n}]$ of 02.07.08 read as the convergence of a density process. Doob's maximal inequality controls amplitude and is dual to the upcrossing inequality that controls oscillation; putting these together, the upcrossing bound delivers a.s. convergence while the $L^{p}$ inequality delivers $L^{p}$ convergence, and the central insight is that both reduce to predictable-stopping computations against a submartingale. The reversed martingale generalises the forward theory by running the conditioning along a decreasing filtration, and this is exactly the structure the SLLN needs: $S_{n} / n$ is the conditional expectation of one increment given the symmetric future, automatically UI, automatically convergent, with the limit pinned to the mean by a zero-one law. The bridge is that the law of large numbers, the central limit machinery's $L^{2}$ companion, and the closure of UI martingales are one phenomenon — the asymptotic Radon-Nikodym identification of a conditional-expectation system with a single limit variable — so that what looks like three separate theorems (Doob's inequality, the $L^{1}$ / $L^{p}$ convergence dichotomy, the SLLN) is the time-asymptotics of a single closure principle, and it is dual to the continuous-time martingale convergence and Burkholder-Davis-Gundy theory that drives stochastic calculus downstream.

Full proof set Master

The maximal and $L^{p}$ inequalities are proved in full in the Key theorem section. The remaining Master claims are recorded here.

Proposition (closure of a UI martingale). For a martingale $(M_{n})$ the following are equivalent: (a) $(M_{n})$ is uniformly integrable; (b) $M_{n}$ converges in $L^{1}$ ; (c) there is $M_{\infty} \in L^{1}$ with $M_{n} = E [M_{\infty} ∣ F_{n}]$ for all $n$ . When they hold, $M_{n} \to M_{\infty}$ a.s. and in $L^{1}$ .

Proof. (c) $\Rightarrow$ (a): the family $\{E [M_{\infty} ∣ F_{n}]\}$ is UI by the conditional-expectation lemma (Exercise 4). (a) $\Rightarrow$ (b): UI implies $L^{1}$ -boundedness, so by the martingale convergence theorem 37.04.01 $M_{n} \to M_{\infty}$ a.s.; Vitali's theorem (Exercise 7) upgrades a.s. (hence in-probability) convergence under UI to $L^{1}$ convergence. (b) $\Rightarrow$ (c): if $M_{n} \to M_{\infty}$ in $L^{1}$ , fix $m$ and $A \in F_{m}$ ; the martingale property gives $\int_{A} M_{n} d P = \int_{A} M_{m} d P$ for all $n \geq m$ , and $∣ \int_{A} M_{n} d P - \int_{A} M_{\infty} d P ∣ \leq ∥ M_{n} - M_{\infty} ∥_{1} \to 0$ , so $\int_{A} M_{\infty} d P = \int_{A} M_{m} d P$ . As $A \in F_{m}$ was arbitrary and $M_{m}$ is $F_{m}$ -measurable, $M_{m} = E [M_{\infty} ∣ F_{m}]$ . The a.s. convergence is recorded in the (a) $\Rightarrow$ (b) step. $□$

Proposition ( $L^{p}$ convergence of $L^{p}$ -bounded martingales, $p > 1$ ). If $(M_{n})$ is a martingale with $sup_{n} ∥ M_{n} ∥_{p} < \infty$ for some $p \in (1, \infty)$ , then there is $M_{\infty} \in L^{p}$ with $M_{n} = E [M_{\infty} ∣ F_{n}]$ , and $M_{n} \to M_{\infty}$ a.s. and in $L^{p}$ ; moreover $∥ M^{*} ∥_{p} \leq \frac{p}{p - 1} ∥ M_{\infty} ∥_{p}$ .

Proof. $L^{p}$ -boundedness with $p > 1$ forces uniform integrability (Exercise 3), so the previous proposition gives $M_{\infty} \in L^{1}$ with $M_{n} = E [M_{\infty} ∣ F_{n}]$ and a.s. convergence. By Doob's $L^{p}$ inequality applied on each horizon $N$ , $∥ M_{N}^{*} ∥_{p} \leq \frac{p}{p - 1} ∥ M_{N} ∥_{p} \leq \frac{p}{p - 1} sup_{n} ∥ M_{n} ∥_{p} =: C < \infty$ ; letting $N \to \infty$ , monotone convergence gives $∥ M^{*} ∥_{p} \leq C$ , so $M^{*} \in L^{p}$ and in particular $M_{\infty} \in L^{p}$ (as $∣ M_{\infty} ∣ \leq M^{*}$ ). Then $∣ M_{n} - M_{\infty} ∣^{p} \leq (2 M^{*})^{p} \in L^{1}$ dominates the a.s.-null sequence $∣ M_{n} - M_{\infty} ∣^{p}$ , and dominated convergence [02.07.06 completeness, via DCT] yields $∥ M_{n} - M_{\infty} ∥_{p} \to 0$ . Finally $∥ M_{\infty} ∥_{p} = lim_{n} ∥ M_{n} ∥_{p}$ and $∥ M^{*} ∥_{p} \leq \frac{p}{p - 1} ∥ M_{\infty} ∥_{p}$ by applying Doob with $X_{N} = ∣ M_{N} ∣$ and passing to the limit, since $∥ M_{N} ∥_{p} \to ∥ M_{\infty} ∥_{p}$ . $□$
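As a concrete instance, consider $p = 2$ . The display below assumes one standard fact not proved in this unit, the orthogonality of martingale increments in $L^{2}$ , which makes $E [M_{n}^{2}]$ non-decreasing in $n$ and equal to the second moment of $M_{0}$ plus the sum of the increment second moments up to time $n$ . With the constant $p / (p - 1) = 2$ the proposition then reads $$ \mathbb{E}\big[(M^*)^2\big] \;\le\; 4 \sup_n \mathbb{E}[M_n^2] \;=\; 4\Big(\mathbb{E}[M_0^2] + \sum_{k \ge 1} \mathbb{E}\big[(M_k - M_{k-1})^2\big]\Big). $$ So a martingale is $L^{2}$ -bounded exactly when its increments have summable second moments, and in that case the total maximum has second moment at most four times the right-hand sum.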

Proposition (reversed martingale convergence; SLLN). Let $(Y_{n})$ be a reversed martingale, $Y_{n} = E [Y_{0} ∣ G_{n}]$ along $G_{n} ↓ G_{\infty}$ . Then $Y_{n} \to E [Y_{0} ∣ G_{\infty}]$ a.s. and in $L^{1}$ . Consequently, for i.i.d. $ξ_{k}$ with $E ∣ ξ_{1} ∣ < \infty$ , $S_{n} / n \to E [ξ_{1}]$ a.s. and in $L^{1}$ .

Proof. The family $(Y_{n})$ is UI by Exercise 4. For a.s. convergence, fix $N$ and observe that $(Y_{N}, Y_{N - 1}, \dots, Y_{0})$ is, in reversed index, a finite martingale with respect to $G_{N} \subseteq \dots \subseteq G_{0}$ ; Doob's upcrossing inequality 37.04.01 applied to this finite martingale bounds the expected upcrossings $U_{N} ([a, b])$ of any rational interval by $(E [(Y_{0} - a)^{+}]) / (b - a)$ , uniformly in $N$ . Hence the total upcrossing count is a.s. finite, so $Y_{n}$ converges a.s. to a limit $Y_{\infty}$ , necessarily $G_{\infty}$ -measurable. UI upgrades this to $L^{1}$ convergence (Vitali, Exercise 7). To identify $Y_{\infty}$ : for $A \in G_{\infty} \subseteq G_{n}$ , $\int_{A} Y_{n} = \int_{A} Y_{0}$ (defining property of $Y_{n} = E [Y_{0} ∣ G_{n}]$ ), and passing to the $L^{1}$ -limit $\int_{A} Y_{\infty} = \int_{A} Y_{0}$ , so $Y_{\infty} = E [Y_{0} ∣ G_{\infty}]$ .

For the SLLN, take $Y_{0} = ξ_{1}$ , $G_{n} = σ (S_{n}, S_{n + 1}, \dots)$ ; Exercise 8 gives $Y_{n} = S_{n} / n$ . Then $S_{n} / n \to E [ξ_{1} ∣ G_{\infty}]$ a.s. and in $L^{1}$ . The limit $σ$ -algebra $G_{\infty}$ lies in the exchangeable $σ$ -algebra of $(ξ_{k})$ , which is $P$ -degenerate by the Hewitt-Savage zero-one law; a random variable measurable with respect to a $σ$ -algebra all of whose events have probability $0$ or $1$ is a.s. constant, and by the $L^{1}$ convergence that constant is the common mean $E [S_{n} / n] = E [ξ_{1}]$ . Hence $S_{n} / n \to E [ξ_{1}]$ a.s. $□$
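The conclusion can be watched along a single simulated path. This is only an illustration under stated assumptions: the increments are taken to be Exponential(1), so $E [ξ_{1}] = 1$ , the seed is fixed for reproducibility, and the function name `running_means` is hypothetical. A simulation shows the behaviour of one path and proves nothing.

```python
import random


def running_means(n, seed=0):
    """Return [S_1/1, S_2/2, ..., S_n/n] for i.i.d. Exponential(1) increments."""
    rng = random.Random(seed)  # private generator, so the run is reproducible
    total = 0.0
    means = []
    for k in range(1, n + 1):
        total += rng.expovariate(1.0)  # Exponential(1) has mean 1
        means.append(total / k)
    return means


means = running_means(100_000)
for k in (10, 1_000, 100_000):
    print(k, means[k - 1])
```

The printed values should settle near $1$ as $k$ grows. The theorem itself needs only a first moment, so a heavier-tailed law with finite mean would also converge, typically more slowly.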

Proposition (Kolmogorov 0-1 law). Let $(ξ_{n})$ be independent and $T = ⋂_{n} σ (ξ_{n}, ξ_{n + 1}, \dots)$ the tail $σ$ -algebra. Then $P (A) \in {0, 1}$ for every $A \in T$ .

Proof. Fix $A \in T$ and let $F_{n} = σ (ξ_{1}, \dots, ξ_{n})$ , $F_{\infty} = σ (ξ_{1}, ξ_{2}, \dots)$ . The martingale $E [1_{A} ∣ F_{n}]$ , closed by $1_{A}$ , is UI (Exercise 4), so by the closure proposition $E [1_{A} ∣ F_{n}] \to E [1_{A} ∣ F_{\infty}] = 1_{A}$ a.s., since $A$ is $F_{\infty}$ -measurable. But $A \in T$ is independent of each $F_{n}$ (it is measurable with respect to $σ (ξ_{n + 1}, \dots)$ , which is independent of $σ (ξ_{1}, \dots, ξ_{n})$ ), so $E [1_{A} ∣ F_{n}] = P (A)$ a.s. for every $n$ . Passing to the limit, $1_{A} = P (A)$ a.s., which forces $P (A) \in \{0, 1\}$ . $□$

Connections Master

The discrete martingale foundations of 37.04.01 — filtrations, the optional-stopping accounting, the upcrossing inequality, and the martingale convergence theorem — are the load-bearing prerequisite. Doob's maximal inequality is an optional-stopping computation at the first-crossing time, the upcrossing inequality of that unit is the oscillation-control dual of this unit's amplitude control, and the closure identity $M_{n} = E [M_{\infty} ∣ F_{n}]$ for a UI martingale is the exact strengthening of that unit's $L^{1}$ -bounded convergence theorem.

The Radon-Nikodym theorem and conditional expectation of 02.07.08 make the closure identity meaningful: a UI martingale is a Radon-Nikodym density process $M_{n} = d (M_{\infty} d P) / d P ∣_{F_{n}}$ , and the equivalence of uniform integrability with $L^{1}$ -closure is the time-asymptotic form of the density existence proved there; the conditional-expectation lemma driving every UI claim in this unit is a direct consequence of the take-out and tower properties established in that prerequisite.

The $L^{p}$-space theory of 02.07.06 (Hölder's inequality, the conjugate exponent $p / (p - 1)$, and completeness) is used directly in the proof of Doob's $L^{p}$ inequality and in the $L^{p}$ convergence of $L^{p}$-bounded martingales; the self-improving Hölder step that closes the maximal inequality into the constant $\frac{p}{p - 1}$ is exactly the duality pairing of that unit, and the dominated-convergence upgrade to $L^{p}$-convergence runs in the complete space it constructs.

The Fubini-Tonelli theorem of 02.07.07 supplies the layer-cake interchange $E [(X^{*})^{p}] = \int_{0}^{\infty} p λ^{p - 1} P (X^{*} \geq λ) d λ$ and the order-swap that converts the weak-type bound into the strong $L^{p}$ bound; without the Tonelli non-negativity interchange the maximal inequality would not integrate up to the $L^{p}$ inequality.
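The layer-cake identity and the resulting $L^{p}$ bound can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the unit's proofs: it simulates a simple symmetric random walk (so $|S_{k}|$ is a non-negative submartingale), uses the integer-valued form of the layer-cake integral, $\sum_{k \geq 1} (k^{p} - (k - 1)^{p}) P (X^{*} \geq k)$, and compares the empirical $p$-th moment of the running maximum with $(p / (p - 1))^{p}$ times that of the endpoint. The function names, the seed, and the sample sizes are all hypothetical.

```python
import random


def simulate_max_and_end(n_steps, n_paths, seed=0):
    """Simulate simple symmetric random walks; return (max_k |S_k|, |S_n|) per path."""
    rng = random.Random(seed)
    maxima, ends = [], []
    for _ in range(n_paths):
        s, m = 0, 0
        for _ in range(n_steps):
            s += rng.choice((-1, 1))
            m = max(m, abs(s))
        maxima.append(m)
        ends.append(abs(s))
    return maxima, ends


def layer_cake_moment(values, p):
    """Empirical p-th moment of non-negative integer samples via tail probabilities.

    For integer-valued X the integral of p * lam^(p-1) * P(X >= lam) over lam > 0
    reduces to the sum over k >= 1 of (k^p - (k-1)^p) * P(X >= k).
    """
    n = len(values)
    total = 0.0
    for k in range(1, max(values) + 1):
        tail = sum(1 for v in values if v >= k) / n
        total += (k ** p - (k - 1) ** p) * tail
    return total


p = 2  # illustrative exponent; any p > 1 works in the formulas below
maxima, ends = simulate_max_and_end(n_steps=50, n_paths=2000, seed=0)
direct = sum(m ** p for m in maxima) / len(maxima)    # empirical E[(X*)^p]
layered = layer_cake_moment(maxima, p)                # same moment via tails
end_moment = sum(e ** p for e in ends) / len(ends)    # empirical E[|S_n|^p]
doob_bound = (p / (p - 1)) ** p * end_moment          # right side of the L^p bound

print("direct moment:    ", direct)
print("layer-cake moment:", layered)
print("Doob L^p bound:   ", doob_bound)
```

The first two printed numbers agree up to floating-point error because the tail-sum form is an exact rearrangement on any sample. The comparison with the third is only a statistical check, since Doob's inequality concerns true expectations rather than sample averages.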

The a.s.-convergence companion of this circle is the unit 37.04.02 on a.s. martingale convergence: it owns the upcrossing-driven a.s. limit theorem that this unit promotes to $L^{1}$ and $L^{p}$ convergence under uniform integrability, and the two units together form the convergence half of the martingale chapter.

Historical & philosophical context Master

Uniform integrability as the criterion separating convergence in measure from convergence in mean goes back to Vitali (Giuseppe Vitali, Sull'integrazione per serie, 1907), and its identification with relative weak compactness in $L^{1}$ is the Dunford-Pettis theorem (Nelson Dunford and B. J. Pettis, Linear operations on summable functions, Trans. Amer. Math. Soc. 47, 1940, 323) ^{[Dunford-Pettis 1940]}. Joseph Doob isolated the maximal and $L^{p}$ inequalities and the systematic $L^{p}$-convergence theory of martingales in Stochastic Processes (1953), building on his 1940 measure-theoretic reformulation of the martingale property; the constant $p / (p - 1)$ is sharp, as the example of the running maximum of the simple martingale generated by an extremal $L^{p}$ density shows.

The strong law of large numbers has a longer arc. Émile Borel proved the SLLN for Bernoulli trials in 1909 (Rend. Circ. Mat. Palermo 27), Francesco Paolo Cantelli sharpened the Borel-Cantelli machinery in 1917, and Andrei Kolmogorov gave the general first-moment SLLN and the zero-one law in Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer, 1933) ^{[Kolmogorov 1933]}, originally via his three-series theorem and a truncation argument. The reversed-martingale proof presented here is due to Doob and was streamlined by J. L. Snell and later expositors; it replaces Kolmogorov's truncation with the observation, made precise by Edwin Hewitt and Leonard Savage (Symmetric measures on Cartesian products, Trans. Amer. Math. Soc. 80, 1955, 470), that $S_{n} / n$ is a conditional expectation given the symmetric past and that the exchangeable $σ$-algebra of an i.i.d. sequence is $P$-degenerate. The conceptual content is that the law of averages, the closure of a uniformly integrable martingale, and the Radon-Nikodym identification of a density process at infinity are one theorem seen three ways: along increasing filtrations for closure and the density process, and along a decreasing filtration for the SLLN.

Bibliography Master

@book{williams1991,
  author    = {Williams, David},
  title     = {Probability with Martingales},
  publisher = {Cambridge University Press},
  series    = {Cambridge Mathematical Textbooks},
  year      = {1991}
}

@book{doob1953,
  author    = {Doob, Joseph L.},
  title     = {Stochastic Processes},
  publisher = {John Wiley \& Sons, New York},
  year      = {1953}
}

@article{dunfordpettis1940,
  author  = {Dunford, Nelson and Pettis, B. J.},
  title   = {Linear operations on summable functions},
  journal = {Transactions of the American Mathematical Society},
  volume  = {47},
  number  = {3},
  pages   = {323--392},
  year    = {1940}
}

@book{kolmogorov1933,
  author    = {Kolmogorov, Andrei N.},
  title     = {Grundbegriffe der Wahrscheinlichkeitsrechnung},
  publisher = {Springer, Berlin},
  series    = {Ergebnisse der Mathematik und ihrer Grenzgebiete},
  year      = {1933}
}

@article{hewittsavage1955,
  author  = {Hewitt, Edwin and Savage, Leonard J.},
  title   = {Symmetric measures on Cartesian products},
  journal = {Transactions of the American Mathematical Society},
  volume  = {80},
  number  = {2},
  pages   = {470--501},
  year    = {1955}
}

@book{durrett2019,
  author    = {Durrett, Rick},
  title     = {Probability: Theory and Examples},
  edition   = {5th},
  series    = {Cambridge Series in Statistical and Probabilistic Mathematics},
  publisher = {Cambridge University Press},
  year      = {2019}
}

@book{neveu1975,
  author    = {Neveu, Jacques},
  title     = {Discrete-Parameter Martingales},
  publisher = {North-Holland, Amsterdam},
  year      = {1975}
}

Prerequisites

02.07.06
02.07.07
02.07.08
37.04.01

Tier anchors

beginner: Williams — Probability with Martingales Ch. 13-14 (informal); Grimmett-Stirzaker — Probability and Random Processes §12.6 (maximal inequalities, fair games)
intermediate: Williams — Probability with Martingales (CUP, 1991) Ch. 13-14; Durrett — Probability: Theory and Examples (5th ed.) §4.4-4.6
master: Williams — Probability with Martingales Ch. 13-14; Durrett §4.4-4.7; Neveu — Discrete-Parameter Martingales (North-Holland, 1975) Ch. II, IV-V; Doob — Stochastic Processes (Wiley, 1953) Ch. VII

References

Williams — Probability with Martingales (Cambridge University Press, 1991) · Ch. 13 (uniform integrability, UI martingales), Ch. 14 (Doob's submartingale and L^p inequalities, the SLLN)
Doob — Stochastic Processes (Wiley, 1953) · Ch. VII (maximal inequalities, the L^p inequality, martingale convergence)
Durrett — Probability: Theory and Examples (Cambridge University Press, 5th ed., 2019) · §4.4 (Doob's inequality, L^p convergence), §4.5 (uniform integrability), §4.6 (backwards martingales, the SLLN)
Neveu — Discrete-Parameter Martingales (North-Holland, 1975) · Ch. II (uniform integrability), Ch. IV-V (inequalities, convergence in L^p, reversed martingales)
Dunford-Pettis — Linear operations on summable functions (Trans. Amer. Math. Soc. 47, 1940) · pp. 323-392; the characterisation of weak compactness in L^1 by uniform integrability
Kolmogorov — Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer, 1933) · Ch. VI (the strong law of large numbers, the zero-one law)

Estimated time

beginner: 19m
intermediate: 52m
master: 95m