Doob's Maximal and L^p Inequalities, Uniform Integrability, and L^p-Bounded Martingales
Anchor (Master): Williams — Probability with Martingales Ch. 13-14; Durrett §4.4-4.7; Neveu — Discrete-Parameter Martingales (North-Holland, 1975) Ch. II, IV-V; Doob — Stochastic Processes (Wiley, 1953) Ch. VII
Intuition Beginner
A fair game keeps your expected fortune fixed, but it says nothing directly about how high your fortune might climb along the way. Suppose you watch a fair game for a fixed number of rounds and record the single highest balance you ever touched. How large can that running peak be? It feels like it could be enormous: a lucky streak might carry you far above where you started before the game pulls you back. The surprising answer is that the peak is controlled: the chance that your fortune ever crosses a high level is bounded by a quantity computed from the very last round alone, namely the average size of the final fortune divided by the level. The running maximum is tamed by the endpoint.
That control is the content of Doob's maximal inequality. It says the whole trajectory of a fair (or favourable) game cannot wander far above its final value without paying a probability price, and the price is exactly the same bookkeeping that governs a single fixed time. This is what lets us turn statements about one moment into statements about the entire history at once.
The second idea answers a nagging worry from the convergence story. A fair game whose fortune settles down to a limit might still misbehave: the average fortune could drift even while almost every individual path converges, if a vanishingly rare but enormous outcome carries weight. The repair is a condition called uniform integrability — a promise that no sliver of the probability, however thin, hides a runaway contribution to the average. When that promise holds, the average follows the paths: convergence of trajectories becomes convergence of expectations, and the limit closes the game like a final settlement everyone's running balance was forecasting all along.
The takeaway: the running peak of a fair game is no wilder than its endpoint, and once you forbid hidden runaway mass, the long-run limit is a genuine final value that every earlier fortune was the honest forecast of. These two facts together turn the law of averages itself into a statement about a single converging fair game.
Visual Beginner
Picture one jagged path of a fair game climbing and dipping across a hundred rounds, with a horizontal line drawn at some high level. Doob's maximal inequality compares two events: "the path crosses the high line at some point" versus "the final balance is large." The first looks far more likely, since there are a hundred chances to cross versus one final reading, yet the inequality bounds the probability of the running-peak event by a quantity read off the endpoint alone: the expected size of the final balance divided by the height of the line.
The circles are the first crossing times. The right-hand bars show the inequality: the peak-crossing probability sits under a bound built only from the final value. The single endpoint reading governs the entire jagged history.
A second mental picture for uniform integrability: imagine the area under the tallest spikes of a sequence of fortunes. Uniform integrability says that if you chop off everything above some large height, the chopped-off area can be made small for every fortune in the sequence at once — no single one, and no tail of them, smuggles its mass off to infinity.
Worked example Beginner
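We test the maximal inequality on the fair coin-flip walk. Start at $0$; each round move $+1$ (U) or $-1$ (D) by a fair coin. After $n$ rounds the fortune is $S_n$, the running sum. We look at the running maximum over the first four rounds and compare with the threshold $2$.
Step 1. Set up the question. Let $S_n$ be the position after $n$ flips, $S_0 = 0$. Define the running peak $\overline{S}_4 = \max_{0 \le n \le 4} S_n$. We ask: what is the chance the peak reaches $2$ or more, that is $\mathbb{P}(\overline{S}_4 \ge 2)$?
Step 2. Enumerate. List the four-flip sequences whose running peak hits $2$. By parity the walk can first touch $2$ only at step $2$ or step $4$. One family reaches $2$ at step $2$ and then continues freely: UU followed by any two flips, giving UUUU, UUUD, UUDU, UUDD. The other family sits at height $1$ at step $3$ without having touched $2$ and then goes up at step $4$: UDUU and DUUU. That is $6$ sequences out of $16$. The path UUDD is easy to miss, because it ends at $0$ even though its peak is $2$.
Step 3. Read off the probability. Each four-flip sequence has probability $1/16$, so $\mathbb{P}(\overline{S}_4 \ge 2) = 6/16 = 3/8$.
Step 4. Compare with the maximal bound. The submartingale form of Doob's inequality controls $\lambda\, \mathbb{P}(\overline{S}_4 \ge \lambda)$ by $\mathbb{E}[S_4^+]$, where $S_4^+ = \max(S_4, 0)$ is the positive part of the final fortune. Computing the final-step distribution: $S_4$ takes values $-4, -2, 0, 2, 4$ with probabilities $1/16, 4/16, 6/16, 4/16, 1/16$. So $\mathbb{E}[S_4^+] = 2 \cdot 4/16 + 4 \cdot 1/16 = 3/4$. With $\lambda = 2$ the bound is $\mathbb{P}(\overline{S}_4 \ge 2) \le (3/4)/2 = 3/8$.
Step 5. Is the bound violated? It is not: it holds with equality, $3/8$ on both sides. The sharper form of Doob's bound uses $S_n$ itself (a martingale, hence a submartingale) and weighs the final value only on the crossing event: $2\, \mathbb{P}(\overline{S}_4 \ge 2) \le \mathbb{E}[S_4 \mathbf{1}_{\{\overline{S}_4 \ge 2\}}]$. The six crossing paths end at $4, 2, 2, 0, 2, 2$, so the right side is $12/16 = 3/4$, matching the left side $2 \cdot 3/8 = 3/4$. Equality occurs because the walk moves in unit steps: at the first crossing time it sits exactly at $2$, and being fair it keeps conditional mean $2$ from then on.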
What this tells us: the running peak of a fair game is pinned down by where the game ends, and the precise inequality weighs the final fortune only on the paths that actually crossed. A high peak forces a high conditional endpoint — you cannot climb without leaving a trace at the finish.
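The count in this example can be checked by brute force. The following is a minimal sketch, assuming the walk starts at 0, runs for four fair flips, and is tested against the level 2; the names `N`, `LEVEL`, and `walk` are illustrative choices, not part of the text above.

```python
from fractions import Fraction
from itertools import product

N = 4          # number of coin flips (assumed horizon)
LEVEL = 2      # assumed threshold for the running peak


def walk(steps):
    """Return the positions S_1, ..., S_N of the walk started at S_0 = 0."""
    positions, s = [], 0
    for step in steps:
        s += step
        positions.append(s)
    return positions


# All 2^N equally likely paths of +1 / -1 steps.
paths = [walk(steps) for steps in product((1, -1), repeat=N)]
weight = Fraction(1, len(paths))

# Paths whose running peak reaches LEVEL at some time.
crossing = [p for p in paths if max(p) >= LEVEL]

prob_peak = weight * len(crossing)                           # P(peak >= LEVEL)
restricted_mean = weight * sum(p[-1] for p in crossing)      # E[S_N ; peak >= LEVEL]
positive_part = weight * sum(max(p[-1], 0) for p in paths)   # E[S_N^+]

print("P(peak >= level)        =", prob_peak)
print("level * P(peak >= level) =", LEVEL * prob_peak)
print("E[S_N ; peak >= level]   =", restricted_mean)
print("E[S_N^+]                 =", positive_part)
```

Exact rational arithmetic via `Fraction` avoids any rounding question when comparing the two sides of the inequality.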
Check your understanding Beginner
Formal definition Intermediate+
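Fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a filtration $(\mathcal{F}_n)_{n \ge 0}$. Martingales, submartingales, stopping times, and conditional expectation (with its tower, take-out, and Jensen properties) are taken as established in 37.04.01, where conditional expectation is built from the Radon-Nikodym theorem 02.07.08. The $L^p$-norms and the completeness of $L^p$ are taken from 02.07.06.
Definition (uniform integrability). A family $\mathcal{C} \subseteq L^1$ is uniformly integrable (UI) if $$ \lim_{K \to \infty} \sup_{X \in \mathcal{C}} \mathbb{E}\big[|X|\, \mathbf{1}_{\{|X| > K\}}\big] = 0. $$ Equivalently, $\sup_{X \in \mathcal{C}} \mathbb{E}|X| < \infty$ (that is, $\mathcal{C}$ is bounded in $L^1$) and for every $\varepsilon > 0$ there is $\delta > 0$ such that $\mathbb{P}(A) < \delta$ implies $\sup_{X \in \mathcal{C}} \mathbb{E}[|X|\mathbf{1}_A] < \varepsilon$.
Definition (running maximum). For a process $(X_n)_{n \ge 0}$ write $X_n^* = \max_{0 \le k \le n} |X_k|$ and $X^* = \sup_{n \ge 0} |X_n|$, the running and total maxima.
Definition (UI / closed martingale). A martingale $(M_n)$ is uniformly integrable if the family $\{M_n : n \ge 0\}$ is UI, and closed by $M_\infty \in L^1$ if $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$ a.s. for every $n$.
Definition ($L^p$-bounded; reversed martingale). A martingale $(M_n)$ is $L^p$-bounded ($p \ge 1$) if $\sup_n \|M_n\|_p < \infty$. Given a decreasing family $\mathcal{G}_0 \supseteq \mathcal{G}_1 \supseteq \cdots$ of sub-$\sigma$-algebras, a process $(M_n)$ with $M_n$ being $\mathcal{G}_n$-measurable and integrable is a reversed (backward) martingale if $\mathbb{E}[M_n \mid \mathcal{G}_{n+1}] = M_{n+1}$ a.s. for all $n$; equivalently $M_n = \mathbb{E}[M_0 \mid \mathcal{G}_n]$.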
Counterexamples to common slips Intermediate+
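$L^1$-boundedness does not imply uniform integrability. On $([0,1], \text{Lebesgue})$ the sequence $X_n = n\, \mathbf{1}_{(0, 1/n)}$ has $\mathbb{E}|X_n| = 1$ for all $n$, yet $\mathbb{E}[|X_n| \mathbf{1}_{\{|X_n| > K\}}] = 1$ whenever $n > K$, so the family is not UI. The mass escapes to infinity on a shrinking set. UI is strictly stronger than $L^1$-boundedness.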
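The escaping-mass phenomenon can be tabulated exactly. The sketch below assumes the standard example $X_n = n\,\mathbf{1}_{(0,1/n)}$ on the unit interval with Lebesgue measure; the function names `mean`, `tail_mean`, and `sup_tail` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def mean(n):
    """E|X_n| for X_n = n * 1_(0, 1/n): the value n on a set of measure 1/n."""
    return n * Fraction(1, n)


def tail_mean(n, K):
    """E[|X_n| ; |X_n| > K]: all of the mass if n > K, none of it otherwise."""
    return n * Fraction(1, n) if n > K else Fraction(0)


def sup_tail(K, n_max):
    """Supremum of the tail means over n = 1, ..., n_max (a finite stand-in for sup over n)."""
    return max(tail_mean(n, K) for n in range(1, n_max + 1))


# The family is bounded in L^1 ...
print([mean(n) for n in (1, 10, 100)])

# ... but raising the cut-off K never shrinks the worst tail, as long as
# some index n beyond K is included in the family.
for K in (1, 10, 100, 1000):
    print(K, sup_tail(K, n_max=2 * K))
```

For each cut-off the supremum stays at the full mass, which is the failure of the defining limit of uniform integrability.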
A single integrable variable is UI; an $L^1$-bounded martingale need not be. The family $\{\mathbb{E}[X \mid \mathcal{F}_n] : n \ge 0\}$ for fixed $X \in L^1$ is UI, but a martingale that is merely $L^1$-bounded (not closable) can fail UI and then converges a.s. without converging in $L^1$: its mean is not transmitted to the limit.
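$L^p$-boundedness for $p > 1$ does force UI. If $\sup_n \mathbb{E}|X_n|^p < \infty$ with $p > 1$, then $\mathbb{E}[|X_n| \mathbf{1}_{\{|X_n| > K\}}] \le K^{1-p} \sup_n \mathbb{E}|X_n|^p \to 0$ uniformly in $n$. The endpoint $p = 1$ is exactly where this collapses, which is why $L^1$ convergence needs UI as a separate hypothesis.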
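The estimate behind this implication is a one-line pointwise comparison. As a sketch, write $C = \sup_n \mathbb{E}|X_n|^p$ for a family $(X_n)$ bounded in $L^p$ with $p > 1$ (the symbol $C$ is introduced here only for illustration). On the event $\{|X_n| > K\}$ we have $|X_n|^{1-p} \le K^{1-p}$ because $1 - p < 0$, hence
$$
\mathbb{E}\big[|X_n|\, \mathbf{1}_{\{|X_n| > K\}}\big]
= \mathbb{E}\big[|X_n|^{p}\, |X_n|^{1-p}\, \mathbf{1}_{\{|X_n| > K\}}\big]
\le K^{1-p}\, \mathbb{E}|X_n|^p
\le K^{1-p}\, C \xrightarrow[K \to \infty]{} 0 .
$$
The right-hand side does not depend on $n$, which is the uniformity required. At $p = 1$ the factor $K^{1-p}$ equals $1$ and the argument gives nothing.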
Doob's $L^p$ inequality fails at $p = 1$. The constant $p/(p-1)$ blows up as $p \downarrow 1$, and indeed $\mathbb{E}[M^*]$ can be infinite for an $L^1$-bounded martingale; the correct statement is the weak-type bound together with an $L \log L$ control of $\mathbb{E}[M^*]$.
Reversing the filtration matters. The convergence theorem for reversed martingales runs along a decreasing family $(\mathcal{G}_n)$; applying the forward convergence theorem to an increasing filtration here gives the wrong limit $\sigma$-algebra and misses the SLLN application.
Key theorem with proof Intermediate+
Theorem (Doob's maximal and $L^p$ inequalities). Let $(X_n)_{n \ge 0}$ be a non-negative submartingale (e.g. $X_n = |M_n|$ for a martingale $(M_n)$, by conditional Jensen). For every $\lambda > 0$ and $N \ge 0$, $$ \lambda\, \mathbb{P}\big(X_N^* \ge \lambda\big) \;\le\; \mathbb{E}\big[X_N\, \mathbf{1}_{\{X_N^* \ge \lambda\}}\big] \;\le\; \mathbb{E}[X_N]. $$ Consequently, for $1 < p < \infty$, $$ \big\|X_N^*\big\|_p \;\le\; \frac{p}{p-1}\, \|X_N\|_p . $$
Proof. The weak-type (maximal) inequality. Define the stopping time $\tau = \inf\{k \ge 0 : X_k \ge \lambda\}$ and fix the horizon $N$. The event $\{X_N^* \ge \lambda\} = \{\tau \le N\}$ partitions into the disjoint events $\{\tau = k\}$, $k = 0, \dots, N$, each lying in $\mathcal{F}_k$. On $\{\tau = k\}$ we have $X_k \ge \lambda$. Since $(X_n)$ is a submartingale, $\mathbb{E}[X_N \mid \mathcal{F}_k] \ge X_k$, so integrating over the $\mathcal{F}_k$-set $\{\tau = k\}$, $$ \mathbb{E}[X_N \mathbf{1}_{\{\tau = k\}}] \ge \mathbb{E}[X_k \mathbf{1}_{\{\tau = k\}}] \ge \lambda\, \mathbb{P}(\tau = k). $$ Summing over $k = 0, \dots, N$, $$ \mathbb{E}[X_N \mathbf{1}_{\{X_N^* \ge \lambda\}}] = \sum_{k=0}^N \mathbb{E}[X_N \mathbf{1}_{\{\tau = k\}}] \ge \lambda \sum_{k=0}^N \mathbb{P}(\tau = k) = \lambda\, \mathbb{P}(X_N^* \ge \lambda). $$ The final bound $\mathbb{E}[X_N \mathbf{1}_{\{X_N^* \ge \lambda\}}] \le \mathbb{E}[X_N]$ uses $X_N \ge 0$.
The $L^p$ inequality. Write $X^* = X_N^*$ and assume $\|X_N\|_p < \infty$ (else there is nothing to prove); then $X^* \in L^p$ because $X^* \le \sum_{k=0}^N X_k$ is dominated by a finite sum of $L^p$ variables. By the layer-cake formula and Fubini-Tonelli 02.07.07, using the weak-type bound $\lambda\, \mathbb{P}(X^* \ge \lambda) \le \mathbb{E}[X_N \mathbf{1}_{\{X^* \ge \lambda\}}]$,
$$
\mathbb{E}[(X^*)^p] = \int_0^\infty p \lambda^{p-1}\, \mathbb{P}(X^* \ge \lambda)\, d\lambda
\le \int_0^\infty p \lambda^{p-2}\, \mathbb{E}[X_N \mathbf{1}_{\{X^* \ge \lambda\}}]\, d\lambda.
$$
Interchanging integration order (Tonelli, integrand non-negative),
$$
\int_0^\infty p \lambda^{p-2} X_N \mathbf{1}_{\{X^* \ge \lambda\}}\, d\lambda = X_N \int_0^{X^*} p \lambda^{p-2}\, d\lambda = \frac{p}{p-1}\, X_N (X^*)^{p-1}.
$$
Taking expectations and applying Hölder's inequality 02.07.06 with exponents $p$ and $q = p/(p-1)$,
$$
\mathbb{E}[(X^*)^p] \le \frac{p}{p-1}\, \mathbb{E}\big[X_N (X^*)^{p-1}\big] \le \frac{p}{p-1}\, \|X_N\|_p\, \big\|(X^*)^{p-1}\big\|_q = \frac{p}{p-1}\, \|X_N\|_p\, \mathbb{E}[(X^*)^p]^{1/q}.
$$
Since $\mathbb{E}[(X^*)^p] < \infty$, divide by $\mathbb{E}[(X^*)^p]^{1/q}$ (finite and, in the nondegenerate case, positive) and use $1 - 1/q = 1/p$ to obtain $\|X^*\|_p \le \frac{p}{p-1} \|X_N\|_p$.
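Both inequalities of the theorem can be verified exactly on a small case. The following is a minimal sketch, assuming the non-negative submartingale $X_k = |S_k|$ built from the fair $\pm 1$ walk over a horizon of ten steps with $p = 2$; the horizon, the exponent, and the helper name `weak_type_holds` are choices made here for illustration.

```python
from fractions import Fraction
from itertools import accumulate, product

N = 10   # assumed horizon; 2^10 = 1024 equally likely paths
p = 2    # assumed exponent, so the Doob constant (p / (p - 1))^p equals 4

# Each path is the list of partial sums S_1, ..., S_N of +1 / -1 steps.
paths = [list(accumulate(steps)) for steps in product((1, -1), repeat=N)]
w = Fraction(1, 2 ** N)

peaks = [max(abs(s) for s in path) for path in paths]   # X_N^* with X_k = |S_k|
ends = [abs(path[-1]) for path in paths]                # X_N = |S_N|


def weak_type_holds(level):
    """Check level * P(X_N^* >= level) <= E[X_N ; X_N^* >= level] exactly."""
    lhs = level * w * sum(1 for pk in peaks if pk >= level)
    rhs = w * sum(e for pk, e in zip(peaks, ends) if pk >= level)
    return lhs <= rhs


# Strong L^p form: E[(X_N^*)^p] <= (p / (p - 1))^p * E[X_N^p].
lhs_lp = w * sum(pk ** p for pk in peaks)
rhs_lp = Fraction(p, p - 1) ** p * w * sum(e ** p for e in ends)

print("weak-type bound at every level:", all(weak_type_holds(k) for k in range(1, N + 1)))
print("E[(X*)^p] =", lhs_lp, " bound =", rhs_lp)
```

Enumeration is only feasible for short horizons; for longer ones a Monte Carlo estimate would be the natural substitute, at the cost of exactness.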
Bridge. Doob's $L^p$ inequality builds toward the $L^p$ convergence of $L^p$-bounded martingales and appears again in the continuous-time Burkholder-Davis-Gundy inequalities of stochastic analysis, where the running maximum of a martingale is comparable to its quadratic variation. The foundational reason the running maximum is controlled by the endpoint is the optional-stopping accounting of 37.04.01: stopping at the first crossing time and using the submartingale inequality converts a whole-history event into a single-time conditional expectation. The weak-type bound then feeds the layer-cake integral, and Hölder's inequality closes the resulting self-improving estimate, in which the maximum is bounded by itself to a lower power, into the clean constant $p/(p-1)$. The maximal inequality is dual to the upcrossing inequality of 37.04.01: upcrossings control oscillation and deliver a.s. convergence, while the maximal inequality controls amplitude and delivers $L^p$ convergence; both are stopping-time computations against a submartingale.
Exercises Intermediate+
Advanced results Master
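The convergence theory of 37.04.01 separates cleanly by integrability class, and uniform integrability is the exact dividing line. The Vitali / Dunford-Pettis characterisation states that for a sequence of integrable variables $X_n \to X$ in probability, $L^1$ convergence holds if and only if $\{X_n\}$ is uniformly integrable; Dunford and Pettis (1940 Trans. AMS 47, 323) further identify UI with relative weak compactness in $L^1$, so the analytic notion of $L^1$-convergence and the functional-analytic notion of weak $L^1$-compactness coincide on the same hypothesis. Applied to martingales: an $L^1$-bounded martingale converges a.s. to an integrable $M_\infty$ (upcrossing inequality, 37.04.01); the convergence is in $L^1$ iff the martingale is UI; and in that case the martingale closes, $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$, with $M_\infty$ generating the family in the precise Radon-Nikodym sense of 02.07.08. The three statements (UI, $L^1$-convergence, closure) are equivalent for a martingale, and the proof of "UI $\Rightarrow$ closure" runs through Exercise 4 in reverse: pass $m \to \infty$ in $\mathbb{E}[M_m \mathbf{1}_A] = \mathbb{E}[M_n \mathbf{1}_A]$ for $A \in \mathcal{F}_n$, $m \ge n$, using $L^1$-convergence to obtain $\mathbb{E}[M_\infty \mathbf{1}_A] = \mathbb{E}[M_n \mathbf{1}_A]$, the defining identity of $\mathbb{E}[M_\infty \mid \mathcal{F}_n]$.
For $p > 1$ the picture sharpens. An $L^p$-bounded martingale is automatically UI (Exercise 3), hence converges a.s. and in $L^1$ to $M_\infty$ with $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$; Doob's $L^p$ inequality then upgrades the convergence to $L^p$. Indeed $\|M^*\|_p \le \frac{p}{p-1} \sup_n \|M_n\|_p < \infty$, so $(2M^*)^p$ dominates $|M_n - M_\infty|^p$, and dominated convergence delivers $\|M_n - M_\infty\|_p \to 0$. Thus an $L^p$-bounded martingale is precisely a closed martingale with $M_\infty \in L^p$, and the map $M_\infty \mapsto (\mathbb{E}[M_\infty \mid \mathcal{F}_n])_{n \ge 0}$ is an isometric correspondence between $L^p(\mathcal{F}_\infty)$ and $L^p$-bounded martingales.
The reversed (backward) martingale convergence theorem completes the toolkit. If $(M_n)$ is a reversed martingale along a decreasing family $(\mathcal{G}_n)$, written $M_n = \mathbb{E}[M_0 \mid \mathcal{G}_n]$, then $\{M_n\}$ is automatically uniformly integrable (Exercise 4) and converges a.s. and in $L^1$ to $\mathbb{E}[M_0 \mid \mathcal{G}_\infty]$, where $\mathcal{G}_\infty = \bigcap_n \mathcal{G}_n$. The a.s. convergence is again an upcrossing argument: the finite segments $(M_n, M_{n-1}, \dots, M_0)$ form an ordinary martingale, so the number of upcrossings of any interval is bounded in expectation uniformly in $n$, and the automatic UI removes any $L^1$ deficiency. No $L^1$-boundedness hypothesis is needed: it is supplied for free by the closure $M_n = \mathbb{E}[M_0 \mid \mathcal{G}_n]$.
This machinery yields the strong law of large numbers in one stroke. With i.i.d. $X_1, X_2, \dots$, $\mathbb{E}|X_1| < \infty$, $S_n = X_1 + \cdots + X_n$, and $\mathcal{G}_n = \sigma(S_n, S_{n+1}, \dots)$, Exercise 8 shows $S_n/n = \mathbb{E}[X_1 \mid \mathcal{G}_n]$ is a reversed martingale. The reversed convergence theorem gives $S_n/n \to \mathbb{E}[X_1 \mid \mathcal{G}_\infty]$ a.s. and in $L^1$. The tail $\sigma$-algebra $\mathcal{G}_\infty$ is contained in the exchangeable $\sigma$-algebra, which the Hewitt-Savage zero-one law makes $\mathbb{P}$-degenerate (every event has probability $0$ or $1$) for i.i.d. sequences; hence the limit is a.s. constant, and being the $L^1$-limit of $S_n/n$ it equals $\mathbb{E}[X_1]$. So $S_n/n \to \mathbb{E}[X_1]$ a.s.: this is Kolmogorov's SLLN, under only a first-moment assumption, with no truncation and no variance hypothesis. The Kolmogorov 0-1 law for the tail $\sigma$-algebra falls out of the same circle of ideas: a tail event $A$ has $\mathbf{1}_A$ asymptotically independent of itself, forcing $\mathbb{P}(A) \in \{0, 1\}$.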
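The reversed-martingale property of the sample averages can be checked exactly on a small discrete case. Because the later increments are independent of the first $n+1$ of them, conditioning on $\sigma(S_{n+1}, S_{n+2}, \dots)$ reduces to conditioning on $S_{n+1}$, so it suffices to verify $\mathbb{E}[S_n/n \mid S_{n+1} = s] = s/(n+1)$. The sketch below does this by enumeration; the step distribution `VALUES`/`PROBS` is a hypothetical choice made only for illustration, deliberately non-symmetric.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical step distribution (an assumption for illustration only).
VALUES = (0, 1, 3)
PROBS = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))


def conditional_mean_of_average(n):
    """Return {s: E[S_n / n | S_{n+1} = s]} by exact enumeration of n + 1 i.i.d. steps."""
    num = defaultdict(Fraction)   # accumulates (S_n / n) * P(path) over paths with S_{n+1} = s
    den = defaultdict(Fraction)   # accumulates P(S_{n+1} = s)
    for idx in product(range(len(VALUES)), repeat=n + 1):
        prob = Fraction(1)
        for i in idx:
            prob *= PROBS[i]
        xs = [VALUES[i] for i in idx]
        s_next = sum(xs)
        num[s_next] += Fraction(sum(xs[:n]), n) * prob
        den[s_next] += prob
    return {s: num[s] / den[s] for s in den}


for n in range(1, 6):
    table = conditional_mean_of_average(n)
    ok = all(value == Fraction(s, n + 1) for s, value in table.items())
    print(n, ok)
```

The identity holds for any i.i.d. step distribution with a finite mean, by exchangeability of the first $n+1$ increments given their sum; the finite support here only makes exact enumeration possible.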
Synthesis. The foundational reason these results cohere is that a martingale is a consistent system of conditional-expectation densities and uniform integrability is the single hypothesis that makes the system close at infinity: this is exactly the Radon-Nikodym closure of 02.07.08 read as the convergence of a density process. Doob's maximal inequality controls amplitude and is dual to the upcrossing inequality that controls oscillation; the upcrossing bound delivers a.s. convergence while the $L^p$ inequality delivers $L^p$ convergence, and both reduce to stopping-time computations against a submartingale. The reversed martingale generalises the forward theory by running the conditioning along a decreasing filtration, and this is exactly the structure the SLLN needs: $S_n/n$ is the conditional expectation of one increment given the symmetric future, automatically UI, automatically convergent, with the limit pinned to the mean by a zero-one law. The bridge is that the law of large numbers and the closure of UI martingales are one phenomenon, the asymptotic Radon-Nikodym identification of a conditional-expectation system with a single limit variable, so that what looks like three separate theorems (Doob's inequality, the $L^1$/$L^p$ convergence dichotomy, the SLLN) is the time-asymptotics of a single closure principle, and it is the discrete counterpart of the continuous-time martingale convergence and Burkholder-Davis-Gundy theory that drives stochastic calculus downstream.
Full proof set Master
The maximal and $L^p$ inequalities are proved in full in the Key theorem section. The remaining Master claims are recorded here.
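Proposition (closure of a UI martingale). For a martingale $(M_n)$ the following are equivalent: (a) $(M_n)$ is uniformly integrable; (b) $M_n$ converges in $L^1$; (c) there is $M_\infty \in L^1$ with $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$ for all $n$. When they hold, $M_n \to M_\infty$ a.s. and in $L^1$, where $M_\infty$ is taken $\mathcal{F}_\infty$-measurable.
Proof. (c) $\Rightarrow$ (a): the family $\{\mathbb{E}[M_\infty \mid \mathcal{F}_n]\}$ is UI by the conditional-expectation lemma (Exercise 4). (a) $\Rightarrow$ (b): UI implies $L^1$-boundedness, so by the martingale convergence theorem 37.04.01 $M_n \to M_\infty$ a.s.; Vitali's theorem (Exercise 7) upgrades a.s. (hence in-probability) convergence under UI to $L^1$ convergence. (b) $\Rightarrow$ (c): if $M_n \to M_\infty$ in $L^1$, fix $n$ and $A \in \mathcal{F}_n$; the martingale property gives $\mathbb{E}[M_m \mathbf{1}_A] = \mathbb{E}[M_n \mathbf{1}_A]$ for all $m \ge n$, and $|\mathbb{E}[M_m \mathbf{1}_A] - \mathbb{E}[M_\infty \mathbf{1}_A]| \le \|M_m - M_\infty\|_1 \to 0$, so $\mathbb{E}[M_\infty \mathbf{1}_A] = \mathbb{E}[M_n \mathbf{1}_A]$. As $A \in \mathcal{F}_n$ was arbitrary and $M_n$ is $\mathcal{F}_n$-measurable, $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$. The a.s. convergence is recorded in the (a) $\Rightarrow$ (b) step.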
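Proposition ($L^p$ convergence of $L^p$-bounded martingales, $p > 1$). If $(M_n)$ is a martingale with $\sup_n \|M_n\|_p < \infty$ for some $p > 1$, then there is $M_\infty \in L^p$ with $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$, and $M_n \to M_\infty$ a.s. and in $L^p$; moreover $\|M^*\|_p \le \frac{p}{p-1}\|M_\infty\|_p$.
Proof. $L^p$-boundedness with $p > 1$ forces uniform integrability (Exercise 3), so the previous proposition gives $M_\infty$ with $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$ and a.s. convergence. By Doob's $L^p$ inequality applied on each horizon $N$, $\|M_N^*\|_p \le \frac{p}{p-1}\|M_N\|_p \le \frac{p}{p-1}\sup_n \|M_n\|_p$; letting $N \to \infty$, monotone convergence gives $\|M^*\|_p \le \frac{p}{p-1}\sup_n \|M_n\|_p < \infty$, so $M^* \in L^p$ and in particular $M_\infty \in L^p$ (as $|M_\infty| \le M^*$ a.s.). Then $(2M^*)^p$ dominates the a.s.-null sequence $|M_n - M_\infty|^p$, and dominated convergence [02.07.06 completeness, via DCT] yields $\|M_n - M_\infty\|_p \to 0$. Finally $\|M_N\|_p \le \|M_\infty\|_p$ by conditional Jensen, since $M_N = \mathbb{E}[M_\infty \mid \mathcal{F}_N]$, and $\|M^*\|_p \le \frac{p}{p-1}\|M_\infty\|_p$ follows by applying Doob on each horizon $N$ with this bound and passing to the limit.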
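Proposition (reversed martingale convergence; SLLN). Let $(M_n)$ be a reversed martingale, $M_n = \mathbb{E}[M_0 \mid \mathcal{G}_n]$ along $\mathcal{G}_0 \supseteq \mathcal{G}_1 \supseteq \cdots$, and write $\mathcal{G}_\infty = \bigcap_n \mathcal{G}_n$. Then $M_n \to \mathbb{E}[M_0 \mid \mathcal{G}_\infty]$ a.s. and in $L^1$. Consequently, for i.i.d. $X_1, X_2, \dots$ with $\mathbb{E}|X_1| < \infty$, $S_n/n \to \mathbb{E}[X_1]$ a.s. and in $L^1$.
Proof. The family $\{M_n\}$ is UI by Exercise 4. For a.s. convergence, fix $n$ and observe that $(M_n, M_{n-1}, \dots, M_0)$ is, in reversed index, a finite martingale with respect to $(\mathcal{G}_n, \mathcal{G}_{n-1}, \dots, \mathcal{G}_0)$; Doob's upcrossing inequality 37.04.01 applied to this finite martingale bounds the expected upcrossings of any rational interval $[a, b]$ by $(\mathbb{E}|M_0| + |a|)/(b - a)$, uniformly in $n$. Hence the total upcrossing count is a.s. finite, so $M_n$ converges a.s. to a limit $M_\infty$, necessarily $\mathcal{G}_\infty$-measurable. UI upgrades this to $L^1$ convergence (Vitali, Exercise 7). To identify $M_\infty$: for $A \in \mathcal{G}_\infty \subseteq \mathcal{G}_n$, $\mathbb{E}[M_n \mathbf{1}_A] = \mathbb{E}[M_0 \mathbf{1}_A]$ (defining property of $M_n = \mathbb{E}[M_0 \mid \mathcal{G}_n]$), and passing to the $L^1$-limit $\mathbb{E}[M_\infty \mathbf{1}_A] = \mathbb{E}[M_0 \mathbf{1}_A]$, so $M_\infty = \mathbb{E}[M_0 \mid \mathcal{G}_\infty]$.
For the SLLN, take $\mathcal{G}_n = \sigma(S_n, S_{n+1}, \dots)$, $M_n = S_n/n$ for $n \ge 1$; Exercise 8 gives $S_n/n = \mathbb{E}[X_1 \mid \mathcal{G}_n]$. Then $S_n/n \to \mathbb{E}[X_1 \mid \mathcal{G}_\infty]$ a.s. and in $L^1$. The limit $\sigma$-algebra $\mathcal{G}_\infty$ lies in the exchangeable $\sigma$-algebra of $(X_n)$, which is $\mathbb{P}$-degenerate by the Hewitt-Savage zero-one law; a random variable measurable with respect to a $\sigma$-algebra all of whose events have probability $0$ or $1$ is a.s. constant, and that constant is the common mean $\mathbb{E}[X_1]$. Hence $S_n/n \to \mathbb{E}[X_1]$ a.s.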
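Proposition (Kolmogorov 0-1 law). Let $X_1, X_2, \dots$ be independent and $\mathcal{T} = \bigcap_n \sigma(X_n, X_{n+1}, \dots)$ the tail $\sigma$-algebra. Then $\mathbb{P}(A) \in \{0, 1\}$ for every $A \in \mathcal{T}$.
Proof. Fix $A \in \mathcal{T}$ and let $\mathcal{F}_n = \sigma(X_1, \dots, X_n)$. The martingale $M_n = \mathbb{E}[\mathbf{1}_A \mid \mathcal{F}_n]$ in the closed form is UI (Exercise 4), so by the closure proposition $M_n \to \mathbf{1}_A$ a.s., since $\mathbf{1}_A$ is $\mathcal{F}_\infty$-measurable. But $A$ is independent of each $\mathcal{F}_n$ (it is measurable with respect to $\sigma(X_{n+1}, X_{n+2}, \dots)$, independent of $\mathcal{F}_n$), so $\mathbb{E}[\mathbf{1}_A \mid \mathcal{F}_n] = \mathbb{P}(A)$ a.s. for every $n$. Passing to the limit, $\mathbf{1}_A = \mathbb{P}(A)$ a.s., which forces $\mathbb{P}(A) \in \{0, 1\}$.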
Connections Master
The discrete martingale foundations of 37.04.01 (filtrations, the optional-stopping accounting, the upcrossing inequality, and the martingale convergence theorem) are the load-bearing prerequisite. Doob's maximal inequality is an optional-stopping computation at the first-crossing time, the upcrossing inequality of that unit is the oscillation-control dual of this unit's amplitude control, and the closure identity $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$ for a UI martingale is the exact strengthening of that unit's $L^1$-bounded convergence theorem.
The Radon-Nikodym theorem and conditional expectation of 02.07.08 make the closure identity meaningful: a UI martingale is a Radon-Nikodym density process $M_n = \mathbb{E}[M_\infty \mid \mathcal{F}_n]$, and the equivalence of uniform integrability with $L^1$-closure is the time-asymptotic form of the density existence proved there; the conditional-expectation lemma driving every UI claim in this unit is a direct consequence of the take-out and tower properties established in that prerequisite.
The $L^p$-space theory of 02.07.06 (Hölder's inequality, the conjugate exponent $q = p/(p-1)$, and completeness) is used directly in the proof of Doob's $L^p$ inequality and in the $L^p$ convergence of $L^p$-bounded martingales; the self-improving Hölder step that closes the maximal inequality into the constant $p/(p-1)$ is exactly the duality pairing of that unit, and the dominated-convergence upgrade to $L^p$-convergence runs in the complete space it constructs.
The Fubini-Tonelli theorem of 02.07.07 supplies the layer-cake interchange and the order-swap that converts the weak-type bound into the strong $L^p$ bound; without the Tonelli non-negativity interchange the maximal inequality would not integrate up to the $L^p$ inequality.
The a.s.-convergence companion of this circle is the unit 37.04.02 on a.s. martingale convergence (co-produced in this wave): it owns the upcrossing-driven a.s. limit theorem that this unit promotes to $L^1$ and $L^p$ convergence under uniform integrability, and the two units together form the convergence half of the martingale chapter.
Historical & philosophical context Master
Uniform integrability as the criterion separating convergence in measure from convergence in mean is classical Vitali (Giuseppe Vitali, Sull'integrazione per serie, 1907), and its identification with relative weak compactness in $L^1$ is the Dunford-Pettis theorem (Nelson Dunford and B. J. Pettis, Linear operations on summable functions, Trans. Amer. Math. Soc. 47, 1940, 323) [Dunford-Pettis 1940]. Joseph Doob isolated the maximal and $L^p$ inequalities and the systematic $L^p$-convergence theory of martingales in Stochastic Processes (1953), building on his 1940 measure-theoretic reformulation of the martingale property; the constant $p/(p-1)$ is sharp, as the example of the running maximum of the simple martingale generated by an extremal density shows.
The strong law of large numbers has a longer arc. Émile Borel proved the SLLN for Bernoulli trials in 1909 (Rend. Circ. Mat. Palermo 27), Francesco Paolo Cantelli sharpened the Borel-Cantelli machinery in 1917, and Andrei Kolmogorov gave the general first-moment SLLN and the zero-one law in Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer, 1933) [Kolmogorov 1933], originally via his three-series theorem and a truncation argument. The reversed-martingale proof presented here is due to Doob and was streamlined by J. L. Snell and later expositors; it replaces Kolmogorov's truncation with the observation, made precise by Edwin Hewitt and Leonard Savage (Symmetric measures on Cartesian products, Trans. Amer. Math. Soc. 80, 1955, 470), that $S_n/n$ is a conditional expectation given the symmetric future and that the exchangeable $\sigma$-algebra of an i.i.d. sequence is $\mathbb{P}$-degenerate. The conceptual content is that the law of averages, the closure of a uniformly integrable martingale, and the Radon-Nikodym identification of a density process at infinity are one theorem, viewed along an increasing filtration for closure and a decreasing one for the SLLN.
Bibliography Master
@book{williams1991,
author = {Williams, David},
title = {Probability with Martingales},
publisher = {Cambridge University Press},
series = {Cambridge Mathematical Textbooks},
year = {1991}
}
@book{doob1953,
author = {Doob, Joseph L.},
title = {Stochastic Processes},
publisher = {John Wiley \& Sons, New York},
year = {1953}
}
@article{dunfordpettis1940,
author = {Dunford, Nelson and Pettis, B. J.},
title = {Linear operations on summable functions},
journal = {Transactions of the American Mathematical Society},
volume = {47},
number = {3},
pages = {323--392},
year = {1940}
}
@book{kolmogorov1933,
author = {Kolmogorov, Andrey N.},
title = {Grundbegriffe der Wahrscheinlichkeitsrechnung},
publisher = {Springer, Berlin},
series = {Ergebnisse der Mathematik und ihrer Grenzgebiete},
year = {1933}
}
@article{hewittsavage1955,
author = {Hewitt, Edwin and Savage, Leonard J.},
title = {Symmetric measures on Cartesian products},
journal = {Transactions of the American Mathematical Society},
volume = {80},
number = {2},
pages = {470--501},
year = {1955}
}
@book{durrett2019,
author = {Durrett, Rick},
title = {Probability: Theory and Examples},
edition = {5th},
series = {Cambridge Series in Statistical and Probabilistic Mathematics},
publisher = {Cambridge University Press},
year = {2019}
}
@book{neveu1975,
author = {Neveu, Jacques},
title = {Discrete-Parameter Martingales},
publisher = {North-Holland, Amsterdam},
year = {1975}
}