37.04.04 · probability / 04-conditional-expectation-martingales

Kakutani's Theorem on Product Martingales and Absolute Continuity of Product Measures

shipped · 3 tiers · Lean: none

Anchor (Master): Williams — Probability with Martingales §14.12-14.18; Durrett §4.3.3; Williams — Probability with Martingales §14.18 (Kakutani's theorem, statement and proof); Shiryaev — Probability (Springer, 2nd ed., 1996) Ch. VII §6

Intuition Beginner

Imagine multiplying together a long string of random factors, each one a fair multiplier: on average each factor equals $1$, but any single factor might come out a bit above or a bit below $1$. Your running product is like a fortune in a fair game played by multiplication instead of addition: given everything so far, your best forecast for tomorrow's product is exactly today's. The question is what happens after infinitely many factors: does the product settle to a sensible positive number, or does it collapse to zero?

The surprising answer is that there is no middle ground. Either the product converges to a genuinely positive limit and behaves like a faithful final settlement, or it slides all the way to zero, with no possibility of hovering at some random positive value in an unreliable way. Which of the two happens is decided by a single number you can compute in advance: take the average of the square root of each factor, and multiply all those averages together. If that grand product stays positive, the fortune survives; if it shrinks to zero, the fortune dies.

Why the square root? Because the square root gently penalises spread. A factor that is sometimes much bigger and sometimes much smaller than $1$ has an average of $1$, but the average of its square root is strictly less than $1$: the square root sees the risk that the additive average hides. Each risky factor chips a little off the square-root product, and if the chips add up to everything, the fortune is doomed.

The takeaway: a product of independent fair factors lives or dies by a clean all-or-nothing rule, and the verdict is read off a single square-root bookkeeping number. This same rule decides when two ways of assigning probabilities to an endless sequence of experiments are close enough to be comparable, or so far apart that each calls impossible what the other calls certain.

Visual Beginner

Picture running products plotted across many multiplication steps. The horizontal axis is the number of factors multiplied so far; the vertical axis is the running product, starting at $1$. One path keeps fluctuating around a comfortable positive band and settles to a positive number. A second path drifts steadily downward, hugging zero. The square-root bookkeeping number, that is, the running product of the average square roots, is drawn underneath each: it stays positive for the surviving path and decays to zero for the dying one.

The picture makes the dichotomy visible: the running product either finds a positive home or falls to zero, and the companion square-root number predicts which, computed from the factors alone before any path is drawn.

Worked example Beginner

We test the rule on a concrete product. Let each factor $X_k$ take the value $1 + a$ or $1 - a$, each with probability one half, for a fixed number $a$ with $0 < a < 1$. The factors are independent. The running product is $M_n = X_1 X_2 \cdots X_n$.

Step 1. Check the fair-game property. The average of each factor is $\tfrac12(1 + a) + \tfrac12(1 - a) = 1$. So, by independence, the average of the product is $\mathbb{E}[M_n] = 1$ for every $n$: a fair multiplicative game.

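Step 2. Compute the square-root bookkeeping number for one factor. We need the average of $\sqrt{X_k}$: $$ \mathbb{E}\big[\sqrt{X_k}\big] = \tfrac12\sqrt{1 + a} + \tfrac12\sqrt{1 - a}. $$ Squaring the right-hand side gives $\tfrac12\big(1 + \sqrt{1 - a^2}\big)$, which is strictly less than $1$ whenever $0 < a < 1$, so the average square root is strictly less than $1$.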

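Step 3. Multiply over all factors. The square-root product after $n$ factors is $c^n$, where $c = \tfrac12\sqrt{1 + a} + \tfrac12\sqrt{1 - a}$ is the per-factor average square root. Since $0 < c < 1$, raising it to higher and higher powers drives it to zero: $c^n \to 0$ as $n \to \infty$, and it keeps shrinking at every step.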

Step 4. Read the verdict. The square-root product collapses to zero, so the rule predicts the running product dies: $M_n \to 0$ with probability one. Each factor is genuinely risky (it really does swing by $\pm a$), each swing chips the square-root product below $1$, and infinitely many chips erase it.

What this tells us: even though every running product has average exactly $1$, the typical product collapses to zero. The average is propped up by rare paths where the product stays large; the square-root number sees past that illusion and reports the fate of the typical path. Only if the factors were close enough to $1$ that the square-root product stayed positive would the fortune survive.
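The four steps can be checked numerically. The sketch below is illustrative only: the spread `a = 0.5`, the seed, the path count and the helper names `sqrt_mean` and `simulate_product` are assumptions chosen for the demonstration, not values taken from the text.

```python
import math
import random

def sqrt_mean(a):
    """Average of sqrt(X) when X is 1+a or 1-a with probability 1/2 each."""
    return 0.5 * math.sqrt(1 + a) + 0.5 * math.sqrt(1 - a)

def simulate_product(a, n_steps, rng):
    """Return the running product after n_steps independent factors."""
    log_product = 0.0  # accumulate logs to avoid floating-point underflow
    for _ in range(n_steps):
        factor = 1 + a if rng.random() < 0.5 else 1 - a
        log_product += math.log(factor)
    return math.exp(log_product)

a = 0.5  # assumed illustrative spread, not a value from the text
c = sqrt_mean(a)
sqrt_products = {n: c ** n for n in (10, 100, 1000)}

rng = random.Random(0)  # fixed seed so the run is reproducible
finals = [simulate_product(a, 1000, rng) for _ in range(200)]
share_tiny = sum(m < 1e-6 for m in finals) / len(finals)

print(f"per-factor square-root average: {c:.4f}")
for n, value in sqrt_products.items():
    print(f"square-root product after {n} factors: {value:.3e}")
print(f"share of simulated products below 1e-6: {share_tiny:.2f}")
```

The simulation works with logarithms because the raw product underflows quickly; the square-root product is deterministic and is computed directly as a power.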

Check your understanding Beginner

Formal definition Intermediate+

Fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Martingales, the a.s. convergence theorem for $L^1$-bounded martingales 37.04.02, the upcrossing inequality and optional-stopping accounting 37.04.01, and the uniform-integrability / $L^1$-closure theory 37.04.03 are taken as established. Conditional expectation is built from the Radon-Nikodym theorem 02.07.08, and product measures from Fubini-Tonelli 02.07.07.

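Definition (product martingale). Let $X_1, X_2, \dots$ be independent, non-negative, integrable random variables with $\mathbb{E}[X_k] = 1$ for every $k$. Set $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_n = \sigma(X_1, \dots, X_n)$. The product martingale is $$ M_0 = 1, \qquad M_n = \prod_{k=1}^n X_k . $$ That $(M_n)$ is a martingale follows from independence: $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = M_n\,\mathbb{E}[X_{n+1}] = M_n$, using that $X_{n+1}$ is independent of $\mathcal{F}_n$. Each $M_n \ge 0$ and $\mathbb{E}[M_n] = 1$, so $(M_n)$ is a non-negative $L^1$-bounded martingale and by 37.04.02 converges a.s. to a limit $M_\infty \ge 0$ with $\mathbb{E}[M_\infty] \le 1$ (Fatou).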

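Definition (Hellinger coefficient). For each $k$ set $$ a_k = \mathbb{E}\big[\sqrt{X_k}\big] \in (0, 1]. $$ By Jensen's inequality applied to the strictly concave map $x \mapsto \sqrt{x}$, $a_k \le \sqrt{\mathbb{E}[X_k]} = 1$, with equality iff $X_k = 1$ a.s. The Hellinger product is $A = \prod_{k=1}^\infty a_k \in [0, 1]$; its partial products $\prod_{k=1}^n a_k$ are decreasing in $n$, so they converge either to a strictly positive limit or to $0$.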

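Definition (Hellinger integral of two laws). Let $p, q$ be probability densities with respect to a common $\sigma$-finite measure $\mu$ (for instance the sum of the two laws). The Hellinger integral (affinity) is $$ H(p, q) = \int \sqrt{p\,q}\;d\mu \in [0, 1], $$ independent of the dominating $\mu$. The associated Hellinger distance $h(p, q)$ is defined by $h(p, q)^2 = \tfrac12 \int \big(\sqrt{p} - \sqrt{q}\big)^2 d\mu$, so that $H(p, q) = 1 - h(p, q)^2$. When $X_k = q_k / p_k$ is a likelihood ratio under $P$ with $p_k, q_k$ the $k$-th coordinate densities, $a_k = \mathbb{E}_P\big[\sqrt{q_k / p_k}\big] = \int \sqrt{p_k\,q_k}\;d\mu = H(p_k, q_k)$.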
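For discrete laws the Hellinger integral is a finite sum, which makes the identification of the Hellinger coefficient with the affinity easy to check. The sketch below is a minimal illustration: the two probability vectors and the function names are hypothetical, and the normalisation $h^2 = \tfrac12 \sum_i (\sqrt{p_i} - \sqrt{q_i})^2$ is assumed (other texts omit the factor $\tfrac12$).

```python
import math

def hellinger_affinity(p, q):
    """H(p, q) = sum_i sqrt(p_i * q_i) for two discrete probability vectors."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def sqrt_mean_of_likelihood_ratio(p, q):
    """E_p[sqrt(q/p)]: the Hellinger coefficient of the likelihood ratio q/p."""
    return sum(pi * math.sqrt(qi / pi) for pi, qi in zip(p, q) if pi > 0)

def hellinger_distance_squared(p, q):
    """Assumed normalisation: h^2 = (1/2) * sum_i (sqrt(p_i) - sqrt(q_i))^2."""
    return 0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]  # hypothetical law P on three points
q = [0.2, 0.3, 0.5]  # hypothetical law Q on the same points

H = hellinger_affinity(p, q)
a = sqrt_mean_of_likelihood_ratio(p, q)
h_squared = hellinger_distance_squared(p, q)

print(f"affinity H(p, q)           = {H:.6f}")
print(f"E_p[sqrt(q/p)]             = {a:.6f}")
print(f"1 - h^2 (should match H)   = {1 - h_squared:.6f}")
```

Under this normalisation the affinity equals $1$ exactly when the two laws coincide and is $0$ exactly when they have disjoint supports.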

Counterexamples to common slips Intermediate+

  • Mean one does not protect positivity of the limit. The product with $X_k \in \{0, 2\}$ equally likely satisfies $\mathbb{E}[M_n] = 1$, yet $M_\infty = 0$ a.s., because the product is stuck at $0$ once any factor hits $0$. The constant mean is propped up by an event of vanishing probability.

  • Pointwise convergence $a_k \to 1$ is not enough. Even with each $a_k$ close to $1$, the verdict is the convergence of the infinite product $\prod_k a_k$, equivalently the convergence of $\sum_k (1 - a_k)$. If $\sum_k (1 - a_k) = \infty$ the product collapses; the rate, not the limit of the terms, decides survival.

  • $L^1$-boundedness is automatic and says nothing. Every product martingale here is $L^1$-bounded ($\mathbb{E}|M_n| = 1$), so the a.s. limit always exists. The genuine content is whether the convergence is in $L^1$ (limit faithful, strictly positive) or merely a.s. (limit zero); the dichotomy is exactly the UI/non-UI split of 37.04.03.

  • Absolute continuity is all-or-nothing for products, unlike for single measures. Two laws on one coordinate can be neither equivalent nor singular (e.g. one supported on a sub-interval of the other). For infinite products $P$ and $Q$ of independent coordinates the Kakutani dichotomy forces $Q \ll P$ or $Q \perp P$ with nothing in between, because the relevant event is a tail event.
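The second slip, that the rate and not the limit of the coefficients decides survival, can be seen on two hypothetical coefficient sequences that both tend to $1$. The sequences below are assumptions chosen because their partial products have closed forms: $\prod_{k=1}^n \big(1 - \tfrac{1}{k+1}\big) = \tfrac{1}{n+1}$ and $\prod_{k=1}^n \big(1 - \tfrac{1}{(k+1)^2}\big) = \tfrac{n+2}{2(n+1)}$.

```python
def partial_product(coeffs):
    """Multiply a finite run of coefficients a_1, ..., a_n."""
    result = 1.0
    for a_k in coeffs:
        result *= a_k
    return result

n = 100_000

# Deficits 1 - a_k = 1/(k+1): not summable, so the product collapses.
slow = partial_product(1 - 1 / (k + 1) for k in range(1, n + 1))

# Deficits 1 - a_k = 1/(k+1)^2: summable, so the product survives.
fast = partial_product(1 - 1 / (k + 1) ** 2 for k in range(1, n + 1))

print(f"non-summable deficits: product = {slow:.3e}, closed form = {1 / (n + 1):.3e}")
print(f"summable deficits:     product = {fast:.6f}, closed form = {(n + 2) / (2 * (n + 1)):.6f}")
```

Both sequences have $a_k \to 1$, yet the first partial product tends to $0$ and the second to $\tfrac12$.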

Key theorem with proof Intermediate+

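Theorem (Kakutani's dichotomy for product martingales). Let $X_1, X_2, \dots$ be independent, non-negative, with $\mathbb{E}[X_k] = 1$, and let $M_n = \prod_{k=1}^n X_k$ with a.s. limit $M_\infty$. Set $a_k = \mathbb{E}[\sqrt{X_k}]$ and $A = \prod_{k=1}^\infty a_k$. Then exactly one of the following holds.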

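(i) If $A > 0$, then $(M_n)$ is uniformly integrable, $M_n \to M_\infty$ in $L^1$, $\mathbb{E}[M_\infty] = 1$, and, provided each $X_k > 0$ a.s., $M_\infty > 0$ a.s.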

(ii) If $A = 0$, then $M_\infty = 0$ a.s.

Proof. Define the auxiliary process $$ N_n = \prod_{k=1}^n \frac{\sqrt{X_k}}{a_k}. $$ Each factor $\sqrt{X_k}/a_k$ is non-negative with mean $1$, and the factors are independent, so by the same computation as for $(M_n)$, $(N_n)$ is a non-negative martingale with $\mathbb{E}[N_n] = 1$. In particular $(N_n)$ is $L^1$-bounded and converges a.s. to some $N_\infty \ge 0$.

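Case (i): $A > 0$. We show $(N_n)$ is $L^2$-bounded, hence UI by 37.04.03. Compute, using independence and $\mathbb{E}[X_k] = 1$, $$ \mathbb{E}[N_n^2] = \prod_{k=1}^n \mathbb{E}\Big[\frac{X_k}{a_k^2}\Big] = \prod_{k=1}^n \frac{1}{a_k^2} = \Big(\prod_{k=1}^n a_k\Big)^{-2} \le A^{-2} < \infty, $$ since $\prod_{k=1}^n a_k \ge A$, as the $a_k \le 1$ make the partial products decrease to $A$. Thus $\sup_n \mathbb{E}[N_n^2] \le A^{-2}$: $(N_n)$ is $L^2$-bounded. By Doob's $L^2$ theory 37.04.03 $(N_n)$ converges in $L^2$ (hence in $L^1$) to a limit $N_\infty$ with $\mathbb{E}[N_\infty] = 1$. The pointwise identity $$ M_n = \prod_{k=1}^n X_k = \Big(\prod_{k=1}^n a_k\Big)^2 N_n^2 \le N_n^2 $$ holds because $N_n^2 = M_n \big/ \big(\prod_{k=1}^n a_k\big)^2$ and $\prod_{k=1}^n a_k \le 1$. The $L^2$ convergence $N_n \to N_\infty$ gives $N_n^2 \to N_\infty^2$ in $L^1$, and an $L^1$-convergent sequence is uniformly integrable by Vitali's theorem 37.04.03; hence $(N_n^2)$ is UI. Domination by a UI family preserves UI, so $0 \le M_n \le N_n^2$ makes $(M_n)$ uniformly integrable. Therefore $M_n \to M_\infty$ in $L^1$ by 37.04.03, and $\mathbb{E}[M_\infty] = \lim_n \mathbb{E}[M_n] = 1$.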

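It remains to show $M_\infty > 0$ a.s. when each $X_k > 0$ a.s. Since $M_n \to M_\infty$ in $L^1$ and $\mathbb{E}[M_n] = 1$, we have $\mathbb{E}[M_\infty] = 1$, so $M_\infty$ is not a.s. zero. The event $\{M_\infty = 0\}$ is a tail event of the independent sequence $(X_k)$: indeed $\{M_\infty = 0\} = \{\lim_n \prod_{k=m+1}^n X_k = 0\}$ up to the behaviour of finitely many positive factors, hence measurable with respect to $\sigma(X_{m+1}, X_{m+2}, \dots)$ for every $m$. By the Kolmogorov 0-1 law 37.04.03 it has probability $0$ or $1$; as $\mathbb{E}[M_\infty] = 1$ rules out probability $1$, we get $\mathbb{P}(M_\infty = 0) = 0$, i.e. $M_\infty > 0$ a.s. Finally $M_n = \big(\prod_{k=1}^n a_k\big)^2 N_n^2 \to A^2 N_\infty^2$ a.s. (the partial products converge to $A$), and matching this a.s. limit with $M_\infty$ gives $M_\infty = A^2 N_\infty^2$ a.s.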

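Case (ii): $A = 0$. Then $\prod_{k=1}^n a_k \to 0$. The process $(N_n)$ is still a non-negative martingale with $\mathbb{E}[N_n] = 1$, so it converges a.s. to a limit $N_\infty < \infty$ a.s. (a.s. finiteness of the limit of a non-negative $L^1$-bounded martingale, 37.04.02). Therefore $$ M_n = \Big(\prod_{k=1}^n a_k\Big)^2 N_n^2 \xrightarrow[n\to\infty]{} 0 \cdot N_\infty^2 = 0 \quad \text{a.s.}, $$ because $\big(\prod_{k=1}^n a_k\big)^2 \to 0$ while $N_n^2 \to N_\infty^2 < \infty$ a.s. Hence $M_\infty = 0$ a.s. (Here the convergence $M_n \to M_\infty$ is not in $L^1$, since $\mathbb{E}[M_n] = 1 \ne 0 = \mathbb{E}[M_\infty]$; the martingale fails UI exactly because the Hellinger product collapses.)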
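The two computations that drive the proof, the second-moment formula $\mathbb{E}[N_n^2] = \big(\prod_{k \le n} a_k\big)^{-2}$ and the pointwise identity $M_n = \big(\prod_{k \le n} a_k\big)^2 N_n^2$, can be verified by exact enumeration on a small example. The sketch below assumes two-point factors $X_k = 1 \pm b_k$ with equal probability; the spreads in `spreads` are hypothetical values chosen only for illustration.

```python
import itertools
import math

def hellinger_coeff(b):
    """a = E[sqrt(X)] for X equal to 1+b or 1-b with probability 1/2 each."""
    return 0.5 * math.sqrt(1 + b) + 0.5 * math.sqrt(1 - b)

spreads = [0.9, 0.5, 0.3, 0.2, 0.1, 0.05]  # hypothetical b_k, each in (0, 1)
coeffs = [hellinger_coeff(b) for b in spreads]
partial_A = math.prod(coeffs)  # finite Hellinger product a_1 * ... * a_n

mean_M = mean_N = mean_N_sq = 0.0
max_identity_gap = 0.0
weight = 0.5 ** len(spreads)  # every sign pattern is equally likely

# Enumerate all 2^n outcomes so the expectations are exact up to rounding.
for signs in itertools.product((+1, -1), repeat=len(spreads)):
    factors = [1 + s * b for s, b in zip(signs, spreads)]
    M = math.prod(factors)
    N = math.prod(math.sqrt(x) / a for x, a in zip(factors, coeffs))
    mean_M += weight * M
    mean_N += weight * N
    mean_N_sq += weight * N * N
    max_identity_gap = max(max_identity_gap, abs(M - partial_A ** 2 * N ** 2))

print(f"E[M_n]   = {mean_M:.12f}")
print(f"E[N_n]   = {mean_N:.12f}")
print(f"E[N_n^2] = {mean_N_sq:.12f}  vs  (prod a_k)^-2 = {partial_A ** -2:.12f}")
print(f"largest gap in M_n = (prod a_k)^2 N_n^2: {max_identity_gap:.2e}")
```

Enumeration is feasible only for a handful of factors; it is used here because it avoids Monte Carlo noise in the second moment.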

Bridge. Kakutani's dichotomy builds toward the absolute-continuity-versus-singularity classification of infinite product measures and appears again in the theory of statistical contiguity and the Neyman-Pearson likelihood-ratio test, where $M_n$ is the likelihood ratio of two hypotheses after $n$ observations. The foundational reason the square root governs everything is that $a_k = \mathbb{E}[\sqrt{X_k}]$ is the Hellinger affinity $H(p_k, q_k)$, and $\sqrt{M_n}$ is, up to the deterministic Hellinger factor, an $L^2$-bounded martingale. This is exactly the move that converts an $L^1$ question about $M_n$ into an $L^2$ question about $N_n$, where the Hilbert-space geometry of 37.04.03 applies. The dichotomy is dual to the all-or-nothing tail behaviour of independent sequences: $\{M_\infty = 0\}$ is a tail event, so the Kolmogorov 0-1 law forces a clean split. Putting these together, the central insight is that uniform integrability of the product martingale, positivity of its limit, and the survival of the Hellinger product are one condition viewed through the $L^2$ lift, the tail 0-1 law, and the $L^1$-closure theorem respectively.

Exercises Intermediate+

Advanced results Master

The product martingale of independent unit-mean factors is the prototype on which the entire likelihood-ratio theory of statistics is built, and Kakutani's dichotomy is the statement that the asymptotics admit no intermediate regime. The square-root lift $M_n = \big(\prod_{k=1}^n a_k\big)^2 N_n^2$ is the structural device: it separates the deterministic Hellinger decay $\prod_{k=1}^n a_k$ from a genuine $L^1$-bounded martingale $N_n$, so that the survival question for the $L^1$ object $M_n$ reduces to the Hilbert-space boundedness of $N_n$, decided exactly by $A > 0$. The same lift shows the dichotomy is sharp: when $A = 0$ the martingale $N_n$ still converges a.s. to a finite limit, and multiplying by the vanishing deterministic factor forces $M_\infty = 0$, while the constant mean $\mathbb{E}[M_n] = 1$ certifies the failure of $L^1$ convergence. This is the canonical UI/non-UI split of 37.04.03.

The signature application is the Kakutani dichotomy for infinite product measures. Let $P = \bigotimes_k P_k$ and $Q = \bigotimes_k Q_k$ be product probability measures on a countable product of measurable spaces, with $Q_k \ll P_k$ on each coordinate. Under $P$ the finite-coordinate likelihood ratios $M_n = \prod_{k=1}^n \frac{dQ_k}{dP_k}(\omega_k)$ form a product martingale with $a_k = H(P_k, Q_k)$. Kakutani's theorem then yields a perfect alternative: either $\prod_k H(P_k, Q_k) > 0$, in which case $Q \ll P$ with Radon-Nikodym derivative $dQ/dP = M_\infty$ (the limit existing in $L^1(P)$, and strictly positive $P$-a.s. when the coordinate laws are equivalent); or $\prod_k H(P_k, Q_k) = 0$, in which case $Q \perp P$. Because the affinity $H(P_k, Q_k)$ is symmetric in $P_k$ and $Q_k$, when the coordinate laws are mutually absolutely continuous the dichotomy reads $P \sim Q$ (mutual equivalence) versus $P \perp Q$ (mutual singularity), with no possibility of one-sided absolute continuity for infinite products of equivalent factors. This is the Kakutani 0-1 phenomenon: the event $\{M_\infty > 0\}$ is a tail event of the independent coordinates under $P$, so the Kolmogorov 0-1 law forces $P(M_\infty > 0) \in \{0, 1\}$. The derivative is either a.s. positive (equivalence) or a.s. zero (singularity), never positive on a proper intermediate set.

The Hellinger geometry makes the criterion computable. The Hellinger affinity $H_k = H(P_k, Q_k)$ and the Hellinger distance $h_k$ with $h_k^2 = 1 - H_k$ turn the product condition $\prod_k H_k > 0$ into the additive condition $\sum_k (1 - H_k) < \infty$, equivalently $\sum_k h_k^2 < \infty$. Summability of the squared per-coordinate Hellinger distances is exactly the threshold separating equivalence from singularity. For Gaussian shifts $N(0, 1)$ versus $N(\theta_k, 1)$ one computes $H_k = \exp(-\theta_k^2 / 8)$, recovering the Cameron-Martin / Feldman-Hájek condition $\sum_k \theta_k^2 < \infty$: two product Gaussian measures on a sequence space are either equivalent or mutually singular, equivalence holding iff the mean shift lies in the Cameron-Martin space $\ell^2$. The infinite-dimensional rigidity (no intermediate absolute continuity, only equivalence or orthogonality) is the measure-theoretic face of the fact that infinite-dimensional Gaussian measures have no translation-invariant reference measure.
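The Gaussian computation can be checked numerically. The sketch below assumes unit-variance coordinates, $N(0, 1)$ against $N(\theta, 1)$, for which the affinity has the closed form $\exp(-\theta^2/8)$; the shift sequences $\theta_k = 1/k$ and $\theta_k = 1/\sqrt{k}$, the integration window and the function names are illustrative assumptions.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2 * math.pi)

def affinity_numeric(theta, half_width=12.0, steps=20_000):
    """Midpoint-rule approximation of the integral of sqrt(phi(x) * phi(x - theta)).

    Assumes theta >= 0 so that the window [-half_width, half_width + theta]
    covers both densities.
    """
    lo, hi = -half_width, half_width + theta
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += math.sqrt(normal_pdf(x, 0.0) * normal_pdf(x, theta))
    return total * dx

def affinity_closed_form(theta):
    """Hellinger affinity of N(0, 1) and N(theta, 1)."""
    return math.exp(-theta ** 2 / 8)

def hellinger_product(shifts):
    """Product of the closed-form affinities, i.e. exp(-sum(theta_k^2) / 8)."""
    return math.exp(-sum(t * t for t in shifts) / 8)

theta = 1.5  # hypothetical single shift for the integration check
numeric = affinity_numeric(theta)
exact = affinity_closed_form(theta)

n = 100_000
square_summable = hellinger_product(1 / k for k in range(1, n + 1))
not_summable = hellinger_product(1 / math.sqrt(k) for k in range(1, n + 1))

print(f"affinity for theta = {theta}: numeric {numeric:.8f}, closed form {exact:.8f}")
print(f"theta_k = 1/k       : product over {n} coordinates = {square_summable:.4f}")
print(f"theta_k = 1/sqrt(k) : product over {n} coordinates = {not_summable:.4f}")
```

With $\theta_k = 1/\sqrt{k}$ the sum of squares is a harmonic sum, so the product decays only like a small negative power of $n$; the collapse to zero is real but slow.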

In statistics this is the asymptotics of likelihood-ratio testing and contiguity. The product martingale $M_n$ is the likelihood ratio after $n$ independent observations; $M_\infty > 0$ a.s. (the contiguous, mutually-absolutely-continuous regime) is the case where no test can perfectly separate $P$ from $Q$ even with infinitely many observations, while $M_\infty = 0$ $P$-a.s. (singularity) is the case of asymptotically perfect discrimination, where a consistent test of power one against size zero exists. Le Cam's theory of contiguity is the local refinement: when the alternative $Q^{(n)}$ depends on $n$ so that the Hellinger product tends to a positive constant, $P^{(n)}$ and $Q^{(n)}$ are mutually contiguous, and the log-likelihood ratio $\log M_n$ is asymptotically Gaussian by the central limit theorem applied to $\sum_k (\sqrt{X_k} - 1)$, the linearisation of $\log M_n$ that the square-root lift makes natural.

Synthesis. The foundational reason these results cohere is that $\sqrt{M_n}$ divided by its deterministic Hellinger factor is an $L^2$-bounded martingale precisely when the Hellinger product survives. This is exactly the device that converts the $L^1$-closure question of 37.04.03 into a Hilbert-space boundedness question, and it generalises the additive strong-law machinery to the multiplicative setting. The dichotomy is dual to the tail-event 0-1 law: $\{M_\infty > 0\}$ is a tail event, so the Kolmogorov law forces the clean split. Putting these together, the central insight is that uniform integrability of the likelihood-ratio martingale, mutual absolute continuity of the infinite product measures, and summability of the squared Hellinger distances are one condition read through the $L^1$-closure theorem, the Radon-Nikodym derivative, and the additive Hellinger geometry respectively. This is the multiplicative companion of the additive Kolmogorov three-series and zero-one circle of 37.04.03: there the survival of a sum of independent terms was decided by summable variances; here the survival of a product is decided by summable Hellinger deficits. The bridge is that taking the square root linearises the product into the sum, so Kakutani's theorem is the strong law of large numbers wearing the Hellinger metric, with the Radon-Nikodym derivative playing the role that the sample mean plays for sums: the faithful limit when the system closes, the identically-zero collapse when it does not.

Full proof set Master

The dichotomy and its product-measure corollary are proved in full above. The remaining Master claims are recorded here.

Proposition (Hellinger criterion as a series). For probability densities $p_k, q_k$ with $H(p_k, q_k) > 0$, the infinite product $\prod_k H(p_k, q_k)$ is positive iff $\sum_k h_k^2 < \infty$, where $h_k^2 = 1 - H(p_k, q_k)$.

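Proof. Set $a_k = H(p_k, q_k) = 1 - h_k^2$ with $h_k^2 \in [0, 1)$. The product $\prod_k a_k$ is positive iff $\sum_k -\log(1 - h_k^2) < \infty$. Since $-\log(1 - x) \ge x$ for $x \in [0, 1)$, $\sum_k h_k^2 \le \sum_k -\log(1 - h_k^2)$, so a positive product forces $\sum_k h_k^2 < \infty$. Conversely, if $\sum_k h_k^2 < \infty$ then $h_k^2 \to 0$, and for $0 \le x \le \tfrac12$ one has $-\log(1 - x) \le 2x$; only finitely many indices have $h_k^2 > \tfrac12$ and each contributes a finite term, so $\sum_k -\log(1 - h_k^2) < \infty$ and $\prod_k a_k > 0$.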
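The proof rests on the elementary sandwich $x \le -\log(1 - x) \le 2x$ for $0 \le x \le \tfrac12$. The sketch below checks it on a grid and then on a hypothetical deficit sequence $h_k^2 = 1/(k+1)^2$ (an assumption made only for illustration, chosen so that every term is at most $\tfrac14$).

```python
import math

def neg_log_one_minus(x):
    """-log(1 - x), computed with log1p for accuracy near zero."""
    return -math.log1p(-x)

# Grid check of x <= -log(1 - x) <= 2x on [0, 1/2].
grid = [i / 1000 * 0.5 for i in range(1001)]
sandwich_holds = all(
    x <= neg_log_one_minus(x) + 1e-15 and neg_log_one_minus(x) <= 2 * x + 1e-15
    for x in grid
)

# Hypothetical squared Hellinger distances h_k^2 = 1/(k+1)^2, all <= 1/4.
n = 10_000
deficits = [1 / (k + 1) ** 2 for k in range(1, n + 1)]
sum_deficits = sum(deficits)
neg_log_product = sum(neg_log_one_minus(h2) for h2 in deficits)
product = math.exp(-neg_log_product)

print(f"sandwich holds on the grid: {sandwich_holds}")
print(f"sum h_k^2              = {sum_deficits:.6f}")
print(f"-log prod (1 - h_k^2)  = {neg_log_product:.6f}")
print(f"2 * sum h_k^2          = {2 * sum_deficits:.6f}")
print(f"partial product        = {product:.6f}")
```

The middle quantity is trapped between the sum and twice the sum, which is why the product is positive exactly when the series converges.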

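Proposition (Kakutani equivalence/singularity for product measures). Let $P = \bigotimes_k P_k$, $Q = \bigotimes_k Q_k$ on the product space $(\Omega, \mathcal{F})$ with $P_k \sim Q_k$ (mutual equivalence) on each coordinate. Then either $P \sim Q$ or $P \perp Q$; the first holds iff $\prod_k H(P_k, Q_k) > 0$, and then $dQ/dP = M_\infty$ $P$-a.s.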

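Proof. Under $P$, the likelihood ratios $X_k = \frac{dQ_k}{dP_k}(\omega_k)$ are independent with $\mathbb{E}_P[X_k] = 1$ and $\mathbb{E}_P[\sqrt{X_k}] = H(P_k, Q_k)$. So $M_n = \prod_{k=1}^n X_k$ is a Kakutani product martingale under $P$ and $M_n = dQ/dP$ on $\mathcal{F}_n$. Write $A = \prod_k H(P_k, Q_k)$. If $A > 0$, case (i) gives $M_n \to M_\infty$ in $L^1(P)$, $M_\infty > 0$ $P$-a.s., $\mathbb{E}_P[M_\infty] = 1$. For $B \in \mathcal{F}_m$ and $n \ge m$, $Q(B) = \mathbb{E}_P[M_n \mathbf{1}_B]$, and since $M_n \to M_\infty$ in $L^1(P)$, letting $n \to \infty$ gives $Q(B) = \mathbb{E}_P[M_\infty \mathbf{1}_B]$. The measures $Q$ and $M_\infty \cdot P$ therefore agree on the generating field $\bigcup_m \mathcal{F}_m$, hence on $\mathcal{F}$; thus $Q \ll P$ with $dQ/dP = M_\infty$. As $M_\infty > 0$ $P$-a.s., the symmetric argument with $P, Q$ exchanged (using $H(Q_k, P_k) = H(P_k, Q_k)$) gives $P \ll Q$, so $P \sim Q$. If $A = 0$, case (ii) gives $M_\infty = 0$ $P$-a.s.; then $\mathbb{E}_P[M_n] = 1$ for all $n$, yet $\mathbb{E}_P[M_\infty] = 0$, so $M_\infty$ cannot be $dQ/dP$. By the symmetric computation under $Q$, the reciprocal martingale $1/M_n$ has the same Hellinger product $A = 0$, so it collapses $Q$-a.s.; hence $M_n \to \infty$ $Q$-a.s. The set $S = \{M_n \to 0\}$ has $P(S) = 1$ and $Q(S) = 0$, witnessing $P \perp Q$.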
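The singular case of the proof can be watched in simulation. The sketch below assumes identically distributed Bernoulli coordinates, success probability `0.5` under $P$ and `0.6` under $Q$ (hypothetical values), so the per-coordinate affinity is a constant below $1$ and the Hellinger product tends to zero. It tracks $\log M_n$ on one sample path drawn under each measure.

```python
import math
import random

def log_likelihood_ratio(sample, p_success, q_success):
    """log of prod_k (dQ_k/dP_k)(x_k) for Bernoulli coordinates."""
    total = 0.0
    for x in sample:
        if x == 1:
            total += math.log(q_success / p_success)
        else:
            total += math.log((1 - q_success) / (1 - p_success))
    return total

def draw(n, success, rng):
    """Draw n independent Bernoulli(success) coordinates."""
    return [1 if rng.random() < success else 0 for _ in range(n)]

p_success, q_success = 0.5, 0.6  # hypothetical coordinate laws
n = 5000
rng = random.Random(1)  # fixed seed so the run is reproducible

affinity = (math.sqrt(p_success * q_success)
            + math.sqrt((1 - p_success) * (1 - q_success)))
hellinger_product = affinity ** n

log_M_under_P = log_likelihood_ratio(draw(n, p_success, rng), p_success, q_success)
log_M_under_Q = log_likelihood_ratio(draw(n, q_success, rng), p_success, q_success)

print(f"per-coordinate affinity: {affinity:.6f}")
print(f"Hellinger product after {n} coordinates: {hellinger_product:.3e}")
print(f"log M_n on a path drawn under P: {log_M_under_P:.1f}")
print(f"log M_n on a path drawn under Q: {log_M_under_Q:.1f}")
```

A single path is only suggestive, not a proof: it shows the likelihood ratio drifting to zero on $P$-typical data and to infinity on $Q$-typical data, which is the separating set used in the argument.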

Proposition (Kakutani 0-1 phenomenon). In the setting above, the event $\{M_\infty > 0\}$ is a $P$-tail event, so $P(M_\infty > 0) \in \{0, 1\}$; correspondingly $P \sim Q$ or $P \perp Q$ with no intermediate case.

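Proof. The a.s. limit satisfies, for each $m$, $M_\infty = M_m \cdot \lim_n \prod_{k=m+1}^n X_k$, and each $0 < M_m < \infty$ $P$-a.s. (as $P_k \sim Q_k$). Hence $\{M_\infty > 0\} = \{\lim_n \prod_{k=m+1}^n X_k > 0\}$ up to a $P$-null set, which is measurable with respect to $\sigma(X_{m+1}, X_{m+2}, \dots)$ for every $m$; thus $\{M_\infty > 0\}$ lies in the tail $\sigma$-algebra $\mathcal{T} = \bigcap_m \sigma(X_{m+1}, X_{m+2}, \dots)$. By the Kolmogorov 0-1 law 37.04.03 under the independent product $P$, $P(M_\infty > 0) \in \{0, 1\}$. If $P(M_\infty > 0) = 1$ then $M_\infty > 0$ $P$-a.s. and the previous proposition gives $P \sim Q$; if $P(M_\infty > 0) = 0$ then $M_\infty = 0$ $P$-a.s. and $P \perp Q$. No proper-subset positivity of the derivative is possible.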

Connections Master

The discrete martingale foundations of 37.04.01 supply the framework: the product $M_n$ is a martingale by the independence-plus-take-out computation of that unit, and the optional-stopping and tail-$\sigma$-algebra apparatus there is what lets $\{M_\infty > 0\}$ be analysed as a stopping-time-free tail event.

The almost-sure convergence theorem of 37.04.02 is the load-bearing input that guarantees $M_\infty$ exists a.s. for the non-negative $L^1$-bounded product martingale before any dichotomy is invoked; the a.s. finiteness of the limit of $N_n$ in case (ii) is exactly that unit's non-negative-martingale convergence statement, and the upcrossing mechanism is what forbids the product from oscillating instead of settling.

The uniform-integrability and $L^1$-closure theory of 37.04.03 is where the dichotomy lives: case (i) is precisely the UI/$L^1$-closure regime, proved here by lifting to the $L^2$-bounded martingale $N_n$ and invoking Doob's $L^2$ bound, and case (ii) is the non-UI a.s.-but-not-$L^1$ regime; the Kolmogorov 0-1 law used to pin $\mathbb{P}(M_\infty > 0) \in \{0, 1\}$ is the same zero-one circle assembled in that unit.

The Radon-Nikodym theorem of 02.07.08 gives meaning to the product-measure corollary: $dQ/dP = M_\infty$ identifies the martingale limit with the density of the infinite product law, and the per-coordinate likelihood ratios $X_k = dQ_k/dP_k$ are the Radon-Nikodym derivatives constructed there; the whole equivalence-versus-singularity statement is a Radon-Nikodym alternative read at infinity.

The Fubini-Tonelli and product-measure construction of 02.07.07 is what makes $P = \bigotimes_k P_k$, $Q = \bigotimes_k Q_k$ and the Hellinger integral well-defined objects, and the independence-under-$P$ factorisation of the likelihood ratio is the product-measure structure of that unit.

Historical & philosophical context Master

Shizuo Kakutani proved the dichotomy for infinite product measures in On equivalence of infinite product measures (Ann. of Math. 49, 1948, 214) [Kakutani 1948], establishing that two infinite product probability measures are either equivalent or mutually singular, with the alternative decided by the convergence of the product of Hellinger integrals. The Hellinger integral and the associated quadratic-form distance go back to Ernst Hellinger's 1909 study of quadratic forms in infinitely many variables (J. Reine Angew. Math. 136, 1909, 210) [Hellinger 1909], originally a tool in the spectral theory of operators rather than probability; its reinterpretation as a measure of statistical affinity came later. The martingale proof, which recognises the finite-coordinate likelihood ratios as a non-negative product martingale and reads the dichotomy off uniform integrability, is due to Joseph Doob and was given its now-standard textbook form by David Williams in Probability with Martingales (Cambridge, 1991), where the square-root lift to an $L^2$-bounded auxiliary martingale is the organising computation.

The Gaussian instance was settled independently by Jacob Feldman and Jaroslav Hájek in 1958, giving the Feldman-Hájek dichotomy: two Gaussian measures on a function space are equivalent or orthogonal, equivalence for shifts holding iff the mean difference lies in the Cameron-Martin space, the condition that the Hellinger computation reproduces. The statistical descendants are Lucien Le Cam's theory of contiguity and asymptotic equivalence of experiments, in which the product martingale becomes the likelihood-ratio process and the Hellinger product its limiting affinity. The mathematical content is that infinite-dimensional measure spaces are rigid in a way finite-dimensional ones are not: a single coordinate admits a full continuum of mutual relationships between two laws, but an infinite independent product collapses all of them onto the binary equivalence-or-singularity, because the deciding quantity is a tail functional and tails of independent sequences are deterministic.

Bibliography Master

@article{kakutani1948,
  author  = {Kakutani, Shizuo},
  title   = {On equivalence of infinite product measures},
  journal = {Annals of Mathematics},
  volume  = {49},
  number  = {1},
  pages   = {214--224},
  year    = {1948}
}

@article{hellinger1909,
  author  = {Hellinger, Ernst},
  title   = {Neue Begr\"undung der Theorie quadratischer Formen von unendlichvielen Ver\"anderlichen},
  journal = {Journal f\"ur die reine und angewandte Mathematik},
  volume  = {136},
  pages   = {210--271},
  year    = {1909}
}

@book{williams1991,
  author    = {Williams, David},
  title     = {Probability with Martingales},
  publisher = {Cambridge University Press},
  series    = {Cambridge Mathematical Textbooks},
  year      = {1991}
}

@book{durrett2019,
  author    = {Durrett, Rick},
  title     = {Probability: Theory and Examples},
  edition   = {5th},
  series    = {Cambridge Series in Statistical and Probabilistic Mathematics},
  publisher = {Cambridge University Press},
  year      = {2019}
}

@book{shiryaev1996,
  author    = {Shiryaev, Albert N.},
  title     = {Probability},
  edition   = {2nd},
  series    = {Graduate Texts in Mathematics},
  volume    = {95},
  publisher = {Springer},
  year      = {1996}
}

@article{feldman1958,
  author  = {Feldman, Jacob},
  title   = {Equivalence and perpendicularity of Gaussian processes},
  journal = {Pacific Journal of Mathematics},
  volume  = {8},
  pages   = {699--708},
  year    = {1958}
}

@book{lecam1986,
  author    = {Le Cam, Lucien},
  title     = {Asymptotic Methods in Statistical Decision Theory},
  series    = {Springer Series in Statistics},
  publisher = {Springer},
  year      = {1986}
}