Varadhan's Integral Lemma and the Laplace Principle
Anchor (Master): Dembo & Zeitouni 1998 *Large Deviations Techniques and Applications* 2nd ed. (Springer) §4.3-§4.4 (Varadhan, the Laplace principle, Bryc's theorem, the moment condition 4.3.2); Dupuis & Ellis 1997 *A Weak Convergence Approach to the Theory of Large Deviations* (Wiley) Ch. 1, §1.2 (the Laplace principle as primitive); Varadhan 1966 *Asymptotic probabilities and differential equations* (CPAM 19); Varadhan 1984 *Large Deviations and Applications* (SIAM CBMS-NSF 46) §2-§3
Intuition Beginner
Suppose you want to add up an enormous number of contributions, where each contribution is an exponential and is huge. The exponential is so steep that the single largest term swamps all the others put together. So instead of doing the whole sum, you can just hunt for the one point where the gain is largest and read off its value. This shortcut — replace a giant exponential sum or integral by its peak — is the Laplace method, and it is centuries old.
Now layer in randomness. The "points" are the possible values $x$ of a random average, and they are not all equally available: a value far from typical is itself exponentially rare, carrying a cost. From the previous units we already know that cost: it is the rate function. So when we weight an exponential reward by how often the random average actually visits $x$, two exponential forces compete: the reward pulling toward high-payoff points, and the cost (the rate function $I$) pulling back toward typical ones.
Varadhan's lemma is the clean statement of who wins. The total exponential integral grows at the rate set by the single best compromise point, the place where reward minus cost, $F(x) - I(x)$, is largest. Everything else is exponentially negligible. In one line: averaging an exponential reward over a large-deviation system is the same, on the exponential scale, as solving a tug-of-war between payoff and rarity.
A picture helps. Think of the reward as money offered at each location and the rate function as the admission price to stand there. You will not necessarily go where the money is highest, nor where it is cheapest to stand, but where your net take — money minus admission — is best. The growth rate of your total expected winnings is exactly that best net take.
This pairing of a reward against a cost, picking the best difference, is the same Legendre-Fenchel move met in the convex-duality unit. Varadhan's lemma is that move made into a limit theorem, and run backwards it lets you recover the cost function from the rewards — which is the Laplace principle and its inverse.
Visual Beginner
Figure: two curves over a horizontal axis of values $x$. The first curve is the reward $F(x)$, a gently varying bump. The second is the cost $I(x)$, the familiar valley dipping to zero at the typical value. Below them a third curve plots the difference $F(x) - I(x)$. A vertical marker sits at the peak of this difference curve (the winning compromise point), and a caption notes that the exponential integral grows at the rate equal to the height of that peak. An inset shows that if the reward is flat (constant), the peak of $F - I$ sits at the bottom of the valley, recovering the typical value.
[ASCII sketch over the axis of values x, top to bottom: F(x), the reward, a gentle bump; I(x), the cost, the rate-function valley (zero at the typical value); F(x)-I(x), the net = reward minus cost, with its peak at x_star = best compromise. Growth rate of the integral = height of this peak = max (F - I).]
Worked example Beginner
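Return to the fair-coin average of 37.07.01, whose cost is $I(x) = x\log x + (1-x)\log(1-x) + \log 2$ for a fraction $x$ between $0$ and $1$, with $I(1/2) = 0$. Offer a reward $F(x) = x$: you are paid the fraction of heads itself. Writing $\bar X_n$ for the fraction of heads in $n$ tosses, we ask at what rate the average payoff $\mathbb{E}\big[e^{n\bar X_n}\big]$ grows.
Step 1. Set up the tug-of-war. Varadhan's lemma says the growth rate is the largest value of $x - I(x)$ over $x$ in $[0,1]$. Reward rises to the right; cost rises as we leave $x = 1/2$. The best compromise is somewhere past one half.
Step 2. Tabulate the net $x - I(x)$. Using the values of $I(x)$ computed in the prerequisite unit (natural logarithms, rounded to three decimals):
| $x$ | $I(x)$ | net $x - I(x)$ |
|---|---|---|
| 0.50 | 0.000 | 0.500 |
| 0.60 | 0.020 | 0.580 |
| 0.70 | 0.082 | 0.618 |
| 0.73 | 0.110 | 0.620 |
| 0.80 | 0.193 | 0.607 |
| 0.90 | 0.368 | 0.532 |
| 1.00 | 0.693 | 0.307 |
Step 3. Read off the winner. The net take peaks near $x \approx 0.73$ at about $0.62$. So $\tfrac1n\log\mathbb{E}\big[e^{n\bar X_n}\big] \to 0.62$ as $n$ grows; the average exponential payoff grows like $e^{0.62\,n}$.
Step 4. Sanity check the two pulls. A pure greed strategy would sit at $x = 1$ (all heads), but $I(1) = \log 2 \approx 0.69$, giving net $1 - 0.69 = 0.31$, which is worse, because all-heads is far too rare. A pure caution strategy sits at $x = 1/2$, net $0.5 - 0 = 0.5$, which is also worse, because it leaves reward on the table. The optimum balances them.
What this tells us. The growth rate of an exponential average is neither the maximum reward nor the typical value, but the best net of reward minus rarity-cost. That single number, $\sup_x\big(F(x) - I(x)\big) \approx 0.62$, is the whole content of Varadhan's lemma in this example.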
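The tug-of-war in this example can be checked numerically. The sketch below is illustrative only: it assumes the fair-coin cost is the standard $I(x) = x\log x + (1-x)\log(1-x) + \log 2$ with natural logarithms and the reward is $F(x) = x$; the function names are hypothetical. It tabulates the net on a grid, locates the peak, and compares it with the scaled log-expectation for a finite number of tosses.

```python
import math

def rate_fair_coin(x: float) -> float:
    """Assumed cost I(x) = x log x + (1-x) log(1-x) + log 2 (natural log)."""
    def xlogx(t: float) -> float:
        return 0.0 if t == 0.0 else t * math.log(t)  # convention 0 log 0 = 0
    return xlogx(x) + xlogx(1.0 - x) + math.log(2.0)

def net(x: float) -> float:
    """Reward minus cost for the assumed reward F(x) = x."""
    return x - rate_fair_coin(x)

# Grid search for the best compromise point on [0, 1].
grid = [k / 1000 for k in range(1001)]
x_star = max(grid, key=net)
variational_value = net(x_star)

def scaled_log_payoff(n: int) -> float:
    """(1/n) log E[exp(n * fraction of heads)] for n fair tosses (stable log-sum-exp)."""
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        - n * math.log(2.0) + k  # log of C(n, k) * 2^(-n) * e^k
        for k in range(n + 1)
    ]
    m = max(log_terms)
    return (m + math.log(sum(math.exp(t - m) for t in log_terms))) / n

for x in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    print(f"x={x:.1f}  I(x)={rate_fair_coin(x):.4f}  net={net(x):.4f}")
print(f"peak near x={x_star:.3f} with net {variational_value:.4f}")
print(f"n=200 scaled log-payoff: {scaled_log_payoff(200):.4f}")
```

For this linear reward the expectation factorises over tosses, so the scaled log-payoff equals $\log\big((1+e)/2\big)$ for every $n$, and the grid maximum of the net should agree with that number to within the grid resolution.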
Check your understanding Beginner
Formal definition Intermediate+
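Throughout, $\mathcal{X}$ is a topological space with Borel $\sigma$-algebra, $(\mu_\varepsilon)_{\varepsilon>0}$ is a family of Borel probability measures on $\mathcal{X}$ satisfying the large deviation principle 37.07.01 at speed $1/a_\varepsilon$ (with $a_\varepsilon\to0$ as $\varepsilon\to0$) with a good rate function $I:\mathcal{X}\to[0,\infty]$. For a measurable $F:\mathcal{X}\to\mathbb{R}$ write the scaled cumulant of $F$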
$$
\Lambda_\varepsilon(F) := a_\varepsilon \log \int_{\mathcal{X}} e^{F(x)/a_\varepsilon}\,\mu_\varepsilon(dx),
$$
the object whose behaviour the theory describes.
Definition (the moment / tail condition). A continuous $F:\mathcal{X}\to\mathbb{R}$ satisfies the Varadhan moment condition if for some $\gamma>1$ $$ \limsup_{\varepsilon\to0} a_\varepsilon \log \int_{\mathcal{X}} e^{\gamma F(x)/a_\varepsilon}\,\mu_\varepsilon(dx) \;<\; +\infty. $$ This is the large-deviation surrogate for uniform integrability: it forbids the integral from being dominated by mass on regions where $F$ is large but the LDP control is weak [Dembo & Zeitouni §4.3]. When $F$ is bounded above the condition holds automatically, because then the $\limsup$ is at most $\gamma\sup_x F(x)<\infty$.
Definition (the Laplace functional and the variational value). For continuous $F$ define the Laplace functional $\Lambda_\varepsilon(F)$ as above and the variational value
$$
\Lambda(F) := \sup_{x\in\mathcal{X}}\big(F(x) - I(x)\big),
$$
the Legendre-type pairing of the gain against the cost already met in 37.07.03. Goodness of $I$ guarantees that, for $F$ bounded above, the supremum is attained on the compact sublevel sets of $I$.
Definition (the Laplace principle). The family $(\mu_\varepsilon)$ satisfies the Laplace principle at speed $1/a_\varepsilon$ with rate function $I$ if for every bounded continuous $F:\mathcal{X}\to\mathbb{R}$ $$ \lim_{\varepsilon\to0} a_\varepsilon\log\int_{\mathcal{X}} e^{F(x)/a_\varepsilon}\,\mu_\varepsilon(dx) \;=\; \sup_{x\in\mathcal{X}}\big(F(x) - I(x)\big). $$ Equivalently, the Laplace principle is the conjunction of a Laplace upper bound ($\limsup_{\varepsilon\to0}\Lambda_\varepsilon(F)\le\Lambda(F)$ for all $F\in C_b(\mathcal{X})$) and a Laplace lower bound ($\liminf_{\varepsilon\to0}\Lambda_\varepsilon(F)\ge\Lambda(F)$ for all $F\in C_b(\mathcal{X})$). The two bounds mirror, on the integral side, the closed-set upper bound and open-set lower bound of the LDP.
Varadhan's integral lemma is the assertion that the LDP implies the Laplace limit for every continuous $F$ meeting the moment condition; the converse implication, that the Laplace principle implies the LDP (under exponential tightness), is Bryc's inverse lemma. The two together say the LDP and the Laplace principle are interchangeable descriptions of the same asymptotic data.
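A quick numerical illustration of the Laplace limit, under assumptions that are not from the text: take $\mu_\varepsilon = N(0,\varepsilon)$ on $\mathbb{R}$ with $a_\varepsilon=\varepsilon$, for which the rate function is $I(x)=x^2/2$, and the hypothetical bounded continuous gain $F(x)=2\tanh x$. The sketch evaluates $\varepsilon\log\int e^{F/\varepsilon}\,d\mu_\varepsilon$ by a Riemann sum in log-space and compares it with a grid approximation of $\sup_x(F(x)-I(x))$.

```python
import math

def log_sum_exp(values):
    """Stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def gain(x):
    """Hypothetical bounded continuous test function F."""
    return 2.0 * math.tanh(x)

def rate(x):
    """Rate function of the assumed family N(0, eps) at speed 1/eps."""
    return 0.5 * x * x

# Integration grid on [-8, 8]; the integrand is negligible outside this window.
STEP = 16.0 / 40000
GRID = [-8.0 + STEP * k for k in range(40001)]

def scaled_log_integral(eps):
    """eps * log of the integral of exp(F/eps) against the N(0, eps) density."""
    log_terms = [(gain(x) - rate(x)) / eps for x in GRID]
    log_integral = (log_sum_exp(log_terms) + math.log(STEP)
                    - 0.5 * math.log(2.0 * math.pi * eps))
    return eps * log_integral

variational_value = max(gain(x) - rate(x) for x in GRID)

for eps in (1.0, 0.1, 0.01, 0.001):
    print(f"eps={eps:<6} scaled log-integral={scaled_log_integral(eps):.5f}")
print(f"sup (F - I) on the grid = {variational_value:.5f}")
```

The gap between the two quantities is of order $\varepsilon$, since the Gaussian prefactor is subexponential and disappears after multiplying the logarithm by $\varepsilon$.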
Counterexamples to common slips
- The moment condition is not removable for unbounded . On let with and ; this has the weak rate , for . For the atom at contributes , which blows the integral up far past . The moment condition fails here (no controls the integral), and so does Varadhan's conclusion.
- Continuity of $F$ is used, not just measurability. Both LDP bounds are stated through interiors and closures; the lower bound needs the sets $\{x : F(x) > c\}$ open and the upper bound approximates $F$ by simple functions on closed level sets. A discontinuous $F$ with a jump across the minimiser of $I$ can make the $\liminf$ and $\limsup$ disagree, so the variational identity can fail at exactly the optimising point.
- $\Lambda(F)$ is a supremum, not the reward at the cost-minimiser. Evaluating at the typical point $\bar x$ (where $I(\bar x)=0$) gives only a lower bound $\Lambda(F)\ge F(\bar x)$; the true value can be strictly larger because a costlier point may carry a much larger reward. Confusing "where the system usually is" with "where the integral concentrates" is the central error the lemma corrects.
Key theorem with proof Intermediate+
We prove Varadhan's integral lemma in the form most used in practice and isolate the two halves, since they have different hypotheses: the lower bound needs only the LDP lower bound and continuity, while the upper bound needs goodness and the moment condition.
Theorem (Varadhan's integral lemma). Let $(\mu_\varepsilon)$ satisfy the LDP at speed $1/a_\varepsilon$ with good rate function $I$, and let $F:\mathcal{X}\to\mathbb{R}$ be continuous and satisfy the moment condition for some $\gamma>1$. Then $$ \lim_{\varepsilon\to0} a_\varepsilon\log\int_{\mathcal{X}} e^{F(x)/a_\varepsilon}\,\mu_\varepsilon(dx) \;=\; \sup_{x\in\mathcal{X}}\big(F(x) - I(x)\big). $$
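Proof of the lower bound $\liminf_{\varepsilon\to0}\Lambda_\varepsilon(F)\ge\Lambda(F)$. Fix $x_0\in\mathcal{X}$ with $I(x_0)<\infty$ and $\delta>0$; it suffices to show $\liminf_\varepsilon\Lambda_\varepsilon(F)\ge F(x_0)-I(x_0)$, then take the supremum over $x_0$. By continuity of $F$, there is an open neighbourhood $G\ni x_0$ with $F>F(x_0)-\delta$ on $G$. Restricting the integral to $G$ and using positivity of the integrand, $$ \int e^{F/a_\varepsilon}\,d\mu_\varepsilon \;\ge\; \int_G e^{F/a_\varepsilon}\,d\mu_\varepsilon \;\ge\; e^{(F(x_0)-\delta)/a_\varepsilon}\,\mu_\varepsilon(G). $$ Taking $a_\varepsilon\log$ and using the LDP open-set lower bound $\liminf_\varepsilon a_\varepsilon\log\mu_\varepsilon(G)\ge-\inf_G I\ge -I(x_0)$, $$ \liminf_\varepsilon \Lambda_\varepsilon(F) \;\ge\; (F(x_0)-\delta) - I(x_0). $$ Letting $\delta\downarrow0$ gives $\liminf_\varepsilon\Lambda_\varepsilon(F)\ge F(x_0)-I(x_0)$, and the supremum over $x_0$ completes the lower bound. (No moment condition or goodness was used.)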
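Proof of the upper bound $\limsup_{\varepsilon\to0}\Lambda_\varepsilon(F)\le\Lambda(F)$. Write $V:=\Lambda(F)=\sup_x\big(F(x)-I(x)\big)$, finite because the lower bound already gives $V\le\liminf_\varepsilon\Lambda_\varepsilon(F)$ and the moment condition bounds the $\limsup$. We first treat $F$ bounded above, then remove the bound by the moment condition.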
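Step 1 ($F$ bounded above). Suppose $F\le M<\infty$. Fix $\eta>0$ and $L<\infty$. For each $x$, $F(x)-I(x)\le V$, so $F(x)\le V+I(x)$. By upper semicontinuity of $F$, lower semicontinuity of $I$, and regularity of $\mathcal{X}$, each point $x$ with $I(x)<\infty$ has an open neighbourhood $G_x$ whose closure satisfies $\sup_{\overline{G_x}}F\le F(x)+\eta$ and $\inf_{\overline{G_x}}I\ge I(x)-\eta$. The sublevel set $K_L:=\{I\le L\}$ is compact (goodness); cover it by finitely many such sets $G_{x_1},\dots,G_{x_N}$. Outside $\bigcup_j G_{x_j}$ the rate function exceeds $L$, so by the closed-set upper bound and $F\le M$ that region contributes at most $M-L$ on the exponential scale. On each $\overline{G_{x_j}}$, $$ \limsup_\varepsilon a_\varepsilon\log\int_{\overline{G_{x_j}}}e^{F/a_\varepsilon}\,d\mu_\varepsilon \le \sup_{\overline{G_{x_j}}}F + \limsup_\varepsilon a_\varepsilon\log\mu_\varepsilon(\overline{G_{x_j}}) \le (F(x_j)+\eta) - \inf_{\overline{G_{x_j}}}I \le (F(x_j)+\eta)-(I(x_j)-\eta)\le V+2\eta. $$ Combining the finitely many pieces by the largest-term rule (the $\limsup_\varepsilon a_\varepsilon\log$ of a finite sum is the max of the pieces, since $a_\varepsilon\log(N+1)\to0$), $\limsup_\varepsilon\Lambda_\varepsilon(F)\le\max\{V+2\eta,\ M-L\}$. Let $L\to\infty$ and then $\eta\downarrow0$.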
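Step 2 (remove the bound, using the moment condition). For general $F$ apply Step 1 to $F_M:=F\wedge M$, which is bounded above and still continuous, giving $\limsup_\varepsilon\Lambda_\varepsilon(F_M)\le\sup_x\big(F_M(x)-I(x)\big)\le V$. On the set $\{F>M\}$ Hölder/Chebyshev with the moment exponent $\gamma$ controls the tail: writing $C_\gamma:=\limsup_\varepsilon a_\varepsilon\log\int e^{\gamma F/a_\varepsilon}\,d\mu_\varepsilon<\infty$, the inequality $e^{F/a_\varepsilon}\le e^{-(\gamma-1)M/a_\varepsilon}\,e^{\gamma F/a_\varepsilon}$ (since on $\{F>M\}$, $e^{-(\gamma-1)F/a_\varepsilon}\le e^{-(\gamma-1)M/a_\varepsilon}$) gives $\limsup_\varepsilon a_\varepsilon\log\int_{\{F>M\}}e^{F/a_\varepsilon}\,d\mu_\varepsilon\le C_\gamma-(\gamma-1)M$. Splitting $\int e^{F/a_\varepsilon}\,d\mu_\varepsilon\le\int e^{F_M/a_\varepsilon}\,d\mu_\varepsilon+\int_{\{F>M\}}e^{F/a_\varepsilon}\,d\mu_\varepsilon$ and applying the largest-term rule, $\limsup_\varepsilon\Lambda_\varepsilon(F)\le\max\{V,\ C_\gamma-(\gamma-1)M\}$. Taking $M\to\infty$ drives the tail term to $-\infty$, leaving $\limsup_\varepsilon\Lambda_\varepsilon(F)\le V$. With the lower bound, the limit equals $V=\Lambda(F)$.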
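The largest-term rule used twice in the proof can be seen in a few lines. The sketch below is a minimal illustration with hypothetical exponents $c_i$: it evaluates $a\log\sum_i e^{c_i/a}$ stably and shows it is squeezed between $\max_i c_i$ and $\max_i c_i + a\log N$, so it converges to the maximum as $a\to0$.

```python
import math

def scaled_log_sum(exponents, a):
    """Return a * log(sum_i exp(c_i / a)), computed stably by factoring out the max."""
    m = max(exponents)
    return m + a * math.log(sum(math.exp((c - m) / a) for c in exponents))

# Hypothetical exponential rates of three pieces of an integral.
pieces = [0.30, -1.00, 0.25]

for a in (1.0, 0.1, 0.01, 0.001):
    value = scaled_log_sum(pieces, a)
    upper = max(pieces) + a * math.log(len(pieces))
    print(f"a={a:<6} value={value:.6f}  max={max(pieces):.2f}  max + a*log N={upper:.6f}")
```

The two-sided bound holds for every $a>0$, which is why a finite cover (Step 1) or a two-way split (Step 2) costs nothing on the exponential scale.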
Bridge. This theorem builds toward every concrete evaluation of exponential asymptotics in the chapter and appears again in Bryc's inverse, Sanov-type integral computations, and the Freidlin-Wentzell exit-cost formulas. This is exactly the rigorous Laplace method on the LDP scale: the integral concentrates at the point realising $\sup_x\big(F(x)-I(x)\big)$, the central insight being that the rate function plays the role of the phase in the classical Laplace/saddle-point integral $\int e^{n\phi(x)}\,dx$, while the LDP lower and upper bounds supply the two-sided pinch. The split into a continuity-only lower bound and a goodness-plus-moment upper bound is exactly the open/closed asymmetry of 37.07.01 transported to integrals. Putting these together, the variational value is dual to the Legendre-Fenchel pairing of 37.07.03: with $F(x)=\langle\lambda,x\rangle$ linear, $\Lambda(F)=\sup_x\big(\langle\lambda,x\rangle-I(x)\big)=I^*(\lambda)$, so Varadhan's lemma generalises the cumulant-conjugate identity from linear tilts to arbitrary continuous gains.
Exercises Intermediate+
Advanced results Master
Bryc's inverse lemma: from the Laplace principle to the LDP
The implication of the Key theorem reverses. Bryc's lemma [Bryc 1990] states: if $(\mu_\varepsilon)$ is exponentially tight and for every $F\in C_b(\mathcal{X})$ the limit
$$
\Lambda(F) := \lim_{\varepsilon\to0} a_\varepsilon\log\int e^{F/a_\varepsilon}\,d\mu_\varepsilon
$$
exists, then $(\mu_\varepsilon)$ satisfies the LDP with the good rate function
$$
I(x) = \sup_{F\in C_b(\mathcal{X})}\big(F(x) - \Lambda(F)\big),
$$
the Legendre-Fenchel transform of the functional $\Lambda$ over the Banach space $C_b(\mathcal{X})$. The proof mirrors Exercises 3 and 8: the lower bound follows by feeding indicator-approximating test functions into the assumed Laplace limit, and the upper bound by feeding compact-supported bumps; exponential tightness reduces closed to compact. Thus the LDP, the Laplace principle, and the existence of all bounded-continuous exponential-integral limits are three packagings of one datum, and the rate function is recovered as a conjugate: the abstract, function-space form of the cumulant-conjugate identity of 37.07.03.
The Laplace principle as a primitive: the Dupuis-Ellis programme
Dupuis and Ellis [Dupuis & Ellis 1997] invert the logical order, defining large deviations through the Laplace principle and deriving the LDP as a consequence. The payoff is a variational representation of the prelimit functional itself: for many models one has the exact identity
$$
-a_\varepsilon\log\int e^{-F/a_\varepsilon}\,d\mu_\varepsilon = \inf_{\nu}\Big(\mathbb{E}_\nu[F] + a_\varepsilon\, H(\nu\,\|\,\mu_\varepsilon)\Big),
$$
the Donsker-Varadhan/Gibbs variational formula with relative entropy $H(\nu\,\|\,\mu_\varepsilon)$ 37.07.06. Passing to the limit, the entropy term becomes the rate function $I$ and the infimum becomes $\inf_x\big(F(x)+I(x)\big)$, so the weak-convergence analysis of the controlled representation yields the Laplace limit and hence the LDP. This route makes Varadhan's lemma not a corollary but the organising definition, and turns large-deviation proofs into stochastic-control problems.
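On a finite state space the Gibbs variational formula can be verified directly. The sketch below is illustrative, with hypothetical values for $F$, the reference measure $\mu$, and the scale $a$: it checks that the infimum of $\mathbb{E}_\nu[F]+a\,H(\nu\,\|\,\mu)$ is attained at the Gibbs measure $\nu^*\propto\mu\,e^{-F/a}$ and equals $-a\log\sum_i\mu_i e^{-F_i/a}$, and that randomly drawn $\nu$ never do better.

```python
import math
import random

def free_energy(F, mu, a):
    """Left-hand side: -a * log sum_i mu_i exp(-F_i / a)."""
    return -a * math.log(sum(m * math.exp(-f / a) for f, m in zip(F, mu)))

def relative_entropy(nu, mu):
    """H(nu || mu) on a finite space, with the convention 0 log 0 = 0."""
    return sum(p * math.log(p / q) for p, q in zip(nu, mu) if p > 0.0)

def objective(nu, F, mu, a):
    """Right-hand side functional: E_nu[F] + a * H(nu || mu)."""
    return sum(p * f for p, f in zip(nu, F)) + a * relative_entropy(nu, mu)

def gibbs_measure(F, mu, a):
    """Candidate minimiser nu* proportional to mu_i * exp(-F_i / a)."""
    weights = [m * math.exp(-f / a) for f, m in zip(F, mu)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data on a four-point space.
F = [0.0, 1.0, 0.3, 2.0]
mu = [0.4, 0.3, 0.2, 0.1]
a = 0.5

lhs = free_energy(F, mu, a)
rhs_at_gibbs = objective(gibbs_measure(F, mu, a), F, mu, a)

rng = random.Random(0)
random_values = []
for _ in range(1000):
    raw = [rng.random() + 1e-9 for _ in F]
    nu = [r / sum(raw) for r in raw]
    random_values.append(objective(nu, F, mu, a))

print(lhs, rhs_at_gibbs, min(random_values))
```

The equality at the Gibbs measure is exact; the random trials only illustrate the inequality and are no substitute for the convexity argument behind it.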
The classical Laplace and saddle-point methods recovered
Varadhan's lemma is the probabilistic lift of Laplace's 1782 asymptotic method [Laplace 1782]. For a deterministic integral $\int e^{n\phi(x)}\,dx$ with $\phi$ attaining a unique non-degenerate maximum at $x^*$, Laplace's method gives $\int e^{n\phi(x)}\,dx\sim e^{n\phi(x^*)}\sqrt{2\pi/(n\,|\phi''(x^*)|)}$. Identifying the reference family $\mu_\varepsilon$ with normalised Lebesgue measure (or any family whose LDP rate function is $0$ on the integration domain), the variational value $\sup_x\phi(x)=\phi(x^*)$ reproduces the leading-order Laplace exponent; the subexponential Gaussian prefactor lives below the LDP scale and is invisible to $a_\varepsilon\log$. The complex saddle-point/steepest-descent method is the analytic continuation of the same principle. Varadhan's lemma is thus the statement that the Laplace exponent survives the introduction of a genuine cost: the phase $\phi$ is replaced by the net $F-I$.
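The invisibility of the Gaussian prefactor can be checked numerically. In this sketch the phase $\phi(x)=x(1-x)$ on $[0,1]$ is a hypothetical choice with maximum $\phi(1/2)=1/4$ and $\phi''=-2$; the code compares $\tfrac1n\log\int_0^1 e^{n\phi(x)}\,dx$, computed by a Riemann sum, with the Laplace exponent and with the exponent plus the prefactor correction $\tfrac{1}{2n}\log(\pi/n)$.

```python
import math

def phi(x):
    """Hypothetical phase with a unique non-degenerate maximum at x = 1/2."""
    return x * (1.0 - x)

def scaled_log_integral(n, points=20001):
    """(1/n) * log of a Riemann sum for the integral of exp(n * phi) over [0, 1]."""
    h = 1.0 / (points - 1)
    exponents = [n * phi(k * h) for k in range(points)]
    m = max(exponents)
    return (m + math.log(h * sum(math.exp(e - m) for e in exponents))) / n

def laplace_prediction(n):
    """Exponent phi(x*) = 1/4 plus the scaled log of sqrt(2*pi / (n * |phi''|))."""
    return 0.25 + 0.5 * math.log(math.pi / n) / n

for n in (10, 100, 1000, 4000):
    print(f"n={n:<5} numeric={scaled_log_integral(n):.6f} "
          f"laplace={laplace_prediction(n):.6f}  exponent only=0.250000")
```

The prefactor shifts the scaled logarithm by a term of order $(\log n)/n$, which vanishes in the limit and leaves only the exponent.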
The dominated-convergence analogy
Varadhan's lemma is the large-deviation analogue of the dominated convergence theorem, with the scaled log-integral $\Lambda_\varepsilon(F)$ replacing the integral, the variational value $\Lambda(F)$ replacing the limit, and the moment condition replacing the dominating integrable envelope. The lower bound is a Fatou-type estimate (it survives without domination, paralleling the moment-condition-free lower bound of the Key theorem), while the upper bound needs the moment condition exactly as the dominated convergence upper passage needs an integrable dominator. The semiring $(\mathbb{R}\cup\{-\infty\},\max,+)$, the "max-plus" or tropical algebra, is the limiting arithmetic: $a_\varepsilon\log$ sends $e^{x/a_\varepsilon}+e^{y/a_\varepsilon}$ to $\max(x,y)$ in the limit, sums become maxima, products become sums, and integration becomes supremum. In this idempotent measure theory the rate function is a "tropical density" and Varadhan's lemma is the change-of-variables/integration identity.
Synthesis. The central insight is that the LDP and the Laplace principle are equivalent, with Varadhan's lemma supplying one direction and Bryc's inverse the other, and this is exactly the max-plus shadow of the ordinary integral: under $a_\varepsilon\log$ the arithmetic $(+,\times)$ degenerates to $(\max,+)$, so summation becomes maximisation and the rate function becomes a tropical density. The foundational reason the variational value pairs $F$ against $I$ by subtraction is the Legendre-Fenchel duality of 37.07.03: for linear $F=\langle\lambda,\cdot\rangle$ the value is $I^*(\lambda)$, and Varadhan's lemma generalises that cumulant-conjugate identity from linear tilts to all continuous gains, while Bryc's inverse is dual to it by recovering $I$ as the conjugate of $\Lambda$ over $C_b(\mathcal{X})$. Putting these together with the tilted-measure corollary (Exercise 7), the lemma both evaluates exponential integrals and relocates concentration to the maximisers of $F-I$, appears again in the Gibbs/Donsker-Varadhan entropy representation 37.07.06 and the Freidlin-Wentzell exit theory, and builds toward the weak-convergence (Dupuis-Ellis) reformulation in which the Laplace principle is the definition. The bridge is the equivalence itself: the rate function of 37.07.01, the conjugate of 37.07.03, and the limiting Laplace functional are one object viewed three ways.
Full proof set Master
Proposition 1 (the moment condition is automatic for bounded above). If $F$ is continuous and bounded above by $M$, then the Varadhan moment condition holds for every $\gamma>1$, and consequently Varadhan's limit holds for $F$.
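Proof. For any $\gamma>1$, $e^{\gamma F/a_\varepsilon}\le e^{\gamma M/a_\varepsilon}$, so $a_\varepsilon\log\int e^{\gamma F/a_\varepsilon}\,d\mu_\varepsilon\le\gamma M$ for every $\varepsilon$, and the $\limsup$ is at most $\gamma M<\infty$. Choosing any $\gamma>1$ satisfies the moment condition, and the Key theorem applies. (Step 2 of the Key theorem is then vacuous: Step 1 already concludes $\limsup_\varepsilon\Lambda_\varepsilon(F)\le V$ for bounded-above $F$.)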
Proposition 2 (Varadhan's lemma extends the cumulant-conjugate identity). Let $(\mu_\varepsilon)$ on $\mathbb{R}^d$ satisfy the LDP with good rate $I$. For every $\lambda\in\mathbb{R}^d$ such that $x\mapsto\langle\lambda,x\rangle$ meets the moment condition, $$ \lim_{\varepsilon\to0} a_\varepsilon\log\int e^{\langle\lambda,x\rangle/a_\varepsilon}\,\mu_\varepsilon(dx) = I^*(\lambda), $$ the Legendre-Fenchel conjugate of $I$. In particular, if $I=\Lambda^*$ for a closed proper convex $\Lambda$, the limit is $\Lambda^{**}=\Lambda$.
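Proof. Apply the Key theorem to the continuous $F(x)=\langle\lambda,x\rangle$: the limit equals $\sup_x\big(\langle\lambda,x\rangle-I(x)\big)$, which is $I^*(\lambda)$ by definition 37.07.03. If $I=\Lambda^*$ with $\Lambda$ closed proper convex, then $I^*=\Lambda^{**}=\Lambda$ by the Fenchel-Moreau biconjugation theorem of 37.07.03. Thus the scaled cumulant generating function of $\mu_\varepsilon$ converges to $\Lambda$, the converse direction of the Gärtner-Ellis hypothesis.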
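Both halves of Proposition 2 can be sanity-checked on a grid in one dimension. The sketch assumes the fair-coin pair, which is an illustrative choice and not part of the proposition: $\Lambda(\lambda)=\log\big((1+e^{\lambda})/2\big)$ and $I(x)=x\log x+(1-x)\log(1-x)+\log 2$. A discrete Legendre-Fenchel transform (the helper `conjugate` is hypothetical) should send $I$ to $\Lambda$ and $\Lambda$ back to $I$.

```python
import math

def cumulant(lam):
    """Assumed limiting cumulant of the fair-coin average: log((1 + e^lam) / 2)."""
    return math.log((1.0 + math.exp(lam)) / 2.0)

def rate(x):
    """Assumed fair-coin rate function on [0, 1], natural logarithm."""
    def xlogx(t):
        return 0.0 if t == 0.0 else t * math.log(t)
    return xlogx(x) + xlogx(1.0 - x) + math.log(2.0)

def conjugate(f, grid):
    """Discrete Legendre-Fenchel transform: y -> max over the grid of (y*t - f(t))."""
    return lambda y: max(y * t - f(t) for t in grid)

X_GRID = [k / 2000 for k in range(2001)]              # domain of the rate function
LAM_GRID = [-20.0 + k / 1000 for k in range(40001)]   # truncated range of tilts

rate_conjugate = conjugate(rate, X_GRID)              # approximates I*
cumulant_conjugate = conjugate(cumulant, LAM_GRID)    # approximates Lambda*

for lam in (-1.0, 0.0, 1.0, 2.0):
    print(f"lam={lam:+.1f}  I*(lam)={rate_conjugate(lam):.6f}  Lambda(lam)={cumulant(lam):.6f}")
for x in (0.3, 0.5, 0.7):
    print(f"x={x:.1f}  Lambda*(x)={cumulant_conjugate(x):.6f}  I(x)={rate(x):.6f}")
```

The truncation of the tilt grid to $[-20,20]$ is harmless for $x$ away from the endpoints $0$ and $1$, where the optimal tilt $\log\big(x/(1-x)\big)$ is moderate.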
Proposition 3 (uniqueness of the rate function via the Laplace functional). Suppose $(\mu_\varepsilon)$ satisfies the Laplace principle with two good rate functions $I_1$ and $I_2$. Then $I_1=I_2$.
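Proof. For every $F\in C_b(\mathcal{X})$ both rate functions give the same Laplace limit, so $\sup_x\big(F(x)-I_1(x)\big)=\sup_x\big(F(x)-I_2(x)\big)$. Fix $x_0\in\mathcal{X}$. By goodness and lsc choose, for each $k$, a bounded continuous $F_k$ with $F_k(x_0)=0$ and $F_k=-k$ off a shrinking neighbourhood $U_k$ of $x_0$, with $-k\le F_k\le0$ everywhere. Then $\sup_x\big(F_k-I_1\big)\to-I_1(x_0)$ as $k\to\infty$ (the supremum is realised near $x_0$ once the far region is suppressed below the value at $x_0$), and likewise $\sup_x\big(F_k-I_2\big)\to-I_2(x_0)$. Equality of the two suprema for every $k$ forces $-I_1(x_0)=-I_2(x_0)$, hence $I_1(x_0)=I_2(x_0)$. As $x_0$ was arbitrary, $I_1=I_2$. This is the integral-side counterpart of the LDP uniqueness theorem of 37.07.01, now driven by separating points with bounded continuous functions.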
Connections Master
- Varadhan's integral lemma promotes the large deviation principle of 37.07.01 from a statement about probabilities of sets to a statement about exponential integrals: under that unit's good-rate-function LDP and a moment condition, the asymptotics of $\int e^{F/a_\varepsilon}\,d\mu_\varepsilon$ are governed by $\sup_x\big(F(x)-I(x)\big)$, with the continuity-only lower bound and the goodness-plus-moment upper bound transporting that unit's open/closed asymmetry to integrals.
- The variational value is the Legendre-Fenchel pairing of 37.07.03: for a linear gain it is exactly the conjugate $I^*(\lambda)$, so Varadhan's lemma generalises the cumulant-generating-function-to-rate-function duality from linear exponential tilts to arbitrary continuous test functions, and Bryc's inverse recovers $I$ as the conjugate of the limiting Laplace functional over $C_b(\mathcal{X})$.
- The Dupuis-Ellis variational representation expresses the prelimit Laplace functional through relative entropy, linking this unit to the Donsker-Varadhan formula and entropic rate function of 37.07.06: the limit of $\inf_\nu\big(\mathbb{E}_\nu[F]+a_\varepsilon H(\nu\,\|\,\mu_\varepsilon)\big)$ is $\inf_x\big(F(x)+I(x)\big)$, the Laplace principle read through stochastic control.
Historical & philosophical context Master
The asymptotic evaluation of integrals dominated by their largest integrand value is due to Pierre-Simon Laplace, who in 1782 [Laplace 1782] developed the method of approximating integrals of the form $\int e^{n\phi(x)}\,dx$ for large $n$ by expansion about the maximum of $\phi$, in the course of his work on probability and celestial mechanics; the complex-variable refinement is the method of steepest descent associated with Riemann and Debye. The probabilistic generalisation, that the same concentration governs exponential integrals against a family of measures obeying a large deviation principle, with the rate function entering as a competing cost, was proved by S. R. S. Varadhan in 1966 [Varadhan 1966] alongside his abstract formulation of the LDP, and systematised in his 1984 lectures [Varadhan 1984].
The inverse direction, recovering the LDP from the convergence of exponential integrals of bounded continuous functions, was established by Włodzimierz Bryc [Bryc 1990] using exponential tightness, and the equivalence was elevated to a definitional standpoint by Paul Dupuis and Richard Ellis [Dupuis & Ellis 1997], who built large-deviation theory on the Laplace principle and a weak-convergence analysis of entropy-penalised control representations. The standard reference treatment is Dembo and Zeitouni [Dembo & Zeitouni §4.3]. The max-plus reading, in which $a_\varepsilon\log$ degenerates integration to optimisation, connects the lemma to idempotent analysis as developed by Maslov and collaborators, where rate functions are densities for an idempotent measure theory.
Bibliography Master
@article{varadhan1966asymptotic,
author = {Varadhan, S. R. S.},
title = {Asymptotic probabilities and differential equations},
journal = {Communications on Pure and Applied Mathematics},
volume = {19},
pages = {261--286},
year = {1966}
}
@book{varadhan1984large,
author = {Varadhan, S. R. S.},
title = {Large Deviations and Applications},
series = {CBMS-NSF Regional Conference Series in Applied Mathematics},
number = {46},
publisher = {SIAM},
year = {1984}
}
@incollection{bryc1990large,
author = {Bryc, W{\l}odzimierz},
title = {On the large deviation principle by the asymptotic value method},
booktitle = {Diffusion Processes and Related Problems in Analysis, Volume I},
series = {Progress in Probability},
number = {22},
publisher = {Birkh\"auser},
pages = {447--472},
year = {1990}
}
@book{dupuisellis1997weak,
author = {Dupuis, Paul and Ellis, Richard S.},
title = {A Weak Convergence Approach to the Theory of Large Deviations},
series = {Wiley Series in Probability and Statistics},
publisher = {Wiley},
year = {1997}
}
@book{dembozeitouni1998ldp,
author = {Dembo, Amir and Zeitouni, Ofer},
title = {Large Deviations Techniques and Applications},
edition = {2nd},
series = {Applications of Mathematics},
number = {38},
publisher = {Springer},
year = {1998}
}
@book{denhollander2000large,
author = {den Hollander, Frank},
title = {Large Deviations},
series = {Fields Institute Monographs},
number = {14},
publisher = {American Mathematical Society},
year = {2000}
}
@incollection{laplace1782memoire,
author = {Laplace, Pierre-Simon},
title = {M\'emoire sur les approximations des formules qui sont fonctions de tr\`es-grands nombres},
booktitle = {M\'emoires de l'Acad\'emie Royale des Sciences de Paris},
year = {1782}
}