02.07.08 · analysis / measure-theory

Absolute Continuity and the Radon-Nikodym Theorem

shipped · 3 tiers · Lean: partial

Anchor (Master): Bogachev, Measure Theory Vol. 1 §3.2; Rudin, Real and Complex Analysis 3e §6-7 (von Neumann Hilbert-space proof)

Intuition Beginner

A measure tells you how much "stuff" sits in each region of a space: Lebesgue measure on the real line tells you the length of an interval, counting measure tells you how many points it contains, and a probability measure tells you the chance of an outcome falling in that region. Suppose you have two measures on the same space and you ask: can the second measure be written as the first measure weighted by some density function? If the answer is yes, that density is what the Radon-Nikodym theorem produces, and the equation it satisfies is the foundational identity behind probability densities, mass distributions, and conditional expectation.

The picture to keep in mind is mass spread along a wire. The "background" measure is uniform length along the wire, and the "target" measure assigns each piece of the wire a different mass depending on how thick the wire is at that point. The local thickness is the density of mass relative to length, and it lets you recover the mass of any segment by integrating the thickness over the length of that segment. The Radon-Nikodym derivative is the abstract version of this thickness function: it is the density of one measure with respect to another.

The condition that makes the procedure work is called absolute continuity. The target measure must not place any mass on a region that the background measure says has length zero. If the target ever puts mass on a length-zero set, no density function defined pointwise can capture that mass — the integral of any density over a length-zero set is zero. So the assignment of a density is possible exactly when the target is absolutely continuous with respect to the background.

When the target measure does place mass on length-zero regions, the Lebesgue decomposition theorem says you can still split the target into two parts: an absolutely continuous part that has a density, plus a singular part concentrated entirely on length-zero pieces. The classic example is a probability distribution that has both a continuous bell curve and a delta spike at one specific point: the bell curve part has a density, and the delta spike is the singular part.

The one-sentence takeaway: when a target measure assigns no mass to background-null regions, the Radon-Nikodym theorem says the target can be written as the background measure weighted by a unique density function, and this density is the abstract version of a probability density or mass distribution.

Visual Beginner

Imagine two ways of measuring the same wire. The first method assigns each piece its length (the background). The second method weighs each piece of the wire to get its mass (the target). If the wire is thicker in some places and thinner in others, the mass per unit length varies along the wire.

Top: length measure on the wire is uniform. Bottom: mass measure depends on local thickness, captured by the density curve. The mass of any segment is the area under the density curve over that segment.

Mathematically, the mass of a segment equals the integral of the density (which is the Radon-Nikodym derivative of mass with respect to length) over that segment. The condition for this picture to be possible is absolute continuity: no piece of zero length carries any mass.

Worked example Beginner

We compute a Radon-Nikodym density between two probability measures and confirm it acts as a density.

Step 1. The setup. Let the background measure be the standard normal distribution on the real line, with density against Lebesgue measure. Let the target measure be a different normal distribution with mean and the same variance, with density against Lebesgue measure.

Step 2. The density of target against background. Since both measures have strictly positive densities against Lebesgue measure, the density of the target against the background is the ratio of the two density functions: the target density divided by the background density.

Step 3. Verify by integrating an indicator. Consider the interval . The mass the background gives to this interval is the area under the standard-normal bell curve from to , roughly . The mass the target gives is the area under the shifted bell curve from to , roughly as well (by translation symmetry).

According to the Radon-Nikodym formula, the target mass should equal the area under the product of the density of target against background times the background density, taken over the interval: the area under from to . Plugging in gives the area under from to . The exponent simplifies via , so the integrand becomes , and the resulting area is indeed the target mass.
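The ratio in Step 2 and the simplification in Step 3 can be written out for a generic shift. The sketch below assumes the target is a unit-variance normal with an unspecified mean $m$ and checks an arbitrary interval $[a,b]$; the symbols $m$, $a$, $b$, $p$, $q$ are introduced here for illustration only.

```latex
% Background density p (standard normal); target density q (normal with assumed mean m, variance 1).
p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad
q(x) = \frac{1}{\sqrt{2\pi}}\, e^{-(x-m)^2/2}.

% Density of target against background: the ratio q/p.
\frac{q(x)}{p(x)} = \exp\!\Big(\tfrac{x^2}{2} - \tfrac{(x-m)^2}{2}\Big)
                  = \exp\!\Big(m x - \tfrac{m^2}{2}\Big).

% Radon-Nikodym check on [a,b]: the exponents recombine by completing the square,
% -x^2/2 + m x - m^2/2 = -(x-m)^2/2.
\int_a^b e^{m x - m^2/2}\, p(x)\, dx
  = \int_a^b \frac{1}{\sqrt{2\pi}}\, e^{-(x-m)^2/2}\, dx
  = \int_a^b q(x)\, dx .
```

The last integral is the target mass of $[a,b]$, which is what the Radon-Nikodym identity requires.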

Step 4. The probability picture. The function is what statisticians call the likelihood ratio of the shifted normal against the standard normal. It tells you how much more likely each outcome is under the target than under the background. Importance sampling, hypothesis testing, and the change-of-measure technique in stochastic analysis all build on this Radon-Nikodym density.

Step 5. A singular example. Now consider a probability measure that is a half-and-half mix of the standard normal and a point mass at : the measure that puts half its mass on the continuous bell curve and half on the single point . There is no density function with respect to Lebesgue measure that captures the point-mass half, because any function integrated over the single point gives zero — but the point mass assigns probability to that point. The Lebesgue decomposition splits this mixed measure into its absolutely continuous half (the bell curve, with density ) plus its singular half (the point mass concentrated on ).
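The decomposition in Step 5 can be sketched symbolically. The block below assumes the point mass sits at a generic point $c$ and writes $\lambda$ for Lebesgue measure and $p$ for the standard normal density; these symbols are illustrative choices.

```latex
% Mixed measure: half standard normal, half point mass at an assumed point c.
\nu(E) = \tfrac12 \int_E p(x)\, dx + \tfrac12\, \delta_c(E).

% Lebesgue decomposition of \nu relative to Lebesgue measure \lambda.
\nu_a(E) = \int_E \tfrac12\, p(x)\, dx, \qquad
\nu_s = \tfrac12\, \delta_c, \qquad
\nu = \nu_a + \nu_s .

% \nu_a has a density; \nu_s lives on the Lebesgue-null set \{c\}.
\frac{d\nu_a}{d\lambda} = \tfrac12\, p, \qquad
\lambda(\{c\}) = 0, \qquad
\nu_s(\mathbb{R} \setminus \{c\}) = 0 .
```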

What this tells us: when the target measure has no mass on background-null sets, the Radon-Nikodym derivative exists as an integrable function. When it does have such mass, the Lebesgue decomposition separates the well-behaved density-having part from the singular part that resists density representation.

Check your understanding Beginner

Formal definition Intermediate+

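Let $(X, \mathcal{A})$ be a measurable space and let $\mu, \nu$ be measures on $(X, \mathcal{A})$.

Definition (absolute continuity). The measure $\nu$ is absolutely continuous with respect to $\mu$, written $\nu \ll \mu$, if every $\mu$-null measurable set is also $\nu$-null: $\mu(E) = 0$ implies $\nu(E) = 0$ for every $E \in \mathcal{A}$.

Definition (mutual singularity). The measures $\mu$ and $\nu$ are mutually singular, written $\mu \perp \nu$, if there exists a measurable set $S$ with $\mu(S) = 0$ and $\nu(X \setminus S) = 0$. Equivalently, $\mu$ and $\nu$ are concentrated on disjoint measurable sets.

Definition (signed measure). A signed measure on $(X, \mathcal{A})$ is a countably additive set-function $\nu : \mathcal{A} \to [-\infty, \infty]$ taking at most one of the values $\pm\infty$, with $\nu(\emptyset) = 0$. The positive part $\nu^+$ and negative part $\nu^-$ are non-negative measures arising from the Hahn-Jordan decomposition (Theorem 1 below), and $\nu = \nu^+ - \nu^-$.

Definition (Radon-Nikodym derivative). If $\nu \ll \mu$ and both measures are $\sigma$-finite, then there is a non-negative $\mathcal{A}$-measurable function $h$, unique up to $\mu$-almost everywhere equality, such that $\nu(E) = \int_E h \, d\mu$ for every $E \in \mathcal{A}$. The function $h$ is the Radon-Nikodym derivative of $\nu$ with respect to $\mu$, denoted $d\nu/d\mu$. The defining identity rewrites as $d\nu = h \, d\mu$ in differential notation.

Definition (Lebesgue decomposition). For two $\sigma$-finite measures $\mu, \nu$ on $(X, \mathcal{A})$, there exist unique measures $\nu_a, \nu_s$ on $(X, \mathcal{A})$ with $\nu = \nu_a + \nu_s$, $\nu_a \ll \mu$, and $\nu_s \perp \mu$. The pair $(\nu_a, \nu_s)$ is the Lebesgue decomposition of $\nu$ relative to $\mu$.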

Counterexamples to common slips Intermediate+

  • Absolute continuity does not require equivalence. The relation is one-directional: may have density against without having density against . For example, Lebesgue measure on is absolutely continuous with respect to Lebesgue measure on , but not conversely (the half has positive mass on but is outside the support of the first measure).

  • Without $\sigma$-finiteness the Radon-Nikodym derivative may not exist. On the real line, let $\mu$ be counting measure (which is not $\sigma$-finite on $\mathbb{R}$) and let $\nu$ be Lebesgue measure. The only counting-null set is the empty set (counting measure of any non-empty set is at least $1$), so vacuously $\nu \ll \mu$. But there is no Borel-measurable function $h$ with $\nu(E) = \int_E h \, d\mu$ for every Borel $E$: taking $E = \{x\}$, such a function would need $h(x) = \nu(\{x\}) = 0$, forcing $h \equiv 0$, but then $\int_E h \, d\mu = 0 \neq \nu(E)$ for any $E$ of positive Lebesgue measure. The $\sigma$-finiteness of $\mu$ is load-bearing.

  • Mutual singularity is not the negation of absolute continuity. Two measures can fail to be absolutely continuous without being mutually singular. The example: on the real line, take Lebesgue measure on and Lebesgue measure on . Neither is the other (each has support outside the other's), but they are not mutually singular either: they overlap on where both are non-zero. The Lebesgue decomposition of with respect to here is , with the first piece and the second piece .

  • The Radon-Nikodym derivative is a function, not a number. Common slip: writing $d\nu/d\mu = c$ for a constant $c$ when in fact the derivative depends on $x$. The constant case happens precisely when $\nu = c\mu$, that is when $\nu$ is a scalar multiple of $\mu$. For typical pairs of measures the derivative varies pointwise.

  • Uniqueness is only $\mu$-a.e., not pointwise. Two functions that agree $\mu$-almost everywhere give the same Radon-Nikodym density. Any pointwise modification on a $\mu$-null set leaves the integral unchanged. So the Radon-Nikodym derivative is an equivalence class of measurable functions modulo $\mu$-a.e. equality (an element of $L^1(\mu)$ when $\nu$ is finite), not a specific function.
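A concrete pair illustrates the gap between absolute continuity and mutual singularity. The sketch below uses Lebesgue measure $\lambda$ restricted to two overlapping intervals; the intervals $[0,2]$ and $[1,3]$ are an illustrative choice and are not taken from the list above.

```latex
% Assumed illustrative pair: restrictions of Lebesgue measure \lambda to overlapping intervals.
\mu(E) = \lambda(E \cap [0,2]), \qquad \nu(E) = \lambda(E \cap [1,3]).

% Not absolutely continuous: a \mu-null set with positive \nu-mass.
\mu((2,3]) = 0, \qquad \nu((2,3]) = 1 .

% Not mutually singular either: any set S with \mu(S) = 0 has \lambda(S \cap [1,2]) = 0,
% so \nu(X \setminus S) \geq \lambda([1,2] \setminus S) = 1 > 0 .

% Lebesgue decomposition of \nu relative to \mu.
\nu_a(E) = \lambda(E \cap [1,2]), \qquad
\nu_s(E) = \lambda(E \cap (2,3]), \qquad
\frac{d\nu_a}{d\mu} = \mathbf{1}_{[1,2]} \quad \mu\text{-a.e.}
```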

Key theorem with proof Intermediate+

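Theorem (Lebesgue-Radon-Nikodym; Radon 1913 Sitzungsber. Wien 122, 1295; Nikodym 1930 Fund. Math. 15, 131). Let $(X, \mathcal{A})$ be a measurable space and let $\mu, \nu$ be $\sigma$-finite measures on $(X, \mathcal{A})$. There exist unique $\sigma$-finite measures $\nu_a, \nu_s$ on $(X, \mathcal{A})$ with:

(a) $\nu = \nu_a + \nu_s$;

(b) $\nu_a \ll \mu$;

(c) $\nu_s \perp \mu$.

Moreover, there exists a non-negative $\mathcal{A}$-measurable function $h$, unique up to $\mu$-a.e. equality, such that $\nu_a(E) = \int_E h \, d\mu$ for every $E \in \mathcal{A}$. The function $h$ is the Radon-Nikodym derivative of $\nu_a$ with respect to $\mu$.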

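Proof (von Neumann's Hilbert-space argument; von Neumann 1940 Bull. AMS 46, 376). We first treat the case where both $\mu$ and $\nu$ are finite; the $\sigma$-finite case follows by a standard exhaustion.

Step 1 (the Hilbert space). Let $\lambda = \mu + \nu$. Both $\mu$ and $\nu$ are dominated by $\lambda$ (each is $\leq \lambda$), and $\lambda$ is a finite measure since $\mu, \nu$ are finite. Form the Hilbert space $L^2(\lambda)$ with inner product $\langle f, g \rangle = \int f g \, d\lambda$. The space is a Hilbert space by Riesz-Fischer completeness 02.07.06.

Step 2 (a bounded linear functional). Define $\Lambda : L^2(\lambda) \to \mathbb{R}$ by $\Lambda(f) = \int f \, d\nu$. The functional is well-defined and bounded: $|\Lambda(f)| \leq \int |f| \, d\nu \leq \int |f| \, d\lambda \leq \lambda(X)^{1/2} \|f\|_{L^2(\lambda)}$, using $\nu \leq \lambda$ in the second inequality and Cauchy-Schwarz with the constant function $1$ in the third. So $\|\Lambda\| \leq \lambda(X)^{1/2}$.

Step 3 (Riesz representation). By the Riesz representation theorem for Hilbert spaces, there exists $g \in L^2(\lambda)$ with $\int f \, d\nu = \int f g \, d\lambda$ for every $f \in L^2(\lambda)$. The function $g$ is uniquely determined as an element of $L^2(\lambda)$, and $0 \leq g \leq 1$ $\lambda$-a.e. For the lower bound, test with $f = \mathbf{1}_{\{g < 0\}}$: then $0 \leq \nu(\{g < 0\}) = \int_{\{g<0\}} g \, d\lambda \leq 0$, which forces $\lambda(\{g < 0\}) = 0$. For the upper bound, test with $f = \mathbf{1}_{\{g > 1\}}$: if $\lambda(\{g > 1\}) > 0$ then $\nu(\{g > 1\}) = \int_{\{g>1\}} g \, d\lambda > \lambda(\{g > 1\}) \geq \nu(\{g > 1\})$, a contradiction.

Step 4 (the decomposition sets). Define $A = \{x : 0 \leq g(x) < 1\}$ and $S = \{x : g(x) = 1\}$. Both sets are measurable since $g$ is. The set $S$ carries the singular part: for $f$ supported on $S$, the identity $\int f \, d\nu = \int f g \, d\lambda$ rearranges (using $g = 1$ on $S$ and $\lambda = \mu + \nu$) to $\int f \, d\mu = 0$ for every such $f$, hence $\mu(S) = 0$.

Step 5 (the absolutely continuous part). Define $\nu_s(E) = \nu(E \cap S)$ and $\nu_a(E) = \nu(E \cap A)$. Then $\nu = \nu_a + \nu_s$. The singular part $\nu_s$ is concentrated on $S$, where $\mu(S) = 0$, so $\nu_s \perp \mu$.

For the absolutely continuous part: on $A$, $g < 1$, so $1 - g > 0$. Set $f = \mathbf{1}_{E \cap A} / (1 - g)$ for measurable $E$. To handle integrability, first restrict to $\{g \leq 1 - 1/n\}$, where $f$ is bounded, and let $n \to \infty$ by monotone convergence at the end. The Riesz identity gives $\int_{E \cap A} \frac{1}{1-g} \, d\nu = \int_{E \cap A} \frac{g}{1-g} \, d\lambda$. Using $\lambda = \mu + \nu$, the right-hand side splits as $\int_{E \cap A} \frac{g}{1-g} \, d\mu + \int_{E \cap A} \frac{g}{1-g} \, d\nu$, and rearranging gives $\nu(E \cap A) = \int_{E \cap A} \frac{g}{1-g} \, d\mu$, i.e., $\nu_a(E) = \int_{E \cap A} h \, d\mu$ where $h = g/(1-g)$ on $A$ (and $h = 0$ on $S$). So $\nu_a(E) = \int_E h \, d\mu$ (the integral over $S$ vanishes since $\mu(S) = 0$), establishing $\nu_a \ll \mu$ with $d\nu_a/d\mu = h$.

Step 6 ($\sigma$-finite case). Decompose $X = \bigcup_n X_n$ with $\mu(X_n), \nu(X_n) < \infty$ and the $X_n$ disjoint. Apply the finite-measure result to each $X_n$ to get densities $h_n$ on $X_n$. Patch together: define $h(x) = h_n(x)$ for $x \in X_n$. The patched function is $\mathcal{A}$-measurable and gives the global Radon-Nikodym density.

Step 7 (uniqueness). Suppose $h_1$ and $h_2$ both satisfy the defining identity. Then $\int_E (h_1 - h_2) \, d\mu = 0$ for every $E \in \mathcal{A}$ on which both integrals are finite, which forces $h_1 = h_2$ $\mu$-almost everywhere by the standard argument (apply to $E = \{h_1 > h_2\} \cap X_n$ and to $E = \{h_1 < h_2\} \cap X_n$ to deduce $\mu(\{h_1 \neq h_2\}) = 0$).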
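A one-point sanity check shows what the construction in Steps 1 to 5 does on an atom. The sketch below assumes a single point $x$ carrying masses $a$ and $b$ under the background and target measures; the symbols $a$, $b$ are illustrative, and $g$, $h$, $S$ denote the Riesz representer, the density, and the singular set of the argument.

```latex
% Assumed atom: \mu(\{x\}) = a > 0, \ \nu(\{x\}) = b \geq 0, so \lambda(\{x\}) = a + b.
% Riesz identity tested on f = 1_{\{x\}}:
b = \int \mathbf{1}_{\{x\}}\, d\nu = \int \mathbf{1}_{\{x\}}\, g\, d\lambda = g(x)\,(a + b)
\quad\Longrightarrow\quad g(x) = \frac{b}{a+b} < 1 .

% The density recovered by h = g/(1-g) is the ratio of masses:
h(x) = \frac{g(x)}{1 - g(x)} = \frac{b/(a+b)}{a/(a+b)} = \frac{b}{a} .

% Degenerate case a = 0 < b: g(x) = 1, so x \in S and the atom is singular.
```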

Bridge. The von Neumann Hilbert-space proof of Lebesgue-Radon-Nikodym is one of the cleanest applications of the Riesz representation theorem on $L^2$, replacing the older measure-theoretic arguments (Radon 1913, Nikodym 1930, Hahn 1921) with a functional-analytic argument that pivots on a single bounded linear functional and its Riesz representer. The resulting density is the foundational object behind every density representation in probability theory: probability density functions against Lebesgue measure, the likelihood ratio between two probability measures, the conditional expectation $E[X \mid \mathcal{G}]$ defined as the $\mathcal{G}$-measurable Radon-Nikodym derivative of $\nu(G) = \int_G X \, dP$ with respect to $P|_{\mathcal{G}}$, and the change-of-measure formula in Girsanov's theorem for Brownian motion under a drifted measure. Each of these probabilistic constructions is a special case of the abstract Radon-Nikodym derivative, with the underlying $\sigma$-finite Hilbert-space argument running invisibly behind the explicit formulas. Beyond probability, Radon-Nikodym is the load-bearing tool in the $(L^p)^* \cong L^q$ duality for $1 \leq p < \infty$ on $\sigma$-finite measure spaces, the foundation of functional analysis; in the construction of conditional expectation as the orthogonal projection onto an $L^2$-subspace, the foundation of martingale theory; and in the chain rule for nested density representations, the foundation of pushforward measures in differential geometry and information theory.

Exercises Intermediate+

Lean formalization Intermediate+

lean_status: partial. Mathlib provides the central Lebesgue-Radon-Nikodym architecture. MeasureTheory.Measure.AbsolutelyContinuous is the relation $\nu \ll \mu$, with notation ν ≪ μ. MeasureTheory.Measure.MutuallySingular is the relation $\nu \perp \mu$, with notation ν ⟂ₘ μ. MeasureTheory.Measure.haveLebesgueDecomposition_of_sigmaFinite gives existence of the Lebesgue decomposition on $\sigma$-finite measures, with MeasureTheory.Measure.singularPart and MeasureTheory.Measure.rnDeriv extracting the singular part and the Radon-Nikodym derivative. The change-of-variables formula appears as MeasureTheory.lintegral_rnDeriv_mul. The von Neumann Hilbert-space proof is not explicitly reconstructed in Mathlib, which follows a more direct decomposition argument via the Hahn-Jordan decomposition, but the resulting density and Lebesgue decomposition are fully formalised in the companion module Codex.Analysis.MeasureTheory.AbsoluteContinuityRadonNikodym.

import Mathlib.MeasureTheory.Decomposition.RadonNikodym
import Mathlib.MeasureTheory.Decomposition.Lebesgue
import Mathlib.MeasureTheory.Measure.AbsolutelyContinuous

open MeasureTheory

variable {α : Type*} [MeasurableSpace α]
variable (μ ν : Measure α) [SigmaFinite μ] [SigmaFinite ν]

-- Absolute continuity: ν ≪ μ means every μ-null set is ν-null
example (h : ν ≪ μ) (E : Set α) (hE : MeasurableSet E) (hμ : μ E = 0) :
    ν E = 0 := h hμ

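-- Lebesgue decomposition: ν = ν.singularPart μ + μ.withDensity (ν.rnDeriv μ).
-- The decomposed measure comes first in `HaveLebesgueDecomposition`, and the
-- Mathlib lemma is already stated in this orientation, so no `.symm` is needed.
example [ν.HaveLebesgueDecomposition μ] :
    ν = ν.singularPart μ + μ.withDensity (ν.rnDeriv μ) :=
  Measure.haveLebesgueDecomposition_add ν μ

-- Radon-Nikodym identity: when ν ≪ μ, integration against ν reduces to
-- integration against μ weighted by the RN derivative.
-- The statement follows the Mathlib lemma: density on the left of the product,
-- the μ-integral on the left-hand side of the equation.
example (h : ν ≪ μ) (f : α → ENNReal) (hf : Measurable f) :
    ∫⁻ x, ν.rnDeriv μ x * f x ∂μ = ∫⁻ x, f x ∂ν :=
  lintegral_rnDeriv_mul h hf.aemeasurable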

Advanced results Master

The advanced theory of absolute continuity, mutual singularity, and the Radon-Nikodym derivative splits across nine strands: the Hahn-Jordan decomposition of signed measures, the Lebesgue decomposition into absolutely continuous plus singular parts, the unsigned and signed Radon-Nikodym theorems, the chain rule and change-of-variables for Radon-Nikodym derivatives, $\sigma$-finiteness as a load-bearing hypothesis and explicit counterexamples without it, conditional expectation in probability theory, the Riesz representation of $(L^p)^*$, the martingale convergence theorem, and Girsanov-style changes of measure in stochastic analysis.

Theorem 1 (Hahn-Jordan decomposition; Hahn 1921 Ann. Sc. Norm. Sup. Pisa 9, 429). Let $(X, \mathcal{A})$ be a measurable space and let $\nu$ be a signed measure on $(X, \mathcal{A})$. There exist measurable sets $P, N$ with $P \cup N = X$, $P \cap N = \emptyset$, and $\nu(E) \geq 0$ for every measurable $E \subseteq P$ and $\nu(E) \leq 0$ for every measurable $E \subseteq N$. The decomposition is unique up to $\nu$-null modifications. Defining $\nu^+(E) = \nu(E \cap P)$ and $\nu^-(E) = -\nu(E \cap N)$ gives non-negative measures with $\nu = \nu^+ - \nu^-$ and $\nu^+ \perp \nu^-$.

The Hahn-Jordan decomposition is the structural foundation of signed-measure theory: every signed measure splits canonically into a positive part and a negative part, supported on disjoint sets. The total variation measure is $|\nu| = \nu^+ + \nu^-$, and the total variation norm is $\|\nu\| = |\nu|(X)$. The proof uses an extremal-set argument: take $P$ to be a set of maximum positive measure, that is one achieving the supremum $\sup\{\nu(E) : E \in \mathcal{A}\}$, and verify the positive-set property via a refinement argument that successively removes negative-mass subsets [Hahn 1921].
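For a signed measure given by an integrable density, the Hahn sets and the Jordan parts can be read off directly. The sketch below assumes $\nu(E) = \int_E f \, d\mu$ for some real $f \in L^1(\mu)$; the symbols $f$, $P$, $N$ are illustrative.

```latex
% Assumed signed measure with integrable density f against a measure \mu.
\nu(E) = \int_E f \, d\mu, \qquad f \in L^1(\mu).

% A Hahn decomposition: split X by the sign of f.
P = \{ f \geq 0 \}, \qquad N = \{ f < 0 \}.

% Jordan parts and total variation in terms of f^{\pm} = \max(\pm f, 0).
\nu^+(E) = \int_E f^+ \, d\mu, \qquad
\nu^-(E) = \int_E f^- \, d\mu, \qquad
|\nu|(E) = \int_E |f| \, d\mu, \qquad
\|\nu\| = \|f\|_{L^1(\mu)} .
```

Moving points where $f = 0$ from $P$ to $N$ gives another valid Hahn decomposition, which is the "unique up to null modifications" clause in action.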

Theorem 2 (Lebesgue decomposition; Lebesgue 1904 Leçons sur l'intégration §VII). Let $\mu, \nu$ be $\sigma$-finite measures on $(X, \mathcal{A})$. There exist unique $\sigma$-finite measures $\nu_a, \nu_s$ on $(X, \mathcal{A})$ with $\nu = \nu_a + \nu_s$, $\nu_a \ll \mu$, and $\nu_s \perp \mu$.

Lebesgue's 1904 Leçons sur l'intégration et la recherche des fonctions primitives [Lebesgue 1904] introduced the decomposition for the special case of measures on the real line, motivated by the structure of distribution functions: every monotone increasing function on $\mathbb{R}$ has a unique decomposition as the sum of an absolutely continuous monotone function (with $L^1$-density derivative), a singular continuous monotone function (with derivative zero a.e., concentrated on a Lebesgue-null set, like the Cantor function), and a jump function (a sum of step functions). The abstract measure-theoretic version (general $\sigma$-finite $\mu, \nu$) was completed by Radon 1913 and Nikodym 1930. The proof is essentially Step 4-5 of the main theorem's proof: the decomposition sets are $A = \{g < 1\}$ and $S = \{g = 1\}$ from the Hilbert-space argument.

Theorem 3 (Radon-Nikodym, unsigned; Radon 1913, Nikodym 1930). Let $\mu, \nu$ be $\sigma$-finite measures on $(X, \mathcal{A})$ with $\nu \ll \mu$. There exists a non-negative $\mathcal{A}$-measurable function $h$, unique up to $\mu$-a.e. equality, with $\nu(E) = \int_E h \, d\mu$ for every $E \in \mathcal{A}$.

Radon's 1913 paper [Radon 1913], titled Theorie und Anwendungen der absolut additiven Mengenfunktionen, established the theorem for measures absolutely continuous with respect to Lebesgue measure on $\mathbb{R}^n$. Nikodym's 1930 Fundamenta Mathematicae paper [Nikodym 1930] extended it to arbitrary $\sigma$-finite background measures on abstract measurable spaces, completing the modern statement. The proof is the von Neumann Hilbert-space argument (Step 1-7 of the main theorem) for the finite case, plus $\sigma$-finite exhaustion.

Theorem 4 (Radon-Nikodym, signed and complex; Folland 2e Theorem 3.8). Let $\mu$ be a $\sigma$-finite measure on $(X, \mathcal{A})$. For every $\sigma$-finite signed measure $\nu$ with $\nu \ll \mu$ (defined via $|\nu| \ll \mu$), there exists an $\mathcal{A}$-measurable function $h$ (in $L^1(\mu)$ when $\nu$ is finite), unique up to $\mu$-a.e. equality, with $\nu(E) = \int_E h \, d\mu$ for every $E \in \mathcal{A}$. The complex case extends by decomposing into real and imaginary parts.

The signed-and-complex Radon-Nikodym theorem reduces to the unsigned case via the Hahn-Jordan decomposition $\nu = \nu^+ - \nu^-$: apply Theorem 3 to each of $\nu^\pm$ to get non-negative densities $h^\pm$, then $h = h^+ - h^-$ is the signed density. The complex case decomposes the complex measure into real and imaginary signed measures and applies the real case to each.

Theorem 5 (Chain rule for Radon-Nikodym; Folland 2e Theorem 3.7). Let $\mu, \nu, \rho$ be $\sigma$-finite measures on $(X, \mathcal{A})$ with $\rho \ll \nu \ll \mu$. Then $\rho \ll \mu$, and the Radon-Nikodym derivatives satisfy $\frac{d\rho}{d\mu} = \frac{d\rho}{d\nu} \cdot \frac{d\nu}{d\mu}$ $\mu$-almost everywhere.

The chain rule is the measure-theoretic analogue of the chain rule of calculus, and it is the foundational algebraic identity for nested density representations. The proof is Exercise 4 above: substitute the density of $\nu$ against $\mu$ into the defining identity of $\rho$ against $\nu$ and apply uniqueness of the resulting density against $\mu$.

A consequence: under mutual absolute continuity $\nu \ll \mu \ll \nu$, the derivatives are reciprocals: $\frac{d\mu}{d\nu} = \left(\frac{d\nu}{d\mu}\right)^{-1}$ $\mu$-a.e. (and $\nu$-a.e.). This is the symmetric-density identity behind the Kullback-Leibler divergence and the change-of-coordinates formula in differential geometry.
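When all three measures have strictly positive densities against Lebesgue measure, the chain rule reduces to cancelling a common factor. The sketch below assumes measures $\mu, \nu, \rho$ with positive Lebesgue densities $p, q, r$; all of these symbols are illustrative.

```latex
% Assumed: d\mu = p\,dx, \ d\nu = q\,dx, \ d\rho = r\,dx with p, q, r > 0.
\frac{d\rho}{d\nu} = \frac{r}{q}, \qquad
\frac{d\nu}{d\mu} = \frac{q}{p}, \qquad
\frac{d\rho}{d\nu} \cdot \frac{d\nu}{d\mu} = \frac{r}{q} \cdot \frac{q}{p} = \frac{r}{p} = \frac{d\rho}{d\mu} .

% Reciprocal identity for a mutually absolutely continuous pair:
\frac{d\mu}{d\nu} = \frac{p}{q} = \Big( \frac{q}{p} \Big)^{-1} = \Big( \frac{d\nu}{d\mu} \Big)^{-1} .
```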

Theorem 6 (Change of variables under Radon-Nikodym; Folland 2e Theorem 3.9). Let $\mu, \nu$ be $\sigma$-finite with $\nu \ll \mu$ and Radon-Nikodym derivative $h = d\nu/d\mu$. For every non-negative $\mathcal{A}$-measurable function $f$: $\int f \, d\nu = \int f h \, d\mu$. For signed/complex measurable $f$, the identity holds when either side has finite absolute integral, with both being defined as $L^1$-convergent integrals on a full-measure set.

The change-of-variables formula extends the defining identity (which is the special case $f = \mathbf{1}_E$) to general non-negative measurable functions via the standard machine: indicators $\to$ simple $\to$ non-negative measurable (MCT) $\to$ signed via $f = f^+ - f^-$. The identity is the abstract version of the calculus substitution rule for a smooth bijection $\phi$: in measure-theoretic language, the Jacobian determinant $|\det D\phi|$ is the Radon-Nikodym derivative of the pullback measure against Lebesgue measure.
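As a worked instance of the change-of-variables identity, one can compute a mean under a shifted normal by integrating against the standard normal. The sketch below assumes background $N(0,1)$ with density $p$ and target $N(m,1)$ for a generic mean $m$, so that the density of target against background is $e^{mx - m^2/2}$; the symbols $m$ and $p$ are illustrative.

```latex
% Assumed pair: \mu = N(0,1) with density p, \ \nu = N(m,1), \ d\nu/d\mu = e^{m x - m^2/2}.
% Change of variables with f(x) = x:
\int x \, d\nu
  = \int x \, e^{m x - m^2/2} \, d\mu
  = \int x \, e^{m x - m^2/2} \, \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \, dx .

% Complete the square in the exponent: -x^2/2 + m x - m^2/2 = -(x - m)^2/2.
\int x \, \frac{1}{\sqrt{2\pi}}\, e^{-(x-m)^2/2} \, dx = m .
```

This reweighting of a background expectation by the density is the identity that importance sampling relies on.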

Theorem 7 ($(L^p)^* \cong L^q$ via Radon-Nikodym; Riesz 1910 Math. Ann. 69, 449). Let $(X, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space and $1 \leq p < \infty$ with conjugate exponent $q$, $1/p + 1/q = 1$. The map $T : L^q(\mu) \to (L^p(\mu))^*$ defined by $T(g)(f) = \int f g \, d\mu$ is an isometric isomorphism.

The Radon-Nikodym theorem is the load-bearing step in the proof: starting from a bounded linear functional $\Lambda \in (L^p)^*$, define a signed measure $\nu(E) = \Lambda(\mathbf{1}_E)$ on sets of finite measure, observe $\nu \ll \mu$, apply Radon-Nikodym to get a density $g = d\nu/d\mu$, and verify $g \in L^q$ with $\|g\|_q = \|\Lambda\|$. The full proof is Exercise 7. The duality identification is the foundational structural fact behind the weak topology on $L^p$, the Banach-Alaoglu compactness of bounded sequences, and the reflexivity of $L^p$ for $1 < p < \infty$ (since $(L^p)^{**} \cong (L^q)^* \cong L^p$ via two applications of the duality, modulo Clarkson's inequality controlling the canonical embedding).

The endpoint $p = \infty$ fails: $(L^\infty)^*$ is strictly larger than $L^1$ on a $\sigma$-finite space, with the difference filled by "purely finitely additive" measures (Banach limits and finitely-additive set-functions). The construction of these non-$L^1$ functionals on $L^\infty$ requires the axiom of choice via the Hahn-Banach theorem [Riesz 1910].

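Theorem 8 (Conditional expectation as Radon-Nikodym derivative; Kolmogorov 1933 Grundbegriffe der Wahrscheinlichkeitsrechnung). Let $(\Omega, \mathcal{F}, P)$ be a probability space, $X \in L^1(\Omega, \mathcal{F}, P)$, and $\mathcal{G} \subseteq \mathcal{F}$ a sub-$\sigma$-algebra. There exists a $\mathcal{G}$-measurable function $Y \in L^1$, unique up to $P$-a.e. equality, with $\int_G Y \, dP = \int_G X \, dP$ for every $G \in \mathcal{G}$. The function $Y$ is the conditional expectation of $X$ given $\mathcal{G}$, denoted $E[X \mid \mathcal{G}]$.

The proof (Exercise 5) identifies $Y$ as the Radon-Nikodym derivative of the signed measure $\nu(G) = \int_G X \, dP$ (on $\mathcal{G}$) against $P|_{\mathcal{G}}$. This identification is the foundational construction of modern probability theory: conditional expectation is not defined pointwise but rather as an equivalence class of $\mathcal{G}$-measurable random variables satisfying a Radon-Nikodym-type integral identity.

The basic properties (tower property $E[E[X \mid \mathcal{G}] \mid \mathcal{H}] = E[X \mid \mathcal{H}]$ for nested $\mathcal{H} \subseteq \mathcal{G}$, linearity, monotonicity, $L^p$-boundedness for $p \geq 1$ via Jensen's inequality, and $E[\cdot \mid \mathcal{G}]$ being the $L^2$-orthogonal projection onto the closed subspace $L^2(\Omega, \mathcal{G}, P)$) all derive from the Radon-Nikodym defining identity.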
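When the sub-$\sigma$-algebra is generated by a countable partition, the Radon-Nikodym derivative can be written down explicitly. The sketch below assumes a partition $\{B_i\}$ of the sample space into events of positive probability; the symbols $B_i$, $\mathcal{G}$, $X$, $P$ are illustrative.

```latex
% Assumed: \mathcal{G} = \sigma(B_1, B_2, \dots) for a partition with P(B_i) > 0.
E[X \mid \mathcal{G}] = \sum_i \Big( \frac{1}{P(B_i)} \int_{B_i} X \, dP \Big)\, \mathbf{1}_{B_i} .

% Check of the defining identity on a generator G = B_j:
\int_{B_j} E[X \mid \mathcal{G}] \, dP
  = \frac{1}{P(B_j)} \int_{B_j} X \, dP \cdot P(B_j)
  = \int_{B_j} X \, dP .
```

Every set in $\mathcal{G}$ is a countable union of the $B_j$, so the identity extends from the generators by countable additivity.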

Theorem 9 (Doob martingale convergence; Doob 1940 Trans. AMS 47, 455; Doob 1953 Stochastic Processes Ch. VII). Let $(M_n)$ be a uniformly integrable martingale on $(\Omega, \mathcal{F}, P)$ adapted to filtration $(\mathcal{F}_n)$. Then $M_n \to M_\infty$ almost surely and in $L^1$, where $M_\infty$ is $\mathcal{F}_\infty$-measurable and $M_n = E[M_\infty \mid \mathcal{F}_n]$ for every $n$.

The proof (sketched in Exercise 8) identifies $M_\infty$ as the Radon-Nikodym derivative on $\mathcal{F}_\infty = \sigma(\bigcup_n \mathcal{F}_n)$ of the limit signed measure extending the consistent sequence $\nu_n(A) = \int_A M_n \, dP$, $A \in \mathcal{F}_n$. The almost-sure convergence comes from Doob's upcrossing inequality and the $L^1$-convergence from uniform integrability via Dunford-Pettis [Doob, Stochastic Processes].

Doob's theorem is the time-asymptotic Radon-Nikodym theorem, with the filtration $(\mathcal{F}_n)$ representing accumulating information and the martingale $(M_n)$ representing the evolving estimates. The convergence theorem says the estimates do converge to the limit density on the union $\sigma$-algebra, both almost surely and in $L^1$. The result is the foundational convergence theorem of probability theory and the abstract underpinning of every recursive estimation procedure (Kalman filtering, stochastic approximation, Bayesian posterior updating).

Theorem 10 (Girsanov change of measure; Girsanov 1960 Theory Probab. Appl. 5, 285). Let $(W_t)_{0 \leq t \leq T}$ be a Brownian motion on $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$, $\theta$ an adapted process with $\int_0^T \theta_s^2 \, ds < \infty$ a.s. and Novikov's condition $E\left[\exp\left(\tfrac12 \int_0^T \theta_s^2 \, ds\right)\right] < \infty$. Define $Q$ by $\frac{dQ}{dP} = \exp\left(\int_0^T \theta_s \, dW_s - \tfrac12 \int_0^T \theta_s^2 \, ds\right)$. Then under $Q$, the process $\tilde{W}_t = W_t - \int_0^t \theta_s \, ds$ is a Brownian motion.

Girsanov's theorem is the change-of-measure formula in stochastic analysis: the Radon-Nikodym derivative $dQ/dP$ is the exponential martingale, and the change of measure shifts the drift of the Brownian motion by $\theta$. The construction is foundational in mathematical finance, where it underlies the risk-neutral pricing of derivatives (Black-Scholes 1973 J. Polit. Econ. 81, 637 and Harrison-Kreps 1979 J. Econ. Theory 20, 381): the equivalent martingale measure $Q$ is constructed via Girsanov to make the discounted asset price a martingale, and option prices are computed as $Q$-expectations [Folland 2e for Radon-Nikodym background; the Girsanov construction itself is downstream].
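The constant-drift case can be checked by a Gaussian computation. The sketch below assumes a constant $\theta$ on a fixed horizon $[0,T]$ and the sign convention $dQ/dP = \exp(\theta W_T - \theta^2 T/2)$; the symbols $Z_T$, $\theta$, $T$ are illustrative, and only the terminal-time mean is verified, not the full Brownian property.

```latex
% Assumed: constant \theta, W_T \sim N(0, T) under P, and
Z_T = \frac{dQ}{dP} = \exp\!\big( \theta W_T - \tfrac12 \theta^2 T \big).

% Z_T integrates to one, so Q is a probability measure
% (uses E_P[e^{\theta W_T}] = e^{\theta^2 T/2}):
E_P[Z_T] = e^{-\theta^2 T/2}\, E_P\big[ e^{\theta W_T} \big] = 1 .

% Drift under Q (uses E_P[W_T e^{\theta W_T}] = \theta T\, e^{\theta^2 T/2}):
E_Q[W_T] = E_P[W_T Z_T] = \theta T,
\qquad\text{so}\qquad E_Q\big[ W_T - \theta T \big] = 0 .
```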

Synthesis. The Lebesgue-Radon-Nikodym architecture identifies one of the three most foundational pillars of measure theory (alongside Carathéodory's outer-measure construction and Fubini-Tonelli on product spaces). The central insight is the existence and uniqueness of a density function for absolutely continuous pairs of $\sigma$-finite measures, together with the Lebesgue decomposition splitting general pairs into an absolutely continuous part plus a singular part. The von Neumann Hilbert-space proof reveals the structural reason the theorem works: the density is read off from the Riesz representer of the functional $f \mapsto \int f \, d\nu$ on $L^2(\mu + \nu)$, and the Riesz representation theorem on Hilbert spaces does all the work.

The pattern generalises in five directions. First, to probability theory: conditional expectation, regular conditional probabilities, disintegrations of measures, and martingales are all explicit Radon-Nikodym-derivative constructions. Second, to functional analysis: the $(L^p)^* \cong L^q$ duality for $1 \leq p < \infty$ on $\sigma$-finite spaces is a direct Radon-Nikodym application. Third, to stochastic analysis: Girsanov's theorem and the risk-neutral pricing framework of mathematical finance are exponential-martingale Radon-Nikodym derivatives. Fourth, to information theory: the Kullback-Leibler divergence is the integral of the log-density, the foundational information-theoretic measure of distance between probability distributions. Fifth, to differential geometry: the pullback and pushforward of measures under smooth maps are governed by Jacobian determinants, which are the Radon-Nikodym derivatives in coordinate charts.
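The information-theoretic direction can be made concrete with a Gaussian pair. The sketch below assumes background $N(0,1)$ and target $N(m,1)$ for a generic mean $m$, with density $e^{mx - m^2/2}$ of target against background; the symbols $m$, $\mu$, $\nu$, $D$ are illustrative.

```latex
% Assumed pair: \mu = N(0,1), \ \nu = N(m,1), \ \frac{d\nu}{d\mu}(x) = e^{m x - m^2/2}.
% Kullback-Leibler divergence as the \nu-integral of the log-density:
D(\nu \,\|\, \mu)
  = \int \log \frac{d\nu}{d\mu} \, d\nu
  = \int \big( m x - \tfrac12 m^2 \big) \, d\nu(x)
  = m \cdot m - \tfrac12 m^2
  = \tfrac12 m^2 .
```

The third equality uses only that the target has mean $m$.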

Putting these together identifies Radon-Nikodym as the bridge between absolute continuity (a measure-theoretic relation) and density-existence (a function-theoretic representation), with the Hilbert-space proof revealing the bridge as a Riesz representation in disguise. The five directions above are structurally distinct, but each rests on the same density construction.

Full proof set Master

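Proposition 1 (Hahn decomposition, full proof). Every signed measure $\nu$ on $(X, \mathcal{A})$ admits a Hahn decomposition $X = P \cup N$ with $P$ positive and $N$ negative.

Proof. Assume $\nu$ takes values in $[-\infty, \infty)$ (the case where $\nu$ takes values in $(-\infty, \infty]$ is symmetric). Let $s = \sup\{\nu(E) : E \in \mathcal{A}\}$. Choose sets $E_n$ with $\nu(E_n) \to s$; since $s \geq \nu(\emptyset) = 0$, we may take $\nu(E_n) > -\infty$. The claim is that one can construct a measurable $P$ with $\nu(P) = s$ (and $P$ positive).

Sub-step 1a (positivity lemma). For each $n$, construct a positive set $P_n \subseteq E_n$ (every measurable subset of $P_n$ has non-negative $\nu$-measure) with $\nu(P_n) \geq \nu(E_n)$. The construction removes negative-mass measurable subsets: if $E_n$ contains a measurable subset $B$ with $\nu(B) < 0$, replace $E_n$ by $E_n \setminus B$, which has $\nu(E_n \setminus B) = \nu(E_n) - \nu(B) > \nu(E_n)$. Iterate countably many times, at step $k$ removing a subset $B_k$ with $\nu(B_k) < -1/m_k$, where $m_k$ is the smallest positive integer for which such a subset exists. The removed sets are disjoint, and because $\nu(E_n \setminus \bigcup_k B_k) = \nu(E_n) - \sum_k \nu(B_k) < \infty$, the total removed mass is finite. Hence $m_k \to \infty$, and the remainder $P_n = E_n \setminus \bigcup_k B_k$ contains no subset of negative measure.

Sub-step 1b (Hahn set). Define $P = \bigcup_n P_n$. A countable union of positive sets is positive: a measurable subset of $P$ splits into disjoint pieces, each inside some $P_n$, and its $\nu$-measure is the sum of the non-negative measures of the pieces. By positivity $\nu(P) \geq \nu(P_n) \geq \nu(E_n)$ for every $n$, and $\nu(P) \leq s$ by the supremum definition. So $\nu(P) = s$ and $P$ is positive; in particular $s < \infty$.

Sub-step 1c (negative complement). Define $N = X \setminus P$. Every measurable subset $E \subseteq N$ satisfies $\nu(E) \leq 0$: otherwise $P \cup E$ would have $\nu(P \cup E) = \nu(P) + \nu(E) > s$, contradicting the supremum. So $N$ is negative.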

Proposition 2 (Lebesgue decomposition existence and uniqueness; finite-measure case). Let $\mu, \nu$ be finite measures on $(X, \mathcal{A})$. There exist unique measures $\nu_a, \nu_s$ with $\nu = \nu_a + \nu_s$, $\nu_a \ll \mu$, and $\nu_s \perp \mu$.

Proof. The existence is Step 4-5 of the main theorem's proof: the Hilbert-space argument produces measurable sets $A, S$ with $S$ being the singular support and $A$ being the absolutely continuous support. The measures $\nu_a(E) = \nu(E \cap A)$ and $\nu_s(E) = \nu(E \cap S)$ give the decomposition.

For uniqueness: suppose $\nu = \nu_a' + \nu_s'$ is another decomposition. Then $\nu_a - \nu_a' = \nu_s' - \nu_s$. The left-hand side is absolutely continuous against $\mu$ (difference of two $\mu$-absolutely-continuous measures), and the right-hand side is mutually singular with $\mu$ (difference of two measures, both supported on $\mu$-null sets). The only signed measure that is both absolutely continuous against $\mu$ and mutually singular with $\mu$ is the zero measure (the absolutely continuous part vanishes on $\mu$-null sets, and the singular part is concentrated on a $\mu$-null set). So $\nu_a = \nu_a'$ and $\nu_s = \nu_s'$.

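Proposition 3 ($\sigma$-finite extension). The Lebesgue-Radon-Nikodym theorem extends from finite to $\sigma$-finite measures via exhaustion.

Proof. Let $X = \bigcup_n X_n$ with the $X_n$ disjoint and $\mu(X_n), \nu(X_n) < \infty$. Apply the finite-measure Radon-Nikodym (Proposition 2 plus the Hilbert-space density argument) to each $X_n$: get densities $h_n$ on $X_n$ and singular parts $\nu_{s,n}$ on $X_n$. The patched objects $h(x) = h_n(x)$ for $x \in X_n$ and $\nu_s = \sum_n \nu_{s,n}$ are the global density and singular part. Measurability is preserved because the patching is along a countable measurable partition.

Proposition 4 (Uniqueness of the Radon-Nikodym derivative). If $\int_E h_1 \, d\mu = \int_E h_2 \, d\mu$ for every $E \in \mathcal{A}$, then $h_1 = h_2$ $\mu$-almost everywhere.

Proof. Set $E^+ = \{h_1 > h_2\}$ and choose the $\sigma$-finite pieces $X_n$ so that both integrals are finite on each $X_n$. The hypothesis gives $\int_E (h_1 - h_2) \, d\mu = 0$ for every measurable $E \subseteq X_n$. Apply this with $E = E^+ \cap X_n$: $\int_{E^+ \cap X_n} (h_1 - h_2) \, d\mu = 0$ with non-negative integrand $h_1 - h_2$, so $h_1 - h_2 = 0$ $\mu$-a.e. on $E^+ \cap X_n$, hence $\mu(E^+ \cap X_n) = 0$. Similarly $\mu(\{h_1 < h_2\} \cap X_n) = 0$. Taking the union over $n$ gives $\mu(\{h_1 \neq h_2\}) = 0$, so $h_1 = h_2$ $\mu$-a.e.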

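Proposition 5 (Signed Radon-Nikodym). Let $\mu$ be $\sigma$-finite and $\nu$ a signed measure with $\nu \ll \mu$. Then there is a measurable $h$ with $\nu(E) = \int_E h \, d\mu$ for every $E \in \mathcal{A}$, unique up to $\mu$-a.e. equality, and $h \in L^1(\mu)$ when $\nu$ is finite.

Proof. By the Hahn-Jordan decomposition (Proposition 1), $\nu = \nu^+ - \nu^-$ with non-negative measures $\nu^\pm$ and $\nu^\pm \ll \mu$ (since $|\nu| \ll \mu$). Apply the unsigned Radon-Nikodym (Proposition 3) to each: get $h^+ = d\nu^+/d\mu$ and $h^- = d\nu^-/d\mu$. Set $h = h^+ - h^-$. Then $\nu(E) = \int_E h^+ \, d\mu - \int_E h^- \, d\mu = \int_E h \, d\mu$ by linearity of the integral. When $\nu$ is finite, the function $h$ is in $L^1(\mu)$ because $h^\pm$ are. Uniqueness reduces to the unsigned case (Proposition 4) applied to $\nu^+$ and $\nu^-$ separately.

Proposition 6 (Chain rule). For $\sigma$-finite measures $\rho \ll \nu \ll \mu$, the derivatives satisfy $\frac{d\rho}{d\mu} = \frac{d\rho}{d\nu} \cdot \frac{d\nu}{d\mu}$ $\mu$-a.e.

Proof. Let $h_1 = d\rho/d\nu$ and $h_2 = d\nu/d\mu$. The change-of-variables formula (Theorem 6, Proposition 7 below) gives, for non-negative measurable $f$: $\int f \, d\rho = \int f h_1 \, d\nu = \int f h_1 h_2 \, d\mu$, where the second equality applies the change-of-variables formula with the density $h_2$ of $\nu$ against $\mu$. Specialising to $f = \mathbf{1}_E$: $\rho(E) = \int_E h_1 h_2 \, d\mu$. By uniqueness of the Radon-Nikodym derivative of $\rho$ against $\mu$, $d\rho/d\mu = h_1 h_2$ $\mu$-a.e.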

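Proposition 7 (Change of variables for Radon-Nikodym). For $\sigma$-finite $\mu, \nu$ with $\nu \ll \mu$, density $h = d\nu/d\mu$, and non-negative measurable $f$: $\int f \, d\nu = \int f h \, d\mu$.

Proof. The standard machine: indicators $\to$ simple $\to$ non-negative measurable.

Step 1 (indicators). For $f = \mathbf{1}_E$: $\int \mathbf{1}_E \, d\nu = \nu(E) = \int_E h \, d\mu = \int \mathbf{1}_E h \, d\mu$, where the second equality is the defining identity of the Radon-Nikodym derivative.

Step 2 (simple). For $f = \sum_i c_i \mathbf{1}_{E_i}$ a non-negative simple function: linearity of the integral on both sides gives $\int f \, d\nu = \sum_i c_i \nu(E_i) = \sum_i c_i \int_{E_i} h \, d\mu = \int f h \, d\mu$.

Step 3 (non-negative measurable). Approximate $f$ by an increasing sequence $f_n \uparrow f$ of non-negative simple functions via the standard simple-function approximation 02.07.03. By MCT 02.07.04 applied to both sides: $\int f_n \, d\nu \to \int f \, d\nu$ and $\int f_n h \, d\mu \to \int f h \, d\mu$ (note $f_n h \uparrow f h$ pointwise since $h \geq 0$). The identity for simple $f_n$ (Step 2) passes to the limit, giving the identity for $f$.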

Proposition 8 ($(L^p)^* \cong L^q$ for $1 \leq p < \infty$, $\sigma$-finite $\mu$). The map $T : L^q(\mu) \to (L^p(\mu))^*$ defined by $T(g)(f) = \int f g \, d\mu$ is an isometric isomorphism.

Proof. Step 1-7 of Exercise 7. The injectivity and isometry are immediate from Hölder and the duality-extremising function. The surjectivity uses Radon-Nikodym: starting from $\Lambda \in (L^p)^*$, define the signed measure $\nu(E) = \Lambda(\mathbf{1}_E)$ on finite-measure sets, observe $\nu \ll \mu$, apply Radon-Nikodym to get a density $g$ that recovers $\Lambda$ on simple functions, then extend by $L^p$-density to all $f \in L^p$.

The $\sigma$-finiteness of $\mu$ is load-bearing twice: once for the existence of the Radon-Nikodym derivative of the signed measure $\nu$, and once for patching the densities obtained on finite-measure sets into a single $g$ that represents $\Lambda$ on simple functions of finite-measure support. On non-$\sigma$-finite measure spaces the conclusion can fail at $p = 1$, where the dual of $L^1$ may be strictly larger than $L^\infty$.

Proposition 9 (Conditional expectation properties). For and , the conditional expectation defined via Radon-Nikodym satisfies:

(a) Linearity: for .

(b) Tower property: for .

(c) -projection: for , is the orthogonal projection of onto the closed subspace .

(d) Jensen: for convex with , -a.s.

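Proof sketch. All four properties derive from the defining Radon-Nikodym identity $\int_A \mathbb{E}[X \mid \mathcal{G}] \, d\mathbb{P} = \int_A X \, d\mathbb{P}$ for $A \in \mathcal{G}$.

Linearity (a): both sides satisfy the defining identity with $\int_A (aX + bY) \, d\mathbb{P}$ on the right. By uniqueness, they are $\mathbb{P}$-a.e. equal.

Tower (b): for $A \in \mathcal{H} \subseteq \mathcal{G}$, $\int_A \mathbb{E}[\mathbb{E}[X \mid \mathcal{G}] \mid \mathcal{H}] \, d\mathbb{P} = \int_A \mathbb{E}[X \mid \mathcal{G}] \, d\mathbb{P} = \int_A X \, d\mathbb{P} = \int_A \mathbb{E}[X \mid \mathcal{H}] \, d\mathbb{P}$. By uniqueness of the Radon-Nikodym derivative on $(\Omega, \mathcal{H})$, the two sides agree $\mathbb{P}$-a.s.

$L^2$-projection (c): for $X \in L^2$, $\mathbb{E}[X \mid \mathcal{G}]$ is in $L^2(\mathcal{G})$ (Jensen with $\varphi(x) = x^2$). The orthogonality condition $\mathbb{E}[(X - \mathbb{E}[X \mid \mathcal{G}]) Z] = 0$ for all $Z \in L^2(\mathcal{G})$ reduces (by linearity and $L^2$-continuity in $Z$) to $\mathbb{E}[(X - \mathbb{E}[X \mid \mathcal{G}]) \mathbf{1}_A] = 0$ for $A \in \mathcal{G}$, which is exactly the defining identity. So $\mathbb{E}[X \mid \mathcal{G}]$ is the orthogonal projection.

Jensen (d): apply the convex inequality pointwise using the supporting-line characterisation of convex functions. The proof uses the representation $\varphi(x) = \sup_n (a_n x + b_n)$ as a supremum over countably many affine minorants, applies linearity and monotonicity of conditional expectation to each affine minorant, and takes the supremum.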
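On a finite probability space a sub-$\sigma$-algebra is generated by a partition, and the conditional expectation is the block-wise average, which makes all four properties checkable by exact arithmetic. The following Python sketch is an illustration under stated assumptions: the six-point uniform space, the partitions `G` and `H`, the random variables `X` and `Z`, and the helper names are hypothetical choices made for this example.

```python
from fractions import Fraction

# Hypothetical probability space: six equally likely outcomes 0..5.
prob = [Fraction(1, 6)] * 6
X = [Fraction(v) for v in (1, 4, 2, 8, 5, 7)]

# Sub-sigma-algebras are encoded by the partitions that generate them.
G = [[0, 1], [2, 3], [4, 5]]  # finer partition
H = [[0, 1, 2, 3], [4, 5]]    # coarser partition, so sigma(H) sits inside sigma(G)


def expect(Y):
    """Expectation of Y under prob."""
    return sum(y * w for y, w in zip(Y, prob))


def cond_exp(Y, partition):
    """E[Y | sigma(partition)]: on each block B the value is
    (integral of Y over B) / P(B), the density of A -> integral_A Y dP
    with respect to P restricted to the sub-sigma-algebra."""
    out = [Fraction(0)] * len(Y)
    for block in partition:
        mass = sum(prob[i] for i in block)
        avg = sum(Y[i] * prob[i] for i in block) / mass
        for i in block:
            out[i] = avg
    return out


EG = cond_exp(X, G)
EH = cond_exp(X, H)

# (b) Tower property: conditioning on G and then on H.
tower_lhs = cond_exp(EG, H)

# (c) Orthogonality of the residual X - E[X|G] to a G-measurable Z.
Z = [Fraction(3), Fraction(3), Fraction(-1), Fraction(-1), Fraction(2), Fraction(2)]
inner = expect([(x - e) * z for x, e, z in zip(X, EG, Z)])

# (d) Jensen with the convex function phi(x) = x^2.
jensen_lhs = [e * e for e in EG]
jensen_rhs = cond_exp([x * x for x in X], G)
```

Here `tower_lhs` can be compared with `EH`, `inner` with zero, and `jensen_lhs` with `jensen_rhs` entry by entry.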

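Proposition 10 (Doob martingale convergence, outline). A uniformly integrable martingale $(M_n, \mathcal{F}_n)_{n \geq 0}$ converges $\mathbb{P}$-a.s. and in $L^1$ to a limit $M_\infty$, identified as the Radon-Nikodym derivative $M_\infty = \frac{d\nu}{d\mathbb{P}}$ on $\mathcal{F}_\infty = \sigma\left(\bigcup_n \mathcal{F}_n\right)$.

Proof outline. Step 1 (almost-sure convergence): Doob's upcrossing inequality on the number of upcrossings of any interval $[a, b]$ shows that with probability $1$, the sequence $(M_n)$ makes only finitely many oscillations across any rational interval, hence converges in $[-\infty, \infty]$. Fatou's lemma with the bound $\sup_n \mathbb{E}|M_n| < \infty$ shows that the limit is finite a.s.

Step 2 ($L^1$-convergence under uniform integrability): the Dunford-Pettis theorem characterises uniformly integrable subsets of $L^1$ as relatively weakly compact, and combined with the a.s. limit gives $L^1$-convergence (an alternative to the direct route through the Vitali convergence theorem).

Step 3 (Radon-Nikodym identification): the consistent sequence of signed measures $\nu_n(A) = \int_A M_n \, d\mathbb{P}$ on $\mathcal{F}_n$ extends to a single signed measure $\nu$ on $\mathcal{F}_\infty$ (Kolmogorov extension on the increasing union $\sigma$-algebra). The Radon-Nikodym derivative satisfies $\frac{d\nu}{d\mathbb{P}} = M_\infty$, identifying the martingale limit as a density.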
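The identification in Step 3 can be seen concretely on the unit interval with the dyadic filtration, where the martingale is the ratio of the target measure to Lebesgue measure on each dyadic interval. The Python sketch below is a minimal illustration under assumptions chosen for this example: the probability space $[0, 1)$ with Lebesgue measure, the density $f(x) = 3x^2$, and the helper names `nu`, `martingale_level`, `is_martingale_step`, and `sup_error` are all hypothetical.

```python
from fractions import Fraction


def nu(a, b):
    """nu([a, b)) for the measure with density f(x) = 3 x^2 on [0, 1)
    (an illustrative density, integrated in closed form)."""
    return b ** 3 - a ** 3


def martingale_level(n):
    """Values of M_n on the 2^n dyadic intervals of generation n:
    M_n = nu(I) / lambda(I) on each interval I, lambda being Lebesgue measure."""
    width = Fraction(1, 2 ** n)
    return [nu(k * width, (k + 1) * width) / width for k in range(2 ** n)]


def is_martingale_step(n):
    """E[M_{n+1} | F_n] = M_n: each parent value is the average of its two children."""
    parent, child = martingale_level(n), martingale_level(n + 1)
    return all(parent[k] == (child[2 * k] + child[2 * k + 1]) / 2
               for k in range(2 ** n))


def sup_error(n):
    """Largest gap between M_n and the density f(x) = 3 x^2 over [0, 1).

    Because f is increasing, M_n on [a, b) lies between f(a) and f(b), so the
    gap on that interval is at most max(M_n - f(a), f(b) - M_n)."""
    width = Fraction(1, 2 ** n)
    worst = Fraction(0)
    for k, m in enumerate(martingale_level(n)):
        a, b = k * width, (k + 1) * width
        worst = max(worst, m - 3 * a * a, 3 * b * b - m)
    return worst


errors = [sup_error(n) for n in range(1, 8)]
```

In this bounded example the convergence is even uniform, which is stronger than the a.s. and $L^1$ convergence in the proposition; the point of the sketch is only that the limit of the ratios is the density.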

Connections Master

  • Lebesgue integral and monotone convergence 02.07.04. The MCT in the change-of-variables formula (Proposition 7) extends the defining identity from indicators to non-negative measurable functions, the standard-machine step that bridges the measure-theoretic statement (the indicator-level identity) to the integral-theoretic statement (the general non-negative integrand).

  • Fatou's lemma and dominated convergence 02.07.05. The DCT is used in the proof of the $L^p$-density theorem invoked in Proposition 8 (Riesz representation) and in the proof of $L^1$-convergence in the martingale convergence theorem (Proposition 10): under uniform integrability the a.s. convergence implies $L^1$-convergence via a DCT-style domination by the uniformly integrable family.

  • $L^p$ spaces and Riesz-Fischer completeness 02.07.06. The just-shipped peer in $L^p$-theory. The Hilbert space $L^2(\mu + \nu)$ in the von Neumann proof of Radon-Nikodym is constructed via Riesz-Fischer completeness on $L^2$. The duality $(L^p)^* \cong L^q$ (Proposition 8 / Exercise 7) is a direct Radon-Nikodym application and is the foundational structural theorem of $L^p$-theory.

  • Fubini-Tonelli theorem and product measures 02.07.07. The immediate predecessor. Fubini-Tonelli and Radon-Nikodym together comprise the two pillars of modern integration theory beyond the basic Lebesgue-MCT-DCT framework: Fubini-Tonelli handles product spaces, while Radon-Nikodym handles density representations on single measure spaces. The two theorems combine in the construction of regular conditional probabilities, where disintegrations of a joint measure on a product space are obtained via Fubini-style fibrewise decomposition and Radon-Nikodym fibrewise density.

  • Lebesgue outer measure and Carathéodory construction 02.07.02. The construction of the singular part in the Lebesgue decomposition relies on Carathéodory's outer-measure machinery applied to the measurable structure of the support set of the singular part. Without Carathéodory the structural splitting into $\sigma$-additive pieces is unavailable.

  • Riesz representation theorem in functional analysis [forward: 02.11.14]. Proposition 8's identification $(L^p)^* \cong L^q$ for $1 \leq p < \infty$ is one of the founding theorems of functional analysis, with Radon-Nikodym as the load-bearing step. The forward connections to functional-analytic representation theorems (Riesz-Markov for measures on locally compact spaces, Riesz-Fréchet on Hilbert spaces, Hahn-Banach extensions) all build on the Radon-Nikodym density-representation framework.

  • Conditional expectation and martingale theory [forward: probability theory]. Theorem 8 and Proposition 9 identify conditional expectation as a Radon-Nikodym derivative and develop the algebraic properties (linearity, tower, $L^2$-projection, Jensen) from the defining identity. Doob's martingale convergence (Theorem 9, Proposition 10) is the time-asymptotic version of Radon-Nikodym, and it is the foundational convergence theorem of probability theory. Forward connections to stochastic calculus, Girsanov's theorem (Theorem 10), and mathematical finance all build on this Radon-Nikodym-driven framework.

  • Fourier analysis and PDE via weak derivatives [lateral: 02.10 Fourier; 02.13 PDE]. The Fourier transform of an $L^1$-function is a bounded continuous function (Riemann-Lebesgue), and the inverse transform is defined via a density representation against the Fourier-conjugate measure. Weak derivatives in Sobolev space theory are defined via Radon-Nikodym-type integration-by-parts identities: $\int u \, \partial_j \varphi \, dx = -\int (\partial_j u) \, \varphi \, dx$ for test functions $\varphi \in C_c^\infty$, with $\partial_j u$ the unique $L^1_{\mathrm{loc}}$-density satisfying the integral identity. The lateral connection to PDE theory is via this distributional integration-by-parts formulation, which is a Radon-Nikodym statement at heart.

  • Information theory: Kullback-Leibler divergence and entropy [lateral: information theory]. For $P \ll Q$, the Kullback-Leibler divergence $D_{\mathrm{KL}}(P \,\|\, Q) = \int \log \frac{dP}{dQ} \, dP$ is the integral of the log-density, the foundational information-theoretic measure of distance between two probability distributions. Shannon entropy is, up to sign and an additive constant, the special case where one measure is the uniform reference. The lateral connection to coding theory, channel capacity, and statistical inference is via this Radon-Nikodym log-density framework.

  • Differential geometry: Jacobians and measure pullback [lateral: differential geometry]. Under a smooth bijection $\Phi \colon U \to V$, the pullback measure $(\Phi^* \lambda)(A) = \lambda(\Phi(A))$ of Lebesgue measure $\lambda$ has Radon-Nikodym derivative the Jacobian determinant: $\frac{d(\Phi^* \lambda)}{d\lambda} = |\det D\Phi|$. The lateral connection to coordinate changes in integration on manifolds is via this Jacobian-as-Radon-Nikodym-derivative framework.
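Two of the lateral connections above, the log-density form of the Kullback-Leibler divergence and the Jacobian as a Radon-Nikodym derivative, can be checked numerically in small cases. The Python sketch below is a hedged illustration: the four-point distribution `p`, the map $\Phi(x) = x^3 + x$, the midpoint-rule quadrature, and all function names are assumptions chosen for this example.

```python
import math


def kl_divergence(p, q):
    """D(P || Q) = sum_i p_i * log(p_i / q_i) on a finite space, i.e. the
    integral of log(dP/dQ) against P. Requires P << Q."""
    if any(qi == 0 and pi > 0 for pi, qi in zip(p, q)):
        raise ValueError("P is not absolutely continuous with respect to Q")
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def shannon_entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def pullback_measure(a, b):
    """(Phi^* lambda)([a, b]) = lambda(Phi([a, b])) for Phi(x) = x^3 + x,
    an illustrative increasing smooth bijection of the real line."""
    phi = lambda x: x ** 3 + x
    return phi(b) - phi(a)


def integrate_jacobian(a, b, steps=10_000):
    """Midpoint-rule integral of |Phi'(x)| = 3 x^2 + 1 over [a, b]."""
    h = (b - a) / steps
    return sum((3 * (a + (k + 0.5) * h) ** 2 + 1) * h for k in range(steps))


# Entropy versus divergence from the uniform reference on four points.
p = [0.5, 0.25, 0.125, 0.125]
uniform = [0.25] * 4
kl_to_uniform = kl_divergence(p, uniform)
entropy = shannon_entropy(p)

# Pullback measure of [-1, 2] versus the integral of the Jacobian.
measure_value = pullback_measure(-1.0, 2.0)
jacobian_integral = integrate_jacobian(-1.0, 2.0)
```

On an $n$-point space the relation being illustrated is $H(P) = \log n - D_{\mathrm{KL}}(P \,\|\, U)$ with $U$ uniform, and in the second half the two quantities agree up to quadrature error.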

Historical & philosophical context Master

Henri Lebesgue's 1904 Leçons sur l'intégration et la recherche des fonctions primitives [Lebesgue 1904] introduced the modern integration theory and along the way characterised the structure of monotone increasing functions on $\mathbb{R}$: every such function has a unique decomposition as the sum of an absolutely continuous part (with $L^1$-density derivative), a singular continuous part (with derivative zero almost everywhere, like the Cantor function), and a jump part (a sum of step functions). Lebesgue's decomposition theorem for distribution functions was the historical germ of the Lebesgue decomposition theorem for measures, although the abstract formulation in terms of general $\sigma$-finite measures appeared only with Radon 1913 and Nikodym 1930.

Johann Radon's 1913 Sitzungsberichte der Wiener Akademie paper [Radon 1913], titled Theorie und Anwendungen der absolut additiven Mengenfunktionen (Theory and applications of absolutely additive set-functions), introduced the abstract measure-theoretic framework for "absolutely additive set-functions" (in modern terminology, signed measures) and proved the existence of a density function for measures absolutely continuous with respect to Lebesgue measure on $\mathbb{R}^n$. Radon's paper was foundational in two ways: it axiomatised the modern signed-measure framework, and it established the density-existence theorem for Lebesgue background. The proof technique used a refinement-of-partitions argument that was technically intricate; the cleaner Hilbert-space proof of von Neumann came three decades later.

Hans Hahn's 1921 Annali della Scuola Normale Superiore di Pisa paper [Hahn 1921], titled Über die Multiplikation total-additiver Mengenfunktionen (On the multiplication of totally additive set-functions), proved the Hahn-Jordan decomposition: every signed measure splits canonically into a positive part and a negative part, supported on disjoint measurable sets. The Hahn decomposition is the structural foundation of signed-measure theory, and it is the missing piece that lets one reduce the signed Radon-Nikodym theorem to the unsigned case via the Hahn-Jordan splitting.

Otton Nikodym's 1930 Fundamenta Mathematicae paper [Nikodym 1930], titled Sur une généralisation des intégrales de M. J. Radon (On a generalisation of M. J. Radon's integrals), extended Radon's theorem to arbitrary $\sigma$-finite background measures on abstract measurable spaces, completing the modern Lebesgue-Radon-Nikodym theorem. Nikodym's contribution was the recognition that the abstract measure-theoretic framework Radon had axiomatised did not require Lebesgue measure as background: any $\sigma$-finite measure could play the role of the reference, and the density-existence theorem went through. Nikodym's paper also clarified the $\sigma$-finiteness hypothesis as load-bearing, with explicit examples of failure under counting-measure-type non-$\sigma$-finite backgrounds.

John von Neumann's 1940 Annals of Mathematics paper [von Neumann 1940], titled On rings of operators III (with the relevant Hilbert-space proof of Radon-Nikodym appearing in §6), introduced the elegant Hilbert-space proof presented in this unit's main theorem. Von Neumann's insight was that the Radon-Nikodym density could be read off from the orthogonal projection of the constant function $1$ onto a specific closed subspace of $L^2(\mu + \nu)$. The Riesz representation theorem on Hilbert spaces does all the work, reducing the original measure-theoretic existence question to a one-line application of the Riesz lemma. Von Neumann's proof became the standard textbook proof of Radon-Nikodym after 1950, displacing the older refinement-of-partitions arguments of Radon and Nikodym.

Paul Halmos's 1950 Measure Theory monograph and Walter Rudin's 1966 Real and Complex Analysis [Rudin RCA; Folland 2e] codified the modern textbook formulation of Lebesgue-Radon-Nikodym: the von Neumann Hilbert-space proof, the Hahn-Jordan decomposition for the signed case, the Lebesgue decomposition for the singular-plus-absolutely-continuous splitting, and the chain rule and change-of-variables for the Radon-Nikodym derivative. Joseph Doob's 1953 Stochastic Processes [Doob — Stochastic Processes] applied the framework to martingale theory, identifying conditional expectation as a Radon-Nikodym derivative and proving the martingale convergence theorem as a time-asymptotic version of Radon-Nikodym (Theorem 9). Igor Girsanov's 1960 Theory of Probability and its Applications paper extended the framework to stochastic analysis, with the exponential-martingale Radon-Nikodym derivative behind the change-of-measure formula for Brownian motion (Theorem 10), foundational in mathematical finance.

The philosophical thread: Lebesgue-Radon-Nikodym is the rigorous version of the question "can one measure be written as a density against another?", a question that goes back at least to Newton and Leibniz's mass-density formulations in classical mechanics, and to Daniel Bernoulli's 1738 Hydrodynamica formulation of probability density. The arc of nearly six decades from Lebesgue 1904 to von Neumann 1940 to Doob 1953 to Girsanov 1960 tracks the maturation of the density-existence question from a special case for monotone functions on $\mathbb{R}$ to an abstract structural theorem on $\sigma$-finite measure spaces with applications across probability theory, functional analysis, stochastic analysis, and mathematical finance. The von Neumann Hilbert-space proof reveals the deep structural reason the theorem works: the density is an orthogonal projection in disguise, and the Riesz representation theorem on Hilbert spaces is the load-bearing step. This functional-analytic argument is one of the most elegant proofs in modern analysis and a model for how Hilbert-space arguments can illuminate measure-theoretic phenomena.

Bibliography Master

@book{Lebesgue1904,
  author    = {Lebesgue, Henri},
  title     = {Le\c{c}ons sur l'int\'egration et la recherche des fonctions primitives},
  publisher = {Gauthier-Villars},
  address   = {Paris},
  year      = {1904}
}

@article{Radon1913,
  author  = {Radon, Johann},
  title   = {Theorie und {A}nwendungen der absolut additiven {M}engenfunktionen},
  journal = {Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften in Wien, Mathematisch-naturwissenschaftliche Klasse},
  volume  = {122},
  year    = {1913},
  pages   = {1295--1438}
}

@article{Hahn1921,
  author  = {Hahn, Hans},
  title   = {\"Uber die {M}ultiplikation total-additiver {M}engenfunktionen},
  journal = {Annali della Scuola Normale Superiore di Pisa},
  series  = {2},
  volume  = {9},
  year    = {1921},
  pages   = {429--452}
}

@article{Nikodym1930,
  author  = {Nikodym, Otton},
  title   = {Sur une g\'en\'eralisation des int\'egrales de {M}. {J}. {R}adon},
  journal = {Fundamenta Mathematicae},
  volume  = {15},
  year    = {1930},
  pages   = {131--179}
}

@article{vonNeumann1940,
  author  = {von Neumann, John},
  title   = {On rings of operators {III}},
  journal = {Annals of Mathematics},
  series  = {2},
  volume  = {41},
  year    = {1940},
  pages   = {94--161}
}

@article{Riesz1910,
  author  = {Riesz, Frigyes},
  title   = {Untersuchungen \"uber {S}ysteme integrierbarer {F}unktionen},
  journal = {Mathematische Annalen},
  volume  = {69},
  year    = {1910},
  pages   = {449--497}
}

@book{Halmos1950,
  author    = {Halmos, Paul R.},
  title     = {Measure Theory},
  publisher = {Van Nostrand},
  address   = {New York},
  year      = {1950}
}

@book{Folland1999,
  author    = {Folland, Gerald B.},
  title     = {Real Analysis: Modern Techniques and Their Applications},
  edition   = {2},
  publisher = {Wiley},
  year      = {1999}
}

@book{Rudin1987,
  author    = {Rudin, Walter},
  title     = {Real and Complex Analysis},
  edition   = {3},
  publisher = {McGraw-Hill},
  year      = {1987}
}

@book{Bogachev2007,
  author    = {Bogachev, Vladimir I.},
  title     = {Measure Theory, Volume 1},
  publisher = {Springer},
  year      = {2007}
}

@book{RoydenFitzpatrick2010,
  author    = {Royden, H. L. and Fitzpatrick, P. M.},
  title     = {Real Analysis},
  edition   = {4},
  publisher = {Pearson},
  year      = {2010}
}

@book{Doob1953,
  author    = {Doob, Joseph L.},
  title     = {Stochastic Processes},
  publisher = {Wiley},
  address   = {New York},
  year      = {1953}
}

@book{Kolmogorov1933,
  author    = {Kolmogorov, Andrey N.},
  title     = {Grundbegriffe der {W}ahrscheinlichkeitsrechnung},
  publisher = {Springer},
  address   = {Berlin},
  year      = {1933}
}

@article{Girsanov1960,
  author  = {Girsanov, Igor V.},
  title   = {On transforming a certain class of stochastic processes by absolutely continuous substitution of measures},
  journal = {Theory of Probability and its Applications},
  volume  = {5},
  year    = {1960},
  pages   = {285--301}
}

@book{Tao2016,
  author    = {Tao, Terence},
  title     = {Analysis II},
  edition   = {3},
  publisher = {Hindustan Book Agency},
  year      = {2016}
}