02.20.01 · analysis / littlewood-paley-interpolation

BMO and the John-Nirenberg Inequality

Anchor (Master): Stein 1993 *Harmonic Analysis* (Princeton) Ch. IV; Grafakos 2014 *Modern Fourier Analysis* 3e (Springer) §7.1-7.4; Duoandikoetxea 2001 *Fourier Analysis* (AMS) §6

Intuition Beginner

Imagine measuring the price of one product across thousands of shops. Two kinds of "spread" are possible. One is that prices everywhere stay within a fixed band — never below a floor, never above a ceiling. The other, looser kind says nothing about a ceiling: prices can climb arbitrarily high in a few places, but if you zoom into any neighbourhood and compare each shop to the local average, the typical gap from that local average stays modest. The second condition controls how much things wiggle locally, not how big they get globally.

Functions of bounded mean oscillation are exactly the second kind. For each box you compute the average of the function over that box, then measure how far the function sits from its own average, on average, inside the box. If that averaged gap stays below one fixed bound across every box and every size, the function has bounded mean oscillation. A bounded function automatically qualifies, because it cannot stray far from any average. But the reverse fails: a function can have bounded mean oscillation while shooting off to infinity.

The headline example is the logarithm of distance to a point. Near that point the function plunges to negative infinity, so it is not bounded. Yet over any box the logarithm's gap from its box-average stays controlled, because shrinking or enlarging a box by a fixed factor only shifts the logarithm by an additive constant, and an additive constant cancels when the function is compared with its own average. This single example shows the new class is genuinely bigger than the bounded functions, and it is the reason the class matters: it is the natural home for outputs of operations that just barely fail to keep bounded inputs bounded.

The one-sentence takeaway: bounded mean oscillation asks each box's typical deviation from its own average to be small, which lets functions blow up slowly like a logarithm while still being tightly controlled at every scale.

Visual Beginner

Picture the graph of a function over a single box, with a horizontal line drawn at the height equal to the function's average over that box. The mean oscillation is the average vertical distance between the graph and that horizontal line. A nearly flat graph hugs its average line, so its mean oscillation is tiny. A graph that swings wildly above and below the line has large mean oscillation. The defining demand is that no box, of any size or position, produces a large average gap.

The lower panel carries the punchline. Even though the logarithm dives to negative infinity at the marked point, when you draw any box around the region and compare the curve to its box-average, the typical gap stays the same modest size no matter how small the box gets. Boundedness of the function is a statement about the curve's height; bounded mean oscillation is a statement about the curve's spread around its sliding average. The second can hold while the first fails.

Worked example Beginner

We measure the mean oscillation of a simple step function over a box and read off the number by hand.

Step 1. Work on the line and let equal on the left half of the interval from to , and equal on the right half, that is on and on . Take the box to be the whole interval , of length .

Step 2. Compute the average of over . The function is over a length and over a length , so the total area is , and the average is the area divided by the length, divided by , which is .

Step 3. Compute how far sits from this average of . On the left half , so the gap is . On the right half , so the gap is . The gap is everywhere.

Step 4. Average the gap over the box. Since the gap is the constant across the whole interval, its average is just . So the mean oscillation of over this box is .

What this tells us: the mean oscillation measures the typical distance from the local average, and here it came out to exactly half the jump height, because the function spends equal time on each side of its average. A function of bounded mean oscillation is one where this number stays below a fixed ceiling for every box at once; raising the jump height would raise the oscillation, while shrinking the box around a single level would lower it toward zero.
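The four steps can be replayed numerically. The following is a minimal sketch, not part of the worked example itself: the helper `mean_oscillation` and the concrete step function (value 0 on [0, 1), value 2 on [1, 2]) are assumptions chosen for illustration, so that the jump height is 2 and the function spends equal time at each level.

```python
def mean_oscillation(f, a, b, n=1000):
    """Return (average of f over [a, b], mean oscillation of f over [a, b]),
    both approximated with an n-point midpoint rule."""
    h = (b - a) / n
    values = [f(a + (i + 0.5) * h) for i in range(n)]
    avg = sum(values) / n                          # Step 2: the box average
    osc = sum(abs(v - avg) for v in values) / n    # Steps 3-4: average gap
    return avg, osc


def step(x):
    """Hypothetical step function: 0 on [0, 1), 2 on [1, 2]."""
    return 0.0 if x < 1.0 else 2.0


avg, osc = mean_oscillation(step, 0.0, 2.0)            # whole box [0, 2]
avg_left, osc_left = mean_oscillation(step, 0.0, 1.0)  # box inside one level
print(avg, osc, osc_left)
```

With these assumed values the average is 1 and the mean oscillation is 1, half the jump height, while the box that sits inside a single level has oscillation 0.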

Formal definition Intermediate+

Throughout, $Q$ ranges over cubes in $\mathbb{R}^n$ with sides parallel to the axes (equivalently, over balls; the two families give comparable seminorms), $|Q|$ is the Lebesgue measure of $Q$, and $f_Q = \frac{1}{|Q|}\int_Q f\,dx$ is the average of $f$ over $Q$. All functions are real- or complex-valued and locally integrable.

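Definition (bounded mean oscillation). For $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ the mean oscillation seminorm is $$\|f\|_{\mathrm{BMO}} = \sup_Q \frac{1}{|Q|}\int_Q |f - f_Q|\,dx,$$ the supremum taken over all cubes $Q$. The space of bounded mean oscillation is $$\mathrm{BMO}(\mathbb{R}^n) = \{ f \in L^1_{\mathrm{loc}}(\mathbb{R}^n) : \|f\|_{\mathrm{BMO}} < \infty \}.$$ Since $\|f\|_{\mathrm{BMO}} = 0$ if and only if $f$ is almost everywhere constant, $\|\cdot\|_{\mathrm{BMO}}$ is a seminorm whose null space is the constants; $\mathrm{BMO}$ is a Banach space once functions are identified modulo additive constants. The competitor centring is comparable: $$\inf_{c}\frac{1}{|Q|}\int_Q |f - c|\,dx \;\le\; \frac{1}{|Q|}\int_Q |f - f_Q|\,dx \;\le\; 2\inf_{c}\frac{1}{|Q|}\int_Q |f - c|\,dx,$$ so replacing $f_Q$ by the best constant changes $\|f\|_{\mathrm{BMO}}$ by at most a factor $2$.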
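The definition can be explored on sampled data. The sketch below is a discrete stand-in under stated assumptions, not the seminorm itself: it treats a list of samples as a step function on a uniform one-dimensional grid and takes the supremum only over windows of consecutive samples (intervals in the role of cubes). The name `discrete_bmo_seminorm` and the test signal are hypothetical.

```python
import math


def discrete_bmo_seminorm(samples):
    """Largest mean absolute deviation from the window average, taken over
    all windows samples[i:j] of consecutive samples (brute force)."""
    n = len(samples)
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            window = samples[i:j]
            avg = sum(window) / len(window)
            osc = sum(abs(v - avg) for v in window) / len(window)
            best = max(best, osc)
    return best


# Hypothetical bounded test signal.
samples = [math.sin(0.7 * k) for k in range(40)]
seminorm = discrete_bmo_seminorm(samples)
shifted = discrete_bmo_seminorm([v + 5.0 for v in samples])  # add a constant
sup_norm = max(abs(v) for v in samples)
print(seminorm, shifted, 2 * sup_norm)
```

Two features of the definition show up directly: adding a constant leaves the value unchanged up to rounding, and a bounded signal has oscillation at most twice its supremum.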

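Definition (sharp maximal function). The Fefferman-Stein sharp maximal function of $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ is $$f^\#(x) = \sup_{Q \ni x} \frac{1}{|Q|}\int_Q |f - f_Q|\,dy,$$ the supremum over cubes containing $x$. By definition $\|f^\#\|_{L^\infty} = \|f\|_{\mathrm{BMO}}$, so $f \in \mathrm{BMO}$ exactly when $f^\# \in L^\infty$, the BMO seminorm being the $L^\infty$-norm of the pointwise oscillation. One has $f^\#(x) \le C_n\,Mf(x)$ pointwise with a dimensional constant $C_n$, where $M$ is the Hardy-Littlewood maximal operator 02.19.01.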

Non-examples and the canonical example. The inclusion holds with , and it is strict: the function on is unbounded yet lies in , with bounded by a dimensional constant. The function , by contrast, is bounded — hence in — even though it is the composition of a bounded function with a function, illustrating that is not preserved by all bounded nonlinearities. The point of the class is captured by the slogan: is the smallest rearrangement-stable space containing on which Calderón-Zygmund operators remain bounded.
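The scale invariance behind the logarithm example can be checked numerically in one dimension. This is a minimal sketch under stated assumptions: the helper `mean_oscillation_log` is hypothetical, it only tests intervals $(-h, h)$ centred at the singularity, and it uses a plain midpoint rule. For these intervals a direct calculation gives the exact value $2/e$ at every scale, because dilating the interval shifts $\log|x|$ by a constant.

```python
import math


def mean_oscillation_log(h, n=100_000):
    """Mean oscillation of log|x| over (-h, h). By symmetry it is computed
    on (0, h) with an n-point midpoint rule."""
    step = h / n
    values = [math.log((i + 0.5) * step) for i in range(n)]
    avg = sum(values) / n
    return sum(abs(v - avg) for v in values) / n


scales = [1.0, 1e-2, 1e-4, 1e-6]
oscillations = [mean_oscillation_log(h) for h in scales]
exact = 2.0 / math.e  # closed form for intervals centred at the origin
for h, osc in zip(scales, oscillations):
    print(h, osc, exact)
```

The function itself tends to negative infinity as the interval shrinks, yet the computed oscillation does not move: this is the unbounded-but-BMO behaviour in its simplest form.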

Counterexamples to common slips Intermediate+

  • BMO is a seminorm space, not a norm space, until you quotient by constants. Adding a constant to $f$ leaves $\|f\|_{\mathrm{BMO}}$ unchanged because $f - f_Q$ is insensitive to constants. Treating $\|\cdot\|_{\mathrm{BMO}}$ as a norm on functions (rather than on cosets) gives a non-Hausdorff "space"; the genuine Banach space is $\mathrm{BMO}/\mathbb{C}$ with $\|\cdot\|_{\mathrm{BMO}}$.

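  • $\log|x| \in \mathrm{BMO}(\mathbb{R})$, but $\operatorname{sgn}(x)\log|x| \in \mathrm{BMO}(\mathbb{R})$ is false. Both factors $\operatorname{sgn}(x)$ and $\log|x|$ are in BMO, yet their product leaves it: on $(-h, h)$ with $h \le 1$ the odd function $\operatorname{sgn}(x)\log|x|$ has average $0$ and mean oscillation $1 + |\log h|$, which is unbounded as $h \to 0$. The subtle fact is that the modulus works in one direction only: $|f| \in \mathrm{BMO}$ does not force $f \in \mathrm{BMO}$ in general, though $f \in \mathrm{BMO}$ does give $|f| \in \mathrm{BMO}$ with $\||f|\|_{\mathrm{BMO}} \le 2\|f\|_{\mathrm{BMO}}$. Pointwise nonlinear operations are not BMO-bounded in general.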

  • The supremum is over all cubes, not a fixed scale. A function may have small oscillation over large cubes yet large oscillation over small ones, or vice versa. Checking one scale never certifies BMO; the defining supremum couples all positions and all sizes.

  • Cubes versus balls and centred versus best-constant change only the constant. Each admissible variant of the definition (cubes or balls, $f_Q$ or the infimising constant, $L^1$ or, after John-Nirenberg, $L^p$ averaging) yields a seminorm comparable to $\|\cdot\|_{\mathrm{BMO}}$, with constants depending only on the dimension and, for the $L^p$ variant, on $p$. The space is the same; only the numerical seminorm shifts.
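The failure of $\operatorname{sgn}(x)\log|x|$ and the need to test every scale can both be seen in a short computation. This is a sketch under stated assumptions: the helpers `mean_oscillation_symmetric` and `signed_log` are hypothetical, only intervals $(-h, h)$ are tested, and the integrals are approximated by a midpoint rule. On such an interval the function is odd, so its average is $0$ and its mean oscillation equals the average of $|\log|x||$, which is $1 - \log h$ for $h \le 1$.

```python
import math


def mean_oscillation_symmetric(f, h, n=100_000):
    """Mean oscillation of f over (-h, h), using n midpoints on each side."""
    step = h / n
    xs = [(i + 0.5) * step for i in range(n)]
    values = [f(-x) for x in xs] + [f(x) for x in xs]
    avg = sum(values) / len(values)
    return sum(abs(v - avg) for v in values) / len(values)


def signed_log(x):
    """sgn(x) * log|x| for x != 0."""
    return math.copysign(1.0, x) * math.log(abs(x))


radii = [1e-1, 1e-3, 1e-5]
oscillations = [mean_oscillation_symmetric(signed_log, h) for h in radii]
for h, osc in zip(radii, oscillations):
    print(h, osc, 1.0 - math.log(h))  # computed value vs closed form
```

The oscillation looks harmless at moderate scales and grows without bound as $h$ shrinks, so no single scale could have certified or refuted membership in BMO.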

Key theorem with proof Intermediate+

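Theorem (John-Nirenberg inequality; John-Nirenberg 1961 Comm. Pure Appl. Math. 14, 415). There are dimensional constants $c_1, c_2 > 0$ such that for every nonconstant $f \in \mathrm{BMO}(\mathbb{R}^n)$, every cube $Q$, and every $t > 0$, $$|\{x \in Q : |f(x) - f_Q| > t\}| \le c_1\,|Q|\,\exp\!\left(-\frac{c_2\,t}{\|f\|_{\mathrm{BMO}}}\right).$$ Explicit admissible values of $c_1$ and $c_2$ come out of the proof (with the dyadic-cube normalization below).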

Proof. By homogeneity (replace by ) and translation/dilation invariance of both sides, it suffices to assume and prove

Step 1 (one Calderón-Zygmund step). Fix a cube and set , so . Apply the dyadic Calderón-Zygmund decomposition 02.19.02 to on at height : subdividing dyadically and stopping at the maximal subcubes where , one obtains pairwise disjoint dyadic subcubes with the first from the unselected dyadic parent and the second from the stopping inequality summed over the disjoint family. Off the Lebesgue differentiation theorem 02.19.01 gives a.e.

Step 2 (the averages barely move). On each selected cube , Thus passing from to a child shifts the centring constant by at most . Consequently, for ,

Step 3 (the recursion). The set where inside is, up to a null set, contained in : off the selected cubes a.e. Hence for any , using Step 2 to absorb the constant shift, the definition of on each , and the measure bound . Taking the supremum over yields the functional inequality

Step 4 (iteration to exponential decay). Since always (it is the relative measure of a subset of ), iterating for every integer . For arbitrary write with and ; monotonicity of gives This is the claimed bound with and , after restoring by the initial normalization.
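The exponential decay can be observed on the canonical example. The sketch below is an illustration under stated assumptions, not a verification of the theorem: it takes $f = \log|x|$ on $Q = (-1, 1)$, where $f_Q = -1$ exactly, and estimates the relative measure of $\{x \in Q : |f - f_Q| > t\}$ on a uniform grid. For $t \ge 1$ this set is $\{|x| < e^{-1-t}\}$, so the relative measure is exactly $e^{-1-t}$. The helper name `relative_level_set` is hypothetical.

```python
import math


def relative_level_set(t, n=200_000):
    """Fraction of Q = (-1, 1) where |log|x| - f_Q| > t, with f_Q = -1.
    By symmetry the count is done on n midpoints of (0, 1)."""
    f_q = -1.0
    count = 0
    for i in range(n):
        x = (i + 0.5) / n
        if abs(math.log(x) - f_q) > t:
            count += 1
    return count / n


levels = [1.0, 2.0, 3.0, 4.0]
measured = [relative_level_set(t) for t in levels]
predicted = [math.exp(-1.0 - t) for t in levels]
for t, m, p in zip(levels, measured, predicted):
    print(t, m, p)
```

For the logarithm the distribution function decays exactly exponentially, which shows that the exponential rate in the inequality cannot be improved to anything faster.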

Bridge. The John-Nirenberg inequality builds toward the identification of $\mathrm{BMO}$ as the correct endpoint of the $L^p$ scale, and it appears again in the Fefferman-Stein sharp-maximal theory and the $H^1$-$\mathrm{BMO}$ duality that anchor the rest of this chapter. The foundational reason the proof works is that the Calderón-Zygmund decomposition 02.19.02 is self-similar: one stopping-time step at a fixed height reproduces the same distributional question one scale down with a fixed measure contraction and a fixed constant shift, and this is exactly the structure that turns a single linear oscillation bound into geometric, hence exponential, decay. The central insight is that bounded mean oscillation, an averaged condition, self-improves to exponential integrability: putting these together, $f \in \mathrm{BMO}$ forces $e^{c|f - f_Q|} \in L^1(Q)$ for small $c > 0$ (depending on $\|f\|_{\mathrm{BMO}}$), so BMO functions are locally in every $L^p$, $p < \infty$, and in the Orlicz space $\exp L$. The bridge is that this is dual to the $H^1$ atomic estimate, the $L^p$-integrability of $f - f_Q$ generalises the plain $L^1$ bound, and the recursion here is exactly the iterated stopping-time selection flagged in the Calderón-Zygmund decomposition's own forward map.

Exercises Intermediate+

Advanced results Master

Theorem 1 (John-Nirenberg, exponential form and self-improvement; John-Nirenberg 1961 Comm. Pure Appl. Math. 14, 415). A locally integrable lies in if and only if there exist constants with for all cubes . Consequently the seminorms for are all comparable, and the BMO condition stated with -oscillation self-improves to - and to exponential-oscillation with no enlargement of the space. The constant is the reciprocal of the John-Nirenberg decay rate [John-Nirenberg 1961].

Theorem 2 (Fefferman-Stein sharp maximal theorem; Fefferman-Stein 1972 Acta Math. 129, 137). For $1 < p < \infty$ and $f$ in a suitable dense class, $$\|f\|_{L^p} \le C\,\|f^\#\|_{L^p} \le C'\,\|f\|_{L^p},$$ with constants depending only on $n$ and $p$. The sharp maximal function thereby characterizes $L^p$ through pure mean-oscillation data, and $\mathrm{BMO}$ is the $p = \infty$ endpoint of the family $\{f : f^\# \in L^p\}$. This is the precise sense in which $\mathrm{BMO}$ replaces $L^\infty$ at the top of the $L^p$ scale [Fefferman-Stein 1972].

Theorem 3 (Calderón-Zygmund operators on the endpoints). Every Calderón-Zygmund operator maps $L^\infty \to \mathrm{BMO}$ boundedly, and by the duality of Theorem 4 its (formal) adjoint maps $H^1 \to L^1$ boundedly. The pair $(H^1, \mathrm{BMO})$ is the correct endpoint replacement for the failed $L^1 \to L^1$ and $L^\infty \to L^\infty$ bounds, and interpolating either with the $L^2$ bound recovers the full range $1 < p < \infty$ [Stein 1993].

Theorem 4 (Fefferman duality $(H^1)^* = \mathrm{BMO}$; Fefferman 1971 Bull. AMS 77, 587; Fefferman-Stein 1972 Acta Math. 129, 137). The dual of the real Hardy space $H^1(\mathbb{R}^n)$ is $\mathrm{BMO}(\mathbb{R}^n)$: every $b \in \mathrm{BMO}$ defines a bounded linear functional on $H^1$ by $\ell_b(f) = \int f\,b\,dx$ (suitably interpreted on atoms), with $\|\ell_b\| \approx \|b\|_{\mathrm{BMO}}$, and every bounded functional on $H^1$ arises this way. The pairing is well defined because the atomic structure of $H^1$ (mean-zero, $L^2$-normalized bumps supported on cubes) is exactly matched by the John-Nirenberg integrability of BMO, so that for an atom $a$ supported on $Q$ the integral $\int_Q a\,(b - b_Q)\,dx$ converges absolutely by Cauchy-Schwarz [Fefferman-Stein 1972].

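Theorem 5 (Carleson-measure characterization). A function $f$ lies in $\mathrm{BMO}(\mathbb{R}^n)$ if and only if $|\nabla u(x,t)|^2\,t\,dx\,dt$ is a Carleson measure on the upper half-space $\mathbb{R}^{n+1}_+$, where $u(x,t) = (P_t * f)(x)$ is the Poisson extension: $$\sup_Q \frac{1}{|Q|}\int_0^{\ell(Q)}\!\!\int_Q |\nabla u(x,t)|^2\,t\,dx\,dt < \infty,$$ with $\ell(Q)$ the side length of $Q$. This analytic characterization, equivalent to membership in $\mathrm{BMO}$ up to constants, is the route by which $\mathrm{BMO}$ enters the theory of the $\bar\partial$-equation and the corona problem, and it is the form in which Fefferman first proved the duality [Fefferman 1971].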

Theorem 6 (commutators; Coifman-Rochberg-Weiss). For a Calderón-Zygmund operator $T$ and $b \in \mathrm{BMO}$, the commutator $[b, T]f = b\,Tf - T(bf)$ is bounded on $L^p$ for $1 < p < \infty$, with $\|[b, T]\|_{L^p \to L^p} \le C\,\|b\|_{\mathrm{BMO}}$; conversely, for $T$ a Riesz transform, boundedness of $[b, T]$ on some $L^p$ forces $b \in \mathrm{BMO}$. The commutator thereby gives an operator-theoretic characterization of $\mathrm{BMO}$, and the proof runs through the John-Nirenberg exponential integrability to control the product $b\,f$ [Stein 1993].

Synthesis. The space $\mathrm{BMO}$ is the foundational reason that the failure of Calderón-Zygmund operators to preserve $L^1$ and $L^\infty$ is a feature: $\mathrm{BMO}$ and its predual $H^1$ are the correct endpoint spaces, and this is exactly what the four characterizations (mean-oscillation, sharp maximal, Carleson measure, commutator) assert from four directions. The central insight is the self-improvement engine of John-Nirenberg: a single $L^1$-averaged oscillation bound, fed through the self-similar stopping-time recursion of 02.19.02, upgrades to exponential integrability, and this is exactly what makes the pairing $\int f\,b\,dx$ converge against the $L^2$-normalized atoms of $H^1$. Putting these together, the picture generalises in three directions: vertically, $\mathrm{BMO}$ caps the $L^p$ scale as the $p = \infty$ endpoint via the Fefferman-Stein sharp maximal theorem (Theorem 2), bracketing the whole scale by the dual pair $(H^1, \mathrm{BMO})$; horizontally, the Carleson-measure characterization (Theorem 5) ports the cube-oscillation picture to the half-space and is dual to the tent-space theory; structurally, the commutator characterization (Theorem 6) reveals $\mathrm{BMO}$ as the symbol class for which multiplication interacts boundedly with singular integration. The bridge unifying them is the John-Nirenberg inequality: each equivalence restates that bounded mean oscillation is secretly exponential control, dual atom by atom to the cancellation built into $H^1$.

Full proof set Master

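Proposition 1 ($\|\cdot\|_{\mathrm{BMO}}$ is a seminorm with null space the constants). $\|\cdot\|_{\mathrm{BMO}}$ is subadditive and absolutely homogeneous, and $\|f\|_{\mathrm{BMO}} = 0$ if and only if $f$ is a.e. equal to a constant.

Proof. Absolute homogeneity is clear from $(\lambda f)_Q = \lambda f_Q$. For subadditivity, $(f + g)_Q = f_Q + g_Q$, so $|(f + g) - (f + g)_Q| \le |f - f_Q| + |g - g_Q|$ pointwise; averaging and taking the supremum over $Q$ gives $\|f + g\|_{\mathrm{BMO}} \le \|f\|_{\mathrm{BMO}} + \|g\|_{\mathrm{BMO}}$. If $f = c$ a.e. then $f_Q = c$ and the oscillation vanishes, so $\|f\|_{\mathrm{BMO}} = 0$. Conversely if $\|f\|_{\mathrm{BMO}} = 0$ then $\frac{1}{|Q|}\int_Q |f - f_Q|\,dx = 0$ for every cube, so $f = f_Q$ a.e. on each $Q$; since the constants must agree on overlapping cubes, $f$ is a.e. a single constant.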

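Proposition 2 ($\mathrm{BMO}$ is complete modulo constants). The quotient $\mathrm{BMO}/\mathbb{C}$ with seminorm $\|\cdot\|_{\mathrm{BMO}}$ is a Banach space.

Proof. Let $(f_k)$ be Cauchy in $\mathrm{BMO}$; normalize representatives so that $(f_k)_{Q_0} = 0$ for a fixed unit cube $Q_0$. For any fixed cube $Q$, writing $g = f_j - f_k$, the averages satisfy $\frac{1}{|Q|}\int_Q |g|\,dx \le \frac{1}{|Q|}\int_Q |g - g_Q|\,dx + |g_Q - g_{Q_0}|$; the first term is at most $\|g\|_{\mathrm{BMO}}$, and chaining $Q$ to $Q_0$ through finitely many overlapping cubes bounds $|g_Q - g_{Q_0}|$ by $C(Q)\,\|g\|_{\mathrm{BMO}}$. Hence $(f_k)$ is Cauchy in $L^1(Q)$ for every $Q$, so it converges in $L^1_{\mathrm{loc}}$ to some $f$. Lower semicontinuity of $\|\cdot\|_{\mathrm{BMO}}$ under $L^1_{\mathrm{loc}}$-convergence gives $\|f - f_k\|_{\mathrm{BMO}} \le \liminf_{j} \|f_j - f_k\|_{\mathrm{BMO}}$, so $f \in \mathrm{BMO}$ and $\|f - f_k\|_{\mathrm{BMO}} \to 0$.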

Proposition 3 (one Calderón-Zygmund step recursion, restated). With and , one has for all .

Proof. This is Steps 1-3 of the Key Theorem. The decomposition of at height produces disjoint dyadic subcubes with and , off which a.e. Hence up to null sets, and summing the per-cube bound yields the contraction.
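One stopping-time step can be carried out explicitly on sampled data. The sketch below is a one-dimensional illustration under stated assumptions: the function is represented by nonnegative samples on a dyadic grid, the helper `dyadic_stopping_cubes` and the stopping height `2.0` are hypothetical choices, and the test data is $g = |f - f_Q|$ for $f = \log x$ on $Q = (0, 1)$, where $f_Q = -1$. It selects the maximal dyadic blocks whose average exceeds the height.

```python
import math


def dyadic_stopping_cubes(values, height):
    """Maximal dyadic index blocks [lo, hi) whose average exceeds `height`.
    Assumes len(values) is a power of two (at least 2), values >= 0, and
    that the root block has average <= height, so it is never selected."""
    selected = []

    def visit(lo, hi):
        avg = sum(values[lo:hi]) / (hi - lo)
        if avg > height:
            selected.append((lo, hi))   # stop: first block above the height
        elif hi - lo > 1:
            mid = (lo + hi) // 2
            visit(lo, mid)              # otherwise subdivide and continue
            visit(mid, hi)

    mid = len(values) // 2
    visit(0, mid)
    visit(mid, len(values))
    return selected


N = 1024                                 # grid size, a power of two
g = [abs(math.log((i + 0.5) / N) + 1.0) for i in range(N)]
height = 2.0                             # hypothetical stopping height
root_avg = sum(g) / N
cubes = dyadic_stopping_cubes(g, height)
selected_fraction = sum(hi - lo for lo, hi in cubes) / N
print(cubes, selected_fraction, root_avg / height)
```

The output exhibits the three facts the recursion uses: each selected block has average between the height and twice the height (the factor $2^n$ with $n = 1$), the selected blocks cover at most the fraction `root_avg / height` of the interval, and off the selected blocks the samples stay below the height.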

Proposition 4 (exponential decay from the recursion). with .

Proof. From and Proposition 3, induction gives . For general , write and use monotonicity: .
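The passage from the recursion to exponential decay is elementary arithmetic, which the following sketch spells out. It is generic and makes no claim about the constants of the text: `rho` stands for a measure contraction in $(0, 1)$ and `shift` for a positive constant shift, and the values `0.5` and `4.0` are assumptions for illustration only. Applying $F(t) \le \rho\,F(t - \text{shift})$ as many times as fits, starting from $F \le 1$, gives $\rho^{\lfloor t/\text{shift} \rfloor}$, which sits below the smooth envelope $\rho^{-1} e^{-ct}$ with $c = \log(1/\rho)/\text{shift}$.

```python
import math


def iterated_bound(t, rho, shift):
    """Bound on F(t) from F <= 1 and F(t) <= rho * F(t - shift),
    applying the recursion floor(t / shift) times."""
    return rho ** math.floor(t / shift)


def exponential_envelope(t, rho, shift):
    """Smooth majorant (1 / rho) * exp(-c * t), c = log(1 / rho) / shift."""
    c = math.log(1.0 / rho) / shift
    return math.exp(-c * t) / rho


rho, shift = 0.5, 4.0                      # hypothetical parameters
ts = [0.25 * k for k in range(161)]        # t from 0 to 40
gaps = [exponential_envelope(t, rho, shift) - iterated_bound(t, rho, shift)
        for t in ts]
print(min(gaps))
```

The gap is never negative because $\lfloor t/\text{shift} \rfloor > t/\text{shift} - 1$; the factor $1/\rho$ in the envelope pays for the fractional part lost to the floor.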

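Proposition 5 ($\log|x| \in \mathrm{BMO}$, sharpness). $\log|x| \in \mathrm{BMO}(\mathbb{R}^n)$, while $|x|^{-\alpha} \notin \mathrm{BMO}(\mathbb{R}^n)$ for any $\alpha > 0$.

Proof. Membership is Exercise 3. For sharpness, let $f(x) = |x|^{-\alpha}$ with $0 < \alpha < n$ (for $\alpha \ge n$ the function is not locally integrable) and test on cubes $Q_r$ of side $r$ centred at the origin: by homogeneity $f_{Q_r} = r^{-\alpha} f_{Q_1}$ and $\frac{1}{|Q_r|}\int_{Q_r} |f - f_{Q_r}|\,dx = r^{-\alpha}\,\frac{1}{|Q_1|}\int_{Q_1} |f - f_{Q_1}|\,dx \to \infty$ as $r \to 0$, so $\|f\|_{\mathrm{BMO}} = \infty$. The dilation $x \mapsto \lambda x$ scales the oscillation by $\lambda^{-\alpha}$, which is bounded over scales only when $\alpha = 0$; the logarithm sits exactly at $\alpha = 0$, where the dilation acts by an additive constant the seminorm ignores.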

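Proposition 6 (sharp maximal pointwise bounds). $f^\#(x) \le C_n\,Mf(x)$ pointwise with a dimensional constant $C_n$, and $\|f^\#\|_{L^\infty} = \|f\|_{\mathrm{BMO}}$.

Proof. For a cube $Q \ni x$, $\frac{1}{|Q|}\int_Q |f - f_Q|\,dy \le \frac{2}{|Q|}\int_Q |f|\,dy \le C_n\,Mf(x)$ after replacing the cube by a comparable centred ball; taking the supremum over $Q \ni x$ gives $f^\#(x) \le C_n\,Mf(x)$. The identity is immediate: $\|f^\#\|_{L^\infty} = \operatorname{ess\,sup}_x \sup_{Q \ni x} \frac{1}{|Q|}\int_Q |f - f_Q|\,dy = \sup_Q \frac{1}{|Q|}\int_Q |f - f_Q|\,dy = \|f\|_{\mathrm{BMO}}$, since every cube contains some point and the essential supremum over $x$ of a quantity depending only on the cubes through $x$ recovers the supremum over all cubes.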

Connections Master

  • The Calderón-Zygmund decomposition 02.19.02. The direct prerequisite and the engine of the John-Nirenberg proof. One stopping-time step at height on produces the disjoint subcubes and the measure contraction that make the distributional recursion self-similar; iterating that single decomposition is exactly what converts bounded mean oscillation into exponential decay, so the John-Nirenberg inequality is the decomposition applied to itself across scales.

  • Hardy-Littlewood maximal function and the Vitali covering lemma 02.19.01. The direct prerequisite supplying the sharp maximal function's domination $f^\# \le C_n\,Mf$, the Lebesgue differentiation theorem used to control the good part off the selected cubes, and the pointwise bound $|f| \le Mf$ a.e. that closes the Fefferman-Stein equivalence $\|f\|_{L^p} \approx \|f^\#\|_{L^p}$.

  • spaces, Hölder, Minkowski, Riesz-Fischer completeness 02.07.06. The direct prerequisite carrying the layer-cake distribution-function calculus behind the -BMO seminorm comparison and the exponential-integrability corollary, the Cauchy-Schwarz estimate in the proof, and the Marcinkiewicz interpolation that turns the - endpoints into boundedness on the full range.

  • Singular integral operators and the Hilbert transform [forward: 02.19.03]. The principal application. Calderón-Zygmund operators fail to preserve $L^\infty$, and $\mathrm{BMO}$ is precisely the space into which they map it; the commutator $[b, T]$ is $L^p$-bounded exactly when $b \in \mathrm{BMO}$, making BMO the natural symbol class for singular-integral multiplication.

  • Hardy spaces and atomic decomposition [forward: 02.20.02]. The dual partner. The Fefferman duality $(H^1)^* = \mathrm{BMO}$ pairs the mean-zero $L^2$-normalized atoms of $H^1$ against BMO functions; the absolute convergence of the pairing is supplied by the John-Nirenberg integrability proved here.

  • Littlewood-Paley theory and square functions [forward: 02.20.03]. The analytic face. The Carleson-measure characterization expresses $\|f\|_{\mathrm{BMO}}$ through the Littlewood-Paley square function of the Poisson extension, embedding BMO into the tent-space and square-function framework that organises the rest of this chapter.

Historical & philosophical context Master

The space $\mathrm{BMO}$ and the inequality that bears their names were introduced by Fritz John and Louis Nirenberg in their 1961 Communications on Pure and Applied Mathematics paper *On functions of bounded mean oscillation* [John-Nirenberg 1961]. John had arrived at the mean-oscillation condition through his work on elasticity and quasi-isometric (bi-Lipschitz) deformations, where the logarithm of the Jacobian of such a map is the prototypical bounded-mean-oscillation function; the exponential integrability they proved was the tool needed to control such deformations. Their original argument is the Calderón-Zygmund stopping-time iteration reproduced above, run on the level sets of the oscillation.

The decisive structural role of $\mathrm{BMO}$ in harmonic analysis emerged a decade later. Charles Fefferman announced the duality $(H^1)^* = \mathrm{BMO}$ in his 1971 Bulletin of the American Mathematical Society note *Characterizations of bounded mean oscillation* [Fefferman 1971], identifying the dual of the real Hardy space $H^1$ (until then an analytically awkward object) with the concrete and computable space of mean oscillation, and supplying the Carleson-measure characterization through the Poisson extension. The full theory, including the sharp maximal function and the systematic use of BMO as the $L^\infty$-substitute endpoint for singular integrals, was developed by Fefferman and Elias Stein in their 1972 Acta Mathematica paper *$H^p$ spaces of several variables* [Fefferman-Stein 1972]. The commutator characterization, completing the operator-theoretic picture, was established by Ronald Coifman, Richard Rochberg, and Guido Weiss in 1976. Within this lineage the John-Nirenberg inequality is the technical keystone: it is the self-improvement that makes mean oscillation strong enough to serve as a dual space and as an interpolation endpoint.

Bibliography Master

@article{JohnNirenberg1961,
  author  = {John, Fritz and Nirenberg, Louis},
  title   = {On functions of bounded mean oscillation},
  journal = {Communications on Pure and Applied Mathematics},
  volume  = {14},
  year    = {1961},
  pages   = {415--426}
}

@article{Fefferman1971,
  author  = {Fefferman, Charles},
  title   = {Characterizations of bounded mean oscillation},
  journal = {Bulletin of the American Mathematical Society},
  volume  = {77},
  year    = {1971},
  pages   = {587--588}
}

@article{FeffermanStein1972,
  author  = {Fefferman, Charles and Stein, Elias M.},
  title   = {$H^p$ spaces of several variables},
  journal = {Acta Mathematica},
  volume  = {129},
  year    = {1972},
  pages   = {137--193}
}

@article{CoifmanRochbergWeiss1976,
  author  = {Coifman, Ronald R. and Rochberg, Richard and Weiss, Guido},
  title   = {Factorization theorems for Hardy spaces in several variables},
  journal = {Annals of Mathematics},
  volume  = {103},
  year    = {1976},
  pages   = {611--635}
}

@book{Stein1993,
  author    = {Stein, Elias M.},
  title     = {Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals},
  publisher = {Princeton University Press},
  year      = {1993}
}

@book{Grafakos2014Modern,
  author    = {Grafakos, Loukas},
  title     = {Modern Fourier Analysis},
  edition   = {3},
  publisher = {Springer},
  year      = {2014}
}

@book{Duoandikoetxea2001,
  author    = {Duoandikoetxea, Javier},
  title     = {Fourier Analysis},
  publisher = {American Mathematical Society},
  year      = {2001}
}