02.19.02 · analysis / calderon-zygmund-singular-integrals

The Calderón-Zygmund Decomposition

Anchor (Master): Stein 1993 *Harmonic Analysis* (Princeton) Ch. I §3; Duoandikoetxea 2001 *Fourier Analysis* (AMS) §2; Stein 1970 *Singular Integrals* (Princeton) Ch. I

Intuition Beginner

Suppose you have a quantity spread out over space — think of rainfall measured across a wide region. Most of the region gets a gentle, manageable amount, but a few small patches catch a downpour. If you want to study the whole rainfall pattern, it helps to split it into two clean pieces: a smooth background that never exceeds some manageable level, plus a collection of isolated storm cells where all the heavy stuff is concentrated. The background you can handle with crude tools because it is bounded; the storm cells you handle one at a time because each is small and you know exactly where it sits.

The Calderón-Zygmund decomposition does precisely this for an integrable function. You pick a height, call it $\lambda$, and you ask: where is the function, on average over small boxes, bigger than $\lambda$? Those boxes are the storm cells. Everywhere else the local averages stay below $\lambda$, and you keep that part as the smooth background. The clever bookkeeping is that the storm cells cannot be too plentiful: the total area they occupy is controlled by how much total mass the function has, divided by the height $\lambda$. Raise the threshold and the storm cells shrink away; lower it and they spread.

Why bother carving a function up this way? Because many of the deepest tools in analysis only work on functions that are either bounded or supported on a small set with a special averaging property. A general integrable function is neither. The decomposition manufactures both situations at once: a bounded background and a list of localized bumps, each averaging to zero so its oscillation does the work. Almost every estimate that controls a wild operator at the borderline — where the function is merely integrable and nothing stronger — runs through this single split.

The one-sentence takeaway: cut an integrable function at a height $\lambda$ into a bounded good part and a sum of small, mean-zero bad bumps, with the bumps occupying a total area no larger than the mass divided by $\lambda$, and you have the universal engine for borderline estimates.

Visual Beginner

Picture the graph of a positive function on a line, with a horizontal dashed line drawn across it at height $\lambda$. Now imagine sliding boxes along the line and computing the average height of the function inside each box. Where those box-averages climb above $\lambda$, mark the box. The marked boxes cluster around the tall spikes of the function. Keep growing or shrinking the boxes by a stopping rule until you have a tidy, non-overlapping collection of marked boxes that together capture every place the function runs hot.

The lower panel shows the payoff. The function becomes two graphs. The first is the good part: capped at a fixed multiple of the threshold, flat and tame. The second is the bad part: a row of localized bumps sitting exactly over the marked boxes, each drawn with as much area above the axis as below, a reminder that every bump averages to zero over its own box. The marked boxes never overlap, so adding up their lengths is honest bookkeeping, and that total length is what the mass-over-$\lambda$ rule controls.

Worked example Beginner

We split a simple function at a chosen height and read off the two parts by hand.

Step 1. Work on the line and let be the function equal to on the interval from to , and everywhere else. Its total mass — the area under it — is times , which is . Choose the height .

Step 2. Find where the local averages exceed . Over the interval from to the average of is , which is above . Over a very large window the average is tiny, below . The interval from to is the one box whose average sits above the threshold, so it becomes our single storm cell .

Step 3. Build the good part . Outside the function is already , which is below , so we keep it. Inside we replace by its average over , namely . So equals on and outside. Notice the cap: here the average exceeds , because in one dimension the bound on is the height times a fixed factor, not exactly — the average over a stopping box stays within a bounded multiple of .

Step 4. Build the bad part . By definition . On that is , and outside it is . In this clean example vanishes because was already constant on its box. The total length of storm cells is the length of , which is ; the rule says this is at most the mass divided by , that is , and indeed is at most .

What this tells us: the decomposition replaces a function by its box-average on the heavy boxes (the good part) and records the leftover oscillation as mean-zero bumps (the bad part). When the function is already flat on a box the bump is empty; when it wiggles, the bump captures exactly the wiggle. The mass-over-$\lambda$ bound on the total length of boxes is the quantitative heart, and it holds for every integrable function and every height $\lambda$.
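The same bookkeeping can be sketched in code. This is a minimal illustration, not the source's own example: the data are hypothetical, the function is assumed to be a step function on $2^N$ equal cells of $[0,1)$ and zero elsewhere, and the height is assumed to be at least the average of $|f|$ over $[0,1)$ so that no interval of length $1$ or more is selected. The name `cz_decompose` is invented for this sketch.

```python
from fractions import Fraction

def cz_decompose(values, lam):
    """Dyadic Calderon-Zygmund split of a step function on [0, 1).

    values: f on 2**N equal cells of [0, 1) (f is taken to be 0 elsewhere).
    lam:    the height; assumed >= the average of |f| over [0, 1),
            so that no interval of length >= 1 is selected.
    Returns (cells, g, b): selected index ranges (lo, hi), good part, bad part.
    """
    vals = [Fraction(v) for v in values]
    lam = Fraction(lam)
    n = len(vals)
    assert n > 0 and n & (n - 1) == 0, "need a power-of-two number of cells"
    assert sum(abs(v) for v in vals) / n <= lam, "height too low for this sketch"

    cells = []

    def visit(lo, hi):
        # [lo, hi) was NOT selected; test its two dyadic children.
        if hi - lo == 1:
            return
        mid = (lo + hi) // 2
        for a, c in ((lo, mid), (mid, hi)):
            average = sum(abs(v) for v in vals[a:c]) / (c - a)
            if average > lam:
                cells.append((a, c))   # stop: the average first exceeds lam here
            else:
                visit(a, c)            # keep subdividing

    visit(0, n)

    g = list(vals)
    for a, c in cells:
        mean = sum(vals[a:c]) / (c - a)
        g[a:c] = [mean] * (c - a)      # good part: replace f by its cell average
    b = [fi - gi for fi, gi in zip(vals, g)]  # bad part: the leftover oscillation
    return cells, g, b

# Hypothetical data: f = 8 on [3/4, 7/8), 4 on [7/8, 1), 0 elsewhere; height 2.
f = [0, 0, 0, 0, 0, 0, 8, 4]
cells, g, b = cz_decompose(f, 2)
```

With these hypothetical numbers the single storm cell is the index range `(4, 8)`, that is the interval $[1/2, 1)$. The good part equals $3$ there, which exceeds the height $2$ but stays below $2 \cdot 2 = 4$, and the bad part on the cell is $(-3, -3, 5, 1)$, which sums to zero. The cell has length $1/2$, within the mass-over-height bound $(3/2)/2 = 3/4$.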

Formal definition Intermediate+

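Throughout, $\mathcal{D}$ denotes the family of dyadic cubes of $\mathbb{R}^n$: cubes of the form $Q = 2^{-k}([0,1)^n + m)$ with $k \in \mathbb{Z}$ and $m \in \mathbb{Z}^n$. Distinct dyadic cubes are either nested or disjoint, and each dyadic cube has a unique dyadic parent of twice the side length and $2^n$ times the volume. For a cube $Q$ write $f_Q = \frac{1}{|Q|}\int_Q f$ for the average of $f$ over $Q$.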
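As a concrete handle on the indexing, the following sketch represents a dyadic cube $2^{-k}([0,1)^n + m)$ by the pair `(k, m)` and computes the cube containing a point, its parent, and a containment test. The representation and the function names are assumptions made for this illustration.

```python
import math

def dyadic_cube(x, k):
    """Return (k, m) for the dyadic cube 2**-k * ([0,1)^n + m) containing the point x."""
    return (k, tuple(math.floor(coord * 2**k) for coord in x))

def parent(cube):
    """Dyadic parent: twice the side length, hence 2**n times the volume."""
    k, m = cube
    return (k - 1, tuple(mi // 2 for mi in m))

def contains(outer, inner):
    """True if the dyadic cube `inner` lies inside the dyadic cube `outer`."""
    # Walk up from `inner` until its scale matches `outer`, then compare.
    while inner[0] > outer[0]:
        inner = parent(inner)
    return inner == outer

q = dyadic_cube((0.3, 0.7), 2)    # side 1/4, lower corner (1/4, 1/2)
p = parent(q)                     # side 1/2, lower corner (0, 1/2)
r = dyadic_cube((0.9, 0.1), 2)    # a different cube of side 1/4
```

Here `p` contains `q`, while `q` and `r` have the same side length and different indices, so neither contains the other: two dyadic cubes are nested or disjoint.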

Definition (Calderón-Zygmund decomposition). Let $f \in L^1(\mathbb{R}^n)$ and $\lambda > 0$. A Calderón-Zygmund decomposition of $f$ at height $\lambda$ is a representation $f = g + b$, $b = \sum_j b_j$, together with a countable collection of pairwise disjoint dyadic cubes $\{Q_j\}$, such that:

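  1. (Good part bounded.) $\|g\|_\infty \le 2^n \lambda$ and $\|g\|_1 \le \|f\|_1$.
  2. (Bad part localized.) Each $b_j$ is supported in $Q_j$, and $b$ is supported in $\bigcup_j Q_j$.
  3. (Mean zero.) $\int_{Q_j} b_j = 0$ for every $j$.
  4. (Controlled mass.) $\lambda < \frac{1}{|Q_j|}\int_{Q_j} |f| \le 2^n \lambda$, equivalently $\lambda |Q_j| < \int_{Q_j} |f| \le 2^n \lambda |Q_j|$, and $\|b_j\|_1 \le 2\int_{Q_j} |f| \le 2^{n+1} \lambda |Q_j|$.
  5. (Total measure bound.) $\sum_j |Q_j| \le \frac{1}{\lambda}\|f\|_1$.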

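The cubes are obtained by the dyadic stopping-time selection: $\{Q_j\}$ ranges over the maximal dyadic cubes $Q$ with $\frac{1}{|Q|}\int_Q |f| > \lambda$. On each such cube one sets $g = f_{Q_j} = \frac{1}{|Q_j|}\int_{Q_j} f$ and $b_j = (f - f_{Q_j})\mathbf{1}_{Q_j}$; off $\bigcup_j Q_j$ one sets $g = f$ and $b = 0$.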

Definition (good-/bad-part splitting). The induced splitting of any sublinear operator estimate for $|\{|Tf| > \lambda\}|$ at height $\lambda$ into a good-part contribution $|\{|Tg| > \lambda/2\}|$ and a bad-part contribution $|\{|Tb| > \lambda/2\}|$ is the good-/bad-part splitting. The good part is controlled by an $L^2$ (or $L^p$) estimate; the bad part is controlled by its mean-zero cancellation against the smoothness of the operator kernel away from $\bigcup_j Q_j^*$, the union of the doubled cubes.
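For orientation, here is how the good-part contribution is usually closed. This is a sketch under the assumption that $T$ is bounded on $L^2$, with its operator norm written $\|T\|$ (a symbol introduced here), using Chebyshev's inequality and the two bounds on the good part, $\|g\|_\infty \le 2^n\lambda$ and $\|g\|_1 \le \|f\|_1$:

$$
\big|\{|Tg| > \lambda/2\}\big|
\;\le\; \frac{4}{\lambda^2}\,\|Tg\|_2^2
\;\le\; \frac{4\|T\|^2}{\lambda^2}\,\|g\|_2^2
\;\le\; \frac{4\|T\|^2}{\lambda^2}\,\|g\|_\infty\,\|g\|_1
\;\le\; \frac{2^{n+2}\|T\|^2}{\lambda}\,\|f\|_1 .
$$

The gain is that an $L^2$ estimate, applied to a function that is only known to be integrable, still produces a bound of the weak-type form $C\lambda^{-1}\|f\|_1$.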

Counterexamples to common slips Intermediate+

  • The good part is bounded by $2^n\lambda$, not by $\lambda$. On a stopping cube $Q_j$ the value of $g$ is the average $f_{Q_j} = \frac{1}{|Q_j|}\int_{Q_j} f$, and the average of $|f|$ over $Q_j$ exceeds $\lambda$ by selection. The bound comes from the parent $\widehat{Q}_j$, which was not selected, so $\frac{1}{|\widehat{Q}_j|}\int_{\widehat{Q}_j} |f| \le \lambda$ and $\frac{1}{|Q_j|}\int_{Q_j} |f| \le \frac{|\widehat{Q}_j|}{|Q_j|} \cdot \frac{1}{|\widehat{Q}_j|}\int_{\widehat{Q}_j} |f| \le 2^n\lambda$. Forgetting the parent step gives the wrong constant.

  • Selection must take maximal cubes, not all cubes. The family $\{Q \in \mathcal{D} : \frac{1}{|Q|}\int_Q |f| > \lambda\}$ is not closed under passing to larger cubes, but each such $Q$ is contained in a largest such cube because $|Q| < \|f\|_1/\lambda$ bounds the side length. Choosing maximal cubes makes the $Q_j$ disjoint; choosing all of them double-counts mass.

  • Mean zero is a property of each $b_j$, not of $b$ globally in a useful sense. Each $b_j$ integrates to zero over its own cube, which is what the kernel-cancellation argument needs cube by cube. The global statement $\int b = 0$ is true but too weak; the per-cube cancellation is the load-bearing fact.

  • The decomposition is for $L^1$ functions and a fixed $\lambda$, not a single canonical object. Different heights produce genuinely different cube families and different good/bad splits of the same $f$. There is no decomposition independent of $\lambda$; the $\lambda$-dependence is the point, since estimates integrate over $\lambda$ via the distribution function.
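Two of these slips can be seen numerically. The sketch below uses a hypothetical integer-valued step function on 8 equal cells of $[0,1)$, zero elsewhere, and heights chosen at least as large as the average over $[0,1)$, so that only intervals inside $[0,1)$ can be heavy. It lists all heavy dyadic intervals and then keeps the maximal ones. All names are invented for this illustration.

```python
from fractions import Fraction

def heavy_intervals(values, lam):
    """All dyadic index ranges (lo, hi) in [0, n) whose average of |f| exceeds lam.

    Assumes integer cell values and n = len(values) a power of two.
    """
    n = len(values)
    found = []
    size = n
    while size >= 1:
        for lo in range(0, n, size):
            hi = lo + size
            average = Fraction(sum(abs(v) for v in values[lo:hi]), size)
            if average > lam:
                found.append((lo, hi))
        size //= 2
    return found

def maximal(intervals):
    """Keep only the intervals not contained in another interval of the list."""
    return [i for i in intervals
            if not any(j != i and j[0] <= i[0] and i[1] <= j[1] for j in intervals)]

def total_length(intervals, n):
    """Total length of the index ranges, each cell having length 1/n."""
    return Fraction(sum(hi - lo for lo, hi in intervals), n)

f = [0, 0, 0, 0, 0, 0, 8, 4]       # hypothetical step function on 8 cells of [0, 1)
mass = Fraction(sum(f), len(f))    # ||f||_1 = 3/2

all_at_2 = heavy_intervals(f, 2)   # every heavy interval at height 2
max_at_2 = maximal(all_at_2)       # the stopping cubes at height 2
max_at_5 = maximal(heavy_intervals(f, 5))  # the stopping cubes at height 5
```

At height $2$ the heavy intervals are nested, and adding up all of them gives total length $1$, which violates the bound $\|f\|_1/\lambda = 3/4$; the maximal family is the single range `(4, 8)` of length $1/2$. At height $5$ the maximal family is the different range `(6, 8)`, so the cube family genuinely depends on the height.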

Key theorem with proof Intermediate+

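Theorem (Calderón-Zygmund decomposition; Calderón-Zygmund 1952 Acta Math. 88, 85; dyadic form Stein 1970 Singular Integrals Ch. I §3). Let $f \in L^1(\mathbb{R}^n)$ and $\lambda > 0$. Then there exist a function $g$, functions $b_j$, and pairwise disjoint dyadic cubes $Q_j$ satisfying properties (1) through (5) of the Formal definition. In particular $f = g + \sum_j b_j$ with $\|g\|_\infty \le 2^n\lambda$, $\|g\|_1 \le \|f\|_1$, $\|b_j\|_1 \le 2^{n+1}\lambda|Q_j|$, each $b_j$ supported on $Q_j$ with $\int b_j = 0$, and $\sum_j |Q_j| \le \lambda^{-1}\|f\|_1$.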

Proof. Set $\Omega_\lambda = \{x : M_d f(x) > \lambda\}$, where $M_d$ is the dyadic maximal operator of 02.19.01.

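Step 1 (stopping-time selection of maximal cubes). For each $x$ with $M_d f(x) > \lambda$ there is a dyadic cube $Q \ni x$ with $\frac{1}{|Q|}\int_Q |f| > \lambda$. Since $f \in L^1$, any such cube satisfies $|Q| < \lambda^{-1}\int_Q |f| \le \lambda^{-1}\|f\|_1$, so its side length is bounded; hence among the dyadic cubes through $x$ with average exceeding $\lambda$ there is a *maximal* one. Let $\{Q_j\}$ be the collection of all cubes that are maximal with respect to inclusion among dyadic cubes $Q$ with $\frac{1}{|Q|}\int_Q |f| > \lambda$. By the nested-or-disjoint dichotomy of dyadic cubes together with maximality, the $Q_j$ are pairwise disjoint, and $\bigcup_j Q_j = \{M_d f > \lambda\} = \Omega_\lambda$.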

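Step 2 (average control on each selected cube). Fix a selected cube $Q_j$ and let $\widehat{Q}_j$ be its dyadic parent. By maximality of $Q_j$, the strictly larger cube $\widehat{Q}_j$ does not satisfy the selection inequality, so $\frac{1}{|\widehat{Q}_j|}\int_{\widehat{Q}_j} |f| \le \lambda$. Since $Q_j \subset \widehat{Q}_j$ and $|\widehat{Q}_j| = 2^n |Q_j|$,

$$
\lambda \;<\; \frac{1}{|Q_j|}\int_{Q_j} |f| \;\le\; \frac{|\widehat{Q}_j|}{|Q_j|} \cdot \frac{1}{|\widehat{Q}_j|}\int_{\widehat{Q}_j} |f| \;\le\; 2^n\lambda .
$$

This is property (4): the average of $|f|$ over each selected cube lies in $(\lambda, 2^n\lambda]$.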

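Step 3 (definition of $g$ and $b$). Write $\Omega_\lambda = \bigcup_j Q_j$ and $f_{Q_j} = \frac{1}{|Q_j|}\int_{Q_j} f$. Set

$$
g = f\,\mathbf{1}_{\mathbb{R}^n \setminus \Omega_\lambda} + \sum_j f_{Q_j}\,\mathbf{1}_{Q_j}, \qquad b_j = (f - f_{Q_j})\,\mathbf{1}_{Q_j}, \qquad b = \sum_j b_j .
$$

On $\Omega_\lambda$ each $x$ lies in exactly one $Q_j$ by disjointness, so $g$ and $b$ are well defined and $f = g + b$ everywhere: off $\Omega_\lambda$, $g = f$ and $b = 0$; on $Q_j$, $g + b = f_{Q_j} + (f - f_{Q_j}) = f$.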

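Step 4 (good part is bounded). Write $\Omega_\lambda = \bigcup_j Q_j$ and $f_{Q_j}$ for the average of $f$ over $Q_j$. For $x \notin \Omega_\lambda$, the Lebesgue differentiation theorem 02.19.01 gives $|f(x)| \le \lambda$ for a.e. such $x$, since every small dyadic cube through $x$ has average of $|f|$ at most $\lambda$ (otherwise $x \in \Omega_\lambda$); thus $|g| = |f| \le \lambda$ a.e. off $\Omega_\lambda$. For $x \in Q_j$, $|g(x)| = |f_{Q_j}| \le \frac{1}{|Q_j|}\int_{Q_j} |f| \le 2^n\lambda$ by Step 2. Hence $\|g\|_\infty \le 2^n\lambda$. Moreover $\|g\|_1 \le \|f\|_1$: off $\Omega_\lambda$ they agree, and on each $Q_j$, $\int_{Q_j} |g| = |Q_j|\,|f_{Q_j}| \le \int_{Q_j} |f|$ by Jensen. So $g \in L^1 \cap L^\infty$.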

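Step 5 (bad part: support, mean zero, mass). Write $f_{Q_j}$ for the average of $f$ over $Q_j$. By construction $b_j = (f - f_{Q_j})\mathbf{1}_{Q_j}$ is supported in $Q_j$ and $\int_{Q_j} b_j = \int_{Q_j} f - |Q_j|\,f_{Q_j} = 0$, which is property (3). For the mass, the triangle inequality and Step 2 give $\|b_j\|_1 \le \int_{Q_j} |f| + |Q_j|\,|f_{Q_j}| \le 2\int_{Q_j} |f| \le 2^{n+1}\lambda|Q_j|$, property (4).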

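Step 6 (total measure bound). Each selected cube satisfies $|Q_j| < \lambda^{-1}\int_{Q_j} |f|$ from the selection inequality $\frac{1}{|Q_j|}\int_{Q_j} |f| > \lambda$. Summing over the disjoint family, $\sum_j |Q_j| \le \lambda^{-1}\sum_j \int_{Q_j} |f| \le \lambda^{-1}\|f\|_1$, property (5). All five properties hold.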

Bridge. The decomposition builds toward the weak-type $(1,1)$ boundedness of every Calderón-Zygmund singular integral operator, and the same stopping-time selection appears again in the dyadic-cube proofs of the $T(1)$ theorem, the John-Nirenberg inequality, and the modern theory of sparse domination. The central insight is that the maximal-function level set $\{M_d f > \lambda\}$ is the bad set: selecting the maximal cubes where the average first crosses $\lambda$ is exactly the act of producing the disjoint family $\{Q_j\}$, so this is the foundational reason a single $L^1$ hypothesis splits into a bounded $L^2$-controllable good part and a mean-zero $L^1$-controllable bad part. Putting these together, the weak-$(1,1)$ proof of 02.19.01 for the maximal operator and the decomposition here are dual faces of one stopping-time argument: the bound $|\{M_d f > \lambda\}| \le \lambda^{-1}\|f\|_1$ that controlled the maximal level set now controls the support of the bad part, and the bridge is that the good-/bad-part splitting generalises the layer-cake estimate to operators that see cancellation, not just size.

Exercises Intermediate+

Advanced results Master

Theorem 1 (weak-type $(1,1)$ for Calderón-Zygmund operators; Calderón-Zygmund 1952 Acta Math. 88, 85). Let $T$ be bounded on $L^2(\mathbb{R}^n)$ and given off-diagonal by a kernel $K$ satisfying the Hörmander integral condition. Then $T$ extends to a bounded operator of weak type $(1,1)$, with bound as in Exercise 7, and consequently is bounded on $L^p$ for $1 < p < \infty$ (by interpolation for $1 < p \le 2$, and by duality for $2 < p < \infty$ when the adjoint kernel satisfies the same condition). The decomposition is the only substantive ingredient: the good part is handled by the $L^2$ bound and Chebyshev, the bad part by the mean-zero cancellation against the kernel's regularity away from the doubled cubes [Calderón-Zygmund 1952].
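A sketch of the bad-part estimate may help here. It rests on assumptions the text above does not spell out, so they are labeled: the Hörmander condition is taken in the form $\int_{|x-y'| \ge c|y-y'|} |K(x,y) - K(x,y')|\,dx \le A$ for all $y, y'$, and the constant $c$ is assumed to match the dilation defining the enlarged cubes $Q_j^*$, so that $x \notin Q_j^*$ and $y \in Q_j$ imply $|x - y_j| \ge c|y - y_j|$, where $y_j$ is the centre of $Q_j$. The symbols $A$, $c$, $y_j$ are introduced for this illustration. Since $\int b_j = 0$, the term $K(x,y_j)$ can be subtracted for free:

$$
\int_{\mathbb{R}^n \setminus Q_j^*} |Tb_j(x)|\,dx
= \int_{\mathbb{R}^n \setminus Q_j^*} \Big| \int_{Q_j} \big[K(x,y) - K(x,y_j)\big]\, b_j(y)\,dy \Big|\,dx
\le \int_{Q_j} |b_j(y)| \int_{|x-y_j| \ge c|y-y_j|} |K(x,y) - K(x,y_j)|\,dx\,dy
\le A\,\|b_j\|_1 .
$$

Summing over $j$ (formally, using $|Tb| \le \sum_j |Tb_j|$), applying Chebyshev, and using $\|b_j\|_1 \le 2\int_{Q_j}|f|$ with the $Q_j$ disjoint:

$$
\Big|\Big\{x \notin \bigcup_j Q_j^* : |Tb(x)| > \lambda/2\Big\}\Big|
\le \frac{2}{\lambda}\sum_j A\,\|b_j\|_1
\le \frac{4A}{\lambda}\,\|f\|_1 .
$$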

Theorem 2 (Hörmander's refinement; Hörmander 1960 Acta Math. 104, 93). The smoothness hypothesis on the kernel can be weakened from the pointwise gradient bound of Calderón-Zygmund to the integral (now-called Hörmander) condition used in Exercise 7. This is the minimal regularity under which the good-/bad-part argument closes, and it is the form in which the theorem propagates to operators whose kernels are merely Dini-continuous rather than $C^1$ [Hörmander 1960].

Theorem 3 (Fefferman-Stein sharp maximal characterization; Fefferman-Stein 1972 Acta Math. 129, 137). The decomposition at a continuum of heights is the engine behind the sharp maximal function $f^{\#}$ and the norm equivalence $\|f\|_p \approx \|f^{\#}\|_p$ for $1 < p < \infty$ on functions with suitable decay. Iterating the stopping-time selection, with each bad cube re-decomposed at a higher height, produces the good-$\lambda$ inequalities that prove the equivalence [Fefferman-Stein 1972].

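Theorem 4 (vector-valued Calderón-Zygmund decomposition). For $f \in L^1(\mathbb{R}^n; H)$ with values in a Hilbert space $H$, the identical stopping-time selection (now of cubes where $\frac{1}{|Q|}\int_Q \|f\|_H > \lambda$) yields $f = g + \sum_j b_j$ with $\|g(x)\|_H \le 2^n\lambda$, each $b_j$ in $L^1(Q_j; H)$, and $\sum_j |Q_j| \le \lambda^{-1}\|f\|_{L^1(H)}$. This vector-valued form drives the Littlewood-Paley and Fourier-multiplier theory, where $H = \ell^2$ carries a sequence of frequency-localized pieces and $\|f\|_H$ tracks the square function [Stein 1970].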

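Theorem 5 (decomposition relative to a doubling measure). If $\mu$ is a doubling Borel measure on $\mathbb{R}^n$ (so $\mu(2Q) \le C_\mu\,\mu(Q)$ for all cubes $Q$), the dyadic stopping-time selection at height $\lambda$ produces cubes $Q_j$ with $\lambda < \frac{1}{\mu(Q_j)}\int_{Q_j} |f|\,d\mu \le C\lambda$ and $\sum_j \mu(Q_j) \le \lambda^{-1}\|f\|_{L^1(\mu)}$, with $2^n$ replaced by a constant $C$ depending only on the doubling constant $C_\mu$. This is the form that underlies singular-integral theory on spaces of homogeneous type, where dyadic cubes are replaced by the Christ-David dyadic systems [Stein 1993].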

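Theorem 6 ($H^1$-atomic connection; Coifman-Weiss). The bad bumps $b_j$, after normalization, share the support and cancellation of the atoms of the Hardy space $H^1(\mathbb{R}^n)$: each is supported in $Q_j$, has mean zero, and satisfies $\|b_j\|_1 \le 2^{n+1}\lambda|Q_j|$. This is an $L^1$ size normalization; a genuine $H^1$-atom obeys the stronger bound $\|a\|_\infty \le |Q|^{-1}$, so the $b_j$ are atom-like rather than atoms in the strict sense. The Calderón-Zygmund decomposition is in this sense the constructive bridge from $L^1$ to the atomic decomposition of $H^1$, and the weak-$(1,1)$ bound for $T$ is the shadow on $L^1$ of the strong bound $T : H^1 \to L^1$ that holds because $T$ maps atoms to integrable functions with uniformly controlled $L^1$-norm [Fefferman-Stein 1972].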

Synthesis. The Calderón-Zygmund decomposition is the foundational reason that a single quantitative hypothesis ($L^2$-boundedness plus kernel regularity) propagates to the entire scale $L^p$, $1 < p < \infty$, and to the endpoint $p = 1$ in weak form, and this is exactly the structural device that converts a size hypothesis into a cancellation hypothesis. The central insight is a dictionary: the stopping-time selection of maximal dyadic cubes is dual to the maximal-function level set of 02.19.01, so the bound $|\{M_d f > \lambda\}| \le \lambda^{-1}\|f\|_1$ that proved weak-$(1,1)$ for the maximal operator is the same bound that controls the support of the bad part here. Putting these together, the good-/bad-part splitting generalises in three directions that recur throughout harmonic analysis: vertically, from Lebesgue measure to arbitrary doubling and homogeneous-type measures via the Christ-David cubes (Theorem 5); horizontally, from scalar to Hilbert-valued functions via the vector-valued decomposition that powers Littlewood-Paley theory (Theorem 4); and structurally, from $L^1$ to the Hardy space $H^1$, where the bad bumps are revealed to be atom-like (Theorem 6) and the weak-$(1,1)$ bound is the trace of a strong $H^1 \to L^1$ estimate. The bridge is that every borderline boundedness theorem in the subject (singular integrals, the $T(1)$ theorem, John-Nirenberg, sparse domination) routes through this one act of cutting at height $\lambda$, and the central insight unifying them is that selecting the heavy cubes is simultaneously a statement about the maximal function and a statement about cancellation.

Full proof set Master

Proposition 1 (disjointness of the selected cubes). The maximal dyadic cubes $Q_j$ with $\frac{1}{|Q_j|}\int_{Q_j} |f| > \lambda$ are pairwise disjoint, and $\bigcup_j Q_j = \{M_d f > \lambda\}$.

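Proof. Two dyadic cubes are nested or disjoint. If $Q_i \cap Q_j \neq \emptyset$ with $i \neq j$, one contains the other, say $Q_i \subsetneq Q_j$; then $Q_i$ is not maximal among cubes with average $> \lambda$ (it sits inside the larger $Q_j$, which also has average $> \lambda$), contradicting selection. So the $Q_j$ are disjoint. For the union: $M_d f(x) > \lambda$ iff some dyadic $Q \ni x$ has $\frac{1}{|Q|}\int_Q |f| > \lambda$, iff (by the side-length bound giving a maximal element) $x$ lies in a maximal such cube, iff $x \in \bigcup_j Q_j$.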

Proposition 2 (the cap $2^n\lambda$ is sharp in order). There is $f$ and $\lambda$ for which the good part attains $\|g\|_\infty$ comparable to $2^n\lambda$ (here $n = 1$, so $2\lambda$).

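Proof. Take $f = c\,\mathbf{1}_{[0,1)}$ with $c$ slightly below $2\lambda$, and $\lambda$ chosen so the average of $f$ over $[0,1)$ is $c > \lambda$ while the average over the parent $[0,2)$ is $c/2 \le \lambda$, i.e. $\lambda < c \le 2\lambda$. Then $[0,1)$ is the maximal selected interval, $g = c$ on it, and $\|g\|_\infty = c$, which approaches $2\lambda$ as $c \uparrow 2\lambda$. The bound cannot be improved below the parent-volume ratio $2^n$.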

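Proposition 3 (good and bad parts are genuinely in their stated spaces). $g \in L^1 \cap L^\infty \subset L^2$ and each $b_j \in L^1$ with $\|b_j\|_1 \le 2\int_{Q_j} |f|$; moreover $b \in L^1$ with $\|b\|_1 \le 2\|f\|_1$.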

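Proof. The bounds $\|g\|_\infty \le 2^n\lambda$ and $\|g\|_1 \le \|f\|_1$ are Step 4 of the Key Theorem; their conjunction gives $g \in L^2$ via $\|g\|_2^2 \le \|g\|_\infty \|g\|_1 \le 2^n\lambda\|f\|_1$. Each $b_j$ is a difference of $L^1$ functions supported in $Q_j$, hence $b_j \in L^1$, with $\|b_j\|_1 \le 2\int_{Q_j} |f|$ by Step 5. Disjoint supports give $\|b\|_1 = \sum_j \|b_j\|_1 \le 2\|f\|_1$.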

Proposition 4 (the decomposition reproduces the weak-$(1,1)$ measure bound). $|\{M_d f > \lambda\}| \le \lambda^{-1}\|f\|_1$.

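Proof. By Proposition 1 the level set is $\bigcup_j Q_j$ with the $Q_j$ disjoint, and $|Q_j| < \lambda^{-1}\int_{Q_j} |f|$. Summing, $|\{M_d f > \lambda\}| = \sum_j |Q_j| \le \lambda^{-1}\sum_j \int_{Q_j} |f| \le \lambda^{-1}\|f\|_1$.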

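Proposition 5 (good-/bad-part reduction is exhaustive). For any sublinear $T$ and any $\lambda > 0$, $\{|Tf| > \lambda\} \subset \{|Tg| > \lambda/2\} \cup \{|Tb| > \lambda/2\}$, and the second set splits further as $\{|Tb| > \lambda/2\} \subset \bigcup_j Q_j^* \,\cup\, \{x \notin \bigcup_j Q_j^* : |Tb(x)| > \lambda/2\}$, where $Q_j^*$ is the cube concentric with $Q_j$ of twice the side length.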

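Proof. If $|Tg(x)| \le \lambda/2$ and $|Tb(x)| \le \lambda/2$ then sublinearity forces $|Tf(x)| \le |Tg(x)| + |Tb(x)| \le \lambda$, giving the first inclusion. The second is a plain set decomposition by intersecting with $\bigcup_j Q_j^*$ and its complement; the union of doubled cubes has measure at most $2^n\sum_j |Q_j| \le 2^n\lambda^{-1}\|f\|_1$ by Proposition 4, so it is already of the required size and only the off-doubled-cube part needs the kernel estimate.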

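Proposition 6 (atoms). With $a_j = b_j / (2^{n+1}\lambda|Q_j|)$, each $a_j$ is supported in $Q_j$, has $\int a_j = 0$, and $\|a_j\|_1 \le 1$; equivalently $\|b_j\|_1 \le 2^{n+1}\lambda|Q_j|$, the $L^1$ normalization of an atom.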

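Proof. Support and mean zero are inherited from $b_j$. For the size, write $f_{Q_j}$ for the average of $f$ over $Q_j$; then $\|b_j\|_1 \le \int_{Q_j} |f| + |Q_j|\,|f_{Q_j}|$; using only the crude $\int_{Q_j} |f| \le 2^n\lambda|Q_j|$ and $|f_{Q_j}| \le 2^n\lambda$, one gets $\|b_j\|_1 \le 2^{n+1}\lambda|Q_j|$. Hence $\|a_j\|_1 \le 1$, the atomic normalization.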

Connections Master

  • Hardy-Littlewood maximal function and the Vitali covering lemma 02.19.01. The direct prerequisite and dual sibling. The dyadic stopping-time selection used here to build the cubes $Q_j$ is the same selection that produced the weak-$(1,1)$ bound for the dyadic maximal operator there; the level set $\{M_d f > \lambda\}$ is literally the bad set $\bigcup_j Q_j$, and the measure bound $\sum_j |Q_j| \le \lambda^{-1}\|f\|_1$ is the maximal inequality re-read as a statement about the support of the bad part.

  • $L^p$ spaces, Hölder, Minkowski, Riesz-Fischer completeness 02.07.06. The direct prerequisite supplying the function-space scaffolding: the $L^2$ estimate on the good part, the $L^1$ control on the bad part, the Tonelli interchange in the bad-part kernel estimate, and the Marcinkiewicz interpolation that converts the weak-$(1,1)$ plus strong-$(2,2)$ endpoints into the full range $1 < p \le 2$.

  • Singular integral operators and the Hilbert transform [forward: 02.19.03]. The principal successor. Every Calderón-Zygmund operator (the Hilbert transform, the Riesz transforms, convolution with a Calderón-Zygmund kernel) is shown bounded on $L^p$ and of weak type $(1,1)$ by feeding its $L^2$-boundedness and kernel regularity into the good-/bad-part argument proved in Exercises 7 and 8.

  • Littlewood-Paley theory and Fourier multipliers [forward: 02.20.01]. The vector-valued decomposition (Theorem 4) with $H = \ell^2$ is the device that promotes the scalar singular-integral bounds to the square-function and Mikhlin-Hörmander multiplier theorems, where each frequency-localized piece is tracked simultaneously.

  • BMO and the John-Nirenberg inequality [forward: 02.20.04]. The iterated stopping-time selection, re-decomposing each bad cube at a higher height, is the mechanism behind the John-Nirenberg exponential integrability of BMO functions and the Fefferman-Stein sharp-maximal characterization (Theorem 3) that identifies $\mathrm{BMO}$ as the dual of the Hardy space $H^1$.

Historical & philosophical context Master

The decomposition was introduced by Alberto Calderón and Antoni Zygmund in their 1952 Acta Mathematica paper On the existence of certain singular integrals [Calderón-Zygmund 1952], the founding document of the real-variable theory of singular integrals. Their problem was to extend the $L^p$-boundedness of the Hilbert transform, known on the line since Marcel Riesz's 1927 work via complex-analytic methods, to higher-dimensional convolution operators with Calderón-Zygmund kernels, where complex methods are unavailable. The decomposition supplied the missing real-variable tool: it let them prove the weak-type $(1,1)$ endpoint directly, then interpolate. Their original formulation selected cubes by a covering argument on the level set of the function itself; the now-standard dyadic stopping-time form, which makes the parent-cube estimate transparent, was systematized in Elias Stein's 1970 monograph Singular Integrals and Differentiability Properties of Functions [Stein 1970].

Lars Hörmander's 1960 Acta Mathematica paper Estimates for translation invariant operators in $L^p$ spaces [Hörmander 1960] isolated the integral smoothness condition on the kernel that is exactly what the bad-part argument consumes, replacing the pointwise gradient bound of Calderón-Zygmund and thereby covering kernels of merely Dini regularity. The decomposition's reach into the theory of Hardy spaces was made explicit by Charles Fefferman and Elias Stein in their 1972 Acta Mathematica paper $H^p$ spaces of several variables [Fefferman-Stein 1972], where the bad bumps were recognized as atoms and the duality $H^1$-$\mathrm{BMO}$ was established, recasting the weak-$(1,1)$ bound as the trace of a strong estimate on $H^1$.

Bibliography Master

@article{CalderonZygmund1952,
  author  = {Calder\'on, Alberto P. and Zygmund, Antoni},
  title   = {On the existence of certain singular integrals},
  journal = {Acta Mathematica},
  volume  = {88},
  year    = {1952},
  pages   = {85--139}
}

@article{Hormander1960,
  author  = {H\"ormander, Lars},
  title   = {Estimates for translation invariant operators in $L^p$ spaces},
  journal = {Acta Mathematica},
  volume  = {104},
  year    = {1960},
  pages   = {93--140}
}

@article{FeffermanStein1972,
  author  = {Fefferman, Charles and Stein, Elias M.},
  title   = {$H^p$ spaces of several variables},
  journal = {Acta Mathematica},
  volume  = {129},
  year    = {1972},
  pages   = {137--193}
}

@book{Stein1970,
  author    = {Stein, Elias M.},
  title     = {Singular Integrals and Differentiability Properties of Functions},
  publisher = {Princeton University Press},
  year      = {1970}
}

@book{Stein1993,
  author    = {Stein, Elias M.},
  title     = {Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals},
  publisher = {Princeton University Press},
  year      = {1993}
}

@book{Grafakos2014,
  author    = {Grafakos, Loukas},
  title     = {Classical Fourier Analysis},
  edition   = {3},
  publisher = {Springer},
  year      = {2014}
}

@book{Duoandikoetxea2001,
  author    = {Duoandikoetxea, Javier},
  title     = {Fourier Analysis},
  publisher = {American Mathematical Society},
  year      = {2001}
}