02.19.02 · analysis / calderon-zygmund-singular-integrals

The Calderón-Zygmund Decomposition

shipped · 3 tiers · Lean: none

Anchor (Master): Stein 1993 *Harmonic Analysis* (Princeton) Ch. I §3; Duoandikoetxea 2001 *Fourier Analysis* (AMS) §2; Stein 1970 *Singular Integrals* (Princeton) Ch. I

Intuition Beginner

Suppose you have a quantity spread out over space — think of rainfall measured across a wide region. Most of the region gets a gentle, manageable amount, but a few small patches catch a downpour. If you want to study the whole rainfall pattern, it helps to split it into two clean pieces: a smooth background that never exceeds some manageable level, plus a collection of isolated storm cells where all the heavy stuff is concentrated. The background you can handle with crude tools because it is bounded; the storm cells you handle one at a time because each is small and you know exactly where it sits.

The Calderón-Zygmund decomposition does precisely this for an integrable function. You pick a height — call it $λ$ — and you ask: where is the function, on average over small boxes, bigger than $λ$ ? Those boxes are the storm cells. Everywhere else the local averages stay below $λ$ , and you keep that part as the smooth background. The clever bookkeeping is that the storm cells cannot be too plentiful: the total area they occupy is controlled by how much total mass the function has, divided by the height $λ$ . Raise the threshold and the storm cells shrink away; lower it and they spread.

Why bother carving a function up this way? Because many of the deepest tools in analysis only work on functions that are either bounded or supported on a small set with a special averaging property. A general integrable function is neither. The decomposition manufactures both situations at once: a bounded background and a list of localized bumps, each averaging to zero so its oscillation does the work. Almost every estimate that controls a wild operator at the borderline — where the function is merely integrable and nothing stronger — runs through this single split.

The one-sentence takeaway: cut an integrable function at a height $λ$ into a bounded good part and a sum of small, mean-zero bad bumps, with the bumps occupying a total area no larger than the mass divided by $λ$ — and you have the universal engine for borderline estimates.

Visual Beginner

Picture the graph of a positive function on a line, with a horizontal dashed line drawn across it at height $λ$ . Now imagine sliding boxes along the line and computing the average height of the function inside each box. Where those box-averages climb above $λ$ , mark the box. The marked boxes cluster around the tall spikes of the function. Keep growing or shrinking the boxes by a stopping rule until you have a tidy, non-overlapping collection of marked boxes that together capture every place the function runs hot.

The lower panel shows the payoff. The function becomes two graphs. The first is the good part: capped at the threshold, flat and tame. The second is the bad part: a row of localized bumps sitting exactly over the marked boxes, each drawn with as much area above the axis as below, a reminder that every bump averages to zero over its own box. The marked boxes never overlap, so adding up their lengths is honest bookkeeping, and that total length is what the mass-over- $λ$ rule controls.

Worked example Beginner

We split a simple function at a chosen height and read off the two parts by hand.

Step 1. Work on the line and let $f$ be the function equal to $6$ on the interval from $0$ to $1$ , and $0$ everywhere else. Its total mass (the area under it) is $6$ times $1$ , which is $6$ . Choose the height $λ = 4$ .

Step 2. Find where the local averages exceed $λ = 4$ . Over the interval from $0$ to $1$ the average of $f$ is $6$ , which is above $4$ . Doubling the box to the interval from $0$ to $2$ drops the average to $3$ , which is below $4$ , and every larger window has a smaller average still. The interval from $0$ to $1$ is therefore the largest box whose average sits above the threshold, so it becomes our single storm cell $Q_{1} = [0, 1]$ .

Step 3. Build the good part $g$ . Outside $Q_{1}$ the function is already $0$ , which is below $λ$ , so we keep it. Inside $Q_{1}$ we replace $f$ by its average over $Q_{1}$ , namely $6$ . So $g$ equals $6$ on $[0, 1]$ and $0$ outside. Notice the cap: here the average $6$ exceeds $λ = 4$ , because the bound on $g$ is not $λ$ itself but $λ$ times a fixed factor, which in one dimension is $2$ . The doubled box failed the test, so the average over the stopping box is at most $2 λ = 8$ , and indeed $6$ is at most $8$ .

Step 4. Build the bad part $b$ . By definition $b = f - g$ . On $[0, 1]$ that is $6 - 6 = 0$ , and outside it is $0 - 0 = 0$ . In this clean example $b$ vanishes because $f$ was already constant on its box. The total length of storm cells is the length of $[0, 1]$ , which is $1$ ; the rule says this is at most the mass $6$ divided by $λ = 4$ , that is $1.5$ , and indeed $1$ is at most $1.5$ .

What this tells us: the decomposition replaces a function by its box-average on the heavy boxes (the good part) and records the leftover oscillation as mean-zero bumps (the bad part). When the function is already flat on a box the bump is empty; when it wiggles, the bump captures exactly the wiggle. The mass-over- $λ$ bound on the total length of boxes is the quantitative heart, and it always holds with room to spare.
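
A companion example, made up here for illustration, shows a bump that is not empty. Let $f$ equal $8$ on the interval from $0$ to $1/2$ and $0$ elsewhere, so its mass is $8$ times $1/2$ , which is $4$ , and take $λ = 3$ . The average over $[0, 1/2]$ is $8$ , above $3$ ; doubling the box to $[0, 1]$ gives average $4$ , still above $3$ ; doubling again to $[0, 2]$ gives average $2$ , below $3$ . The storm cell is therefore $Q_{1} = [0, 1]$ . The good part $g$ equals the average $4$ on $[0, 1]$ and $0$ outside, within the cap $2 λ = 6$ . The bad part $b = f - g$ equals $8 - 4 = 4$ on $[0, 1/2]$ and $0 - 4 = - 4$ on the rest of $[0, 1]$ : area $2$ above the axis and area $2$ below, so it averages to zero over its box. The box has length $1$ , and the rule allows up to the mass $4$ divided by $λ = 3$ , about $1.33$ .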

Check your understanding Beginner

Exercise (easy, multiple choice).

In the Calderón-Zygmund decomposition $f = g + b$ at height $λ$ , the "good part" $g$ is characterised by which property?

A. It is zero everywhere except on the heavy boxes
B. It is bounded, never exceeding a fixed multiple of $λ$
C. It averages to zero over every box
D. It equals the original function exactly

Hint

The word "good" refers to the part you can control with the crudest tool — a bound on its size. Which property is a size bound?

Answer

B. It is bounded, never exceeding a fixed multiple of $λ$ .

Feedback-correct: the good part is capped at a bounded multiple of the threshold $λ$ , which is exactly what lets you control it with an $L^{\infty}$ estimate. Feedback-wrong: A and C describe the bad part (supported on the heavy boxes, mean zero), not the good part; D is false because the whole point is to replace $f$ by something tamer.

Formal definition Intermediate+

Throughout, $D$ denotes the family of dyadic cubes of $R^{n}$ : cubes of the form $2^{- k} (m + [0, 1)^{n})$ with $k \in Z$ and $m \in Z^{n}$ . Distinct dyadic cubes are either nested or disjoint, and each dyadic cube $Q$ has a unique dyadic parent $\widehat{Q}$ of twice the side length and $2^{n}$ times the volume. For a cube $Q$ write $\fint_{Q} f = \frac{1}{∣ Q ∣} \int_{Q} f$ for the average of $f$ over $Q$ .

Definition (Calderón-Zygmund decomposition). Let $f \in L^{1} (R^{n})$ and $λ > 0$ . A Calderón-Zygmund decomposition of $f$ at height $λ$ is a representation $f = g + b, \quad b = \sum_{j} b_{j},$ together with a countable collection of pairwise disjoint dyadic cubes $\{Q_{j}\}$ , such that:

1. (Good part bounded.) $∥ g ∥_{L^{\infty}} \leq 2^{n} λ$ and $∥ g ∥_{L^{1}} \leq ∥ f ∥_{L^{1}}$ .
2. (Bad part localized.) Each $b_{j}$ is supported in $Q_{j}$ , and $b = \sum_{j} b_{j}$ is supported in $Ω = ⋃_{j} Q_{j}$ .
3. (Mean zero.) $\int_{Q_{j}} b_{j} d x = 0$ for every $j$ .
4. (Controlled mass.) $∥ b_{j} ∥_{L^{1}} \leq 2^{n + 1} λ ∣ Q_{j} ∣$ , equivalently $\fint_{Q_{j}} ∣ b_{j} ∣ \leq 2^{n + 1} λ$ , and $λ < \fint_{Q_{j}} ∣ f ∣ \leq 2^{n} λ$ .
5. (Total measure bound.) $\sum_{j} ∣ Q_{j} ∣ = ∣Ω∣ \leq λ^{- 1} ∥ f ∥_{L^{1}}$ .

The cubes $\{Q_{j}\}$ are obtained by the dyadic stopping-time selection: $Q_{j}$ ranges over the maximal dyadic cubes $Q$ with $\fint_{Q} ∣ f ∣ > λ$ . On each such cube one sets $g = \fint_{Q_{j}} f$ and $b_{j} = (f - \fint_{Q_{j}} f) χ_{Q_{j}}$ ; off $Ω$ one sets $g = f$ and $b = 0$ .
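
The stopping-time selection can be sketched for a step function in dimension $n = 1$ . The code below is a minimal illustration under stated assumptions, not a reference implementation: $f$ is given by its values on $N = 2^{m}$ equal cells of $[0, 1)$ and is taken to vanish outside $[0, 1)$ ; the function name `maximal_cubes` and the sample data are hypothetical. Exact rational arithmetic avoids rounding at the threshold.

```python
from fractions import Fraction

def maximal_cubes(vals, lam):
    """Maximal dyadic intervals on which the average of |f| exceeds lam.

    vals[i] is the value of f on [i/N, (i+1)/N), N = len(vals) a power of two;
    f is assumed to be 0 outside [0, 1). Intervals are returned as (left, right).
    """
    n = len(vals)
    assert n > 0 and n & (n - 1) == 0, "number of cells must be a power of two"
    mags = [abs(Fraction(v)) for v in vals]
    mass = sum(mags) / n                      # integral of |f|
    if mass > lam:
        # [0, 1) already qualifies: climb through its ancestors [0, 2^k),
        # whose averages are mass / 2^k, and stop at the last one above lam.
        k = 0
        while mass / 2 ** (k + 1) > lam:
            k += 1
        return [(Fraction(0), Fraction(2 ** k))]
    cubes = []

    def descend(lo, hi):
        # Reached only when every ancestor failed the test, so a hit is maximal.
        if sum(mags[lo:hi]) / (hi - lo) > lam:
            cubes.append((Fraction(lo, n), Fraction(hi, n)))
        elif hi - lo > 1:
            mid = (lo + hi) // 2
            descend(lo, mid)
            descend(mid, hi)

    descend(0, n)
    return cubes

f = [1, 1, 10, 0, 0, 0, 3, 1]                 # hypothetical sample data
cubes = maximal_cubes(f, 2)
print([(str(a), str(b)) for a, b in cubes])   # [('0', '1/2'), ('3/4', '7/8')]
```

The descent visits a cube only after all of its ancestors have failed the test, which is why the first hit along each branch is maximal and the returned intervals are pairwise disjoint.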

Definition (good- $λ$ / bad-part splitting). The induced splitting of any sublinear operator estimate at height $λ$ into a good-part contribution $\{∣ T g ∣ > λ /2\}$ and a bad-part contribution $\{∣ T b ∣ > λ /2\}$ is the good- $λ$ /bad-part splitting. The good part is controlled by an $L^{2}$ (or $L^{\infty}$ ) estimate; the bad part is controlled by its mean-zero cancellation against the smoothness of the operator kernel away from $Ω$ .

Counterexamples to common slips Intermediate+

The good part is bounded by $2^{n} λ$ , not by $λ$ . On a stopping cube $Q_{j}$ the value of $g$ is the average $\fint_{Q_{j}} f$ , and its modulus can exceed $λ$ because selection forces $\fint_{Q_{j}} ∣ f ∣ > λ$ . The bound $\fint_{Q_{j}} ∣ f ∣ \leq 2^{n} λ$ comes from the dyadic parent $\widehat{Q}_{j}$ , which was not selected, so $\fint_{\widehat{Q}_{j}} ∣ f ∣ \leq λ$ and $\fint_{Q_{j}} ∣ f ∣ \leq 2^{n} \fint_{\widehat{Q}_{j}} ∣ f ∣ \leq 2^{n} λ$ . Forgetting the parent step gives the wrong constant.
Selection must take maximal cubes, not all cubes. The family $\{Q : \fint_{Q} ∣ f ∣ > λ\}$ has no useful closure property (a cube can qualify while its parent or some of its children do not), but each such $Q$ is contained in a largest such cube because $∣ Q ∣ < λ^{- 1} ∥ f ∥_{L^{1}}$ bounds the side length. Choosing maximal cubes makes the $\{Q_{j}\}$ disjoint; choosing all of them double-counts mass.
Mean zero is a property of $b_{j}$ , not of $b$ globally in a useful sense. Each $b_{j}$ integrates to zero over its own cube, which is what the kernel-cancellation argument needs cube by cube. The global statement $\int b = 0$ is true but too weak; the per-cube cancellation is the load-bearing fact.
The decomposition is for $L^{1}$ functions and a fixed $λ$ , not a single canonical object. Different heights $λ$ produce genuinely different cube families and different good/bad splits of the same $f$ . There is no decomposition independent of $λ$ ; the $λ$ -dependence is the point, since estimates integrate over $λ$ via the distribution function.
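
As an illustration of the last point (a sketch with a test function chosen here, not taken from the references), take $n = 1$ and $f = 8 χ_{[0, 1)}$ , so $∥ f ∥_{1} = 8$ , and run the selection at three heights:

$λ = 5$ : $Q_{1} = [0, 1)$ , since $\fint_{[0, 1)} ∣ f ∣ = 8 > 5$ and $\fint_{[0, 2)} ∣ f ∣ = 4 \leq 5$ ; here $∣ Q_{1} ∣ = 1 \leq 8/5$ .
$λ = 3$ : $Q_{1} = [0, 2)$ , since $\fint_{[0, 2)} ∣ f ∣ = 4 > 3$ and $\fint_{[0, 4)} ∣ f ∣ = 2 \leq 3$ ; here $∣ Q_{1} ∣ = 2 \leq 8/3$ .
$λ = 3/2$ : $Q_{1} = [0, 4)$ , since $\fint_{[0, 4)} ∣ f ∣ = 2 > 3/2$ and $\fint_{[0, 8)} ∣ f ∣ = 1 \leq 3/2$ ; here $∣ Q_{1} ∣ = 4 \leq 16/3$ .

In each case the selected average lies in $(λ, 2 λ]$ , and the good part takes the values $8$ , $4$ , $2$ on its cube respectively: one function, three different decompositions.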

Key theorem with proof Intermediate+

Theorem (Calderón-Zygmund decomposition; Calderón-Zygmund 1952 Acta Math. 88, 85; dyadic form Stein 1970 Singular Integrals Ch. I §3). Let $f \in L^{1} (R^{n})$ and $λ > 0$ . Then there exist a function $g$ , functions $b_{j}$ , and pairwise disjoint dyadic cubes $\{Q_{j}\}$ satisfying properties (1)–(5) of the Formal definition. In particular $f = g + b$ with $b = \sum_{j} b_{j}$ , $∥ g ∥_{\infty} \leq 2^{n} λ$ , $∥ g ∥_{1} \leq ∥ f ∥_{1}$ , each $b_{j}$ supported on $Q_{j}$ with $\int b_{j} = 0$ , and $\sum_{j} ∣ Q_{j} ∣ \leq λ^{- 1} ∥ f ∥_{1}$ .

Proof. Set $Ω = \{M_{D} f > λ\}$ where $M_{D}$ is the dyadic maximal operator of 02.19.01.

Step 1 (stopping-time selection of maximal cubes). For each $x$ with $M_{D} f (x) > λ$ there is a dyadic cube $Q ∋ x$ with $\fint_{Q} ∣ f ∣ > λ$ . Since $f \in L^{1}$ , any such cube satisfies $∣ Q ∣ < λ^{- 1} \int_{Q} ∣ f ∣ \leq λ^{- 1} ∥ f ∥_{1}$ , so its side length is bounded; hence among the dyadic cubes through $x$ with average exceeding $λ$ there is a *maximal* one. Let $\{Q_{j}\}$ be the collection of all cubes that are maximal with respect to inclusion among dyadic cubes with $\fint_{Q} ∣ f ∣ > λ$ . By the nested-or-disjoint dichotomy of dyadic cubes together with maximality, the $\{Q_{j}\}$ are pairwise disjoint, and $Ω = ⋃_{j} Q_{j}$ .

Step 2 (average control on each selected cube). Fix a selected cube $Q_{j}$ and let $\widehat{Q}_{j}$ be its dyadic parent. By maximality of $Q_{j}$ , the strictly larger cube $\widehat{Q}_{j}$ does not satisfy the selection inequality, so $\fint_{\widehat{Q}_{j}} ∣ f ∣ \leq λ$ . Since $∣ \widehat{Q}_{j} ∣ = 2^{n} ∣ Q_{j} ∣$ , $λ < \fint_{Q_{j}} ∣ f ∣ = \frac{1}{∣ Q_{j} ∣} \int_{Q_{j}} ∣ f ∣ \leq \frac{2^{n}}{∣ \widehat{Q}_{j} ∣} \int_{\widehat{Q}_{j}} ∣ f ∣ = 2^{n} \fint_{\widehat{Q}_{j}} ∣ f ∣ \leq 2^{n} λ .$ This is property (4): the average of $∣ f ∣$ over each selected cube lies in $(λ, 2^{n} λ]$ .

Step 3 (definition of $g$ and $b$ ). Set $g (x) = \begin{cases} \fint_{Q_{j}} f, & x \in Q_{j}, \\ f (x), & x \notin Ω, \end{cases} \qquad b_{j} (x) = (f (x) - \fint_{Q_{j}} f) χ_{Q_{j}} (x), \qquad b = \sum_{j} b_{j} .$ On $Ω$ each $x$ lies in exactly one $Q_{j}$ by disjointness, so $g$ and $b$ are well defined and $f = g + b$ everywhere: off $Ω$ , $b = 0$ and $g = f$ ; on $Q_{j}$ , $g + b_{j} = \fint_{Q_{j}} f + (f - \fint_{Q_{j}} f) = f$ .

Step 4 (good part is bounded). For $x \notin Ω$ , every dyadic cube $Q ∋ x$ has $\fint_{Q} ∣ f ∣ \leq λ$ (otherwise $x \in Ω$ ). Writing $Q_{k} (x)$ for the dyadic cube of side length $2^{- k}$ containing $x$ , the Lebesgue differentiation theorem 02.19.01, applied along these cubes as they shrink to $x$ , gives $∣ f (x) ∣ = \lim_{k \to \infty} \fint_{Q_{k} (x)} ∣ f ∣ \leq λ$ for a.e. such $x$ ; thus $∣ g (x) ∣ \leq λ$ a.e. off $Ω$ . For $x \in Q_{j}$ , $∣ g (x) ∣ = ∣ \fint_{Q_{j}} f ∣ \leq \fint_{Q_{j}} ∣ f ∣ \leq 2^{n} λ$ by Step 2. Hence $∥ g ∥_{\infty} \leq 2^{n} λ$ . Moreover $\int ∣ g ∣ \leq \int ∣ f ∣$ : off $Ω$ they agree, and on each $Q_{j}$ , $\int_{Q_{j}} ∣ g ∣ = ∣ Q_{j} ∣ ∣ \fint_{Q_{j}} f ∣ \leq \int_{Q_{j}} ∣ f ∣$ by Jensen. So $∥ g ∥_{1} \leq ∥ f ∥_{1}$ .

Step 5 (bad part: support, mean zero, mass). By construction $b_{j}$ is supported in $Q_{j}$ and $\int_{Q_{j}} b_{j} = \int_{Q_{j}} f - ∣ Q_{j} ∣ \fint_{Q_{j}} f = \int_{Q_{j}} f - \int_{Q_{j}} f = 0,$ which is property (3). For the mass, the triangle inequality and Step 2 give $\int_{Q_{j}} ∣ b_{j} ∣ \leq \int_{Q_{j}} ∣ f ∣ + ∣ Q_{j} ∣ ∣ \fint_{Q_{j}} f ∣ \leq 2 \int_{Q_{j}} ∣ f ∣ = 2∣ Q_{j} ∣ \fint_{Q_{j}} ∣ f ∣ \leq 2^{n + 1} λ ∣ Q_{j} ∣,$ property (4).

Step 6 (total measure bound). Each selected cube satisfies $∣ Q_{j} ∣ < λ^{- 1} \int_{Q_{j}} ∣ f ∣$ from $\fint_{Q_{j}} ∣ f ∣ > λ$ . Summing over the disjoint family, $\sum_{j} ∣ Q_{j} ∣ < \frac{1}{λ} \sum_{j} \int_{Q_{j}} ∣ f ∣ = \frac{1}{λ} \int_{Ω} ∣ f ∣ \leq \frac{1}{λ} ∥ f ∥_{1},$ property (5). All five properties hold. $□$
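
The six steps can be checked numerically on a step function. The sketch below is illustrative only: it works in dimension $n = 1$ on $[0, 1)$ , assumes the average of $∣ f ∣$ over $[0, 1)$ itself does not exceed $λ$ (so no ancestor of $[0, 1)$ is selected), and the names `select_cubes`, `decompose`, `norm1` and the sample data are hypothetical.

```python
from fractions import Fraction

def select_cubes(vals, lam):
    """Index ranges [lo, hi) of the maximal dyadic intervals inside [0, 1)
    on which the average of |f| exceeds lam (Step 1, restricted to [0, 1))."""
    cubes, stack = [], [(0, len(vals))]
    while stack:
        lo, hi = stack.pop()
        if sum(abs(v) for v in vals[lo:hi]) / (hi - lo) > lam:
            cubes.append((lo, hi))          # every ancestor failed, so maximal
        elif hi - lo > 1:
            mid = (lo + hi) // 2
            stack += [(lo, mid), (mid, hi)]
    return sorted(cubes)

def decompose(vals, lam):
    """Step 3: g is the cube average on each selected cube and f elsewhere;
    b = f - g. Values are per cell of length 1/len(vals)."""
    vals = [Fraction(v) for v in vals]
    cubes = select_cubes(vals, lam)
    g = list(vals)
    for lo, hi in cubes:
        g[lo:hi] = [sum(vals[lo:hi]) / (hi - lo)] * (hi - lo)
    b = [fv - gv for fv, gv in zip(vals, g)]
    return g, b, cubes

def norm1(h):
    """L^1 norm of a step function given by its cell values on [0, 1)."""
    return sum(abs(Fraction(x)) for x in h) / len(h)

f = [1, -1, 10, 0, 0, 0, 3, 1]              # hypothetical sample data
lam, n = 2, 1
g, b, cubes = decompose(f, lam)
N = len(f)

assert max(abs(x) for x in g) <= 2 ** n * lam                  # property (1)
assert norm1(g) <= norm1(f)                                    # property (1)
assert all(sum(b[lo:hi]) == 0 for lo, hi in cubes)             # property (3)
assert all(sum(abs(x) for x in b[lo:hi]) / N                   # property (4)
           <= 2 ** (n + 1) * lam * Fraction(hi - lo, N) for lo, hi in cubes)
assert sum(hi - lo for lo, hi in cubes) <= N * norm1(f) / lam  # property (5)
print(cubes)   # [(0, 4), (6, 7)]
```

The sample data include a negative value on purpose: selection uses $∣ f ∣$ , while $g$ and $b$ are built from the signed averages of $f$ .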

Bridge. The decomposition builds toward the weak-type $(1, 1)$ boundedness of every Calderón-Zygmund singular integral operator, and the same stopping-time selection appears again in the dyadic-cube proofs of the $T (1)$ theorem, the John-Nirenberg inequality, and the modern theory of sparse domination. The central insight is that the maximal-function level set ${M_{D} f > λ}$ is the bad set: selecting the maximal cubes where the average first crosses $λ$ is exactly the act of producing the disjoint family, so this is the foundational reason a single $L^{1}$ hypothesis splits into a bounded $L^{2}$ -controllable good part and a mean-zero $L^{1}$ -controllable bad part. Putting these together, the weak- $(1, 1)$ proof of 02.19.01 for the maximal operator and the decomposition here are dual faces of one stopping-time argument: the bound $\sum_{j} ∣ Q_{j} ∣ \leq λ^{- 1} ∥ f ∥_{1}$ that controlled the maximal level set now controls the support of the bad part, and the bridge is that the good- $λ$ /bad-part splitting generalises the layer-cake estimate to operators that see cancellation, not just size.

Exercises Intermediate+

Exercise 3 (medium, numeric).

On $R$ ( $n = 1$ ), let $f = 8 χ_{[0, 1)}$ and take $λ = 3$ . Working with the standard dyadic cubes (dyadic subintervals of $[0, 1)$ and their dyadic ancestors), identify the single maximal selected interval $Q_{1}$ and compute $\fint_{Q_{1}} ∣ f ∣$ . Give the value of $\fint_{Q_{1}} ∣ f ∣$ as a single number.

Hint

The dyadic interval $[0, 1)$ has average $8 > 3$ ; its dyadic parent $[0, 2)$ has average $4 > 3$ as well, so $[0, 1)$ is not maximal — climb until the parent's average drops to $\leq 3$ .

Answer

$4$ . The average of $f$ over $[0, 1)$ is $8$ , over the parent $[0, 2)$ is $8 \cdot 1/2 = 4$ , over $[0, 4)$ is $8/4 = 2 \leq 3$ . So $[0, 2)$ has average $4 > 3$ but its parent $[0, 4)$ has average $2 \leq 3$ ; hence the maximal selected interval is $Q_{1} = [0, 2)$ with $\fint_{Q_{1}} ∣ f ∣ = 4$ . This sits in $(λ, 2^{n} λ] = (3, 6]$ , as the theorem requires with $n = 1$ .

Exercise 5 (medium, symbolic).

Prove that $∥ b ∥_{L^{1}} \leq 2^{n + 1} ∥ f ∥_{L^{1}}$ , where $b = \sum_{j} b_{j}$ .

Hint

The $b_{j}$ have disjoint supports; sum the per-cube mass bound and use the total measure bound, or sum $\int_{Q_{j}} ∣ b_{j} ∣ \leq 2 \int_{Q_{j}} ∣ f ∣$ .

Answer

The supports $Q_{j}$ are pairwise disjoint, so $∥ b ∥_{1} = \sum_{j} ∥ b_{j} ∥_{1}$ . From Step 5 of the proof, $\int_{Q_{j}} ∣ b_{j} ∣ \leq 2 \int_{Q_{j}} ∣ f ∣$ . Summing over the disjoint family, $∥ b ∥_{L^{1}} = \sum_{j} \int_{Q_{j}} ∣ b_{j} ∣ \leq 2 \sum_{j} \int_{Q_{j}} ∣ f ∣ = 2 \int_{Ω} ∣ f ∣ \leq 2∥ f ∥_{L^{1}} .$ This already gives the cleaner bound $∥ b ∥_{1} \leq 2∥ f ∥_{1}$ . The weaker $2^{n + 1} ∥ f ∥_{1}$ follows the same way through property (4): $\int_{Q_{j}} ∣ b_{j} ∣ \leq 2^{n + 1} λ ∣ Q_{j} ∣$ , so $∥ b ∥_{1} \leq 2^{n + 1} λ \sum_{j} ∣ Q_{j} ∣ \leq 2^{n + 1} λ \cdot λ^{- 1} ∥ f ∥_{1} = 2^{n + 1} ∥ f ∥_{1}$ . Either route shows the bad part carries no more $L^{1}$ mass than a fixed multiple of $f$ .

Exercise 6 (medium, symbolic).

Using the decomposition, prove the weak-type $(1, 1)$ bound for the dyadic maximal operator: $∣ \{M_{D} f > λ\} ∣ \leq λ^{- 1} ∥ f ∥_{1}$ .

Hint

The set $\{M_{D} f > λ\}$ is exactly the union $Ω = ⋃_{j} Q_{j}$ of the selected cubes.

Answer

By the stopping-time selection (Step 1 of the proof), $\{M_{D} f > λ\} = Ω = ⋃_{j} Q_{j}$ , since $M_{D} f (x) > λ$ holds exactly when some dyadic cube through $x$ has average $> λ$ , equivalently when $x$ lies in a maximal such cube $Q_{j}$ . The selected cubes are disjoint and each obeys $∣ Q_{j} ∣ < λ^{- 1} \int_{Q_{j}} ∣ f ∣$ . Therefore $∣ \{M_{D} f > λ\} ∣ = \sum_{j} ∣ Q_{j} ∣ < \frac{1}{λ} \sum_{j} \int_{Q_{j}} ∣ f ∣ = \frac{1}{λ} \int_{Ω} ∣ f ∣ \leq \frac{1}{λ} ∥ f ∥_{1} .$ The constant is $1$ , matching the direct argument of 02.19.01: the decomposition's measure bound (property 5) is the dyadic weak- $(1, 1)$ inequality.
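
The inequality can be observed on a step function by computing the dyadic maximal function cell by cell. This is a minimal sketch under stated assumptions: dimension $n = 1$ , $f$ given on $N = 2^{m}$ equal cells of $[0, 1)$ and zero elsewhere; the name `dyadic_maximal` and the sample data are hypothetical.

```python
from fractions import Fraction

def dyadic_maximal(vals):
    """Cell-by-cell dyadic maximal function of a step function on [0, 1).

    vals[i] is the value on [i/N, (i+1)/N), N a power of two, f = 0 elsewhere.
    Only dyadic intervals between one cell and [0, 1) are scanned: smaller ones
    reproduce the cell value, and larger ones have averages no bigger than
    that of [0, 1).
    """
    n = len(vals)
    mags = [abs(Fraction(v)) for v in vals]
    out = [Fraction(0)] * n
    size = 1
    while size <= n:
        for lo in range(0, n, size):
            avg = sum(mags[lo:lo + size]) / size
            for i in range(lo, lo + size):
                out[i] = max(out[i], avg)
        size *= 2
    return out

f = [1, 1, 10, 0, 0, 0, 3, 1]            # hypothetical sample data
lam = 2
Mf = dyadic_maximal(f)
level_set = [i for i, m in enumerate(Mf) if m > lam]
measure = Fraction(len(level_set), len(f))
mass = Fraction(sum(abs(v) for v in f), len(f))
print(level_set, measure, mass / lam)    # [0, 1, 2, 3, 6] 5/8 1
```

For this data the level set consists of cells $0$ to $3$ and cell $6$ , that is, the two maximal intervals $[0, 1/2)$ and $[3/4, 7/8)$ , with total length $5/8 \leq ∥ f ∥_{1} / λ = 1$ .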

Exercise 7 (hard, symbolic).

Let $T$ be a linear operator bounded on $L^{2} (R^{n})$ with $∥ T f ∥_{2} \leq A ∥ f ∥_{2}$ , given by integration against a kernel $K (x, y)$ off the diagonal that satisfies the Hörmander condition $\int_{∣ x - y ∣ > 2∣ y - y^{'} ∣} ∣ K (x, y) - K (x, y^{'}) ∣ d x \leq B \quad \text{for all } y, y^{'} .$ Prove $T$ is of weak type $(1, 1)$ : $∣ \{∣ T f ∣ > λ\} ∣ \leq C λ^{- 1} ∥ f ∥_{1}$ with $C = C (n, A, B)$ .

Hint

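Decompose $f = g + b$ at height $λ$ . Bound $\{∣ T g ∣ > λ /2\}$ by Chebyshev and the $L^{2}$ estimate of Exercise 4. For the bad part, integrate $T b_{j}$ over the complement of a fixed dilate $Q_{j}^{*}$ of the cube (same centre $c_{j}$ , side length enlarged by a dimensional factor so that $∣ x - y ∣ > 2∣ y - c_{j} ∣$ whenever $y \in Q_{j}$ and $x \notin Q_{j}^{*}$ ), using $\int b_{j} = 0$ to insert $K (x, c_{j})$ and the Hörmander condition; control the dilated cubes' total measure directly.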

Answer

Form the Calderón-Zygmund decomposition $f = g + b$ at height $λ$ , with cubes $\{Q_{j}\}$ and centres $c_{j}$ . By sublinearity, $\{∣ T f ∣ > λ\} \subseteq \{∣ T g ∣ > λ /2\} \cup \{∣ T b ∣ > λ /2\}$ , so it suffices to bound each piece by $C λ^{- 1} ∥ f ∥_{1}$ .

Good part. By Chebyshev, the $L^{2}$ -boundedness of $T$ , and Exercise 4, $∣ \{∣ T g ∣ > λ /2\} ∣ \leq \frac{4}{λ^{2}} ∥ T g ∥_{2}^{2} \leq \frac{4 A^{2}}{λ^{2}} ∥ g ∥_{2}^{2} \leq \frac{4 A^{2}}{λ^{2}} \cdot 2^{n} λ ∥ f ∥_{1} = \frac{2^{n + 2} A^{2}}{λ} ∥ f ∥_{1} .$

Bad part. Doubling the cubes is not quite enough to place $x$ in the Hörmander region, so let $Q_{j}^{*}$ be the cube with the same centre $c_{j}$ as $Q_{j}$ and $4 \sqrt{n}$ times its side length, and set $Ω^{*} = ⋃_{j} Q_{j}^{*}$ . Then $∣ Ω^{*} ∣ \leq (4 \sqrt{n})^{n} \sum_{j} ∣ Q_{j} ∣ \leq (4 \sqrt{n})^{n} λ^{- 1} ∥ f ∥_{1}$ , so it remains to bound $∣ \{x \notin Ω^{*} : ∣ T b (x) ∣ > λ /2\} ∣$ . By Chebyshev with the $L^{1}$ -norm of $T b$ off $Ω^{*}$ , $∣ \{x \notin Ω^{*} : ∣ T b ∣ > λ /2\} ∣ \leq \frac{2}{λ} \int_{(Ω^{*})^{c}} ∣ T b ∣ d x \leq \frac{2}{λ} \sum_{j} \int_{(Q_{j}^{*})^{c}} ∣ T b_{j} ∣ d x .$ For $x \notin Q_{j}^{*}$ use $\int b_{j} = 0$ to write, with $y^{'} = c_{j}$ the centre, $T b_{j} (x) = \int_{Q_{j}} K (x, y) b_{j} (y) d y = \int_{Q_{j}} (K (x, y) - K (x, c_{j})) b_{j} (y) d y .$ If $Q_{j}$ has side length $ℓ$ , then $y \in Q_{j}$ gives $∣ y - c_{j} ∣ \leq \sqrt{n} ℓ /2$ and $x \notin Q_{j}^{*}$ gives $∣ x - c_{j} ∣ \geq 2 \sqrt{n} ℓ$ , so $∣ x - y ∣ \geq \frac{3}{2} \sqrt{n} ℓ > 2∣ y - c_{j} ∣$ . Hence, by Tonelli 02.07.06 and the Hörmander condition, $\int_{(Q_{j}^{*})^{c}} ∣ T b_{j} ∣ d x \leq \int_{Q_{j}} ∣ b_{j} (y) ∣ \int_{∣ x - y ∣ > 2∣ y - c_{j} ∣} ∣ K (x, y) - K (x, c_{j}) ∣ d x d y \leq B ∥ b_{j} ∥_{1} .$ Summing and using $\sum_{j} ∥ b_{j} ∥_{1} \leq 2∥ f ∥_{1}$ (Exercise 5), $∣ \{x \notin Ω^{*} : ∣ T b ∣ > λ /2\} ∣ \leq \frac{2 B}{λ} \sum_{j} ∥ b_{j} ∥_{1} \leq \frac{4 B}{λ} ∥ f ∥_{1} .$ Combining the good part, the dilated-cube measure, and the bad part off $Ω^{*}$ , $∣ \{∣ T f ∣ > λ\} ∣ \leq \frac{2^{n + 2} A^{2} + (4 \sqrt{n})^{n} + 4 B}{λ} ∥ f ∥_{1},$ which is the weak-type $(1, 1)$ bound with $C = 2^{n + 2} A^{2} + (4 \sqrt{n})^{n} + 4 B$ . This is the central application: $L^{2}$ -boundedness plus a kernel-regularity (Hörmander) hypothesis upgrades to the $L^{1}$ endpoint through the decomposition.
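
The geometric step in the bad-part estimate (for $y \in Q_{j}$ and $x$ outside a dilate of $Q_{j}$ , one needs $∣ x - y ∣ > 2∣ y - c_{j} ∣$ ) can be sanity-checked by brute force. The sketch below is an illustration, not part of the proof: it takes $Q$ to be the unit cube, samples $y$ on a grid in $Q$ and $x$ on a grid on the boundary of the dilate with factor `kappa` (the nearest points of the complement), and reports the smallest value of $∣ x - y ∣ - 2∣ y - c ∣$ . The name `worst_margin` and the parameters `kappa`, `steps` are hypothetical.

```python
import itertools
import math

def worst_margin(kappa, n, steps=8):
    """Smallest sampled value of |x - y| - 2|y - c| for y in the closed unit
    cube Q = [0, 1]^n with centre c, and x on the boundary of kappa * Q."""
    c = (0.5,) * n
    half = kappa / 2
    ys = [i / steps for i in range(steps + 1)]
    xs = [0.5 - half + kappa * i / steps for i in range(steps + 1)]
    best = math.inf
    for x in itertools.product(xs, repeat=n):
        if max(abs(xi - 0.5) for xi in x) < half - 1e-12:
            continue                      # strictly inside the dilate: skip
        for y in itertools.product(ys, repeat=n):
            best = min(best, math.dist(x, y) - 2 * math.dist(y, c))
    return best

for n in (1, 2):
    print(n, worst_margin(2, n) > 0, worst_margin(4 * math.sqrt(n), n) > 0)
```

A negative margin means the required inequality fails for some sampled pair. In this check the factor $2$ already fails in dimension one (take $y$ at an endpoint of $Q$ and $x$ just beyond the nearer end of $2 Q$ ), while the factor $4 \sqrt{n}$ leaves a positive margin, which is why the size of the dilate matters in the estimate.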

Exercise 8 (hard, symbolic).

Deduce from Exercise 7 (together with $L^{2}$ -boundedness) that $T$ is bounded on $L^{p} (R^{n})$ for all $1 < p < 2$ , and explain how to obtain $2 < p < \infty$ .

Hint

Interpolate the weak- $(1, 1)$ bound with the strong- $(2, 2)$ bound by Marcinkiewicz. For $p > 2$ pass to the adjoint $T^{*}$ , whose kernel $K^{*} (x, y) = \overline{K (y, x)}$ also satisfies a Hörmander condition.

Answer

The operator $T$ is of weak type $(1, 1)$ (Exercise 7) and of strong type $(2, 2)$ (hypothesis), hence of weak type $(2, 2)$ . The Marcinkiewicz interpolation theorem 02.07.06, applied to a (sub)linear operator with weak endpoints at $(1, 1)$ and $(2, 2)$ , yields strong type $(p, p)$ for every $p \in (1, 2)$ : $∥ T f ∥_{L^{p}} \leq C_{p} ∥ f ∥_{L^{p}}, C_{p} \leq C (n, A, B) (\frac{1}{p - 1}),$ the constant blowing up like $(p - 1)^{- 1}$ as $p \to 1^{+}$ , reflecting the failure of strong type $(1, 1)$ .

For $2 < p < \infty$ , consider the adjoint $T^{*}$ , given by the kernel $K^{*} (x, y) = \overline{K (y, x)}$ . The Hörmander condition for $K^{*}$ is a regularity condition on $K$ in its first variable, $\int_{∣ x - y ∣ > 2∣ x - x^{'} ∣} ∣ K (x, y) - K (x^{'}, y) ∣ d y \leq B$ for all $x, x^{'}$ . It does not follow from the condition in Exercise 7 in general; it is automatic for convolution kernels $K (x, y) = k (x - y)$ and is otherwise assumed as a symmetric hypothesis. Under it, $T^{*}$ satisfies the hypotheses of Exercise 7 and is therefore bounded on $L^{p^{'}}$ for $1 < p^{'} < 2$ . By duality $∥ T ∥_{L^{p} \to L^{p}} = ∥ T^{*} ∥_{L^{p^{'}} \to L^{p^{'}}}$ with $1/ p + 1/ p^{'} = 1$ ; as $p$ ranges over $(2, \infty)$ , $p^{'}$ ranges over $(1, 2)$ , so $T$ is bounded on $L^{p}$ there as well. Together with the $L^{2}$ hypothesis this gives boundedness on the full range $1 < p < \infty$ , which is the Calderón-Zygmund theorem for singular integrals.

Advanced results Master

Theorem 1 (weak-type $(1, 1)$ for Calderón-Zygmund operators; Calderón-Zygmund 1952 Acta Math. 88, 85). Let $T$ be bounded on $L^{2} (R^{n})$ and given off-diagonal by a kernel satisfying the Hörmander integral condition. Then $T$ extends to a bounded operator of weak type $(1, 1)$ , with bound $C (n, A, B)$ as in Exercise 7, and consequently is bounded on $L^{p}$ for $1 < p < \infty$ . The decomposition is the only substantive ingredient: the good part is handled by the $L^{2}$ bound and Chebyshev, the bad part by the mean-zero cancellation against the kernel's regularity away from the doubled cubes ^{[Calderón-Zygmund 1952]}.

Theorem 2 (Hörmander's refinement; Hörmander 1960 Acta Math. 104, 93). The smoothness hypothesis on the kernel can be weakened from the pointwise gradient bound $∣ \nabla_{y} K (x, y) ∣ \leq C ∣ x - y ∣^{- n - 1}$ of Calderón-Zygmund to the integral (now-called Hörmander) condition used in Exercise 7. This is the minimal regularity under which the good- $λ$ /bad-part argument closes, and it is the form in which the theorem propagates to operators whose kernels are merely Dini-continuous rather than $C^{1}$ ^{[Hörmander 1960]}.

Theorem 3 (Fefferman-Stein sharp maximal characterization; Fefferman-Stein 1972 Acta Math. 129, 137). The decomposition at a continuum of heights $λ$ is the engine behind the sharp maximal function $M^{\#} f (x) = \sup_{Q ∋ x} \fint_{Q} ∣ f - \fint_{Q} f ∣$ and the equivalence $∥ f ∥_{L^{p}} \approx ∥ M^{\#} f ∥_{L^{p}}$ for $1 < p < \infty$ on functions with suitable decay. Iterating the stopping-time selection (each bad cube re-decomposed at a higher height) produces the good- $λ$ inequalities $∣ \{M f > 2 λ, M^{\#} f \leq γ λ\} ∣ \leq C γ ∣ \{M f > λ\} ∣$ that prove the equivalence ^{[Fefferman-Stein 1972]}.

Theorem 4 (vector-valued Calderón-Zygmund decomposition). For $f \in L^{1} (R^{n}; H)$ with values in a Hilbert space $H$ , the identical stopping-time selection (now of cubes where $\fint_{Q} ∥ f ∥_{H} > λ$ ) yields $f = g + b$ with $∥ g (x) ∥_{H} \leq 2^{n} λ$ , each $\int_{Q_{j}} b_{j} = 0$ in $H$ , and $\sum_{j} ∣ Q_{j} ∣ \leq λ^{- 1} ∥ f ∥_{L^{1} (H)}$ . This Hilbert-valued form drives the Littlewood-Paley and Fourier-multiplier theory, where $f$ carries a sequence of frequency-localized pieces and $H = ℓ^{2}$ tracks the square function ^{[Stein 1970]}.

Theorem 5 (decomposition relative to a doubling measure). If $μ$ is a doubling Borel measure on $R^{n}$ (so $μ (2 Q) \leq C_{μ} μ (Q)$ ), the dyadic stopping-time selection at height $λ$ produces cubes with $λ < \fint_{Q_{j}}^{μ} ∣ f ∣ \leq C λ$ and $\sum_{j} μ (Q_{j}) \leq λ^{- 1} ∥ f ∥_{L^{1} (μ)}$ , where the constant $C$ takes the place of $2^{n}$ and depends only on the doubling constant $C_{μ}$ (the dyadic parent of $Q_{j}$ lies in $4 Q_{j}$ , so $C = C_{μ}^{2}$ works). This is the form that underlies singular-integral theory on spaces of homogeneous type, where dyadic cubes are replaced by the Christ-David dyadic systems ^{[Stein 1993]}.

Theorem 6 ( $H^{1}$ -atomic connection; Coifman-Weiss). The bad bumps $b_{j}$ , after normalization, have the support and cancellation of atoms of the Hardy space $H^{1}$ : each $a_{j} = b_{j} / (2^{n + 1} λ ∣ Q_{j} ∣)$ is supported in $Q_{j}$ , has mean zero, and satisfies $∥ a_{j} ∥_{1} \leq 1$ . When $f$ is in addition bounded by $2^{n} λ$ on $Q_{j}$ , the size bound improves to $∥ a_{j} ∥_{2} \leq ∣ Q_{j} ∣^{- 1/2}$ and $a_{j}$ is a genuine $(1, 2)$ -atom; for general $f \in L^{1}$ no such bound is available, since $b_{j}$ need not lie in $L^{2}$ . The Calderón-Zygmund decomposition is therefore the model for the atomic decomposition of $H^{1}$ , and the weak- $(1, 1)$ bound for $T$ is the shadow on $L^{1}$ of the strong $H^{1} \to L^{1}$ bound that holds because $T$ maps atoms to integrable functions with uniformly controlled $L^{1}$ -norm ^{[Fefferman-Stein 1972]}.

Synthesis. The Calderón-Zygmund decomposition is the foundational reason that a single quantitative hypothesis — $L^{2}$ -boundedness plus kernel regularity — propagates to the entire scale $L^{p}$ , $1 < p < \infty$ , and to the $L^{1}$ endpoint in weak form, and this is exactly the structural device that converts a size hypothesis into a cancellation hypothesis. The central insight is a dictionary: the stopping-time selection of maximal dyadic cubes is dual to the maximal-function level set of 02.19.01, so the bound $\sum_{j} ∣ Q_{j} ∣ \leq λ^{- 1} ∥ f ∥_{1}$ that proved weak- $(1, 1)$ for the maximal operator is the same bound that controls the support of the bad part here. Putting these together, the good- $λ$ /bad-part splitting generalises in three directions that recur throughout harmonic analysis: vertically, from Lebesgue measure to arbitrary doubling and homogeneous-type measures via the Christ-David cubes (Theorem 5); horizontally, from scalar to Hilbert-valued functions via the vector-valued decomposition that powers Littlewood-Paley theory (Theorem 4); and structurally, from $L^{1}$ to the Hardy space $H^{1}$ , where the bad bumps are revealed to be atoms (Theorem 6) and the weak- $(1, 1)$ bound is the trace of a strong $H^{1} \to L^{1}$ estimate. The bridge is that every borderline boundedness theorem in the subject — singular integrals, the $T (1)$ theorem, John-Nirenberg, sparse domination — routes through this one act of cutting $f$ at height $λ$ , and the central insight unifying them is that selecting the heavy cubes is simultaneously a statement about the maximal function and a statement about cancellation.

Full proof set Master

Proposition 1 (disjointness of the selected cubes). The maximal dyadic cubes $\{Q_{j}\}$ with $\fint_{Q} ∣ f ∣ > λ$ are pairwise disjoint, and $⋃_{j} Q_{j} = \{M_{D} f > λ\}$ .

Proof. Two dyadic cubes are nested or disjoint. If $Q_{j} \cap Q_{k} \neq \emptyset$ with $j \neq k$ , one contains the other, say $Q_{j} ⊊ Q_{k}$ ; then $Q_{j}$ is not maximal among cubes with average $> λ$ (it sits inside the larger $Q_{k}$ , which also has average $> λ$ ), contradicting selection. So the $Q_{j}$ are disjoint. For the union: $x \in \{M_{D} f > λ\}$ iff some dyadic $Q ∋ x$ has $\fint_{Q} ∣ f ∣ > λ$ , iff (by the side-length bound $∣ Q ∣ < λ^{- 1} ∥ f ∥_{1}$ giving a maximal element) $x$ lies in a maximal such cube, iff $x \in ⋃_{j} Q_{j}$ . $□$

Proposition 2 (the cap $∥ g ∥_{\infty} \leq 2^{n} λ$ is sharp in order). There is $f \in L^{1} (R)$ and $λ > 0$ for which the good part attains $∥ g ∥_{\infty}$ comparable to $2^{n} λ$ (here $n = 1$ , so $2 λ$ ).

Proof. Take $f = c χ_{[0, 1)}$ with $c = 2 λ - ε$ slightly below $2 λ$ , and $λ$ chosen so $\fint_{[0, 1)} f = c > λ$ while $\fint_{[0, 2)} f = c /2 < λ$ , i.e. $λ < c < 2 λ$ . Then $[0, 1)$ is the maximal selected interval, $g = c$ on it, and $∥ g ∥_{\infty} = c = 2 λ - ε$ , which approaches $2 λ = 2^{n} λ$ as $ε \to 0$ . The bound cannot be improved below the parent-volume ratio $2^{n}$ . $□$

Proposition 3 (good and bad parts are genuinely in their stated spaces). $g \in L^{1} \cap L^{\infty} \subseteq L^{2}$ and each $b_{j} \in L^{1}$ with $\int b_{j} = 0$ ; moreover $b \in L^{1}$ with $∥ b ∥_{1} \leq 2∥ f ∥_{1}$ .

Proof. The bounds $∥ g ∥_{\infty} \leq 2^{n} λ$ and $∥ g ∥_{1} \leq ∥ f ∥_{1}$ are Step 4 of the Key Theorem; their conjunction gives $g \in L^{2}$ via $∥ g ∥_{2}^{2} \leq ∥ g ∥_{\infty} ∥ g ∥_{1}$ . Each $b_{j} = (f - \fint_{Q_{j}} f) χ_{Q_{j}}$ is a difference of $L^{1}$ functions supported in $Q_{j}$ , hence $L^{1}$ , with $\int b_{j} = 0$ by Step 5. Disjoint supports give $∥ b ∥_{1} = \sum_{j} ∥ b_{j} ∥_{1} \leq 2 \sum_{j} \int_{Q_{j}} ∣ f ∣ = 2 \int_{Ω} ∣ f ∣ \leq 2∥ f ∥_{1}$ . $□$

Proposition 4 (the decomposition reproduces the weak- $(1, 1)$ measure bound). $∣ {M_{D} f > λ} ∣ \leq λ^{- 1} ∥ f ∥_{1}$ .

Proof. By Proposition 1 the level set is $⋃_{j} Q_{j}$ with the $Q_{j}$ disjoint, and $∣ Q_{j} ∣ < λ^{- 1} \int_{Q_{j}} ∣ f ∣$ . Summing, $∣ {M_{D} f > λ} ∣ = \sum_{j} ∣ Q_{j} ∣ < λ^{- 1} \int_{Ω} ∣ f ∣ \leq λ^{- 1} ∥ f ∥_{1}$ . $□$

Proposition 5 (good- $λ$ /bad-part reduction is exhaustive). For any sublinear $T$ and any $λ > 0$ , $\{∣ T f ∣ > λ\} \subseteq \{∣ T g ∣ > λ /2\} \cup \{∣ T b ∣ > λ /2\}$ , and the second set is contained in $(⋃_{j} 2 Q_{j}) \cup \{x \notin ⋃_{j} 2 Q_{j} : ∣ T b (x) ∣ > λ /2\}$ .

Proof. If $∣ T f (x) ∣ > λ$ and $∣ T g (x) ∣ \leq λ /2$ then sublinearity $∣ T f ∣ \leq ∣ T g ∣ + ∣ T b ∣$ forces $∣ T b (x) ∣ > λ /2$ , giving the first inclusion. The second is a plain set decomposition by intersecting with $⋃_{j} 2 Q_{j}$ and its complement; the union of doubled cubes has measure $\leq 2^{n} \sum_{j} ∣ Q_{j} ∣ \leq 2^{n} λ^{- 1} ∥ f ∥_{1}$ by Proposition 4, so it is already of the required size and only the off-doubled-cube part needs the kernel estimate. $□$

Proposition 6 (atoms). With $a_{j} = b_{j} / (2^{n + 1} λ ∣ Q_{j} ∣)$ , each $a_{j}$ is supported in $Q_{j}$ , has $\int a_{j} = 0$ , and satisfies $∥ a_{j} ∥_{L^{1}} \leq 1$ . If in addition $∣ f ∣ \leq 2^{n} λ$ a.e. on $Q_{j}$ , then $∥ a_{j} ∥_{L^{\infty}} \leq ∣ Q_{j} ∣^{- 1}$ and $∥ a_{j} ∥_{L^{2}} \leq ∣ Q_{j} ∣^{- 1/2}$ , the normalization of an $H^{1}$ $(1, 2)$ -atom.

Proof. Support and mean zero are inherited from $b_{j}$ , and property (4) gives $∥ a_{j} ∥_{1} = ∥ b_{j} ∥_{1} / (2^{n + 1} λ ∣ Q_{j} ∣) \leq 1$ . Under the extra hypothesis, $∣ b_{j} ∣ \leq ∣ f ∣ + \fint_{Q_{j}} ∣ f ∣ \leq 2^{n} λ + 2^{n} λ = 2^{n + 1} λ$ on $Q_{j}$ , which is the $L^{\infty}$ bound after dividing by $2^{n + 1} λ ∣ Q_{j} ∣$ . Combined with $\int_{Q_{j}} ∣ b_{j} ∣ \leq 2^{n + 1} λ ∣ Q_{j} ∣$ , one gets $∥ b_{j} ∥_{2}^{2} \leq ∥ b_{j} ∥_{\infty} ∥ b_{j} ∥_{1} \leq (2^{n + 1} λ) (2^{n + 1} λ ∣ Q_{j} ∣) = 2^{2 n + 2} λ^{2} ∣ Q_{j} ∣$ . Hence $∥ a_{j} ∥_{2} = ∥ b_{j} ∥_{2} / (2^{n + 1} λ ∣ Q_{j} ∣) \leq (2^{n + 1} λ ∣ Q_{j} ∣^{1/2}) / (2^{n + 1} λ ∣ Q_{j} ∣) = ∣ Q_{j} ∣^{- 1/2}$ , the atomic normalization. Without the extra hypothesis $b_{j}$ need not lie in $L^{2}$ , since $f$ is only assumed integrable. $□$

Connections Master

Hardy-Littlewood maximal function and the Vitali covering lemma 02.19.01. The direct prerequisite and dual sibling. The dyadic stopping-time selection used here to build the cubes ${Q_{j}}$ is the same selection that produced the weak- $(1, 1)$ bound for the dyadic maximal operator there; the level set ${M_{D} f > λ}$ is literally the bad set $Ω = ⋃_{j} Q_{j}$ , and the measure bound $\sum_{j} ∣ Q_{j} ∣ \leq λ^{- 1} ∥ f ∥_{1}$ is the maximal inequality re-read as a statement about the support of the bad part.
$L^{p}$ spaces, Hölder, Minkowski, Riesz-Fischer completeness 02.07.06. The direct prerequisite supplying the function-space scaffolding: the $L^{2}$ estimate on the good part, the $L^{1}$ control on the bad part, the Tonelli interchange in the bad-part kernel estimate, and the Marcinkiewicz interpolation that converts the weak- $(1, 1)$ plus strong- $(2, 2)$ endpoints into the full $L^{p}$ range $1 < p < \infty$ .
Singular integral operators and the Hilbert transform [forward: 02.19.03]. The principal successor. Every Calderón-Zygmund operator (the Hilbert transform, the Riesz transforms, convolution with a Calderón-Zygmund kernel) is shown bounded on $L^{p}$ for $1 < p < \infty$ and of weak type $(1, 1)$ by feeding its $L^{2}$ -boundedness and kernel regularity into the good-part/bad-part argument proved in Exercises 7 and 8.
Littlewood-Paley theory and Fourier multipliers [forward: 02.20.01]. The vector-valued decomposition (Theorem 4) with $H = ℓ^{2}$ is the device that promotes the scalar singular-integral bounds to the square-function and Mikhlin-Hörmander multiplier theorems, where each frequency-localized piece is tracked simultaneously.
BMO and the John-Nirenberg inequality [forward: 02.20.04]. The iterated stopping-time selection — re-decomposing each bad cube at a higher height — is the mechanism behind the John-Nirenberg exponential integrability of BMO functions and the Fefferman-Stein sharp-maximal characterization (Theorem 3) that identifies $BMO$ as the dual of the Hardy space $H^{1}$ .

Historical & philosophical context Master

The decomposition was introduced by Alberto Calderón and Antoni Zygmund in their 1952 *Acta Mathematica* paper *On the existence of certain singular integrals* ^{[Calderón-Zygmund 1952]}, the founding document of the real-variable theory of singular integrals. Their problem was to extend the $L^{p}$ -boundedness of the Hilbert transform, known on the line since Marcel Riesz's 1927 work via complex-analytic methods, to higher-dimensional convolution operators with Calderón-Zygmund kernels, where complex methods are unavailable. The decomposition supplied the missing real-variable tool: it let them prove the weak-type $(1, 1)$ endpoint directly, then interpolate. Their original formulation selected cubes by a covering argument on the level set of the function itself; the now-standard dyadic stopping-time form, which makes the parent-cube estimate transparent, was systematized in Elias Stein's 1970 monograph *Singular Integrals and Differentiability Properties of Functions* ^{[Stein 1970]}.

Lars Hörmander's 1960 *Acta Mathematica* paper *Estimates for translation invariant operators in $L^{p}$ spaces* ^{[Hörmander 1960]} isolated the integral smoothness condition on the kernel that is exactly what the bad-part argument consumes, replacing the pointwise gradient bound of Calderón-Zygmund and thereby covering kernels of merely Dini regularity. The decomposition's reach into the theory of Hardy spaces was made explicit by Charles Fefferman and Elias Stein in their 1972 *Acta Mathematica* paper *$H^{p}$ spaces of several variables* ^{[Fefferman-Stein 1972]}, where the bad bumps were recognized as atoms and the $H^{1}$ - $BMO$ duality was established, recasting the weak- $(1, 1)$ bound as the $L^{1}$ trace of a strong estimate on $H^{1}$ .

Bibliography Master

@article{CalderonZygmund1952,
  author  = {Calder\'on, Alberto P. and Zygmund, Antoni},
  title   = {On the existence of certain singular integrals},
  journal = {Acta Mathematica},
  volume  = {88},
  year    = {1952},
  pages   = {85--139}
}

@article{Hormander1960,
  author  = {H\"ormander, Lars},
  title   = {Estimates for translation invariant operators in $L^p$ spaces},
  journal = {Acta Mathematica},
  volume  = {104},
  year    = {1960},
  pages   = {93--140}
}

@article{FeffermanStein1972,
  author  = {Fefferman, Charles and Stein, Elias M.},
  title   = {$H^p$ spaces of several variables},
  journal = {Acta Mathematica},
  volume  = {129},
  year    = {1972},
  pages   = {137--193}
}

@book{Stein1970,
  author    = {Stein, Elias M.},
  title     = {Singular Integrals and Differentiability Properties of Functions},
  publisher = {Princeton University Press},
  year      = {1970}
}

@book{Stein1993,
  author    = {Stein, Elias M.},
  title     = {Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals},
  publisher = {Princeton University Press},
  year      = {1993}
}

@book{Grafakos2014,
  author    = {Grafakos, Loukas},
  title     = {Classical Fourier Analysis},
  edition   = {3},
  publisher = {Springer},
  year      = {2014}
}

@book{Duoandikoetxea2001,
  author    = {Duoandikoetxea, Javier},
  title     = {Fourier Analysis},
  publisher = {American Mathematical Society},
  year      = {2001}
}

Prerequisites

02.19.01
02.07.06

Tier anchors

beginner: Stein-Shakarchi 2005 *Real Analysis* (Princeton) Ch. 3; informal cut-at-height-$\lambda$ picture
intermediate: Stein 1970 *Singular Integrals and Differentiability Properties of Functions* (Princeton) Ch. I §3; Grafakos 2014 *Classical Fourier Analysis* 3e (Springer) §5.3
master: Stein 1993 *Harmonic Analysis* (Princeton) Ch. I §3; Duoandikoetxea 2001 *Fourier Analysis* (AMS) §2; Stein 1970 *Singular Integrals* (Princeton) Ch. I

References

Calderón-Zygmund — On the existence of certain singular integrals · Acta Mathematica 88 (1952), 85-139
Stein — Singular Integrals and Differentiability Properties of Functions · Ch. I, §3, the Calderón-Zygmund decomposition
Stein — Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals · Ch. I, §3
Grafakos — Classical Fourier Analysis, 3e · §5.3, the Calderón-Zygmund decomposition
Duoandikoetxea — Fourier Analysis · §2.5, Calderón-Zygmund decomposition and weak (1,1)
Hörmander — Estimates for translation invariant operators in L^p spaces · Acta Mathematica 104 (1960), 93-140
Fefferman-Stein — H^p spaces of several variables · Acta Mathematica 129 (1972), 137-193

Estimated time

beginner: 18m
intermediate: 55m
master: 90m