02.20.03 · analysis / littlewood-paley-interpolation

Littlewood-Paley Theory and the Square Function

shipped3 tiersLean: none

Anchor (Master): Stein 1970 *Singular Integrals* (Princeton) Ch. IV; Stein 1993 *Harmonic Analysis* (Princeton) Ch. I §8, VI-VII; Grafakos 2014 *Modern Fourier Analysis* 3e §6; Frazier-Jawerth-Weiss 1991 *Littlewood-Paley Theory and the Study of Function Spaces* (AMS CBMS 79)

Intuition Beginner

A sound engineer never works with a song as one undivided wall of noise. She splits it into frequency bands — the bass octave, the next octave up, and so on — and reads a separate level meter for each band. Two facts make this useful. First, the bands recombine: stack all the band signals back up and you recover the original song exactly. Second, the loudness of the whole song is faithfully reported by combining the band meters: a song is loud overall exactly when its bands carry a lot of energy between them, and quiet exactly when they do not. Littlewood-Paley theory is this idea turned into a theorem about functions.

The frequency axis gets chopped into "octaves" — bands where the frequency size roughly doubles from one band to the next. Each band picks out the part of the function vibrating at that scale, called a frequency piece. The square function is the band level meter: at each location you take the size of every frequency piece, square them, add them up, and take the square root, exactly the way you combine the sides of a right triangle to get the hypotenuse. This single number measures how much total energy, across all scales at once, the function carries near that point.

The headline claim is a faithfulness statement. Measuring the size of a function the ordinary way and measuring it through this band-combining square function give comparable answers — neither can be large without the other being large. So you may always trade the function for its menu of frequency pieces and lose nothing about its size. That trade is what makes scale-by-scale arguments legal in analysis.

The one-sentence takeaway: split a function into octave-frequency pieces, combine their sizes with a square-root-of-sum-of-squares, and the result faithfully reports the function's original size, so you can analyze any function one scale at a time.

Visual Beginner

Picture the frequency axis as a ruler. Place a row of smooth, overlapping bumps along it, each bump twice as wide and twice as far out as the one before: the first covers low frequencies, the next covers the octave above, and so on outward forever. Each bump is a band filter. Multiply a function's frequency content by one bump and you keep only that octave; the rest is silenced. Because the bumps add up to one everywhere, dropping nothing, stacking all the filtered pieces rebuilds the original.

The bottom panel shows the payoff. The three light curves are the function seen through three successive octave filters — each wigglier than the last, since higher octaves vibrate faster. The heavy curve below is the square function: at every point it bundles the three local sizes into one through the right-triangle combination. Where the heavy curve is tall, the function is carrying real energy nearby at some scale; where it is short, the function is locally calm at every scale. The theorem says the area-style size of this heavy curve matches the size of the original function.

Worked example Beginner

We build a square function for a signal made of two pure tones and read off how it combines their strengths.

Step 1. Take a signal with exactly two frequency components: a low tone of strength $3$ sitting in band one, and a high tone of strength $4$ sitting in band two. Every other band is silent. So the first frequency piece has size $3$ , the second has size $4$ , and all the rest have size $0$ .

Step 2. Form the square function at a point where these are the local sizes. Square each piece: $3$ squared is $9$ , and $4$ squared is $16$ . The silent bands contribute $0$ .

Step 3. Add the squares: $9 + 16 = 25$ . This total is the combined squared energy across all bands at that point.

Step 4. Take the square root: the square root of $25$ is $5$ . So the square function reads $5$ at that point.

What this tells us: the two tones of strength $3$ and $4$ do not simply add to $7$ , nor does the loud one alone ( $4$ ) dominate. They combine through the right-triangle rule into $5$ , the hypotenuse of the $3$ - $4$ triangle. This is exactly how the square function pools energy from different scales: a function spread across many octaves registers a size that blends all of them, and the Littlewood-Paley theorem promises this blended reading tracks the function's true overall size.

Check your understanding Beginner

Exercise (easy, multiple choice).

The square function at a point combines the sizes of the frequency pieces by:

A. Adding the sizes directly B. Taking the largest size among the pieces C. Squaring each size, adding, and taking the square root D. Multiplying the sizes together

Hint

The name "square function" describes the recipe: it involves squares and a root, the same combination that turns the two legs of a right triangle into its hypotenuse.

Answer

C. Squaring each size, adding, and taking the square root.

Feedback-correct: this is the square-root-of-sum-of-squares (the Euclidean or right-triangle combination), which is exactly how energy pools across independent frequency bands. Feedback-wrong: A overcounts when pieces partly cancel; B throws away all but one band; D is not a size measurement at all and would vanish whenever any single band is silent.

Formal definition Intermediate+

Throughout, $\cdot$ is the Fourier transform on $R^{n}$ of 02.10.04, normalized by $f (ξ) = \int f (x) e^{- 2 π i x \cdot ξ} d x$ , and $f$ ranges over the Schwartz class $S (R^{n})$ unless stated otherwise. Fix a Littlewood-Paley bump $ψ \in S (R^{n})$ whose Fourier transform $ψ$ is supported in the annulus ${\frac{1}{2} \leq ∣ ξ ∣ \leq 2}$ , is non-negative, and satisfies $j \in Z \sum ψ (2^{- j} ξ) = 1 for all ξ \neq = 0.$ Such a $ψ$ exists: take any non-negative $ϕ \in C_{c}^{\infty}$ equal to $1$ on ${∣ ξ ∣ \leq 1}$ and supported in ${∣ ξ ∣ \leq 2}$ , set $ψ (ξ) = ϕ (ξ) - ϕ (2 ξ)$ , which is supported in ${\frac{1}{2} \leq ∣ ξ ∣ \leq 2}$ and telescopes to the required partition.

Definition (Littlewood-Paley projections). The Littlewood-Paley projection at frequency scale $2^{j}$ is the Fourier multiplier $Δ_{j} f (ξ) = ψ (2^{- j} ξ) f (ξ), equivalently Δ_{j} f = ψ_{2^{j}} * f, ψ_{2^{j}} (x) = 2^{j n} ψ (2^{j} x) .$ The frequency-localization $Δ_{j}$ keeps the part of $f$ with frequencies $∣ ξ ∣ \sim 2^{j}$ . The partition of unity gives the dyadic frequency decomposition $f = j \in Z \sum Δ_{j} f$ with convergence in $L^{2}$ (and in $S^{'}$ modulo polynomials), since $\sum_{j} ψ (2^{- j} ξ) f (ξ) = f (ξ)$ for $ξ \neq = 0$ by the partition property.

Definition (Littlewood-Paley square function). The square function of $f$ is the $ℓ^{2}$ -norm of the projections, taken pointwise: $S f (x) = (j \in Z \sum ∣ Δ_{j} f (x) ∣^{2})^{1/2} .$ By Plancherel 02.10.04 and the almost-orthogonality of the $ψ (2^{- j} \cdot)$ (at most three overlap at any $ξ$ ), $∥ S f ∥_{L^{2}}^{2} = \sum_{j} ∥ Δ_{j} f ∥_{L^{2}}^{2} \sim ∥ f ∥_{L^{2}}^{2}$ , so the $L^{2}$ equivalence is built in; the content of the theory is the same equivalence on every $L^{p}$ .

Definition (Stein $g$ -function). Let $P_{t}$ be the Poisson semigroup, $P_{t} f (ξ) = e^{- 2 π t ∣ ξ ∣} f (ξ)$ , so $u (x, t) = P_{t} f (x)$ is the harmonic extension of $f$ to the upper half-space $R_{+}^{n + 1}$ . The Littlewood-Paley $g$ -function is $g (f) (x) = (\int_{0}^{\infty} t \partial_{t} P_{t} f (x)^{2} \frac{d t}{t})^{1/2},$ a continuous-parameter square function measuring, at each $x$ , the $L^{2} (d t / t)$ -energy of the vertical derivative of the harmonic extension. The Lusin area integral refines it by integrating over a cone $Γ_{α} (x) = {(y, t) : ∣ y - x ∣ < α t}$ : $S_{α} f (x) = (\int_{Γ_{α} (x)} ∣\nabla u (y, t) ∣^{2} t^{1 - n} d y d t)^{1/2} .$

Counterexamples to common slips Intermediate+

The square function is not the same as the partial sum. $S f$ takes the $ℓ^{2}$ -norm of the $∣ Δ_{j} f ∣$ before summing the geometric data; the reconstruction $\sum_{j} Δ_{j} f$ sums the $Δ_{j} f$ with their signs. Cancellation in the second is exactly what the first discards, which is why $∥ S f ∥_{p}$ controls $∥ f ∥_{p}$ from both sides while $\sum_{j} ∣ Δ_{j} f ∣$ does not.
The $L^{2}$ equivalence is elementary; the $L^{p}$ equivalence is not. On $L^{2}$ the comparison $∥ S f ∥_{2} \sim ∥ f ∥_{2}$ is one application of Plancherel and bounded overlap. There is no analogue of Plancherel on $L^{p}$ for $p \neq = 2$ , so the $L^{p}$ statement genuinely requires the vector-valued Calderón-Zygmund theorem or the $g$ -function machinery; treating it as "Plancherel again" is the central error.
The choice of bump $ψ$ does not matter — but only up to constants. Two admissible partitions give comparable square functions, $∥ S_{ψ} f ∥_{p} \sim ∥ S_{ψ^{'}} f ∥_{p}$ , with constants depending on $ψ, ψ^{'}, n, p$ . The space and the qualitative equivalence are independent of $ψ$ ; the numerical constant is not, and chasing the sharp constant is a separate and harder question.
The lower bound $∥ f ∥_{p} ≲ ∥ S f ∥_{p}$ is not automatic from the upper bound. It is recovered by polarization/duality, not by reversing inequalities: one uses the upper bound for the dual exponent together with the reproducing identity $\sum_{j} Δ_{j} \tilde{Δ}_{j} = I$ for a fattened companion $\tilde{Δ}_{j}$ . Assuming the reverse bound for free is a common gap.

Key theorem with proof Intermediate+

Theorem (Littlewood-Paley; $∥ f ∥_{p} \sim ∥ S f ∥_{p}$ ). Let $1 < p < \infty$ . There are constants $0 < c_{n, p} \leq C_{n, p} < \infty$ , depending only on $n$ , $p$ , and the bump $ψ$ , such that for all $f \in L^{p} (R^{n})$ , $c_{n, p} ∥ f ∥_{L^{p}} \leq S f_{L^{p}} = (j \sum ∣ Δ_{j} f ∣^{2})^{1/2}_{L^{p}} \leq C_{n, p} ∥ f ∥_{L^{p}} .$

Proof. The two directions split.

Upper bound via vector-valued Calderón-Zygmund. Regard the family ${Δ_{j}}_{j \in Z}$ as a single operator $T f = (Δ_{j} f)_{j}$ from scalar functions to $ℓ^{2}$ -valued functions. Its kernel is the $ℓ^{2}$ -valued convolution kernel $K (x) = (ψ_{2^{j}} (x))_{j}$ , acting by $T f (x) = \int K (x - y) f (y) d y$ . Three facts make $T$ a (Banach-space-valued) Calderón-Zygmund operator with target $ℓ^{2}$ :

$L^{2}$ bound. By Plancherel 02.10.04, $∥ T f ∥_{L^{2} (ℓ^{2})}^{2} = \sum_{j} ∥ Δ_{j} f ∥_{L^{2}}^{2} = \int (\sum_{j} ∣ ψ (2^{- j} ξ) ∣^{2}) ∣ f (ξ) ∣^{2} d ξ \leq 3∥ f ∥_{L^{2}}^{2}$ , since at most three of the annuli ${∣ ξ ∣ \sim 2^{j}}$ overlap at any $ξ$ and $0 \leq ψ \leq 1$ . So $∥ T ∥_{L^{2} \to L^{2} (ℓ^{2})} \leq 3$ .

Hörmander regularity of the $ℓ^{2}$ -valued kernel. We must bound $\int_{∣ x ∣ > 2∣ y ∣} ∥ K (x - y) - K (x) ∥_{ℓ^{2}} d x \leq B$ . By Minkowski's integral inequality 02.07.06 it suffices to control the kernel scale by scale and sum. Each $ψ_{2^{j}}$ is a Schwartz function at scale $2^{- j}$ with $\int ∣\nabla ψ_{2^{j}} (x) ∣ d x \leq c 2^{j}$ and $ψ_{2^{j}}$ concentrated on $∣ x ∣ ≲ 2^{- j}$ with rapid decay; the mean value bound gives, for fixed $y$ , $(j \sum ∣ ψ_{2^{j}} (x - y) - ψ_{2^{j}} (x) ∣^{2})^{1/2} \leq j \sum ∣ ψ_{2^{j}} (x - y) - ψ_{2^{j}} (x) ∣ \leq C \frac{∣ y ∣}{∣ x ∣ ^{n + 1}} (∣ x ∣ > 2∣ y ∣),$ where the last estimate is the standard summation of the scale-localized differences $min (2^{j} ∣ y ∣, 1) \cdot 2^{j n} (1 + 2^{j} ∣ x ∣)^{- N}$ over $j$ (the small scales $2^{j} ∣ y ∣ \leq 1$ contribute the factor $2^{j} ∣ y ∣$ , the large scales the rapid decay; the geometric series sums to $∣ y ∣∣ x ∣^{- n - 1}$ ). Integrating $∣ y ∣∣ x ∣^{- n - 1}$ over $∣ x ∣ > 2∣ y ∣$ gives a dimensional constant $B$ . Thus $K$ satisfies the $ℓ^{2}$ -valued Hörmander condition.

With the $L^{2}$ bound and the Hörmander condition, the vector-valued extension of the Calderón-Zygmund theorem 02.19.03 — whose proof is verbatim the scalar one, with absolute values replaced by the $ℓ^{2}$ -norm throughout the decomposition argument — yields weak type $(1, 1)$ for $T : L^{1} \to L^{1, \infty} (ℓ^{2})$ , and by Marcinkiewicz interpolation 02.07.06 and duality, $∥ S f ∥_{L^{p}} = ∥ T f ∥_{L^{p} (ℓ^{2})} \leq C_{n, p} ∥ f ∥_{L^{p}}, 1 < p < \infty.$

Lower bound via duality and a reproducing identity. Choose a fattened companion bump $ψ$ with $ψ = 1$ on $supp ψ$ and supported in ${\frac{1}{4} \leq ∣ ξ ∣ \leq 4}$ , so $Δ_{j} Δ_{j} = Δ_{j}$ , hence $\sum_{j} Δ_{j} Δ_{j} = \sum_{j} Δ_{j} = I$ on $S$ . For $f, h \in S$ , polarizing the decomposition and using $Δ_{j}^{*} = Δ_{- j -type}$ (the multiplier $\overline{ψ (2^{- j} \cdot)}$ ), $⟨ f, h ⟩ = j \sum ⟨ Δ_{j} Δ_{j} f, h ⟩ = j \sum ⟨ Δ_{j} f, Δ_{j}^{*} h ⟩ = \int j \sum Δ_{j} f (x) \overline{Δ_{j}^{*} h (x)} d x .$ By Cauchy-Schwarz in $j$ pointwise, then Hölder 02.07.06 with exponents $p, p^{'}$ , $∣ ⟨ f, h ⟩ ∣ \leq \int S f (x) S h (x) d x \leq ∥ S f ∥_{L^{p}} ∥ S h ∥_{L^{p^{'}}},$ where $S h = (\sum_{j} ∣ Δ_{j}^{*} h ∣^{2})^{1/2}$ . The upper bound already proved (applied to the companion family, which satisfies the same hypotheses) gives $∥ S h ∥_{L^{p^{'}}} \leq C_{n, p^{'}} ∥ h ∥_{L^{p^{'}}}$ . Taking the supremum over $∥ h ∥_{p^{'}} \leq 1$ and using the duality $∥ f ∥_{p} = sup_{∥ h ∥_{p^{'}} \leq 1} ∣ ⟨ f, h ⟩ ∣$ , $∥ f ∥_{L^{p}} \leq C_{n, p^{'}} ∥ S f ∥_{L^{p}},$ which is the lower bound with $c_{n, p} = C_{n, p^{'}}^{- 1}$ . $□$

Bridge. This theorem builds toward every scale-by-scale argument in modern harmonic analysis and dispersive PDE, and it appears again in the proof of the Marcinkiewicz and Hörmander-Mikhlin multiplier theorems, in the definition of Besov and Triebel-Lizorkin spaces, and in the Strichartz machinery downstream. The foundational reason it works is that the square function $T f = (Δ_{j} f)_{j}$ is exactly a Calderón-Zygmund operator with values in the Hilbert space $ℓ^{2}$ , so the scalar theory of 02.19.03 transfers wholesale once absolute values become $ℓ^{2}$ -norms; this is exactly the mechanism by which "one bound on $L^{2}$ plus kernel smoothness" propagates to the full open scale, now for the vector-valued kernel. The central insight is that orthogonality of frequencies, which on $L^{2}$ is the bare Plancherel identity, survives onto every $L^{p}$ — not as orthogonality but as the $ℓ^{2}$ -square-function equivalence — and this is dual to itself: the lower bound is the upper bound for the conjugate exponent, fed through the reproducing identity $\sum_{j} Δ_{j} Δ_{j} = I$ . Putting these together, the Littlewood-Paley theorem generalises Plancherel from $p = 2$ to $1 < p < \infty$ , and the bridge is the Khintchine inequality, which in the Master tier replaces the vector-valued Calderón-Zygmund black box by an explicit randomization that turns the square function into an average of ordinary multiplier operators.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Prove the Khintchine inequality in the form needed for randomization: for Rademacher signs $ε_{j}$ (independent, $P (ε_{j} = \pm 1) = \frac{1}{2}$ ), scalars $a_{1}, \dots, a_{N}$ , and $0 < p < \infty$ , there are constants $0 < A_{p} \leq B_{p} < \infty$ with $A_{p} (j \sum ∣ a_{j} ∣^{2})^{1/2} \leq (E j \sum ε_{j} a_{j}^{p})^{1/ p} \leq B_{p} (j \sum ∣ a_{j} ∣^{2})^{1/2} .$

Hint

For the upper bound with $p \geq 2$ use $E e^{s \sum ε_{j} a_{j}} \leq e^{s^{2} \sum a_{j}^{2} /2}$ (each $E e^{s ε_{j} a_{j}} = cosh (s a_{j}) \leq e^{s^{2} a_{j}^{2} /2}$ ) and optimize; for $0 < p < 2$ interpolate. The lower bound follows from the $p = 2$ identity and Hölder.

Answer

Set $σ^{2} = \sum_{j} a_{j}^{2}$ ; by scaling assume $σ = 1$ , and write $X = \sum_{j} ε_{j} a_{j}$ . The case $p = 2$ is exact: $E ∣ X ∣^{2} = \sum_{j, k} a_{j} a_{k} E [ε_{j} ε_{k}] = \sum_{j} a_{j}^{2} = 1$ by independence and $E ε_{j} = 0$ .

Upper bound, $p \geq 2$ . Since $cosh u \leq e^{u^{2} /2}$ , independence gives $E e^{s X} = \prod_{j} cosh (s a_{j}) \leq e^{s^{2} \sum a_{j}^{2} /2} = e^{s^{2} /2}$ for all real $s$ . Hence the tail bound $P (∣ X ∣ > λ) \leq 2 e^{- λ^{2} /2}$ (optimize $s = λ$ in $P (X > λ) \leq e^{- s λ} E e^{s X}$ , double for $\pm$ ). Integrating, $E ∣ X ∣^{p} = p \int_{0}^{\infty} λ^{p - 1} P (∣ X ∣ > λ) d λ \leq 2 p \int_{0}^{\infty} λ^{p - 1} e^{- λ^{2} /2} d λ = B_{p}^{p} < \infty$ , a constant depending only on $p$ . For $0 < p < 2$ , the upper bound follows from $p = 2$ by Hölder/Jensen: $(E ∣ X ∣^{p})^{1/ p} \leq (E ∣ X ∣^{2})^{1/2} = 1$ .

Lower bound. For $p \geq 2$ it is Jensen: $(E ∣ X ∣^{p})^{1/ p} \geq (E ∣ X ∣^{2})^{1/2} = 1$ . For $0 < p < 2$ , interpolate the second moment between $p$ and a larger exponent $q > 2$ : by Hölder with $θ \in (0, 1)$ , $1 = E ∣ X ∣^{2} \leq (E ∣ X ∣^{p})^{θ} (E ∣ X ∣^{q})^{(1 - θ)}$ where $2 = θ p + (1 - θ) q$ ; using the already-proved upper bound $E ∣ X ∣^{q} \leq B_{q}^{q}$ gives $(E ∣ X ∣^{p})^{1/ p} \geq A_{p} > 0$ . Undoing the normalization restores the factor $(\sum a_{j}^{2})^{1/2}$ on both sides.

Exercise 4 (medium, symbolic).

Use the Khintchine inequality to reduce the upper bound $∥ S f ∥_{p} \leq C ∥ f ∥_{p}$ to a uniform bound on the randomized multiplier operators $T_{ε} = \sum_{j} ε_{j} Δ_{j}$ over all sign sequences $ε = (ε_{j})$ .

Hint

Apply Khintchine pointwise in $x$ with $a_{j} = Δ_{j} f (x)$ , then integrate $d x$ and take expectations in $ε$ , swapping with Fubini.

Answer

Fix $x$ and apply Khintchine (Exercise 3) with $a_{j} = Δ_{j} f (x)$ : $A_{p}^{p} S f (x)^{p} = A_{p}^{p} (j \sum ∣ Δ_{j} f (x) ∣^{2})^{p /2} \leq E_{ε} j \sum ε_{j} Δ_{j} f (x)^{p} = E_{ε} ∣ T_{ε} f (x) ∣^{p} .$ Integrate in $x$ and use Tonelli 02.07.06 to swap $\int d x$ with $E_{ε}$ : $A_{p}^{p} ∥ S f ∥_{L^{p}}^{p} \leq \int E_{ε} ∣ T_{ε} f (x) ∣^{p} d x = E_{ε} ∥ T_{ε} f ∥_{L^{p}}^{p} \leq ε sup ∥ T_{ε} f ∥_{L^{p}}^{p} .$ Thus $∥ S f ∥_{L^{p}} \leq A_{p}^{- 1} sup_{ε} ∥ T_{ε} f ∥_{L^{p}}$ . So it suffices to bound the single-operator norms $∥ T_{ε} ∥_{L^{p} \to L^{p}}$ uniformly in the signs $ε$ . Each $T_{ε}$ has multiplier $m_{ε} (ξ) = \sum_{j} ε_{j} ψ (2^{- j} ξ)$ , which satisfies the Mikhlin-Hörmander bounds $∣ \partial^{α} m_{ε} (ξ) ∣ \leq C_{α} ∣ ξ ∣^{- ∣ α ∣}$ with constants independent of $ε$ (each $ξ$ sees at most three terms, and the dyadic derivatives telescope), so the Mikhlin multiplier theorem 02.19.03 gives $∥ T_{ε} ∥_{L^{p} \to L^{p}} \leq C_{n, p}$ uniformly. This is the randomization route to the upper bound, bypassing the vector-valued kernel.

Exercise 5 (medium, symbolic).

Show the reproducing identity used in the lower bound: if $ψ = 1$ on $supp ψ$ , then $Δ_{j} Δ_{j} = Δ_{j}$ and $\sum_{j} Δ_{j} Δ_{j} = I$ on $S$ (modulo the frequency origin).

Hint

Work on the Fourier side: $Δ_{j} Δ_{j} f = ψ (2^{- j} ξ) ψ (2^{- j} ξ) f$ .

Answer

On the Fourier side $Δ_{j} Δ_{j} f (ξ) = ψ (2^{- j} ξ) ψ (2^{- j} ξ) f (ξ)$ . When $ψ (2^{- j} ξ) \neq = 0$ we have $2^{- j} ξ \in supp ψ$ , where $ψ = 1$ by hypothesis, so $ψ (2^{- j} ξ) ψ (2^{- j} ξ) = ψ (2^{- j} ξ)$ for every $ξ$ ; hence $Δ_{j} Δ_{j} = Δ_{j}$ . Summing over $j$ and using the partition $\sum_{j} ψ (2^{- j} ξ) = 1$ for $ξ \neq = 0$ , $\sum_{j} Δ_{j} Δ_{j} f (ξ) = j \sum ψ (2^{- j} ξ) f (ξ) = f (ξ), ξ \neq = 0.$ So $\sum_{j} Δ_{j} Δ_{j} = I$ on Schwartz functions whose transform vanishes at $0$ (and on all of $S$ modulo constants, the frequency origin being a null set). This identity is what lets the duality pairing $⟨ f, h ⟩ = \sum_{j} ⟨ Δ_{j} f, Δ_{j}^{*} h ⟩$ be split by Cauchy-Schwarz into the two square functions.

Exercise 6 (medium, symbolic).

Prove the $L^{2}$ identity for the Stein $g$ -function: $∥ g (f) ∥_{L^{2}}^{2} = \frac{1}{4} ∥ f ∥_{L^{2}}^{2}$ for $f \in L^{2} (R^{n})$ .

Hint

By Plancherel, $t \partial_{t} P_{t} f$ has Fourier transform $- 2 π t ∣ ξ ∣ e^{- 2 π t ∣ ξ ∣} f (ξ)$ . Compute $\int_{0}^{\infty} (2 π t ∣ ξ ∣)^{2} e^{- 4 π t ∣ ξ ∣} d t / t$ .

Answer

Since $P_{t} f (ξ) = e^{- 2 π t ∣ ξ ∣} f (ξ)$ , $\partial_{t}$ gives $\partial_{t} P_{t} f (ξ) = - 2 π ∣ ξ ∣ e^{- 2 π t ∣ ξ ∣} f (ξ)$ , so $t \partial_{t} P_{t} f (ξ) = - 2 π t ∣ ξ ∣ e^{- 2 π t ∣ ξ ∣} f (ξ)$ . By Plancherel 02.10.04 for each fixed $t$ and then Tonelli 02.07.06 in $(x, t)$ , $∥ g (f) ∥_{2}^{2} = \int_{0}^{\infty} \int_{R^{n}} ∣ t \partial_{t} P_{t} f (x) ∣^{2} d x \frac{d t}{t} = \int_{R^{n}} ∣ f (ξ) ∣^{2} (\int_{0}^{\infty} (2 π t ∣ ξ ∣)^{2} e^{- 4 π t ∣ ξ ∣} \frac{d t}{t}) d ξ .$ Substitute $s = 4 π t ∣ ξ ∣$ in the inner integral: $\int_{0}^{\infty} (2 π t ∣ ξ ∣)^{2} e^{- 4 π t ∣ ξ ∣} \frac{d t}{t} = \int_{0}^{\infty} \frac{s ^{2}}{4} e^{- s} \frac{d s}{s} = \frac{1}{4} \int_{0}^{\infty} s e^{- s} d s = \frac{1}{4} Γ (2) = \frac{1}{4}$ . The inner integral is the dimensionless constant $\frac{1}{4}$ independent of $ξ$ , so $∥ g (f) ∥_{2}^{2} = \frac{1}{4} \int ∣ f ∣^{2} = \frac{1}{4} ∥ f ∥_{2}^{2}$ . This is the continuous-parameter analogue of the bounded-overlap $L^{2}$ identity for $S$ , and the source of the $g$ -function's $L^{p}$ equivalence.

Exercise 7 (hard, symbolic).

Prove the vector-valued Hörmander condition for the $g$ -function in its singular-integral guise: the operator $f \mapsto (t \partial_{t} P_{t} f)_{t > 0}$ is a Calderón-Zygmund operator with values in the Hilbert space $H = L^{2} (R_{+}, d t / t)$ , by showing its $H$ -valued kernel $K (x) = (t \partial_{t} P_{t} (x))_{t > 0}$ satisfies $\int_{∣ x ∣ > 2∣ y ∣} ∥ K (x - y) - K (x) ∥_{H} d x \leq B$ .

Hint

The Poisson kernel is $P_{t} (x) = c_{n} t (t^{2} + ∣ x ∣^{2})^{- (n + 1) /2}$ ; $t \partial_{t} P_{t} (x)$ is homogeneous of degree $- n$ jointly in $(x, t)$ . Bound $∥ \nabla_{x} (t \partial_{t} P_{t}) (x) ∥_{H} \leq C ∣ x ∣^{- n - 1}$ and apply the mean value inequality.

Answer

Write $Q_{t} (x) = t \partial_{t} P_{t} (x)$ . From $P_{t} (x) = c_{n} t (t^{2} + ∣ x ∣^{2})^{- (n + 1) /2}$ one computes that $Q_{t} (x)$ is smooth on $R_{+}^{n + 1}$ and jointly homogeneous of degree $- n$ : $Q_{λ t} (λ x) = λ^{- n} Q_{t} (x)$ . Define $K (x) = (Q_{t} (x))_{t > 0} \in H = L^{2} (d t / t)$ . By the homogeneity and smoothness, for $x \neq = 0$ , $∥ K (x) ∥_{H}^{2} = \int_{0}^{\infty} ∣ Q_{t} (x) ∣^{2} \frac{d t}{t} = ∣ x ∣^{- 2 n} \int_{0}^{\infty} ∣ Q_{s} (\overset{x}{^}) ∣^{2} \frac{d s}{s} = C ∣ x ∣^{- 2 n},$ substituting $t = ∣ x ∣ s$ , $\overset{x}{^} = x /∣ x ∣$ (the remaining integral is a finite constant by the rapid decay of $Q_{s}$ in $s$ and $1/ s$ ). So $∥ K (x) ∥_{H} \leq C ∣ x ∣^{- n}$ , the $H$ -valued size bound. Differentiating in $x$ lowers homogeneity by one: $∥ \nabla_{x} K (x) ∥_{H} \leq C ∣ x ∣^{- n - 1}$ by the same scaling computation applied to $\nabla_{x} Q_{t}$ . The mean value inequality along the segment from $x$ to $x - y$ , valid since $∣ x - θ y ∣ \geq ∣ x ∣/2$ for $∣ x ∣ > 2∣ y ∣$ , gives $∥ K (x - y) - K (x) ∥_{H} \leq ∣ y ∣ 0 \leq θ \leq 1 sup ∥ \nabla_{x} K (x - θ y) ∥_{H} \leq C ∣ y ∣ (∣ x ∣/2)^{- n - 1} = C^{'} ∣ y ∣ ∣ x ∣^{- n - 1} .$ Integrating over $∣ x ∣ > 2∣ y ∣$ , $\int_{∣ x ∣ > 2∣ y ∣} ∥ K (x - y) - K (x) ∥_{H} d x \leq C^{'} ∣ y ∣ ω_{n - 1} \int_{2∣ y ∣}^{\infty} r^{- n - 1} r^{n - 1} d r = C^{'} ω_{n - 1} /2 =: B$ , independent of $y$ . With the $L^{2}$ bound of Exercise 6 (norm $\frac{1}{2}$ ), the $H$ -valued Calderón-Zygmund theorem 02.19.03 gives $∥ g (f) ∥_{L^{p}} = ∥ T f ∥_{L^{p} (H)} \leq C_{n, p} ∥ f ∥_{L^{p}}$ for $1 < p < \infty$ ; the reverse bound follows by the same duality/polarization as for $S$ , using $- 4 \int_{0}^{\infty} t \partial_{t} P_{t} t \partial_{t} P_{t} d t / t$ -type reproducing identities. Hence $∥ g (f) ∥_{L^{p}} \sim ∥ f ∥_{L^{p}}$ .

Exercise 8 (hard, symbolic).

Deduce the Marcinkiewicz multiplier theorem in the dyadic form from the Littlewood-Paley theorem: if $m \in L^{\infty} (R)$ has uniformly bounded dyadic variation, $sup_{j} (∣ m (ξ) ∣ on I_{j} plus \int_{I_{j}} ∣ m^{'} ∣ d ξ) \leq A$ over the dyadic intervals $I_{j} = {2^{j} \leq ∣ ξ ∣ < 2^{j + 1}}$ , then $T_{m}$ is bounded on $L^{p} (R)$ for $1 < p < \infty$ .

Hint

Write $T_{m} f = \sum_{j} T_{m} Δ_{j} f$ , note $T_{m} Δ_{j}$ has multiplier $m ψ (2^{- j} \cdot)$ supported in $I_{j - 1} \cup I_{j} \cup I_{j + 1}$ , and bound $∥ T_{m} Δ_{j} f ∥$ via the variation hypothesis; then apply $S$ to $T_{m} f$ and use $∥ T_{m} f ∥_{p} \sim ∥ S (T_{m} f) ∥_{p}$ .

Answer

By the lower Littlewood-Paley bound, $∥ T_{m} f ∥_{p} \leq c^{- 1} ∥ S (T_{m} f) ∥_{p} = c^{- 1} (\sum_{k} ∣ Δ_{k} T_{m} f ∣^{2})^{1/2}_{p}$ . Since $ψ (2^{- k} \cdot)$ is supported in ${∣ ξ ∣ \sim 2^{k}}$ , the multiplier of $Δ_{k} T_{m}$ is $m ψ (2^{- k} \cdot)$ , supported in $I_{k - 1} \cup I_{k} \cup I_{k + 1}$ , where the dyadic-variation hypothesis controls $m$ : writing $m_{k} = m ψ (2^{- k} \cdot)$ , the bounds $∥ m_{k} ∥_{\infty} \leq C A$ and $\int ∣ m_{k}^{'} ∣ \leq C A$ (product rule, using $\int ∣ (ψ (2^{- k} \cdot))^{'} ∣ \leq C$ uniformly) make each $Δ_{k} T_{m}$ a multiplier operator of Mikhlin type at scale $2^{k}$ with constant $C A$ , so by almost-orthogonality $Δ_{k} T_{m} f = Δ_{k} T_{m} Δ_{k} f$ with $Δ_{k}$ a fattened projection. Therefore, pointwise in the $ℓ^{2}$ sense and using the uniform Mikhlin bound, $S (T_{m} f) = (k \sum ∣ Δ_{k} T_{m} Δ_{k} f ∣^{2})^{1/2} \leq C A (k \sum ∣ Δ_{k} f ∣^{2})^{1/2} = C A S f$ after a vector-valued (Khintchine-randomized) application of the uniform single-scale bound, exactly as in Exercise 4. Taking $L^{p}$ norms and applying the upper Littlewood-Paley bound to $S$ , $∥ T_{m} f ∥_{p} \leq c^{- 1} ∥ S (T_{m} f) ∥_{p} \leq c^{- 1} C A ∥ S f ∥_{p} \leq c^{- 1} C A C_{n, p} ∥ f ∥_{p} .$ Hence $T_{m}$ is bounded on $L^{p} (R)$ , $1 < p < \infty$ , with norm $\leq C_{p} A$ . The Littlewood-Paley square function is precisely the device that turns a per-octave control of the multiplier into a global $L^{p}$ bound — the multiplier need not be smooth, only of bounded variation on each dyadic block.

Advanced results Master

Theorem 1 (Littlewood-Paley $L^{p}$ equivalence; Littlewood-Paley 1937 Proc. London Math. Soc. 42, 52; Stein 1970 Singular Integrals Ch. IV). For $1 < p < \infty$ , $∥ f ∥_{L^{p}} \sim ∥ S f ∥_{L^{p}}$ with constants depending only on $n, p, ψ$ . The square function is a Hilbert-space-valued Calderón-Zygmund operator; the equivalence is the vector-valued 02.19.03 applied to the $ℓ^{2}$ -valued kernel $(ψ_{2^{j}})_{j}$ , with the lower bound recovered by duality through the reproducing identity $\sum_{j} Δ_{j} Δ_{j} = I$ ^{[Stein 1970-SI]}.

Theorem 2 (Khintchine randomization and the multiplier route; Marcinkiewicz 1939 Studia Math. 8, 78). With Rademacher signs $ε_{j}$ , $S f (x) \sim_{p} (E_{ε} ∣ \sum_{j} ε_{j} Δ_{j} f (x) ∣^{p})^{1/ p}$ pointwise by Khintchine, so $∥ S f ∥_{p} \sim_{p} (E_{ε} ∥ T_{ε} f ∥_{p}^{p})^{1/ p}$ . Each $T_{ε} = \sum_{j} ε_{j} Δ_{j}$ is a Mikhlin multiplier with $ε$ -independent constants, whence $sup_{ε} ∥ T_{ε} ∥_{L^{p} \to L^{p}} < \infty$ and the upper bound follows without the vector-valued kernel. This is the proof Stein attributes to the randomization idea, making the orthogonality survive onto $L^{p}$ as an average of scalar operators ^{[Marcinkiewicz 1939]}.

Theorem 3 (Stein $g$ -function equivalence; Stein 1970 Topics in Harmonic Analysis AMS Studies 63). For $1 < p < \infty$ , $∥ g (f) ∥_{L^{p}} \sim ∥ f ∥_{L^{p}}$ , where $g (f) = (\int_{0}^{\infty} ∣ t \partial_{t} P_{t} f ∣^{2} d t / t)^{1/2}$ . The proof parallels Theorem 1 with the continuous Hilbert space $H = L^{2} (R_{+}, d t / t)$ in place of $ℓ^{2}$ : the $H$ -valued kernel $(t \partial_{t} P_{t})_{t > 0}$ is jointly homogeneous of degree $- n$ , satisfies the $H$ -valued size and Hörmander bounds, and is $L^{2}$ -bounded with norm $\frac{1}{2}$ , so the vector-valued Calderón-Zygmund theorem applies and duality gives the lower bound ^{[Stein 1970-Topics]}.

Theorem 4 (Lusin area integral; Marcinkiewicz-Zygmund, Stein). The area integral $S_{α} f (x) = (\int_{Γ_{α} (x)} ∣\nabla u ∣^{2} t^{1 - n} d y d t)^{1/2}$ , integrating the full gradient of the harmonic extension over a cone, satisfies $∥ S_{α} f ∥_{L^{p}} \sim ∥ f ∥_{L^{p}}$ for $1 < p < \infty$ , with constants depending on the aperture $α$ . The pointwise relation $g (f) \leq C_{α} S_{α} f$ and a Fubini argument over cones link the area integral to the $g$ -function; the area integral additionally controls nontangential boundary behavior, making it the tool for almost-everywhere convergence and for the $H^{p}$ / $BMO$ theory ^{[Stein 1970-SI]}.

Theorem 5 (Besov and Triebel-Lizorkin spaces; Frazier-Jawerth-Weiss 1991 CBMS 79). The Littlewood-Paley pieces define the scale of smoothness spaces: $∥ f ∥_{\dot{F}_{p, q}^{s}} = ∥ (\sum_{j} 2^{j s q} ∣ Δ_{j} f ∣^{q})^{1/ q} ∥_{L^{p}}$ (Triebel-Lizorkin) and $∥ f ∥_{\dot{B}_{p, q}^{s}} = (\sum_{j} 2^{j s q} ∥ Δ_{j} f ∥_{p}^{q})^{1/ q}$ (Besov). For $s = 0, q = 2$ the Triebel-Lizorkin norm is the square-function norm, so $\dot{F}_{p, 2}^{0} = L^{p}$ ( $1 < p < \infty$ ) is exactly Theorem 1; Sobolev spaces are $\dot{F}_{p, 2}^{s} = \dot{W}^{s, p}$ , Hölder-Zygmund spaces are $\dot{B}_{\infty, \infty}^{s}$ , and the whole functional-analytic apparatus of PDE is encoded in the $Δ_{j}$ ^{[Frazier-Jawerth-Weiss 1991]}.

Theorem 6 (Hörmander-Mikhlin and the BMO endpoint). The square function reproves the Hörmander-Mikhlin multiplier theorem (a symbol $m$ with $∣ \partial^{α} m ∣ \leq A ∣ ξ ∣^{- ∣ α ∣}$ up to order $⌊ n /2 ⌋ + 1$ is $L^{p}$ -bounded) by per-octave control, and it is exactly the analytic object behind the Carleson-measure characterization of $BMO$ 02.20.01: $φ \in BMO$ iff $∣\nabla u ∣^{2} t d x d t$ is a Carleson measure, i.e. iff the area integral of $φ$ is controlled in the averaged sense. The square function thus bridges the $L^{p}$ theory and the $H^{1}$ - $BMO$ endpoint theory ^{[Stein 1970-SI]}.

Synthesis. The square function is the foundational reason that frequency orthogonality, an $L^{2}$ phenomenon, governs the entire scale $1 < p < \infty$ : the central insight is that $(Δ_{j} f)_{j}$ is a Calderón-Zygmund operator valued in a Hilbert space, so the scalar singular-integral theorem of 02.19.03 transfers verbatim, and this is exactly what converts Plancherel's single identity into the $L^{p}$ equivalence $∥ f ∥_{p} \sim ∥ S f ∥_{p}$ . Putting these together, three reformulations carry the same content: the discrete square function $S$ (Theorem 1), the randomized multiplier average $T_{ε}$ via Khintchine (Theorem 2), and the continuous $g$ - and area-integrals built from the Poisson semigroup (Theorems 3-4) — the bridge between them is that each is a vector-valued kernel ( $ℓ^{2}$ , the Rademacher average, $L^{2} (d t / t)$ ) over which the same Hörmander estimate runs. This generalises in two directions at once: vertically it is dual to itself, the lower bound being the upper bound for the conjugate exponent through $\sum_{j} Δ_{j} Δ_{j} = I$ ; horizontally it generalises the $L^{p} = \dot{F}_{p, 2}^{0}$ identity into the full Besov-Triebel-Lizorkin scale (Theorem 5), so that Sobolev regularity, Hölder regularity, and Hardy-space membership are all read off the same dyadic pieces. The deepest reading is that the square function is dual to the maximal function: where the Hardy-Littlewood maximal operator controls size by taking suprema of averages, the square function controls size by taking $ℓ^{2}$ -norms of frequency increments, and the Fefferman-Stein theory of 02.20.01 shows these two are the twin pillars on which the $L^{p}$ , $H^{1}$ , and $BMO$ scales rest.

Full proof set Master

Proposition 1 ( $L^{2}$ two-sided bound for $S$ ). $\frac{1}{3} ∥ f ∥_{L^{2}} \leq ∥ S f ∥_{L^{2}} \leq 3 ∥ f ∥_{L^{2}}$ for all $f \in L^{2} (R^{n})$ .

Proof. By Plancherel 02.10.04, $∥ S f ∥_{2}^{2} = \sum_{j} ∥ Δ_{j} f ∥_{2}^{2} = \int M (ξ) ∣ f (ξ) ∣^{2} d ξ$ with $M (ξ) = \sum_{j} ∣ ψ (2^{- j} ξ) ∣^{2}$ . At most three annuli overlap, so $M \leq 3$ . For the lower bound, the partition $\sum_{j} ψ (2^{- j} ξ) = 1$ with at most three nonzero terms forces some term $\geq 1/3$ , whence $M (ξ) \geq 1/9$ . Thus $\frac{1}{9} ∥ f ∥_{2}^{2} \leq ∥ S f ∥_{2}^{2} \leq 3∥ f ∥_{2}^{2}$ . $□$

Proposition 2 ( $ℓ^{2}$ -valued Hörmander bound). The kernel $K (x) = (ψ_{2^{j}} (x))_{j}$ satisfies $\int_{∣ x ∣ > 2∣ y ∣} ∥ K (x - y) - K (x) ∥_{ℓ^{2}} d x \leq B$ for a dimensional $B$ .

Proof. Bound the $ℓ^{2}$ -norm by the $ℓ^{1}$ -norm, then sum scales. Each $ψ_{2^{j}}$ obeys the Schwartz bounds $∣ ψ_{2^{j}} (x) ∣ \leq C_{N} 2^{j n} (1 + 2^{j} ∣ x ∣)^{- N}$ and $∣\nabla ψ_{2^{j}} (x) ∣ \leq C_{N} 2^{j (n + 1)} (1 + 2^{j} ∣ x ∣)^{- N}$ . By the mean value theorem, for $∣ x ∣ > 2∣ y ∣$ , $∣ ψ_{2^{j}} (x - y) - ψ_{2^{j}} (x) ∣ \leq ∣ y ∣ sup_{0 \leq θ \leq 1} ∣\nabla ψ_{2^{j}} (x - θ y) ∣ \leq C_{N} ∣ y ∣ 2^{j (n + 1)} (1 + 2^{j} ∣ x ∣)^{- N}$ , using $∣ x - θ y ∣ \geq ∣ x ∣/2$ ; this difference is also bounded directly by $∣ ψ_{2^{j}} (x - y) ∣ + ∣ ψ_{2^{j}} (x) ∣ \leq C_{N} 2^{j n} (1 + 2^{j} ∣ x ∣)^{- N}$ . Taking the minimum of the two bounds and summing the geometric series in $j$ (small scales $2^{j} ∣ x ∣ \leq 1$ use the difference bound $\sim 2^{j} ∣ y ∣$ times $2^{j n}$ ; large scales use the rapid decay) yields $\sum_{j} ∣ ψ_{2^{j}} (x - y) - ψ_{2^{j}} (x) ∣ \leq C ∣ y ∣∣ x ∣^{- n - 1}$ . Hence $∥ K (x - y) - K (x) ∥_{ℓ^{2}} \leq C ∣ y ∣∣ x ∣^{- n - 1}$ , and $\int_{∣ x ∣ > 2∣ y ∣} ∣ y ∣∣ x ∣^{- n - 1} d x = ω_{n - 1} ∣ y ∣ \int_{2∣ y ∣}^{\infty} r^{- 2} d r = ω_{n - 1} /2 =: B$ . $□$

Proposition 3 (upper bound $∥ S f ∥_{p} \leq C_{n, p} ∥ f ∥_{p}$ ). For $1 < p < \infty$ , $T = (Δ_{j})_{j} : L^{p} \to L^{p} (ℓ^{2})$ is bounded.

Proof. By Proposition 1, $T$ is bounded $L^{2} \to L^{2} (ℓ^{2})$ with norm $\leq 3$ . By Proposition 2 the $ℓ^{2}$ -valued kernel satisfies the Hörmander condition. The Calderón-Zygmund argument of 02.19.03 applies with scalars replaced by $ℓ^{2}$ : the decomposition $f = g + b$ at height $λ$ gives a good part handled by the $L^{2} (ℓ^{2})$ bound through Chebyshev, and a bad part $b = \sum_{k} b_{k}$ handled off the doubled cubes by the $ℓ^{2}$ -valued Hörmander estimate $\int_{(2 Q_{k})^{c}} ∥ T b_{k} ∥_{ℓ^{2}} \leq B ∥ b_{k} ∥_{1}$ (the mean-zero cancellation subtracts $K (x - c_{k})$ in $ℓ^{2}$ ). This yields weak type $(1, 1)$ for $T$ ; Marcinkiewicz interpolation 02.07.06 between $(1, 1)$ and $(2, 2)$ gives $1 < p < 2$ , and duality of $L^{p} (ℓ^{2})$ with $L^{p^{'}} (ℓ^{2})$ gives $2 < p < \infty$ . $□$

Proposition 4 (lower bound $∥ f ∥_{p} \leq C ∥ S f ∥_{p}$ ). For $1 < p < \infty$ , $∥ f ∥_{L^{p}} \leq C_{n, p} ∥ S f ∥_{L^{p}}$ .

Proof. With $Δ_{j}$ the fattened companion ( $ψ = 1$ on $supp ψ$ ), the reproducing identity $\sum_{j} Δ_{j} Δ_{j} = I$ (verified on the Fourier side: $ψ (2^{- j} ξ) ψ (2^{- j} ξ) = ψ (2^{- j} ξ)$ , then partition) gives for $h \in S$ $⟨ f, h ⟩ = j \sum ⟨ Δ_{j} f, Δ_{j}^{*} h ⟩ = \int j \sum Δ_{j} f \overline{Δ_{j}^{*} h} \leq \int S f \cdot S h \leq ∥ S f ∥_{p} ∥ S h ∥_{p^{'}}$ by Cauchy-Schwarz in $j$ and Hölder 02.07.06. The companion family satisfies the same hypotheses, so Proposition 3 gives $∥ S h ∥_{p^{'}} \leq C_{n, p^{'}} ∥ h ∥_{p^{'}}$ . Taking the supremum over $∥ h ∥_{p^{'}} \leq 1$ and using $L^{p}$ - $L^{p^{'}}$ duality, $∥ f ∥_{p} \leq C_{n, p^{'}} ∥ S f ∥_{p}$ . $□$

Proposition 5 ( $g$ -function $L^{2}$ identity). $∥ g (f) ∥_{L^{2}} = \frac{1}{2} ∥ f ∥_{L^{2}}$ .

Proof. By Plancherel and Tonelli as in Exercise 6, $∥ g (f) ∥_{2}^{2} = \int ∣ f (ξ) ∣^{2} \int_{0}^{\infty} (2 π t ∣ ξ ∣)^{2} e^{- 4 π t ∣ ξ ∣} \frac{d t}{t} d ξ$ , and the inner integral equals $\frac{1}{4} Γ (2) = \frac{1}{4}$ after $s = 4 π t ∣ ξ ∣$ . Hence $∥ g (f) ∥_{2}^{2} = \frac{1}{4} ∥ f ∥_{2}^{2}$ . $□$

Proposition 6 (Khintchine upper-bound reduction). $∥ S f ∥_{L^{p}} \leq A_{p}^{- 1} sup_{ε} ∥ T_{ε} f ∥_{L^{p}}$ , where $T_{ε} = \sum_{j} ε_{j} Δ_{j}$ .

Proof. Khintchine (Exercise 3) with $a_{j} = Δ_{j} f (x)$ gives $A_{p}^{p} S f (x)^{p} \leq E_{ε} ∣ T_{ε} f (x) ∣^{p}$ pointwise. Integrating in $x$ and applying Tonelli 02.07.06 to swap $\int d x$ and $E_{ε}$ , $A_{p}^{p} ∥ S f ∥_{p}^{p} \leq E_{ε} ∥ T_{ε} f ∥_{p}^{p} \leq sup_{ε} ∥ T_{ε} f ∥_{p}^{p}$ . Each $m_{ε} = \sum_{j} ε_{j} ψ (2^{- j} \cdot)$ is Mikhlin with $ε$ -independent constants, so $sup_{ε} ∥ T_{ε} ∥_{p \to p} < \infty$ by 02.19.03. $□$

Connections Master

Calderón-Zygmund singular integral operators 02.19.03. The direct prerequisite and the engine. The square function $T = (Δ_{j})_{j}$ is literally a Calderón-Zygmund operator with values in $ℓ^{2}$ ; the weak- $(1, 1)$ /interpolation/duality machinery proved there transfers verbatim with absolute values replaced by the $ℓ^{2}$ -norm, and the Mikhlin multiplier theorem from there bounds each randomized operator $T_{ε}$ uniformly. The Littlewood-Paley theorem is the vector-valued payoff of the scalar singular-integral theory.
Fourier transform on $R^{n}$ and the Plancherel theorem 02.10.04. The source of the $L^{2}$ skeleton. The projections $Δ_{j}$ are Fourier multipliers, the partition of unity and bounded annular overlap give the $L^{2}$ two-sided bound by Plancherel, and the $g$ -function's $L^{2}$ identity $∥ g (f) ∥_{2} = \frac{1}{2} ∥ f ∥_{2}$ is a Plancherel computation on the Poisson symbol $e^{- 2 π t ∣ ξ ∣}$ .
L^p spaces, Hölder, Minkowski, Marcinkiewicz interpolation 02.07.06. The function-space scaffolding. Minkowski's integral inequality assembles the $ℓ^{2}$ -valued Hörmander bound, Hölder pairs the two square functions in the duality lower bound, Marcinkiewicz interpolates the vector-valued weak- $(1, 1)$ and $(2, 2)$ endpoints, and Tonelli swaps the spatial integral with the Rademacher expectation in the Khintchine reduction.
Hilbert space 02.11.08. The structural home of the construction. The square function takes values in $ℓ^{2}$ , the $g$ -function in $L^{2} (d t / t)$ , the area integral in $L^{2}$ of a cone; the vector-valued Calderón-Zygmund theorem requires precisely a Hilbert-space target so that orthogonality (Plancherel/Bessel) supplies the $L^{2}$ bound and the inner product supplies the duality pairing.
BMO and the John-Nirenberg inequality 02.20.01. The endpoint partner. The Carleson-measure characterization of $BMO$ is the statement that the area-integral energy $∣\nabla u ∣^{2} t d x d t$ of the Poisson extension is a Carleson measure; the square function is thus the analytic object that ports the cube-oscillation picture of $BMO$ to the half-space and links the $L^{p}$ theory to the $H^{1}$ - $BMO$ endpoint.
Strichartz estimates via the TT* method 02.21.02. The downstream application in dispersive PDE. Frequency-localized (Littlewood-Paley) pieces are the standard device for proving Strichartz and bilinear estimates: one decomposes the solution into dyadic frequency blocks, proves the estimate per block by the $T T^{*}$ method, and reassembles with the square-function equivalence, exactly the scale-by-scale strategy this unit licenses.

Historical & philosophical context Master

The square function was introduced by John Edensor Littlewood and Raymond Paley in a sequence of 1937 papers in the Proceedings of the London Mathematical Society ^{[Littlewood-Paley 1937]}, working on the circle with lacunary and dyadic blocks of Fourier series. Their motivation was to extend M. Riesz's $L^{p}$ theory of conjugate functions to a statement about the sizes of dyadic Fourier blocks, proving that the $L^{p}$ norm of a function is comparable to the $L^{p}$ norm of the square function of its dyadic Fourier pieces. The original proofs used complex-analytic methods specific to the disc — Blaschke products and the theory of analytic functions in $H^{p}$ — and did not transfer to several real variables.

The real-variable theory was created by Elias Stein, who recast the square function as a singular integral with values in a Hilbert space and built the continuous-parameter $g$ -function from the Poisson semigroup, developing the systematic theory in his 1970 monograph Singular Integrals and Differentiability Properties of Functions and the companion Topics in Harmonic Analysis Related to the Littlewood-Paley Theory ^{[Stein 1970-Topics]}. The randomization argument using Rademacher functions and the Khintchine inequality, which reduces the square function to an average of ordinary multiplier operators, traces to Józef Marcinkiewicz and to the Marcinkiewicz-Zygmund theory of the 1930s ^{[Marcinkiewicz 1939]}. The discrete smooth dyadic decomposition in the form used today, and its role in defining Besov and Triebel-Lizorkin spaces, was systematized by Michael Frazier, Björn Jawerth, and Guido Weiss ^{[Frazier-Jawerth-Weiss 1991]}, who recast the whole theory in terms of the $φ$ -transform and atomic/molecular decompositions.

Bibliography Master

@article{LittlewoodPaley1937,
  author  = {Littlewood, John E. and Paley, Raymond E. A. C.},
  title   = {Theorems on Fourier series and power series},
  journal = {Proceedings of the London Mathematical Society},
  volume  = {42},
  year    = {1937},
  pages   = {52--89}
}

@article{Marcinkiewicz1939,
  author  = {Marcinkiewicz, J\'ozef},
  title   = {Sur les multiplicateurs des s\'eries de Fourier},
  journal = {Studia Mathematica},
  volume  = {8},
  year    = {1939},
  pages   = {78--91}
}

@book{Stein1970SI,
  author    = {Stein, Elias M.},
  title     = {Singular Integrals and Differentiability Properties of Functions},
  publisher = {Princeton University Press},
  year      = {1970}
}

@book{Stein1970Topics,
  author    = {Stein, Elias M.},
  title     = {Topics in Harmonic Analysis Related to the Littlewood-Paley Theory},
  series    = {Annals of Mathematics Studies},
  number    = {63},
  publisher = {Princeton University Press},
  year      = {1970}
}

@book{Stein1993,
  author    = {Stein, Elias M.},
  title     = {Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscillatory Integrals},
  publisher = {Princeton University Press},
  year      = {1993}
}

@book{Grafakos2014Modern,
  author    = {Grafakos, Loukas},
  title     = {Modern Fourier Analysis},
  edition   = {3},
  publisher = {Springer},
  year      = {2014}
}

@book{FrazierJawerthWeiss1991,
  author    = {Frazier, Michael and Jawerth, Bj\"orn and Weiss, Guido},
  title     = {Littlewood-Paley Theory and the Study of Function Spaces},
  series    = {CBMS Regional Conference Series in Mathematics},
  number    = {79},
  publisher = {American Mathematical Society},
  year      = {1991}
}

Prerequisites

02.19.03
02.07.06
02.10.04
02.11.08

Tier anchors

beginner: Stein-Shakarchi 2011 *Functional Analysis* (Princeton Lectures in Analysis IV) Ch. 5; informal 'split a signal into octave bands and measure the energy in each' picture
intermediate: Stein 1970 *Singular Integrals and Differentiability Properties of Functions* (Princeton) Ch. IV; Grafakos 2014 *Modern Fourier Analysis* 3e (Springer) §6.1-6.2; Duoandikoetxea 2001 *Fourier Analysis* (AMS) §8
master: Stein 1970 *Singular Integrals* (Princeton) Ch. IV; Stein 1993 *Harmonic Analysis* (Princeton) Ch. I §8, VI-VII; Grafakos 2014 *Modern Fourier Analysis* 3e §6; Frazier-Jawerth-Weiss 1991 *Littlewood-Paley Theory and the Study of Function Spaces* (AMS CBMS 79)

References

Littlewood-Paley — Theorems on Fourier series and power series · Proc. London Math. Soc. 42 (1937), 52-89; 43 (1937), 105-126
Stein — Singular Integrals and Differentiability Properties of Functions · Ch. IV, the Littlewood-Paley g-function and square functions
Stein — Topics in Harmonic Analysis Related to the Littlewood-Paley Theory · Annals of Mathematics Studies 63 (1970), Princeton
Grafakos — Modern Fourier Analysis, 3e · §6.1-6.2, Littlewood-Paley theory and the square function
Marcinkiewicz — Sur les multiplicateurs des séries de Fourier · Studia Math. 8 (1939), 78-91
Frazier-Jawerth-Weiss — Littlewood-Paley Theory and the Study of Function Spaces · CBMS Regional Conference Series 79, AMS (1991)

Estimated time

beginner: 20m
intermediate: 60m
master: 90m