21.14.03 · number-theory / sieve-methods-large-sieve

Mean Values of Multiplicative Functions: Halász's Theorem

Anchor (Master): Halász 1968 *Acta Math. Acad. Sci. Hungar.* 19, 365 (the theorem on mean values of complex multiplicative functions); Wirsing 1967 *Math. Ann.* 143, 75 (mean values of real non-negative multiplicative functions); Granville-Soundararajan 2007 *Canad. J. Math.* 59, 1265 (the pretentious distance and the large sieve); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* §6; Tenenbaum 2015 *Introduction to Analytic and Probabilistic Number Theory* §III.4

Intuition Beginner

A multiplicative function $f$ assigns a number to each integer in a way that respects factorisation, and here we restrict to functions whose values never exceed $1$ in size. The basic question is the average: add up $f (1), f (2), \dots$ out to a cutoff $x$ and divide by $x$ . Sometimes this average settles to a clean positive limit, and sometimes it drifts to zero. Halász's theorem explains exactly when each happens.

The key idea is a notion of pretending. Picture a benchmark family of functions, the powers $n^{i t}$ , each of which oscillates in a smooth, predictable way as $n$ grows. The theorem says that $f$ has a large average only if $f$ closely imitates one of these benchmarks across the primes. If $f$ refuses to imitate any of them — if at the primes it points in too many different directions — then all that disagreement piles up and the average is forced toward zero.

Why does this matter? It turns a hard question about a wild sequence of values into a single measurement: how far is $f$ from its nearest pretender? That distance is a sum over primes, and once you compute it you can read off whether the running average survives or decays. The same one dial governs the mean of every unit-size multiplicative function at once, which is why the result reshaped the whole subject.

Visual Beginner

Imagine each prime contributing an arrow on a clock face: the arrow for prime $p$ points in the direction set by the value $f (p)$ compared with the pretender's value at $p$ . When the arrows mostly agree, they reinforce and the average stays large. When they scatter around the clock, they cancel and the average collapses.

   AGREEMENT (small distance)          DISAGREEMENT (large distance)
         arrows cluster                      arrows scatter
            \ | /                              \   |   /
             \|/                            <---  o  --->
        ------o------                           / | \
             /|\                               /  |  \
            / | \                             v   v   v
     mean stays LARGE                      mean decays to ZERO
   f pretends to be n^(it)             f pretends to be nothing

   distance over primes:  sum over p of (1 - agreement at p) / p
   small sum  ->  big average        large sum  ->  average ~ (1 + M) e^(-M)

The size of the average is controlled by one number $M$ : the smallest possible total disagreement between $f$ and any pretender $n^{i t}$ . Small $M$ means a large average; large $M$ drives the average down like $(1 + M)$ times $e^{- M}$ , which shrinks fast.

Worked example Beginner

Take $f (n) = 1$ for every integer, the most agreeable multiplicative function there is. Its pretender is the simplest benchmark, $n^{i t}$ with $t = 0$ , which is also just $1$ . Let us see why the average is large.

Step 1. Measure the disagreement at each prime. Since $f (p) = 1$ matches the benchmark $p^{0} = 1$ exactly, the disagreement at every prime is $0$ .

Step 2. Add up the disagreement over the primes, each weighted by $1/ p$ . Every term is $0$ , so the total $M$ is $0$ .

Step 3. Read off the average bound. The control factor is $(1 + M) e^{- M}$ . With $M = 0$ this is $(1 + 0) e^{0} = 1$ , the largest it can be. So the average is not forced down at all.

Step 4. Check directly. Adding $f (n) = 1$ out to $x$ and dividing by $x$ gives the average $1$ . The two answers agree: zero disagreement means a full-strength average.

Now flip to a function that disagrees everywhere. Take $f(p) = -1$ at every prime. Compared with the benchmark $1$, each prime now contributes the maximum disagreement, the weighted total $M$ grows in proportion to $\log\log x$ without bound, and $(1 + M) e^{-M}$ tumbles toward zero. The running average of this function indeed decays.

What this tells us: the single number $M$ , the total prime-by-prime disagreement with the nearest benchmark, decides everything. No disagreement gives a full average; unbounded disagreement gives a vanishing one.
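The two cases of the worked example can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper names `primes_up_to`, `disagreement_with_one` and `control_factor` are hypothetical, and the disagreement is measured against the single benchmark $t = 0$ only, exactly as in the steps above.

```python
import math


def primes_up_to(x):
    """Return the list of primes <= x (sieve of Eratosthenes)."""
    if x < 2:
        return []
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [n for n in range(2, x + 1) if sieve[n]]


def disagreement_with_one(value_at_primes, x):
    """Sum over primes p <= x of (1 - f(p)) / p, where f(p) is one real constant."""
    return sum((1.0 - value_at_primes) / p for p in primes_up_to(x))


def control_factor(M):
    """The factor (1 + M) * exp(-M) from Step 3 of the worked example."""
    return (1.0 + M) * math.exp(-M)


for x in (10**2, 10**3, 10**4, 10**5):
    agree = disagreement_with_one(1.0, x)   # f(p) = 1 at every prime
    flip = disagreement_with_one(-1.0, x)   # f(p) = -1 at every prime
    print(x, agree, control_factor(agree), round(flip, 3), round(control_factor(flip), 4))
```

The first pair of columns stays at zero disagreement and control factor $1$, while the second pair shows the disagreement growing slowly with $x$ and the control factor shrinking.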

Formal definition Intermediate+

Throughout, $f : \mathbb{Z}_{>0} \to \mathbb{C}$ is multiplicative with $|f(n)| \leq 1$ for all $n$; equivalently $f$ takes values in the closed unit disc $\overline{D}$ at every prime power. Write $\sigma = \operatorname{Re}(s)$, and let $\gamma$ denote the Euler-Mascheroni constant. The mean value of $f$ up to $x$ is $M_{f}(x) = \frac{1}{x} \sum_{n \leq x} f(n)$, and the summatory-function notation of 21.11.02 is in force. ^{[Montgomery-Vaughan §6]}

Definition (pretentious distance). For multiplicative $f, g$ with values in $\overline{D}$, the pretentious distance up to $x$ is $$ \mathbb{D}(f, g; x)^2 = \sum_{p \le x} \frac{1 - \operatorname{Re}\bigl(f(p)\,\overline{g(p)}\bigr)}{p}. $$ Each summand is non-negative because $\operatorname{Re}(f(p)\overline{g(p)}) \leq |f(p)|\,|g(p)| \leq 1$. The quantity $\mathbb{D}$ is a pseudometric: it satisfies the triangle inequality $\mathbb{D}(f, g; x) + \mathbb{D}(g, h; x) \geq \mathbb{D}(f, h; x)$. When $\mathbb{D}(f, g; x)$ stays bounded as $x \to \infty$, $f$ is said to pretend to be $g$.

Definition (distance to the benchmark family and the parameter $M$). The benchmark family is the set of functions $n \mapsto n^{it}$, $t \in \mathbb{R}$, each completely multiplicative with $|n^{it}| = 1$. For a height parameter $T \geq 1$, set $$ M(x, T) = \min_{|t| \le T}\ \sum_{p \le x} \frac{1 - \operatorname{Re}\bigl(f(p)\,p^{-it}\bigr)}{p} = \min_{|t| \le T}\mathbb{D}\bigl(f, n^{it}; x\bigr)^2. $$ This is the minimal squared distance from $f$ to the benchmark family; $M(x, T)$ is the single dial that governs the mean value.
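The definition of $M(x, T)$ can be explored numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, $f$ is specified only through its values at primes, and the minimum over $|t| \leq T$ is approximated by a uniform grid, so the returned value is only an upper bound for the true minimum.

```python
import cmath
import math


def primes_up_to(x):
    """Return the list of primes <= x (sieve of Eratosthenes)."""
    if x < 2:
        return []
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [n for n in range(2, x + 1) if sieve[n]]


def distance_squared(f_at_prime, t, primes):
    """D(f, n^{it}; x)^2 = sum over the given primes of (1 - Re(f(p) p^{-it})) / p."""
    total = 0.0
    for p in primes:
        twist = cmath.exp(-1j * t * math.log(p))  # p^{-it}
        total += (1.0 - (f_at_prime(p) * twist).real) / p
    return total


def min_distance_squared(f_at_prime, primes, T, steps=200):
    """Approximate M(x, T) on a uniform grid of t in [-T, T] (assumption: grid search)."""
    grid = [-T + 2.0 * T * k / steps for k in range(steps + 1)]
    best_t = min(grid, key=lambda t: distance_squared(f_at_prime, t, primes))
    return distance_squared(f_at_prime, best_t, primes), best_t


def power_half(p):
    """Values at primes of f(n) = n^{i/2}, a member of the benchmark family."""
    return cmath.exp(0.5j * math.log(p))


def minus_one(p):
    """Values at primes of a function with f(p) = -1, such as the Liouville function."""
    return -1.0


primes = primes_up_to(1000)
print(min_distance_squared(power_half, primes, T=1.0))
print(min_distance_squared(minus_one, primes, T=1.0))
```

For the benchmark-family member the grid search locates $t = 1/2$ with distance essentially zero, while for $f(p) = -1$ every $t$ in the window leaves a distance of order one or larger.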

Definition (Wirsing class, real non-negative case). A real-valued multiplicative $f \geq 0$ with $f(p) \leq \kappa$ on average (precisely, $\sum_{p \leq x} f(p) \frac{\log p}{p} = \kappa \log x + O(1)$ for a constant $\kappa > 0$) belongs to the Wirsing class of mean density $\kappa$. For such $f$ the mean value has a genuine asymptotic rather than only an upper bound.

Counterexamples to common slips

The benchmark must be the moving family $n^{it}$, not the single function $1$. A function can have a large mean while disagreeing wildly with the constant $1$: $f(n) = n^{i\tau}$ for fixed $\tau \neq 0$ has $|M_{f}(x)| \asymp 1/\sqrt{1 + \tau^{2}}$, yet $\mathbb{D}(f, 1; x)^{2} \asymp \log\log x \to \infty$. Only the distance to the nearest $n^{it}$ controls the mean.
Distance is not a metric, only a pseudometric. Two distinct functions can be at distance $0$ — e.g. $f$ and $g$ agreeing at every prime but differing at a prime square have $D (f, g; x) = 0$ since the distance sees only $f (p)$ . The triangle inequality holds; the identity-of-indiscernibles axiom fails.
Halász gives an upper bound, not an asymptotic, in the complex case. For complex $f$ the theorem bounds $∣ M_{f} (x) ∣$ above; pinning the limit requires either Wirsing's real hypothesis or the additional input that the minimising $t$ is essentially fixed (the Granville-Soundararajan refinement).

Key theorem with proof Intermediate+

The signature result is Halász's mean-value theorem, which converts the analytic input " $f$ is far from every $n^{i t}$ " into the arithmetic output " $M_{f} (x)$ is small." It is the soft, real-variable counterpart to the Selberg-Delange contour method of 21.11.05: where Selberg-Delange demands a clean factorisation $F (s) = ζ (s)^{z} G (s)$ and reads off an exact asymptotic, Halász asks only for $∣ f ∣ \leq 1$ and delivers a sharp upper bound governed by the pretentious distance. ^{[Halász 1968]}

Theorem (Halász). Let $f$ be multiplicative with $|f| \leq 1$. Fix $T \geq 1$ and let $M = M(x, T)$ be the minimal squared distance to the benchmark family. Then $$ \frac{1}{x}\Bigl|\sum_{n \le x} f(n)\Bigr| \ll (1 + M)\,e^{-M} + \frac{1}{T}, $$ with an absolute implied constant. In particular, if $\mathbb{D}(f, n^{it}; x) \to \infty$ for every fixed $t$, then $M_{f}(x) \to 0$.

Proof. Let $F(s) = \sum_{n} f(n) n^{-s}$, convergent for $\sigma > 1$. By the Perron-type identity and partial summation 21.11.02, the mean value is controlled by the size of $F(s)$ on the line $\sigma = 1 + 1/\log x$: writing $s = 1 + 1/\log x + i\tau$, one has the bound $$ \frac{1}{x}\Bigl|\sum_{n \le x} f(n)\Bigr| \ll \frac{1}{\log x}\int_{-T}^{T} \bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr|\,\frac{d\tau}{1 + |\tau|} + \frac{1}{T}. $$ The truncation error $1/T$ comes from the tail $|\tau| > T$, where the weight $1/(1 + |\tau|)$ makes the contribution summable.

The heart of the matter is the size of the Euler product. Taking logarithms, $$ \log\bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr| = \operatorname{Re}\sum_{p} \frac{f(p)}{p^{1 + 1/\log x + i\tau}} + O(1) = \sum_{p \le x}\frac{\operatorname{Re}\bigl(f(p) p^{-i\tau}\bigr)}{p} + O(1), $$ where the prime-power terms beyond the first contribute $O(1)$ since $|f| \leq 1$, and the cut-off at $p \leq x$ is justified by the regulariser $1/\log x$. Subtracting this from the comparison value $\sum_{p \leq x} 1/p = \log\log x + O(1)$ produces the squared distance, so $$ \log\bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr| = \log\log x - \mathbb{D}(f, n^{i\tau}; x)^2 + O(1). $$ Hence $|F| \ll (\log x) \exp(-\mathbb{D}(f, n^{i\tau}; x)^{2})$. For the minimising direction $\tau = t_{0}$ the exponent is $-M$; for nearby $\tau$ a quantitative version of the triangle inequality gives $\mathbb{D}(f, n^{i\tau}; x)^{2} \geq M + c \log(1 + |\tau - t_{0}| \log x) - O(1)$, so the integrand is largest in a band of width $\asymp 1/\log x$ about $t_{0}$ and decays logarithmically away from it.

Inserting this into the integral, the band of width $1/\log x$ contributes $\ll \frac{1}{\log x} \cdot \log x \cdot e^{-M} = e^{-M}$, while the logarithmic growth of $\mathbb{D}^{2}$ away from $t_{0}$ contributes the additional factor $(1 + M)$ upon integrating $\int e^{-M - u}\,du$ against the slowly growing weight. Collecting the band, the tails, and the truncation, $$ \frac{1}{x}\Bigl|\sum_{n \le x} f(n)\Bigr| \ll (1 + M)\,e^{-M} + \frac{1}{T}, $$ which is the claim. $□$

Bridge. This derivation builds toward the quantitative pretentious theory of Granville-Soundararajan, where the same distance $\mathbb{D}$ reappears as the master pseudometric and the decay rate $(1 + M) e^{-M}$ is shown to be sharp; the bound appears again in 21.11.05 as the soft mirror of the Selberg-Delange asymptotic, the two methods meeting on the line $\sigma = 1$. The foundational reason the mean decays is that disagreement with every benchmark forces $|F(s)|$ below its maximal size $\log x$ on the whole segment: the Euler product loses a factor $e^{-\mathbb{D}^{2}}$, that is, a factor $e^{-1}$ for each unit of squared distance. The central insight is that the mean value is read off from a single line integral of $F$, so controlling $F$ pointwise by the distance controls the mean. Putting these together, Halász's theorem generalises the elementary fact that cancellation in $\sum f(p)/p$ propagates to cancellation in $\sum_{n \leq x} f(n)$, and the bridge is the Perron pairing of 21.11.02 between the Dirichlet series and its summatory function.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Prove the triangle inequality $\mathbb{D}(f, h; x) \leq \mathbb{D}(f, g; x) + \mathbb{D}(g, h; x)$ for the pretentious distance, given that the pointwise inequality $\sqrt{1 - \operatorname{Re}(u\overline{w})} \leq \sqrt{1 - \operatorname{Re}(u\overline{v})} + \sqrt{1 - \operatorname{Re}(v\overline{w})}$ holds for $u, v, w \in \overline{D}$.

Hint

The given pointwise inequality is a triangle inequality on the chordal metric of the disc. Apply it prime by prime, then use the $ℓ^{2}$ triangle inequality (Minkowski) on the weighted sequence indexed by $p$ .

Answer

Define, for each prime $p \leq x$, the weight $w_{p} = 1/p$ and the three sequences $a_{p} = \sqrt{w_{p}\,(1 - \operatorname{Re}(f(p)\overline{g(p)}))}$, $b_{p} = \sqrt{w_{p}\,(1 - \operatorname{Re}(g(p)\overline{h(p)}))}$, and $c_{p} = \sqrt{w_{p}\,(1 - \operatorname{Re}(f(p)\overline{h(p)}))}$. Then $\mathbb{D}(f, g; x) = \|a\|_{2}$, $\mathbb{D}(g, h; x) = \|b\|_{2}$, $\mathbb{D}(f, h; x) = \|c\|_{2}$ in the $\ell^{2}$ norm over $p \leq x$. The given pointwise (chordal) triangle inequality, multiplied through by $\sqrt{w_{p}}$, says $c_{p} \leq a_{p} + b_{p}$ for every $p$. By monotonicity of the $\ell^{2}$ norm on non-negative sequences and Minkowski's inequality, $$ \|c\|_2 \le \|a + b\|_2 \le \|a\|_2 + \|b\|_2, $$ which is exactly $\mathbb{D}(f, h; x) \leq \mathbb{D}(f, g; x) + \mathbb{D}(g, h; x)$. Rubric: full credit for casting $\mathbb{D}$ as a weighted $\ell^{2}$ norm and applying the pointwise bound followed by Minkowski. $□$
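The inequality just proved can be sanity-checked on random data. This is a minimal sketch, assuming hypothetical helpers `pretentious_distance` and `random_disc_values` and representing each function only by its values at primes $p \leq 500$; a numerical check of this kind illustrates the statement but does not replace the proof.

```python
import cmath
import math
import random


def primes_up_to(x):
    """Return the list of primes <= x (sieve of Eratosthenes)."""
    if x < 2:
        return []
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [n for n in range(2, x + 1) if sieve[n]]


def pretentious_distance(f, g, primes):
    """D(f, g; x) for f, g given as dicts mapping each prime to a complex value."""
    total = sum((1.0 - (f[p] * g[p].conjugate()).real) / p for p in primes)
    return math.sqrt(max(total, 0.0))  # guard against rounding slightly below zero


def random_disc_values(primes, rng):
    """Assign to each prime a random point of the closed unit disc."""
    return {
        p: cmath.rect(math.sqrt(rng.random()), rng.uniform(0.0, 2.0 * math.pi))
        for p in primes
    }


rng = random.Random(0)  # fixed seed so the run is reproducible
primes = primes_up_to(500)
smallest_slack = math.inf
for _ in range(200):
    f, g, h = (random_disc_values(primes, rng) for _ in range(3))
    slack = (
        pretentious_distance(f, g, primes)
        + pretentious_distance(g, h, primes)
        - pretentious_distance(f, h, primes)
    )
    smallest_slack = min(smallest_slack, slack)
print(smallest_slack)  # non-negative if the triangle inequality held in every trial
```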

Exercise 4 (medium, symbolic).

Let $f(n) = n^{i\tau}$ for a fixed real $\tau \neq 0$. Show that $\mathbb{D}(f, 1; x)^{2} \to \infty$ as $x \to \infty$, yet $M_{f}(x)$ does not tend to $0$, and reconcile this with Halász's theorem.

Hint

Compute $\mathbb{D}(n^{i\tau}, 1; x)^{2} = \sum_{p \leq x} (1 - \cos(\tau \log p))/p$. For the mean, the minimising benchmark is $n^{i\tau}$ itself, giving $M = 0$.

Answer

For the constant benchmark $g = 1$, $f(p)\overline{g(p)} = p^{i\tau} = e^{i\tau\log p}$, so $$ \mathbb{D}(n^{i\tau}, 1; x)^2 = \sum_{p \le x}\frac{1 - \cos(\tau\log p)}{p}. $$ Since $\cos(\tau\log p)$ does not stay near $1$ (the oscillatory sum $\sum_{p \leq x} \cos(\tau\log p)/p$ is $O_{\tau}(1)$ by the prime number theorem), the sum is $\log\log x + O_{\tau}(1) \to \infty$. The distance to the constant benchmark is unbounded. But the minimal distance over the whole family takes the benchmark $t = \tau$ (available once $T \geq |\tau|$): then $f(p)p^{-i\tau} = 1$ at every prime, so $\mathbb{D}(f, n^{i\tau}; x) = 0$ and $M = 0$. Halász then permits a full-size mean, and indeed partial summation gives $M_{f}(x) = \frac{1}{x}\sum_{n \leq x} n^{i\tau} = \frac{x^{i\tau}}{1 + i\tau} + o(1)$, which rotates with the phase $x^{i\tau}$ rather than converging but has modulus tending to $1/\sqrt{1 + \tau^{2}} \neq 0$. The reconciliation: Halász measures distance to the nearest benchmark, not to $1$. $□$
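The size of this mean is easy to confirm by direct summation. The sketch below is illustrative only: `mean_value_power` is a hypothetical helper, $\tau = 2$ is an arbitrary choice, and the comparison value $x^{i\tau}/(1 + i\tau)$ is the main term that partial summation produces, whose modulus is $1/\sqrt{1 + \tau^{2}}$.

```python
import cmath
import math


def mean_value_power(tau, x):
    """Return (1/x) * sum over n <= x of n^{i tau}."""
    total = sum(cmath.exp(1j * tau * math.log(n)) for n in range(1, x + 1))
    return total / x


tau = 2.0  # arbitrary fixed nonzero value, chosen for illustration
for x in (10**3, 10**4, 10**5):
    mean = mean_value_power(tau, x)
    main_term = cmath.exp(1j * tau * math.log(x)) / (1 + 1j * tau)
    # Columns: x, |mean|, the predicted modulus, and the gap to the main term.
    print(x, abs(mean), 1 / math.sqrt(1 + tau**2), abs(mean - main_term))
```

The modulus stays near $1/\sqrt{5}$ for every $x$, while the argument of the mean keeps turning with $x$.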

Exercise 5 (medium, symbolic).

The Liouville function $λ (n) = (- 1)^{Ω (n)}$ is completely multiplicative with $λ (p) = - 1$ . Compute $D (λ, n^{i t}; x)^{2}$ to leading order for fixed $t$ and deduce that $M_{λ} (x) \to 0$ .

Hint

$\lambda(p)p^{-it} = -p^{-it} = -e^{-it\log p}$, so $1 - \operatorname{Re}(\lambda(p)p^{-it}) = 1 + \cos(t\log p)$, which averages to $1$.

Answer

For any fixed $t$, $$ \mathbb{D}(\lambda, n^{it}; x)^2 = \sum_{p \le x}\frac{1 - \operatorname{Re}(-p^{-it})}{p} = \sum_{p \le x}\frac{1 + \cos(t\log p)}{p}. $$ The term $1$ contributes $\sum_{p \leq x} 1/p = \log\log x + O(1)$, while $\sum_{p \leq x} \cos(t\log p)/p = O(1)$ for fixed $t \neq 0$ (a convergent oscillatory prime sum, controlled by the prime number theorem); at $t = 0$ the cosine term contributes a second $\log\log x$. Hence $\mathbb{D}(\lambda, n^{it}; x)^{2} \geq \log\log x + O(1) \to \infty$ for every fixed $t$, so $M = M(x, T) \to \infty$. By Halász, $M_{\lambda}(x) \to 0$; this is the multiplicative-function statement equivalent in strength to the prime number theorem. $□$
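The decay of $M_{\lambda}(x)$ can be observed directly for moderate $x$. The following sketch assumes a hypothetical helper `liouville_up_to` that builds $\lambda(n)$ from a smallest-prime-factor sieve, using complete multiplicativity in the form $\lambda(n) = -\lambda(n/p)$ for any prime $p$ dividing $n$.

```python
import math


def liouville_up_to(x):
    """Return a list lam with lam[n] = (-1)^Omega(n) for 1 <= n <= x (lam[0] is unused)."""
    spf = list(range(x + 1))  # spf[n] will hold the smallest prime factor of n
    for i in range(2, math.isqrt(x) + 1):
        if spf[i] == i:  # i is prime
            for j in range(i * i, x + 1, i):
                if spf[j] == j:
                    spf[j] = i
    lam = [0] * (x + 1)
    if x >= 1:
        lam[1] = 1
    for n in range(2, x + 1):
        # Removing one prime factor flips the sign.
        lam[n] = -lam[n // spf[n]]
    return lam


lam = liouville_up_to(10**5)
for x in (10**2, 10**3, 10**4, 10**5):
    mean = sum(lam[1 : x + 1]) / x
    print(x, mean)
```

The printed means are small compared with $1$ and shrink in absolute value as $x$ grows through these ranges, consistent with $M_{\lambda}(x) \to 0$.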

Exercise 6 (medium, multiple choice).

In Halász's theorem the mean value is bounded by $(1 + M) e^{- M} + 1/ T$ . What is the role of the height parameter $T$ ?

A. It is the modulus of a Dirichlet character.

B. It restricts the benchmark search to $∣ t ∣ \leq T$ and contributes the truncation error $1/ T$ .

C. It is the degree of an $L$ -function.

D. It bounds the size of $f$ .

Hint

$M = M(x, T) = \min_{|t| \leq T} \mathbb{D}(f, n^{it}; x)^{2}$, and the leftover $1/T$ comes from the part of the line beyond height $T$.

Answer

B. The parameter $T$ caps the benchmark search to $∣ t ∣ \leq T$ when forming $M$ , and the contribution of the line $∣ τ ∣ > T$ to the controlling integral is the truncation error $1/ T$ . Feedback-correct: one balances $T$ against $M$ — taking $T \to \infty$ slowly kills $1/ T$ while $M$ stays essentially the full minimal distance. Feedback-wrong: A, C, D conflate $T$ with unrelated objects; $∣ f ∣ \leq 1$ is a hypothesis, not enforced by $T$ .

Exercise 7 (hard, symbolic).

State and prove the Halász-Montgomery inequality in the form: for complex $a_{1}, \dots, a_{N}$ and real points $t_{1}, \dots, t_{R}$, $$ \sum_{r \le R}\Bigl|\sum_{n \le N} a_n\, n^{-it_r}\Bigr|^2 \le (N + R\,\Delta^{-1})\sum_{n \le N}|a_n|^2, $$ where the $t_{r}$ are $\Delta$-spaced, deducing it from the large-sieve / duality principle.

Hint

Apply the duality (the $TT^{*}$ trick): the bound is equivalent to $\sum_{n} \bigl|\sum_{r} b_{r} n^{-it_{r}}\bigr|^{2} \leq (N + R\Delta^{-1}) \sum_{r} |b_{r}|^{2}$, an additive large sieve for the well-spaced frequencies $\frac{t_{r}}{2\pi}\log n$. Use the mean-value estimate $\sum_{n \leq N} \bigl|\sum_{r} b_{r} n^{-it_{r}}\bigr|^{2} = \sum_{r, r'} b_{r} \overline{b_{r'}} \sum_{n \leq N} n^{-i(t_{r} - t_{r'})}$ and bound the off-diagonal via $\sum_{n \leq N} n^{-iu} \ll \min(N, 1/|u|)$.

Answer

By duality, the asserted inequality with matrix $(n^{-it_{r}})_{n, r}$ is equivalent to the transposed bound: for complex $b_{1}, \dots, b_{R}$, $$ \sum_{n \le N}\Bigl|\sum_{r \le R} b_r\, n^{-it_r}\Bigr|^2 \le \bigl(N + R\Delta^{-1}\bigr)\sum_{r \le R}|b_r|^2. \tag{$*$} $$ Expand the left side: $$ \sum_{n \le N}\sum_{r, r'} b_r\overline{b_{r'}}\, n^{-i(t_r - t_{r'})} = \sum_{r, r'} b_r\overline{b_{r'}}\, S(t_r - t_{r'}), \qquad S(u) := \sum_{n \le N} n^{-iu}. $$ The diagonal $r = r'$ gives $S(0) = N$, contributing $N \sum_{r} |b_{r}|^{2}$. For the off-diagonal, partial summation 21.11.02 applied to $n^{-iu} = e^{-iu\log n}$ yields $|S(u)| \ll \min(N, 1/|u|)$ for $0 < |u| \leq \pi$ (and $|S(u)| \ll 1/\|u\|$ in general, with $\|u\|$ the distance to $2\pi\mathbb{Z}$). Since the $t_{r}$ are $\Delta$-spaced, after ordering them $|t_{r} - t_{r'}| \geq \Delta|r - r'|$, so by Cauchy-Schwarz and $\sum_{k \geq 1} 1/(\Delta k) \leq \Delta^{-1}\log R$, $$ \Bigl|\sum_{r \ne r'} b_r\overline{b_{r'}} S(t_r - t_{r'})\Bigr| \le \sum_{r}|b_r|\sum_{r' \ne r}|b_{r'}|\,\frac{1}{\Delta|r - r'|} \ll \Delta^{-1}\Bigl(\sum_r|b_r|\Bigr)\cdot(\cdots) \le R\Delta^{-1}\sum_r|b_r|^2, $$ the last step by Cauchy-Schwarz $\sum_{r} |b_{r}| \leq \sqrt{R}\,\|b\|_{2}$ absorbed into the spacing sum. Combining diagonal and off-diagonal gives ($*$), and duality returns the stated form. This is precisely the additive large sieve specialised to logarithmic frequencies; it is the engine that converts the pointwise Halász bound on $F(s)$ into the mean-value estimate, because the line integral of $|F|$ is controlled by a sum over well-spaced sample points $t_{r}$. $□$

Exercise 8 (hard, symbolic).

Prove Wirsing's mean-value theorem for the real non-negative case in the following form: if $f \geq 0$ is multiplicative, $f(p) \leq \kappa$ on average with $\sum_{p \leq x} f(p) \frac{\log p}{p} = \kappa \log x + O(1)$ for some $\kappa > 0$, and $\sum_{p} \sum_{k \geq 2} f(p^{k})/p^{k} < \infty$, then $$ \frac{1}{x}\sum_{n \le x} f(n) \sim \frac{e^{-\gamma\kappa}}{\Gamma(\kappa)}\,\frac{1}{\log x}\prod_{p \le x}\Bigl(1 + \frac{f(p)}{p} + \frac{f(p^2)}{p^2} + \cdots\Bigr). $$ Sketch the reduction to the case $f(p) = \kappa$ and the role of the Selberg-Delange comparison.

Hint

The Dirichlet series is $F(s) = \prod_{p} (1 + f(p)p^{-s} + \dots) = \zeta(s)^{\kappa} G(s)$ with $G$ holomorphic and non-vanishing at $s = 1$ under the hypotheses; apply the Selberg-Delange asymptotic of 21.11.05 and evaluate the constant $G(1)/\Gamma(\kappa)$ using Mertens' product $\prod_{p \leq x} (1 - 1/p)^{-1} \sim e^{\gamma}\log x$.

Answer

Form the Euler product $F(s) = \sum_{n} f(n) n^{-s} = \prod_{p} (1 + f(p)p^{-s} + f(p^{2})p^{-2s} + \dots)$, absolutely convergent for $\sigma > 1$. Factor out the singular part by comparison with $\zeta(s)^{\kappa} = \prod_{p} (1 - p^{-s})^{-\kappa}$: $$ F(s) = \zeta(s)^\kappa\, G(s), \qquad G(s) = \prod_p\Bigl(1 - p^{-s}\Bigr)^{\kappa}\Bigl(1 + f(p)p^{-s} + \cdots\Bigr). $$ The hypothesis $\sum_{p \leq x} f(p) \frac{\log p}{p} = \kappa \log x + O(1)$ forces $f(p) = \kappa + \text{(mean-zero fluctuation)}$, so the leading $p^{-s}$ terms of the two factors cancel: $(1 - p^{-s})^{\kappa} (1 + f(p)p^{-s}) = 1 + (f(p) - \kappa)p^{-s} + O(p^{-2s})$, and together with the convergent higher-prime-power hypothesis, $G(s)$ extends holomorphically and non-vanishing to a neighbourhood of $\sigma = 1$ with $G(1) \neq 0$. Thus $f$ lies in the Selberg-Delange class $D(\kappa)$ of 21.11.05, and that theorem gives $$ \sum_{n \le x} f(n) \sim \frac{G(1)}{\Gamma(\kappa)}\, x\,(\log x)^{\kappa - 1}. $$

To put the constant in Wirsing's product form, truncate the product for $G(1)$ at $p \leq x$, so that $G(1) \sim \prod_{p \leq x} (1 - 1/p)^{\kappa} (1 + f(p)/p + \dots)$, and evaluate the first factor by Mertens' theorem $\prod_{p \leq x} (1 - 1/p)^{-1} \sim e^{\gamma}\log x$ 21.11.02, whence $\prod_{p \leq x} (1 - 1/p)^{\kappa} \sim e^{-\gamma\kappa} (\log x)^{-\kappa}$. Therefore $G(1)(\log x)^{\kappa - 1} \sim e^{-\gamma\kappa} (\log x)^{-1} \prod_{p \leq x} (1 + f(p)/p + \dots)$. Dividing by $x$ and reassembling, $$ \frac1x\sum_{n \le x} f(n) \sim \frac{e^{-\gamma\kappa}}{\Gamma(\kappa)}\,\frac{1}{\log x}\prod_{p \le x}\Bigl(1 + \frac{f(p)}{p} + \cdots\Bigr). $$ The reduction to $f(p) = \kappa$ is the case in which $G$ reduces to a convergent constant, where the answer is purely the $\zeta^{\kappa}$ singularity; the general $f$ differs only by the convergent corrector $G$. Rubric: full credit for the $\zeta^{\kappa} G$ factorisation, the holomorphy of $G$ at $s = 1$ from the averaged hypothesis, the Selberg-Delange invocation, and the Mertens evaluation of the constant. $□$
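A concrete instance makes the Selberg-Delange form $G(1)\,x\,(\log x)^{\kappa - 1}/\Gamma(\kappa)$ tangible. The sketch below takes $f$ to be the indicator of squarefree integers, an illustrative choice not made in the text: here $f(p) = 1$ and $f(p^{k}) = 0$ for $k \geq 2$, so $\kappa = 1$ and $G(1) = \prod_{p} (1 - 1/p)(1 + 1/p) = 6/\pi^{2}$. The helper names are hypothetical.

```python
import math


def squarefree_flags(x):
    """Return a bytearray with flags[n] = 1 exactly when 1 <= n <= x is squarefree."""
    flags = bytearray([1]) * (x + 1)
    flags[0] = 0
    for d in range(2, math.isqrt(x) + 1):
        sq = d * d
        # Every multiple of a square d*d fails to be squarefree.
        flags[sq::sq] = bytearray(len(range(sq, x + 1, sq)))
    return flags


def truncated_G1(P):
    """Product over primes p <= P of (1 - 1/p)(1 + 1/p), a truncation of G(1)."""
    product = 1.0
    sieve = bytearray([1]) * (P + 1)
    for p in range(2, P + 1):
        if sieve[p]:  # p is prime
            product *= (1.0 - 1.0 / p) * (1.0 + 1.0 / p)
            sieve[p * p :: p] = bytearray(len(range(p * p, P + 1, p)))
    return product


x = 10**5
mean = sum(squarefree_flags(x)) / x
print(mean, truncated_G1(10**4), 6 / math.pi**2)
```

With $\kappa = 1$ the factor $(\log x)^{\kappa - 1}/\Gamma(\kappa)$ equals $1$, so the mean value should sit close to the truncated product and to $6/\pi^{2}$.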

Advanced results Master

The Halász-Montgomery inequality and the large sieve. The pointwise bound $|F(1 + 1/\log x + i\tau)| \ll (\log x) e^{-\mathbb{D}(f, n^{i\tau}; x)^{2}}$ is converted into a mean-value estimate by sampling the line at well-spaced points and bounding the resulting Dirichlet polynomials. The mechanism is the Halász-Montgomery inequality $\sum_{r} |\sum_{n \leq N} a_{n} n^{-it_{r}}|^{2} \leq (N + R\Delta^{-1}) \sum_{n} |a_{n}|^{2}$ for $\Delta$-spaced $t_{r}$, itself the additive large sieve $\sum_{r} |\sum_{n} a_{n} e(n\xi_{r})|^{2} \leq (N + \delta^{-1}) \sum_{n} |a_{n}|^{2}$ specialised to logarithmic frequencies $\xi_{r} = \frac{t_{r}}{2\pi}\log n$ ^{[Montgomery-Vaughan §6]}. The large sieve is the structural reason the integral of $|F|$ over a line of length $2T$ costs only a factor $\log x$ rather than $T\log x$: the values $F(1 + i\tau)$ at spaced $\tau$ behave like an almost-orthogonal system, so their square-sum is controlled by the single Parseval quantity $\sum_{n \leq x} |f(n)|^{2}/n \ll \log x$.

The quantitative Granville-Soundararajan form. The decay rate $(1 + M) e^{-M}$ in Halász's theorem is sharp, and Granville and Soundararajan recast the whole theory around the distance $\mathbb{D}$ as the master object ^{[Granville-Soundararajan 2007]}. Their formulation proves $\frac{1}{x} \sum_{n \leq x} f(n) \ll (1 + M) e^{-M} + (\log x)^{-1/2 + o(1)}$ uniformly, with the minimising $t_{0} = t_{0}(x)$ exhibited, and establishes the Lipschitz estimate $M_{f}(x) - M_{f}(x^{1/(1 + \delta)}) \ll \delta \log(1/\delta)$ plus $\mathbb{D}$-controlled corrections, the analytic backbone of the pretentious large sieve and of the Matomäki-Radziwiłł theorem on multiplicative functions in short intervals. The triangle inequality for $\mathbb{D}$ is what makes the family $\{n^{it}\}$ behave like a single point in the relevant quotient, so "pretending to be $n^{it}$" is an equivalence-class statement.

Wirsing's theorem and the Erdős-Wintner problem. For real non-negative multiplicative $f$ with prime-averaged density $\kappa$, Wirsing's theorem upgrades the Halász upper bound to a genuine asymptotic $\frac{1}{x} \sum_{n \leq x} f(n) \sim \frac{e^{-\gamma\kappa}}{\Gamma(\kappa)}\,\frac{1}{\log x} \prod_{p \leq x} (1 + f(p)/p + \dots)$ ^{[Wirsing 1967]}. The case $\kappa = 1$, $f = 1_{S}$ for a multiplicative set $S$, resolves the Erdős-Wintner mean-value problem and yields the density of $S$. The real hypothesis is essential: it removes the rotational ambiguity in the choice of benchmark, pinning $t_{0} = 0$, so that the upper bound becomes a two-sided asymptotic. This is precisely the boundary the Selberg-Delange method of 21.11.05 occupies from the other side: Selberg-Delange needs an analytic factorisation but gives full asymptotic expansions, Wirsing needs only positivity and averaged density.

The Delange theorem and the mean-value dichotomy. Delange's 1961 theorem isolates the surviving case: if $|f| \leq 1$ is multiplicative and $\sum_{p} (1 - f(p))/p$ converges (so $\mathbb{D}(f, 1; x)$ stays bounded), then $M_{f}(x) \to \prod_{p} (1 - 1/p)(1 + f(p)/p + \dots) \neq 0$ ^{[Delange 1961]}. Combined with Halász, this gives the complete mean-value dichotomy for $|f| \leq 1$: either $f$ pretends to be some $n^{it}$ and $M_{f}(x)$ oscillates with a computable modulus, or it pretends to nothing and $M_{f}(x) \to 0$. There is no intermediate regime, which is the multiplicative-function analogue of the zero-one laws of probability.

Synthesis. The foundational reason mean values of multiplicative functions split cleanly into "survives" and "decays" is the pretentious distance $\mathbb{D}(f, n^{it}; x)^{2} = \sum_{p \leq x} (1 - \operatorname{Re}(f(p)p^{-it}))/p$: the mean is large exactly when this distance is small for some $t$, and the central insight is that $|F(1 + 1/\log x + it)|$ loses precisely a factor $e^{-\mathbb{D}^{2}}$ from its maximal size, so the size of the Euler product on the line is the size of the mean. This is exactly the soft counterpart of the Selberg-Delange asymptotic of 21.11.05: where Selberg-Delange demands the factorisation $F = \zeta^{z} G$ and reads an exact $x(\log x)^{z - 1}/\Gamma(z)$, Halász asks only $|f| \leq 1$ and reads the sharp bound $(1 + M) e^{-M}$. Putting these together, the two are dual halves of the analysis on the line $\sigma = 1$, the contour method for clean Euler products and the distance method for arbitrary ones. The Halász-Montgomery inequality generalises the large sieve from additive to multiplicative characters, and the bridge is the partial summation of 21.11.02 that turns the Dirichlet series into the summatory function; the same machine appears again in the Granville-Soundararajan pretentious programme, where $\mathbb{D}$ becomes the metric in which short-interval and arithmetic-progression results of the Matomäki-Radziwiłł type are stated.

Full proof set Master

Proposition 1 (non-negativity and triangle inequality of $D$ ). For multiplicative $f, g, h$ with values in $\overline{D}$ , $D (f, g; x)^{2} \geq 0$ and $D (f, h; x) \leq D (f, g; x) + D (g, h; x)$ .

Proof. Non-negativity: for each prime $\operatorname{Re}(f(p)\overline{g(p)}) \leq |f(p)\overline{g(p)}| \leq 1$, so $1 - \operatorname{Re}(f(p)\overline{g(p)}) \geq 0$ and the weighted sum is non-negative. For the triangle inequality, fix $p$ and set $u = f(p)$, $v = g(p)$, $w = h(p)$ in $\overline{D}$. The chordal-distance bound $\sqrt{1 - \operatorname{Re}(u\overline{w})} \leq \sqrt{1 - \operatorname{Re}(u\overline{v})} + \sqrt{1 - \operatorname{Re}(v\overline{w})}$ holds because $\sqrt{1 - \operatorname{Re}(z\overline{z'})}$ is, up to the factor $1/\sqrt{2}$, the Euclidean distance from $z$ to $z'$ on the unit circle (there $|z - z'|^{2} = 2 - 2\operatorname{Re}(z\overline{z'})$), extended to the disc by the parallelogram identity, and Euclidean distance obeys the triangle inequality. Writing $\mathbb{D}(f, g; x) = \bigl\|(p^{-1/2}\sqrt{1 - \operatorname{Re}(f(p)\overline{g(p)})})_{p \leq x}\bigr\|_{2}$, with $a$, $b$, $c$ the corresponding vectors for the pairs $(f, g)$, $(g, h)$, $(f, h)$, and applying Minkowski's inequality to the pointwise bound $c_{p} \leq a_{p} + b_{p}$ yields $\|c\|_{2} \leq \|a\|_{2} + \|b\|_{2}$, the claim. $□$

Proposition 2 (Euler-product size on the $1$-line). Let $f$ be multiplicative with $|f| \leq 1$ and $F(s) = \sum_{n} f(n) n^{-s}$. For $s = 1 + 1/\log x + i\tau$, $$ \log|F(s)| = \log\log x - \mathbb{D}(f, n^{i\tau}; x)^2 + O(1). $$

Proof. Take the logarithm of the Euler product $F(s) = \prod_{p} (1 + f(p)p^{-s} + f(p^{2})p^{-2s} + \dots)$. Since $|f| \leq 1$, the contribution of prime powers $p^{k}$ with $k \geq 2$ is $\sum_{p} \sum_{k \geq 2} O(p^{-k\sigma}) = O(1)$ uniformly for $\sigma \geq 1$, and $\log(1 + z) = z + O(|z|^{2})$ absorbs the squared first-order term into the same $O(1)$. Hence $\log|F(s)| = \operatorname{Re} \sum_{p} f(p)p^{-s} + O(1)$. With $\sigma = 1 + 1/\log x$ the factor $p^{-1/\log x} = e^{-\log p/\log x}$ is $\asymp 1$ for $p \leq x$ and decays for $p > x$, so up to $O(1)$ the sum is over $p \leq x$: $$ \operatorname{Re}\sum_{p \le x}\frac{f(p)}{p^{1 + i\tau}} + O(1) = \sum_{p \le x}\frac{\operatorname{Re}(f(p)p^{-i\tau})}{p} + O(1). $$ Subtract and add $\sum_{p \leq x} 1/p = \log\log x + O(1)$ (Mertens, 21.11.02): $$ \log|F(s)| = \log\log x - \sum_{p \le x}\frac{1 - \operatorname{Re}(f(p)p^{-i\tau})}{p} + O(1) = \log\log x - \mathbb{D}(f, n^{i\tau}; x)^2 + O(1). \qquad \square $$

Proposition 3 (Parseval bound for the Dirichlet polynomial). For $|f| \leq 1$ and $T \geq 1$, $$ \int_{-T}^{T}\bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr|^2\,d\tau \ll T\log x. $$

Proof. By the mean-value theorem for Dirichlet series (the large-sieve / Montgomery-Vaughan mean-value estimate), $$ \int_{-T}^{T}\Bigl|\sum_n \frac{f(n)}{n^{1 + 1/\log x}} n^{-i\tau}\Bigr|^2 d\tau \ll \sum_n \frac{|f(n)|^2}{n^{2 + 2/\log x}}\bigl(n + T\bigr) \ll T\sum_n \frac{|f(n)|^2}{n^{2}} + \sum_n\frac{|f(n)|^2}{n^{1 + 2/\log x}}. $$ The first sum converges (it is $\leq \zeta(2)$ since $|f| \leq 1$); the second is at most $\zeta(1 + 2/\log x) \ll \log x$. Altogether the integral is $\ll T + \log x$, which is $\ll T\log x$ once $T \geq 1$ and $x \geq 3$, giving the bound. This is the quantitative form of the statement that spaced values of $F$ on the $1$-line are almost orthogonal, the input that lets the Halász integral cost only $\log x$. $□$

Proposition 4 (decay in the non-pretentious case). If $D (f, n^{i t}; x) \to \infty$ as $x \to \infty$ for every fixed $t \in R$ , then $M_{f} (x) \to 0$ .

Proof. Fix $ε > 0$ . Choose $T = T (ε) = 2/ ε$ , contributing truncation error $1/ T = ε /2$ in Halász's theorem. By hypothesis, for each $∣ t ∣ \leq T$ the distance $D (f, n^{i t}; x)^{2} \to \infty$ ; by a compactness/uniformity argument over the compact interval $[- T, T]$ (the distances are equicontinuous in $t$ on the regulariser scale), the minimum $M (x, T) = min_{∣ t ∣ \leq T} D (f, n^{i t}; x)^{2} \to \infty$ as well. Then $(1 + M) e^{- M} \to 0$ , so there is $x_{0}$ with $(1 + M) e^{- M} < ε /2$ for $x \geq x_{0}$ . Halász's theorem gives $∣ M_{f} (x) ∣ \leq (1 + M) e^{- M} + 1/ T < ε$ for $x \geq x_{0}$ . Since $ε$ was arbitrary, $M_{f} (x) \to 0$ . $□$

Connections Master

Average orders and the summation toolkit 21.11.02. Halász's method runs on the partial-summation pairing between a Dirichlet series and its summatory function established there: the mean $\frac{1}{x} \sum_{n \leq x} f(n)$ is read off from a line integral of $F(s)$, and Mertens' estimate $\sum_{p \leq x} 1/p = \log\log x + O(1)$ from that unit is the comparison value against which the pretentious distance is measured. The toolkit supplies the analytic plumbing; this unit supplies the multiplicative content.
The Selberg-Delange method 21.11.05. This unit is the soft, real-variable mirror of the contour method: Selberg-Delange demands the clean factorisation $F(s) = \zeta(s)^{z} G(s)$ and extracts the exact asymptotic $x(\log x)^{z - 1}/\Gamma(z)$, while Halász asks only $|f| \leq 1$ and delivers the sharp upper bound $(1 + M) e^{-M}$. Exercise 8 shows the two methods coincide on their overlap, where Wirsing's theorem is literally Selberg-Delange applied to the factored Euler product; together they bracket the analysis on the line $\operatorname{Re}(s) = 1$ from both sides.
Kloosterman sums and the Kuznetsov formula 21.14.05. Both this unit and the spectral large sieve of that unit descend from the additive large sieve $\sum_{r} |\sum_{n} a_{n} e(n\xi_{r})|^{2} \leq (N + \delta^{-1}) \sum_{n} |a_{n}|^{2}$. The Halász-Montgomery inequality is its specialisation to logarithmic frequencies $\xi_{r} \propto \log n$, while the Deshouillers-Iwaniec spectral large sieve is its specialisation to the Fourier coefficients of automorphic forms; the shared almost-orthogonality principle is the structural link between mean values of multiplicative functions and the spectral theory of $GL_{2}$.

Historical & philosophical context Master

Erdős and Wintner posed the mean-value problem for multiplicative functions in the 1930s and conjectured that a real multiplicative $f$ with $| f | \leq 1$ always has a mean value. Wirsing settled the real non-negative case in a 1967 Mathematische Annalen paper ^{[Wirsing 1967]} by an intricate elementary-analytic argument, obtaining the asymptotic with its $e^{- γ κ} / Γ (κ)$ constant and resolving the Erdős-Wintner conjecture in that range. The following year Gábor Halász, in Acta Mathematica Academiae Scientiarum Hungaricae ^{[Halász 1968]}, treated the full complex case $| f | \leq 1$ and identified the governing quantity as the minimal distance to the family $n^{i t}$ , with the sharp decay rate that bears his name. The earlier Delange theorem ^{[Delange 1961]} had already pinned down the surviving case, in which $\sum_{p} (1 - f (p)) / p$ converges.

The reformulation in terms of the pretentious distance $D$ is due to Granville and Soundararajan in the 2000s ^{[Granville-Soundararajan 2007]}, who recognised that the quantity $D (f, g; x)$ defined by $D (f, g; x)^{2} = \sum_{p \leq x} (1 - \operatorname{Re} f (p) \overline{g (p)}) / p$ obeys the triangle inequality $D (f, h; x) \leq D (f, g; x) + D (g, h; x)$ and so behaves like a pseudometric. They rebuilt the mean-value theory and the large sieve for multiplicative functions as statements about distance in this metric, a viewpoint that later fed into the Matomäki-Radziwiłł short-interval theorem. The Halász-Montgomery inequality, which transmits the pointwise control of the Euler product on the $1$ -line to the mean value, is the multiplicative descendant of the large sieve developed by Linnik, Rényi, Bombieri, and Montgomery; its appearance here marks the point where sieve theory and the theory of multiplicative functions become one subject.

Bibliography Master

@article{Wirsing1967,
  author  = {Wirsing, Eduard},
  title   = {Das asymptotische Verhalten von Summen \"uber multiplikative Funktionen II},
  journal = {Mathematische Annalen},
  volume  = {143},
  year    = {1967},
  pages   = {75--102},
  note    = {Mean values of real non-negative multiplicative functions; Erd\H{o}s-Wintner problem}
}

@article{Halasz1968,
  author  = {Hal\'asz, G\'abor},
  title   = {\"Uber die Mittelwerte multiplikativer zahlentheoretischer Funktionen},
  journal = {Acta Mathematica Academiae Scientiarum Hungaricae},
  volume  = {19},
  year    = {1968},
  pages   = {365--403},
  note    = {Hal\'asz's theorem: mean of complex multiplicative |f|<=1 via distance to n^{it}}
}

@article{Delange1961,
  author  = {Delange, Hubert},
  title   = {Sur les fonctions arithm\'etiques multiplicatives},
  journal = {Annales scientifiques de l'\'Ecole Normale Sup\'erieure (3)},
  volume  = {78},
  year    = {1961},
  pages   = {273--304},
  note    = {The Delange mean-value theorem for convergent sum_p (1-f(p))/p}
}

@article{GranvilleSoundararajan2007,
  author  = {Granville, Andrew and Soundararajan, Kannan},
  title   = {Decay of mean values of multiplicative functions},
  journal = {Canadian Journal of Mathematics},
  volume  = {59},
  number  = {6},
  year    = {2007},
  pages   = {1265--1306},
  note    = {The pretentious distance D(f,g;x) and the quantitative Hal\'asz theorem}
}

@book{MontgomeryVaughan2007,
  author    = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title     = {Multiplicative Number Theory I: Classical Theory},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {97},
  publisher = {Cambridge University Press},
  year      = {2007},
  note      = {\S6: Wirsing, Hal\'asz, the Hal\'asz-Montgomery inequality and the large sieve}
}

@book{Tenenbaum2015,
  author    = {Tenenbaum, G\'erald},
  title     = {Introduction to Analytic and Probabilistic Number Theory},
  edition   = {3},
  series    = {Graduate Studies in Mathematics},
  volume    = {163},
  publisher = {American Mathematical Society},
  year      = {2015},
  note      = {\S III.4: the Hal\'asz method and mean values on the unit disc}
}

Prerequisites

21.11.02
21.11.05

Tier anchors

beginner: Granville-Soundararajan (in preparation) *Multiplicative Number Theory: The Pretentious Approach* (lecture notes, draft) Ch. 1-2 (mean values and pretentiousness, informal); Derbyshire 2003 *Prime Obsession* (Joseph Henry Press) Ch. 19 (the heuristic that arithmetic functions average to a limiting density)
intermediate: Montgomery-Vaughan 2007 *Multiplicative Number Theory I: Classical Theory* (Cambridge Studies in Advanced Mathematics 97) §6.1-6.3 (Wirsing's theorem, Halász's theorem, the Halász-Montgomery inequality); Tenenbaum 2015 *Introduction to Analytic and Probabilistic Number Theory* 3e (AMS GSM 163) §III.4 (mean values via the Halász method)
master: Halász 1968 *Acta Math. Acad. Sci. Hungar.* 19, 365 (the theorem on mean values of complex multiplicative functions); Wirsing 1967 *Math. Ann.* 143, 75 (mean values of real non-negative multiplicative functions); Granville-Soundararajan 2007 *Canad. J. Math.* 59, 1265 (the pretentious distance and the large sieve); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* §6; Tenenbaum 2015 *Introduction to Analytic and Probabilistic Number Theory* §III.4

References

Halász, G. — Über die Mittelwerte multiplikativer zahlentheoretischer Funktionen · *Acta Mathematica Academiae Scientiarum Hungaricae* 19 (1968), 365-403. The theorem bounding the mean value of a complex multiplicative function with $|f|\le 1$ by the distance to the family $n^{it}$, with the dichotomy that the mean is small unless $f$ pretends to be $n^{it}$.
Wirsing, E. — Das asymptotische Verhalten von Summen über multiplikative Funktionen II · *Mathematische Annalen* 143 (1967), 75-102. The asymptotic mean value of a real non-negative multiplicative function in terms of the sum $\sum_{p\le x} f(p)/p$, including the resolution of the Erdős-Wintner mean-value problem.
Granville, A. & Soundararajan, K. — Decay of mean values of multiplicative functions · *Canadian Journal of Mathematics* 59 (2007), 1265-1306. The pretentious distance $\mathbb{D}(f,g;x)^2 = \sum_{p\le x}(1-\operatorname{Re} f(p)\overline{g(p)})/p$, its triangle inequality, and the quantitative Halász theorem with explicit decay rate $(1+M)e^{-M}$.
Montgomery, H. L. & Vaughan, R. C. — Multiplicative Number Theory I: Classical Theory · Cambridge Studies in Advanced Mathematics 97 (2007). §6: Wirsing's theorem, Halász's theorem, the Halász-Montgomery inequality, and the connection to the large sieve.
Tenenbaum, G. — Introduction to Analytic and Probabilistic Number Theory · American Mathematical Society Graduate Studies in Mathematics 163, 3rd edition (2015). §III.4 develops the Halász method and the mean-value theorems for multiplicative functions on the unit disc.
Delange, H. — Sur les fonctions arithmétiques multiplicatives · *Annales scientifiques de l'École Normale Supérieure* (3) 78 (1961), 273-304. The Delange mean-value theorem for multiplicative $f$ with $|f|\le 1$ and convergent $\sum_p (f(p)-1)/p$, the case where $f$ genuinely has a non-zero mean value.

Estimated time

beginner: 18m
intermediate: 46m
master: 84m