21.14.03 · number-theory / sieve-methods-large-sieve

Mean Values of Multiplicative Functions: Halász's Theorem

shipped3 tiersLean: none

Anchor (Master): Halász 1968 *Acta Math. Acad. Sci. Hungar.* 19, 365 (the theorem on mean values of complex multiplicative functions); Wirsing 1967 *Math. Ann.* 143, 75 (mean values of real non-negative multiplicative functions); Granville-Soundararajan 2007 *Canad. J. Math.* 59, 1265 (the pretentious distance and the large sieve); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* §6; Tenenbaum 2015 *Introduction to Analytic and Probabilistic Number Theory* §III.4

Intuition Beginner

A multiplicative function assigns a number to each integer in a way that respects factorisation, and here we restrict to functions whose values never exceed in size. The basic question is the average: add up out to a cutoff and divide by . Sometimes this average settles to a clean positive limit, and sometimes it drifts to zero. Halász's theorem explains exactly when each happens.

The key idea is a notion of pretending. Picture a benchmark family of functions, the powers , each of which oscillates in a smooth, predictable way as grows. The theorem says that has a large average only if closely imitates one of these benchmarks across the primes. If refuses to imitate any of them — if at the primes it points in too many different directions — then all that disagreement piles up and the average is forced toward zero.

Why does this matter? It turns a hard question about a wild sequence of values into a single measurement: how far is from its nearest pretender? That distance is a sum over primes, and once you compute it you can read off whether the running average survives or decays. The same one dial governs the mean of every unit-size multiplicative function at once, which is why the result reshaped the whole subject.

Visual Beginner

Imagine each prime contributing an arrow on a clock face: the arrow for prime points in the direction set by the value compared with the pretender's value at . When the arrows mostly agree, they reinforce and the average stays large. When they scatter around the clock, they cancel and the average collapses.

   AGREEMENT (small distance)          DISAGREEMENT (large distance)
         arrows cluster                      arrows scatter
            \ | /                              \   |   /
             \|/                            <---  o  --->
        ------o------                           / | \
             /|\                               /  |  \
            / | \                             v   v   v
     mean stays LARGE                      mean decays to ZERO
   f pretends to be n^(it)             f pretends to be nothing

   distance over primes:  sum over p of (1 - agreement at p) / p
   small sum  ->  big average        large sum  ->  average ~ (1 + M) e^(-M)

The size of the average is controlled by one number : the smallest possible total disagreement between and any pretender . Small means a large average; large drives the average down like times , which shrinks fast.

Worked example Beginner

Take for every integer, the most agreeable multiplicative function there is. Its pretender is the simplest benchmark, with , which is also just . Let us see why the average is large.

Step 1. Measure the disagreement at each prime. Since matches the benchmark exactly, the disagreement at every prime is .

Step 2. Add up the disagreement over the primes, each weighted by . Every term is , so the total is .

Step 3. Read off the average bound. The control factor is . With this is , the largest it can be. So the average is not forced down at all.

Step 4. Check directly. Adding out to and dividing by gives the average . The two answers agree: zero disagreement means a full-strength average.

Now flip to a function that disagrees everywhere. Take at every prime. Compared with the benchmark , each prime now contributes the maximum disagreement, the weighted total grows like without bound, and tumbles toward zero. The running average of this function indeed decays.

What this tells us: the single number , the total prime-by-prime disagreement with the nearest benchmark, decides everything. No disagreement gives a full average; unbounded disagreement gives a vanishing one.

Check your understanding Beginner

Formal definition Intermediate+

Throughout, is multiplicative with for all ; equivalently takes values in the closed unit disc at every prime power. Write , and let denote the Euler-Mascheroni constant. The mean value of up to is , and the summatory-function notation of 21.11.02 is in force. [Montgomery-Vaughan §6]

Definition (pretentious distance). For multiplicative with values in , the pretentious distance up to is $$ \mathbb{D}(f, g; x)^2 = \sum_{p \le x} \frac{1 - \operatorname{Re}\bigl(f(p),\overline{g(p)}\bigr)}{p}. $$ Each summand is non-negative because . The quantity is a pseudometric: it satisfies the triangle inequality . When is bounded, is said to pretend to be .

Definition (distance to the benchmark family and the parameter ). The benchmark family is the set of functions , , each completely multiplicative with . For a height parameter , set $$ M(x, T) = \min_{|t| \le T}\ \sum_{p \le x} \frac{1 - \operatorname{Re}\bigl(f(p),p^{-it}\bigr)}{p} = \min_{|t| \le T}\mathbb{D}\bigl(f, n^{it}; x\bigr)^2. $$ This is the minimal squared distance from to the benchmark family; is the single dial that governs the mean value.

Definition (Wirsing class, real non-negative case). A real-valued multiplicative with on average — precisely, for a constant — belongs to the Wirsing class of mean density . For such the mean value has a genuine asymptotic rather than only an upper bound.

Counterexamples to common slips

  • The benchmark must be the moving family , not the single function . A function can have a large mean while disagreeing wildly with the constant : for fixed has , yet . Only the distance to the nearest controls the mean.
  • Distance is not a metric, only a pseudometric. Two distinct functions can be at distance — e.g. and agreeing at every prime but differing at a prime square have since the distance sees only . The triangle inequality holds; the identity-of-indiscernibles axiom fails.
  • Halász gives an upper bound, not an asymptotic, in the complex case. For complex the theorem bounds above; pinning the limit requires either Wirsing's real hypothesis or the additional input that the minimising is essentially fixed (the Granville-Soundararajan refinement).

Key theorem with proof Intermediate+

The signature result is Halász's mean-value theorem, which converts the analytic input " is far from every " into the arithmetic output " is small." It is the soft, real-variable counterpart to the Selberg-Delange contour method of 21.11.05: where Selberg-Delange demands a clean factorisation and reads off an exact asymptotic, Halász asks only for and delivers a sharp upper bound governed by the pretentious distance. [Halász 1968]

Theorem (Halász). Let be multiplicative with . Fix and let be the minimal distance to the benchmark family. Then $$ \frac{1}{x}\Bigl|\sum_{n \le x} f(n)\Bigr| \ll (1 + M),e^{-M} + \frac{1}{T}, $$ with an absolute implied constant. In particular, if for every fixed , then .

Proof. Let , convergent for . By the Perron-type identity and partial summation 21.11.02, the mean value is controlled by the size of on the line : writing , one has the bound $$ \frac{1}{x}\Bigl|\sum_{n \le x} f(n)\Bigr| \ll \frac{1}{\log x}\int_{-T}^{T} \bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr|,\frac{d\tau}{1 + |\tau|} + \frac{1}{T}. $$ The truncation error comes from the tail , where the weight makes the contribution summable.

The heart of the matter is the size of the Euler product. Taking logarithms, $$ \log\bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr| = \operatorname{Re}\sum_{p} \frac{f(p)}{p^{1 + 1/\log x + i\tau}} + O(1) = \sum_{p \le x}\frac{\operatorname{Re}\bigl(f(p) p^{-i\tau}\bigr)}{p} + O(1), $$ where the prime-power terms beyond the first contribute since , and the cut-off at is justified by the regulariser . Subtracting from the comparison value , $$ \log\bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr| = \log\log x - \mathbb{D}(f, n^{i\tau}; x)^2 + O(1). $$ Hence . For the minimising direction the exponent is ; for nearby a quantitative version of the triangle inequality gives , so the integrand is largest in a band of width about and decays logarithmically away from it.

Inserting this into the integral, the band of width contributes , while the logarithmic decay of away from contributes the additional factor upon integrating against the slowly growing weight. Collecting the band, the tails, and the truncation, $$ \frac{1}{x}\Bigl|\sum_{n \le x} f(n)\Bigr| \ll (1 + M),e^{-M} + \frac{1}{T}, $$ which is the claim.

Bridge. This derivation builds toward the quantitative pretentious theory of Granville-Soundararajan, where the same distance reappears as the master pseudometric and the decay rate is shown to be sharp; the bound appears again in 21.11.05 as the soft mirror of the Selberg-Delange asymptotic, the two methods meeting on the line . The foundational reason the mean decays is that disagreement with every benchmark forces below its maximal size on the whole segment, and this is exactly the statement that the Euler product loses a factor for each unit of squared distance. The central insight is that the mean value is read off from a single line integral of , so controlling pointwise by the distance controls the mean — putting these together, Halász's theorem generalises the elementary fact that cancellation in propagates to cancellation in , and the bridge is the Perron pairing of 21.11.02 between the Dirichlet series and its summatory function.

Exercises Intermediate+

Advanced results Master

The Halász-Montgomery inequality and the large sieve. The pointwise bound is converted into a mean-value estimate by sampling the line at well-spaced points and bounding the resulting Dirichlet polynomials. The mechanism is the Halász-Montgomery inequality for -spaced , itself the additive large sieve specialised to logarithmic frequencies [Montgomery-Vaughan §6]. The large sieve is the structural reason the integral of over a line of length costs only a factor rather than : the values at spaced behave like an almost-orthogonal system, so their square-sum is controlled by the single Parseval quantity .

The quantitative Granville-Soundararajan form. The decay rate in Halász's theorem is sharp, and Granville and Soundararajan recast the whole theory around the distance as the master object [Granville-Soundararajan 2007]. Their formulation proves uniformly, with the minimising exhibited, and establishes the Lipschitz estimate -controlled corrections, the analytic backbone of the pretentious large-sieve and of the Matomäki-Radziwiłł theorem on multiplicative functions in short intervals. The triangle inequality for is what makes the family behave like a single point in the relevant quotient, so "pretending to be " is an equivalence-class statement.

Wirsing's theorem and the Erdős-Wintner problem. For real non-negative multiplicative with prime-averaged density , Wirsing's theorem upgrades the Halász upper bound to a genuine asymptotic [Wirsing 1967]. The case , for a multiplicative set , resolves the Erdős-Wintner mean-value problem and yields the density of . The real hypothesis is essential: it removes the rotational ambiguity in the choice of benchmark, pinning , so that the upper bound becomes a two-sided asymptotic. This is precisely the boundary the Selberg-Delange method of 21.11.05 occupies from the other side — Selberg-Delange needs an analytic factorisation but gives full asymptotic expansions, Wirsing needs only positivity and averaged density.

The Delange theorem and the mean-value dichotomy. Delange's 1961 theorem isolates the surviving case: if is multiplicative and converges (so stays bounded), then [Delange 1961]. Combined with Halász, this gives the complete mean-value dichotomy for : either pretends to be some and oscillates with a computable modulus, or it pretends to nothing and . There is no intermediate regime, which is the multiplicative-function analogue of the zero-one laws of probability.

Synthesis. The foundational reason mean values of multiplicative functions split cleanly into "survives" and "decays" is the pretentious distance : the mean is large exactly when this distance is small for some , and the central insight is that loses precisely a factor from its maximal size, so the size of the Euler product on the line is the size of the mean. This is exactly the soft counterpart of the Selberg-Delange asymptotic of 21.11.05: where Selberg-Delange demands the factorisation and reads an exact , Halász asks only and reads the sharp bound — putting these together, the two are dual halves of the line- analysis, the contour method for clean Euler products and the distance method for arbitrary ones. The Halász-Montgomery inequality generalises the large sieve from additive to multiplicative characters, and the bridge is the partial summation of 21.11.02 that turns the Dirichlet series into the summatory function; the same machine appears again in the Granville-Soundararajan pretentious programme, where becomes the metric in which short-interval and arithmetic-progression results of the Matomäki-Radziwiłł type are stated.

Full proof set Master

Proposition 1 (non-negativity and triangle inequality of ). For multiplicative with values in , and .

Proof. Non-negativity: for each prime , so and the weighted sum is non-negative. For the triangle inequality, fix and set , , in . The chordal-distance bound holds because is, up to the factor , the Euclidean distance from to on the unit circle extended to the disc by the parallelogram identity, and Euclidean distance obeys the triangle inequality. Writing and applying Minkowski's inequality to the pointwise bound yields , the claim.

Proposition 2 (Euler-product size on the -line). Let be multiplicative with and . For , $$ \log|F(s)| = \log\log x - \mathbb{D}(f, n^{i\tau}; x)^2 + O(1). $$

Proof. Take the logarithm of the Euler product . Since , the contribution of prime powers with is uniformly for , and absorbs the squared first-order term into the same . Hence . With the factor is for and decays for , so up to the sum is over : $$ \operatorname{Re}\sum_{p \le x}\frac{f(p)}{p^{1 + i\tau}} + O(1) = \sum_{p \le x}\frac{\operatorname{Re}(f(p)p^{-i\tau})}{p} + O(1). $$ Subtract and add (Mertens, 21.11.02): $$ \log|F(s)| = \log\log x - \sum_{p \le x}\frac{1 - \operatorname{Re}(f(p)p^{-i\tau})}{p} + O(1) = \log\log x - \mathbb{D}(f, n^{i\tau}; x)^2 + O(1). \qquad \square $$

Proposition 3 (Parseval bound for the Dirichlet polynomial). For and , $$ \int_{-T}^{T}\bigl|F(1 + \tfrac{1}{\log x} + i\tau)\bigr|^2,d\tau \ll T\log x. $$

Proof. By the mean-value theorem for Dirichlet series (the large-sieve / Montgomery-Vaughan mean-value estimate), $$ \int_{-T}^{T}\Bigl|\sum_n \frac{f(n)}{n^{1 + 1/\log x}} n^{-i\tau}\Bigr|^2 d\tau \ll \sum_n \frac{|f(n)|^2}{n^{2 + 2/\log x}}\bigl(n + T\bigr) \ll T\sum_n \frac{|f(n)|^2}{n^{2}} + \sum_n\frac{|f(n)|^2}{n^{1 + 2/\log x}}. $$ The first sum converges (it is since ); the second is by Mertens applied to . The dominant term is once , giving the bound. This is the quantitative form of the statement that spaced values of on the -line are almost orthogonal, the input that lets the Halász integral cost only .

Proposition 4 (decay in the non-pretentious case). If as for every fixed , then .

Proof. Fix . Choose , contributing truncation error in Halász's theorem. By hypothesis, for each the distance ; by a compactness/uniformity argument over the compact interval (the distances are equicontinuous in on the regulariser scale), the minimum as well. Then , so there is with for . Halász's theorem gives for . Since was arbitrary, .

Connections Master

  • Average orders and the summation toolkit 21.11.02. Halász's method runs on the partial-summation pairing between a Dirichlet series and its summatory function established there: the mean is read off from a line integral of , and Mertens' estimate from that unit is the comparison value against which the pretentious distance is measured. The toolkit supplies the analytic plumbing; this unit supplies the multiplicative content.

  • The Selberg-Delange method 21.11.05. This unit is the soft, real-variable mirror of the contour method: Selberg-Delange demands the clean factorisation and extracts the exact asymptotic , while Halász asks only and delivers the sharp upper bound . Exercise 8 shows the two methods coincide on their overlap, where Wirsing's theorem is literally Selberg-Delange applied to the factored Euler product; together they bracket the analysis on the line from both sides.

  • Kloosterman sums and the Kuznetsov formula 21.14.05. Both this unit and the spectral large sieve of that unit descend from the additive large sieve . The Halász-Montgomery inequality is its specialisation to logarithmic frequencies , while the Deshouillers-Iwaniec spectral large sieve is its specialisation to the Fourier coefficients of automorphic forms; the shared almost-orthogonality principle is the structural link between mean values of multiplicative functions and the spectral theory of .

Historical & philosophical context Master

Erdős and Wintner posed the mean-value problem for multiplicative functions in the 1930s and conjectured that a real multiplicative with always has a mean value. Wirsing settled the real non-negative case in a 1967 Mathematische Annalen paper [Wirsing 1967] by an intricate elementary-analytic argument, obtaining the asymptotic with its constant and resolving the Erdős-Wintner conjecture in that range. The following year Gábor Halász, in Acta Mathematica Academiae Scientiarum Hungaricae [Halász 1968], treated the full complex case and identified the governing quantity as the minimal distance to the family , with the sharp decay rate that bears his name. The contemporaneous Delange theorem [Delange 1961] had already pinned the surviving case where converges.

The reformulation in terms of the pretentious distance is due to Granville and Soundararajan in the 2000s [Granville-Soundararajan 2007], who recognised that is a genuine pseudometric obeying the triangle inequality, and rebuilt the mean-value theory, the large sieve for multiplicative functions, and eventually the Matomäki-Radziwiłł short-interval theorem as statements about distance in this metric. The Halász-Montgomery inequality, which transmits the pointwise control of the Euler product on the -line to the mean value, is the multiplicative descendant of the large sieve developed by Linnik, Rényi, Bombieri, and Montgomery; its appearance here marks the point where sieve theory and the theory of multiplicative functions become one subject.

Bibliography Master

@article{Wirsing1967,
  author  = {Wirsing, Eduard},
  title   = {Das asymptotische Verhalten von Summen \"uber multiplikative Funktionen II},
  journal = {Mathematische Annalen},
  volume  = {143},
  year    = {1967},
  pages   = {75--102},
  note    = {Mean values of real non-negative multiplicative functions; Erd\H{o}s-Wintner problem}
}

@article{Halasz1968,
  author  = {Hal\'asz, G\'abor},
  title   = {\"Uber die Mittelwerte multiplikativer zahlentheoretischer Funktionen},
  journal = {Acta Mathematica Academiae Scientiarum Hungaricae},
  volume  = {19},
  year    = {1968},
  pages   = {365--403},
  note    = {Halász's theorem: mean of complex multiplicative |f|<=1 via distance to n^{it}}
}

@article{Delange1961,
  author  = {Delange, Hubert},
  title   = {Sur les fonctions arithm\'etiques multiplicatives},
  journal = {Annales scientifiques de l'\'Ecole Normale Sup\'erieure (3)},
  volume  = {78},
  year    = {1961},
  pages   = {273--304},
  note    = {The Delange mean-value theorem for convergent sum_p (1-f(p))/p}
}

@article{GranvilleSoundararajan2007,
  author  = {Granville, Andrew and Soundararajan, Kannan},
  title   = {Decay of mean values of multiplicative functions},
  journal = {Canadian Journal of Mathematics},
  volume  = {59},
  number  = {6},
  year    = {2007},
  pages   = {1265--1306},
  note    = {The pretentious distance D(f,g;x) and the quantitative Halász theorem}
}

@book{MontgomeryVaughan2007,
  author    = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title     = {Multiplicative Number Theory I: Classical Theory},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {97},
  publisher = {Cambridge University Press},
  year      = {2007},
  note      = {\S6: Wirsing, Halász, the Halász-Montgomery inequality and the large sieve}
}

@book{Tenenbaum2015,
  author    = {Tenenbaum, G\'erald},
  title     = {Introduction to Analytic and Probabilistic Number Theory},
  edition   = {3},
  series    = {Graduate Studies in Mathematics},
  volume    = {163},
  publisher = {American Mathematical Society},
  year      = {2015},
  note      = {\S III.4: the Halász method and mean values on the unit disc}
}