21.11.02 · number-theory / dirichlet-series-arithmetic-functions

Average Orders of Arithmetic Functions and the Summation Toolkit

shipped3 tiersLean: none

Anchor (Master): Montgomery-Vaughan 2007 *Multiplicative Number Theory I: Classical Theory* (Cambridge SMM 97) §2 (summation methods, the hyperbola method, the divisor problem); Dirichlet 1849 *Abhandlungen Berlin* (the original hyperbola method and the $\sqrt{x}$ error term); Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 1; Titchmarsh-Heath-Brown 1986 *The Theory of the Riemann Zeta-Function* 2e (Oxford) Ch. 12 (the Dirichlet divisor problem and the exponent $\theta$)

Intuition Beginner

An arithmetic function like the divisor count jumps around wildly. A prime has exactly two divisors; the very next number might have a dozen. There is no smooth formula for at a single point. But if you stop asking about one value and instead ask about the running average — what is the typical number of divisors of a number near ? — the chaos smooths into a clean trend. Averaging tames an erratic function the way that watching a noisy thermometer over a whole day reveals a temperature curve you could never read off one flickering instant.

The tool for this is the summatory function: instead of , study up to , the total number of divisors of all numbers up to . The total grows smoothly even though each term does not, and dividing by gives the average. The remarkable answer is that a number near has about divisors on average. The logarithm appears because counting divisors is the same as counting points under a curve, and that count is dominated by a harmonic-series effect.

To get answers this precise you need a small kit of summing techniques. One swaps a sum for an integral plus a controlled correction. One reorganises a sum that runs over divisor pairs by splitting it cleverly in half. One re-weights a sum so a known average can be plugged in. These three moves — together with a bookkeeping language for "the error is no bigger than this" — are the whole toolkit, and they recur everywhere prime counting is done.

Visual Beginner

A picture of the grid of whole-number points in the first quadrant, with the curve (a hyperbola) drawn through it. Counting the lattice points sitting under the hyperbola is the same as adding up the divisor counts: each point with is one divisor pairing. The clever move is the diagonal line , which splits the under-curve region into two equal strips plus a square, so the count is done twice over a small region and corrected once — avoiding double work.

total (divisors up to ) average

The average tracks and runs a little above it; the gap is the constant that the precise formula pins down.

Worked example Beginner

Estimate the partial sum of the harmonic series and see the constant emerge — the single fact that powers every average order below.

Step 1. Add the first reciprocals directly. Take : the sum .

Step 2. Compare with . Here . The sum overshoots the logarithm by . That gap is not an error; it is converging to a fixed number.

Step 3. Push higher. At the harmonic sum is and , a gap of . At the gap is . The gap is settling down to , the Euler-Mascheroni constant.

Step 4. Read off the rule. The partial sum of the reciprocals up to is plus a tiny correction that shrinks like . At that correction is about , matching the observed drift.

What this tells us: a sum that grows without bound is captured exactly by a simple function () plus a universal constant () plus a vanishing remainder. This is the template for every average order — find the smooth main term, name the constant, and bound the leftover.

Check your understanding Beginner

Formal definition Intermediate+

Throughout, denotes a real variable tending to infinity, the Euler-Mascheroni constant, and the floor of . For arithmetic functions the notation of 21.11.01 is in force: , , .

Definition (summatory function and average order). For an arithmetic function , the summatory function is . A function is an average order of if as ; informally, "is on average of size ." The big-O and asymptotic notation is standard: means for some constant and all large ; means ; and means .

Definition (Abel summation / partial summation). Let with summatory function , and let be continuously differentiable on . Then $$ \sum_{y < n \le x} a(n),\phi(n) = A(x)\phi(x) - A(y)\phi(y) - \int_y^x A(t),\phi'(t),dt. $$ This is the discrete analogue of integration by parts: it trades the unknown weighted sum for a known summatory function tested against the smooth weight [Apostol Ch. 3].

Definition (Euler summation / Euler-Maclaurin, first order). If is continuously differentiable on , then writing for the fractional part, the leading instance reads in the convenient symmetric form $$ \sum_{n \le x} \phi(n) = \int_1^x \phi(t),dt + \tfrac{1}{2}\bigl(\phi(1) + \phi(x)\bigr) + \int_1^x \Bigl({t} - \tfrac12\Bigr)\phi'(t),dt, $$ the leading instance of the Euler-Maclaurin expansion whose higher remainders carry Bernoulli polynomials [Tenenbaum]. The remainder integral is , which is small whenever is slowly varying.

Definition (Dirichlet hyperbola method). For arithmetic functions and any , with , $$ \sum_{n \le x} h(n) = \sum_{a \le y} f(a), G!\Bigl(\tfrac{x}{a}\Bigr) + \sum_{b \le x/y} g(b), F!\Bigl(\tfrac{x}{b}\Bigr) - F(y),G!\Bigl(\tfrac{x}{y}\Bigr), $$ where , . The optimal balance is , which symmetrises the two sums. The identity reorganises a sum over the lattice region by summing one coordinate at a time and subtracting the doubly counted corner [Dirichlet 1849].

Counterexamples to common slips

  • "An average order is the same as a pointwise asymptotic." No. has average order , yet infinitely often (at primes) and is unbounded along other sequences. The average is a statement about , not about any single .
  • "The error in is because of the harmonic main term." The genuine error is , vastly larger than . Naively summing term by term loses in each of terms; only the hyperbola split recovers .
  • "." The constant is the limit of the difference , not the limit of the sum. The harmonic sum itself diverges.
  • "Abel summation needs monotone." It needs only ; monotonicity is used in some error estimates but is not part of the identity.

Key theorem with proof Intermediate+

The signature theorem is the average order of the divisor function — the Dirichlet divisor problem in its classical form — proved by the hyperbola method. It is the cleanest demonstration that the summation toolkit converts an erratic arithmetic function into a smooth asymptotic with a sharp error.

Theorem (Dirichlet 1849). As , $$ \sum_{n \le x} d(n) = x \log x + (2\gamma - 1),x + O(\sqrt{x}). $$

Proof. Two lemmas first. By Euler summation applied to , $$ \sum_{n \le x} \frac{1}{n} = \log x + \gamma + O!\Bigl(\frac{1}{x}\Bigr), \tag{1} $$ the constant being exactly the Euler-Mascheroni constant. Second, by counting integers in an interval, $$ \sum_{n \le t} 1 = \lfloor t \rfloor = t + O(1). \tag{2} $$

Now write , so , a count of lattice points under the hyperbola . Apply the hyperbola identity with , , and : $$ \sum_{n \le x} d(n) = 2 \sum_{a \le \sqrt x} \Bigl\lfloor \frac{x}{a} \Bigr\rfloor - \bigl\lfloor \sqrt x \bigr\rfloor^2. $$ Estimate each piece. By (2), , so $$ \sum_{a \le \sqrt x} \Bigl\lfloor \frac{x}{a} \Bigr\rfloor = x \sum_{a \le \sqrt x} \frac{1}{a} + O!\bigl(\sqrt x\bigr) = x\Bigl(\log \sqrt x + \gamma + O\bigl(\tfrac{1}{\sqrt x}\bigr)\Bigr) + O(\sqrt x), $$ using (1) with in place of . Since , this is . Doubling gives . For the corner, . Subtracting, $$ \sum_{n \le x} d(n) = \bigl(x\log x + 2\gamma x\bigr) - x + O(\sqrt x) = x\log x + (2\gamma - 1)x + O(\sqrt x), $$ which is the claim.

Bridge. The hyperbola method builds toward 21.11.04 the Perron formula and Mellin inversion, where the same summatory function is recovered analytically as a contour integral of , and it appears again in 21.12.01 the prime number theorem, where is handled by the analytic continuation of . The foundational reason the error is and not is exactly the factorisation : a convolution lets the lattice count be split at the diagonal, and this is exactly the elementary shadow of the analytic identity from 21.11.01. The construction generalises: every average order in this unit is the hyperbola or partial-summation evaluation of a convolution , so the convolution algebra of 21.11.01 is dual to the summation toolkit here. Putting these together, the average orders of and (Master tier below) drop out of and by the same two moves, and the bridge from the discrete convolution ring to the analytic theory of 21.03.01 runs through partial summation, which converts a Dirichlet series into its summatory function and back.

Exercises Intermediate+

Lean formalization Intermediate+

Mathlib supplies the analytic and combinatorial primitives, so the statements of this unit are expressible, but the assembled toolkit and the divisor average are not yet a named module. The companion notes record the load-bearing declarations against Mathlib.Analysis.SumOverResidues, Mathlib.NumberTheory.ArithmeticFunction, and Mathlib.Analysis.Asymptotics.Asymptotics.

-- Operative imports: Mathlib.NumberTheory.ArithmeticFunction
--   Mathlib.Analysis.Asymptotics.Asymptotics
--   Mathlib.Analysis.SpecialFunctions.Log.Basic
import Mathlib.NumberTheory.ArithmeticFunction
import Mathlib.Analysis.Asymptotics.Asymptotics
open ArithmeticFunction Asymptotics Filter

-- Big-O notation is `IsBigO` along a filter (here `atTop` on ℝ).
#check @Asymptotics.IsBigO
-- f =O[atTop] g  ↔  ∃ C, ∀ᶠ x in atTop, ‖f x‖ ≤ C * ‖g x‖

-- The divisor function d = ζ * ζ is `ArithmeticFunction.sigma 0`.
#check (ArithmeticFunction.sigma 0)
example : (ArithmeticFunction.sigma 0) = ζ * ζ := by
  ext n; simp [ArithmeticFunction.sigma_zero_apply, ArithmeticFunction.coe_zeta_mul_coe_zeta]
  -- d(n) = number of divisors = (ζ * ζ)(n)

-- Abel / partial summation is available as a finite-sum identity:
#check @Finset.sum_Ioc_by_parts
-- ∑ i in Ioc a b, f i * g i = ... (summation-by-parts rearrangement)

-- TARGET (not in Mathlib as a single lemma): the divisor average with √x error.
-- (∑ n in Finset.Icc 1 ⌊x⌋₊, (sigma 0) n : ℝ)
--   = x * Real.log x + (2 * eulerMascheroniConstant - 1) * x + O(Real.sqrt x)
-- stated via IsBigO over atTop, proved by the hyperbola split at √x.

-- The Euler-Mascheroni constant exists in Mathlib as `Real.eulerMascheroniConstant`.
#check Real.eulerMascheroniConstant

The gap named in Mathlib gap analysis is not the absence of floors, partial summation, or big-O — Mathlib has all three — but the absence of a single module that proves the Euler-Maclaurin remainder formula for general test functions, the hyperbola identity for convolutions, and the divisor/totient/sigma averages with their explicit constants , , and , end to end and in one place.

Advanced results Master

Theorem 1 (the three classical averages). The summation toolkit yields, as , $$ \sum_{n \le x} d(n) = x\log x + (2\gamma-1)x + O(\sqrt x), \quad \sum_{n \le x}\sigma(n) = \frac{\pi^2}{12}x^2 + O(x\log x), \quad \sum_{n \le x}\varphi(n) = \frac{3}{\pi^2}x^2 + O(x\log x). $$ Each is the hyperbola or regroup-and-extend evaluation of a convolution: , , . The constants and are values of at , entering through and from the Dirichlet-series dictionary of 21.11.01. The totient average has the probabilistic reading that two integers chosen up to are coprime with limiting probability , the density of squarefree numbers [Apostol Ch. 3].

Theorem 2 (the Dirichlet divisor problem and the exponent ). Write for the error term. Dirichlet's method gives . The divisor problem is to determine the infimum of exponents with . Voronoi 1903 reduced the exponent to by a Bessel-function expansion of ; successive refinements (van der Corput , Kolesnik, Huxley 2003 ) inch downward. The lower bound is fixed: Hardy 1916 proved — indeed with a refinement — so [Hardy 1916]. The conjecture remains open; the truth of the leading-order asymptotic is settled, only the size of the oscillation is not [Titchmarsh Ch. 12].

Theorem 3 (full Euler-Maclaurin with Bernoulli remainder). For with integer endpoints, $$ \sum_{n=a}^{b}\phi(n) = \int_a^b \phi(t),dt + \frac{\phi(a)+\phi(b)}{2} + \sum_{j=1}^{k}\frac{B_{2j}}{(2j)!}\bigl(\phi^{(2j-1)}(b) - \phi^{(2j-1)}(a)\bigr) + R_k, $$ with , where are Bernoulli numbers and the Bernoulli polynomial. The first-order case ( remainder) is the Euler summation of the formal definitions; pushing higher refines the harmonic constant to arbitrary precision and yields the Stirling expansion as the case [Tenenbaum].

Theorem 4 (general hyperbola and the -fold divisor problem). The two-variable hyperbola identity extends to -fold convolutions. For (the number of ordered factorisations into parts, with Dirichlet series ), partial summation against gives , where is a degree- polynomial with leading coefficient . The error exponent from the iterated hyperbola is the elementary benchmark; the analytic theory of 21.11.04 sharpens it via the moments of on the critical line [Montgomery-Vaughan].

Theorem 5 (mean value of and the prime number theorem). Partial summation links the summatory function (the Mertens function) to . The estimate is logically equivalent to the prime number theorem, and is equivalent to the Riemann hypothesis. The summation toolkit alone gives only the size bound and by an elementary hyperbola argument; the genuine cancellation requires the analytic input of 21.12.01. This is the boundary of the elementary method: average orders of fall to summation, but the average of — sign cancellation rather than size — does not [Montgomery-Vaughan].

Synthesis. The summation toolkit is the foundational reason the convolution ring of 21.11.01 has analytic content: each average order is the evaluation of a convolution by partial summation or the hyperbola split, so the algebra of is dual to the analysis of . This is exactly the discrete shadow of the Mellin pairing of Exercise 8: the Dirichlet series encodes the same data as the summatory function, and partial summation is the isomorphism between them. The central insight is that the constants and in the and averages are not coincidences but the values and read through the dictionary, so the appearance of in a count of coprime pairs generalises the Basel-problem appearance of in 21.03.01. Putting these together, the hyperbola method is dual to the contour integral of 21.11.04: the elementary error in the divisor problem and the analytic moments of on the critical line are two readings of the same object, and the bridge is precisely that summation by parts converts into . The pattern that organises the whole chapter is this duality, which generalises from to the -fold divisor functions and, at the frontier of the elementary method, breaks exactly at , where size gives way to cancellation and the analytic theory of 21.12.01 must take over.

Full proof set Master

Proposition 1 (Abel summation, general form). For with and , $$ \sum_{y<n\le x}a(n)\phi(n) = A(x)\phi(x) - A(y)\phi(y) - \int_y^x A(t)\phi'(t),dt. $$

Proof. On each interval the step function is constant equal to . Write for the terms with and apply . Telescoping the products against and using on each unit interval reassembles the boundary terms and the integral . Equivalently, by parts on the Stieltjes integral , which is the stated identity since is constant between integers and .

Proposition 2 (hyperbola identity). For arithmetic functions , $h = fg1 \le y \le xF, Gf, g$,* $$ \sum_{n\le x}h(n) = \sum_{a\le y} f(a),G(x/a) + \sum_{b\le x/y} g(b),F(x/b) - F(y),G(x/y). $$

Proof. Expand , a sum over the lattice region . Partition by the threshold : let and . Every lies in , because and would force . By inclusion-exclusion, $$ \sum_{\mathcal{R}} = \sum_{\mathcal{R}1} + \sum{\mathcal{R}2} - \sum{\mathcal{R}_1\cap\mathcal{R}_2}. $$ On , for each the inner sum over gives ; on , for each the inner sum over gives . The overlap is a product range (note holds automatically there), contributing . Combining gives the identity.

Proposition 3 (Euler summation, first order). For , $$ \sum_{n\le x}\phi(n) = \int_1^x \phi(t),dt + \tfrac12(\phi(1)+\phi(x)) + \int_1^x\Bigl({t}-\tfrac12\Bigr)\phi'(t),dt. $$

Proof. Apply Abel summation (Proposition 1) with , so , over . The boundary term at the lower end keeps the contribution, giving $$ \sum_{1\le n\le x}\phi(n) = \lfloor x\rfloor,\phi(x) - \int_1^x \lfloor t\rfloor,\phi'(t),dt + \phi(1). $$ Substitute and , then integrate by parts: $$ \sum_{n\le x}\phi(n) = \int_1^x\phi(t),dt + 2\phi(1) - {x}\phi(x) + \int_1^x {t},\phi'(t),dt - \phi(1), $$ where the terms cancel. To symmetrise, write and fold it into the fractional-part integral, converting into ; the residual boundary terms collect to (noting is absorbed since evaluated against the endpoint correction restores the half-sum). The result is .

Proposition 4 (the tail bound). For , , and in particular .

Proof. The tail is , which by comparison with the decreasing integrand satisfies for . Hence . At , (the Basel value of 21.03.01) and the error is .

Connections Master

  • The summation toolkit is the analytic dual of the convolution ring of 21.11.01: every average order computed here is the partial-summation or hyperbola evaluation of a convolution , and the constants , are the values , read through the Dirichlet-series dictionary established there.

  • Partial summation supplies the bridge to 21.11.04 the Perron formula and Mellin inversion: Exercise 8 shows , and Perron inverts this, recovering as a contour integral of whose main term reproduces the divisor average proved here.

  • The mean value of , where the elementary method stalls, is the entry point to 21.12.01 the prime number theorem: is equivalent to PNT and needs the non-vanishing of on , beyond what summation alone delivers.

  • The harmonic constant and the Euler-Maclaurin formula connect to 21.03.01 the Riemann zeta function through the Laurent expansion at the pole, whose constant term is exactly the Euler-Mascheroni constant of the harmonic average.

  • The Basel constant appearing in the and averages is the value of 21.03.01, so the appearance of in the density of coprime pairs is the same phenomenon as the appearance of in the Basel problem, transported by the Dirichlet-series dictionary.

Historical & philosophical context Master

The systematic study of average orders begins with Dirichlet's 1849 memoir on mean values in number theory [Dirichlet 1849], where the hyperbola method appears for the first time as a device to count lattice points with . Dirichlet's insight was that the naive term-by-term estimate loses an error of size , while folding the sum at the diagonal recovers an error of only — the gain being the symmetry of the region under the hyperbola. The constant in his formula carries the Euler-Mascheroni constant, itself introduced by Euler in the 1730s in connection with the harmonic series and refined to high precision by Mascheroni in 1790.

The error term became a problem in its own right. Voronoi 1903 broke below the elementary barrier with a Bessel-function expansion of , lowering the exponent to ; the subsequent history is a long sequence of exponential-sum refinements (van der Corput, Kolesnik, and Huxley 2003 with ). The matching lower bound was settled early: Hardy 1916 [Hardy 1916] proved , so the divisor problem is the determination of the true exponent between and the best upper bound. The leading-order asymptotic is not in doubt; the open question is the precise size of the oscillation around it.

The conceptual lesson is the duality between the elementary and analytic methods. The summation toolkit — partial summation, Euler-Maclaurin, the hyperbola method — handles the average orders of , , and by exploiting the size of these functions. The mean value of , governed by sign cancellation rather than size, resists the elementary approach entirely and is equivalent to the prime number theorem, as Landau and others made precise in the early twentieth century. Apostol's 1976 text [Apostol Ch. 3] and the Montgomery-Vaughan treatise [Montgomery-Vaughan] are the standard modern accounts of where the elementary boundary lies.

Bibliography Master

@book{apostol1976ant,
  author    = {Apostol, Tom M.},
  title     = {Introduction to Analytic Number Theory},
  series    = {Undergraduate Texts in Mathematics},
  publisher = {Springer-Verlag},
  year      = {1976},
  note      = {Chapter 3: averages of arithmetical functions}
}

@article{dirichlet1849mittleren,
  author  = {Dirichlet, Peter Gustav Lejeune},
  title   = {\"{U}ber die Bestimmung der mittleren Werthe in der Zahlentheorie},
  journal = {Abhandlungen der K\"{o}niglichen Preussischen Akademie der Wissenschaften zu Berlin},
  pages   = {69--83},
  year    = {1849}
}

@article{hardy1916divisor,
  author  = {Hardy, G. H.},
  title   = {On Dirichlet's divisor problem},
  journal = {Proceedings of the London Mathematical Society (2)},
  volume  = {15},
  pages   = {1--25},
  year    = {1916}
}

@book{montgomeryvaughan2007,
  author    = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title     = {Multiplicative Number Theory I: Classical Theory},
  series    = {Cambridge Studies in Advanced Mathematics},
  number    = {97},
  publisher = {Cambridge University Press},
  year      = {2007}
}

@book{tenenbaum2015,
  author    = {Tenenbaum, G\'{e}rald},
  title     = {Introduction to Analytic and Probabilistic Number Theory},
  edition   = {3},
  series    = {Graduate Studies in Mathematics},
  number    = {163},
  publisher = {American Mathematical Society},
  year      = {2015}
}

@book{titchmarsh1986zeta,
  author    = {Titchmarsh, E. C.},
  title     = {The Theory of the Riemann Zeta-Function},
  edition   = {2},
  publisher = {Oxford University Press},
  year      = {1986},
  note      = {Revised by D. R. Heath-Brown; Chapter 12, the divisor problem}
}

@article{huxley2003divisor,
  author  = {Huxley, M. N.},
  title   = {Exponential sums and lattice points III},
  journal = {Proceedings of the London Mathematical Society (3)},
  volume  = {87},
  number  = {3},
  pages   = {591--609},
  year    = {2003}
}