21.14.02 · number-theory / sieve-methods-large-sieve

The Bombieri-Vinogradov Theorem

shipped3 tiersLean: none

Anchor (Master): Bombieri 1965 *Mathematika* 12, 201-225 (*On the large sieve* — the original theorem); A. I. Vinogradov 1965 *Izv. Akad. Nauk SSSR* 29, 903-934 (independent proof); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* (Cambridge SMM 97) §9; Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) §17; Vaughan 1980 *Mathematika* 27, 153-163 (the simplified Vaughan-identity proof); Elliott-Halberstam 1970 *Symposia Mathematica* 4, 59-72 (the conjecture); Zhang 2014 *Annals of Math.* 179, 1121-1174 and Maynard 2015 *Annals of Math.* 181, 383-413 (bounded gaps)

Intuition Beginner

The primes thin out as you go up, but among the numbers with a fixed last-digit pattern they should thin out at the same rate. Count primes up to a million that leave remainder $1$ when divided by $7$ , and remainder $2$ , and remainder $3$ , and so on through the six allowed remainders: each box should hold roughly the same share of the primes. This is the prime number theorem for arithmetic progressions, and it says the primes do not play favourites among the remainder classes a divisor allows.

The catch is the size of the divisor. For small divisors this even spread is a proven fact. For divisors as large as the square root of where you are counting, the even spread is only a conjecture — the Generalized Riemann Hypothesis — and nobody can prove it for any single large divisor. The Bombieri-Vinogradov theorem makes a clever trade: instead of demanding the even spread for every large divisor, it asks only that the spread be even on average as the divisor ranges over all values up to the square root. That averaged statement, remarkably, can be proven outright.

So you give up control of each individual divisor and gain control of the whole pile. A few divisors might misbehave, but they cannot all misbehave at once, because the total error stays small. This is why people call the theorem "the Riemann Hypothesis on average": it delivers, unconditionally, the same strength the famous conjecture would give for each divisor separately, paid out as a sum rather than term by term.

The engine underneath is the large sieve. It is the tool that says well-separated test points cannot all light up at once, and summing the misbehaviour of many divisors is exactly a sum over well-separated points.

Visual Beginner

Picture a long counter with a box for each divisor $q$ from $1$ up to the square root of $x$ . Into box $q$ you drop the worst error you can find across all the remainder classes that $q$ allows — how far the prime count in the most lopsided class strays from its fair share $x / φ (q)$ .

   divisor q:    2    3    4    5   ...   ~sqrt(x)
                ┌──┐ ┌──┐ ┌──┐ ┌──┐       ┌──┐
   worst error: │▁ │ │▃ │ │▁ │ │▂ │  ...  │▁ │
                └──┘ └──┘ └──┘ └──┘       └──┘
                 \________ total height capped ________/
          sum of all box heights  <<  x / (log x)^A

   Any one box may spike, but the whole row's
   summed height stays tiny -- "GRH on average".

A single box might rise — one stubborn divisor whose primes lean one way. But the bracket under the whole row stays low: add up every box and the total is a vanishing fraction of $x$ . That summed cap, not any single box, is what the theorem guarantees.

Worked example Beginner

We check the "fair share" idea by hand on a small divisor, to see what number the theorem says the error should hug.

Step 1. Take the divisor $q = 4$ . The remainders coprime to $4$ are $1$ and $3$ ; there are $φ (4) = 2$ of them. The prime number theorem for progressions predicts that primes split evenly between the class " $\equiv 1$ " and the class " $\equiv 3$ ".

Step 2. Count primes up to $x = 100$ . There are $25$ primes up to $100$ . Removing $2$ (the only prime not coprime to $4$ ), $24$ primes fall into the two allowed classes.

Step 3. The fair share for each class is $24/ φ (4) = 24/2 = 12$ primes apiece.

Step 4. Count the actual classes. Primes $\equiv 1 (mod 4)$ : $5, 13, 17, 29, 37, 41, 53, 61, 73, 89, 97$ — that is $11$ . Primes $\equiv 3 (mod 4)$ : $3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83$ — that is $13$ .

Step 5. The errors from the fair share of $12$ are $∣11 - 12∣ = 1$ and $∣13 - 12∣ = 1$ . The worst error for $q = 4$ is $1$ , small against the count $24$ .

What this tells us: even at this tiny scale the lopsidedness is a single prime out of two dozen. Bombieri-Vinogradov says that when you add up the worst such error over every divisor up to $x$ , the grand total still stays below $x / (lo g x)^{A}$ for any power $A$ you like — the small wobbles never conspire into a large sum.

Check your understanding Beginner

Exercise (easy, multiple choice).

The Bombieri-Vinogradov theorem controls the error in the prime count across remainder classes for divisors up to about:

A. $lo g x$
B. The square root of $x$
C. $x$ itself
D. A fixed constant

Hint

The theorem is famous for matching, on average, what the Generalized Riemann Hypothesis gives for each divisor up to the square root of $x$ .

Answer

B. The averaging runs over divisors $q$ up to $x$ (more precisely $x^{1/2} / (lo g x)^{B}$ ). Option A is the much shorter Siegel-Walfisz range, where the even spread is known for each divisor individually. Option C is the Elliott-Halberstam conjecture's range and is unproven. Option D is far too short to be interesting. Feedback-correct: the square-root range is the headline of Bombieri-Vinogradov. Feedback-wrong: $lo g x$ is the unconditional individual range; the square-root range is the averaged one.

Formal definition Intermediate+

Write $Λ$ for the von Mangoldt function, $ψ (x; q, a) = \sum_{n \leq x, n \equiv a (q)} Λ (n)$ for the weighted prime count in the progression $a mod q$ , and $φ$ for Euler's totient. The expected main term is $x / φ (q)$ , since the primes coprime to $q$ should distribute evenly over the $φ (q)$ admissible classes.

Definition (error term and its maximum over residues). For $g cd (a, q) = 1$ set $$ E(x;q,a) = \psi(x;q,a) - \frac{x}{\varphi(q)}, \qquad E^(x;q) = \max_{\gcd(a,q)=1}\big|E(x;q,a)\big|. $$ $E^(x;q) $i s t h e w or s t s in g l e - c l a ss d e v ia t i o n f r o m e q u i d i s t r ib u t i o na t m o d u l u s$ q$.

Definition (level of distribution). A real $ϑ \in (0, 1]$ is a level of distribution for the primes if for every $A > 0$ there is a $B = B (A)$ such that $$ \sum_{q \le Q} E^*(x;q) \ll_A \frac{x}{(\log x)^A}, \qquad Q = \frac{x^{\vartheta}}{(\log x)^B}. $$ The larger $ϑ$ , the longer the range of moduli over which equidistribution holds on average.

Definition (the Bombieri-Vinogradov theorem). The primes have level of distribution $ϑ = \frac{1}{2}$ : for every $A > 0$ there is $B = B (A)$ with $$ \sum_{q \le Q} \max_{\gcd(a,q)=1}\Big|\psi(x;q,a) - \frac{x}{\varphi(q)}\Big| \ll_A \frac{x}{(\log x)^A}, \qquad Q = x^{1/2}(\log x)^{-B}. $$ The Generalized Riemann Hypothesis for Dirichlet $L$ -functions would give $E^{*} (x; q) ≪ x^{1/2} lo g^{2} (q x)$ for each individual $q$ , hence the same averaged bound up to $Q = x^{1/2 - ε}$ ; Bombieri-Vinogradov delivers the averaged conclusion unconditionally. In this sense $ϑ = \frac{1}{2}$ is "GRH on average".

Definition (the Elliott-Halberstam conjecture). The conjecture asserts that every $ϑ < 1$ is a level of distribution: for all $A$ and all $ε > 0$ , $$ \sum_{q \le x^{1-\varepsilon}} E^*(x;q) \ll_{A,\varepsilon} \frac{x}{(\log x)^A}. $$ It extends the averaged equidistribution from moduli up to $x^{1/2}$ to moduli up to almost $x$ , and is far stronger than GRH (which controls each $q$ only to the $x^{1/2}$ scale).

Counterexamples to common slips

"Bombieri-Vinogradov gives equidistribution for each $q \leq x$ ." It does not. It bounds the sum of the errors; an individual $q$ can have $E^{*} (x; q)$ as large as $x / φ (q)$ in principle, and only finitely many such bad $q$ are excluded by the averaging. Pointwise equidistribution at this scale is GRH.
"The maximum over $a$ is harmless and can be dropped." The $max_{(a, q) = 1}$ inside the sum is essential and far from automatic to install: a bound for a single fixed $a$ is weaker, and the proof must produce uniformity in $a$ before summing. Replacing the max by an average over $a$ gives a strictly easier statement (the Barban-Davenport-Halberstam theorem).
"Level $ϑ = \frac{1}{2}$ and GRH are the same." GRH gives $ϑ = \frac{1}{2}$ on average and pointwise control for each $q$ ; Bombieri-Vinogradov gives only the averaged half. Elliott-Halberstam ( $ϑ \to 1$ ) is strictly stronger than GRH on the averaged side, yet says nothing pointwise.

Key theorem with proof Intermediate+

The signature result is the level- $\frac{1}{2}$ bound; its proof routes the error through Dirichlet characters, applies the multiplicative large sieve of 21.14.01 to a bilinear decomposition of $Λ$ , and disposes of small moduli by Siegel-Walfisz ^{[Montgomery-Vaughan 2007]}.

Theorem (Bombieri-Vinogradov; Bombieri 1965, A. I. Vinogradov 1965). For every $A > 0$ there is $B = B (A) > 0$ such that, with $Q = x^{1/2} (lo g x)^{- B}$ , $$ \sum_{q \le Q} \max_{\gcd(a,q)=1}\Big|\psi(x;q,a) - \frac{x}{\varphi(q)}\Big| \ll_A \frac{x}{(\log x)^A}. $$

Proof. Detect the residue class with Dirichlet characters: for $g cd (a, q) = 1$ , $$ \psi(x;q,a) - \frac{x}{\varphi(q)}\cdot[\text{principal part}] = \frac{1}{\varphi(q)}\sum_{\chi \ne \chi_0}\bar\chi(a),\psi(x,\chi), \qquad \psi(x,\chi) = \sum_{n\le x}\chi(n)\Lambda(n), $$ so $E^{*} (x; q) \leq \frac{1}{φ ( q )} \sum_{χ \neq = χ_{0}} ∣ ψ (x, χ) ∣ + O (lo g^{2} x)$ , the error absorbing the prime powers dividing $q$ . Each non-principal $χ$ mod $q$ is induced by a unique primitive $χ^{*}$ mod $q^{*} ∣ q$ , and $ψ (x, χ) = ψ (x, χ^{*}) + O ((lo g q x)^{2})$ . Reordering the sum over $q \leq Q$ by the primitive character $χ^{*}$ of conductor $q^{*} \leq Q$ , $$ \sum_{q\le Q}E^(x;q) \ll (\log x)\sum_{q^\le Q}\frac{1}{\varphi(q^)}\ \sideset{}{^}\sum_{\chi^\bmod q^}\max_{y\le x}|\psi(y,\chi^)| + \frac{x}{(\log x)^A}, $$ where the weight $\sum_{q\le Q,\ q^\mid q}1/\varphi(q) \ll (\log x)/\varphi(q^) $co l l ec t s t h e m o d u l iin d u ce df r o m$ q^$.

Two ranges of conductor are handled separately. For small conductor $q^{*} \leq (lo g x)^{B}$ , the Siegel-Walfisz theorem — itself resting on the zero-free region and the Siegel-zero analysis of 21.13.01 — gives $∣ ψ (y, χ^{*}) ∣ ≪ x exp (- c lo g x)$ uniformly, and there are $≪ (lo g x)^{2 B}$ such characters, so their total contribution is $≪ x exp (- \frac{c}{2} lo g x) ≪ x (lo g x)^{- A}$ .

For large conductor $(lo g x)^{B} < q^{*} \leq Q$ the saving comes from the large sieve. Decompose $Λ = Λ^{♭} + Λ^{♯}$ by Vaughan's identity ^{[Vaughan 1980]} into Type I sums $\sum_{d \leq U} a_{d} \sum_{m} χ^{*} (d m)$ (one smooth variable) and Type II bilinear sums $\sum_{U < m \leq V} b_{m} \sum_{k} c_{k} χ^{*} (mk)$ (two rough variables), with $U = V = x^{1/3}$ . Apply the multiplicative large sieve of 21.14.01, $$ \sum_{q^\le Q}\frac{q^}{\varphi(q^)}\ \sideset{}{^}\sum_{\chi^}\Big|\sum_{n}\xi_n\chi^(n)\Big|^2 \ll (M + Q^2)\sum_n|\xi_n|^2, $$ to each bilinear piece after Cauchy-Schwarz in the two variables. With $Q^{2} \leq x (lo g x)^{- 2 B}$ and the coefficient lengths balanced near $x^{1/2}$ , the resulting bound is $≪ x (lo g x)^{- A}$ once $B = B (A)$ is taken large enough. Summing the small- and large-conductor contributions and passing back from $ψ (y, χ^{*})$ to $E^{*} (x; q)$ via Perron inversion 21.11.04 yields the stated bound. $□$

Bridge. The proof builds toward the bounded-gaps results of the Advanced results, where level $\frac{1}{2}$ is fed into a Selberg sieve, and the same character-sum machinery appears again in Linnik's theorem on the least prime in a progression. The foundational reason the threshold sits at $x^{1/2}$ is exactly the large-sieve constant $M + Q^{2}$ of 21.14.01: balancing $Q^{2}$ against the coefficient length $M \approx x$ forces $Q \leq x^{1/2}$ , so this is exactly the same $N + Q^{2}$ tradeoff that capped Brun-Titchmarsh. The split into small and large conductor is dual to the split in the prime number theorem itself: small moduli are governed by the zero-free region of 21.13.01, large moduli by the sieve, and putting these together gives the central insight that an averaged GRH needs no Riemann Hypothesis — the average over $q$ converts a statement about individual $L$ -function zeros into a statement about the $ℓ^{2}$ -energy of character sums, which the large sieve controls outright.

Exercises Intermediate+

Exercise 2 (easy, multiple choice).

Which conjecture asserts the averaged equidistribution bound with the level of distribution pushed to $Q = x^{1 - ε}$ ?

A. The Riemann Hypothesis
B. The Elliott-Halberstam conjecture
C. The twin prime conjecture
D. The Goldbach conjecture

Hint

It is named for two analytic number theorists and is the natural "endpoint" strengthening of Bombieri-Vinogradov.

Answer

B. The Elliott-Halberstam conjecture (1970) asserts level of distribution $ϑ = 1$ : the same mean-error bound for $Q = x^{1 - ε}$ , every $ε > 0$ . It is strictly stronger than GRH on the averaged side. Option A controls individual zeros, not an average; options C and D are equidistribution-independent prime statements that Elliott-Halberstam-type inputs help attack but do not state. Its role in bounded gaps (Goldston-Pintz-Yıldırım, Maynard) is what made it famous after 2013.

Exercise 3 (medium, symbolic).

Show that for $g cd (a, q) = 1$ , $ψ (x; q, a) = \frac{1}{φ ( q )} \sum_{χ mod q} \overset{χ}{ˉ} (a) ψ (x, χ)$ , where $ψ (x, χ) = \sum_{n \leq x} χ (n) Λ (n)$ . Identify the principal-character term.

Hint

Use orthogonality of Dirichlet characters: $\frac{1}{φ ( q )} \sum_{χ} \overset{χ}{ˉ} (a) χ (n) = 1_{n \equiv a (q)}$ for $g cd (a, q) = 1$ .

Answer

By character orthogonality, for $g cd (a, q) = 1$ and any $n$ with $g cd (n, q) = 1$ , $\frac{1}{φ ( q )} \sum_{χ mod q} \overset{χ}{ˉ} (a) χ (n) = 1_{n \equiv a (q)}$ , and the indicator's right side is $0$ when $g cd (n, q) > 1$ (where every $χ (n) = 0$ ). Multiply by $Λ (n)$ and sum over $n \leq x$ : $$ \psi(x;q,a) = \sum_{n\le x}\Lambda(n),\mathbf{1}{n\equiv a,(q)} = \frac{1}{\varphi(q)}\sum{\chi\bmod q}\bar\chi(a)\sum_{n\le x}\chi(n)\Lambda(n) = \frac{1}{\varphi(q)}\sum_{\chi}\bar\chi(a)\psi(x,\chi), $$ up to $O (lo g^{2} x)$ from prime powers $p^{k} ∣ q$ . The principal character $χ_{0}$ contributes $\frac{1}{φ ( q )} ψ (x, χ_{0}) = \frac{1}{φ ( q )} \sum_{n \leq x, (n, q) = 1} Λ (n) = \frac{x}{φ ( q )} + (error)$ , the expected main term. Subtracting it leaves $E (x; q, a) = \frac{1}{φ ( q )} \sum_{χ \neq = χ_{0}} \overset{χ}{ˉ} (a) ψ (x, χ)$ . Rubric: full credit for the orthogonality detection, the sum interchange, and isolating $χ_{0}$ as the main term.

Exercise 4 (medium, symbolic).

Explain why reducing each non-principal $χ$ mod $q$ to the primitive character $χ^{*}$ inducing it costs only $O ((lo g q x)^{2})$ in $ψ (x, χ)$ , and why this reduction is needed before applying the large sieve.

Hint

If $χ$ mod $q$ is induced by $χ^{*}$ mod $q^{*}$ with $q^{*} ∣ q$ , then $χ (n) = χ^{*} (n)$ for $g cd (n, q) = 1$ and $χ (n) = 0$ otherwise; the difference $ψ (x, χ) - ψ (x, χ^{*})$ sums $Λ (n)$ over $n$ with $g cd (n, q) > 1$ but $g cd (n, q^{*}) = 1$ .

Answer

Write $χ = χ^{*} χ_{0, q}$ with $χ^{*}$ primitive mod $q^{*}$ and $χ_{0, q}$ principal mod $q$ . Then $χ (n) = χ^{*} (n)$ when $g cd (n, q) = 1$ , and $χ (n) = 0$ but $χ^{*} (n)$ may be nonzero when $g cd (n, q) > 1$ while $g cd (n, q^{*}) = 1$ . Hence $$ \psi(x,\chi)-\psi(x,\chi^) = -\sum_{\substack{n\le x,\ (n,q^)=1\ (n,q)>1}}\chi^(n)\Lambda(n). $$ The non-zero $Λ (n)$ are prime powers $p^{k}$ , and the constraint forces $p ∣ q$ , $p\nmid q^ $; t h er e a r e$ \ll \omega(q)\ll \log q $s u c h p r im es, e a c h co n t r ib u t in g$ \ll \log x $p r im e p o w er s u pt o$ x $, so t h e d i f f er e n ce i s$ O(\log q\log x)=O((\log qx)^2) $. T h er e d u c t i o n t o p r imi t i v ec ha r a c t er s i s n ee d e d b ec a u se t h e m u l t i pl i c a t i v e l a r g es i e v e i s anin e q u a l i t y o v er * p r imi t i v e * c ha r a c t er s (i t s G a u ss - s u min p u t r e q u i r es$ |\tau(\chi^)|^2=q^ $, v a l i d o n l y f or p r imi t i v e$ \chi^* $); s u mmin g o v er in d u ce d c ha r a c t er s w o u l dd o u b l e - co u n t an d b r e ak t h eor t h o g o na l i t y . R u b r i c : f u l l cr e d i t f or t h e d i f f er e n ce i d e n t i t y, t h e$ O((\log qx)^2)$ count, and the primitivity requirement of the large sieve.

Exercise 5 (medium, numeric).

Maynard's theorem gives bounded gaps from level of distribution $ϑ = 1/2$ (Bombieri-Vinogradov), yielding $lim inf_{n} (p_{n + 1} - p_{n}) \leq 600$ ; under Elliott-Halberstam ( $ϑ$ close to $1$ ) the bound improves. State the gap bound Maynard obtains assuming Elliott-Halberstam, and how many primes in a bounded interval it produces infinitely often (the smallest $m$ his method handles for $k = 105$ ).

Hint

Under Elliott-Halberstam, Maynard's $k = 105$ admissible tuple yields a gap of $12$ and infinitely many intervals containing two primes.

Answer

Gap $12$ ; $m = 2$ primes. Maynard's multidimensional sieve, applied with the Elliott-Halberstam level of distribution $ϑ > 0.95$ , gives $lim inf_{n} (p_{n + 1} - p_{n}) \leq 12$ : infinitely often two primes lie within a window of width $12$ . (Unconditionally, with only Bombieri-Vinogradov's $ϑ = 1/2$ , the same method gives $\leq 600$ , later refined by Polymath8b to $246$ .) The conditional $12$ is the closest the GPY-Maynard machinery comes to the twin-prime conjecture's gap of $2$ ; the parity barrier blocks reaching $2$ by these sieve methods alone. The key arithmetic input is precisely the level of distribution: each $0.5$ of extra level buys a smaller admissible $k$ and hence a smaller gap.

Exercise 6 (hard, symbolic).

Sketch how Vaughan's identity $Λ (n) = \sum_{b d = n d \leq U} μ (d) lo g b - \sum_{d c m = n d \leq U, c \leq V} μ (d) Λ (c) + \sum_{mk = n m > U, k > V} (\sum_{d ∣ m d \leq U} μ (d)) Λ (k)$ produces Type I and Type II sums when paired against $χ^{*} (n)$ , and which term is the genuinely bilinear one.

Hint

A sum is "Type I" when one variable ranges over a smooth coefficient and the other runs over a full interval (so the inner character sum is an incomplete geometric/Pólya-Vinogradov sum); it is "Type II" when both variables carry rough coefficients of comparable length, forcing Cauchy-Schwarz and the large sieve.

Answer

Multiply by $χ^{*} (n)$ and sum over $n \leq x$ . The first two terms have a divisor $d \leq U$ and a co-factor running over a full range; collecting the inner sum over the long variable gives Type I sums $\sum_{d \leq U} α_{d} \sum_{m \leq x / d} χ^{*} (d m)$ with $α_{d}$ a short smooth coefficient ( $μ (d)$ or $μ (d) Λ (c)$ -convolved). Here the inner $\sum_{m} χ^{*} (m)$ is an incomplete character sum bounded by Pólya-Vinogradov / partial summation, and after summing over $d \leq U = x^{1/3}$ the total is small. The third term is the Type II (bilinear) sum $\sum_{U < m} β_{m} \sum_{k > V} Λ (k) χ^{*} (mk)$ with $β_{m} = \sum_{d ∣ m, d \leq U} μ (d)$ ; both $m$ and $k$ run over rough ranges of length $≫ x^{1/3}$ . This is the term that resists elementary treatment: applying Cauchy-Schwarz in $m$ and then the multiplicative large sieve of 21.14.01 to the $k$ -sum over characters yields the $(M + Q^{2})$ saving, which is where the $x^{1/2}$ level of distribution is born. Rubric: full credit for sorting the three terms into Type I / Type I / Type II, identifying the bilinear term, and naming Cauchy-Schwarz + large sieve as its disposal.

Exercise 7 (hard, symbolic).

The large-sieve density estimate states $\sum_{q \leq Q} \frac{q}{φ ( q )} \sideset^{*} \sum_{χ mod q} N (α, T, χ) ≪ (Q^{2} T)^{c (1 - α)} (lo g QT)^{c^{'}}$ , where $N (α, T, χ)$ counts zeros of $L (s, χ)$ with $Re (s) \geq α$ , $∣ Im (s) ∣ \leq T$ . Explain why such a density estimate is an alternative route to Bombieri-Vinogradov, and what feature substitutes for GRH.

Hint

GRH would force $α \leq 1/2$ for every non-real zero. A density estimate instead allows zeros past $1/2$ but proves there are few of them on average over $q$ ; the explicit formula converts a zero-count bound into a $ψ (x, χ)$ bound.

Answer

By the explicit formula, $ψ (x, χ) = - \sum_{ρ} \frac{x ^{ρ}}{ρ} + (small)$ , summed over the non-real zeros $ρ = β + iγ$ of $L (s, χ)$ ; each zero contributes $x^{β}$ , so a zero with $β$ near $1$ is costly. GRH would put every $β = 1/2$ , giving $ψ (x, χ) ≪ x^{1/2} lo g^{2} (q x)$ for each $χ$ . The density estimate does not exclude zeros with $β > 1/2$ ; instead it bounds, on average over $q \leq Q$ and over primitive $χ$ , the total number of zeros to the right of $α$ , with a count that decays as $α \to 1$ . Feeding this into the explicit formula and summing $\sum_{q} \frac{1}{φ ( q )} \sum_{χ}^{*} ∣ ψ (x, χ) ∣$ , the rare zeros past $1/2$ contribute a controlled total — precisely $≪ x (lo g x)^{- A}$ in the range $Q \leq x^{1/2} (lo g x)^{- B}$ . The substitute for GRH is the zero-density phenomenon: not every $L$ -function need satisfy RH, only that exceptions to RH be sparse on average. The large sieve supplies exactly this density bound, the constant $Q^{2} T$ mirroring the $M + Q^{2}$ of 21.14.01. Rubric: full credit for the explicit-formula link, the $x^{β}$ cost of off-line zeros, the density-on-average substitution, and naming the large sieve as its source.

Exercise 8 (hard, symbolic).

Show that the $x^{1/2}$ level of distribution is forced by the large-sieve constant: writing the multiplicative large sieve bound for character sums of length $\approx x$ as $≪ (x + Q^{2}) \cdot (energy)$ , explain why pushing $Q$ past $x^{1/2}$ loses the saving, and how Elliott-Halberstam's $Q = x^{1 - ε}$ would require a different (non-large-sieve) input.

Hint

The large sieve is sharp only when $Q^{2} ≲$ the coefficient length. Once $Q^{2} > x$ , the term $Q^{2}$ dominates and the inequality no longer beats the diagonal bound on $\sum_{χ}^{*} ∣ \sum_{n} ξ_{n} χ (n) ∣^{2}$ .

Answer

The multiplicative large sieve gives $\sum_{q \leq Q} \frac{q}{φ ( q )} \sum_{χ}^{*} ∣ \sum_{n \leq M} ξ_{n} χ (n) ∣^{2} \leq (M + Q^{2}) \sum_{n} ∣ ξ_{n} ∣^{2}$ . In the Bombieri-Vinogradov application the character sums have length $M \approx x$ (the Type II variables multiply to size $\approx x$ ). When $Q \leq x^{1/2}$ , the constant $M + Q^{2} \leq 2 x$ is of the same order as $M$ , so the inequality is sharp and the per-character bound it implies after Cauchy-Schwarz beats the diagonal $\sum_{χ}^{*} ∣ \dots ∣^{2} \leq φ (q) \sum_{n} ∣ ξ_{n} ∣^{2}$ summed over $q$ . Once $Q > x^{1/2}$ , the term $Q^{2} > x \geq M$ dominates, and $(M + Q^{2}) \approx Q^{2}$ no longer represents a saving over the bare diagonal bound — the large sieve has nothing left to give. Reaching Elliott-Halberstam's $Q = x^{1 - ε}$ would need control of $ψ (x, χ)$ for individual large-conductor characters far beyond what an $ℓ^{2}$ -energy inequality can supply: well-factorable moduli and Kloosterman-sum / spectral inputs (the Deshouillers-Iwaniec and Zhang technology) replace the bare large sieve there. Rubric: full credit for the $M \approx x$ identification, the $Q^{2}$ -dominance argument past $x^{1/2}$ , and naming a beyond-large-sieve input for Elliott-Halberstam.

Advanced results Master

The density-estimate proof and the comparison ladder

The cleanest modern proof routes through a zero-density estimate for Dirichlet $L$ -functions, summed over moduli by the large sieve ^{[Montgomery-Vaughan 2007]}. Writing $N (α, T, χ)$ for the number of zeros of $L (s, χ)$ in $Re (s) \geq α$ , $∣ Im (s) ∣ \leq T$ , the large sieve yields $$ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_{\chi\bmod q}N(\alpha,T,\chi) \ll \big(Q^2 T\big)^{\frac{c(1-\alpha)}{1}}(\log QT)^{c'}, $$ a bound exhibiting that zeros far from the critical line are rare on average over the modulus. Fed through the explicit formula for $ψ (x, χ)$ , this density estimate converts directly into the averaged error bound, with the level $ϑ = \frac{1}{2}$ emerging from the balance $Q^{2} T \approx x$ . The comparison ladder is then sharp: Siegel-Walfisz controls $q \leq (lo g x)^{A}$ individually; Bombieri-Vinogradov controls $q \leq x^{1/2 - o (1)}$ on average; GRH would control every $q \leq x^{1/2 - ε}$ individually; Elliott-Halberstam would extend the average to $q \leq x^{1 - ε}$ .

The Vaughan-identity bilinear decomposition

Bombieri's original 1965 argument and Vinogradov's independent one used the large sieve in a more intricate form; Vaughan's 1980 identity streamlined the proof to its present shape ^{[Vaughan 1980]}. The identity writes $Λ$ as a sum of a Type I part (a single smooth divisor variable) and a Type II part (a genuine bilinear form in two rough variables). The Type I sums are disposed of by the Pólya-Vinogradov bound and partial summation; the Type II sums, after Cauchy-Schwarz, are exactly the character bilinear forms the multiplicative large sieve of 21.14.01 bounds. The threshold $x^{1/2}$ is the precise point where the large-sieve constant $M + Q^{2}$ stops beating the diagonal bound when $M \approx x$ , so the bilinear method and the density-estimate method agree on where the wall stands.

Elliott-Halberstam, bounded gaps, and the parity barrier

The Elliott-Halberstam conjecture asks for level $ϑ \to 1$ ^{[Elliott-Halberstam 1970]}; it is open and strictly stronger than GRH on the averaged side. Its arithmetic payoff is bounded gaps between primes. Goldston-Pintz-Yıldırım showed that any level $ϑ > \frac{1}{2}$ forces $lim inf_{n} (p_{n + 1} - p_{n}) < \infty$ ; Zhang broke the $\frac{1}{2}$ barrier for the restricted class of smooth, well-factorable moduli in a fixed residue class ^{[Zhang 2014]}, obtaining a finite gap, and Maynard's multidimensional sieve achieved bounded gaps from Bombieri-Vinogradov's level $\frac{1}{2}$ alone ^{[Maynard 2015]}, with Elliott-Halberstam sharpening the gap to $12$ . The parity barrier of sieve theory caps these methods: they cannot reach the twin-prime gap $2$ without an input that distinguishes numbers by the parity of their prime-factor count, which neither the large sieve nor Elliott-Halberstam supplies.

Synthesis. Bombieri-Vinogradov is the capstone of primes-in-progressions on average, and the bridge is the multiplicative large sieve of 21.14.01 read as an $ℓ^{2}$ density statement: the inequality says character sums of length $x$ cannot all be large across the moduli $q \leq x^{1/2}$ , and this is exactly the orthogonality of primitive characters dualized into an operator-norm bound, the same duality that fixed the constant $N + Q^{2}$ . The foundational reason the level is $\frac{1}{2}$ generalises the Brun-Titchmarsh threshold: balancing $Q^{2}$ against the coefficient length $M \approx x$ caps $Q$ at $x^{1/2}$ , so this is exactly the $M + Q^{2}$ tradeoff seen there, now summed against the explicit formula of 21.11.04 instead of against the prime indicator. Putting these together, the small-conductor range is dual to the zero-free-region analysis of 21.13.01 — Siegel-Walfisz handling $q \leq (lo g x)^{A}$ through the very same exceptional-zero control — while the large-conductor range is the large sieve's; the central insight is that averaging over $q$ replaces the Generalized Riemann Hypothesis with a zero-density estimate, and the parity barrier of 21.14.04 marks exactly the line where this circle of ideas, Elliott-Halberstam included, cannot reach the twin primes without new arithmetic.

Full proof set Master

Proposition 1 (character detection of a residue class). For $g cd (a, q) = 1$ , $ψ (x; q, a) = \frac{1}{φ ( q )} \sum_{χ mod q} \overset{χ}{ˉ} (a) ψ (x, χ)$ , where $ψ (x, χ) = \sum_{n \leq x} χ (n) Λ (n)$ .

Proof. Orthogonality of Dirichlet characters gives $\frac{1}{φ ( q )} \sum_{χ mod q} \overset{χ}{ˉ} (a) χ (n) = 1$ if $n \equiv a (q)$ and $g cd (n, q) = 1$ , and $0$ otherwise (when $g cd (n, q) > 1$ every $χ (n) = 0$ ; when $g cd (n, q) = 1$ but $n \neq \equiv a$ , the orthogonality relation vanishes). Since $g cd (a, q) = 1$ already forces the surviving $n$ to be coprime to $q$ , multiplying by $Λ (n)$ and summing over $n \leq x$ interchanges the finite character sum with the sum over $n$ : $$ \psi(x;q,a) = \sum_{n\le x}\Lambda(n)\frac{1}{\varphi(q)}\sum_\chi\bar\chi(a)\chi(n) = \frac{1}{\varphi(q)}\sum_\chi\bar\chi(a)\sum_{n\le x}\chi(n)\Lambda(n), $$ which is the claim; the $Λ (n)$ with $g cd (n, q) > 1$ are prime powers $p^{k} ∣ q$ and contribute $O (lo g^{2} x)$ . $□$

Proposition 2 (main term from the principal character). The principal-character term equals $\frac{x}{φ ( q )} + O (x exp (- c lo g x) / φ (q))$ , identifying $E (x; q, a) = \frac{1}{φ ( q )} \sum_{χ \neq = χ_{0}} \overset{χ}{ˉ} (a) ψ (x, χ)$ .

Proof. For $χ = χ_{0}$ mod $q$ , $ψ (x, χ_{0}) = \sum_{n \leq x, (n, q) = 1} Λ (n) = ψ (x) - \sum_{p ∣ q} \sum_{p^{k} \leq x} lo g p = ψ (x) + O (lo g x lo g q)$ . The prime number theorem gives $ψ (x) = x + O (x exp (- c lo g x))$ , so $\frac{1}{φ ( q )} ψ (x, χ_{0}) = \frac{x}{φ ( q )} + O (x exp (- c lo g x) / φ (q))$ . Subtracting this from Proposition 1 isolates $E (x; q, a)$ as the stated non-principal sum. $□$

Proposition 3 (reduction to primitive characters). If $χ$ mod $q$ is induced by the primitive $\chi^ $m o d$ q^ $,$ q^\mid q $, t h e n$ \psi(x,\chi) = \psi(x,\chi^) + O((\log qx)^2)$.

Proof. For $g cd (n, q) = 1$ one has $χ (n) = χ^{*} (n)$ , while $χ (n) = 0$ for $g cd (n, q) > 1$ . Hence $ψ (x, χ^{*}) - ψ (x, χ) = \sum_{n \leq x, (n, q^{*}) = 1, (n, q) > 1} χ^{*} (n) Λ (n)$ . The summand is non-zero only for prime powers $p^{k} \leq x$ with $p ∣ q$ , $p ∤ q^{*}$ . There are $\leq ω (q) \leq \frac{l o g q}{l o g 2}$ such primes, each yielding $\leq \frac{l o g x}{l o g 2}$ prime powers, each weighted by $lo g p \leq lo g x$ . The total is $O (lo g q \cdot lo g x) = O ((lo g q x)^{2})$ . $□$

Proposition 4 (Siegel-Walfisz disposal of small conductors). For $\chi^ $p r imi t i v eo f co n d u c t or$ q^\le(\log x)^B $,$ \max_{y\le x}|\psi(y,\chi^)| \ll_C x\exp(-c\sqrt{\log x}) $f or e v er y$ C $, w i t h$ c=c(C)$, the implied constant ineffective when an exceptional (Siegel) zero is present.*

Proof. The Siegel-Walfisz theorem follows from the zero-free region of 21.13.01: $L (s, χ^{*})$ has no zero with $Re (s) > 1 - c_{1} / lo g (q^{*} (∣ t ∣ + 2))$ save a possible exceptional real zero $β < 1 - c_{2} q^{*- ε}$ (Siegel's ineffective bound). The truncated explicit formula $ψ (y, χ^{*}) = - \sum_{∣ γ ∣ \leq T} \frac{y ^{ρ}}{ρ} + O (\frac{y l o g ^{2} y T}{T})$ then bounds the zero sum by $y exp (- c lo g y)$ once $q^{*} \leq (lo g y)^{B}$ and $T = exp (lo g y)$ , the exceptional zero (if any) contributing $y^{β} ≪ y exp (- c_{2} (lo g y)^{1 - εB})$ , still admissible for $B$ chosen against the target. The ineffectivity of $c$ is inherited from Siegel's bound. $□$

Proposition 5 (the level- $\frac{1}{2}$ threshold from the large-sieve constant). In the Type II bilinear sums, the multiplicative large sieve over conductors $q^\le Q $s a v es a f a c t or o v er t h e d ia g o na l b o u n d i f f$ Q\le x^{1/2+o(1)}$.*

Proof. The Type II sum, after Cauchy-Schwarz, reduces to bounding $\sum_{q^{*} \leq Q} \frac{q ^{*}}{φ ( q ^{*} )} \sideset^{*} \sum_{χ^{*}} ∣ \sum_{n \leq M} ξ_{n} χ^{*} (n) ∣^{2}$ with $M \approx x$ the product of the two bilinear lengths. The multiplicative large sieve of 21.14.01 bounds this by $(M + Q^{2}) \sum_{n} ∣ ξ_{n} ∣^{2}$ . The diagonal bound, applying $\sideset^{*} \sum_{χ^{*}} ∣ \dots ∣^{2} \leq φ (q^{*}) \sum_{n} ∣ ξ_{n} ∣^{2}$ classwise, gives $\approx (\sum_{q^{*} \leq Q} q^{*}) \sum_{n} ∣ ξ_{n} ∣^{2} \approx Q^{2} \sum_{n} ∣ ξ_{n} ∣^{2}$ at the top of the range. The large sieve improves on this precisely when $M + Q^{2} ≪ Q^{2}$ fails outright, i.e. when $M ≳ Q^{2}$ , that is $Q ≲ M^{1/2} \approx x^{1/2}$ . For $Q > x^{1/2}$ the constant $M + Q^{2} \approx Q^{2}$ matches the diagonal bound and no saving remains. $□$

Connections Master

The entire proof is downstream of the multiplicative large sieve of 21.14.01: the Type II bilinear sums are bounded by its $(M + Q^{2})$ constant after Cauchy-Schwarz, and the $x^{1/2}$ level of distribution is the same $N + Q^{2}$ balance that caps the Brun-Titchmarsh range there, so Bombieri-Vinogradov is the large sieve's deepest arithmetic consequence.

The small-conductor range is handled by Siegel-Walfisz, which rests entirely on the zero-free regions and the exceptional-zero analysis of 21.13.01: the ineffective constant in Bombieri-Vinogradov is inherited from Siegel's ineffective lower bound for $L (1, χ)$ , and the dichotomy between the zero-free region and the possible Siegel zero is what makes the small- $q$ estimate uniform.

The passage from a bound on the character sum $ψ (x, χ)$ to a bound on the residue-class count $ψ (x; q, a)$ uses the truncated Perron formula and Mellin inversion of 21.11.04: the explicit formula that turns $L$ -function zeros into a sum over $x^{ρ} / ρ$ is exactly the contour-integral machinery developed there, specialized to $- L^{'} / L (s, χ)$ .

The parity barrier that prevents Bombieri-Vinogradov and even Elliott-Halberstam from delivering the twin-prime gap $2$ is the combinatorial-sieve obstruction of 21.14.04: the large sieve and the Selberg sieve both fail to separate the two parities of $Ω (n)$ , which is why bounded gaps stop short of $2$ .

Historical & philosophical context Master

The theorem was proved independently in 1965 by Enrico Bombieri, in his Mathematika paper on the large sieve ^{[Bombieri 1965]}, and by Askold Ivanovich Vinogradov in the Izvestiya ^{[Vinogradov 1965]}, the two arriving at the same level- $\frac{1}{2}$ averaged bound by closely related large-sieve routes. Bombieri's formulation, with the maximum over residue classes inside the sum, became the standard one; the result is consistently named for both, with Bombieri's name usually first by the depth of the surrounding paper. The averaged-GRH reading was understood at once: the theorem makes unconditional what GRH would give pointwise, and it has since substituted for GRH in a long list of applications where only the average over $q$ is needed.

Robert Vaughan's 1980 identity ^{[Vaughan 1980]} reorganized the proof into the Type I / Type II bilinear shape now taught universally, replacing the more delicate original arguments. Peter Elliott and Heini Halberstam stated their conjecture in 1970 ^{[Elliott-Halberstam 1970]}, extending the level of distribution toward $1$ . Its dramatic application came in 2013-2014: Yitang Zhang proved a Bombieri-Vinogradov-type estimate past level $\frac{1}{2}$ for smooth moduli and deduced bounded gaps between primes ^{[Zhang 2014]}, and James Maynard, with a multidimensional sieve, obtained bounded gaps from the unconditional level $\frac{1}{2}$ alone and the gap bound $12$ under Elliott-Halberstam ^{[Maynard 2015]}.

Bibliography Master

@article{bombieri1965largesieve,
  author  = {Bombieri, Enrico},
  title   = {On the large sieve},
  journal = {Mathematika},
  volume  = {12},
  pages   = {201--225},
  year    = {1965}
}

@article{vinogradov1965density,
  author  = {Vinogradov, Askold I.},
  title   = {The density hypothesis for {Dirichlet} {L}-series},
  journal = {Izvestiya Akademii Nauk SSSR, Seriya Matematicheskaya},
  volume  = {29},
  pages   = {903--934},
  year    = {1965},
  note    = {Corrigendum, ibid. 30 (1966), 719--720}
}

@article{vaughan1980elementary,
  author  = {Vaughan, Robert C.},
  title   = {An elementary method in prime number theory},
  journal = {Mathematika},
  volume  = {27},
  pages   = {153--163},
  year    = {1980}
}

@article{elliotthalberstam1970conjecture,
  author  = {Elliott, Peter D. T. A. and Halberstam, Heini},
  title   = {A conjecture in prime number theory},
  journal = {Symposia Mathematica},
  volume  = {4},
  pages   = {59--72},
  year    = {1970}
}

@article{zhang2014boundedgaps,
  author  = {Zhang, Yitang},
  title   = {Bounded gaps between primes},
  journal = {Annals of Mathematics},
  volume  = {179},
  number  = {3},
  pages   = {1121--1174},
  year    = {2014}
}

@article{maynard2015smallgaps,
  author  = {Maynard, James},
  title   = {Small gaps between primes},
  journal = {Annals of Mathematics},
  volume  = {181},
  number  = {1},
  pages   = {383--413},
  year    = {2015}
}

@book{davenport2000multiplicative,
  author    = {Davenport, Harold},
  title     = {Multiplicative Number Theory},
  edition   = {3},
  series    = {Graduate Texts in Mathematics},
  volume    = {74},
  publisher = {Springer-Verlag},
  year      = {2000},
  note      = {Revised by H. L. Montgomery}
}

@book{montgomeryvaughan2007,
  author    = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title     = {Multiplicative Number Theory I: Classical Theory},
  series    = {Cambridge Studies in Advanced Mathematics},
  number    = {97},
  publisher = {Cambridge University Press},
  year      = {2007}
}

@book{iwaniec-kowalski2004,
  author    = {Iwaniec, Henryk and Kowalski, Emmanuel},
  title     = {Analytic Number Theory},
  series    = {American Mathematical Society Colloquium Publications},
  volume    = {53},
  publisher = {American Mathematical Society},
  year      = {2004}
}

Prerequisites

21.14.01
21.13.01
21.11.04

Tier anchors

beginner: Davenport 2000 *Multiplicative Number Theory* (Springer GTM 74, 3rd ed., rev. Montgomery) Ch. 28 (Bombieri's theorem, informal motivation as 'GRH on average'); Granville-Soundararajan expository notes on equidistribution of primes in progressions; the Polymath8 blog narrative on bounded gaps for the Elliott-Halberstam picture
intermediate: Davenport 2000 *Multiplicative Number Theory* (Springer GTM 74) Ch. 28 (the Bombieri-Vinogradov theorem, the large-sieve + Vaughan-identity proof); Montgomery-Vaughan 2007 *Multiplicative Number Theory I: Classical Theory* (Cambridge SMM 97) §9.4-§9.5; Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) §17.4
master: Bombieri 1965 *Mathematika* 12, 201-225 (*On the large sieve* — the original theorem); A. I. Vinogradov 1965 *Izv. Akad. Nauk SSSR* 29, 903-934 (independent proof); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* (Cambridge SMM 97) §9; Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) §17; Vaughan 1980 *Mathematika* 27, 153-163 (the simplified Vaughan-identity proof); Elliott-Halberstam 1970 *Symposia Mathematica* 4, 59-72 (the conjecture); Zhang 2014 *Annals of Math.* 179, 1121-1174 and Maynard 2015 *Annals of Math.* 181, 383-413 (bounded gaps)

References

Bombieri, E. — On the large sieve · *Mathematika* 12 (1965), 201-225. Proves $\sum_{q\le Q}\max_{(a,q)=1}|\psi(x;q,a) - x/\varphi(q)| \ll_A x(\log x)^{-A}$ for $Q = x^{1/2}(\log x)^{-B(A)}$, an averaged form of the Generalized Riemann Hypothesis for the error term in the prime number theorem for arithmetic progressions, via the large sieve inequality and a decomposition of the von Mangoldt function into bilinear (Type I / Type II) sums.
Vinogradov, A. I. — The density hypothesis for Dirichlet L-series · *Izvestiya Akademii Nauk SSSR, Seriya Matematicheskaya* 29 (1965), 903-934 (corrigendum 30 (1966), 719-720). Independently and contemporaneously with Bombieri established the same mean-error bound for primes in arithmetic progressions averaged over moduli $q \le x^{1/2-\varepsilon}$, by a large-sieve density estimate for the zeros of Dirichlet $L$-functions.
Vaughan, R. C. — An elementary method in prime number theory · *Mathematika* 27 (1980), 153-163. Introduces Vaughan's identity $\Lambda = \Lambda_{\le U} - \mu_{\le U}*\Lambda*1 + \mu_{>U}*(\Lambda*1)_{>V} + \ldots$ giving a clean decomposition of $\sum_n \Lambda(n) f(n)$ into Type I sums $\sum_{d\le U}\sum_m a_d f(dm)$ and Type II bilinear sums $\sum_{U<m\le V}\sum_k b_m c_k f(mk)$, the modern engine of the Bombieri-Vinogradov proof.
Montgomery, H. L. & Vaughan, R. C. — Multiplicative Number Theory I: Classical Theory · Cambridge Studies in Advanced Mathematics 97 (2007), §9. The Bombieri-Vinogradov theorem with full proof: reduction to character sums, the multiplicative large sieve, the large-sieve density estimate $\sum_{q\le Q}\frac{q}{\varphi(q)}\sideset{}{^*}\sum_\chi N(\alpha,T,\chi) \ll (QT)^{c(1-\alpha)}$, Vaughan's identity for $\Lambda$, and the comparison with Siegel-Walfisz (small $q$) and GRH (individual $q$).
Elliott, P. D. T. A. & Halberstam, H. — A conjecture in prime number theory · *Symposia Mathematica* 4 (1970), 59-72. States the Elliott-Halberstam conjecture: the Bombieri-Vinogradov mean-error bound holds with the level of distribution pushed to $Q = x^{1-\varepsilon}$ for every $\varepsilon > 0$ and every $A$, i.e. primes are equidistributed in progressions on average over moduli up to almost $x$.
Zhang, Y. — Bounded gaps between primes · *Annals of Mathematics* 179 (2014), 1121-1174. Proves a Bombieri-Vinogradov-type estimate for smooth moduli with level of distribution exceeding $1/2$, restricted to a fixed residue class and well-factorable moduli, and deduces $\liminf_n (p_{n+1} - p_n) < 7\cdot 10^7$; the level beyond $1/2$ is exactly the Elliott-Halberstam-flavoured input.
Maynard, J. — Small gaps between primes · *Annals of Mathematics* 181 (2015), 383-413. A multidimensional Selberg sieve giving $\liminf_n(p_{n+1}-p_n) \le 600$ and bounded gaps for any fixed number of primes using only Bombieri-Vinogradov (level $1/2$), with Elliott-Halberstam (level $\vartheta$) sharpening the bound to gaps of $12$ for $\vartheta > 0.95$.

Estimated time

beginner: 20m
intermediate: 55m
master: 90m