21.14.01 · number-theory / sieve-methods-large-sieve

The Large Sieve Inequality and Brun-Titchmarsh

shipped3 tiersLean: none

Anchor (Master): Montgomery 1971 *Topics in Multiplicative Number Theory* (Springer LNM 227) Ch. 1-4 (the large sieve and its arithmetic form); Montgomery-Vaughan 1973 *Mathematika* 20, 119-134 (*The large sieve* — the optimal constant $N+Q^2$ and the Brun-Titchmarsh theorem $\pi(x;q,a)\le 2x/(\varphi(q)\log(x/q))$); Selberg's $1+o(1)$ majorant and the Beurling-Selberg extremal function; Bombieri 1965 *Mathematika* 12, 201-225 (*On the large sieve* — the route to the Bombieri-Vinogradov theorem); Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 7; Montgomery-Vaughan 2007 *Multiplicative Number Theory I* (Cambridge SMM 97) Ch. 7

Intuition Beginner

Imagine you have a list of whole numbers — maybe the primes up to a million, maybe some mystery set you are studying. You want to know how many there can be. One way to corner a set is to watch how it behaves when you divide by small numbers. Divide every member of your set by and record the remainder.

If your set is "thin" or special, the remainders might avoid certain values entirely: perhaps no member ever leaves remainder when divided by . Each forbidden remainder is a clue, and each different divisor gives a fresh batch of clues. The large sieve is the tool that turns a large pile of such clues into a hard upper bound on how big the set can be.

The word "sieve" is the kitchen image: you pour your numbers through a mesh and catch only the ones that survive every filter. A small sieve removes a few remainders for each divisor; the large sieve is built for the case where you remove many remainders at once — up to half of them or more for each divisor. The remarkable fact is that forbidding many remainders modulo many different divisors forces the surviving set to be small, and the large sieve says exactly how small, with a clean and almost unimprovable bound.

How is this counted? Weigh each number with an arrow that spins according to a chosen fraction, add the arrows, and measure the total. When the fractions are spread apart, these totals cannot all be large at once — the energy in the original list is shared out and capped. That single cap, applied to the fractions coming from each divisor, is the whole engine, and it pays off in a sharp count of primes in an arithmetic progression.

Visual Beginner

Picture the numbers from to as a circle. For a divisor , mark the fractions around the rim — for you mark . Do this for every divisor up to some limit . The marked fractions, the Farey fractions, are spread around the circle so that no two are closer than about . The large sieve says that when you test your number list against a set of well-separated marks, the combined response is capped by the size of the list plus a term measuring how many marks there are.

        Farey marks for q up to 4 on the unit circle
        (no two marks closer than about 1/Q^2)

                    0/1
              1/4         3/4
          1/3                 2/3
        1/2 ......... well-spaced ......... 
          (each a/q is a "test frequency";
           spread-out tests cannot all
           resonate with the list at once)

The picture carries the whole idea: well-separated test points share out the list's total energy, so the responses are bounded all together. Counting the surviving numbers then becomes counting how much room the forbidden remainders leave.

Worked example Beginner

We sift a small set by hand to watch forbidden remainders shrink it.

Step 1. Start with the whole numbers from to , so candidates. We will sieve using the divisors , , and .

Step 2. Forbid the even numbers: throw away every number leaving remainder when divided by . That removes numbers and leaves odd ones.

Step 3. Among the survivors, forbid remainder when divided by — that is, drop multiples of . The surviving odd numbers are , and we have dropped , leaving .

Step 4. Now forbid remainder when divided by : drop and . We are left with , which is numbers.

Step 5. Compare with a rough sieve estimate. Each divisor removed a fraction of the numbers: a factor for the prime , then of those survive past , then survive past . Multiplying, , matching the exact count.

What this tells us: forbidding even one remainder for each of several divisors multiplies the survival fractions together and shrinks the set fast. The large sieve is the precise inequality that holds when you forbid many remainders per divisor across many divisors at once, and it is sharp enough to count primes in progressions.

Check your understanding Beginner

Formal definition Intermediate+

Throughout write for the additive character of 02.10.04 and 21.15.01, and for the distance from to the nearest integer. Let be integers with , and let be complex numbers; set $$ S(\alpha) = \sum_{M < n \le M+N} a_n, e(n\alpha), \qquad |a|2^2 = \sum{M < n \le M+N} |a_n|^2. $$ is a trigonometric polynomial of length with its squared -norm.

Definition (well-spaced points). Real points in are -spaced (well-spaced) if for all , where the norm is taken modulo . The points are spread so that disjoint arcs of length surround each one.

Definition (large sieve inequality, analytic form). A constant is a large sieve constant if for every choice of coefficients and every -spaced set , $$ \sum_{r=1}^{R} |S(\alpha_r)|^2 \le \Delta \sum_{M < n \le M+N} |a_n|^2. $$ The content of the large sieve is that is admissible, and for Farey points of order (where ) this reads . The form is optimal: neither term can be reduced by a constant factor.

Definition (Farey fractions and the arithmetic large sieve). The Farey fractions of order are the reduced fractions with and , . Distinct Farey fractions satisfy , so they are -spaced. The arithmetic large sieve is the specialization of the analytic inequality to these points: $$ \sum_{q \le Q} \ \sideset{}{^}\sum_{a \bmod q} \left| S!\left(\tfrac{a}{q}\right) \right|^2 \le (N + Q^2)\sum_{n} |a_n|^2, $$ where $\sum^\varphi(q)aq$.

Definition (sifted set). Let be a set of integers. Suppose for each prime there are forbidden residue classes mod that no element of occupies. The large sieve bounds in terms of how much room the exclusions leave: $$ |\mathcal{S}| \le \frac{N + Q^2}{L(Q)}, \qquad L(Q) = \sum_{q \le Q} \mu(q)^2 \prod_{p \mid q} \frac{\omega(p)}{p - \omega(p)}, $$ the sum running over squarefree . The denominator is large when many classes are excluded per prime, which is what makes the bound strong in the large-sieve regime.

Counterexamples to common slips

  • "The constant must be or with cross terms." The optimal additive form is exactly , with no product and no cross term. An early bound of Davenport-Halberstam gave ; the factor was removed by the duality argument with the Selberg majorant. The clean additive shape is the whole point.
  • "Well-spaced means equally spaced." It means only a lower bound on pairwise distances; the points need not be evenly distributed. Farey points are markedly uneven, yet still -spaced, and that is all the inequality needs.
  • "The sifted-set bound counts excluded classes additively." The quantity multiplies the local densities over primes dividing and sums over squarefree . Treating exclusions as a simple sum of loses the multiplicative reinforcement that makes the large sieve sharp.

Key theorem with proof Intermediate+

The signature result is the analytic large sieve inequality with the optimal constant; the duality principle reduces it to a dual bilinear bound, which a Beurling-Selberg majorant closes [Montgomery-Vaughan 1973].

Theorem (large sieve inequality; Montgomery-Vaughan 1973). Let be -spaced. Then for all coefficients , $$ \sum_{r=1}^{R} \left| \sum_{M < n \le M+N} a_n e(n\alpha_r) \right|^2 \le (N + \delta^{-1}) \sum_{M < n \le M+N} |a_n|^2. $$

Proof. Consider the linear map sending , with matrix entries . The asserted inequality is , that is . By the duality principle, , so it is equivalent to prove the dual bound $$ \sum_{M < n \le M+N} \left| \sum_{r=1}^{R} b_r, e(-n\alpha_r) \right|^2 \le (N + \delta^{-1}) \sum_{r=1}^{R} |b_r|^2 $$ for all . Expand the left side as . The diagonal contributes . For the off-diagonal terms, introduce a majorant: let be the Beurling-Selberg extremal function, a band-limited majorant of the indicator of with supported in and , satisfying on the index range. Replacing the sharp cutoff by and applying Poisson summation 21.15.01 turns the inner geometric sum into , supported only where . Because the points are -spaced, this support meets only the diagonal , on which . Hence $$ \sum_n \Big| \sum_r b_r e(-n\alpha_r) \Big|^2 \le \sum_{r,s} b_r \overline{b_s}, \widehat F(\alpha_s - \alpha_r) = (N+\delta^{-1}) \sum_r |b_r|^2, $$ the last step using and the diagonal collapse. By duality the original inequality follows with the same constant.

Corollary (arithmetic large sieve). For Farey fractions of order , $$ \sum_{q \le Q} \ \sideset{}{^*}\sum_{a \bmod q} \left| S!\left(\tfrac aq\right)\right|^2 \le (N + Q^2)\sum_n |a_n|^2. $$

Proof. The Farey fractions of order are -spaced, since for one has . Apply the theorem with , so . The points are exactly the primitive residues counted by .

Bridge. The large sieve builds toward the Bombieri-Vinogradov theorem of the Advanced results, where this same inequality is summed against Dirichlet characters to control primes in progressions on average, and it appears again in the analytic proof of Linnik's theorem on the least prime in a progression. The foundational reason the constant is and not larger is exactly the duality principle paired with the band-limited Selberg majorant: the diagonal carries , the spacing carries , and the majorant kills every off-diagonal term because its transform is supported inside one spacing. This is exactly the additive-character orthogonality of 21.15.01 read through a dual norm: the inequality is dual to the statement that well-separated frequencies cannot all resonate with one short list. Putting these together, the multiplicative large sieve — the same bound recast over Dirichlet characters — is the central insight that converts the additive inequality into Brun-Titchmarsh and into an averaged Riemann Hypothesis for -functions.

Exercises Intermediate+

Advanced results Master

The optimal constant and the Beurling-Selberg majorant

The constant is not merely admissible but optimal up to the additive structure [Montgomery-Vaughan 1973]. The early bounds of Roth and Bombieri carried constants like or ; the removal of every spurious factor rests on the Beurling-Selberg extremal function , the entire function of exponential type that majorizes with minimal . From it one builds a majorant of the indicator of an interval with supported in and exactly. Plugging into the dual bilinear form, the off-diagonal terms vanish by the spacing and the diagonal returns precisely . The extremal-function input is the reason the constant is sharp: any larger band support would reintroduce off-diagonal contributions, any smaller integral would fail to majorize. This is the analytic large sieve in its final form, due in this optimal shape to Montgomery and Vaughan and, independently, Selberg.

The Bombieri-Vinogradov theorem

The deepest application is an averaged Riemann Hypothesis [Bombieri 1965]. Write and . The Bombieri-Vinogradov theorem states that for every there is with $$ \sum_{q \le Q} E(x;q) \ll_A \frac{x}{(\log x)^A}, \qquad Q = x^{1/2}(\log x)^{-B}. $$ On average over up to nearly , the error term in the prime number theorem for progressions is as small as the Generalized Riemann Hypothesis would predict pointwise. The proof feeds the multiplicative large sieve — the character-sum form of Exercise 7 — into a decomposition of (Vaughan's identity) and bounds the resulting bilinear forms. Bombieri-Vinogradov substitutes for GRH in countless applications, including the Goldbach-type and twin-prime-adjacent results of sieve theory, and it is the large sieve's signature payoff.

Brun-Titchmarsh, uniformity, and the parity barrier

The Brun-Titchmarsh theorem holds uniformly for and [Montgomery-Vaughan 2007]. Its strength is uniformity: it is valid for as large as , far beyond the range where the prime number theorem for progressions is known unconditionally. The constant is conjecturally improvable to for , but cannot fall below , and the parity barrier of sieve theory explains why the large sieve alone cannot reach the truth : sieves cannot distinguish numbers with an even number of prime factors from those with an odd number without an external input such as a zero-free region. Brun-Titchmarsh nonetheless powers Linnik's theorem on the least prime in a progression and the Titchmarsh divisor problem, where its uniformity in is decisive.

Synthesis. The large sieve inequality, the arithmetic and multiplicative forms, Bombieri-Vinogradov, and Brun-Titchmarsh are one circle of ideas, and the bridge is the additive character of 21.15.01 read through a dual norm: the inequality says well-separated frequencies share out the energy of a short list, and every application is a way to harvest that sharing. The foundational reason the constant is is exactly the duality principle paired with the band-limited Selberg majorant, whose transform is supported inside one spacing so that off-diagonal terms vanish and the diagonal carries ; this is exactly the orthogonality of additive characters from 21.15.01 dualized into an operator-norm bound. Putting these together, the Gauss-sum separation generalises the additive inequality into the multiplicative one over Dirichlet characters, which is dual to the statement that primitive characters are an orthonormal-up-to- basis, and the central insight is that this multiplicative form converts directly into Bombieri-Vinogradov — an averaged GRH — and into Brun-Titchmarsh, where the parity barrier marks exactly the boundary the large sieve cannot cross without the analytic input of a zero-free region from 21.12.01.

Full proof set Master

Proposition 1 (duality principle). Let be a finite complex matrix. The best constant in (over all ) equals the best constant in (over all ).

Proof. Let be the linear map . Then , so the best in the first inequality is the largest eigenvalue of the positive semidefinite , that is . The adjoint has matrix in the transposed index pattern, and the best constant in the second inequality is . Since and have the same nonzero eigenvalues, , so the two best constants coincide.

Proposition 2 (Farey spacing). Distinct Farey fractions with satisfy .

Proof. Compute . The numerator is an integer; it is nonzero because the fractions are distinct (and reduced, so equal values force equal numerators and denominators). Hence , giving . The same bound holds for the nearest-integer norm since both fractions lie in and any wrap-around representative differs by an integer, only increasing the gap or repeating it.

Proposition 3 (off-diagonal vanishing via a band-limited majorant). Let be a non-negative function with supported in , , and for . If is -spaced, then .

Proof. Bound the index-restricted sum by the -weighted sum over all integers, using on the range and elsewhere: $$ \sum_{M<n\le M+N}\Big|\sum_r b_r e(-n\alpha_r)\Big|^2 \le \sum_{n\in\mathbb{Z}} F(n)\Big|\sum_r b_r e(-n\alpha_r)\Big|^2 = \sum_{r,s} b_r\overline{b_s}\sum_{n} F(n) e(n(\alpha_s - \alpha_r)). $$ By Poisson summation 21.15.01, , which because is supported in is nonzero only when . With and the points -spaced, forces , where the value is . Thus only diagonal terms survive, giving .

Proposition 4 (Gauss-sum separation for primitive characters). For a primitive Dirichlet character mod and any , $\chi(n),\tau(\bar\chi) = \sideset{}{^}\sum_{a\bmod q}\bar\chi(a), e(an/q)\tau(\bar\chi) = \sum_{a\bmod q}\bar\chi(a) e(a/q)|\tau(\bar\chi)|^2 = q$.*

Proof. For , substitute in the Gauss sum: , using and . Rearranging gives the identity for ; primitivity extends it to all (both sides vanish when for primitive ). The modulus is the standard evaluation: after using character orthogonality and primitivity.

Connections Master

The large sieve rests on the additive-character duality between and developed in 21.15.01: the off-diagonal vanishing in the proof is Poisson summation applied to a band-limited majorant, so the Poisson/Voronoi machinery there is the engine that produces the optimal constant here.

The sifted-set bound and the Brun-Titchmarsh deduction reuse the summation toolkit of 21.11.02 — partial summation and the mean of — so the elementary average-order estimates feed directly into the sieve's local densities and the logarithmic gain .

The dual-norm formulation is the Fourier-analytic Plancherel duality of 02.10.04 specialized to finitely many frequencies: the large sieve is the statement that the synthesis operator from frequency coefficients to a short signal has operator norm , an -boundedness that is Plancherel with a spacing-controlled overlap.

The Brun-Titchmarsh constant cannot reach by sieve methods alone because of the parity barrier, which is resolved only with the zero-free region and prime-counting analysis of 21.12.01; the large sieve and the analytic prime number theorem are the two complementary routes to primes in progressions.

Historical & philosophical context Master

The large sieve originates with Yuri Linnik's 1941 note in the Doklady [Linnik 1941], introduced to bound the least quadratic non-residue and to attack Vinogradov's hypothesis. Linnik's method was combinatorial and additive; Alfréd Rényi reformulated it analytically in the late 1940s, and Klaus Roth and Enrico Bombieri sharpened the constant through the 1960s. Bombieri's 1965 Mathematika paper [Bombieri 1965] turned the large sieve into the tool that proves the Bombieri-Vinogradov theorem, the averaged Riemann Hypothesis that substitutes for GRH in application after application.

The optimal constant was established by Hugh Montgomery and Robert Vaughan in their 1973 Mathematika paper [Montgomery-Vaughan 1973], using a duality argument with the Beurling-Selberg extremal majorant; Selberg had independently arrived at the same constant. The Brun-Titchmarsh theorem, named for Viggo Brun's sieve and Edward Titchmarsh's 1930 application, received its modern uniform form in the same work. Montgomery's 1971 lecture notes [Montgomery 1971] and the Montgomery-Vaughan treatise [Montgomery-Vaughan 2007] are the standard references for the analytic and arithmetic forms and their consequences.

Bibliography Master

@article{linnik1941largesieve,
  author  = {Linnik, Yuri V.},
  title   = {The large sieve},
  journal = {Doklady Akademii Nauk SSSR},
  volume  = {30},
  pages   = {292--294},
  year    = {1941}
}

@article{bombieri1965largesieve,
  author  = {Bombieri, Enrico},
  title   = {On the large sieve},
  journal = {Mathematika},
  volume  = {12},
  pages   = {201--225},
  year    = {1965}
}

@article{montgomeryvaughan1973largesieve,
  author  = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title   = {The large sieve},
  journal = {Mathematika},
  volume  = {20},
  pages   = {119--134},
  year    = {1973}
}

@book{montgomery1971topics,
  author    = {Montgomery, Hugh L.},
  title     = {Topics in Multiplicative Number Theory},
  series    = {Lecture Notes in Mathematics},
  volume    = {227},
  publisher = {Springer-Verlag},
  year      = {1971}
}

@book{montgomeryvaughan2007,
  author    = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title     = {Multiplicative Number Theory I: Classical Theory},
  series    = {Cambridge Studies in Advanced Mathematics},
  number    = {97},
  publisher = {Cambridge University Press},
  year      = {2007}
}

@book{davenport2000multiplicative,
  author    = {Davenport, Harold},
  title     = {Multiplicative Number Theory},
  edition   = {3},
  series    = {Graduate Texts in Mathematics},
  volume    = {74},
  publisher = {Springer-Verlag},
  year      = {2000},
  note      = {Revised by H. L. Montgomery}
}

@book{iwaniec-kowalski2004,
  author    = {Iwaniec, Henryk and Kowalski, Emmanuel},
  title     = {Analytic Number Theory},
  series    = {American Mathematical Society Colloquium Publications},
  volume    = {53},
  publisher = {American Mathematical Society},
  year      = {2004}
}