21.14.01 · number-theory / sieve-methods-large-sieve

The Large Sieve Inequality and Brun-Titchmarsh

shipped · 3 tiers · Lean: none

Anchor (Master): Montgomery 1971 *Topics in Multiplicative Number Theory* (Springer LNM 227) Ch. 1-4 (the large sieve and its arithmetic form); Montgomery-Vaughan 1973 *Mathematika* 20, 119-134 (*The large sieve* — the optimal constant $N+Q^2$ and the Brun-Titchmarsh theorem $\pi(x;q,a)\le 2x/(\varphi(q)\log(x/q))$); Selberg's $1+o(1)$ majorant and the Beurling-Selberg extremal function; Bombieri 1965 *Mathematika* 12, 201-225 (*On the large sieve* — the route to the Bombieri-Vinogradov theorem); Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 7; Montgomery-Vaughan 2007 *Multiplicative Number Theory I* (Cambridge SMM 97) Ch. 7

Intuition Beginner

Imagine you have a list of whole numbers — maybe the primes up to a million, maybe some mystery set you are studying. You want to know how many there can be. One way to corner a set is to watch how it behaves when you divide by small numbers. Divide every member of your set by $7$ and record the remainder.

If your set is "thin" or special, the remainders might avoid certain values entirely: perhaps no member ever leaves remainder $3$ when divided by $7$ . Each forbidden remainder is a clue, and each different divisor gives a fresh batch of clues. The large sieve is the tool that turns a large pile of such clues into a hard upper bound on how big the set can be.

The word "sieve" is the kitchen image: you pour your numbers through a mesh and catch only the ones that survive every filter. A small sieve removes a few remainders for each divisor; the large sieve is built for the case where you remove many remainders at once — up to half of them or more for each divisor. The remarkable fact is that forbidding many remainders modulo many different divisors forces the surviving set to be small, and the large sieve says exactly how small, with a clean and almost unimprovable bound.

How is this counted? Weigh each number with an arrow that spins according to a chosen fraction, add the arrows, and measure the total. When the fractions are spread apart, these totals cannot all be large at once — the energy in the original list is shared out and capped. That single cap, applied to the fractions coming from each divisor, is the whole engine, and it pays off in a sharp count of primes in an arithmetic progression.

Visual Beginner

Picture the numbers from $0$ to $1$ as a circle. For a divisor $q$ , mark the fractions $a / q$ around the rim — for $q = 5$ you mark $0, 1/5, 2/5, 3/5, 4/5$ . Do this for every divisor up to some limit $Q$ . The marked fractions, the Farey fractions, are spread around the circle so that no two are closer than about $1/ Q^{2}$ . The large sieve says that when you test your number list against a set of well-separated marks, the combined response is capped by the size of the list plus a term measuring how many marks there are.

        Farey marks for q up to 4 on the unit circle
        (no two marks closer than about 1/Q^2)

                    0/1
              1/4         3/4
          1/3                 2/3
        1/2 ......... well-spaced ......... 
          (each a/q is a "test frequency";
           spread-out tests cannot all
           resonate with the list at once)

The picture carries the whole idea: well-separated test points share out the list's total energy, so the responses are bounded all together. Counting the surviving numbers then becomes counting how much room the forbidden remainders leave.
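The spacing claim in the picture can be checked exactly on a small case. The following is a minimal Python sketch, not part of the original unit; the helper names `farey_marks` and `min_circular_gap` are made up for illustration, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import gcd


def farey_marks(Q):
    """Sorted reduced fractions a/q in [0, 1) with 1 <= q <= Q."""
    marks = {Fraction(a, q) for q in range(1, Q + 1) for a in range(q) if gcd(a, q) == 1}
    return sorted(marks)


def min_circular_gap(marks):
    """Smallest distance between neighbouring marks, including the wrap-around gap."""
    gaps = [b - a for a, b in zip(marks, marks[1:])]
    gaps.append(1 - marks[-1] + marks[0])  # gap from the last mark back round to the first
    return min(gaps)


marks = farey_marks(4)
print([str(m) for m in marks])      # ['0', '1/4', '1/3', '1/2', '2/3', '3/4']
print(min_circular_gap(marks))      # 1/12, which is at least 1/16 = 1/Q^2
```

For $Q = 4$ the closest pair is $1/4$ and $1/3$, at distance $1/12$, comfortably above the guaranteed $1/Q^{2} = 1/16$.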

Worked example Beginner

We sift a small set by hand to watch forbidden remainders shrink it.

Step 1. Start with the whole numbers from $1$ to $30$ , so $30$ candidates. We will sieve using the divisors $2$ , $3$ , and $5$ .

Step 2. Forbid the even numbers: throw away every number leaving remainder $0$ when divided by $2$ . That removes $15$ numbers and leaves $15$ odd ones.

Step 3. Among the survivors, forbid remainder $0$ when divided by $3$ — that is, drop multiples of $3$ . The surviving odd numbers are $1, 5, 7, 11, 13, 17, 19, 23, 25, 29$ , and we have dropped $3, 9, 15, 21, 27$ , leaving $10$ .

Step 4. Now forbid remainder $0$ when divided by $5$ : drop $5$ and $25$ . We are left with $1, 7, 11, 13, 17, 19, 23, 29$ , which is $8$ numbers.

Step 5. Compare with a rough sieve estimate. Each divisor removed a fraction of the numbers: a factor $1/2$ for the prime $2$ , then $2/3$ of those survive past $3$ , then $4/5$ survive past $5$ . Multiplying, $30 \times \frac{1}{2} \times \frac{2}{3} \times \frac{4}{5} = 30 \times \frac{8}{30} = 8$ , matching the exact count.

What this tells us: forbidding even one remainder for each of several divisors multiplies the survival fractions together and shrinks the set fast. The large sieve is the precise inequality that holds when you forbid many remainders per divisor across many divisors at once, and it is sharp enough to count primes in progressions.
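The hand computation above can be replayed in a few lines. This is a small illustrative sketch in Python (the function names `sift` and `survival_estimate` are hypothetical, chosen here for readability); it removes the remainder $0$ for each prime in turn and compares the exact survivor count with the product of survival fractions.

```python
from fractions import Fraction


def sift(limit, primes):
    """Integers in 1..limit that leave a nonzero remainder for every given prime."""
    survivors = list(range(1, limit + 1))
    for p in primes:
        survivors = [n for n in survivors if n % p != 0]
    return survivors


def survival_estimate(limit, primes):
    """Product estimate: limit times (1 - 1/p) over the given primes."""
    estimate = Fraction(limit)
    for p in primes:
        estimate *= Fraction(p - 1, p)
    return estimate


survivors = sift(30, [2, 3, 5])
print(survivors)                            # [1, 7, 11, 13, 17, 19, 23, 29]
print(survival_estimate(30, [2, 3, 5]))     # 8
```

The estimate is exact here only because $30 = 2 \cdot 3 \cdot 5$ is a full period for all three divisors; for a general range the product is an approximation.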

Check your understanding Beginner

Exercise (easy, multiple choice).

The large sieve is designed for the situation where, for each divisor you use, you forbid:

A. Exactly one remainder
B. No remainders at all
C. A large number of remainders — possibly half or more
D. Only the remainder zero

Hint

The name contrasts with the "small" sieve, where only a bounded number of remainders are removed per divisor.

Answer

C. The large sieve handles the case where many residue classes are excluded for each modulus; its strength is precisely that the number of forbidden remainders per divisor can grow. Options A and D describe the classical small sieves (like the sieve of Eratosthenes, which removes one class per prime). Option B forbids nothing and so sifts nothing. Feedback-correct: the "large" in large sieve refers to the large number of excluded classes per modulus. Feedback-wrong: removing one class per prime is the small-sieve setting; the large sieve is built for many exclusions.

Formal definition Intermediate+

Throughout write $e (α) = e^{2 π i α}$ for the additive character of 02.10.04 and 21.15.01, and $∥ α ∥$ for the distance from $α$ to the nearest integer. Let $M, N$ be integers with $N \geq 1$, and let $(a_{n})_{M < n \leq M + N}$ be complex numbers; set $$ S(\alpha) = \sum_{M < n \le M+N} a_n\, e(n\alpha), \qquad \|a\|_2^2 = \sum_{M < n \le M+N} |a_n|^2. $$ Here $S$ is a trigonometric polynomial of length $N$, and $∥ a ∥_{2}^{2}$ is the squared $ℓ^{2}$-norm of its coefficient vector.

Definition (well-spaced points). Real points $α_{1}, \dots, α_{R}$ in $[0, 1)$ are $δ$-spaced (well-spaced) if $∥ α_{r} - α_{s} ∥ \geq δ$ for all $r \neq s$, where the distance is taken modulo $1$. Equivalently, the open arcs of length $δ$ centred at the points are pairwise disjoint on the circle.

Definition (large sieve inequality, analytic form). A constant $Δ = Δ (N, δ)$ is a large sieve constant if for every choice of coefficients $(a_{n})$ and every $δ$-spaced set $\{α_{r}\}$, $$ \sum_{r=1}^{R} |S(\alpha_r)|^2 \le \Delta \sum_{M < n \le M+N} |a_n|^2. $$ The content of the large sieve is that $Δ = N + δ^{- 1}$ is admissible; for Farey points of order $Q$ (where $δ = Q^{- 2}$) this reads $Δ = N + Q^{2}$. The form $N + Q^{2}$ is optimal: neither term can be reduced by a constant factor.

Definition (Farey fractions and the arithmetic large sieve). The Farey fractions of order $Q$ are the reduced fractions $a / q$ with $1 \leq q \leq Q$, $1 \leq a \leq q$ and $\gcd (a, q) = 1$. Distinct Farey fractions satisfy $∥ a / q - a^{'} / q^{'} ∥ \geq \frac{1}{q q^{'}} \geq \frac{1}{Q^{2}}$, so they are $Q^{- 2}$-spaced. The arithmetic large sieve is the specialization of the analytic inequality to these points: $$ \sum_{q \le Q} \ \sideset{}{^*}\sum_{a \bmod q} \left| S\!\left(\tfrac{a}{q}\right) \right|^2 \le (N + Q^2)\sum_{n} |a_n|^2, $$ where $\sum^{*}$ runs over the $\varphi(q)$ primitive residues $a$ coprime to $q$.

Definition (sifted set). Let $\mathcal{S} \subseteq (M, M + N]$ be a set of integers. Suppose that for each prime $p \leq Q$ there are $ω (p) < p$ forbidden residue classes mod $p$ that no element of $\mathcal{S}$ occupies. The large sieve bounds $|\mathcal{S}|$ in terms of how much room the exclusions leave: $$ |\mathcal{S}| \le \frac{N + Q^2}{L(Q)}, \qquad L(Q) = \sum_{q \le Q} \mu(q)^2 \prod_{p \mid q} \frac{\omega(p)}{p - \omega(p)}, $$ the sum running over squarefree $q \leq Q$ (the factor $\mu(q)^2$ discards every other $q$). The denominator $L (Q)$ is large when many classes are excluded per prime, which is what makes the bound strong in the large-sieve regime.
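To see the sifted-set bound on a concrete case, the sketch below (an illustration under stated assumptions, not taken from the source) uses the perfect squares in $(0, 400]$, which miss roughly half the residue classes modulo each odd prime. The omitted-class counts $ω(p)$ are read off from the set itself, and $L(Q)$ is evaluated exactly with rationals; the function names are hypothetical.

```python
from fractions import Fraction


def primes_up_to(n):
    """Primes <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1) if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def omitted_classes(members, p):
    """Number of residue classes mod p that no member occupies."""
    return p - len({m % p for m in members})


def sieve_denominator(Q, omega):
    """L(Q): sum over squarefree q <= Q of prod_{p | q} omega(p) / (p - omega(p))."""
    total = Fraction(0)
    for q in range(1, Q + 1):
        term = Fraction(1)
        for p in omega:
            if q % (p * p) == 0:          # q is not squarefree, so mu(q)^2 = 0
                term = Fraction(0)
                break
            if q % p == 0:
                term *= Fraction(omega[p], p - omega[p])
        total += term
    return total


N, Q = 400, 20
squares = {k * k for k in range(1, 21)}   # the perfect squares in (0, 400]
omega = {p: omitted_classes(squares, p) for p in primes_up_to(Q)}
bound = (N + Q * Q) / sieve_denominator(Q, omega)
print(len(squares), float(bound))         # actual size, then the large sieve bound
```

The bound is far from tight at this tiny scale; the point is only that the inequality $|\mathcal{S}| \leq (N + Q^{2}) / L(Q)$ holds and that $L(Q)$ grows as more classes are omitted.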

Counterexamples to common slips

"The constant must be $N \cdot δ^{- 1}$ or $N + δ^{- 1} + \dots$ with cross terms." The optimal additive form is exactly $N + δ^{- 1}$ , with no product and no cross term. An early bound of Davenport-Halberstam gave $2 π (N + δ^{- 1})$ ; the factor $2 π$ was removed by the duality argument with the Selberg majorant. The clean additive shape is the whole point.
"Well-spaced means equally spaced." It means only a lower bound $δ$ on pairwise distances; the points need not be evenly distributed. Farey points are markedly uneven, yet still $Q^{- 2}$ -spaced, and that is all the inequality needs.
"The sifted-set bound counts excluded classes additively." The quantity $L (Q)$ multiplies the local densities $ω (p) / (p - ω (p))$ over primes dividing $q$ and sums over squarefree $q$ . Treating exclusions as a simple sum of $ω (p) / p$ loses the multiplicative reinforcement that makes the large sieve sharp.

Key theorem with proof Intermediate+

The signature result is the analytic large sieve inequality with the optimal constant; the duality principle reduces it to a dual bilinear bound, which a Beurling-Selberg majorant closes ^{[Montgomery-Vaughan 1973]}.

Theorem (large sieve inequality; Montgomery-Vaughan 1973). Let $α_{1}, \dots, α_{R} \in [0, 1)$ be $δ$ -spaced. Then for all coefficients $(a_{n})_{M < n \leq M + N}$ , $$ \sum_{r=1}^{R} \left| \sum_{M < n \le M+N} a_n e(n\alpha_r) \right|^2 \le (N + \delta^{-1}) \sum_{M < n \le M+N} |a_n|^2. $$

Proof. Consider the linear map $T : \mathbb{C}^{N} \to \mathbb{C}^{R}$ sending $(a_{n}) \mapsto (S (α_{r}))_{r}$, with matrix entries $T_{r, n} = e (n α_{r})$. The asserted inequality is $∥ T a ∥_{2}^{2} \leq (N + δ^{- 1}) ∥ a ∥_{2}^{2}$, that is $∥ T ∥^{2} \leq N + δ^{- 1}$. By the duality principle, $∥ T ∥ = ∥ T^{*} ∥$, so it is equivalent to prove the dual bound $$ \sum_{M < n \le M+N} \left| \sum_{r=1}^{R} b_r\, e(-n\alpha_r) \right|^2 \le (N + \delta^{-1}) \sum_{r=1}^{R} |b_r|^2 $$ for all $(b_{r}) \in \mathbb{C}^{R}$. Expand the left side as $\sum_{r, s} b_{r} \overline{b_{s}} \sum_{M < n \leq M + N} e (n (α_{s} - α_{r}))$. The diagonal $r = s$ contributes $N \sum_{r} ∣ b_{r} ∣^{2}$. To control the off-diagonal terms, introduce a majorant: let $F$ be the Beurling-Selberg majorant of the indicator of $(M, M + N]$, that is, a function with $F \geq 0$ on $\mathbb{R}$, $F (n) \geq 1$ on the index range, Fourier transform $\widehat F$ supported in $[- δ, δ]$, and $\int F = N + δ^{- 1}$. Replacing the sharp cutoff by $F$ and applying Poisson summation 21.15.01 turns the inner sum over $n$ into $\sum_{k} \widehat F (α_{s} - α_{r} + k)$, which is nonzero only where $∥ α_{s} - α_{r} ∥ < δ$. Because the points are $δ$-spaced, this happens only on the diagonal $r = s$, where the value is $\widehat F (0) = \int F = N + δ^{- 1}$. Hence $$ \sum_n \Big| \sum_r b_r e(-n\alpha_r) \Big|^2 \le \sum_{r,s} b_r \overline{b_s}\, \sum_{k} \widehat F(\alpha_s - \alpha_r + k) = (N+\delta^{-1}) \sum_r |b_r|^2, $$ the first step using $F \geq 0$ together with $F (n) \geq 1$ on the index range, and the last step the diagonal collapse. By duality the original inequality follows with the same constant. $□$

Corollary (arithmetic large sieve). For Farey fractions of order $Q$, $$ \sum_{q \le Q} \ \sideset{}{^*}\sum_{a \bmod q} \left| S\!\left(\tfrac aq\right)\right|^2 \le (N + Q^2)\sum_n |a_n|^2. $$

Proof. The Farey fractions of order $Q$ are $Q^{- 2}$-spaced, since for $a / q \neq a^{'} / q^{'}$ one has $∣ a / q - a^{'} / q^{'} ∣ = ∣ a q^{'} - a^{'} q ∣/ (q q^{'}) \geq 1/ (q q^{'}) \geq 1/ Q^{2}$. Apply the theorem with $δ = Q^{- 2}$, so $N + δ^{- 1} = N + Q^{2}$. The points are exactly the reduced fractions $a / q$ counted by $\sum_{q} \sum_{a}^{*}$. $□$
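The corollary is easy to test numerically on small data. The sketch below is an illustration only (the function name `large_sieve_sides`, the seed, and the sizes $N = 40$, $Q = 6$ are arbitrary choices made here): it evaluates both sides of the arithmetic large sieve for random complex coefficients.

```python
import cmath
import random
from math import gcd


def e(alpha):
    """Additive character e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def large_sieve_sides(coeffs, Q, M=0):
    """Return (lhs, rhs) of the arithmetic large sieve for a_n with M < n <= M + N."""
    N = len(coeffs)
    lhs = 0.0
    for q in range(1, Q + 1):
        for a in range(1, q + 1):
            if gcd(a, q) != 1:
                continue
            # Reduce a*n mod q before dividing, to keep the phase accurate.
            S = sum(c * e(((a * n) % q) / q) for n, c in enumerate(coeffs, start=M + 1))
            lhs += abs(S) ** 2
    rhs = (N + Q * Q) * sum(abs(c) ** 2 for c in coeffs)
    return lhs, rhs


rng = random.Random(0)  # fixed seed so the run is reproducible
coeffs = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(40)]
lhs, rhs = large_sieve_sides(coeffs, Q=6)
print(lhs <= rhs)
```

Random coefficients usually sit well below the bound; near-extremal examples need coefficients correlated with several of the frequencies $a/q$ at once.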

Bridge. The large sieve builds toward the Bombieri-Vinogradov theorem of the Advanced results, where this same $N + Q^{2}$ inequality is summed against Dirichlet characters to control primes in progressions on average, and it appears again in the analytic proof of Linnik's theorem on the least prime in a progression. The foundational reason the constant is $N + Q^{2}$ and not larger is exactly the duality principle paired with the band-limited Selberg majorant: the diagonal carries $N$ , the spacing $δ = Q^{- 2}$ carries $δ^{- 1} = Q^{2}$ , and the majorant kills every off-diagonal term because its transform is supported inside one spacing. This is exactly the additive-character orthogonality of 21.15.01 read through a dual norm: the inequality is dual to the statement that well-separated frequencies cannot all resonate with one short list. Putting these together, the multiplicative large sieve — the same bound recast over Dirichlet characters — is the central insight that converts the additive inequality into Brun-Titchmarsh and into an averaged Riemann Hypothesis for $L$ -functions.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Prove the diagonal term in the dual bilinear form equals $N \sum_{r} ∣ b_{r} ∣^{2}$ by evaluating $\sum_{M < n \leq M + N} e (n (α_{s} - α_{r}))$ at $r = s$ .

Hint

When $r = s$ the exponent vanishes; count the terms in the index range.

Answer

When $r = s$, the difference $α_{s} - α_{r} = 0$, so $e (n \cdot 0) = 1$ for every $n$ in the range $M < n \leq M + N$. That range contains exactly $N$ integers, so $\sum_{M < n \leq M + N} 1 = N$. The diagonal contribution to $\sum_{r, s} b_{r} \overline{b_{s}} \sum_{n} e (n (α_{s} - α_{r}))$ is therefore $\sum_{r} b_{r} \overline{b_{r}} \cdot N = N \sum_{r} ∣ b_{r} ∣^{2}$. The whole proof strategy is to show that the off-diagonal terms ($r \neq s$) add at most $δ^{- 1} \sum_{r} ∣ b_{r} ∣^{2}$, which the band-limited majorant accomplishes by confining its transform to the range $∥ α_{s} - α_{r} ∥ < δ$ that the spacing forbids. Rubric: full credit for identifying the vanishing exponent and counting $N$ terms.

Exercise 4 (medium, symbolic).

State the duality principle for a finite linear map $T$ with matrix $(T_{r, n})$ and explain why $∥ T ∥^{2} = ∥ T^{*} ∥^{2}$ lets the large sieve be proved in either of two equivalent forms.

Hint

The operator norm of $T$ on $ℓ^{2}$ equals that of its adjoint $T^{*}$ ; write both norms as suprema of quadratic forms.

Answer

Duality principle. For a linear map $T$ between finite-dimensional inner-product spaces, $∥ T ∥ = ∥ T^{*} ∥$, where the adjoint $T^{*}$ is the conjugate transpose: its $(n, r)$ entry is $\overline{T_{r, n}}$. Concretely, the best constant $Δ$ in $\sum_{r} ∣ \sum_{n} c_{r, n} a_{n} ∣^{2} \leq Δ \sum_{n} ∣ a_{n} ∣^{2}$ equals the best constant in the dual inequality $\sum_{n} ∣ \sum_{r} \overline{c_{r, n}} b_{r} ∣^{2} \leq Δ \sum_{r} ∣ b_{r} ∣^{2}$. The two are $∥ T ∥^{2}$ and $∥ T^{*} ∥^{2}$, equal because $T$ and $T^{*}$ share singular values. For the large sieve, $c_{r, n} = e (n α_{r})$: the primal form bounds $\sum_{r} ∣ S (α_{r}) ∣^{2}$ directly, but the dual form $\sum_{n} ∣ \sum_{r} b_{r} e (- n α_{r}) ∣^{2}$ is the one where the diagonal-plus-majorant argument is transparent, since the inner sum over $n$ becomes a near-orthogonality statement among the $δ$-spaced frequencies. Rubric: full credit for stating $∥ T ∥ = ∥ T^{*} ∥$, the two equivalent inequalities, and why the dual side is the workable one.
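The equality $∥ T ∥^{2} = ∥ T^{*} ∥^{2}$ can be observed numerically for the large sieve matrix itself. The sketch below is a hedged illustration using only the standard library: it estimates the top eigenvalue of $T T^{*}$ and of $T^{*} T$ by plain power iteration (a swapped-in numerical method, not something the source uses), for the Farey points of order $3$ and $N = 7$. The function name and the fixed start vector are assumptions made here.

```python
import cmath


def e(alpha):
    """Additive character e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def top_gram_eigenvalue(rows, iterations=200):
    """Largest eigenvalue of A A^* (A given by its rows), via power iteration."""
    m = len(rows)
    gram = [[sum(x * y.conjugate() for x, y in zip(rows[i], rows[j]))
             for j in range(m)] for i in range(m)]
    v = [complex(i + 1) for i in range(m)]            # arbitrary fixed start vector
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(abs(x) ** 2 for x in w) ** 0.5
        v = [x / norm for x in w]
    w = [sum(gram[i][j] * v[j] for j in range(m)) for i in range(m)]
    # Rayleigh quotient <gram v, v> for the unit vector v.
    return sum((wi * vi.conjugate()).real for wi, vi in zip(w, v))


N = 7
points = [0.0, 1 / 2, 1 / 3, 2 / 3]                   # Farey points of order 3, delta = 1/6
T = [[e(n * alpha) for n in range(1, N + 1)] for alpha in points]
T_star = [[T[r][n].conjugate() for r in range(len(points))] for n in range(N)]

primal = top_gram_eigenvalue(T)         # ||T||^2   = top eigenvalue of T T^*
dual = top_gram_eigenvalue(T_star)      # ||T^*||^2 = top eigenvalue of T^* T
print(primal, dual, N + 6)              # the two norms agree and stay below N + 1/delta
```

Power iteration converges here because the top eigenvalue is well separated; for matrices with nearly equal leading eigenvalues a proper singular value routine would be the safer choice.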

Exercise 6 (medium, symbolic).

Using the arithmetic large sieve, show that a set $\mathcal{S} \subseteq (0, N]$ omitting $ω (p)$ residue classes mod each prime $p \leq Q$ satisfies $|\mathcal{S}| \leq (N + Q^{2}) / L (Q)$ with $L (Q) = \sum_{q \leq Q} μ (q)^{2} \prod_{p ∣ q} \frac{ω (p)}{p - ω (p)}$. Sketch why the square-root cutoff $Q = N^{1/2}$ is natural.

Hint

Apply the large sieve to $a_{n} = 1_{n \in \mathcal{S}}$, expand $S (a / q)$ over residue classes, and use the omitted classes to lower-bound the left side by $|\mathcal{S}|^{2} L (Q)$.

Answer

Write $\mathcal{S}$ for the sifted set, to keep it apart from the exponential sum $S (α)$. Take $a_{n} = 1_{n \in \mathcal{S}}$, so $\sum_{n} ∣ a_{n} ∣^{2} = |\mathcal{S}|$ and $S (0) = |\mathcal{S}|$. For each squarefree $q \leq Q$, the sum over reduced residues $\sideset{}{^*}\sum_{a \bmod q} ∣ S (a / q) ∣^{2}$ measures the deviation of $\mathcal{S}$ from equidistribution mod $q$; because $\mathcal{S}$ avoids $ω (p)$ classes mod each $p$, a Cauchy-Schwarz / Ramanujan-sum computation gives $\sum_{q \leq Q,\ q\ \text{squarefree}} \sideset{}{^*}\sum_{a \bmod q} ∣ S (a / q) ∣^{2} \geq |\mathcal{S}|^{2} L (Q)$, where the local factor $ω (p) / (p - ω (p))$ records how much room each prime removes. Combined with the large sieve upper bound $(N + Q^{2}) |\mathcal{S}|$, this yields $|\mathcal{S}|^{2} L (Q) \leq (N + Q^{2}) |\mathcal{S}|$, hence $|\mathcal{S}| \leq (N + Q^{2}) / L (Q)$. The cutoff $Q = N^{1/2}$ balances $N$ against $Q^{2}$ so the numerator is $Θ (N)$; pushing $Q$ past $N^{1/2}$ inflates $Q^{2}$ without a matching gain in $L (Q)$ for typical $ω$. Rubric: full credit for the indicator choice, the $|\mathcal{S}|^{2} L (Q)$ lower bound, and the $Q = N^{1/2}$ balance.

Exercise 7 (hard, symbolic).

Prove the multiplicative form of the large sieve: for the same coefficients and $Q$ , $$ \sum_{q\le Q}\frac{q}{\varphi(q)}\ \sideset{}{^*}\sum_{\chi \bmod q} \left|\sum_n a_n \chi(n)\right|^2 \le (N+Q^2)\sum_n |a_n|^2, $$ the inner sum over primitive Dirichlet characters $χ$ mod $q$ . Use the Gauss sum to pass from characters to additive frequencies.

Hint

For primitive $χ$ mod $q$, $χ (n) = \frac{1}{τ (\bar χ)} \sideset{}{^*}\sum_{a \bmod q} \bar χ (a)\, e (an / q)$ with $∣ τ (\bar χ) ∣^{2} = q$. Substitute and apply the arithmetic large sieve to the resulting additive sums.

Answer

For a primitive character $χ$ mod $q$, the separation identity $χ (n)\, τ (\bar χ) = \sideset{}{^*}\sum_{a \bmod q} \bar χ (a)\, e (an / q)$ holds, with $∣ τ (\bar χ) ∣^{2} = q$. Hence $$ \Big|\sum_n a_n\chi(n)\Big|^2 = \frac{1}{q}\Big|\sideset{}{^*}\sum_{a\bmod q}\bar\chi(a)\, S(a/q)\Big|^2, $$ where $S (a / q) = \sum_{n} a_{n} e (an / q)$. Summing over the primitive characters mod $q$ and using the orthogonality $\sideset{}{^*}\sum_{\chi\bmod q} \bar\chi(a)\chi(a') = \frac{\varphi(q)}{q}\sum_{d\mid(q,\ldots)}(\cdots)$ (concretely, $\sideset{}{^*}\sum_\chi \big|\sideset{}{^*}\sum_a \bar\chi(a) S(a/q)\big|^2 \le \varphi(q)\sideset{}{^*}\sum_{a\bmod q}|S(a/q)|^2$, which follows by extending the outer sum to all characters mod $q$, a step that only adds non-negative terms, and then using full orthogonality) gives $$ \frac{q}{\varphi(q)}\sideset{}{^*}\sum_{\chi\bmod q}\Big|\sum_n a_n\chi(n)\Big|^2 \le \sideset{}{^*}\sum_{a\bmod q}|S(a/q)|^2. $$ Summing over $q \leq Q$ and applying the arithmetic large sieve corollary to the right side bounds it by $(N + Q^{2}) \sum_{n} ∣ a_{n} ∣^{2}$. This multiplicative form is what feeds the Bombieri-Vinogradov theorem, since the inner sums $\sum_{n} a_{n} χ (n)$ with $a_{n} = Λ (n)$ are exactly the character sums controlling $ψ (x; q, a)$. Rubric: full credit for the Gauss-sum separation, the $∣ τ ∣^{2} = q$ normalization, the character orthogonality step, and the reduction to the arithmetic large sieve.

Exercise 8 (hard, symbolic).

Derive the Brun-Titchmarsh bound $π (x; q, a) \leq 2 x / (φ (q) \log (x / q))$ in skeleton form: set up the large sieve applied to the primes in the progression $a \bmod q$ over a dyadic-type range, and identify where the constant $2$ enters.

Hint

Sieve the integers up to $x$ in the progression $n \equiv a \pmod q$ by all auxiliary primes $ℓ \leq z$ not dividing $q$, take $z$ just below $(x / q)^{1/2}$ (so that $z^{2} = o (x / q)$ while $\log z \sim \frac{1}{2} \log (x / q)$), and feed the omitted-class data into the sifted-set bound $|\mathcal{S}| \leq (N + Q^{2}) / L (Q)$.

Answer

Assume $\gcd (a, q) = 1$ and let $\mathcal{S}$ be the set of primes $z < p \leq x$ with $p \equiv a \pmod q$; the primes $p \leq z$ number at most $z$ and are added back at the end. Write each such $p = a + q m$ with $0 \leq m \leq (x - a) / q =: N$, so the values of $m$ lie in an interval of length $N \approx x / q$. For every auxiliary prime $ℓ \leq z$ with $ℓ ∤ q$, a prime $p > z$ in the progression is not divisible by $ℓ$, so $m$ avoids the single class $m \equiv - a \bar q \pmod ℓ$, where $\bar q$ is the inverse of $q$ mod $ℓ$. That is $ω (ℓ) = 1$ omitted class per such $ℓ$, and $ω (ℓ) = 0$ for $ℓ ∣ q$. The sifted-set bound gives $$ \pi(x;q,a) \le \frac{N + z^2}{L(z)} + z, \qquad L(z) = \sum_{\substack{d\le z \\ (d,q)=1}}\mu(d)^2\prod_{\ell\mid d}\frac{1}{\ell-1}. $$ Now $L (z) = \sum_{d \leq z, (d, q) = 1} μ (d)^{2} / φ (d) \geq \frac{φ (q)}{q} \log z$ by the standard lower bound for the mean of $μ^{2} / φ$. Choose $z = (x / q)^{1/2} / \log (x / q)$. Then $z^{2} = o (N)$, so the numerator is $(1 + o (1))\, x / q$, while $\log z = (\frac{1}{2} + o (1)) \log (x / q)$ as $x / q \to \infty$, giving $L (z) \geq \frac{φ (q)}{q} \cdot (\frac{1}{2} + o (1)) \log (x / q)$. Therefore $$ \pi(x;q,a) \le \frac{(1+o(1))\,x/q}{\frac{\varphi(q)}{q}\cdot\frac12\log(x/q)} = \frac{(2+o(1))\,x}{\varphi(q)\log(x/q)}. $$ The constant $2$ enters through the logarithm alone: the spacing term $z^{2}$ must not exceed $N$, so $\log z$ can be at most about $\frac{1}{2} \log (x / q)$, and this halves the sieve denominator. Taking the exact balance $z^{2} = N$ would also double the numerator to $2 N$ and produce the weaker constant $4$, which is why $z$ is chosen slightly smaller. A more careful Selberg-sieve or weighted large-sieve argument removes the $o (1)$ and gives the clean $2$ uniformly in $q < x$.

Rubric: full credit for the embedding into an interval of length $x / q$, the one-omitted-class-per-auxiliary-prime count, the $L (z) \geq \frac{φ (q)}{q} \log z$ estimate, the choice of $z$ just below $(x / q)^{1/2}$, and locating the constant $2$.
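As a sanity check on the shape of the bound, the following sketch counts primes in progressions up to $x = 10{,}000$ and prints the largest count over reduced classes next to $2 x / (φ (q) \log (x / q))$. It is an illustration only: the moduli chosen and the helper names are assumptions made here, and a table at this scale says nothing about the asymptotic constant.

```python
from math import gcd, log


def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes <= x."""
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, x + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def euler_phi(q):
    """Euler's totient by direct count (fine for small q)."""
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)


def brun_titchmarsh_bound(x, q):
    """The upper bound 2x / (phi(q) log(x/q)), valid for q < x."""
    return 2 * x / (euler_phi(q) * log(x / q))


x = 10_000
primes = primes_up_to(x)
for q in (3, 7, 30, 101):
    # Largest value of pi(x; q, a) over reduced residue classes a mod q.
    worst = max(sum(1 for p in primes if p % q == a % q)
                for a in range(1, q + 1) if gcd(a, q) == 1)
    print(q, worst, round(brun_titchmarsh_bound(x, q), 1))
```

In each row the observed count sits below the bound by roughly a factor of two, consistent with the expectation that the true constant is $1 + o (1)$.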

Advanced results Master

The optimal constant and the Beurling-Selberg majorant

The constant $N + Q^{2}$ is not merely admissible but optimal up to the additive structure ^{[Montgomery-Vaughan 1973]}. The early bounds of Roth and Bombieri carried constants like $N + O (Q^{2} \log Q)$ or $2 π (N + Q^{2})$; the removal of every spurious factor rests on the Beurling-Selberg extremal function $B (x)$, the entire function of exponential type $2 π$ that majorizes $\operatorname{sgn} (x)$ with minimal $\int (B - \operatorname{sgn})$. From it one builds a majorant $F$ of the indicator of an interval with $\widehat F$ supported in $[- δ, δ]$ and $\int F = N + δ^{- 1}$ exactly. Plugging $F$ into the dual bilinear form, the off-diagonal terms vanish by the spacing and the diagonal returns precisely $\int F$. The extremal-function input is the reason the constant is sharp: any larger band support would reintroduce off-diagonal contributions, any smaller integral would fail to majorize. This is the analytic large sieve in its final form, due in this optimal shape to Montgomery and Vaughan and, independently, Selberg.

The Bombieri-Vinogradov theorem

The deepest application is an averaged Riemann Hypothesis ^{[Bombieri 1965]}. Write $ψ (x; q, a) = \sum_{n \leq x, n \equiv a (q)} Λ (n)$ and $E (x; q) = \max_{(a, q) = 1} ∣ ψ (x; q, a) - x / φ (q) ∣$. The Bombieri-Vinogradov theorem states that for every $A > 0$ there is $B = B (A)$ with $$ \sum_{q \le Q} E(x;q) \ll_A \frac{x}{(\log x)^A}, \qquad Q = x^{1/2}(\log x)^{-B}. $$ On average over $q$ up to nearly $x^{1/2}$, the error term in the prime number theorem for progressions is as small as the Generalized Riemann Hypothesis would predict pointwise. The proof feeds the multiplicative large sieve (the character-sum form of Exercise 7) into a decomposition of $Λ$ (Vaughan's identity) and bounds the resulting bilinear forms. Bombieri-Vinogradov substitutes for GRH in countless applications, including the Goldbach-type and twin-prime-adjacent results of sieve theory, and it is the large sieve's signature payoff.

Brun-Titchmarsh, uniformity, and the parity barrier

The Brun-Titchmarsh theorem $π (x; q, a) \leq \frac{2 x}{φ (q) \log (x / q)}$ holds uniformly for $1 \leq q < x$ and $\gcd (a, q) = 1$ ^{[Montgomery-Vaughan 2007]}. Its strength is uniformity: it is valid for $q$ as large as $x^{1 - ε}$, far beyond the range where the prime number theorem for progressions is known unconditionally. The constant $2$ is conjecturally improvable to $1 + o (1)$ for $q \leq x^{1 - ε}$, but cannot fall below $1$, and the parity barrier of sieve theory explains why the large sieve alone cannot reach the truth $1 + o (1)$: sieves cannot distinguish numbers with an even number of prime factors from those with an odd number without an external input such as a zero-free region. Brun-Titchmarsh nonetheless powers Linnik's theorem on the least prime in a progression and the Titchmarsh divisor problem, where its uniformity in $q$ is decisive.

Synthesis. The large sieve inequality, the arithmetic and multiplicative forms, Bombieri-Vinogradov, and Brun-Titchmarsh are one circle of ideas, and the bridge is the additive character $e (\cdot)$ of 21.15.01 read through a dual norm: the inequality says well-separated frequencies share out the energy of a short list, and every application is a way to harvest that sharing. The foundational reason the constant is $N + Q^{2}$ is exactly the duality principle paired with the band-limited Selberg majorant, whose transform is supported inside one spacing $δ = Q^{- 2}$ so that off-diagonal terms vanish and the diagonal carries $N$ ; this is exactly the orthogonality of additive characters from 21.15.01 dualized into an operator-norm bound. Putting these together, the Gauss-sum separation generalises the additive inequality into the multiplicative one over Dirichlet characters, which is dual to the statement that primitive characters are an orthonormal-up-to- $τ$ basis, and the central insight is that this multiplicative form converts directly into Bombieri-Vinogradov — an averaged GRH — and into Brun-Titchmarsh, where the parity barrier marks exactly the boundary the large sieve cannot cross without the analytic input of a zero-free region from 21.12.01.

Full proof set Master

Proposition 1 (duality principle). Let $C = (c_{r, n})$ be a finite complex matrix. The best constant $Δ$ in $\sum_{r} ∣ \sum_{n} c_{r, n} a_{n} ∣^{2} \leq Δ \sum_{n} ∣ a_{n} ∣^{2}$ (over all $(a_{n})$ ) equals the best constant in $\sum_{n} ∣ \sum_{r} \overline{c_{r, n}} b_{r} ∣^{2} \leq Δ \sum_{r} ∣ b_{r} ∣^{2}$ (over all $(b_{r})$ ).

Proof. Let $T$ be the linear map $a \mapsto (\sum_{n} c_{r, n} a_{n})_{r}$ . Then $∥ T a ∥^{2} = ⟨ T^{*} T a, a ⟩$ , so the best $Δ$ in the first inequality is the largest eigenvalue of the positive semidefinite $T^{*} T$ , that is $∥ T ∥^{2}$ . The adjoint $T^{*}$ has matrix $(\overline{c_{r, n}})$ in the transposed index pattern, and the best constant in the second inequality is $∥ T^{*} ∥^{2}$ . Since $T^{*} T$ and $T T^{*}$ have the same nonzero eigenvalues, $∥ T ∥^{2} = ∥ T^{*} ∥^{2}$ , so the two best constants coincide. $□$

Proposition 2 (Farey spacing). Distinct Farey fractions $a / q \neq a^{'} / q^{'}$ with $1 \leq q, q^{'} \leq Q$ satisfy $∥ a / q - a^{'} / q^{'} ∥ \geq Q^{- 2}$.

Proof. For any integer $k$, compute $\frac{a}{q} - \frac{a^{'}}{q^{'}} - k = \frac{a q^{'} - a^{'} q - k q q^{'}}{q q^{'}}$. The numerator is an integer, and it is nonzero: the two fractions are reduced with $1 \leq a \leq q$ and $1 \leq a^{'} \leq q^{'}$, so if they differed by an integer they would be equal, with equal numerators and denominators. Hence the numerator has absolute value at least $1$, giving $∣ a / q - a^{'} / q^{'} - k ∣ \geq 1/ (q q^{'}) \geq 1/ Q^{2}$ for every $k$. Taking the minimum over $k$ gives the same bound for the nearest-integer norm $∥ \cdot ∥$. $□$

Proposition 3 (off-diagonal vanishing via a band-limited majorant). Let $F$ be a non-negative function whose Fourier transform $\widehat F$ is supported in $[- δ, δ]$, with $\widehat F (0) = N + δ^{- 1}$ and $F (n) \geq 1$ for $M < n \leq M + N$. If $\{α_{r}\}$ is $δ$-spaced, then $\sum_{n} ∣ \sum_{r} b_{r} e (- n α_{r}) ∣^{2} \leq (N + δ^{- 1}) \sum_{r} ∣ b_{r} ∣^{2}$.

Proof. Bound the index-restricted sum by the $F$-weighted sum over all integers, using $F (n) \geq 1$ on the range and $F \geq 0$ elsewhere: $$ \sum_{M<n\le M+N}\Big|\sum_r b_r e(-n\alpha_r)\Big|^2 \le \sum_{n\in\mathbb{Z}} F(n)\Big|\sum_r b_r e(-n\alpha_r)\Big|^2 = \sum_{r,s} b_r\overline{b_s}\sum_{n} F(n) e(n(\alpha_s - \alpha_r)). $$ By Poisson summation 21.15.01, $\sum_{n} F (n) e (n β) = \sum_{k} \widehat F (β + k)$, which, because $\widehat F$ is supported in $[- δ, δ]$ (and so vanishes at $\pm δ$ by continuity), is nonzero only when $∥ β ∥ < δ$. With $β = α_{s} - α_{r}$ and the points $δ$-spaced, $∥ α_{s} - α_{r} ∥ < δ$ forces $r = s$, where the value is $\widehat F (0) = N + δ^{- 1}$. Thus only the diagonal terms survive, giving $(N + δ^{- 1}) \sum_{r} ∣ b_{r} ∣^{2}$. $□$
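The Poisson step can be watched on a simpler band-limited function. The sketch below does not use the Beurling-Selberg majorant (which has no short closed form here); as a stand-in it takes the Fejér-type function $F (x) = δ \left(\frac{\sin (π δ x)}{π δ x}\right)^{2}$, whose transform is the triangle $\max (0, 1 - |ξ| / δ)$ supported in $[- δ, δ]$. This $F$ is not a majorant of an interval; it only illustrates how band-limiting makes $\sum_{n} F (n) e (n β)$ vanish once $∥ β ∥ \geq δ$. The names, the value $δ = 1/4$, and the truncation point are choices made for this example.

```python
import cmath
from math import pi, sin


def e(alpha):
    """Additive character e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def fejer(x, delta):
    """F(x) = delta * (sin(pi*delta*x) / (pi*delta*x))^2, with F(0) = delta."""
    if x == 0:
        return delta
    t = pi * delta * x
    return delta * (sin(t) / t) ** 2


def triangle(xi, delta):
    """Fourier transform of fejer: a triangle supported on [-delta, delta]."""
    return max(0.0, 1.0 - abs(xi) / delta)


def poisson_lhs(beta, delta, cutoff=5000):
    """Truncated sum over |n| <= cutoff of F(n) e(n*beta)."""
    return sum(fejer(n, delta) * e(n * beta) for n in range(-cutoff, cutoff + 1))


def poisson_rhs(beta, delta):
    """Sum over k of the transform at beta + k (only k near -beta can contribute)."""
    return sum(triangle(beta + k, delta) for k in range(-2, 3))


delta = 0.25
for beta in (0.0, 0.1, 0.5, 0.9):
    print(beta, round(abs(poisson_lhs(beta, delta)), 3), poisson_rhs(beta, delta))
```

The two columns agree up to the truncation error of the slowly decaying sum, and the row $β = 0.5$ shows the off-diagonal vanishing: the frequency is farther than $δ$ from every integer, so the weighted sum is essentially zero.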

Proposition 4 (Gauss-sum separation for primitive characters). For a primitive Dirichlet character $χ$ mod $q$ and any integer $n$, $\chi(n)\,\tau(\bar\chi) = \sideset{}{^*}\sum_{a\bmod q}\bar\chi(a)\, e(an/q)$, where $\tau(\bar\chi) = \sum_{a\bmod q}\bar\chi(a) e(a/q)$ has $|\tau(\bar\chi)|^2 = q$.

Proof. For $\gcd (n, q) = 1$, substitute $a \mapsto an$ in the Gauss sum: $τ (\bar χ) = \sum_{a} \bar χ (a) e (a / q) = \sum_{a} \bar χ (an) e (an / q) = \bar χ (n) \sum_{a} \bar χ (a) e (an / q)$, using $\bar χ (an) = \bar χ (a) \bar χ (n)$. Multiplying both sides by $χ (n)$ and using $χ (n) \bar χ (n) = 1$ gives the identity for $\gcd (n, q) = 1$; primitivity extends it to all $n$ (both sides vanish when $\gcd (n, q) > 1$ for primitive $χ$). The modulus $∣ τ (\bar χ) ∣^{2} = q$ is the standard evaluation: $∣ τ (χ) ∣^{2} = \sum_{a, b} χ (a) \bar χ (b) e ((a - b) / q) = q$ after using character orthogonality and primitivity. $□$
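The separation identity can be verified numerically for the simplest primitive characters. The sketch below assumes the quadratic character mod an odd prime $p$ (the Legendre symbol, computed by Euler's criterion); it is real, so $\bar χ = χ$. The helper names and the choice $p = 11$ are made up for this illustration.

```python
import cmath


def e(alpha):
    """Additive character e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)   # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


def twisted_sum(n, p):
    """Sum over a mod p of chi(a) e(a*n/p), with chi the Legendre symbol."""
    return sum(legendre(a, p) * e((a * n % p) / p) for a in range(1, p))


p = 11
tau = twisted_sum(1, p)               # the Gauss sum tau(chi)
print(abs(tau) ** 2)                  # close to p
# Largest deviation from chi(n) * tau = twisted_sum(n) over two full periods of n.
worst = max(abs(legendre(n, p) * tau - twisted_sum(n, p)) for n in range(2 * p))
print(worst)                          # at the level of rounding error
```

For $n$ divisible by $p$ both sides vanish, which is the primitivity case of the proposition; for a prime modulus every non-principal character is primitive.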

Connections Master

The large sieve rests on the additive-character duality between $\mathbb{R} / \mathbb{Z}$ and $\mathbb{Z}$ developed in 21.15.01: the off-diagonal vanishing in the proof is Poisson summation applied to a band-limited majorant, so the Poisson/Voronoi machinery there is the engine that produces the optimal constant $N + Q^{2}$ here.

The sifted-set bound and the Brun-Titchmarsh deduction reuse the summation toolkit of 21.11.02 (partial summation and the mean of $μ^{2} / φ$), so the elementary average-order estimates feed directly into the sieve's local densities $ω (p) / (p - ω (p))$ and the logarithmic gain $L (z) \sim \frac{φ (q)}{q} \log z$.

The dual-norm formulation is the Fourier-analytic Plancherel duality of 02.10.04 specialized to finitely many frequencies: the large sieve is the statement that the synthesis operator from frequency coefficients to a short signal has squared operator norm at most $N + δ^{- 1}$, an $ℓ^{2}$-boundedness that is Plancherel with a spacing-controlled overlap.

The Brun-Titchmarsh constant cannot reach $1 + o (1)$ by sieve methods alone because of the parity barrier, which is resolved only with the zero-free region and prime-counting analysis of 21.12.01; the large sieve and the analytic prime number theorem are the two complementary routes to primes in progressions.

Historical & philosophical context Master

The large sieve originates with Yuri Linnik's 1941 note in the Doklady ^{[Linnik 1941]}, introduced to bound the least quadratic non-residue and to attack Vinogradov's hypothesis. Linnik's method was combinatorial and additive; Alfréd Rényi reformulated it analytically in the late 1940s, and Klaus Roth and Enrico Bombieri sharpened the constant through the 1960s. Bombieri's 1965 Mathematika paper ^{[Bombieri 1965]} turned the large sieve into the tool that proves the Bombieri-Vinogradov theorem, the averaged Riemann Hypothesis that substitutes for GRH in application after application.

The optimal constant $N + Q^{2}$ was established by Hugh Montgomery and Robert Vaughan in their 1973 Mathematika paper ^{[Montgomery-Vaughan 1973]}, by way of a generalized Hilbert inequality; Selberg independently reached essentially the same constant with the Beurling-Selberg extremal majorant, which is the route followed in the proof given in this unit. The Brun-Titchmarsh theorem, named for Viggo Brun's sieve and Edward Titchmarsh's 1930 application, received its modern uniform form $2 x / (φ (q) \log (x / q))$ in the same work. Montgomery's 1971 lecture notes ^{[Montgomery 1971]} and the Montgomery-Vaughan treatise ^{[Montgomery-Vaughan 2007]} are the standard references for the analytic and arithmetic forms and their consequences.

Bibliography Master

@article{linnik1941largesieve,
  author  = {Linnik, Yuri V.},
  title   = {The large sieve},
  journal = {Doklady Akademii Nauk SSSR},
  volume  = {30},
  pages   = {292--294},
  year    = {1941}
}

@article{bombieri1965largesieve,
  author  = {Bombieri, Enrico},
  title   = {On the large sieve},
  journal = {Mathematika},
  volume  = {12},
  pages   = {201--225},
  year    = {1965}
}

@article{montgomeryvaughan1973largesieve,
  author  = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title   = {The large sieve},
  journal = {Mathematika},
  volume  = {20},
  pages   = {119--134},
  year    = {1973}
}

@book{montgomery1971topics,
  author    = {Montgomery, Hugh L.},
  title     = {Topics in Multiplicative Number Theory},
  series    = {Lecture Notes in Mathematics},
  volume    = {227},
  publisher = {Springer-Verlag},
  year      = {1971}
}

@book{montgomeryvaughan2007,
  author    = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title     = {Multiplicative Number Theory I: Classical Theory},
  series    = {Cambridge Studies in Advanced Mathematics},
  number    = {97},
  publisher = {Cambridge University Press},
  year      = {2007}
}

@book{davenport2000multiplicative,
  author    = {Davenport, Harold},
  title     = {Multiplicative Number Theory},
  edition   = {3},
  series    = {Graduate Texts in Mathematics},
  volume    = {74},
  publisher = {Springer-Verlag},
  year      = {2000},
  note      = {Revised by H. L. Montgomery}
}

@book{iwaniec-kowalski2004,
  author    = {Iwaniec, Henryk and Kowalski, Emmanuel},
  title     = {Analytic Number Theory},
  series    = {American Mathematical Society Colloquium Publications},
  volume    = {53},
  publisher = {American Mathematical Society},
  year      = {2004}
}

Prerequisites

21.11.02
21.15.01
02.10.04

Tier anchors

beginner: Davenport 2000 *Multiplicative Number Theory* (Springer GTM 74, 3rd ed., rev. Montgomery) Ch. 27 (the large sieve, informal motivation); Pollack 2009 *Not Always Buried Deep* (AMS) Ch. 8 on sieve heuristics; Tao's blog 'The large sieve' for the dual-norm viewpoint
intermediate: Montgomery-Vaughan 2007 *Multiplicative Number Theory I: Classical Theory* (Cambridge SMM 97) §7.1-§7.4 (the large sieve inequality, the arithmetic large sieve, Brun-Titchmarsh); Davenport 2000 *Multiplicative Number Theory* (Springer GTM 74) Ch. 27-29; Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) §7.4-§7.5
master: Montgomery 1971 *Topics in Multiplicative Number Theory* (Springer LNM 227) Ch. 1-4 (the large sieve and its arithmetic form); Montgomery-Vaughan 1973 *Mathematika* 20, 119-134 (*The large sieve* — the optimal constant $N+Q^2$ and the Brun-Titchmarsh theorem $\pi(x;q,a)\le 2x/(\varphi(q)\log(x/q))$); Selberg's $1+o(1)$ majorant and the Beurling-Selberg extremal function; Bombieri 1965 *Mathematika* 12, 201-225 (*On the large sieve* — the route to the Bombieri-Vinogradov theorem); Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 7; Montgomery-Vaughan 2007 *Multiplicative Number Theory I* (Cambridge SMM 97) Ch. 7

References

Montgomery, H. L. & Vaughan, R. C. — Multiplicative Number Theory I: Classical Theory · Cambridge Studies in Advanced Mathematics 97 (2007), §7. The large sieve inequality $\sum_r |S(\alpha_r)|^2 \le (N+\delta^{-1})\sum_n |a_n|^2$ for $\delta$-spaced points $\alpha_r$, the optimal constant $N+Q^2$ for Farey points of order $Q$, the arithmetic large sieve $\sum_{q\le Q}\sum_{a\,(q)}^* |S(a/q)|^2 \le (N+Q^2)\sum_n |a_n|^2$, the sifted-set bound, and the Brun-Titchmarsh theorem $\pi(x;q,a) \le 2x/(\varphi(q)\log(x/q))$ via the multiplicative large sieve.
Montgomery, H. L. & Vaughan, R. C. — The large sieve · *Mathematika* 20 (1973), 119-134. The duality argument giving the optimal constant $N+Q^2$ in the large sieve inequality, the Selberg/Beurling majorant input, and the deduction of the Brun-Titchmarsh inequality $\pi(x;q,a) \le 2x/(\varphi(q)\log(x/q))$ for $q < x$, uniform in $q$ and $a$ with $\gcd(a,q)=1$.
Bombieri, E. — On the large sieve · *Mathematika* 12 (1965), 201-225. The large sieve in the form used to prove the Bombieri-Vinogradov theorem $\sum_{q\le Q}\max_{(a,q)=1}|\psi(x;q,a) - x/\varphi(q)| \ll x(\log x)^{-A}$ for $Q = x^{1/2}(\log x)^{-B}$, an averaged Generalized Riemann Hypothesis on the error term in the prime number theorem for arithmetic progressions.
Montgomery, H. L. — Topics in Multiplicative Number Theory · Springer Lecture Notes in Mathematics 227 (1971), Ch. 1-4. The analytic large sieve, its arithmetic form over residue classes, the duality principle for bilinear forms, and the historical development from Linnik's original additive large sieve through Rényi, Roth, Bombieri, and Davenport-Halberstam.
Linnik, Yu. V. — The large sieve · *Doklady Akademii Nauk SSSR* 30 (1941), 292-294. The original large sieve: a method for bounding the number of residue classes a set of integers can avoid modulo many primes simultaneously, introduced to attack the least quadratic non-residue and Vinogradov's hypothesis.

Estimated time

beginner: 20m
intermediate: 55m
master: 90m