21.13.03 · number-theory / dirichlet-l-functions-characters

The Prime Number Theorem in Arithmetic Progressions and Siegel-Walfisz

shipped3 tiersLean: none

Anchor (Master): Davenport 2000 *Multiplicative Number Theory* 3e §20, §22 (the full PNT-in-AP and Siegel-Walfisz development, the exceptional-zero term, the $q \le (\log x)^A$ uniformity); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* §11; Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium Publications 53) §5.9, §17 (Siegel-Walfisz and its role in the large sieve and Bombieri-Vinogradov); Walfisz 1936 *Math. Z.* 40; Siegel 1935 *Acta Arithmetica* 1 (the ineffective lower bound feeding the uniformity)

Intuition Beginner

Split the whole numbers by the remainder they leave when divided by a fixed number $q$ . Among numbers that leave remainder $1$ on division by $4$ — that is $1, 5, 9, 13, \dots$ — how many are prime? Among those leaving remainder $3$ — that is $3, 7, 11, 15, \dots$ — how many are prime? Dirichlet proved long ago that each such list with remainder sharing no common factor with $q$ holds infinitely many primes. The question now is sharper: do the primes spread evenly across these lists?

The answer is yes, and the statement is the prime number theorem for arithmetic progressions. There are exactly $φ (q)$ remainders that share no factor with $q$ , where $φ$ counts them, and the primes split among those lists in equal shares. If you weight each prime by the logarithm of its base and add the weights up to $x$ inside one list, the running total is about $x$ divided by $φ (q)$ . Every allowed remainder gets the same slice of the primes; none is favoured.

Proving this reuses the contour machinery that proved the ordinary prime number theorem, but run once for each "character", a gadget that detects remainders. The hard part is honesty about how big $q$ may grow with $x$ before the proof stops working. One single bad zero of one single $L$ -function — the exceptional zero — could throw one remainder class off balance, and ruling it out is the most delicate step. The honest theorem, Siegel-Walfisz, lets $q$ grow only as fast as a fixed power of the logarithm of $x$ .

Visual Beginner

Picture $φ (q)$ bins, one for each remainder that shares no factor with $q$ . As you walk along the number line up to $x$ , each prime drops into the bin matching its remainder. The prime number theorem for progressions says all bins fill at the same rate: each holds about $x / φ (q)$ of the logarithm-weighted total. The bins stay level with one another, like water finding one height across connected tanks.

modulus $q$	allowed remainders	count $φ (q)$	share per class
$4$	$1, 3$	$2$	$x /2$
$5$	$1, 2, 3, 4$	$4$	$x /4$
$12$	$1, 5, 7, 11$	$4$	$x /4$

The middle column lists the remainders coprime to $q$ ; the last column is the equal slice each receives. The one tilted bin in the picture is the danger the exceptional zero poses, and the dashed guarantee line is exactly what Siegel-Walfisz secures while $q$ stays small.

Worked example Beginner

Count the logarithm-weighted primes up to $x = 20$ in the two allowed classes modulo $4$ , and check they come out roughly equal.

Step 1. List the primes up to $20$ : they are $2, 3, 5, 7, 11, 13, 17, 19$ . Drop the prime $2$ , since it shares the factor $2$ with $4$ and lands in no allowed class. The allowed remainders modulo $4$ are $1$ and $3$ .

Step 2. Sort the rest by remainder. Remainder $1$ : the primes $5, 13, 17$ (since $5 = 4 + 1$ , $13 = 12 + 1$ , $17 = 16 + 1$ ). Remainder $3$ : the primes $3, 7, 11, 19$ (since $3, 7 = 4 + 3, 11 = 8 + 3, 19 = 16 + 3$ ).

Step 3. Weight each prime by the natural logarithm of itself and add. Remainder $1$ : $lo g 5 + lo g 13 + lo g 17 \approx 1.61 + 2.56 + 2.83 = 7.00$ . Remainder $3$ : $lo g 3 + lo g 7 + lo g 11 + lo g 19 \approx 1.10 + 1.95 + 2.40 + 2.94 = 8.39$ .

What this tells us: the two totals, $7.00$ and $8.39$ , are already close, and the theorem says their ratio approaches $1$ as $x$ grows. The predicted common share is $x / φ (4) = 20/2 = 10$ ; both totals sit a little below $10$ , which is the expected undercount at this tiny scale. The primes are not crowding into one class — they are splitting evenly, exactly as the equal-bins picture promised.

Check your understanding Beginner

Exercise (easy, multiple choice).

In the Siegel-Walfisz theorem, how large is the modulus $q$ allowed to grow with $x$ while the equal-distribution estimate still holds?

A. As large as $x$ itself B. As large as a fixed power of $lo g x$ C. As large as $x$ D. It must stay constant

Hint

The theorem is uniform only for very slowly growing moduli — far smaller than any power of $x$ .

Answer

B. Siegel-Walfisz holds uniformly for $q \leq (lo g x)^{A}$ , any fixed power $A$ of the logarithm. Feedback-correct: this is a severe restriction — a power of $lo g x$ is far smaller than any power of $x$ — and it is the price of controlling the exceptional zero. Feedback-wrong: ranges like $x$ are reached only later and only on average over many moduli, by the large sieve and the Bombieri-Vinogradov theorem, not by Siegel-Walfisz for each individual $q$ .

Formal definition Intermediate+

Throughout, $q \geq 1$ is the modulus, $a$ is an integer with $(a, q) = 1$ , $χ$ ranges over Dirichlet characters modulo $q$ as in 21.03.02, $χ_{0}$ is the principal character, $L (s, χ) = \sum_{n} χ (n) n^{- s}$ is the $L$ -function with Euler product for $σ > 1$ , $Λ$ is the von Mangoldt function, and $φ$ is Euler's totient. We follow Davenport ^{[Davenport §19]}.

Definition (Chebyshev function in a progression). For real $x \geq 2$ set $$ \psi(x; q, a) = \sum_{\substack{n \le x \ n \equiv a ,(\mathrm{mod}, q)}} \Lambda(n), $$ the von Mangoldt weight summed over the prime powers congruent to $a$ modulo $q$ . The twisted Chebyshev function is $$ \psi(x, \chi) = \sum_{n \le x} \chi(n)\Lambda(n), $$ the character-weighted analogue of the $ψ (x)$ of 21.12.02.

Definition (orthogonality decomposition). The orthogonality of Dirichlet characters, $\sum_{χ mod q} \overline{χ} (a) χ (n) = φ (q)$ when $n \equiv a$ and $0$ otherwise for $(a, q) = 1$ , gives the exact identity $$ \psi(x; q, a) = \frac{1}{\varphi(q)} \sum_{\chi \bmod q} \overline\chi(a), \psi(x, \chi). $$ The principal term $χ = χ_{0}$ contributes $φ (q)^{- 1} ψ (x, χ_{0}) = φ (q)^{- 1} (ψ (x) + O (lo g^{2} x))$ , the expected main term $x / φ (q)$ ; every non-principal $χ$ contributes an error term controlled by the zeros of $L (s, χ)$ .

Definition (the PNT in arithmetic progressions). The prime number theorem in arithmetic progressions is the asymptotic $$ \psi(x; q, a) \sim \frac{x}{\varphi(q)} \qquad (x \to \infty) $$ for each fixed $q$ and each $a$ with $(a, q) = 1$ ; equivalently $π (x; q, a) \sim Li (x) / φ (q)$ for the count of primes $\leq x$ in the class. The content is the equidistribution: the leading constant $1/ φ (q)$ is the same for every allowed $a$ .

The symbols $ψ (x; q, a)$ , $ψ (x, χ)$ , $L (s, χ)$ , $χ$ , $χ_{0}$ , $φ (q)$ , the modulus $q$ , the exceptional zero $β$ , and $ℜ$ for real part are recorded in _meta/NOTATION.md. The exceptional (Siegel) zero $β$ is the possible real zero of $L (s, χ)$ for real $χ$ described in 21.13.01.

Counterexamples to common slips

"Equidistribution is obvious once Dirichlet's theorem gives infinitely many primes per class." Infinitude is far weaker than equal proportion: Dirichlet's theorem allows wildly unequal densities a priori. The equal constant $1/ φ (q)$ is exactly the non-vanishing $L (1, χ) \neq = 0$ of 21.03.02 sharpened to a zero-free region.
"The error term for each $q$ is uniform up to $q$ near $x$ ." It is not. Siegel-Walfisz secures uniformity only for $q \leq (lo g x)^{A}$ . The barrier is the exceptional zero, whose contribution $- x^{β} / (β φ (q))$ is controlled only by Siegel's ineffective bound, which degrades as $q$ grows.
"The decomposition has an error because orthogonality is approximate." The orthogonality identity is exact for $(a, q) = 1$ ; the decomposition $ψ (x; q, a) = φ (q)^{- 1} \sum_{χ} \overline{χ} (a) ψ (x, χ)$ is an equality, not an estimate. All error enters through the analytic behaviour of each $ψ (x, χ)$ , not through the splitting.

Key theorem with proof Intermediate+

The signature result is the Siegel-Walfisz theorem: equidistribution with a de la Vallée Poussin error term, uniform in the modulus up to a fixed power of $lo g x$ . The proof runs the contour argument of 21.12.02 once per character, then reassembles by orthogonality, with Siegel's bound taming the one exceptional zero. ^{[Davenport §22]}

Theorem (Siegel-Walfisz). Fix $A > 0$ . There is a constant $c = c (A) > 0$ such that for all $q \leq (lo g x)^{A}$ and all $a$ with $(a, q) = 1$ , $$ \psi(x; q, a) = \frac{x}{\varphi(q)} + O!\big(x\exp(-c\sqrt{\log x})\big), $$ the implied constant depending only on $A$ . The constant $c (A)$ is ineffective.

Proof. Start from the exact decomposition $ψ (x; q, a) = φ (q)^{- 1} \sum_{χ} \overline{χ} (a) ψ (x, χ)$ . The principal character gives $ψ (x, χ_{0}) = ψ (x) - \sum_{p ∣ q} (lo g p) ⌊ lo g_{p} x ⌋ = x + O (x exp (- c_{0} lo g x)) + O (lo g q lo g x)$ by the prime number theorem of 21.12.02, so its share is $x / φ (q)$ up to the stated error. It remains to bound $ψ (x, χ)$ for each non-principal $χ$ .

For non-principal $χ$ the twisted explicit formula reads, for $2 \leq T \leq x$ , $$ \psi(x, \chi) = -\sum_{|\Im\rho| \le T}\frac{x^\rho}{\rho} + O!\Big(\frac{x\log^2(qx)}{T}\Big), $$ the sum over the non-elementary zeros $ρ = β_{ρ} + i γ_{ρ}$ of $L (s, χ)$ inside the critical strip. Insert the zero-free region of 21.13.01: every such zero satisfies $β_{ρ} \leq 1 - c_{1} / lo g (q (∣ γ_{ρ} ∣ + 2))$ , with the single possible exception, for real $χ$ , of one real zero $β$ . Bounding each term by $x^{β_{ρ}} /∣ ρ ∣$ and summing against the zero-density estimate $\sum_{∣ γ_{ρ} ∣ \leq T} 1 ≪ lo g (q T) \cdot T$ yields, for the non-exceptional zeros, $$ \Big|\sum_{\rho \ne \beta}\frac{x^\rho}{\rho}\Big| \ll x\exp!\Big(-\frac{c_2\log x}{\log(qT)}\Big)\log^2(qT) + \frac{x\log^2(qx)}{T}. $$ Choosing $lo g T = lo g x$ and using $q \leq (lo g x)^{A}$ , so $lo g (q T) ≪ lo g x$ , both pieces are $O (x exp (- c_{3} lo g x))$ .

The one term left is the exceptional zero, present only for at most one real $χ$ modulo $q$ (Landau's repulsion, 21.13.01). It contributes $- \overline{χ} (a) x^{β} / (β φ (q))$ to $ψ (x; q, a)$ . Its size is $x^{β} = x \cdot x^{- (1 - β)} = x exp (- (1 - β) lo g x)$ . Siegel's theorem ^{[Siegel 1935]}, referenced through 21.13.01, gives $1 - β ≫_{ε} q^{- ε}$ . Take $ε = 1/ (2 A)$ : since $q \leq (lo g x)^{A}$ , one has $q^{- ε} \geq (lo g x)^{- 1/2}$ , so $1 - β ≫ (lo g x)^{- 1/2}$ , and $$ x^\beta \ll x\exp!\big(-c_4 (\log x)^{1/2}\big) = x\exp(-c_4\sqrt{\log x}). $$ The constant $c_{4}$ is ineffective because Siegel's is. Collecting the principal main term, the non-exceptional zero sum, and the exceptional term, every error is $O (x exp (- c lo g x))$ with $c = c (A)$ , completing the proof. $□$

Bridge. This uniformity builds toward the analytic theory of primes in progressions and appears again in 21.14.02, where the large sieve replaces the per-modulus exceptional-zero control by an average over $q$ . The foundational reason the proof works is the exact orthogonality split: it converts a question about one residue class into $φ (q)$ independent contour problems, and this is exactly the device by which the single-modulus PNT of 21.12.02 generalises to every progression at once. The central insight is that the only obstruction to clean uniformity is one real zero of one real-character $L$ -function, so the entire strength of the theorem is dual to the strength of the lower bound on $1 - β$ : putting these together, Siegel's ineffective $q^{- ε}$ is exactly what forces the range $q \leq (lo g x)^{A}$ and no larger, and the bridge is the trade of an effective but feeble range for an ineffective but genuine one.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Derive the orthogonality decomposition $ψ (x; q, a) = φ (q)^{- 1} \sum_{χ mod q} \overline{χ} (a) ψ (x, χ)$ from the character orthogonality relation, for $(a, q) = 1$ .

Hint

Use $\sum_{χ mod q} \overline{χ} (a) χ (n) = φ (q)$ if $n \equiv a (mod q)$ and $0$ otherwise (for $(a, q) = 1$ ). Insert this as a detector inside the sum defining $ψ (x; q, a)$ .

Answer

The orthogonality relation for characters modulo $q$ states that for $(a, q) = 1$ , $$ \frac{1}{\varphi(q)}\sum_{\chi \bmod q}\overline\chi(a)\chi(n) = \begin{cases} 1 & n \equiv a ,(\mathrm{mod}, q),\ (n, q) = 1, \ 0 & \text{otherwise.}\end{cases} $$ Multiply by $Λ (n)$ and sum over $n \leq x$ . Since $(a, q) = 1$ , the surviving $n$ are exactly those $\equiv a$ , so the left side becomes $φ (q)^{- 1} \sum_{n \leq x} Λ (n) \sum_{χ} \overline{χ} (a) χ (n)$ and the right side is $\sum_{n \leq x, n \equiv a} Λ (n) = ψ (x; q, a)$ . Swapping the finite character sum with the sum over $n$ , $$ \psi(x; q, a) = \frac{1}{\varphi(q)}\sum_{\chi \bmod q}\overline\chi(a)\sum_{n \le x}\chi(n)\Lambda(n) = \frac{1}{\varphi(q)}\sum_{\chi \bmod q}\overline\chi(a),\psi(x, \chi). $$ Rubric: full credit for citing orthogonality, inserting it as a detector, and exchanging the sums. The identity is exact, with no error term.

Exercise 4 (medium, symbolic).

Show that the principal-character twisted Chebyshev function satisfies $ψ (x, χ_{0}) = ψ (x) + O (lo g q lo g x)$ , and explain why this yields the main term $x / φ (q)$ .

Hint

$χ_{0} (n) = 1$ when $(n, q) = 1$ and $0$ otherwise. So $ψ (x, χ_{0})$ omits the prime powers $p^{m}$ with $p ∣ q$ . Bound that omitted contribution.

Answer

By definition $ψ (x, χ_{0}) = \sum_{n \leq x} χ_{0} (n) Λ (n) = \sum_{n \leq x, (n, q) = 1} Λ (n) = ψ (x) - \sum_{p^{m} \leq x p ∣ q} lo g p$ . The omitted sum runs over prime powers of the primes dividing $q$ ; there are at most $lo g q / lo g 2$ such primes, each contributing $\leq lo g x$ across its powers $p^{m} \leq x$ , so the omitted total is $O (lo g q lo g x)$ . Hence $ψ (x, χ_{0}) = ψ (x) + O (lo g q lo g x) = x + O (x exp (- c lo g x)) + O (lo g q lo g x)$ by the PNT of 21.12.02. Dividing by $φ (q)$ gives the share $x / φ (q)$ with negligible correction. Rubric: full credit for isolating the $p ∣ q$ terms, the $O (lo g q lo g x)$ bound, and the appeal to the ordinary PNT.

Exercise 5 (medium, symbolic).

Given the zero-free region $β_{ρ} \leq 1 - c / lo g (q (∣ γ_{ρ} ∣ + 2))$ for the zeros of $L (s, χ)$ , show that a single non-exceptional zero contributes at most $x exp (- c lo g x / lo g (q T))$ to the explicit-formula sum, and choose $T$ to make this $x exp (- c^{'} lo g x)$ when $q \leq (lo g x)^{A}$ .

Hint

A zero $ρ$ contributes $∣ x^{ρ} / ρ ∣ = x^{β_{ρ}} /∣ ρ ∣$ . Bound $x^{β_{ρ}}$ using the region, then set $lo g T = lo g x$ and use $lo g q ≪ lo g lo g x$ .

Answer

For a zero $ρ = β_{ρ} + i γ_{ρ}$ with $∣ γ_{ρ} ∣ \leq T$ , the region gives $x^{β_{ρ}} \leq x^{1 - c / l o g (q (∣ γ_{ρ} ∣ + 2))} \leq x exp (- c lo g x / lo g (q T))$ , since $∣ γ_{ρ} ∣ + 2 \leq T + 2 ≪ T$ . Summing over the $≪ T lo g (q T)$ zeros up to height $T$ and dividing by $∣ ρ ∣ \geq 1$ gives a bound $≪ x exp (- c lo g x / lo g (q T)) T lo g (q T)$ for the zero sum, plus the truncation error $x lo g^{2} (q x) / T$ . Choose $lo g T = lo g x$ , so $T = exp (lo g x)$ . With $q \leq (lo g x)^{A}$ , $lo g q \leq A lo g lo g x = o (lo g x)$ , hence $lo g (q T) = lo g T + lo g q \sim lo g x$ . The exponent $c lo g x / lo g (q T) \sim c lo g x$ , and the polynomial-in- $T$ factors are absorbed, while the truncation error is $x lo g^{2} x / exp (lo g x) ≪ x exp (- c^{'} lo g x)$ . Both pieces are $O (x exp (- c^{'} lo g x))$ . Rubric: full credit for the per-zero bound, the density-weighted sum, the choice $lo g T = lo g x$ , and the absorption of $lo g q$ .

Exercise 6 (hard, symbolic).

The exceptional zero $β$ contributes $- \overline{χ} (a) x^{β} / (β φ (q))$ to $ψ (x; q, a)$ . Using Siegel's bound $1 - β ≫_{ε} q^{- ε}$ , prove this term is $O (x exp (- c lo g x))$ for $q \leq (lo g x)^{A}$ , and identify exactly where the ineffectivity enters.

Hint

Write $x^{β} = x exp (- (1 - β) lo g x)$ and lower-bound $1 - β$ by Siegel with $ε = 1/ (2 A)$ . Track which constant cannot be computed.

Answer

Since $∣ \overline{χ} (a) ∣ \leq 1$ and $β \geq 1/2$ , the term is $\leq x^{β} / (β φ (q)) \leq 2 x^{β}$ . Write $x^{β} = x^{1 - (1 - β)} = x exp (- (1 - β) lo g x)$ . Siegel's theorem gives, for every $ε > 0$ , a constant $C (ε) > 0$ with $1 - β \geq C (ε) q^{- ε}$ . Choose $ε = 1/ (2 A)$ . For $q \leq (lo g x)^{A}$ we get $q^{- ε} \geq (lo g x)^{- A ε} = (lo g x)^{- 1/2}$ , so $1 - β \geq C (ε) (lo g x)^{- 1/2}$ . Therefore $$ x^\beta \le x\exp!\big(-C(\varepsilon)(\log x)^{-1/2}\log x\big) = x\exp!\big(-C(\varepsilon)\sqrt{\log x}\big), $$ so the exceptional term is $O (x exp (- c lo g x))$ with $c = C (ε)$ . The ineffectivity is precisely the constant $C (ε) = c$ : Siegel's proof of $1 - β ≫_{ε} q^{- ε}$ case-splits on whether some real character has an exceptional zero near $1$ and cannot decide which case holds, so $C (ε)$ — and hence the implied $c$ in the final error — is known to be positive but cannot be computed. Rubric: full credit for the rewrite of $x^{β}$ , the choice $ε = 1/ (2 A)$ , the resulting $lo g x$ exponent, and pinning the ineffectivity to Siegel's constant.

Exercise 7 (hard, symbolic).

Explain why the range cannot be pushed beyond $q \leq (lo g x)^{A}$ by this method: show that for $q$ as large as a small power of $x$ , the exceptional term $x^{β}$ need not be smaller than the main term $x / φ (q)$ under only Siegel's bound.

Hint

For the exceptional term to beat the main term you need $x^{β} = o (x / φ (q))$ , i.e. $exp (- (1 - β) lo g x) = o (1/ φ (q))$ . Plug in $1 - β ≍ q^{- ε}$ and see what $q$ makes this fail.

Answer

The exceptional term is negligible against the main term $x / φ (q)$ precisely when $x^{β} φ (q) = o (x)$ , i.e. $exp (- (1 - β) lo g x) φ (q) = o (1)$ , i.e. $(1 - β) lo g x \to \infty$ faster than $lo g φ (q) \leq lo g q$ . Siegel gives only $1 - β ≫_{ε} q^{- ε}$ , so the best guaranteed growth of $(1 - β) lo g x$ is $≍ q^{- ε} lo g x$ . For this to exceed $lo g q$ one needs $q^{- ε} lo g x ≫ lo g q$ , i.e. $q^{ε} ≪ lo g x / lo g q$ , which for any fixed $ε$ forces $q ≪ (lo g x)^{1/ ε}$ — a power of $lo g x$ . If instead $q = x^{δ}$ for fixed $δ > 0$ , then $q^{- ε} = x^{- δ ε}$ , so $(1 - β) lo g x ≫ x^{- δ ε} lo g x \to 0$ , and $x^{β} = x exp (- (1 - β) lo g x) = x (1 + o (1))$ could be the full size of $x$ , swamping the main term $x / φ (q) = x^{1 - δ + o (1)}$ . With only Siegel's ineffective bound the exceptional zero is uncontrolled in this regime, which is exactly why per-modulus uniformity stops at a power of $lo g x$ . Rubric: full credit for the comparison $x^{β} φ (q)$ versus $x$ , substituting Siegel's bound, deriving the $q ≪ (lo g x)^{1/ ε}$ ceiling, and showing the power-of- $x$ regime fails.

Exercise 8 (hard, symbolic).

State the Bombieri-Vinogradov theorem and explain in what averaged sense it recovers a range $q \leq x^{1/2 - ε}$ that Siegel-Walfisz cannot reach for individual $q$ .

Hint

Bombieri-Vinogradov bounds $\sum_{q \leq Q} max_{(a, q) = 1} ∣ ψ (x; q, a) - x / φ (q) ∣$ , not each term separately. The maximum allowed $Q$ is essentially $x$ .

Answer

Bombieri-Vinogradov. For every $A > 0$ there is $B = B (A)$ such that, with $Q = x^{1/2} (lo g x)^{- B}$ , $$ \sum_{q \le Q}\max_{(a, q) = 1}\Big|\psi(x; q, a) - \frac{x}{\varphi(q)}\Big| \ll_A \frac{x}{(\log x)^A}. $$ This bounds the total error summed over all moduli $q \leq x^{1/2 - ε}$ , not the error for each $q$ . The point is that the exceptional zeros, which Siegel-Walfisz can control only individually and only for $q \leq (lo g x)^{A}$ , are rare (Landau's repulsion limits them to roughly one per dyadic range of moduli), so when one sums over $q$ the bad moduli contribute a negligible fraction of the total. The large sieve inequality of 21.14.02 provides the mean-square bound on character sums that powers this averaging. The upshot: on average over $q \leq x$ , primes are equidistributed in residue classes with the same quality of error one would get from the Generalized Riemann Hypothesis for each individual modulus — which is why Bombieri-Vinogradov is often called "GRH on average". Rubric: full credit for the correct statement with $Q ≍ x$ , the distinction between summed and per-modulus error, and the role of exceptional-zero rarity and the large sieve.

Advanced results Master

Theorem 1 (twisted explicit formula). For non-principal $χ$ modulo $q$ and $2 \leq T \leq x$ ^{[Davenport §19]}, $$ \psi(x, \chi) = -\sum_{|\Im\rho| \le T}\frac{x^\rho}{\rho} - (1 - E_0(\chi))\log x + O!\Big(\frac{x\log^2(qx)}{T}\Big), $$ the sum over non-elementary zeros $ρ$ of $L (s, χ)$ , with $E_{0} (χ) = 1$ if $χ = χ_{0}$ and $0$ otherwise. The absence of a main term for non-principal $χ$ — there is no pole of $L (s, χ)$ at $s = 1$ — is the analytic statement of equidistribution: only $χ_{0}$ produces $x$ , and orthogonality then averages the rest to zero.

Theorem 2 (PNT in progressions with exceptional term). Assembling the twisted explicit formulae through orthogonality ^{[Davenport §20]}, $$ \psi(x; q, a) = \frac{x}{\varphi(q)} - \frac{\overline\chi_1(a)}{\varphi(q)}\frac{x^{\beta}}{\beta} + O!\big(x\exp(-c\sqrt{\log x})\big), $$ where the middle term is present only if some real $χ_{1}$ modulo $q$ has an exceptional zero $β$ ; otherwise it is absent and the error stands alone. The exceptional term is the unique obstruction to clean equidistribution and is the reason the theorem is conditional in quality on the Landau-Siegel zero.

Theorem 3 (Siegel-Walfisz, uniform form). For every $A, B > 0$ there is $c = c (A) > 0$ with ^{[Walfisz 1936]} $$ \psi(x; q, a) = \frac{x}{\varphi(q)} + O!\big(x\exp(-c\sqrt{\log x})\big) \qquad (q \le (\log x)^A,\ (a, q) = 1), $$ and correspondingly $π (x; q, a) = Li (x) / φ (q) + O (x exp (- c lo g x))$ by partial summation. The constant $c (A)$ is ineffective, inherited from Siegel's bound on $1 - β$ ; this is the sole source of ineffectivity, since the non-exceptional zeros are handled by the effective de la Vallée Poussin region of 21.13.01.

Theorem 4 (the $q \leq (lo g x)^{A}$ barrier). The range is sharp for the method ^{[Iwaniec-Kowalski §5.9]}. For the exceptional term to be subsumed in the error one needs $x^{β} = O (x exp (- c lo g x))$ , which via $1 - β ≫_{ε} q^{- ε}$ forces $q^{- ε} lo g x ≫ lo g x$ , i.e. $q ≪ (lo g x)^{1/ (2 ε)}$ . No choice of $ε$ pushes the range to a power of $x$ ; the exceptional zero is the impassable wall for any single modulus, and crossing it for individual $q$ would require eliminating the Landau-Siegel zero outright.

Theorem 5 (recovery on average: Bombieri-Vinogradov). The large sieve of 21.14.02 yields, for every $A$ and $Q = x^{1/2} (lo g x)^{- B (A)}$ ^{[Iwaniec-Kowalski §17]}, $$ \sum_{q \le Q}\max_{(a, q) = 1}\Big|\psi(x; q, a) - \frac{x}{\varphi(q)}\Big| \ll_A \frac{x}{(\log x)^A}. $$ This is the average-over-moduli statement that compensates for the per-modulus failure: it reaches $q \leq x$ at the cost of a maximum over residues and a sum over moduli, and it is unconditional and effective, because the rarity of exceptional zeros (Landau's repulsion) makes their pooled contribution negligible.

Synthesis. The prime number theorem in arithmetic progressions is the foundational reason equidistribution of primes in residue classes is a theorem with an error term, and the exceptional zero is the single structural flaw running through it, exactly as in 21.13.01. The central insight is that orthogonality splits one residue class into $φ (q)$ independent $L$ -function contour problems, so the single-modulus PNT of 21.12.02 generalises verbatim except for the one real zero that the zero-free region cannot exclude: putting these together, the principal character supplies the main term $x / φ (q)$ while every other character contributes only zero-driven error, and this is exactly the analytic content of "no remainder is favoured". The bridge is everywhere the exceptional zero — its contribution $x^{β}$ is dual to Siegel's lower bound on $1 - β$ , and the range $q \leq (lo g x)^{A}$ is precisely the regime where that ineffective bound still beats $lo g q$ . This is dual to the large-sieve story: where Siegel-Walfisz controls each modulus up to a power of $lo g x$ at the cost of ineffectivity, the Bombieri-Vinogradov theorem of 21.14.02 controls all moduli up to $x$ on average and effectively, trading per-modulus precision for an average that the rarity of exceptional zeros makes clean. The central insight recurs across analytic number theory: a single conjectural zero conditions the entire quantitative theory, and every uniform statement about primes in progressions is written, like the zero-free region itself, in the conditional mood the Siegel zero imposes.

Full proof set Master

Proposition 1 (orthogonality decomposition is exact). For $(a, q) = 1$ and $x \geq 1$ , $ψ (x; q, a) = φ (q)^{- 1} \sum_{χ mod q} \overline{χ} (a) ψ (x, χ)$ .

Proof. The character orthogonality relation gives $\sum_{χ mod q} \overline{χ} (a) χ (n) = φ (q)$ if $n \equiv a (mod q)$ with $(n, q) = 1$ , and $0$ otherwise; for $(a, q) = 1$ the condition $n \equiv a$ already forces $(n, q) = 1$ . Multiply by $Λ (n)$ , sum over $n \leq x$ , and exchange the finite sums: $$ \frac{1}{\varphi(q)}\sum_{\chi}\overline\chi(a)\sum_{n \le x}\chi(n)\Lambda(n) = \sum_{n \le x}\Lambda(n)\frac{1}{\varphi(q)}\sum_\chi \overline\chi(a)\chi(n) = \sum_{\substack{n \le x \ n \equiv a}}\Lambda(n) = \psi(x; q, a). $$ The inner sum on the left is $ψ (x, χ)$ , giving the claim. No error term arises; the identity is exact. $□$

Proposition 2 (no main term for non-principal $χ$ ). For non-principal $χ$ , $L (s, χ)$ is holomorphic and non-zero at $s = 1$ , so $ψ (x, χ) = o (x)$ .

Proof. For non-principal $χ$ the $L$ -function $L (s, χ) = \sum_{n} χ (n) n^{- s}$ converges for $σ > 0$ by partial summation (the partial sums of $χ (n)$ are bounded by $q$ ), hence is holomorphic across $s = 1$ , and $L (1, χ) \neq = 0$ by 21.03.02. Thus $- L^{'} (s, χ) / L (s, χ)$ has no pole at $s = 1$ , and the Perron contour for $ψ (x, χ) = \frac{1}{2 π i} \int (- L^{'} / L) x^{s} / s d s$ sweeps past no residue at $s = 1$ . The leading contribution $x$ that the pole of $ζ$ supplied for $ψ (x)$ in 21.12.02 is therefore absent, leaving only the zero-driven sum, which is $o (x)$ by the zero-free region. $□$

Proposition 3 (exceptional-term bound under Siegel). If $L (s, χ_{1})$ has an exceptional zero $β$ and $q \leq (lo g x)^{A}$ , then $x^{β} ≪ x exp (- c lo g x)$ with $c = c (A)$ ineffective.

Proof. Write $x^{β} = x exp (- (1 - β) lo g x)$ . By Siegel's theorem ^{[Siegel 1935]}, for each $ε > 0$ there is $C (ε) > 0$ with $1 - β \geq C (ε) q^{- ε}$ . Take $ε = 1/ (2 A)$ ; then $q \leq (lo g x)^{A}$ gives $q^{- ε} \geq (lo g x)^{- 1/2}$ , so $1 - β \geq C (ε) (lo g x)^{- 1/2}$ and $$ x^\beta \le x\exp!\big(-C(\varepsilon)(\log x)^{-1/2}\cdot\log x\big) = x\exp!\big(-C(\varepsilon)\sqrt{\log x}\big). $$ Set $c = C (ε)$ . Siegel's $C (ε)$ is ineffective because its proof selects an auxiliary real character with the most extreme exceptional zero, whose existence cannot be decided, so $c$ is positive but not computable. $□$

Proposition 4 (non-exceptional zero sum). With $lo g T = lo g x$ and $q \leq (lo g x)^{A}$ , the contribution of the non-exceptional zeros of $L (s, χ)$ to $ψ (x, χ)$ is $O (x exp (- c^{'} lo g x))$ .

Proof. Each zero $ρ = β_{ρ} + i γ_{ρ}$ other than $β$ obeys $β_{ρ} \leq 1 - c_{1} / lo g (q (∣ γ_{ρ} ∣ + 2))$ by 21.13.01, so $x^{β_{ρ}} \leq x exp (- c_{1} lo g x / lo g (q T))$ for $∣ γ_{ρ} ∣ \leq T$ . The number of zeros with $∣ γ_{ρ} ∣ \leq T$ is $≪ T lo g (q T)$ , and each contributes $x^{β_{ρ}} /∣ ρ ∣$ with $∣ ρ ∣ \geq ∣ γ_{ρ} ∣$ or $\geq β_{ρ}$ ; partial summation against the density bound gives $\sum_{ρ \neq = β} ∣ x^{ρ} / ρ ∣ ≪ x exp (- c_{1} lo g x / lo g (q T)) lo g^{2} (q T)$ . Adding the truncation error $x lo g^{2} (q x) / T$ , and using $lo g (q T) \sim lo g x$ since $lo g q ≪ lo g lo g x = o (lo g x)$ and $lo g T = lo g x$ , both terms are $O (x exp (- c^{'} lo g x))$ . $□$

Connections Master

The contour proof, truncated Perron formula, and the optimisation $lo g T = lo g x$ are imported wholesale from 21.12.02; this unit runs that single-modulus argument once per character $χ$ and reassembles by orthogonality, so the analytic engine is identical and only the $φ (q)$ -fold bookkeeping and the exceptional-zero term are new.
The zero-free region for $L (s, χ)$ , Landau's repulsion, and the isolation of the exceptional zero $β$ are exactly the apparatus of 21.13.01; the present unit consumes that infrastructure — the de la Vallée Poussin region for the non-exceptional zeros and the at-most-one-exceptional-zero bound for the real-character term — to turn the per-character explicit formula into a uniform error.
The orthogonality of Dirichlet characters and the non-vanishing $L (1, χ) \neq = 0$ that gives every non-principal $ψ (x, χ)$ no main term are the facts established in 21.03.02; the decomposition $ψ (x; q, a) = φ (q)^{- 1} \sum_{χ} \overline{χ} (a) ψ (x, χ)$ is the exact identity through which those facts become equidistribution.
Siegel's ineffective lower bound $1 - β ≫_{ε} q^{- ε}$ , the single source of ineffectivity here and the controller of the exceptional term, is developed in detail in the sibling unit 21.13.02; the range $q \leq (lo g x)^{A}$ is precisely the regime where that bound dominates $lo g q$ .
The large sieve and the Bombieri-Vinogradov theorem of 21.14.02 are the average-over-moduli sequel: where Siegel-Walfisz reaches only $q \leq (lo g x)^{A}$ per modulus, the large sieve recovers $q \leq x^{1/2 - ε}$ on average, unconditionally and effectively, by exploiting the rarity of exceptional zeros that this unit's per-modulus argument cannot exploit.

Historical & philosophical context Master

The prime number theorem in arithmetic progressions descends directly from Dirichlet's 1837 proof that each coprime residue class modulo $q$ contains infinitely many primes, the first appearance of $L$ -functions and of the non-vanishing $L (1, χ) \neq = 0$ . The quantitative equidistribution $ψ (x; q, a) \sim x / φ (q)$ , with the de la Vallée Poussin error term, followed the 1896 prime number theorem for the same character-twisted contour reasons, but only for fixed or slowly growing $q$ ; the obstruction to uniformity was the exceptional zero isolated by Edmund Landau in 1918. The decisive uniform statement is due to Arnold Walfisz, who in 1936 ^{[Walfisz 1936]} combined Carl Ludwig Siegel's ineffective lower bound on $L (1, χ)$ ^{[Siegel 1935]}, published the year before, with the contour method to prove equidistribution uniformly for $q \leq (lo g x)^{A}$ . The theorem bears both names because Siegel supplied the analytic input and Walfisz the synthesis.

The ineffectivity Siegel introduced is permanent under this method: the constant $c (A)$ in the error is known to be positive but cannot be computed, because Siegel's argument bounds all $L (1, χ)$ in terms of a hypothetical extremal exceptional zero whose existence is undecided. The limitation $q \leq (lo g x)^{A}$ stood as the working range for individual moduli until the large-sieve methods of the 1960s. Klaus Roth and, decisively, Enrico Bombieri and A. I. Vinogradov in 1965 proved that on average over $q \leq x^{1/2 - ε}$ the equidistribution holds with the strength one would expect from the Generalized Riemann Hypothesis for each modulus — the Bombieri-Vinogradov theorem — which is the form actually used in modern prime-gap and additive-prime results. Davenport's Multiplicative Number Theory §20-22 and Iwaniec-Kowalski's Analytic Number Theory are the canonical treatments.

Bibliography Master

@article{walfisz1936,
  author  = {Walfisz, Arnold},
  title   = {Zur additiven Zahlentheorie II},
  journal = {Mathematische Zeitschrift},
  volume  = {40},
  pages   = {592--607},
  year    = {1936}
}

@article{siegel1935ap,
  author  = {Siegel, Carl Ludwig},
  title   = {\"{U}ber die Classenzahl quadratischer Zahlk\"{o}rper},
  journal = {Acta Arithmetica},
  volume  = {1},
  pages   = {83--86},
  year    = {1935}
}

@article{bombieri1965,
  author  = {Bombieri, Enrico},
  title   = {On the large sieve},
  journal = {Mathematika},
  volume  = {12},
  pages   = {201--225},
  year    = {1965}
}

@book{davenport2000ap,
  author    = {Davenport, Harold},
  title     = {Multiplicative Number Theory},
  edition   = {3},
  series    = {Graduate Texts in Mathematics},
  volume    = {74},
  publisher = {Springer-Verlag},
  year      = {2000},
  note      = {Revised by H. L. Montgomery; \S19--22: PNT in arithmetic progressions and Siegel-Walfisz}
}

@book{iwaniec-kowalski2004ap,
  author    = {Iwaniec, Henryk and Kowalski, Emmanuel},
  title     = {Analytic Number Theory},
  series    = {American Mathematical Society Colloquium Publications},
  volume    = {53},
  publisher = {American Mathematical Society},
  year      = {2004},
  note      = {\S5.9 and Ch. 17: Siegel-Walfisz, the large sieve, and Bombieri-Vinogradov}
}

@book{montgomery-vaughan2007ap,
  author    = {Montgomery, Hugh L. and Vaughan, Robert C.},
  title     = {Multiplicative Number Theory I: Classical Theory},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {97},
  publisher = {Cambridge University Press},
  year      = {2007},
  note      = {\S11.3: the Siegel-Walfisz theorem and its ineffective constant}
}

Prerequisites

21.13.01
21.12.02
21.03.02

Tier anchors

beginner: Derbyshire 2003 *Prime Obsession* (Joseph Henry Press) Ch. 21-22 (primes sorted by remainder, told informally for the character-twisted prime counts); Soundararajan's expository notes on equidistribution of primes in residue classes for the cleanest first picture of $\psi(x; q, a) \approx x/\varphi(q)$
intermediate: Davenport 2000 *Multiplicative Number Theory* 3e (Springer GTM 74) §20, §22 (the orthogonality decomposition $\psi(x; q, a) = \varphi(q)^{-1}\sum_\chi \overline\chi(a)\psi(x, \chi)$, the contour proof for each $\psi(x, \chi)$, and the Siegel-Walfisz theorem with the $q \le (\log x)^A$ range); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* (Cambridge UP) §11.3
master: Davenport 2000 *Multiplicative Number Theory* 3e §20, §22 (the full PNT-in-AP and Siegel-Walfisz development, the exceptional-zero term, the $q \le (\log x)^A$ uniformity); Montgomery-Vaughan 2007 *Multiplicative Number Theory I* §11; Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium Publications 53) §5.9, §17 (Siegel-Walfisz and its role in the large sieve and Bombieri-Vinogradov); Walfisz 1936 *Math. Z.* 40; Siegel 1935 *Acta Arithmetica* 1 (the ineffective lower bound feeding the uniformity)

References

Davenport, H. — Multiplicative Number Theory · Springer Graduate Texts in Mathematics 74, 3rd edition (2000), revised by H. L. Montgomery. §19-22. Develops the prime number theorem in arithmetic progressions: the orthogonality relation for Dirichlet characters gives $\psi(x; q, a) = \varphi(q)^{-1}\sum_{\chi \bmod q}\overline\chi(a)\psi(x, \chi)$; the contour proof of $\psi(x, \chi) = E_0(\chi)x - E_1(\chi)x^\beta/\beta + O(\ldots)$ for each $\chi$ uses the explicit formula and the zero-free region of §14; and assembling these with Siegel's ineffective bound on the exceptional zero yields the Siegel-Walfisz theorem $\psi(x; q, a) = x/\varphi(q) + O(x\exp(-c\sqrt{\log x}))$ uniformly for $q \le (\log x)^A$.
Walfisz, A. — Zur additiven Zahlentheorie II · Mathematische Zeitschrift 40 (1936), 592-607. Combines Siegel's 1935 ineffective lower bound on $L(1, \chi)$ with the de la Vallée Poussin contour method to prove the prime number theorem in arithmetic progressions uniformly in the modulus $q$ up to any fixed power $(\log x)^A$ of the logarithm, with the de la Vallée Poussin error term $O(x\exp(-c_A\sqrt{\log x}))$ and an ineffective constant $c_A$ inherited from Siegel.
Siegel, C. L. — Über die Classenzahl quadratischer Zahlkörper · Acta Arithmetica 1 (1935), 83-86. Proves $L(1, \chi) \gg_\varepsilon q^{-\varepsilon}$ for real primitive $\chi$ modulo $q$, equivalently $1 - \beta \gg_\varepsilon q^{-\varepsilon}$ for the exceptional zero, with an ineffective implied constant. This is the input that controls the exceptional-zero contribution uniformly in $q$ and converts the conditional PNT-in-AP into the unconditional Siegel-Walfisz statement on the range $q \le (\log x)^A$.
Iwaniec, H. & Kowalski, E. — Analytic Number Theory · AMS Colloquium Publications 53 (2004), §5.9 and Ch. 17. The Siegel-Walfisz theorem, the role of the exceptional zero in PNT in arithmetic progressions, the limitation $q \le (\log x)^A$, and the way the large sieve and the Bombieri-Vinogradov theorem recover uniformity up to $q \le x^{1/2-\varepsilon}$ on average over moduli.
Montgomery, H. L. & Vaughan, R. C. — Multiplicative Number Theory I: Classical Theory · Cambridge Studies in Advanced Mathematics 97 (2007), §11.3. The orthogonality decomposition, the explicit formula for $\psi(x, \chi)$, the exceptional-zero term, and the Siegel-Walfisz theorem with its ineffective constant, presented as the gateway to the large sieve of Ch. 9 and the Bombieri-Vinogradov theorem.

Estimated time

beginner: 18m
intermediate: 47m
master: 90m