21.15.02 · number-theory / exponential-sums

Weyl Sums, Weyl Differencing, and Equidistribution

shipped · 3 tiers · Lean: none

Anchor (Master): Weyl 1916 *Math. Ann.* 77, 313-352 (Über die Gleichverteilung von Zahlen mod. Eins — the differencing method, the bound for polynomial Weyl sums, and the equidistribution of $(\alpha n^k)$); Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 8 (Weyl sums, van der Corput's method, the major/minor-arc dichotomy); Kuipers-Niederreiter 1974 *Uniform Distribution of Sequences* (Wiley) Ch. 1-2 (Weyl's criterion, discrepancy, the Erdős-Turán inequality, van der Corput's difference theorem); Montgomery 1994 *Ten Lectures* (CBMS 84) Ch. 1 (Weyl sums and the distribution of $\{n^2\alpha\}$); Vaughan 1997 *The Hardy-Littlewood Method* (Cambridge Tracts 125, 2nd ed.) Ch. 2 (Weyl's inequality and Hua's lemma in the circle method)

Intuition Beginner

Take an irrational number, say $\sqrt{2}$, and look at its multiples: $\sqrt{2}, 2\sqrt{2}, 3\sqrt{2}, \dots$. Throw away the whole-number part of each, keeping only what is left after the decimal point. You get a sequence of numbers between $0$ and $1$. Weyl's equidistribution theorem says these leftover parts spread out perfectly evenly: in the long run, the fraction that lands in any sub-interval approaches the length of that interval, with no clumping and no gaps. The same holds for the squares $\sqrt{2}, 4\sqrt{2}, 9\sqrt{2}, \dots$, for the cubes, and for any polynomial in $n$ as long as at least one coefficient other than the constant term is irrational.

How do you prove a sequence spreads out evenly? Weyl's trick is to stop counting points in boxes and instead measure waves. For each whole number of cycles, wrap the unit interval into a circle, place a point for each term of the sequence, and add up the resulting arrows. If the points are spread evenly, the arrows point in all directions and cancel; their sum is tiny compared to the number of points. If the points clump, the arrows reinforce and the sum stays large. So even spreading becomes the single clean statement: every wave-sum is small.

For polynomial sequences the wave-sums are hard to add directly, so Weyl found a way to lower the difficulty. Squaring a sum lets you compare each term with a shifted copy of itself; the difference of two values of a degree- $k$ polynomial is a polynomial of degree one less. Repeating this peels the polynomial down to a straight line, which anyone can sum. This step-down is Weyl differencing, and it is the engine behind almost every bound on these sums.

Visual Beginner

Imagine the unit interval bent into a circle of circumference one. For a chosen whole number $h$ , you walk around the circle, and at step $n$ you mark the point reached by going $h$ times the fractional part of the $n$ -th term of your sequence. Each mark is an arrow of length one pointing from the center to that spot. The wave-sum is the single arrow you get by laying all these arrows tip to tail.

   evenly spread points          clumped points
        (arrows cancel)          (arrows reinforce)

         . | .                        . . .
       .   |   .                     . . . .
      .----+----.                    . . . .         -->  long
       .   |   .                       (all near                  total
         . | .                          one spot)                 arrow
     short total arrow

The picture shows the whole logic in one image: even spreading means short wave-sums, clumping means long ones. Weyl's criterion turns "the points are evenly spread" into "every wave-sum is short," and that is a thing you can actually compute.

Worked example Beginner

We watch the fractional parts of $n α$ begin to even out for $α = \sqrt{2} = 1.41421 \dots$, and we check one wave-sum by hand.

Step 1. List the first six fractional parts (the part after the decimal point) of $n \sqrt{2}$. We have $1 \cdot \sqrt{2} = 1.41421$ so the fractional part is $0.41421$; $2 \sqrt{2} = 2.82843$ gives $0.82843$; $3 \sqrt{2} = 4.24264$ gives $0.24264$; $4 \sqrt{2} = 5.65685$ gives $0.65685$; $5 \sqrt{2} = 7.07107$ gives $0.07107$; $6 \sqrt{2} = 8.48528$ gives $0.48528$.

Step 2. Sort them to see the spread: $0.071, 0.243, 0.414, 0.485, 0.657, 0.828$ . Even with only six points they already sample the whole interval from near $0$ to near $1$ without bunching. As you take more terms the gaps fill in more and more uniformly.

Step 3. Now test the wave-sum for one full cycle ($h = 1$). For each fractional part $t$, the arrow is the pair of numbers $(\cos (2 π t), \sin (2 π t))$. Adding the six arrows in the order of Step 1, the horizontal parts are $- 0.858, 0.473, 0.046, - 0.552, 0.902, - 0.996$, summing to $- 0.985$, and the vertical parts are $0.513, - 0.881, 0.999, - 0.834, 0.432, 0.092$, summing to about $0.322$.

Step 4. The total arrow has length about $\sqrt{0.985^{2} + 0.322^{2}} = \sqrt{0.970 + 0.104} \approx 1.04$. Six arrows of length one have collapsed to a total of length about $1.04$, far shorter than six. The cancellation is already visible.

What this tells us: the fractional parts of $n \sqrt{2}$ are spreading evenly, and the matching fact is that the wave-sum is small relative to the number of terms. With more points the ratio of wave-sum length to point count keeps shrinking toward zero, which is precisely Weyl's criterion in action.
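The hand computation above can be repeated for longer runs with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `frac` and `wave_sum` are illustrative and not part of the unit.

```python
import cmath
import math


def frac(x: float) -> float:
    """Fractional part of x, a number in [0, 1)."""
    return x - math.floor(x)


def wave_sum(alpha: float, n_terms: int, h: int = 1) -> complex:
    """Sum of the unit arrows for the fractional parts of n*alpha, n = 1..n_terms,
    wrapped h times around the circle."""
    return sum(cmath.exp(2j * math.pi * h * frac(n * alpha))
               for n in range(1, n_terms + 1))


alpha = math.sqrt(2)

# Step 1 of the worked example: the first six fractional parts.
parts = [round(frac(n * alpha), 5) for n in range(1, 7)]
print(parts)

# Steps 3-4, then the same wave-sum for longer runs: length and length / count.
for n_terms in (6, 100, 10_000):
    length = abs(wave_sum(alpha, n_terms))
    print(n_terms, round(length, 3), round(length / n_terms, 5))
```

For six terms the printed length agrees with the value of about $1.04$ found by hand, and the last column shrinks as the number of terms grows.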

Check your understanding Beginner

Exercise (easy, multiple choice).

Weyl's criterion says a sequence of numbers between $0$ and $1$ is spread out perfectly evenly exactly when:

A. Every term equals a rational number
B. The wave-sum (sum of unit arrows) for each whole number of cycles stays small compared to the number of terms
C. The terms are strictly increasing
D. The average of the terms equals one half

Hint

Even spreading was rephrased as cancellation: the arrows point every which way and their total is short.

Answer

B. Weyl's criterion replaces the geometric statement "every sub-interval gets its fair share of points" with the analytic statement "for every nonzero whole number of cycles, the wave-sum is small relative to the number of terms." Option A is false: rational multiples repeat and do not spread out. Option C describes monotonicity, unrelated to even spreading on the circle. Option D is a single average; an evenly spread sequence does have average near one half, but that one number is far too weak to guarantee even spreading at every scale.

Formal definition Intermediate+

Throughout write $e (t) = e^{2 π i t}$ and let $\{t\}$ denote the fractional part of $t$. The additive character $e (\cdot)$ is the same one used on $\mathbb{R} / \mathbb{Z}$ in 02.10.04 and 21.15.01.

Definition (uniform distribution mod $1$). A sequence $(x_{n})_{n \geq 1}$ of real numbers is uniformly distributed modulo $1$ (equidistributed) if for every subinterval $[a, b) \subseteq [0, 1)$, $$ \lim_{N\to\infty} \frac{1}{N}\,\#\{\, 1 \le n \le N : \{x_n\} \in [a,b) \,\} = b - a. $$ Equivalently, by approximating indicator functions, $\frac{1}{N} \sum_{n \leq N} g (\{x_{n}\}) \to \int_{0}^{1} g$ for every Riemann-integrable $g$ on $[0, 1]$.

Definition (Weyl sum). For a real-valued phase $f$ the associated Weyl sum of length $N$ is $$ S_N(f) = \sum_{n=1}^{N} e\big(f(n)\big). $$ When $f (x) = α_{k} x^{k} + \dots + α_{1} x + α_{0}$ is a polynomial of degree $k$ this is a polynomial Weyl sum; the case $f (x) = α x$ is the geometric (linear) sum, and $f (x) = α x^{2}$ is the prototypical quadratic Weyl sum.

Definition (discrepancy). The discrepancy of the first $N$ terms is $$ D_N = \sup_{0 \le a < b \le 1}\left| \frac{1}{N}\,\#\{\, n \le N : \{x_n\} \in [a,b) \,\} - (b-a)\right|. $$ A sequence is equidistributed iff $D_{N} \to 0$. The discrepancy is the quantitative refinement of equidistribution: it measures the worst-case deviation from the fair share $b - a$ across all subintervals.
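The supremum in the definition can be evaluated exactly for a finite list of points. The sketch below assumes the standard sorted-points formula for the extreme discrepancy (as found in the uniform-distribution literature, for instance Kuipers-Niederreiter); the function name `discrepancy` is illustrative.

```python
import math


def discrepancy(points):
    """Extreme discrepancy D_N of a finite list of reals, taken mod 1.

    Assumes the sorted-points formula
        D_N = 1/N + max_i (i/N - x_(i)) - min_i (i/N - x_(i)),
    where x_(1) <= ... <= x_(N) are the sorted fractional parts.
    """
    xs = sorted(x - math.floor(x) for x in points)
    n = len(xs)
    devs = [(i + 1) / n - x for i, x in enumerate(xs)]
    return 1 / n + max(devs) - min(devs)


alpha = math.sqrt(2)
for n_terms in (10, 100, 1000):
    d = discrepancy([n * alpha for n in range(1, n_terms + 1)])
    print(n_terms, round(d, 4))

# A fully clumped sequence has the largest possible discrepancy.
print(discrepancy([0.5] * 10))
```

The centred lattice $(i - \tfrac12)/N$ gives $D_{N} = 1/N$, the smallest value possible for $N$ points, while ten copies of the same point give $D_{N} = 1$.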

Definition (forward difference). For a function $f$ and a shift $h$ , the forward difference is $Δ_{h} f (x) = f (x + h) - f (x)$ . Iterating, $Δ_{h_{1}, \dots, h_{j}} f = Δ_{h_{j}} (\dots Δ_{h_{1}} f)$ . If $f$ is a polynomial of degree $k$ with leading coefficient $α_{k}$ , then $Δ_{h} f$ has degree $k - 1$ with leading coefficient $k α_{k} h$ ; applying $k - 1$ differences leaves a linear polynomial, and applying $k$ differences leaves the constant $k! α_{k} h_{1} \dots h_{k}$ . This degree-lowering is the structural fact Weyl differencing exploits.
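The degree-lowering can be checked symbolically on coefficient lists. This is a small sketch with exact rational arithmetic; the rational number $5/7$ stands in for $α_{k}$ purely so that the arithmetic is exact, and the function name `forward_difference` is illustrative.

```python
from fractions import Fraction
from math import comb


def forward_difference(coeffs, h):
    """Coefficients of Delta_h f(x) = f(x + h) - f(x).

    coeffs[j] is the coefficient of x**j in f. Expanding (x + h)**j by the
    binomial theorem, the coefficient of x**i in f(x + h) is
    sum_j coeffs[j] * C(j, i) * h**(j - i).
    """
    k = len(coeffs) - 1
    shifted = [sum(coeffs[j] * comb(j, i) * h ** (j - i) for j in range(i, k + 1))
               for i in range(k + 1)]
    diff = [s - c for s, c in zip(shifted, coeffs)]
    while len(diff) > 1 and diff[-1] == 0:
        diff.pop()  # the top-degree term cancels
    return diff


a = Fraction(5, 7)          # exact stand-in for the leading coefficient
f = [0, 0, 0, a]            # f(x) = a * x**3
for h in (2, 3, 5):         # three successive differences with shifts 2, 3, 5
    f = forward_difference(f, h)
    print(h, len(f) - 1, f[-1])   # shift, degree, leading coefficient
```

After the three differences the list holds the single constant $3! \cdot a \cdot 2 \cdot 3 \cdot 5 = 180 a$, matching the formula $k! α_{k} h_{1} \dots h_{k}$.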

Counterexamples to common slips

"Equidistribution mod $1$ is the same as density mod $1$." Density (the orbit closure being all of $[0, 1]$) is strictly weaker. The sequence of fractional parts of $\log n$ is dense but not equidistributed: $\log n$ grows so slowly that the proportion of $n \leq N$ with $\{\log n\}$ in a fixed interval keeps oscillating as $N$ grows and never settles at the interval's length. Equidistribution demands the correct proportion in every interval, not merely that every interval is visited.
"Weyl's criterion needs all real frequencies: $\frac{1}{N} \sum_{n \leq N} e (θ x_{n}) \to 0$ for every real $θ \neq 0$." Only integer frequencies $h \neq 0$ are tested. Testing all real $θ$ would be both unnecessary and false in general (for $x_{n} = n α$ and $θ = 1 / α$ every term equals $1$); the integer characters $e (h \cdot)$ already separate points on $\mathbb{R} / \mathbb{Z}$, so by the Weierstrass approximation theorem for trigonometric polynomials they suffice.
"$\{n α\}$ and $\{n^{2} α\}$ equidistribute for the same elementary reason." The linear case follows from a one-line geometric-series bound. The quadratic case has no such direct bound (the partial sums of $e (α n^{2})$ are not a geometric series) and genuinely requires differencing to reduce the square to a linear sum. Treating the quadratic case as a corollary of the linear one is the most common error.

Key theorem with proof Intermediate+

The signature result is Weyl's criterion together with its consequence for polynomial sequences; the differencing inequality is the tool that makes the polynomial case effective ^{[Weyl 1916]}.

Theorem (Weyl's criterion; Weyl 1916). A sequence $(x_{n})$ of reals is uniformly distributed mod $1$ if and only if $$ \frac{1}{N}\sum_{n=1}^{N} e(h\, x_n) \longrightarrow 0 \qquad (N\to\infty) $$ for every integer $h \neq 0$.

Proof. Equidistribution is equivalent, by the Riemann-integrability reformulation, to $\frac{1}{N} \sum_{n \leq N} g (\{x_{n}\}) \to \int_{0}^{1} g$ for every Riemann-integrable $g$ (applied to real and imaginary parts when $g$ is complex-valued). Taking $g (t) = e (h t)$ for $h \neq 0$ gives $\int_{0}^{1} e (h t) \, d t = 0$, so equidistribution forces the stated limit; this is the easy direction.

Conversely, suppose $\frac{1}{N} \sum_{n \leq N} e (h x_{n}) \to 0$ for all $h \neq 0$. For $h = 0$ the average is identically $1 = \int_{0}^{1} 1$. Hence $\frac{1}{N} \sum_{n \leq N} P (\{x_{n}\}) \to \int_{0}^{1} P$ for every trigonometric polynomial $P (t) = \sum_{|h| \leq H} c_{h} e (h t)$, by linearity. Let $g$ be real-valued and Riemann-integrable, and let $ε > 0$. Riemann integrability supplies continuous periodic functions below and above $g$ whose integrals differ by less than $ε / 2$, and the Weierstrass approximation theorem on the circle approximates each of them uniformly by a trigonometric polynomial; so there are real trigonometric polynomials $P_{-}, P_{+}$ with $P_{-} \leq g \leq P_{+}$ and $\int_{0}^{1} (P_{+} - P_{-}) < ε$. Then $$ \frac1N\sum_{n\le N} P_-(\{x_n\}) \le \frac1N\sum_{n\le N} g(\{x_n\}) \le \frac1N\sum_{n\le N} P_+(\{x_n\}), $$ and letting $N \to \infty$ the outer averages converge to $\int_{0}^{1} P_{\pm}$, which sandwich $\int_{0}^{1} g$ within $ε$. As $ε$ is arbitrary, $\frac{1}{N} \sum_{n \leq N} g (\{x_{n}\}) \to \int_{0}^{1} g$. Taking $g$ the indicator of $[a, b)$ (Riemann-integrable, its boundary having measure zero) yields uniform distribution. $□$
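The criterion is easy to probe numerically. The sketch below compares the character averages of $(n \sqrt{2})$ with those of the periodic sequence $(n / 3)$; it is an illustration under the stated choice of sequences, not a proof, and the helper name `character_average` is hypothetical.

```python
import cmath
import math


def character_average(xs, h):
    """(1/N) * sum over the list of e(h * x), with e(t) = exp(2*pi*i*t)."""
    return sum(cmath.exp(2j * math.pi * h * x) for x in xs) / len(xs)


N = 5000
irrational = [n * math.sqrt(2) for n in range(1, N + 1)]   # equidistributed
rational = [n / 3 for n in range(1, N + 1)]                # only three values mod 1

for h in (1, 2, 3):
    print(h,
          round(abs(character_average(irrational, h)), 5),
          round(abs(character_average(rational, h)), 5))
```

For the rational sequence the average at $h = 3$ has modulus $1$, since $e (3 \cdot n / 3) = 1$ for every $n$: one integer frequency that fails to cancel is enough to rule out equidistribution.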

Lemma (Weyl differencing inequality). For any real-valued phase $f$ on the integers, the Weyl sum $S = \sum_{n = 1}^{N} e (f (n))$ satisfies $$ |S|^{2} \le N + 2\,\bigg|\sum_{h=1}^{N-1}\ \sum_{n=1}^{N-h} e\big(f(n+h) - f(n)\big)\bigg| \le \sum_{|h| < N}\ \bigg|\sum_{n \in I_h} e\big(\Delta_h f(n)\big)\bigg|, $$ where $I_{h}$ is the range of $n$ with $1 \leq n \leq N$ and $1 \leq n + h \leq N$.

Proof. Expand the square: $|S|^{2} = S \overline{S} = \sum_{m, n} e (f (m) - f (n))$. Substitute $m = n + h$ with $h$ ranging over $- (N - 1) \leq h \leq N - 1$; for fixed $h$, $n$ runs over those indices with $1 \leq n \leq N$ and $1 \leq n + h \leq N$, an interval $I_{h}$ of length $N - |h|$. Thus $|S|^{2} = \sum_{|h| < N} \sum_{n \in I_{h}} e (Δ_{h} f (n))$. The $h = 0$ term contributes $N$. The inner sums for $h$ and $- h$ are complex conjugates, so the terms with $h \neq 0$ add up to twice the real part of $\sum_{h = 1}^{N - 1} \sum_{n = 1}^{N - h} e (f (n + h) - f (n))$, which is at most twice its absolute value; this is the middle bound. Applying the triangle inequality to the sum over $h$, and using that the inner sums for $h$ and $- h$ have the same absolute value, gives the final form. $□$
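The identity at the heart of the proof, and the two bounds obtained from it by the triangle inequality, can be verified numerically for a sample phase. The following is a sketch with the assumed phase $f (n) = \sqrt{2} \, n^{2}$ and $N = 60$; all names are illustrative.

```python
import cmath
import math


def e(t):
    """The additive character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def difference_sums(f, N):
    """Map each shift h with |h| < N to T_h = sum_{n in I_h} e(f(n + h) - f(n))."""
    sums = {}
    for h in range(-(N - 1), N):
        lo, hi = max(1, 1 - h), min(N, N - h)   # I_h = {n : 1 <= n, n + h <= N}
        sums[h] = sum(e(f(n + h) - f(n)) for n in range(lo, hi + 1))
    return sums


alpha = math.sqrt(2)
N = 60


def f(n):
    return alpha * n * n


S = sum(e(f(n)) for n in range(1, N + 1))
T = difference_sums(f, N)

identity = sum(T.values())                               # equals |S|^2 exactly
middle = N + 2 * abs(sum(T[h] for h in range(1, N)))     # pair h with -h
termwise = sum(abs(t) for t in T.values())               # triangle inequality over h

print(round(abs(S) ** 2, 6), round(identity.real, 6), round(middle, 6), round(termwise, 6))
```

The first two printed numbers coincide up to rounding, and the remaining two are the successive upper bounds.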

Corollary (equidistribution of $(α n^{k})$ ). For irrational $α$ and any integer $k \geq 1$ , the sequence $x_{n} = α n^{k}$ is uniformly distributed mod $1$ .

Proof. By Weyl's criterion it suffices to show $\sum_{n \leq N} e (m α n^{k}) = o (N)$ for each fixed integer $m \neq 0$; write $β = m α$, still irrational. We prove the slightly stronger statement that the induction needs: for every real polynomial $f$ of degree $k$ with irrational leading coefficient $β$, the sum $S_{N} = \sum_{n \leq N} e (f (n))$ is $o (N)$. Induct on $k$. For $k = 1$, $f (n) = β n + γ$ and the geometric series gives $|S_{N}| = |\sum_{n \leq N} e (β n)| \leq \frac{1}{|\sin π β|} = O (1) = o (N)$ since $β \notin \mathbb{Z}$. For $k \geq 2$, differencing over all shifts $|h| < N$ is too lossy, because the shifts comparable to $N$ can only be bounded trivially. Use instead the truncated form of the differencing inequality (van der Corput's inequality ^{[Kuipers-Niederreiter 1974]}): for every integer $1 \leq H \leq N$, $$ |S_N|^2 \le \frac{N+H-1}{H}\Big(N + 2\sum_{h=1}^{H-1}\Big(1 - \frac{h}{H}\Big)\Big|\sum_{n=1}^{N-h} e(\Delta_h f(n))\Big|\Big). $$ For each fixed $h \geq 1$ the difference $Δ_{h} f$ is a polynomial in $n$ of degree $k - 1$ with leading coefficient $k β h$, which is irrational, so by the inductive hypothesis the inner sum is $o (N)$. With $H$ fixed there are only finitely many such sums, hence $$ |S_N|^2 \le \frac{N+H-1}{H}\big(N + o(N)\big), \qquad \limsup_{N\to\infty} \frac{|S_N|^2}{N^2} \le \frac{1}{H}. $$ As $H$ is arbitrary, $|S_{N}| = o (N)$, completing the induction. $□$
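A truncated variant of the differencing inequality, usually called van der Corput's inequality, uses only shifts $h < H$ and is what makes the passage to $o (N)$ work. The sketch below checks it numerically for the assumed phase $\sqrt{2} \, n^{2}$; the form of the inequality coded here is the standard one for unimodular terms, and the function names are illustrative.

```python
import cmath
import math


def e(t):
    return cmath.exp(2j * math.pi * t)


def weyl_sum(f, N):
    return sum(e(f(n)) for n in range(1, N + 1))


def truncated_bound(f, N, H):
    """Right side of van der Corput's inequality for unimodular terms:
    (N + H - 1)/H * (N + 2 * sum_{h=1}^{H-1} (1 - h/H) * |T_h|),
    with T_h = sum_{n=1}^{N-h} e(f(n + h) - f(n))."""
    total = float(N)
    for h in range(1, H):
        T_h = sum(e(f(n + h) - f(n)) for n in range(1, N - h + 1))
        total += 2 * (1 - h / H) * abs(T_h)
    return (N + H - 1) / H * total


alpha = math.sqrt(2)
N = 500


def f(n):
    return alpha * n * n


S = weyl_sum(f, N)
for H in (1, 5, 20):
    bound = truncated_bound(f, N, H)
    print(H, round(abs(S) ** 2 / N ** 2, 5), round(bound / N ** 2, 5))
```

For $H = 1$ the bound is the trivial $N^{2}$; as $H$ grows the normalised bound drops roughly like $1 / H$, which is the mechanism behind $|S_{N}| = o (N)$.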

Bridge. Weyl differencing builds toward the entire circle method, and it appears again in van der Corput's method below, where the same square-and-shift step is iterated with care for the lengths of the shift ranges. The polynomial case reduces to the linear case because the forward difference $Δ_{h}$ lowers the degree of a polynomial by exactly one while keeping the leading coefficient irrational, so the induction that the differencing lemma supplies bottoms out at the elementary geometric sum. The criterion and the inequality are two faces of one idea: the criterion says equidistribution is cancellation in $\sum e (h x_{n})$, and the differencing inequality is how one produces that cancellation for polynomial phases. The quantitative version, the Erdős-Turán inequality of the Advanced results, turns each differencing bound into an explicit discrepancy estimate. The major/minor-arc split that organises the circle method rests on the same observation: the differencing bound is strong precisely on the minor arcs, where $α$ is poorly approximated by rationals with small denominator.

Exercises Intermediate+

Exercise 4 (medium, symbolic).

Using the differencing inequality, show that the quadratic Weyl sum $S = \sum_{n \leq N} e (α n^{2})$ satisfies $|S|^{2} \leq \sum_{|h| < N} \big| \sum_{n \in I_{h}} e (2 α h n) \big|$, and identify the inner sum as a linear (geometric) sum.

Hint

Compute $Δ_{h} f (n)$ for $f (n) = α n^{2}$ and read off its $n$ -dependence.

Answer

With $f (n) = α n^{2}$, the forward difference is $$ \Delta_h f(n) = \alpha\big((n+h)^2 - n^2\big) = \alpha(2hn + h^2) = 2\alpha h\, n + \alpha h^2. $$ The constant term $α h^{2}$ contributes a unimodular factor $e (α h^{2})$ that pulls out of the inner sum and does not affect its absolute value, so $|\sum_{n \in I_{h}} e (Δ_{h} f (n))| = |\sum_{n \in I_{h}} e (2 α h n)|$. Substituting into the differencing lemma gives $|S|^{2} \leq \sum_{|h| < N} |\sum_{n \in I_{h}} e (2 α h n)|$. The inner sum is a geometric sum with ratio $e (2 α h)$, bounded in absolute value by $\min (|I_{h}|, \frac{1}{2 \|2 α h\|})$, where $\|x\|$ is the distance from $x$ to the nearest integer. When $2 α h$ is far from the integers the bound $\frac{1}{2 \|2 α h\|}$ delivers cancellation; this is the mechanism by which the quadratic sum is controlled.
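The geometric-sum bound used in this answer can be checked directly. A minimal sketch, assuming $α = \sqrt{2}$ and inner sums of length $100$; the helper names are illustrative.

```python
import cmath
import math


def dist_to_nearest_int(x):
    """The quantity ||x||: distance from x to the nearest integer."""
    return abs(x - round(x))


def geometric_sum(theta, length):
    """Absolute value of sum_{n=1}^{length} e(theta * n)."""
    return abs(sum(cmath.exp(2j * math.pi * theta * n) for n in range(1, length + 1)))


def geometric_bound(theta, length):
    """min(length, 1 / (2 * ||theta||)), the bound quoted in the answer."""
    d = dist_to_nearest_int(theta)
    return length if d == 0 else min(length, 1 / (2 * d))


alpha = math.sqrt(2)
length = 100
for h in (1, 2, 3, 5, 12):
    theta = 2 * alpha * h
    print(h, round(geometric_sum(theta, length), 4), round(geometric_bound(theta, length), 4))
```

The shift $h = 12$ is worth a look: $24 \sqrt{2}$ lies close to $34$, so the bound is much weaker there than for the other shifts, a small-scale instance of the major-arc phenomenon.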

Exercise 6 (medium, symbolic).

State van der Corput's difference theorem and use it to give a second proof that $(α n^{2})$ is equidistributed for irrational $α$ , without computing exponential sums directly.

Hint

Van der Corput: if $(x_{n + h} - x_{n})$ is equidistributed for every fixed $h \geq 1$ , then $(x_{n})$ is equidistributed. Take $x_{n} = α n^{2}$ and compute the difference sequence.

Answer

Van der Corput's difference theorem. If for every integer $h \geq 1$ the sequence $(x_{n + h} - x_{n})_{n \geq 1}$ is uniformly distributed mod $1$ , then $(x_{n})$ is itself uniformly distributed mod $1$ .

Application. Let $x_{n} = α n^{2}$. For fixed $h \geq 1$, $$ x_{n+h} - x_n = \alpha\big((n+h)^2 - n^2\big) = 2\alpha h\, n + \alpha h^2, $$ a linear sequence in $n$ with slope $2 α h$. Since $α$ is irrational and $h \geq 1$, the slope $2 α h$ is irrational, so $(x_{n + h} - x_{n})$ is a shifted copy of the equidistributed sequence $(2 α h n)$; the additive constant $α h^{2}$ is a fixed translation that preserves equidistribution. Hence every difference sequence is equidistributed, and van der Corput's theorem gives equidistribution of $(α n^{2})$. The theorem is the qualitative shadow of the differencing inequality: both reduce a sequence to its difference sequences, one for distributions and one for sums.

Exercise 7 (hard, symbolic).

Show that the Erdős-Turán inequality, in the form $D_{N} \leq \frac{3}{H + 1} + \frac{3}{N} \sum_{h = 1}^{H} \frac{1}{h} \big| \sum_{n \leq N} e (h x_{n}) \big|$, is consistent with Weyl's criterion: deduce from it that if $\frac{1}{N} \sum_{n \leq N} e (h x_{n}) \to 0$ for every fixed $h \neq 0$, then $D_{N} \to 0$. (You may take the Erdős-Turán inequality as given.)

Hint

Fix $ε$ , choose $H$ so the first term is small, then let $N \to \infty$ with $H$ fixed so each of the finitely many exponential-sum terms vanishes.

Answer

Let $ε > 0$. Choose $H = H (ε)$ with $\frac{3}{H + 1} < ε / 2$. With $H$ now fixed, the Erdős-Turán inequality reads $$ D_N \le \frac{\varepsilon}{2} + \frac{3}{N}\sum_{h=1}^{H}\frac1h\Big|\sum_{n\le N} e(h x_n)\Big| = \frac{\varepsilon}{2} + 3\sum_{h=1}^{H}\frac1h\Big|\frac1N\sum_{n\le N} e(h x_n)\Big|. $$ The sum over $h$ is finite (only $H$ terms), and by hypothesis each $\frac{1}{N} \sum_{n \leq N} e (h x_{n}) \to 0$ as $N \to \infty$. Hence there is $N_{0}$ such that for $N \geq N_{0}$ the whole finite sum satisfies $3 \sum_{h \leq H} \frac{1}{h} \big| \frac{1}{N} \sum_{n \leq N} e (h x_{n}) \big| < ε / 2$. For such $N$, $D_{N} < ε$. As $ε$ was arbitrary, $D_{N} \to 0$, recovering equidistribution. The inequality is strictly stronger than Weyl's criterion: feeding in a rate of decay for the exponential sums yields a rate of decay for $D_{N}$, which the qualitative criterion cannot give. For $x_{n} = n α$ with $α$ badly approximable, it yields $D_{N} \ll N^{- 1} (\log N)^{2}$; the sharp order $N^{- 1} \log N$ needs the continued-fraction expansion of $α$ rather than Erdős-Turán alone.
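The trade-off between the two terms of the Erdős-Turán bound is visible numerically. The sketch below uses the constants exactly as stated in the exercise and compares the bound with the true discrepancy of $(n \sqrt{2})$; the discrepancy routine assumes the standard sorted-points formula, and all names are illustrative.

```python
import cmath
import math


def discrepancy(points):
    """Extreme discrepancy via the sorted-points formula
    D_N = 1/N + max_i (i/N - x_(i)) - min_i (i/N - x_(i))."""
    xs = sorted(x - math.floor(x) for x in points)
    n = len(xs)
    devs = [(i + 1) / n - x for i, x in enumerate(xs)]
    return 1 / n + max(devs) - min(devs)


def erdos_turan_bound(points, H):
    """3/(H+1) + (3/N) * sum_{h=1}^{H} (1/h) * |sum_n e(h x_n)|,
    with the constants used in the exercise statement."""
    N = len(points)
    total = 0.0
    for h in range(1, H + 1):
        s = sum(cmath.exp(2j * math.pi * h * x) for x in points)
        total += abs(s) / h
    return 3 / (H + 1) + 3 * total / N


N = 1000
xs = [n * math.sqrt(2) for n in range(1, N + 1)]
d = discrepancy(xs)
for H in (5, 20, 80):
    print(H, round(d, 5), round(erdos_turan_bound(xs, H), 5))
```

Raising $H$ shrinks the first term $3 / (H + 1)$ but adds more exponential sums to the second, which is why $H$ is chosen first and $N$ is sent to infinity afterwards.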

Exercise 8 (hard, symbolic).

Carry out the full induction for the equidistribution of $(α n^{k})$ in the case $k = 3$ , making explicit how the differencing inequality reduces the cubic sum to quadratic sums and then to linear sums, and where the $o (N^{2})$ control comes from.

Hint

Difference once to get quadratic phases $Δ_{h} f$ , invoke the (already proved) $k = 2$ result for each fixed $h$ , and handle the average over $h$ by splitting the $h$ -range.

Answer

Fix $h_{0} \neq 0$ and set $β = h_{0} α$ (irrational); let $S_{N} = \sum_{n \leq N} e (β n^{3})$ and $g (n) = β n^{3}$. Differencing over the full range $|h| < N$ is too weak on its own: the shifts with $|h|$ comparable to $N$ can only be bounded trivially, which already costs $O (N^{2})$. Use instead the truncated form of the differencing inequality (van der Corput's inequality): for any fixed integer $H$ with $1 \leq H \leq N$, $$ |S_N|^2 \le \frac{N+H-1}{H}\Big(N + 2\sum_{h=1}^{H-1}\Big|\sum_{n=1}^{N-h} e(\Delta_h g(n))\Big|\Big), $$ where $Δ_{h} g (n) = β ((n + h)^{3} - n^{3}) = 3 β h n^{2} + 3 β h^{2} n + β h^{3}$ is a quadratic in $n$ with leading coefficient $3 β h$, irrational for $h \neq 0$. By the $k = 2$ case, applied to quadratics with irrational leading coefficient, each inner sum is $o (N)$ for fixed $h$: one further differencing with shift $j$ turns $Δ_{h} g$ into the linear phase $6 β h j \, n + \text{const}$, whose sum is $O (1)$ because $6 β h j$ is irrational. With $H$ fixed there are finitely many inner sums, so the bracket is $N + o (N)$ and $\limsup_{N \to \infty} |S_{N}|^{2} / N^{2} \leq 1 / H$. Since $H$ is arbitrary, $|S_{N}| = o (N)$. By Weyl's criterion (over all $h_{0} \neq 0$) the sequence $(α n^{3})$ is equidistributed. The pattern is uniform in $k$: each differencing drops the degree by one and squares the sum, so $k - 1$ differencings reduce to the linear geometric bound, at the cost of the saving exponent $2^{1 - k}$ recorded in Weyl's inequality.

Advanced results Master

Weyl's inequality and the saving exponent

The qualitative corollary that $(α n^{k})$ equidistributes becomes quantitative through Weyl's inequality ^{[Vaughan 1997]}. Let $f (x) = α x^{k} + α_{k - 1} x^{k - 1} + \dots + α_{1} x$ with $|α - a / q| \leq q^{- 2}$ and $\gcd (a, q) = 1$. Then $$ \sum_{n\le N} e(f(n)) \ll N^{1+\varepsilon}\Big(\frac1q + \frac1N + \frac{q}{N^k}\Big)^{1/K}, \qquad K = 2^{k-1}. $$ The proof is $k - 1$ iterations of the differencing inequality, each squaring the sum and lowering the degree, terminating at a linear sum estimated by $\min (N, \| \cdot \|^{- 1})$. The exponent $1 / K = 2^{1 - k}$ is the cumulative cost of the squarings: differencing is lossy, and the loss compounds geometrically in the degree. On a genuine minor arc, where $N \leq q \leq N^{k - 1}$, the bound reads $\sum_{n \leq N} e (f (n)) \ll N^{1 - 2^{1 - k} + ε}$, a power saving; for $N^{δ} \leq q \leq N^{k - δ}$ with $0 < δ < 1$ the saving is the smaller power $N^{- δ 2^{1 - k}}$. The companion estimate, Hua's lemma, bounds the $2^{k}$-th moment $\int_{0}^{1} |\sum_{n \leq N} e (α n^{k})|^{2^{k}} \, d α \ll N^{2^{k} - k + ε}$ and is what feeds Weyl's inequality into the circle method for Waring's problem.
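For $k = 2$ the moment in Hua's lemma can be computed exactly: by orthogonality of the characters, $\int_{0}^{1} |\sum_{n \leq N} e (α n^{2})|^{4} \, d α$ counts the solutions of $a^{2} + b^{2} = c^{2} + d^{2}$ with $1 \leq a, b, c, d \leq N$. The sketch below counts them by brute force, as an illustration only; the function name `fourth_moment` is hypothetical.

```python
from collections import Counter


def fourth_moment(N):
    """Number of solutions of a^2 + b^2 = c^2 + d^2 with 1 <= a, b, c, d <= N.

    This equals the integral over [0, 1] of |sum_{n <= N} e(alpha * n^2)|^4,
    because the integral of e(m * alpha) vanishes unless m = 0.
    """
    reps = Counter(a * a + b * b for a in range(1, N + 1) for b in range(1, N + 1))
    return sum(r * r for r in reps.values())


for N in (10, 50, 200):
    m = fourth_moment(N)
    # Compare with N^2 (the shape of Hua's bound) and N^3 (the trivial bound).
    print(N, m, round(m / N ** 2, 3), round(m / N ** 3, 5))
```

The ratio to $N^{2}$ grows only slowly with $N$, consistent with the $N^{2 + ε}$ of Hua's lemma, while the ratio to the trivial bound $N^{3}$ tends to zero.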

The major/minor-arc dichotomy

The phrase " $α$ well or poorly approximated by rationals" is made precise by the Farey dissection of $[0, 1]$ into major and minor arcs ^{[Iwaniec-Kowalski 2004]}. Fix a parameter $Q$ . The major arcs $M$ are short intervals around each rational $a / q$ with $q \leq Q$ ; the minor arcs $m$ are the rest. On the major arcs the Weyl sum is large but explicitly approximated by a main term (a product of a singular series and a singular integral); on the minor arcs Weyl's inequality forces $\sum_{n \leq N} e (f (n)) = o (N)$ , so their contribution to a circle-method integral is an error term. The entire Hardy-Littlewood method is the statement that $\int_{0}^{1} = \int_{M} + \int_{m}$ , with the major arcs giving the asymptotic main term and the minor arcs — controlled by Weyl differencing — giving a smaller error. Equidistribution of $(α n^{k})$ is the qualitative $Q \to \infty$ shadow: every fixed nonzero $h$ eventually lands $h α$ in a minor-arc regime relative to $N$ , where the sum is $o (N)$ .
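The rational approximations $a / q$ that decide how a frequency sits relative to the arcs can be generated by continued fractions, whose convergents satisfy the hypothesis $|α - a / q| \leq q^{- 2}$ of Weyl's inequality. A minimal sketch, working with the exact rational value of a floating-point number (so only the first several convergents are meaningful); the function name `convergents` is illustrative.

```python
from fractions import Fraction
import math


def convergents(x, count):
    """First `count` continued-fraction convergents a/q of the float x.

    The float is converted to its exact rational value, so the arithmetic is
    exact; each convergent satisfies |x - a/q| < 1/q**2.
    """
    exact = Fraction(x)
    p_prev, q_prev = 1, 0
    p, q = math.floor(exact), 1
    out = [Fraction(p, q)]
    rest = exact - p
    while len(out) < count and rest != 0:
        rest = 1 / rest
        a = math.floor(rest)                      # next partial quotient
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
        out.append(Fraction(p, q))
        rest -= a
    return out


alpha = math.sqrt(2)
for c in convergents(alpha, 8):
    q = c.denominator
    scaled_error = float(abs(Fraction(alpha) - c) * q * q)   # q^2 * |alpha - a/q|
    print(c, q, round(scaled_error, 4))
```

For a given length $N$, the size of the convergent denominators relative to $N$ indicates how much saving Weyl's inequality can deliver at that frequency.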

Van der Corput's method and discrepancy

Van der Corput's method refines differencing into two named operations on exponential sums with smooth (not necessarily polynomial) phases ^{[Kuipers-Niederreiter 1974]}. The $A$-process (Weyl-van der Corput) is exactly the differencing inequality, trading a sum of length $N$ for an average of difference sums. The $B$-process is Poisson summation 21.15.01 followed by stationary phase, replacing $\sum_{n} e (f (n))$ by a dual sum of the shape $\sum_{m} e (\pm f^{*} (m)) / \sqrt{|f'' (x_{m})|}$ over the stationary points $x_{m}$, where $f' (x_{m}) = m$. Alternating $A$ and $B$ processes produces the exponent pairs $(κ, λ)$ with $\sum_{n \sim N} e (f (n)) \ll L^{κ} N^{λ}$, where $L$ is the size of $f'$ on the range of summation; these are the backbone of bounds for $ζ (1 / 2 + i t)$ and the Dirichlet divisor problem. Quantitatively, the Erdős-Turán inequality ^{[Erdős Turán 1948]} converts any such bound into a discrepancy estimate: $D_{N} \ll \frac{1}{H} + \frac{1}{N} \sum_{h \leq H} \frac{1}{h} |\sum_{n \leq N} e (h x_{n})|$, so cancellation in finitely many Weyl sums caps the worst-case deviation from uniformity across all intervals at once.

Synthesis. Weyl's criterion, the differencing inequality, van der Corput's method, and the major/minor-arc dichotomy are one circle of ideas, and the bridge is the additive character $e (\cdot)$ : equidistribution is the vanishing of every character average, and every tool here is a way to produce that vanishing for an increasingly wide class of phases. The foundational reason the polynomial case works is that forward differencing lowers degree by one while keeping leading coefficients irrational, so the induction bottoms out at the geometric sum; this is exactly the $A$ -process, and the central insight is that it is dual to van der Corput's $B$ -process, which is Poisson summation from 21.15.01. Putting these together, the saving exponent $2^{1 - k}$ generalises the linear $O (1)$ bound up the degree ladder, the Erdős-Turán inequality generalises the qualitative criterion into a quantitative discrepancy bound, and the major/minor-arc split is dual to the partition of frequencies by Diophantine quality — the same partition that organises the Hardy-Littlewood circle method and, through the $B$ -process, ties Weyl sums to the Voronoi and Poisson summation formulae of 21.15.01.

Full proof set Master

Proposition 1 (degree reduction under forward differencing). If $f$ is a real polynomial of degree $k \geq 1$ with leading coefficient $α_{k}$, then for each $h \neq 0$ the difference $Δ_{h} f$ is a polynomial of degree exactly $k - 1$ with leading coefficient $k α_{k} h$; consequently $Δ_{h_{1}} \dots Δ_{h_{k}} f = k! α_{k} h_{1} \dots h_{k}$ is constant.

Proof. Write $f (x) = α_{k} x^{k} + (\text{lower order})$. Then $$ \Delta_h f(x) = \alpha_k\big((x+h)^k - x^k\big) + \Delta_h(\text{lower}) = \alpha_k\big(k h\, x^{k-1} + \tbinom{k}{2}h^2 x^{k-2} + \cdots\big) + \Delta_h(\text{lower}). $$ The top term $x^{k}$ cancels, leaving $k α_{k} h x^{k - 1}$ as the highest surviving power; the differenced lower-order terms have degree $\leq k - 2$, so they do not interfere with the degree-$(k - 1)$ leading coefficient. Thus $\deg Δ_{h} f = k - 1$ with leading coefficient $k α_{k} h \neq 0$. Iterating, each application lowers the degree by one and multiplies the leading coefficient by the next factor $(\text{current degree}) \cdot h_{j}$, so after $k$ steps the leading coefficient is $k (k - 1) \dots 1 \cdot α_{k} h_{1} \dots h_{k} = k! α_{k} h_{1} \dots h_{k}$ and the degree is $0$. $□$

Proposition 2 (irrationality propagates through differencing). If $α$ is irrational and $f (x) = α x^{k}$ , then for every choice of nonzero shifts $h_{1}, \dots, h_{k - 1}$ the iterated difference $Δ_{h_{1}} \dots Δ_{h_{k - 1}} f$ is a linear polynomial $c x + d$ with slope $c = k (k - 1) \dots 2 \cdot α h_{1} \dots h_{k - 1}$ , which is irrational.

Proof. By Proposition 1 applied $k - 1$ times, $Δ_{h_{1}} \dots Δ_{h_{k - 1}} f$ has degree $k - (k - 1) = 1$ and leading (slope) coefficient $$ c = k\cdot(k-1)\cdots 2 \cdot \alpha\, h_1 h_2 \cdots h_{k-1} = k!\,\alpha\, h_1\cdots h_{k-1}. $$ Each $h_{j}$ is a nonzero integer and the integer factor $k!$ is nonzero, so $c$ is a nonzero integer multiple of the irrational $α$, hence irrational. The constant term $d$ is real but irrelevant: $e (c n + d) = e (d) e (c n)$ and the unimodular factor $e (d)$ does not affect $|\sum_{n} e (c n + d)|$. Therefore the terminal linear sum is a geometric sum with ratio $e (c) \neq 1$, bounded by $\frac{1}{2 \|c\|} = O (1)$ uniformly in $N$. $□$

Proposition 3 (Weyl's criterion via unique ergodicity for $x_{n} = n α$). For irrational $α$, the rotation $T : x \mapsto x + α$ on $\mathbb{R} / \mathbb{Z}$ is uniquely ergodic, and consequently $(n α)$ is uniformly distributed mod $1$.

Proof. Let $μ$ be any $T$-invariant Borel probability measure on $\mathbb{R} / \mathbb{Z}$. For its Fourier coefficients $\hat{μ} (h) = \int e (- h x) \, d μ (x)$, invariance under $T$ gives $\hat{μ} (h) = \int e (- h (x + α)) \, d μ = e (- h α) \hat{μ} (h)$, so $(1 - e (- h α)) \hat{μ} (h) = 0$. For $h \neq 0$, irrationality of $α$ makes $e (- h α) \neq 1$, forcing $\hat{μ} (h) = 0$; thus $μ$ has the same Fourier coefficients as Lebesgue measure and equals it. Unique ergodicity gives, for every continuous $g$, the uniform convergence $\frac{1}{N} \sum_{n = 0}^{N - 1} g (T^{n} x) \to \int g \, d μ = \int_{0}^{1} g$ for every starting point $x$, in particular $x = 0$: $\frac{1}{N} \sum_{n < N} g (\{n α\}) \to \int_{0}^{1} g$. Specialising $g (t) = e (h t)$ recovers $\frac{1}{N} \sum_{n < N} e (h n α) \to 0$ for $h \neq 0$, which is Weyl's criterion; equidistribution follows. This is the dynamical proof, complementary to the geometric-series proof, and it is the route Mathlib's uniform-distribution development takes. $□$

Connections Master

Weyl differencing is the $A$-process counterpart of the van der Corput $B$-process, which is Poisson summation followed by stationary phase; the dual sum the $B$-process produces is exactly the transform side of the Poisson and Voronoi formulae developed in 21.15.01, so the two units are the $A$-process and $B$-process halves of one method for estimating $\sum_{n} e (f (n))$.

The Weyl sum machinery rests on the additive character $e (\cdot)$ and the Fourier-analytic duality between $\mathbb{R} / \mathbb{Z}$ and $\mathbb{Z}$ formalised in 02.10.04: Weyl's criterion is the statement that the integer characters separate invariant measures, and the differencing inequality is a Cauchy-Schwarz manipulation of these characters.

The minor-arc bounds delivered by Weyl's inequality are the input to the Hardy-Littlewood circle method, where they appear again alongside the complete exponential sums of 21.15.04; the Gauss and Kloosterman sums there govern the major-arc main terms while Weyl differencing governs the minor-arc error, so the two exponential-sum units partition the circle method between them.

Historical & philosophical context Master

Hermann Weyl's 1916 paper in the Mathematische Annalen ^{[Weyl 1916]} introduced all three pillars of this unit at once: the equidistribution criterion, the differencing method for polynomial phases, and the resulting theorem that $(α n^{k})$ is uniformly distributed for irrational $α$ . The paper grew out of work on the foundations of analysis and on almost-periodic phenomena; the equidistribution of $(α n)$ had been observed independently by Bohl, Sierpiński, and Weyl around 1909-1910, but the polynomial case and the criterion were Weyl's. The differencing idea — squaring a sum to compare a term with its shifts — became one of the most-reused tools in analytic number theory.

Johannes van der Corput systematised the method in the 1920s into the $A$ - and $B$ -processes and the theory of exponent pairs, and his difference theorem (a sequence is equidistributed if all its difference sequences are) gave the qualitative form. The quantitative bridge from exponential-sum cancellation to discrepancy is the inequality of Paul Erdős and Pál Turán ^{[Erdős Turán 1948]}, published in two parts in the Indagationes Mathematicae of 1948. Weyl's inequality entered the Hardy-Littlewood circle method as the standard minor-arc estimate, codified in Vaughan's tract ^{[Vaughan 1997]}; the modern refinements — Vinogradov's mean value theorem and its resolution by Bourgain-Demeter-Guth via decoupling in 2016 — sharpen the saving exponent far beyond Weyl's $2^{1 - k}$ for large $k$ , but the differencing reduction remains the conceptual starting point.

Bibliography Master

@article{weyl1916,
  author  = {Weyl, Hermann},
  title   = {{\"U}ber die Gleichverteilung von Zahlen mod. Eins},
  journal = {Mathematische Annalen},
  volume  = {77},
  number  = {3},
  pages   = {313--352},
  year    = {1916}
}

@article{erdosturan1948,
  author  = {Erd{\H o}s, Paul and Tur{\'a}n, P{\'a}l},
  title   = {On a problem in the theory of uniform distribution},
  journal = {Indagationes Mathematicae},
  volume  = {10},
  pages   = {370--378, 406--413},
  year    = {1948}
}

@book{kuipersniederreiter1974,
  author    = {Kuipers, Lauwerens and Niederreiter, Harald},
  title     = {Uniform Distribution of Sequences},
  series    = {Pure and Applied Mathematics},
  publisher = {Wiley-Interscience},
  year      = {1974}
}

@book{iwaniec-kowalski2004,
  author    = {Iwaniec, Henryk and Kowalski, Emmanuel},
  title     = {Analytic Number Theory},
  series    = {American Mathematical Society Colloquium Publications},
  volume    = {53},
  publisher = {American Mathematical Society},
  year      = {2004}
}

@book{vaughan1997,
  author    = {Vaughan, Robert C.},
  title     = {The Hardy-Littlewood Method},
  series    = {Cambridge Tracts in Mathematics},
  volume    = {125},
  edition   = {2},
  publisher = {Cambridge University Press},
  year      = {1997}
}

@book{montgomery1994,
  author    = {Montgomery, Hugh L.},
  title     = {Ten Lectures on the Interface Between Analytic Number Theory and Harmonic Analysis},
  series    = {CBMS Regional Conference Series in Mathematics},
  volume    = {84},
  publisher = {American Mathematical Society},
  year      = {1994}
}

Prerequisites

21.15.01
02.10.04

Tier anchors

beginner: Stein-Shakarchi 2003 *Fourier Analysis: An Introduction* (Princeton Lectures in Analysis I) Ch. 4 §2 (equidistribution and Weyl's criterion); Körner 1988 *Fourier Analysis* (Cambridge) Ch. 2-3 (Weyl's equidistribution theorem informally)
intermediate: Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 8 §8.2 (Weyl sums and Weyl's differencing method); Stein-Shakarchi 2003 *Fourier Analysis* Ch. 4 (Weyl's criterion and equidistribution of $\{n\alpha\}$ and $\{n^2\alpha\}$); Montgomery 1994 *Ten Lectures on the Interface Between Analytic Number Theory and Harmonic Analysis* (CBMS 84) Ch. 1 §1
master: Weyl 1916 *Math. Ann.* 77, 313-352 (Über die Gleichverteilung von Zahlen mod. Eins — the differencing method, the bound for polynomial Weyl sums, and the equidistribution of $(\alpha n^k)$); Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 8 (Weyl sums, van der Corput's method, the major/minor-arc dichotomy); Kuipers-Niederreiter 1974 *Uniform Distribution of Sequences* (Wiley) Ch. 1-2 (Weyl's criterion, discrepancy, the Erdős-Turán inequality, van der Corput's difference theorem); Montgomery 1994 *Ten Lectures* (CBMS 84) Ch. 1 (Weyl sums and the distribution of $\{n^2\alpha\}$); Vaughan 1997 *The Hardy-Littlewood Method* (Cambridge Tracts 125, 2nd ed.) Ch. 2 (Weyl's inequality and Hua's lemma in the circle method)

References

Weyl, H. — Über die Gleichverteilung von Zahlen mod. Eins · *Mathematische Annalen* 77 (1916), 313-352. Introduces the equidistribution criterion (a sequence $(x_n)$ is uniformly distributed mod $1$ iff $\sum_{n\le N} e(h x_n) = o(N)$ for every nonzero integer $h$), the differencing method for polynomial exponential sums $\sum_{n\le N} e(f(n))$, and the resulting equidistribution of $(\alpha n^k)$ for irrational $\alpha$.
Iwaniec, H. & Kowalski, E. — Analytic Number Theory · American Mathematical Society Colloquium Publications 53 (2004), Ch. 8. Weyl sums $\sum_{n\le N} e(f(n))$ for polynomial phases, Weyl's differencing inequality reducing a degree-$k$ sum to averages of degree-$(k-1)$ sums, the Weyl bound, van der Corput's method (the $A$- and $B$-processes), and the major/minor-arc dichotomy underlying the circle method.
Kuipers, L. & Niederreiter, H. — Uniform Distribution of Sequences · Pure and Applied Mathematics, Wiley-Interscience (1974), Ch. 1-2. Weyl's criterion, the discrepancy $D_N$ of a sequence, the Erdős-Turán inequality bounding $D_N$ by a finite sum of exponential sums, van der Corput's difference theorem, and the metric theory of uniform distribution.
Montgomery, H. L. — Ten Lectures on the Interface Between Analytic Number Theory and Harmonic Analysis · CBMS Regional Conference Series in Mathematics 84, American Mathematical Society (1994), Ch. 1. Weyl sums, the distribution of $\{n^2\alpha\}$, the large sieve, and the role of the major and minor arcs in bounding exponential sums.
Vaughan, R. C. — The Hardy-Littlewood Method · Cambridge Tracts in Mathematics 125, 2nd edition, Cambridge University Press (1997), Ch. 2. Weyl's inequality $\sum_{n\le N} e(\alpha n^k) \ll N^{1+\varepsilon}(q^{-1} + N^{-1} + q N^{-k})^{2^{1-k}}$ for coprime integers $a, q$ with $q \ge 1$ and $|\alpha - a/q| \le q^{-2}$, Hua's lemma, and their use in Waring's problem and the Goldbach problem.
Erdős, P. & Turán, P. — On a problem in the theory of uniform distribution · *Indagationes Mathematicae* 10 (1948), 370-378 and 406-413. The Erdős-Turán inequality $D_N \le C\big(\frac1H + \sum_{h=1}^{H}\frac1h |\frac1N\sum_{n\le N} e(h x_n)|\big)$, making Weyl's qualitative criterion quantitative by bounding the discrepancy in terms of finitely many exponential sums.

Estimated time

beginner: 20m
intermediate: 55m
master: 90m