21.15.05 · number-theory / exponential-sums

The Vinogradov Mean Value Theorem


Anchor (Master): Bourgain-Demeter-Guth 2016 *Annals of Mathematics* 184, 633-682 (Proof of the main conjecture in Vinogradov's mean value theorem for degrees higher than three, via $\ell^2$ decoupling for the moment curve); Wooley 2019 *Proc. London Math. Soc.* 118, 942-1016 (Nested efficient congruencing and the main conjecture); Demeter 2020 *Fourier Restriction, Decoupling, and Applications* (Cambridge Studies in Advanced Mathematics 184) Ch. 11-13 (decoupling for the moment curve and the Vinogradov system); Vinogradov 1935 *Trav. Inst. Math. Stekloff* 10 (the original mean value method); Iwaniec-Kowalski 2004 *Analytic Number Theory* (AMS Colloquium 53) Ch. 8

Intuition Beginner

Pick a length $N$ and write down every whole number from $1$ to $N$. Now choose $s$ of them (repeats allowed) for a left team and another $s$ for a right team. Ask the two teams to tie on a whole list of scores at once: the sum of the numbers must match, and the sum of their squares must match, and the sum of their cubes, all the way up to the sum of their $k$-th powers. How many ways can the two teams be picked so that every one of these tallies comes out equal? That count is the Vinogradov mean value $J_{s,k}(N)$, and the whole subject is the size of this number.

There is an easy way to force a tie: make the right team a rearrangement of the left team. Then every tally matches automatically, because you are adding the same numbers in a different order. These "diagonal" ties are cheap and plentiful, and they set a floor on $J_{s,k}(N)$. The deep question is whether there are many more ties than the diagonal ones: whether the simultaneous equations have lots of genuinely different solutions, or whether the diagonal solutions are essentially all of them.

Why care about a counting puzzle? Because this single count controls how much a high-degree wave-sum can fail to cancel. The wave-sums built from polynomial phases of degree $k$, which resisted van der Corput's method for large $k$, are controlled by this team-tying count, and knowing the count is small unlocks the sharpest bounds on those sums and on where the Riemann zeta function can grow.

Visual Beginner

Think of each number you pick as a point, and the matching conditions as a stack of balance scales: one scale for the plain sums, one for the squares, one for the cubes, and so on up to the $k$-th powers. A choice of two teams is a solution only when every scale balances at once.

   the system to satisfy (k balance conditions)

   sum of values     [ left ]======[ right ]   balanced?
   sum of squares     [ left ]======[ right ]   balanced?
   sum of cubes       [ left ]======[ right ]   balanced?
        ...                  ...        ...
   sum of k-th powers [ left ]======[ right ]   balanced?

   diagonal tie: right = a shuffle of left  -->  ALL scales balance for free
   extra ties:  genuinely different teams that still balance every scale

The picture makes the dichotomy visible: the diagonal ties cost nothing and there are about $N^s$ of them; the theorem asks whether the genuinely different ties are also held down near that floor (plus a second floor, coming from the larger pool of numbers, that takes over when $k$ is small relative to $s$). The main conjecture says yes: there are essentially no surprises beyond the two obvious floors.

Worked example Beginner

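We count solutions of the simplest interesting Vinogradov system by hand: $k = 2$ (match sums and sums of squares), $s = 2$ (two on each team), drawn from $\{1, 2, 3\}$, so $N = 3$. We want pairs $(x_1, x_2)$ and $(y_1, y_2)$ with $x_1 + x_2 = y_1 + y_2$ and $x_1^2 + x_2^2 = y_1^2 + y_2^2$.

Step 1. Fix the left team and compute its two tallies. For $(x_1, x_2)$ the pair of tallies is $(x_1 + x_2,\ x_1^2 + x_2^2)$ (sum, sum of squares). For example $(1, 2)$ gives $(3, 5)$, and $(2, 3)$ gives $(5, 13)$.

Step 2. List every left team and its tally pair. The $9$ ordered pairs from $\{1, 2, 3\}$ and their (sum, sum-of-squares): $(1,1) \mapsto (2,2)$; $(1,2) \mapsto (3,5)$; $(1,3) \mapsto (4,10)$; $(2,1) \mapsto (3,5)$; $(2,2) \mapsto (4,8)$; $(2,3) \mapsto (5,13)$; $(3,1) \mapsto (4,10)$; $(3,2) \mapsto (5,13)$; $(3,3) \mapsto (6,18)$.

Step 3. Group teams by their tally pair, since a solution pairs any two teams with the same pair. The groups are: $(2,2)$: one team; $(3,5)$: two teams $(1,2), (2,1)$; $(4,10)$: two teams $(1,3), (3,1)$; $(4,8)$: one team; $(5,13)$: two teams; $(6,18)$: one team.

Step 4. Count ordered solution pairs. A group of size $m$ contributes $m^2$ solutions (pick any left team, any right team from the group). So the total is $1 + 4 + 4 + 1 + 4 + 1 = 15$.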

What this tells us: $J_{2,2}(3) = 15$. Every one of these solutions is diagonal (right team a shuffle of left): a left team with two different numbers has $2! = 2$ matching right teams, a left team with a repeated number has one, and $6 \cdot 2 + 3 \cdot 1 = 15$. The only "extra" structure is that $(1,2)$ and $(2,1)$ share a tally pair, which is just the shuffle again. For $s = k = 2$ this matching is rigid: fixing the sum and the sum of squares fixes the unordered pair. That rigidity, made quantitative, is the content of the theorem.
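The hand count can be replayed by machine. The following is a minimal brute-force sketch, not an efficient algorithm; the function names `power_sums` and `vinogradov_count` are illustrative choices and do not come from the text. It groups teams by their tally vector and sums the squares of the group sizes, exactly as in Steps 2 to 4.

```python
from collections import Counter
from itertools import product


def power_sums(team, k):
    """Tally vector of a team: (sum, sum of squares, ..., sum of k-th powers)."""
    return tuple(sum(x ** j for x in team) for j in range(1, k + 1))


def vinogradov_count(s, k, N):
    """Count ordered (left, right) pairs of teams from {1..N}^s with equal tallies."""
    groups = Counter(
        power_sums(team, k) for team in product(range(1, N + 1), repeat=s)
    )
    # A group of m teams contributes m * m ordered (left, right) pairs.
    return sum(m * m for m in groups.values())


if __name__ == "__main__":
    print(vinogradov_count(2, 2, 3))  # 15
```

The enumeration visits all $N^s$ teams, so it is only practical for very small parameters; it is meant as a check on the worked example rather than a way to study the asymptotics.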

Check your understanding Beginner

Formal definition Intermediate+

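Throughout write $e(x) = e^{2\pi i x}$; the additive character is the one used on $\mathbb{R}/\mathbb{Z}$ in 21.15.02 and 21.15.03, and integrals are over the torus $\mathbb{T}^k = (\mathbb{R}/\mathbb{Z})^k$ with $d\boldsymbol\alpha = d\alpha_1 \cdots d\alpha_k$.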

Definition (the Vinogradov system and its count). Fix integers $s \ge 1$, $k \ge 1$, $N \ge 1$. The Vinogradov mean value $J_{s,k}(N)$ is the number of integer solutions of the simultaneous system $$ x_1^j + \cdots + x_s^j = y_1^j + \cdots + y_s^j \qquad (1 \le j \le k), \qquad 1 \le x_i, y_i \le N. $$ Equivalently, setting $f_k(\boldsymbol\alpha; N) = \sum_{1 \le x \le N} e(\alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_k x^k)$, $$ J_{s,k}(N) = \int_{\mathbb{T}^k} \big| f_k(\boldsymbol\alpha; N) \big|^{2s} \, d\boldsymbol\alpha, $$ since on expanding $|f_k|^{2s}$ and integrating term by term, the orthogonality relation $\int_0^1 e(\alpha m)\, d\alpha = \mathbf{1}_{m = 0}$ in each variable $\alpha_j$ picks out exactly the tuples satisfying all $k$ power-sum equations.

Definition (the two floors). Two families of solutions are forced. The diagonal solutions take $(y_1, \dots, y_s)$ to be a permutation of $(x_1, \dots, x_s)$; there are $\asymp N^s$ of these (choose the $x_i$ freely; generic tuples have $s!$ permutations). Independently, a counting argument forces $\gg N^{2s - k(k+1)/2}$ solutions, which is the larger term when $s > k(k+1)/2$: the $j$-th equation constrains a difference of size $O(N^j)$, so the $k$ equations together cut the $N^{2s}$ points of the box down by a factor of about $N^{1 + 2 + \cdots + k} = N^{k(k+1)/2}$. Hence the unconditional lower bound $$ J_{s,k}(N) \gg N^{s} + N^{2s - \frac{k(k+1)}{2}}. $$

Definition (the main conjecture). The main conjecture for Vinogradov's mean value theorem asserts that this lower bound is sharp up to factors of $N^{\varepsilon}$: for every $\varepsilon > 0$, $$ J_{s,k}(N) \ll_{\varepsilon} N^{s + \varepsilon} + N^{2s - \frac{k(k+1)}{2} + \varepsilon}, $$ uniformly in $N$ (the implied constant may depend on $s$ and $k$). The critical exponent is $s_k = k(k+1)/2$, where the two terms coincide at $N^{k(k+1)/2 + \varepsilon}$; the difficulty of the conjecture is concentrated at $s = s_k$, from which all other $s$ follow by Hölder's inequality and direct summation of the surplus variables.
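The reduction to the critical case can be written out in two lines. The following is a sketch of the standard argument, assuming the critical bound $J_{s_k,k}(N) \ll N^{s_k + \varepsilon}$ with $s_k = k(k+1)/2$, the integral form of $J_{s,k}(N)$ from the first definition, and the trivial bound $|f_k| \le N$; it uses only that the torus has total measure $1$.

```latex
% Case s <= s_k: Hoelder on a probability space, exponent s_k / s >= 1.
J_{s,k}(N) = \int_{\mathbb{T}^k} |f_k|^{2s}\, d\boldsymbol\alpha
  \le \Big( \int_{\mathbb{T}^k} |f_k|^{2 s_k}\, d\boldsymbol\alpha \Big)^{s/s_k}
  = J_{s_k,k}(N)^{s/s_k}
  \ll N^{(s_k + \varepsilon)\, s / s_k}
  \le N^{s + \varepsilon}.

% Case s >= s_k: bound the surplus 2(s - s_k) factors trivially by N.
J_{s,k}(N) = \int_{\mathbb{T}^k} |f_k|^{2(s - s_k)}\, |f_k|^{2 s_k}\, d\boldsymbol\alpha
  \le N^{2(s - s_k)}\, J_{s_k,k}(N)
  \ll N^{2s - s_k + \varepsilon}
  = N^{2s - \frac{k(k+1)}{2} + \varepsilon}.
```

In both cases the exponent obtained is the larger of the two terms in the conjecture, which is why the critical bound carries the whole statement.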

Counterexamples to common slips

  • "The main conjecture is a single inequality at one value of $s$." It is a family indexed by $s$, but the substance lives at the critical $s = s_k$. For $s \le s_k$ the first term dominates and the statement is $J_{s,k}(N) \ll N^{s + \varepsilon}$; for $s \ge s_k$ the second term dominates. Proving the critical case implies all others, so "the main conjecture" usually means the critical bound.

  • "Vinogradov's method always beats van der Corput." It beats van der Corput's exponent-pair bounds for high degree $k$, where the iterated $A$-processes lose too much. For small $k$ (notably for the zeta function on the critical line) the van der Corput and exponent-pair bounds of 21.15.03 remain competitive or superior; Vinogradov's strength is the uniformity of the saving as $k \to \infty$.

  • "$J_{s,k}(N) \ll N^{s + \varepsilon}$ holds for all $s$." Only for $s \le s_k$. Once $s$ exceeds $s_k$, the second floor $N^{2s - k(k+1)/2}$ overtakes $N^s$ and is genuinely larger; claiming $J_{s,k}(N) \ll N^{s + \varepsilon}$ there contradicts the lower bound. The two-term shape of the conjecture is essential, not cosmetic.
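The bookkeeping behind the first and third slips is just a comparison of two exponents. A small sketch, with hypothetical helper names (`critical_exponent`, `conjectured_exponent`, `dominant_floor`) introduced only for illustration, tabulates which floor is in force for a given pair $(s, k)$.

```python
def critical_exponent(k):
    """s_k = k(k+1)/2, the value of s at which the two floors coincide."""
    return k * (k + 1) // 2


def conjectured_exponent(s, k):
    """Exponent of N (ignoring epsilon) in the main conjecture: max of the two floors."""
    return max(s, 2 * s - critical_exponent(k))


def dominant_floor(s, k):
    """Name the floor that is larger for this (s, k)."""
    s_k = critical_exponent(k)
    if s < s_k:
        return "diagonal"
    if s > s_k:
        return "dimension"
    return "critical (both floors agree)"


if __name__ == "__main__":
    k = 3
    for s in range(1, 10):
        print(s, conjectured_exponent(s, k), dominant_floor(s, k))
```

For $k = 3$ the table switches from the diagonal floor to the dimension floor at $s = 6$, which is the point where claiming $N^{s + \varepsilon}$ would start to contradict the lower bound.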

Key theorem with proof Intermediate+

The signature result is the main conjecture itself, now a theorem; below is the statement together with the structure of the proof, and a complete proof of the orthogonality identity and the lower bound that frame it [Bourgain Demeter Guth 2016].

Theorem (Vinogradov main conjecture; Bourgain-Demeter-Guth 2016, Wooley 2019). For all integers $s \ge 1$, $k \ge 1$ and every $\varepsilon > 0$, $$ J_{s,k}(N) \ll_{k, s, \varepsilon} N^{s + \varepsilon} + N^{2s - \frac{k(k+1)}{2} + \varepsilon}. $$ In particular at the critical exponent $s = s_k = k(k+1)/2$ one has $J_{s_k,k}(N) \ll N^{s_k + \varepsilon}$.

Proof. The lower bound matching this is elementary and is proved below as Proposition 2; the content is the upper bound, established by two independent routes.

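The orthogonality framing. Expand $|f_k(\boldsymbol\alpha; N)|^{2s} = \sum_{\mathbf{x}, \mathbf{y}} e\big( \sum_{j=1}^{k} \alpha_j (x_1^j + \cdots + x_s^j - y_1^j - \cdots - y_s^j) \big)$ over $\mathbf{x}, \mathbf{y} \in [1, N]^s$. Integrating over $\mathbb{T}^k$ and using $\int_0^1 e(\alpha m)\, d\alpha = \mathbf{1}_{m = 0}$ in each coordinate kills every term except those with $\sum_i x_i^j = \sum_i y_i^j$ for all $1 \le j \le k$, i.e. counts exactly the solutions of the system. So $J_{s,k}(N) = \int_{\mathbb{T}^k} |f_k(\boldsymbol\alpha; N)|^{2s}\, d\boldsymbol\alpha$, reducing the arithmetic count to a $2s$-th moment of a Weyl sum of the type bounded in 21.15.02.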

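Decoupling route (Bourgain-Demeter-Guth). Partition $[0,1]$ into $N^{-1}$-arcs and consider the moment curve $\Gamma_k = \{(t, t^2, \dots, t^k) : t \in [0,1]\}$. The $\ell^2$ decoupling inequality for $\Gamma_k$ states that for any function $g$ with Fourier support in an $N^{-k}$-neighbourhood of $\Gamma_k$, decomposed as $g = \sum_\theta g_\theta$ over $N^{-1}$-caps $\theta$, $$ \| g \|_{L^p(\mathbb{R}^k)} \ll_\varepsilon N^\varepsilon \Big( \sum_\theta \|g_\theta\|_{L^p}^2 \Big)^{1/2}, \qquad p = k(k+1). $$ Specialising to an exponential sum over the caps and using $\|g_\theta\|_{L^p} \asymp 1$ (after normalisation) with $p = 2s_k = k(k+1)$ gives $J_{s_k,k}(N) \ll N^{s_k + \varepsilon}$: the $N$ caps contribute $(\sum_\theta 1)^{1/2} = N^{1/2}$, and raising to the power $p$ gives $N^{p/2} = N^{s_k}$. The inequality itself is proved by induction on the scale $N$, in which decoupling for the lower-dimensional curve $\Gamma_{k-1}$ feeds the induction at each scale.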

Efficient congruencing route (Wooley). Argue $p$-adically. Choosing a prime $p$, classify solutions by the congruence classes of the variables modulo powers of $p$; a solution of the full system forces strong congruence conditions, because a Vandermonde determinant in the variables (the same determinant that makes the system rigid) is divisible by a high power of $p$. This lets one bound $J_{s,k}(N)$ by a version of itself at a smaller scale with extra congruence weight, an efficient congruencing iteration in schematic form. Iterating drives the exponent to the conjectured value; the nested efficient congruencing of Wooley's 2019 paper organises the bookkeeping to reach the sharp bound for all $k$.

Bridge. The mean value theorem builds toward the sharpest Weyl-sum bounds and the Vinogradov-Korobov zero-free region, and it appears again in the application to Waring's problem below, where $J_{s,k}(N)$ is the moment that bounds the minor-arc contribution in the circle method of 21.15.02. The foundational reason the diagonal floor is so close to the truth is that the simultaneous power-sum equations are governed by a Vandermonde determinant: fixing the first $k$ power sums of a tuple of size at most $k$ fixes the tuple up to permutation, so genuinely non-diagonal solutions are forced to be sparse. This is exactly the rigidity that both proofs exploit: decoupling reads it as curvature of the moment curve $\Gamma_k$, efficient congruencing reads it as $p$-adic Vandermonde divisibility, and the two are dual descriptions of one geometric fact. The central insight is that a high-degree exponential sum's size is controlled by a counting problem, so the analytic question of cancellation becomes the arithmetic question of how many solutions the system has; the resolution of the main conjecture is thus simultaneously a statement in incidence geometry, in $p$-adic analysis, and in the theory of $\zeta(s)$.

Exercises Intermediate+

Advanced results Master

The critical exponent and the shape of the bound

The entire difficulty of the main conjecture concentrates at $s = s_k = k(k+1)/2$ [Demeter 2020]. For $s < s_k$ the conjecture reads $J_{s,k}(N) \ll N^{s + \varepsilon}$ (the diagonal floor), and for $s > s_k$ it reads $J_{s,k}(N) \ll N^{2s - k(k+1)/2 + \varepsilon}$ (the dimension floor); the critical case sits between them at $N^{s_k + \varepsilon}$, and Exercise 6 shows it implies the rest. Before the resolution, Vinogradov's original 1935 method and its refinements by Linnik, Karatsuba, and Stechkin reached bounds of the shape $J_{s,k}(N) \ll N^{2s - k(k+1)/2 + \eta}$ with $\eta = \eta(s,k) \to 0$ as $s \to \infty$ but $\eta > 0$ for fixed $s$ near critical: strong enough for the zero-free region but not the exact main conjecture. The gap was closed only in the mid-2010s, first for $k = 3$ by Wooley's efficient congruencing (2014), then for all $k \ge 4$ by Bourgain-Demeter-Guth's decoupling (2016), and again for all $k$ by Wooley's nested congruencing (2019).

Decoupling for the moment curve

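The Bourgain-Demeter-Guth proof factors through a statement in Euclidean harmonic analysis with no number theory in it [Bourgain Demeter Guth 2016]. The $\ell^2$ decoupling inequality for the moment curve $\Gamma_k = \{(t, t^2, \dots, t^k) : t \in [0,1]\}$ asserts that for $g$ Fourier-supported in the $\delta^k$-neighbourhood of $\Gamma_k$ and decomposed as $g = \sum_\theta g_\theta$ into pieces over $\delta$-caps $\theta$, $$ \| g \|_{L^p(\mathbb{R}^k)} \le C_\varepsilon \delta^{-\varepsilon} \Big( \sum_\theta \| g_\theta \|_{L^p(\mathbb{R}^k)}^2 \Big)^{1/2} $$ for all $2 \le p \le k(k+1)$, sharp at the endpoint $p = k(k+1)$. Taking each $g_\theta$ to be a single exponential and $\delta = N^{-1}$ recovers the critical case of the main conjecture. The proof runs by induction on scales, with decoupling for the lower-dimensional curve $\Gamma_{k-1}$ as an input. The curvature of $\Gamma_k$, that is, the non-vanishing of the torsion, equivalently of the Vandermonde determinant of the tangent, osculating, and higher flags, is the geometric input that makes decoupling hold, and it is the analytic counterpart of the arithmetic rigidity of Exercise 7.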

Efficient congruencing and the $p$-adic route

Wooley's method bypasses harmonic analysis entirely, working $p$-adically [Wooley 2012]. The idea is to count solutions by conditioning on the $p$-adic distances between the variables: solutions in which many variables are congruent modulo high powers of a prime $p$ are forced by a Vandermonde divisibility (the same determinant), and this lets the count at scale $N$ be bounded by a weighted count at a smaller scale of the form $N/p^{a}$, giving a self-improving recursion. Each iteration "efficiently congruences" a block of variables, trading scale for congruence concentration; the nested version tracks two interleaved blocks and reaches the sharp exponent for every $k$. Efficient congruencing also extends to systems over number fields and function fields where the decoupling machinery is harder to deploy, and it gives explicit bounds without an $N^{\varepsilon}$ loss in some ranges. The two proofs are now understood as parallel: decoupling is the archimedean and efficient congruencing the non-archimedean reading of the curvature/Vandermonde rigidity of the moment curve.

Consequences: Weyl sums, the zero-free region, and Waring's problem

The mean value theorem is a tool, and it has three classical payoffs [Iwaniec Kowalski 2004]. First, the Weyl-sum bound: for the high-degree minor-arc sum $\sum_{x \le N} e(\alpha x^k)$ one obtains a bound $N^{1 - \sigma_k + \varepsilon}$ with $\sigma_k \asymp k^{-2}$, beating the van der Corput exponent-pair and Weyl-differencing bounds of 21.15.03 and 21.15.02 for all large $k$. Second, the Vinogradov-Korobov zero-free region: feeding the resulting bounds for $\zeta(s)$ near the line $\sigma = 1$ into the classical zero-free-region argument gives that $\zeta(\sigma + it) \ne 0$ for $\sigma \ge 1 - c (\log |t|)^{-2/3} (\log\log |t|)^{-1/3}$ and $|t| \ge 3$, the widest known zero-free region and the source of the best unconditional error term in the prime number theorem. Third, Waring's problem: the mean value bound controls the minor arcs in the circle method for sums of $k$-th powers, yielding $G(k) \ll k \log k$ for the least $s = G(k)$ such that every large integer is a sum of $s$ positive $k$-th powers, while the resolution of the main conjecture gives the asymptotic formula in the circle method for $s$ of order $k^2$.

Synthesis. Vinogradov's mean value theorem, the decoupling inequality for the moment curve, efficient congruencing, and the Vinogradov-Korobov zero-free region are one circle of ideas, and the bridge is the rigidity of the moment curve: its curvature, equivalently the Vandermonde determinant of its flags. The foundational reason the count is pinned to its two floors is that fixing the first $k$ power sums of a short tuple fixes the tuple up to permutation (Exercise 7), and this is exactly the curvature that makes decoupling sharp and the $p$-adic Vandermonde divisibility that drives efficient congruencing; the two resolutions are dual readings, archimedean and non-archimedean, of one geometric fact. A high-degree exponential sum's cancellation is a counting problem, so the analytic Weyl-sum bound of 21.15.02 generalises into the arithmetic mean value, and the same bound that resolves the count delivers the sharpest Weyl bounds, the Vinogradov-Korobov region for $\zeta(s)$, and the near-optimal bounds in Waring's problem. This is exactly where the smooth-phase exponent-pair calculus of 21.15.03 runs out of strength and the polynomial-system viewpoint takes over: for high degree one should count solutions, not estimate sums directly, and the moment curve's geometry then does the rest.

Full proof set Master

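Proposition 1 (orthogonality and the mean value identity). With $f_k(\boldsymbol\alpha; N) = \sum_{1 \le x \le N} e(\alpha_1 x + \cdots + \alpha_k x^k)$, one has $J_{s,k}(N) = \int_{\mathbb{T}^k} |f_k(\boldsymbol\alpha; N)|^{2s}\, d\boldsymbol\alpha$.

Proof. Expanding $|f_k|^{2s} = f_k^{s}\, \overline{f_k}^{\,s}$ as a sum over $(\mathbf{x}, \mathbf{y}) \in [1,N]^{2s}$ gives the integrand $e\big( \sum_{j=1}^{k} \alpha_j D_j \big)$, where $D_j = \sum_{i=1}^{s} (x_i^j - y_i^j)$. Integration over $\mathbb{T}^k$ factors over the variables $\alpha_j$, and $\int_0^1 e(\alpha_j D_j)\, d\alpha_j = \mathbf{1}_{D_j = 0}$ with $D_j \in \mathbb{Z}$. The product of indicators equals $1$ exactly for the solutions of the system, so the integral counts them.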
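The identity can be checked numerically for tiny parameters. The sketch below replaces the integral over the torus by an average over a finite grid $\alpha_j = a_j / M_j$ with $M_j = s N^j$; this discretisation is an assumption of the sketch, not part of the text, and it is exact because each difference of $j$-th power sums has absolute value at most $s(N^j - 1) < M_j$, so a difference divisible by $M_j$ must vanish. The function names are illustrative.

```python
import cmath
from collections import Counter
from itertools import product
from math import prod


def weyl_sum(alpha, N):
    """f_k(alpha; N) = sum over x = 1..N of e(alpha_1 x + ... + alpha_k x^k)."""
    return sum(
        cmath.exp(2j * cmath.pi * sum(a * x ** j for j, a in enumerate(alpha, start=1)))
        for x in range(1, N + 1)
    )


def mean_value_by_orthogonality(s, k, N):
    """Average |f_k|^(2s) over the grid alpha_j = a_j / M_j with M_j = s * N^j."""
    moduli = [s * N ** j for j in range(1, k + 1)]
    total = 0.0
    for a in product(*(range(M) for M in moduli)):
        alpha = [a_j / M for a_j, M in zip(a, moduli)]
        total += abs(weyl_sum(alpha, N)) ** (2 * s)
    return total / prod(moduli)


def mean_value_by_counting(s, k, N):
    """Direct count of solutions: group tuples by their power-sum vector."""
    groups = Counter(
        tuple(sum(x ** j for x in xs) for j in range(1, k + 1))
        for xs in product(range(1, N + 1), repeat=s)
    )
    return sum(m * m for m in groups.values())


if __name__ == "__main__":
    print(mean_value_by_orthogonality(2, 2, 3), mean_value_by_counting(2, 2, 3))
```

The two numbers agree up to floating-point rounding. The grid has $\prod_j s N^j$ points, so this is a consistency check for small cases only.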

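Proposition 2 (the two-term lower bound). For $s, k \ge 1$, $J_{s,k}(N) \gg N^{s} + N^{2s - k(k+1)/2}$.

Proof. The diagonal contribution is Exercise 4: permutations of distinct tuples give $\gg N^s$ solutions. For the second term, suppose $s \ge k(k+1)/2$, since otherwise the first term already dominates. The map $(\mathbf{x}, \mathbf{y}) \mapsto \big( \sum_i (x_i^j - y_i^j) \big)_{1 \le j \le k}$ from $[1,N]^{2s}$ to $\mathbb{Z}^k$ takes values $\mathbf{d}$ with $|d_j| < s N^j$, so the number of possible target vectors is at most $\prod_{j=1}^{k} 2 s N^j = C N^{k(k+1)/2}$ with $C = (2s)^k$. By pigeonhole, some fibre (in fact the fibre over $\mathbf{0}$, by a standard averaging since $\mathbf{0}$ is the most populous value) has at least $$ \frac{N^{2s}}{C\,N^{k(k+1)/2}} \gg N^{2s - k(k+1)/2} $$ preimages, and the fibre over $\mathbf{0}$ is exactly the solution set counted by $J_{s,k}(N)$. (That $\mathbf{0}$ is the densest fibre follows from $L^2$-type positivity, or directly from the Cauchy-Schwarz / convexity bound $J_{s,k}(N) = \sum_{\mathbf{n}} r(\mathbf{n})^2 \ge \big( \sum_{\mathbf{n}} r(\mathbf{n}) \big)^2 / \#\{\mathbf{n} : r(\mathbf{n}) > 0\}$, where $r(\mathbf{n})$ counts representations of $\mathbf{n}$ as the power-sum vector of some $\mathbf{x} \in [1,N]^s$.) Adding the two contributions gives the stated floor.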
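The Cauchy-Schwarz form of the argument is easy to test on small cases. The sketch below, with illustrative function names, tabulates the representation counts of power-sum vectors over $[1,N]^s$, forms the sum of their squares, and compares it with $N^{2s}$ divided by the number of attained vectors. The cap $s^k N^{k(k+1)/2}$ on the number of attained vectors is the trivial one (each $j$-th power sum lies between $s$ and $sN^j$) and is an assumption of this sketch rather than a constant taken from the text.

```python
from collections import Counter
from itertools import product


def representation_counts(s, k, N):
    """r(n): number of x in {1..N}^s whose power-sum vector (p_1, ..., p_k) equals n."""
    return Counter(
        tuple(sum(x ** j for x in xs) for j in range(1, k + 1))
        for xs in product(range(1, N + 1), repeat=s)
    )


def convexity_data(s, k, N):
    """Return (J, number of attained vectors, trivial cap on that number)."""
    r = representation_counts(s, k, N)
    J = sum(m * m for m in r.values())  # sum of r(n)^2 counts ordered solution pairs
    attained = len(r)
    trivial_cap = s ** k * N ** (k * (k + 1) // 2)
    return J, attained, trivial_cap


if __name__ == "__main__":
    s, k, N = 3, 2, 6
    J, attained, cap = convexity_data(s, k, N)
    # Cauchy-Schwarz: J * attained >= (sum of r)^2 = N^(2s).
    print(J, attained, cap, J * attained >= N ** (2 * s))
```

Since the number of attained vectors is at most the cap, the inequality printed last implies $J \ge N^{2s} / (s^k N^{k(k+1)/2})$, which is the second floor with an explicit constant.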

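Proposition 3 (the $k = 1$ and small-$s$ cases are exact). For $k = 1$, $J_{s,1}(N)$ counts solutions of the single equation $x_1 + \cdots + x_s = y_1 + \cdots + y_s$, and the main conjecture holds with $J_{s,1}(N) \asymp N^{2s - 1}$ for $s \ge 1$. For $s \le k$, $J_{s,k}(N)$ is purely diagonal to leading order.

Proof. For $k = 1$ the count is the number of $(\mathbf{x}, \mathbf{y})$ with equal sums; the number $r(n)$ of ways to write a value $n$ as $x_1 + \cdots + x_s$ is a piecewise polynomial of bounded degree peaking at $\asymp N^{s-1}$ near $n = s(N+1)/2$, and $J_{s,1}(N) = \sum_n r(n)^2 \asymp N^{2s-1}$ by Cauchy-Schwarz being sharp for the near-uniform $r$, consistent with the second floor (here $k(k+1)/2 = 1$). For $s \le k$: by the Vandermonde rigidity of Exercise 7 (Proposition 4 below), every solution is diagonal; tuples with all $x_i$ distinct have exactly $s!$ permutations, and tuples with a repeated coordinate number $O(N^{s-1})$; so $J_{s,k}(N) = s!\, N^s + O(N^{s-1})$, the diagonal floor with no excess.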
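Both halves of the proposition can be checked by enumeration. The sketch below uses illustrative names; `diagonal_count` counts pairs in which the right tuple is a rearrangement of the left one, using the multinomial count of distinct rearrangements. The closed form $N(2N^2 + 1)/3$ for $J_{2,1}(N)$ used in the check is a short calculation from the triangular shape of $r(n)$ and is not stated in the text.

```python
from collections import Counter
from itertools import product
from math import factorial


def vinogradov_count(s, k, N):
    """Number of (x, y) in {1..N}^(2s) with equal j-th power sums for j = 1..k."""
    groups = Counter(
        tuple(sum(x ** j for x in xs) for j in range(1, k + 1))
        for xs in product(range(1, N + 1), repeat=s)
    )
    return sum(m * m for m in groups.values())


def diagonal_count(s, N):
    """Number of (x, y) with y a rearrangement of x."""
    total = 0
    for xs in product(range(1, N + 1), repeat=s):
        rearrangements = factorial(s)
        for multiplicity in Counter(xs).values():
            rearrangements //= factorial(multiplicity)
        total += rearrangements
    return total


if __name__ == "__main__":
    # s <= k: every solution is diagonal.
    print(vinogradov_count(3, 3, 4) == diagonal_count(3, 4))
    # k = 1, s = 2: compare with N(2N^2 + 1)/3.
    print([vinogradov_count(2, 1, N) == N * (2 * N * N + 1) // 3 for N in range(1, 7)])
    # s > k: non-diagonal solutions appear, e.g. (1, 5, 6) and (2, 3, 7) for k = 2.
    print(vinogradov_count(3, 2, 7) > diagonal_count(3, 7))
```

The last line illustrates why the hypothesis $s \le k$ matters: the triples $(1,5,6)$ and $(2,3,7)$ share their sum $12$ and sum of squares $62$ without being rearrangements of each other.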

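Proposition 4 (Vandermonde rigidity at length $s \le k$). If $x_1, \dots, x_s$ and $y_1, \dots, y_s$ satisfy $\sum_i x_i^j = \sum_i y_i^j$ for $1 \le j \le s$, then $(y_1, \dots, y_s)$ is a permutation of $(x_1, \dots, x_s)$.

Proof. Write $p_j$ for the $j$-th power sum and $e_j$ for the $j$-th elementary symmetric function. Equal power sums force, via Newton's identities $j e_j = \sum_{i=1}^{j} (-1)^{i-1} e_{j-i}\, p_i$, equal elementary symmetric functions $e_1, \dots, e_s$ (induction on $j$). Hence the monic degree-$s$ polynomials $\prod_i (T - x_i)$ and $\prod_i (T - y_i)$ have equal coefficients, so they are equal, and a monic polynomial determines its root multiset. Thus $\{x_i\} = \{y_i\}$ as multisets. This is the determinantal heart of the subject: the Jacobian of $(p_1, \dots, p_s)$ in the $x_i$ is a Vandermonde determinant times $s!$, non-vanishing off the diagonal, which is the local form of the rigidity.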
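The recursion in the proof can be run directly. The sketch below, with illustrative function names, recovers the elementary symmetric functions from the power sums by Newton's identities in exact rational arithmetic, compares them with the values computed from the tuple itself, and then confirms by enumeration that for small $s$ and $N$ the first $s$ power sums determine the multiset.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod


def power_sums(xs, k):
    """[p_1, ..., p_k] with p_j the sum of j-th powers of the entries of xs."""
    return [sum(x ** j for x in xs) for j in range(1, k + 1)]


def elementary_from_power_sums(p):
    """Newton's identities: j * e_j = sum over i = 1..j of (-1)^(i-1) e_(j-i) p_i."""
    e = [Fraction(1)]  # e_0 = 1
    for j in range(1, len(p) + 1):
        acc = sum((-1) ** (i - 1) * e[j - i] * p[i - 1] for i in range(1, j + 1))
        e.append(Fraction(acc) / j)
    return e[1:]


def elementary_direct(xs):
    """[e_1, ..., e_s] computed from the definition as sums of products of subsets."""
    return [sum(prod(c) for c in combinations(xs, j)) for j in range(1, len(xs) + 1)]


def is_rigid(s, N):
    """True if tuples in {1..N}^s with equal p_1..p_s are always rearrangements."""
    seen = {}
    for xs in product(range(1, N + 1), repeat=s):
        key = tuple(power_sums(xs, s))
        multiset = tuple(sorted(xs))
        if seen.setdefault(key, multiset) != multiset:
            return False
    return True


if __name__ == "__main__":
    xs = (1, 2, 3)
    print(elementary_from_power_sums(power_sums(xs, 3)), elementary_direct(xs))
    print(is_rigid(3, 5))
```

Exact fractions are used because Newton's identities divide by $j$; with integer inputs the results are integers, but floating-point division would obscure that.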

Connections Master

The mean value $J_{s,k}(N)$ is a high-moment refinement of the Weyl-sum estimates of 21.15.02: where Weyl differencing bounds a single sum with the lossy saving $2^{1-k}$, the mean value controls the $2s$-th moment of the full polynomial Weyl sum sharply, and the resulting minor-arc bound feeds the same Hardy-Littlewood circle method that Weyl's inequality serves.

The method supersedes the van der Corput exponent-pair calculus of 21.15.03 in the high-degree regime: the iterated $A$-processes lose a constant factor of cancellation at each step and become inert as $k$ grows, whereas the mean value keeps a saving that decays only polynomially in $k$, so for large $k$ one counts solutions of the system rather than estimating the sum through stationary phase.

The sharp Weyl bounds produced here yield the Vinogradov-Korobov zero-free region, the widest known for $\zeta(s)$, which is the deepest input to the growth, zero-density, and prime-counting theory of the Riemann zeta function in 21.03.01; the same minor-arc control underlies the circle-method treatment of Waring's problem and links back to the complete exponential sums of 21.15.04 that govern the major arcs.

Historical & philosophical context Master

Ivan Matveevich Vinogradov introduced the mean value method in 1935 [Vinogradov 1935] as the engine of his "method of trigonometric sums," using it to bound Weyl sums of high degree far beyond what Weyl's differencing or van der Corput's processes could reach, and thereby to establish the zero-free region for $\zeta(s)$ that Korobov and Vinogradov independently sharpened to the form $\sigma \ge 1 - c (\log |t|)^{-2/3} (\log\log |t|)^{-1/3}$ around 1958. The original bounds carried an unavoidable loss $N^{\eta}$ with $\eta > 0$ in the exponent; closing that loss to the conjectured $N^{\varepsilon}$ was the central open problem of the area for eighty years. Linnik's 1943 $p$-adic reformulation anticipated the congruencing idea, and Karatsuba and Stechkin pushed the classical method to its limit.

The resolution came from two directions almost simultaneously. Trevor Wooley's efficient congruencing, beginning in 2012 [Wooley 2012], proved the main conjecture for $k = 3$ and then, in nested form, for all $k$; it is a $p$-adic descent that conditions on congruences between the variables, made precise by Vandermonde divisibility. Jean Bourgain, Ciprian Demeter, and Larry Guth proved the full conjecture for all $k \ge 4$ in 2016 [Bourgain Demeter Guth 2016] via $\ell^2$ decoupling for the moment curve, an inequality in Euclidean harmonic analysis with no arithmetic content, resting on the multilinear Kakeya estimate and the Bourgain-Guth induction on scales. Demeter's 2020 book [Demeter 2020] gives the systematic decoupling account. The two proofs are now read as archimedean and non-archimedean expressions of the same curvature of the moment curve, and the episode is a case where a number-theoretic conjecture was settled by, and in turn reshaped, the geometry of the Fourier restriction problem.

Bibliography Master

@article{vinogradov1935,
  author  = {Vinogradov, I. M.},
  title   = {The method of trigonometrical sums in the theory of numbers},
  journal = {Trudy Mat. Inst. Steklov},
  volume  = {10},
  year    = {1935},
  note    = {English translation, Interscience, 1954}
}

@article{wooley2012,
  author  = {Wooley, Trevor D.},
  title   = {Vinogradov's mean value theorem via efficient congruencing},
  journal = {Annals of Mathematics},
  volume  = {175},
  number  = {3},
  pages   = {1575--1627},
  year    = {2012}
}

@article{wooley2019,
  author  = {Wooley, Trevor D.},
  title   = {Nested efficient congruencing and relatives of Vinogradov's mean value theorem},
  journal = {Proceedings of the London Mathematical Society},
  volume  = {118},
  number  = {4},
  pages   = {942--1016},
  year    = {2019}
}

@article{bourgaindemeterguth2016,
  author  = {Bourgain, Jean and Demeter, Ciprian and Guth, Larry},
  title   = {Proof of the main conjecture in Vinogradov's mean value theorem for degrees higher than three},
  journal = {Annals of Mathematics},
  volume  = {184},
  number  = {2},
  pages   = {633--682},
  year    = {2016}
}

@book{demeter2020,
  author    = {Demeter, Ciprian},
  title     = {Fourier Restriction, Decoupling, and Applications},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {184},
  publisher = {Cambridge University Press},
  year      = {2020}
}

@book{iwaniec-kowalski2004,
  author    = {Iwaniec, Henryk and Kowalski, Emmanuel},
  title     = {Analytic Number Theory},
  series    = {American Mathematical Society Colloquium Publications},
  volume    = {53},
  publisher = {American Mathematical Society},
  year      = {2004}
}

@book{vaughan1997,
  author    = {Vaughan, Robert C.},
  title     = {The Hardy-Littlewood Method},
  series    = {Cambridge Tracts in Mathematics},
  volume    = {125},
  edition   = {2},
  publisher = {Cambridge University Press},
  year      = {1997}
}