21.04.04 · number-theory / modular-forms

Theta Series of Quadratic Forms and Sums of Squares

shipped · 3 tiers · Lean: none

Anchor (Master): Serre *A Course in Arithmetic* Ch. VII §§6-7 (full theta-series development, the sum-of-squares formulas, the bridge from Part I quadratic forms to Part II modular forms); Iwaniec *Topics in Classical Automorphic Forms* Ch. 10-11 (theta series of quadratic forms, the Weil representation, half-integer weight); Shimura 1973 *Ann. of Math.* 97 (originator paper: modular forms of half-integral weight, the Shimura correspondence to integral weight); Ogg *Modular Forms and Dirichlet Series* (Benjamin 1969) Ch. VI (theta series and Eisenstein series); Miyake *Modular Forms* (Springer 1989) Ch. 4 §4.9 (theta series of positive-definite forms as modular forms); Conway-Sloane *Sphere Packings, Lattices and Groups* (Springer Grundlehren 290, 3rd ed. 1999) Ch. 2-7 (theta series of lattices, modular-form identities); Jacobi 1829 *Fundamenta Nova Theoriae Functionum Ellipticarum* (originator: the two- and four-square formulas via theta-function identities)

Intuition Beginner

How many ways can you write a whole number as a sum of squares? The number $5$ is $1 + 4 = 1^{2} + 2^{2}$ , and if you count signs and order — $(\pm 1)^{2} + (\pm 2)^{2}$ and $(\pm 2)^{2} + (\pm 1)^{2}$ — there are eight such ways. Counting these representations one number at a time looks like bookkeeping with no pattern. The surprise of this unit is that all the counts at once are hidden inside a single smooth function, and reading them off becomes a matter of recognising which function it is.

The trick is to pack every count into one power series. Form the sum where the term for $n$ carries a marker $q$ raised to the power $n$ . If you take the basic series $1 + 2 q + 2 q^{4} + 2 q^{9} + 2 q^{16} + \dots$ , whose exponents are the perfect squares and whose coefficients $2$ record the two signs $\pm n$ , and you multiply it by itself a few times, the coefficient sitting in front of $q^{n}$ in the product is exactly the number of ways to write $n$ as a sum of that many squares, counting signs and order. The counting problem has turned into a multiplication of power series.

This packaged series is the theta series, and it is not just any series: it has a deep symmetry under a group of changes of variable, the same kind of symmetry that defines a modular form. Because the functions with a given symmetry and weight form a finite-dimensional space, the theta series must equal a simple combination of standard building blocks whose coefficients are known divisor sums. So counting sums of squares becomes reading off coefficients of a modular form.

Visual Beginner

Picture three stacked number lines. The top line marks the perfect squares $0, 1, 4, 9, 16, 25, \dots$ — these are the exponents that appear in the basic theta series $θ (z) = 1 + 2 q + 2 q^{4} + 2 q^{9} + \dots$ (the factor $2$ counts the two signs $\pm n$ ). The middle line marks, for a chosen target $n$ , every pair of squares that adds up to $n$ ; the bottom line collapses all those pairs into a single height, the representation count $r_{2} (n)$ .

The picture conveys the whole strategy in one image: the squares on top are the raw material, multiplying theta series together combines them, and the height at $q^{n}$ in the product is the answer to "how many ways is $n$ a sum of squares." The modular symmetry, invisible in the picture, is what pins the product down to a known formula.

Worked example Beginner

Take the question: in how many ways is $n = 2$ a sum of two squares, counting order and sign? List them directly. We need $a^{2} + b^{2} = 2$ with $a, b$ whole numbers (positive, negative, or zero). The only squares not exceeding $2$ are $0$ and $1$ , and the sums $0 + 0$ and $0 + 1$ fall short of $2$ . So both $a^{2}$ and $b^{2}$ must equal $1$ .

Step 1. Fix the sizes. We need $a^{2} = 1$ and $b^{2} = 1$ , so $∣ a ∣ = 1$ and $∣ b ∣ = 1$ .

Step 2. Count the sign choices. The value $a$ can be $+ 1$ or $- 1$ : two choices. Independently $b$ can be $+ 1$ or $- 1$ : two choices. That gives $2 \times 2 = 4$ ordered sign-aware representations: $(1, 1), (1, - 1), (- 1, 1), (- 1, - 1)$ .

Step 3. Read the same answer off the series. The basic theta series is $θ (z) = 1 + 2 q + 2 q^{4} + 2 q^{9} + \dots$ , where $q = e^{2 π i z}$ is the bookkeeping marker. Squaring it, $θ (z)^{2} = (1 + 2 q + 2 q^{4} + \dots) (1 + 2 q + 2 q^{4} + \dots) = 1 + 4 q + 4 q^{2} + 0 \cdot q^{3} + 4 q^{4} + \dots .$ The coefficient of $q^{2}$ is $4$ , matching the direct count $r_{2} (2) = 4$ .

What this tells us: the coefficient of $q^{n}$ in $θ (z)^{2}$ is the count $r_{2} (n)$ of ways to write $n$ as an ordered, signed sum of two squares. The coefficient of $q^{0}$ is $1$ , recording the single representation $0 = 0^{2} + 0^{2}$ . The coefficient of $q^{3}$ is $0$ , recording that $3$ is not a sum of two squares. The power series stores every count at once.
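The squaring in Step 3 can be reproduced mechanically. The following is a minimal Python sketch, not part of the original text: the helper names `theta_coeffs` and `multiply` are hypothetical, and the series are simply truncated at a chosen degree.

```python
from math import isqrt

def theta_coeffs(limit):
    """Coefficients of theta = sum over all integers n of q^(n^2), up to q^limit."""
    coeffs = [0] * (limit + 1)
    for n in range(-isqrt(limit), isqrt(limit) + 1):
        coeffs[n * n] += 1          # n and -n both land on the exponent n^2
    return coeffs

def multiply(a, b, limit):
    """Product of two truncated power series, again truncated at q^limit."""
    out = [0] * (limit + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > limit:
                break
            out[i + j] += ai * bj
    return out

theta = theta_coeffs(10)
theta_squared = multiply(theta, theta, 10)
print(theta)          # [1, 2, 0, 0, 2, 0, 0, 0, 0, 2, 0]
print(theta_squared)  # [1, 4, 4, 0, 4, 8, 0, 0, 4, 4, 8]
```

The entry at index $2$ of `theta_squared` is the count $r_{2} (2) = 4$ found by hand above, and the entry at index $5$ is the eight representations of $5$ from the introduction.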

Check your understanding Beginner

Exercise (easy, multiple choice).

In the theta series $θ (z) = 1 + 2 q + 2 q^{4} + 2 q^{9} + \dots$ , whose exponents run over the perfect squares with each non-zero square counted for both $\pm n$ , what is the coefficient of $q^{4}$ ?

A. $1$ , because $4 = 2^{2}$ once.
B. $2$ , because $4 = (+ 2)^{2} = (- 2)^{2}$ .
C. $4$ , counting signs and order of two squares.
D. $8$ , the sum-of-two-squares count.

Hint

The sum runs over a single whole number $n$ , so the exponent $n^{2} = 4$ comes from $n = 2$ and from $n = - 2$ . Count how many single values of $n$ give exponent $4$ .

Answer

B. The theta series has one term for each whole number $n$ , contributing $q$ raised to the power $n^{2}$ . The exponent $n^{2} = 4$ arises from $n = 2$ and from $n = - 2$ , two values, so the coefficient of $q^{4}$ is $2$ . Written out, $θ (z) = 1 + 2 q + 2 q^{4} + 2 q^{9} + 2 q^{16} + \dots$ , with the leading $1$ from $n = 0$ and every later non-zero coefficient equal to $2$ because each non-zero square comes from a $\pm$ pair. Option C would be the coefficient of $q^{4}$ in $θ (z)^{2}$ (two squares, $r_{2} (4) = 4$ ); option D's value $8$ is a count of the kind that appears in $θ (z)^{2}$ or $θ (z)^{4}$ , for instance $r_{2} (5) = 8$ or $r_{4} (1) = 8$ . The single theta series itself records one-square representations.

Exercise (easy, true-false).

The coefficient of $q^{7}$ in $θ (z)^{4}$ counts the ways to write $7$ as an ordered signed sum of four squares, and that count is positive.

Hint

Try to write $7$ as four squares: $7 = 4 + 1 + 1 + 1 = 2^{2} + 1^{2} + 1^{2} + 1^{2}$ . Every whole number is a sum of four squares.

Answer

True. The coefficient of $q^{7}$ in $θ (z)^{4}$ is $r_{4} (7)$ , the number of ordered signed four-square representations of $7$ . One representation is $7 = 2^{2} + 1^{2} + 1^{2} + 1^{2}$ ; placing the $2$ in any of four slots and choosing signs gives many ordered representations, so the count is positive. That every whole number is a sum of four squares is Lagrange's theorem, the statement $r_{4} (n) > 0$ for every $n \geq 0$ . The exact value here is $r_{4} (7) = 64$ , which the four-square formula returns as $8$ times the sum of divisors of $7$ not divisible by $4$ , namely $8 (1 + 7) = 64$ .
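The value $r_{4} (7) = 64$ can be confirmed by listing every candidate tuple. The sketch below is an illustration only; the function names `count_representations` and `four_square_formula` are hypothetical, and the brute-force search is practical only for small $n$ .

```python
from itertools import product
from math import isqrt

def count_representations(n, k):
    """Number of k-tuples of integers (signs and order counted) whose squares sum to n."""
    bound = isqrt(n)
    values = range(-bound, bound + 1)
    return sum(1 for xs in product(values, repeat=k)
               if sum(x * x for x in xs) == n)

def four_square_formula(n):
    """8 times the sum of the divisors of n that are not divisible by 4 (n >= 1)."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

print(count_representations(7, 4), four_square_formula(7))  # 64 64
```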

Formal definition Intermediate+

We work over the complex upper half-plane $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im} (z) > 0\}$ and write $q = e^{2 π i z}$ , so $∣ q ∣ < 1$ for $z \in \mathbb{H}$ .

Definition (theta series). The Jacobi theta series is the holomorphic function $$ \theta(z) \;=\; \sum_{n \in \mathbb{Z}} q^{n^2} \;=\; \sum_{n \in \mathbb{Z}} e^{2\pi i n^2 z}, \qquad z \in \mathbb{H}, $$ with Fourier expansion $θ (z) = 1 + 2 q + 2 q^{4} + 2 q^{9} + 2 q^{16} + \dots$ . The series converges absolutely and uniformly on compact subsets of $\mathbb{H}$ because $∣ q^{n^{2}} ∣ = e^{- 2 π n^{2} \operatorname{Im} (z)}$ decays super-geometrically in $n$ .

Definition (theta series of a quadratic form). Let $Q : \mathbb{Z}^{m} \to \mathbb{Z}_{\geq 0}$ be a positive-definite integral quadratic form, $Q (x) = \frac{1}{2} x^{T} A x$ with $A$ a symmetric positive-definite integer matrix with even diagonal (the even-integral normalisation). The theta series of $Q$ is $$ \theta_Q(z) \;=\; \sum_{x \in \mathbb{Z}^m} q^{Q(x)} \;=\; \sum_{n \geq 0} r_Q(n) \, q^n, \qquad r_Q(n) := \#\{ x \in \mathbb{Z}^m : Q(x) = n \}. $$ The non-negative integer $r_{Q} (n)$ is the representation number of $n$ by $Q$ ; the finiteness $r_{Q} (n) < \infty$ follows from positive-definiteness, which makes $\{x : Q (x) \leq n\}$ a bounded, hence finite, subset of the lattice $\mathbb{Z}^{m}$ . When $Q (x) = x_{1}^{2} + \dots + x_{k}^{2}$ is the sum of $k$ squares, $θ_{Q} = θ^{k}$ and $r_{Q} (n) = r_{k} (n)$ is the sum-of- $k$ -squares count.
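For a binary form the representation numbers can be tabulated directly from the definition. The sketch below is illustrative and restricted to rank $2$ : it writes $Q (x, y) = a x^{2} + b x y + c y^{2}$ , which corresponds to the Gram matrix $A$ with diagonal $2 a, 2 c$ and off-diagonal entry $b$ , and the function name `binary_form_counts` is hypothetical. The search box comes from completing the square, which is the finiteness argument of the definition made explicit.

```python
from math import isqrt

def binary_form_counts(a, b, c, nmax):
    """r_Q(n) for Q(x, y) = a*x^2 + b*x*y + c*y^2 and n = 0..nmax.

    Assumes Q is positive-definite: a > 0 and disc = 4ac - b^2 > 0.
    Completing the square gives Q >= disc*y^2/(4a) and Q >= disc*x^2/(4c),
    so only finitely many (x, y) can satisfy Q(x, y) <= nmax.
    """
    disc = 4 * a * c - b * b
    if a <= 0 or disc <= 0:
        raise ValueError("form is not positive-definite")
    x_max = isqrt(4 * c * nmax // disc)
    y_max = isqrt(4 * a * nmax // disc)
    counts = [0] * (nmax + 1)
    for x in range(-x_max, x_max + 1):
        for y in range(-y_max, y_max + 1):
            value = a * x * x + b * x * y + c * y * y
            if value <= nmax:
                counts[value] += 1
    return counts

print(binary_form_counts(1, 0, 1, 5))  # x^2 + y^2:      [1, 4, 4, 0, 4, 8]
print(binary_form_counts(1, 1, 1, 7))  # x^2 + xy + y^2: [1, 6, 0, 6, 6, 0, 0, 12]
```

The first call recovers the two-square counts; the second tabulates the hexagonal form, chosen here only as a second example.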

Definition (half-integer weight and the theta multiplier). A holomorphic $f : \mathbb{H} \to \mathbb{C}$ is a modular form of weight $1/2$ on $Γ_{0} (4)$ if $f$ is holomorphic at every cusp and satisfies $$ f(\gamma z) \;=\; \nu_\theta(\gamma) \, (c z + d)^{1/2} \, f(z), \qquad \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(4), $$ where $(cz + d)^{1/2}$ uses the principal branch of the square root and $ν_{θ} (γ) = θ (γ z) / ((cz + d)^{1/2} θ (z))$ is the theta multiplier, an explicit eighth root of unity given by $ν_{θ} (γ) = (\frac{c}{d}) ε_{d}^{- 1}$ with $(\frac{c}{d})$ the extended Jacobi symbol and $ε_{d} = 1$ if $d \equiv 1 \bmod 4$ , $ε_{d} = i$ if $d \equiv 3 \bmod 4$ . The half-integer weight is forced: no automorphy factor without a chosen square root and a multiplier can make $θ$ modular, because $θ$ transforms with the square root of the usual factor $cz + d$ .

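The congruence subgroup is $Γ_{0} (4) = \{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_{2} (\mathbb{Z}) : c \equiv 0 \bmod 4 \}$ ; it is generated by $- I$ , the translation $z \mapsto z + 1$ , and $z \mapsto z / (4 z + 1)$ . The map $z \mapsto - 1/ (4 z)$ does not lie in $SL_{2} (\mathbb{Z})$ but normalises $Γ_{0} (4)$ . In the classical normalisation $ϑ_{3} (τ) = \sum_{n} e^{π i n^{2} τ} = θ (τ /2)$ , translation by $1$ in $z$ becomes $τ \mapsto τ + 2$ and $z \mapsto - 1/ (4 z)$ becomes $τ \mapsto - 1/ τ$ ; these two maps generate the theta group, inside which $ϑ_{3}$ is modular.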

Definition (weight $m /2$ for $θ_{Q}$ ). For a positive-definite even integral form $Q$ of rank $m$ and level $N$ (the level being the smallest positive integer with $N A^{- 1}$ even integral), $θ_{Q}$ is a modular form of weight $m /2$ on $Γ_{0} (N)$ with a character $χ_{Q}$ determined by the discriminant of $Q$ ; for even $m$ the weight is an ordinary integer and no multiplier is needed, while for odd $m$ the theta multiplier reappears.

Counterexamples to common slips Intermediate+

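The theta series is not modular on the full group $SL_{2} (\mathbb{Z})$ . With $q = e^{2 π i z}$ the translation $T : z \mapsto z + 1$ fixes $θ$ , since $e^{2 π i n^{2}} = 1$ ; the obstruction is the inversion $S : z \mapsto - 1/ z$ . The inversion law relates $θ (z)$ to $θ (- 1/ (4 z))$ , so $θ (- 1/ z) = \sqrt{- i z /2} \, θ (z /4) = \sqrt{- i z /2} \sum_{n} q^{n^{2} /4}$ , which is not a multiple of $θ (z)$ . The symmetry is $z \mapsto - 1/ (4 z)$ rather than $z \mapsto - 1/ z$ , which is why the natural level is $4$ , not $1$ . (In the classical variable $τ = 2 z$ the same fact reads: $τ \mapsto τ + 1$ sends $\sum_{n} e^{π i n^{2} τ}$ to $\sum_{n} (- 1)^{n} e^{π i n^{2} τ}$ , and only $τ \mapsto τ + 2$ fixes it.)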
The weight is $1/2$ , not $1$ . Squaring, $θ^{2}$ has weight $1$ on $Γ_{0} (4)$ , and a count by analogy with integer-weight forms that assigned $θ$ weight $1$ would be off by a factor-of-two error in the rank-to-weight rule. The correct rule is weight $= (rank) /2$ .
Positive-definiteness is essential for $θ_{Q}$ to be a holomorphic modular form. For an indefinite form the series $\sum_{x} q^{Q (x)}$ diverges, since $Q (x) \to - \infty$ along some directions makes $∣ q^{Q (x)} ∣$ unbounded. Indefinite forms require Siegel's theory of indefinite theta functions, a different object.
$r_{Q} (n)$ counts lattice points, with signs and order, not unordered representations. The four-square count $r_{4} (1) = 8$ records $(\pm 1, 0, 0, 0)$ in each of the four coordinate slots — eight points — not the single "essentially one-way" representation. Translating $r_{k} (n)$ to unordered partitions into squares requires dividing out the sign-and-permutation symmetry by hand.

Key theorem with proof Intermediate+

Theorem (modular transformation of $θ$ via Poisson summation; Serre VII §6). The theta series satisfies the inversion law $$ \theta\!\left( -\frac{1}{4z} \right) \;=\; \sqrt{-2 i z}\; \theta(z), \qquad z \in \mathbb{H}, $$ where $\sqrt{\cdot}$ is the principal branch; equivalently, on the imaginary axis $z = i t /2$ with $t > 0$ , writing $ϑ (t) = \sum_{n} e^{- π n^{2} t}$ one has $ϑ (1/ t) = \sqrt{t} \, ϑ (t)$ . Together with the periodicity $θ (z + 1) = θ (z)$ , this makes $θ$ a modular form of weight $1/2$ on $Γ_{0} (4)$ with the theta multiplier.

Proof. The engine is the Poisson summation formula: for a Schwartz function $g$ on $\mathbb{R}$ with Fourier transform $\widehat{g} (ξ) = \int_{\mathbb{R}} g (x) e^{- 2 π i x ξ} \, d x$ , $$ \sum_{n \in \mathbb{Z}} g(n) \;=\; \sum_{n \in \mathbb{Z}} \widehat{g}(n). $$

Step 1 (the Gaussian and its transform). Fix $t > 0$ and let $g_{t} (x) = e^{- π t x^{2}}$ . A standard contour computation of the Gaussian integral gives the self-similar transform $$ \widehat{g_t}(\xi) \;=\; \frac{1}{\sqrt{t}}\, e^{-\pi \xi^2 / t} \;=\; \frac{1}{\sqrt{t}}\, g_{1/t}(\xi). $$ The Gaussian is, up to the scale $t$ , its own Fourier transform with $t$ inverted.

Step 2 (apply Poisson). Summing $g_{t}$ over $\mathbb{Z}$ gives the theta value $ϑ (t) = \sum_{n} e^{- π t n^{2}} = \sum_{n} g_{t} (n)$ . Poisson summation equates this to $\sum_{n} \widehat{g_{t}} (n) = \frac{1}{\sqrt{t}} \sum_{n} e^{- π n^{2} / t} = \frac{1}{\sqrt{t}} ϑ (1/ t)$ . Rearranging, $$ \vartheta(1/t) \;=\; \sqrt{t}\; \vartheta(t). $$ This is the inversion law on the imaginary axis $z = i t /2$ in the normalisation $θ (z) = ϑ (- 2 i z)$ ; analytic continuation from the imaginary axis to all of $\mathbb{H}$ (both sides are holomorphic and agree on a set with a limit point) extends the identity to the stated $θ (- 1/ (4 z)) = \sqrt{- 2 i z} \, θ (z)$ .

Step 3 (assemble the modularity). The group $Γ_{0} (4)$ is generated by $- I$ , the translation $T : z \mapsto z + 1$ , and the element $z \mapsto z / (4 z + 1)$ , which is the conjugate $W T^{- 1} W^{- 1}$ of the inverse translation by the involution $W : z \mapsto - 1/ (4 z)$ . Since $θ$ is fixed by $T$ and transforms under $W$ by the displayed factor $\sqrt{- 2 i z}$ , under each generator $θ$ transforms by the automorphy factor $(cz + d)^{1/2}$ times a root of unity. Tracking the cocycle relation $ν_{θ} (γ_{1} γ_{2}) (\dots) = ν_{θ} (γ_{1}) ν_{θ} (γ_{2}) (\dots)$ through the generators identifies the multiplier $ν_{θ} (γ) = (\frac{c}{d}) ε_{d}^{- 1}$ on all of $Γ_{0} (4)$ . Holomorphy at the three cusps of $Γ_{0} (4)$ follows because $θ$ is bounded near each cusp (the $q$ -expansion at $\infty$ has non-negative exponents, and conjugating by the cusp-fixing scaling preserves boundedness). Hence $θ \in M_{1/2} (Γ_{0} (4), ν_{θ})$ . $□$
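The inversion law of the theorem can be tested numerically at a sample point. This is a sanity check under stated assumptions, not a proof: the series is truncated (the default of 12 terms is an arbitrary choice that is ample when $Im (z)$ is not tiny), the sample point is arbitrary, and `theta` is a hypothetical helper name.

```python
import cmath

def theta(z, terms=12):
    """Truncated theta series: sum over |n| <= terms of exp(2*pi*i*n^2*z), for Im(z) > 0."""
    total = 1 + 0j
    for n in range(1, terms + 1):
        total += 2 * cmath.exp(2j * cmath.pi * n * n * z)
    return total

z = 0.3 + 0.4j                           # arbitrary point in the upper half-plane
left = theta(-1 / (4 * z))
right = cmath.sqrt(-2j * z) * theta(z)   # cmath.sqrt returns the principal branch
print(abs(left - right))                 # agreement up to floating-point rounding
```

Because $- 2 i z$ has positive real part for every $z$ in the upper half-plane, the principal branch used by `cmath.sqrt` is the branch the theorem refers to.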

Bridge. The Poisson-summation inversion builds toward the entire sum-of-squares machinery: $θ^{k} \in M_{k /2} (Γ_{0} (4))$ is a modular form of weight $k /2$ (half-integral when $k$ is odd) whose Fourier coefficients are the counts $r_{k} (n)$ , and the foundational reason exact formulas exist is that this space is finite-dimensional. This is exactly the same finiteness that pins $M_{*} (SL_{2} (\mathbb{Z})) = \mathbb{C} [E_{4}, E_{6}]$ in [21.04.01], now transplanted to a congruence subgroup with half-integer weight; the inversion law $ϑ (1/ t) = \sqrt{t} \, ϑ (t)$ is dual to the functional equation of the Riemann zeta function, whose completed form [21.03.01] is the Mellin transform of this very theta series. The central insight is that representation numbers, defined by counting, become Fourier coefficients of an object constrained by symmetry, and the combination of Poisson summation, finite-dimensionality, and Eisenstein-plus-cusp decomposition appears again in [21.04.02] as the Hecke-operator analysis that splits $θ_{Q}$ into its eigencomponents. The bridge is that the analytic transformation law converts an arithmetic counting problem into a linear-algebra problem in a finite-dimensional space, and that conversion generalises from squares to every positive-definite quadratic form.

Exercises Intermediate+

A graded set covering the theta inversion, representation numbers, the four-square formula, and the Eisenstein-cusp decomposition.

Exercise 3 (medium, symbolic).

Starting from $ϑ (t) = \sum_{n} e^{- π n^{2} t}$ and the Poisson identity $ϑ (1/ t) = \sqrt{t} \, ϑ (t)$ , show that the completed zeta integral $π^{- s /2} Γ (s /2) ζ (s) = \frac{1}{2} \int_{0}^{\infty} (ϑ (t) - 1) t^{s /2} \frac{d t}{t}$ inherits the symmetry $s \mapsto 1 - s$ .

Hint

Split the integral at $t = 1$ , substitute $t \mapsto 1/ t$ in the part from $0$ to $1$ , and apply the inversion law $ϑ (1/ t) = \sqrt{t} \, ϑ (t)$ .

Answer

Write $ψ (t) = \frac{1}{2} (ϑ (t) - 1) = \sum_{n \geq 1} e^{- π n^{2} t}$ , so that $π^{- s /2} Γ (s /2) ζ (s) = \int_{0}^{\infty} ψ (t) t^{s /2} \frac{d t}{t}$ for $Re (s) > 1$ . Split at $1$ : $\int_{0}^{1} + \int_{1}^{\infty}$ . In the first integral substitute $t \mapsto 1/ u$ , giving $\int_{1}^{\infty} ψ (1/ u) u^{- s /2} \frac{d u}{u}$ . The inversion law $ϑ (1/ u) = \sqrt{u} \, ϑ (u)$ becomes $ψ (1/ u) = \frac{1}{2} (\sqrt{u} \, ϑ (u) - 1) = \sqrt{u} \, ψ (u) + \frac{1}{2} \sqrt{u} - \frac{1}{2}$ . Substituting and collecting the elementary $\frac{1}{2} \sqrt{u} - \frac{1}{2}$ terms, which integrate to $\frac{1}{s - 1} - \frac{1}{s} = \frac{1}{s (s - 1)}$ , yields $$ \pi^{-s/2}\Gamma(s/2)\zeta(s) \;=\; \frac{1}{s(s-1)} \;+\; \int_1^\infty \psi(t)\big(t^{s/2} + t^{(1-s)/2}\big)\frac{dt}{t}. $$ The right side is manifestly invariant under $s \mapsto 1 - s$ , so the completed zeta function satisfies $ξ (s) = ξ (1 - s)$ . The theta inversion of this unit is the analytic source of the zeta functional equation in [21.03.01].
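The final formula of this answer can be evaluated numerically and compared with a known value. The sketch below is a rough check under stated assumptions: the integral is truncated at an arbitrary upper limit and computed with Simpson's rule, the step count is an arbitrary choice, and `psi` and `completed_zeta` are hypothetical helper names.

```python
import math

def psi(t, terms=10):
    """psi(t) = sum over n >= 1 of exp(-pi n^2 t), truncated; intended for t >= 1."""
    return sum(math.exp(-math.pi * n * n * t) for n in range(1, terms + 1))

def completed_zeta(s, upper=12.0, steps=4000):
    """1/(s(s-1)) + integral from 1 to upper of psi(t)(t^(s/2) + t^((1-s)/2)) dt/t."""
    def integrand(t):
        return psi(t) * (t ** (s / 2) + t ** ((1 - s) / 2)) / t
    h = (upper - 1.0) / steps            # steps must be even for Simpson's rule
    total = integrand(1.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(1.0 + k * h)
    return 1 / (s * (s - 1)) + total * h / 3

# zeta(2) = pi^2/6 and Gamma(1) = 1, so the completed value at s = 2 is pi/6.
print(completed_zeta(2.0), math.pi / 6)            # the two values agree closely
print(completed_zeta(2.0) - completed_zeta(-1.0))  # symmetry under s -> 1 - s
```

The second line is zero up to rounding because swapping $s$ and $1 - s$ merely exchanges the two powers of $t$ in the integrand.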

Exercise 5 (medium, short-answer).

Explain why $dim M_{2} (Γ_{0} (4)) = 2$ is exactly what makes the four-square theorem provable by the modular-forms method, and identify the two basis forms.

Hint

$θ^{4}$ has weight $4/2 = 2$ on $Γ_{0} (4)$ . The space $M_{2} (Γ_{0} (4))$ is spanned by two Eisenstein series; there are no cusp forms in weight $2$ at this level.

Answer

The form $θ^{4}$ is a modular form of weight $2$ on $Γ_{0} (4)$ . The space $M_{2} (Γ_{0} (4))$ is two-dimensional and consists entirely of Eisenstein series: the cusp-form space $S_{2} (Γ_{0} (4))$ is zero because the modular curve $X_{0} (4)$ has genus $0$ . A convenient basis is the pair of weight- $2$ Eisenstein series $\{E_{2, 2} (z), E_{2, 4} (z)\}$ built from $E_{2} (z) - 2 E_{2} (2 z)$ and $E_{2} (z) - 4 E_{2} (4 z)$ (the quasi-modular defect of $E_{2}$ cancels in these combinations). Since $θ^{4}$ lies in this $2$ -dimensional space, it is a specific linear combination of the basis, and matching the first two Fourier coefficients determines that combination uniquely. Reading off the general coefficient yields $r_{4} (n) = 8 \sum_{d ∣ n, 4 ∤ d} d$ with no error term, because there is no cusp form to contribute one. The absence of cusp forms is the structural reason the formula is exact.

Exercise 6 (hard, symbolic).

Show that the eight-square count $r_{8} (n)$ satisfies $r_{8} (n) = 16 \sum_{d ∣ n} (- 1)^{n - d} d^{3}$ by locating $θ^{8}$ inside $M_{4} (Γ_{0} (4))$ and using that $S_{4} (Γ_{0} (4)) = 0$ , so that $θ^{8}$ is a combination of Eisenstein series.

Hint

$θ^{8}$ has weight $4$ on $Γ_{0} (4)$ . The space $M_{4} (Γ_{0} (4))$ has dimension $3$ , which equals the number of cusps, so it is spanned by Eisenstein series and the cusp space is zero. Write $θ^{8}$ as a combination of $E_{4} (z)$ , $E_{4} (2 z)$ , $E_{4} (4 z)$ and match three Fourier coefficients.

Answer

The form $θ^{8}$ lies in $M_{4} (Γ_{0} (4))$ , which decomposes as the Eisenstein subspace $E_{4} (Γ_{0} (4))$ plus the cusp space $S_{4} (Γ_{0} (4))$ . The curve $X_{0} (4)$ has genus $0$ , three cusps, and no elliptic points, so for even $k \geq 2$ the dimension formula gives $\dim M_{k} (Γ_{0} (4)) = k /2 + 1$ , hence $\dim M_{4} (Γ_{0} (4)) = 3$ . The Eisenstein subspace is already three-dimensional (one Eisenstein series per cusp), so $S_{4} (Γ_{0} (4)) = 0$ and $E_{4} (z)$ , $E_{4} (2 z)$ , $E_{4} (4 z)$ form a basis. Matching the coefficients of $q^{0}, q^{1}, q^{2}$ in $θ^{8} = 1 + 16 q + 112 q^{2} + \dots$ against $E_{4} (z) = 1 + 240 \sum_{n \geq 1} σ_{3} (n) q^{n}$ gives $θ^{8} = \frac{1}{15} (E_{4} (z) - 2 E_{4} (2 z) + 16 E_{4} (4 z))$ , whose $n$ -th coefficient is $16 (σ_{3} (n) - 2 σ_{3} (n /2) + 16 σ_{3} (n /4))$ , with $σ_{3}$ read as $0$ at non-integers. This equals $16 \sum_{d ∣ n} (- 1)^{n - d} d^{3}$ : the sign $(- 1)^{n - d}$ encodes the level- $4$ twist distinguishing even and odd parts of the divisor sum. Therefore $r_{8} (n) = 16 \sum_{d ∣ n} (- 1)^{n - d} d^{3}$ , Jacobi's eight-square formula. For $n$ odd every divisor is odd, the sign is $+ 1$ throughout, and the formula reads $r_{8} (n) = 16 σ_{3} (n)$ .
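Jacobi's eight-square formula can be spot-checked against the coefficients of the eighth power of the theta series. The sketch below is illustrative only; the truncation degree `LIMIT` is an arbitrary choice and the helper names are hypothetical.

```python
def power_series_power(base, k, limit):
    """k-th power of a truncated power series given as a coefficient list."""
    result = [1] + [0] * limit
    for _ in range(k):
        nxt = [0] * (limit + 1)
        for i, ri in enumerate(result):
            if ri:
                for j, bj in enumerate(base):
                    if i + j > limit:
                        break
                    nxt[i + j] += ri * bj
        result = nxt
    return result

LIMIT = 30
theta = [0] * (LIMIT + 1)
m = 0
while m * m <= LIMIT:
    theta[m * m] = 1 if m == 0 else 2   # the two signs give coefficient 2
    m += 1

r8 = power_series_power(theta, 8, LIMIT)

def jacobi_eight(n):
    """16 * sum over divisors d of n of (-1)^(n - d) * d^3."""
    return 16 * sum((-1) ** (n - d) * d ** 3
                    for d in range(1, n + 1) if n % d == 0)

print(r8[:5])                                                      # [1, 16, 112, 448, 1136]
print(all(r8[n] == jacobi_eight(n) for n in range(1, LIMIT + 1)))  # True
```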

Exercise 7 (hard, short-answer).

For six squares the formula is $r_{6} (n) = 4 \sum_{d ∣ n} (4 (\frac{- 4}{n / d}) - (\frac{- 4}{d})) d^{2}$ , which is genuinely more intricate than the four- and eight-square cases. Explain, in terms of $dim S_{3} (Γ_{0} (4))$ , why the six-square case sits between "purely Eisenstein" and "Eisenstein plus a non-zero cusp correction."

Hint

$θ^{6}$ has weight $3$ on $Γ_{0} (4)$ , an odd weight with a character. Consider whether the relevant cusp space is zero, and whether the character $(\frac{- 4}{\cdot})$ is forced by the half-integer-weight bookkeeping.

Answer

The form $θ^{6}$ has weight $3$ on $Γ_{0} (4)$ with the quadratic character $χ_{- 4} = (\frac{- 4}{\cdot})$ attached, because the rank $6$ gives integer weight $3$ but the discriminant of the sum-of-six-squares form introduces the level- $4$ character. The space $M_{3} (Γ_{0} (4), χ_{- 4})$ splits as an Eisenstein part plus $S_{3} (Γ_{0} (4), χ_{- 4})$ . The cusp space is zero in this particular weight-and-character combination, so $r_{6} (n)$ is again purely Eisenstein — but the Eisenstein subspace now carries the character $χ_{- 4}$ , which is why the formula mixes $(\frac{- 4}{d})$ and $(\frac{- 4}{n / d})$ rather than reducing to a plain divisor power sum. The intricacy is the character, not a cusp correction. In weights where the cusp space is non-zero — for instance $r_{12} (n)$ , where $S_{6} (Γ_{0} (4))$ contributes — the exact formula genuinely acquires a cusp-form term, and the representation count is a divisor sum plus an arithmetically subtler correction bounded by Deligne's estimate.
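The six-square formula with its character twist can be compared with direct series multiplication. This is a small illustrative check; `chi`, `six_square_formula`, and `sums_of_squares` are hypothetical names and the range tested is arbitrary.

```python
def chi(n):
    """The character chi_{-4}: +1 if n = 1 mod 4, -1 if n = 3 mod 4, 0 if n is even."""
    return (0, 1, 0, -1)[n % 4]

def six_square_formula(n):
    """4 * sum over d | n of (4*chi(n/d) - chi(d)) * d^2."""
    return 4 * sum((4 * chi(n // d) - chi(d)) * d * d
                   for d in range(1, n + 1) if n % d == 0)

def sums_of_squares(k, limit):
    """r_k(0..limit), by multiplying the truncated theta series k times."""
    theta = [0] * (limit + 1)
    m = 0
    while m * m <= limit:
        theta[m * m] = 1 if m == 0 else 2
        m += 1
    counts = [1] + [0] * limit
    for _ in range(k):
        counts = [sum(counts[i] * theta[n - i] for i in range(n + 1))
                  for n in range(limit + 1)]
    return counts

r6 = sums_of_squares(6, 20)
print(r6[:4])                                                     # [1, 12, 60, 160]
print(all(r6[n] == six_square_formula(n) for n in range(1, 21)))  # True
```

Both character values enter: for $n = 3$ the divisor $d = 1$ contributes $4 χ_{- 4} (3) - χ_{- 4} (1) = - 5$ and $d = 3$ contributes $(4 + 1) \cdot 9 = 45$ , giving $4 \cdot 40 = 160$ .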

Exercise 8 (hard, short-answer).

State the general strategy for obtaining an exact formula for $r_{Q} (n)$ for an arbitrary positive-definite integral form $Q$ of rank $m$ , and explain what changes between the cases " $θ_{Q}$ is determined by its genus" and " $θ_{Q}$ has a non-Eisenstein part."

Hint

Decompose $θ_{Q} = E_{Q} + f_{Q}$ with $E_{Q}$ the Eisenstein part (the genus theta series, a weighted average over the genus of $Q$ ) and $f_{Q}$ a cusp form. Siegel's mass formula evaluates the Eisenstein part.

Answer

The strategy is: (1) recognise $θ_{Q} \in M_{m /2} (Γ_{0} (N), χ_{Q})$ , a finite-dimensional space; (2) decompose $θ_{Q} = E_{Q} + f_{Q}$ into its Eisenstein part $E_{Q}$ and a cusp form $f_{Q}$ ; (3) identify $E_{Q}$ as the genus theta series, the suitably weighted average of $θ_{Q^{'}}$ over the forms $Q^{'}$ in the genus of $Q$ , whose Fourier coefficients are given in closed form by Siegel's mass formula as a product of local densities; (4) read off $r_{Q} (n)$ as the Eisenstein coefficient plus the cusp coefficient. When the cusp space $S_{m /2} (Γ_{0} (N), χ_{Q})$ is zero — as for sums of $\leq 8$ squares — the cusp part $f_{Q}$ vanishes, $θ_{Q}$ equals its genus average, and $r_{Q} (n)$ is exactly the Siegel main term: an exact closed formula. When the cusp space is non-zero, $f_{Q}$ contributes an additional term; the genus average $E_{Q}$ is still the dominant piece (the local-density main term), but $r_{Q} (n)$ is no longer determined by the genus alone — two forms in the same genus can represent $n$ differently, distinguished precisely by their cusp components. The cusp contribution is bounded by the Ramanujan-Petersson / Deligne estimate, making it lower-order, which is how the genus determines $r_{Q} (n)$ asymptotically even when it fails to determine it exactly.

Advanced results Master

The elementary statement "every theta series is a modular form" unfolds into a complete machine for representation numbers. We collect the structural results: the precise modularity of $θ_{Q}$ , the Eisenstein-cusp decomposition, Siegel's mass formula, and the half-integer-weight theory that houses $θ$ itself.

Modularity of $θ_{Q}$ (Schoeneberg-Pfetzer; Serre VII §6). Let $Q$ be a positive-definite even integral quadratic form of rank $m$ , with Gram matrix $A$ , level $N$ , and discriminant $D = \det A$ . Then $$ \theta_Q \;\in\; M_{m/2}\big(\Gamma_0(N), \chi_D\big), \qquad \chi_D(d) = \left( \frac{(-1)^{m/2} D}{d} \right), $$ for even $m$ , where $χ_{D}$ is the Kronecker character of the discriminant; for odd $m$ the form $θ_{Q}$ lies in the half-integer-weight space $M_{m /2} (Γ_{0} (4 N), χ_{D} ν_{θ}^{m})$ built on the theta multiplier. The proof generalises the Poisson-summation argument: applying multidimensional Poisson summation to the Gaussian $e^{- π t x^{T} A x}$ on the lattice $\mathbb{Z}^{m}$ produces the inversion law $θ_{Q} (- 1/ (N z)) = (const) \cdot (- i N z)^{m /2} θ_{Q^{*}} (z)$ relating $Q$ to its dual form $Q^{*}$ ; self-duality of the relevant combination assembles the modularity on $Γ_{0} (N)$ .

The Eisenstein-cusp decomposition. For $m \geq 3$ the space $M_{m /2} (Γ_{0} (N), χ_{D})$ has the orthogonal decomposition $$ M_{m/2} \;=\; \mathcal{E}_{m/2} \;\oplus\; S_{m/2}, $$ Eisenstein subspace plus cusp forms, with respect to the Petersson inner product of [21.04.02]. Writing $θ_{Q} = E_{Q} + f_{Q}$ , the Eisenstein component $E_{Q}$ depends only on the genus of $Q$ (the local equivalence class at every place, the same genus data classified by Hasse-Minkowski in [21.02.08]), while the cusp component $f_{Q}$ detects the finer class of $Q$ within its genus. Two forms in the same genus have the same Eisenstein part but generally different cusp parts.

Siegel's mass formula. The Fourier coefficients of the genus Eisenstein series are products of local representation densities. Siegel's 1935 theorem states that the weighted average of representation numbers over the classes in a genus equals a product of local densities: $$ \frac{\sum_{Q' \in \mathrm{gen}(Q)} r_{Q'}(n)/|\mathrm{Aut}(Q')|}{\sum_{Q' \in \mathrm{gen}(Q)} 1/|\mathrm{Aut}(Q')|} \;=\; \prod_{p \leq \infty} \beta_p(n) \;=\; \beta_\infty(n)\,\beta_2(n)\,\beta_3(n)\cdots, $$ where each $β_{p} (n)$ is a $p$ -adic density counting solutions of $Q (x) \equiv n$ modulo powers of $p$ , and $β_{\infty}$ is the real (archimedean) density. This is the analytic content of the Eisenstein part: the main term in every representation-number formula is a Siegel product of local densities, the global-to-local principle of [21.02.08] realised quantitatively. For sums of $\leq 8$ squares the genus has one class and no cusp form, so the mass formula is an exact identity, recovering the Jacobi formulas.

Half-integer weight and the Shimura correspondence. The series $θ$ itself, of weight $1/2$ , sits at the bottom of Shimura's 1973 theory of half-integral-weight modular forms. Shimura constructed a Hecke-equivariant lift $Sh : S_{k + 1/2} (Γ_{0} (4 N)) \to S_{2 k} (Γ_{0} (N))$ sending a half-integer-weight eigenform to an integral-weight eigenform with matching Hecke eigenvalues. The cusp parts $f_{Q}$ of theta series, living in half-integer weight when $m$ is odd, are governed by this correspondence; the Waldspurger theorem later expressed the squares of the Fourier coefficients of the half-integer-weight form in terms of central values of twisted $L$ -functions of the Shimura lift, tying representation numbers of ternary forms to $L$ -functions of elliptic curves.

Even unimodular lattices and level-one theta series. When $Q$ is the quadratic form of an even unimodular lattice (discriminant $1$ , even diagonal, forcing $m \equiv 0 \bmod 8$ ) the level is $N = 1$ and $θ_{Q}$ is a modular form on the full group $SL_{2} (\mathbb{Z})$ of weight $m /2$ , directly inside the ring $\mathbb{C} [E_{4}, E_{6}]$ of [21.04.01]. For $m = 8$ the unique such lattice is $E_{8}$ and $θ_{E_{8}} = E_{4}$ , giving $r_{E_{8}} (n) = 240 σ_{3} (n)$ . For $m = 16$ the two lattices $E_{8} \oplus E_{8}$ and $D_{16}^{+}$ have equal theta series $E_{4}^{2}$ : they are isospectral but non-isometric, Milnor's 1964 example of distinct lattices "sounding the same," because $\dim M_{8} (SL_{2} (\mathbb{Z})) = 1$ leaves no room for a distinguishing cusp form. For $m = 24$ the space $M_{12} (SL_{2} (\mathbb{Z}))$ is two-dimensional, and the theta series of the Niemeier lattices, including the Leech lattice, differ from one another by multiples of the weight- $12$ cusp form $Δ$ , the multiple being fixed by the number of vectors of norm $2$ .
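The identity $r_{E_{8}} (n) = 240 σ_{3} (n)$ can be checked by counting lattice vectors. The sketch below assumes the standard coordinate model of $E_{8}$ (vectors in $\mathbb{Z}^{8}$ or $(\mathbb{Z} + \frac{1}{2})^{8}$ with even coordinate sum), which the text does not spell out; the function names are hypothetical. To stay in integers it works with $y = 2 x$ and counts by dynamic programming over the eight coordinates rather than by enumeration.

```python
def e8_theta_coefficients(nmax):
    """Number of E_8 vectors x with x.x = 2n, for n = 0..nmax.

    With y = 2x the coordinates y_i are all even or all odd,
    sum(y) = 0 mod 4, and sum(y_i^2) = 8n.
    """
    limit = 8 * nmax
    totals = [0] * (nmax + 1)
    for parity in (0, 1):
        ys = [y for y in range(-limit, limit + 1)
              if y % 2 == parity and y * y <= limit]
        # table[s][r] = number of partial vectors with sum of squares s, sum = r mod 4
        table = [[0] * 4 for _ in range(limit + 1)]
        table[0][0] = 1
        for _ in range(8):
            new = [[0] * 4 for _ in range(limit + 1)]
            for s in range(limit + 1):
                for r in range(4):
                    ways = table[s][r]
                    if ways:
                        for y in ys:
                            if s + y * y <= limit:
                                new[s + y * y][(r + y) % 4] += ways
            table = new
        for n in range(nmax + 1):
            totals[n] += table[8 * n][0]
    return totals

def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

counts = e8_theta_coefficients(4)
print(counts)                                    # [1, 240, 2160, 6720, 17520]
print([240 * sigma3(n) for n in range(1, 5)])    # [240, 2160, 6720, 17520]
```

The $240$ shortest vectors split as $112$ with integer coordinates and $128$ with half-integer coordinates.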

Synthesis. The theta-series construction is the foundational reason the two halves of Serre's book join: the quadratic-form data of Part I (genus, class, local densities, the Hasse-Minkowski classification of [21.02.08]) becomes the Fourier-coefficient data of a modular form in Part II, and this is exactly the dictionary in which Siegel's mass formula is the Eisenstein main term and the cusp form is the local-to-global defect. The central insight is that representation numbers split as a genus-determined product of local densities plus a cusp correction; this generalises the elementary Jacobi formulas, where the cusp space vanishes, to arbitrary forms, where it does not, and it is dual to the geometric picture of theta series as sections over the moduli of lattices. The bridge is that counting lattice points at fixed length, a question of pure arithmetic, is read off a finite-dimensional space of modular forms, and the same finiteness that made $M_{k} (SL_{2} (\mathbb{Z}))$ computable in [21.04.01] makes every $r_{Q} (n)$ computable here. In short, the theta series is the load-bearing arch between quadratic forms and modular forms.

Full proof set Master

Proposition 1 (Jacobi's four-square theorem). For every positive integer $n$ , $$ r_4(n) \;=\; 8 \sum_{\substack{d \mid n \\ 4 \nmid d}} d. $$

Proof. The generating function is $θ^{4} = \sum_{n \geq 0} r_{4} (n) q^{n}$ , a modular form of weight $2$ on $Γ_{0} (4)$ . The space $M_{2} (Γ_{0} (4))$ has dimension $2$ and consists entirely of Eisenstein series, because the modular curve $X_{0} (4)$ has genus $0$ , so $S_{2} (Γ_{0} (4)) = 0$ . A basis is furnished by the two weight- $2$ Eisenstein series obtained from the quasi-modular $E_{2}$ by the level-raising combinations $$ \phi_1(z) = 2 E_2(2z) - E_2(z), \qquad \phi_2(z) = \tfrac{1}{3}\big(4 E_2(4z) - E_2(z)\big), $$ each of which is a genuine modular form of weight $2$ on $Γ_{0} (4)$ since the quasi-modular anomaly of $E_{2}$ cancels in the difference; the factor $\tfrac{1}{3}$ normalises the constant term of $ϕ_{2}$ to $1$ . Their $q$ -expansions are $$ \phi_1 = 1 + 24\sum_{n\geq 1}\Big(\sum_{d\mid n,\, d \text{ odd}} d\Big) q^n, \qquad \phi_2 = 1 + 8\sum_{n \geq 1}\Big(\sum_{d \mid n,\, 4 \nmid d} d\Big) q^n, $$ after collecting divisor sums from $E_{2} (z) = 1 - 24 \sum_{n \geq 1} σ_{1} (n) q^{n}$ .

Now $θ^{4} \in M_{2} (Γ_{0} (4))$ , so $θ^{4} = a ϕ_{1} + b ϕ_{2}$ for constants $a, b$ . Comparing constant terms: $θ^{4} = 1 + 8 q + \dots$ and $ϕ_{1}, ϕ_{2}$ both have constant term $1$ , giving $a + b = 1$ . Comparing the coefficient of $q^{1}$ : $r_{4} (1) = 8$ , while $[q^{1}] ϕ_{1} = 24$ and $[q^{1}] ϕ_{2} = 8$ , so $24 a + 8 b = 8$ . Solving $a + b = 1$ and $24 a + 8 b = 8$ gives $a = 0$ , $b = 1$ . Hence $θ^{4} = ϕ_{2}$ , and reading off the general coefficient, $$ r_4(n) \;=\; [q^n]\phi_2 \;=\; 8\sum_{d \mid n,\; 4 \nmid d} d. \qquad \square $$
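The coefficient comparison in this proof can be replayed in exact arithmetic. The sketch below is an illustration under stated assumptions: it builds $ϕ_{1}$ and $ϕ_{2}$ from the $q$ -expansion of $E_{2}$ , scaling the raw combination $4 E_{2} (4 z) - E_{2} (z)$ (constant term $3$ ) by $\frac{1}{3}$ so that $ϕ_{2}$ has constant term $1$ as the proof uses; the truncation `LIMIT` and all helper names are hypothetical choices.

```python
from fractions import Fraction

LIMIT = 24

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def e2_dilated(m):
    """Coefficients of E_2(m z) = 1 - 24 * sum sigma(n) q^(m n), truncated at q^LIMIT."""
    coeffs = [Fraction(1)] + [Fraction(0)] * LIMIT
    for n in range(1, LIMIT // m + 1):
        coeffs[m * n] = Fraction(-24 * sigma(n))
    return coeffs

e2, e2_at_2z, e2_at_4z = e2_dilated(1), e2_dilated(2), e2_dilated(4)
phi1 = [2 * x - y for x, y in zip(e2_at_2z, e2)]
phi2 = [(4 * x - y) / 3 for x, y in zip(e2_at_4z, e2)]   # scaled to constant term 1

# r_4(n) from the fourth power of the truncated theta series.
theta = [0] * (LIMIT + 1)
m = 0
while m * m <= LIMIT:
    theta[m * m] = 1 if m == 0 else 2
    m += 1
r4 = [1] + [0] * LIMIT
for _ in range(4):
    r4 = [sum(r4[i] * theta[n - i] for i in range(n + 1)) for n in range(LIMIT + 1)]

# Solve a*phi1 + b*phi2 = theta^4 on the coefficients of q^0 and q^1 (Cramer's rule).
det = phi1[0] * phi2[1] - phi1[1] * phi2[0]
a = (r4[0] * phi2[1] - r4[1] * phi2[0]) / det
b = (phi1[0] * r4[1] - phi1[1] * r4[0]) / det
print(a, b)                                                        # 0 1
print(all(a * p + b * s == r for p, s, r in zip(phi1, phi2, r4)))  # True
```

Only two coefficients are used to solve for $a$ and $b$ ; the final line checks that the resulting identity then holds at every computed coefficient, as the two-dimensionality of the space predicts.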

Corollary (Lagrange's four-square theorem). Every non-negative integer is a sum of four squares.

Proof. For $n \geq 1$ the divisor $d = 1$ always satisfies $4 ∤ 1$ , so the sum $\sum_{d ∣ n, 4 ∤ d} d \geq 1 > 0$ , whence $r_{4} (n) \geq 8 > 0$ : there is at least one representation. For $n = 0$ the representation $0 = 0^{2} + 0^{2} + 0^{2} + 0^{2}$ gives $r_{4} (0) = 1 > 0$ . So $r_{4} (n) > 0$ for every $n \geq 0$ . $□$

Proposition 2 (theta inversion forces half-integer weight). There is no integer $k$ for which $θ$ satisfies $θ (γ z) = (cz + d)^{k} θ (z)$ for all $γ \in Γ_{0} (4)$ with the identity multiplier; the correct automorphy factor is $(cz + d)^{1/2}$ times the theta multiplier $ν_{θ}$ .

Proof. Write $W : z \mapsto - 1/ (4 z)$ and $T : z \mapsto z + 1$ . By the Key theorem, $θ (W z) = \sqrt{- 2 i z} \, θ (z)$ and $θ (T z) = θ (z)$ ; the factor $\sqrt{- 2 i z}$ has modulus $∣ 2 z ∣^{1/2}$ , which is not $∣ 2 z ∣^{k}$ for any integer $k$ . To see the same inside $Γ_{0} (4)$ , take $γ = W T^{- 1} W^{- 1} = \begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix}$ . Applying the inversion law twice, with $θ (W z - 1) = θ (W z)$ in between, and squaring gives $θ (γ z)^{2} = (- 2 i) (- 1/ (4 z) - 1) \cdot (- 2 i z) \, θ (z)^{2} = (4 z + 1) \, θ (z)^{2}$ , so $θ^{2}$ transforms with weight $1$ under $γ$ . If $θ (γ z) = (4 z + 1)^{k} θ (z)$ held with $k \in \mathbb{Z}$ , squaring would give $(4 z + 1)^{2 k} = 4 z + 1$ wherever $θ (z) \neq 0$ , forcing $2 k = 1$ , which has no integer solution. Hence $θ$ cannot carry an integer weight: its weight is $1/2$ , exactly half that of $θ^{2}$ . The multiplier $ν_{θ}$ is then forced as the eighth-root-of-unity cocycle making the half-integer automorphy factor consistent under composition, $ν_{θ} (γ_{1} γ_{2}) = ν_{θ} (γ_{1}) ν_{θ} (γ_{2}) \cdot σ (γ_{1}, γ_{2})$ with $σ$ the sign ambiguity of the square root. $□$

Proposition 3 (the two-square theorem from $θ^{2}$ ). For $n \geq 1$ , $r_{2} (n) = 4 (d_{1} (n) - d_{3} (n))$ , where $d_{j} (n) = \# \{ d ∣ n : d \equiv j \bmod 4 \}$ .

Proof. The form $θ^{2}$ has weight $1$ on $Γ_{0} (4)$ with the character $χ_{- 4} = (\frac{- 4}{\cdot})$ , the non-principal character modulo $4$ . The space $M_{1} (Γ_{0} (4), χ_{- 4})$ is one-dimensional and spanned by the single weight- $1$ Eisenstein series $$ E_{1,\chi_{-4}}(z) \;=\; 1 + 4\sum_{n \geq 1}\Big(\sum_{d \mid n}\chi_{-4}(d)\Big) q^n, $$ where $χ_{- 4} (d) = + 1$ for $d \equiv 1 \bmod 4$ , $- 1$ for $d \equiv 3 \bmod 4$ , and $0$ for even $d$ . Since $θ^{2}$ has constant term $1$ and lies in this one-dimensional space, $θ^{2} = E_{1, χ_{- 4}}$ . The inner divisor sum is $\sum_{d ∣ n} χ_{- 4} (d) = d_{1} (n) - d_{3} (n)$ , so reading off the coefficient gives $r_{2} (n) = 4 (d_{1} (n) - d_{3} (n))$ . In particular $r_{2} (n) > 0$ exactly when $n$ has more divisors congruent to $1$ than to $3$ modulo $4$ , recovering the classical two-square criterion: $n$ is a sum of two squares iff every prime $\equiv 3 \bmod 4$ divides $n$ to an even power. $□$
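Both the formula of Proposition 3 and the prime-factorisation criterion can be checked against a brute-force count. This is a small Python sketch under stated assumptions: the function names `r2_bruteforce`, `r2_formula`, `two_square_criterion` and the range $1 \leq n \leq 500$ are illustrative and do not come from the source.

```python
import math


def r2_bruteforce(n):
    """Number of pairs (x, y) in Z^2 with x^2 + y^2 = n, for n >= 1."""
    count = 0
    bound = math.isqrt(n)
    for x in range(-bound, bound + 1):
        rest = n - x * x
        y = math.isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2  # y and -y, unless y = 0
    return count


def r2_formula(n):
    """4 * (d_1(n) - d_3(n)), counting divisors by residue class mod 4."""
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return 4 * (d1 - d3)


def two_square_criterion(n):
    """True iff every prime p = 3 mod 4 divides n to an even power."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            if p % 4 == 3 and exponent % 2 == 1:
                return False
        p += 1
    # Whatever remains is 1 or a single prime occurring to the first power.
    return not (n > 1 and n % 4 == 3)


for n in range(1, 501):
    assert r2_bruteforce(n) == r2_formula(n)
    assert (r2_formula(n) > 0) == two_square_criterion(n)
```

For instance, $n = 5$ gives $d_{1} = 2$ , $d_{3} = 0$ and hence $r_{2} (5) = 8$ , while $n = 21 = 3 \cdot 7$ gives $d_{1} = d_{3} = 2$ and $r_{2} (21) = 0$ .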

Connections Master

Modular forms on $SL_{2} (Z)$ 21.04.01. The entire method rests on the finite-dimensionality of spaces of modular forms established there: $θ^{k}$ lives in a space $M_{k /2} (Γ_{0} (4))$ of known, small dimension, so matching a few Fourier coefficients pins it down completely. The level- $1$ structure theorem $M_{*} (SL_{2} (Z)) = C [E_{4}, E_{6}]$ is the special case governing theta series of even unimodular lattices, where $θ_{E_{8}} = E_{4}$ and the rank- $16$ isospectral pair both equal $E_{4}^{2}$ because $dim M_{8} = 1$ .
Hecke operators and Hecke algebra 21.04.02. The Eisenstein-cusp decomposition $θ_{Q} = E_{Q} + f_{Q}$ is an eigenspace decomposition for the Hecke operators, and the cusp part $f_{Q}$ is analysed through its Hecke eigenvalues. Self-adjointness of the Hecke operators under the Petersson inner product is what makes the orthogonal projection onto the Eisenstein subspace well-defined, so that $E_{Q}$ — the genus theta series with Siegel's local-density coefficients — is separated cleanly from the class-distinguishing cusp form $f_{Q}$ .
Hasse-Minkowski theorem and quadratic forms over $Q$ 21.02.08. The Eisenstein part $E_{Q}$ depends only on the genus of $Q$ , the local-equivalence data at every place that the Hasse-Minkowski local-global principle organises. Siegel's mass formula expresses the genus average of representation numbers as a product of the same local densities $β_{p}$ that the local classification of forms supplies; theta series is the quantitative, modular-form-valued refinement of the local-global principle, counting solutions rather than merely deciding solvability.
Quadratic residues and the Legendre symbol 21.01.06. The two-square theorem emerges in Proposition 3 from $θ^{2} = E_{1, χ_{- 4}}$ , whose coefficients are the character sum $\sum_{d ∣ n} χ_{- 4} (d)$ with $χ_{- 4}$ the Legendre-symbol-based character modulo $4$ . The criterion " $n$ is a sum of two squares iff every prime $\equiv 3 mod 4$ appears to even order" singles out exactly the odd primes modulo which $- 1$ is not a quadratic residue, linking the modular-forms count back to elementary reciprocity.
Theta function on a Riemann surface / Jacobian 06.06.05. This unit's $θ (z) = \sum_{n} q^{n^{2}}$ is a one-variable specialisation of the multivariable Riemann theta function $Θ (z, τ) = \sum_{n} e^{π i n^{T} τ n + 2 π i n^{T} z}$ on a principally polarised abelian variety, taken at $z = 0$ with the period matrix $τ$ replaced by the Gram matrix of $Q$ . The contrast is instructive: there the theta function is a section of a line bundle on a complex torus encoding its geometry; here, evaluated at the origin and summed against a quadratic form, it is an arithmetic generating function. The two theta functions share the Poisson-summation transformation law $Θ (0, - τ^{- 1}) = (det (- i τ))^{1/2} Θ (0, τ)$ ; the modular inversion of this unit is its rank-one shadow.
Riemann zeta function 21.03.01. The completed zeta function is the Mellin transform of the theta series of this unit: writing $ϑ (t) = θ (i t /2) = \sum_{n} e^{- π n^{2} t}$ , one has $π^{- s /2} Γ (s /2) ζ (s) = \frac{1}{2} \int_{0}^{\infty} (ϑ (t) - 1) t^{s /2} d t / t$ for $\operatorname{Re} (s) > 1$ , and the functional equation $ξ (s) = ξ (1 - s)$ is the Mellin image of the theta inversion $ϑ (1/ t) = t^{1/2} ϑ (t)$ proved here by Poisson summation. The same analytic identity drives both the modularity of $θ$ and the symmetry of $ζ$ .
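The one-dimensionality argument in the first connection above ( $dim M_{8} = 1$ ) can be illustrated at the level of Eisenstein series: $E_{4}^{2}$ must equal the normalised weight- $8$ Eisenstein series $1 + 480 \sum_{n \geq 1} σ_{7} (n) q^{n}$ , which is equivalent to $σ_{7} (n) = σ_{3} (n) + 120 \sum_{m = 1}^{n - 1} σ_{3} (m) σ_{3} (n - m)$ . A short Python sketch of the coefficient comparison follows; the names `sigma`, `eis4`, `eis8` and the cutoff `N = 60` are illustrative assumptions rather than anything from the source.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)


N = 60  # illustrative cutoff

# q-expansions of the normalised level-1 Eisenstein series of weights 4 and 8.
eis4 = [1] + [240 * sigma(3, n) for n in range(1, N + 1)]
eis8 = [1] + [480 * sigma(7, n) for n in range(1, N + 1)]

# Cauchy product of eis4 with itself, truncated at q^N.
E4_squared = [sum(eis4[i] * eis4[n - i] for i in range(n + 1)) for n in range(N + 1)]

# Two weight-8 forms with constant term 1 coincide because dim M_8 = 1.
assert E4_squared == eis8
```

The same mechanism is what forces the theta series of any even unimodular lattice of rank $16$ to equal $E_{4}^{2}$ ; the code checks only the Eisenstein-series side and does not enumerate lattice vectors.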

Historical & philosophical context Master

The theta function entered analysis through Jacobi's 1829 Fundamenta Nova ^{[Jacobi 1829]}, where the products and series now bearing his name were the engine of the theory of elliptic functions. Jacobi derived the two-square formula $r_{2} (n) = 4 (d_{1} (n) - d_{3} (n))$ and the four-square formula ( $r_{4} (n)$ equal to $8$ times the sum of divisors of $n$ not divisible by $4$ ) as identities among theta functions, by manipulating infinite products and the addition formulas for elliptic functions. These were exact counting formulas, sharper than the qualitative four-square theorem of Lagrange 1770 ^{[Lagrange 1770]}, which had established only that every non-negative integer is a sum of four squares: the bare positivity $r_{4} (n) > 0$ that Jacobi's formula now explained by exhibiting the count. The conceptual leap was that an arithmetic quantity, the number of representations, was the Fourier coefficient of an analytic object with a hidden transformation symmetry.

The modern framing — theta series of a quadratic form as a modular form, with representation numbers read off as Fourier coefficients — crystallised in the twentieth century. Hecke and his school recognised the transformation behaviour as modularity on congruence subgroups; Siegel's 1935 mass formula expressed the genus average of representation numbers as a product of local densities, founding the analytic theory of quadratic forms and giving the Eisenstein part of every theta series a closed form. Schoeneberg and Pfetzer established the precise modularity of $θ_{Q}$ on $Γ_{0} (N)$ with character. The half-integer-weight theory, latent in $θ$ 's own weight $1/2$ , was placed on rigorous foundations by Shimura's 1973 Annals paper ^{[Shimura 1973]}, which defined modular forms of half-integral weight via the theta multiplier and constructed the correspondence lifting them to integral weight — the framework in which the cusp parts of odd-rank theta series are now understood, and through which Waldspurger later connected ternary representation numbers to central $L$ -values.

Philosophically, the theta-series method is Serre's chosen climax in A Course in Arithmetic: the bridge from Part I, the algebraic and local theory of quadratic forms, to Part II, the analytic theory of modular forms. The two subjects, developed independently, meet in a single function whose existence is forced by Poisson summation and whose coefficients are simultaneously arithmetic counts and analytic data. That a problem as elementary as "in how many ways is $n$ a sum of four squares" is solved completely, and a problem as elementary-seeming as the general $r_{Q} (n)$ is solved up to a controlled cusp-form error, is a paradigm for the unity of mathematics: the rigidity of a symmetry in one domain dictates exact answers in another.

Bibliography Master

@book{Serre1973CinA,
  author = {Serre, Jean-Pierre},
  title = {A Course in Arithmetic},
  series = {Graduate Texts in Mathematics},
  volume = {7},
  publisher = {Springer},
  year = {1973},
}

@book{Jacobi1829,
  author = {Jacobi, Carl Gustav Jacob},
  title = {Fundamenta Nova Theoriae Functionum Ellipticarum},
  publisher = {Borntr\"ager, K\"onigsberg},
  year = {1829},
}

@article{Lagrange1770,
  author = {Lagrange, Joseph-Louis},
  title = {D\'emonstration d'un th\'eor\`eme d'arithm\'etique},
  journal = {Nouveaux M\'emoires de l'Acad\'emie Royale des Sciences et Belles-Lettres de Berlin},
  year = {1770},
}

@article{Shimura1973,
  author = {Shimura, Goro},
  title = {On modular forms of half integral weight},
  journal = {Annals of Mathematics},
  volume = {97},
  year = {1973},
  pages = {440--481},
}

@article{Siegel1935,
  author = {Siegel, Carl Ludwig},
  title = {{\"U}ber die analytische {T}heorie der quadratischen {F}ormen},
  journal = {Annals of Mathematics},
  volume = {36},
  year = {1935},
  pages = {527--606},
}

@book{Iwaniec1997,
  author = {Iwaniec, Henryk},
  title = {Topics in Classical Automorphic Forms},
  series = {Graduate Studies in Mathematics},
  volume = {17},
  publisher = {American Mathematical Society},
  year = {1997},
}

@book{Miyake1989theta,
  author = {Miyake, Toshitsune},
  title = {Modular Forms},
  series = {Springer Monographs in Mathematics},
  publisher = {Springer},
  year = {1989},
}

@book{Ogg1969,
  author = {Ogg, Andrew},
  title = {Modular Forms and Dirichlet Series},
  publisher = {W. A. Benjamin},
  year = {1969},
}

@book{ConwaySloane1999,
  author = {Conway, John H. and Sloane, Neil J. A.},
  title = {Sphere Packings, Lattices and Groups},
  series = {Grundlehren der mathematischen Wissenschaften},
  volume = {290},
  edition = {3rd},
  publisher = {Springer},
  year = {1999},
}

@book{HardyWright2008,
  author = {Hardy, G. H. and Wright, E. M.},
  title = {An Introduction to the Theory of Numbers},
  edition = {6th},
  publisher = {Oxford University Press},
  year = {2008},
}

Prerequisites

21.04.01
21.04.02
21.02.08
21.01.06

Tier anchors

beginner: Serre *A Course in Arithmetic* (Springer GTM 7, 1973) Ch. VII §6 (informal opening on $\theta(z) = \sum q^{n^2}$ and counting representations); Hardy-Wright *An Introduction to the Theory of Numbers* (Oxford, 6th ed. 2008) Ch. XX (elementary sums-of-squares discussion)
intermediate: Serre *A Course in Arithmetic* Ch. VII §6 (theta series, Poisson summation, weight-$1/2$ modularity, the four-square theorem via $M_2(\Gamma_0(4))$); Iwaniec *Topics in Classical Automorphic Forms* (AMS GSM 17, 1997) Ch. 10-11 (theta functions and half-integer-weight forms)
master: Serre *A Course in Arithmetic* Ch. VII §§6-7 (full theta-series development, the sum-of-squares formulas, the bridge from Part I quadratic forms to Part II modular forms); Iwaniec *Topics in Classical Automorphic Forms* Ch. 10-11 (theta series of quadratic forms, the Weil representation, half-integer weight); Shimura 1973 *Ann. of Math.* 97 (originator paper: modular forms of half-integral weight, the Shimura correspondence to integral weight); Ogg *Modular Forms and Dirichlet Series* (Benjamin 1969) Ch. VI (theta series and Eisenstein series); Miyake *Modular Forms* (Springer 1989) Ch. 4 §4.9 (theta series of positive-definite forms as modular forms); Conway-Sloane *Sphere Packings, Lattices and Groups* (Springer Grundlehren 290, 3rd ed. 1999) Ch. 2-7 (theta series of lattices, modular-form identities); Jacobi 1829 *Fundamenta Nova Theoriae Functionum Ellipticarum* (originator: the two- and four-square formulas via theta-function identities)

References

Serre, J.-P. — A Course in Arithmetic · Springer Graduate Texts in Mathematics 7 (1973). Ch. VII §6-7 (the theta series $\theta(z) = \sum_{n} q^{n^2}$ and its modular transformation under $z \mapsto -1/z$ via Poisson summation; theta series $\theta_Q$ of a positive-definite integral quadratic form as a modular form of weight $m/2$; the representation numbers $r_Q(n)$ as Fourier coefficients; the strategy of expressing $\theta_Q$ inside a finite-dimensional space of modular forms as Eisenstein-plus-cusp to extract exact formulas; the four-square theorem $r_4(n) = 8 \sum_{d \mid n, 4 \nmid d} d$). The signature exposition closing the arc from quadratic forms to modular forms.
Jacobi, C. G. J. — Fundamenta Nova Theoriae Functionum Ellipticarum · Königsberg (1829). The originator work on elliptic and theta functions; §40-66 derive the two-square formula $r_2(n) = 4 \sum_{d \mid n} \chi(d)$ with $\chi$ the non-principal character mod $4$, and the four-square formula $r_4(n) = 8 \sigma(n)$ for odd $n$ (and $24$ times the odd-divisor sum for even $n$), both by theta-function identities, the analytic ancestor of the modular-forms proof.
Shimura, G. — On modular forms of half integral weight · *Annals of Mathematics* 97 (1973), 440-481. The originator paper rigorously founding modular forms of half-integral weight on $\Gamma_0(4N)$ with the theta-multiplier automorphy factor, and constructing the Shimura correspondence lifting a half-integral-weight Hecke eigenform of weight $k + 1/2$ to an integral-weight form of weight $2k$. The theoretical home of the weight-$1/2$ theta series.
Iwaniec, H. — Topics in Classical Automorphic Forms · American Mathematical Society Graduate Studies in Mathematics 17 (1997). Ch. 10 (theta functions, the Jacobi theta and its transformation law via Poisson summation), Ch. 11 (theta series of quadratic forms, half-integer-weight modular forms, the analytic theory of representation numbers). The modern analytic-number-theory reference for the theta-series machinery.
Conway, J. H. & Sloane, N. J. A. — Sphere Packings, Lattices and Groups · Springer Grundlehren der mathematischen Wissenschaften 290 (3rd ed. 1999). Ch. 2-4 (lattices and their theta series), Ch. 7 (theta-series identities and modular forms, the $E_8$ and Leech-lattice theta series as explicit modular forms). The lattice-theoretic companion: theta series of even unimodular lattices are level-$1$ modular forms.
Miyake, T. — Modular Forms · Springer Monographs in Mathematics (English 1989, corrected 2006). Ch. 4 §4.9 (theta series of positive-definite integral quadratic forms as modular forms of weight $m/2$ on a congruence subgroup determined by the level of the form; the transformation behaviour under the theta group). The comprehensive monograph treatment.
Ogg, A. — Modular Forms and Dirichlet Series · W. A. Benjamin (1969). Ch. VI (theta series, Eisenstein series, and the explicit determination of representation numbers by projecting a theta series onto the Eisenstein subspace). An early lucid account of the modular-forms approach to sums of squares.
Hardy, G. H. & Wright, E. M. — An Introduction to the Theory of Numbers · Oxford University Press (6th ed., revised Heath-Brown and Silverman, 2008). Ch. XX (representation of a number by two, four, and more squares; Lagrange's four-square theorem and Jacobi's exact formulas by elementary and generating-function methods). The elementary counterpart to the modular-forms derivation.
Lagrange, J.-L. — Démonstration d'un théorème d'arithmétique · *Nouveaux Mémoires de l'Académie de Berlin* (1770). The proof that every non-negative integer is a sum of four squares — the qualitative statement $r_4(n) > 0$ that Jacobi's exact formula $r_4(n) = 8\sum_{d \mid n, 4 \nmid d} d$ later refined to a count.

Estimated time

beginner: 20m
intermediate: 50m
master: 95m