21.01.08 · number-theory / elementary

Pell equation and continued fractions

shipped3 tiersLean: none

Anchor (Master): Brahmagupta 628 CE *Brāhmasphuṭasiddhānta* ch. 18 §§64-75 (originator: the **samasa-bhavana** composition law $(x_1, y_1, N_1) \diamond (x_2, y_2, N_2) = (x_1 x_2 + D y_1 y_2, x_1 y_2 + x_2 y_1, N_1 N_2)$ that solves $x^2 - D y^2 = N$ by composition; the first general method for the Pell equation in any culture); Bhaskara II 1150 CE *Bījagaṇita* §§71-90 (**chakravala** cyclic algorithm — completes Brahmagupta's method to a guaranteed-terminating procedure, found by Jayadeva c. 1000 CE according to Udayadivakara's commentary); Brouncker-Wallis 1657-58 correspondence (Brouncker's continued-fraction-style solution method, Wallis's exposition in *Commercium Epistolicum* 1658); Euler 1765 *Novi Comm. Acad. Petrop.* 11, 28-66 (mis-attribution of the equation to John Pell, who never worked on it; correct continued-fraction algorithm for $\sqrt D$); Lagrange 1770 *Mém. Acad. Berlin* (1768) 24, 165-310 (first complete proof of solvability of Pell, periodicity of continued fraction of every quadratic irrational, full structure of solutions); Dirichlet 1837 *Werke* I, 313 + Dirichlet-Dedekind 1894 *Vorlesungen über Zahlentheorie* §§80-100 (pigeonhole proof of Pell solvability without continued fractions, structure of unit group of $\mathbb{Q}(\sqrt D)$); Liouville 1844 *C. R. Acad. Sci. Paris* 18, 883-885 + 910-911 (Liouville bound: algebraic irrational of degree $n$ cannot be approximated by rationals to order better than $n$; founding the theory of transcendence); Thue 1909 *J. reine angew. Math.* 135, 284-305 (Thue's improvement of Liouville from exponent $n$ to $n/2 + 1$); Siegel 1921 *Math. Z.* 10, 173-213 (Siegel's improvement to exponent $2 \sqrt n$); Roth 1955 *Mathematika* 2, 1-20 (Roth's theorem: for every algebraic irrational $\alpha$ and every $\varepsilon > 0$, only finitely many rationals $p/q$ satisfy $|\alpha - p/q| < q^{-2 - \varepsilon}$ — Fields Medal 1958); Lang 1962 *Diophantine Geometry* Ch. VII (modern algebraic treatment of Pell, units, and Roth); Cassels 1957 *An Introduction to Diophantine Approximation* (Cambridge Tracts 45) Ch. II-III; Khinchin 1964 *Continued Fractions* (Chicago); Hardy-Wright 2008 §§10-11

Intuition Beginner

The Pell equation is the integer equation $x^{2} - D y^{2} = 1$ for a fixed positive integer $D$ that is not a perfect square. The unknowns $x, y$ are positive integers, and the question is whether interesting solutions exist beyond the boring case $(1, 0)$ . That boring pair $(x, y) = (1, 0)$ always works; what we want is positive-integer pairs with $y \geq 1$ .

A cattle problem. Bhaskara II's Bījagaṇita (1150 CE) poses, in a cattle-counting puzzle, the question: find positive integers $x, y$ with $x^{2} - 8 y^{2} = 1$ . Test small $y$ . For $y = 1$ we need $x^{2} = 9$ , so $x = 3$ . The pair $(3, 1)$ solves $9 - 8 = 1$ .

Bhaskara's actual cattle problems demanded much larger $D$ values. The famous $D = 61$ case has minimal solution $x = 1766319049$ , $y = 226153980$ — a $10$ -digit number found by chakravala in 1150 CE, then re-discovered by Brouncker in 1657 in response to a challenge from Fermat. For the small $D = 2$ case the answer is $(x, y) = (3, 2)$ : $9 - 8 = 1$ .

Why is $D$ required to be non-square? If $D = d^{2}$ is a perfect square, the left-hand side factors as $(x - d y) (x + d y) = 1$ , so the only integer solutions are $(x - d y, x + d y) = (\pm 1, \pm 1)$ with the signs matched. This gives $x = \pm 1$ and $y = 0$ , the degenerate cases. So the equation is interesting exactly when $D$ is not a perfect square — and then it always has infinitely many positive-integer solutions, as Lagrange proved in 1770.

Solutions multiply. The deeper observation, going back to Brahmagupta in 628 CE, is that two solutions multiply to make a new solution. If $(x_{1}, y_{1})$ and $(x_{2}, y_{2})$ both solve $x^{2} - D y^{2} = 1$ , consider the product $(x_{1} + y_{1} D) (x_{2} + y_{2} D) = (x_{1} x_{2} + D y_{1} y_{2}) + (x_{1} y_{2} + x_{2} y_{1}) D$ .

Writing $X = x_{1} x_{2} + D y_{1} y_{2}$ and $Y = x_{1} y_{2} + x_{2} y_{1}$ , the algebraic identity $(a + b) (a - b) = a^{2} - b^{2}$ applied to the conjugate pair gives $X^{2} - D Y^{2} = (x_{1}^{2} - D y_{1}^{2}) (x_{2}^{2} - D y_{2}^{2}) = 1 \cdot 1 = 1$ . So $(X, Y)$ is another Pell solution. Iterating with a fixed minimal solution $(x_{1}, y_{1})$ generates an infinite chain of solutions.

The minimal solution. Among all positive-integer Pell solutions $(x, y)$ with $y \geq 1$ , there is a smallest one — the fundamental solution, denoted $(x_{1}, y_{1})$ . Every other positive-integer solution is generated by multiplying $x_{1} + y_{1} D$ by itself: the $n$ -th power $(x_{1} + y_{1} D)^{n} = x_{n} + y_{n} D$ gives the $n$ -th positive solution $(x_{n}, y_{n})$ . So once the fundamental solution is in hand, every other solution is one matrix-multiplication away. The hard work is finding the fundamental solution — and this is where continued fractions enter.

Why does it matter? Pell equations are the elementary face of the unit group of the ring $Z [D]$ — the multiplicative group of elements $α = x + y D$ with norm $α \overset{α}{ˉ} = x^{2} - D y^{2} = \pm 1$ . The fundamental Pell solution is the fundamental unit, and the full solution group is generated by it.

Solving $x^{2} - 2 y^{2} = 1$ to find $(3, 2)$ is, in disguise, computing the unit group of $Z [2]$ as the infinite cyclic group $⟨ 1 + 2 ⟩ \times {\pm 1}$ . This is the entry point to the theory of units in number fields, which culminates in the Dirichlet unit theorem (1846) for general number fields.

Visual Beginner

The Pell solutions $(x, y)$ are integer points on the hyperbola $x^{2} - D y^{2} = 1$ . For $D = 2$ the hyperbola is $x^{2} - 2 y^{2} = 1$ , and the picture shows the first few lattice points on its right branch.

The picture captures the central structural fact: Pell solutions are integer points on a hyperbola, with consecutive solutions related by multiplication by the fundamental unit. The ratios $y_{n} / x_{n}$ are best rational approximations to $1/ D$ in the sense that no rational with smaller denominator gets closer — the convergents of the continued-fraction expansion of $1/ D$ . This is the geometric bridge between Pell, continued fractions, and Diophantine approximation.

Worked example Beginner

Find all positive-integer solutions of $x^{2} - 2 y^{2} = 1$ up to $y \leq 100$ , and verify the multiplication law.

Step 1. Find the fundamental solution by inspection of small $y$ . For $y = 1$ we need $x^{2} = 3$ , no integer. For $y = 2$ we need $x^{2} = 9$ , so $x = 3$ . The fundamental solution is therefore $(x_{1}, y_{1}) = (3, 2)$ . Check: $9 - 8 = 1$ .

Step 2. Generate the next solution by the multiplication law. Compute $(3 + 22)^{2} = 9 + 122 + 8 = 17 + 122$ , so $(x_{2}, y_{2}) = (17, 12)$ . Check: $289 - 288 = 1$ .

Step 3. Generate the next solution. Compute $(3 + 22)^{3} = (3 + 22) (17 + 122) = 51 + 362 + 342 + 48 = 99 + 702$ , so $(x_{3}, y_{3}) = (99, 70)$ . Check: $9801 - 9800 = 1$ .

Step 4. The next solution would be $(x_{4}, y_{4}) = (577, 408)$ with $y_{4} = 408 > 100$ , so we stop. The complete list of positive-integer solutions with $y \leq 100$ is $(3, 2), (17, 12), (99, 70)$ .

Step 5. Verify uniqueness by exhaustion. For each $y \in {1, 2, \dots, 100}$ , compute $2 y^{2} + 1$ and test whether it is a perfect square. The only values of $y$ in this range that give a perfect square are $y = 2, 12, 70$ , matching our list. So the three solutions are the only ones in this range.

Step 6. Verify the multiplication law more explicitly. The recursion $(x_{n + 1}, y_{n + 1}) = (3 x_{n} + 4 y_{n}, 2 x_{n} + 3 y_{n})$ follows from $(x_{n} + y_{n} 2) (3 + 22) = (3 x_{n} + 4 y_{n}) + (2 x_{n} + 3 y_{n}) 2$ .

Starting from $(x_{1}, y_{1}) = (3, 2)$ : $(x_{2}, y_{2}) = (3 \cdot 3 + 4 \cdot 2, 2 \cdot 3 + 3 \cdot 2) = (17, 12)$ . From $(17, 12)$ : $(x_{3}, y_{3}) = (3 \cdot 17 + 4 \cdot 12, 2 \cdot 17 + 3 \cdot 12) = (99, 70)$ . The recursion grows the solutions by a factor of $3 + 22 \approx 5.83$ at each step.

The same algorithm works for any non-square $D$ , but the fundamental solution can be astronomically large. For $D = 61$ , the smallest case Fermat chose to challenge the English in 1657, the fundamental solution is $(1766319049, 226153980)$ — a $10$ -digit answer that Bhaskara II found by chakravala in 1150 CE without paper or modern arithmetic notation.

Check your understanding Beginner

Formal definition Intermediate+

We restate the Pell-equation setup precisely.

Definition (Pell equation). For $D$ a positive integer that is not a perfect square, the Pell equation is $x^{2} - D y^{2} = 1$ in positive integers $x, y$ . A pair $(x, y) \in Z_{> 0}^{2}$ with $x^{2} - D y^{2} = 1$ is a Pell solution. The fundamental solution is the solution with smallest $y_{1} \geq 1$ , equivalently the solution with smallest $x_{1} + y_{1} D$ subject to $x_{1}, y_{1} \geq 1$ .

Definition (negative Pell equation). The equation $x^{2} - D y^{2} = - 1$ is the negative Pell equation. Unlike the Pell equation, the negative Pell equation is not always solvable: solutions exist iff the period of the continued-fraction expansion of $D$ is odd (proved in the Key Theorem below).

Definition (generalised Pell equation). For $N \in Z$ non-zero, the generalised Pell equation is $x^{2} - D y^{2} = N$ in integers $x, y$ . The set of solutions, when non-empty, is a finite union of orbits under the multiplicative action of the Pell unit group ${\pm (x_{1} + y_{1} D)^{n} : n \in Z}$ on $Z [D]$ .

Definition (continued fraction). For $α \in R$ , the continued-fraction expansion is the sequence $α = [a_{0}; a_{1}, a_{2}, a_{3}, \dots]$ produced by the recursion $$ \alpha_0 = \alpha, \qquad a_n = \lfloor \alpha_n \rfloor, \qquad \alpha_{n + 1} = \frac{1}{\alpha_n - a_n} $$ (continuing as long as $α_{n}$ is not an integer). The $a_{n}$ are the partial quotients, all integers with $a_{n} \geq 1$ for $n \geq 1$ . The convergents are the rationals $$ \frac{p_n}{q_n} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}} $$ computed by the linear recursion $p_{n} = a_{n} p_{n - 1} + p_{n - 2}$ , $q_{n} = a_{n} q_{n - 1} + q_{n - 2}$ with initial values $p_{- 1} = 1, p_{- 2} = 0$ , $q_{- 1} = 0, q_{- 2} = 1$ .

Definition (purely periodic and reduced quadratic irrationals). A real number $α$ is a quadratic irrational iff $α$ satisfies an irreducible quadratic polynomial with rational coefficients. The continued fraction $[a_{0}; \overline{a_{1}, \dots, a_{ℓ}}]$ is purely periodic with period $ℓ$ iff $a_{n + ℓ} = a_{n}$ for all $n \geq 0$ , equivalently $α = α_{ℓ}$ . A quadratic irrational $α$ is reduced iff $α > 1$ and $- 1 < \overset{α}{ˉ} < 0$ , where $\overset{α}{ˉ}$ is the algebraic conjugate.

The convergents $p_{n} / q_{n}$ satisfy $g cd (p_{n}, q_{n}) = 1$ and the best-approximation property: for every rational $p / q$ with $1 \leq q \leq q_{n}$ , $∣ q_{n} α - p_{n} ∣ \leq ∣ q α - p ∣$ , with equality only if $p / q = p_{n} / q_{n}$ . The cross-ratio identity $p_{n} q_{n - 1} - p_{n - 1} q_{n} = (- 1)^{n - 1}$ holds for all $n \geq 0$ .

Counterexamples to common slips

"The Pell equation has the degenerate solution $(1, 0)$ counted as fundamental." The fundamental solution is the smallest positive-integer solution with $y \geq 1$ , not the degenerate $(1, 0)$ . The degenerate solution corresponds to the unit $1 \in Z [D]^{\times}$ , not to a generator of the infinite-cyclic part of the unit group.
" $D$ negative gives an interesting Pell equation." False. For $D < 0$ , the equation $x^{2} + ∣ D ∣ y^{2} = 1$ has only finitely many integer solutions (in fact at most four: $(\pm 1, 0)$ when $∣ D ∣ > 1$ , and four solutions when $∣ D ∣ = 1$ ). The infinite-solution-set phenomenon is specific to indefinite quadratic forms — i.e., $D > 0$ non-square.
"The continued-fraction expansion of every irrational is eventually periodic." False. Lagrange's theorem says $α$ has eventually-periodic continued fraction iff $α$ is a quadratic irrational. For algebraic numbers of degree $\geq 3$ or transcendental numbers like $e$ and $π$ , the continued fraction is non-periodic. (The continued fraction of $e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \dots]$ has a pattern but is not periodic.)
"The negative Pell equation $x^{2} - D y^{2} = - 1$ is always solvable for $D$ non-square." False. The negative Pell equation is solvable iff the period of $D$ 's continued fraction is odd. Examples: $D = 2$ has period $1$ (odd), so $- 1$ is solvable: $1^{2} - 2 \cdot 1^{2} = - 1$ . $D = 3$ has period $2$ (even), so $- 1$ is not solvable. $D = 5$ has period $1$ (odd), solvable: $2^{2} - 5 \cdot 1^{2} = - 1$ . $D = 6$ has period $2$ (even), not solvable. The criterion is a deep consequence of Lagrange's algorithm.
"The Liouville bound applies to all real numbers." False. Liouville's theorem applies only to algebraic numbers: if $α$ has degree $n$ , then $∣ α - p / q ∣ > c (α) q^{- n}$ for all rationals $p / q$ . For transcendental $α$ , no such bound exists — and Liouville used this to construct the first explicit transcendental numbers as those violating the bound for every $n$ .

Key theorem with proof Intermediate+

The signature theorem of this unit is Lagrange's theorem on the existence of Pell solutions via continued fractions. We prove it via the periodicity of quadratic-irrational continued fractions.

Theorem (Lagrange 1770, Pell solvability via continued fractions). Let $D$ be a positive integer that is not a perfect square. Let $D = [a_{0}; \overline{a_{1}, a_{2}, \dots, a_{ℓ}}]$ be the continued-fraction expansion, with period $ℓ \geq 1$ . Let $p_{n} / q_{n}$ be the convergents. Then:

(a) The continued fraction of $D$ is eventually periodic, with $a_{n} = a_{n - ℓ + 1}$ for all $n \geq 1$ and $a_{ℓ} = 2 a_{0}$ .

(b) The convergent $p_{ℓ - 1} / q_{ℓ - 1}$ satisfies $p_{ℓ - 1}^{2} - D q_{ℓ - 1}^{2} = (- 1)^{ℓ}$ . If $ℓ$ is even, $(p_{ℓ - 1}, q_{ℓ - 1})$ is the fundamental Pell solution. If $ℓ$ is odd, $(p_{ℓ - 1}, q_{ℓ - 1})$ solves the negative Pell equation, and the fundamental Pell solution is $(p_{2 ℓ - 1}, q_{2 ℓ - 1})$ .

(c) Every positive Pell solution is $(p_{m ℓ - 1}, q_{m ℓ - 1})$ for some integer $m \geq 1$ (with $m ℓ$ taken to be even).

Proof. We establish the three parts in sequence.

Step 1 — Periodicity of the continued fraction of $D$ . The expansion $D = [a_{0}; a_{1}, a_{2}, \dots]$ is generated by the recursion $α_{0} = D$ , $a_{n} = ⌊ α_{n} ⌋$ , $α_{n + 1} = 1/ (α_{n} - a_{n})$ . We claim each $α_{n}$ is a quadratic irrational of the form $α_{n} = (P_{n} + D) / Q_{n}$ with $P_{n}, Q_{n} \in Z$ , $Q_{n} ∣ D - P_{n}^{2}$ , and (for $n \geq 1$ ) $α_{n}$ is reduced.

Induction on $n$ . Base case $n = 0$ : $α_{0} = D = (0 + D) /1$ , so $P_{0} = 0$ and $Q_{0} = 1$ ; $α_{0}$ is not reduced (since $\overset{α}{ˉ}_{0} = - D < - 1$ ), but the reducedness will hold for $n \geq 1$ .

Inductive step. Suppose $α_{n} = (P_{n} + D) / Q_{n}$ with $Q_{n} ∣ D - P_{n}^{2}$ . Then $a_{n} = ⌊ α_{n} ⌋$ , and $$ \alpha_{n + 1} = \frac{1}{\alpha_n - a_n} = \frac{Q_n}{P_n - a_n Q_n + \sqrt D} = \frac{Q_n (a_n Q_n - P_n + \sqrt D)}{D - (P_n - a_n Q_n)^2}. $$ Set $P_{n + 1} = a_{n} Q_{n} - P_{n}$ and $Q_{n + 1} = (D - P_{n + 1}^{2}) / Q_{n}$ . We must verify $Q_{n + 1} \in Z$ and $Q_{n + 1} ∣ D - P_{n + 1}^{2}$ . The first: $D - P_{n + 1}^{2} = D - (a_{n} Q_{n} - P_{n})^{2} = D - a_{n}^{2} Q_{n}^{2} + 2 a_{n} Q_{n} P_{n} - P_{n}^{2}$ . By the inductive hypothesis $Q_{n} ∣ D - P_{n}^{2}$ , and the other terms are multiples of $Q_{n}$ , so $Q_{n} ∣ D - P_{n + 1}^{2}$ and $Q_{n + 1} = (D - P_{n + 1}^{2}) / Q_{n} \in Z$ . The second: $D - P_{n + 1}^{2} = Q_{n} Q_{n + 1}$ , so $Q_{n + 1} ∣ D - P_{n + 1}^{2}$ .

Reducedness for $n \geq 1$ : we show $α_{n} > 1$ and $- 1 < \overset{α}{ˉ}_{n} < 0$ . The first is immediate since $a_{n} \geq 1$ in the recursion. The second is a standard induction: if $α_{n}$ is reduced, then $\overset{α}{ˉ}_{n + 1} = 1/ (\overset{α}{ˉ}_{n} - a_{n}) \in (- 1, 0)$ because $\overset{α}{ˉ}_{n} - a_{n} < - a_{n} \leq - 1$ , so $∣ \overset{α}{ˉ}_{n} - a_{n} ∣ > 1$ , hence $∣1/ (\overset{α}{ˉ}_{n} - a_{n}) ∣ < 1$ , and the sign is negative because $\overset{α}{ˉ}_{n} - a_{n} < 0$ .

Now the key finiteness: for reduced $α_{n} = (P_{n} + D) / Q_{n}$ , the integers $P_{n}, Q_{n}$ are bounded. Reducedness gives $α_{n} > 1$ and $\overset{α}{ˉ}_{n} = (P_{n} - D) / Q_{n} \in (- 1, 0)$ . Adding: $α_{n} + \overset{α}{ˉ}_{n} = 2 P_{n} / Q_{n}$ , so $P_{n} / Q_{n} \in (0, α_{n} /2 + 1)$ . Subtracting: $α_{n} - \overset{α}{ˉ}_{n} = 2 D / Q_{n} > 0$ , so $Q_{n} > 0$ . From $\overset{α}{ˉ}_{n} \in (- 1, 0)$ , $P_{n} - D \in (- Q_{n}, 0)$ , so $D - Q_{n} < P_{n} < D$ . From $α_{n} > 1$ , $P_{n} + D > Q_{n}$ , so $P_{n} > Q_{n} - D$ . Combining: $max (0, Q_{n} - D) < P_{n} < D$ , and $Q_{n} < 2 D$ (from $α_{n} \cdot ∣ \overset{α}{ˉ}_{n} ∣ = ∣ D - P_{n}^{2} ∣/ Q_{n}^{2} \leq 1$ and a short manipulation). So $P_{n}, Q_{n}$ each take only finitely many integer values.

Since the pair $(P_{n}, Q_{n})$ determines $α_{n}$ and there are only finitely many possible pairs, the sequence $α_{1}, α_{2}, \dots$ must repeat. The first repetition gives the period $ℓ$ , and the algorithm is periodic from index $1$ onward.

The final identity $a_{ℓ} = 2 a_{0}$ is a standard consequence of the conjugation symmetry of the recursion (see Hardy-Wright §10.10).

Step 2 — Pell solutions from convergents. The convergents $p_{n} / q_{n}$ of the continued fraction of $D$ satisfy the basic identity $$ p_n^2 - D q_n^2 = (-1)^{n + 1} Q_{n + 1}, $$ where $Q_{n + 1}$ is the denominator in the recursion of Step 1. (Proof: induction on $n$ ; the base case $n = 0$ gives $a_{0}^{2} - D = - Q_{1}$ , and the step uses the recursion $p_{n} = a_{n} p_{n - 1} + p_{n - 2}$ together with $α_{n + 1} = (P_{n + 1} + D) / Q_{n + 1}$ .)

At $n = ℓ - 1$ : $Q_{ℓ} = Q_{0} = 1$ (since the period closes at index $ℓ$ , so $α_{ℓ} = α_{0}$ and the denominators match). Hence $$ p_{\ell - 1}^2 - D q_{\ell - 1}^2 = (-1)^\ell. $$

If $ℓ$ is even, $(p_{ℓ - 1}, q_{ℓ - 1})$ satisfies $x^{2} - D y^{2} = 1$ , a Pell solution.

If $ℓ$ is odd, $(p_{ℓ - 1}, q_{ℓ - 1})$ satisfies $x^{2} - D y^{2} = - 1$ , a negative-Pell solution. Squaring the negative-Pell solution as an element of $Z [D]$ : $(p_{ℓ - 1} + q_{ℓ - 1} D)^{2} = (p_{ℓ - 1}^{2} + D q_{ℓ - 1}^{2}) + 2 p_{ℓ - 1} q_{ℓ - 1} D$ . The norm is $(- 1)^{2} = 1$ , so the rational and irrational parts $(p_{ℓ - 1}^{2} + D q_{ℓ - 1}^{2}, 2 p_{ℓ - 1} q_{ℓ - 1})$ form a Pell solution. The same is $(p_{2 ℓ - 1}, q_{2 ℓ - 1})$ by the convergent identity applied at index $2 ℓ - 1$ (using $Q_{2 ℓ} = Q_{0} = 1$ from period- $ℓ$ structure repeated twice).

Step 3 — Fundamental solution and complete generation. Suppose $(x, y)$ is any positive Pell solution with $y \geq 1$ . We show $(x, y) = (p_{m ℓ - 1}, q_{m ℓ - 1})$ for some $m \geq 1$ . The best-approximation property of convergents says: for every rational $p / q$ with $∣ q D - p ∣ < 1/ (2 q)$ , $p / q$ must be a convergent of $D$ . From $x^{2} - D y^{2} = 1$ , we have $x - y D = 1/ (x + y D)$ , so $∣ x - y D ∣ = 1/ (x + y D) < 1/ (2 y D) < 1/ (2 y)$ (using $x + y D > 2 y D$ for $x, y \geq 1$ ). Hence $x / y$ is a convergent, $x / y = p_{n} / q_{n}$ , and since $g cd (p_{n}, q_{n}) = 1 = g cd (x, y)$ (the latter from $x^{2} - D y^{2} = 1$ ), $(x, y) = (p_{n}, q_{n})$ .

The convergent identity $p_{n}^{2} - D q_{n}^{2} = (- 1)^{n + 1} Q_{n + 1}$ shows $(p_{n}, q_{n})$ is a Pell solution iff $Q_{n + 1} = 1$ and $n$ is odd. Since $Q_{n + 1} = 1$ holds exactly when the index $n + 1$ is a multiple of $ℓ$ (from periodicity), the Pell solutions are precisely $(p_{m ℓ - 1}, q_{m ℓ - 1})$ for $m \geq 1$ (with $m ℓ$ even, i.e., $m$ even when $ℓ$ is odd and $m$ arbitrary when $ℓ$ is even). The fundamental Pell solution is the one with smallest $m$ . $□$

Bridge. Lagrange's theorem is the elementary face of the Dirichlet unit theorem: for the real quadratic field $K = Q (D)$ with $D > 0$ non-square, the unit group of the ring of integers $O_{K}$ is $O_{K}^{\times} = {\pm 1} \times ⟨ ε ⟩$ where $ε$ is the fundamental unit. The fundamental unit corresponds to the fundamental Pell (or negative-Pell) solution: when the negative-Pell equation is solvable, $ε$ has norm $- 1$ and $ε^{2}$ generates the Pell solutions; when it is not solvable, $ε$ itself has norm $+ 1$ and generates the Pell solutions directly. This is the first substantive instance of the general Dirichlet unit theorem (Dirichlet 1846), which gives the rank of the unit group of any number field as $r_{1} + r_{2} - 1$ where $r_{1}, r_{2}$ are the numbers of real and complex embeddings. For real quadratic $K$ the rank is $2 - 1 = 1$ , so the free part is infinite cyclic, generated by the fundamental unit — Pell.

Exercises Intermediate+

Exercise 3 (medium, numeric).

Compute the continued-fraction expansion of $13$ and use Lagrange's theorem to read off the fundamental Pell solution of $x^{2} - 13 y^{2} = 1$ .

Hint

Run the algorithm $α_{0} = 13$ , $a_{n} = ⌊ α_{n} ⌋$ , $α_{n + 1} = 1/ (α_{n} - a_{n})$ until the partial quotients repeat. The period length $ℓ$ determines whether the fundamental Pell solution is at the convergent $p_{ℓ - 1} / q_{ℓ - 1}$ (period even) or $p_{2 ℓ - 1} / q_{2 ℓ - 1}$ (period odd).

Answer

$(x_{1}, y_{1}) = (649, 180)$ . Run the algorithm.

$α_{0} = 13 \approx 3.606$ , $a_{0} = 3$ . $α_{1} = 1/ (13 - 3) = (13 + 3) /4 \approx 1.651$ , $a_{1} = 1$ . $α_{2} = 1/ (α_{1} - 1) = 4/ (13 - 1) = (13 + 1) /3 \approx 1.535$ , $a_{2} = 1$ . $α_{3} = 1/ (α_{2} - 1) = 3/ (13 - 2) = (13 + 2) /3 \approx 1.868$ , $a_{3} = 1$ . $α_{4} = 1/ (α_{3} - 1) = 3/ (13 - 1) = (13 + 1) /4 \approx 1.151$ , $a_{4} = 1$ . $α_{5} = 1/ (α_{4} - 1) = 4/ (13 - 3) = 13 + 3 \approx 6.606$ , $a_{5} = 6 = 2 a_{0}$ .

So $13 = [3; \overline{1, 1, 1, 1, 6}]$ with period $ℓ = 5$ (odd).

Convergents: $p_{0} / q_{0} = 3/1$ ; $p_{1} / q_{1} = (1 \cdot 3 + 1) / (1 \cdot 1 + 0) = 4/1$ ; $p_{2} / q_{2} = (1 \cdot 4 + 3) / (1 \cdot 1 + 1) = 7/2$ ; $p_{3} / q_{3} = (1 \cdot 7 + 4) / (1 \cdot 2 + 1) = 11/3$ ; $p_{4} / q_{4} = (1 \cdot 11 + 7) / (1 \cdot 3 + 2) = 18/5$ .

Check: $1 8^{2} - 13 \cdot 5^{2} = 324 - 325 = - 1$ , so $(18, 5)$ is the fundamental negative-Pell solution.

Period $ℓ = 5$ is odd, so the fundamental Pell solution is $(p_{9}, q_{9})$ . Continue the convergent recursion: $p_{5} / q_{5} = (6 \cdot 18 + 11) / (6 \cdot 5 + 3) = 119/33$ , etc. — or use the shortcut $(p_{9} + q_{9} 13) = (18 + 513)^{2} = 324 + 18013 + 325 = 649 + 18013$ . So $(x_{1}, y_{1}) = (649, 180)$ . Check: $64 9^{2} = 421201$ and $13 \cdot 18 0^{2} = 13 \cdot 32400 = 421200$ , hence $421201 - 421200 = 1$ .

Exercise 4 (medium, symbolic).

Prove the multiplication law for Pell solutions: if $(x_{1}, y_{1})$ and $(x_{2}, y_{2})$ both solve $x^{2} - D y^{2} = 1$ , then $(x_{1} x_{2} + D y_{1} y_{2}, x_{1} y_{2} + x_{2} y_{1})$ also solves $x^{2} - D y^{2} = 1$ .

Hint

The norm map $N : Z [D] \to Z$ , $N (a + b D) = a^{2} - D b^{2}$ , is multiplicative: $N (α β) = N (α) N (β)$ . Apply this to $α = x_{1} + y_{1} D$ and $β = x_{2} + y_{2} D$ .

Answer

Set $α = x_{1} + y_{1} D$ and $β = x_{2} + y_{2} D$ in $Z [D]$ . The conjugation $σ : a + b D \mapsto a - b D$ is a ring automorphism of $Z [D]$ . The norm $N (α) = α \cdot σ (α) = x_{1}^{2} - D y_{1}^{2} = 1$ by hypothesis, similarly $N (β) = 1$ .

By multiplicativity of the norm (an immediate consequence of $σ$ being a ring homomorphism), $N (α β) = N (α) N (β) = 1$ .

Expand $α β = (x_{1} + y_{1} D) (x_{2} + y_{2} D) = x_{1} x_{2} + x_{1} y_{2} D + x_{2} y_{1} D + y_{1} y_{2} D = (x_{1} x_{2} + D y_{1} y_{2}) + (x_{1} y_{2} + x_{2} y_{1}) D$ . Writing $X = x_{1} x_{2} + D y_{1} y_{2}$ and $Y = x_{1} y_{2} + x_{2} y_{1}$ , $α β = X + Y D$ and $N (α β) = X^{2} - D Y^{2} = 1$ . So $(X, Y)$ is a Pell solution.

This is Brahmagupta's identity (628 CE), the foundational composition principle for Pell-type equations.

Rubric: full credit for the norm-multiplicativity argument and the explicit expansion of $α β$ .

Exercise 5 (medium, symbolic).

Prove that all positive-integer Pell solutions of $x^{2} - D y^{2} = 1$ form an infinite cyclic group under the multiplication law of Exercise 4, generated by the fundamental solution.

Hint

Use Lagrange's theorem (Key Theorem above) to show every solution is a convergent $(p_{m ℓ - 1}, q_{m ℓ - 1})$ . Then show these convergents are exactly the powers of $ε_{1} = x_{1} + y_{1} D$ .

Answer

Set $ε_{n} = x_{n} + y_{n} D$ for the $n$ -th positive Pell solution $(x_{n}, y_{n})$ . By the multiplication law (Exercise 4), $ε_{m} ε_{n} = ε_{m + n}$ in the sense that the product corresponds to a Pell solution; we must show this product is $ε_{m + n}$ with the labelling by ordered solutions.

Order the positive Pell solutions $(x_{n}, y_{n})$ by increasing $y_{n}$ (equivalently, by increasing $ε_{n} > 1$ ). The fundamental solution is $ε_{1}$ , the smallest such element.

Claim: $ε_{n} = ε_{1}^{n}$ for all $n \geq 1$ .

Induction. Base case $n = 1$ : immediately $ε_{1} = ε_{1}^{1}$ . Step: assume $ε_{k} = ε_{1}^{k}$ for $k = 1, 2, \dots, n$ , and consider $ε_{1}^{n + 1}$ . By the multiplication law (Exercise 4), $ε_{1}^{n + 1} = ε_{1} \cdot ε_{1}^{n} = ε_{1} \cdot ε_{n}$ corresponds to a Pell solution. We show this is $ε_{n + 1}$ .

Suppose for contradiction that $ε_{1}^{n + 1} \neq = ε_{n + 1}$ . Since $ε_{1}^{n + 1}$ is a Pell solution by the multiplication law, it equals some $ε_{m}$ with $m \neq = n + 1$ . The ordering $1 < ε_{1} < ε_{2} < \dots$ corresponds to $ε_{1} < ε_{1}^{2} < ε_{1}^{3} < \dots$ (since $ε_{1} > 1$ ). So $ε_{1}^{n + 1}$ lies strictly between $ε_{1}^{n} = ε_{n}$ and $ε_{1}^{n + 2}$ .

If $ε_{n + 1}$ lay strictly between $ε_{n}$ and $ε_{1}^{n + 1}$ , then $ε_{1}^{- n} ε_{n + 1}$ would be a Pell solution (closure under the multiplication and inversion in $Z [D]^{\times}$ — units of norm $1$ form a group) strictly between $1$ and $ε_{1}$ . This contradicts $ε_{1}$ being the smallest Pell solution greater than $1$ .

So $ε_{n + 1} = ε_{1}^{n + 1}$ , completing the induction.

The negative direction: $ε_{1}^{- 1}$ is the conjugate $\overset{ε}{ˉ}_{1} = x_{1} - y_{1} D$ (since $ε_{1} \overset{ε}{ˉ}_{1} = x_{1}^{2} - D y_{1}^{2} = 1$ ). For $n \geq 1$ , $ε_{- n} = ε_{1}^{- n}$ corresponds to the Pell solution $(x_{n}, - y_{n})$ , which under the sign convention $y > 0$ is equivalent to $(x_{n}, y_{n})$ . Including the sign $\pm 1$ , the full unit group of norm $+ 1$ in $Z [D]$ is ${\pm ε_{1}^{n} : n \in Z}$ , isomorphic to $Z /2 \times Z$ .

Rubric: full credit for the multiplication-law application, the ordering argument, and the contradiction-by-smallness step.

Exercise 6 (medium, symbolic).

Prove Dirichlet's pigeonhole bound: for every irrational $α$ , there exist infinitely many rationals $p / q$ with $∣ α - p / q ∣ < 1/ q^{2}$ .

Hint

Fix $N \in Z_{> 0}$ and consider the $N + 1$ fractional parts ${q α}$ for $q = 0, 1, \dots, N$ . By pigeonhole, two of these fall in the same subinterval of length $1/ N$ in $[0, 1)$ .

Answer

Fix $N \geq 1$ . The $N + 1$ values ${0 \cdot α}, {1 \cdot α}, \dots, {N \cdot α}$ all lie in $[0, 1)$ , where ${x}$ denotes the fractional part. Partition $[0, 1)$ into $N$ equal subintervals $[0, 1/ N), [1/ N, 2/ N), \dots, [(N - 1) / N, 1)$ . By pigeonhole, two of the $N + 1$ values fall in the same subinterval, say ${q_{1} α}$ and ${q_{2} α}$ with $0 \leq q_{1} < q_{2} \leq N$ . Then $∣ {q_{2} α} - {q_{1} α} ∣ < 1/ N$ .

Set $q = q_{2} - q_{1} \in {1, 2, \dots, N}$ and $p = ⌊ q_{2} α ⌋ - ⌊ q_{1} α ⌋$ . Then $q α - p = {q_{2} α} - {q_{1} α}$ , so $∣ q α - p ∣ < 1/ N$ , equivalently $∣ α - p / q ∣ < 1/ (q N) \leq 1/ q^{2}$ (since $q \leq N$ ).

This gives one rational $p / q$ with $∣ α - p / q ∣ < 1/ q^{2}$ and $q \leq N$ . To get infinitely many, observe that for $α$ irrational, the inequality $∣ α - p / q ∣ < 1/ q^{2}$ has no solution with arbitrary small $q$ (because we can always pick $N > 1/∣ α - p / q ∣$ to construct a strictly better approximation). So the $q$ values produced as $N \to \infty$ are unbounded, giving infinitely many distinct rationals $p / q$ with the property.

This is the Dirichlet approximation theorem (Dirichlet 1842 Bericht Preuss. Akad.) — the first explicit pigeonhole-based result in number theory.

Rubric: full credit for the pigeonhole partition, the subtraction step, and the infinitely-many extension.

Exercise 7 (hard, symbolic).

Use the Dirichlet bound from Exercise 6 to give the non-constructive proof of Pell solvability (without continued fractions): for every non-square $D > 0$ , $x^{2} - D y^{2} = 1$ has a positive-integer solution.

Hint

By Exercise 6, there are infinitely many $(p, q)$ with $∣ p - q D ∣ < 1/ q$ . Bound $∣ p^{2} - D q^{2} ∣$ in terms of these, then use pigeonhole on residues modulo $M$ for a fixed $M$ , and apply Brahmagupta's composition (Exercise 4) to convert a solution of $x^{2} - D y^{2} = N$ into a Pell solution.

Answer

By the Dirichlet bound (Exercise 6 applied to $α = D$ ), there exist infinitely many integer pairs $(p, q)$ with $q \geq 1$ and $∣ p - q D ∣ < 1/ q$ .

Bound the norm: $∣ p^{2} - D q^{2} ∣ = ∣ p - q D ∣ \cdot ∣ p + q D ∣ < (1/ q) (2 q D + 1/ q) = 2 D + 1/ q^{2} < 2 D + 1$ . So the integer values $p^{2} - D q^{2}$ are bounded in absolute value by $⌈ 2 D + 1 ⌉$ .

By pigeonhole on the finite set ${N \in Z : ∣ N ∣ \leq 2 D + 1, N \neq = 0}$ , some non-zero $N$ is attained by infinitely many pairs $(p, q)$ . (The case $N = 0$ is excluded since $D$ is irrational, so $p^{2} = D q^{2}$ forces $p = q = 0$ contradicting $q \geq 1$ .)

Now apply pigeonhole again: among the infinitely many pairs $(p_{i}, q_{i})$ with $p_{i}^{2} - D q_{i}^{2} = N$ , restrict to those with fixed residues $p_{i} \equiv p_{0} (mod ∣ N ∣)$ and $q_{i} \equiv q_{0} (mod ∣ N ∣)$ . Since there are only $∣ N ∣^{2}$ possible residue pairs, some residue class $(p_{0}, q_{0})$ is attained by infinitely many pairs, in particular by at least two distinct pairs $(p_{1}, q_{1}) \neq = (p_{2}, q_{2})$ .

Compute the quotient in $Q (D)$ : $$ \frac{p_1 + q_1 \sqrt D}{p_2 + q_2 \sqrt D} = \frac{(p_1 + q_1 \sqrt D)(p_2 - q_2 \sqrt D)}{p_2^2 - D q_2^2} = \frac{(p_1 p_2 - D q_1 q_2) + (p_2 q_1 - p_1 q_2) \sqrt D}{N}. $$

The numerator coefficients: $p_{1} p_{2} - D q_{1} q_{2} \equiv p_{0}^{2} - D q_{0}^{2} = N \equiv 0 (mod ∣ N ∣)$ (using the residue congruences); similarly $p_{2} q_{1} - p_{1} q_{2} \equiv p_{0} q_{0} - p_{0} q_{0} = 0 (mod ∣ N ∣)$ . So both numerator coefficients are divisible by $N$ , and the quotient simplifies to $X + Y D$ with $X, Y \in Z$ .

Compute the norm: $N (X + Y D) = N (p_{1} + q_{1} D) / N (p_{2} + q_{2} D) = N / N = 1$ . So $(X, Y)$ is a Pell solution (with $X^{2} - D Y^{2} = 1$ ). Since the two pairs $(p_{1}, q_{1}) \neq = (p_{2}, q_{2})$ are distinct, the quotient is not $1$ , hence $(X, Y) \neq = (\pm 1, 0)$ , and $∣ Y ∣ \geq 1$ . Replacing $(X, Y)$ by $(∣ X ∣, ∣ Y ∣)$ if necessary, we get a positive-integer Pell solution. $□$

This is Dirichlet's pigeonhole proof of Pell solvability — a clean conceptual route bypassing continued fractions, and the elementary face of the Dirichlet unit theorem for $Q (D)$ .

Rubric: full credit for the Dirichlet-bound application, the two pigeonhole steps (on the value $N$ and on residues mod $∣ N ∣$ ), the quotient computation in $Q (D)$ , and the norm-multiplicativity conclusion.

Exercise 8 (hard, symbolic).

State and prove the Liouville approximation bound: if $α$ is an algebraic number of degree $n \geq 2$ , there exists $c (α) > 0$ such that $∣ α - p / q ∣ > c (α) q^{- n}$ for every rational $p / q$ with $q \geq 1$ .

Hint

Let $f \in Z [x]$ be the minimal polynomial of $α$ . Bound $∣ f (p / q) ∣$ from below by $1/ q^{n}$ (since $q^{n} f (p / q)$ is a non-zero integer) and from above by $∣ p / q - α ∣ \cdot M$ for $M = max_{∣ x - α ∣ \leq 1} ∣ f^{'} (x) ∣$ .

Answer

Let $f (x) \in Z [x]$ be the minimal polynomial of $α$ — the monic-or-integer-coefficient polynomial of smallest degree with $f (α) = 0$ , normalised so that $f$ has integer coefficients with $g cd = 1$ and positive leading coefficient. By assumption $de g f = n \geq 2$ , so $f$ is irreducible over $Q$ and has no rational roots.

Let $p / q$ be any rational with $q \geq 1$ . Then $f (p / q) \in Q$ , and $q^{n} f (p / q)$ is a non-zero integer (non-zero because $f$ has no rational roots; integer because $q^{n}$ clears the denominators in $f (p / q) = \sum_{i} a_{i} p^{i} / q^{i}$ with $a_{i} \in Z$ ). Hence $$ |q^n f(p/q)| \geq 1, \qquad \text{equivalently} \qquad |f(p/q)| \geq q^{-n}. $$

Bound $∣ f (p / q) ∣$ from above by the mean-value theorem. If $∣ p / q - α ∣ > 1$ , the conclusion $∣ α - p / q ∣ > q^{- n}$ holds immediately for $c \leq 1$ (since $q^{- n} \leq 1$ ).

Otherwise $∣ p / q - α ∣ \leq 1$ , and there exists $ξ$ between $α$ and $p / q$ with $f (p / q) = f (α) + f^{'} (ξ) (p / q - α) = f^{'} (ξ) (p / q - α)$ (using $f (α) = 0$ ). Set $M = sup_{∣ x - α ∣ \leq 1} ∣ f^{'} (x) ∣$ , a positive finite constant depending only on $α$ (since $f^{'}$ is a polynomial). Then $∣ f (p / q) ∣ \leq M \cdot ∣ p / q - α ∣$ .

Combine: $M \cdot ∣ p / q - α ∣ \geq ∣ f (p / q) ∣ \geq q^{- n}$ , so $$ |\alpha - p/q| \geq \frac{1}{M q^n}. $$

Set $c (α) = min (1, 1/ M) > 0$ . Then $∣ α - p / q ∣ > c (α) q^{- n}$ for every rational $p / q$ with $q \geq 1$ . $□$

The bound has two famous consequences. First, the Pell convergents $(p_{n}, q_{n})$ of $D$ satisfy $∣ p_{n} / q_{n} - D ∣ < 1/ q_{n}^{2}$ , saturating Liouville's bound for the quadratic case $n = 2$ — so quadratic irrationals are exactly the irrationals admitting infinitely many rational approximations to order $2$ but not to order $> 2$ (by the Liouville bound). Second, the Liouville numbers $α = \sum_{k} 1 0^{- k!}$ admit rational approximations to every order $n$ by truncating the series, so they are transcendental — the first explicit examples of transcendental real numbers (Liouville 1844).

The Liouville exponent $n$ is far from sharp for $n \geq 3$ . Thue (1909) improved it to $n /2 + 1 + ε$ ; Siegel (1921) to $2 n + ε$ ; Roth (1955) to $2 + ε$ — sharp by the Dirichlet bound (Exercise 6). Roth's theorem is the optimal endpoint of the Liouville-Thue-Siegel-Roth tower and earned Roth the Fields Medal 1958.

Rubric: full credit for the minimal-polynomial setup, the lower bound $∣ f (p / q) ∣ \geq q^{- n}$ , the mean-value upper bound $∣ f (p / q) ∣ \leq M ∣ p / q - α ∣$ , and the combination with the constant $c (α)$ .

Lean formalization Intermediate+

Mathlib provides scattered pieces of the Pell-and-continued-fractions infrastructure, but no unified pedagogical module.

-- Operative imports: Mathlib.NumberTheory.Pell,
-- Mathlib.NumberTheory.Zsqrtd.Basic,
-- Mathlib.Analysis.SpecialFunctions.ContinuedFractions

#check @Pell.xn
-- The x-coordinate of the n-th positive Pell solution for non-square D.

#check @Pell.yn
-- The y-coordinate of the n-th positive Pell solution.

#check @Pell.pell_eq
-- The Pell identity: xn^2 - D * yn^2 = 1 for all n.

#check @Pell.exists_pell_solution
-- Existence of a positive (y >= 1) Pell solution for non-square D > 0.

#check @Pell.is_fundamental
-- The fundamental Pell solution is the smallest positive solution.

#check @Zsqrtd.norm
-- The norm map N(a + b * sqrt d) = a^2 - d * b^2 on Z[sqrt d].

#check @Zsqrtd.norm_mul
-- Norm multiplicativity N(alpha * beta) = N(alpha) * N(beta).

#check @ContinuedFractions.convergents
-- The convergents p_n / q_n of a real-valued continued fraction.

#check @ContinuedFractions.bestApproximation
-- The best-approximation property of convergents.

The Mathlib formalisation covers the existence of the fundamental Pell solution, the cyclic structure of the solution set, the norm-multiplicative ring $Z [d]$ , and the basic theory of continued-fraction convergents. What Mathlib has not yet packaged as a unified pedagogical module: (i) Lagrange's periodicity theorem for the continued fraction of every quadratic irrational; (ii) the explicit algorithm reading the fundamental Pell solution off the convergent of $D$ immediately before the period closes; (iii) Dirichlet's pigeonhole proof of Pell solvability bypassing continued fractions; (iv) the chakravala algorithm of Brahmagupta-Bhaskara with constructive termination; (v) the criterion that the negative Pell equation $x^{2} - D y^{2} = - 1$ is solvable iff the period of $D$ 's continued fraction is odd; (vi) the Liouville approximation bound and the Thue-Siegel-Roth tower. These are the formalisation gaps documented in the unit metadata Mathlib gap analysis.

Advanced results Master

Theorem 1 (Lagrange periodicity, 1770). For every quadratic irrational $α$ , the continued-fraction expansion $α = [a_{0}; a_{1}, a_{2}, \dots]$ is eventually periodic. Moreover, $α$ is purely periodic (i.e., $α = [\overline{a_{0}; a_{1}, \dots, a_{ℓ}}]$ ) iff $α$ is a reduced quadratic irrational: $α > 1$ and $- 1 < \overset{α}{ˉ} < 0$ ^{[Lagrange 1770]}. Proved in the Key Theorem above (Step 1) via finiteness of the integer parameters $(P_{n}, Q_{n})$ in the reduced form $α_{n} = (P_{n} + D) / Q_{n}$ . The pure-periodicity characterisation is due to Galois (1828 Ann. de Gergonne 19, 294), generalising Lagrange's result.

Theorem 2 (Pell solvability, Lagrange 1770). For every non-square $D > 0$ , the Pell equation $x^{2} - D y^{2} = 1$ has a positive-integer solution ^{[Lagrange 1770]}. Proved in the Key Theorem above as a corollary of Lagrange periodicity: the convergent $p_{ℓ - 1} / q_{ℓ - 1}$ of $D = [a_{0}; \overline{a_{1}, \dots, a_{ℓ}}]$ satisfies $p_{ℓ - 1}^{2} - D q_{ℓ - 1}^{2} = (- 1)^{ℓ}$ , so $(p_{ℓ - 1}, q_{ℓ - 1})$ is a Pell solution when $ℓ$ is even and $(p_{2 ℓ - 1}, q_{2 ℓ - 1})$ is a Pell solution when $ℓ$ is odd. This is the first complete proof in European mathematics — 620 years after Bhaskara II's chakravala (1150 CE).

Theorem 3 (cyclic structure of the Pell solution set). The positive-integer solutions of $x^{2} - D y^{2} = 1$ are exactly $(x_{n}, y_{n}) =$ rational and irrational parts of $(x_{1} + y_{1} D)^{n}$ for $n \geq 1$ , where $(x_{1}, y_{1})$ is the fundamental solution ^{[Hardy-Wright 2008]}. Proved as Exercise 5 above via the multiplication law and the ordering argument. The full unit group $O_{Q (D)}^{\times}$ of the ring of integers (when $Z [D]$ is the full ring of integers, i.e., $D \neq \equiv 1 (mod 4)$ ) is ${\pm 1} \times ⟨ ε ⟩$ where $ε = x_{1} + y_{1} D$ when the negative-Pell equation is unsolvable, and $ε =$ the smaller fundamental negative-Pell solution when it is solvable.

Theorem 4 (negative Pell criterion). The negative Pell equation $x^{2} - D y^{2} = - 1$ has an integer solution iff the period $ℓ$ of the continued-fraction expansion of $D$ is odd ^{[Hardy-Wright 2008]}. The direction $\Leftarrow$ follows from the convergent identity $p_{ℓ - 1}^{2} - D q_{ℓ - 1}^{2} = (- 1)^{ℓ}$ : when $ℓ$ is odd, this gives $- 1$ , so $(p_{ℓ - 1}, q_{ℓ - 1})$ is a negative-Pell solution. The direction $\Rightarrow$ uses the complete characterisation of Pell-type solutions as convergents (best-approximation analogue of the Key Theorem) plus a parity argument on $ℓ$ . There are deep open questions about which $D$ have odd period: for $D = p$ prime with $p \equiv 1 (mod 4)$ , the period is always odd (Pell solvable); for $p \equiv 3 (mod 4)$ , the period is always even (not solvable); for composite $D$ the situation depends on the class group of $Q (D)$ (Stevenhagen 1993 conjecture: density of $D$ with odd period is approximately $0.58$ ).

Theorem 5 (Dirichlet unit theorem for real quadratic fields). For $D > 0$ non-square, the unit group of the ring of integers $O_{K}$ of $K = Q (D)$ is $O_{K}^{\times} = {\pm 1} \times ⟨ ε ⟩$ where $ε$ is the fundamental unit — the smallest unit $> 1$ ^{[Dirichlet 1846]}. Proved by Dirichlet 1846 Bericht Preuss. Akad. via the geometry of numbers and a pigeonhole argument analogous to Exercise 7 above. The general Dirichlet unit theorem for number fields (1846) says the unit group of $O_{K}$ has rank $r_{1} + r_{2} - 1$ , where $r_{1}, r_{2}$ are the numbers of real and complex embeddings; for real quadratic $K$ this rank is $1$ and the fundamental unit generates the free part. Pell solvability for non-square $D > 0$ is the elementary face of the rank- $1$ unit theorem for real quadratic fields.

Theorem 6 (chakravala algorithm; Jayadeva c. 1000, Bhaskara II 1150). The chakravala (cyclic) algorithm computes the fundamental Pell solution of $x^{2} - D y^{2} = 1$ for non-square $D > 0$ in finitely many steps. Given a triple $(a, b, k)$ with $a^{2} - D b^{2} = k$ (started from $(a, b, k) = (⌊ D ⌋, 1, ⌊ D ⌋^{2} - D)$ ), choose $m \in Z$ minimising $∣ m^{2} - D ∣$ subject to $a + bm \equiv 0 (mod ∣ k ∣)$ ; produce the new triple $$ \left(\frac{a m + D b}{|k|}, \frac{a + b m}{|k|}, \frac{m^2 - D}{k}\right). $$ Iteration terminates when $k = 1$ , yielding the fundamental Pell solution ^{[Bhaskara II 1150]}. Termination requires a careful argument; the original chakravala texts use the Brahmagupta composition principle (Exercise 4) repeatedly to descend to $∣ k ∣ = 1$ . Modern proofs by Selenius (1975 Hist. Math. 2, 167) and Lenstra (2002 Mathematical Intelligencer 24, 22) show chakravala produces the fundamental solution in $O (lo g D)$ iterations on average — faster than the naive continued-fraction algorithm in many cases, and asymptotically as fast as the fastest known. Hankel 1874 called chakravala "the finest thing achieved in the theory of numbers before Lagrange".

Theorem 7 (Liouville approximation bound, 1844). If $α$ is an algebraic number of degree $n \geq 2$ , there exists $c (α) > 0$ with $∣ α - p / q ∣ > c (α) q^{- n}$ for every rational $p / q$ with $q \geq 1$ ^{[Liouville 1844]}. Proved as Exercise 8 above via the minimal-polynomial argument. Liouville's exponent $n$ is sharp only for $n = 2$ (saturated by the convergents of every quadratic irrational, since $∣ D - p_{n} / q_{n} ∣ < 1/ q_{n}^{2}$ ). For $n \geq 3$ the exponent is far from sharp — improved by Thue, Siegel, Roth as below.

Theorem 8 (Thue-Siegel-Roth, 1909-1955). For every algebraic irrational $α$ and every $ε > 0$ , only finitely many rationals $p / q$ satisfy $∣ α - p / q ∣ < q^{- (2 + ε)}$ ^{[Roth 1955]}. The exponent $2$ is sharp by Dirichlet (Exercise 6), and the bound is the optimal universal exponent for algebraic irrationals. Thue 1909 ^{[Thue 1909]} proved exponent $n /2 + 1 + ε$ ; Siegel 1921 ^{[Siegel 1921]} improved to $2 n + ε$ ; Roth 1955 ^{[Roth 1955]} reached the optimal $2 + ε$ , earning the Fields Medal 1958. The proof is ineffective: it bounds the number of approximating rationals but does not bound their height, so explicit bounds for Diophantine equations cannot be deduced. Baker 1966-67 Mathematika 13, 14 introduced linear forms in logarithms to give effective bounds at the cost of weaker exponents.

Synthesis. The Pell equation is the elementary face of the unit group of $Q (D)$ , the simplest interesting number field beyond $Q$ itself. The fundamental Pell (or negative-Pell) solution is the fundamental unit, and the continued-fraction algorithm for $D$ is the constructive procedure for computing it. The same circle of ideas — continued fractions, periodicity, best approximation, Dirichlet pigeonhole — generalises in three directions:

(i) Algebraic direction (Dirichlet 1846, Minkowski 1896, Hasse-Minkowski 1923): the Dirichlet unit theorem for general number fields gives the rank of the unit group as $r_{1} + r_{2} - 1$ , with the fundamental units generating the free part. The Minkowski geometry-of-numbers approach (1896 Geometrie der Zahlen) reproves Pell solvability via the lattice $Z [D] ↪ R^{2}$ and Minkowski's convex-body theorem, generalising to lattices of arbitrary rank. The Hasse-Minkowski theorem (1923-24) classifies indefinite quadratic forms over $Q$ via local-global principle, with the Pell equation as the rank- $2$ indefinite case.

(ii) Analytic direction (Liouville 1844, Thue 1909, Siegel 1921, Roth 1955): the Liouville bound and its improvements bound the rate at which algebraic numbers can be rationally approximated. The Pell convergents saturate the Liouville bound for quadratic $α$ , but for higher-degree algebraic numbers the Roth theorem $∣ α - p / q ∣ > q^{- (2 + ε)}$ (sharp) shows algebraic numbers are essentially as hard to approximate as random irrationals. Schmidt's subspace theorem (1972 Acta Math. 125, 189) generalises Roth to simultaneous Diophantine approximation, with deep applications to integral points on subvarieties of tori.

(iii) Geometric direction (Thue 1909, Siegel 1929, Mordell 1922, Faltings 1983): finiteness of integer solutions of Diophantine equations of degree $\geq 3$ . Thue's 1909 theorem says $f (x, y) = c$ has finitely many integer solutions for $f$ irreducible binary of degree $\geq 3$ . Mordell 1922 conjectured (and Faltings 1983 proved, Fields Medal 1986) that every curve of genus $\geq 2$ over a number field has finitely many rational points. The Pell equation is precisely the boundary case — genus- $0$ curve with infinitely many integer points (in fact a $Z$ -torsor for the unit group) — and the contrast with the higher-genus finiteness is the elementary entry point to arithmetic geometry.

The historical arc from Brahmagupta 628 CE to Roth 1955 is a $1300$ -year story of one elementary equation generating ever-richer mathematics. Brahmagupta's composition principle (samasa-bhavana) is the embryonic form of norm multiplicativity in number fields. Bhaskara II's chakravala (1150 CE) is the embryonic form of Gauss's reduction theory of quadratic forms. Lagrange's continued-fraction proof (1770) is the embryonic form of the Dirichlet unit theorem. Liouville's bound (1844) is the embryonic form of Diophantine approximation theory. Roth's theorem (1955) is the optimal exponent — but the underlying object, $x^{2} - D y^{2} = 1$ , is the same one Brahmagupta solved in 628 CE.

The applied legacy is more modest than for quadratic reciprocity, but real. Pell-equation solutions provide best rational approximations to $D$ , used in computer algebra systems (Mathematica, SageMath, GAP) for symbolic manipulation of quadratic irrationals. The continued-fraction algorithm is the foundation for lattice-reduction algorithms (LLL 1982 generalises the Euclidean algorithm and continued fractions to higher-dimensional lattices), which underlie cryptanalysis of low-exponent RSA (Coppersmith 1996), the NTRU cryptosystem (1996), and the modern post-quantum cryptography candidates LWE and SIS. The Pell equation itself surfaces in continued-fraction factorisation (CFRAC, Brillhart-Morrison 1975), the precursor algorithm to the quadratic sieve.

Full proof set Master

Proposition 1 (Lagrange periodicity). Every quadratic irrational has eventually-periodic continued fraction.

Proof. See Step 1 of the Key Theorem above. The recursion $α_{n} = (P_{n} + D) / Q_{n}$ has integer parameters $P_{n}, Q_{n}$ bounded by a constant depending on $D$ , so the sequence of $(P_{n}, Q_{n})$ pairs takes only finitely many values, forcing the $α_{n}$ (hence the partial quotients $a_{n}$ ) to be eventually periodic. $□$

Proposition 2 (Pell solvability via convergents). For non-square $D > 0$ , the convergent $(p_{ℓ - 1}, q_{ℓ - 1})$ of $D$ satisfies $p_{ℓ - 1}^{2} - D q_{ℓ - 1}^{2} = (- 1)^{ℓ}$ .

Proof. See Step 2 of the Key Theorem. The identity $p_{n}^{2} - D q_{n}^{2} = (- 1)^{n + 1} Q_{n + 1}$ at $n = ℓ - 1$ uses $Q_{ℓ} = Q_{0} = 1$ (from the period closing at index $ℓ$ ). $□$

Proposition 3 (multiplication law for Pell solutions). If $(x_{1}, y_{1})$ and $(x_{2}, y_{2})$ are Pell solutions, so is $(x_{1} x_{2} + D y_{1} y_{2}, x_{1} y_{2} + x_{2} y_{1})$ .

Proof. See Exercise 4. Norm multiplicativity in $Z [D]$ : $N (α β) = N (α) N (β)$ with $α = x_{1} + y_{1} D$ , $β = x_{2} + y_{2} D$ . $□$

Proposition 4 (cyclic structure of Pell solutions). The positive Pell solutions form an infinite cyclic semigroup generated by the fundamental solution.

Proof. See Exercise 5. By Lagrange, every Pell solution is a convergent; by the ordering argument plus the multiplication law, every positive solution is $ε_{1}^{n}$ for $n \geq 1$ . $□$

Proposition 5 (Dirichlet pigeonhole approximation bound). For every irrational $α$ , there exist infinitely many rationals $p / q$ with $∣ α - p / q ∣ < 1/ q^{2}$ .

Proof. See Exercise 6. Pigeonhole on the $N + 1$ fractional parts ${q α}$ for $q = 0, 1, \dots, N$ in $N$ subintervals of $[0, 1)$ . $□$

Proposition 6 (Dirichlet pigeonhole proof of Pell solvability). For non-square $D > 0$ , $x^{2} - D y^{2} = 1$ has a positive-integer solution.

Proof. See Exercise 7. By Proposition 5, there are infinitely many $(p, q)$ with $∣ p - q D ∣ < 1/ q$ . The integer values $p^{2} - D q^{2}$ are bounded, so some non-zero $N$ is attained infinitely often. Pigeonhole on residues modulo $∣ N ∣$ yields two distinct pairs $(p_{1}, q_{1}) \neq = (p_{2}, q_{2})$ with the same residues, and the quotient $(p_{1} + q_{1} D) / (p_{2} + q_{2} D) = X + Y D$ has integer coordinates and norm $1$ , hence is a Pell solution. $□$

Proposition 7 (negative Pell criterion). The equation $x^{2} - D y^{2} = - 1$ has an integer solution iff the period of the continued fraction of $D$ is odd.

Proof sketch. Direction $\Leftarrow$ from Proposition 2 with odd $ℓ$ giving $(- 1)^{ℓ} = - 1$ . Direction $\Rightarrow$ : if $(x, y)$ is a negative-Pell solution, then by the convergent characterisation (analogous to Step 3 of the Key Theorem applied with the bound $∣ x - y D ∣ < 1/ (2 y)$ ), $(x, y)$ is a convergent $(p_{n}, q_{n})$ . The convergent identity $p_{n}^{2} - D q_{n}^{2} = (- 1)^{n + 1} Q_{n + 1}$ with value $- 1$ forces $Q_{n + 1} = 1$ (so $n + 1$ is a multiple of $ℓ$ ) and $n$ even (so $(- 1)^{n + 1} = - 1$ ). Hence $n + 1 = m ℓ$ for some $m$ with $m ℓ - 1$ even, equivalently $m ℓ$ odd. For $m \geq 1$ integer this forces both $m$ and $ℓ$ odd, so $ℓ$ is odd. $□$

Proposition 8 (Liouville approximation bound). Algebraic $α$ of degree $n \geq 2$ admits $∣ α - p / q ∣ > c (α) q^{- n}$ for all $p / q$ .

Proof. See Exercise 8. Minimal-polynomial argument: $∣ q^{n} f (p / q) ∣ \geq 1$ on one side; $∣ f (p / q) ∣ \leq M ∣ p / q - α ∣$ on the other side by the mean-value theorem. $□$

Proposition 9 (Brahmagupta composition for $x^{2} - D y^{2} = N$ ). If $(a, b)$ solves $x^{2} - D y^{2} = N_{1}$ and $(c, d)$ solves $x^{2} - D y^{2} = N_{2}$ , then $(a c + D b d, a d + b c)$ solves $x^{2} - D y^{2} = N_{1} N_{2}$ .

Proof. Norm multiplicativity in $Z [D]$ : $N ((a + b D) (c + d D)) = N (a + b D) N (c + d D) = N_{1} N_{2}$ , and the product expands to $(a c + D b d) + (a d + b c) D$ . This is Brahmagupta's samasa-bhavana (628 CE), the foundational composition principle. $□$

Connections Master

Divisibility, gcd, and the Euclidean algorithm 21.01.01. Shipped. The Euclidean algorithm for $g cd (a, b)$ is the finite-arithmetic analogue of the continued-fraction algorithm for $D$ : both are based on the recursion "subtract the largest possible integer multiple, then invert". The convergents $p_{n} / q_{n}$ of any irrational $α$ satisfy the same cross-ratio identity $p_{n} q_{n - 1} - p_{n - 1} q_{n} = (- 1)^{n - 1}$ that drives the Euclidean algorithm to a $g cd$ . The connection is the Stern-Brocot tree of best rational approximations.
Primes and the fundamental theorem of arithmetic 21.01.02. Shipped. Pell solvability for prime $D = p$ depends on the residue $p (mod 4)$ : for $p \equiv 1 (mod 4)$ , the negative-Pell equation $x^{2} - p y^{2} = - 1$ is always solvable (period odd, by a deep theorem); for $p \equiv 3 (mod 4)$ , the negative-Pell equation is never solvable (period even). The bridge to 21.01.02 is the way prime-residue conditions modulo small integers control deep structural questions about $Q (p)$ .
Real numbers and completeness 02.04.01. Pending. The continued-fraction algorithm requires the floor function $⌊ α ⌋$ , which presupposes the order completeness of $R$ . Lagrange's periodicity theorem uses the Bolzano-Weierstrass theorem (every bounded sequence has a convergent subsequence) implicitly via the finiteness-of- $(P_{n}, Q_{n})$ argument. The Liouville approximation bound is a $C^{1}$ -continuity statement about the minimal polynomial; the Thue-Siegel-Roth tower uses sophisticated complex-analytic arguments (transcendence-degree arithmetic, the Schmidt subspace theorem).
Algebraic number fields and rings of integers 21.02.01. Pending. Pell solvability is the elementary face of the structure theory of the unit group of $Q (D)$ for $D > 0$ non-square. The number field $K = Q (D)$ has ring of integers $O_{K} = Z [D]$ (if $D \neq \equiv 1 (mod 4)$ ) or $Z [(1 + D) /2]$ (if $D \equiv 1 (mod 4)$ ), and the unit group $O_{K}^{\times} = {\pm 1} \times ⟨ ε ⟩$ is the elementary instance of the Dirichlet unit theorem for general number fields.
Class groups and the ideal class group of $Z [D]$ 21.02.05. Pending. The Pell equation interacts with the class group $Cl (O_{K})$ of the real quadratic field $K = Q (D)$ : when $Cl (O_{K}) = 1$ (principal-ideal domain), the generalised Pell equation $x^{2} - D y^{2} = N$ for $N$ a prime ideal generator behaves uniformly; when $Cl (O_{K}) > 1$ , the equation has multiple orbits of solutions corresponding to the ideal classes. The famous Cohen-Lenstra heuristics (1984) predict the distribution of class numbers and the density of $D$ with $∣ Cl (O_{K}) ∣ = 1$ .
Dirichlet unit theorem 21.02.06. Pending. The Dirichlet unit theorem (Dirichlet 1846) gives the rank of the unit group of any number field as $r_{1} + r_{2} - 1$ , with the free part generated by the fundamental units. For real quadratic $K = Q (D)$ the rank is $1$ and the fundamental unit is the fundamental Pell (or negative-Pell) solution. Pell is the simplest, most elementary case of the Dirichlet unit theorem.
Diophantine approximation 21.01.10 pending. Lateral. The continued-fraction theory of Pell extends to the general theory of Diophantine approximation: how well can a real number be approximated by rationals? The Dirichlet bound (Exercise 6 above) is the universal lower bound $∣ α - p / q ∣ < 1/ q^{2}$ for infinitely many $p / q$ ; the Liouville bound (Exercise 8) is the upper bound for algebraic numbers; the Thue-Siegel-Roth tower gives the sharp $∣ α - p / q ∣ > q^{- (2 + ε)}$ for algebraic irrationals. The Khinchin metric theorem (1928): for almost every real $α$ (Lebesgue), the convergent denominators satisfy $q_{n}^{1/ n} \to K_{0} \approx 2.685$ — Khinchin's constant.
Analysis prerequisites [02 analysis]. Lateral. Continued fractions converge to their limit at the rate $∣ α - p_{n} / q_{n} ∣ < 1/ q_{n}^{2}$ — a quantitative version of the Cauchy criterion. The Lagrange periodicity proof uses the Bolzano-Weierstrass theorem (compactness of bounded sequences). The Liouville approximation bound uses the mean-value theorem from calculus. The Roth theorem (1955) uses the Thue-Siegel principle — a deep argument about heights of points on algebraic varieties, with the auxiliary polynomial constructed via Siegel's lemma (Cassels 1957 Diophantine Approximation Ch. III).
Continued-fraction expansions of fundamental constants. Lateral. The continued fraction of $e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \dots]$ (Euler 1737) has a clean pattern but is non-periodic, so $e$ is non-quadratic by Lagrange. The continued fraction of $π = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, \dots]$ (Brouncker 1655) has no known pattern, and the convergent $355/113$ (the Tsu Ch'ung-Chih approximation, c. 480 CE) agrees with $π$ to seven decimal places — explained by the large partial quotient $a_{4} = 292$ . Hermite 1873 proved $e$ transcendental; Lindemann 1882 proved $π$ transcendental; both proofs descend from the Liouville-style arguments developed for Pell.

Historical & philosophical context Master

The Pell equation has the longest continuously-recorded history of any single equation in mathematics: $1300$ years from Brahmagupta (628 CE) to Roth (1955).

Brahmagupta (628 CE) Brāhmasphuṭasiddhānta ^{[Brahmagupta 628]}, ch. 18 §§64-75 introduced the samasa-bhavana (composition principle): from two solutions of $x^{2} - D y^{2} = N_{1}$ and $x^{2} - D y^{2} = N_{2}$ , produce one of $x^{2} - D y^{2} = N_{1} N_{2}$ . This is the embryonic form of norm multiplicativity in $Z [D]$ and the elementary face of all subsequent reciprocity-style composition principles. Brahmagupta showed that if $x^{2} - D y^{2} = k$ is solved for $k \in {- 1, \pm 2, \pm 4}$ , then composition yields a Pell solution, providing the first general method for Pell-type equations in any culture. Brahmagupta's worked examples in $D = 92, 83$ and similar moderate cases are correct to the digit.

Jayadeva (c. 1000 CE, lost work) and Bhaskara II (1150 CE) Bījagaṇita ^{[Bhaskara II 1150]}, §§71-90 developed the chakravala (cyclic) algorithm completing Brahmagupta's composition to a guaranteed-terminating procedure. The chakravala algorithm produced Bhaskara's celebrated solution of $x^{2} - 61 y^{2} = 1$ as $(1766319049, 226153980)$ — a $10$ -digit answer found without paper or modern arithmetic notation, in the 12th century. The same answer was re-discovered in 1657 by Brouncker in response to Fermat's challenge. Hankel 1874 Zur Geschichte der Mathematik in Altertum und Mittelalter called chakravala "the finest thing achieved in the theory of numbers before Lagrange" (Lagrange's own proof of Pell solvability did not appear until 620 years later).

Fermat (1657) issued a public challenge in January 1657 to all European mathematicians: solve $x^{2} - D y^{2} = 1$ for $D = 61, 109, 149, 433$ and other carefully-chosen large $D$ values where the fundamental solution has many digits. The English mathematicians Brouncker and Wallis responded with a continued-fraction-style algorithm, with Brouncker computing the $D = 61$ solution in a few weeks. Wallis published the correspondence in Commercium Epistolicum 1658 ^{[Brouncker-Wallis 1657]}. Fermat acknowledged the solutions but did not publicly prove solvability for all non-square $D > 0$ ; his infinite descent proof, alluded to in a 1657 letter to Frenicle, is lost.

Euler (1730s-1765) ^{[Euler 1765]} developed the continued-fraction algorithm for $D$ , expanding $D = [a_{0}; \overline{a_{1}, a_{2}, \dots, a_{ℓ}}]$ and reading the fundamental Pell solution off the convergent immediately before the period closes. Euler also introduced the misattribution that has stuck: in his 1765 Novi Commentarii Academiae Scientiarum Petropolitanae paper he credited the equation to John Pell based on a misreading of Wallis's Algebra (1685, ch. 98) — Pell merely edited Brouncker's solution for printing. The name "Pell equation" has stuck despite Pell himself contributing nothing; the equation was solved by Brahmagupta in 628 CE, Bhaskara II in 1150 CE, Fermat in 1657, and Brouncker in 1658, all without Pell's involvement.

Lagrange (1770) Mémoires de l'Académie Royale des Sciences et Belles-Lettres de Berlin ^{[Lagrange 1770]} gave the first complete proof in European mathematics of Pell solvability for every non-square $D > 0$ . Lagrange's two theorems: (i) periodicity of the continued-fraction expansion of every quadratic irrational; (ii) the convergent immediately before the period closes gives the fundamental Pell (or negative-Pell) solution. This was 620 years after Bhaskara II's chakravala — a delay reflecting the European mathematical isolation from Indian sources.

Galois (1828) Annales de Gergonne 19, 294 generalised Lagrange's result with the purely-periodic characterisation: the continued fraction $[a_{0}; a_{1}, a_{2}, \dots]$ is purely periodic iff $α$ is a reduced quadratic irrational ( $α > 1$ , $- 1 < \overset{α}{ˉ} < 0$ ). Galois's result is the cleanest statement of the periodicity theorem and the basis for the modern proof in textbooks.

Dirichlet (1837 Werke I + Dirichlet-Dedekind 1894 Vorlesungen) ^{[Dirichlet 1837]} gave the pigeonhole proof of Pell solvability bypassing continued fractions: by the Dirichlet approximation theorem, there are infinitely many $(p, q)$ with $∣ p - q D ∣ < 1/ q$ ; the values $p^{2} - D q^{2}$ are bounded, so some non-zero $N$ is attained infinitely often; pigeonhole on residues modulo $∣ N ∣$ produces two pairs whose quotient is a Pell solution. The same machinery gives the Dirichlet unit theorem (1846): the unit group of any number field has rank $r_{1} + r_{2} - 1$ , with Pell as the rank- $1$ real-quadratic case.

Liouville (1844) ^{[Liouville 1844]} proved the Liouville approximation bound: algebraic $α$ of degree $n$ cannot be approximated by rationals to order strictly better than $n$ . Liouville used this to construct the first explicit transcendental numbers $\sum_{k} 1 0^{- k!}$ (with truncations giving approximations of arbitrarily high order). The Pell convergents saturate the Liouville bound for $n = 2$ — quadratic irrationals are exactly the irrationals approximable to order $2$ but not to order $> 2$ .

Thue (1909) ^{[Thue 1909]} Crelle 135 improved the Liouville exponent from $n$ to $n /2 + 1 + ε$ . Thue's method introduced the auxiliary polynomial $P (x, y)$ used in all later refinements. Siegel (1921) ^{[Siegel 1921]} Math. Z. 10 improved to $2 n + ε$ , attaining the optimum of Siegel's variational argument. Roth (1955) ^{[Roth 1955]} Mathematika 2 reached the optimal $2 + ε$ , sharp by the Dirichlet bound. Roth received the Fields Medal 1958 for this result.

The Roth theorem is ineffective: it bounds the number of approximating rationals but not their height. Effective bounds had to wait for Baker (1966-67) ^{[Roth 1955]} Mathematika 13-14 and his theory of linear forms in logarithms — for which Baker received the Fields Medal 1970. The Liouville-Thue-Siegel-Roth-Baker tower is the canonical example of a research programme where each generation refined the previous bound, with the final endpoint (Roth's exponent $2$ ) being optimal but ineffective, motivating an entirely new theory (Baker) to recover effectivity.

The philosophical lesson of the Pell story: a single elementary equation, posed by Brahmagupta in the 7th century, generated the entire theory of units in number fields, Diophantine approximation, transcendence theory, and the modern framework of integral points on algebraic varieties. The Pell equation is the simplest interesting Diophantine equation with infinitely many integer solutions, and understanding it completely — through Lagrange's continued fractions and Dirichlet's pigeonhole — required ideas that turned out to be the embryonic forms of $20$ th-century algebraic and arithmetic geometry.

Bibliography Master

@book{Brahmagupta628,
  author    = {Brahmagupta},
  title     = {Br{\=a}hmasphu{\d{t}}asiddh{\=a}nta},
  year      = {628},
  note      = {Sanskrit, c. 628 CE. Chapter 18 (kuṭṭakādhyāya) §§64-75 develop the samasa-bhavana (composition principle) for $x^2 - D y^2 = N$: from solutions of $x^2 - D y^2 = N_1$ and $x^2 - D y^2 = N_2$, produce a solution of $x^2 - D y^2 = N_1 N_2$. The first general method for Pell-type equations in any culture. English: Colebrooke, Algebra with Arithmetic and Mensuration from the Sanscrit of Brahmegupta and Bháscara, London 1817 (reprinted Wiesbaden 1973).}
}

@book{BhaskaraII1150,
  author    = {{Bh{\=a}skara II}},
  title     = {B{\=\i}jaga{\d n}ita},
  year      = {1150},
  note      = {Sanskrit, c. 1150 CE. §§71-90 develop the chakravala (cyclic) algorithm completing Brahmagupta's composition principle to a guaranteed-terminating procedure for $x^2 - D y^2 = 1$. Famous worked example: fundamental solution of $x^2 - 61 y^2 = 1$ as $(1766319049, 226153980)$, found without paper or modern arithmetic notation. Hankel 1874 called chakravala 'the finest thing achieved in the theory of numbers before Lagrange'. English: Plofker, Mathematics in India, Princeton 2009, §5.4.2.}
}

@book{BrounckerWallis1658,
  author    = {Brouncker, William and Wallis, John},
  title     = {Commercium Epistolicum},
  publisher = {Oxford},
  year      = {1658},
  note      = {Fermat's January 1657 challenge to the English mathematicians: solve $x^2 - D y^2 = 1$ for $D = 61, 109, 149, \ldots$. Brouncker's continued-fraction-style algorithm produced the $D = 61$ solution $(1766319049, 226153980)$ in a few weeks; Wallis wrote the exposition. The first complete European treatment of the Pell equation, $500$ years after Bhaskara II's chakravala.}
}

@article{Euler1765,
  author  = {Euler, Leonhard},
  title   = {De usu novi algorithmi in problemate Pelliano solvendo},
  journal = {Novi Commentarii Academiae Scientiarum Petropolitanae},
  volume  = {11},
  pages   = {28--66},
  year    = {1765},
  note    = {Reprinted Opera Omnia I.3, 73-111. Euler's continued-fraction algorithm for $\sqrt D$, with the misattribution to John Pell based on a misreading of Wallis's Algebra (1685). The name 'Pell equation' has stuck despite Pell himself contributing nothing.}
}

@article{Lagrange1770,
  author  = {Lagrange, Joseph Louis},
  title   = {Solution d'un probl{\`e}me d'arithm{\'e}tique},
  journal = {M{\'e}moires de l'Acad{\'e}mie Royale des Sciences et Belles-Lettres de Berlin},
  volume  = {24},
  pages   = {165--310},
  year    = {1770},
  note    = {Originally presented 1768. Reprinted Oeuvres I, 671-731. First complete proof in European mathematics of Pell solvability for every non-square $D > 0$. Two theorems: (i) periodicity of the continued fraction of every quadratic irrational; (ii) the convergent before the period closes is the fundamental Pell solution. 620 years after Bhaskara II's chakravala.}
}

@book{DirichletDedekind1894Pell,
  author    = {Dirichlet, Peter Gustav Lejeune and Dedekind, Richard},
  title     = {Vorlesungen {\"u}ber Zahlentheorie},
  edition   = {4},
  publisher = {Vieweg},
  address   = {Braunschweig},
  year      = {1894},
  note      = {§§80-100 give Dirichlet's pigeonhole proof of Pell solvability bypassing continued fractions. The proof generalises to the Dirichlet unit theorem (1846) giving the rank of the unit group of any number field as $r_1 + r_2 - 1$, with Pell as the rank-$1$ real-quadratic case.}
}

@article{Liouville1844,
  author  = {Liouville, Joseph},
  title   = {Sur des classes tr{\`e}s {\'e}tendues de quantit{\'e}s dont la valeur n'est ni alg{\'e}brique, ni m{\^e}me r{\'e}ductible {\`a} des irrationelles alg{\'e}briques},
  journal = {Comptes Rendus de l'Acad{\'e}mie des Sciences de Paris},
  volume  = {18},
  pages   = {883--885 and 910--911},
  year    = {1844},
  note    = {Liouville's approximation bound: algebraic $\alpha$ of degree $n \geq 2$ cannot be approximated by rationals to order strictly better than $n$. Used to construct the first explicit transcendental numbers $\sum_k 10^{-k!}$. Founding result of transcendence theory.}
}

@article{Thue1909,
  author  = {Thue, Axel},
  title   = {{\"U}ber Ann{\"a}herungswerte algebraischer Zahlen},
  journal = {Journal f{\"u}r die reine und angewandte Mathematik},
  volume  = {135},
  pages   = {284--305},
  year    = {1909},
  note    = {Thue's improvement of Liouville: algebraic $\alpha$ of degree $n \geq 3$ is approximable to order at most $n/2 + 1 + \varepsilon$. Consequence: Thue equations $f(x, y) = c$ for $f$ irreducible binary of degree $\geq 3$ have only finitely many integer solutions.}
}

@article{Siegel1921,
  author  = {Siegel, Carl Ludwig},
  title   = {Approximation algebraischer Zahlen},
  journal = {Mathematische Zeitschrift},
  volume  = {10},
  pages   = {173--213},
  year    = {1921},
  note    = {Siegel's improvement of Thue: the approximation exponent for algebraic $\alpha$ of degree $n$ is at most $2 \sqrt n + \varepsilon$. The proximate predecessor of Roth's theorem.}
}

@article{Roth1955,
  author  = {Roth, Klaus Friedrich},
  title   = {Rational approximations to algebraic numbers},
  journal = {Mathematika},
  volume  = {2},
  pages   = {1--20},
  year    = {1955},
  note    = {Roth's theorem: for every algebraic irrational $\alpha$ and every $\varepsilon > 0$, only finitely many rationals $p/q$ satisfy $|\alpha - p/q| < q^{-(2 + \varepsilon)}$. The exponent $2$ is sharp by the Dirichlet bound. Roth received the Fields Medal 1958. Ineffective: bounds the number but not the height of approximating rationals.}
}

@book{HardyWright2008Pell,
  author    = {Hardy, G. H. and Wright, E. M.},
  title     = {An Introduction to the Theory of Numbers},
  edition   = {6},
  publisher = {Oxford University Press},
  year      = {2008},
  note      = {Revised by D. R. Heath-Brown and J. H. Silverman. Chapters X-XI develop the theory of continued fractions for real numbers and quadratic irrationals: the Euclidean expansion of rationals, the convergents $p_n/q_n$ with best-approximation property, Lagrange's periodicity theorem, the Pell equation and its complete solution via the convergents of $\sqrt D$, the generalised Pell equation, the negative Pell criterion.}
}

@book{NivenZuckermanMontgomery1991Pell,
  author    = {Niven, Ivan and Zuckerman, Herbert S. and Montgomery, Hugh L.},
  title     = {An Introduction to the Theory of Numbers},
  edition   = {5},
  publisher = {Wiley},
  year      = {1991},
  note      = {§§7.7-7.8 develop the Pell equation $x^2 - D y^2 = N$ for general $N$, the negative Pell criterion (solvable iff period of $\sqrt D$ is odd), the structure of solutions of the generalised Pell equation as a finite union of orbits under the Pell unit group.}
}

@book{Khinchin1964,
  author    = {Khinchin, Aleksandr Yakovlevich},
  title     = {Continued Fractions},
  publisher = {University of Chicago Press},
  year      = {1964},
  note      = {Translation by Scripta Technica from Tsepnye drobi (Moscow 1935; 3rd Russian edition 1961). Dover reprint 1997. §§I.1-9 basic algorithm and convergents; §§II.10-13 metric theory including Khinchin's constant $K_0 \approx 2.685$ and Markov spectrum; §§III.14-17 best rational approximations and quadratic-irrational characterisation. The canonical monograph on continued fractions.}
}

@book{Lang1962,
  author    = {Lang, Serge},
  title     = {Diophantine Geometry},
  publisher = {Interscience},
  year      = {1962},
  note      = {Chapter VII develops Pell equations, units in number fields, and the Liouville-Thue-Siegel-Roth tower from the algebraic-geometric viewpoint. The bridge from elementary continued-fraction theory to integral points on curves and the Mordell conjecture (Faltings 1983).}
}

@book{Cassels1957,
  author    = {Cassels, J. W. S.},
  title     = {An Introduction to Diophantine Approximation},
  series    = {Cambridge Tracts in Mathematics and Mathematical Physics},
  volume    = {45},
  publisher = {Cambridge University Press},
  year      = {1957},
  note      = {Reprinted 1965, Hafner. Chapters II-III develop approximation to algebraic numbers from Liouville through Thue, Siegel, and Roth, with explicit construction of the auxiliary polynomial used in the Thue-Siegel-Roth proofs and careful treatment of the ineffectivity issue.}
}

@book{Burton2010Pell,
  author    = {Burton, David M.},
  title     = {Elementary Number Theory},
  edition   = {7},
  publisher = {McGraw-Hill},
  year      = {2010},
  note      = {Chapter 14 §14.4 develops the Pell equation at the undergraduate Beginner-tier level: hand-computation of fundamental solutions for small $D$, the multiplication law for generating all solutions, statement of Lagrange's continued-fraction algorithm.}
}

@book{Stillwell2010,
  author    = {Stillwell, John},
  title     = {Mathematics and Its History},
  edition   = {3},
  publisher = {Springer Undergraduate Texts in Mathematics},
  year      = {2010},
  note      = {§5.4 develops Brahmagupta's Brāhmasphuṭasiddhānta (628 CE) and Bhaskara II's Bījagaṇita (1150 CE) on Pell with cattle-style worked examples and the historical chain Brahmagupta-Bhaskara-Brouncker-Lagrange-Dirichlet.}
}

Prerequisites

21.01.01
21.01.02

Tier anchors

beginner: Burton 2010 *Elementary Number Theory* 7e §14.4 (the Pell equation $x^2 - D y^2 = 1$, hand-computation of fundamental solutions for small $D$, the multiplication law $(x_1 + y_1 \sqrt D)^n$ generating all solutions); Khan Academy 'Number theory — Pell equation' module; Stillwell 2010 *Mathematics and Its History* 3e §5.4 (Brahmagupta's *Brāhmasphuṭasiddhānta* 628 CE and Bhaskara II's *Bījagaṇita* 1150 CE on Pell with cattle-style worked examples)
intermediate: Hardy-Wright 2008 *An Introduction to the Theory of Numbers* 6e §§10.1-10.12, §§11.1-11.8 (continued-fraction algorithm for $\sqrt D$, Lagrange's periodicity theorem, fundamental Pell solution from the periodic part, generation of all solutions via $(x_1 + y_1 \sqrt D)^n$); Niven-Zuckerman-Montgomery 1991 *An Introduction to the Theory of Numbers* 5e §7.7-7.8 (Pell equation, negative Pell, generalised Pell $x^2 - D y^2 = N$); Khinchin 1964 *Continued Fractions* §§II.10-13 (Markov spectrum, best rational approximations, quadratic-irrational characterisation)
master: Brahmagupta 628 CE *Brāhmasphuṭasiddhānta* ch. 18 §§64-75 (originator: the **samasa-bhavana** composition law $(x_1, y_1, N_1) \diamond (x_2, y_2, N_2) = (x_1 x_2 + D y_1 y_2, x_1 y_2 + x_2 y_1, N_1 N_2)$ that solves $x^2 - D y^2 = N$ by composition; the first general method for the Pell equation in any culture); Bhaskara II 1150 CE *Bījagaṇita* §§71-90 (**chakravala** cyclic algorithm — completes Brahmagupta's method to a guaranteed-terminating procedure, found by Jayadeva c. 1000 CE according to Udayadivakara's commentary); Brouncker-Wallis 1657-58 correspondence (Brouncker's continued-fraction-style solution method, Wallis's exposition in *Commercium Epistolicum* 1658); Euler 1765 *Novi Comm. Acad. Petrop.* 11, 28-66 (mis-attribution of the equation to John Pell, who never worked on it; correct continued-fraction algorithm for $\sqrt D$); Lagrange 1770 *Mém. Acad. Berlin* (1768) 24, 165-310 (first complete proof of solvability of Pell, periodicity of continued fraction of every quadratic irrational, full structure of solutions); Dirichlet 1837 *Werke* I, 313 + Dirichlet-Dedekind 1894 *Vorlesungen über Zahlentheorie* §§80-100 (pigeonhole proof of Pell solvability without continued fractions, structure of unit group of $\mathbb{Q}(\sqrt D)$); Liouville 1844 *C. R. Acad. Sci. Paris* 18, 883-885 + 910-911 (Liouville bound: algebraic irrational of degree $n$ cannot be approximated by rationals to order better than $n$; founding the theory of transcendence); Thue 1909 *J. reine angew. Math.* 135, 284-305 (Thue's improvement of Liouville from exponent $n$ to $n/2 + 1$); Siegel 1921 *Math. Z.* 10, 173-213 (Siegel's improvement to exponent $2 \sqrt n$); Roth 1955 *Mathematika* 2, 1-20 (Roth's theorem: for every algebraic irrational $\alpha$ and every $\varepsilon > 0$, only finitely many rationals $p/q$ satisfy $|\alpha - p/q| < q^{-2 - \varepsilon}$ — Fields Medal 1958); Lang 1962 *Diophantine Geometry* Ch. VII (modern algebraic treatment of Pell, units, and Roth); Cassels 1957 *An Introduction to Diophantine Approximation* (Cambridge Tracts 45) Ch. II-III; Khinchin 1964 *Continued Fractions* (Chicago); Hardy-Wright 2008 §§10-11

References

Brahmagupta — Brāhmasphuṭasiddhānta · 628 CE. Chapter 18 (kuṭṭakādhyāya) §§64-75 develop the **samasa-bhavana** (composition principle) for $x^2 - D y^2 = N$: from two solutions $(x_1, y_1)$ of $x^2 - D y^2 = N_1$ and $(x_2, y_2)$ of $x^2 - D y^2 = N_2$, produce $(x_1 x_2 + D y_1 y_2, x_1 y_2 + x_2 y_1)$ as a solution of $x^2 - D y^2 = N_1 N_2$. Brahmagupta showed that if a solution of $x^2 - D y^2 = k$ is found for $k \in \{-1, \pm 2, \pm 4\}$, then composition yields a solution of $x^2 - D y^2 = 1$. The first general method for what later became the Pell equation. English: Colebrooke, *Algebra, with Arithmetic and Mensuration, from the Sanscrit of Brahmegupta and Bháscara*, London 1817 (reprinted Wiesbaden 1973).
Bhaskara II — Bījagaṇita · 1150 CE. §§71-90 develop the **chakravala** (cyclic) algorithm completing Brahmagupta's composition principle to a guaranteed-terminating procedure for $x^2 - D y^2 = 1$. The algorithm: starting from any $(a, b, k)$ with $a^2 - D b^2 = k$, choose $m \in \mathbb{Z}$ minimising $|m^2 - D|$ subject to $b m + a \equiv 0 \pmod{|k|}$; then $((a m + D b)/|k|, (a + b m)/|k|, (m^2 - D)/k)$ is a new triple. Iteration terminates in $k = 1$. Hankel 1874 *Zur Geschichte der Mathematik in Altertum und Mittelalter* called chakravala 'the finest thing achieved in the theory of numbers before Lagrange' (Lagrange's own proof of Pell solvability did not appear until 620 years later). Discovered c. 1000 CE by Jayadeva according to Udayadivakara's 13th-century commentary on Lalla. English: Plofker, *Mathematics in India*, Princeton 2009, §5.4.2.
Brouncker, W. and Wallis, J. — Commercium Epistolicum · Oxford 1658. Fermat's January 1657 challenge to the English mathematicians: find integer solutions to $x^2 - D y^2 = 1$ for $D = 61, 109, 149, \ldots$. Brouncker produced a continued-fraction-style algorithm and computed the minimal solution to $x^2 - 61 y^2 = 1$ as $(x, y) = (1766319049, 226153980)$ — a $10$-digit answer Bhaskara II had found by chakravala $500$ years earlier. Wallis wrote the exposition in *Commercium Epistolicum*. Fermat acknowledged the solutions but never publicly proved solvability for all non-square $D > 0$ (his \textit{infinite descent} proof, alluded to in a 1657 letter to Frenicle, is lost).
Euler, L. — De usu novi algorithmi in problemate Pelliano solvendo · *Novi Commentarii Academiae Scientiarum Petropolitanae* 11 (1765), 28-66. Reprinted *Opera Omnia* I.3, 73-111. Euler's continued-fraction algorithm for $\sqrt D$: expand $\sqrt D = [a_0; \overline{a_1, a_2, \ldots, a_\ell}]$ where the bar denotes the periodic part of length $\ell$, then read the fundamental Pell solution off the convergent immediately before the period closes. Euler also misattributed the equation to **John Pell** based on a misreading of Wallis's *Algebra* (1685, ch. 98) — Pell merely edited Brouncker's solution for printing. The name 'Pell equation' has stuck despite the misattribution; the equation was studied by Brahmagupta in 628 CE, Bhaskara II in 1150 CE, Fermat in 1657, and Brouncker in 1658, with Pell himself contributing nothing.
Lagrange, J. L. — Solution d'un problème d'arithmétique · *Mémoires de l'Académie Royale des Sciences et Belles-Lettres de Berlin* (1768) 24 (1770), 165-310. Reprinted *Oeuvres* I, 671-731. Lagrange's two theorems: (i) **periodicity** — the continued-fraction expansion of every quadratic irrational $\alpha = (P + \sqrt D)/Q$ is eventually periodic; (ii) **purely periodic iff $\alpha$ is a reduced quadratic irrational**, meaning $\alpha > 1$ and $-1 < \bar\alpha < 0$. Combined with Euler's algorithm and a bound on the size of the period, Lagrange's theorems give the first complete proof that $x^2 - D y^2 = 1$ has a positive integer solution for every non-square $D > 0$. This is the first complete proof of Pell solvability in European mathematics — 620 years after Bhaskara II's chakravala.
Dirichlet, P. G. L. — Werke I and Vorlesungen über Zahlentheorie · 1837 *Werke* I, 313-342 (Pell solvability via pigeonhole). Dirichlet-Dedekind 1894 *Vorlesungen über Zahlentheorie* 4e §§80-100 give the pigeonhole proof of Pell solvability that bypasses continued fractions: by the Dirichlet approximation theorem, there exist infinitely many rationals $p/q$ with $|q \sqrt D - p| < 1/q$, so the values $p^2 - D q^2$ are bounded in absolute value by $2 \sqrt D + 1$; some non-zero value $N$ is attained for infinitely many pairs $(p, q)$, and two such pairs with $p_1 \equiv p_2 \pmod{|N|}$ and $q_1 \equiv q_2 \pmod{|N|}$ compose via Brahmagupta to give a solution of $x^2 - D y^2 = 1$. The proof also yields the **Dirichlet unit theorem** for $\mathbb{Q}(\sqrt D)$: the unit group is $\{\pm 1\} \times \langle \varepsilon \rangle$ where $\varepsilon$ is the fundamental unit.
Liouville, J. — Sur des classes très étendues de quantités · *Comptes Rendus de l'Académie des Sciences de Paris* 18 (1844), 883-885 and 910-911. **Liouville's theorem**: if $\alpha$ is an algebraic number of degree $n \geq 2$, there exists a constant $c(\alpha) > 0$ such that for every rational $p/q$ with $q \geq 1$, $|\alpha - p/q| > c(\alpha) q^{-n}$. Consequence: numbers admitting rational approximations to order strictly better than their algebraic degree are **transcendental**. The first explicit transcendental numbers were Liouville's $\sum_{k = 1}^\infty 10^{-k!}$ (rapidly approximable by truncations). The bound is sharp for quadratic irrationals (Pell-class) by Lagrange's continued-fraction theory but very weak for higher degree — the Thue-Siegel-Roth improvements show that the exponent $n$ can be replaced by $2 + \varepsilon$ for every algebraic irrational of any degree.
Thue, A. — Über Annäherungswerte algebraischer Zahlen · *Journal für die reine und angewandte Mathematik* 135 (1909), 284-305. Thue's improvement of Liouville: if $\alpha$ is algebraic of degree $n \geq 3$, then for every $\varepsilon > 0$ only finitely many rationals $p/q$ satisfy $|\alpha - p/q| < q^{-(n/2 + 1 + \varepsilon)}$. Consequence: **Thue equations** $f(x, y) = c$ for $f$ an irreducible binary form of degree $\geq 3$ have only finitely many integer solutions — a generalisation of the finiteness of solutions of $y^2 = x^3 + k$ (Mordell 1922) and prefiguring Siegel's theorem on integral points on curves of genus $\geq 1$ (Siegel 1929).
Siegel, C. L. — Approximation algebraischer Zahlen · *Mathematische Zeitschrift* 10 (1921), 173-213. Siegel's further improvement of Thue: the exponent $n/2 + 1$ can be replaced by $\min_s (n/(s + 1) + s)$ for any integer $s$ with $1 \leq s \leq n$, attaining the optimum $2 \sqrt n$ at $s = \sqrt n - 1$. Siegel's theorem is the proximate predecessor of Roth's theorem and the foundation for Siegel's 1929 result on integral points on curves.
Roth, K. F. — Rational approximations to algebraic numbers · *Mathematika* 2 (1955), 1-20 (corrigendum p. 168). **Roth's theorem**: for every algebraic irrational $\alpha$ and every $\varepsilon > 0$, only finitely many rationals $p/q$ satisfy $|\alpha - p/q| < q^{-(2 + \varepsilon)}$. The exponent $2$ is sharp by the theory of continued fractions ($|\alpha - p_n/q_n| < q_n^{-2}$ at every convergent of every irrational). Roth received the Fields Medal 1958 for this result. The proof is ineffective: it bounds the number of solutions but does not bound their height, so explicit bounds for Diophantine equations cannot be deduced. The Baker 1966-67 theory of linear forms in logarithms gives effective bounds at the cost of weaker exponents.
Hardy, G. H. and Wright, E. M. — An Introduction to the Theory of Numbers · Oxford University Press, 6th edition (2008), revised by D. R. Heath-Brown and J. H. Silverman. Chapter X develops the theory of continued fractions for real numbers: the Euclidean expansion of rationals, the infinite continued-fraction representation of every irrational, the convergents $p_n/q_n$ with the best-approximation property $|q \alpha - p| \geq |q_n \alpha - p_n|$ for $q \leq q_n$. Chapter XI develops the theory of quadratic-irrational continued fractions: Lagrange's periodicity theorem, the Pell equation and its complete solution via the convergents of $\sqrt D$, the generalised Pell equation $x^2 - D y^2 = N$, the negative Pell equation. The standard reference for the elementary theory.
Niven, I., Zuckerman, H. S., and Montgomery, H. L. — An Introduction to the Theory of Numbers · Wiley, 5th edition (1991). §§7.7-7.8 develop the Pell equation $x^2 - D y^2 = N$ for general $N$, the negative Pell case $N = -1$ with the criterion that $N = -1$ is solvable iff the period of $\sqrt D$'s continued fraction is odd, and the structure of solutions of the generalised Pell equation as a finite union of orbits under the action of the unit group of $\mathbb{Q}(\sqrt D)$.
Khinchin, A. Ya. — Continued Fractions · Originally *Tsepnye drobi* (Moscow 1935; 3rd Russian edition 1961). English translation by Scripta Technica, University of Chicago Press 1964; Dover reprint 1997. The canonical monograph on continued fractions: §§I.1-9 the basic algorithm and convergents; §§II.10-13 the metric theory (Khinchin's constant, Gauss-Kuzmin distribution, Markov spectrum); §§III.14-17 best rational approximations and quadratic-irrational characterisation. Khinchin's theorem (1928): for almost every real $\alpha$ in the Lebesgue sense, the geometric mean of partial quotients $\sqrt[n]{a_1 a_2 \cdots a_n}$ converges to **Khinchin's constant** $K_0 \approx 2.685$.
Lang, S. — Diophantine Geometry · Interscience, 1962. Chapter VII develops Pell equations, units in number fields, and the Liouville-Thue-Siegel-Roth tower of approximation results from the algebraic-geometric viewpoint. The bridge from the elementary continued-fraction theory of Pell to the modern theory of integral points on curves and the Mordell conjecture (proved by Faltings 1983).
Cassels, J. W. S. — An Introduction to Diophantine Approximation · Cambridge Tracts in Mathematics and Mathematical Physics 45. Cambridge University Press, 1957 (reprinted 1965, Hafner). Chapters II-III develop the theory of approximation to algebraic numbers from Liouville through Thue, Siegel, and Roth, with the explicit construction of the auxiliary polynomial used in the Thue-Siegel-Roth proofs and a careful treatment of the ineffectivity issue.

Estimated time

beginner: 24m
intermediate: 58m
master: 100m