21.01.01 · number-theory / elementary

Divisibility, GCD, Bézout's identity, and the Euclidean algorithm

shipped3 tiersLean: none

Anchor (Master): Euclid c. 300 BCE *Elements* Book VII Props. 1-2 (originator of the algorithm); Hardy-Wright 2008 *An Introduction to the Theory of Numbers* 6e (Oxford, revised Heath-Brown-Silverman-Wiles) §§I-II; Lang 2002 *Algebra* (Springer GTM 211, 3e) Ch. II §1 (Euclidean rings and principal ideal domains); Knuth 1997 *The Art of Computer Programming, Vol. 2: Seminumerical Algorithms* 3e §4.5.2 (greatest common divisor); Lamé 1844 *Comptes Rendus* 19 (Fibonacci worst case); Stein 1967 *J. Comput. Phys.* 1 (binary GCD)

Intuition Beginner

Divisibility is the most basic question you can ask about two whole numbers. Given and , you can split into four equal piles of with nothing left over, so divides . Given and , you cannot — you get two piles of and a remainder of . The whole subject of number theory begins with this single question and the patterns that emerge when you ask it across many pairs of integers.

The greatest common divisor of two positive integers is the largest whole number that divides both. The gcd of and is , because divides both ( and ) and no larger number does. The gcd shows up everywhere: simplifying fractions, finding common periods, deciding when two clocks tick in unison, and analysing the structure of cyclic groups.

The Euclidean algorithm is an old and elegant procedure for computing the gcd quickly. Instead of factoring both numbers and comparing factors, you replace the larger number by the remainder when it is divided by the smaller, and repeat. The numbers shrink fast, and the last nonzero remainder is the gcd. The method dates to Euclid's Elements around BCE and is still the algorithm a modern computer uses.

Bézout's identity is the beautiful side of this story. It says that the gcd of and is always reachable as a sum with and whole numbers (one of which is usually negative). The gcd of and is , and indeed . Combinations of and exactly fill the multiples of .

Visual Beginner

A diagram of two stacks of boxes side by side, one of height and one of height . The procedure removes the smaller stack from the larger as many times as it fits, leaving a stack of height . Then the procedure repeats with the new pair , where contains two copies of exactly, leaving a stack of height . The last nonzero remainder, , is the gcd. A line at the bottom shows the Bézout combination .

The picture captures the entire algorithm in two steps. The number on the right side keeps shrinking until it hits zero, and the previous nonzero value is the answer. The Bézout combination is the bookkeeping that records how the answer was assembled from the original two numbers.

Worked example Beginner

Compute the gcd of and using the Euclidean algorithm, then write the gcd as a combination .

Step 1. Divide by : . The remainder is , so replace the pair by .

Step 2. Divide by : . The remainder is , so replace by .

Step 3. Divide by : . The remainder is , so the algorithm stops. The last nonzero remainder is , so .

Step 4. Reverse the chain to find the Bézout coefficients. From Step 2: . From Step 1: . Substitute the second into the first:

So . Check: .

What this tells us: three division steps produce both the gcd and the explicit coefficients , . The reverse pass is just bookkeeping — every gcd computation carries inside it an explicit recipe for writing the gcd as an integer combination of the original two numbers.

Check your understanding Beginner

Formal definition Intermediate+

Divisibility, the greatest common divisor, and Bézout coefficients admit clean definitions in the ring of integers . The Euclidean algorithm is the constructive bridge between them.

Definition (divisibility). For integers , we say divides , written , if there exists an integer with . The relation is reflexive (), transitive ( and imply ), and respects linear combinations ( and imply for all ).

Definition (greatest common divisor). For not both zero, the greatest common divisor is the unique positive integer such that

  1. and (common divisor), and
  2. for every with and , we have (universal among common divisors).

Equivalently, is the largest positive integer dividing both and under the natural-number ordering; condition (2) sharpens "largest" to a universal property that survives the ring-theoretic generalisation. By convention .

Definition (Bézout coefficients). Integers satisfying are called Bézout coefficients for the pair . They are not unique: if is a Bézout pair and , then is also a Bézout pair for every .

Definition (Euclidean algorithm). Given with , the Euclidean algorithm produces a sequence of remainders by repeated division-with-remainder:

The remainders strictly decrease, so the sequence terminates at some . The last nonzero remainder is the gcd of and .

Counterexamples to common slips

  • " divides is false because you cannot divide by zero." Under the definition above, holds: setting (or any integer) gives . The everyday-arithmetic prohibition against dividing by zero concerns the fraction being undefined, not the divisibility relation . The integer is then a convention chosen so that gcd is associative and respects the ideal-theoretic definition generator of , with .

  • "The Bézout coefficients are unique." The pair with is determined only up to shifts by multiples of the orthogonal lattice direction. For , the pair works (), and so does for every , since and .

  • "The Euclidean algorithm requires ." Strictly the algorithm makes sense for any with at least one nonzero, after reduction to the positive case via . The recursion stays well-defined when via the base case .

Key theorem with proof Intermediate+

The signature theorem of this unit is Bézout's identity, the existence statement underlying every later application of the Euclidean algorithm (modular inverses, the structure of as a PID, RSA, the Chinese remainder theorem). We prove it via the well-ordering principle on the set of positive integer linear combinations.

Theorem (Bézout's identity). For integers not both zero, there exist integers such that

Moreover, equals the smallest positive integer in the set .

Proof. Let be the set of positive integer linear combinations of and . Since and are not both zero, at least one of or lies in , so is non-empty. By the well-ordering principle on , has a least element for some .

We claim . First, : divide by with remainder, with . Then

so is an integer linear combination of and . If , then and , contradicting the minimality of . Hence , and . By the symmetric argument, .

Next, is universal: if satisfies and , then . The two conditions characterising the gcd are satisfied, so . Since , this proves the moreover statement: is the smallest positive element of the set of integer linear combinations of and .

Bridge. Bézout's identity builds toward 01.02.06 commutative rings and ideals, where the proof above reappears in the form "the principal ideal with ", and the central insight is exactly the bridge from the ring-theoretic statement that is a principal ideal domain to the analytic statement that every additive subgroup of generated by integers is cyclic. The foundational reason the well-ordering argument works is that admits the strong-induction principle of 00.12.01, which is exactly the categorical content of being a well-founded recursion target. Putting these together identifies the Euclidean algorithm with a well-founded recursion on the lexicographic ordering of the pair , generalises to every Euclidean domain (the polynomial ring , the Gaussian integers , the Eisenstein integers ), and appears again in 21.01.04 the fundamental theorem of arithmetic, whose proof reduces via Bézout to Euclid's Lemma (a prime dividing divides or ).

Exercises Intermediate+

Lean formalization Intermediate+

Mathlib already supplies the operational definitions and the Bézout identity in the form needed for everyday formalisation, but the pedagogical reconstruction below names the load-bearing pieces explicitly.

-- Operative imports: Mathlib.Data.Int.GCD, Mathlib.Data.Nat.GCD.Basic,
-- Mathlib.RingTheory.EuclideanDomain, Mathlib.RingTheory.PrincipalIdealDomain

#check @Nat.gcd                -- Nat → Nat → Nat
#check @Nat.gcd_dvd_left        -- ∀ (m n : Nat), Nat.gcd m n ∣ m
#check @Nat.gcd_dvd_right       -- ∀ (m n : Nat), Nat.gcd m n ∣ n
#check @Nat.dvd_gcd             -- ∀ {m n k : Nat}, k ∣ m → k ∣ n → k ∣ Nat.gcd m n

-- Bezout via extended Euclidean (xgcd):
#check @Int.gcd_eq_gcd_ab
-- Int.gcd_eq_gcd_ab :
--   ∀ (a b : ℤ), (Int.gcd a b : ℤ) = a * Int.gcdA a b + b * Int.gcdB a b

-- Ring-theoretic generalisation:
#check @EuclideanDomain
#check @IsPrincipalIdealRing
example : IsPrincipalIdealRing ℤ := inferInstance

The unit's central theorem (Bézout's identity in ) is Int.gcd_eq_gcd_ab, which Mathlib derives from the recursive Int.xgcd mirroring the algorithm of Exercise 7. The implication EuclideanDomain ⇒ IsPrincipalIdealRing lives in Mathlib.RingTheory.PrincipalIdealDomain and supplies the ring-theoretic abstraction underpinning the Master-tier statement that is a PID. What Mathlib does not yet supply in a unified pedagogical module — the Lamé complexity bound, the binary GCD, the Lehmer word-level speedup, and the Knuth-Schönhage half-GCD — is documented in the unit metadata Mathlib gap analysis.

Advanced results Master

Theorem 1 (Bézout's identity, ring-theoretic restatement). The set is an ideal of , and equals . Equivalently, the principal ideals satisfy with . This is the ideal-theoretic encoding of Bézout: the gcd is the generator of the ideal sum, and the existence of with is the statement that lies in .

Theorem 2 (Lamé 1844 C.R. Acad. Sci. 19). The number of division steps required by the Euclidean algorithm to compute with is at most five times the number of decimal digits of . Equivalently, the worst-case running time is , with the worst case realised by consecutive Fibonacci numbers and . [Lamé 1844] The constant five comes from the inequality , where is the golden ratio: each Euclidean step reduces the smaller argument by a factor at least in the worst case, so the number of steps is bounded by .

Theorem 3 (Stein 1967 binary GCD). The binary GCD algorithm computes using only bit-shifts, subtractions, and tests for parity, avoiding integer division altogether [Stein 1967]. On a binary computer this is typically 20-60% faster than the Euclidean algorithm despite using more steps, because each step is cheaper. The algorithm uses three rules: , for odd , and for odd . The bit-complexity is for -bit inputs, identical asymptotically to the ordinary Euclidean algorithm but with smaller hidden constants on real hardware. Knuth 1997 §4.5.2 gives the full analysis [Knuth 1997].

Theorem 4 (Lehmer 1938 word-level speedup). For multi-precision integers represented as arrays of machine words, the Euclidean algorithm can be sped up by performing several Euclidean steps at the level of the leading words alone, then committing the accumulated matrix back to multi-precision [Lehmer 1938]. This reduces the number of multi-precision operations by a factor of the machine word size (typically or ), at the cost of a small constant overhead per word-level pass. Knuth 1997 §4.5.2 contains the full pseudocode and complexity analysis.

Theorem 5 (Knuth-Schönhage subquadratic GCD). The half-GCD algorithm reduces the computation of for -bit integers to a constant number of -bit multiplications, recursively yielding a complexity where is the cost of -bit multiplication [Knuth 1997]. With Schönhage-Strassen-style FFT multiplication at , the GCD complexity becomes , beating the quadratic Euclidean bound. The algorithm proceeds by computing a integer matrix that effects Euclidean steps at once.

Theorem 6 (Euclidean domains and the implication chain). Every Euclidean domain is a principal ideal domain, and every PID is a unique factorisation domain. The implications are strict: is a PID that is not Euclidean (Motzkin 1949); is a Noetherian integral domain that is neither PID nor UFD (witnessed by ). The proof Euclidean PID is precisely the well-ordering / minimal-positive-element argument we used for ; the proof PID UFD reduces to Euclid's Lemma via the ideal characterisation of primality. Lang 2002 Algebra Ch. II §1 contains the full development [Lang 2002].

Theorem 7 (Lattice interpretation; as a PID). The set is the rank-1 sublattice of generated by . This identifies the action of the integer matrix

with a unimodular transformation of that diagonalises to via the Smith normal form. The Smith normal form for general integer matrices generalises this picture: every integer matrix admits a decomposition with unimodular and diagonal with each diagonal entry dividing the next, recovering the entire elementary-divisor theory of finitely generated abelian groups (a foundational consequence of Theorem 6).

Theorem 8 (Euclid's algorithm in ). The Gaussian integers form a Euclidean domain under the norm . The division algorithm: given with , write with and let where are integers closest to respectively. Then , so . The Euclidean recursion terminates on the strictly decreasing norm. Consequence: is a PID and a UFD, the foundation of Fermat's two-square theorem ( iff for odd primes ).

Synthesis. The Euclidean algorithm is the foundational reason that is a principal ideal domain, and this is exactly the bridge from elementary number theory to commutative algebra. The central insight is that one effective division-with-remainder operation — once verified to terminate via a well-founded norm — generates the entire ideal-theoretic structure: Bézout's identity, Euclid's Lemma, the fundamental theorem of arithmetic, the structure theorem for finitely generated abelian groups, and the Smith normal form. The pattern generalises from to every Euclidean domain ( for a field, , , the formal power series , the localisation at a prime), and the implication Euclidean PID UFD identifies the algorithmic content of with its categorical-algebraic content.

Putting these together with the complexity analysis, the Euclidean algorithm is one of the rare procedures of antiquity (Euclid c. 300 BCE) that survives without modification into modern algorithmics: Lamé 1844 established the logarithmic step count, Stein 1967 reframed for binary hardware, Lehmer 1938 added the word-level speedup, and Knuth-Schönhage 1971 broke the quadratic bit-complexity barrier via the half-GCD. The bridge is exactly the lattice identification , which generalises into the lattice reduction of Lenstra-Lenstra-Lovász 1982 (LLL), cryptanalytic shortest-vector approximations, and the modern theory of integer linear programming.

Full proof set Master

Proposition 1 (Lamé's bound, full proof). If the Euclidean algorithm on with requires exactly division steps before terminating, then , where is the -th Fibonacci number ().

Proof. Let be the remainder sequence. We claim by reverse induction on that .

Base cases: and .

Inductive step. Assume and . Each Euclidean step satisfies with (since the step is non-degenerate). Hence . Applying this with :

The induction proceeds. Taking gives .

Corollary. Inverting via the Fibonacci closed form with and , the step count satisfies , hence . The constant five in Lamé's original statement is the consequence : each five steps of the Euclidean algorithm reduce by at least a factor of , so the number of steps is at most plus a constant.

Proposition 2 ( is a PID, via Bézout). Every ideal is principal.

Proof. If , then is principal. Otherwise let be the least positive element of (which exists by well-ordering applied to , non-empty because contains nonzero integers and their negatives). We claim .

The inclusion is immediate since and is closed under integer multiplication.

For , take any and divide by with remainder: with . Then (since both and lie in ). If , this contradicts the minimality of in . Hence and .

Proposition 3 (Euclid's Lemma, ideal-theoretic form). Let be a prime. Then the ideal is a prime ideal: if , then or .

Proof. Suppose , meaning . If , then , so (since the only positive divisors of the prime are and , and does not divide ). By Bézout, for some . Multiply by : . Both and are divisible by (the first directly, the second because ), so , i.e., .

Proposition 4 (Fundamental theorem of arithmetic). Every integer admits a factorisation as a product of prime powers, unique up to reordering.

Proof. Existence: by strong induction on (see 00.12.01). If is prime, the factorisation is itself. Otherwise with , and the inductive hypothesis gives prime factorisations of and ; concatenating yields one for .

Uniqueness: suppose are two prime factorisations of the same integer. By Euclid's Lemma (Proposition 3), for some ; since is prime, . Cancel and induct on .

Connections Master

  • Mathematical induction 00.12.01. The termination proof for the Euclidean algorithm reduces to strong induction on the smaller argument, and the proof of Bézout's identity via well-ordering uses the equivalent well-ordering principle established in the induction unit. The lattice-of-ideals argument in Proposition 2 (every ideal of is principal) is itself a well-ordering argument on the positive elements of the ideal. The foundational reason these arguments work is that is well-founded, the load-bearing structural fact catalogued in 00.12.01.

  • Real numbers, integers, rationals 00.01.01. The integers are the substrate on which divisibility, gcd, and Bézout coefficients are defined. The construction of from (via ordered pairs and the equivalence ) supplies the additive-inverse machinery that lets Bézout coefficients be negative. The construction of as a field of fractions then relies on the lemma "every nonzero integer has a gcd with every other nonzero integer", established here as a corollary of Bézout.

  • Group theory 01.02.01. The set is a cyclic subgroup of , and the lattice interpretation in Theorem 7 generalises to the structure theorem for finitely generated abelian groups: every such group is a direct sum of cyclic factors, with the diagonal entries of the Smith normal form recording the invariant factors. The Bézout argument is the load-bearing step in the rank-1 case, and the Smith-normal-form induction extends it to arbitrary rank. This is the bridge from elementary number theory to the structure theory of modules over a PID.

  • Hilbert basis theorem 01.02.17. The Noetherian property of — every ascending chain of ideals stabilises — is an immediate consequence of being a PID, established via Bézout in Proposition 2. The chain of ideals corresponds to a chain of divisibility relations among generators, and the strictly decreasing alternative (in the divisibility order, replacing "less than" with "properly divides") cannot extend indefinitely. The general Hilbert basis theorem catalogued in 01.02.17 specialises to this dimension-1 fact for , and the extension to multivariate polynomial rings rests on the same well-foundedness pattern.

  • Riemann zeta function 21.03.01. The Euler product relies on the fundamental theorem of arithmetic (Proposition 4 above), whose proof reduces to Euclid's Lemma and thence to Bézout's identity. The bridge from elementary number theory to analytic number theory is exactly this: the multiplicative structure of — unique factorisation — is what allows the Dirichlet series to factor as a product over primes. Without Bézout, the zeta unit's central Euler-product identity would have no proof. This unit is the load-bearing prerequisite for the entire L-function development in chapter 21.03.

Historical & philosophical context Master

The Euclidean algorithm is among the oldest mathematical procedures still in active use. Euclid c. 300 BCE Elements Book VII Propositions 1-2 [Euclid] gives the algorithm for natural numbers — under the name anthyphairesis (), "alternating subtraction" — and Proposition VII.2 explicitly establishes that the procedure terminates and produces the greatest common measure. The geometric setting was line segments, with the algorithm originally serving as the test for commensurability: two segments are commensurable if and only if their anthyphairesis terminates. The extension to incommensurable segments, where the procedure does not terminate, is the conceptual seed of the continued-fraction expansions of irrational numbers. Historians (Knorr 1975 The Evolution of the Euclidean Elements, Fowler 1999 The Mathematics of Plato's Academy) trace the algorithm further back to Theaetetus and possibly the Pythagorean discovery of incommensurability in the 5th century BCE.

The complexity analysis is comparatively modern. Lamé 1844 C.R. Acad. Sci. 19 [Lamé 1844] published a one-page note proving that the number of division steps is at most five times the number of decimal digits of the smaller argument, with the worst case attained by consecutive Fibonacci numbers — the first substantive bit-complexity bound for an algorithm in the modern sense, predating the formalisation of complexity theory by more than a century. The connection to the golden ratio via identifies the worst case with the slowest possible decrease, providing both the asymptotic bound and the sharp constant.

Stein 1967 J. Comput. Phys. 1 [Stein 1967] reformulated the algorithm for binary computer arithmetic: by replacing division-with-remainder by parity-test, bit-shift, and subtraction, the binary GCD avoids the multi-precision division that dominates the cost of the classical Euclidean algorithm on machine integers. Lehmer 1938 Amer. Math. Monthly 45 [Lehmer 1938] had earlier proposed performing several Euclidean steps at the level of leading words to reduce the number of multi-precision operations, a technique still used in modern arbitrary-precision libraries (GMP, Magma). Knuth-Schönhage 1971 introduced the half-GCD algorithm reducing GCD to fast multiplication and thereby breaking the quadratic bit-complexity barrier [Knuth 1997]. Knuth 1997 TAOCP Vol. 2 §4.5.2 collects the algorithmic theory in one place and remains the canonical reference.

The ring-theoretic abstraction took shape in the late 19th century. Dedekind 1871 introduced the notion of an ideal in his supplements to Dirichlet's Vorlesungen über Zahlentheorie, with the explicit motivation of restoring unique factorisation in algebraic number rings where it fails at the element level. Kronecker 1882 J. reine angew. Math. 92 developed parallel notions via divisors. Noether 1921 Math. Ann. 83 abstracted the chain condition and identified Noetherian rings as the natural setting for unique-factorisation theorems. The chain Euclidean PID UFD became standard textbook material with van der Waerden 1930 Moderne Algebra and Lang 2002 Algebra Ch. II §1 [Lang 2002]. The recognition that is the prototype of every PID, and that Bézout's identity is the operational content of the principality, organises modern commutative algebra from its first pages.

Bibliography Master

@book{EuclidElements,
  author    = {Euclid},
  title     = {The Thirteen Books of Euclid's Elements},
  editor    = {Heath, Thomas L.},
  publisher = {Cambridge University Press},
  year      = {1908},
  note      = {Three volumes; Dover reprint 1956. Book VII Propositions 1-2 contain the Euclidean algorithm (anthyphairesis), originally c. 300 BCE}
}

@article{Lame1844,
  author  = {Lam\'{e}, Gabriel},
  title   = {Note sur la limite du nombre des divisions dans la recherche du plus grand commun diviseur entre deux nombres entiers},
  journal = {Comptes Rendus de l'Acad\'{e}mie des Sciences (Paris)},
  volume  = {19},
  pages   = {867--870},
  year    = {1844}
}

@article{Stein1967,
  author  = {Stein, Josef},
  title   = {Computational problems associated with Racah algebra},
  journal = {Journal of Computational Physics},
  volume  = {1},
  pages   = {397--405},
  year    = {1967}
}

@article{Lehmer1938,
  author  = {Lehmer, Derrick Henry},
  title   = {Euclid's algorithm for large numbers},
  journal = {American Mathematical Monthly},
  volume  = {45},
  pages   = {227--233},
  year    = {1938}
}

@book{Knuth1997,
  author    = {Knuth, Donald E.},
  title     = {The Art of Computer Programming, Vol. 2: Seminumerical Algorithms},
  edition   = {3},
  publisher = {Addison-Wesley},
  year      = {1997},
  note      = {Section 4.5.2 (Greatest Common Divisor) is the canonical algorithmic reference}
}

@book{HardyWright2008,
  author    = {Hardy, G. H. and Wright, E. M.},
  title     = {An Introduction to the Theory of Numbers},
  edition   = {6},
  publisher = {Oxford University Press},
  year      = {2008},
  note      = {Revised by D. R. Heath-Brown and J. H. Silverman, foreword by A. Wiles. Chapters I-II contain the elementary theory.}
}

@book{IrelandRosen1990,
  author    = {Ireland, Kenneth and Rosen, Michael},
  title     = {A Classical Introduction to Modern Number Theory},
  edition   = {2},
  series    = {Graduate Texts in Mathematics},
  volume    = {84},
  publisher = {Springer},
  year      = {1990}
}

@book{Lang2002,
  author    = {Lang, Serge},
  title     = {Algebra},
  edition   = {3},
  series    = {Graduate Texts in Mathematics},
  volume    = {211},
  publisher = {Springer},
  year      = {2002},
  note      = {Chapter II §1 develops Euclidean domains, PIDs, and the structure of finitely generated modules over a PID}
}

@book{Apostol1976,
  author    = {Apostol, Tom M.},
  title     = {Introduction to Analytic Number Theory},
  series    = {Undergraduate Texts in Mathematics},
  publisher = {Springer},
  year      = {1976}
}