Linear Codes and the Hamming, Singleton, and Gilbert-Varshamov Bounds
Anchor (Master): van Lint-Wilson 2001 *A Course in Combinatorics* 2e (Cambridge) Ch. 18-20 (linear codes, weight enumerators, MacWilliams, the bounds); MacWilliams-Sloane 1977 *The Theory of Error-Correcting Codes* (North-Holland) Ch. 1-5, 19 (the definitive treatment of weight enumerators and the bounds); Hamming 1950 *Bell System Tech. J.* 29 (the sphere-packing bound and the original single-error-correcting codes); Gilbert 1952 *Bell System Tech. J.* 31; Varshamov 1957 *Dokl. Akad. Nauk SSSR* 117; Singleton 1964 *IEEE Trans. Inform. Theory* 10; MacWilliams 1963 (the weight-enumerator duality)
Intuition Beginner
When a message travels down a noisy channel, some of its symbols flip. If you send the single bit and it arrives as , the receiver has no way to notice, let alone repair the damage. The fix is to add planned redundancy: instead of one bit, send a longer string in which the extra symbols are bound to the message by fixed rules. If noise breaks one of those rules, the receiver sees that something is wrong, and if only a little noise occurred, the receiver can even guess back the original. A code is a chosen collection of allowed strings, called codewords; everything you ever transmit is one of them.
The key number is how far apart the codewords sit. Measure the distance between two strings by counting the positions where they differ. If every pair of codewords differs in at least positions, then a handful of flipped symbols cannot carry one codeword all the way to another. The received string stays closer to the codeword you sent than to any other, so decoding to the nearest codeword recovers the message. The larger is, the more errors you survive.
This sets up a tug of war. You want many codewords, so you can send many different messages; you want them long-spaced, so you correct many errors; and you want the strings short, so you waste little channel. These three wishes fight each other, and the whole subject is about the exact terms of the trade. The bounds in this unit are the rules of that fight: how good a code is even possible, and how good a code is guaranteed to exist.
Visual Beginner
Picture every string of length as a point, and draw an edge between two strings that differ in exactly one position. The codewords are a scattered handful of these points, deliberately spread out. Around each codeword draw a ball: all the strings within a small number of flips of it. If the balls do not overlap, then any lightly corrupted string sits inside exactly one ball, and you decode it to that ball's centre.
| string | differs from in | inside the ball of ? |
|---|---|---|
| positions | yes (it is the centre) | |
| position | yes, if the ball has radius | |
| positions | no, if the ball has radius |
The packing question is geometric: how many non-overlapping balls of a given radius fit in the space of strings? Too many codewords and the balls collide, so decoding becomes ambiguous. The bounds count how the balls fit.
Worked example Beginner
Build the simplest useful code by hand: the binary repetition code that sends each bit three times. The two codewords are and . The allowed strings have length , and there are of them, one for the message and one for the message .
Step 1. Find the distance between the two codewords. Compare and position by position: they differ in all places. So the minimum distance is .
Step 2. Decide how many flips you survive. With , two codewords are apart, so a received string within flip of a codeword is still strictly closer to that codeword than to the other. The code corrects error.
Step 3. Decode a received word. Suppose arrives. Count its distance to each codeword: it differs from in positions and from in position. The nearer codeword is , so you decode to , recovering the message bit .
Step 4. Decode by majority. Notice the rule is just a majority vote: has two s and one , so the answer is . Nearest-codeword decoding and majority voting agree here.
What this tells us: spending three symbols to send one bit buys the power to fix any single flip. The price is the rate — one message bit per three channel symbols. Every code is a deal of this shape, and the bounds say how good a deal you can strike.
Check your understanding Beginner
Formal definition Intermediate+
Throughout, is the finite field with elements (its existence and structure are developed in 21.02.01; here it is applied, not reconstructed), and is the standard -dimensional vector space of row vectors, whose linear-algebraic theory is taken from 01.01.04. The integer is a prime power.
Definition (linear code). A linear code of length over is a -dimensional linear subspace . Its elements are codewords; the parameters are written , and . The rate is .
Definition (Hamming weight and distance). For , the Hamming distance counts the positions in which and differ; it is a metric. The Hamming weight counts the nonzero coordinates. The minimum distance of is $$ d(C) = \min_{\substack{x, y \in C \ x \neq y}} d(x, y). $$
Lemma (linearity collapses distance to weight). Because is closed under subtraction, with , so $$ d(C) = \min_{\substack{c \in C \ c \neq 0}} \mathrm{wt}(c). $$ A code with parameters and minimum distance is called an code. By the metric triangle inequality, nearest-codeword (minimum-distance) decoding corrects any error pattern of weight at most and detects any nonzero pattern of weight at most .
Definition (generator and parity-check matrices). A generator matrix is a matrix over whose rows are a basis of , so . The dual code is , where ; it is an code. A parity-check matrix is an generator matrix of , so that $$ C = { x \in \mathbb{F}q^n : H x^{\mathsf T} = 0 }. $$ When is in standard form, $H = [, -A^{\mathsf T} \mid I{n-k} ,]G H^{\mathsf T} = -A + A = 0$.
Definition (syndrome and standard array). The syndrome of a received word is . Two words have the same syndrome iff they lie in the same coset of ; writing with , the syndrome depends only on the error . Syndrome decoding fixes, in each coset, a coset leader of least weight, and decodes to . The table of cosets, with leaders down the first column, is the standard array.
Counterexamples to common slips Intermediate+
Minimum distance is a property of the whole code, not of one pair. It is the least weight over all nonzero codewords; finding two codewords at distance does not establish unless no nonzero codeword has smaller weight.
The dual code is not the set-complement. is the orthogonal subspace under the standard bilinear form, and over a code may be self-dual (, forcing ) or self-orthogonal (); a codeword can be orthogonal to itself since may vanish in characteristic dividing the weight.
Parity-check rows count, not parity-check equations chosen ad hoc. Any matrix of rank whose row space is is a valid ; different choices give the same code but different syndromes, hence different standard arrays.
A generator matrix need not be in standard form. Row operations and column permutations bring to only up to a permutation-equivalent code; the permutation changes which coordinates are message symbols but preserves all parameters .
Key theorem with proof Intermediate+
The structural fact that turns minimum distance into linear algebra is the parity-check criterion: the distance of a code is read directly off the columns of any parity-check matrix. Everything downstream — the Singleton bound, MDS codes, the construction half of Gilbert-Varshamov — flows from it.
Theorem (parity-check distance criterion). Let be an code with parity-check matrix , and let . Then if and only if every set of columns of is linearly independent. Equivalently, equals the least number of linearly dependent columns of [van Lint 1999].
Proof. Write the columns of as . A vector lies in iff , that is, iff . Let be nonzero with support of size . Then is a nonzero linear dependence among exactly columns of , with all coefficients nonzero. Conversely, any linear dependence with and all defines a codeword with for and otherwise, of weight .
Hence the weights of nonzero codewords are exactly the sizes of minimal sets of columns of admitting a dependence with all coefficients nonzero — which, after discarding zero coefficients, are the sizes of linearly dependent column sets. The minimum weight is therefore the least size of a linearly dependent set of columns. In particular exactly when no columns are dependent, i.e. every columns are independent.
Corollary (Singleton bound). Every code satisfies [Singleton 1964]. The matrix has rows, so any columns are linearly dependent. By the criterion the least dependent column set has size at most , i.e. . A code meeting this with equality, , is maximum distance separable (MDS).
Bridge. The parity-check criterion builds toward all three classical bounds by converting a metric question into a question about column rank, and it appears again in the master-tier construction of Gilbert-Varshamov, where a parity-check matrix is grown one column at a time so that no small set of columns becomes dependent. The foundational reason the Singleton bound holds is exactly this: a matrix of rows cannot keep more than columns independent, so the simple case of full-rank linear algebra already caps the distance. This is dual to the Hamming bound, which limits distance from below by counting volume rather than rank: where Singleton is an upper bound on forced by the row count of , sphere-packing is an upper bound on forced by the radius of decoding balls. Putting these together, an MDS code is the case where the rank obstruction is the only obstruction, and the central insight — that the columns of encode the code's whole error-correcting geometry — is what generalises, in the master tier, into the weight enumerator and its MacWilliams dual.
Exercises Intermediate+
Advanced results Master
The bounds frame the achievable region; the weight enumerator records a code's full distance profile and is governed by a single duality, the MacWilliams identity. We collect the structural results: the asymptotic Gilbert-Varshamov bound, the sphere-packing bound and perfect codes, the weight enumerator and MacWilliams transform, and the place of MDS codes.
Theorem 1 (Hamming sphere-packing bound, -ary). Any -ary code (linear or not) of length , size , and minimum distance correcting errors satisfies , where is the volume of a Hamming ball of radius [Hamming 1950]. The -balls about codewords are disjoint and lie in . Equality holds exactly for perfect codes, where the balls tile the space; the binary and -ary Hamming codes, the binary and ternary Golay codes, the repetition codes, and the degenerate whole-space codes exhaust all perfect codes over fields (van Lint-Tietäväinen), pursued in the co-produced perfect-codes unit.
Theorem 2 (asymptotic Gilbert-Varshamov bound). Define the -ary entropy for . Then for every relative distance there exist (linear) codes of rate arbitrarily close to with relative minimum distance , because and the Varshamov condition becomes [Gilbert 1952]. For this remained, until the 1982 Tsfasman-Vlăduţ-Zink algebraic-geometry codes, the best known lower bound on achievable rate; for binary codes it still is.
Theorem 3 (weight enumerator). The weight enumerator of an code is the homogeneous polynomial $$ W_C(x, y) = \sum_{c \in C} x^{,n - \mathrm{wt}(c)}, y^{,\mathrm{wt}(c)} = \sum_{i=0}^{n} A_i, x^{n-i} y^{i}, $$ where is the weight distribution; and for . The enumerator packages error probabilities, the undetected-error rate, and (via ) the minimum distance.
Theorem 4 (MacWilliams identity). For a linear code with dual , $$ W_{C^\perp}(x, y) = \frac{1}{|C|}, W_C\big(x + (q-1)y,; x - y\big) $$ [MacWilliams-Sloane Ch. 5]. Equivalently the dual weight distribution is , where is the -ary Krawtchouk polynomial. The proof is a discrete Fourier/Poisson-summation argument over the additive characters of : summing the character sum over collapses to on and off it, and re-expanding the generating function in the weight variables produces the Krawtchouk transform. The identity lets one compute 's spectrum from 's without enumerating , and underlies the linear-programming (Delsarte) bound.
Theorem 5 (MDS duality and the singleton case). If is MDS with parameters , then is MDS with parameters , and the weight distribution of an MDS code is determined entirely by : $$ A_i = \binom{n}{i}\sum_{j=0}^{,i - d}(-1)^j\binom{i}{j}\big(q^{,i - d + 1 - j} - 1\big), \qquad i \geq d = n-k+1, $$ a closed form forced by the MacWilliams relations together with , . Generalised Reed-Solomon codes realise MDS parameters for all ; the MDS conjecture asserts for MDS codes with (except two exceptions at even ), proved for prime by Ball (2012).
Synthesis. Putting these together, the three bounds and the weight enumerator are one circle of ideas pinned by the parity-check matrix. The Singleton bound is the rank ceiling on distance; the sphere-packing bound is the volume ceiling on size; the Gilbert-Varshamov bound is the volume floor guaranteeing existence — and the foundational reason all three speak the same language is that each counts column-configurations of , exactly as the distance criterion of the Key theorem demands. This is dual, in the precise sense of the MacWilliams identity, to a statement about : the weight enumerator of a code and of its dual determine each other through the Krawtchouk transform, so the geometry of decoding balls (sphere-packing) and the algebra of parity checks (Singleton) are two readings of the same generating function. The central insight is that an MDS code is the simultaneous extremal case — meeting Singleton with equality, self-dual in parameters, with a weight distribution forced by MacWilliams alone — and this is exactly where the achievable region of touches its rank boundary. The bridge from these finite identities to the asymptotic theory is the entropy expansion of the ball volume, which turns Gilbert-Varshamov into the rate and turns sphere-packing into the Hamming rate bound, framing Shannon's channel-coding theorem (stated as a pointer, its information-theoretic proof reserved) as the probabilistic completion of this combinatorial picture.
Full proof set Master
Proposition 1 (dimension of the dual; reflexivity). For an code , and .
Proof. The bilinear form on is nondegenerate. The map , , is a linear isomorphism (nondegeneracy), and where . By the rank-nullity / annihilator dimension formula of 01.01.05, , so . Applying this to gives , and holds by definition of the dual; equal finite dimensions force .
Proposition 2 (Singleton bound and the MDS dual is MDS). Every code has , and if equality holds then is , again meeting Singleton.
Proof. The Singleton inequality is the Corollary to the Key theorem: a parity-check matrix has rows, so any columns are dependent, forcing . Suppose (MDS). Project onto any coordinates: as shown for MDS codes, no nonzero codeword vanishes on coordinates (such a word would have weight ), so the projection is injective, hence bijective onto , and every -subset of coordinates is an information set. Dually, a generator matrix of has every columns of rank (independent). The parity-check matrix of is a generator matrix of ; the MDS property of states every columns of are independent, equivalently (a standard determinantal duality for complementary minors) every columns of are independent, so has minimum distance . Singleton for gives , so and is MDS.
Proposition 3 (sphere-packing bound; equality is perfection). Any code of length , size , minimum distance over satisfies with , with equality iff the radius- balls about codewords partition .
Proof. For in , the balls and are disjoint: a common point would give , contradicting the minimum distance. Each ball has exactly points — choose of the coordinates to alter and one of nonzero changes in each. The disjoint balls lie in , so . Equality holds iff the balls exhaust with no leftover points, i.e. every word is within of a unique codeword — the definition of a perfect code.
Proposition 4 (Gilbert-Varshamov existence, Varshamov form). If , then an linear code exists.
Proof. Construct a full-row-rank parity-check matrix with every columns independent, by the greedy column choice of Exercise 7: having placed columns, the vectors that are linear combinations of at most of them number at most ; the hypothesis makes this strictly less than , so a column outside every such combination exists, and adding it keeps every columns independent. After steps the code has, by the parity-check distance criterion, minimum distance and dimension .
Proposition 5 (MacWilliams identity). For a linear code, .
Proof. Fix a nonprincipal additive character of . For define and its character transform . Because factors over coordinates, does too: in each coordinate equals when and when (since for ). Hence . Now apply the Poisson summation formula for the finite group and the subgroup : $$ \sum_{u \in C^\perp} g(u) ;=; \frac{1}{|C|}\sum_{v \in C}\hat g(v). $$ The left side is . The right side is . The two are equal.
Connections Master
The parity-check distance criterion rests on the rank theory of finite-dimensional spaces in
01.01.05(rank-nullity) and01.01.04(subspace, basis, dimension): a code is a subspace, its dual is the annihilator under the standard form, and is the rank-nullity identity applied to the syndrome map . The foundational reason the bounds are linear-algebraic, not merely combinatorial, is that the whole error-geometry is encoded in the column space of .The arithmetic of the alphabet is the finite-field theory of
21.02.01: codes over inherit MDS constructions (Reed-Solomon codes evaluate polynomials at field elements, so or ) and the MacWilliams proof runs over the additive characters of , where the nondegenerate trace form supplies the Fourier duality. This is dual to the multiplicative structure used in cyclic codes, generated by factorisations of over .The co-produced unit
40.06.07on perfect and Hamming codes is the equality case of the sphere-packing bound proved here: the Hamming codes are exactly the single-error-correcting perfect codes whose parity-check columns enumerate all nonzero vectors of , and the Golay codes are the remaining nondegenerate perfect codes. This unit supplies the sphere-packing inequality;40.06.07characterises when it is met.The co-produced unit
40.06.05on Hadamard matrices and codes connects through the Plotkin bound and its equality case: a Hadamard matrix of order yields a code meeting the Plotkin bound, the high-distance regime where the bounds of this unit are sharpest, and the same matrices build the symmetric -designs of this chapter, tying codes to designs through their incidence structure.
Historical & philosophical context Master
Coding theory begins in 1948-1950 at Bell Labs. Claude Shannon's 1948 A Mathematical Theory of Communication proved, non-constructively, that reliable transmission is possible at any rate below the channel capacity, but exhibited no codes; the channel-coding theorem is recorded here only as the pointer to which the combinatorial bounds aspire. Richard Hamming, frustrated by weekend batch-computer failures, built the first explicit single-error-correcting codes and the sphere-packing bound, published in 1950 [Hamming 1950]; the Hamming code is the prototype perfect code.
The existence bound is due to Edgar Gilbert in 1952 [Gilbert 1952], who gave a sphere-covering greedy argument over arbitrary codes; Rom Varshamov in 1957 [Varshamov 1957] sharpened it to guarantee a linear code by the column-avoidance construction reproduced above, so the combined statement is the Gilbert-Varshamov bound. Richard Singleton stated the rank bound in 1964 [Singleton 1964] and named the maximum-distance-separable codes meeting it, though the bound was implicit in earlier work of Joshi and Komamiya. Jessie MacWilliams established the weight-enumerator duality in her 1962 Harvard thesis and 1963 paper, giving the identity that bears her name and that, with Neil Sloane, became the organising theorem of the 1977 treatise [MacWilliams-Sloane Ch. 5]. The asymptotic Gilbert-Varshamov bound stood as the best known lower bound on binary code rate for thirty years and was first surpassed, for , by the 1982 algebraic-geometry codes of Tsfasman, Vlăduţ, and Zink.
Bibliography Master
@article{shannon1948,
author = {Shannon, Claude E.},
title = {A Mathematical Theory of Communication},
journal = {Bell System Technical Journal},
volume = {27},
year = {1948},
pages = {379--423, 623--656}
}
@article{hamming1950,
author = {Hamming, Richard W.},
title = {Error detecting and error correcting codes},
journal = {Bell System Technical Journal},
volume = {29},
number = {2},
year = {1950},
pages = {147--160}
}
@article{gilbert1952,
author = {Gilbert, Edgar N.},
title = {A comparison of signalling alphabets},
journal = {Bell System Technical Journal},
volume = {31},
year = {1952},
pages = {504--522}
}
@article{varshamov1957,
author = {Varshamov, Rom R.},
title = {Estimate of the number of signals in error correcting codes},
journal = {Doklady Akademii Nauk SSSR},
volume = {117},
year = {1957},
pages = {739--741}
}
@article{singleton1964,
author = {Singleton, Richard C.},
title = {Maximum distance $q$-nary codes},
journal = {IEEE Transactions on Information Theory},
volume = {10},
number = {2},
year = {1964},
pages = {116--118}
}
@article{macwilliams1963,
author = {MacWilliams, F. Jessie},
title = {A theorem on the distribution of weights in a systematic code},
journal = {Bell System Technical Journal},
volume = {42},
year = {1963},
pages = {79--94}
}
@book{macwilliamssloane1977,
author = {MacWilliams, F. Jessie and Sloane, Neil J. A.},
title = {The Theory of Error-Correcting Codes},
publisher = {North-Holland},
year = {1977}
}
@book{vanlint1999,
author = {van Lint, Jacobus H.},
title = {Introduction to Coding Theory},
series = {Graduate Texts in Mathematics},
volume = {86},
edition = {3rd},
publisher = {Springer},
year = {1999}
}
@book{vanlintwilson2001,
author = {van Lint, Jacobus H. and Wilson, Richard M.},
title = {A Course in Combinatorics},
edition = {2nd},
publisher = {Cambridge University Press},
year = {2001}
}
@article{tvz1982,
author = {Tsfasman, Michael A. and Vl\u{a}du\c{t}, Serge G. and Zink, Thomas},
title = {Modular curves, Shimura curves, and Goppa codes, better than Varshamov-Gilbert bound},
journal = {Mathematische Nachrichten},
volume = {109},
year = {1982},
pages = {21--28}
}