01.01.06 · foundations / linear-algebra

Systems of linear equations and the Kronecker-Capelli theorem

shipped3 tiersLean: none

Anchor (Master): Shilov — Linear Algebra Ch. 1; Hoffman-Kunze — Linear Algebra Ch. 1; Gantmacher — Theory of Matrices Ch. III

Intuition Beginner

A system of linear equations asks several demands of the same unknowns at once. Find numbers , , so that one combination of them equals 6, another equals 0, and a third equals 4. Each equation pins the unknowns to a flat sheet in space; the solution is wherever all the sheets meet. Three sheets can cross at a single point, share a whole line, or fail to share anything at all.

So a linear system has exactly one of three fates. It can have one answer, infinitely many answers, or no answer. The whole subject is about telling which fate you are in before grinding through the arithmetic, and about reading the answer's shape off the equations themselves.

The deciding quantity is how much genuinely new information each equation carries. If a fourth equation is just the first three added together, it tells you nothing fresh and cannot rule out a solution. If instead it demands something the first three already forbid, the demands clash and no answer survives. The count of independent demands is the rank, and comparing two ranks settles existence.

Visual Beginner

Picture three flat planes floating in space, each one the solution sheet of a single equation. The left panel shows the lucky case: the three planes pierce one another at a single shared point, the unique solution. The middle panel shows three planes meeting along a common line — every point on that line solves all three, so there are infinitely many answers. The right panel shows three planes arranged like the sides of a triangular prism: each pair crosses in a line, but no point lies on all three at once, so there is no solution.

The picture you land in depends only on how the equations lean against one another, not on the particular numbers on the right-hand side.

Worked example Beginner

Solve the system $$ x + y + z = 6,\qquad 2x - y + z = 3,\qquad x + 2y - z = 2. $$

Start by removing from the second and third equations. Subtract twice the first equation from the second: , which gives . Subtract the first equation from the third: , which gives .

Now work with the two smaller equations and . From the second, . Put this into the first: , so , so , giving .

Back-substitute. With , the equation gives . The first original equation gives . Check all three: , , and . Every equation holds.

What this tells us: by peeling off one unknown at a time, three tangled equations collapse into a staircase you can climb back up. The single answer is the one point where all three planes meet.

Check your understanding Beginner

Formal definition Intermediate+

Let be a field. A system of linear equations in unknowns over is a list $$ \sum_{j=1}^{n} a_{ij} x_j = b_i \qquad (i = 1, \ldots, m), $$ with coefficients and right-hand sides . Collecting the coefficients into the coefficient matrix , the unknowns into , and the right-hand sides into , the system is the single matrix equation $$ A\mathbf x = \mathbf b. $$ The augmented matrix is the matrix obtained by adjoining as a final column. The system is consistent (or solvable) if at least one satisfies it, and inconsistent otherwise. When the system is homogeneous; the zero solution is always present, and the substantive question is whether a nonzero solution exists.

Two readings of organise the theory. The column reading writes , where is the -th column of ; consistency then means is a linear combination of the columns, that is, lies in the column space [quantum-well Rank of a matrix.md]. The operator reading treats as the linear map , , whose kernel is the null space, the solution set of the homogeneous system [quantum-well Null space.md]. The rank of is , the dimension of the column space, equal also to the dimension of the row space and to the number of pivots in any row-echelon form of [quantum-well Linear map.md].

A matrix is in row-echelon form when each nonzero row begins (reading left to right) with a leading entry strictly to the right of the leading entry of the row above, and all-zero rows sit at the bottom. It is in reduced row-echelon form when, additionally, every leading entry equals and is the only nonzero entry in its column. Gauss-Jordan elimination transforms a matrix to reduced row-echelon form using the three elementary row operations: swapping two rows, scaling a row by a nonzero element of , and adding a multiple of one row to another. Each operation corresponds to left-multiplication by an invertible elementary matrix, so it preserves both the solution set of and the ranks of and . The columns of containing leading entries are the pivot columns; their count is . The remaining columns are the free columns, and the unknowns they index are the free variables.

Counterexamples to common slips

  • Consistency is a property of the pair , not of alone. The same can give a solvable system for one and an unsolvable one for another, depending on whether .
  • More equations than unknowns () does not force inconsistency, and fewer () does not force consistency. A tall system can still be solvable if the extra rows are dependent; a short system in one unknown is already unsolvable.
  • "Unique solution" requires , the number of unknowns — not . A square system is the only case where these coincide.
  • Adjoining can raise the rank by at most , never lower it, since is a submatrix of . So always.

Key theorem with proof Intermediate+

Theorem (Kronecker-Capelli). Let and . The system is consistent if and only if $$ \operatorname{rank}(A) = \operatorname{rank}([A \mid \mathbf b]). $$ When it is consistent, the solution set is the affine set , where is any particular solution; its dimension as an affine set is . In particular the solution is unique exactly when . [textbooks-extra Calculus Vol.2 - Multi-Variable Calculus and Linear Algebra with Applications (Tom Apostol).pdf]

Proof. Write for the columns of , so . The columns of are , so .

Consistency means has a solution, that is, for some scalars , which is the statement . Now holds if and only if adjoining to the spanning set does not enlarge the span, that is, , which holds if and only if these two subspaces have equal dimension, since one contains the other. Taking dimensions, with equality of dimension forcing equality of the spaces; and equality of dimension is exactly . The two ranks are therefore equal if and only if the system is consistent. (If , then is independent of the columns of , so .)

For the solution-set structure, suppose the system is consistent and let be a particular solution, . If is any solution, then , so , giving . Conversely, for any , , so is a solution. The solution set is exactly . By rank-nullity applied to from 01.01.05, , which is the dimension of the affine solution set. The solution is unique precisely when , that is, when , i.e. .

Corollary (homogeneous and square cases). The homogeneous system has a nonzero solution if and only if ; in particular this holds whenever (fewer equations than unknowns). A square system () has a unique solution for every if and only if , equivalently , equivalently is invertible.

Proof. The homogeneous system is always consistent (the zero solution), so the rank criterion is automatic; the substantive content is the size of , which is nonzero exactly when , i.e. . If , then , forcing a nonzero solution. For the square case, makes injective by the kernel computation and hence surjective by rank-nullity from 01.01.05, so has the unique solution for every ; the equivalence with is the invertibility criterion of 01.01.07.

Bridge. The rank criterion builds toward the coordinate-free first isomorphism theorem of 01.01.05: "" is the matrix face of "", and the affine solution set is a coset of the kernel, the same object that the quotient packages intrinsically. The same solvability bookkeeping appears again in the determinant unit 01.01.07 as Cramer's rule for the square nonsingular case, and resurfaces in the spectral and least-squares circle of 01.01.12, where an inconsistent is replaced by the consistent normal equations . It connects forward to the Fredholm alternative of functional analysis, where the finite rank comparison becomes an orthogonality condition in infinite dimensions.

Exercises Intermediate+

Advanced results Master

Cramer's rule. Let with and . The unique solution of has components $$ x_i = \frac{\det A_i}{\det A}, \qquad i = 1, \ldots, n, $$ where is with its -th column replaced by . The rule follows from the adjugate identity of 01.01.07: writing and expanding the -th component, is precisely the cofactor expansion of along its -th column. Cramer's rule is a closed-form solvability certificate, but as an algorithm it costs determinant evaluations and is, for numerical purposes, dominated by Gauss-Jordan elimination, which solves the same system in field operations. Its value is structural and theoretical: it exhibits the solution as a rational function of the data with denominator , which is what makes the solution depend continuously (indeed polynomially after clearing the denominator) on the entries away from the singular locus .

The Fredholm alternative. The Kronecker-Capelli dichotomy is the finite-dimensional shadow of the Fredholm alternative. For a square real system , exactly one of two statements holds: either has a solution for every (the case ), or the homogeneous system has a nonzero solution and then is solvable if and only if is orthogonal to every solution of the transposed homogeneous system . The orthogonality condition is the metric refinement of "", using the identity valid in any finite-dimensional inner-product space. In the infinite-dimensional setting of compact operators on a Hilbert space — Fredholm's 1903 theory of integral equations — the same alternative governs solvability, with and taking the place of the two null spaces, both finite-dimensional and of equal dimension.

Conditioning. Solvability is a yes/no question, but the stability of the answer under perturbation of the data is governed by the condition number of a square nonsingular . If , then the relative error obeys . A system can be exactly consistent yet practically unsolvable when is enormous: the column space is technically full but its columns are nearly dependent, so sits near the boundary of and tiny changes in swing the solution wildly. This is the quantitative residue of the qualitative rank jump: rank deficiency is the limit , and near-deficiency is large but finite .

Synthesis. Four strands meet at the rank criterion. The first is membership: solvability of is the single geometric fact , detected by the rank comparison because adjoining a vector to a spanning set raises the dimension exactly when the vector is external to the span. The second is affine structure: a consistent system's solution set is a coset of the kernel, so the homogeneous problem controls the multiplicity and the inhomogeneous data controls only the offset. The third is duality: the column-space condition has an orthogonal-complement form , the Fredholm alternative, which survives into the infinite-dimensional theory of integral equations where determinants no longer exist. The fourth is quantification: rank is a discrete invariant, but its failure is approached continuously through the condition number, which measures how close a solvable system sits to the inconsistent boundary. Together they recast the elementary question "does this system have a solution?" as a statement about one subspace containing one vector, and every later refinement — Cramer, Fredholm, least squares, conditioning — is a reading of that single containment.

Full proof set Master

Proposition (rank is well-defined: row rank equals column rank). For any , the dimension of the row space equals the dimension of the column space. Consequently the pivot count in any row-echelon form of is an invariant of , justifying the notation used throughout.

Proof. Let be the column rank, , and let be a basis of the column space. Form the matrix with these as columns. Every column of is a linear combination , and collecting the coefficients into with gives the factorisation . Reading by rows, each row of is a linear combination of the rows of , so the row space of is spanned by the rows of and has dimension at most ; thus the row rank is at most the column rank. Applying the same argument to , whose rows are the columns of and vice versa, the column rank is at most the row rank. The two are equal. Since elementary row operations preserve the row space (each new row is a combination of old rows, reversibly), they preserve its dimension, and in row-echelon form the nonzero rows are linearly independent and number exactly the pivots; so the pivot count equals the row rank, hence the common rank.

Proposition (uniqueness of reduced row-echelon form). Every matrix is row-equivalent to exactly one matrix in reduced row-echelon form. Hence the set of pivot columns, the rank, and the parametrisation of by free variables are intrinsic to , not artifacts of the elimination path.

Proof. Existence is the termination of Gauss-Jordan elimination: at each stage choose the leftmost column with a nonzero entry on or below the current pivot row, swap it up, scale it to a leading , and clear the rest of its column; the process strictly advances the pivot column and pivot row and halts. For uniqueness, suppose and are reduced row-echelon forms of , so and for products of elementary matrices; then with invertible, and have the same null space (row operations preserve solution sets), namely . Two reduced row-echelon matrices with the same null space agree: the pivot columns are determined as those columns not expressible as a combination of earlier columns — equivalently the columns where a new dependency among relations is resolved — and this characterisation depends only on . Concretely, column is a pivot column if and only if the -th standard relation is not forced by the relations among columns that hold for all vectors of ; and once the pivot columns coincide, the entries of a reduced row-echelon matrix in the free columns are exactly the coefficients expressing each free column in the basis of preceding pivot columns, again determined by alone. Therefore .

Proposition (solution-count trichotomy over a field). Over a field , a system has either no solution, exactly one solution, or, when is infinite, infinitely many; when is finite of order , a consistent system with free variables has exactly solutions.

Proof. If inconsistent, there are no solutions. If consistent, the solution set is by the Kronecker-Capelli theorem, an affine copy of the vector space , so its cardinality equals . If , then and the solution is unique. If , then is a -vector space of dimension , hence isomorphic to as a set; over an infinite field is infinite, and over a finite field of order it has elements. Translating by does not change the count, so the consistent system has solutions in the finite case and infinitely many in the infinite case.

Connections Master

  • Linear transformation: kernel, image, rank-nullity 01.01.05 — supplies the engine of this unit. The Kronecker-Capelli criterion is "" and the solution-set dimension is the nullity from rank-nullity; the unit specialises the abstract linear-map theory to the concrete matrix-and-augmented-matrix calculus, with the coefficient matrix read as the map .

  • Determinant 01.01.07 — the square nonsingular case factors through the determinant: has a unique solution for every iff , and Cramer's rule expresses each coordinate as a ratio of determinants . The determinant is the scalar gate that detects when the rank criterion is automatically satisfied for all right-hand sides at once.

  • Singular value decomposition 01.01.12 — when is inconsistent, the least-squares replacement minimises , solved by the normal equations and stably computed through the SVD and the Moore-Penrose pseudoinverse. The SVD also exposes the condition number that governs how a solvable system degrades toward the inconsistent boundary.

  • Eigenvalue and eigenvector 01.01.08 — the eigenvalue equation is a homogeneous linear system, and is an eigenvalue precisely when this system has a nonzero solution, i.e. when . The Kronecker-Capelli homogeneous criterion is the existence half of the spectral theory built in that unit.

  • Linear equations and the line 00.03.01 — the two-equation, two-unknown case is the classification of how two lines in the plane meet, with the determinant as the unique-intersection test. The present unit generalises that planar classification to equations in unknowns and replaces the single determinant by the pair of ranks.

Historical & philosophical context Master

Systematic elimination predates its abstract justification by millennia: the Chinese Jiuzhang Suanshu (Nine Chapters on the Mathematical Art, compiled by the first century CE) solves systems by an array-reduction procedure essentially identical to what is now called Gaussian elimination. Carl Friedrich Gauss formalised the elimination method in his astronomical least-squares work of the early nineteenth century, and Wilhelm Jordan's textbook treatment in the 1880s fixed the reduced form that bears their joint name. The rank-based consistency criterion is the contribution of the 1870s. Eugène Rouché stated the solvability condition in terms of the vanishing of certain determinants in Sur la discussion des équations du premier degré, Comptes Rendus 81 (1875), 1050–1052 [Rouché, E. — Sur la discussion des équations du premier degré]. Ferdinand Georg Frobenius gave the modern matrix-rank framing shortly after, which is why the result is called the Rouché-Frobenius theorem in much of the literature. The name "Kronecker-Capelli" honours Leopold Kronecker's rank formulation and Alfredo Capelli's textbook codification of the criterion in his Lezioni di algebra complementare (Naples, 1892) [Capelli, A. — Lezioni di algebra complementare]. The several names of one theorem record a half-century in which the determinant-heavy nineteenth-century viewpoint was gradually replaced by the dimension-and-rank viewpoint of twentieth-century linear algebra; Georgi Shilov's Linear Algebra (1971 English translation) presents the criterion in the mature rank-theoretic form adopted here [Shilov, G. E. — Linear Algebra (Dover, 1977 reprint of the 1971 Prentice-Hall translation)].

Bibliography Master

  • Rouché, E., "Sur la discussion des équations du premier degré", Comptes Rendus de l'Académie des Sciences 81 (1875), 1050–1052.
  • Frobenius, G., "Zur Theorie der linearen Gleichungen", Journal für die reine und angewandte Mathematik 129 (1905), 175–180.
  • Capelli, A., Lezioni di algebra complementare, B. Pellerano, Naples, 1892.
  • Kronecker, L., Vorlesungen über die Theorie der Determinanten, ed. K. Hensel, B. G. Teubner, Leipzig, 1903.
  • Fredholm, I., "Sur une classe d'équations fonctionnelles", Acta Mathematica 27 (1903), 365–390.
  • Shilov, G. E., Linear Algebra, transl. R. A. Silverman, Prentice-Hall, 1971; Dover reprint, 1977. Ch. 1.
  • Hoffman, K. & Kunze, R., Linear Algebra, 2nd ed., Prentice-Hall, 1971. Ch. 1.
  • Gantmacher, F. R., The Theory of Matrices, Vol. 1, transl. K. A. Hirsch, Chelsea, 1959. Ch. III.
  • Apostol, T. M., Calculus, Vol. 2: Multi-Variable Calculus and Linear Algebra with Applications, 2nd ed., John Wiley & Sons, 1969. Ch. 16.

Autonomous production unit. Specialises the rank-nullity machinery of 01.01.05 to the augmented-matrix calculus; load-bearing for Cramer's rule, least squares, eigenvalue existence, and the Fredholm alternative.