43.03.01 · numerical-analysis / 03-direct-linear-solvers

Gaussian elimination, LU factorization, and its stability

shipped3 tiersLean: none

Anchor (Master): Higham 2002 *Accuracy and Stability of Numerical Algorithms* 2e (SIAM) Ch. 9 (the backward-error analysis of LU and of GEPP, the growth-factor bound); Golub-Van Loan 2013 *Matrix Computations* 4e (Johns Hopkins) §3.3-3.5; Wilkinson 1965 *The Algebraic Eigenvalue Problem* (Oxford) Ch. 4 (rounding analysis of Gaussian elimination); Trefethen-Bau 1997 *Numerical Linear Algebra* (SIAM) Lectures 20-22

Intuition Beginner

When you solve a system of linear equations by hand, you eliminate one unknown at a time: subtract a multiple of the first equation from the others to wipe out the first variable, then repeat on what is left. This is Gaussian elimination, the method you already know from 01.01.06. The quiet fact that powers all of modern computing is that this bookkeeping has a name and a shape: it factors your matrix into two triangular pieces.

The two pieces are a lower-triangular matrix that records the multipliers you used, and an upper-triangular matrix that is the staircase form you ended up with. Writing the original matrix as is the whole story of elimination packaged into a single equation. Triangular systems are cheap to solve — you read the answers off one at a time — so once you have and , every right-hand side is fast.

There is a catch that only shows up on a computer. If one of the numbers you divide by is tiny, the multipliers blow up, and the rounding the computer already does gets magnified into nonsense. The fix is to reorder the rows so you always divide by the largest available number. This is called partial pivoting, and it is the difference between a method that works and one that quietly fails.

Pivoting does not make every problem solvable. Some matrices genuinely amplify error no matter how carefully you compute — that is their condition number talking, from 43.01.02. What pivoting promises is that elimination itself adds almost nothing to the trouble: it gives the exact answer to a problem a hairsbreadth away from yours.

Visual Beginner

The picture is the elimination process turning a full square of numbers into a clean triangle, while a second triangle quietly fills up with the multipliers you used along the way.

Read the table left to right. You start with a full matrix. After clearing the first column below the diagonal, the multipliers you subtracted are stored in the lower-left; the upper part starts to settle into its final triangular shape. Repeat down the diagonal and you finish with (the multipliers, with ones on the diagonal) and (the triangle).

stage what the matrix looks like what gets recorded
start full square nothing yet
after column 1 zeros below the first pivot multipliers in column 1 of
after column 2 zeros below the first two pivots multipliers in column 2 of
finish upper triangle all multipliers stored in

The takeaway: elimination is not a throwaway procedure, it is a factorization. And the row swaps of partial pivoting are not a hack — they are what keep the stored multipliers from exploding.

Worked example Beginner

Let us factor a small matrix by hand and watch and appear.

Take $$ A = \begin{pmatrix} 2 & 1 & 1 \ 4 & 3 & 3 \ 8 & 7 & 9 \end{pmatrix}. $$

Step 1. Clear the first column. The first pivot is . To kill the below it, subtract times row 1 from row 2. To kill the , subtract times row 1 from row 3. The multipliers and get stored in the first column of . The matrix becomes $$ \begin{pmatrix} 2 & 1 & 1 \ 0 & 1 & 1 \ 0 & 3 & 5 \end{pmatrix}. $$

Step 2. Clear the second column. The new pivot is . To kill the below it, subtract times the new row 2 from row 3. The multiplier is stored in the second column of . The matrix becomes $$ U = \begin{pmatrix} 2 & 1 & 1 \ 0 & 1 & 1 \ 0 & 0 & 2 \end{pmatrix}. $$

Step 3. Read off . The multipliers, with ones on the diagonal, give $$ L = \begin{pmatrix} 1 & 0 & 0 \ 2 & 1 & 0 \ 4 & 3 & 1 \end{pmatrix}. $$

Step 4. Check. Multiply and confirm you get back: row 3 of is , the original third row.

What this tells us: the messy process of elimination is exactly the clean statement . The pivots were , , — all comfortably away from zero — so no row swaps were needed and the multipliers stayed small. If a pivot had come out tiny, we would have swapped a larger row up first, which is partial pivoting.

Check your understanding Beginner

Formal definition Intermediate+

Let with or . An LU factorization of is a factorization $$ A = LU, $$ where is unit lower triangular (lower triangular with ) and is upper triangular. Gaussian elimination computes such a factorization, when it exists, by applying a sequence of elementary lower-triangular Gauss transformations. At step , with the partially reduced matrix having a nonzero pivot , the transformation subtracts the multiplier times row from each row . After steps, $$ M_{n-1} \cdots M_2 M_1 A = U, $$ and since each is unit lower triangular with inverse , the product is unit lower triangular with for [Trefethen, L. N. & Bau, D. — Numerical Linear Algebra].

Existence via leading principal minors. Write for the leading principal submatrix of (rows and columns through ). The unpivoted LU factorization exists and is unique with unit lower triangular precisely when for . The -th pivot is then (with ), the ratio of consecutive leading minors in the sense of 01.01.07.

Partial pivoting. When a leading minor vanishes — or, on a computer, when a pivot is small relative to the entries beneath it — one permutes rows. Partial pivoting selects, at step , the row maximizing and swaps it into the pivot position. Encoding all swaps in a single permutation matrix , the result is the factorization $$ PA = LU, $$ with unit lower triangular and every multiplier bounded: . This factorization exists for every nonsingular .

Operation count. Forming and by Gaussian elimination costs floating-point operations; partial pivoting adds only comparisons. Solving then proceeds in two triangular sweeps — forward substitution and back substitution — each costing operations, so additional right-hand sides are cheap once the factorization is in hand.

Triangular-solve cost. A triangular system with a nonsingular triangular matrix of order is solved by substitution in operations: row requires the already-known unknowns, one subtraction each, and one division, summing to . The cubic cost of a linear solve thus lives entirely in the factorization, not in the solves.

Counterexamples to common slips

  • Nonsingularity of does not guarantee an unpivoted LU factorization. The matrix is nonsingular but has , so no with unit lower-triangular exists; one must permute. The leading-minor condition, not invertibility, governs unpivoted existence.

  • A nonzero pivot is not the same as a safe pivot. The matrix has , so unpivoted elimination runs, but the multiplier is enormous and floating-point elimination loses all accuracy. Existence of the factorization and numerical safety are different questions; pivoting addresses the second.

  • Partial pivoting bounds the multipliers, not the entries of . The bound is automatic after pivoting, but the entries of can still grow; the magnitude of that growth is the growth factor, and it — not the multipliers — is what the stability theorem controls.

  • The permutation is part of the answer. Solving after computing requires solving , with the right-hand side permuted to match. Forgetting to permute solves a different system.

Key theorem with proof Intermediate+

The signature result certifies that Gaussian elimination with partial pivoting is a backward-stable solver, with the entire risk concentrated in a single scalar, the growth factor.

Definition (growth factor). For Gaussian elimination applied to (with partial pivoting unless stated otherwise), let denote the entries of the intermediate matrices . The growth factor is $$ \rho_n = \frac{\max_{i,j,k} |a^{(k)}{ij}|}{\max{i,j} |a_{ij}|}, $$ the largest magnitude appearing anywhere during elimination, relative to the largest entry of the original matrix.

Theorem (backward stability of GEPP). Let be nonsingular and let be the factors computed by Gaussian elimination with partial pivoting in floating-point arithmetic with unit roundoff , in the standard model of 43.01.01. Then the computed factors are the exact factors of a perturbed matrix: $$ \tilde{L}\tilde{U} = P(A + E), \qquad |E|\infty \le c_n,\rho_n,\epsilon{\mathrm{mach}},|A|\infty, $$ where is a low-degree polynomial in and is the growth factor. Consequently the computed solution of satisfies with $|\Delta A|\infty \le c'n,\rho_n,\epsilon{\mathrm{mach}},|A|_\infty\rho_n$ is of modest size [Higham, N. J. — Accuracy and Stability of Numerical Algorithms (2nd ed.)].

Proof. Track the rounding committed at each elimination step and charge it backward to the data. By the standard model, the computed multiplier is with , and partial pivoting forces . The update of each entry, , equals the exact update plus a rounding term: $$ \tilde{a}^{(k+1)}{ij} = a^{(k)}{ij} - \tilde{\ell}{ik},a^{(k)}{kj} + e^{(k)}{ij}, \qquad |e^{(k)}{ij}| \le \gamma_2\big(|a^{(k)}{ij}| + |\tilde{\ell}{ik}|,|a^{(k)}{kj}|\big), $$ where $\gamma_2 = 2\epsilon{\mathrm{mach}}/(1 - 2\epsilon_{\mathrm{mach}})(i,j)\tilde{L}\tilde{U} = P(A + E)E_{ij}e^{(k)}{ij}\gamma_2\rho_n \max{ij}|a_{ij}|n$. Hence $$ |E_{ij}| \le n,\gamma_2 \cdot 2,\rho_n \max_{ij}|a_{ij}| \le c_n,\rho_n,\epsilon_{\mathrm{mach}},\max_{ij}|a_{ij}|, $$ and converting the entrywise bound to the -norm via gives . The solve obtained from forward and back substitution inherits, by the triangular backward-error result of 43.01.03, an additional perturbation of the same order, which combines with into a single of the stated size.

Bridge. This theorem is the foundational reason a direct solver can be trusted: it shows that the backward error of elimination factors into the universal rounding unit and a single problem-and-pivoting quantity , and this is exactly the master inequality of 43.01.03 specialized to elimination, with playing the role the backward error played there. The result builds toward every later direct-solver guarantee — the unconditional stability of Cholesky, where the growth factor is bounded a priori, and the a-posteriori residual bound for — each obtained by feeding this backward error into the condition number of 43.01.02. It appears again in the perturbation theory of 43.03.03, where the achievable solution accuracy is read off by multiplying conditioning against this backward error. The growth factor generalises the per-operation rounding bound of 43.01.01 from one arithmetic step to a whole elimination, and the central insight is that pivoting cannot improve conditioning but can keep small; putting these together, the conditioning supplies the amplification a problem forces and the growth factor supplies the amplification elimination risks, and the bridge is that their product with is the accuracy GEPP delivers.

Exercises Intermediate+

Advanced results Master

Theorem 1 (existence and uniqueness of unpivoted LU). For , the factorization with unit lower triangular and upper triangular exists and is unique if and only if every leading principal submatrix , , is nonsingular. Under this condition the -th pivot equals . When some leading minor vanishes the unpivoted factorization fails, but row permutation always restores it: for every nonsingular there is a permutation with , and partial pivoting computes one such with all multipliers bounded by in magnitude [Golub, G. H. & Van Loan, C. F. — Matrix Computations (4th ed.)].

Theorem 2 (Wilkinson's growth bound for partial pivoting). Gaussian elimination with partial pivoting on produces a growth factor satisfying , and this bound is attained: the matrix with , for , , and zeros elsewhere triggers exactly, with no row swap ever taken because each candidate pivot is already maximal. Thus the worst-case backward error of GEPP, , can in principle erase all accuracy in double precision once exceeds about . The exponential worst case is the precise sense in which GEPP is only conditionally backward stable [Wilkinson, J. H. — The Algebraic Eigenvalue Problem].

Theorem 3 (the benign average case). Despite the worst case, the growth factor of partial pivoting is small for almost all matrices. For matrices with independent standard-normal entries, the expected growth factor grows slowly — empirically and by statistical analysis like — so the typical backward error is , comfortably stable. The worst-case matrices form an exponentially thin set that elimination, with its column-pivoting choices, essentially never wanders into; the gap between the proved worst case and the observed behavior is one of the sharpest in numerical analysis, and it is why GEPP, not the unconditionally stable but slower orthogonal methods, remains the default direct solver for general dense systems [Trefethen, L. N. & Schreiber, R. S. — Average-case stability of Gaussian elimination].

Theorem 4 (complete pivoting and its polynomial bound). Complete pivoting selects the pivot as the largest-magnitude entry in the entire remaining submatrix, swapping both rows and columns, giving . Its growth factor satisfies the far smaller bound , a function growing more slowly than any power of above a fixed degree, and no matrix attaining catastrophic growth under complete pivoting is known. Complete pivoting is unconditionally stable in practice but costs comparisons against the of partial pivoting, and the search disrupts the column-oriented memory access that makes partial pivoting fast; the verdict of the field is that partial pivoting's empirical stability makes the extra cost of complete pivoting rarely worth paying [Higham, N. J. — Accuracy and Stability of Numerical Algorithms (2nd ed.)].

Synthesis. The stability of Gaussian elimination is the foundational reason direct linear solvers are trusted at all, and it is governed by a single scalar that the entire backward-error analysis isolates: the growth factor . This is exactly the structure of the master inequality of 43.01.03 — the computed factors are the exact factors of with — specialized to elimination, with as the algorithm's contribution and from 43.01.02 as the problem's. The central insight is that pivoting does two separable things: the multiplier bound is free and automatic, while the growth of is what genuinely needs controlling, and partial pivoting controls it well in expectation but not in the worst case. The worst case generalises a clean combinatorial extremum, the matrix doubling the largest entry at every step to reach , which is dual to the empirical fact that random matrices keep near ; putting these together, the conditioning supplies the amplification a problem forces and the growth factor supplies the amplification elimination risks. The bridge to the rest of the chapter is that Cholesky 43.03.02 removes the gamble entirely by bounding a priori for symmetric positive-definite matrices, and the perturbation theory of 43.03.03 turns the backward error proved here into the concrete solution-accuracy guarantee , with no direct method on a general system able to do better than the conditioning allows and GEPP, in practice, achieving it.

Full proof set Master

Proposition 1 (the LU multiplier formula and the pivot-minor identity). Let have nonsingular leading principal submatrices . Then unpivoted Gaussian elimination runs to completion, the factorization is unique among unit-lower-triangular , and the pivots are with .

Proof. Proceed by induction on the step . At step the pivot is , so the multipliers are defined and has zeros below the first pivot. Assume steps have run, producing the partially reduced matrix , upper triangular in its first columns. Each is unit lower triangular, hence so is their product, and a unit-lower-triangular left factor preserves every leading principal minor: , where is the leading block of . That block is upper triangular with diagonal , so its determinant is the product . By the inductive hypothesis , so , the -th pivot is nonzero, and step runs. After steps is upper triangular and is unit lower triangular with . For uniqueness, suppose with both unit lower triangular and both upper triangular and nonsingular (the pivots are nonzero). Then ; the left side is unit lower triangular and the right side is upper triangular, so both equal a matrix that is simultaneously unit lower and upper triangular, namely the identity. Hence and .

Proposition 2 (the multiplier bound under partial pivoting). Gaussian elimination with partial pivoting produces multipliers with for all .

Proof. Fix step and let be the current matrix. Partial pivoting interchanges row with the row that maximizes over . After the swap the pivot entry satisfies for every , since is one of the maximized candidates. The multiplier is , well-defined because the pivot is nonzero whenever the column is not entirely zero (and if the column were entirely zero below the diagonal, would be singular, contrary to hypothesis). Therefore .

Proposition 3 (the growth bound and its attainment). Partial pivoting yields , and the bound is attained.

Proof. Write , so . The update with (Proposition 2) gives , so and inductively . The largest intermediate magnitude is therefore at most , so . For attainment, take with , for , for all , and for with . At each step every below-diagonal entry of the active column equals in magnitude while the pivot is , so no swap is performed and every multiplier is ; the last column doubles at each elimination step, reaching in the bottom-right entry. Hence , so the bound is sharp.

Proposition 4 (backward error of the triangular solves composes with the factorization). Let be the computed GEPP factors with , . Solving and by substitution produces exactly satisfying with .

Proof. By the componentwise backward-error result for triangular solves of 43.01.03, the computed exactly solves with , and the computed exactly solves with . Eliminating , , and expanding the product, . Substituting and moving to the front, , so with . The factorization term obeys . The triangular-solve terms are bounded using (entries by Proposition 2) and controlled by , and ; dropping the second-order as , their -norm is . Summing, for a polynomial .

Connections Master

  • Systems of linear equations and the Kronecker-Capelli theorem 01.01.06 supplies the elementary-operation view of Gaussian elimination that this unit promotes to a factorization: the row operations that unit defines as the engine of solvability are exactly the Gauss transforms , and the pivot structure that decides solvability there becomes here the leading-minor existence condition for . This unit reads that algorithm as the matrix identity , turning a procedure into an object that can be reused across right-hand sides and analyzed for stability.

  • The determinant 01.01.07 furnishes the leading principal minors whose nonvanishing governs unpivoted existence; the pivot identity proved here is the bridge between the multiplicative determinant theory of that unit and the additive elimination process, and it shows — the standard route by which determinants are actually computed, in rather than the catastrophic of cofactor expansion.

  • Backward stability and backward-error analysis 43.01.03 is the framework this unit instantiates: the GEPP backward-error theorem with is the master inequality of that unit specialized to elimination, with the growth factor as the algorithm's backward error and the triangular-solve analysis there reused verbatim in the composition proof here. That unit owns the definition of backward stability; this unit owns the first major algorithm proved to have it.

  • Conditioning and condition numbers 43.01.02 supplies the second factor in the forward-error estimate derived in the exercises: the condition number is the amplification the problem forces, multiplying the backward error this unit certifies. A perfectly stable GEPP solve of an ill-conditioned system is still inaccurate, and that inaccuracy is charged to conditioning, not to elimination.

  • *The Cholesky factorization 43.03.02* is the symmetric-positive-definite companion that removes the gamble of this unit: where partial pivoting only bounds in expectation, Cholesky bounds it a priori, needs no pivoting, and is unconditionally backward stable. This unit's growth-factor analysis is precisely the obstacle that the positive-definite structure of 43.03.02 dissolves, which is why Cholesky is preferred whenever the symmetric positive-definite hypothesis holds.

Historical & philosophical context Master

Elimination as a systematic procedure long predates its matrix formulation; the row-reduction algorithm appears in the Chinese Nine Chapters on the Mathematical Art (c. 100 CE) and was studied by Carl Friedrich Gauss in the early nineteenth century in connection with least-squares orbit determination, from which it takes its name. Its recasting as the matrix factorization is twentieth-century, crystallized by Paul Dwyer and others in the 1940s and made standard by the matrix-computation literature thereafter.

The stability question — whether the rounding errors of elimination on a large system accumulate fatally — was the central worry of early numerical analysis. John von Neumann and Herman Goldstine (1947) gave a pessimistic forward-error analysis that left the practicality of solving large systems in doubt. The resolution came from James H. Wilkinson, whose backward-error analysis of Gaussian elimination in Error analysis of direct methods of matrix inversion (Journal of the ACM 8, 1961, 281–330) [Wilkinson, J. H. — Error analysis of direct methods of matrix inversion] and in The Algebraic Eigenvalue Problem (Oxford, 1965) [Wilkinson, J. H. — The Algebraic Eigenvalue Problem] showed that the computed factors are the exact factors of a nearby matrix, with the error controlled entirely by the growth factor, and proved the worst-case bound for partial pivoting. The lasting puzzle Wilkinson identified — that this exponential worst case is essentially never observed — was given a statistical explanation by Lloyd N. Trefethen and Robert S. Schreiber in Average-case stability of Gaussian elimination (SIAM Journal on Matrix Analysis and Applications 11, 1990, 335–360) [Trefethen, L. N. & Schreiber, R. S. — Average-case stability of Gaussian elimination], which showed the growth factor of random matrices stays near , and the modern componentwise treatment is Nicholas Higham's Accuracy and Stability of Numerical Algorithms (SIAM, 1996; 2nd ed. 2002).

Bibliography Master

@book{trefethenbau1997,
  author    = {Trefethen, Lloyd N. and Bau, David},
  title     = {Numerical Linear Algebra},
  publisher = {Society for Industrial and Applied Mathematics},
  year      = {1997}
}

@book{golubvanloan2013,
  author    = {Golub, Gene H. and Van Loan, Charles F.},
  title     = {Matrix Computations},
  edition   = {4},
  publisher = {Johns Hopkins University Press},
  year      = {2013}
}

@book{higham2002accuracy,
  author    = {Higham, Nicholas J.},
  title     = {Accuracy and Stability of Numerical Algorithms},
  edition   = {2},
  publisher = {Society for Industrial and Applied Mathematics},
  year      = {2002}
}

@book{wilkinson1965eigenvalue,
  author    = {Wilkinson, James H.},
  title     = {The Algebraic Eigenvalue Problem},
  publisher = {Oxford University Press},
  year      = {1965}
}

@article{wilkinson1961error,
  author  = {Wilkinson, James H.},
  title   = {Error Analysis of Direct Methods of Matrix Inversion},
  journal = {Journal of the ACM},
  volume  = {8},
  number  = {3},
  year    = {1961},
  pages   = {281--330}
}

@article{trefethenschreiber1990,
  author  = {Trefethen, Lloyd N. and Schreiber, Robert S.},
  title   = {Average-Case Stability of Gaussian Elimination},
  journal = {SIAM Journal on Matrix Analysis and Applications},
  volume  = {11},
  number  = {3},
  year    = {1990},
  pages   = {335--360}
}

@article{vonneumann1947numerical,
  author  = {von Neumann, John and Goldstine, Herman H.},
  title   = {Numerical Inverting of Matrices of High Order},
  journal = {Bulletin of the American Mathematical Society},
  volume  = {53},
  number  = {11},
  year    = {1947},
  pages   = {1021--1099}
}