43.03.01 · numerical-analysis / 03-direct-linear-solvers

Gaussian elimination, LU factorization, and its stability

shipped · 3 tiers · Lean: none

Anchor (Master): Higham 2002 *Accuracy and Stability of Numerical Algorithms* 2e (SIAM) Ch. 9 (the backward-error analysis of LU and of GEPP, the growth-factor bound); Golub-Van Loan 2013 *Matrix Computations* 4e (Johns Hopkins) §3.3-3.5; Wilkinson 1965 *The Algebraic Eigenvalue Problem* (Oxford) Ch. 4 (rounding analysis of Gaussian elimination); Trefethen-Bau 1997 *Numerical Linear Algebra* (SIAM) Lectures 20-22

Intuition Beginner

When you solve a system of linear equations by hand, you eliminate one unknown at a time: subtract a multiple of the first equation from the others to wipe out the first variable, then repeat on what is left. This is Gaussian elimination, the method you already know from 01.01.06. The quiet fact behind much of scientific computing is that this bookkeeping has a name and a shape: it factors your matrix into two triangular pieces.

The two pieces are a lower-triangular matrix $L$ that records the multipliers you used, and an upper-triangular matrix $U$ that is the staircase form you ended up with. Writing the original matrix as $A = LU$ is the whole story of elimination packaged into a single equation. Triangular systems are cheap to solve — you read the answers off one at a time — so once you have $L$ and $U$ , every right-hand side is fast.

There is a catch that only shows up on a computer. If one of the numbers you divide by is tiny, the multipliers blow up, and the rounding the computer already does gets magnified into nonsense. The fix is to reorder the rows so you always divide by the largest available number. This is called partial pivoting, and it is the difference between a method that works and one that quietly fails.

Pivoting does not make every problem solvable. Some matrices genuinely amplify error no matter how carefully you compute — that is their condition number talking, from 43.01.02. What pivoting promises is that elimination itself adds almost nothing to the trouble: it gives the exact answer to a problem a hairsbreadth away from yours.

Visual Beginner

The picture is the elimination process turning a full square of numbers into a clean triangle, while a second triangle quietly fills up with the multipliers you used along the way.

Read the table top to bottom. You start with a full matrix. After clearing the first column below the diagonal, the multipliers you subtracted are stored in the lower-left; the upper part starts to settle into its final triangular shape. Repeat down the diagonal and you finish with $L$ (the multipliers, with ones on the diagonal) and $U$ (the triangle).

| stage | what the matrix looks like | what gets recorded |
|---|---|---|
| start | full square $A$ | nothing yet |
| after column 1 | zeros below the first pivot | multipliers in column 1 of $L$ |
| after column 2 | zeros below the first two pivots | multipliers in column 2 of $L$ |
| finish | upper triangle $U$ | all multipliers stored in $L$ |

The takeaway: elimination is not a throwaway procedure, it is a factorization. And the row swaps of partial pivoting are not a hack — they are what keep the stored multipliers from exploding.

Worked example Beginner

Let us factor a small matrix by hand and watch $L$ and $U$ appear.

Take $$ A = \begin{pmatrix} 2 & 1 & 1 \\ 4 & 3 & 3 \\ 8 & 7 & 9 \end{pmatrix}. $$

Step 1. Clear the first column. The first pivot is $2$. To kill the $4$ below it, subtract $\frac{4}{2} = 2$ times row 1 from row 2. To kill the $8$, subtract $\frac{8}{2} = 4$ times row 1 from row 3. The multipliers $2$ and $4$ get stored in the first column of $L$. The matrix becomes $$ \begin{pmatrix} 2 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 3 & 5 \end{pmatrix}. $$

Step 2. Clear the second column. The new pivot is $1$. To kill the $3$ below it, subtract $\frac{3}{1} = 3$ times the new row 2 from row 3. The multiplier $3$ is stored in the second column of $L$. The matrix becomes $$ U = \begin{pmatrix} 2 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix}. $$

Step 3. Read off $L$. The multipliers, with ones on the diagonal, give $$ L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 3 & 1 \end{pmatrix}. $$

Step 4. Check. Multiply $LU$ and confirm you get $A$ back: row 3 of $LU$ is $4 \times (2, 1, 1) + 3 \times (0, 1, 1) + 1 \times (0, 0, 2) = (8, 7, 9)$ , the original third row.

What this tells us: the messy process of elimination is exactly the clean statement $A = LU$ . The pivots were $2$ , $1$ , $2$ — all comfortably away from zero — so no row swaps were needed and the multipliers stayed small. If a pivot had come out tiny, we would have swapped a larger row up first, which is partial pivoting.
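The hand computation above can be replayed mechanically. The following is a minimal sketch, not a production routine: the function names `lu_no_pivot` and `matmul` are illustrative choices, and exact rational arithmetic (`fractions.Fraction`) is used so that the check $LU = A$ is free of rounding.

```python
from fractions import Fraction

def lu_no_pivot(A):
    """Unpivoted LU: returns (L, U) with L unit lower triangular and U upper triangular."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]          # working copy, becomes U
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        if U[k][k] == 0:
            raise ZeroDivisionError("zero pivot: a row swap would be needed")
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]                          # multiplier for row i
            L[i][k] = m                                    # recorded in column k of L
            for j in range(k, n):
                U[i][j] -= m * U[k][j]                     # subtract m times row k
    return L, U

def matmul(X, Y):
    """Plain triple-loop matrix product, used only to check L U = A."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
L, U = lu_no_pivot(A)
print(L)   # rows (1,0,0), (2,1,0), (4,3,1)
print(U)   # rows (2,1,1), (0,1,1), (0,0,2)
```

The multipliers $2$, $4$, $3$ land in $L$ exactly where the worked example placed them, and `matmul(L, U)` reproduces $A$.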

Check your understanding Beginner

Formal definition Intermediate+

Let $A \in F^{n \times n}$ with $F = R$ or $C$ . An LU factorization of $A$ is a factorization $$ A = LU, $$ where $L \in F^{n \times n}$ is unit lower triangular (lower triangular with $L_{ii} = 1$ ) and $U \in F^{n \times n}$ is upper triangular. Gaussian elimination computes such a factorization, when it exists, by applying a sequence of elementary lower-triangular Gauss transformations. At step $k$ , with the partially reduced matrix $A^{(k)}$ having a nonzero pivot $a_{k k}^{(k)}$ , the transformation $M_{k} = I - ℓ_{k} e_{k}^{⊤}$ subtracts the multiplier $ℓ_{ik} = a_{ik}^{(k)} / a_{k k}^{(k)}$ times row $k$ from each row $i > k$ . After $n - 1$ steps, $$ M_{n-1} \cdots M_2 M_1 A = U, $$ and since each $M_{k}$ is unit lower triangular with inverse $M_{k}^{- 1} = I + ℓ_{k} e_{k}^{⊤}$ , the product $L = M_{1}^{- 1} M_{2}^{- 1} \dots M_{n - 1}^{- 1}$ is unit lower triangular with $L_{ik} = ℓ_{ik}$ for $i > k$ ^{[Trefethen, L. N. & Bau, D. — Numerical Linear Algebra]}.

Existence via leading principal minors. Write $A_{k}$ for the $k \times k$ leading principal submatrix of $A$ (rows and columns $1$ through $k$). The unpivoted LU factorization exists and is unique with $L$ unit lower triangular precisely when $\det A_{k} \neq 0$ for $k = 1, \dots, n - 1$. The $k$-th pivot is then $a_{k k}^{(k)} = \det A_{k} / \det A_{k - 1}$ (with $\det A_{0} = 1$), the ratio of consecutive leading minors in the sense of 01.01.07.

Partial pivoting. When a leading minor vanishes — or, on a computer, when a pivot is small relative to the entries beneath it — one permutes rows. Partial pivoting selects, at step $k$ , the row $p \geq k$ maximizing $∣ a_{p k}^{(k)} ∣$ and swaps it into the pivot position. Encoding all swaps in a single permutation matrix $P$ , the result is the factorization $$ PA = LU, $$ with $L$ unit lower triangular and every multiplier bounded: $∣ L_{ik} ∣ \leq 1$ . This factorization exists for every nonsingular $A$ .
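A minimal sketch of the pivoted factorization just described, in plain Python floats. The name `lu_partial_pivot` and the choice to return the permutation as an index list `perm` (row $i$ of $PA$ is row `perm[i]` of $A$) are assumptions made for illustration; library routines typically store the swaps differently.

```python
def lu_partial_pivot(A):
    """Return (perm, L, U) with L @ U equal to the rows of A taken in the order perm."""
    n = len(A)
    U = [[float(x) for x in row] for row in A]    # working copy, becomes U
    L = [[0.0] * n for _ in range(n)]             # multipliers; diagonal set at the end
    perm = list(range(n))
    for k in range(n - 1):
        # Row p >= k with the largest |entry| in column k (first one on ties).
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        if U[p][k] == 0.0:
            raise ValueError("column is zero on and below the diagonal: A is singular")
        if p != k:
            U[k], U[p] = U[p], U[k]
            L[k], L[p] = L[p], L[k]               # carry the stored multipliers along
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]                 # |m| <= 1 by the choice of pivot
            L[i][k] = m
            U[i][k] = 0.0
            for j in range(k + 1, n):
                U[i][j] -= m * U[k][j]
    for i in range(n):
        L[i][i] = 1.0
    return perm, L, U

A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
perm, L, U = lu_partial_pivot(A)
# Largest entry of P A - L U; it should be at rounding level.
residual = max(
    abs(A[perm[i]][j] - sum(L[i][k] * U[k][j] for k in range(3)))
    for i in range(3) for j in range(3)
)
```

On this matrix the sketch swaps the row starting with $8$ to the top first, and every stored multiplier has magnitude at most $1$, as the bound $∣ L_{ik} ∣ \leq 1$ requires.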

Operation count. Forming $L$ and $U$ by Gaussian elimination costs $\frac{2}{3} n^{3} + O (n^{2})$ floating-point operations; partial pivoting adds only $O (n^{2})$ comparisons. Solving $A x = b$ then proceeds in two triangular sweeps — forward substitution $L y = P b$ and back substitution $U x = y$ — each costing $n^{2} + O (n)$ operations, so additional right-hand sides are cheap once the factorization is in hand.

Triangular-solve cost. A triangular system with a nonsingular triangular matrix of order $n$ is solved by substitution in $n^{2} + O (n)$ operations: row $i$ of back substitution uses the $n - i$ already-computed unknowns, at one multiplication and one subtraction each, followed by one division, for a total of $\sum_{i = 1}^{n} (2 (n - i) + 1) = n^{2}$. The cubic cost of a linear solve thus lives entirely in the factorization, not in the solves.
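The two triangular sweeps and the operation count can be sketched as follows. The function names and the `flops` counter are illustrative additions; the factors are the $L$ and $U$ of the worked example, and the right-hand side $b = (4, 10, 24)$ is chosen here so that the solution is $(1, 1, 1)$.

```python
def forward_substitution(L, b):
    """Solve L y = b for unit lower-triangular L (the unit diagonal means no divisions)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_substitution(U, y):
    """Solve U x = y for upper-triangular U; also return the operation count."""
    n = len(y)
    x = [0.0] * n
    flops = 0
    for i in reversed(range(n)):
        s = y[i]
        for j in range(i + 1, n):
            s -= U[i][j] * x[j]      # one multiplication and one subtraction
            flops += 2
        x[i] = s / U[i][i]           # one division per row
        flops += 1
    return x, flops

L = [[1, 0, 0], [2, 1, 0], [4, 3, 1]]
U = [[2, 1, 1], [0, 1, 1], [0, 0, 2]]
b = [4, 10, 24]
y = forward_substitution(L, b)
x, flops = back_substitution(U, y)
print(x, flops)   # [1.0, 1.0, 1.0] 9
```

The count of $9 = 3^{2}$ operations matches the sum $\sum_{i} (2 (n - i) + 1) = n^{2}$ for $n = 3$.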

Counterexamples to common slips

Nonsingularity of $A$ does not guarantee an unpivoted LU factorization. The matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is nonsingular but has $\det A_{1} = 0$, so no $A = LU$ with unit lower-triangular $L$ exists; one must permute. The leading-minor condition, not invertibility, governs unpivoted existence.
A nonzero pivot is not the same as a safe pivot. The matrix $\begin{pmatrix} ε & 1 \\ 1 & 1 \end{pmatrix}$ has $\det A_{1} = ε \neq 0$, so unpivoted elimination runs, but the multiplier $1/ε$ is enormous and floating-point elimination loses all accuracy. Existence of the factorization and numerical safety are different questions; pivoting addresses the second.
Partial pivoting bounds the multipliers, not the entries of $U$ . The bound $∣ L_{ik} ∣ \leq 1$ is automatic after pivoting, but the entries of $U$ can still grow; the magnitude of that growth is the growth factor, and it — not the multipliers — is what the stability theorem controls.
The permutation $P$ is part of the answer. Solving $A x = b$ after computing $P A = LU$ requires solving $L y = P b$ , with the right-hand side permuted to match. Forgetting to permute $b$ solves a different system.

Key theorem with proof Intermediate+

The signature result certifies that Gaussian elimination with partial pivoting is a backward-stable solver, with the entire risk concentrated in a single scalar, the growth factor.

Definition (growth factor). For Gaussian elimination applied to $A$ (with partial pivoting unless stated otherwise), let $a_{ij}^{(k)}$ denote the entries of the intermediate matrices $A^{(1)} = A, A^{(2)}, \dots, A^{(n)} = U$. The growth factor is $$ \rho_n = \frac{\max_{i,j,k} |a^{(k)}_{ij}|}{\max_{i,j} |a_{ij}|}, $$ the largest magnitude appearing anywhere during elimination, relative to the largest entry of the original matrix.

Theorem (backward stability of GEPP). Let $A \in F^{n \times n}$ be nonsingular and let $\tilde{L}, \tilde{U}, P$ be the factors computed by Gaussian elimination with partial pivoting in floating-point arithmetic with unit roundoff $ϵ_{mach}$, in the standard model of 43.01.01. Then the computed factors are the exact factors of a perturbed matrix: $$ \tilde{L}\tilde{U} = P(A + E), \qquad \|E\|_\infty \le c_n\,\rho_n\,\epsilon_{\mathrm{mach}}\,\|A\|_\infty, $$ where $c_{n}$ is a low-degree polynomial in $n$ and $ρ_{n}$ is the growth factor. Consequently the computed solution $\tilde{x}$ of $A x = b$ satisfies $(A + Δ A) \tilde{x} = b$ with $\|\Delta A\|_\infty \le c'_n\,\rho_n\,\epsilon_{\mathrm{mach}}\,\|A\|_\infty$, so GEPP is backward stable in the sense of 43.01.03 whenever $\rho_n$ is of modest size ^{[Higham, N. J. — Accuracy and Stability of Numerical Algorithms (2nd ed.)]}.

Proof. Track the rounding committed at each elimination step and charge it backward to the data. By the standard model, the computed multiplier is $\tilde{ℓ}_{ik} = fl (a_{ik}^{(k)} / a_{k k}^{(k)}) = (a_{ik}^{(k)} / a_{k k}^{(k)}) (1 + δ_{ik})$ with $∣ δ_{ik} ∣ \leq ϵ_{mach}$, and partial pivoting forces $∣ \tilde{ℓ}_{ik} ∣ \leq 1 + ϵ_{mach}$. The update of each entry, $\tilde{a}_{ij}^{(k + 1)} = fl (a_{ij}^{(k)} - \tilde{ℓ}_{ik} a_{k j}^{(k)})$, equals the exact update plus a rounding term: $$ \tilde{a}^{(k+1)}_{ij} = a^{(k)}_{ij} - \tilde{\ell}_{ik}\,a^{(k)}_{kj} + e^{(k)}_{ij}, \qquad |e^{(k)}_{ij}| \le \gamma_2\big(|a^{(k)}_{ij}| + |\tilde{\ell}_{ik}|\,|a^{(k)}_{kj}|\big), $$ where $\gamma_2 = 2\epsilon_{\mathrm{mach}}/(1 - 2\epsilon_{\mathrm{mach}})$ from the product-of-factors lemma of 43.01.01 applied to the multiply-subtract pair. Summing the defining relations of the computed factorization over the steps that touch entry $(i,j)$ telescopes to the identity $\tilde{L}\tilde{U} = P(A + E)$, where the accumulated perturbation entry $E_{ij}$ collects the local rounding terms $e^{(k)}_{ij}$. Each such term is bounded by $\gamma_2$ times intermediate magnitudes, and every intermediate magnitude is at most $\rho_n \max_{ij}|a_{ij}|$ by the definition of the growth factor; the number of contributing steps for any entry is at most $n$. Hence, absorbing the factor $1 + ϵ_{mach}$ on the multiplier into the constant, $$ |E_{ij}| \le n\,\gamma_2 \cdot 2\,\rho_n \max_{ij}|a_{ij}| \le c_n\,\rho_n\,\epsilon_{\mathrm{mach}}\,\max_{ij}|a_{ij}|, $$ and converting the entrywise bound to the $\infty$-norm via $∥ E ∥_{\infty} \leq n \max_{ij} ∣ E_{ij} ∣$ and $\max_{ij} ∣ a_{ij} ∣ \leq ∥ A ∥_{\infty}$, with the extra factor $n$ absorbed into $c_{n}$, gives $∥ E ∥_{\infty} \leq c_{n} ρ_{n} ϵ_{mach} ∥ A ∥_{\infty}$.
The solve $\tilde{x}$ obtained from forward and back substitution inherits, by the triangular backward-error result of 43.01.03, an additional perturbation of the same order, which combines with $E$ into a single $Δ A$ of the stated size. $□$
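The theorem can be probed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it solves one random $30 \times 30$ system (entries uniform in $[-1, 1]$, seed fixed arbitrarily) by elimination with partial pivoting applied to the augmented matrix, then evaluates the normwise backward error $∥ b - A \tilde{x} ∥_{\infty} / (∥ A ∥_{\infty} ∥ \tilde{x} ∥_{\infty})$, which is the size of the smallest relative perturbation $Δ A$ with $(A + Δ A) \tilde{x} = b$. The names `gepp_solve` and `backward_error` are hypothetical.

```python
import random

def gepp_solve(A, b):
    """Solve A x = b by elimination with partial pivoting on the augmented matrix [A | b]."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for k in range(n - 1):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))   # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):                      # last column carries b
                M[i][j] -= m * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):                           # back substitution
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def backward_error(A, x, b):
    """Normwise relative backward error in the infinity norm."""
    n = len(A)
    r = max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))
    norm_A = max(sum(abs(v) for v in row) for row in A)
    norm_x = max(abs(v) for v in x)
    return r / (norm_A * norm_x)

random.seed(0)
n = 30
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
b = [random.uniform(-1, 1) for _ in range(n)]
x = gepp_solve(A, b)
eta = backward_error(A, x, b)
print(eta)   # expected to be a small multiple of the unit roundoff
```

A value of `eta` near $10^{-16}$ says the computed $\tilde{x}$ exactly solves a system whose matrix differs from $A$ only at rounding level, which is what backward stability with modest $ρ_{n}$ predicts.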

Bridge. This theorem is the foundational reason a direct solver can be trusted: it shows that the backward error of elimination factors into the universal rounding unit $ϵ_{mach}$ and a single problem-and-pivoting quantity $ρ_{n}$ , and this is exactly the master inequality of 43.01.03 specialized to elimination, with $ρ_{n}$ playing the role the backward error played there. The result builds toward every later direct-solver guarantee — the unconditional stability of Cholesky, where the growth factor is bounded a priori, and the a-posteriori residual bound for $A x = b$ — each obtained by feeding this backward error into the condition number $κ (A)$ of 43.01.02. It appears again in the perturbation theory of 43.03.03, where the achievable solution accuracy $κ (A) ρ_{n} ϵ_{mach}$ is read off by multiplying conditioning against this backward error. The growth factor generalises the per-operation rounding bound of 43.01.01 from one arithmetic step to a whole elimination, and the central insight is that pivoting cannot improve conditioning but can keep $ρ_{n}$ small; putting these together, the conditioning supplies the amplification a problem forces and the growth factor supplies the amplification elimination risks, and the bridge is that their product with $ϵ_{mach}$ is the accuracy GEPP delivers.

Exercises Intermediate+

Exercise 3 (medium, symbolic).

Show that the $k$ -th pivot of unpivoted Gaussian elimination equals $det A_{k} / det A_{k - 1}$ , where $A_{k}$ is the $k \times k$ leading principal submatrix and $det A_{0} = 1$ . Deduce the leading-minor existence condition.

Hint

Elimination is left-multiplication by unit-lower-triangular matrices, which have determinant $1$ and preserve leading principal minors. The leading $k \times k$ block of $U$ has the first $k$ pivots on its diagonal.

Answer

The elementary transforms $M_{1}, \dots, M_{k - 1}$ are unit lower triangular, so they leave every leading principal minor unchanged: $\det A_{k} = \det (M_{k - 1} \dots M_{1} A)_{k} = \det U_{k}$, where $U_{k}$ is the leading $k \times k$ block of the partially reduced matrix, which is upper triangular with the pivots $u_{11}, \dots, u_{k k}$ on its diagonal. Hence $\det A_{k} = u_{11} u_{22} \dots u_{k k}$, and dividing consecutive minors gives $u_{k k} = \det A_{k} / \det A_{k - 1}$. Given nonzero earlier pivots, the $k$-th pivot is nonzero iff $\det A_{k} \neq 0$, so unpivoted elimination runs to completion (with all pivots nonzero) precisely when every leading principal minor through order $n - 1$ is nonzero.
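The pivot-minor identity is easy to check on a small matrix. This sketch reuses the $3 \times 3$ matrix of the worked example and exact rationals; `det` (cofactor expansion, acceptable only for tiny matrices) and `pivots` are illustrative helpers, not names from the text.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row; the empty matrix has det 1."""
    if len(M) == 0:
        return Fraction(1)
    total = Fraction(0)
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def pivots(A):
    """Pivots u_11, ..., u_nn of unpivoted Gaussian elimination."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    out = []
    for k in range(n):
        out.append(U[k][k])
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return out

A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
# Leading principal minors det A_0, ..., det A_3, with det A_0 = 1 by convention.
minors = [det([row[:k] for row in A[:k]]) for k in range(4)]
ratios = [minors[k] / minors[k - 1] for k in range(1, 4)]
print(minors, ratios, pivots(A))
```

Here the minors are $1, 2, 2, 4$, so the ratios $2, 1, 2$ agree with the pivots found by elimination.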

Exercise 4 (medium, numeric).

Consider $A = \begin{pmatrix} ε & 1 \\ 1 & 1 \end{pmatrix}$ with $ε = 10^{-20}$, in arithmetic with $ϵ_{mach} \approx 10^{-16}$. Perform unpivoted elimination symbolically, then explain in one sentence why the computed answer to $A x = b$ is wrong, and what partial pivoting does instead.

Hint

The multiplier is $1/ε = 10^{20}$. Form the $(2, 2)$ entry of $U$, then round it in the given arithmetic.

Answer

Unpivoted, the multiplier is $1/ε = 10^{20}$, and the updated $(2, 2)$ entry is $1 - 10^{20} \cdot 1 = - 10^{20} + 1$. In floating point with $ϵ_{mach} \approx 10^{-16}$ this rounds to exactly $- 10^{20}$: the original $1$ has been swamped, so the entry $a_{22} = 1$ is lost entirely and the computed $U$ no longer carries the true matrix's information, producing a large backward error. Partial pivoting swaps the rows first (since $∣1∣ > ∣ ε ∣$), making the multiplier $ε = 10^{-20}$, so the update $1 - ε \cdot 1 \approx 1$ loses nothing and the factorization is backward stable.
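The swamping can be reproduced in IEEE double precision. The sketch below hard-codes $2 \times 2$ elimination without any row exchange; the right-hand side $b = (1, 2)$ is an assumption added here so that the exact solution is within about $10^{-20}$ of $(1, 1)$. Passing the rows in swapped order plays the role of partial pivoting.

```python
def solve_2x2_no_pivot(a11, a12, a21, a22, b1, b2):
    """Eliminate with a11 as pivot (no row exchange), then back-substitute."""
    m = a21 / a11            # multiplier
    u22 = a22 - m * a12      # updated (2,2) entry
    y2 = b2 - m * b1         # updated right-hand side
    x2 = y2 / u22
    x1 = (b1 - a12 * x2) / a11
    return x1, x2

eps = 1e-20

# Rows in the given order: the pivot is eps and the multiplier is about 1e20.
x_unpivoted = solve_2x2_no_pivot(eps, 1.0, 1.0, 1.0, 1.0, 2.0)

# Rows swapped first, as partial pivoting would do: the multiplier is eps.
x_pivoted = solve_2x2_no_pivot(1.0, 1.0, eps, 1.0, 2.0, 1.0)

print(x_unpivoted)   # (0.0, 1.0): the first component is completely wrong
print(x_pivoted)     # (1.0, 1.0): correct to working precision
```

In the unpivoted run both $1 - 10^{20}$ and $2 - 10^{20}$ round to the same number, so $x_{2}$ comes out as exactly $1$ and $x_{1} = (1 - 1)/ε = 0$.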

Exercise 5 (medium, symbolic).

Prove that partial pivoting forces every multiplier to satisfy $∣ ℓ_{ik} ∣ \leq 1$ , and conclude that the entries of $L$ are bounded by $1$ in magnitude.

Hint

At step $k$ , the pivot is chosen as the largest-magnitude entry in column $k$ on or below the diagonal.

Answer

At step $k$ , partial pivoting swaps into the pivot position the row $p \geq k$ maximizing $∣ a_{p k}^{(k)} ∣$ , so after the swap the pivot satisfies $∣ a_{k k}^{(k)} ∣ = max_{i \geq k} ∣ a_{ik}^{(k)} ∣ \geq ∣ a_{ik}^{(k)} ∣$ for every $i > k$ . The multiplier is $ℓ_{ik} = a_{ik}^{(k)} / a_{k k}^{(k)}$ , so $∣ ℓ_{ik} ∣ = ∣ a_{ik}^{(k)} ∣/∣ a_{k k}^{(k)} ∣ \leq 1$ . Since the strictly-lower-triangular entries of $L$ are exactly these multipliers and the diagonal entries are $1$ , every entry of $L$ has magnitude at most $1$ . This bound is automatic and free; it is the entries of $U$ , measured by the growth factor, that pivoting does not directly control.

Exercise 7 (hard, symbolic).

Prove that for Gaussian elimination with partial pivoting on an $n \times n$ matrix the growth factor satisfies $ρ_{n} \leq 2^{n - 1}$ .

Hint

Each update is $a_{ij}^{(k + 1)} = a_{ij}^{(k)} - ℓ_{ik} a_{k j}^{(k)}$ with $∣ ℓ_{ik} ∣ \leq 1$ . Bound the magnitude of the new entry by the largest magnitude at the previous step.

Answer

Let $g_{k} = \max_{ij} ∣ a_{ij}^{(k)} ∣$ be the largest entry magnitude after $k - 1$ elimination steps, so $g_{1} = \max_{ij} ∣ a_{ij} ∣$. The update at step $k$ is $a_{ij}^{(k + 1)} = a_{ij}^{(k)} - ℓ_{ik} a_{k j}^{(k)}$ with $∣ ℓ_{ik} ∣ \leq 1$ by partial pivoting (Exercise 5). The triangle inequality gives $∣ a_{ij}^{(k + 1)} ∣ \leq ∣ a_{ij}^{(k)} ∣ + ∣ ℓ_{ik} ∣ ∣ a_{k j}^{(k)} ∣ \leq g_{k} + 1 \cdot g_{k} = 2 g_{k}$, so $g_{k + 1} \leq 2 g_{k}$. Iterating from $g_{1}$, $g_{k} \leq 2^{k - 1} g_{1}$ for every $k \leq n$, and since the growth factor is $ρ_{n} = \max_{k} g_{k} / g_{1}$, it follows that $ρ_{n} \leq 2^{n - 1}$. The bound is attained by the matrix of Exercise 6, so it cannot be improved in general.

Exercise 8 (hard, symbolic).

Using the GEPP backward-error theorem, derive the forward-error estimate for the computed solution: $∥ \tilde{x} - x ∥/∥ x ∥ \leq c_{n}^{'} κ_{\infty} (A) ρ_{n} ϵ_{mach} / (1 - c_{n}^{'} κ_{\infty} (A) ρ_{n} ϵ_{mach})$ , and identify each factor as conditioning versus stability.

Hint

Start from $(A + Δ A) \tilde{x} = b = A x$ , write $\tilde{x} - x = - A^{- 1} Δ A \tilde{x}$ , and use the standard perturbation bound with $κ (A) = ∥ A ∥∥ A^{- 1} ∥$ .

Answer

From $(A + Δ A) \tilde{x} = b$ and $A x = b$, subtract to get $A (\tilde{x} - x) = - Δ A \tilde{x}$, so $\tilde{x} - x = - A^{- 1} Δ A \tilde{x}$ and $∥ \tilde{x} - x ∥ \leq ∥ A^{- 1} ∥ ∥Δ A ∥ ∥ \tilde{x} ∥$. Dividing by $∥ x ∥$, inserting $∥ A ∥/∥ A ∥$, and using $∥ \tilde{x} ∥ \leq ∥ x ∥ + ∥ \tilde{x} - x ∥$ to move the $∥ \tilde{x} ∥$ to $∥ x ∥$ gives, provided the denominator is positive, $$ \frac{\|\tilde{x} - x\|}{\|x\|} \le \frac{\kappa_\infty(A)\,\|\Delta A\|_\infty/\|A\|_\infty}{1 - \kappa_\infty(A)\,\|\Delta A\|_\infty/\|A\|_\infty}. $$ Substituting the theorem's bound $∥Δ A ∥_{\infty} /∥ A ∥_{\infty} \leq c_{n}^{'} ρ_{n} ϵ_{mach}$ yields the stated estimate. The factor $κ_{\infty} (A)$ is the conditioning of the problem, from 43.01.02 (fixed by $A$ alone), while $c_{n}^{'} ρ_{n} ϵ_{mach}$ is the backward error of the algorithm, from 43.01.03 (controlled by elimination and pivoting); their product is the forward error, exactly the master inequality.
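As an illustration with made-up magnitudes (assumptions chosen for round numbers, not values from the text), suppose $κ_{\infty} (A) = 10^{6}$ and the backward-error factor is $c_{n}^{'} ρ_{n} ϵ_{mach} = 10^{-14}$. The estimate then reads $$ \frac{\|\tilde{x} - x\|}{\|x\|} \le \frac{10^{6} \cdot 10^{-14}}{1 - 10^{6} \cdot 10^{-14}} = \frac{10^{-8}}{1 - 10^{-8}} \approx 10^{-8}, $$ so about eight correct digits are guaranteed: the backward error alone would permit roughly fourteen, and conditioning costs the other six.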

Advanced results Master

Theorem 1 (existence and uniqueness of unpivoted LU). For $A \in F^{n \times n}$ , the factorization $A = LU$ with $L$ unit lower triangular and $U$ upper triangular exists and is unique if and only if every leading principal submatrix $A_{k}$ , $k = 1, \dots, n - 1$ , is nonsingular. Under this condition the $k$ -th pivot equals $u_{k k} = det A_{k} / det A_{k - 1}$ . When some leading minor vanishes the unpivoted factorization fails, but row permutation always restores it: for every nonsingular $A$ there is a permutation $P$ with $P A = LU$ , and partial pivoting computes one such $P$ with all multipliers bounded by $1$ in magnitude ^{[Golub, G. H. & Van Loan, C. F. — Matrix Computations (4th ed.)]}.

Theorem 2 (Wilkinson's growth bound for partial pivoting). Gaussian elimination with partial pivoting on $A \in F^{n \times n}$ produces a growth factor satisfying $ρ_{n} \leq 2^{n - 1}$ , and this bound is attained: the matrix $W_{n}$ with $W_{ii} = 1$ , $W_{ij} = - 1$ for $i > j$ , $W_{in} = 1$ , and zeros elsewhere triggers $ρ_{n} = 2^{n - 1}$ exactly, with no row swap ever taken because each candidate pivot is already maximal. Thus the worst-case backward error of GEPP, $∥ E ∥ = O (2^{n - 1} ϵ_{mach} ∥ A ∥)$ , can in principle erase all accuracy in double precision once $n$ exceeds about $50$ . The exponential worst case is the precise sense in which GEPP is only conditionally backward stable ^{[Wilkinson, J. H. — The Algebraic Eigenvalue Problem]}.

Theorem 3 (the benign average case). Despite the $2^{n - 1}$ worst case, the growth factor of partial pivoting is small for almost all matrices. For matrices with independent standard-normal entries, the expected growth factor grows slowly — empirically and by statistical analysis like $ρ_{n} \sim n^{2/3}$ — so the typical backward error is $O (n^{2/3} ϵ_{mach} ∥ A ∥)$ , comfortably stable. The worst-case matrices form an exponentially thin set that elimination, with its column-pivoting choices, essentially never wanders into; the gap between the proved worst case and the observed behavior is one of the sharpest in numerical analysis, and it is why GEPP, not the unconditionally stable but slower orthogonal methods, remains the default direct solver for general dense systems ^{[Trefethen, L. N. & Schreiber, R. S. — Average-case stability of Gaussian elimination]}.

Theorem 4 (complete pivoting and its subexponential bound). Complete pivoting selects the pivot as the largest-magnitude entry in the entire remaining submatrix, swapping both rows and columns, giving $P A Q = LU$. Its growth factor satisfies Wilkinson's far smaller bound $ρ_{n} \leq n^{1/2} (2 \cdot 3^{1/2} \cdots n^{1/ (n - 1)})^{1/2}$, a function that grows faster than any fixed power of $n$ (roughly like $n^{1/2 + (\ln n)/4}$) but far more slowly than $2^{n - 1}$, and no matrix attaining catastrophic growth under complete pivoting is known. Complete pivoting is stable in practice but costs $O (n^{3})$ comparisons against the $O (n^{2})$ of partial pivoting, and the search disrupts the column-oriented memory access that makes partial pivoting fast; the verdict of the field is that partial pivoting's empirical stability makes the extra cost of complete pivoting rarely worth paying ^{[Higham, N. J. — Accuracy and Stability of Numerical Algorithms (2nd ed.)]}.
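A small experiment makes the contrast with partial pivoting concrete. The sketch below is illustrative only: it runs complete pivoting in exact rational arithmetic on the matrix $W_{n}$ of Theorem 2 and compares the observed growth with Wilkinson's bound $n^{1/2} (2 \cdot 3^{1/2} \cdots n^{1/ (n - 1)})^{1/2}$. The helper names are hypothetical.

```python
from fractions import Fraction
import math

def wilkinson_matrix(n):
    """W_n: 1 on the diagonal and in the last column, -1 below the diagonal, 0 elsewhere."""
    return [[1 if (i == j or j == n - 1) else (-1 if i > j else 0)
             for j in range(n)] for i in range(n)]

def growth_complete_pivoting(A):
    """Growth factor of elimination with complete pivoting, in exact arithmetic."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    g0 = max(abs(x) for row in M for x in row)
    gmax = g0
    for k in range(n - 1):
        # Position of the largest-magnitude entry in the trailing submatrix.
        p, q = max(((i, j) for i in range(k, n) for j in range(k, n)),
                   key=lambda ij: abs(M[ij[0]][ij[1]]))
        M[k], M[p] = M[p], M[k]                       # row swap
        for row in M:
            row[k], row[q] = row[q], row[k]           # column swap
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            M[i][k] = Fraction(0)
            for j in range(k + 1, n):
                M[i][j] -= m * M[k][j]
        gmax = max(gmax, max(abs(x) for row in M for x in row))
    return gmax / g0

def wilkinson_bound(n):
    """Wilkinson's bound sqrt(n * 2 * 3^(1/2) * ... * n^(1/(n-1)))."""
    prod = 1.0
    for k in range(2, n + 1):
        prod *= k ** (1.0 / (k - 1))
    return math.sqrt(n * prod)

n = 8
rho_complete = growth_complete_pivoting(wilkinson_matrix(n))
print(float(rho_complete), wilkinson_bound(n), 2 ** (n - 1))
```

For $n = 8$ the bound evaluates to roughly $13$, against the $2^{7} = 128$ that partial pivoting reaches on the same matrix.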

Synthesis. The stability of Gaussian elimination is the foundational reason direct linear solvers are trusted at all, and it is governed by a single scalar that the entire backward-error analysis isolates: the growth factor $ρ_{n}$ . This is exactly the structure of the master inequality of 43.01.03 — the computed factors are the exact factors of $A + E$ with $∥ E ∥ = O (ρ_{n} ϵ_{mach} ∥ A ∥)$ — specialized to elimination, with $ρ_{n}$ as the algorithm's contribution and $κ (A)$ from 43.01.02 as the problem's. The central insight is that pivoting does two separable things: the multiplier bound $∣ ℓ_{ik} ∣ \leq 1$ is free and automatic, while the growth of $U$ is what genuinely needs controlling, and partial pivoting controls it well in expectation but not in the worst case. The worst case generalises a clean combinatorial extremum, the matrix $W_{n}$ doubling the largest entry at every step to reach $2^{n - 1}$ , which is dual to the empirical fact that random matrices keep $ρ_{n}$ near $n^{2/3}$ ; putting these together, the conditioning supplies the amplification a problem forces and the growth factor supplies the amplification elimination risks. The bridge to the rest of the chapter is that Cholesky 43.03.02 removes the gamble entirely by bounding $ρ_{n}$ a priori for symmetric positive-definite matrices, and the perturbation theory of 43.03.03 turns the backward error proved here into the concrete solution-accuracy guarantee $κ (A) ρ_{n} ϵ_{mach}$ , with no direct method on a general system able to do better than the conditioning allows and GEPP, in practice, achieving it.

Full proof set Master

Proposition 1 (the LU multiplier formula and the pivot-minor identity). Let $A \in F^{n \times n}$ have nonsingular leading principal submatrices $A_{1}, \dots, A_{n - 1}$ . Then unpivoted Gaussian elimination runs to completion, the factorization $A = LU$ is unique among unit-lower-triangular $L$ , and the pivots are $u_{k k} = det A_{k} / det A_{k - 1}$ with $det A_{0} = 1$ .

Proof. Proceed by induction on the step $k$. At step $1$ the pivot is $a_{11} = \det A_{1} \neq 0$, so the multipliers $ℓ_{i 1} = a_{i 1} / a_{11}$ are defined and $M_{1} A$ has zeros below the first pivot. Assume steps $1, \dots, k - 1$ have run, producing the partially reduced matrix $A^{(k)} = M_{k - 1} \dots M_{1} A$, upper triangular in its first $k - 1$ columns. Each $M_{j}$ is unit lower triangular, hence so is their product, and a unit-lower-triangular left factor preserves every leading principal minor: $\det A_{k}^{(k)} = \det A_{k}$, where $A_{k}^{(k)}$ is the leading $k \times k$ block of $A^{(k)}$. That block is upper triangular with diagonal $u_{11}, \dots, u_{k - 1, k - 1}, a_{k k}^{(k)}$, so its determinant is the product $u_{11} \dots u_{k - 1, k - 1} a_{k k}^{(k)}$. By the inductive hypothesis $u_{11} \dots u_{k - 1, k - 1} = \det A_{k - 1}$, so $a_{k k}^{(k)} = \det A_{k} / \det A_{k - 1} \neq 0$, the $k$-th pivot is nonzero, and step $k$ runs. After $n - 1$ steps $U = M_{n - 1} \dots M_{1} A$ is upper triangular and $L = M_{1}^{- 1} \dots M_{n - 1}^{- 1}$ is unit lower triangular with $L_{ik} = ℓ_{ik}$. For uniqueness, suppose $A = L_{1} U_{1} = L_{2} U_{2}$ with both $L_{j}$ unit lower triangular and both $U_{j}$ upper triangular and nonsingular (the pivots are nonzero). Then $L_{2}^{- 1} L_{1} = U_{2} U_{1}^{- 1}$; the left side is unit lower triangular and the right side is upper triangular, so both equal a matrix that is simultaneously unit lower and upper triangular, namely the identity. Hence $L_{1} = L_{2}$ and $U_{1} = U_{2}$. $□$

Proposition 2 (the multiplier bound under partial pivoting). Gaussian elimination with partial pivoting produces multipliers with $∣ ℓ_{ik} ∣ \leq 1$ for all $i > k$ .

Proof. Fix step $k$ and let $A^{(k)}$ be the current matrix. Partial pivoting interchanges row $k$ with the row $p \geq k$ that maximizes $∣ a_{p k}^{(k)} ∣$ over $p \geq k$. After the swap the pivot entry satisfies $∣ a_{k k}^{(k)} ∣ = \max_{p \geq k} ∣ a_{p k}^{(k)} ∣ \geq ∣ a_{ik}^{(k)} ∣$ for every $i > k$, since $a_{ik}^{(k)}$ is one of the maximized candidates. The multiplier is $ℓ_{ik} = a_{ik}^{(k)} / a_{k k}^{(k)}$, well-defined because the pivot is nonzero: if column $k$ of $A^{(k)}$ were entirely zero on and below the diagonal, $A$ would be singular, contrary to hypothesis. Therefore $∣ ℓ_{ik} ∣ = ∣ a_{ik}^{(k)} ∣/∣ a_{k k}^{(k)} ∣ \leq 1$. $□$

Proposition 3 (the $2^{n - 1}$ growth bound and its attainment). Partial pivoting yields $ρ_{n} \leq 2^{n - 1}$ , and the bound is attained.

Proof. Write $g_{k} = \max_{ij} ∣ a_{ij}^{(k)} ∣$, so $g_{1} = \max_{ij} ∣ a_{ij} ∣$. The update $a_{ij}^{(k + 1)} = a_{ij}^{(k)} - ℓ_{ik} a_{k j}^{(k)}$ with $∣ ℓ_{ik} ∣ \leq 1$ (Proposition 2) gives $∣ a_{ij}^{(k + 1)} ∣ \leq ∣ a_{ij}^{(k)} ∣ + ∣ a_{k j}^{(k)} ∣ \leq 2 g_{k}$, so $g_{k + 1} \leq 2 g_{k}$ and inductively $g_{k} \leq 2^{k - 1} g_{1}$. The largest intermediate magnitude is therefore at most $2^{n - 1} g_{1}$, so $ρ_{n} = \max_{k} g_{k} / g_{1} \leq 2^{n - 1}$. For attainment, take $W_{n} \in R^{n \times n}$ with $W_{ii} = 1$, $W_{ij} = - 1$ for $i > j$, $W_{in} = 1$ for all $i$, and $W_{ij} = 0$ for $j > i$ with $j < n$. At each step every below-diagonal entry of the active column equals $- 1$ while the pivot is $1$, so all candidates tie in magnitude, no swap is performed, and every multiplier is $- 1$; the last column doubles at each elimination step, reaching $2^{n - 1}$ in the bottom-right entry. Hence $ρ_{n} (W_{n}) = 2^{n - 1}$, so the bound is sharp. $□$
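The attainment argument can be checked by running elimination on $W_{n}$ directly. This is a minimal sketch with illustrative names; it assumes the tie-breaking rule that keeps the current row when several candidates share the maximal magnitude, which is what lets $W_{n}$ proceed with no swaps. All quantities involved are small integers or powers of two, so floating-point arithmetic is exact here.

```python
def wilkinson_matrix(n):
    """W_n: 1 on the diagonal and in the last column, -1 below the diagonal, 0 elsewhere."""
    return [[1.0 if (i == j or j == n - 1) else (-1.0 if i > j else 0.0)
             for j in range(n)] for i in range(n)]

def growth_partial_pivoting(A):
    """Growth factor: largest |entry| over all intermediate matrices / largest |entry| of A."""
    n = len(A)
    M = [row[:] for row in A]
    g0 = max(abs(x) for row in M for x in row)
    gmax = g0
    for k in range(n - 1):
        # max() returns the first maximal candidate, so ties keep row k in place.
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            M[i][k] = 0.0
            for j in range(k + 1, n):
                M[i][j] -= m * M[k][j]
        gmax = max(gmax, max(abs(x) for row in M for x in row))
    return gmax / g0

for n in (2, 4, 8, 10):
    print(n, growth_partial_pivoting(wilkinson_matrix(n)))   # 2.0, 8.0, 128.0, 512.0
```

Each printed value equals $2^{n - 1}$, the doubling of the last column described in the proof.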

Proposition 4 (backward error of the triangular solves composes with the factorization). Let $\tilde{L}, \tilde{U}, P$ be the computed GEPP factors with $\tilde{L} \tilde{U} = P (A + E)$ , $∥ E ∥_{\infty} \leq c_{n} ρ_{n} ϵ_{mach} ∥ A ∥_{\infty}$ . Solving $\tilde{L} y = P b$ and $\tilde{U} x = y$ by substitution produces $\tilde{x}$ exactly satisfying $(A + Δ A) \tilde{x} = b$ with $∥Δ A ∥_{\infty} \leq c_{n}^{'} ρ_{n} ϵ_{mach} ∥ A ∥_{\infty}$ .

Proof. By the componentwise backward-error result for triangular solves of 43.01.03, the computed $\tilde{y}$ exactly solves $(\tilde{L} + Δ L) \tilde{y} = P b$ with $∣Δ L ∣ \leq γ_{n} ∣ \tilde{L} ∣$, and the computed $\tilde{x}$ exactly solves $(\tilde{U} + Δ U) \tilde{x} = \tilde{y}$ with $∣Δ U ∣ \leq γ_{n} ∣ \tilde{U} ∣$. Eliminating $\tilde{y}$, $(\tilde{L} + Δ L) (\tilde{U} + Δ U) \tilde{x} = P b$, and expanding the product, $(\tilde{L} \tilde{U} + \tilde{L} Δ U + Δ L \tilde{U} + Δ L Δ U) \tilde{x} = P b$. Substituting $\tilde{L} \tilde{U} = P (A + E)$ and moving $P$ to the front, $P (A + E + P^{⊤} (\tilde{L} Δ U + Δ L \tilde{U} + Δ L Δ U)) \tilde{x} = P b$, so $(A + Δ A) \tilde{x} = b$ with $Δ A = E + P^{⊤} (\tilde{L} Δ U + Δ L \tilde{U} + Δ L Δ U)$. The factorization term obeys $∥ E ∥_{\infty} \leq c_{n} ρ_{n} ϵ_{mach} ∥ A ∥_{\infty}$. The triangular-solve terms are bounded using $∣ \tilde{L} ∣$ (entries at most $1$ in magnitude, up to rounding, by Proposition 2), $∣ \tilde{U} ∣$ (entries at most $ρ_{n} \max_{ij} ∣ a_{ij} ∣ \leq ρ_{n} ∥ A ∥_{\infty}$), and $γ_{n} = O (n ϵ_{mach})$; dropping the second-order $Δ L Δ U$ as $O (ϵ_{mach}^{2})$, their $\infty$-norm is at most a low-degree polynomial in $n$ times $ρ_{n} ϵ_{mach} ∥ A ∥_{\infty}$. Summing, $∥Δ A ∥_{\infty} \leq c_{n}^{'} ρ_{n} ϵ_{mach} ∥ A ∥_{\infty}$ for a polynomial $c_{n}^{'}$. $□$

Connections Master

Systems of linear equations and the Kronecker-Capelli theorem 01.01.06 supplies the elementary-operation view of Gaussian elimination that this unit promotes to a factorization: the row operations that unit defines as the engine of solvability are exactly the Gauss transforms $M_{k}$ , and the pivot structure that decides solvability there becomes here the leading-minor existence condition for $A = LU$ . This unit reads that algorithm as the matrix identity $A = LU$ , turning a procedure into an object that can be reused across right-hand sides and analyzed for stability.
The determinant 01.01.07 furnishes the leading principal minors $det A_{k}$ whose nonvanishing governs unpivoted existence; the pivot identity $u_{k k} = det A_{k} / det A_{k - 1}$ proved here is the bridge between the multiplicative determinant theory of that unit and the additive elimination process, and it shows $det A = \prod_{k} u_{k k}$ — the standard route by which determinants are actually computed, in $O (n^{3})$ rather than the catastrophic $O (n!)$ of cofactor expansion.
Backward stability and backward-error analysis 43.01.03 is the framework this unit instantiates: the GEPP backward-error theorem $\tilde{L} \tilde{U} = P (A + E)$ with $∥ E ∥ = O (ρ_{n} ϵ_{mach} ∥ A ∥)$ is the master inequality of that unit specialized to elimination, with the growth factor $ρ_{n}$ as the algorithm's backward error and the triangular-solve analysis there reused verbatim in the composition proof here. That unit owns the definition of backward stability; this unit owns the first major algorithm proved to have it.
Conditioning and condition numbers 43.01.02 supplies the second factor in the forward-error estimate $∥ \tilde{x} - x ∥/∥ x ∥ = O (κ (A) ρ_{n} ϵ_{mach})$ derived in the exercises: the condition number $κ (A) = ∥ A ∥∥ A^{- 1} ∥$ is the amplification the problem forces, multiplying the backward error this unit certifies. A perfectly stable GEPP solve of an ill-conditioned system is still inaccurate, and that inaccuracy is charged to conditioning, not to elimination.
The Cholesky factorization $A = R^{*} R$ 43.03.02 is the symmetric-positive-definite companion that removes the gamble of this unit: where partial pivoting guarantees only the worst-case bound $ρ_{n} ≤ 2^{n - 1}$ and keeps $ρ_{n}$ small merely in practice, Cholesky bounds the growth a priori, needs no pivoting, and is unconditionally backward stable. This unit's growth-factor analysis is precisely the obstacle that the positive-definite structure of 43.03.02 dissolves, which is why Cholesky is preferred whenever the symmetric positive-definite hypothesis holds.
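The a priori bound can be illustrated on a small case. Below is a minimal sketch, assuming a real symmetric positive-definite matrix (so $R^{*} = R^{T}$); the function name `cholesky_upper` and the test matrix are illustrative, and the algorithm is the textbook row-by-row variant rather than anything specific to 43.03.02.

```python
import math

def cholesky_upper(A):
    """Upper-triangular R with R^T R = A, for real SPD A (hypothetical helper)."""
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Diagonal entry: what is left of a_kk after removing earlier rows.
        s = A[k][k] - sum(R[i][k] ** 2 for i in range(k))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        R[k][k] = math.sqrt(s)
        for j in range(k + 1, n):
            t = A[k][j] - sum(R[i][k] * R[i][j] for i in range(k))
            R[k][j] = t / R[k][k]
    return R

A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]   # illustrative SPD matrix
n = len(A)
R = cholesky_upper(A)

# Reconstruct R^T R to confirm the factorization.
RtR = [[sum(R[k][i] * R[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# Column j of R has squared length a_jj, so no entry can exceed sqrt(a_jj).
entries_bounded = all(
    abs(R[i][j]) <= math.sqrt(A[j][j]) for i in range(n) for j in range(n)
)

print("R =", R)
print("entries bounded by sqrt(a_jj):", entries_bounded)
```

No row exchange appears anywhere in the loop. Because the diagonal of $R^{T} R$ equals the diagonal of $A$, each $|r_{ij}|$ is at most $\sqrt{a_{jj}}$, which is the sense in which the factor cannot grow.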

Historical & philosophical context Master

Elimination as a systematic procedure long predates its matrix formulation; the row-reduction algorithm appears in the Chinese *Nine Chapters on the Mathematical Art* (c. 100 CE) and was used by Carl Friedrich Gauss in the early nineteenth century in connection with least-squares orbit determination; the method takes its name from him. Its recasting as the matrix factorization $A = LU$ is twentieth-century, crystallized by Paul Dwyer and others in the 1940s and made standard by the matrix-computation literature thereafter.

The stability question, whether the rounding errors of elimination on a large system accumulate fatally, was the central worry of early numerical analysis. John von Neumann and Herman Goldstine (1947) gave a pessimistic forward-error analysis that left the practicality of solving large systems in doubt. The resolution came from James H. Wilkinson. His backward-error analysis of Gaussian elimination, in *Error analysis of direct methods of matrix inversion* (Journal of the ACM 8, 1961, 281–330) ^{[Wilkinson, J. H. — Error analysis of direct methods of matrix inversion]} and in *The Algebraic Eigenvalue Problem* (Oxford, 1965) ^{[Wilkinson, J. H. — The Algebraic Eigenvalue Problem]}, showed that the computed factors are the exact factors of a nearby matrix, with the error controlled entirely by the growth factor, and proved the $2^{n - 1}$ worst-case bound for partial pivoting. The lasting puzzle Wilkinson identified is that this exponential worst case is essentially never observed. Lloyd N. Trefethen and Robert S. Schreiber gave a statistical explanation in *Average-case stability of Gaussian elimination* (SIAM Journal on Matrix Analysis and Applications 11, 1990, 335–360) ^{[Trefethen, L. N. & Schreiber, R. S. — Average-case stability of Gaussian elimination]}, which showed that the growth factor of random matrices grows on average only like $n^{2/3}$ . The modern componentwise treatment is Nicholas Higham's *Accuracy and Stability of Numerical Algorithms* (SIAM, 1996; 2nd ed. 2002).

Bibliography Master

@book{trefethenbau1997,
  author    = {Trefethen, Lloyd N. and Bau, David},
  title     = {Numerical Linear Algebra},
  publisher = {Society for Industrial and Applied Mathematics},
  year      = {1997}
}

@book{golubvanloan2013,
  author    = {Golub, Gene H. and Van Loan, Charles F.},
  title     = {Matrix Computations},
  edition   = {4},
  publisher = {Johns Hopkins University Press},
  year      = {2013}
}

@book{higham2002accuracy,
  author    = {Higham, Nicholas J.},
  title     = {Accuracy and Stability of Numerical Algorithms},
  edition   = {2},
  publisher = {Society for Industrial and Applied Mathematics},
  year      = {2002}
}

@book{wilkinson1965eigenvalue,
  author    = {Wilkinson, James H.},
  title     = {The Algebraic Eigenvalue Problem},
  publisher = {Oxford University Press},
  year      = {1965}
}

@article{wilkinson1961error,
  author  = {Wilkinson, James H.},
  title   = {Error Analysis of Direct Methods of Matrix Inversion},
  journal = {Journal of the ACM},
  volume  = {8},
  number  = {3},
  year    = {1961},
  pages   = {281--330}
}

@article{trefethenschreiber1990,
  author  = {Trefethen, Lloyd N. and Schreiber, Robert S.},
  title   = {Average-Case Stability of Gaussian Elimination},
  journal = {SIAM Journal on Matrix Analysis and Applications},
  volume  = {11},
  number  = {3},
  year    = {1990},
  pages   = {335--360}
}

@article{vonneumann1947numerical,
  author  = {von Neumann, John and Goldstine, Herman H.},
  title   = {Numerical Inverting of Matrices of High Order},
  journal = {Bulletin of the American Mathematical Society},
  volume  = {53},
  number  = {11},
  year    = {1947},
  pages   = {1021--1099}
}

Prerequisites

01.01.06
01.01.07
43.01.02
43.01.03

Tier anchors

beginner: Trefethen-Bau 1997 *Numerical Linear Algebra* (SIAM) Lectures 20-22 (Gaussian elimination as LU, pivoting, the growth factor: opening discussion); Strang 2016 *Introduction to Linear Algebra* 5e (Wellesley-Cambridge) §2.6 (elimination as A = LU)
intermediate: Trefethen-Bau 1997 *Numerical Linear Algebra* (SIAM) Lectures 20-22 (the LU factorization, partial pivoting PA = LU, the growth factor and the stability theorem); Golub-Van Loan 2013 *Matrix Computations* 4e (Johns Hopkins) §3.2, §3.4 (LU existence via leading minors, pivoting strategies)
master: Higham 2002 *Accuracy and Stability of Numerical Algorithms* 2e (SIAM) Ch. 9 (the backward-error analysis of LU and of GEPP, the growth-factor bound); Golub-Van Loan 2013 *Matrix Computations* 4e (Johns Hopkins) §3.3-3.5; Wilkinson 1965 *The Algebraic Eigenvalue Problem* (Oxford) Ch. 4 (rounding analysis of Gaussian elimination); Trefethen-Bau 1997 *Numerical Linear Algebra* (SIAM) Lectures 20-22

References

Trefethen, L. N. & Bau, D. — Numerical Linear Algebra · SIAM, 1997. Lectures 20-22: Gaussian elimination as the LU factorization, the operation count 2n^3/3, the instability of unpivoted elimination, partial pivoting PA = LU, the growth factor rho, the worst-case 2^{n-1} bound, and the backward-stability statement of GEPP in terms of the growth factor.
Higham, N. J. — Accuracy and Stability of Numerical Algorithms (2nd ed.) · SIAM, 2002. Ch. 9: the componentwise backward-error analysis of LU factorization and of Gaussian elimination with partial pivoting, the bound ||E|| <= c_n rho_n eps_mach ||A||, the growth factor definition and its worst-case 2^{n-1} attainment, and the empirical observation that rho_n grows like n^{2/3} on average.
Wilkinson, J. H. — The Algebraic Eigenvalue Problem · Oxford University Press, 1965. Ch. 4: the rounding-error analysis of Gaussian elimination, the per-step perturbation bound, and the first proof that the growth factor of partial pivoting cannot exceed 2^{n-1}.
Wilkinson, J. H. — Error analysis of direct methods of matrix inversion · Journal of the ACM 8 (1961), 281-330. The founding backward-error analysis of Gaussian elimination: the computed L and U are the exact factors of a matrix near A, with the error controlled by the growth factor.
Trefethen, L. N. & Schreiber, R. S. — Average-case stability of Gaussian elimination · SIAM Journal on Matrix Analysis and Applications 11 (1990), 335-360. The statistical analysis explaining why the growth factor of partial pivoting is small (order n^{2/3}) for random matrices despite the 2^{n-1} worst case.

Estimated time

beginner: 20m
intermediate: 45m
master: 90m