43.06.02 · numerical-analysis / 06-eigenvalue-algorithms

Reduction to Hessenberg/tridiagonal form

shipped3 tiersLean: none

Anchor (Master): Trefethen-Bau *Numerical Linear Algebra* Lectures 26–28; Golub-Van Loan *Matrix Computations* (4th ed.) §7.4.3 (Householder reduction to Hessenberg form), §8.3 (the symmetric tridiagonal reduction); Wilkinson *The Algebraic Eigenvalue Problem* (Oxford, 1965) Ch. 6 (Householder's method); Parlett *The Symmetric Eigenvalue Problem* (SIAM Classics, 1998) §7.3 (the Householder tridiagonalisation)

Intuition Beginner

Finding the eigenvalues of a full square matrix is hard work, and the machine that does it, the QR algorithm, has to grind through many rounds. Each round costs roughly the cube of the matrix size, so for a dense matrix the whole job is expensive. There is a cheaper way to set things up first.

The idea is to comb the matrix into a tidier shape before the eigenvalue machine ever runs. We squeeze almost all of its entries down to zero, leaving only a narrow band near the diagonal. A general matrix gets combed into a shape where everything below the first slot under the diagonal is zero; this is called Hessenberg form. A symmetric matrix combs down even further, to a thin three-stripe band called tridiagonal form.

The combing has to be done carefully, because we are not allowed to change the eigenvalues. We use a special kind of rotation, a reflection, applied on both sides of the matrix at once. Applying the same reflection on the left and on the right is a change of viewpoint that leaves every eigenvalue exactly where it was. So we get a matrix with the same eigenvalues but a far simpler shape, and the eigenvalue machine then races through it.

Visual Beginner

The picture shows a five-by-five matrix before and after combing. On the left every cell can be nonzero, a full block of numbers. On the right almost the whole lower triangle has been zeroed out, leaving a staircase: the diagonal, the one stripe just below it, and everything above. That staircase is the Hessenberg shape.

   Full matrix            Upper-Hessenberg form
  x x x x x                x x x x x
  x x x x x                x x x x x
  x x x x x      ===>      0 x x x x
  x x x x x                0 0 x x x
  x x x x x                0 0 0 x x

   Symmetric matrix        Tridiagonal form
  x x x x x                x x 0 0 0
  x x x x x                x x x 0 0
  x x x x x      ===>      0 x x x 0
  x x x x x                0 0 x x x
  x x x x x                0 0 0 x x

Two things are worth seeing. First, the combed matrix keeps the same eigenvalues as the full one, because each combing step is a balanced two-sided reflection. Second, the symmetric case lands on the thinnest band of all, a single stripe on each side of the diagonal, because zeroing a column of a symmetric matrix zeroes the matching row for free.

Worked example Beginner

Take the symmetric three-by-three matrix

We want to clear the entry in the bottom-left corner, the one that sits below the first subdiagonal in column one. We will use a reflection acting on the last two coordinates to fold the vector formed by the two below-diagonal entries of column one onto a single axis.

Step 1. Look at the part of column one below the diagonal: the numbers and . Their combined length is . A reflection on these two coordinates can rotate onto , putting all the weight into the first of the two slots and clearing the second.

Step 2. Build the reflection that does this, and form the new matrix . Applying on the left clears the bottom-left entry; applying the same on the right keeps the eigenvalues unchanged and, because is symmetric, clears the top-right entry to match.

Step 3. The result is the tridiagonal matrix

where the corner entries are now zero and only the diagonal and its two neighbouring stripes survive.

What this tells us: one reflection turned a full symmetric matrix into a three-stripe band, and the eigenvalues are unchanged because we reflected on both sides. A larger matrix needs a handful more reflections, one per column, but the principle is the same.

Check your understanding Beginner

Formal definition Intermediate+

Let with . A matrix is upper-Hessenberg when for all ; that is, the only possibly-nonzero entries below the main diagonal lie on the first subdiagonal. A matrix is tridiagonal when for , so it is upper-Hessenberg and lower-Hessenberg at once. A Hessenberg matrix is unreduced when every subdiagonal entry is nonzero, for .

A Householder reflector is a unitary matrix of the form for a nonzero vector , in the sense of 01.01.09; it is Hermitian and unitary, , and it reflects across the hyperplane . Given any vector , a Householder reflector can be chosen so that with , collapsing all but the first coordinate of onto a single axis.

Two matrices and are similar when for an invertible ; they share the same characteristic polynomial and hence the same eigenvalues with the same algebraic multiplicities. When is unitary, is a unitary (orthogonal, in the real case) similarity, and it preserves not only the spectrum but the Hermitian structure: if then .

Definition (Hessenberg reduction). The Hessenberg reduction of is a unitary with upper-Hessenberg. It is built as a product of Householder reflectors: acts as the identity on the first coordinates and, applied as the two-sided similarity , zeroes the entries of column below row . When the same construction yields tridiagonal, because each reflector preserves symmetry and the cleared subcolumn forces a cleared superrow.

Notation: is the first standard basis vector; denotes the computed (floating-point) result of an operation; without subscript is the Euclidean / spectral norm; is the unit roundoff. The reflector is identified with the embedded reflection acting on coordinates .

Counterexamples to common slips

  • The reflector must act as a two-sided similarity , not a one-sided multiplication. A one-sided would zero the subcolumn but change the eigenvalues; the right multiplication by restores the spectrum and is why the cleared region is below the first subdiagonal, not below the diagonal — the right multiply refills the entries in row that a triangularising left-only sweep would have cleared.
  • One cannot zero the entry on the first subdiagonal: the reflector is built to act on coordinates through , so it can collapse the subcolumn onto its first slot but cannot remove that slot. This is the structural reason finite Householder reduction stops at Hessenberg, not at triangular form.
  • Tridiagonal form requires . For a non-symmetric , clearing the subcolumn does not clear the matching superrow, so the upper triangle stays full and the result is only Hessenberg.

Key theorem with proof Intermediate+

Theorem (orthogonal reduction to Hessenberg/tridiagonal form; Trefethen-Bau Lecture 26 [source pending]; Golub-Van Loan §7.4.3 [source pending]). Every admits a unitary , a product of at most Householder reflectors, with $Q^ A Q = HA = A^Q^ A Q = TO(n^3)\tfrac{10}{3}n^3\tfrac{4}{3}n^3AHT$ all share the same spectrum.*

Proof. Construct one column at a time. Suppose after steps the matrix is upper-Hessenberg in its first columns. Write the -th column below the diagonal as . Choose a Householder reflector on with , , and set , the identity on the first coordinates.

Left-multiplication replaces the subcolumn by , zeroing rows of column while leaving the diagonal and subdiagonal entries in rows intact. Right-multiplication by acts only on columns — because fixes the first coordinates — so it does not disturb columns and in particular preserves the zeros just created in column . Thus is upper-Hessenberg in its first columns. After steps every column below the first subdiagonal is zero, so is upper-Hessenberg, with .

When , each is again Hermitian, since the are unitary and Hermitian conjugation commutes with the similarity. A Hermitian upper-Hessenberg matrix is tridiagonal: for forces for , so both bands vanish beyond the first off-diagonals. Hence is tridiagonal.

The operation count: step applies to an trailing block from both sides, costing per side; summing in the symmetric case (only the lower band is touched) and in the general case where both the trailing rows and columns are updated. Each is unitarily similar to , so the characteristic polynomials agree and the spectra coincide.

Bridge. This construction builds toward the QR algorithm of 43.06.03, where the Hessenberg form produced here is the standing input that makes each QR sweep cost instead of ; the band structure appears again in the Arnoldi iteration of 43.07.02, whose orthonormal Krylov basis produces exactly a Hessenberg matrix , the same object reached here by direct reflectors rather than by iteration. The foundational reason the reduction stops at Hessenberg and not at triangular form is the Abel/Galois obstruction: a finite rational-plus-root algorithm cannot in general expose the eigenvalues, so the best a finite orthogonal procedure can do is the cheapest non-triangular band, and this is exactly the division of labour between a finite direct phase and an infinite iterative phase. The two-sided similarity that preserves the spectrum here generalises the change-of-basis invariance of eigenvalues, and putting these together, the reduction is the bridge from the static spectral theory of 01.01.08 to the production eigensolvers: it converts a dense problem into a banded one at one-time cost, after which the iteration is nearly free.

Exercises Intermediate+

Advanced results Master

Theorem (backward stability of the orthogonal reduction; Wilkinson Ch. 6 [source pending]). Let and let be the upper-Hessenberg matrix computed by Householder reduction in floating-point arithmetic with unit roundoff . There is a unitary , and a perturbation , such that

The computed reduction is the exact reduction of a nearby matrix. The mechanism is that each Householder reflector, applied in floating point, equals an exactly-unitary reflector applied to a slightly perturbed operand, with the perturbation bounded by a low-degree polynomial in times ; composing such steps multiplies the bound by the number of steps and keeps the accumulated exactly unitary to working precision. Because the backward error is of the order of the unavoidable rounding of the input, the reduction adds no instability of its own: the eigenvalues of are the eigenvalues of a matrix within of , and by the perturbation theory of 43.01.02 they are accurate to the conditioning of the eigenproblem. This is the precise sense in which orthogonal similarities are the right tool: a non-orthogonal reduction with a large condition number would magnify by , whereas a unitary has .

Theorem (the implicit-Q / uniqueness theorem; Golub-Van Loan §7.4.5 [source pending]). Let and let , be unitary with $Q^ A Q = HV^* A V = GQ e_1 = V e_1QVHGH - \mu I$ explicitly; it suffices to apply the correct first transformation and then restore Hessenberg form by chasing the resulting bulge down the band, because the uniqueness theorem guarantees the result equals the explicit sweep.

Theorem (band reduction and the operation-count hierarchy). For a banded Hermitian with bandwidth , the Householder tridiagonalisation can be organised to cost rather than by chasing fill-in within the band; for a full Hermitian the cost is , half the general because only one triangle is updated. The symmetric case is cheaper at every level because the two-sided similarity acts on a self-conjugate object, so the work and the storage halve; the band-chasing organisation is the lever that turns an already-sparse input into a tridiagonal output at sub-cubic cost, and it is the same fill-control idea that governs sparse-direct factorisation.

Synthesis. The reduction to Hessenberg/tridiagonal form is the hinge of the two-phase eigenvalue computation, and the foundational reason it exists is the Abel/Galois obstruction: no finite rational-and-radical procedure exposes the eigenvalues of a general matrix, so the computation must split into a finite direct phase and an infinite iterative phase. This is exactly the division this unit constructs — Householder reflectors collapse the matrix onto the cheapest band a finite orthogonal procedure can reach, and the band is preserved by every QR sweep, so the iterative phase inherits a permanent -per-step (or tridiagonal) cost. The two-sided unitary similarity is the central insight that makes the scheme stable: it preserves the spectrum exactly, preserves Hermitian structure, and has unit condition number, so the reduction's backward-error bound generalises the -free stability of the Householder QR of 01.01.09. Putting these together, the Hessenberg form is at once the output of a finite direct algorithm and the standing input of the iterative one, and the implicit-Q theorem shows the form is rigid enough — fixed by its first column — that the iteration can be driven implicitly; the bridge to the Krylov methods of chapter is that the same Hessenberg object arises there as from the Arnoldi process, so the direct reduction and the iterative projection are two routes to one structure.

Full proof set Master

Proposition (existence and finiteness of the Hessenberg reduction). For every there is a unitary , a product of at most Householder reflectors, with $Q^ A Q$ upper-Hessenberg.*

Proof. Induct on the column index , with the invariant that is upper-Hessenberg in columns . The base case is vacuous. Assume is Hessenberg in its first columns. Let be the part of column below the subdiagonal. If take ; otherwise choose a Householder reflector on with — the sign chosen to avoid cancellation — and embed .

Left-multiplication by acts only on rows , replacing by a multiple of and so zeroing rows of column ; rows and columns are fixed, preserving the earlier Hessenberg columns. Right-multiplication by acts only on columns , hence does not touch columns and preserves the new zeros in column . So is Hessenberg in columns . At the matrix is fully upper-Hessenberg, and is unitary as a product of unitaries.

Proposition (preservation of Hermitian structure; tridiagonal output). If $A = A^Q^* A Q$ tridiagonal.*

Proof. Each is Hermitian whenever is, since , and is Hermitian by hypothesis; by induction is Hermitian. By the previous proposition is upper-Hessenberg, so for . Hermitian symmetry gives , so for as well. Both bands beyond the first off-diagonals vanish, so , which is tridiagonality.

Proposition (spectral invariance and the operation count). The reduction preserves the spectrum and uses operations, with leading term in the Hermitian case.

Proof. Spectral invariance is similarity invariance: for unitary , so , and the characteristic polynomials, hence the spectra with multiplicities, coincide. For the count, step applies to a trailing block: a Householder application on an block costs operations (one matrix-vector product and one rank-one update). In the Hermitian case only the lower triangle is stored and each step updates an symmetric trailing block at cost halved by symmetry to for the two-sided update plus the symmetric rank-two correction; summing gives leading term . The general (non-Hermitian) case updates both the trailing rows and the trailing columns, doubling the symmetric work to leading term .

Proposition (QR-step invariance of the Hessenberg band). If is upper-Hessenberg with QR factorisation ( unitary, upper-triangular invertible), then is upper-Hessenberg.

Proof. Write . For an upper-Hessenberg and an upper-triangular , the product is upper-Hessenberg: , with for and for , so a surviving term needs , impossible when . Thus is upper-Hessenberg. Likewise for upper-triangular and upper-Hessenberg , the product is upper-Hessenberg: with for and for , so a surviving term needs , impossible when . Applying this with , gives upper-Hessenberg. The same argument with the conjugate-symmetric band shows a symmetric tridiagonal stays tridiagonal under a QR step.

Connections Master

The Hessenberg/tridiagonal reduction is the direct first phase of the QR algorithm of 43.06.03: the QR algorithm is run not on the dense but on its Hessenberg reduction, where the band-invariance proposition proved here guarantees that every QR sweep returns another Hessenberg matrix, dropping the per-sweep cost from to and, in the symmetric tridiagonal case, to ; without this reduction the QR algorithm would be impractical.

The Householder reflectors that perform the reduction are the same orthogonal triangularisation tool built and analysed in 01.01.09, reused here as two-sided similarities rather than one-sided factors; the backward-stability bound of the reduction is the -independent stability of Householder transformations transplanted from the QR factorisation to the eigenvalue setting.

The inverse-iteration step of 43.06.01 solves a linear system at every iteration, and reducing to Hessenberg or tridiagonal form first makes each of those solves cheap — a tridiagonal solve is — which is why the practical eigenvalue pipeline reduces once and then iterates, sharing the reduction across the iteration just as the QR algorithm does.

The Arnoldi iteration of 43.07.02 produces the Hessenberg matrix as the projection of onto a Krylov subspace, and its Hermitian specialisation, the Lanczos iteration, produces a tridiagonal ; these are the same band structures reached here by direct reflectors, so the present unit is the finite-dimensional, full-spectrum counterpart of the Krylov-subspace projection that underlies GMRES and conjugate gradients.

Historical & philosophical context Master

The reduction of a matrix to Hessenberg form by orthogonal similarities is due to Alston Householder, whose 1958 introduction of the elementary reflector gave numerical linear algebra its sharpest tool for selectively zeroing entries while preserving norms, and whose subsequent treatise The Theory of Matrices in Numerical Analysis (1964) systematised the reduction as the standard preprocessing step [Householder 1958]. Wilkinson, in The Algebraic Eigenvalue Problem (1965), supplied the rounding-error analysis showing that Householder's reduction is backward stable: the computed band matrix is the exact reduction of a matrix within of the original, which is what made the two-phase eigenvalue computation trustworthy in finite precision [Wilkinson 1965].

The strategic role of the reduction — a finite direct phase followed by an infinite iterative phase — reflects the Abel-Ruffini theorem and Galois's resolution of solvability by radicals: because the eigenvalues of a general matrix of size are the roots of an arbitrary polynomial, no finite sequence of arithmetic and root operations can produce them, so the eigenvalue problem is necessarily iterative [Abel 1826]. The reduction does the maximal finite work — banding the matrix as far as a finite orthogonal procedure allows — and hands the irreducibly iterative remainder to the QR algorithm of Francis and Kublanovskaya (1961). The tridiagonal specialisation for symmetric matrices, and its exploitation by the symmetric QR and divide-and-conquer eigensolvers, was developed extensively by Parlett, whose The Symmetric Eigenvalue Problem is the standard account [Parlett 1998].

Bibliography Master

@article{Householder1958,
  author  = {Householder, Alston S.},
  title   = {Unitary Triangularization of a Nonsymmetric Matrix},
  journal = {Journal of the ACM},
  volume  = {5},
  number  = {4},
  year    = {1958},
  pages   = {339--342}
}

@book{Householder1964,
  author    = {Householder, Alston S.},
  title     = {The Theory of Matrices in Numerical Analysis},
  publisher = {Blaisdell},
  address   = {New York},
  year      = {1964}
}

@book{Wilkinson1965,
  author    = {Wilkinson, James H.},
  title     = {The Algebraic Eigenvalue Problem},
  publisher = {Oxford University Press},
  address   = {Oxford},
  year      = {1965}
}

@article{Francis1961,
  author  = {Francis, John G. F.},
  title   = {The QR Transformation, I},
  journal = {The Computer Journal},
  volume  = {4},
  number  = {3},
  year    = {1961},
  pages   = {265--271}
}

@article{Abel1826,
  author  = {Abel, Niels Henrik},
  title   = {Beweis der Unm{\"o}glichkeit, algebraische Gleichungen von h{\"o}heren Graden als dem vierten allgemein aufzul{\"o}sen},
  journal = {Journal f{\"u}r die reine und angewandte Mathematik},
  volume  = {1},
  year    = {1826},
  pages   = {65--84}
}

@book{Parlett1998,
  author    = {Parlett, Beresford N.},
  title     = {The Symmetric Eigenvalue Problem},
  publisher = {SIAM},
  series    = {Classics in Applied Mathematics},
  year      = {1998}
}

@book{TrefethenBau1997,
  author    = {Trefethen, Lloyd N. and Bau, David},
  title     = {Numerical Linear Algebra},
  publisher = {SIAM},
  address   = {Philadelphia},
  year      = {1997}
}

@book{GolubVanLoan2013,
  author    = {Golub, Gene H. and Van Loan, Charles F.},
  title     = {Matrix Computations},
  edition   = {4th},
  publisher = {Johns Hopkins University Press},
  address   = {Baltimore},
  year      = {2013}
}