43.06.10 · numerical-analysis / 06-eigenvalue-algorithms

The generalised eigenvalue problem Ax=λBx and the QZ algorithm

shipped3 tiersLean: none

Anchor (Master): Golub-Van Loan *Matrix Computations* (4th ed.) §7.7 (the QZ step, deflation, infinite eigenvalues); Moler-Stewart *An Algorithm for Generalized Matrix Eigenvalue Problems* (SIAM J. Numer. Anal. 10, 1973); Stewart-Sun *Matrix Perturbation Theory* (Academic Press, 1990) Ch. VI (perturbation of the generalised eigenproblem, chordal metric); Watkins *The Matrix Eigenvalue Problem: GR and Krylov Subspace Methods* (SIAM, 2007) Ch. 6 (the generalised problem and the GZ family)

Intuition Beginner

A standard eigenvalue question asks: in which directions does a matrix act by pure stretching, ? The generalised question brings in a second matrix and asks instead: in which directions do and stretch in lockstep, so that and point the same way? The number that makes is then a generalised eigenvalue. Set to the identity and you are back to the ordinary problem.

This shows up the moment a physical system has mass as well as stiffness. A vibrating structure obeys an equation where a stiffness matrix and a mass matrix both act, and the natural frequencies are exactly the numbers with . You cannot fold the mass into the stiffness without distorting the geometry, so the pair has to be handled together.

A tempting shortcut is to multiply through by the inverse of and solve the ordinary problem for . That works on paper. On a computer it can be a trap: if is close to non-invertible, forming amplifies rounding error and corrupts the answer. The QZ algorithm is the careful alternative. It works on and side by side, never forming the inverse, and grinds the pair down to a shape where the generalised eigenvalues can be read straight off two diagonals.

Visual Beginner

The picture shows what the QZ algorithm produces. You start with a pair of full square matrices and . The algorithm multiplies on the left by one rotation-like matrix and on the right by another, , repeatedly, until both and have become upper-triangular staircases at the same time. Once they are both triangular, each generalised eigenvalue is the ratio of a diagonal entry of the transformed to the matching diagonal entry of the transformed .

   Start: a PAIR                 QZ: shrink both together         Read the ratios
   (A , B) full                  Q on the left, Z on the right    lambda_i = a_ii / b_ii

   A = x x x x      B = x x x x        A' = a * * *     B' = b * * *
       x x x x          x x x x             . a * *          . b * *
       x x x x          x x x x   ===>      . . a *          . . b *
       x x x x          x x x x             . . . a          . . . b
                                          (both upper-triangular at once)

Two things are worth seeing. First, and are squeezed into triangular form simultaneously by the same pair of side multiplications, which is why two different matrices and are needed rather than one. Second, when a diagonal entry of the transformed lands on zero, the ratio blows up: that direction is an infinite generalised eigenvalue, a perfectly real feature of pairs that the ordinary one-matrix picture has no room for.

Worked example Beginner

Take the small pair

Both are already diagonal, so we can read the generalised eigenvalues directly. We want the numbers and directions with .

Step 1. Try the first coordinate direction . Then and . We need , which gives .

Step 2. Try the second coordinate direction . Then and . We need , which gives .

Step 3. Collect the answers. The two generalised eigenvalues are and , each the ratio of the matching diagonal entries of and , and the eigendirections are the two coordinate axes.

What this tells us: when a pair is already triangular, every generalised eigenvalue is a diagonal-to-diagonal ratio . The whole job of the QZ algorithm is to reach that triangular pair without ever dividing by as a whole matrix.

Check your understanding Beginner

Formal definition Intermediate+

Let . The matrix pencil generated by the ordered pair is the matrix-valued function , written and identified with the pair . The pencil is regular when its determinant is not the identically-zero polynomial in , and singular otherwise. Throughout, the pencils considered are regular unless stated otherwise.

Definition (generalised eigenvalues and eigenvectors). A finite generalised eigenvalue of a regular pencil is a root of the generalised characteristic polynomial , following Golub-Van Loan §7.7 [source pending]. A non-zero vector with , equivalently , is an associated generalised eigenvector. To accommodate the degree drop of when is singular, generalised eigenvalues are recorded projectively as pairs , defined up to a common non-zero scaling, with the meaning ; the value when , and the pair with , is the infinite eigenvalue. A regular pencil has exactly generalised eigenvalues counted with multiplicity as pairs, of which are finite and are infinite.

Definition (equivalence of pencils). Two pencils and are equivalent when there exist invertible with and . Equivalence preserves the generalised eigenvalues as projective pairs, since differs from only by the non-zero constant . When and are required to be unitary, the relation is unitary equivalence, the numerically stable case: it leaves the two-norms of and unchanged and so does not amplify rounding error.

Definition (generalised Schur / QZ decomposition). A generalised Schur decomposition of is a unitary equivalence to a pair of upper-triangular matrices: unitary with

both upper-triangular. The generalised eigenvalues are then the pairs along the two diagonals, with finite values when and an infinite eigenvalue at each index with . Over one uses real orthogonal and accepts a generalised real Schur form in which is upper-triangular and is quasi-triangular with and diagonal blocks, the blocks carrying complex-conjugate eigenvalue pairs.

Definition (Hessenberg-triangular form). A pair is in Hessenberg-triangular form when is upper-Hessenberg (entries vanishing below the first subdiagonal, in the sense of 43.06.03) and is upper-triangular. This is the condensed starting form for the QZ iteration, the pencil analogue of the upper-Hessenberg starting form of the standard QR algorithm.

Notation: is the pencil of the pair ; is the conjugate transpose; denotes a generalised eigenvalue as a projective pair; denote the upper-triangular generalised Schur factors of ; denote a Hessenberg-triangular pair. The symbol ranges over finite generalised eigenvalues; the pencil is regular unless flagged singular.

Counterexamples to common slips

  • A generalised eigenvalue need not be a ratio of an eigenvalue of to one of . For A = \begin{psmallmatrix} 1 & 1 \\ 0 & 1 \end{psmallmatrix}, B = \begin{psmallmatrix} 1 & 0 \\ 1 & 1 \end{psmallmatrix}, the polynomial has roots that are not quotients of the eigenvalues of (both ) and (both ). The pencil mixes the two matrices and must be solved as a pair.

  • The infinite eigenvalue is not an artefact to be discarded. For A = \begin{psmallmatrix} 1 & 0 \\ 0 & 1 \end{psmallmatrix}, B = \begin{psmallmatrix} 1 & 0 \\ 0 & 0 \end{psmallmatrix}, the polynomial has degree one, so one finite eigenvalue and one infinite eigenvalue; the second is a genuine solution direction of recorded as the pair .

  • A singular pencil is not merely an ill-conditioned regular one. Take A = \begin{psmallmatrix} 1 & 0 \\ 0 & 0 \end{psmallmatrix}, B = \begin{psmallmatrix} 1 & 0 \\ 0 & 0 \end{psmallmatrix}: then A - \lambda B = (1-\lambda)\begin{psmallmatrix}1&0\\0&0\end{psmallmatrix} has determinant identically zero. Every "solves" it, the eigenvalues are undefined, and QZ is not applicable without first extracting the Kronecker structure.

Key theorem with proof Intermediate+

Theorem (existence of the generalised Schur decomposition; Moler-Stewart 1973 [source pending]; Golub-Van Loan §7.7 [source pending]). Let . There exist unitary matrices such that and are both upper-triangular. If the pencil is regular, its generalised eigenvalues are exactly the pairs , and the diagonal entries can be made to appear in any prescribed order.

Proof. Argue by induction on . The case is immediate, since a pair is already triangular. Assume the result for size .

Two cases arise according to whether is invertible. Suppose first is invertible. Then has an eigenpair with a unit vector, by the fundamental theorem of algebra applied to its characteristic polynomial 01.01.08. This satisfies , so is a generalised eigenvalue of the pencil with eigenvector . Choose this as the last column of a unitary , extending to an orthonormal basis. The vector is non-zero because is invertible; normalise and choose this as the last column of a unitary , again extending to an orthonormal basis.

With these choices, consider the last columns of and . The last column of is , so , placing the only non-zero entry of that column on the diagonal. Likewise , so . Both and therefore have a zero last row except in the bottom-right corner — equivalently, after the equivalence the pencil deflates into an leading pencil and a trailing block. Apply the inductive hypothesis to the leading pair, with its own unitary factors padded by a in the corner, to triangularise the rest. The composition of the two unitary equivalences is again a unitary equivalence, and the resulting are upper-triangular.

When is singular, it has a kernel of positive dimension: pick a unit vector with , and set . Then the last column of vanishes, so the last column of is zero for any unitary ; choose so that is a multiple of (take aligned with , which is non-zero because regularity forbids a common kernel vector). The pencil again deflates, contributing an infinite eigenvalue , and the induction proceeds on the leading block. Reordering of the diagonal pairs is achieved by selecting which eigenpair is peeled off at each step, since the eigenpair chosen for is free.

Bridge. This existence theorem builds toward the QZ algorithm of the Advanced results, which computes the very decomposition just shown to exist, by the same orthogonal-equivalence mechanism but iteratively and in rather than through an abstract induction. The construction here is exactly the generalised analogue of the static Schur form whose single-matrix version underwrites the QR algorithm of 43.06.03: setting collapses with back to ordinary Schur triangularisation, so the QZ decomposition generalises that case while keeping two independent unitaries to handle the second matrix. The foundational reason two unitaries are unavoidable is that a pencil carries two matrices that cannot in general be triangularised by one common similarity; this is exactly the structural fact that the diagonal pairs , not single numbers, are the invariants. The decomposition appears again in the symmetric-definite reduction and in the perturbation theory of the chordal metric, and putting these together, the bridge is the recognition that the existence proof's peel-one-eigenpair induction is precisely what the QZ iteration realises as a convergent, backward-stable sweep on the Hessenberg-triangular form.

Exercises Intermediate+

Advanced results Master

Theorem (the QZ algorithm; Moler-Stewart 1973 [source pending]; Golub-Van Loan §7.7 [source pending]). Let be a regular pencil with and invertible. The QZ algorithm computes the generalised Schur decomposition in two phases. First, a finite sequence of unitary equivalences reduces to Hessenberg-triangular form with upper-Hessenberg and upper-triangular. Second, an iterative phase applies implicit shifted QR sweeps to the matrix — represented only through and , never by forming the product — driving the subdiagonal of to zero while preserving the triangularity of , so that converges to a generalised Schur pair . The unitary accumulators and are the products of the left and right transformations, and the computed pair satisfies .

The reduction to Hessenberg-triangular form first triangularises by a QR factorisation , applying on the left of both matrices; then a sequence of Givens rotations annihilates the entries of below the first subdiagonal one at a time, each rotation applied on the left to a pair of rows of both and and immediately followed by a rotation on the right to a pair of columns of both, chosen to restore the upper-triangularity of that the left rotation disturbed. This left-then-right paired sweep is the structural signature of QZ: every transformation is a two-sided orthogonal equivalence, so the generalised eigenvalues are preserved at machine precision throughout, and the cost is .

The iterative phase is the implicit shift. Conceptually one runs the shifted QR algorithm of 43.06.03 on , which is upper-Hessenberg because is Hessenberg and is upper-triangular. A QR step on would compute and form . The QZ realisation never forms : it computes only the first column of , which depends on and on solving one triangular system with , builds the Householder or Givens transformation that maps that column to a multiple of , applies it on the left of , and then chases the resulting bulge down the pencil with paired left-right rotations exactly as in the reduction phase, restoring Hessenberg-triangular form. By the implicit-Q theorem of 43.06.03, the result is the same pencil the explicit shifted step on would produce, because an unreduced Hessenberg-triangular pencil is determined up to diagonal scaling by the first column of the equivalent . Avoiding the explicit product is what makes the method stable when (equivalently ) is ill-conditioned: the entries of would be polluted by a factor , whereas the implicit scheme touches only through one well-posed triangular solve per sweep.

Theorem (the symmetric-definite generalised eigenproblem; Golub-Van Loan §8.7 [source pending]). Let be Hermitian and Hermitian positive-definite, with Cholesky factorisation . Then is Hermitian, its eigenvalues are the generalised eigenvalues of — all real and finite — and the symmetric QR algorithm applied to computes them with backward error . The generalised eigenvectors are recovered by from the eigenvectors of .

This reduction is the constructive numerical counterpart of the simultaneous-diagonalisation theory of 01.01.19, which establishes that a symmetric pair with one definite member can be diagonalised by a single congruence; the Cholesky factor supplies that congruence explicitly and stably. The QZ algorithm applies to a general pencil and so does not exploit symmetry, but on the symmetric-definite class it is wasteful: the Cholesky reduction costs a third of the work and delivers a standard Hermitian eigenproblem whose conditioning is governed by , acceptable when is well-conditioned. When is symmetric positive-definite but ill-conditioned, the QZ algorithm on the unsymmetrised pair is preferred for accuracy, trading the symmetry exploitation for the avoidance of the squared sensitivity that inherits from .

Theorem (perturbation and the chordal metric; Stewart-Sun 1990 [source pending]). The sensitivity of a simple finite generalised eigenvalue of a regular pencil is correctly measured not by the Euclidean distance on but by the chordal metric on projective pairs,

under which a finite and an infinite eigenvalue sit at finite chordal distance, and the condition number of a generalised eigenvalue with left and right eigenvectors is , finite even as .

The chordal metric is the intrinsic distance on the projective line of eigenvalue pairs, and its appearance is forced by the existence of infinite eigenvalues: a perturbation that moves a large finite eigenvalue past infinity to a large negative one is small in the chordal sense though enormous in the ordinary one, which is the correct accounting for a pencil where is an ordinary point. This is the perturbation-theoretic shadow of the projective bookkeeping that the generalised Schur form keeps in its diagonal pairs .

Synthesis. The QZ algorithm is the orthogonal-equivalence descendant of the QR algorithm, and the generalised Schur decomposition is the foundational reason it computes the entire pencil spectrum at once: where QR drives a single matrix to triangular form by one-sided similarity, QZ drives the pair to a triangular pair by two-sided equivalence, reading every generalised eigenvalue as a diagonal ratio . This is exactly the structure that setting collapses back to the standard Schur form of 43.06.03, so QZ generalises that algorithm while keeping two unitaries to absorb the second matrix; the central insight is that the implicit shift runs the shifted QR iteration on without ever forming it, touching only through one triangular solve per sweep, which is precisely how the method stays backward stable when is ill-conditioned and forming would not.

The symmetric-definite case is dual to the general one: there the Cholesky congruence of 01.01.19 reduces the pencil to a single Hermitian matrix, trading the two-sided machinery for the cheaper one-sided symmetric QR whenever is well-conditioned. Putting these together, the perturbation theory closes the loop — the chordal metric on projective pairs is the intrinsic geometry the diagonal pairs live in, making a finite and an infinite eigenvalue near neighbours — and the bridge is the recognition that the static existence of a common triangularising equivalence becomes, through Hessenberg-triangular reduction plus implicit bulge-chasing, a constructive backward-stable algorithm whose every step is the pencil analogue of one orthogonal QR sweep.

Full proof set Master

Proposition (equivalence preserves the generalised spectrum). If are invertible and , then , so the finite generalised eigenvalues and their multiplicities, the degree of the characteristic polynomial, and hence the count of infinite eigenvalues are all preserved.

Proof. Compute by linearity in . Taking determinants, , with a non-zero constant independent of . The two polynomials in therefore have the same roots with the same multiplicities and the same degree. Since a regular pencil has eigenvalue pairs of which the degree counts the finite ones, the infinite count is preserved as well.

Proposition (HR⁻¹ is upper-Hessenberg). If is upper-Hessenberg and is invertible upper-triangular, then is upper-Hessenberg.

Proof. The inverse of an invertible upper-triangular matrix is upper-triangular. The product of an upper-Hessenberg matrix and an upper-triangular matrix has entry , where for and for . A non-zero term needs , which is impossible when . So for , that is is upper-Hessenberg.

Proposition (one QZ sweep preserves Hessenberg-triangular form). Let be Hessenberg-triangular with invertible, and let a QZ sweep consist of a left transformation determined by the first column of (with ) followed by bulge-chasing left-right paired rotations restoring triangularity of . The resulting pair is again Hessenberg-triangular, and equals the explicit shifted-QR iterate of .

Proof. Each paired step applies a left rotation and a right rotation . The right rotations are chosen at each stage to annihilate the single entry that the preceding left rotation introduced below the diagonal of , so is upper-triangular by construction. The left rotation is built so that is a multiple of ; the bulge-chasing restores to upper-Hessenberg form one subdiagonal position at a time, the standard bulge-chase of 43.06.03 adapted so each left rotation is paired with a right rotation keeping triangular. Let and . Then , so is a unitary similarity of by . Since — because and each subsequent fixes the first coordinate — and both and the explicit iterate from are unreduced upper-Hessenberg with transforming first columns agreeing up to scale, the implicit-Q uniqueness theorem of 43.06.03 gives for a diagonal unitary , so equals the explicit shifted-QR iterate up to diagonal scaling.

Proposition (deflation at a zero subdiagonal). If during the QZ iteration a subdiagonal entry of falls below a tolerance proportional to , setting it to zero splits the pencil into a leading Hessenberg-triangular pencil and a trailing one, whose generalised eigenvalues are the union of the two subpencils'.

Proof. Setting makes block upper-triangular with diagonal blocks (size ) and (size ); is already upper-triangular, hence block upper-triangular with the same partition into . For a block upper-triangular pencil, by the block-triangular determinant formula, so the generalised characteristic polynomial factors and the eigenvalue set is the union of the two subpencils' eigenvalue sets, with multiplicities adding. The QZ iteration then continues independently on each block. When instead a diagonal entry of becomes negligible, a left rotation moves the resulting zero to the bottom-right, deflating an infinite eigenvalue .

Connections Master

The QZ algorithm is the direct generalisation of the QR algorithm of 43.06.03: setting and constraining recovers ordinary Schur triangularisation, the implicit shift on reduces to the implicit shift on , and the bulge-chasing machinery, the implicit-Q theorem, the shift strategies, and the backward-stability argument all specialise verbatim; the present unit is the standard eigenproblem lifted from one matrix to a pencil by replacing one-sided similarity with two-sided orthogonal equivalence.

The symmetric-definite reduction realises numerically the simultaneous-diagonalisation theorem of 01.01.19: that unit proves a symmetric pencil with one definite member admits a single congruence diagonalising both forms, and the Cholesky factor of together with the spectral theorem of 01.01.13 applied to is exactly that congruence made constructive and stable, turning the existence statement into the computed eigenvalues that vibration analysis and quadratic-form optimisation consume.

The generalised eigenvalue structure rests on the ordinary eigenvalue and characteristic-polynomial theory of 01.01.08: the generalised characteristic polynomial is the pencil analogue of , its degree drop below encodes the infinite eigenvalues that the projective pair records, and the existence proof peels off one ordinary eigenpair of at a time exactly as the Schur induction of the single-matrix case peels off one eigenvector.

Historical & philosophical context Master

The QZ algorithm was introduced by Cleve Moler and G. W. Stewart in their 1973 paper in the SIAM Journal on Numerical Analysis, which gave the name QZ and established the two-phase structure — reduction to Hessenberg-triangular form followed by an implicit shifted iteration on the pencil — together with the backward-stability analysis [Moler 1973]. Their central design decision was to refuse the obvious route of forming and running the existing QR algorithm: when is ill-conditioned that product is computed inaccurately, and the resulting eigenvalues inherit the conditioning of rather than the conditioning of the pencil. By manipulating and only through unitary equivalences and a single triangular solve per sweep, QZ achieves the backward error that the explicit-inverse route forfeits.

The generalised eigenvalue problem itself is older, arising in the nineteenth-century theory of pairs of quadratic forms studied by Weierstrass, whose elementary-divisor theory and the Kronecker theory of singular pencils classify the canonical forms that QZ computes in the regular case [Stewart 1990]. The recognition that generalised eigenvalues should be carried as projective pairs , and that their perturbation is governed by the chordal metric under which infinity is an ordinary point, was developed in the perturbation theory of Stewart and Sun, completing the numerical picture by giving the condition number of a generalised eigenvalue a form that remains finite as the eigenvalue tends to infinity. The QZ algorithm remains the method underlying the generalised eigenvalue routines of LAPACK and is the standard back end for polynomial root-finding through the companion pencil.

Bibliography Master

@article{MolerStewart1973,
  author  = {Moler, Cleve B. and Stewart, G. W.},
  title   = {An Algorithm for Generalized Matrix Eigenvalue Problems},
  journal = {SIAM Journal on Numerical Analysis},
  volume  = {10},
  number  = {2},
  year    = {1973},
  pages   = {241--256}
}

@book{GolubVanLoan2013,
  author    = {Golub, Gene H. and Van Loan, Charles F.},
  title     = {Matrix Computations},
  edition   = {4th},
  publisher = {Johns Hopkins University Press},
  address   = {Baltimore},
  year      = {2013}
}

@book{StewartSun1990,
  author    = {Stewart, G. W. and Sun, Ji-Guang},
  title     = {Matrix Perturbation Theory},
  publisher = {Academic Press},
  address   = {Boston},
  year      = {1990}
}

@book{Stewart2001,
  author    = {Stewart, G. W.},
  title     = {Matrix Algorithms, Volume II: Eigensystems},
  publisher = {SIAM},
  address   = {Philadelphia},
  year      = {2001}
}

@book{Watkins2007,
  author    = {Watkins, David S.},
  title     = {The Matrix Eigenvalue Problem: {GR} and {K}rylov Subspace Methods},
  publisher = {SIAM},
  address   = {Philadelphia},
  year      = {2007}
}

@article{Weierstrass1868,
  author  = {Weierstrass, Karl},
  title   = {Zur Theorie der bilinearen und quadratischen Formen},
  journal = {Monatsberichte der K\"oniglich Preussischen Akademie der Wissenschaften zu Berlin},
  year    = {1868},
  pages   = {310--338}
}

@article{Kronecker1890,
  author  = {Kronecker, Leopold},
  title   = {Algebraische Reduction der Schaaren bilinearer Formen},
  journal = {Sitzungsberichte der K\"oniglich Preussischen Akademie der Wissenschaften zu Berlin},
  year    = {1890},
  pages   = {1225--1237}
}