Simultaneous diagonalisation of two quadratic forms and the generalised eigenvalue problem
Anchor (Master): Shilov *Linear Algebra* Ch. 10; Parlett *The Symmetric Eigenvalue Problem* Ch. 15 (the symmetric-definite generalised eigenproblem); Golub-Van Loan *Matrix Computations* §8.7 (Cholesky reduction of $Av=\lambda Bv$); Gantmacher *Theory of Matrices* Vol. I Ch. X (pencils of forms, Weierstrass-Kronecker theory); Arnold *Mathematical Methods of Classical Mechanics* §23–24 (small oscillations, normal modes)
Intuition Beginner
A single symmetric matrix has a set of perpendicular axes along which it acts by pure stretching. The spectral theorem 01.01.13 finds them. But many problems in physics and statistics come with two such matrices at once, and the two sets of axes usually point in different directions. The question of this unit is whether one clever choice of coordinates can straighten out both matrices together.
The surprising answer is yes, as long as one of the two matrices measures a genuine size, meaning it is positive whenever you feed it a nonzero vector. That positive matrix gets used as a new ruler. Once you agree to measure lengths and angles with that ruler instead of the ordinary one, the second matrix turns out to have perpendicular stretch axes in the new sense, and those very axes diagonalise both matrices at the same time.
A vibrating system makes this concrete. Two masses joined by springs have a kinetic-energy matrix and a potential-energy matrix. Diagonalising both at once finds the special motions, called normal modes, in which the whole system oscillates at a single clean frequency, with every part moving in step.
Visual Beginner
Two quadratic forms in the plane draw two ellipses centred at the origin. The first form, the positive-definite one, draws a reference ellipse; the second draws another ellipse, tilted differently. Ordinary perpendicular axes line up with at most one of them. The simultaneous-diagonalisation directions are the axes that look like the principal axes of both ellipses once you stretch the picture so the reference ellipse becomes a perfect circle.
The right panel is the punchline: after rescaling space so the blue reference ellipse becomes a circle, the red ellipse lines up with the coordinate axes. Those axes are the directions that diagonalise both forms together.
Worked example Beginner
Take two symmetric matrices in the plane,
Here is positive: feeding it any nonzero vector gives . The generalised eigenvalues solve .
Step 1. Form .
Step 2. Set its determinant to zero: .
Step 3. Multiply out: .
Step 4. Solve the quadratic: . Numerically and .
Step 5. Find the first generalised eigenvector from , that is . The top row gives , so . A direction is .
Step 6. Check -orthogonality of the two eigenvectors. The second eigenvector is . Then , confirming the eigenvectors are perpendicular when measured by .
What this tells us: the two generalised eigenvalues are the values of where and agree as forms in some direction, and the two eigenvectors are perpendicular in the ruler set by , not in the ordinary ruler. Those directions are exactly the ones that diagonalise both forms at once.
Check your understanding Beginner
Formal definition Intermediate+
Let be a finite-dimensional real vector space and let be real symmetric bilinear forms on in the sense of 01.01.15, with Gram matrices (also written ) in a fixed basis. Assume is positive-definite, written : for every . Symmetry means and . All forms in this unit are real symmetric unless stated otherwise.
Because , the formula
defines a genuine inner product on , the -inner product, with norm . Two vectors are -orthogonal when ; a basis is -orthonormal when its Gram matrix against is the identity. This makes a finite-dimensional inner-product space in the sense of 01.01.09.
Definition (generalised eigenvalue problem). A scalar and a vector form a generalised eigenpair of the ordered pair when
The scalar is a generalised eigenvalue and a generalised eigenvector. Equivalently is a root of the generalised characteristic polynomial
and . The formal object , viewed as a one-parameter family of matrices, is the matrix pencil of the pair; it is regular when is not identically zero, which holds automatically here because has leading term .
Definition (simultaneous diagonalisation by congruence). The pair is simultaneously diagonalisable by congruence when there is an invertible with
The transformation is congruence, the change-of-variables law for bilinear forms 01.01.17, and it is the correct equivalence here: a substitution sends the form to . Congruence is not similarity (); the two coincide only when is orthogonal, and the diagonalising below is generally not orthogonal.
Definition (generalised Rayleigh quotient). For ,
is the generalised Rayleigh quotient of . It is well defined because keeps the denominator positive, and it reduces to the ordinary Rayleigh quotient of 01.01.14 when .
Counterexamples to common slips
- Positive-definiteness of , not of , is what the construction needs. The matrix may be indefinite; its generalised eigenvalues are then a mix of positive and negative reals, and the simultaneous diagonalisation still exists.
- The diagonalising is generally not orthogonal, so is congruent but not similar to . Reading off "eigenvalues of " from is the error: the are eigenvalues of the pencil, equivalently of , not of .
- If is merely positive-semidefinite or indefinite, the pencil may be defective or singular. The pair , has , with no real generalised eigenvalue; no real congruence diagonalises both.
Key theorem with proof Intermediate+
Theorem (simultaneous diagonalisation of two forms, one definite; Shilov Ch. 10 [source pending]; Horn-Johnson §4.5 [source pending]). Let be real symmetric forms on an -dimensional real vector space with . Then there is an invertible with
the columns of are -orthonormal generalised eigenvectors, and the diagonal entries are the generalised eigenvalues, that is the roots of . All are real.
Proof. Equip with the -inner product , which is a genuine inner product because . Consider the linear operator . It is self-adjoint for : for all ,
using and , and symmetrically . The two are equal, so in the -inner product.
By the spectral theorem 01.01.13 applied to the self-adjoint on the inner-product space , there is a -orthonormal basis of eigenvectors, with . The eigenvalue equation rearranges to , so each is a generalised eigenvector of with generalised eigenvalue .
Assemble the eigenvectors as the columns of . The -orthonormality is exactly . For the other form,
so . The matrix is invertible because its columns are -orthonormal, hence linearly independent (a dependence paired in against forces ).
Finally, the are the generalised eigenvalues: from and one gets and , so
The roots are precisely , all real, and ordering the eigenbasis by decreasing eigenvalue gives .
Bridge. This theorem builds toward the small-oscillation theory of the Advanced results, where is the stiffness form, the mass form, and the generalised eigenpairs become normal modes with frequencies ; the same congruence reduction appears again in 01.01.15 as the metric-signature normalisation that -orthonormalisation performs, since is Sylvester's positive-definite case made explicit. The foundational reason the construction works is that a positive-definite is an inner product in disguise, so becomes self-adjoint and the spectral theorem 01.01.13 applies verbatim; this is exactly the move that converts a two-form problem into a one-operator problem. The generalised Rayleigh quotient generalises the ordinary Rayleigh quotient of 01.01.14, with its critical values the . Putting these together, simultaneous diagonalisation is the statement that one congruence rounds out to the identity and squares up to a diagonal at the same time.
Exercises Intermediate+
Lean formalization Intermediate+
Mathlib carries the finite-dimensional spectral theorem and positive-definite bilinear forms, but not the assembled simultaneous-congruence-diagonalisation theorem for an ordered pair with . The readable view of the intended statements:
import Mathlib.Analysis.InnerProductSpace.Spectrum
import Mathlib.LinearAlgebra.Matrix.PosDef
open Matrix
variable {n : Type*} [Fintype n] [DecidableEq n]
/-- Simultaneous diagonalisation by congruence of a symmetric pair with the
second matrix positive-definite. NOT a single named Mathlib theorem. -/
theorem simultaneous_diagonalisation
(A B : Matrix n n ℝ) (hA : A.IsSymm) (hB : B.PosDef) :
∃ (P : Matrix n n ℝ) (lam : n → ℝ),
P.det ≠ 0 ∧
Pᵀ * B * P = 1 ∧
Pᵀ * A * P = Matrix.diagonal lam := by
sorry
-- Cholesky B = Lᵀ * L (Matrix.PosDef.cholesky-style); diagonalise the
-- symmetric C = L⁻¹ * A * L⁻ᵀ via the spectral theorem; pull back.
/-- The generalised eigenvalues are the roots of det(A - λ B); equivalently the
ordinary eigenvalues of L⁻¹ A L⁻ᵀ. NOT packaged in Mathlib. -/
theorem generalised_eigenvalues_real
(A B : Matrix n n ℝ) (hA : A.IsSymm) (hB : B.PosDef)
(lam : ℝ) (v : n → ℝ) (hv : v ≠ 0)
(h : A.mulVec v = lam • B.mulVec v) :
True := by
trivial -- placeholder: reality follows from self-adjointness of B⁻¹A in ⟨·,·⟩_BThe proof gap to a clean contribution: (i) expose a Cholesky factorisation B = Lᵀ * L from Matrix.PosDef with L lower-triangular invertible; (ii) transport the finite-dimensional spectral theorem LinearMap.IsSymmetric.eigenvectorBasis to the symmetric C = L⁻¹ A L⁻ᵀ; (iii) pull the eigenbasis back through L⁻ᵀ to obtain P with Pᵀ B P = 1 and Pᵀ A P diagonal, recording the -orthogonality. Each step is reachable from current Mathlib, but the assembled simultaneous_diagonalisation and a generalisedEigenvalue API are not present as named results, and the singular-pencil Weierstrass-Kronecker form is entirely absent.
Advanced results Master
Theorem (small oscillations; Lagrange 1788 [source pending]). Let a mechanical system with generalised coordinates have kinetic energy and potential energy near a stable equilibrium, with the mass form and the stiffness form. The linearised equations of motion decouple in the simultaneous-diagonalisation coordinates into independent harmonic oscillators
where the are the generalised eigenvalues of . Each is a normal coordinate; the -th normal mode oscillates at angular frequency when , and the equilibrium is stable exactly when every , that is when .
The substitution with , sends to and to , so the Lagrangian separates and the Euler-Lagrange equations become the decoupled set above. This is the mechanical content of simultaneous diagonalisation: the normal modes are the generalised eigenvectors, and the normal frequencies squared are the generalised eigenvalues.
Theorem (generalised Rayleigh quotient and min-max; cf. 01.01.14). With the generalised eigenvalues of , , the generalised Rayleigh quotient has range and
The extremes and are attained on the corresponding generalised eigenvectors.
The Cholesky substitution identifies with the ordinary Rayleigh quotient of , so every result of 01.01.14 transports: the critical points of are the generalised eigenvectors, and the recursive deflation uses -orthogonality rather than ordinary orthogonality.
Theorem (Weierstrass-Kronecker canonical form of a pencil; Weierstrass 1868 [source pending]; Kronecker 1890 [source pending]). Let be a pencil of real (or complex) symmetric matrices. If the pencil is regular (), it is equivalent under strict congruence to a direct sum of blocks indexed by its elementary divisors and infinite elementary divisors (poles at , present when is singular). If the pencil is singular (), additional Kronecker minimal-index blocks appear, encoding the row and column minimal indices of the singular part.
When the pencil is regular with finite linear elementary divisors , no infinite divisors, and no minimal indices, which is the diagonalisable case of the definite theorem. The general theory is what governs the failure modes: a defective regular pencil (repeated with ) is not diagonalisable, and a singular pencil arises when and share a common null direction. The QZ algorithm of Moler and Stewart computes the generalised Schur form that realises this classification numerically.
Theorem (Williamson's symplectic normal form; pointer). Let be a positive-definite real symmetric matrix and the standard symplectic form. There is a symplectic (meaning ) with
and the , the symplectic eigenvalues, are the positive parts of the eigenvalues of .
This is the simultaneous-normal-form theorem when the second structure is a symplectic form rather than a positive-definite form: instead of congruence by an arbitrary -orthonormal , the allowed transformations are symplectic, and the invariants are the symplectic eigenvalues. Williamson's theorem is the linear-algebra backbone of Gaussian quantum information and the metaplectic representation, exactly as the definite case underpins classical normal modes.
Synthesis. Simultaneous diagonalisation of two real symmetric forms, one positive-definite, is the spectral theorem 01.01.13 read in the inner product that the definite form supplies. A positive-definite is an inner product ; in that inner product the operator is self-adjoint, so it has a -orthonormal eigenbasis, and the matrix of that basis achieves and at once. The diagonal entries are the generalised eigenvalues, the roots of , equivalently the ordinary eigenvalues of the Cholesky-reduced , and the generalised Rayleigh quotient carries the entire Courant-Fischer min-max apparatus of 01.01.14 across the substitution . The transformation is congruence, not similarity, which is exactly why the are pencil invariants rather than eigenvalues of , and why Sylvester's law of inertia rather than the characteristic polynomial of governs the signs.
Putting these together, the mechanical reading is Lagrange's normal-mode decomposition, with the stiffness, the mass, normal modes the generalised eigenvectors, and normal frequencies ; the definiteness of is the kinetic-energy positivity that makes the theory work, and its loss pushes the problem into the Weierstrass-Kronecker classification of general pencils. The same template, with the second structure a symplectic form, is Williamson's theorem and its symplectic eigenvalues. Across all of these the central insight is that one congruence rounds one form to the identity and squares the other to a diagonal, converting a coupled two-form problem into an uncoupled spectrum.
Full proof set Master
Proposition (existence and uniqueness up to -orthogonal mixing). Let be real symmetric with . A simultaneous congruence diagonalisation , exists; the diagonal is unique up to permutation, and within each generalised eigenspace the choice of -orthonormal eigenvectors is free up to a -orthogonal transformation.
Proof. Existence is the Key theorem: is self-adjoint for , hence has a -orthonormal eigenbasis, whose matrix satisfies the two identities. For the diagonal's uniqueness, suppose both diagonalise the pair, giving . From and , the pencil factors as , so . The roots of are basis-independent, so the multiset equals , giving up to permutation.
For the eigenvector freedom, fix a generalised eigenvalue with generalised eigenspace . If and are two -orthonormal bases of , the change-of-basis matrix between them satisfies because both are -orthonormal: . So is orthogonal, and the choice within is exactly a -orthogonal mixing.
Proposition (inertia of the pencil and signs of the generalised eigenvalues). Let be real symmetric with , and let be the diagonal of generalised eigenvalues. The number of positive, negative, and zero equals the inertia of , the Sylvester signature of the form itself.
Proof. By the Key theorem there is an invertible with , so and are congruent. Sylvester's law of inertia [Sylvester 1852] states that the numbers of positive, negative, and zero diagonal entries of a real symmetric matrix are invariant under real congruence. Applying it to the congruence , the inertia of equals the inertia of , which counts the signs of the . In particular the equilibrium of the small-oscillation system is stable () precisely when all .
Proposition (positivity of 's eigenvalues forces simultaneous definiteness). If are real symmetric with , then if and only if every generalised eigenvalue , and if and only if every .
Proof. Diagonalise simultaneously, , . For , set ; then and . If all , then for every , so . Conversely, if some , take (so ); then , so is not positive-definite. The semidefinite statement follows by replacing the strict inequalities throughout.
Proposition (Cholesky reduction realises the diagonalising congruence). Let be the Cholesky factorisation ( lower-triangular, invertible) and let , a real symmetric matrix with orthogonal eigendecomposition , . Then satisfies and .
Proof. Compute the -form: . Compute the -form: , using twice. The eigenvalues in are the eigenvalues of , which by the substitution are the generalised eigenvalues of . This gives an explicit, numerically standard route to : one Cholesky factorisation and one symmetric eigendecomposition.
Connections Master
The construction is the spectral theorem 01.01.13 applied in the inner product supplied by the positive-definite form: where the spectral theorem diagonalises a self-adjoint operator in a fixed inner product, this unit chooses the inner product so that becomes self-adjoint, and the resulting eigenbasis diagonalises both forms. The reality of the generalised eigenvalues and the -orthogonality of eigenvectors are the spectral theorem's reality and orthogonality conclusions, transported.
The generalised Rayleigh quotient specialises the Rayleigh quotient and Courant-Fischer min-max of 01.01.14 through the Cholesky substitution , which turns into the ordinary Rayleigh quotient of ; every extremal and min-max statement there carries over, now with -orthogonality replacing ordinary orthogonality in the deflation.
The congruence transformation is the bilinear-form change-of-variables law of 01.01.17, and the sign count of the diagonalised pair is governed by Sylvester's law of inertia from 01.01.15: the generalised eigenvalues inherit the signature of , so the quadratic-form classification by inertia is exactly what determines stability of the associated mechanical equilibrium.
The same template recurs when the second form is symplectic rather than positive-definite, giving Williamson's normal form and the symplectic eigenvalues that organise Gaussian states in quantum information 12.07.03; and the small-oscillation reading feeds the normal-mode analysis of coupled oscillators and lattice vibrations in classical and condensed-matter physics, where the generalised eigenvectors are the phonon polarisations and are the squared mode frequencies.
Historical & philosophical context Master
The simultaneous reduction of two quadratic forms grew out of mechanics before it became a chapter of linear algebra. Lagrange, in the Mécanique analytique (1788), analysed the small oscillations of a system about a stable equilibrium by writing the kinetic and potential energies as two quadratic forms and seeking coordinates in which both became sums of squares, so that the equations of motion separated into independent harmonic oscillators; the determinant condition he reached is the secular equation that names the squared normal frequencies [Lagrange 1788]. Sylvester's law of inertia (1852) supplied the invariant that the reduction respects: the signature of a real quadratic form is unchanged by real congruence, so the signs of the generalised eigenvalues are intrinsic [Sylvester 1852].
The general theory of a pencil of forms is due to Weierstrass and Kronecker. Weierstrass (1868) classified a regular pencil by its elementary divisors, settling exactly when the pair is simultaneously diagonalisable and when it is defective [Weierstrass 1868]. Kronecker (1890) completed the picture for singular pencils, introducing the minimal indices that measure the singular part beyond the Weierstrass divisors [Kronecker 1890]. The positive-definite case singled out in this unit is the benign corner of that theory, where all elementary divisors are finite and linear and the pencil diagonalises by a real congruence. Shilov's Linear Algebra presents the definite case as the simultaneous-reduction theorem, and Parlett's The Symmetric Eigenvalue Problem (1980) developed the numerical side, the Cholesky reduction and the conditioning of the symmetric-definite generalised eigenproblem that underlies modern eigensolvers [Parlett 1980].
Bibliography Master
@book{Lagrange1788,
author = {Lagrange, Joseph-Louis},
title = {M{\'e}canique analytique},
publisher = {Veuve Desaint},
address = {Paris},
year = {1788}
}
@article{Sylvester1852,
author = {Sylvester, James Joseph},
title = {A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares},
journal = {Philosophical Magazine (4th series)},
volume = {4},
number = {23},
year = {1852},
pages = {138--142}
}
@article{Weierstrass1868,
author = {Weierstrass, Karl},
title = {Zur Theorie der bilinearen und quadratischen Formen},
journal = {Monatsberichte der K{\"o}niglich Preussischen Akademie der Wissenschaften zu Berlin},
year = {1868},
pages = {310--338}
}
@article{Kronecker1890,
author = {Kronecker, Leopold},
title = {Algebraische Reduction der Schaaren bilinearer Formen},
journal = {Sitzungsberichte der K{\"o}niglich Preussischen Akademie der Wissenschaften zu Berlin},
year = {1890},
pages = {763--776}
}
@book{Gantmacher1959,
author = {Gantmacher, Felix R.},
title = {The Theory of Matrices, Volume I},
publisher = {Chelsea},
address = {New York},
year = {1959}
}
@book{Parlett1998,
author = {Parlett, Beresford N.},
title = {The Symmetric Eigenvalue Problem},
publisher = {SIAM},
series = {Classics in Applied Mathematics},
year = {1998},
note = {Originally published Prentice-Hall, 1980}
}
@book{GolubVanLoan2013,
author = {Golub, Gene H. and Van Loan, Charles F.},
title = {Matrix Computations},
edition = {4th},
publisher = {Johns Hopkins University Press},
year = {2013}
}
@book{Shilov1977,
author = {Shilov, Georgi E.},
title = {Linear Algebra},
publisher = {Dover Publications},
address = {New York},
year = {1977},
note = {Translation of the 1971 Russian edition, transl. R. A. Silverman}
}
@book{Arnold1989,
author = {Arnold, Vladimir I.},
title = {Mathematical Methods of Classical Mechanics},
edition = {2nd},
publisher = {Springer},
series = {Graduate Texts in Mathematics},
volume = {60},
year = {1989}
}
@article{MolerStewart1973,
author = {Moler, Cleve B. and Stewart, G. W.},
title = {An algorithm for generalized matrix eigenvalue problems},
journal = {SIAM Journal on Numerical Analysis},
volume = {10},
number = {2},
year = {1973},
pages = {241--256}
}
@article{Williamson1936,
author = {Williamson, John},
title = {On the algebraic problem concerning the normal forms of linear dynamical systems},
journal = {American Journal of Mathematics},
volume = {58},
number = {1},
year = {1936},
pages = {141--163}
}