12.07.03 · quantum / perturbation

Variational method (Rayleigh-Ritz) in quantum mechanics

shipped · 3 tiers · Lean: none

Anchor (Master): Landau & Lifshitz, Quantum Mechanics, Vol. 3 (Pergamon, 1977), Ch. VI §44 and Problems; MacDonald 1933 *Phys. Rev.* 43; Ritz 1909 *J. reine angew. Math.* 135

Intuition Beginner

Most quantum systems cannot be solved exactly. You can still get a good number for the lowest energy without solving anything, and the trick is almost embarrassingly simple. Pick any reasonable guess for the wavefunction, compute the average energy that guess carries, and you are guaranteed that the true ground-state energy sits at or below your number. Your guess can never undershoot the real answer. So a guess hands you a ceiling, and a better guess hands you a lower ceiling closer to the truth.

Why is the ceiling always above the truth? The ground state is, by definition, the configuration of lowest possible energy. Any other state is a blend of the ground state with higher-energy states mixed in. Mixing in higher-energy pieces can only raise the average. Think of grades in a class: the lowest grade is some number, and any weighted average of all the grades lands at or above that lowest one. The ground-state energy plays the role of the lowest grade, and your guess is a weighted average.

This turns physics into a search. Build a guess with a few adjustable dials, write the average energy as a formula in those dials, and turn the dials until the energy bottoms out. The bottom of that search is your best estimate. Helium, where two electrons repel each other and exact solutions vanish, is the showpiece: one dial, an effective nuclear charge that each electron feels after the other partially screens the nucleus, already lands within two percent of the measured energy.

The method is forgiving in a way that matters. If your guessed wavefunction is a little wrong, the energy you compute is wrong by much less. A ten-percent error in the shape of the guess can leave the energy off by only one percent. That is why a crude guess still gives a respectable number, and why this is the first tool reached for when an exact answer is out of range.

Visual Beginner

The picture to hold is an energy landscape over the space of guesses. Each adjustable dial is a horizontal axis; the average energy is the height. Every point on this surface sits at or above one floor line, the true ground-state energy. Searching for the lowest point on the surface is searching for the best guess, and the height you reach there is your estimate of the floor.

Picture the surface as a bowl that never dips below the floor. The richer your family of guesses, the deeper the bowl can reach, and the closer its bottom gets to the floor. Adding more dials cannot raise the bottom; it can only lower it or leave it where it was, so richer guesses always do at least as well.

Worked example Beginner

Estimate the ground-state energy of a particle of mass $m$ in the one-dimensional harmonic oscillator, where the potential energy is $V(x) = \tfrac{1}{2} m\omega^2 x^2$. The exact answer is known to be $E_0 = \tfrac{1}{2}\hbar\omega$, so this checks the method against a result we can verify.

Step 1. Choose a guess with one dial. A Gaussian bump $\psi_\alpha(x) = e^{-\alpha x^2}$ is a sensible shape: it is peaked at the origin where the potential is smallest, and the width is controlled by the positive number $\alpha$. Larger $\alpha$ means a narrower bump.

Step 2. The average energy of this guess works out to $E(\alpha) = \frac{\hbar^2\alpha}{2m} + \frac{m\omega^2}{8\alpha}$. The first piece is the kinetic energy, which grows as the bump narrows; the second piece is the potential energy, which grows as the bump widens and samples more of the rising potential. The two pieces pull against each other.

Step 3. Find the dial setting that bottoms out $E(\alpha)$. Set the rate of change of $E$ with respect to $\alpha$ to zero: $\frac{dE}{d\alpha} = \frac{\hbar^2}{2m} - \frac{m\omega^2}{8\alpha^2} = 0$. Solving gives $\alpha^2 = \frac{m^2\omega^2}{4\hbar^2}$, so $\alpha = \frac{m\omega}{2\hbar}$.

Step 4. Put this best dial setting back into the energy formula. The kinetic piece becomes $\tfrac{1}{4}\hbar\omega$, and the potential piece becomes $\tfrac{1}{4}\hbar\omega$. They add to $\tfrac{1}{2}\hbar\omega$.

What this tells us: the guess landed exactly on the true ground-state energy. That happened because the Gaussian shape is, in fact, the exact ground-state wavefunction of the harmonic oscillator, so the family of guesses contained the right answer. When the family does not contain the exact answer, the bottom of the search lands a little above the truth, and that gap is the price of an imperfect guess.
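The dial-turning in Steps 2 to 4 can also be done numerically. The following is a minimal sketch, assuming units with $\hbar = m = \omega = 1$ and the Gaussian trial $e^{-\alpha x^2}$, so that the average energy is $\alpha/2 + 1/(8\alpha)$; the function names and the search interval are illustrative choices, not part of the original example.

```python
import math

def trial_energy(alpha: float) -> float:
    """Average energy of the Gaussian trial exp(-alpha x^2), units hbar = m = omega = 1."""
    kinetic = alpha / 2.0             # grows as the bump narrows
    potential = 1.0 / (8.0 * alpha)   # grows as the bump widens
    return kinetic + potential

def golden_section_minimize(f, lo, hi, tol=1e-10):
    """Minimise a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return 0.5 * (a + b)

# Assumed search interval for the dial; any interval containing the minimum works.
best_alpha = golden_section_minimize(trial_energy, 0.01, 10.0)
best_energy = trial_energy(best_alpha)
print(f"best alpha = {best_alpha:.6f}, energy = {best_energy:.6f}")
```

In these units the search should settle near $\alpha = 1/2$ with energy near $1/2$, matching the hand calculation. A one-dimensional search like this is only a stand-in for the calculus step; with several dials a general-purpose minimiser would take its place.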

Check your understanding Beginner

Formal definition Intermediate+

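Let $H$ be a self-adjoint Hamiltonian on a Hilbert space $\mathcal{H}$, bounded below, with spectrum $\sigma(H)$ whose infimum is the ground-state energy $E_0$. For a nonzero vector $\psi$ in the form domain of $H$, the Rayleigh quotient is

$$E[\psi] = \frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle}.$$

A trial function (or trial state) is any admissible $\psi$ used to evaluate $E[\psi]$; admissibility means $\psi$ lies in the form domain and satisfies the boundary conditions defining $H$. The variational functional is the map $\psi \mapsto E[\psi]$. A trial state is stationary when its first-order variation vanishes: $\delta E[\psi] = 0$ for every admissible variation $\delta\psi$.

A parametrised trial family is a map $\boldsymbol{\alpha} = (\alpha_1, \dots, \alpha_p) \mapsto \psi_{\boldsymbol{\alpha}}$, and the optimisation reduces to the stationarity equations

$$\frac{\partial}{\partial\alpha_i} E[\psi_{\boldsymbol{\alpha}}] = 0, \qquad i = 1, \dots, p.$$

A linear trial family, the Rayleigh-Ritz case proper, fixes a finite set of basis functions $\phi_1, \dots, \phi_N$ and writes $\psi = \sum_{j=1}^{N} c_j\phi_j$. Define the Hamiltonian matrix $\mathbf{H}_{ij} = \langle\phi_i|H|\phi_j\rangle$ and the overlap (Gram) matrix $\mathbf{S}_{ij} = \langle\phi_i|\phi_j\rangle$. Both are Hermitian; $\mathbf{S}$ is positive definite when the $\phi_j$ are linearly independent. The trial energy is the ratio of quadratic forms $E(c) = c^\dagger\mathbf{H}c \,/\, c^\dagger\mathbf{S}c$, and stationarity in the coefficients gives the generalised eigenvalue problem

$$\mathbf{H}c = E\,\mathbf{S}c.$$

When the basis is orthonormal, $\mathbf{S} = I$ and the generalised problem collapses to the ordinary Hermitian eigenproblem $\mathbf{H}c = Ec$. The non-orthonormal case keeps $\mathbf{S} \neq I$ and is the situation arising whenever the chosen atomic or molecular orbitals overlap, as they generically do.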

Counterexamples to common slips

  • The bound is a bound on the lowest eigenvalue only. $E[\psi] \ge E_0$ for an arbitrary $\psi$ says nothing by itself about excited states; without an orthogonality constraint, the quotient can sit anywhere between $E_0$ and the spectral supremum sampled by $\psi$.
  • The trial state must lie in the form domain, not merely be square-integrable. A discontinuous trial function can give a finite-looking integral for $\langle T\rangle$ when the jump terms are dropped in an integration by parts, while the genuine expectation of the kinetic energy diverges; the bound then fails because $\psi$ was never admissible.
  • Forgetting the overlap matrix when the basis is non-orthonormal produces wrong roots. Diagonalising $\mathbf{H}$ alone, rather than solving $\mathbf{H}c = E\,\mathbf{S}c$, is a frequent and silent error in hand computations with overlapping orbitals.
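The third slip is easy to see in a two-function example. The sketch below assumes a particle in a box on $[0, 1]$ in units $\hbar = m = 1$, with the illustrative polynomial basis $\phi_1 = x(1-x)$ and $\phi_2 = x^2(1-x)^2$; the matrix elements were worked out by hand for this basis and the helper name `secular_roots` is hypothetical.

```python
from fractions import Fraction as F
import math

# H_ij = (1/2) * integral(phi_i' phi_j'), S_ij = integral(phi_i phi_j) on [0, 1],
# for phi1 = x(1-x) and phi2 = x^2 (1-x)^2 (both vanish at the walls).
H = [[F(1, 6), F(1, 30)], [F(1, 30), F(1, 105)]]
S = [[F(1, 30), F(1, 140)], [F(1, 140), F(1, 630)]]

def secular_roots(H, S):
    """Roots of det(H - E S) = 0 for real symmetric 2x2 H and S, ascending."""
    a = S[0][0] * S[1][1] - S[0][1] ** 2
    b = -(H[0][0] * S[1][1] + H[1][1] * S[0][0] - 2 * H[0][1] * S[0][1])
    c = H[0][0] * H[1][1] - H[0][1] ** 2
    disc = math.sqrt(float(b * b - 4 * a * c))
    return sorted([(-float(b) - disc) / (2 * float(a)),
                   (-float(b) + disc) / (2 * float(a))])

ritz = secular_roots(H, S)               # correct: overlap matrix kept
identity = [[F(1), F(0)], [F(0), F(1)]]
naive = secular_roots(H, identity)       # the slip: S silently replaced by the identity
exact_ground = math.pi ** 2 / 2          # exact ground-state energy of the box

print("Ritz values with overlap:", ritz)
print("roots with S dropped:    ", naive)
print("exact ground state:      ", exact_ground)
```

With the overlap kept, the secular equation reduces to $E^2 - 56E + 252 = 0$, whose lower root sits just above $\pi^2/2$. With the overlap dropped, the lowest root falls far below the exact energy, which an honest variational estimate can never do.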

Key theorem with proof Intermediate+

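Theorem (variational bound; Rayleigh-Ritz). Let $H$ be self-adjoint and bounded below with $E_0 = \inf\sigma(H)$. For every nonzero admissible $\psi$,

$$E[\psi] = \frac{\langle\psi|H|\psi\rangle}{\langle\psi|\psi\rangle} \ge E_0,$$

with equality if and only if $\psi$ is a ground eigenstate, $H\psi = E_0\psi$. Moreover the stationary points of the functional $E[\psi]$ over all admissible states are exactly the eigenstates of $H$, and the stationary values are the corresponding eigenvalues.

Proof. Take first the case of a Hamiltonian with purely discrete spectrum and a complete orthonormal eigenbasis $\{\psi_n\}$, $H\psi_n = E_n\psi_n$, $E_0 \le E_1 \le \cdots$. Expand the trial state $\psi = \sum_n c_n\psi_n$ with $c_n = \langle\psi_n|\psi\rangle$. The denominator is $\langle\psi|\psi\rangle = \sum_n |c_n|^2$. The numerator, using $H\psi_n = E_n\psi_n$ and orthonormality, is $\langle\psi|H|\psi\rangle = \sum_n E_n |c_n|^2$. Therefore

$$E[\psi] - E_0 = \frac{\sum_n (E_n - E_0)\,|c_n|^2}{\sum_n |c_n|^2}.$$

Every factor $E_n - E_0 \ge 0$ and every weight $|c_n|^2 \ge 0$, so the right-hand side is a ratio of a nonnegative number to a positive number, hence nonnegative. This gives $E[\psi] \ge E_0$. Equality forces $(E_n - E_0)|c_n|^2 = 0$ for every $n$, so $c_n = 0$ whenever $E_n > E_0$; the surviving coefficients live entirely in the $E_0$-eigenspace, meaning $\psi$ is a ground eigenstate.

For the general lower-semibounded case, replace the eigenbasis sum by the spectral measure of $H$. With $\mu_\psi$ a positive measure supported on $\sigma(H) \subseteq [E_0, \infty)$,

$$E[\psi] = \frac{\int \lambda \, d\mu_\psi(\lambda)}{\int d\mu_\psi(\lambda)} \ge E_0,$$

since $\lambda \ge E_0$ throughout the support. The bound is thus the statement that the average of a quantity never falls below its least value.

For the stationarity claim, vary $\psi \to \psi + \delta\psi$ and impose $\delta E = 0$. Writing the quotient as $E\,\langle\psi|\psi\rangle = \langle\psi|H|\psi\rangle$ and taking the first variation gives $\langle\delta\psi|(H - E)|\psi\rangle + \langle\psi|(H - E)|\delta\psi\rangle = 0$ for all admissible $\delta\psi$. Independence of the real and imaginary parts of $\delta\psi$ forces $(H - E)\psi = 0$. So stationary states satisfy the eigenvalue equation $H\psi = E\psi$, and the stationary value is the eigenvalue.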
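The weighted-average reading of the proof can be checked by brute force. This is a small sketch under stated assumptions: the four-level spectrum is an illustrative choice (oscillator levels with $\hbar\omega = 1$), the trial coefficients are random complex numbers, and `rayleigh_quotient` is a hypothetical helper working directly in the eigenbasis.

```python
import random

def rayleigh_quotient(eigenvalues, coefficients):
    """Rayleigh quotient of psi = sum_n c_n psi_n, written in the eigenbasis of H."""
    numerator = sum(E * abs(c) ** 2 for E, c in zip(eigenvalues, coefficients))
    denominator = sum(abs(c) ** 2 for c in coefficients)
    return numerator / denominator

random.seed(0)
levels = [0.5, 1.5, 2.5, 3.5]   # illustrative spectrum, lowest level first

# Sample many random trial states and record their Rayleigh quotients.
quotients = []
for _ in range(1000):
    c = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in levels]
    quotients.append(rayleigh_quotient(levels, c))

lowest_seen = min(quotients)
ground_only = rayleigh_quotient(levels, [1.0, 0.0, 0.0, 0.0])
print("lowest quotient among random trials:", lowest_seen)
print("quotient of the pure ground state:  ", ground_only)
```

Every sampled quotient lies between the lowest and highest level, and only a state with all its weight on the ground level reaches the floor, which is the equality case of the theorem.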

Bridge. The variational bound builds toward every approximate-energy method in quantum mechanics and quantum chemistry, and it appears again in the Hartree-Fock construction 12.09.03, where the trial state is a Slater determinant and the stationarity condition produces the self-consistent-field equations. The foundational reason the bound holds is the spectral expansion just used: the Rayleigh quotient is a weighted average of eigenvalues, and an average never drops below its least term. This is exactly the structure that makes the method robust: a small admixture of excited states changes the weights only slightly, so the energy estimate is forgiving. Putting these together, the linear case sharpens the bound into the secular equation $\det(\mathbf{H} - E\,\mathbf{S}) = 0$, whose lowest root is the best estimate the chosen basis can produce, and the central insight is that enlarging the basis can only lower that root. The variational principle is dual to perturbation theory 12.07.01: perturbation theory expands around a solvable Hamiltonian and produces a series of unknown sign, while the variational method makes no such expansion and produces a guaranteed one-sided bound. The bridge is that both are stationarity statements about $E$, the first about its Taylor coefficients in a coupling, the second about its extrema over a trial family.

Exercises Intermediate+

Lean formalization Intermediate+

Mathlib carries the Rayleigh quotient and the min-max eigenvalue characterisation for self-adjoint operators, but not the quantum-mechanical variational theorem as a named statement. The intended formalisation reads schematically:

import Mathlib.Analysis.InnerProductSpace.Rayleigh
import Mathlib.Analysis.InnerProductSpace.Spectrum

open InnerProductSpace

variable {𝕜 : Type*} [RCLike 𝕜]
-- `IsSelfAdjoint` on `E →L[𝕜] E` uses the adjoint, which needs `E` to be complete.
variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace 𝕜 E] [CompleteSpace E]

/-- The Rayleigh quotient of a trial vector for a self-adjoint operator
    is bounded below by the bottom of the spectrum, encoded here by `hE0`. -/
theorem variational_bound
    (T : E →L[𝕜] E) (hT : IsSelfAdjoint T)
    (E0 : ℝ) (hE0 : ∀ x, E0 * ‖x‖ ^ 2 ≤ RCLike.re ⟪T x, x⟫_𝕜)
    (ψ : E) (hψ : ψ ≠ 0) :
    E0 ≤ RCLike.re ⟪T ψ, ψ⟫_𝕜 / ‖ψ‖ ^ 2 := by
  sorry  -- divide `hE0 ψ` by ‖ψ‖ ^ 2 > 0; deriving `hE0` from the spectrum is the real work

/-- A local extremum of the Rayleigh quotient is an eigenvector. The converse fails:
    eigenvectors strictly inside the spectrum are stationary points, not local extrema. -/
theorem eigenvector_of_rayleigh_isLocalExtrOn
    (T : E →L[𝕜] E) (hT : IsSelfAdjoint T) (ψ : E) (hψ : ψ ≠ 0)
    (h : IsLocalExtrOn (fun x => RCLike.re ⟪T x, x⟫_𝕜 / ‖x‖ ^ 2) {x | x ≠ 0} ψ) :
    ∃ μ : 𝕜, T ψ = μ • ψ := by
  sorry  -- compare `IsSelfAdjoint.hasEigenvector_of_isLocalExtrOn`, stated on a sphere

The bound for a lower-semibounded self-adjoint operator follows from the spectral theorem already in Mathlib; the gap is packaging it as a trial-function statement. The linear Ritz reduction to a generalised eigenvalue problem with a non-orthonormal overlap matrix, and the MacDonald interleaving theorem, have no Mathlib counterpart and would build on the Courant-Fischer min-max API in Analysis.InnerProductSpace.Spectrum.

Advanced results Master

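The linear method sharpens the existence bound into an algorithm with provable convergence. Fix linearly independent $\phi_1, \dots, \phi_N$, form $\mathbf{H}_{ij} = \langle\phi_i|H|\phi_j\rangle$ and $\mathbf{S}_{ij} = \langle\phi_i|\phi_j\rangle$, and seek stationary points of $E(c) = c^\dagger\mathbf{H}c \,/\, c^\dagger\mathbf{S}c$. The stationarity condition is the generalised Hermitian eigenvalue problem

$$\mathbf{H}c = E\,\mathbf{S}c, \qquad \det(\mathbf{H} - E\,\mathbf{S}) = 0.$$

Since $\mathbf{S}$ is positive definite it admits a Cholesky factor $\mathbf{S} = LL^\dagger$, and the substitution $y = L^\dagger c$ converts the problem to the ordinary Hermitian eigenproblem $\left(L^{-1}\mathbf{H}L^{-\dagger}\right) y = E\,y$ with real spectrum. The roots $\tilde E_0 \le \cdots \le \tilde E_{N-1}$ are the Ritz values and the corresponding $c$ are the Ritz vectors.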
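The Cholesky reduction can be traced by hand for a two-function basis. The sketch below is one way to do it for real symmetric $2 \times 2$ matrices using only the standard library; the numerical entries of `H` and `S` are made-up illustrative values, and both helper names are hypothetical.

```python
import math

def cholesky_2x2(S):
    """Lower-triangular L with S = L L^T for a real symmetric positive-definite 2x2 S."""
    l11 = math.sqrt(S[0][0])
    l21 = S[1][0] / l11
    l22 = math.sqrt(S[1][1] - l21 * l21)
    return l11, l21, l22

def ritz_values_via_cholesky(H, S):
    """Eigenvalues of A = L^{-1} H L^{-T}, the ordinary problem equivalent to H c = E S c."""
    l11, l21, l22 = cholesky_2x2(S)
    # M = L^{-1} = [[1/l11, 0], [-l21/(l11*l22), 1/l22]]
    m11, m21, m22 = 1.0 / l11, -l21 / (l11 * l22), 1.0 / l22
    # A = M H M^T, written out entry by entry (A is symmetric).
    a11 = m11 * H[0][0] * m11
    a12 = m11 * (H[0][0] * m21 + H[0][1] * m22)
    a22 = (m21 * (H[0][0] * m21 + H[0][1] * m22)
           + m22 * (H[1][0] * m21 + H[1][1] * m22))
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return [mean - radius, mean + radius]

# Illustrative overlapping two-orbital problem (values chosen for the example only).
H = [[-1.0, -0.6], [-0.6, -0.8]]
S = [[1.0, 0.5], [0.5, 1.0]]
ritz = ritz_values_via_cholesky(H, S)
print("Ritz values:", ritz)
```

For these entries the secular determinant is $0.75E^2 + 1.2E + 0.44 = 0$, and the two routes should agree to rounding. In practice a library generalised symmetric eigensolver performs the same reduction internally; the hand-rolled version is only meant to expose the steps.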

Theorem (MacDonald 1933; interleaving / bracketing). Let $E_0 \le E_1 \le \cdots$ be the exact eigenvalues of $H$ below the essential spectrum and let $\tilde E_0^{(N)} \le \cdots \le \tilde E_{N-1}^{(N)}$ be the Ritz values in an $N$-dimensional trial subspace $V_N$. Then $E_k \le \tilde E_k^{(N)}$ for $k = 0, \dots, N-1$. If $V_N \subset V_{N+1}$, the Ritz values interlace, $\tilde E_k^{(N+1)} \le \tilde E_k^{(N)} \le \tilde E_{k+1}^{(N+1)}$, so each bound decreases monotonically toward $E_k$ as the basis is enlarged.

This is the spectral-physics content of the Courant-Fischer min-max theorem restricted to a finite subspace: the $k$-th Ritz value equals the min-max of the quadratic form over $(k+1)$-dimensional subspaces of $V_N$, and shrinking the ambient space from $\mathcal{H}$ to $V_N$ can only raise the min-max. The ground-state bound is the case $k = 0$. The higher Ritz values bracket the excited states from above, which is what makes the Ritz method a route to the whole low-lying spectrum, not merely the ground state.
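A toy matrix makes the bracketing and interlacing visible. The sketch below assumes a $3 \times 3$ tridiagonal matrix standing in for the Hamiltonian (a finite-difference kinetic operator, chosen because its eigenvalues are known in closed form) and nested trial subspaces spanned by the first one and two coordinate vectors; none of this setup comes from the original text.

```python
import math

# Illustrative "Hamiltonian": 3x3 tridiagonal matrix with 2 on the diagonal, -1 beside it.
H = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

# Exact eigenvalues of this matrix: 2 - 2 cos(k pi / 4), k = 1, 2, 3.
exact = [2.0 - 2.0 * math.cos(k * math.pi / 4.0) for k in (1, 2, 3)]

def eig_sym_2x2(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] in ascending order."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return [mean - radius, mean + radius]

# Nested trial subspaces V_1 = span(e1) and V_2 = span(e1, e2). With an orthonormal
# basis the Ritz values are the eigenvalues of the leading principal blocks of H.
ritz_1 = [H[0][0]]
ritz_2 = eig_sym_2x2(H[0][0], H[0][1], H[1][1])

print("exact eigenvalues:", exact)
print("Ritz values, N=1: ", ritz_1)
print("Ritz values, N=2: ", ritz_2)
```

Each Ritz value sits at or above the exact eigenvalue with the same index, and the single $N = 1$ value lies between the two $N = 2$ values, as the theorem requires.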

Theorem (quadratic error of the energy). Let $\psi = \psi_0 + \epsilon\chi$ be a trial state, where $\psi_0$ is the exact normalised ground state, $\chi$ is normalised with $\langle\psi_0|\chi\rangle = 0$, and $\epsilon$ is a small real parameter. Then

$$E[\psi] = E_0 + \epsilon^2\left(\langle\chi|H|\chi\rangle - E_0\right) + O(\epsilon^4),$$

so a first-order error in the wavefunction produces only a second-order error in the energy.

The cross term vanishes because $H\psi_0 = E_0\psi_0$ and $H$ is self-adjoint, leaving the leading correction at order $\epsilon^2$. This is the precise statement behind the Beginner-tier remark that a crude guess still gives a good energy; it is also why the method tolerates basis truncation gracefully in practice.
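The second-order scaling is easy to tabulate. The sketch below assumes the exact closed form $(E_0 + \epsilon^2\langle\chi|H|\chi\rangle)/(1 + \epsilon^2)$ for the quotient of $\psi_0 + \epsilon\chi$, and uses illustrative numbers in which $\langle\chi|H|\chi\rangle$ is twice the ground-state energy; the function name is hypothetical.

```python
def energy_with_admixture(E0, E_chi, eps):
    """Rayleigh quotient of psi0 + eps*chi, chi normalised and orthogonal to psi0.

    E_chi stands for <chi|H|chi>; the cross term vanishes, so only eps^2 appears.
    """
    return (E0 + eps ** 2 * E_chi) / (1.0 + eps ** 2)

E0, E_chi = 1.0, 2.0   # illustrative values only
errors = {eps: energy_with_admixture(E0, E_chi, eps) - E0 for eps in (0.2, 0.1, 0.05)}

# Halving eps should cut the energy error by roughly a factor of four.
ratio = errors[0.2] / errors[0.1]
for eps, err in errors.items():
    print(f"eps = {eps:<5} energy error = {err:.6f}")
print(f"error(0.2) / error(0.1) = {ratio:.3f}")
```

With these numbers a ten-percent admixture shifts the energy by about one percent, and the ratio of successive errors stays close to four, the signature of a quadratic dependence on $\epsilon$.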

Theorem (Eckart's lower-bound complement). If $\Delta = E_1 - E_0 > 0$ is the gap to the first excited state and $\psi$ is normalised, then the overlap with the true ground state $\psi_0$ obeys

$$|\langle\psi_0|\psi\rangle|^2 \ge 1 - \frac{E[\psi] - E_0}{\Delta},$$

so a trial energy close to $E_0$ forces the trial state close to $\psi_0$; the bound is informative whenever $E[\psi] < E_1$. This converts an energy bound into a wavefunction bound, the converse direction to the quadratic-error theorem, and is the basis of the Eckart and Temple lower-bound estimates that flank the Rayleigh-Ritz upper bound.
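The overlap bound can be tested against a case where everything is known exactly. The sketch below assumes a particle in a box on $[0, 1]$ with $\hbar = m = 1$ and the normalised parabolic trial $\sqrt{30}\,x(1-x)$, whose energy and overlap with the exact ground state $\sqrt{2}\sin(\pi x)$ can be evaluated in closed form; the variable names are illustrative.

```python
import math

# Exact box levels in units hbar = m = 1: E_n = (n pi)^2 / 2, n = 1, 2, ...
E_ground = math.pi ** 2 / 2.0
E_first_excited = (2.0 * math.pi) ** 2 / 2.0
gap = E_first_excited - E_ground

# Parabolic trial sqrt(30) x (1 - x): kinetic integral 1/6, norm integral 1/30, energy 5.
E_trial = 5.0

# Eckart: |<psi0|psi>|^2 >= 1 - (E_trial - E_ground) / gap
eckart_lower_bound = 1.0 - (E_trial - E_ground) / gap

# Exact overlap: the integral of x (1 - x) sin(pi x) over [0, 1] equals 4 / pi^3.
overlap_squared = (math.sqrt(60.0) * 4.0 / math.pi ** 3) ** 2

print(f"Eckart lower bound on overlap^2: {eckart_lower_bound:.6f}")
print(f"exact overlap^2:                 {overlap_squared:.6f}")
```

The trial energy overshoots by a little over one percent, and the bound already pins the squared overlap above 0.99; the exact overlap is higher still, as it must be.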

The nonlinear case, where parameters enter the trial function in a way that is not a linear combination, as with the helium effective charge $Z_{\mathrm{eff}}$ or a variational width $\alpha$, is handled by the same stationarity principle but loses the clean interleaving structure. The resulting energy is still a rigorous upper bound, but the optimisation surface may have several stationary points, and only the global minimum within the family is guaranteed to beat all members of the family. Hylleraas's 1929 correlated trial for helium, which inserts an explicit dependence on the inter-electron distance $r_{12}$ that no product of one-electron orbitals can capture, drives the helium ground-state error below a part in $10^3$ and stands as the historical demonstration that trial-function ingenuity, not basis size alone, controls accuracy.

Synthesis. The variational principle is the foundational reason approximate quantum mechanics has any rigour at all: the Rayleigh quotient is a weighted average of eigenvalues, so it rests on nothing more exotic than the statement that an average lies above its least term, and this is exactly what converts a guess into a guaranteed one-sided bound. Putting these together, the linear Ritz reduction packages the bound as the secular equation $\det(\mathbf{H} - E\,\mathbf{S}) = 0$, MacDonald's interleaving theorem promotes the single ground-state bound to a tower of bounds on the whole low-lying spectrum, and the quadratic-error theorem explains why the bounds are forgiving, the central insight being that the energy is stationary, not merely continuous, at the truth. The method appears again in the Hartree-Fock and configuration-interaction machinery of electronic structure 12.09.03, where the trial state is a determinant or a determinant expansion and the stationarity condition becomes the self-consistent-field equations; the variational bound is what guarantees that a larger configuration space never raises the computed energy, and the bridge to the chemistry side is precisely the secular equation built in an atomic-orbital basis. The principle generalises beyond ground states through the symmetry-orthogonality device and beyond linear families through nonlinear and correlated ansätze, and it identifies the bottom of the spectrum of a self-adjoint operator with the infimum of a quadratic form over the form domain, the identification that underlies the Friedrichs extension and the modern operator-theoretic definition of a Hamiltonian.

Full proof set Master

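Theorem (variational bound), proof. Given in the Intermediate-tier section: spectral expansion writes $E[\psi] - E_0$ as a ratio of $\sum_n (E_n - E_0)|c_n|^2$ to $\sum_n |c_n|^2$, manifestly nonnegative, with equality iff $\psi$ lies in the $E_0$-eigenspace; the general lower-semibounded case replaces the sum by the spectral measure and uses $\lambda \ge E_0$ on the support.

Proposition (stationary values are eigenvalues). Let $H$ be self-adjoint and $E[\psi] = \langle\psi|H|\psi\rangle / \langle\psi|\psi\rangle$. The first variation under $\psi \to \psi + \delta\psi$ is

$$\delta E = \frac{\langle\delta\psi|(H - E)|\psi\rangle + \langle\psi|(H - E)|\delta\psi\rangle}{\langle\psi|\psi\rangle}.$$

Proof. Differentiate $E\,\langle\psi|\psi\rangle = \langle\psi|H|\psi\rangle$. The variation of the right side is $\langle\delta\psi|H|\psi\rangle + \langle\psi|H|\delta\psi\rangle$; the variation of the left is $\delta E\,\langle\psi|\psi\rangle + E\left(\langle\delta\psi|\psi\rangle + \langle\psi|\delta\psi\rangle\right)$. Equating and solving for $\delta E$ gives the displayed formula. Setting $\delta E = 0$ for all admissible $\delta\psi$, and using that $\delta\psi$ and $i\,\delta\psi$ are both admissible to separate the two conjugate terms, forces $(H - E)\psi = 0$. So $\psi$ is an eigenstate and $E$ its eigenvalue.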

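Theorem (MacDonald interleaving), proof. The $k$-th Ritz value in the trial subspace $V_N$ is, by the Courant-Fischer characterisation applied to the restriction of the quadratic form to $V_N$,

$$\tilde E_k^{(N)} = \min_{\substack{W \subseteq V_N \\ \dim W = k+1}} \; \max_{\substack{\psi \in W \\ \psi \neq 0}} E[\psi].$$

The exact eigenvalue $E_k$ is the same min-max taken over all $(k+1)$-dimensional subspaces of $\mathcal{H}$. Since every $(k+1)$-dimensional $W \subseteq V_N$ is also a $(k+1)$-dimensional subspace of $\mathcal{H}$, the minimisation over the smaller family ranges over fewer subspaces, so its minimum is at least the minimum over all of $\mathcal{H}$: $\tilde E_k^{(N)} \ge E_k$. For the nesting $V_N \subset V_{N+1}$, every $(k+1)$-subspace of $V_N$ is a $(k+1)$-subspace of $V_{N+1}$, so the min-max over $V_{N+1}$ ranges over more subspaces and yields a value no larger: $\tilde E_k^{(N+1)} \le \tilde E_k^{(N)}$. Monotone and bounded below by $E_k$, the Ritz values converge as the basis is enlarged.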

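Theorem (quadratic error), proof. Write $\psi = \psi_0 + \epsilon\chi$ with $\langle\psi_0|\chi\rangle = 0$, $\langle\psi_0|\psi_0\rangle = \langle\chi|\chi\rangle = 1$. The denominator is $\langle\psi|\psi\rangle = 1 + \epsilon^2$. The numerator is $\langle\psi|H|\psi\rangle = E_0 + 2\epsilon\,\mathrm{Re}\,\langle\psi_0|H|\chi\rangle + \epsilon^2\langle\chi|H|\chi\rangle$. The cross term is $\langle\psi_0|H|\chi\rangle = E_0\langle\psi_0|\chi\rangle = 0$ because $H\psi_0 = E_0\psi_0$ and $\langle\psi_0|\chi\rangle = 0$. Hence

$$E[\psi] = \frac{E_0 + \epsilon^2\langle\chi|H|\chi\rangle}{1 + \epsilon^2} = E_0 + \frac{\epsilon^2}{1 + \epsilon^2}\left(\langle\chi|H|\chi\rangle - E_0\right),$$

so the leading correction is $\epsilon^2\left(\langle\chi|H|\chi\rangle - E_0\right)$, second order in $\epsilon$.

Theorem (Eckart overlap bound), proof. Expand normalised $\psi = \sum_n c_n\psi_n$, $\sum_n |c_n|^2 = 1$. Then $E[\psi] - E_0 = \sum_{n \ge 1} (E_n - E_0)|c_n|^2$. Bound the tail using $E_n - E_0 \ge \Delta$ for $n \ge 1$: $E[\psi] - E_0 \ge \Delta\sum_{n \ge 1}|c_n|^2 = \Delta\left(1 - |c_0|^2\right)$. Rearranging, $1 - |c_0|^2 \le (E[\psi] - E_0)/\Delta$, that is $|\langle\psi_0|\psi\rangle|^2 \ge 1 - (E[\psi] - E_0)/\Delta$.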

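Theorem (helium effective-charge estimate), proof. For the product trial $\psi(r_1, r_2) = \phi(r_1)\phi(r_2)$ of normalised hydrogenic $1s$ orbitals with variable charge $Z_{\mathrm{eff}}$, the one-electron expectation of $-\tfrac{1}{2}\nabla^2 - Z/r$ in atomic units is computed from the hydrogenic virial relations: kinetic energy $\tfrac{1}{2}Z_{\mathrm{eff}}^2$ per electron and nuclear attraction $-Z Z_{\mathrm{eff}}$ per electron. The electron-electron repulsion for two $1s$ orbitals is the standard Coulomb integral $\tfrac{5}{8}Z_{\mathrm{eff}}$. Summing over both electrons, $E(Z_{\mathrm{eff}}) = Z_{\mathrm{eff}}^2 - 2Z Z_{\mathrm{eff}} + \tfrac{5}{8}Z_{\mathrm{eff}}$. With $Z = 2$, $E(Z_{\mathrm{eff}}) = Z_{\mathrm{eff}}^2 - \tfrac{27}{8}Z_{\mathrm{eff}}$; the stationarity $dE/dZ_{\mathrm{eff}} = 0$ gives $Z_{\mathrm{eff}} = \tfrac{27}{16}$ and $E = -\left(\tfrac{27}{16}\right)^2 = -\tfrac{729}{256} \approx -2.848$ a.u. Converting at $27.211$ eV per atomic unit gives $E \approx -77.49$ eV.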
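The arithmetic of the helium estimate can be reproduced exactly with rational numbers. The sketch below assumes the energy formula $E = Z_{\mathrm{eff}}^2 - 2ZZ_{\mathrm{eff}} + \tfrac{5}{8}Z_{\mathrm{eff}}$ in hartree; the conversion factor 27.211 eV per hartree and the reference value of about $-79.0$ eV for the measured total energy are assumptions supplied for the comparison, and the names are illustrative.

```python
from fractions import Fraction

def helium_trial_energy(z_eff, Z=Fraction(2)):
    """E(Z_eff) = Z_eff^2 - 2 Z Z_eff + (5/8) Z_eff in hartree, for the 1s^2 product trial."""
    return z_eff ** 2 - 2 * Z * z_eff + Fraction(5, 8) * z_eff

# Stationarity: dE/dZ_eff = 2 Z_eff - 2 Z + 5/8 = 0  =>  Z_eff = Z - 5/16.
best_z_eff = Fraction(2) - Fraction(5, 16)
best_energy_hartree = helium_trial_energy(best_z_eff)

HARTREE_IN_EV = 27.211   # assumed conversion factor, eV per hartree
MEASURED_EV = -79.0      # assumed reference value for the measured total energy

best_energy_ev = float(best_energy_hartree) * HARTREE_IN_EV
relative_error = abs(best_energy_ev - MEASURED_EV) / abs(MEASURED_EV)

print("best Z_eff:          ", best_z_eff)
print("energy in hartree:   ", best_energy_hartree)
print(f"energy in eV:         {best_energy_ev:.2f}")
print(f"relative error:       {relative_error:.3%}")
```

Working in `Fraction` keeps $27/16$ and $-729/256$ exact, so the only rounding enters at the unit conversion. Under the assumed reference value the estimate lands a little under two percent above the measured energy, on the correct side of the bound.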

Connections Master

  • Operators, observables, and Hermiticity 12.02.02. The variational bound is a statement about the spectrum of a self-adjoint Hamiltonian, and its proof rests on the reality of the spectrum and the completeness of the eigenbasis established for Hermitian observables. The vanishing of the variational cross term that makes the energy stationary uses self-adjointness directly: $\langle\psi_0|H|\chi\rangle = \langle H\psi_0|\chi\rangle = E_0\langle\psi_0|\chi\rangle = 0$. Without the Hermiticity framework the Rayleigh quotient would be complex and the bound meaningless.

  • Hilbert-space formalism 12.02.01. The trial state lives in a Hilbert space and the Rayleigh quotient is built from the inner product; the spectral-measure argument that extends the bound from discrete to continuous spectra is the Hilbert-space spectral theorem in action. The form-domain admissibility condition that a valid trial function must satisfy is a Hilbert-space-theoretic constraint, not a merely formal one.

  • Time-independent perturbation theory 12.07.01. The variational method and Rayleigh-Schrödinger perturbation theory are the two pillars of approximate stationary-state quantum mechanics, and they are complementary: perturbation theory expands in a coupling and produces corrections of either sign, while the variational method produces a guaranteed upper bound with no expansion. The second-order perturbation correction to the ground-state energy is never positive, which is itself a shadow of the variational bound: the energy through first order, $E^{(0)} + E^{(1)}$, is the expectation of the full Hamiltonian in the unperturbed ground state, so the exact ground-state energy lies at or below it, exactly as the variational principle requires.

  • Particle in a box 12.04.01. The infinite-well eigenfunctions are the cleanest finite-basis testbed for the Ritz method, and the parabolic-trial exercise shows the quadratic energy insensitivity concretely against an exactly known spectrum. The box eigenstates also serve as a complete orthonormal basis ($\mathbf{S} = I$) in which the linear variational method reduces to ordinary matrix diagonalisation, the simplest instance of the secular equation.

  • Hartree-Fock self-consistent field method 12.09.03. Hartree-Fock is the variational method applied to a Slater-determinant trial wavefunction for a many-electron atom; the self-consistent-field equations are the stationarity conditions derived here, specialised to the determinantal ansatz. The variational bound guarantees that the Hartree-Fock energy is an upper bound to the true ground-state energy, and configuration interaction enlarges the trial space exactly in the Ritz sense, lowering the bound monotonically.

Historical & philosophical context Master

The quotient bound predates quantum mechanics by half a century. Lord Rayleigh, estimating the fundamental frequencies of vibrating strings, membranes, and elastic bodies in the 1870s, observed that an assumed mode shape always yields a frequency at or above the true fundamental, and that the estimate is stationary at the correct mode; the systematic account appears in The Theory of Sound (1877; 2nd ed. 1894, §§88-89) [Rayleigh 1894]. The classical-mechanical content is identical to the quantum statement once the eigenvalue is read as an energy rather than a squared frequency. Walther Ritz supplied the algorithmic upgrade in 1909, reducing a continuous variational problem to a finite linear system by expanding in a fixed basis and solving the resulting determinantal equation, published as a general method for the variational problems of mathematical physics [Ritz 1909]. The pairing of Rayleigh's bound with Ritz's linear algorithm became the Rayleigh-Ritz method.

The transfer to quantum mechanics was immediate once Schrödinger's equation appeared in 1926, since the energy eigenvalue problem is formally the variational problem Rayleigh and Ritz had already solved. Egil Hylleraas's 1929 calculation of the helium ground state [Hylleraas 1929] was the decisive demonstration: by including an explicit dependence on the inter-electron coordinate $r_{12}$ in the trial function, he reached an energy accurate to better than a part in $10^3$, settling the question of whether quantum mechanics could quantitatively account for a two-electron atom. J. K. L. MacDonald's 1933 paper [MacDonald 1933] proved that the successive roots of the Ritz secular equation bracket the exact eigenvalues from above and interleave correctly as the basis grows, giving the method its rigorous convergence theory and extending it from the ground state to the excited spectrum. Landau and Lifshitz present the variational principle compactly in Volume 3, Chapter VI, treating the upper-bound theorem as a brief but load-bearing aside and developing the trial-function estimates in the chapter problems [Landau-Lifshitz 1977]. The method became the computational backbone of quantum chemistry through the Hartree-Fock and configuration-interaction programs, where the secular equation in an atomic-orbital basis is solved in every electronic-structure calculation performed today.

Bibliography Master

@book{Rayleigh1894TheoryOfSound,
  author    = {Strutt, John William (Lord Rayleigh)},
  title     = {The Theory of Sound, Vol. 1},
  edition   = {2},
  publisher = {Macmillan},
  year      = {1894}
}

@article{Ritz1909,
  author  = {Ritz, Walther},
  title   = {{\"U}ber eine neue Methode zur L{\"o}sung gewisser Variationsprobleme der mathematischen Physik},
  journal = {Journal f{\"u}r die reine und angewandte Mathematik},
  volume  = {135},
  year    = {1909},
  pages   = {1--61}
}

@article{Hylleraas1929,
  author  = {Hylleraas, Egil A.},
  title   = {Neue Berechnung der Energie des Heliums im Grundzustande, sowie des tiefsten Terms von Ortho-Helium},
  journal = {Zeitschrift f{\"u}r Physik},
  volume  = {54},
  year    = {1929},
  pages   = {347--366}
}

@article{MacDonald1933,
  author  = {MacDonald, J. K. L.},
  title   = {Successive Approximations by the Rayleigh-Ritz Variation Method},
  journal = {Physical Review},
  volume  = {43},
  year    = {1933},
  pages   = {830--833}
}

@book{LandauLifshitzQM,
  author    = {Landau, L. D. and Lifshitz, E. M.},
  title     = {Quantum Mechanics: Non-Relativistic Theory},
  series    = {Course of Theoretical Physics, Vol. 3},
  edition   = {3},
  publisher = {Pergamon Press},
  year      = {1977}
}

@book{CohenTannoudjiQM,
  author    = {Cohen-Tannoudji, Claude and Diu, Bernard and Lalo{\"e}, Franck},
  title     = {Quantum Mechanics, Vol. 2},
  publisher = {Wiley},
  year      = {1977}
}

@book{GriffithsSchroeter2018,
  author    = {Griffiths, David J. and Schroeter, Darrell F.},
  title     = {Introduction to Quantum Mechanics},
  edition   = {3},
  publisher = {Cambridge University Press},
  year      = {2018}
}

@article{Eckart1930,
  author  = {Eckart, Carl},
  title   = {The Theory of the Helium Atom},
  journal = {Physical Review},
  volume  = {36},
  year    = {1930},
  pages   = {878--892}
}

@article{Temple1928,
  author  = {Temple, George},
  title   = {The Theory of Rayleigh's Principle as Applied to Continuous Systems},
  journal = {Proceedings of the Royal Society of London A},
  volume  = {119},
  year    = {1928},
  pages   = {276--293}
}