Time-independent perturbation theory
Anchor (Master): Kato, Perturbation Theory for Linear Operators (Springer, 1966); Epstein 1916
Intuition [Beginner]
Most quantum mechanics problems cannot be solved exactly. The hydrogen atom is a rare success — you get explicit energy levels and wavefunctions. But add a second electron (helium), place hydrogen in an electric field (Stark effect), or include the electron spin interacting with its orbital motion (fine structure), and closed-form solutions disappear.
Perturbation theory is the standard tool for these situations. Split the Hamiltonian into two pieces, $H = H_0 + \lambda V$, where $H_0$ is exactly solvable and $V$ is a small correction. The parameter $\lambda$ is a bookkeeping device; set $\lambda = 1$ at the end. The energies and eigenstates of $H_0$ are already known. Perturbation theory computes how they shift when you turn on $V$.
Think of it as a Taylor expansion for eigenvalues. Expand the $n$th energy as a power series in $\lambda$: $E_n = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots$. The zeroth-order term $E_n^{(0)}$ is the unperturbed energy you already know. The first-order correction $E_n^{(1)} = \langle n^{(0)}|V|n^{(0)}\rangle$ averages the perturbation over the unperturbed wavefunction.
The state itself also gets corrected. The perturbed state is the original $|n^{(0)}\rangle$ plus small admixtures of every other state $|m^{(0)}\rangle$. The amount of $|m^{(0)}\rangle$ mixed in is proportional to the coupling $\langle m^{(0)}|V|n^{(0)}\rangle$ divided by the energy gap $E_n^{(0)} - E_m^{(0)}$. Strong coupling and small energy gaps produce large corrections.
The second-order energy correction $E_n^{(2)}$ captures how these admixtures feed back into the energy. Each state with $m \neq n$ contributes $|\langle m^{(0)}|V|n^{(0)}\rangle|^2 / (E_n^{(0)} - E_m^{(0)})$. Two features stand out. First, if $E_n^{(0)}$ is the ground state energy, every denominator is negative, so $E_n^{(2)}$ is negative whenever any coupling is nonzero: the second-order correction pushes the ground state down. Second, nearby states contribute the most because the energy gap sits in the denominator.
A crucial restriction: the formulas above require the unperturbed energy level to be non-degenerate. If two distinct states share the same unperturbed energy, the denominator vanishes and the expression diverges. Degenerate levels need a different strategy: diagonalize within the degenerate subspace first, splitting the level, then apply non-degenerate perturbation theory to each resulting state separately.
What does "small" mean quantitatively? The perturbation is small when $|\langle m^{(0)}|V|n^{(0)}\rangle|$ is much less than $|E_n^{(0)} - E_m^{(0)}|$ for all states $m$ that couple appreciably. When this condition holds, the first few terms approximate the true energy well. If the perturbation is comparable to the level spacing, the series converges too slowly to be useful or diverges outright, and the approach fails.
Three physical applications drive the theory. The Stark effect places an atom in a uniform electric field. For hydrogen's ground state, the first-order energy correction vanishes because the wavefunction has definite parity and the field perturbation is odd — the expectation value of an odd operator in an even state is zero.
The Zeeman effect splits degenerate magnetic sub-levels with a magnetic field, each splitting computed as a first-order correction. Hydrogen fine structure (spin-orbit coupling, relativistic kinetic-energy corrections, and the Darwin term) adds perturbative corrections smaller than the Bohr energies by a factor of $\alpha^2$, where $\alpha \approx 1/137$ is the fine-structure constant. Each application starts from the exactly solvable hydrogen atom and adds physically realistic corrections.
The framework is systematic: take a solvable system, add a small perturbation, compute corrections order by order. The rest of this unit develops the formalism, proves the key formulas, works through the Stark effect in detail, and extends the theory to degenerate levels where the standard formulas break down.
Visual [Beginner]
Picture a row of horizontal lines on a number line: these are the unperturbed energy levels $E_n^{(0)}$ from bottom to top. Now turn on a perturbation $\lambda V$. Each level shifts: $E_n^{(0)}$ moves to $E_n^{(0)} + \lambda E_n^{(1)}$ at first order, then receives an additional smaller second-order push. Levels with close neighbours are pushed harder than isolated ones because the energy gap controls the mixing strength.
The perturbed state acquires thin contributions from every other state, visualized as arrows from $|n^{(0)}\rangle$ to each $|m^{(0)}\rangle$ with thickness proportional to $|\langle m^{(0)}|V|n^{(0)}\rangle| / |E_n^{(0)} - E_m^{(0)}|$. Thick arrows connect nearby levels; distant levels contribute thin, nearly invisible arrows.
Worked example [Beginner]
Stark effect for the hydrogen ground state. Place a hydrogen atom in a uniform electric field $\mathcal{E}$ pointing in the $z$-direction. The perturbation is $V = e\mathcal{E}z = e\mathcal{E}r\cos\theta$, where $e$ is the elementary charge and $(r, \theta, \phi)$ are spherical coordinates. The unperturbed ground state is $|1,0,0\rangle$ with energy $E_1^{(0)} = -13.6$ eV.
First-order correction. Compute $E_1^{(1)} = \langle 1,0,0|V|1,0,0\rangle = e\mathcal{E}\langle 1,0,0|z|1,0,0\rangle$. The ground-state wavefunction depends on $r$ alone, so it is spherically symmetric. The operator $z$ is odd under parity ($\mathbf{r} \to -\mathbf{r}$). Integrating an odd function times an even function over all space gives zero. So $E_1^{(1)} = 0$.
The first-order Stark shift vanishes for the ground state. This is a general pattern: any non-degenerate state with definite parity has zero first-order energy shift under a perturbation that is odd under the same parity operation.
Second-order correction. With the first-order term gone, the leading Stark shift comes from $E_1^{(2)}$, which adds contributions from every excited state $|k\rangle$:
$$E_1^{(2)} = \sum_{k \neq (1,0,0)} \frac{|\langle k|V|1,0,0\rangle|^2}{E_1^{(0)} - E_k^{(0)}}.$$
Each nonzero term is negative (every denominator is negative since $E_1^{(0)}$ is the lowest energy), so the ground state is pushed down. The explicit summation over all hydrogen bound states and the continuum gives $E_1^{(2)} = -\tfrac{9}{4}(4\pi\varepsilon_0)\,a_0^3\,\mathcal{E}^2$ in SI units, where $a_0$ is the Bohr radius. The shift is quadratic in the field: the atom develops an induced electric dipole moment proportional to $\mathcal{E}$, and the energy is the dipole-field interaction, proportional to $\mathcal{E}^2$.
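To get a feel for the size of the quadratic Stark shift, the following minimal sketch evaluates it numerically. It assumes the standard SI closed form $E_1^{(2)} = -\tfrac{9}{4}(4\pi\varepsilon_0)a_0^3\mathcal{E}^2$ and the equivalent polarizability $\tfrac{9}{2}(4\pi\varepsilon_0)a_0^3$; the function names and the sample field strengths are illustrative choices, not part of the original text.

```python
import math

# Physical constants in SI units (rounded CODATA values).
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
A0 = 5.29177210903e-11       # Bohr radius, m
E_CHARGE = 1.602176634e-19   # elementary charge in C (numerically J per eV)


def stark_shift_ground_state(field_v_per_m):
    """Second-order Stark shift of the hydrogen ground state, in eV.

    Assumes E2 = -(9/4) * (4*pi*eps0) * a0**3 * field**2.
    """
    shift_joule = -2.25 * (4.0 * math.pi * EPS0) * A0 ** 3 * field_v_per_m ** 2
    return shift_joule / E_CHARGE


def induced_dipole(field_v_per_m):
    """Induced dipole moment p = alpha * field, alpha = (9/2)(4 pi eps0) a0^3."""
    alpha = 4.5 * (4.0 * math.pi * EPS0) * A0 ** 3
    return alpha * field_v_per_m


if __name__ == "__main__":
    for field in (1e5, 1e6, 1e7):
        shift = stark_shift_ground_state(field)
        print(f"field = {field:.0e} V/m  ->  shift = {shift:.3e} eV")
```

Doubling the field quadruples the shift, and even at $10^7$ V/m the shift is many orders of magnitude below the 13.6 eV binding energy, which is why the perturbative treatment is comfortable for laboratory fields.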
Check your understanding [Beginner]
Formal definition [Intermediate+]
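Let $H_0$ be a Hamiltonian with known discrete non-degenerate spectrum: $H_0|n^{(0)}\rangle = E_n^{(0)}|n^{(0)}\rangle$, where $\{|n^{(0)}\rangle\}$ forms an orthonormal basis of the Hilbert space. The full Hamiltonian is
$$H(\lambda) = H_0 + \lambda V,$$
where $V$ is a Hermitian operator (the perturbation) and $\lambda$ is a dimensionless coupling parameter. The goal is to find the eigenvalues and eigenvectors of $H(\lambda)$ as power series in $\lambda$:
$$E_n(\lambda) = \sum_{k \geq 0} \lambda^k E_n^{(k)}, \qquad |n(\lambda)\rangle = \sum_{k \geq 0} \lambda^k |n^{(k)}\rangle.$$
Intermediate normalization. Fix the phase convention by requiring $\langle n^{(0)}|n(\lambda)\rangle = 1$. This implies $\langle n^{(0)}|n^{(k)}\rangle = 0$ for all $k \geq 1$. The norm of $|n(\lambda)\rangle$ is then $1 + O(\lambda^2)$; normalization to unit norm can be imposed later by dividing.
Substitute the series into the time-independent Schrödinger equation $H(\lambda)|n(\lambda)\rangle = E_n(\lambda)|n(\lambda)\rangle$ and collect powers of $\lambda$.
Order $\lambda^0$: $H_0|n^{(0)}\rangle = E_n^{(0)}|n^{(0)}\rangle$, satisfied by assumption.
Order $\lambda^1$:
$$H_0|n^{(1)}\rangle + V|n^{(0)}\rangle = E_n^{(0)}|n^{(1)}\rangle + E_n^{(1)}|n^{(0)}\rangle.$$
Project onto $\langle n^{(0)}|$ and use $\langle n^{(0)}|n^{(1)}\rangle = 0$:
$$E_n^{(1)} = \langle n^{(0)}|V|n^{(0)}\rangle.$$
Expand $|n^{(1)}\rangle = \sum_{m \neq n} c_m^{(1)}|m^{(0)}\rangle$ (no component along $|n^{(0)}\rangle$ by intermediate normalization). Projecting onto $\langle m^{(0)}|$ with $m \neq n$:
$$c_m^{(1)} = \frac{\langle m^{(0)}|V|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}, \qquad |n^{(1)}\rangle = \sum_{m \neq n} \frac{\langle m^{(0)}|V|n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\,|m^{(0)}\rangle.$$
Order $\lambda^2$: Projecting the second-order equation $H_0|n^{(2)}\rangle + V|n^{(1)}\rangle = E_n^{(0)}|n^{(2)}\rangle + E_n^{(1)}|n^{(1)}\rangle + E_n^{(2)}|n^{(0)}\rangle$ onto $\langle n^{(0)}|$ yields
$$E_n^{(2)} = \langle n^{(0)}|V|n^{(1)}\rangle = \sum_{m \neq n} \frac{|\langle m^{(0)}|V|n^{(0)}\rangle|^2}{E_n^{(0)} - E_m^{(0)}}.$$
This sum runs over all unperturbed states $m \neq n$ (including the continuum, if present). Each term is the square of a coupling divided by an energy gap. States close in energy dominate the sum.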
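For a finite-dimensional model the first- and second-order formulas reduce to a short loop over matrix elements. The sketch below is a minimal illustration under that assumption; the function name `rs_corrections`, the nested-list matrix layout `v[m][k]` for $\langle m^{(0)}|V|k^{(0)}\rangle$, and the three-level sample numbers are all hypothetical.

```python
def rs_corrections(e0, v, n):
    """First- and second-order Rayleigh-Schrodinger corrections for level n.

    e0 : list of unperturbed energies (level n must be non-degenerate)
    v  : Hermitian perturbation as nested lists, v[m][k] = <m|V|k>
    Returns (E1, E2, admixtures), where admixtures[m] is the first-order
    coefficient of unperturbed state m in the corrected state.
    """
    e1 = complex(v[n][n]).real          # diagonal element of a Hermitian matrix
    e2 = 0.0
    admixtures = {}
    for m in range(len(e0)):
        if m == n:
            continue
        gap = e0[n] - e0[m]
        if gap == 0:
            raise ValueError(f"level {n} is degenerate with level {m}")
        coupling = v[m][n]
        admixtures[m] = coupling / gap          # coupling over energy gap
        e2 += abs(coupling) ** 2 / gap          # |coupling|^2 over energy gap
    return e1, e2, admixtures


if __name__ == "__main__":
    energies = [0.0, 1.0, 3.0]
    coupling = [[0.2, 0.1, 0.3],
                [0.1, -0.1, 0.05],
                [0.3, 0.05, 0.0]]
    print(rs_corrections(energies, coupling, 0))
```

The explicit `ValueError` mirrors the non-degeneracy hypothesis: the loop refuses to divide by a vanishing gap rather than returning an infinite correction.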
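Validity. The expansion is useful when $|\langle m^{(0)}|V|n^{(0)}\rangle| \ll |E_n^{(0)} - E_m^{(0)}|$ for all $m \neq n$ with appreciable coupling. In particular the level must be non-degenerate, so that no denominator vanishes. When the condition is violated, the series diverges or converges too slowly for practical use, and one must resort to degenerate perturbation theory or non-perturbative methods.
Higher orders. The procedure continues: at order $\lambda^k$, project onto $\langle n^{(0)}|$ to extract $E_n^{(k)}$, then project onto $\langle m^{(0)}|$ for $m \neq n$ to fix the components of $|n^{(k)}\rangle$. Each order depends only on lower-order quantities. Writing $V_{mk} = \langle m^{(0)}|V|k^{(0)}\rangle$, the third-order energy correction is
$$E_n^{(3)} = \sum_{m \neq n}\sum_{k \neq n} \frac{V_{nm}V_{mk}V_{kn}}{(E_n^{(0)} - E_m^{(0)})(E_n^{(0)} - E_k^{(0)})} - V_{nn}\sum_{m \neq n} \frac{|V_{mn}|^2}{(E_n^{(0)} - E_m^{(0)})^2}.$$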
The algebraic complexity grows rapidly with order, but the systematic structure remains the same.
Key theorem with proof [Intermediate+]
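Theorem (Rayleigh-Schrodinger perturbation series). Let $H_0$ have a discrete non-degenerate spectrum $\{E_n^{(0)}\}$ with normalized eigenstates $|n^{(0)}\rangle$. Let $V$ be a Hermitian perturbation with matrix elements $V_{mn} = \langle m^{(0)}|V|n^{(0)}\rangle$. Then for $|\lambda|$ sufficiently small, the perturbed Hamiltonian $H(\lambda) = H_0 + \lambda V$ has eigenvalues and eigenstates given by the formal power series
$$E_n(\lambda) = E_n^{(0)} + \lambda V_{nn} + \lambda^2 \sum_{m \neq n} \frac{|V_{mn}|^2}{E_n^{(0)} - E_m^{(0)}} + O(\lambda^3),$$
$$|n(\lambda)\rangle = |n^{(0)}\rangle + \lambda \sum_{m \neq n} \frac{V_{mn}}{E_n^{(0)} - E_m^{(0)}}\,|m^{(0)}\rangle + O(\lambda^2).$$
Proof. Substitute $E_n(\lambda) = \sum_k \lambda^k E_n^{(k)}$ and $|n(\lambda)\rangle = \sum_k \lambda^k |n^{(k)}\rangle$ into $H(\lambda)|n(\lambda)\rangle = E_n(\lambda)|n(\lambda)\rangle$:
$$(H_0 + \lambda V)\sum_k \lambda^k |n^{(k)}\rangle = \Big(\sum_j \lambda^j E_n^{(j)}\Big)\sum_k \lambda^k |n^{(k)}\rangle.$$
Collect terms at order $\lambda^0$: $H_0|n^{(0)}\rangle = E_n^{(0)}|n^{(0)}\rangle$, the unperturbed eigenvalue equation, satisfied by construction.
At order $\lambda^1$:
$$H_0|n^{(1)}\rangle + V|n^{(0)}\rangle = E_n^{(0)}|n^{(1)}\rangle + E_n^{(1)}|n^{(0)}\rangle.$$
Rearrange: $(H_0 - E_n^{(0)})|n^{(1)}\rangle = (E_n^{(1)} - V)|n^{(0)}\rangle$. Left-multiply by $\langle n^{(0)}|$ and use the self-adjointness of $H_0$ together with the zeroth-order equation to kill the left side: $\langle n^{(0)}|(H_0 - E_n^{(0)})|n^{(1)}\rangle = 0$. This gives $E_n^{(1)} = \langle n^{(0)}|V|n^{(0)}\rangle$.
For the state correction, left-multiply by $\langle m^{(0)}|$ with $m \neq n$: $(E_m^{(0)} - E_n^{(0)})\langle m^{(0)}|n^{(1)}\rangle = -V_{mn}$. The non-degeneracy hypothesis allows division, giving the first-order state coefficients $\langle m^{(0)}|n^{(1)}\rangle = V_{mn}/(E_n^{(0)} - E_m^{(0)})$.
At order $\lambda^2$:
$$H_0|n^{(2)}\rangle + V|n^{(1)}\rangle = E_n^{(0)}|n^{(2)}\rangle + E_n^{(1)}|n^{(1)}\rangle + E_n^{(2)}|n^{(0)}\rangle.$$
Left-multiply by $\langle n^{(0)}|$. The $H_0$ and $E_n^{(0)}$ terms cancel by the same argument as above, and the $E_n^{(1)}$ term drops out under intermediate normalization, $\langle n^{(0)}|n^{(1)}\rangle = 0$. What remains is
$$E_n^{(2)} = \langle n^{(0)}|V|n^{(1)}\rangle.$$
Substituting the first-order state and using the expression for $\langle m^{(0)}|n^{(1)}\rangle$ yields the second-order energy formula $E_n^{(2)} = \sum_{m \neq n} |V_{mn}|^2/(E_n^{(0)} - E_m^{(0)})$.
Corollary (Hellmann-Feynman). If $H(\lambda) = H_0 + \lambda V$, then $dE_n/d\lambda = \langle n^{(0)}|V|n^{(0)}\rangle = E_n^{(1)}$ at $\lambda = 0$. The first-order perturbation formula is the first term in the Taylor expansion of $E_n(\lambda)$. This connects perturbation theory to the variational principle: the first-order energy is the derivative of the eigenvalue with respect to the coupling.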
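The corollary can be checked numerically on a small model. The following sketch assumes a real symmetric two-level problem whose lower eigenvalue is known in closed form; the matrix entries (`v11`, `v22`, `v12`) and the helper names are hypothetical values chosen for illustration. A central finite difference of the exact eigenvalue at $\lambda = 0$ should reproduce the diagonal element $V_{11}$.

```python
import math


def lowest_eigenvalue(lam, e1=0.0, e2=1.0, v11=0.3, v22=-0.2, v12=0.4):
    """Exact lower eigenvalue of H0 + lam*V for a real symmetric 2x2 problem.

    H0 = diag(e1, e2) and V = [[v11, v12], [v12, v22]].
    """
    a = e1 + lam * v11
    d = e2 + lam * v22
    b = lam * v12
    return 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)


def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    slope = central_difference(lowest_eigenvalue, 0.0)
    print(f"dE/dlambda at 0 = {slope:.8f}  (first-order formula gives V11 = 0.3)")
```

The off-diagonal element `v12` does not affect the slope at $\lambda = 0$; it enters only at second order.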
Bridge. The Rayleigh-Schrodinger series builds toward 12.07.02 time-dependent perturbation theory, where the same matrix elements govern transition rates between unperturbed states, and appears again in 14.04.01 pending for the helium atom's first-order electron-electron repulsion correction. The foundational reason perturbation theory works at all is that the spectrum of $H_0$ is discrete and isolated; this is exactly the hypothesis of the Kato-Rellich theorem (Master tier), which guarantees the perturbed eigenvalue is an analytic function of $\lambda$. Putting these together, the power-series ansatz is not merely formal: the central insight is that analyticity in $\lambda$ converts the eigenvalue problem into a hierarchy of linear equations solvable order by order, and the bridge is between the algebraic recursion derived here and the resolvent-based analytic continuation that justifies it.
Worked example: the two-level system
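Consider $H_0 = \begin{pmatrix} E_1 & 0 \\ 0 & E_2 \end{pmatrix}$ with $E_1 < E_2$ and perturbation $V = \begin{pmatrix} 0 & v \\ v & 0 \end{pmatrix}$ ($v$ real for simplicity). Write $\Delta = E_2 - E_1$.
First-order: $E_1^{(1)} = V_{11} = 0$, $E_2^{(1)} = V_{22} = 0$. The diagonal elements of $V$ are zero, so the first-order energy shift vanishes.
Second-order:
$$E_1^{(2)} = \frac{v^2}{E_1 - E_2} = -\frac{v^2}{\Delta}, \qquad E_2^{(2)} = \frac{v^2}{E_2 - E_1} = +\frac{v^2}{\Delta}.$$
The lower level is pushed down and the upper level is pushed up: the perturbation repels the two levels. This is a general feature: second-order corrections tend to increase the spacing between coupled levels.
Exact comparison: The full Hamiltonian $H_0 + \lambda V$ has eigenvalues
$$E_\pm = \frac{E_1 + E_2}{2} \pm \sqrt{\Big(\frac{\Delta}{2}\Big)^2 + \lambda^2 v^2}.$$
Expanding the square root for small $\lambda$ recovers $E_\pm \approx \frac{E_1 + E_2}{2} \pm \big(\frac{\Delta}{2} + \frac{\lambda^2 v^2}{\Delta}\big)$, confirming the second-order result. The exact answer also shows that the perturbative expansion breaks down when $2|\lambda v| \gtrsim \Delta$.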
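The comparison between the exact and second-order energies can be tabulated directly. This is a minimal sketch assuming the two-level model $H_0 = \mathrm{diag}(E_1, E_2)$ with a real off-diagonal coupling $v$; the function names and the sample parameters are illustrative only.

```python
import math


def exact_levels(lam, e1, e2, v):
    """Exact eigenvalues of diag(e1, e2) + lam * [[0, v], [v, 0]]."""
    mean = 0.5 * (e1 + e2)
    root = math.hypot(0.5 * (e2 - e1), lam * v)
    return mean - root, mean + root


def second_order_levels(lam, e1, e2, v):
    """Energies through second order: each level shifts by (lam*v)^2 / gap."""
    shift = (lam * v) ** 2 / (e2 - e1)
    return e1 - shift, e2 + shift


if __name__ == "__main__":
    e1, e2, v = 0.0, 1.0, 1.0
    for lam in (0.01, 0.1, 0.3, 0.5, 1.0):
        lower_exact, _ = exact_levels(lam, e1, e2, v)
        lower_pt, _ = second_order_levels(lam, e1, e2, v)
        print(f"lam={lam:4.2f}  exact={lower_exact:+.6f}  "
              f"2nd-order={lower_pt:+.6f}  error={lower_pt - lower_exact:+.2e}")
```

For small coupling the error scales as $\lambda^4$ (the next nonvanishing order), while for $2|\lambda v|$ comparable to the gap the second-order estimate visibly overshoots the exact repulsion.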
Exercises [Intermediate+]
Lean formalization [Intermediate+]
Mathlib does not yet cover quantum perturbation theory. The closest layers are:
- Mathlib.Analysis.InnerProductSpace.Spectrum: spectral theory for bounded operators on Hilbert spaces.
- Mathlib.Analysis.Calculus.Deriv: derivatives of functions between normed spaces, needed for analytic perturbation theory.
- Mathlib.LinearAlgebra.Matrix.GeneralLinearGroup: invertible matrices, relevant to the finite-dimensional case.
There is no Mathlib definition of "Rayleigh-Schrodinger perturbation series", no Kato-Rellich theorem, no resolvent-based expansion of eigenvalues, and no formalization of the intermediate normalization convention. The formalisation pathway is outlined in lean_mathlib_gap. The lean_status: none reflects this gap; no lean_module ships with this unit. Tyler's review attests intermediate-tier correctness.
Degenerate perturbation theory [Master]
The non-degenerate formulas break down when the unperturbed energy $E_n^{(0)}$ is $g$-fold degenerate: there exist orthonormal states $|n_a^{(0)}\rangle$ ($a = 1, \ldots, g$) with $H_0|n_a^{(0)}\rangle = E_n^{(0)}|n_a^{(0)}\rangle$. The energy denominator vanishes within the degenerate subspace, and the first-order state correction diverges.
The resolution is to choose the right starting states before applying perturbation theory. Within the degenerate subspace $\mathcal{D} = \mathrm{span}\{|n_a^{(0)}\rangle\}$, form the $g \times g$ matrix $W$ with entries
$$W_{ab} = \langle n_a^{(0)}|V|n_b^{(0)}\rangle.$$
Diagonalize $W$: solve the secular equation
$$\det\big(W - E^{(1)} I\big) = 0.$$
The eigenvalues $E^{(1)}$ are the first-order energy corrections. The corresponding eigenvectors are the correct zeroth-order states: the particular linear combinations of degenerate states that perturbation theory selects as the physical starting points.
Theorem (first-order degenerate perturbation theory). Let $H_0$ have a $g$-fold degenerate eigenvalue $E_n^{(0)}$ with orthonormal eigenstates $|n_a^{(0)}\rangle$, $a = 1, \ldots, g$. The first-order energy corrections are the eigenvalues of the Hermitian matrix $W_{ab} = \langle n_a^{(0)}|V|n_b^{(0)}\rangle$, and the correct zeroth-order states are the corresponding eigenvectors.
Proof. Within the degenerate subspace $\mathcal{D}$, write a general zeroth-order state as $|\phi\rangle = \sum_b c_b |n_b^{(0)}\rangle$. The first-order equation $(H_0 - E_n^{(0)})|n^{(1)}\rangle = (E^{(1)} - V)|\phi\rangle$ is solvable only if its right side has no component in $\mathcal{D}$, because $\langle n_a^{(0)}|(H_0 - E_n^{(0)}) = 0$ kills the left side for every state in $\mathcal{D}$. Projecting onto $\langle n_a^{(0)}|$ gives $\sum_b W_{ab} c_b = E^{(1)} c_a$, the eigenvalue equation $W\mathbf{c} = E^{(1)}\mathbf{c}$. Since $V$ is Hermitian, $W$ is Hermitian and has real eigenvalues with orthonormal eigenvectors.
If all eigenvalues of $W$ are distinct, the degeneracy is completely lifted at first order. If some eigenvalues coincide, the degeneracy is partially lifted, and higher-order degenerate perturbation theory applies within the remaining degenerate subspaces.
Once the correct zeroth-order states are found, higher-order corrections follow the same pattern as non-degenerate theory, with the sums over $m$ now excluding only the states within the already-split degenerate subspace (those states are accounted for by the diagonalization of $W$).
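Example: Stark effect for hydrogen $n = 2$. The $n = 2$ shell of hydrogen is four-fold degenerate: $|2,0,0\rangle$, $|2,1,0\rangle$, $|2,1,1\rangle$, $|2,1,-1\rangle$. The perturbation $V = e\mathcal{E}z$ has selection rules $\Delta m = 0$, $\Delta l = \pm 1$. Within the $n = 2$ manifold, the only nonzero off-diagonal matrix element is
$$\langle 2,0,0|V|2,1,0\rangle = -3e\mathcal{E}a_0.$$
The states $|2,1,\pm 1\rangle$ are decoupled from all others by the $\Delta m = 0$ rule. The secular equation reduces to a $2 \times 2$ block:
$$\det\begin{pmatrix} -E^{(1)} & -3e\mathcal{E}a_0 \\ -3e\mathcal{E}a_0 & -E^{(1)} \end{pmatrix} = 0,$$
with eigenvalues $E^{(1)} = \pm 3e\mathcal{E}a_0$ and eigenvectors $\frac{1}{\sqrt{2}}\big(|2,0,0\rangle \mp |2,1,0\rangle\big)$. The four-fold degenerate level splits into three: one shifted up, one shifted down (linearly in the field), and two unshifted ($|2,1,\pm 1\rangle$). This is the linear Stark effect, possible only for degenerate states, in contrast to the ground state's quadratic Stark effect.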
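The $2 \times 2$ secular problem of the $n = 2$ Stark effect can be diagonalized in a few lines. The sketch below assumes the standard matrix element $\langle 2,0,0|V|2,1,0\rangle = -3e\mathcal{E}a_0$ and works in units of $e\mathcal{E}a_0$, with basis order $(|2,0,0\rangle, |2,1,0\rangle)$; the helper `symmetric_2x2_eigen` is a hypothetical name, and the overall sign of each eigenvector is a convention.

```python
import math


def symmetric_2x2_eigen(a, b, d):
    """Eigenpairs of the real symmetric matrix [[a, b], [b, d]] with b != 0.

    Returns [(lower_value, vector), (upper_value, vector)], vectors normalized.
    """
    if b == 0:
        raise ValueError("matrix is already diagonal")
    mean = 0.5 * (a + d)
    root = math.hypot(0.5 * (a - d), b)
    pairs = []
    for value in (mean - root, mean + root):
        # First row of (W - value) c = 0 gives c = (b, value - a).
        x, y = b, value - a
        norm = math.hypot(x, y)
        pairs.append((value, (x / norm, y / norm)))
    return pairs


if __name__ == "__main__":
    # W block in units of e * field * a0.
    for value, (c200, c210) in symmetric_2x2_eigen(0.0, -3.0, 0.0):
        print(f"E1 = {value:+.1f}   state = {c200:+.4f}|2,0,0> {c210:+.4f}|2,1,0>")
```

The lower eigenvalue belongs to the symmetric combination and the upper one to the antisymmetric combination, each an equal-weight mixture of the $2s$ and $2p_0$ states.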
The variational principle and perturbation theory [Master]
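The Rayleigh-Ritz variational principle states that for any normalized trial state $|\psi\rangle$,
$$\langle\psi|H|\psi\rangle \geq E_0,$$
where $E_0$ is the exact ground state energy. This gives an upper bound on $E_0$, non-perturbative and valid regardless of the size of the perturbation.
Perturbation theory complements this. For the ground state, the first-order estimate $E_0^{(0)} + \lambda E_0^{(1)} = \langle 0^{(0)}|H|0^{(0)}\rangle$ is itself a Rayleigh-Ritz value and hence an upper bound, and the second-order correction always lowers it, since $E_0^{(2)} \leq 0$. The second-order truncation is not a rigorous lower bound in general, because higher-order terms can have either sign, but it moves the estimate in the right direction:
$$E_0 \leq E_0^{(0)} + \lambda E_0^{(1)}, \qquad E_0^{(0)} + \lambda E_0^{(1)} + \lambda^2 E_0^{(2)} \leq E_0^{(0)} + \lambda E_0^{(1)}.$$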
The variational principle applies directly only to the ground state (excited-state bounds require orthogonalization constraints). Perturbation theory, by contrast, gives systematic expansions for all states but requires the perturbation to be small. In practice, the two methods are often combined: a variational calculation provides the trial state, and perturbation theory corrects it.
For systems where the perturbation is not small (strongly correlated electrons, deep quantum wells), the perturbative expansion diverges and only variational or other non-perturbative methods apply. The Hellmann-Feynman theorem (Exercise 8) provides a bridge: it is exact for any eigenstate, but its usefulness in computation relies on having a good approximation to the state.
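The upper-bound property can be seen in a toy calculation. The sketch below assumes a two-level model with real coupling and a one-parameter trial state $\cos\theta\,|1\rangle + \sin\theta\,|2\rangle$; the parameter values and function names are illustrative. Every trial energy lies at or above the exact ground energy, the choice $\theta = 0$ reproduces the first-order perturbative estimate, and scanning $\theta$ recovers the exact value because the trial family spans the whole two-dimensional space.

```python
import math


def rayleigh_quotient(theta, lam, e1, e2, v):
    """<psi|H|psi> for psi = cos(theta)|1> + sin(theta)|2>."""
    c, s = math.cos(theta), math.sin(theta)
    return c * c * e1 + s * s * e2 + 2.0 * lam * v * c * s


def exact_ground(lam, e1, e2, v):
    """Exact lower eigenvalue of diag(e1, e2) + lam * [[0, v], [v, 0]]."""
    return 0.5 * (e1 + e2) - math.hypot(0.5 * (e2 - e1), lam * v)


def scan_trial_states(lam, e1, e2, v, points=2001):
    """Trial energies on a uniform grid of theta in [-pi/2, pi/2]."""
    step = math.pi / (points - 1)
    return [rayleigh_quotient(-0.5 * math.pi + k * step, lam, e1, e2, v)
            for k in range(points)]


if __name__ == "__main__":
    lam, e1, e2, v = 0.5, 0.0, 1.0, 1.0
    energies = scan_trial_states(lam, e1, e2, v)
    print("first-order estimate (theta = 0):", rayleigh_quotient(0.0, lam, e1, e2, v))
    print("best variational energy         :", min(energies))
    print("exact ground energy             :", exact_ground(lam, e1, e2, v))
```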
The Kato-Rellich theorem and convergence [Master]
The formal power series derived above is meaningful only if it converges (or at least provides a useful asymptotic expansion). The rigorous foundation is the Kato-Rellich theorem [Kato 1966].
Theorem (Rellich, 1937; Kato, 1949). Let $H_0$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$ with an isolated eigenvalue $E_n^{(0)}$ of finite multiplicity. Let $V$ be symmetric and relatively bounded with respect to $H_0$ with relative bound less than 1. Then there exists $r > 0$ such that for $|\lambda| < r$, the operator $H(\lambda) = H_0 + \lambda V$ has an eigenvalue $E_n(\lambda)$ that is an analytic function of $\lambda$, with $E_n(0) = E_n^{(0)}$.
The relative boundedness condition means there exist constants $a < 1$ and $b \geq 0$ such that $\|V\psi\| \leq a\|H_0\psi\| + b\|\psi\|$ for all $\psi$ in the domain of $H_0$. This is a regularity condition that excludes perturbations that are too singular.
The theorem guarantees that the perturbation series converges for small enough $|\lambda|$, not merely that it is asymptotic. The radius of convergence depends on the distance from $E_n^{(0)}$ to the rest of the spectrum: for bounded $V$, $r \geq d/(2\|V\|)$, where $d$ is the spectral gap. When the spectral gap is small, the radius of convergence is small, and perturbation theory requires many orders.
For degenerate eigenvalues, Kato's theorem generalizes: the degenerate level splits into analytic branches, and the splitting at first order is given by the secular equation above. The eigenvalues may undergo avoided crossings as $\lambda$ varies: they approach each other but do not cross, a phenomenon connected to the Wigner-von Neumann non-crossing rule.
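A finite radius of convergence is easy to observe in the two-level model, where the full series is known: the lower level is $\frac{E_1+E_2}{2} - \frac{\Delta}{2}\sqrt{1+x}$ with $x = (2\lambda v/\Delta)^2$, and the binomial series of the square root converges only for $|x| \leq 1$. The sketch below is an illustration under those assumptions, with hypothetical function names. For this model the gap-over-norm estimate $\Delta/(2|v|)$ happens to coincide with the true radius; in general such an estimate is only a lower bound.

```python
import math


def binomial_half(k):
    """Generalized binomial coefficient C(1/2, k)."""
    coeff = 1.0
    for j in range(k):
        coeff *= (0.5 - j) / (j + 1)
    return coeff


def series_lower_level(lam, e1, e2, v, order):
    """Partial sum of the perturbation series through lam**(2*order)."""
    delta = e2 - e1
    x = (2.0 * lam * v / delta) ** 2
    root = sum(binomial_half(k) * x ** k for k in range(order + 1))
    return 0.5 * (e1 + e2) - 0.5 * delta * root


def exact_lower_level(lam, e1, e2, v):
    return 0.5 * (e1 + e2) - math.hypot(0.5 * (e2 - e1), lam * v)


if __name__ == "__main__":
    e1, e2, v = 0.0, 1.0, 1.0          # radius of convergence: |lam| = 0.5
    for lam in (0.4, 0.6):
        exact = exact_lower_level(lam, e1, e2, v)
        for order in (5, 20, 40):
            approx = series_lower_level(lam, e1, e2, v, order)
            print(f"lam={lam}  order={order:2d}  error={approx - exact:+.3e}")
```

Inside the radius the error shrinks with each added order; outside it the partial sums grow without bound even though the exact eigenvalue remains perfectly finite.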
The Wigner-von Neumann non-crossing rule [Master]
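For a one-parameter family of real-symmetric (or Hermitian) matrices $H(\lambda)$, two eigenvalues $E_1(\lambda)$ and $E_2(\lambda)$ that are distinct at $\lambda = 0$ generically avoid crossing as $\lambda$ increases. The repulsion is a consequence of dimensional counting: in the two-dimensional subspace of the nearly degenerate pair, a crossing requires $H_{11} = H_{22}$ (one real condition) and $H_{12} = 0$ (one additional real condition for real-symmetric matrices, two for complex Hermitian matrices, whose off-diagonal element has independent real and imaginary parts). In a one-parameter family, satisfying two or more independent conditions simultaneously is generically impossible.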
This is visible in the two-level system: $E_+ - E_- = \sqrt{\Delta^2 + 4\lambda^2 v^2}$, with minimum gap $\Delta$ at $\lambda = 0$, so the levels approach but never cross. Perturbation theory captures the leading order of this repulsion: the second-order correction pushes the levels apart.
For systems with additional symmetries (e.g., different angular momentum quantum numbers), crossings can and do occur because the symmetry forbids coupling between the two states, setting the off-diagonal matrix element to zero identically. The non-crossing rule applies only to levels that are coupled by the perturbation.
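The contrast between coupled and symmetry-decoupled levels shows up in the simplest one-parameter family. The sketch below uses the hypothetical family $H(t) = \begin{pmatrix} t & c \\ c & -t \end{pmatrix}$, chosen for illustration, whose diagonal entries cross at $t = 0$; the eigenvalues are $\pm\sqrt{t^2 + c^2}$.

```python
import math


def gap(t, coupling):
    """Eigenvalue gap of the family H(t) = [[t, c], [c, -t]]."""
    return 2.0 * math.hypot(t, coupling)


def minimum_gap(coupling, t_values):
    """Smallest gap encountered along the given parameter sweep."""
    return min(gap(t, coupling) for t in t_values)


if __name__ == "__main__":
    sweep = [k / 100.0 for k in range(-200, 201)]
    print("coupled   (c = 0.3):", minimum_gap(0.3, sweep))   # avoided crossing
    print("uncoupled (c = 0.0):", minimum_gap(0.0, sweep))   # true crossing
```

With any nonzero coupling the gap never drops below $2|c|$; only when a symmetry forces $c = 0$ do the two levels actually meet.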
Brillouin-Wigner perturbation theory [Master]
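An alternative to the Rayleigh-Schrodinger expansion is Brillouin-Wigner perturbation theory, in which the exact energy $E_n$ appears in the denominators. With $V_{mk} = \langle m^{(0)}|V|k^{(0)}\rangle$, the first few iterations give
$$E_n = E_n^{(0)} + \lambda V_{nn} + \lambda^2 \sum_{m \neq n} \frac{|V_{mn}|^2}{E_n - E_m^{(0)}} + \lambda^3 \sum_{m \neq n}\sum_{k \neq n} \frac{V_{nm}V_{mk}V_{kn}}{(E_n - E_m^{(0)})(E_n - E_k^{(0)})} + \cdots$$
The advantage is that the denominators use the exact $E_n$ rather than $E_n^{(0)}$, which can improve convergence for larger perturbations. The disadvantage is that the equation is implicit ($E_n$ appears on both sides) and must be solved iteratively. For sufficiently small $\lambda$, both formulations agree order by order. The Brillouin-Wigner form is the starting point for many-body perturbation theory in condensed-matter and nuclear physics.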
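The implicit character of the Brillouin-Wigner equation can be illustrated by fixed-point iteration. The sketch below assumes the two-level model with zero diagonal perturbation, for which the second-order Brillouin-Wigner equation $E = E_1 + \lambda^2 v^2/(E - E_2)$ happens to coincide with the exact characteristic equation; that coincidence is special to two levels. The function names and the iteration scheme are illustrative choices, not a prescribed algorithm.

```python
import math


def brillouin_wigner_lower(lam, e1, e2, v, iterations=200, tol=1e-14):
    """Solve E = e1 + (lam*v)**2 / (E - e2) by fixed-point iteration."""
    energy = e1                          # start from the unperturbed level
    for _ in range(iterations):
        updated = e1 + (lam * v) ** 2 / (energy - e2)
        if abs(updated - energy) < tol:
            return updated
        energy = updated
    return energy


def exact_lower(lam, e1, e2, v):
    return 0.5 * (e1 + e2) - math.hypot(0.5 * (e2 - e1), lam * v)


if __name__ == "__main__":
    lam, e1, e2, v = 1.0, 0.0, 1.0, 1.0
    print("Brillouin-Wigner (iterated):", brillouin_wigner_lower(lam, e1, e2, v))
    print("exact                      :", exact_lower(lam, e1, e2, v))
    print("Rayleigh-Schrodinger, 2nd  :", e1 - (lam * v) ** 2 / (e2 - e1))
```

At this coupling the explicit second-order Rayleigh-Schrodinger estimate is far off, while the self-consistent denominator brings the iterated value onto the exact eigenvalue.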
Resolvent formalism and the spectral projection [Master]
The rigorous treatment of perturbation theory uses the resolvent of the Hamiltonian, $R(z) = (H - z)^{-1}$, defined for $z$ in the resolvent set $\rho(H) = \mathbb{C} \setminus \sigma(H)$. For the unperturbed Hamiltonian $H_0$ with isolated eigenvalue $E_n^{(0)}$, choose a contour $\Gamma$ in the complex plane enclosing $E_n^{(0)}$ but no other point of $\sigma(H_0)$. The spectral projection onto the eigenspace of $E_n^{(0)}$ is
$$P_0 = -\frac{1}{2\pi i}\oint_\Gamma R_0(z)\,dz,$$
where $R_0(z) = (H_0 - z)^{-1}$.
When $H(\lambda) = H_0 + \lambda V$, the resolvent expands as a Neumann series:
$$R(z, \lambda) = (H_0 + \lambda V - z)^{-1} = R_0(z)\sum_{k=0}^{\infty}\big(-\lambda V R_0(z)\big)^k,$$
convergent for $|\lambda|\,\|V R_0(z)\| < 1$. The perturbed spectral projection is
$$P(\lambda) = -\frac{1}{2\pi i}\oint_\Gamma R(z, \lambda)\,dz,$$
and for $|\lambda|$ small enough, $\Gamma$ encloses exactly one eigenvalue $E_n(\lambda)$ of $H(\lambda)$ with $E_n(0) = E_n^{(0)}$. The perturbed eigenvalue is recovered by $E_n(\lambda) = \mathrm{Tr}\big[H(\lambda)P(\lambda)\big]$.
Theorem (analyticity of the perturbed eigenvalue). Under the hypotheses of the Kato-Rellich theorem, $P(\lambda)$ and $E_n(\lambda)$ are analytic functions of $\lambda$ in a disk $|\lambda| < r$. The Taylor coefficients of $E_n(\lambda)$ at $\lambda = 0$ are the Rayleigh-Schrodinger corrections.
Proof sketch. Analyticity of $R(z, \lambda)$ in $\lambda$ follows from the Neumann series: each term is analytic in $\lambda$, and uniform convergence on compact subsets preserves analyticity. The contour integral of an analytic function is analytic, giving analyticity of $P(\lambda)$. The trace $\mathrm{Tr}[H(\lambda)P(\lambda)]$ is then analytic as a product of analytic functions. Uniqueness follows from discreteness of the spectrum inside $\Gamma$: for $\lambda = 0$ the enclosed eigenvalue is $E_n^{(0)}$ by construction, and analyticity plus continuity prevents eigenvalue exchange across the contour.
The resolvent approach has two advantages over the direct power-series substitution. First, it yields explicit radius-of-convergence estimates via the Neumann-series bound. Second, it extends to degenerate eigenvalues without modification: the contour encloses the entire degenerate cluster, and $P(\lambda)$ captures all perturbed eigenvalues at once, splitting them through the eigenvalues of $H(\lambda)$ restricted to the perturbed subspace.
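The contour-integral construction can be carried out numerically for a small matrix. The sketch below assumes the sign convention $R(z) = (H - z)^{-1}$ with $P = -\frac{1}{2\pi i}\oint R(z)\,dz$, a circular contour discretized by the trapezoid rule, and a $2 \times 2$ example matrix; all names and numbers are illustrative. The trace of $P$ counts the enclosed eigenvalues and $\mathrm{Tr}[HP]$ returns the enclosed eigenvalue.

```python
import cmath
import math


def resolvent(h, z):
    """(H - z)^(-1) for a 2x2 matrix h given as ((a, b), (c, d))."""
    a, b = h[0][0] - z, h[0][1]
    c, d = h[1][0], h[1][1] - z
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def spectral_projection(h, center, radius, points=400):
    """P = -(1 / (2 pi i)) * contour integral of (H - z)^(-1) dz on a circle."""
    p = [[0j, 0j], [0j, 0j]]
    for k in range(points):
        phase = cmath.exp(2j * math.pi * k / points)
        r = resolvent(h, center + radius * phase)
        for i in range(2):
            for j in range(2):
                # dz = i * radius * phase * dtheta; the i and 2*pi factors cancel.
                p[i][j] -= r[i][j] * radius * phase / points
    return p


def trace_product(h, p):
    """Tr[H P] for 2x2 matrices."""
    return sum(h[i][j] * p[j][i] for i in range(2) for j in range(2))


if __name__ == "__main__":
    h = ((0.0, 0.2), (0.2, 1.0))       # H0 = diag(0, 1) plus coupling 0.2
    p = spectral_projection(h, center=0.0, radius=0.5)
    print("Tr P   =", (p[0][0] + p[1][1]).real)
    print("Tr H P =", trace_product(h, p).real)
    print("exact  =", 0.5 - math.hypot(0.5, 0.2))
```

Because the integrand is analytic in a neighbourhood of the circle, the trapezoid rule converges geometrically, so a few hundred points already give the projection to machine precision.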
Synthesis. The formalism developed in this unit is the foundational reason that most quantitative predictions in atomic, molecular, and condensed-matter physics are computable at all: the central insight is that an exactly solvable reference system plus a small correction yields a systematic expansion for every observable. The Rayleigh-Schrodinger series identifies the first-order energy correction with a simple expectation value and the second-order correction with a sum-over-states formula, and this is exactly the structure that generalises to time-dependent transitions 12.07.02, to the many-body diagrams of Goldstone perturbation theory, and to the Brillouin-Wigner implicit summation. The degenerate extension identifies the correct diagonalisation of the perturbation within the degenerate subspace as the bridge between the vanishing-denominator pathology and the physical level-splitting observed in the Stark and Zeeman effects. Putting these together with the Kato-Rellich theorem, the resolvent formalism provides the analytic foundation: perturbation theory is not a formal trick but a convergent expansion whose radius is controlled by the spectral gap. The pattern recurs throughout quantum physics — from the helium atom 14.04.01 pending to quantum electrodynamics — that physically meaningful corrections emerge from well-defined mathematical objects governed by the analytic structure of the perturbed resolvent.
Full proof set [Master]
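Proposition (sign of the ground-state second-order correction). Let $H_0$ have discrete spectrum $E_0^{(0)} < E_1^{(0)} \leq \cdots$ with $E_0^{(0)}$ non-degenerate. Then $E_0^{(2)} \leq 0$ for any Hermitian perturbation $V$, with $E_0^{(2)} < 0$ unless $|0^{(0)}\rangle$ is an eigenstate of $V$.
Proof. The second-order correction is $E_0^{(2)} = \sum_{m \neq 0} |V_{m0}|^2/(E_0^{(0)} - E_m^{(0)})$ with $V_{m0} = \langle m^{(0)}|V|0^{(0)}\rangle$. Since $E_0^{(0)}$ is the ground-state energy, every denominator is strictly negative. The numerators are non-negative, so no term is positive and $E_0^{(2)} \leq 0$. If all $V_{m0} = 0$ for $m \neq 0$, then by completeness of the unperturbed basis $V|0^{(0)}\rangle = V_{00}|0^{(0)}\rangle$, so $|0^{(0)}\rangle$ is an eigenstate of $V$ and the level shifts at first order only. Otherwise at least one numerator is strictly positive, the corresponding term is strictly negative, and $E_0^{(2)} < 0$.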
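Proposition (variational bound for the perturbative ground state). For any normalized $|\psi\rangle$, $\langle\psi|H|\psi\rangle \geq E_0$ where $E_0$ is the exact ground-state energy. In particular $E_0 \leq E_0^{(0)} + \lambda E_0^{(1)}$, so the first-order result is an upper bound; the second-order result lies at or below the first-order one but is not guaranteed to be a lower bound on $E_0$.
Proof. Expand $|\psi\rangle = \sum_k a_k|\phi_k\rangle$ in the exact eigenbasis of $H$ with $\sum_k |a_k|^2 = 1$. Then $\langle\psi|H|\psi\rangle = \sum_k |a_k|^2 E_k \geq E_0$. For the first-order perturbative estimate: $E_0^{(0)} + \lambda E_0^{(1)} = \langle 0^{(0)}|H|0^{(0)}\rangle$ is the Rayleigh-Ritz functional evaluated on $|0^{(0)}\rangle$, hence an upper bound on $E_0$. The second-order correction satisfies $E_0^{(2)} \leq 0$ by the previous proposition, so $E_0^{(0)} + \lambda E_0^{(1)} + \lambda^2 E_0^{(2)}$ lies at or below the first-order estimate. Whether it lies above or below $E_0$ depends on the sign of the higher-order terms (the third-order term, for instance, can have either sign), so no general lower bound follows.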
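Proposition (level repulsion in the two-level system). For the two-level Hamiltonian $H(\lambda) = H_0 + \lambda V$ with $\Delta = E_2 - E_1 > 0$ and $v \neq 0$, the gap between the two eigenvalues satisfies $E_+ - E_- > \Delta$ for all $\lambda \neq 0$.
Proof. The exact eigenvalues are $E_\pm = \frac{E_1 + E_2}{2} \pm \frac{1}{2}\sqrt{\Delta^2 + 4\lambda^2 v^2}$ with $\Delta = E_2 - E_1$. The gap is $E_+ - E_- = \sqrt{\Delta^2 + 4\lambda^2 v^2}$. Since $v \neq 0$ and $\lambda \neq 0$, the term $4\lambda^2 v^2 > 0$, so the gap exceeds $\Delta$. The second-order perturbation corrections $-\lambda^2 v^2/\Delta$ and $+\lambda^2 v^2/\Delta$ capture the leading-order repulsion, and this is exactly the Wigner-von Neumann non-crossing rule for a one-parameter Hermitian family.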
Connections [Master]
Hydrogen atom (12.06.01) is the primary testbed for perturbation theory in quantum mechanics. The Bohr energies and hydrogenic wavefunctions serve as the unperturbed starting point for Stark, Zeeman, and fine-structure calculations. Every perturbative correction to hydrogen relies on the integrals computed from the solutions of 12.06.01.
Multi-electron atoms (14.04.01) extend the hydrogenic framework via perturbation theory. The helium atom's ground state is treated by taking $H_0$ as two independent hydrogen-like ($Z = 2$) Hamiltonians and $V$ as the electron-electron repulsion $e^2/(4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|)$. The first-order correction, about $+34$ eV, brings the ground-state energy from roughly $-109$ eV to roughly $-75$ eV against a measured $-79$ eV: an error of about 5%, large by spectroscopic standards but systematically improvable.
Schrodinger equation (12.04.01) is the eigenvalue equation into which perturbation theory inserts its power-series ansatz. The entire formalism is a systematic procedure for solving $H|\psi\rangle = E|\psi\rangle$ when $H = H_0 + \lambda V$.
Hilbert space and operators (12.02.02) provides the mathematical framework. The Hermiticity of $V$ ensures real energy corrections, the completeness of the unperturbed basis underlies the expansion of state corrections, and the spectral theorem justifies the eigenvalue decomposition.
Variational methods provide a complementary approach. Where perturbation theory expands in a small parameter, the variational principle optimizes over trial states. The variational principle bounds the ground state energy from above, perturbation theory supplies systematic corrections around a solvable reference, and the two are often used together.
Time-dependent perturbation theory extends the formalism to perturbations that depend on time, governing transitions between states. The time-independent theory developed here is the prerequisite: the transition rates involve the same matrix elements and energy denominators.
Kato's theory of linear operators [Kato 1966] places the Rayleigh-Schrodinger expansion on rigorous mathematical footing via the analytic theory of operator-valued functions.
Historical & philosophical context [Master]
Perturbation theory in the physical sciences predates quantum mechanics. Lord Rayleigh developed perturbative methods for acoustics and vibration theory in The Theory of Sound (1877; 2nd edition 1894), computing corrections to the frequencies of vibrating systems with small inhomogeneities. His approach (expand the solution in a small parameter and match orders) is formally identical to what quantum mechanics uses, applied to the classical wave equation rather than the Schrodinger equation.
The quantum version was introduced by Schrodinger in his third communication on wave mechanics (1926) [Schrodinger 1926], where he applied the Rayleigh method to the Schrodinger equation to compute the Stark effect of the Balmer lines. Independently, Epstein (1916) and Schwarzschild (1916) had computed the Stark effect using the old quantum theory of Bohr and Sommerfeld, obtaining results that agreed with the experimental measurements of Stark (1913). Schrodinger's wave-mechanical perturbation theory reproduced these results from a more systematic framework and extended them to problems the old quantum theory could not handle.
The method became known as Rayleigh-Schrodinger perturbation theory, acknowledging both the classical origin and the quantum reformulation. It remains the standard approach taught in every quantum mechanics course and used in atomic, molecular, and condensed-matter physics.
The mathematical foundations were developed over several decades. Rellich (1937-1942) proved the analyticity of isolated eigenvalues under perturbation for bounded operators. Kato (1949, 1966) extended this to unbounded operators — the physically relevant case, since quantum Hamiltonians are generically unbounded — and established the relative boundedness condition that guarantees the perturbation series converges for small coupling. Kato's Perturbation Theory for Linear Operators (1966) [Kato 1966] remains the definitive mathematical reference.
The Brillouin-Wigner formulation (Brillouin 1932, Wigner 1935) introduced the implicit form with exact energies in the denominators. It saw limited use in atomic physics but became the foundation of many-body perturbation theory through the work of Goldstone (1957), Hugenholtz (1957), and others, who derived diagrammatic perturbation expansions for interacting fermion systems.
Perturbation theory encodes a philosophical stance: the real world is a small correction to an idealized model. This is both its power and its limitation. When the correction is genuinely small (hydrogen fine structure, weak-field Stark and Zeeman effects), the method is extraordinarily precise. When it is not (strongly correlated systems, quantum chromodynamics at low energies), the expansion diverges and non-perturbative methods — variational principles, lattice simulations, renormalization group techniques — become necessary.
Bibliography [Master]
Primary literature and historical sources:
- Schrödinger, E., "Quantisierung als Eigenwertproblem (Dritte Mitteilung: Störungstheorie, mit Anwendung auf den Starkeffekt der Balmerlinien)", Annalen der Physik 80 (1926), 437–490. [Wave-mechanical perturbation theory applied to the Stark effect.]
- Epstein, P. S., "Zur Theorie des Starkeffektes", Annalen der Physik 50 (1916), 489–520. [Stark effect via old quantum theory.]
- Schwarzschild, K., "Zur Quantenhypothese", Sitzungsber. Kgl. Preuss. Akad. Wiss. (1916), 548–568. [Independent Stark effect calculation.]
- Rayleigh, J. W. S., The Theory of Sound, 2nd ed. (Macmillan, 1894), §90-95. [Classical perturbation theory for vibrating systems.]
- Kato, T., "On the convergence of the perturbation method", J. Fac. Sci. Univ. Tokyo 6 (1951), 145–226.
- Rellich, F., "Störungstheorie der Spektralzerlegung", Math. Annalen 113 (1937), 600–619; 116 (1939), 555–570; 117 (1940), 356–382; 118 (1942), 462–484.
Textbooks:
- Griffiths, D. J. & Schroeter, D. F., Introduction to Quantum Mechanics, 3rd ed. (Cambridge, 2018), Ch. 7. [Pedagogical introduction; the standard undergraduate reference.]
- Sakurai, J. J. & Napolitano, J., Modern Quantum Mechanics, 2nd ed. (Cambridge, 2017), Ch. 5.1-5.2. [Graduate-level treatment with degenerate perturbation theory.]
- Landau, L. D. & Lifshitz, E. M., Quantum Mechanics: Non-Relativistic Theory, 3rd ed. (Pergamon, 1977), §38-41. [Concise and physically motivated.]
- Shankar, R., Principles of Quantum Mechanics, 2nd ed. (Plenum, 1994), Ch. 17-18. [Detailed exposition with many worked examples.]
- Cohen-Tannoudji, C., Diu, B. & Laloë, F., Quantum Mechanics, Vols. I-II (Wiley, 1977), Ch. XI. [Thorough coverage of both non-degenerate and degenerate cases.]
- Messiah, A., Quantum Mechanics, Vols. I-II (North-Holland, 1961), Ch. XVI. [Systematic treatment at the graduate level.]
- Merzbacher, E., Quantum Mechanics, 3rd ed. (Wiley, 1998), Ch. 18. [Compact but rigorous.]
Mathematical foundations:
- Kato, T., Perturbation Theory for Linear Operators (Springer, 1966; repr. 1995). [The definitive mathematical reference on analytic perturbation theory.]
- Reed, M. & Simon, B., Methods of Modern Mathematical Physics, Vol. IV: Analysis of Operators (Academic Press, 1978), Ch. XII-XIII. [Rigorous spectral analysis and perturbation of isolated eigenvalues.]
- Friedrichs, K. O., Perturbation of Spectra in Hilbert Space (AMS, 1965). [Compact presentation of the analytic theory.]
Wave 3 physics unit, produced per PHYSICS_PLAN §5 runbook. Hooks_out targets 14.04.01 and 12.06.01 are both proposed; no receiving unit yet confirms them. Status remains draft pending Tyler's review and external QM reviewer sign-off.