12.11.01 · quantum / relativistic-qm

Dirac equation and relativistic spin

draft · 3 tiers · Lean: none · pending prereqs

Anchor (Master): Dirac, The Principles of Quantum Mechanics, 4e (1958), Ch. XI; Bjorken & Drell, Relativistic Quantum Mechanics (1964)

Intuition [Beginner]

The Schrödinger equation works well for slow particles. It treats time and space differently: one time derivative on the left, two space derivatives on the right. But special relativity 10.05.01 pending says space and time sit on equal footing. If you try to fix this by writing down $E^2 = p^2c^2 + m^2c^4$ as a quantum equation (the Klein-Gordon equation), you get a second-order time derivative, and that causes trouble. The natural candidate for a probability density can go negative, and the equation admits negative-energy solutions with no obvious physical interpretation.

Dirac's insight in 1928 was to insist on a first-order equation in time. Not $\partial^2/\partial t^2$, but $\partial/\partial t$. The price: the wave function can no longer be a single complex number at each point. It must be a column of four numbers, a four-component spinor, written $\psi$. The equation involves $4\times 4$ matrices called gamma matrices (denoted $\gamma^\mu$) that encode the geometry of spacetime.

In condensed notation, the Dirac equation reads "the gamma matrices, times the spacetime gradients of $\psi$, minus $m\psi$, equals zero" (up to a factor of $i$ on the first term), with a convention where $\hbar = c = 1$. The sum runs over the four spacetime directions and produces one matrix equation coupling all four components of $\psi$.

Two things fall out of this equation that were not put in by hand.

First: spin. The angular momentum of a Dirac particle splits into orbital plus an extra piece — intrinsic spin — that comes from the matrix structure alone. The electron has spin because the Dirac equation forces it, not because someone added spin as an extra assumption 12.05.01 pending.

Second: antimatter. The four-component spinor has two solutions with positive energy (spin-up and spin-down electrons) and two with negative energy. Dirac interpreted the negative-energy states as an already-full "sea" of electrons; a missing electron from this sea behaves as a particle with opposite charge — a positron. Anderson discovered the positron in 1932, confirming Dirac's prediction. In modern quantum field theory the Dirac sea is replaced by a cleaner picture: the negative-energy solutions describe antiparticles travelling forward in time.

The Dirac equation also predicts the electron's magnetic moment. For a point particle with spin, the naive ratio of magnetic moment to angular momentum gives $g = 1$. The Dirac equation gives $g = 2$. This was a stunning agreement with experiment, and the small deviation from exactly 2 (the anomalous magnetic moment) is one of the most precisely tested predictions in all of physics.

Visual [Beginner]

Picture the energy spectrum of the Dirac equation for a free particle at rest. The horizontal axis is energy $E$; the vertical axis is not needed. There are two allowed values: $E = +m$ and $E = -m$ (in natural units where $c = 1$). Each value is doubly degenerate: spin-up and spin-down.

Energy spectrum of the free Dirac equation at rest. Two positive-energy levels at E = +m (labelled "electron, spin up" and "electron, spin down") and two negative-energy levels at E = -m (labelled "positron, spin up" and "positron, spin down"). A vertical arrow from the negative-energy continuum to a positive-energy level indicates pair production.

The Dirac sea picture: every negative-energy state is filled by default. An incoming photon can knock one of these negative-energy electrons up to a positive-energy state. The result is a visible electron (the promoted particle) plus a "hole" in the negative-energy sea — the hole behaves as a positively charged particle with the same mass. This is pair production: photon goes in, electron-positron pair comes out.

The modern view drops the sea entirely. The Dirac equation, reinterpreted as a quantum field equation, has four independent particle states: two electron polarizations and two positron polarizations. The negative sign in the energy is absorbed by redefining the creation operators for antiparticles.

Worked example [Beginner]

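Solve the Dirac equation for a free particle at rest: no momentum, $\mathbf{p} = 0$.

With $\mathbf{p} = 0$, all spatial variation vanishes and the Dirac equation reduces to $i\gamma^0\,\partial_t\psi = m\psi$. Look for solutions of the form $\psi(t) = u\,e^{-iEt}$ where $u$ is a constant four-component spinor. Substituting gives $E\,\gamma^0 u = m\,u$, or $\gamma^0 u = (m/E)\,u$.

Using the Dirac (standard) representation where $\gamma^0$ is block-diagonal,

$$\gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix},$$

with $I$ the $2\times 2$ identity, the equation splits into two upper and two lower components:

$$E\,\phi = m\,\phi, \qquad -E\,\chi = m\,\chi,$$

where $u = (\phi, \chi)^T$ with $\phi$ and $\chi$ each having two components.

From these two equations: either $E = +m$ and $\chi = 0$ (two solutions, spin-up and spin-down), or $E = -m$ and $\phi = 0$ (two solutions, also spin-up and spin-down).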

The four solutions are:

| Energy | Interpretation |
|---|---|
| $E = +m$ | Electron, spin up |
| $E = +m$ | Electron, spin down |
| $E = -m$ | Positron, spin up |
| $E = -m$ | Positron, spin down |

The positive-energy solutions describe electrons at rest. The negative-energy solutions, reinterpreted, describe positrons — same mass, opposite charge. The splitting into two pairs of two is the origin of the four-component spinor structure: two for particle spin states, two for antiparticle spin states.

The magnetic moment prediction: applying the Dirac equation to an electron in a magnetic field and taking the non-relativistic limit yields the Pauli equation with gyromagnetic ratio $g = 2$. The derivation is in the Intermediate tier below.

Check your understanding [Beginner]

Formal definition [Intermediate+]

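We work in natural units $\hbar = c = 1$ and adopt the mostly-minus metric signature $\eta_{\mu\nu} = \mathrm{diag}(+1,-1,-1,-1)$. Greek indices run over $0,1,2,3$; Latin indices run over $1,2,3$. The spacetime coordinate is $x^\mu = (t, \mathbf{x})$ and $\partial_\mu = \partial/\partial x^\mu$.

The Dirac equation for a free spin-$\tfrac12$ particle of mass $m$ is

$$(i\gamma^\mu\partial_\mu - m)\,\psi = 0,$$

where $\psi(x)$ is a four-component Dirac spinor and the gamma matrices $\gamma^\mu$ ($\mu = 0,1,2,3$) are $4\times 4$ complex matrices satisfying the Clifford algebra 03.09.08

$$\{\gamma^\mu, \gamma^\nu\} = \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\,\eta^{\mu\nu}\,I_4 .$$

This anticommutation relation is the load-bearing algebraic structure. It ensures that applying the Dirac operator twice recovers the Klein-Gordon operator:

$$(i\gamma^\nu\partial_\nu + m)(i\gamma^\mu\partial_\mu - m)\,\psi = -\left(\gamma^\nu\gamma^\mu\,\partial_\nu\partial_\mu + m^2\right)\psi = -\left(\partial^\mu\partial_\mu + m^2\right)\psi = 0 .$$

The step from the second to third expression uses $\gamma^\nu\gamma^\mu\,\partial_\nu\partial_\mu = \tfrac12\{\gamma^\mu,\gamma^\nu\}\,\partial_\mu\partial_\nu = \eta^{\mu\nu}\partial_\mu\partial_\nu$ to symmetrise the derivative indices. So every solution of the Dirac equation is also a solution of the Klein-Gordon equation, but not vice versa: the Dirac equation is a stronger constraint.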

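Standard (Dirac) representation. A concrete realisation of the gamma matrices is

$$\gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix},$$

where $\sigma^i$ ($i = 1,2,3$) are the Pauli matrices 12.05.01 pending. One verifies directly that $(\gamma^0)^2 = I_4$ and $(\gamma^i)^2 = -I_4$, with all mixed anticommutators vanishing, matching $\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu} I_4$.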
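
The direct verification can be sketched numerically. The block below is a minimal illustration, assuming NumPy and the standard representation with the mostly-minus metric; the helper name `clifford_residual` is hypothetical and introduced only for this check.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Standard (Dirac) representation: gamma[0] is block-diagonal, gamma[1..3] off-diagonal
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # mostly-minus metric


def clifford_residual(gamma, eta):
    """Largest |entry| of {gamma^mu, gamma^nu} - 2 eta^{mu nu} I over all index pairs."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            anti = gamma[mu] @ gamma[nu] + gamma[nu] @ gamma[mu]
            worst = max(worst, np.abs(anti - 2 * eta[mu, nu] * np.eye(4)).max())
    return worst


residual = clifford_residual(gamma, eta)
print(residual)  # zero up to floating-point rounding
```

Any other representation related to this one by a similarity transformation should pass the same check.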

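Adjoint spinor and current conservation. Define the Dirac adjoint $\bar\psi = \psi^\dagger\gamma^0$ and the gamma-five matrix $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, which anticommutes with all $\gamma^\mu$. The vector current $j^\mu = \bar\psi\gamma^\mu\psi$ is conserved: $\partial_\mu j^\mu = 0$. This is the probability-current conservation law. The density $j^0 = \psi^\dagger\psi$ is positive definite and its spatial integral is conserved, resolving the negative-probability problem of the Klein-Gordon equation.

Plane-wave solutions. For a free particle with four-momentum $p^\mu = (E, \mathbf{p})$, write $\psi(x) = u(p)\,e^{-ip\cdot x}$ (positive frequency) or $\psi(x) = v(p)\,e^{+ip\cdot x}$ (negative frequency). The spinors $u$ and $v$ satisfy

$$(\gamma^\mu p_\mu - m)\,u(p) = 0, \qquad (\gamma^\mu p_\mu + m)\,v(p) = 0 .$$

The positive-energy condition $E = +\sqrt{\mathbf{p}^2 + m^2}$ picks out the electron solutions; the negative-frequency spinors describe positrons of momentum $\mathbf{p}$ and energy $E = +\sqrt{\mathbf{p}^2 + m^2}$. There are two linearly independent $u$-spinors and two linearly independent $v$-spinors, corresponding to the two spin polarizations at each energy.

Lorentz covariance. Under a Lorentz transformation $x'^\mu = \Lambda^\mu{}_\nu x^\nu$ 10.05.01 pending, the spinor transforms as $\psi'(x') = S(\Lambda)\,\psi(x)$ where $S(\Lambda)$ is a $4\times 4$ matrix satisfying $S^{-1}\gamma^\mu S = \Lambda^\mu{}_\nu\gamma^\nu$. The existence of such $S$ is guaranteed by the Clifford algebra. This makes the Dirac equation Lorentz covariant: if $\psi(x)$ solves it in one frame, $\psi'(x')$ solves it in the transformed frame.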

The non-relativistic limit and the magnetic moment

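For an electron in an electromagnetic field (minimal coupling $\partial_\mu \to \partial_\mu + ieA_\mu$), write $\psi = e^{-imt}(\phi, \chi)^T$ to peel off the rest-mass phase and take $|\mathbf{p}| \ll m$. The upper two components of $\psi$ (the "large" components) satisfy the Pauli equation

$$i\,\partial_t\phi = \left[\frac{\left(\boldsymbol\sigma\cdot(\mathbf{p} - e\mathbf{A})\right)^2}{2m} + eA^0\right]\phi ,$$

where $\phi$ is a two-component Pauli spinor. Expanding the squared term using $(\boldsymbol\sigma\cdot\mathbf{a})(\boldsymbol\sigma\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol\sigma\cdot(\mathbf{a}\times\mathbf{b})$ yields a term $-\frac{e}{2m}\boldsymbol\sigma\cdot\mathbf{B}$, corresponding to a magnetic moment $\boldsymbol\mu = \frac{e}{2m}\boldsymbol\sigma = 2\cdot\frac{e}{2m}\mathbf{S}$ and hence $g = 2$. This is the Dirac prediction: the magnetic moment comes out of the equation without being put in by hand.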

Key theorem with proof [Intermediate+]

Theorem (Antiparticles from the Dirac equation). The free Dirac equation admits both positive-energy and negative-energy plane-wave solutions. The negative-energy solutions, when reinterpreted as positive-energy states of a particle with opposite charge, predict the existence of antiparticles with the same mass as the original particle but opposite electric charge.

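Proof. The Dirac equation admits plane-wave solutions $\psi(x) = u(p)\,e^{-ip\cdot x}$ where $u(p)$ is a constant spinor. Substitution yields $(\gamma^\mu p_\mu - m)\,u = 0$. Multiplying on the left by $(\gamma^\nu p_\nu + m)$ gives $(p^2 - m^2)\,u = 0$, where $p^2 = p_\mu p^\mu = E^2 - \mathbf{p}^2$.

Hence $E^2 = \mathbf{p}^2 + m^2$, which gives two branches:

$$E = \pm\sqrt{\mathbf{p}^2 + m^2} .$$
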
Each branch supports two independent spinor solutions (two spin polarizations), giving four solutions total.
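
The two branches and their double degeneracy can be checked numerically. The sketch below assumes NumPy, the standard representation, natural units, and an arbitrary illustrative momentum; the function name `dirac_hamiltonian` is hypothetical. It diagonalises the momentum-space Hamiltonian $H = \boldsymbol\alpha\cdot\mathbf{p} + \beta m$, whose eigenvalues are the allowed energies.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# beta = gamma^0 and alpha^j = gamma^0 gamma^j in the standard representation
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]


def dirac_hamiltonian(p, m):
    """Free Dirac Hamiltonian H = alpha . p + beta m in momentum space (hbar = c = 1)."""
    return sum(p_j * a_j for p_j, a_j in zip(p, alpha)) + m * beta


m = 1.0
p = np.array([0.3, -0.4, 1.2])     # arbitrary illustrative momentum
H = dirac_hamiltonian(p, m)

energies = np.linalg.eigvalsh(H)   # ascending order; H is Hermitian
E = np.sqrt(p @ p + m**2)

print(energies)  # two eigenvalues near -E, two near +E
print(E)
```

Because $H^2 = (\mathbf{p}^2 + m^2)\,I_4$, the only possible eigenvalues are $\pm E$, and tracelessness of $H$ forces two of each.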

The negative-energy branch appears unphysical. Dirac's resolution (1930): for electrons, assume all negative-energy states are filled (the Dirac sea). The Pauli exclusion principle prevents double occupancy. A photon with energy $\ge 2m$ can promote a sea electron to a positive-energy state, leaving a hole. The hole has:

  • Energy $+\sqrt{\mathbf{p}^2 + m^2}$ (removing a negative-energy electron adds positive energy);
  • Charge $+|e|$ (removing charge $-|e|$ leaves net $+|e|$);
  • The same mass $m$;
  • Spin $\tfrac12$ with two polarizations.

This hole is the positron, the electron's antiparticle.

The modern QFT resolution [Peskin & Schroeder 1995] dispenses with the sea. One quantises the Dirac field by expanding in terms of creation and annihilation operators:

$$\psi(x) = \int \frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_{\mathbf{p}}}} \sum_{s} \left( b_s(\mathbf{p})\,u_s(p)\,e^{-ip\cdot x} + d_s^\dagger(\mathbf{p})\,v_s(p)\,e^{+ip\cdot x} \right).$$

The operators $b_s(\mathbf{p})$ annihilate electrons; the operators $d_s^\dagger(\mathbf{p})$ create positrons. The negative-frequency exponential accompanies the positron creation operator, and the field expansion is consistent with the positive-definite (normal-ordered) Hamiltonian $H = \int \frac{d^3p}{(2\pi)^3} \sum_s E_{\mathbf{p}} \left( b_s^\dagger b_s + d_s^\dagger d_s \right)$. No sea is required; the antiparticle interpretation is built into the operator algebra.

Corollary. Every charged spin-$\tfrac12$ particle has a corresponding antiparticle with the same mass and spin but opposite charge. The positron ($e^+$) is the antiparticle of the electron ($e^-$); the antiproton ($\bar p$) is the antiparticle of the proton ($p$).

The experimental confirmation came in 1932 when Anderson observed positron tracks in a cloud chamber exposed to cosmic rays [Anderson 1932]. The track curvature in a magnetic field showed a particle with the electron's mass but opposite charge.

Worked example: magnetic moment from the non-relativistic limit

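For a free Dirac particle at rest, the solutions split into two positive-energy spinors $u^{(1,2)}$ and two negative-energy spinors $v^{(1,2)}$. In the standard representation with $\mathbf{p} = 0$:

$$u^{(1)} = \sqrt{2m}\begin{pmatrix}1\\0\\0\\0\end{pmatrix},\quad u^{(2)} = \sqrt{2m}\begin{pmatrix}0\\1\\0\\0\end{pmatrix},\quad v^{(1)} = \sqrt{2m}\begin{pmatrix}0\\0\\1\\0\end{pmatrix},\quad v^{(2)} = \sqrt{2m}\begin{pmatrix}0\\0\\0\\1\end{pmatrix}.$$

The normalisation factor $\sqrt{2m}$ is chosen so that $\bar u u = 2m$ and $\bar v v = -2m$.

The magnetic moment operator for the Dirac particle is read off from the interaction Hamiltonian with an external field. The coupling $-e\,\bar\psi\gamma^\mu\psi\,A_\mu$ yields, in the non-relativistic limit, the Pauli term $-\boldsymbol\mu\cdot\mathbf{B}$ with $\boldsymbol\mu = g\,\frac{e}{2m}\,\mathbf{S}$ where $\mathbf{S} = \tfrac12\boldsymbol\Sigma$ is the spin operator. Evaluating in the standard representation, $\boldsymbol\Sigma = \mathrm{diag}(\boldsymbol\sigma, \boldsymbol\sigma)$, and the magnetic moment on the upper components is $\frac{e}{2m}\boldsymbol\sigma$, giving $g = 2$. The radiative corrections (vertex diagrams in QED) shift this to $g = 2\left(1 + \frac{\alpha}{2\pi} + \dots\right)$, where $\alpha \approx 1/137$ is the fine-structure constant.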

Bridge. The antiparticle theorem builds toward 12.13.02 fermionic Fock space, where the negative-frequency spinors become positron creation operators on the antisymmetric Fock space and the canonical anticommutation relations enforce Pauli exclusion at the algebraic level. The foundational reason the proof goes through is that the Dirac equation factorises through the Clifford algebra Cl(1,3), and this is exactly the structure that gives the spinor representation its built-in spin-$\tfrac12$ content. The Gordon-identity decomposition appears again in 14.04.01 pending hydrogen-atom fine structure, where the magnetic-moment readout above identifies the spin-orbit coupling as the bridge between orbital angular momentum 12.05.01 pending and the intrinsic spin produced by the Dirac kinematic structure. Putting these together: the central insight of the Dirac framework is that a single algebraic constraint (the Clifford anticommutation), combined with minimal coupling, forces simultaneously the spin-$\tfrac12$ structure, the antiparticle spectrum, the gyromagnetic ratio $g = 2$, and the full hydrogen fine structure, none of which were independent inputs.

Exercises [Intermediate+]

Lean formalization [Intermediate+]

Mathlib's coverage of the Dirac equation is indirect. The relevant layers are:

  • Mathlib.LinearAlgebra.CliffordAlgebra.Basic: abstract Clifford algebras over a commutative ring with a quadratic form, including the universal property and the isomorphism for the signature relevant to physics.
  • The Dirac operator as a geometric object (unit 03.09.08) exists in Mathlib's Geometry layer as a first-order differential operator on a Clifford-module bundle over a Riemannian manifold.

What Mathlib does not contain: the Dirac equation as a physics equation (i.e., a time-dependent PDE whose solutions are spinor fields on Minkowski spacetime); the identification of positive-energy and negative-energy solution branches; the antiparticle interpretation via second quantisation; the non-relativistic limit yielding the Pauli equation; or the prediction for the magnetic moment. These are physics-layer constructions that require the physics formalisation roadmap (unit id conventions, physical-unit system, Hilbert-space operators) to be in place before they can be formalised.

lean_status: none reflects this gap. No lean_module ships with this unit. The lean_mathlib_gap field in frontmatter records the boundary of current Mathlib coverage.

Gamma matrix algebra and Clifford structure [Master]

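The closure derivation: why the Clifford algebra is forced. The Dirac equation is found by requiring three structural conditions simultaneously: (i) the equation is first order in time so that initial data is just $\psi(t_0, \mathbf{x})$ and the conserved density $\psi^\dagger\psi$ is positive-definite, (ii) the equation is Lorentz-covariant so that space and time enter symmetrically, and (iii) every solution also satisfies the relativistic dispersion $E^2 = \mathbf{p}^2 + m^2$, equivalently the Klein-Gordon equation component-wise. Conditions (i) and (ii) together force the ansatz

$$i\,\partial_t\psi = \left(-i\,\alpha^j\partial_j + \beta m\right)\psi ,$$

where $\alpha^j$ and $\beta$ are matrix-valued coefficients on some finite-dimensional vector space $\mathbb{C}^N$ in which $\psi$ takes values. The first-order form is built in by hand; the cost is that $\psi$ is now a column of $N$ components rather than a single complex number.

Apply the operator twice and impose (iii). One step gives

$$-\partial_t^2\psi = \left[-\tfrac12\{\alpha^j, \alpha^k\}\,\partial_j\partial_k - i m\,\{\alpha^j, \beta\}\,\partial_j + \beta^2 m^2\right]\psi .$$

For the right-hand side to equal $(-\nabla^2 + m^2)\psi$ on every component of $\psi$, the matrix coefficients must satisfy

$$\{\alpha^j, \alpha^k\} = 2\,\delta^{jk}, \qquad \{\alpha^j, \beta\} = 0, \qquad \beta^2 = 1 .$$

These are the algebraic closure conditions. The symmetric part of $\alpha^j\alpha^k$ must contract to $\delta^{jk}$ to reproduce $-\nabla^2$, and the cross term between $\alpha^j$ and $\beta$ must vanish so no linear-in-$\partial_j$ residue appears.

Define $\gamma^0 = \beta$ and $\gamma^j = \beta\alpha^j$. A direct expansion gives

$$(\gamma^0)^2 = 1, \qquad \{\gamma^0, \gamma^j\} = \beta\beta\alpha^j + \beta\alpha^j\beta = 0, \qquad \{\gamma^j, \gamma^k\} = \beta\alpha^j\beta\alpha^k + \beta\alpha^k\beta\alpha^j = -\{\alpha^j, \alpha^k\} = -2\,\delta^{jk},$$

using $\beta^2 = 1$ in the last step. Bundling these into a single equation with the mostly-minus metric gives the Clifford algebra relation

$$\{\gamma^\mu, \gamma^\nu\} = 2\,\eta^{\mu\nu} .$$

Multiplying the original equation by $\beta$ and rearranging recovers the manifestly covariant Dirac equation $(i\gamma^\mu\partial_\mu - m)\psi = 0$. The Clifford algebra is not assumed: it is the unique algebraic structure on the matrix coefficients consistent with conditions (i)-(iii). Any first-order Lorentz-covariant wave equation whose solutions satisfy Klein-Gordon must factor through a Clifford algebra.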
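
The passage from the $(\alpha^j, \beta)$ closure conditions to the gamma matrices can be sketched numerically. The block below is an illustration under stated assumptions: NumPy, and one particular choice of $\alpha^j$ and $\beta$ (the standard representation). It checks the three closure conditions, builds $\gamma^0 = \beta$ and $\gamma^j = \beta\alpha^j$, and confirms that the result has the off-diagonal form $\begin{pmatrix} 0 & \sigma^j \\ -\sigma^j & 0 \end{pmatrix}$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# One concrete solution of the closure conditions (an assumed choice, not the only one)
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]


def anticommutator(A, B):
    return A @ B + B @ A


# Closure conditions: {alpha^j, alpha^k} = 2 delta^{jk}, {alpha^j, beta} = 0, beta^2 = 1
closure_ok = (
    all(
        np.allclose(anticommutator(alpha[j], alpha[k]), 2 * (j == k) * I4)
        for j in range(3)
        for k in range(3)
    )
    and all(np.allclose(anticommutator(a, beta), 0) for a in alpha)
    and np.allclose(beta @ beta, I4)
)

# Covariant gamma matrices built from the closure solution
gamma0 = beta
gammas = [beta @ a for a in alpha]

# Expected off-diagonal form in the standard representation
expected = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
form_ok = all(np.allclose(g, e) for g, e in zip(gammas, expected))

print(closure_ok, form_ok)
```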

Minimal dimension. The smallest $N$ supporting a faithful Cl(1,3) representation is $N = 4$. Counting: Cl(1,3) has dimension $2^4 = 16$ as a real vector space (one element per subset of $\{\gamma^0, \gamma^1, \gamma^2, \gamma^3\}$). Over $\mathbb{C}$, Cl(1,3) $\otimes\,\mathbb{C} \cong$ Mat(4, $\mathbb{C}$), the algebra of complex $4\times 4$ matrices, and Mat(4, $\mathbb{C}$) acts faithfully on $\mathbb{C}^4$. Smaller representations would fail to encode the full anticommutation pattern; larger representations are reducible. The four-component spinor structure of the Dirac equation is dictated by this algebraic minimum, not chosen ad hoc.

The gamma matrices generate the Clifford algebra Cl(1,3), the algebra associated with the Minkowski metric $\eta_{\mu\nu}$. The full set of products of gamma matrices spans a 16-dimensional vector space; over $\mathbb{C}$ this is the full matrix algebra Mat(4, $\mathbb{C}$). A basis is given by the 16 matrices:

| Rank | Count | Elements | Notation |
|---|---|---|---|
| 0 | 1 | identity | $I$ |
| 1 | 4 | gamma matrices | $\gamma^\mu$ |
| 2 | 6 | commutators | $\sigma^{\mu\nu} = \tfrac{i}{2}[\gamma^\mu, \gamma^\nu]$ |
| 3 | 4 | axial vectors | $\gamma^\mu\gamma^5$ |
| 4 | 1 | chirality | $\gamma^5$ |

This is the complete set of Dirac bilinears $\bar\psi\,\Gamma\,\psi$. Lorentz symmetry restricts which bilinears can appear in interaction terms: scalars ($\bar\psi\psi$), vectors ($\bar\psi\gamma^\mu\psi$), tensors ($\bar\psi\sigma^{\mu\nu}\psi$), axial vectors ($\bar\psi\gamma^\mu\gamma^5\psi$), and pseudoscalars ($\bar\psi\gamma^5\psi$). These five classes exhaust the independent Lorentz-covariant bilinears.
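
The count of 16 can be checked by building the basis explicitly and testing linear independence. The sketch below assumes NumPy and the standard representation, with $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ and $\sigma^{\mu\nu} = \tfrac{i}{2}[\gamma^\mu,\gamma^\nu]$; it flattens each $4\times 4$ matrix into a 16-component vector and computes the rank of the stacked array.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

# Assemble the basis rank by rank: 1 + 4 + 6 + 4 + 1 = 16 matrices
basis = [np.eye(4, dtype=complex)]                       # rank 0: identity
basis += gamma                                           # rank 1: gamma^mu
basis += [
    0.5j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])
    for mu in range(4)
    for nu in range(mu + 1, 4)
]                                                        # rank 2: sigma^{mu nu}, mu < nu
basis += [g @ gamma5 for g in gamma]                     # rank 3: gamma^mu gamma^5
basis += [gamma5]                                        # rank 4: gamma^5

# Flatten each matrix to a vector; rank 16 means the set spans all 4x4 complex matrices
stacked = np.array([b.reshape(-1) for b in basis])
rank = np.linalg.matrix_rank(stacked)

print(len(basis), rank)
```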

Trace identities. The gamma matrices satisfy:

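  1. $\mathrm{tr}\,\gamma^\mu = 0$ (traceless).
  2. $\mathrm{tr}(\gamma^\mu\gamma^\nu) = 4\,\eta^{\mu\nu}$.
  3. $\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = 4\left(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho}\right)$.
  4. The trace of a product of an odd number of gamma matrices vanishes.
  5. $\mathrm{tr}\,\gamma^5 = 0$, $\mathrm{tr}(\gamma^5\gamma^\mu\gamma^\nu) = 0$, $\mathrm{tr}(\gamma^5\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = -4i\,\epsilon^{\mu\nu\rho\sigma}$ (with $\epsilon^{0123} = +1$).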

These are the computational backbone of QED. Every scattering amplitude reduces to traces of gamma-matrix products via the Casimir trick (summing over final spins, averaging over initial spins). The identities are proved by repeated use of the Clifford algebra and the cyclicity of the trace.
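
The standard trace identities can be spot-checked by brute force over all index combinations. The sketch below assumes NumPy, the standard representation, $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, and the convention $\epsilon^{0123} = +1$ for the Levi-Civita symbol (other texts use the opposite sign, which flips the sign of the last identity). The helper names `tr` and `levi_civita` are hypothetical.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def tr(*mats):
    """Trace of the ordered product of the given 4x4 matrices."""
    out = np.eye(4, dtype=complex)
    for M in mats:
        out = out @ M
    return np.trace(out)


def levi_civita(*idx):
    """Totally antisymmetric symbol, assumed convention levi_civita(0, 1, 2, 3) = +1."""
    if len(set(idx)) < len(idx):
        return 0
    inversions = sum(
        idx[i] > idx[j] for i in range(len(idx)) for j in range(i + 1, len(idx))
    )
    return -1 if inversions % 2 else 1


idx2 = list(product(range(4), repeat=2))
idx4 = list(product(range(4), repeat=4))

err_one = max(abs(tr(g)) for g in gamma)
err_two = max(abs(tr(gamma[a], gamma[b]) - 4 * eta[a, b]) for a, b in idx2)
err_four = max(
    abs(
        tr(gamma[a], gamma[b], gamma[c], gamma[d])
        - 4 * (eta[a, b] * eta[c, d] - eta[a, c] * eta[b, d] + eta[a, d] * eta[b, c])
    )
    for a, b, c, d in idx4
)
err_five = max(abs(tr(gamma5, gamma[a], gamma[b])) for a, b in idx2)
err_eps = max(
    abs(tr(gamma5, gamma[a], gamma[b], gamma[c], gamma[d]) + 4j * levi_civita(a, b, c, d))
    for a, b, c, d in idx4
)

print(err_one, err_two, err_four, err_five, err_eps)  # all zero up to rounding
```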

Chiral (Weyl) representation. An alternative representation useful for massless or ultrarelativistic particles:

$$\gamma^0 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \qquad \gamma^5 = \begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix}.$$

In this representation, the projectors $P_L = \tfrac12(1 - \gamma^5)$ and $P_R = \tfrac12(1 + \gamma^5)$ project onto the upper and lower two components respectively: the left-handed and right-handed Weyl spinors. For $m = 0$, the Dirac equation decouples into two independent two-component equations, the Weyl equations, and chirality equals helicity. The weak interaction couples only to left-handed particles, making the chiral representation the natural one for the electroweak theory.

Fierz rearrangement. Any product of two Dirac bilinears can be rewritten in terms of a different basis pairing. The Fierz identity is

$$(\bar\psi_1\,\Gamma^A\,\psi_2)(\bar\psi_3\,\Gamma^B\,\psi_4) = \sum_{C,D} C^{AB}{}_{CD}\,(\bar\psi_1\,\Gamma^C\,\psi_4)(\bar\psi_3\,\Gamma^D\,\psi_2),$$

where $\Gamma^A$ runs over the 16 basis elements and $C^{AB}{}_{CD}$ is a fixed numerical matrix. This identity is essential for analysing four-fermion interactions (Fermi theory, effective weak couplings).

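The spinor representation and the SL(2, $\mathbb{C}$) double cover. Lorentz covariance of the Dirac equation requires a matrix $S(\Lambda)$ for each Lorentz transformation $\Lambda$ acting on the spinor index. Write an infinitesimal Lorentz transformation as $\Lambda^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu$ with $\omega_{\mu\nu} = -\omega_{\nu\mu}$ (six independent parameters: three boosts, three rotations). The infinitesimal spinor representation is

$$S(\Lambda) = 1 - \tfrac{i}{4}\,\omega_{\mu\nu}\,\sigma^{\mu\nu},$$

where $\sigma^{\mu\nu} = \tfrac{i}{2}[\gamma^\mu, \gamma^\nu]$. Imposing the covariance condition $S^{-1}\gamma^\mu S = \Lambda^\mu{}_\nu\gamma^\nu$ at linear order in $\omega$ gives

$$[\gamma^\mu, \sigma^{\rho\sigma}] = 2i\left(\eta^{\mu\rho}\gamma^\sigma - \eta^{\mu\sigma}\gamma^\rho\right),$$

which follows from the Clifford algebra by direct computation. The six matrices $\tfrac12\sigma^{\mu\nu}$ thus form a representation of the Lorentz Lie algebra $\mathfrak{so}(1,3)$; exponentiating gives a finite-dimensional representation of the universal cover SL(2, $\mathbb{C}$) rather than of $SO^+(1,3)$ itself. Concretely, in the chiral basis $S(\Lambda)$ is block-diagonal with the upper block a $(\tfrac12, 0)$ representation and the lower block a $(0, \tfrac12)$ representation of SL(2, $\mathbb{C}$); together they form the $(\tfrac12, 0) \oplus (0, \tfrac12)$ representation that the Dirac spinor carries.

A rotation by $2\pi$ about any axis multiplies the spinor by $-1$, not $+1$: this is the hallmark of the double cover. The Dirac spinor returns to itself only after a $4\pi$ rotation. The doubled identity is not a pathology but the very signature of a half-integer-spin representation, and it is the algebraic reason fermions obey the Pauli exclusion principle in the relativistic spin-statistics theorem (cross-link 12.13.02).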
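
The sign flip under a full turn can be seen by exponentiating the rotation generator. The sketch below assumes NumPy, the standard representation, and a rotation about the $z$-axis generated by $\sigma^{12} = \tfrac{i}{2}[\gamma^1, \gamma^2]$, so that $S(\theta) = \exp(-\tfrac{i}{2}\theta\,\sigma^{12})$; the overall sign convention for $\theta$ is an assumption and does not affect the result at $2\pi$ or $4\pi$. The function name `spinor_rotation_z` is hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]


def spinor_rotation_z(theta):
    """S(theta) = exp(-(i/2) theta sigma^{12}) for a rotation by theta about z."""
    sigma12 = 0.5j * (gamma[1] @ gamma[2] - gamma[2] @ gamma[1])
    # sigma12 is Hermitian, so exponentiate through its eigen-decomposition
    w, V = np.linalg.eigh(sigma12)
    return V @ np.diag(np.exp(-0.5j * theta * w)) @ V.conj().T


S_full_turn = spinor_rotation_z(2 * np.pi)
S_two_turns = spinor_rotation_z(4 * np.pi)

print(np.round(S_full_turn.real, 12))  # minus the identity
print(np.round(S_two_turns.real, 12))  # plus the identity
```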

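Non-relativistic limit and $g = 2$ [Master]

Minimal coupling and the algebra of $(\boldsymbol\sigma\cdot\boldsymbol\pi)^2$. Promote the free Dirac equation to an external electromagnetic field via minimal coupling: replace the spacetime derivative $\partial_\mu$ everywhere by the gauge-covariant derivative $D_\mu = \partial_\mu + ieA_\mu$, equivalently replace $p_\mu = i\partial_\mu$ by the kinetic 4-momentum $\pi_\mu = p_\mu - eA_\mu$. The Dirac equation becomes

$$(i\gamma^\mu D_\mu - m)\,\psi = (\gamma^\mu\pi_\mu - m)\,\psi = 0 .$$

The factor $e$ is the electron charge (sign convention: $e < 0$ for the physical electron). The gauge invariance of the equation under $\psi \to e^{-ie\lambda(x)}\psi$, $A_\mu \to A_\mu + \partial_\mu\lambda$ is built in by the form of the covariant derivative.

The non-relativistic limit isolates the kinematics where $|\boldsymbol\pi| \ll m$ and $|eA^0| \ll m$. The most efficient path is to multiply the Dirac equation by $\gamma^0$ on the left, giving the Schrödinger-form Hamiltonian

$$i\,\partial_t\psi = H\psi, \qquad H = \boldsymbol\alpha\cdot\boldsymbol\pi + \beta m + eA^0, \qquad \boldsymbol\alpha = \gamma^0\boldsymbol\gamma,\ \ \beta = \gamma^0,\ \ \boldsymbol\pi = \mathbf{p} - e\mathbf{A},$$

and then decompose $\psi = (\phi, \chi)^T$ in the standard (Dirac) basis where $\beta = \mathrm{diag}(I, -I)$. The matrices $\boldsymbol\alpha$ are off-diagonal, $\boldsymbol\alpha = \begin{pmatrix} 0 & \boldsymbol\sigma \\ \boldsymbol\sigma & 0 \end{pmatrix}$, and the Dirac equation resolves into the coupled pair

$$i\,\partial_t\phi = \boldsymbol\sigma\cdot\boldsymbol\pi\,\chi + (eA^0 + m)\,\phi, \qquad i\,\partial_t\chi = \boldsymbol\sigma\cdot\boldsymbol\pi\,\phi + (eA^0 - m)\,\chi .$$

To peel off the dominant rest-mass oscillation, write $\phi = e^{-imt}\tilde\phi$, $\chi = e^{-imt}\tilde\chi$. The exponential cancels the $+m$ in the first equation and shifts the $-m$ in the second to $-2m$, leaving

$$i\,\partial_t\tilde\phi = \boldsymbol\sigma\cdot\boldsymbol\pi\,\tilde\chi + eA^0\,\tilde\phi, \qquad i\,\partial_t\tilde\chi = \boldsymbol\sigma\cdot\boldsymbol\pi\,\tilde\phi + (eA^0 - 2m)\,\tilde\chi .$$

In the non-relativistic regime, the small component $\tilde\chi$ is suppressed by $|\boldsymbol\pi|/m$ relative to $\tilde\phi$. The second equation is dominated by the $2m\tilde\chi$ term, and dropping $i\partial_t\tilde\chi$ and $eA^0\tilde\chi$ (both small relative to $2m\tilde\chi$) gives the algebraic solution

$$\tilde\chi \approx \frac{\boldsymbol\sigma\cdot\boldsymbol\pi}{2m}\,\tilde\phi .$$

Substituting back into the first equation gives the Pauli equation:

$$i\,\partial_t\tilde\phi = \left[\frac{(\boldsymbol\sigma\cdot\boldsymbol\pi)^2}{2m} + eA^0\right]\tilde\phi .$$

This is a two-component Schrödinger equation; the four-component spinor has collapsed to two components by integrating out the antiparticle degrees of freedom.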

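The Pauli identity and the magnetic-moment readout. Expand $(\boldsymbol\sigma\cdot\boldsymbol\pi)^2$ using the spin algebra identity

$$(\boldsymbol\sigma\cdot\mathbf{a})(\boldsymbol\sigma\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\,\boldsymbol\sigma\cdot(\mathbf{a}\times\mathbf{b}) .$$

Therefore

$$(\boldsymbol\sigma\cdot\boldsymbol\pi)^2 = \boldsymbol\pi^2 + i\,\boldsymbol\sigma\cdot(\boldsymbol\pi\times\boldsymbol\pi) .$$

The cross product does not vanish because the components of $\boldsymbol\pi$ do not commute: $[\pi^i, \pi^j] = ie\,\epsilon^{ijk}B^k$ (in the convention $\mathbf{p} = -i\nabla$ and $\mathbf{B} = \nabla\times\mathbf{A}$). Substituting,

$$(\boldsymbol\sigma\cdot\boldsymbol\pi)^2 = (\mathbf{p} - e\mathbf{A})^2 - e\,\boldsymbol\sigma\cdot\mathbf{B},$$

using $\boldsymbol\pi\times\boldsymbol\pi = \tfrac12\epsilon^{ijk}[\pi^j, \pi^k]\,\hat{\mathbf{e}}_i = ie\,\mathbf{B}$. The Pauli equation then reads

$$i\,\partial_t\tilde\phi = \left[\frac{(\mathbf{p} - e\mathbf{A})^2}{2m} + eA^0 - \frac{e}{2m}\,\boldsymbol\sigma\cdot\mathbf{B}\right]\tilde\phi .$$

The last term is a magnetic-moment coupling $-\boldsymbol\mu\cdot\mathbf{B}$ with magnetic moment

$$\boldsymbol\mu = \frac{e}{2m}\,\boldsymbol\sigma .$$

Reading off the gyromagnetic ratio: the spin operator is $\mathbf{S} = \tfrac12\boldsymbol\sigma$, so $\boldsymbol\mu = \frac{e}{m}\mathbf{S} = g\,\frac{e}{2m}\,\mathbf{S}$, giving $g = 2$. This is the Dirac prediction: the electron's gyromagnetic ratio comes out as exactly twice the classical orbital value $g = 1$, with no assumption injected beyond the Clifford algebra and minimal coupling. The result was a stunning agreement with the empirical Landé pre-factor and the cornerstone of the equation's acceptance.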
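
The spin algebra identity at the heart of this readout can be verified numerically for ordinary (commuting) vectors. The sketch below assumes NumPy and two randomly drawn real vectors; the helper `sigma_dot` is a hypothetical name. For the operator-valued $\boldsymbol\pi$ the same identity holds, but the cross product no longer vanishes, which is where the $\boldsymbol\sigma\cdot\mathbf{B}$ term comes from.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def sigma_dot(v):
    """The 2x2 matrix sigma . v for a real 3-vector v."""
    return sum(v_i * s_i for v_i, s_i in zip(v, sigma))


rng = np.random.default_rng(0)      # fixed seed so the check is reproducible
a = rng.normal(size=3)
b = rng.normal(size=3)

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = (a @ b) * I2 + 1j * sigma_dot(np.cross(a, b))

identity_holds = np.allclose(lhs, rhs)
print(identity_holds)

# For a = b the cross product vanishes and (sigma . a)^2 = |a|^2 times the identity
square_ok = np.allclose(sigma_dot(a) @ sigma_dot(a), (a @ a) * I2)
print(square_ok)
```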

Radiative corrections and the anomalous magnetic moment. The tree-level $g = 2$ receives quantum corrections from QED loop diagrams. The leading correction comes from the one-loop vertex diagram (Schwinger 1948):

$$a_e \equiv \frac{g - 2}{2} = \frac{\alpha}{2\pi} \approx 0.00116 .$$

Higher-order corrections give the series

$$a_e = \frac{\alpha}{2\pi} - 0.328\left(\frac{\alpha}{\pi}\right)^2 + 1.181\left(\frac{\alpha}{\pi}\right)^3 + \dots$$

The theoretical prediction and experimental measurement agree to more than 12 significant figures, the most precise agreement between theory and experiment in all of physics. Any deviation would signal new physics (supersymmetric particles, composite structure, extra dimensions), making $a_e$ a precision probe of the Standard Model and beyond.
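
The size of these corrections is easy to evaluate. The sketch below is plain Python arithmetic under stated assumptions: the value of $\alpha$ is an approximate literature value, and the second- and third-order coefficients are the standard QED mass-independent coefficients quoted to a few digits; none of these numbers comes from the text above.

```python
import math

# Assumed inputs (approximate literature values, not taken from this unit)
ALPHA = 1 / 137.035999        # fine-structure constant
C2 = -0.328478965             # coefficient of (alpha/pi)^2
C3 = 1.181241456              # coefficient of (alpha/pi)^3


def anomalous_moment(alpha, order=3):
    """Truncated QED series for a_e = (g - 2) / 2 through the given order in alpha/pi."""
    x = alpha / math.pi
    coefficients = [0.5, C2, C3]   # 0.5 * (alpha/pi) is the Schwinger term alpha/(2 pi)
    return sum(c * x ** (n + 1) for n, c in enumerate(coefficients[:order]))


schwinger = anomalous_moment(ALPHA, order=1)
a_e = anomalous_moment(ALPHA, order=3)
g_factor = 2 * (1 + a_e)

print(f"Schwinger term alpha/(2 pi): {schwinger:.10f}")
print(f"a_e through third order:     {a_e:.10f}")
print(f"g through third order:       {g_factor:.10f}")
```

The one-loop term alone accounts for the bulk of the deviation from 2; the higher orders shift it at the level of parts per thousand of $a_e$.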

The anomalous magnetic moment of the muon shows a persistent tension between the Standard Model prediction and experiment (the "muon anomaly"), at a level of significance that depends on which theoretical calculation is used. It has been regarded as one of the strongest hints of physics beyond the Standard Model.

Chiral symmetry [Master]

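The projectors and the explicit chiral decomposition. The matrix $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ satisfies $(\gamma^5)^2 = 1$ and $\{\gamma^5, \gamma^\mu\} = 0$. From $(\gamma^5)^2 = 1$, the eigenvalues of $\gamma^5$ are $\pm 1$. Define the chiral projectors

$$P_L = \tfrac12(1 - \gamma^5), \qquad P_R = \tfrac12(1 + \gamma^5) .$$

Direct computation shows $P_L^2 = P_L$, $P_R^2 = P_R$, $P_L P_R = P_R P_L = 0$, and $P_L + P_R = 1$. They are orthogonal idempotents projecting onto the eigenspaces of $\gamma^5$. Every Dirac spinor decomposes uniquely as

$$\psi = \psi_L + \psi_R, \qquad \psi_L = P_L\psi, \quad \psi_R = P_R\psi .$$

The components $\psi_L$ and $\psi_R$ are the left-handed and right-handed Weyl spinors, each carrying two complex degrees of freedom. In the chiral (Weyl) representation, $\gamma^5 = \mathrm{diag}(-I, I)$, so $P_L$ projects onto the upper two components and $P_R$ onto the lower two; the Dirac spinor is just the column $(\psi_L, \psi_R)^T$ with each entry a Weyl spinor.

Massless decoupling. For $m = 0$, the Dirac equation $i\gamma^\mu\partial_\mu\psi = 0$ admits a clean factorisation. Apply $P_R$ on the left:

$$P_R\,i\gamma^\mu\partial_\mu\psi = i\gamma^\mu\partial_\mu\,P_L\psi = i\gamma^\mu\partial_\mu\,\psi_L = 0,$$

using $P_R\gamma^\mu = \gamma^\mu P_L$ (because the anticommutation $\{\gamma^5, \gamma^\mu\} = 0$ flips the sign of $\gamma^5$ when it is commuted past $\gamma^\mu$). Similarly applying $P_L$ on the left gives $i\gamma^\mu\partial_\mu\,\psi_R = 0$. The two-component Weyl equations

$$i\,\bar\sigma^\mu\partial_\mu\,\psi_L = 0, \qquad i\,\sigma^\mu\partial_\mu\,\psi_R = 0$$

(in the chiral basis with $\sigma^\mu = (1, \boldsymbol\sigma)$, $\bar\sigma^\mu = (1, -\boldsymbol\sigma)$) are uncoupled: a left-handed spinor evolves independently of a right-handed spinor. For $m = 0$, chirality is conserved along the trajectory, and the two halves are physically separable. The neutrino, long considered massless, was modelled by a purely left-handed Weyl spinor until oscillation experiments forced the introduction of a mass term.

Mass term as the chirality mixer. The Dirac mass term $m\bar\psi\psi$ explicitly couples the two chiralities. Compute:

$$\bar\psi\psi = (\bar\psi_L + \bar\psi_R)(\psi_L + \psi_R) = \bar\psi_L\psi_L + \bar\psi_L\psi_R + \bar\psi_R\psi_L + \bar\psi_R\psi_R .$$

Using $P_L\gamma^0 = \gamma^0 P_R$ (again from $\{\gamma^5, \gamma^\mu\} = 0$), the cross terms survive but the diagonal ones vanish:

$$\bar\psi_L\psi_L = \psi^\dagger P_L\gamma^0 P_L\psi = \bar\psi\,P_R P_L\,\psi = 0,$$

and similarly for $\bar\psi_R\psi_R$. Therefore

$$m\,\bar\psi\psi = m\left(\bar\psi_L\psi_R + \bar\psi_R\psi_L\right),$$

where the bar projects onto the opposite chirality (note: $\bar\psi_L = \psi^\dagger P_L\gamma^0 = \bar\psi P_R$, so the bar swaps chirality). The mass term is off-diagonal in the L-R basis: it converts a left-handed spinor into a right-handed one and vice versa. A massive Dirac fermion is the minimum relativistic-quantum description of a particle that has both chiralities mixing, equivalently a particle that travels slower than light.

This explicitly demonstrates how chirality and mass interact. For $m \neq 0$, neither $\psi_L$ nor $\psi_R$ is separately a solution of the Dirac equation; only their linear combination is. For $m = 0$, the two halves decouple and propagate as independent Weyl fermions.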
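
The projector algebra and the off-diagonal structure of the mass term can be checked on a random spinor. The sketch below assumes NumPy, the standard representation, $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, and $P_{L,R} = \tfrac12(1 \mp \gamma^5)$; the helper `bar` is a hypothetical name for the Dirac adjoint $\bar\psi = \psi^\dagger\gamma^0$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

P_L = 0.5 * (I4 - gamma5)
P_R = 0.5 * (I4 + gamma5)

# Orthogonal idempotents that sum to the identity
projectors_ok = (
    np.allclose(P_L @ P_L, P_L)
    and np.allclose(P_R @ P_R, P_R)
    and np.allclose(P_L @ P_R, 0)
    and np.allclose(P_L + P_R, I4)
)


def bar(x):
    """Dirac adjoint of a spinor stored as a 1-D array: x^dagger gamma^0."""
    return x.conj() @ gamma[0]


rng = np.random.default_rng(1)                       # fixed seed for reproducibility
psi = rng.normal(size=4) + 1j * rng.normal(size=4)   # arbitrary illustrative spinor
psi_L, psi_R = P_L @ psi, P_R @ psi

full = bar(psi) @ psi                                # psi-bar psi
cross = bar(psi_L) @ psi_R + bar(psi_R) @ psi_L      # chirality-mixing terms
same = bar(psi_L) @ psi_L + bar(psi_R) @ psi_R       # chirality-preserving terms

print(projectors_ok, np.isclose(full, cross), np.isclose(same, 0))
```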

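Chiral symmetry as a global $U(1)_V \times U(1)_A$. For $m = 0$, the Dirac Lagrangian

$$\mathcal{L} = \bar\psi\,i\gamma^\mu\partial_\mu\,\psi = \bar\psi_L\,i\gamma^\mu\partial_\mu\,\psi_L + \bar\psi_R\,i\gamma^\mu\partial_\mu\,\psi_R$$

(the kinetic term preserves chirality because $\gamma^\mu$ swaps chirality once and the Dirac bar swaps it once more, so the mixed terms $\bar\psi_L\gamma^\mu\partial_\mu\psi_R$ vanish) has two continuous global symmetries:

  1. Vector symmetry: $\psi \to e^{i\alpha}\psi$ (equivalently $\psi_L \to e^{i\alpha}\psi_L$, $\psi_R \to e^{i\alpha}\psi_R$). Conserved current $j^\mu = \bar\psi\gamma^\mu\psi$, conserved charge $Q = \int d^3x\,\psi^\dagger\psi$ (particle number, or electric charge after gauging).

  2. Axial (chiral) symmetry: $\psi \to e^{i\beta\gamma^5}\psi$ (equivalently $\psi_L \to e^{-i\beta}\psi_L$, $\psi_R \to e^{+i\beta}\psi_R$). Conserved current $j_5^\mu = \bar\psi\gamma^\mu\gamma^5\psi$, conserved charge $Q_5 = \int d^3x\,\psi^\dagger\gamma^5\psi$ (chirality).

The axial symmetry is generated by $\gamma^5$. Using $\{\gamma^5, \gamma^\mu\} = 0$, the massless Dirac operator $i\gamma^\mu\partial_\mu$ anticommutes with $\gamma^5$: if $\psi$ is a solution, so is $\gamma^5\psi$. The projectors $P_{L,R}$ decompose $\psi$ into left- and right-handed components that evolve independently. This is chiral symmetry.

A mass term $m\bar\psi\psi$ explicitly breaks chiral symmetry by coupling the two chiralities. The breaking is proportional to $m$; for light quarks ($m_u, m_d \ll \Lambda_{\mathrm{QCD}}$), chiral symmetry is approximate, and its breaking pattern produces the pions as (pseudo-)Goldstone bosons, the foundation of chiral perturbation theory.

In QCD, even for massless quarks, the axial $U(1)_A$ symmetry is anomalously broken by the quantum path integral measure (the Adler-Bell-Jackiw anomaly). The divergence of the axial current is $\partial_\mu j_5^\mu \propto g^2\,\epsilon^{\mu\nu\rho\sigma}\,\mathrm{tr}\!\left(F_{\mu\nu}F_{\rho\sigma}\right)$, with a convention-dependent numerical coefficient, which is nonzero for gauge-field configurations with nontrivial topology. This anomaly resolves the $U(1)_A$ problem (why the $\eta'$ meson is heavy despite near-chiral symmetry) and is connected to instanton physics and the strong CP problem.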

The chiral structure of the Dirac equation is also the origin of the V-A (vector minus axial) structure of the weak interaction, the neutrino mass problem (observed neutrino oscillations require mass, which breaks chiral symmetry for fermions that were originally placed in purely left-handed Weyl representations), and the anomaly cancellation constraints on fermion representations in grand unified theories.

Foldy-Wouthuysen transformation and the systematic non-relativistic expansion [Master]

The Pauli-equation derivation above keeps only the leading $1/m$ term and discards everything else. To get the systematic expansion (including the Darwin term, the spin-orbit coupling, and the relativistic kinetic correction) requires a controlled framework that removes the troublesome odd operators (those mixing large and small components) order by order in $1/m$. This is the Foldy-Wouthuysen transformation (1950) [Foldy-Wouthuysen 1950], the canonical bridge from the four-component Dirac picture to the two-component Pauli picture with all relativistic corrections accounted for.

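The transformation as an exponentiated rotation. The Dirac Hamiltonian splits naturally into even and odd parts with respect to $\beta$:

$$H = \beta m + \mathcal{E} + \mathcal{O}, \qquad [\beta, \mathcal{E}] = 0, \quad \{\beta, \mathcal{O}\} = 0 .$$

For minimal-coupled Dirac, $\mathcal{E} = eA^0$ (block-diagonal) and $\mathcal{O} = \boldsymbol\alpha\cdot\boldsymbol\pi$ (block-off-diagonal, mixing $\phi$ with $\chi$). The odd part is the obstruction to a clean two-component reduction.

Foldy and Wouthuysen sought a unitary transformation $U = e^{iS}$ (and correspondingly $\psi' = U\psi$) that eliminates $\mathcal{O}$ at leading order. Take

$$S = -\frac{i}{2m}\,\beta\,\mathcal{O} .$$

The operator $S$ is Hermitian because $\beta\mathcal{O} = -\mathcal{O}\beta$ (anticommutation) and the prefactor $-i$ flips the sign once more under Hermitian conjugation. The transformed Hamiltonian is

$$H' = e^{iS}\left(H - i\partial_t\right)e^{-iS} .$$

Expand the BCH series:

$$H' = H + i[S, H] + \frac{i^2}{2!}[S, [S, H]] + \frac{i^3}{3!}[S, [S, [S, H]]] + \dots - \dot{S} - \frac{i}{2}[S, \dot{S}] - \dots$$

The leading commutator is $i[S, \beta m] = -\mathcal{O}$, exactly cancelling the offending odd operator at order $m^0$. Higher commutators produce new even operators at order $1/m$, $1/m^2$, etc.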
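
The cancellation at leading order, and the exact free-particle version of the rotation, can be sketched numerically. The block below assumes NumPy, the standard representation, a free particle ($\mathcal{E} = 0$, $\mathcal{O} = \boldsymbol\alpha\cdot\mathbf{p}$) and an arbitrary illustrative momentum. For the free case the rotation can be done in closed form with $U = \cos\theta + \beta\,\boldsymbol\alpha\cdot\hat{\mathbf{p}}\,\sin\theta$ and $\tan 2\theta = |\mathbf{p}|/m$, which is the standard free-particle Foldy-Wouthuysen result; variable names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]

m = 1.0
p = np.array([0.3, -0.4, 1.2])                   # arbitrary illustrative momentum
O = sum(p_j * a_j for p_j, a_j in zip(p, alpha)) # odd operator alpha . p
H = O + m * beta                                 # free Dirac Hamiltonian

# Leading-order generator S = -(i / 2m) beta O
S = (-1j / (2 * m)) * beta @ O
S_is_hermitian = np.allclose(S, S.conj().T)
leading = 1j * (S @ (m * beta) - (m * beta) @ S) # i [S, beta m], expected to equal -O

# Exact free-particle rotation: U = cos(theta) + beta (alpha . p_hat) sin(theta)
p_norm = np.linalg.norm(p)
theta = 0.5 * np.arctan2(p_norm, m)              # tan(2 theta) = |p| / m
U = np.cos(theta) * I4 + np.sin(theta) * (beta @ O) / p_norm
H_fw = U @ H @ U.conj().T                        # transformed Hamiltonian
E = np.sqrt(p_norm**2 + m**2)

print(S_is_hermitian, np.allclose(leading, -O), np.allclose(H_fw, E * beta))
```

After the exact rotation the Hamiltonian is $\beta\sqrt{\mathbf{p}^2 + m^2}$: fully block-diagonal, with positive energies in the upper block and negative energies in the lower block.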

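The systematic expansion in powers of $1/m$. Carrying the BCH expansion to the required order, the transformed Hamiltonian becomes

$$H' = \beta\left(m + \frac{\mathcal{O}^2}{2m} - \frac{\mathcal{O}^4}{8m^3}\right) + \mathcal{E} - \frac{1}{8m^2}\left[\mathcal{O}, [\mathcal{O}, \mathcal{E}]\right] - \frac{i}{8m^2}\left[\mathcal{O}, \dot{\mathcal{O}}\right] + \mathcal{O}', \qquad \mathcal{O}' = \frac{\beta}{2m}[\mathcal{O}, \mathcal{E}] - \frac{\mathcal{O}^3}{3m^2} + \frac{i\beta\,\dot{\mathcal{O}}}{2m},$$

where the new odd operator $\mathcal{O}'$ is one order in $1/m$ smaller than $\mathcal{O}$. Iteration (applying a second FW rotation with $S' = -\frac{i}{2m}\beta\mathcal{O}'$) eliminates this odd operator and introduces a yet-smaller one of order $1/m^2$. Continuing the iteration order by order produces a unitary $U$ such that the final Hamiltonian is block-diagonal up to errors of higher order in $1/m$.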

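Reading off the physical terms. Specialize to the electron in an electrostatic potential, $A^\mu = (\Phi, \mathbf{0})$ (no $\mathbf{A}$), so that $\mathcal{E} = e\Phi$ and $\mathcal{O} = \boldsymbol\alpha\cdot\mathbf{p}$. Compute each operator in the standard representation:

  • $\beta\,\dfrac{\mathcal{O}^2}{2m} = \beta\,\dfrac{\mathbf{p}^2}{2m}$: the non-relativistic kinetic energy.

  • $-\beta\,\dfrac{\mathcal{O}^4}{8m^3} = -\beta\,\dfrac{\mathbf{p}^4}{8m^3}$: the relativistic kinetic correction (restoring $c$: $-\mathbf{p}^4/8m^3c^2$).

  • $-\dfrac{1}{8m^2}[\mathcal{O}, [\mathcal{O}, \mathcal{E}]]$ evaluates via $[\mathcal{O}, \mathcal{E}] = -ie\,\boldsymbol\alpha\cdot\nabla\Phi = ie\,\boldsymbol\alpha\cdot\mathbf{E}$, with $\mathbf{E} = -\nabla\Phi$ the electric field:

    • The double commutator splits into a symmetric piece and an antisymmetric piece. The symmetric piece gives the Darwin term $\dfrac{e}{8m^2}\nabla^2\Phi = -\dfrac{e}{8m^2}\nabla\cdot\mathbf{E}$ (a contact interaction that smears the electron position by its Compton wavelength).
    • The antisymmetric piece gives the spin-orbit term $-\dfrac{e}{4m^2}\,\boldsymbol\sigma\cdot(\mathbf{E}\times\mathbf{p})$. For a central potential $V(r) = e\Phi(r)$, $e\mathbf{E} = -\dfrac{dV}{dr}\,\hat{\mathbf{r}}$, and the spin-orbit coupling becomes the familiar $\dfrac{1}{2m^2}\dfrac{1}{r}\dfrac{dV}{dr}\,\mathbf{L}\cdot\mathbf{S}$, with the famous Thomas factor of $\tfrac12$ relative to the naive classical result emerging automatically (Thomas precession is no longer needed as an extra input).
  • For a magnetic field $\mathbf{B}$, the kinetic term picks up the Pauli magnetic moment as derived above: $-\dfrac{e}{2m}\,\beta\,\boldsymbol\sigma\cdot\mathbf{B}$, with $g = 2$.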

The Pauli equation with relativistic corrections. Restricting to the upper two components (now the only ones occupied in the absence of pair-production sources) and dropping the constant rest energy $m$, the effective non-relativistic Hamiltonian is

$$H_{\mathrm{FW}} = \frac{(\mathbf{p} - e\mathbf{A})^2}{2m} + e\Phi - \frac{\mathbf{p}^4}{8m^3} - \frac{e}{2m}\,\boldsymbol\sigma\cdot\mathbf{B} + \frac{e}{8m^2}\,\nabla^2\Phi - \frac{e}{4m^2}\,\boldsymbol\sigma\cdot(\mathbf{E}\times\mathbf{p}) .$$

Term-by-term: non-relativistic kinetic + electrostatic + relativistic kinetic correction + Pauli Zeeman + Darwin + spin-orbit. This is the complete non-relativistic limit of the Dirac equation to this order in $1/m$. Applied to the hydrogen atom with $e\Phi = -\alpha/r$, these corrections reproduce the entire hydrogen fine structure: the same level splittings Sommerfeld derived empirically in 1916 from the old quantum theory but now from a single first-principles equation. This is the second great success of the Dirac equation (after $g = 2$). The predicted $2P_{3/2}$-$2P_{1/2}$ splitting agrees closely with experiment, the Lamb shift between $2S_{1/2}$ and $2P_{1/2}$ being a QED radiative effect not captured by the FW reduction alone.
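
To give a sense of scale, the fine-structure splitting these terms produce can be evaluated with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: it uses the standard textbook result that the kinetic, Darwin, and spin-orbit corrections together shift the hydrogen level $(n, j)$ by $-\dfrac{m c^2 \alpha^4}{2n^4}\left(\dfrac{n}{j + 1/2} - \dfrac34\right)$, and approximate literature values for $\alpha$ and the electron rest energy. Neither the formula nor the constants are derived in the text above, and the function name is hypothetical.

```python
# Assumed inputs (approximate literature values, not taken from this unit)
ALPHA = 1 / 137.035999          # fine-structure constant
ELECTRON_REST_EV = 510998.95    # electron rest energy m c^2 in eV


def fine_structure_shift_ev(n, j):
    """Order-alpha^4 shift of hydrogen level (n, j) in eV, standard textbook formula."""
    return -(ELECTRON_REST_EV * ALPHA**4 / (2 * n**4)) * (n / (j + 0.5) - 0.75)


# Splitting between the n = 2 levels with j = 3/2 and j = 1/2
splitting_ev = fine_structure_shift_ev(2, 1.5) - fine_structure_shift_ev(2, 0.5)

# Compare with the unperturbed n = 2 binding energy, alpha^2 m c^2 / (2 n^2)
bohr_n2_ev = ELECTRON_REST_EV * ALPHA**2 / 8
ratio = splitting_ev / bohr_n2_ev

print(f"n = 2 fine-structure splitting: {splitting_ev:.3e} eV")
print(f"relative to n = 2 binding energy: {ratio:.3e}")
```

The splitting comes out at a few times $10^{-5}$ eV, suppressed by roughly $\alpha^2$ relative to the Bohr binding energy, which is why it is called fine structure.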

The FW transformation as a position-operator question. A deeper reading of the FW transformation: in the original four-component picture, the position operator $\mathbf{x}$ mixes positive- and negative-energy states, leading to the unphysical Zitterbewegung (rapid oscillation at frequency $\sim 2m$) when computing the expectation value of $\mathbf{x}$ on a positive-energy wave packet. The FW transformation diagonalises the Hamiltonian in the energy-sign basis, and the transformed position operator no longer mixes the two branches: it is the mean position of the wave packet without the Compton-scale jitter. Newton and Wigner (1949) had introduced this operator independently from a representation-theoretic standpoint (the unique position operator covariant under the Euclidean group acting on a positive-energy irreducible representation of the Poincaré group); Foldy-Wouthuysen recover the same operator via a constructive unitary transformation.

Hole theory, the Dirac sea, and the bridge to QFT second-quantisation [Master]

The single-particle Dirac equation has a foundational problem that the original 1928 paper did not resolve and that drove the next two decades of physics: the negative-energy spectrum is unbounded below, so the theory has no ground state if interpreted naively as a one-particle wave mechanics.

The unbounded-below spectrum. The free Dirac Hamiltonian has eigenvalues $\pm\sqrt{\mathbf{p}^2 + m^2}$, each doubly degenerate (Exercise 9). As $\mathbf{p}$ ranges over $\mathbb{R}^3$, the negative branch covers $(-\infty, -m]$. Any external perturbation (say, a photon) could in principle cause an electron sitting at $E = +m$ to transition to $E = -\sqrt{\mathbf{p}^2 + m^2}$, releasing $m + \sqrt{\mathbf{p}^2 + m^2}$ of energy. The single-particle Dirac electron would tumble downward indefinitely. This catastrophic instability is not observed; real electrons sit happily in atomic orbitals for cosmic timescales.

Hole theory (Dirac 1930). Dirac's resolution: postulate that in the actual vacuum, every negative-energy state is filled by an electron, forming an inert sea. The Pauli exclusion principle, which we now know applies to electrons, prevents any positive-energy electron from cascading downward: there is no empty negative-energy state to fall into. The transition from $E > 0$ to $E < 0$ would require simultaneously displacing an existing sea electron, which is energetically excluded.

A photon with energy $\ge 2m$ can promote a sea electron to a positive-energy state. The result is observable: a positive-energy electron and a missing slot ("hole") in the sea. The hole has:

  • Energy $+\sqrt{\mathbf{p}^2 + m^2}$ (because removing a state of energy $-\sqrt{\mathbf{p}^2 + m^2}$ from the sea adds $+\sqrt{\mathbf{p}^2 + m^2}$ to the total).
  • Charge $+|e|$ (because removing a charge $-|e|$ contributes net $+|e|$).
  • Momentum $-\mathbf{p}$ relative to the missing sea state of momentum $\mathbf{p}$.
  • Spin $\tfrac12$ with opposite spin projection to the missing sea state.

Dirac initially identified the hole with the proton (the only known positive charge), but Weyl and Oppenheimer immediately objected: the hole must have the same mass as the electron, and the proton is ~1836 times heavier. After two years of resistance, Dirac conceded in 1931 and predicted a new particle. Anderson's 1932 observation of the positron in a cloud chamber [Anderson 1932] confirmed the prediction in dramatic fashion.

Why hole theory failed as a literal physical picture. Hole theory, taken seriously, requires:

  • An infinite negative charge density of sea electrons (each carrying charge $-e$, summed over an infinite continuum). The observable physics is the deviation from this infinite reference, with the infinite charge absorbed as a renormalisation of the vacuum.
  • The Pauli exclusion principle to prevent infinite cascading. But the exclusion principle is a statistical property of fermions, not of electrons alone — bosonic relativistic particles (e.g., the pion, described by the Klein-Gordon equation) also have negative-energy solutions, and there is no exclusion principle to stop them.

The second point is decisive. Klein-Gordon's negative-energy problem cannot be solved by hole theory because pions are bosons; any number of them can pile into the same state, so a "filled sea" does not stabilise them. A different framework is needed — one that handles bosons and fermions on the same footing.

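The QFT resolution: the Dirac field as an operator. The modern treatment of the negative-energy problem is the second-quantisation programme. Promote $\psi$ from a $\mathbb{C}^4$-valued classical field to an operator-valued field $\hat\psi$ acting on a Fock space. Expand it in a basis of positive- and negative-frequency mode functions:

$$
\hat\psi(x) = \int \frac{d^3p}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{p}}}} \sum_{s=1,2} \left[ b_s(\mathbf{p})\, u_s(\mathbf{p})\, e^{-ip\cdot x} + d_s^\dagger(\mathbf{p})\, v_s(\mathbf{p})\, e^{+ip\cdot x} \right], \qquad p^0 = E_{\mathbf{p}} = \sqrt{|\mathbf{p}|^2 + m^2}.
$$

The operators $b_s(\mathbf{p})$ and $b_s^\dagger(\mathbf{p})$ annihilate and create electrons; the operators $d_s(\mathbf{p})$ and $d_s^\dagger(\mathbf{p})$ annihilate and create positrons. The key reinterpretation: the negative-frequency mode $v_s(\mathbf{p})\, e^{+ip\cdot x}$ that was a "negative-energy solution" in single-particle language is now accompanied by a positron creation operator $d_s^\dagger(\mathbf{p})$. Removing a negative-energy electron from the sea is rephrased as creating a positive-energy positron in the vacuum.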

The operators satisfy the canonical anticommutation relations (CAR):

$$
\{ b_s(\mathbf{p}), b_{s'}^\dagger(\mathbf{p}') \} = \{ d_s(\mathbf{p}), d_{s'}^\dagger(\mathbf{p}') \} = (2\pi)^3\, \delta^{(3)}(\mathbf{p} - \mathbf{p}')\, \delta_{ss'},
$$

with all other anticommutators vanishing. Anticommutators rather than commutators because $\hat\psi$ describes fermions; this is the input from the spin-statistics theorem (cross-link 12.13.02). The Hilbert space is the fermionic Fock space built from the positive-energy electron one-particle Hilbert space $\mathcal{H}_{e^-}$ and the positive-energy positron one-particle Hilbert space $\mathcal{H}_{e^+}$, both of which carry the irreducible positive-energy spinor representation (mass $m$, spin $\tfrac{1}{2}$) of the Poincaré group (Wigner 1939 classification).

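The Hamiltonian becomes non-negative. Computing $\hat H = \int d^3x\, \hat\psi^\dagger \left( -i\boldsymbol{\alpha}\cdot\nabla + \beta m \right) \hat\psi$ in this expansion and applying the CAR (using $d_s(\mathbf{p})\, d_s^\dagger(\mathbf{p}) = -d_s^\dagger(\mathbf{p})\, d_s(\mathbf{p}) + \text{const}$ and normal-ordering the result) gives

$$
\hat H = \int \frac{d^3p}{(2\pi)^3} \sum_{s} E_{\mathbf{p}} \left[ b_s^\dagger(\mathbf{p})\, b_s(\mathbf{p}) + d_s^\dagger(\mathbf{p})\, d_s(\mathbf{p}) \right].
$$

Every term is manifestly $\ge 0$: both number operators $b_s^\dagger(\mathbf{p})\, b_s(\mathbf{p})$ (electrons) and $d_s^\dagger(\mathbf{p})\, d_s(\mathbf{p})$ (positrons) have non-negative eigenvalues, and $E_{\mathbf{p}} \ge m > 0$. The vacuum state $|0\rangle$ (annihilated by all $b_s(\mathbf{p})$ and $d_s(\mathbf{p})$) is the unique ground state at energy zero. No Dirac sea is required: the negative-energy problem dissolves because the operator algebra rephrases the would-be negative-energy excitations as positron creations from a stable vacuum.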
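The reordering step can be made concrete for a single momentum and spin. The sketch below is a toy model under stated assumptions: one electron mode and one positron mode, represented as $4\times4$ matrices through a Jordan-Wigner construction, with the continuum delta function replaced by $\{d, d^\dagger\} = 1$. The names `b`, `d`, `anticomm`, and `mode_hamiltonians` are illustrative only.

```python
import numpy as np

# Single-mode fermionic annihilator in the basis (|0>, |1>): a|1> = |0>.
a = np.array([[0.0, 1.0], [0.0, 0.0]])
Z = np.diag([1.0, -1.0])   # parity string used by the Jordan-Wigner construction
I = np.eye(2)

# Two anticommuting modes: b (electron) and d (positron).
# All matrices are real, so the transpose .T plays the role of the dagger.
b = np.kron(a, I)
d = np.kron(Z, a)


def anticomm(x, y):
    """Anticommutator {x, y} = xy + yx."""
    return x @ y + y @ x


def mode_hamiltonians(E):
    """Return (H_raw, H_ordered) for one mode of energy E > 0.

    H_raw     = E (b^dag b - d d^dag)   before using {d, d^dag} = 1
    H_ordered = E (b^dag b + d^dag d)   after reordering, constant dropped
    """
    H_raw = E * (b.T @ b - d @ d.T)
    H_ordered = E * (b.T @ b + d.T @ d)
    return H_raw, H_ordered


if __name__ == "__main__":
    E = 1.5
    H_raw, H_ordered = mode_hamiltonians(E)
    print("{d, d^dag} = 1 :", np.allclose(anticomm(d, d.T), np.eye(4)))
    print("{b, d} = 0     :", np.allclose(anticomm(b, d), 0))
    print("raw spectrum     :", np.linalg.eigvalsh(H_raw))
    print("ordered spectrum :", np.linalg.eigvalsh(H_ordered))
```

In this toy model the two Hamiltonians differ only by the constant $E$, and the reordered one has spectrum $\{0, E, E, 2E\}$: the empty state is the ground state, and each occupied electron or positron mode costs $+E$.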

Crossing symmetry and the unification of particle and antiparticle. In the second-quantised picture, particle and antiparticle are no longer two species but two excitations of one field. Feynman's interpretation, equivalent to the operator picture, treats the positron as an electron propagating backward in time: the negative-frequency mode is reinterpreted as a forward-time wave for the conjugate particle. Crossing symmetry, the fact that an incoming particle of four-momentum $p$ can be traded for an outgoing antiparticle of four-momentum $-p$ in QFT scattering amplitudes (by analytic continuation), is the rigorous statement of this equivalence. The Klein paradox (Exercise 10) finds its proper explanation in this framework: the strong potential pair-creates electrons and positrons from the vacuum, and the "transmitted" wave is a real positron travelling away.

CPT and the spin-statistics theorem. Two structural results crown the second-quantised Dirac theory:

  1. CPT theorem (Lüders, Pauli, Schwinger; ~1954): any Lorentz-invariant local quantum field theory with a positive-definite metric on the Hilbert space is invariant under the combined operation $\Theta = CPT$ (charge conjugation $C$: electron $\leftrightarrow$ positron; parity $P$: $\mathbf{x} \to -\mathbf{x}$; time reversal $T$: $t \to -t$ with antiunitarity). For the Dirac field, $\Theta\, \psi(x)\, \Theta^{-1} = \eta\, \gamma^5 \psi^*(-x)$ with $\eta$ a convention-dependent phase, and the combined transformation is an exact symmetry of the free theory and of any interacting Lorentz-invariant theory. CPT is one of the most stringently tested predictions in physics; no violation has ever been observed (the current upper bound on the electron-positron mass difference is below $10^{-8}$ in relative terms).

  2. Spin-statistics theorem (Fierz 1939, Pauli 1940): in any Lorentz-invariant QFT, half-integer-spin fields must be quantised with anticommutators (fermions) and integer-spin fields must be quantised with commutators (bosons). Choosing the wrong statistics for a given spin makes the energy unbounded below or the algebra of observables ill-defined. The Dirac field, carrying the spin-$\tfrac{1}{2}$ representation of $\mathrm{Spin}(1,3) \cong \mathrm{SL}(2,\mathbb{C})$, must obey the CAR: the negative-energy problem cannot be resolved by bosonic quantisation of the Dirac field. The full proof uses the Wightman axiom of positivity of the inner product and is laid out in 12.13.02 and in Streater-Wightman 1964.
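The effect of choosing the wrong statistics for the Dirac field can be seen in a one-line reordering. The sketch below is schematic and rests on assumptions: it considers a single momentum and spin, uses the illustrative operator names $b$ (electron) and $d$ (positron), and normalises the brackets to $1$ instead of a delta function.

```latex
% Per-mode energy before reordering (schematic, one momentum and spin, E > 0):
%   H_mode = E ( b^\dagger b - d d^\dagger )
\begin{align*}
\{d, d^\dagger\} = 1 \ \text{(anticommutator)}:\quad
  -E\, d\, d^\dagger &= +E\, d^\dagger d - E
  &&\Rightarrow\quad H_{\text{mode}} = E\,\bigl(b^\dagger b + d^\dagger d\bigr) - E, \\
[d, d^\dagger] = 1 \ \text{(commutator)}:\quad
  -E\, d\, d^\dagger &= -E\, d^\dagger d - E
  &&\Rightarrow\quad H_{\text{mode}} = E\,\bigl(b^\dagger b - d^\dagger d\bigr) - E .
\end{align*}
```

With anticommutators the additive constant is removed by normal ordering and the remaining operator is non-negative. With commutators, $d^\dagger d$ has eigenvalues $0, 1, 2, \dots$ and enters with a minus sign, so adding quanta lowers the energy without limit.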

Synthesis. The Dirac equation has the structure it does because it sits at the intersection of three constraints: the relativistic energy-momentum relation, first-order time evolution, and a positive-definite probability density. Satisfying all three simultaneously forces (a) the Clifford algebra Cl(1,3) and its minimal 4-component representation, (b) the spinor representation of $\mathrm{Spin}(1,3) \cong \mathrm{SL}(2,\mathbb{C})$, which is exactly the spin-$\tfrac{1}{2}$ representation, (c) a negative-energy branch that must be reinterpreted, and (d) the prediction $g = 2$ for the magnetic moment, with all relativistic corrections systematically calculable via Foldy-Wouthuysen. Combined with the second-quantised Fock framework, these results identify the negative-energy problem with the existence of antiparticles, the chirality structure with the parity-violating weak interaction, and the spin-orbit, Darwin, and Zeeman terms with the hydrogen fine structure. This is the bridge between single-particle relativistic quantum mechanics and QFT: the Dirac equation extends wave mechanics by forcing the inclusion of antimatter, just as the Schrödinger equation extends classical mechanics by forcing quantisation. The pattern recurs throughout the Standard Model, where the requirement of relativistic QFT plus gauge symmetry predicts the entire particle zoo and its weak-interaction asymmetries.

Connections [Master]

  • Special relativity 10.05.01 pending provides the Minkowski metric and Lorentz transformations that the Dirac equation is built on. The Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$ is the algebraic encoding of the metric signature; the gamma matrices are its matrix representation. Without the relativity framework, there is no Dirac equation.

  • Spin and angular momentum 12.05.01 pending emerge from the Dirac equation rather than being postulated. The orbital angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ is not separately conserved; the conserved quantity is $\mathbf{J} = \mathbf{L} + \mathbf{S}$, where $\mathbf{S}$ is the spin operator built from the gamma matrices. The non-relativistic limit recovers the Pauli spin matrices.

  • Schrödinger equation 12.03.01 pending is the non-relativistic, single-component limit of the Dirac equation. The Dirac equation reduces to the Pauli equation (a two-component Schrödinger-type equation with spin) in the $v \ll c$ limit; the Pauli equation reduces to the Schrödinger equation when the magnetic field vanishes.

  • Path integral for the Dirac field 12.10.01 pending replaces the wave-function picture with a sum-over-histories. Fermionic path integrals require Grassmann variables (anticommuting numbers) because the Dirac field describes fermions. The path-integral formulation makes the connection to QED scattering amplitudes systematic.

  • Clifford algebras 03.09.08 are the mathematical substrate. The gamma matrices generate $\mathrm{Cl}(1,3)$; the classification of Clifford algebras in the periodicity theorem (Bott periodicity) explains why spinors in different spacetime dimensions have different sizes (2-component in $d = 3$, 4-component in $d = 4$, etc.).

  • Klein-Gordon equation 12.11.00 pending is the relativistic wave equation that comes from "quantising" $E^2 = |\mathbf{p}|^2 + m^2$ directly. Every Dirac solution satisfies the Klein-Gordon equation, but the Dirac equation is a stronger constraint that resolves the negative-probability problem and accounts for the spin degeneracy.

  • Quantum electrodynamics builds on the Dirac equation as the matter sector. The QED Lagrangian couples the Dirac field to the Maxwell field via the covariant derivative $D_\mu = \partial_\mu + ieA_\mu$. The Dirac propagator, vertex factor $-ie\gamma^\mu$, and the gamma-matrix trace technology developed above are the computational ingredients of all QED calculations.

  • CPT theorem and the spin-statistics theorem are deep structural results of relativistic QFT that apply to the Dirac field. The CPT theorem guarantees that the combined operation of charge conjugation, parity, and time reversal is an exact symmetry. The spin-statistics theorem guarantees that half-integer spin fields (Dirac) are fermions and integer spin fields are bosons.
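Several of the algebraic statements in the list above can be checked numerically. The following is a minimal sketch, assuming the Dirac representation, the mostly-minus metric $\eta = \mathrm{diag}(1,-1,-1,-1)$, and $\hbar = c = 1$; all function names are illustrative. It verifies the Clifford relation, the fact that $H^2 = (|\mathbf{p}|^2 + m^2)\,\mathbb{1}$ (so every Dirac solution obeys the Klein-Gordon dispersion relation), and the spin commutator $[H, S_k] = i(\boldsymbol{\alpha} \times \mathbf{p})_k$. The orbital commutator $[H, L_k] = -i(\boldsymbol{\alpha} \times \mathbf{p})_k$ follows from $[x_a, p_b] = i\delta_{ab}$ and is not computed here; the two cancel, which is why $\mathbf{J} = \mathbf{L} + \mathbf{S}$ is conserved.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac representation (assumed): gamma^0 = beta, gamma^k = [[0, s_k], [-s_k, 0]].
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [gamma0] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])             # mostly-minus signature (assumed)

alpha = [gamma0 @ g for g in gammas[1:]]            # alpha^k = gamma^0 gamma^k
spin = [0.5 * np.block([[s, Z2], [Z2, s]]) for s in sigma]   # S_k = Sigma_k / 2


def clifford_ok():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all index pairs."""
    I4 = np.eye(4)
    return all(
        np.allclose(gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu],
                    2 * eta[mu, nu] * I4)
        for mu in range(4) for nu in range(4)
    )


def dirac_h(p, m):
    """Free Dirac Hamiltonian H = alpha . p + beta m."""
    return sum(pk * ak for pk, ak in zip(p, alpha)) + m * gamma0


def klein_gordon_ok(p, m):
    """Check H^2 = (|p|^2 + m^2) * identity."""
    H = dirac_h(p, m)
    return np.allclose(H @ H, (np.dot(p, p) + m**2) * np.eye(4))


def spin_commutator_residual(p, m):
    """Largest entry of [H, S_k] - i (alpha x p)_k over k = 1, 2, 3."""
    H = dirac_h(p, m)
    worst = 0.0
    for k in range(3):
        i, j = (k + 1) % 3, (k + 2) % 3
        cross_k = alpha[i] * p[j] - alpha[j] * p[i]
        comm = H @ spin[k] - spin[k] @ H
        worst = max(worst, float(np.abs(comm - 1j * cross_k).max()))
    return worst


if __name__ == "__main__":
    p, m = [0.3, -0.4, 1.2], 1.0
    print("Clifford relation holds:", clifford_ok())
    print("H^2 = E^2 identity holds:", klein_gordon_ok(p, m))
    print("spin commutator residual:", spin_commutator_residual(p, m))
```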

Historical & philosophical context [Master]

Dirac formulated his equation in 1928 [Dirac 1928], published in two papers in the Proceedings of the Royal Society A 117 and A 118. The motivation was to find a relativistic wave equation that was first-order in time (unlike Klein-Gordon) and that yielded a positive-definite probability density. Dirac's starting point was the observation that the Klein-Gordon equation's second-order time derivative prevented a conserved positive-definite probability current; a first-order equation would fix this.

The equation immediately yielded the correct fine-structure splitting of the hydrogen atom (the original empirical success) and the $g = 2$ magnetic moment. But it also produced negative-energy solutions that had no obvious physical interpretation. Dirac wrestled with this for two years before proposing the hole theory in 1930: the negative-energy states form a filled sea, and a hole in this sea is a new particle with positive energy and positive charge. Dirac initially identified this particle with the proton, hoping to explain the electron-proton mass asymmetry. Weyl and Oppenheimer immediately criticised the proposal on the grounds that the hole would have the same mass as the electron. Dirac conceded in 1931 and predicted a particle with the electron's mass but opposite charge.

Anderson's discovery of the positron in 1932 [Anderson 1932] — observed as a particle of electron mass but opposite curvature in a cloud chamber exposed to cosmic rays — was the dramatic confirmation. It was the first instance of a particle predicted mathematically before its experimental detection, and it established the concept of antimatter as a physical reality rather than a mathematical artifact.

The philosophical implications are deep. The Dirac equation showed that combining quantum mechanics with special relativity does not merely refine existing predictions; it predicts new particles. Antimatter is not an optional extra but a structural consequence of relativistic quantum mechanics. This pattern repeated throughout the 20th century: the Dirac equation's structure generalised to the Standard Model, where the requirement of relativistic quantum field theory, combined with gauge symmetry, predicts the existence of the $W$ and $Z$ bosons, the Higgs boson, and the entire particle zoo.

The Klein paradox (1929) showed early on that the single-particle interpretation of the Dirac equation fails in strong external potentials. The resolution — pair production from the vacuum — required the full machinery of quantum field theory (second quantisation, the Dirac field as an operator-valued distribution). This transition from wave mechanics to field theory, forced by the Dirac equation's own structure, is one of the central conceptual shifts in 20th-century physics.

The systematic non-relativistic reduction was completed by Foldy and Wouthuysen in 1950 [Foldy-Wouthuysen 1950], whose canonical unitary transformation diagonalises the Dirac Hamiltonian in the energy-sign basis order by order in $1/m$ and identifies the kinetic, Darwin, spin-orbit, and Pauli-Zeeman terms, including the Thomas factor of one half. Schwinger's 1948 one-loop calculation of the anomalous magnetic moment, $a_e = \alpha/2\pi$, inaugurated quantum electrodynamics as a precision-predictive theory. Decades of Penning-trap measurements have since pushed the agreement to about twelve significant figures: the most precise theory-experiment match in all of physics, and the empirical benchmark against which any extension of the Standard Model (supersymmetry, composite electrons, extra dimensions) is calibrated.

Bibliography [Master]

Primary literature:

  • Dirac, P. A. M., "The Quantum Theory of the Electron", Proc. Roy. Soc. A 117 (1928), 610–624. [Originator paper.]
  • Dirac, P. A. M., "The Quantum Theory of the Electron. Part II", Proc. Roy. Soc. A 118 (1928), 351–361.
  • Dirac, P. A. M., "A Theory of Electrons and Protons", Proc. Roy. Soc. A 126 (1930), 360–365. [Hole theory.]
  • Anderson, C. D., "The Positive Electron", Phys. Rev. 43 (1933), 491. [Positron discovery.]
  • Klein, O., "Die Reflexion von Elektronen an einem Potentialsprung nach der relativistischen Dynamik von Dirac", Z. Phys. 53 (1929), 157–165. [Klein paradox.]
  • Foldy, L. L. & Wouthuysen, S. A., "On the Dirac Theory of Spin 1/2 Particles and its Non-Relativistic Limit", Phys. Rev. 78 (1950), 29–36. [Canonical FW transformation, Darwin term, spin-orbit with Thomas factor.]
  • Newton, T. D. & Wigner, E. P., "Localized States for Elementary Systems", Rev. Mod. Phys. 21 (1949), 400–406. [The mean-position operator on positive-energy irreducible representations.]
  • Schwinger, J., "On Quantum-Electrodynamics and the Magnetic Moment of the Electron", Phys. Rev. 73 (1948), 416. [Leading g-2 correction.]
  • Fierz, M., "Über die relativistische Theorie kräftefreier Teilchen mit beliebigem Spin", Helv. Phys. Acta 12 (1939), 3–37. [Precursor to spin-statistics.]
  • Pauli, W., "The Connection Between Spin and Statistics", Phys. Rev. 58 (1940), 716–722. [Spin-statistics theorem.]
  • Lüders, G., "On the Equivalence of Invariance under Time Reversal and under Particle-Antiparticle Conjugation for Relativistic Field Theories", Kgl. Dan. Vid. Selsk. Mat.-fys. Medd. 28 No. 5 (1954). [CPT theorem.]

Textbooks and monographs:

  • Dirac, P. A. M., The Principles of Quantum Mechanics, 4th ed. (Oxford, 1958), Ch. XI. [The master's own exposition.]
  • Bjorken, J. D. & Drell, S. D., Relativistic Quantum Mechanics (McGraw-Hill, 1964). [Standard reference for the single-particle Dirac theory.]
  • Bjorken, J. D. & Drell, S. D., Relativistic Quantum Fields (McGraw-Hill, 1965). [QFT sequel.]
  • Peskin, M. E. & Schroeder, D. V., An Introduction to Quantum Field Theory (Westview, 1995), Ch. 3. [Modern QFT textbook treatment of the Dirac field.]
  • Weinberg, S., The Quantum Theory of Fields, Vol. I (Cambridge, 1995), Ch. 5. [Rigorous derivation from Wigner's classification of Poincaré irreps.]
  • Griffiths, D. J., Introduction to Elementary Particles, 2nd ed. (Wiley-VCH, 2008), Ch. 7. [Accessible intermediate-level treatment.]
  • Feynman, R. P., QED: The Strange Theory of Light and Matter (Princeton, 1985). [Popular-level exposition; Beginner anchor.]
  • Tong, D., Quantum Field Theory (DAMTP Cambridge lecture notes), §4. [Clear lecture-note exposition.]
  • Itzykson, C. & Zuber, J.-B., Quantum Field Theory (McGraw-Hill, 1980), Ch. 2. [Comprehensive reference on gamma-matrix technology.]
  • Greiner, W., Relativistic Quantum Mechanics: Wave Equations, 3rd ed. (Springer, 2000). [Detailed worked examples.]

Wave 3 unit, produced 2026-05-19. Status: draft pending Tyler review and external QM reviewer per PHYSICS_PLAN §6.