12.11.01 · quantum / relativistic-qm

Dirac equation and relativistic spin

draft · 3 tiers · Lean: none · pending prereqs

Anchor (Master): Dirac, The Principles of Quantum Mechanics, 4e (1958), Ch. XI; Bjorken & Drell, Relativistic Quantum Mechanics (1964)

Intuition [Beginner]

The Schrödinger equation works well for slow particles. It treats time and space differently: one time derivative on the left, two space derivatives on the right. But special relativity 10.05.01 pending says space and time sit on equal footing. If you try to fix this by writing down $E^{2} = p^{2} + m^{2}$ as a quantum equation (the Klein-Gordon equation), you get a second-order time derivative — and that causes trouble. Probability can go negative. The equation seems to admit negative-energy solutions that have no physical interpretation.

Dirac's insight in 1928 was to insist on a first-order equation in time. Not $E^{2}$ , but $E$ . The price: the wave function can no longer be a single complex number at each point. It must be a column of four numbers — a four-component spinor, written $ψ$ . The equation involves $4 \times 4$ matrices called gamma matrices (denoted $γ^{0}, γ^{1}, γ^{2}, γ^{3}$ ) that encode the geometry of spacetime.

In condensed notation, the Dirac equation reads "the gamma matrices, times the spacetime gradients of $ψ$ , minus $m ψ$ , equals zero" — with a convention where $c = ℏ = 1$ . The sum runs over the four spacetime directions and produces one matrix equation coupling all four components of $ψ$ .

Two things fall out of this equation that were not put in by hand.

First: spin. The angular momentum of a Dirac particle splits into orbital plus an extra piece — intrinsic spin — that comes from the matrix structure alone. The electron has spin $1/2$ because the Dirac equation forces it, not because someone added spin as an extra assumption 12.05.01 pending.

Second: antimatter. The four-component spinor has two solutions with positive energy (spin-up and spin-down electrons) and two with negative energy. Dirac interpreted the negative-energy states as an already-full "sea" of electrons; a missing electron from this sea behaves as a particle with opposite charge, the positron. Anderson discovered the positron in 1932, confirming Dirac's prediction. In modern quantum field theory the Dirac sea is replaced by a cleaner picture: the negative-energy solutions are reinterpreted as positive-energy antiparticles travelling forward in time.

The Dirac equation also predicts the electron's magnetic moment. For a point particle with spin, the naive ratio of magnetic moment to angular momentum gives $g = 1$ . The Dirac equation gives $g = 2$ . This was a stunning agreement with experiment, and the small deviation from exactly 2 (the anomalous magnetic moment) is one of the most precisely tested predictions in all of physics.

Visual [Beginner]

Picture the energy spectrum of the Dirac equation for a free particle at rest. The horizontal axis is energy $E$ ; the vertical axis is not needed. There are two allowed values: $E = + m$ and $E = - m$ (in natural units where $c = 1$ ). Each value is doubly degenerate — spin-up and spin-down.

The Dirac sea picture: every negative-energy state is filled by default. An incoming photon can knock one of these negative-energy electrons up to a positive-energy state. The result is a visible electron (the promoted particle) plus a "hole" in the negative-energy sea — the hole behaves as a positively charged particle with the same mass. This is pair production: photon goes in, electron-positron pair comes out.

The modern view drops the sea entirely. The Dirac equation, reinterpreted as a quantum field equation, has four independent particle states: two electron polarizations and two positron polarizations. The negative sign in the energy is absorbed by redefining the creation operators for antiparticles.

Worked example [Beginner]

Solve the Dirac equation for a free particle at rest — no momentum, $p = 0$ .

With $p = 0$ , all spatial variation vanishes and the Dirac equation reduces to $i γ^{0} (d ψ / d t) = m ψ$ . Look for solutions of the form $ψ (t) = u e^{- i E t}$ where $u$ is a constant four-component spinor. Substituting gives $γ^{0} E u = m u$ , or $(γ^{0} E - m) u = 0$ .

Using the Dirac (standard) representation where $γ^{0}$ is block-diagonal,

γ^{0} = \begin{pmatrix} I & 0 \\ 0 & - I \end{pmatrix},

with $I$ the $2 \times 2$ identity, the equation splits into two upper and two lower components:

(E - m) u_{A} = 0, (- E - m) u_{B} = 0,

where $u = (u_{A}, u_{B})^{T}$ with $u_{A}$ and $u_{B}$ each having two components.

From these two equations: either $E = m$ and $u_{B} = 0$ (two solutions, spin-up and spin-down), or $E = - m$ and $u_{A} = 0$ (two more solutions, again spin-up and spin-down).

The four solutions are:

Energy	$u_{A}$	$u_{B}$	Interpretation
$+ m$	$(1, 0)^{T}$	$(0, 0)^{T}$	Electron, spin up
$+ m$	$(0, 1)^{T}$	$(0, 0)^{T}$	Electron, spin down
$- m$	$(0, 0)^{T}$	$(1, 0)^{T}$	Positron, spin up
$- m$	$(0, 0)^{T}$	$(0, 1)^{T}$	Positron, spin down

The positive-energy solutions describe electrons at rest. The negative-energy solutions, reinterpreted, describe positrons — same mass, opposite charge. The splitting into two pairs of two is the origin of the four-component spinor structure: two for particle spin states, two for antiparticle spin states.
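The rest-frame result can be checked numerically. The sketch below assumes NumPy and an illustrative mass $m = 1$ in natural units; it diagonalises the rest-frame Hamiltonian $m γ^{0}$, which follows from multiplying $i γ^{0} (d ψ / d t) = m ψ$ by $γ^{0}$. The variable names are illustrative only.

```python
import numpy as np

m = 1.0  # mass in natural units (illustrative value)

# gamma^0 in the Dirac (standard) representation: diag(I, -I)
gamma0 = np.diag([1.0, 1.0, -1.0, -1.0])

# At rest, i d(psi)/dt = m * gamma^0 * psi, so the allowed energies
# are the eigenvalues of m * gamma^0.
H_rest = m * gamma0
energies, spinors = np.linalg.eigh(H_rest)  # eigenvalues in ascending order

print(energies)  # two states at -m, two states at +m
```

The columns of `spinors` are the four constant spinors $u$; each energy appears twice, matching the two spin states per sign.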

The magnetic moment prediction: applying the Dirac equation to an electron in a magnetic field and taking the non-relativistic limit yields the Pauli equation with gyromagnetic ratio $g = 2$ . The derivation is in the Intermediate tier below.

Check your understanding [Beginner]

Formal definition [Intermediate+]

We work in natural units $ℏ = c = 1$ and adopt the mostly-minus metric signature $η^{μν} = diag (+ 1, - 1, - 1, - 1)$ . Greek indices $μ, ν$ run over $0, 1, 2, 3$ ; Latin indices $i, j$ run over $1, 2, 3$ . The spacetime coordinate is $x^{μ} = (t, x)$ and $\partial_{μ} = (\partial_{t}, \nabla)$ .

The Dirac equation for a free spin- $1/2$ particle of mass $m$ is

(i γ^{μ} \partial_{μ} - m) ψ (x) = 0

where $ψ (x) \in C^{4}$ is a four-component Dirac spinor and the gamma matrices $γ^{μ}$ ( $μ = 0, 1, 2, 3$ ) are $4 \times 4$ complex matrices satisfying the Clifford algebra 03.09.08

{γ^{μ}, γ^{ν}} := γ^{μ} γ^{ν} + γ^{ν} γ^{μ} = 2 η^{μν} 1_{4} .

This anticommutation relation is the load-bearing algebraic structure. It ensures that applying the conjugate operator $(i γ^{μ} \partial_{μ} + m)$ to the Dirac operator recovers the Klein-Gordon operator, up to an overall sign:

(i γ^{μ} \partial_{μ} + m) (i γ^{ν} \partial_{ν} - m) ψ = - (γ^{μ} γ^{ν} \partial_{μ} \partial_{ν} + m^{2}) ψ = - (η^{μν} \partial_{μ} \partial_{ν} + m^{2}) ψ = - (\partial_{t}^{2} - \nabla^{2} + m^{2}) ψ = 0.

The cross terms $\pm i m γ^{μ} \partial_{μ}$ cancel in the first step. The step from the second to the third expression uses $γ^{μ} γ^{ν} + γ^{ν} γ^{μ} = 2 η^{μν}$: because $\partial_{μ} \partial_{ν}$ is symmetric, only the symmetric part $\frac{1}{2} {γ^{μ}, γ^{ν}} = η^{μν}$ of the matrix product survives. So every solution of the Dirac equation is also a solution of the Klein-Gordon equation, but not vice versa: the Dirac equation is a stronger constraint.

Standard (Dirac) representation. A concrete realisation of the gamma matrices is

γ^{0} = \begin{pmatrix} I & 0 \\ 0 & - I \end{pmatrix}, \quad γ^{i} = \begin{pmatrix} 0 & σ^{i} \\ - σ^{i} & 0 \end{pmatrix},

where $σ^{i}$ ( $i = 1, 2, 3$ ) are the Pauli matrices 12.05.01 pending. One verifies directly that ${γ^{0}, γ^{0}} = 2 I$ and ${γ^{i}, γ^{i}} = - 2 I$ , with all mixed anticommutators vanishing, matching $η^{μν}$ .
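The anticommutators can also be verified numerically. The following is a minimal sketch, assuming NumPy and the Bjorken-Drell form of the Dirac representation with $σ^{i}$ in the upper-right block of $γ^{i}$; the helper name `anticommutator` is illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac representation (assumed form): gamma^0 = diag(I, -I),
# gamma^i = [[0, sigma^i], [-sigma^i, 0]]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # mostly-minus metric


def anticommutator(a, b):
    return a @ b + b @ a


# Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} * identity for all 16 index pairs.
clifford_ok = all(
    np.allclose(anticommutator(gamma[mu], gamma[nu]), 2 * eta[mu, nu] * np.eye(4))
    for mu in range(4)
    for nu in range(4)
)
print(clifford_ok)
```

Flipping the overall sign of all three $γ^{i}$ leaves every anticommutator unchanged, so this check alone does not distinguish the two sign conventions.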

Adjoint spinor and current conservation. Define the Dirac adjoint $\bar{ψ} := ψ^{†} γ^{0}$ and the gamma-five matrix $γ^{5} := i γ^{0} γ^{1} γ^{2} γ^{3}$ , which anticommutes with all $γ^{μ}$ . The vector current $j^{μ} := \bar{ψ} γ^{μ} ψ$ is conserved: $\partial_{μ} j^{μ} = 0$ . This is the probability-current conservation law. The charge $Q = \int j^{0} d^{3} x = \int ψ^{†} ψ d^{3} x$ is positive definite and conserved, resolving the negative-probability problem of the Klein-Gordon equation.

Plane-wave solutions. For a free particle with four-momentum $p^{μ} = (E, p)$ , write $ψ (x) = u (p) e^{- i p \cdot x}$ (positive frequency) or $ψ (x) = v (p) e^{+ i p \cdot x}$ (negative frequency). The spinors $u (p)$ and $v (p)$ satisfy

(γ^{μ} p_{μ} - m) u (p) = 0, (γ^{μ} p_{μ} + m) v (p) = 0.

The positive-energy condition $p^{0} = E = + \sqrt{∣ p ∣^{2} + m^{2}}$ picks out the electron solutions; the negative-frequency spinors $v (p)$ describe positrons of momentum $p$ and energy $E$ . There are two linearly independent $u$ -spinors and two linearly independent $v$ -spinors, corresponding to the two spin polarizations at each energy.
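As an illustration, the sketch below builds explicit positive-energy spinors and checks that they satisfy $(γ^{μ} p_{μ} - m) u (p) = 0$. The closed form $u = \sqrt{E + m} \, (χ, \; σ \cdot p \, χ / (E + m))^{T}$ is the standard textbook expression, not stated in this unit, and it assumes the Dirac representation with $σ^{i}$ in the upper-right block of $γ^{i}$. The momentum value and the helper name `u_spinor` are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

m = 1.0
p = np.array([0.3, -0.4, 1.2])   # illustrative three-momentum
E = np.sqrt(p @ p + m**2)        # positive-energy branch

sigma_dot_p = sum(p[i] * sigma[i] for i in range(3))
# gamma^mu p_mu = gamma^0 E - gamma . p in the mostly-minus metric
p_slash = E * gamma[0] - sum(p[i] * gamma[i + 1] for i in range(3))


def u_spinor(chi):
    """Assumed textbook form of the positive-energy spinor for two-spinor chi."""
    return np.sqrt(E + m) * np.concatenate((chi, sigma_dot_p @ chi / (E + m)))


residuals, norms = [], []
for chi in (np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)):
    u = u_spinor(chi)
    residuals.append(np.linalg.norm((p_slash - m * np.eye(4)) @ u))
    norms.append((u.conj() @ gamma[0] @ u).real)  # u-bar u

print(max(residuals), norms)
```

With this normalisation the invariant $\bar{u} u$ comes out as $2 m$ for both spin states.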

Lorentz covariance. Under a Lorentz transformation $Λ^{μ}_{ν}$ 10.05.01 pending, the spinor transforms as $ψ (x) \to S (Λ) ψ (Λ^{- 1} x)$ where $S (Λ)$ is a $4 \times 4$ matrix satisfying $S (Λ)^{- 1} γ^{μ} S (Λ) = Λ^{μ}_{ν} γ^{ν}$ . The existence of such $S$ is guaranteed by the Clifford algebra. This makes the Dirac equation Lorentz covariant: if $ψ$ solves it in one frame, $S (Λ) ψ$ solves it in the transformed frame.

The non-relativistic limit and the magnetic moment

For an electron in an electromagnetic field (minimal coupling $p_{μ} \to p_{μ} - e A_{μ}$ ), write $ψ = e^{- im t} \tilde{ψ}$ to peel off the rest-mass phase and take $∣ p ∣/ m ≪ 1$ . The upper two components of $\tilde{ψ}$ (the "large" components) satisfy the Pauli equation

i \partial_{t} ϕ = [\frac{( σ \cdot ( p - e A ) ) ^{2}}{2 m} + e V] ϕ,

where $ϕ$ is a two-component Pauli spinor. Expanding the squared term using $σ^{i} σ^{j} = δ^{ij} + i ϵ^{ij k} σ^{k}$ yields, besides the kinetic term $( p - e A )^{2} / 2 m$ , the spin term $- \frac{e}{2 m} σ \cdot B$ . This has the form $- μ \cdot B$ with magnetic moment $μ = \frac{e}{2 m} σ = 2 \cdot \frac{e}{2 m} S$ for spin $S = σ /2$ , and hence $g = 2$ . This is the Dirac prediction: the magnetic moment comes out of the equation without being put in by hand.
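The Pauli-matrix identity used in this expansion is equivalent to $(σ \cdot a) (σ \cdot b) = a \cdot b + i σ \cdot (a \times b)$ for ordinary (commuting) vectors. The sketch below checks that form numerically, assuming NumPy; the vectors are arbitrary test values. For the operator $p - e A$ the cross product of the vector with itself does not vanish, which is where the $σ \cdot B$ term originates.

```python
import numpy as np

# Pauli matrices stacked into an array of shape (3, 2, 2)
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a real 3-vector v."""
    return np.einsum("i,ijk->jk", v, sigma)


rng = np.random.default_rng(0)
a = rng.normal(size=3)
b = rng.normal(size=3)

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = (a @ b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))

identity_holds = np.allclose(lhs, rhs)
print(identity_holds)
```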

Key theorem with proof [Intermediate+]

Theorem (Antiparticles from the Dirac equation). The free Dirac equation admits both positive-energy and negative-energy plane-wave solutions. The negative-energy solutions, when reinterpreted as positive-energy states of a particle with opposite charge, predict the existence of antiparticles with the same mass as the original particle but opposite electric charge.

Proof. The Dirac equation $(i γ^{μ} \partial_{μ} - m) ψ = 0$ admits plane-wave solutions $ψ (x) = w (p) e^{- i p \cdot x}$ where $w (p)$ is a constant spinor. Substitution yields $(γ^{μ} p_{μ} - m) w = 0$ . Multiplying on the left by $(γ^{ν} p_{ν} + m)$ gives $(p^{2} - m^{2}) w = 0$ , where $p^{2} = p_{μ} p^{μ} = E^{2} - ∣ p ∣^{2}$ .

Hence $E^{2} = ∣ p ∣^{2} + m^{2}$ , which gives two branches:

E = + \sqrt{∣ p ∣^{2} + m^{2}} (positive energy)

E = - \sqrt{∣ p ∣^{2} + m^{2}} (negative energy) .

Each branch supports two independent spinor solutions (two spin polarizations), giving four solutions total.

The negative-energy branch appears unphysical. Dirac's resolution (1930): for electrons, assume all negative-energy states are filled (the Dirac sea). The Pauli exclusion principle prevents double occupancy. A photon with energy $\geq 2 m$ can promote a sea electron to a positive-energy state, leaving a hole. The hole has:

Energy $+ m$ (removing a negative-energy electron adds positive energy);
Charge $+ e$ (removing charge $- e$ leaves net $+ e$ );
The same mass $m$ ;
Spin $1/2$ with two polarizations.

This hole is the positron, the electron's antiparticle.

The modern QFT resolution ^{[Peskin & Schroeder 1995]} dispenses with the sea. One quantises the Dirac field by expanding $ψ (x)$ in terms of creation and annihilation operators:

ψ (x) = \sum_{s} \int \frac{d^{3} p}{(2 π)^{3}} \frac{1}{\sqrt{2 E_{p}}} (a_{p}^{s} u^{s} (p) e^{- i p \cdot x} + b_{p}^{s †} v^{s} (p) e^{+ i p \cdot x}) .

The operators $a_{p}^{s}$ annihilate electrons; the operators $b_{p}^{s †}$ create positrons. The negative-frequency exponential $e^{+ i p \cdot x}$ accompanies the positron creation operator, and the field expansion is consistent with the positive-definite (normal-ordered) Hamiltonian $H = \sum_{s} \int \frac{d^{3} p}{(2 π)^{3}} E_{p} (a_{p}^{s †} a_{p}^{s} + b_{p}^{s †} b_{p}^{s})$ . No sea is required; the antiparticle interpretation is built into the operator algebra. $□$

Corollary. Every charged spin- $1/2$ particle has a corresponding antiparticle with the same mass and spin but opposite charge. The positron ( $e^{+}$ ) is the antiparticle of the electron ( $e^{-}$ ); the antiproton ( $\bar{p}$ ) is the antiparticle of the proton ( $p$ ).

The experimental confirmation of Dirac's theory ^{[Dirac 1928]} came in 1932, when Anderson observed positron tracks in a cloud chamber exposed to cosmic rays. The track curvature in a magnetic field showed a particle with the electron's mass but opposite charge.

Worked example: magnetic moment from the non-relativistic limit

For a free Dirac particle at rest, the solutions split into two positive-energy spinors $u^{(1)}, u^{(2)}$ and two negative-energy spinors $v^{(1)}, v^{(2)}$ . In the standard representation with $E = m$ :

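u^{(1)} = \sqrt{2 m} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad u^{(2)} = \sqrt{2 m} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad v^{(1)} = \sqrt{2 m} \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad v^{(2)} = \sqrt{2 m} \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} .

The normalisation factor $\sqrt{2 m}$ is chosen so that $\bar{u}^{(r)} u^{(s)} = 2 m δ^{r s}$ and $\bar{v}^{(r)} v^{(s)} = - 2 m δ^{r s}$ .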

The magnetic moment operator for the Dirac particle is read off from the interaction Hamiltonian with an external field. The coupling $- e \bar{ψ} γ^{μ} A_{μ} ψ$ yields, in the non-relativistic limit, the Pauli term $- μ \cdot B$ with $μ = (e /2 m) Σ$ where $Σ^{i} = \frac{1}{2} ϵ^{ij k} σ^{j k}$ is the $4 \times 4$ spin matrix. Evaluating in the standard representation, $Σ = diag (σ, σ)$ , so the spin operator is $S = (ℏ/2) Σ$ and $μ = g (e /2 m) S$ with $g = 2$ ; the magnitude of the moment is $(e /2 m) (ℏ/2) \cdot 2 = e ℏ/ (2 m)$ , one Bohr magneton. The radiative corrections (vertex diagrams in QED) shift this to $g = 2 (1 + α / (2 π) + \dots) \approx 2.002319 \dots$ , where $α \approx 1/137$ is the fine-structure constant.
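The size of the leading radiative correction is easy to evaluate. The short sketch below keeps only the $α / (2 π)$ term; the numerical value of $α$ is an approximate input assumed here, not taken from this unit.

```python
import math

alpha = 1 / 137.035999  # fine-structure constant (approximate, assumed value)

g_dirac = 2.0
# Leading correction only: g = 2 * (1 + alpha / (2 pi))
g_one_loop = g_dirac * (1 + alpha / (2 * math.pi))

print(f"{g_one_loop:.6f}")  # roughly 2.002323
```

The leading term alone gives about 2.002323; the higher-order terms hidden in the dots bring the value down to the quoted $2.002319 \dots$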

Bridge. The antiparticle theorem builds toward 12.13.02 fermionic Fock space, where the negative-frequency spinors $v^{s} (p)$ become positron creation operators on the antisymmetric Fock space and the canonical anticommutation relations enforce Pauli exclusion at the algebraic level. The foundational reason the proof goes through is that the Dirac equation factorises through the Clifford algebra Cl(1,3), and this is exactly the structure that gives the spinor representation its built-in spin- $1/2$ content. The Gordon-identity decomposition appears again in 14.04.01 pending hydrogen-atom fine structure, where the magnetic-moment readout above identifies the spin-orbit coupling $L \cdot S$ as the bridge between orbital angular momentum 12.05.01 pending and the intrinsic spin produced by the Dirac kinematic structure. Putting these together: the central insight of the Dirac framework is that a single algebraic constraint (the Clifford anticommutation) forces simultaneously the spin- $1/2$ structure, the antiparticle spectrum, the gyromagnetic ratio $g = 2$ , and the full hydrogen fine structure — none of which were independent inputs.

Exercises [Intermediate+]

Exercise 3 (medium, symbolic).

Starting from the Dirac equation $(i γ^{μ} \partial_{μ} - m) ψ = 0$ , derive the Klein-Gordon equation $(\partial_{t}^{2} - \nabla^{2} + m^{2}) ψ = 0$ by applying the conjugate operator $(i γ^{ν} \partial_{ν} + m)$ from the left.

Hint

Multiply out $(i γ^{ν} \partial_{ν} + m) (i γ^{μ} \partial_{μ} - m) ψ$ and use the Clifford algebra to simplify $γ^{ν} γ^{μ} \partial_{ν} \partial_{μ}$ .

Answer

$(i γ^{ν} \partial_{ν} + m) (i γ^{μ} \partial_{μ} - m) ψ = (- γ^{ν} γ^{μ} \partial_{ν} \partial_{μ} - m^{2}) ψ$ . Symmetrise the derivative term: $γ^{ν} γ^{μ} \partial_{ν} \partial_{μ} = \frac{1}{2} {γ^{ν}, γ^{μ}} \partial_{ν} \partial_{μ} = η^{ν μ} \partial_{ν} \partial_{μ} = \partial_{t}^{2} - \nabla^{2}$ . (The antisymmetric part $\frac{1}{2} [γ^{ν}, γ^{μ}] \partial_{ν} \partial_{μ}$ vanishes because partial derivatives commute, so $\partial_{ν} \partial_{μ}$ is symmetric.) So $- (\partial_{t}^{2} - \nabla^{2} + m^{2}) ψ = 0$ , which is the Klein-Gordon equation. Every Dirac solution is a Klein-Gordon solution; the converse is false.

Exercise 6 (medium, symbolic).

The gamma-five matrix is $γ^{5} = i γ^{0} γ^{1} γ^{2} γ^{3}$ . Show that $(γ^{5})^{2} = I$ and ${γ^{5}, γ^{μ}} = 0$ for all $μ$ .

Hint

Anticommute $γ^{5}$ past a single $γ^{μ}$ using the Clifford algebra. Each anticommutation picks up a sign; count the total number of sign flips.

Answer

Anticommutation. Fix $μ$ and move $γ^{μ}$ from the right of the product $γ^{0} γ^{1} γ^{2} γ^{3}$ to the left. It commutes with the one factor equal to itself and anticommutes with the other three, so it picks up $(- 1)^{3} = - 1$ : $γ^{5} γ^{μ} = i γ^{0} γ^{1} γ^{2} γ^{3} γ^{μ} = - i γ^{μ} γ^{0} γ^{1} γ^{2} γ^{3} = - γ^{μ} γ^{5}$ , i.e., ${γ^{5}, γ^{μ}} = 0$ . Square. $(γ^{5})^{2} = i^{2} γ^{0} γ^{1} γ^{2} γ^{3} γ^{0} γ^{1} γ^{2} γ^{3}$ . Move the second $γ^{0}$ left past $γ^{3}, γ^{2}, γ^{1}$ (three sign flips), then the second $γ^{1}$ past $γ^{3}, γ^{2}$ (two), then the second $γ^{2}$ past $γ^{3}$ (one). That is six flips in total, so the reordering sign is $(- 1)^{6} = + 1$ and $(γ^{5})^{2} = i^{2} (γ^{0})^{2} (γ^{1})^{2} (γ^{2})^{2} (γ^{3})^{2}$ . With $(γ^{0})^{2} = I$ and $(γ^{i})^{2} = - I$ this is $(- 1) (+ 1) (- 1)^{3} I = I$ .
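Both properties can be confirmed with explicit matrices. The sketch below assumes NumPy and the Dirac representation with $σ^{i}$ in the upper-right block of $γ^{i}$; the result does not depend on the representation, only on the Clifford algebra.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

squares_to_identity = np.allclose(gamma5 @ gamma5, np.eye(4))
anticommutes_with_all = all(
    np.allclose(gamma5 @ g + g @ gamma5, np.zeros((4, 4))) for g in gamma
)
print(squares_to_identity, anticommutes_with_all)
```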

Exercise 8 (hard, symbolic).

The Gordon identity states $\bar{u} (p^{'}) γ^{μ} u (p) = \frac{1}{2 m} \bar{u} (p^{'}) [(p^{'} + p)^{μ} + i σ^{μν} (p^{'} - p)_{ν}] u (p)$ where $σ^{μν} = \frac{i}{2} [γ^{μ}, γ^{ν}]$ . Verify this by starting from the identity $(γ^{μ} p_{μ} - m) u (p) = 0$ and manipulating.

Hint

Write $\bar{u} (p^{'}) γ^{μ} u (p) = \frac{1}{2 m} \bar{u} (p^{'}) [γ^{μ} γ^{ν} p_{ν} + γ^{ν} p_{ν}^{'} γ^{μ}] u (p)$ , split $γ^{μ} γ^{ν} = \frac{1}{2} {γ^{μ}, γ^{ν}} + \frac{1}{2} [γ^{μ}, γ^{ν}]$ , and use the Dirac equation for both $u (p)$ and $\bar{u} (p^{'})$ .

Answer

Insert $m = \frac{1}{2} (m + m)$ : $\bar{u} (p^{'}) γ^{μ} u (p) = \frac{1}{2 m} \bar{u} (p^{'}) [m γ^{μ} + γ^{μ} m] u (p)$ . Use the Dirac equation: $m u (p) = γ^{ν} p_{ν} u (p)$ and $\bar{u} (p^{'}) m = \bar{u} (p^{'}) γ^{ν} p_{ν}^{'}$ . So $\bar{u} (p^{'}) γ^{μ} u (p) = \frac{1}{2 m} \bar{u} (p^{'}) [γ^{ν} p_{ν}^{'} γ^{μ} + γ^{μ} γ^{ν} p_{ν}] u (p) = \frac{1}{2 m} \bar{u} (p^{'}) [γ^{ν} γ^{μ} p_{ν}^{'} + γ^{μ} γ^{ν} p_{ν}] u (p)$ .

Now write $γ^{ν} γ^{μ} = η^{ν μ} + [γ^{ν}, γ^{μ}] /2$ and similarly for $γ^{μ} γ^{ν}$ . Adding: the symmetric part gives $(p^{'} + p)^{μ}$ , and the antisymmetric part gives $\frac{1}{2} [γ^{ν}, γ^{μ}] (p_{ν}^{'} - p_{ν}) = i σ^{μν} (p^{'} - p)_{ν}$ , since $σ^{μν} = \frac{i}{2} [γ^{μ}, γ^{ν}]$ implies $[γ^{ν}, γ^{μ}] = 2 i σ^{μν}$ (note the index order). Combining: $\bar{u} (p^{'}) γ^{μ} u (p) = \frac{1}{2 m} \bar{u} (p^{'}) [(p^{'} + p)^{μ} + i σ^{μν} (p^{'} - p)_{ν}] u (p)$ .

The Gordon identity decomposes the current into a convection term $(p^{'} + p)^{μ} / (2 m)$ (charge flow) and a spin term $i σ^{μν} q_{ν} / (2 m)$ with $q = p^{'} - p$ (magnetic moment coupling). It is the starting point for computing form factors in electron scattering.
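A numerical spot check of the Gordon identity is sketched below. It assumes NumPy, the Dirac representation with $σ^{i}$ in the upper-right block of $γ^{i}$, and the standard textbook form of the on-shell spinor $u (p)$; the momenta and the helper names `on_shell`, `u_spinor` and `sigma_munu` are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
m = 1.0


def on_shell(p3):
    """Four-momentum p^mu = (E, p) with E = sqrt(|p|^2 + m^2)."""
    p3 = np.asarray(p3, dtype=float)
    return np.concatenate(([np.sqrt(p3 @ p3 + m**2)], p3))


def u_spinor(p4, chi):
    """Assumed textbook positive-energy spinor for two-spinor chi."""
    E, p3 = p4[0], p4[1:]
    sigma_dot_p = sum(p3[i] * sigma[i] for i in range(3))
    return np.sqrt(E + m) * np.concatenate((chi, sigma_dot_p @ chi / (E + m)))


def sigma_munu(mu, nu):
    return 0.5j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])


p = on_shell([0.3, -0.4, 1.2])
p_prime = on_shell([-0.7, 0.2, 0.5])
u = u_spinor(p, np.array([1, 0], dtype=complex))
u_bar_prime = u_spinor(p_prime, np.array([0, 1], dtype=complex)).conj() @ gamma[0]
q_lower = eta @ (p_prime - p)  # (p' - p)_nu with the index lowered

gordon_ok = True
for mu in range(4):
    lhs = u_bar_prime @ gamma[mu] @ u
    spin_term = sum(1j * sigma_munu(mu, nu) * q_lower[nu] for nu in range(4))
    bracket = (p_prime[mu] + p[mu]) * np.eye(4) + spin_term
    rhs = u_bar_prime @ bracket @ u / (2 * m)
    gordon_ok = gordon_ok and bool(np.isclose(lhs, rhs))

print(gordon_ok)
```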

Exercise 9 (hard, symbolic).

Show that the Dirac Hamiltonian $H_{D} = γ^{0} (γ \cdot p + m)$ (obtained by multiplying the Dirac equation by $γ^{0}$ ) has eigenvalues $\pm E_{p} = \pm \sqrt{∣ p ∣^{2} + m^{2}}$ , each doubly degenerate.

Hint

Compute $H_{D}^{2}$ , moving the second $γ^{0}$ to the left with ${γ^{0}, γ^{i}} = 0$ , and simplify $(γ \cdot p)^{2}$ using the Clifford algebra.

Answer

Write $γ \cdot p = γ^{i} p^{i}$ . Since $γ^{0}$ anticommutes with each $γ^{i}$ , $(γ \cdot p + m) γ^{0} = γ^{0} (- γ \cdot p + m)$ , so $H_{D}^{2} = γ^{0} (γ \cdot p + m) γ^{0} (γ \cdot p + m) = (γ^{0})^{2} (m - γ \cdot p) (m + γ \cdot p) = m^{2} - (γ \cdot p)^{2}$ . The Clifford algebra gives $(γ \cdot p)^{2} = γ^{i} γ^{j} p^{i} p^{j} = \frac{1}{2} {γ^{i}, γ^{j}} p^{i} p^{j} = - δ^{ij} p^{i} p^{j} = - ∣ p ∣^{2}$ (since $p^{i} p^{j}$ is symmetric, the antisymmetric part of $γ^{i} γ^{j}$ contracts to zero). So $H_{D}^{2} = ∣ p ∣^{2} + m^{2} = E_{p}^{2}$ and the eigenvalues of $H_{D}$ are $\pm E_{p}$ . Because $tr (γ^{0}) = 0$ and $tr (γ^{0} γ^{i}) = 0$ , $H_{D}$ is traceless, so the two signs occur equally often in the four-dimensional spinor space: each eigenvalue is doubly degenerate, and the degeneracy is the spin degree of freedom.
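The spectrum can be checked by direct diagonalisation. This is a minimal sketch, assuming NumPy, the Dirac representation with $σ^{i}$ in the upper-right block of $γ^{i}$, and an illustrative momentum; it builds $H_{D} = γ^{0} (γ \cdot p + m)$ as a $4 \times 4$ matrix.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

m = 1.0
p = np.array([0.3, -0.4, 1.2])  # illustrative three-momentum
E_p = np.sqrt(p @ p + m**2)

# H_D = gamma^0 (gamma . p + m), a Hermitian 4x4 matrix
gamma_dot_p = sum(p[i] * gamma[i + 1] for i in range(3))
H_D = gamma[0] @ (gamma_dot_p + m * np.eye(4))

energies = np.linalg.eigvalsh(H_D)  # ascending order
print(energies, E_p)
```

The four eigenvalues come out as two copies of $- E_{p}$ followed by two copies of $+ E_{p}$.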

Exercise 10 (hard, symbolic).

The Klein paradox considers a Dirac particle incident on a step potential $V (x) = V_{0} θ (x)$ with $V_{0} > 2 m$ . In the non-relativistic case, the particle is totally reflected for $E < V_{0}$ . Show that the Dirac equation instead predicts a propagating (undamped, oscillatory) transmitted wave in the region $x > 0$ when $V_{0} - E > m$ , and explain why this signals the breakdown of the single-particle picture.

Hint

In the region $x > 0$ , the effective energy is $E - V_{0} < 0$ . Write the wave function as a plane wave and examine the wave vector $k = \sqrt{(E - V_{0})^{2} - m^{2}}$ : is it real or imaginary?

Answer

For $x < 0$ : free Dirac wave with energy $E > m$ . For $x > 0$ : effective energy $E^{'} = E - V_{0} < - m$ when $V_{0} > E + m$ . The wave vector in this region satisfies $k^{2} = (E^{'})^{2} - m^{2} > 0$ , so $k$ is real — the wave is oscillatory, not evanescent. A propagating transmitted wave exists despite the barrier height exceeding the particle energy.

The transmitted wave's group velocity is in the $+ x$ direction, meaning particles are transmitted into the high-potential region. The reflection coefficient exceeds 1 (more particles reflected than incident), which violates single-particle probability conservation.

The resolution: the strong potential extracts electron-positron pairs from the vacuum. The transmitted electrons in the $x > 0$ region are accompanied by positrons that fly back toward $x < 0$ , registering as additional reflected particles. This is closely related to the Schwinger pair-production mechanism. The single-particle Dirac equation cannot account for this; it requires the full QFT treatment with a quantum vacuum that can produce real particle-antiparticle pairs in strong external fields.
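The sign of $k^{2} = (E - V_{0})^{2} - m^{2}$ is what separates the regimes. The sketch below evaluates it for two illustrative step heights, with $m = 1$ and $E = 1.5$ chosen arbitrarily; the function name `wave_vector_squared` is hypothetical.

```python
def wave_vector_squared(E, V0, m):
    """k^2 in the region x > 0, where the effective energy is E - V0."""
    return (E - V0) ** 2 - m ** 2


m, E = 1.0, 1.5  # illustrative values in natural units

# Intermediate step, E - m < V0 < E + m: k^2 < 0, evanescent wave
k2_intermediate = wave_vector_squared(E, V0=2.0, m=m)

# Klein regime, V0 > E + m: k^2 > 0, propagating wave
k2_klein = wave_vector_squared(E, V0=3.0, m=m)

print(k2_intermediate, k2_klein)
```

Only the second case, with $V_{0} - E > m$ , gives a real wave vector and hence an oscillatory solution beyond the step.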

Exercise 11 (hard, symbolic).

Construct the $4 \times 4$ spin operator $S^{3} = \frac{1}{2} Σ^{3} = \frac{i}{2} γ^{1} γ^{2}$ (the $z$ -component of spin) in the Dirac representation and verify that its eigenvalues on the at-rest solutions $u^{(1)}, u^{(2)}, v^{(1)}, v^{(2)}$ are $+ 1/2, - 1/2, + 1/2, - 1/2$ respectively.

Hint

Compute $γ^{1} γ^{2}$ in the standard representation using the Pauli matrices, then multiply by $i /2$ .

Answer

$γ^{1} = \begin{pmatrix} 0 & σ^{1} \\ - σ^{1} & 0 \end{pmatrix}$ , $γ^{2} = \begin{pmatrix} 0 & σ^{2} \\ - σ^{2} & 0 \end{pmatrix}$ . So $γ^{1} γ^{2} = \begin{pmatrix} - σ^{1} σ^{2} & 0 \\ 0 & - σ^{1} σ^{2} \end{pmatrix} = \begin{pmatrix} - i σ^{3} & 0 \\ 0 & - i σ^{3} \end{pmatrix}$ , using $σ^{1} σ^{2} = i σ^{3}$ . Then $S^{3} = \frac{i}{2} γ^{1} γ^{2} = \frac{i}{2} \begin{pmatrix} - i σ^{3} & 0 \\ 0 & - i σ^{3} \end{pmatrix} = \frac{1}{2} \begin{pmatrix} σ^{3} & 0 \\ 0 & σ^{3} \end{pmatrix} = diag (1/2, - 1/2, 1/2, - 1/2)$ .

Acting on the at-rest spinors: $S^{3} u^{(1)} = (1/2) u^{(1)}$ , $S^{3} u^{(2)} = (- 1/2) u^{(2)}$ , $S^{3} v^{(1)} = (1/2) v^{(1)}$ , $S^{3} v^{(2)} = (- 1/2) v^{(2)}$ . Both particle and antiparticle have spin eigenvalues $\pm 1/2$ , as expected for a spin- $1/2$ field.
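The matrix product in this exercise can be reproduced in a few lines. The sketch assumes NumPy and the Dirac representation; the operator computed is $\frac{i}{2} γ^{1} γ^{2}$ , and the result is the same for either sign convention of the $γ^{i}$ because two spatial gamma matrices are multiplied.

```python
import numpy as np

Z2 = np.zeros((2, 2), dtype=complex)
sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

gamma1 = np.block([[Z2, sigma1], [-sigma1, Z2]])
gamma2 = np.block([[Z2, sigma2], [-sigma2, Z2]])

# z-component of spin: (i/2) gamma^1 gamma^2
spin_z = 0.5j * gamma1 @ gamma2

# At-rest basis spinors (normalisation dropped) are the unit vectors e_0..e_3
basis = np.eye(4, dtype=complex)
eigenvalues = [(basis[k].conj() @ spin_z @ basis[k]).real for k in range(4)]
print(eigenvalues)
```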

Lean formalization [Intermediate+]

Mathlib's coverage of the Dirac equation is indirect. The relevant layers are:

Mathlib.Algebra.CliffordAlgebra: abstract Clifford algebras over a commutative ring with a quadratic form, including the universal property and the isomorphism $Cl (p, q) ≅ Mat (2^{n}, R)$ for the signature $(p, q)$ relevant to physics.
The Dirac operator as a geometric object (unit 03.09.08) exists in Mathlib's Geometry layer as a first-order differential operator on a Clifford-module bundle over a Riemannian manifold.

What Mathlib does not contain: the Dirac equation as a physics equation (i.e., a time-dependent PDE whose solutions are spinor fields on Minkowski spacetime); the identification of positive-energy and negative-energy solution branches; the antiparticle interpretation via second quantisation; the non-relativistic limit yielding the Pauli equation; or the $g = 2$ prediction for the magnetic moment. These are physics-layer constructions that require the physics formalisation roadmap (unit id conventions, physical-unit system, Hilbert-space operators) to be in place before they can be formalised.

lean_status: none reflects this gap. No lean_module ships with this unit. The lean_mathlib_gap field in frontmatter records the boundary of current Mathlib coverage.

Gamma matrix algebra and Clifford structure [Master]

The closure derivation: why the Clifford algebra is forced. The Dirac equation is found by requiring three structural conditions simultaneously: (i) the equation is first order in time so that initial data is just $ψ (t_{0}, x)$ and the conserved density is positive-definite, (ii) the equation is Lorentz-covariant so that space and time enter symmetrically, and (iii) every solution also satisfies the relativistic dispersion $E^{2} = ∣ p ∣^{2} + m^{2}$ , equivalently the Klein-Gordon equation $(\partial_{μ} \partial^{μ} + m^{2}) ψ = 0$ component-wise. Conditions (i) and (ii) together force the ansatz

i \partial_{t} ψ = (α \cdot p + β m) ψ,

where $α = (α^{1}, α^{2}, α^{3})$ and $β$ are matrix-valued coefficients on some finite-dimensional vector space $V$ in which $ψ$ takes values. The first-order form is built in by hand; the cost is that $ψ$ is now a column of $N$ components rather than a single complex number.

Apply the operator twice and impose (iii). One step gives

- \partial_{t}^{2} ψ = (α \cdot p + β m)^{2} ψ = [α^{i} α^{j} p_{i} p_{j} + m (α^{i} β + β α^{i}) p_{i} + β^{2} m^{2}] ψ .

For the right-hand side to equal $(∣ p ∣^{2} + m^{2}) ψ$ on every component of $ψ$ , the matrix coefficients must satisfy

\frac{1}{2} {α^{i}, α^{j}} = δ^{ij} 1, {α^{i}, β} = 0, β^{2} = 1 .

These are the algebraic closure conditions. The symmetric part of $α^{i} α^{j}$ must contract to $δ^{ij}$ to reproduce $∣ p ∣^{2}$ , and the cross term between $α$ and $β$ must vanish so no linear-in- $p$ residue appears.

Define $γ^{0} := β$ and $γ^{i} := β α^{i}$ . A direct expansion gives

{γ^{0}, γ^{0}} = 2 β^{2} = 2 \cdot 1, \quad {γ^{i}, γ^{j}} = β α^{i} β α^{j} + β α^{j} β α^{i} = - β^{2} {α^{i}, α^{j}} = - 2 δ^{ij} 1, \quad {γ^{0}, γ^{i}} = β β α^{i} + β α^{i} β = β {β, α^{i}} = 0,

using ${α^{i}, β} = 0$ in the last step. Bundling these into a single equation with the mostly-minus metric $η^{μν} = diag (+ 1, - 1, - 1, - 1)$ gives the Clifford algebra relation

{γ^{μ}, γ^{ν}} = 2 η^{μν} 1 .

Multiplying the original equation by $β = γ^{0}$ and rearranging recovers the manifestly covariant Dirac equation $(i γ^{μ} \partial_{μ} - m) ψ = 0$ . The Clifford algebra is not assumed — it is the unique algebraic structure on the matrix coefficients consistent with conditions (i)-(iii). Any first-order Lorentz-covariant wave equation whose solutions satisfy Klein-Gordon must factor through a Clifford algebra.
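The closure conditions and the passage from $(α, β)$ to $γ^{μ}$ can be checked on a concrete choice. The sketch below assumes NumPy and the standard matrices $β = diag (I, - I)$ , $α^{i} = antidiag (σ^{i}, σ^{i})$ ; any other solution of the closure conditions would serve equally well.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
one = np.eye(4)


def anti(a, b):
    return a @ b + b @ a


# Closure conditions: (1/2){alpha^i, alpha^j} = delta^{ij}, {alpha^i, beta} = 0, beta^2 = 1
closure_ok = (
    all(np.allclose(anti(alpha[i], alpha[j]), 2 * (i == j) * one)
        for i in range(3) for j in range(3))
    and all(np.allclose(anti(a, beta), 0 * one) for a in alpha)
    and np.allclose(beta @ beta, one)
)

# Covariant form: gamma^0 = beta, gamma^i = beta alpha^i
gamma = [beta] + [beta @ a for a in alpha]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
clifford_ok = all(
    np.allclose(anti(gamma[mu], gamma[nu]), 2 * eta[mu, nu] * one)
    for mu in range(4) for nu in range(4)
)
print(closure_ok, clifford_ok)
```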

Minimal dimension. The smallest $N$ supporting a faithful representation of the Clifford relations is $N = 4$ . Counting: Cl(1,3) has dimension $2^{4} = 16$ as a real vector space (one basis element per subset of ${0, 1, 2, 3}$ ). Its complexification is isomorphic to Mat(4, $C$ ), the algebra of $4 \times 4$ complex matrices, which acts faithfully and irreducibly on $C^{4}$ . Smaller matrices cannot realise four mutually anticommuting generators; larger representations are reducible. The four-component spinor structure of the Dirac equation is dictated by this algebraic minimum, not chosen ad hoc.

The $4 \times 4$ gamma matrices generate the Clifford algebra $Cl (1, 3)$ , the algebra associated with the Minkowski metric $η = diag (+ 1, - 1, - 1, - 1)$ . The full set of products $γ^{μ_{1}} \dots γ^{μ_{k}}$ contains 16 linearly independent matrices, which span $Mat (4, C)$ as a complex vector space (the complexified Clifford algebra). A basis is given by the 16 matrices:

Rank	Count	Elements	Notation
0	1	$I$	identity
1	4	$γ^{μ}$	gamma matrices
2	6	$σ^{μν} = \frac{i}{2} [γ^{μ}, γ^{ν}]$	commutators
3	4	$γ^{μ} γ^{5}$	axial vectors
4	1	$γ^{5}$	chirality

This is the complete set of Dirac bilinears $\bar{ψ} Γ ψ$ . Lorentz symmetry restricts which bilinears can appear in interaction terms: scalars ( $\bar{ψ} ψ$ ), vectors ( $\bar{ψ} γ^{μ} ψ$ ), tensors ( $\bar{ψ} σ^{μν} ψ$ ), axial vectors ( $\bar{ψ} γ^{μ} γ^{5} ψ$ ), and pseudoscalars ( $\bar{ψ} γ^{5} ψ$ ). These five classes exhaust the independent Lorentz-covariant bilinears.

Trace identities. The gamma matrices satisfy:

$tr (γ^{μ}) = 0$ (traceless).
$tr (γ^{μ} γ^{ν}) = 4 η^{μν}$ .
$tr (γ^{μ} γ^{ν} γ^{ρ} γ^{σ}) = 4 (η^{μν} η^{ρ σ} - η^{μ ρ} η^{ν σ} + η^{μ σ} η^{ν ρ})$ .
$tr (odd number of γ ’s) = 0$ .
$tr (γ^{5}) = 0$ , $tr (γ^{5} γ^{μ} γ^{ν}) = 0$ , $tr (γ^{5} γ^{μ} γ^{ν} γ^{ρ} γ^{σ}) = - 4 i ϵ^{μν ρ σ}$ .

These are the computational backbone of QED. Every scattering amplitude reduces to traces of gamma-matrix products via the Casimir trick (summing over final spins, averaging over initial spins). The identities are proved by repeated use of the Clifford algebra and the cyclicity of the trace.
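The identities without $γ^{5}$ can be confirmed by brute force over all index combinations. The sketch below assumes NumPy and the Dirac representation; the $ϵ^{μν ρ σ}$ identity is left out because its sign depends on the convention chosen for $ϵ^{0123}$ .

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
g = [np.block([[I2, Z2], [Z2, -I2]])]
g += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
idx = range(4)

# tr(gamma^mu) = 0
trace1_ok = all(abs(np.trace(g[a])) < 1e-12 for a in idx)

# tr(gamma^mu gamma^nu) = 4 eta^{mu nu}
trace2_ok = all(
    np.isclose(np.trace(g[a] @ g[b]), 4 * eta[a, b]) for a in idx for b in idx
)

# tr of three gammas (an odd number) vanishes
trace3_ok = all(
    abs(np.trace(g[a] @ g[b] @ g[c])) < 1e-12
    for a in idx for b in idx for c in idx
)

# tr of four gammas: 4 (eta eta - eta eta + eta eta)
trace4_ok = all(
    np.isclose(
        np.trace(g[a] @ g[b] @ g[c] @ g[d]),
        4 * (eta[a, b] * eta[c, d] - eta[a, c] * eta[b, d] + eta[a, d] * eta[b, c]),
    )
    for a in idx for b in idx for c in idx for d in idx
)
print(trace1_ok, trace2_ok, trace3_ok, trace4_ok)
```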

Chiral (Weyl) representation. An alternative representation useful for massless or ultrarelativistic particles:

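γ^{0} = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}, \quad γ^{i} = \begin{pmatrix} 0 & σ^{i} \\ - σ^{i} & 0 \end{pmatrix}, \quad γ^{5} = \begin{pmatrix} - I & 0 \\ 0 & I \end{pmatrix} .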

In this representation, the projectors $P_{L} = \frac{1}{2} (1 - γ^{5})$ and $P_{R} = \frac{1}{2} (1 + γ^{5})$ project onto the upper and lower two components respectively: the left-handed and right-handed Weyl spinors. For $m = 0$ , the Dirac equation decouples into two independent two-component equations, the Weyl equations, and chirality equals helicity. The charged-current weak interaction couples only to left-handed chiral fields, making the chiral representation the natural one for the electroweak theory.
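The projector properties follow directly once $γ^{5}$ is computed from its definition. The sketch below assumes NumPy and the chiral representation with $γ^{0} = antidiag (I, I)$ and $σ^{i}$ in the upper-right block of $γ^{i}$ ; with that choice $γ^{5} = i γ^{0} γ^{1} γ^{2} γ^{3}$ evaluates to $diag (- I, I)$ .

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Chiral (Weyl) representation, assumed form
gamma = [np.block([[Z2, I2], [I2, Z2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

one = np.eye(4)
P_L = 0.5 * (one - gamma5)
P_R = 0.5 * (one + gamma5)

psi = np.array([1, 2, 3, 4], dtype=complex)  # arbitrary test spinor
print(P_L @ psi, P_R @ psi)
```

Applied to the test spinor, `P_L` keeps the first two entries and `P_R` the last two.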

Fierz rearrangement. Any product of two Dirac bilinears can be rewritten in terms of a different basis pairing. The Fierz identity is

(\bar{ψ}_{1} Γ^{A} ψ_{2}) (\bar{ψ}_{3} Γ_{A} ψ_{4}) = \sum_{B} C_{A B} (\bar{ψ}_{1} Γ^{B} ψ_{4}) (\bar{ψ}_{3} Γ_{B} ψ_{2}),

where $Γ^{A}$ runs over the 16 basis elements and $C_{A B}$ is a fixed numerical matrix. This identity is essential for analysing four-fermion interactions (Fermi theory, effective weak couplings).

The spinor representation $S (Λ)$ and the SL(2, $C$ ) double cover. Lorentz covariance of the Dirac equation requires a matrix $S (Λ)$ for each Lorentz transformation $Λ^{μ}_{ν} \in SO^{+} (1, 3)$ acting on the spinor index. Write an infinitesimal Lorentz transformation as $Λ^{μ}_{ν} = δ^{μ}_{ν} + ω^{μ}_{ν}$ with $ω_{μν} = - ω_{ν μ}$ (six independent parameters: three boosts, three rotations). The infinitesimal spinor representation is

S (Λ) = 1 - \frac{i}{4} ω_{μν} σ^{μν} + O (ω^{2}),

where $σ^{μν} = \frac{i}{2} [γ^{μ}, γ^{ν}]$ . Imposing the covariance condition $S^{- 1} γ^{μ} S = Λ^{μ}_{ν} γ^{ν}$ at linear order in $ω$ gives

[σ^{μν}, γ^{ρ}] = 2 i (η^{ν ρ} γ^{μ} - η^{μ ρ} γ^{ν}),

which follows from the Clifford algebra by direct computation. The six matrices $σ^{μν} /2$ thus form a representation of the Lorentz Lie algebra $so (1, 3)$ ; exponentiating gives a finite-dimensional representation of the universal cover $SL (2, C)$ rather than of $SO^{+} (1, 3)$ itself. Concretely, in the chiral basis $σ^{μν}$ is block-diagonal with the upper block a $(\frac{1}{2}, 0)$ representation and the lower block a $(0, \frac{1}{2})$ representation of $SL (2, C)$ ; together they form the $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ representation that the Dirac spinor carries.
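The commutator identity $[σ^{μν}, γ^{ρ}] = 2 i (η^{ν ρ} γ^{μ} - η^{μ ρ} γ^{ν})$ can be verified over all 64 index combinations. The sketch below assumes NumPy and uses the Dirac representation, although the identity holds in any representation of the Clifford algebra; `sigma_munu` is an illustrative helper name.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def sigma_munu(mu, nu):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]"""
    return 0.5j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])


def identity_holds(mu, nu, rho):
    s = sigma_munu(mu, nu)
    lhs = s @ gamma[rho] - gamma[rho] @ s
    rhs = 2j * (eta[nu, rho] * gamma[mu] - eta[mu, rho] * gamma[nu])
    return np.allclose(lhs, rhs)


commutator_ok = all(
    identity_holds(mu, nu, rho)
    for mu in range(4) for nu in range(4) for rho in range(4)
)
print(commutator_ok)
```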

A rotation by $2 π$ about any axis multiplies the spinor by $- 1$ , not $+ 1$ — this is the hallmark of the double cover. The Dirac spinor returns to itself only after a $4 π$ rotation. The doubled identity is not a pathology but the very signature of a half-integer-spin representation, and it is the algebraic reason fermions obey the Pauli exclusion principle in the relativistic spin-statistics theorem (cross-link 12.13.02).
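The sign flip under a $2 π$ rotation can be seen by exponentiating the rotation generator. The sketch below assumes NumPy and takes a rotation about the $z$ axis to be $S (θ) = \exp (- \frac{i}{2} θ σ^{12})$ ; the sign convention for the angle is an assumption and does not affect the result at $θ = 2 π$ or $4 π$ . The helper `rotation_about_z` is illustrative.

```python
import numpy as np

Z2 = np.zeros((2, 2), dtype=complex)
sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
gamma1 = np.block([[Z2, sigma1], [-sigma1, Z2]])
gamma2 = np.block([[Z2, sigma2], [-sigma2, Z2]])

# sigma^{12} = (i/2)[gamma^1, gamma^2] = i gamma^1 gamma^2, since they anticommute
sigma12 = 1j * gamma1 @ gamma2


def rotation_about_z(theta):
    """S(theta) = exp(-i theta sigma^{12} / 2), via the eigendecomposition
    of the Hermitian generator."""
    evals, evecs = np.linalg.eigh(sigma12)
    phases = np.exp(-0.5j * theta * evals)
    return evecs @ np.diag(phases) @ evecs.conj().T


S_2pi = rotation_about_z(2 * np.pi)
S_4pi = rotation_about_z(4 * np.pi)
print(np.round(S_2pi.real, 12))
```

The $2 π$ rotation matrix is minus the identity, and only the $4 π$ rotation returns the identity.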

Non-relativistic limit and $g - 2$ [Master]

Minimal coupling and the algebra of $(σ \cdot π)^{2}$ . Promote the free Dirac equation to an external electromagnetic field $A^{μ} = (V, A)$ via minimal coupling: replace the spacetime derivative $i \partial_{μ}$ everywhere by the gauge-covariant derivative $i \partial_{μ} - e A_{μ}$ , equivalently the kinetic 4-momentum $p_{μ} \to π_{μ} := p_{μ} - e A_{μ}$ . The Dirac equation becomes

(i γ^{μ} \partial_{μ} - e γ^{μ} A_{μ} - m) ψ = 0.

The factor $e$ is the electron charge (sign convention: $e < 0$ for the physical electron). The equation is invariant under the gauge transformation $A_{μ} \to A_{μ} + \partial_{μ} χ$ , $ψ \to e^{- i e χ} ψ$ : the derivative of the phase of $ψ$ compensates the shift of $A_{μ}$ inside the covariant derivative $i \partial_{μ} - e A_{μ}$ .

The non-relativistic limit isolates the kinematics where $∣ p ∣/ m ≪ 1$ and $∣ e V ∣/ m ≪ 1$ . The most efficient path is to multiply the Dirac equation by $γ^{0}$ on the left, giving the Schrödinger-form Hamiltonian

H_{D} = α \cdot π + β m + e V, α = γ^{0} γ, β = γ^{0},

and then decompose $ψ = (ϕ_{L}, ϕ_{S})^{T}$ in the standard (Dirac) basis where $β = diag (1, - 1)$ . The $α$ matrices are off-diagonal, $α = antidiag (σ, σ)$ , and the Dirac equation $i \partial_{t} ψ = H_{D} ψ$ resolves into the coupled pair

i \partial_{t} ϕ_{L} = (e V) ϕ_{L} + (σ \cdot π) ϕ_{S} + m ϕ_{L},

i \partial_{t} ϕ_{S} = (σ \cdot π) ϕ_{L} + (e V) ϕ_{S} - m ϕ_{S} .

To peel off the dominant rest-mass oscillation, write $ϕ_{L} = e^{- im t} χ_{L}$ , $ϕ_{S} = e^{- im t} χ_{S}$ . The exponential cancels the $m ϕ_{L}$ term exactly and combines with the $- m ϕ_{S}$ term into $- 2 m χ_{S}$ , leaving

i \partial_{t} χ_{L} = (σ \cdot π) χ_{S} + (e V) χ_{L},

i \partial_{t} χ_{S} + 2 m χ_{S} = (σ \cdot π) χ_{L} + (e V) χ_{S} .

In the non-relativistic regime, the small component $χ_{S}$ is suppressed by $O (∣ p ∣/ m) = O (v / c)$ relative to $χ_{L}$ . The left side of the second equation is dominated by $2 m χ_{S}$ , and dropping $i \partial_{t} χ_{S}$ and $e V χ_{S}$ (both small relative to $2 m χ_{S}$ ) gives the algebraic solution

χ_{S} \approx \frac{1}{2 m} (σ \cdot π) χ_{L} .

Substituting back into the first equation gives the Pauli equation:

i \partial_{t} χ_{L} = [\frac{( σ \cdot π ) ^{2}}{2 m} + e V] χ_{L} .

This is a two-component Schrödinger equation; the four-component spinor has collapsed to two components by integrating out the antiparticle degrees of freedom.

The Pauli identity and the magnetic-moment readout. Expand $(σ \cdot π)^{2}$ using the spin algebra identity

σ^{i} σ^{j} = δ^{ij} 1 + i ϵ^{ij k} σ^{k} .

Therefore

(σ \cdot π)^{2} = σ^{i} σ^{j} π^{i} π^{j} = π^{i} π^{i} + i ϵ^{ij k} σ^{k} π^{i} π^{j} = ∣ π ∣^{2} + \frac{i}{2} ϵ^{ij k} σ^{k} [π^{i}, π^{j}] .

The commutator $[π^{i}, π^{j}] = [p^{i} - e A^{i}, p^{j} - e A^{j}] = - e [p^{i}, A^{j}] - e [A^{i}, p^{j}] = i e (\partial^{i} A^{j} - \partial^{j} A^{i}) = i e ϵ^{ij k} B^{k}$ (in the convention $[x^{i}, p^{j}] = i δ^{ij}$ , so that $[p^{i}, f (x)] = - i \partial^{i} f$ , and $B = \nabla \times A$ ). Substituting,

(σ \cdot π)^{2} = ∣ π ∣^{2} + \frac{i}{2} ϵ^{ij k} σ^{k} (i e) ϵ^{ij ℓ} B^{ℓ} = ∣ π ∣^{2} - \frac{e}{2} σ^{k} B^{ℓ} (2 δ^{k ℓ}) = ∣ π ∣^{2} - e σ \cdot B,

using $ϵ^{ij k} ϵ^{ij ℓ} = 2 δ^{k ℓ}$ . The Pauli equation then reads

i \partial_{t} χ_{L} = [\frac{∣ p - e A ∣ ^{2}}{2 m} + e V - \frac{e}{2 m} σ \cdot B] χ_{L} .

The last term is a magnetic-moment coupling $- μ \cdot B$ with magnetic moment

μ = \frac{e}{2 m} σ = g \cdot \frac{e}{2 m} S, S = \frac{1}{2} σ .

Reading off the gyromagnetic ratio: the spin operator is $S = σ /2$ , so $μ = (e / m) S = 2 \cdot (e /2 m) S$ , giving $g = 2$ . This is the Dirac prediction: the electron's gyromagnetic ratio comes out as exactly twice the classical orbital value $g_{orb} = 1$ , with no assumption injected beyond the Clifford algebra and minimal coupling. The result was a stunning agreement with the empirical Landé pre-factor and the cornerstone of the equation's acceptance.
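The Pauli identity behind this readout can be verified directly. The sketch below, a standard-library illustration with hypothetical helper names, checks the product rule $σ^{i} σ^{j} = δ^{ij} 1 + i ϵ^{ij k} σ^{k}$ and its consequence $(σ \cdot a) (σ \cdot b) = a \cdot b + i σ \cdot (a \times b)$ for ordinary commuting vectors. For $a = b$ the cross product vanishes, which makes explicit that the $σ \cdot B$ term in $(σ \cdot π)^{2}$ comes entirely from the non-commutativity of the components of $π$ .

```python
def mm(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(*terms):
    """Linear combination sum_k c_k * A_k from (c_k, A_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(cols)] for i in range(rows)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
sig = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def levi(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# sigma^i sigma^j = delta^{ij} 1 + i eps^{ijk} sigma^k
product_rule_holds = all(
    close(mm(sig[i], sig[j]),
          lin((int(i == j), I2), *[(1j * levi(i, j, k), sig[k]) for k in range(3)]))
    for i in range(3) for j in range(3))

def sdot(v):
    """sigma . v for a numeric 3-vector v."""
    return lin(*[(v[k], sig[k]) for k in range(3)])

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

a, b = [0.3, -1.2, 0.5], [2.0, 0.1, -0.7]     # arbitrary test vectors
dot_ab = sum(x * y for x, y in zip(a, b))
vector_identity_holds = close(mm(sdot(a), sdot(b)),
                              lin((dot_ab, I2), (1j, sdot(cross(a, b)))))
# With commuting components, (sigma . a)^2 is a pure scalar: no spin term.
square_is_scalar = close(mm(sdot(a), sdot(a)), lin((sum(x * x for x in a), I2)))

print(product_rule_holds, vector_identity_holds, square_is_scalar)
```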

Radiative corrections and the anomalous magnetic moment. The tree-level $g = 2$ receives quantum corrections from QED loop diagrams. The leading correction comes from the one-loop vertex diagram (Schwinger 1948):

a_{e} := \frac{g - 2}{2} = \frac{α}{2 π} \approx 0.0011614.

Higher-order corrections give the series

a_{e} = \frac{1}{2} \frac{α}{π} - 0.328478965 (\frac{α}{π})^{2} + 1.181241456 (\frac{α}{π})^{3} - 1.912245764 (\frac{α}{π})^{4} + \dots

The theoretical prediction and the experimental measurement of $g$ agree to about twelve significant figures, one of the most precise agreements between theory and experiment in physics. Any deviation would signal new physics (supersymmetric particles, composite structure, extra dimensions), making $g - 2$ a precision probe of the Standard Model and beyond.
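To see how quickly the series converges, the partial sums can be evaluated numerically. This is a rough sketch: the value of the fine-structure constant is an assumed input, the coefficients are the four quoted above, and mass-dependent, hadronic and weak contributions are not included.

```python
import math

ALPHA = 1 / 137.035999                      # fine-structure constant (assumed input)
COEFFS = [0.5, -0.328478965, 1.181241456, -1.912245764]   # coefficients quoted in the text

def a_e_partial_sums(alpha=ALPHA):
    """Partial sums of a_e = sum_n C_n (alpha / pi)^n for n = 1..4."""
    x = alpha / math.pi
    sums, total = [], 0.0
    for n, c in enumerate(COEFFS, start=1):
        total += c * x ** n
        sums.append(total)
    return sums

partial_sums = a_e_partial_sums()
for order, value in enumerate(partial_sums, start=1):
    print(f"through order {order}: a_e = {value:.12f}")
```

The first partial sum is the Schwinger term, about $0.00116141$ ; the later terms shift the result to about $0.00115965$ , with each order contributing roughly a factor $α / π \sim 2 \times 10^{- 3}$ less than the previous one.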

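The anomalous magnetic moment of the muon $a_{μ}$ has shown a persistent tension between the Standard Model prediction and experiment (the "muon $g - 2$ anomaly"). Its size depends on how the hadronic vacuum-polarisation contribution is evaluated: about $5 σ$ with data-driven dispersive inputs, and considerably less with recent lattice-QCD evaluations. It has been regarded as one of the strongest hints of physics beyond the Standard Model, but that reading depends on resolving the disagreement between the theoretical calculations.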

Chiral symmetry [Master]

The projectors $P_{L}, P_{R}$ and the explicit chiral decomposition. The matrix $γ^{5} = i γ^{0} γ^{1} γ^{2} γ^{3}$ satisfies $(γ^{5})^{2} = 1$ and $\{γ^{5}, γ^{μ}\} = 0$ . From $(γ^{5})^{2} = 1$ , the eigenvalues of $γ^{5}$ are $\pm 1$ . Define the chiral projectors

P_{L} := \frac{1}{2} (1 - γ^{5}), P_{R} := \frac{1}{2} (1 + γ^{5}) .

Direct computation shows $P_{L}^{2} = P_{L}$ , $P_{R}^{2} = P_{R}$ , $P_{L} P_{R} = P_{R} P_{L} = 0$ , and $P_{L} + P_{R} = 1$ . They are orthogonal idempotents projecting onto the $\mp 1$ eigenspaces of $γ^{5}$ . Every Dirac spinor decomposes uniquely as

ψ = ψ_{L} + ψ_{R}, ψ_{L} := P_{L} ψ, ψ_{R} := P_{R} ψ .

The components $ψ_{L}$ and $ψ_{R}$ are the left-handed and right-handed Weyl spinors, each carrying two complex degrees of freedom. In the chiral (Weyl) representation, $γ^{5} = diag (- 1, + 1)$ , so $P_{L}$ projects onto the upper two components and $P_{R}$ onto the lower two; the Dirac spinor is just the column $(ψ_{L}, ψ_{R})^{T}$ with each entry a Weyl spinor.
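The projector algebra can be confirmed with explicit matrices. The following sketch assumes the chiral (Weyl) basis $γ^{0} = antidiag (1, 1)$ , $γ^{i} = antidiag (σ^{i}, - σ^{i})$ and uses only the standard library; the helper and dictionary names are illustrative.

```python
def mm(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(*terms):
    """Linear combination sum_k c_k * A_k from (c_k, A_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(cols)] for i in range(rows)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

I2 = [[1, 0], [0, 1]]
sx, sy, sz = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
i_sy = [[0, 1], [-1, 0]]                      # i * sigma_y
I4 = kron(I2, I2)
ZERO = lin((0, I4))
# Chiral basis (assumed): gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, s_i], [-s_i, 0]]
gamma = [kron(sx, I2)] + [kron(i_sy, s) for s in (sx, sy, sz)]

# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3
g5 = lin((1j, mm(mm(gamma[0], gamma[1]), mm(gamma[2], gamma[3]))))
P_L = lin((0.5, I4), (-0.5, g5))
P_R = lin((0.5, I4), (0.5, g5))

checks = {
    "g5 squares to 1": close(mm(g5, g5), I4),
    "g5 anticommutes with every gamma": all(
        close(lin((1, mm(g5, g)), (1, mm(g, g5))), ZERO) for g in gamma),
    "projectors are idempotent": close(mm(P_L, P_L), P_L) and close(mm(P_R, P_R), P_R),
    "projectors are orthogonal": close(mm(P_L, P_R), ZERO) and close(mm(P_R, P_L), ZERO),
    "projectors sum to 1": close(lin((1, P_L), (1, P_R)), I4),
}
g5_diagonal = [g5[k][k].real for k in range(4)]

for name, ok in checks.items():
    print(name, ok)
print(g5_diagonal)   # diagonal of gamma^5 in this basis
```

In this basis the diagonal of $γ^{5}$ is $(- 1, - 1, + 1, + 1)$ , so $P_{L}$ keeps the upper two components and $P_{R}$ the lower two.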

Massless decoupling. For $m = 0$ , the Dirac equation $i γ^{μ} \partial_{μ} ψ = 0$ admits a clean factorisation. Apply $P_{R}$ on the left:

P_{R} \cdot i γ^{μ} \partial_{μ} ψ = i γ^{μ} \partial_{μ} (P_{L} ψ) = i γ^{μ} \partial_{μ} ψ_{L} = 0,

using $P_{R} γ^{μ} = γ^{μ} P_{L}$ (because $\{γ^{5}, γ^{μ}\} = 0$ flips the sign of $γ^{5}$ when it is moved past $γ^{μ}$ ). Similarly, applying $P_{L}$ on the left gives $i γ^{μ} \partial_{μ} ψ_{R} = 0$ . The two-component Weyl equations

i \bar{σ}^{μ} \partial_{μ} ψ_{L} = 0, i σ^{μ} \partial_{μ} ψ_{R} = 0

(in the chiral basis with $σ^{μ} = (1, σ)$ , $\bar{σ}^{μ} = (1, - σ)$ ) are uncoupled: a left-handed spinor evolves independently of a right-handed spinor. For $m = 0$ , chirality is conserved along the trajectory, and the two halves are physically separable. The neutrino, long considered massless, was modelled by a purely left-handed Weyl spinor until oscillation experiments forced the introduction of a mass term.

Mass term as the chirality mixer. The Dirac mass term $m \bar{ψ} ψ$ explicitly couples the two chiralities. Compute:

\bar{ψ} ψ = ψ^{†} γ^{0} ψ = (ψ_{L} + ψ_{R})^{†} γ^{0} (ψ_{L} + ψ_{R}) .

Using $γ^{0} P_{L} = P_{R} γ^{0}$ (again from $\{γ^{5}, γ^{0}\} = 0$ ), the cross terms survive but the diagonal ones vanish:

ψ_{L}^{†} γ^{0} ψ_{L} = ψ^{†} P_{L} γ^{0} P_{L} ψ = ψ^{†} P_{L} P_{R} γ^{0} ψ = 0,

and similarly for $ψ_{R}^{†} γ^{0} ψ_{R}$ . Therefore

\bar{ψ} ψ = \bar{ψ}_{L} ψ_{R} + \bar{ψ}_{R} ψ_{L},

where $\bar{ψ}_{L, R} := ψ_{L, R}^{†} γ^{0}$ carries the opposite chirality projector (note: $\bar{ψ}_{L} = \overline{P_{L} ψ} = \bar{ψ} P_{R}$ , so the bar swaps chirality). The mass term is off-diagonal in the L-R basis: it converts a left-handed spinor into a right-handed one and vice versa. A massive Dirac fermion is the minimal relativistic quantum description of a particle whose two chiralities mix, which is equivalent to saying that the particle travels slower than light.

This explicitly demonstrates how chirality and mass interact. For $m \neq 0$ , neither $ψ_{L}$ nor $ψ_{R}$ is separately a solution of the Dirac equation; only their linear combination is. For $m = 0$ , the two halves decouple and propagate as independent Weyl fermions.
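The off-diagonal structure of the mass bilinear can be checked on a concrete spinor. The sketch below assumes the chiral basis and an arbitrary test spinor (both choices are illustrative) and confirms that the same-chirality pieces of $ψ^{†} γ^{0} ψ$ vanish while the cross terms reproduce the full bilinear.

```python
def mm(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(*terms):
    """Linear combination sum_k c_k * A_k from (c_k, A_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(cols)] for i in range(rows)]

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

I2 = [[1, 0], [0, 1]]
sx, sy, sz = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
i_sy = [[0, 1], [-1, 0]]
I4 = kron(I2, I2)
# Chiral basis (assumed), as in the projector discussion above.
gamma = [kron(sx, I2)] + [kron(i_sy, s) for s in (sx, sy, sz)]
g5 = lin((1j, mm(mm(gamma[0], gamma[1]), mm(gamma[2], gamma[3]))))
P_L = lin((0.5, I4), (-0.5, g5))
P_R = lin((0.5, I4), (0.5, g5))

def bar(spinor):
    """Dirac adjoint psi^dagger gamma^0 of a 4x1 column, returned as a 1x4 row."""
    dagger = [[entry.conjugate() for row in spinor for entry in row]]
    return mm(dagger, gamma[0])

def bilinear(chi, phi):
    """The number bar(chi) phi."""
    return mm(bar(chi), phi)[0][0]

psi = [[1 + 2j], [0.5 - 1j], [-0.3 + 0.2j], [2 - 0.7j]]   # arbitrary test spinor
psi_L, psi_R = mm(P_L, psi), mm(P_R, psi)

same_chirality = (bilinear(psi_L, psi_L), bilinear(psi_R, psi_R))
cross_terms = bilinear(psi_L, psi_R) + bilinear(psi_R, psi_L)
full = bilinear(psi, psi)

print(same_chirality)        # both entries should vanish
print(full, cross_terms)     # the two numbers should agree
```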

Chiral symmetry as a global $U (1)_{L} \times U (1)_{R}$ . For $m = 0$ , the Dirac Lagrangian

L = \bar{ψ} (i γ^{μ} \partial_{μ}) ψ = i \bar{ψ}_{L} γ^{μ} \partial_{μ} ψ_{L} + i \bar{ψ}_{R} γ^{μ} \partial_{μ} ψ_{R}

(the kinetic term preserves chirality: since $γ^{0} γ^{μ}$ commutes with $γ^{5}$ , the diagonal term $\bar{ψ}_{L} γ^{μ} ψ_{L} = ψ^{†} P_{L} γ^{0} γ^{μ} P_{L} ψ = ψ^{†} γ^{0} γ^{μ} P_{L} ψ$ survives, while the cross term $\bar{ψ}_{L} γ^{μ} ψ_{R} = ψ^{†} γ^{0} γ^{μ} P_{L} P_{R} ψ$ vanishes) has two continuous global symmetries:

Vector symmetry: $ψ \to e^{i α} ψ$ (equivalently $ψ_{L} \to e^{i α} ψ_{L}$ , $ψ_{R} \to e^{i α} ψ_{R}$ ). Conserved current $j^{μ} = \bar{ψ} γ^{μ} ψ$ , conserved charge $Q = \int ψ^{†} ψ d^{3} x$ (particle number, or electric charge after gauging).
Axial (chiral) symmetry: $ψ \to e^{i β γ^{5}} ψ$ (equivalently $ψ_{L} \to e^{- i β} ψ_{L}$ , $ψ_{R} \to e^{i β} ψ_{R}$ ). Conserved current $j_{5}^{μ} = \bar{ψ} γ^{μ} γ^{5} ψ$ , conserved charge $Q_{5} = \int ψ^{†} γ^{5} ψ d^{3} x$ (chirality).

The axial symmetry is generated by $γ^{5}$ . Because $\{γ^{5}, γ^{μ}\} = 0$ , the massless Dirac operator $i γ^{μ} \partial_{μ}$ anticommutes with $γ^{5}$ , so if $ψ$ is a solution, so is $γ^{5} ψ$ . The projectors $P_{L}, P_{R}$ decompose $ψ = ψ_{L} + ψ_{R}$ into left- and right-handed components that evolve independently. This is chiral symmetry.

A mass term $m \bar{ψ} ψ = m (\bar{ψ}_{L} ψ_{R} + \bar{ψ}_{R} ψ_{L})$ explicitly breaks chiral symmetry by coupling the two chiralities. The breaking is proportional to $m$ ; for light quarks ( $m_{u}, m_{d} ≪ Λ_{QCD}$ ), chiral symmetry is approximate, and its spontaneous breaking $S U (2)_{L} \times S U (2)_{R} \to S U (2)_{V}$ produces the pions as (pseudo-)Goldstone bosons, the foundation of chiral perturbation theory.
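A small numerical experiment illustrates why the mass term is not axially invariant. The sketch below assumes the chiral basis, an arbitrary test spinor and an arbitrary angle; it uses $e^{i β γ^{5}} = \cos β + i γ^{5} \sin β$ (valid because $(γ^{5})^{2} = 1$ ) and checks that the vector current is unchanged while the scalar bilinear rotates into the pseudoscalar one, $ψ^{†} γ^{0} ψ \to \cos 2 β \, ψ^{†} γ^{0} ψ + i \sin 2 β \, ψ^{†} γ^{0} γ^{5} ψ$ .

```python
import math

def mm(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(*terms):
    """Linear combination sum_k c_k * A_k from (c_k, A_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(cols)] for i in range(rows)]

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

I2 = [[1, 0], [0, 1]]
sx, sy, sz = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
i_sy = [[0, 1], [-1, 0]]
I4 = kron(I2, I2)
gamma = [kron(sx, I2)] + [kron(i_sy, s) for s in (sx, sy, sz)]   # chiral basis (assumed)
g5 = lin((1j, mm(mm(gamma[0], gamma[1]), mm(gamma[2], gamma[3]))))

def sandwich(spinor, M):
    """The number psi^dagger gamma^0 M psi for a 4x1 column spinor."""
    dagger = [[entry.conjugate() for row in spinor for entry in row]]
    return mm(mm(dagger, gamma[0]), mm(M, spinor))[0][0]

beta = 0.37                                               # arbitrary axial angle
U_axial = lin((math.cos(beta), I4), (1j * math.sin(beta), g5))

psi = [[1 + 2j], [0.5 - 1j], [-0.3 + 0.2j], [2 - 0.7j]]   # arbitrary test spinor
psi_rot = mm(U_axial, psi)

# Vector current components before and after the axial rotation.
current_before = [sandwich(psi, g) for g in gamma]
current_after = [sandwich(psi_rot, g) for g in gamma]
current_invariant = all(abs(x - y) < 1e-12 for x, y in zip(current_before, current_after))

# Scalar bilinear: rotates into the pseudoscalar one by the angle 2 * beta.
scalar_before, pseudo_before = sandwich(psi, I4), sandwich(psi, g5)
scalar_after = sandwich(psi_rot, I4)
scalar_predicted = (math.cos(2 * beta) * scalar_before
                    + 1j * math.sin(2 * beta) * pseudo_before)

print(current_invariant)
print(scalar_before, scalar_after, scalar_predicted)
```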

In QCD, even for massless quarks, the axial $U (1)_{A}$ symmetry is broken by an anomaly: the fermionic path-integral measure is not invariant under it (the Adler-Bell-Jackiw anomaly). The divergence of the axial current is, schematically and up to convention-dependent normalisation, $\partial_{μ} j_{5}^{μ} = \frac{g^{2}}{16 π^{2}} F_{μν} \tilde{F}^{μν}$ , which integrates to a nonzero value on topologically non-trivial gauge-field configurations. This anomaly resolves the $U (1)_{A}$ problem (why the $η^{'}$ meson is heavy despite near-chiral symmetry) and is connected to instanton physics and the strong CP problem.

The chiral structure of the Dirac equation is also the origin of the V-A (vector minus axial) structure of the weak interaction, the neutrino mass problem (observed neutrino oscillations require mass, which breaks chiral symmetry for fermions that were originally placed in purely left-handed Weyl representations), and the anomaly cancellation constraints on fermion representations in grand unified theories.

Foldy-Wouthuysen transformation and the systematic non-relativistic expansion [Master]

The Pauli-equation derivation above keeps only the leading $1/ m$ term and discards everything else. To get the systematic expansion — including the Darwin term, the spin-orbit coupling, and the relativistic kinetic correction — requires a controlled framework that removes the troublesome odd operators (those mixing large and small components) order by order in $1/ m$ . This is the Foldy-Wouthuysen transformation (1950) ^{[Foldy-Wouthuysen 1950]}, the canonical bridge from the four-component Dirac picture to the two-component Pauli picture with all relativistic corrections accounted for.

The transformation as an exponentiated rotation. The Dirac Hamiltonian splits naturally into even and odd parts with respect to $β = γ^{0}$ :

H_{D} = β m + E + O, \{β, O\} = 0, [β, E] = 0.

For minimal-coupled Dirac, $E = e V$ (block-diagonal) and $O = α \cdot π$ (block-off-diagonal, mixing $ϕ_{L}$ with $ϕ_{S}$ ). The odd part is the obstruction to a clean two-component reduction.

Foldy and Wouthuysen sought a unitary transformation $ψ \to ψ^{'} = U ψ$ (and correspondingly $H^{'} = U H U^{†} - i U \partial_{t} U^{†}$ ) that eliminates $O$ at leading order. Take

U_{1} = exp (i S_{1}), S_{1} = - \frac{i β O}{2 m} = - \frac{i β α \cdot π}{2 m} .

The operator $S_{1}$ is Hermitian because $(β α)^{†} = α^{†} β^{†} = α β = - β α$ (anticommutation) and the prefactor $- i$ flips the sign once more. The transformed Hamiltonian is

H_{1}^{'} = U_{1} H_{D} U_{1}^{†} = e^{i S_{1}} (β m + E + O) e^{- i S_{1}} .

Expand the BCH series:

H_{1}^{'} = H_{D} + i [S_{1}, H_{D}] + \frac{i ^{2}}{2 !} [S_{1}, [S_{1}, H_{D}]] + \dots

Since $S_{1}$ is odd, it anticommutes with $β$ , so $[S_{1}, β] = 2 S_{1} β = i O / m$ (using $β O β = - O$ ). The leading commutator is therefore $i [S_{1}, β m] = i (i O / m) m = - O$ , exactly cancelling the offending odd operator at $O (m^{0})$ . Higher commutators produce new even operators at order $1/ m$ , $1/ m^{2}$ , etc.
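The cancellation can be checked with explicit matrices for a free particle. The sketch below assumes the Dirac basis and arbitrary numerical values for $p$ and $m$ ; it verifies that $S_{1}$ is Hermitian, that $i [S_{1}, β m] = - O$ , and that the next terms of the BCH series, $i [S_{1}, O] + \frac{i^{2}}{2} [S_{1}, [S_{1}, β m]]$ , combine into the even operator $β O^{2} / (2 m)$ .

```python
def mm(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(*terms):
    """Linear combination sum_k c_k * A_k from (c_k, A_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(cols)] for i in range(rows)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def dagger(A):
    """Conjugate transpose of a square matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A))]

def comm(A, B):
    return lin((1, mm(A, B)), (-1, mm(B, A)))

I2 = [[1, 0], [0, 1]]
sx, sy, sz = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
beta = kron(sz, I2)                              # Dirac basis: beta = diag(1, 1, -1, -1)
alpha = [kron(sx, s) for s in (sx, sy, sz)]      # alpha^i = [[0, s_i], [s_i, 0]]

m = 2.0                                          # arbitrary mass
p = [0.3, -0.4, 1.2]                             # arbitrary momentum
O = lin(*[(p[k], alpha[k]) for k in range(3)])   # odd operator alpha . p
S1 = lin((-1j / (2 * m), mm(beta, O)))           # S_1 = -i beta O / (2 m)

s1_is_hermitian = close(dagger(S1), S1)
beta_anticommutes_with_O = close(mm(beta, O), lin((-1, mm(O, beta))))

leading = lin((1j * m, comm(S1, beta)))          # i [S_1, beta m]
leading_cancels_O = close(leading, lin((-1, O)))

# i [S_1, O] + (i^2 / 2) [S_1, [S_1, beta m]] = i [S_1, O] + (i / 2) [S_1, leading]
next_even = lin((1j, comm(S1, O)), (0.5j, comm(S1, leading)))
next_is_kinetic = close(next_even, lin((1 / (2 * m), mm(beta, mm(O, O)))))

print(s1_is_hermitian, beta_anticommutes_with_O, leading_cancels_O, next_is_kinetic)
```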

The systematic expansion through order $1/ m^{2}$ . Carrying the BCH expansion to the required order, the transformed Hamiltonian becomes

H_{1}^{'} = β m + E + \frac{β O^{2}}{2 m} - \frac{β O^{4}}{8 m^{3}} - \frac{1}{8 m^{2}} [O, [O, E]] - \frac{i}{8 m^{2}} [O, \dot{O}] + O_{new},

where the new odd operator $O_{new} = O (1/ m)$ is one order smaller than $O$ . Iteration — applying a second FW rotation $U_{2} = exp (i S_{2})$ with $S_{2} = - i β O_{new} / (2 m)$ — eliminates this odd operator and introduces a yet-smaller one of order $1/ m^{2}$ . Continuing the iteration order by order produces a unitary $U = U_{3} U_{2} U_{1}$ such that the final Hamiltonian is block-diagonal up to errors of order $1/ m^{3}$ .

Reading off the physical terms. Specialize to the electron in an electrostatic potential, $O = α \cdot p$ (no $A$ ), $E = e V$ . Compute each operator in the standard representation:

$β O^{2} / (2 m) = β (α \cdot p)^{2} / (2 m) = β ∣ p ∣^{2} / (2 m)$ : the non-relativistic kinetic energy.
$- β O^{4} / (8 m^{3}) = - β ∣ p ∣^{4} / (8 m^{3})$ : on the upper components, the relativistic kinetic correction $- p^{4} / (8 m^{3} c^{2})$ (restoring $c$ ).
$- [O, [O, E]] / (8 m^{2})$ evaluates via $[p, e V] = - i e \nabla V$ :
- The double commutator splits into a symmetric piece and an antisymmetric piece. The symmetric piece gives the Darwin term $\frac{e}{8 m^{2}} \nabla^{2} V$ (a contact interaction that reflects the smearing of the electron position over its Compton wavelength).
- The antisymmetric piece gives the spin-orbit term $\frac{e}{4 m^{2}} σ \cdot (\nabla V \times p)$ . For a central potential $V (r)$ , $\nabla V = (1/ r) (d V / d r) r$ , so $σ \cdot (\nabla V \times p) = (1/ r) (d V / d r) σ \cdot L$ with $L = r \times p$ , and the spin-orbit coupling becomes the familiar $\frac{e}{2 m^{2} c^{2}} \frac{1}{r} \frac{d V}{d r} L \cdot S$ , with the famous Thomas factor of $1/2$ relative to the naive classical result emerging automatically (Thomas precession is no longer needed as an extra input).
For a nonzero vector potential $A \neq 0$ , the kinetic term picks up the Pauli magnetic moment as derived above: $- \frac{e}{2 m} σ \cdot B$ , with $g = 2$ .

The Pauli equation with relativistic corrections. Restricting to the upper two components (now the only ones occupied in the absence of pair-production sources), the effective non-relativistic Hamiltonian is

H_{NR} = \frac{∣ p ∣ ^{2}}{2 m} + e V - \frac{∣ p ∣ ^{4}}{8 m ^{3}} - \frac{e}{2 m} σ \cdot B + \frac{e}{8 m ^{2}} \nabla^{2} V + \frac{e}{4 m ^{2}} σ \cdot (\nabla V \times p) .

Term-by-term: non-relativistic kinetic + electrostatic + relativistic kinetic correction + Pauli Zeeman + Darwin + spin-orbit. This is the complete non-relativistic limit of the Dirac equation through order $1/ m^{2}$ . Applied to the hydrogen atom with Coulomb potential energy $e V = - e^{2} / r$ (Gaussian units), these corrections reproduce the entire hydrogen fine structure: the same level splittings Sommerfeld obtained in 1916 within the old quantum theory, but now from a single first-principles equation. This is the second great success of the Dirac equation (after $g = 2$ ). The agreement with experiment is at the level of one part in $10^{4}$ for the $2 p_{1/2}$ – $2 p_{3/2}$ splitting; the Lamb shift between $2 s_{1/2}$ and $2 p_{1/2}$ is a QED radiative effect not captured by the FW reduction alone.
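As a numerical illustration of the hydrogen fine structure, the $n = 2$ splitting can be estimated in two ways. The sketch below is not taken from the text above: it assumes the standard closed-form Dirac-Coulomb levels $E_{nj} = m [1 + α^{2} / (n - k + \sqrt{k^{2} - α^{2}})^{2}]^{- 1/2}$ with $k = j + 1/2$ , and the standard first-order result $- \frac{α^{4} m}{2 n^{4}} (\frac{n}{j + 1/2} - \frac{3}{4})$ for the sum of the kinetic, Darwin and spin-orbit shifts. The constants are assumed input values, and reduced-mass and radiative corrections are ignored.

```python
import math

ALPHA = 1 / 137.035999       # fine-structure constant (assumed input value)
M_E_EV = 510998.95           # electron rest energy in eV (assumed input value)
H_EV_S = 4.135667696e-15     # Planck constant in eV s

def dirac_level(n, j, alpha=ALPHA):
    """Closed-form Dirac-Coulomb energy E_{n j} in units of the electron mass."""
    k = j + 0.5
    n_eff = n - k + math.sqrt(k * k - alpha * alpha)
    return 1.0 / math.sqrt(1.0 + (alpha / n_eff) ** 2)

def fine_structure_shift(n, j, alpha=ALPHA):
    """Order alpha^4 shift in units of the electron mass."""
    return -(alpha ** 4) / (2 * n ** 4) * (n / (j + 0.5) - 0.75)

# Splitting between the j = 3/2 and j = 1/2 levels at n = 2, in eV.
exact_ev = (dirac_level(2, 1.5) - dirac_level(2, 0.5)) * M_E_EV
perturbative_ev = (fine_structure_shift(2, 1.5) - fine_structure_shift(2, 0.5)) * M_E_EV
splitting_ghz = exact_ev / H_EV_S / 1e9

print(f"closed form:  {exact_ev:.4e} eV")
print(f"order alpha^4: {perturbative_ev:.4e} eV")
print(f"frequency:    {splitting_ghz:.2f} GHz")
```

Both estimates come out close to $α^{4} m / 32 \approx 4.5 \times 10^{- 5}$ eV, about 11 GHz, which is the scale of the $2 p_{1/2}$ – $2 p_{3/2}$ splitting.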

The FW transformation as a position-operator question. A deeper reading of the FW transformation: in the original four-component picture, the position operator $x$ mixes positive- and negative-energy states, leading to the unphysical Zitterbewegung (rapid oscillation at angular frequency $2 m c^{2} /ℏ \approx 10^{21}\,\mathrm{s}^{-1}$ ) when computing the expectation value of $x$ on a localised wave packet, which necessarily contains components of both energy signs. The FW transformation diagonalises the Hamiltonian in the energy-sign basis, and the transformed position operator $x_{FW} = U x U^{†}$ no longer mixes the two branches: it is the mean position of the wave packet without the Compton-scale jitter. Newton and Wigner (1949) had introduced this operator independently from a representation-theoretic standpoint (the unique position operator covariant under the Euclidean group acting on a positive-energy irreducible representation of the Poincaré group); Foldy-Wouthuysen recover the same operator via a constructive unitary transformation.

Hole theory, the Dirac sea, and the bridge to QFT second-quantisation [Master]

The single-particle Dirac equation has a foundational problem that the original 1928 paper did not resolve and that drove the next two decades of physics: the negative-energy spectrum is unbounded below, so the theory has no ground state if interpreted naively as a one-particle wave mechanics.

The unbounded-below spectrum. The free Dirac Hamiltonian $H_{D} = α \cdot p + β m$ has eigenvalues $\pm E_{p} = \pm \sqrt{∣ p ∣^{2} + m^{2}}$ , each doubly degenerate (Exercise 9). As $p$ ranges over $R^{3}$ , the negative branch covers $(- \infty, - m]$ . Any external perturbation (say, a photon) could in principle cause an electron sitting at $E = + m$ to transition to $E = - m - 10 TeV$ , releasing roughly $10 TeV$ of energy. The single-particle Dirac electron would tumble downward indefinitely. This catastrophic instability is not observed; real electrons sit happily in atomic orbitals for cosmic timescales.
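The spectrum can be confirmed without diagonalising anything. The sketch below assumes the Dirac basis and arbitrary numerical values for $p$ and $m$ : it checks that $H_{D}^{2} = (∣ p ∣^{2} + m^{2}) 1$ , so every eigenvalue is $\pm E_{p}$ , and that the projectors $Λ_{\pm} = \frac{1}{2} (1 \pm H_{D} / E_{p})$ each have trace 2, so each sign is doubly degenerate.

```python
import math

def mm(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(*terms):
    """Linear combination sum_k c_k * A_k from (c_k, A_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(cols)] for i in range(rows)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def trace(A):
    return sum(A[k][k] for k in range(len(A)))

I2 = [[1, 0], [0, 1]]
sx, sy, sz = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]
I4 = kron(I2, I2)
beta = kron(sz, I2)                              # Dirac basis (assumed)
alpha = [kron(sx, s) for s in (sx, sy, sz)]

m = 1.0                                          # arbitrary mass
p = [0.3, -0.4, 1.2]                             # arbitrary momentum
E_p = math.sqrt(sum(c * c for c in p) + m * m)

H = lin(*[(p[k], alpha[k]) for k in range(3)], (m, beta))   # alpha . p + beta m

h_squared_is_scalar = close(mm(H, H), lin((E_p ** 2, I4)))
Lambda_plus = lin((0.5, I4), (0.5 / E_p, H))     # projector onto energy +E_p
Lambda_minus = lin((0.5, I4), (-0.5 / E_p, H))   # projector onto energy -E_p
plus_is_projector = close(mm(Lambda_plus, Lambda_plus), Lambda_plus)
degeneracies = (trace(Lambda_plus).real, trace(Lambda_minus).real)

print(h_squared_is_scalar, plus_is_projector, degeneracies)
```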

Hole theory (Dirac 1930). Dirac's resolution: postulate that in the actual vacuum, every negative-energy state is filled by an electron, forming an inert sea. The Pauli exclusion principle then prevents any positive-energy electron from cascading downward, because there is no empty negative-energy state to fall into. The transition from $+ E$ to $- E$ would require placing a second electron in a state that a sea electron already occupies, which the exclusion principle forbids.

A photon with energy $\geq 2 m$ can promote a sea electron to a positive-energy state. The result is observable: a positive-energy electron and a missing slot ("hole") in the sea. The hole has:

Energy $+ E_{p}$ (because removing a $- E_{p}$ state from the sea adds $+ E_{p}$ to the total).
Charge $+ ∣ e ∣$ (because removing a charge $- ∣ e ∣$ contributes net $+ ∣ e ∣$ ).
Momentum $- p$ relative to the missing sea state.
Spin $1/2$ with opposite spin projection to the missing sea state.

Dirac initially identified the hole with the proton (the only known positive charge), but Weyl and Oppenheimer immediately objected: the hole must have the same mass as the electron, and the proton is ~1836 times heavier. Dirac conceded in 1931, about a year after proposing hole theory, and predicted a new particle. Anderson's 1932 observation of the positron in a cloud chamber ^{[Anderson 1932]} confirmed the prediction in dramatic fashion.

Why hole theory failed as a literal physical picture. Hole theory, taken seriously, requires:

An infinite negative charge density of sea electrons (each carrying charge $- ∣ e ∣$ , summed over an infinite continuum). The observable physics is the deviation from this infinite reference, with infinite charge absorbed as a renormalisation of the vacuum.
The Pauli exclusion principle to prevent infinite cascading. But the exclusion principle is a statistical property of fermions, not of electrons alone — bosonic relativistic particles (e.g., the pion, described by the Klein-Gordon equation) also have negative-energy solutions, and there is no exclusion principle to stop them.

The second point is decisive. Klein-Gordon's negative-energy problem cannot be solved by hole theory because pions are bosons; any number of them can pile into the same state, so a "filled sea" does not stabilise them. A different framework is needed — one that handles bosons and fermions on the same footing.

The QFT resolution: the Dirac field as an operator. The modern treatment of the negative-energy problem is the second-quantisation programme. Promote $ψ (x)$ from a $C^{4}$ -valued classical field to an operator-valued field acting on a Fock space. Expand it in a basis of positive- and negative-frequency mode functions:

\hat{ψ} (x) = \sum_{s} \int \frac{d^{3} p}{(2 π)^{3}} \frac{1}{\sqrt{2 E_{p}}} [\hat{a}_{p}^{s} u^{s} (p) e^{- i p \cdot x} + \hat{b}_{p}^{s †} v^{s} (p) e^{+ i p \cdot x}] .

The operators $\hat{a}_{p}^{s}$ and $\hat{a}_{p}^{s †}$ annihilate and create electrons; the operators $\hat{b}_{p}^{s}$ and $\hat{b}_{p}^{s †}$ annihilate and create positrons. The key reinterpretation: the negative-frequency mode $e^{+ i p \cdot x}$ that was a "negative-energy solution" in single-particle language is now accompanied by a positron creation operator $\hat{b}_{p}^{s †}$ . Removing a negative-energy electron from the sea is rephrased as creating a positive-energy positron in the vacuum.

The operators satisfy the canonical anticommutation relations (CAR):

\{\hat{a}_{p}^{s}, \hat{a}_{q}^{r †}\} = (2 π)^{3} δ^{3} (p - q) δ^{r s}, \{\hat{b}_{p}^{s}, \hat{b}_{q}^{r †}\} = (2 π)^{3} δ^{3} (p - q) δ^{r s},

with all other anticommutators vanishing. Anticommutators rather than commutators because $ψ$ describes fermions — this is the input from the spin-statistics theorem (cross-link 12.13.02). The Hilbert space is the fermionic Fock space $F_{a} (H_{+}) \otimes F_{a} (H_{-})$ built from the positive-energy electron one-particle Hilbert space $H_{+}$ and the positive-energy positron one-particle Hilbert space $H_{-}$ , both of which carry the irreducible positive-energy spinor representation of the Poincaré group (Wigner 1939 classification).

The Hamiltonian becomes positive-definite. Computing $\hat{H} = \int d^{3} x \hat{ψ}^{†} H_{D} \hat{ψ}$ in this expansion and applying the CAR (using $\hat{b}_{p} \hat{b}_{p}^{†} = (2 π)^{3} δ^{3} (0) - \hat{b}_{p}^{†} \hat{b}_{p}$ and normal-ordering the result) gives

: \hat{H} : = \sum_{s} \int \frac{d^{3} p}{(2 π)^{3}} E_{p} (\hat{a}_{p}^{s †} \hat{a}_{p}^{s} + \hat{b}_{p}^{s †} \hat{b}_{p}^{s}) .

Every term is manifestly $\geq 0$ : both number operators $\hat{N}_{e} = \hat{a}^{†} \hat{a}$ and $\hat{N}_{\bar{e}} = \hat{b}^{†} \hat{b}$ have non-negative eigenvalues, and $E_{p} > 0$ . The vacuum state $∣0 ⟩$ (annihilated by all $\hat{a}_{p}^{s}$ and $\hat{b}_{p}^{s}$ ) is the unique ground state at energy zero. No Dirac sea is required: the negative-energy problem dissolves because the operator algebra rephrases the would-be negative-energy excitations as positron creations from a positive-definite vacuum.
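The mechanism can be illustrated with a toy model. The sketch below is a deliberately finite stand-in for the field theory, not the field-theoretic calculation itself: a single electron mode and a single positron mode, represented by $4 \times 4$ matrices via a Jordan-Wigner construction (an assumption of this example), with the delta functions replaced by 1. It shows that the un-ordered combination $E (a^{†} a - b b^{†})$ differs from the normal-ordered $E (a^{†} a + b^{†} b)$ only by a constant, and that the latter has a non-negative spectrum with the vacuum at zero.

```python
def mm(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lin(*terms):
    """Linear combination sum_k c_k * A_k from (c_k, A_k) pairs."""
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    return [[sum(c * A[i][j] for c, A in terms) for j in range(cols)] for i in range(rows)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def dagger(A):
    """Conjugate transpose of a square matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A))]

def anti(A, B):
    """Anticommutator {A, B}."""
    return lin((1, mm(A, B)), (1, mm(B, A)))

I2 = [[1, 0], [0, 1]]
sz = [[1, 0], [0, -1]]
lower = [[0, 1], [0, 0]]          # single-mode annihilator: |1> -> |0>
I4 = kron(I2, I2)
ZERO = lin((0, I4))

# Jordan-Wigner construction (toy assumption): the sz factor makes a and b anticommute.
a = kron(lower, I2)               # electron-mode annihilator
b = kron(sz, lower)               # positron-mode annihilator

car_holds = (close(anti(a, dagger(a)), I4) and close(anti(b, dagger(b)), I4)
             and close(anti(a, b), ZERO) and close(anti(a, dagger(b)), ZERO)
             and close(mm(a, a), ZERO) and close(mm(b, b), ZERO))

E = 1.5                           # arbitrary mode energy
N_e, N_p = mm(dagger(a), a), mm(dagger(b), b)
H_unordered = lin((E, N_e), (-E, mm(b, dagger(b))))   # E (a^dag a - b b^dag)
H_normal = lin((E, N_e), (E, N_p))                     # E (a^dag a + b^dag b)

# Both operators are diagonal in the occupation basis, so the diagonal is the spectrum.
spectrum_unordered = sorted(H_unordered[k][k] for k in range(4))
spectrum_normal = sorted(H_normal[k][k] for k in range(4))
differ_by_constant = close(H_unordered, lin((1, H_normal), (-E, I4)))

print(car_holds, differ_by_constant)
print(spectrum_unordered, spectrum_normal)
```

In the field theory the constant shift is the infinite $δ^{3} (0)$ term removed by normal ordering; here it is simply $- E$ .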

Crossing symmetry and the unification of particle and antiparticle. In the second-quantised picture, particle and antiparticle are no longer two species but two excitations of one field. Feynman's interpretation, equivalent to the operator picture, treats the positron as an electron propagating backward in time: the negative-frequency mode $v (p) e^{+ i p \cdot x}$ is reinterpreted as a forward-time wave for the conjugate particle. Crossing symmetry, the fact that the amplitudes $M (A + B \to C + D)$ and $M (A + \bar{D} \to C + \bar{B})$ are given by the same analytic function of the momenta in QFT, is the rigorous statement of this equivalence. The Klein paradox (Exercise 10) finds its proper explanation in this framework: the strong potential pair-creates electrons and positrons from the vacuum, and the "transmitted" wave is a real positron travelling away.

CPT and the spin-statistics theorem. Two structural results crown the second-quantised Dirac theory:

CPT theorem (Lüders, Pauli, Schwinger; ~1954): any Lorentz-invariant local quantum field theory with a positive-definite metric on the Hilbert space is invariant under the combined operation $Θ = CPT$ (charge conjugation $C$ : electron $\leftrightarrow$ positron; parity $P$ : $x \to - x$ ; time reversal $T$ : $t \to - t$ with antiunitarity). For the Dirac field, $C ψ C^{- 1} = - i γ^{2} ψ^{*}$ , and the combined transformation is an exact symmetry of the free theory and of any interacting local Lorentz-invariant theory. CPT is one of the most stringently tested predictions in physics; no violation has ever been observed (the relative mass difference between $K^{0}$ and $\bar{K}^{0}$ is bounded below about $10^{- 18}$ , and the relative electron-positron mass difference below about $10^{- 8}$ ).
Spin-statistics theorem (Fierz 1939, Pauli 1940): in any Lorentz-invariant QFT, half-integer-spin fields must be quantised with anticommutators (fermions) and integer-spin fields must be quantised with commutators (bosons). Choosing the wrong statistics for a given spin makes the energy unbounded below or the algebra of observables ill-defined. The Dirac field, carrying the $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ spin- $1/2$ representation of $SL (2, C)$ , must obey the CAR — the negative-energy problem cannot be resolved by bosonic quantisation of the Dirac field. The full proof uses the Wightman axiom of positivity of the inner product and is laid out in 12.13.02 and in Streater-Wightman 1964.

Synthesis. The foundational reason the Dirac equation has the structure it does is that it sits exactly at the intersection of three constraints: relativistic energy-momentum, first-order time evolution, and the positive-definite probability current. The central insight of the Dirac framework is that satisfying all three simultaneously forces (a) the Clifford algebra Cl(1,3) and its minimal 4-component representation, (b) the spinor representation of $SL (2, C)$ which is exactly the spin- $1/2$ representation, (c) a negative-energy branch that must be reinterpreted, and (d) the prediction $g = 2$ for the magnetic moment with all relativistic corrections systematically calculable via Foldy-Wouthuysen. Putting these together with the second-quantised Fock framework identifies the negative-energy problem with the existence of antiparticles, the chirality structure with the parity-violating weak interaction, and the spin-orbit + Darwin + Zeeman terms with the entire hydrogen fine structure. This is exactly the bridge between single-particle relativistic quantum mechanics and QFT: the Dirac equation generalises classical wave mechanics by forcing the inclusion of antimatter, just as the Schrödinger equation generalises classical mechanics by forcing quantisation; the pattern recurs throughout the Standard Model, where the requirement of relativistic QFT plus gauge symmetry predicts the entire particle zoo and its weak-interaction asymmetries.

Connections [Master]

Special relativity 10.05.01 pending provides the Minkowski metric and Lorentz transformations that the Dirac equation is built on. The Clifford algebra $Cl (1, 3)$ is the algebraic encoding of the metric signature; the gamma matrices are its matrix representation. Without the relativity framework, there is no Dirac equation.
Spin and angular momentum 12.05.01 pending emerge from the Dirac equation rather than being postulated. The orbital angular momentum $L = x \times p$ is not separately conserved; the conserved quantity is $J = L + S$ where $S = \frac{1}{2} Σ$ is the spin operator built from the gamma matrices. The non-relativistic limit recovers the Pauli spin matrices.
Schrödinger equation 12.03.01 pending is the non-relativistic, single-component limit of the Dirac equation. The Dirac equation reduces to the Pauli equation (a two-component Schrödinger-type equation with spin) in the $v / c ≪ 1$ limit; the Pauli equation reduces to the Schrödinger equation when the magnetic field vanishes.
Path integral for the Dirac field 12.10.01 pending replaces the wave-function picture with a sum-over-histories. Fermionic path integrals require Grassmann variables (anticommuting numbers) because the Dirac field describes fermions. The path-integral formulation makes the connection to QED scattering amplitudes systematic.
Clifford algebras 03.09.08 are the mathematical substrate. The gamma matrices generate the complexified Clifford algebra $Cl (1, 3) \otimes C ≅ Mat (4, C)$ ; the classification of Clifford algebras $Cl (p, q)$ in the periodicity theorem (Bott periodicity) explains why spinors in different spacetime dimensions have different sizes (2-component in $d = 3$ , 4-component in $d = 4$ , etc.).
Klein-Gordon equation 12.11.00 pending is the relativistic wave equation that comes from "quantising" $E^{2} = p^{2} + m^{2}$ directly. Every Dirac solution satisfies the Klein-Gordon equation, but the Dirac equation is a stronger constraint that resolves the negative-probability and spin degeneracy problems.
Quantum electrodynamics builds on the Dirac equation as the matter sector. The QED Lagrangian $L = \bar{ψ} (i γ^{μ} D_{μ} - m) ψ - \frac{1}{4} F_{μν} F^{μν}$ couples the Dirac field to the Maxwell field via the covariant derivative $D_{μ} = \partial_{μ} + i e A_{μ}$ . The Dirac propagator, vertex factor $- i e γ^{μ}$ , and the gamma-matrix trace technology developed above are the computational ingredients of all QED calculations.
CPT theorem and the spin-statistics theorem are deep structural results of relativistic QFT that apply to the Dirac field. The CPT theorem guarantees that the combined operation of charge conjugation, parity, and time reversal is an exact symmetry. The spin-statistics theorem guarantees that half-integer spin fields (Dirac) are fermions and integer spin fields are bosons.

Historical & philosophical context [Master]

Dirac formulated his equation in 1928 ^{[Dirac 1928]}, published in two papers in the Proceedings of the Royal Society A 117 and A 118. The motivation was to find a relativistic wave equation that was first-order in time (unlike Klein-Gordon) and that yielded a positive-definite probability density. Dirac's starting point was the observation that the Klein-Gordon equation's second-order time derivative prevented a conserved positive-definite probability current; a first-order equation would fix this.

The equation immediately yielded the correct fine-structure splitting of the hydrogen atom (the original empirical success) and the $g = 2$ magnetic moment. But it also produced negative-energy solutions that had no obvious physical interpretation. Dirac wrestled with this for two years before proposing the hole theory in 1930: the negative-energy states form a filled sea, and a hole in this sea is a new particle with positive energy and positive charge. Dirac initially identified this particle with the proton, hoping to explain the electron-proton mass asymmetry — a proposal that Weyl and Oppenheimer immediately criticised on the grounds that the hole would have the same mass as the electron. Dirac conceded in 1931 and predicted a particle with the electron's mass but opposite charge.

Anderson's discovery of the positron in 1932 ^{[Anderson 1932]} — observed as a particle of electron mass but opposite curvature in a cloud chamber exposed to cosmic rays — was the dramatic confirmation. It was the first instance of a particle predicted mathematically before its experimental detection, and it established the concept of antimatter as a physical reality rather than a mathematical artifact.

The philosophical implications are deep. The Dirac equation showed that combining quantum mechanics with special relativity does not merely refine existing predictions — it predicts new particles. Antimatter is not an optional extra but a structural consequence of relativistic quantum mechanics. This pattern repeated throughout the 20th century: the Dirac equation's structure generalised to the Standard Model, where the requirement of relativistic quantum field theory, combined with gauge symmetry, predicts the existence of the $W$ and $Z$ bosons, the Higgs boson, and the entire particle zoo.

The Klein paradox (1929) showed early on that the single-particle interpretation of the Dirac equation fails in strong external potentials. The resolution — pair production from the vacuum — required the full machinery of quantum field theory (second quantisation, the Dirac field as an operator-valued distribution). This transition from wave mechanics to field theory, forced by the Dirac equation's own structure, is one of the central conceptual shifts in 20th-century physics.

The systematic non-relativistic reduction was completed by Foldy and Wouthuysen in 1950 ^{[Foldy-Wouthuysen 1950]}. Their canonical unitary transformation block-diagonalises the Dirac Hamiltonian into positive- and negative-energy sectors order by order in $1/m$, and identifies the kinetic, Darwin, spin-orbit, and Pauli-Zeeman terms, with the Thomas factor of one half in the spin-orbit term emerging automatically. Schwinger's 1948 one-loop calculation of the anomalous magnetic moment, $a_{e} = α / (2 π)$, inaugurated quantum electrodynamics as a precision-predictive theory. Penning-trap measurements refined over decades have since pushed the agreement to roughly twelve significant figures. This is among the most precise theory-experiment matches in physics, and it serves as an empirical benchmark against which extensions of the Standard Model (supersymmetry, composite electrons, extra dimensions) are tested.
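To give a sense of scale for Schwinger's term, the short Python sketch below evaluates $a_{e} = α / (2 π)$ numerically and the corresponding $g = 2(1 + a_{e})$. The constants are assumptions, not values taken from this unit: `ALPHA_INVERSE` is a rounded value of the inverse fine-structure constant, and `A_E_MEASURED` is a rounded measured anomaly quoted only for comparison. The function name `schwinger_anomaly` is likewise illustrative.

```python
import math

# Assumption: rounded inverse fine-structure constant, not taken from this unit.
ALPHA_INVERSE = 137.035999
# Assumption: rounded measured electron anomaly, used only as a comparison point.
A_E_MEASURED = 0.00115965218


def schwinger_anomaly(alpha: float) -> float:
    """Leading one-loop anomaly a_e = alpha / (2 pi)."""
    return alpha / (2.0 * math.pi)


alpha = 1.0 / ALPHA_INVERSE
a_e_leading = schwinger_anomaly(alpha)

# The Dirac value g = 2 is shifted by the anomaly: g = 2 (1 + a_e).
g_leading = 2.0 * (1.0 + a_e_leading)

# Fractional gap between the one-loop term alone and the rounded measured value.
relative_gap = (a_e_leading - A_E_MEASURED) / A_E_MEASURED

print(f"a_e (one loop)  = {a_e_leading:.7f}")
print(f"g   (one loop)  = {g_leading:.6f}")
print(f"gap to measured = {relative_gap:.2%}")
```

With these inputs the one-loop term alone comes out near 0.00116 and overshoots the measured anomaly by a fraction of a percent. Closing that remaining gap is the job of the higher-order contributions, which this sketch does not attempt.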

Bibliography [Master]

Primary literature:

Dirac, P. A. M., "The Quantum Theory of the Electron", Proc. Roy. Soc. A 117 (1928), 610–624. [Originator paper.]
Dirac, P. A. M., "The Quantum Theory of the Electron. Part II", Proc. Roy. Soc. A 118 (1928), 351–361.
Dirac, P. A. M., "A Theory of Electrons and Protons", Proc. Roy. Soc. A 126 (1930), 360–365. [Hole theory.]
Anderson, C. D., "The Positive Electron", Phys. Rev. 43 (1933), 491. [Positron discovery.]
Klein, O., "Die Reflexion von Elektronen an einem Potentialsprung nach der relativistischen Dynamik von Dirac", Z. Phys. 53 (1929), 157–165. [Klein paradox.]
Foldy, L. L. & Wouthuysen, S. A., "On the Dirac Theory of Spin 1/2 Particles and its Non-Relativistic Limit", Phys. Rev. 78 (1950), 29–36. [Canonical FW transformation, Darwin term, spin-orbit with Thomas factor.]
Newton, T. D. & Wigner, E. P., "Localized States for Elementary Systems", Rev. Mod. Phys. 21 (1949), 400–406. [The mean-position operator on positive-energy irreducible representations.]
Schwinger, J., "On Quantum-Electrodynamics and the Magnetic Moment of the Electron", Phys. Rev. 73 (1948), 416. [Leading g-2 correction.]
Fierz, M., "Über die relativistische Theorie kräftefreier Teilchen mit beliebigem Spin", Helv. Phys. Acta 12 (1939), 3–37. [Precursor to spin-statistics.]
Pauli, W., "The Connection Between Spin and Statistics", Phys. Rev. 58 (1940), 716–722. [Spin-statistics theorem.]
Lüders, G., "On the Equivalence of Invariance under Time Reversal and under Particle-Antiparticle Conjugation for Relativistic Field Theories", Kgl. Dan. Vid. Selsk. Mat.-fys. Medd. 28 No. 5 (1954). [CPT theorem.]

Textbooks and monographs:

Dirac, P. A. M., The Principles of Quantum Mechanics, 4th ed. (Oxford, 1958), Ch. XI. [The master's own exposition.]
Bjorken, J. D. & Drell, S. D., Relativistic Quantum Mechanics (McGraw-Hill, 1964). [Standard reference for the single-particle Dirac theory.]
Bjorken, J. D. & Drell, S. D., Relativistic Quantum Fields (McGraw-Hill, 1965). [QFT sequel.]
Peskin, M. E. & Schroeder, D. V., An Introduction to Quantum Field Theory (Westview, 1995), Ch. 3. [Modern QFT textbook treatment of the Dirac field.]
Weinberg, S., The Quantum Theory of Fields, Vol. I (Cambridge, 1995), Ch. 5. [Rigorous derivation from Wigner's classification of Poincaré irreps.]
Griffiths, D. J., Introduction to Elementary Particles, 2nd ed. (Wiley-VCH, 2008), Ch. 7. [Accessible intermediate-level treatment.]
Feynman, R. P., QED: The Strange Theory of Light and Matter (Princeton, 1985). [Popular-level exposition; Beginner anchor.]
Tong, D., Quantum Field Theory (DAMTP Cambridge lecture notes), §4. [Clear lecture-note exposition.]
Itzykson, C. & Zuber, J.-B., Quantum Field Theory (McGraw-Hill, 1980), Ch. 2. [Comprehensive reference on gamma-matrix technology.]
Greiner, W., Relativistic Quantum Mechanics: Wave Equations, 3rd ed. (Springer, 2000). [Detailed worked examples.]

Wave 3 unit, produced 2026-05-19. Status: draft pending Tyler review and external QM reviewer per PHYSICS_PLAN §6.

Prerequisites

12.05.01 pending

Tier anchors

beginner: Feynman, QED: The Strange Theory of Light and Matter (1985)
intermediate: Griffiths, Introduction to Elementary Particles, 2e (2008), Ch. 7
master: Dirac, The Principles of Quantum Mechanics, 4e (1958), Ch. XI; Bjorken & Drell, Relativistic Quantum Mechanics (1964)

References

tong
raw/pdfs/qft/qft.pdf · §4 The Dirac Equation
Dirac, P. A. M. — The Quantum Theory of the Electron · Proc. Roy. Soc. A 117, 610–624 (1928); Part II at A 118, 351–361 (1928)
Anderson, C. D. — The Positive Electron · Phys. Rev. 43, 491 (1933); discovery announcement in Science 76, 238 (1932)
Foldy, L. L. & Wouthuysen, S. A. — On the Dirac Theory of Spin 1/2 Particles and its Non-Relativistic Limit · Phys. Rev. 78, 29 (1950)
Dirac, P. A. M. — A Theory of Electrons and Protons · Proc. Roy. Soc. A 126, 360–365 (1930) — hole theory
Klein, O. — Die Reflexion von Elektronen an einem Potentialsprung nach der relativistischen Dynamik von Dirac · Z. Phys. 53, 157–165 (1929) — Klein paradox
Schwinger, J. — On Quantum-Electrodynamics and the Magnetic Moment of the Electron · Phys. Rev. 73, 416 (1948) — leading anomalous-moment radiative correction
Peskin, M. E. & Schroeder, D. V. — An Introduction to Quantum Field Theory · Westview Press (1995), Ch. 3 The Dirac Field
Bjorken, J. D. & Drell, S. D. — Relativistic Quantum Mechanics · McGraw-Hill (1964), Ch. 1-4
Weinberg, S. — The Quantum Theory of Fields, Vol. I · Cambridge University Press (1995), Ch. 5 (Poincaré-irrep derivation of the Dirac field)
Itzykson, C. & Zuber, J.-B. — Quantum Field Theory · McGraw-Hill (1980), Ch. 2 (gamma-matrix technology, Dirac bilinears, Gordon identity, Fierz rearrangement)
Streater, R. F. & Wightman, A. S. — PCT, Spin and Statistics, and All That · Benjamin (1964; Princeton Landmarks reprint 2000), Ch. 4 (spin-statistics theorem for the Dirac field)

Reviewer

Tyler (pending external QM reviewer per PHYSICS_PLAN §6)

Estimated time

beginner: 20m
intermediate: 45m
master: 60m
