Canonical ensemble and partition function
Anchor (Master): Landau & Lifshitz *Statistical Physics, Part 1*, 3rd ed. (Course of Theoretical Physics Vol. 5, Pergamon, 1980) §28–§31; Reichl *A Modern Course in Statistical Physics*, 4th ed. (Wiley-VCH, 2016) Ch. 3
Intuition [Beginner]
A statistical-mechanics problem starts with a small system in contact with a large one. The small system might be a single molecule, a tiny crystal, a single qubit; the large one is a thermal reservoir (a glass of water, the laboratory air, the rest of the universe) that we will not track in detail. The reservoir is so big that whatever the small system does, the reservoir's temperature stays fixed at $T$. The question is: what is the probability that the small system is in any particular one of its microscopic states?
The answer is the Boltzmann factor. If the small system has a microstate with energy $E$, the probability of finding it in that state is proportional to $$ e^{-E/(k_B T)}, $$
where $k_B$ is Boltzmann's constant. High-energy states are penalised exponentially. Cold reservoirs ($T$ small) make the penalty severe: only the lowest states have a chance. Hot reservoirs spread probability across many states.
The factor is unnormalised. To turn it into an actual probability you divide by the sum of all such factors, added over every state the system can occupy. That normaliser is the partition function, written $Z$. Writing the sum as $Z = \sum_i e^{-E_i/(k_B T)}$ and dividing through gives $P_i = e^{-E_i/(k_B T)}/Z$. Every state contributes one term.
The combination $\beta = 1/(k_B T)$ comes up so often that we give it a name: the inverse temperature. Low $T$ means large $\beta$ and sharp penalties; high $T$ means small $\beta$ and broad sampling. In these units the Boltzmann factor is $e^{-\beta E}$, and the partition function is the same sum written $Z = \sum_i e^{-\beta E_i}$, one Boltzmann-weighted term for each state.
Why bother with $Z$? Because once you know $Z$ as a function of $\beta$, you know everything macroscopic. The average energy is the rate at which $\ln Z$ changes with $\beta$, with an overall minus sign; call this the "log-$Z$ slope". Writing it as the rate of change of $\ln Z$ with respect to $\beta$, multiplied by $-1$, gives the mean energy $\langle E\rangle = -\partial \ln Z/\partial \beta$.
The energy fluctuations, the heat capacity, the magnetisation, the pressure: every thermodynamic observable is a derivative of $\ln Z$ with respect to some parameter ($\beta$, the box volume, an external field). The partition function is the bookkeeping that turns one number per microstate into the handful of numbers a thermometer reads off.
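As a small numerical illustration of this recipe, the Python sketch below builds Boltzmann weights for a made-up three-level spectrum, normalises them by the partition function, and checks that the mean energy matches minus the slope of the log of the partition function in the inverse temperature. The energy values, the function names, and the choice of units with Boltzmann's constant set to 1 are assumptions made for this illustration only.

```python
import math

def partition_function(energies, beta):
    """Sum of Boltzmann factors exp(-beta * E) over all microstates."""
    return sum(math.exp(-beta * e) for e in energies)

def probabilities(energies, beta):
    """Normalised Boltzmann probabilities P_i = exp(-beta * E_i) / Z."""
    z = partition_function(energies, beta)
    return [math.exp(-beta * e) / z for e in energies]

def mean_energy(energies, beta):
    """Average energy as the probability-weighted sum of microstate energies."""
    return sum(p * e for p, e in zip(probabilities(energies, beta), energies))

def minus_log_z_slope(energies, beta, h=1e-6):
    """Central finite difference for -d(ln Z)/d(beta)."""
    up = math.log(partition_function(energies, beta + h))
    down = math.log(partition_function(energies, beta - h))
    return -(up - down) / (2 * h)

# Hypothetical spectrum (arbitrary energy units, k_B = 1).
levels = [0.0, 1.0, 2.5]
beta = 0.8
probs = probabilities(levels, beta)
print("probabilities:", probs)
print("mean energy  :", mean_energy(levels, beta))
print("-dlnZ/dbeta  :", minus_log_z_slope(levels, beta))
```

The two printed energies should agree to within the finite-difference error, which is the "log-slope" statement in numerical form.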
Two examples set the tone. First, a single coin that can lie heads (energy $0$) or tails (energy $\epsilon$). Then $Z = 1 + e^{-\beta\epsilon}$, and the average energy smoothly interpolates between $0$ at low $T$ (the coin sits in heads) and $\epsilon/2$ at high $T$ (both faces equally likely). The heat capacity peaks at a temperature of order $\epsilon/k_B$, the Schottky anomaly, because that is where the system has room to absorb energy by flipping heads into tails. Two-state behaviour with a peaked heat capacity is a generic fingerprint of a gap in the energy spectrum.
Second, a particle in a one-dimensional box. The energy levels are now infinitely many, but the same recipe applies: write down $Z$, take a log, differentiate. The result is the ideal-gas law once you assemble $N$ such particles. Everything in thermodynamics (entropy, free energy, pressure, chemical potential) comes out of $Z$ by differentiation.
Visual [Beginner]
Picture a small box of energy levels next to a much bigger box labelled "reservoir at temperature $T$". A double-headed arrow between them stands for energy exchange. Across the small box are the energy levels $E_i$ as horizontal rungs. The Boltzmann factor $e^{-\beta E_i}$ is drawn as a bar to the right of each rung: long bars for low energies, short ones for high. Across the bottom is a histogram of these bar lengths after dividing through by $Z$; that is the canonical probability distribution.
A second panel shows the same picture at two temperatures. At low $T$ the bars decay quickly: the ground state takes almost all the probability and the histogram is a spike. At high $T$ the bars decay slowly: many states share probability roughly equally and the histogram is broad.
The picture is the whole content of the canonical ensemble. Every later refinement — quantum statistics, ideal gases, phase transitions — sharpens or specialises this one image.
Worked example [Beginner]
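Take a two-level system: one state at energy $0$, one at energy $\epsilon$. Place it in thermal contact with a reservoir at temperature $T$.
Step 1. Write down the partition function: $$ Z = 1 + e^{-\beta\epsilon}. $$
Step 2. Probabilities of the two states: $$ P_0 = \frac{1}{1 + e^{-\beta\epsilon}}, \qquad P_1 = \frac{e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}}. $$
Step 3. Mean energy: $$ \langle E\rangle = 0\cdot P_0 + \epsilon\,P_1 = \frac{\epsilon\,e^{-\beta\epsilon}}{1 + e^{-\beta\epsilon}} = \frac{\epsilon}{e^{\beta\epsilon} + 1}. $$
Check: at $T \to 0$ (i.e., $\beta \to \infty$), $\langle E\rangle \to 0$, so the system sits in the ground state. At $T \to \infty$ ($\beta \to 0$), $\langle E\rangle \to \epsilon/2$, which is equal occupation of both states.
Step 4. Heat capacity, $C = \partial\langle E\rangle/\partial T$. Using $\beta = 1/(k_B T)$ so $\partial\beta/\partial T = -1/(k_B T^2)$: $$ C = k_B\,(\beta\epsilon)^2\,\frac{e^{\beta\epsilon}}{(e^{\beta\epsilon} + 1)^2}. $$
This is the Schottky anomaly. It vanishes as $T \to 0$ (no states reachable), vanishes as $T \to \infty$ (both states already maxed out), and peaks at a temperature where $k_B T \approx 0.42\,\epsilon$. The bump in $C$ is the signature of a gap of size $\epsilon$ in the energy spectrum.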
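The shape of the Schottky bump can be checked numerically. The sketch below evaluates the two-level heat capacity in units of Boltzmann's constant as a function of the reduced temperature (thermal energy divided by the gap) and scans for the maximum. The function names, the scan range, and the brute-force search are illustrative assumptions, not part of the worked example itself.

```python
import math

def schottky_heat_capacity(t):
    """C / k_B for levels 0 and eps, with t = k_B * T / eps (dimensionless)."""
    x = 1.0 / t  # x = beta * eps
    # Written with exp(-x) so that large x (low temperature) cannot overflow.
    return x * x * math.exp(-x) / (1.0 + math.exp(-x)) ** 2

def peak_temperature(lo=0.05, hi=3.0, steps=200_000):
    """Brute-force scan for the reduced temperature that maximises C."""
    best_t, best_c = lo, schottky_heat_capacity(lo)
    for k in range(1, steps + 1):
        t = lo + (hi - lo) * k / steps
        c = schottky_heat_capacity(t)
        if c > best_c:
            best_t, best_c = t, c
    return best_t, best_c

t_peak, c_peak = peak_temperature()
print(f"peak at k_B T / eps = {t_peak:.4f}, C / k_B = {c_peak:.4f}")
```

The scan places the maximum a little above 0.4 in reduced temperature, with the heat capacity falling towards zero on both sides.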
What this tells us: the partition function is a single function from which all the thermodynamics — probabilities, mean energy, heat capacity — flows by elementary calculus. A two-level system is the smallest meaningful example; everything later is more of the same with more states.
Check your understanding [Beginner]
Formal definition [Intermediate+]
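Let a physical system have a set of microstates $i$ with energies $E_i$ (discrete spectrum for the moment) and let $\beta = 1/(k_B T)$ denote the inverse temperature, where $k_B$ is Boltzmann's constant (an exact defined quantity since the 2019 SI redefinition). The canonical ensemble is the probability distribution on microstates $$ P_i = \frac{e^{-\beta E_i}}{Z}, \qquad Z = \sum_i e^{-\beta E_i}, $$
where $Z$ is the canonical partition function. The distribution depends on $\beta$ and on the parameters that label which Hamiltonian is being used; we write $Z(\beta, V, N)$ when the parameter dependence matters. For a continuous classical spectrum on phase space $\Gamma$, the sum is replaced by a Liouville-measure integral with the Gibbs correction for identical particles, $$ Z = \frac{1}{N!\,h^{3N}} \int_\Gamma e^{-\beta H(q,p)}\, d^{3N}q\, d^{3N}p, $$
with Planck's constant $h$ making the measure dimensionless. The math-side measure-theoretic version is developed in 08.01.01.
The canonical ensemble is the unique probability distribution on microstates that maximises the Shannon entropy $S = -k_B \sum_i P_i \ln P_i$ subject to the constraints $$ \sum_i P_i = 1, \qquad \sum_i P_i E_i = U, $$
where $U$ is the prescribed mean energy. Solving by Lagrange multipliers (introduce $\alpha$ for normalisation, $\beta$ for the energy constraint, both applied to $S/k_B$) gives $P_i = e^{-1-\alpha}\,e^{-\beta E_i}$; absorbing the constant into the normaliser identifies $e^{1+\alpha} = Z$, and the multiplier $\beta$ is in one-to-one correspondence with $U$ via $U = -\partial \ln Z/\partial\beta$. The thermodynamic identification $\beta = 1/(k_B T)$ then comes from matching the canonical $S$ to the thermodynamic entropy and using $1/T = \partial S/\partial U$ from the first and second laws of thermodynamics [11.01.NN, pending].
Thermodynamic quantities from $Z$. The bridge between the partition function and macroscopic thermodynamics runs through the Helmholtz free energy $$ F = -k_B T \ln Z, $$
from which every other thermodynamic potential follows by Legendre transform and elementary identities: $$ \langle E\rangle = -\frac{\partial \ln Z}{\partial \beta}, \qquad S = -\left(\frac{\partial F}{\partial T}\right)_{V,N}, \qquad P = -\left(\frac{\partial F}{\partial V}\right)_{T,N}, \qquad \mu = \left(\frac{\partial F}{\partial N}\right)_{T,V}. $$
For specific heats at constant volume, $C_V = (\partial\langle E\rangle/\partial T)_V$. The same identity reads $$ C_V = \frac{\langle E^2\rangle - \langle E\rangle^2}{k_B T^2} = k_B \beta^2\,\frac{\partial^2 \ln Z}{\partial \beta^2}, $$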
so energy fluctuations and the heat capacity are the same object — a particular case of the fluctuation-dissipation theorem [11.08.NN, pending].
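The identity between energy variance and heat capacity can be tested numerically on any finite spectrum. The sketch below uses a hypothetical five-state spectrum in units with Boltzmann's constant set to 1, computes the heat capacity once from the variance and once as a finite-difference temperature derivative of the mean energy, and prints both. The spectrum, the temperature, and the function names are assumptions for illustration.

```python
import math

def moments(energies, beta):
    """Return (<E>, <E^2>) in the canonical ensemble at inverse temperature beta."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    e1 = sum(w * e for w, e in zip(weights, energies)) / z
    e2 = sum(w * e * e for w, e in zip(weights, energies)) / z
    return e1, e2

def heat_capacity_fd(energies, temperature, dt=1e-5):
    """C_V = d<E>/dT by central finite difference (k_B = 1)."""
    e_hi, _ = moments(energies, 1.0 / (temperature + dt))
    e_lo, _ = moments(energies, 1.0 / (temperature - dt))
    return (e_hi - e_lo) / (2 * dt)

# Hypothetical spectrum; the repeated 2.0 stands for a doubly degenerate level.
levels = [0.0, 0.7, 1.3, 2.0, 2.0]
T = 0.9
e1, e2 = moments(levels, 1.0 / T)
variance = e2 - e1 * e1
c_from_fluctuations = variance / T**2
c_from_derivative = heat_capacity_fd(levels, T)
print(c_from_fluctuations, c_from_derivative)
```

Listing the degenerate level twice keeps the sum over microstates rather than over distinct energies.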
Classical ideal gas
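The ideal-gas Hamiltonian is $H = \sum_{i=1}^{N} \mathbf{p}_i^2/2m$. The phase-space integral factorises over the $N$ particles, with each particle's position integral giving a factor of the box volume $V$ and each momentum integral giving a Gaussian factor $(2\pi m k_B T)^{3/2}$. Including the $1/(N!\,h^{3N})$ prefactor: $$ Z = \frac{1}{N!}\left(\frac{V}{\lambda^3}\right)^{N}, $$
where $\lambda = h/\sqrt{2\pi m k_B T}$ is the thermal de Broglie wavelength. The Helmholtz free energy is $$ F = -N k_B T\left[\ln\frac{V}{N\lambda^3} + 1\right], $$
using Stirling's approximation $\ln N! \approx N\ln N - N$, valid for macroscopic $N$. From $F$ one reads off the ideal-gas equation of state $PV = N k_B T$, the Sackur–Tetrode entropy $S = N k_B\left[\ln\bigl(V/(N\lambda^3)\bigr) + \tfrac{5}{2}\right]$, and the internal energy $U = \tfrac{3}{2} N k_B T$, which is equipartition for three translational degrees of freedom. The classical limit $\lambda \ll (V/N)^{1/3}$ (de Broglie wavelength much less than interparticle spacing) is when quantum statistics become unnecessary; the opposite regime $\lambda \gtrsim (V/N)^{1/3}$ is where Bose-Einstein or Fermi-Dirac statistics kick in [11.05.NN, pending].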
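To put numbers on the classical-limit criterion, the following sketch evaluates the thermal de Broglie wavelength, the mean interparticle spacing, and the Sackur–Tetrode entropy per particle. The choice of helium at 300 K and one atmosphere is an illustrative assumption, as are the function names; the helium mass is taken as 4.002602 atomic mass units.

```python
import math

# Exact SI values since the 2019 redefinition.
K_B = 1.380649e-23   # J / K
H = 6.62607015e-34   # J s
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg (CODATA 2018 recommended value)

def thermal_wavelength(mass, temperature):
    """lambda = h / sqrt(2 pi m k_B T), in metres."""
    return H / math.sqrt(2 * math.pi * mass * K_B * temperature)

def sackur_tetrode_per_particle(mass, temperature, pressure):
    """S / (N k_B) = ln(V / (N lambda^3)) + 5/2, with V/N = k_B T / P."""
    volume_per_particle = K_B * temperature / pressure
    lam = thermal_wavelength(mass, temperature)
    return math.log(volume_per_particle / lam**3) + 2.5

# Assumed example gas: helium-4 at room temperature and atmospheric pressure.
m_he = 4.002602 * ATOMIC_MASS_UNIT
T, P = 300.0, 101_325.0
lam = thermal_wavelength(m_he, T)
spacing = (K_B * T / P) ** (1 / 3)
s = sackur_tetrode_per_particle(m_he, T, P)
print(f"lambda  = {lam:.3e} m")
print(f"spacing = {spacing:.3e} m")
print(f"S/(N k_B) = {s:.2f}")
```

Under these assumptions the wavelength comes out roughly two orders of magnitude below the spacing, so the classical formula is the appropriate one.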
Equipartition
For a Hamiltonian of the form $H = \sum_j a_j x_j^2$ in some canonical variables $x_j$ (positions, momenta, normal-mode coordinates) and arbitrary positive coefficients $a_j$, each quadratic degree of freedom contributes $\tfrac{1}{2} k_B T$ to the mean energy at temperature $T$: the equipartition theorem. The argument is a Gaussian integral: in the canonical ensemble, restricted to the $x_j$ subsystem, $Z_j = \int e^{-\beta a_j x_j^2}\,dx_j = \sqrt{\pi/(\beta a_j)} \propto \beta^{-1/2}$, and so $-\partial \ln Z_j/\partial\beta$ contributes $1/(2\beta)$ per quadratic mode, hence $\tfrac{1}{2} k_B T$ each. Equipartition is the high-temperature limit of every quadratic system; at low temperature it fails because energy levels are no longer densely packed compared to $k_B T$, and modes "freeze out". This was the historical signature that classical stat mech needed quantum corrections, dramatised in the heat-capacity-of-solids problem (Einstein 1907; Debye 1912).
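The independence of the result from the stiffness coefficient can be seen by doing the Gaussian average numerically. The sketch below estimates the canonical average of a single quadratic term with a midpoint rule on a truncated interval, for several hypothetical coefficients, in units with Boltzmann's constant set to 1. The integration window of twelve standard deviations and the grid size are arbitrary illustrative choices.

```python
import math

def mean_quadratic_energy(a, beta, n=20_001):
    """<a x^2> under the weight exp(-beta a x^2), by the midpoint rule."""
    sigma = 1.0 / math.sqrt(2.0 * beta * a)  # width of the Gaussian weight
    half_width = 12.0 * sigma                # tails beyond this are negligible
    dx = 2.0 * half_width / n
    num = den = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * dx
        w = math.exp(-beta * a * x * x)
        num += a * x * x * w
        den += w
    return num / den

beta = 2.0  # so that k_B T / 2 = 0.25 in these units
results = {a: mean_quadratic_energy(a, beta) for a in (0.3, 1.0, 7.0)}
for a, value in results.items():
    print(f"a = {a}: <a x^2> = {value:.10f}")
```

Each coefficient gives the same average, one half of the thermal energy, which is the content of the theorem for a single mode.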
Quantum harmonic oscillator
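For a single oscillator with energy levels $E_n = \hbar\omega\,(n + \tfrac{1}{2})$, $n = 0, 1, 2, \dots$, the partition function is a geometric series: $$ Z = \sum_{n=0}^{\infty} e^{-\beta\hbar\omega(n + 1/2)} = \frac{e^{-\beta\hbar\omega/2}}{1 - e^{-\beta\hbar\omega}} = \frac{1}{2\sinh(\beta\hbar\omega/2)}. $$
Mean energy: $$ \langle E\rangle = -\frac{\partial \ln Z}{\partial\beta} = \frac{\hbar\omega}{2} + \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}. $$
The classical limit $k_B T \gg \hbar\omega$ recovers $\langle E\rangle \approx k_B T$ (equipartition, one $\tfrac{1}{2} k_B T$ from position-quadratic, one from momentum-quadratic). The quantum limit $k_B T \ll \hbar\omega$ gives $\langle E\rangle \approx \tfrac{1}{2}\hbar\omega + \hbar\omega\,e^{-\beta\hbar\omega}$: zero-point energy plus exponentially small thermal occupation. The crossover temperature is $T^* = \hbar\omega/k_B$. The heat capacity $C = k_B\,(\beta\hbar\omega)^2\,e^{\beta\hbar\omega}/(e^{\beta\hbar\omega} - 1)^2$ has the Einstein form (1907) and resolves the low-temperature failure of equipartition.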
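A short numerical check of the oscillator results, written in terms of the dimensionless ratio of level spacing to thermal energy. The sketch compares a truncated level sum with the closed-form partition function and evaluates the mean energy in the two limits; the truncation length and the sample values of the ratio are illustrative assumptions.

```python
import math

def z_truncated(x, n_max=2000):
    """Direct sum of exp(-x (n + 1/2)) for n = 0..n_max, with x = beta * hbar * omega."""
    return sum(math.exp(-x * (n + 0.5)) for n in range(n_max + 1))

def z_closed(x):
    """Closed form of the geometric series: 1 / (2 sinh(x / 2))."""
    return 1.0 / (2.0 * math.sinh(x / 2.0))

def mean_energy(x):
    """<E> / (hbar omega) = 1/2 + 1 / (exp(x) - 1)."""
    return 0.5 + 1.0 / math.expm1(x)

x = 0.5
print("truncated sum:", z_truncated(x), " closed form:", z_closed(x))

x_hot, x_cold = 0.01, 10.0
# <E> / (k_B T) is mean_energy(x) * x; equipartition predicts a value close to 1.
print("hot  limit, <E>/(k_B T):", mean_energy(x_hot) * x_hot)
# Thermal part above the zero-point energy, compared with exp(-x).
print("cold limit, thermal part:", mean_energy(x_cold) - 0.5, math.exp(-x_cold))
```

Using `math.expm1` keeps the Bose factor accurate when the ratio is small.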
Counterexamples to common slips
The energy levels $E_i$ in $Z = \sum_i e^{-\beta E_i}$ are microstate energies, not energy eigenvalues with their degeneracies absorbed. A doubly-degenerate level at energy $E$ contributes two terms $e^{-\beta E}$, not one. Equivalent presentations write $Z = \sum_n g_n e^{-\beta E_n}$ where the sum is over distinct levels $E_n$ with degeneracies $g_n$. State which form you mean.
The canonical $1/N!$ in the classical phase-space partition function is the Gibbs correction required to make $F$ extensive ($F \propto N$ at fixed $N/V$). Omitting it produces $F = -N k_B T \ln(V/\lambda^3)$, which is not extensive: the Gibbs paradox. Conceptually $1/N!$ is the quotient by the symmetric-group action on identical particles; physically it is the residue of quantum indistinguishability surviving into the classical limit.
The free energy used here is the Helmholtz free energy $F = U - TS$, the natural potential for fixed $(T, V, N)$. For fixed $(T, P, N)$ the relevant potential is the Gibbs free energy $G = F + PV$ (a Legendre transform in $V$). For fixed $(T, V, \mu)$ it is the grand potential $\Omega = F - \mu N$ (a Legendre transform in $N$, generated by the grand-canonical partition function). The canonical $F$ and the partition function together determine all of these.
The Helmholtz free energy $F$ is not the same as the internal energy $U$; the difference $U - F = TS$ is the entropic contribution. At low $T$ they coincide; at high $T$, $U$ greatly exceeds $F$.
Energy fluctuations scale like $\sqrt{N}$ for a macroscopic system, whereas $\langle E\rangle$ scales like $N$. The relative fluctuation is $\Delta E/\langle E\rangle \sim 1/\sqrt{N}$, vanishing in the thermodynamic limit. This is why canonical and microcanonical predictions agree macroscopically even though they differ at the level of distributions.
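The square-root scaling is easy to see for a collection of independent two-level systems, where means and variances simply add. The sketch below is a minimal illustration under that independence assumption, with energies measured in units of the gap; the function name and the sample sizes are hypothetical.

```python
import math

def relative_fluctuation(n, beta_eps):
    """Delta E / <E> for n independent two-level systems (levels 0 and eps)."""
    p = 1.0 / (1.0 + math.exp(beta_eps))  # occupation of the upper level
    mean = n * p                          # <E> / eps
    variance = n * p * (1.0 - p)          # variances of independent systems add
    return math.sqrt(variance) / mean

for n in (10**2, 10**4, 10**6, 10**22):
    print(f"N = {n:.0e}: relative fluctuation = {relative_fluctuation(n, 1.0):.3e}")
```

Increasing the system size by a factor of one hundred shrinks the relative fluctuation by a factor of ten, and at macroscopic sizes it is far below anything measurable.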
Key theorem with proof [Intermediate+]
Theorem (Thermodynamic-quantity identities from the partition function). Let $Z(\beta, V)$ be the canonical partition function of a system whose energy spectrum $E_i(V)$ depends smoothly on the macroscopic parameters $V$, with the sum convergent for all $\beta > 0$. Then for each $\beta > 0$, $$ \langle E\rangle = -\frac{\partial \ln Z}{\partial \beta}, \qquad \langle E^2\rangle - \langle E\rangle^2 = \frac{\partial^2 \ln Z}{\partial \beta^2} = k_B T^2 C_V, $$
and, defining $F = -k_B T \ln Z$, $$ S = -\frac{\partial F}{\partial T}\bigg|_V, \qquad P = -\frac{\partial F}{\partial V}\bigg|_T. $$
Proof. All four identities are differentiation of the absolutely convergent series $Z = \sum_i e^{-\beta E_i}$, justified term-by-term by dominated convergence.
For mean energy, differentiate $Z$ in $\beta$: $$ \frac{\partial Z}{\partial \beta} \;=\; \sum_i (-E_i) e^{-\beta E_i} \;=\; -Z\,\langle E\rangle, \qquad \text{so} \quad \frac{\partial \ln Z}{\partial \beta} \;=\; -\langle E\rangle. $$
For energy variance, differentiate once more: $$ \frac{\partial^2 \ln Z}{\partial \beta^2} \;=\; \frac{\partial}{\partial \beta}\!\left[\frac{1}{Z}\sum_i (-E_i) e^{-\beta E_i}\right] \;=\; \frac{1}{Z}\sum_i E_i^2 e^{-\beta E_i} - \frac{1}{Z^2}\left(\sum_i E_i e^{-\beta E_i}\right)^{\!2} \;=\; \langle E^2 \rangle - \langle E\rangle^2. $$ Using $\partial/\partial\beta = -k_B T^2\,\partial/\partial T$ and $C_V = \partial\langle E\rangle/\partial T$ identifies the variance with $k_B T^2 C_V$.
For entropy, start from the Shannon expression evaluated on the canonical distribution $P_i = e^{-\beta E_i}/Z$: $$ S_{\mathrm{Sh}} \;=\; -k_B \sum_i P_i \ln P_i \;=\; -k_B \sum_i P_i (-\beta E_i - \ln Z) \;=\; k_B \beta \langle E\rangle + k_B \ln Z \;=\; \frac{\langle E\rangle}{T} + k_B \ln Z. $$ The thermodynamic identity $F = U - TS$ then gives $\langle E\rangle - T S_{\mathrm{Sh}} = -k_B T \ln Z = F$, so $S_{\mathrm{Sh}}$ is the thermodynamic entropy and $S = -\partial F/\partial T$ follows directly from $F = -k_B T \ln Z$ and the chain rule.
For pressure, the energy levels $E_i(V)$ depend on the box volume (think particle-in-a-box: shrinking $V$ raises $E_i$). Differentiating $\ln Z$ in $V$ at fixed $\beta$: $$ \frac{\partial \ln Z}{\partial V}\bigg|_\beta \;=\; \frac{1}{Z}\sum_i (-\beta)\frac{\partial E_i}{\partial V} e^{-\beta E_i} \;=\; -\beta \left\langle \frac{\partial E_i}{\partial V} \right\rangle, $$ and the thermodynamic identification $P = -\langle \partial E_i/\partial V\rangle$ (the canonical-ensemble average of the mechanical force per area) gives $P = k_B T\,\partial \ln Z/\partial V|_\beta = -\partial F/\partial V|_T$.
The four identities are the entire content of canonical-ensemble thermodynamics. Any specific physical system reduces to: compute $Z$, take logarithms, differentiate. The rest of statistical mechanics is bookkeeping for which $Z$ to write down.
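The pressure identity can be checked on the particle-in-a-box spectrum mentioned in the proof. The sketch below works in one dimension, so the "pressure" is a force and the "volume" is the box length, and it uses units in which the level energies are the squared quantum number divided by the squared length with Boltzmann's constant set to 1. The parameter values, the level cutoff, and the function names are assumptions for illustration.

```python
import math

def log_z(beta, length, n_max=2000):
    """ln Z for box levels E_n = n^2 / L^2 (units with hbar^2 pi^2 / 2m = 1)."""
    return math.log(sum(math.exp(-beta * n * n / length**2)
                        for n in range(1, n_max + 1)))

def mean_energy_slope(beta, length, n_max=2000):
    """Canonical average of dE_n/dL = -2 n^2 / L^3."""
    num = den = 0.0
    for n in range(1, n_max + 1):
        w = math.exp(-beta * n * n / length**2)
        num += w * (-2.0 * n * n / length**3)
        den += w
    return num / den

beta, L, h = 0.01, 10.0, 1e-4
# Thermodynamic route: (1 / beta) * d(ln Z)/dL by central finite difference.
force_from_log_z = (log_z(beta, L + h) - log_z(beta, L - h)) / (2 * h) / beta
# Mechanical route: minus the ensemble average of dE_n/dL.
force_mechanical = -mean_energy_slope(beta, L)
print(force_from_log_z, force_mechanical)
# In the high-temperature regime chosen here, force * L * beta is close to 1.
print(force_from_log_z * L * beta)
```

The last printed number is the one-dimensional version of the ideal-gas law for a single particle, recovered approximately because the levels are densely spaced compared with the thermal energy at these parameters.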
Exercises [Intermediate+]
Lean formalization [Intermediate+]
Mathlib has the measure-theoretic and probabilistic infrastructure needed for a canonical-ensemble formalisation (Mathlib.Probability, Mathlib.MeasureTheory, Mathlib.Probability.Kernel.Entropy) but no canonical-ensemble construction tied to a Hamiltonian. The natural target is a Mathlib.StatMech.Canonical namespace containing:
```lean
-- (Schematic; not yet in Mathlib.)
structure CanonicalEnsemble (Ω : Type*) [MeasurableSpace Ω] (μ : Measure Ω) where
  H : Ω → ℝ  -- Hamiltonian, measurable
  β : ℝ      -- inverse temperature, positive
  integrable : Integrable (fun ω => Real.exp (-β * H ω)) μ

noncomputable def partitionFunction
    (E : CanonicalEnsemble Ω μ) : ℝ :=
  ∫ ω, Real.exp (-E.β * E.H ω) ∂μ

theorem meanEnergy_eq_neg_deriv_log_Z
    (E : CanonicalEnsemble Ω μ) :
    meanEnergy E = -deriv (fun β => Real.log (partitionFunction { E with β })) E.β :=
  sorry
```
The dependence on a family indexed by $\beta$ (so that derivatives are well-typed) makes the statement subtle to phrase cleanly in Mathlib; this is the kind of design decision that the canonical formalisation pathway needs to settle. `lean_status: none` reflects the absence; aggregated `none` units in §11 form a Mathlib contribution roadmap as the section grows. Tyler's review attests intermediate-tier correctness pending the external stat-mech reviewer.
Phase-space and quantum formulations [Master]
The discrete-spectrum partition function $Z = \sum_i e^{-\beta E_i}$ is a special case. The two general formulations on which all of statistical mechanics rests are the classical phase-space partition function $$ Z_{\mathrm{cl}} = \frac{1}{N!\,h^{3N}} \int_{\Gamma_N} e^{-\beta H(q,p)}\, d^{3N}q\, d^{3N}p, $$
where $\Gamma_N$ is the $N$-particle phase space (a $6N$-dimensional symplectic manifold, the $N$-fold product of single-particle phase spaces; see 09.04.02 pending for the symplectic structure on $\Gamma_N$), and the quantum canonical partition function $$ Z_{\mathrm{qu}} = \operatorname{Tr}\, e^{-\beta \hat H} $$
for a Hamiltonian operator $\hat H$ bounded below on a separable Hilbert space, with the trace taken over an orthonormal basis. When $\hat H$ has discrete spectrum this reduces to $\sum_i e^{-\beta E_i}$; when the spectrum has a continuous part the trace is a spectral integral against the spectral measure of $\hat H$.
The factor $h^{-3N}$ in $Z_{\mathrm{cl}}$ makes the measure dimensionless and is fixed by demanding agreement with $Z_{\mathrm{qu}}$ in the classical limit $\hbar \to 0$ (the Wigner correspondence). The $1/N!$ is the Gibbs correction for identical particles: classical phase space treats particle labels as physical, while quantum mechanics treats identical particles as indistinguishable; the residue surviving the classical limit is the quotient by the symmetric group $S_N$, equivalent to dividing the partition function by $N!$. The rigorous measure-theoretic version of $Z_{\mathrm{cl}}$ on a general symplectic manifold is treated in 08.01.01 using the Liouville volume form as the canonical measure on phase space; the Gibbs $1/N!$ is the quotient measure on $\Gamma_N/S_N$.
Saddle-point derivation from the microcanonical ensemble
The microcanonical ensemble assigns equal probability to every microstate on the energy shell $H = E$. Let $\Omega(E)$ denote the number of microstates with energy $E$ (or, in the continuous case, the density of states). The microcanonical entropy is $S(E) = k_B \ln \Omega(E)$. The canonical partition function can be rewritten as an integral over the energy shell against the microcanonical density: $$ Z(\beta) = \int dE\, \Omega(E)\, e^{-\beta E} = \int dE\, e^{S(E)/k_B - \beta E}. $$
In the thermodynamic limit $N \to \infty$, both $S$ and $E$ scale extensively as $N$, so the exponent $N\,[s(e)/k_B - \beta e]$ with intensive $s = S/N$ and $e = E/N$ becomes large. Laplace's method 02.05.05 evaluates the integral by its saddle point, the value $E^*$ at which the exponent is stationary, $$ \frac{\partial S}{\partial E}\bigg|_{E^*} = k_B \beta = \frac{1}{T}. $$
Around the saddle, expand to quadratic order; the result is $$ Z(\beta) \approx e^{S(E^*)/k_B - \beta E^*}\,\sqrt{2\pi k_B T^2 C_V}. $$
Taking the log and dividing by $N$: the per-particle free energy is $f = e^* - T s(e^*)$, the standard Legendre transform of the microcanonical entropy. This is the Darwin-Fowler saddle-point derivation of the canonical distribution (1922); the modern formulation is in terms of large deviations and Legendre duality. The fluctuation prefactor $\sqrt{2\pi k_B T^2 C_V}$ recovers the $\langle E^2\rangle - \langle E\rangle^2 = k_B T^2 C_V$ result of the Intermediate tier, and the $O(\ln N / N)$ correction to $f$ is the systematic difference between canonical and microcanonical expectations, vanishing in the thermodynamic limit. The macroscopic equivalence of the two ensembles is the statement that Legendre transforms of extensive thermodynamic potentials commute with the thermodynamic limit, an instance of a general principle in large-deviation theory.
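The way the saddle dominates can be watched on a model where everything is exactly computable. The sketch below takes a set of independent two-level systems (a stand-in chosen for this illustration, not the general argument above), where the number of microstates with a given number of excitations is a binomial coefficient, and compares the exact log partition function with the log of the single largest term in the sum over excitation number. Energies are in units of the gap and the function names are hypothetical.

```python
import math

def log_z_exact(n, x):
    """ln Z = n ln(1 + exp(-x)) for n independent two-level systems, x = beta * eps."""
    return n * math.log1p(math.exp(-x))

def log_largest_term(n, x):
    """Largest ln[ C(n, k) exp(-x k) ] over k, and the k where it occurs."""
    best_val, best_k = -math.inf, 0
    for k in range(n + 1):
        val = (math.lgamma(n + 1) - math.lgamma(k + 1)
               - math.lgamma(n - k + 1) - x * k)
        if val > best_val:
            best_val, best_k = val, k
    return best_val, best_k

x = 1.0
gaps = []
for n in (10, 100, 1000, 10000):
    largest, k_star = log_largest_term(n, x)
    gap = (log_z_exact(n, x) - largest) / n  # per-system error of the saddle estimate
    gaps.append(gap)
    print(f"N = {n:5d}: k*/N = {k_star / n:.4f}, per-system gap = {gap:.2e}")
print("canonical upper-level occupation:", 1.0 / (1.0 + math.exp(x)))
```

The per-system gap shrinks as the system grows, and the location of the largest term approaches the canonical occupation of the upper level, which is the equivalence-of-ensembles statement in miniature.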
Grand-canonical ensemble and the Laplace-transform connection
If particle number is also exchanged with the reservoir, fix instead the chemical potential $\mu$ and sum over both microstates and particle numbers: $$ \Xi(\beta, \mu, V) = \sum_{N=0}^{\infty} e^{\beta\mu N}\, Z_N(\beta, V) = \sum_{N=0}^{\infty} \sum_i e^{-\beta\,(E_i^{(N)} - \mu N)}. $$
$\Xi$ is the grand-canonical partition function, related to $Z_N$ by a discrete Laplace transform in $N$. The grand potential is $\Omega = -k_B T \ln \Xi$; analogues of all canonical identities hold with $\Xi$ replacing $Z$. For systems with fluctuating $N$ (bosons, fermions, open systems) the grand-canonical ensemble is the natural starting point, and the canonical $Z_N$ at fixed $N$ is recovered by inverse Laplace transform in $\mu$; a saddle-point evaluation gives the saddle $\mu^*$, identifying $\mu^* = \partial F/\partial N$ as the standard chemical potential. The three ensembles (microcanonical, canonical, grand-canonical) sit in a Legendre-transform chain: $S(E, V, N) \to F(T, V, N) \to \Omega(T, V, \mu)$, with each transform fixing a different extensive variable in exchange for its conjugate intensive parameter.
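For the classical ideal gas the sum over particle number can be done in closed form, which gives a convenient check. Taking the canonical partition function as the single-particle one raised to the particle number and divided by its factorial, the grand sum is an exponential and the mean particle number is the fugacity times the single-particle partition function. The sketch below verifies both with a truncated sum; the numerical values of the fugacity and the single-particle partition function are made up for illustration.

```python
import math

def grand_sum(z, q, n_max=200):
    """Truncated Xi = sum_N z^N Z_N with Z_N = q^N / N! (classical ideal gas).

    Returns (Xi, <N>), where <N> is the average of N with weights z^N Z_N.
    """
    total = 0.0
    weighted_n = 0.0
    term = 1.0  # the N = 0 term
    for n in range(n_max + 1):
        total += term
        weighted_n += n * term
        term *= z * q / (n + 1)  # step from the N term to the N + 1 term
    return total, weighted_n / total

z, q = 0.3, 25.0  # hypothetical fugacity and single-particle partition function
xi, mean_n = grand_sum(z, q)
print("Xi   :", xi, " exp(z q):", math.exp(z * q))
print("<N>  :", mean_n, " z q     :", z * q)
```

Building each term from the previous one avoids computing large factorials directly.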
Mayer cluster expansion
For a classical gas with two-body interactions $u(r_{ij})$, write $e^{-\beta \sum_{i<j} u(r_{ij})} = \prod_{i<j} e^{-\beta u(r_{ij})}$ and decompose $e^{-\beta u(r_{ij})} = 1 + f_{ij}$ with $f_{ij}$ the Mayer $f$-function. Expanding the product generates a sum over graphs on the $N$ vertices, with each graph contributing an integral over its connected components. Re-summing yields the cluster expansion $$ \ln \Xi = \frac{V}{\lambda^3} \sum_{\ell=1}^{\infty} b_\ell\, z^\ell, $$
where $z = e^{\beta\mu}$ is the fugacity and $b_\ell$ are cluster integrals over $\ell$-particle connected diagrams. The expansion gives the systematic small-$z$ (dilute-gas) corrections to ideal-gas behaviour, generating the virial expansion $P/(k_B T) = n + B_2(T)\,n^2 + B_3(T)\,n^3 + \cdots$ in the number density $n = N/V$, with virial coefficients $B_k$ explicit integrals over cluster diagrams. Mayer's expansion is the classical-gas analogue of perturbation theory in QFT: the same diagrammatic structure with Feynman rules replaced by cluster rules.
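The lowest non-trivial cluster integral is the second virial coefficient, which for a spherically symmetric pair potential is minus two pi times the radial integral of the Mayer function weighted by the squared separation. The sketch below evaluates that integral by the midpoint rule for hard spheres, an example potential chosen here because the answer is known in closed form (two pi times the cubed diameter over three). The grid, the cutoff radius, and the function names are illustrative assumptions.

```python
import math

def mayer_f(u, beta):
    """Mayer function f = exp(-beta u) - 1 for a pair energy u."""
    return math.exp(-beta * u) - 1.0

def second_virial(potential, beta, r_max, n=2000):
    """B_2 = -2 pi * integral_0^r_max f(r) r^2 dr, by the midpoint rule."""
    dr = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        total += mayer_f(potential(r), beta) * r * r
    return -2.0 * math.pi * total * dr

def hard_sphere(r, sigma=1.0):
    """Infinite repulsion inside the diameter sigma, zero outside."""
    return math.inf if r < sigma else 0.0

# With r_max = 2 sigma and an even n, the discontinuity sits on a cell edge.
b2_numeric = second_virial(hard_sphere, beta=1.0, r_max=2.0)
b2_exact = 2.0 * math.pi / 3.0
print(b2_numeric, b2_exact)
```

For hard spheres the result does not depend on temperature, since the Mayer function is either minus one or zero.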
Functional integral at finite temperature
For a quantum system with Hamiltonian $\hat H = \hat p^2/2m + V(\hat x)$ on the Hilbert space $L^2(\mathbb{R})$ (one particle for simplicity), the Feynman-Kac formula evaluates the trace by a path integral over loops of imaginary-time period $\beta\hbar$: $$ Z = \operatorname{Tr}\, e^{-\beta \hat H} = \oint \mathcal{D}x(\tau)\; e^{-S_E[x]/\hbar}, $$
where the integral is over closed paths $x(0) = x(\beta\hbar)$ and the Euclidean action is $S_E[x] = \int_0^{\beta\hbar} d\tau\,\bigl[\tfrac{m}{2}\dot x^2 + V(x)\bigr]$ with $\dot x = dx/d\tau$. This is the finite-temperature path integral: the quantum statistical-mechanics analogue of the Feynman path integral $\int \mathcal{D}x\, e^{iS[x]/\hbar}$, related to the real-time formulation by Wick rotation $t \to -i\tau$. The math-side treatment of the Euclidean path integral with rigorous measures (Wiener measure, Symanzik regularisation) lives in 08.07.01, and the quantum-classical correspondence by Wick rotation in 08.09.01; at zero temperature ($\beta \to \infty$) this reduces to the ground-state vacuum expectation value. For field theories the same construction with $x(\tau) \to \phi(\mathbf{x}, \tau)$ gives Euclidean QFT at finite temperature, the apparatus underlying thermal QFT, finite-temperature gauge theory, and the cosmological derivations of the cosmic-microwave-background spectrum.
Renormalization-group structure
The partition function $Z(\{K\})$ as a function of the couplings $\{K\}$ in the Hamiltonian is the generating functional of statistical mechanics. Coarse-graining (the integration of short-wavelength modes) defines a flow on the space of couplings, $\{K\} \mapsto \{K'\} = R(\{K\})$, such that the partition function on the coarse-grained lattice equals that on the fine lattice, up to a regular function of the couplings. Fixed points of this flow are scale-invariant theories; their stability properties classify the universality classes of continuous phase transitions, and the eigenvalues of the linearised flow at a fixed point give the critical exponents. The full apparatus is developed on the math side in 08.04.01–08.04.04 and on the physics side in [11.07.NN, pending]; the canonical ensemble is the underlying object on which all of this acts. The deep structural fact is that the partition function and its derivatives are the universal observables that the RG flow preserves up to scheme dependence: every other quantity (correlation functions, susceptibilities, response functions) is computable from $Z$ as a derivative or correlator.
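The simplest concrete instance of such a flow is decimation of the zero-field one-dimensional Ising chain, used here purely as an illustrative example. Summing out every other spin maps the dimensionless coupling to half the log of the hyperbolic cosine of twice the coupling, and leaves the partition function unchanged up to a smooth prefactor per eliminated spin. The sketch below checks that statement against the exact periodic-chain partition function and then iterates the map; the coupling value, chain length, and function names are assumptions.

```python
import math

def z_ring(n_spins, k):
    """Exact zero-field partition function of a periodic 1D Ising chain, K = beta * J."""
    return (2.0 * math.cosh(k)) ** n_spins + (2.0 * math.sinh(k)) ** n_spins

def decimate(k):
    """Sum out one spin s between s1 and s3.

    Returns (K', A) such that sum_s exp(K s (s1 + s3)) = A exp(K' s1 s3).
    """
    c = math.cosh(2.0 * k)
    return 0.5 * math.log(c), 2.0 * math.sqrt(c)

k, n = 0.8, 16
k_prime, a = decimate(k)
lhs = z_ring(n, k)
rhs = a ** (n // 2) * z_ring(n // 2, k_prime)
print("fine lattice Z  :", lhs)
print("coarse lattice Z:", rhs)

# Iterating the map drives any finite coupling towards K = 0.
flow = [k]
for _ in range(10):
    flow.append(decimate(flow[-1])[0])
print("flow of K:", flow)
```

The coupling decreases at every step, so the only fixed points are zero and infinite coupling, consistent with the absence of a finite-temperature transition in this one-dimensional model.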
Connections [Master]
Partition function (math-side rigorous) 08.01.01 is the measure-theoretic counterpart of the physics-side construction in this unit. The math-side treatment handles convergence, thermodynamic limits, and transfer-matrix formalism with full rigor; this unit cites it whenever a rigorous statement is needed and treats $Z$ phenomenologically.
Boltzmann distribution (math-side) 08.01.03 and free energy (math-side) 08.01.04 are the math-side companion units carrying the rigorous measure-theoretic versions of the canonical distribution and its log. The §08 and §11 split is the canonical example of the mathematical-flavor / physical-flavor pairing.
Multivariable Taylor and extrema 02.05.05 supplies the Hessian + Laplace's-method machinery underlying the saddle-point derivation of canonical from microcanonical. The Gaussian fluctuation prefactor in $Z(\beta)$ is exactly the multivariable-Taylor stationary-phase result applied in the thermodynamic limit.
Hamilton's equations 09.04.02 pending is the classical phase-space framework on which the $Z_{\mathrm{cl}}$ integral is built. The symplectic structure on $\Gamma_N$ provides the Liouville-volume measure that makes phase-space integration well-defined and the Gibbs $1/N!$ correction a natural symmetric-group quotient.
Path integral (math-side) 08.07.01 treats the Feynman-Kac representation rigorously. This unit invokes it for the master-tier statement of the finite-temperature path integral; the rigorous Wiener-measure version lives in §08.
Wick rotation 08.09.01 is the analytic continuation that converts real-time quantum mechanics into imaginary-time statistical mechanics, identifying the period $\beta\hbar$ of the Euclidean path integral with inverse temperature. The math-side §08 unit covers the analytic structure; this unit cites it as the bridge between zero-temperature QFT and finite-temperature stat mech.
Real-space RG 08.04.01, Wilson-Fisher 08.04.02, beta function 08.04.03, block-spin decimation 08.04.04 are the math-side RG units that act on $Z$ regarded as a function of the couplings. The physics-side RG and critical-phenomena units [11.07.NN, pending] cite this unit's $Z$ as the central observable on which RG transformations operate.
Onsager solution 08.03.01, transfer matrix 08.03.02, mean field 08.02.01 are the math-side units giving exact and approximate computations of $Z$ for the 2D Ising model and its mean-field analogue. The physics-side phase-transitions chapter [11.06.NN, pending] cross-cites both this unit (for the canonical-ensemble framework) and the §08 units (for the exact and approximate solutions).
Quantum statistics [11.05.NN, pending]: Bose-Einstein and Fermi-Dirac distributions emerge from the grand-canonical $\Xi$ for ideal quantum gases. The classical limit recovers Maxwell-Boltzmann statistics derived here; the quantum-degenerate regime needs the proper trace formula.
Historical & philosophical context [Master]
The canonical ensemble is Gibbs's invention. His *Elementary Principles in Statistical Mechanics* of 1902 [Gibbs 1902] introduced the term canonical distribution (with the modular constant $\Theta = k_B T$) and developed its consequences as a self-contained programme, independent of Boltzmann's gas-kinetic motivations and aimed at a general thermodynamic formalism for arbitrary mechanical systems. Boltzmann's 1877 papers on the relation $S = k \ln W$ [Boltzmann 1877] had supplied the entropic content, but it was Gibbs who packaged the construction into the ensemble framework still used today. The 1922 paper of Darwin and Fowler [Darwin-Fowler 1922] gave the saddle-point derivation of the canonical from the microcanonical, putting the equivalence of ensembles on the analytic footing that the modern large-deviation reformulation (Lanford 1973, Ellis 1985) made precise.
The information-theoretic re-derivation is due to Jaynes (1957) [Jaynes 1957]. The argument inverts the historical sequence: instead of constructing the canonical ensemble from physical reservoir-contact reasoning, Jaynes derives it as the unique maximum-entropy distribution consistent with knowledge of $\langle E\rangle$, and identifies the multiplier $\beta$ with $1/(k_B T)$ post hoc by matching to thermodynamics. The maximum-entropy principle is one of the long-standing conceptual reformulations of statistical mechanics; its critics (Sklar, Earman) note that "least biased given $\langle E\rangle$" is a subjective epistemic principle that requires further argument to license objective physical claims. The maximum-entropy programme nevertheless extends, in its more applied incarnations, to image reconstruction, neural-network energy models, and the Bayesian inference of physical parameters.
The path-integral formulation at finite temperature is due to Feynman (1953, 1972) [Feynman 1972], with the rigorous probabilistic version owed to Wiener's earlier construction of Brownian motion (1923) and to Kac's identification of $e^{-\beta \hat H}$ with the Wiener integral of $e^{-\int V\,d\tau}$ (Feynman-Kac, 1949). The two are joined in the Euclidean formulation of QFT (Symanzik 1969; Osterwalder-Schrader 1973), which underwrites the modern computation of finite-temperature observables in particle physics, condensed matter, and cosmology.
Bibliography [Master]
Primary literature (cited above; not all currently in reference/):
Boltzmann, L., "Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung", Sitzungsber. Akad. Wiss. Wien (II) 76 (1877), 373–435. [Need to source: originator of $S = k \ln W$.]
Gibbs, J. W., Elementary Principles in Statistical Mechanics, Developed with Especial Reference to the Rational Foundation of Thermodynamics (Yale University Press, 1902). [Need to source — originator of the canonical ensemble.]
Darwin, C. G. & Fowler, R. H., "On the partition of energy", Phil. Mag. (6) 44 (1922), 450–479; "Part II", 823–842. [Need to source — saddle-point derivation.]
Einstein, A., "Die Plancksche Theorie der Strahlung und die Theorie der spezifischen Wärme", Annalen der Physik 22 (1907), 180–190. [Need to source — Einstein solid heat capacity.]
Jaynes, E. T., "Information theory and statistical mechanics", Phys. Rev. 106 (1957), 620–630; "Part II", Phys. Rev. 108 (1957), 171–190. [Need to source — max-entropy derivation.]
Lanford, O. E. III, "Entropy and equilibrium states in classical statistical mechanics", in Statistical Mechanics and Mathematical Problems, Lecture Notes in Physics 20 (Springer, 1973), 1–113.
Ellis, R. S., Entropy, Large Deviations, and Statistical Mechanics (Springer Grundlehren 271, 1985).
Feynman, R. P. & Hibbs, A. R., Quantum Mechanics and Path Integrals (McGraw-Hill, 1965); emended edition Dover (2010).
Feynman, R. P., Statistical Mechanics: A Set of Lectures (W. A. Benjamin, 1972).
Kac, M., "On distributions of certain Wiener functionals", Trans. Amer. Math. Soc. 65 (1949), 1–13.
Schroeder, D. V., An Introduction to Thermal Physics (Addison-Wesley, 2000).
Reif, F., Fundamentals of Statistical and Thermal Physics (McGraw-Hill, 1965; reprint Waveland, 2009).
Landau, L. D. & Lifshitz, E. M., Statistical Physics, Part 1, 3rd ed. (Course of Theoretical Physics Vol. 5, Pergamon, 1980).
Reichl, L. E., A Modern Course in Statistical Physics, 4th ed. (Wiley-VCH, 2016).
Pathria, R. K. & Beale, P. D., Statistical Mechanics, 4th ed. (Academic Press, 2021).
Tong, D., Statistical Physics (DAMTP Cambridge lecture notes, §1 "The Fundamentals of Statistical Mechanics"; §2 "Classical Gases").
Wave 1 physics seed unit, agent-drafted 2026-05-18 (per docs/plans/PHYSICS_PLAN.md §5). The cross-cite into §08 math-stat-mech is the wave's stress test of cross-section dependence — five §08 units appear in prerequisites or Connections. All four cross-domain hooks_out targets are proposed; no chem/bio/phil seed unit yet exists to receive confirmed promotion. Status remains draft pending Tyler's review and the §11 retro per PHYSICS_PLAN.