40.01.07 · combinatorics / enumeration-generating-functions

Trees, Cayley's Formula, the Matrix-Tree Theorem, and Lagrange Inversion

shipped3 tiersLean: none

Anchor (Master): Stanley 1999 *Enumerative Combinatorics, Volume 2* (Cambridge) Ch. 5 in full — the tree function $T = xe^T$, Lagrange inversion (Theorem 5.4.2, Corollary 5.4.3), the matrix-tree theorem and its all-minors / weighted refinement (Theorem 5.6.8), and the BEST theorem for Eulerian circuits (Corollary 5.6.3 region); Flajolet-Sedgewick 2009 *Analytic Combinatorics* (Cambridge) Note I.49 and §VI.7 (Lagrange inversion, tree singularities); Tutte 1948 and Brooks-Smith-Stone-Tutte 1940 for the matrix-tree and BEST origins; Kirchhoff 1847 (the original Laplacian-cofactor count)

Intuition Beginner

A tree is the leanest way to keep a set of dots all connected: every dot reachable from every other, but with not one extra line. Cut any line and the tree falls into two pieces; add any line and you create a loop. Family trees, river systems, the folder structure on your computer — all of these are trees, and the surprising thing is not what one tree looks like but how many different trees you can build on a fixed set of labelled dots.

Here is the headline number. If you have $n$ labelled dots and you ask how many different trees join them all up, the answer is $n$ to the power $n - 2$ . For three dots there are $3$ trees; for four dots there are $16$ ; for five dots there are $125$ . That clean formula, $n^{n - 2}$ , is Cayley's formula, and a large part of this unit is four different ways of seeing why it is true. Each way teaches a separate lesson, which is why one formula deserves four proofs.

A second question is more practical. Given a specific network — a power grid, a road map, a circuit — how many ways are there to pick a skeleton of lines that keeps it connected with no loops? Such a skeleton is a spanning tree. Counting spanning trees turns out to be a determinant: you write down a simple matrix from the network and compute one cofactor. This is the matrix-tree theorem, and it converts a hard counting problem into a routine of linear algebra.

The third idea is a machine for inverting tree recipes. A tree is "a root with smaller trees hanging off it," a description that folds back on itself. Lagrange inversion is the algebraic crank that turns such self-referential recipes directly into counting formulas.

Visual Beginner

Picture the Prüfer encoding, the trick at the heart of the first proof of Cayley's formula. You take a labelled tree and repeatedly snip off the leaf with the smallest label, each time writing down the name of the dot it was attached to. Snip, record; snip, record. When only two dots remain you stop. A tree on $n$ dots gives a list of exactly $n - 2$ recorded names, and — this is the magic — every possible list of that length comes from exactly one tree.

The table below runs the snipping on one small tree so you can follow the bookkeeping. Because each recorded list has length $n - 2$ and each slot can hold any of the $n$ labels, the number of lists is $n \times n \times \dots \times n$ with $n - 2$ factors, which is $n^{n - 2}$ — and that is the count of trees.

step	smallest leaf removed	neighbour recorded	sequence so far
1	1	3	3
2	2	3	3, 3
3	4	5	3, 3, 5
4	3	5	3, 3, 5, 5
stop	two dots left (5, 6)	—	length 4 = 6 − 2

Worked example Beginner

Let us count the spanning trees of one small network by hand and then check the answer against the matrix-tree determinant. Take the triangle on three dots $1, 2, 3$ with all three sides drawn. How many spanning trees does it have?

Step 1. A spanning tree on $3$ dots needs exactly $3 - 1 = 2$ lines and must stay connected. The triangle has $3$ lines: side $1 - 2$ , side $2 - 3$ , side $3 - 1$ .

Step 2. Choosing $2$ of the $3$ sides to keep, and dropping one, gives three choices. Dropping any single side from a triangle still leaves a connected path through all three dots, so all three choices are valid spanning trees.

Step 3. The count is therefore $3$ . The three spanning trees are the paths $1 - 2 - 3$ , $2 - 3 - 1$ , and $3 - 1 - 2$ — each one a triangle with one side rubbed out.

Step 4. Check against the formula. The triangle is the complete network on $3$ dots, and Cayley's formula says a complete network on $n$ labelled dots has $n^{n - 2}$ spanning trees. For $n = 3$ that is $3^{1} = 3$ . The hand count and the formula agree.

What this tells us: counting spanning trees by listing them works for tiny networks but explodes fast. For $n = 10$ the count is $1 0^{8}$ , a hundred million — far too many to list. The matrix-tree theorem and Cayley's formula exist precisely so you never have to list them: you read the count off a determinant or a power.

Check your understanding Beginner

Formal definition Intermediate+

Throughout, a tree is a connected acyclic graph; the tree characterisations (connected with $n - 1$ edges, acyclic with $n - 1$ edges, unique paths) and the leaf-existence lemma are taken from 40.04.01. A rooted tree is a tree with one distinguished vertex, the root; orienting every edge toward (or away from) the root makes the parent-child structure explicit. A labelled tree on $[n] = {1, \dots, n}$ is a tree whose vertex set is $[n]$ ; two labelled trees are equal only if they have the same edge set as subsets of $(2 [ n ])$ .

Definition (Prüfer sequence). Given a labelled tree $T$ on $[n]$ with $n \geq 2$ , the Prüfer sequence $Pr (T) \in [n]^{n - 2}$ is produced by iterating: while more than two vertices remain, let $ℓ$ be the leaf of least label, append the label of $ℓ$ 's unique neighbour to the sequence, and delete $ℓ$ . The map $Pr$ records $de g_{T} (i) - 1$ copies of each label $i$ .

Definition (the tree function). The tree function is the formal power series $T (x) = \sum_{n \geq 1} n^{n - 1} \frac{x ^{n}}{n !}$ , the exponential generating function of rooted labelled trees ( $n^{n - 1}$ of them on $[n]$ , one choice of root per labelled tree times $n^{n - 2}$ trees). It is the unique series with $T (0) = 0$ satisfying the functional equation $$ T(x) = x,e^{T(x)}, $$ expressing "a rooted tree is a root vertex carrying a set (an EGF $exp$ , per the exponential formula of 40.01.03) of root-subtrees." Differentiating $T = x e^{T}$ gives $T^{'} (x) = \frac{T ( x )}{x ( 1 - T ( x ))}$ , exhibiting the singularity at $T = 1$ that 40.08.05 exploits.

Definition (Laplacian). For a graph $G$ on vertex set $[n]$ with adjacency matrix $A$ and degree-diagonal $D = diag (d (1), \dots, d (n))$ , the Laplacian is $L = L (G) = D - A$ . Each row and column of $L$ sums to $0$ , so $L$ is singular; the reduced Laplacian $L_{0}$ is $L$ with row $i$ and column $j$ deleted. A weighted graph assigns $w_{e} > 0$ to each edge and uses the weighted Laplacian $L_{uv} = - \sum_{e = uv} w_{e}$ off-diagonal, $L_{uu} = \sum_{e ∋ u} w_{e}$ on the diagonal. The spanning-tree generating polynomial is $τ_{w} (G) = \sum_{T} \prod_{e \in T} w_{e}$ , summed over spanning trees $T$ .

Counterexamples to common slips Intermediate+

"Any cofactor of the Laplacian could differ." All $(n - 1) \times (n - 1)$ cofactors of $L$ (deleting row $i$ , column $j$ , with sign $(- 1)^{i + j}$ ) are equal, to $τ (G)$ . The signed cofactors agree because $L$ has all line-sums zero; deleting different rows and columns gives the same value, so "any cofactor" is well-defined.
"The eigenvalue formula uses all eigenvalues." The Laplacian always has eigenvalue $0$ (eigenvector $1$ ), so $det L = 0$ . The product is over the $n - 1$ nonzero eigenvalues: $τ (G) = \frac{1}{n} \prod_{i = 2}^{n} μ_{i}$ . Including the zero eigenvalue would give $0$ .
"Lagrange inversion needs $ϕ$ to be a polynomial." It needs only $ϕ (0) \neq = 0$ so that $w = x ϕ (w)$ has a unique solution with $w (0) = 0$ and $w^{'} (0) = ϕ (0) \neq = 0$ . Any formal power series with nonzero constant term works; $ϕ = e^{w}$ (giving the tree function) is the canonical non-polynomial case.
"A Prüfer sequence determines the tree only up to relabelling." It determines the tree exactly. The degree of each vertex is recovered as one plus its multiplicity in the sequence, and the greedy leaf-reattachment reconstructs the unique tree; the map is a genuine bijection, not a quotient.

Key theorem with proof Intermediate+

The signature theorem is Cayley's formula, and it is stated here with the Prüfer bijection so that the same construction simultaneously delivers the degree-specified refinement.

Theorem (Cayley's formula, with degree refinement). For $n \geq 2$ the number of labelled trees on $[n]$ is $n^{n - 2}$ . More precisely, the number of labelled trees in which vertex $i$ has degree $d_{i}$ (with each $d_{i} \geq 1$ and $\sum_{i} d_{i} = 2 (n - 1)$ ) is the multinomial coefficient $$ \binom{n-2}{d_1 - 1, , d_2 - 1, , \dots, , d_n - 1} = \frac{(n-2)!}{\prod_{i=1}^n (d_i - 1)!}. $$ (Stanley ^{[Stanley §5.3]}, Aigner-Ziegler ^{[Aigner-Ziegler Ch. 33]}.)

Proof. The map $Pr$ sending a labelled tree $T$ on $[n]$ to its Prüfer sequence in $[n]^{n - 2}$ is a bijection; granting this, $∣ [n]^{n - 2} ∣ = n^{n - 2}$ counts the trees. The degree refinement is then immediate from the multiplicity property: in $Pr (T)$ a label $i$ occurs exactly $de g_{T} (i) - 1$ times, so trees with degree vector $(d_{1}, \dots, d_{n})$ correspond to sequences of length $n - 2$ containing label $i$ exactly $d_{i} - 1$ times, and the number of such sequences is the multinomial coefficient $(d _{1} - 1 , \dots , d _{n} - 1 n - 2)$ .

It remains to prove $Pr$ is a bijection. Multiplicity property. Each time a vertex $v$ is recorded, it is because a neighbour of $v$ was just deleted as the least leaf; $v$ is recorded once per neighbour deleted while $v$ survives. A vertex $v$ is itself deleted only once its surviving degree falls to $1$ (it has become a leaf) and it is the least such, after which it is never recorded again. The two endpoints surviving at the end are never deleted. Counting, a non-final vertex $v$ loses all but one of its $de g_{T} (v)$ incident edges to neighbour-deletions before becoming a leaf and being deleted, contributing $de g_{T} (v) - 1$ records; a final vertex loses all $de g_{T} (v)$ neighbours but is never deleted, also contributing $de g_{T} (v) - 1$ records once one accounts for the last edge joining the two survivors. So every $i$ appears $de g_{T} (i) - 1$ times.

Injectivity and surjectivity by an explicit inverse. Given any sequence $s = (s_{1}, \dots, s_{n - 2}) \in [n]^{n - 2}$ , define a candidate degree by $d (i) = 1 + # {k : s_{k} = i}$ ; then $\sum_{i} d (i) = n + (n - 2) = 2 (n - 1)$ , the correct edge-degree sum for a tree. Reconstruct as follows: maintain the multiset of "remaining" labels (initially all of $[n]$ ) and read $s$ left to right. At step $k$ , let $ℓ_{k}$ be the least remaining label that does not occur in the suffix $(s_{k}, \dots, s_{n - 2})$ — this is the least current leaf — add the edge ${ℓ_{k}, s_{k}}$ , and remove $ℓ_{k}$ from the remaining labels. After $n - 2$ steps, two labels remain; join them by the final edge. At each step a least non-occurring leaf exists because at most $n - 2 - (k - 1)$ distinct labels can occur in a suffix of length $n - 1 - k$ , leaving a remaining label that is a leaf. The construction is forced at every step — the least leaf and its recorded neighbour are determined by $s$ — so it inverts $Pr$ and is single-valued. Hence $Pr$ is a bijection. $□$

Corollary (rooted trees and the tree function). The number of rooted labelled trees on $[n]$ is $n \cdot n^{n - 2} = n^{n - 1}$ , so the EGF $T (x) = \sum_{n \geq 1} n^{n - 1} \frac{x ^{n}}{n !}$ counts them. The decomposition "root vertex plus a set of root-subtrees" gives $T = x e^{T}$ via the exponential formula of 40.01.03, the functional equation inverted in the Advanced section.

Bridge. This theorem builds toward the entire analytic theory of tree-like structures and appears again in 40.08.05, where the singularity of $T (x) = x e^{T}$ at $x = e^{- 1}$ delivers the $n^{- 3/2} ρ^{- n}$ universal asymptotic law for simply generated trees. The foundational reason one formula admits four proofs is that $n^{n - 2}$ is the value of four genuinely different functionals on the same object: a word count (Prüfer), a determinant (matrix-tree), a forest-refinement count (Pitman), and a coefficient of a fixed-point series (Lagrange on $T = x e^{T}$ ). This is exactly the situation where a single integer carries several structures, and the Prüfer bijection generalises the degree refinement that the determinant cannot see, while the determinant generalises to arbitrary graphs that the bijection cannot reach. Putting these together, the central insight is that Cayley's count is the spanning-tree count of $K_{n}$ , so the matrix-tree theorem is the general statement of which Cayley's formula is the complete-graph instance, and the tree function is the generating-function packaging that ports the count into 40.08.05.

Exercises Intermediate+

Exercise 3 (medium, numeric).

Use the matrix-tree theorem to count the spanning trees of the cycle $C_{4}$ on vertices $1, 2, 3, 4$ (edges $12, 23, 34, 41$ ).

Hint

The Laplacian is $2 I - (adjacency of C_{4})$ . Delete one row and column and take the determinant of the remaining $3 \times 3$ matrix. (Or: a spanning tree of a cycle is the cycle minus one edge.)

Answer

$4$ . A spanning tree of $C_{4}$ is the $4$ -cycle with exactly one of its $4$ edges removed, and each removal leaves a path (connected, acyclic), so there are $4$ spanning trees. By matrix-tree, $L = \begin{psmallmatrix} 2 & -1 & 0 & -1 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ -1 & 0 & -1 & 2\end{psmallmatrix}$ ; deleting the last row and column gives $\det\begin{psmallmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2\end{psmallmatrix} = 2(4-1) - (-1)(-2-0) = 6 - 2 = 4$ . Rubric: full credit for either the direct count or the $3 \times 3$ cofactor.

Exercise 5 (medium, symbolic).

The series $R (x)$ with $R (0) = 0$ solving $R = x (1 + R)^{2}$ is the OGF of rooted plane trees by edges. Apply Lagrange inversion with $ϕ (λ) = (1 + λ)^{2}$ to show $[x^{n}] R = \frac{1}{n} (n - 1 2 n)$ , and identify this with the Catalan number $C_{n} = \frac{1}{n + 1} (n 2 n)$ .

Hint

Use $[x^{n}] R = \frac{1}{n} [λ^{n - 1}] ϕ (λ)^{n} = \frac{1}{n} [λ^{n - 1}] (1 + λ)^{2 n}$ , then read off a binomial coefficient.

Answer

By Lagrange inversion with $ϕ (λ) = (1 + λ)^{2}$ , $$ [x^n]R = \frac{1}{n}[\lambda^{n-1}],\phi(\lambda)^n = \frac{1}{n}\lambda^{n-1}^{2n} = \frac{1}{n}\binom{2n}{n-1}. $$ Now $\frac{1}{n} (n - 1 2 n) = \frac{1}{n} \cdot \frac{( 2 n )!}{( n - 1 )! ( n + 1 )!} = \frac{1}{n + 1} \cdot \frac{( 2 n )!}{n ! n !} = \frac{1}{n + 1} (n 2 n) = C_{n}$ , using $\frac{1}{n} (n - 1 2 n) = \frac{1}{n + 1} (n 2 n)$ (both equal $\frac{( 2 n )!}{n ! ( n + 1 )!}$ ). So $[x^{n}] R = C_{n}$ , the Catalan number, matching the quadratic solution $C (x) = \frac{1 - 1 - 4 x}{2 x}$ of 40.01.03 Exercise 4. Rubric: full credit for the residue extraction $\frac{1}{n} [λ^{n - 1}] (1 + λ)^{2 n}$ , the binomial $\frac{1}{n} (n - 1 2 n)$ , and its identification with $\frac{1}{n + 1} (n 2 n)$ .

Exercise 7 (hard, short-answer).

Prove that all $(n - 1) \times (n - 1)$ cofactors of the Laplacian $L$ of a connected graph are equal. (This is the well-definedness behind "any cofactor of $L$ ".)

Hint

Use that every row and column of $L$ sums to $0$ , i.e. $L 1 = 0$ and $1^{⊤} L = 0$ , and relate cofactors via the adjugate identity $L \cdot adj (L) = (det L) I = 0$ .

Answer

Let $adj (L)$ be the adjugate, with $(adj L)_{j i} = (- 1)^{i + j} M_{ij}$ where $M_{ij}$ is the $(i, j)$ minor (delete row $i$ , column $j$ ). The identity $L adj (L) = det (L) I = 0$ (since $det L = 0$ ) means every column of $adj (L)$ lies in the kernel of $L$ . For a connected graph $ker L$ is one-dimensional, spanned by $1$ , so each column of $adj (L)$ is a scalar multiple of $1$ ; thus $adj (L)$ has constant columns. Symmetrically $adj (L) L = 0$ forces constant rows. A matrix with constant rows and constant columns is constant: every entry of $adj (L)$ equals a single value $c$ . Hence $(- 1)^{i + j} M_{ij} = c$ for all $i, j$ , i.e. all signed cofactors coincide, equal to $τ (G)$ . Rubric: full credit for the adjugate identity, the one-dimensional-kernel argument giving constant columns and rows, and the conclusion that all cofactors agree.

Exercise 8 (hard, short-answer).

Prove the matrix-tree theorem for the complete graph directly: show the reduced Laplacian $L_{0}$ of $K_{n}$ (any row/column deleted) is $n I_{n - 1} - J_{n - 1}$ and compute $det L_{0} = n^{n - 2}$ , recovering Cayley.

Hint

$L (K_{n}) = n I_{n} - J_{n}$ . Deleting a row and column leaves $n I_{n - 1} - J_{n - 1}$ . The eigenvalues of $J_{n - 1}$ are $n - 1$ (once) and $0$ ( $n - 2$ times).

Answer

For $K_{n}$ , $D = (n - 1) I_{n}$ and $A = J_{n} - I_{n}$ , so $L = D - A = (n - 1) I_{n} - (J_{n} - I_{n}) = n I_{n} - J_{n}$ . Deleting the last row and column gives $L_{0} = n I_{n - 1} - J_{n - 1}$ . The matrix $J_{n - 1}$ (all-ones, size $n - 1$ ) has eigenvalue $n - 1$ with eigenvector $1$ and eigenvalue $0$ with multiplicity $n - 2$ . Therefore $n I_{n - 1} - J_{n - 1}$ has eigenvalue $n - (n - 1) = 1$ once and eigenvalue $n - 0 = n$ with multiplicity $n - 2$ , so $det L_{0} = 1 \cdot n^{n - 2} = n^{n - 2}$ . By the matrix-tree theorem this is $τ (K_{n})$ , the number of spanning trees of the complete graph, i.e. Cayley's $n^{n - 2}$ . Rubric: full credit for the $n I - J$ form, the eigenvalue computation, and the determinant $n^{n - 2}$ .

Advanced results Master

Theorem 1 (matrix-tree theorem, weighted and all-minors form). Let $G$ be a connected graph (or multigraph) on $[n]$ with edge weights $w_{e}$ , weighted Laplacian $L$ , and let $L_{0}$ be $L$ with row and column $r$ deleted. Then $$ \det L_0 = \tau_w(G) = \sum_{T \text{ spanning tree}} ;\prod_{e \in T} w_e . $$ Unweighted ( $w_{e} = 1$ ) this is the spanning-tree count, equal to any signed cofactor of $L$ . The all-minors generalisation expresses an arbitrary minor of $L$ — deleting a set $R$ of rows and $C$ of columns with $∣ R ∣ = ∣ C ∣$ — as a signed sum of weights of spanning forests whose trees each contain one prescribed vertex of $R$ and one of $C$ ^{[Stanley §5.6]}.

Proof sketch (Cauchy-Binet). Orient each edge arbitrarily and form the signed incidence matrix $N \in {0, \pm 1}^{V \times E}$ with $N_{v, e} = + 1$ if $v$ is the head of $e$ , $- 1$ if the tail, $0$ otherwise; for weights, scale column $e$ by $w_{e}$ . Then $L = N N^{⊤}$ . Deleting row $r$ from $N$ gives $N_{0}$ with $L_{0} = N_{0} N_{0}^{⊤}$ . By Cauchy-Binet, $$ \det L_0 = \det(N_0 N_0^\top) = \sum_{\substack{S \subseteq E \ |S| = n-1}} \det\big(N_0[S]\big)^2 \quad(\text{weighted: }\textstyle\prod_{e\in S}w_e), $$ the sum over $(n - 1)$ -subsets of edges. The square submatrix $N_{0} [S]$ has determinant $\pm 1$ when $S$ is the edge set of a spanning tree and $0$ otherwise — because an $(n - 1)$ -edge subgraph on $n$ vertices is a spanning tree iff it is acyclic iff its incidence columns are linearly independent, and a tree, processed leaf-first, makes $N_{0} [S]$ triangular with unit diagonal. Each spanning tree contributes $1$ (weighted: $\prod_{e \in T} w_{e}$ ), giving $τ_{w} (G)$ . $□$

Theorem 2 (the eigenvalue formula). For a connected graph on $n$ vertices with Laplacian eigenvalues $0 = μ_{1} < μ_{2} \leq \dots \leq μ_{n}$ , the number of spanning trees is $τ (G) = \frac{1}{n} \prod_{i = 2}^{n} μ_{i}$ ^{[Stanley §5.6]}. The Laplacian's characteristic polynomial is $det (μ I - L) = \prod_{i} (μ - μ_{i})$ ; the coefficient of $μ^{1}$ is $(- 1)^{n - 1} \sum_{i} \prod_{j \neq = i} μ_{j} = (- 1)^{n - 1} \prod_{i \geq 2} μ_{i}$ (only the $i = 1$ term survives since $μ_{1} = 0$ ), and the same coefficient equals $(- 1)^{n - 1}$ times the sum of the $(n - 1) \times (n - 1)$ principal minors of $L$ , namely $(- 1)^{n - 1} n τ (G)$ since each of the $n$ principal cofactors equals $τ (G)$ . Equating gives $\prod_{i \geq 2} μ_{i} = n τ (G)$ .

Theorem 3 (Lagrange-Bürmann inversion). Let $ϕ \in K [[λ]]$ with $ϕ (0) \neq = 0$ , and let $R = R (x)$ be the unique series with $R (0) = 0$ solving $R = x ϕ (R)$ . For any formal Laurent series $H$ and $n \geq 1$ , $$ [x^n],H(R(x)) ;=; \frac{1}{n},[\lambda^{n-1}]\Big(H'(\lambda),\phi(\lambda)^n\Big), \qquad\qquad [x^n],R(x) = \frac{1}{n},[\lambda^{n-1}],\phi(\lambda)^n, $$ and in residue form $[x^{n}] H (R) = Res_{λ} (H (λ) ψ^{'} (λ) ψ (λ)^{- n - 1})$ where $ψ (λ) = λ / ϕ (λ)$ is the compositional inverse of $R$ ^{[Stanley §5.4]}, ^{[Flajolet-Sedgewick Note I.49]}. Applied to $ϕ = e^{λ}$ it gives $[x^{n}] T = \frac{n ^{n - 1}}{n !}$ (rooted trees), to $H (λ) = λ$ and $ϕ (λ) = (1 + λ)^{2}$ the Catalan numbers, and to ballot/cycle-lemma weightings the Catalan and ballot-number refinements.

Theorem 4 (the BEST theorem). In a connected Eulerian digraph $G$ (every vertex has $de g^{+} (v) = de g^{-} (v)$ ), the number $ec (G)$ of Eulerian circuits, counted up to starting point, is $$ \mathrm{ec}(G) = t_w(G),\prod_{v \in V}\big(\deg^+(v) - 1\big)!, $$ where $t_{w} (G)$ is the number of arborescences rooted at (any fixed) vertex $w$ — itself a directed-Laplacian cofactor, independent of $w$ ^{[Stanley §5.6]}. The name abbreviates de Bruijn, van Aardenne-Ehrenfest, Smith, and Tutte. The proof pairs an Eulerian circuit with the arborescence of "last exit edges": for each vertex, the last edge used to leave it before the circuit closes forms a spanning in-tree toward $w$ , and conversely each arborescence together with an ordering of the remaining out-edges at each vertex reconstructs a unique circuit, giving the product of local factorials.

Theorem 5 (Joyal's proof and the doubly-rooted bijection). The identity $n^{n} = # {functions [n] \to [n]} = # {doubly-rooted labelled trees on [n]}$ holds via a bijection, whence the number of (singly) rooted trees is $n^{n - 1}$ and of trees is $n^{n - 2}$ ^{[Aigner-Ziegler Ch. 33]}. A function $f : [n] \to [n]$ is a functional digraph whose connected components each have one cycle with trees hanging off; replacing the cyclic part (a permutation of the cyclic vertices) by the linear order it induces converts the structure into a tree with two marked vertices (a vertebrate: head and tail of the spine). The two markings supply the two extra factors of $n$ , so trees number $n^{n} / n^{2} = n^{n - 2}$ .

Synthesis. The four proofs of Cayley's formula are the foundational reason this unit sits at the centre of the chapter: each computes $n^{n - 2}$ as a different functional, and putting these together the Prüfer word count, the Laplacian determinant, Pitman's forest refinement, and the Lagrange coefficient of $T = x e^{T}$ are four readings of one integer, exactly as the OGF and EGF were two readings of one sequence in 40.01.03. The central insight is that Cayley's count is the spanning-tree count of $K_{n}$ , so the matrix-tree theorem generalises it to every weighted graph, and the eigenvalue product $\frac{1}{n} \prod μ_{i}$ is dual to the cofactor determinant through the characteristic polynomial — the same number read off the spectrum or off a minor. This is exactly the duality that the BEST theorem exploits one level up: Eulerian circuits of a digraph are counted by arborescences, which are directed spanning trees counted by a directed-Laplacian cofactor, so circuit enumeration reduces to the same determinant. Lagrange inversion is the generating-function engine binding all of this to 40.08.05: it converts the fixed-point equation $T = x e^{T}$ into the explicit $n^{n - 1} / n!$ , and its residue form is what the singularity analysis of 40.08.05 differentiates to obtain the $n^{- 3/2} ρ^{- n}$ tree-asymptotic law. The bridge is that every count here — the word, the determinant, the spectrum, the circuit, the coefficient — is one of these four functionals evaluated on the tree object, and the analytic continuation of the last functional is the gateway to asymptotics.

Full proof set Master

Proposition 1 (Prüfer is a bijection; Cayley's formula). The map $Pr$ from labelled trees on $[n]$ to $[n]^{n - 2}$ is a bijection, so there are $n^{n - 2}$ labelled trees.

Proof. Multiplicity: in the leaf-pruning process, a vertex $v$ is recorded exactly once for each incident edge lost to the deletion of a neighbour, and $v$ is deleted exactly when its surviving degree reaches $1$ (after which it is recorded no more, and the final two vertices are never deleted). Each non-final vertex thus contributes $de g_{T} (v) - 1$ records; bookkeeping the last edge between the two survivors shows each final vertex also contributes $de g_{T} (v) - 1$ . Hence label $i$ appears $de g_{T} (i) - 1$ times, and $\sum_{i} (de g_{T} (i) - 1) = 2 (n - 1) - n = n - 2$ matches the sequence length. Inverse: given $s \in [n]^{n - 2}$ , set $d (i) = 1 + (count of i in s)$ , which sums to $2 (n - 1)$ . Read $s$ left to right; at step $k$ the least label not occurring in the suffix $(s_{k}, \dots, s_{n - 2})$ and not yet deleted is the unique forced least leaf $ℓ_{k}$ , and we add edge ${ℓ_{k}, s_{k}}$ and delete $ℓ_{k}$ ; a least such leaf exists because fewer than the number of remaining labels can occur in the strictly shorter suffix. Two labels survive and are joined by the last edge. The resulting graph has $n - 1$ edges and is connected and acyclic by induction (each step attaches a leaf to a smaller tree-in-progress), hence a tree, and the construction is the unique pre-image of $s$ . So $Pr$ is a bijection and the tree count is $∣ [n]^{n - 2} ∣ = n^{n - 2}$ . $□$

Proposition 2 (matrix-tree theorem). For a connected graph on $[n]$ , $det L_{0} = τ (G)$ for the reduced Laplacian $L_{0} = L$ with any one row $r$ and the same column $r$ removed; weighting edges by $w_{e}$ replaces $τ$ by $τ_{w} = \sum_{T} \prod_{e \in T} w_{e}$ .

Proof. Orient edges arbitrarily; let $N$ be the $n \times m$ signed incidence matrix (column $e$ has $+ 1$ at the head, $- 1$ at the tail), with column $e$ scaled by $w_{e}$ in the weighted case. Then $(N N^{⊤})_{uv} = \sum_{e} N_{u, e} N_{v, e}$ equals $\sum_{e ∋ u} w_{e}$ when $u = v$ and $- \sum_{e = uv} w_{e}$ when $u \neq = v$ , i.e. $N N^{⊤} = L$ . Delete row $r$ : $N_{0} N_{0}^{⊤} = L_{0}$ . Cauchy-Binet expands $det (N_{0} N_{0}^{⊤}) = \sum_{∣ S ∣ = n - 1} det (N_{0} [\cdot, S])^{2}$ (weighted: times $\prod_{e \in S} w_{e}$ ). For an $(n - 1)$ -subset $S$ of edges, the submatrix $N_{0} [\cdot, S]$ is $(n - 1) \times (n - 1)$ ; if $S$ contains a cycle, the alternating sum of the cycle's incidence columns vanishes, so the columns are dependent and the determinant is $0$ . If $S$ is acyclic with $n - 1$ edges it is a spanning tree; processing its leaves successively (each leaf $\neq = r$ gives a row with a single nonzero entry $\pm 1$ ) shows $N_{0} [\cdot, S]$ is, after row/column permutation, triangular with $\pm 1$ diagonal, so $det = \pm 1$ and its square is $1$ . Summing, only spanning trees contribute, each with weight $\prod_{e \in T} w_{e}$ , giving $det L_{0} = τ_{w} (G)$ . Equality of all signed cofactors of $L$ follows from $L adj (L) = (det L) I = 0$ and $dim ker L = 1$ , forcing $adj (L)$ constant. $□$

Proposition 3 (Lagrange inversion, residue proof). With $ϕ (0) \neq = 0$ and $R = x ϕ (R)$ , $R (0) = 0$ , one has $[x^{n}] H (R) = \frac{1}{n} [λ^{n - 1}] (H^{'} (λ) ϕ (λ)^{n})$ for $n \geq 1$ .

Proof. The compositional inverse of $R$ is $ψ (λ) = λ / ϕ (λ)$ , since $R = x ϕ (R)$ rearranges to $x = R / ϕ (R) = ψ (R)$ . The formal residue $Res_{λ} f = [λ^{- 1}] f$ annihilates derivatives ( $Res g^{'} = 0$ for $g \in K ((λ))$ ) and is invariant under formal change of variable. Write $H (R (x)) = \sum_{k} h_{k} x^{k}$ ; then $[x^{n}] H (R) = Res_{x} (x^{- n - 1} H (R (x)))$ . Substitute $x = ψ (λ)$ , $R (x) = λ$ , $d x = ψ^{'} (λ) d λ$ : $$ [x^n]H(R) = \mathrm{Res}\lambda\big(\psi(\lambda)^{-n-1}H(\lambda),\psi'(\lambda)\big). $$ Now $ψ^{- n - 1} ψ^{'} = - \frac{1}{n} \frac{d}{d λ} ψ^{- n}$ , so integrating by parts (using $Res (\cdot)^{'} = 0$ ), $$ [x^n]H(R) = -\frac1n\mathrm{Res}\lambda\Big(H(\lambda)\tfrac{d}{d\lambda}\psi^{-n}\Big) = \frac1n\mathrm{Res}\lambda\big(H'(\lambda)\psi(\lambda)^{-n}\big) = \frac1n[\lambda^{n-1}]\big(H'(\lambda)\phi(\lambda)^n\big), $$ since $ψ^{- n} = ϕ (λ)^{n} λ^{- n}$ and $\mathrm{Res}\lambda(g(\lambda)\lambda^{-n}) = [\lambda^{n-1}]g $. T ak in g$ H(\lambda) = \lambda $g i v es$ [x^n]R = \frac1n[\lambda^{n-1}]\phi(\lambda)^n $.$ \square$

Proposition 4 (Cayley via the tree function). The rooted-tree EGF $T = x e^{T}$ has $[x^{n}] T = n^{n - 1} / n!$ , so there are $n^{n - 1}$ rooted and $n^{n - 2}$ unrooted labelled trees.

Proof. Apply Proposition 3 with $ϕ (λ) = e^{λ}$ (so $R = T$ ): $[x^{n}] T = \frac{1}{n} [λ^{n - 1}] e^{nλ}$ . Expanding $e^{nλ} = \sum_{k \geq 0} \frac{n ^{k}}{k !} λ^{k}$ , the coefficient of $λ^{n - 1}$ is $\frac{n ^{n - 1}}{( n - 1 )!}$ , so $[x^{n}] T = \frac{1}{n} \cdot \frac{n ^{n - 1}}{( n - 1 )!} = \frac{n ^{n - 1}}{n !}$ . The number of rooted trees on $[n]$ is $n! [x^{n}] T = n^{n - 1}$ ; dividing by the $n$ choices of root gives $n^{n - 2}$ unrooted trees, consistent with Proposition 1. $□$

Proposition 5 (directed matrix-tree / Markov-chain-tree, used by BEST). Let $G$ be a digraph and $L^{-} = D^{out} - A$ its out-Laplacian (diagonal of out-degrees minus the adjacency). The principal cofactor deleting row and column $w$ equals the number $t_{w} (G)$ of spanning arborescences oriented toward $w$ , independent of the off-root deletion.

Proof. The argument mirrors Proposition 2 with the directed incidence structure. Each spanning arborescence toward $w$ is an acyclic out-degree-one-off- $w$ subgraph; the Cauchy-Binet / cofactor expansion of $L^{-}$ with row and column $w$ removed selects exactly the edge sets in which every vertex except $w$ has out-degree $1$ and which are acyclic, i.e. the arborescences rooted at $w$ , each contributing $+ 1$ . Independence of $w$ follows because the all-ones vector is a left null vector of $L^{-}$ (each row of $A$ sums to the out-degree), so by the adjugate argument the relevant cofactors are equal across roots. This $t_{w} (G)$ is the arborescence count entering the BEST product $ec (G) = t_{w} (G) \prod_{v} (de g^{+} (v) - 1)!$ , whose bijective core is the last-exit-tree correspondence stated in Theorem 4. $□$

Connections Master

The tree function $T (x) = x e^{T}$ assembled here is the exact object whose analytic continuation drives 40.08.05: that unit locates the dominant singularity of $T$ at $x = e^{- 1}$ (where $T = 1$ and the square-root branch of the inverse $ψ (λ) = λ e^{- λ}$ appears), and singularity analysis converts the $\frac{n ^{n - 1}}{n !}$ coefficient into Stirling-asymptotic form and then into the universal $c n^{- 3/2} ρ^{- n}$ law for simply generated families of trees. This unit owns the exact enumeration $n^{n - 1}$ and the fixed-point equation; 40.08.05 owns the coefficient asymptotics that the residue form of Lagrange inversion feeds.
The graph Laplacian $L = D - A$ and its cofactor/eigenvalue spanning-tree counts are the combinatorial source of spectral graph theory: the matrix-tree eigenvalue product $\frac{1}{n} \prod_{i \geq 2} μ_{i}$ , the multiplicity of the eigenvalue $0$ counting components, and the algebraic connectivity $μ_{2}$ all read graph structure off the Laplacian spectrum. The foundational graph object — degrees, adjacency, the handshake count — comes from 40.04.01, which co-produces the basic tree characterisations this unit's enumeration rests on; the matrix-tree theorem is the determinantal lift of that unit's Cayley preview into arbitrary weighted graphs.
Lagrange inversion and the Catalan instance $C = 1 + x C^{2}$ tie directly back to 40.01.03, where the Catalan generating function was solved by the quadratic and the exponential formula gave $T = x e^{T}$ as the set-of-subtrees construction; this unit supplies the inversion theorem that 40.01.03 previewed on $T = x e^{T}$ and the Catalan quadratic. The OGF/EGF symbolic dictionary of 40.01.03 — the SET row giving $exp$ , the sequence row giving $\frac{1}{1 - C}$ — is what makes "rooted tree = root + set of subtrees" translate to the functional equation that Lagrange inversion then solves in closed form.

Historical & philosophical context Master

The count $n^{n - 2}$ was published by Arthur Cayley in 1889, A theorem on trees (Quarterly Journal of Pure and Applied Mathematics 23, 376–378), in the course of enumerating rooted trees and chemical isomers; Cayley's own argument was a recursive generating-function count rather than a bijection. The bijective encoding now universally taught is due to Heinz Prüfer (1918), Neuer Beweis eines Satzes über Permutationen (Archiv der Mathematik und Physik 27, 142–144), which gave the leaf-pruning correspondence and with it the degree-specified refinement ^{[Aigner-Ziegler Ch. 33]}.

The determinantal count is older than Cayley's formula: Gustav Kirchhoff (1847), Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird (Annalen der Physik 72, 497–508), derived the spanning-tree cofactor identity while computing currents in resistive networks, so the matrix-tree theorem entered mathematics through physics. James Joyal's 1981 theory of species ^{[Stanley §5.4]} supplied the doubly-rooted "vertebrate" bijection proving $n^{n - 2}$ by the functional identity $n^{n} = n^{2} \cdot (trees)$ , and Jim Pitman's 1999 double-counting of refining forests gave the fourth proof by counting the ways to grow a tree edge by edge in two orders. Lagrange's 1770 reversion of series is the analytic ancestor of the inversion theorem, given its residue form by Bürmann (1799) and folded into enumerative combinatorics through the symbolic method of Flajolet and Sedgewick ^{[Flajolet-Sedgewick Note I.49]}. The Eulerian-circuit count is the BEST theorem, after de Bruijn and van Aardenne-Ehrenfest (1951) and Smith and Tutte (1941), uniting the arborescence count of the directed matrix-tree theorem with the local factorial orderings.

Bibliography Master

@book{stanley1999ec2,
  author    = {Stanley, Richard P.},
  title     = {Enumerative Combinatorics, Volume 2},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {62},
  publisher = {Cambridge University Press},
  year      = {1999}
}

@book{aignerziegler2018book,
  author    = {Aigner, Martin and Ziegler, G\"unter M.},
  title     = {Proofs from THE BOOK},
  edition   = {6},
  publisher = {Springer},
  year      = {2018}
}

@book{flajoletsedgewick2009,
  author    = {Flajolet, Philippe and Sedgewick, Robert},
  title     = {Analytic Combinatorics},
  publisher = {Cambridge University Press},
  year      = {2009}
}

@book{vanlintwilson2001,
  author    = {van Lint, J. H. and Wilson, R. M.},
  title     = {A Course in Combinatorics},
  edition   = {2},
  publisher = {Cambridge University Press},
  year      = {2001}
}

@article{cayley1889trees,
  author  = {Cayley, Arthur},
  title   = {A theorem on trees},
  journal = {Quarterly Journal of Pure and Applied Mathematics},
  volume  = {23},
  year    = {1889},
  pages   = {376--378}
}

@article{prufer1918,
  author  = {Pr\"ufer, Heinz},
  title   = {Neuer Beweis eines Satzes \"uber Permutationen},
  journal = {Archiv der Mathematik und Physik},
  volume  = {27},
  year    = {1918},
  pages   = {142--144}
}

@article{kirchhoff1847,
  author  = {Kirchhoff, Gustav},
  title   = {\"Uber die Aufl\"osung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Str\"ome gef\"uhrt wird},
  journal = {Annalen der Physik},
  volume  = {148},
  year    = {1847},
  pages   = {497--508}
}

@article{pitman1999forests,
  author  = {Pitman, Jim},
  title   = {Coalescent random forests},
  journal = {Journal of Combinatorial Theory, Series A},
  volume  = {85},
  year    = {1999},
  pages   = {165--193}
}

Prerequisites

40.01.03
40.04.01

Tier anchors

beginner: Stanley 1999 *Enumerative Combinatorics, Volume 2* (Cambridge) Ch. 5 read for the small tree pictures and the Prüfer encoding by hand; Aigner-Ziegler 2018 *Proofs from THE BOOK* 6e (Springer) Ch. 33 (four proofs of Cayley's formula) for the picture-first bijections; West 2001 *Introduction to Graph Theory* 2e (Prentice Hall) §2.2 for leaf-pruning
intermediate: Stanley 1999 *Enumerative Combinatorics, Volume 2* (Cambridge) §5.3 (the Prüfer bijection and Cayley's formula), §5.6 (the matrix-tree theorem via Cauchy-Binet), §5.4 (Lagrange inversion, Theorem 5.4.2); van Lint-Wilson 2001 *A Course in Combinatorics* 2e (Cambridge) Ch. 2 (trees, the matrix-tree theorem); Aigner-Ziegler 2018 *Proofs from THE BOOK* 6e Ch. 33
master: Stanley 1999 *Enumerative Combinatorics, Volume 2* (Cambridge) Ch. 5 in full — the tree function $T = xe^T$, Lagrange inversion (Theorem 5.4.2, Corollary 5.4.3), the matrix-tree theorem and its all-minors / weighted refinement (Theorem 5.6.8), and the BEST theorem for Eulerian circuits (Corollary 5.6.3 region); Flajolet-Sedgewick 2009 *Analytic Combinatorics* (Cambridge) Note I.49 and §VI.7 (Lagrange inversion, tree singularities); Tutte 1948 and Brooks-Smith-Stone-Tutte 1940 for the matrix-tree and BEST origins; Kirchhoff 1847 (the original Laplacian-cofactor count)

References

Stanley, R. P. — Enumerative Combinatorics, Volume 2 · Cambridge Studies in Advanced Mathematics 62 (1999), Chapter 5. §5.3 develops the Prüfer bijection between labelled trees on $n$ vertices and words in $\{1,\dots,n\}^{n-2}$, with the refinement that vertex $i$ appears $\deg(i)-1$ times, yielding both Cayley's $n^{n-2}$ and the degree-specified multinomial count $\binom{n-2}{d_1-1,\dots,d_n-1}$. §5.4 states and proves Lagrange inversion (Theorem 5.4.2) in the form $[x^n]H(R(x)) = \frac{1}{n}[\lambda^{n-1}]\,H'(\lambda)\phi(\lambda)^n$ for $R = x\phi(R)$, with the tree function $T = xe^T$, rooted plane trees and Catalan numbers, and ballot numbers as worked applications. §5.6 proves the matrix-tree theorem (the number of spanning trees of a graph is any cofactor of the Laplacian $L = D - A$) by Cauchy-Binet applied to a signed incidence matrix, gives the weighted and all-minors generalisations, the eigenvalue product $\frac{1}{n}\prod_{i=2}^n \mu_i$, and the directed (arborescence) version feeding the BEST theorem counting Eulerian circuits of a digraph as $\mathrm{ec}(G) = t_w(G)\prod_v (\deg^+(v)-1)!$.
Aigner, M. & Ziegler, G. M. — Proofs from THE BOOK · Springer, 6th edition (2018), Chapter 33 'Cayley's formula for the number of trees'. Presents four independent proofs that there are $n^{n-2}$ labelled trees on $n$ vertices: (1) the Prüfer code bijection; (2) a linear-algebra proof via the matrix-tree theorem specialised to the complete graph, where the reduced Laplacian of $K_n$ is $nI - J$ on $n-1$ coordinates with determinant $n^{n-2}$; (3) Joyal's bijective/EGF proof counting doubly-rooted trees (vertebrates) as functions, $n^n = n^2 \cdot (\text{trees})$ rearranged; (4) Pitman's double-counting of refining sequences of rooted forests, counting the number of ways to build $K_n$'s spanning trees by adding directed edges in two ways.
Flajolet, P. & Sedgewick, R. — Analytic Combinatorics · Cambridge University Press (2009). Note I.49 and Appendix A.6 state the Lagrange-Bürmann inversion theorem in both the coefficient form and the residue form $[x^n]R(x) = \frac{1}{n}[\lambda^{-1}]\,\lambda^{-n}(\lambda/\phi(\lambda))$-style change of variable; §VI.7 applies it to the singularity analysis of the tree function $T(z) = ze^{T(z)}$, locating the square-root singularity at $z = e^{-1}$ that drives the $n^{n-1}$ rooted-tree asymptotics and the $n^{-3/2}\rho^{-n}$ universal tree-counting law. The Catalan generating function as the simplest plane-tree instance $C = 1 + zC^2$ inverted by Lagrange is Example I.16 / §VI.
van Lint, J. H. & Wilson, R. M. — A Course in Combinatorics · Cambridge University Press, 2nd edition (2001), Chapter 2 'Trees' and the matrix-tree chapter. Gives the leaf-existence and edge-count characterisation of trees, the Prüfer correspondence, and a self-contained proof of the matrix-tree theorem via the Binet-Cauchy formula applied to the oriented incidence matrix $N$, identifying a maximal set of independent columns of $N$ with the spanning trees and the corresponding $3\times3$-style minor determinants as $\pm1$.

Estimated time

beginner: 22m
intermediate: 52m
master: 92m