Generating Functions: Ordinary, Exponential, and the Exponential Formula
Anchor (Master): Stanley 1999 *Enumerative Combinatorics, Volume 2* (Cambridge) Ch. 5 (the exponential formula, the compositional formula, Lagrange inversion, and the symbolic algebra of labelled species); Flajolet-Sedgewick 2009 *Analytic Combinatorics* (Cambridge) Ch. I-II (the symbolic method, OGF/EGF dictionaries); Bender-Williamson 1991 *Foundations of Applied Combinatorics*; Euler 1748 *Introductio in analysin infinitorum* Ch. XVI (the originating generating-function method)
Intuition Beginner
Suppose someone hands you a counting problem and you compute the answers one at a time: zero of something, one of something, two, three, and so on. You get a list of numbers. The list is the honest answer, but a bare list is hard to work with — you cannot easily find a pattern, prove an identity, or jump straight to the thousandth term. A generating function is a single algebraic object that carries the entire list at once, so that the tools of algebra do the counting for you.
The trick is to hang the numbers onto the powers of a bookkeeping symbol . If your counts are , you write down the one expression and treat it as a thing you can add, multiply, and invert. You are not plugging in a number for and you are not worrying about whether the sum settles down to a value; is a placeholder whose only job is to keep the term for each position from colliding with the others.
Why bother? Because two combinatorial operations become two clean algebraic ones. When you build a structure by laying down one labelled piece after another, the counts multiply through the product of two generating functions. When you build a structure by gluing together independent "connected" pieces in any number, the counts come out of an exponential of a generating function. So a hard problem about assembling things becomes an easy problem about multiplying, dividing, or exponentiating series.
This unit is the engine room of the whole chapter. Recurrences like the Fibonacci numbers get solved by turning them into one equation in a series and reading off the closed form. Labelled structures — ways of seating named people, partitioning a named set, drawing a graph on numbered dots — get counted by a second flavour of generating function that builds the position-number into the bookkeeping. Later chapters turn these same moves into a complete dictionary.
Visual Beginner
Think of a generating function as a clothesline. Each peg on the line is a power of : the peg , then , then , and so on, marching off to the right. On the peg you hang the count of whatever you are counting at size . The whole hung-up line, read as one algebraic expression, is the generating function.
The two basic moves are pictured in the table below. "Multiply two lines" lays one structure next to another and counts the joint sizes by pairing up pegs whose powers add to — this is the product rule. "Exponentiate a line" stacks any number of independent connected pieces and counts the result — this is the set rule that becomes the exponential formula at the next tier.
| operation on structures | operation on the line | what the peg collects |
|---|---|---|
| lay piece A then piece B (ordered) | multiply the two lines | every split of into a part for A and a part for B |
| use any number of A-pieces in a row | divide by minus the A-line | every way to break into A-pieces in order |
| glue any number of connected pieces (labelled) | exponentiate the line | every way to partition the labels among connected pieces |
Each row is a counting fact you can check on small numbers, and the rest of the unit explains why each algebraic move is exactly the right bookkeeping for its combinatorial picture.
Worked example Beginner
Let us count the Fibonacci numbers with a generating function and pull out a closed formula. The Fibonacci rule is: each term is the sum of the two before it. Starting from and , the list is
Step 1. Hang the list on the clothesline. Write , the one expression carrying every Fibonacci number on its matching power of .
Step 2. Turn the recurrence into an equation. The rule "term equals term plus term " says, in clothesline language, that equals times (shift the line one peg right) plus times (shift two pegs right), with the very first peg added back because the seed has no two predecessors. That bookkeeping gives the single equation .
Step 3. Solve for the whole line at once. Collecting on the left, , so , and therefore . One short division has captured every Fibonacci number.
Step 4. Read off a count. To find the fifth Fibonacci number without the closed form, expand the division as a series to five pegs: dividing by yields , and the coefficient on is . The list agrees: the fifth term after the seed is indeed .
What this tells us: a recurrence that ties each term to earlier ones becomes a single equation in a generating function, and solving that equation hands back the whole sequence as a ratio you can expand or analyze. The hard step-by-step recurrence is replaced by one division.
Check your understanding Beginner
Formal definition Intermediate+
Fix a commutative ring containing the rationals (so that division by is available; take or unless stated otherwise). The ring of formal power series consists of all expressions with , added coefficientwise and multiplied by the Cauchy product $$ \Big(\sum_{n} a_n x^n\Big)\Big(\sum_{n} b_n x^n\Big) = \sum_{n} \Big(\sum_{k=0}^{n} a_k b_{n-k}\Big) x^n . $$ No convergence is assumed: is an indeterminate, and the sum defining each product coefficient is finite. The constant series is the multiplicative identity. A series is a unit (has a multiplicative inverse) if and only if its constant term is a unit of ; the inverse is computed recursively from and for . In particular is invertible with . Formal convergence of an infinite sum or product means coefficientwise stabilisation: converges when each has positive-order lowest term tending to infinity, so that every coefficient is determined by finitely many factors.
Definition (ordinary generating function). The ordinary generating function (OGF) of a sequence is . The OGF of the disjoint union of two combinatorial classes is the sum of their OGFs; the OGF of the product (ordered pairs with size ) is the product , whose -coefficient is the convolution counting the ways to split a total size between the two coordinates. The sequence rule follows: a structure that is an ordered list of any number of parts drawn from a class with OGF (and ) has OGF .
Definition (exponential generating function). The exponential generating function (EGF) of is . EGFs are the right device for labelled structures, where the objects carry the distinct labels . The labelled product of two classes superimposes an -structure and a -structure on disjoint label sets whose union is ; choosing which labels go to contributes a factor , so the product rule reads
$$
\widehat{(A \star B)}(x) = \hat{A}(x),\hat{B}(x), \qquad (A \star B)n = \sum{k=0}^{n} \binom{n}{k} a_k, b_{n-k},
$$
the binomial convolution. Notation , the operator for "coefficient of ", and the formal , on are recorded in _meta/NOTATION.md.
Counterexamples to common slips Intermediate+
"The OGF and EGF of a sequence carry different information." They carry the same sequence ; only the weighting of the bookkeeping symbol differs ( versus ). The choice is dictated by which product rule matches the combinatorics: plain convolution (unlabelled, ordered concatenation) calls for the OGF, binomial convolution (labelled, label-splitting) for the EGF.
"A formal power series with can be inverted." It cannot: invertibility requires a unit constant term. The series , , are non-units. What a series with admits instead is a compositional inverse (an inverse under substitution), provided — a different operation, handled by Lagrange inversion.
" is the sequence rule for any class ." It requires (no empty part), otherwise and the geometric expansion fails to converge formally — there would be infinitely many ways to pad with empty parts, so the coefficient is undefined.
" of a series always makes sense formally." Formal converges coefficientwise only when has zero constant term; otherwise the constant terms do not stabilise. This is why the exponential formula applies to connected structures of positive size.
Key theorem with proof Intermediate+
The signature theorem is the exponential formula, because it is the single statement that turns "a labelled structure is a set of connected components" into "the EGF of all structures is the exponential of the EGF of the connected ones", and every later set-construction in analytic combinatorics is an instance of it.
Theorem (exponential formula). Let a class of labelled structures be such that every structure on a label set is determined by a partition of into nonempty blocks together with a "connected" structure on each block, with connected structures counted by on an -element label set and . Let be the EGF of the connected structures and the EGF of all structures, where counts the structures on an -set (with , the empty structure). Then $$ H(x) = \exp\big(C(x)\big) = \sum_{k \ge 0} \frac{C(x)^k}{k!}. $$
Proof. A structure on is a set partition of together with, on each block , a connected structure, of which there are . Summing over the number of blocks and over all partitions into blocks, $$ h_n = \sum_{k \ge 0} ;\sum_{\substack{B_1 \sqcup \cdots \sqcup B_k = {1,\dots,n} \ B_i \ne \varnothing}} \frac{1}{k!}\prod_{i=1}^{k} c_{|B_i|}, $$ where dividing by compensates for ordering the (unordered) blocks. Fix and the block sizes with . The number of ordered set partitions of into blocks of these sizes is the multinomial coefficient . Hence $$ \frac{h_n}{n!} = \sum_{k \ge 0} \frac{1}{k!} \sum_{\substack{n_1 + \cdots + n_k = n \ n_i \ge 1}} \prod_{i=1}^{k} \frac{c_{n_i}}{n_i!}. $$ The inner sum is exactly the -coefficient of , since multiplying copies of and collecting the term sums over all compositions with . Therefore . Equating EGFs gives .
Corollary (set partitions and Bell numbers). Take the connected structures to be single nonempty blocks with exactly one structure each: for , so . Then , recovering the Bell-number EGF ; refining by the number of blocks gives , the Stirling-second EGF foreshadowed at 40.01.01.
Bridge. This theorem is the foundational reason ordinary and exponential generating functions are two specialisations of one idea rather than two tricks: the OGF product rule counts ordered concatenation by plain convolution, and the exponential formula is exactly the labelled analogue where unordered gluing of connected pieces is counted by the binomial convolution iterated, which builds toward the symbolic method of 40.08.01. The set-of-connected-pieces picture generalises the permutation-as-product-of-cycles decomposition — there the connected pieces are cycles with and , so recovers permutations — and it is dual to the OGF sequence rule in the sense that "set" replaces "sequence" precisely when labels are present. Putting these together, the appearance of here and of in the sequence rule is the central insight that each combinatorial construction has one canonical generating-function operation, and this same dictionary appears again in 40.01.04 where the rational case is solved by transfer matrices and in 40.01.07 where the tree recursions are inverted by Lagrange inversion.
Exercises Intermediate+
Advanced results Master
The two product rules and the exponential formula are the elementary stratum; their reach extends through rational-function asymptotics, the compositional formula, Lagrange inversion, and the species-level algebra that unifies the OGF and EGF dictionaries.
Theorem 1 (rational generating functions and linear recurrences). A sequence satisfies a linear recurrence with constant coefficients for if and only if its OGF is a rational function with and . Partial-fraction decomposition over writes , where the are the reciprocals of the roots of ; reading coefficients gives with a polynomial of degree . The dominant term is governed by the of largest modulus, so ; for Fibonacci gives [Stanley §4.1].
Theorem 2 (compositional formula). If a labelled structure is built by partitioning the label set, placing a connected -structure on each block, and then placing a single -structure on the set of blocks (treated as a new labelled set), then the EGFs compose: the EGF of the composite is , provided . The exponential formula is the case (an unstructured set of blocks). Functional substitution of EGFs thus models "a structure of structures", and the requirement is the same zero-constant-term condition that makes formal composition well-defined [Stanley §5.1].
Theorem 3 (Lagrange inversion). Let with , and let be the unique formal power series with solving . Then for any formal Laurent series and ,
$$
[x^n],H(w(x)) = \frac{1}{n}[w^{n-1}]\big(H'(w),\phi(w)^n\big), \qquad [x^n],w(x) = \frac{1}{n}[w^{n-1}],\phi(w)^n .
$$
This inverts the tree-type functional equations directly: the EGF of labelled rooted trees satisfies (a root attached to a set of subtrees), so and , recovering Cayley's count of labelled rooted trees. The full development and forward applications live at 40.01.07 [Stanley §5.4].
Theorem 4 (symbolic dictionary). For unlabelled classes the admissible constructions translate to OGF operations: disjoint union to , product to , sequence to , multiset to (Euler transform), cycle to . For labelled classes the EGF operations are , the labelled product , sequence to , set to , cycle to . The exponential formula is the labelled row; the entire chapter is the systematic exploitation of this table, made analytic in 40.08.01.
Theorem 5 (multivariate refinement). Introducing a second variable marking a statistic (number of components, number of parts, number of cycles) refines the exponential formula to , whose -coefficient is — the EGF of structures with exactly connected components. Setting recovers the univariate count; differentiating in at extracts the mean number of components. For set partitions is the bivariate Bell EGF whose -coefficient is , the Stirling-second EGF of 40.01.01.
Synthesis. The generating-function method is the foundational reason enumeration and algebra are the same subject viewed twice: a sequence becomes a single element of , and each way of assembling combinatorial objects becomes one canonical operation on that ring — ordered concatenation is convolution, so the OGF sequence rule is ; labelled gluing of connected pieces is the binomial convolution iterated, so the EGF set rule is . The central insight is that this is one dictionary, not two: the OGF and EGF are the unlabelled and labelled specialisations of the symbolic method, and the exponential formula, the compositional formula, and Lagrange inversion are exactly the , substitution, and inversion entries of that dictionary. Putting these together, rational OGFs solve linear recurrences because partial fractions diagonalise the convolution, and this is dual to the way diagonalises the set construction; the bridge is that every closed form in the chapter — Fibonacci's golden ratio, Catalan's , Cayley's , the Bell EGF — is the coefficient extraction from one of these canonical operations, and this generalises directly into the analytic combinatorics of 40.08.01, where the location and nature of the singularities of the same generating functions deliver the asymptotic growth that the formal theory leaves implicit. The formal ring and the analytic function are one object read at two resolutions.
Full proof set Master
Proposition 1 (unit criterion and the sequence rule). A series is invertible iff is a unit of ; consequently, for with , the geometric series converges formally and is the OGF of sequences of -parts.
Proof. If then , so is a unit. Conversely, given a unit, define and recursively ; then by construction, so is a two-sided inverse (the ring is commutative). For the sequence rule, with has lowest term of order , so has lowest term of order ; for fixed only the terms contribute to , a finite sum, giving formal convergence. The coefficient counts ordered -tuples of -parts of total size , and summing over counts all ordered sequences of parts, the sequence-rule interpretation. Since has constant term (a unit), it is invertible, and identifies .
Proposition 2 (EGF product rule, labelled product). The EGF of the labelled product is , with .
Proof. A labelled product structure on consists of a choice of subset to carry an -structure (with the complement carrying a -structure), where the labels in and in its complement are order-isomorphically relabelled to and . For there are subsets, structures on , and on the complement, giving ; summing over yields . The Cauchy product of and has -coefficient , so .
Proposition 3 (exponential formula, component-marked form). With the connected EGF and marking the number of components, the EGF of all structures is , and is the EGF of structures with exactly components.
Proof. A structure with exactly components on is an unordered partition into nonempty blocks with a connected structure on each. Ordering the blocks and dividing by , and grouping by the size profile as in the Key theorem, the EGF of -component structures is , because the -fold product of sums over compositions , and the multinomial supplies the labelled count exactly as in Proposition 2 iterated. Marking each component by and summing over gives . Setting recovers .
Proposition 4 (Lagrange inversion, coefficient form). Let and with , . Then for .
Proof. Since , has a compositional inverse . For , by the formal residue calculus (formal contour integral / residue at ), and integrating by parts on the formal level, . Changing the variable of the formal residue from to via , the formal differential, gives , where the residue picks the coefficient of . The change-of-variable for formal residues is invariant because is a well-defined linear functional annihilating derivatives. Applying this to () gives , so .
Connections Master
The rational-OGF solution of linear recurrences by partial fractions specialises, when the recurrence is encoded by a finite automaton or a weighted digraph, to the transfer-matrix method: the OGF becomes for a transition matrix , and the same denominator governs the recurrence. That refinement, with its applications to walks, tilings, and the cluster method, is owned by
40.01.04; this unit supplies the underlying partial-fraction asymptotics that read coefficients off any rational generating function.Lagrange inversion, previewed here on and the Catalan quadratic, is the engine that inverts every tree-type functional equation, and its full statement (the Bürmann form, the residue calculus, and the enumeration of labelled and plane trees) is developed at
40.01.07. The forward link is exact: the exponential formula gives the EGF of a set of components, and Lagrange inversion gives the EGF of a root attached to a set of components, so trees are the recursive closure of the set construction proved here.The OGF/EGF symbolic dictionary — disjoint union, product, sequence, set, cycle mapping to — is exactly the symbolic method that organises
40.08.01, where the constructions are made into an automatic translation scheme and the singularity structure of the resulting generating functions yields asymptotic counts. This unit is the formal-power-series foundation those analytic results stand on: the exponential formula is the labelled SET row, and analytic combinatorics adds the complex-analytic coefficient asymptotics.The exponential formula's instances — set partitions with EGF , permutations as sets of cycles with — are the generating-function repackaging of the surjection and partition counts of
40.01.01; the twelvefold-way Stirling cells are precisely the component-marked exponential formula applied to nonempty blocks, so the elementary table is the source data and this unit is its generating-function lift.
Historical & philosophical context Master
The generating-function method originates with Euler's 1748 Introductio in analysin infinitorum [Euler 1748], where the product is introduced to count integer partitions and the coefficients of an infinite product are read as combinatorial counts — the founding act of treating a formal series as a bookkeeping device for a sequence. Abraham de Moivre had used recurrence-to-rational-function arguments somewhat earlier (in the analysis of the Fibonacci-type recurrences arising from his work on annuities and probability, 1718–1730), giving the first systematic use of the partial-fraction reading of a linear recurrence.
The exponential generating function and the exponential formula crystallised much later. The combinatorial content — that the EGF of a structure built from connected components is the exponential of the connected EGF — appears in the cycle-index work of Pólya (1937) and was placed on a fully structural footing by André Joyal's 1981 theory of species [Stanley §5.1, citing Joyal], which recast labelled structures as functors and the OGF/EGF passage as the move from a species to its generating series; the SET, SEQ, and CYC constructions and their exponential, geometric, and logarithmic generating-function images are the operations of that category. Lagrange's 1770 reversion of series supplied the inversion half of the dictionary, later given its residue-calculus form by Bürmann. The synthesis into a single algorithmic "symbolic method" with analytic coefficient asymptotics is due to Flajolet and Sedgewick, whose Analytic Combinatorics (2009) made the OGF/EGF dictionary and its singularity analysis a complete framework, the form in which the method is now taught and the foundation on which 40.08.01 builds.
Bibliography Master
@book{stanley2012ec1,
author = {Stanley, Richard P.},
title = {Enumerative Combinatorics, Volume 1},
series = {Cambridge Studies in Advanced Mathematics},
volume = {49},
edition = {2},
publisher = {Cambridge University Press},
year = {2012}
}
@book{stanley1999ec2,
author = {Stanley, Richard P.},
title = {Enumerative Combinatorics, Volume 2},
series = {Cambridge Studies in Advanced Mathematics},
volume = {62},
publisher = {Cambridge University Press},
year = {1999}
}
@book{wilf2006gfology,
author = {Wilf, Herbert S.},
title = {generatingfunctionology},
edition = {3},
publisher = {A K Peters},
year = {2006}
}
@book{flajoletsedgewick2009,
author = {Flajolet, Philippe and Sedgewick, Robert},
title = {Analytic Combinatorics},
publisher = {Cambridge University Press},
year = {2009}
}
@book{gkp1994concrete,
author = {Graham, Ronald L. and Knuth, Donald E. and Patashnik, Oren},
title = {Concrete Mathematics: A Foundation for Computer Science},
edition = {2},
publisher = {Addison-Wesley},
year = {1994}
}
@book{euler1748introductio,
author = {Euler, Leonhard},
title = {Introductio in analysin infinitorum},
volume = {1},
publisher = {Marcum-Michaelem Bousquet},
address = {Lausanne},
year = {1748}
}
@article{joyal1981species,
author = {Joyal, Andr\'{e}},
title = {Une th\'{e}orie combinatoire des s\'{e}ries formelles},
journal = {Advances in Mathematics},
volume = {42},
number = {1},
year = {1981},
pages = {1--82}
}
@article{polya1937cycleindex,
author = {P\'{o}lya, George},
title = {Kombinatorische Anzahlbestimmungen f\"{u}r Gruppen, Graphen und chemische Verbindungen},
journal = {Acta Mathematica},
volume = {68},
year = {1937},
pages = {145--254}
}