40.08.02 · combinatorics / analytic-combinatorics-asymptotics

The Symbolic Method for Labelled Structures

shipped3 tiersLean: none

Anchor (Master): Flajolet-Sedgewick 2009 *Analytic Combinatorics* (Cambridge) Ch. II in full — the labelled (partitional) product and its EGF translation, the SEQ/SET/CYC dictionary, the boxed product as the integral operator $\int_0^z (\partial_t A)\,B\,dt$, pointing and substitution, supercritical sequences and the exp-log schema (Theorems II.1-II.5, Propositions II.4-II.6); Stanley 1999 *Enumerative Combinatorics, Volume 2* (Cambridge) Ch. 5 (the exponential / compositional formulae, species); Joyal 1981 *Une théorie combinatoire des séries formelles* (Adv. Math. 42) for the functorial (species) foundation; Foata-Schützenberger 1970 for the labelled-product / shuffle calculus

Intuition Beginner

Some things you count carry name tags. A permutation of five people is not just a shape — each seat holds a specific named person, so two arrangements that look the same after erasing the names are still different arrangements. A way of partitioning the numbers through into groups, a graph drawn on numbered dots, a function on a numbered set: these are labelled objects, where the identities of the parts matter. Counting them is a different game from counting plain shapes, because when you join two labelled pieces you also have to decide which name tags go to which piece.

The symbolic method is a way to count such objects without ever counting by hand. You describe the object by a recipe — "a permutation is a collection of cycles", "a set partition is a set of nonempty blocks" — and the recipe translates, word for word, into an operation on a special bookkeeping function. The recipe word "collection of" becomes one specific algebraic move; the recipe word "sequence of" becomes another. Once you have written the recipe, the counting function falls out by reading the translation off a short table.

The right bookkeeping function for labelled objects is the exponential generating function. It is the plain generating function with one extra division: the count at size is divided by factorial before being hung on the bookkeeping symbol. That single change is exactly what is needed to handle the name tags. When you split a labelled object into two pieces, you choose which tags go left and which go right, and the factorial weighting builds that choice into the algebra for free.

This unit is the labelled half of the symbolic method. Its plain-shape twin handles objects without name tags; here every construction — sequence, set, cycle — gets a labelled translation, and the recipes for permutations, set partitions, surjections, and labelled trees all become one-line calculations.

Visual Beginner

Think of a labelled object as a handful of numbered cards arranged into a structure. To build a two-part object, you deal the cards into a left hand and a right hand, then arrange each hand on its own. The picture below shows the one move that makes the whole method work: the label-shuffle. You start with the cards through , choose a subset to send left, and the rest go right — every choice of subset is a different way to combine the two parts.

The table records the four basic recipes and what each one does to the bookkeeping function. Each row is a counting fact you can check on tiny cases, and the rest of the unit explains why each recipe translates to exactly its listed operation.

recipe in words what the object is operation on the EGF
lay an -part beside a -part, shuffling the labels a labelled pair multiply the two EGFs
an ordered list of any number of -parts a labelled sequence divide by ( minus the -EGF)
an unordered collection of any number of -parts a labelled set take the exponential of the -EGF
arrange -parts around a ring a labelled cycle take the logarithm of over ( minus the -EGF)

The headline cases live in these rows. A set partition is a set of blocks; a permutation is a set of cycles; a surjection is a sequence of nonempty blocks. Read the recipe, pick the row, write the answer.

Worked example Beginner

Let us count the surjections from a labelled set onto an unlabelled one: ways to sort the numbers into a list of nonempty groups where the order of the groups matters. (These are the same as ordered set partitions.) We will count by hand and then check against the recipe.

Step 1. A surjection of is a sequence of nonempty blocks whose blocks together use each of exactly once. List by the number of blocks.

Step 2. One block: the single block . That is arrangement.

Step 3. Two blocks: choose which numbers form the first block (the rest form the second), with both blocks nonempty. The first block can be — six choices, and each fixes the second block. That is arrangements.

Step 4. Three blocks: each block is a single number, and the blocks are ordered, so this is just an ordering of into three slots: arrangements.

Step 5. Total: . So there are surjections from a -element set.

Step 6. Check against the recipe. A surjection is "a sequence of nonempty sets of labels". The EGF of nonempty labelled sets is (the all-ones set count minus the empty set), and the sequence recipe divides by minus that, giving . Expanding this to the term and multiplying by gives the count , matching the hand count.

What this tells us: a one-line recipe — sequence of nonempty sets — reproduces a count that took five cases by hand, and the same recipe gives the count for every size at once.

Check your understanding Beginner

Formal definition Intermediate+

A labelled combinatorial class is a class of objects together with a size function such that an object of size carries a set of distinct atoms, bijectively labelled by , and such that finitely many objects have each size; is the counting sequence. The labelled atom has one object of size . The exponential generating function (EGF) of is $$ A(z) = \sum_{n \ge 0} a_n \frac{z^n}{n!} \in \mathbb{Q}[[z]], $$ following the formal-power-series conventions of 40.01.03. Two labelled objects on the same atoms are equal only as labelled objects; relabelling by an order-isomorphism transports an object on an arbitrary -set to the canonical one on .

Definition (labelled product). The labelled product (partitional, or shuffle, product) has objects the pairs in which the atoms of and of form a disjoint partition of a common label set with ; both components are then relabelled order-isomorphically to canonical form. The number of ways to split into an -subset and its complement is , so $$ (\mathcal{A} \star \mathcal{B})n = \sum{k=0}^{n} \binom{n}{k}, a_k, b_{n-k}, \qquad (A \star B)(z) = A(z),B(z), $$ the binomial convolution of 40.01.03. The empty labelled object (size ) is the unit for .

Definition (labelled SEQ, SET, CYC). From a class with (no object of size , so ) one forms three iterated products:

  • the sequence of ordered tuples of -components, with EGF ;
  • the set of unordered collections of -components, the -component part being , with EGF ;
  • the cycle of -components arranged around an oriented ring (sequences up to cyclic rotation), the -component part being , with EGF .

The SET translation is the exponential formula of 40.01.03 read as a construction. Restricted variants truncate or fix the number of components, with EGFs , , , .

Definition (boxed product and pointing). The boxed product is the subclass of in which the least label of the whole object lies in the -component. Its EGF is the integral operator $$ \big(A^{\square}\star B\big)(z) = \int_0^z \big(\partial_t A(t)\big), B(t), dt, $$ and a least-label-in- variant integrates . Pointing distinguishes one atom of each object; it has EGF , since pointing multiplies the size- count by . The operators and , the boxed product , and the pointing operator are recorded in _meta/NOTATION.md.

Counterexamples to common slips Intermediate+

  • "The labelled and unlabelled symbolic methods give the same translations." They do not. The unlabelled SEQ is in both worlds, but unlabelled SET is the Euler transform and unlabelled CYC carries the Euler totient ; the labelled SET and CYC collapse to the clean and precisely because distinct labels remove all symmetry.

  • " requires to have no empty object." It requires , i.e. : otherwise has a divergent constant term and the construction admits infinitely many empty components. The same zero-constant-term condition governs SEQ and CYC.

  • "The boxed product is just an ordered version of the labelled product." The ordering is not arbitrary: pins the globally smallest label to . This is what makes the integral translation exact and is the device used to enumerate increasing/ordered structures (increasing trees, ordered set partitions read least-first) where a recursive "smallest element first" decomposition is natural.

  • " and differ only by a constant." Over components, SEQ counts ordered tuples and CYC counts rings (dividing by the rotations), so the generating functions are and — genuinely different series, agreeing only at the one-component level.

Key theorem with proof Intermediate+

The signature theorem is the translation theorem for labelled constructions: it states that the labelled product, sequence, set, and cycle constructions are admissible — each translates to a fixed analytic operation on the EGF — so any specification built from by these operations yields its EGF mechanically. This is the labelled half of the symbolic method, and every later enumeration in the chapter is one instance of it [Flajolet-Sedgewick §II.2-II.3].

Theorem (labelled translation theorem). Let be labelled classes with EGFs , and suppose where a construction requires it. Then:

  1. (union, disjoint) has EGF ;
  2. (labelled product) has EGF ;
  3. (sequence) has EGF ;
  4. (set) has EGF ;
  5. (cycle) has EGF .

Proof. Part (1) is the additivity of EGFs over a disjoint union, coefficientwise. For (2), an object of on is determined by a choice of -subset to carry the -component (the complement carrying ), an -object on relabelled canonically (there are of these), and a -object on the complement ( of these). Summing over , $$ (\mathcal{A}\star\mathcal{B})n = \sum{k=0}^{n}\binom{n}{k}a_k, b_{n-k}, $$ whose EGF is the Cauchy product of and : has -coefficient , so . This is the 40.01.03 product rule.

For (3), is a disjoint union of -fold labelled products, so by (1) and (2) its EGF is . Since , the series has lowest-order term of order , so for each fixed only finitely many contribute to and the sum converges formally to (the constant term of is , a unit, by 40.01.03 Proposition 1).

For (4), an object of is an unordered collection of components. The ordered version is , and each unordered collection of distinct-label components arises from exactly orderings (the components are pairwise distinguishable by their label sets, so no overcounting collapse occurs); hence the -component subclass has EGF . Summing, has EGF . This is the exponential formula of 40.01.03 presented as the SET construction.

For (5), a cycle on components is a sequence of components read up to cyclic rotation. The rotations of a -tuple of distinct-label components are all distinct (the label sets break the rotational symmetry, for ), so the -component subclass has EGF , and has EGF .

Corollary (permutations and set partitions). Taking — a cyclic arrangement of labelled atoms, EGF — the class of sets of cycles has EGF , recovering the permutations. Taking — a nonempty block of labels, EGF — the class of sets of nonempty blocks has EGF , the Bell-number EGF.

Bridge. This theorem builds toward the entire analytic machinery of the chapter and appears again in 40.08.01, where the same translation table is run for unlabelled classes with ordinary generating functions and the singularities of the resulting functions deliver asymptotics. The foundational reason the labelled SET collapses to the clean , whereas the unlabelled SET needs the Euler transform, is exactly that distinct labels destroy the automorphisms that the unlabelled count must mod out — this is the central insight separating the two halves of the method. The labelled product translating to plain multiplication of EGFs is dual to the unlabelled product translating to multiplication of OGFs: in both, juxtaposition becomes a product, but the EGF builds the label-shuffle into its factorial weighting. Putting these together, the permutation specification and the set-partition specification are not separate tricks but two readings of one -of-something table entry, and this generalises directly into the saddle-point asymptotics of 40.08.06, where the Bell and involution EGFs are extracted.

Exercises Intermediate+

Advanced results Master

The translation theorem is the elementary stratum; its reach extends through the boxed product as an integral calculus, pointing-and-substitution, the supercritical-sequence and exp-log schemas governing component counts, and the species-level functoriality that unifies the labelled and unlabelled dictionaries.

Theorem 1 (boxed product as integral operator). For labelled classes the least-label-in- boxed product has EGF $$ \big(A^{\square}\star B\big)(z) = \int_0^z \big(\partial_t A(t)\big),B(t),dt . $$ Iterating the boxed product builds structures with a prescribed order on a distinguished chain of labels: (the smallest-label-first sequence) generates increasing/ordered constructions, and increasing trees arise as , whose EGF solves -type ODEs [Flajolet-Sedgewick §II.4]. The integral form is exact because constraining the least of labels to multiplies the size- partitional count by the probability that the global minimum lands in the -component, and on together with implements exactly this size-weighting.

Theorem 2 (pointing, substitution, and the compositional formula). Pointing translates to . Labelled substitution — replacing each atom of a -object by an -object on a disjoint label block — translates to functional composition provided ; this is the compositional formula of 40.01.03 in construction form. The set construction is the case (), recovering ; the sequence construction is (), recovering . Pointing and substitution together give the "pointing-and-rooting" calculus: , the standard root-extraction identity [Flajolet-Sedgewick §II.4].

Theorem 3 (supercritical sequence schema). Let with having radius of convergence and (the supercritical case). Then the equation has a unique root , has a simple pole at , and the number of -components in a random size- -object is asymptotically Gaussian with mean and variance for constants depending on at [Flajolet-Sedgewick §V, IX]. This is the labelled-sequence component-count limit law, the mechanism behind surjection block-counts; the analytic extraction belongs to the singularity/quasi-powers analysis of 40.08.01 and 40.08.06.

Theorem 4 (exp-log schema). Let where has EGF (analytic at ) of logarithmic singular type — as for permutation cycles (, ). Then the number of -components in a random size- -object is asymptotically Gaussian with mean and variance both . For permutations this is the Goncharov-Erdős-Turán theorem that a random permutation of elements has cycles with fluctuations; the schema isolates the logarithmic singularity of as the cause [Flajolet-Sedgewick §II.5, IX]. The exp-log schema is the SET analogue of the supercritical-sequence schema, the SET versus SEQ distinction reappearing at the level of limit laws.

Theorem 5 (species and functoriality). A labelled class is, in Joyal's framework, a functor from the groupoid of finite sets and bijections; its EGF is the generating series of the sequence . The operations are functorial operations on species (sum, product, the "" species composed on the right, the cyclic species, substitution), and the EGF is a ring homomorphism from the species semiring (with and ) to [Stanley §5.1, citing Joyal]. Two species can have the same EGF yet differ as functors; the EGF forgets the bijection-action and retains only cardinalities, which is why it is a complete invariant for counting but not for structure.

Synthesis. The labelled symbolic method is the foundational reason that "describe the object, read off the count" is a theorem rather than a slogan: every labelled construction — product, sequence, set, cycle, boxed product, pointing, substitution — is a fixed analytic operation on the exponential generating function, so a specification built from the atom emits its own EGF. The central insight is that the factorial weighting makes the label-shuffle automatic: the labelled product becomes plain multiplication, and putting these together with the iterated-product definitions, SET becomes and CYC becomes because distinct labels make the symmetric and cyclic group actions free — exactly where the unlabelled method instead pays the Euler-transform and totient tolls. This is dual to the unlabelled dictionary of 40.08.01 construction by construction, the two halves differing only where automorphisms survive. The boxed product is the one genuinely new labelled device, an integral operator recording a least-label-first recursion that reaches increasing structures no product alone can build. The bridge to asymptotics is that the supercritical-sequence and exp-log schemas read the shape of the component-count distribution off the type of singularity of or — a pole for SEQ, a logarithm for SET — so the same EGF the symbolic method produces exactly will, under the singularity analysis of 40.08.01 and the saddle-point method of 40.08.06, surrender its asymptotic growth and its limit laws. The construction calculus and the analytic calculus are one method read at two resolutions.

Full proof set Master

Proposition 1 (labelled product translation). The EGF of is , with .

Proof. An object of on the label set is a triple: a subset of size designating the -atoms, an -object on , and a -object on the complement, where each component is read in canonical labelling via the order-isomorphism of its label set with (respectively ). The subset is chosen in ways, the -object in ways, the -object in ways; summing over gives . The Cauchy product of the EGFs has -coefficient $$ \sum_{k=0}^{n}\frac{a_k}{k!}\cdot\frac{b_{n-k}}{(n-k)!} = \frac{1}{n!}\sum_{k=0}^{n}\frac{n!}{k!(n-k)!}a_k b_{n-k} = \frac{1}{n!}\sum_{k=0}^n\binom{n}{k}a_k b_{n-k}, $$ so .

Proposition 2 (SET, SEQ, CYC translations). For with : has EGF , has EGF , and has EGF .

Proof. By Proposition 1 iterated, has EGF . The sequence (ordered -tuples) has EGF ; since , has order , so each is a finite sum and , being a unit. For SET, the -component subclass is the quotient by the symmetric group permuting the components. The action of on -tuples of components with pairwise disjoint nonempty label sets is free: a permutation fixing a tuple would have to fix each component's label set, but distinct components occupy distinct positions with distinct label sets, so only the identity fixes the tuple. Hence every orbit has size and the -component subclass has EGF ; summing, has EGF . For CYC, the cyclic group acts on -tuples by rotation; the same disjoint-label-set argument makes the action free for , so each orbit (cycle) has size and the -component subclass has EGF , giving .

Proposition 3 (boxed-product integral translation). The least-label-in- boxed product has EGF .

Proof. Write . An object of on with the -component of size has its global minimum label in the -component precisely when label is among the chosen -labels. Among the subsets of size , those containing number , a fraction . Hence, weighting Proposition 1's count, $$ c_n = \sum_{k\ge1}\binom{n-1}{k-1}a_k b_{n-k} = \sum_{k\ge1}\frac{k}{n}\binom{n}{k}a_k b_{n-k} = \frac{1}{n}\sum_{k}\binom{n}{k}(k,a_k)b_{n-k}. $$ Now is the size- count of shifted: , the EGF of re-indexed. The product is, by Proposition 1, the EGF of at size , i.e. at size ; integrating shifts and divides the coefficient by , returning . Therefore .

Proposition 4 (permutations, set partitions, surjections by specification). The specifications , , and have EGFs , , and , counting permutations (), set partitions (), and surjections.

Proof. has EGF by Proposition 2 (with ). Then has EGF , so the size- count is ; this is the cycle decomposition of a permutation, an unordered set of its cycles. has EGF (nonempty blocks of labels). Then has EGF , the Bell numbers, counting partitions of into nonempty blocks. Replacing the outer SET by SEQ orders the blocks: has EGF , counting surjections (ordered set partitions). Each specification's EGF follows mechanically from Proposition 2; the combinatorial reading is the standard component decomposition.

Connections Master

  • The unlabelled symbolic method of 40.08.01 runs the identical construction-to-operation program with ordinary generating functions: disjoint union, product, and sequence translate to , , in both worlds, but the unlabelled SET becomes the Euler transform and the unlabelled CYC carries the totient , whereas this unit's labelled SET and CYC collapse to and . The two units are the paired halves of one method; 40.08.01 owns the OGF dictionary and the labelling-removes-symmetry contrast is the exact seam between them.

  • The exponential generating-function algebra, the binomial convolution, and the exponential formula on which every translation here rests are developed at 40.01.03; the labelled product rule is that unit's EGF product rule, the SET translation is its exponential formula, and the compositional/substitution translation is its compositional formula. This unit lifts those formal identities into a construction calculus: where 40.01.03 proves is the EGF of sets of connected components, this unit makes a first-class operation that any specification may invoke.

  • The Cayley tree function produced here as the specification is solved in closed form by the Lagrange inversion of 40.01.07, which supplies and the matrix-tree and Prüfer counts of labelled trees; this unit gives the symbolic specification, that unit gives the inversion that extracts its coefficients, and the singularity of at feeds the tree asymptotics downstream.

  • The saddle-point method of 40.08.06, co-produced in this chapter, extracts the asymptotics of exactly the EGFs this unit constructs: the Bell-number EGF , the involution EGF , and the fragmented-permutation EGF are entire or essentially-singular functions whose coefficient growth is governed by a saddle point rather than a pole; this unit produces the functions exactly, and 40.08.06 reads off their large- behaviour and the associated component-count limit laws of the exp-log schema.

Historical & philosophical context Master

The labelled product and its exponential-generating-function calculus crystallised through several independent lines. Pólya's 1937 cycle-index enumeration [Stanley §5.1] supplied the set/cycle constructions in the symmetry-laden unlabelled setting; the labelled simplification, in which distinct labels make the relevant group actions free, was made systematic by the "exponential formula" of mid-twentieth-century combinatorics (Riddell, Uhlenbeck, Ford, and others on connected-graph enumeration) and by the Foata-Schützenberger (1970) calculus of the shuffle (partitional) product, which gave the labelled product its algebraic form.

The functorial foundation is André Joyal's 1981 Une théorie combinatoire des séries formelles (Advances in Mathematics 42, 1-82) [Stanley §5.1, citing Joyal], which recast a labelled class as a species — a functor on the groupoid of finite sets and bijections — and identified the EGF as the species' generating series, the constructions as functorial operations, and the SET-equals- and CYC-equals- rules as theorems about the exponential and cyclic species. The boxed product and its integral translation, organising least-label-first decompositions and the increasing structures, were developed within the analytic-combinatorics program; the complete codification of the labelled symbolic method as an automatic specification-to-EGF translation with the supercritical-sequence and exp-log limit-law schemas is due to Flajolet and Sedgewick, whose Analytic Combinatorics (2009) Chapter II is the form in which the method is now taught and on which 40.08.01 and 40.08.06 build their singularity and saddle-point analyses.

Bibliography Master

@book{flajoletsedgewick2009,
  author    = {Flajolet, Philippe and Sedgewick, Robert},
  title     = {Analytic Combinatorics},
  publisher = {Cambridge University Press},
  year      = {2009}
}

@book{stanley1999ec2,
  author    = {Stanley, Richard P.},
  title     = {Enumerative Combinatorics, Volume 2},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {62},
  publisher = {Cambridge University Press},
  year      = {1999}
}

@book{stanley2012ec1,
  author    = {Stanley, Richard P.},
  title     = {Enumerative Combinatorics, Volume 1},
  series    = {Cambridge Studies in Advanced Mathematics},
  volume    = {49},
  edition   = {2},
  publisher = {Cambridge University Press},
  year      = {2012}
}

@book{wilf2006gfology,
  author    = {Wilf, Herbert S.},
  title     = {generatingfunctionology},
  edition   = {3},
  publisher = {A K Peters},
  year      = {2006}
}

@article{joyal1981species,
  author  = {Joyal, Andr\'{e}},
  title   = {Une th\'{e}orie combinatoire des s\'{e}ries formelles},
  journal = {Advances in Mathematics},
  volume  = {42},
  number  = {1},
  year    = {1981},
  pages   = {1--82}
}

@article{foataschutzenberger1970,
  author    = {Foata, Dominique and Sch\"{u}tzenberger, Marcel-Paul},
  title     = {Th\'{e}orie g\'{e}om\'{e}trique des polyn\^{o}mes eul\'{e}riens},
  series    = {Lecture Notes in Mathematics},
  volume    = {138},
  publisher = {Springer},
  year      = {1970}
}

@article{polya1937cycleindex,
  author  = {P\'{o}lya, George},
  title   = {Kombinatorische Anzahlbestimmungen f\"{u}r Gruppen, Graphen und chemische Verbindungen},
  journal = {Acta Mathematica},
  volume  = {68},
  year    = {1937},
  pages   = {145--254}
}