07.05.08 · representation-theory / symmetric

Cutoff phenomenon

shipped3 tiersLean: none

Anchor (Master): Aldous-Diaconis 1986, 1987; Diaconis 1996 Proc. Symp. Appl. Math. 50; Diaconis-Shahshahani 1981

Intuition [Beginner]

Many random walks share a surprising property: for a long time the walk stays far from uniformly random, then it suddenly becomes close to uniform. The transition happens in a very narrow window of steps. This sharp transition is called the cutoff phenomenon.

Imagine a light switch rather than a dimmer. The walk is "off" (far from uniform) for thousands of steps, then within a few dozen more steps it flips "on" (close to uniform). The number of steps where the switch happens depends on the size of the group, growing logarithmically.

The cutoff phenomenon was first identified by Aldous and Diaconis in 1986. They showed it occurs for random transpositions on the symmetric group, for random walks on the hypercube, and for many other natural chains. Not every random walk exhibits cutoff — some converge gradually — but the ones that do share a structural feature: the mixing time is controlled by a single "bottleneck" representation.

Visual [Beginner]

A diagram showing total variation distance versus number of steps for three scenarios: a walk with cutoff (sharp drop), a walk without cutoff (gradual decay), and a walk with late cutoff. The cutoff curve looks like a step function; the non-cutoff curve is a smooth exponential decay.

Three curves plotted on axes with steps on x and total variation distance on y. One curve drops sharply from 1 to 0 like a step function, labelled cutoff. Another curves decays gradually as an exponential, labelled no cutoff. The third drops sharply but at a later time.

The key visual signature of cutoff is the step-function shape: the total variation distance stays near 1, then drops rapidly to near 0 within a narrow window.

Worked example [Beginner]

Consider the lazy random walk on the hypercube (a string of bits). At each step, choose a random coordinate and flip it with probability (or leave it unchanged with probability ).

For : the hypercube has vertices. Starting from :

Step 1. After 1 step: the walk has moved to one of , , , or stayed at , each with probability for the first three and ... actually, with the lazy step, choose a random coordinate (3 choices, each with probability ), then flip it with probability . So the probability of staying put is and the probability of moving to each of the 3 neighbours is .

Step 2. After 5 steps, the distribution is spread across most vertices but still concentrated near the starting point. The total variation distance from uniform is approximately .

Step 3. After 12 steps ( for , but more steps needed for a small cube), the distribution is close to uniform.

What this tells us: for large , the cutoff happens at steps. Before this threshold, the walk is far from uniform; after it, the walk is close. The window around the cutoff has width .

Check your understanding [Beginner]

Formal definition [Intermediate+]

Let be a family of finite groups with . For each , let be a probability measure on driving a random walk, and let be the mixing time with threshold .

Definition (Cutoff). The family exhibits cutoff if there exists a sequence and a window such that

and

Equivalently, cutoff holds if for all .

Definition (Cutoff window). The cutoff window is the smallest for which the above holds. If , the cutoff is sharp; if , the cutoff follows a standard diffusion scaling.

Counterexamples to common slips

  • Cutoff is a property of families, not individual chains. A single random walk on a fixed group does not "have cutoff." Cutoff describes asymptotic behaviour as the group size grows.
  • Cutoff does not mean TV reaches exactly 0. The TV distance never reaches 0 in finite time (for an aperiodic chain). Cutoff means the TV distance drops from near 1 to near 0 in a narrow window.
  • Spectral gap alone does not determine cutoff. Two chains with the same spectral gap can have different cutoff behaviour. The full spectrum, not just the gap, controls the cutoff profile.

Key theorem with proof [Intermediate+]

Theorem (Cutoff for random transpositions — Diaconis-Shahshahani 1981). Consider the random transposition walk on : at each step, choose two positions uniformly at random and swap the cards in those positions. This family exhibits cutoff at time with window . Concretely, for :

Proof. The proof combines the Upper Bound Lemma with an analysis of the character sum for .

Upper bound. The random transposition measure is uniform on the conjugacy class of transpositions. By the conjugacy class simplification of the Upper Bound Lemma:

where denotes a transposition and the sum is over all non-principal partitions of . The character value of the irreducible representation on a transposition is given by

where is the second Casimir content. For :

for any fixed with . The dimensions are at most , and there are partitions. Since and , the tail of the sum is controlled by the exponential decay of the character ratio. Summing over all non-principal partitions and using :

when . This gives the upper bound: .

Lower bound. For the lower bound, use the standard representation of dimension . Its character value on a transposition is , so .

The second eigenvalue of the transition matrix is from the standard representation. For with :

This remains bounded away from 0 when is large, so the TV distance stays close to 1. More precisely, construct a test function on using the character of the standard representation:

Then and . Substituting gives a lower bound that stays bounded away from 0 when , completing the proof.

Bridge. The cutoff theorem for random transpositions builds toward the general theory of cutoff for random walks on finite groups, and appears again in the strong stationary time analysis where alternative probabilistic arguments give matching bounds. The foundational reason cutoff occurs is that a single representation (the standard representation ) controls both the upper and lower bounds: the character ratio is maximised at among non-principal representations, and this is exactly the spectral bottleneck that determines the mixing time. The bridge is between the character-theoretic upper bound from the Upper Bound Lemma and the second-eigenvalue lower bound; putting these together, both bounds match at , identifying the cutoff time.

Exercises [Intermediate+]

Advanced results [Master]

Theorem 1 (Cutoff for random transpositions — Diaconis-Shahshahani 1981). The random transposition walk on exhibits cutoff at time with window . The profile function is (Poisson-type decay in the cutoff window).

The proof was given in the Key theorem section above. The profile function comes from the Poisson approximation to the distribution of fixed points in the partially random permutation.

Theorem 2 (Cutoff for the hypercube — Diaconis-Rock 1992). The lazy random walk on exhibits cutoff at time with window . The profile is determined by the Poisson distribution of "unvisited" coordinates.

Theorem 3 (Product test — Peres and others). A family of reversible Markov chains exhibits cutoff if and only if , where is the relaxation time and is the minimum stationary probability. This gives a necessary and sufficient condition for cutoff in terms of spectral data.

This result, due to Peres and independently to other authors, converts the cutoff question into a spectral gap computation: if the spectral gap is small relative to the mixing time, cutoff occurs.

Theorem 4 (Wilson's lower bound method — Wilson 2004). For any function , the total variation distance satisfies . Wilson showed that for the random transposition walk, the choice gives the optimal lower bound matching the Upper Bound Lemma upper bound.

Theorem 5 (Pre-cutoff and total cutoff — Saloff-Coste 2004). A weaker notion, pre-cutoff, requires only that is bounded (not tending to 1). Every random walk on driven by a conjugacy class of bounded support exhibits pre-cutoff. Full cutoff is known for specific conjugacy classes (transpositions, -cycles for fixed ) but remains open for general conjugacy classes.

Theorem 6 (Cutoff profile for random transpositions). The cutoff profile for random transpositions on converges to:

where . The Poisson comparison arises because the number of fixed points of a partially random permutation is approximately Poisson with parameter depending on .

Synthesis. The cutoff phenomenon is the foundational reason that many natural random walks on large groups exhibit a sharp threshold in their convergence to uniformity. The central insight is that when a single irreducible representation dominates the character sum in the Upper Bound Lemma, the upper and lower bounds match and the mixing time is determined by the character ratio of that representation. Putting these together with Wilson's lower bound method, the cutoff time is identified as where is the second-largest eigenvalue. This is exactly the spectral characterisation that generalises from random transpositions to arbitrary conjugacy class walks on . The bridge is between the representation-theoretic computation of character ratios and the probabilistic notion of a sharp transition; the pattern recurs across shuffle models, hypercube walks, and random walks on groups generated by a bounded number of conjugacy classes. Identifying the "bottleneck" representation with the slowest-decaying character ratio identifies the mixing time with a single spectral parameter.

Full proof set [Master]

Proposition 1 (Character ratio for transpositions). For the irreducible representation of indexed by the partition , the character ratio on a transposition is

where .

Proof. The character of on a transposition can be computed via the Frobenius character formula or by counting rim hooks. The transposition has cycle type , and the Murnaghan-Nakayama rule gives . Simplifying: gives the stated formula after normalising by via the hook-length formula. More directly, the content sum identity gives .

Proposition 2 (Standard representation dominates the character sum). For the random transposition walk on , the largest non-principal character ratio is attained at and equals .

Proof. The character ratio equals . The second Casimir content is maximised among non-principal partitions at , where . Substituting: . For any other non-principal partition, , giving a smaller character ratio.

Connections [Master]

  • Non-abelian Fourier transform 07.01.09. The Fourier-analytic framework that underpins the cutoff proof — bounding TV distance via the Plancherel identity and character sums — is the non-abelian Fourier transform of 07.01.09 applied to convolution powers of the step distribution.

  • Symmetric group representation 07.05.01. The partitions of indexing the irreducible representations of and their character values on cycle types, developed in 07.05.01, are the raw material for the character sum that determines the cutoff profile.

  • Character orthogonality 07.01.04. The orthogonality relations of 07.01.04 ensure that the principal representation contributes nothing to the TV distance bound, isolating the non-principal representations whose character ratios control the cutoff.

  • Regular representation 07.01.05. The random walk on a group acts as convolution on , which is the regular representation. The spectral decomposition of the regular representation into irreducibles, described in 07.01.05, is the mechanism by which the character ratios emerge as the eigenvalues of the transition operator.

Historical & philosophical context [Master]

Aldous and Diaconis introduced the cutoff phenomenon in their 1986 paper Shuffling Cards and Stopping Times [AldousDiaconis1986], where they proved cutoff for the random transposition walk and the top-to-random shuffle using strong stationary times. Diaconis 1996 [Diaconis1996] systematised the theory in his AMS proceedings paper, identifying cutoff as a widespread phenomenon in finite Markov chains and cataloguing the known examples.

Diaconis and Shahshahani's 1981 paper [DiaconisShahshahani1981] contained the first instance of cutoff proved by representation-theoretic methods, though the term "cutoff" was coined later. Wilson 2004 developed the lower bound method that bears his name, providing a systematic technique for proving matching lower bounds to complement the Upper Bound Lemma upper bounds. The product test of Peres (circa 2004) gave the first necessary and sufficient condition for cutoff in terms of spectral data.

Bibliography [Master]

@article{AldousDiaconis1986,
  author = {Aldous, David and Diaconis, Persi},
  title = {Shuffling cards and stopping times},
  journal = {Amer. Math. Monthly},
  volume = {93},
  year = {1986},
  pages = {333--348},
}

@incollection{Diaconis1996,
  author = {Diaconis, Persi},
  title = {The cutoff phenomenon in finite Markov chains},
  booktitle = {Proc. Symp. Appl. Math.},
  publisher = {AMS},
  volume = {50},
  year = {1996},
  pages = {23--38},
}

@article{DiaconisShahshahani1981,
  author = {Diaconis, Persi and Shahshahani, Mehrdad},
  title = {Generating a random permutation with random transpositions},
  journal = {Z. Wahrsch. Verw. Gebiete},
  volume = {57},
  year = {1981},
  pages = {159--179},
}

@article{DiaconisStroock1991,
  author = {Diaconis, Persi and Stroock, Daniel},
  title = {Geometric bounds for eigenvalues of Markov chains},
  journal = {Ann. Appl. Probab.},
  volume = {1},
  year = {1991},
  pages = {36--61},
}

@article{Wilson2004,
  author = {Wilson, David B.},
  title = {Mixing times of Lozenge tiling and card shuffling Markov chains},
  journal = {Ann. Appl. Probab.},
  volume = {14},
  year = {2004},
  pages = {274--325},
}

@incollection{SaloffCoste2004,
  author = {Saloff-Coste, Laurent},
  title = {Random Walks on Finite Groups},
  booktitle = {Probability on Discrete Structures},
  publisher = {Springer},
  year = {2004},
  series = {Encyclopaedia of Mathematical Sciences},
  volume = {110},
}