Strong stationary time; coupling argument
Anchor (Master): Aldous 1983; Aldous-Diaconis 1986, 1987; Diaconis-Shahshahani 1981; Levin-Peres-Wilmer 2009 Ch. 6
Intuition [Beginner]
A strong stationary time is a cleverly designed stopping rule that tells you the exact moment when a random walk has reached the uniform distribution. Instead of bounding how far the walk is from uniform at each step, you define a random time and prove: at time , the position of the walk is exactly uniform, and itself reveals no information about which position was reached.
The idea comes from card shuffling. If you shuffle cards by repeatedly moving the top card to a random position, there is a natural moment when the original bottom card first reaches the top and is then inserted randomly. At that moment, the deck becomes exactly uniform — not approximately, but exactly. The random time when this happens is a strong stationary time.
The power of this approach is that the distribution of gives an upper bound on the total variation distance. If is likely to be small, the walk mixes fast. If can be large, the walk mixes slowly.
Visual [Beginner]
A diagram showing the top-to-random shuffle. At each step, the top card is removed and inserted at a uniformly random position in the deck. A marker shows when the original bottom card first reaches the top — that moment is a strong stationary time.
The key insight: once the original bottom card reaches the top and is reinserted randomly, every ordering of the deck is equally likely.
Worked example [Beginner]
Consider the top-to-random shuffle on a deck of 4 cards, initially in order from top to bottom. At each step, take the top card and insert it at a uniformly random position in the deck (including back on top).
Step 1. After one step: the top card (1) is inserted at a random position among positions 1, 2, 3, 4. Each position has probability . If card 1 goes to position 4 (the bottom), the deck is .
Step 2. Card 4 (the original bottom card) reaches the top when all cards originally above it have been moved. The probability that card 4 reaches the top by step involves a coupon-collector type argument.
Step 3. Once card 4 is on top and is then inserted at a random position (at some step ), the deck becomes exactly uniform. The time is a strong stationary time. For a deck of cards, the expected time for is , which is approximately .
What this tells us: for , the expected is steps. The top-to-random shuffle mixes in about steps.
Check your understanding [Beginner]
Formal definition [Intermediate+]
Let be a Markov chain on a finite state space with stationary distribution . A stopping time is a random variable taking values in such that the event depends only on .
Definition (Strong stationary time). A stopping time is a strong stationary time (SST) if:
- has distribution (the position at time is exactly stationary), and
- is independent of (the position reveals nothing about when occurred).
The second condition is what makes strong: it is not enough that on average; must be independent of the event for every .
Definition (Separation distance). The separation distance at step is
Separation distance is an upper bound on total variation distance: . The separation distance equals for the fastest strong stationary time.
Proposition (SST bound). If is a strong stationary time, then
where is the fastest (stochastically smallest) SST.
Counterexamples to common slips
- Stationary time is not strong stationary time. A stationary time requires only (condition 1). A strong stationary time adds the independence condition (condition 2). Every SST is a stationary time, but not conversely.
- Coupling time is not SST. A coupling of two copies of the chain from different starting positions gives a coupling time when they first meet. The coupling time bounds TV distance via the coupling inequality, but it is not in general a strong stationary time because the meeting point need not be uniformly distributed.
- SST gives separation distance, not TV. The SST bound bounds the separation distance , which is at least as large as the TV distance. The SST bound can overestimate TV distance.
Key theorem with proof [Intermediate+]
Theorem (Aldous-Diaconis strong stationary time criterion 1987). Let be a strong stationary time for the random walk on a finite group driven by . Then:
Moreover, the separation distance equals where the infimum ranges over all strong stationary times .
Proof. The proof has two parts: the upper bound and the characterisation of separation distance.
Part (i): Upper bound. Let be any SST. Define the "foreknowledge" distribution: at step , if then the chain is already uniform (by the SST property), and if then we use the worst-case bound. Formally, for any :
By the SST property, . But for is obtained by further convolutions after , so need not equal . However, for a random walk on a group, convolution with any measure preserves the uniform distribution: if is uniform, then for is also uniform (since for any ). Therefore:
This gives , so .
For the TV bound: .
Part (ii): Separation distance equals the infimum. The separation distance can be achieved by an optimal SST constructed via duality. The strong stationary dual chain on a partially ordered state space with maximum element satisfies . Setting gives an SST with . Hence the infimum is achieved and equals .
Bridge. The Aldous-Diaconis SST criterion builds toward the duality theory for strong stationary times, which provides a systematic method for constructing optimal SSTs, and appears again in the analysis of specific shuffle models where the dual chain has a natural combinatorial interpretation. The foundational reason SSTs work is that the independence condition (position independent of stopping time) forces the distribution to be exactly uniform at time , and the Markov property preserves this uniformity for all later times. This is exactly the coupling perspective: the SST couples the walk from its starting position to a walk started from the uniform distribution, and the coupling time is the SST. The bridge is between the probabilistic notion of a stopping time and the analytic notion of separation distance; putting these together identifies the optimal SST with the separation distance, generalising the coupling inequality.
Exercises [Intermediate+]
Advanced results [Master]
Theorem 1 (Top-to-random mixing time — Aldous-Diaconis 1986). The top-to-random shuffle on has mixing time with cutoff at . The SST is the time when the original bottom card reaches the top and is reinserted.
The proof constructs the strong stationary dual on as in Exercise 7 and shows that reaches (complete uniformity) at a coupon-collector time, giving .
Theorem 2 (Duality for SST — Diaconis-Griffiths 2018, Fill 1998). Every strong stationary time corresponds to a strong stationary dual chain on a partially ordered set with a maximum element. Conversely, every such dual gives rise to an SST. The separation distance at time equals for the optimal dual.
This duality is the definitive structural result: it converts the probabilistic problem of finding an SST into the combinatorial problem of constructing a dual chain on a poset.
Theorem 3 (Coupling inequality — Aldous 1983). For any coupling of two copies of the Markov chain with (started from stationarity), define the coupling time . Then .
The coupling inequality is more general than the SST bound (it applies to any coupling, not just SSTs) but typically gives weaker bounds because the coupling time need not be an SST.
Theorem 4 (Random transposition SST and cutoff). The SST "every card touched" for random transpositions on gives where is the coupon collector time with batch size 2. The separation distance achieves cutoff at , matching the TV cutoff from the Upper Bound Lemma. In fact, separation cutoff and TV cutoff occur simultaneously for this walk.
Theorem 5 (Separation vs TV — separation can be much larger). For some walks, the separation distance is much larger than the TV distance . For the lazy random walk on with prime: at rate , but can be bounded below by , which is much slower when is large. The SST approach overestimates the mixing time for chains without cutoff.
Theorem 6 (SST for riffle shuffles — leakage argument). There is no known simple SST for the riffle shuffle. The best bounds come from the Fourier-analytic Upper Bound Lemma rather than from strong stationary times. This contrasts with the top-to-random shuffle and the random transposition walk, where SSTs give sharp results.
Synthesis. Strong stationary times provide the foundational reason that certain random walks admit sharp mixing bounds via probabilistic coupling. The central insight is that the independence condition in the SST definition — the position at time is uniform and independent of — forces the distribution to be exactly stationary at a random time, and the separation distance measures how long this takes. Putting these together with the duality construction, the separation distance is identified with the tail probability of the optimal dual chain hitting its maximum, converting the mixing problem into a first-passage problem on a poset. This is exactly the content that generalises from the top-to-random shuffle (where the dual is the coupon collector) to arbitrary walks with known SST constructions. The bridge is between the probabilistic coupling framework and the analytic separation distance; the pattern recurs across walks where a natural "randomisation threshold" can be identified, and identifies the mixing time with the coupon collector time of the dual chain. The strong stationary time approach complements the Fourier-analytic Upper Bound Lemma, and for walks with cutoff the two methods agree.
Full proof set [Master]
Proposition 1 (Top-to-random SST). The time at which the original bottom card in the top-to-random shuffle first reaches the top and is then reinserted at a random position is a strong stationary time.
Proof. Index the positions from top to bottom. The original bottom card starts at position . Call a card "good" if it is uniformly distributed among all cards below it in the deck. Initially, all cards are good vacuously.
When the top card is removed and reinserted at position : if , the top card returns to the top and nothing changes. If , the top card is placed at position and all cards in positions shift up one position. The key claim: if the bottom cards were uniformly distributed before this step, then after the step the bottom cards are uniformly distributed.
This holds because the inserted card is placed at a uniformly random position among all positions. If it lands among the bottom positions, the randomised block grows by 1. By induction, once (all cards randomised), the permutation is uniform.
The time when is exactly the time when the original bottom card reaches position 1 (the top) and is then reinserted. At this moment all cards are randomised, so is uniform. The independence of and follows because the distribution of the permutation at the randomisation threshold does not depend on when the threshold is reached.
Proposition 2 (Coupling inequality). For any coupling of two copies of the Markov chain with :
where .
Proof. For any event :
On the event , the two chains have coupled so , contributing 0 to the difference. On :
Taking the supremum over : .
Connections [Master]
Non-abelian Fourier transform
07.01.09. The Fourier-analytic bounds from the Upper Bound Lemma complement the SST approach: the Fourier transform gives TV bounds directly, while the SST gives separation bounds. For walks with cutoff, both methods converge to the same mixing time.Symmetric group representation
07.05.01. The irreducible representations of indexed by partitions control the eigenvalues of the shuffle transition matrix, which in turn determine the rate at which the SST dual chain approaches its maximum. The character values on cycle types developed in07.05.01appear in both the Fourier-analytic and SST approaches.Regular representation
07.01.05. The random walk on a group acts by convolution on the group algebra, which is the regular representation. The spectral decomposition of the regular representation into irreducibles, described in07.01.05, determines the eigenvalues that control both the SST bound and the Upper Bound Lemma bound.Character orthogonality
07.01.04. The orthogonality relations of07.01.04ensure that the Fourier-analytic bound isolates the non-principal representations whose decay rates determine the mixing time, complementing the SST bound which captures the same mixing through a probabilistic stopping time.
Historical & philosophical context [Master]
Aldous introduced strong stationary times in his 1983 paper [Aldous1983] on random walks on finite groups, establishing the coupling inequality and the connection to separation distance. Aldous and Diaconis 1986 [AldousDiaconis1986] applied SSTs to card shuffling, proving cutoff for the top-to-random shuffle and the random transposition walk using the coupon collector analysis of the dual chain.
Aldous and Diaconis 1987 [AldousDiaconis1987] developed the full theory of strong uniform times (their term for SSTs) in Advances in Applied Mathematics, proving the duality theorem and the characterisation of separation distance as the infimum over SSTs. Fill 1998 refined the duality construction, and Diaconis and Griffiths 2018 provided a modern treatment. Levin, Peres, and Wilmer 2009 systematised the coupling and SST theory in their monograph Markov Chains and Mixing Times.
Bibliography [Master]
@incollection{Aldous1983,
author = {Aldous, David},
title = {Random walks on finite groups and rapidly mixing Markov chains},
booktitle = {Seminaire de Probabilites XVII},
series = {Lecture Notes in Mathematics},
volume = {986},
publisher = {Springer},
year = {1983},
pages = {243--297},
}
@article{AldousDiaconis1986,
author = {Aldous, David and Diaconis, Persi},
title = {Shuffling cards and stopping times},
journal = {Amer. Math. Monthly},
volume = {93},
year = {1986},
pages = {333--348},
}
@article{AldousDiaconis1987,
author = {Aldous, David and Diaconis, Persi},
title = {Strong uniform times and finite random walks},
journal = {Adv. Appl. Math.},
volume = {8},
year = {1987},
pages = {69--97},
}
@book{LevinPeresWilmer2009,
author = {Levin, David A. and Peres, Yuval and Wilmer, Elizabeth L.},
title = {Markov Chains and Mixing Times},
publisher = {American Mathematical Society},
year = {2009},
}
@article{Fill1998,
author = {Fill, James Allen},
title = {An interruptible algorithm for perfect sampling via Markov chains},
journal = {Ann. Appl. Probab.},
volume = {8},
year = {1998},
pages = {131--162},
}
@article{DiaconisGriffiths2018,
author = {Diaconis, Persi and Griffiths, Robert},
title = {Column deletion and shuffling cards},
journal = {Ann. Appl. Probab.},
volume = {28},
year = {2018},
pages = {2170--2191},
}