07.05.11 · representation-theory / symmetric

Spectral analysis of permutation-valued data

shipped3 tiersLean: none

Anchor (Master): Diaconis 1989 J. Amer. Statist. Assoc. 84; Marden 1995 Analyzing and Modeling Rank Data

Intuition [Beginner]

Imagine you collect survey responses where people rank five items from best to worst. Each person's response is a permutation — a reordering of the items. You have hundreds of such rankings and want to find the main "patterns" in the data.

Fourier analysis on the symmetric group does for permutation data what ordinary Fourier analysis does for time-series data: it decomposes a complicated signal into simple frequency components. Instead of sines and cosines, the "frequencies" are the irreducible representations of the symmetric group, indexed by partitions. Each component captures a different layer of structure in the ranking data.

Why does this concept exist? Because classical statistical tools assume numerical data, but rankings are permutations — and the representation theory of the symmetric group provides the natural harmonic analysis for functions defined on permutations.

Visual [Beginner]

A diagram showing a bar chart of ranking data on three items (six possible rankings) being decomposed into three spectral components. The first component is the uniform average; the second captures "item 1 tends to be ranked first"; the third captures a pairwise swap effect.

A six-bar histogram of rankings of three items, with arrows pointing to three spectral components: a flat uniform level, a linear tilt along one axis, and a checkerboard pairwise pattern.

The uniform component gives the overall popularity; the second-order component identifies which items are consistently preferred; higher components capture finer interactions.

Worked example [Beginner]

Consider ranking data on items. There are possible rankings. Imagine we observe the following counts from 60 respondents: the identity ranking (1,2,3) appears 20 times, the swap (2,1,3) appears 10 times, and each of the remaining four rankings appears 7 or 8 times.

Step 1. Convert counts to a probability distribution on the six permutations. For the identity ; for the swap ; for the others roughly .

Step 2. Compute the first spectral component: the "mean" level, which is the uniform distribution giving to each permutation. The deviation measures how far the data strays from uniform.

Step 3. The next spectral component projects onto the standard representation, which detects whether item 1 is systematically ranked too high or too low. Here, and , so the data is concentrated on rankings where item 1 is first.

What this tells us: the spectral decomposition separates the overall level (uniform), the first-order preference effects (item 1 is preferred), and the residual structure, giving a hierarchical summary of the ranking data.

Check your understanding [Beginner]

Formal definition [Intermediate+]

Let be the symmetric group on letters with irreducible representations indexed by partitions , of dimension . Let denote the irreducible representation indexed by .

For a function (e.g., a probability distribution or a data histogram), the Fourier transform of at is the matrix

The spectral decomposition of is the collection of Fourier coefficients at all irreducible representations. The inverse Fourier formula reconstructs from its spectrum:

For ranking data, the key observation is that the isotypic components correspond to interpretable statistical effects. The component at is the overall mean (uniform distribution). The component at captures first-order effects (marginal preferences for individual items). The components at and capture second-order effects (pairwise interactions).

Counterexamples to common slips

  • Confusing Fourier coefficients with eigenvalues. The Fourier coefficient is a matrix, not a scalar. Its eigenvalues carry spectral information, but the coefficient itself is the basic object.
  • Omitting the dimension weighting. The inverse formula weights by , not by 1. For the dimensions vary widely (e.g., the standard representation has ), so this weighting is essential.
  • Assuming abelian Fourier analysis suffices. For abelian groups every irreducible is 1-dimensional and the Fourier transform produces scalars. For the irreducibles have dimensions up to roughly , so the matrix-valued transform carries much more information.

Key theorem with proof [Intermediate+]

Theorem (Spectral decomposition for ranking data — Diaconis 1989). Let be a probability distribution on permutations. Define the first-order projection and the second-order projection by projecting onto the isotypic subspaces corresponding to the partitions and respectively. Then:

and the first-order term equals the sum of squared deviations of the marginal ranking probabilities from uniform:

Proof. The proof has three steps.

Step 1 (Plancherel decomposition). By the Plancherel formula for :

Since (as is a probability measure, its Fourier transform at the identity representation equals 1), the sum starts at .

Step 2 (First-order identification). The standard representation has a concrete realisation: it is the -dimensional subspace of consisting of vectors whose coordinates sum to zero, with acting by permutation of coordinates. The Fourier coefficient can be computed entry by entry:

Using the basis for , the matrix entries of encode whether maps to . The squared Hilbert-Schmidt norm then becomes a sum over the marginal probabilities :

The factor arises from the relationship between the trace inner product on the -dimensional representation and the full permutation matrix entries.

Step 3 (Combine). Separating the term from the rest of the Plancherel sum gives the stated decomposition. The first-order term captures all marginal (single-item) effects; the remaining terms capture all higher-order interactions.

Bridge. This spectral decomposition builds toward the analysis of partial rankings in 07.05.13 where the coset structure requires the representation theory of Gelfand pairs, and appears again in the metric analysis of 07.05.12 where the character-theoretic expressions for distances between permutations emerge from the same Fourier coefficients. The foundational reason the decomposition works is that the group algebra decomposes as a direct product of matrix algebras , which is exactly the Artin-Wedderburn decomposition. The bridge is between statistical data analysis on permutations and the harmonic analysis on developed in 07.05.01; putting these together, every test statistic on ranking data has a representation-theoretic interpretation as a spectral norm.

Exercises [Intermediate+]

Advanced results [Master]

Theorem 1 (Diaconis 1989: spectral analysis generalises ANOVA). The decomposition of into isotypic components generalises the classical ANOVA decomposition for factorial designs. The identity component gives the grand mean; the component gives the main effects; the and components give the two-factor interactions; and so on through the partition lattice.

This was established in Diaconis 1989 J. Amer. Statist. Assoc. 84, where the isomorphism between isotypic components and ANOVA effects was made explicit through the relationship between irreducible characters and the inclusion-exclusion structure of marginal probabilities.

Theorem 2 (Mallows model: explicit Fourier coefficients). For the Mallows model with Kendall tau distance , the Fourier coefficients satisfy where is expressible as a rational function of involving the hook lengths of .

The closed-form expression was given by Diaconis 1989 and developed further by Marden 1995. The coefficient at equals , providing a direct link between the dispersion parameter and the first-order spectral component.

Theorem 3 (Consistency of spectral estimates). If are i.i.d. draws from a distribution on , then the empirical Fourier coefficients satisfy almost surely as , and converges in distribution to a matrix-valued Gaussian.

This follows from the multivariate central limit theorem applied to the matrix entries of , which are bounded random variables.

Theorem 4 (Testing uniformity via spectral components). To test against , the test statistic has a representation-theoretic decomposition , and the first-order term provides the most powerful invariant test against alternatives with first-order structure.

This result appears in Diaconis 1989 and connects to the Neyman-Pearson lemma for group-invariant testing problems.

Theorem 5 (Diaconis 1989: sufficient statistics from spectral components). For the exponential family , where is a matrix parameter for each partition , the collection of empirical Fourier coefficients forms a sufficient statistic.

This is the exponential family on with natural parameters in each isotypic component; sufficiency follows from the factorisation theorem since the log-likelihood depends on the data only through the Fourier coefficients.

Theorem 6 (Spectral analysis of paired comparisons). For paired-comparison data (where subjects choose between pairs of items), the Bradley-Terry model has a spectral interpretation: the Bradley-Terry preference parameters are the eigenvalues of the first-order Fourier coefficient .

This was observed by Marden 1995 and connects classical psychometric models to the spectral framework.

Synthesis. Spectral analysis of permutation-valued data is the foundational reason that representation theory enters statistics through the symmetric group. The central insight is that the Artin-Wedderburn decomposition of identifies the space of real-valued functions on permutations with the direct product of matrix algebras , and putting these together with the Plancherel isomorphism, every statistical question about ranking data has a spectral reformulation. This is exactly the content that builds toward the metric analysis in 07.05.12 where distances between permutations become spectral norms, and appears again in 07.05.13 where partial rankings extend the framework to cosets. The bridge is between the combinatorics of partitions and the statistics of rankings; the pattern generalises from full rankings to partial rankings to paired comparisons, identifying the Fourier transform on as the universal tool for permutation-valued data analysis.

Full proof set [Master]

Proposition 1 (Inverse Fourier formula). For :

Proof. Expand . Then:

By the orthogonality of characters: (column orthogonality of characters at the identity). Hence the right side collapses to .

Proposition 2 (First-order coefficient and marginals). For a probability distribution on , the trace of equals .

Proof. The character of the standard representation at is where is the number of fixed points of . So:

Connections [Master]

  • Random walk upper bound lemma 07.05.05. The spectral decomposition of permutation data is the static counterpart to the dynamic analysis in 07.05.05. The Upper Bound Lemma bounds the distance of a random walk distribution from uniform using the same character sums that appear here as spectral components; the random walk at time has Fourier coefficients , and their decay rates control mixing, while the spectral analysis of data examines the same coefficients for a fixed empirical distribution.

  • Symmetric group representation 07.05.01. The irreducible representations of indexed by partitions, their characters, and the hook-length formula for dimensions developed in 07.05.01 are the raw material for the spectral decomposition. Every Fourier coefficient is computed using the matrix representations and the characters from that unit.

  • Metrics on the symmetric group 07.05.12. The character-theoretic expressions for distances between permutations developed in the next unit are spectral norms of the difference of Fourier coefficients. The Cayley, Hamming, and Kendall tau distances all have representations as weighted character sums, making the spectral decomposition of this unit the common framework unifying distance computations on permutations.

  • Schur-Weyl duality 07.05.04. The decomposition of tensor powers of via Schur-Weyl duality produces the same indexing by partitions that governs the spectral analysis. The isotypic components of are dual to the irreducible -modules in the tensor algebra, linking the statistical framework to the representation theory of the general linear group.

Historical & philosophical context [Master]

Diaconis introduced spectral analysis of permutation-valued data in his 1989 paper A Generalization of Spectral Analysis [Diaconis1989] published in J. Amer. Statist. Assoc. 84. The key insight was that the ANOVA decomposition for factorial designs, which separates main effects from interactions, has a natural generalisation to permutations via the isotypic decomposition of the group algebra of .

The statistical framework was systematised in Diaconis's 1988 monograph Group Representations in Probability and Statistics [Diaconis1988], which placed the spectral analysis alongside the random walk and shuffling results as part of a unified programme applying representation theory to probability and statistics. Marden 1995 extended the framework to a wide variety of ranking models in Analyzing and Modeling Rank Data [Marden1995], including the Mallows model, the Bradley-Terry model, and the Plackett-Luce model.

Bibliography [Master]

@article{Diaconis1989,
  author = {Diaconis, Persi},
  title = {A Generalization of Spectral Analysis},
  journal = {J. Amer. Statist. Assoc.},
  volume = {84},
  year = {1989},
  pages = {694--701},
}

@book{Diaconis1988,
  author = {Diaconis, Persi},
  title = {Group Representations in Probability and Statistics},
  publisher = {Institute of Mathematical Statistics},
  year = {1988},
  series = {IMS Lecture Notes--Monograph Series},
  volume = {11},
}

@book{Marden1995,
  author = {Marden, John I.},
  title = {Analyzing and Modeling Rank Data},
  publisher = {Chapman \& Hall},
  year = {1995},
}