Models for partially ranked data on S_n/S_{n-k}
Anchor (Master): Diaconis 1988 IMS Lecture Notes 11 Ch. 9; Stanley 1986 Enumerative Combinatorics
Intuition [Beginner]
In many real surveys, people do not rank all items — they rank only their top 3 or top 5 choices. These are partial rankings. The question becomes: how do you do spectral analysis when the data consists of partial rankings instead of full permutations?
The answer uses cosets. A partial ranking of items out of can be viewed as a coset : you specify the top positions, and the remaining items can be in any order. The subgroup permutes the unranked items among themselves. The pair forms a Gelfand pair, meaning the harmonic analysis on this coset space is especially simple.
Why does this concept exist? Because the Gelfand pair structure guarantees that the Fourier analysis on partial rankings decomposes into one-dimensional blocks (indexed by partitions with at most parts), making every computation explicit and interpretable.
Visual [Beginner]
A diagram showing a full ranking of 5 items as a permutation in , with a downward arrow to a partial ranking showing only the top 2 items. The partial ranking corresponds to a coset containing all permutations with the same top 2, represented as a set of 6 permutations grouped together.
The projection map from full permutations to partial rankings collapses each coset (all permutations sharing the same top-) into a single point.
Worked example [Beginner]
Consider items with partial rankings showing only the top positions. The coset space has elements.
Step 1. A respondent picks item 3 as best and item 1 as second. This corresponds to the coset of all permutations with and . The remaining items 2 and 4 can be in either order in positions 3 and 4, so this coset contains exactly 2 permutations: and .
Step 2. There are possible partial rankings (top 2 out of 4). Under the uniform distribution, each has probability .
Step 3. The first-order spectral component on this coset space is indexed by the partition , just like in the full ranking case. It detects which items are disproportionately likely to appear in the top 2 positions.
What this tells us: the spectral analysis of partial rankings mirrors the full ranking analysis, with the Gelfand pair structure ensuring that only partitions with at most parts contribute nonzero spectral components.
Check your understanding [Beginner]
Formal definition [Intermediate+]
Let . The subgroup is the subgroup permuting the elements among themselves while fixing . A partial ranking of the top items is a coset for some , where specifies the ordering of the top positions.
The coset space has elements. A function on partial rankings lifts to a right--invariant function via .
Definition (Gelfand pair). The pair of a finite group and a subgroup is a Gelfand pair if the algebra of -bi-invariant functions on (i.e., functions satisfying for all ) is commutative under convolution.
Definition (Spherical function). For a Gelfand pair , the zonal spherical functions are the distinguished basis of the -bi-invariant function algebra, one for each irreducible representation of that appears in . They satisfy:
where is the character of the -th irreducible constituent.
For , the spherical functions are indexed by partitions of with at most parts, and are expressible in terms of the irreducible characters of via the branching rule from to .
Counterexamples to common slips
- Every pair is not a Gelfand pair. The Gelfand pair condition requires the double coset algebra to be commutative, which is a strong constraint. For example, is a Gelfand pair but is not (the full group algebra is not commutative for ).
- Partitions with more than parts do not contribute. The spectral decomposition of involves only irreducible representations indexed by partitions with (at most parts). This is a consequence of the Frobenius reciprocity and the branching rule.
- Spherical functions are not characters. The zonal spherical functions are averages of characters over the subgroup , which produces a different set of functions with different orthogonality relations.
Key theorem with proof [Intermediate+]
Theorem (Spectral decomposition on partial rankings — Diaconis 1989). Let be the Gelfand pair for partial rankings of the top out of items. Then decomposes as an -module:
and the decomposition is multiplicity-free. For a probability distribution on , the spherical Fourier transform for each partition with gives a complete spectral description of , and is reconstructed by:
Proof. The proof has four steps.
Step 1 (Induced representation). The space is isomorphic to the induced representation , where is the identity representation of .
Step 2 (Decomposition by Frobenius reciprocity). By Frobenius reciprocity, the multiplicity of in equals the multiplicity of in . By the branching rule for restricting from to , the identity representation of appears in if and only if , and in that case the multiplicity is exactly 1.
Step 3 (Multiplicity-free). Since each multiplicity is 0 or 1, the decomposition is multiplicity-free. This is exactly the Gelfand pair condition: the algebra of -bi-invariant functions is commutative because the decomposition has no repeated irreducibles.
Step 4 (Spherical inversion). The zonal spherical functions for form an orthogonal basis of the -bi-invariant function space. The Fourier coefficient extracts the projection of onto , and the inversion formula reconstructs from these coefficients.
Bridge. The spectral decomposition on partial rankings builds toward a unified statistical framework for any level of incompleteness in the data, and appears again in 07.05.11 as the special case (full rankings recover with all partitions). The foundational reason the Gelfand pair structure matters is that multiplicity-free decomposition guarantees unique spherical functions, which generalises the orthonormality of characters in the full group case. This is exactly the content that identifies partial ranking analysis as a clean sub-theory of the full spectral analysis; the bridge is that the branching rule for controls which irreducible representations survive the projection from full to partial rankings, and putting these together with the zonal spherical functions, every statistical question about partially ranked data reduces to a finite computation indexed by partitions with at most parts.
Exercises [Intermediate+]
Advanced results [Master]
Theorem 1 (Diaconis 1989: multiplicity-free decomposition via branching). The decomposition of into irreducible -modules is indexed by partitions with , each with multiplicity 1. The partitions are enumerated by restricting the Young diagram to at most rows.
This was proved by Diaconis 1989 using the branching rule and Frobenius reciprocity. The multiplicity-free property is equivalent to the Gelfand pair condition.
Theorem 2 (Zonal spherical functions for ). The zonal spherical functions are given by:
for partitions with . These functions form an orthogonal basis of the -bi-invariant function space with .
Theorem 3 (Spectral analysis of the Mallows model on partial rankings). For the Mallows model with the induced Cayley distance, the spherical Fourier coefficients have closed-form expressions:
where depends on the character values of on permutations with cycles moved within the top- positions.
Theorem 4 (Diaconis 1988: random walks on coset spaces). A random walk on driven by a -bi-invariant measure mixes in time determined by the spectral gap, which is the second-largest eigenvalue of the spherical Fourier transform of . The mixing time is where is the largest non-principal spherical eigenvalue.
Theorem 5 (Connection to Jack polynomials). The zonal spherical functions for are expressible as specialisations of Jack polynomials with parameter , evaluated at the eigenvalues of the appropriate coset-type matrices. This connects the spectral analysis of partial rankings to the broader theory of symmetric functions and hypergeometric functions of matrix argument.
This connection was developed by James 1978 and Macdonald 1995 in Symmetric Functions and Hall Polynomials.
Theorem 6 (Ewens sampling formula and partial rankings). The Ewens sampling formula on , which assigns probability proportional to where is the number of cycles, has a natural interpretation as a Mallows model with Cayley distance. The induced distribution on partial rankings has spherical Fourier coefficients determined by the Stirling numbers.
Synthesis. The spectral analysis of partially ranked data via Gelfand pairs provides the foundational reason that the representation-theoretic framework extends from full to partial rankings without losing its clean structure. The central insight is that the branching rule for selects exactly the partitions with at most parts, and the Gelfand pair condition guarantees that the selected irreducibles appear without multiplicity. Putting these together with the zonal spherical functions, every statistical procedure for full rankings (spectral decomposition, testing, estimation) has a partial-ranking analogue obtained by restricting the spectral sum to partitions with at most rows. This is exactly the content that builds toward the exchangeability framework in 07.05.14 where the symmetric group acts on sequences and the spectral decomposition detects dependence structures. The bridge is between the combinatorics of Young diagrams with bounded row count and the statistics of top- rankings; the pattern generalises from (full rankings, all partitions) to (only the identity and standard representations survive), identifying the spectral components that vanish as one passes from full to partial data.
Full proof set [Master]
Proposition 1 (Branching rule and multiplicity-free condition). The multiplicity of the identity representation of in equals 1 if and 0 otherwise.
Proof. Apply the branching rule iteratively. The restriction decomposes where the sum is over removable corners of the Young diagram of . After steps, the identity representation of (indexed by the single-row partition ) appears in iff there is a path from to in the branching lattice. Such a path exists iff we can remove boxes from to reach a single row of length , which requires removing at most one box from each row, hence . When , the path is unique (remove boxes from the bottom row upward), giving multiplicity 1.
Proposition 2 (Orthogonality of zonal spherical functions). The zonal spherical functions for satisfy:
Proof. The spherical functions are obtained from the matrix coefficients of the irreducible representations by averaging over . By the Schur orthogonality relations and the fact that the decomposition is multiplicity-free, the -averaged matrix coefficients for distinct irreducibles are orthogonal. The norm is computed from the inner product formula:
The double sum over collapses by orthogonality to . Adjusting for the normalisation gives the stated result.
Connections [Master]
Spectral analysis of permutation data
07.05.11. The partial ranking framework is a direct extension of the full spectral analysis in07.05.11. Setting recovers the full spectral decomposition of with all partitions; the Gelfand pair perspective specialises to the group algebra perspective when .Metrics on the symmetric group
07.05.12. The metrics on from07.05.12extend to partial rankings via the quotient metric: . The character-theoretic expressions for these quotient metrics use the same spherical functions that govern the spectral decomposition.Random walk upper bound lemma
07.05.05. The Upper Bound Lemma generalises to coset spaces: the total variation distance between a random walk on and the uniform distribution is bounded by the spherical Fourier coefficients, paralleling the character-sum bound in07.05.05but using zonal spherical functions instead of irreducible characters.De Finetti and exchangeability
07.05.14. The Gelfand pair structure that makes partial ranking analysis clean is the same symmetric group action that underlies exchangeability. The exchangeability theorem of de Finetti identifies distributions invariant under , and the partial ranking analysis identifies distributions invariant under the larger subgroup acting on cosets.
Historical & philosophical context [Master]
Diaconis developed the spectral analysis of partially ranked data in his 1989 paper A Generalization of Spectral Analysis [Diaconis1989], identifying the Gelfand pair as the structural reason the analysis remains explicit. The Gelfand pair theory was developed by Gelfand in the 1950s in the context of spherical functions on Lie groups, and adapted to finite groups by Curtis and Reiner, and by Delsarte 1973 in the context of association schemes.
The connection to Jack polynomials and symmetric function theory was developed by James 1978 and Macdonald 1995 [Macdonald1995] in Symmetric Functions and Hall Polynomials, who showed that the zonal spherical functions for are special cases of the more general framework of zonal polynomials. Marden 1995 [Marden1995] provided the comprehensive statistical treatment in Analyzing and Modeling Rank Data, covering both full and partial ranking models.
Bibliography [Master]
@article{Diaconis1989partial,
author = {Diaconis, Persi},
title = {A Generalization of Spectral Analysis},
journal = {J. Amer. Statist. Assoc.},
volume = {84},
year = {1989},
pages = {694--701},
}
@book{Diaconis1988partial,
author = {Diaconis, Persi},
title = {Group Representations in Probability and Statistics},
publisher = {Institute of Mathematical Statistics},
year = {1988},
series = {IMS Lecture Notes--Monograph Series},
volume = {11},
}
@book{Marden1995partial,
author = {Marden, John I.},
title = {Analyzing and Modeling Rank Data},
publisher = {Chapman \& Hall},
year = {1995},
}
@book{Macdonald1995,
author = {Macdonald, I. G.},
title = {Symmetric Functions and Hall Polynomials},
publisher = {Oxford University Press},
year = {1995},
edition = {2nd},
}