Bayesian Rank-Clustering

Published online by Cambridge University Press:  16 June 2025

Michael Pearce*
Affiliation:
Department of Mathematics and Statistics, Reed College, Portland, OR, USA
Elena A. Erosheva
Affiliation:
Department of Statistics, School of Social Work, and the Center for Statistics and the Social Sciences, University of Washington, Seattle, WA, USA
* Corresponding author: Michael Pearce; Email: michaelpearce@reed.edu

Abstract

This article proposes a new statistical model to infer interpretable population-level preferences from ordinal comparison data. Such data are ubiquitous, e.g., ranked choice votes, top-10 movie lists, and pairwise sports outcomes. Traditional statistical inference on ordinal comparison data results in an overall ranking of objects, e.g., from best to worst, with each object having a unique rank. However, the ranks of some objects may not be statistically distinguishable, either because the data are insufficient or because the true underlying object qualities are equal. Because communicating uncertainty in estimates of overall rankings is notoriously difficult, we take a different approach and allow groups of objects to have equal ranks, i.e., to be rank-clustered, in our model. Existing models related to rank-clustering are limited by their inability to handle a variety of ordinal data types, their inability to quantify uncertainty, or the need to pre-specify the number and size of potential rank-clusters. We address these limitations through our proposed Bayesian Rank-Clustered Bradley–Terry–Luce (BTL) model. We accommodate rank-clustering via parameter fusion by imposing a novel spike-and-slab prior on object-specific worth parameters in the BTL family of distributions for ordinal comparisons. We demonstrate rank-clustering on simulated and real datasets in surveys, elections, and sports analytics.
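
As a rough illustration of what equal worth parameters imply, the sketch below evaluates the standard Bradley-Terry pairwise probability and the standard Plackett-Luce ranking probability. It is not the authors' implementation: the function names and the worth values are hypothetical, and the prior and posterior sampling described in the abstract are not shown.

```python
def pairwise_prob(w_i, w_j):
    """Bradley-Terry probability that object i is preferred to object j."""
    return w_i / (w_i + w_j)


def ranking_prob(worths, ranking):
    """Plackett-Luce probability of a complete ranking, listed best to worst.

    `worths` maps each object to a positive worth parameter.
    """
    remaining = sum(worths[j] for j in ranking)  # total worth still unranked
    prob = 1.0
    for j in ranking:
        prob *= worths[j] / remaining  # choose j among the objects left
        remaining -= worths[j]
    return prob


# Hypothetical worths: B and C share one value, i.e., they are rank-clustered.
worths = {"A": 0.5, "B": 0.25, "C": 0.25}

p_bc = pairwise_prob(worths["B"], worths["C"])   # equal worths: no preference
p_abc = ranking_prob(worths, ["A", "B", "C"])
p_acb = ranking_prob(worths, ["A", "C", "B"])    # B and C are exchangeable
```

When two objects have the same worth, neither is preferred in a pairwise comparison, and swapping them in a ranking leaves its probability unchanged. This is the sense in which fused worth parameters correspond to tied ranks.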

Information

Type
Theory and Methods
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Psychometric Society

Figure 1 Joint distribution of $(\omega _1,\omega _2)$ under the PSSF prior with varying combinations of $f_G$ and $f_\nu $. Note: In all cases, $\mathcal {J}=\{1,2\}$, and plots show 20,000 sampled values with marginal density estimates along the axes. Rows correspond to the choice of $f_\nu $ and columns to $f_G$.

Figure 2 Distribution of $\omega _2-\omega _1$ under the PSSF prior with varying combinations of $f_G$ and $f_\nu $. Note: In all cases, $\mathcal {J}=\{1,2\}$. Rows correspond to the choice of $f_\nu $ and columns to $f_G$.

Table 1 Simulation settings for $\omega _0$ under varying numbers of true rank-clusters, K

Figure 3 Boxplots of posterior mean absolute error for $\omega _0$ across combinations of the number of judges I, true number of rank-clusters K, hyperparameter $\lambda $, number of ranked objects R, and number of assessed objects S. Note: Errors are calculated after normalization of posterior samples such that $\sum _{j}\omega _{0j}=1$.

Figure 4 Boxplots of the mean posterior probability of rank-clustering object pairs which are truly rank-clustered (left) or independent (right) across combinations of I, K, $\lambda $, R, and S.

Table 2 Summary of applications by subsection

Figure 5 Primary results from Rank-Clustered BTL analysis of Tohoku sushi data. Note: Left: Posterior rank-clustering probabilities. Main diagonal displays posterior median estimate of worth parameter after normalization. Red squares indicate maximum a posteriori rank-clusters. Right: Posterior distributions of sushi-specific worth parameters.

Figure 6 Primary results from Rank-Clustered BTL analysis of mayoral votes. Note: Party abbreviations are in parentheses after candidate surnames. Left: Posterior rank-clustering probabilities. Main diagonal displays posterior median estimate of worth parameter after normalization. Red squares indicate maximum a posteriori rank-clusters. Right: Posterior distributions of candidate-specific worth parameters.

Figure 7 Comparison of estimated rank for each candidate across four aggregation methods: Ranked Choice, First-Past-the-Post (FPP), BTL, and Rank-Clustered BTL (RC BTL). Note: Candidates are ordered by their rank in the actual ranked choice election.

Figure 8 Primary results from Rank-Clustered BTL analysis of Eurobarometer 34.1 data. Note: Left: Posterior rank-clustering probabilities. Main diagonal displays posterior median estimate of worth parameter after normalization. Red squares indicate maximum a posteriori rank-clusters. Right: Posterior distributions of policy-specific worth parameters.

Figure 9 Primary results from Rank-Clustered BTL analysis of 2023–2024 NBA data. Note: Left: Posterior rank-clustering probabilities. Main diagonal displays posterior median estimate of worth parameter after normalization. Right: Posterior distributions of team-specific worth parameters.

Table A.1 Posterior predictive p-values based on a standard BTL and Rank-Clustered BTL (RC-BTL) to assess goodness-of-fit in the Sushi data analysis

Figure A.1 Stacked bar charts of ranks received by each sushi type.

Figure A.2 Comparison of results among comparator methods for the Sushi data analysis.

Figure A.3 Trace plot of K in the Sushi data analysis.

Figure A.4 Trace plots of $\omega $ in the Sushi data analysis.

Figure A.5 Number of votes by rank level and candidate. Candidates are ordered by their position in the official ranked choice election. Note: Acronyms on the tops of bars represent each candidate’s political party.

Table A.2 Posterior predictive p-values based on a standard BTL and Rank-Clustered BTL (RC-BTL) to assess goodness-of-fit in the Minneapolis mayoral election data analysis

Figure A.6 Trace plots of K in the Minneapolis mayoral election data analysis.

Figure A.7 Trace plots of $\omega $ in the Minneapolis mayoral election data analysis.

Figure A.8 Stacked bar charts of ranks received by each policy option.

Table A.3 Posterior predictive p-values based on a standard BTL and Rank-Clustered BTL (RC-BTL) to assess goodness-of-fit in the Eurobarometer survey data analysis

Figure A.9 Comparison of results among comparator methods for the Eurobarometer survey data analysis.

Figure A.10 Trace plot of K in the Eurobarometer survey data analysis.

Figure A.11 Trace plot of $\omega $ in the Eurobarometer survey data analysis.

Figure A.12 Stacked bar charts of ranks received by each NBA team across the 2023–2024 season. Note: Winning = rank 1; losing = rank 2.

Figure A.13 Comparison of results among comparator methods for the 2023–2024 NBA season analysis.

Table A.4 Posterior predictive p-values based on a standard BTL and Rank-Clustered BTL (RC-BTL) to assess goodness-of-fit in the 2023–2024 NBA season analysis

Figure A.14 Trace plot of K in the 2023–2024 NBA season analysis.

Figure A.15 Trace plot of $\omega $ in the 2023–2024 NBA season analysis.

Supplementary material: File

Pearce and Erosheva supplementary material