The Bradley–Terry Regression Trunk approach for Modeling Preference Data with Small Trees

Published online by Cambridge University Press: 01 January 2025

Alessio Baldassarre
Affiliation:
University of Cagliari
Elise Dusseldorp
Affiliation:
Leiden University
Antonio D’Ambrosio*
Affiliation:
University of Naples Federico II
Mark de Rooij
Affiliation:
Leiden University
Claudio Conversano
Affiliation:
University of Cagliari
* Correspondence should be made to Antonio D’Ambrosio, University of Naples Federico II, Naples, Italy. Email: antdambr@unina.it

Abstract

This paper introduces the Bradley–Terry regression trunk model, a novel probabilistic approach for the analysis of preference data expressed through paired comparison rankings. In some cases, it may be reasonable to assume that the preferences expressed by individuals depend on their characteristics. Within the framework of tree-based partitioning, we specify a tree-based model estimating the joint effects of subject-specific covariates over and above their main effects. We therefore combine a tree-based model and the log-linear Bradley–Terry model, using the outcome of the comparisons as the response variable. The proposed model provides a solution to discover interaction effects when no a priori hypotheses are available. It produces a small tree, called a trunk, that represents a fair compromise between a simple interpretation of the interaction effects and an easy-to-read partition of judges based on their characteristics and the preferences they have expressed. We present an application on a real dataset following two different approaches, and a simulation study to test the model’s performance. Simulations showed that model performance improves as the number of rankings and the number of objects increase. In addition, performance improves considerably when the judges’ characteristics have a high impact on their choices.

Information

Type
Theory and Methods
Creative Commons
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Copyright
Copyright © 2022 The Author(s)

Figure 1. Flowchart of the STIMA algorithm implementing the BTRT model for preference data.

Table 1. Simulated values of $\beta_i$ for the estimation of the pruning parameter c.

Table 2. Results first scenario: type I error. Error higher than 0.05 in boldface.

Table 3. Results second scenario: type I error. Error higher than 0.05 in boldface.

Table 4. Results third scenario: test’s power (1 − type II error). Power lower than 0.80 in boldface.

Table 5. Descriptive statistics of the subject-specific covariates in the application.

Table 6. Pruned regression trunk: OSO approach. The table shows the node in which the split is found, the splitting covariate, and its split point together with the deviance associated with each estimated model.

Table 7. 10-fold cross-validation results with OSO approach: $D$ = model deviance (Eq. 10); $D^{cv}$ = casewise cross-validation deviance (Eq. 14); $SE^{cv}$ = standard error of $D^{cv}$ (Eq. 15).

Figure 2. Pruned regression trunk: OSO approach.

Table 8. Pruned regression trunk: MS approach. The table shows the node in which the split is found, the splitting covariate, and its split point together with the deviance associated with each estimated model.

Table 9. 10-fold cross-validation results with MS approach: $D$ = model deviance (Eq. 10); $D^{cv}$ = casewise cross-validation deviance (Eq. 14); $SE^{cv}$ = standard error of $D^{cv}$ (Eq. 15).

Figure 3. Comparison between OSO and MS approaches.

Figure 4. Pruned regression trunk: MS approach.

Table 10. MS regression trunk final output: the table shows the estimated coefficients associated with the objects $o_1$, $o_2$, $o_3$, and $o_4$. The last object $o_5$ is set as the reference level, so that the estimated parameters ${\hat{\lambda}}_{o_5,h}$ (the professor helpfulness) are automatically set to zero. The standard errors are shown in parentheses. There are two standard errors for each parameter: the first is the standard error from the Poisson regression; the second is corrected for the detected overdispersion, which is equal to 1.25.
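The overdispersion correction mentioned in the caption of Table 10 can be illustrated with a minimal sketch. It assumes, as is common for quasi-Poisson fits, that the reported value 1.25 is the estimated dispersion parameter and that a corrected standard error is the Poisson standard error multiplied by the square root of that value; the source does not state the exact formula. The function name `corrected_se` and the standard error used below are hypothetical and are not taken from Table 10.

```python
import math


def corrected_se(poisson_se: float, dispersion: float) -> float:
    """Scale a Poisson standard error by the square root of the dispersion.

    Assumption: quasi-Poisson style correction, SE_corrected = SE * sqrt(phi).
    """
    if dispersion <= 0:
        raise ValueError("dispersion must be positive")
    return poisson_se * math.sqrt(dispersion)


# Hypothetical standard error, for illustration only (not a value from Table 10).
example_se = 0.10
example_corrected = corrected_se(example_se, dispersion=1.25)
print(round(example_corrected, 4))
```

Under this assumption, a dispersion of 1.25 inflates every standard error by roughly 12%, which widens the confidence intervals without changing the point estimates.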

Table 11. Design matrix with one judge and three objects: the first column indicates the number of times a specific preference is expressed for each pair of objects $ij$. The second column, the parameter $\mu$, serves as an index for the $n \times (n-1)/2$ comparisons. Finally, preferences are expressed in the last three columns. For example, the first line shows that object B is preferred to A since $y_{ij} = 1$, $\lambda_B^O = 1$, and $\lambda_A^O = -1$.
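The layout described in the caption of Table 11 can be sketched as follows. This is only an illustration under stated assumptions: each pair of objects contributes two rows (one per possible outcome), the preferred object is coded +1 and the other object −1, and the count $y$ is 1 when the outcome agrees with the judge's ranking. The function name `llbt_design`, the order of the rows within a pair, and the ranking B > C > A are assumptions chosen so that the first row matches the example in the caption; they are not taken from the table itself.

```python
from itertools import combinations


def llbt_design(objects, ranking):
    """Build design rows for one judge.

    objects: list of object labels, e.g. ["A", "B", "C"].
    ranking: the same labels ordered from most to least preferred.
    """
    position = {obj: pos for pos, obj in enumerate(ranking)}
    rows = []
    # mu indexes the n * (n - 1) / 2 pairwise comparisons.
    for mu, (a, b) in enumerate(combinations(objects, 2), start=1):
        # Two rows per pair: "b preferred to a", then "a preferred to b".
        for winner, loser in ((b, a), (a, b)):
            row = {"y": int(position[winner] < position[loser]), "mu": mu}
            for obj in objects:
                row[f"lambda_{obj}"] = 0
            row[f"lambda_{winner}"] = 1
            row[f"lambda_{loser}"] = -1
            rows.append(row)
    return rows


# Assumed ranking B > C > A, used only for illustration.
rows = llbt_design(["A", "B", "C"], ranking=["B", "C", "A"])
for row in rows:
    print(row)
```

With three objects there are three comparisons and therefore six rows, of which exactly three have $y = 1$ for a single complete ranking.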

Table 12. Design matrix with two judges, three objects, and one continuous subject-specific covariate: the first column indicates the number of times a specific preference is expressed for each pair of objects $ij$. The second column serves as an index for the $n \times (n-1)/2$ comparisons. Preferences are expressed in the next three columns, and finally the age covariate is shown in the last column. In this example, the two judges express opposite preferences, BCA and ACB, respectively.
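A minimal sketch of the stacked layout described in the caption of Table 12 follows. It assumes the same ±1 coding as the single-judge case, with one block of rows per judge and the judge's age repeated on each of that judge's rows. The rankings BCA and ACB come from the caption; the helper `judge_rows`, the row order within each pair, and the two ages are hypothetical. The sketch only lays out the columns and does not fit any model.

```python
from itertools import combinations

OBJECTS = ["A", "B", "C"]


def judge_rows(ranking, age):
    """Return rows (y, mu, lambda_A, lambda_B, lambda_C, age) for one judge."""
    rank = {obj: pos for pos, obj in enumerate(ranking)}
    rows = []
    for mu, (a, b) in enumerate(combinations(OBJECTS, 2), start=1):
        # Two rows per pair: "b preferred to a", then "a preferred to b".
        for winner, loser in ((b, a), (a, b)):
            lam = [0] * len(OBJECTS)
            lam[OBJECTS.index(winner)] = 1
            lam[OBJECTS.index(loser)] = -1
            y = int(rank[winner] < rank[loser])
            rows.append((y, mu, *lam, age))
    return rows


# Rankings from the caption; the ages are hypothetical illustration values.
judge_1 = judge_rows("BCA", age=25)
judge_2 = judge_rows("ACB", age=40)
stacked = judge_1 + judge_2

for row in stacked:
    print(row)
```

Because ACB is the reverse of BCA, every outcome observed for the first judge is unobserved for the second, so the two $y$ columns are complementary row by row.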

Table 13. Full regression trunk: OSO approach. The table shows the node in which the split is found, the splitting covariate, and its split point together with the deviance associated with each estimated model.

Figure 5. Full regression trunk: OSO approach.

Table 14. Full regression trunk: MS approach. The table shows the node in which the split is found, the splitting covariate, and its split point together with the deviance associated with each estimated model.

Figure 6. Full regression trunk: MS approach.