
Quadratically Weighted Agreement Coefficients: Interpretations and Connections

Published online by Cambridge University Press:  20 January 2026

Rutger van Oest*
Affiliation:
Department of Marketing, BI Norwegian Business School, Norway
Jonas Moss
Affiliation:
Department of Data Science and Analytics, BI Norwegian Business School, Norway
*
Corresponding author: Rutger van Oest; Email: rutger.d.oest@bi.no

Abstract

In this study, we review interpretations and connections of chance-corrected agreement coefficients with quadratic weights, applicable when raters classify objects or subjects on an ordinal scale. Whereas correlation interpretations exist for coefficients with noninterchangeable raters, represented by Cohen’s two-rater and Conger’s multirater kappas, interpretations are essentially absent for interchangeable raters, represented by Fleiss’ kappa and its two-rater version, Scott’s pi. We show that Fleiss’ quadratically weighted kappa equals Lin’s generalized concordance correlation coefficient after recentering the rater means and covariance matrix at the grand mean. Furthermore, it equals the Pearson product–moment correlation and associated regression slope for two random raters, computed after concatenating the ratings from all possible rater pairs, including reversed rater order. Next, we demonstrate that Fleiss’ and Conger’s quadratically weighted kappas are linear transformations of each other, entirely determined by the pairwise differences in rater means and corresponding variances. As these kappas coincide if the rater means coincide, the conceptual distinction between interchangeable and noninterchangeable raters becomes empirically irrelevant if raters have (approximately) the same mean rating, even with substantially different rating distributions. Finally, Fleiss’ quadratically weighted kappa (i.e., interchangeable raters) cannot exceed Conger’s (i.e., noninterchangeable raters).
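The concatenation identity stated above can be sketched numerically for the two-rater (Scott's pi) case: after stacking the ratings in both rater orders, the quadratically weighted coefficient for interchangeable raters equals the Pearson correlation of the concatenated columns (and, since both columns have the same variance, also the regression slope). The ratings below are illustrative synthetic data, not the Landis and Koch example.

```python
import numpy as np

# Illustrative ordinal ratings (scale 1-5) from two interchangeable raters.
x = np.array([1, 2, 2, 3, 4, 4, 5, 3, 2, 1], dtype=float)
y = np.array([1, 3, 2, 3, 5, 4, 4, 3, 1, 2], dtype=float)

# Quadratically weighted Scott's pi: one minus the observed mean squared
# disagreement over the disagreement expected for independent draws from
# the pooled marginal distribution, E[(X - Y)^2] = 2 * Var(pooled).
pooled = np.concatenate([x, y])
m = pooled.mean()
observed = np.mean((x - y) ** 2)
expected = 2 * np.mean((pooled - m) ** 2)
pi_qw = 1 - observed / expected

# Pearson correlation after concatenating both rater orders (x, y) and (y, x);
# the two columns then share mean and variance, so slope = correlation.
a = np.concatenate([x, y])
b = np.concatenate([y, x])
r = np.corrcoef(a, b)[0, 1]

print(pi_qw, r)  # identical up to floating-point error
```

Expanding the observed disagreement around the grand mean shows why the two computations agree: both reduce to the pooled covariance divided by the pooled variance.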

Information

Type
Literature Review
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of Psychometric Society

Figure 1 Overview of chance-corrected agreement coefficients and the scope of the present study.

Table 1 Joint and marginal distributions of ratings in Landis and Koch’s multiple sclerosis example with two raters

Figure 2 After concatenation, Scott’s quadratically weighted pi equals the Pearson correlation and regression slope of 0.6182 in Landis and Koch’s two-rater multiple sclerosis example. Scatter plot of original frequencies (left) and combined frequencies with regression line (right).

Figure 3 Correlation interpretations of distribution-based agreement coefficients with quadratic weights.

Table 2 Expressions and connections of distribution-based agreement coefficients with quadratic weights