
A Two-Stage Logistic Regression Model for Analyzing Inter-Rater Agreement

Published online by Cambridge University Press: 01 January 2025

Stuart R. Lipsitz
Affiliation:
Department of Biometry and Epidemiology, Medical University of South Carolina, Charleston
Michael Parzen
Affiliation:
Graduate School of Business, University of Chicago
Garrett M. Fitzmaurice*
Affiliation:
Department of Biostatistics, Harvard School of Public Health
Neil Klar
Affiliation:
Division of Preventive Oncology, Cancer Care Ontario
* Requests for reprints should be sent to Garrett Fitzmaurice, Department of Biostatistics, Building II, 655 Huntington Ave., Rm. 423, Boston MA 02115. E-Mail: fitzmaur@hsph.harvard.edu

Abstract

Studies of agreement commonly occur in psychiatric research. For example, researchers are often interested in the agreement among radiologists in their review of brain scans of elderly patients with dementia or in the agreement among multiple informant reports of psychopathology in children. In this paper, we consider the agreement between two raters when rating a dichotomous outcome (e.g., presence or absence of psychopathology). In particular, we consider logistic regression models that allow agreement to depend on both rater- and subject-level covariates. Logistic regression has been proposed as a simple method for identifying covariates that are predictive of agreement (Coughlin et al., 1992). However, this approach is problematic since it does not take account of agreement due to chance alone. As a result, a spurious association between the probability (or odds) of agreement and a covariate could arise due entirely to chance agreement. That is, if the prevalence of the dichotomous outcome varies among subgroups of the population, then covariates that identify the subgroups may appear to be predictive of agreement. In this paper we propose a modification to the standard logistic regression model in order to take proper account of chance agreement. An attractive feature of the proposed method is that it can be easily implemented using existing statistical software for logistic regression. The proposed method is motivated by data from the Connecticut Child Study (Zahner et al., 1992) on the agreement among parent and teacher reports of psychopathology in children. In this study, parents and teachers provide dichotomous assessments of a child's psychopathology and it is of interest to examine whether agreement among the parent and teacher reports is related to the age and gender of the child and to the time elapsed between parent and teacher assessments of the child.
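The chance-agreement problem described in the abstract can be illustrated with a small numerical sketch. This is not the authors' two-stage procedure, which is not detailed on this page; it only shows why raw agreement can track prevalence. It assumes two raters who rate independently of each other (so chance-corrected agreement, Cohen's kappa, is zero), and the two subgroup prevalences (0.05 and 0.50) are hypothetical values chosen for illustration.

```python
import math


def chance_agreement(p_rater1: float, p_rater2: float) -> float:
    """Probability that two independent raters agree on a binary outcome.

    They agree when both say "present" or both say "absent".
    """
    return p_rater1 * p_rater2 + (1 - p_rater1) * (1 - p_rater2)


def kappa(p_observed: float, p_chance: float) -> float:
    """Cohen's kappa: agreement beyond chance, rescaled by its maximum."""
    return (p_observed - p_chance) / (1 - p_chance)


def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1 - p)


# Hypothetical subgroups: same (zero) true agreement, different prevalence.
low_prev = chance_agreement(0.05, 0.05)   # outcome is rare
high_prev = chance_agreement(0.50, 0.50)  # outcome is common

# Log odds ratio of agreement between subgroups. An unadjusted logistic
# regression of "raters agree" on a subgroup indicator estimates this.
naive_log_odds_ratio = math.log(odds(low_prev) / odds(high_prev))

print(f"agreement, low prevalence:  {low_prev:.3f}")
print(f"agreement, high prevalence: {high_prev:.3f}")
print(f"naive log odds ratio:       {naive_log_odds_ratio:.2f}")
print(f"kappa, low prevalence:      {kappa(low_prev, low_prev):.1f}")
print(f"kappa, high prevalence:     {kappa(high_prev, high_prev):.1f}")
```

Under these assumptions the subgroup indicator looks strongly predictive of agreement even though neither subgroup shows any agreement beyond chance, which is the spurious association the abstract warns about.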

Information

Type
Articles
Copyright
Copyright © 2003 The Psychometric Society

Footnotes

The authors thank the Associate Editor and the referees for their helpful comments and suggestions. We also thank Gwen Zahner for use of data from the Connecticut Child Study, which was conducted under contract to the Connecticut Department of Children and Youth Services. This research was supported by grants HL 69800, AHRQ 10871, HL52329, HL61769, GM 29745, MH 54693 and MH 17119 from the National Institutes of Health.

References

Barlow, W. (1996). Measurement of interrater agreement with adjustment for covariates. Biometrics, 52, 695-702.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
Coughlin, S.S., Pickle, L.W., Goodman, M.T., Wilkens, L.R. (1992). The logistic modeling of interobserver agreement. Journal of Clinical Epidemiology, 45, 1237-1241.
Fleiss, J.L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378-382.
Klar, N., Lipsitz, S.R., Ibrahim, J. (2000). An estimating equations approach for modelling kappa. Biometrical Journal, 42, 45-58.
Liang, K.Y., Zeger, S.L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.
Prentice, R.L. (1988). Correlated binary regression with covariates specific to each binary observation. Biometrics, 44, 1033-1048.
Quenouille, M.H. (1956). Notes on bias in estimation. Biometrika, 43, 353-360.
Tukey, J.W. (1958). Bias and confidence in not quite large samples. Annals of Mathematical Statistics, 29, 614.
Welsh, A.H. (1996). Aspects of statistical inference. New York, NY: Wiley.
White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica, 50, 1-25.
Zahner, G.E., Daskalakis, C. (1998). Modeling sources of informant variance in parent and teacher ratings of child psychopathology. International Journal of Methods in Psychiatric Research, 7, 3-16.
Zahner, G.E., Pawelkiewicz, W., DeFrancesco, J.J., Adnopoz, J. (1992). Children's mental health service needs and utilization patterns in an urban community: An epidemiological assessment. Journal of the American Academy of Child and Adolescent Psychiatry, 31, 951-960.