Matching IRT Models to Patient-Reported Outcomes Constructs: The Graded Response and Log-Logistic Models for Scaling Depression

Published online by Cambridge University Press: 01 January 2025

Steven P. Reise*, University of California, Los Angeles
Han Du, University of California, Los Angeles
Emily F. Wong, University of California, Los Angeles
Anne S. Hubbard, University of California, Los Angeles
Mark G. Haviland, Loma Linda University

*Correspondence should be made to Steven P. Reise, Department of Psychology, University of California, Los Angeles, Los Angeles, USA. Email: reise@psych.ucla.edu

Abstract

Item response theory (IRT) model applications extend well beyond cognitive ability testing, and various patient-reported outcomes (PRO) measures are among the more prominent examples. PRO (and like) constructs differ from cognitive ability constructs in many ways, and these differences have model fitting implications. With a few notable exceptions, however, most IRT applications to PRO constructs rely on traditional IRT models, such as the graded response model. We review some notable differences between cognitive and PRO constructs and how these differences can present challenges for traditional IRT model applications. We then apply two models (the traditional graded response model and an alternative log-logistic model) to depression measure data drawn from the Patient-Reported Outcomes Measurement Information System project. We do not claim that one model is “a better fit” or more “valid” than the other; rather, we show that the log-logistic model may be more consistent with the construct of depression as a unipolar phenomenon. Clearly, the graded response and log-logistic models can lead to different conclusions about the psychometrics of an instrument and the scaling of individual differences. We underscore, too, that, in general, explorations of which model may be more appropriate cannot be decided only by fit index comparisons; these decisions may require the integration of psychometrics with theory and research findings on the construct of interest.

Information

Type
Theory and Methods
Creative Commons
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Copyright
Copyright © 2021 The Author(s)

Table 1. PROMIS depression item content

Table 2. Item-scale correlations (corrected for item overlap), means, and response proportions

Figure 1. Histograms of raw score distributions including (top) and excluding (bottom) all zero response patterns.

Table 3. Graded response model (GRM) item parameter estimates

Table 4. Log-logistic (LL) item parameter estimates

Table 5. Observed and model reproduced response proportions for graded response and log-logistic models

Figure 2. Item response curves under the graded response model and log-logistic model.

Figure 3. Average category response curves under the graded response model and log-logistic model.

Figure 4. EAP trait level estimates under the graded response model and log-logistic model.

Figure 5. EAP trait level estimates versus raw scores under the graded response model and log-logistic model.

Figure 6. Test information under the graded response model and log-logistic model.

Figure 7. Confidence bands for trait level estimates under the graded response model and log-logistic model.

Supplementary material: File

Reise et al. supplementary material
