
21 - Item response theory and its applications for cancer outcomes measurement

Published online by Cambridge University Press: 18 December 2009

Steven P. Reise, Ph.D., Professor, University of California at Los Angeles, Los Angeles, CA
Joseph Lipscomb, National Cancer Institute, Bethesda, Maryland
Carolyn C. Gotay, Cancer Research Center, Hawaii
Claire Snyder, National Cancer Institute, Bethesda, Maryland

Summary

Introduction

Each year, new health-related quality of life (HRQOL) questionnaires are developed, or existing measures are revised, in the hope of obtaining instruments that are more reliable, valid within the study population, and sensitive to changes in a patient's health status. It is equally important that these measures provide interpretable scores that accurately characterize a patient's HRQOL. Although several high-quality instruments have emerged in cancer outcomes research, we presently lack the ability to crosswalk scores from one instrument to another, so researchers cannot combine or compare results from multiple studies when different instruments are used. Developing psychometrically strong measures requires analytical methods that allow researchers to choose the set of questions most informative for the study objectives, and to crosswalk scores from one assessment to another despite the use of different sets of questions.

Interest has been growing in how item response theory (IRT) modeling can meet these analytical needs of the cancer outcomes measurement field. This interest stems from the ability of IRT models to analyze item and scale performance within a study population; to detect biased items, which can arise when an instrument is translated from one language to another or when respondents from two groups attach culturally different meanings to item content; to link two or more instruments on a common metric, thereby facilitating the crosswalking of scores; and to create item banks that serve as a foundation for computerized adaptive assessment.
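The chapter presents no code, but a small numerical sketch may make these capabilities concrete. The Python snippet below, using invented item parameters, implements the two-parameter logistic (2PL) item response function, its Fisher information, the maximum-information item selection step that underlies computerized adaptive testing, and a simple mean/sigma linking of item locations for crosswalking scores; every function name and parameter value here is illustrative rather than drawn from the chapter.

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL item response function: probability of a positive response
    given trait level theta, discrimination a, and location b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item, I(theta) = a^2 * P * (1 - P);
    an item is most informative for respondents near its location b."""
    p = p_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def next_item(theta_hat, bank, administered):
    """One computerized adaptive testing step: choose the unadministered
    item with maximum information at the current trait estimate."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta_hat, *bank[i]))

def mean_sigma_link(b_anchor_x, b_anchor_y):
    """Mean/sigma linking: estimate theta_x = A * theta_y + B from the
    locations of anchor items calibrated on both instruments, placing
    the two instruments on a common metric for score crosswalks."""
    A = np.std(b_anchor_x) / np.std(b_anchor_y)
    B = np.mean(b_anchor_x) - A * np.mean(b_anchor_y)
    return A, B

# Invented four-item bank: (discrimination, location) pairs.
bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.4), (2.0, 1.1)]
print(next_item(0.3, bank, administered={2}))                 # most informative remaining item
print(mean_sigma_link([-0.5, 0.1, 0.9], [-0.8, -0.1, 0.6]))   # estimated (A, B)
```

In practice, item parameters would be estimated from response data with dedicated IRT software, and operational linking typically uses more robust methods than mean/sigma, but the logic of information-based item selection and common-metric linking is the same.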

Type: Chapter
In: Outcomes Assessment in Cancer: Measures, Methods and Applications, pp. 425–444
Publisher: Cambridge University Press
Print publication year: 2004


