15 - Receiver Operating Characteristic Analysis: Basic Concepts and Practical Applications

from Part III - Perception Metrology

Published online by Cambridge University Press:  20 December 2018

Ehsan Samei
Affiliation:
Duke University Medical Center, Durham
Elizabeth A. Krupinski
Affiliation:
Emory University, Atlanta

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2018


References

Aoki, K., Misumi, J., Kimura, T., Zhao, W., Xie, T. (1997). Evaluation of cutoff levels for screening of gastric cancer using serum pepsinogens and distributions of levels of serum pepsinogen I, II and of PG I/PG II ratios in a gastric cancer case-control study. J Epidemiol, 7, 143–151.
Bamber, D. (1975). The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol, 12, 387–415.
Begg, C.B., Greenes, R.A. (1983). Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics, 39, 207–215.
Beiden, S.V., Campbell, G., Meier, K.L., Wagner, R.F. (2000a). The problem of ROC analysis without truth: the EM algorithm and the information matrix. Proc SPIE, 3981, 126–134.
Beiden, S.V., Wagner, R.F., Campbell, G. (2000b). Components-of-variance models and multiple-bootstrap experiments: an alternative method for random effects, receiver operating characteristic analysis. Acad Radiol, 7, 341–349.
Cortes, C., Mohri, M. (2003). AUC optimization vs. error rate. In: Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference. Cambridge, MA: MIT Press.
Delong, E.R., Delong, D.M., Clarke-Pearson, D.L. (1988). Comparing the areas under two or more correlated receiver operating characteristics curves: a non-parametric approach. Biometrics, 44, 837–845.
Deneef, P., Kent, D.L. (1993). Using treatment-tradeoff preferences to select diagnostic strategies. Med Decis Making, 13, 126–132.
Dorfman, D.D., Alf, E. (1968). Maximum likelihood estimation of parameters of signal detection theory: a direct solution. Psychometrika, 33, 117–124.
Dorfman, D.D., Alf, E. (1969). Maximum likelihood estimation of parameters of signal detection theory and determination of confidence intervals – rating method data. J Math Psychol, 6, 487–496.
Dorfman, D.D., Berbaum, K.S., Metz, C.E. (1992). Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Invest Radiol, 27, 723–731.
Dorfman, D.D., Berbaum, K.S., Metz, C.E., Lenth, R.V., Hanley, J.A., Abu Dagga, H. (1997). Proper receiver operating characteristic analysis: the bigamma model. Acad Radiol, 4, 138–149.
Dwyer, A.J. (1997). In pursuit of a piece of the ROC. Radiology, 202, 621–625.
Efron, B., Tibshirani, R.J. (1993). An Introduction to the Bootstrap. New York, NY: Chapman and Hall.
Faraggi, D., Reiser, B. (2002). Estimation of the area under the ROC curve. Statistics Med, 21, 3093–3106.
Goddard, M.J., Hinberg, I. (1990). Receiver operating characteristic (ROC) curves and non-normal data: an empirical study. Statistics Med, 9, 325–337.
Greiner, M., Sohr, D., Gobel, P. (1995). A modified ROC analysis for the selection of cut-off values and the definition of intermediate results for serodiagnostic tests. J Immunol Methods, 185, 123–132.
Grmec, I., Kupnik, D. (2004). Does the Mainz emergency evaluation scoring (MEES) in combination with capnometry (MEESC) help in the prognosis of outcome from cardiopulmonary resuscitation in a prehospital setting? Resuscitation, 58, 89–96.
Hajian-Tilaki, K.O., Hanley, J.A., Joseph, L., Collet, J.P. (1997). A comparison of parametric and nonparametric approaches to ROC analysis of quantitative diagnostic tests. Med Decis Making, 17, 94–102.
Halpern, E.J., Albert, M., Krieger, A.M., Metz, C.E., Maidment, A.D. (1996). Comparison of receiver operating characteristic curves on the basis of optimal operating points. Acad Radiol, 3, 245–253.
Hand, D.J., Till, R.J. (2001). A simple generalization of the area under the ROC curve to multiple class classification problems. Machine Learn, 45, 171–186.
Hanley, J.A. (1988). The robustness of the “binormal” assumptions used in fitting ROC curves. Med Decis Making, 8, 197–203.
Hanley, J.A. (1996). The use of the “binormal” model for parametric ROC analysis of quantitative diagnostic tests. Statistics Med, 15, 1575–1585.
Hanley, J.A., Hajian-Tilaki, K.O. (1997). Sampling variability of nonparametric estimates of the areas under receiver operating characteristic curves: an update. Acad Radiol, 4, 49–58.
Hanley, J.A., McNeil, B.J. (1982). The meaning and use of the area under a receiver operating characteristic curve. Radiology, 143, 29–36.
Hanley, J.A., McNeil, B.J. (1983). A method for comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology, 148, 839–843.
Harrell, F.E., Jr., Califf, R.M., Pryor, D.B., Lee, K.L., Rosati, R.A. (1982). Evaluating the yield of medical tests. JAMA, 247, 2543–2546.
Henkelman, R.M., Kay, I., Bronskill, M.J. (1990). Receiver operator characteristic (ROC) analysis without truth. Med Decis Making, 10, 24–29.
Ikeda, M., Ishigaki, T., Yamauchi, K. (2003). How to establish equivalence between two treatments in ROC analysis. Proc SPIE, 5034, 383–392.
Jiang, Y., Metz, C.E., Nishikawa, R.M. (1996). A receiver operating characteristic partial area index for highly sensitive diagnostic tests. Radiology, 201, 745–750.
Johnson, W.O., Gastwirth, J.L., Pearson, L.M. (2001). Screening without a “gold standard”: the Hui-Walter paradigm revisited. Am J Epidemiol, 153, 921–924.
Kairisto, V., Poola, A. (1995). Software for illustrative presentation of basic clinical characteristics of laboratory tests – Graphroc for Windows. Scand J Clin Lab Invest, 55, 43–60.
Kijewski, M.F., Swensson, R.G., Judy, P.F. (1989). Analysis of rating data from multiple-alternative tasks. J Math Psychol, 33, 1–23.
Lee, W.C. (1999). Probabilistic analysis of global performances of diagnostic tests: interpreting the Lorenz curve-based summary measures. Statistics Med, 18, 455–471.
Lee, W.C., Hsiao, C.K. (1996). Alternative summary indices for the receiver operating characteristic curve. Epidemiology, 7, 605–611.
Li, C.R., Liao, C.-T., Liu, J.-P. (2008). A non-inferiority test for diagnostic accuracy based on the paired partial areas under ROC curves. Statistics Med, 27, 1762–1776.
Liu, J.-P., Ma, M.-C., Wu, C.-Y., Tai, J.-Y. (2006). Tests of equivalence and non-inferiority for diagnostic accuracy based on the paired areas under ROC curves. Statistics Med, 25, 1219–1238.
Lusted, L.B. (1960). Logical analysis in roentgen diagnosis. Radiology, 74, 78–93.
Lusted, L.B. (1961). Signal detectability and medical decision making. Science, 171, 1217–1219.
McClish, D.K. (1989). Analyzing a portion of the ROC curve. Med Decis Making, 9, 190–195.
Metz, C.E. (1978). Basic principles of ROC analysis. Semin Nucl Med, 8, 283–298.
Metz, C.E. (1986a). Statistical analysis of ROC data in evaluating diagnostic performance. In: Herbert, D., Myers, R. (eds.) Multiple Regression Analysis: Applications in the Health Sciences. New York, NY: American Institute of Physics, pp. 365–384.
Metz, C.E. (1986b). ROC methodology in radiologic imaging. Invest Radiol, 21, 720–733.
Metz, C.E., Kronman, H.B. (1980). Statistical significance tests for binormal ROC curves. J Math Psychol, 22, 218–243.
Metz, C.E., Pan, X. (1999). “Proper” binormal ROC curves: theory and maximum-likelihood estimation. J Math Psychol, 43, 1–33.
Metz, C.E., Wang, P.-L., Kronman, H.B. (1984). A new approach for testing the significance of differences between ROC curves measured from correlated data. In: Deconinck, F. (ed.) Information Processing in Medical Imaging. The Hague: Nijhoff, pp. 432–445.
Metz, C.E., Herman, B.A., Shen, J.-H. (1998). Maximum-likelihood estimation of ROC curves from continuously-distributed data. Statistics Med, 17, 1033–1053.
Miller, D.P., O’Shaughnessy, K.F., Wood, S.A., Castellino, R.A. (2004). Gold standards and expert panels: a pulmonary nodule case study with challenges and solutions. Proc SPIE, 5372, 173.
Mossman, D. (1999). Three-way ROCs. Med Decis Making, 19, 78–89.
Obuchowski, N.A. (1994). Sample size for receiver operating characteristic studies. Invest Radiol, 29, 238–243.
Obuchowski, N.A. (1997). Testing for equivalence of diagnostic tests. Am J Roentgenol, 168, 13–17.
Obuchowski, N.A. (2000). Sample size tables for receiver operating characteristic studies. Am J Roentgenol, 175, 603–608.
Obuchowski, N.A. (2005). Multi-reader multi-modality ROC studies: hypothesis testing and sample size estimation using an ANOVA approach with dependent observations. Acad Radiol, 2, 522–529.
Obuchowski, N.A. (2006). An ROC-type measure of diagnostic accuracy when the gold standard is continuous-scale. Statistics Med, 25, 481–493.
Obuchowski, N.A., Liebler, M.L. (1998). Confidence intervals for the receiver operating characteristic area in studies with small samples. Acad Radiol, 5, 561–571.
Obuchowski, N.A., Goske, M.J., Applegate, K.E. (2001). Assessing physicians’ accuracy in diagnosing pediatric patients with acute abdominal pain: measuring accuracy for multiple diseases. Statistics Med, 20, 3261–3278.
Pepe, M.S. (2003). The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford, UK: Oxford University Press.
Petrick, N., Gallas, B.D., Samuelson, F.W., Wagner, R.F., Myers, K.J. (2005). Influence of panel size and expert skill on truth panel performance when combining expert ratings. Proc SPIE, 5749, 49.
Phelps, C.E., Hutson, A. (1995). Estimating diagnostic test accuracy using a “fuzzy gold standard.” Med Decis Making, 15, 44–57.
Schafer, H. (1989). Constructing a cut-off point for a quantitative diagnostic test. Statistics Med, 8, 1381–1391.
Schoonjans, F., Zalata, A., Depuydt, C.E., Comhaire, F.H. (1995). Medcalc: a new computer program for medical statistics. Comput Methods Programs Biomed, 48, 257–262.
Schisterman, E.F., Perkins, N.J., Aiyi, L., Bondell, H. (2005). Optimal cut-point and its corresponding Youden index to discriminate individuals using pooled blood samples. Epidemiology, 16, 73–81.
Schuirmann, D.U.I. (1987). A comparison of the two 1-sided tests procedure and the power approach for assessing the equivalence of average bioavailability. J Pharmacokinet Pharmacodyn, 15, 657–680.
Stephan, C., Wesseling, S., Schink, T., Jung, K. (2003). Comparison of eight computer programs for receiver-operating characteristic analysis. Clin Chem, 49, 433–439.
Swets, J.A. (1979). ROC analysis applied to the evaluation of medical imaging techniques. Invest Radiol, 14, 109–121.
Swets, J.A. (1986). Empirical ROCs in discrimination and diagnostic tasks: implications for theory and measurement of performance. Psychol Bull, 99, 181–198.
Swets, J.A. (1988). Measuring the accuracy of diagnostic systems. Science, 240, 1285–1293.
Swets, J.A. (1992). The science of choosing the right decision threshold in high-stakes diagnostics. Am Psychol, 47, 522–532.
Toledano, A.Y., Gatsonis, C. (1999). Generalized estimating equations for ordinal categorical data: arbitrary patterns of missing responses and missingness in a key covariate. Biometrics, 55, 488–496.
Vergara, I.A., Norambuena, T., Ferrada, E., Slater, A.W., Melo, F. (2008). StAR: a simple tool for the statistical comparison of ROC curves. BMC Bioinformatics, 9, 265.
Wagner, R.F., Beiden, S.V., Metz, C.E., Campbell, G. (2001). Continuous versus categorical data for ROC analysis: some quantitative considerations. Acad Radiol, 8, 328–334.
Wagner, R.F., Metz, C.E., Campbell, G. (2007). Assessment of medical imaging systems and computer aids: a tutorial review. Acad Radiol, 14, 723–748.
Walsh, S.J. (1999). Goodness-of-fit issues in ROC curve estimation. Med Decis Making, 19, 193–201.
Wieand, S., Gail, M.H., James, B.R., James, K.L. (1989). A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika, 76, 585–592.
Youden, W.J. (1950). Index for rating diagnostic tests. Cancer, 3, 32–35.
Zhang, D.D., Zhou, X.-H., Freeman, D.H., Jr., Freeman, J.L. (2002). A non-parametric method for the comparison of partial areas under ROC curves and its application to large health care data sets. Statistics Med, 21, 701–715.
Zhou, X.-H., Higgs, R.E. (2000). Assessing the relative accuracies of two screening tests in the presence of verification bias. Statistics Med, 19, 1697–1705.
Zhou, X.-H., Obuchowski, N.A., McClish, D.K. (2002). Statistical Methods in Diagnostic Medicine. New York, NY: Wiley.
Zou, K.H. (2001). Comparison of correlated receiver operating characteristic curves derived from repeated diagnostic test data. Acad Radiol, 8, 225–233.
Zou, K.H., Hall, W.J., Shapiro, D.E. (1997). Smooth non-parametric receiver operating characteristic (ROC) curves for continuous diagnostic tests. Statistics Med, 16, 2143–2156.
Zou, K.H., Tempany, C.M., Fielding, J.R., Silverman, S.G. (1998). Original smooth receiver operating characteristic curve estimation from continuous data: statistical methods for analyzing the predictive value of spiral CT of ureteral stones. Acad Radiol, 5, 680–687.
Zou, K.H., Resnic, F.S., Talos, I.F., et al. (2005). A global goodness-of-fit test for receiver operating characteristic curve analysis via the bootstrap method. J Biomed Informatics, 38, 395–403.