Is the area under an ROC curve a valid measure of the performance of a screening or diagnostic test?

Abstract

OBJECTIVES: The area under a receiver operating characteristic (ROC) curve (the AUC) is widely used as a measure of the performance of a screening or diagnostic test. Here we assess the validity of the AUC for this purpose.
METHODS: Assuming that the test results follow Gaussian distributions in affected and unaffected individuals, standard mathematical formulae were used to describe the relationship between the detection rate (DR, or sensitivity), the false-positive rate (FPR), and the AUC of a test. These formulae were used to calculate screening performance (DR for a given FPR, or FPR for a given DR) for different AUC values and for different standard deviations of the test result in affected and unaffected individuals.
RESULTS: The DR for a given FPR depends strongly on the relative difference between the standard deviations of the test variable in affected and unaffected individuals. Consequently, two tests with the same AUC can have different DRs for the same FPR. For example, a test with an AUC of 0.75 has a DR of 24% for a 5% FPR if the standard deviations are the same in affected and unaffected individuals, but a DR of 39% for the same 5% FPR if the standard deviation in affected individuals is 1.5 times that in unaffected individuals.
CONCLUSION: The AUC is an unreliable measure of screening performance because in practice the standard deviation of a screening or diagnostic test in affected and unaffected individuals can differ. The problem is avoided by not using AUC at all, and instead specifying DRs for given FPRs or FPRs for given DRs.
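
The relationship described in the abstract can be sketched numerically. The following is a minimal illustration, not the authors' own calculation: it assumes a binormal model in which unaffected results are standard normal, affected results are Gaussian with a higher mean and a standard deviation of `sd_ratio` times that of the unaffected, and a result above the cutoff counts as positive. Under those assumptions AUC = Phi(d / sqrt(1 + sd_ratio^2)), where d is the separation of the means. The function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Unaffected individuals: standard normal (assumed scale, mean 0 and SD 1).
UNAFFECTED = NormalDist(mu=0.0, sigma=1.0)


def mean_shift_for_auc(auc, sd_ratio):
    """Separation of the two means (in unaffected SD units) that yields `auc`.

    Binormal model: AUC = Phi(d / sqrt(1 + sd_ratio**2)), solved for d.
    """
    return UNAFFECTED.inv_cdf(auc) * sqrt(1.0 + sd_ratio ** 2)


def detection_rate(auc, fpr, sd_ratio=1.0):
    """DR (sensitivity) at a fixed false-positive rate."""
    affected = NormalDist(mu=mean_shift_for_auc(auc, sd_ratio), sigma=sd_ratio)
    cutoff = UNAFFECTED.inv_cdf(1.0 - fpr)  # cutoff that gives the requested FPR
    return 1.0 - affected.cdf(cutoff)


def false_positive_rate(auc, dr, sd_ratio=1.0):
    """FPR at a fixed detection rate."""
    affected = NormalDist(mu=mean_shift_for_auc(auc, sd_ratio), sigma=sd_ratio)
    cutoff = affected.inv_cdf(1.0 - dr)  # cutoff that gives the requested DR
    return 1.0 - UNAFFECTED.cdf(cutoff)


if __name__ == "__main__":
    for ratio in (1.0, 1.5):
        dr = detection_rate(auc=0.75, fpr=0.05, sd_ratio=ratio)
        print(f"AUC 0.75, FPR 5%, SD ratio {ratio}: DR = {dr:.1%}")
```

With an AUC of 0.75 and a 5% FPR, this sketch gives a DR of about 24% for equal standard deviations and about 39% for a ratio of 1.5, in line with the figures quoted in the RESULTS.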

Keywords: AUC; ROC curve; diagnostic test; screening test

Year: 2014 PMID: 24407586 DOI: 10.1177/0969141313517497

Source DB: PubMed Journal: J Med Screen ISSN: 0969-1413 Impact factor: 2.136

Cited

8 in total

1. A threshold-free summary index of prediction accuracy for censored time to event data.

Authors: Yan Yuan; Qian M Zhou; Bingying Li; Hengrui Cai; Eric J Chow; Gregory T Armstrong
Journal: Stat Med Date: 2018-02-08 Impact factor: 2.373

2. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation.

Authors: Davide Chicco; Niklas Tötsch; Giuseppe Jurman
Journal: BioData Min Date: 2021-02-04 Impact factor: 2.522

3. Supervised Machine Learning: A Brief Primer. (Review)

Authors: Tammy Jiang; Jaimie L Gradus; Anthony J Rosellini
Journal: Behav Ther Date: 2020-05-16

4. Developing algorithms to predict adult onset internalizing disorders: An ensemble learning approach.

Authors: Anthony J Rosellini; Siyu Liu; Grace N Anderson; Sophia Sbi; Esther S Tung; Evdokia Knyazhanskaya
Journal: J Psychiatr Res Date: 2019-12-06 Impact factor: 4.791

5. Threshold-free measures for assessing the performance of medical screening tests.

Authors: Yan Yuan; Wanhua Su; Mu Zhu
Journal: Front Public Health Date: 2015-04-20

6. Quantitative PET Imaging and Clinical Parameters as Predictive Factors for Patients With Cervical Carcinoma: Implications of a Prediction Model Generated Using Multi-Objective Support Vector Machine Learning.

Authors: Zhiguo Zhou; Genevieve M Maquilan; Kimberly Thomas; Jason Wachsmann; Jing Wang; Michael R Folkert; Kevin Albuquerque
Journal: Technol Cancer Res Treat Date: 2020 Jan-Dec

7. Pregnanetriolone in paper-borne urine for neonatal screening for 21-hydroxylase deficiency: The place of urine in neonatal screening.

Authors: José Ramón Alonso-Fernández
Journal: Mol Genet Metab Rep Date: 2016-08-18

8. Comparison of Regression and Machine Learning Methods in Depression Forecasting Among Home-Based Elderly Chinese: A Community Based Study.

Authors: Shaowu Lin; Yafei Wu; Ya Fang
Journal: Front Psychiatry Date: 2022-01-17 Impact factor: 4.157
