Performance metric curve analysis framework to assess impact of the decision variable threshold, disease prevalence, and dataset variability in two-class classification.

Heather M Whitney, Karen Drukker, Maryellen L Giger.

Abstract

Purpose: The aim of this study is to (1) demonstrate a graphical method and interpretation framework to extend performance evaluation beyond receiver operating characteristic curve analysis and (2) assess the impact of disease prevalence and variability in training and testing sets, particularly when a specific operating point is used.
Approach: The proposed performance metric curves (PMCs) simultaneously assess sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), together with their 95% confidence intervals (CIs), as a function of the threshold for the decision variable. We investigated the utility of PMCs using six example operating points associated with commonly used methods for selecting operating points (including the Youden index and maximum mutual information). As an example, we applied PMCs to the task of distinguishing between malignant and benign breast lesions using human-engineered radiomic features extracted from dynamic contrast-enhanced magnetic resonance images. The dataset had 1885 lesions, with the images acquired in 2015 and 2016 serving as the training set (1450 lesions) and those acquired in 2017 as the test set (435 lesions). Our study used this dataset in two ways: (1) the clinical dataset itself and (2) simulated datasets with features based on the clinical set but with five different disease prevalences. The median and 95% CI of the number of type I (false positive) and type II (false negative) errors were determined for each operating point of interest.
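The idea of evaluating the four metrics as a function of the decision threshold can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy scores and labels, and the convention that a score at or above the threshold is called positive are all assumptions made for illustration, and the confidence intervals described in the abstract are omitted.

```python
import numpy as np


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return float(numerator) / float(denominator) if denominator else float("nan")


def performance_metric_curves(scores, labels, thresholds):
    """Compute sensitivity, specificity, PPV, and NPV at each threshold.

    Assumed convention (not from the paper): a case is called positive
    when its score is greater than or equal to the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    rows = []
    for t in thresholds:
        pred = scores >= t
        tp = int(np.sum(pred & labels))
        fp = int(np.sum(pred & ~labels))   # type I errors
        fn = int(np.sum(~pred & labels))   # type II errors
        tn = int(np.sum(~pred & ~labels))
        rows.append({
            "threshold": float(t),
            "sensitivity": _ratio(tp, tp + fn),
            "specificity": _ratio(tn, tn + fp),
            "ppv": _ratio(tp, tp + fp),
            "npv": _ratio(tn, tn + fn),
            "type_I": fp,
            "type_II": fn,
        })
    return rows


def youden_threshold(curves):
    """Pick the threshold maximizing J = sensitivity + specificity - 1."""
    best_row = max(curves, key=lambda r: r["sensitivity"] + r["specificity"] - 1)
    return best_row["threshold"]


# Hypothetical toy data: classifier outputs and ground truth (1 = malignant).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
thresholds = [0.3, 0.38, 0.5, 0.75]

curves = performance_metric_curves(scores, labels, thresholds)
best = youden_threshold(curves)
for row in curves:
    print(row)
print("Youden-index threshold:", best)
```

Plotting each metric in `curves` against `threshold` gives one curve per metric on a shared axis, which is the kind of simultaneous view the abstract describes; the Youden index is only one of the operating-point selection methods it mentions.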
Results: PMCs from both the clinical and simulated datasets supported interpretation of how the choice of decision threshold affects type I and type II classification errors, particularly in relation to disease prevalence.
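Why prevalence matters for PPV and NPV at a fixed operating point can be seen from Bayes' theorem. The sketch below uses the standard analytic relationship rather than the simulated datasets used in the study; the function name and the example values (sensitivity 0.8, specificity 0.9, and the three prevalences) are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity, and prevalence.

    Standard Bayes' theorem relationship; this is an analytic
    illustration, not the simulation approach described in the abstract.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) else float("nan")
    return ppv, npv


# Hypothetical operating point evaluated at three assumed prevalences.
for prevalence in (0.05, 0.25, 0.5):
    ppv, npv = predictive_values(0.8, 0.9, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, PPV rises and NPV falls as prevalence increases, so the same threshold can yield quite different mixes of type I and type II errors in populations with different disease prevalence.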
Conclusion: PMCs allow simultaneous evaluation of the four performance metrics of sensitivity, specificity, PPV, and NPV as a function of the decision threshold. This may create a better understanding of two-class classifier performance in machine learning.
© 2022 Society of Photo-Optical Instrumentation Engineers (SPIE).

Keywords:  AUC; artificial intelligence; machine learning; performance assessment; radiomics; repeatability

Year:  2022        PMID: 35656541      PMCID: PMC9152992          DOI: 10.1117/1.JMI.9.3.035502

Source DB:  PubMed          Journal:  J Med Imaging (Bellingham)        ISSN: 2329-4302

