
Revisiting performance metrics for prediction with rare outcomes.

Samrachana Adhikari, Sharon-Lise Normand, Jordan Bloom, David Shahian, Sherri Rose.

Abstract

Machine learning algorithms are increasingly used in the clinical literature, claiming advantages over logistic regression. However, they are generally designed to maximize the area under the receiver operating characteristic curve. While the area under the receiver operating characteristic curve and other measures of accuracy are commonly reported for evaluating binary prediction problems, these metrics can be misleading. We aim to give clinical and machine learning researchers a realistic medical example of the dangers of relying on a single measure of discriminatory performance to evaluate binary prediction questions. Prediction of medical complications after surgery is a frequent but challenging task because many post-surgery outcomes are rare. We predicted post-surgery mortality among patients in a clinical registry who received at least one aortic valve replacement. Estimation incorporated multiple evaluation metrics and algorithms typically regarded as performing well with rare outcomes, as well as an ensemble and a new extension of the lasso for multiple unordered treatments. Results demonstrated high accuracy for all algorithms with moderate measures of cross-validated area under the receiver operating characteristic curve. False positive rates were <1%; however, true positive rates were <7%, even when paired with a 100% positive predictive value, and graphical representations of calibration were poor. Similar results were seen in simulations, with the addition of high area under the receiver operating characteristic curve (>90%) accompanying low true positive rates. Clinical studies should not report the area under the receiver operating characteristic curve or accuracy alone as their primary measures of performance.
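The pattern described in the abstract (high accuracy, a near-zero false positive rate and a perfect positive predictive value alongside a very low true positive rate) follows directly from the arithmetic of a rare outcome. The sketch below is only an illustration of that arithmetic: the confusion-matrix counts are hypothetical and are not taken from the study, and the names `ConfusionCounts`, `ratio` and `summarize` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary classifier's decisions against the true outcome."""
    tp: int  # events correctly flagged
    fp: int  # non-events incorrectly flagged
    fn: int  # events missed
    tn: int  # non-events correctly left unflagged


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def summarize(c: ConfusionCounts) -> dict:
    """Compute accuracy, true/false positive rates and positive predictive value."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": ratio(c.tp + c.tn, total),
        "tpr": ratio(c.tp, c.tp + c.fn),  # sensitivity / recall
        "fpr": ratio(c.fp, c.fp + c.tn),  # 1 - specificity
        "ppv": ratio(c.tp, c.tp + c.fp),  # precision
    }


# Hypothetical cohort (an assumption, not the registry data): 10,000 patients,
# 300 deaths (3%). The classifier flags only 15 patients, all of whom died.
hypothetical = ConfusionCounts(tp=15, fp=0, fn=285, tn=9700)
metrics = summarize(hypothetical)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the classifier misses 285 of 300 events, yet accuracy stays above 97% because the 9,700 correctly classified non-events dominate the total. This is why the abstract argues for reporting several metrics side by side when outcomes are rare.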

Keywords:  Prediction; classification; ensembles; machine learning; mortality

Year:  2021        PMID: 34468239      PMCID: PMC8561661          DOI: 10.1177/09622802211038754

Source DB:  PubMed          Journal:  Stat Methods Med Res        ISSN: 0962-2802            Impact factor:   2.494


