Limitations of receiver operating characteristic curve on imbalanced data: Assist device mortality risk scores.

Faezeh Movahedi, Rema Padman, James F Antaki.

Abstract

OBJECTIVE: In the left ventricular assist device domain, the receiver operating characteristic is a commonly applied metric of performance of classifiers. However, the receiver operating characteristic can provide a distorted view of classifiers' ability to predict short-term mortality due to the overwhelmingly greater proportion of patients who survive, that is, imbalanced data. This study illustrates the ambiguity of the receiver operating characteristic in evaluating 2 classifiers of 90-day left ventricular assist device mortality and introduces the precision recall curve as a supplemental metric that is more representative of left ventricular assist device classifiers in predicting the minority class.
METHODS: This study compared the receiver operating characteristic and precision recall curves of 2 classifiers of 90-day left ventricular assist device mortality, the HeartMate Risk Score and a Random Forest, in a test group of 800 patients recorded in the Interagency Registry for Mechanically Assisted Circulatory Support who received a continuous-flow left ventricular assist device between 2006 and 2016 (mean age, 59 years; 146 female vs 654 male patients). The 90-day mortality rate in this group was only 8%.
RESULTS: The receiver operating characteristic indicates similar performance of the Random Forest and HeartMate Risk Score classifiers, with areas under the curve of 0.77 and 0.63, respectively. This is in contrast to their precision recall curves, with areas under the curve of 0.43 versus 0.16 for Random Forest and HeartMate Risk Score, respectively. The precision recall curve for HeartMate Risk Score showed that precision dropped rapidly to only 10% as sensitivity increased slightly.
CONCLUSIONS: The receiver operating characteristic can portray an overly optimistic performance of a classifier or risk score when applied to imbalanced data. The precision recall curve provides better insight about the performance of a classifier by focusing on the minority class.
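A worked illustration of why this happens may help. It relates precision to the two quantities plotted on a receiver operating characteristic curve, the true positive rate (TPR, sensitivity) and the false positive rate (FPR), through the prevalence of the minority class, written here as \pi. The prevalence \pi = 0.08 and the 800-patient test group come from the abstract; the operating point (TPR = 0.5, FPR = 0.1) is a hypothetical value chosen only for the arithmetic and is not a result of the study.

```latex
\[
\text{precision}
  = \frac{\pi\,\mathrm{TPR}}{\pi\,\mathrm{TPR} + (1-\pi)\,\mathrm{FPR}}
\]

% Hypothetical operating point: TPR = 0.5, FPR = 0.1, with \pi = 0.08
\[
\text{precision}
  = \frac{0.08 \times 0.5}{0.08 \times 0.5 + 0.92 \times 0.1}
  = \frac{0.040}{0.132}
  \approx 0.30
\]

% Same point in counts for 800 patients: 64 deaths, 736 survivors
\[
\mathrm{TP} = 0.5 \times 64 = 32, \qquad
\mathrm{FP} = 0.1 \times 736 = 73.6, \qquad
\frac{32}{32 + 73.6} \approx 0.30
\]

% A classifier with no discrimination has TPR = FPR, so precision = \pi = 0.08
```

An operating point that looks favorable on the receiver operating characteristic curve (half the deaths detected at a 10% false positive rate) therefore still yields roughly 7 false alarms for every 3 correct predictions of death, because survivors outnumber deaths by more than 11 to 1.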
Copyright © 2021 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
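The contrast between the two curves can be reproduced on synthetic data. The following is a minimal sketch, not the authors' code and not based on registry data: it assumes Gaussian-distributed risk scores, a hypothetical separation of 1 standard deviation between the classes, and 800 patients at 8% prevalence to mirror the test group size. The helper names (`roc_auc`, `average_precision`, `simulate`) are illustrative, and average precision is used here as one common step-wise estimate of the area under the precision recall curve.

```python
import random


def roc_auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive scores higher than a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision recall curve: the mean of the
    precision values at the rank of each positive. Assumes no tied scores."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_positives += 1
            total += true_positives / rank
    return total / sum(labels)


def simulate(n=800, prevalence=0.08, shift=1.0, seed=0):
    """Synthetic cohort (assumption): scores ~ N(shift, 1) for the minority
    class and N(0, 1) for the majority class."""
    rng = random.Random(seed)
    n_pos = round(n * prevalence)
    labels = [1] * n_pos + [0] * (n - n_pos)
    scores = [rng.gauss(shift if y else 0.0, 1.0) for y in labels]
    return labels, scores


labels, scores = simulate()
print("prevalence (chance-level precision):", sum(labels) / len(labels))
print("ROC AUC (chance level is 0.5):", round(roc_auc(labels, scores), 3))
print("average precision:", round(average_precision(labels, scores), 3))
```

Under these assumptions the ROC AUC should be read against a chance level of 0.5, whereas average precision should be read against the prevalence (0.08 here), which is why the same scores can look respectable on one curve and weak on the other.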

Keywords:  LVAD; PRC; ROC; imbalanced data

Year:  2021        PMID: 34446286      PMCID: PMC8800945          DOI: 10.1016/j.jtcvs.2021.07.041

Source DB:  PubMed          Journal:  J Thorac Cardiovasc Surg        ISSN: 0022-5223            Impact factor:   5.209

