
A systematic study of the class imbalance problem in convolutional neural networks.

Mateusz Buda, Atsuto Maki, Maciej A Mazurowski.

Abstract

In this study, we systematically investigate the impact of class imbalance on the classification performance of convolutional neural networks (CNNs) and compare frequently used methods to address the issue. Class imbalance is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. In our study, we use three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, to investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is the area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks, since the overall accuracy metric is associated with notable difficulties in the context of imbalanced data. Based on the results of our experiments, we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that completely eliminates the imbalance, whereas the optimal undersampling ratio depends on the extent of imbalance; (iv) as opposed to some classical machine learning models, oversampling does not cause overfitting of CNNs; (v) thresholding should be applied to compensate for prior class probabilities when the overall number of properly classified cases is of interest.
Copyright © 2018 Elsevier Ltd. All rights reserved.
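
Two of the methods named in the abstract, oversampling to the level that eliminates the imbalance and thresholding that compensates for prior class probabilities, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `random_oversample` and `prior_corrected_prediction` are hypothetical, and dividing the predicted probabilities by the training-set class priors is assumed here as one common form of the prior correction.

```python
import numpy as np


def random_oversample(labels, rng):
    """Return sample indices in which minority-class examples are repeated
    (drawn with replacement) until every class matches the largest class."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for cls, n in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Draw the missing examples at random from the same class.
        extra = rng.choice(idx, size=target - n, replace=True)
        picked.append(np.concatenate([idx, extra]))
    return np.concatenate(picked)


def prior_corrected_prediction(probs, train_priors):
    """Assumed form of thresholding: divide each class probability by the
    class prior seen during training, then pick the largest corrected score."""
    probs = np.asarray(probs, dtype=float)
    priors = np.asarray(train_priors, dtype=float)
    return np.argmax(probs / priors, axis=1)


# Toy data: 8 examples of class 0 and 2 examples of class 1.
labels = np.array([0] * 8 + [1] * 2)
balanced_idx = random_oversample(labels, np.random.default_rng(0))
balanced_counts = np.bincount(labels[balanced_idx])

# A classifier trained on the imbalanced data outputs these probabilities.
probs = np.array([[0.7, 0.3]])
plain = np.argmax(probs, axis=1)
corrected = prior_corrected_prediction(probs, train_priors=[0.8, 0.2])
```

In the toy case, the uncorrected prediction favors the majority class, while the corrected score for the minority class (0.3 / 0.2) exceeds that of the majority class (0.7 / 0.8), so the decision flips.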

Keywords:  Class imbalance; Convolutional neural networks; Deep learning; Image classification

Year:  2018        PMID: 30092410     DOI: 10.1016/j.neunet.2018.07.011

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


  119 in total

1.  Deep learning for World Health Organization grades of pancreatic neuroendocrine tumors on contrast-enhanced magnetic resonance images: a preliminary study.

Authors:  Xuan Gao; Xiaolin Wang
Journal:  Int J Comput Assist Radiol Surg       Date:  2019-09-26       Impact factor: 2.924

2.  Impact of JPEG 2000 compression on deep convolutional neural networks for metastatic cancer detection in histopathological images.

Authors:  Farhad Ghazvinian Zanjani; Svitlana Zinger; Bastian Piepers; Saeed Mahmoudpour; Peter Schelkens; Peter H N de With
Journal:  J Med Imaging (Bellingham)       Date:  2019-04-24

3.  Predicting substance use disorder using long-term attention deficit hyperactivity disorder medication records in Truven.

Authors:  Sajjad Fouladvand; Emily R Hankosky; Heather Bush; Jin Chen; Linda P Dwoskin; Patricia R Freeman; Darren W Henderson; Kathleen Kantak; Jeffery Talbert; Shiqiang Tao; Guo-Qiang Zhang
Journal:  Health Informatics J       Date:  2019-05-19       Impact factor: 2.681

4.  Deep Learning for Predicting Enhancing Lesions in Multiple Sclerosis from Noncontrast MRI.

Authors:  Ponnada A Narayana; Ivan Coronado; Sheeba J Sujit; Jerry S Wolinsky; Fred D Lublin; Refaat E Gabr
Journal:  Radiology       Date:  2019-12-17       Impact factor: 11.105

5.  A convolutional neural network to filter artifacts in spectroscopic MRI.

Authors:  Saumya S Gurbani; Eduard Schreibmann; Andrew A Maudsley; James Scott Cordova; Brian J Soher; Harish Poptani; Gaurav Verma; Peter B Barker; Hyunsuk Shim; Lee A D Cooper
Journal:  Magn Reson Med       Date:  2018-03-09       Impact factor: 4.668

6.  A survey on generative adversarial networks for imbalance problems in computer vision tasks.

Authors:  Vignesh Sampath; Iñaki Maurtua; Juan José Aguilar Martín; Aitor Gutierrez
Journal:  J Big Data       Date:  2021-01-29

7.  Serendipity-A Machine-Learning Application for Mining Serendipitous Drug Usage From Social Media.

Authors:  Boshu Ru; Dingcheng Li; Yueqi Hu; Lixia Yao
Journal:  IEEE Trans Nanobioscience       Date:  2019-04-04       Impact factor: 2.935

8.  Comparison of orthogonal NLP methods for clinical phenotyping and assessment of bone scan utilization among prostate cancer patients.

Authors:  Jean Coquet; Selen Bozkurt; Kathleen M Kan; Michelle K Ferrari; Douglas W Blayney; James D Brooks; Tina Hernandez-Boussard
Journal:  J Biomed Inform       Date:  2019-04-20       Impact factor: 6.317

9.  Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging.

Authors:  Luke Oakden-Rayner; Jared Dunnmon; Gustavo Carneiro; Christopher Ré
Journal:  Proc ACM Conf Health Inference Learn (2020)       Date:  2020-04

10.  Deep Learning-based Prescription of Cardiac MRI Planes.

Authors:  Kevin Blansit; Tara Retson; Evan Masutani; Naeim Bahrami; Albert Hsiao
Journal:  Radiol Artif Intell       Date:  2019-11-27