Rogers Aloo, Atsuko Mutoh, Koichi Moriyama, Tohgoroh Matsui, Nobuhiro Inuzuka.
Abstract
Binary classification and anomaly detection both suffer from class imbalance in data sets. This paper contributes an ensemble model that improves binary image classification by reducing the imbalance between the minority and majority classes in a data set. The ensemble model classifies real images, synthetic images, and metadata associated with the real images. First, we apply a generative model to synthesize minority-class images from the real image data set. Second, we train the ensemble model jointly on the synthesized minority-class images, the real images, and the metadata. Finally, we evaluate model performance using a sensitivity metric to observe how classification changes when the class imbalance is adjusted. When we reduce the imbalance by adding synthetic minority-class images amounting to half the size of the majority class, the classifier's sensitivity improves by 12% and 24% for the benchmark pre-trained models RESNET50 and DENSENet121, respectively.
© International Society of Artificial Life and Robotics (ISAROB) 2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Keywords: Chest X-rays; Image classification; Image synthesis; Imbalance data; Patient metadata; Pneumonia detection
Year: 2022 PMID: 36068817 PMCID: PMC9437415 DOI: 10.1007/s10015-022-00781-8
Source DB: PubMed Journal: Artif Life Robot ISSN: 1433-5298
Fig. 1 Schematic illustration of the flow of images and metadata. Right: an MLP classifier on the metadata (MLP-meta). Middle: a DCNN classifier on the real images (DCNN-real). Left: a GAN generates synthetic images, which are applied jointly with the real images and metadata in an ensemble classifier (DCNN-real-synth-meta)
Fig. 2 Images 1 and 2 show real normal and pneumonia-positive images, respectively. Images 3 and 4 show synthetic normal and pneumonia-positive images, respectively
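The abstract describes reducing the imbalance by adding synthetic minority-class images amounting to half the size of the majority class. The arithmetic behind that adjustment can be sketched as follows. The function name `rebalance`, its parameters, and the class counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def rebalance(n_majority: int, n_minority: int, fraction: float = 0.5) -> dict:
    """Return class counts after adding synthetic minority-class images.

    `fraction` is the share of the majority class size to synthesize;
    0.5 mirrors the "half the size of the majority class" setting.
    """
    n_synthetic = int(n_majority * fraction)
    n_minority_after = n_minority + n_synthetic
    return {
        "synthetic": n_synthetic,
        "minority_after": n_minority_after,
        "ratio_before": n_minority / n_majority,
        "ratio_after": n_minority_after / n_majority,
    }


# Hypothetical counts for illustration only (not the paper's data set).
counts = rebalance(n_majority=1000, n_minority=200)
print(counts)
```

With these illustrative counts, the minority-to-majority ratio moves from 0.2 to 0.7, which shows how the synthetic images narrow the gap without touching the majority class.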
Sensitivity scores on test set over three models and three data sets
| Model | Real | Real-Meta | Ensemble |
|---|---|---|---|
| RESNET50 | 0.5667 | 0.7333 | |
| DENSENet121 | 0.400 | 0.6333 | |
| VGG16 | 0.6000 | 0.7857 | |
The bold values indicate the most significant results for the experiment, as explained in the results
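Sensitivity (recall on the positive class) is the fraction of truly positive cases that the classifier labels positive, TP / (TP + FN). A minimal sketch of that computation is given here. The function name `sensitivity` and the label vectors are assumptions made for illustration and do not reproduce the paper's evaluation code or data.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Compute TP / (TP + FN) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive samples")
    return tp / (tp + fn)


# Hypothetical labels: 1 = pneumonia positive, 0 = normal.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
score = sensitivity(y_true, y_pred)
print(f"sensitivity = {score:.4f}")
```

Because sensitivity ignores true negatives, it is a natural choice when the positive class is the minority and missed positives are the main concern.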
Fig. 3 Ensemble classifier performance metrics on DENSENet121
Fig. 4 Ensemble classifier performance metrics on RESNET50
Fig. 5 Ensemble classifier performance metrics on VGG16