Nahida Habib1,2, Md Mahmodul Hasan1, Md Mahfuz Reza1, Mohammad Motiur Rahman1.
Abstract
Pneumonia, an acute respiratory infection, seriously hinders breathing by damaging one or both lungs. Recovery of pneumonia patients depends on early diagnosis of the disease and proper treatment. This paper proposes an ensemble method for diagnosing pneumonia from chest X-ray images. Two deep Convolutional Neural Networks (CNNs), CheXNet and VGG-19, are trained and used to extract features from the given X-ray images. These features are then ensembled for classification. To overcome the data irregularity problem, Random Under Sampler (RUS), Random Over Sampler (ROS) and Synthetic Minority Oversampling Technique (SMOTE) are applied to the ensembled feature vector. The ensembled feature vector is then classified using several Machine Learning (ML) classification techniques (Random Forest, Adaptive Boosting, K-Nearest Neighbors). Among these methods, Random Forest achieved better performance metrics than the others on the available standard dataset. Comparison with existing methods shows that the proposed method attains improved classification accuracy and AUC values, and outperforms the other models compared, with a prediction accuracy of 98.93%. The model also shows potential to generalize when tested on a different dataset. The outcomes of this study may be useful for pneumonia diagnosis from chest X-ray images. © Springer Nature Singapore Pte Ltd 2020.
Keywords: AUC; Convolutional neural network; Ensemble method; Machine learning; Pneumonia; SMOTE
Year: 2020 PMID: 33163973 PMCID: PMC7597433 DOI: 10.1007/s42979-020-00373-y
Source DB: PubMed Journal: SN Comput Sci ISSN: 2661-8907
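The abstract describes two data-handling steps: combining the CheXNet and VGG-19 features, then balancing the classes before classification. The following is a minimal sketch of those two steps only, not the authors' implementation. It assumes that "ensembling" means joining the two feature vectors of each image end to end, which the abstract does not state. The helper names `ensemble_features` and `random_over_sample` and the toy values are hypothetical.

```python
import random
from collections import Counter


def ensemble_features(feats_a, feats_b):
    """Join two per-image feature vectors end to end.

    Assumption: concatenation stands in for the ensembling step.
    """
    return [a + b for a, b in zip(feats_a, feats_b)]


def random_over_sample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Indices of the original rows that carry this label.
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = rng.choice(idx)
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out


# Toy stand-ins for features extracted by the two CNNs (hypothetical values).
chexnet_feats = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
vgg_feats = [[1.0], [2.0], [3.0], [4.0]]
labels = [0, 1, 1, 1]  # 0 = normal, 1 = pneumonia

X = ensemble_features(chexnet_feats, vgg_feats)
X_bal, y_bal = random_over_sample(X, labels)
print(Counter(y_bal))
```

Random over-sampling only repeats existing minority rows. SMOTE instead interpolates new minority samples between neighbours, and random under-sampling discards majority rows; neither is shown here.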
Fig. 1 Methodological steps of pneumonia detection
Description of datasets
| Name of dataset | Train: Normal | Train: Pneumonia | Validation: Normal | Validation: Pneumonia | Test: Normal | Test: Pneumonia |
|---|---|---|---|---|---|---|
| [ | 1341 | 3875 | 8 | 8 | 234 | 390 |
| [ | 1341 | 1345 | | | | |
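The training counts in the first row show why dataset balancing is needed. The short calculation below works out the imbalance and the effect of fully balancing the two classes. It assumes the resamplers are applied to the training split with a 1:1 target; the record does not say which samples were resampled, so treat the numbers as illustrative.

```python
# Training-split counts copied from the first row of the table.
train_counts = {"Normal": 1341, "Pneumonia": 3875}

majority = max(train_counts.values())
minority = min(train_counts.values())

# How many pneumonia images there are per normal image.
imbalance_ratio = majority / minority

# Over-sampling (ROS, or SMOTE with a 1:1 target) adds minority samples
# until both classes match the majority count.
ros_added = majority - minority
ros_total = 2 * majority

# Under-sampling (RUS) drops majority samples down to the minority count.
rus_removed = majority - minority
rus_total = 2 * minority

print(f"ratio {imbalance_ratio:.2f}, ROS total {ros_total}, RUS total {rus_total}")
```

Under these assumptions, over-sampling leaves 7750 training samples and under-sampling leaves 2682, which is the trade-off between keeping all the data and keeping the set small.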
Fig. 2 Architectural design of fine-tuned CheXNet model
Fig. 3 Fine-tuned VGG-19 architecture
Fig. 4 Random forest classifier
Fig. 5 Preprocessed normal and pneumonia image of dataset [9]
Fig. 6 Preprocessed normal and pneumonia image of dataset [21]
Fig. 7 (a) CheXNet performance on validation dataset. (b) VGG-19 performance on validation dataset
Mean accuracy value of RF, AdaBoost and KNN against RUS, ROS, SMOTE
| ML models | RUS (%) | ROS (%) | SMOTE (%) |
|---|---|---|---|
| RF | 97.42 | 98.93 | 98.41 |
| AdaBoost | 96.54 | 98.53 | 98.37 |
| KNN | 97.43 | 97.53 | 97.73 |
Fold-wise accuracy of RF, AdaBoost and KNN with ROS
| Five-fold cross validation | RF (%) | AdaBoost (%) | KNN (%) |
|---|---|---|---|
| Fold 1 | 98.83 | 98.36 | 97.54 |
| Fold 2 | 98.59 | 98.48 | 97.42 |
| Fold 3 | 98.89 | 98.59 | 97.19 |
| Fold 4 | 99.18 | 98.53 | 97.59 |
| Fold 5 | 99.18 | 98.71 | 97.89 |
| Mean | 98.93 | 98.53 | 97.53 |
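The Mean row can be reproduced from the five fold values. The check below assumes the mean is a plain arithmetic average of the per-fold accuracies, rounded to two decimals; the dictionary name `fold_accuracy` is illustrative.

```python
from statistics import mean

# Per-fold accuracies (%) copied from the table above.
fold_accuracy = {
    "RF": [98.83, 98.59, 98.89, 99.18, 99.18],
    "AdaBoost": [98.36, 98.48, 98.59, 98.53, 98.71],
    "KNN": [97.54, 97.42, 97.19, 97.59, 97.89],
}

# Arithmetic mean over the five folds, rounded to two decimals.
mean_accuracy = {model: round(mean(vals), 2) for model, vals in fold_accuracy.items()}

for model, acc in mean_accuracy.items():
    print(f"{model}: {acc:.2f}%")
```

The results match the reported means (98.93, 98.53 and 97.53), and the RF value is the 98.93% accuracy quoted in the abstract.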
Fig. 8 Proposed final model’s performance on validation dataset
Fig. 9 Proposed final model’s performance on new test dataset
Comparison with existing methods for pneumonia classification on Kermany et al. [9] dataset
| Model | Accuracy |
|---|---|
| Prateek et al. [ | 0.901 |
| Liang et al. [ | 0.905 |
| Stephen et al. [ | 0.937 |
| Saraiva et al. [ | 0.953 |
| Proposed model | 0.989 |
Fig. 10 ROC curve of proposed final model