Prashant Bhardwaj, Amanpreet Kaur.
Abstract
With the exponential growth of COVID-19 cases, medical practitioners are searching for accurate and fast automated detection methods that help limit the spread of the disease while keeping the computational requirements of devices low. This article proposes an accurate and efficient ensemble model based on deep Convolutional Neural Networks (CNNs), built on 2161 COVID-19, 2022 pneumonia, and 5863 normal chest X-ray images collected from previous publications and other online resources. To improve detection accuracy, contrast enhancement and image normalization are applied at the pre-processing stage to produce better-quality images. Data augmentation then creates modified versions of the images in the dataset to train four efficient CNN models (Inceptionv3, DenseNet121, Xception, and InceptionResNetv2). Experimental results show 98.33% accuracy for binary classification and 92.36% for multiclass classification. The performance evaluation metrics indicate that this tool can be very helpful for early disease diagnosis.
Keywords: Matthews correlation coefficients; deep learning models; simple averaging; weighted averaging
Year: 2021 PMID: 34518739 PMCID: PMC8426690 DOI: 10.1002/ima.22627
Source DB: PubMed Journal: Int J Imaging Syst Technol ISSN: 0899-9457 Impact factor: 2.177
FIGURE 1 Proposed classification framework
Distribution of data for all classes
| Class label | Number of samples | Training | Testing | Validation | Modality |
|---|---|---|---|---|---|
| Pneumonia | 2022 | 1617 | 203 | 202 | X‐ray |
| COVID‐19 | 2161 | 1728 | 216 | 217 | X‐ray |
| Non‐COVID | 5863 | 4690 | 586 | 587 | X‐ray |
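The split sizes in the table work out to roughly 80% training, 10% testing, and 10% validation for each class. The short check below uses only the counts from the table; the function and variable names are illustrative and do not come from the article.

```python
# Per-class counts copied from the table: (total, training, testing, validation).
counts = {
    "Pneumonia": (2022, 1617, 203, 202),
    "COVID-19": (2161, 1728, 216, 217),
    "Non-COVID": (5863, 4690, 586, 587),
}


def split_fractions(total, train, test, val):
    """Return the (train, test, val) fractions after checking they cover the class."""
    if train + test + val != total:
        raise ValueError("split sizes do not add up to the class total")
    return tuple(n / total for n in (train, test, val))


fractions = {label: split_fractions(*row) for label, row in counts.items()}
for label, (train_frac, test_frac, val_frac) in fractions.items():
    print(f"{label}: train={train_frac:.1%} test={test_frac:.1%} val={val_frac:.1%}")
```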
FIGURE 2 (A) Data pre‐processing and (B) data augmentation
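The record names image normalization, flipping, and rotation but does not specify the exact operations. As a minimal sketch, assuming min-max normalization to the range [0, 1], a horizontal flip, and a 90-degree rotation (all three choices are assumptions made for illustration), the steps could look like this on a tiny grayscale image stored as nested lists:

```python
def normalize(image):
    """Min-max scale pixel values to [0, 1], stretching contrast to the full range."""
    flat = [value for row in image for value in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1  # avoid division by zero on a constant image
    return [[(value - low) / span for value in row] for row in image]


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(column) for column in zip(*image[::-1])]


# Hypothetical 2 x 2 image; real chest X-rays would be far larger.
image = [[50, 100], [150, 250]]
normalized = normalize(image)
augmented = [normalized, flip_horizontal(normalized), rotate_90(normalized)]
```

Each augmented copy keeps the label of its source image, so the training set grows without new acquisitions.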
FIGURE 3 Inception v3 architecture
FIGURE 4 DenseNet architecture
FIGURE 5 InceptionResNetv2 architecture
FIGURE 6 Xception architecture
Different parameters used for training a deep learning model
| Algorithm #1: experimental setup | |
|---|---|
| Input | Dataset collected for three categories (COVID‐19, non‐COVID, pneumonia) |
| Environment | Google Colaboratory with the required libraries |
| Dataset collection | Kaggle, previous publications |
| Directories | Split the data into three parts (training, testing, and validation) and create a subfolder in each part for each of the three classes (COVID‐19, non‐COVID, pneumonia) |
| Data generator | Data augmentation methods used for data generation: image rotation, flipping, and scaling |
| Libraries and optimizers used | NumPy, Matplotlib, sklearn, scikitplot, different Keras models, SGD |
| Training and testing | Create the proposed model using four pre‐trained networks (Inceptionv3, DenseNet121, Xception, and InceptionResNetv2). Several layers with the ReLU activation function are used in depth, followed by an output layer with a softmax activation function. Apply 5‐fold cross‐validation. Compile the model using the SGD optimizer with a learning rate of 0.001 and categorical cross‐entropy as the loss function. Train the model for 50 epochs. Use ReduceLROnPlateau with a minimum learning rate of 0.00001 to lower the learning rate when the monitored metric stops improving. Evaluation metrics used to check model performance: confusion matrix, precision‐recall curve, ROC curve |
| Model validation | Identify the best‐performing model among the four pre‐trained architectures listed above using the training and testing data. Perform validation on the validation dataset. Generate performance metrics: confusion matrix, ROC curve |
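The learning-rate settings in the setup table (initial rate 0.001, minimum rate 0.00001, 50 epochs) can be illustrated with a simplified, pure-Python version of the reduce-on-plateau idea. This is not the Keras `ReduceLROnPlateau` implementation; the reduction factor, the patience, and the validation losses below are assumptions chosen for illustration and are not reported in the record.

```python
def reduce_on_plateau(val_losses, lr=0.001, min_lr=0.00001, factor=0.1, patience=3):
    """Return the learning rate used in each epoch.

    lr and min_lr come from the setup table; factor and patience are
    illustrative assumptions, not values reported in the article.
    """
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Shrink the rate, but never below the configured floor.
                lr = max(lr * factor, min_lr)
                wait = 0
    return schedule


# Hypothetical validation losses: five improving epochs, then a plateau.
val_losses = [1.0 / (epoch + 1) for epoch in range(5)] + [0.2] * 45
schedule = reduce_on_plateau(val_losses)
print(schedule[0], schedule[-1])
```

Once the loss stops improving, the rate drops step by step until it reaches the floor and stays there for the remaining epochs.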
Calculation of evaluation metrics
| Model | Input size | No. of layers | No. of parameters (millions) | No. of classes | MCC | MSE | Mean squared log error |
|---|---|---|---|---|---|---|---|
| Xception | 224 × 224 × 3 | 48 | 24 | 2 | 0.964 | 0.017 | 0.0085 |
| Xception | 224 × 224 × 3 | 48 | 24 | 3 | 0.889 | 0.269 | 0.0825 |
| DenseNet 121 | 224 × 224 × 3 | 121 | 1 | 2 | 0.754 | 0.128 | 0.0617 |
| DenseNet 121 | 224 × 224 × 3 | 121 | 1 | 3 | 0.853 | 0.267 | 0.0771 |
| Inception v3 | 299 × 299 × 3 | 164 | 56 | 2 | 0.962 | 0.019 | 0.0091 |
| Inception v3 | 299 × 299 × 3 | 164 | 56 | 3 | 0.870 | 0.299 | 0.0915 |
| Inception ResNet v2 | 299 × 299 × 3 | 170 | 22.9 | 2 | 0.952 | 0.023 | 0.0114 |
| Inception ResNet v2 | 299 × 299 × 3 | 170 | 22.9 | 3 | 0.912 | 0.138 | 0.0403 |
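For reference, the three metrics in the table can be computed from true and predicted labels as sketched below. The sketch covers only the binary form of the Matthews correlation coefficient, MCC = (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)), and assumes the errors are measured on integer class labels; the record does not state how the authors computed them, and the example labels are hypothetical.

```python
import math


def mcc_binary(y_true, y_pred):
    """Matthews correlation coefficient for labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def msle(y_true, y_pred):
    """Mean squared logarithmic error, using log(1 + x) on both inputs."""
    return sum(
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ) / len(y_true)


# Hypothetical labels: eight images, two of them misclassified.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = (mcc_binary(y_true, y_pred), mse(y_true, y_pred), msle(y_true, y_pred))
print(scores)
```

MCC uses all four cells of the confusion matrix, which makes it more informative than accuracy when the classes are imbalanced, as they are in this dataset.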
Classification results for binary and multiclass after 5‐fold cross‐validation
| CNN model | Class | Accuracy (%) | Precision | Recall | F‐1 score |
|---|---|---|---|---|---|
| Xception | Binary | 95.05 | 0.9949 | 0.9680 | 0.9813 |
| Xception | Multiclass | 85.30 | 0.8312 | 0.9704 | 0.8955 |
| DenseNet 121 | Binary | 94.53 | 0.8130 | 0.9532 | 0.8776 |
| DenseNet 121 | Multiclass | 86.05 | 0.8457 | 0.9852 | 0.9101 |
| Inception v3 | Binary | 95.13 | 0.9924 | 0.9680 | 0.9800 |
| Inception v3 | Multiclass | 82.33 | 0.8098 | 0.9754 | 0.8849 |
| Inception ResNet v2 | Binary | 96.28 | 0.9949 | 0.9557 | 0.9749 |
| Inception ResNet v2 | Multiclass | 88.19 | 0.9207 | 0.9729 | 0.9461 |
| Ensemble 1 (simple averaging) | Binary | 98.45 | 0.9975 | 0.9704 | 0.9838 |
| Ensemble 1 (simple averaging) | Multiclass | 91.74 | 0.8725 | 0.9606 | 0.9144 |
| Ensemble 2 (weighted averaging) | Binary | 98.33 | 0.9975 | 0.9680 | 0.9825 |
| Ensemble 2 (weighted averaging) | Multiclass | 92.36 | 0.8772 | 0.9680 | 0.9204 |
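The two ensembles in the table combine the class probabilities of the four CNNs by simple and by weighted averaging. The sketch below shows both schemes for a single image. The softmax outputs are hypothetical, and weighting each model by its multiclass accuracy from the table is an assumption made for illustration; the record does not state how the weights were chosen.

```python
def simple_average(prob_sets):
    """Average class probabilities from several models with equal weights."""
    n_models = len(prob_sets)
    return [sum(column) / n_models for column in zip(*prob_sets)]


def weighted_average(prob_sets, weights):
    """Average class probabilities with one weight per model."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*prob_sets)
    ]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


classes = ["COVID-19", "Non-COVID", "Pneumonia"]
# Hypothetical softmax outputs of the four models for one image.
probs = [
    [0.50, 0.10, 0.40],  # Xception
    [0.30, 0.10, 0.60],  # DenseNet121
    [0.55, 0.05, 0.40],  # Inception v3
    [0.35, 0.05, 0.60],  # InceptionResNetv2
]
# Assumed weights: multiclass accuracies (%) of the single models.
weights = [85.30, 86.05, 82.33, 88.19]

simple = simple_average(probs)
weighted = weighted_average(probs, weights)
print(classes[argmax(simple)], classes[argmax(weighted)])
```

Dividing by the sum of the weights keeps the weighted output a valid probability distribution, so the same argmax rule applies to both ensembles.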
FIGURE 7 (A–D) Confusion matrices and (E–H) precision‐recall curves for the binary and multiclass problems
FIGURE 8 (A, C) Receiver operating characteristic (ROC) curves using simple averaging; (B, D) ROC curves using weighted averaging
Comparison of existing models with our proposed ensemble model
| References | Model used | Dataset: pneumonia | Dataset: COVID‐19 | Dataset: normal | Performance metrics | Year |
|---|---|---|---|---|---|---|
| Gunraj et al. | Covidnet | 5538 | 385 | 8066 | Accuracy = 93.30% | 2020 |
| Chowdhury et al. | AlexNet, SqueezeNet, ResNet18, DenseNet201 | 1485 | 423 | 1579 | Accuracy = 97.94% | 2020 |
| Ozturk et al. | CNN | 500 | 127 | 500 | Accuracy (binary) = 98.08%, (multiclass) = 97.02% | 2020 |
| Khan et al. | CoroNet | 657 | 284 | 310 | Accuracy = 95% | 2020 |
| Nour et al. | CNN, SVM, DT, KNN | 1345 | 219 | 1341 | Accuracy = 98.97% | 2020 |
| Öksüz et al. | SqueezeNet, ShuffleNet, and EfficientNet‐B0 | 1345 | 219 | 1341 | Accuracy = 98.30% | 2020 |
| Afifi et al. | ResNet18, DenseNet161, Inceptionv4 | 5541 | 1056 | 7218 | Accuracy = 91.2% | 2021 |
| Tang et al. | Modified Covidnet named EDL‐net model | 6053 | 573 | 8851 | Accuracy = 95% | 2021 |
| Proposed model | Inceptionv3, DenseNet121, InceptionResNetv2, and Xception | 2022 | 2161 | 5863 | Accuracy (binary) = 98.33%, (multiclass) = 92.36% | |