Nayeeb Rashid, Md Adnan Faisal Hossain, Mohammad Ali, Mumtahina Islam Sukanya, Tanvir Mahmud, Shaikh Anowarul Fattah.
Abstract
With the onset of the COVID-19 pandemic, automated diagnosis has become one of the most active research topics for faster mass screening. Deep learning-based approaches have been established as the most promising methods in this regard. However, the limited availability of labeled data is the main bottleneck for data-hungry deep learning methods. In this paper, a two-stage deep CNN-based scheme is proposed to detect COVID-19 from chest X-ray images and achieve optimum performance with limited training images. In the first stage, an encoder-decoder autoencoder network is trained on chest X-ray images in an unsupervised manner, learning to reconstruct them. For the second stage, an encoder-merging network is proposed that consists of different layers of the encoder model followed by a merging network. Here the encoder is initialized with the weights learned in the first stage, and the outputs of its different layers are connected to the proposed merging network, which introduces an intelligent feature-merging scheme. Finally, the encoder-merging network is trained in a supervised manner to extract features from the X-ray images, and the resulting features are used in the classification layers of the proposed architecture. An EfficientNet-B4 network is utilized in both stages. End-to-end training is performed on datasets containing the classes COVID-19, Normal, Bacterial Pneumonia, and Viral Pneumonia. The proposed method offers very satisfactory performance compared to state-of-the-art methods, achieving an accuracy of 90.13% on 4-class, 96.45% on 3-class, and 99.39% on 2-class classification.
Keywords: Autoencoder; COVID-19 diagnosis; Medical Image Analysis; Neural Network; X-ray
Year: 2021 PMID: 34690398 PMCID: PMC8526490 DOI: 10.1016/j.bbe.2021.09.004
Source DB: PubMed Journal: Biocybern Biomed Eng ISSN: 0208-5216 Impact factor: 4.314
Fig. 1The novel approach of the proposed method. In the traditional approach, images are passed through a randomly initialized neural network that learns to classify them; the proposed method instead has two training phases. In the first phase, an autoencoder learns to reconstruct the input X-ray images. In the second phase, the encoder portion of the autoencoder is initialized with the weights learned in phase one, connected to the proposed merging-block network, and the combined model is trained for the classification task.
Fig. 2The proposed autoencoder framework consists of two-stage training: (1) training the autoencoder network using the input X-ray images and (2) training the intermediate layers of the encoder network (utilizing weights obtained in the first stage) followed by the merging network. Output from the merging network is passed to a classification network that makes the final classification.
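The two-phase scheme (unsupervised reconstruction pre-training, then supervised training on top of the pre-trained encoder) can be sketched in miniature. Everything below is a hypothetical stand-in: a linear autoencoder on synthetic low-rank data replaces the EfficientNet-B4 encoder and the X-ray images, and a logistic classifier replaces the merging and classification networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic low-rank data as a stand-in for the X-ray images.
C = rng.normal(size=(64, 4))                     # latent factors
B = rng.normal(size=(4, 16))                     # mixing matrix
X = C @ B + 0.1 * rng.normal(size=(64, 16))      # 64 "images", 16 features
y = (C[:, 0] > 0).astype(float)                  # hypothetical binary label

# Phase 1 (unsupervised): train a linear autoencoder to reconstruct X.
W_enc = 0.1 * rng.normal(size=(16, 4))
W_dec = 0.1 * rng.normal(size=(4, 16))
loss_before = np.mean((X @ W_enc @ W_dec - X) ** 2)
for _ in range(2000):
    Z = X @ W_enc                                # latent code
    err = Z @ W_dec - X                          # reconstruction error
    g_dec = Z.T @ err / len(X)
    g_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= 0.01 * g_dec
    W_enc -= 0.01 * g_enc
loss_after = np.mean((X @ W_enc @ W_dec - X) ** 2)

# Phase 2 (supervised): reuse the trained encoder as a frozen feature
# extractor and fit a logistic classifier on its features.
Z = X @ W_enc
w = np.zeros(4)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(Z @ w)))
    w -= 0.1 * Z.T @ (p - y) / len(X)
acc = np.mean(((Z @ w) > 0) == (y > 0.5))
```

The point of the miniature is the weight hand-off: `W_enc` is learned purely from reconstruction in phase 1 and then reused, untouched, as the feature extractor in phase 2, mirroring how the paper initializes the encoder with autoencoder weights before the classification stage.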
Fig. 3Model architecture of the proposed deep convolutional autoencoder.
The layers and their corresponding output shape for the proposed autoencoder model.
| Encoder Feature Extraction Layers | Decoder Layers | ||||
|---|---|---|---|---|---|
| Layer (type) | Output Shape | Layer (type) | Output Shape | Layer (type) | Output Shape |
| “Block2a expand activation” Layer | (128,128,144) | 1. Conv2D Transpose Layer | (16,16,512) | 8. Conv2D Layer | (64,64,256) |
| “Block3a expand activation” Layer | (64,64,192) | 2. Conv2D Layer | (16,16,512) | 9. Conv2D Layer | (64,64,256) |
| “Block4a expand activation” Layer | (32,32,336) | 3. Conv2D Layer | (16,16,512) | 10. Conv2D Transpose Layer | (128,128,128) |
| “Block6a expand activation” Layer | (16,16,960) | 4. Conv2D Transpose Layer | (32,32,256) | 11. Conv2D Layer | (128,128,128) |
| EfficientNet-B4 Output Layer | (8,8,1792) | 5. Conv2D Layer | (32,32,256) | 12. Conv2D Transpose Layer | (256,256,64) |
| | | 6. Conv2D Layer | (32,32,256) | 13. Conv2D Layer | (256,256,64) |
| | | 7. Conv2D Transpose Layer | (64,64,256) | 14. Conv2D Layer | (256,256,3) |
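The decoder's spatial shapes in the table follow the usual transposed-convolution arithmetic: assuming stride 2 with "same" padding (an assumption, but consistent with every listed shape), each transpose layer doubles the spatial size.

```python
def conv2d_transpose_out(size, stride=2):
    """Spatial output size of a transposed convolution with 'same' padding."""
    return size * stride

# Upsampling path of the decoder, from the 8x8 encoder output up to the
# 256x256 reconstruction: transpose layers 1, 4, 7, 10, and 12 in the table.
sizes = [8]
for _ in range(5):
    sizes.append(conv2d_transpose_out(sizes[-1]))
# sizes is now [8, 16, 32, 64, 128, 256]
```

The interleaved plain Conv2D layers (e.g. layers 2-3, 5-6) keep the spatial size fixed and only refine features, which is why the table shows repeated shapes between upsampling steps.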
Fig. 4Structure of the M-blocks.
Fig. 5Tree Structured Feature Merging Network.
Fig. 6Samples of chest X-ray images from the prepared dataset.
Precision, Recall, F1-score and Accuracy across all 3 classes for the 5 folds of data.
| Folds | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|
| Fold 1 | 98.05 | 97.97 | 97.96 | 97.97 |
| Fold 2 | 95.93 | 95.93 | 95.93 | 95.93 |
| Fold 3 | 95.57 | 95.53 | 95.52 | 95.53 |
| Fold 4 | 97.13 | 97.12 | 97.11 | 97.12 |
| Fold 5 | 95.58 | 95.47 | 95.47 | 95.47 |
Precision, Sensitivity, F1-score and Accuracy of the 3 classes for Fold 1.
| Class | Precision (%) | Sensitivity (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|
| COVID19 | 98.79 | 100 | 99.39 | 100 |
| Normal | 100 | 93.90 | 96.86 | 93.90 |
| Pneumonia | 95.35 | 100 | 97.62 | 100 |
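The per-class figures above follow from a one-vs-rest reading of the confusion matrix. With 408 images per class and 5 folds, a test split of about 82 per class is plausible; the matrix below is a hypothetical reconstruction (Fig. 7's exact counts are not in this record) that reproduces the tabulated precision and sensitivity values to rounding.

```python
def per_class_metrics(cm):
    """Per-class (precision, recall, F1) from a confusion matrix,
    where cm[i][j] = count of true class i predicted as class j."""
    n = len(cm)
    out = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp
        fn = sum(cm[k][j] for j in range(n)) - tp
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out.append((prec, rec, f1))
    return out

# Hypothetical 3-class matrix (rows/cols: COVID19, Normal, Pneumonia)
cm = [[82, 0, 0],
      [1, 77, 4],
      [0, 0, 82]]
metrics = per_class_metrics(cm)
```

With this matrix, COVID19 precision is 82/83 = 98.79% with 100% sensitivity, Normal precision is 100% with 77/82 = 93.90% sensitivity, and Pneumonia precision is 82/86 = 95.35% with 100% sensitivity, matching the table.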
Fig. 7Confusion Matrix of the Test Set for 3-class Dataset.
Precision, Recall, F1-score and Accuracy across all 4 classes for the 5 folds of data.
| Folds | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|
| Fold 1 | 91.46 | 91.46 | 91.46 | 91.46 |
| Fold 2 | 92.07 | 92.07 | 92.07 | 92.07 |
| Fold 3 | 89.33 | 89.33 | 89.33 | 89.33 |
| Fold 4 | 90.12 | 90.12 | 90.12 | 90.12 |
| Fold 5 | 87.65 | 87.65 | 87.65 | 87.65 |
Class-wise results for the best-performing fold of the 4-class dataset.
| Class | Precision (%) | Sensitivity (%) | F1-Score (%) | Accuracy (%) |
|---|---|---|---|---|
| Bacterial Pneumonia | 89.74 | 85.37 | 87.5 | 87.5 |
| COVID19 | 100 | 100 | 100 | 100 |
| Normal | 96.25 | 93.9 | 95.06 | 95.06 |
| Viral Pneumonia | 84.09 | 90.24 | 87.06 | 87.06 |
Precision, Sensitivity, F1-score and Accuracy of the 2 classes for Fold 1.
| Class | Precision (%) | Sensitivity (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|
| COVID19 | 97.62 | 100 | 98.8 | 100 |
| Non-Covid19 | 100 | 98.78 | 99.39 | 98.78 |
Fig. 8Confusion Matrices of the Test Set for 4-class and 2-class Dataset.
Accuracy (%) of different models for 3-class and 4-class classification using our scheme.
| Classification Type | EfficientNet-B1 | EfficientNet-B2 | EfficientNet-B3 | EfficientNet-B4 | Inception-V3 | ResNet-50 | VGG-11 |
|---|---|---|---|---|---|---|---|
| 3-class | 96.75 | 97.56 | 97.56 | 97.97 | 97.15 | 96.75 | 95.53 |
| 4-class | 90.85 | 91.31 | 92.07 | 92.38 | 88.11 | 89.33 | 86.89 |
Cohen’s Kappa score and Matthews Correlation Coefficient of different models for the 4-class classification using the proposed method.
| Model | Cohen’s Kappa Score | Matthews Correlation Coefficient |
|---|---|---|
| EfficientNet-B4 | 0.8861 | 0.8867 |
| ResNet-50 | 0.7723 | 0.7727 |
| InceptionNet-V3 | 0.8292 | 0.8321 |
| Vgg-11 | 0.8252 | 0.8253 |
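Both scores can be computed directly from a multiclass confusion matrix: Cohen's kappa compares observed agreement against chance agreement from the marginals, and the multiclass Matthews Correlation Coefficient here uses Gorodkin's R_K generalization. The function below is illustrative; the 2x2 matrix in the usage line is made up.

```python
def kappa_and_mcc(cm):
    """Cohen's kappa and multiclass MCC (Gorodkin's R_K) from a
    confusion matrix cm, where cm[i][j] = true class i, predicted j."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    trace = sum(cm[i][i] for i in range(k))
    row = [sum(cm[i]) for i in range(k)]                      # true counts
    col = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted counts

    po = trace / n                                    # observed agreement
    pe = sum(row[i] * col[i] for i in range(k)) / n**2  # chance agreement
    kappa = (po - pe) / (1 - pe)

    num = trace * n - sum(row[i] * col[i] for i in range(k))
    den = ((n**2 - sum(r * r for r in row)) ** 0.5
           * (n**2 - sum(c * c for c in col)) ** 0.5)
    mcc = num / den
    return kappa, mcc

kappa, mcc = kappa_and_mcc([[40, 10], [5, 45]])  # hypothetical binary matrix
```

For a binary matrix, R_K reduces to the familiar MCC formula (TP*TN - FP*FN over the geometric mean of the marginals), which is a quick sanity check for the implementation.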
Comparison between the Proposed scheme and the traditional transfer learning method.
| Classification Scheme | 3-class F1-Score (%) | 4-class F1-Score (%) |
|---|---|---|
| Traditional Transfer Learning with pre-trained ImageNet weights | 96.32 | 88.09 |
| Our proposed scheme with autoencoder + M-block | 96.45 | 90.13 |
Statistical test results between the proposed scheme and the traditional transfer learning method.
| Classification Scheme | McNemar’s Test P-Value | McNemar’s Test Chi-squared | Wilcoxon Signed-Rank Test P-Value | Wilcoxon Signed-Rank Test Statistic |
|---|---|---|---|---|
| 4-class Dataset | 0.83825 | 0.04166 | 0.638 | 157.5 |
| 3-class Dataset | 0.68309 | 0.16666 | 0.084 | 2.500 |
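McNemar's test uses only the discordant pairs b and c (test images that exactly one of the two classifiers gets right). With the standard continuity correction the statistic is (|b - c| - 1)^2 / (b + c), compared against a chi-squared distribution with one degree of freedom. The counts b = 13, c = 11 below are hypothetical, chosen only because they are one split consistent with the reported 4-class values.

```python
import math

def mcnemar(b, c):
    """McNemar's test with continuity correction.

    b, c: counts of test samples misclassified by exactly one of the
    two models being compared (the discordant pairs)."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Two-sided p-value via the chi-squared(1) survival function:
    # P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

chi2, p = mcnemar(13, 11)  # hypothetical discordant counts
```

These hypothetical counts give chi-squared = 1/24 ≈ 0.04167 and p ≈ 0.8383, in line with the 4-class row of the table; the large p-values indicate the accuracy gain over transfer learning is not statistically significant at conventional thresholds.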
Comparison of our proposed method with the existing literature.
| Work | Amount of chest X-rays | Architecture | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| Ozturk et al. | 125 COVID-19 + 500 Normal | DarkCovidNet | 98.08 | 95.13 | 95.3 |
| | 125 COVID-19 + 500 Normal + 500 Pneumonia | | 87.02 | 85.35 | 92.18 |
| Wang and Wong | 53 COVID-19 + 5526 Non-COVID | COVID-Net | 92.4 | 93.33 | – |
| Ioannis et al. | 224 COVID-19 + 700 Pneumonia + 504 Normal | VGG-19 | 93.48 | 92.85 | 98.75 |
| Sethy and Behera | 25 COVID-19 + 25 Non-COVID | ResNet-50/SVM | 95.38 | 95.33 | – |
| Hemdan et al. | 50 COVID-19 + 50 Non-COVID | VGG-19 | 90 | – | – |
| Narin et al. | 50 COVID-19 + 50 Non-COVID | ResNet-50 | 96.1 | 91.8 | 96.6 |
| Tanvir et al. | 305 COVID-19 + 305 Normal | CovXNet | 97.4 | 94.7 | |
| | 305 COVID-19 + 305 Viral Pneumonia | | 87.3 | 85.5 | |
| | 304 COVID-19 + 305 Bacterial Pneumonia | | 94.7 | – | 93.3 |
| | 305 COVID-19 + 305 Viral Pneumonia + 305 Bacterial Pneumonia | | 89.6 | 87.6 | |
| | 305 COVID-19 + 305 Normal + 305 Viral Pneumonia + 305 Bacterial Pneumonia | | 90.3 | 89.1 | |
| Khan et al. | 284 COVID-19 + 310 Normal + 330 Bacterial Pneumonia + 327 Viral Pneumonia | CoroNet | 89.6 | – | 96.4 |
| Abbas et al. | 105 COVID-19 + 11 SARS + 80 Normal | DeTraC | 97.35 | 98.23 | 96.34 |
| Emtiaz et al. | 500 COVID-19 + 800 Normal + 400 Viral Pneumonia + 400 Bacterial Pneumonia | CoroDet | 91.2 | 91.76 | 93.48 |
| | 500 COVID-19 + 800 Normal + 800 Bacterial Pneumonia | | 94.2 | 92.76 | 94.56 |
| | 500 COVID-19 + 800 Normal | | 99.12 | 95.36 | 97.36 |
| Ibrahim et al. | 371 COVID-19 + 1076 Normal | AlexNet | 99.16 | 97.44 | 100 |
| | 371 COVID-19 + 1076 Normal + 4078 Bacterial Pneumonia | | 97.40 | 91.30 | 84.78 |
| | 371 COVID-19 + 1076 Normal + 4078 Bacterial Pneumonia + 4237 Viral Pneumonia | | 93.42 | 89.18 | 98.92 |
| Proposed Method | 408 COVID-19 + 408 Normal + 408 Viral Pneumonia + 408 Bacterial Pneumonia | AutoCovNet | 90.13 | 91.46 | 97.15 |
| | 408 COVID-19 + 408 Normal + 408 Pneumonia | | 96.45 | 95.94 | 97.96 |
| | 408 COVID-19 + 816 Non-COVID | | 99.39 | 99.39 | 100 |
Accuracy (%) comparison of the proposed method with the existing literature on common evaluation protocols.
| System Architecture | 408 COVID-19 + 408 Normal + 408 Viral Pneumonia + 408 Bacterial Pneumonia | 408 COVID-19 + 408 Normal + 408 Pneumonia | 408 COVID-19 + 816 Non-COVID |
|---|---|---|---|
| CoroNet | 82.93 | 95.12 | 98.37 |
| DarkCovidNet | 89.33 | 95.93 | 98.78 |
Fig. 9Regions of interest in each class of the dataset used in the classification task.
Fig. 10Gradient-based activation map for the misclassified instances of the test set.