Fareed Ahmad, Muhammad Usman Ghani Khan, Kashif Javed.
Abstract
The novel coronavirus is deadly for humans and animals. The ease of its dispersion, coupled with its tremendous capacity to cause illness and death in infected people, makes it a risk to society. The chest X-ray is a conventional, but hard to interpret, radiographic test for the initial diagnosis of coronavirus infection and its differentiation from other related infections. It carries a considerable amount of information on physiological and anatomical features, yet extracting the relevant information can be challenging even for a professional radiologist. In this regard, deep-learning models can help deliver swift, accurate and reliable outcomes. Existing datasets are small and suffer from class imbalance. In this paper, we prepare a relatively larger and well-balanced dataset compared to the available datasets. Furthermore, we analyze deep learning models, namely AlexNet, SqueezeNet, DenseNet201, MobileNetV2 and InceptionV3, under numerous variations: training the models from scratch, fine-tuning without pre-trained weights, fine-tuning while updating the pre-trained weights of all layers, and fine-tuning with pre-trained weights combined with data augmentation. Our results show that fine-tuning with augmentation generates the best results among pre-trained models. Finally, we make architectural adjustments to the MobileNetV2 and InceptionV3 models to learn more intricate features, which are then merged in our proposed ensemble model. The performance of our model is statistically analyzed against the other models using four performance metrics with a paired two-sided t-test on 5 different splits of our dataset into training and test sets. We find that it is statistically better than its competing methods on all four metrics. Thus, computer-aided classification based on the proposed model can assist radiologists in identifying coronavirus from other related infections in chest X-rays with higher accuracy.
This can help in a reliable and speedy diagnosis, thereby saving valuable lives and mitigating the adverse socioeconomic impact on our community.
Keywords: Data augmentation; Deep learning models; Ensemble learning; Feature fusion; Novel coronavirus classification; Transfer learning
Year: 2021 PMID: 34010794 PMCID: PMC8058056 DOI: 10.1016/j.compbiomed.2021.104401
Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589
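The abstract describes merging adjusted MobileNetV2 and InceptionV3 networks into a single ensemble. As a minimal sketch of one common way to combine two classifiers, the snippet below averages the two networks' softmax outputs (score-level fusion); note that the paper's actual model fuses learned features, and the class ordering and logit values here are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores to a probability distribution."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ensemble_predict(logits_a, logits_b):
    """Average the two models' softmax outputs and pick the top class."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    fused = [(x + y) / 2 for x, y in zip(pa, pb)]
    return max(range(len(fused)), key=fused.__getitem__), fused

# Hypothetical 4-class logits (e.g. normal, bacterial, viral, COVID-19)
cls, probs = ensemble_predict([0.2, 0.1, 0.4, 2.0], [0.3, 0.0, 0.2, 1.5])
# cls → 3 (both branches favour the last class)
```

Feature-level fusion, as used in the paper, would instead concatenate the penultimate-layer activations of both branches before a shared classification head.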
A comparison of different CNN models for COVID-19 classification.
| Approach | No. of Classes | Dataset Details | Augmentation? | Dataset Balanced? | Transfer Learning? | 5-fold CV? | Ensemble Model? | Statistical Comparison? | Tuning Hyperparameters? |
|---|---|---|---|---|---|---|---|---|---|
| [ | 5 | N 191, B 54, T 57, V 20, C 180 | | | | | | | |
| [ | 3 | N 8851, P 6012, C 180 | | | | | | | |
| [ | 3 | N 1349, P 3895, C 66 | | | | | | | |
| [ | 5 | Not specified | | | | | | | |
| [ | 3 | N 8066, P 5521, C 183 | | | | | | | |
| [ | 3 | N 8851, P 9579, C 99 | | | | | | | |
| [ | 4 | N 7595, B 2780, C 313, UP 6012 | | | | | | | |
| [ | 2 | N 500, C 184 | | | | | | | |
| [ | 4 | N 1203, B 931, V 660, C 68 | | | | | | | |
| [ | 3 | N 1579, V 1485, C 423 | | | | | | | |
| [ | 3 | N 401, V 401, C 401 | | | | | | | |
Fig. 1 Example posteroanterior chest radiograph images of our dataset.
Fig. 2 Various phases of our proposed method.
Fig. 3 Flowchart of various steps of our proposed method.
The range of hyper-parameters used for different pre-trained models.
| Deep Model | Batch Size | Learning Rate |
|---|---|---|
| AlexNet | 48, 32 | 5e-10, 1e-8, 8e-7 |
| SqueezeNet | 48, 32, 24 | 1e-8, 8e-7, 1e-7 |
| DenseNet201 | 48, 32 | 1e-7, 1e-6, 1e-5 |
| MobileNetV2 | 48, 32, 24 | 1e-8, 6e-7, 4e-7, 1e-7 |
| InceptionV3 | 36, 32, 24 | 1e-8, 1e-6, 8e-5, 2e-5 |
| Our ensemble model (InceptionV3 + MobileNetV2) | 48 | 9e-8, 9e-7, 7e-7, 4e-7, 1e-7 |
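Ranges like those in the table above are typically explored as a grid over batch size and learning rate. The sketch below illustrates the pattern using the MobileNetV2 ranges from the table; the `evaluate` function is a hypothetical stand-in for the actual train-and-validate step, which the paper does not reproduce here.

```python
from itertools import product

# Search ranges taken from the table for MobileNetV2
batch_sizes = [48, 32, 24]
learning_rates = [1e-8, 6e-7, 4e-7, 1e-7]

def evaluate(batch_size, lr):
    """Placeholder for training the model and returning validation accuracy.
    This toy score simply peaks at (32, 4e-7) for illustration."""
    return -abs(lr - 4e-7) - abs(batch_size - 32) * 1e-9

# Exhaustively try every (batch size, learning rate) pair, keep the best
best = max(product(batch_sizes, learning_rates),
           key=lambda cfg: evaluate(*cfg))
# best → (32, 4e-7) for this toy score
```

In practice `evaluate` would fine-tune the network for a fixed number of epochs on the training split and report held-out accuracy, which is far more expensive than this sketch suggests.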
The number of training parameters and depth of pre-trained models.
| Pre-trained Model | Depth | Weight Parameters |
|---|---|---|
| AlexNet | 8 | 61 million |
| SqueezeNet | 18 | 1.24 million |
| DenseNet201 | 201 | 20.0 million |
| MobileNetV2 | 53 | 3.5 million |
| InceptionV3 | 48 | 25 million |
Fig. 4 InceptionV3 module.
A comparison of the performance of our ensemble model and various deep learning models.
| Model | Methods | Specificity | Recall | F-score | Accuracy |
|---|---|---|---|---|---|
| Our ensemble method (InceptionV3 + MobileNetV2) | | 98.97 ± 0.22 | 96.89 ± 0.65 | 96.90 ± 0.65 | 98.45 ± 0.32 |
| Ensemble method [ | | | | | |
| AlexNet | Trained on ODS-NPTW | 76.15 ± 0.37 | 28.48 ± 1.46 | 24.18 ± 1.82 | 64.36 ± 0.64 |
| | FT on ODS-NPTW | 92.69 ± 0.70 | 78.11 ± 2.20 | 77.77 ± 2.17 | 89.03 ± 1.03 |
| | FT on ODS-PTW | 97.41 ± 0.43 | 92.21 ± 1.25 | 92.20 ± 1.33 | 96.13 ± 0.62 |
| | FT on ADS-ALUF | | | | |
| SqueezeNet | Trained on ODS-NPTW | 75.28 ± 0.31 | 25.85 ± 0.95 | 14.00 ± 1.32 | 62.91 ± 0.69 |
| | FT on ODS-NPTW | 92.46 ± 0.49 | 77.30 ± 1.13 | 77.35 ± 1.15 | 88.69 ± 0.65 |
| | FT on ODS-PTW | 97.24 ± 0.38 | 91.77 ± 0.87 | 91.78 ± 0.95 | 95.87 ± 0.54 |
| | FT on ADS-ALUF | | | | |
| DenseNet201 | Trained on ODS-NPTW | 74.64 ± 1.43 | 23.73 ± 4.03 | 22.20 ± 4.68 | 61.99 ± 2.14 |
| | FT on ODS-NPTW | 87.46 ± 0.74 | 62.36 ± 2.49 | 62.55 ± 2.33 | 81.21 ± 1.01 |
| | FT on ODS-PTW | 97.36 ± 0.43 | 92.08 ± 1.35 | 92.04 ± 1.41 | 96.05 ± 0.65 |
| | FT on ADS-ALUF | | | | |
| MobileNetV2 | Trained on ODS-NPTW | 74.48 ± 0.57 | 23.46 ± 1.73 | 14.29 ± 0.85 | 61.74 ± 1.06 |
| | FT on ODS-NPTW | 92.33 ± 0.54 | 77.04 ± 1.06 | 76.68 ± 1.13 | 88.50 ± 0.74 |
| | FT on ODS-PTW | 97.91 ± 0.21 | 93.76 ± 0.63 | 93.71 ± 0.66 | 96.86 ± 0.32 |
| | FT on ADS-ALUF | | | | |
| InceptionV3 | Trained on ODS-NPTW | 72.16 ± 0.58 | 16.57 ± 1.33 | 16.31 ± 1.00 | 58.24 ± 0.59 |
| | FT on ODS-NPTW | 85.01 ± 0.41 | 55.02 ± 1.09 | 55.29 ± 1.17 | 77.53 ± 0.52 |
| | FT on ODS-PTW | 98.08 ± 0.45 | 94.19 ± 1.63 | 94.21 ± 1.60 | 97.11 ± 0.70 |
| | FT on ADS-ALUF | | | | |
ODS, ADS, NPTW, PTW, ALUF and FT stand for original dataset, augmented dataset, no pre-trained weights, pre-trained weights, all layers unfrozen, and fine-tuned, respectively. A • denotes that our deep learning ensemble model is statistically better than the competing model.
Fig. 5 Confusion matrices for 5-fold cross-validation.
Fig. 6 ROC curve of the deep ensemble model for 5-fold cross-validation.
Fig. 7 Learning curves for training and validation accuracy (blue and black dotted lines) and training and validation loss (orange and black dotted lines) of fold 2 of the fine-tuned, pre-trained ensemble model for various infections in X-rays.
A comparison of our model with previous approaches for COVID-19 classification.
(c) If the null hypothesis is rejected, the performance of our model is statistically different from that of the other model. We consider it a win for our model if its mean accuracy is greater than that of the competing model, and denote it by a •. Otherwise, it is a loss for our model, denoted by a ◦.
If the t-test does not reject the null hypothesis, the performance of our model is not statistically different from that of the other model, and we consider it a tie. A tie is represented by no symbol.
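The win/tie/loss rule above can be sketched in a few lines. With 5 splits the paired t statistic has 4 degrees of freedom, for which the standard two-sided critical value at α = 0.05 is 2.776; the accuracy values in the example call are hypothetical.

```python
import math
from statistics import mean, stdev

# Two-sided critical t value for df = n - 1 = 4 at alpha = 0.05
T_CRIT_DF4 = 2.776

def compare(ours, other):
    """Paired two-sided t-test over matched splits.
    Returns 'win' (•), 'loss' (◦), or 'tie' (no symbol) per the rule above."""
    diffs = [a - b for a, b in zip(ours, other)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:  # identical differences on every split
        t = math.inf if mean(diffs) != 0 else 0.0
    else:
        t = mean(diffs) / (sd / math.sqrt(n))
    if abs(t) <= T_CRIT_DF4:
        return "tie"  # null hypothesis not rejected
    return "win" if mean(ours) > mean(other) else "loss"

# Hypothetical per-split accuracies for our model vs. a competitor
result = compare([98.4, 98.7, 98.2, 98.5, 98.6],
                 [96.1, 96.5, 95.9, 96.3, 96.2])
# result → "win"
```

Using `scipy.stats.ttest_rel` and its p-value would be equivalent; the critical-value form is shown here to keep the sketch dependency-free.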
| Approach | Classes | COVID-19 Images | Balanced? | 5-fold CV? | Statistical Comparison? | Tuning Hyperparameters? | Accuracy |
|---|---|---|---|---|---|---|---|
| Our model | 4 | 1000 | | | | | 98.45% |
| COVID-Net [ | 3 | 180 | | | | | 93.3% |
| CoroNet [ | 4 | 284 | | | | | 89.6% |
| XGB [ | 4 | 130 | | | | | 79.52% |
| DELT [ | 4 | 305 | | | | | 90.13% |
| COVID-ResNet [ | 4 | 68 | | | | | 96.23% |