Rishav Pramanik1, Subhrajit Dey2, Samir Malakar3, Seyedali Mirjalili4,5, Ram Sarkar1.
Abstract
The novel coronavirus (COVID-19) has undoubtedly imprinted our lives with its deadly impact. Early testing combined with isolation of the individual is the best possible way to curb the spread of this deadly virus. Computer-aided diagnosis (CAD) provides an alternative and inexpensive option for screening for the virus. In this paper, we propose a convolutional neural network (CNN)-based CAD method for COVID-19 and pneumonia detection from chest X-ray images. We consider three input types for three identical base classifiers. To capture the maximum possible complementary features, we use the original RGB image, the red-channel image, and the original image stacked with its Roberts edge information. We then develop an ensemble strategy based on the technique for order preference by similarity to an ideal solution (TOPSIS) to aggregate the outcomes of the base classifiers. The overall framework, called TOPCONet, is very light compared with standard CNN models in terms of the number of trainable parameters required. TOPCONet achieves state-of-the-art results when evaluated on three publicly available datasets: (1) IEEE COVID-19 dataset + Kaggle Pneumonia dataset, (2) Kaggle Radiography dataset, and (3) COVIDx.
Year: 2022 PMID: 36104401 PMCID: PMC9471038 DOI: 10.1038/s41598-022-18463-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
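The abstract describes building three inputs per chest X-ray: the original RGB image, the red channel alone, and the image stacked with its Roberts edge map. A minimal numpy sketch of this idea follows; the helper names (`roberts_edges`, `make_inputs`) and the exact stacking choice are assumptions for illustration, not the authors' code:

```python
import numpy as np

def roberts_edges(gray):
    """Edge magnitude from the Roberts cross operator (2x2 diagonal differences)."""
    gx = gray[:-1, :-1] - gray[1:, 1:]   # kernel [[1,0],[0,-1]]
    gy = gray[:-1, 1:] - gray[1:, :-1]   # kernel [[0,1],[-1,0]]
    mag = np.hypot(gx, gy)
    return np.pad(mag, ((0, 1), (0, 1)))  # pad back to the original (H, W)

def make_inputs(rgb):
    """Build the three input variants from one RGB chest X-ray of shape (H, W, 3)."""
    gray = rgb.mean(axis=2)              # simple luminance proxy
    red = rgb[..., 0]                    # red channel only
    edges = roberts_edges(gray)
    stacked = np.dstack([rgb, edges])    # image + edge channel, shape (H, W, 4)
    return rgb, red, stacked
```

Each variant would then feed one of the three identical base classifiers, so the ensemble sees complementary views of the same scan.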
Figure 1. A pictorial representation of how a neuron of one layer connects with the neurons of another layer.
Figure 2. A pictorial description of a classical feed-forward neural network with one input layer of size 12, two hidden layers of size 8 each, and an output vector of length 4. This figure was generated using a tool developed by Alex LeNail: https://alexlenail.me/NN-SVG/.
Figure 3. The pipeline of the proposed model. The CNN architecture is the same for all three base classifiers. Images are taken from the public datasets found in [51,52].
Figure 4. Architecture of the proposed CNN model.

Distribution of the datasets used for training and evaluating the proposed TOPCONet model.
| Dataset | Source | Train: COVID-19 | Train: Normal | Train: Pneumonia | Test: COVID-19 | Test: Normal | Test: Pneumonia |
|---|---|---|---|---|---|---|---|
| Dataset-1 | Kaggle Pneumonia[ | 739 | 1072 | 3100 | 185 | 269 | 775 |
| Dataset-2 | Kaggle COVID-19 Radiography Database[ | 2893 | 8154 | 1076 | 723 | 2038 | 269 |
Figure 5. Validation accuracy (in %) with respect to the different learning rates used to train the customised CNN model.
Performance comparison of the three proposed base classifiers and TOPCONet with some state-of-the-art CNN models, using both transfer learning and training from scratch, on Dataset-1.
| Model | # Trainable parameters | Pre (%) | Rec (%) | F1-score (%) | RA (%) | Train time (s) | Test time (s) |
|---|---|---|---|---|---|---|---|
| SqueezeNet | 1,248,424 | 93.12 | 93.27 | 93.19 | 93.32 | 112.6 | 2.4 |
| SqueezeNet* | 1,248,424 | 92.24 | 92.04 | 92.14 | 92.35 | 226.3 | 4.5 |
| MobileNetV2 | 3,504,872 | 98.23 | 98.00 | 98.11 | 97.96 | 102.1 | 3.1 |
| MobileNetV2* | 3,504,872 | 98.39 | 97.65 | 98.25 | 98.12 | 448.4 | 4.5 |
| ResNet101 | 44,549,160 | 94.33 | 91.00 | 92.67 | 94.38 | 331.5 | 7.0 |
| ResNet101* | 44,549,160 | 96.83 | 92.70 | 94.41 | 95.03 | 779.2 | 6.8 |
| DenseNet121 | 7,978,856 | 97.67 | 96.67 | 96.67 | 97.96 | 184.9 | 6.0 |
| DenseNet121* | 7,978,856 | 97.57 | 98.12 | 97.84 | 97.80 | 637.5 | 5.7 |
| VGG-19 | 143,667,240 | 96.67 | 96.33 | 96.67 | 98.20 | 207.0 | 11.1 |
| VGG-19* | 143,667,240 | 95.14 | 97.53 | 96.21 | 96.41 | 625.6 | 8.3 |
| InceptionV3 | 27,161,264 | 96.33 | 96.67 | 96.67 | 96.69 | 209.5 | 4.7 |
| InceptionV3* | 27,161,264 | 95.5 | 96.43 | 95.94 | 96.58 | 430.9 | 7.6 |
| Classifier 1* | 306,467 | 97.34 | 97.00 | 97.67 | 97.96 | | |
| Classifier 2* | 339,235 | 97.67 | 97.34 | 97.67 | 98.04 | 68.4 | 1.5 |
| Classifier 3* | 355,619 | 98.34 | 98.00 | 98.37 | 98.53 | 86.2 | 2.4 |
| TOPCONet | 1,001,324 | | | | | N/A# | 4.9 |
For recall (Rec), precision (Pre) and F1-score, the micro-average method is used.
* after a CNN model's name indicates that the model was trained from scratch. # As the proposed ensemble method is non-trainable, the time required to run on the training split is not applicable (N/A).
RA indicates recognition accuracy; boldfaced numbers indicate the best scores.
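The reported precision, recall and F1-score are micro-averaged, i.e. true positives, false positives and false negatives are pooled over all classes before the metrics are computed. A small illustrative sketch (the function name is hypothetical):

```python
import numpy as np

def micro_scores(y_true, y_pred, n_classes):
    """Micro-averaged precision/recall/F1: pool TP, FP, FN across classes."""
    tp = fp = fn = 0
    for c in range(n_classes):
        pred_c = (y_pred == c)
        true_c = (y_true == c)
        tp += np.sum(pred_c & true_c)    # correctly predicted as class c
        fp += np.sum(pred_c & ~true_c)   # predicted c but actually another class
        fn += np.sum(~pred_c & true_c)   # class c missed by the predictor
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Because every false positive for one class is a false negative for another, micro-averaging weights each test sample equally rather than each class.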
Performance comparison of the three proposed base classifiers and TOPCONet with some state-of-the-art CNN models, using both transfer learning and training from scratch, on Dataset-2.
| Model | # Trainable parameters | Pre (%) | Rec (%) | F1-score (%) | RA (%) | Train time (s) | Test time (s) |
|---|---|---|---|---|---|---|---|
| SqueezeNet | 1,248,424 | 95.35 | 95.20 | 95.27 | 95.32 | 165.4 | 5.5 |
| SqueezeNet* | 1,248,424 | 94.56 | 94.23 | 94.39 | 94.78 | 556.6 | 4.9 |
| MobileNetV2 | 3,504,872 | 95.96 | 95.88 | 95.92 | 96.14 | 270.5 | 5.0 |
| MobileNetV2* | 3,504,872 | 95.93 | 97.26 | 96.57 | 97.6 | 1047.0 | 5.2 |
| ResNet101 | 44,549,160 | 95.20 | 95.45 | 95.32 | 95.48 | 693.3 | 11.9 |
| ResNet101* | 44,549,160 | 83.79 | 89.40 | 86.31 | 88.01 | 1901.0 | 11.8 |
| DenseNet121 | 7,978,856 | 97.00 | 96.67 | 96.67 | 96.30 | 453.7 | 9.4 |
| DenseNet121* | 7,978,856 | 97.10 | 94.90 | 95.97 | 96.96 | 1357.0 | 9.0 |
| VGG-19 | 143,667,240 | 96.67 | 96.00 | 96.34 | 96.08 | 499.6 | 9.4 |
| VGG-19* | 143,667,240 | 93.51 | 97.37 | 95.31 | 96.03 | 1508.0 | 13.6 |
| InceptionV3 | 27,161,264 | 95.01 | 95.11 | 95.06 | 95.24 | 390.7 | 7.4 |
| InceptionV3* | 27,161,264 | 93.42 | 94.98 | 94.1 | 95.1 | 1044.0 | 7.7 |
| Classifier 1* | 306,467 | 97.30 | 96.87 | 97.08 | 97.92 | | |
| Classifier 2* | 339,235 | 97.04 | 97.08 | 97.06 | 97.98 | 189.9 | 2.7 |
| Classifier 3* | 355,619 | 97.37 | 97.76 | 97.57 | 98.28 | 165 | 5.5 |
| TOPCONet | 1,001,324 | | | | | N/A# | 11.2 |
For recall (Rec), precision (Pre) and F1-score, the micro-average method is used.
* after a CNN model's name indicates that the model was trained from scratch. # As the proposed ensemble method is non-trainable, the time required to run on the training split is not applicable (N/A).
RA indicates recognition accuracy; boldfaced numbers indicate the best scores.
Performance comparison of the TOPSIS aided ensemble method with other popular ensemble methods.
| Category | Method | Dataset-1: Precision | Dataset-1: Recall | Dataset-1: F1-score | Dataset-1: Accuracy | Dataset-2: Precision | Dataset-2: Recall | Dataset-2: F1-score | Dataset-2: Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| Hard-voting | Majority voting | 98.40 | 97.76 | 98.42 | 98.45 | 96.74 | 97.68 | 97.17 | 98.21 |
| Soft-voting | Product rule | 98.34 | 97.34 | 98.00 | 98.37 | 98.10 | 97.97 | 98.04 | 98.42 |
| Soft-voting | Sum rule | 98.65 | 97.98 | 98.32 | 98.62 | 97.90 | 97.78 | 97.85 | 98.32 |
| Soft-voting | Weighted average | 98.65 | 97.98 | 98.32 | 98.62 | 97.90 | 97.78 | 97.84 | 98.32 |
| TOPSIS-aided | | 98.47 | 98.26 | 98.37 | 98.78 | 97.84 | 97.85 | 97.85 | 98.61 |
All the scores reported are in %.
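The TOPSIS-aided ensemble ranks candidate outcomes by their closeness to an ideal solution rather than simply voting or averaging. A generic TOPSIS sketch, under the assumption that each row of the decision matrix is a class label and each column a base classifier's confidence (the uniform weights and vector normalisation here are illustrative defaults, not necessarily the paper's exact formulation):

```python
import numpy as np

def topsis_rank(decision, weights=None, benefit=None):
    """Score alternatives (rows) against criteria (columns) with TOPSIS.

    Returns the relative closeness of each alternative to the ideal
    solution; argmax gives the top-ranked alternative.
    """
    d = np.asarray(decision, dtype=float)
    n_alt, n_crit = d.shape
    if weights is None:
        weights = np.full(n_crit, 1.0 / n_crit)   # equal criterion weights
    if benefit is None:
        benefit = np.ones(n_crit, dtype=bool)     # higher is better everywhere

    # vector-normalise each criterion, then apply the weights
    v = (d / np.linalg.norm(d, axis=0)) * weights

    # ideal best and worst values per criterion
    best = np.where(benefit, v.max(axis=0), v.min(axis=0))
    worst = np.where(benefit, v.min(axis=0), v.max(axis=0))

    # Euclidean distances to both ideals, then relative closeness
    s_best = np.linalg.norm(v - best, axis=1)
    s_worst = np.linalg.norm(v - worst, axis=1)
    return s_worst / (s_best + s_worst)
```

For example, a class scored highest by all three base classifiers dominates every criterion, so its closeness is 1 and it wins the ranking.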
Time required to process 16 sample images at a time. The training time reported is the total time summed over all training epochs. All times are averaged over five independent runs.
| Process | Time to train (in ms) | Time to test (in ms) |
|---|---|---|
| Pre-processing | 240 | 240 |
| Classifier 1 | 350 | 12 |
| Classifier 2 | 375 | 13 |
| Classifier 3 | 425 | 14 |
| TOPSIS ensemble | – | 4 |
Accuracies (in %) obtained using a 5-fold cross-validation scheme.
| Fold | Dataset-1: Classifier 1 | Dataset-1: Classifier 2 | Dataset-1: Classifier 3 | Dataset-1: Ensemble | Dataset-2: Classifier 1 | Dataset-2: Classifier 2 | Dataset-2: Classifier 3 | Dataset-2: Ensemble |
|---|---|---|---|---|---|---|---|---|
| Fold1 | 96.09 | 96.42 | 95.84 | 96.58 | 96.43 | 96.46 | 95.84 | 96.83 |
| Fold2 | 97.79 | 97.74 | 97.55 | 98.12 | 95.64 | 95.60 | 95.80 | 96.27 |
| Fold3 | 97.02 | 96.58 | 97.72 | 97.80 | 97.46 | 97.52 | 97.72 | 97.85 |
| Fold4 | 97.71 | 97.79 | 97.71 | 98.24 | 96.73 | 96.86 | 96.93 | 96.96 |
| Fold5 | 96.57 | 96.74 | 96.82 | 97.06 | 95.59 | 95.95 | 95.16 | 96.13 |
| Maximum | 97.79 | 97.79 | 97.72 | 98.24 | 97.46 | 97.52 | 97.72 | 97.85 |
| Minimum | 96.09 | 96.42 | 95.84 | 96.58 | 95.59 | 95.60 | 95.16 | 96.13 |
| Average | 97.04 | 97.05 | 97.13 | 97.56 | 96.37 | 96.48 | 96.29 | 96.80 |
| SD | 0.65 | 0.59 | 0.72 | 0.64 | 0.70 | 0.67 | 0.91 | 0.61 |
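The SD row appears to be the population standard deviation (ddof = 0) of the five fold accuracies; for the Dataset-1 ensemble column this can be checked directly:

```python
import numpy as np

# Per-fold ensemble accuracies on Dataset-1, taken from the table above
folds = np.array([96.58, 98.12, 97.80, 98.24, 97.06])

mean = folds.mean()   # matches the reported average of 97.56
sd = folds.std()      # population SD (ddof=0), matches the reported 0.64
```

Using `ddof=1` (the sample standard deviation) would give a slightly larger value, so the ddof = 0 convention is what reproduces the table.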
Performance comparison of the proposed method with some state-of-the-art models on Dataset-1.
| Work Ref. | Technique | Experimental protocol | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|
| Khan et al.[ | CoroNet | Hold-out test set | 95.00 | 96.90 | 95.94 | 95.00 |
| Jain et al.[ | Xception | Hold-out test set | 98.00 | 94.60 | 96.00 | 97.00 |
| Hussain et al.[ | CoroDet | 5-Fold cross validation | 96.34 | 96.00 | 96.00 | 96.66 |
| Ismael et al.[ | End to end CNN | Hold-out test set | 95.67 | 94.67 | 95.00 | 96.09 |
| Das et al.[ | Bi level prediction | Hold-out test set | 97.87 | 98.14 | 98.00 | 98.45 |
| Goel et al.[ | OptCoNet | Hold-out test set | 92.88 | 96.25 | 95.25 | 97.78 |
| Paul et al.[ | Inverted bell ensemble | Hold-out test set | 97.21 | 97.81 | 97.50 | 97.97 |
| Paul et al.[ | Inverted bell ensemble | 5-Fold cross validation | 97.12 | 97.62 | 97.39 | 97.85 |
| Gour et al.[ | UA-ConvNet model | 5-Fold cross validation | 98.49 | 98.26 | 98.36 | 98.09 |
| Gour et al.[ | Stacked CNN model | 5-Fold cross validation | 97.62 | 98.52 | 97.50 | 97.27 |
| Hasoon et al.[ | LBP-KNN model | 5-Fold cross validation | 97.80 | 100.0 | 98.88 | 98.70 |
| Bashar et al.[ | Optimized CNN model | Hold-out test set | 95.67 | 93.34 | 94.67 | 95.20 |
| Senan et al.[ | ResNet50 | Hold-out test set | 98.00 | 98.67 | 98.67 | 98.70 |
| Naeem et al.[ | CNN-LSTM model | Hold-out test set | 95.00 | 95.00 | 95.00 | 96.60 |
| Goyal et al.[ | F-RRN-LSTM model | Hold-out test set | 88.89 | 95.41 | 92.03 | 94.31 |
| Proposed | TOPCONet | Hold-out test set | 98.67 | 98.00 | 98.34 | 98.78 |
| Proposed | TOPCONet | 5-Fold cross validation | 98.10 | 97.53 | 97.81 | 98.24 |
Performance comparison of the proposed method with some state-of-the-art models on Dataset-2.
| Work Ref. | Technique | Experimental protocol | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|
| Aslan et al.[ | mAlexNet+BiLSTM | Hold-out test set | 98.77 | 98.76 | 98.76 | 98.70 |
| Aslan et al.[ | mAlexNet | Hold-out test set | 98.16 | 98.26 | 98.20 | 98.14 |
| Ouchicha et al.[ | CVDNet | 5-Fold cross validation | 96.72 | 96.84 | 96.68 | 96.69 |
| Kedia et al.[ | CoVNet-19 | Hold-out test set | 98.34 | 98.34 | 98.34 | 98.20 |
| Ahmad et al.[ | InceptionV3+MobileNetV2 | 5-Fold cross validation | 97.56 | 97.54 | 97.55 | 98.77 |
| Chowdhury et al.[ | CheXNet | 5-Fold cross validation | 96.61 | 96.61 | 96.61 | 97.74 |
| Sedik et al.[ | ConvLSTM | Hold-out test set | 94.67 | 97.09 | 95.64 | 95.96 |
| Wu et al.[ | ULNet | 5-Fold cross validation | 96.93 | 96.60 | 96.60 | 95.25 |
| Panetta et al.[ | Classical Fibonacci p-pattern | Hold-out test set | 97.78 | 96.90 | 97.32 | 97.79 |
| Panetta et al.[ | Shape dependent Fibonacci p-pattern | Hold-out test set | 97.20 | 96.76 | 96.69 | 98.03 |
| Yang et al.[ | Fast.AI ResNet | Hold-out test set | 97.00 | 97.00 | 97.00 | 97.00 |
| Paul et al.[ | Inverted bell ensemble | Hold-out test set | 97.24 | 97.25 | 97.24 | 97.64 |
| Paul et al.[ | Inverted bell ensemble | 5-Fold cross validation | 96.84 | 96.72 | 96.72 | 97.12 |
| Gour et al.[ | UA-ConvNet model | 5-Fold cross validation | 99.51 | 98.00 | 98.73 | 98.90 |
| Roy et al.[ | CoWarriorNet | Hold-out test set | 94.66 | 91.33 | 92.66 | 97.80 |
| Bashar et al.[ | Optimized CNN model | Hold-out test set | 97.00 | 93.67 | 95.67 | 96.55 |
| Senan et al.[ | ResNet50 model | Hold-out test set | 97.00 | 97.67 | 97.67 | 98.01 |
| Goyal & Singh[ | F-RRN-LSTM model | Hold-out test set | 93.65 | 96.78 | 95.19 | 95.04 |
| Proposed | TOPCONet | Hold-out test set | 97.84 | 97.85 | 97.85 | 98.61 |
| Proposed | TOPCONet | 5-Fold cross validation | 96.99 | 96.99 | 96.99 | 97.85 |
Performance of the TOPCONet model and its sub-modules, along with some state-of-the-art models, on the COVIDx dataset.
| Work Ref. | Technique | # Trainable parameters | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|
| Jain et al.[ | Xception | 22,910,480 | 79.50 | 74.00 | 73.00 | 74.25 |
| Chowdhury et al.[ | CheXNet | 20,242,984 | 82.50 | 75.00 | 73.50 | 74.75 |
| Bashar et al.[ | Optimized CNN | 138,357,544 | 87.50 | 85.50 | 85.00 | 85.50 |
| Senan et al.[ | ResNet50 | 25,636,712 | 80.50 | 77.00 | 76.50 | 77.25 |
| Dey et al.[ | CovidConvLSTM | 363,996,809 | 88.71 | 86.75 | 86.58 | 86.75 |
| Proposed | Classifier 1 | 306,467 | 94.00 | 92.61 | 93.30 | 93.25 |
| Proposed | Classifier 2 | 339,235 | 92.00 | 92.46 | 92.23 | 92.25 |
| Proposed | Classifier 3 | 355,619 | 81.00 | 69.53 | 74.83 | 72.75 |
| Proposed | TOPCONet | 1,001,324 | | | | |
The best values are highlighted in bold.
Figure 6. Confusion matrices obtained on Dataset-1, Dataset-2 and the COVIDx dataset.
Figure 7. Grad-CAM images along with the original chest X-ray images for base classifier 2 trained on Dataset-1. Images are taken from the public datasets found in [51,52].