| Literature DB >> 33994894 |
Muhammad Owais, Hyo Sik Yoon, Tahir Mahmood, Adnan Haider, Haseeb Sultan, Kang Ryoung Park.
Abstract
Currently, the coronavirus disease 2019 (COVID19) pandemic has killed more than one million people worldwide. In the present outbreak, radiological imaging modalities such as computed tomography (CT) and X-rays are being used to diagnose this disease, particularly in the early stage. However, the assessment of radiographic images involves a subjective evaluation that is time-consuming and requires substantial clinical skill. Nevertheless, the recent evolution of artificial intelligence (AI) has further strengthened computer-aided diagnosis tools and supported medical professionals in making effective diagnostic decisions. Therefore, in this study, the strength of various AI algorithms was analyzed to diagnose COVID19 infection from large-scale radiographic datasets. Based on this analysis, a lightweight deep network is proposed, which is the first ensemble design (based on MobileNet, ShuffleNet, and FCNet) in the medical domain (particularly for COVID19 diagnosis) that requires a reduced number of trainable parameters (a total of 3.16 million) while outperforming various existing models. Moreover, a multilevel activation visualization layer added to the proposed network visualizes lesion patterns as multilevel class activation maps (ML-CAMs) along with the diagnostic result (COVID19 positive or negative). This additional output provides visual insight into the computer's decision and may assist radiologists in validating it, particularly in uncertain situations. Additionally, a novel hierarchical training procedure was adopted to train the proposed network. It trains the network for an adaptive number of epochs, determined from the validation dataset, rather than a fixed number of epochs. The quantitative results show the better performance of the proposed training method over the conventional end-to-end training procedure.
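The adaptive-epoch idea can be sketched as a simple validation-driven stopping rule. This is a minimal illustration only; `train_one_epoch`, `validation_score`, and the `patience` threshold are hypothetical stand-ins, not the paper's actual procedure.

```python
def train_adaptive(train_one_epoch, validation_score, patience=3, max_epochs=100):
    """Train while the validation score keeps improving; stop once it has
    not improved for `patience` consecutive epochs (hypothetical sketch)."""
    best_score, best_epoch = float("-inf"), 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(epoch)            # one pass over the training set
        score = validation_score(epoch)   # e.g. F1 on the validation split
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            break                         # validation stopped improving
    return best_epoch, best_score
```

The number of epochs actually run thus depends on the validation data, in contrast to a fixed-epoch end-to-end schedule.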
A large collection of CT-scan and X-ray datasets (drawn from six publicly available datasets) was used to evaluate the performance of the proposed model and other baseline methods. The experimental results of the proposed network show promising diagnostic performance: an average F1 score (F1) of 94.60% and 95.94% and an area under the curve (AUC) of 97.50% and 97.99% are achieved for the CT-scan and X-ray datasets, respectively. Finally, a detailed comparative analysis reveals that the proposed model outperforms various state-of-the-art methods in terms of both quantitative and computational performance.
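As a consistency check on the reported numbers, F1 is the harmonic mean of precision and recall (standard definition, not paper-specific code):

```python
def f1_from(avg_precision, avg_recall):
    # F1 is the harmonic mean of (average) precision and recall
    return 2 * avg_precision * avg_recall / (avg_precision + avg_recall)
```

For instance, the X-ray precision and recall reported later in this record (AP 95.68, AR 96.20) yield an F1 of about 95.94, matching the abstract.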
Keywords: COVID19; Computer-aided diagnosis; Deep learning; ML-CAMs; Medical image classification
Year: 2021 PMID: 33994894 PMCID: PMC8103783 DOI: 10.1016/j.asoc.2021.107490
Source DB: PubMed Journal: Appl Soft Comput ISSN: 1568-4946 Impact factor: 6.725
Comparative summary of the proposed and the existing state-of-the-art methods for automatic diagnosis of COVID19 pneumonia. (V.C: Varicella, S.T: Streptococcus, P.M: Pneumocystis, T.B: Tuberculosis, M.R: MERS, S.R: SARS, P.N: Pneumonia, B.P: Bacterial pneumonia, V.P: Viral pneumonia, ACC: Accuracy, F1: F1 score, AP: Average precision, AR: Average recall, Spec: Specificity, Kap: Kappa statistics, Others: Combination of different diseased and normal classes besides COVID19).
| Literature | Method | Class name (No. of classes) | COVID19+ images | COVID19− images | Imaging modality | Result (%) | Strength | Limitation |
|---|---|---|---|---|---|---|---|---|
| Ghoshal et al. | Bayesian CNN | COVID19/B.P/V.P/ | 68 | 5873 | X-ray | ACC: 89.82 | Enhanced detection performance compared to standard ResNet | Limited COVID19 data samples |
| Oh et al. | FC-DenseNet+ResNet18 | COVID19/B.P/V.P/T.B/ Normal | 180 | 322 | X-ray | ACC: 88.9, F1: 84.4, AP: 83.4, AR: 85.9, Spec: 96.4 | Provides clinically interpretable saliency maps | Limited COVID19 data samples |
| Das et al. | Truncated InceptionNet | COVID19/P.N/T.B/ | 162 | 6683 | X-ray | ACC: 98.7, F1: 97, AP: 99, AR: 95, Spec: 99, AUC: 99 | Enhanced computational performance compared to standard InceptionNet | Limited COVID19 data samples |
| Singh et al. | MODE-based CNN | COVID19/Others | 69 | 63 | CT | ACC: 93.5, F1: 89.9, AR: 90.75, Spec: 90.8, Kap: 90.5 | Applicable in real-time screening | Limited dataset |
| Pereira et al. | Texture Descriptors and InceptionNet | COVID19/M.R/S.R/ V.C/S.T/P.M/Normal | 180 | 2108 | X-ray | F1: 89 | High COVID19 recognition rate | Limited COVID19 data samples |
| Khan et al. | CoroNet | COVID19/B.P/V.P/ Normal | 284 | 967 | X-ray | ACC: 89.6, F1: 89.8, AP: 90, AR: 89.92, Spec: 96.4 | High COVID19 detection rate compared to other classes | Limited COVID19 data samples |
| Asnaoui et al. | InceptionResNet | COVID19/B.P/Normal | 231 | 5856 | X-ray | F1: 92.08, AP: 92.38, AR: 92.11, Spec: 96.06 | Detailed performance analysis under the same experimental protocol | Low COVID19 detection rate compared to other classes |
| Brunese et al. | VGG16 | COVID19/P.N/Normal | 250 | 6273 | X-ray | ACC: 97 | High COVID19 detection rate compared to other classes | Lack of ablation study |
| Han et al. | AD3D-MIL | COVID19/P.N/Normal | 230 | 230 | CT | ACC: 94.3, F1: 92.3, AP: 95.9, AR: 90.5, AUC: 98.8, Kap: 91.1 | Provides clinically interpretable saliency maps | Requires high computational power |
| Mahmud et al. | CovXNet | COVID19/B.P/V.P/ Normal | 305 | 915 | X-ray | ACC: 90.2, F1: 90.4, AP: 90.8, AR: 89.9, Spec: 89.1, AUC: 91 | Generates clinically interpretable activation maps | Limited dataset |
| Proposed | Ensemble-Net | COVID19/Others (2) | 3296 | 4143 | X-ray | ACC: 95.83, F1: 95.94, AP: 95.68, AR: 96.20, AUC: 97.99 | Reduced number of trainable parameters; visualizes clinically interpretable ML-CAMs | Training time is longer than the conventional end-to-end training method |
| Proposed | Ensemble-Net | COVID19/Others (2) | 3254 | 2217 | CT | ACC: 94.72, F1: 94.60, AP: 95.22, AR: 94.00, AUC: 97.50 | Reduced number of trainable parameters; visualizes clinically interpretable ML-CAMs | Training time is longer than the conventional end-to-end training method |
Fig. 1. Overall workflow of the proposed framework, showing the progression of the training and testing phases separately.
Fig. 2. Architecture of the proposed ensemble network based on MobileNet, ShuffleNet, and FCNet. Both MobileNet and ShuffleNet are connected in parallel, and their outputs are simultaneously cascade-connected to FCNet.
Fig. 3. MobileNet and ShuffleNet basic building blocks: (a) Mobile Unit-A, (b) Mobile Unit-B, (c) Shuffle Unit-A, and (d) Shuffle Unit-B. (BN: Batch normalization, ReLU: Rectified linear unit, CReLU: Clipped ReLU, Ch. Shuf. Opr: Channel shuffle operation).
Layer-wise configuration details of the proposed network. (Itr.: Iterations, #Filt.: Number of filters, Str.: Stride value, #Par.: Number of trainable parameters).
| Subnetwork | Layer name | Itr. | Input size | Output size | Filter size | #Filt. | Str. | #Par. |
|---|---|---|---|---|---|---|---|---|
| (A) MobileNet | Input | – | 224 × 224 | n/a | n/a | n/a | n/a | n/a |
| | Conv | 1 | 224 × 224 | 112 × 112 | 3 × 3 | 32 | 2 | 960 |
| | DW-conv | 1 | 112 × 112 | 112 × 112 | 3 × 3 | 32 | 1 | 384 |
| | Conv | 1 | 112 × 112 | 112 × 112 | 1 × 1 | 16 | 1 | 560 |
| | Mobile Unit-A | 1 | 112 × 112 | 56 × 56 | 1 × 1, 3 × 3, 1 × 1 | 96, 96, 24 | 1, 2, 1 | 5352 |
| | Mobile Unit-B | 1 | 56 × 56 | 56 × 56 | 1 × 1, 3 × 3, 1 × 1 | 144, 144, 24 | 1, 1, 1 | 9144 |
| | Mobile Unit-A | 1 | 56 × 56 | 28 × 28 | 1 × 1, 3 × 3, 1 × 1 | 144, 144, 32 | 1, 2, 1 | 10,320 |
| | Mobile Unit-B | 2 | 28 × 28 | 28 × 28 | 1 × 1, 3 × 3, 1 × 1 | 192, 192, 32 | 1, 1, 1 | 30,528 |
| | Mobile Unit-A | 1 | 28 × 28 | 14 × 14 | 1 × 1, 3 × 3, 1 × 1 | 192, 192, 64 | 1, 2, 1 | 21,504 |
| | Mobile Unit-B | 3 | 14 × 14 | 14 × 14 | 1 × 1, 3 × 3, 1 × 1 | 384, 384, 64 | 1, 1, 1 | 165,312 |
| | Mobile Unit-A | 1 | 14 × 14 | 14 × 14 | 1 × 1, 3 × 3, 1 × 1 | 384, 384, 96 | 1, 1, 1 | 67,488 |
| | Mobile Unit-B | 2 | 14 × 14 | 14 × 14 | 1 × 1, 3 × 3, 1 × 1 | 576, 576, 96 | 1, 1, 1 | 239,040 |
| | Mobile Unit-A | 1 | 14 × 14 | 7 × 7 | 1 × 1, 3 × 3, 1 × 1 | 576, 576, 160 | 1, 2, 1 | 156,576 |
| | Mobile Unit-B | 2 | 7 × 7 | 7 × 7 | 1 × 1, 3 × 3, 1 × 1 | 960, 960, 160 | 1, 1, 1 | 644,160 |
| | Mobile Unit-A | 1 | 7 × 7 | 7 × 7 | 1 × 1, 3 × 3, 1 × 1 | 960, 960, 320 | 1, 1, 1 | 476,160 |
| | Conv | 1 | 7 × 7 | 7 × 7 | 1 × 1 | 1280 | 1 | 413,440 |
| | Avg Pooling | 1 | 7 × 7 | 1 × 1 | 7 × 7 | 1 | 1 | – |
| (B) ShuffleNet | Conv | 1 | 224 × 224 | 112 × 112 | 3 × 3 | 24 | 2 | 720 |
| | Max Pooling | 1 | 112 × 112 | 56 × 56 | 3 × 3 | 1 | 2 | – |
| | Shuffle Unit-A | 1 | 56 × 56 | 28 × 28 | 1 × 1, 3 × 3, 1 × 1 | 112, 112, 112 | 1, 2, 1 | 5824 |
| | Shuffle Unit-B | 3 | 28 × 28 | 28 × 28 | 1 × 1, 3 × 3, 1 × 1 | 136, 136, 136 | 1, 1, 1 | 35,088 |
| | Shuffle Unit-A | 1 | 28 × 28 | 14 × 14 | 1 × 1, 3 × 3, 1 × 1 | 136, 136, 136 | 1, 2, 1 | 11,696 |
| | Shuffle Unit-B | 7 | 14 × 14 | 14 × 14 | 1 × 1, 3 × 3, 1 × 1 | 272, 272, 272 | 1, 1, 1 | 293,216 |
| | Shuffle Unit-A | 1 | 14 × 14 | 7 × 7 | 1 × 1, 3 × 3, 1 × 1 | 272, 272, 272 | 1, 2, 1 | 41,888 |
| | Shuffle Unit-B | 3 | 7 × 7 | 7 × 7 | 1 × 1, 3 × 3, 1 × 1 | 544, 544, 544 | 1, 1, 1 | 473,280 |
| | Avg Pooling | 1 | 7 × 7 | 1 × 1 | 7 × 7 | 1 | 1 | – |
| (C) FCNet | Depth Concatenation | 1 | 1 × 1 | 1 × 1 | – | – | – | – |
| | FC1 | 1 | 1 × 1 | 1 × 1 | – | – | – | 58,400 |
| | FC2 | 1 | 1 × 1 | 1 × 1 | – | – | – | 66 |
| | SoftMax | 1 | 1 × 1 | 1 × 1 | – | – | – | – |
| | Classification | 1 | 1 × 1 | – | – | – | – | – |
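The per-layer parameter counts in the table are consistent with counting convolution weights plus one bias and two batch-normalization parameters (scale and offset) per output channel. A small sketch of that count (an inference from the tabulated numbers, assuming standard 3 × 3 and 1 × 1 kernels; not code from the paper):

```python
def conv_params(k, c_in, c_out, depthwise=False):
    """Parameters of a k x k convolution followed by batch normalization:
    weights + bias + BN scale/offset (2 extra per output channel).
    A depthwise convolution has one filter slice per channel, not c_in."""
    weights = k * k * (1 if depthwise else c_in) * c_out
    return weights + c_out + 2 * c_out
```

For example, MobileNet's first 3 × 3 convolution over a 3-channel input with 32 filters gives 3·3·3·32 + 32 + 64 = 960, matching the table entry.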
Brief description of datasets selected in this study.
| Modality | Dataset | COVID19+ #Images | COVID19+ #Patients | COVID19− #Images | COVID19− #Patients |
|---|---|---|---|---|---|
| CT | BIMCV COVID19 | 2905 | 1311 | – | – |
| | COVID-CT | 349 | 349 | 397 | 397 |
| | Cancer Archive | – | – | 1820 | 732 |
| X-ray | BIMCV COVID19 | 3296 | 1311 | – | – |
| | Shenzhen | – | – | 662 | 662 |
| | Montgomery | – | – | 138 | 138 |
| | CoronaHack | – | – | 3343 | 3343 |
Fig. 4. Example images from the selected datasets to show the visual difference between COVID19 positive and negative cases: (a) CT scan images; (b) X-ray images.
Summary of the total number of data samples included in the training, validation, and testing datasets.
| Modality | Data splitting | COVID19+ #Images | COVID19+ #Patients | COVID19− #Images | COVID19− #Patients |
|---|---|---|---|---|---|
| CT | Training | 2278 | 1162 | 1552 | 790 |
| | Validation | 325 | 166 | 222 | 113 |
| | Testing | 651 | 332 | 443 | 226 |
| X-ray | Training | 2307 | 918 | 2900 | 2900 |
| | Validation | 330 | 131 | 414 | 414 |
| | Testing | 659 | 262 | 829 | 829 |
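The image counts above are consistent with an approximate 70%/10%/20% training/validation/testing division within each class (an inference from the numbers, not a protocol stated in this record):

```python
def split_sizes(n_total, train_frac=0.7, val_frac=0.1):
    """Round the training and validation counts; the remainder is the test set."""
    n_train = round(n_total * train_frac)
    n_val = round(n_total * val_frac)
    return n_train, n_val, n_total - n_train - n_val
```

For the 3254 COVID19-positive CT images this gives (2278, 325, 651), exactly the tabulated split.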
Quantitative results of the proposed network including the ablated performance of each subnetwork.
| Modality | Network | ACC | F1 | AP | AR | AUC |
|---|---|---|---|---|---|---|
| CT | ShuffleNet | 91.65 ± 6.14 | 91.52 ± 6.10 | 92.69 ± 4.41 | 90.40 ± 7.68 | 95.61 ± 5.61 |
| | MobileNet | 92.95 ± 5.25 | 92.85 ± 5.27 | 93.81 ± 3.93 | 91.90 ± 6.57 | 96.51 ± 3.68 |
| | MobShufNet | 93.11 ± 7.64 | 93.07 ± 7.46 | 94.25 ± 5.25 | 92.00 ± 9.43 | 96.57 ± 5.82 |
| X-ray | ShuffleNet | 94.97 ± 0.40 | 95.03 ± 0.46 | 94.81 ± 0.41 | 95.30 ± 0.51 | 97.53 ± 0.38 |
| | MobileNet | 94.65 ± 1.49 | 94.68 ± 1.58 | 94.51 ± 1.42 | 94.90 ± 1.74 | 97.51 ± 0.46 |
| | MobShufNet | 95.07 ± 1.39 | 95.10 ± 1.46 | 94.91 ± 1.35 | 95.30 ± 1.58 | 97.69 ± 0.42 |
Fig. 5. Receiver operating characteristic curves of ShuffleNet, MobileNet, MobShufNet, and the proposed network. The true positive rate (TPR) is plotted against the false positive rate (FPR) of each network at distinct thresholds from 0 to 1 in 0.001 increments.
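AUC from such a threshold sweep is typically computed as the trapezoidal area under the (FPR, TPR) points; a generic sketch, not the paper's exact implementation:

```python
def auc_trapezoid(fpr, tpr):
    """Trapezoidal area under a ROC curve from paired (FPR, TPR) samples."""
    pts = sorted(zip(fpr, tpr))  # order the points by increasing FPR
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
```

A chance-level diagonal yields 0.5 and a perfect classifier yields 1.0, which is why the reported AUCs near 97–98% indicate strong separability.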
Fig. 6. Confusion matrices of the proposed network and each subnetwork. These results present the individual performance of each network in terms of the actual number of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) data samples. (Note: In each confusion matrix, Top-left-box: TP, Top-right-box: FN, Bottom-left-box: FP, Bottom-right-box: TN).
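The tabulated metrics follow directly from those four confusion-matrix counts (standard definitions, using the paper's AP/AR naming for precision/recall):

```python
def metrics_from_confusion(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)      # AP in this record's notation
    recall = tp / (tp + fn)         # AR
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)
    return acc, f1, precision, recall, specificity
```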
Comparative results of the adopted hierarchical training procedure vs. end-to-end training method.
| Modality | Training procedure | ACC | F1 | AP | AR | AUC |
|---|---|---|---|---|---|---|
| CT | End-to-end | 91.23 ± 5.84 | 91.06 ± 5.88 | 92.48 ± 4.66 | 89.70 ± 7.02 | 95.73 ± 4.57 |
| | Hierarchical | 94.72 | 94.60 | 95.22 | 94.00 | 97.50 |
| X-ray | End-to-end | 95.75 ± 0.37 | 95.83 ± 0.41 | 95.59 ± 0.38 | 96.10 ± 0.43 | 97.88 ± 0.17 |
| | Hierarchical | 95.83 | 95.94 | 95.68 | 96.20 | 97.99 |
Quantitative performance comparison of the proposed network with the state-of-the-art deep learning methods. (#Par: Total number of parameters).
| Study | Method | #Par (Million) | CT: ACC | CT: F1 | CT: AP | CT: AR | CT: AUC | X-ray: ACC | X-ray: F1 | X-ray: AP | X-ray: AR | X-ray: AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Minaee et al. | SqueezeNet | 1.24 | 89.84 | 89.48 | 89.91 | 89.06 | 93.86 | 93.51 | 93.56 | 93.43 | 93.70 | 97.03 |
| Brunese et al. | VGG16 | 134.27 | 89.66 | 89.54 | 91.43 | 87.81 | 92.35 | 95.79 | 95.91 | 95.65 | 96.18 | 97.79 |
| Khan et al. | VGG19 | 139.58 | 91.54 | 91.33 | 92.26 | 90.47 | 94.54 | 95.30 | 95.39 | 95.14 | 95.65 | 97.84 |
| Martínez et al. | NASNet | 4.27 | 93.68 | 93.49 | 94.19 | 92.82 | 96.67 | 94.06 | 94.06 | 93.89 | 94.23 | 97.07 |
| Misra et al. | ResNet18 | 11.18 | 92.96 | 92.76 | 93.41 | 92.14 | 95.06 | 95.59 | 95.69 | 95.44 | 95.95 | 97.79 |
| Farooq et al. | ResNet50 | 23.54 | 90.30 | 90.22 | 92.17 | 88.53 | 92.79 | 94.73 | 94.77 | 94.62 | 94.92 | 97.66 |
| Ardakani et al. | ResNet101 | 42.56 | 90.30 | 90.26 | 92.17 | 88.64 | 95.71 | 94.53 | 94.58 | 94.44 | 94.72 | 97.19 |
| Jaiswal et al. | DenseNet201 | 18.11 | 94.17 | 94.03 | 94.63 | 93.46 | 97.36 | 93.41 | 93.39 | 93.38 | 93.39 | 97.31 |
| Hu et al. | ShuffleNet | 0.86 | 91.65 | 91.52 | 92.69 | 90.44 | 95.61 | 94.97 | 95.03 | 94.81 | 95.25 | 97.53 |
| Apostolopoulos et al. | MobileNetV2 | 2.24 | 92.95 | 92.85 | 93.81 | 91.94 | 96.51 | 94.65 | 94.68 | 94.51 | 94.85 | 97.51 |
| Tsiknakis et al. | InceptionV3 | 21.81 | 94.57 | 94.41 | 94.89 | 93.94 | 97.50 | 95.44 | 95.53 | 95.29 | 95.78 | 97.52 |
| Proposed | Ensemble-Net | 3.16 | 94.72 | 94.60 | 95.22 | 94.00 | 97.50 | 95.83 | 95.94 | 95.68 | 96.20 | 97.99 |
Fig. 7. Tradeoff between the number of parameters and accuracies of the proposed and top five baseline networks.
Quantitative performance comparison of the proposed network with the conventional handcrafted feature-based methods.
| Method | CT: ACC | CT: F1 | CT: AP | CT: AR | CT: AUC | X-ray: ACC | X-ray: F1 | X-ray: AP | X-ray: AR | X-ray: AUC |
|---|---|---|---|---|---|---|---|---|---|---|
| LBP & SVM | 80.18 | 79.64 | 79.52 | 79.77 | 83.06 | 79.89 | 79.63 | 80.41 | 78.87 | 86.80 |
| LBP & KNN | 82.71 | 81.97 | 82.60 | 81.37 | 81.37 | 88.76 | 88.62 | 88.62 | 88.63 | 88.63 |
| LBP & RF | 83.40 | 82.75 | 83.78 | 81.81 | 90.35 | 88.95 | 88.91 | 88.80 | 89.03 | 94.82 |
| LBP & AB | 85.73 | 85.20 | 86.00 | 84.46 | 89.73 | 88.88 | 88.79 | 88.73 | 88.84 | 94.15 |
| MLBP & SVM | 83.49 | 83.16 | 82.93 | 83.39 | 88.30 | 86.82 | 86.71 | 87.13 | 86.30 | 92.61 |
| MLBP & KNN | 84.15 | 83.51 | 83.98 | 83.06 | 83.06 | 90.15 | 90.04 | 90.02 | 90.06 | 90.06 |
| MLBP & RF | 87.22 | 86.72 | 87.36 | 86.10 | 93.23 | 90.39 | 90.39 | 90.24 | 90.53 | 95.31 |
| MLBP & AB | 89.05 | 88.65 | 89.23 | 88.09 | 92.01 | 92.26 | 92.25 | 92.09 | 92.42 | 95.60 |
| HoG & SVM | 88.10 | 87.82 | 88.98 | 86.73 | 94.24 | 94.19 | 94.24 | 94.02 | 94.46 | 96.94 |
| HoG & KNN | 84.63 | 83.99 | 85.24 | 82.88 | 82.88 | 93.17 | 93.12 | 93.00 | 93.25 | 93.25 |
| HoG & RF | 87.81 | 87.53 | 89.27 | 85.94 | 92.65 | 92.23 | 92.34 | 92.10 | 92.58 | 96.79 |
| HoG & AB | 86.82 | 86.68 | 88.88 | 84.73 | 87.57 | 93.92 | 93.99 | 93.75 | 94.22 | 96.25 |
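For reference, the LBP descriptor used by these handcrafted baselines thresholds each 3 × 3 neighborhood against its center pixel. A minimal sketch of the basic operator follows; the clockwise-from-top-left bit ordering is one common convention, not necessarily the one used in the paper:

```python
def lbp_code(patch):
    """8-bit local binary pattern of a 3x3 patch (list of three rows):
    each neighbor >= center contributes one bit, clockwise from top-left."""
    center = patch[1][1]
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    return sum(1 << i for i, v in enumerate(ring) if v >= center)
```

Applying this at every pixel and histogramming the codes yields the texture feature vector fed to the SVM/KNN/RF/AdaBoost classifiers above.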
Fig. 8. Additional output of MLAV layers as ML-CAMs for the given CT scan images (including both COVID19 positive and negative data samples).
Fig. 9. Additional output of MLAV layers as ML-CAMs for the given X-ray images (including both COVID19 positive and negative data samples).