Abstract
The first case of the novel coronavirus disease (COVID-19) was reported in December 2019 in Wuhan City, China, and led to an international outbreak. The virus causes serious respiratory illness and affects several other organs of the body, with effects that vary from patient to patient. Several waves of this infection have been reported worldwide, and researchers and doctors are working hard to develop novel solutions for COVID-19 diagnosis. Imaging- and vision-based techniques have been widely explored for the prediction of COVID-19; however, estimation of the COVID-19 infection percentage remains underexplored. In this work, we propose a novel framework for the estimation of the COVID-19 infection percentage based on deep learning techniques. The proposed network fuses features from a vision transformer and a convolutional neural network (CNN), specifically EfficientNet-B7, into an information-rich feature vector that contributes to a more precise estimation of the infection percentage. We evaluate our model on the Per-COVID-19 dataset (Bougourzi et al., 2021b), which comprises labelled CT data of COVID-19 patients, using the most widely used slice-level metrics: Pearson correlation coefficient (PC), mean absolute error (MAE), and root mean square error (RMSE). Using 5-fold cross-validation, the network outperforms other state-of-the-art methods, achieving a PC of 0.9886 ± 0.009, an MAE of 1.23 ± 0.378, and an RMSE of 3.12 ± 1.56. In addition, the overall average difference between the actual and predicted infection percentages is observed to be below 2%. In conclusion, the detailed experimental results reveal the robustness and efficiency of the proposed network.
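The abstract's two-path design can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's exact configuration: the ViT-Base variant, the timm backbone names, and the 512-unit regression head are assumptions, since the record only names EfficientNet-B7 and a vision transformer.

```python
import torch
import torch.nn as nn
import timm  # assumed here for convenient backbone access

class EffViTSketch(nn.Module):
    """Two-path feature extraction with a fused regression head (see Fig. 1)."""
    def __init__(self):
        super().__init__()
        # CNN path: EfficientNet-B7 with the classifier removed (pooled features).
        self.cnn = timm.create_model("efficientnet_b7", pretrained=False, num_classes=0)
        # Transformer path: ViT-Base is an assumption; the record does not name the variant.
        self.vit = timm.create_model("vit_base_patch16_224", pretrained=False, num_classes=0)
        fused_dim = self.cnn.num_features + self.vit.num_features
        # Regression head mapping the fused vector to one infection percentage.
        self.head = nn.Sequential(nn.Linear(fused_dim, 512), nn.ReLU(), nn.Linear(512, 1))

    def forward(self, x):
        # Concatenate both feature vectors into one information-rich vector.
        fused = torch.cat([self.cnn(x), self.vit(x)], dim=1)
        return self.head(fused).squeeze(1)

model = EffViTSketch()
percentages = model(torch.randn(2, 3, 224, 224))  # e.g. two 224x224 CT slices
```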
Keywords: COVID-19; Deep network; EfficientNet-B7; Huber loss; Percentage estimation; Vision transformer
Year: 2022 PMID: 36210962 PMCID: PMC9527203 DOI: 10.1016/j.eswa.2022.118939
Source DB: PubMed Journal: Expert Syst Appl ISSN: 0957-4174 Impact factor: 8.665
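The keywords name Huber loss as the regression objective. A minimal sketch, assuming PyTorch's built-in `nn.HuberLoss` with its default `delta` (the paper's value is not given in this record):

```python
import torch
import torch.nn as nn

# Huber loss: quadratic for residuals below delta, linear beyond it, so a few
# badly mispredicted slices do not dominate the percentage regression.
criterion = nn.HuberLoss(delta=1.0)  # delta is PyTorch's default, an assumption

pred = torch.tensor([60.4, 38.9, 87.1])    # predicted infection percentages
target = torch.tensor([60.0, 38.0, 85.0])  # ground-truth percentages
loss = criterion(pred, target)
print(loss.item())
```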
Comparative summary of major research studies in the COVID-19 prediction and percentage estimation domain.
| Authors | Research approach/Techniques used | Task/Dataset | Findings/Results |
|---|---|---|---|
| | Proposed a light CNN (SqueezeNet) approach to detect COVID-19 from CT scans. | Classification task | The approach yields a significant improvement over more complex CNN architectures. |
| | Developed an artificial intelligence system to detect COVID-19 from 3D CT scan volumes. | Classification task using a fine-tuning strategy. | The approach achieved a 96% area-under-curve value for the detection task. |
| | Implemented transfer learning to provide a generalized solution for the COVID-19 detection task. | COVID-19 classification task | The authors highlighted the impact of various initialization parameters and of limited dataset availability on the model results. |
| | Three-phase strategy combining a DenseNet-161 CNN with clustering for identification and severity estimation. | Percentage estimation | Evaluation results obtained with this approach are reported in the original study. |
| | Combined a Swin transformer (feature extraction) with a multi-layer perceptron (regression). | COVID-19 percentage estimation from CT scans | Evaluation results obtained with this approach are reported in the original study. |
| | Integrated a Mixup data augmentation module with an Inception-v3 model for improved regression performance. | Percentage estimation from CT images | The augmentation techniques helped achieve the desired prediction performance. |
| | Proposed a feature-regularization-based deep regression approach for the severity prediction task. | Percentage estimation from CT scans | The approach achieved significant improvements over the baseline, with an MAE of 4.912. |
| | Proposed a two-stage workflow to detect COVID-19: the first stage identifies lesion types using R-CNN, and the fused data is then used for classification in the second stage. | COVID-19 detection on a dataset of 3000 images. | Model effectiveness was validated using various evaluation measures, such as sensitivity, F1-score, and accuracy. |
Fig. 1 Overall architecture of the proposed network, comprising two paths for feature extraction: EfficientNet-B7 and a Vision Transformer.
Fig. 2 Overall architecture of EfficientNet-B7, consisting of seven blocks (B1 to B7), where MBConv (mobile inverted bottleneck convolution) is the main component of each block.
Fig. 3 MBConv block of the EfficientNet architecture.
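To make the MBConv caption concrete, here is a minimal PyTorch sketch of a mobile inverted bottleneck block. The squeeze-and-excitation stage and stochastic depth of the real EfficientNet-B7 are omitted, and the expansion factor of 6 is the common default rather than a value stated in this record.

```python
import torch
import torch.nn as nn

class MBConv(nn.Module):
    """Expand (1x1) -> depthwise conv -> project (1x1), with a residual
    connection when the spatial size and channel count are preserved."""
    def __init__(self, in_ch, out_ch, expand=6, kernel=3, stride=1):
        super().__init__()
        mid = in_ch * expand
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False),   # 1x1 expansion
            nn.BatchNorm2d(mid), nn.SiLU(),
            nn.Conv2d(mid, mid, kernel, stride, kernel // 2,
                      groups=mid, bias=False),      # depthwise convolution
            nn.BatchNorm2d(mid), nn.SiLU(),
            nn.Conv2d(mid, out_ch, 1, bias=False),  # 1x1 projection
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

y = MBConv(32, 32)(torch.randn(1, 32, 56, 56))  # same-shape residual case
```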
Fig. 4 Architecture of the Vision Transformer module.
Fig. 5 Architecture of the Transformer encoder and its internal blocks.
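A compact sketch of the transformer path in Figs. 4 and 5: patch embedding, a class token, learned position embeddings, and a stack of encoder layers. The ViT-Base sizes (16x16 patches, 768-d embeddings, 12 layers, 12 heads) are assumptions; this record does not state them.

```python
import torch
import torch.nn as nn

class ViTSketch(nn.Module):
    def __init__(self, img=224, patch=16, dim=768, depth=12, heads=12):
        super().__init__()
        n_patches = (img // patch) ** 2
        # Non-overlapping patch embedding implemented as a strided convolution.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim_feedforward=4 * dim,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        x = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        return self.encoder(x)[:, 0]  # class-token feature used for fusion

feat = ViTSketch()(torch.randn(2, 3, 224, 224))  # (2, 768) feature vectors
```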
Fig. 6 Sample CT images of COVID-19 patients with their respective infection percentages (starting from the first row, left to right) (Bougourzi, Distante et al., 2021).
5-fold cross-validation results and the overall average performance of the proposed EffViT model.
| Fold | PC | MAE | RMSE | Subject-wise PC | Subject-wise MAE | Subject-wise RMSE |
|---|---|---|---|---|---|---|
| Fold 1 | 0.9924 | 1.38 | 2.64 | 0.9951 | 1.86 | 2.25 |
| Fold 2 | 0.9781 | 1.71 | 4.87 | 0.9861 | 2.53 | 4.47 |
| Fold 3 | 0.9795 | 1.36 | 4.68 | 0.9712 | 2.39 | 6.50 |
| Fold 4 | 0.9957 | 0.82 | 1.70 | 0.9976 | 1.17 | 1.55 |
| Fold 5 | 0.9974 | 0.86 | 1.70 | 0.9987 | 1.26 | 1.49 |
| Mean | 0.9886 | 1.23 | 3.12 | 0.9897 | 1.83 | 3.25 |
| STD | 0.0092 | 0.378 | 1.56 | 0.0115 | 0.608 | 2.18 |
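The Mean and STD rows summarize the five folds. A minimal NumPy sketch of the slice-level metrics and their aggregation, with dummy arrays standing in for the real per-fold predictions:

```python
import numpy as np

def slice_metrics(y_true, y_pred):
    """PC, MAE, and RMSE as used in the table above."""
    pc = np.corrcoef(y_true, y_pred)[0, 1]           # Pearson correlation
    mae = np.mean(np.abs(y_true - y_pred))           # mean absolute error
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # root mean square error
    return pc, mae, rmse

rng = np.random.default_rng(0)
fold_results = []  # dummy stand-in for the five folds' (truth, prediction) pairs
for _ in range(5):
    y = rng.uniform(0, 100, 200)                     # ground-truth percentages
    fold_results.append((y, y + rng.normal(0, 2, 200)))

scores = np.array([slice_metrics(t, p) for t, p in fold_results])
mean, std = scores.mean(axis=0), scores.std(axis=0)  # the Mean and STD rows
```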
Overall average performance of the proposed EffViT method and the state-of-the-art methods.
| Methods | PC | MAE | RMSE | Subject-wise PC | Subject-wise MAE | Subject-wise RMSE |
|---|---|---|---|---|---|---|
| ResneXt-50 | 0.9207 | 5.29 | 10.10 | 0.9532 | 3.95 | 7.14 |
| DenseNet-161 | 0.9341 | 5.23 | 9.42 | 0.9582 | 4.07 | 7.00 |
| Inception-V3 | 0.9365 | 5.10 | 9.25 | 0.9603 | 4.01 | 6.79 |
| MobileNetV3-S | 0.9374 | 6.01 | 9.45 | 0.9540 | 4.16 | 7.06 |
| ShuffleNet | 0.9409 | 5.46 | 8.92 | 0.9613 | 3.98 | 6.37 |
| MobileNetV3-L | 0.9427 | 5.95 | 9.59 | 0.9598 | 3.97 | 6.49 |
| GoogLeNet | 0.9438 | 5.93 | 9.63 | 0.9577 | 4.55 | 6.99 |
| RegNet_y_1_6gf | 0.9442 | 6.02 | 9.72 | 0.9598 | 4.18 | 6.72 |
| RegNet_x_1_6gf | 0.9443 | 5.17 | 8.67 | 0.9590 | 4.14 | 6.76 |
| EffViT (proposed) | 0.9886 | 1.23 | 3.12 | 0.9897 | 1.83 | 3.25 |
Fig. 7 Boxplot of the 5-fold cross-validation results showing MAE, RMSE, subject-wise MAE, and subject-wise RMSE.
Fig. 8 Quantitative performance analysis of the different methods in terms of MAE, RMSE, subject-wise MAE, and subject-wise RMSE.
Fig. 9 Average inference time in seconds for the proposed approach and the state-of-the-art methods.
Fig. 10 Image-wise absolute error analysis.
Fig. 11 Qualitative results of the proposed EffViT method on CT images from the test set of the COVID-19 dataset (Bougourzi, Distante et al., 2021). Starting from the first row, left to right: actual 60%, predicted 60%; actual 38%, predicted 39%; actual 85%, predicted 87%; actual 26%, predicted 26%; actual 90%, predicted 91%; actual 5%, predicted 5%.
Fig. 12 Grad-CAM maps of some slices of a patient’s CT scan.
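A hedged sketch of how such Grad-CAM maps can be produced for a regression output: the gradient of the predicted percentage with respect to the last convolutional feature maps supplies the channel weights. Targeting the CNN path's final conv layer is an assumption carried over from the earlier sketch, not the paper's stated setup.

```python
import torch

def grad_cam(model, conv_layer, image):
    """Return a normalized Grad-CAM heat map for one CT slice."""
    feats, grads = [], []
    h1 = conv_layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    pred = model(image.unsqueeze(0)).squeeze()  # predicted infection percentage
    model.zero_grad()
    pred.backward()                             # gradients w.r.t. the feature maps
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)  # channel importance
    cam = torch.relu((weights * feats[0]).sum(dim=1))  # weighted map, ReLU-gated
    return cam / (cam.max() + 1e-8)

# e.g. with the earlier EffViTSketch: grad_cam(model, model.cnn.conv_head, slice_tensor)
```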
Fig. 13 Results on sample CT images of a COVID-19 patient with an actual infection percentage of 25% for all three images, together with the corresponding predicted percentages.
Fig. 14 Results on sample CT images of a COVID-19 patient with an actual infection percentage of 2% for all three images, together with the corresponding predicted percentages. Bounding boxes and arrows indicate regions with ground-glass opacity (GGO).
Overall average performance of the proposed EffViT method and the state-of-the-art methods on the noisy data.
| Methods | PC | MAE | RMSE |
|---|---|---|---|
| ResneXt-50 | 0.8410 | 10.31 | 19.15 |
| DenseNet-161 | 0.8435 | 10.02 | 18.93 |
| Inception-V3 | 0.8542 | 9.87 | 17.02 |
| MobileNetV3-S | 0.8496 | 9.66 | 16.78 |
| ShuffleNet | 0.8679 | 9.59 | 16.55 |
| MobileNetV3-L | 0.8722 | 9.34 | 15.93 |
| GoogLeNet | 0.8774 | 9.01 | 15.75 |
| RegNet_y_1_6gf | 0.8845 | 8.98 | 14.98 |
| RegNet_x_1_6gf | 0.8936 | 8.85 | 13.54 |
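This record does not describe the noise model used for the robustness study above; as a purely illustrative sketch, additive Gaussian noise on normalized CT intensities could be applied to the test slices like this:

```python
import torch

def add_gaussian_noise(ct_slice, sigma=0.05):
    """Perturb a normalized CT slice; sigma is an illustrative choice,
    not a value taken from the paper."""
    noisy = ct_slice + sigma * torch.randn_like(ct_slice)
    return noisy.clamp(0.0, 1.0)  # keep intensities in the normalized range
```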