Shibaprasad Sen, Soumyajit Saha, Somnath Chatterjee, Seyedali Mirjalili, Ram Sarkar.
Abstract
The rapid spread of coronavirus disease has become one of the most disruptive disasters of the century worldwide. To fight the spread of this virus, clinical image analysis of chest CT (computed tomography) images can play an important role in accurate diagnosis. In the present work, a bi-modular hybrid model is proposed to detect COVID-19 from chest CT images. In the first module, we use a Convolutional Neural Network (CNN) architecture to extract features from the chest CT images. In the second module, we use a bi-stage feature selection (FS) approach to find the most relevant features for predicting COVID and non-COVID cases from the chest CT images. In the first stage of FS, we apply a guided FS methodology that employs two filter methods, Mutual Information (MI) and Relief-F, for the initial screening of the features obtained from the CNN model. In the second stage, the Dragonfly algorithm (DA) is used to further select the most relevant features. The final feature set is used to classify COVID-19 and non-COVID chest CT images with a Support Vector Machine (SVM) classifier. The proposed model has been tested on two open-access datasets, SARS-CoV-2 CT images and COVID-CT, and achieves prediction rates of 98.39% and 90.0% on them, respectively. The proposed model has also been compared with a few past works on the prediction of COVID-19 cases. The supporting code is available on GitHub: https://github.com/Soumyajit-Saha/A-Bi-Stage-Feature-Selection-on-Covid-19-Dataset
Keywords: COVID-19 dataset; Chest CT image; Convolutional neural network; Coronavirus; Deep learning; Dragonfly algorithm; Feature selection
Year: 2021; PMID: 34764594; PMCID: PMC8053442; DOI: 10.1007/s10489-021-02292-8
Source DB: PubMed; Journal: Appl Intell (Dordr); ISSN: 0924-669X; Impact factor: 5.086
Fig. 1 Flowchart of the proposed model for the prediction of COVID and non-COVID cases from chest CT images
Details of the CNN architecture used for feature extraction from the chest CT images
| Layer | Type | Filter size | Number of filters | Stride |
|---|---|---|---|---|
| Input | 224x224x3 | − | − | − |
| Conv_1 | CL+BN+ReLU | 3x3 | 64 | 1x1 |
| MPL_1 | − | 2x2 | − | 2x2 |
| Conv_2 | CL+BN+ReLU | 3x3 | 64 | 1x1 |
| MPL_2 | − | 2x2 | − | 2x2 |
| Conv_3 | CL+BN+ReLU | 3x3 | 32 | 1x1 |
| MPL_3 | − | 2x2 | − | 2x2 |
| Conv_4 | CL+BN+ReLU | 3x3 | 16 | 1x1 |
| MPL_4 | − | 2x2 | − | 2x2 |
| Conv_5 | CL+BN+ReLU | 3x3 | 8 | 1x1 |
| Output | Sigmoid | − | − | − |
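
The layer list above fixes the filter sizes and strides but not the padding, so the spatial size of each feature map depends on an assumption. The following is a minimal sketch that traces the shapes through the convolution and max-pooling layers under both common choices. The names `LAYERS`, `out_size`, and `trace_shapes` are illustrative, and the padding mode is an assumption that this record does not confirm.

```python
# Trace feature-map shapes through the conv/pool stack listed in the table.
# ASSUMPTION: the source does not state the padding mode, so the convolution
# padding is a parameter here ("same" keeps the size, "valid" shrinks it).

# (name, kind, kernel size, stride, number of filters)
LAYERS = [
    ("Conv_1", "conv", 3, 1, 64),
    ("MPL_1", "pool", 2, 2, None),
    ("Conv_2", "conv", 3, 1, 64),
    ("MPL_2", "pool", 2, 2, None),
    ("Conv_3", "conv", 3, 1, 32),
    ("MPL_3", "pool", 2, 2, None),
    ("Conv_4", "conv", 3, 1, 16),
    ("MPL_4", "pool", 2, 2, None),
    ("Conv_5", "conv", 3, 1, 8),
]


def out_size(size, kernel, stride, same_padding):
    """Spatial output size of a square conv/pool window."""
    if same_padding:
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1


def trace_shapes(input_hw=224, channels=3, conv_same_padding=True):
    """Return (layer, height, width, channels) for the input and every layer."""
    shapes = [("Input", input_hw, input_hw, channels)]
    size = input_hw
    for name, kind, kernel, stride, filters in LAYERS:
        if kind == "conv":
            size = out_size(size, kernel, stride, conv_same_padding)
            channels = filters
        else:
            # Pooling is assumed unpadded in both variants.
            size = out_size(size, kernel, stride, False)
        shapes.append((name, size, size, channels))
    return shapes


if __name__ == "__main__":
    for same in (True, False):
        name, h, w, c = trace_shapes(conv_same_padding=same)[-1]
        mode = "same" if same else "valid"
        print(f"{mode} padding: {name} output {h}x{w}x{c}, flattened {h * w * c}")
```

Under these assumptions the last convolution produces a 14x14x8 map with "same" padding and a 10x10x8 map with "valid" padding. Which layer the features are actually read from is not stated in this record.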
Fig. 2 Principle of the SVM for a two-class dataset separated by a hyperplane
Fig. 3 Sample images that are positive for COVID-19, taken from (a) the SARS-CoV-2 CT scan dataset and (b) the COVID-CT database
Detailed performance measures of the feature set produced by the guided-FS procedure applied to the features obtained from the CNN model, for the SARS-CoV-2 CT scan dataset and the COVID-CT dataset
| Dataset | Reduced feature dimension | Accuracy (in %) | Precision | Recall | F1-score | AUC |
|---|---|---|---|---|---|---|
| SARS-CoV-2 CT scan dataset | 284 | 95.77 | 0.9409 | 0.9755 | 0.9579 | 0.9856 |
| COVID-CT-Dataset | 262 | 85.33 | 0.8833 | 0.7794 | 0.8281 | 0.9244 |
Detailed performance measures of the feature set produced by applying DA to the features obtained from the guided-FS procedure, for the SARS-CoV-2 CT scan dataset and the COVID-CT dataset
| Dataset | Reduced feature dimension | Accuracy (in %) | Precision | Recall | F1-score | AUC |
|---|---|---|---|---|---|---|
| SARS-CoV-2 CT scan dataset | 179 | 98.39 | 0.9821 | 0.9778 | 0.98 | 0.9952 |
| COVID-CT-Dataset | 168 | 90.0 | 0.9355 | 0.8406 | 0.8855 | 0.9414 |
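
The F1-score column in the two tables above can be checked against the precision and recall columns, since F1 is their harmonic mean. A minimal sketch follows. The helper name `f1_score` and the dictionary `ROWS` are illustrative, and the numbers are copied from the tables.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed in the two tables.
ROWS = {
    "guided-FS, SARS-CoV-2 CT": (0.9409, 0.9755, 0.9579),
    "guided-FS, COVID-CT": (0.8833, 0.7794, 0.8281),
    "guided-FS + DA, SARS-CoV-2 CT": (0.9821, 0.9778, 0.98),
    "guided-FS + DA, COVID-CT": (0.9355, 0.8406, 0.8855),
}

if __name__ == "__main__":
    for name, (p, r, reported) in ROWS.items():
        print(f"{name}: computed F1 = {f1_score(p, r):.4f}, reported = {reported}")
```

Each computed value agrees with the reported one to within rounding.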
Fig. 4 ROC curves for the feature set selected by guided-FS with respect to the features obtained through CNN model (green line) and features selected by DA with respect to the features generated using guided-FS technique (orange line) for SARS-CoV-2 CT scan dataset
Fig. 5 ROC curves for the feature set selected by guided-FS with respect to the features obtained through CNN model (green line) and features selected by DA with respect to the features generated using guided-FS technique (orange line) for COVID-CT-Dataset
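
The area under an ROC curve, reported as AUC in the tables, equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it that way on toy data. The function name `roc_auc` and the example labels and scores are illustrative and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: 1 = COVID, 0 = non-COVID, scores are classifier outputs.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch. A sort-based implementation would be preferable on large test sets.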
Outcomes observed after tuning the parameters of the SVM classifier on the SARS-CoV-2 CT scan dataset and the COVID-CT dataset
| Database | Kernel | coef0 | Accuracy (in %) | Precision | Recall | F1-score | AUC |
|---|---|---|---|---|---|---|---|
| SARS-CoV-2 | Linear | 0 | 88.93 | 0.8798 | 0.9043 | 0.8919 | 0.9532 |
| SARS-CoV-2 | Linear | 1 | 90.14 | 0.8924 | 0.9106 | 0.9014 | 0.9572 |
| SARS-CoV-2 | Linear | 2 | 88.93 | 0.8508 | 0.9214 | 0.8847 | 0.9584 |
| SARS-CoV-2 | Linear | 3 | 90.54 | 0.9016 | 0.9053 | 0.9035 | 0.9732 |
| SARS-CoV-2 | Polynomial | 0 | 92.15 | 0.9774 | 0.864 | 0.9172 | 0.9732 |
| SARS-CoV-2 | Polynomial | 1 | 95.77 | 0.9736 | 0.9364 | 0.9546 | 0.9848 |
| SARS-CoV-2 | Polynomial | 2 | 98.39 | 0.9821 | 0.9778 | 0.98 | 0.9952 |
| SARS-CoV-2 | Polynomial | 3 | 95.37 | 0.9631 | 0.9438 | 0.9533 | 0.9916 |
| COVID-CT-Database | Linear | 0 | 76 | 0.7692 | 0.7042 | 0.7353 | 0.8302 |
| COVID-CT-Database | Linear | 1 | 78.67 | 0.8136 | 0.6957 | 0.75 | 0.86 |
| COVID-CT-Database | Linear | 2 | 80.67 | 0.8169 | 0.7838 | 0.8 | 0.8396 |
| COVID-CT-Database | Linear | 3 | 81.33 | 0.8333 | 0.7639 | 0.7971 | 0.8534 |
| COVID-CT-Database | Polynomial | 0 | 82 | 0.8158 | 0.8052 | 0.8105 | 0.8820 |
| COVID-CT-Database | Polynomial | 1 | 84.67 | 0.8333 | 0.8267 | 0.8299 | 0.8932 |
| COVID-CT-Database | Polynomial | 2 | 90 | 0.9355 | 0.8406 | 0.8855 | 0.9414 |
| COVID-CT-Database | Polynomial | 3 | 87.09 | 0.8309 | 0.8194 | 0.8252 | 0.8878 |
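
The table above tunes a parameter named coef0. In scikit-learn's `SVC`, a parameter of that name is the constant term of the polynomial kernel, (gamma * <x, y> + coef0) ** degree. The sketch below shows that formula in plain Python, assuming this is the formulation the authors used, which this record does not confirm. The `degree` and `gamma` values are placeholders, since the table does not report them.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=0.0):
    """Polynomial kernel in the scikit-learn form.

    ASSUMPTION: degree and gamma are illustrative defaults, not values
    reported in the source.
    """
    return (gamma * dot(x, y) + coef0) ** degree


if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, 4.0]
    for coef0 in (0, 1, 2, 3):  # the coef0 values listed in the table
        print(coef0, polynomial_kernel(x, y, degree=2, coef0=coef0))
```

With coef0 equal to zero the kernel contains only terms of the highest degree. A positive coef0 also mixes in lower-order terms, which is one reason it is a common tuning parameter for polynomial SVMs.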
Detailed outcomes observed for 5-fold cross-validation scheme on SARS-CoV-2 CT scan dataset and COVID-CT-Database
| Dataset | Accuracy | Precision | Recall | F1-score | AUC |
|---|---|---|---|---|---|
| SARS-CoV-2 CT scan dataset | 95.32 | 0.953 | 0.953 | 0.953 | 0.953 |
| COVID-CT-Database | 76.01 | 0.761 | 0.760 | 0.759 | 0.834 |
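
A 5-fold cross-validation scheme splits the samples into five folds, holds each fold out once as the test set, and averages the resulting metrics. The sketch below generates such splits while keeping the class ratio roughly equal across folds. Whether the authors stratified or shuffled their folds is not stated in this record, so both choices, along with the name `stratified_kfold`, are assumptions.

```python
import random


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) for k class-balanced folds.

    ASSUMPTION: stratification and shuffling are illustrative choices,
    not details confirmed by the source.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


if __name__ == "__main__":
    labels = [0] * 10 + [1] * 10  # toy labels: 0 = non-COVID, 1 = COVID
    for fold, (train, test) in enumerate(stratified_kfold(labels), start=1):
        print(f"fold {fold}: {len(train)} train, {len(test)} test")
```

The reported cross-validation figures would then be the mean of each metric over the five held-out folds.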
Comparison of the performance of the proposed model with some existing techniques on the SARS-CoV-2 CT scan dataset and the COVID-CT dataset
| Dataset | References | Accuracy (in %) |
|---|---|---|
| SARS-CoV-2-CT | Soares et al. | 97.38 |
| SARS-CoV-2-CT | Jaiswal et al. | 96.25 |
| SARS-CoV-2-CT | Simonyan et al. | 97.4 |
| SARS-CoV-2-CT | He et al. | 95.17 |
| SARS-CoV-2-CT | Chollet et al. | 94.57 |
| SARS-CoV-2-CT | Proposed work | 98.39 |
| COVID-CT | Yang et al. | 89.1 |
| COVID-CT | He et al. | 86 |
| COVID-CT | Mobiny et al. | 87.6 |
| COVID-CT | Polsinelli et al. | 83 |
| COVID-CT | Dan-Sebastian et al. | 87.74 |
| COVID-CT | Shamsi Jokandan et al. | 87.9 |
| COVID-CT | Mishra et al. | 88.34 |
| COVID-CT | Ewen et al. | 86.21 |
| COVID-CT | Loey et al. | 82.91 |
| COVID-CT | Proposed work | 90.0 |