Arpan Basu, Khalid Hassan Sheikh, Erik Cuevas, Ram Sarkar.
Abstract
Coronavirus disease 2019 (COVID-19) is a contagious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It may cause serious ailments in infected individuals, and complications may lead to death. X-rays and Computed Tomography (CT) scans can be used for the diagnosis of the disease. In this context, various methods have been proposed for the detection of COVID-19 from radiological images. In this work, we propose an end-to-end framework consisting of deep feature extraction followed by feature selection (FS) for the detection of COVID-19 from CT scan images. For feature extraction, we utilize three deep learning based Convolutional Neural Networks (CNNs). For FS, we use a meta-heuristic optimization algorithm, Harmony Search (HS), combined with a local search method, Adaptive β-Hill Climbing (AβHC), for better performance. We evaluate the proposed approach on the SARS-COV-2 CT-Scan Dataset consisting of 2482 CT scan images and an updated version of that dataset containing 2926 CT scan images. For comparison, we use a few state-of-the-art optimization algorithms. The best accuracy scores obtained by the present approach are 97.30% and 98.87% respectively on the said datasets, which are better than many of the algorithms used for comparison. The performances are also on par with some recent works which use the same datasets. The codes for the FS algorithms are available at: https://github.com/khalid0007/Metaheuristic-Algorithms
Keywords: Adaptive β-Hill Climbing; COVID-19 detection; Convolutional Neural Network; Harmony Search
Year: 2022 PMID: 35002099 PMCID: PMC8720180 DOI: 10.1016/j.eswa.2021.116377
Source DB: PubMed Journal: Expert Syst Appl ISSN: 0957-4174 Impact factor: 6.954
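The abstract describes feature selection with Harmony Search (HS, with HMCR = 0.9 per the parameter table below). As a hedged, minimal sketch — not the authors' implementation; the binary encoding, bit-flip pitch adjustment, and all default parameters here are assumptions — a binary HS for FS could look like:

```python
import random

def binary_harmony_search(fitness, n_features, hms=10, hmcr=0.9, par=0.3,
                          iters=200, seed=0):
    """Minimal binary Harmony Search for feature selection (illustrative sketch).

    fitness: callable mapping a 0/1 mask (list of ints) to a score (higher is better).
    hmcr:    Harmony Memory Considering Rate (0.9 matches the paper's table).
    par:     Pitch Adjustment Rate; pitch adjustment is modelled as a bit flip.
    """
    rng = random.Random(seed)
    # Initialise the harmony memory with random feature masks.
    memory = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(hms)]
    scores = [fitness(h) for h in memory]
    for _ in range(iters):
        new = []
        for d in range(n_features):
            if rng.random() < hmcr:
                bit = memory[rng.randrange(hms)][d]   # memory consideration
                if rng.random() < par:
                    bit = 1 - bit                     # pitch adjustment: flip the bit
            else:
                bit = rng.randint(0, 1)               # random consideration
            new.append(bit)
        s = fitness(new)
        worst = min(range(hms), key=scores.__getitem__)
        if s > scores[worst]:                         # replace the worst harmony
            memory[worst], scores[worst] = new, s
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

In the paper's setting, `fitness` would wrap a classifier's validation accuracy on the selected feature subset (possibly penalised by subset size); the toy objective in the test below simply maximises the number of selected bits.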
Some common issues present in recent works on COVID-19 detection:
| Issue | Description |
|---|---|
| Dataset quality | This is perhaps the most important issue for COVID-19 detection methods. |
| Feature extraction | Recent methods extract features from the input images by one of the following techniques: classical techniques i.e., using feature engineering, DL-based techniques or a hybrid of the previous two. Classical techniques generally produce inferior results when compared to DL-based or hybrid techniques. At the same time, DL-based techniques are mostly black-box models and it is difficult to interpret their results easily. Hence, it is a practical trade-off between classification performance and the explainability of the model, both of which are important for medical image classification. |
| Parameter tuning | Parameter tuning is an important stage in both meta-heuristic and DL-based approaches. |
| Heavyweight DL-based methods | In general, DL-based methods produce better results than conventional methods, though with the disadvantages mentioned above. Moreover, many DL-based approaches that use the latest architectures, such as graph neural networks, are computationally heavyweight. |
| Computing environment | This issue is especially common in methods where one step uses DL and another uses meta-heuristic algorithms. Sometimes the DL part is implemented in Python while the meta-heuristic part is implemented in another language, because Python provides many features and libraries suited to DL that other languages may lack. This can produce a marked change in results due to data-interchange issues such as incompatible formats or different vector representations. |
| Limitations of FS methods | FS is also used in various COVID-19 detection papers, and the chosen FS methods carry their own limitations. |
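Several tables below report HS combined with Adaptive β-Hill Climbing (AβHC) as a local search. The following is a minimal sketch of a β-hill-climbing step on a binary feature mask; the linearly decaying β schedule and the single-bit neighbourhood move are illustrative assumptions, not the paper's exact adaptive formulation:

```python
import random

def beta_hill_climb(fitness, mask, iters=300, beta0=0.5, seed=0):
    """Sketch of beta-hill climbing on a 0/1 feature mask.

    Each iteration flips one random bit (the neighbourhood move), then
    re-randomises every bit independently with probability beta (the beta
    operator).  Here beta decays linearly to 0 — a simple stand-in for the
    adaptive schedule used in A-beta-HC.
    """
    rng = random.Random(seed)
    best, best_score = list(mask), fitness(mask)
    for t in range(iters):
        beta = beta0 * (1 - t / iters)          # assumed decaying beta schedule
        cand = list(best)
        cand[rng.randrange(len(cand))] ^= 1     # neighbourhood move: flip one bit
        cand = [rng.randint(0, 1) if rng.random() < beta else b for b in cand]
        s = fitness(cand)
        if s > best_score:                      # greedy acceptance
            best, best_score = cand, s
    return best, best_score
```

In the paper's pipeline this local search would refine the best mask found by HS; the toy objective below again just maximises the number of selected bits.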
Fig. 1 A pictorial overview of the proposed work used for COVID-19 detection. HS and AβHC denote the Harmony Search and Adaptive β-Hill Climbing algorithms respectively.
Fig. 2 A pictorial representation of the skip connections in the ResNet architecture (He et al., 2016).
Fig. 3 A pictorial representation of the dense connections in the DenseNet architecture (Huang et al., 2017).
Fig. 4 A schematic diagram representing a simplified inception block (Chollet, 2017).
Fig. 5 A schematic diagram representing a strictly equivalent reformulation of the simplified inception block (Chollet, 2017).
Fig. 6 A schematic diagram of the feature extraction approach.
Fig. 7 Some sample CT images from the SARS-COV-2 CT-Scan Dataset.
Parameter values used for all the meta-heuristic algorithms.
| Algorithm | Parameters |
|---|---|
| GA | Mutation rate = 0.3, Crossover rate = 0.4 |
| PSO | Inertia weight decreased linearly from 1 to 0 |
| GWO | a decreased linearly from 2 to 0 |
| WOA | a decreased linearly from 2 to 0 |
| BBA | L = 1, PER = 0.15 |
| HS | HMCR = 0.9 |
The average time taken per iteration for each of the CNN feature extractors on the two datasets. Times were measured on an NVIDIA GPU with 14 GB of memory.
| Model | Time per iteration (Dataset 1) | Time per iteration (Dataset 2) |
|---|---|---|
| DenseNet201 | 38.7 s | 44.7 s |
| ResNet152 | 50.9 s | 58.9 s |
| Xception | 40.6 s | 48.7 s |
Accuracies and F1-scores of the DL models on the test partition of dataset 1.
| Mode | Model | Accuracy (%) | F1 score (%) |
|---|---|---|---|
| Mode 1 | DenseNet201 | 93.93 | 93.93 |
| | ResNet152 | 93.12 | 93.12 |
| | Xception | 91.90 | 91.90 |
| Mode 2 | DenseNet201 | 93.52 | 93.53 |
| | ResNet152 | 92.18 | 92.17 |
| | Xception | 94.34 | 94.34 |
The results of feature selection (FS) in mode 1 on dataset 1.
| Model | FS algorithm | Accuracy (%) | % increase |
|---|---|---|---|
| DenseNet201 | GA | 95.54 | 1.61 |
| | PSO | 94.33 | 0.40 |
| | GWO | 95.54 | 1.61 |
| | WOA | 93.11 | −0.82 |
| | BBA | 94.73 | 0.80 |
| | HS | 95.54 | 1.61 |
| | HS + AβHC | 95.54 | 1.61 |
| ResNet152 | GA | 94.33 | 1.21 |
| | PSO | 94.73 | 1.61 |
| | GWO | 94.73 | 1.61 |
| | WOA | 95.14 | 2.02 |
| | BBA | 93.52 | 0.40 |
| | HS | 94.33 | 1.21 |
| | HS + AβHC | 95.14 | 2.02 |
| Xception | GA | 96.76 | 4.86 |
| | PSO | 95.95 | 4.05 |
| | GWO | 96.35 | 4.45 |
| | WOA | 94.73 | 2.83 |
| | BBA | 95.95 | 4.05 |
| | HS | 96.35 | 4.45 |
| | HS + AβHC | | |
The reduction in the number of features after feature selection (FS) in mode 1 on dataset 1.
| Model | FS algorithm | Features (initial) | Features (final) | % reduction |
|---|---|---|---|---|
| DenseNet201 | HS | 1920 | 800 | 58.33 |
| | HS + AβHC | 1920 | 786 | 59.06 |
| ResNet152 | HS | 2048 | 793 | 61.28 |
| | HS + AβHC | 2048 | 888 | 56.64 |
| Xception | HS | 2048 | 781 | 61.86 |
| | HS + AβHC | 2048 | 913 | 55.42 |
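The % reduction columns in the feature-count tables follow the usual definition, e.g. for the DenseNet201 + HS row: 100 × (1920 − 800) / 1920 ≈ 58.33. A one-line helper (the function name is illustrative, not from the paper's code):

```python
def pct_reduction(initial, final):
    """Percentage of features removed by feature selection."""
    return 100 * (initial - final) / initial

print(round(pct_reduction(1920, 800), 2))  # 58.33, matching the DenseNet201 + HS row
```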
The results of feature selection (FS) in mode 2 on dataset 1.
| Model | FS algorithm | Accuracy (%) | % increase |
|---|---|---|---|
| DenseNet201 | GA | 95.68 | 2.16 |
| | PSO | 96.49 | 2.97 |
| | GWO | 96.49 | 2.97 |
| | WOA | 96.49 | 2.97 |
| | BBA | 95.95 | 2.43 |
| | HS | 96.77 | 3.25 |
| | HS + AβHC | | |
| ResNet152 | GA | 92.18 | 0.00 |
| | PSO | 92.72 | 0.54 |
| | GWO | 92.45 | 0.27 |
| | WOA | 92.18 | 0.00 |
| | BBA | 92.45 | 0.27 |
| | HS | 92.45 | 0.27 |
| | HS + AβHC | 94.34 | 2.16 |
| Xception | GA | 95.95 | 1.61 |
| | PSO | 96.76 | 2.42 |
| | GWO | 95.68 | 1.34 |
| | WOA | 95.68 | 1.34 |
| | BBA | 95.95 | 1.61 |
| | HS | 95.42 | 1.08 |
| | HS + AβHC | 96.22 | 1.88 |
The reduction in the number of features after feature selection (FS) in mode 2 on dataset 1.
| Model | FS algorithm | Features (initial) | Features (final) | % reduction |
|---|---|---|---|---|
| DenseNet201 | HS | 1920 | 870 | 54.68 |
| | HS + AβHC | 1920 | 807 | 57.96 |
| ResNet152 | HS | 2048 | 833 | 59.32 |
| | HS + AβHC | 2048 | 838 | 59.08 |
| Xception | HS | 2048 | 848 | 58.59 |
| | HS + AβHC | 2048 | 876 | 57.22 |
Comparison of the proposed approach with some state-of-the-art approaches on dataset 1.
| Method | Accuracy (%) |
|---|---|
| DenseNet201 + transfer learning | 96.25 |
| Explainable DL (xDNN) | 97.38 |
| COVID-Net + contrastive training | 90.83 |
| CNN + bi-stage FS | 95.77 |
| CNN | 94.98 |
| Norm-VGG16 | 96.39 |
| COV-CAF | 97.59 |
| Proposed | 97.30 |
Accuracies and F1-scores of the DL models on the test partition of dataset 2.
| Mode | Model | Accuracy (%) | F1 score (%) |
|---|---|---|---|
| Mode 1 | DenseNet201 | 92.74 | 91.27 |
| | ResNet152 | 93.88 | 91.95 |
| | Xception | 90.93 | 89.10 |
| Mode 2 | DenseNet201 | 90.70 | 89.47 |
| | ResNet152 | 89.80 | 87.25 |
| | Xception | 93.42 | 91.59 |
The results of FS on dataset 2. The Model + Mode column indicates in brief the feature extraction model and the testing mode. D, R and X denote the DenseNet201, ResNet152 and Xception models respectively. M1 and M2 denote mode 1 and mode 2 respectively.
| Model + Mode | FS algorithm | Accuracy (%) | % increase |
|---|---|---|---|
| D + M1 | HS | 94.84 | 2.10 |
| | HS + AEFA | 95.23 | 2.49 |
| | HS + AβHC | 95.23 | 2.49 |
| R + M1 | HS | 95.23 | 1.35 |
| | HS + AEFA | 96.44 | 2.56 |
| | HS + AβHC | 96.82 | 2.94 |
| X + M1 | HS | 96.14 | 5.21 |
| | HS + AEFA | 98.59 | 7.66 |
| | HS + AβHC | 98.87 | 7.94 |
| D + M2 | HS | 92.06 | 1.36 |
| | HS + AEFA | 92.51 | 1.81 |
| | HS + AβHC | 92.51 | 1.81 |
| R + M2 | HS | 92.07 | 2.27 |
| | HS + AEFA | 92.74 | 2.94 |
| | HS + AβHC | 92.74 | 2.94 |
| X + M2 | HS | 95.71 | 2.29 |
| | HS + AEFA | 96.14 | 2.72 |
| | HS + AβHC | 97.62 | 4.20 |
The p-values obtained from the Wilcoxon rank-sum test on dataset 1 in mode 1, comparing the present method in its best setting (Xception features) with the other algorithms.
| FS algorithm | p-value |
|---|---|
| GA | 0.0001 |
| PSO | 0.0001 |
| GWO | 0.0001 |
| WOA | 0.0001 |
| BBA | 0.0001 |
| HS | 0.0001 |
The p-values obtained from the Wilcoxon rank-sum test on dataset 1 in mode 2, comparing the present method in its best setting (DenseNet201 features) with the other algorithms.
| FS algorithm | p-value |
|---|---|
| GA | 0.0001 |
| PSO | 0.0001 |
| GWO | 0.0001 |
| WOA | 0.0001 |
| BBA | 0.0001 |
| HS | 0.0001 |
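For reference, the Wilcoxon rank-sum comparison underlying the two tables above can be sketched with SciPy; the per-run accuracy lists here are made-up placeholders for illustration, not the paper's actual runs:

```python
from scipy.stats import ranksums

# Hypothetical per-run accuracies (%) for two FS algorithms (illustrative only).
acc_proposed = [97.1, 97.3, 97.2, 97.4, 97.3, 97.2, 97.5, 97.3, 97.1, 97.4]
acc_baseline = [96.2, 96.4, 96.3, 96.1, 96.5, 96.3, 96.2, 96.4, 96.3, 96.1]

# Two-sided rank-sum test of whether the two accuracy distributions differ.
stat, p = ranksums(acc_proposed, acc_baseline)
print(f"p = {p:.4f}")  # p < 0.05 would reject equality of the distributions
```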