Abstract
The COVID-19 outbreak has caused a global health crisis since December 2019. Since the World Health Organization declared the disease a pandemic, national health authorities have tried to slow the spread of the virus by emphasizing mask use, social distancing, and hygiene. COVID-19 is highly contagious and spreads rapidly around the world, so early detection is of paramount importance. Any technological tool that can detect COVID-19 infection rapidly and with high accuracy can be very useful to medical professionals. The findings of COVID-19 on medical images, such as computed tomography (CT) scans and X-rays, resemble those of other lung infections, which makes it difficult for medical professionals to distinguish COVID-19. Computer-aided diagnostic solutions are therefore being developed to help identify positive COVID-19 cases. The current gold standard for detecting the virus is the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test. Because of this test's high false-negative rate and the delays in obtaining results, alternative solutions are being sought. This study investigates the contribution of machine learning and image processing to the rapid and accurate detection of COVID-19 from two of the most widely used medical imaging modalities, chest X-ray and CT. Its main purpose is to support early diagnosis and treatment so that the coronavirus epidemic can be ended as soon as possible. A further aim is to use smart learning methods and image classification models to support the medical professionals who are most worn out and working under intense stress during COVID-19. The proposed approach was applied to three public COVID-19 data sets and consists of five basic steps: data set acquisition, pre-processing, feature extraction, dimension reduction, and classification. Each step has its own sub-operations.
The proposed model detects COVID-19 with accuracies of 89.41%, 99.02%, and 98.11% on dataset-1 (CT), dataset-2 (X-ray), and dataset-3 (CT), respectively. For the three-class problem on the X-ray data set, with the classes COVID-19 (+), COVID-19 (-), and pneumonia without COVID-19, an accuracy of 85.96% was obtained. The study shows that COVID-19 can be detected with a high success rate in less than one minute using image processing and classical learning methods. In light of these findings, the proposed system can help radiologists in their decisions, can be useful in the early diagnosis of the virus, and can distinguish pneumonia caused by the COVID-19 virus from pneumonia caused by other diseases.
Keywords: CAD; COVID-19; CT; Machine learning; X-ray
Year: 2021 PMID: 33746657 PMCID: PMC7968176 DOI: 10.1016/j.asoc.2021.107323
Source DB: PubMed Journal: Appl Soft Comput ISSN: 1568-4946 Impact factor: 6.725
Fig. 1. GGO on the chest X-ray image (dataset 2).
Fig. 2. Comparison of X-ray and CT images in the early stage of COVID-19 [15].
Fig. 3. Changes in the proportions of patients with GGO, crazy-paving pattern, and consolidation [13].
Fig. 4. CT findings of COVID-19 in a 47-year-old female patient: (a) a small region of GGO with partial consolidation (day 3); (b) an enlarged region of GGO with crazy-paving pattern and partial consolidation (day 7); (c) a new area of consolidation with a small GGO (day 11); (d) the decreased GGO [13].
Fig. 5. Case change (as of December 17, 2020) [1].
Highest case and death counts (as of December 17, 2020) [1].
| Rank | Country/Area/Territory | Total cases | Total deaths |
|---|---|---|---|
| 1 | United States of America | 16,245,376 | 298,594 |
| 2 | India | 9,932,547 | 144,096 |
| 3 | Brazil | 6,927,145 | 181,835 |
| 4 | Russian Federation | 2,734,454 | 48,564 |
| 5 | France | 2,350,207 | 58,700 |
| 6 | United Kingdom of Great Britain and Northern Ireland | 1,888,120 | 64,908 |
| 7 | Italy | 1,870,576 | 65,857 |
| 8 | Spain | 1,762,212 | 48,401 |
| 9 | Argentina | 1,503,222 | 41,041 |
| 10 | Colombia | 1,434,516 | 39,195 |
Related works on COVID-19 in the literature.
| Author(s) | Method | Imaging type | Size of data |
|---|---|---|---|
| Ozturk et al. | DarkCovidNet | Chest X-ray | 125 COVID-19 (+), 500 No Findings |
| Hemdan et al. | COVIDX-Net | Chest X-ray | 25 COVID-19 (+), 25 COVID-19 (−) |
| Barstugan et al. | DWT + SVM | CT | 53 COVID-19 (+), 97 COVID-19 (−) |
| Wang et al. | COVID-Net | Chest X-ray | 358 COVID-19 (+), 8,066 no pneumonia, |
| Maghdid et al. | Deep Transfer Learning | X-ray and CT | 85 X-ray and 203 CT COVID-19 (+), 85 X-ray and 153 CT COVID-19 (−) |
| Ghoshal et al. | Dropweights based Bayesian Convolutional Neural Networks (BCNN) | X-ray | Normal: 1583, Bacterial Pneumonia: 2786, non-COVID-19 Viral Pneumonia: 1504, COVID-19: 68 |
| Abbas et al. | Deep Transfer Learning, PCA | X-ray | 116 X-ray Covid-19 (+), 80 X-ray Covid-19 (−) |
| Farooq and Hafeez | ResNet-50 | X-ray | 660 patients with non-COVID-19 viral pneumonia, 68 COVID-19 radiographs, 1203 patients with negative pneumonia |
| Singh et al. | Multi-Objective Differential Evolution Based CNN | CT | |
| Soares et al. | eXplainable Deep Learning classification approach | CT | 1252 COVID-19 (+), 1230 COVID-19 (−) |
| Yang et al. | Multi-Task Learning and Contrastive Self-Supervised Learning | CT | 349 COVID-19 (+), 397 COVID-19 (−) |
| Wei et al. | 3D ResNet-18 | CT | 305 COVID-19 (+), 872 Community Acquired Pneumonia (CAP), 1498 Non-pneumonia |
| Hu et al. | Weakly Supervised Multi-scale Learning Framework | CT | 150 Covid-19 (+), 300 community-acquired pneumonia (CAP) and non-pneumonia (NP) |
| Wu et al. | Deep Learning Based | CT | 400 COVID-19 (+) Cases, 350 COVID-19 (−) Cases |
| Sun et al. | Adaptive Feature Selection guided Deep Forest | CT | 1495 Covid-19 (+), 1027 community-acquired pneumonia (CAP) |
| Jaiswal et al. | DenseNet201 based deep transfer learning | CT | 1262 COVID-19 (+) Cases, 1230 COVID-19 (−) Cases |
| Abraham et al. | Multi-CNN and Bayesnet Classifier | X-ray | 453 COVID-19 (+), 497 COVID-19 (−) |
| Altan and Karasu | 2D curvelet transform, chaotic salp swarm algorithm, and deep learning technique | X-ray | 263 COVID-19 (+), 1609 COVID-19 (−), 1614 viral pneumonia |
| Nour et al. | Deep Features and Bayesian Optimization | X-ray | 219 COVID-19 (+), 1341 COVID-19 (−), 1345 viral pneumonia |
| Kassani et al. | MobileNet, DenseNet, Xception, ResNet, InceptionV3, InceptionResNetV2, VGGNet, NASNet | X-ray and CT | 117 X-ray and 20 CT images COVID-19 (+), 117 X-ray and 20 CT images COVID-19 (−) |
| Ardakani et al. | K-Nearest Neighbor, Naïve Bayes, Support Vector Machine (SVM), and Ensemble | CT | 306 COVID-19 (+), 306 COVID-19 (−) |
| Zhou et al. | Ensemble Deep Learning Model | CT | 500 COVID-19 (+), 500 COVID-19 (−) |
| Gupta et al. | Integrated Stacking InstaCovNet-19 model | X-ray | 361 COVID-19 (+), 365 Normal (−), 362 Pneumonia |
| Aslan et al. | CNN-based transfer learning–BiLSTM | X-ray | 219 COVID-19 (+), 1341 COVID-19 (−), 1345 viral pneumonia |
Fig. 6. Age distributions of data set-1.
Fig. 7. Sample images from data set-1.
Fig. 8. Sample images from data set-2.
Fig. 9. Sample images from data set-2.
Information about the data sets used in the study.
| Dataset | Image type | # of COVID-19 (+) | # of COVID-19 (−) | Gender distribution | Location |
|---|---|---|---|---|---|
| Dataset-1 | CT | 349 | 397 | 63% Male, 37% Female | China |
| Dataset-2 | X-ray | 125 | 500 Clear, 500 Not COVID-19 but Pneumonia | 66% Male, 34% Female | Mixed |
| Dataset-3 | CT | 1252 | 1230 | 53% Male, 47% Female | Sao Paulo, Brazil |
Image width and height for each data set (in pixels).
| Dataset | Width | Height |
|---|---|---|
| Dataset-1 | 400 | 300 |
| Dataset-2 | 525 | 525 |
| Dataset-3 | 375 | 255 |
Fig. 10. An example of generating an LBP identifier code.
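
The LBP identifier code named in Fig. 10 can be illustrated with a minimal sketch, assuming the common 3×3 formulation: each of the eight neighbours is compared with the centre pixel, and the resulting bits are read as one byte. The neighbour order (clockwise from the top-left), the `>=` threshold rule, and the function names `lbp_code` and `lbp_histogram` are assumptions made for illustration; the figure itself may use a different convention.

```python
def lbp_code(window):
    """Return the LBP code of the centre pixel of a 3x3 window (list of 3 rows)."""
    center = window[1][1]
    # Assumed order: neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if window[r][c] >= center:
            # The first neighbour visited becomes the most significant bit.
            code |= 1 << (7 - bit)
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            window = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(window)] += 1
    return hist


# Toy window (not taken from the paper): bits 1,0,0,0,1,1,1,1 -> 0b10001111
print(lbp_code([[6, 5, 2],
                [7, 6, 1],
                [9, 8, 7]]))  # 143
```

A histogram of such codes over the whole image is one usual way to turn LBP into a fixed-length feature vector for a classifier.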
The purpose and parameters of the methods used in the study.
| Process | Aim of use | Parameters |
|---|---|---|
| Image resizing | Preprocessing | Dataset-1 400 |
| Image Sharpening | Preprocessing | Radius |
| Gray level transformation | Preprocessing | Default values |
| HOG | Feature extraction | Variant |
| LBP | Feature extraction | Window Size: 3 |
| PCA | Feature selection | Algorithm: Singular value decomposition (SVD) |
| k-NN | Classification | |
| Bag of Tree | Classification | # of Trees |
| SVM | Classification | Default values |
| K-ELM | Classification | Kernel |
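
As a rough illustration of the classification stage, the k-NN step could be sketched as below. This is only a sketch under stated assumptions: the table gives no k-NN parameters, so the value `k=3`, the Euclidean distance, the function name `knn_predict`, and the toy two-dimensional feature vectors are all hypothetical. In the study the inputs would be HOG or LBP features after PCA.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    # Pair every training vector's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest vectors and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy feature vectors (hypothetical, not from the data sets).
train = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
labels = ["COVID-19 (-)", "COVID-19 (-)", "COVID-19 (+)", "COVID-19 (+)"]
print(knn_predict(train, labels, [0.95, 1.0]))  # COVID-19 (+)
```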
Fig. 11. Flow chart of the proposed model.
Binary classification results for data set-1.
| Feature | Classifier | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| HOG | Bag of Tree | 80.97 | 77.36 | 84.13 | 81.08 | 80.87 |
| | K-ELM | 80.16 | 77.94 | 82.12 | 79.30 | 80.89 |
| | k-NN | 84.45 | 83.95 | 84.89 | 83.00 | 85.75 |
| | SVM | 83.24 | 80.80 | 85.39 | 82.94 | 83.50 |
| LBP | Bag of Tree | 78.02 | 65.90 | 88.67 | 83.64 | 74.73 |
| | K-ELM | 80.83 | 76.50 | 84.63 | 81.40 | 80.38 |
| | k-NN | | | | | |
| | SVM | 86.73 | 83.95 | 89.17 | 87.20 | 86.34 |
Multi-class classification results for data set-2.
| Feature | Classifier | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| HOG | Bag of Tree | 80.71 | 64.80 | 99.80 | 97.59 | 95.78 |
| | K-ELM | 81.60 | 82.40 | 99.40 | 94.50 | 97.83 |
| | k-NN | 76.09 | 86.40 | 99.00 | 91.53 | 98.31 |
| | SVM | 84.89 | 87.20 | 99.50 | 95.61 | 98.42 |
| LBP | Bag of Tree | 82.84 | 71.20 | 96.53 | | |
| | K-ELM | 80.18 | 88.00 | 99.40 | 94.83 | 98.51 |
| | k-NN | 77.51 | 98.80 | 90.77 | | |
| | SVM | 88.80 | 99.80 | 98.23 | 98.62 | |
Binary classification (COVID vs. No Findings) results for data set 2.
| Feature | Classifier | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| HOG | Bag of Tree | 96.48 | 83.20 | 99.80 | 99.05 | 95.96 |
| | K-ELM | 94.40 | 98.62 | | | |
| | k-NN | 98.08 | 91.20 | 99.80 | 99.13 | 97.84 |
| | SVM | 98.72 | 94.40 | 99.80 | 99.16 | 98.62 |
| LBP | Bag of Tree | 96.32 | 81.60 | 100.00 | 100.00 | 95.60 |
| | K-ELM | 98.24 | 92.00 | 99.80 | 99.14 | 98.04 |
| | k-NN | 98.72 | 99.40 | 97.56 | | |
| | SVM | 98.72 | 99.40 | 97.56 | | |
Binary classification (COVID vs. Pneumonia) results for data set 2.
| Feature | Classifier | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| HOG | Bag of Tree | 94.72 | 75.20 | 99.60 | 97.92 | 94.14 |
| | K-ELM | 96.96 | 87.20 | 99.40 | 97.32 | 96.88 |
| | k-NN | 95.68 | 86.40 | 98.00 | 91.53 | 96.65 |
| | SVM | 97.76 | 92.80 | 99.00 | 95.87 | 98.21 |
| LBP | Bag of Tree | 94.40 | 72.00 | 93.46 | | |
| | K-ELM | 98.24 | 92.80 | 99.60 | 98.31 | 98.22 |
| | k-NN | 96.16 | 91.20 | 97.40 | 89.76 | 97.79 |
| | SVM | 99.80 | 99.15 | | | |
Binary classification (COVID vs. Pneumonia + No Findings) results for data set 2.
| Feature | Classifier | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| HOG | Bag of Tree | 95.64 | 62.40 | 99.80 | 97.50 | 95.50 |
| | K-ELM | 97.69 | 84.00 | 99.40 | 94.59 | 98.03 |
| | k-NN | 97.60 | 85.60 | 99.10 | 92.24 | 98.22 |
| | SVM | 98.49 | 92.00 | 99.30 | 94.26 | 99.00 |
| LBP | Bag of Tree | 95.20 | 56.80 | 94.88 | | |
| | K-ELM | 98.93 | 90.40 | 98.81 | | |
| | k-NN | 98.13 | 98.70 | 90.00 | | |
| | SVM | 92.80 | 99.80 | 98.31 | 99.11 | |
Binary classification results for data set 3.
| Feature | Classifier | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| HOG | Bag of Tree | 86.90 | 85.38 | 88.45 | 88.27 | 85.59 |
| | K-ELM | 91.70 | 91.69 | 91.70 | 91.84 | 91.55 |
| | k-NN | 97.90 | 98.40 | 97.47 | 98.36 | |
| | SVM | 95.85 | 96.09 | 95.61 | 95.70 | 96.00 |
| LBP | Bag of Tree | 85.21 | 82.51 | 87.96 | 87.47 | 83.15 |
| | K-ELM | 90.08 | 90.02 | 90.15 | 90.30 | 89.86 |
| | k-NN | | | | | |
| | SVM | 96.41 | 96.96 | 95.95 | 95.97 | 96.88 |
Average time of Covid-19 diagnosis by different methods.
| Time (s) | Binary classification, HOG | Binary classification, LBP | Binary classification, HOG | Binary classification, LBP | Binary classification, HOG | Binary classification, LBP | Multiclass (3-Class), HOG | Multiclass (3-Class), LBP |
|---|---|---|---|---|---|---|---|---|
| Bag of Tree | 149.5 | 219.1 | 286.2 | 474.8 | 209.0 | 391.8 | 337.7 | 600.4 |
| K-ELM | 77.6 | 51.1 | 77.9 | 146.8 | 52.9 | 252.9 | 108.3 | 226.8 |
| k-NN | 17.10 | 91.9 | 46.8 | 80.9 | | | | |
| SVM | 32.7 | 76.2 | 151.8 | 362.3 | 27.5 | 51.5 | 155.7 | 304.5 |
Confusion matrices obtained from the classification process. In each matrix, the green value in the lower-right corner shows accuracy, and the blue values show sensitivity (bottom-left), specificity (bottom-right), PPV (right-top), and NPV (right-bottom). Panels: (a) Dataset-1, (b) Dataset-2 (multiclass), (c, d, e) Dataset-2 (binary), (f) Dataset-3.
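
The five metrics read off each confusion matrix follow from its four cell counts. The sketch below shows the arithmetic; the function name `binary_metrics` is hypothetical. The counts in the example are not taken from the paper's confusion matrix. They are an assumption, back-calculated from the Dataset-1 class sizes (349 COVID-19 (+), 397 COVID-19 (−)) and the percentages reported for Dataset-1 under "Our study".

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity, PPV and NPV in percent."""
    return {
        "accuracy": 100 * (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": 100 * tp / (tp + fn),   # true positive rate
        "specificity": 100 * tn / (tn + fp),   # true negative rate
        "ppv": 100 * tp / (tp + fp),           # positive predictive value
        "npv": 100 * tn / (tn + fn),           # negative predictive value
    }


# Assumed counts: 302 + 47 = 349 positives, 365 + 32 = 397 negatives.
metrics = binary_metrics(tp=302, fn=47, tn=365, fp=32)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts the five values round to 89.41, 86.53, 91.94, 90.42, and 88.59, which agrees with the Dataset-1 row reported for this study.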
Fig. 12. Examples of images detected incorrectly as a result of the study.
Fig. 13. Display of the process steps of the study on the sample image.
Comparison of the performances of this study and other studies in the literature.
| Author(s) | Dataset | Accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| Ozturk et al. | Dataset-2 (Multiclass) | 87.02 | 85.35 | 92.18 | 89.96 | NA |
| Ozturk et al. | Dataset-2 (Binary) | 98.08 | 95.13 | 95.3 | 98.03 | NA |
| Soares et al. | Dataset-3 | 97.38 | 95.53 | NA | 99.16 | NA |
| Yang et al. | Dataset-1 | 89.0 | NA | NA | NA | NA |
| Jaiswal et al. | Dataset-3 | 96.25 | 96.29 | 96.21 | 96.29 | NA |
| Our study | Dataset-1 | 89.41 | 86.53 | 91.94 | 90.42 | 88.59 |
| Our study | Dataset-2 (Multiclass) | 85.96 | 94.40 | 100.00 | 100.00 | 99.30 |
| Our study | Dataset-2 (Binary) | 98.88 | 96.00 | 100.00 | 100.00 | 98.42 |
| Our study | Dataset-3 | 98.11 | 98.80 | 97.40 | 97.48 | 98.76 |
Average time of Covid-19 diagnosis by different methods.
| Method | RT-PCR | CT-Radiologist | Our system |
|---|---|---|---|
| Time | 21.5 min |
Fig. 14. Comparison of the metric values of the classification methods in all data sets.