Muhammet Fatih Aslan, Kadir Sabanci, Akif Durdu, Muhammed Fahri Unlersen.
Abstract
Coronavirus disease 2019 (COVID-19), which originated in Wuhan, has disrupted the lives of millions of people and caused many deaths. To curb the spread of the disease, which is still ongoing, restrictions have been imposed all over the world, and the number of COVID-19 tests has been increased so that infected people can be quarantined. However, because of problems in the supply of RT-PCR tests and the ease of obtaining Computed Tomography and X-ray images, imaging-based methods have become very popular for diagnosing COVID-19, and studies that use these images to classify COVID-19 have increased. This paper presents a classification method for chest X-ray (CXR) images in the COVID-19 Radiography Database using features extracted by popular Convolutional Neural Network (CNN) models (AlexNet, ResNet18, ResNet50, Inceptionv3, Densenet201, Inceptionresnetv2, MobileNetv2, GoogleNet). The two main contributions of this study are the determination of the hyperparameters of Machine Learning (ML) algorithms by Bayesian optimization and ANN-based image segmentation. First, the lungs are segmented automatically from the raw image with Artificial Neural Networks (ANNs). To ensure data diversity, data augmentation is applied to the COVID-19 class, which has far fewer samples than the other two classes. These images are then given as input to the eight CNN models. The features extracted by each CNN model are given as input to four different ML algorithms for classification: Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), Naive Bayes (NB), and Decision Tree (DT). To achieve the best classification accuracy, the hyperparameters of each ML algorithm are determined using Bayesian optimization. With these hyperparameters, the highest accuracy, 96.29%, is obtained with the DenseNet201 model and the SVM algorithm. The Sensitivity, Precision, Specificity, F1-Score, and MCC values for this structure are 0.9642, 0.9642, 0.9812, 0.9641, and 0.9453, respectively. These results show that ML methods with optimized hyperparameters can produce successful results.
Keywords: Bayesian Optimization; COVID-19 pandemic; Convolutional neural networks; Machine learning
Year: 2022 PMID: 35077936 PMCID: PMC8770389 DOI: 10.1016/j.compbiomed.2022.105244
Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589
Fig. 1. Block diagram of the proposed study.
Number of samples per class in the COVID-19 Radiography Database.
| Class | COVID-19 | Normal | Viral Pneumonia | Total |
|---|---|---|---|---|
| Number of samples | 219 | 1341 | 1345 | 2905 |
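The class imbalance that motivates augmenting the COVID-19 class can be made concrete with a few lines of Python. This is only a sketch built from the counts in the table above; the augmentation factor used by the authors is not stated in this record, so `copies_per_image` is a hypothetical balancing target rather than a reported setting.

```python
# Class counts taken from the table above.
counts = {"COVID-19": 219, "Normal": 1341, "Viral Pneumonia": 1345}

total = sum(counts.values())
largest = max(counts.values())

# Share of each class in the full dataset, in percent.
shares = {name: 100 * n / total for name, n in counts.items()}

# Hypothetical balancing target (an assumption, not a value from the paper):
# how many images per original image, original included, would bring each
# class close to the size of the largest class.
copies_per_image = {name: round(largest / n) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name}: {n} images, {shares[name]:.1f}% of total, "
          f"x{copies_per_image[name]} to approach balance")
```

Under this assumption the COVID-19 class would need roughly six images per original, while the other two classes need none.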
Fig. 2. Some raw CXR images from the COVID-19 Radiography Database.
Fig. 3. A sample CXR image that contains irrelevant patterns or noise.
Fig. 4. Manually selected points, mask image, and segmented lung image.
Fig. 5. Cropped image obtained by ANN-based segmentation.
Fig. 6. Rotation operation.
Fig. 7. Block diagram of the feature extraction, optimization, and classification steps.
State-of-the-art CNN model features.
| CNN model | Alexnet | Resnet18 | Resnet50 | Inceptionv3 | Densenet201 | Inceptionresnetv2 | GoogleNet | MobileNetv2 |
|---|---|---|---|---|---|---|---|---|
| Input image size | 227 × 227 | 224 × 224 | 224 × 224 | 299 × 299 | 224 × 224 | 299 × 299 | 224 × 224 | 224 × 224 |
| Feature layer | fc8 | fc1000 | fc1000 | predictions | fc1000 | predictions | pool5-drop_7 × 7_s1 | logits |
| Number of features | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
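Because each network expects a different input size, the segmented images have to be resized per model before features are read out. The sketch below encodes the table above as a lookup structure, reading its three rows as input image size, feature layer, and feature count (an interpretation of the table, since the row labels are not part of the extracted text). `ExtractorSpec`, `EXTRACTORS`, and `target_size` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractorSpec:
    input_size: tuple   # (height, width) expected by the network
    feature_layer: str  # layer name as listed in the table
    n_features: int     # length of the extracted feature vector


# Values copied from the table; the structure itself is illustrative.
EXTRACTORS = {
    "Alexnet": ExtractorSpec((227, 227), "fc8", 1000),
    "Resnet18": ExtractorSpec((224, 224), "fc1000", 1000),
    "Resnet50": ExtractorSpec((224, 224), "fc1000", 1000),
    "Inceptionv3": ExtractorSpec((299, 299), "predictions", 1000),
    "Densenet201": ExtractorSpec((224, 224), "fc1000", 1000),
    "Inceptionresnetv2": ExtractorSpec((299, 299), "predictions", 1000),
    "GoogleNet": ExtractorSpec((224, 224), "pool5-drop_7x7_s1", 1000),
    "MobileNetv2": ExtractorSpec((224, 224), "logits", 1000),
}


def target_size(model_name):
    """Return the image size a given network expects as input."""
    return EXTRACTORS[model_name].input_size


# Group the models by the resize they require.
by_size = {}
for name, spec in EXTRACTORS.items():
    by_size.setdefault(spec.input_size, []).append(name)

for size, names in by_size.items():
    print(size, names)
```

Grouping by input size shows that only three distinct resized copies of each image are needed to serve all eight networks.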
Hyperparameters of the ML algorithms obtained by Bayesian optimization, for each CNN model used for feature extraction.
| ML alg. | Hyperparameter | Alexnet | Resnet18 | Resnet50 | Inceptionv3 | Densenet201 | Inceptionresnetv2 | GoogleNet | MobileNetv2 |
|---|---|---|---|---|---|---|---|---|---|
| SVM | Coding method | onevsall | onevsall | onevsone | onevsone | onevsone | onevsone | onevsall | onevsall |
| SVM | Box constraint (C) | 0.0010143 | 0.10313 | 0.0046093 | 0.0027652 | 0.003083 | 0.0010105 | 0.0010391 | 0.0023347 |
| SVM | Kernel scale | 0.16082 | 1.1931 | 0.2569 | 0.43568 | 0.77278 | 0.32909 | 0.082927 | 0.66399 |
| SVM | Total elapsed time (s) | 4834.7835 | 4949.4327 | 5109.2594 | 5298.5672 | 5436.9534 | 5314.5984 | 5273.306 | 5380.0478 |
| k-NN | Number of neighbors | 3 | 3 | 3 | 3 | 6 | 4 | 3 | 4 |
| k-NN | Distance | correlation | seuclidean | seuclidean | spearman | seuclidean | spearman | seuclidean | spearman |
| k-NN | Total elapsed time (s) | 81.6571 | 82.5329 | 86.2578 | 87.9510 | 95.5283 | 89.1256 | 86.6401 | 94.8055 |
| NB | Distribution type | kernel | kernel | kernel | kernel | kernel | kernel | kernel | kernel |
| NB | Kernel width | 0.051574 | 0.058872 | 0.058477 | 0.067775 | 0.070217 | 0.059119 | 0.063142 | 0.060272 |
| NB | Total elapsed time (s) | 2589.5847 | 2714.706 | 2730.5094 | 2930.5064 | 3010.9512 | 2906.2571 | 2760.4442 | 3006.6945 |
| DT | Minimum leaf size | 4 | 12 | 2 | 6 | 3 | 8 | 8 | 10 |
| DT | Total elapsed time (s) | 85.2036 | 86.1918 | 88.0563 | 89.2318 | 90.6345 | 88.8135 | 83.7987 | 89.5203 |
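To illustrate what two of the tuned k-NN hyperparameters mean, the sketch below implements a k-NN vote with the standardized Euclidean (`seuclidean`) distance, in which each feature difference is divided by that feature's standard deviation before the usual Euclidean norm is taken. It is a minimal pure-Python illustration on made-up two-dimensional data, not the authors' implementation; the function names and the toy feature vectors are assumptions, and real feature vectors would have 1000 entries.

```python
import math
from collections import Counter
from statistics import stdev


def feature_scales(train_x):
    """Per-feature sample standard deviation of the training set."""
    columns = zip(*train_x)
    # Fall back to 1.0 for constant features to avoid division by zero.
    return [stdev(col) or 1.0 for col in columns]


def seuclidean(a, b, scales):
    """Euclidean distance after dividing each difference by its scale."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples nearest to the query."""
    scales = feature_scales(train_x)
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: seuclidean(pair[0], query, scales))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): the second feature has a much larger range,
# which is the situation standardization is meant to handle.
train_x = [[0.0, 10.0], [0.2, 12.0], [0.1, 11.0],
           [1.0, 50.0], [1.1, 52.0], [0.9, 51.0]]
train_y = ["Normal"] * 3 + ["COVID-19"] * 3

print(knn_predict(train_x, train_y, [0.15, 11.5], k=3))
print(knn_predict(train_x, train_y, [1.0, 49.0], k=3))
```

Standardizing keeps a feature with a large numeric range from dominating the neighbor search, which may be one reason this distance was selected for several of the feature sets.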
Fig. 8. Confusion matrices of state-of-the-art CNN models according to ML algorithms.
The calculated metrics of state-of-the-art CNN models according to ML algorithms.
| Model | ML | Accuracy (%) | Sensitivity | Specificity | Precision | F1 Score | MCC |
|---|---|---|---|---|---|---|---|
| Alexnet | SVM | | 0.9522 | 0.9750 | 0.9526 | 0.9519 | 0.9273 |
| Alexnet | k-NN | 91.17 | 0.9135 | 0.9553 | 0.9156 | 0.9140 | 0.8698 |
| Alexnet | NB | 86.04 | 0.8617 | 0.9292 | 0.8688 | 0.8639 | 0.7945 |
| Alexnet | DT | 85.34 | 0.8550 | 0.9261 | 0.8562 | 0.8554 | 0.7817 |
| Resnet18 | SVM | | 0.9443 | 0.9713 | 0.9457 | 0.9448 | 0.9163 |
| Resnet18 | k-NN | 94.17 | 0.9433 | 0.9705 | 0.9430 | 0.9431 | 0.9136 |
| Resnet18 | NB | 89.93 | 0.9003 | 0.9489 | 0.9042 | 0.9017 | 0.8512 |
| Resnet18 | DT | 84.45 | 0.8459 | 0.9213 | 0.8496 | 0.8475 | 0.7691 |
| Resnet50 | SVM | | 0.9534 | 0.9759 | 0.9535 | 0.9534 | 0.9293 |
| Resnet50 | k-NN | 93.11 | 0.9328 | 0.9650 | 0.9334 | 0.9331 | 0.8981 |
| Resnet50 | NB | 87.99 | 0.8810 | 0.9391 | 0.8860 | 0.8830 | 0.8225 |
| Resnet50 | DT | 85.34 | 0.8564 | 0.9258 | 0.8569 | 0.8565 | 0.7824 |
| Inceptionv3 | SVM | | 0.9570 | 0.9777 | 0.9574 | 0.9570 | 0.9348 |
| Inceptionv3 | k-NN | 92.40 | 0.9261 | 0.9615 | 0.9282 | 0.9264 | 0.8887 |
| Inceptionv3 | NB | 86.57 | 0.8669 | 0.9320 | 0.8730 | 0.8687 | 0.8020 |
| Inceptionv3 | DT | 84.45 | 0.8460 | 0.9213 | 0.8506 | 0.8479 | 0.7695 |
| Densenet201 | SVM | 96.29 | 0.9642 | 0.9812 | 0.9642 | 0.9641 | 0.9453 |
| Densenet201 | k-NN | 93.46 | 0.9365 | 0.9669 | 0.9371 | 0.9365 | 0.9036 |
| Densenet201 | NB | 85.34 | 0.8556 | 0.9258 | 0.8589 | 0.8569 | 0.7830 |
| Densenet201 | DT | 85.34 | 0.8556 | 0.9258 | 0.8589 | 0.8569 | 0.7830 |
| Inceptionresnetv2 | SVM | | 0.9570 | 0.9776 | 0.9569 | 0.9569 | 0.9345 |
| Inceptionresnetv2 | k-NN | 90.11 | 0.9031 | 0.9496 | 0.9051 | 0.9038 | 0.8539 |
| Inceptionresnetv2 | NB | 81.80 | 0.8216 | 0.9076 | 0.8273 | 0.8239 | 0.7319 |
| Inceptionresnetv2 | DT | 84.28 | 0.8455 | 0.9201 | 0.8497 | 0.8473 | 0.7677 |
| GoogleNet | SVM | | 0.9468 | 0.9723 | 0.9473 | 0.9467 | 0.9193 |
| GoogleNet | k-NN | 92.58 | 0.9277 | 0.9624 | 0.9281 | 0.9278 | 0.8903 |
| GoogleNet | NB | 88.69 | 0.8877 | 0.9426 | 0.8921 | 0.8894 | 0.8326 |
| GoogleNet | DT | 86.57 | 0.8683 | 0.9323 | 0.8672 | 0.8677 | 0.8000 |
| MobileNetv2 | SVM | | 0.9519 | 0.9749 | 0.9517 | 0.9517 | 0.9267 |
| MobileNetv2 | k-NN | 89.58 | 0.8977 | 0.9468 | 0.9021 | 0.8987 | 0.8470 |
| MobileNetv2 | NB | 88.87 | 0.8889 | 0.9437 | 0.8925 | 0.8903 | 0.8345 |
| MobileNetv2 | DT | 86.40 | 0.8656 | 0.9308 | 0.8712 | 0.8676 | 0.7993 |
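The metric columns in the table above can all be derived from a multi-class confusion matrix. The sketch below shows one common way to do so: each class is treated one-vs-rest, the per-class values are computed from its true/false positives and negatives, and the results are macro-averaged. Whether the authors averaged in exactly this way is an assumption, and `toy_cm` is a made-up matrix, not one of the confusion matrices from the paper.

```python
import math


def one_vs_rest_metrics(cm):
    """Macro-averaged metrics from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class occurs and is predicted at least once, so that no
    denominator is zero.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sums = {"sensitivity": 0.0, "specificity": 0.0,
            "precision": 0.0, "f1": 0.0, "mcc": 0.0}
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class c missed
        fp = sum(cm[r][c] for r in range(n)) - tp   # others labeled as c
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        precision = tp / (tp + fp)
        sums["sensitivity"] += sensitivity
        sums["specificity"] += specificity
        sums["precision"] += precision
        sums["f1"] += 2 * precision * sensitivity / (precision + sensitivity)
        sums["mcc"] += (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    result = {name: value / n for name, value in sums.items()}
    result["accuracy"] = sum(cm[c][c] for c in range(n)) / total
    return result


# Hypothetical 3-class confusion matrix, for illustration only.
toy_cm = [[50, 2, 1],
          [3, 45, 2],
          [1, 4, 42]]

metrics = one_vs_rest_metrics(toy_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Specificity tends to come out higher than sensitivity in a three-class setting because each class's negatives pool the other two classes, a pattern that is also visible in the table.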
Fig. 9. ROC curves of the DenseNet-SVM structure.
Comparison of the proposed study with previous studies.
| Previous Studies | Methodologies | Accuracy (%) |
|---|---|---|
| Wang, Wong | DL | 92.30 |
| Brunese, Mercaldo, Reginelli and Santone | DL & TL | 97.00 |
| Afshar, Heidarian, Naderkhani, Oikonomou, Plataniotis and Mohammadi | DL & Capsule network | 95.70 |
| Ismael and Şengür | DL & SVM | 94.70 |
| Chowdhury, Rahman, Khandakar, Mazhar, Kadir, Mahbub, Islam, Khan, Iqbal and Al-Emadi | DL & TL | 97.94 |
| Farooq, Hafeez | DL & TL | 96.20 |
| Hemdan, Shouman and Karar | VGG19 | 90.00 |
| Ucar, Korkmaz | Bayes & SqueezeNet | 98.30 |
| Apostolopoulos, Mpesiana | DL & TL | 93.48 |
| Xu, Jiang, et al. | ResNet & Location Attention | 86.70 |
| Ozturk, Talo, Yildirim, Baloglu, Yildirim and Acharya | DarkCovidNet | 87.02 |
| Rahimzadeh, Attar | Xception - ResNet50V2 | 91.40 |
| Narin, Kaya and Pamuk | DL & TL | 98.00 |
| Asif and Wenhui | DL & TL | 96.00 |
| Nour, Cömert and Polat | DL & ML | 98.97 |
| Wang, Xiao, Li, Zhang, Lu, Hou and Liu | ResNet & Feature Pyramid Network | 94.00 |
| Khan, Shah and Bhat | DL & TL | 95.00 |
| Momeny et al. | DL & AlexNet | 77.60 |
| Rahman et al. | HOG & DL | 96.04 |
| Shadin et al. | DL & InceptionV3 | 85.94 |
| Monshi et al. | DL & CovidXRayNet | 95.82 |
| Singh et al. | DL & LeNet-5/ResNet-50 | 87.00 |
Fig. 10. Confusion matrices of the DenseNet-SVM structure.
Fig. 11. ROC curves of the test dataset.