Tripti Goel1, R Murugan1, Seyedali Mirjalili2,3, Deba Kumar Chakrabartty4.
Abstract
The rapid spread of coronavirus disease (COVID-19) has resulted in a global pandemic with more than fifteen million confirmed cases. To combat this spread, clinical imaging techniques such as computed tomography (CT) can be used for diagnosis. Automatic identification tools are essential for screening COVID-19 from CT images. However, few datasets are available, which makes it difficult to train deep learning (DL) networks. To address this issue, this work proposes a generative adversarial network (GAN) to generate additional CT images. The Whale Optimization Algorithm (WOA) is used to optimize the hyperparameters of the GAN's generator. The proposed method is tested and validated against different classification and meta-heuristic algorithms using the SARS-CoV-2 CT-Scan dataset, which consists of COVID-19 and non-COVID-19 images. The performance metrics of the proposed optimized model, including accuracy (99.22%), sensitivity (99.78%), specificity (97.78%), F1-score (98.79%), positive predictive value (97.82%), and negative predictive value (99.77%), together with its confusion matrix and receiver operating characteristic (ROC) curves, indicate that it performs better than state-of-the-art methods. The proposed model can help in the automatic screening of COVID-19 patients and reduce the burden on healthcare systems. © Springer Science+Business Media, LLC, part of Springer Nature 2021.
Keywords: Automatic diagnosis; COVID-19; Coronavirus; Deep learning; Generative Adversarial Network; Whale Optimization Algorithm
Year: 2021 PMID: 33520007 PMCID: PMC7829098 DOI: 10.1007/s12559-020-09785-7
Source DB: PubMed Journal: Cognit Comput ISSN: 1866-9956 Impact factor: 4.890
Fig. 1 (a) CT image of a COVID-19-infected person showing ground glass opacities. (b) CT image of a non-COVID-19-infected person [https://www.kaggle.com/plameneduardo/sarscov2-ctscan-dataset]
Summary of related works
| Reference | Motivation | Dataset size | Limitations/research gap |
|---|---|---|---|
| [ | Tracking the progression of COVID-19 symptoms in pregnant women is challenging | 59 | Very small dataset; most images were acquired from women (including pregnant women) and children; acquired images are low resolution |
| [ | To generate more COVID-19 images | 624 | Lack of clinical studies; generated images are low resolution |
| [ | To generate more COVID-19 images using a GAN | 306 | Lack of testing and validation data |
| [ | To increase the number of COVID-19 CT images from the limited CT images available | 742 | Augmented images are low resolution; lack of clinical studies |
| [ | To differentiate the symptoms of COVID-19 from general lung disease | 50 | Very small dataset; signs of COVID-19 are difficult to differentiate from other lung diseases |
| [ | To analyze the characteristics of COVID-19 using CT images | 37 | Very small dataset |
| [ | To find the infected region of COVID-19 using CT images | 4 | Very small dataset |
| [ | To study temporal changes in the CT images of COVID-19-infected persons | 90 | Very small dataset |
| [ | RT-PCR testing is costly and available only in limited numbers | 757 | Limited training data; poor classification accuracy |
| [ | RT-PCR testing is time-consuming | 618 | Training and testing samples are limited |
| [ | High false-negative rate in RT-PCR testing | 742 | Acquired CT images show many artifacts |
| [ | To monitor periodic changes in COVID-19-infected lungs | 126 | Very small dataset; lack of architecture information; no systematic evaluation presented |
| [ | To develop an automated toolkit for COVID-19 detection in place of manual detection | 361 | Limited number of training and testing images |
| [ | To use DL effectively to compensate for the shortage of medical professionals during the pandemic | 646 | Limited training data |
| [ | Radiographic patterns of CT slices produced better performance than the RT-PCR test | 618 | Lack of clinical studies |
Fig. 2 General architecture of the generative adversarial network
Fig. 3 Workflow of the proposed methodology
Architecture of the proposed GAN
| Generator network G |
|---|
| Input layer: noise, number of latent inputs = 100 |
| [Layer 1]; fully connected, reshape to (4 × 4 × 512); ReLU |
| [Layer 2]; transposed convolution (4, 4, 256); stride = 2; Batchnorm; ReLU |
| [Layer 3]; transposed convolution (4, 4, 128); stride = 2; Batchnorm; ReLU |
| [Layer 4]; transposed convolution (4, 4, 64); stride = 2; Batchnorm; ReLU |
| [Layer 5]; transposed convolution (4, 4, 3); stride = 2; Tanh |
| Output: generated image (64 × 64 × 3) |
| Discriminator network D |
| Input layer: CT image (64 × 64 × 3) |
| [Layer 1]; convolution layer (5, 5, 64); stride = 2; Batchnorm; Leaky ReLU |
| [Layer 2]; convolution layer (5, 5, 128); stride = 2; Batchnorm; Leaky ReLU |
| [Layer 3]; convolution layer (5, 5, 256); stride = 2; Batchnorm; Leaky ReLU |
| [Layer 4]; convolution layer (5, 5, 512); stride = 2; Batchnorm; Leaky ReLU |
| [Layer 5]; convolution layer (4, 4, 1); stride = 2; Batchnorm; Leaky ReLU |
| Output: probability of real or fake |
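The layer sizes in the architecture table can be checked with a short shape trace. The sketch below is only a sanity check, not the authors' implementation. The table does not state padding, so the trace assumes a padding of 1 for the 4 × 4 transposed convolutions, "same" padding for the 5 × 5 convolutions, and no padding for the final 4 × 4 convolution. The helper names are hypothetical.

```python
import math

def conv_transpose_out(size, kernel, stride, padding):
    """Spatial output size of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def conv_out_same(size, stride):
    """Spatial output size of a strided convolution with 'same' padding."""
    return math.ceil(size / stride)

def conv_out_valid(size, kernel, stride):
    """Spatial output size of a strided convolution with no padding."""
    return (size - kernel) // stride + 1

# Generator: the fully connected layer is reshaped to 4 x 4 x 512,
# followed by four stride-2 transposed convolutions.
size = 4
gen_shapes = [(size, size, 512)]
for channels in (256, 128, 64, 3):
    size = conv_transpose_out(size, kernel=4, stride=2, padding=1)  # padding is assumed
    gen_shapes.append((size, size, channels))

# Discriminator: four stride-2 5x5 convolutions, then a 4x4 convolution to one value.
size = 64
disc_shapes = [(size, size, 3)]
for channels in (64, 128, 256, 512):
    size = conv_out_same(size, stride=2)  # 'same' padding is assumed
    disc_shapes.append((size, size, channels))
size = conv_out_valid(size, kernel=4, stride=2)  # no padding is assumed
disc_shapes.append((size, size, 1))

print("generator:", gen_shapes)
print("discriminator:", disc_shapes)
```

Under these assumptions the generator grows from 4 × 4 to the 64 × 64 × 3 output listed in the table, and the discriminator reduces a 64 × 64 × 3 image to a single value.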
Details of the dataset
| Dataset | No. of images | No. of images generated using GAN | Total no. of images used | No. of training images | No. of testing images |
|---|---|---|---|---|---|
| SARS-COV-2 CT-scan [ | 2482 | 518 | 3000 | 2100 | 900 |
Training parameters for optimized GAN
| Option | Value |
|---|---|
| Training algorithm | Adam |
| Number of epochs (Adam) | 2000 |
| Number of iterations for optimization | 20 |
| Number of search agents | 20 |
| Number of dimensions | 3 |
| Activation function of discriminator | Sigmoid |
| Batch size | 64 |
| Validation frequency | 1000 |
| Number of generated new images | 300 |
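The optimization settings in this table (20 search agents, 20 iterations, 3 dimensions) can be illustrated with a minimal Whale Optimization Algorithm loop. The sketch below follows the standard WOA update rules (encircling the best solution, random search, and the spiral move) and is not the authors' code. The sphere objective, the search bounds, and the per-agent scalar coefficients are assumptions made for illustration. In the paper the objective would instead score a GAN trained with the candidate generator hyperparameters, which this record does not list.

```python
import math
import random

def whale_optimization(objective, dim, bounds, n_agents=20, n_iter=20, seed=0):
    """Minimize `objective` with a basic WOA loop; returns (best, best_score, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_score = objective(best)
    history = [best_score]

    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 toward 0
        for i, x in enumerate(agents):
            A = 2 * a * rng.random() - a   # step coefficient (scalar per agent: assumption)
            C = 2 * rng.random()
            p = rng.random()               # chooses between shrinking and spiral moves
            l = rng.uniform(-1, 1)         # spiral parameter; spiral constant b = 1
            if p < 0.5:
                # Exploit around the best agent if |A| < 1, else explore around a random one.
                ref = best if abs(A) < 1 else agents[rng.randrange(n_agents)]
                new = [ref[d] - A * abs(C * ref[d] - x[d]) for d in range(dim)]
            else:
                # Spiral (bubble-net) move toward the best agent.
                new = [abs(best[d] - x[d]) * math.exp(l) * math.cos(2 * math.pi * l) + best[d]
                       for d in range(dim)]
            agents[i] = [min(max(v, lo), hi) for v in new]  # clip to the search bounds

        for x in agents:
            score = objective(x)
            if score < best_score:
                best, best_score = x[:], score
        history.append(best_score)

    return best, best_score, history

def sphere(x):
    """Hypothetical stand-in objective; a real run would train and score a GAN here."""
    return sum(v * v for v in x)

best, best_score, history = whale_optimization(sphere, dim=3, bounds=(-5.0, 5.0))
print(best, best_score)
```

Because every objective evaluation would mean training a GAN, a small budget such as 20 agents over 20 iterations keeps the search tractable.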
Fig. 4 Sample COVID-19 training images
Fig. 5 Sample non-COVID-19 training images
Fig. 6 Training progress of the InceptionV3 network (a) accuracy and loss (b) results
Fig. 7 Sample COVID-19 testing images
Fig. 8 Sample non-COVID-19 testing images
Fig. 9 CT images generated using the optimized-GAN (a) COVID-19 and (b) non-COVID-19
Comparison of performance metrics of optimized and non-optimized GAN
| Method | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1 score (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| COVID-19 non-optimized GAN | 91.60 | 84.80 | 98.40 | 90.99 | 84.8 | 98.4 |
| COVID-19 optimized GAN | 98.78 | 99.78 | 97.78 | 98.79 | 97.82 | 99.77 |
Fig. 10 Comparison of ROC curves (a) non-optimized GAN and (b) optimized GAN
Fig. 11 Comparison of confusion matrices (a) non-optimized GAN and (b) optimized GAN
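All six reported metrics follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical: they assume the 900 test images are split evenly into 450 COVID-19 and 450 non-COVID-19 cases, which this record does not state. They were chosen because they reproduce the percentages in the "COVID-19 optimized GAN" row of the table above.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),       # true positive rate (recall)
        "specificity": tn / (tn + fp),       # true negative rate
        "f1": 2 * tp / (2 * tp + fp + fn),   # harmonic mean of PPV and sensitivity
        "ppv": tp / (tp + fp),               # positive predictive value (precision)
        "npv": tn / (tn + fn),               # negative predictive value
    }

# Hypothetical counts under an assumed 450/450 split of the 900 test images.
metrics = binary_metrics(tp=449, fn=1, tn=440, fp=10)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these counts the function gives 98.78% accuracy, 99.78% sensitivity, 97.78% specificity, 98.79% F1 score, 97.82% PPV, and 99.77% NPV.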
Performance analysis with other DL networks
| Method | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1 score (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| AlexNet | 96.44 | 97.56 | 95.33 | 96.48 | 95.43 | 97.50 |
| GoogleNet | 95.33 | 97.11 | 93.56 | 95.41 | 93.78 | 97.00 |
| VGG19 | 94.89 | 90.22 | 99.56 | 94.64 | 99.51 | 91.06 |
| SqueezeNet | 91.11 | 89.11 | 93.11 | 90.93 | 92.82 | 89.53 |
| ResNet50 | 97.78 | 97.56 | 98.00 | 97.77 | 97.99 | 97.57 |
| InceptionV3 | 99.22 | 99.78 | 97.78 | 98.79 | 97.82 | 99.77 |
Fig. 12 Comparison of the confusion matrices
Fig. 13 Comparison of the ROC curves (a) AlexNet, (b) GoogleNet, (c) VGG19, (d) SqueezeNet, (e) ResNet-50, (f) InceptionV3
Comparative analysis with other meta-heuristic algorithms
| Method | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1 score (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|
| GA | 96.22 | 97.56 | 96.89 | 97.23 | 96.91 | 97.54 |
| PS | 93.22 | 90.78 | 94.67 | 92.03 | 94.39 | 90.25 |
| PSO | 97.44 | 96.89 | 98.00 | 97.43 | 97.98 | 96.92 |
| SA | 97.11 | 76.67 | 97.56 | 97.10 | 97.53 | 96.70 |
| GWO | 96.44 | 97.11 | 95.78 | 96.47 | 95.83 | 97.07 |
| Proposed optimization | 99.22 | 99.78 | 97.78 | 98.79 | 97.82 | 99.77 |
Comparison of the results with state-of-the-art DL networks on CT images
| Reference | Methods | Accuracy (%) | Sensitivity (%) | Specificity (%) | F1 score (%) | PPV (%) | NPV (%) |
|---|---|---|---|---|---|---|---|
| [ | ResNet-50 | - | 90 | 96 | - | - | - |
| [ | CGAN | 82.91 | - | - | - | - | - |
| [ | Ensemble CNN | 86 | - | - | 86.7 | - | - |
| [ | CNN-Resnet-18 | 86.7 | - | - | - | - | - |
| [ | CNN-Ensemble | - | - | - | 92.2 | - | - |
| [ | DL | 90.8 | 84 | 93 | - | - | - |
| [ | CNN | - | 98.2 | 92.2 | - | - | - |
| [ | CNN | 94.98 | 94.06 | 95.47 | - | - | - |
| [ | CNN-Resnet-50 | - | 93 | - | - | - | - |
| Proposed | Optimized GAN based InceptionV3 | 99.22 | 99.78 | 97.78 | 98.79 | 97.82 | 99.77 |
Fig. 14 Cross-validation results