G Appasami, S Nickolas.
Abstract
The novel coronavirus disease (COVID-19) is a pandemic that currently affects more than 200 countries and has killed more than 6 million people in the last 2 years. Early detection of COVID-19 can help mitigate and control its spread. Reverse transcription polymerase chain reaction (RT-PCR), chest X-ray (CXR) scans, and computerized tomography (CT) scans are used to identify COVID-19. Chest X-ray image analysis is relatively time-efficient compared with RT-PCR and CT scans, and its cost-effectiveness makes it a good choice for COVID-19 classification. We propose a deep learning based convolutional neural network (CNN) model for detecting COVID-19 from CXR images. Chest X-ray images, which are widely used for COVID-19 detection and diagnosis, are collected from various source datasets for training (with augmentation) and evaluating our model. The proposed deep CNN model with data augmentation uses the patient's chest X-ray images to diagnose COVID-19, with the aim of assisting physicians in the diagnostic process under high-workload conditions. An overall accuracy of 93 percent for COVID-19 classification is achieved by choosing the best optimizer.
Year: 2022 PMID: 35996535 PMCID: PMC9386662 DOI: 10.1140/epjs/s11734-022-00647-x
Source DB: PubMed Journal: Eur Phys J Spec Top ISSN: 1951-6355 Impact factor: 2.891
Fig. 1 Workflow diagram of proposed work
Fig. 2 Chest X-ray augmented images of COVID-19 and normal
Statistics of chest X-ray images
| Dataset | Original images | Train images | Test images | Augmented images for training | Augmented images for validation |
|---|---|---|---|---|---|
| COVID-19 | 3000 | 2400 | 600 | 19,200 | 4800 |
| Normal | 3000 | 2400 | 600 | 19,200 | 4800 |
| Total | 6000 | 4800 | 1200 | 38,400 | 9600 |
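
The counts in this table are consistent with an 80/20 train/test split of each class followed by an eightfold expansion through augmentation. The sketch below reproduces the arithmetic. The split fraction and the factor of 8 are assumptions inferred from the table (2400/3000 and 19,200/2400), not values stated in the text, and the function name is hypothetical.

```python
def split_and_augment_counts(original, train_fraction=0.8, augment_factor=8):
    """Return (train, test, augmented_train, augmented_test) image counts.

    train_fraction and augment_factor are assumptions inferred from the
    table above, not values stated by the authors.
    """
    train = round(original * train_fraction)
    test = original - train
    return train, test, train * augment_factor, test * augment_factor


# Each class (COVID-19, Normal) starts with 3000 original images.
per_class = split_and_augment_counts(3000)
print(per_class)

# Two balanced classes, so the "Total" row is twice the per-class row.
totals = tuple(2 * n for n in per_class)
print(totals)
```

Under these assumptions the per-class and total rows of the table are recovered exactly.
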
Hyperparameter values
| S. no | Hyperparameter | Values |
|---|---|---|
| 1 | Epochs | 50, 100, 200, 300 and 400 |
| 2 | Batch size | 8, 16, 32 and 64 |
| 3 | Dropout factor (%) | 10, 15, 20, 25, 30 and 35 |
| 4 | Learning rate | 0.00001 and 0.000001 |
| 5 | Gradient optimizers | Adam, SGD, RMSProp, Adadelta, Adagrad, Adamax, Nadam, and FTRL |
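
To give a sense of the size of this search space, the following sketch enumerates every combination of the listed values. The record does not say whether the authors ran an exhaustive grid search, so this is only an illustration of how the values could be combined; `GRID` and `iter_configs` are hypothetical names.

```python
from itertools import product

# Values copied from the hyperparameter table above
# (dropout percentages expressed as fractions).
GRID = {
    "epochs": [50, 100, 200, 300, 400],
    "batch_size": [8, 16, 32, 64],
    "dropout": [0.10, 0.15, 0.20, 0.25, 0.30, 0.35],
    "learning_rate": [1e-5, 1e-6],
    "optimizer": ["Adam", "SGD", "RMSProp", "Adadelta",
                  "Adagrad", "Adamax", "Nadam", "FTRL"],
}


def iter_configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(GRID))
print(len(configs))   # number of candidate configurations
print(configs[0])     # first configuration in the grid
```

A full grid over these values is large, which is one reason a study might vary one hyperparameter at a time (for example, optimizer against epochs) instead.
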
Summary of CNN model
| Layer (type) | Output Shape | Number of parameters |
|---|---|---|
| conv2d (Conv2D) | (254, 254, 32) | 896 |
| conv2d_1 (Conv2D) | (252, 252, 64) | 18,496 |
| max_pooling2d (MaxPooling2D) | (126, 126, 64) | 0 |
| dropout (Dropout) | (126, 126, 64) | 0 |
| conv2d_2 (Conv2D) | (124, 124, 64) | 36,928 |
| max_pooling2d_1 (MaxPooling2D) | (62, 62, 64) | 0 |
| dropout_1 (Dropout) | (62, 62, 64) | 0 |
| conv2d_3 (Conv2D) | (60, 60, 128) | 73,856 |
| max_pooling2d_2 (MaxPooling2D) | (30, 30, 128) | 0 |
| dropout_2 (Dropout) | (30, 30, 128) | 0 |
| flatten (Flatten) | (115,200) | 0 |
| dense (Dense) | (64) | 7,372,864 |
| dropout_3 (Dropout) | (64) | 0 |
| dense_1 (Dense) | (1) | 65 |
| Total trainable parameters | | 7,503,105 |
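
The output shapes and parameter counts in this summary can be checked by hand. The sketch below does so in plain Python, assuming a 256 × 256 RGB input, 3 × 3 kernels with stride 1 and no padding, and 2 × 2 max pooling. These settings are not stated in the record; they are inferred from the shapes and counts in the table. Dropout layers are omitted because they change neither shape nor parameter count, and the helper names are hypothetical.

```python
def conv2d(shape, filters, kernel=3):
    """Valid-padding, stride-1 convolution: returns (output shape, parameter count)."""
    h, w, c = shape
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h - kernel + 1, w - kernel + 1, filters), params


def max_pool(shape, size=2):
    """Non-overlapping max pooling; no trainable parameters."""
    h, w, c = shape
    return (h // size, w // size, c), 0


def dense(units_in, units_out):
    """Fully connected layer: returns (output width, parameter count)."""
    return units_out, units_in * units_out + units_out


# Assumed input: 256 x 256 RGB (inferred from the first output shape and 896 parameters).
shape, total = (256, 256, 3), 0
for layer, arg in [(conv2d, 32), (conv2d, 64), (max_pool, 2),
                   (conv2d, 64), (max_pool, 2),
                   (conv2d, 128), (max_pool, 2)]:
    shape, params = layer(shape, arg)
    total += params
    print(layer.__name__, shape, params)

flat = shape[0] * shape[1] * shape[2]   # Flatten layer
width, params = dense(flat, 64)         # dense
total += params
width, params = dense(width, 1)         # dense_1 (single output unit)
total += params
print(flat, total)
```

Almost all of the parameters sit in the first dense layer, because the flattened feature map is large.
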
Accuracy (%) for different optimizers
| Optimizer | 50 epochs | 100 epochs | 200 epochs | 300 epochs | 400 epochs |
|---|---|---|---|---|---|
| Adam | 79.82 | 84.40 | 88.65 | 92.85 | |
| SGD | 78.75 | 83.50 | 87.30 | 91.71 | 91.95 |
| RMSProp | 73.55 | 82.23 | 86.55 | 88.72 | 88.79 |
| Adadelta | 76.68 | 81.50 | 84.64 | 86.67 | 86.59 |
| Adagrad | 77.75 | 81.29 | 83.75 | 85.56 | 85.61 |
| Adamax | 75.72 | 79.46 | 81.15 | 83.25 | 83.37 |
| Nadam | 73.55 | 78.95 | 83.86 | 84.57 | 84.65 |
| FTRL | 75.63 | 79.92 | 81.52 | 82.76 | 82.85 |
Best value is in bold
Fig. 3 Accuracy bar graph for different optimizers
Summary of CNN models
| Ref. Paper | Total | Training | Testing | Validation | Accuracy | Sensitivity / recall /TPR | Specificity / Selectivity | Precision / PPV | AUC | |
|---|---|---|---|---|---|---|---|---|---|---|
| Horry et al. [ | 1475 | 1180 | 295 | N/A | 85.00 | 83.00 | 84.00 | 84.00 | 83.00 | N/A |
| Momeny et al. [ | 1248 | 998 | 125 | 125 | 77.60 | 80.80 | 91.50 | N/A | 73.70 | N/A |
| Ucar et al. [ | 1500 | 1200 | 300 | 5-fold | 92.49 | 88.93 | 94.37 | 89.03 | 88.76 | 91.70 |
| Frid-Adar et al. [ | 1845 | 1795 | 50 | N/A | 91.00 | 98.00 | 80.00 | 98.00 | N/A | 95.00 |
| Rahaman et al. [ | 860 | 580 | 140 | 140 | 89.30 | 89.67 | N/A | 90.83 | 88.67 | N/A |
| Pereiraa et al. [ | 1144 | 802 | 342 | 5-fold | 87.02 | N/A | N/A | N/A | 83.00 | N/A |
| Apostolo et al. [ | 3905 | N/A | N/A | 10-fold | 87.66 | 97.36 | 99.42 | N/A | N/A | N/A |
| Khan et al. [ | 1248 | N/A | N/A | 4-fold | 89.60 | 89.92 | 96.40 | 90.00 | 89.80 | N/A |
| Rahimzadeh et al. [ | 9011 | N/A | N/A | 5-fold | 91.40 | 80.53 | 94.00 | 72.83 | N/A | N/A |
| Jain et al. [ | 6432 | 5467 | 965 | N/A | 93.00 | 88.00 | N/A | 94.00 | 90.33 | N/A |
| Nandi et al. [ | 20,425 | 18,207 | 2218 | 5-fold | 89.39 | N/A | N/A | N/A | N/A | 99.99 |
| Proposed work | 60,000 | 38,400 | 12,000 | 5-fold | 93.37 | 92.75 | 92.78 | 93.07 | 93.42 |
Best value is in bold
Fig. 4 (a) Confusion matrix, (b) accuracy curve, (c) loss curve, and (d) ROC curve with an AUC of 0.93
| Classification Metrics | |
| The total number of real COVID-19 positive images (P) = 4800 | (1) |
| The total number of real COVID-19 negative images (N) = 4800 | (2) |
| Test result that correctly indicates the presence of a COVID-19 condition (TP) = 4482 | (3) |
| Test result that correctly indicates the absence of a COVID-19 condition (TN) = 4452 | (4) |
| Test result which wrongly indicates that a COVID-19 condition is absent (FN) = 318 | (5) |
| Test result which wrongly indicates that a COVID-19 condition is present (FP) = 348 | (6) |
| Accuracy = (TP + TN)/(TP + FN + FP + TN) = (TP + TN)/(P + N) = (4482 + 4452)/(4800 + 4800) = 0.930625 | (7) |
| Sensitivity = Recall = Hit rate = True positive Rate = TPR = TP/(TP + FN) = TP/P = 4482/4800 = 0.93375 | (8) |
| Specificity = Selectivity = True negative Rate = TNR = TN/(FP + TN) = TN/N = 4452/4800 = 0.9275 | (9) |
| Precision = Positive predictive Value = PPV = (TP)/(TP + FP) = 4482/(4482 + 348) = 0.92795 | (10) |
| | (11) |
| Matthews Correlation Coefficient = (TP × TN − FP × FN)/Sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)) = 0.8613 | (12) |
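
The metrics in Eqs. (7) to (10) and (12) follow directly from the four confusion-matrix counts. The sketch below recomputes them from the reported TP, TN, FN, and FP values; the function name and the dictionary layout are hypothetical, and only the formulas written out in the table are implemented.

```python
from math import sqrt

# Confusion-matrix counts reported in the table above.
TP, TN, FN, FP = 4482, 4452, 318, 348


def classification_metrics(tp, tn, fp, fn):
    """Return the metrics of Eqs. (7)-(10) and (12) as a dict."""
    p, n = tp + fn, tn + fp          # real positives and real negatives
    return {
        "accuracy": (tp + tn) / (p + n),
        "sensitivity": tp / p,
        "specificity": tn / n,
        "precision": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


metrics = classification_metrics(TP, TN, FP, FN)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the two classes are balanced (4800 images each), accuracy here equals the mean of sensitivity and specificity.
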