Ahan Chatterjee¹, Swagatam Roy¹, Sunanda Das².
Abstract
In this paper, we propose an ensemble-based transfer learning method to classify the chest X-ray images of COVID-19-affected patients. We use a weighted Euclidean distance average to ensemble five transfer learning models, viz. ResNet50, VGG16, VGG19, Xception, and InceptionV3. Image augmentation is carried out using generative adversarial network (GAN) modelling. We used 784 training images and 278 test images to validate the model: the proposed model reached an accuracy of about 98.67% on the training set and 95.52% on the test set. In addition, we propose a genetic-algorithm-optimized classification algorithm to analyse the symptoms of COVID-19 for low-, medium-, and high-risk patients. The optimized classifiers outperformed their un-optimized counterparts, reaching an accuracy of up to 88.96%. The novelty of this paper lies in its two-sided design: one model uses a genetic algorithm to optimize symptom-based risk classification, and the other classifies X-ray images with an ensemble-based transfer learning model.
Keywords: COVID-19; DCGAN; Genetic algorithm; Naïve Bayes; ResNet50; Transfer learning
Year: 2021 PMID: 34075356 PMCID: PMC8160081 DOI: 10.1007/s42979-021-00701-w
Source DB: PubMed Journal: SN Comput Sci ISSN: 2661-8907
Fig. 1 a Lung CT image on day 2, showing a patchy pattern and ill-defined alveolar condition. b Lung CT image on day 7, showing notable patches
Fig. 2 CT image after noise removal
Fig. 3 CT image after shadow removal
Fig. 4 CT image after applying the histogram equalizer
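The preprocessing pipeline shown in Figs. 2–4 ends with histogram equalization, which stretches a low-contrast CT slice's intensities across the full grayscale range. The paper does not reproduce its implementation, so below is a minimal NumPy sketch of the classic CDF-based equalization (in practice a library routine such as OpenCV's `cv2.equalizeHist` would typically be used); the toy `img` array is a hypothetical stand-in for a CT slice.

```python
import numpy as np

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Spread an 8-bit grayscale image's intensities over the full 0-255 range."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()          # first non-zero CDF value
    # Classic equalization mapping: rescale the CDF to [0, 255]
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]

# Toy "low-contrast CT slice": values squeezed into [100, 120]
img = np.random.default_rng(0).integers(100, 121, size=(64, 64), dtype=np.uint8)
out = equalize_histogram(img)
```

After equalization the darkest occurring intensity maps to 0 and the brightest to 255, which is what makes the patchy regions in Fig. 4 stand out.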
Fig. 5 DCGAN architecture [12]
Synthetic CT lung images created using the DCGAN architecture for 50 and 300 epochs.
Source: Created by Author, based on the dataset
Fig. 6 Training loss curve for the ResNet50 architecture
Fig. 7 Training loss curve for the VGG16 architecture
Fig. 8 Training loss curve for the VGG19 architecture
Fig. 9 Training loss curve for the Xception architecture
Fig. 10 Training loss curve for the InceptionV3 architecture
Fig. 11 Training loss curve for the proposed architecture
Result of pre-trained models and our proposed algorithm.
Source: Created by Author, based on the dataset
| Models | Parameters | Validation accuracy (%) | Sensitivity (%) | F1 score (%) |
|---|---|---|---|---|
| ResNet50 | 25,636,712 | 85.69 | 92 | 94.4 |
| VGG16 | 138,357,544 | 92.58 | 96 | 92.58 |
| VGG19 | 143,667,240 | 88.22 | 94 | 93.44 |
| Xception | 22,910,480 | 91.67 | 91 | 91.87 |
| InceptionV3 | 23,851,784 | 74.44 | 86 | 90.55 |
| Proposed architecture | – | 98.67 | 99 | 97.45 |
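The proposed architecture combines the five pre-trained models above via a weighted Euclidean distance average. The paper's exact weighting scheme is not reproduced in this excerpt, so the sketch below makes an assumption: each model's softmax output is weighted inversely to its Euclidean distance from the mean prediction, so an outlier model contributes less. The probability matrix is illustrative, not taken from the paper's data.

```python
import numpy as np

def weighted_ensemble(probs: np.ndarray) -> np.ndarray:
    """Combine per-model class probabilities (models x classes) into one vector.

    Weighting assumption: a model far (in Euclidean distance) from the mean
    prediction gets a smaller weight; the paper's exact scheme may differ.
    """
    mean_pred = probs.mean(axis=0)
    dists = np.linalg.norm(probs - mean_pred, axis=1)  # one distance per model
    weights = 1.0 / (dists + 1e-8)                     # inverse-distance weights
    weights /= weights.sum()
    combined = weights @ probs                         # weighted average
    return combined / combined.sum()                   # renormalise to a distribution

# Five models (ResNet50, VGG16, VGG19, Xception, InceptionV3), two classes
probs = np.array([[0.90, 0.10],
                  [0.85, 0.15],
                  [0.80, 0.20],
                  [0.95, 0.05],
                  [0.40, 0.60]])   # one dissenting model gets down-weighted
ensemble = weighted_ensemble(probs)
```

With these numbers the dissenting fifth model is strongly down-weighted, so the combined probability for the first class ends up above the plain (unweighted) mean of 0.78.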
Fig. 12 Days elapsed between symptom onset and hospital admission
Fig. 13 Bayesian plot for feature relations
Fig. 14 Framework of feature selection.
Source: Jiliang Tang, Salem Alelyani and Huan Liu Feature Selection for
Fig. 15 Key components [22]
Fig. 16 a represents crossover and b represents mutation
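The two genetic operators in Fig. 16 can be sketched in a few lines. The code below assumes single-point crossover and independent bit-flip mutation (the usual choices; the paper's exact operator settings are not given in this excerpt), acting on 7-bit feature masks like the `[1,0,0,1,1,0,1]` subsets reported in the genetic-algorithm result table.

```python
import random

random.seed(42)

def crossover(a, b):
    """Single-point crossover: swap the tails of two bit-string chromosomes."""
    point = random.randrange(1, len(a))  # cut somewhere strictly inside the string
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chrom, rate=0.1):
    """Bit-flip mutation: each gene flips independently with probability `rate`."""
    return [g ^ 1 if random.random() < rate else g for g in chrom]

parent1 = [1, 0, 0, 1, 1, 0, 1]   # a 7-bit feature mask (one bit per symptom feature)
parent2 = [0, 1, 1, 0, 0, 1, 0]
child1, child2 = crossover(parent1, parent2)
mutant = mutate(child2)
```

Crossover preserves the total number of selected features across the pair; mutation is what introduces genuinely new feature combinations into the population.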
Genetic algorithm result for feature optimization.
Source: Created by Author, based on Dataset
| Feature Subset | Validation accuracy | Percentile |
|---|---|---|
| [1,0,0,1,1,0,1] | 84.56% | 1 |
Fig. 17 Genetic algorithm result over 10 generations.
Source: Created by Author, based on Dataset
Fig. 18 Genetic algorithm curve across different generations.
Source: Created by Author, based on Dataset
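The fitness curves in Figs. 17 and 18 come from running the genetic algorithm for 10 generations. A minimal, self-contained sketch of such a loop follows. Two loud assumptions: tournament selection with elitism stands in for whatever selection scheme the paper uses, and the toy `TARGET`-matching fitness is a hypothetical stand-in for the real fitness (a classifier's validation accuracy on the selected feature subset).

```python
import random

random.seed(7)
N_FEATURES, POP, GENS = 7, 20, 10
TARGET = [1, 1, 1, 0, 1, 1, 1]   # hypothetical "ideal" subset; in the paper the
                                  # fitness would be classifier validation accuracy

def fitness(chrom):
    # Toy fitness: fraction of genes agreeing with TARGET, in [0, 1].
    return sum(a == b for a, b in zip(chrom, TARGET)) / N_FEATURES

def tournament(pop, k=3):
    # Pick k random individuals, keep the fittest.
    return max(random.sample(pop, k), key=fitness)

pop = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(POP)]
history = []
for _ in range(GENS):
    elite = max(pop, key=fitness)         # elitism: best individual survives
    nxt = [elite]
    while len(nxt) < POP:
        p1, p2 = tournament(pop), tournament(pop)
        cut = random.randrange(1, N_FEATURES)
        child = p1[:cut] + p2[cut:]                                      # crossover
        child = [g ^ 1 if random.random() < 0.05 else g for g in child]  # mutation
        nxt.append(child)
    pop = nxt
    history.append(max(fitness(c) for c in pop))
best = max(pop, key=fitness)
```

Because the elite individual always survives, `history` is non-decreasing, which is the shape of curve Figs. 17 and 18 depict.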
Features selected after the data set has gone through the genetic algorithm.
Source: Created by Author, based on Dataset
| Optimized features using genetic algorithm |
|---|
| Fever |
| Difficulty in breathing |
| Sore throat |
| Tiredness |
| Dry cough |
| Nasal congestion |
Grid search model parameters
| Classifier model | Hyper-parameter grid |
|---|---|
| Random Forest | 'n_estimators': [100, 200, 300, 400, 500, 600, 700, 800], 'max_features': ['auto', 'sqrt', 'log2'], 'max_depth': [4, 5, 6, 7, 8, 9, 10], 'criterion': ['gini', 'entropy'] |
| SVM | 'C': [0.1, 1, 10, 100, 100], 'gamma': [1, 0.1, 0.01, 0.001, 0.0001], 'kernel': ['linear', 'rbf', 'poly', 'sigmoid'] |
| Naïve Bayes | 'var_smoothing': np.logspace(0, -9, num=100) |
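The table above lists the hyper-parameter grids searched for each classifier. A minimal scikit-learn sketch of one such search follows, using the Naïve Bayes `var_smoothing` grid from the table (shortened from `num=100` to keep the run fast); the synthetic dataset is a hypothetical stand-in, since the paper's symptom dataset is not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the symptom dataset (7 features, like the optimized subset)
X, y = make_classification(n_samples=200, n_features=7, random_state=0)

# var_smoothing grid as in the table, shortened to 10 points for speed
param_grid = {"var_smoothing": np.logspace(0, -9, num=10)}
search = GridSearchCV(GaussianNB(), param_grid, cv=5)
search.fit(X, y)
best = search.best_params_["var_smoothing"]
```

`GridSearchCV` exhaustively fits one model per grid point per fold and keeps the parameter value with the best mean cross-validated score; the same pattern applies to the Random Forest and SVM grids.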
Accuracy table for optimized and non-optimized parameters.
Source: Created by Author, based on Dataset
| Classifier model | Validation accuracy (%) | Sensitivity (%) | Parameters optimized using grid search algorithm |
|---|---|---|---|
| SVM | 73.52 | 68 | 'C': 0.1, 'gamma': 1, 'kernel': 'rbf' |
| Random forest | 77.43 | 70 | 'n_estimators': 100, 'max_features': 'auto', 'max_depth': 4, 'criterion': 'gini' |
| Naïve Bayes | 70.66 | 74 | 'mean_score_time': array([0.00097208]), 'mean_test_score': array([0.20824808]) |
| SVM [optimized using genetic algorithm] | 84.64 | 79 | 'C': 0.1, 'gamma': 1, 'kernel': 'rbf' |
| Random forest [optimized using genetic algorithm] | 88.96 | 84 | 'n_estimators': 100, 'max_features': 'auto', 'max_depth': 4, 'criterion': 'gini' |
| Naïve Bayes [optimized using genetic algorithm] | 82.36 | 81 | 'mean_score_time': array([0.00097208]), 'mean_test_score': array([0.20824808]) |