Nada M. Elshennawy, Dina M. Ibrahim, Amany M. Sarhan, Mohamed Arafa.
Abstract
The SARS-CoV-2 virus has spread around the world, causing widespread alarm as it has claimed many lives. Since COVID-19 is highly contagious and spreads quickly, early diagnosis is essential, and identifying the mortality risk factors of COVID-19 patients is key to reducing that risk among infected individuals. Timely examination of large datasets requires new computational approaches. Many machine learning (ML) techniques have been developed to predict mortality risk factors and severity for COVID-19 patients. Contrary to expectations, however, deep learning approaches, like ML algorithms, have not been widely applied to predicting COVID-19 mortality and severity, and the accuracy achieved by ML algorithms falls short of anticipated values. In this work, three supervised deep learning predictive models are used to predict mortality risk and severity for COVID-19 patients. The first, referred to as CV-CNN, is built on a convolutional neural network (CNN) and is trained on a clinical dataset of 12,020 patients using 10-fold cross-validation (CV) for training and validation. The second, referred to as CV-LSTM + CNN, combines a long short-term memory (LSTM) network with a CNN model; it is likewise trained on the clinical dataset with 10-fold CV. These first two models consume the clinical dataset in its original CSV form. The third, referred to as IMG-CNN, is a CNN trained instead on images converted from the clinical dataset, where each image corresponds to one data row of the original dataset.
The experimental results revealed that the IMG-CNN predictive model outperforms the other two, with an average accuracy of 94.14%, a precision of 100%, a recall of 91.0%, a specificity of 100%, an F1-score of 95.3%, an AUC of 93.6%, and a loss of 0.22.
Keywords: COVID-19 detection; deep learning; machine learning; mortality and severity risk
Year: 2022 PMID: 36010198 PMCID: PMC9406405 DOI: 10.3390/diagnostics12081847
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Summary of commonly reviewed studies on severity and mortality-risk prediction in COVID-19 patients.
| Study | Method | ML/DL | Performance |
|---|---|---|---|
| Pourhomayoun et al. | SVM, NN, and RF | Machine learning | 89.98% (Accuracy) |
| Yan et al. | XGBoost | Machine learning | 90% (Accuracy) |
| Yan et al. | XGBoost | Machine learning | 93% (Accuracy) |
| Wang et al. | XGBoost | Machine learning | 83% (AUC for clinical model) |
| Rechtman et al. | XGBoost | Machine learning | 86% (AUC) |
| Bertsimas et al. | XGBoost | Machine learning | 81%, 87%, and 92% |
| Guan et al. | XGBoost | Machine learning | Precision >90% |
| Booth et al. | SVM | Machine learning | 93% (AUC) |
| Sun et al. | SVM | Machine learning | 97.57% (AUC) |
| Yao et al. | SVM | Machine learning | 81.48% (Accuracy) |
| Zhao et al. | SVM | Machine learning | 91.38% (Accuracy) |
| Chowdhury et al. | XGBoost | Machine learning | 96.1% (AUC) |
| Karthikeyan et al. | NN, SVM, LR, random forests | Machine learning | NN model performance |
| Kar et al. | XGBoost | Machine learning | 78.2% (AUC) |
| Hu et al. | partial least squares regression | Machine learning | LR model performance |
| Zhao et al. | LR | Machine learning | 83% (AUC for mortality prediction) |
| Huang et al. | LR | Machine learning | 94.4% (AUC) |
| Zhou et al. | LR | Machine learning | 87.9% (AUC) |
| Zhu et al. | LR | Machine learning | 90% (AUC) |
| Gong et al. | LASSO regression, DT | Machine learning | LR model performance |
| Liu et al. | Multivariable LR | Machine learning | 99.4% (AUC) |
| Li et al. | Autoencoder, LR, RF, SVM | Machine learning | Autoencoder model performance |
| Terwangne et al. | Bayesian network analysis | Machine learning | 83.8% (ROC for WHO classification model) |
| Aktar et al. | Random Forest, DT, GBM | Machine learning | 88% (Accuracy of comorbidity) |
| Tezza et al. | RPART, SVM, GBM, and Random Forest | Machine learning | Random Forest model performance |
| Khozeimeh et al. | CNN and autoencoders | Deep learning | 96.05% (Average Accuracy) |
Figure 1Block diagram of the proposed predictive models (Deep-Risk).
Figure 2Sample of the tabular dataset.
The list of the 57 selected features used in our predictive models.
| Feature Type | Feature Name | | |
|---|---|---|---|
| Symptoms | anorexia | fever | shortness of breath |
| | chest pain | gasp | somnolence |
| | chills | headache | sore throat |
| | conjunctivitis | kidney failure | sputum |
| | cough | lesions on chest radiographs | septic shock |
| | diarrhea | hypertension | heart attack |
| | dizziness | myalgia | old |
| | dyspnea | obnubilation | cardiac disease |
| | emesis | pneumonia | hypoxia |
| | expectoration | myelofibrosis | fatigue |
| | eye irritation | respiratory distress | rhinorrhea |
| Pre-existing Conditions | diabetes | COPD | coronary heart disease |
| | hypertension | Parkinson’s disease | prostate hypertrophy |
| | chronic kidney disease | asthma | tuberculosis |
| | hypothyroidism | cancer | hepatitis B |
| | cerebral infarction | HIV positive | chronic bronchitis |
| | cardiac disease | dyslipidemia | any chronic disease |
| Demographics | age | country | province |
| | gender | city | travel history |
Figure 3Sample of the original clinical dataset in tabular rows.
Figure 4Samples of the converted images from the original clinical dataset.
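This record does not spell out how a CSV row becomes an image for IMG-CNN, so the following is only a plausible sketch: min-max scale the features to [0, 255], pad the vector to the next perfect square, reshape into a small grid, and upscale by pixel repetition to the 224 × 224 input size listed in the IMG-CNN architecture. The function name and mapping are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def row_to_image(row, size=224):
    """Convert one clinical-record row (1-D feature vector) into a
    size x size grayscale image. Hypothetical mapping: min-max scale
    the features to [0, 255], zero-pad up to the next perfect square,
    reshape into a square grid, and upscale by pixel repetition."""
    row = np.asarray(row, dtype=np.float64)
    lo, hi = row.min(), row.max()
    scaled = np.zeros_like(row) if hi == lo else (row - lo) / (hi - lo) * 255.0
    side = int(np.ceil(np.sqrt(scaled.size)))   # e.g. 57 features -> 8x8 grid
    grid = np.zeros(side * side)
    grid[: scaled.size] = scaled
    grid = grid.reshape(side, side)
    reps = size // side + 1                     # enough repeats to cover `size`
    img = np.repeat(np.repeat(grid, reps, axis=0), reps, axis=1)[:size, :size]
    return img.astype(np.uint8)

img = row_to_image(np.arange(57))   # 57 features, as in the feature table above
print(img.shape)                    # (224, 224)
```

Any scheme that maps each row to a fixed-size image deterministically would serve the same purpose; the essential point is that every image corresponds to exactly one data row.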
Figure 5A schematic diagram of the CV-CNN model.
Figure 6The pseudo-code of the CV-CNN model.
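The 10-fold CV protocol used for training and validation can be sketched as a plain index-splitting routine. This is a generic, unstratified stand-in with a fixed seed, not the authors' exact pseudo-code; the fold sizes below follow from the 12,020-patient dataset stated in the abstract.

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation:
    shuffle all sample indices once, split them into k nearly equal
    folds, and let each fold serve as the validation set in turn."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

# 12,020 patients, as in the clinical dataset described in the abstract
splits = list(kfold_indices(12020, k=10))
print(len(splits))            # 10 folds
print(len(splits[0][1]))      # 1202 validation samples per fold
```

Each patient appears in exactly one validation fold, so the ten per-fold scores in the results tables below cover the whole dataset once.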
Architecture and parameter settings of the CV-CNN model.
| Layer (Type) | Output Shape | Parameters |
|---|---|---|
| conv1d_5 (Conv1D) | (None, 52, 256) | 1024 |
| conv1d_6 (Conv1D) | (None, 50, 256) | 196,864 |
| conv1d_7 (Conv1D) | (None, 48, 256) | 196,864 |
| max_pooling1d_1 (MaxPooling1D) | (None, 47, 256) | 0 |
| flatten_2 (Flatten) | (None, 12032) | 0 |
| dense_6 (Dense) | (None, 64) | 770,112 |
| batch_normalization_4 (BatchNormalization) | (None, 64) | 256 |
| dropout_4 (Dropout) | (None, 64) | 0 |
| dense_7 (Dense) | (None, 32) | 2080 |
| batch_normalization_5 (BatchNormalization) | (None, 32) | 128 |
| dropout_5 (Dropout) | (None, 32) | 0 |
| dense_8 (Dense) | (None, 1) | 33 |
| Total parameters: 1,167,361 | | |
| Trainable parameters: 1,167,169 | | |
| Non-trainable parameters: 192 | | |
Figure 7A schematic diagram of the CV-LSTM + CNN model.
Figure 8The pseudo-code of the CV-LSTM+CNN model.
Architecture and parameter settings of the CV-LSTM+CNN model.
| Layer (Type) | Output Shape | Parameters |
|---|---|---|
| batch_normalization_2 (BatchNormalization) | (None, 54, 1) | 4 |
| Reshape (Reshape) | (None, 9, 6, 1) | 0 |
| Time_distribution (TimeDistributed) | (None, 9, 6, 256) | 264,192 |
| dropout_2 (Dropout) | (None, 9, 6, 256) | 0 |
| batch_normalization_3 (BatchNormalization) | (None, 9, 6, 256) | 1024 |
| Time_distribution_1 (TimeDistributed) | (None, 9, 6, 256) | 262,400 |
| conv1d_4 (Conv1D) | (None, 9, 4, 256) | 196,864 |
| average_pooling2d_1 (AveragePooling2D) | (None, 4, 2, 256) | 0 |
| flatten_1 (Flatten) | (None, 2048) | 0 |
| dropout_3 (Dropout) | (None, 2048) | 0 |
| dense_3 (Dense) | (None, 128) | 262,272 |
| dense_4 (Dense) | (None, 64) | 8256 |
| dense_5 (Dense) | (None, 1) | 65 |
| Total parameters: 995,077 | | |
| Trainable parameters: 994,563 | | |
| Non-trainable parameters: 514 | | |
Figure 9A schematic diagram of the proposed IMG-CNN model.
Figure 10The pseudo-code of the IMG-CNN model.
Architecture and parameter settings of the IMG-CNN model.
| Layer (Type) | Output Shape | Parameters |
|---|---|---|
| Conv2d (Conv2D) | (None, 224, 224, 256) | 7168 |
| Activation (Activation) | (None, 224, 224, 256) | 0 |
| batch_normalization (BatchNormalization) | (None, 224, 224, 256) | 1024 |
| Conv2d_1 (Conv2D) | (None, 224, 224, 128) | 295,040 |
| Activation_1 (Activation) | (None, 224, 224, 128) | 0 |
| max_pooling2d (MaxPooling2D) | (None, 74, 74, 128) | 0 |
| dropout (Dropout) | (None, 74, 74, 128) | 0 |
| Conv2d_2 (Conv2D) | (None, 72, 72, 64) | 73,792 |
| Activation_2 (Activation) | (None, 72, 72, 64) | 0 |
| batch_normalization_1 (BatchNormalization) | (None, 72, 72, 64) | 256 |
| flatten (Flatten) | (None, 331776) | 0 |
| dense (Dense) | (None, 512) | 169,869,824 |
| dropout_1 (Dropout) | (None, 512) | 0 |
| dense_1 (Dense) | (None, 64) | 32,832 |
| dense_2 (Dense) | (None, 1) | 65 |
| Total parameters: 170,280,001 | | |
| Trainable parameters: 170,279,361 | | |
| Non-trainable parameters: 640 | | |
Figure 11Confusion matrix for the proposed CV-CNN model.
Results of the proposed CV-CNN model using different evaluation metrics, based on 10-fold cross-validation.
| Fold | Precision (Recovered) | Precision (Died) | Precision (Macro) | Precision (Weighted) | Recall (Recovered) | Recall (Died) | Recall (Macro) | Recall (Weighted) | F1-Score (Recovered) | F1-Score (Died) | F1-Score (Macro) | F1-Score (Weighted) | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.79 | 0.93 | 0.86 | 0.86 | 0.94 | 0.75 | 0.85 | 0.85 | 0.86 | 0.83 | 0.85 | 0.85 | 0.85 |
| 2 | 0.80 | 0.93 | 0.86 | 0.86 | 0.94 | 0.76 | 0.85 | 0.85 | 0.86 | 0.83 | 0.85 | 0.85 | 0.85 |
| 3 | 0.53 | 0.95 | 0.74 | 0.74 | 1.00 | 0.09 | 0.54 | 0.55 | 0.69 | 0.17 | 0.43 | 0.43 | 0.55 |
| 4 | 0.59 | 0.95 | 0.77 | 0.77 | 0.98 | 0.30 | 0.64 | 0.64 | 0.74 | 0.46 | 0.60 | 0.60 | 0.64 |
| 5 | 0.69 | 0.93 | 0.81 | 0.81 | 0.96 | 0.57 | 0.77 | 0.77 | 0.80 | 0.71 | 0.76 | 0.76 | 0.77 |
| 6 | 0.74 | 0.93 | 0.84 | 0.83 | 0.95 | 0.66 | 0.80 | 0.81 | 0.83 | 0.77 | 0.80 | 0.80 | 0.81 |
| 7 | 0.81 | 0.93 | 0.87 | 0.87 | 0.94 | 0.78 | 0.86 | 0.86 | 0.87 | 0.85 | 0.86 | 0.86 | 0.86 |
| 8 | 0.50 | 0.00 | 0.25 | 0.25 | 1.00 | 0.00 | 0.50 | 0.50 | 0.67 | 0.00 | 0.33 | 0.34 | 0.50 |
| 9 | 0.81 | 0.92 | 0.87 | 0.87 | 0.94 | 0.77 | 0.86 | 0.86 | 0.87 | 0.84 | 0.86 | 0.86 | 0.86 |
| 10 | 0.65 | 0.92 | 0.78 | 0.78 | 0.96 | 0.47 | 0.71 | 0.72 | 0.77 | 0.62 | 0.70 | 0.70 | 0.72 |
Figure 12Confusion matrix for the proposed LSTM+CNN model.
Results of the proposed CV-LSTM + CNN model using different evaluation metrics, based on 10-fold cross-validation.
| Fold | Precision (Recovered) | Precision (Died) | Precision (Macro) | Precision (Weighted) | Recall (Recovered) | Recall (Died) | Recall (Macro) | Recall (Weighted) | F1-Score (Recovered) | F1-Score (Died) | F1-Score (Macro) | F1-Score (Weighted) | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.81 | 0.93 | 0.87 | 0.87 | 0.94 | 0.77 | 0.86 | 0.86 | 0.87 | 0.84 | 0.85 | 0.85 | 0.86 |
| 2 | 0.83 | 0.63 | 0.73 | 0.73 | 0.48 | 0.90 | 0.69 | 0.69 | 0.61 | 0.74 | 0.68 | 0.67 | 0.69 |
| 3 | 0.81 | 0.92 | 0.86 | 0.86 | 0.93 | 0.78 | 0.85 | 0.85 | 0.87 | 0.84 | 0.85 | 0.85 | 0.85 |
| 4 | 0.76 | 0.61 | 0.68 | 0.69 | 0.47 | 0.85 | 0.66 | 0.66 | 0.58 | 0.71 | 0.65 | 0.65 | 0.66 |
| 5 | 0.81 | 0.93 | 0.87 | 0.87 | 0.94 | 0.77 | 0.86 | 0.86 | 0.87 | 0.84 | 0.86 | 0.86 | 0.86 |
| 6 | 0.81 | 0.92 | 0.86 | 0.86 | 0.93 | 0.77 | 0.85 | 0.85 | 0.86 | 0.84 | 0.85 | 0.85 | 0.85 |
| 7 | 0.63 | 0.50 | 0.56 | 0.56 | 0.01 | 0.99 | 0.50 | 0.50 | 0.03 | 0.66 | 0.35 | 0.34 | 0.50 |
| 8 | 0.96 | 0.50 | 0.73 | 0.73 | 0.02 | 1.00 | 0.51 | 0.51 | 0.04 | 0.67 | 0.35 | 0.35 | 0.51 |
| 9 | 0.79 | 0.92 | 0.86 | 0.86 | 0.94 | 0.75 | 0.84 | 0.84 | 0.86 | 0.83 | 0.84 | 0.84 | 0.84 |
| 10 | 0.10 | 0.49 | 0.30 | 0.30 | 0.00 | 0.98 | 0.49 | 0.49 | 0.00 | 0.66 | 0.33 | 0.33 | 0.49 |
Figure 13Confusion matrix for the proposed IMG-CNN model.
Evaluation metrics for the IMG-CNN model.
| Performance Metric | Value | Performance Metric | Value | Performance Metric | Value | Performance Metric | Value |
|---|---|---|---|---|---|---|---|
| Tp | 124 | val_Tp | 305 | accuracy | 0.85 | val_accuracy | 0.94 |
| Fp | 2 | val_Fp | 0 | precision | 0.98 | val_precision | 1 |
| Tn | 94 | val_Tn | 177 | recall | 0.77 | val_recall | 0.91 |
| Fn | 36 | val_Fn | 30 | AUC | 0.94 | val_AUC | 0.936 |
| loss | 0.27 | val_loss | 0.22 | | | | |
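The reported IMG-CNN metrics follow directly from the validation confusion counts in the table (val_Tp = 305, val_Fp = 0, val_Tn = 177, val_Fn = 30); a short check:

```python
def metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn)
    return {
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "precision":   precision,
        "recall":      recall,
        "specificity": tn / (tn + fp),
        "f1":          2 * precision * recall / (precision + recall),
    }

# Validation counts for the IMG-CNN model, from the table above
val = metrics(tp=305, fp=0, tn=177, fn=30)
print(round(val["accuracy"], 4))     # 0.9414 -> the reported 94.14%
print(val["precision"])              # 1.0    -> the reported 100%
print(round(val["recall"], 2))       # 0.91   -> the reported 91.0%
print(val["specificity"])            # 1.0    -> the reported 100%
print(round(val["f1"], 4))           # 0.9531 -> the reported 95.3%
```

With zero false positives, precision and specificity are exactly 1, which is why the abstract can report 100% for both.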
Figure 14Loss, AUC, precision, recall, and accuracy between the training and validation phases, with the number of epochs for the IMG-CNN model.
Evaluation metrics for the three proposed Deep-Risk models.
| Models | Precision (Recovered) | Precision (Died) | Precision (Avg.) | Recall (Recovered) | Recall (Died) | Recall (Avg.) | F1-Score (Recovered) | F1-Score (Died) | F1-Score (Avg.) | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| CV-LSTM + CNN | 81% | 92% | 86.5% | 93% | 77% | 85% | 86% | 84% | 85% | 85.27% |
| CV-CNN | 81% | 93% | 87% | 94% | 78% | 86% | 87% | 85% | 86% | 86.06% |
| IMG-CNN | 83% | 100% | 91.5% | 100% | 84% | 92% | 91% | 91% | 91% | 94.14% |
Comparison of the performance evaluation metrics of the different models presented in previous studies and our proposed IMG-CNN model using the same clinical dataset.
| Models | Rank | Accuracy (%) | AUC (%) |
|---|---|---|---|
| NN | 2 | 89.98 | 93 |
| KNN | 3 | 89.83 | 90 |
| SVM | 4 | 89.02 | 88 |
| RF | 5 | 87.93 | 94 |
| LR | 6 | 87.91 | 92 |
| DT | 7 | 86.87 | 93 |
| IMG-CNN (proposed) | 1 | 94.14 | 93.6 |
Figure 15Comparison of the performance evaluation metrics of the different models presented in previous studies (NN, KNN, SVM, RF, LR, and DT) and our proposed IMG-CNN model using the same clinical dataset.
Comparison of the performance evaluation metrics of the different models presented in previous studies and our proposed IMG-CNN model using deep learning models.
| Models | Rank | Accuracy (%) | Precision (%) | Recall (%) | Specificity (%) | F1-score (%) | AUC (%) | Loss |
|---|---|---|---|---|---|---|---|---|
| ABC-CNN | 3 | 92.32 | 94.7 | 97.4 | 89.65 | 96.0 | 53.3 | 0.25 |
| ACO-CNN | 2 | 93.10 | 95.6 | 97.3 | 90.57 | 96.4 | 62.5 | 0.26 |
| BOA-CNN | 7 | 91.37 | 94.1 | 97.0 | 89.08 | 95.1 | 53.5 | 0.28 |
| EHO-CNN | 5 | 91.86 | 94.1 | 98.0 | 85.69 | 95.9 | 53.2 | 0.23 |
| GA-CNN | 4 | 92.18 | 94.8 | 97.8 | 88.32 | 96.1 | 57.5 | 0.29 |
| PSO-CNN | 6 | 91.85 | 95.0 | 96.4 | 88.17 | 95.5 | 61.5 | 0.28 |
| IMG-CNN (proposed) | 1 | 94.14 | 100 | 91.0 | 100 | 95.3 | 93.6 | 0.22 |
Figure 16Comparison of the performance evaluation metrics of the different models presented in previous studies and our proposed IMG-CNN model using deep learning models.
Figure 17Loss comparison among the different models presented in previous studies and our proposed IMG-CNN model using deep learning models.