Mahmoud Y Shams1, Omar M Elzeki2, Lobna M Abouelmagd3, Aboul Ella Hassanien4, Mohamed Abd Elfattah3, Hanaa Salem5.
Abstract
BACKGROUND AND
Keywords: Artificial intelligence; COVID-19; Healthy food; Machine learning; Nutrition analysis; Regression
Year: 2021 PMID: 34247134 PMCID: PMC8241585 DOI: 10.1016/j.compbiomed.2021.104606
Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589
Fig. 1 Proposed HANA model.
Fig. 2 Representative stages of the proposed HANA model pipeline.
Fig. 3 Examples of a scatter graph for the COVID-19 Healthy Diet Dataset.
Top six food categories (features) most correlated with death.
| Features | μ | σ² |
|---|---|---|
| Animal Fats | 4.138451 | 3.277778 |
| Cereals Excluding Beer | 4.376548 | 3.174437 |
| Vegetal Products | 29.304396 | 7.978798 |
| Animal Products | 20.695714 | 7.979141 |
| Eggs | 0.953890 | 0.642060 |
| Milk Excluding Butter | 5.109061 | 3.321855 |
| Spices | 0.281251 | 0.447500 |
Fig. 4 The distribution curves for different food categories and death.
Fig. 5 The transformation of features by using PCA.
Fig. 6 Enhancing features by using PCA.
Selected features based on PCA.
| PC # | Value | PC # | Value | PC # | Value | PC # | Value | PC # | Value |
|---|---|---|---|---|---|---|---|---|---|
| PC1 | 0.185161 | PC6 | 0.130819 | PC11 | 0.106391 | PC16 | 0.091944 | PC21 | 0.026076 |
| PC2 | 0.145267 | PC7 | 0.059881 | PC12 | 0.074000 | PC17 | 0.052747 | PC22 | 0.085176 |
| PC3 | 0.095224 | PC8 | 0.070417 | PC13 | 0.119977 | PC18 | 0.060325 | PC23 | 0.140997 |
| PC4 | 0.069354 | PC9 | 0.084768 | PC14 | 0.049968 | PC19 | 0.103454 | PC24 | 0.104525 |
| PC5 | 0.123344 | PC10 | 0.102241 | PC15 | 0.038403 | PC20 | 0.073350 | PC25 | 0.080452 |
PCA for feature reduction.
| PC # | Value | PC # | Value | PC # | Value | PC # | Value |
|---|---|---|---|---|---|---|---|
| PC1 | 0.185161 | PC13 | 0.119977 | PC3 | 0.095224 | PC12 | 0.074000 |
| PC2 | 0.145267 | PC11 | 0.106391 | PC16 | 0.091944 | PC20 | 0.073350 |
| PC23 | 0.140997 | PC24 | 0.104525 | PC22 | 0.085176 | PC8 | 0.070417 |
| PC6 | 0.130819 | PC19 | 0.103454 | PC9 | 0.084768 | PC4 | 0.069354 |
| PC5 | 0.123344 | PC10 | 0.102241 | PC25 | 0.080452 | PC18 | 0.060325 |
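The reduction table appears to keep the 20 largest of the 25 component values listed in the selection table. The sketch below reproduces that ranking in plain Python. It assumes the selection table is read column by column, so that its unlabeled columns hold PC11 to PC25; this reading is consistent with the one labeled entry available for cross-checking (PC23 = 0.140997). The helper name `top_components` and the cutoff `k=20` are illustrative choices, not taken from the paper.

```python
# Values transcribed from the "Selected features based on PCA" table.
# Assumption: the table is read column by column, so the unlabeled
# columns correspond to PC11..PC25.
values = [
    0.185161, 0.145267, 0.095224, 0.069354, 0.123344,  # PC1..PC5
    0.130819, 0.059881, 0.070417, 0.084768, 0.102241,  # PC6..PC10
    0.106391, 0.074000, 0.119977, 0.049968, 0.038403,  # PC11..PC15
    0.091944, 0.052747, 0.060325, 0.103454, 0.073350,  # PC16..PC20
    0.026076, 0.085176, 0.140997, 0.104525, 0.080452,  # PC21..PC25
]
scores = {f"PC{i}": v for i, v in enumerate(values, start=1)}


def top_components(scores, k):
    """Return the k (name, value) pairs with the largest values, descending."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical cutoff: 20 components, matching the size of the reduction table.
selected = top_components(scores, k=20)
selected_names = [name for name, _ in selected]
dropped_names = [name for name in scores if name not in selected_names]

for name, value in selected:
    print(f"{name}\t{value:.6f}")
print("dropped:", ", ".join(dropped_names))
```

Under this reading, the five components left out are the ones with the smallest values in the selection table.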
Comparison of the proposed framework regression prediction models based on evaluation metrics (The best results are represented in bold).
| Model | | MSE | RMSE | MAE | R² |
|---|---|---|---|---|---|
| Linear Regression | Ridge Regression | 0.00023083 | 0.01519314 | 0.01023939 | −0.15965093 |
| | Simple Linear Regularization | 0.00023091 | 0.01519604 | 0.01024034 | −0.16009405 |
| | Elastic Net Regression | 0.00018113 | 0.01345867 | 0.00873109 | |
| AdaBoost | | 0.00020749 | 0.01440446 | 0.00761746 | −0.04237952 |
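For readers who want to check the metric columns, the sketch below computes MSE, RMSE, MAE, and R² from their standard textbook definitions and confirms that the reported RMSE values equal the square root of the reported MSE values up to rounding. The function name `regression_metrics` and the toy inputs are illustrative assumptions; only the MSE/RMSE pairs in `reported` come from the table.

```python
import math


def regression_metrics(y_true, y_pred):
    """Standard regression metrics for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Toy data (not from the paper): predicting the mean of y_true gives R2 = 0,
# and anything worse than that baseline gives a negative R2.
baseline = regression_metrics([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
worse = regression_metrics([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])

# (MSE, RMSE) pairs as reported in the table above.
reported = {
    "Ridge Regression": (0.00023083, 0.01519314),
    "Elastic Net Regression": (0.00018113, 0.01345867),
    "AdaBoost": (0.00020749, 0.01440446),
}
for model, (mse, rmse) in reported.items():
    print(model, abs(math.sqrt(mse) - rmse) < 1e-6)
```

The negative R² values in the table are therefore not a typesetting error in themselves: under this definition, R² drops below zero whenever a model's squared error exceeds that of always predicting the mean of the target.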
Comparison of regression models by evaluation metrics (MSE, RMSE, MAE, and R²) before and after feature reduction.
| Model | | | | | | | |
|---|---|---|---|---|---|---|---|
| Linear Regression | Ridge Regression | 0.147 | 0.020 | 0.931 | 0.918 | 0.758 | 0.502 |
| | | 0.135 | 0.020 | 0.960 | 0.942 | 0.890 | 0.853 |
| | | 0.110 | 0.013 | 0.986 | 0.974 | 0.998 | 0.994 |
| | | 0.956 | 0.854 | 0.045 | 0.089 | 0.110 | 0.162 |
| | Simple Linear Regularization | 0.853 | 0.980 | 0.930 | 0.918 | 0.758 | 0.502 |
| | | 0.865 | 0.980 | 0.960 | 0.942 | 0.890 | 0.853 |
| | | 0.890 | 0.987 | 0.986 | 0.974 | 0.998 | 0.994 |
| | | 0.044 | 0.146 | 0.045 | 0.089 | 0.110 | 0.162 |
| | Elastic Net Regression | 0.069 | 0.082 | 0.070 | 0.082 | 0.071 | 0.291 |
| | | 0.040 | 0.058 | 0.040 | 0.058 | 0.313 | 0.724 |
| | | 0.014 | 0.026 | 0.014 | 0.026 | 0.949 | 0.980 |
| | | 0.955 | 0.911 | 0.955 | 0.911 | 0.228 | 0.213 |
| AdaBoost | | 0.242 | 0.498 | 0.242 | 0.498 | 0.929 | 0.709 |
| | | 0.110 | 0.147 | 0.110 | 0.147 | 0.687 | 0.276 |
| | | 0.002 | 0.006 | 0.002 | 0.006 | 0.051 | 0.020 |
| | | 0.890 | 0.838 | 0.890 | 0.838 | 0.772 | 0.787 |
The comparative study between the proposed ML regression model and [37], [39] using the same database.
| Author | Methodology | Dataset used | Problem Type | Metrics | Comment |
|---|---|---|---|---|---|
| Ordás et al. [ | (PCA, K-means Algorithm) | | Classification | Accuracy = 95% | The correlations between eating habits and death cases in 170 countries during the COVID-19 pandemic were assessed to find the relationship between these habits and death rates based on ML. |
| Shams et al. [ | SVM Model based on RBF | | Classification | Accuracy = 99.73% | This architecture can forecast the human cases affected by the COVID-19 pandemic due to each patient's diet habits and system. |
| | SVM Model with Linear | | | Accuracy = 99.83% | |
| | SVM Model with Linear Kernel | | | Accuracy = 79.30% | |
| | Deep Learning | | | Accuracy = 99.72% | |
| Our Proposed (HANA Model) | Elastic Net Regression | | Regression | MSE = 0.00018113 | The proposed regression model is able to forecast the human cases affected by the COVID-19 pandemic due to each patient's diet habits and system, evaluated using MSE. |
| | (PCA, Backpropagation Neural Networks) | | Classification | Accuracy = 98.76% | |
Fig. 7 The root cause analysis of the proposed HANA model.