Najada Firza, Alfonso Monaco.
Abstract
The COVID-19 pandemic has spread worldwide, becoming a global health emergency. The main goal of this work is to present a framework for studying the impact of COVID-19 on Italian territory during the first year of the pandemic. Our study was based on different kinds of health features and lifestyle risk factors and exploited the capabilities of machine learning techniques. Furthermore, we used our model to verify how these factors influenced the severity of the pandemic. Using publicly available datasets provided by the Italian Civil Protection, the Italian Ministry of Health and the Italian National Statistical Institute, we cross-validated the regression performance of a Random Forest model over 21 Italian regions. The robustness of the predictions was assessed by comparison with two other state-of-the-art regression tools. Our results showed that the proposed models agreed well with the data. We found that the features most strongly associated with the severity of COVID-19 in Italy are the percentage of people aged over 65 vaccinated against the flu (24.6%), together with individual lifestyle behaviors. These findings could shed more light on the clinical and physiological aspects of the disease.
Keywords: COVID-19; feature selection; flu; forecasting models; generalized linear model; lifestyle risk factor; machine learning; random forests; support vector machine; vaccination
Year: 2022 PMID: 36231838 PMCID: PMC9565136 DOI: 10.3390/ijerph191912538
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 4.614
Figure 1. Historical series of positives (panel A) and deaths (panel B) for COVID-19 in Italy. The graph includes the values from 22 February to 22 November 2020.
Figure 2. Flowchart of the proposed methodology.
Summary table describing the independent features used in our model.
| Independent Feature | Explanation |
|---|---|
| Allergic subjects | percentage of people affected by chronic allergic diseases in 2016 in each Italian region |
| Flu vaccinated | percentage of people over 65 years old vaccinated against the seasonal flu in each Italian region in 2019 |
| Sedentary subjects | percentage of subjects in each Italian region who neither engage in any physical activity in their free time nor do heavy work, calculated from 2015 to 2018 |
| Deaths respiratory system | number of deaths due to diseases of the respiratory system per |
| Asthmatics | percentage of subjects suffering from chronic bronchitis and bronchial asthma in each Italian region in 2019 |
| Alcohol consumers | percentage of subjects who report a high daily alcohol consumption in each Italian region, calculated from 2015 to 2018 |
| Old-age index | ratio between the population aged 65 years and over and that under 15 in each Italian region in 2019 |
| Population density | population density expressed in inhabitants per square kilometer in each Italian region in 2019. |
| Passenger | data collected by each Italian national airport about the passengers who departed from or landed at that airport in 2018. |
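Two of the features above are simple derived quantities, so a short sketch may help make their definitions concrete. The function names and the input numbers below are hypothetical and are not taken from the paper's data. The old-age index is computed here as a plain ratio, as the table states; statistical offices often report the same quantity multiplied by 100, which would be a different scaling of the same feature.

```python
def old_age_index(pop_65_plus: float, pop_under_15: float) -> float:
    """Ratio of residents aged 65 and over to residents under 15."""
    if pop_under_15 <= 0:
        raise ValueError("population under 15 must be positive")
    return pop_65_plus / pop_under_15


def population_density(inhabitants: float, area_km2: float) -> float:
    """Inhabitants per square kilometer."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return inhabitants / area_km2


# Made-up figures for a fictional region, for illustration only.
example_index = old_age_index(pop_65_plus=230_000, pop_under_15=130_000)
example_density = population_density(inhabitants=1_000_000, area_km2=5_000)
```

A ratio above 1 means the region has more residents aged 65 and over than residents under 15.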
Summary table with descriptive statistics of the independent features.
| Independent Feature | Mean | Standard Deviation | Median | 25th Percentile | 75th Percentile |
|---|---|---|---|---|---|
| Allergic subjects | | | | | |
| Flu vaccinated | | | | | |
| Sedentary subjects | | | | | |
| Deaths respiratory system | | | | | |
| Asthmatics | | | | | |
| Alcohol consumers | | | | | |
| Old-age index | | | | | |
| Population density | | | | | |
| Passenger | 8,841,969 | 1,412,266 | 3,193,386 | 223,436 | 8,893,672 |
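The columns of this table can be reproduced for any feature with Python's standard `statistics` module. The sketch below is only illustrative: the source does not say whether the standard deviation is the sample or the population version, nor which percentile interpolation was used, so the sample standard deviation and the "inclusive" quantile method are assumptions, and the input values are made up.

```python
import statistics


def summarize(values):
    """Return mean, SD, median and quartiles for one feature column."""
    # Assumption: "inclusive" interpolation for the 25th/50th/75th percentiles.
    p25, median, p75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # assumption: sample SD (n - 1)
        "median": median,
        "p25": p25,
        "p75": p75,
    }


# Made-up regional values, not data from the paper.
toy_feature = [10.0, 12.0, 14.0, 16.0, 18.0]
summary = summarize(toy_feature)
```

When the mean sits far above the median, as in the Passenger row, the distribution is strongly right-skewed, which is why the table reports quartiles alongside the mean and standard deviation.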
Figure 3. Correlation matrix for independent features, Crude Mortality Rate (CMR), and Crude Positivity Rate (CPR).
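A correlation matrix of this kind can be sketched in a few lines. The excerpt does not state which correlation coefficient the authors used, so Pearson's coefficient is an assumption here, and the column names and values are hypothetical placeholders chosen only to make the result easy to check.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("columns must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping column name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Artificial columns: one rises with feature_x, the other falls with it.
toy_columns = {
    "feature_x": [50.0, 55.0, 60.0, 65.0],
    "target_up": [2.0, 4.0, 6.0, 8.0],
    "target_down": [9.0, 7.0, 5.0, 3.0],
}
matrix = correlation_matrix(toy_columns)
```

In the toy data the two targets are exact linear functions of `feature_x`, so the coefficients sit at the extremes of the range; real regional data would give intermediate values.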
Figure 4. Agreement between the predicted values and the actual values for CMR (panel A) and CPR (panel B).
Summary table of regression performance measures obtained through the implemented models. MAPE: Mean Absolute Percentage Error; SD: Standard deviation.
| Predicted Values | Regression Models | MAPE (±SD) | Adjusted |
|---|---|---|---|
| Crude Positivity Rate | Random Forest | | |
| | Support Vector Machine | | |
| | Generalized Linear Model | | |
| Crude Mortality Rate | Random Forest | | |
| | Support Vector Machine | | |
| | Generalized Linear Model | | |
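The "MAPE (±SD)" column can be illustrated with a small sketch. MAPE is the mean of the absolute relative errors, expressed as a percentage. How the authors obtained the standard deviation is not stated in this excerpt; the sketch assumes it is taken over per-fold (or per-repetition) MAPE scores from cross-validation. All numbers are made up.

```python
import statistics


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mean_and_sd(scores):
    """Mean and sample standard deviation of a list of scores."""
    return statistics.mean(scores), statistics.stdev(scores)


# Two hypothetical cross-validation folds with made-up values.
fold_scores = [
    mape([100.0, 200.0], [110.0, 180.0]),  # relative errors of 10% and 10%
    mape([50.0, 80.0], [45.0, 100.0]),     # relative errors of 10% and 25%
]
mape_mean, mape_sd = mean_and_sd(fold_scores)
```

Because each error is divided by the actual value, MAPE is scale-free, which makes it convenient for comparing models on targets with different magnitudes such as CPR and CMR.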
Figure 5. Feature importance produced by RF, SVM and GLM to predict CMR (panel A) and CPR (panel B).
Figure 6. Geographical distribution in Italy of CPR (panel A), CMR (panel B) and the percentage of people over 65 years old vaccinated against the seasonal flu in 2019 (panel C).