Abstract
BACKGROUND: After claiming nearly five hundred thousand lives globally, the COVID-19 pandemic is showing no signs of slowing down. While the UK, USA, Brazil and parts of Asia are bracing themselves for the second wave (or the extension of the first wave), it is imperative to identify the primary social, economic, environmental, demographic, ethnic, cultural and health factors contributing towards COVID-19 infection and mortality numbers to facilitate mitigation and control measures.
Year: 2020 PMID: 33095811 PMCID: PMC7584177 DOI: 10.1371/journal.pone.0241165
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of features and their statistics (i.e., mean, standard deviation (dev.), maximum (max.) and minimum (min.)).
The features, in the order shown under “Feature name”, are: GDP, inter-state distance based on lat-long coordinates, gender, ethnicity, quality of health care facility, number of homeless people, total infections and deaths, population density, airport passenger traffic, age group, days for infection and death to peak, number of people tested for COVID-19, and days elapsed between the first reported infection and the imposition of lockdown measures in a given state.
| Feature name | Abbreviation | Mean | Dev. | Max | Min |
|---|---|---|---|---|---|
| Gross Domestic Product | | 412286.6 | 527087.5 | 3018337 | 34154 |
| Distance | | 22.1 | 17.6 | 90.7 | 0.0 |
| Gender | | 0.5 | 0.01 | 0.52 | 0.48 |
| Ethnicity | | 0.24 | 0.28 | 0.93 | 0.0 |
| Healthcare index | | 25.8 | 14.8 | 51.0 | 1.0 |
| Homeless | | 11963.48 | 21859.53 | 136826.0 | 946.0 |
| Total Cases | | 32155.46 | 39521.26 | 168663.0 | 487.0 |
| Total Death | | 1677.86 | 2428.85 | 11770.0 | 10.0 |
| Population Density | | 173.39 | 210.6 | 1035.64 | 1.12 |
| Busy Airport Score | | 375630.44 | 249207.97 | 1019704.0 | 100000.0 |
| Age group | | 362738.87 | 439896.78 | 3125816.0 | 6853.0 |
| Peak Infected | | 60.38 | 27.55 | 128.0 | 13.0 |
| Peak Death | | 58.88 | 23.7 | 112.0 | 14.0 |
| Testing | | 64353.04 | 24981.93 | 161172.0 | 31192.0 |
| FirstInf-Lockdown | | 22.64 | 14.13 | 63.0 | 7.0 |
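The four statistics reported per feature can be computed along the lines of the minimal sketch below. The input values are made-up placeholders rather than the study's data, `summarize` is a hypothetical helper name, and the use of the sample standard deviation is an assumption, since the table does not state which estimator was used.

```python
import statistics


def summarize(values):
    """Return mean, standard deviation, max and min for one feature column."""
    return {
        "mean": statistics.mean(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "dev": statistics.stdev(values),
        "max": max(values),
        "min": min(values),
    }


# Placeholder values for illustration only, not taken from the paper.
days_to_lockdown = [7, 14, 21, 28, 35]
summary = summarize(days_to_lockdown)
print(summary)
```

Swapping `statistics.stdev` for `statistics.pstdev` would give the population standard deviation instead.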
Values of parameters.
| Method | Parameter |
|---|---|
| SVM | |
| SGD | |
| NC | |
| DT | |
| NB | |
| Extra trees | |
| Regression | |
| KBinsDiscretizer | 'Number of bins': 5 |
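To illustrate what discretizing a continuous score into 5 bins involves, here is a minimal pure-Python sketch. It is a stand-in for the idea, not scikit-learn's `KBinsDiscretizer` implementation; the equal-width binning strategy, the helper name `equal_width_bins`, and the input values are all assumptions, since the table specifies only the number of bins.

```python
def equal_width_bins(values, n_bins=5):
    """Map each value to an ordinal bin index in the range 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum lies on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Placeholder scores for illustration only, not taken from the paper.
scores = [0.0, 1.0, 2.5, 5.0, 7.5, 10.0]
bins = equal_width_bins(scores, n_bins=5)
print(bins)
```

Equal-width bins are sensitive to outliers; a quantile-based strategy would instead place roughly the same number of states in each bin.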
Fig 1. Accuracy scores of the 5-tuple of features for the output variables of infected and death scores for different supervised learning techniques.
Fig 2. Participation of features in 5-tuples of key feature combinations for infected score (top) and death score (bottom).
Refer to Table 1 for the feature abbreviations.
Fig 3. Feature importance: (a) 5 discriminatory features along with their importance scores and standard deviation (in decreasing order) affecting infected and death scores based on randomized decision trees; (b) principal component analysis on the 5 features showing that the most highly COVID-19 affected states form two groups: (1) early epicenters (colored red) and (2) states experiencing a strong second wave or peaking late w.r.t. infection and death counts (colored brown).
Fig 4. Effect of COVID-19 infected and death score on age: Comparison of accuracy scores for feature set of (a) age (≤40, >40) and (b) age (40–60, beyond 60).
Fig 5. Preprocessing to study the variation in feature values for the top and bottom k US states on the basis of COVID-19 infected and death scores.
Fig 6. Identification of discriminating features: (a) maximum difference in weighted average percentile for the top and bottom k COVID-19 affected US states; (b) heatmap showing the pairwise Pearson correlation between discriminating features.
Fig 7. Role of mobility and testing on spread: (a) the effect of testing and lockdown on infection spread: testing rate (blue line) increases steadily over time and the ratio of confirmed cases to tests drops post lockdown due to reduced contact; (b) correlation between mobility (or distance) and days for infected to peak in neighboring NYC boroughs.
Multiple linear regression table with R², coefficient and p value for input features (population density, normalized busy airport, pre-infected count, pre-death count) and observed factors (post-infected count and post-death count).
| Input feature | Observed factor | R² | Coefficient | p value |
|---|---|---|---|---|
| Constant | Post-Inf | 0.92 | -0.92 | 0.068 |
| PD | | | 0.17 | 0 |
| Air | | | 0.19 | 0 |
| Pre-Inf | | | 0.81 | 0 |
| Constant | Post-Dth | 0.94 | -0.68 | 0.106 |
| PD | | | 0.17 | 0 |
| Air | | | 0.06 | 0.005 |
| Pre-Inf | | | 0.91 | 0 |
| Constant | Post-Inf | 0.83 | -1.37 | 0.074 |
| PD | | | 0.20 | 0 |
| Air | | | 0.18 | 0 |
| Pre-Dth | | | 0.67 | 0 |
| Constant | Post-Dth | 0.82 | -1.17 | 0.116 |
| PD | | | 0.20 | 0 |
| Air | | | 0.05 | 0.213 |
| Pre-Dth | | | 0.76 | 0 |
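A multiple linear regression of this kind produces an intercept (the "Constant" row), one coefficient per input feature, and an R² for the fit. The sketch below shows one way to obtain these with ordinary least squares via the normal equations, in pure Python. It is an illustration under stated assumptions, not the authors' pipeline: `fit_ols` and `solve_linear_system` are hypothetical helper names, the data are synthetic, any scaling or normalization applied in the study is not reproduced, and p values are not computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations apply to both.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, target):
    """Return ([intercept, coef_1, ...], r_squared) for a least-squares fit."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 models the constant
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - p) ** 2 for y, p in zip(target, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic data generated from y = 1 + 2*x1 + 3*x2 (not the study's data).
features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
target = [1.0, 3.0, 4.0, 6.0, 8.0]
beta, r_squared = fit_ols(features, target)
print(beta, r_squared)
```

In practice a statistics library would also report standard errors and p values for each coefficient; the normal-equation approach here is chosen only to keep the example dependency-free.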