Syeda Amna Rizvi, Muhammad Umair, Muhammad Aamir Cheema.
Abstract
The coronavirus has a high basic reproduction number (R0) and has caused the global COVID-19 pandemic. Governments are implementing lockdowns that are leading to economic fallout in many countries. Policy makers can make better decisions if they are provided with indicators connected to the spread of the disease. This study aims to cluster countries using social, economic, health and environmental metrics that affect the spread of the disease, so that policies to control it can be implemented. Countries with similar factors can then take proactive steps to fight the pandemic. Data are acquired for 79 countries, and 18 feature variables (factors associated with the spread of COVID-19) are selected. Pearson product-moment correlation analysis is performed between each feature variable and, separately, cumulative death cases and cumulative confirmed cases, to gain insight into how these factors relate to the spread of COVID-19. The unsupervised k-means algorithm is used, with a feature set that includes economic and environmental indicators and disease prevalence along with the COVID-19 variables. The learning model groups the countries into 4 clusters on the basis of all 18 feature variables. We also present an analysis of the correlation between the selected feature variables and COVID-19 confirmed cases and deaths. Prevalence of underlying diseases shows a strong correlation with COVID-19, whereas environmental health indicators are weakly correlated with COVID-19.
Keywords: COVID-19; COVID-19 Confirmed Cases; COVID-19 Death Cases; Clustering Methods; Disease Prevalence; KMeans; Pearson Correlation; Second Wave; Unsupervised Learning
Year: 2021 PMID: 34253943 PMCID: PMC8264526 DOI: 10.1016/j.chaos.2021.111240
Source DB: PubMed Journal: Chaos Solitons Fractals ISSN: 0960-0779 Impact factor: 5.944
Description of feature variables.
| Notions and data sources | Variable name | Description |
|---|---|---|
| COVID-19 cases | Cum_confirm cases | Cumulative confirmed COVID-19 cases. |
| COVID-19 cases | Cum_deaths | Cumulative COVID-19 death cases. |
| Socio-economic Indicators | GDP_per_capita | Gross domestic product per capita: a country's economic output divided by its population. |
| Socio-economic Indicators | Health Exp | Health expenditure per capita: the average amount a country spends on health services per person. |
| Socio-economic Indicators | Alcohol Consumption | Alcohol consumption per capita among people aged 15 and over. |
| Socio-economic Indicators | Smoking Prevalence | Percentage of people over the age of 15 who currently smoke any tobacco product on a daily or non-daily basis. |
| Socio-economic Indicators | Life Expectancy | The number of years a newborn would be expected to live if the mortality patterns prevailing at its birth were to stay the same throughout its life. |
| Disease Prevalence | Tuberculosis cases | Prevalence in terms of tuberculosis cases. |
| Disease Prevalence | Cardiovascular Disease | Prevalence in terms of cases due to known cardiovascular causes such as ischaemic heart disease, stroke, etc. |
| Disease Prevalence | Diabetes Mellitus | Prevalence in terms of cases of diabetes mellitus. |
| Disease Prevalence | Respiratory Infections | Prevalence in terms of cases of respiratory infections such as pneumonia and bronchitis. |
| Disease Prevalence | Asthma | Prevalence in terms of cases of the chronic lung disease asthma. |
| Disease Prevalence | Nutritional deficiencies | Prevalence in terms of cases of nutritional deficiencies, including protein-energy malnutrition, iodine deficiency, vitamin A deficiency, iron deficiency, and other nutritional deficiencies. |
| Environmental Performance Indicators | PM2.5 exposure (PMD) | Indicator based on life years lost per 100,000 people due to exposure to fine air particulate matter smaller than 2.5 micrometers. |
| Environmental Performance Indicators | Environmental Performance Index (EPI) | A score from 1 to 100 based on how close a country is to established environmental health targets. |
| Environmental Performance Indicators | Environmental Health (HLT) | The environmental health score of a country. |
| Environmental Performance Indicators | Air Quality (AIR) | Country score based on the effects of air pollution. |
| Environmental Performance Indicators | Household Solid Fuels (HAD) | Indicator in the AIR issue category; the score is based on lives lost due to the use of household solid fuels. |
| Environmental Performance Indicators | Sanitation and Drinking Water (H2O) | Country score based on how well a country protects human health from environmental risks, measured on two indicators: UWD and USD. |
| Environmental Performance Indicators | Unsafe Water for Drinking (UWD) | Country score based on life years lost per 100,000 people due to inadequate drinking water facilities. |
Fig. 1. Optimal number of clusters using the elbow method.
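The elbow method behind Fig. 1 can be sketched as follows: run k-means for a range of k, record the within-cluster sum of squared distances (inertia), and look for the k beyond which the decrease flattens. This is a minimal illustration under stated assumptions, not the authors' implementation. The `kmeans` and `inertia` helpers, the evenly spaced initial centroids, and the toy points are all hypothetical; a real analysis would use the scaled country features and a more robust initialization.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    n = len(points)
    # Simplifying assumption: k evenly spaced rows serve as initial centroids.
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical toy data with three visible groups (NOT the study's data).
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]

inertias = []
for k in range(1, 5):
    labels, centroids = kmeans(points, k)
    inertias.append(inertia(points, labels, centroids))

for k, value in enumerate(inertias, start=1):
    print(f"k={k}: inertia={value:.2f}")
```

On this toy data the inertia falls sharply up to k = 3 and barely changes afterwards, so the elbow sits at 3. The study reports 4 clusters for its own data.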
Fig. 2. Pearson correlation heatmap for COVID-19 confirmed cases.
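Each cell of such a heatmap is a Pearson product-moment correlation coefficient between two variables. A minimal sketch of that computation is given here for orientation. The function name `pearson_r` and the toy values are hypothetical and are not taken from the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))   # unnormalized covariance
    var_x = sum(a * a for a in dx)             # unnormalized variances
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values: a perfectly linear relation gives r close to 1.
diabetes_prevalence = [2.0, 4.0, 6.0, 8.0]
cum_confirmed = [1.0, 2.0, 3.0, 4.0]
r = pearson_r(diabetes_prevalence, cum_confirmed)
print(f"r = {r:.3f}")
```

Values near +1 or -1 indicate a strong linear relation and values near 0 a weak one, which is the sense in which the abstract contrasts disease prevalence with environmental health indicators.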
Clusters produced based on COVID-19 Cases.
| Cluster | Countries |
|---|---|
| (a) Cluster no. 1 | Albania, Algeria, Argentina, Armenia, Bahrain, Belarus, Bosnia and Herzegovina, Brazil, Bulgaria, Chile, Colombia, Costa Rica, Croatia, Dominican Republic, Ecuador, Hungary, Iran (Islamic Republic of), Iraq, Kazakhstan, Kuwait, Malaysia, Mexico, Oman, Panama, Poland, Qatar, Romania, Russian Federation, Saudi Arabia, Serbia, Turkey, Ukraine, United Arab Emirates |
| (b) Cluster no. 2 | Austria, Belgium, Canada, Denmark, France, Germany, Iceland, Ireland, Israel, Italy, Japan, Luxembourg, Netherlands, Norway, Portugal, Singapore, Spain, Sweden, Switzerland, United States of America, United Kingdom |
| (c) Cluster no. 3 | China, India |
| (d) Cluster no. 4 | Afghanistan, Azerbaijan, Bangladesh, Bolivia, Djibouti, Ethiopia, Egypt, Ghana, Guatemala, Honduras, Indonesia, Madagascar, Mauritania, Morocco, Nepal, Nigeria, Pakistan, Philippines, Senegal, South Africa, Sudan, Uzbekistan, Zambia |
Cluster mean of variables for COVID-19 confirmed cases.
| Feature variables | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
|---|---|---|---|---|
| 0.241 | 1.906 | |||
| 1.367 | ||||
| 1.371 | ||||
| 1.451 | ||||
| 1.475 | ||||
| 1.392 | ||||
| 0.043 | 1.220 | |||
| 1.344 | ||||
| 1.412 | ||||
| 5.277 | ||||
| 5.494 | ||||
| 5.714 | ||||
| 5.172 | 0.021 | |||
| 6.043 | ||||
| 1.441 | ||||
| 4.956 | 0.183 | |||
| 0.128 | 0.578 | 0.046 | ||
| 0.203 | 0.142 | |||
| 0.118 | 1.122 |
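A cluster-mean table of this kind can be derived by averaging each feature over the countries assigned to each cluster. The sketch below shows that step only. The helper `cluster_means`, the labels, and the numbers are hypothetical illustrations, and whether the study averaged raw or scaled values is not stated in this record.

```python
from collections import defaultdict


def cluster_means(rows, labels, feature_names):
    """Return {cluster label: {feature name: mean over that cluster}}."""
    grouped = defaultdict(list)
    for row, lab in zip(rows, labels):
        grouped[lab].append(row)
    return {
        lab: {
            name: sum(col) / len(col)
            for name, col in zip(feature_names, zip(*members))
        }
        for lab, members in sorted(grouped.items())
    }


# Hypothetical toy input: three countries, two features, two clusters.
feature_names = ["GDP_per_capita", "Health Exp"]
rows = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
labels = [0, 0, 1]

means = cluster_means(rows, labels, feature_names)
for lab, feats in means.items():
    print(lab, feats)
```

Comparing these per-cluster averages feature by feature is what lets a cluster be characterized, for example as high in health expenditure or high in disease prevalence.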
Fig. 3. Pearson correlation heatmap for COVID-19 death cases.
Cluster mean of variables for COVID-19 death cases.
| Feature variables | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
|---|---|---|---|---|
| 0.241 | 1.260 | |||
| 1.367 | ||||
| 1.371 | ||||
| 1.451 | ||||
| 1.475 | ||||
| 1.392 | ||||
| 1.220 | 0.043 | |||
| 1.344 | 0.00016 | |||
| 1.412 | ||||
| 5.277 | ||||
| 5.494 | ||||
| 5.7142 | ||||
| 5.172 | 0.0212 | |||
| 6.043 | ||||
| 1.441 | ||||
| T | 4.956 | 0.183 | ||
| 0.578 | 0.128 | 0.046 | ||
| 0.142 | 0.203 | |||
| 1.122 | 0.118 |
Fig. 4. Choropleth maps for COVID-19 confirmed cases.
Fig. 5. Choropleth maps for COVID-19 death cases.