Sujoy Ghosh1,2, Saikat Sinha Roy3.
Abstract
BACKGROUND: Studies examining factors responsible for COVID-19 incidence have mostly focused on the national or sub-national level. A global-level characterization of contributing factors and temporal trajectories of disease incidence is currently lacking. Here we conducted a global-scale analysis of COVID-19 infections to identify key factors associated with early disease incidence. Additionally, we compared longitudinal trends of COVID-19 incidence at a per-country level, and classified countries based on COVID-19 incidence trajectories and effects of lockdown responses.
Keywords: COVID-19; Clustering; Factors; Global; Lockdown; Modeling; Regression
Year: 2022 PMID: 36242027 PMCID: PMC9568998 DOI: 10.1186/s12889-022-14336-w
Source DB: PubMed Journal: BMC Public Health ISSN: 1471-2458 Impact factor: 4.135
Sources for demographic, meteorological, health and economic data utilized in the current analysis. Col 1, abbreviated variable name; col 2, variable description; col 3, variable domain; col 4, year for which variable data was obtained; col 5, web-link to the source of the variable data
| Variable | Description | Domain | Data_Year | Source |
|---|---|---|---|---|
| D_Age_15_64y_2018 | Total population between the ages 15 to 64 as a percentage of the total population in 2018. Population is based on the de facto definition of population, which counts all residents regardless of legal status or citizenship. | Demographic | 2018 | |
| D_Pop_over65_2018 | Population ages 65 and above as a percentage of the total population in 2018. Population is based on the de facto definition of population, which counts all residents regardless of legal status or citizenship. | Demographic | 2018 | |
| D_Popden2018 | Population density in 2018. Calculated from total population and land area | Demographic | 2018 | Calculated from total population and land area |
| E_Employ_agri_%totemp_2018 | Employment in agriculture, male (% total male employment), based on International Labour Organization (ILO) estimate | Economic | 2018 | |
| E_Employ_ind_%totemp_2018 | Employment in industry, male (% total male employment), based on International Labour Organization (ILO) estimate | Economic | 2018 | |
| E_Employ_serv_%totemp_2018 | Employment in services, male (% total male employment), based on International Labour Organization (ILO) estimate | Economic | 2018 | |
| E_Total_visitors2018 | Number of total registered visitors in 2018. International inbound tourists (overnight visitors) are the number of tourists who travel to a country other than that in which they have their usual residence, but outside their usual environment, for a period not exceeding 12 months and whose main purpose in visiting is other than an activity remunerated from within the country visited. | Economic | 2018 | |
| E_Urban_pct2018 | Percentage of urban living population. Urban population refers to people living in urban areas as defined by national statistical offices. The data are collected and smoothed by United Nations Population Division. | Economic | 2018 | |
| G_Land_area_sqkm | Land area in square kilometers. Land area is a country’s total area, excluding area under inland water bodies, national claims to continental shelf, and exclusive economic zones. In most cases the definition of inland water bodies includes major rivers and lakes. | Meteorological | 2016 | |
| G_Rain_mm_Apr2016 | Average rainfall in millimeters in April 2016 | Meteorological | 2016 | |
| G_Rain_mm_Dec2016 | Average rainfall in millimeters in December 2016 | Meteorological | 2016 | |
| G_Rain_mm_Feb2016 | Average rainfall in millimeters in February 2016 | Meteorological | 2016 | |
| G_Rain_mm_Jan2016 | Average rainfall in millimeters in January 2016 | Meteorological | 2016 | |
| G_Rain_mm_Mar2016 | Average rainfall in millimeters in March 2016 | Meteorological | 2016 | |
| G_Temp_C_Apr2016 | Average temperature in degrees Celsius in April 2016 | Meteorological | 2016 | |
| G_Temp_C_Feb2016 | Average temperature in degrees Celsius in February 2016 | Meteorological | 2016 | |
| G_Temp_C_Jan2016 | Average temperature in degrees Celsius in January 2016 | Meteorological | 2016 | |
| G_Temp_C_Mar2016 | Average temperature in degrees Celsius in March 2016 | Meteorological | 2016 | |
| H_covid_duration_9Apr | Calculated number of days between the first day of reported COVID-19 case(s) in a country and April 9, 2020 | Health | 2020 | calculated |
| H_DALY_CVD_70yrs | One Disability Adjusted Life Year (DALY) is the equivalent of losing one year in good health because of either | Health | 2016 | |
| H_DALY_CVD_all | One Disability Adjusted Life Year (DALY) is the equivalent of losing one year in good health because of either | Health | 2016 | |
| H_DALY_resp_70yrs | One Disability Adjusted Life Year (DALY) is the equivalent of losing one year in good health because of either | Health | 2016 | |
| H_DALY_resp_all | One Disability Adjusted Life Year (DALY) is the equivalent of losing one year in good health because of either | Health | 2016 | |
| H_Diabetes2019 | Diabetes prevalence refers to the percentage of people ages 20–79 who have type 1 or type 2 diabetes. | Health | 2019 | |
| H_Total_COVID_cases | Number of total reported COVID-19 cases per day per country | Health | Daily | |
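
Two entries in the table above are derived rather than downloaded: population density (total population divided by land area) and the COVID-19 duration (days between a country's first reported case and April 9, 2020). A minimal Python sketch of both calculations follows. The function names and example inputs are hypothetical, and whether the first-case day itself is counted is an assumption the source does not settle.

```python
from datetime import date

# Reference date taken from the variable description (H_covid_duration_9Apr).
REFERENCE_DATE = date(2020, 4, 9)

def population_density(total_population, land_area_sqkm):
    """People per square kilometre of land area."""
    return total_population / land_area_sqkm

def covid_duration_days(first_case_date, reference_date=REFERENCE_DATE):
    """Days elapsed from the first reported case to the reference date.

    Assumption: the first-case day is not counted (plain date difference).
    """
    return (reference_date - first_case_date).days

# Hypothetical inputs, for illustration only (not values from the study).
density = population_density(5_000_000, 250_000)   # 20.0 people per sq km
duration = covid_duration_days(date(2020, 3, 1))   # 39 days
```

Keeping the reference date as a parameter makes it straightforward to recompute the duration for a later cut-off, such as the May 11, 2020 date mentioned in the Fig. 3 legend.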
Fig. 1 Analysis of the time-course of increase in COVID-19 total cases by country, using different growth-curve models. For each plot, the actual numbers of COVID-19 cases are shown as open circles and the fitted curve is shown in red. The y-axis shows the ratio of daily total cases to the maximum total cases recorded in the time interval studied (0–1 scale), and the x-axis shows the time-course as dates. The best growth-curve model for each country was determined by minimization of the Akaike information criterion (AIC). Two exemplar countries for each model type are shown, with model names listed at the top. Countries are indicated by their ISO codes
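The scaling and model-selection steps described in this legend can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the two candidate curves (logistic and exponential), the coarse grid-search fit, and the Gaussian least-squares form of the AIC are all choices made here for the example, and the input series is synthetic.

```python
import math

def scale_to_unit(cumulative_cases):
    """Express each daily total as a proportion of the maximum total (0-1 scale)."""
    peak = max(cumulative_cases)
    return [c / peak for c in cumulative_cases]

def rss(observed, fitted):
    """Residual sum of squares between observed and fitted values."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))

def aic(rss_value, n, k):
    """AIC for a least-squares fit with Gaussian errors, up to an additive constant."""
    return n * math.log(max(rss_value, 1e-12) / n) + 2 * k

def fit_logistic(y):
    """Grid-search a 2-parameter logistic curve; returns (best RSS, parameter count)."""
    days = range(len(y))
    best = min(
        rss(y, [1 / (1 + math.exp(-rate * (t - mid))) for t in days])
        for mid in days
        for rate in (0.1, 0.2, 0.3, 0.5, 0.8)
    )
    return best, 2

def fit_exponential(y):
    """Grid-search a 1-parameter exponential curve that reaches 1 on the last day."""
    last = len(y) - 1
    best = min(
        rss(y, [math.exp(rate * (t - last)) for t in range(len(y))])
        for rate in (0.05, 0.1, 0.2, 0.3, 0.5)
    )
    return best, 1

def best_model(cumulative_cases):
    """Return the name of the minimum-AIC candidate and all AIC scores."""
    y = scale_to_unit(cumulative_cases)
    scores = {}
    for name, fit in (("logistic", fit_logistic), ("exponential", fit_exponential)):
        best_rss, n_params = fit(y)
        scores[name] = aic(best_rss, len(y), n_params)
    return min(scores, key=scores.get), scores

# Synthetic S-shaped series, for illustration only.
cases = [1000 / (1 + math.exp(-0.5 * (t - 10))) for t in range(21)]
chosen, scores = best_model(cases)
```

A real analysis would replace the grid search with a proper nonlinear least-squares fit and would likely consider more than two curve families; the AIC comparison step stays the same.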
Results from bivariate regression analysis of demographic, meteorological, health and economic determinants of COVID-19 incidence tested across 3 time points. Columns 1 and 2 show the variable names and an associated brief description, respectively. Columns 3–5, 6–8, and 9–11 show the coefficients (with standard errors in parentheses), adjusted r-squared values and p-values, respectively, from the regression fits of each variable to total COVID-19 cases in April, May and June 2020
| Variable | Description | Coefficient (SE), Apr 2020 | Coefficient (SE), May 2020 | Coefficient (SE), Jun 2020 | Adjusted r-squared, Apr 2020 | Adjusted r-squared, May 2020 | Adjusted r-squared, Jun 2020 | p-value, Apr 2020 | p-value, May 2020 | p-value, Jun 2020 |
|---|---|---|---|---|---|---|---|---|---|---|
| D_Age_15_64y_2018 | % total population aged 15–64 yrs. in 2018 | 0.094 (0.010) | 0.076 (0.009) | 0.064 (0.009) | 0.325 | 0.278 | 0.236 | 8.99E-17 | 3.51E-14 | 4.69E-12 |
| D_Pop_over65_2018 | % total population over 65 yrs. in 2018 | 0.116 (0.009) | 0.085 (0.009) | 0.059 (0.009) | 0.496 | 0.349 | 0.201 | 6.47E-28 | 3.47E-18 | 2.74E-10 |
| D_Popden2018 | Population density in 2018 | 0.380 (0.124) | 0.368 (0.109) | 0.255 (0.100) | 0.043 | 0.053 | 0.029 | 0.002542 | 0.000919 | 0.011711 |
| E_Employ2018_agri_pct_tot_emp | % total male employment in agriculture in 2018 | −0.037 (0.002) | −0.029 (0.002) | −0.023 (0.002) | 0.613 | 0.486 | 0.360 | 4.55E-37 | 1.39E-26 | 1.99E-18 |
| E_Employ2018_ind_pct_tot_emp | % total male employment in industry in 2018 | 0.064 (0.009) | 0.052 (0.008) | 0.041 (0.007) | 0.234 | 0.195 | 0.149 | 1.09E-11 | 7.69E-10 | 1.02E-07 |
| E_Employ2018_serv_pct_tot_emp | % total male employment in service in 2018 | 0.044 (0.003) | 0.035 (0.003) | 0.027 (0.003) | 0.566 | 0.442 | 0.324 | 6.86E-33 | 1.53E-23 | 2.2E-16 |
| E_Urban_pct2018 | % total population in urban areas in 2018 | 0.029 (0.003) | 0.024 (0.002) | 0.021 (0.002) | 0.372 | 0.355 | 0.343 | 3.66E-21 | 4.73E-20 | 2.74E-19 |
| G_Land_area_sqkm | land area in square kilometers | −0.358 (0.049) | −0.255 (0.044) | −0.166 (0.041) | 0.207 | 0.139 | 0.071 | 6.12E-12 | 2.68E-08 | 7.8E-05 |
| G_Rain_mm_Apr2016 | avg. rainfall in mm in April 2016 | 0.128 (0.161) | 0.036 (0.142) | −0.075 (0.128) | −0.002 | −0.005 | −0.004 | 0.427862 | 0.801857 | 0.554773 |
| G_Rain_mm_Feb2016 | avg. rainfall in mm in Feb 2016 | 0.608 (0.120) | 0.372 (0.109) | 0.225 (0.100) | 0.125 | 0.057 | 0.023 | 9.73E-07 | 0.000823 | 0.025551 |
| G_Rain_mm_Jan2016 | avg. rainfall in mm in Jan 2016 | 0.519 (0.120) | 0.227 (0.110) | 0.047 (0.100) | 0.092 | 0.019 | −0.005 | 2.56E-05 | 0.039809 | 0.640364 |
| G_Rain_mm_Mar2016 | avg. rainfall in mm in Mar 2016 | 0.343 (0.155) | 0.143 (0.137) | 0.008 (0.124) | 0.022 | 0.000 | −0.006 | 0.027904 | 0.300263 | 0.950179 |
| G_Temp_C_Apr2016 | avg. temp in °C in April 2016 | −0.064 (0.007) | −0.048 (0.006) | −0.038 (0.006) | 0.315 | 0.235 | 0.174 | 1.73E-16 | 3.71E-12 | 3.58E-09 |
| G_Temp_C_Feb2016 | avg. temp in °C in Feb 2016 | −0.047 (0.006) | −0.036 (0.005) | −0.029 (0.005) | 0.263 | 0.211 | 0.163 | 1.25E-13 | 6.03E-11 | 1.31E-08 |
| G_Temp_C_Jan2016 | avg. temp in °C in Jan 2016 | −0.040 (0.005) | −0.031 (0.005) | −0.025 (0.004) | 0.236 | 0.195 | 0.150 | 3.48E-12 | 3.79E-10 | 4.98E-08 |
| G_Temp_C_Mar2016 | avg. temp in °C in Mar 2016 | −0.054 (0.006) | −0.041 (0.005) | −0.033 (0.005) | 0.316 | 0.242 | 0.181 | 1.58E-16 | 1.55E-12 | 1.76E-09 |
| H_covid_duration | duration (days) of COVID-19 from first reported date | 1.707 (0.367) | 2.574 (0.591) | 2.964 (0.765) | 0.093 | 0.082 | 0.065 | 6E-06 | 2.13E-05 | 0.000145 |
| H_DALY_CVD_70yrs | DALY estimates by cardiovascular disease for > 70 yrs | −0.004 (0.101) | 0.051 (0.090) | 0.071 (0.082) | −0.006 | −0.004 | −0.001 | 0.966794 | 0.570639 | 0.38638 |
| H_DALY_CVD_all | DALY estimates by cardiovascular disease for all ages | −0.183 (0.102) | −0.078 (0.091) | −0.018 (0.083) | 0.013 | −0.002 | −0.006 | 0.073734 | 0.389982 | 0.833299 |
| H_DALY_resp_70yrs | DALY estimates by respiratory disease for > 70 yrs | −0.014 (0.103) | 0.032 (0.091) | 0.049 (0.083) | −0.006 | −0.005 | −0.004 | 0.890447 | 0.727317 | 0.554885 |
| H_DALY_resp_all | DALY estimates by respiratory disease for all ages | −0.241 (0.101) | −0.134 (0.091) | −0.068 (0.083) | 0.027 | 0.007 | −0.002 | 0.018512 | 0.139592 | 0.411349 |
| H_Diabetes2019 | % population between 20 and 79 yrs. with Type 1/Type 2 diabetes in 2019 | 0.028 (0.019) | 0.016 (0.016) | 0.013 (0.015) | 0.007 | 0.000 | −0.001 | 0.135637 | 0.330341 | 0.364124 |
| E_TotVisitor2018 permillion | Total registered visitors per million population in 2018 | 0.911 (0.074) | 0.693 (0.073) | 0.527 (0.075) | 0.489 | 0.363 | 0.237 | 1.42E-24 | 4.13E-17 | 6.06E-11 |
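
Each row of the table above summarizes a one-predictor regression. As a rough sketch of how the coefficient, its standard error and the adjusted r-squared relate to one another, the following Python function fits an ordinary least-squares line by hand. The function name and the toy data are hypothetical; the response is assumed here to be log10 cases per million (as in the Fig. 2 legend), and the p-value is omitted because it needs a t-distribution that the Python standard library does not provide.

```python
import math

def bivariate_fit(x, y):
    """Ordinary least-squares fit of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - rss / tss
    return {
        "slope": slope,
        "intercept": intercept,
        # Standard error of the slope, using n - 2 residual degrees of freedom.
        "std_error": math.sqrt(rss / (n - 2) / sxx),
        # Adjusted r-squared for one predictor.
        "adj_r2": 1 - (1 - r2) * (n - 1) / (n - 2),
    }

# Toy data, for illustration only (not values from the study).
fit = bivariate_fit([0, 1, 2, 3], [0, 1, 1, 2])
```

The adjustment can push the statistic below zero when a predictor explains almost nothing, which is why several rainfall and DALY rows report small negative adjusted r-squared values.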
Fig. 2 Association of selected variables with total COVID-19 cases in May 2020. Each plot shows the change in total COVID-19 cases per million population (expressed in log10 units) on the y-axis and the relevant variables on the x-axis. The line of best fit is shown along with its equation, the coefficient of determination (R²) and the associated significance of the regression model. Some selected countries with very high or very low COVID-19 cases are annotated by their ISO codes
Fig. 3 Multivariable regression analysis of variables associated with COVID-19 cases. a Results from all-subsets regression analysis to identify the best sub-model with a smaller list of variables, based on minimization of the AIC. Selected variables are highlighted in red (in addition to the intercept). The x-axis refers to the model size (number of variables in each sub-model), and the y-axis lists all the variables tested as follows: Temp_Jan(C,2016), temperatures in January in degrees Celsius in 2016; Temp_Feb(C, 2016), temperatures in February in 2016; Temp_Mar(C, 2016), temperatures in March, 2016; Temp_Apr(C, 2016), temperatures in April, 2016; Urban%(2018), percentage of urban living population in 2018; Emp_service_%total(2018), percentage of total male employment in service sector in 2018; Emp_agri_%total(2018), percentage of total male employment in agriculture in 2018; Emp_ind_%total(2018), percentage of total male employment in industry in 2018; log-COVID_duration(May2020), duration (in days) between May 11, 2020 and the first reported COVID-19 case in a country (log10 scale); log_popdens(2018), population density in 2018 (log10 scale); log_Rain_Feb(mm,2018), rain in millimetres in February 2018 (log10 scale); Age_15_64yrs(2018), population between the ages 15 to 64 as percentage of the total population in 2018; >65yrs_%total(2018), population aged 65 and above as percentage of the total population in 2018; Land_area(sqkm), land area in square kilometres. b Change in AIC scores as a function of the number of variables included in the model. c Variance inflation factor (VIF) test of multicollinearity among the 5 variables in the sub-model identified from all-subsets regression analysis. The x-axis refers to the VIF scores and the y-axis refers to the selected variables
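The two procedures named in this legend, all-subsets selection by AIC and the VIF check, can be sketched in plain Python as below. This is a toy illustration, not the authors' code: the predictor names `temp`, `agri` and `other`, the data values, and the Gaussian least-squares form of the AIC are assumptions made for the example.

```python
import itertools
import math

def ols_rss(X, y):
    """Least-squares fit with an intercept; returns the residual sum of squares."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))  # partial pivoting
        M[col], M[piv] = M[piv], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def best_subset(data, y, names):
    """Fit every non-empty subset of predictors and return (min AIC, chosen names)."""
    n = len(y)
    best = None
    for size in range(1, len(names) + 1):
        for combo in itertools.combinations(range(len(names)), size):
            X = [[row[j] for j in combo] for row in data]
            rss = max(ols_rss(X, y), 1e-12)
            aic = n * math.log(rss / n) + 2 * (size + 1)  # +1 for the intercept
            if best is None or aic < best[0]:
                best = (aic, [names[j] for j in combo])
    return best

def vif(data, j):
    """Variance inflation factor: 1 / (1 - R^2) from regressing column j on the rest."""
    target = [row[j] for row in data]
    others = [[v for k, v in enumerate(row) if k != j] for row in data]
    mean = sum(target) / len(target)
    tss = sum((t - mean) ** 2 for t in target)
    return 1 / (ols_rss(others, target) / tss)

# Toy data, for illustration only: the response depends on temp and agri, not on other.
names = ["temp", "agri", "other"]
temp = [1, 2, 3, 4, 5, 6, 7, 8]
agri = [5, 1, 7, 3, 8, 2, 6, 4]
other = [1, 0, 0, 1, 1, 0, 1, 0]
noise = [0.01, -0.01, 0.01, -0.01, -0.01, 0.01, -0.01, 0.01]
data = [list(row) for row in zip(temp, agri, other)]
y = [3 - 0.5 * t + 0.2 * a + e for t, a, e in zip(temp, agri, noise)]

aic_value, chosen = best_subset(data, y, names)
temp_vif = vif(data, 0)
```

Exhaustive search fits 2^p − 1 models for p candidate predictors, which is manageable for the 14 variables listed in the legend but grows quickly beyond that.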
Statistical summary of multivariable regression analysis. Statistical estimates of top variables identified after subset selection and minimization of the AIC are reported. Col 1, variable name; col 2, regression estimate for variable; col 3, standard error of estimate; col 4, t-statistic for regression; col 5, significance of variable association (p-value); col 6, variable association significance codes
| Variable | Estimate | Std. Error | t value | Pr(>\|t\|) | Significance |
|---|---|---|---|---|---|
| (Intercept) | 1.25828 | 0.665479 | 1.891 | 0.06097 | . |
| G_Temp_C_Mar2016 | −0.02075 | 0.004482 | −4.63 | 9.04E-06 | *** |
| E_Urban_pct2018 | 0.009165 | 0.002912 | 3.148 | 0.00206 | ** |
| E_Employ2018_agri_pct_tot_emp | −0.021051 | 0.003676 | −5.726 | 7.22E-08 | *** |
| log_D_Popden2018 | 0.175727 | 0.087111 | 2.017 | 0.04581 | * |
| D_Age_15_64y_2018 | 0.014129 | 0.009234 | 1.53 | 0.12854 | |
Significance codes: ‘***’ p < 0.001; ‘**’ p < 0.01; ‘*’ p < 0.05; ‘.’ p < 0.1
Residual standard error: 0.4992 on 125 degrees of freedom
Multiple R-squared: 0.6928; Adjusted R-squared: 0.6805
F-statistic: 56.39 on 5 and 125 degrees of freedom, p-value < 2.20E-16
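The summary statistics above are internally consistent, which can be checked with the standard formulas for adjusted R-squared and the overall F-statistic. The sketch below is such a check, with hypothetical function names. The observation count of 131 is not stated in the source; it is inferred here from 125 residual degrees of freedom plus 5 predictors and an intercept.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R-squared from multiple R-squared."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def f_statistic(r2, n_obs, n_predictors):
    """Overall F-statistic of the regression from multiple R-squared."""
    df_resid = n_obs - n_predictors - 1
    return (r2 / n_predictors) / ((1 - r2) / df_resid)

# Reported values: multiple R-squared 0.6928, 5 predictors, 125 residual df.
n_predictors = 5
n_obs = 125 + n_predictors + 1  # inferred, not stated in the source
adj = adjusted_r_squared(0.6928, n_obs, n_predictors)
f_value = f_statistic(0.6928, n_obs, n_predictors)
```

Both results agree with the reported 0.6805 and 56.39 to within the rounding of the published R-squared.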
Fig. 4 Characterization of new COVID-19 cases at the beginning and close of lockdowns. Countries were characterized on a 5-point heuristic based on new COVID-19 cases prior to, during, at the end of, and 5 days and 14 days post lockdown, and subjected to hierarchical clustering. The dendrogram and associated heatmap show six major clusters (indicated by the dashed blue line on the dendrogram). Time-courses of new COVID-19 cases are shown for an exemplar country from each cluster, with the lockdown start and end days indicated by the two blue vertical bars in each plot. The heatmap is color-coded by the assigned values of the five-point criteria (−1 = skyblue, 0 = ivory, 1 = coral)
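To make the clustering step concrete, the sketch below groups five-value profiles coded as −1, 0 or 1 by agglomerative hierarchical clustering. Everything specific in it is an assumption for illustration: the source does not state the distance metric or linkage rule (Euclidean distance and average linkage are used here), the labels `AAA` to `EEE` are placeholders rather than real ISO codes, and the profile values are invented.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerate(profiles, n_clusters):
    """Average-linkage agglomerative clustering of labelled profiles.

    Repeatedly merges the two closest clusters until n_clusters remain.
    """
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        dists = [euclidean(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)

# Placeholder labels and invented five-point profiles, for illustration only.
profiles = {
    "AAA": [-1, -1, -1, -1, -1],
    "BBB": [-1, -1, -1, -1, 0],
    "CCC": [1, 1, 1, 1, 1],
    "DDD": [1, 1, 1, 0, 1],
    "EEE": [0, 0, 0, 0, 0],
}
groups = agglomerate(profiles, 3)
```

Stopping at a fixed number of clusters corresponds to cutting the dendrogram at a chosen height, as the dashed line in the figure does for six clusters.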