Mohamed Lamine Sidibé, Roland Yonaba, Fowé Tazen, Héla Karoui, Ousmane Koanda, Babacar Lèye, Harinaivo Anderson Andrianisa, Harouna Karambiri.
Abstract
The COVID-19 pandemic, which broke out in Wuhan (China) in December 2019, severely hit almost all sectors of activity worldwide as a consequence of the restrictive measures imposed. Two years later, Africa still emerges as the continent least affected by the pandemic. This study analyzed COVID-19 prevalence across African countries through country-level variables prior to clustering. Using Spearman rank correlation, multicollinearity analysis and univariate filtering, 9 country-level variables were identified from an initial set of 34 variables. These variables relate to socioeconomic status, population structure, healthcare system and environment, and climatic setting. The 54 African countries were then clustered using the agglomerative hierarchical clustering (AHC) method, which generated 3 distinct clusters. Cluster 1 (11 countries) is the most affected by COVID-19 (median of 63,508.6 confirmed cases and 946.5 deaths per million) and is composed of countries with the highest socioeconomic status. Cluster 2 (27 countries) is the least affected (median of 4473.7 confirmed cases and 81.2 deaths per million), and mainly features countries with the lowest socioeconomic indicators and international exposure. Cluster 3 (16 countries) is intermediate in terms of COVID-19 prevalence (median of 2569.3 confirmed cases and 35.7 deaths per million) and features the least urbanized countries, geographically close to the equator, with intermediate international exposure and socioeconomic features. These findings shed light on the main features of COVID-19 prevalence in Africa and might help refine strategies for managing and coping with the ongoing pandemic.
Supplementary Information: The online version contains supplementary material available at 10.1007/s10668-022-02646-3.
Keywords: Africa; COVID-19; Cluster analysis; Hierarchical clustering; Pandemic; Transmission factors
Year: 2022 PMID: 36061268 PMCID: PMC9424840 DOI: 10.1007/s10668-022-02646-3
Source DB: PubMed Journal: Environ Dev Sustain ISSN: 1387-585X Impact factor: 4.080
Fig. 1 Flowchart of the methodology used in this study
Country-level variables selected for this study
| Category | Variable | Description | Source |
|---|---|---|---|
| COVID-19 prevalence | conf_pm | Cumulative confirmed cases (as of 08/31/21) | Dong et al. |
| | death_pm | Cumulative confirmed deaths (as of 08/31/21) | Dong et al. |
| International exposure and socioeconomic status | arriv | International tourism, number of arrivals (thousands) | WorldBank |
| | hdi | Human development index (HDI) | UNDP |
| | gini | Gini index (metric for inequalities) | WorldBank |
| | gdp_cap | Gross domestic product (GDP) per capita ($US) | WorldBank |
| | alphab | Literacy rate (%) | WorldBank |
| Population structure | dens_pop | Population density (people/km²) | WorldBank |
| | urb_pop | Urban population percentage (%) | WorldBank |
| | median_age | Median age of the population (years) | WorldBank |
| | life_exp | Life expectancy (years) | WorldBank |
| | p65yrs | Percentage of people aged over 65 years (%) | WorldBank |
| Healthcare system and environment | lack_hygien | Mortality rate due to lack of hygiene, unsafe water and sanitation (per 100,000 people) | WorldBank |
| | hous_fossf | Mortality rate due to air pollution from the use of household solid fuels (per 100,000 people) | Yale |
| | med_1000 | Number of physicians (per 1000 people) | WorldBank |
| | pm25 | Annual mean concentration of particulate matter of less than 2.5 microns in diameter (PM2.5) in urban areas (µg/m³) | WorldBank |
| | health_exp | Current health expenditure per capita ($US) | WorldBank |
| | epi | Environmental performance index | Yale |
| | immuniz_dtp1 | Immunization coverage / DTP1 (%) | WHO |
| | immuniz_bcg | Immunization coverage / BCG (%) | WHO |
| Diseases prevalence and risk factors | prev_diab | Diabetes prevalence (number of people) | IHME |
| | prev_cvlds | Cardiovascular diseases prevalence (number of people concerned) | IHME |
| | prev_ch.resp | Chronic respiratory diseases prevalence (number of people concerned) | IHME |
| | prev_malaria | Malaria prevalence (number of people concerned) | IHME |
| | prev_nutdef | Malnutrition and nutritional deficiencies prevalence (number of people concerned) | IHME |
| | prev_respdtub | Respiratory infections and tuberculosis prevalence (number of people concerned) | IHME |
| | alcohol_cons | Total alcohol consumption per capita (liters) | WorldBank |
| Climatic setting | lat_abs | Absolute latitude (°) | Gelaro et al. |
| | ws2m_avg | Average daily wind speed (m/s) | Gelaro et al. |
| | rh2m_avg | Average daily relative humidity (%) | Gelaro et al. |
| | tmax_avg | Average daily maximum temperature (°C) | Gelaro et al. |
| | tmin_avg | Average daily minimum temperature (°C) | Gelaro et al. |
| | tmoy_avg | Average daily temperature (°C) | Gelaro et al. |
| | insol_avg | Average daily insolation (MJ/m²/day) | Gelaro et al. |
| | tdew_avg | Average dew point temperature (°C) | Gelaro et al. |
| | ah_avg | Absolute air humidity (%) | Gelaro et al. |
Fig. 2 COVID-19 prevalence evolution in Africa. a Cumulative confirmed cases and deaths. b Daily confirmed cases and deaths
Fig. 3 Choropleth map showing the spatial and temporal spread of COVID-19 cumulative cases and deaths in Africa over the period January 2020 to March 2022. a–d Cumulative cases per million people. e–h Cumulative deaths per million people
Fig. 4 Spearman’s rho correlation coefficient between country-level variables and their association with COVID-19 prevalence in African countries. Blank values show nonsignificant correlation coefficients (at the α = 5% level)
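Spearman's rho, as reported in Fig. 4, is the Pearson correlation computed on ranks rather than on raw values. The following is a minimal pure-Python sketch of that calculation; the function names and the toy values are illustrative assumptions, not the authors' code or data, and the significance test used to blank out cells is not reproduced here.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values for five countries (not data from the study).
lack_hygien = [12.0, 45.0, 30.0, 80.0, 5.0]
conf_pm = [9000.0, 1500.0, 3000.0, 400.0, 20000.0]
rho = spearman_rho(lack_hygien, conf_pm)
print(round(rho, 3))
```

Because only ranks matter, any monotone transformation of a variable (for example a logarithm) leaves rho unchanged, which makes the measure convenient for skewed country-level indicators.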
VIF values for all variables
| N° | Variable | conf_pm | death_pm | N° | Variable | conf_pm | death_pm |
|---|---|---|---|---|---|---|---|
| 1 | alcohol_cons | 3.11 | 3.11 | 16 | prev_ch.resp | 9.52 | 9.52 |
| 2 | dens_pop | 3.36 | 3.36 | 17 | insol_avg | 10.57 | 10.57 |
| 3 | urb_pop | 3.59 | 3.59 | 18 | life_exp | 11.48 | 11.48 |
| 4 | pm25 | 3.66 | 3.66 | 19 | p65yrs | 14.63 | 14.63 |
| 5 | gini | 4.99 | 4.99 | 20 | hous_fossf | 17.20 | 17.20 |
| 6 | arriv | 6.10 | 6.10 | 21 | hdi | 18.74 | 18.74 |
| 7 | med_1000 | 7.48 | 7.48 | 22 | gdp_cap | 32.48 | 32.48 |
| 8 | prev_malaria | 7.51 | 7.51 | 23 | health_exp | 32.54 | 32.54 |
| 9 | ws2m_avg | 7.75 | 7.75 | 24 | median_age | 35.61 | 35.61 |
| 10 | epi | 7.83 | 7.83 | 25 | rh2m_avg | 140.79 | 140.79 |
| 11 | alphab | 7.96 | 7.96 | 26 | ah_avg | 180.74 | 180.74 |
| 12 | immuniz_bcg | 8.46 | 8.46 | 27 | tmax_avg | 258.91 | 258.91 |
| 13 | lack_hygien | 8.82 | 8.82 | 28 | tmin_avg | 575.63 | 575.63 |
| 14 | immuniz_dtp1 | 8.92 | 8.92 | 29 | tmoy_avg | 1349.89 | 1349.89 |
| 15 | lat_abs | 9.08 | 9.08 | | | | |
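The variance inflation factor (VIF) of predictor j is 1 / (1 − R_j²), where R_j² is obtained by regressing predictor j on all other predictors. Equivalently, it is the j-th diagonal entry of the inverse of the predictors' correlation matrix. The sketch below uses that second form in pure Python; the helper names and toy columns are hypothetical, and the cutoff applied by the authors is not stated in this excerpt.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = sum(d * d for d in dev) ** 0.5
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfectly collinear variables")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical toy predictors (not data from the study); their correlation is 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif_values = vif([x1, x2])
print([round(v, 2) for v in vif_values])
```

Since the VIF depends only on the predictors and not on the response, the same value is obtained whichever outcome is studied, which is consistent with the identical conf_pm and death_pm columns in the table.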
Optimal features explaining variability in COVID-19 prevalence across African countries
| N° | Variable | conf_pm (ρ) | Variable | death_pm (ρ) | N° | Variable | conf_pm (ρ) | Variable | death_pm (ρ) |
|---|---|---|---|---|---|---|---|---|---|
| Period: January 1, 2020 to September 30, 2020 | | | | | Period: March 31, 2021 to September 30, 2021 | | | | |
| 1 | lack_hygien | −0.69 *** | lack_hygien | −0.62 *** | 1 | lack_hygien | −0.74 *** | lack_hygien | −0.73 *** |
| 2 | med_1000 | 0.62 *** | med_1000 | 0.56 *** | 2 | alphab | 0.65 *** | med_1000 | 0.63 *** |
| 3 | urb_pop | 0.58 *** | urb_pop | 0.54 *** | 3 | med_1000 | 0.64 *** | alphab | 0.55 *** |
| 4 | alphab | 0.51 *** | alphab | 0.41 ** | 4 | epi | 0.49 *** | epi | 0.49 *** |
| 5 | epi | 0.38 ** | lat_abs | 0.37 ** | 5 | pm25 | −0.44 ** | lat_abs | 0.49 *** |
| 6 | lat_abs | 0.30 * | epi | 0.33 * | 6 | urb_pop | 0.43 ** | pm25 | −0.41 ** |
| 7 | arriv | 0.18 | arriv | 0.16 | 7 | arriv | 0.35 * | arriv | 0.40 ** |
| 8 | gini | 0.15 | gini | 0.02 | 8 | lat_abs | 0.35 * | urb_pop | 0.38 ** |
| | | | | | 9 | gini | 0.32 * | gini | 0.19 |
| Period: September 30, 2020 to March 31, 2021 | | | | | Period: September 30, 2021 to March 31, 2022 | | | | |
| 1 | lack_hygien | −0.73 *** | lack_hygien | −0.68 *** | 1 | lack_hygien | −0.74 *** | lack_hygien | −0.76 *** |
| 2 | med_1000 | 0.68 *** | med_1000 | 0.63 *** | 2 | alphab | 0.67 *** | med_1000 | 0.66 *** |
| 3 | alphab | 0.62 *** | lat_abs | 0.53 *** | 3 | med_1000 | 0.66 *** | alphab | 0.58 *** |
| 4 | epi | 0.51 *** | alphab | 0.49 *** | 4 | epi | 0.49 *** | epi | 0.53 *** |
| 5 | urb_pop | 0.46 *** | epi | 0.47 *** | 5 | pm25 | −0.45 ** | lat_abs | 0.51 *** |
| 6 | lat_abs | 0.40 ** | urb_pop | 0.42 ** | 6 | urb_pop | 0.42 ** | pm25 | −0.41 ** |
| 7 | pm25 | −0.39 ** | pm25 | −0.35 * | 7 | arriv | 0.37 ** | urb_pop | 0.39 ** |
| 8 | arriv | 0.33 * | arriv | 0.34 * | 8 | lat_abs | 0.37 ** | arriv | 0.39 ** |
| 9 | gini | 0.25 | gini | 0.14 | 9 | gini | 0.31 * | gini | 0.16 |
ρ indicates Spearman’s rank correlation coefficient. ‘***’ indicates significance at the 0.001 level. ‘**’ indicates significance at the 0.01 level. ‘*’ indicates significance at the 0.05 level. Variables are ranked in order of decreasing importance
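The ordering in the table is consistent with ranking variables by the absolute value of their correlation coefficient, with stars assigned from p-value thresholds. A small sketch of both conventions follows. The function names are hypothetical, the handling of values falling exactly on a threshold is an assumption, and the coefficients are copied from the conf_pm column for the first period.

```python
def significance_stars(p_value):
    """Map a p-value to the star notation used in the table footnote."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


def rank_by_strength(rho_by_variable):
    """Order variable names by decreasing absolute correlation."""
    return sorted(rho_by_variable,
                  key=lambda name: abs(rho_by_variable[name]),
                  reverse=True)


# Coefficients for conf_pm, January 1 to September 30, 2020 (from the table).
first_period = {
    "gini": 0.15, "urb_pop": 0.58, "lack_hygien": -0.69, "epi": 0.38,
    "med_1000": 0.62, "arriv": 0.18, "alphab": 0.51, "lat_abs": 0.30,
}
ranking = rank_by_strength(first_period)
print(ranking)
```

Ranking on the absolute value is what places lack_hygien first despite its negative sign: the sign gives the direction of the association, the magnitude its strength.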
Fig. 5 Dendrogram of observations based on AHC using the optimal subset of 9 variables
Fig. 6 Map of the 3 clusters identified in this study
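Agglomerative hierarchical clustering starts with every observation in its own cluster and repeatedly merges the two closest clusters; cutting the resulting tree at a chosen level yields the final groups. The sketch below is a naive pure-Python version under stated assumptions: Euclidean distance on z-scored variables and average linkage, neither of which is specified in this excerpt. The toy points are hypothetical.

```python
import math


def standardize(rows):
    """Z-score each column so variables on different scales weigh equally.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Six hypothetical observations described by two variables (not study data).
toy = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9], [10.0, 0.0], [10.1, 0.3]]
groups = agglomerative(standardize(toy), n_clusters=3)
print(groups)
```

Recording the linkage distance at each merge is what a dendrogram such as Fig. 5 displays; this sketch only returns the partition at the requested number of clusters.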
Statistical description of the 3 clusters of countries
| Variable | lack_hygien | | | alphab | | | med_1000 | | | epi | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster | 1 | 2 | 3 | 1 | 2 | 3 | 1 | 2 | 3 | 1 | 2 | 3 |
| Min | 0.2 | 11.4 | 4.1 | 71.2 | 34.5 | 22.3 | 0.2 | 0.0 | 0.0 | 34.7 | 26.5 | 22.6 |
| Q1 | 0.8 | 26.1 | 37.8 | 80.2 | 61.4 | 39.8 | 0.5 | 0.1 | 0.1 | 41.4 | 30.2 | 26.6 |
| Q2 | 1.9 | 38.4 | 45.9 | 86.7 | 76.5 | 46.4 | 1.2 | 0.1 | 0.1 | 43.3 | 33.8 | 29.3 |
| Avg | 7.9 | 39.8 | 50.5 | 84.6 | 70.6 | 47.6 | 1.2 | 0.2 | 0.2 | 43.9 | 33.2 | 29.0 |
| Q3 | 12.8 | 49.8 | 69.1 | 89.2 | 79.7 | 52.3 | 1.9 | 0.2 | 0.1 | 45.0 | 36.0 | 30.7 |
| Max | 34.9 | 86.6 | 101.0 | 95.9 | 94.4 | 86.8 | 2.5 | 0.7 | 0.8 | 58.2 | 45.8 | 38.3 |
‘Min’ is the minimum value, ‘Q1’ the first quartile, ‘Q2’ the second quartile, i.e., the median, ‘Avg’ is the average, ‘Q3’ is the third quartile, ‘Max’ is the maximum value
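The six summary statistics defined above can be computed per cluster with Python's standard `statistics` module. This is a minimal sketch: the quartile interpolation rule (`method="inclusive"`) is an assumption, since the source does not say which definition was used, and the toy values and labels are hypothetical.

```python
import statistics


def describe(values):
    """Min, quartiles, mean and max of one variable within one cluster."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Min": min(values),
        "Q1": q1,
        "Q2": q2,  # the median
        "Avg": statistics.fmean(values),
        "Q3": q3,
        "Max": max(values),
    }


def describe_by_cluster(values, labels):
    """Group values by cluster label, then summarize each group."""
    grouped = {}
    for value, label in zip(values, labels):
        grouped.setdefault(label, []).append(value)
    return {label: describe(vals) for label, vals in sorted(grouped.items())}


# Hypothetical values of one variable and their cluster labels (not study data).
toy_values = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0]
toy_labels = [1, 1, 1, 1, 1, 2, 2, 2]
summary = describe_by_cluster(toy_values, toy_labels)
for label, stats in summary.items():
    print(label, stats)
```

Different quartile conventions can shift Q1 and Q3 slightly for small clusters, so small discrepancies against published values would not necessarily indicate an error.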
Fig. 7 Box-plot comparison of COVID-19 prevalence across clusters. a Cumulative cases per million (conf_pm). b Cumulative deaths per million (death_pm). c Case fatality rate (%), calculated as the cumulative number of deaths out of the cumulative number of confirmed cases. d Mortality (per million people), calculated as the cumulative number of deaths out of population estimates (WorldBank, 2021). The vertical axis was transformed to log10 scale for easier visual cross-comparison of clusters
Fig. 8 Cluster comparison by variables. The vertical axis was transformed to log10 scale to enable visual cross-comparison across clusters
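The two derived indicators in panels c and d follow directly from their definitions in the caption. A short sketch, with hypothetical function names and made-up counts used only to show the arithmetic:

```python
def case_fatality_rate(cumulative_deaths, cumulative_cases):
    """Case fatality rate in percent: deaths out of confirmed cases."""
    return 100.0 * cumulative_deaths / cumulative_cases


def per_million(count, population):
    """Express a cumulative count per million inhabitants."""
    return 1_000_000 * count / population


# Hypothetical country (not data from the study).
cases, deaths, population = 25_000, 500, 10_000_000
cfr = case_fatality_rate(deaths, cases)
mortality_pm = per_million(deaths, population)
cases_pm = per_million(cases, population)
print(cfr, mortality_pm, cases_pm)
```

The case fatality rate depends on how many infections are detected, so a country with limited testing can show a high rate alongside a low mortality per million.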
Fig. 9 Scatterplots of the natural logarithm (log) of COVID-19 cases and deaths per million people against absolute latitude (in degrees) for African countries. a COVID-19 cumulative confirmed cases (R² = 0.063, p value = 0.063 > 0.05). b COVID-19 cumulative deaths (R² = 0.198, p value = 0.005 < 0.05)
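An R² of the kind quoted in Fig. 9 can be obtained by fitting a straight line to the natural logarithm of the outcome against absolute latitude. The sketch below assumes ordinary least squares, which the caption does not state explicitly. The function name and toy values are hypothetical, and the p-value calculation is omitted.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = intercept + slope * x, with R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical values for five countries (not data from the study).
lat_abs = [2.0, 9.0, 15.0, 28.0, 33.0]
death_pm = [20.0, 35.0, 90.0, 400.0, 300.0]
log_death_pm = [math.log(v) for v in death_pm]
slope, intercept, r_squared = linear_fit(lat_abs, log_death_pm)
print(round(slope, 3), round(r_squared, 3))
```

On the log scale the slope reads as a relative change: each additional degree of latitude multiplies the fitted outcome by exp(slope).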