
Differences between gridded population data impact measures of geographic access to healthcare in sub-Saharan Africa.

Fleur Hierink^1,2, Gianluca Boo^3,4, Peter M Macharia^5,6, Paul O Ouma^5, Pablo Timoner^1,2, Marc Levy^7, Kevin Tschirhart^7, Stefan Leyk^8, Nicholas Oliphant^9, Andrew J Tatem^3, Nicolas Ray^1,2.

Abstract

Background: Access to healthcare is imperative to health equity and well-being. Geographic access to healthcare can be modeled using spatial datasets on local context, together with the distribution of existing health facilities and populations. Several population datasets are currently available, but their impact on accessibility analyses is unknown. In this study, we model the geographic accessibility of public health facilities at 100-meter resolution in sub-Saharan Africa and evaluate six of the most popular gridded population datasets for their impact on coverage statistics at different administrative levels.
Methods: Travel time to nearest health facilities was calculated by overlaying health facility coordinates on top of a friction raster accounting for roads, landcover, and physical barriers. We then intersected six different gridded population datasets with our travel time estimates to determine accessibility coverages within various travel time thresholds (i.e., 30, 60, 90, 120, 150, and 180-min).
Results: Here we show that differences in accessibility coverage can exceed 70% at the sub-national level, based on a one-hour travel time threshold. The differences are most notable in large and sparsely populated administrative units and dramatically shape patterns of healthcare accessibility at national and sub-national levels.
Conclusions: The results of this study show how valuable and critical a comparative analysis between population datasets is for the derivation of coverage statistics that inform local policies and monitor global targets. Large differences exist between the datasets and the results underscore an essential source of uncertainty in accessibility analyses that should be systematically assessed.

Keywords: Health services; Public health

Year: 2022 PMID: 36124060 PMCID: PMC9481590 DOI: 10.1038/s43856-022-00179-4

Source DB: PubMed Journal: Commun Med (Lond) ISSN: 2730-664X

Introduction

Geographic access to healthcare is essential to ensure universal health coverage, a key target of the United Nations Sustainable Development Goals (SDGs)[1]. While geographic access is only one of many factors, such as affordability, availability, and acceptability[2-4], that impact access to healthcare, it is fundamental to the organization of a health system as it determines the spatial reach of health services in relation to the population[5,6]. Modeling geographic access to healthcare is necessary to identify gaps in health system coverage and to support targeted health system optimization and planning, such as placement of new facilities, deployment of community health workers, or mobile outreach[7,8]. The key components of a geographic accessibility analysis are the population needing access, the locations of health facilities, and data to help model connectivity and travel time (i.e., road networks, land cover, streams, elevation, and care-seeking specificities)[5,9]. Although data on each of these components is increasingly available, accurate, and current[10], there are persistent differences between regions, hampering accessibility analyses in data-poor regions[11]. Global advancements in population modeling have enabled the research community to use several gridded population datasets[12-17] in combination with recent data on health facility location[18], opening new avenues for modeling geographic accessibility to healthcare in data-poor settings. It is not known to what extent the use of different population data in accessibility analyses affects accessibility coverage (i.e., the proportion of the population that can access a health facility within a given travel time threshold) and thus the monitoring of indicators that underpin policy-making at the global, national, and subnational level.
This study aims to shed light on the magnitude and variation of these effects and possible policy implications, by conducting the first comprehensive comparison of six of the most commonly used global gridded population datasets in a geographic accessibility model at 100-meter resolution for sub-Saharan Africa. Gridded population datasets allocate population counts across rows and columns of grid cells either by using simple techniques to uniformly redistribute census data or by using ancillary variables derived from Earth observations (e.g., land cover, elevation, and night lights) or socio-economic data to apply dasymetric modeling techniques that provide more refined population estimates[19]. These datasets typically use a country’s most recent census or projected estimates, summarized in available administrative units or census enumeration areas, to disaggregate population numbers at a finer spatial and temporal resolution[20-22]. Population redistribution techniques vary from dataset to dataset, meaning that the suitability of each dataset for any spatial analysis is context-dependent. Discrepancies between datasets do not necessarily reflect specific appropriateness; rather, the suitability of each gridded population dataset is highly dependent on the target scale, context and purpose, and geographic extent of the analysis[19]. However, even when two or more gridded population datasets meet some predetermined criteria, differences in accessibility coverage may be observed. Different population data have been used in accessibility analyses, exposing potential uncertainty in accessibility coverage estimates and making comparability across studies difficult. Some studies have used national censuses[23], WorldPop products[7,11,24-28], Gridded Population of the World (GPWv4)[29], High-Resolution Settlement Layer (HRSL)[30], or LandScan[31]. The scientific literature increasingly acknowledges differences between gridded population datasets[19,20].
However, the focus is often on general data characteristics and their suitability[19,20,32] or on the country- or discipline-specific implications of using the different data products[21,33-35], rather than quantifying differences in model outcomes at large geographical scales. In addition, the motivation and implications of using a particular population dataset are usually neglected in accessibility studies[35,36]. The choice of any specific population layer is likely driven by personal preferences, lack of knowledge of other sources, or ease of access and use. Here, we systematically assess differences between estimates of geographic healthcare accessibility for all of sub-Saharan Africa using the most popular gridded population data products: (1) WorldPop top–down constrained, (2) WorldPop top–down unconstrained, (3) HRSL, (4) GPWv4, (5) LandScan, and (6) Global Human Settlement Population (GHS-POP). Healthcare accessibility is modeled at 100-meter resolution using the most recent release of the geocoded health facility inventory of 50 countries in sub-Saharan Africa to enable a fair comparison between models[18]. We contrast accessibility coverage statistics derived from the six population datasets, across countries at national and subnational scale. Travel time was calculated by developing a friction layer at 100-meter resolution, representing the estimated time required to reach the nearest health facility. We intersected the six different gridded population datasets with our travel time estimates to determine accessibility coverages within various travel time thresholds (i.e., 30, 60, 90, 120, 150, and 180-min). Our accessibility coverages vary widely between the different datasets, and estimates at the sub-Saharan African level mask larger subnational variations. Differences are most pronounced in sparsely settled regions, where administrative units are large.
Datasets that distribute population over larger land areas, rather than being limited to building footprints, result in longer travel times for a portion of the population and therefore lower overall estimates of accessibility, notably changing accessibility patterns. The results provide useful clues for policy-making and critical reflection on previous estimates of accessibility to healthcare and their associated uncertainties.

Methods

In order to quantify and compare the differences in healthcare coverage between the six different datasets, we took several steps to prepare, process, and analyze the spatial data.

Accessibility model

Accessibility to healthcare was modeled in terms of travel time to the nearest public health facility. This calculation was made by overlaying health facility coordinates on top of a friction raster. Each grid cell in the friction raster represented a unique land cover class which was assigned a travel speed. On-road travel represented motorized speeds whereas for off-road travel walking speeds were used. The cumulative time required to traverse all cells to the nearest health facility was then calculated for each grid cell, which yields the travel time raster. This calculation used an eight-directional least-cost path algorithm[9,37] and was isotropic, meaning that no corrections were made for slopes. Although anisotropic analyses make the model results more realistic, we preferred an isotropic analysis to minimize model complexity and assumptions, in the absence of local transport information. Slope corrections are usually applied to the speeds of pedestrians and cyclists. It is therefore important to have local information on modes of transport, and this is likely to vary from country to country and from region to region. The friction raster represents information about potential impacts on a patient’s journey to healthcare, including land cover type, barriers to movement, and the road network. All this information was extracted from open data sources and processed between January 2021 and October 2021; however, the reference dates of some of the data go back as far as 2015, as indicated in Table 1 and Supplementary Table 2. We fully automated the entire workflow in an R and Python environment (Supplementary Fig. 1). In brief, road networks, rivers, and lakes were extracted from OpenStreetMap (OSM) using the osmextract[38] library in R[39] (version 4.0.4). The land cover for sub-Saharan Africa was downloaded at 100-meter resolution from Copernicus[40].
Health facility coordinates were extracted from a geocoded database for sub-Saharan Africa[18]. Administrative boundaries for all African countries were taken from the database of Global Administrative Areas (GADM)[41].
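The eight-directional, isotropic least-cost calculation described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `travel_time_raster`, the choice of Dijkstra's algorithm, the friction unit (minutes per meter), and the averaging of friction between neighboring cells are all assumptions made for illustration.

```python
import heapq
import math

def travel_time_raster(friction, facilities, cell_size_m=100.0):
    """Cumulative travel time (minutes) from every cell to its nearest facility.

    friction   -- 2D list of minutes per meter; None marks a barrier cell
    facilities -- list of (row, col) cells containing a health facility
    Hypothetical helper: Dijkstra over an 8-connected grid, no slope correction.
    """
    rows, cols = len(friction), len(friction[0])
    time = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r, c in facilities:
        if friction[r][c] is not None:
            time[r][c] = 0.0
            heapq.heappush(heap, (0.0, r, c))
    # Eight directions: four orthogonal and four diagonal neighbors
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]
    while heap:
        t, r, c = heapq.heappop(heap)
        if t > time[r][c]:
            continue  # stale queue entry
        for dr, dc in neighbours:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if friction[nr][nc] is None:
                continue  # barriers cannot be entered
            # Diagonal steps are longer by a factor of sqrt(2)
            dist = cell_size_m * (math.sqrt(2) if dr and dc else 1.0)
            # Assumption: cost of a step is the mean friction of both cells
            step = dist * (friction[r][c] + friction[nr][nc]) / 2.0
            if t + step < time[nr][nc]:
                time[nr][nc] = t + step
                heapq.heappush(heap, (t + step, nr, nc))
    return time
```

Cells that cannot be reached keep an infinite travel time. The sketch does not prevent diagonal moves from cutting between two barrier cells that touch only at a corner, which a production tool would need to handle.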

Table 1

Overview of spatial data sources used in the study.

Dataset	Producer	Resolution	Year	Citation
Landcover	Copernicus	∼100 meters	2019	[40]
Roads	OpenStreetMap	Vectorized	2021	[38,71]
Waterbodies (lines and polygons)	OpenStreetMap	Vectorized	2021	[38,71]
Health facilities	Maina et al. (2019)	Vectorized	2018	[18]
Travel scenario	Adapted from Weiss et al. (2020)	–	–	[36]
Administrative boundaries	Global Administrative Areas (GADM)	Vectorized	2020	[41]
Mean administrative unit area for publicly available population census data	Center for International Earth Science Information Network - CIESIN	∼1 kilometer	2018	[53]

For an overview of the different gridded population data products please see Supplementary Table 2.

Data preparation was done on a per-country basis and optimized to minimize computation time as detailed in Supplementary Fig. 1, implying that land cover data was first downloaded for the entire African continent and then processed for each country separately. In summary, and as shown in Supplementary Fig. 1, data processing included cropping to the bounding box of each country to minimize computation time in the masking step. Then rasters were clipped to exact country borders. Lastly, the land cover raster was projected in the country’s coordinate system (Supplementary Table 1). The process was parallelized using the doParallel[42] and foreach[43] R libraries. All necessary data processing steps were done using the terra package[44]. Scripts for data processing and analysis can be sourced from Github [https://github.com/fleurhierink/Population_Access] and Zenodo [10.5281/zenodo.7004009][45]. Vector data representing road networks and barriers to movement were fetched using the osmextract[38] library in R[39] (version 4.0.4) and projected in the country’s coordinate system (Supplementary Table 1). All road classes that are officially classified by OSM were included for analysis[46]. Barriers to movement (unless a road crosses over) included hydrographic lines classified as river and hydrographic polygons. Streams and smaller waterbodies were excluded from the analysis since they can be traversed with ease[46]. The geocoded inventory of public health facilities in sub-Saharan Africa[18] assembled between 2012 and 2018 was downloaded and projected to match the spatial coordinate system of the other datasets by country (Supplementary Table 1). We included all health facilities irrespective of type (e.g., primary, secondary, health centers, etc.).
Finally, all data were combined in a friction raster at 100-meter resolution. This resolution offered the best compromise between computational efficiency, spatial detail to address fine-scale disparities in healthcare access, and consistency with the assembled spatial data described above. The vector data were rasterized at 100-meter resolution. All raster cells were aligned, and layers merged to create one comprehensive land cover raster, to which travel scenarios (Supplementary Data 1) were applied. The travel scenarios for all sub-Saharan African countries were taken from Weiss et al. (2020)[36], but adapted to the context of this paper (Supplementary Data 1). When a travel scenario from Weiss et al.[36] did not indicate a speed for a specific road class in a given country, we used the African average travel speed for that road class (Supplementary Data 1). We did not use an existing travel time surface, such as the one available from Weiss et al.[36], because its coarser resolution (i.e., 1 km × 1 km) did not match the resolution of most of the input- and population data, nor our objective to capture barriers with higher spatial accuracy. In addition, the assumptions made by Weiss et al.[36] about travel speeds and barriers to movement, such as the traversability of waterbodies at a speed of 1 km/h and the use of global average speeds for road classes for which no information on speed limits was available, did not fit well in the context of sub-Saharan Africa. Most importantly, the travel time surfaces modeled in this study were used as an indicator to assess the impact of using different gridded population products, rather than to inform the research community on coverage statistics. 
To inform policy- and decision-making, it is preferable to work at a finer spatial scale to ensure greater accuracy and robustness in the model inputs by consulting local experts on health facility data and on the health-seeking behavior of the target population, so that the travel scenarios can be best adapted to the local context.
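As a rough illustration of how a merged land cover raster and a travel scenario combine into a friction raster, the sketch below assigns each cell a traversal cost in minutes per meter. The function names, the class labels, and the speeds in the usage example are hypothetical and are not taken from Supplementary Data 1; the only rule borrowed from the text is that a barrier blocks movement unless a road crosses it.

```python
def minutes_per_meter(speed_kmh):
    """Convert a travel speed in km/h to a traversal cost in minutes per meter."""
    return 60.0 / (speed_kmh * 1000.0)

def build_friction(landcover, roads, barriers, landcover_speed, road_speed):
    """Combine aligned layers into one friction raster (hypothetical helper).

    landcover       -- 2D list of land cover class labels
    roads           -- 2D list of road class labels, None where there is no road
    barriers        -- 2D list of booleans, True on rivers and waterbody polygons
    landcover_speed -- dict: land cover class -> walking speed (km/h)
    road_speed      -- dict: road class -> motorized speed (km/h)
    Returns minutes per meter per cell; None marks an impassable cell.
    """
    rows, cols = len(landcover), len(landcover[0])
    friction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if roads[r][c] is not None:
                # A road takes priority, including where it crosses a barrier
                friction[r][c] = minutes_per_meter(road_speed[roads[r][c]])
            elif barriers[r][c]:
                friction[r][c] = None
            else:
                friction[r][c] = minutes_per_meter(landcover_speed[landcover[r][c]])
    return friction
```

For example, with an assumed walking speed of 5 km/h on grassland, a 100-meter cell costs 100 × 0.012 = 1.2 minutes to cross.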

Data processing of population grids

In Supplementary Table 2, the properties of the different gridded population datasets are described. All population rasters were clipped to country borders and reprojected to each country’s projection system (Supplementary Table 1). Population that was lost from the original files due to these data processing steps was equally smoothed out over the rasters so that total population counts remained the same as in the original files. This was done by comparing the summed population at administrative level 2 for the original and projected rasters. Due to the different resolutions of the datasets, and to avoid resampling of population raster data, all population grids were transformed into spatial points representing the centroids of the grid cells.
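The two steps in this paragraph, converting grid cells to centroid points and restoring the original population totals per administrative level 2 unit, might look like the sketch below. The text does not specify how the lost population was redistributed; proportional rescaling within each unit is an assumption here, and the function names and data layout are hypothetical.

```python
from collections import defaultdict

def grid_to_centroids(pop_grid, admin_grid, x_min, y_max, res):
    """Turn populated grid cells into centroid points (hypothetical helper).

    pop_grid   -- 2D list of population counts, row 0 at the top (north)
    admin_grid -- 2D list of administrative level 2 identifiers, same shape
    x_min, y_max -- coordinates of the upper-left corner of the grid
    res        -- cell size in the units of the coordinate system
    """
    points = []
    for r, row in enumerate(pop_grid):
        for c, pop in enumerate(row):
            if pop > 0:
                points.append({
                    "x": x_min + (c + 0.5) * res,
                    "y": y_max - (r + 0.5) * res,
                    "pop": pop,
                    "admin2": admin_grid[r][c],
                })
    return points

def rescale_to_original_totals(points, original_totals):
    """Scale point populations so each unit sums to its original total.

    Assumption: the adjustment is proportional within each unit.
    """
    projected = defaultdict(float)
    for p in points:
        projected[p["admin2"]] += p["pop"]
    rescaled = []
    for p in points:
        total = projected[p["admin2"]]
        factor = original_totals[p["admin2"]] / total if total > 0 else 1.0
        rescaled.append({**p, "pop": p["pop"] * factor})
    return rescaled
```

Working with centroid points rather than resampled rasters lets population datasets of different native resolutions be overlaid on the same 100-meter travel time raster without altering the population values themselves.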

Extraction of accessibility coverage statistics

To assess the spatial variation in national and subnational accessibility coverage statistics, we overlaid the six gridded population datasets onto the travel time rasters for each country. We extracted the travel time and the administrative boundary (level 1 and 2) for each population point feature. We then calculated the accessibility coverage statistics, by means of zonal statistics, to output the population able to reach the nearest health facility within a certain travel time. Both relative and absolute coverage statistics were obtained per administrative unit. Population falling on barriers (i.e., waterbodies or just outside country borders) was not included in the extraction of coverage statistics. The absolute and relative numbers of people falling on barriers are indicated in Supplementary Data 2.
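A minimal sketch of this zonal summary is given below, assuming each population point already carries its extracted travel time and administrative unit. The function name and the dictionary layout are hypothetical. Points on barriers are represented with a travel time of `None` and skipped; using the remaining population as the denominator of the relative coverage is an assumption, since the text does not state which denominator was used.

```python
def coverage_by_unit(points, thresholds=(30, 60, 90, 120, 150, 180)):
    """Absolute and relative coverage per administrative unit.

    points -- iterable of dicts with keys:
              "admin"   administrative unit identifier
              "pop"     population at the point
              "minutes" travel time to the nearest facility, None on barriers
    Returns {unit: {threshold: (population covered, percent covered)}}.
    """
    totals = {}
    covered = {}
    for p in points:
        if p["minutes"] is None:
            continue  # population on barriers is excluded
        unit = p["admin"]
        totals[unit] = totals.get(unit, 0.0) + p["pop"]
        unit_cov = covered.setdefault(unit, {t: 0.0 for t in thresholds})
        for t in thresholds:
            if p["minutes"] <= t:
                unit_cov[t] += p["pop"]
    result = {}
    for unit, total in totals.items():
        result[unit] = {
            t: (covered[unit][t],
                100.0 * covered[unit][t] / total if total > 0 else 0.0)
            for t in thresholds
        }
    return result
```

Points with an infinite travel time (unreachable cells) count toward the unit total but never toward the covered population.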

Limitations of method

We note that our travel time grid, which captures the accessibility of the nearest health facility, served as the main input data for deriving the coverage statistics presented. However, we recognize that realistic estimates of geographic access to healthcare require local knowledge of health-seeking behavior, such as travel modes and speed, as well as information on (seasonal) barriers to mobility. Although we have used local expert knowledge to build accessibility models in previous studies[28,30,47], the scale and context of the present analysis did not allow us to use such local knowledge. Such detailed input was beyond the scope of this study, which aims to reflect important differences between population datasets. Therefore, our travel time maps and associated accessibility estimates should not be used for health system planning at national and subnational levels. However, our methodology can be adapted to local contexts, drawing on the expertise of different stakeholders at national and subnational levels, particularly in relation to transport modes and speeds. A limitation of the current study is that the unconstrained datasets included a proportionally higher number of people living in areas considered to be barriers (i.e., waterbodies or areas outside national borders). In addition, modeling geographic accessibility presents challenges other than differences between gridded population datasets. For example, uncertainties in travel modes and speeds can lead to under- or overestimation of accessibility. If travel speeds are assumed to be higher than they actually are, the accessibility model results will incorrectly indicate a higher accessibility coverage. This also applies to uncertainties in road network data, when some roads may be missing or when mapped roads are in reality dirt tracks that cannot be traveled by motorized vehicles.
Realistic modeling of access to healthcare is therefore highly dependent on reliable and locally agreed model inputs. One nascent area is the use of Google Maps APIs to characterize travel time, which has been shown to produce estimates close to real travel times in urban areas. The approach potentially accounts for traffic, weather conditions, differences in speed, road conditions, and other predisposing factors. However, the approach is still at an early stage of development and is more applicable in urban areas where data collection through voluntary geographic research is better than in remote and rural areas where the majority of people live. Therefore, the use of least-cost path algorithms remains feasible but requires improved parameterization[48].

Table 2

Summary coverage statistics for sub-Saharan Africa.

	Total population	30 min		60 min		90 min		120 min		150 min		180 min
	Total population	nr. covered	% covered	nr. covered	% covered	nr. covered	% covered	nr. covered	% covered	nr. covered	% covered	nr. covered	% covered
HRSL	837,427,969	738,362,867	88.2	789,665,384	94.3	808,260,473	96.5	817,316,977	97.6	822,468,235	98.2	825,661,368	98.6
GHS-POP	1,007,629,498	879,872,628	87.3	926,126,071	91.9	946,476,919	93.9	958,041,182	95.1	964,843,199	95.8	969,695,172	96.2
GPWv4	1,142,381,994	691,184,991	60.5	864,838,798	75.7	945,653,297	82.8	991,147,422	86.8	1,019,755,993	89.3	1,039,511,968	91.0
LandScan	1,114,787,854	900,416,765	80.8	998,239,544	89.5	1,035,462,222	92.9	1,054,119,811	94.6	1,064,963,010	95.5	1,071,921,785	96.2
WorldPop constrained	1,135,848,278	916,456,304	80.7	1,015,980,775	89.4	1,057,068,975	93.1	1,078,609,302	95.0	1,091,225,178	96.1	1,099,582,164	96.8
WorldPop unconstrained	1,135,121,848	808,494,660	71.2	951,961,369	83.9	1,011,228,872	89.1	1,042,316,719	91.8	1,060,764,621	93.4	1,072,860,699	94.5

Absolute and relative accessibility coverage as visually presented in Fig. 1 for the six different gridded population data sets: HRSL, GHS-POP, GPWv4, LandScan, WorldPop top–down constrained, WorldPop top–down unconstrained. Total population is lower for HRSL because Ethiopia, Somalia, Sudan, and South Sudan are not included in the dataset released in 2018.
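The relative coverage in Table 2 is the covered population divided by the total population of the same dataset. As a small worked check, the sketch below recomputes the 60-minute column from the absolute numbers in the table and derives the spread between datasets; the variable and function names are illustrative only.

```python
# (total population, population covered within 60 min), copied from Table 2
table2_60min = {
    "HRSL": (837_427_969, 789_665_384),
    "GHS-POP": (1_007_629_498, 926_126_071),
    "GPWv4": (1_142_381_994, 864_838_798),
    "LandScan": (1_114_787_854, 998_239_544),
    "WorldPop constrained": (1_135_848_278, 1_015_980_775),
    "WorldPop unconstrained": (1_135_121_848, 951_961_369),
}

def percent_covered(total, covered):
    """Relative coverage in percent."""
    return 100.0 * covered / total

percentages = {name: percent_covered(total, covered)
               for name, (total, covered) in table2_60min.items()}

# Difference between the highest and lowest dataset, in percentage points
spread = max(percentages.values()) - min(percentages.values())
```

At the scale of sub-Saharan Africa, the 60-minute estimates range from 75.7% (GPWv4) to 94.3% (HRSL), a spread of roughly 18.6 percentage points. The HRSL total excludes four countries, as noted above, so its percentage is not computed over the same set of countries as the others.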
