C Bosco, V Alegana, T Bird, C Pezzulo, L Bengtsson, A Sorichetta, J Steele, G Hornby, C Ruktanonchai, N Ruktanonchai, E Wetter, A J Tatem.
Abstract
Improved understanding of geographical variation and inequity in health status, wealth and access to resources within countries is increasingly being recognized as central to meeting development goals. Development and health indicators assessed at national or subnational scale can often conceal important inequities, with the rural poor often least well represented. The ability to target limited resources is fundamental, especially in an international context where funding for health and development comes under pressure. This has recently prompted the exploration of the potential of spatial interpolation methods based on geolocated clusters from national household survey data for the high-resolution mapping of features such as population age structures, vaccination coverage and access to sanitation. It remains unclear, however, how predictable these different factors are across different settings, variables and between demographic groups. Here we test the accuracy of spatial interpolation methods in producing gender-disaggregated high-resolution maps of the rates of literacy, stunting and the use of modern contraceptive methods from a combination of geolocated demographic and health surveys cluster data and geospatial covariates. Bayesian geostatistical and machine learning modelling methods were tested across four low-income countries and varying gridded environmental and socio-economic covariate datasets to build 1×1 km spatial resolution maps with uncertainty estimates. Results show the potential of the approach in producing high-resolution maps of key gender-disaggregated socio-economic indicators, with explained variance through cross-validation being as high as 74-75% for female literacy in Nigeria and Kenya, and in the 50-70% range for many other variables. 
However, substantial variations by both country and variable were seen, with many variables showing poor mapping accuracies in the range of 2-30% explained variance using both geostatistical and machine learning approaches. The analyses offer a robust basis for the construction of timely maps with levels of detail that support geographically stratified decision-making and the monitoring of progress towards development goals. However, the great variability in results between countries and variables highlights the challenges in applying these interpolation methods universally across multiple countries, and the importance of validation and quantifying uncertainty if this is undertaken.
Keywords: development indicators; geo-statistics; geographic information system; mapping
Year: 2017 PMID: 28381641 PMCID: PMC5414904 DOI: 10.1098/rsif.2016.0825
Source DB: PubMed Journal: J R Soc Interface ISSN: 1742-5662 Impact factor: 4.118
List of geospatial covariates assembled for mapping literacy, stunting and the use of modern contraception methods.
| category | covariates | data source |
|---|---|---|
| travel time | accessibility | European Commission Joint Research Centre |
| distances | distance to settlements, roads, rivers, conflicts, schools and health facilities | input data from the WorldPop Project |
| climate | temperature, precipitation, aridity index, potential evapotranspiration | MODIS |
| satellite indices | MODIS EVI, mid-infrared index, nightlights | MODIS, NOAA VIIRS |
| demographic | population, births, pregnancies, ethnicity | WorldPop Project, ETH Zurich |
| topography | elevation | US Geological Survey (USGS) |
| environment | protected areas, percentage of urban areas | WDPA |
| livestock densities | small ruminant, cattle, goats, pigs, poultry, sheep | FAO in collaboration with the Environmental Research Group Oxford (ERGO) |
| economic | gross cell product | Yale GEcon Research Project |
| land/agriculture | land cover, rainfed crop suitability | NASA/USGS |
Comparison of five models, covering logistic regression (LR), artificial neural network (ANN) and Bayesian geostatistical (BGS) approaches, for estimating the proportion of literate ever-married women in Bangladesh, based on validation statistics (mean square error (MSE) and explained variance).
| model | MSE (validation) | explained variance (validation) |
|---|---|---|
| INLA | 0.025 | 0.18 |
| INLA (SPDE) | 0.023 | 0.24 |
| LR | 0.025 | 0.19 |
| ANN (R) | 0.023 | 0.24 |
| ANN (Octave) | 0.022 | 0.27 |
Modelling results related to different gender-disaggregated development indicators in Nigeria. RMSE, MAE, explained variance, MSE and MSE of a trivial model (mean) were calculated.
| country | modelled parameter | modelling technique | MSE | RMSE | MAE | exp. var. | MSE (mean) |
|---|---|---|---|---|---|---|---|
| Nigeria | female literacy | INLA | 0.03 | 0.18 | 0.133 | 0.74 | 0.12 |
| Nigeria | male literacy | INLA | 0.04 | 0.20 | 0.145 | 0.57 | 0.096 |
| Nigeria | female stunting | INLA | 0.020 | 0.143 | 0.112 | 0.61 | 0.052 |
| Nigeria | male stunting | INLA | 0.021 | 0.146 | 0.117 | 0.60 | 0.053 |
| Nigeria | modern contraceptive methods | INLA | 0.005 | 0.073 | 0.057 | 0.58 | 0.012 |
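The explained-variance column is consistent with comparing each model's validation MSE against the MSE of the trivial mean model. The following is a minimal Python sketch of how such validation statistics might be computed. It assumes that explained variance is defined as 1 − MSE/MSE(mean), which the source does not state explicitly; the function name `validation_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def validation_metrics(observed, predicted):
    """Return MSE, RMSE, MAE, the MSE of a mean-only model, and explained variance."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Trivial model: predict the mean of the observed values for every cluster.
    mean_obs = sum(observed) / n
    mse_mean = sum((o - mean_obs) ** 2 for o in observed) / n
    # Assumed definition: share of the trivial model's error removed by the model.
    exp_var = 1.0 - mse / mse_mean if mse_mean > 0 else float("nan")
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mse_mean": mse_mean,
        "exp_var": exp_var,
    }


# Hypothetical cluster-level proportions (illustrative only, not study data).
observed = [0.10, 0.40, 0.80, 0.60, 0.30]
predicted = [0.20, 0.35, 0.70, 0.65, 0.25]
metrics = validation_metrics(observed, predicted)
```

Under this assumed definition, the rounded values in the female literacy row above give 1 − 0.03/0.12 = 0.75, close to the reported 0.74.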
Figure 1. (a) The distribution of cluster-level data from the DHS household survey in Nigeria showing the proportion of women aged 15–49 who were classified as literate. (b,c) Map of the mean predicted proportion of literacy in Nigeria for women aged 15–49 at 1 km² resolution (b) and related uncertainty map (c) showing its standard deviation. (d) Scatter plot of the estimated proportions of female literacy in Nigeria (y-axis) by observed data (x-axis) for the training (i) and validation (ii) subsets of data. (Online version in colour.)
Figure 2. (a) The distribution of cluster-level data from the DHS household survey in Nigeria showing the proportion of male children under age 5 who were classified as stunted. (b,c) Map of the mean predicted proportion of stunting at 1 km² resolution (b) and related uncertainty map (c) showing its interdecile range. (d) Scatter plot of the predicted proportion of stunted male children (y-axis) by observed data (x-axis) for the training (i) and validation (ii) subsets of data. (Online version in colour.)
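Panels (c) of Figures 1 and 2 summarise prediction uncertainty with a standard deviation and an interdecile range, respectively. The sketch below shows one way such per-cell summaries could be derived. It assumes that a set of posterior draws is available for each grid cell and takes the interdecile range to be the difference between the 90th and 10th percentiles; the function `summarize_pixel` and the draws are hypothetical and do not reproduce the authors' workflow.

```python
import statistics


def summarize_pixel(samples):
    """Summarise hypothetical posterior draws for a single grid cell."""
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)
    # Nine cut points split the draws into ten groups; the first and last
    # cut points are the 10th and 90th percentiles.
    deciles = statistics.quantiles(samples, n=10, method="inclusive")
    return {
        "mean": mean,
        "sd": sd,
        "interdecile_range": deciles[8] - deciles[0],
    }


# Hypothetical draws of a predicted proportion for one 1 km cell.
draws = [0.30, 0.32, 0.35, 0.36, 0.38, 0.40, 0.41, 0.43, 0.45, 0.50, 0.52]
summary = summarize_pixel(draws)
```

The interdecile range ignores the most extreme 10% of draws in each tail, so it is less sensitive to outlying draws than the standard deviation.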
Modelling results related to different gender-disaggregated development indicators in Kenya. RMSE, MAE, explained variance, MSE and MSE of a trivial model (mean) were calculated.
| country | modelled parameter | modelling technique | MSE | RMSE | MAE | exp. var. | MSE (mean) |
|---|---|---|---|---|---|---|---|
| Kenya | female literacy | INLA | 0.016 | 0.127 | 0.09 | 0.75 | 0.065 |
| Kenya | male literacy | INLA | 0.021 | 0.144 | 0.10 | 0.32 | 0.030 |
| Kenya | female stunting | INLA | 0.054 | 0.23 | 0.186 | 0.04 | 0.056 |
| Kenya | female stunting | ANN (Octave) | 0.054 | 0.23 | 0.186 | 0.04 | 0.056 |
| Kenya | male stunting | INLA | 0.062 | 0.25 | 0.20 | 0.02 | 0.0628 |
Figure 3. (a) The distribution of cluster-level data from the DHS household survey in Kenya showing the proportion of women aged 15–49 who were classified as literate. (b,c) Map of the mean predicted proportion of female literacy at 1 km² resolution (b) and related uncertainty map (c) showing its standard deviation. (d) Scatter plot of the predicted proportion of female literacy (y-axis) by observed data (x-axis) for the training (i) and validation (ii) subsets of data. (Online version in colour.)
Comparison of different modelling results related to gender-disaggregated development indicators in Tanzania. RMSE, MAE, explained variance, MSE and MSE of a trivial model (mean) were calculated.
| country | modelled parameter | modelling technique | MSE | RMSE | MAE | exp. var. | MSE (mean) |
|---|---|---|---|---|---|---|---|
| Tanzania | female literacy | INLA | 0.023 | 0.15 | 0.1159 | 0.42 | 0.040 |
| Tanzania | male literacy | INLA | 0.045 | 0.21 | 0.16 | 0.08 | 0.050 |
| Tanzania | male literacy | ANN (R) | 0.044 | 0.21 | 0.15 | 0.10 | 0.050 |
| Tanzania | modern contraceptive methods | INLA | 0.015 | 0.12 | 0.096 | 0.35 | 0.024 |
| Tanzania | modern contraceptive methods | ANN (R) | 0.0157 | 0.125 | 0.10 | 0.33 | 0.024 |
Figure 4. (a) The distribution of cluster-level data from the DHS household survey in Tanzania showing the proportion of women aged 15–49 using modern contraceptive methods. (b,c) Map of the mean predicted proportion of women using modern contraceptive methods at 1 km² resolution (b) and related uncertainty map (c) showing its standard deviation. (d) Scatter plot of the predicted proportion of women using modern contraceptive methods (y-axis) by observed data (x-axis) for the training (i) and validation (ii) subsets of data. (Online version in colour.)
Comparison of neural networks and Bayesian models for different gender-disaggregated development indicators in Bangladesh. RMSE, MAE, explained variance, MSE and MSE of a trivial model (mean) were calculated.
| country | modelled parameter | modelling technique | MSE | RMSE | MAE | exp. var. | MSE (mean) |
|---|---|---|---|---|---|---|---|
| Bangladesh | female literacy | ANN (Octave) | 0.022 | 0.15 | 0.12 | 0.27 | 0.032 |
| Bangladesh | female literacy | INLA | 0.024 | 0.15 | 0.12 | 0.24 | 0.032 |
| Bangladesh | male literacy | INLA | 0.056 | 0.24 | 0.19 | 0.11 | 0.064 |
| Bangladesh | female stunting | ANN (R) | 0.061 | 0.25 | 0.2 | 0.04 | 0.064 |
| Bangladesh | female stunting | INLA | 0.060 | 0.246 | 0.20 | 0.04 | 0.064 |
| Bangladesh | male stunting | INLA | 0.048 | 0.22 | 0.17 | 0.02 | 0.049 |
Figure 5. (a) The distribution of cluster-level data from the DHS household survey in Bangladesh showing the proportion of women aged 15–49 who were classified as literate. (b,c) Map of the mean predicted proportion of female literacy at 1 km² resolution (b) and related uncertainty map (c) showing its standard deviation. (d) Scatter plot of the predicted proportion of female literacy (y-axis) by observed data (x-axis) for the training (i) and validation (ii) subsets of data. (Online version in colour.)