Fennis J Reed, Andrea E Gaughan, Forrest R Stevens, Greg Yetman, Alessandro Sorichetta, Andrew J Tatem.
Abstract
The spatial distribution of humans on the Earth is critical knowledge that informs many disciplines and is available in a spatially explicit manner through gridded population techniques. While many approaches exist to produce specialized gridded population maps, little has been done to explore how remotely sensed built-area datasets might be used to dasymetrically constrain these estimates. This study evaluates the effectiveness of three different high-resolution built-area datasets for producing gridded population estimates through the dasymetric disaggregation of census counts in Haiti, Malawi, Madagascar, Nepal, Rwanda, and Thailand. Modeling techniques include a binary dasymetric redistribution, a random forest with a dasymetric component, and a hybrid of the previous two. The relative merits of these approaches and the data are discussed with regard to studying human populations and related spatially explicit phenomena. Results showed that the accuracy of the random forest and hybrid models was comparable in five of the six countries.
Keywords: binary dasymetric; built areas; geographic information systems; geography; gridded population distribution; random forest; regression; remote sensing
Year: 2018 PMID: 33344538 PMCID: PMC7680951 DOI: 10.3390/data3030033
Source DB: PubMed Journal: Data (Basel) ISSN: 2306-5729
Model enumeration and brief descriptions, indicating the number of resulting maps and built area restrictions. Ordered by increasing complexity.
| Model | Name | Description | Raster Type | Output Maps |
|---|---|---|---|---|
| 1 | Binary Dasymetric | Redistribution of population into built areas. | Built Area Restricted | 24 |
| 2 | Random Forest + Dasymetric | Redistribution of population across weighted surface. | Continuous | 6 |
| 3 | Hybrid | Redistribution of population into weighted built areas. | Built Area Restricted | 24 |
Figure 3. Model enumeration and visual representation of feature overlays used to produce output datasets by means of dasymetric redistribution. Ordered by increasing complexity.
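As a rough illustration of model 1, the sketch below spreads each census unit's count evenly over the pixels flagged as built within that unit. The function name, the flat-list raster representation, and the fallback for units without any built pixels are assumptions made for this example; they are not taken from the study.

```python
from collections import Counter


def binary_dasymetric(unit_ids, built_mask, unit_pop):
    """Spread each census unit's count evenly over its built pixels.

    unit_ids   -- flat list: census unit id of each pixel
    built_mask -- flat list of 0/1 flags: 1 where the pixel is built
    unit_pop   -- dict: census unit id -> population count
    """
    pixels_per_unit = Counter(unit_ids)
    built_per_unit = Counter(u for u, b in zip(unit_ids, built_mask) if b)

    out = []
    for u, b in zip(unit_ids, built_mask):
        if built_per_unit[u] == 0:
            # Assumption: a unit with no built pixels falls back to an
            # even spread over all of its pixels so no population is lost.
            out.append(unit_pop[u] / pixels_per_unit[u])
        elif b:
            out.append(unit_pop[u] / built_per_unit[u])
        else:
            out.append(0.0)
    return out


# Unit "A" has two built pixels; unit "B" has none.
grid = binary_dasymetric(
    ["A", "A", "A", "B", "B"], [1, 0, 1, 0, 0], {"A": 100, "B": 30}
)
```

Each unit's pixels sum back to its census count, which is the defining property of a dasymetric redistribution.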
Census data for the six sampled countries and supporting data for finest available and aggregate products. Each model is built using the aggregate data, while finest available census units are reserved for accuracy assessment.
| Type | Country | ISO | Census Year (Adm. Lvl.) | Admin Units | Total Pop | ASR |
|---|---|---|---|---|---|---|
| Finest Available | Haiti | HTI | 2015 (3) | 570 | 10,911,819 | 6.9 |
| Finest Available | Madagascar | MDG | 2006 (4) | 17,459 | 20,966,899 | 5.8 |
| Finest Available | Malawi | MWI | 2008 (3) | 12,666 | 13,053,968 | 2.7 |
| Finest Available | Nepal | NPL | 2011 (4) | 36,042 | 26,246,586 | 2.0 |
| Finest Available | Rwanda | RWA | 2002 (4) | 9,192 | 9,482,511 | 1.7 |
| Finest Available | Thailand | THA | 2010 (3) | 7,416 | 64,978,504 | 8.3 |
| 2/3 Aggregate | Haiti | HTI | 2015 | 380 | 10,911,819 | 8.4 |
| 2/3 Aggregate | Madagascar | MDG | 2006 | 11,639 | 20,966,899 | 7.1 |
| 2/3 Aggregate | Malawi | MWI | 2008 | 8,444 | 13,053,968 | 3.4 |
| 2/3 Aggregate | Nepal | NPL | 2011 | 24,028 | 26,246,586 | 2.5 |
| 2/3 Aggregate | Rwanda | RWA | 2002 | 6,128 | 9,482,511 | 2.0 |
| 2/3 Aggregate | Thailand | THA | 2010 | 4,944 | 64,978,504 | 10.2 |
Figure 4. An example of the three primary model types and the rasters they produce for Kigali, Rwanda. The built area extent pictured for models 1 and 3 is the combination layer described in Section 3.1.2.
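Models 2 and 3 differ from the binary case in that each pixel receives a share of its unit's count proportional to a weight. A minimal sketch follows, assuming the weights come from some per-pixel prediction (for example a random forest density estimate) and that the hybrid model simply zeroes the weights outside built areas. All names are hypothetical, and the fallback for units whose built pixels carry no weight is an assumption of this example.

```python
from collections import defaultdict


def weighted_dasymetric(unit_ids, weights, unit_pop, built_mask=None):
    """Allocate each unit's count to its pixels in proportion to a weight.

    With built_mask=None this mimics a continuous weighted surface
    (model 2); with a 0/1 built_mask the weights are restricted to
    built pixels (a hybrid in the spirit of model 3).
    """
    if built_mask is None:
        masked = list(weights)
    else:
        masked = [w if b else 0.0 for w, b in zip(weights, built_mask)]

    raw_total = defaultdict(float)
    masked_total = defaultdict(float)
    for u, w, m in zip(unit_ids, weights, masked):
        raw_total[u] += w
        masked_total[u] += m

    out = []
    for u, w, m in zip(unit_ids, weights, masked):
        if masked_total[u] > 0:
            out.append(unit_pop[u] * m / masked_total[u])
        elif raw_total[u] > 0:
            # Assumption: if no built pixel carries weight, fall back
            # to the unmasked weights for that unit.
            out.append(unit_pop[u] * w / raw_total[u])
        else:
            raise ValueError(f"unit {u!r} has no positive weights")
    return out


ids = ["A", "A", "A"]
w = [1.0, 3.0, 6.0]
continuous = weighted_dasymetric(ids, w, {"A": 100})
hybrid = weighted_dasymetric(ids, w, {"A": 100}, built_mask=[0, 1, 1])
```

In the continuous case every pixel gets some population; in the hybrid case the first pixel is excluded and its share is reallocated to the two built pixels.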
Three primary built/human settlement datasets and supporting information. GHSL and HRSL datasets are accessible from their respective portals, while WSF is available upon request [23].
| Built Dataset | Year | Source | Nominal Resolution | Citation |
|---|---|---|---|---|
| WSF | 2015 | Landsat 8, Sentinel-1 | 10 m | [ |
| GHSL | 2014 | Landsat 8 | 38 m | [ |
| HRSL | 2015 | DigitalGlobe | 0.5 m | [ |
Covariates and data sources included in the random forest. Nominal resolutions noted with ‘as’ are given in arc-seconds.
| Covariate | Data Source | Nominal Resolution | Citation |
|---|---|---|---|
| Cultivated Terrestrial Lands | ESA CCI Land cover, 2010 | 10 arc-second | [ |
| Lights at Night | Suomi VIIRS-Derived, 2012 | 15 arc-second | [ |
| Generic Populated Places | VMAP0 merged, 1979–1999 | NA | [ |
Figure 1. Census unit aggregation procedure, in which 1/3 of the finest available units are randomly selected, independent of spatial size or any other stratification, and each is merged with the neighbor with which it shares the longest border until the target of 2/3 of the original census unit count is reached.
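The aggregation procedure in Figure 1 can be sketched on a toy adjacency structure. The version below is an approximation under stated assumptions: units are plain ids, shared borders are given as a dictionary of lengths, a random current unit is picked on each step, and units without neighbors are skipped. The function and parameter names are hypothetical and the study's actual GIS implementation may differ.

```python
import random


def aggregate_units(pop, borders, target_fraction=2 / 3, seed=0):
    """Merge random units into their longest-border neighbor.

    pop     -- dict: unit id -> population
    borders -- dict: (unit_a, unit_b) -> shared border length
    Returns a dict: tuple of merged original ids -> summed population.
    """
    rng = random.Random(seed)
    group_of = {u: u for u in pop}    # original unit -> current group label
    members = {u: [u] for u in pop}   # group label -> original units in it
    target = round(len(pop) * target_fraction)
    islands = set()                   # groups found to have no neighbors

    while len(members) > target:
        candidates = [g for g in members if g not in islands]
        if not candidates:
            break  # nothing left that can be merged
        g = rng.choice(candidates)

        # Total border length shared between g and each adjacent group.
        neighbor_len = {}
        for (a, b), length in borders.items():
            ga, gb = group_of[a], group_of[b]
            if ga == gb:
                continue
            if ga == g:
                neighbor_len[gb] = neighbor_len.get(gb, 0.0) + length
            elif gb == g:
                neighbor_len[ga] = neighbor_len.get(ga, 0.0) + length

        if not neighbor_len:
            islands.add(g)
            continue

        best = max(neighbor_len, key=neighbor_len.get)
        for u in members.pop(g):
            group_of[u] = best
            members[best].append(u)

    return {tuple(sorted(us)): sum(pop[u] for u in us)
            for us in members.values()}


pop = {"a": 10, "b": 20, "c": 30, "d": 40, "e": 50, "f": 60}
borders = {("a", "b"): 5, ("b", "c"): 2, ("c", "d"): 7,
           ("d", "e"): 1, ("e", "f"): 3}
merged = aggregate_units(pop, borders)
```

Because every merge removes exactly one unit, six input units end at four, and the total population is unchanged by construction.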
Figure 2. Workflow for generating the population distribution maps.
Error metrics for each of the 52 maps. Tables are shaded to indicate increasing methodological complexity. Values highlighted in red represent minimum error. Panels are labeled as follows: (a) Haiti, (b) Madagascar, (c) Malawi, (d) Nepal, (e) Rwanda, (f) Thailand.
| Panel | Model | Built Area | RMSE | MAE | RMSE Density | MAE Density | Panel | Model | Built Area | RMSE | MAE | RMSE Density | MAE Density |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (a) | Dasymetric Masked | HRSL | 12861.2 | 3281 | 8.1 | 1.6 | (b) | Dasymetric Masked | HRSL | 777.4 | 245.9 | 32.9 | 3.9 |
| | Random Forest + Dasymetric | | 11083.9 | 3021.8 | 7.3 | 1.5 | | Random Forest + Dasymetric | | 934.5 | 287.9 | 37.6 | 4.7 |
| | Hybrid | HRSL | 11935.6 | 3061.9 | 7.9 | 1.5 | | Hybrid | HRSL | 727.2 | 256.6 | 37.1 | 3.9 |
| | Hybrid | GHSL | 12823.1 | 4779 | 8.1 | 2 | | Hybrid | GHSL | 1130.1 | 403.3 | 33.1 | 4.8 |
| | Hybrid | WSF | 12267.5 | 4548.4 | 8.1 | 2 | | Hybrid | WSF | 897.2 | 380.4 | 33.7 | 4.3 |
| | Hybrid | COMBO | 11897.6 | 3116.8 | 7.9 | 1.5 | | Hybrid | COMBO | 782.4 | 271.4 | 39.3 | 4.2 |
| (c) | Dasymetric Masked | HRSL | 549.1 | 225.2 | 31.1 | 5 | (d) | Dasymetric Masked | HRSL | 456.3 | 176.2 | 22 | 3.7 |
| | Random Forest + Dasymetric | | 567.6 | 213.6 | 27.7 | 4.8 | | Random Forest + Dasymetric | | 412.5 | 140.8 | 21.8 | 3.4 |
| | Hybrid | HRSL | 529 | 233.7 | 30.2 | 4.9 | | Hybrid | HRSL | 452.6 | 186.7 | 22.4 | 3.9 |
| | Hybrid | GHSL | 699.1 | 340.5 | 27.1 | 5.5 | | Hybrid | GHSL | 645.5 | 209 | 27.6 | 4.6 |
| | Hybrid | WSF | 705.9 | 354.3 | 27.1 | 5.5 | | Hybrid | WSF | 540.1 | 224.5 | 23.9 | 4.6 |
| | Hybrid | COMBO | 545.3 | 236.2 | 28.5 | 4.9 | | Hybrid | COMBO | 448.5 | 185.2 | 21.9 | 3.8 |
| (e) | Dasymetric Masked | HRSL | 390.9 | 146.7 | 11.3 | 1.7 | (f) | Dasymetric Masked | HRSL | 4040.9 | 1160.3 | 9.8 | 1.5 |
| | Random Forest + Dasymetric | | 343.4 | 110.3 | 11.1 | 1.4 | | Random Forest + Dasymetric | | 3802.9 | 1139.5 | 9.9 | 1.4 |
| | Hybrid | HRSL | 376.3 | 153.2 | 10.7 | 1.7 | | Hybrid | HRSL | 3697.2 | 1278.9 | 8.6 | 1.3 |
| | Hybrid | GHSL | 595.7 | 291.4 | 11.4 | 2.7 | | Hybrid | GHSL | 4279 | 1789 | 8.3 | 1.6 |
| | Hybrid | WSF | 579 | 273.9 | 11.6 | 2.7 | | Hybrid | WSF | 3932.4 | 1462.8 | 8.3 | 1.4 |
| | Hybrid | COMBO | 386.1 | 157.7 | 11 | 1.7 | | Hybrid | COMBO | 3809.1 | 1299.5 | 9.6 | 1.4 |
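For readers less familiar with the metrics in this table, the sketch below shows the standard definitions of RMSE and MAE applied to per-unit counts. The density variants are computed here by dividing counts by unit area before taking the error; that interpretation, the area units, and all function names are assumptions of this example, since the record does not define them.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n


def to_density(counts, areas):
    """Assumed density form: count divided by unit area."""
    return [c / a for c, a in zip(counts, areas)]


# Hypothetical census counts, map totals per unit, and unit areas.
observed = [100, 200, 300]
predicted = [110, 190, 330]
areas = [10.0, 20.0, 10.0]

count_rmse = rmse(observed, predicted)
count_mae = mae(observed, predicted)
density_mae = mae(to_density(observed, areas), to_density(predicted, areas))
```

RMSE penalizes large per-unit misses more heavily than MAE, which is one reason the two metrics can rank the models differently within a panel.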
Variance explained captured in the random forest models of each sampled country.
| Country | Variance Explained | Country | Variance Explained |
|---|---|---|---|
| Haiti | 52.4 | Nepal | 82.12 |
| Madagascar | 78.96 | Thailand | 84.49 |
| Malawi | 72.27 | Rwanda | 73.07 |
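The values above read as percentages. One common way such a figure is defined for a regression forest is 100 × (1 − MSE / Var(y)), usually with out-of-bag predictions. Whether the study used exactly this form is an assumption, and the helper below is a hypothetical illustration rather than the authors' code.

```python
def pct_variance_explained(observed, predicted):
    """Return 100 * (1 - SSE / SST), a pseudo R-squared in percent.

    Assumption: predicted holds held-out (e.g. out-of-bag) predictions.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 100.0 * (1.0 - sse / sst)


# Hypothetical values, not data from the study.
score = pct_variance_explained([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Under this definition a model that only predicts the mean scores 0, and a model worse than the mean scores below 0.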
Figure 5. Box plots of global variable importance presented as mean squared error for each covariate class. The median is represented by the black bar, while the whiskers represent the min/max values within 1.5× inter-quartile range. Variables are sourced in Table 4.