Alessandro Sorichetta, Graeme M Hornby, Forrest R Stevens, Andrea E Gaughan, Catherine Linard, Andrew J Tatem.
Abstract
The Latin America and the Caribbean region is one of the most urbanized regions in the world, with a total population of around 630 million that is expected to increase by 25% by 2050. In this context, detailed and contemporary datasets accurately describing the distribution of residential population in the region are required for measuring the impacts of population growth, monitoring changes, supporting environmental and health applications, and planning interventions. To support these needs, an open access archive of high-resolution gridded population datasets was created through disaggregation of the most recent official population count data available for 28 countries located in the region. These datasets are described here along with the approach and methods used to create and validate them. For each country, population distribution datasets, having a resolution of 3 arc seconds (approximately 100 m at the equator), were produced for the population count year, as well as for 2010, 2015, and 2020. All these products are available both through the WorldPop Project website and the WorldPop Dataverse Repository.
Year: 2015 PMID: 26347245 PMCID: PMC4555876 DOI: 10.1038/sdata.2015.45
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Figure 1. Schematic overview of the Random Forest (RF)-based dasymetric mapping approach used to produce the WorldPop Americas datasets (modified from Stevens et al.[24]).
The preparation of the response variable and covariates is described in the yellow and orange panels, respectively; the RF modelling steps are outlined in the green panels; and the disaggregation of the input population counts from administrative units into grid cells is described in the blue panel.
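
The disaggregation step in the blue panel can be illustrated with a minimal sketch. It assumes that each grid cell already carries a non-negative weight (for example, a density predicted by the RF model) and the ID of the administrative unit containing it. The function name, the flat-list representation, and the toy values are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict


def dasymetric_disaggregate(unit_ids, weights, unit_counts):
    """Redistribute administrative-unit counts to grid cells.

    unit_ids    -- administrative unit ID of each grid cell (flattened raster)
    weights     -- non-negative weight of each cell (e.g. a predicted density)
    unit_counts -- dict mapping unit ID -> population count of that unit
    """
    # Sum the weights of all cells belonging to each unit.
    weight_sums = defaultdict(float)
    for uid, w in zip(unit_ids, weights):
        weight_sums[uid] += w

    # Each cell receives its unit's count in proportion to its share of weight.
    people_per_cell = []
    for uid, w in zip(unit_ids, weights):
        total = weight_sums[uid]
        share = w / total if total > 0 else 0.0
        people_per_cell.append(unit_counts[uid] * share)
    return people_per_cell


# Toy example (hypothetical values): two units, five cells.
unit_ids = ["A", "A", "A", "B", "B"]
weights = [1.0, 3.0, 0.0, 2.0, 2.0]
counts = {"A": 400, "B": 100}
cells = dasymetric_disaggregate(unit_ids, weights, counts)
print(cells)  # [100.0, 300.0, 0.0, 50.0, 50.0]
```

The cell values within each unit sum back to that unit's input count, which is the defining property of dasymetric redistribution. With equal weights and equal-area cells, the same function reduces to simple areal weighting.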
Summary information about population count data and administrative unit datasets used to produce the WorldPop Americas datasets
For each country (identified by its ISO country code in the 1st column), the Average Spatial Resolution was calculated as the square root of its surface area divided by the number of administrative units and represents the effective resolution of the latter (i.e., the cell size of administrative units if all units were square of equal size)[

| ISO | Surface area (km²) | Population | Year | Number of administrative units | Administrative unit name/level | Average Spatial Resolution | Population data source | Administrative unit data source |
|---|---|---|---|---|---|---|---|---|
| ATG | 436 | 81,799C | 2011 | 8 | Parish/1 | 2.6 | Census OfficeG | GADM[ |
| ARG | 2,804,771 | 40,117,096C | 2010 | 526 | Department/2 | 3.2 | INDECG | IGN[ |
| BLZ | 21,918 | 312,971C | 2010 | 16 | Subdivision/1 | 9.3 | Statistical Institute of BelizeG | Meerman[ |
| BOL | 1,069,327 | 10,027,262C | 2012 | 112 | Province/2 | 9.2 | INEG | GADM[ |
| BRA | 8,233,131 | 190,732,694C | 2010 | 5565 | Municipality/3 | 0.6 | IBGE | IBGE |
| CHL | 756,096 | 16,341,929C | 2012 | 297 | Municipality/3 | 2.9 | INE-CELADEG | GADM[ |
| COL | 1,141,261 | 47,661,787E | 2013 | 1115 | Municipality/2 | 1.0 | Departamento Administrativo Nacional de EstadísticaG | GADM[ |
| CRI | 51,100 | 4,301,712C | 2011 | 469 | District/3 | 0.5 | INECG | GADM[ |
| CUB | 109,884 | 11,167,325C | 2012 | 168 | Municipality/2 | 2.0 | ONEG | GADM[ |
| DOM | 48,070 | 9,445,281C | 2010 | 155 | Municipality/3 | 1.4 | ONEG | GADM[ |
| ECU | 257,320 | 14,483,499C | 2010 | 978 | Parish/4 | 0.5 | INECG | GADM[ |
| SLV | 21,045 | 5,744,113C | 2007 | 267 | Municipality/2 | 0.5 | Dirección General de Estadística y CensosG | GADM[ |
| GUF | 83,534 | 231,167E | 2010 | 21 | Municipality/3 | 13.8 | InseeG | GADM[ |
| GTM | 108,201 | 15,073,375P | 2012 | 335 | Municipality/2 | 1.0 | INEG | GADM[ |
| GUY | 214,999 | 751,223C | 2002 | 116 | Council/2 | 4.0 | Statistics GuyanaG | GADM[ |
| HTI | 26,964 | 9,923,243E | 2009 | 570 | Section/4 | 0.3 | IHSI | GADM[ |
| HND | 112,457 | 8,045,990P | 2010 | 298 | Municipality/2 | 1.1 | INEG | GADM[ |
| JAM | 10,991 | 2,697,983C | 2011 | 14 | Parish/1 | 7.5 | Statistical InstituteG | GADM[ |
| MEX | 1,967,138 | 112,336,538C | 2010 | 2456 | Municipality/2 | 0.6 | INEGI | Valle-Jones[ |
| NIC | 120,340 | 6,071,045E | 2012 | 139 | Municipality/3 | 2.5 | INIDEG | GADM[ |
| PAN | 74,177 | 3,405,813C | 2010 | 77 | District/2 | 3.5 | Dirección de Estadística y CensoG | GADM[ |
| PRY | 406,752 | 3,725,789E | 2002 | 247 | Municipality/2 | 2.6 | Dirección General de Estadística, Encuestas y CensosG | GADM[ |
| PER | 1,294,681 | 30,135,875P | 2012 | 194 | Province/2 | 5.9 | INEIG | GADM[ |
| PRI | 13,790 | 3,725,789C | 2010 | 78 | Municipality/1 | 1.5 | U.S. Census BureauG | GADM[ |
| SUR | 163,820 | 541,638C | 2004 | 62 | Resort/2 | 6.5 | Algemeen Bureau voor de StatistiekG | GADM[ |
| TTO | 5,127 | 1,328,019C | 2011 | 14 | Municipality/1 | 5.1 | Central Statistical OfficeG | GADM[ |
| URY | 175,016 | 3,286,314C | 2011 | 19 | Department/1 | 22.0 | INEG | GADM[ |
| VEN | 913,982 | 28,946,101C | 2011 | 344 | Municipality/2 | 2.8 | INEG | GADM[ |
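
The Average Spatial Resolution described in the table note can be checked against the tabulated figures. The sketch below reads the definition as the square root of the surface area, divided by the number of units. That reading is an assumption, adopted only because it reproduces the listed values for the rows tested. The alternative reading, the square root of the area-to-units ratio, would match the "square units of equal size" description but gives different numbers (about 7.4 rather than 2.6 for ATG). The function name is hypothetical.

```python
import math


def average_spatial_resolution(surface_area, n_units):
    """Square root of the surface area, divided by the number of units."""
    return math.sqrt(surface_area) / n_units


# (surface area, number of units, tabulated value) copied from the table above.
rows = {
    "ATG": (436, 8, 2.6),
    "ARG": (2_804_771, 526, 3.2),
    "BLZ": (21_918, 16, 9.3),
    "URY": (175_016, 19, 22.0),
}
for iso, (area, n_units, tabulated) in rows.items():
    asr = average_spatial_resolution(area, n_units)
    print(iso, round(asr, 1), tabulated)
```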
Summary information on the twelve default datasets and the derived default covariates used for input to the RF method
| Continuous raster datasets were resampled for use as covariates, while both categorical raster and rasterized vector datasets were first resampled and then processed into ‘presence/absence’, ‘distance to’, and ‘proportion of’ raster covariates. ‘Class #’, in the 2nd column, refers to the WorldPop Americas classes described in | ||||||
|---|---|---|---|---|---|---|
| Night-lights’ intensity | 2012 | Continuous | Raster | 100 m | ||
| Plants’ energy productivity | 2014/2015 | Continuous | Raster | 100 m | ||
| Annual Mean Temperature | 1950–2000 | Continuous | Raster | 100 m | ||
| Annual Precipitation | 1950–2000 | Continuous | Raster | 100 m | ||
| Elevation | 2000 | Continuous | Raster | 100 m | ||
| Slope | 2000 | Continuous | Raster | 100 m | ||
| Presence/absence of class # | 2000/2009 | Categorical (binary) | Raster | 100 m | ||
| Distance to class # | 2000/2009 | Continuous | Raster | 100 m | ||
| Proportion of class # | 2000/2009 | Continuous | Raster | 100 m | ||
| Presence/absence of built-up areas (BLT) | 2000/2009 | Categorical (binary) | Raster | 100 m | ||
| Distance to built-up areas (BLT) | 2000/2009 | Continuous | Raster | 100 m | ||
| Proportion of built-up area (BLT) | 2000/2009 | Continuous | Raster | 100 m | ||
| Presence/absence of urban areas | 2000/2001 | Categorical (binary) | Raster | 100 m | ||
| Distance to urban areas | 2000/2001 | Continuous | Raster | 100 m | ||
| Proportion of urban area | 2000/2001 | Continuous | Raster | 100 m | ||
| Presence/absence of protected areas | 2012 | Categorical (binary) | Raster | 100 m | ||
| Distance to protected areas | 2012 | Continuous | Raster | 100 m | ||
| Proportion of protected area | 2012 | Continuous | Raster | 100 m | ||
| Presence/absence of populated places/roads/rivers/waterbodies | | Categorical (binary) | Raster | 100 m | ||
| Distance to populated places/roads/rivers/waterbodies | | Continuous | Raster | 100 m | ||
| Proportion of populated places/roads/rivers/waterbodies | | Continuous | Raster | 100 m | ||
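
As an illustration of how a categorical raster can be turned into the derived covariates listed above, the following sketch builds a ‘presence/absence’ layer and a ‘distance to’ layer from a small land-cover grid. It is not the authors' processing chain. The function names, the class codes, and the brute-force planar Euclidean distance are assumptions chosen for readability. The ‘proportion of’ covariate is omitted because the neighbourhood over which it is computed is not stated here.

```python
import math


def presence_absence(class_grid, target_class):
    """Return 1 where a cell belongs to target_class and 0 elsewhere."""
    return [[1 if c == target_class else 0 for c in row] for row in class_grid]


def distance_to(presence, cell_size=1.0):
    """Euclidean distance from every cell to the nearest presence cell.

    Brute force for readability; a real raster needs a proper distance transform.
    """
    targets = [(r, c) for r, row in enumerate(presence)
               for c, v in enumerate(row) if v == 1]
    if not targets:
        raise ValueError("no presence cells in the grid")
    return [[min(math.hypot(r - tr, c - tc) for tr, tc in targets) * cell_size
             for c in range(len(row))]
            for r, row in enumerate(presence)]


# Hypothetical 3 x 3 land-cover grid; class 2 stands in for a class of interest.
land_cover = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
built = presence_absence(land_cover, 2)
dist = distance_to(built, cell_size=100.0)  # nominal 100 m cells
```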
Figure 2. Schematic overview of the procedure used to generate population density weighting layers.
For illustrative purposes, only 4 of the 74 covariates considered for Puerto Rico are shown here (the uninhabited Puerto Rican islands of Mona, Monito, and Desecheo are not shown).
Figure 3. Estimated people per grid cell for Latin America and the Caribbean in 2010 (excluding Guadeloupe, Martinique, Bahamas, Barbados, Saint Lucia, Curaçao, Aruba, Saint Vincent and The Grenadines, US and British Virgin Islands, Grenada, Dominica, Cayman Islands, Saint Kitts and Nevis, Sint Maarten, Turks and Caicos Islands, Saint Martin, Caribbean Netherlands, Anguilla, Saint Barthélemy, and Montserrat).
The grid cell resolution is 3 arc seconds (approximately 100 m at the equator) and coordinates refer to GCS WGS 1984. For illustrative purposes, the color ranges used are country-specific.
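
The "approximately 100 m at the equator" figure can be reproduced with a short calculation. The sketch below uses a spherical approximation with the WGS 84 equatorial radius. The function name is hypothetical, and the approximation ignores ellipsoidal flattening.

```python
import math

WGS84_SEMI_MAJOR_AXIS_M = 6_378_137.0  # equatorial radius of the WGS 84 ellipsoid
CELL_DEG = 3 / 3600  # 3 arc seconds expressed in degrees


def cell_width_m(latitude_deg):
    """East-west width of a 3 arc-second cell (spherical approximation)."""
    return (WGS84_SEMI_MAJOR_AXIS_M * math.radians(CELL_DEG)
            * math.cos(math.radians(latitude_deg)))


for lat in (0, 20, 40):
    print(lat, round(cell_width_m(lat), 1))
```

At the equator this gives a little under 93 m, consistent with the stated approximation. The east-west width shrinks with the cosine of latitude, so cells cover less ground area away from the equator, which matters when comparing per-cell and per-hectare values.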
Name (ISO and YEAR represent the ISO country code and the population count year, respectively), description, and format of all files contained in each 7-Zip archive associated with the 28 countries listed in Table 1.
| Name | Description | Format |
|---|---|---|
| ISO_ppp_v2b_YEAR.tif | Estimated people per grid cell for the year the official population count data refer to (3 arc seconds) | GeoTIFF |
| ISO_ppp_v2b_2010.tif | Projected estimated people per grid cell for 2010 (3 arc seconds) | GeoTIFF |
| ISO_ppp_v2b_2010_UNadj.tif | Projected estimated people per grid cell for 2010 adjusted to match UNPD estimates (3 arc seconds) | GeoTIFF |
| ISO_ppp_v2b_2015.tif | Projected estimated people per grid cell for 2015 (3 arc seconds) | GeoTIFF |
| ISO_ppp_v2b_2015_UNadj.tif | Projected estimated people per grid cell for 2015 adjusted to match UNPD estimates (3 arc seconds) | GeoTIFF |
| ISO_ppp_v2b_2020.tif | Projected estimated people per grid cell for 2020 (3 arc seconds) | GeoTIFF |
| ISO_ppp_v2b_2020_UNadj.tif | Projected estimated people per grid cell for 2020 adjusted to match UNPD estimates (3 arc seconds) | GeoTIFF |
| ISO_pph_v2b_YEAR.tif | Estimated people per hectare for the year the official population count data refer to (3 arc seconds) | GeoTIFF |
| ISO_pph_v2b_2010.tif | Projected estimated people per hectare for 2010 | GeoTIFF |
| ISO_pph_v2b_2010_UNadj.tif | Projected estimated people per hectare for 2010 adjusted to match UNPD estimates (3 arc seconds) | GeoTIFF |
| ISO_pph_v2b_2015.tif | Projected estimated people per hectare for 2015 | GeoTIFF |
| ISO_pph_v2b_2015_UNadj.tif | Projected estimated people per hectare for 2015 adjusted to match UNPD estimates (3 arc seconds) | GeoTIFF |
| ISO_pph_v2b_2020.tif | Projected estimated people per hectare for 2020 (3 arc seconds) | GeoTIFF |
| ISO_pph_v2b_2020_UNadj.tif | Projected estimated people per hectare for 2020 adjusted to match UNPD estimates (3 arc seconds) | GeoTIFF |
| ISO_ppp_v2b_YEAR.kmz | Estimated people per grid cell for the year the official census/population counts refer to | Keyhole Markup Language (Zipped) |
| ISO_metadata.html | Metadata report for the Random Forest model | HyperText Markup Language |
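
The naming scheme in the table above can be expanded programmatically for a given country. The sketch below is a convenience helper, not part of the published archive. The function name is hypothetical, the upper-case ISO code is an assumption, and removing duplicate names when the population count year coincides with 2010 is also an assumption, since the table does not say how that case is handled.

```python
def archive_file_names(iso, count_year):
    """List the file names expected in one country archive, per the table."""
    years = [str(count_year), "2010", "2010_UNadj",
             "2015", "2015_UNadj", "2020", "2020_UNadj"]
    names = []
    for product in ("ppp", "pph"):  # people per grid cell, people per hectare
        names += [f"{iso}_{product}_v2b_{year}.tif" for year in years]
    names.append(f"{iso}_ppp_v2b_{count_year}.kmz")
    names.append(f"{iso}_metadata.html")
    # Assumption: identical names (count year equal to 2010) refer to one file.
    return list(dict.fromkeys(names))


names = archive_file_names("CRI", 2011)
print(len(names))  # 16
```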
Prediction accuracy of the RF model used to generate the dasymetric weighting layers and accuracy assessment of the RF-based dasymetric mapping approach compared to the simple areal-weighting (SAW) mapping approach
The OOB error and the percentage of variance explained are provided for all 28 countries while the RMSE, the %RMSE, and the MAE values are provided for six countries. ‘RF’ and ‘SAW’, in the 2nd column, indicate that, for that specific country, the population counts at the administrative unit level were disaggregated using the RF-based dasymetric mapping approach and the simple areal-weighting approach, respectively.

| ISO | Approach | Administrative level | Number of units | OOB error | Variance explained (%) | RMSE | %RMSE | MAE |
|---|---|---|---|---|---|---|---|---|
| ATG | RF | 1 | 8 | 0.21 | 86 | — | — | — |
| ARG | RF | 2 | 526 | 0.78 | 88 | — | — | — |
| BLZ | RF | 1 | 16 | 0.25 | 79 | — | — | — |
| BOL | RF | 2 | 112 | 0.88 | 65 | — | — | — |
| BRA | RF | 3 | 5565 | 0.32 | 84 | — | — | — |
| CHL | RF | 3 | 297 | 1.40 | 70 | — | — | — |
| COL | RF | 2 | 1115 | 0.35 | 84 | — | — | — |
| COL | RF | 1 | 33 | 1.20 | 75 | 109798.10 | 259.81 | 29361.29 |
| COL | SAW | 1 | 33 | — | — | 128372.29 | 303.76 | 36463.22 |
| CRI | RF | 3 | 469 | 0.40 | 92 | — | — | — |
| CRI | RF | 2 | 81 | 0.20 | 93 | 4837.37 | 52.96 | 3012.04 |
| CRI | SAW | 2 | 81 | — | — | 14463.43 | 158.34 | 7976.94 |
| CUB | RF | 2 | 168 | 0.33 | 82 | — | — | — |
| DOM | RF | 2 | 155 | 0.22 | 86 | — | — | — |
| DOM | RF | 1 | 32 | 0.53 | 62 | 46349.33 | 76.06 | 19461.99 |
| DOM | SAW | 1 | 32 | — | — | 101563.30 | 166.67 | 39729.54 |
| ECU | RF | 4 | 978 | 0.47 | 82 | — | — | — |
| ECU | RF | 3 | 198 | 0.43 | 77 | 36713.59 | 248.75 | 7243.05 |
| ECU | SAW | 3 | 198 | — | — | 60295.60 | 408.52 | 12322.64 |
| SLV | RF | 2 | 267 | 0.20 | 81 | — | — | — |
| GUF | RF | 3 | 21 | 2.60 | 59 | — | — | — |
| GTM | RF | 2 | 335 | 0.24 | 80 | — | — | — |
| GTM | RF | 1 | 22 | 0.33 | 58 | 51704.90 | 114.13 | 20590.36 |
| GTM | SAW | 1 | 22 | — | — | 62299.18 | 137.52 | 26125.56 |
| GUY | RF | 2 | 116 | 1.10 | 87 | — | — | — |
| HTI | RF | 4 | 570 | 0.14 | 84 | — | — | — |
| HTI | RF | 3 | 140 | 0.071 | 90 | 10794.96 | 62.01 | 5493.25 |
| HTI | SAW | 3 | 140 | — | — | 18677.50 | 107.29 | 8501.97 |
| HND | RF | 2 | 298 | 0.20 | 71 | — | — | — |
| JAM | RF | 1 | 14 | 0.21 | 86 | — | — | — |
| MEX | RF | 2 | 2456 | 0.21 | 92 | — | — | — |
| NIC | RF | 3 | 139 | 0.32 | 79 | — | — | — |
| PAN | RF | 2 | 77 | 0.41 | 74 | — | — | — |
| PRY | RF | 2 | 247 | 0.44 | 85 | — | — | — |
| PER | RF | 2 | 194 | 0.58 | 63 | — | — | — |
| PRI | RF | 1 | 78 | 0.16 | 74 | — | — | — |
| SUR | RF | 2 | 62 | 1.40 | 86 | — | — | — |
| TTO | RF | 1 | 14 | 0.21 | 86 | — | — | — |
| URY | RF | 1 | 19 | 0.58 | 91 | — | — | — |
| VEN | RF | 2 | 344 | 1.20 | 71 | — | — | — |
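
The three accuracy statistics reported above can be computed as in the following sketch. The RMSE and MAE follow their standard definitions. Expressing %RMSE as the RMSE divided by the mean observed count is an assumption, as the normalisation is not spelled out here. The function name and all input numbers are hypothetical and serve only to show the calculation.

```python
import math


def accuracy_metrics(observed, predicted):
    """RMSE, %RMSE, and MAE of predicted versus observed unit counts."""
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    mean_observed = sum(observed) / len(observed)
    pct_rmse = 100 * rmse / mean_observed  # assumed normalisation
    return {"RMSE": rmse, "%RMSE": pct_rmse, "MAE": mae}


# Hypothetical counts for four units; not taken from the table.
observed = [120, 80, 300, 500]
weighted_estimate = [100, 100, 320, 480]  # stands in for a dasymetric result
uniform_estimate = [250, 250, 250, 250]   # stands in for areal weighting

rf = accuracy_metrics(observed, weighted_estimate)
saw = accuracy_metrics(observed, uniform_estimate)
print(rf)
print(saw)
```

Lower values on all three statistics indicate estimates closer to the observed counts, which is the basis on which the RF and SAW rows for the same country are compared in the table.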