Dana R Thomson, Forrest R Stevens, Nick W Ruktanonchai, Andrew J Tatem, Marcia C Castro.
Abstract
BACKGROUND: Household survey data are collected by governments, international organizations, and companies to prioritize policies and allocate billions of dollars. Survey samples are typically selected from recent census data; however, census data are often outdated or inaccurate. This paper describes how gridded population data might instead be used as a sample frame, and introduces the R GridSample algorithm for selecting primary sampling units (PSUs) for complex household surveys with gridded population data. Given a gridded population dataset and the geographic boundary of the study area, GridSample uses a two-step process: it first samples "seed" cells with probability proportionate to estimated population size, then "grows" each PSU until it reaches a minimum population. The algorithm permits stratification and oversampling of urban or rural areas. The approximately uniform size and shape of grid cells also allows spatial oversampling, which is not possible in typical surveys and may improve small-area estimates derived from survey results.
Keywords: Cluster sample; Cluster survey; Multi-stage
Year: 2017 PMID: 28724433 PMCID: PMC5518145 DOI: 10.1186/s12942-017-0098-4
Source DB: PubMed Journal: Int J Health Geogr ISSN: 1476-072X Impact factor: 3.918
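The two-step process described in the abstract can be illustrated with a minimal Python sketch. This is not the GridSample R package or its API. The function names (`pps_systematic`, `grow_psu`, `select_psus`), the use of systematic PPS selection, the 4-neighbour adjacency, and the rule of adding the most populous unused neighbour are all assumptions made for illustration; the abstract only states that seeds are drawn with probability proportionate to estimated population size and that PSUs grow until a minimum population is reached.

```python
import random

# Assumption: cells are adjacent if they share an edge (4-neighbourhood).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def pps_systematic(weights, n, rng):
    """Pick n indices with probability proportional to weights (systematic PPS)."""
    total = sum(weights)
    step = total / n                      # sampling interval
    start = rng.random() * step           # random start in [0, step)
    picks, cum, i = [], 0.0, 0
    for k in range(n):
        target = start + k * step
        # Advance until the target falls inside cell i's cumulative range.
        while i < len(weights) - 1 and cum + weights[i] <= target:
            cum += weights[i]
            i += 1
        picks.append(i)
    return picks

def grow_psu(pop, seed, min_pop, taken):
    """Add neighbouring cells to the seed until the PSU reaches min_pop."""
    rows, cols = len(pop), len(pop[0])
    cells, total = [seed], pop[seed[0]][seed[1]]
    while total < min_pop:
        candidates = {
            (r + dr, c + dc)
            for r, c in cells
            for dr, dc in NEIGHBOURS
            if 0 <= r + dr < rows and 0 <= c + dc < cols
            and (r + dr, c + dc) not in taken
        }
        if not candidates:                # boxed in: stop below the minimum
            break
        # Illustrative rule: take the most populous unused neighbour.
        best = max(sorted(candidates), key=lambda rc: pop[rc[0]][rc[1]])
        taken.add(best)
        cells.append(best)
        total += pop[best[0]][best[1]]
    return {"seed": seed, "cells": cells, "pop": total}

def select_psus(pop, n_psus, min_pop, rng):
    """Step 1: sample seed cells by PPS. Step 2: grow each seed into a PSU."""
    cols = len(pop[0])
    flat = [v for row in pop for v in row]
    # A cell larger than the sampling interval could be hit twice; duplicates
    # are collapsed here, so fewer than n_psus seeds may be returned.
    seeds = sorted({divmod(i, cols) for i in pps_systematic(flat, n_psus, rng)})
    taken = set(seeds)                    # seeds cannot be absorbed by other PSUs
    return [grow_psu(pop, s, min_pop, taken) for s in seeds]

if __name__ == "__main__":
    rng = random.Random(0)
    grid = [[rng.randint(0, 50) for _ in range(20)] for _ in range(20)]
    for psu in select_psus(grid, n_psus=5, min_pop=150, rng=rng):
        print(psu["seed"], len(psu["cells"]), psu["pop"])
```

The sketch ignores stratification, urban/rural oversampling, and the study-area boundary, all of which the abstract lists as features of the actual algorithm.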
Fig. 1 Comparison of first stage in typical population sampling and gridded population sampling
Fig. 2 GridSample workflow
Summary of attributes in the output shapefile
| Label | Type | Description |
|---|---|---|
| PSUid | Integer | PSU identifier |
| stratum | Integer | Stratum identifier |
| psu_pop | Decimal | Estimated population in PSU derived by summing the seed cell and any growth cells selected for PSU |
| psu_r_pop | Decimal | Estimated rural population in PSU derived by summing all rural cells selected for PSU |
| psu_u_pop | Decimal | Estimated urban population in PSU derived by summing all urban cells selected for PSU |
| psus_in_stratum | Integer | Number of PSUs in the stratum |
| str_pop | Decimal | Estimated population in stratum derived by summing all grid cells |
| str_r_pop | Decimal | Estimated rural population in stratum derived by summing all grid cells classified as rural |
| str_u_pop | Decimal | Estimated urban population in stratum derived by summing all grid cells classified as urban |
| str_cells | Integer | Number of total cells in the stratum |
| xCent | Decimal | Longitude of PSU seed cell centroid in decimal degrees |
| yCent | Decimal | Latitude of PSU seed cell centroid in decimal degrees |
| U_R | Character | Urban or rural label based on whether the seed cell was classified as urban or rural |
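As a rough illustration of how the PSU-level attributes in the table relate to the cells in a PSU, the following Python sketch builds one output record. It is not code from the GridSample package. The function name `summarise_psu`, the cell dictionary keys (`pop`, `urban`, `lon`, `lat`), and the `"U"`/`"R"` label values are hypothetical; only the attribute names and their descriptions come from the table.

```python
def summarise_psu(psu_id, stratum, seed, growth_cells):
    """Build one PSU record from its seed cell and any growth cells.

    Each cell is a dict with hypothetical keys:
    pop (float), urban (bool), lon (float), lat (float).
    """
    cells = [seed] + list(growth_cells)
    urban_pop = sum(c["pop"] for c in cells if c["urban"])
    rural_pop = sum(c["pop"] for c in cells if not c["urban"])
    return {
        "PSUid": psu_id,
        "stratum": stratum,
        "psu_pop": urban_pop + rural_pop,  # seed plus growth cells
        "psu_r_pop": rural_pop,            # rural cells only
        "psu_u_pop": urban_pop,            # urban cells only
        "xCent": seed["lon"],              # centroid of the seed cell
        "yCent": seed["lat"],
        # Label follows the seed cell's class; the "U"/"R" coding is assumed.
        "U_R": "U" if seed["urban"] else "R",
    }

if __name__ == "__main__":
    seed = {"pop": 12.5, "urban": True, "lon": 30.06, "lat": -1.95}
    growth = [
        {"pop": 30.0, "urban": True, "lon": 30.07, "lat": -1.95},
        {"pop": 7.5, "urban": False, "lon": 30.06, "lat": -1.96},
    ]
    print(summarise_psu(1, 1, seed, growth))
```

The stratum-level fields (`psus_in_stratum`, `str_pop`, `str_r_pop`, `str_u_pop`, `str_cells`) would be computed once per stratum over all grid cells and are omitted here.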
Fig. 3 Input datasets to the Rwanda gridded population sample in GridSample
Number of primary sampling units in a Demographic and Health Survey and equivalent GridSample survey
| District name | Alternative name | DHS urban | DHS rural | GridSample urban | GridSample rural |
|---|---|---|---|---|---|
| Bugesera | Bugesera | | 16 | 2 | 14 |
| Burera | Burera | | 16 | 1 | 15 |
| Butamwa | Nyarugenge | 19 | 1 | 15 | 1 |
| Butare | Huye | 3 | 13 | 3 | 13 |
| Byumba | Gicumbi | 2 | 14 | 1 | 15 |
| Cyangugu | Rusizi | 2 | 14 | 5 | 11 |
| Gakenke | Gakenke | | 16 | | 16 |
| Gasiza | Nyabihu | | 16 | 1 | 15 |
| Gatagara | Ruhango | 3 | 13 | | 16 |
| Gatsibo | Gatsibo | | 16 | | 16 |
| Gikongoro | Nyamagabe | 1 | 15 | 3 | 13 |
| Gisagara | Gisagara | | 16 | | 16 |
| Gisenyi | Rubavu | 1 | 15 | 10 | 6 |
| Gitarama | Muhanga | 4 | 12 | 1 | 15 |
| Kamonyi | Kamonyi | | 16 | | 16 |
| Kayonza | Kayonza | | 16 | | 16 |
| Kibungo | Ngoma | 3 | 13 | 1 | 15 |
| Kibuye | Karongi | 2 | 14 | 2 | 14 |
| Kicukiro | Kicukiro | 20 | | 13 | 3 |
| Kigali | Gasabo | 11 | 9 | 8 | 8 |
| Kirehe | Kirehe | | 16 | | 16 |
| Nogororero | Ngororero | | 16 | | 16 |
| Nyagatare | Nyagatare | | 16 | | 16 |
| Nyamasheke | Nyamasheke | | 16 | | 16 |
| Nyanza | Nyanza | 4 | 12 | 2 | 14 |
| Nyaruguru | Nyaruguru | | 16 | | 16 |
| Ruhengeri | Musanze | 2 | 14 | 3 | 13 |
| Rulindo | Rulindo | | 16 | 1 | 15 |
| Rutsiro | Rutsiro | | 16 | | 16 |
| Rwamagana | Rwamagana | 2 | 14 | 3 | 13 |
| Total | | 79 | 413 | 75 | 405 |
Fig. 4 Visual comparison of primary sampling units (PSUs) generated by the 2010 Rwanda DHS [56] and GridSample
Fig. 5 Probability sampling approaches with gridded population data
Fig. 6 Schematic of four field implementation options for gridded population sampling