Ann C Miller, Peter Rohloff, Alexandre Blake, Eloin Dhaenens, Leah Shaw, Eva Tuiz, Francesco Grandesso, Carlos Mendoza Montano, Dana R Thomson.
Abstract
BACKGROUND: Population-representative household survey methods require up-to-date sampling frames and sample designs that minimize the time and cost of fieldwork, especially in low- and middle-income countries. Traditional methods such as multi-stage cluster sampling, random-walk, or spatial sampling can be cumbersome, costly, or inaccurate, leading to well-known biases. However, a new tool, Epicentre's Geo-Sampler program, allows simple random sampling of structures, which can eliminate some of these biases. We describe the study design process, experiences, and lessons learned using Geo-Sampler to select a population-representative sample for a kidney disease survey in two sites in Guatemala.
Keywords: Guatemala; Population-representative study; Sample selection; Sampling frame; Simple random sample
Year: 2020 PMID: 33278901 PMCID: PMC7718677 DOI: 10.1186/s12942-020-00250-0
Source DB: PubMed Journal: Int J Health Geogr ISSN: 1476-072X Impact factor: 3.918
Pros and cons of sampling approaches and tools considered
| Approach | Pros | Cons |
|---|---|---|
| Census frame with manual cluster selection | Calculate weights and CIs; low cost of 1st-stage cluster selection; does not require skills beyond survey statistics | Outdated 1st-stage frame; high cost of 2nd-stage household enumeration |
| Gridded population frame with GridSample.org | Calculate weights and CIs; low cost of 1st-stage cluster selection; requires few skills beyond survey statistics | Outdated 1st-stage frame; high cost of 2nd-stage household enumeration |
| Gridded population frame with RTI Geo-Sampling | Calculate weights and CIs; requires few skills beyond survey statistics | Outdated 1st-stage frame; high cost of 1st-stage cluster selection; high cost of 2nd-stage household enumeration |
| Census frame with EPI design | Low cost of 1st-stage cluster selection; low cost of 2nd-stage; does not require skills beyond survey statistics | Outdated 1st-stage frame; cannot calculate weights and CIs |
| No stratification | | Sample not representative of the population; requires GIS skills |
| Stratification on structure density | Updated frame; calculate weights and CIs | Requires GIS and other skills beyond survey statistics; high cost of structure enumeration (or similar count of buildings) |
| Enumerate all structures in OpenStreetMap, Google Earth, or GIS software | Updated frame; calculate weights and CIs | High cost of structure enumeration; requires GIS skills |
| Epicentre Geo-Sampler | Updated frame; calculate weights and CIs; low cost of structure selection (no enumeration); requires few skills beyond survey statistics | |
Fig. 1 Example comparison of simple spatial sampling (dark blue markers 1-10) vs. a simple random sample of structures (light blue markers 101-110), in which only those selected points that contain a structure were retained. Both were generated by Geo-Sampler.
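The point-retention idea in the caption can be illustrated with a toy sketch. This is not Geo-Sampler's code or algorithm: the rectangular `Footprint` class, the `sample_structures` function, the grid of made-up buildings, and the rule of skipping an already-selected structure are all assumptions made for illustration. In Geo-Sampler the decision about whether a point contains a structure is made on satellite imagery, not from stored footprints.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Axis-aligned rectangle standing in for a building outline (hypothetical)."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def sample_structures(footprints, bounds, n_structures, seed=0, max_draws=100_000):
    """Draw uniform random points inside `bounds` and keep a point only if it
    lands on a structure that has not been selected yet (assumed rule).

    Note: in this toy version a structure's chance of being hit grows with
    its footprint area.
    """
    x_min, y_min, x_max, y_max = bounds
    rng = random.Random(seed)
    selected = []  # list of (footprint, (x, y)) pairs
    seen = set()
    for _ in range(max_draws):
        if len(selected) == n_structures:
            break
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        hit = next((f for f in footprints if f.contains(x, y)), None)
        if hit is not None and hit.name not in seen:
            seen.add(hit.name)
            selected.append((hit, (x, y)))
    return selected


# Made-up study area: a 10 x 10 grid of 4 x 4 buildings in a 100 x 100 square.
footprints = [
    Footprint(f"s{i}-{j}", i * 10.0, j * 10.0, i * 10.0 + 4.0, j * 10.0 + 4.0)
    for i in range(10)
    for j in range(10)
]
sample = sample_structures(footprints, (0.0, 0.0, 100.0, 100.0), n_structures=10, seed=42)

# Number the retained points 101, 102, ... as in the figure's light blue markers.
for marker, (structure, (x, y)) in enumerate(sample, start=101):
    print(marker, structure.name, round(x, 1), round(y, 1))
```

Points that fall on open ground are simply discarded here, which is the difference from simple spatial sampling, where every drawn point would be kept and visited.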
Results of approaches to households in CKD prevalence survey, Guatemala
| Variable | Tecpán N (%) | Suchitepéquez N (%) | Total N | P value |
|---|---|---|---|---|
| Structure sample size | 220 | 130 | 350 | 0.04 |
| Residences | 174 (79.1) | 114 (87.7) | 288 | |
| Non-residences/vacant | 46 (20.9) | 16 (12.3) | 62 | |
| Median [IQR] structures per residence | 1 [1–5] | 1 [1–3] | | |
| N (%) with more than one household per structure (n = 158) | 7 (9.0) | 10 (12.2) | | |
| Household responses | 174 | 114 | 288 | 0.008 |
| Recruited | 110 (63.2) | 78 (68.4) | 188 | |
| Refused | 57 (32.7) | 23 (20.2) | 80 | |
| Ineligible or uncontactable | 7 (4.0) | 13 (11.4) | 20 | |
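The counts in this table can be checked with a short calculation. The source does not say which test produced the P values; the sketch below assumes an uncorrected Pearson chi-square test of site against category, and that assumption yields values close to the reported 0.04 and 0.008. The function names are illustrative, and the p-value helper only covers the two degrees of freedom needed here.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_p_value(stat, df):
    """Upper-tail probability of the chi-square distribution (closed forms for df 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Rows: Tecpán, Suchitepéquez. Columns: residences, non-residences/vacant.
structure_type = [[174, 46], [114, 16]]
# Rows: Tecpán, Suchitepéquez. Columns: recruited, refused, ineligible or uncontactable.
household_response = [[110, 57, 7], [78, 23, 13]]

p_structures = chi2_p_value(*pearson_chi2(structure_type))
p_households = chi2_p_value(*pearson_chi2(household_response))

# Column percentages within each site, as shown in the table.
pct_non_residence_tecpan = 100 * 46 / 220
pct_recruited_tecpan = 100 * 110 / 174

print(round(p_structures, 3), round(p_households, 3))
print(round(pct_non_residence_tecpan, 1), round(pct_recruited_tecpan, 1))
```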
Efficiencies and challenges of using Geo-Sampler tool and protocols for population-level data collection in Guatemala
| Steps in sampling protocol | Efficiencies | Challenges and considerations for future work |
|---|---|---|
| 1. Training of study staff on Geo-Sampler tool | Professional contacts facilitated access to Epicentre staff. | Technical documentation is currently limited. Where available, it is offered in few languages. Formal training on the tool is not currently available. |
| 2. Selecting samples and digitizing sampled structure coordinates | No special software required. Geo-Sampler uses very up-to-date, high-resolution imagery, so we were able to identify recently constructed structures. The sample can be selected at one time, by one person. Structures behind walls can be easily selected. Geo-Sampler does not retain the list of selected structure coordinates after the program is shut down, which limits privacy concerns. | Sample selection required significant expert time. If updated samples are needed, extra time from someone trained in the tool will be required, unless multiple people are trained. If multiple people are trained and select the sample, significant coordination and oversight would be required to ensure quality control and consistency and to eliminate repetition of structures, unless Geo-Sampler is adapted to allow simultaneous use by multiple users. |
| 3. Locating selected structures in the field | The Android/Google Maps platform was intuitive, well known to local study staff, and cost-saving. The satellite overlay on Google Maps was useful not only for finding tagged structures but also for identifying principal entrances, alleyways, etc. when approaching structures (many were located in walled compounds). Selecting structures rather than relying on investigators’ concepts of what a “residence” looked like allowed us to include non-traditional living situations. | The initial version of the Geo-Sampler tool provided .kml files without identifiable latitude and longitude, which then required a two-stage process to determine. Epicentre changed this during the course of the study. Efficiencies gained by avoiding multi-stage sampling were somewhat offset by the inefficiency of many tagged structures inevitably not being residences. This method does not capture people who do not live in structures: people living on the street or in cars would still be left out of these surveys. A few selected coordinates were close enough that two different structures were given the same study ID by different data collectors. This was discovered and addressed in the data cleaning phase by recoding one of the residences of each pair. Connectivity in our sites in Guatemala was generally good, but we experienced frequent signal drop-outs, requiring large-format printed maps as back-up at all times. This would likely be the case in many LMIC settings, especially rural areas. Drop-outs in connectivity also caused rapid phone battery drain (due to searching for a signal) and required staff to carry multiple recharging packs in the field to keep phones charged. |
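Two of the challenges in step 3, reading coordinates out of .kml files and sampled points that lie very close together, can be illustrated with a minimal sketch. It assumes a standard KML 2.2 file with one `Point` per `Placemark`; the actual layout of Geo-Sampler's export is not described in the source. The sample coordinates, the marker names, and the 10 m threshold are made up for illustration.

```python
import math
import xml.etree.ElementTree as ET
from itertools import combinations

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def read_placemarks(kml_text):
    """Return [(name, lat, lon)] for every Placemark that contains a Point."""
    root = ET.fromstring(kml_text)
    points = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS).strip()
        coords = placemark.findtext(
            ".//kml:Point/kml:coordinates", default="", namespaces=KML_NS
        ).strip()
        if not coords:
            continue
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = (float(value) for value in coords.split(",")[:2])
        points.append((name, lat, lon))
    return points


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * 6_371_000 * math.asin(math.sqrt(a))


def close_pairs(points, threshold_m=10.0):
    """List pairs of sampled points closer than threshold_m (illustrative cut-off)."""
    flagged = []
    for (name_a, lat_a, lon_a), (name_b, lat_b, lon_b) in combinations(points, 2):
        distance = haversine_m(lat_a, lon_a, lat_b, lon_b)
        if distance < threshold_m:
            flagged.append((name_a, name_b, distance))
    return flagged


# Made-up example file; these are not study coordinates.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark><name>101</name><Point><coordinates>-90.99000,14.76000,0</coordinates></Point></Placemark>
    <Placemark><name>102</name><Point><coordinates>-90.99000,14.76004,0</coordinates></Point></Placemark>
    <Placemark><name>103</name><Point><coordinates>-90.98000,14.77000,0</coordinates></Point></Placemark>
  </Document>
</kml>"""

sampled_points = read_placemarks(SAMPLE_KML)
for name_a, name_b, distance in close_pairs(sampled_points):
    print(f"{name_a} and {name_b} are {distance:.1f} m apart; check study IDs in the field")
```

Flagging such pairs before fieldwork would let data collectors confirm which structure each marker refers to, rather than resolving shared IDs during data cleaning.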
Data collected on each sampled structure
| Query | Possible Responses |
|---|---|
| Is the selected structure a residence? [a] | Yes/No |
| How many associated structures are in use by the household? | Number of structures and type [b] |
| Does more than one household live in structure? | Number of households in residence |
| Recruitment outcomes | Number of households approached; number of households with a contact; number of households with at least one member recruited (“hh enrolled”); number of eligible adults in household and whether each individual was recruited or declined participation |
[a] Including structures with multiple uses (stores, churches, etc.), as long as they are also a residence
[b] For example, the selected structure might be the primary residence for the household, but there may also be a separate kitchen structure and garage structure for the same household
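One way to hold the items above is a small record type per sampled structure. The sketch below is an illustration only: the class names, field names, and the `summarize` helper are hypothetical and do not reflect the study's actual data-collection form or database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HouseholdOutcome:
    """Recruitment outcome for one household approached at a sampled structure."""
    contacted: bool
    eligible_adults: int = 0
    adults_recruited: int = 0

    @property
    def enrolled(self) -> bool:
        # "hh enrolled": at least one member recruited.
        return self.adults_recruited >= 1


@dataclass
class StructureRecord:
    """Data recorded for one sampled structure (hypothetical field names)."""
    study_id: str
    is_residence: bool
    associated_structure_types: List[str] = field(default_factory=list)
    households: List[HouseholdOutcome] = field(default_factory=list)


def summarize(records):
    """Count structures, residences, and household-level recruitment outcomes."""
    residences = [r for r in records if r.is_residence]
    households = [h for r in residences for h in r.households]
    return {
        "structures": len(records),
        "residences": len(residences),
        "households_approached": len(households),
        "households_contacted": sum(h.contacted for h in households),
        "households_enrolled": sum(h.enrolled for h in households),
    }


# Made-up records for illustration.
records = [
    StructureRecord("101", True, ["kitchen", "garage"],
                    [HouseholdOutcome(True, eligible_adults=3, adults_recruited=2)]),
    StructureRecord("102", True, [],
                    [HouseholdOutcome(True, eligible_adults=2, adults_recruited=0),
                     HouseholdOutcome(False)]),
    StructureRecord("103", False),
]
summary = summarize(records)
print(summary)
```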