Valerie A Paz-Soldan, Robert C Reiner, Amy C Morrison, Steven T Stoddard, Uriel Kitron, Thomas W Scott, John P Elder, Eric S Halsey, Tadeusz J Kochel, Helvio Astete, Gonzalo M Vazquez-Prokopec.
Abstract
Quantifying human mobility has significant consequences for studying physical activity and exposure to pathogens, and for generating more realistic infectious disease models. Location-aware technologies such as Global Positioning System (GPS)-enabled devices are used increasingly as a gold standard for mobility research. The main goal of this observational study was to compare and contrast the information obtained through GPS and semi-structured interviews (SSI) to assess issues affecting data quality and, ultimately, our ability to measure fine-scale human mobility. A total of 160 individuals, ages 7 to 74, from Iquitos, Peru, were tracked using GPS data-loggers for 14 days and later interviewed using the SSI about places they visited while tracked. A total of 2,047 and 886 places were reported in the SSI and identified by GPS, respectively. Differences in the concordance between methods occurred by location type, distance threshold (within a given radius to be considered a match) selected, GPS data collection frequency (i.e., 30, 90 or 150 seconds) and number of GPS points near the SSI place considered to define a match. Both methods had perfect concordance identifying each participant's house, followed by 80-100% concordance for identifying schools and lodgings, and 50-80% concordance for residences and commercial and religious locations. As the distance threshold selected increased, the concordance between SSI and raw GPS data increased (beyond 20 meters most locations reached their maximum concordance). Processing raw GPS data using a signal-clustering algorithm decreased overall concordance to 14.3%. The most common causes of discordance, as described by a sub-sample (n=101) with whom we followed up, were GPS units being accidentally off (30%), forgetting or purposely not taking the units when leaving home (24.8%), possible barriers to the signal (4.7%) and leaving units home to recharge (4.6%).
We provide a quantitative assessment of the strengths and weaknesses of both methods for capturing fine-scale human mobility.
Year: 2014 PMID: 24922530 PMCID: PMC4055589 DOI: 10.1371/journal.pntd.0002888
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
Demographic description of the 160 participants for whom concurrent semi-structured interviews and GPS tracking were performed.
| Characteristic | Phase 1 (n = 59), % (n) | Phase 2 (n = 101), % (n) | Total (n = 160), % (n) |
| Sex | | | |
| Male | 52.5 (31) | 34.7 (35) | 41.3 (66) |
| Female | 47.5 (28) | 65.3 (66) | 58.8 (94) |
| Age (years) | | | |
| 7–18 | 5.1 (3) | 42.6 (43) | 28.8 (46) |
| 19–30 | 27.1 (16) | 15.8 (16) | 20.0 (32) |
| 31–40 | 23.7 (14) | 10.9 (11) | 15.6 (25) |
| 41–50 | 23.7 (14) | 16.8 (17) | 19.4 (31) |
| >50 | 20.3 (12) | 13.9 (14) | 16.3 (26) |
Phase 1 included individuals who used the GPS units at all times for 14 days and responded (on day 15) to a retrospective SSI, whereas Phase 2 included the same methods as Phase 1, but in addition individuals were interviewed on day 18 about any discordant information (i.e., locations on the SSI but not registered on the GPS, or vice versa).
Comparison of number of “concordant” locations identified by semi-structured interviews and GPS from both study phases.
| Type of location | Concordant, % (n) | SSI+ GPS−, % (n) | GPS+ SSI−, % (n) | Total, % (n) |
| Residential | 42.5 (156) | 22.1 (372) | 58.6 (304) | 32.4 (832) |
| Market/Shops | 26.4 (97) | 34.2 (575) | 11.4 (59) | 28.5 (731) |
| Recreational | 10.1 (37) | 17.0 (286) | 3.5 (18) | 13.3 (341) |
| Educational | 10.8 (40) | 6.3 (105) | 4.2 (22) | 6.5 (167) |
| Public Bldg. | 0.8 (3) | 4.1 (69) | 1.7 (9) | 3.2 (81) |
| Health | 2.5 (9) | 5.0 (84) | 1.9 (10) | 4.0 (103) |
| Church/religious | 2.5 (9) | 3.2 (53) | 1.0 (5) | 2.6 (67) |
| Cemetery | 0.5 (2) | 1.4 (23) | 0 | 1.0 (25) |
| Lodging | 0 | 0.7 (12) | 1.9 (10) | 0.9 (22) |
| Others | 2.7 (10) | 1.8 (30) | 3.3 (17) | 2.2 (57) |
| Missing land-use information | 1.1 (4) | 4.2 (71) | 12.5 (65) | 4.1 (106) |
Concordance between methods occurred when both methods identified the same location as visited by the same participant. A clustering algorithm was used to summarize raw GPS points into specific locations.
SSI+ GPS−: locations identified in the SSI but not by GPS.
GPS+ SSI−: locations identified by GPS (using a clustering algorithm) but not in the SSI.
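The column counts in the table above can be tied back to the figures quoted in the abstract. The sketch below is an illustration only: it assumes that overall concordance is the number of concordant locations divided by all locations found by either method, a formula the source does not state explicitly but which reproduces the reported 14.3%. The variable names are hypothetical.

```python
# Counts per location type, copied from the table above:
# (concordant, SSI+ GPS-, GPS+ SSI-)
counts = {
    "Residential": (156, 372, 304),
    "Market/Shops": (97, 575, 59),
    "Recreational": (37, 286, 18),
    "Educational": (40, 105, 22),
    "Public Bldg.": (3, 69, 9),
    "Health": (9, 84, 10),
    "Church/religious": (9, 53, 5),
    "Cemetery": (2, 23, 0),
    "Lodging": (0, 12, 10),
    "Others": (10, 30, 17),
    "Missing land-use information": (4, 71, 65),
}

concordant = sum(c for c, _, _ in counts.values())
ssi_only = sum(s for _, s, _ in counts.values())
gps_only = sum(g for _, _, g in counts.values())

# Places reported in the SSI and places identified by GPS.
ssi_total = concordant + ssi_only
gps_total = concordant + gps_only

# Assumed definition: concordant locations over the union of both methods.
all_locations = concordant + ssi_only + gps_only
overall_concordance_pct = 100.0 * concordant / all_locations

print(ssi_total, gps_total, round(overall_concordance_pct, 1))
```

Under this assumption the SSI and GPS totals match the 2,047 and 886 places given in the abstract.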
Figure 1. Locations inferred by (A) semi-structured interviews (SSI) and (B) GPS units.
(A) Spatial distribution of all locations reported as visited by 160 participants during a 14-day period. (B) Raw GPS tracks (yellow points) and locations inferred after the application of a data-reduction algorithm (black dots) that assigns each track to a specific location code in the Iquitos GIS.
Figure 2. Concordance between SSI locations and raw GPS positions at different distance buffer thresholds, GPS data collection frequencies, and number of GPS points.
Concordance was expressed as the percentage of locations for which an SSI-GPS match was found.
Figure 3. Concordance between SSI locations and raw GPS positions at 20 meters from an SSI location.
Concordance is expressed as the proportion of locations for which an SSI-GPS match was found. Panels show values for different location types, combinations of GPS data collection frequencies (15, 90 and 150 seconds) and number of GPS points used to define a visit (1, 5 and 10 points).
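The matching rule behind Figures 2 and 3 (a distance buffer around each SSI location plus a minimum number of raw GPS points inside it) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the haversine distance, the function names, and the parameter names `radius_m` and `min_points` are all assumptions introduced here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_match(ssi_location, gps_points, radius_m=20.0, min_points=1):
    """True if at least `min_points` GPS points lie within `radius_m` of the SSI location."""
    lat0, lon0 = ssi_location
    near = sum(
        1 for lat, lon in gps_points
        if haversine_m(lat0, lon0, lat, lon) <= radius_m
    )
    return near >= min_points


def concordance_pct(ssi_locations, gps_points, radius_m=20.0, min_points=1):
    """Percentage of SSI locations for which an SSI-GPS match is found."""
    if not ssi_locations:
        return 0.0
    matched = sum(
        is_match(loc, gps_points, radius_m, min_points) for loc in ssi_locations
    )
    return 100.0 * matched / len(ssi_locations)


# Hypothetical coordinates for one participant (not data from the study).
ssi = [(-3.7491, -73.2538), (-3.7600, -73.2600)]
gps = [(-3.7491, -73.2538), (-3.74915, -73.2538), (-3.7492, -73.2538)]
print(concordance_pct(ssi, gps, radius_m=20.0, min_points=1))
```

Raising `min_points` or shrinking `radius_m` makes the rule stricter, which is consistent with the caption's point that concordance depends on both settings.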
Figure 4. Sample map used to interview participants about possible causes of discordance between GPS-derived and semi-structured interview locations.
Because both types of locations were joined to the Iquitos GIS, the lot code was provided to ease identification of locations in the database. The size of each point was proportional to the reported or calculated time spent at that location. The map inset shows locations within the city of Iquitos. GPS-derived locations were obtained using a clustering algorithm.
Reasons given by participants for discordance between locations identified by semi-structured interviews (SSI) and by GPS data, Phase 2 (n = 101).
| Reason given for discordance | Number of locations (%) |
| Location reported in SSI but not identified by GPS | |
| Says used GPS, no explanation for missing point | 235 (35.8) |
| Says used GPS, but unit might have been off | 197 (30.0) |
| Admits did not use GPS: rushed out and forgot | 82 (12.5) |
| Says used GPS, but describes possible “barrier” (e.g., unit in purse, under a lot of clothes) | 31 (4.7) |
| Admits did not use GPS, but no explanation given | 33 (5.0) |
| Admits did not use GPS: it was recharging at home | 30 (4.6) |
| Admits did not use GPS: concerned about GPS safety (not getting it stolen) | 23 (3.5) |
| Admits did not use GPS: was going to a location near house | 22 (3.4) |
| Admits did not use GPS: concerned about personal safety if wearing it in this location | 3 (0.5) |
| Location identified by GPS but not reported in SSI | |
| Simply forgot to mention on SSI | 78 (38.2) |
| Location on GPS was en route to another place | 45 (22.1) |
| Forgot to mention on SSI because location not part of regular routine | 31 (15.2) |
| Did not think to mention on SSI because location was outdoors | 27 (13.2) |
| Doesn't remember being there | 12 (5.9) |
| Embarrassed to mention on SSI | 6 (2.9) |
| GPS was used by someone else in household | 5 (2.5) |
| Discordance arising from the methods themselves | |
| Problem in merging methods | 43 (57.3) |
| GPS marked location next door | 32 (42.7) |
A clustering algorithm was used to summarize raw GPS points into specific locations.