Abstract
With the development of information and communications technology, user-generated content and crowdsourced data are playing a large role in studies of transport and public health. Recently, Strava, a popular website and mobile app dedicated to tracking athletic activity (cycling and running), began offering a data service called Strava Metro, designed to help transportation researchers and urban planners improve infrastructure for cyclists and pedestrians. Strava Metro data has the potential to promote studies of cycling and health by indicating where commuting and non-commuting cycling activities take place at a large spatial scale (street level and intersection level). The assessment of spatially varying effects of air pollution during active travel (cycling or walking) might benefit from Strava Metro data, as air pollution levels would be expected to vary within a city. In this paper, to explore the potential of Strava Metro data in research on active travel and health, we investigate spatial patterns of non-commuting cycling activities and associations between cycling purpose (commuting and non-commuting) and air pollution exposure at a large scale. Additionally, we attempt to estimate the number of non-commuting cycling trips from environmental characteristics that may help identify cycling behavior. Researchers undertaking studies relating to cycling purpose could benefit from this approach when using cycling trip data sets that lack trip purpose. We use the Strava Metro Nodes data from Glasgow, United Kingdom, in an empirical study.
The empirical results show that (1) compared with commuting cycling activities, non-commuting cycling activities are more likely to be located on the outskirts of the city; (2) spatially speaking, cyclists riding for recreation and other purposes are more likely to be exposed to relatively low levels of air pollution than cyclists riding for commuting; and (3) the method for estimating the number of non-commuting cycling activities works well in this study. The results highlight (1) a need for policymakers to consider how to improve cycling infrastructure and road safety on the outskirts of cities; and (2) a possible way of estimating the number of non-commuting cycling activities when the trip purpose of cycling data is unknown.
Keywords: Strava Metro; air pollution exposure; crowdsourced data; cycling purpose; particulate matter
Year: 2017 PMID: 28282865 PMCID: PMC5369110 DOI: 10.3390/ijerph14030274
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Nodes and edges of Strava Metro data (Basemap: OpenStreetMap, licensed under the Open Database License (ODbL)).
Figure 2. Cumulative distributions of numbers of all-purpose cycling activities, non-commuting cycling activities and commuting cycling activities in the log-linear plot.
Figure 3. Census output areas in Glasgow (source: DATA.GOV.UK [52]).
Figure 4. PM grids in Glasgow to represent levels of PM10 and PM2.5 (source: Scottish Air Quality Database, SAQD).
Independent variables in the estimation of non-commuting cycling activities.
| Variable | Meaning |
|---|---|
| Num_cycling | Number of cycling activities at the node |
| Dis_to_Greenspace | Distance from node to its nearest green space |
| Dis_to_Waterbody | Distance from node to its nearest water body |
| Dis_to_Citycentre | Distance from node to the city center |
| Num_nearest_busstops | Number of nearby bus stops (bus stops within 100 m of the node) |
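The variables in this table can be derived per node with simple distance computations. The following is a minimal sketch, not the authors' procedure: it assumes all locations are points in a projected coordinate system measured in metres, and it treats green spaces and water bodies as representative points rather than polygons. The function names and the sample coordinates are hypothetical.

```python
import math

def nearest_distance(node, points):
    """Euclidean distance (m) from a node to the closest point in `points`."""
    return min(math.dist(node, p) for p in points)

def count_within(node, points, radius):
    """Number of points lying within `radius` metres of the node."""
    return sum(1 for p in points if math.dist(node, p) <= radius)

def node_features(node, num_cycling, greenspaces, waterbodies,
                  city_centre, bus_stops, radius=100.0):
    """Assemble the five independent variables for one node."""
    return {
        "Num_cycling": num_cycling,
        "Dis_to_Greenspace": nearest_distance(node, greenspaces),
        "Dis_to_Waterbody": nearest_distance(node, waterbodies),
        "Dis_to_Citycentre": math.dist(node, city_centre),
        "Num_nearest_busstops": count_within(node, bus_stops, radius),
    }

# Illustrative coordinates only; these are not values from the study.
features = node_features(
    node=(0.0, 0.0),
    num_cycling=120,
    greenspaces=[(30.0, 40.0), (300.0, 0.0)],
    waterbodies=[(0.0, 200.0)],
    city_centre=(600.0, 800.0),
    bus_stops=[(30.0, 40.0), (0.0, 50.0), (150.0, 0.0)],
)
print(features)
```

With real data, polygon layers would normally be handled by a GIS library, and the 100 m radius would be applied to projected rather than geographic coordinates.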
Cluster types and associated solution values.
| Solution Value | Cluster Type |
|---|---|
| ≥1 | Cluster of High value |
| 0 | Outside of Cluster |
| ≤−1 | Cluster of Low value |
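The mapping in this table can be expressed as a small lookup function. This sketch assumes the solution values are integers, as the table suggests; the function name `cluster_type` is hypothetical and the clustering algorithm that produces the values is not shown.

```python
def cluster_type(solution_value: int) -> str:
    """Map a cluster-detection solution value to the cluster type in the table."""
    if solution_value >= 1:
        return "Cluster of High value"
    if solution_value <= -1:
        return "Cluster of Low value"
    return "Outside of Cluster"

# Illustrative solution values only.
labels = [cluster_type(v) for v in (2, 0, -3)]
print(labels)
```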
Figure 5. Clusters of high and low non-commuting rate.
Percentages of areas of clusters of high and low non-commuting rate with different levels of PM10 and PM2.5.
| Pollutant (Unit: μg/m³) | Level | Cluster of High Value | Outside of Cluster | Cluster of Low Value |
|---|---|---|---|---|
| PM10 | <10 | 0% | 0% | 0% |
| PM10 | 10–12 | 80% | 61% | 30% |
| PM10 | 12–14 | 19% | 36% | 46% |
| PM10 | >14 | 1% | 3% | 24% |
| PM2.5 | <8 | 60% | 33% | 8% |
| PM2.5 | 8–9 | 36% | 58% | 50% |
| PM2.5 | 9–10 | 4% | 8% | 28% |
| PM2.5 | >10 | 0% | 1% | 14% |
Means of instantaneous exposure to PM10 and PM2.5 for non-commuting and commuting cycling activities.
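| Air Pollution Exposure | PM10, Non-Commuting | PM10, Commuting | PM2.5, Non-Commuting | PM2.5, Commuting |
|---|---|---|---|---|
| Mean (Unit: μg/m³) | 11.823 | 12.631 | 8.233 | 8.753 |
| Wilcoxon test (p-value) | <0.001 | | <0.001 | |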
Percentages of ‘high exposure’ activities for non-commuting and commuting cycling activities.
| Non-Commuting Cycling Activities | Commuting Cycling Activities |
|---|---|
| 6.4% | 15.0% |
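Summary figures such as a mean exposure and a share of 'high exposure' activities could be computed from node-level data roughly as follows. This is a sketch under stated assumptions, not the authors' method: each node is assumed to carry the PM value of the grid cell it falls in, activities are assumed to weight the node values, and the `threshold` defining 'high exposure' is a hypothetical parameter, since the cut-off is not given in this record. All numbers are illustrative.

```python
def exposure_summary(nodes, threshold):
    """Return (activity-weighted mean PM, share of activities above threshold).

    nodes: list of (pm_value, n_activities) pairs, one pair per node.
    threshold: hypothetical PM level above which exposure counts as 'high'.
    """
    total = sum(n for _, n in nodes)
    mean_pm = sum(pm * n for pm, n in nodes) / total
    high_share = sum(n for pm, n in nodes if pm > threshold) / total
    return mean_pm, high_share

# Illustrative (PM10 value, activity count) pairs; not data from the study.
non_commuting = [(11.0, 30), (12.0, 50), (15.0, 20)]
commuting = [(11.0, 10), (13.0, 50), (15.0, 40)]

mean_nc, share_nc = exposure_summary(non_commuting, threshold=14.0)
mean_c, share_c = exposure_summary(commuting, threshold=14.0)
print(f"non-commuting: mean={mean_nc:.2f}, high share={share_nc:.1%}")
print(f"commuting:     mean={mean_c:.2f}, high share={share_c:.1%}")
```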
Figure 6. Scatterplots generated for each independent variable with the dependent variable: (a) Num_cycling vs. the dependent variable; (b) Dis_to_Greenspace vs. the dependent variable; (c) Dis_to_Waterbody vs. the dependent variable; (d) Dis_to_Citycentre vs. the dependent variable; (e) Num_nearest_busstops vs. the dependent variable.
Estimation results of non-commuting cycling activities by different algorithms.
| Accuracy | OLS | MLP | SVM | RF |
|---|---|---|---|---|
| Correlation coefficient (Pearson’s r) | 0.814 | 0.904 | 0.807 | 0.981 |
OLS: ordinary least squares; MLP: multilayer perceptron neural network; SVM: support vector machine; RF: random forest.
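The accuracy measure in this table is Pearson's correlation coefficient between observed and predicted counts. A minimal sketch of that computation is given below; the function name `pearson_r` and the sample counts are hypothetical, and the fitting of the OLS, MLP, SVM and RF models themselves is not shown.

```python
import math

def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Illustrative observed and predicted non-commuting counts; not study data.
observed = [3, 10, 25, 40, 8]
predicted = [5, 9, 22, 43, 10]
r = pearson_r(observed, predicted)
print(f"Pearson's r = {r:.3f}")
```

A value of r close to 1 indicates that predictions rise and fall with the observations, although it does not by itself show that the predicted counts are unbiased.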
Figure 7. Predicted and observed number of non-commuting activities.