Sophie Cowie, Rudy Arthur, Hywel T P Williams.
Abstract
Allergic rhinitis (hayfever) affects a large proportion of the population in the United Kingdom. Although relatively easily treated with medication, symptoms nonetheless have a substantial adverse effect on wellbeing during the summer pollen season. Provision of accurate pollen forecasts can help sufferers to manage their condition and minimise adverse effects. Current pollen forecasts in the UK are based on a sparse network of pollen monitoring stations. Here, we explore the use of "social sensing" (analysis of unsolicited social media content) as an alternative source of pollen and hayfever observations. We use data from the Twitter platform to generate a dynamic spatial map of pollen levels based on user reports of hayfever symptoms. We show that social sensing alone creates a spatiotemporal pollen measurement with remarkable similarity to measurements taken from the established physical pollen monitoring network. This demonstrates that social sensing of pollen can be accurate, relative to current methods, and suggests a variety of future applications of this method to help hayfever sufferers manage their condition.
Keywords: crowdsourcing; hayfever; pollen; social media; social sensing
Year: 2018 PMID: 30558222 PMCID: PMC6308444 DOI: 10.3390/s18124434
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Time-series of total UK daily pollen counts (y-axis), grouped by vegetation type. Plot shows: tree pollen (blue), weed pollen (orange), grass pollen (green).
Figure 2. Time-series of various data sources for monitoring pollen and hayfever, aggregated weekly and normalised by their maximum value over the time period. Plot shows: normalised Twitter activity (t, blue), Google Trends web search activity (g, orange), Met Office web traffic (mw, green), counts for all pollen (mo, solid red) and grass pollen (mo, dashed red). Correlations, r, are given for each data source pair; pollen count correlations are for all pollen, with correlations for grass pollen only given in brackets.
Figure 3. Pollen monitoring station locations (red dots), with green circles indicating the 50 km radius around each station.
Pearson’s r for the relationship between the number of tweets within a 50 km radius of each pollen observation station and the pollen counts observed at that station, for all pollen types and for grass pollen only. Text colour indicates p-value (Red: , Blue: , Black: ).
| Station | Correlation Between GRASS Pollen Counts and Tweets | Correlation Between ALL Pollen Counts and Tweets | No. of Tweets in 50 km Radius of Station |
|---|---|---|---|
| York | 0.9 | 0.76 | 637 |
| Beverley | 0.88 | 0.88 | 271 |
| Chester | 0.87 | 0.69 | 617 |
| Leicester | 0.44 | 0.07 | 790 |
| Worcester | 0.83 | 0.83 | 814 |
| Ipswich | 0.66 | 0.47 | 164 |
| London | 0.59 | 0.59 | 2325 |
| Cardiff | 0.88 | 0.55 | 544 |
| Plymouth | 0.8 | 0.53 | 217 |
| Exeter | 0.68 | 0.52 | 235 |
| Wight | 0.84 | 0.82 | 405 |
| MEAN | 0.76 | 0.61 | 638.1 |
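
The per-station correlations above can be illustrated with a minimal sketch. It assumes tweets are available as (period, latitude, longitude) records and that the 50 km radius is measured as great-circle (haversine) distance; the record does not state how distance was computed or which time bin was used, so both are assumptions. All function names and the example coordinates are hypothetical and are not taken from the paper.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def tweet_counts_near_station(tweets, station, periods, radius_km=50.0):
    """Count tweets per period that fall within radius_km of a station.

    tweets  : iterable of (period, lat, lon) tuples (hypothetical layout)
    station : (lat, lon) of the monitoring station
    periods : ordered list of period labels (e.g. day or week indices)
    """
    counts = {p: 0 for p in periods}
    for period, lat, lon in tweets:
        if period in counts and haversine_km(lat, lon, station[0], station[1]) <= radius_km:
            counts[period] += 1
    return [counts[p] for p in periods]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Under these assumptions, one row of the table would correspond to calling `tweet_counts_near_station` for a station and passing the result, together with that station's pollen counts over the same periods, to `pearson_r`.
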
Model performance with/without Twitter data. Table shows R² (coefficient of determination) and root mean squared error (RMSE) for Models 1, 2, 3, 1T and 2T. All scores are averaged from leave-one-out validation tests across all 11 pollen monitoring stations.
| Model | R² | RMSE |
|---|---|---|
| Model 1 | 0.810 | 0.115 |
| Model 2 | 0.847 | 0.099 |
| Model 3 | 0.745 | 0.188 |
| Model 1T | 0.858 | 0.100 |
| Model 2T | 0.858 | 0.096 |
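
The scoring procedure described in the caption (R² and RMSE averaged over leave-one-out tests across stations) can be sketched as follows. This is a generic illustration only: the record does not describe the models themselves, so `fit` and `predict` are hypothetical callables supplied by the caller, the data layout is assumed, and R² is computed here with the common 1 − SS_res/SS_tot definition, which may differ from the definition used by the authors.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, using 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations are constant")
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def leave_one_station_out(stations, fit, predict):
    """Average R^2 and RMSE over tests that each hold out one station.

    stations : dict mapping station name -> (features, observed) (hypothetical layout)
    fit      : callable(training_dict) -> model            (hypothetical)
    predict  : callable(model, features) -> list of floats (hypothetical)
    """
    r2_scores, rmse_scores = [], []
    for held_out in stations:
        # Train on every station except the one being tested.
        training = {name: data for name, data in stations.items() if name != held_out}
        model = fit(training)
        features, observed = stations[held_out]
        predicted = predict(model, features)
        r2_scores.append(r_squared(observed, predicted))
        rmse_scores.append(rmse(observed, predicted))
    n = len(stations)
    return sum(r2_scores) / n, sum(rmse_scores) / n
```

With 11 stations, this loop would run 11 times, each time scoring predictions for the single held-out station, and the two returned values would correspond to one row of the table.
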