Anette Hulth, Gustaf Rydevik, Annika Linde.
Abstract
In the field of syndromic surveillance, various sources are exploited for outbreak detection, monitoring and prediction. This paper describes a study on queries submitted to a medical web site, with influenza as a case study. The hypothesis of the work was that queries on influenza and influenza-like illness would provide a basis for estimating the timing of the peak and the intensity of the yearly influenza outbreaks that would be as good as the existing laboratory and sentinel surveillance. We calculated the occurrence of various queries related to influenza from search logs submitted to a Swedish medical web site for two influenza seasons. These figures were subsequently used to generate two models, one to estimate the number of laboratory verified influenza cases and one to estimate the proportion of patients with influenza-like illness reported by selected General Practitioners in Sweden. We applied partial least squares regression, an approach designed for highly correlated data. In our work, we found that certain web queries on influenza follow the same pattern as that obtained by the two other surveillance systems for influenza epidemics, and that they have equal power for the estimation of the influenza burden in society. Web queries give unique access to ill individuals who are not (yet) seeking care. This paper shows the potential of web queries as an accurate, cheap and labour extensive source for syndromic surveillance.
Year: 2009 PMID: 19197389 PMCID: PMC2634970 DOI: 10.1371/journal.pone.0004378
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of investigated queries, with genuine examples in Swedish (one complete query per line) in addition to their English translations.
| Query type | Explanation | Examples in Swedish | Examples translated to English | Total number of queries (% of all queries) |
| Influenza | A one word query containing “influenza”. | (the only matching query) | | 5,745 (0.14) |
| Influenza in complex search | The word “influenza” as a single word in combination with something else. | | | 1,366 (0.03) |
| Influenza as one word | The word “influenza” is present as a single word either as the only term or in a multi word query. | | | 7,111 (0.17) |
| Influenza compound | The word “influenza” is part of a compound. | | | 7,054 (0.17) |
| Influenza and more | The query contains “influenza” in any constellation. | | | 14,165 (0.34) |
| Cleaned influenza | The query contains “influenza”, but queries matching “bird”, “stomach flu” or “vaccination” are removed. | | | 7,072 (0.17) |
| Stomach flu | The query contains “stomach flu”. | | | 2,730 (0.06) |
| Influenza and symptom | The query contains two or more ILI symptoms and possibly “influenza”. | | | 95 (0.00) |
| ILI | The query matches the given ILI definition; may also contain other words. | | | 448 (0.01) |
| More than one ILI symptom | The query contains at least two ILI symptoms, regardless of the definition, without other terms. | | | 289 (0.01) |
| Cough | A one word query containing “cough”. | (the only matching query) | | 5,646 (0.13) |
| Sore throat | A one word query containing “sore throat”. | (the only matching query) | | 3,620 (0.09) |
| Shortness of breath | A one word query containing “shortness of breath”. | (the only matching query) | | 186 (0.00) |
| Coryza | A one word query containing “coryza”. | (the only matching query) | | 385 (0.01) |
| Fever | A one word query containing “fever”. | (the only matching query) | | 9,338 (0.22) |
| Headache | A one word query containing “headache”. | (the only matching query) | | 4,575 (0.11) |
| Myalgia | A one word query containing “myalgia”. | (the only matching query) | | 385 (0.01) |
| Cough and more | The query contains “cough” in any constellation. | | | 1,037 (0.02) |
| Fever and more | The query contains “fever” in any constellation. | | | 11,128 (0.26) |
| Cold | A one word query containing “cold”. | (the only matching query) | | 32,156 (0.76) |
The table also shows the total number of queries for the two seasons, as well as the percentage of queries matching the query type.
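The word-level distinctions in the table (a single word, a complex search, a compound, any constellation) can be illustrated with a minimal Python sketch. The function name `influenza_query_types`, the whitespace tokenisation, and the English stand-in term are assumptions made for illustration; the study matched Swedish queries, and its exact matching rules are not given in this section.

```python
TERM = "influenza"  # assumption: English stand-in for the Swedish search term


def influenza_query_types(query, term=TERM):
    """Return the influenza-related query types (names from the table) a query falls under.

    Hypothetical helper: tokenises on whitespace only and ignores punctuation.
    """
    words = query.lower().split()
    has_word = term in words                                     # term as a standalone word
    has_compound = any(term in w and w != term for w in words)   # term inside a longer word

    types = set()
    if words == [term]:
        types.add("Influenza")                      # one-word query
    elif has_word:
        types.add("Influenza in complex search")    # standalone word plus something else
    if has_word:
        types.add("Influenza as one word")
    if has_compound:
        types.add("Influenza compound")
    if has_word or has_compound:
        types.add("Influenza and more")             # any constellation
    return types
```

Under this reading, "Influenza as one word" is the union of the first two types and "Influenza and more" additionally includes compounds. The counts in the table are consistent with that: 5,745 + 1,366 = 7,111 and 7,111 + 7,054 = 14,165.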
Figure 1. A flow-chart of the statistical analysis.
Figure 2. An overview of the sentinel and the laboratory data for the two investigated influenza seasons (2005/2006 and 2006/2007).
Figure 3. The number of queries matching the selected query types plotted over time.
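A time series of matching queries such as the one plotted in Figure 3 could be derived from a search log roughly as follows. This is a sketch under stated assumptions: the log is taken to be a list of `(date, query)` pairs, counts are aggregated per ISO week (the time granularity is not stated in this section), and `weekly_counts` and the `matches` predicate are hypothetical names.

```python
from collections import Counter
from datetime import date


def weekly_counts(log, matches):
    """Count log entries whose query satisfies `matches`, keyed by (ISO year, ISO week)."""
    counts = Counter()
    for day, query in log:
        if matches(query):
            iso = day.isocalendar()          # (ISO year, ISO week, ISO weekday)
            counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


# Toy log, invented for illustration only (not data from the study).
toy_log = [
    (date(2006, 1, 2), "influenza"),
    (date(2006, 1, 8), "influenza symptoms"),
    (date(2006, 1, 9), "cough"),
    (date(2006, 1, 10), "influenza"),
]
influenza_by_week = weekly_counts(toy_log, lambda q: "influenza" in q.lower().split())
```

One such series per query type gives the set of predictor variables that the regression models take as input.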
R2 values and mean predictive errors using leave-one-out cross validation for all generated models for the sentinel and the laboratory models respectively.
| Number of components in model | R2 (sentinel model) | R2 (laboratory model) | Mean predictive error (sentinel model) | Mean predictive error (laboratory model) |
| 1 | 0.76 | 0.78 | 0.12 | 26.43 |
| 2 | 0.84 | 0.84 | 0.09 | 22.14 |
| 3 | 0.88 | 0.88 | 0.08 | 20.00 |
| 4 | | | | |
| 5 | 0.90 | 0.90 | 0.08 | 19.64 |
| 6 | 0.90 | 0.91 | 0.09 | 20.26 |
| 7 | 0.90 | 0.91 | 0.09 | 19.61 |
| 8 | 0.90 | 0.92 | 0.09 | 20.12 |
| 9 | 0.90 | 0.92 | 0.09 | 20.79 |
| 10 | 0.90 | 0.92 | 0.09 | 21.14 |
| 11 | 0.90 | 0.92 | 0.10 | 21.35 |
| 12 | 0.90 | 0.92 | 0.10 | 21.17 |
| 13 | 0.90 | 0.92 | 0.10 | 21.03 |
| 14 | 0.90 | 0.92 | 0.10 | 20.97 |
| 15 | 0.90 | 0.92 | 0.10 | 21.48 |
| 16 | 0.90 | 0.92 | 0.10 | 21.38 |
| 17 | 0.90 | 0.92 | 0.10 | 21.32 |
| 18 | 0.90 | 0.92 | 0.10 | 21.27 |
| 19 | 0.90 | 0.92 | 0.10 | 21.50 |
| 20 (all) | 0.90 | 0.92 | 0.10 | 21.50 |
The values for the chosen models (using 4 components) are marked in bold.
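The selection procedure summarised in the table, fitting partial least squares models with an increasing number of components and comparing their leave-one-out errors, can be sketched in plain Python. This is an illustrative single-response PLS (the NIPALS variant often called PLS1), not the authors' code. The function names, the use of mean absolute error as the "mean predictive error", and the toy data are assumptions, and the predictors are only centred here, not scaled.

```python
from math import sqrt


def fit_pls1(rows, y, n_components):
    """Fit a single-response PLS model by NIPALS-style deflation (illustrative)."""
    n, p = len(rows), len(rows[0])
    x_mean = [sum(r[j] for r in rows) / n for j in range(p)]
    y_mean = sum(y) / n
    X = [[r[j] - x_mean[j] for j in range(p)] for r in rows]   # centred predictors
    res = [v - y_mean for v in y]                              # centred response
    comps = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between X and the residual.
        w = [sum(X[i][j] * res[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(X[i][j] * w[j] for j in range(p)) for i in range(n)]   # scores
        tt = sum(v * v for v in t)
        load = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(res[i] * t[i] for i in range(n)) / tt
        # Deflate X and the response before extracting the next component.
        X = [[X[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        res = [res[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def predict_pls1(model, row):
    """Predict one response value by replaying the deflation steps on a new row."""
    x_mean, y_mean, comps = model
    x = [v - m for v, m in zip(row, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(a * b for a, b in zip(x, w))
        y_hat += q * t
        x = [a - t * b for a, b in zip(x, load)]
    return y_hat


def loo_mean_abs_error(rows, y, n_components):
    """Leave-one-out cross validation: refit without each observation, then predict it."""
    errors = []
    for i in range(len(rows)):
        model = fit_pls1(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:], n_components)
        errors.append(abs(predict_pls1(model, rows[i]) - y[i]))
    return sum(errors) / len(errors)


# Toy data, invented for illustration: y is an exact linear function of two predictors.
X_toy = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 3.0], [3.0, 2.0], [4.0, 0.0]]
y_toy = [2 * a + 3 * b + 1 for a, b in X_toy]
errors_by_components = {k: loo_mean_abs_error(X_toy, y_toy, k) for k in (1, 2)}
```

Comparing the entries of `errors_by_components` mirrors reading down the error columns of the table: a small number of components is preferred once additional components no longer reduce the cross-validated error.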
Figure 4. Observed values (black), predicted values from the full models (red, with circles), and predicted values using a model fitted on data from the opposite season (blue, with squares) for the model predicting sentinel values and for the model predicting laboratory values.
Figure 5. The relative contribution of the scaled and centred queries to each component in the model predicting sentinel values.
Figure 6. The relative contribution of the scaled and centred queries to each component in the model predicting laboratory values.