Samantha Cook, Corrie Conrad, Ashley L Fowlkes, Matthew H Mohebbi.
Abstract
BACKGROUND: Google Flu Trends (GFT) uses anonymized, aggregated internet search activity to provide near-real-time estimates of influenza activity. GFT estimates have shown a strong correlation with official influenza surveillance data. The 2009 influenza virus A (H1N1) pandemic (pH1N1) provided the first opportunity to evaluate GFT during a non-seasonal influenza outbreak. In September 2009, an updated United States GFT model was developed using data from the beginning of pH1N1.
Year: 2011 PMID: 21886802 PMCID: PMC3158788 DOI: 10.1371/journal.pone.0023610
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Comparison of relative query category volume in original and updated United States GFT models.
| Query Category | Sample Query | Original Model Relative Category Volume | Updated Model Relative Category Volume |
|---|---|---|---|
| Symptoms of an influenza complication | [symptoms of bronchitis] | 6% | 11% |
| Influenza complication | [pnumonia]* | 42% | 6% |
| Specific influenza symptom | [fever] | 6% | 39% |
| General influenza symptoms | [early signs of the flu] | 2% | 30% |
| Cold/flu remedy | [robitussin] | 12% | 4% |
| Term for influenza | [influenza a] | <1% | 3% |
| Antibiotic medication | [amoxicillin] | 12% | 0% |
| Related disease | [strep throat] | 16% | <1% |
*Search users often misspell the word pneumonia.
Figure 1. Time series plots of ILINet data and original and updated GFT estimates.
A) ILINet data and GFT estimates from 2009. B) ILINet data and GFT estimates for the entire period for which GFT estimates are available: 2003–2009.
Figure 2. Time series plots of ILINet data and category-level GFT estimates.
Category-level estimates are created by applying the GFT methodology to a subset of the queries in a given model. A) ILINet data and GFT estimates based on original model queries related to influenza complications. B) ILINet data and GFT estimates based on updated model queries related to specific influenza symptoms.
Figure 3. Time series plots of ILINet data and query-level GFT estimates.
Query-level estimates are created by applying the GFT methodology to the search activity for a single query. A) ILINet data and GFT estimates based on the query [symptoms of flu]. B) ILINet data and GFT estimates based on the query [symptoms of bronchitis]. C) ILINet data and GFT estimates based on the query [symptoms of pneumonia].
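The captions do not spell out the GFT methodology. Assuming it refers to the model described in the original GFT publication (a linear fit between the log-odds of the ILI visit rate and the log-odds of the ILI-related query fraction), category-level and query-level estimates could be sketched as below. All function names, query counts, and ILI values here are hypothetical and synthetic; this is not the authors' implementation.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Inverse of logit: maps a real number back to a proportion."""
    return 1.0 / (1.0 + math.exp(-z))


def subset_fraction(counts_by_query, total_searches, selected):
    """Weekly fraction of all searches that match the selected queries."""
    weeks = range(len(total_searches))
    return [
        sum(counts_by_query[q][w] for q in selected) / total_searches[w]
        for w in weeks
    ]


def fit_logit_linear(query_fraction, ili_rate):
    """Least-squares fit of logit(ili) = intercept + slope * logit(query)."""
    x = [logit(q) for q in query_fraction]
    y = [logit(p) for p in ili_rate]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return intercept, slope


def estimate(query_fraction, intercept, slope):
    """Estimated ILI proportion for each week's query fraction."""
    return [inv_logit(intercept + slope * logit(q)) for q in query_fraction]


# Synthetic weekly counts (NOT real search data).
counts = {
    "symptoms of flu": [120, 180, 260, 400, 310],
    "symptoms of bronchitis": [90, 100, 130, 170, 150],
    "fever": [300, 340, 420, 600, 500],
}
total = [1_000_000] * 5
ili = [0.012, 0.016, 0.022, 0.035, 0.028]  # synthetic ILI proportions

# Query-level estimate: the subset is a single query.
single = subset_fraction(counts, total, ["symptoms of flu"])
b0, b1 = fit_logit_linear(single, ili)
query_level = estimate(single, b0, b1)

# Category-level estimate: the subset is every query in one category.
category = subset_fraction(
    counts, total, ["symptoms of flu", "symptoms of bronchitis"]
)
c0, c1 = fit_logit_linear(category, ili)
category_level = estimate(category, c0, c1)
```

The only difference between the two levels in this sketch is which queries are summed before the fit, which mirrors the captions' description of applying the same methodology to a subset of queries or to a single query.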
Correlation and RMSE between United States Google Flu Trends estimates and ILINet data.
| | Pre-pH1N1 (September 2003–March 2009) | pH1N1 Overall* (March 2009–December 2009) | pH1N1 Wave 1 (March 2009–August 2009) | pH1N1 Wave 2 (August 2009–December 2009) |
|---|---|---|---|---|
| Correlation | | | | |
| Original Model | 0.906 | 0.912 | 0.290 | 0.916 |
| Updated Model | 0.942 | 0.989 | 0.945 | 0.985 |
| RMSE | | | | |
| Original Model | 0.006 | 0.018 | 0.008 | 0.023 |
| Updated Model | 0.005 | 0.005 | 0.001 | 0.007 |
*The overall correlation during pH1N1 is not an average of the Waves 1 and 2 correlations. The range of ILI rates was larger in Wave 2 than in Wave 1, causing the Wave 2 data to contribute more than the Wave 1 data to the overall correlation during pH1N1.
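The footnote's point, and the two metrics reported in the table, can be illustrated with a small sketch. The values below are synthetic and chosen only to mimic a narrow-range first wave and a wide-range second wave; they are not data from the study, and the helper names `pearson` and `rmse` are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root mean squared error between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Synthetic ILI proportions and estimates (NOT data from the study).
# Wave 1: narrow range of ILI rates, estimates track them poorly.
wave1_ili = [0.010, 0.012, 0.011, 0.013, 0.012, 0.014]
wave1_est = [0.012, 0.011, 0.014, 0.012, 0.015, 0.014]
# Wave 2: wide range of ILI rates, estimates track them closely.
wave2_ili = [0.020, 0.035, 0.050, 0.070, 0.055, 0.030]
wave2_est = [0.022, 0.033, 0.052, 0.066, 0.057, 0.031]

r1 = pearson(wave1_ili, wave1_est)
r2 = pearson(wave2_ili, wave2_est)
# Overall correlation is computed on the pooled series, not by averaging.
r_all = pearson(wave1_ili + wave2_ili, wave1_est + wave2_est)
mean_of_waves = (r1 + r2) / 2

print(f"wave 1 r={r1:.3f}, wave 2 r={r2:.3f}")
print(f"pooled r={r_all:.3f}, mean of wave r values={mean_of_waves:.3f}")
print(f"pooled RMSE={rmse(wave1_ili + wave2_ili, wave1_est + wave2_est):.4f}")
```

Because the pooled series spans a much wider range of ILI rates than the first wave alone, the wide-range points dominate the pooled variance, so the pooled correlation sits close to the second-wave value rather than midway between the two.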
Category-level query volume before and during the pH1N1 pandemic in the updated United States GFT model.
| Query Category | Pre-pH1N1 Relative Category Volume | Wave 1 pH1N1 Relative Category Volume | Wave 2 pH1N1 Relative Category Volume |
|---|---|---|---|
| Specific influenza symptom | 39% | 28% | 20% |
| General influenza symptoms | 30% | 28% | 38% |
| Term for influenza | 3% | 11% | 8% |
| Symptoms of an influenza complication | 11% | 15% | 15% |
| Influenza complication | 6% | 6% | 4% |
| Related disease | <1% | <1% | <1% |
| Cold/flu remedy | 4% | 3% | 4% |