David J McIver, John S Brownstein.
Abstract
Circulating levels of both seasonal and pandemic influenza require constant surveillance to ensure the health and safety of the population. While up-to-date information is critical, traditional surveillance systems can have data availability lags of up to two weeks. We introduce a novel method of estimating, in near-real time, the level of influenza-like illness (ILI) in the United States (US) by monitoring the rate of particular Wikipedia article views on a daily basis. We calculated the number of times certain influenza- or health-related Wikipedia articles were accessed each day between December 2007 and August 2013 and compared these data to official ILI activity levels provided by the Centers for Disease Control and Prevention (CDC). We developed a Poisson model that accurately estimates the level of ILI activity in the American population, up to two weeks ahead of the CDC, with an absolute average difference between the two estimates of just 0.27% over 294 weeks of data. Wikipedia-derived ILI models performed well through both abnormally high media coverage events (such as during the 2009 H1N1 pandemic) as well as unusually severe influenza seasons (such as the 2012-2013 influenza season). Wikipedia usage accurately estimated the week of peak ILI activity 17% more often than Google Flu Trends data and was often more accurate in its measure of ILI intensity. With further study, this method could potentially be implemented for continuous monitoring of ILI activity in the US and to provide support for traditional influenza surveillance tools.
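To illustrate the kind of Poisson model the abstract describes, the following is a minimal sketch of a log-linear Poisson regression fitted by Newton-Raphson. It assumes a single predictor (weekly views of one article) rather than the many article-view predictors the study considered, and the function name `fit_poisson` and all data values are hypothetical, chosen only for illustration. It is not the authors' implementation.

```python
import math

def fit_poisson(x, y, iters=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    # Start from the intercept-only fit: mu equals the mean of y for every week.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical weekly data (not from the study):
# article views in thousands, and ILI visits per 1,000 patient visits.
views = [1.0, 1.4, 2.1, 2.9, 3.6, 4.2, 3.1, 2.0]
ili_counts = [12, 14, 19, 27, 36, 44, 30, 18]

b0, b1 = fit_poisson(views, ili_counts)
fitted = [math.exp(b0 + b1 * v) for v in views]
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

A positive slope here means that weeks with more article views are associated with higher estimated ILI activity. At the maximum-likelihood fit, the fitted values sum to the observed total, which is a quick sanity check on convergence.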
Year: 2014 PMID: 24743682 PMCID: PMC3990502 DOI: 10.1371/journal.pcbi.1003581
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
List of Wikipedia articles selected for investigation for inclusion in ILI estimation models.
| Avian influenza | Influenza Virus B |
| Centers for Disease Control and Prevention | Influenza Virus C |
| Common Cold | Influenza Virus Subtype H1N1 |
| Epidemic | Influenza Virus Subtype H2N2 |
| European Centers for Disease Control and Prevention | Influenza Virus Subtype H2N9 |
| Fever | Influenza Virus Subtype H3N1 |
| Flu Season | Influenza Virus Subtype H3N2 |
| Human Influenza | Influenza Virus Subtype H5N1 |
| Influenza | Influenza Virus Subtype H5N2 |
| Influenza-like Illness | Oseltamivir |
| Influenza Pandemic | Pandemic |
| Influenza Research | Swine Influenza |
| Influenza Treatment | Tamiflu |
| Influenza Vaccine | Vaccine |
| Influenza Virus | Wikipedia Main Page |
| Influenza Virus A | 1918 Flu Pandemic |
*Only terms with an asterisk were included in the Lasso regression model.
Figure 1. Time series plot of CDC ILI data versus estimated ILI data.
(A) Wikipedia Full Model (Mf) accurately estimated 3 out of 6 ILI activity peaks and had a mean absolute difference of 0.27% compared to CDC ILI data. (B) Wikipedia Lasso Model (Ml) accurately estimated 2 out of 6 ILI activity peaks and had a mean absolute difference of 0.29% compared to CDC ILI data. (C) Google Flu Trends (GFT) model accurately estimated 2 out of 6 ILI activity peaks and had a mean absolute difference of 0.42% compared to CDC ILI data.
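The mean absolute difference quoted for each panel can be computed as in the sketch below. The function name `mean_absolute_difference` and the weekly values are hypothetical placeholders, not data from the study; the weekly series behind the figure are not reproduced here.

```python
def mean_absolute_difference(estimated, observed):
    """Average of |estimate - observation| over paired weekly values."""
    if not estimated or len(estimated) != len(observed):
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(estimated)

# Hypothetical weekly ILI percentages (illustration only).
cdc_ili = [1.8, 2.4, 3.1, 2.7]
model_ili = [2.0, 2.1, 3.5, 2.6]

mad = mean_absolute_difference(model_ili, cdc_ili)
print(f"mean absolute difference: {mad:.2f} percentage points")
```

Because ILI activity is itself a percentage, the result is expressed in percentage points, which is how figures such as 0.27% in the caption should be read.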
Comparisons of CDC, Mf, Ml, and GFT peak ILI values.
| Influenza Season | Source | Year | Week | ILI Value | Referent CDC ILI Value* | Difference from CDC ILI Value (percentage points) | Peak Agrees with CDC |
| 2007-2008 | CDC | 2008 | 7 | 5.98 | | | |
| 2007-2008 | Mf | 2008 | 8 | 4.94 | 5.62 | −0.68 | N |
| 2007-2008 | Ml | 2008 | 7 | 4.43 | 5.98 | −1.55 | Y |
| 2007-2008 | GFT | 2008 | 8 | 5.81 | 5.62 | 0.19 | N |
| 2008-2009 | CDC | 2009 | 7 | 3.57 | | | |
| 2008-2009 | Mf | 2009 | 12 | 3.48 | 2.43 | 1.05 | N |
| 2008-2009 | Ml | 2009 | 12 | 3.33 | 2.43 | 0.90 | N |
| 2008-2009 | GFT | 2009 | 8 | 3.50 | 3.37 | 0.13 | N |
| 2009-2010 | CDC | 2009 | 43 | 7.72 | | | |
| 2009-2010 | Mf | 2009 | 43 | 8.36 | 7.72 | 0.64 | Y |
| 2009-2010 | Ml | 2009 | 44 | 8.66 | 7.55 | 1.11 | N |
| 2009-2010 | GFT | 2009 | 43 | 7.11 | 7.72 | −0.61 | Y |
| 2010-2011 | CDC | 2011 | 6 | 4.55 | | | |
| 2010-2011 | Mf | 2011 | 6 | 5.84 | 4.55 | 1.29 | Y |
| 2010-2011 | Ml | 2011 | 6 | 5.73 | 4.55 | 1.18 | Y |
| 2010-2011 | GFT | 2011 | 6 | 4.08 | 4.55 | −0.47 | Y |
| 2011-2012 | CDC | 2012 | 10 | 2.39 | | | |
| 2011-2012 | Mf | 2012 | 7 | 2.68 | 2.24 | 0.44 | N |
| 2011-2012 | Ml | 2012 | 7 | 2.85 | 2.24 | 0.61 | N |
| 2011-2012 | GFT | 2011 | 52 | 2.86 | 1.74 | 1.12 | N |
| 2012-2013 | CDC | 2012 | 51 | 6.07 | | | |
| 2012-2013 | Mf | 2012 | 51 | 5.31 | 6.07 | −0.76 | Y |
| 2012-2013 | Ml | 2012 | 52 | 5.40 | 4.65 | 0.75 | N |
| 2012-2013 | GFT | 2013 | 2 | 10.56 | 4.52 | 6.04 | N |
ILI: Influenza-like illness, CDC: Centers for Disease Control and Prevention.
Mf: Full model, Ml: Lasso model, GFT: Google Flu Trends.
*Referent values are CDC ILI values for the corresponding week of the estimated ILI peak for Mf, Ml, and GFT.
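The peak-agreement counts can be checked directly from the peak weeks listed in the table. The sketch below assumes that the rows within each season are ordered CDC, Mf, Ml, GFT, and that the CDC peak for the fourth season is week 6 of 2011 (the week that the agreement flags point to). The function name `count_peak_agreement` is hypothetical.

```python
# Peak (year, week) per season, transcribed from the table above.
# Assumption: rows within each season are ordered CDC, Mf, Ml, GFT,
# and the CDC peak for the fourth season is week 6 of 2011.
peaks = [
    {"CDC": (2008, 7), "Mf": (2008, 8), "Ml": (2008, 7), "GFT": (2008, 8)},
    {"CDC": (2009, 7), "Mf": (2009, 12), "Ml": (2009, 12), "GFT": (2009, 8)},
    {"CDC": (2009, 43), "Mf": (2009, 43), "Ml": (2009, 44), "GFT": (2009, 43)},
    {"CDC": (2011, 6), "Mf": (2011, 6), "Ml": (2011, 6), "GFT": (2011, 6)},
    {"CDC": (2012, 10), "Mf": (2012, 7), "Ml": (2012, 7), "GFT": (2011, 52)},
    {"CDC": (2012, 51), "Mf": (2012, 51), "Ml": (2012, 52), "GFT": (2013, 2)},
]

def count_peak_agreement(peaks, reference="CDC"):
    """Count, per source, the seasons whose peak week matches the reference."""
    counts = {}
    for season in peaks:
        for source, peak in season.items():
            if source == reference:
                continue
            # A bool adds as 0 or 1, so this increments only on a match.
            counts[source] = counts.get(source, 0) + (peak == season[reference])
    return counts

agreement = count_peak_agreement(peaks)
for source, hits in agreement.items():
    print(f"{source}: {hits} of {len(peaks)} peaks agree with CDC")
```

Under these assumptions the counts come out as 3 of 6 for Mf and 2 of 6 for both Ml and GFT, matching the Figure 1 caption. The gap between 3/6 and 2/6 is about 17 percentage points, which is consistent with the abstract's statement that Wikipedia usage identified the peak week 17% more often than Google Flu Trends.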