Justin R Ortiz, Hong Zhou, David K Shay, Kathleen M Neuzil, Ashley L Fowlkes, Christopher H Goss.
Abstract
BACKGROUND: Google Flu Trends was developed to estimate US influenza-like illness (ILI) rates from internet searches; however, ILI does not necessarily correlate with actual influenza virus infections. METHODS AND
Year: 2011 PMID: 21556151 PMCID: PMC3083406 DOI: 10.1371/journal.pone.0018687
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. United States Influenza Surveillance by Google Flu Trends (1), CDC Influenza-like Illness Surveillance (2), and CDC Influenza Virologic Surveillance (3), June 29, 2003 through May 31, 2008 (4).
(1) Google Flu Trends estimates the percentage of persons seeking health care for the non-specific complaint of influenza-like illness (ILI) based on internet keyword searches. (2) CDC Influenza-like Illness Surveillance involves a network of health care providers who record the weekly proportion of patients seen with ILI. Google Flu Trends was created and validated using CDC ILI Surveillance data, explaining the similarity between the two curves. (3) CDC Influenza Virologic Surveillance consists of about 140 laboratories located throughout the United States that report the weekly total specimens tested and laboratory tests positive for influenza virus. This is the only US surveillance system that provides national and regional data of laboratory-confirmed influenza virus infection. (4) Because CDC surveillance is intensified from calendar week 40 through calendar week 20 of the subsequent year, we restricted our correlation analyses to this time period.
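The restriction to calendar weeks 40 through 20 described in the Figure 1 notes can be sketched as a simple filter. This is a minimal illustration rather than the authors' code: the function name `in_surveillance_window` and the `(year, week, value)` record layout are hypothetical, the values are made up, and week numbers are assumed to have been assigned already by the surveillance system.

```python
def in_surveillance_window(week: int) -> bool:
    """True for calendar week 40 through year end, and for weeks 1 through 20."""
    return week >= 40 or week <= 20


# Hypothetical weekly records: (year, calendar week, value). Values are invented.
records = [
    (2003, 39, 0.9),
    (2003, 40, 1.1),
    (2004, 1, 3.2),
    (2004, 20, 0.8),
    (2004, 21, 0.7),
]

# Keep only the weeks that fall inside the intensified surveillance period.
kept = [r for r in records if in_surveillance_window(r[1])]
```

Because the window wraps around the new year, the test is a disjunction rather than a single range check.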
Figure 2. Scatter Plot of Google Flu Trends and CDC Influenza Laboratory Surveillance; September 28, 2003 through May 17, 2008.
1. Data Sources: a. US Influenza Virologic Surveillance System (http://www.cdc.gov/flu/weekly/fluactivity.htm); and b. Google Flu Trends (http://www.google.org/about/flutrends/us-historic.txt). 2. There are 166 total observations in each panel. Influential observations were defined as those whose DFBETA statistic exceeded, in absolute value, 2 divided by the square root of the total number of observations in a simple linear regression model [17]. 3. R_A represents Pearson's correlation coefficients calculated from comparisons of Google Flu Trends with US Influenza Virologic Surveillance. 4. R'_A represents the Pearson's correlation coefficients calculated after exclusion of all influential observations. 5. Because CDC surveillance is intensified from calendar week 40 through calendar week 20 of the subsequent year, we restricted our correlation analyses to this time period.
Figure 3. Scatter Plot of CDC ILI Surveillance and CDC Influenza Laboratory Surveillance; September 28, 2003 through May 17, 2008.
1. Data Sources: a. Outpatient Influenza-like Illness Surveillance Network (http://www.cdc.gov/flu/weekly/fluactivity.htm); and b. US Influenza Virologic Surveillance System (http://www.cdc.gov/flu/weekly/fluactivity.htm). 2. For CDC influenza surveillance, Influenza-like Illness (ILI) is defined as a fever ≥37.8°C and a cough and/or a sore throat without known etiology [22]. 3. There are 166 total observations in each panel. Influential observations were defined as those whose DFBETA statistic exceeded, in absolute value, 2 divided by the square root of the total number of observations in a simple linear regression model [17]. 4. R_B represents Pearson's correlation coefficients calculated from comparisons of Outpatient Influenza-like Illness Surveillance with US Influenza Virologic Surveillance. 5. R'_B represents the Pearson's correlation coefficients calculated after exclusion of all influential observations. 6. Because CDC surveillance is intensified from calendar week 40 through calendar week 20 of the subsequent year, we restricted our correlation analyses to this time period.
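The influence criterion in the Figure 2 and Figure 3 notes can be illustrated with a small leave-one-out sketch. It is not the authors' implementation. It assumes the scaled form of the statistic (often written DFBETAS), to which the 2/sqrt(n) cutoff is conventionally applied, and it assumes the coefficient of interest is the regression slope. The function names and sample data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, sxx, rss)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, sxx, rss


def slope_dfbetas(x, y):
    """Scaled change in the slope when each observation is deleted in turn."""
    n = len(x)
    _, b_full, sxx_full, _ = fit_line(x, y)
    out = []
    for i in range(n):
        xs = x[:i] + x[i + 1:]
        ys = y[:i] + y[i + 1:]
        _, b_i, _, rss_i = fit_line(xs, ys)
        # Residual standard deviation of the fit without observation i
        # (n - 1 points, 2 estimated parameters).
        s_i = math.sqrt(rss_i / (n - 3))
        out.append((b_full - b_i) / (s_i / math.sqrt(sxx_full)))
    return out


def influential(x, y):
    """Indices whose |DFBETAS| for the slope exceeds 2 / sqrt(n)."""
    cutoff = 2 / math.sqrt(len(x))
    return [i for i, d in enumerate(slope_dfbetas(x, y)) if abs(d) > cutoff]


# Hypothetical data: roughly y = 2x + 1, with one aberrant final point.
x = [float(i) for i in range(10)]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9, 13.2, 14.8, 17.1, 40.0]
flagged = influential(x, y)
```

Recomputing the correlation after dropping the flagged indices would give the analogue of the primed coefficients in the figure notes.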
Pearson's Correlation Coefficient Matrix of Data from Three Influenza Surveillance Systems: Google Flu Trends, CDC Influenza-like Illness Surveillance, CDC Influenza Virologic Surveillance, September 28, 2003 through May 17, 2008 (1).
| Dataset | Google Flu Trends | CDC ILI | CDC Virologic |
| Google Flu Trends | 1.00 | -- | -- |
| CDC ILI | 0.94 (0.92, 0.96) | 1.00 | -- |
| CDC Virologic | 0.72 (0.64, 0.79) | 0.85 (0.81, 0.89) | 1.00 |
| CDC Virologic plus One Week | 0.69 (0.60, 0.76) | 0.79 (0.72, 0.84) | -- |
| CDC Virologic plus Two Weeks | 0.66 (0.56, 0.74) | 0.75 (0.68, 0.81) | -- |
(1) Because CDC surveillance is intensified from calendar week 40 through calendar week 20 of the subsequent year, we restricted our correlation analyses to this time period.
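The entries in the table above pair a Pearson correlation coefficient with an interval, and the "plus One Week" and "plus Two Weeks" rows shift one series in time. The sketch below shows one way such numbers can be produced. The source does not state how the intervals were computed, so the Fisher z-transformation used here is an assumption, as is the lag direction in `lagged_r`. All function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% interval for r via the Fisher z-transformation (assumed method)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def lagged_r(x, y, lag):
    """Correlate x at week t with y at week t + lag (assumed lag direction)."""
    if lag == 0:
        return pearson_r(x, y)
    return pearson_r(x[:-lag], y[lag:])


# Interval for r = 0.72 with the 166 weekly observations noted in the figures.
lo, hi = fisher_ci(0.72, 166)
print(f"{lo:.2f}, {hi:.2f}")  # 0.64, 0.79
```

With n = 166 this reproduces the interval shown for Google Flu Trends against CDC Virologic, which is consistent with, but does not confirm, the assumed method.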
Pearson's Correlation Coefficient Matrix of Data from Three Influenza Surveillance Systems by Surveillance Year: Google Flu Trends, CDC Influenza-like Illness Surveillance, CDC Influenza Virologic Surveillance, September 28, 2003 through May 17, 2008 (1).
| Surveillance Year | Dataset | Google Flu Trends | CDC ILI | CDC Virologic |
| 2003–04 | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.94 (0.89, 0.97) | 1.00 | -- |
| | CDC Virologic | 0.67 (0.43, 0.82) | 0.84 (0.69, 0.92) | 1.00 |
| 2004–05 | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.95, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.94 (0.89, 0.97) | 0.94 (0.88, 0.97) | 1.00 |
| 2005–06 | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.97 (0.94, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.72 (0.50, 0.85) | 0.79 (0.62, 0.89) | 1.00 |
| 2006–07 | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.94 (0.89, 0.97) | 1.00 | -- |
| | CDC Virologic | 0.71 (0.49, 0.85) | 0.81 (0.64, 0.90) | 1.00 |
| 2007–08 | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.96, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.91 (0.82, 0.95) | 0.92 (0.85, 0.96) | 1.00 |
(1) Because CDC surveillance is intensified from calendar week 40 through calendar week 20 of the subsequent year, we restricted our correlation analyses to this time period.
Pearson's Correlation Coefficient Matrix of Data from Three Influenza Surveillance Systems by US Census Region: Google Flu Trends, CDC Influenza-like Illness Surveillance, CDC Influenza Virologic Surveillance, September 28, 2003 through May 17, 2008 (1).
| US Census Region | Dataset | Google Flu Trends | CDC ILI | CDC Virologic |
| New England | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.94 (0.92, 0.96) | 1.00 | -- |
| | CDC Virologic | 0.65 (0.55, 0.73) | 0.76 (0.68, 0.82) | 1.00 |
| Middle Atlantic | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.87 (0.82, 0.90) | 1.00 | -- |
| | CDC Virologic | 0.67 (0.58, 0.75) | 0.70 (0.61, 0.77) | 1.00 |
| East North Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.96 (0.94, 0.97) | 1.00 | -- |
| | CDC Virologic | 0.64 (0.54, 0.72) | 0.73 (0.65, 0.80) | 1.00 |
| West North Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.95 (0.93, 0.97) | 1.00 | -- |
| | CDC Virologic | 0.80 (0.74, 0.85) | 0.82 (0.76, 0.87) | 1.00 |
| South Atlantic | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.91 (0.88, 0.93) | 1.00 | -- |
| | CDC Virologic | 0.72 (0.64, 0.79) | 0.82 (0.76, 0.86) | 1.00 |
| East South Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.81 (0.75, 0.86) | 1.00 | -- |
| | CDC Virologic | 0.69 (0.60, 0.76) | 0.64 (0.55, 0.73) | 1.00 |
| West South Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.87 (0.82, 0.90) | 1.00 | -- |
| | CDC Virologic | 0.74 (0.70, 0.80) | 0.86 (0.81, 0.89) | 1.00 |
| Mountain | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.91 (0.88, 0.93) | 1.00 | -- |
| | CDC Virologic | 0.72 (0.64, 0.79) | 0.81 (0.75, 0.86) | 1.00 |
| Pacific | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.84 (0.79, 0.88) | 1.00 | -- |
| | CDC Virologic | 0.67 (0.58, 0.75) | 0.78 (0.71, 0.83) | 1.00 |
Note:
(1) Because CDC surveillance is intensified from calendar week 40 through calendar week 20 of the subsequent year, we restricted our correlation analyses to this time period.
US Census Regions include the following states: (1) New England – Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont; (2) Middle Atlantic – New Jersey, New York, Pennsylvania; (3) East North Central – Indiana, Illinois, Michigan, Ohio, Wisconsin; (4) West North Central – Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, South Dakota; (5) South Atlantic – Delaware, District of Columbia, Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia, West Virginia; (6) East South Central – Alabama, Kentucky, Mississippi, Tennessee; (7) West South Central – Arkansas, Louisiana, Oklahoma, Texas; (8) Mountain – Arizona, Colorado, Idaho, New Mexico, Montana, Utah, Nevada, Wyoming; (9) Pacific – Alaska, California, Hawaii, Oregon, Washington.
Pearson's Correlation Coefficient Matrix of Three Influenza Surveillance Systems by US Census Region during the 2007-08 Influenza Season (1, 2)
| US Census Region (3) | Dataset | Google Flu Trends | CDC ILI | CDC Virologic |
| New England | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.95, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.86 (0.74, 0.93) | 0.88 (0.77, 0.94) | 1.00 |
| Middle Atlantic | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.97 (0.03, 0.98) | 1.00 | -- |
| | CDC Virologic | 0.84 (0.70, 0.92) | 0.91 (0.82, 0.96) | 1.00 |
| East North Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.96, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.87 (0.75, 0.93) | 0.92 (0.83, 0.96) | 1.00 |
| West North Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.96, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.82 (0.66, 0.91) | 0.87 (0.76, 0.94) | 1.00 |
| South Atlantic | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.95, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.90 (0.81, 0.95) | 0.90 (0.81, 0.95) | 1.00 |
| East South Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.95, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.85 (0.72, 0.92) | 0.89 (0.79, 0.95) | 1.00 |
| West South Central | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.97 (0.94, 0.99) | 1.00 | -- |
| | CDC Virologic | 0.94 (0.88, 0.97) | 0.94 (0.89, 0.97) | 1.00 |
| Mountain | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.98 (0.96, 0.98) | 1.00 | -- |
| | CDC Virologic | 0.91 (0.83, 0.96) | 0.91 (0.82, 0.95) | 1.00 |
| Pacific | Google Flu Trends | 1.00 | -- | -- |
| | CDC ILI | 0.92 (0.84, 0.96) | 1.00 | -- |
| | CDC Virologic | 0.88 (0.77, 0.94) | 0.77 (0.58, 0.88) | 1.00 |
Note:
(1) Because CDC surveillance is intensified from calendar week 40 through calendar week 20 of the subsequent year, we restricted our correlation analyses to this time period.
(2) Overall mean Pearson's correlation coefficient by US Census Region during the 2007-08 influenza season:
a. Google Flu Trends and CDC ILI Surveillance: 0.97 (0.02)
b. Google Flu Trends and CDC Virologic Surveillance: 0.87 (0.04)
c. CDC ILI Surveillance and CDC Virologic Surveillance: 0.89 (0.05)
(3) US Census Regions include the following states: (1) New England – Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont; (2) Middle Atlantic – New Jersey, New York, Pennsylvania; (3) East North Central – Indiana, Illinois, Michigan, Ohio, Wisconsin; (4) West North Central – Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, South Dakota; (5) South Atlantic – Delaware, District of Columbia, Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia, West Virginia; (6) East South Central – Alabama, Kentucky, Mississippi, Tennessee; (7) West South Central – Arkansas, Louisiana, Oklahoma, Texas; (8) Mountain – Arizona, Colorado, Idaho, New Mexico, Montana, Utah, Nevada, Wyoming; (9) Pacific – Alaska, California, Hawaii, Oregon, Washington.
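The overall means reported in the notes to the 2007-08 regional table can be checked directly from the nine regional point estimates. The sketch below copies those estimates from the table, New England through Pacific. Reading the parenthesized values as sample standard deviations is an assumption, since the source does not label them. The variable and function names are hypothetical.

```python
from statistics import mean, stdev

# Regional 2007-08 point estimates from the table, New England through Pacific.
gft_vs_ili = [0.98, 0.97, 0.98, 0.98, 0.98, 0.98, 0.97, 0.98, 0.92]
gft_vs_virologic = [0.86, 0.84, 0.87, 0.82, 0.90, 0.85, 0.94, 0.91, 0.88]
ili_vs_virologic = [0.88, 0.91, 0.92, 0.87, 0.90, 0.89, 0.94, 0.91, 0.77]


def summarize(values):
    """Mean and sample standard deviation, rounded to two decimals."""
    return round(mean(values), 2), round(stdev(values), 2)


for label, values in [
    ("Google Flu Trends vs CDC ILI", gft_vs_ili),
    ("Google Flu Trends vs CDC Virologic", gft_vs_virologic),
    ("CDC ILI vs CDC Virologic", ili_vs_virologic),
]:
    m, s = summarize(values)
    print(f"{label}: {m:.2f} ({s:.2f})")
```

Under that assumption the three summaries come out as 0.97 (0.02), 0.87 (0.04), and 0.89 (0.05), matching the reported values.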