Sasikiran Kandula, Daniel Hsu, Jeffrey Shaman.
Abstract
BACKGROUND: Limiting the adverse effects of seasonal influenza outbreaks at state or city level requires close monitoring of localized outbreaks and reliable forecasts of their progression. Whereas forecasting models for influenza or influenza-like illness (ILI) are becoming increasingly available, their applicability to localized outbreaks is limited by the nonavailability of real-time observations of the current outbreak state at local scales. Surveillance data collected by various health departments are widely accepted as the reference standard for estimating the state of outbreaks, and in the absence of surveillance data, nowcast proxies built using Web-based activities such as search engine queries, tweets, and access of health-related webpages can be useful. Nowcast estimates of state and municipal ILI were previously published by Google Flu Trends (GFT); however, validations of these estimates were seldom reported.
Keywords: classification and regression trees; human influenza; infodemiology; infoveillance; nowcasts; surveillance
Year: 2017 PMID: 29109069 PMCID: PMC5696582 DOI: 10.2196/jmir.7486
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Autoregressive integrated moving average (ARIMA) formulation.
Figure 2. Formulation for two error measures: root mean square error (RMSE) and mean absolute proportion error (MAPE).
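The formulas shown in Figure 2 are not reproduced in this text, so the following is only a minimal sketch based on the standard textbook definitions of the two error measures. The function names `rmse` and `mape` are illustrative, and expressing MAPE as a proportion (not multiplied by 100) is an assumption.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between paired observations and estimates."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, estimated):
    """Mean absolute error relative to the observation, as a proportion.

    Assumption: every observed value is nonzero; the result is not
    multiplied by 100.
    """
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(o - e) / abs(o) for o, e in zip(observed, estimated)]
    return sum(ratios) / len(ratios)
```

Because MAPE divides by the observed value, weeks with very low observed ILI can dominate it, whereas RMSE is dominated by large absolute errors near the peak.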
Figure 3. Top 20 features by importance as determined by random forest models built at regional level. The dot and whiskers in red show the median and interquartile range (IQR), respectively, whereas the blue point is the mean. The label shows the percentage of models in which the feature was used (n=3130). "ar" refers to the autoregressive integrated moving average (ARIMA) component. Features prefixed by ENT are entities identified using Freebase.
Median (interquartile range) of the Pearson correlation coefficient (COR), root mean square error (RMSE), and mean absolute proportion error (MAPE) for the RRS, RR0, and RRR models and Google Flu Trends (GFT). Results are stratified by state population size and season.
| Measure and stratum | RRS, median (IQR) | RR0, median (IQR) | RRR, median (IQR) | GFTa, median (IQR) |
| CORb | | | | |
| Overall | 0.85 (0.74-0.91) | 0.83 (0.7-0.9) | 0.86 (0.75-0.91) | 0.89 (0.8-0.94) |
| 0-2 (n=14) | 0.79 (0.64-0.87) | 0.76 (0.62-0.86) | 0.81 (0.67-0.88) | 0.83 (0.72-0.91) |
| 2-5 (n=14) | 0.84 (0.72-0.89) | 0.82 (0.7-0.89) | 0.84 (0.75-0.90) | 0.9 (0.81-0.94) |
| 5-7.5 (n=10) | 0.84 (0.74-0.91) | 0.82 (0.7-0.9) | 0.86 (0.73-0.92) | 0.89 (0.8-0.95) |
| ≥7.5 (n=12) | 0.91 (0.85-0.93) | 0.9 (0.84-0.93) | 0.91 (0.86-0.94) | 0.93 (0.86-0.96) |
| 05-06 | 0.8 (0.62-0.85) | 0.8 (0.62-0.85) | 0.81 (0.64-0.87) | 0.83 (0.71-0.88) |
| 06-07 | 0.82 (0.65-0.88) | 0.8 (0.6-0.88) | 0.82 (0.71-0.89) | 0.83 (0.76-0.9) |
| 07-08 | 0.88 (0.81-0.92) | 0.87 (0.79-0.92) | 0.89 (0.82-0.93) | 0.93 (0.87-0.96) |
| 08-09 | 0.75 (0.69-0.83) | 0.71 (0.58-0.82) | 0.78 (0.67-0.83) | 0.81 (0.71-0.89) |
| 09-10 | 0.9 (0.85-0.93) | 0.89 (0.8-0.93) | 0.9 (0.85-0.93) | 0.97 (0.94-0.98) |
| 10-11 | 0.89 (0.82-0.92) | 0.88 (0.75-0.91) | 0.89 (0.85-0.92) | 0.89 (0.86-0.93) |
| RMSEc | | | | |
| Overall | 0.99 (0.7-1.51) | 1.06 (0.73-1.56) | 0.97 (0.72-1.54) | 0.93 (0.66-1.33) |
| 0-2 (n=14) | 1.06 (0.69-1.58) | 1.19 (0.73-1.62) | 1.05 (0.72-1.6) | 0.88 (0.63-1.29) |
| 2-5 (n=14) | 1.21 (0.84-1.87) | 1.33 (0.92-1.81) | 1.22 (0.83-1.84) | 1.02 (0.78-1.52) |
| 5-7.5 (n=10) | 0.93 (0.65-1.21) | 0.98 (0.72-1.33) | 0.93 (0.61-1.14) | 0.88 (0.67-1.48) |
| ≥7.5 (n=12) | 0.87 (0.66-1.01) | 0.85 (0.70-1.08) | 0.88 (0.69-1.01) | 0.87 (0.63-1.16) |
| 05-06 | 0.93 (0.64-1.5) | 0.92 (0.70-1.64) | 0.93 (0.64-1.52) | 0.88 (0.60-1.45) |
| 06-07 | 0.84 (0.56-1.16) | 0.89 (0.57-1.16) | 0.85 (0.5-1.1) | 0.82 (0.52-1.13) |
| 07-08 | 1.08 (0.81-1.7) | 1.06 (0.83-1.59) | 0.99 (0.82-1.67) | 1.09 (0.70-1.55) |
| 08-09 | 1.02 (0.77-1.47) | 1.10 (0.79-1.48) | 1.03 (0.79-1.55) | 1.02 (0.79-1.41) |
| 09-10 | 1.31 (0.98-1.77) | 1.40 (1.08-1.72) | 1.28 (0.98-1.72) | 1.05 (0.80-1.32) |
| 10-11 | 0.77 (0.59-1.16) | 0.83 (0.61-1.26) | 0.83 (0.59-1.15) | 0.73 (0.64-1.20) |
| MAPEd | | | | |
| Overall | 0.8 (0.43-1.75) | 0.67 (0.42-1.54) | 0.77 (0.43-1.62) | 0.71 (0.44-1.51) |
| 0-2 (n=14) | 0.9 (0.54-1.7) | 0.77 (0.51-1.41) | 0.84 (0.55-1.55) | 0.76 (0.51-1.56) |
| 2-5 (n=14) | 0.95 (0.48-1.79) | 0.82 (0.44-1.65) | 0.87 (0.45-1.71) | 0.77 (0.41-1.48) |
| 5-7.5 (n=10) | 0.65 (0.36-1.62) | 0.59 (0.37-1.69) | 0.63 (0.35-1.57) | 0.68 (0.4-1.41) |
| ≥7.5 (n=12) | 0.65 (0.34-1.64) | 0.54 (0.3-1.34) | 0.65 (0.33-1.5) | 0.7 (0.43-1.54) |
| 05-06 | 1.2 (0.46-3.06) | 0.78 (0.47-2.77) | 0.99 (0.49-2.72) | 1.07 (0.56-2.67) |
| 06-07 | 0.97 (0.53-1.84) | 0.92 (0.49-1.81) | 0.91 (0.51-1.67) | 0.88 (0.46-1.48) |
| 07-08 | 0.85 (0.5-1.67) | 0.83 (0.49-1.64) | 0.81 (0.51-1.51) | 0.76 (0.5-1.57) |
| 08-09 | 0.82 (0.47-1.59) | 0.67 (0.43-1.36) | 0.84 (0.43-1.52) | 0.71 (0.44-1.48) |
| 09-10 | 0.73 (0.36-1.96) | 0.64 (0.4-1.83) | 0.74 (0.36-1.96) | 0.63 (0.43-1.17) |
| 10-11 | 0.49 (0.3-1.04) | 0.48 (0.28-0.96) | 0.48 (0.31-1.04) | 0.61 (0.32-0.93) |
aGFT: Google Flu Trends.
bCOR: Pearson correlation coefficient.
cRMSE: root mean square error.
dMAPE: mean absolute percentage error.
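Each cell in the table above summarizes a per-state, per-season measure as a median with its interquartile range. The following is a rough sketch of how a Pearson correlation and such a summary could be computed with the Python standard library. The helper names `pearson` and `median_iqr` are hypothetical, and the "inclusive" quartile method is an assumption, since the source does not state which quantile convention was used.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def median_iqr(values):
    """Return (median, first quartile, third quartile).

    Assumption: quartiles use the 'inclusive' method of
    statistics.quantiles; other conventions give slightly different bounds.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3
```

In this setting, `pearson` would be applied once per state-season to the observed and nowcast ILI series, and `median_iqr` would then summarize the resulting coefficients within each population or season stratum.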
Mean rank and statistical significance from post hoc Nemenyi test. For each season-state combination, the model forms are ranked from best (rank=1) to worst (rank=4).
| Model | CORa mean rank | P vs GFTd | P vs RR0 | P vs RRR | RMSEb mean rank | P vs GFT | P vs RR0 | P vs RRR | MAPEc mean rank | P vs GFT | P vs RR0 | P vs RRR |
| GFT | 1.91 | | | | 2.33 | | | | 2.45 | | | |
| RR0 | 3.07 | <.001 | | | 2.75 | <.001 | | | 2.24 | .17 | | |
| RRR | 2.38 | <.001 | <.001 | | 2.41 | .89 | .01 | | 2.43 | .99 | .25 | |
| RRS | 2.63 | <.001 | <.001 | .1 | 2.51 | .35 | .09 | .79 | 2.87 | <.001 | <.001 | <.001 |
aCOR: Pearson correlation coefficient.
bRMSE: root mean square error.
cMAPE: mean absolute percentage error.
dGFT: Google Flu Trends.
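The mean ranks reported above come from ranking the model forms within each season-state combination and averaging those ranks per model. The following is a minimal sketch of that ranking step only. The function names are hypothetical, averaging the ranks of tied models is an assumption, and the post hoc Nemenyi P values are not computed here.

```python
def rank_models(scores, higher_is_better=False):
    """Rank models within one state-season (best = 1).

    scores maps model name -> measure value. Tied models share the
    average of the ranks they span (an assumption, as in Friedman-type tests).
    """
    ordered = sorted(scores, key=lambda m: scores[m], reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of models tied with the one at position i.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared_rank
        i = j + 1
    return ranks


def mean_ranks(cases, higher_is_better=False):
    """Average each model's rank over all state-season cases.

    Assumes every case contains the same set of models.
    """
    totals = {}
    for scores in cases:
        for model, rank in rank_models(scores, higher_is_better).items():
            totals[model] = totals.get(model, 0.0) + rank
    return {model: total / len(cases) for model, total in totals.items()}
```

For correlation, `higher_is_better=True` makes the largest value rank 1; for RMSE and MAPE the default applies, so the smallest error ranks 1.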
Median (interquartile range) of the Pearson correlation coefficient (COR), root mean square error (RMSE), and mean absolute percentage error (MAPE) for Google Flu Trends (GFT) and the SS0, SRR, SRS, and SSS models. Results are stratified by state population and season.
| Measure and stratum | GFTa, median (IQR) | SS0, median (IQR) | SRR, median (IQR) | SRS, median (IQR) | SSS, median (IQR) |
| CORb | | | | | |
| Overall | 0.89 (0.8-0.94) | 0.56 (0.4-0.75) | 0.8 (0.7-0.88) | 0.8 (0.7-0.88) | 0.74 (0.61-0.83) |
| 0-2 (n=14) | 0.83 (0.72-0.91) | 0.46 (0.31-0.66) | 0.74 (0.57-0.82) | 0.71 (0.56-0.8) | 0.62 (0.55-0.74) |
| 2-5 (n=14) | 0.9 (0.81-0.94) | 0.58 (0.42-0.76) | 0.78 (0.72-0.87) | 0.8 (0.72-0.85) | 0.73 (0.66-0.81) |
| 5-7.5 (n=10) | 0.89 (0.8-0.95) | 0.51 (0.36-0.64) | 0.83 (0.7-0.88) | 0.81 (0.73-0.88) | 0.75 (0.63-0.82) |
| ≥7.5 (n=12) | 0.93 (0.86-0.96) | 0.73 (0.48-0.85) | 0.88 (0.79-0.92) | 0.89 (0.8-0.92) | 0.86 (0.72-0.91) |
| 05-06 | 0.83 (0.71-0.88) | 0.72 (0.56-0.85) | 0.78 (0.68-0.86) | 0.76 (0.62-0.86) | 0.74 (0.66-0.86) |
| 06-07 | 0.83 (0.76-0.9) | 0.75 (0.61-0.84) | 0.8 (0.7-0.88) | 0.8 (0.64-0.87) | 0.8 (0.72-0.89) |
| 07-08 | 0.93 (0.87-0.96) | 0.61 (0.47-0.77) | 0.87 (0.78-0.92) | 0.86 (0.78-0.9) | 0.81 (0.73-0.86) |
| 08-09 | 0.81 (0.71-0.89) | 0.37 (0.28-0.44) | 0.7 (0.59-0.8) | 0.74 (0.58-0.79) | 0.57 (0.45-0.68) |
| 09-10 | 0.97 (0.94-0.98) | 0.51 (0.39-0.73) | 0.82 (0.75-0.89) | 0.82 (0.74-0.89) | 0.74 (0.63-0.85) |
| 10-11 | 0.89 (0.86-0.93) | 0.47 (0.33-0.6) | 0.82 (0.75-0.88) | 0.81 (0.75-0.88) | 0.71 (0.63-0.78) |
| RMSEc | | | | | |
| Overall | 0.93 (0.66-1.33) | 1.07 (0.68-1.84) | 0.84 (0.54-1.25) | 0.86 (0.55-1.27) | 0.9 (0.55-1.35) |
| 0-2 (n=14) | 0.88 (0.63-1.29) | 1.17 (0.61-1.92) | 0.96 (0.55-1.47) | 0.96 (0.62-1.49) | 0.92 (0.58-1.44) |
| 2-5 (n=14) | 1.02 (0.78-1.52) | 1.37 (0.83-2.13) | 1.04 (0.7-1.54) | 1.11 (0.62-1.57) | 1.11 (0.66-1.68) |
| 5-7.5 (n=10) | 0.88 (0.67-1.48) | 0.99 (0.66-1.79) | 0.74 (0.49-1.07) | 0.71 (0.51-1.14) | 0.79 (0.55-1.24) |
| ≥7.5 (n=12) | 0.87 (0.63-1.16) | 0.91 (0.64-1.49) | 0.69 (0.43-1.05) | 0.67 (0.41-0.99) | 0.74 (0.46-1.01) |
| 05-06 | 0.88 (0.60-1.45) | 0.81 (0.49-1.47) | 0.71 (0.5-1.11) | 0.68 (0.49-1.13) | 0.64 (0.46-1.06) |
| 06-07 | 0.82 (0.52-1.13) | 0.70 (0.48-1.02) | 0.59 (0.43-0.88) | 0.58 (0.42-0.94) | 0.56 (0.41-0.83) |
| 07-08 | 1.09 (0.70-1.55) | 1.36 (0.78-1.85) | 0.91 (0.54-1.27) | 0.95 (0.58-1.37) | 0.97 (0.6-1.42) |
| 08-09 | 1.02 (0.79-1.41) | 1.21 (0.92-1.98) | 0.95 (0.69-1.31) | 0.93 (0.67-1.26) | 1.05 (0.78-1.4) |
| 09-10 | 1.05 (0.80-1.32) | 1.91 (1.28-2.44) | 1.34 (0.9-1.9) | 1.37 (0.92-1.92) | 1.53 (1.01-1.9) |
| 10-11 | 0.73 (0.64-1.20) | 1.00 (0.73-1.62) | 0.73 (0.5-1.04) | 0.7 (0.51-1.1) | 0.86 (0.58-1.16) |
| MAPEd | | | | | |
| Overall | 0.71 (0.44-1.51) | 0.58 (0.38-0.8) | 0.54 (0.33-0.9) | 0.61 (0.34-1) | 0.61 (0.35-1.02) |
| 0-2 (n=14) | 0.76 (0.51-1.56) | 0.68 (0.48-0.86) | 0.76 (0.5-1.36) | 0.84 (0.56-1.44) | 0.82 (0.58-1.28) |
| 2-5 (n=14) | 0.77 (0.41-1.48) | 0.63 (0.36-0.85) | 0.58 (0.36-0.9) | 0.64 (0.39-1) | 0.68 (0.37-1.02) |
| 5-7.5 (n=10) | 0.68 (0.4-1.41) | 0.58 (0.39-0.74) | 0.41 (0.31-0.75) | 0.46 (0.32-0.86) | 0.55 (0.34-0.92) |
| ≥7.5 (n=12) | 0.7 (0.43-1.54) | 0.4 (0.31-0.59) | 0.38 (0.2-0.59) | 0.37 (0.2-0.69) | 0.41 (0.24-0.61) |
| 05-06 | 1.07 (0.56-2.67) | 0.59 (0.39-0.8) | 0.68 (0.4-0.93) | 0.77 (0.41-1.12) | 0.74 (0.38-1.08) |
| 06-07 | 0.88 (0.46-1.48) | 0.54 (0.36-0.71) | 0.51 (0.32-0.84) | 0.62 (0.35-0.94) | 0.58 (0.3-0.89) |
| 07-08 | 0.76 (0.5-1.57) | 0.69 (0.4-0.83) | 0.54 (0.38-0.78) | 0.62 (0.41-0.94) | 0.62 (0.38-0.81) |
| 08-09 | 0.71 (0.44-1.48) | 0.57 (0.42-0.77) | 0.62 (0.37-1.01) | 0.66 (0.36-0.93) | 0.68 (0.39-1.14) |
| 09-10 | 0.63 (0.43-1.17) | 0.59 (0.36-0.85) | 0.52 (0.31-1.25) | 0.59 (0.31-1.38) | 0.67 (0.37-1.14) |
| 10-11 | 0.61 (0.32-0.93) | 0.5 (0.35-0.85) | 0.38 (0.26-0.67) | 0.38 (0.26-0.75) | 0.43 (0.31-0.83) |
aGFT: Google Flu Trends.
bCOR: Pearson correlation coefficient.
cRMSE: root mean square error.
dMAPE: mean absolute percentage error.
Figure 4. Measures observed with the different model forms. A: Pearson correlation coefficient (COR); B: Root mean square error (RMSE); and C: Mean absolute percentage error (MAPE). Left: The box and whiskers show the median, interquartile range (IQR), and extrema (1.5×IQR) for each model form. The colored regions are violin plots showing probability density. Right: Heat map of the distribution of relative ranks of the models; more frequent ranks are darker.
Figure 5. Pairwise plots for the model forms on the three measures. A: Pearson correlation coefficient (COR); B: Root mean square error (RMSE); and C: Mean absolute percentage error (MAPE). The subpanels along the diagonal show the density of the measure for the model form. Subpanels in the lower triangle are scatter plots (n=300) in which each point denotes a state-season. Points on or close to the black line (y=x) are state-seasons where the pair of model forms have similar measures (correlation or error). Subpanels in the upper triangle are heat maps of the counts of points in each two-dimensional (2D) grid of the plot area (low counts in yellow, high in red). For example, to compare the correlations of RRS and SS0, see the scatter plot in (5,4) or heat map in (4,5) of A.
Mean rank and statistical significance from post hoc Nemenyi test. For each season-state combination, the model forms are ranked from best (rank=1) to worst (rank=8).
| Measure | Model | Mean rank | P vs GFTa | P vs RR0 | P vs RRR | P vs RRS | P vs SS0 | P vs SRR | P vs SRS |
| Pearson correlation coefficient (COR) | GFT | 2.67 | | | | | | | |
| | RR0 | 4.55 | <.001 | | | | | | |
| | RRR | 3.34 | .002 | <.001 | | | | | |
| | RRS | 3.68 | <.001 | <.001 | .68 | | | | |
| | SS0 | 6.87 | <.001 | <.001 | <.001 | <.001 | | | |
| | SRR | 4.37 | <.001 | .98 | <.001 | .01 | <.001 | | |
| | SRS | 4.75 | <.001 | .97 | <.001 | <.001 | <.001 | .55 | |
| | SSS | 5.73 | <.001 | <.001 | <.001 | <.001 | <.001 | <.001 | <.001 |
| Root mean square error (RMSE) | GFT | 4.46 | | | | | | | |
| | RR0 | 5.27 | .002 | | | | | | |
| | RRR | 4.68 | .96 | .06 | | | | | |
| | RRS | 4.82 | .62 | .35 | .99 | | | | |
| | SS0 | 5.77 | <.001 | .19 | <.001 | <.001 | | | |
| | SRR | 3.34 | <.001 | <.001 | <.001 | <.001 | <.001 | | |
| | SRS | 3.71 | .005 | <.001 | <.001 | <.001 | <.001 | .61 | |
| | SSS | 3.96 | .2 | <.001 | <.001 | <.001 | <.001 | .04 | .92 |
| Mean absolute proportion error (MAPE) | GFT | 5.26 | | | | | | | |
| | RR0 | 4.91 | .65 | | | | | | |
| | RRR | 5.18 | .99 | .89 | | | | | |
| | RRS | 5.7 | .37 | .002 | .15 | | | | |
| | SS0 | 3.75 | <.001 | <.001 | <.001 | <.001 | | | |
| | SRR | 3.17 | <.001 | <.001 | <.001 | <.001 | .07 | | |
| | SRS | 3.93 | <.001 | <.001 | <.001 | <.001 | .99 | <.001 | |
| | SSS | 4.09 | <.001 | .001 | <.001 | <.001 | .69 | <.001 | .99 |
aGFT: Google Flu Trends.