John S Brownstein, Shuyu Chu, Achla Marathe, Madhav V Marathe, Andre T Nguyen, Daniela Paolotti, Nicola Perra, Daniela Perrotta, Mauricio Santillana, Samarth Swarup, Michele Tizzoni, Alessandro Vespignani, Anil Kumar S Vullikanti, Mandy L Wilson, Qian Zhang.
Abstract
BACKGROUND: Influenza outbreaks affect millions of people every year. In developed countries, influenza surveillance is usually carried out through a network of sentinel doctors who report the weekly number of influenza-like illness (ILI) cases observed among the patients they see. Monitoring and forecasting the evolution of these outbreaks supports decision makers in designing effective interventions and allocating resources to mitigate their impact.
Keywords: crowdsourcing; disease surveillance; forecasting; nonresponse bias
Year: 2017 PMID: 29092812 PMCID: PMC5688248 DOI: 10.2196/publichealth.7344
Source DB: PubMed Journal: JMIR Public Health Surveill ISSN: 2369-2960
Figure 1. Mapping of MTurk sample to synthetic individuals.
Figure 2. The FluOutlook framework.
Figure 3. (Top panel) The US Centers for Disease Control and Prevention (CDC) influenza-like illness (ILI) percent value (y-axis) is displayed as a function of time (x-axis). Also shown are predictions produced 1 week ahead of the publication of CDC-ILI reports using (1) only historical CDC information via an autoregressive model, AR(2); (2) an autoregressive model that combines historical CDC information with Flu Near You (FNY) information, AR(2)+FNY; and (3) an ensemble method that combines multiple data sources, including FNY, Google search frequencies, electronic health records, and historical CDC information (all sources). (Bottom panel) The errors between the predictions and the CDC-reported ILI are displayed for each prediction model.
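The AR(2) and AR(2)+FNY models named in the Figure 3 caption can be illustrated with a minimal least-squares sketch. This is not the authors' implementation: the function names `fit_ar2` and `predict_next` are hypothetical, and the sketch assumes the FNY signal enters as a single regressor for the same week as the target.

```python
import numpy as np

def fit_ar2(y, exog=None):
    """Least-squares fit of y[t] = c + a1*y[t-1] + a2*y[t-2] (+ b*exog[t]).

    y    : weekly ILI percent values (historical CDC series).
    exog : optional same-length series (assumed here to be an FNY signal
           aligned week by week with y).
    Returns coefficients ordered as [c, a1, a2] or [c, a1, a2, b].
    """
    y = np.asarray(y, dtype=float)
    # Design matrix rows correspond to t = 2 .. len(y) - 1.
    cols = [np.ones(len(y) - 2), y[1:-1], y[:-2]]
    if exog is not None:
        cols.append(np.asarray(exog, dtype=float)[2:])
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    return coef

def predict_next(coef, y, exog_next=None):
    """One-step-ahead prediction from the last two observed values."""
    x = [1.0, y[-1], y[-2]]
    if exog_next is not None:
        x.append(exog_next)
    return float(np.dot(coef, x))
```

In use, the model would be refit each week on the history available at that time, and `predict_next` would supply the estimate for the week whose CDC report has not yet been published.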
Figure 4. (a) Epidemic curves under low transmission rate. (b) Epidemic curves under high transmission rate.
Bias in epidemic metrics under low transmission rate.
| Metric | Nonresponse bias (V-S) | Sample-size bias (S-S') | Total bias (V-S') |
| Mean difference, % | 7.90 | 2.13 | 10.03 |
| 95% CI | 7.88 to 7.91 | 1.58 to 2.68 | 9.47 to 10.58 |
| P value | <.001 | <.001 | <.001 |
| Mean difference, % | 1.22 | 0.14 | 1.36 |
| 95% CI | 1.22 to 1.22 | 0.05 to 0.23 | 1.27 to 1.45 |
| P value | <.001 | .003 | <.001 |
| Mean difference, days | –1.76 | 0.76 | –1.00 |
| 95% CI | –1.96 to –1.56 | 0.16 to 1.36 | –1.58 to –0.42 |
| P value | <.001 | .02 | .002 |
Bias in epidemic metrics under high transmission rate.
| Metric | Nonresponse bias (V-S) | Sample-size bias (S-S') | Total bias (V-S') |
| Mean difference, % | 6.31 | 3.58 | 9.90 |
| 95% CI | 6.30 to 6.32 | 3.06 to 4.10 | 9.38 to 10.42 |
| P value | <.001 | <.001 | <.001 |
| Mean difference, % | 2.51 | 0.63 | 3.14 |
| 95% CI | 2.50 to 2.53 | 0.49 to 0.77 | 3.01 to 3.28 |
| P value | <.001 | <.001 | <.001 |
| Mean difference, days | –1.44 | 0.12 | –1.32 |
| 95% CI | –1.69 to –1.20 | –0.10 to 0.34 | –1.59 to –1.05 |
| P value | <.001 | .28 | <.001 |
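Reading the column labels as differences, the total bias should decompose additively: (V-S') = (V-S) + (S-S'). The short sketch below checks that identity against the mean differences reported in the two tables. The metric names are not given in the tables, so rows are identified only by position and unit; the tolerance is an assumed allowance for rounding to two decimals.

```python
# Reported mean differences, copied from the two tables above as
# (nonresponse V-S, sample-size S-S', total V-S').
REPORTED = {
    "low, row 1 (%)": (7.90, 2.13, 10.03),
    "low, row 2 (%)": (1.22, 0.14, 1.36),
    "low, row 3 (days)": (-1.76, 0.76, -1.00),
    "high, row 1 (%)": (6.31, 3.58, 9.90),
    "high, row 2 (%)": (2.51, 0.63, 3.14),
    "high, row 3 (days)": (-1.44, 0.12, -1.32),
}

def decomposition_gap(nonresponse, sample_size, total):
    """Absolute gap between (V-S) + (S-S') and the reported (V-S')."""
    return abs((nonresponse + sample_size) - total)

def check_all(rows, tol=0.015):
    """Return labels whose gap exceeds the rounding tolerance (assumed value)."""
    return [label for label, vals in rows.items()
            if decomposition_gap(*vals) > tol]

if __name__ == "__main__":
    for label, vals in REPORTED.items():
        print(f"{label}: gap = {decomposition_gap(*vals):.2f}")
```

Every row agrees with the additive decomposition to within rounding, which supports reading the nonresponse and sample-size components as summing to the total.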
Figure 5. Epidemic profiles for Belgium, Denmark, Italy, the Netherlands, Spain, and the United Kingdom considering 4-week, 3-week, 2-week, and 1-week lead predictions. The best estimation (solid line) and the 95% confidence interval (colored area) are shown together with sentinel doctors' surveillance data (black dots), which represent the ground truth (ie, the target signals).
Figure 6. Pearson correlations, mean absolute percentage errors, and peak week accuracy obtained by comparing the forecast results and the sentinel doctors' influenza-like illness surveillance data along the entire season in each country.
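The three evaluation measures named in the Figure 6 caption can be sketched as follows. The Pearson correlation and mean absolute percentage error follow their standard definitions; the caption does not define peak week accuracy, so `peak_week_error` (a hypothetical name) simply assumes it is based on the offset, in weeks, between the forecast peak and the observed peak.

```python
import numpy as np

def pearson(forecast, truth):
    """Pearson correlation between forecast and surveillance series."""
    return float(np.corrcoef(forecast, truth)[0, 1])

def mape(forecast, truth):
    """Mean absolute percentage error, in percent.

    Assumes every value in `truth` is nonzero.
    """
    f = np.asarray(forecast, dtype=float)
    t = np.asarray(truth, dtype=float)
    return float(np.mean(np.abs((f - t) / t)) * 100.0)

def peak_week_error(forecast, truth):
    """Signed offset (in weeks) of the forecast peak from the observed peak.

    Assumption: both series are weekly and aligned on the same weeks.
    """
    return int(np.argmax(forecast)) - int(np.argmax(truth))
```

A value of 0 from `peak_week_error` means the forecast placed the seasonal peak in the same week as the sentinel data.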
Figure 7. Heatmap showing the relevance of each input data source to the flu prediction during the 7/2013 to 4/2015 time window (x-axis). These values change from week to week due to a dynamic model recalibration process. The multiple data sources entered into the HealthMap Flu Trends system are on the y-axis with their tendencies, or derivatives. The color bar on the right encodes the magnitude of the regression coefficients of the data sources used as inputs.
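One way to picture the week-to-week recalibration described in the Figure 7 caption is a regression refit on a trailing window, with the resulting coefficients stacked into a weeks-by-sources matrix like the one a heatmap would display. The sketch below uses ordinary least squares and a fixed window length purely as assumptions; the caption does not state which regression method or window the HealthMap Flu Trends system uses, and `rolling_coefficients` is a hypothetical name.

```python
import numpy as np

def rolling_coefficients(X, y, window):
    """Refit a linear regression each week on the trailing `window` weeks.

    X : array of shape (n_weeks, n_sources), one column per input source.
    y : array of shape (n_weeks,), the target ILI signal.
    Returns an (n_weeks, n_sources) array of per-week coefficients;
    rows before the first full window are NaN.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_weeks, n_sources = X.shape
    coefs = np.full((n_weeks, n_sources), np.nan)
    for t in range(window, n_weeks + 1):
        # Intercept column plus the most recent `window` weeks of inputs.
        Xw = np.column_stack([np.ones(window), X[t - window:t]])
        beta, *_ = np.linalg.lstsq(Xw, y[t - window:t], rcond=None)
        coefs[t - 1] = beta[1:]  # drop the intercept
    return coefs
```

Plotting the absolute values of the returned matrix, with weeks on the x-axis and sources on the y-axis, would give a picture of the same kind as the figure.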