Tom Burr, Todd Graves, Richard Klamann, Sarah Michalak, Richard Picard, Nicolas Hengartner.
Abstract
BACKGROUND: Syndromic surveillance (SS) can potentially contribute to outbreak detection capability by providing timely, novel data sources. One SS challenge is that some syndrome counts vary with season in a manner that is not identical from year to year. Our goal is to evaluate the impact of inconsistent seasonal effects on performance assessments (false and true positive rates) in the context of detecting anomalous counts in data that exhibit seasonal variation.
Year: 2006 PMID: 17144927 PMCID: PMC1698911 DOI: 10.1186/1472-6947-6-40
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Figure 1. Average daily respiratory counts, by week. (top) Real data; (middle) Data simulated from Eq. (1) (nonhierarchical model with larger-than-Poisson variance); (bottom) Data simulated from Eq. (2) (hierarchical model with larger-than-Poisson variance). In all 3 plots, the weekly averages begin on Jan 31, 1994 and end on May 31, 2003. The smooth curve is the Eq. (1) fit to all nine years of respiratory data.
Figure 2. Average daily respiratory count, by week, for week 20 through week 71. The same data as in Figure 1, but only for week 20 through week 71 in 1993 and for week 20 through week 71 in 1994. Note that the peak onset, shape, and duration vary each year in the real data (top left) and in data simulated from the hierarchical model (top right), but not in data simulated from the nonhierarchical model (top middle). In the bottom plots, the forecast errors from using Eq. (1) to forecast (Method 1), together with a smooth curve fit to those errors, show strong serial correlation in the left and right plots and very mild or negligible serial correlation in the middle plot. Each bottom plot shows forecast errors for the same period, weeks 20 through 123, without overlaying successive years.
Detection probabilities (multiplied by 100) by forecast method, length of training data, and data set.
| Years of Training Data | Method 1: Eq. (1) | Method 2: EWMA | Method 3: Avg1 | Method 4: Avg2 | Method 5: HistAvg1 | Method 6: HistAvg2 | Method 7: HistAvg3 | Method 8: HistAvg4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 year | 88,86,100 | 93,92,88 | 92,72,86 | 89,78,84 | 81,74,84 | 84,75,100 | 71,61,92 | 79,74,100 |
| 2 years | 69,79,49 | 94,91,90 | 92,73,89 | 90,78,89 | 74,61,62 | 77,65,68 | 70,53,55 | 73,63,64 |
| 3 years | 70,99,95 | 89,93,93 | 87,92,91 | 81,88,89 | 66,90,65 | 69,94,82 | 61,89,69 | 68,90,66 |
| 4 years | 83,94,99 | 89,90,91 | 89,89,89 | 81,89,71 | 79,89,97 | 78,90,97 | 76,88,95 | 66,92,94 |
| 5 years | 92,90,68 | 91,95,85 | 88,92,72 | 84,91,75 | 76,83,73 | 77,86,75 | 73,80,70 | 76,83,75 |
| 6 years | 75,91,96 | 90,95,86 | 90,88,82 | 86,72,66 | 68,84,93 | 68,90,93 | 66,84,93 | 68,89,90 |
| 7 years | 96,92,66 | 93,86,91 | 92,80,92 | 90,88,87 | 97,90,72 | 97,92,74 | 96,90,70 | 96,91,73 |
| 8 years | 82,93,96 | 95,96,91 | 94,91,84 | 92,89,81 | 87,90,94 | 88,92,94 | 88,88,94 | 89,87,93 |
| Average | 83, | 92,92,89 | 91,85,85 | 87,84,81 | 79, | 79, | 77, | 77, |
The (x,y,z) entry in each cell denotes 100 times the detection probability (DP), at a false alarm rate of one per year, for the real data, data simulated from the nonhierarchical model, and data simulated from the hierarchical model, respectively. Each individual entry is based on 1000 simulated outbreaks, so its confidence limits are approximately ±2. In the last row, boldface entries indicate optimistic DPs attributable to using data simulated from the nonhierarchical model. All eight methods apply Page's test to forecast errors, with forecasts from: (1) a fit to Eq. (1), (2) EWMA, (3) a moving average with a 1-day gap, (4) a moving average with a 3-day gap, (5) a historical average using days d, d-1, and d+1 in the training data to predict day d in the testing data, (6) a historical average using three 4-day windows, (7) a historical average using day d, and (8) a historical average using one 4-day window.
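As a rough illustration of the footnote above, the following Python sketch applies a one-sided Page's test (CUSUM) to EWMA forecast errors, in the spirit of Method 2, and checks the "approximately ±2" statement with a normal-approximation binomial interval. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the EWMA weight, the error scale, the reference value `k`, and the threshold `h` are all hypothetical choices made here. In particular, `h` is not calibrated to the one-false-alarm-per-year rate used in the table.

```python
import math


def ewma_forecasts(counts, weight=0.4):
    """One-step-ahead EWMA forecasts (weight is an assumed value).

    The forecast for day t uses only counts before day t. The first
    forecast is seeded with the first count, so its error is zero.
    """
    forecasts = []
    level = counts[0]
    for c in counts:
        forecasts.append(level)
        level = weight * c + (1 - weight) * level
    return forecasts


def page_statistics(errors, scale, k=0.5):
    """One-sided Page (CUSUM) statistic on scaled forecast errors.

    S_t = max(0, S_{t-1} + e_t / scale - k), with S_0 = 0.
    scale and k are assumptions; the source does not give them.
    """
    s = 0.0
    stats = []
    for e in errors:
        s = max(0.0, s + e / scale - k)
        stats.append(s)
    return stats


def first_alarm(stats, h):
    """Index of the first day the statistic exceeds h, or None."""
    for day, s in enumerate(stats):
        if s > h:
            return day
    return None


def binomial_half_width(p, n, z=1.96):
    """Approximate 95% half-width, in units of 100 * DP, for a DP
    estimated from n independent simulated outbreaks."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


# Toy series: a flat baseline of 20 counts per day, then an injected
# increase of 10 counts per day starting on day 30 (made-up data).
counts = [20] * 30 + [30] * 5
forecasts = ewma_forecasts(counts)
errors = [c - f for c, f in zip(counts, forecasts)]
alarm_day = first_alarm(page_statistics(errors, scale=5.0), h=2.0)
print("first alarm on day index:", alarm_day)
print("half-width at DP = 0.9, n = 1000:", binomial_half_width(0.9, 1000))
```

With a detection probability near 0.9 and 1000 simulated outbreaks, the half-width comes out a little under 2, which is consistent with the footnote. For detection probabilities closer to 0.5 the same formula gives a wider interval, of roughly 3.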