Robert W Mathes, Ramona Lall, Alison Levin-Rector, Jessica Sell, Marc Paladini, Kevin J Konty, Don Olson, Don Weiss.
Abstract
The New York City Department of Health and Mental Hygiene has operated an emergency department syndromic surveillance system since 2001, using temporal and spatial scan statistics run on a daily basis for cluster detection. Since the system was originally implemented, a number of new methods have been proposed for use in cluster detection. We evaluated six temporal and four spatial/spatio-temporal detection methods using syndromic surveillance data spiked with simulated injections. The algorithms were compared on several metrics, including sensitivity, specificity, positive predictive value, coherence, and timeliness. We also evaluated each method's implementation, programming time, run time, and the ease of use. Among the temporal methods, at a set specificity of 95%, a Holt-Winters exponential smoother performed the best, detecting 19% of the simulated injects across all shapes and sizes, followed by an autoregressive moving average model (16%), a generalized linear model (15%), a modified version of the Early Aberration Reporting System's C2 algorithm (13%), a temporal scan statistic (11%), and a cumulative sum control chart (<2%). Of the spatial/spatio-temporal methods we tested, a spatial scan statistic detected 3% of all injects, a Bayes regression found 2%, and a generalized linear mixed model and a space-time permutation scan statistic detected none at a specificity of 95%. Positive predictive value was low (<7%) for all methods. Overall, the detection methods we tested did not perform well in identifying the temporal and spatial clusters of cases in the inject dataset. The spatial scan statistic, our current method for spatial cluster detection, performed slightly better than the other tested methods across different inject magnitudes and types. Furthermore, we found the scan statistics, as applied in the SaTScan software package, to be the easiest to program and implement for daily data analysis.
Year: 2017 PMID: 28886112 PMCID: PMC5590919 DOI: 10.1371/journal.pone.0184419
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
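The abstract compares detectors at a fixed specificity of 95%. A minimal sketch of that kind of comparison follows, assuming day-level scoring (each day is either an inject day or a background day) and a detector that emits one numeric score per day. Both function names are hypothetical, and the study itself may count detections per inject rather than per day.

```python
import math
from typing import Dict, Sequence


def threshold_for_specificity(scores: Sequence[float],
                              inject_days: Sequence[bool],
                              target: float = 0.95) -> float:
    """Pick an alert threshold so that at most (1 - target) of the
    background (non-inject) days score strictly above it."""
    background = sorted(s for s, injected in zip(scores, inject_days) if not injected)
    if not background:
        raise ValueError("need at least one non-inject day")
    # Number of background days allowed to alert (small epsilon guards float error).
    allowed = math.floor((1.0 - target) * len(background) + 1e-9)
    if allowed >= len(background):
        return float("-inf")
    return background[len(background) - allowed - 1]


def detection_metrics(alerts: Sequence[bool],
                      inject_days: Sequence[bool]) -> Dict[str, float]:
    """Day-level sensitivity, specificity, and positive predictive value."""
    if len(alerts) != len(inject_days):
        raise ValueError("alerts and inject_days must have the same length")
    pairs = list(zip(alerts, inject_days))
    tp = sum(1 for a, i in pairs if a and i)
    fp = sum(1 for a, i in pairs if a and not i)
    fn = sum(1 for a, i in pairs if not a and i)
    tn = sum(1 for a, i in pairs if not a and not i)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }
```

Fixing the threshold on background days first, and only then reading off sensitivity and PPV, puts methods with different score scales on the same false-alarm budget.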
Fig 1. Epidemic curves for (A) point-source exposure and (B) propagated transmission.
Daily count distributions of tested syndrome baselines.
| Syndrome | Minimum | Maximum | Mean | Standard deviation |
|---|---|---|---|---|
| Diarrhea | 65 | 309 | 144 | 42 |
| Fever/Influenza | 306 | 1093 | 563 | 113 |
| Influenza-like illness | 51 | 555 | 187 | 87 |
| Respiratory | 349 | 1528 | 793 | 218 |
| Vomit | 147 | 525 | 287 | 66 |
| Respiratory | 0 | 48 | 4 | 5 |
Characteristics of simulated temporal injects, daily counts.
| Syndrome | Min | Max | Mean | Number of injects |
|---|---|---|---|---|
| Diarrhea | 1 | 86 | 6 | 60 |
| Fever/Influenza | 1 | 315 | 17 | 60 |
| Influenza-like illness | 1 | 205 | 10 | 60 |
| Respiratory | 1 | 343 | 20 | 60 |
| Vomit | 1 | 110 | 7 | 60 |
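Spiking a baseline series with a simulated inject, as summarized in the table above, amounts to adding the inject's daily counts to the observed counts from a chosen start day. The sketch below illustrates only that step; the function name `spike_baseline` and the rule of truncating injects that run past the end of the series are assumptions, not details taken from the study.

```python
from typing import List, Sequence


def spike_baseline(baseline: Sequence[int],
                   inject: Sequence[int],
                   start: int) -> List[int]:
    """Return a copy of `baseline` with `inject` counts added from day `start`.

    Inject days that fall beyond the end of the baseline are dropped
    (an assumption made for this sketch).
    """
    if start < 0 or start >= len(baseline):
        raise ValueError("start must index a day within the baseline")
    spiked = list(baseline)
    for offset, extra in enumerate(inject):
        day = start + offset
        if day >= len(spiked):
            break
        spiked[day] += extra
    return spiked
```

Keeping the baseline untouched and returning a new list makes it straightforward to reuse one baseline for many injects of different sizes and start dates.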
Characteristics of simulated spatial injects, daily counts.
| Respiratory | Mean number of cases per ZIP (range) | Mean duration of outbreak (in days) | Mean number of ZIPs | Number of injects |
|---|---|---|---|---|
| Overall | 2 (1–14) | 9.3 | 2.9 | 120 |
| Single day | 3 (1–14) | 1 | 2.7 | 24 |
| Point source | 2 (1–11) | 7.6 | 2.7 | 72 |
| Propagated | 2 (1–13) | 22.9 | 3.7 | 24 |
Candidate methods for evaluation.
| Temporal | Spatio-temporal |
|---|---|
| Autoregressive integrated moving average (ARIMA) | Bayesian space-time regression |
| Cumulative sum control chart (CUSUM) | Generalized linear mixed model (GLMM) |
| Generalized Linear Model (GLM) | Space-time permutation scan statistic |
| Holt-Winters exponential smoother | Spatial scan statistic |
| Modified EARS C2 | |
| Temporal scan statistic | |
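To give a sense of the simplest temporal method listed above, here is a sketch of the standard EARS C2 statistic: today's count is compared with the mean and standard deviation of a 7-day baseline that ends 2 days earlier, and an alert is conventionally raised when the statistic exceeds 3. This is the textbook form, not the modified version the authors evaluated, and the `min_sd` floor is an assumption added here to avoid division by zero.

```python
from statistics import mean, stdev
from typing import List, Optional, Sequence


def c2_statistics(counts: Sequence[float],
                  baseline: int = 7,
                  guard: int = 2,
                  min_sd: float = 1.0) -> List[Optional[float]]:
    """Standard C2 statistic for each day; None where history is too short."""
    stats: List[Optional[float]] = []
    for t in range(len(counts)):
        start = t - guard - baseline
        if start < 0:
            stats.append(None)  # not enough history for a full baseline
            continue
        window = counts[start:t - guard]  # `baseline` days, ending `guard` days back
        mu = mean(window)
        sd = max(stdev(window), min_sd)  # floor on SD is an assumption of this sketch
        stats.append((counts[t] - mu) / sd)
    return stats
```

For example, `[s is not None and s > 3 for s in c2_statistics(daily_counts)]` would give a daily alert flag, where `daily_counts` is a hypothetical list of daily visit counts.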
Fig 2. Receiver operator characteristic (ROC) curve for tested temporal methods.
Metrics for temporal methods for citywide injects at specificity 0.95.
| Sens (small, 1 SD) | PPV (small, 1 SD) | Time (small, 1 SD) | Sens (medium, 2 SD) | PPV (medium, 2 SD) | Time (medium, 2 SD) | Sens (large, 3 SD) | PPV (large, 3 SD) | Time (large, 3 SD) |
|---|---|---|---|---|---|---|---|---|
| 0.20 | 0.01 | N/A | 0.55 | 0.02 | N/A | 0.95 | 0.03 | N/A |
| 0.20 | 0.01 | N/A | 0.55 | 0.02 | N/A | 0.75 | 0.02 | N/A |
| 0.00 | 0.00 | N/A | 0.00 | 0.00 | N/A | 0.10 | 0.01 | N/A |
| 0.00 | 0.00 | N/A | 0.45 | 0.01 | N/A | 0.75 | 0.02 | N/A |
| 0.30 | 0.01 | N/A | 0.65 | 0.02 | N/A | 0.95 | 0.03 | N/A |
| 0.05 | 0.01 | N/A | 0.20 | 0.01 | N/A | 0.35 | 0.01 | N/A |
| 0.38 | 0.01 | 0.70 | 0.33 | 0.01 | 0.84 | 0.65 | 0.02 | 0.72 |
| 0.28 | 0.01 | 0.68 | 0.32 | 0.01 | 0.68 | 0.52 | 0.03 | 0.63 |
| 0.07 | 0.01 | 1.00 | 0.03 | 0.01 | 0.83 | 0.13 | 0.05 | 0.63 |
| 0.23 | 0.01 | 0.83 | 0.28 | 0.02 | 0.83 | 0.45 | 0.03 | 0.67 |
| 0.27 | 0.01 | 0.69 | 0.33 | 0.02 | 0.78 | 0.67 | 0.02 | 0.73 |
| 0.22 | 0.02 | 0.78 | 0.27 | 0.01 | 0.54 | 0.28 | 0.02 | 0.57 |
| 0.45 | 0.02 | 0.61 | 0.40 | 0.02 | 0.51 | 0.55 | 0.03 | 0.51 |
| 0.40 | 0.02 | 0.66 | 0.30 | 0.01 | 0.45 | 0.50 | 0.02 | 0.46 |
| 0.20 | 0.04 | 0.88 | 0.05 | 0.03 | 0.29 | 0.15 | 0.06 | 0.73 |
| 0.15 | 0.02 | 0.59 | 0.25 | 0.01 | 0.70 | 0.35 | 0.03 | 0.72 |
| 0.45 | 0.02 | 0.69 | 0.55 | 0.02 | 0.67 | 0.65 | 0.03 | 0.59 |
| 0.25 | 0.02 | 0.67 | 0.45 | 0.03 | 0.70 | 0.55 | 0.04 | 0.51 |
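The table pairs sensitivities as high as 0.95 with positive predictive values of only a few percent. That pattern follows from Bayes' rule whenever true signal days are rare: at 95% specificity, false alarms on the many background days outnumber true alerts. The sketch below shows the arithmetic; the prevalence passed in is whatever fraction of days the reader assumes to carry a true signal, and any specific value is hypothetical rather than taken from the study.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """PPV from Bayes' rule: P(true signal | alert)."""
    true_alerts = sensitivity * prevalence
    false_alerts = (1.0 - specificity) * (1.0 - prevalence)
    total = true_alerts + false_alerts
    return true_alerts / total if total > 0 else float("nan")
```

With an illustrative prevalence well below 1%, even a detector with 0.95 sensitivity and 0.95 specificity yields a PPV under 0.05, which is the same order of magnitude as the values reported above.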
Fig 3. ROC curve of spatial and spatio-temporal methods, all outbreak types.
Fig 4. ROC curves of spatial and spatio-temporal methods, by inject magnitude.
Fig 5. ROC curves of spatial and spatio-temporal methods, by inject type.
Programming metrics for tested methods.
| Method | Software | Programming time | Run time (sec) | Ease of use |
|---|---|---|---|---|
| ARIMA | SAS 9.2 | 3 weeks | 00:01 | Significant amount of testing needed for the final model inputs; also, it is a method that has not frequently been used at DOHMH so there was a learning curve; its description in the literature [ |
| C2 | SAS 9.2 | 1 week | 00:01 | One of the most commonly used methods; easy to understand; easy to code and well documented in the literature |
| CUSUM | SAS 9.2 | 1 week | 00:01 | Commonly used method; determining inputs to the CUSUM model was the most difficult part; otherwise, easy to code and well documented |
| GLM | SAS 9.2 | 3 days | 00:01 | Commonly used model; accessible to most analysts to understand, code, and troubleshoot |
| Holt-Winters | SAS 9.2 | 2 weeks | 00:01 | Experience was similar to the ARIMA where significant time went into developing the model specifications; the method has not been frequently used but is accessible and not difficult to understand |
| Temporal scan statistic | SaTScan | 3 days | 00:04 | Much of the programming is done already in the SaTScan program |
| GLMM | R | 1 week | 0:50 | Moderately difficult to program the model and output, knowledge of regression models needed |
| Bayesian | R, WinBUGS | 10 weeks | 2:00 | Highly difficult; extensive knowledge of several statistical packages and advanced statistics is required to understand and implement |
| Spatial scan statistic | SaTScan, R | 3 days | 0:06 | Much of the programming is done already in the SaTScan software |
| Space-time permutation | SaTScan, R | 3 days | 0:12 | Same as spatial scan statistic |