
Daily Reportable Disease Spatiotemporal Cluster Detection, New York City, New York, USA, 2014-2015.

Sharon K Greene, Eric R Peterson, Deborah Kapell, Annie D Fine, Martin Kulldorff.   

Abstract

Each day, the New York City Department of Health and Mental Hygiene uses the free SaTScan software to apply prospective space-time permutation scan statistics to strengthen early outbreak detection for 35 reportable diseases. This method prompted early detection of outbreaks of community-acquired legionellosis and shigellosis.

Keywords:  Legionella; New York; New York City; Shigella; USA; bacteria; communicable diseases; detection; disease clustering; epidemiology; foodborne diseases; legionellosis; outbreaks; shigellosis; surveillance; zoonoses

Year:  2016        PMID: 27648777      PMCID: PMC5038417          DOI: 10.3201/eid2210.160097

Source DB:  PubMed          Journal:  Emerg Infect Dis        ISSN: 1080-6040            Impact factor:   6.883


The Bureau of Communicable Disease (BCD) at the New York City Department of Health and Mental Hygiene (DOHMH) monitors and investigates >70 reportable diseases among the city’s 8.49 million residents. Each day, healthcare providers and laboratories submit ≈1,000 communicable disease reports to BCD. Clusters (significant increases in observed vs. expected cases) and outbreaks (clusters believed to be associated with a common infection source) are detected through several methods, including notification by astute healthcare providers and by applying the modified historical limits method to detect increases in disease counts during the previous 4 weeks. This temporal analysis is applied weekly citywide and for each of 5 boroughs and 42 neighborhoods.

Cluster detection methods have been applied to syndromic data sources (e.g., emergency department visits) since the early 2000s. Less extensively described is cluster detection using reportable disease data, which reflect specific laboratory-confirmed diagnoses, contain patient home addresses, and may include illness onset dates and work addresses collected during patient interviews and medical record reviews. Other public health practitioners have applied purely temporal prospective cluster detection methods to reportable disease data or conducted proof-of-concept spatiotemporal prospective analyses. However, published descriptions of actual prospective application of spatiotemporal methods to reportable diseases are rare, suggesting lack of widespread adoption among public health officials. We describe BCD’s experience with automated daily reportable disease spatiotemporal cluster detection using prospective space–time permutation scan statistics in SaTScan during February 2014–September 2015, highlighting instances in which findings guided public health action.

The Study

For 35 reportable communicable diseases for which cluster detection could inform programmatic activities, we analyzed disease counts for patients of all ages combined. For amebiasis, cryptosporidiosis, and giardiasis, for which outbreaks among young children are of particular interest, additional analyses were restricted to disease counts among patients <5 years of age, for 38 total daily analyses. In BCD’s application, the space–time permutation scan statistic detects disease clusters in space–time cylinders centered on every census tract centroid; the circular base represents space (maximum geographic cluster size of 50% of all reported cases), and the height represents time (maximum temporal window length of 30 days, for most diseases). For each cylinder, a likelihood ratio–based test statistic is calculated. The test statistic is considered elevated if the observed disease count during the time window in census tracts with centroids inside the cylinder’s circular base exceeds the expected number of cases, which is a function of 1) the case count in the circle during a baseline period (which accounts for any purely geographic variations in disease occurrence, diagnosis, and reporting) and 2) the total case count citywide during the time window (which accounts for citywide purely temporal patterns, such as seasonality or secular trends). The cylinder with the maximum test statistic is the cluster least likely to be due to chance under the null hypothesis that the same process generated disease counts inside and outside the cylinder. To create a simulated dataset, the dates of the cases are randomly shuffled and reassigned to the original census tracts. The maximum statistic for each simulated dataset is calculated in the same way as for the observed dataset. Each day, for each disease, this process is repeated 999 times. The maximum value for the observed dataset is ranked among the 999 trial maxima.
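The calculation for a single cylinder can be sketched in Python as follows. This is a minimal illustration, not the SaTScan implementation: it assumes the expected count and Poisson generalized likelihood ratio take the form given in the space–time permutation scan statistic paper listed in the references (Kulldorff et al., 2005), it counts cases in the circle over the whole study period, and all function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def expected_count(cases, tracts_in_circle, window_days):
    """Expected cases in a cylinder under the space-time permutation null.

    cases: list of (tract_id, day) pairs, one pair per reported case.
    Assumed form: (cases in circle, any day) * (cases citywide in window) / total.
    """
    total = len(cases)
    in_circle = sum(1 for tract, _ in cases if tract in tracts_in_circle)
    in_window = sum(1 for _, day in cases if day in window_days)
    return in_circle * in_window / total


def observed_count(cases, tracts_in_circle, window_days):
    """Cases that fall inside both the circle and the time window."""
    return sum(1 for tract, day in cases
               if tract in tracts_in_circle and day in window_days)


def log_likelihood_ratio(observed, expected, total):
    """Log of the assumed Poisson generalized likelihood ratio.

    Returns 0 unless the cylinder holds more cases than expected.
    """
    if observed <= expected:
        return 0.0
    llr = observed * math.log(observed / expected)
    if total > observed:
        llr += (total - observed) * math.log((total - observed) / (total - expected))
    return llr


def permute_dates(cases, rng):
    """One simulated dataset: shuffle dates, keep each case's census tract."""
    tracts = [tract for tract, _ in cases]
    days = [day for _, day in cases]
    rng.shuffle(days)
    return list(zip(tracts, days))


# Toy data (hypothetical): tract "A" has 4 cases on day 10 plus 2 earlier cases;
# tracts "B" and "C" each have 1 case per day on days 1-7.
cases = ([("A", 10)] * 4 + [("A", 1), ("A", 2)]
         + [("B", d) for d in range(1, 8)]
         + [("C", d) for d in range(1, 8)])
circle, window = {"A"}, {9, 10}

mu = expected_count(cases, circle, window)       # 6 * 4 / 20
obs = observed_count(cases, circle, window)      # the 4 day-10 cases in "A"
llr = log_likelihood_ratio(obs, mu, len(cases))
shuffled = permute_dates(cases, random.Random(0))
```

Because shuffling dates preserves both the number of cases per tract and the number of cases per day, the expected count for a given cylinder is the same in every simulated dataset; only the observed count changes.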
A p value (range 0.001–1) is derived from this ranking; p = 0.001 represents the highest significance relative to the permutation trials. The Monte Carlo approach to deriving significance by using repeated trials, each permuting observed data attributes, is designed to control for multiple testing. A recurrence interval (RI) is calculated as the reciprocal of the p value and represents the number of days of daily surveillance required for the expected number of clusters at least as unusual as the observed cluster to be equal to 1 by chance. We defined a signal as any cluster with an RI >100 days; that is, during any 100-day daily analysis period, the expected number of clusters at least as unlikely as the current cluster is 1. We developed a SAS program (SAS Institute, Inc., Cary, NC, USA) to generate case and parameter files (Table 1), read in a coordinate file of census tract centroids, invoke SaTScan in batch mode, read analysis results back into SAS for further processing, and output files to secured folders. For any signals, the program also generated emails notifying BCD leadership and staff responsible for follow-up (Technical Appendix).
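The ranking, p value, and recurrence interval described above can be illustrated with a short Python sketch. It assumes the conventional Monte Carlo rank p value (rank of the observed maximum among the observed and simulated maxima, divided by the number of simulations plus 1), which reproduces the stated range of 0.001–1 for 999 trials. The function names are hypothetical, and the strict comparison against the threshold follows the wording of the text.

```python
def monte_carlo_rank(observed_max, simulated_maxima):
    """Rank of the observed maximum; 1 means it beat every simulated maximum."""
    return 1 + sum(1 for s in simulated_maxima if s >= observed_max)


def p_value(rank, n_simulations):
    """Monte Carlo p value; with 999 simulations the smallest value is 0.001."""
    return rank / (n_simulations + 1)


def recurrence_interval_days(rank, n_simulations):
    """Reciprocal of the p value, in days of daily surveillance."""
    return (n_simulations + 1) / rank


def is_signal(rank, n_simulations, threshold_days=100):
    """Signal if the recurrence interval exceeds the threshold (assumed strict)."""
    return recurrence_interval_days(rank, n_simulations) > threshold_days


# Hypothetical simulated maxima: 999 values from 0.00 to 9.98.
simulated = [i * 0.01 for i in range(999)]

rank_top = monte_carlo_rank(12.0, simulated)     # higher than all 999 trials
rank_second = monte_carlo_rank(9.975, simulated) # exceeded by exactly 1 trial
```

With these toy values, the first cluster has rank 1 (p = 0.001, RI = 1,000 days) and the second has rank 2 (p = 0.002, RI = 500 days).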
Table 1

Case file specifications for routine reportable disease analyses in New York City, New York, using the prospective space–time permutation scan statistic

Feature: Geographic aggregation
Selection: Census tract (defined using US Census 2000 boundaries) of residential address at time of report*
Notes: The less data are spatially aggregated, the more precisely areas with elevated rates can be identified. New York City has 2,216 census tracts in an area of 305 square miles.

Feature: Date of interest for analysis
Selection: Event date, defined using hierarchy of onset date → diagnosis date (collection date of first specimen testing positive) → report date → date event created in surveillance database
Notes: Defining reportable disease clusters according to when case-patients became ill is preferred. However, onset date is missing for most case-patients who have not yet been interviewed, and each case needs a date to be included in analysis. Thus, the best available proxy for onset date is used. Because we use daily data (rather than weekly, monthly, or yearly data), the time precision is specified as day on the SaTScan (http://www.satscan.org/) input tab. The time precision parameter indicates the temporal resolution of the data in the case file.

Feature: Study period
Selection: 1 y for most diseases, ending the day before analysis†
Notes: One year is a reasonable choice, balancing the need for a period long enough to establish a stable local baseline for each spatial unit, yet short enough to avoid variable secular trends (e.g., geographically different increases in the underlying population of a spatial unit). Analyses are run each morning using data with event dates through the previous day.

Feature: Case inclusion criteria
Selection: Include all reported cases, regardless of current status (e.g., confirmed, probable, suspected, pending, noncase)†
Notes: Depending on the disease, cases initially might be assigned a transient pending status and, upon investigation, be reclassified as a case (confirmed, probable, or suspected) or a noncase. Timeliness is preserved by analyzing all reported cases, including noncases and pending cases, regardless of whether they ultimately will be confirmed. By analyzing all reported cases, case inclusion criteria are consistent across the study period. If instead the case file were restricted to confirmed and pending cases, then analyses would be biased toward false signaling, as some cases with an initial pending status will be ultimately reclassified after investigation as a noncase. This reclassification process is complete for the baseline but ongoing for the current period of interest (1), and the speed of reclassification might vary geographically.

Feature: Day-of-week variable
Selection: Include a variable that indicates the day of the week (1–7)
Notes: The analysis automatically adjusts for day-of-week effects but not for space by day-of-week interaction. Including this variable in the SaTScan case file accounts for how the daily pattern of health-seeking behavior and diagnoses might vary geographically.

*Exception to residential address at time of report: if the residential address is not geocodable (e.g., because the case-patient is not a resident of the city or because a post office box is reported instead of a street address), then the geocoded work address, if available, is substituted.
†For exceptions, see online Technical Appendix (http://wwwnc.cdc.gov/EID/article/22/10/16-0097-Techapp1.pdf).
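The case file specifications in Table 1 can be sketched in Python as follows. The authors' actual program was written in SAS (see Technical Appendix); this is only an illustration under stated assumptions. It assumes a space-delimited case file row of location ID, case count, date, and one covariate, with dates written as YYYY/MM/DD, and it assumes Monday = 1 through Sunday = 7 for the day-of-week variable, which the table does not specify. The field names and the tract identifier are hypothetical.

```python
from collections import Counter
from datetime import date


def event_date(onset=None, diagnosis=None, report=None, created=None):
    """First available date in the hierarchy onset -> diagnosis -> report -> created."""
    for candidate in (onset, diagnosis, report, created):
        if candidate is not None:
            return candidate
    raise ValueError("case has no usable date")


def case_file_lines(cases):
    """Aggregate cases by (census tract, event date) into case file rows.

    cases: iterable of dicts with a 'tract' key and optional date keys
    'onset', 'diagnosis', 'report', 'created' (hypothetical field names).
    """
    counts = Counter()
    for case in cases:
        d = event_date(case.get("onset"), case.get("diagnosis"),
                       case.get("report"), case.get("created"))
        counts[(case["tract"], d)] += 1
    # Assumed row layout: tract, count, date, day of week (1 = Monday ... 7 = Sunday).
    return [f"{tract} {n} {d:%Y/%m/%d} {d.isoweekday()}"
            for (tract, d), n in sorted(counts.items())]


# Hypothetical cases: the second has no onset date, so its diagnosis date is used.
example_cases = [
    {"tract": "36005002500", "onset": date(2015, 7, 17), "report": date(2015, 7, 20)},
    {"tract": "36005002500", "diagnosis": date(2015, 7, 17), "report": date(2015, 7, 21)},
    {"tract": "36005002701", "report": date(2015, 7, 18), "created": date(2015, 7, 19)},
]
lines = case_file_lines(example_cases)
```

All reported cases are passed through regardless of status, in line with the case inclusion criteria above; any filtering by status would be a departure from the table.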

This automated analysis detected the second largest US outbreak of community-acquired legionellosis, identifying a cluster of 8 cases centered in the South Bronx on Friday, July 17, 2015 (RI = 500 days) (Figure), before any human public health monitor noticed it. On Monday, July 20, an increase in cases was independently noticed by BCD staff members routinely investigating individual cases, and on July 21, an infection-control nurse working in the outbreak area called BCD to report an increase. The DOHMH and state and federal partners conducted an extensive epidemiologic, environmental, and laboratory investigation to identify and remediate the outbreak source, a cooling tower.
Figure

Automated output from spatiotemporal analysis on July 17, 2015, indicating a cluster (dark gray) of 8 legionellosis cases over 8 days centered in the South Bronx, New York City, New York, USA. In subsequent days, this cluster expanded in space and time into the second largest US outbreak of community-acquired legionellosis.

A shigellosis outbreak among the observant Jewish community in Brooklyn began in late October 2014 and was detected with 9 cases on November 14, 2014 (RI = 333 days). BCD does not routinely investigate individual shigellosis reports, so automated analysis alone prompted early outbreak identification. Shigellosis outbreaks within this community occur cyclically and have been linked to daycare and preschool attendance. Starting in mid-November, BCD staff visited community schools, daycare centers, and health fairs to promote appropriate handwashing. The outbreak subsided by mid-March 2015. Other clusters prompting investigations included legionellosis (Queens, April–May 2015) and campylobacteriosis (Brooklyn, October 2014). During a 1-year period, 28 unique signals were observed across 15 diseases (Table 2), which staff perceived as a reasonable number for investigation.
Table 2

Signaling rates at 3 recurrence interval thresholds for 35 reportable diseases under surveillance in New York City, New York, USA, 2014–2015*

The last 3 columns give the no. signals during 365 d of prospective surveillance†, by recurrence interval threshold.

Disease | Annual no. cases‡ | Recurrence interval >365 d§ | Recurrence interval >100 d | Recurrence interval >30 d
Amebiasis | 476 | 0 | 1.2 | 4.3
Babesiosis | 57 | 0 | 0 | 0
Campylobacteriosis | 1,663 | 0.6 | 0.6 | 4.9
Chikungunya | 171 | 0.6 | 1.8 | 3.1
Cholera | 0 | 0 | 0 | 0
Cryptosporidiosis | 135 | 0 | 0 | 0.6
Cyclosporiasis | 51 | 0 | 0 | 1.2
Dengue | 57 | 0 | 0 | 1.8
Encephalitis | 2 | 0 | 0 | 0
Giardiasis | 871 | 1.2 | 1.8 | 5.5
Hemolytic uremic syndrome | 4 | 0 | 0 | 0
Hepatitis A | 78 | 1.9 | 1.9 | 5.8
Acute hepatitis B | 51 | 0.6 | 1.2 | 3.7
Hepatitis D | 0 | 0 | 0 | 0
Hepatitis E | 0 | 0 | 0.6 | 0.6
Human granulocytic anaplasmosis | 51 | 0.6 | 0.6 | 0.6
Human monocytic ehrlichiosis | 8 | 0 | 0.6 | 0.6
Invasive group A Streptococcus disease | 263 | 0 | 0 | 1.8
Invasive group B Streptococcus disease | 33 | 0.6 | 1.2 | 2.4
Invasive Haemophilus influenzae disease | 97 | 0 | 0 | 1.8
Invasive Streptococcus pneumoniae disease | 647 | 0 | 1.2 | 1.8
Legionellosis | 434 | 9.1 | 9.1 | 11.4
Listeriosis | 34 | 0 | 0 | 0.6
Malaria | 187 | 0.6 | 1.8 | 4.3
Meningococcal disease | 8 | 0 | 0 | 0.6
Noncholera Vibrio spp. infection | 18 | 0 | 0 | 0
Paratyphoid fever | 11 | 0 | 0 | 0
Rickettsialpox | 9 | 0 | 0 | 0
Rocky Mountain spotted fever | 6 | 0 | 0 | 2.4
Shiga toxin–producing Escherichia coli | 96 | 0 | 0 | 0
Shigellosis | 806 | 1.8 | 1.8 | 6.1
Typhoid fever | 31 | 0 | 1.9 | 3.8
Vancomycin-intermediate Staphylococcus aureus infection | 28 | 0 | 0 | 0
West Nile virus disease | 19 | 0 | 0 | 0
Yersiniosis | 25 | 0 | 0 | 0
Total signals across all diseases under surveillance | NA | 17.8 | 27.6 | 69.8

*Signals were detected by using the prospective space–time permutation scan statistic. NA, not applicable.
†A signal for a particular disease was defined as unique if the first most likely cluster on a particular day did not encompass any of the same census tracts as the first most likely cluster on the prior day. The signaling rate for most diseases was based on 598 d of surveillance (February 10, 2014–September 30, 2015). For 5 diseases, the signaling rate was based on a shorter surveillance period to reflect analytic adjustments: hepatitis A, paratyphoid fever, and typhoid fever (190 d under surveillance after extending to a 60-d maximum temporal cluster size); legionellosis (160 d under surveillance after excluding unresolved cases); and Shiga toxin–producing E. coli (21 d under surveillance after excluding cases with only a positive multiplex PCR gastrointestinal panel test).
‡Confirmed, probable, and suspected cases among residents with event dates October 1, 2014–September 30, 2015.
§The signal was detected at the lower ≥100-d threshold on the same day for 50% of the signals, 1 d earlier for 19% of signals, 2 d earlier for 19% of signals, 3 d earlier for 6% of signals, and 7 d earlier for 6% of signals.
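The uniqueness rule and the per-365-day signaling rates in Table 2 can be illustrated with a short Python sketch. Two points are assumptions rather than statements from the table: the sketch compares each day's cluster only with the prior day's signaling cluster (treating a day without a signal as having no cluster), and it assumes rates were obtained by scaling a whole-number signal count to 365 days. The tabulated values are consistent with that scaling, but the underlying counts used below are inferred, not reported. Function names are hypothetical.

```python
def count_unique_signals(daily_clusters):
    """Count unique signals in a day-ordered sequence.

    daily_clusters: one entry per day, either a set of census tract IDs for
    that day's most likely signaling cluster or None when nothing signaled.
    A signal is counted as unique when it shares no tract with the prior day's cluster.
    """
    unique = 0
    previous = None
    for tracts in daily_clusters:
        if tracts is not None and (previous is None or not (tracts & previous)):
            unique += 1
        previous = tracts
    return unique


def signals_per_365_days(n_signals, days_under_surveillance):
    """Scale a signal count to a 365-d rate."""
    return n_signals * 365 / days_under_surveillance


# Hypothetical week: the day-3 cluster overlaps day 2, so it is not counted again.
week = [None, {"a", "b"}, {"b", "c"}, None, {"b"}, {"x"}]
n_unique = count_unique_signals(week)

rate_598 = round(signals_per_365_days(1, 598), 1)  # most diseases: 598 d
rate_190 = round(signals_per_365_days(1, 190), 1)  # e.g., hepatitis A: 190 d
rate_160 = round(signals_per_365_days(4, 160), 1)  # legionellosis: 160 d
```

Under these assumptions, 1 signal in 598 d corresponds to the tabulated 0.6, 1 signal in 190 d to 1.9, and 4 signals in 160 d to 9.1.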

Not all detected clusters were actionable. No public health response was conducted for an amebiasis cluster (Manhattan, April 2015; RI = 143 days) consisting of 6 men (34–49 years of age) diagnosed within a 12-day period and residing within a 0.35-mile radius because no case-patients were identified as food handlers or daycare workers. A public health response also was not conducted for a giardiasis cluster (Bronx, April 2015; RI = 1,000 days) that consisted of 6 household members who acquired the infection during international travel. Investigators were interested in being notified of and following such clusters over time, even if they ultimately were not actionable or verified as true outbreaks.

Conclusions

Several outbreaks in New York City, New York, were detected by daily automated spatiotemporal analyses. Early cluster detection facilitated prioritization of individual case investigations, outbreak recognition and investigation, provider and community outreach, and timely intervention to limit sickness and death. This method has proven particularly useful for identifying and monitoring outbreaks of shigellosis and legionellosis and might be useful for monitoring additional diseases with outbreak potential, including pertussis, syphilis, and tuberculosis. Key to the system’s success is a strong informatics infrastructure, especially electronic laboratory reporting and near real-time geocoding of surveillance data. Other facilitators include a powerful statistical disease surveillance methodology, knowledgeable epidemiologists to interpret signals, and adequate outbreak investigation resources. These methods could be useful to other health departments receiving more reports than can be rapidly reviewed manually. State health departments could consider conducting similar analyses to detect clusters spanning multiple jurisdictions.

Technical Appendix

Supplemental information, SAS code, and sample output for daily reportable disease spatiotemporal cluster detection, New York City, New York, USA.

References

1.  A recurring outbreak of Shigella sonnei among traditionally observant Jewish children in New York City: the risks of daycare and household transmission.

Authors:  V Garrett; K Bornschlegel; D Lange; V Reddy; L Kornstein; J Kornblum; A Agasan; M Hoekstra; M Layton; J Sobel
Journal:  Epidemiol Infect       Date:  2006-04-20       Impact factor: 2.451

2.  A generalized linear mixed models approach for detecting incident clusters of disease in small areas, with an application to biological terrorism.

Authors:  Ken Kleinman; Ross Lazarus; Richard Platt
Journal:  Am J Epidemiol       Date:  2004-02-01       Impact factor: 4.897

3.  Use of a prospective space-time scan statistic to prioritize shigellosis case investigations in an urban jurisdiction.

Authors:  Roderick C Jones; Monica Liberatore; Julio R Fernandez; Susan I Gerber
Journal:  Public Health Rep       Date:  2006 Mar-Apr       Impact factor: 2.792

4.  The bioterrorism preparedness and response Early Aberration Reporting System (EARS).

Authors:  Lori Hutwagner; William Thompson; G Matthew Seeman; Tracee Treadwell
Journal:  J Urban Health       Date:  2003-06       Impact factor: 3.671

5.  Syndromic surveillance in public health practice, New York City.

Authors:  Richard Heffernan; Farzad Mostashari; Debjani Das; Adam Karpati; Martin Kulldorff; Don Weiss
Journal:  Emerg Infect Dis       Date:  2004-05       Impact factor: 6.883

6.  Near real-time space-time cluster analysis for detection of enteric disease outbreaks in a community setting.

Authors:  Aharona Glatman-Freedman; Zalman Kaufman; Eran Kopel; Ravit Bassal; Diana Taran; Lea Valinsky; Vered Agmon; Manor Shpriz; Daniel Cohen; Emilia Anis; Tamy Shohat
Journal:  J Infect       Date:  2016-06-14       Impact factor: 6.072

7.  An evaluation of SaTScan for the prospective detection of space-time Campylobacter clusters in the North East of England.

Authors:  G J Hughes; R Gorton
Journal:  Epidemiol Infect       Date:  2013-01-25       Impact factor: 4.434

8.  A space-time permutation scan statistic for disease outbreak detection.

Authors:  Martin Kulldorff; Richard Heffernan; Jessica Hartman; Renato Assunção; Farzad Mostashari
Journal:  PLoS Med       Date:  2005-02-15       Impact factor: 11.069

9.  Refining historical limits method to improve disease cluster detection, New York City, New York, USA.

Authors:  Alison Levin-Rector; Elisha L Wilson; Annie D Fine; Sharon K Greene
Journal:  Emerg Infect Dis       Date:  2015-02       Impact factor: 6.883

10.  Laboratory-based prospective surveillance for community outbreaks of Shigella spp. in Argentina.

Authors:  María R Viñas; Ezequiel Tuduri; Alicia Galar; Katherine Yih; Mariana Pichel; John Stelling; Silvina P Brengi; Anabella Della Gaspera; Claudia van der Ploeg; Susana Bruno; Ariel Rogé; María I Caffer; Martin Kulldorff; Marcelo Galas
Journal:  PLoS Negl Trop Dis       Date:  2013-12-12