Prosper Kandabongee Yeng, Ashenafi Zebene Woldaregay, Terje Solvoll, Gunnar Hartvigsen.
Abstract
BACKGROUND: The time lag in detecting disease outbreaks remains a threat to global health security. The advancement of technology has made health-related data and other indicator activities easily accessible for syndromic surveillance of various datasets. At the heart of disease surveillance lies the clustering algorithm, which groups data with similar characteristics (spatial, temporal, or both) to uncover significant disease outbreaks. Despite these developments, there is a lack of updated reviews of trends and modelling options in cluster detection algorithms.
Keywords: aberration detection; sentinel surveillance; space-time clustering
Year: 2020 PMID: 32357126 PMCID: PMC7284413 DOI: 10.2196/11512
Source DB: PubMed Journal: JMIR Public Health Surveill ISSN: 2369-2960
Data categories and their definitions.
| Category | Definition |
| Clustering and aberration detection algorithm | The kind of clustering and aberration detection algorithm used and implemented in the study. |
| Type of clustering algorithm | The type of algorithm used (spatial, temporal, or spatiotemporal algorithm). |
| Threshold | The type of threshold used to generate alarms and alerts in the study. |
| Design method | The design method used in implementing the system, such as prototype, participatory or joint application development, or agile or waterfall model. |
| Evaluation criteria | The criteria used to evaluate the algorithms. |
| Performance metrics | The performance metrics used to evaluate the algorithms, such as sensitivity, specificity, and positive predictive value. |
| Type of location | Locations used in clustering, including geolocation, postal codes, and counties; specifies the exact type of location used in the system. |
| Source of location | Where the location information was obtained. |
| Nature of location | Whether the location is static or dynamic. |
| Visualization tool | The type of tool used to implement the visualization aspect of the system. |
| Display report | The type of visual displays (eg, graphs, maps, time series) implemented by the various systems in the study. |
| Design layout | The stages and processes used in the architectural design of the syndromic surveillance system (eg, a layout may consist of data acquisition, clustering and aberration detection, and visualization [ |
Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram of the literature review process.
Summary of articles reviewed.
| Reference (first author, year) | Target disease | Place | Period | Input source |
| Gesteland, 2003 [ | Bioterrorism | 2002, Olympics | 2002 | Chief complaints from emergency departments |
| Yan, 2013 [ | Infectious diseases | Rural China | 2012 | Symptoms of patients from health facilities, medication sales from pharmacies, and primary school absenteeism |
| Maciejewski, 2009 [ | Detection of public health emergencies | Indiana State Department of Health | 2009-2010 | Symptoms in emergency departments |
| Thapen, 2016 [ | Generalized disease nowcasting | United Kingdom | 2014 | Twitter and |
| Thapen, 2016 [ | Infectious diseases, eg, hay fever and flu | England and Wales | 2014 | |
| Gomide, 2011 [ | Dengue | Observatório da Dengue website (www.observatorio.inweb.org.br/dengue/) | N/Aa | |
| Qi, 2013 [ | Influenza infection | University campus | Spring 2011 | Movement trajectory |
| Mathes, 2017 [ | Infectious diseases | New York City | Since 2001 | Emergency department visits with infectious diseases such as cough, sore throat, and fever for influenzalike illness |
| Yih, 2010 [ | Acute illness for bioterrorism event | Greater Boston area, Greater Twin Cities area, Austin and Travis County, San Mateo County | 2007-2008 | Ambulatory care encounters |
| Kleinman, 2005 [ | Lower respiratory tract infection | Boston area | N/A | Ambulatory care encounters |
| Dafni, 2004 [ | Emergency department data | Athens, 2004 Olympic Games | 2002-2003 | Symptoms in emergency department |
| Wagner, 2004 [ | Infectious disease | Utah, Atlantic City | 1999 | Chief-complaint data |
| Weng, 2015 [ | Enterovirus and influenza | Taipei | 2010/2011 | School-based syndromes |
| Maciejewski, 2010 [ | Respiratory illness | State of Indiana | 2007 | Infectious disease |
| Higgs, 2007 [ | Comprehensive tuberculosis data | San Francisco homeless | 1991-2002 | Tuberculosis |
| Ali, 2016 [ | Infectious disease | Pakistan | 2011-2015 | Chief complaints from emergency departments |
| Groeneveld, 2017 [ | Infectious disease | Netherlands | 2014/2015 | Respiratory tract infection, hepatitis, and encephalitis/meningitis |
| Kajita, 2017 [ | Emergency department data | Los Angeles County Department of Public Health, 2015 Special Olympic Games | 2015 | Monitor health impact |
| Choi, 2010 [ | Infectious disease | Hong Kong | 2005 | Febrile patients |
| Heffernan, 2004 [ | Emergency department chief complaint | New York City Department of Health and Mental Hygiene | 2001-2002 | Infectious disease, eg, respiratory, fever, diarrhea, and vomiting |
| Takahashi, 2008 [ | Infectious disease | Massachusetts | 2005 | Daily syndromic surveillance data |
| Besculides, 2005 [ | Infectious disease | New York City | 2001-2002 | School absenteeism data |
| Blake, 2016 [ | Poliomyelitis outbreaks | N/A | 2003-2012 | Reporting of acute flaccid paralysis cases and laboratory confirmation |
| Greene, 2012 [ | Gastrointestinal disease outbreak detection | Kaiser Permanente Northern California | 2009 | Data streams from electronic medical records |
| Vilain, 2016 [ | Infectious disease | French Institute for Public Health Surveillance, Reunion Island | 2013-2014 | Emergency department visits |
| Sharip, 2006 [ | Infectious disease | Los Angeles County | 2003-2004 | Emergency department syndromic data |
| Duangchaemkarn, 2017 [ | Infectious disease | N/A | 2016-2017 | Chief complaint symptoms |
a N/A: not available.
Frequency of clustering and aberration detection algorithms (n=66).
| Algorithm | Usage, n (%) |
| Cumulative summation | 10 (15) |
| Space-time permutation scan statistic | 10 (15) |
| Space-time scan statistic | 5 (8) |
| Space scan statistic | 4 (6) |
| Kernel density | 3 (5) |
| Moving average | 3 (5) |
| Log-linear regression | 2 (3) |
| Density-based spatial clustering of applications with noise | 2 (3) |
| Recursive least squares | 2 (3) |
| Statistical process control | 2 (3) |
| Autoregressive integrated moving average | 2 (3) |
| Risk-adjusted support vector clustering | 1 (2) |
| Bayesian spatial scan statistic | 1 (2) |
| Exponentially weighted moving average | 1 (2) |
| Flexible space-time scan statistic | 1 (2) |
| k-means clustering | 1 (2) |
| K-nearest neighbor with Haversine distance | 1 (2) |
| Shewhart chart | 1 (2) |
| Pulsar method | 1 (2) |
| Risk-adjusted nearest neighbor hierarchical clustering | 1 (2) |
| Small area regression and testing | 1 (2) |
| Spatiotemporal density-based spatial clustering of applications with noise | 1 (2) |
| What's strange about recent events | 1 (2) |
| Bayesian space-time regression | 1 (2) |
| Generalized linear mixed model | 1 (2) |
| Generalized linear model | 1 (2) |
| Holt-Winters exponential smoother | 1 (2) |
| Temporal scan statistic | 1 (2) |
| Modified Early Aberration Reporting System C2 | 1 (2) |
| Temporal aberration detection | 1 (2) |
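Cumulative summation is one of the two most frequently used approaches in the table above. The following is a minimal sketch of a one-sided upper cumulative sum detector on daily counts, not the implementation used by any of the reviewed systems. The function name `cusum_alerts`, the baseline length, the reference value `k`, the decision threshold `h`, and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def cusum_alerts(counts, baseline_len=7, k=0.5, h=4.0):
    """Return indices of days on which the upper cumulative sum exceeds h.

    counts       -- daily case counts (illustrative data, not from the review)
    baseline_len -- number of leading days used to estimate mean and SD
    k, h         -- reference value and decision threshold, in SD units
                    (assumed values; real systems tune these)
    """
    baseline = counts[:baseline_len]
    mu = mean(baseline)
    sigma = stdev(baseline) or 1.0  # avoid division by zero on a flat baseline
    s = 0.0
    alerts = []
    for day, count in enumerate(counts[baseline_len:], start=baseline_len):
        z = (count - mu) / sigma      # standardized deviation from baseline
        s = max(0.0, s + z - k)       # accumulate only upward drift
        if s > h:
            alerts.append(day)
            s = 0.0                   # reset after raising an alarm
    return alerts


daily_counts = [10, 12, 11, 9, 10, 11, 12, 10, 11, 25, 30, 12]
print(cusum_alerts(daily_counts))  # [9, 10]
```

Resetting the sum after each alarm is one common design choice; leaving it to decay naturally would keep the alarm raised for longer after a spike.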
Figure 2. Sensitivity and specificity of the evaluated algorithms.
Evaluation metrics of some algorithms.
| Algorithms | Specificity (%) | Sensitivity (%) | Detected cases (n) |
| Space-time permutation scan statistic | 82 | 83 | 26 |
| Pulsar method | 97 | 85 | 223 |
| Cumulative summation | 95 | 92 | 212 |
| Space scan statistic | 95 | 89 | 790 |
| Space-time scan statistic | 99 | 92 | 3 |
| Flexible space-time scan statistic | 82 | 99.5 | 4 |
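For reference, the performance metrics named in this review (sensitivity, specificity, and positive predictive value) can be computed from counts of true and false alarms as sketched below. The function name `performance_metrics` and the example counts are hypothetical and are not taken from any reviewed study.

```python
def performance_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and positive predictive value.

    tp -- outbreaks correctly flagged       fp -- false alarms
    tn -- quiet periods correctly ignored   fn -- missed outbreaks
    """
    def ratio(num, den):
        # Return None when a metric is undefined (empty denominator).
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of real outbreaks detected
        "specificity": ratio(tn, tn + fp),  # share of non-outbreaks left unflagged
        "ppv": ratio(tp, tp + fp),          # share of alarms that were real
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = performance_metrics(tp=46, fp=5, tn=95, fn=4)
print(metrics)
```

With these assumed counts, sensitivity is 46/50 and specificity is 95/100, which shows how an algorithm can score well on both while still raising some false alarms.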
Design layouts and their frequencies (n=22).
| Design layout | Description | Usage, n (%) |
| Data clustering and aberration detection, alarms and alerts (DCADAA) | This layout consists of obtaining data first. Then clustering and aberration detection are done, followed by generating alarms to create alerts of aberrations [ | 12 (55) |
| Data clustering and aberration detection, visualization, alarms and alerts (DCAVAA) | A visualizing module is built in addition to processes defined in DCADAA [ | 1 (5) |
| Data cleaning and transformation, clustering and aberration detection, visualization, alarms and alerts | In addition to the DCAVAA layout, this layout has data cleaning and transformation features. | 3 (14) |
| Data clustering, filtering or categorizing, aberration detection, alarms and alerts | In addition to DCADAA, this layout filters data or categorizes the data into some defined groups, either manually or by employing machine learning techniques. | 2 (9) |
| Data clustering and aberration detection, privacy-preserving mechanism (DPVCAAA) | In addition to DCAVAA, this layout has privacy-preserving mechanisms, such as anonymization and pseudonymization [ | 2 (9) |
| Real time, privacy-preserving mechanism, data clustering and aberration detection, alerts and alarms | On top of the DPVCAAA layout, there is an additional module for real-time data processing [ | 1 (5) |
| User tracking, data clustering, aberration detection, visualization, alarms and alerts | In addition to DCAVAA, this layout tracks the user’s movement to obtain data. This is followed by validating the data before clustering and aberration detection [ | 1 (5) |
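The most common layout in the table above (DCADAA) chains three stages: obtaining data, clustering and aberration detection, and generating alarms and alerts. The sketch below illustrates that flow under strong simplifying assumptions: grouping records by region stands in for a real clustering algorithm, and a mean plus two standard deviations rule stands in for a real aberration detector. All function names, the threshold, and the sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def acquire(records):
    """Stage 1: aggregate (region, day) records into per-region daily counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for region, day in records:
        counts[region][day] += 1
    return counts


def detect(counts, n_days, threshold_sd=2.0):
    """Stage 2: flag regions whose latest count exceeds mean + threshold_sd * SD
    of the earlier days (a stand-in for a real detection algorithm).
    Requires n_days >= 3 so the history has at least two points."""
    flagged = []
    for region, by_day in sorted(counts.items()):
        series = [by_day.get(d, 0) for d in range(n_days)]
        history, latest = series[:-1], series[-1]
        limit = mean(history) + threshold_sd * stdev(history)
        if latest > limit:
            flagged.append((region, latest, limit))
    return flagged


def alert(flagged):
    """Stage 3: turn flagged regions into alert messages."""
    return [
        f"ALERT {region}: {latest} cases exceeds limit {limit:.1f}"
        for region, latest, limit in flagged
    ]


# Illustrative input: region A spikes on the last day, region B does not.
records = []
for region, series in {"A": [2, 3, 2, 3, 9], "B": [2, 3, 2, 3, 3]}.items():
    for day, n in enumerate(series):
        records += [(region, day)] * n

alerts = alert(detect(acquire(records), n_days=5))
print(alerts)
```

The other layouts in the table would add further stages around this chain, such as cleaning before detection or a visualization step before alerting.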
Figure 3. Cluster detection mechanism framework.
Summary of the most used categories.
| Category | Most used |
| Clustering algorithm | Space-time permutation scan statistic |
| Type of clustering | Spatiotemporal type |
| Threshold | Recurrence interval |
| Design method | Participatory design |
| Evaluation method | Simulation with historical data |
| Performance metric | Sensitivity |
| Type of location | Geocode |
| Source of location | Patient health record |
| Nature of location source | Static |
| Visualization tool used | ArcGIS |
| Displayed output | Maps |
| Layout | Data clustering and aberration detection, alarms and alerts |