
Learning from imbalanced data in surveillance of nosocomial infection.

Gilles Cohen, Mélanie Hilario, Hugo Sax, Stéphane Hugonnet, Antoine Geissbuhler.

Abstract

OBJECTIVE: Monitoring and detecting nosocomial, or hospital-acquired, infections (NIs) is an important problem in hospitals. This paper describes a retrospective analysis of a prevalence survey of NIs conducted at the Geneva University Hospital. Our goal is to identify patients with one or more NIs on the basis of clinical and other data collected during the survey.

METHODS AND MATERIAL: Standard surveillance strategies are time-consuming and cannot be applied hospital-wide; alternative methods are required. When NI detection is viewed as a classification task, the main difficulty lies in the significant imbalance between positive, or infected (11%), and negative (89%) cases. To remedy class imbalance, we explore two distinct avenues: (1) a new re-sampling approach in which both over-sampling of rare positives and under-sampling of the noninfected majority rely on synthetic cases (prototypes) generated via class-specific sub-clustering, and (2) a support vector algorithm in which asymmetrical margins are tuned to improve recognition of rare positive cases.

RESULTS AND CONCLUSION: Experiments have shown both approaches to be effective for the NI detection problem. Our novel re-sampling strategies perform remarkably better than classical random re-sampling. However, they are outperformed by asymmetrical soft margin support vector machines, which attained a sensitivity rate of 92%, significantly better than the highest sensitivity (87%) obtained via prototype-based re-sampling.
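
The prototype-based re-sampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or how many sub-clusters the authors used, so everything here is an assumption made for illustration: plain k-means stands in for the sub-clustering step, the toy 2-D data only mimics the 89%/11% class split, and the function names and cluster counts are hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on a list of equal-length tuples; returns k centroids.
    (Assumption: the abstract does not name the clustering method used.)"""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid (squared Euclidean distance)
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        # recompute centroids; an empty cluster keeps its previous centroid
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids


def undersample_with_prototypes(majority, k):
    """Replace the majority class by k synthetic prototypes (cluster centroids)."""
    return kmeans(majority, k)


def oversample_with_prototypes(minority, k):
    """Keep the original minority cases and add k synthetic prototypes."""
    return list(minority) + kmeans(minority, k)


def make_toy_data(n_neg=89, n_pos=11, seed=1):
    """Hypothetical 2-D data with the 89%/11% imbalance quoted in the abstract."""
    rng = random.Random(seed)
    neg = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_neg)]
    pos = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(n_pos)]
    return neg, pos


neg, pos = make_toy_data()
neg_protos = undersample_with_prototypes(neg, k=15)  # 89 negatives -> 15 prototypes
pos_aug = oversample_with_prototypes(pos, k=4)       # 11 positives + 4 prototypes
print(len(neg), len(pos), "->", len(neg_protos), len(pos_aug))
```

With these illustrative settings both classes end up with 15 cases each. Choosing the number of sub-clusters per class is the main tuning decision in such a scheme, and the values above are arbitrary.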

Year:  2005        PMID: 16233974     DOI: 10.1016/j.artmed.2005.03.002

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  28 in total

1.  Improving predictions in imbalanced data using Pairwise Expanded Logistic Regression.

Authors:  Xiaoqian Jiang; Robert El-Kareh; Lucila Ohno-Machado
Journal:  AMIA Annu Symp Proc       Date:  2011-10-22

2.  Integrating new data balancing technique with committee networks for imbalanced data: GRSOM approach.

Authors:  Danaipong Chetchotsak; Sirorat Pattanapairoj; Banchar Arnonkijpanich
Journal:  Cogn Neurodyn       Date:  2015-07-31       Impact factor: 5.082

3.  Comparative performance analysis of state-of-the-art classification algorithms applied to lung tissue categorization.

Authors:  Adrien Depeursinge; Jimison Iavindrasana; Asmâa Hidki; Gilles Cohen; Antoine Geissbuhler; Alexandra Platon; Pierre-Alexandre Poletti; Henning Müller
Journal:  J Digit Imaging       Date:  2008-11-04       Impact factor: 4.056

4.  Application of machine learning algorithms for clinical predictive modeling: a data-mining approach in SCT. (Review)

Authors:  R Shouval; O Bondi; H Mishan; A Shimoni; R Unger; A Nagler
Journal:  Bone Marrow Transplant       Date:  2013-10-07       Impact factor: 5.483

5.  Distance Metric Based Oversampling Method for Bioinformatics and Performance Evaluation.

Authors:  Meng-Fong Tsai; Shyr-Shen Yu
Journal:  J Med Syst       Date:  2016-05-16       Impact factor: 4.460

6.  The effects of data sources, cohort selection, and outcome definition on a predictive model of risk of thirty-day hospital readmissions.

Authors:  Colin Walsh; George Hripcsak
Journal:  J Biomed Inform       Date:  2014-08-23       Impact factor: 6.317

7.  Anomaly and signature filtering improve classifier performance for detection of suspicious access to EHRs.

Authors:  Jihoon Kim; Janice M Grillo; Aziz A Boxwala; Xiaoqian Jiang; Rose B Mandelbaum; Bhakti A Patel; Debra Mikels; Staal A Vinterbo; Lucila Ohno-Machado
Journal:  AMIA Annu Symp Proc       Date:  2011-10-22

8.  A Supervised Learning Process to Validate Online Disease Reports for Use in Predictive Models.

Authors:  Helena M M Patching; Laurence M Hudson; Warrick Cooke; Andres J Garcia; Simon I Hay; Mark Roberts; Catherine L Moyes
Journal:  Big Data       Date:  2015-12-01       Impact factor: 2.128

9.  Bias-corrected diagonal discriminant rules for high-dimensional classification.

Authors:  Song Huang; Tiejun Tong; Hongyu Zhao
Journal:  Biometrics       Date:  2010-12       Impact factor: 2.571

10.  A particle swarm based hybrid system for imbalanced medical data sampling.

Authors:  Pengyi Yang; Liang Xu; Bing B Zhou; Zili Zhang; Albert Y Zomaya
Journal:  BMC Genomics       Date:  2009-12-03       Impact factor: 3.969
