
Missing data in medical databases: impute, delete or classify?

Federico Cismondi, André S Fialho, Susana M Vieira, Shane R Reti, João M C Sousa, Stan N Finkelstein.

Abstract

BACKGROUND: The multiplicity of information sources for data acquisition in modern intensive care units (ICUs) makes the resulting databases particularly susceptible to missing data. Missing data can significantly affect the performance of predictive risk modeling, an important technique for developing medical guidelines. The two most commonly used strategies for managing missing data are to impute or to delete values; the former can cause bias, while the latter can cause both bias and loss of statistical power.
OBJECTIVES: In this paper we present a new approach for managing missing data in ICU databases in order to improve overall modeling performance.
METHODS: We use a statistical classifier followed by fuzzy modeling to determine more accurately which missing data should be imputed and which should not. We first develop a simulation test bed to evaluate performance, and then apply that knowledge to exactly the same database used in the previously published work [13].
RESULTS: In this work, the test beds produced datasets with 10-50% missing data. Using this new approach to missing data, we are able to significantly improve modeling performance parameters: classification accuracy by 11%, sensitivity by 13%, and specificity by 10%, with area under the receiver operating characteristic curve (AUC) improving by up to 13%.
CONCLUSIONS: In this work, we improve modeling performance in a simulated test bed, and then confirm the improvement by replicating previously published work with the proposed approach for missing data classification. We offer this new method to other researchers who wish to improve predictive risk modeling performance in the ICU through advanced missing data management.
Copyright © 2013 Elsevier B.V. All rights reserved.
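
For readers unfamiliar with the two baseline strategies named in the background, the following is a minimal Python sketch of listwise deletion and mean imputation, together with the accuracy, sensitivity, and specificity measures reported in the results. It is not the authors' method (the classifier and fuzzy modeling steps are not reproduced here). The toy records, the variable names `heart_rate` and `lactate`, and all function names are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing measurement.
records = [
    {"heart_rate": 80.0, "lactate": 1.0},
    {"heart_rate": None, "lactate": 2.0},
    {"heart_rate": 100.0, "lactate": None},
    {"heart_rate": 90.0, "lactate": 3.0},
]


def delete_incomplete(rows):
    """Listwise deletion: keep only rows with no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def impute_mean(rows):
    """Mean imputation: fill each gap with the mean of the observed values
    in the same column. Assumes every column has at least one observed value."""
    columns = list(rows[0].keys())
    col_means = {
        c: mean(r[c] for r in rows if r[c] is not None) for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else col_means[c]) for c in columns}
        for r in rows
    ]


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = positive).
    Assumes both classes occur in y_true, so no denominator is zero."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


complete_rows = delete_incomplete(records)  # discards half of the toy data
imputed_rows = impute_mean(records)         # keeps all rows, fills the gaps
metrics = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

On this toy input, deletion keeps only two of the four records, which illustrates the loss of statistical power, while mean imputation keeps all four but pulls the filled values toward the column average, which is one way bias can enter.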

Year: 2013    PMID: 23428358    DOI: 10.1016/j.artmed.2013.01.003

Source DB: PubMed    Journal: Artif Intell Med    ISSN: 0933-3657    Impact factor: 5.326


  16 in total

1.  Can structured EHR data support clinical coding? A data mining approach.

Authors:  José Carlos Ferrão; Mónica Duarte Oliveira; Filipe Janela; Henrique M G Martins; Daniel Gartner
Journal:  Health Syst (Basingstoke)       Date:  2020-03-01

2.  Preprocessing structured clinical data for predictive modeling and decision support. A roadmap to tackle the challenges.

Authors:  José Carlos Ferrão; Mónica Duarte Oliveira; Filipe Janela; Henrique M G Martins
Journal:  Appl Clin Inform       Date:  2016-12-07       Impact factor: 2.342

3.  Identifying and mitigating biases in EHR laboratory tests.

Authors:  Rimma Pivovarov; David J Albers; Jorge L Sepulveda; Noémie Elhadad
Journal:  J Biomed Inform       Date:  2014-04-13       Impact factor: 6.317

4.  The visual outcomes of idiopathic epiretinal membrane removal in eyes with ectopic inner foveal layers and preserved macular segmentation.

Authors:  Michele Coppola; Maria Brambati; Maria Vittoria Cicinelli; Alessandro Marchese; Emma Clara Zanzottera; Antonio Peroglio Deiro; Michal Post; Francesco Bandello
Journal:  Graefes Arch Clin Exp Ophthalmol       Date:  2021-02-02       Impact factor: 3.117

5.  Modeling plasticity during epileptogenesis by long short term memory neural networks.

Authors:  Marzieh Shahpari; Morteza Hajji; Javad Mirnajafi-Zadeh; Peyman Setoodeh
Journal:  Cogn Neurodyn       Date:  2021-09-15       Impact factor: 5.082

6.  Artificial intelligence in the management and treatment of burns: a systematic review.

Authors:  Francisco Serra E Moura; Kavit Amin; Chidi Ekwobi
Journal:  Burns Trauma       Date:  2021-08-19

7.  Spatially-Constrained Fisher Representation for Brain Disease Identification With Incomplete Multi-Modal Neuroimages.

Authors:  Yongsheng Pan; Mingxia Liu; Chunfeng Lian; Yong Xia; Dinggang Shen
Journal:  IEEE Trans Med Imaging       Date:  2020-03-24       Impact factor: 10.048

8.  Meal Pattern Analysis in Nutritional Science: Recent Methods and Findings. (Review)

Authors:  Cathal O'Hara; Eileen R Gibney
Journal:  Adv Nutr       Date:  2021-07-30       Impact factor: 8.701

9.  Predicting Missing Values in Medical Data via XGBoost Regression.

Authors:  Xinmeng Zhang; Chao Yan; Cheng Gao; Bradley A Malin; You Chen
Journal:  J Healthc Inform Res       Date:  2020-08-03

10.  Fuzzy Modeling to Predict Severely Depressed Left Ventricular Ejection Fraction following Admission to the Intensive Care Unit Using Clinical Physiology.

Authors:  Rúben Duarte M A Pereira; Cátia M Salgado; Andre Dejam; Shane R Reti; Susana M Vieira; João M C Sousa; Leo A Celi; Stan N Finkelstein
Journal:  ScientificWorldJournal       Date:  2015-08-05