A Combined Interpolation and Weighted K-Nearest Neighbours Approach for the Imputation of Longitudinal ICU Laboratory Data.

Sebastian Daberdaku, Erica Tavazzi, Barbara Di Camillo.

Abstract

The presence of missing data is a common problem that affects almost all clinical datasets. Since most available data mining and machine learning algorithms require complete datasets, accurately imputing (i.e. "filling in") the missing data is an essential step. This paper presents a methodology for the missing data imputation of longitudinal clinical data based on the integration of linear interpolation and a weighted K-Nearest Neighbours (KNN) algorithm. The Maximal Information Coefficient (MIC) values among features are employed as weights for the distance computation in the KNN algorithm in order to integrate intra- and inter-patient information. An interpolation-based imputation approach was also employed and tested both independently and in combination with the KNN algorithm. The final imputation is carried out by applying the best performing method for each feature. The methodology was validated on a dataset of clinical laboratory test results of 13 commonly measured analytes of patients in an intensive care unit (ICU) setting. The performance results are compared with those of 3D-MICE, a state-of-the-art imputation method for cross-sectional and longitudinal patient data. This work was presented in the context of the 2019 ICHI Data Analytics Challenge on Missing data Imputation (DACMI). © Springer Nature Switzerland AG 2020.
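As a rough illustration of the two building blocks named in the abstract, the sketch below pairs per-patient linear interpolation over time with a KNN estimate whose distance is weighted per feature. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `interpolate_linear` and `weighted_knn_impute` are hypothetical, the `weights` argument merely stands in for MIC values (which would have to be computed separately), and the normalised weighted Euclidean distance and inverse-distance averaging are illustrative choices that the abstract does not specify.

```python
import math


def interpolate_linear(times, values):
    """Fill None entries by linear interpolation over time.

    Assumes `times` is sorted in ascending order. Gaps at either edge
    copy the nearest observed value (an assumption of this sketch).
    """
    known = [(t, v) for t, v in zip(times, values) if v is not None]
    if not known:
        return list(values)  # nothing observed, nothing to interpolate from
    filled = []
    for t, v in zip(times, values):
        if v is not None:
            filled.append(v)
            continue
        before = [(tk, vk) for tk, vk in known if tk < t]
        after = [(tk, vk) for tk, vk in known if tk > t]
        if before and after:
            (t0, v0), (t1, v1) = before[-1], after[0]
            filled.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
        else:
            filled.append(before[-1][1] if before else after[0][1])
    return filled


def weighted_knn_impute(query, candidates, target, weights, k=3):
    """Estimate query[target] from the k nearest candidate rows.

    `weights[j]` is a hypothetical per-feature weight standing in for
    the MIC between feature j and the target feature. Only features
    observed in both rows contribute to the distance.
    """
    scored = []
    for row in candidates:
        if row[target] is None:
            continue  # a neighbour must have the target observed
        shared = [
            j for j in range(len(query))
            if j != target and query[j] is not None and row[j] is not None
        ]
        weight_sum = sum(weights[j] for j in shared)
        if weight_sum == 0:
            continue  # no informative features in common
        dist = math.sqrt(
            sum(weights[j] * (query[j] - row[j]) ** 2 for j in shared)
            / weight_sum
        )
        scored.append((dist, row[target]))
    if not scored:
        return None
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    # Inverse-distance weighting; the small constant avoids division by zero.
    inverse = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(inverse, nearest)) / sum(inverse)


if __name__ == "__main__":
    # One analyte measured at four time points, two of them missing.
    print(interpolate_linear([0, 1, 2, 3], [1.0, None, None, 4.0]))
    # Two-feature rows; feature 1 is the target to impute.
    rows = [[10.0, 30.0], [20.0, 60.0]]
    print(weighted_knn_impute([12.0, None], rows, target=1,
                              weights=[1.0, 0.0], k=2))
```

Choosing between the two estimates (or a combination of them) separately for each feature, as the abstract describes, would then amount to masking known values, scoring each method on the masked entries, and keeping the one with the lowest error for that feature.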

Keywords:  Clinical datasets; DACMI; Imputation; Interpolation; KNN

Year:  2020        PMID: 35415441      PMCID: PMC8982781          DOI: 10.1007/s41666-020-00069-1

Source DB:  PubMed          Journal:  J Healthc Inform Res        ISSN: 2509-498X

