Random Forest Missing Data Algorithms.

Fei Tang, Hemant Ishwaran.

Abstract

Random forest (RF) missing data algorithms are an attractive approach for imputing missing data. They can handle mixed types of missing data, adapt to interactions and nonlinearity, and have the potential to scale to big data settings. Currently there are many different RF imputation algorithms, but relatively little guidance about their efficacy. Using a large, diverse collection of data sets, imputation performance of various RF algorithms was assessed under different missing data mechanisms. Algorithms included proximity imputation, on-the-fly imputation, and imputation utilizing multivariate unsupervised and supervised splitting, the latter class representing a generalization of a promising new imputation algorithm called missForest. Our findings reveal RF imputation to be generally robust, with performance improving with increasing correlation. Performance was good under moderate to high missingness, and even (in certain cases) when data were missing not at random.
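
The evaluation idea in the abstract (hide known values under a chosen missing data mechanism, impute them, then score the imputed values against the truth) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the two masking rules are simplified stand-ins for "missing completely at random" and "missing not at random", and a plain mean imputer is used as a placeholder where an RF imputer would be plugged in.

```python
import math
import random


def mask_mcar(values, rate, rng):
    """Indices hidden completely at random: every entry has the same chance."""
    return {i for i in range(len(values)) if rng.random() < rate}


def mask_mnar(values, rate, rng):
    """Indices hidden not at random: only above-median entries can go missing,
    so missingness depends on the unobserved value itself."""
    cutoff = sorted(values)[len(values) // 2]
    return {i for i, v in enumerate(values)
            if v > cutoff and rng.random() < 2 * rate}


def mean_impute(values, hidden):
    """Placeholder imputer (NOT a random forest): fill with the observed mean."""
    observed = [v for i, v in enumerate(values) if i not in hidden]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = sum(observed) / len(observed)
    return [fill if i in hidden else v for i, v in enumerate(values)]


def imputation_rmse(truth, imputed, hidden):
    """Root mean squared error over the hidden entries only."""
    if not hidden:
        return 0.0
    sq = sum((truth[i] - imputed[i]) ** 2 for i in hidden)
    return math.sqrt(sq / len(hidden))


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [rng.gauss(0.0, 1.0) for _ in range(500)]
    for name, mask in (("MCAR", mask_mcar), ("MNAR", mask_mnar)):
        hidden = mask(truth, 0.25, rng)
        imputed = mean_impute(truth, hidden)
        print(name, len(hidden), round(imputation_rmse(truth, imputed, hidden), 3))
```

Swapping `mean_impute` for any imputer with the same inputs and output would let the same loop compare algorithms across mechanisms and missingness rates, which is the general shape of the comparison the abstract describes.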

Keywords:  Correlation; Imputation; Machine Learning; Missingness; Splitting (random; multivariate; univariate; unsupervised)

Year:  2017        PMID: 29403567      PMCID: PMC5796790          DOI: 10.1002/sam.11348

Source DB:  PubMed          Journal:  Stat Anal Data Min        ISSN: 1932-1864            Impact factor:   1.051


  9 in total

1.  Missing value estimation methods for DNA microarrays.

Authors:  O Troyanskaya; M Cantor; G Sherlock; P Brown; T Hastie; R Tibshirani; D Botstein; R B Altman
Journal:  Bioinformatics       Date:  2001-06       Impact factor: 6.937

2.  MissForest--non-parametric missing value imputation for mixed-type data.

Authors:  Daniel J Stekhoven; Peter Bühlmann
Journal:  Bioinformatics       Date:  2011-10-28       Impact factor: 6.937

3.  Multiple imputation of discrete and continuous data by fully conditional specification.

Authors:  Stef van Buuren
Journal:  Stat Methods Med Res       Date:  2007-06       Impact factor: 3.021

4.  Dealing with missing values in large-scale studies: microarray data imputation and beyond. (Review)

Authors:  Tero Aittokallio
Journal:  Brief Bioinform       Date:  2009-12-04       Impact factor: 11.622

5.  The Effect of Splitting on Random Forests.

Authors:  Hemant Ishwaran
Journal:  Mach Learn       Date:  2014-07-02       Impact factor: 2.940

6.  Missing value imputation in high-dimensional phenomic data: imputable or not, and how?

Authors:  Serena G Liao; Yan Lin; Dongwan D Kang; Divay Chandra; Jessica Bon; Naftali Kaminski; Frank C Sciurba; George C Tseng
Journal:  BMC Bioinformatics       Date:  2014-11-05       Impact factor: 3.169

7.  Comparison of imputation methods for missing laboratory data in medicine.

Authors:  Akbar K Waljee; Ashin Mukherjee; Amit G Singal; Yiwei Zhang; Jeffrey Warren; Ulysses Balis; Jorge Marrero; Ji Zhu; Peter Dr Higgins
Journal:  BMJ Open       Date:  2013-08-01       Impact factor: 2.692

8.  Multiple imputation of covariates by fully conditional specification: Accommodating the substantive model.

Authors:  Jonathan W Bartlett; Shaun R Seaman; Ian R White; James R Carpenter
Journal:  Stat Methods Med Res       Date:  2014-02-12       Impact factor: 3.021

9.  Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study.

Authors:  Anoop D Shah; Jonathan W Bartlett; James Carpenter; Owen Nicholas; Harry Hemingway
Journal:  Am J Epidemiol       Date:  2014-01-12       Impact factor: 4.897

  63 in total

1.  Application of Machine Learning in Intensive Care Unit (ICU) Settings Using MIMIC Dataset: Systematic Review.

Authors:  Mahanazuddin Syed; Shorabuddin Syed; Kevin Sexton; Hafsa Bareen Syeda; Maryam Garza; Meredith Zozus; Farhanuddin Syed; Salma Begum; Abdullah Usama Syed; Joseph Sanford; Fred Prior
Journal:  Informatics (MDPI)       Date:  2021-03-03

2.  Vascular biomarkers and digital ulcerations in systemic sclerosis: results from a randomized controlled trial of oral treprostinil (DISTOL-1).

Authors:  Christopher A Mecoli; Jamie Perin; Jennifer E Van Eyk; Jie Zhu; Qin Fu; Andrew G Allmon; Youlan Rao; Scott Zeger; Fredrick M Wigley; Laura K Hummers; Ami A Shah
Journal:  Clin Rheumatol       Date:  2019-12-19       Impact factor: 2.980

3.  Precision Surgical Therapy for Adenocarcinoma of the Esophagus and Esophagogastric Junction.

Authors:  Thomas W Rice; Min Lu; Hemant Ishwaran; Eugene H Blackstone
Journal:  J Thorac Oncol       Date:  2019-08-20       Impact factor: 15.609

4.  Variables of importance in the Scientific Registry of Transplant Recipients database predictive of heart transplant waitlist mortality.

Authors:  Eileen M Hsich; Lucy Thuita; Dennis M McNamara; Joseph G Rogers; Maryam Valapour; Lee R Goldberg; Clyde W Yancy; Eugene H Blackstone; Hemant Ishwaran
Journal:  Am J Transplant       Date:  2019-02-13       Impact factor: 8.086

5.  Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods.

Authors:  Min Lu; Saad Sadiq; Daniel J Feaster; Hemant Ishwaran
Journal:  J Comput Graph Stat       Date:  2018-02-01       Impact factor: 2.302

6.  Estimation of Mortality Risk in Type 2 Diabetic Patients (ENFORCE): An Inexpensive and Parsimonious Prediction Model.

Authors:  Massimiliano Copetti; Hetal Shah; Andrea Fontana; Maria Giovanna Scarale; Claudia Menzaghi; Salvatore De Cosmo; Monia Garofolo; Maria Rosaria Sorrentino; Olga Lamacchia; Giuseppe Penno; Alessandro Doria; Vincenzo Trischitta
Journal:  J Clin Endocrinol Metab       Date:  2019-10-01       Impact factor: 5.958

7.  Association of breastfeeding duration, birth weight, and current weight status with the risk of elevated blood pressure in preschoolers.

Authors:  Jiahong Sun; Lisha Wu; Yuanyuan Zhang; Chunan Li; Yake Wang; Wenhua Mei; Jianduan Zhang
Journal:  Eur J Clin Nutr       Date:  2020-03-20       Impact factor: 4.016

8.  Heart Transplantation: An In-Depth Survival Analysis.

Authors:  Eileen M Hsich; Eugene H Blackstone; Lucy W Thuita; Dennis M McNamara; Joseph G Rogers; Clyde W Yancy; Lee R Goldberg; Maryam Valapour; Gang Xu; Hemant Ishwaran
Journal:  JACC Heart Fail       Date:  2020-06-10       Impact factor: 12.035

9.  Value of Lymphadenectomy in Patients Receiving Neoadjuvant Therapy for Esophageal Adenocarcinoma.

Authors:  Siva Raja; Thomas W Rice; Sudish C Murthy; Usman Ahmad; Marie E Semple; Eugene H Blackstone; Hemant Ishwaran
Journal:  Ann Surg       Date:  2021-10-01       Impact factor: 12.969

10.  Small-scale population divergence is driven by local larval environment in a temperate amphibian.

Authors:  Patrik Rödin-Mörch; Hugo Palejowski; Maria Cortazar-Chinarro; Simon Kärvemo; Alex Richter-Boix; Jacob Höglund; Anssi Laurila
Journal:  Heredity (Edinb)       Date:  2020-09-21       Impact factor: 3.821
