Variable importance-weighted Random Forests.

Yiyi Liu, Hongyu Zhao.

Abstract

BACKGROUND: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates when the number of features increases. To address this limitation, feature elimination Random Forests was proposed, which uses only the features with the largest variable importance scores. Yet the performance of this method is not satisfactory, possibly due to its rigid feature selection and the increased correlations between the trees of the forest.
METHODS: We propose variable importance-weighted Random Forests, which, instead of sampling features with equal probability at each node to build up trees, samples features according to their variable importance scores and then selects the best split from the randomly selected features.
RESULTS: We evaluate the performance of our method through comprehensive simulation and real data analyses, for both regression and classification. Compared to the standard Random Forests and the feature elimination Random Forests methods, our proposed method has improved performance in most cases.
CONCLUSIONS: By incorporating the variable importance scores into the random feature selection step, our method can better utilize more informative features without completely ignoring less informative ones, and hence achieves improved prediction accuracy in the presence of weak signals and large noise. We have implemented an R package "viRandomForests" based on the original R package "randomForest"; it can be freely downloaded from http://zhaocenter.org/software.
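
The node-level step described under METHODS can be illustrated with a minimal Python sketch. It is not the viRandomForests implementation: the function names, the small weight floor that keeps every feature selectable, and the squared-error split criterion are all assumptions made for illustration.

```python
import random


def sample_features(importance, mtry, rng):
    """Draw up to mtry distinct feature indices, weighted by importance."""
    # Assumption: negative scores are clipped to zero and a tiny floor is
    # added so that weakly informative features are never fully excluded.
    weights = [max(w, 0.0) + 1e-12 for w in importance]
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(min(mtry, len(candidates))):
        # Weighted draw among the features not yet chosen.
        pos = rng.choices(
            range(len(candidates)),
            weights=[weights[j] for j in candidates],
            k=1,
        )[0]
        chosen.append(candidates.pop(pos))
    return chosen


def best_split(X, y, features):
    """Return (feature, threshold, sse) with the lowest summed squared error.

    Only the sampled features are searched. Squared error is an assumed
    regression criterion; a classifier would use an impurity measure.
    """
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for f in features:
        levels = sorted(set(row[f] for row in X))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(levels, levels[1:]):
            threshold = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[f] <= threshold]
            right = [yi for row, yi in zip(X, y) if row[f] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    X = [[1, 5, 0], [2, 3, 1], [3, 4, 0], [4, 1, 1]]  # toy data
    y = [0, 0, 1, 1]
    importance = [3.0, 1.0, 0.5]  # hypothetical importance scores
    features = sample_features(importance, mtry=2, rng=rng)
    print(features, best_split(X, y, features))
```

Setting all importance scores equal gives the uniform sampling of standard Random Forests. Giving near-zero weight to all but the top-ranked features approaches feature elimination.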

Keywords:  Random Forests; classification; regression; variable importance score

Year: 2017; PMID: 30034909; PMCID: PMC6051549

Source DB: PubMed; Journal: Quant Biol; ISSN: 2095-4689


