
Tree-Weighting for Multi-Study Ensemble Learners.

Maya Ramchandran, Prasad Patil, Giovanni Parmigiani.

Abstract

Multi-study learning uses multiple training studies, separately trains classifiers on each, and forms an ensemble with weights rewarding members with better cross-study prediction ability. This article considers novel weighting approaches for constructing tree-based ensemble learners in this setting. Using Random Forests as a single-study learner, we compare forming the ensemble by weighting each forest as a whole against extracting the individual trees trained by each Random Forest and weighting them directly. We find that incorporating multiple layers of ensembling in the training process by weighting trees increases the robustness of the resulting predictor. Furthermore, we explore how ensembling weights correspond to tree structure, to shed light on the features that determine whether weighting trees directly is advantageous. Finally, we apply our approach to genomic datasets and show that weighting trees improves upon the basic multi-study learning paradigm. Code and supplementary material are available at https://github.com/m-ramchandran/tree-weighting.
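The core idea from the abstract can be sketched in code: train one Random Forest per study, pull out each forest's individual trees, and weight every tree by its cross-study prediction accuracy. This is a minimal illustrative sketch, not the authors' exact method; the synthetic-data generator, the accuracy-based weighting, and the weighted majority vote are all assumptions made for illustration.

```python
# Sketch of multi-study tree weighting: one Random Forest per study,
# individual trees weighted by accuracy on the *other* studies.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def make_study(n=200, shift=0.0):
    # Synthetic "study": two features, label from their sum, with a
    # study-specific mean shift to mimic between-study heterogeneity.
    X = rng.normal(shift, 1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

studies = [make_study(shift=s) for s in (0.0, 0.5, 1.0)]

# Train a forest on each study, then collect and weight its trees.
trees, weights = [], []
for i, (X_tr, y_tr) in enumerate(studies):
    rf = RandomForestClassifier(n_estimators=25, random_state=i).fit(X_tr, y_tr)
    for tree in rf.estimators_:
        # Cross-study weight: mean accuracy on all held-out studies.
        accs = [tree.score(X_te, y_te)
                for j, (X_te, y_te) in enumerate(studies) if j != i]
        trees.append(tree)
        weights.append(np.mean(accs))

weights = np.array(weights)
weights /= weights.sum()

def predict(X):
    # Weighted majority vote over all trees from all studies.
    votes = np.stack([t.predict(X) for t in trees])  # (n_trees, n_samples)
    return (weights @ votes > 0.5).astype(int)

X_new, y_new = make_study(shift=0.25)
acc = (predict(X_new) == y_new).mean()
print("accuracy on new study:", acc)
```

Weighting trees rather than whole forests, as the abstract notes, lets a single high-performing tree from an otherwise weak study contribute more to the ensemble than forest-level weighting would allow.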


Year:  2020        PMID: 31797618      PMCID: PMC6980320     

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  9 in total

1.  A Weighted Random Forests Approach to Improve Predictive Performance.

Authors:  Stacey J Winham; Robert R Freimuth; Joanna M Biernacka
Journal:  Stat Anal Data Min       Date:  2013-12-01       Impact factor: 1.051

2.  Training replicable predictors in multiple studies.

Authors:  Prasad Patil; Giovanni Parmigiani
Journal:  Proc Natl Acad Sci U S A       Date:  2018-03-12       Impact factor: 11.205

3.  Regularization Paths for Generalized Linear Models via Coordinate Descent.

Authors:  Jerome Friedman; Trevor Hastie; Rob Tibshirani
Journal:  J Stat Softw       Date:  2010       Impact factor: 6.440

4.  The impact of different sources of heterogeneity on loss of accuracy from genomic prediction models.

Authors:  Yuqing Zhang; Christoph Bernau; Giovanni Parmigiani; Levi Waldron
Journal:  Biostatistics       Date:  2020-04-01       Impact factor: 5.899

5.  Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction.

Authors:  Raziur Rahman; Saad Haider; Souparno Ghosh; Ranadip Pal
Journal:  Cancer Inform       Date:  2016-03-31

6.  CoINcIDE: A framework for discovery of patient subtypes across multiple datasets.

Authors:  Catherine R Planey; Olivier Gevaert
Journal:  Genome Med       Date:  2016-03-09       Impact factor: 11.117

7.  Iterative random forests to discover predictive and stable high-order interactions.

Authors:  Sumanta Basu; Karl Kumbier; James B Brown; Bin Yu
Journal:  Proc Natl Acad Sci U S A       Date:  2018-01-19       Impact factor: 11.205

8.  curatedOvarianData: clinically annotated data for the ovarian cancer transcriptome.

Authors:  Benjamin Frederick Ganzfried; Markus Riester; Benjamin Haibe-Kains; Thomas Risch; Svitlana Tyekucheva; Ina Jazic; Xin Victoria Wang; Mahnaz Ahmadifar; Michael J Birrer; Giovanni Parmigiani; Curtis Huttenhower; Levi Waldron
Journal:  Database (Oxford)       Date:  2013-04-02       Impact factor: 3.451

9.  Cross-study validation for the assessment of prediction algorithms.

Authors:  Christoph Bernau; Markus Riester; Anne-Laure Boulesteix; Giovanni Parmigiani; Curtis Huttenhower; Levi Waldron; Lorenzo Trippa
Journal:  Bioinformatics       Date:  2014-06-15       Impact factor: 6.937

