
Subsemble: an ensemble method for combining subset-specific algorithm fits.

Stephanie Sapp, Mark J van der Laan, John Canny.

Abstract

Ensemble methods using the same underlying algorithm trained on different subsets of observations have recently received increased attention as practical prediction tools for massive datasets. We propose Subsemble: a general subset ensemble prediction method, which can be used for small, moderate, or large datasets. Subsemble partitions the full dataset into subsets of observations, fits a specified underlying algorithm on each subset, and uses a clever form of V-fold cross-validation to output a prediction function that combines the subset-specific fits. We give an oracle result that provides a theoretical performance guarantee for Subsemble. Through simulations, we demonstrate that Subsemble can be a beneficial tool for small to moderate-sized datasets, and often has better prediction performance than the underlying algorithm fit just once on the full dataset. We also describe how to include Subsemble as a candidate in a SuperLearner library, providing a practical way to evaluate the performance of Subsemble relative to the underlying algorithm fit just once on the full dataset.
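
The procedure summarized in the abstract (partition, fit per subset, combine via V-fold cross-validation) can be illustrated with a minimal sketch. The abstract does not spell out the cross-validation scheme, so everything specific here is an assumption made for illustration: the underlying algorithm is a one-dimensional least-squares line (`fit_line`), each subset is split into V folds and each fit is trained on its own subset minus one fold, and the combination step is a no-intercept least-squares regression of the outcome on the cross-validated subset-specific predictions with a tiny ridge term for numerical stability. The names `fit_line`, `solve`, `subsemble`, `J`, `V`, and `ridge` are hypothetical and do not come from the paper or from any package.

```python
import random

def fit_line(xs, ys):
    """Underlying algorithm (assumed): 1-D least-squares line. Returns a predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def solve(a, b):
    """Solve the small linear system a w = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def subsemble(xs, ys, J=3, V=5, ridge=1e-8, seed=0):
    """Sketch of a subset ensemble; the CV and combination details are assumptions."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    subsets = [idx[j::J] for j in range(J)]                  # partition into J subsets
    folds = [[s[v::V] for v in range(V)] for s in subsets]   # V folds inside each subset

    # Cross-validated predictions: for fold v, fit on each subset with its
    # fold v held out, then predict every held-out observation with all J fits.
    Z, Y = [], []
    for v in range(V):
        fits = []
        for j in range(J):
            train = [i for u in range(V) if u != v for i in folds[j][u]]
            fits.append(fit_line([xs[i] for i in train], [ys[i] for i in train]))
        for j in range(J):
            for i in folds[j][v]:
                Z.append([f(xs[i]) for f in fits])
                Y.append(ys[i])

    # Combination step (assumed): least squares of Y on the J columns of Z.
    gram = [[sum(z[p] * z[q] for z in Z) + (ridge if p == q else 0.0)
             for q in range(J)] for p in range(J)]
    rhs = [sum(z[p] * y for z, y in zip(Z, Y)) for p in range(J)]
    weights = solve(gram, rhs)

    # Final fits use each full subset; the learned weights combine them.
    final = [fit_line([xs[i] for i in s], [ys[i] for i in s]) for s in subsets]
    return lambda x: sum(w * f(x) for w, f in zip(weights, final))

# Usage on simulated data: y = 2x + 1 plus unit-variance Gaussian noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 10) for _ in range(300)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 1) for x in xs]
predict = subsemble(xs, ys)
mse = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Because the combination weights are learned only from predictions on held-out observations, a subset-specific fit is never scored on data it was trained on. Any other underlying or combining algorithm could be swapped in for `fit_line` and the least-squares step.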

Keywords:  big data; cross-validation; ensemble methods; machine learning; prediction

Year:  2014        PMID: 24778462      PMCID: PMC4000126          DOI: 10.1080/02664763.2013.864263

Source DB:  PubMed          Journal:  J Appl Stat        ISSN: 0266-4763            Impact factor:   1.404


  1 in total

1.  Super learner.

Authors:  Mark J van der Laan; Eric C Polley; Alan E Hubbard
Journal:  Stat Appl Genet Mol Biol       Date:  2007-09-16
  2 in total

1.  Composition or Context: Using Transportability to Understand Drivers of Site Differences in a Large-scale Housing Experiment.

Authors:  Kara E Rudolph; Nicole M Schmidt; M Maria Glymour; Rebecca Crowder; Jessica Galin; Jennifer Ahern; Theresa L Osypuk
Journal:  Epidemiology       Date:  2018-03       Impact factor: 4.822

2.  Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability.

Authors:  Ramin Dehghanpoor; Evan Ricks; Katie Hursh; Sarah Gunderson; Roshanak Farhoodi; Nurit Haspel; Brian Hutchinson; Filip Jagodzinski
Journal:  Molecules       Date:  2018-01-27       Impact factor: 4.411

