A comparison of decision tree ensemble creation techniques.

Robert E Banfield, Lawrence O Hall, Kevin W Bowyer, W P Kegelmeyer.

Abstract

We experimentally evaluate bagging and seven other randomization-based approaches to creating an ensemble of decision tree classifiers. Statistical tests were performed on experimental results from 57 publicly available data sets. When cross-validation comparisons were tested for statistical significance, the best method was statistically more accurate than bagging on only eight of the 57 data sets. Alternatively, examining the average ranks of the algorithms across the group of data sets, we find that boosting, random forests, and randomized trees are statistically significantly better than bagging. Because our results suggest that using an appropriate ensemble size is important, we introduce an algorithm that decides when a sufficient number of classifiers has been created for an ensemble. Our algorithm uses the out-of-bag error estimate, and is shown to result in an accurate ensemble for those methods that incorporate bagging into the construction of the ensemble.
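The idea of growing a bagged ensemble until the out-of-bag error settles can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the abstract does not state the stopping criterion, so the rule used here (stop once the windowed mean of the out-of-bag error improves by less than `tol`) is an assumption, as are the names `fit_stump` and `bag_until_stable`, the parameters `window` and `tol`, the toy data, and the use of depth-1 trees as a stand-in for full decision trees.

```python
import random


def fit_stump(X, y):
    """Fit a depth-1 decision tree (a stump) by exhaustive search over splits."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left in (0, 1):  # class predicted when row[f] <= t
                errors = sum(
                    (left if row[f] <= t else 1 - left) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left)
    _, f, t, left = best
    return lambda row: left if row[f] <= t else 1 - left


def bag_until_stable(X, y, max_trees=100, window=10, tol=0.005, seed=0):
    """Add bagged stumps until the out-of-bag (OOB) error stops improving.

    Assumed stopping rule (not taken from the paper): compare the mean OOB
    error over the last `window` ensemble sizes with the mean over the
    `window` sizes before that, and stop when the improvement is below `tol`.
    """
    rng = random.Random(seed)
    n = len(X)
    votes = [[0, 0] for _ in range(n)]  # OOB votes per example, per class
    ensemble, history = [], []
    for _ in range(max_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        tree = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        ensemble.append(tree)
        # Each tree votes only on the examples it did not train on.
        for i in range(n):
            if i not in in_bag:
                votes[i][tree(X[i])] += 1
        # OOB error of the current ensemble, over examples with any vote.
        scored = [(v, label) for v, label in zip(votes, y) if sum(v) > 0]
        wrong = sum((v[1] > v[0]) != bool(label) for v, label in scored)
        history.append(wrong / max(len(scored), 1))
        if len(history) >= 2 * window:
            recent = sum(history[-window:]) / window
            previous = sum(history[-2 * window:-window]) / window
            if previous - recent < tol:
                break
    return ensemble, history


# Toy two-class problem (hypothetical data, for illustration only).
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(100)]
y = [int(a + b > 1.0) for a, b in X]

ensemble, history = bag_until_stable(X, y)
print("ensemble size:", len(ensemble))
print("final OOB error:", round(history[-1], 3))
```

Because every bootstrap sample leaves out roughly a third of the training examples, the out-of-bag estimate comes at no extra cost in held-out data, which is why this kind of check applies only to methods that build on bagging.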

Year: 2007    PMID: 17108393    DOI: 10.1109/tpami.2007.250609

Source DB: PubMed    Journal: IEEE Trans Pattern Anal Mach Intell    ISSN: 0162-8828    Impact factor: 6.226


  20 in total

1.  Building Diversified Multiple Trees for classification in high dimensional noisy biomedical data.

Authors:  Jiuyong Li; Lin Liu; Jixue Liu; Ryan Green
Journal:  Health Inf Sci Syst       Date:  2017-10-10

2.  Sequential feature selection and inference using multi-variate random forests.

Authors:  Joshua Mayer; Raziur Rahman; Souparno Ghosh; Ranadip Pal
Journal:  Bioinformatics       Date:  2018-04-15       Impact factor: 6.937

3.  Predicting Malignant Nodules from Screening CT Scans.

Authors:  Samuel Hawkins; Hua Wang; Ying Liu; Alberto Garcia; Olya Stringfield; Henry Krewer; Qian Li; Dmitry Cherezov; Robert A Gatenby; Yoganand Balagurunathan; Dmitry Goldgof; Matthew B Schabath; Lawrence Hall; Robert J Gillies
Journal:  J Thorac Oncol       Date:  2016-07-13       Impact factor: 15.609

4.  Genome Wide Association Study to predict severe asthma exacerbations in children using random forests classifiers.

Authors:  Mousheng Xu; Kelan G Tantisira; Ann Wu; Augusto A Litonjua; Jen-hwa Chu; Blanca E Himes; Amy Damask; Scott T Weiss
Journal:  BMC Med Genet       Date:  2011-06-30       Impact factor: 2.103

5.  Insights into the classification of small GTPases.

Authors:  Dominik Heider; Sascha Hauke; Martin Pyka; Daniel Kessler
Journal:  Adv Appl Bioinform Chem       Date:  2010-05-21

6.  Development and validation of a novel platform-independent metastasis signature in human breast cancer.

Authors:  Shuang G Zhao; Mark Shilkrut; Corey Speers; Meilan Liu; Kari Wilder-Romans; Theodore S Lawrence; Lori J Pierce; Felix Y Feng
Journal:  PLoS One       Date:  2015-05-14       Impact factor: 3.240

7.  A protocol for developing early warning score models from vital signs data in hospitals using ensembles of decision trees.

Authors:  Michael Xu; Benjamin Tam; Lehana Thabane; Alison Fox-Robichaud
Journal:  BMJ Open       Date:  2015-09-09       Impact factor: 2.692

8.  Detecting paroxysmal coughing from pertussis cases using voice recognition technology.

Authors:  Danny Parker; Joseph Picone; Amir Harati; Shuang Lu; Marion H Jenkyns; Philip M Polgreen
Journal:  PLoS One       Date:  2013-12-31       Impact factor: 3.240

9.  Phenotype prediction from genome-wide association studies: application to smoking behaviors.

Authors:  Dankyu Yoon; Young Jin Kim; Taesung Park
Journal:  BMC Syst Biol       Date:  2012-12-12

10.  Predicting Bevirimat resistance of HIV-1 from genotype.

Authors:  Dominik Heider; Jens Verheyen; Daniel Hoffmann
Journal:  BMC Bioinformatics       Date:  2010-01-20       Impact factor: 3.169
