Is bagging effective in the classification of small-sample genomic and proteomic data?

T T Vu, U M Braga-Neto.

Abstract

There has been considerable recent interest in applying bagging to the classification of both gene-expression data and protein-abundance mass spectrometry data. The approach is often justified by the improvement it produces in the performance of unstable, overfitting classification rules in small-sample settings. However, the question of real practical interest is whether the ensemble scheme improves the performance of those classifiers enough to beat single stable, nonoverfitting classifiers on small-sample genomic and proteomic data sets. To investigate that question, we conducted a detailed empirical study using publicly available data sets from published genomic and proteomic studies. We observed that, under t-test and RELIEF filter-based feature selection, bagging generally improves the performance of unstable, overfitting classifiers, such as CART decision trees and neural networks, but that the improvement was not sufficient to beat the performance of single stable, nonoverfitting classifiers, such as diagonal and plain linear discriminant analysis, or 3-nearest neighbors. Furthermore, as expected, the ensemble method did not significantly improve the performance of these stable classifiers. Representative experimental results are presented and discussed in this work.
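
For readers unfamiliar with the ensemble scheme discussed in the abstract, the following is a minimal sketch of bagging (bootstrap resampling followed by majority voting). It is not the authors' code. The base rule here is a one-feature decision stump, used only as a small stand-in for an unstable tree-like classifier such as CART. The names `fit_stump` and `fit_bagged`, the ensemble size of 25, and the seed are all illustrative assumptions. Feature selection and error estimation, which the study relies on, are omitted.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-feature threshold rule (a depth-1 tree) by exhaustive search.

    Illustrative stand-in for an unstable base classifier; not CART itself.
    """
    best = None  # (training errors, feature index, threshold, left label, right label)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls to the right, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(v != left_label for v in left) + sum(v != right_label for v in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    _, j, t, left_label, right_label = best
    return lambda row: left_label if row[j] <= t else right_label


def fit_bagged(X, y, fit_base, n_estimators=25, seed=0):
    """Train n_estimators base classifiers on bootstrap samples; predict by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    members = []
    for _ in range(n_estimators):
        # Bootstrap sample: draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit_base([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        votes = Counter(member(row) for member in members)
        return votes.most_common(1)[0][0]

    return predict


if __name__ == "__main__":
    # Toy, made-up data: one feature, two well-separated classes.
    X = [[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]]
    y = [0, 0, 0, 1, 1, 1]
    single = fit_stump(X, y)
    bagged = fit_bagged(X, y, fit_stump)
    print(single([0.05]), bagged([0.05]))
```

In a comparison of the kind the abstract describes, the same held-out error estimate would be computed for the single base rule, its bagged version, and a stable rule such as linear discriminant analysis. That step is left out here to keep the sketch short.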

Year:  2009        PMID: 19390645      PMCID: PMC3171418          DOI: 10.1155/2009/158368

Source DB:  PubMed          Journal:  EURASIP J Bioinform Syst Biol        ISSN: 1687-4145


  2 in total

1.  To aggregate or not to aggregate high-dimensional classifiers.

Authors:  Cheng-Jian Xu; Huub C J Hoefsloot; Age K Smilde
Journal:  BMC Bioinformatics       Date:  2011-05-13       Impact factor: 3.169

2.  Unbiased bootstrap error estimation for linear discriminant analysis.

Authors:  Thang Vu; Chao Sima; Ulisses M Braga-Neto; Edward R Dougherty
Journal:  EURASIP J Bioinform Syst Biol       Date:  2014-10-03