
Which is better: holdout or full-sample classifier design?

Marcel Brun, Qian Xu, Edward R Dougherty

Abstract

Is it better to design a classifier and estimate its error on the full sample or to design a classifier on a training subset and estimate its error on the holdout test subset? Full-sample design provides the better classifier; nevertheless, one might choose holdout with the hope of better error estimation. A conservative criterion to decide the best course is to aim at a classifier whose error is less than a given bound. Then the choice between full-sample and holdout designs depends on which possesses the smaller expected bound. Using this criterion, we examine the choice between holdout and several full-sample error estimators using covariance models and a patient-data model. Full-sample design consistently outperforms holdout design. The relation between the two designs is revealed via a decomposition of the expected bound into the sum of the expected true error and the expected conditional standard deviation of the true error.
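The comparison posed in the abstract can be illustrated with a small simulation. The sketch below is not the authors' experimental procedure: it assumes a two-class Gaussian model with identity covariance, a nearest-mean linear classifier, resubstitution as the full-sample error estimator, and an even train/test split for holdout. All function names and parameter values (`simulate`, `n_per_class`, `delta`, and so on) are hypothetical choices made for illustration. It reports the average true error of each design and the root-mean-square deviation of each estimator from the true error, not the expected bound studied in the paper.

```python
import math
import numpy as np


def phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def sample(rng, n_per_class, mean):
    """Draw n_per_class points from N(-mean, I) and from N(+mean, I)."""
    x0 = rng.normal(size=(n_per_class, mean.size)) - mean
    x1 = rng.normal(size=(n_per_class, mean.size)) + mean
    return x0, x1


def design(x0, x1):
    """Nearest-mean classifier: predict class 1 when w @ x + b > 0."""
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    w = m1 - m0
    b = -0.5 * w @ (m0 + m1)
    return w, b


def true_error(w, b, mean):
    """Exact error of a linear classifier under the assumed Gaussian model."""
    norm = np.linalg.norm(w)
    e0 = phi((w @ (-mean) + b) / norm)  # class 0 labelled as class 1
    e1 = phi(-(w @ mean + b) / norm)    # class 1 labelled as class 0
    return 0.5 * (e0 + e1)


def empirical_error(w, b, x0, x1):
    """Fraction of the given points that the classifier mislabels."""
    wrong = np.sum(x0 @ w + b > 0) + np.sum(x1 @ w + b <= 0)
    return wrong / (len(x0) + len(x1))


def simulate(n_per_class=20, dim=5, delta=0.5, trials=2000, seed=0):
    rng = np.random.default_rng(seed)
    mean = np.full(dim, delta)
    half = n_per_class // 2
    rows = []
    for _ in range(trials):
        x0, x1 = sample(rng, n_per_class, mean)
        # Full-sample design: train on everything, estimate by resubstitution.
        w, b = design(x0, x1)
        full_true = true_error(w, b, mean)
        full_est = empirical_error(w, b, x0, x1)
        # Holdout design: train on the first half, estimate on the second half.
        wh, bh = design(x0[:half], x1[:half])
        hold_true = true_error(wh, bh, mean)
        hold_est = empirical_error(wh, bh, x0[half:], x1[half:])
        rows.append((full_true, full_est, hold_true, hold_est))
    r = np.array(rows)
    return {
        "full_true": r[:, 0].mean(),
        "full_est": r[:, 1].mean(),
        "hold_true": r[:, 2].mean(),
        "hold_est": r[:, 3].mean(),
        "full_rms": np.sqrt(np.mean((r[:, 1] - r[:, 0]) ** 2)),
        "hold_rms": np.sqrt(np.mean((r[:, 3] - r[:, 2]) ** 2)),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

Comparing `full_true` with `hold_true` shows the design cost of withholding data, while `full_rms` and `hold_rms` show how far each estimate tends to sit from the true error under these assumptions. Other estimators or data models would give different numbers.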


Year:  2008        PMID: 18483613      PMCID: PMC3171393          DOI: 10.1155/2008/297945

Source DB:  PubMed          Journal:  EURASIP J Bioinform Syst Biol        ISSN: 1687-4145


  7 in total

1.  Is cross-validation valid for small-sample microarray classification?

Authors:  Ulisses M Braga-Neto; Edward R Dougherty
Journal:  Bioinformatics       Date:  2004-02-12       Impact factor: 6.937

2.  Prediction error estimation: a comparison of resampling methods.

Authors:  Annette M Molinaro; Richard Simon; Ruth M Pfeiffer
Journal:  Bioinformatics       Date:  2005-05-19       Impact factor: 6.937

3.  Confidence intervals for the true classification error conditioned on the estimated error.

Authors:  Qian Xu; Jianping Hua; Ulisses Braga-Neto; Zixiang Xiong; Edward Suh; Edward R Dougherty
Journal:  Technol Cancer Res Treat       Date:  2006-12

4.  Genetic test bed for feature selection.

Authors:  Ashish Choudhary; Marcel Brun; Jianping Hua; James Lowey; Ed Suh; Edward R Dougherty
Journal:  Bioinformatics       Date:  2006-01-20       Impact factor: 6.937

5.  Quantification of the impact of feature selection on the variance of cross-validation error estimation.

Authors:  Yufei Xiao; Jianping Hua; Edward R Dougherty
Journal:  EURASIP J Bioinform Syst Biol       Date:  2007

6.  Gene expression profiling predicts clinical outcome of breast cancer.

Authors:  Laura J van 't Veer; Hongyue Dai; Marc J van de Vijver; Yudong D He; Augustinus A M Hart; Mao Mao; Hans L Peterse; Karin van der Kooy; Matthew J Marton; Anke T Witteveen; George J Schreiber; Ron M Kerkhoven; Chris Roberts; Peter S Linsley; René Bernards; Stephen H Friend
Journal:  Nature       Date:  2002-01-31       Impact factor: 49.962

7.  A gene-expression signature as a predictor of survival in breast cancer.

Authors:  Marc J van de Vijver; Yudong D He; Laura J van 't Veer; Hongyue Dai; Augustinus A M Hart; Dorien W Voskuil; George J Schreiber; Johannes L Peterse; Chris Roberts; Matthew J Marton; Mark Parrish; Douwe Atsma; Anke Witteveen; Annuska Glas; Leonie Delahaye; Tony van der Velde; Harry Bartelink; Sjoerd Rodenhuis; Emiel T Rutgers; Stephen H Friend; René Bernards
Journal:  N Engl J Med       Date:  2002-12-19       Impact factor: 91.245

  1 in total

1.  Evaluating and Improving Automatic Sleep Spindle Detection by Using Multi-Objective Evolutionary Algorithms.

Authors:  Min-Yin Liu; Adam Huang; Norden E Huang
Journal:  Front Hum Neurosci       Date:  2017-05-18       Impact factor: 3.169

