
High-dimensional bolstered error estimation.

Chao Sima, Ulisses M Braga-Neto, Edward R Dougherty.

Abstract

MOTIVATION: In small-sample settings, bolstered error estimation has been shown to perform better than cross-validation and competitively with bootstrap with regard to various criteria. The key issue for bolstering performance is the variance setting for the bolstering kernel. Heretofore, this variance has been determined in a non-parametric manner from the data. Although bolstering based on this variance setting works well for small feature sets, results can deteriorate for high-dimensional feature spaces.
RESULTS: This article computes an optimal kernel variance that depends on the classification rule, the sample size, the model and the feature space (both the original number of features and the number remaining after feature selection). A key point is that the optimal variance is robust relative to the model. This allows us to develop a method for selecting a suitable variance to use in real-world applications where the model is not known, but the other factors that determine the optimal kernel are known.
AVAILABILITY: Companion website at http://compbio.tgen.org/paper_supp/high_dim_bolstering.
CONTACT: edward@mail.ece.tamu.edu.
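
To make the role of the kernel variance concrete, the following is a minimal sketch of bolstered resubstitution for a linear classifier with a spherical Gaussian kernel. It is not the variance-selection method of the article: the function names, the toy data and the values of `sigma` are assumptions chosen for illustration only. For a classifier sign(w·x + b) and a kernel N(x_i, sigma^2 I) centered at each training point, the kernel mass falling on the wrong side of the decision boundary has the closed form Phi(-y_i (w·x_i + b) / (sigma ||w||)), and the bolstered estimate is the average of these masses.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bolstered_resub_linear(points, labels, w, b, sigma):
    """Bolstered resubstitution error for the classifier sign(w.x + b).

    Assumptions (illustrative, not taken from the article):
    labels are in {-1, +1} and every point uses the same spherical
    Gaussian kernel with standard deviation sigma.
    """
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    total = 0.0
    for x, y in zip(points, labels):
        # Signed margin: positive when the point is correctly classified.
        margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        # Kernel mass on the wrong side of the decision hyperplane.
        total += normal_cdf(-margin / (sigma * norm_w))
    return total / len(points)


# Hypothetical toy data: four 2-D points, one of them misclassified.
points = [(2.0, 0.0), (1.0, 1.0), (-1.5, 0.5), (0.2, -0.3)]
labels = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0

for sigma in (0.01, 0.5, 2.0):
    print(sigma, bolstered_resub_linear(points, labels, w, b, sigma))
```

As `sigma` shrinks toward zero the estimate approaches the plain resubstitution error, and as `sigma` grows it drifts toward 0.5, which illustrates why the variance setting is the key issue for bolstering performance.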

Year:  2011        PMID: 21914630      PMCID: PMC3198579          DOI: 10.1093/bioinformatics/btr518

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


