Strong feature sets from small samples.

Seungchan Kim, Edward R Dougherty, Junior Barrera, Yidong Chen, Michael L Bittner, Jeffrey M Trent.

Abstract

For small samples, classifier design algorithms typically suffer from overfitting. Given a set of features, a classifier must be designed and its error estimated. For small samples, an error estimator may be unbiased yet, owing to its large variance, often gives very optimistic estimates. This paper proposes mitigating the small-sample problem by designing classifiers from a probability distribution obtained by spreading the mass of the sample points, which makes classification more difficult while maintaining the sample geometry. The algorithm is parameterized by the variance of the spreading distribution. By increasing the spread, the algorithm finds gene sets whose classification accuracy remains strong as the sample is spread more widely. The error gives a measure of the strength of the feature set as a function of the spread. The algorithm yields feature sets that can distinguish the two classes, not only for the sample data, but for distributions spread beyond the sample data. For linear classifiers, the topic of the present paper, the classifiers are derived analytically from the model, thereby providing enormous savings in computation time. The algorithm is applied to cancer classification via cDNA microarrays. In particular, the genes BRCA1 and BRCA2 are associated with a hereditary disposition to breast cancer, and the algorithm is used to find gene sets whose expressions can be used to classify BRCA1 and BRCA2 tumors.
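The abstract does not give the analytic derivation, so the following is only an illustrative sketch of the idea, not the authors' algorithm. It assumes an isotropic Gaussian spreading distribution with standard deviation `sigma` centered on each sample point, equal class priors, and an LDA-style linear rule; the function names `design_spread_lda` and `spread_error` and the toy data are hypothetical. Under these assumptions the spread distribution of each class is a Gaussian mixture whose covariance is the sample scatter plus `sigma**2 * I`, and the error of a linear classifier on that mixture has a closed form, so no Monte Carlo sampling is needed.

```python
import math
import numpy as np


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def design_spread_lda(x0, x1, sigma):
    """LDA-style linear rule w.x + b designed on the spread distribution.

    Assumption (not from the source): isotropic Gaussian spreading, so each
    class covariance is its sample scatter plus sigma^2 * I.
    """
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    d = x0.shape[1]
    s0 = np.atleast_2d(np.cov(x0, rowvar=False, bias=True))
    s1 = np.atleast_2d(np.cov(x1, rowvar=False, bias=True))
    cov = 0.5 * (s0 + s1) + sigma ** 2 * np.eye(d)
    w = np.linalg.solve(cov, m1 - m0)
    b = -0.5 * float(w @ (m0 + m1))
    return w, b


def spread_error(x0, x1, w, b, sigma):
    """Exact error of the rule on the spread distribution (requires sigma > 0).

    A point x spread by N(x, sigma^2 I) projects to a 1-D Gaussian with mean
    w.x + b and standard deviation sigma * ||w||. Class 0 is labeled by a
    negative score, class 1 by a positive score; priors are assumed equal.
    """
    scale = sigma * np.linalg.norm(w)
    e0 = np.mean([normal_cdf((w @ x + b) / scale) for x in x0])
    e1 = np.mean([normal_cdf(-(w @ x + b) / scale) for x in x1])
    return 0.5 * (e0 + e1)


# Made-up two-feature toy data standing in for expression levels of a gene pair.
toy0 = np.array([[-2.0, 0.0], [-2.0, 1.0], [-3.0, 0.0]])
toy1 = np.array([[2.0, 0.0], [2.0, 1.0], [3.0, 0.0]])
for sigma in (0.5, 1.0, 2.0):
    w, b = design_spread_lda(toy0, toy1, sigma)
    print(sigma, spread_error(toy0, toy1, w, b, sigma))
```

Evaluating `spread_error` over a grid of `sigma` values for each candidate feature set gives an error-versus-spread curve; a feature set whose error stays low as `sigma` grows would be considered strong in the sense described above.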

Year: 2002    PMID: 11911798    DOI: 10.1089/10665270252833226

Source DB: PubMed    Journal: J Comput Biol    ISSN: 1066-5277    Impact factor: 1.479

