Stable gene selection from microarray data via sample weighting.

Lei Yu, Yue Han, Michael E Berens.

Abstract

Feature selection from gene expression microarray data is a widely used technique for selecting candidate genes in various cancer studies. Besides the predictive ability of the selected genes, an important aspect in evaluating a selection method is the stability of the selected genes. Experts instinctively have high confidence in the result of a selection method that selects similar sets of genes under some variation in the samples. However, a common problem of existing feature selection methods for gene expression data is that the genes selected by the same method often vary significantly with sample variations. In this work, we propose a general framework of sample weighting to improve the stability of feature selection methods under sample variations. The framework first weights each sample in a given training set according to its influence on the estimation of feature relevance, and then provides the weighted training set to a feature selection method. We also develop an efficient margin-based sample weighting algorithm under this framework. Experiments on a set of microarray data sets show that the proposed algorithm significantly improves the stability of representative feature selection algorithms such as SVM-RFE and ReliefF, without sacrificing their classification performance. Moreover, the proposed algorithm also leads to more stable gene signatures than the state-of-the-art ensemble method, particularly for small signature sizes.
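
The two-step idea in the abstract (weight the samples, then pass the weighted training set to a selector) can be illustrated with a minimal sketch. This is not the authors' margin-based algorithm: the sample weights `w` are simply taken as input, the relevance score (absolute difference of weighted class means) is a stand-in for a real selector such as SVM-RFE or ReliefF, and stability is measured here as the average pairwise Jaccard similarity of selected gene sets, which is one common choice and an assumption on our part. All function names are hypothetical.

```python
from itertools import combinations


def weighted_relevance(X, y, w):
    """Score each feature by the absolute difference of weighted class means.

    X: list of samples (each a list of expression values)
    y: binary labels (0 or 1), one per sample
    w: non-negative sample weights; how they are derived is left open here
    Assumes each class has a positive total weight.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        sums = {0: 0.0, 1: 0.0}
        totals = {0: 0.0, 1: 0.0}
        for xi, yi, wi in zip(X, y, w):
            sums[yi] += wi * xi[j]
            totals[yi] += wi
        mean0 = sums[0] / totals[0]
        mean1 = sums[1] / totals[1]
        scores.append(abs(mean1 - mean0))
    return scores


def top_k(scores, k):
    """Return the indices of the k highest-scoring features as a set."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two non-empty feature sets."""
    return len(a & b) / len(a | b)


def stability(subsets):
    """Average pairwise Jaccard similarity over two or more selected sets."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

In use, one would draw several subsamples of the training set, compute weights and a top-k gene set on each, and pass the resulting sets to `stability`; values closer to 1 indicate that the selector returns similar signatures under sample variation.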

Year:  2011        PMID: 21383420     DOI: 10.1109/TCBB.2011.47

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  7 in total

1.  Algebraic comparison of partial lists in bioinformatics.

Authors:  Giuseppe Jurman; Samantha Riccadonna; Roberto Visintainer; Cesare Furlanello
Journal:  PLoS One       Date:  2012-05-17       Impact factor: 3.240

2.  An experimental study of the intrinsic stability of random forest variable importance measures.

Authors:  Huazhen Wang; Fan Yang; Zhiyuan Luo
Journal:  BMC Bioinformatics       Date:  2016-02-03       Impact factor: 3.169

3.  iRDA: a new filter towards predictive, stable, and enriched candidate genes.

Authors:  Hung-Ming Lai; Andreas A Albrecht; Kathleen K Steinhöfel
Journal:  BMC Genomics       Date:  2015-12-09       Impact factor: 3.969

4.  Classifying Incomplete Gene-Expression Data: Ensemble Learning with Non-Pre-Imputation Feature Filtering and Best-First Search Technique.

Authors:  Yuanting Yan; Tao Dai; Meili Yang; Xiuquan Du; Yiwen Zhang; Yanping Zhang
Journal:  Int J Mol Sci       Date:  2018-10-30       Impact factor: 5.923

5.  An Integrated Approach for Identifying Molecular Subtypes in Human Colon Cancer Using Gene Expression Data.

Authors:  Wen-Hui Wang; Ting-Yan Xie; Guang-Lei Xie; Zhong-Lu Ren; Jin-Ming Li
Journal:  Genes (Basel)       Date:  2018-08-02       Impact factor: 4.096

6.  An Occlusion-Robust Feature Selection Framework in Pedestrian Detection.

Authors:  Zhixin Guo; Wenzhi Liao; Yifan Xiao; Peter Veelaert; Wilfried Philips
Journal:  Sensors (Basel)       Date:  2018-07-13       Impact factor: 3.576

7.  An Immune-Gene-Based Classifier Predicts Prognosis in Patients With Cervical Squamous Cell Carcinoma.

Authors:  Huixia Yang; Xiaoyan Han; Zengping Hao
Journal:  Front Mol Biosci       Date:  2021-07-05