Development of two-stage SVM-RFE gene selection strategy for microarray expression data analysis.

Yuchun Tang, Yan-Qing Zhang, Zhen Huang.

Abstract

Extracting a subset of informative genes from microarray expression data is a critical data preparation step in cancer classification and other biological function analyses. Many algorithms have been developed for this task, and the Support Vector Machine - Recursive Feature Elimination (SVM-RFE) algorithm is one of the best gene feature selection algorithms. SVM-RFE is commonly applied under the assumption that a smaller "filter-out" factor, which eliminates fewer gene features in each recursion, should lead to extraction of a better gene subset. Our simulations have shown that the SVM-RFE is highly sensitive to the "filter-out" factor, so this assumption is not always correct and the SVM-RFE is an unstable algorithm. To select a set of key gene features for reliable prediction of cancer types or subtypes and other applications, a new two-stage SVM-RFE algorithm has been developed. The first stage is designed to eliminate most of the irrelevant, redundant and noisy genes while keeping information loss small. The second stage then performs a fine selection of the final gene subset. The two-stage SVM-RFE overcomes the instability problem of the SVM-RFE to achieve better algorithm utility. Based on our analysis of three publicly available microarray expression datasets, we have demonstrated that the two-stage SVM-RFE is significantly more accurate and more reliable than the SVM-RFE and three correlation-based methods. Furthermore, the two-stage SVM-RFE is computationally efficient because its time complexity is O(d log_2 d), where d is the size of the original gene set.
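
The role of the "filter-out" factor can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names rfe_schedule and recursive_elimination, the parameters filter_out and target, and the toy scoring function are all hypothetical. In SVM-RFE proper, the scores would come from a linear SVM retrained on the surviving genes in every round (typically the squared weights); here a stand-in callback is used so the example stays self-contained.

```python
from typing import Callable, List, Sequence


def rfe_schedule(d: int, filter_out: float, target: int = 1) -> List[int]:
    """Gene-set sizes after each elimination round (hypothetical helper).

    filter_out is the fraction of surviving genes dropped per round.
    At least one gene is dropped each round so the loop terminates.
    """
    if not 0.0 < filter_out < 1.0:
        raise ValueError("filter_out must be in (0, 1)")
    sizes = [d]
    while sizes[-1] > target:
        current = sizes[-1]
        drop = max(1, int(current * filter_out))
        sizes.append(max(target, current - drop))
    return sizes


def recursive_elimination(
    genes: Sequence[str],
    score_fn: Callable[[List[str]], List[float]],
    filter_out: float,
    target: int,
) -> List[str]:
    """Generic recursive feature elimination loop.

    score_fn is an assumption standing in for the SVM: it receives the
    surviving genes and returns one importance score per gene.
    """
    surviving = list(genes)
    while len(surviving) > target:
        scores = score_fn(surviving)  # re-scored every round, as in RFE
        ranked = sorted(zip(scores, surviving), key=lambda p: p[0], reverse=True)
        drop = max(1, int(len(surviving) * filter_out))
        keep = max(target, len(surviving) - drop)
        surviving = [gene for _, gene in ranked[:keep]]
    return surviving


# Toy usage: the score of gene "gK" is simply K (purely illustrative).
toy_genes = [f"g{i}" for i in range(8)]
toy_score = lambda subset: [float(g[1:]) for g in subset]
selected = recursive_elimination(toy_genes, toy_score, filter_out=0.5, target=2)
halving_rounds = len(rfe_schedule(1024, 0.5)) - 1
```

With filter_out=0.5 the schedule halves the gene set each round, so 1024 genes need 10 rounds, which shows how a logarithmic number of retraining rounds can arise. A smaller factor such as 0.1 removes fewer genes per round and therefore needs more rounds; the abstract's point is that this extra effort does not reliably produce a better subset.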

Year:  2007        PMID: 17666757     DOI: 10.1109/TCBB.2007.70224

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  19 in total

1.  Non-small-cell lung cancer pathological subtype-related gene selection and bioinformatics analysis based on gene expression profiles.

Authors:  Jiangpeng Chen; Xiaoqi Dong; Xun Lei; Yinyin Xia; Qing Zeng; Ping Que; Xiaoyan Wen; Shan Hu; Bin Peng
Journal:  Mol Clin Oncol       Date:  2017-11-27

2.  Gene selection using iterative feature elimination random forests for survival outcomes.

Authors:  Herbert Pang; Stephen L George; Ken Hui; Tiejun Tong
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2012 Sep-Oct       Impact factor: 3.710

3.  Classification of dengue fever patients based on gene expression data using support vector machines.

Authors:  Ana Lisa V Gomes; Lawrence J K Wee; Asif M Khan; Laura H V G Gil; Ernesto T A Marques; Carlos E Calzavara-Silva; Tin Wee Tan
Journal:  PLoS One       Date:  2010-06-23       Impact factor: 3.240

4.  A novel sparse coding algorithm for classification of tumors based on gene expression data.

Authors:  Morteza Kolali Khormuji; Mehrnoosh Bazrafkan
Journal:  Med Biol Eng Comput       Date:  2015-09-04       Impact factor: 2.602

5.  Improving accuracy for cancer classification with a new algorithm for genes selection.

Authors:  Hongyan Zhang; Haiyan Wang; Zhijun Dai; Ming-shun Chen; Zheming Yuan
Journal:  BMC Bioinformatics       Date:  2012-11-13       Impact factor: 3.169

6.  A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data. (Review)

Authors:  Zena M Hira; Duncan F Gillies
Journal:  Adv Bioinformatics       Date:  2015-06-11

7.  Regions of interest computed by SVM wrapped method for Alzheimer's disease examination from segmented MRI.

Authors:  Antonio R Hidalgo-Muñoz; Javier Ramírez; Juan M Górriz; Pablo Padilla
Journal:  Front Aging Neurosci       Date:  2014-02-20       Impact factor: 5.750

8.  A class-information-based penalized matrix decomposition for identifying plants core genes responding to abiotic stresses.

Authors:  Jin-Xing Liu; Jian Liu; Ying-Lian Gao; Jian-Xun Mi; Chun-Xia Ma; Dong Wang
Journal:  PLoS One       Date:  2014-09-02       Impact factor: 3.240

9.  Fuzzy logic for elimination of redundant information of microarray data.

Authors:  Edmundo Bonilla Huerta; Béatrice Duval; Jin-Kao Hao
Journal:  Genomics Proteomics Bioinformatics       Date:  2008-06       Impact factor: 7.691

10.  Informative gene selection and the direct classification of tumors based on relative simplicity.

Authors:  Yuan Chen; Lifeng Wang; Lanzhi Li; Hongyan Zhang; Zheming Yuan
Journal:  BMC Bioinformatics       Date:  2016-01-20       Impact factor: 3.169
