Zhiyi Mao, Wensheng Cai, Xueguang Shao.
Abstract
Gene selection is an important task in bioinformatics because the accuracy of cancer classification generally depends on genes that are biologically relevant to the classification problem. In this work, a randomization test (RT) is used as a gene selection method for gene expression data. The significance of each gene is evaluated with a statistic derived from the statistics of the regression coefficients in a series of partial least squares discriminant analysis (PLSDA) models. Informative genes are selected for classifying four gene expression datasets (prostate cancer, lung cancer, leukemia, and non-small cell lung cancer (NSCLC)), and the rationality of the results is validated by multiple linear regression (MLR) modeling and principal component analysis (PCA). With the selected genes, satisfactory classification results can be obtained.
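The general idea of the method can be illustrated with a minimal Python sketch. The abstract does not specify the exact statistic or resampling scheme, so everything below is an assumption made for illustration: the series of PLSDA models is built by random subsampling, the per-gene statistic is taken as |mean| / standard deviation of the regression coefficients across those models, the null distribution is obtained by permuting class labels, and the two-class problem is coded as -1/+1 so that a single-response PLS regression can stand in for PLSDA. The function names (`pls1_coefficients`, `stability_statistic`, `randomization_test`) are hypothetical and do not come from the paper.

```python
import numpy as np


def pls1_coefficients(X, y, n_components=2):
    """Regression coefficients of a single-response PLS model (NIPALS)."""
    Xk = X - X.mean(axis=0)  # centered copy, deflated component by component
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # no covariance left with y (e.g. constant labels)
            break
        w = w / norm
        t = Xk @ w  # score vector
        tt = t @ t
        p = Xk.T @ t / tt  # X loading
        c = (yk @ t) / tt  # y loading
        Xk = Xk - np.outer(t, p)
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def stability_statistic(X, y, n_models, frac, n_components, rng):
    """|mean| / std of each coefficient over models fitted on random subsamples.

    Assumed form of the statistic; the abstract does not define it.
    """
    n = X.shape[0]
    m = max(2, int(round(frac * n)))
    coefs = np.empty((n_models, X.shape[1]))
    for i in range(n_models):
        idx = rng.choice(n, size=m, replace=False)
        coefs[i] = pls1_coefficients(X[idx], y[idx], n_components)
    return np.abs(coefs.mean(axis=0)) / (coefs.std(axis=0) + 1e-12)


def randomization_test(X, y, n_perm=100, n_models=30, frac=0.8,
                       n_components=2, seed=0):
    """Return the observed statistic and a permutation p-value for each gene."""
    rng = np.random.default_rng(seed)
    observed = stability_statistic(X, y, n_models, frac, n_components, rng)
    exceed = np.zeros(X.shape[1])
    for _ in range(n_perm):
        y_perm = rng.permutation(y)  # break the gene/class association
        null = stability_statistic(X, y_perm, n_models, frac, n_components, rng)
        exceed += null >= observed
    # Add-one correction keeps the p-values strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic stand-in for expression data: 40 samples, 20 genes,
    # with only the first two genes carrying class information.
    rng = np.random.default_rng(42)
    y = np.repeat([-1.0, 1.0], 20)
    X = rng.normal(size=(40, 20))
    X[:, :2] += 2.0 * y[:, None]
    stat, pvals = randomization_test(X, y, n_perm=50, n_models=20)
    print("genes ranked by p-value:", np.argsort(pvals)[:5])
```

Genes with small p-values would be kept as informative. In practice the number of permutations, the subsampling fraction, and the number of PLS components would have to be tuned to the dataset; the values used here are arbitrary.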
Keywords: Cancer classification; Gene expression data; Gene selection; Partial least squares discriminant analysis; Randomization test
Year: 2013 PMID: 23567540 DOI: 10.1016/j.jbi.2013.03.009
Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317