| Literature DB >> 14518732 |
Abstract
Constructing a classifier based on microarray gene expression data has recently emerged as an important problem for cancer classification. Recent results have suggested the feasibility of constructing such a classifier with reasonable predictive accuracy under the circumstance where only a small number of cancer tissue samples of known type are available. Difficulty arises from the fact that each sample contains the expression data of a vast number of genes and these genes may interact with one another. Selection of a small number of critical genes is fundamental to correctly analyze the otherwise overwhelming data. It is essential to use a multivariate approach for capturing the correlated structure in the data. However, the curse of dimensionality leads to the concern about the reliability of selected genes. Here, we present a new gene selection method in which error and repeatability of selected genes are assessed within the context of M-fold cross-validation. In particular, we show that the method is able to identify source variables underlying data generation.Entities:
Mesh:
Substances:
Year: 2003 PMID: 14518732 DOI: 10.1109/titb.2003.816558
Source DB: PubMed Journal: IEEE Trans Inf Technol Biomed ISSN: 1089-7771