Feature selection for gene expression using model-based entropy.

Shenghuo Zhu, Dingding Wang, Kai Yu, Tao Li, Yihong Gong.

Abstract

Gene expression data usually contain a large number of genes but a small number of samples. Feature selection for gene expression data aims at finding a set of genes that best discriminates between biological samples of different types. Traditional machine-learning gene selection based on empirical mutual information suffers from data sparseness because of the small number of samples. To overcome the sparseness issue, we propose a model-based approach that estimates the entropy of the class variables on the model instead of on the data themselves. Here, we use multivariate normal distributions to fit the data, because multivariate normal distributions have maximum entropy among all real-valued distributions with a specified mean and covariance and are widely used to approximate various distributions. If the data follow a multivariate normal distribution, the conditional distribution of the class variables given the selected features is also normal, so its entropy can be computed from the log-determinant of its covariance matrix. Because of the large number of genes, computing all possible log-determinants is inefficient. We propose several algorithms that substantially reduce the computational cost. Experiments on seven gene data sets and a comparison with five other approaches show the accuracy of the multivariate Gaussian generative model for feature selection and the efficiency of our algorithms.
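
The log-determinant computation described in the abstract can be illustrated with a minimal sketch. It assumes the classes are encoded as numeric indicator columns (with one class column dropped so the covariance is not singular) and that a small ridge term is added to the sample covariance to keep it positive definite when samples are few. The function names, the `ridge` parameter, and the encoding are hypothetical choices for illustration. The search shown is a naive greedy forward selection that recomputes every log-determinant; it is not one of the efficient algorithms the paper proposes.

```python
import numpy as np

def conditional_entropy(cov, feat_idx, class_idx):
    """Differential entropy of the class variables given the selected
    features, under a joint Gaussian with covariance matrix `cov`."""
    c = np.asarray(class_idx)
    cond_cov = cov[np.ix_(c, c)]
    if len(feat_idx) > 0:
        s = np.asarray(feat_idx)
        s_ss = cov[np.ix_(s, s)]
        s_sy = cov[np.ix_(s, c)]
        # Schur complement: Sigma_YY - Sigma_YS Sigma_SS^{-1} Sigma_SY
        cond_cov = cond_cov - s_sy.T @ np.linalg.solve(s_ss, s_sy)
    sign, logdet = np.linalg.slogdet(cond_cov)
    if sign <= 0:
        raise ValueError("conditional covariance is not positive definite")
    k = len(c)
    # Entropy of a k-dimensional Gaussian: 0.5 * log det(2*pi*e * Sigma)
    return 0.5 * (k * np.log(2 * np.pi * np.e) + logdet)

def greedy_select(X, Y, n_select, ridge=1e-3):
    """Naive greedy forward selection (illustrative assumption, not the
    paper's algorithm): at each step add the feature that minimizes the
    model-based conditional entropy of the class variables."""
    d = X.shape[1]
    Z = np.hstack([X, Y])
    # Ridge term (assumed) regularizes the sample covariance for small n.
    cov = np.cov(Z, rowvar=False) + ridge * np.eye(Z.shape[1])
    class_idx = list(range(d, Z.shape[1]))
    selected = []
    for _ in range(n_select):
        remaining = [j for j in range(d) if j not in selected]
        scores = [conditional_entropy(cov, selected + [j], class_idx)
                  for j in remaining]
        selected.append(remaining[int(np.argmin(scores))])
    return selected
```

Here `X` is a samples-by-genes array and `Y` is a samples-by-class-indicators array, so `greedy_select(X, Y, 10)` would return the column indices of ten genes in the order they were chosen. Because the entropy depends on the data only through the covariance matrix, the estimate avoids the discretization and sparse counts that empirical mutual information requires.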

Year:  2010        PMID: 20150666     DOI: 10.1109/TCBB.2008.35

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  9 in total

1.  Binary matrix shuffling filter for feature selection in neuronal morphology classification.

Authors:  Congwei Sun; Zhijun Dai; Hongyan Zhang; Lanzhi Li; Zheming Yuan
Journal:  Comput Math Methods Med       Date:  2015-03-29       Impact factor: 2.238

2.  Informative gene selection and direct classification of tumor based on Chi-square test of pairwise gene interactions.

Authors:  Hongyan Zhang; Lanzhi Li; Chao Luo; Congwei Sun; Yuan Chen; Zhijun Dai; Zheming Yuan
Journal:  Biomed Res Int       Date:  2014-07-23       Impact factor: 3.411

3.  AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

Authors:  Lei Sun; Jun Wang; Jinmao Wei
Journal:  BMC Bioinformatics       Date:  2017-03-14       Impact factor: 3.169

4.  CURE-SMOTE algorithm and hybrid algorithm for feature selection and parameter optimization based on random forests.

Authors:  Li Ma; Suohai Fan
Journal:  BMC Bioinformatics       Date:  2017-03-14       Impact factor: 3.169

5.  Estimating Differential Entropy using Recursive Copula Splitting.

Authors:  Gil Ariel; Yoram Louzoun
Journal:  Entropy (Basel)       Date:  2020-02-19       Impact factor: 2.524

6.  Cancer Categorization Using Genetic Algorithm to Identify Biomarker Genes.

Authors:  M Sathya; M Jeyaselvi; Shubham Joshi; Ekta Pandey; Piyush Kumar Pareek; Sajjad Shaukat Jamal; Vinay Kumar; Henry Kwame Atiglah
Journal:  J Healthc Eng       Date:  2022-02-22       Impact factor: 2.682

7.  iPcc: a novel feature extraction method for accurate disease class discovery and prediction.

Authors:  Xianwen Ren; Yong Wang; Xiang-Sun Zhang; Qi Jin
Journal:  Nucleic Acids Res       Date:  2013-06-12       Impact factor: 16.971

8.  Informative gene selection and the direct classification of tumors based on relative simplicity.

Authors:  Yuan Chen; Lifeng Wang; Lanzhi Li; Hongyan Zhang; Zheming Yuan
Journal:  BMC Bioinformatics       Date:  2016-01-20       Impact factor: 3.169

9.  Intra- and Inter-individual Variability of microRNA Levels in Human Cerebrospinal Fluid: Critical Implications for Biomarker Discovery.

Authors:  Hyejin Yoon; Krystal C Belmonte; Tom Kasten; Randall Bateman; Jungsu Kim
Journal:  Sci Rep       Date:  2017-10-05       Impact factor: 4.379
