Semi-supervised analysis of gene expression profiles for lineage-specific development in the Caenorhabditis elegans embryo.

Yuan Qi, Patrycja E Missiuro, Ashish Kapoor, Craig P Hunter, Tommi S Jaakkola, David K Gifford, Hui Ge.

Abstract

MOTIVATION: Gene expression profiling is a powerful approach to identify genes that may be involved in a specific biological process on a global scale. For example, gene expression profiling of mutant animals that lack or contain an excess of certain cell types is a common way to identify genes that are important for the development and maintenance of given cell types. However, it is difficult for traditional computational methods, including unsupervised and supervised learning methods, to detect relevant genes from a large collection of expression profiles with high sensitivity and specificity. Unsupervised methods group similar gene expressions together while ignoring important prior biological knowledge. Supervised methods utilize training data from prior biological knowledge to classify gene expression. However, for many biological problems, little prior knowledge is available, which limits the prediction performance of most supervised methods.
RESULTS: We present a Bayesian semi-supervised learning method, called BGEN, that improves upon supervised and unsupervised methods by both capturing relevant expression profiles and using prior biological knowledge from literature and experimental validation. Unlike currently available semi-supervised learning methods, this new method trains a kernel classifier based on labeled and unlabeled gene expression examples. The semi-supervised trained classifier can then be used to efficiently classify the remaining genes in the dataset. Moreover, we model the confidence of microarray probes and probabilistically combine multiple probe predictions into gene predictions. We apply BGEN to identify genes involved in the development of a specific cell lineage in the C. elegans embryo, and to further identify the tissues in which these genes are enriched. Compared to K-means clustering and SVM classification, BGEN achieves higher sensitivity and specificity. We confirm certain predictions by biological experiments.
AVAILABILITY: The results are available at http://www.csail.mit.edu/~alanqi/projects/BGEN.html.
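The abstract mentions combining multiple probe predictions into a single gene prediction while accounting for probe confidence. The sketch below illustrates one simple way this could be done, a confidence-weighted average of probe-level probabilities. It is not BGEN's actual probabilistic model, which the abstract does not specify. The function name, the input format, the gene labels, and the weighting scheme are all assumptions made for illustration.

```python
from collections import defaultdict


def combine_probe_predictions(probe_preds):
    """Combine probe-level probabilities into one probability per gene.

    probe_preds: iterable of (gene_id, probability, confidence) tuples.
    `probability` is a classifier's P(positive) for a single probe and
    `confidence` is a non-negative weight for that probe. Both inputs are
    hypothetical; this is NOT the model described in the paper.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for gene_id, prob, conf in probe_preds:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range for {gene_id}: {prob}")
        if conf < 0.0:
            raise ValueError(f"negative confidence for {gene_id}: {conf}")
        weighted_sum[gene_id] += conf * prob
        weight_total[gene_id] += conf
    # Genes whose probes all have zero confidence carry no evidence and are skipped.
    return {
        gene_id: weighted_sum[gene_id] / weight_total[gene_id]
        for gene_id in weighted_sum
        if weight_total[gene_id] > 0.0
    }


# Usage with made-up values: two probes for "geneA", one for "geneB".
example = [("geneA", 0.9, 3.0), ("geneA", 0.5, 1.0), ("geneB", 0.2, 1.0)]
gene_probs = combine_probe_predictions(example)
```

Under this weighting, a high-confidence probe pulls the gene-level score toward its own prediction, while a probe with zero confidence is ignored entirely.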

Year: 2006    PMID: 16873502    DOI: 10.1093/bioinformatics/btl256

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  3 in total

1.  Determining effects of non-synonymous SNPs on protein-protein interactions using supervised and semi-supervised learning.

Authors:  Nan Zhao; Jing Ginger Han; Chi-Ren Shyu; Dmitry Korkin
Journal:  PLoS Comput Biol       Date:  2014-05-01       Impact factor: 4.475

2.  Information flow analysis of interactome networks.

Authors:  Patrycja Vasilyev Missiuro; Kesheng Liu; Lihua Zou; Brian C Ross; Guoyan Zhao; Jun S Liu; Hui Ge
Journal:  PLoS Comput Biol       Date:  2009-04-10       Impact factor: 4.475

3.  Biomarker discovery across annotated and unannotated microarray datasets using semi-supervised learning.

Authors:  Cole Harris; Noushin Ghaffari
Journal:  BMC Genomics       Date:  2008-09-16       Impact factor: 3.969
