
Hierarchical multi-label prediction of gene function.

Zafer Barutcuoglu, Robert E Schapire, Olga G Troyanskaya.

Abstract

MOTIVATION: Assigning functions for unknown genes based on diverse large-scale data is a key task in functional genomics. Previous work on gene function prediction has addressed this problem using independent classifiers for each function. However, such an approach ignores the structure of functional class taxonomies, such as the Gene Ontology (GO). Over a hierarchy of functional classes, a group of independent classifiers where each one predicts gene membership to a particular class can produce a hierarchically inconsistent set of predictions, where for a given gene a specific class may be predicted positive while its inclusive parent class is predicted negative. Taking the hierarchical structure into account resolves such inconsistencies and provides an opportunity for leveraging all classifiers in the hierarchy to achieve higher specificity of predictions.
RESULTS: We developed a Bayesian framework for combining multiple classifiers based on the functional taxonomy constraints. Using a hierarchy of support vector machine (SVM) classifiers trained on multiple data types, we combined predictions in our Bayesian framework to obtain the most probable consistent set of predictions. Experiments show that over a 105-node subhierarchy of the GO, our Bayesian framework improves predictions for 93 nodes. As an additional benefit, our method also provides implicit calibration of SVM margin outputs to probabilities. Using this method, we make function predictions for multiple proteins, and experimentally confirm predictions for proteins involved in mitosis.
SUPPLEMENTARY INFORMATION: Results for the 105 selected GO classes and predictions for 1059 unknown genes are available at: http://function.princeton.edu/genesite/
CONTACT: ogt@cs.princeton.edu
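
To make the idea of a "most probable consistent set of predictions" concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a hypothetical three-class tree hierarchy (`PARENT`), treats each classifier output as an already calibrated and independent probability (a simplification; the paper's Bayesian framework is not described in enough detail here to reproduce), and searches by brute force, which is only feasible for very small hierarchies. All names in the block are illustrative.

```python
from itertools import product

# Hypothetical toy hierarchy: child -> parent (None marks the root).
PARENT = {"cell_cycle": None, "mitosis": "cell_cycle", "cytokinesis": "cell_cycle"}


def is_consistent(labels, parent=PARENT):
    """Return True if no class is predicted positive while its parent is negative."""
    return all(not labels[c] or p is None or labels[p] for c, p in parent.items())


def most_probable_consistent(probs, parent=PARENT):
    """Brute-force the hierarchically consistent labeling with the highest score.

    The score is the product of per-class probabilities, which assumes the
    classifier outputs are calibrated and independent (an assumption of this
    sketch, not a claim about the published method).
    """
    nodes = list(parent)
    best, best_score = None, -1.0
    for bits in product([False, True], repeat=len(nodes)):
        labels = dict(zip(nodes, bits))
        if not is_consistent(labels, parent):
            continue  # skip labelings that violate the hierarchy
        score = 1.0
        for n in nodes:
            score *= probs[n] if labels[n] else 1.0 - probs[n]
        if score > best_score:
            best, best_score = labels, score
    return best


# Made-up per-class probabilities for one gene.
probs = {"cell_cycle": 0.4, "mitosis": 0.9, "cytokinesis": 0.2}

# Thresholding each class independently at 0.5 gives an inconsistent result:
independent = {c: p >= 0.5 for c, p in probs.items()}
corrected = most_probable_consistent(probs)
```

In this made-up example, independent thresholding predicts `mitosis` positive while its parent `cell_cycle` is negative. The consistent search instead flips the parent to positive, because the strong evidence at the child outweighs the weak negative evidence at the parent.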

Year:  2006        PMID: 16410319     DOI: 10.1093/bioinformatics/btk048

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  66 in total

1.  Bayesian approach to transforming public gene expression repositories into disease diagnosis databases.

Authors:  Haiyan Huang; Chun-Chi Liu; Xianghong Jasmine Zhou
Journal:  Proc Natl Acad Sci U S A       Date:  2010-04-01       Impact factor: 11.205

2.  Biomedical ontologies in action: role in knowledge management, data integration and decision support.

Authors:  O Bodenreider
Journal:  Yearb Med Inform       Date:  2008

3.  The impact of incomplete knowledge on evaluation: an experimental benchmark for protein function prediction.

Authors:  Curtis Huttenhower; Matthew A Hibbs; Chad L Myers; Amy A Caudy; David C Hess; Olga G Troyanskaya
Journal:  Bioinformatics       Date:  2009-06-26       Impact factor: 6.937

4.  XML-based approaches for the integration of heterogeneous bio-molecular data.

Authors:  Marco Mesiti; Ernesto Jiménez-Ruiz; Ismael Sanz; Rafael Berlanga-Llavori; Paolo Perlasca; Giorgio Valentini; David Manset
Journal:  BMC Bioinformatics       Date:  2009-10-15       Impact factor: 3.169

5.  From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

Authors:  Bram Slabbinck; Willem Waegeman; Peter Dawyndt; Paul De Vos; Bernard De Baets
Journal:  BMC Bioinformatics       Date:  2010-01-30       Impact factor: 3.169

6.  Bayesian Markov Random Field analysis for protein function prediction based on network data.

Authors:  Yiannis A I Kourmpetis; Aalt D J van Dijk; Marco C A M Bink; Roeland C H J van Ham; Cajo J F ter Braak
Journal:  PLoS One       Date:  2010-02-24       Impact factor: 3.240

7.  Predicting gene function using few positive examples and unlabeled ones.

Authors:  Yiming Chen; Zhoujun Li; Xiaofeng Wang; Jiali Feng; Xiaohua Hu
Journal:  BMC Genomics       Date:  2010-11-02       Impact factor: 3.969

8.  Towards a semi-automatic functional annotation tool based on decision-tree techniques.

Authors:  Jérôme Azé; Lucie Gentils; Claire Toffano-Nioche; Valentin Loux; Jean-François Gibrat; Philippe Bessières; Céline Rouveirol; Anne Poupon; Christine Froidevaux
Journal:  BMC Proc       Date:  2008-12-17

9.  Multi-label literature classification based on the Gene Ontology graph.

Authors:  Bo Jin; Brian Muller; Chengxiang Zhai; Xinghua Lu
Journal:  BMC Bioinformatics       Date:  2008-12-08       Impact factor: 3.169

10.  Predicting gene function using hierarchical multi-label decision tree ensembles.

Authors:  Leander Schietgat; Celine Vens; Jan Struyf; Hendrik Blockeel; Dragi Kocev; Saso Dzeroski
Journal:  BMC Bioinformatics       Date:  2010-01-02       Impact factor: 3.169
