GOTrees: predicting GO associations from protein domain composition using decision trees.

Boris Hayete, Jadwiga R Bienkowska.

Abstract

The Gene Ontology (GO) offers a comprehensive and standardized way to describe a protein's biological role. Proteins are annotated with GO terms based on direct or indirect experimental evidence; term assignments are also inferred from homology and literature mining. Regardless of the type of evidence used, GO assignments are either manually curated or electronic. Unfortunately, manual curation cannot keep pace with the data available from publications and various large experimental datasets. Automated literature-based annotation methods have been developed to speed up annotation. However, they apply only to proteins that have been experimentally investigated or that have close homologs with sufficient and consistent annotation. One homology-based electronic method for GO annotation is provided by the InterPro database. The InterPro2GO/PFAM2GO mapping associates individual protein domains with GO terms and can thus be used to annotate less-studied proteins. However, protein classification via a single functional domain demands stringency to avoid a large number of false positives. This work broadens the basic approach. We model proteins via their entire functional domain content and train an individual decision tree classifier for each GO term using known protein assignments. We demonstrate that our approach is sensitive, specific and precise, as well as fairly robust to sparse data. Our method is more sensitive than InterPro2GO and suffers only some loss of precision. Compared with InterPro2GO, we improve sensitivity by 22%, 27% and 50% for Molecular Function, Biological Process and Cellular Component GO terms, respectively.
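
The idea of one decision tree per GO term, trained on each protein's full domain content, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which tree-induction algorithm was used, so the Gini-impurity splitting below is an assumption, and the protein names, domain identifiers (`DOM_A`, `DOM_B`, `DOM_C`) and GO labels (`GO:X`, `GO:Y`) are hypothetical toy data.

```python
def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def build_tree(rows, labels, domains):
    """Grow a tree that splits on presence/absence of a domain.

    rows:    list of sets of domain IDs, one set per protein
    labels:  list of bools (protein annotated with the GO term or not)
    domains: candidate domain IDs still available for splitting
    Returns a bool leaf or a tuple (domain, if_present, if_absent).
    """
    majority = sum(labels) * 2 >= len(labels)
    if len(set(labels)) <= 1:
        return majority
    parent = gini(labels)
    best_gain, best_domain = 0.0, None
    for d in domains:
        yes = [l for r, l in zip(rows, labels) if d in r]
        no = [l for r, l in zip(rows, labels) if d not in r]
        if not yes or not no:
            continue  # this domain does not separate the proteins
        child = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if parent - child > best_gain:
            best_gain, best_domain = parent - child, d
    if best_domain is None:
        return majority
    d = best_domain
    rest = [x for x in domains if x != d]
    yes = [(r, l) for r, l in zip(rows, labels) if d in r]
    no = [(r, l) for r, l in zip(rows, labels) if d not in r]
    return (d,
            build_tree([r for r, _ in yes], [l for _, l in yes], rest),
            build_tree([r for r, _ in no], [l for _, l in no], rest))


def predict(tree, protein_domains):
    """Walk the tree using the protein's domain set."""
    while not isinstance(tree, bool):
        d, if_present, if_absent = tree
        tree = if_present if d in protein_domains else if_absent
    return tree


def sensitivity_precision(predicted, actual):
    """Sensitivity = TP / actual positives; precision = TP / predicted positives."""
    tp = len(predicted & actual)
    sens = tp / len(actual) if actual else 0.0
    prec = tp / len(predicted) if predicted else 0.0
    return sens, prec


# Hypothetical toy data: domain content per protein, known GO assignments.
proteins = {
    "P1": {"DOM_A", "DOM_B"}, "P2": {"DOM_A"}, "P3": {"DOM_B", "DOM_C"},
    "P4": {"DOM_C"}, "P5": {"DOM_A", "DOM_C"}, "P6": {"DOM_B"},
}
annotations = {
    "GO:X": {"P1", "P2", "P5"},  # follows a single domain (DOM_A)
    "GO:Y": {"P1"},              # needs DOM_A and DOM_B together
}

names = sorted(proteins)
all_domains = sorted(set().union(*proteins.values()))
rows = [proteins[n] for n in names]

# One independent classifier per GO term.
classifiers = {
    term: build_tree(rows, [n in members for n in names], all_domains)
    for term, members in annotations.items()
}

predicted_y = {n for n in names if predict(classifiers["GO:Y"], proteins[n])}
sens_y, prec_y = sensitivity_precision(predicted_y, annotations["GO:Y"])
```

In this toy set, `GO:Y` cannot be assigned from any single domain without false positives, which is the situation the abstract describes for single-domain mappings; a tree over the whole domain set can express the combination. The metrics here are computed on the training proteins only, so they say nothing about generalization.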

Year:  2005        PMID: 15759620

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  10 in total

1.  Expanding the landscape of chromatin modification (CM)-related functional domains and genes in human.

Authors:  Shuye Pu; Andrei L Turinsky; James Vlasblom; Tuan On; Xuejian Xiong; Andrew Emili; Zhaolei Zhang; Jack Greenblatt; John Parkinson; Shoshana J Wodak
Journal:  PLoS One       Date:  2010-11-29       Impact factor: 3.240

2.  Domain architecture conservation in orthologs.

Authors:  Kristoffer Forslund; Isabella Pekkari; Erik L L Sonnhammer
Journal:  BMC Bioinformatics       Date:  2011-08-05       Impact factor: 3.169

3.  Genome comparison using Gene Ontology (GO) with statistical testing.

Authors:  Zhaotao Cai; Xizeng Mao; Songgang Li; Liping Wei
Journal:  BMC Bioinformatics       Date:  2006-08-11       Impact factor: 3.169

4.  Automatic extraction of gene ontology annotation and its correlation with clusters in protein networks.

Authors:  Nikolai Daraselia; Anton Yuryev; Sergei Egorov; Ilya Mazo; Iaroslav Ispolatov
Journal:  BMC Bioinformatics       Date:  2007-07-10       Impact factor: 3.169

5.  Computational prediction of protein function based on weighted mapping of domains and GO terms.

Authors:  Zhixia Teng; Maozu Guo; Qiguo Dai; Chunyu Wang; Jin Li; Xiaoyan Liu
Journal:  Biomed Res Int       Date:  2014-04-23       Impact factor: 3.411

6.  Metagenomic insights into the RDX-degrading potential of the ovine rumen microbiome.

Authors:  Robert W Li; Juan Gabriel Giarrizzo; Sitao Wu; Weizhong Li; Jennifer M Duringer; A Morrie Craig
Journal:  PLoS One       Date:  2014-11-10       Impact factor: 3.240

7.  Predicting gene function using hierarchical multi-label decision tree ensembles.

Authors:  Leander Schietgat; Celine Vens; Jan Struyf; Hendrik Blockeel; Dragi Kocev; Saso Dzeroski
Journal:  BMC Bioinformatics       Date:  2010-01-02       Impact factor: 3.169

8.  Protein function prediction using domain families.

Authors:  Robert Rentzsch; Christine A Orengo
Journal:  BMC Bioinformatics       Date:  2013-02-28       Impact factor: 3.169

9.  Protein domain recurrence and order can enhance prediction of protein functions.

Authors:  Mario Abdel Messih; Meghana Chitale; Vladimir B Bajic; Daisuke Kihara; Xin Gao
Journal:  Bioinformatics       Date:  2012-09-15       Impact factor: 6.937

10.  Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach.

Authors:  Carson Andorf; Drena Dobbs; Vasant Honavar
Journal:  BMC Bioinformatics       Date:  2007-08-03       Impact factor: 3.169
