
A categorization approach to automated ontological function annotation.

Karin Verspoor, Judith Cohn, Susan Mniszewski, Cliff Joslyn.

Abstract

Automated function prediction (AFP) methods increasingly use knowledge discovery algorithms to map sequence, structure, literature, and/or pathway information about proteins whose functions are unknown into functional ontologies, typically (a portion of) the Gene Ontology (GO). While there is a growing number of methods within this paradigm, the general problem of assessing the accuracy of such prediction algorithms has not been seriously addressed. We first present an application for function prediction from protein sequences that uses the POSet Ontology Categorizer (POSOC) to produce new annotations by analyzing collections of GO nodes derived from annotations of protein BLAST neighborhoods. We also present hierarchical precision and hierarchical recall as new evaluation metrics for assessing the accuracy of any predictions in hierarchical ontologies, and discuss results on a test set of protein sequences. We show that our method provides substantially improved hierarchical precision (a measure of how many of the predictions made are correct) when applied to the nearest BLAST neighbors of target proteins, as compared with simply imputing that neighborhood's annotations to the target. Moreover, when our method is applied to a broader BLAST neighborhood, hierarchical precision is enhanced even further. In all cases, the increase in hierarchical precision comes at a modest cost in hierarchical recall (a measure of how many of the true annotations are predicted at all).
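
The abstract names hierarchical precision and hierarchical recall but does not give their formulas. The sketch below illustrates one ancestor-overlap formulation that is consistent with that description: each predicted node is scored against its best-matching true node by the share of its ancestors they have in common, and recall does the same from the side of the true nodes. The toy ontology, node names, and function names are all hypothetical and chosen only for illustration; the paper's exact definitions may differ.

```python
# Toy ontology: node -> set of direct parents.
# Node names are hypothetical, not real GO identifiers.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
    "kinase": {"catalysis"},
}


def ancestors(node, parents):
    """Return the node itself plus all of its ancestors in the DAG."""
    seen = set()
    stack = [node]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


def hierarchical_precision(predicted, true, parents):
    """Average, over predicted nodes, of the best ancestor overlap with any
    true node, normalized by the predicted node's own ancestor count."""
    if not predicted or not true:
        return 0.0
    total = 0.0
    for p in predicted:
        anc_p = ancestors(p, parents)
        total += max(len(anc_p & ancestors(t, parents)) / len(anc_p) for t in true)
    return total / len(predicted)


def hierarchical_recall(predicted, true, parents):
    """Average, over true nodes, of the best ancestor overlap with any
    predicted node, normalized by the true node's own ancestor count."""
    if not predicted or not true:
        return 0.0
    total = 0.0
    for t in true:
        anc_t = ancestors(t, parents)
        total += max(len(anc_t & ancestors(p, parents)) / len(anc_t) for p in predicted)
    return total / len(true)


if __name__ == "__main__":
    # A prediction that is correct but more general than the true annotation.
    print(hierarchical_precision({"binding"}, {"dna_binding"}, PARENTS))
    print(hierarchical_recall({"binding"}, {"dna_binding"}, PARENTS))
```

In this toy case, predicting the more general node "binding" for a protein truly annotated to "dna_binding" scores a hierarchical precision of 1.0 and a hierarchical recall of 2/3. That mirrors the trade-off described in the abstract: a cautious, general prediction is entirely correct but recovers only part of the true annotation.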

Year: 2006 | PMID: 16672243 | PMCID: PMC2242540 | DOI: 10.1110/ps.062184006

Source DB: PubMed | Journal: Protein Sci | ISSN: 0961-8368 | Impact factor: 6.725

