
Automatic annotation of protein function based on family identification.

Federico Abascal, Alfonso Valencia.

Abstract

Although genomes are being sequenced at an impressive rate, the information generated tells us little about protein function, which is slow to characterize by traditional methods. Automatic protein function annotation based on computational methods has alleviated this imbalance. The most powerful current approach for inferring the function of new proteins is to study the annotations of their homologues, since their common origin is assumed to be reflected in their structure and function. Unfortunately, as proteins evolve they acquire new functions, so annotation based on homology must be carried out in the context of orthologues or subfamilies. Evolution adds new complications through domain shuffling: homology (or orthology) frequently corresponds to domains rather than complete proteins. Moreover, the function of a protein may be seen as the result of combining the functions of its domains. Additionally, automatic annotation has to deal with problems related to the annotations in the databases: errors (which are likely to be propagated), inconsistencies, or different degrees of function specification. We describe a method that addresses these difficulties for the annotation of protein function. Sequence relationships are detected and measured to obtain a map of the sequence space, which is searched for differentiated groups of proteins (similar to islands on the map) that are expected to have a common function and to correspond to groups of orthologues or subfamilies. This mapmaking is done by applying a clustering algorithm based on normalized cuts in graphs. The domain problem is addressed in a simple way: pairwise local alignments are analyzed to determine the extent to which they cover the entire sequence lengths of the two proteins. This analysis determines both which homologues are preferred for functional inheritance and the level of confidence of the annotation.
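The clustering step relies on the normalized-cut criterion. As a rough illustration only (not the authors' implementation), the sketch below evaluates that criterion in its commonly used form, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), on a toy similarity graph and finds the best two-way split by brute force. The protein names, similarity scores, and function names are hypothetical, and exhaustive search is used here only because the graph is tiny; practical implementations typically use a spectral approximation instead.

```python
from itertools import combinations


def ncut_value(weights, group_a):
    """Normalized-cut value for splitting the graph into group_a and the rest.

    weights maps frozenset({u, v}) -> similarity of an undirected edge.
    """
    group_a = set(group_a)
    cut = assoc_a = assoc_b = 0.0
    for pair, w in weights.items():
        inside = sum(1 for node in pair if node in group_a)  # endpoints in group_a
        if inside == 1:
            cut += w                 # edge crosses the partition
        assoc_a += w * inside        # adds to the total degree of group_a
        assoc_b += w * (2 - inside)  # adds to the total degree of the other side
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")          # one side has no connections at all
    return cut / assoc_a + cut / assoc_b


def best_bipartition(weights):
    """Exhaustively search all two-way splits (feasible only for tiny graphs)."""
    nodes = sorted(set().union(*weights))
    best_score, best_split = float("inf"), None
    for size in range(1, len(nodes) // 2 + 1):
        for group_a in combinations(nodes, size):
            score = ncut_value(weights, group_a)
            if score < best_score:
                best_score = score
                best_split = (frozenset(group_a),
                              frozenset(nodes) - frozenset(group_a))
    return best_score, best_split


# Hypothetical similarity graph: two "islands" joined by one weak link.
similarities = {
    frozenset({"p1", "p2"}): 0.9,
    frozenset({"p1", "p3"}): 0.9,
    frozenset({"p2", "p3"}): 0.9,
    frozenset({"p4", "p5"}): 0.8,
    frozenset({"p3", "p4"}): 0.1,
}

best_score, best_split = best_bipartition(similarities)
print(best_score, best_split)
```

On this toy graph the lowest-scoring split separates {p1, p2, p3} from {p4, p5}, which matches the intuition of islands connected by a weak bridge.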
To alleviate the problems associated with database annotations, the information on all the homologues that are grouped together with the query protein is taken into account to select the most representative functional descriptors. This method has been applied to the annotation of the genome of Buchnera aphidicola (specific host Baizongia pistaciae). Human inspection of the annotations allowed an accuracy estimate of 94%; the different kinds of error that may appear when using this approach are described. Results can be accessed at http://www.pdg.cnb.uam.es/funcut.html. The programs are available upon request, although installation on other systems may be complicated. Copyright 2003 Wiley-Liss, Inc.
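The abstract does not say how the most representative functional descriptors are chosen from the homologues grouped with the query. One plausible reading, shown here purely as an assumption, is a simple vote: keep the descriptors that appear in at least a given fraction of the cluster members. The function name, the `min_fraction` threshold, and the example annotations are all hypothetical.

```python
from collections import Counter


def representative_descriptors(annotations, min_fraction=0.5):
    """Return (descriptor, support) pairs shared by enough cluster members.

    annotations: one collection of descriptor strings per homologue.
    min_fraction: assumed cutoff on the fraction of homologues that must
    carry a descriptor (not a value taken from the paper).
    """
    n = len(annotations)
    if n == 0:
        return []
    # Count each descriptor at most once per homologue.
    counts = Counter(d for descs in annotations for d in set(descs))
    kept = [(d, c / n) for d, c in counts.items() if c / n >= min_fraction]
    # Most widely supported first; ties broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical annotations of four homologues clustered with a query protein.
cluster_annotations = [
    {"ATP synthase", "ATP binding"},
    {"ATP synthase"},
    {"ATP synthase", "hypothetical protein"},
    {"ATP binding", "ATP synthase"},
]

consensus = representative_descriptors(cluster_annotations)
print(consensus)
```

A vote of this kind dilutes isolated annotation errors and uninformative labels such as "hypothetical protein", which is the problem the abstract says the grouping is meant to alleviate.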

Year:  2003        PMID: 14579359     DOI: 10.1002/prot.10449

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


