Automatic annotation of protein function based on family identification.

Abstract

Although genomes are being sequenced at an impressive rate, the information generated tells us little about protein function, which is slow to characterize by traditional methods. Automatic protein function annotation based on computational methods has alleviated this imbalance. The most powerful current approach for inferring the function of new proteins is to study the annotations of their homologues, since their common origin is assumed to be reflected in their structure and function. Unfortunately, as proteins evolve they acquire new functions, so annotation based on homology must be carried out in the context of orthologues or subfamilies. Evolution adds new complications through domain shuffling: homology (or orthology) frequently corresponds to domains rather than complete proteins. Moreover, the function of a protein may be seen as the result of combining the functions of its domains. Additionally, automatic annotation has to deal with problems related to the annotations in the databases: errors (which are likely to be propagated), inconsistencies, or different degrees of function specification. We describe a method that addresses these difficulties for the annotation of protein function. Sequence relationships are detected and measured to obtain a map of the sequence space, which is searched for differentiated groups of proteins (similar to islands on the map), which are expected to have a common function and correspond to groups of orthologues or subfamilies. This mapmaking is done by applying a clustering algorithm based on normalized cuts in graphs. The domain problem is addressed in a simple way: pairwise local alignments are analyzed to determine the extent to which they cover the entire sequence lengths of the two proteins. This analysis determines both which homologues are preferred for functional inheritance and the level of confidence of the annotation.
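The coverage analysis described above can be illustrated with a minimal sketch. It assumes 1-based inclusive alignment coordinates and a single coverage threshold of 0.8; the class name, function names, threshold, and confidence labels are all hypothetical and are not taken from the published method.

```python
from dataclasses import dataclass


@dataclass
class LocalAlignment:
    """One pairwise local alignment (hypothetical record layout).

    Coordinates are 1-based and inclusive on each sequence.
    """
    query_start: int
    query_end: int
    query_length: int
    subject_start: int
    subject_end: int
    subject_length: int


def coverage(aln: LocalAlignment) -> tuple[float, float]:
    """Return the fraction of the query and of the subject covered by the alignment."""
    query_cov = (aln.query_end - aln.query_start + 1) / aln.query_length
    subject_cov = (aln.subject_end - aln.subject_start + 1) / aln.subject_length
    return query_cov, subject_cov


def confidence_level(aln: LocalAlignment, full_threshold: float = 0.8) -> str:
    """Map coverage to an illustrative confidence label.

    Assumption: an alignment spanning most of both proteins suggests a shared
    overall architecture, whereas one spanning only part of either protein may
    reflect a single shared domain.
    """
    query_cov, subject_cov = coverage(aln)
    if query_cov >= full_threshold and subject_cov >= full_threshold:
        return "high"
    if query_cov >= full_threshold or subject_cov >= full_threshold:
        return "medium"
    return "low"
```

Under these assumptions, homologues whose alignments fall in the "high" class would be preferred as annotation sources, and the label would be carried over as the confidence of the inherited annotation.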
To alleviate the problems associated with database annotations, the information on all the homologues that are grouped together with the query protein is taken into account to select the most representative functional descriptors. This method has been applied to the annotation of the genome of Buchnera aphidicola (specific host Baizongia pistaciae). Human inspection of the annotations allowed an estimation of accuracy of 94%; the different kinds of error that may appear when using this approach are described. Results can be accessed at http://www.pdg.cnb.uam.es/funcut.html. The programs are available upon request, although installation on other systems may be complicated. Copyright 2003 Wiley-Liss, Inc.
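One simple way to pick a representative descriptor from a group of homologues is a majority vote over their annotations. The sketch below assumes this voting scheme for illustration only; the abstract does not state how descriptors are actually selected, and the function name, normalization, and `min_fraction` parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def representative_descriptor(
    annotations: Iterable[str], min_fraction: float = 0.5
) -> Optional[str]:
    """Return the most frequent annotation in a cluster, or None.

    Assumption: annotations are compared after trimming whitespace and
    lowercasing; empty annotations are ignored. A descriptor is returned only
    if at least `min_fraction` of the annotated homologues share it, so that a
    single erroneous database entry is less likely to be propagated.
    """
    normalized = [a.strip().lower() for a in annotations if a and a.strip()]
    if not normalized:
        return None
    descriptor, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_fraction:
        return descriptor
    return None
```

For example, with made-up cluster annotations `["Pyridoxal kinase", "pyridoxal kinase ", "hypothetical protein"]` the function returns `"pyridoxal kinase"`, because two of the three annotated members agree.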

Substances:
Proteins
Pyridoxal Kinase

Year: 2003 PMID: 14579359 DOI: 10.1002/prot.10449

Source DB: PubMed Journal: Proteins ISSN: 0887-3585

Cited

17 in total

1. Comparative Genomics Reveals Factors Associated with Phenotypic Expression of Wolbachia.

2. Ortholog identification in the presence of domain architecture rearrangement. (Review)

3. HomPPI: a class of sequence homology based protein-protein interface prediction methods.

4. BLANNOTATOR: enhanced homology-based function prediction of bacterial proteins.

5. Probabilistic annotation of protein sequences based on functional classifications.

6. AutoFACT: an automatic functional annotation and classification tool.

7. The relationship between protein sequences and their gene ontology functions.

8. MACSIMS: multiple alignment of complete sequences information management system.

9. CORRIE: enzyme sequence annotation with confidence estimates.

10. CARGO: a web portal to integrate customized biological information.