How well is enzyme function conserved as a function of pairwise sequence identity?

Weidong Tian, Jeffrey Skolnick.

Abstract

Enzyme function conservation has been used to derive the threshold of sequence identity necessary to transfer function from a protein of known function to an unknown protein. Using pairwise sequence comparison, several studies suggested that when the sequence identity is above 40%, enzyme function is well conserved. In contrast, Rost argued that because of database bias, the results from such simple pairwise comparisons might be misleading. Thus, by grouping enzyme sequences into families based on sequence similarity and selecting representative sequences for comparison, he showed that enzyme function starts to diverge quickly when the sequence identity is below 70%. Here, we employ a strategy similar to Rost's to reduce the database bias; however, we classify enzyme families based not only on sequence similarity, but also on functional similarity, i.e. sequences in each family must have the same four digits or the same first three digits of the enzyme commission (EC) number. Furthermore, instead of selecting representative sequences for comparison, we calculate the function conservation of each enzyme family and then average the degree of enzyme function conservation across all enzyme families. Our analysis suggests that for functional transferability, 40% sequence identity can still be used as a confident threshold to transfer the first three digits of an EC number; however, to transfer all four digits of an EC number, above 60% sequence identity is needed to have at least 90% accuracy. Moreover, when PSI-BLAST is used, the magnitude of the E-value is found to be weakly correlated with the extent of enzyme function conservation in the third iteration of PSI-BLAST. As a result, functional annotation based on the E-values from PSI-BLAST should be used with caution. 
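The family-averaged measure described above can be illustrated with a minimal sketch. It assumes the input is a list of sequence pairs already assigned to families, each carrying a percent identity and two fully specified EC numbers; the abstract does not say how pairs are formed or binned, so the tuple layout, the half-open identity bin, and the function names `ec_match` and `family_averaged_conservation` are hypothetical and not the authors' implementation.

```python
from collections import defaultdict


def ec_match(ec_a, ec_b, digits):
    """Return True if two EC numbers agree on their first `digits` fields.

    Assumes fully specified EC numbers such as "1.1.1.1".
    """
    return ec_a.split(".")[:digits] == ec_b.split(".")[:digits]


def family_averaged_conservation(pairs, identity_bin, digits=4):
    """Average per-family function conservation within one identity bin.

    pairs: iterable of (family_id, percent_identity, ec_a, ec_b)
           (hypothetical layout, chosen for illustration).
    identity_bin: (low, high), treated as low <= identity < high.
    Returns None when no pair falls inside the bin.
    """
    low, high = identity_bin
    conserved = defaultdict(int)
    total = defaultdict(int)
    for family, identity, ec_a, ec_b in pairs:
        if low <= identity < high:
            total[family] += 1
            if ec_match(ec_a, ec_b, digits):
                conserved[family] += 1
    if not total:
        return None
    # Each family contributes one value, so a large family cannot
    # dominate the result the way it would in a pooled pairwise count.
    per_family = [conserved[f] / total[f] for f in total]
    return sum(per_family) / len(per_family)


example_pairs = [
    ("F1", 45, "1.1.1.1", "1.1.1.1"),
    ("F1", 45, "1.1.1.1", "1.1.1.2"),
    ("F1", 47, "1.1.1.1", "1.1.1.1"),
    ("F1", 48, "1.1.1.1", "1.1.1.1"),
    ("F2", 42, "2.7.1.1", "2.7.1.2"),
]
four_digit = family_averaged_conservation(example_pairs, (40, 50), digits=4)
three_digit = family_averaged_conservation(example_pairs, (40, 50), digits=3)
```

In this toy input, averaging over families weights the single-pair family F2 as heavily as the four-pair family F1, which is the point of the per-family step: a pooled count over all five pairs would give a different value from the family average.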
We also show that by employing an enzyme family-specific sequence identity threshold above which 100% functional conservation is required, functional inference of unknown sequences can be accurately accomplished. However, this comes at a cost: those true positive sequences below this threshold cannot be uniquely identified.
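One way to picture a family-specific threshold is the sketch below. It assumes each family has a list of hits labeled with percent identity and whether the hit shares the family's function, and it takes the threshold to be the highest identity at which a differing-function hit is observed, so that every hit strictly above it is conserved. This rule and the names `family_identity_threshold` and `can_transfer` are assumptions for illustration; the abstract does not describe how the authors derive the threshold.

```python
def family_identity_threshold(hits):
    """Highest percent identity at which a differing-function hit occurs.

    hits: iterable of (percent_identity, same_function) pairs
          (hypothetical layout, chosen for illustration).
    Every hit strictly above the returned value has the same function.
    Returns 0.0 when the family has no differing-function hits.
    """
    mismatched = [identity for identity, same in hits if not same]
    return max(mismatched) if mismatched else 0.0


def can_transfer(identity, threshold):
    """Transfer function only for hits strictly above the family threshold."""
    return identity > threshold


family_hits = [
    (90, True),
    (72, True),
    (55, False),
    (48, True),   # a true positive that sits below the threshold
    (30, False),
]
threshold = family_identity_threshold(family_hits)
accepted = [identity for identity, _ in family_hits if can_transfer(identity, threshold)]
```

The hit at 48% identity shares the family's function but is rejected, which mirrors the cost noted above: true positives below the threshold cannot be separated from false positives by identity alone.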

Year:  2003        PMID: 14568541     DOI: 10.1016/j.jmb.2003.08.057

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  160 in total

1.  Sequence and structure continuity of evolutionary importance improves protein functional site discovery and annotation.

Authors:  A D Wilkins; R Lua; S Erdin; R M Ward; O Lichtarge
Journal:  Protein Sci       Date:  2010-07       Impact factor: 6.725

2.  Evolution of the Cinnamyl/Sinapyl Alcohol Dehydrogenase (CAD/SAD) gene family: the emergence of real lignin is associated with the origin of Bona Fide CAD.

Authors:  Dong-Mei Guo; Jin-Hua Ran; Xiao-Quan Wang
Journal:  J Mol Evol       Date:  2010-08-19       Impact factor: 2.395

3.  EFICAz: a comprehensive approach for accurate genome-scale enzyme function inference.

Authors:  Weidong Tian; Adrian K Arakaki; Jeffrey Skolnick
Journal:  Nucleic Acids Res       Date:  2004-12-01       Impact factor: 16.971

4.  The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes.

Authors:  Andreas Ruepp; Alfred Zollner; Dieter Maier; Kaj Albermann; Jean Hani; Martin Mokrejs; Igor Tetko; Ulrich Güldener; Gertrud Mannhaupt; Martin Münsterkötter; H Werner Mewes
Journal:  Nucleic Acids Res       Date:  2004-10-14       Impact factor: 16.971

5.  Detecting remotely related proteins by their interactions and sequence similarity.

Authors:  Jordi Espadaler; Ramón Aragüés; Narayanan Eswar; Marc A Marti-Renom; Enrique Querol; Francesc X Avilés; Andrej Sali; Baldomero Oliva
Journal:  Proc Natl Acad Sci U S A       Date:  2005-05-09       Impact factor: 11.205

6.  Protein surface analysis for function annotation in high-throughput structural genomics pipeline.

Authors:  T Andrew Binkowski; Andrzej Joachimiak; Jie Liang
Journal:  Protein Sci       Date:  2005-12       Impact factor: 6.725

7.  Effective function annotation through catalytic residue conservation.

Authors:  Richard A George; Ruth V Spriggs; Gail J Bartlett; Alex Gutteridge; Malcolm W MacArthur; Craig T Porter; Bissan Al-Lazikani; Janet M Thornton; Mark B Swindells
Journal:  Proc Natl Acad Sci U S A       Date:  2005-07-21       Impact factor: 11.205

8.  Identification of genes encoding tRNA modification enzymes by comparative genomics. (Review)

Authors:  Valérie de Crécy-Lagard
Journal:  Methods Enzymol       Date:  2007       Impact factor: 1.600

9.  Computational prediction and experimental verification of the gene encoding the NAD+/NADP+-dependent succinate semialdehyde dehydrogenase in Escherichia coli.

Authors:  Tobias Fuhrer; Lifeng Chen; Uwe Sauer; Dennis Vitkup
Journal:  J Bacteriol       Date:  2007-09-14       Impact factor: 3.490

10.  'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list--and how to find it. (Review)

Authors:  Andrew D Hanson; Anne Pribat; Jeffrey C Waller; Valérie de Crécy-Lagard
Journal:  Biochem J       Date:  2009-12-14       Impact factor: 3.857
