Domain-based and family-specific sequence identity thresholds increase the levels of reliable protein function transfer.

Sarah Addou, Robert Rentzsch, David Lee, Christine A Orengo.

Abstract

Divergence in function of homologous proteins is based on both sequence and structural changes. Overall enzyme function has been reported to diverge earlier (50% sequence identity) than overall structure (35%). We herein study the functional conservation of enzyme and non-enzyme sequences using the protein domain families in CATH-Gene3D. Despite the rapid increase in sequence data since the last comprehensive study by Tian and Skolnick, our findings suggest that generic thresholds of 40% and 60% aligned sequence identity are still sufficient to safely inherit third-level and full Enzyme Commission numbers, respectively. This increases to 50% and 70% on the domain level, unless the multi-domain architecture matches. Assignments from the Kyoto Encyclopedia of Genes and Genomes and the Munich Information Center for Protein Sequences Functional Catalogue seem to be less conserved with sequence, probably due to a more pathway-centric view: 80% domain sequence identity is required for safe function transfer. Comparing domains (more pairwise relationships) and using family-specific thresholds (varying evolutionary speeds) yield the highest coverage rates when transferring functions to model proteomes. An average twofold increase in enzyme annotations is seen for 523 proteomes in Gene3D. As simple 'rules of thumb', sequence identity thresholds do not require a bioinformatics background. We will provide and update this information with future releases of CATH-Gene3D.
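
The generic 'rules of thumb' quoted in the abstract can be written down as a small lookup. The following is a minimal Python sketch, not the authors' implementation: the function names, the level labels (`"ec3"`, `"ec4"`, `"pathway"`) and the choice of an inclusive comparison (`>=`) are assumptions made for illustration, and the family-specific thresholds mentioned in the abstract are not modelled because their values are not given here.

```python
# Generic thresholds (percent aligned sequence identity) quoted in the abstract.
# "ec3" = third-level EC number, "ec4" = full EC number (labels are hypothetical).
GENERIC = {"ec3": 40.0, "ec4": 60.0}
# Stricter values for domain-level comparisons whose multi-domain
# architecture does not match.
DOMAIN_STRICT = {"ec3": 50.0, "ec4": 70.0}
# KEGG / FunCat assignments: the abstract states a domain-level value only.
PATHWAY_DOMAIN = 80.0


def required_identity(level, domain_level, architecture_matches=False):
    """Return the generic identity threshold (%) for a given annotation level."""
    if level == "pathway":
        if not domain_level:
            # The abstract gives no whole-protein threshold for KEGG/FunCat.
            raise ValueError("only a domain-level threshold is stated for KEGG/FunCat")
        return PATHWAY_DOMAIN
    if level not in GENERIC:
        raise ValueError(f"unknown annotation level: {level!r}")
    if domain_level and not architecture_matches:
        return DOMAIN_STRICT[level]
    # Whole-protein comparison, or domain comparison with matching architecture.
    return GENERIC[level]


def can_transfer(identity, level, domain_level, architecture_matches=False):
    """Assumption: a pair qualifies when identity is at or above the threshold."""
    return identity >= required_identity(level, domain_level, architecture_matches)


if __name__ == "__main__":
    # 65% identity: enough for a full EC number between whole proteins,
    # but not between domains with differing multi-domain architecture.
    print(can_transfer(65.0, "ec4", domain_level=False))
    print(can_transfer(65.0, "ec4", domain_level=True))
```

Under these assumptions, a domain pair with matching multi-domain architecture falls back to the 40% and 60% values, which is how the sketch reads the abstract's "unless the multi-domain architecture matches".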

Year:  2008        PMID: 19135455     DOI: 10.1016/j.jmb.2008.12.045

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469
