Identification of related proteins on family, superfamily and fold level.

E Lindahl, A Elofsson.

Abstract

Proteins might have considerable structural similarities even when no evolutionary relationship of their sequences can be detected. This property is often referred to as the proteins sharing only a "fold". Of course, there are also sequences of common origin in each fold, called a "superfamily", and in them groups of sequences with clear similarities, designated "family". Developing algorithms to reliably identify proteins related at any level is one of the most important challenges in the fast growing field of bioinformatics today. However, it is not at all certain that a method proficient at finding sequence similarities performs well at the other levels, or vice versa. Here, we have compared the performance of various search methods on these different levels of similarity. As expected, we show that it becomes much harder to detect proteins as their sequences diverge. For family related sequences the best method gets 75% of the top hits correct. When the sequences differ but the proteins belong to the same superfamily this drops to 29%, and in the case of proteins with only fold similarity it is as low as 15%. We have made a more complete analysis of the performance of different algorithms than earlier studies, also including threading methods in the comparison. Using this method a more detailed picture emerges, showing multiple sequence information to improve detection on the two closer levels of relationship. We have also compared the different methods of including this information in prediction algorithms. For lower specificities, the best scheme to use is a linking method connecting proteins through an intermediate hit. For higher specificities, better performance is obtained by PSI-BLAST and some procedures using hidden Markov models. We also show that a threading method, THREADER, performs significantly better than any other method at fold recognition. Copyright 2000 Academic Press.
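The "top hits correct" figures quoted in the abstract can be read as a per-level accuracy. The sketch below is one plausible way to compute such a number and is not taken from the paper: the labels, scores, and function names are hypothetical, and it assumes that when testing a given level, targets related to the query at a closer level are ignored and only queries with at least one relative at exactly that level are counted.

```python
# Hypothetical SCOP-style labels: protein id -> (fold, superfamily, family).
LABELS = {
    "p1": ("a.1", "a.1.1", "a.1.1.1"),
    "p2": ("a.1", "a.1.1", "a.1.1.1"),  # same family as p1
    "p3": ("a.1", "a.1.1", "a.1.1.2"),  # same superfamily as p1, other family
    "p4": ("a.1", "a.1.2", "a.1.2.1"),  # shares only the fold with p1
    "p5": ("b.2", "b.2.1", "b.2.1.1"),  # unrelated to the others
}

# Hypothetical search scores: query -> {target: score}, higher is better.
SCORES = {
    "p1": {"p2": 9.0, "p3": 5.0, "p4": 1.0, "p5": 2.0},
    "p3": {"p1": 4.0, "p2": 3.0, "p4": 0.5, "p5": 1.0},
}

# Levels that count as "closer" than the level under test (assumption).
CLOSER = {"family": (), "superfamily": ("family",), "fold": ("family", "superfamily")}


def relation(a, b, labels):
    """Return the closest classification level shared by proteins a and b."""
    fold_a, sf_a, fam_a = labels[a]
    fold_b, sf_b, fam_b = labels[b]
    if fam_a == fam_b:
        return "family"
    if sf_a == sf_b:
        return "superfamily"
    if fold_a == fold_b:
        return "fold"
    return "unrelated"


def top_hit_accuracy(scores, labels, level):
    """Fraction of queries whose best-scoring candidate is related at `level`."""
    correct = total = 0
    for query, hits in scores.items():
        # Drop the query itself and any target related at a closer level.
        candidates = {
            t: s for t, s in hits.items()
            if t != query and relation(query, t, labels) not in CLOSER[level]
        }
        # Skip queries that have no relative at exactly this level.
        if not any(relation(query, t, labels) == level for t in candidates):
            continue
        total += 1
        best = max(candidates, key=candidates.get)
        correct += relation(query, best, labels) == level
    return correct / total if total else float("nan")


if __name__ == "__main__":
    for lvl in ("family", "superfamily", "fold"):
        print(lvl, top_hit_accuracy(SCORES, LABELS, lvl))
```

In this toy data the unrelated protein outscores the fold-only relative, which mirrors the abstract's point that detection degrades as relationships become more distant.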

Year:  2000        PMID: 10623551     DOI: 10.1006/jmbi.1999.3377

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  53 in total

1.  Including biological literature improves homology search.

Authors:  J T Chang; S Raychaudhuri; R B Altman
Journal:  Pac Symp Biocomput       Date:  2001

2.  SUPFAM--a database of potential protein superfamily relationships derived by comparing sequence-based and structure-based families: implications for structural genomics and function annotation in genomes.

Authors:  Shashi B Pandit; Dilip Gosar; S Abhiman; S Sujatha; Sayali S Dixit; Natasha S Mhatre; R Sowdhamini; N Srinivasan
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

3.  Improved detection of homologous membrane proteins by inclusion of information from topology predictions.

Authors:  Maria Hedman; Hans Deloof; Gunnar Von Heijne; Arne Elofsson
Journal:  Protein Sci       Date:  2002-03       Impact factor: 6.725

4.  Comparing function and structure between entire proteomes.

Authors:  J Liu; B Rost
Journal:  Protein Sci       Date:  2001-10       Impact factor: 6.725

5.  Pcons: a neural-network-based consensus predictor that improves fold recognition.

Authors:  J Lundström; L Rychlewski; J Bujnicki; A Elofsson
Journal:  Protein Sci       Date:  2001-11       Impact factor: 6.725

6.  Sequence similarities of protein kinase peptide substrates and inhibitors: comparison of their primary structures with immunoglobulin repeats.

Authors:  J Kubrycht; J Borecký; K Sigler
Journal:  Folia Microbiol (Praha)       Date:  2002       Impact factor: 2.099

7.  Conservation of structure and function among tyrosine recombinases: homology-based modeling of the lambda integrase core-binding domain.

Authors:  Brian M Swalla; Richard I Gumport; Jeffrey F Gardner
Journal:  Nucleic Acids Res       Date:  2003-02-01       Impact factor: 16.971

8.  A comparison of profile hidden Markov model procedures for remote homology detection.

Authors:  Martin Madera; Julian Gough
Journal:  Nucleic Acids Res       Date:  2002-10-01       Impact factor: 16.971

9.  Sequence similarities of protein kinase substrates and inhibitors with immunoglobulins and model immunoglobulin homologue: cell adhesion molecule from the living fossil sponge Geodia cydonium. Mapping of coherent database similarities and implications for evolution of CDR1 and hypermutation.

Authors:  J Kubrycht; J Borecký; P Soucek; P Jezek
Journal:  Folia Microbiol (Praha)       Date:  2004       Impact factor: 2.099

10.  CONTSOR--a new knowledge-based fold recognition potential, based on side chain orientation and contacts between residue terminal groups.

Authors:  Boris Vishnepolsky; Malak Pirtskhalava
Journal:  Protein Sci       Date:  2011-11-23       Impact factor: 6.725
