Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods.

J Park, K Karplus, C Barrett, R Hughey, D Haussler, T Hubbard, C Chothia.

Abstract

The sequences of related proteins can diverge beyond the point where their relationship can be recognised by pairwise sequence comparisons. In attempts to overcome this limitation, methods have been developed that use as a query, not a single sequence, but sets of related sequences or a representation of the characteristics shared by related sequences. Here we describe an assessment of three of these methods: the SAM-T98 implementation of a hidden Markov model procedure; PSI-BLAST; and the intermediate sequence search (ISS) procedure. We determined the extent to which these procedures can detect evolutionary relationships between the members of the sequence database PDBD40-J. This database, derived from the structural classification of proteins (SCOP), contains the sequences of proteins of known structure whose sequence identities with each other are 40% or less. The evolutionary relationships that exist between those that have low sequence identities were found by the examination of their structural details and, in many cases, their functional features. For nine false positive predictions out of a possible 432,680, i.e. at a false positive rate of about 1/50,000, SAM-T98 found 35% of the true homologous relationships in PDBD40-J, whilst PSI-BLAST found 30% and ISS found 25%. Overall, this is about twice the number of PDBD40-J relations that can be detected by the pairwise comparison procedures FASTA (17%) and GAP-BLAST (15%). For distantly related sequences in PDBD40-J, those pairs whose sequence identity is less than 30%, SAM-T98 and PSI-BLAST detect three times the number of relationships found by the pairwise methods. Copyright 1998 Academic Press.
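The comparison in the abstract rests on measuring, for each method, the fraction of true homologous relationships recovered at a fixed number of false positives. A minimal sketch of that kind of calculation is given below. It is not the authors' evaluation code: the function name `coverage_at_false_positives`, the toy hit list, and the convention that a higher score means a more confident hit are all assumptions made for illustration. The final lines only recompute figures quoted in the abstract.

```python
from typing import Iterable, Tuple


def coverage_at_false_positives(
    hits: Iterable[Tuple[float, bool]],
    total_true_pairs: int,
    max_false_positives: int,
) -> float:
    """Fraction of true relationships found within a false-positive budget.

    hits: (score, is_true_homologue) pairs. Assumption: a higher score
    means a more confident hit.
    """
    found = 0
    false_positives = 0
    # Walk the hits from most to least confident.
    for _score, is_true in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_true:
            found += 1
        else:
            false_positives += 1
            # Stop once the budget of allowed false positives is exceeded.
            if false_positives > max_false_positives:
                break
    return found / total_true_pairs


# Hypothetical toy data: six scored hits, eight true pairs in the database.
toy_hits = [
    (9.0, True),
    (8.0, True),
    (7.5, False),
    (7.0, True),
    (6.0, False),
    (5.0, True),
]
print(coverage_at_false_positives(toy_hits, total_true_pairs=8, max_false_positives=1))
# prints 0.375

# Figures quoted in the abstract: 9 false positives out of a possible 432,680.
pairs_per_false_positive = 432_680 / 9  # roughly 48,000, i.e. about 1/50,000
# Coverage of SAM-T98 relative to FASTA, both taken from the abstract.
sam_vs_fasta = 35 / 17  # roughly 2, matching "about twice"
```

Fixing the false-positive count before comparing coverage puts methods with different scoring scales on the same footing, which is why the abstract can report all five percentages side by side.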

Year:  1998        PMID: 9837738     DOI: 10.1006/jmbi.1998.2221

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  143 in total

1.  Detection of protein fold similarity based on correlation of amino acid properties.

Authors:  I V Grigoriev; S H Kim
Journal:  Proc Natl Acad Sci U S A       Date:  1999-12-07       Impact factor: 11.205

2.  SCOP: a structural classification of proteins database.

Authors:  L Lo Conte; B Ailey; T J Hubbard; S E Brenner; A G Murzin; C Chothia
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

3.  The ASTRAL compendium for protein structure and sequence analysis.

Authors:  S E Brenner; P Koehl; M Levitt
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

4.  Assigning genomic sequences to CATH.

Authors:  F M Pearl; D Lee; J E Bray; I Sillitoe; A E Todd; A P Harrison; J M Thornton; C A Orengo
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

5.  Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments.

Authors:  I Friedberg; T Kaplan; H Margalit
Journal:  Protein Sci       Date:  2000-11       Impact factor: 6.725

6.  A rapid classification protocol for the CATH Domain Database to support structural genomics.

Authors:  F M Pearl; N Martin; J E Bray; D W Buchan; A P Harrison; D Lee; G A Reeves; A J Shepherd; I Sillitoe; A E Todd; J M Thornton; C A Orengo
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

7.  PartsList: a web-based system for dynamically ranking protein folds based on disparate attributes, including whole-genome expression and interaction information.

Authors:  J Qian; B Stenger; C A Wilson; J Lin; R Jansen; S A Teichmann; J Park; W G Krebs; H Yu; V Alexandrov; N Echols; M Gerstein
Journal:  Nucleic Acids Res       Date:  2001-04-15       Impact factor: 16.971

8.  Identification of related proteins with weak sequence identity using secondary structure information.

Authors:  C Geourjon; C Combet; C Blanchet; G Deléage
Journal:  Protein Sci       Date:  2001-04       Impact factor: 6.725

9.  LiveBench-1: continuous benchmarking of protein structure prediction servers.

Authors:  J M Bujnicki; A Elofsson; D Fischer; L Rychlewski
Journal:  Protein Sci       Date:  2001-02       Impact factor: 6.725

10.  Comparison of sequence profiles. Strategies for structural predictions using sequence information.

Authors:  L Rychlewski; L Jaroszewski; W Li; A Godzik
Journal:  Protein Sci       Date:  2000-02       Impact factor: 6.725
