
A unified statistical framework for sequence comparison and structure comparison.

M Levitt, M Gerstein.

Abstract

We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains [taken here from the Structural Classification of Proteins (scop) database] and then fitting a simple distribution function to the observed scores. By using this distribution, we can attach a statistical significance to each comparison score in the form of a P value, the probability that a better score would occur by chance. As expected, we find that the scores for sequence matching follow an extreme-value distribution. The agreement, moreover, between the P values that we derive from this distribution and those reported by standard programs (e.g., BLAST and FASTA) validates our approach. Structure comparison scores also follow an extreme-value distribution when the statistics are expressed in terms of a structural alignment score (essentially the sum of reciprocated distances between aligned atoms minus gap penalties). We find that the traditional metric of structural similarity, the rms deviation in atom positions after fitting aligned atoms, follows a different distribution of scores and does not perform as well as the structural alignment score. Comparison of the sequence and structure statistics for pairs of proteins known to be related distantly shows that structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. The comparison also indicates that there are very few pairs with significant similarity in terms of sequence but not structure whereas many pairs have significant similarity in terms of structure but not sequence.
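The core statistical step in the abstract, fitting an extreme-value distribution to background comparison scores and reading off a P value, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the method-of-moments fit, the function names, and the synthetic scores are not taken from the paper, whose actual fitting procedure is not described in this abstract.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Estimate Gumbel location (mu) and scale (beta) by the method of moments.

    Assumption: a moment fit is used only because it is simple; it is not
    claimed to be the fitting procedure used by the authors.
    """
    mean = statistics.fmean(scores)
    stdev = statistics.stdev(scores)
    beta = stdev * math.sqrt(6.0) / math.pi  # Gumbel variance = (pi * beta)^2 / 6
    mu = mean - EULER_GAMMA * beta           # Gumbel mean = mu + gamma * beta
    return mu, beta


def p_value(score, mu, beta):
    """Probability that a chance score is at least as high as `score`."""
    z = (score - mu) / beta
    if z < -700.0:  # avoid overflow in exp(); the P value is 1 to machine precision
        return 1.0
    # P = 1 - exp(-exp(-z)), written with expm1 to stay accurate for tiny P values
    return -math.expm1(-math.exp(-z))


def sample_gumbel(rng, mu, beta):
    """Draw one Gumbel variate by inverting the CDF (used for synthetic data only)."""
    u = rng.random()
    while u == 0.0:  # random() may return 0.0, which would break log()
        u = rng.random()
    return mu - beta * math.log(-math.log(u))


# Hypothetical background: synthetic scores standing in for an all-vs.-all comparison.
rng = random.Random(0)
background = [sample_gumbel(rng, 20.0, 5.0) for _ in range(20000)]

mu_hat, beta_hat = fit_gumbel_moments(background)
for s in (20.0, 40.0, 60.0):
    print(f"score={s:5.1f}  P={p_value(s, mu_hat, beta_hat):.3g}")
```

With real data, `background` would hold scores from comparisons of unrelated domain pairs. The same two functions would apply unchanged to sequence scores and to structural alignment scores, which is the sense in which the formalism is shared.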

Year: 1998    PMID: 9600892    PMCID: PMC34495    DOI: 10.1073/pnas.95.11.5913

Source DB: PubMed    Journal: Proc Natl Acad Sci U S A    ISSN: 0027-8424    Impact factor: 11.205


