Sensitivity and selectivity in protein structure comparison.

Michael L Sierk, William R Pearson.

Abstract

Seven protein structure comparison methods and two sequence comparison programs were evaluated on their ability to detect either protein homologs or domains with the same topology (fold) as defined by the CATH structure database. The structure alignment programs Dali, Structal, Combinatorial Extension (CE), VAST, and Matras were tested along with SGM and PRIDE, which calculate a structural distance between two domains without aligning them. We also tested two sequence alignment programs, SSEARCH and PSI-BLAST. Depending upon the level of selectivity and error model, structure alignment programs can detect roughly twice as many homologous domains in CATH as sequence alignment programs. Dali finds the most homologs, 321-533 of 1120 possible true positives (28.7%-45.7%), at an error rate of 0.1 errors per query (EPQ), whereas PSI-BLAST finds 365 true positives (32.6%), regardless of the error model. At an EPQ of 1.0, Dali finds 42%-70% of possible homologs, whereas Matras finds 49%-57%; PSI-BLAST finds 36.9%. However, Dali achieves >84% coverage before the first error for half of the families tested. Of the 7056 possible topology pairs, Dali and PSI-BLAST find 9.2% and 5.2%, respectively, at an EPQ of 0.1, and 19.5% and 5.9% at an EPQ of 1.0. Most statistical significance estimates reported by the structural alignment programs overestimate the significance of an alignment by orders of magnitude when compared with the actual distribution of errors. These results help quantify the statistical distinction between analogous and homologous structures, and provide a benchmark for structure comparison statistics.
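The coverage figures above are reported at a fixed number of errors per query (EPQ). As a rough illustration of how such a number can be computed, the sketch below pools the hits from all queries, ranks them by score, and counts true positives until the allowed number of false positives would be exceeded. This is a minimal sketch under stated assumptions, not the authors' procedure: the function name `coverage_at_epq`, the `(score, is_true_positive)` hit format, the "higher score is better" convention, and the pooled-ranking treatment are all hypothetical, and the abstract does not specify the error models it compares.

```python
from typing import Iterable, Tuple


def coverage_at_epq(
    hits: Iterable[Tuple[float, bool]],
    n_queries: int,
    n_possible: int,
    max_epq: float,
) -> float:
    """Fraction of possible true positives found at or below `max_epq`.

    hits       -- pooled (score, is_true_positive) pairs from all queries;
                  assumption: a higher score means a more confident hit.
    n_queries  -- number of queries that produced the pooled hits.
    n_possible -- total number of true relationships that could be found.
    max_epq    -- allowed errors per query (e.g. 0.1 or 1.0).
    """
    # Rank all hits from most to least confident.
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)

    # Total number of false positives tolerated across all queries.
    max_errors = max_epq * n_queries

    true_pos = 0
    false_pos = 0
    for _score, is_tp in ranked:
        if is_tp:
            true_pos += 1
        else:
            # Stop before accepting an error that would exceed the budget.
            if false_pos + 1 > max_errors:
                break
            false_pos += 1

    return true_pos / n_possible


# Toy data (made up for illustration): 2 queries, 5 possible true positives.
example_hits = [
    (9.0, True), (8.0, True), (7.0, False),
    (6.0, True), (5.0, False), (4.0, True),
]
for epq in (0.0, 0.5, 1.0):
    print(epq, coverage_at_epq(example_hits, n_queries=2, n_possible=5, max_epq=epq))
```

With `max_epq=0.0` the function returns the coverage before the first error, which is the quantity the abstract quotes for Dali (>84% for half of the families tested). The percentages in the abstract are coverages of this kind; for example, 365 of 1120 possible true positives is 365/1120, or about 32.6%.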

Year:  2004        PMID: 14978311      PMCID: PMC2286722          DOI: 10.1110/ps.03328504

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725

