
A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.

D W Rice, D Eisenberg.

Abstract

In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. For cases where the probe is distant in sequence from its homolog, we have developed a (7 x 3 x 2 x 7 x 3) 3D-1D substitution matrix (called H3P2), calculated from a database of 119 structural pairs. Members of each pair share a similar fold, but have sequence identity less than 30%. Each probe sequence position is defined by one of seven residue classes and three secondary structure classes. Each homologous fold position is defined by one of seven residue classes, three secondary structure classes, and two burial classes. Thus the matrix is five-dimensional and contains 7 x 3 x 2 x 7 x 3 = 882 elements or 3D-1D scores. The first step in assigning a probe sequence to its homologous fold is the prediction of the three-state (helix, strand, coil) secondary structure of the probe; here we use the PHD program (profile-based neural network prediction of secondary structure). Then a dynamic programming algorithm uses the H3P2 matrix to align the probe sequence with structures in a representative fold library. To test the effectiveness of the H3P2 matrix, a challenging, fold-class-diverse, cross-validated benchmark assessment is used to compare it with the GONNET, PAM250 and BLOSUM62 matrices and with a secondary-structure-only substitution matrix. For distantly related sequences, the H3P2 matrix detects more homologous structures at higher reliabilities than do these other substitution matrices, based on sensitivity versus specificity plots (or SENS-SPEC plots). The added efficacy of the H3P2 matrix arises from its information on the statistical preferences for various sequence-structure environment combinations from very distantly related proteins. The matrix introduces the predicted secondary structure information from a sequence into fold recognition in a statistical way that normalizes the inherent correlations between residue type, secondary structure and solvent accessibility.
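The matrix layout and the alignment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Only the class counts (7, 3, 2, 7, 3) and the use of dynamic programming come from the abstract. The index order (fold classes first, then probe classes), the random placeholder scores, the global Needleman-Wunsch formulation, the linear gap penalty, and the names `make_placeholder_matrix` and `align_score` are all assumptions made for illustration.

```python
import random

# Class counts stated in the abstract.
N_RES, N_SS, N_BURIAL = 7, 3, 2


def make_placeholder_matrix(seed=0):
    """Build a 5-D score table with RANDOM placeholder values.

    Keys are (fold_res, fold_ss, fold_burial, probe_res, probe_ss).
    This index order is an assumption, and the values are not the
    published H3P2 scores.
    """
    rng = random.Random(seed)
    return {
        (fr, fs, fb, pr, ps): rng.uniform(-1.0, 1.0)
        for fr in range(N_RES)
        for fs in range(N_SS)
        for fb in range(N_BURIAL)
        for pr in range(N_RES)
        for ps in range(N_SS)
    }


def align_score(probe, fold, matrix, gap=-1.0):
    """Global alignment score by dynamic programming (Needleman-Wunsch).

    probe: list of (res_class, predicted_ss_class) per sequence position
    fold:  list of (res_class, ss_class, burial_class) per fold position
    gap:   linear gap penalty (hypothetical value, not from the paper)
    """
    n, m = len(probe), len(fold)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap  # probe prefix aligned entirely to gaps
    for j in range(1, m + 1):
        dp[0][j] = j * gap  # fold prefix aligned entirely to gaps
    for i in range(1, n + 1):
        pr, ps = probe[i - 1]
        for j in range(1, m + 1):
            fr, fs, fb = fold[j - 1]
            match = dp[i - 1][j - 1] + matrix[(fr, fs, fb, pr, ps)]
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


if __name__ == "__main__":
    matrix = make_placeholder_matrix()
    print(len(matrix))  # one entry per 3D-1D class combination
    probe = [(0, 0), (3, 1), (6, 2)]
    fold = [(0, 0, 1), (3, 1, 0), (5, 2, 1)]
    print(align_score(probe, fold, matrix))
```

In a fold recognition setting, a score of this kind would be computed for the probe against every structure in the fold library and the library would then be ranked by score; how the scores are normalized for ranking is not specified in the abstract.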


Year:  1997        PMID: 9135128     DOI: 10.1006/jmbi.1997.0924

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


