A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.

Abstract

In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. For cases where the probe is distant in sequence from its homolog, we have developed a (7 x 3 x 2 x 7 x 3) 3D-1D substitution matrix (called H3P2), calculated from a database of 119 structural pairs. Members of each pair share a similar fold but have sequence identity less than 30%. Each probe sequence position is defined by one of seven residue classes and one of three secondary structure classes. Each homologous fold position is defined by one of seven residue classes, one of three secondary structure classes, and one of two burial classes. Thus the matrix is five-dimensional and contains 7 x 3 x 2 x 7 x 3 = 882 elements, or 3D-1D scores. The first step in assigning a probe sequence to its homologous fold is the prediction of the three-state (helix, strand, coil) secondary structure of the probe; here we use the profile-based neural network prediction of secondary structure (PHD) program. Then a dynamic programming algorithm uses the H3P2 matrix to align the probe sequence with structures in a representative fold library. To test the effectiveness of the H3P2 matrix, a challenging, fold-class-diverse, cross-validated benchmark assessment is used to compare it to the GONNET, PAM250, and BLOSUM62 matrices and to a secondary-structure-only substitution matrix. For distantly related sequences, the H3P2 matrix detects more homologous structures at higher reliabilities than do these other substitution matrices, based on sensitivity versus specificity plots (SENS-SPEC plots). The added efficacy of the H3P2 matrix arises from its information on the statistical preferences for various sequence-structure environment combinations from very distantly related proteins. It introduces the predicted secondary structure information from a sequence into fold recognition in a statistical way that normalizes the inherent correlations between residue type, secondary structure and solvent accessibility.
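
The indexing and alignment steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The class counts (7, 3, 2, 7, 3) come from the abstract; everything else is an assumption made for illustration: the axis order (fold residue, fold secondary structure, fold burial, probe residue, probe secondary structure), the integer encoding of classes, the placeholder scores in the hypothetical toy_matrix, and the choice of global alignment with a linear gap penalty, since the abstract says only "a dynamic programming algorithm".

```python
from itertools import product

# Class counts from the abstract. The axis order below is an assumption.
N_RES = 7     # residue classes (fold and probe)
N_SS = 3      # secondary structure classes: helix, strand, coil
N_BURIAL = 2  # burial classes (fold positions only)
SHAPE = (N_RES, N_SS, N_BURIAL, N_RES, N_SS)  # 7 * 3 * 2 * 7 * 3 = 882 cells


def flat_index(fold_pos, probe_pos):
    """Map (fold residue, fold SS, fold burial) and (probe residue, probe SS)
    to a single row-major index into a flat list of 882 scores."""
    values = (*fold_pos, *probe_pos)
    if len(values) != len(SHAPE):
        raise ValueError("expected 3 fold classes and 2 probe classes")
    idx = 0
    for value, size in zip(values, SHAPE):
        if not 0 <= value < size:
            raise ValueError(f"class {value} out of range for axis of size {size}")
        idx = idx * size + value
    return idx


def toy_matrix():
    """Placeholder scores, NOT the published H3P2 values: reward matching
    residue class and matching secondary structure class, ignore burial."""
    n_cells = 1
    for size in SHAPE:
        n_cells *= size
    matrix = [0.0] * n_cells
    for f_res, f_ss, f_bur, p_res, p_ss in product(*(range(s) for s in SHAPE)):
        score = (1.0 if f_res == p_res else -0.5) + (1.0 if f_ss == p_ss else -0.5)
        matrix[flat_index((f_res, f_ss, f_bur), (p_res, p_ss))] = score
    return matrix


def align_score(probe, fold, matrix, gap=-1.0):
    """Global alignment score of a probe against one fold.

    probe: list of (residue class, predicted SS class) per sequence position
    fold:  list of (residue class, SS class, burial class) per structure position
    gap:   linear gap penalty (an assumed value, not taken from the paper)
    """
    n, m = len(probe), len(fold)
    # dp[i][j] holds the best score for aligning probe[:i] with fold[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = matrix[flat_index(fold[j - 1], probe[i - 1])]
            dp[i][j] = max(dp[i - 1][j - 1] + pair,  # align the two positions
                           dp[i - 1][j] + gap,       # probe position unaligned
                           dp[i][j - 1] + gap)       # fold position unaligned
    return dp[n][m]
```

In a fold recognition setting, align_score would be called once per structure in the fold library and the library ranked by score. The probe's secondary structure classes would come from a predictor such as PHD, and the fold's classes from the known structure.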

Year: 1997; PMID: 9135128; DOI: 10.1006/jmbi.1997.0924

Source DB: PubMed; Journal: J Mol Biol; ISSN: 0022-2836; Impact factor: 5.469

Cited

41 in total

1. Detection of protein fold similarity based on correlation of amino acid properties.

2. Environment-dependent residue contact energies for proteins.

3. Factors limiting the performance of prediction-based fold recognition methods.

4. Motif-based fold assignment.

5. Improved detection of homologous membrane proteins by inclusion of information from topology predictions.

6. Pcons: a neural-network-based consensus predictor that improves fold recognition.

7. Thermodynamic propensities of amino acids in the native state ensemble: implications for fold recognition.

8. Use of residue pairs in protein sequence-sequence and sequence-structure alignments.

9. Enhanced protein fold recognition using secondary structure information from NMR.

10. The directional atomic solvation energy: an atom-based potential for the assignment of protein sequences to known folds.

Authors: Parag Mallick; Robert Weiss; David Eisenberg
Journal: Proc Natl Acad Sci U S A; Date: 2002-12-02; Impact factor: 11.205