
Protein secondary structure prediction using nearest-neighbor methods.

T M Yi, E S Lander.

Abstract

We have studied the use of nearest-neighbor classifiers to predict the secondary structure of proteins. The nearest-neighbor rule states that a test instance is classified according to the classifications of "nearby" training examples from a database of known structures. In the context of secondary structure prediction, the test instances are windows of n consecutive residues, and the label is the secondary structure type (alpha-helix, beta-strand, or coil) of the center position of the window. To define the neighborhood of a test instance, we employed a novel similarity metric based on the local structural environment scoring scheme of Bowie et al. In this manner, we have attempted to exploit the underlying structural similarity between segments of different proteins to aid in the prediction of secondary structure. Furthermore, in addition to using neighborhoods of fixed radius, we explored a modification of the standard nearest-neighbor algorithm that involved defining an "effective radius" for each exemplar by measuring its performance on a training set. Using these ideas, we achieved a peak prediction accuracy of 68%. Finally, we sought to improve the biological utility of secondary structure prediction by identifying the subset of the predictions that are most likely to be correct. Toward this end, we developed a nearest-neighbor estimator that produced not the traditional "one-state" prediction (alpha-helix, beta-strand, or coil) but rather a probability distribution over the three states. It should be emphasized that this scheme estimates true probability values and that the resulting numbers are not pseudo-probability scores generated by simple normalization of the raw output of the predictor. Applying the mutual information statistic, we found that these probability triplets possess 58% more information than the one-state predictions. Furthermore, the probability estimates allow one to assign an a priori confidence level to the prediction at each residue. 
Using this approach, we found that the top 28% of the predictions were 86% accurate and the top 43% of the predictions were 81% accurate. These results indicate that, notwithstanding the limitations on overall accuracy of secondary structure prediction, a substantial proportion of a protein can be predicted with considerable accuracy.
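The windowed nearest-neighbor scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the similarity function is a plain residue-identity count standing in for the metric based on the environment scoring scheme of Bowie et al.; the three-state distribution is simply the label frequency among the k nearest windows, which is cruder than the probability estimator the abstract describes; the window length n is assumed to be odd; and the function names (`windows`, `identity_similarity`, `predict_distribution`, `most_confident`) and the one-letter state codes H, E, C are hypothetical.

```python
from collections import Counter

# Assumed one-letter codes: H = alpha-helix, E = beta-strand, C = coil.
STATES = ("H", "E", "C")


def windows(seq, labels, n):
    """Return (window, center_label) pairs for every full window of n residues.

    n is assumed to be odd so that each window has a single center position.
    """
    half = n // 2
    return [
        (seq[i - half:i + half + 1], labels[i])
        for i in range(half, len(seq) - half)
    ]


def identity_similarity(a, b):
    """Placeholder metric (assumption): count positions with identical residues."""
    return sum(x == y for x, y in zip(a, b))


def predict_distribution(window, training, k=3, similarity=identity_similarity):
    """Estimate a distribution over the three states for the window's center.

    The estimate is the label frequency among the k most similar training
    windows; this is a simple stand-in, not the estimator from the paper.
    """
    ranked = sorted(training, key=lambda ex: similarity(window, ex[0]), reverse=True)
    counts = Counter(label for _, label in ranked[:k])
    return {state: counts[state] / k for state in STATES}


def most_confident(distributions, fraction):
    """Keep the given fraction of positions whose top probability is highest.

    Returns (position, predicted_state, probability) tuples, most confident first.
    """
    calls = [
        (max(d.values()), i, max(d, key=d.get))
        for i, d in enumerate(distributions)
    ]
    calls.sort(key=lambda c: c[0], reverse=True)
    keep = max(1, int(fraction * len(calls)))
    return [(i, state, p) for p, i, state in calls[:keep]]


if __name__ == "__main__":
    # Toy training data, invented purely for illustration.
    training = windows("AAAAAGGGGGVVVVV", "HHHHHCCCCCEEEEE", 5)
    queries = ["AAAAA", "GGGVV", "VVVVV"]
    dists = [predict_distribution(q, training, k=3) for q in queries]
    for q, d in zip(queries, dists):
        print(q, d)
    print(most_confident(dists, 0.67))
```

Ranking positions by their highest state probability and keeping only the top fraction mirrors the idea of reporting the subset of predictions most likely to be correct; any accuracy obtained this way depends entirely on the similarity metric and the training database used.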

Year: 1993    PMID: 8371270    DOI: 10.1006/jmbi.1993.1464

Source DB: PubMed    Journal: J Mol Biol    ISSN: 0022-2836    Impact factor: 5.469


