Cross-validation of protein structural class prediction using statistical clustering and neural networks.

B A Metfessel, P N Saurugger, D P Connelly, S S Rich.

Abstract

We present an approach to predicting protein structural class that uses amino acid composition and hydrophobic pattern frequency information as input to two types of neural networks: (1) a three-layer back-propagation network and (2) a learning vector quantization network. The results of these methods are compared to those obtained from a modified Euclidean statistical clustering algorithm. The protein sequence data used to drive these algorithms consist of the normalized frequency of up to 20 amino acid types and six hydrophobic amino acid patterns. From these frequency values the structural class predictions for each protein (all-alpha, all-beta, or alpha-beta classes) are derived. Examples consisting of 64 previously classified proteins were randomly divided into multiple training (56 proteins) and test (8 proteins) sets. The best performing algorithm on the test sets was the learning vector quantization network using 17 inputs, obtaining a prediction accuracy of 80.2%. The Matthews correlation coefficients are statistically significant for all algorithms and all structural classes. The differences between algorithms are in general not statistically significant. These results show that information exists in protein primary sequences that is easily obtainable and useful for the prediction of protein structural class by neural networks as well as by standard statistical clustering algorithms.
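As a rough illustration of two quantities named in the abstract, the sketch below computes a normalized amino acid composition vector for a sequence and a one-vs-rest Matthews correlation coefficient for a single structural class. It is not the authors' implementation: the function names `composition` and `matthews_cc`, the residue ordering, and the class label strings are assumptions made for this example. The six hydrophobic patterns are left out because the abstract does not define them.

```python
import math
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the normalized frequency of each standard amino acid."""
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def matthews_cc(true_labels, predicted_labels, positive):
    """One-vs-rest Matthews correlation coefficient for class `positive`."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical usage: a 20-element feature vector and a per-class score.
features = composition("MKVLAAGIVGLLLAQ")
truth = ["all-alpha", "all-alpha", "all-beta", "alpha-beta"]
preds = ["all-alpha", "all-beta", "all-beta", "alpha-beta"]
score = matthews_cc(truth, preds, positive="all-alpha")
```

A vector like `features` would be one input row for a classifier, and `matthews_cc` would be evaluated once per structural class on each held-out test set.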

Year: 1993  PMID: 8358300  PMCID: PMC2142422  DOI: 10.1002/pro.5560020712

Source DB: PubMed  Journal: Protein Sci  ISSN: 0961-8368  Impact factor: 6.725

