
Prediction of secondary structural content of proteins from their amino acid composition alone. II. The paradox with secondary structural class.

F Eisenhaber, C Frömmel, P Argos.

Abstract

The success rates reported for secondary structural class prediction with different methods are contradictory. On the one hand, the problem of recognizing the secondary structural class of a protein knowing only its amino acid composition appears completely solved by simply applying a jury decision with an elliptically scaled distance function. Chou and coworkers have repeatedly published prediction accuracies near 100% (see Crit. Rev. Biochem. Mol. Biol. 30:275-349, 1995). On the other hand, traditional secondary structure prediction techniques achieve success rates of about 70% for the secondary structural state per residue and about 75% for structural class only with extensive input information (full sequence of the query protein, its amino acid composition and length, multiple alignments with homologous sequences). In this article, we resolve the paradox and consider (1) the definition of secondary structural class; (2) how representative the test set of protein tertiary structures is of the current state of the Protein Data Bank (PDB); and (3) the real impact of amino acid composition on secondary structural class. We formulate three objective criteria for a reasonable definition of secondary structural classes and show that only the criterion of Nakashima et al. (J. Biochem. 99:153-162, 1986) complies with all of them. Only this definition matches the distribution of secondary structural content in representative PDB subsets, whereas other criteria leave many proteins (up to 65% of all PDB entries) simply unassigned. We critically review specialized secondary structural class prediction methods, especially those of Chou and coworkers, which claim almost 100% accuracy using only amino acid composition, and resolve the paradox that these prediction accuracies are better than those obtained from secondary structure predictions based on multiple alignments.
We show (i) that these techniques rely on a preselection of test sets which removes irregular proteins and other proteins without any class assignment (about 35% of all PDB entries); and (ii) that even for preselected representative test sets, the success rate drops to 60% or lower for a 4-type classification (alpha, beta, alpha + beta, alpha/beta). The prediction accuracies fall to about 50% if the secondary structural class definition of Nakashima et al. is applied and only a few irregular proteins are preselected and removed from automatically generated, representative subsets of the PDB. We have applied two new vector decomposition methods for secondary structural content prediction from amino acid composition alone, with and without consideration of amino acid compositional coupling in the learning set of tertiary structures, respectively, to the problem of class prediction and achieve about 60% correct assignment among four classes (alpha, beta, mixed, irregular) as well as single sequence-based secondary structure prediction methods like GORIII and COMBI. Our results demonstrate that 60% correctness is the upper limit for a 4-type class prediction from amino acid composition alone for an unknown query protein and that consideration of compositional coupling does not improve the prediction success. The prediction program SSCP, which offers secondary structural class assignment for query compositions and sequences, has been made available as a World Wide Web and E-mail service.
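The two quantities the abstract works with, the amino acid composition of a query sequence and a four-way class assignment (alpha, beta, mixed, irregular) derived from secondary structural content, can be illustrated with a minimal Python sketch. The function names and the cutoff values below are hypothetical placeholders chosen for illustration; they are not the thresholds of Nakashima et al. and not the SSCP implementation. The sketch only shows that a definition built from two cutoffs assigns every protein to some class, so none is left unassigned.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fractional frequency of each standard amino acid.

    Non-standard letters are ignored; the 20 fractions sum to 1.
    """
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def structural_class(helix_fraction, strand_fraction,
                     helix_cutoff=0.15, strand_cutoff=0.10):
    """Assign one of four classes from secondary structural content.

    The default cutoffs are illustrative assumptions, not values taken
    from the source. Every (helix, strand) pair maps to exactly one class.
    """
    has_helix = helix_fraction >= helix_cutoff
    has_strand = strand_fraction >= strand_cutoff
    if has_helix and has_strand:
        return "mixed"
    if has_helix:
        return "alpha"
    if has_strand:
        return "beta"
    return "irregular"
```

A predictor of the kind discussed here would first estimate the helix and strand fractions from the composition vector and then apply a class definition such as `structural_class` to those estimates.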

Year: 1996    PMID: 8811733    DOI: 10.1002/(SICI)1097-0134(199606)25:2<169::AID-PROT3>3.0.CO;2-D

Source DB: PubMed    Journal: Proteins    ISSN: 0887-3585

