Prediction of secondary structural content of proteins from their amino acid composition alone. II. The paradox with secondary structural class.

Abstract

The success rates reported for secondary structural class prediction with different methods are contradictory. On the one hand, the problem of recognizing the secondary structural class of a protein knowing only its amino acid composition appears completely solved by simply applying a jury decision with an elliptically scaled distance function. Chou and coworkers have repeatedly (see Crit. Rev. Biochem. Mol. Biol. 30:275-349, 1995) published prediction accuracies near 100%. On the other hand, traditional secondary structure prediction techniques achieve success rates of about 70% for the secondary structural state per residue and about 75% for structural class only with extensive input information (full sequence of the query protein, its amino acid composition and length, multiple alignments with homologous sequences). In this article, we resolve the paradox and consider (1) the question of the secondary structural class definition, (2) the role of the representativity of the test set of protein tertiary structures for the current state of the Protein Data Bank (PDB), and (3) the real impact of amino acid composition on secondary structural class. We formulate three objective criteria for a reasonable definition of secondary structural classes and show that only the criterion of Nakashima et al. (J. Biochem. 99:153-162, 1986) complies with all of them. Only this definition matches the distribution of secondary structural content in representative PDB subsets, whereas other criteria leave many proteins (up to 65% of all PDB entries) simply unassigned. We critically review specialized secondary structural class prediction methods, especially those of Chou and coworkers, which claim almost 100% accuracy using only amino acid composition, and resolve the paradox that these prediction accuracies are better than those from secondary structure predictions from multiple alignments. We show (i) that these techniques rely on a preselection of test sets which removes irregular proteins and other proteins without any class assignment (about 35% of all PDB entries); and (ii) that even for preselected representative test sets, the success rate drops to 60% and lower for a 4-type classification (alpha, beta, alpha + beta, alpha/beta). The prediction accuracies fall to about 50% if the secondary structural class definition of Nakashima et al. is applied and only a few irregular proteins are preselected and removed from automatically generated, representative subsets of the PDB. We have applied two new vector decomposition methods for secondary structural content prediction from amino acid composition alone, with and without consideration of amino acid compositional coupling in the learning set of tertiary structures, respectively, to the problem of class prediction and achieve about 60% correct assignment among four classes (alpha, beta, mixed, irregular) as well as single sequence-based secondary structure prediction methods like GORIII and COMBI. Our results demonstrate that 60% correctness is the upper limit for a 4-type class prediction from amino acid composition alone for an unknown query protein and that consideration of compositional coupling does not improve the prediction success. The prediction program SSCP, which offers secondary structural class assignment for query compositions and sequences, has been made available as a World Wide Web and E-mail service.
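The two quantities the abstract relies on, the amino acid composition of a sequence and a class label derived from helix and strand content, can be illustrated with a minimal Python sketch. The function names and the cutoff values (`helix_cut`, `strand_cut`) are illustrative assumptions and are not taken from the abstract; the sketch only shows that a two-threshold, four-class scheme (alpha, beta, mixed, irregular) assigns every protein to some class, so none is left unassigned. It is not the SSCP program or any of the prediction methods discussed.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard amino acid in the sequence.

    Letters outside the 20 standard codes are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acid letters")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def assign_class(helix_fraction: float, strand_fraction: float,
                 helix_cut: float = 0.15, strand_cut: float = 0.10) -> str:
    """Assign one of four classes from secondary structural content.

    The default cutoffs are assumptions chosen for illustration only.
    Every (helix, strand) pair falls into exactly one class.
    """
    has_helix = helix_fraction >= helix_cut
    has_strand = strand_fraction >= strand_cut
    if has_helix and has_strand:
        return "mixed"
    if has_helix:
        return "alpha"
    if has_strand:
        return "beta"
    return "irregular"


if __name__ == "__main__":
    comp = composition("AAGG")
    print(comp["A"], comp["G"])
    print(assign_class(0.40, 0.05))
```

Passing the cutoffs as parameters keeps the class definition explicit, which matters here because the abstract shows that reported accuracies depend strongly on which definition is used.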

Substances: Amino Acids

Year: 1996 PMID: 8811733 DOI: 10.1002/(SICI)1097-0134(199606)25:2<169::AID-PROT3>3.0.CO;2-D

Source DB: PubMed Journal: Proteins ISSN: 0887-3585

Cited

15 in total

1. The high-molecular-weight cytochrome c Cyc2 of Acidithiobacillus ferrooxidans is an outer membrane protein.

Authors: Andrés Yarzábal; Gaël Brasseur; Jeanine Ratouchniak; Karen Lund; Danielle Lemesle-Meunier; John A DeMoss; Violaine Bonnefoy
Journal: J Bacteriol Date: 2002-01 Impact factor: 3.490

2. Predicting flexible length linear B-cell epitopes.

Authors: Yasser El-Manzalawy; Drena Dobbs; Vasant Honavar
Journal: Comput Syst Bioinformatics Conf Date: 2008

3. Using neural networks for prediction of the subcellular location of proteins.

Authors: A Reinhardt; T Hubbard
Journal: Nucleic Acids Res Date: 1998-05-01 Impact factor: 16.971

4. Anti-proliferative effect on a colon adenocarcinoma cell line exerted by a membrane disrupting antimicrobial peptide KL15.

Authors: Yu-Ching Chen; Tsung-Lin Tsai; Xin-Hong Ye; Thy-Hou Lin
Journal: Cancer Biol Ther Date: 2015 Impact factor: 4.742

5. Characterization of protein secondary structure from NMR chemical shifts.

6. Fold homology detection using sequence fragment composition profiles of proteins.

7. Proteomics in Vaccinology and Immunobiology: An Informatics Perspective of the Immunone.

8. Information-theoretic analysis of the reference state in contact potentials used for protein structure prediction.

9. The Trypanosoma brucei MitoCarta and its regulation and splicing pattern during development.

10. CapZ-lipid membrane interactions: a computer analysis.