Prediction of secondary structural content of proteins from their amino acid composition alone. I. New analytic vector decomposition methods.

F Eisenhaber, F Imperiale, P Argos, C Frömmel.

Abstract

The predictive limits of the amino acid composition for the secondary structural content (percentage of residues in the secondary structural states helix, sheet, and coil) in proteins are assessed quantitatively. For the first time, techniques for prediction of secondary structural content are presented which rely on the amino acid composition as the only information on the query protein. In our first method, the amino acid composition of an unknown protein is represented by the best (in a least-squares sense) linear combination of the characteristic amino acid compositions of the three secondary structural types computed from a learning set of tertiary structures. The second technique is a generalization of the first one and also takes into account possible compositional couplings between any two amino acid types. Its mathematical formulation results in an eigenvalue/eigenvector problem of the second moment matrix describing the amino acid compositional fluctuations of secondary structural types in various proteins of a learning set. Possible correlations of the principal directions of the eigenspaces with physical properties of the amino acids were also checked. For example, the first two eigenvectors of the helical eigenspace correlate with the size and hydrophobicity of the residue types, respectively. As learning and test sets of tertiary structures, we utilized representative, automatically generated subsets of the Protein Data Bank (PDB) consisting of non-homologous protein structures at the resolution thresholds ≤ 1.8 Å, ≤ 2.0 Å, ≤ 2.5 Å, and ≤ 3.0 Å. We show that the consideration of compositional couplings improves prediction accuracy, albeit not dramatically.
Whereas in the self-consistency test (learning with the protein to be predicted) a clear decrease of prediction accuracy with worsening resolution is observed, the jackknife test (leave the predicted protein out) yielded the best results for the largest dataset (≤ 3.0 Å, almost no difference from the self-consistency test!), i.e., only this set, with more than 400 proteins, is sufficient for stable computation of the parameters in the prediction function of the second method. The average absolute errors in predicting the fractions of helix, sheet, and coil from the amino acid composition of the query protein are 13.7, 12.6, and 11.4%, respectively, with r.m.s. deviations in the range of 8.6 to 11.8% for the 3.0 Å dataset in a jackknife test. The absolute precision of the average absolute errors is in the range of 1 to 3% as measured for other representative subsets of the PDB. Secondary structural content prediction methods found in the literature have been clustered in accordance with their prediction accuracies. To our surprise, much more complex secondary structure prediction methods utilized for the same purpose of secondary structural content prediction achieve prediction accuracies very similar to those of the present analytic techniques, implying that all the information beyond the amino acid composition is, in fact, mainly utilized for positioning the secondary structural state in the sequence but not for determination of the overall number of residues in a secondary structural type. This result implies that higher prediction accuracies cannot be achieved relying solely on the amino acid composition of an unknown query protein as prediction input. Our prediction program SSCP has been made available as a World Wide Web and E-mail service.
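
The first method described in the abstract can be illustrated with a minimal sketch. It is not the SSCP implementation: the function names (`composition`, `solve_linear`, `predict_content`) and the toy sequences are assumptions made for illustration, and the abstract does not say how the authors constrain or normalize the coefficients. The sketch only shows the core idea: the 20-component composition of a query is approximated, in the least-squares sense, by a linear combination of three characteristic compositions (helix, sheet, coil), and the fitted coefficients are read as the secondary structural content.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types


def composition(sequence):
    """Return the 20-component amino acid composition (fractions) of a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def solve_linear(a, b):
    """Solve a small dense system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("characteristic compositions are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def predict_content(query, characteristic):
    """Least-squares weights w_k such that query ~ sum_k w_k * characteristic[k].

    `characteristic` maps a state name to its 20-component composition.
    The weights are unconstrained here; they may not sum to exactly 1.
    """
    states = list(characteristic)
    basis = [characteristic[s] for s in states]
    gram = [[dot(u, v) for v in basis] for u in basis]  # normal equations
    rhs = [dot(u, query) for u in basis]
    return dict(zip(states, solve_linear(gram, rhs)))


if __name__ == "__main__":
    # Toy characteristic compositions, invented for illustration only.
    toy = {
        "helix": composition("AAEELLKKMQ"),
        "sheet": composition("VVIIYYFFTW"),
        "coil": composition("GGPPSSNNDG"),
    }
    query = composition("AELKVIYGPSNDAELK")
    for state, weight in predict_content(query, toy).items():
        print(f"{state}: {weight:.3f}")
```

In practice the characteristic compositions would be computed from a learning set of known tertiary structures, as the abstract states. Solving the 3×3 normal equations directly keeps the sketch dependency-free; a library least-squares routine would serve equally well.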

Year: 1996    PMID: 8811732    DOI: 10.1002/(SICI)1097-0134(199606)25:2<157::AID-PROT2>3.0.CO;2-F

Source DB: PubMed    Journal: Proteins    ISSN: 0887-3585

