
Predicting protein secondary structure with probabilistic schemata of evolutionarily derived information.

M J Thompson, R A Goldstein.

Abstract

We demonstrate the applicability of our previously developed Bayesian probabilistic approach for predicting residue solvent accessibility to the problem of predicting secondary structure. Using only single-sequence data, this method achieves a three-state accuracy of 67% over a database of 473 non-homologous proteins. This approach is more amenable to inspection and less likely to overlearn specifics of a dataset than "black box" methods such as neural networks. It is also conceptually simpler and less computationally costly. We also introduce a novel method for representing and incorporating multiple-sequence alignment information within the prediction algorithm, achieving 72% accuracy over a dataset of 304 non-homologous proteins. This is accomplished by creating a statistical model of the evolutionarily derived correlations between patterns of amino acid substitution and local protein structure. This model consists of parameter vectors, termed "substitution schemata," which probabilistically encode the structure-based heterogeneity in the distributions of amino acid substitutions found in alignments of homologous proteins. The model is optimized for structure prediction by maximizing the mutual information between the set of schemata and the database of secondary structures. Unlike "expert heuristic" methods, this approach has been demonstrated to work well over large datasets. Unlike the opaque neural network algorithms, this approach is physicochemically intelligible. Moreover, the model optimization procedure, the formalism for predicting one-dimensional structural features and our previously developed method for tertiary structure recognition all share a common Bayesian probabilistic basis. This consistency starkly contrasts with the hybrid and ad hoc nature of methods that have dominated this field in recent years.
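The two quantities the abstract relies on, three-state accuracy and the mutual information between schemata and secondary structures, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each residue position has been given a single hard schema label, whereas the abstract describes schemata that encode substitution distributions probabilistically. The function names `q3_accuracy` and `mutual_information`, the integer schema identifiers, and the H/E/C state alphabet are all illustrative assumptions.

```python
from collections import Counter
from math import log2


def q3_accuracy(predicted, observed):
    """Fraction of residues whose three-state label (e.g. H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if len(observed) == 0:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def mutual_information(schema_ids, states):
    """Mutual information (bits) between paired discrete labels.

    schema_ids: hypothetical hard schema assignment per position (assumption;
                the abstract describes probabilistic schemata).
    states:     secondary-structure state per position.
    """
    if len(schema_ids) != len(states):
        raise ValueError("label sequences must have equal length")
    n = len(states)
    if n == 0:
        raise ValueError("label sequences must be non-empty")
    joint = Counter(zip(schema_ids, states))
    schema_counts = Counter(schema_ids)
    state_counts = Counter(states)
    mi = 0.0
    for (k, s), c in joint.items():
        # p(k,s) * log2( p(k,s) / (p(k) p(s)) ), written with raw counts
        mi += (c / n) * log2(c * n / (schema_counts[k] * state_counts[s]))
    return mi


if __name__ == "__main__":
    print(q3_accuracy("HHEC", "HHCC"))
    print(mutual_information([0, 0, 1, 1], "HHEE"))
```

In this simplified picture, a schema set whose labels determine the structural state carries high mutual information, and one whose labels are independent of structure carries none. The abstract states that the model is optimized by maximizing this quantity over the database.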


Year: 1997; PMID: 9300496; PMCID: PMC2143796; DOI: 10.1002/pro.5560060917

Source DB: PubMed; Journal: Protein Sci; ISSN: 0961-8368; Impact factor: 6.725


