A Consensus Data Mining secondary structure prediction by combining GOR V and Fragment Database Mining.

Taner Z Sen, Haitao Cheng, Andrzej Kloczkowski, Robert L Jernigan.

Abstract

The major aim of tertiary structure prediction is to obtain protein models with the highest possible accuracy. Fold recognition, homology modeling, and de novo prediction methods typically use predicted secondary structures as input, and all of these methods may benefit significantly from more accurate secondary structure predictions. Although many secondary structure prediction methods are available in the literature, their cross-validated prediction accuracy is generally below 80%. To increase the prediction accuracy, we developed a novel hybrid algorithm called Consensus Data Mining (CDM) that combines our two previous successful methods: (1) Fragment Database Mining (FDM), which exploits the Protein Data Bank (PDB) structures, and (2) GOR V, which is based on information theory, Bayesian statistics, and multiple sequence alignments (MSA). In CDM, the target sequence is dissected into smaller fragments that are compared with fragments obtained from related sequences in the PDB. For fragments whose sequence identity exceeds a chosen threshold, the FDM method is applied for the prediction. The remaining fragments are predicted by GOR V. The CDM results are reported as a function of the upper sequence identity limit of the aligned fragments and of the sequence identity threshold. We observe that 50% is the optimum sequence identity threshold, and that the accuracy of the CDM method, measured by Q3, ranges from 67.5% to 93.2%, depending on the availability of known structural fragments with sufficiently high sequence identity. As the Protein Data Bank grows, this consensus method is expected to improve because it will rely more on the structural fragments.
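The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that FDM and GOR V predictions are already available as three-state strings (H, E, C), that the best fragment sequence identity is known for each residue as a percentage, and that the choice is made per residue rather than per fragment. The function names `consensus_predict` and `q3` and all input values are hypothetical.

```python
from typing import Sequence


def consensus_predict(
    fdm_pred: str,
    gor_pred: str,
    best_identity: Sequence[float],
    threshold: float = 50.0,
) -> str:
    """Pick the FDM state where fragment identity exceeds the threshold,
    otherwise fall back to the GOR V state.

    Assumption: `best_identity` holds, per residue, the highest sequence
    identity (in percent) of any aligned PDB fragment covering that residue.
    """
    if not (len(fdm_pred) == len(gor_pred) == len(best_identity)):
        raise ValueError("all inputs must have the same length")
    return "".join(
        f if ident > threshold else g
        for f, g, ident in zip(fdm_pred, gor_pred, best_identity)
    )


def q3(predicted: str, observed: str) -> float:
    """Three-state accuracy: percentage of residues whose state is correct."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Made-up toy data, for illustration only.
fdm = "HHHHEEEECC"
gor = "HHCCEEECCC"
identity = [90, 90, 90, 90, 30, 30, 30, 30, 60, 60]
observed = "HHHHEEEECC"

consensus = consensus_predict(fdm, gor, identity)
print(consensus)                # HHHHEEECCC
print(q3(consensus, observed))  # 90.0
```

In this toy case, residues 5 to 8 have no fragment above the 50% threshold, so they take the GOR V states. One of them is wrong, which gives a Q3 of 90%.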

Year:  2006        PMID: 17001039      PMCID: PMC2242411          DOI: 10.1110/ps.062125306

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


