
A comprehensive assessment of sequence-based and template-based methods for protein contact prediction.

Sitao Wu, Yang Zhang.

Abstract

MOTIVATION: Pair-wise residue-residue contacts in proteins can be predicted from both threading templates and sequence-based machine learning. However, most structure modeling approaches use only the template-based contact predictions to guide the simulations, partly because sequence-based contact predictions are usually considered less accurate than those from threading. With the rapid progress in sequence databases and machine-learning techniques, a detailed and comprehensive assessment of contact-prediction methods under different template conditions is needed.
RESULTS: We develop two methods for protein contact prediction: SVM-SEQ is a sequence-based machine-learning approach that is trained on contact maps using a variety of sequence-derived features; SVM-LOMETS collects consensus contact predictions from multiple threading templates. We test both methods on the same set of 554 proteins, which are categorized into 'Easy', 'Medium', 'Hard' and 'Very Hard' targets based on the evolutionary and structural distance between templates and targets. For the Easy and Medium targets, SVM-LOMETS clearly outperforms SVM-SEQ; but for the Hard and Very Hard targets, the accuracy of the SVM-SEQ predictions is higher than that of SVM-LOMETS by 12-25%. If the SVM-SEQ and SVM-LOMETS predictions are combined, the total number of correctly predicted contacts in the Hard proteins increases by more than 60% (or 70% for long-range contacts with a sequence separation ≥24), compared with SVM-LOMETS alone. The advantage of SVM-SEQ is also shown in the CASP7 free modeling targets, where SVM-SEQ is around four times more accurate than SVM-LOMETS in long-range contact prediction. These data demonstrate that state-of-the-art sequence-based contact prediction has reached a level that may be helpful in assisting tertiary structure modeling for targets that do not have close structure templates. The maximum yield should be obtained by combining both sequence- and template-based predictions.
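
The quantities compared in the abstract (prediction accuracy, long-range contacts with sequence separation ≥24, and the gain in correctly predicted contacts after combining two predictors) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy contact lists are invented, the function names are hypothetical, accuracy is taken to be the fraction of predicted contacts found in the native structure, and the combination is a plain set union, which may differ from the rule actually used by the authors.

```python
from typing import Iterable, Set, Tuple

Contact = Tuple[int, int]  # residue index pair stored as (smaller, larger)


def normalize(pairs: Iterable[Contact]) -> Set[Contact]:
    """Store each contact once as (smaller index, larger index)."""
    return {(min(i, j), max(i, j)) for i, j in pairs}


def long_range(pairs: Set[Contact], min_sep: int = 24) -> Set[Contact]:
    """Keep contacts whose sequence separation |i - j| is at least min_sep."""
    return {(i, j) for i, j in pairs if j - i >= min_sep}


def accuracy(predicted: Set[Contact], native: Set[Contact]) -> float:
    """Fraction of predicted contacts that are also native contacts."""
    if not predicted:
        return 0.0
    return len(predicted & native) / len(predicted)


# Toy data, invented for illustration only (not taken from the paper).
native = normalize([(1, 30), (2, 40), (5, 9), (10, 50), (12, 60)])
seq_pred = normalize([(1, 30), (10, 50), (3, 70)])   # stand-in for a sequence-based predictor
tmpl_pred = normalize([(2, 40), (5, 9), (1, 30)])    # stand-in for a template-based predictor

# Assumed combination rule: simple union of the two prediction sets.
combined = seq_pred | tmpl_pred

correct_tmpl = len(tmpl_pred & native)
correct_combined = len(combined & native)
long_range_acc = accuracy(long_range(combined), long_range(native))

print(f"correct contacts, template only: {correct_tmpl}")
print(f"correct contacts, combined:      {correct_combined}")
print(f"long-range accuracy, combined:   {long_range_acc:.2f}")
```

A union can only add correct contacts relative to either predictor alone, but it can also add false positives, so the count of correct contacts and the accuracy need to be read together.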

Year:  2008        PMID: 18296462      PMCID: PMC2648832          DOI: 10.1093/bioinformatics/btn069

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


