Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning.

Jianzhu Ma, Sheng Wang, Zhiyong Wang, Jinbo Xu.

Abstract

MOTIVATION: Protein contact prediction is important for the study of protein structure and function. Both evolutionary coupling (EC) analysis and supervised machine learning methods have been developed, and they draw on different information sources. However, contact prediction remains challenging, especially for proteins without a large number of sequence homologs.
RESULTS: This article presents a group graphical lasso (GGL) method for contact prediction that integrates joint multi-family EC analysis and supervised learning to improve accuracy on proteins without many sequence homologs. Unlike existing single-family EC analysis, which uses residue coevolution information from the target protein family only, our joint EC analysis uses residue coevolution in both the target family and its related families, which may have divergent sequences but similar folds. To implement this, we model a set of related protein families using Gaussian graphical models and then co-estimate their parameters by maximum likelihood, subject to the constraint that these parameters be similar to some degree. Our GGL method can also integrate supervised learning methods to further improve accuracy. Experiments show that our method outperforms existing methods on proteins without thousands of sequence homologs, and that it performs better on both conserved and family-specific contacts.
AVAILABILITY AND IMPLEMENTATION: See http://raptorx.uchicago.edu/ContactMap/ for a web server implementing the method.
CONTACT: j3xu@ttic.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
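
The joint estimation described under RESULTS can be illustrated with a minimal sketch. The abstract does not give the objective function, so the code below assumes the generic group graphical lasso form: a Gaussian negative log-likelihood per family, an L1 penalty for sparsity, and a group penalty that ties the same residue pair together across families. It also assumes the families are already aligned to a shared set of columns and that each position is a single variable. The function name `ggl_objective` and the parameters `lam_sparse` and `lam_group` are hypothetical, chosen for illustration only.

```python
import numpy as np

def ggl_objective(covs, precs, lam_sparse, lam_group):
    """Penalized negative log-likelihood for K Gaussian graphical models.

    covs, precs: arrays of shape (K, p, p) holding, for each of K families,
    the empirical covariance and a candidate precision matrix.
    This is an assumed, generic form, not the paper's exact formulation.
    """
    covs = np.asarray(covs, dtype=float)
    precs = np.asarray(precs, dtype=float)
    p = covs.shape[1]
    off_diag = ~np.eye(p, dtype=bool)  # mask selecting pairs (i, j), i != j

    # Gaussian negative log-likelihood (up to constants), summed over families.
    nll = 0.0
    for S, omega in zip(covs, precs):
        sign, logdet = np.linalg.slogdet(omega)
        if sign <= 0:
            # A non-positive determinant rules out a valid precision matrix.
            return np.inf
        nll += np.trace(S @ omega) - logdet

    pair_values = precs[:, off_diag]  # shape (K, p * (p - 1))

    # L1 term: encourages each family's coupling matrix to be sparse.
    l1_penalty = np.abs(pair_values).sum()

    # Group term: L2 norm across families for each residue pair, so a pair
    # tends to be zero in all families or nonzero in all of them.
    group_penalty = np.sqrt((pair_values ** 2).sum(axis=0)).sum()

    return nll + lam_sparse * l1_penalty + lam_group * group_penalty
```

Setting `lam_group` to zero decouples the families and reduces the objective to K independent graphical lasso problems, which corresponds to the single-family setting the abstract contrasts against. Minimizing the objective would need a dedicated solver, which this sketch does not include.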

Year:  2015        PMID: 26275894      PMCID: PMC4838177          DOI: 10.1093/bioinformatics/btv472

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

