Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13.

Yang Li, Chengxin Zhang, Eric W Bell, Dong-Jun Yu, Yang Zhang.

Abstract

We report the results of residue-residue contact prediction of a new pipeline built purely on the learning of coevolutionary features in the CASP13 experiment. For a query sequence, the pipeline starts with the collection of multiple sequence alignments (MSAs) from multiple genome and metagenome sequence databases using two complementary Hidden Markov Model (HMM)-based searching tools. Three profile matrices, built on covariance, precision, and pseudolikelihood maximization respectively, are then created from the MSAs, which are used as the input features of a deep residual convolutional neural network architecture for contact-map training and prediction. Two ensembling strategies have been proposed to integrate the matrix features through end-to-end training and stacking, resulting in two complementary programs called TripletRes and ResTriplet, respectively. For the 31 free-modeling domains that do not have homologous templates in the PDB, TripletRes and ResTriplet generated comparable results with an average accuracy of 0.640 and 0.646, respectively, for the top L/5 long-range predictions, where 71% and 74% of the cases have an accuracy above 0.5. Detailed data analyses showed that the strength of the pipeline is due to the sensitive MSA construction and the advanced strategies for coevolutionary feature ensembling. Domain splitting was also found to help enhance the contact prediction performance. Nevertheless, contact models for tail regions, which often involve a high number of alignment gaps, and for targets with few homologous sequences are still suboptimal. Development of new approaches where the model is specifically trained on these regions and targets might help address these problems.
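The accuracy figures quoted above are for the top L/5 long-range predictions, where L is the sequence length. The sketch below shows one common way such a score is computed. It is an illustration only and not the authors' evaluation code. It assumes the usual CASP conventions, which the abstract does not state: a pair counts as long-range when the sequence separation |i - j| is at least 24, and the native contact map is given as a boolean matrix (typically derived from a C-beta distance cutoff of 8 Å). The function name and the toy data are hypothetical.

```python
import numpy as np


def top_l5_long_range_precision(prob, contact, min_sep=24):
    """Fraction of true contacts among the top L/5 long-range predictions.

    prob    : (L, L) array of predicted contact probabilities
    contact : (L, L) boolean array of native contacts
    min_sep : minimum |i - j| for a pair to count as long-range
              (24 is an assumed convention, not taken from the abstract)
    """
    prob = np.asarray(prob, dtype=float)
    contact = np.asarray(contact, dtype=bool)
    L = prob.shape[0]

    # Upper-triangle pairs (i, j) with j - i >= min_sep.
    i, j = np.triu_indices(L, k=min_sep)
    if i.size == 0:
        raise ValueError("sequence too short for any long-range pair")

    # Keep the L/5 highest-scoring long-range pairs (at least one).
    n_top = min(max(1, L // 5), i.size)
    order = np.argsort(-prob[i, j], kind="stable")[:n_top]

    return float(contact[i[order], j[order]].mean())


# Toy example with L = 30, so the top 30 // 5 = 6 pairs are scored.
L = 30
prob = np.zeros((L, L))
contact = np.zeros((L, L), dtype=bool)

# Three native long-range contacts.
for a, b in [(0, 25), (1, 26), (2, 27)]:
    contact[a, b] = True

# Six long-range predictions, three of which hit a native contact.
scores = {(0, 25): 0.9, (1, 26): 0.8, (3, 28): 0.7,
          (4, 29): 0.6, (2, 27): 0.5, (0, 24): 0.4}
for (a, b), p in scores.items():
    prob[a, b] = p

# A confident short-range prediction is ignored by the long-range filter.
prob[0, 1] = 1.0

precision = top_l5_long_range_precision(prob, contact)
print(precision)  # 0.5 for this toy example
```

Under this definition, a target is among "the cases with an accuracy above 0.5" when more than half of its top L/5 long-range predictions are native contacts.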
© 2019 Wiley Periodicals, Inc.

Keywords:  CASP; coevolution analysis; contact-map prediction; deep learning; protein folding

Year:  2019        PMID: 31407406      PMCID: PMC6851483          DOI: 10.1002/prot.25798

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585

