
High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features.

David T Jones, Shaun M Kandathil.

Abstract

Motivation: In addition to substitution frequency data from protein sequence alignments, many state-of-the-art methods for predicting residue-residue contacts rely on additional sources of information, or features, of protein sequences, such as solvent accessibility, predicted secondary structure, and scores from other contact prediction methods. It is unclear how much of this information is needed to achieve state-of-the-art results. Here, we show that, when deep neural network models are used, simple alignment statistics contain sufficient information to achieve state-of-the-art precision. Our prediction method, DeepCov, uses fully convolutional neural networks operating on amino-acid pair frequency or covariance data derived directly from sequence alignments, without using global statistical methods such as sparse inverse covariance or pseudolikelihood estimation.
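As an illustration of the kind of input described above, the following is a minimal sketch of computing amino-acid pair covariances directly from an alignment. It is not taken from the DeepCov code: the function names, the 21-letter alphabet (20 residues plus gap), the mapping of unknown characters to the gap state, and the absence of sequence weighting or pseudocounts are all assumptions made for the example.

```python
import numpy as np

# Assumed alphabet: 20 standard amino acids plus a gap state (21 symbols).
ALPHABET = "ARNDCQEGHILKMFPSTWYV-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def one_hot_alignment(seqs):
    """Encode N aligned sequences of length L as an (N, L, 21) one-hot array."""
    length = len(seqs[0])
    x = np.zeros((len(seqs), length, len(ALPHABET)))
    for s, seq in enumerate(seqs):
        if len(seq) != length:
            raise ValueError("all aligned sequences must have the same length")
        for i, aa in enumerate(seq):
            # Assumption: any unknown character is treated as a gap.
            x[s, i, INDEX.get(aa, INDEX["-"])] = 1.0
    return x


def pair_covariance(seqs):
    """Return cov[i, j, a, b] = f_ij(a, b) - f_i(a) * f_j(b), unweighted."""
    x = one_hot_alignment(seqs)
    n = x.shape[0]
    f_i = x.mean(axis=0)                          # single-column frequencies, (L, 21)
    f_ij = np.einsum("sia,sjb->ijab", x, x) / n   # pair frequencies, (L, L, 21, 21)
    return f_ij - f_i[:, None, :, None] * f_i[None, :, None, :]


def covariance_features(seqs):
    """Flatten the 21 x 21 covariance block per residue pair into 441 channels."""
    cov = pair_covariance(seqs)
    length = cov.shape[0]
    return cov.reshape(length, length, -1)


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "GDD", "GDE"]  # hypothetical toy alignment
    features = covariance_features(toy_alignment)
    print(features.shape)  # (3, 3, 441)
```

The resulting L x L x 441 array has an image-like layout, so it could be passed to a 2D convolutional network with one channel per amino-acid pair. No global model fitting is involved at this stage, only counting.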
Results: Comparisons against CCMpred and MetaPSICOV2 show that using pairwise covariance data calculated from raw alignments as input allows us to match or exceed the performance of both of these methods. Almost all of the achieved precision is obtained when considering relatively local windows (around 15 residues) around any member of a given residue pairing; larger window sizes have comparable performance. Assessment on a set of shallow sequence alignments (fewer than 160 effective sequences) indicates that the new method is substantially more precise than CCMpred and MetaPSICOV2 in this regime, suggesting that improved precision is attainable on smaller sequence families. Overall, the performance of DeepCov is competitive with the state of the art, and our results demonstrate that global models, which employ features from all parts of the input alignment when predicting individual contacts, are not strictly needed in order to attain precise contact predictions.
Availability and implementation: DeepCov is freely available at https://github.com/psipred/DeepCov.
Supplementary information: Supplementary data are available at Bioinformatics online.
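One way to relate the "local window" of around 15 residues to a fully convolutional network is through its receptive field, that is, how many input positions along each axis can influence a single output. The sketch below uses the standard formula for stacked stride-1 convolutions. The layer configurations shown are hypothetical and are not taken from the abstract, which does not state the architecture.

```python
def receptive_field(kernel_sizes, dilations=None):
    """Receptive field (per axis) of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    if dilations is None:
        dilations = [1] * len(kernel_sizes)
    if len(dilations) != len(kernel_sizes):
        raise ValueError("kernel_sizes and dilations must have the same length")
    field = 1
    for k, d in zip(kernel_sizes, dilations):
        field += (k - 1) * d
    return field


if __name__ == "__main__":
    # Hypothetical configurations, chosen only to illustrate the arithmetic.
    print(receptive_field([3] * 7))       # 15: seven 3x3 layers
    print(receptive_field([5, 5, 5, 3]))  # 15: three 5x5 layers and one 3x3 layer
    print(receptive_field([5] * 10))      # 41: a deeper stack sees a wider window
```

Under this reading, a network whose receptive field spans roughly 15 residues only sees covariance data near the two residues being scored, which is what distinguishes it from global models that use the whole alignment.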


Year:  2018        PMID: 29718112      PMCID: PMC6157083          DOI: 10.1093/bioinformatics/bty341

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


