
DeepDom: Predicting protein domain boundary from sequence alone using stacked bidirectional LSTM.

Yuexu Jiang, Duolin Wang, Dong Xu.

Abstract

Protein domain boundary prediction is usually an early step in understanding protein function and structure. Most current computational domain boundary prediction methods suffer from low accuracy and a limited ability to handle multi-domain types, and some cannot be applied at all to certain targets, such as proteins with discontinuous domains. We developed an ab initio protein domain predictor using a stacked bidirectional LSTM deep learning model. Our model is trained on a large number of protein sequences without feature engineering such as sequence profiles. Hence, prediction with our method is much faster than with other methods, and the trained model can be applied to any type of target protein without constraint. We evaluated DeepDom by 10-fold cross-validation and by applying it to targets in different categories from CASP 8 and CASP 9. The comparison with other methods shows that DeepDom outperforms most current ab initio methods and, in certain cases, even achieves better results than the top-level template-based method. The DeepDom code and the CASP 8 and CASP 9 test data we used are available on GitHub at https://github.com/yuexujiang/DeepDom.
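To make "from sequence alone" concrete, the following minimal Python sketch shows one way a raw amino acid sequence could be turned into per-residue inputs and boundary labels for a sequence model, with no sequence profiles involved. It is an illustration under stated assumptions, not the DeepDom implementation: the one-hot encoding, the 20-letter alphabet, the `half_window` labeling scheme, the function names, and the example sequence are all hypothetical and are not taken from the abstract.

```python
# Illustrative sketch only. The alphabet, the one-hot encoding, and the
# boundary window size are assumptions, not details from the DeepDom abstract.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Map each residue to a 20-dimensional one-hot vector.

    Residues outside the standard alphabet become all-zero vectors.
    """
    encoded = []
    for residue in sequence.upper():
        vector = [0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:
            vector[index] = 1
        encoded.append(vector)
    return encoded


def boundary_labels(length, boundaries, half_window=2):
    """Return one label per residue: 1 near a domain boundary, else 0.

    `boundaries` holds 0-based residue indices; every residue within
    `half_window` positions of a boundary is labeled 1 (hypothetical scheme).
    """
    labels = [0] * length
    for boundary in boundaries:
        start = max(0, boundary - half_window)
        stop = min(length, boundary + half_window + 1)
        for i in range(start, stop):
            labels[i] = 1
    return labels


sequence = "MKTAYIAKQRQISFVK"  # made-up 16-residue sequence
features = one_hot_encode(sequence)
labels = boundary_labels(len(sequence), boundaries=[8], half_window=2)
```

Here `features` holds one vector per residue and `labels` holds one target per residue, which is the general shape of input and output a bidirectional recurrent model would consume for per-residue boundary prediction.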


Year:  2019        PMID: 30864311      PMCID: PMC6417825     

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


