Prediction of Protein-ATP Binding Residues Based on Ensemble of Deep Convolutional Neural Networks and LightGBM Algorithm.

Jiazhi Song, Guixia Liu, Jingqing Jiang, Ping Zhang, Yanchun Liang.

Abstract

Accurately identifying protein-ATP binding residues is important for protein function annotation and drug design. Previous studies have used classic machine-learning algorithms such as support vector machines (SVM) and random forests to predict protein-ATP binding residues; however, as new machine-learning techniques are developed, prediction performance can be further improved. In this paper, we propose an ensemble predictor that combines deep convolutional neural networks and LightGBM through ensemble learning. Three subclassifiers were developed: a multi-incepResNet-based predictor, a multi-Xception-based predictor, and a LightGBM predictor. The final prediction is a weighted combination of the outputs of the three subclassifiers, with the weight distribution optimized. We examined the performance of the proposed predictor on two datasets: a classic ATP-binding benchmark dataset and a newly proposed ATP-binding dataset. The predictor achieved area under the curve (AUC) values of 0.925 and 0.902 and Matthews correlation coefficient (MCC) values of 0.639 and 0.642 on the two datasets, respectively, both better than those of other state-of-the-art prediction methods.
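
The weighted combination of three subclassifier outputs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the weights were optimized, so the coarse grid search, the use of MCC as the selection criterion, the 0.5 decision threshold, the function names (`mcc`, `combine`, `search_weights`), and the toy scores are all assumptions made for illustration.

```python
from math import sqrt


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def combine(probs, weights):
    """Weighted average of per-residue scores from several subclassifiers."""
    n_residues = len(probs[0])
    return [sum(w * p[i] for w, p in zip(weights, probs))
            for i in range(n_residues)]


def search_weights(probs, y_true, steps=10, threshold=0.5):
    """Grid search over three non-negative weights that sum to 1.

    Assumption: weights are chosen to maximize MCC on a validation set.
    Returns (best_weights, best_mcc).
    """
    best_w, best_m = None, -2.0  # MCC is always >= -1
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            c = steps - a - b
            w = (a / steps, b / steps, c / steps)
            scores = combine(probs, w)
            pred = [1 if s >= threshold else 0 for s in scores]
            m = mcc(y_true, pred)
            if m > best_m:
                best_w, best_m = w, m
    return best_w, best_m


# Toy validation data (made up): 1 = ATP-binding residue, 0 = non-binding.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical per-residue scores from three subclassifiers.
toy_probs = [
    [0.9, 0.2, 0.4, 0.1, 0.6, 0.7, 0.3, 0.2],
    [0.6, 0.4, 0.7, 0.3, 0.2, 0.4, 0.1, 0.3],
    [0.8, 0.1, 0.6, 0.55, 0.3, 0.8, 0.2, 0.1],
]

weights, score = search_weights(toy_probs, toy_labels)
print("weights:", weights, "validation MCC:", round(score, 3))
```

Because the grid includes the corner cases where one subclassifier receives all the weight, the selected combination can never score below the best single subclassifier on the validation set. Whether it generalizes is a separate question that requires held-out data.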

Keywords:  LightGBM; deep convolutional neural network; ensemble learning; protein primary sequence; protein–ATP binding residue prediction

Year:  2021        PMID: 33477866      PMCID: PMC7832895          DOI: 10.3390/ijms22020939

Source DB:  PubMed          Journal:  Int J Mol Sci        ISSN: 1422-0067            Impact factor:   5.923


