DP-BINDER: machine learning model for prediction of DNA-binding proteins by fusing evolutionary and physicochemical information.

Farman Ali, Saeed Ahmed, Zar Nawab Khan Swati, Shahid Akbar.

Abstract

DNA-binding proteins (DBPs) participate in various biological processes, including DNA replication, recombination, and repair. In the human genome, about 6-7% of genes encode these proteins. DBPs shape DNA into a compact structure known as chromatin, while some of these proteins regulate chromosome packaging and transcription. In the pharmaceutical industry, DBPs are used as key components of antibiotics, steroids, and cancer drugs. These proteins are also involved in biophysical, biological, and biochemical studies of DNA. Because of their crucial role in various biological activities, the identification of DBPs is an active topic in protein science. A series of experimental and computational methods have been proposed; however, some did not achieve the desired results, while others are inadequate in accuracy and reliability. More intelligent computational predictors are therefore still highly desirable. In this work, we introduce a computational method, DP-BINDER, based on physicochemical and evolutionary information. We captured highly discriminative local features from the physicochemical properties of primary protein sequences via normalized Moreau-Broto autocorrelation (NMBAC), and evolutionary information via position specific scoring matrix-transition probability composition (PSSM-TPC) and pseudo-position specific scoring matrix (PsePSSM), using training and independent datasets. The optimal features were selected from the fused features by support vector machine-recursive feature elimination with correlation bias reduction (SVM-RFE + CBR) and were fed into random forest (RF) and support vector machine (SVM) classifiers. Our method attained 92.46% and 89.58% accuracy with jackknife and ten-fold cross-validation, respectively, on the training dataset, and 81.17% accuracy on the independent dataset for the prediction of DBPs. These results demonstrate that our method attained the highest success rate in the literature. The superiority of DP-BINDER over existing approaches is due to several factors, including the extraction of locally dominant features via effective feature descriptors, the use of appropriate feature selection algorithms, and an effective classifier.
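To make the NMBAC descriptor mentioned in the abstract more concrete, the following is a minimal sketch of one common formulation: a physicochemical property is standardized over the 20 amino acids, and for each lag the products of property values at residues `lag` positions apart are averaged. The abstract does not specify the properties, the maximum lag, or the exact normalization used in DP-BINDER, so the Kyte-Doolittle hydropathy scale, the example sequence, and the function names `standardize` and `nmbac` are assumptions made purely for illustration.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index. Assumption: chosen only as an example
# property; the abstract does not list the properties DP-BINDER used.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 amino acids."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (value - mu) / sigma for aa, value in scale.items()}


def nmbac(sequence, scale, max_lag):
    """Return [AC(1), ..., AC(max_lag)] for one property.

    AC(lag) = 1 / (N - lag) * sum_i P(i) * P(i + lag), where P is the
    standardized property value of residue i and N is the sequence length.
    Non-standard residues (e.g. 'X') raise KeyError in this sketch.
    """
    norm = standardize(scale)
    values = [norm[aa] for aa in sequence]
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    features = []
    for lag in range(1, max_lag + 1):
        total = sum(values[i] * values[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


# Hypothetical toy sequence, not taken from any dataset in the paper.
features = nmbac("MKRLSAVDEK", KD_HYDROPATHY, max_lag=3)
print(features)
```

Repeating the call for several properties and concatenating the results would give a fixed-length vector per protein regardless of sequence length, which is what allows such descriptors to be fused with PSSM-based features before feature selection.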

Keywords:  DNA-binding proteins; Normalized Moreau-Broto autocorrelation; Pseudo-position specific scoring matrix; Random forest; Support vector machine; Transition probability composition

Year:  2019        PMID: 31123959     DOI: 10.1007/s10822-019-00207-x

Source DB:  PubMed          Journal:  J Comput Aided Mol Des        ISSN: 0920-654X            Impact factor:   3.686


