DPP-PseAAC: A DNA-binding protein prediction model using Chou's general PseAAC.

M Saifur Rahman, Swakkhar Shatabda, Sanjay Saha, M Kaykobad, M Sohel Rahman.

Abstract

A DNA-binding protein (DNA-BP) is a protein that can bind to and interact with DNA. Identifying DNA-BPs by experimental methods is expensive and time-consuming. Fast and accurate computational methods are therefore sought for predicting whether a protein can bind to DNA. In this paper, we focus on building a new computational model that identifies DNA-BPs efficiently and accurately. Our model extracts meaningful information directly from protein sequences, without any dependence on functional domain or structural information. After feature extraction, we employ a Random Forest (RF) model to rank the features. We then use the Recursive Feature Elimination (RFE) method to select an optimal set of features and train a prediction model using a Support Vector Machine (SVM) with a linear kernel. Our proposed method, named DNA-binding Protein Prediction model using Chou's general PseAAC (DPP-PseAAC), demonstrates superior performance compared to state-of-the-art predictors on a standard benchmark dataset. DPP-PseAAC achieves accuracies of 93.21%, 95.91% and 77.42% on the 10-fold cross-validation test, the jackknife test and the independent test, respectively. The source code of DPP-PseAAC, along with the relevant dataset and detailed experimental results, can be found at https://github.com/srautonu/DNABinding. A publicly accessible web interface has also been established at http://77.68.43.135:8080/DPP-PseAAC/.
Copyright © 2018 Elsevier Ltd. All rights reserved.
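
As a rough illustration of two ideas mentioned in the abstract, the Python sketch below computes amino acid composition (the 20 residue frequencies that Chou's PseAAC extends with sequence-order terms) and generates index splits for 10-fold cross-validation and for the jackknife (leave-one-out) test. It is not the DPP-PseAAC feature set or code, which are available in the repository linked above; the function names are hypothetical, the toy sequences are made up, and the RF ranking, RFE and SVM steps are omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the 20 residue frequencies of a protein sequence (they sum to 1).

    Hypothetical helper: a generic sequence-derived descriptor, not the
    feature set used by DPP-PseAAC.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def kfold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    No shuffling is done here, for clarity; real evaluations usually
    shuffle or stratify the samples first.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def jackknife_splits(n_samples):
    """Jackknife test: every sample is held out exactly once (leave-one-out)."""
    return kfold_splits(n_samples, n_samples)


if __name__ == "__main__":
    # Made-up toy sequences, used only to exercise the functions.
    toy_sequences = ["MKRAAKLLG", "GSHMAEEL", "AACD", "KKRRPL"]
    features = [amino_acid_composition(s) for s in toy_sequences]
    print(len(features), "sequences,", len(features[0]), "features each")
    for train, test in jackknife_splits(len(toy_sequences)):
        print("train on", train, "test on", test)
```

The jackknife test is simply k-fold cross-validation with k equal to the number of samples, which is why it is far more expensive than the 10-fold test on a large benchmark dataset.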

Keywords:  Classification; DNA binding; Prediction; PseAAC; Random Forest; Support Vector Machine

Year:  2018        PMID: 29753757     DOI: 10.1016/j.jtbi.2018.05.006

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  18 in total

1.  Some illuminating remarks on molecular genetics and genomics as well as drug development. (Review)

Authors:  Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2020-01-01       Impact factor: 3.291

2.  FTWSVM-SR: DNA-Binding Proteins Identification via Fuzzy Twin Support Vector Machines on Self-Representation.

Authors:  Yi Zou; Yijie Ding; Li Peng; Quan Zou
Journal:  Interdiscip Sci       Date:  2021-11-06       Impact factor: 2.233

3.  Research on DNA-Binding Protein Identification Method Based on LSTM-CNN Feature Fusion.

Authors:  Weizhong Lu; Xiaoyi Chen; Yu Zhang; Hongjie Wu; Yijie Ding; Jiawei Shen; Shixuan Guan; Haiou Li
Journal:  Comput Math Methods Med       Date:  2022-06-02       Impact factor: 2.809

4.  Comparative Analysis on Alignment-Based and Pretrained Feature Representations for the Identification of DNA-Binding Proteins.

Authors:  Die Chen; Hua Zhang; Zeqi Chen; Bo Xie; Ye Wang
Journal:  Comput Math Methods Med       Date:  2022-06-28       Impact factor: 2.809

5.  A sequence-based multiple kernel model for identifying DNA-binding proteins.

Authors:  Yuqing Qian; Limin Jiang; Yijie Ding; Jijun Tang; Fei Guo
Journal:  BMC Bioinformatics       Date:  2021-05-31       Impact factor: 3.169

6.  Prediction of DNA binding proteins using local features and long-term dependencies with primary sequences based on deep learning.

Authors:  Guobin Li; Xiuquan Du; Xinlu Li; Le Zou; Guanhong Zhang; Zhize Wu
Journal:  PeerJ       Date:  2021-05-03       Impact factor: 2.984

7.  Prediction of DNA-Binding Protein-Drug-Binding Sites Using Residue Interaction Networks and Sequence Feature.

Authors:  Wei Wang; Yu Zhang; Dong Liu; HongJun Zhang; XianFang Wang; Yun Zhou
Journal:  Front Bioeng Biotechnol       Date:  2022-04-20

8.  PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method.

Authors:  Jun Wang; Huiwen Zheng; Yang Yang; Wanyue Xiao; Taigang Liu
Journal:  Biomed Res Int       Date:  2020-04-13       Impact factor: 3.411

9.  CRISPRpred(SEQ): a sequence-based method for sgRNA on target activity prediction using traditional machine learning.

Authors:  Ali Haisam Muhammad Rafid; Md Toufikuzzaman; Mohammad Saifur Rahman; M Sohel Rahman
Journal:  BMC Bioinformatics       Date:  2020-06-01       Impact factor: 3.169

10.  HMMPred: Accurate Prediction of DNA-Binding Proteins Based on HMM Profiles and XGBoost Feature Selection.

Authors:  Xiuzhi Sang; Wanyue Xiao; Huiwen Zheng; Yang Yang; Taigang Liu
Journal:  Comput Math Methods Med       Date:  2020-03-28       Impact factor: 2.238
