
PseDNA-Pro: DNA-Binding Protein Identification by Combining Chou's PseAAC and Physicochemical Distance Transformation.

Bin Liu, Jinghao Xu, Shixi Fan, Ruifeng Xu, Jiyun Zhou, Xiaolong Wang.

Abstract

Identification of DNA-binding proteins is an important problem in biomedical research because DNA-binding proteins are crucial for various cellular processes. Currently, machine learning methods achieve state-of-the-art performance using different features. A key step in improving the performance of these methods is to find a suitable representation of proteins. In this study, we proposed a feature vector composed of three kinds of sequence-based features: overall amino acid composition, the pseudo amino acid composition (PseAAC) proposed by Chou, and physicochemical distance transformation. These features not only capture the sequence composition of proteins but also incorporate the sequence-order information of their amino acids. The feature vectors were fed into a Support Vector Machine (SVM) for DNA-binding protein identification. The proposed method is called PseDNA-Pro. Experiments on stringent benchmark datasets and independent test datasets using the jackknife test showed that PseDNA-Pro can achieve an accuracy higher than 80%, outperforming several state-of-the-art methods, including DNAbinder, DNA-Prot, and iDNA-Prot. These results indicate that combining various features is a suitable approach for DNA-binding protein prediction, and that the sequence-order information among residues in proteins is relevant for discrimination. For practical applications, a web server for PseDNA-Pro was established, which is available from http://bioinformatics.hitsz.edu.cn/PseDNA-Pro/.
© 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
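
As an illustration of how composition and sequence-order information can be combined in one feature vector, the following is a minimal sketch of amino acid composition plus a Type-1-style PseAAC with sequence-order correlation factors. It is not the PseDNA-Pro implementation: the function names (`aac`, `pse_aac`), the parameter defaults (`lam=3`, `w=0.05`), and the use of the Kyte-Doolittle hydropathy index as the single physicochemical property are all assumptions made for this example. The physicochemical distance transformation and the SVM classifier are not shown.

```python
from collections import Counter
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index, used here as a stand-in property
# (assumption: the abstract does not say which properties were used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mu = mean(scale.values())
    sd = pstdev(scale.values())
    return {aa: (v - mu) / sd for aa, v in scale.items()}


def aac(seq):
    """Overall amino acid composition: 20 normalized frequencies."""
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def pse_aac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional vector: composition plus
    lam sequence-order correlation factors, jointly normalized to sum to 1."""
    if any(aa not in KD for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    h = standardize(KD)
    theta = []
    for j in range(1, lam + 1):
        # Mean squared property difference between residues j positions apart.
        diffs = [(h[seq[i]] - h[seq[i + j]]) ** 2 for i in range(len(seq) - j)]
        theta.append(sum(diffs) / len(diffs))
    freqs = aac(seq)
    denom = sum(freqs) + w * sum(theta)
    return [f / denom for f in freqs] + [w * t / denom for t in theta]


features = pse_aac("MKVLAAGIVGLCAKRT", lam=3)
```

The first 20 components reflect composition and the last `lam` components reflect sequence order; vectors of this kind could then be passed to any SVM implementation and evaluated with a jackknife (leave-one-out) test.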

Keywords:  DNA-binding protein; Pseudo amino acid composition; Support vector machine

Year:  2014        PMID: 27490858     DOI: 10.1002/minf.201400025

Source DB:  PubMed          Journal:  Mol Inform        ISSN: 1868-1743            Impact factor:   3.353


