
Design of accurate predictors for DNA-binding sites in proteins using hybrid SVM-PSSM method.

Shinn-Ying Ho, Fu-Chieh Yu, Chia-Yun Chang, Hui-Ling Huang.

Abstract

This paper investigates the design of accurate predictors for DNA-binding sites in proteins from amino acid sequences. We propose a hybrid method, SVM-PSSM, that combines a support vector machine (SVM) with evolutionary information from amino acid sequences, encoded as position-specific scoring matrices (PSSMs). Because binding and non-binding residues occur in markedly unequal numbers, two additional weights are analyzed and tuned together with the SVM parameters to maximize net prediction (NP, the average of sensitivity and specificity). To evaluate the generalization ability of SVM-PSSM, we additionally establish a DNA-binding dataset, PDC-59, consisting of 59 protein chains with low sequence identity to one another. Using the same six-fold cross-validation procedure and PSSM features, the SVM-based method achieves NP=80.15% on the training dataset PDNA-62 and NP=69.54% on the test dataset PDC-59, improving on the existing neural network-based method by up to 13.45% and 16.53% in training and test NP, respectively. Simulation results indicate that SVM-PSSM performs well in predicting DNA-binding sites of novel proteins from amino acid sequences.
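The NP measure defined in the abstract (the average of sensitivity and specificity) can be illustrated with a minimal sketch. The function name `net_prediction` and the toy labels below are assumptions made for illustration only; they are not taken from the paper or its datasets.

```python
def net_prediction(y_true, y_pred):
    """Return (sensitivity, specificity, NP) for binary labels.

    Label 1 marks a binding residue, label 0 a non-binding residue.
    NP is the plain average of sensitivity and specificity.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # binding, predicted binding
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # binding, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-binding, predicted non-binding
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-binding, flagged as binding

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical toy chain: 2 binding residues among 10 (illustrative only).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Predicting "non-binding" everywhere gives 80% raw accuracy but NP = 0.5.
always_negative = [0] * 10
print(net_prediction(y_true, always_negative))

# Finding one of the two binding residues at the cost of one false positive.
mixed = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(net_prediction(y_true, mixed))
```

On unbalanced data such as this, a predictor that never reports a binding residue still scores high raw accuracy, whereas NP stays at 0.5. That is one plausible reading of why the abstract optimizes NP rather than overall accuracy.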

Year:  2006        PMID: 17275170     DOI: 10.1016/j.biosystems.2006.08.007

Source DB:  PubMed          Journal:  Biosystems        ISSN: 0303-2647            Impact factor:   1.973


  11 in total

1.  Predicting and analyzing DNA-binding domains using a systematic approach to identifying a set of informative physicochemical and biochemical properties.

Authors:  Hui-Lin Huang; I-Che Lin; Yi-Fan Liou; Chia-Ta Tsai; Kai-Ti Hsu; Wen-Lin Huang; Shinn-Jang Ho; Shinn-Ying Ho
Journal:  BMC Bioinformatics       Date:  2011-02-15       Impact factor: 3.169

2.  PNImodeler: web server for inferring protein-binding nucleotides from sequence data.

Authors:  Jinyong Im; Narankhuu Tuvshinjargal; Byungkyu Park; Wook Lee; De-Shuang Huang; Kyungsook Han
Journal:  BMC Genomics       Date:  2015-01-29       Impact factor: 3.969

3.  Identification of DNA-binding proteins using support vector machine with sequence information.

Authors:  Xin Ma; Jiansheng Wu; Xiaoyun Xue
Journal:  Comput Math Methods Med       Date:  2013-09-16       Impact factor: 2.238

4.  iDNA-Prot|dis: identifying DNA-binding proteins by incorporating amino acid distance-pairs and reduced alphabet profile into the general pseudo amino acid composition.

Authors:  Bin Liu; Jinghao Xu; Xun Lan; Ruifeng Xu; Jiyun Zhou; Xiaolong Wang; Kuo-Chen Chou
Journal:  PLoS One       Date:  2014-09-03       Impact factor: 3.240

5.  DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues.

Authors:  Xin Ma; Jing Guo; Xiao Sun
Journal:  PLoS One       Date:  2016-12-01       Impact factor: 3.240

6.  Prediction of DNA-binding residues in proteins from amino acid sequences using a random forest model with a hybrid feature.

Authors:  Jiansheng Wu; Hongde Liu; Xueye Duan; Yan Ding; Hongtao Wu; Yunfei Bai; Xiao Sun
Journal:  Bioinformatics       Date:  2008-11-12       Impact factor: 6.937

7.  Novel approach for selecting the best predictor for identifying the binding sites in DNA binding proteins.

Authors:  R Nagarajan; Shandar Ahmad; M Michael Gromiha
Journal:  Nucleic Acids Res       Date:  2013-06-20       Impact factor: 16.971

8.  PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.

Authors:  Jiyun Zhou; Ruifeng Xu; Yulan He; Qin Lu; Hongpeng Wang; Bing Kong
Journal:  Sci Rep       Date:  2016-06-10       Impact factor: 4.379

9.  Sequence-Based Prediction of RNA-Binding Proteins Using Random Forest with Minimum Redundancy Maximum Relevance Feature Selection.

Authors:  Xin Ma; Jing Guo; Xiao Sun
Journal:  Biomed Res Int       Date:  2015-10-12       Impact factor: 3.411

10.  EL_PSSM-RT: DNA-binding residue prediction by integrating ensemble learning with PSSM Relation Transformation.

Authors:  Jiyun Zhou; Qin Lu; Ruifeng Xu; Yulan He; Hongpeng Wang
Journal:  BMC Bioinformatics       Date:  2017-08-29       Impact factor: 3.169
