An ensemble of reduced alphabets with protein encoding based on grouped weight for predicting DNA-binding proteins.

Loris Nanni, Alessandra Lumini.

Abstract

It is well known in the literature that an ensemble of classifiers generally outperforms a stand-alone method. Hence, it is important to develop ensemble methods well suited to bioinformatics data. In this work, we propose combining the feature extraction method based on grouped weight with a set of amino-acid alphabets obtained by a genetic algorithm. The proposed method is applied to predicting DNA-binding proteins. The linear support vector machine and the radial basis function support vector machine are tested as classifiers, and accuracy and the Matthews correlation coefficient are reported as performance indicators. The Matthews correlation coefficient obtained by our ensemble method is approximately 0.97 under jackknife cross-validation. This result exceeds those reported in the literature on the same dataset with features extracted directly from the amino-acid sequence.
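As a rough illustration of two ideas named in the abstract, the sketch below maps a protein sequence onto a reduced amino-acid alphabet and computes the Matthews correlation coefficient from confusion-matrix counts. The grouping `EXAMPLE_GROUPS` and the function names are assumptions made for this example only; they are not the alphabets found by the authors' genetic algorithm, and the grouped-weight encoding itself is not reproduced here.

```python
from math import sqrt

# Hypothetical 4-letter reduced alphabet covering the 20 standard amino acids.
# Illustrative only: NOT one of the alphabets obtained in the paper.
EXAMPLE_GROUPS = ["AVLIMC", "FWYH", "STNQGP", "DEKR"]


def reduce_sequence(seq, groups):
    """Replace each residue by the index of the group that contains it."""
    lookup = {aa: str(i) for i, group in enumerate(groups) for aa in group}
    return "".join(lookup[aa] for aa in seq)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print(reduce_sequence("MKVLDE", EXAMPLE_GROUPS))
    print(mcc(tp=48, tn=49, fp=1, fn=2))
```

In a jackknife (leave-one-out) evaluation, each protein would be held out in turn, the classifier trained on the rest, and the resulting counts of true and false positives and negatives passed to a function like `mcc`.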

Year:  2008        PMID: 18288459     DOI: 10.1007/s00726-008-0044-7

Source DB:  PubMed          Journal:  Amino Acids        ISSN: 0939-4451            Impact factor:   3.520


  7 in total

1.  iDNA-Prot: identification of DNA binding proteins using random forest with grey model.

Authors:  Wei-Zhong Lin; Jian-An Fang; Xuan Xiao; Kuo-Chen Chou
Journal:  PLoS One       Date:  2011-09-15       Impact factor: 3.240

2.  iDNA-Prot|dis: identifying DNA-binding proteins by incorporating amino acid distance-pairs and reduced alphabet profile into the general pseudo amino acid composition.

Authors:  Bin Liu; Jinghao Xu; Xun Lan; Ruifeng Xu; Jiyun Zhou; Xiaolong Wang; Kuo-Chen Chou
Journal:  PLoS One       Date:  2014-09-03       Impact factor: 3.240

3.  Sequence based prediction of DNA-binding proteins based on hybrid feature selection using random forest and Gaussian naïve Bayes.

Authors:  Wangchao Lou; Xiaoqing Wang; Fan Chen; Yixiao Chen; Bo Jiang; Hua Zhang
Journal:  PLoS One       Date:  2014-01-24       Impact factor: 3.240

4.  enDNA-Prot: identification of DNA-binding proteins by applying ensemble learning.

Authors:  Ruifeng Xu; Jiyun Zhou; Bin Liu; Lin Yao; Yulan He; Quan Zou; Xiaolong Wang
Journal:  Biomed Res Int       Date:  2014-05-26       Impact factor: 3.411

5.  nDNA-Prot: identification of DNA-binding proteins based on unbalanced classification.

Authors:  Li Song; Dapeng Li; Xiangxiang Zeng; Yunfeng Wu; Li Guo; Quan Zou
Journal:  BMC Bioinformatics       Date:  2014-09-08       Impact factor: 3.169

6.  Improved detection of DNA-binding proteins via compression technology on PSSM information.

Authors:  Yubo Wang; Yijie Ding; Fei Guo; Leyi Wei; Jijun Tang
Journal:  PLoS One       Date:  2017-09-29       Impact factor: 3.240

7.  Prediction of RNA- and DNA-Binding Proteins Using Various Machine Learning Classifiers.

Authors:  Mehdi Poursheikhali Asghari; Parviz Abdolmaleki
Journal:  Avicenna J Med Biotechnol       Date:  2019 Jan-Mar