Improving enzyme regulatory protein classification by means of SVM-RFE feature selection.

Carlos Fernandez-Lozano, Enrique Fernández-Blanco, Kirtan Dave, Nieves Pedreira, Marcos Gestal, Julián Dorado, Cristian R Munteanu.

Abstract

Enzyme regulatory proteins are very important because they are involved in many biological processes that sustain life. The complexity of these proteins, the impossibility of identifying directly quantifiable molecular properties associated with the regulation of enzymatic activities, and their structural diversity create the need for new theoretical methods that can predict the enzyme regulatory function of new proteins. The current work presents the first classification model that predicts protein enzyme regulators using the Markov mean properties. These protein descriptors encode the topological information of amino acid contact networks based on amino acid distances and physicochemical properties. MInD-Prot software calculated these molecular descriptors for 2415 protein chains (350 enzyme regulators) using five atom physicochemical properties (Mulliken electronegativity, Kang-Jhon polarizability, vdW area, atom contribution to P) and the protein 3D regions. The best classification models to predict enzyme regulators were obtained with machine learning algorithms from Weka using 18 features. K* proved to be the most accurate algorithm for this protein function classification. Wrapper Subset Evaluator and SVM-RFE approaches were used to perform feature subset selection, with the best results obtained from SVM-RFE. The classification performance obtained with all the available features can be matched using only the 8 most relevant features selected by SVM-RFE. Thus, the current work has demonstrated the possibility of predicting new molecular targets involved in enzyme regulation using fast theoretical algorithms.
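The SVM-RFE idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' Weka pipeline: it is a pure-Python toy under stated assumptions, using a bias-free linear SVM trained by Pegasos-style stochastic gradient descent, one feature eliminated per round, and synthetic data in which only features 0 and 1 carry class signal. The names `train_linear_svm`, `svm_rfe`, and `make_toy_data`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random


def train_linear_svm(X, y, lam=0.1, epochs=100, seed=0):
    """Fit a bias-free linear SVM (hinge loss + L2 penalty) by Pegasos-style SGD.

    X is a list of feature rows, y holds labels in {-1, +1}.
    Assumes roughly centered features and balanced classes (no bias term).
    Returns the weight vector as a list of floats.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink the weights (gradient of the L2 penalty) ...
            w = [(1.0 - eta * lam) * wj for wj in w]
            # ... and step toward samples that violate the margin.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def svm_rfe(X, y, n_keep):
    """Recursive feature elimination: retrain, drop the feature with the
    smallest squared weight, and repeat until n_keep features remain.

    Returns (kept feature indices, eliminated indices in removal order).
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


def make_toy_data(n=200, seed=1):
    """Synthetic data (an assumption, not the paper's descriptors):
    features 0 and 1 follow the label, features 2..5 are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [label + rng.gauss(0, 0.3), label + rng.gauss(0, 0.3)]
        row += [rng.gauss(0, 1) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, eliminated = svm_rfe(X, y, n_keep=2)
print("kept features:", selected)
print("elimination order:", eliminated)
```

Ranking by squared weight only makes sense when features are on comparable scales, so real descriptors would normally be standardized first. In practice, a library implementation with cross-validated performance estimates would replace this hand-rolled trainer.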

Year: 2014   PMID: 24556806   DOI: 10.1039/c3mb70489k

Source DB: PubMed   Journal: Mol Biosyst   ISSN: 1742-2051


  2 in total

1.  Sequence-based identification of recombination spots using pseudo nucleic acid representation and recursive feature extraction by linear kernel SVM.

Authors:  Liqi Li; Sanjiu Yu; Weidong Xiao; Yongsheng Li; Lan Huang; Xiaoqi Zheng; Shiwen Zhou; Hua Yang
Journal:  BMC Bioinformatics       Date:  2014-11-20       Impact factor: 3.169

2.  Prediction of high anti-angiogenic activity peptides in silico using a generalized linear model and feature selection.

Authors:  Jose Liñares Blanco; Ana B Porto-Pazos; Alejandro Pazos; Carlos Fernandez-Lozano
Journal:  Sci Rep       Date:  2018-10-24       Impact factor: 4.379
