
Prediction of active sites of enzymes by maximum relevance minimum redundancy (mRMR) feature selection.

Yu-Fei Gao, Bi-Qing Li, Yu-Dong Cai, Kai-Yan Feng, Zhan-Dong Li, Yang Jiang.

Abstract

Identification of catalytic residues plays a key role in understanding how enzymes work. Although numerous computational methods have been developed to predict catalytic residues and active sites, prediction accuracy remains relatively low and false-positive rates remain high. In this work, we developed a novel predictor based on the Random Forest (RF) algorithm, aided by the maximum relevance minimum redundancy (mRMR) method and incremental feature selection (IFS). We incorporated features describing physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility to predict the active sites of enzymes, and achieved an overall accuracy of 0.885687 and an MCC of 0.689226 on an independent test dataset. Feature analysis showed that every feature category except disorder contributed to the identification of active sites. Site-specific feature analysis further showed that features derived from the active site itself contributed most to active site determination. Our prediction method may become a useful tool for identifying active sites, and the key features identified in this work may provide valuable insights into the mechanism of catalysis.
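The abstract does not spell out how mRMR orders features, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes discrete-valued features, uses the common "difference" form of the criterion (relevance to the label minus mean redundancy with already-selected features), and runs on a made-up toy dataset; the names `mutual_information`, `mrmr_rank`, `columns` and `labels` are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (count / n) * log(count * n / (px[x] * py[y]))
    return mi


def mrmr_rank(columns, labels):
    """Greedy mRMR ordering of feature columns (assumed difference form)."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))
    while remaining:
        best, best_score = None, None
        for j in remaining:
            if selected:
                # Mean mutual information with the features chosen so far
                redundancy = sum(
                    mutual_information(columns[j], columns[s]) for s in selected
                ) / len(selected)
            else:
                redundancy = 0.0
            score = relevance[j] - redundancy
            # Strict comparison: on ties the lowest feature index wins
            if best_score is None or score > best_score:
                best, best_score = j, score
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: feature 1 duplicates feature 0, while feature 2 is
# weakly informative about the label but independent of feature 0.
f0 = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = list(f0)
f2 = [0, 0, 1, 1, 0, 0, 1, 1]
labels = [0, 0, 0, 1, 1, 1, 1, 1]
columns = [f0, f1, f2]

print(mrmr_rank(columns, labels))  # [0, 2, 1]
```

On this toy data the duplicate feature is ranked last even though it is as relevant as the first one, which is the redundancy penalty at work. In an incremental feature selection step, a classifier would then be evaluated on growing prefixes of such a ranking.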


Year: 2012   PMID: 23117653   DOI: 10.1039/c2mb25327e

Source DB: PubMed   Journal: Mol Biosyst   ISSN: 1742-2051


