
PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework.

Jiangning Song, Fuyi Li, Kazuhiro Takemoto, Gholamreza Haffari, Tatsuya Akutsu, Kuo-Chen Chou, Geoffrey I Webb.

Abstract

Determining the catalytic residues in an enzyme is critical to understanding the relationship between protein sequence, structure, and function, and to enhancing our ability to design novel enzymes and their inhibitors. Although many enzymes have been sequenced, and their primary and tertiary structures determined, experimental methods for enzyme functional characterization lag behind. Because experimental methods for identifying catalytic residues are resource- and labor-intensive, computational approaches are highly desirable for their ability to complement experimental studies in identifying catalytic residues and to help bridge the sequence-structure-function gap. In this study, we describe a new computational method called PREvaIL for predicting enzyme catalytic residues. This method was developed by leveraging a comprehensive set of informative features extracted from multiple levels, including sequence, structure, and residue-contact network, in a random forest machine-learning framework. Extensive benchmarking experiments on eight different datasets based on 10-fold cross-validation and independent tests, as well as side-by-side performance comparisons with seven modern sequence- and structure-based methods, showed that PREvaIL achieved competitive predictive performance, with the area under the receiver operating characteristic curve ranging from 0.896 to 0.973 and the area under the precision-recall curve ranging from 0.294 to 0.523. We demonstrated that this method was able to capture useful signals arising from different levels, and that leveraging these different but useful types of features significantly improved the performance of catalytic residue prediction. We believe that this new method can be utilized as a valuable tool both for understanding the complex sequence-structure-function relationships of proteins and for facilitating the characterization of novel enzymes lacking functional annotations.
Copyright © 2018 Elsevier Ltd. All rights reserved.
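
The abstract reports two ranking metrics whose ranges differ widely (0.896 to 0.973 versus 0.294 to 0.523). The sketch below is not the PREvaIL implementation. It is a minimal, standard-library illustration of how the two metrics can diverge when positives are rare, as catalytic residues typically are within a protein. The toy labels, scores, and function names are assumptions made for illustration, and average precision is used as one common estimate of the area under the precision-recall curve.

```python
def auroc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values measured at the
    rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical toy protein: 98 non-catalytic residues and 2 catalytic ones.
# The scores are invented; a few negatives outrank each positive.
labels = [0] * 98 + [1, 1]
scores = [i / 100 for i in range(98)] + [0.955, 0.905]

print(f"AUROC = {auroc(labels, scores):.3f}")                   # AUROC = 0.954
print(f"avg precision = {average_precision(labels, scores):.3f}")  # avg precision = 0.278
```

In this toy case only nine of the 196 positive-negative pairs are misordered, so the ROC-based value stays high, while the handful of negatives ranked above each positive is enough to pull precision down sharply.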

Keywords:  Bioinformatics; Enzyme catalytic residues; Functional annotation; Machine learning; Pattern recognition; Sequence analysis; Sequence–structure–function relationship


Year: 2018    PMID: 29408627    DOI: 10.1016/j.jtbi.2018.01.023

Source DB: PubMed    Journal: J Theor Biol    ISSN: 0022-5193    Impact factor: 2.691


  24 in total

1.  PRISMOID: a comprehensive 3D structure database for post-translational modifications and mutations with functional impact.

Authors:  Fuyi Li; Cunshuo Fan; Tatiana T Marquez-Lago; André Leier; Jerico Revote; Cangzhi Jia; Yan Zhu; A Ian Smith; Geoffrey I Webb; Quanzhong Liu; Leyi Wei; Jian Li; Jiangning Song
Journal:  Brief Bioinform       Date:  2020-05-21       Impact factor: 11.622

2.  iN6-methylat (5-step): identifying DNA N6-methyladenine sites in rice genome using continuous bag of nucleobases via Chou's 5-step rule.

Authors:  Nguyen Quoc Khanh Le
Journal:  Mol Genet Genomics       Date:  2019-05-04       Impact factor: 3.291

3.  Bastion3: a two-layer ensemble predictor of type III secreted effectors.

Authors:  Jiawei Wang; Jiahui Li; Bingjiao Yang; Ruopeng Xie; Tatiana T Marquez-Lago; André Leier; Morihiro Hayashida; Tatsuya Akutsu; Yanju Zhang; Kuo-Chen Chou; Joel Selkrig; Tieli Zhou; Jiangning Song; Trevor Lithgow
Journal:  Bioinformatics       Date:  2019-06-01       Impact factor: 6.937

4.  iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC.

Authors:  Yaser Daanial Khan; Nouman Rasool; Waqar Hussain; Sher Afzal Khan; Kuo-Chen Chou
Journal:  Mol Biol Rep       Date:  2018-10-11       Impact factor: 2.316

5.  MULTiPly: a novel multi-layer predictor for discovering general and specific types of promoters.

Authors:  Meng Zhang; Fuyi Li; Tatiana T Marquez-Lago; André Leier; Cunshuo Fan; Chee Keong Kwoh; Kuo-Chen Chou; Jiangning Song; Cangzhi Jia
Journal:  Bioinformatics       Date:  2019-09-01       Impact factor: 6.937

6.  Predicting membrane proteins and their types by extracting various sequence features into Chou's general PseAAC.

Authors:  Ahmad Hassan Butt; Nouman Rasool; Yaser Daanial Khan
Journal:  Mol Biol Rep       Date:  2018-09-20       Impact factor: 2.316

7.  Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods.

Authors:  Fuyi Li; Yanan Wang; Chen Li; Tatiana T Marquez-Lago; André Leier; Neil D Rawlings; Gholamreza Haffari; Jerico Revote; Tatsuya Akutsu; Kuo-Chen Chou; Anthony W Purcell; Robert N Pike; Geoffrey I Webb; A Ian Smith; Trevor Lithgow; Roger J Daly; James C Whisstock; Jiangning Song
Journal:  Brief Bioinform       Date:  2019-11-27       Impact factor: 11.622

8.  Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome.

Authors:  Fuyi Li; Chen Li; Tatiana T Marquez-Lago; André Leier; Tatsuya Akutsu; Anthony W Purcell; A Ian Smith; Trevor Lithgow; Roger J Daly; Jiangning Song; Kuo-Chen Chou
Journal:  Bioinformatics       Date:  2018-12-15       Impact factor: 6.937

9.  Coupling dynamics and evolutionary information with structure to identify protein regulatory and functional binding sites.

Authors:  Sambit K Mishra; Gaurav Kandoi; Robert L Jernigan
Journal:  Proteins       Date:  2019-06-22

10.  Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework.

Authors:  Yanju Zhang; Ruopeng Xie; Jiawei Wang; André Leier; Tatiana T Marquez-Lago; Tatsuya Akutsu; Geoffrey I Webb; Kuo-Chen Chou; Jiangning Song
Journal:  Brief Bioinform       Date:  2019-11-27       Impact factor: 11.622
