Enzyme function prediction with interpretable models.

Umar Syed, Golan Yona.

Abstract

Enzymes play central roles in metabolic pathways, and the prediction of metabolic pathways in newly sequenced genomes usually starts with the assignment of genes to enzymatic reactions. However, genes with similar catalytic activity are not necessarily similar in sequence, and therefore the traditional sequence similarity-based approach often fails to identify the relevant enzymes, thus hindering efforts to map the metabolome of an organism. Here we study the direct relationship between basic protein properties and their function. Our goal is to develop a new tool for functional prediction (e.g., prediction of Enzyme Commission number), which can be used to complement and support other techniques based on sequence or structure information. In order to define this mapping we collected a set of 453 features and properties that characterize proteins and are believed to be related to structural and functional aspects of proteins. We introduce a mixture model of stochastic decision trees to learn the set of potentially complex relationships between features and function. To study these correlations, trees are created and tested on the Pfam classification of proteins, which is based on sequence, and the EC classification, which is based on enzymatic function. The model is very effective in learning highly diverged protein families or families that are not defined on the basis of sequence. The resulting tree structures highlight the properties that are strongly correlated with structural and functional aspects of protein families, and can be used to suggest a concise definition of a protein family.
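
The abstract does not specify how the trees are grown or how the mixture is weighted, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes that each split is sampled at random with probability proportional to its information gain, and that the mixture is a uniform average of the leaf class distributions. The names `fit_mixture` and `predict_proba`, the two toy features, and the class labels are all hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def candidate_splits(X, y):
    """Return (gain, feature, threshold) for every midpoint split with positive gain."""
    base = entropy(y)
    splits = []
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
            gain = base - weighted
            if gain > 1e-12:
                splits.append((gain, f, t))
    return splits


def grow(X, y, rng, depth=0, max_depth=3):
    """Grow one tree; each split is drawn with probability proportional to its gain."""
    splits = candidate_splits(X, y) if depth < max_depth else []
    if not splits:
        # Leaf: store the empirical class distribution of the training points here.
        return {"leaf": {k: v / len(y) for k, v in Counter(y).items()}}
    _, f, t = rng.choices(splits, weights=[s[0] for s in splits], k=1)[0]
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r, _ in left], [lab for _, lab in left], rng, depth + 1, max_depth),
        "right": grow([r for r, _ in right], [lab for _, lab in right], rng, depth + 1, max_depth),
    }


def fit_mixture(X, y, n_trees=25, seed=0):
    """Fit a uniform mixture (assumed weighting) of independently sampled trees."""
    rng = random.Random(seed)
    return [grow(X, y, rng) for _ in range(n_trees)]


def predict_proba(trees, x):
    """Average the leaf class distributions reached by x across all trees."""
    total = Counter()
    for node in trees:
        while "leaf" not in node:
            node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
        for label, p in node["leaf"].items():
            total[label] += p / len(trees)
    return dict(total)


# Toy data: two made-up numeric features per protein, two made-up classes.
X = [[0.1, 1.0], [0.2, 1.2], [0.3, 1.1], [0.7, 3.0], [0.8, 3.3], [0.9, 3.1]]
y = ["A", "A", "A", "B", "B", "B"]

trees = fit_mixture(X, y)
print(predict_proba(trees, [0.22, 1.12]))
```

Because every tree stores explicit feature thresholds, the splits that recur across the sampled trees can be read off directly, which is the sense in which such a model is interpretable.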

Year:  2009        PMID: 19381539     DOI: 10.1007/978-1-59745-243-4_17

Source DB:  PubMed          Journal:  Methods Mol Biol        ISSN: 1064-3745


  5 in total

1.  SHARP: genome-scale identification of gene-protein-reaction associations in cyanobacteria.

Authors:  S Krishnakumar; Dilip A Durai; Pramod P Wangikar; Ganesh A Viswanathan
Journal:  Photosynth Res       Date:  2013-08-24       Impact factor: 3.573

2.  A top-down approach to classify enzyme functional classes and sub-classes using random forest.

Authors:  Chetan Kumar; Alok Choudhary
Journal:  EURASIP J Bioinform Syst Biol       Date:  2012-02-29

3.  Computational Approaches for Automated Classification of Enzyme Sequences.

Authors:  Akram Mohammed; Chittibabu Guda
Journal:  J Proteomics Bioinform       Date:  2011-08-23

4.  Application of a hierarchical enzyme classification method reveals the role of gut microbiome in human metabolism.

Authors:  Akram Mohammed; Chittibabu Guda
Journal:  BMC Genomics       Date:  2015-06-11       Impact factor: 3.969

5.  Prediction of detailed enzyme functions and identification of specificity determining residues by random forests.

Authors:  Chioko Nagao; Nozomi Nagano; Kenji Mizuguchi
Journal:  PLoS One       Date:  2014-01-08       Impact factor: 3.240

