
Enzyme Promiscuity Prediction Using Hierarchy-Informed Multi-Label Classification.

Gian Marco Visani, Michael C Hughes, Soha Hassoun.

Abstract

MOTIVATION: As experimental efforts are costly and time-consuming, computational characterization of enzyme capabilities is an attractive alternative. We present and evaluate several machine-learning models to predict which of 983 distinct enzymes, as defined by their Enzyme Commission (EC) numbers, are likely to interact with a given query molecule. Our data consist of enzyme-substrate interactions from the BRENDA database. Some interactions are attributed to natural selection and involve the enzyme's natural substrates. The majority of the interactions, however, involve non-natural substrates and thus reflect promiscuous enzymatic activities.
RESULTS: We frame this "enzyme promiscuity prediction" problem as a multi-label classification task. We maximally utilize inhibitor and unlabelled data to train prediction models that can take advantage of known hierarchical relationships between enzyme classes. We report that a hierarchical multi-label neural network, EPP-HMCNF, is the best model for solving this problem, outperforming k-nearest neighbours similarity-based and other machine-learning models. We show that using inhibitor information during training consistently improves predictive power, particularly for EPP-HMCNF. We also show that all promiscuity prediction models perform worse under a realistic data split than under a random data split, and worse on non-natural substrates than on natural substrates.
AVAILABILITY AND IMPLEMENTATION: We provide Python code for EPP-HMCNF and other models in a repository termed EPP (Enzyme Promiscuity Prediction) at https://github.com/hassounlab/EPP.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2021. Published by Oxford University Press.
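
The abstract describes a multi-label task in which the labels (EC numbers) sit in a known hierarchy. The following is a minimal sketch of one way such hierarchical multi-label targets could be encoded. It is not taken from the EPP repository: the function names `ec_ancestors`, `build_label_index`, and `encode_labels` are hypothetical, and the assumption that every ancestor class of a positive EC number is also marked positive is an illustrative choice, not a statement about how EPP-HMCNF is trained.

```python
from typing import Dict, Iterable, List


def ec_ancestors(ec: str) -> List[str]:
    """Return an EC number together with its ancestor classes, most general first.

    EC numbers have up to four dot-separated levels, e.g. "1.1.1.1".
    """
    parts = ec.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def build_label_index(ec_numbers: Iterable[str]) -> Dict[str, int]:
    """Map every node of the hierarchy (leaves and ancestors) to a column index."""
    nodes = sorted({node for ec in ec_numbers for node in ec_ancestors(ec)})
    return {node: i for i, node in enumerate(nodes)}


def encode_labels(positive_ecs: Iterable[str], index: Dict[str, int]) -> List[int]:
    """Build a multi-hot target vector for one query molecule.

    Assumption (illustrative only): a positive leaf also switches on all of
    its ancestor classes, so the vector respects the hierarchy.
    """
    vec = [0] * len(index)
    for ec in positive_ecs:
        for node in ec_ancestors(ec):
            vec[index[node]] = 1
    return vec


# Toy label space with two EC numbers (hypothetical example data).
index = build_label_index(["1.1.1.1", "3.4.21.4"])
target = encode_labels(["1.1.1.1"], index)
```

Under these assumptions, a molecule known to interact with one enzyme yields a target vector with one active path from the top-level class down to the leaf, which is the kind of structure a hierarchy-aware classifier can exploit.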

Year:  2021        PMID: 33515234     DOI: 10.1093/bioinformatics/btab054

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  1 in total

1.  Boost-RS: boosted embeddings for recommender systems and its application to enzyme-substrate interaction prediction.

Authors:  Xinmeng Li; Li-Ping Liu; Soha Hassoun
Journal:  Bioinformatics       Date:  2022-05-13       Impact factor: 6.931

