STarFish: A Stacked Ensemble Target Fishing Approach and its Application to Natural Products.

Nicholas T Cockroft, Xiaolin Cheng, James R Fuchs

Abstract

Target fishing is the process of identifying the protein target of a bioactive small molecule. Doing so experimentally requires a significant investment of time and resources; a reliable computational target fishing model can expedite the process. The development of computational target fishing models using machine learning has become very popular over the last several years because of the increased availability of large amounts of public bioactivity data. Unfortunately, the applicability and performance of such models for natural products have not yet been comprehensively assessed. This is, in part, due to the relative lack of bioactivity data available for natural products compared to synthetic compounds. Moreover, the databases commonly used to train such models do not annotate which compounds are natural products, which makes the collection of a benchmarking set difficult. To address this knowledge gap, a data set composed of natural product structures and their associated protein targets was generated by cross-referencing 20 publicly available natural product databases with the bioactivity database ChEMBL. This data set contains 5589 compound-target pairs for 1943 unique compounds and 1023 unique targets. A synthetic data set comprising 107 190 compound-target pairs for 88 728 unique compounds and 1907 unique targets was used to train k-nearest neighbors, random forest, and multilayer perceptron models. The predictive performance of each model was assessed by stratified 10-fold cross-validation and by benchmarking on the newly collected natural product data set. Strong performance was observed for each model during cross-validation, with area under the receiver operating characteristic curve (AUROC) scores ranging from 0.94 to 0.99 and Boltzmann-enhanced discrimination of receiver operating characteristic (BEDROC) scores from 0.89 to 0.94. When tested on the natural product data set, performance decreased markedly, with AUROC scores ranging from 0.70 to 0.85 and BEDROC scores from 0.43 to 0.59. However, a model stacking approach, which uses logistic regression as a meta-classifier to combine the predictions of the individual models, substantially improved the ability to correctly predict the protein targets of natural products, increasing the AUROC score to 0.94 and the BEDROC score to 0.73. This stacked model was deployed as a web application, called STarFish, and has been made available to aid in target identification for natural products.
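The two metrics reported above can be illustrated with a minimal sketch. It assumes a single ranked list of candidate targets for one query compound, with label 1 marking a true target. The BEDROC function uses the min-max normalised robust initial enhancement (RIE) form of the metric; the value `alpha=20.0`, the helper names, and the toy scores are assumptions made for illustration and are not taken from the abstract.

```python
import math


def rank_labels(scores, labels):
    """Return the 0/1 labels ordered from highest to lowest score.

    Ties keep their input order (Python's sort is stable), which is an
    arbitrary choice made for this sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def auroc(ranked):
    """AUROC of a ranked 0/1 label list (best-scored item first)."""
    n_pos = sum(ranked)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    # Count (positive, negative) pairs where the positive is ranked higher.
    negatives_below = n_neg
    correct_pairs = 0
    for label in ranked:
        if label:
            correct_pairs += negatives_below
        else:
            negatives_below -= 1
    return correct_pairs / (n_pos * n_neg)


def bedroc(ranked, alpha=20.0):
    """BEDROC of a ranked 0/1 label list; alpha=20.0 is an assumed default."""
    n_total = len(ranked)
    n_pos = sum(ranked)
    if n_pos == 0 or n_pos == n_total:
        raise ValueError("need at least one positive and one negative")
    ratio = n_pos / n_total
    # Exponentially weighted sum over the 1-based ranks of the positives.
    weighted = sum(
        math.exp(-alpha * rank / n_total)
        for rank, label in enumerate(ranked, start=1)
        if label
    )
    # Expected weight of one positive under a uniformly random ranking.
    random_weight = (1 - math.exp(-alpha)) / (
        n_total * (math.exp(alpha / n_total) - 1)
    )
    rie = weighted / (n_pos * random_weight)
    # RIE for the best and worst possible rankings, used to rescale to [0, 1].
    rie_max = (1 - math.exp(-alpha * ratio)) / (ratio * (1 - math.exp(-alpha)))
    rie_min = (1 - math.exp(alpha * ratio)) / (ratio * (1 - math.exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


# Hypothetical example: one query compound scored against ten candidate
# targets, two of which are true targets (label 1).
scores = [0.91, 0.15, 0.40, 0.88, 0.05, 0.33, 0.72, 0.10, 0.27, 0.02]
labels = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
ranked = rank_labels(scores, labels)
print(f"AUROC  = {auroc(ranked):.3f}")
print(f"BEDROC = {bedroc(ranked):.3f}")
```

AUROC weights every position in the ranking equally, whereas BEDROC rewards true targets that appear near the top of the list; a larger `alpha` concentrates more of the weight on the earliest ranks.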

Year: 2019   PMID: 31589422   PMCID: PMC7291623   DOI: 10.1021/acs.jcim.9b00489

Source DB: PubMed   Journal: J Chem Inf Model   ISSN: 1549-9596   Impact factor: 4.956


