F I Saldívar-González, V D Aldas-Bulos, J L Medina-Franco, F Plisson.
Abstract
Natural products (NPs) are primarily recognized as privileged structures to interact with protein drug targets. Their unique characteristics and structural diversity continue to marvel scientists for developing NP-inspired medicines, even though the pharmaceutical industry has largely given up. High-performance computer hardware, extensive storage, accessible software and affordable online education have democratized the use of artificial intelligence (AI) in many sectors and research areas. The last decades have introduced natural language processing and machine learning algorithms, two subfields of AI, to tackle NP drug discovery challenges and open up opportunities. In this article, we review and discuss the rational applications of AI approaches developed to assist in discovering bioactive NPs and capturing the molecular "patterns" of these privileged structures for combinatorial design or target selectivity. This journal is © The Royal Society of Chemistry.
Year: 2021 PMID: 35282622 PMCID: PMC8827052 DOI: 10.1039/d1sc04471k
Source DB: PubMed Journal: Chem Sci ISSN: 2041-6520 Impact factor: 9.825
Fig. 1 Overview of ML/AI algorithms that are implemented across the different stages of the natural product drug discovery pipeline. The pipeline presents two sections: (1) computer-assisted discovery of NPs (data-mining into traditional medicines and peer-reviewed articles, genome mining & structural elucidation and dereplication) and (2) machine learning algorithms applied to NPs (encoding into molecular representations, molecular descriptors, likeness scores, chemical space, predicting biological functions, de-orphanizing and generating de novo NP-inspired compounds).
Fig. 2 Molecular representations frequently used in NPs.
Different ML algorithms/tools used to predict molecular targets of NPs (abbreviations are defined below the table)
| Tool | Algorithm(s) | Application(s) | Ref |
|---|---|---|---|
| PASS (prediction of biological activity for substances) | NB | It predicts over 3500 pharmacotherapeutic effects, mechanisms of action, interaction with the metabolic system, and specific toxicity for drug-like molecules on the basis of their structural formula | |
| SEA (similarity ensemble approach) | Kruskal algorithm of MST | It relates proteins based on the set-wise chemical similarity among their ligands | |
| SPiDER (self-organizing map-based prediction of drug equivalence relationships) | SOMs | Useful to identify innovative compounds in chemical biology, and help investigate the potential side effects of drugs and their repurposing options | |
| TiGER (target inference GEneratoR) | Multiple SOMs | It performs qualitative predictions of up to 331 targets | |
| DEcRyPT (drug–target relationship predictor) | RF | It deconvolves phenotypic hit targets and accurately predicts affinities | |
| STarFish | kNN, RF, MLP and LoR | It considers small molecule binding to 1907 targets and its performance on natural products target prediction is explicitly considered | |
kNN: k-nearest neighbors; LoR: logistic regression; MLP: multilayer perceptron; MST: minimum spanning tree; NB: naive Bayes; RF: random forest; SOM: self-organizing map.
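
To make the similarity-based entries in the table more concrete, the following is a minimal sketch of two ideas it mentions: a set-wise chemical similarity between the ligand sets of two proteins (the principle the table attributes to SEA) and a k-nearest-neighbors vote for target prediction (one of the algorithms listed for STarFish). It is not the implementation of either tool. The fingerprints are hypothetical sets of "on" bit indices, the target names `kinase_X` and `protease_Y` are invented, and the function names and the 0.5 similarity cutoff are assumptions made for illustration. Any statistical correction that a published tool applies on top of the raw set-wise score is not reproduced here.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def setwise_similarity(ligands_a, ligands_b, threshold=0.5):
    """Raw set-vs-set score: sum of pairwise similarities at or above a cutoff.

    The cutoff value is an illustrative assumption, not a published parameter.
    """
    total = 0.0
    for x in ligands_a:
        for y in ligands_b:
            s = tanimoto(x, y)
            if s >= threshold:
                total += s
    return total


def knn_predict_target(query, labelled, k=3):
    """Majority vote over the k labelled ligands most similar to the query."""
    ranked = sorted(labelled, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(target for _, target in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (fingerprint bits, annotated target).
TOY_LIGANDS = [
    (frozenset({1, 2, 3, 4}), "kinase_X"),
    (frozenset({1, 2, 3, 5}), "kinase_X"),
    (frozenset({2, 3, 4, 6}), "kinase_X"),
    (frozenset({10, 11, 12}), "protease_Y"),
    (frozenset({10, 11, 13}), "protease_Y"),
]

# A hypothetical query compound sharing most bits with the kinase_X ligands.
query = frozenset({1, 2, 3, 6})
predicted = knn_predict_target(query, TOY_LIGANDS, k=3)

# Set-wise comparison of the two hypothetical ligand sets.
kinase_set = [fp for fp, t in TOY_LIGANDS if t == "kinase_X"]
protease_set = [fp for fp, t in TOY_LIGANDS if t == "protease_Y"]
cross_score = setwise_similarity(kinase_set, protease_set)

print(predicted, cross_score)
```

In practice the bit sets would come from a real molecular fingerprint, and the vote would typically be weighted or replaced by a trained classifier; the sketch only shows how ligand similarity can be turned into a target hypothesis.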