Zhihong Liu, Jiewen Du, Ziying Lin, Ze Li, Bingdong Liu, Zongbin Cui, Jiansong Fang, Liwei Xie.
Abstract
Various deep learning-based architectures for molecular generation have been proposed for de novo drug design. The proliferation of de novo molecular generation methods and applications has created strong demand for visualization and functional profiling of de novo generated molecules. The growing number of publicly available chemogenomic databases lays a solid foundation and creates opportunities for comprehensive profiling of de novo libraries. In this paper, we present DenovoProfiling, a webserver dedicated to de novo library visualization and functional profiling. Currently, DenovoProfiling contains six modules: (1) an identification & visualization module for visualizing chemical structures and identifying previously reported structures, (2) a chemical space module for chemical space exploration using similarity maps, principal components analysis (PCA), drug-like property distributions, and scaffold-based clustering, (3) an ADMET prediction module for predicting the ADMET properties of the de novo molecules, (4) a molecular alignment module for three-dimensional molecular shape analysis, (5) a drugs mapping module for identifying structurally similar drugs, and (6) a target & pathway module for identifying reported targets and the corresponding functional pathways. DenovoProfiling provides structural identification, chemical space exploration, drug mapping, and target & pathway information. This comprehensive annotation can give users a clear picture of their de novo library and can guide the selection of candidates for chemical synthesis and biological confirmation. DenovoProfiling is freely available at http://denovoprofiling.xielab.net.
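The similarity maps and drug mapping described in the abstract both rest on pairwise structural similarity between molecules. The abstract does not say which metric or fingerprint DenovoProfiling uses, so the following is only a minimal sketch that assumes Tanimoto similarity over fingerprint on-bits. The molecule names and bit sets are hypothetical toy data, not output of the webserver.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def similarity_matrix(fingerprints):
    """Symmetric all-vs-all similarity matrix, as plotted in a similarity heatmap."""
    n = len(fingerprints)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = tanimoto(fingerprints[i], fingerprints[j])
    return matrix


# Hypothetical toy fingerprints (sets of on-bit indices); a real pipeline
# would derive these from chemical structures with a cheminformatics toolkit.
library = {
    "mol_a": frozenset({1, 4, 7, 9}),
    "mol_b": frozenset({1, 4, 8}),
    "mol_c": frozenset({2, 3}),
}
names = list(library)
matrix = similarity_matrix([library[name] for name in names])

for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

The same pairwise function could be applied between a de novo molecule and each drug in a reference set to rank structurally similar drugs, again assuming a Tanimoto-style metric.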
Keywords: DDR1, Discoidin domain receptor 1; De novo drug design; De novo molecule library; Deep learning; FBDD, Fragment-based drug design; FDR, False discovery rate; GAN, Generative adversarial networks; HTS, High throughput screening; LSTM, Long short-term memory; Library profiling; PCA, Principal components analysis; RNN, Recurrent neural networks; SCA, Scaffold-based classification approach; VAE, Variational autoencoders
Year: 2022 PMID: 36016718 PMCID: PMC9379519 DOI: 10.1016/j.csbj.2022.07.045
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Fig. 1. The framework of the DenovoProfiling web platform.
The collected 13 ADMET datasets and deep learning-based model performance.
| ADMET dataset | Molecules | Model performance | |
|---|---|---|---|
| Caco2 Cell Permeability | 1946 | 0.92 | 0.018 |
| P-gp Inhibitors | 4418 | 0.96 | 0.008 |
| P-gp Substrates | 2100 | 0.85 | 0.021 |
| Biodegradability | 1604 | 0.91 | 0.023 |
| CYP1A2 Inhibitors | 14,903 | 0.89 | 0.006 |
| CYP3A4 Inhibitors | 18,561 | 0.88 | 0.007 |
| CYP2D6 Inhibitors | 14,741 | 0.86 | 0.015 |
| CYP2C9 Inhibitors | 14,709 | 0.88 | 0.007 |
| CYP2C19 Inhibitors | 14,576 | 0.89 | 0.008 |
| Human Liver Toxicity | 2476 | 0.94 | 0.014 |
| hERG | 9636 | 0.95 | 0.006 |
| Rat Acute Oral Toxicity | 12,170 | 0.86 | 0.021 |
| Carcinogenic Potency | 833 | 0.84 | 0.044 |
The datasets for testing the functionality of DenovoProfiling.
| Index | Dataset | Molecules | Source |
|---|---|---|---|
| 1 | Drug Dataset | 60 | Drug molecules randomly selected from DrugBank |
| 2 | Random Dataset | 500 | |
| 3 | Focused Dataset | 50 | |
Fig. 2. Structure identification and visualization of de novo library using Random Dataset.
Fig. 3. Chemical space illustration using similarity heatmap based on Drug Dataset.
Fig. 4. Chemical space illustration using principal component analysis (PCA) based on Random Dataset.
Fig. 5. Distribution of drug-like properties based on Random Dataset.
Fig. 6. Scaffold statistics of chemical scaffolds of de novo library based on Random Dataset.
Fig. 7. Grid view of the chemical scaffolds of the de novo library based on Random Dataset.
Fig. 8. ADMET prediction snapshot based on Random Dataset. A: Table view of ADMET prediction results for de novo library. B: ADMET prediction details for one molecule.
Fig. 9. Molecular alignment of the scaffold-focused de novo library based on Focused Dataset.
Fig. 10. Grid view of the drugs mapping. A: Drug Dataset results; B: Random Dataset results.
Fig. 11. Table view of drugs mapping using Random Dataset.
Fig. 12. The identified targets in ChEMBL for Drug Dataset and the compound target network.
Fig. 13. The identified targets in ChEMBL of de novo molecules and the compound target network.
Fig. 14. The enriched KEGG pathways using the identified targets in ChEMBL. A: Drug Dataset results; B: Random Dataset results.
Time cost of each module (seconds).
| Modules | Drug Dataset | Focused Dataset | Random Dataset (500 molecules) |
|---|---|---|---|
| Identification & Visualization | 8 | 6 | 10 |
| Chemical Space | 26 | 31 | 124 |
| ADMET Prediction | 61 | 58 | 87 |
| Molecular Alignment | 227 | 25 | 86 |
| Drugs Mapping | 32 | 25 | 73 |
| Target & Pathway | 24 | 18 | 19 |
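
To compare the three test datasets end to end, the per-module timings can be summed per dataset and the slowest module picked out. The short sketch below does only that arithmetic. The numbers are copied from the table above, while the variable names are illustrative and not part of DenovoProfiling.

```python
# Per-module time cost in seconds, copied from the table above.
# Tuple order: (Drug Dataset, Focused Dataset, Random Dataset).
timings = {
    "Identification & Visualization": (8, 6, 10),
    "Chemical Space": (26, 31, 124),
    "ADMET Prediction": (61, 58, 87),
    "Molecular Alignment": (227, 25, 86),
    "Drugs Mapping": (32, 25, 73),
    "Target & Pathway": (24, 18, 19),
}
datasets = ("Drug Dataset", "Focused Dataset", "Random Dataset")

# Total wall time for running all six modules on each dataset.
totals = {
    name: sum(row[i] for row in timings.values())
    for i, name in enumerate(datasets)
}

# The module that dominates the run time for each dataset.
slowest = {
    name: max(timings, key=lambda module: timings[module][i])
    for i, name in enumerate(datasets)
}

for name in datasets:
    print(f"{name}: {totals[name]} s in total, slowest module: {slowest[name]}")
```

With the tabulated values, the totals come to 378 s, 163 s, and 399 s for the Drug, Focused, and Random datasets respectively. The slowest module differs in each case: Molecular Alignment, ADMET Prediction, and Chemical Space.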