Tianbiao Yang1,2,3, Zhaojun Li4, Yingjia Chen1,2, Dan Feng1,5, Guangchao Wang4, Zunyun Fu1,6, Xiaoyu Ding1,2, Xiaoqin Tan1,2, Jihui Zhao1,2, Xiaomin Luo1,2, Kaixian Chen1,2, Hualiang Jiang1,2,3,7, Mingyue Zheng1,2.
Abstract
One of the most prominent topics in drug discovery is the efficient exploration of the vast drug-like chemical space to find synthesizable and novel chemical structures with desired biological properties. To address this challenge, we created the DrugSpaceX (https://drugspacex.simm.ac.cn/) database based on expert-defined transformations of approved drug molecules. The current version of DrugSpaceX contains more than 100 million transformed chemical products for virtual screening, with outstanding characteristics in terms of structural novelty, diversity and large three-dimensional chemical space coverage. To illustrate its practical application in drug discovery, we present a case study on discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, in which DrugSpaceX is used to quickly search for initial hit compounds. Additionally, for ligand identification and optimization purposes, DrugSpaceX provides several subsets for download, including a 10% diversity subset, an extended drug-like subset, a drug-like subset, a lead-like subset, and a fragment-like subset. In addition to chemical properties and transformation instructions, DrugSpaceX can locate the position of each transformation, which will enable medicinal chemists to easily integrate strategy planning and protection design.
Year: 2021 PMID: 33104791 PMCID: PMC7778939 DOI: 10.1093/nar/gkaa920
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Assembly procedure for chemotype library DrugSpaceX.
Figure 2. Analysis of chemical properties. (A) Comparison of the property distributions for the Drug Set (blue) and DrugSpaceX (red). Relative frequencies of the descriptors logP (upper left), molecular weight (upper right), H-bond acceptors (middle left), H-bond donors (middle right), rotatable bonds (lower left), and TPSA (lower right). (B) A principal component analysis (PCA) plot comparing the chemical space defined by the DrugSpaceX databases: all compounds (red) and the Drug Set (blue). (C) A molecular 3D shape analysis of the diversity of DrugSpaceX (red) and the Drug Set (blue) by the principal moments of inertia. (D) Distribution of SA scores for all of the products contained in DrugSpaceX (red) and the Drug Set (blue).
Figure 3. Key property distributions of different chemical libraries. (A) Molecular fingerprint-based similarity to the approved drugs, which quantifies the closest distance of chemicals to existing drugs; (B) The synthetic accessibility (SA) score calculated by an RDKit-based Python script, where a lower value indicates easier synthesis; (C) Structural diversity measured by molecular 3D shape analysis based on the principal moments of inertia (PMI), which allows the classification of molecules as rods (linear shape, e.g. propyne), discs (cyclic planar shape, e.g. benzene), or spheres (globular shape, e.g. adamantane). Considering the large size of these databases, 100 000 samples were randomly selected from each for analysis of these properties.
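The quantities behind panels (A) and (C) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes fingerprints are available as sets of on-bit indices, uses the Tanimoto coefficient as the similarity measure (the caption does not name one), and assigns a shape by the nearest corner of the normalized PMI triangle, whose corners are conventionally rod (0, 1), disc (0.5, 0.5) and sphere (1, 1). The function names are hypothetical.

```python
import math

def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def nearest_drug_similarity(query_fp, drug_fps):
    """Similarity of a compound to its closest approved drug (assumed: max Tanimoto)."""
    return max(tanimoto(query_fp, fp) for fp in drug_fps)

# Corners of the normalized PMI triangle, as (I1/I3, I2/I3) with I1 <= I2 <= I3.
PMI_CORNERS = {"rod": (0.0, 1.0), "disc": (0.5, 0.5), "sphere": (1.0, 1.0)}

def classify_shape(i1, i2, i3):
    """Label a molecule by the PMI-triangle corner nearest to its normalized ratios."""
    i1, i2, i3 = sorted((i1, i2, i3))
    if i3 <= 0:
        raise ValueError("largest principal moment must be positive")
    point = (i1 / i3, i2 / i3)
    return min(PMI_CORNERS, key=lambda name: math.dist(point, PMI_CORNERS[name]))

if __name__ == "__main__":
    # Toy fingerprints and moments, invented purely for illustration.
    drugs = [frozenset({2, 3, 4}), frozenset({1, 2, 3, 5})]
    print(nearest_drug_similarity(frozenset({1, 2, 3}), drugs))
    print(classify_shape(0.1, 9.8, 10.0))
```

In practice the principal moments and fingerprints would come from a cheminformatics toolkit. Nearest-corner assignment is only one simple way to turn the continuous PMI plot into discrete labels.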
Figure 4. Detailed view of the DrugSpaceX webpage: (A) Details of the subsets available to download; (B) Details of the search section; (C) Example of a search results page; (D) Details page for a DrugSpaceX molecule.
Figure 5. Docking-based virtual screening of DDR1 inhibitors against DrugSpaceX compounds. (A) The t-SNE projection of the compounds in Set1, including the top 10 drugs repositioned in relation to DDR1 and their first round of transformation products. The chemical structure features were encoded as an ECFP4 512-bit vector for t-SNE analysis. (B) The putative binding mode of DE209841 (cyan carbons) derived from docking simulations compared to the crystallized ligand ponatinib (salmon carbons) in DDR1 kinase (PDB code: 3ZOS). (C) The t-SNE projection of the compounds in Set2 coloured by docking scores ranging from the lowest in orange to the highest in yellow. The compound DE50204704, showing the lowest docking score, can be traced back to ponatinib in two rounds of transformation.