José L Medina-Franco, Norberto Sánchez-Cruz, Edgar López-López, Bárbara I Díaz-Eufracio.
Abstract
The concept of chemical space is a cornerstone of chemoinformatics, with broad conceptual and practical applicability in many areas of chemistry, including drug design and discovery. One of its most significant impacts is in the study of structure-property relationships, where the property can be a biological activity or any other characteristic of interest to a particular chemistry discipline. Chemical space depends strongly on the molecular representation, which is itself a cornerstone concept in computational chemistry. Herein, we discuss recent progress on chemoinformatic tools developed to expand and characterize the chemical space of compound data sets using different types of molecular representations, generate visual representations of such spaces, and explore structure-property relationships in the context of chemical spaces. We emphasize the development of methods and freely available tools focusing on drug discovery applications. We also comment on the general advantages and shortcomings of freely available, easy-to-use tools and discuss the value of such open resources for research, education, and scientific dissemination.
Keywords: Chemoinformatics; Drug discovery; Molecular representation; Open-source; Structure–activity relationships; Webserver
Year: 2021 PMID: 34143323 PMCID: PMC8211976 DOI: 10.1007/s10822-021-00399-1
Source DB: PubMed Journal: J Comput Aided Mol Des ISSN: 0920-654X Impact factor: 4.179
Fig. 1 Schematic representation of the chemical space concept as an M-dimensional descriptor space
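As a minimal sketch of the idea in Fig. 1, each molecule can be treated as a point whose coordinates are its M descriptor values, so that distances between points express how far apart molecules lie in that space. The molecule names, the choice of M = 4, and the pre-scaled descriptor values below are hypothetical and only serve as an illustration.

```python
import math

# Hypothetical, pre-scaled descriptor vectors (M = 4); values are illustrative only.
descriptor_space = {
    "mol_A": (0.20, 0.35, 0.10, 0.50),
    "mol_B": (0.25, 0.30, 0.15, 0.45),
    "mol_C": (0.90, 0.80, 0.70, 0.10),
}

def distance(u, v):
    """Euclidean distance between two points of the descriptor space."""
    return math.dist(u, v)

def nearest_neighbor(name, space):
    """Return the molecule closest to `name` in the descriptor space."""
    others = (other for other in space if other != name)
    return min(others, key=lambda other: distance(space[name], space[other]))

closest_to_a = nearest_neighbor("mol_A", descriptor_space)
```

A different set of descriptors would place the same molecules at different coordinates, which is one way to see why the chemical space depends on the molecular representation.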
Overview of advantages and disadvantages of using open tools, including web servers
| | Advantages | Disadvantages |
|---|---|---|
| Web servers | Increased accessibility; budget-friendly and cost-effective; usable by experts and non-experts in chemoinformatics; good resources for teaching and for educating beginners; convenient for distance learning (provided the servers are used correctly); contribute to multi- and transdisciplinary science | Can be used as black boxes; limited access to parameters; potential intellectual property issues; sensitive when handling proprietary data; usually limited to a relatively small number of compounds per analysis |
| Stand-alone software | No programming experience required; ready to use and apply to a research project; no sensitivity issues with proprietary data | May depend on the operating system; cost–benefit increases |
| Scripts and programs | Widely customizable; can be implemented on web servers; faster data processing | Programming experience required; support might not be easily accessible and depends on experts; the learning curve can be steep; not necessarily ready to use |
Fig. 2 The graphical user interface of D-Peptide Builder: an example of a recent free web server to generate compounds. D-Peptide Builder enumerates combinatorial peptide libraries
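Enumerating a combinatorial peptide library amounts to listing every sequence of a given length that can be assembled from a set of building blocks. The sketch below illustrates that general idea only; it is not the algorithm used by D-Peptide Builder. The function name `enumerate_peptides` and the three building blocks (lowercase one-letter codes, a common convention for D-amino acids) are assumptions made for the example.

```python
from itertools import product

def enumerate_peptides(building_blocks, length):
    """Yield every linear sequence of `length` residues drawn from `building_blocks`."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)

# Hypothetical building blocks: D-alanine, D-leucine, D-phenylalanine.
blocks = ["a", "l", "f"]
library = list(enumerate_peptides(blocks, 2))
```

The library size grows as (number of building blocks) raised to the sequence length, which is why enumeration tools typically cap the number of compounds they generate.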
Examples of freely available web servers for the interactive visualization of chemical space
| Web Server | Brief description | URL (accessed May 1, 2021) | Ref |
|---|---|---|---|
| AtlasCBS | Generates two-dimensional, dynamical representations of its contents in terms of Ligand Efficiency Indices | | |
| ChemMaps | Web server developed to navigate throughout chemical and environmental chemical space | | |
| ChemGPS-NP | ChemGPS-NP Web is a system for computing the eight principal components (dimensions) describing physical–chemical properties for a reference set of compounds | | |
| Natural Products Navigator | Visualization and navigation through the chemical space of NPs and NP-like molecules | | |
| tMAP | Visualization library for large, high-dimensional data sets | | |
| Faerun | Chemical space accessible by the PDGA with an interactive 3D map of the MXFP property space | | |
| PDB Explorer | Interactive visualization and similarity search of the RCSB Protein Data Bank in shape space | | |
| D-Peptide Builder | Enumerate chemical spaces of peptide combinatorial libraries and visualize chemical spaces | | |
| Platform for Unified Molecular Analysis | Online server to visualize the chemical space and compute the molecular diversity of your data sets | | |
Fig. 3 Visual representation of the chemical space of user-supplied chemical structures, generated with the free server Platform for Unified Molecular Analysis (PUMA). The figure shows the chemical space of two synthetic commercial libraries focused on epigenetic targets (709 compounds in total). The principal component analysis is based on six physicochemical properties of pharmaceutical interest, as described in [81]. On the free web server, the 2D or 3D plot is interactive
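A principal component analysis of a small property table, of the kind summarized in Fig. 3, can be sketched with the standard library alone. This is not the PUMA implementation. The six property columns assumed here (MW, logP, HBD, HBA, TPSA, rotatable bonds) and all numeric values are hypothetical, and power iteration with deflation stands in for the eigen-solver that a numerical library would normally provide.

```python
import math
import random

# Hypothetical property table: one row per compound, six columns
# (MW, logP, HBD, HBA, TPSA, rotatable bonds). Values are illustrative only.
properties = [
    [180.2, 1.2, 1, 3, 63.6, 3],
    [250.3, 3.1, 0, 4, 55.0, 4],
    [310.4, 2.0, 2, 5, 90.2, 5],
    [402.5, 4.8, 1, 6, 85.5, 7],
    [455.6, 3.9, 3, 8, 130.1, 9],
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def standardize(rows):
    """Scale each property (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def covariance(rows):
    """Covariance matrix of rows whose columns are already centered."""
    n = len(rows)
    cols = list(zip(*rows))
    return [[dot(ci, cj) / (n - 1) for cj in cols] for ci in cols]

def power_iteration(matrix, iterations=500, seed=0):
    """Approximate the dominant eigenpair of a symmetric matrix."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in matrix]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return eigenvalue, v

def pca_2d(rows):
    """Project rows onto their first two principal components."""
    z = standardize(rows)
    cov = covariance(z)
    val1, pc1 = power_iteration(cov, seed=0)
    # Deflation removes the first component so the next run finds the second.
    size = len(cov)
    deflated = [[cov[i][j] - val1 * pc1[i] * pc1[j] for j in range(size)]
                for i in range(size)]
    val2, pc2 = power_iteration(deflated, seed=1)
    scores = [(dot(row, pc1), dot(row, pc2)) for row in z]
    return scores, (val1, val2), (pc1, pc2)

scores, eigenvalues, components = pca_2d(properties)
```

Each entry of `scores` is the 2D coordinate of one compound, which is what a scatter plot of the chemical space would display. Standardizing first keeps large-valued properties such as molecular weight from dominating the components.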
Fig. 4 Property landscapes of compounds with activity against tubulin, using cell-based inhibition data. a Structure–Property Similarity (SPS) map of 188 tubulin inhibitors, corresponding to 17,578 pairwise comparisons. The property cliffs are displayed in the upper-right zone. Each data point is colored on a SALI value scale from green (low) to red (high); b Dual Property Difference (DPD) map of tubulin inhibitors. The dual-active compounds are displayed in the upper-right zone. Each data point is colored by a selectivity score from green (low) to red (high); c Example of a property and dual-activity cliff
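The pairwise nature of an SPS map can be made concrete with a short sketch. For every pair of compounds it computes a structural similarity, the absolute property difference, and the Structure-Activity Landscape Index, SALI = |A_i - A_j| / (1 - sim(i, j)). The compound names, fingerprints (stored as sets of on-bits), activity values, and the choice of Tanimoto similarity are assumptions for illustration; the caption does not state which representation the authors used.

```python
from itertools import combinations
from math import comb

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints stored as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def sali(act_i, act_j, similarity):
    """SALI = |A_i - A_j| / (1 - sim); infinite when the fingerprints are identical."""
    if similarity >= 1.0:
        return float("inf")
    return abs(act_i - act_j) / (1.0 - similarity)

# Hypothetical compounds: name -> (fingerprint on-bits, activity such as pIC50).
compounds = {
    "cpd_1": ({1, 2, 3, 4, 5, 6, 7, 8}, 8.2),
    "cpd_2": ({1, 2, 3, 4, 5, 6, 7, 9}, 5.1),
    "cpd_3": ({1, 2, 10, 11, 12}, 5.0),
}

# One SPS data point per unordered pair: (similarity, |activity difference|, SALI).
sps_points = {}
for (name_i, (fp_i, act_i)), (name_j, (fp_j, act_j)) in combinations(compounds.items(), 2):
    sim = tanimoto(fp_i, fp_j)
    sps_points[(name_i, name_j)] = (sim, abs(act_i - act_j), sali(act_i, act_j, sim))

# n compounds give n * (n - 1) / 2 unordered pairs.
pairs_for_188 = comb(188, 2)
```

Pairs with high similarity and a large activity difference, such as `cpd_1` and `cpd_2` here, receive a high SALI value and would fall in the upper-right zone of the map. For 188 compounds, `comb(188, 2)` gives 17,578 pairs, matching the number of comparisons quoted in the caption.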
Fig. 5 Constellation plot of compounds with activity against tubulin, using cell-based inhibition data. The plot shows 147 data points, each representing an analog series. The size of a data point indicates the relative number of compounds in the analog series, and the color indicates the average activity of the compounds in the series. Connecting lines represent molecules shared between two analog series. Figure adapted from López-López E. et al. [96]
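The data behind a constellation plot can be sketched as a small graph: one node per analog series, with a size and an average activity, and one edge for each pair of series that share at least one molecule. The series identifiers, compound identifiers, and activity values below are hypothetical. How the nodes are positioned on the plot is not covered here, and the function `constellation` is an illustrative helper rather than the authors' code.

```python
from itertools import combinations
from statistics import mean

# Hypothetical analog series: series id -> {compound id: activity (e.g. pIC50)}.
series = {
    "AS1": {"c1": 6.5, "c2": 7.0, "c3": 7.5},
    "AS2": {"c3": 7.5, "c4": 5.0},
    "AS3": {"c5": 8.0},
}

def constellation(series):
    """Build node attributes and shared-molecule edges for a constellation plot."""
    nodes = {
        sid: {"size": len(members), "mean_activity": mean(members.values())}
        for sid, members in series.items()
    }
    edges = []
    for a, b in combinations(series, 2):
        shared = series[a].keys() & series[b].keys()
        if shared:
            # A connecting line is drawn only when two series share molecules.
            edges.append((a, b, sorted(shared)))
    return nodes, edges

nodes, edges = constellation(series)
```

In this toy example `AS1` and `AS2` are linked through the shared compound `c3`, while `AS3` remains an isolated node.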