Phyo Phyo Kyaw Zin, Gavin Williams, Denis Fourches.
Abstract
We report on SIME (synthetic insight-based macrolide enumerator), a new and improved cheminformatics enumeration technology. SIME can enumerate fully assembled macrolides with synthetic feasibility by utilizing the constitutional and structural knowledge extracted from biosynthetic aspects of macrolides. The software takes into account key information such as the positions in macrolide structures at which chemical components can be inserted, and the types of structural motifs (SMs) and sugars of interest that can be synthesized and incorporated at those positions. Additionally, we report on the chemical distribution analysis of the newly SIME-generated V1B (virtual 1 billion) library of macrolides. Those compounds were built based on the core of the erythromycin structure, 13 structural motifs, and a library of sugars derived from eighteen bioactive macrolides. This new enumeration technology can be coupled with cheminformatics approaches such as QSAR modeling and molecular docking to aid drug discovery through the rational design of next-generation macrolide therapeutics with desirable pharmacokinetic properties.
Keywords: In silico chemical library software; Macrolides; PKS enumerator; Polyketides
Year: 2020 PMID: 33431002 PMCID: PMC7146965 DOI: 10.1186/s13321-020-00427-6
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Fig. 1 Example structure of the erythromycin core with possible structural motif and sugar replacement positions
Fig. 2 Structures of the 13 SMs used to generate the V1B library
Fig. 3 Structures of the seven sugars used to generate the V1B library
Fig. 4 Randomly picked example macrolide structures from V1B. The first set of digits corresponds to the file name and the second set to the row ID of the compound
Fig. 5 Distribution of molecular properties in V1B: (a) MW, molecular weight; (b) MolLogP, calculated water/octanol partition coefficient; (c) TPSA, topological polar surface area; (d) HBA, hydrogen bond acceptors; (e) HBD, hydrogen bond donors; (f) NRB, rotatable bonds
Fig. 6 Comparison between density plots of V1B and MacrolactoneDB for molecular properties: (a) MW, molecular weight; (b) MolLogP, calculated water/octanol partition coefficient; (c) TPSA, topological polar surface area; (d) HBA, hydrogen bond acceptors; (e) HBD, hydrogen bond donors; (f) NRB, rotatable bonds. Green rectangles indicate values within druglike regions based on Lipinski's and Veber's rules
Fig. 8 Graphical illustration of the first parameter in SIME: the maximal number of repeats for SMs
Fig. 9 Graphical illustration of the second parameter in SIME: the minimal number of sugars
Fig. 10 Graphical illustration of the fourth parameter in SIME: enumerate all possible stereocenters. When a stereocenter is detected at the connecting atom, as in SM011, SIME generates both R and S configurations for the joining atom at the user's request
Fig. 11 Graphical workflow of the core of SIME. Sections I, II, III and IV are explained in the implementation details of the "Methods" section
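The first two parameters captioned above (Figs. 8 and 9) can be read as filters over candidate building-block assignments. The following is a minimal sketch of that idea, not the SIME implementation: it assumes the first parameter caps how many times one SM may appear in a single macrolide, and that a sugar position left without a sugar carries a hydroxyl group. The function name, argument names, and toy labels are hypothetical.

```python
from collections import Counter
from itertools import product


def enumerate_assignments(sm_choices, n_sm_positions,
                          sugar_choices, n_sugar_positions,
                          max_sm_repeat, min_sugars):
    """Yield (sm_tuple, sugar_tuple) pairs that pass both filters.

    Hypothetical helper for illustration only; SIME's real logic may differ.
    """
    # "OH" marks a sugar position that is left as a hydroxyl group (assumption).
    sugar_options = list(sugar_choices) + ["OH"]
    for sms in product(sm_choices, repeat=n_sm_positions):
        # Filter 1: no single SM may occur more than max_sm_repeat times.
        if max(Counter(sms).values(), default=0) > max_sm_repeat:
            continue
        for sugars in product(sugar_options, repeat=n_sugar_positions):
            # Filter 2: require at least min_sugars real sugars.
            n_sugars = sum(s != "OH" for s in sugars)
            if n_sugars >= min_sugars:
                yield sms, sugars


# Toy usage with placeholder labels instead of real SMILES fragments.
combos = list(enumerate_assignments(["A", "B"], 2, ["s1"], 2,
                                    max_sm_repeat=1, min_sugars=1))
print(len(combos))
```

Tightening either parameter shrinks the enumerated set, which is one way a user could keep a library at a manageable size.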
Helper functions for SIME. The description of each function is provided to help understand the simplified workflow of the SIME algorithm shown in Fig. 11
| Helper function | Description |
|---|---|
| ENUMERATE_sugar_stereocenters (…) | Takes in sugar strings that start and end with [*R*] and returns a list of sugars with two different stereocenters for the joining carbon |
| enumerate_SM_stereocenters (…) | Takes a list of SMs. For SMs with identified stereocenters at the joining point, both R and S configurations for those SMs are generated and added to the |
| remove_SM_digits (…) | Takes a given SMILES string and locates SM points of interest indicated with [1*], [2*], etc. Returns the SMILES string with the digits removed from all SM points of interest. Input: '1[1*]234[2*]5[3*]6'. Output: '1[*]234[*]5[*]6' |
| string_splitter (…) | Splits a given string into fragments based on a provided symbol and returns a list containing the fragments. For example, input: string = '1[*]234[*]5[*]6', symbol = '[*]'; output: ['1', '[*]', '234', '[*]', '5', '[*]', '6'] |
| insert_SMs (…) | Takes in a SMILES template resulting from string_splitter and replaces the '[*]' symbols with a list of SMs |
| generate_dummy_sugar_templates (…) | Takes two parameters: the SMILES template and the minimal number of sugars in each macrolide (default is one sugar). For simplification purposes, it generates a list of all possible sugar dummies as ' |
| replace_SYMBOLsugars_with_dummies (…) | Takes two inputs: sugar_dummy_order and smile_template_with_sugar_symbols. It splits the given template at [*sugar*] positions, wherein the correct dummies ('SUGARS' and 'Full_List') are inserted |
| insert_sugars_to_dummies (…) | Takes the SMILES template with specified 'SUGARS' and 'Full_List' after *** function. It then replaces 'SUGARS' with an actual list of sugars, and 'Full_List' with the list of sugars and a hydroxyl group |
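To make the string-handling helpers in the table concrete, the sketch below re-implements three of them from their descriptions alone. It is an illustrative approximation under stated assumptions, not the SIME source code: the lower-case function names and their signatures are hypothetical, and `insert_sms` assumes that every '[*]' slot may take any SM from the supplied list.

```python
import re
from itertools import product


def remove_sm_digits(smiles):
    """Replace numbered SM markers such as [1*] or [2*] with a plain [*]."""
    return re.sub(r"\[\d+\*\]", "[*]", smiles)


def string_splitter(string, symbol):
    """Split on `symbol` but keep each occurrence as its own fragment.

    Empty fragments (for example when the string starts with `symbol`)
    are dropped; this is a choice made for the sketch.
    """
    fragments = []
    for i, part in enumerate(string.split(symbol)):
        if i > 0:
            fragments.append(symbol)
        if part:
            fragments.append(part)
    return fragments


def insert_sms(template, sms):
    """Fill every '[*]' slot of a split template with each SM combination."""
    slots = [i for i, frag in enumerate(template) if frag == "[*]"]
    results = []
    for combo in product(sms, repeat=len(slots)):
        filled = list(template)
        for slot, sm in zip(slots, combo):
            filled[slot] = sm
        results.append("".join(filled))
    return results


# Usage with the example strings from the table.
plain = remove_sm_digits("1[1*]234[2*]5[3*]6")  # '1[*]234[*]5[*]6'
parts = string_splitter(plain, "[*]")           # ['1', '[*]', '234', '[*]', '5', '[*]', '6']
print(len(insert_sms(parts, ["A", "B"])))       # 2 SMs over 3 slots gives 8 strings
```

Keeping the '[*]' markers as separate list items means a slot can be filled by index, without re-scanning the string after each substitution.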
Fig. 7 Graphical user interface of SIME