Gian Marco Ghiandoni, Michael J Bodkin, Beining Chen, Dimitar Hristozov, James E A Wallace, James Webster, Valerie J Gillet.
Abstract
Reaction-based de novo design refers to the generation of synthetically accessible molecules using transformation rules extracted from known reactions in the literature. In this context, we have previously described the extraction of reaction vectors from a reactions database and their coupling with a structure generation algorithm for the generation of novel molecules from a starting material. An issue when designing molecules from a starting material is the combinatorial explosion of possible product molecules that can be generated, especially for multistep syntheses. Here, we present the development of RENATE, a reaction-based de novo design tool, which is based on a pseudo-retrosynthetic fragmentation of a reference ligand and an inside-out approach to de novo design. The reference ligand is fragmented; each fragment is used to search for similar fragments as building blocks; the building blocks are combined into products using reaction vectors; and a synthetic route is suggested for each product molecule. The RENATE methodology is presented followed by a retrospective validation to recreate a set of approved drugs. Results show that RENATE can generate very similar or even identical structures to the corresponding input drugs, hence validating the fragmentation, search, and design heuristics implemented in the tool.
Keywords: de novo drug design; patents; pharmaceuticals; reaction informatics
Year: 2021; PMID: 34750989; PMCID: PMC9285524; DOI: 10.1002/minf.202100207
Source DB: PubMed; Journal: Mol Inform; ISSN: 1868-1743; Impact factor: 4.050
Figure 1. Pseudo‐retrosynthetic de novo design applied to the molecule Celecoxib. The bonds of the query ligand are (a) broken to yield a set of key fragments. The fragments are then used to retrieve similar structures that are (b) recombined to yield novel compounds that are similar to the query ligand.
Figure 2. Ligand fragmentation results for the molecule Celecoxib. The pyrazole is identified as the starting material (molecular scaffold) because it has more connections than the other fragments, which are therefore treated as reagents (substituents).
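The scaffold-selection rule described in this caption can be sketched as follows. This is a minimal illustration, not RENATE's implementation: the `Fragment` class, the fragment labels, and the connection counts are hypothetical and are not read from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str         # hypothetical label for the fragment
    connections: int  # number of bonds broken around this fragment


def split_scaffold_and_reagents(fragments):
    """Pick the fragment with the most connections as the scaffold.

    All remaining fragments are returned as reagents. On a tie, the
    first fragment with the highest count wins (behaviour of max()).
    """
    if not fragments:
        raise ValueError("at least one fragment is required")
    scaffold = max(fragments, key=lambda f: f.connections)
    reagents = [f for f in fragments if f is not scaffold]
    return scaffold, reagents


# Illustrative connection counts only (assumed, not taken from the paper).
celecoxib_fragments = [
    Fragment("aryl sulfonamide", 1),
    Fragment("pyrazole", 2),
    Fragment("tolyl", 1),
]

scaffold, reagents = split_scaffold_and_reagents(celecoxib_fragments)
print(scaffold.name, [r.name for r in reagents])
```

How ties between equally connected fragments are resolved is not stated in the caption; the sketch simply keeps the first one.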
Figure 3. RENATE KNIME workflow. Query molecules are fragmented, and the fragments are used to find structures similar to the scaffold (which form the starting materials) and to each of the substituents (which form lists of reagents). The fragments returned from the scaffold search are written to a temporary file, which is read by the structure generator as a starting population. Once the starting materials have been combined with the first set of reagents, the new population is scored and overwrites the temporary table. RENATE iterates through each reagent set, reading and overwriting the temporary table, until the process is complete. The final population is then rescored and written out.
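The iteration described in this caption can be sketched as a loop that repeatedly replaces a population. This is a toy sketch under stated assumptions: the temporary table is modelled as an in-memory list, `combine` stands in for the reaction-vector structure generator, `score` stands in for RENATE's scoring, and the `keep` cutoff is hypothetical. Molecules are represented as plain strings, not chemical structures.

```python
def grow_population(starting_materials, reagent_sets, combine, score, keep=2):
    """Combine the population with each reagent set in turn.

    After every step the scored candidates overwrite the previous
    population, mimicking the read/overwrite cycle on the temporary table.
    """
    population = list(starting_materials)
    for reagents in reagent_sets:
        candidates = {
            product
            for molecule in population
            for reagent in reagents
            for product in combine(molecule, reagent)
        }
        # Best score first; ties broken alphabetically for determinism.
        ranked = sorted(candidates, key=lambda p: (-score(p), p))
        population = ranked[:keep]
    # Final rescoring pass before the population is written out.
    return sorted(population, key=lambda p: (-score(p), p))


# Toy stand-ins (assumptions): a "product" is a hyphen-joined string and the
# score is the Jaccard overlap between its parts and a reference set.
REFERENCE = {"A", "x", "y"}


def toy_combine(molecule, reagent):
    return [f"{molecule}-{reagent}"]


def toy_score(product):
    parts = set(product.split("-"))
    return len(parts & REFERENCE) / len(parts | REFERENCE)


final = grow_population(["A", "B"], [["x", "z"], ["y", "w"]], toy_combine, toy_score)
print(final)
```

Pruning to a fixed number of candidates after each reagent set is one simple way to limit the combinatorial growth mentioned in the abstract; the caption does not say how RENATE bounds the population.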
Figure 4. Drugs that failed the BRICS decomposition. Potential fragmentation bonds are highlighted in bold.
Statistics from the pairwise similarities between queries and their closest reproductions from the USPD and JMC 2018 designs.
| Design | Binary Fingerprint | Min | Max | Mean | Median |
|---|---|---|---|---|---|
| USPD | RDKit‐ECFP4 | 0.19 | 1.00 | 0.62 | 0.60 |
| USPD | CDK‐ECFP4 | 0.18 | 1.00 | 0.62 | 0.61 |
| USPD | RDKit‐FCFP4 | 0.29 | 1.00 | 0.64 | 0.64 |
| USPD | CDK‐FCFP4 | 0.23 | 1.00 | 0.65 | 0.64 |
| JMC 2018 | RDKit‐ECFP4 | 0.15 | 1.00 | 0.51 | 0.48 |
| JMC 2018 | CDK‐ECFP4 | 0.12 | 1.00 | 0.50 | 0.48 |
| JMC 2018 | RDKit‐FCFP4 | 0.16 | 1.00 | 0.51 | 0.45 |
| JMC 2018 | CDK‐FCFP4 | 0.21 | 1.00 | 0.52 | 0.52 |
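Summary statistics of this kind can be computed as sketched below. The sketch assumes the Tanimoto coefficient, which is the usual choice for binary fingerprints but is not named in the table caption. The fingerprints are toy sets of on-bit indices, not real ECFP4 or FCFP4 output, and all function names are hypothetical.

```python
from statistics import mean, median


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    fp_a, fp_b = set(fp_a), set(fp_b)
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 0.0


def closest_reproduction(query_fp, design_fps):
    """Similarity of the query to its most similar designed molecule."""
    return max(tanimoto(query_fp, fp) for fp in design_fps)


def summarize(values):
    """Min, max, mean, and median, rounded to two decimals as in the table."""
    return {
        "min": round(min(values), 2),
        "max": round(max(values), 2),
        "mean": round(mean(values), 2),
        "median": round(median(values), 2),
    }


# Toy data (assumed): two query drugs and three designed molecules.
queries = {"query_1": {1, 2, 3, 4}, "query_2": {5, 6, 7, 8}}
designs = [{1, 2, 3, 4}, {5, 6, 9}, {1, 2}]

best = [closest_reproduction(fp, designs) for fp in queries.values()]
stats = summarize(best)
print(stats)
```

A maximum of 1.00 in every row of the table corresponds to at least one query whose closest reproduction has an identical fingerprint, as with `query_1` in the toy data.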
Figure 5. Examples of pairwise similarities between drugs and their closest reproductions from the USPD design, computed using RDKit‐ECFP4.
Virtual and real synthetic steps, plus original patent references, for each reproduced drug from the USPD and JMC 2018 designs.
| Design | Drug | Virtual Steps | Real Steps (Patent Reference) |
|---|---|---|---|
| USPD | Brimonidine | 1 | 3 (US3890319 A) |
| USPD | Glipizide | 2 | 2 (DE2012138) |
| USPD | Glyburide | 2 | 3 (DE1283837) |
| USPD | Levofloxacin | 1 | 7 (US4382892 A) |
| USPD | Naproxen | 1 | 8 (US3896157) |
| USPD | Rivaroxaban | 3 | 4 (US7157456B2) |
| JMC 2018 | Diclofenac | 2 | 4 (DE1793592) |