Abstract
We developed a computational method named Molecule Optimization by Reinforcement Learning and Docking (MORLD) that automatically generates and optimizes lead compounds by combining reinforcement learning and docking to develop predicted novel inhibitors. This model requires only a target protein structure and directly modifies ligand structures to obtain higher predicted binding affinity for the target protein without any other training data. Using MORLD, we were able to generate potential novel inhibitors against discoidin domain receptor 1 kinase (DDR1) in less than 2 days on a moderate computer. We also demonstrated MORLD's ability to generate predicted novel agonists for the D4 dopamine receptor (D4DR) from scratch without virtual screening on an ultra-large compound library. The free web server is available at http://morld.kaist.ac.kr.
Year: 2020 PMID: 33328504 PMCID: PMC7744578 DOI: 10.1038/s41598-020-78537-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Schematic overview of MORLD. An initial molecule is optimized by T steps of modifications (one episode) as shown in the flow chart. Through multiple episodes, MORLD learns a way of modifying molecules to create an optimized molecule having a higher docking score to the target protein.
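The episode structure in the caption (T modification steps per episode, repeated over many episodes) can be sketched as a toy loop. This is only an illustration under stated assumptions, not MORLD's implementation: the molecule is a plain string of atom symbols, a "modification" appends one atom, `toy_score` is a hypothetical stand-in for a docking score, and the learner is a simple epsilon-greedy running average rather than the reinforcement learning model used in the paper.

```python
import random

ATOM_TYPES = ["C", "N", "O", "F"]  # atom types listed in the hyperparameter table


def toy_score(molecule):
    """Hypothetical stand-in for a docking score (lower is better).

    A real run would call a docking program here; this toy simply rewards
    nitrogen and oxygen atoms so the loop has something to optimize.
    """
    return -1.0 * molecule.count("N") - 0.5 * molecule.count("O")


def run_episodes(initial, num_steps, num_episodes, epsilon=0.2, seed=0):
    """Run toy episodes and return the best (molecule, score) seen."""
    rng = random.Random(seed)
    # Running-average estimate of the score change caused by each action.
    value = {a: 0.0 for a in ATOM_TYPES}
    counts = {a: 0 for a in ATOM_TYPES}
    best_mol, best_score = initial, toy_score(initial)

    for _ in range(num_episodes):
        mol = initial  # every episode restarts from the initial molecule
        for _ in range(num_steps):  # T modification steps = one episode
            if rng.random() < epsilon:
                action = rng.choice(ATOM_TYPES)      # explore
            else:
                action = min(value, key=value.get)   # exploit: most negative change
            new_mol = mol + action                   # toy modification: append one atom
            delta = toy_score(new_mol) - toy_score(mol)
            counts[action] += 1
            value[action] += (delta - value[action]) / counts[action]
            mol = new_mol
        score = toy_score(mol)
        if score < best_score:                       # lower score is better
            best_mol, best_score = mol, score
    return best_mol, best_score
```

Calling `run_episodes("C", num_steps=20, num_episodes=50)` returns the best-scoring toy molecule found across all episodes; in a real setting the scoring call would dominate the run time.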
Figure 2. Comparison of the compounds from MORLD and from the random model. (a) Molecular properties (QuickVina 2 docking score, SA score, and QED score) of the compounds generated by MORLD and the random model. The red horizontal line indicates the molecular property of the initial molecule. (b) The number of unique compounds generated per 100 episodes (red solid line) and the mean QuickVina 2 docking score (blue solid line) of the generated compounds from MORLD (left) and the random model (right). The standard deviation of the QuickVina 2 docking scores is depicted as a blue area. (c) Tanimoto score of the generated compounds from MORLD (blue area) and the random model (orange area) against the lead compound (“ponatinib”).
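For readers unfamiliar with the Tanimoto score in panel (c), a minimal sketch follows. The caption does not say which fingerprint was used, so the example assumes each molecule is already represented as a set of "on" fingerprint bit indices; the function name `tanimoto` and the example bit sets are illustrative only.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices.

    Defined as |A intersect B| / |A union B|, ranging from 0.0 (no shared bits)
    to 1.0 (identical bit sets).
    """
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical bit sets, not real fingerprints of any compound in the paper.
lead_bits = {1, 2, 3}
generated_bits = {2, 3, 4}
similarity = tanimoto(lead_bits, generated_bits)  # 2 shared bits / 4 total bits
```

A score near 1.0 means a generated compound stays close to the lead, while lower scores indicate that the optimization has moved further away from it.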
Figure 3. Samples of the inhibitors generated by MORLD for the target DDR1. (a) Using the “Parent structure” shown in Fig. 1 of Zhavoronkov’s paper as the initial lead compound (Lead), DDR1 inhibitors were generated and three sample compounds are shown. (b) Three sample compounds generated from the initial compound ZINC12114041, which was identified by a simple virtual screening procedure for DDR1.
Docking scores calculated by four popular docking programs (AutoDock Vina, QuickVina 2, rDock, Ledock), as well as SA and QED scores for the DDR1 inhibitors.
| Compounds | Vina* (QuickVina 2) | rDock* | Ledock* | SA** | QED*** |
|---|---|---|---|---|---|
| Lead (initial) | −7.6 (−7.0) | −21.24 | −6.50 | 0.75 | 0.55 |
| L_Sample1 | −12.4 (−12.6) | −40.27 | −10.06 | 0.68 | 0.35 |
| L_Sample2 | −12.5 (−12.6) | −40.91 | −10.31 | 0.67 | 0.2 |
| L_Sample3 | −12.4 (−12.5) | −39.55 | −10.10 | 0.7 | 0.49 |
| ZINC12114041 (initial) | −10.9 (−11.0) | −30.31 | −6.90 | 0.81 | 0.77 |
| V_Sample1 | −13.1 (−13.1) | −36.27 | −8.69 | 0.69 | 0.6 |
| V_Sample2 | −13.1 (−13.1) | −36.68 | −8.27 | 0.63 | 0.49 |
| V_Sample3 | −13.1 (−13.1) | −37.09 | −8.23 | 0.75 | 0.66 |
| Ponatinib (initial) | −12.7 (−12.7) | −43.02 | −11.79 | 0.78 | 0.39 |
| P_Sample1 | −15.9 (−15.9) | −50.08 | −13.59 | 0.65 | 0.2 |
| Compound 1 (active) | | | | | |
| Compound 3 (moderate) | | | | | |
| Compound 5 (inactive) | | | | | |
Data in italics indicates compounds generated by Zhavoronkov et al.[8].
*For docking scores, lower is better. Vina, QuickVina 2, and Ledock scores are in kcal/mol.
**Synthetic accessibility.
***Quantitative Estimate of Drug-likeness.
Figure 4. Samples of the agonists generated by MORLD for the target D4DR. (a) Sample D4DR agonists generated from scratch (None). (b) Sample D4DR agonists generated from ZINC12203131, which was found by virtual screening.
Docking scores calculated by four popular docking programs (AutoDock Vina, QuickVina 2, rDock, Ledock), as well as SA and QED scores for the D4DR agonists.
| Compounds | Vina* (QuickVina 2) | rDock* | Ledock* | SA** | QED*** |
|---|---|---|---|---|---|
| None (initial) | – | – | – | – | – |
| N_Sample1 | −12.7 (−12.7) | −39.80 | −7.51 | 0.5 | 0.57 |
| N_Sample2 | −11.2 (−11.3) | −37.99 | −7.66 | 0.61 | 0.41 |
| ZINC12203131 (initial) | −10.9 (−10.8) | −36.41 | −6.99 | 0.85 | 0.59 |
| Z_Sample1 | −14.3 (−14.3) | −40.82 | −8.46 | 0.75 | 0.69 |
| Z_Sample2 | −13.8 (−13.8) | −39.38 | −8.14 | 0.76 | 0.67 |
| ZINC465129598 (active) | | | | | |
| ZINC518842964 (active) | | | | | |
| ZINC464771011 (active) | | | | | |
Data in italics indicates active inhibitors from Lyu et al.[10].
*For docking scores, lower is better. Vina, QuickVina 2, and Ledock scores are in kcal/mol.
**Synthetic accessibility.
***Quantitative estimate of drug-likeness.
Figure 5. Docking poses of ponatinib, P_Sample1, and V_Sample1. (a) The X-ray crystallographic pose of ponatinib from PDB ID: 3ZOS (native), P_Sample1 (optimized from ponatinib) docked into 3ZOS, and V_Sample1 (optimized from ZINC12114041) docked into 3ZOS. (b) Binding interactions of ponatinib, P_Sample1, and V_Sample1 with 3ZOS.
Hyperparameters in MORLD.
| Initial molecule | Target structure | Num. of steps | Num. of episodes | Atom types | Weight of SA | Weight of QED |
|---|---|---|---|---|---|---|
| Ponatinib | 3ZOS | 20 | 7000 | C, N, O, F | 1 | 1 |
| Lead | 3ZOS | 20 | 7000 | C, N, O, F | 1 | 1 |
| ZINC12114041 | 3ZOS | 20 | 7000 | C, N, O, F | 1 | 1 |
| None | 5WIU | 48 | 15,000 | C, N, O | 1 | 1 |
| ZINC12203131 | 5WIU | 20 | 7000 | C, N, O, F | 1 | 1 |
| E7449 frag | 4R6E | 24 | 20,000 | C, N, O, F | 1 | 1 |
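The hyperparameter table can be captured as a small configuration structure, which also makes the size of each run easy to compare. The sketch below is illustrative only: the class `RunConfig`, its field names, and the `total_steps` property are assumptions made for this example and are not part of the MORLD code or web server. The values are copied from the table above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RunConfig:
    """One row of the hyperparameter table (field names are hypothetical)."""
    initial_molecule: Optional[str]  # None means generation from scratch
    target_structure: str            # PDB ID of the target protein structure
    num_steps: int                   # modification steps per episode
    num_episodes: int
    atom_types: Tuple[str, ...]
    sa_weight: float = 1.0
    qed_weight: float = 1.0

    @property
    def total_steps(self) -> int:
        """Total modification steps over the whole run (steps x episodes)."""
        return self.num_steps * self.num_episodes


CNOF = ("C", "N", "O", "F")
RUNS = [
    RunConfig("Ponatinib", "3ZOS", 20, 7000, CNOF),
    RunConfig("Lead", "3ZOS", 20, 7000, CNOF),
    RunConfig("ZINC12114041", "3ZOS", 20, 7000, CNOF),
    RunConfig(None, "5WIU", 48, 15000, ("C", "N", "O")),
    RunConfig("ZINC12203131", "5WIU", 20, 7000, CNOF),
    RunConfig("E7449 frag", "4R6E", 24, 20000, CNOF),
]
```

Here `RUNS[3].total_steps` evaluates to 720000 for the from-scratch run against 5WIU, compared with 140000 for each of the 3ZOS runs, which reflects the longer episodes and larger episode count listed for building a molecule without a starting structure.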