Xiufeng Yang1, Jinzhe Zhang2, Kazuki Yoshizoe3, Kei Terayama1, Koji Tsuda1,3,4.
Abstract
Automatic design of organic materials requires black-box optimization in a vast chemical space. In conventional molecular design algorithms, a molecule is built as a combination of predetermined fragments. Recently, deep neural network models such as variational autoencoders and recurrent neural networks (RNNs) have been shown to be effective in de novo design of molecules without any predetermined fragments. This paper presents a novel Python library, ChemTS, that explores the chemical space by combining Monte Carlo tree search and an RNN. In a benchmarking problem of optimizing the octanol-water partition coefficient and synthesizability, our algorithm showed superior efficiency in finding high-scoring molecules. ChemTS is available at https://github.com/tsudalab/ChemTS.
Keywords: 404 Materials informatics / Genomics; 60 New topics/Others; Molecular design; Monte Carlo tree search; python library; recurrent neural network
Year: 2017 PMID: 29435094 PMCID: PMC5801530 DOI: 10.1080/14686996.2017.1401424
Source DB: PubMed Journal: Sci Technol Adv Mater ISSN: 1468-6996 Impact factor: 8.090
Figure 1. Monte Carlo tree search. (a) Selection step: the search tree is traversed from the root to a leaf by choosing, at each level, the child with the largest UCB score. (b) Expansion step: 30 child nodes are created by sampling from the RNN. (c) Simulation step: paths to terminal nodes are created by the rollout procedure using the RNN, and the rewards of the corresponding molecules are computed. (d) Backpropagation step: the internal parameters of upstream nodes are updated.
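The four steps in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ChemTS implementation: the trained RNN is replaced by a uniform random sampler (`sample_next`), the reward is a toy function rather than the paper's score J, and the alphabet, `MAX_LEN`, the exploration constant `c`, and all function names are hypothetical. The sketch draws 30 samples per expansion and keeps one child per distinct symbol, which is an assumption about how the 30 samples are used.

```python
import math
import random

END = "\n"                        # hypothetical end-of-string symbol
ALPHABET = ["C", "N", "O", END]   # toy alphabet, not real SMILES grammar
MAX_LEN = 8                       # hypothetical cap on string length


class Node:
    def __init__(self, symbol=None, parent=None):
        self.symbol = symbol
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def prefix(self):
        """Symbols on the path from the root to this node."""
        symbols, node = [], self
        while node.parent is not None:
            symbols.append(node.symbol)
            node = node.parent
        return symbols[::-1]

    def is_terminal(self):
        p = self.prefix()
        return len(p) >= MAX_LEN or (len(p) > 0 and p[-1] == END)


def sample_next(prefix, rng):
    """Stand-in for the RNN: a uniform draw that ignores the prefix."""
    return rng.choice(ALPHABET)


def ucb(child, c=1.0):
    """UCB1 score; unvisited children are tried first."""
    if child.visits == 0:
        return math.inf
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(2 * math.log(child.parent.visits) / child.visits)


def select(root):
    """(a) Walk from the root to a leaf, taking the best-UCB child each time."""
    node = root
    while node.children:
        node = max(node.children, key=ucb)
    return node


def expand(node, rng, n_samples=30):
    """(b) Draw n_samples next symbols and add one child per distinct symbol."""
    for symbol in sorted({sample_next(node.prefix(), rng) for _ in range(n_samples)}):
        node.children.append(Node(symbol, node))


def rollout(prefix, rng):
    """(c) Complete a prefix symbol by symbol until END or MAX_LEN."""
    symbols = list(prefix)
    while len(symbols) < MAX_LEN and (not symbols or symbols[-1] != END):
        symbols.append(sample_next(symbols, rng))
    return symbols


def reward(symbols):
    """Toy reward in [0, 1]: fraction of positions holding 'N'."""
    return symbols.count("N") / MAX_LEN


def backpropagate(node, r):
    """(d) Update visit counts and reward sums on the path back to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += r
        node = node.parent


def search(iterations, seed=0):
    rng = random.Random(seed)
    root = Node()
    best_score, best_symbols = -math.inf, None
    for _ in range(iterations):
        leaf = select(root)
        if leaf.is_terminal():
            targets = [leaf]          # nothing to expand; re-score the leaf
        else:
            expand(leaf, rng)
            targets = leaf.children
        for node in targets:
            symbols = rollout(node.prefix(), rng)
            r = reward(symbols)
            backpropagate(node, r)
            if r > best_score:
                best_score, best_symbols = r, symbols
    return root, best_score, best_symbols
```

Calling `search(50)` returns the root of the tree together with the best toy score and the string that produced it. In an actual application, `sample_next` would draw from the trained RNN's next-symbol distribution and `reward` would score the decoded molecule.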
Maximum score J achieved by different molecular generation methods at time points 2, 4, 6, and 8 h.
| Method | 2 h | 4 h | 6 h | 8 h | Molecules/min |
|---|---|---|---|---|---|
| ChemTS | | | | | |
| RNN+BO | | | | | |
| Only RNN | | | | | |
| CVAE+BO | | | | | |
| GVAE+BO | | | | | |
Notes: The rightmost column shows the number of generated molecules per minute. The average values and standard deviations over 10 trials are shown.
Figure 2. The 20 best molecules found by ChemTS. Blue parts of the SMILES strings indicate prefixes built in the search tree; the remaining parts are generated by the rollout procedure.