Frits Daeyaert1, Fengdan Ye2, Michael W Deem3,2.
Abstract
We report a machine-learning strategy for design of organic structure directing agents (OSDAs) for zeolite beta. We use machine learning to replace a computationally expensive molecular dynamics evaluation of the stabilization energy of the OSDA inside zeolite beta with a neural network prediction. We train the neural network on 4,781 candidate OSDAs, spanning a range of stabilization energies. We find that the stabilization energies predicted by the neural network are highly correlated with the molecular dynamics computations. We further find that the evolutionary design algorithm samples the space of chemically feasible OSDAs thoroughly. In total, we find 469 OSDAs with verified stabilization energies below -17 kJ/(mol Si), comparable to or better than known OSDAs for zeolite beta, and greatly expanding our previous list of 152 such predicted OSDAs. We expect that these OSDAs will lead to syntheses of zeolite beta.
Keywords: OSDA; machine learning; neural network; zeolite beta
Year: 2019 PMID: 30733290 PMCID: PMC6397530 DOI: 10.1073/pnas.1818763116
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Top two sets of hyperparameters selected from models 1–4
| Model | | | Number of intensities | | Total number of weights | | | | |
| 1a | 24 | 0.500 | 49 | 5 | 256 | 1.52 (0.03) | 1.79 (0.07) | 1.45 | 1.41 |
| 1b | 8 | 0.500 | 17 | 8 | 153 | 1.59 (0.02) | 1.75 (0.06) | 1.52 | 1.47 |
| 2a | 24 | 0.500 | 49 | 4 | 205 | 1.66 (0.04) | 1.83 (0.08) | 1.50 | 1.65 |
| 2b | 8 | 0.500 | 17 | 8 | 153 | 1.68 (0.02) | 1.84 (0.07) | 1.59 | 1.59 |
| 3a | 8 | 0.500 | 17 | 2 | 39 | 1.61 (0.07) | 1.68 (0.14) | 1.50 | 1.64 |
| 3b | 32 | 0.500 | 65 | 1 | 68 | 1.55 (0.04) | 1.75 (0.13) | 1.55 | 1.68 |
| 4a | 32 | 0.500 | 65 | 5 | 336 | 1.90 (0.05) | 1.92 (0.07) | 1.87 | 1.87 |
| 4b | 24 | 0.250 | 97 | 2 | 199 | 1.91 (0.05) | 1.95 (0.09) | 1.88 | 1.89 |
The values in parentheses are the corresponding SDs.
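The weight totals listed in the hyperparameter table can be reproduced with simple arithmetic. The sketch below assumes a fully connected network with a single hidden layer, one bias per hidden and output node, and a single output. It further assumes that the third and fourth numeric entries of each row are the number of inputs (intensities) and the number of hidden nodes. These are inferences from the numbers rather than statements in the text, and `count_weights`, `ROWS`, and `check_rows` are illustrative names.

```python
def count_weights(n_inputs: int, n_hidden: int, n_outputs: int = 1) -> int:
    """Count weights and biases of a one-hidden-layer, fully connected network.

    Assumption: every hidden node and every output node has one bias term.
    """
    hidden_layer = (n_inputs + 1) * n_hidden    # input weights plus one bias per hidden node
    output_layer = (n_hidden + 1) * n_outputs   # hidden weights plus one bias per output node
    return hidden_layer + output_layer


# (model, assumed number of inputs, assumed number of hidden nodes, total listed in the table)
ROWS = [
    ("1a", 49, 5, 256),
    ("1b", 17, 8, 153),
    ("2a", 49, 4, 205),
    ("2b", 17, 8, 153),
    ("3a", 17, 2, 39),
    ("3b", 65, 1, 68),
    ("4a", 65, 5, 336),
    ("4b", 97, 2, 199),
]


def check_rows(rows=ROWS) -> dict:
    """Return, per model, whether the computed total matches the listed total."""
    return {model: count_weights(n_in, n_hid) == total
            for model, n_in, n_hid, total in rows}


if __name__ == "__main__":
    for model, matches in check_rows().items():
        print(model, "matches" if matches else "differs")
```

Under these assumptions all eight rows agree with the listed totals, which supports reading the table as describing small single-hidden-layer networks.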
Fig. 1. Scatter plots of MD-calculated versus ML-predicted stabilization energies for the OSDAs in the validation set for the eight models (A–H). Models 1a and 1b were trained on all compounds without weighting. Models 2a and 2b were trained on all compounds with weighting. Compared with models 1a and 1b, models 2a and 2b predict better for OSDAs with MD-calculated energies below −15 kJ/(mol Si). Models 3a and 3b were trained on charged compounds only, without weighting. No charged OSDAs have an MD-calculated energy below −17.5 kJ/(mol Si), which limited the ability of the neural network to find favorable OSDAs. Models 4a and 4b used a linear activation function in the output node.
Fig. 2. Results for OSDA design using model 1b. (A) The top five molecules produced. The molecule scores in this figure are the ML-determined binding energies in kJ/(mol Si). (B) Proposed synthesis route to the first molecule in the output shown in A. The outcome of the synthesis route is listed together with the acronym of the reaction used (ALKYLATENP), as well as the structures and catalog names of the proposed reagents.
Best OSDA found with its ML-predicted and MD-calculated stabilization energy, number of compounds with an ML-predicted stabilization energy below −15 kJ/(mol Si), the total number of molecules for which the stabilization energy was predicted, and the total number of unique molecules generated in each run
The number of compounds with ML-predicted energies below −15 kJ/(mol Si); the number of compounds with ML-predicted energies between −15 and −14 kJ/(mol Si) and, among these, the number with MD-calculated energies below −17 kJ/(mol Si); the number of TP; and the prediction precision for the eight in silico materials design runs
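| Model | ML-predicted E ≤ −15 (number of MD energies) | −15 < ML-predicted E ≤ −14 (number with MD-calculated E < −17, %) | TP (precision) |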
| 1a | 1,058 (1,054) | 839 (32, 3.8%) | 812 (76.7%) |
| 1b | 1,179 (1,177) | 625 (6, 0.9%) | 865 (73.4%) |
| 2a | 836 (832) | 696 (33, 4.7%) | 690 (82.5%) |
| 2b | 910 (908) | 550 (14, 2.5%) | 672 (73.8%) |
| 3a | 1,857 (1,840) | 915 (60, 6.6%) | 727 (39.1%) |
| 3b | 1,280 (1,280) | 1,204 (104, 8.6%) | 660 (51.6%) |
| 4a | 712 (695) | 827 (34, 4.1%) | 538 (75.6%) |
| 4b | 599 (599) | 805 (57, 7.1%) | 484 (80.8%) |
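In the TP column, the value in parentheses is the prediction precision, defined as TP/(number with ML-predicted energy ≤ −15) ≡ TP/(TP + FP), where FP is false positive and TP is true positive.
In the column of compounds with ML-predicted energies ≤ −15, the value in parentheses is the number of MD energies, as some MD evaluations failed.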
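The precision values in the table follow directly from the footnote definition TP/(TP + FP), where TP + FP is the number of compounds with ML-predicted energy ≤ −15 kJ/(mol Si). A minimal sketch that recomputes them from the tabulated counts is given here; the names `RUNS`, `precision`, and `precision_table` are illustrative and not from the source.

```python
# model -> (number with ML-predicted energy <= -15, number of true positives),
# both copied from the table above
RUNS = {
    "1a": (1058, 812),
    "1b": (1179, 865),
    "2a": (836, 690),
    "2b": (910, 672),
    "3a": (1857, 727),
    "3b": (1280, 660),
    "4a": (712, 538),
    "4b": (599, 484),
}


def precision(tp: int, n_predicted_positive: int) -> float:
    """Precision = TP / (TP + FP), with TP + FP the number of predicted positives."""
    return tp / n_predicted_positive


def precision_table(runs=RUNS) -> dict:
    """Return precision per model as a percentage rounded to one decimal."""
    return {model: round(100 * precision(tp, n), 1)
            for model, (n, tp) in runs.items()}


if __name__ == "__main__":
    for model, pct in precision_table().items():
        print(f"{model}: {pct:.1f}%")
```

The recomputed percentages use the full count of predicted positives as the denominator, not the smaller count of completed MD evaluations.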
Cross-section of the putative OSDAs generated in different runs with ML-predicted stabilization energies E ≤ −15 kJ/(mol Si)
| Run | 1a | 1b | 2a | 2b | 3a | 3b | 4a | 4b | In training set |
| 1a | 1,058 | 749 | 630 | 560 | 477 | 452 | 497 | 453 | 13 |
| 1b | | 1,179 | 585 | 691 | 445 | 446 | 402 | 384 | 10 |
| 2a | | | 836 | 565 | 386 | 374 | 419 | 435 | 11 |
| 2b | | | | 910 | 320 | 312 | 339 | 328 | 7 |
| 3a | | | | | 1,857 | 1,051 | 322 | 254 | 21 |
| 3b | | | | | | 1,280 | 354 | 311 | 12 |
| 4a | | | | | | | 712 | 386 | 17 |
| 4b | | | | | | | | 599 | 11 |
| Total unique molecules: 3,062 | | | | | | | | | |
Column 10 lists the number of molecules generated in one run that are present in the training or validation set.
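An overlap table of this kind can be built with plain set operations once each run's molecules are reduced to a unique identifier. The sketch below is an illustration only: it assumes each molecule is represented by a canonical string (for example a canonical SMILES), and the function name `overlap_table` and the toy data are hypothetical, not taken from the source.

```python
from itertools import combinations_with_replacement


def overlap_table(runs: dict[str, set[str]], reference: set[str]):
    """Summarize how molecule sets from different runs overlap.

    runs:      run name -> set of molecule identifiers (assumed canonical strings)
    reference: identifiers in the training/validation set
    Returns (pairwise counts, per-run counts found in reference, total unique molecules).
    """
    names = list(runs)
    # Upper triangle including the diagonal: (a, a) is the size of run a itself.
    pairwise = {(a, b): len(runs[a] & runs[b])
                for a, b in combinations_with_replacement(names, 2)}
    in_reference = {name: len(runs[name] & reference) for name in names}
    total_unique = len(set().union(*runs.values()))
    return pairwise, in_reference, total_unique


if __name__ == "__main__":
    # Toy identifiers, purely hypothetical.
    toy_runs = {"A": {"C", "CC", "CCC"}, "B": {"CC", "CCC", "CCCC"}}
    toy_reference = {"C"}
    pairwise, in_reference, total_unique = overlap_table(toy_runs, toy_reference)
    print(pairwise)       # counts for (A, A), (A, B), (B, B)
    print(in_reference)   # molecules per run already in the reference set
    print(total_unique)   # size of the union over all runs
```

The diagonal entries correspond to the per-run totals, the off-diagonal entries to molecules shared by two runs, and the union size to the "Total unique molecules" row.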