Sanggil Park, Herim Han, Hyungjun Kim, Sunghwan Choi.
Abstract
Machine learning (ML) approaches have enabled rapid and efficient prediction of molecular properties as well as the design of novel materials. Beyond their success on molecular problems, ML techniques are being applied to chemical reaction problems that are costly to solve with existing experimental and simulation methods. In this review, starting with basic representations of chemical reactions, we summarize recent achievements of ML studies on two different problems: predicting reaction properties and predicting synthetic routes. Various ML models are used to predict physical properties of chemical reactions (e.g., thermodynamic changes, activation barriers, and reaction rates). ML approaches also tackle reactivity prediction, self‐optimization of reaction conditions, and the design of retrosynthetic reaction paths. Herein we illustrate the ML strategies used in these different contexts of chemical reaction studies.
Keywords: Chemical reaction; Machine Learning; Reaction rate; Reactivity; Retrosynthesis
Year: 2022 PMID: 35471772 PMCID: PMC9401034 DOI: 10.1002/asia.202200203
Source DB: PubMed Journal: Chem Asian J ISSN: 1861-471X
Figure 1. Examples of descriptors for chemical reactions. The substructure‐based descriptors represent changes of substructures between the reactant and product. (top) The reaction‐SMILES denotes the three parts of a reaction (reactants, agents, and products) as a single code. (middle) The graphical representation contains node and edge features for atom and bond information, respectively. R1–4 represent differences or concatenations of node features from the reactant and product structures. (bottom)
Figure 2. Schematic representations of two different ways to apply a machine learning (ML) model to chemical reaction problems. (a) Reaction properties are computed from the chemical properties predicted by ML. (b) An ML model directly predicts reaction properties from the chemical reaction itself.
Figure 3. An illustration of the reaction feature construction for drug‐target interaction (DTI). Protein and chemical features are obtained from the protein sequence and the fingerprint, respectively. The machine learning model is employed to find the relationship between the reaction feature and the corresponding DTI value.
Figure 4. An illustration of self‐optimization of reaction conditions.
Figure 5. An illustration of the categories of ML‐based retrosynthetic predictions discussed in the retrosynthesis part (Section 4.3).
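The reaction‐SMILES descriptor in the Figure 1 caption packs reactants, agents, and products into a single string. A minimal sketch of how such a string can be split into its three parts is given below, assuming the common `reactants>agents>products` layout with `.` separating molecules. The function name `split_reaction_smiles`, the `ReactionParts` type, and the esterification example are illustrative and not taken from the review.

```python
from typing import List, NamedTuple


class ReactionParts(NamedTuple):
    """The three fields of a reaction SMILES, each as a list of molecule SMILES."""
    reactants: List[str]
    agents: List[str]
    products: List[str]


def split_reaction_smiles(rxn: str) -> ReactionParts:
    """Split 'reactants>agents>products' into its three parts.

    Assumption: '.' separates molecules within a field. Real data sets may also
    use '.' inside a single species (e.g. salts), which this sketch ignores.
    """
    fields = rxn.split(">")
    if len(fields) != 3:
        raise ValueError(f"expected exactly two '>' separators, got: {rxn!r}")

    def molecules(field: str) -> List[str]:
        # An empty field (e.g. no agents) yields an empty list.
        return [m for m in field.split(".") if m]

    return ReactionParts(*(molecules(f) for f in fields))


# Hypothetical example: acid-catalysed esterification of acetic acid with ethanol.
example = "CC(=O)O.OCC>[H+]>CC(=O)OCC.O"
parts = split_reaction_smiles(example)
print(parts.reactants, parts.agents, parts.products)
```

This only handles the string layout; a cheminformatics toolkit would be needed to validate or canonicalize the individual molecules.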
Top‐n accuracy of retrosynthesis prediction on the USPTO‐50k database when reaction types are unknown, together with the machine learning techniques used by each method.
| Methods | Top‐1 accuracy [%] | Top‐3 accuracy [%] | Top‐5 accuracy [%] | Top‐10 accuracy [%] | Prioritization of templates | Synthon Completion | Encoder‐decoder | Monte Carlo Tree Search |
|---|---|---|---|---|---|---|---|---|
| AutoSynRoute | 43.1 | 64.6 | 71.8 | 78.7 | | | ✔ | ✔ |
| SCROP | 43.7 | 60.0 | 65.2 | 68.7 | | | ✔ | |
| GET | 44.9 | 58.8 | 62.4 | 65.9 | | | ✔ | |
| Tied Transformer | 47.1 | 67.2 | 73.5 | 78.5 | | | ✔ | |
| Graph2SMILES (D‐GAT) | 51.2 | 66.3 | 70.4 | 73.9 | | | ✔ | |
| Graph2SMILES (D‐GCN) | 52.9 | 66.5 | 70.0 | 72.9 | | | ✔ | |
| MEGAN | 48.1 | 70.7 | 78.4 | 86.1 | | | ✔ | |
| G2Gs | 48.9 | 67.6 | 72.5 | 75.5 | | ✔ | | |
| RetroXpert | 50.4 | 61.1 | 62.3 | 63.4 | | ✔ | | |
| GTA | 51.1 | 67.6 | 74.8 | 81.6 | | | ✔ | |
| RetroPrime | 51.4 | 70.8 | 74.0 | 76.1 | | ✔ | ✔ | |
| GLN | 52.5 | 69.0 | 75.6 | 83.7 | ✔ | | | |
| Aug. Transformer | 53.2 | – | 80.5 | 85.2 | | | ✔ | |
| LocalRetro | 53.4 | 77.5 | 85.9 | 92.4 | ✔ | | | |
| GraphRetro | 53.7 | 68.3 | 72.2 | 75.5 | | ✔ | | |
| Chemformer | 54.3 | – | 62.3 | 63.0 | | | ✔ | |
| EBM (Dual‐TB) | 55.2 | 74.6 | 80.5 | 86.9 | ✔ | | | |
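
The accuracies in the table above count a target as solved when the recorded reactants appear among a model's n highest‐ranked suggestions. The sketch below shows that calculation on toy data, assuming predictions and ground truth are compared as exact strings (in practice they would typically be canonicalized SMILES first). The function name `top_k_accuracy` and the placeholder labels are hypothetical and do not come from any of the listed methods.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    truths: Sequence[str],
    k: int,
) -> float:
    """Percentage of targets whose true answer is among the first k predictions.

    ranked_predictions[i] is the model's candidate list for target i, best first.
    Assumption: candidates and truths are directly comparable strings.
    """
    if len(ranked_predictions) != len(truths):
        raise ValueError("need exactly one prediction list per ground-truth entry")
    if not truths:
        raise ValueError("cannot compute accuracy on an empty test set")
    hits = sum(
        truth in candidates[:k]
        for candidates, truth in zip(ranked_predictions, truths)
    )
    return 100.0 * hits / len(truths)


# Toy data: four targets with three ranked candidates each (placeholder labels).
predictions = [
    ["A", "B", "C"],  # truth "A" is ranked 1st
    ["X", "Y", "Z"],  # truth "Y" is ranked 2nd
    ["P", "Q", "R"],  # truth "R" is ranked 3rd
    ["M", "N", "O"],  # truth "Z" is not predicted at all
]
truths = ["A", "Y", "R", "Z"]

top1 = top_k_accuracy(predictions, truths, k=1)
top3 = top_k_accuracy(predictions, truths, k=3)
print(f"top-1: {top1:.1f}%  top-3: {top3:.1f}%")
```

Because a larger k can only add hits, top‐n accuracy never decreases as n grows, which matches the left‐to‐right trend in every row of the table.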