Yi Luo1, Saientan Bag2, Orysia Zaremba3, Adrian Cierpka4, Jacopo Andreo3, Stefan Wuttke3,5, Pascal Friederich2,4, Manuel Tsotsalas1,6.
Abstract
Despite rapid progress in the field of metal-organic frameworks (MOFs), the potential of using machine learning (ML) methods to predict MOF synthesis parameters is still untapped. Here, we show how ML can be used for rationalization and acceleration of the MOF discovery process by directly predicting the synthesis conditions of a MOF based on its crystal structure. Our approach is based on: i) establishing the first MOF synthesis database via automatic extraction of synthesis parameters from the literature, ii) training and optimizing ML models by employing the MOF database, and iii) predicting the synthesis conditions for new MOF structures. The ML models, even at an initial stage, exhibit good prediction performance, outperforming human expert predictions obtained through a synthesis survey. The automated synthesis prediction is available via a web tool at https://mof-synthesis.aimat.science.
Keywords: Data Mining; Machine Learning; Metal-Organic Frameworks; Microporous Materials; Synthesis Prediction
Year: 2022; PMID: 35104033; PMCID: PMC9310626; DOI: 10.1002/anie.202200242
Source DB: PubMed; Journal: Angew Chem Int Ed Engl; ISSN: 1433-7851; Impact factor: 16.823
Figure 1. A new approach to MOF synthesis. The conventional approach (left loop) to synthesizing a new MOF relies on time-consuming trial and error: a target MOF structure is compared with reported MOFs from the literature to find similar synthesis conditions, which are then refined experimentally. In the data-driven approach (right loop), an ML model is trained on a library of automatically extracted literature data and then suggests synthesis conditions in a data-driven MOF discovery cycle. Updating the ML model based on new experiments leads to continuous improvement of the predictions.
Figure 2. SynMOF database. a) Data mining pipeline and content of the SynMOF database; b) statistics on the most common metal sources; c) structures and occurrences of the most common linkers in the SynMOF database; d) 3D graph showing the correlation between solvent type, additive, and temperature.
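
To make the kind of statistics in panel b) concrete, the following is a minimal sketch of how one extracted synthesis entry might be stored and how metal sources could be counted. The record layout, the field names, and the three rows are assumptions made purely for illustration; they are not the SynMOF schema and not entries from the database.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SynthesisRecord:
    """Hypothetical record layout; field names are assumptions, not the SynMOF schema."""
    metal_source: str
    linker_smiles: str
    solvent: str
    additive: Optional[str]  # e.g. "acid", "base", or None for no additive
    temperature_c: float
    time_h: float


def most_common_metal_sources(
    records: List[SynthesisRecord], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Count how often each metal source occurs and return the top_n entries."""
    return Counter(r.metal_source for r in records).most_common(top_n)


# Placeholder rows for illustration only (not taken from the database).
records = [
    SynthesisRecord("Zn(NO3)2", "OC(=O)c1ccc(cc1)C(O)=O", "DMF", None, 100.0, 24.0),
    SynthesisRecord("Zn(NO3)2", "OC(=O)c1ccc(cc1)C(O)=O", "DEF", None, 90.0, 48.0),
    SynthesisRecord("Cu(NO3)2", "OC(=O)c1cc(cc(c1)C(O)=O)C(O)=O", "ethanol", None, 120.0, 12.0),
]

print(most_common_metal_sources(records))
```

The same counting pattern would apply to linkers or solvents by swapping the attribute used in the generator expression.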
Figure 3. Machine learning models trained on the SynMOF-A database. a) ML workflow, including the fingerprint representation of the linkers and the feature representation of the metal type and oxidation state; b) and c) comparison of ML predictions of temperature and time for the training and test sets with the original data extracted from the literature; d) learning curve of temperature predictions, i.e., mean absolute error as a function of training set size, for neural network and random forest regression models; e) ML solvent prediction accuracy for a subset of single-solvent MOFs, compared with different methods of random prediction; f) training and test set performance of additive classification, where A, B, and N correspond to acid, base, and no additive, respectively; g) average of eleven human expert predictions of temperature and time for 50 MOFs, used to evaluate the complexity of the problem.
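
Two of the metrics named in this caption can be sketched in a few lines: the mean absolute error used for the temperature learning curve (panel d), and the expected accuracy of random guessing used as a reference for solvent classification (panel e). The caption does not say which random-prediction methods were used, so the three baselines below (uniform guessing, guessing in proportion to label frequency, and always guessing the most frequent label) are assumptions chosen as common reference points. The function names and the toy inputs are likewise illustrative.

```python
from collections import Counter
from typing import Dict, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def random_baseline_accuracies(labels: Sequence[str]) -> Dict[str, float]:
    """Expected accuracy of three assumed random-guessing strategies."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    freqs = [c / len(labels) for c in counts.values()]
    return {
        # Pick any of the K observed classes with equal probability: 1/K.
        "uniform": 1.0 / len(counts),
        # Pick class i with probability p_i: expected accuracy is sum of p_i^2.
        "frequency_weighted": sum(f * f for f in freqs),
        # Always pick the most frequent class: max p_i.
        "majority_class": max(freqs),
    }


# Toy values for illustration only.
print(mean_absolute_error([100.0, 120.0, 80.0], [110.0, 100.0, 80.0]))
print(random_baseline_accuracies(["DMF", "DMF", "water", "ethanol"]))
```

A classifier is only informative to the extent that its test accuracy exceeds the strongest of these baselines, which for imbalanced label sets is usually the majority-class one.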