Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure-Property Relationships.

Jon Paul Janet, Heather J Kulik.

Abstract

Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry, where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships between heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard autocorrelation functions (ACs) to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate that standard ACs outperform other presently available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting, in comparison to 15-20× higher errors for feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4-5× smaller than the full RAC set produce sub- to 1 kcal/mol spin-splitting MUEs, with good transferability to metal-ligand bond length prediction (0.004-0.005 Å MUE) and redox potential on a smaller data set (0.2-0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths.
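
As an illustration of the kind of descriptor the abstract refers to, the following is a minimal sketch of a standard graph autocorrelation, not the authors' RAC implementation. It assumes the common product form, in which the depth-d feature sums the product of an atomic property over all ordered atom pairs separated by exactly d bonds. The function names, the toy water graph, and the choice to count ordered pairs are assumptions made for this example only.

```python
from collections import deque


def graph_distances(adjacency, start):
    """Breadth-first search: bond-path distance from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def autocorrelation(adjacency, prop, max_depth):
    """Return [AC_0, ..., AC_max_depth].

    AC_d sums prop[i] * prop[j] over ordered atom pairs (i, j) that are
    exactly d bonds apart (assumed convention; others count unordered pairs).
    """
    ac = [0.0] * (max_depth + 1)
    for i in adjacency:
        for j, d in graph_distances(adjacency, i).items():
            if d <= max_depth:
                ac[d] += prop[i] * prop[j]
    return ac


# Hypothetical toy input: water, with atom 0 (O) bonded to atoms 1 and 2 (H).
adjacency = {0: [1, 2], 1: [0], 2: [0]}
electronegativity = {0: 3.44, 1: 2.20, 2: 2.20}  # Pauling scale
topology = {atom: 1.0 for atom in adjacency}      # property fixed to 1 counts pairs

chi_features = autocorrelation(adjacency, electronegativity, max_depth=2)
topology_features = autocorrelation(adjacency, topology, max_depth=2)
```

Each atomic property contributes one short vector per molecule, so a feature set built this way grows with the number of properties and the maximum depth. That growth is what makes a subsequent feature selection step, such as the LASSO or random forest selection mentioned above, worthwhile.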

Year:  2017        PMID: 29095620     DOI: 10.1021/acs.jpca.7b08750

Source DB:  PubMed          Journal:  J Phys Chem A        ISSN: 1089-5639            Impact factor:   2.781

