Gaussian Process-Based Refinement of Dispersion Corrections.

Jonny Proppe, Stefan Gugler, Markus Reiher.

Abstract

We employ Gaussian process (GP) regression to adjust for systematic errors in D3-type dispersion corrections. We refer to the associated, statistically improved model as D3-GP. It is trained on differences between interaction energies obtained from PBE-D3(BJ)/ma-def2-QZVPP and DLPNO-CCSD(T)/CBS calculations. We generated a data set containing interaction energies for 1248 molecular dimers, which resemble the dispersion-dominated systems contained in the S66 data set. Our systems represent not only equilibrium structures but also dimers with various relative orientations and conformations at both shorter and longer distances. A reparametrization of the D3(BJ) model based on 66 of these dimers suggests that two of its three empirical parameters, a1 and s8, are zero, whereas a2 = 5.6841 bohr. For the remaining 1182 dimers, we find that this new set of parameters is superior to all previously published D3(BJ) parameter sets. To train our D3-GP model, we engineered two different vectorial representations of (supra-)molecular systems, both derived from the matrix of atom-pairwise D3(BJ) interaction terms: (a) a distance-resolved interaction energy histogram, histD3(BJ), and (b) eigenvalues of the interaction matrix ordered according to their decreasing absolute value, eigD3(BJ). Hence, the GP learns a mapping from D3(BJ) information only, which renders D3-GP-type dispersion corrections comparable to those obtained with the original D3 approach. They improve systematically if the underlying training set is selected carefully. Here, we harness the prediction variance obtained from GP regression to select optimal training sets in an automated fashion. The larger the variance, the more information the corresponding data point may add to the training set. 
For a given set of molecular systems, variance-based sampling can approximately determine the smallest subset being subjected to reference calculations such that all dispersion corrections for the remaining systems fall below a predefined accuracy threshold. To render the entire D3-GP workflow as efficient as possible, we present an improvement over our variance-based, sequential active-learning scheme [ J. Chem. Theory Comput. 2018 , 14 , 5238 ]. Our refined learning algorithm selects multiple (instead of single) systems that can be subjected to reference calculations simultaneously. We refer to the underlying selection strategy as batchwise variance-based sampling (BVS). BVS-guided active learning is an essential component of our D3-GP workflow, which is implemented in a black-box fashion. Once provided with reference data for new molecular systems, the underlying GP model automatically learns to adapt to these and similar systems. This approach leads overall to a self-improving model (D3-GP) that predicts system-focused and GP-refined D3-type dispersion corrections for any given system of reference data.
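The batchwise selection idea can be illustrated with a minimal Python sketch. It is not the authors' BVS implementation: the squared-exponential kernel, the greedy conditioning step, and all names (`rbf_kernel`, `posterior_variance`, `select_batch`) are assumptions made for illustration. The sketch relies on the fact that the GP predictive variance depends only on the inputs, so a candidate can be added to the training inputs before its reference value has been computed.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def posterior_variance(X_train, X_pool, length_scale=1.0, noise=1e-6):
    """GP predictive variance at X_pool (independent of the target values)."""
    if len(X_train) == 0:
        return np.ones(len(X_pool))  # prior variance k(x, x) = 1
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_train, X_pool, length_scale)
    v = np.linalg.solve(K, Ks)
    return np.maximum(1.0 - np.sum(Ks * v, axis=0), 0.0)

def select_batch(X_train, X_pool, batch_size, length_scale=1.0, noise=1e-6):
    """Greedily pick batch_size pool indices with the largest predictive variance.

    After each pick, the chosen input is appended to the training inputs so
    that the next pick avoids near-duplicates of it (assumed strategy).
    """
    X_pool = np.asarray(X_pool, dtype=float)
    X_cur = np.asarray(X_train, dtype=float).reshape(-1, X_pool.shape[1])
    taken = np.zeros(len(X_pool), dtype=bool)
    chosen = []
    for _ in range(batch_size):
        var = posterior_variance(X_cur, X_pool, length_scale, noise)
        var[taken] = -np.inf           # never pick the same system twice
        idx = int(np.argmax(var))
        chosen.append(idx)
        taken[idx] = True
        X_cur = np.vstack([X_cur, X_pool[idx]])
    return chosen

# Hypothetical one-dimensional feature vectors for five candidate systems.
pool = np.array([[0.0], [0.05], [3.0], [3.05], [6.0]])
train = np.array([[0.0]])
batch = select_batch(train, pool, batch_size=2)
```

In this toy setting, the selected systems could then be sent to reference calculations simultaneously, and the GP retrained once all results are available.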

Year:  2019        PMID: 31603673     DOI: 10.1021/acs.jctc.9b00627

Source DB:  PubMed          Journal:  J Chem Theory Comput        ISSN: 1549-9618            Impact factor:   6.006


  7 in total

1.  Big-Data Science in Porous Materials: Materials Genomics and Machine Learning. (Review)

Authors:  Kevin Maik Jablonka; Daniele Ongari; Seyed Mohamad Moosavi; Berend Smit
Journal:  Chem Rev       Date:  2020-06-10       Impact factor: 60.622

2.  Gaussian Process Regression for Materials and Molecules.

Authors:  Volker L Deringer; Albert P Bartók; Noam Bernstein; David M Wilkins; Michele Ceriotti; Gábor Csányi
Journal:  Chem Rev       Date:  2021-08-16       Impact factor: 60.622

3.  The transferability limits of static benchmarks.

Authors:  Thomas Weymuth; Markus Reiher
Journal:  Phys Chem Chem Phys       Date:  2022-06-22       Impact factor: 3.945

4.  Ab Initio Machine Learning in Chemical Compound Space. (Review)

Authors:  Bing Huang; O Anatole von Lilienfeld
Journal:  Chem Rev       Date:  2021-08-13       Impact factor: 60.622

5.  Uncertainty quantification in classical molecular dynamics.

Authors:  Shunzhou Wan; Robert C Sinclair; Peter V Coveney
Journal:  Philos Trans A Math Phys Eng Sci       Date:  2021-03-29       Impact factor: 4.226

6.  Gabor Dictionary of Sparse Image Patches Selected in Prior Boundaries for 3D Liver Segmentation in CT Images.

Authors:  Xuehu Wang; Zhiling Zhang; Kunlun Wu; Xiaoping Yin; Haifeng Guo
Journal:  J Healthc Eng       Date:  2021-12-09       Impact factor: 2.682

7.  Uncertainty Quantification of Reactivity Scales.

Authors:  Jonny Proppe; Johannes Kircher
Journal:  Chemphyschem       Date:  2022-03-18       Impact factor: 3.520
