Improving molecular force fields across configurational space by combining supervised and unsupervised machine learning.

Gregory Fonseca, Igor Poltavsky, Valentin Vassilev-Galindo, Alexandre Tkatchenko.

Abstract

The training set of atomic configurations is key to the performance of any Machine Learning Force Field (MLFF) and, as such, the training set selection determines the applicability of the MLFF model for predictive molecular simulations. However, most atomistic reference datasets are inhomogeneously distributed across configurational space (CS), and thus, choosing the training set randomly or according to the probability distribution of the data leads to models whose accuracy is mainly defined by the most common close-to-equilibrium configurations in the reference data. In this work, we combine unsupervised and supervised ML methods to bypass the inherent bias of the data toward common configurations, effectively widening the applicability range of the MLFF to the fullest capabilities of the dataset. To achieve this goal, we first cluster the CS into subregions similar in terms of geometry and energetics. We then iteratively test the performance of a given MLFF on each subregion and add representatives of the least accurately described parts of the CS to the training set of the model. The proposed approach has been applied to a set of small organic molecules and alanine tetrapeptide, yielding up to a twofold decrease in the root mean squared errors for force predictions on non-equilibrium geometries of these molecules. Furthermore, our ML models demonstrate superior stability over the default training approaches, allowing reliable study of processes involving highly out-of-equilibrium molecular configurations. These results hold for both kernel-based methods (sGDML and GAP/SOAP models) and deep neural networks (SchNet model).
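
The cluster-then-refine loop described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a one-dimensional coordinate stands in for a molecular geometry, the function `force` stands in for reference forces, a plain k-means on the coordinate alone replaces clustering by geometry and energetics, and a 1-nearest-neighbour predictor replaces an actual MLFF such as sGDML, GAP/SOAP, or SchNet. All function names and parameters are hypothetical.

```python
import math
import random


def force(x):
    # Toy anharmonic "force"; a stand-in for reference forces (assumption).
    return -x - x ** 3


def kmeans_1d(xs, k, rng, n_steps=20):
    """Plain Lloyd k-means on scalars; returns one cluster label per point."""
    centers = rng.sample(xs, k)
    labels = [0] * len(xs)
    for _ in range(n_steps):
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in xs]
        for c in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return labels


def predict(x, train_x, train_f):
    # 1-nearest-neighbour surrogate for an MLFF (illustrative assumption).
    j = min(range(len(train_x)), key=lambda i: abs(x - train_x[i]))
    return train_f[j]


def cluster_errors(xs, fs, labels, train_idx, k):
    """Absolute errors of held-out points as (error, index), per cluster."""
    in_train = set(train_idx)
    tx = [xs[i] for i in train_idx]
    tf = [fs[i] for i in train_idx]
    errs = {c: [] for c in range(k)}
    for i, (x, f) in enumerate(zip(xs, fs)):
        if i not in in_train:
            errs[labels[i]].append((abs(predict(x, tx, tf) - f), i))
    return errs


def rmse(pairs):
    return math.sqrt(sum(e * e for e, _ in pairs) / len(pairs))


def select_training_set(xs, fs, labels, k, n_init, n_iter, n_add, rng):
    """Start from a random set, then repeatedly add the worst-predicted
    held-out points of the cluster with the highest force RMSE."""
    train_idx = rng.sample(range(len(xs)), n_init)
    for _ in range(n_iter):
        errs = cluster_errors(xs, fs, labels, train_idx, k)
        nonempty = {c: p for c, p in errs.items() if p}
        if not nonempty:
            break
        worst = max(nonempty, key=lambda c: rmse(nonempty[c]))
        ranked = sorted(nonempty[worst], reverse=True)
        train_idx.extend(i for _, i in ranked[:n_add])
    return train_idx


rng = random.Random(0)
# Inhomogeneous sampling: most points near equilibrium, few far from it.
xs = [rng.gauss(0.0, 0.3) for _ in range(450)]
xs += [rng.uniform(-2.0, 2.0) for _ in range(50)]
fs = [force(x) for x in xs]
labels = kmeans_1d(xs, 5, rng)
train = select_training_set(xs, fs, labels, k=5, n_init=10,
                            n_iter=8, n_add=5, rng=rng)
print(len(train), "training points selected")
```

In this sketch the sparsely sampled, far-from-equilibrium regions tend to carry the largest errors, so the loop draws additional training points from them rather than from the densely sampled region near equilibrium. Choosing the highest-error points within the worst cluster is one possible reading of "representatives"; other selection rules would fit the same loop.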

Year:  2021        PMID: 33810678     DOI: 10.1063/5.0035530

Source DB:  PubMed          Journal:  J Chem Phys        ISSN: 0021-9606            Impact factor:   3.488


  3 in total

1.  Linear Atomic Cluster Expansion Force Fields for Organic Molecules: Beyond RMSE.

Authors:  Dávid Péter Kovács; Cas van der Oord; Jiri Kucera; Alice E A Allen; Daniel J Cole; Christoph Ortner; Gábor Csányi
Journal:  J Chem Theory Comput       Date:  2021-11-04       Impact factor: 6.006

2.  Simulations of molecular photodynamics in long timescales.

Authors:  Saikat Mukherjee; Max Pinheiro; Baptiste Demoulin; Mario Barbatti
Journal:  Philos Trans A Math Phys Eng Sci       Date:  2022-03-28       Impact factor: 4.226

3.  Opportunities and Challenges for In Silico Drug Discovery at Delta Opioid Receptors. (Review)

Authors:  Yazan J Meqbil; Richard M van Rijn
Journal:  Pharmaceuticals (Basel)       Date:  2022-07-15
