Christoph Schran1,2,3,4, Fabian L Thiemann5,2,3,4,6, Patrick Rowe5,2,3,4, Erich A Müller6, Ondrej Marsalek7, Angelos Michaelides1,2,3,4.
Abstract
Simulation techniques based on accurate and efficient representations of potential energy surfaces are urgently needed for the understanding of complex systems such as solid-liquid interfaces. Here we present a machine learning framework that enables the efficient development and validation of models for complex aqueous systems. Instead of trying to deliver a globally optimal machine learning potential, we propose to develop models applicable to specific thermodynamic state points in a simple and user-friendly process. After an initial ab initio simulation, a machine learning potential is constructed with minimum human effort through a data-driven active learning protocol. Such models can afterward be applied in exhaustive simulations to provide reliable answers for the scientific question at hand or to systematically explore the thermal performance of ab initio methods. We showcase this methodology on a diverse set of aqueous systems comprising bulk water with different ions in solution, water on a titanium dioxide surface, and water confined in nanotubes and between molybdenum disulfide sheets. Highlighting the accuracy of our approach with respect to the underlying ab initio reference, the resulting models are evaluated in detail with an automated validation protocol that includes structural and dynamical properties and the precision of the force prediction of the models. Finally, we demonstrate the capabilities of our approach for the description of water on the rutile titanium dioxide (110) surface to analyze the structure and mobility of water on this surface. Such machine learning models provide a straightforward and uncomplicated but accurate extension of simulation time and length scales for complex systems.
Keywords: aqueous phase; machine learning potentials; solid–liquid systems
Year: 2021 PMID: 34518232 PMCID: PMC8463804 DOI: 10.1073/pnas.2110077118
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. Schematic depiction of the rapid development process of machine learning models with C-NNPs. (Top) The workflow used to generate a C-NNP model starting from a single reference trajectory. Using a small-scale AIMD simulation as input, the C-NNP model is constructed in an active learning cycle that selects the most important configurations for an improvement of the model. This is achieved in an automated iterative process of first training the model and then screening a large set of candidate configurations for the structures with the largest error estimate, which are added to the training set. Subsequently, the C-NNP model can be applied to large-scale simulations in order to provide insight into the system of interest. The systems and potential energy curves schematically shown in Top are chosen for illustration purposes and do not reflect actual simulation data. (Bottom) Representative sections of the simulation cells used for the six aqueous systems chosen in this study for which we successfully applied our machine learning protocol. They are the fluoride ion in solution, the sulfate ion in solution, water in carbon and hexagonal boron nitride nanotubes, water under molybdenum disulfide confinement, and water on a titanium dioxide interface.
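The active learning cycle described in the Fig. 1 caption (train a committee, screen candidates, add the structures with the largest error estimate) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a one-dimensional analytic function stands in for the ab initio reference, low-order polynomial fits stand in for the neural network committee members, and the function names (`reference_energy`, `train_committee`, `active_learning`) are hypothetical and not taken from the authors' code.

```python
import numpy as np

def reference_energy(x):
    # Hypothetical stand-in for the expensive ab initio reference calculation.
    return np.sin(3.0 * x) + 0.5 * x ** 2

def train_committee(x_train, y_train, n_members, rng, degree=3):
    # Each member is fitted on a random 80% subset of the training data,
    # so the members differ and their spread can act as an error estimate.
    n_sub = max(degree + 1, int(0.8 * len(x_train)))
    members = []
    for _ in range(n_members):
        idx = rng.choice(len(x_train), size=n_sub, replace=False)
        members.append(np.polyfit(x_train[idx], y_train[idx], degree))
    return members

def active_learning(x_candidates, n_initial=10, n_iterations=5,
                    n_add=2, n_members=8, seed=0):
    rng = np.random.default_rng(seed)
    # Start from a small random selection of candidate configurations.
    selected = [int(i) for i in
                rng.choice(len(x_candidates), size=n_initial, replace=False)]
    for _ in range(n_iterations):
        x_train = x_candidates[selected]
        y_train = reference_energy(x_train)
        committee = train_committee(x_train, y_train, n_members, rng)
        # Screen all candidates: committee spread is the error estimate.
        predictions = np.array([np.polyval(c, x_candidates) for c in committee])
        disagreement = predictions.std(axis=0)
        disagreement[selected] = -np.inf  # never pick a structure twice
        # Add the candidates with the largest estimated error.
        new = np.argsort(disagreement)[-n_add:]
        selected.extend(int(i) for i in new)
    return selected

x_candidates = np.linspace(-2.0, 2.0, 200)
selected = active_learning(x_candidates)
```

In this sketch the loop returns the indices of the selected training structures; a final model would then be trained on that set before running production simulations.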
Fig. 2. Performance assessment of the C-NNP for six different aqueous systems. (Left) Bar plot summarizing the accuracy of the RDFs, the VDOS, and the force predictions (Force) in percent for each system. (Right) The species-resolved functions (RDF and VDOS [on a logarithmic scale]) and the force correlation of the C-NNP model with respect to the reference method for the solvated fluoride ion. These are condensed into the three scores for the fluoride/water C-NNP model shown in Left. Details of the difference measure and reduction used for the three properties are given in the full article.
Fig. 3. Properties of water on the rutile (110) surface. (A) A representative section of the simulation cell including the four distinct adsorption sites at the interface (Ti5c, fivefold coordinated titanium; Ti6c, sixfold coordinated titanium; threefold coordinated oxygen; and the oxygen bridge site). (B) The mass density profile based on all water atoms, (C) the water diffusion constant separated into parallel (xy) and perpendicular (z) components, and (D) as a function of the distance from the surface, the C-NNP atomic force error estimate for structures from the C-NNP simulation and for all structures from the original AIMD simulation. This error estimate is obtained as a direct product of the committee disagreement and a scaling factor to match the force RMSE of a validation set as proposed in ref. 50. The Inset in D depicts the average atomic force error estimate of the water atoms as a function of the simulation time. (E and F) The free energy profile of the water adsorbed in the two contact layers from AIMD and C-NNP simulations, respectively. Titanium atoms are shown in gray, oxygen atoms are shown in red, and hydrogen atoms are shown in white.
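The error estimate in panel D, a committee disagreement multiplied by a scaling factor fitted to the validation force RMSE, can be sketched as follows. The caption does not give the exact definitions, so two assumptions are made here: the disagreement is taken as the per-atom standard deviation of the force vectors across committee members, and the scaling factor is the ratio of the validation RMSE to the root-mean-square disagreement on the same set. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def committee_disagreement(member_forces):
    # member_forces: array of shape (n_members, n_atoms, 3).
    # Assumed definition: per-atom standard deviation of the force
    # vectors across committee members.
    member_forces = np.asarray(member_forces, dtype=float)
    mean_force = member_forces.mean(axis=0)
    sq_dev = ((member_forces - mean_force) ** 2).sum(axis=-1)
    return np.sqrt(sq_dev.mean(axis=0))

def force_rmse(predicted, reference):
    # Root-mean-square norm of the per-atom force error vectors.
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=-1).mean()))

def calibration_factor(member_forces_val, reference_forces_val):
    # Assumed calibration: scale the disagreement so that its RMS over
    # the validation set equals the RMSE of the committee-mean forces.
    sigma = committee_disagreement(member_forces_val)
    rmse = force_rmse(np.mean(member_forces_val, axis=0), reference_forces_val)
    return rmse / float(np.sqrt((sigma ** 2).mean()))

# Synthetic validation data: 8 committee members, 20 atoms.
rng = np.random.default_rng(0)
reference = rng.normal(size=(20, 3))
members = reference + 0.1 * rng.normal(size=(8, 20, 3))

alpha = calibration_factor(members, reference)
error_estimate = alpha * committee_disagreement(members)
```

Once `alpha` is fitted on a validation set, the same factor can be applied to the disagreement of new structures, which is what makes the estimate usable during a production simulation where no reference forces are available.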