Literature DB >> 35084846

Artificial Intelligence Resolves Kinetic Pathways of Magnesium Binding to RNA.

Abstract

Magnesium is an indispensable cofactor in countless vital processes. In order to understand its functional role, the characterization of the binding pathways to biomolecules such as RNA is crucial. Despite the importance, a molecular description is still lacking since the transition from the water-mediated outer-sphere to the direct inner-sphere coordination is on the millisecond time scale and therefore out of reach for conventional simulation techniques. To fill this gap, we use transition path sampling to resolve the binding pathways and to elucidate the role of the solvent in the binding process. The results reveal that the molecular void provoked by the leaving phosphate oxygen of the RNA is immediately filled by an entering water molecule. In addition, water molecules from the first and second hydration shell couple to the concerted exchange. To capture the intimate solute-solvent coupling, we perform a committor analysis as the basis for a machine learning algorithm that derives the optimal deep learning model from thousands of scanned architectures using hyperparameter tuning. The results reveal that the properly optimized deep network architecture recognizes the important solvent structures, extracts the relevant information, and predicts the commitment probability with high accuracy. Our results provide detailed insights into the solute-solvent coupling which is ubiquitous for kosmotropic ions and governs a large variety of biochemical reactions in aqueous solutions.

Entities: Chemical

Mesh：

Substances：

Year: 2022 PMID： 35084846 PMCID： PMC8830046 DOI： 10.1021/acs.jctc.1c00752

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

Magnesium plays a vital role in almost every biological process. By now more than 800 different biochemical roles of Mg2+ have been identified in physiological processes ranging from the creation of cellular energy or the synthesis of biomolecules to the activation of enzymes and ribozymes.[1−5] The specific requirement for Mg2+ as a cofactor is particularly pronounced in nucleic acid systems where Mg2+ plays structural roles by complexing negatively charged groups or catalytic roles by accelerating or inhibiting chemical reactions in ribozymes.[3,6−8] In RNA systems, Mg2+ ions are essential for two reasons: They screen the electrostatic repulsion, thus allowing RNA to fold into compact and functional structures.[9] In addition, a smaller fraction of Mg2+ ions interacts directly with the functional atom groups of the RNA. These site-specific ions stabilize the three-dimensional structure further and are involved either in a direct contact (inner-sphere) or are mediated through the hydrogen bond of a coordinating water molecule (solvent-shared).[7,9] The second reason why cations are essential is that binding of Mg2+ to active binding sites allows ribozymes to perform chemical reactions that would not be possible from the basic RNA building blocks alone.[3,6] Despite the importance of Mg2+ in RNA biology, a detailed understanding of the Mg2+-RNA interactions at the atomic level is still lacking. As a first step in providing a thorough understanding, we here resolve the binding pathways of Mg2+ to RNA and elucidate the dynamic interplay of direct ion-RNA and indirect water-mediated interactions. In aqueous solutions, the first hydration shell of Mg2+ consists of six water molecules arranged in octahedral symmetry.[10] Water molecules from the first hydration shell exchange with the second, more loosely bound hydration shell on the microsecond time scale.[11−13] This dynamic equilibrium allows for ligand exchange. Hereby, oxygen atoms are the preferred binding partners, in particular the nonbridging oxygens of the phosphate group on the backbone of RNA.[3,14] Interestingly, water exchange around Mg2+ is orders of magnitude slower compared to other metal ions.[11,15] The long lifetimes of water molecules in the first hydration shell facilitate two distinct binding arrangements: inner-sphere and outer-sphere. In inner-sphere binding, one water molecule is removed from the first hydration shell, and the ion is in direct contact with the RNA atoms. In outer-sphere binding, the interactions with the RNA are mediated by water, while Mg2+ remains coordinated by six water molecules.[3] Does Mg2+ bind in inner- or outer-sphere coordination and how can it transition from the one binding mode to the other? Discerning the nature of the exact interactions is an important and controversially discussed question in the field.[14,16,17] Typically, structural knowledge on the binding mode of Mg2+ is obtained from crystallographic experiments. However, the correct assignment in the electron density maps remains notoriously difficult. Yet, with proper stereochemical guidelines a consistent picture emerges: Nucleobase nitrogen and carbonyls are poor inner-sphere Mg2+ binders,[16,17] while the nonbridging oxygens of the phosphate group are the primary nucleic acid binding location. In large RNA structures, Mg2+ ions can, depending on the exact environment, bind in inner- or outer-sphere coordination,[3,18] whereas the inner-sphere coordination is thermodynamically favored in simple mononucleotide systems.[19] In addition to the structural information from crystallography, NMR can provide valuable insight into the exchange kinetics.[11,20,21] Depending on the number of direct Mg2+-RNA contacts, the time scale for ligand exchange ranges from milliseconds[20,22,23] to hundreds of seconds.[21] However, the structural changes at the microscopic level during an exchange process are not accessible from experiments. Here, simulations can contribute important insights by characterizing the solvent behavior and by providing a unique atomistic description of the dynamics. Yet, simulating the transition from outer-to-inner sphere binding is tremendously challenging for two reasons: According to experiments, the transition is on the millisecond to second time scale and therefore out of reach for conventional all-atom simulations. Alternatively, enhanced sampling techniques such as metadynamics, replica exchange, and umbrella sampling can be applied[24,25] but do not prove any information on the exchange dynamics and reaction mechanism. The second challenge is the coupling of the solvent to the exchange dynamics. Kosmotropic ions, such as Mg2+, provoke strong structural ordering of the first hydration shells. Therefore, any process in aqueous solutions is expected to be governed by a complex interplay of structural, orientational, and hydrogen bonding effects extending over several hydration shells. It is therefore not surprising that the solvent not only is a spectator of the chemical process but also plays an active role in the evolution. Similarly, the dynamics of seemingly simple processes such as dipeptide isomerization, ion pair formation, water exchange between hydration shells, or proton and electron transfer is governed by solute–solvent coupling.[26−33] Therefore, attempts in describing the dynamics of such processes in terms of simplified reaction coordinates that do not include the solute–solvent coupling are likely to fail for several reasons. (i) Solvent reorganization is orthogonal to such simplified reaction coordinates giving rise to pronounced non-Markovian behavior or memory effects.[34] (ii) The metastable states are not uniquely separated leading to the violation of the no-recrossing assumption and failure of transition state theory.[35] (iii) Enhanced sampling techniques that rely on biasing the slow degrees of freedom become inefficient as the solvent reorganization becomes slower than the reactive motion itself. Consequently, quantifying the contribution of the solvent to the reaction coordinate is one of the long-standing problems in chemical reaction kinetics. In order to address both challenges, we apply transition path sampling as a particularly powerful sampling strategy to provide unbiased microscopic insight into the dynamics of Mg2+ binding to the phosphate oxygen of RNA. Subsequently, we perform a committor analysis comprising more than 28,600 configurations as the basis for a machine learning algorithm which automatically selects the optimal deep learning model from thousands of scanned architectures in a robust and efficient manner. The resulting optimized deep neural network is shown to capture the intimate solute–solvent coupling and to provide an accurate description of the complete dynamical process.

Methods

Atomistic Model and Simulation Setup

Our model system consists of an RNA dinucleotide with two guanine nucleobases (Figure ). The RNA dinucleotide is an ideal model system to investigate ion-RNA interactions since it contains the three most important metal cation binding sites: The nonbridging phosphate oxygens of the backbone and the N7 and O6 metal binding site on the nucleobase.[19] For Mg2+, there is clear experimental evidence that the nonbridging phosphate oxygens are the primary nucleic acid binding site for Mg2+.[3,17] Since both oxygen atoms (O1P and O2P) have identical force field parameters, we focus on the interactions with O1P which has a partial charge of −0.776e. Our selection of the phosphate oxygens is further justified by a large number of unbiased simulations that reveal exclusive binding of Mg2+ to the phosphate oxygen and not to the nucleobases in agreement with previous results.[25] A single Mg2+ ion, one Cl– ion, and 2150 water molecules are added to the cubic simulation box (L = 40 Å). Since the dinucleotide has one negative charge, the simulation box is neutral. For Mg2+ and Cl–, recently optimized force field parameters were used[36] in combination with the TIP3P water model.[37] The TIP3P water model assigns partial charges of −0.834e and 0.417e to oxygen and hydrogen. Note that to date a large variety of water models exist with some reproducing the physical properties of water better than TIP3P.[38] Our main motivation to use TIP3P water is that it is frequently used in biomolecular simulations since protein and nucleic acid force fields were optimized with TIP3P. The force field parameters of the RNA are taken from Amber99sb-ildn*[39] with parmbsc0[40] and χ0L3 corrections.[41]

Figure 1

Simulation snapshot of the RNA dinucleotide consisting of two guanine nucleobases with a Mg2+ ion in inner- or outer-sphere coordination. The backbone binding sites (atoms O1P and O2P) and the N7 and the O6 binding site of guanine are shown. The exchanging water molecule is shown in blue. The Mg2+ parameters were optimized in our previous work[36] in combination with TIP3P and are particularly suited to investigate ion binding since they reproduce experimental activity coefficients and the hydration free energy. In addition, the Mg2+ parameters by Mamatkulov and Schwierz[36] predict an interchange dissociative exchange mechanism in water in agreement with experiments, while polarizable force fields[42−44] erroneously yield an associative mechanism.[45] Note that models with variable polarizability[46] or scaled charge force fields[47,48] might yield improvement but are beyond the scope of our current work. A detailed comparison of the Mamtkulov Mg2+ force field parameters with other parameters from the literature can be found in refs (36 and 49). All simulations were performed using GROMACS[50] with periodic boundary conditions. Particle mesh Ewald summation was used along with a Fourier spacing of 0.12 nm and a grid interpolation up to order 4 to handle long-range electrostatic forces. Close Coulomb real space interactions were cut off at 1.2 nm, and Lennard-Jones (LJ) interactions were cut off after 1.2 nm, respectively. Long-range dispersion corrections for energy and pressure were applied to account for errors stemming from truncated LJ interactions. The initial energy minimization was performed with the steepest descent algorithm. For each simulation, an NVT and a subsequent NPT equilibration were done for 1 ns, controlling the temperature at 300 K and at a pressure of 1 bar with the Berendsen thermostat and barostat.[51] All production runs and the transition path sampling were done in the NVT ensemble at a temperature of 300 K using the velocity rescaling thermostat with stochastic term[52] and a time step of 2 fs. Here, the velocity rescaling thermostat was used since it generates the canonical ensemble and ensures detailed balance in the canonical ensemble of transition paths.[53]

Free Energy Profiles

The one-dimensional free energy profile as a function of the distance r between the Mg2+ ion and the oxygen atom O1P was calculated from umbrella sampling using PLUMED.[54] A force constant k = 600,000 kJ/(mol nm2) and a window spacing of 0.005 nm were used for r < 0.35 nm. A force constant k = 60,000 kJ/(mol nm2) and a window spacing of 0.01 nm were used for r > 0.35 nm. To ensure convergence, we computed the reverse pathway and performed a block analysis by dividing the 100 ns long simulation into eight individual blocks (Figure S1 in the Supporting Information.) The two-dimensional free energy profiles as a function of r and the effective hydration distance s6 were calculated from umbrella sampling using PLUMED.[54] Hereby, s6 was defined as the sum of six Mg2+-oxygen distanceswhere ri are the distances between Mg2+ and the five closest water molecules, and rex is the distance between Mg2+ and the exchanging water molecule. A force constant k = 100,000 kJ/(mol nm2) and a window spacing of 0.01 nm were used. Without further restraints, the system shows three stable states in the two-dimensional projection (see Figure S2 in the Supporting Information). In the umbrella sampling, we focus only on transitions between the two relevant states (corresponding to the inner-sphere configuration with five coordinating water molecules and the outer-sphere coordination in which O1P is replaced by the selected water molecule for which the umbrella potential is applied). This is justified further by the results from the unbiased simulations (TPS and fleeting trajectory setup with 3.5 μs total simulation time) which clearly demonstrate that Mg2+ goes directly to the phosphate oxygen and never to any of the nucleobase binding sites. In the umbrella sampling, this is achieved by an additional biasing potential on the hydration number (see the Supporting Information for further details). Position restraints were applied on all heavy atoms of the dinucleotide with the exception of the nonbridging oxygen atoms of the phosphate group. Each umbrella simulation was performed for 2 ns discarding 500 ps for equilibration. The free energy profiles were calculated using the weighted histogram analysis method (WHAM).[55]

Transition State Theory

TST is the most popular theory to calculate reaction rates. In simple systems for which the reaction coordinate is exactly known, TST gives an accurate estimate of the rate. However, in complex many-body systems such as the one presented here, TST could fail due to the violation of the nonrecrossing hypothesis which forms the cornerstone of the theory. Therefore, TST can be used only to provide an upper estimate for the rate constant. For a more accurate estimate, additional corrections as implemented in the reactive flux method are required[35,56] but are beyond the scope of our current work. Following conventional transition state theory (TST), the upper estimate of the rate constant is calculated from[57,58]where λ* is the position of the barrier top, θ(λ̇) is the Heaviside step function, and ⟨...⟩ denotes the average over the restrained ensemble of trajectories initiated from an equilibrium ensemble of phase points on the dividing surface.

Transition Path Sampling

Transition path sampling[53,59] was used to harvest an ensemble of rare trajectories that connect the two stable states. Starting from an initial reactive pathway generated at high temperature, new trial trajectories were created by randomly selecting a time slice, randomizing the velocities, and integrating the equations of motion forward and backward in time. Using standard two-way shooting moves with a fixed length of 1.6 ps, trial moves are accepted if they connected the two stable states and rejected otherwise.

Committor Analysis and Transition States

For a simulation snapshot taken from a reactive transition pathway, the committor pA is defined as the probability of the configuration initiated with randomized velocities drawn from a Maxwell–Boltzmann distribution to be committed to state A (inner-sphere coordination). Configurations from basin A have pA = 1, configurations from basin B have pA = 0, and transition states have pA = 0.5. The commitment probability pA was calculated from the fraction of trajectories initiated with randomized velocities that reach state A. 29,112 configurations along more than 2500 independent pathways obtained from transition path sampling were used as shooting points. 28,612 shooting points were chosen in the transition region (with 0.6 < λ < 1.0 based on the coordinate λ defined in eq ). In addition, 500 shooting points were selected in the regions of the stable states (λ < 0.6 and λ > 1.0). For each shooting point, 100 trajectories were initiated with random velocities and run forward and backward for 2 ps. A configuration was identified as a transition state if half of the trajectories relaxed into each stable state (0.45 < pA < 0.55). From the ensemble of shooting point, two subsets with r = 0.28 ± 0.02 nm or λ = 0.8 ± 0.05 with 3,500 randomly drawn data points were selected, and the distribution p(pA) was calculated. In total, 3.5 μs of unbiased simulation data were used to test the different putative reaction coordinates.

Commitment Probability from Deep Neural Networks

Deducing an appropriate description by visual inspection is virtually impossible if complex configurational rearrangements are considered as in the system presented here. Neural networks, on the other hand, are particularly suited for this task. For example, in the pioneering work by Ma and Dinner,[28] neural networks were used to predict the committor based on a set of coordinates for the isomerization of alanine dipeptide. More recently, Jung et al. proposed and implemented a combination of path sampling and deep neural networks to guide the sampling and to extract the reaction coordinate.[26,27,60] Following the work by Jung et al.,[26,27] we used a deep neural network to learn the reaction coordinate from the outcome of the committor analysis by minimizing the likelihood loss function.[61,62] Note, however, that we did not guide the sampling as described in refs (26 and 27). Instead, we performed a preceding committor analysis (as described above) and used the committor to map the molecular configurations onto the reaction coordinate qpred(X). Each configuration was described by a set of 83 physical properties (referred to as features X in the following). The committor was parametrized as in refs (26 and 27)where the predicted committor ppred is a nonlinear function of all the features X of the system. Note that by rescaling the reaction coordinate via q = q̃/2 the frequently used expression for the committor by Peters and Trout[62] is retrieved:Therefore, eqs and 4 are expected to yield identical results. The likelihood that a model can reproduce the observed data is given by[26,27,61,62]where N is the number of shooting points, and n and n are the number of trajectories that reach state A and B from shooting point i, respectively. Following the work by Jung et al.,[26,27] we modeled the committor with a deep neural network. Hereby, the weight matrix that defines the connection between the nodes of the deep neural network was optimized by minimizing the negative log likelihood loss l(26,27)The overall accuracy and generalization of the machine learning model (i.e., the deviation of the predicted values pApred from the simulated ones pAtrue) was estimated by dividing the available data into a training set, a validation set, and a test set. The first was used to train the model, the second was used to validate the training result, and the third was used to check whether the final model is able to correctly predict the values for structures that were not used for training.

Training a Deep Learning Regression Model with Hyperparameter Optimization

One challenge in the application of neural networks is to choose the model architecture. As illustrated in Figure A, the accuracy of the network to predict the commitment probability of a configuration and to autonomously select the transition state strongly depends on the underlying model architecture. However, there is no generic way to determine all the model parameters such as the number of hidden layers, the number of neurons, and all hyperparameters a priori. Moreover, a network architecture that works well for one system might fail to capture other systems or dissimilar problems. Therefore, building an optimal deep neural network manually by trial and error is a time-consuming problem which highly depends on human expertise and intuition. In order to make this progress more efficient, we developed an algorithm that automatically finds the optimal network architecture. First, we specified a class of multilayer perceptron (MLP) regression models that are described by a set of hyperparameters.[63] Subsequently, we selected the optimal model by running a Keras Tuner random search hyperparameter optimization.[64] Hereby, the generic MLP was defined by the following set of hyperparameters: number of hidden dense layers (1–6), number of neurons per hidden layer (32–256), activation function (Rectified Linear Units (ReLU) or Scaled Exponential Linear Unit (SELU)), position of a single dropout layer (none/after feature input/middle layer/before output layer), dropout rate (0–40%), and learning rate for the Adam Optimizer (logarithmic sampling from 10–4–10–2). For ReLU activation functions, a Glorot uniform initialization and a normal dropout layer were used. For SELU activation functions, a Lecun normal initialization and an alpha dropout layer were used. ReLU and SELU are the standard choices for regression models in Keras and were therefore used. The custom loss function by Jung et al.[26,27] according to eq was used. All models were trained via back-propagation using the stochastic gradient algorithm.

Figure 2

Dependence of the performance on the model architecture. (A) Comparison of the committor distribution p(pAtrue) for transition states selected from deep learning algorithms with different model architectures (see the Supporting Information for further details of the models). (B) Validation likelihood loss l for a random subset of model architectures. One parameter of the architecture, the total number of neurons, is shown in color. In total, several thousand different model architectures were explored by running a Keras Tuner random search in the space of model parameters. The data was split into a 20% test set, a 72% training set, and an 8% validation set. Different splitting of the data yielded similar results (see the Supporting Information). All features were normalized using the Keras standard scaler with mean μ = 0 and standard deviation σ = 1. Each model was set up for 50 epochs with a training batch size of 128. Early stopping with patience 10 was used on the validation loss. Multiple thousand different model architectures were explored, and the likelihood loss of a random subset of 2,160 models is shown in Figure B. For further optimization, a subset of best models based on minimal validation loss was chosen. The training of the best model subset was continued for up to 200 epochs with the previous batch size and early stopping. In addition, a stepwise reduction of the learning rate by a factor of 5 down to 10–5 was used if the validation loss reached a plateau for 7 training steps. Finally, the best model was chosen based on the highest maximum in the p(pApred) distribution. In summary, the optimization algorithm allows us to find the optimal model architecture in a quick and reliable fashion. The optimal deep learning model has five hidden layers with 256 neurons, ReLU activation, and an initial learning rate of 1.9 * 10–4. A dropout layer with 30% dropout is placed directly before the output layer. All code was developed with Keras,[65] scikit-learn,[66] and TensorFlow.[67]

Features

For each shooting point, we calculated 83 features that reflect different structural properties of the molecular configuration. Focusing only on the dominant indirect exchange, the data set has 17,477 entries. Each entry consists of 83 features, and the label pAtrue was calculated for each shooting point using the committor analysis. Specifically, the features include all distances r between Mg2+ and the 20 closest water molecules, distances between Mg2+ and different RNA atoms and the Cl– ion, all angles a formed between Mg2+, O1P, and the 20 closest water molecules, the Steinhardt-Nelson order parameters q3, q4, and q6(68) of the first and second hydration shell, tetrahedral order parameters,[54] and the number of hydrogen bonds in the first and second hydration shell and between RNA and water (Table S1). Transition path sampling, performed in previous work,[31,45] yielded detailed molecular insights into the mechanism of water exchange in the first hydration shell and showed that the coordinates of the exchanging ligands need to be included to provide a reasonable description. Similarly, in the present case, a putative, knowledge-based reaction coordinate λ was defined, which combines the Mg2+-RNA distance r and the effective hydration distance s6 via a trigonometric functionThe atan2 function is an extension of the usual atan function with two real numbers as argument. Therewith, the atan2 function comprises enough information to yield function values in the range of 0–2π and allows us to cover all possible values for s6 and r. Note that the parameters s60 = 1.18 nm and r0 = 0.18 nm correspond to the minima in the two-dimensional free energy profile. With this choice, λ = π/2 corresponds to the inner-sphere coordination, and λ = 0 corresponds to the outer-sphere coordination (Figure B).

Figure 3

Free energy landscape of Mg2+-RNA interactions. (A) Free energy profile F(r) as a function of the distance between Mg2+ and the phosphate oxygen. Simulation snapshots of the two stable states are shown. (B) Two-dimensional free energy landscape F(r, s6) as a function of r and s6 where s6 is the sum over the distances between Mg2+ and the five closest molecules and the exchanging water molecule. The dashed line indicates configurations from the top of the free energy profile with r* = 0.28 nm shown in (A). Trajectories, initiated from the upper green stripe, relax back into state A, and trajectories from the lower blue stripe relax back into state B. The diagonal dashed line for λ* = 0.8 approximates the saddle between states A and B. (C) Committor distributions pSim(pA) for trajectories initiated with r* = 0.28 nm (top of the one-dimensional free energy profile shown in A) and with λ* = 0.8 (saddle of two-dimensional free energy profile shown in B). pid(pA) is the distribution expected for an ideal reaction coordinate with 3500 shooting points.

Feature Relevance

To this end, the machine learning model employs a large number of features to model the reaction coordinate and to predict the commitment probability pApred of new structures. In the following, we aim to determine the importance of each input feature. Since the deep learning network consists of a large number of nodes, weighted sums, and nonlinear transformations, the feature relevance was calculated numerically by selectively replacing single features or combinations of features by noise and resampling the model. For the single feature relevance, we generated for every input feature i a data set in which the values X were replaced by randomly permuted values X. The random permutation approach was chosen as it exactly conserves the original distribution. The normalized relevance of the i-th feature r was defined aswhere lopt is the converged loss of the optimal model, and lmax is the largest value upon permutation. The relevance corresponds to the loss of information (i.e., the negative log likelihood increase) upon effectively removing a single feature from the data set while leaving all other features unchanged. Accordingly, the single feature rank ranges from low (unimportant) to high (important). For single feature permutation importance, standard libraries can be used.[66,69] For this work, we implemented a custom algorithm that extends the standard permutation importance algorithms to be able to measure the importance of combined feature sets (see below). Note that an alternative approach for the single feature analysis was proposed in ref (26) where uniform random noise was used instead of random permutations. For the relevance of combined features, N(N + 1) /2 = 3486 permuted data sets were generated as follows. Starting from the optimal model, the feature which yielded the smallest loss upon permutation was removed (rank 1). This feature contained the smallest amount of information and was consequently least important for the process. Maintaining the permutation of rank 1, we selected the second feature that yielded the smallest loss upon permutation (rank 2). The procedure was repeated until all features are permuted. The normalized relevance for effectively removing the information on n features is given bywhere lopt is the converged loss of the optimal model, and lrand is loss obtained by permuting all features. Here, the relevance corresponds to the increase of the negative log likelihood upon removing the combination of n features that contain the least amount of information. Accordingly, the feature combination rank ranges from low (least important combination) to high (most important combination).

Results and Discussion

Free Energy Landscape of Mg2+-RNA Interactions

During association, one of magnesium’s six strongly bound hydration waters is removed to facilitate a direct contact between Mg2+ and the phosphate oxygen. In the simplest case, this ligand exchange can be described by the distance r between Mg2+ and the phosphate oxygen, while all other degrees of freedom are integrated out. The corresponding one-dimensional free energy profile is shown in Figure A. The free energy profile has two stable states. State A corresponds to the inner-sphere coordination, and state B corresponds to the outer-sphere coordination. The free energy barrier from outer-sphere to inner-sphere is about 21 kBT and exactly matches the value for water exchange.[31] Consequently, the barrier corresponds to the free energy necessary to remove one water molecule from the first hydration shell in order to facilitate a direct contact with the phosphate oxygen. The inner-sphere coordination is thermodynamically more stable in agreement with experimental findings[19] and Collins’ empirical rule like-seeks-like:[70] Due to the high charge density of the phosphate oxygens, direct ion pairing with a high binding affinity is expected. Using transition state theory TST (eq ), we can provide an upper estimate of the exchange rate or conversely a lower estimate of the lifetime. Here, it is essential to mention that the true rate could be significantly smaller due to the violation of the nonrecrossing hypothesis. As a lower limit, Mg2+ is estimated to remain about 300 s in the inner-sphere coordination before transitioning to outer-sphere. Similarly, Mg2+ is estimated to remain about 0.2 ms in the outer-sphere coordination before transitioning back to inner-sphere. Note that the exchange between the outer-sphere cordination and bulk is much faster since the free energy barrier is significantly smaller (Figure A). The lifetime of the outer-sphere coordination is in agreement with the millisecond time scale observed experimentally.[20,22,23] On the other hand, the computed lifetime of the inner-sphere coordination is likely too high reflecting the shortcoming of current Mg2+ force fields in reproducing experimental binding affinities at nucleic acid binding sites.[25,49,71] The quality of a reaction coordinate can be assessed by a committor analysis. For an ideal reaction coordinate, about half of the trajectories initiated from the barrier top are expected to relax back to either stable state. Consequently, the distribution p(pA) of the probability to relax back to state A should have a sharp peak at pA ≈ 1/2 (binomial committor distribution). While r provides a simplified description of the process, it is not an adequate reaction coordinate by itself. The committor analysis (Figure C) shows that most configurations, initiated with r = 0.28 nm, relax back to the inner-sphere coordination. Therefore, the Mg2+-oxygen distance alone is insufficient to describe the dynamics of the transition. To provide a more realistic picture, the water molecules from the first hydration shells have to be included. Figure B shows the two-dimensional free energy landscape as a function of r and the hydration parameter s6, which includes the distances of the five closest waters and the exchanging water (eq ). From the two-dimensional representation, the failure of r as the reaction coordinate can be rationalized: Trajectories starting from the upper panel (s6 > 1.24 nm) are committed to state A, while trajectories starting from the lower panel (s6 < 1.24 nm) are committed to state B. r and s6 can be combined into a putative, knowledge-based reaction coordinate λ (eq ). Based on the committor distribution (Figure C), λ provides significant improvement compared to r. Still, configurations with λ* = 0.8 lead to a much broader distribution compared to the ideal binomial distribution. These results show that free energy profiles along a few simple coordinates provide valuable initial insight. Yet, the committor analysis reveals that the exchange dynamics is more complex than the free energy profiles might suggest.

Kinetic Pathways from Transition Path Sampling

To gain insight into the kinetic pathways of Mg2+ association and dissociation, transition path sampling is applied to sample a large number of inner-to-outer sphere transitions. Four representative transition paths that connect the two stable states are shown in Figure A. Figure C shows the distribution of transition times. Typically, the exchanging water molecules spend less than 0.5 ps in transition. This time is considerably smaller compared to the millisecond time scale of the stable states. The clear separation of time scales highlights the path sampling method as a particularly powerful sampling strategy for these systems.

Figure 4

Kinetic pathways from transition path sampling. (A) Four representative pathways connecting the two stable states. Blue/gray pathways correspond to the indirect/direct exchange mechanism. (B) Probability distribution along transition pathways for the distance r between Mg2+ and the phosphate oxygen and the exchange angle αex between phosphate oxygen, Mg2+, and exchanging water oxygen. (C) Distribution of transition times for the indirect and the direct exchange mechanisms. (D) Indirect Mg2+ exchange mechanism in which the leaving phosphate oxygen and the incoming water molecule occupy different positions on the water octahedron. (F) Direct Mg2+ exchange mechanism in which the leaving phosphate oxygen and the incoming water molecule occupy the same position on the water octahedron. In (D) and (E), the five closest water molecules are shown. The incoming water molecule is highlighted in blue. The distribution of the exchange angles (Figure B) along reactive pathways indicates that two alternative exchange pathways exist: In the indirect exchange mechanism, the leaving oxygen ligand and the entering water molecule occupy different positions on the water octahedron (Figure D). During activation, one water molecule from the second hydration shell enters the molecular void between the hydration shells. This motion leads to a concerted motion of the phosphate oxygen out of the first hydration shell. The distances of the leaving phosphate oxygen and the entering water are elongated compared to their equilibrium values (Table ). Reflecting balanced electrostatic interactions, the Mg2+-phosphate oxygen distance is smaller than the Mg2+-water oxygen distance due to the smaller partial charge on the phosphate oxygen compared to the water oxygen. The distances of the five closest molecules remain relatively unchanged (Table ). Yet, they rearrange such that the transition state has an approximate mirror symmetry. The mirror plane is perpendicular to the plane composed of Mg2+, the phosphate, and water oxygen and contains three water molecules (Figure D).

Table 1

Properties of the Transition State Ensemblea

mechanism	r_I [Å]	r_ex [Å]	r_s [Å]	α_ex [deg]	τ [ps]
indirect	3.17 ± 0.03	3.51 ± 0.03	1.928 ± 0.003	122.9 ± 14.7	0.48 ± 0.01
direct	3.28 ± 0.03	3.46 ± 0.03	1.928 ± 0.003	47.7 ± 5.1	0.56 ± 0.01

Mg2+-O1P distance r, Mg2+-oxygen distance of exchanging water rex, Mg2+-oxygen distance of the five closest water molecules r, angle between O1P, Mg2+, and the exchanging water molecule αex, average transition time τ. Standard deviations are indicated. A sample size of 1759 or 754 transition states was used for the indirect and direct exchange mechanisms, respectively. In the direct exchange mechanism, the leaving oxygen ligand and the entering water molecule occupy the same positions on the water octahedron (Figure E). The exchange arises via the attack of the incoming water onto the edge of the octahedron formed by the oxygen ligands. Similar as in the indirect pathway, the Mg2+-phosphate oxygen distance is slightly smaller than the Mg2+-water oxygen distance, while the distances of the five closest water molecules remain relatively unchanged (Table ). In the transition state, the five closest water molecules form a square pyramidal coordination, and the transition state has a distorted C2 symmetry (Figure E). Based on the change of bond length during activation, both pathways correspond to an interchange dissociative (I) process and are akin to the pathways of water exchange.[31] In equilibrium, the indirect exchange is observed much more frequently (92%) compared to the direct mechanism (8%) since configurations with cis positions of exchanging ligands (direct pathways) are energetically less favorable compared to trans positions (indirect pathways).[72] The results reveal that phosphate oxygen, exchanging water, and the five closest water molecules play a decisive role in the exchange mechanism. However, additional simulations in which those coordinates were fixed, while the solvent outside the first hydration shell was relaxed, show that they are yet insufficient to predict p. Consequently, solvent molecules beyond the first hydration shell are crucial for the exchange and need to be considered explicitly.

Optimized Deep Neural Networks for Quantitative Predictions

During the transition from outer-to-inner sphere binding, water molecules from the first two hydration shells couple to the exchange. The reordering of close and distant water molecules determines whether the process can proceed or not. This behavior might be expected since the kosmotropic Mg2+ ion causes strong orientational ordering in the first hydration shells leading to long-range and collective interactions. However, resolving the subtle rearrangements and providing a quantitative description of the solute–solvent coupling is a demanding problem that is impossible to solve by visual inspection. In order to make progress in quantifying the solvent participation, we use a deep neural network to model the outer-to-inner sphere exchange. Here, we focus the machine learning on the dominant indirect exchange. In particular, we use 17,477 structures along independent transition paths and their commitment probabilities pA (Figure A) to learn the functional relation between pA and the features describing the structure of Mg2+, RNA, and the first two hydration shells.

Figure 5

Reaction coordinate from artificial intelligence. (A) Data set used for machine learning. The committor probability pAtrue and free energy contour are shown as a function of the features r and s6 for the full data set consisting of 17,477 entries for indirect exchange pathways (72% training set, 8% validation set, 20% test set). (B) Committor values pApred predicted by the optimized deep neural network correlated with the values pAtrue obtained from the committor simulations. The RMS error of the prediction is 6.5%. (C) Comparison of the committor distribution p(pAtrue) for transition states selected from the deep learning algorithm, from linear regression over all features, and from the expert knowledge using eq and the binomial distribution expected for an ideal reaction coordinate.[61] (D) Ranking of the features according to their relevance in the machine learned reaction coordinate: single feature relevance r (top) and combined feature relevance r (bottom). The color indicates the rank of the features according to the combined feature relevance. The corresponding features are shown in Table and Table S2.

Table 2

Ranking of the Eight Most Important Features in the Machine Learned Reaction Coordinate According to Their Normalized Relevance (Eqs and 9)a

c-rank	feature	s-rank	feature
84	r_I – r_ex	84	r_I – r_ex
83	r_I	83	r_I
82	q₄₀	82	q₄₀
81	r_P	81	r_P
80	h_O1P¹	80	h_ex¹
79	h_ex¹	79	q₄₁
78	a₁	78	r_Cl
77	s₆	77	a₇

The combined feature rank (c-rank) corresponds to the combination of N – n features that contains the most information, i.e., yields the smallest log likelihood loss upon permuting n features. The single feature rank (s-rank) corresponds to the feature which yields the highest loss upon permuting the feature while leaving all other features unchanged. The ranking of all N = 83 features is given in Table S1.

The choice of the architecture of the deep learning model is essential for its performance. To derive the optimal deep learning model, we have defined an algorithm that systematically selects the optimal architecture by scanning through thousands of individual models using hyperparameter optimization. In order to make robust predictions and prevent overfitting, we optimize the neural network on the training set and select the model that performs best on the validation set. Finally, using the test set, we illustrate the performance of the optimal machine learning model in predicting the commitment probability of unknown structures (Figure B). The results demonstrate that the optimized deep neural network is capable of predicting pA with an RMS error of 6.5%. The high accuracy in predicting the progress of the binding process clearly shows that the network extracts all relevant information, combines it in a weighted nonlinear fashion to a scalar reaction coordinate, and captures the details of the solute–solvent coupling precisely. Further insight into the quality of the predictions is obtained from the distribution of commitment probability p(pA) of the transition states selected autonomously by the neural network (Figure C). The distribution is unimodal and sharply peaked at pA ≈ 0.5 and closely resembles the binomial distribution expected for an ideal reaction coordinate.[61] Consequently, the information contained in the features is sufficient for the network to single out the transition state. The results from deep learning provide significant improvement compared to the reaction coordinate based on expert knowledge and highlight the machine learning ansatz as a particularly useful strategy in recognizing the relevant patterns. In addition, the results from deep learning outperform the results from linear regression based on the exact same features (Figure C). Therefore, hidden layers and a nonlinear combination of features using activation functions are essential to describe the many-body interactions. At the same time, the complexity of the deep learning model prohibits further insight into the reaction mechanism. In this regard, it is particularly useful to evaluate the contribution of each feature. Two complementary rankings are presented in Figure D. The single feature ranking quantifies the relevance of a single feature independent of all others. Since the single feature ranking can contain redundant information, the complementary combined feature rank is given, which yields the most relevant combination of features. Taken together, a consistent picture emerges: All features describing the molecular structure within the first two hydration shells contain information that is relevant for the process. Adding this information to the model leads to a continuous decrease of the likelihood loss and therefore to a linear increase of the relevance r. Finally, four features give rise to an exponential increase in the relevance and carry about 40% of the information (Table ). These features reflect the concerted motion of the leaving phosphate oxygen and the entering water molecule as well as the rearrangement of the five closest water molecules from octahedral to mirror symmetry. At the same time, the remaining 60% of the information is distributed over the remaining 79 features reflecting the importance and many-body nature of the solvent–solute coupling. The combined feature rank (c-rank) corresponds to the combination of N – n features that contains the most information, i.e., yields the smallest log likelihood loss upon permuting n features. The single feature rank (s-rank) corresponds to the feature which yields the highest loss upon permuting the feature while leaving all other features unchanged. The ranking of all N = 83 features is given in Table S1.

Conclusion

Characterizing the kinetic pathways of Mg2+ binding to biomolecules such as RNA is fundamental in understanding its functional role in biochemical processes. Yet, the transition from water-mediated outer-sphere to direct inner-sphere binding is on the millisecond time scale and therefore out of reach for conventional all-atom simulations. To fill this gap, we used transition path sampling to resolve the kinetic pathways of Mg2+ binding to the most important ion binding site on RNA, namely the phosphate oxygen. The results reveal a superior indirect pathway and an inferior direct exchange pathway. In both pathways, the molecular void in the first hydration shell provoked by the leaving phosphate oxygen is immediately filled by an entering water. At the same time, the water molecules from the first and second hydration shell couple to the concerted exchange. These long-range and collective interactions are a direct consequence of the high ionic charge density of Mg2+ which provokes strong ordering in the surrounding solvent and an intimate coupling between solute and solvent. Consequently, ligand exchange gives rise to a complex interplay of orientational, packing, and hydrogen bonding effects in which the collective reorientation and translation of several solvent molecules become important. A complete understanding of the dynamics requires knowledge of the detailed molecular motions involved in the exchange process. However, quantifying their contribution is exceptionally difficult. Instantaneous fluctuations, resulting from different solvent configurations, influence the progress in particular since the time scale of solvent reorientation (about 10 ps) is slower than the reactive motion itself (0.4 ps). In order to quantify how the reactant and solvent dynamics are coupled, the rapidly growing field of machine learning offers exciting possibilities in extracting complex patterns in large data sets and in finding optimal reaction coordinates.[26−28,60,73,74] The results presented here reveal that a properly optimized deep neural network is particularly suited to recognize the molecular motions that occur during the course of the binding, to extract the relevant information, and to predict the commitment probability with high accuracy. About half of the information on the dynamics is contained in the concerted motion of the leaving phosphate oxygen and the entering water molecule as well as in the rearrangement of the five closest water molecules. The other half is contained in the solvent structure rendering deep neural networks particularly useful in recognizing the relevant molecular structures. The question how the solvent affects processes in aqueous solutions is ubiquitous in all areas of chemistry and biology. Here, machine learning provides a promising perspective to explore the intimate coupling between charged solutes and the solvent at the molecular level.[26,27,73] Still, more work is required to investigate whether a simple one-dimensional reaction coordinate exists that captures such complex dynamics.[75]

44 in total

1. Autoionization in liquid water.

Authors: P L Geissler; C Dellago; D Chandler; J Hutter; M Parrinello
Journal: Science Date: 2001-03-16 Impact factor: 47.728

2. Transition path sampling: throwing ropes over rough mountain passes, in the dark.

Authors: Peter G Bolhuis; David Chandler; Christoph Dellago; Phillip L Geissler
Journal: Annu Rev Phys Chem Date: 2001-10-04 Impact factor: 12.703

3. Simulation of Ca2+ and Mg2+ solvation using polarizable atomic multipole potential.

Authors: Dian Jiao; Christopher King; Alan Grossfield; Thomas A Darden; Pengyu Ren
Journal: J Phys Chem B Date: 2006-09-21 Impact factor: 2.991

4. Obtaining reaction coordinates by likelihood maximization.

Authors: Baron Peters; Bernhardt L Trout
Journal: J Chem Phys Date: 2006-08-07 Impact factor: 3.488

5. Refinement of the AMBER force field for nucleic acids: improving the description of alpha/gamma conformers.

Authors: Alberto Pérez; Iván Marchán; Daniel Svozil; Jiri Sponer; Thomas E Cheatham; Charles A Laughton; Modesto Orozco
Journal: Biophys J Date: 2007-03-09 Impact factor: 4.033

Review 6. Metal ions and RNA folding: a highly charged topic with a dynamic future.

Authors: Sarah A Woodson
Journal: Curr Opin Chem Biol Date: 2005-04 Impact factor: 8.822

7. Transition path sampling of rare events by shooting from the top.

Authors: Hendrik Jung; Kei-Ichi Okazaki; Gerhard Hummer
Journal: J Chem Phys Date: 2017-10-21 Impact factor: 3.488

8. Modeling Electronic Polarizability Changes in the Course of a Magnesium Ion Water Ligand Exchange Process.

Authors: Igor V Kurnikov; Maria Kurnikova
Journal: J Phys Chem B Date: 2015-07-31 Impact factor: 2.991

Review 9. Dynamics of chemical processes in polar solvents.

Authors: P J Rossky; J D Simon
Journal: Nature Date: 1994-07-28 Impact factor: 49.962

10. Current status of the AMOEBA polarizable force field.

Authors: Jay W Ponder; Chuanjie Wu; Pengyu Ren; Vijay S Pande; John D Chodera; Michael J Schnieders; Imran Haque; David L Mobley; Daniel S Lambrecht; Robert A DiStasio; Martin Head-Gordon; Gary N I Clark; Margaret E Johnson; Teresa Head-Gordon
Journal: J Phys Chem B Date: 2010-03-04 Impact factor: 2.991

1 in total

Review 1. AlphaFold, Artificial Intelligence (AI), and Allostery.

Authors: Ruth Nussinov; Mingzhen Zhang; Yonglan Liu; Hyunbum Jang
Journal: J Phys Chem B Date: 2022-08-17 Impact factor: 3.466

1 in total