Literature DB >> 25838811

Enhanced Conformational Sampling Using Replica Exchange with Collective-Variable Tempering.

Abstract

The computational study of conformational transitions in RNA and proteins with atomistic molecular dynamics often requires suitable enhanced sampling techniques. We here introduce a novel method where concurrent metadynamics are integrated in a Hamiltonian replica-exchange scheme. The ladder of replicas is built with different strengths of the bias potential exploiting the tunability of well-tempered metadynamics. Using this method, free-energy barriers of individual collective variables are significantly reduced compared with simple force-field scaling. The introduced methodology is flexible and allows adaptive bias potentials to be self-consistently constructed for a large number of simple collective variables, such as distances and dihedral angles. The method is tested on alanine dipeptide and applied to the difficult problem of conformational sampling in a tetranucleotide.

Entities: Chemical Disease Gene Mutation Species

Mesh：

Substances：
Dipeptides
RNA
Alanine

Year: 2015 PMID： 25838811 PMCID： PMC4364913 DOI： 10.1021/ct5009087

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

Biomolecular dynamics involves a wide range of time scales ranging from bond fluctuations to slow large-scale motions.[1−3] Molecular dynamics (MD) with accurate force fields can in principle be used as a virtual microscope to investigate these motions at atomistic resolution.[4] However, its applicability to problems such as folding or conformational transitions in proteins and RNA is limited by the fact that only short time scales (∼μs) are directly accessible by straightforward simulation. In spite of the development of ad hoc hardware,[5] many relevant processes are still out of reach for accurate atomistic modeling. Several different techniques have been developed in the last decades to address this issue. These techniques can be roughly classified in two groups.[6] In the first group, inspired by annealing,[7] ergodicity is achieved by increasing the temperature[8,9] or by artificially modifying the Hamiltonian.[10−13] In methods of this group a ladder of replicas with different degree of ergodicity is often employed. Swaps of coordinates between neighboring replicas are periodically attempted and accepted or rejected with a Metropolis criterion. These methods are usually referred to as replica exchange molecular dynamics (REMD). Whereas temperature is certainly the most adopted control parameter, temperature REMD (T-REMD) is computationally demanding for solvated systems, because replica spacing and interchange probability depends on the system size.[14] Moreover, T-REMD is ineffective on entropic barriers.[15,16] Scaling of portions of the Hamiltonian (H-REMD) is a common alternative and could have a better convergence behavior for large systems. T-REMD and H-REMD can also be combined, by integrating both schemes on each replica[17] or in a multidimensional framework.[18] The second group of enhanced sampling techniques includes methods based on importance sampling, where suitable collective variables (CVs) are a priori selected and biased. This class has its root in umbrella sampling method[19] and includes local elevation,[20] conformational flooding,[21] adaptive biasing force (ABF),[22] and metadynamics.[23,24] These methods are very efficient but require large a priori information. With the notable exception of bias-exchange metadynamics,[25] approaches based on importance sampling have been traditionally applied to a small number of CVs at a time, due to the difficulties in building a history dependent potential in a high-dimensional CV space. For many systems it is difficult to find a small number of effective CVs that describe the slow degrees of freedom, and one often has to resort to expensive methods of the first class. For instance, conformational transitions in unstructured oligonucleotides have been studied with different REMD schemes.[18,26,27] In these studies, the generation of a converged conformational ensemble was proven a cumbersome task, even when a high number of replicas and tens of μs of simulated time were employed. Methods bridging between these two classes can be designed by biasing a large number of local CVs (e.g., dihedral angles), so as to avoid the complication of designing ad hoc CVs. For example, Straatsmaa and McCammon[28] introduced a technique where bias potentials acting on dihedrals was used in a simulated annealing protocol. In that work a bias was fitted to the potential of mean force of backbone dihedrals and then used to quickly optimize the structure of a polypeptide. Zacharias and collaborators[29,30] used a similar technique to build bias potentials for dihedrals to be employed in H-REMD and successfully applied it to study conformational changes in proteins. However, these potentials were fitted on a model system and cannot account for the specific identity of each residue and for the cross-talk between correlated dihedrals. For nucleic acids, where the complex backbone does not allow a straightforward application of this technique, penalty potentials centered on the stable rotamers were manually selected with a procedure that seems difficult to generalize.[31−33] In this paper we propose to use concurrent well-tempered metadynamics[24] (WT-MetaD) to build bias potentials acting on a large number of local CVs. We then show how to integrate this approach in a H-REMD scheme, exploiting the replica ladder to obtain unbiased conformations. In WT-MetaD the compensation of the underlying free-energy landscape is modulated by the so-called bias factor γ. We here change this parameter across the replica ladder, adjusting the ergodicity of each replica. The final bias could be also used as a static potential so as to completely eliminate any nonequilibrium effect. Since the effect of the bias is that of keeping the chosen CVs at an effectively higher temperature, we refer to the introduced method as replica exchange with collective-variable tempering (RECT). The method is first tested on alanine dipeptide in water and then applied to the conformational sampling of a RNA tetranucleotide where it outperforms dihedral-scaling REMD and plain MD. The chosen tetranucleotide is a very challenging system that has been recently studied with different variants of REMD.[18,26,27]

Methods

In this section we show how to use WT-MetaD as an effective method to build bias potentials that allow barriers to be easily crossed. One of the input parameters of well-tempered metadynamics is a boosting temperature Δ = (γ–1)T, where γ is the bias factor and T is the temperature of the system. In the rest of the paper we will equivalently use either γ or Δ so as to simplify the notation. This parameter can be used to smoothly interpolate between unbiased sampling (γ = 1, ΔT = 0) and flat histogram (γ → ∞, Δ → ∞). One can thus introduce a set of replicas using different values of Δ, ranging from 0 to a value large enough to allow all the relevant barriers to be crossed. Metadynamics relies on the accumulation of a history dependent potential and cannot be applied straightforwardly to a large number of CVs. In the next subsection we show that this issue can be circumvented by performing many, low-dimensional, concurrent metadynamics simulations. We then show how to combine many simulations of this kind in a multiple-replica scheme.

Concurrent Well-tempered Metadynamics

In well-tempered metadynamics a history dependent potential V(s,t) acting on the collective variable s is introduced and evolved according to the following equation of motionHere kB is the Boltzmann constant, T is the temperature, τ is the characteristic time for the bias evolution, Δ is a boosting temperature, and K is a kernel function which is usually defined as a Gaussian. For simplicity we consider the case of a single CV. The variance of the Gaussian provides the binning in CV space and is usually chosen based on CV fluctuations or adjusted on the fly.[34] By assuming that the bias is growing uniformly with time one can show rigorously[24,35] that in the long time limit the bias potential tends toso that the following probability distribution is sampledThe role of Δ is that of setting the effective temperature for the CV. The explored conformations are thus taken from an ensemble where that CV only is kept at an artificially high temperature, similar to other methods,[36−38] but has the nice feature that it is obtained with a bias that is quasi-static in the long lime limit. The bias is usually grown by adding a Gaussian every NG steps. As a consequence, to obtain an initial growing rate equal to (kBΔ)/(τB), the initial Gaussian height should be chosen equal to ((kBΔ)/(τB))NGΔt where Δt is the MD time step. We here propose to introduce a separate history-dependent potential on each CVwhere α = 1,...,NCV is the index of the CV, and NCV is the number of CVs. The growth of each of these bias potentials will depend only on the marginal probability for each CVIn the long time limit, this potential will tend to flatten the marginal probabilities for every single CV. Since CVs are in general nonorthogonal between each other, one should consider the fact that whenever a bias is added on a CV also the distribution of the other CVs is affected. In the following we will discuss this issue considering two CVs only, but the argument is straightforwardly generalized to a larger number of CVs. Two Independent Variables. If two CVs are independent, the joint probability is just the product of the two marginal probabilities, i.e. P(sα,sβ) = Pα(sα)Pβ(sβ). Adding a bias potential on a CV will not affect the distribution of the other. As a consequence, in the long time limit the two bias potentials will converge independently to the predicted fraction of the free energy as in eq 1. The final bias potential will be completely equivalent to that obtained from a two-dimensional well-tempered metadynamics but will only need the accumulation of two one-dimensional histograms, thus requiring a fraction of the time to converge. A simple example on a model potential is shown in Figure S1. Two Identical Variables. We also consider the case of two identical CVs, sα = sβ. This can be obtained for instance by biasing twice the same torsional angle. Here the potentials Vα and Vβ will grow identically, and the total bias potential acting on sα will be Vtot = 2Vα. The total potential will grow asThus, the net effect will be exactly equivalent to that of choosing a doubled Δ parameter. In other words, the Δ parameter acts in an additive way on the selected CVs. A similar effect can be expected if two CVs are linearly correlated. In realistic applications one can expect the behavior to be somewhere in the middle between these two limiting cases. The most important consideration here is that the bias potentials will tend to flatten all the marginal probabilities, but there will be no guarantee that the joint probability is flattened. Results for a simple functional form can be seen in Figure S2. In the same figure it is possible to appreciate the importance of using a self-consistent procedure when CVs are correlated. In ref (39) two metadynamics were applied on top of each other, namely on the potential energy and on selected CVs, in a non-self-consistent way. This was possible because the correlation between the potential energy and the selected CVs is small. The need for a self-consistent solution was also pointed out in a recent paper[40] where a generalization of the ABF method[22] was introduced. In that work independent one-dimensional adaptive forces were applied at the same time to different CVs so as to enhance the sampling of a high multidimensional space. In short, the novelty of the introduced procedure is that many low-dimensional metadynamics potentials are grown instead of a single multidimensional one. This allows the bias to converge very quickly to a flattening potential, with the degree of flatness controlled by the parameter Δ. The flattening is expected to enhance conformational transitions which are otherwise hindered by free-energy barriers on the biased CVs. When variables are correlated the exact relationship between bias and free energy (eq 1) could be lost.

Hamiltonian Replica Exchange

The procedure introduced above produces conformations in an ensemble which is in general difficult to predict. However, since the bias potential is known, one can in principle reweight results so as to extract conformations in the canonical ensemble. In the case of static bias potentials acting on the CVs, this can be done by weighting each frame as e(∑. This can provide in principle correct results even if the joint probability is not flattened. It must be noticed that such a reweighting can provide statistically meaningful results only for small fluctuations of the total biasing potentials, on the order of kBT.[40] However, in a typical setup one would be interested in biasing all the torsional angles of a molecule. Even if each of them contributes with a few kBT, the total fluctuation of the bias would grow with the system size. For similar reasons, also the ABF-based scheme introduced in ref (40) is limited to a relatively low number of CVs. A more robust and scalable procedure can be designed by introducing a ladder of replicas with increasing values of Δ, ranging from 0 to a value large enough to enhance the relevant conformational transitions. The first replica (γ = 1, Δ = 0) can be used to accumulate unbiased statistics. Replicas other than the first one feel multiple biasing potentials on all the CVs. From time to time an exchange of coordinates between neighboring replicas is proposed and accepted with probability chosen so as to enforce detailed balance with respect to the current biasing potential:Here the suffix i = 1,...,Nrep indicates the replica index, Nrep being the number of replicas. The exchanges allow the bias potential of every single replica to grow as close as possible to equilibrium taking advantage of the enhanced ergodicity of the more biased replicas. We notice that to reach a quasi-static distribution it is necessary that all the bias potentials converge for all the replicas. Since the time scale for convergence is related to the parameter τB,[24] it is convenient to use the same τB for all the replicas or, equivalently, to choose the initial deposition rate as proportional to Δ. The number of replicas required to span a given range in the Δ parameter is proportional to (NCV)1/2, thus allowing for a very large number of CVs. We underline that, if a very large number of CVs has to be calculated at every step on every replica, the overall performances could be degraded. This issue could be tackled using e.g. multiple time-step schemes.[41,42] In the examples presented here this performance issue was not observed. We notice that in principle one could use the bias potentials built with this protocol to perform a replica-exchange umbrella sampling simulation. In this manner the final production run would be performed with an equilibrium replica-exchange simulation. However, we observe that well-tempered metadynamics is designed so that the speed at which the bias grows decreases with time and the potential becomes quasi-static. In the practical cases we investigated, this second stage was not necessary.

Model Systems

Alanine Dipeptide

Alanine dipeptide (dALA) was modeled with the Amber99SB-ILDN[43,44] force field and solvated in a truncated octahedron box containing 599 TIP3P[45] water molecules. The LINCS[46,47] algorithm was used to constrain all bonds, and equations of motion were integrated with a time step of 2 fs. For each replica the system temperature was kept at 300 K by the stochastic velocity rescaling thermostat.[48] For all nonbonded interactions the direct space cutoff was set to 0.8 nm, and the electrostatic long-range interactions were treated using the default particle-mesh Ewald[49] settings. All the simulations were run using GROMACS 4.6.5[50] patched with the PLUMED 2.0 plugin.[51] We underline that the possibility of running concurrent metadynamics within the same replica is a novelty introduced in PLUMED 2.0. The RECT simulation was performed with 6 replicas. The backbones dihedral angles (Ψ and Φ) and the gyration radius (R) were selected as CVs. The γ factors were chosen from 1 to 15 following a geometric distribution. We recall that a geometric replica distribution is optimal for constant specific-heat systems. In RECT, this would be true if the exploration of each of the biased CVs were limited to a quasi-parabolic minimum in the free-energy landscape. Whereas this is clearly not true in real cases (e.g., double-well landscapes) we found that a geometric schedule was leading to a reasonable acceptance in the cases investigated here. The possibility of optimizing the replica ladder is left as a subject for further investigation. For the dihedral angles the Gaussian width was set to 0.35 rad and for the R to 0.007 nm. The Gaussians were deposited every 500 steps. The initial Gaussian height was adjusted to the Δ of each replica, according to the relation h = (kBΔ)/(τB)NGΔt, in order to maintain the same τB = 12 ps across the entire replica ladder. The CVs were monitored every 100 steps, and exchanges were attempted with the same frequency. The simulation was run for 20 ns per replica. A H-REMD simulation where the force-field dihedral terms were scaled (H-REMD) was also performed, as implemented in an in-house version of the GROMACS code.[52] The same initial structures, number of replicas, and simulated time as in RECT were used. The scaling factor λ for each replica was selected using the relation λ = 1/γ to allow for a fair comparison of RECT and H-REMD. Finally a conventional MD simulation in the NVT ensemble was run for 120 ns using the same settings.

Tetranucleotide

The second system considered was an RNA oligonucleotide, sequence GACC. The initial coordinates were taken from a ribosome crystal structure (PDB: 3G6E), residue 2623 to 2626. Simulations were performed using the Amber99-bsc0 force field.[43,53,54] The system was solvated in a box containing 2502 TIP3P[45] water molecules, and the system charge was neutralized by adding 3 Na+ counterions, consistently with previous simulations.[26,55] A RECT simulation was performed using 16 replicas simulated for 300 ns each. The γ ladder was chosen in the range from 1 to 4 following again a geometric distribution. The initial structures for the H-REMD were taken from a 500 ps MD at 600 K, to avoid correlations of the bias during the initial deposition stage of the WT-MetaD. Other details of the simulation protocol were chosen as for the previous system. As depicted in Figure 1, for each residue the dihedrals of the nucleic acid backbone (α, β, ϵ, γ, ς), together with the pseudodihedrals angles of the ribose ring (θ1 and θ2) and the glycosidic torsion angle (χ) were chosen as CVs. To help the free rotation of the nucleotide heterocyclic base around the glycosidic bond, the minimum distance between the center of mass of each base with the other three bases was also biased. For the WT-MetaD we used the same parameters as in the previous system. Gaussian width for the minimum distance between bases was chosen equal to 0.05 nm.

Figure 1

Schematic representation of the collective variables used for the tetranucleotide simulation. For each nucleotide, the labeled dihedral angles and the minimum distance between each nucleobase center of mass and the other three nucleobases were biased.

For this system a H-REMD, a T-REMD, and a plain MD simulation were performed in addition to the RECT. In the case of H-REMD we used 24 replicas with scaling factors λ ranging from 1 to 0.25, so as to cover the same range of the γ values chosen for the RECT. In the T-REMD 24 replicas were used to cover a temperature range between 300 and 400 K with a geometric distribution. For both methods, T-REMD and H-REMD, the simulation length was 200 ns per replica. Exchanges were attempted every 120 steps. The conventional MD simulation was run for 4.8 μs. All the simulations (RECT, H-REMD, T-REMD, and conventional MD) correspond to the same total simulated time. Schematic representation of the collective variables used for the tetranucleotide simulation. For each nucleotide, the labeled dihedral angles and the minimum distance between each nucleobase center of mass and the other three nucleobases were biased.

Analysis

Dihedral Entropy

As the bias compensates the underlying free energy the probability distribution of the biased CVs is partially flattened. The main CVs used in our method are dihedrals angles. To quantify the effect of the Hamiltonian modifications on the angle distributions one-dimensional entropies (S1) were estimated. The calculation procedure was equivalent to the one used in ref (56) to evaluate the configurational entropy associated with soft degrees of freedom in proteins. We employed wrapped Gaussian kernels to estimate the histogram profile of each dihedral. Histograms were calculated with PLUMED 2.0. For all the distributions the bandwidth for the kernel density estimation was set to 0.017 rad. We underline that using this definition we only evaluate the flatness of the individual one-dimensional distributions, and cross-correlation between CVs is ignored.

RNA Conformations

RNA conformations were classified according to the combination of the χ angle rotameric state of each nucleotide. Torsion orientations in the range of −0.26 to 2.01 rad were considered as syn, while the remaining ones were classified as anti. The limiting values were chosen according to the position of the barriers in the χ free-energy profiles of all the residues. The result of this clustering procedure gave 24= 16 different states that are kinetically well separated by the high torsional barriers. We observe that the population of these states does not depend only on the torsional potential associated with the χ dihedrals but include contributions from base–base stacking, hydrogen bonds, solvation of bases, etc.

Results

In this section we first show the validation of our methodology on a standard model system, dALA in water. Then we present results for the more challenging case of the conformational sampling of a tetranucleotide. For all the applications we benchmark against plain MD and a H-REMD where the dihedral potentials are scaled. For the tetranucleotide also results obtained with T-REMD are shown. All the comparisons are made using the same total simulated time.

Alanine Dipeptide

The goal of the introduced method is to enhance conformational sampling in the unbiased replica. The possibility to explore different metastable conformations in this replica relies on the fact that probability distributions in the biased replicas are flattened and that conformations can travel across the replica ladder. These conditions can be verified by monitoring the exchange rate and the flatness of the distributions. The acceptance rate is in the range 65–72% for RECT and in the range 43–53% for H-REMD, indicating that the former method requires less replicas. This is likely due to the fact that the total number of scaled dihedrals in H-REMD is larger than the number of biased CVs in RECT. For both REMD methods we also verified that all the trajectories in the generalized ensemble sampled the same conformational ensemble (see Figure S3). A quantitative measure of the flatness of the distribution in the biased replicas can be obtained from the dihedral entropy, shown in Figure 2 as a function of scaling factors (γ and λ for RECT and H-REMD, respectively). The limiting value corresponding to a flat distribution is also indicated. Entropy grows faster as a function of the scaling factor when using RECT, indicating that free-energy barriers on the dALA isomerization transition are more effectively compensated by the bias potentials. With H-REMD entropy of Ψ angle saturates and apparently the distribution cannot be further flattened by decreasing λ. In the case of the Φ angle, the dihedral entropy does not grow monotonically when λ is decreased. This behavior indicates that the relevant free-energy barriers are not only originating from the dihedral force-field terms. The conformational transitions involve indeed also changes in water coordination, reorganization of hydrogen bonds, nonbonded interactions, etc. On the contrary, RECT achieves an almost flat distribution for both dihedral angles at the highest value of the γ factor. Backbone dihedral distributions for all the replicas are shown in Figure S4. The conformations sampled on each replica are shown projected on the Φ,Ψ free-energy landscape in Figure S5, where it can be appreciated that all the relevant basins (α, β, and α) are explored and connected by points close to the minimum-action pathways (see refs (40 and 57)).

Figure 2

Entropy for Ψ(top) and Φ(bottom) dihedral angles in alanine dipeptide. Entropies are shown as a function of 1/λ and γ for H-REMD and RECT, respectively. As the entropy increases the dihedral distributions become more flat. The maximum entropy value corresponding to a flat distribution is represented with a straight line.

To assess the efficiency and the accuracy of the introduced enhanced sampling technique the free-energy difference ΔF between the states ϕ∈[−π,0] and ϕ∈[0, (π)/(2)] was calculated from the distribution of the unbiased replica. Results are shown as a function of time in Figure 3, for the two REMD schemes and for the reference conventional MD. Both H-REMD methods converge to the right value with a similar behavior, whereas plain MD needs several tens of ns for the first transition to be observed. The similarity in the convergence of RECT and H-REMD indicates that for this system the moderate flattening of the distribution induced by H-REMD is sufficient to achieve ergodicity on this time scale. In order to better evaluate differences between the performance of RECT and H-REMD we applied these methodologies to a more complex system. Results are shown in the next section.

Figure 3

Estimate of the free-energy difference between the two metastable minima in alanine dipeptide. Data are shown for both replica-exchange methods (H-REMD and RECT) and for conventional MD as a function of the total simulated time.

Tetranucleotide

Also in this case we monitor the average exchange ratio (76–83% for H-REMD, 25–32% for T-REMD, and 60–80% for RECT). In Figure S7 the variation of the exchange ratios in time is shown for the exchanges between the first 2 and the last 2 replicas of each method. We also checked the consistency of trajectories along the replica ladder. As it can be appreciated in Figure S6, for H-REMD and RECT the empirical distribution of RMSD is very similar for all the trajectories in the generalized ensemble, indicating that, for each method, all of them sampled the same conformational space. On the contrary, in the case of T-REMD, agreement among the distributions of RMSD is very poor. During this simulation trajectories across the temperature space remain trapped on different metastable conformations. The same behavior was obtained in ref (26) where several T-REMD simulations were performed on the same system, with the same number of replicas and a similar temperature range. In that work divergence among the obtained generalized ensembles was observed even for a simulated time as long as 2 μs per replica. For H-REMD and RECT round-trip times are shown on Figure S8. The average round-trip time is ≈0.5 ns for H-REMD, ≈1.8 ns for T-REMD, and ≈1.2 ns for RECT. Estimate of the free-energy difference between the two metastable minima in alanine dipeptide. Data are shown for both replica-exchange methods (H-REMD and RECT) and for conventional MD as a function of the total simulated time. In Figure 4 we show the sum of the entropies for the 32 dihedrals used as CVs. In this respect, RECT is clearly more effective than H-REMD in flattening the dihedral distributions, consistently with what was observed for dALA. Notably, the entropic increment observed in RECT is close to the one observed in T-REMD when using an equivalent temperature. This confirms that RECT has an effect comparable to that of raising the temperature of the biased CVs by a factor γ.

Figure 4

Total entropy of backbone, puckering, and glycosidic dihedral angles in the tetranucleotide for both replica-exchange methods. Entropies are shown as a function of 1/λ and γ, for H-REMD and RECT, respectively. For T-REMD, temperature is chosen as T = γ 300 K. As the entropy increases the dihedral distributions become more flat. The maximum entropy value corresponding to a flat distribution is represented with a straight line. Entropies obtained for the unbiased replicas in the three methods are consistent within their error bars (error not shown).

The significance of this entropic values could be appreciated on the time series and related histograms for all the dihedral angles shown in Figures S9–S12 for the most and least ergodic replica of H-REMD and RECT. It is clear that for RECT, at the most ergodic replica, all the accessible torsional range is sampled. On the contrary, in the highest replica of H-REMD the distributions of some torsions are not flattened. Total entropy of backbone, puckering, and glycosidic dihedral angles in the tetranucleotide for both replica-exchange methods. Entropies are shown as a function of 1/λ and γ, for H-REMD and RECT, respectively. For T-REMD, temperature is chosen as T = γ 300 K. As the entropy increases the dihedral distributions become more flat. The maximum entropy value corresponding to a flat distribution is represented with a straight line. Entropies obtained for the unbiased replicas in the three methods are consistent within their error bars (error not shown). The transition around the glycosidic bond, from anti to syn, is among the slowest relaxation times in RNA dynamics.[58] To evaluate the convergence of the unbiased replica we analyzed the population of the anti rotamer for each nucleotide χ angle. Populations are shown in Figure 5 as a function of the total simulated time. For all the nucleotides the anti conformations are preferred. The guanosine is the nucleotide with the highest syn proportion, and the cytidines are the ones with the smallest (<2%), as correspond to their rotameric preferences.[59] Values from both H-REMD approaches seem well consistent, except for the population of the first nucleotide. From the time behavior of these populations, it is clear that for all the REMD approaches the guanosine proportion of anti is the most difficult to converge. Here RECT can reach values close to a longer reservoir-REMD simulation,[26] while both H-REMD and T-REMD show results closer to those obtained from conventional MD, with a higher occupation of the anti conformer.

Figure 5

Estimated glycosidic angle anti population for each nucleotide as a function of the total simulated time. Data are shown for H-REMD, T-REMD, and RECT unbiased replicas and for conventional MD. Reference values taken from ref (26) are shown as dashed lines. We observe that our method is enforcing the exploration of both anti and syn conformations in the biased replicas for each nucleotide independently. This however does not guarantee that all 16 combinations of anti and syn conformations are explored. Figure 6 shows the free energy of the RNA structures grouped by the combination of the χ angle anti(a)/syn(s) rotamers. All 16 combinations, except for ssss and asss, are sampled in the unbiased replica from RECT. On the contrary, the unbiased replica from T-REMD and H-REMD explores respectively 13 and 8 of the states and plain MD only 5 of them. The most populated cluster corresponds to an all-anti conformation, followed by the saaa. Then, the three clusters asaa, ssaa, and sasa appear with similar population.

Figure 6

Estimated free energies for the tetranucleotide conformations clustered according to the χ angle anti/syn rotameric combinations (circles). Free energies are computed as −kBTlogP, where P is the normalized population of each cluster on the indicated replica. Gray boxes represent relative populations higher than 1%. Confidence intervals are shown as bars and span the range [−kBTlog(P+ΔP),–kBTlog(P–ΔP)], where ΔP is the standard deviation of the average P as obtained from four blocks. Clusters which are observed in only one of the four blocks have an infinite upper bound.

In the same figure the free-energy values for the ergodic replica show that all the 16 combinations are populated in RECT within a range of 6 kBT. In the case of H-REMD the most ergodic replica visits only 9 combinations with a population that is very close to that of the unbiased replica. The most ergodic replica in T-REMD explores 14 clusters, but their populations have large statistical errors. We highlight the fact that results from T-REMD could be affected by the lack of convergence of trajectories across the temperature space (see Figure S6). This could lead to an underestimation of the errors as evaluated from block analysis.

Discussion

The introduced method allows for building bias potentials for a Hamiltonian replica-exchange scheme using concurrent well-tempered metadynamics on several CVs. Replicas are simulated using a ladder of well-tempered bias factors γ. When CVs are correlated, the self-consistency among the bias potentials is crucial to achieve flat sampling in each individual CV, as illustrated in Figure S2. In this case the exact relationship between bias and free energy is lost. We also remark that here flattening is not complete but modulated by the value of γ. This is useful since it avoids sampling very high energy states (e.g., with steric clashes) that would have a very low chance of being accepted in the unbiased replica. The method compares favorably with both conventional MD and H-REMD. The method slightly outperforms T-REMD, where the entire system is heated, indicating that for these small systems there is not a substantial advantage in schemes where part of the system is biased. However, RECT can be straightforwardly generalized to large systems since the acceptance only depends on the size of the biased portion. Results from both dALA and tetranucleotide simulations show that the bias potentials constructed with concurrent WT-MetaD are able to gradually scale the free-energy barriers. We notice that only barriers in the one-dimensional free-energy profiles are compensated, which means that some regions in the multidimensional space of all the CVs might not be explored. In principle this could hide some important minima that would never be observed. We did not observe this problem in the applications presented here. Estimated free energies for the tetranucleotide conformations clustered according to the χ angle anti/syn rotameric combinations (circles). Free energies are computed as −kBTlogP, where P is the normalized population of each cluster on the indicated replica. Gray boxes represent relative populations higher than 1%. Confidence intervals are shown as bars and span the range [−kBTlog(P+ΔP),–kBTlog(P–ΔP)], where ΔP is the standard deviation of the average P as obtained from four blocks. Clusters which are observed in only one of the four blocks have an infinite upper bound. The second application on which we tested RECT, namely conformational sampling of a tetranucleotide, is particularly challenging. The conformational space of these small RNA molecules is not constrained by Watson–Crick pairings, and ergodic sampling is out of reach of conventional MD simulations.[26,55] So far, converged ensembles have been obtained only through highly expensive multidimensional REMD simulations, corresponding to a total simulated time of several tens of μs.[18,27] One of the reasons for this difficult convergence is the long relaxation time for the anti to syn transitions, which could be additionally hindered by an incorrect force-field description of base–base stacking and base–solvent interactions.[58] Figure 6 illustrates the ability of RECT to accelerate conformational transitions among the χ angle anti/syn rotamers. Although the conformational space of the more biased replicas is highly expanded, the convergence in the unbiased replica is not affected. On the contrary the method facilitates the sampling of glycosidic rotamer conformations that otherwise would not be explored by MD simulations of the same overall length. We finally remark that our procedure can be combined with weighted histogram[60] so as to include the statistics of the biased replicas.

Comparison with Related State-of-the-Art Methods

RECT is based on the idea of building a replica ladder where a large set of selected CVs is progressively heated. CVs are heated by flattening their distribution with concurrent well-tempered metadynamics. We first discuss the possibility of using methods other than well-tempered metadynamics to build the replica ladder. Possible alternatives here include ABF[22] or a recently proposed variational approach.[61] These methods could be used in a RECT scheme provided they are suitably extended so as to sample a partially flattened distribution. We also observe that other methods aimed at keeping selected CVs at a given temperature have been proposed based on coupling thermostats to CVs directly.[36−38] These techniques have been mostly used in the past with an exploration purpose relying on additional calculations so as to provide free energies (see, e.g., ref (62)), but it is not clear if they can be integrated in a RECT scheme. In the following we discuss the comparison of RECT with related methods that are not directly based on CV tempering. Comparison with H-REMD of Curuksu and Zacharias. Our method is closely related to the one introduced in ref (31). There, a bias potential aimed at disfavoring the most probable rotamers is manually constructed and applied on several replicas using a scaling factor. This bias disfavors the major minima but does not ensure a proper compensation of the free-energy barriers, as their positions and magnitudes are not a priori known. The main advantage of RECT is that several low-dimensional bias potentials are built with a self-consistent procedure so that the technique can be straightforwardly applied to a large number of degrees of freedom. Comparison with Bias-Exchange Metadynamics. In bias-exchange metadynamics every replica performs an independent metadynamics simulation so that one CV at a time is feeling the flattening potential. Thus, it is typically used with a relatively small number of ad hoc designed CVs capable of describing the relevant conformational transitions. On the other hand, RECT is designed to be used with a very large number of dummy CVs with little a priori information and to bias them concurrently to exploit their cooperation in enhancing conformational sampling. For this reason, the two approaches are complementary and could even be combined in a multidimensional replica exchange suitable for a massively parallel environment. Comparison with Solute Tempering and Related Methods. In replica exchange with solute tempering the solute Hamiltonian is scaled so as to obtain an effect equivalent to a rise in the simulation temperature.[11,13] Any set of atoms can be identified as solute, giving the opportunity to enhance sampling in a region localized in space.[52,63] This requires modifying charges of the enhanced region, with long-range effects and sometimes affecting fundamental properties such as hydrophobicity. In our method, the bias potentials act on precisely selected degrees of freedom minimally perturbing their coupling with the rest of the system. Moreover, the bias is adaptively built so as to compensate the free energy and not the potential energy, so that with properly chosen CVs it could be used to compensate entropic barriers. Comparison with Hyperdynamics and Accelerated MD. In these methods the potential energy of the system is modified so as to decrease the probability to sample minima on the potential energy.[27,64,65] On the contrary, RECT employs a bias which is related with the free energy so as to achieve a flatter histogram on the selected CVs. We finally remark that RECT, although formally based on the a priori choice of a set of CVs, typically requires the same amount of information as methods not based on CVs. Indeed, as we have shown, the method can be easily applied to a very large number of CVs, virtually including by construction all the slow degrees of freedom of the system. Additionally, when a few relevant CVs can be identified based on chemical intuition, RECT can be straightforwardly combined with standard metadynamics similarly to parallel tempering[66] or solute tempering.[67]

Conclusion

Replica exchange with collective-variable tempering (RECT) has been here proposed as a novel and flexible enhanced-sampling method. RECT takes advantage of the adaptive nature of well-tempered metadynamics to build bias potentials that compensate free-energy barriers. The flattening of the barriers is modulated by the well-tempered factor γ, and the chosen collective variables (CVs) are effectively kept at a higher temperature. The biasing potentials are built combining concurrent low-dimensional metadynamics protocols so as to be usable on a very large number of CVs. Multiple replicas are then used so as to smoothly interpolate between a highly biased, ergodic simulation and an unbiased one (γ = 1). The number of required replicas scales with the square root of the number of chosen CVs for a fixed range of γ factors. This allows a very large number of CVs to be biased, so that virtually all the relevant transitions can be accelerated. The CVs used here were mostly dihedral angles, which exhibit relevant barriers in many biomolecular conformational transitions, but the method can be used with any CV. The application of this technique to the dALA in water shows that the CV probability distributions are effectively flattened by the action of the bias potentials and unbiased statistics is correctly recovered. In the case of the tetranucleotide conformational sampling is greatly enhanced since RECT effectively overcomes the high free-energy barriers of the χ angle transitions that hindered the conformational sampling at room temperature. RECT is available in PLUMED, and a sample input file is provided in Figure S13. RECT is a promising tool to enhance the exploration of the conformational space in highly flexible biomolecular systems such as RNA, proteins, or RNA/protein complexes.

44 in total

1. Escaping free-energy minima.

Authors: Alessandro Laio; Michele Parrinello
Journal: Proc Natl Acad Sci U S A Date: 2002-09-23 Impact factor: 11.205

2. A Novel Hamiltonian Replica Exchange MD Protocol to Enhance Protein Conformational Space Sampling.

Authors: Roman Affentranger; Ivano Tavernelli; Ernesto E Di Iorio
Journal: J Chem Theory Comput Date: 2006-03 Impact factor: 6.006

3. Hydrophobic aided replica exchange: an efficient algorithm for protein folding in explicit solvent.

Authors: Pu Liu; Xuhui Huang; Ruhong Zhou; B J Berne
Journal: J Phys Chem B Date: 2006-09-28 Impact factor: 2.991

4. Well-tempered metadynamics: a smoothly converging and tunable free-energy method.

Authors: Alessandro Barducci; Giovanni Bussi; Michele Parrinello
Journal: Phys Rev Lett Date: 2008-01-18 Impact factor: 9.161

5. Exploring the protein G helix free-energy surface by solute tempering metadynamics.

Authors: Carlo Camilloni; Davide Provasi; Guido Tiana; Ricardo A Broglia
Journal: Proteins Date: 2008-06

Review 6. Biomolecular simulation: a computational microscope for molecular biology.

Authors: Ron O Dror; Robert M Dirks; J P Grossman; Huafeng Xu; David E Shaw
Journal: Annu Rev Biophys Date: 2012 Impact factor: 12.981

7. Effect of the disulfide bond on the monomeric structure of human amylin studied by combined Hamiltonian and temperature replica exchange molecular dynamics simulations.

Authors: Rozita Laghaei; Normand Mousseau; Guanghong Wei
Journal: J Phys Chem B Date: 2010-05-27 Impact factor: 2.991

8. Enhanced conformational sampling of carbohydrates by Hamiltonian replica-exchange simulation.

Authors: Sushil Kumar Mishra; Mahmut Kara; Martin Zacharias; Jaroslav Koca
Journal: Glycobiology Date: 2013-10-16 Impact factor: 4.313

9. Improved side-chain torsion potentials for the Amber ff99SB protein force field.

Authors: Kresten Lindorff-Larsen; Stefano Piana; Kim Palmo; Paul Maragakis; John L Klepeis; Ron O Dror; David E Shaw
Journal: Proteins Date: 2010-06

10. Benchmarking AMBER force fields for RNA: comparisons to NMR spectra for single-stranded r(GACC) are improved by revised χ torsions.

Authors: Ilyas Yildirim; Harry A Stern; Jason D Tubbs; Scott D Kennedy; Douglas H Turner
Journal: J Phys Chem B Date: 2011-07-01 Impact factor: 2.991

18 in total

1. Predicting the Kinetics of RNA Oligonucleotides Using Markov State Models.

Authors: Giovanni Pinamonti; Jianbo Zhao; David E Condon; Fabian Paul; Frank Noè; Douglas H Turner; Giovanni Bussi
Journal: J Chem Theory Comput Date: 2017-01-05 Impact factor: 6.006

Review 2. RNA Structural Dynamics As Captured by Molecular Simulations: A Comprehensive Overview.

Authors: Jiří Šponer; Giovanni Bussi; Miroslav Krepl; Pavel Banáš; Sandro Bottaro; Richard A Cunha; Alejandro Gil-Ley; Giovanni Pinamonti; Simón Poblete; Petr Jurečka; Nils G Walter; Michal Otyepka
Journal: Chem Rev Date: 2018-01-03 Impact factor: 60.622

3. A GPU-Accelerated Parameter Interpolation Thermodynamic Integration Free Energy Method.

Authors: Timothy J Giese; Darrin M York
Journal: J Chem Theory Comput Date: 2018-02-07 Impact factor: 6.006

4. Metadynamics combined with auxiliary density functional and density functional tight-binding methods: alanine dipeptide as a case study.

Authors: Jerome Cuny; Kseniia Korchagina; Chemseddine Menakbi; Tzonka Mineva
Journal: J Mol Model Date: 2017-02-15 Impact factor: 1.810

5. From data to noise to data for mixing physics across temperatures with generative artificial intelligence.

Authors: Yihang Wang; Lukas Herron; Pratyush Tiwary
Journal: Proc Natl Acad Sci U S A Date: 2022-08-04 Impact factor: 12.779

6. Conformational ensembles of an RNA hairpin using molecular dynamics and sparse NMR data.

Authors: Sabine Reißer; Silvia Zucchelli; Stefano Gustincich; Giovanni Bussi
Journal: Nucleic Acids Res Date: 2020-02-20 Impact factor: 16.971

7. Large-Scale Analysis of 48 DNA and 48 RNA Tetranucleotides Studied by 1 μs Explicit-Solvent Molecular Dynamics Simulations.

Authors: Michael V Schrodt; Casey T Andrews; Adrian H Elcock
Journal: J Chem Theory Comput Date: 2015-11-18 Impact factor: 6.006

8. Enhanced, targeted sampling of high-dimensional free-energy landscapes using variationally enhanced sampling, with an application to chignolin.

Authors: Patrick Shaffer; Omar Valsson; Michele Parrinello
Journal: Proc Natl Acad Sci U S A Date: 2016-01-19 Impact factor: 11.205

9. Enhanced conformational sampling using replica exchange with concurrent solute scaling and hamiltonian biasing realized in one dimension.

Authors: Mingjun Yang; Jing Huang; Alexander D MacKerell
Journal: J Chem Theory Comput Date: 2015-06-09 Impact factor: 6.006

10. Exploring Valleys without Climbing Every Peak: More Efficient and Forgiving Metabasin Metadynamics via Robust On-the-Fly Bias Domain Restriction.

Authors: James F Dama; Glen M Hocky; Rui Sun; Gregory A Voth
Journal: J Chem Theory Comput Date: 2015-11-20 Impact factor: 6.006