Literature DB >> 28034310

Gaussian Accelerated Molecular Dynamics in NAMD.

Yui Tik Pang¹, Yinglong Miao, Yi Wang¹, J Andrew McCammon.

Abstract

Gaussian accelerated molecular dynamics (GaMD) is a recently developed enhanced sampling technique that provides efficient free energy calculations of biomolecules. Like the previous accelerated molecular dynamics (aMD), GaMD allows for "unconstrained" enhanced sampling without the need to set predefined collective variables and so is useful for studying complex biomolecular conformational changes such as protein folding and ligand binding. Furthermore, because the boost potential is constructed using a harmonic function that follows Gaussian distribution in GaMD, cumulant expansion to the second order can be applied to recover the original free energy profiles of proteins and other large biomolecules, which solves a long-standing energetic reweighting problem of the previous aMD method. Taken together, GaMD offers major advantages for both unconstrained enhanced sampling and free energy calculations of large biomolecules. Here, we have implemented GaMD in the NAMD package on top of the existing aMD feature and validated it on three model systems: alanine dipeptide, the chignolin fast-folding protein, and the M3 muscarinic G protein-coupled receptor (GPCR). For alanine dipeptide, while conventional molecular dynamics (cMD) simulations performed for 30 ns are poorly converged, GaMD simulations of the same length yield free energy profiles that agree quantitatively with those of 1000 ns cMD simulation. Further GaMD simulations have captured folding of the chignolin and binding of the acetylcholine (ACh) endogenous agonist to the M3 muscarinic receptor. The reweighted free energy profiles are used to characterize the protein folding and ligand binding pathways quantitatively. GaMD implemented in the scalable NAMD is widely applicable to enhanced sampling and free energy calculations of large biomolecules.

Entities: Chemical Disease Gene Species

Mesh：

Substances：
Ligands
Proteins

Year: 2016 PMID： 28034310 PMCID： PMC5743237 DOI： 10.1021/acs.jctc.6b00931

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

Accelerated molecular dynamics (aMD) is an enhanced sampling technique that works by smoothing the potential energy surface to lower energy barriers and thus accelerate conformational transitions of biomolecules.[1] Without the need to set predefined collective variables (CVs), aMD allows “unconstrained” enhanced sampling of many complex biomolecules.[2] For example, aMD simulations provide significant speed-up of peptide and protein conformational transitions,[1,3] lipid diffusion and mixing,[4] protein folding,[5] and protein–ligand binding.[6] Hundreds-of-nanosecond aMD simulations are able to capture millisecond time scale events in both globular and membrane proteins.[7] While aMD is powerful for enhanced conformational sampling, its accuracy for free energy calculations has attracted lots of attention.[8] In theory, frames of aMD simulations can be reweighted by the Boltzmann factors of the corresponding boost potential (i.e., eΔ) and averaged over each bin of selected CV(s) to obtain the canonical ensemble. However, the exponential reweighting is known to suffer from large statistical noise in practical calculations[2,7c,8,9] because the Boltzmann reweighting factors are often dominated by a very few frames with high boost potential. The boost potential in aMD simulations of proteins is typically on the order of tens to hundreds of kilocalories per mole, which is much greater in magnitude and wider in distribution than that of CV-biasing simulations (e.g., several kilocalories per mole in metadynamics). It has been a long-standing problem to accurately reweight aMD simulations and recover the original free energy landscapes, especially for large proteins.[6,7c] Notably, when the boost potential follows near-Gaussian distribution, cumulant expansion to the second order provides improved reweighting of aMD simulations compared with the previously used exponential average and Maclaurin series expansion reweighting methods.[10] The reweighted free energy profiles are in good agreement with the long-time scale conventional molecular dynamics (cMD) simulations as demonstrated on alanine dipeptide and fast-folding proteins.[5b] However, such improvement is limited to rather small systems (e.g., proteins with less than ∼35 amino acid residues).[5b] In simulations of larger systems, the boost potential exhibits significantly wider distribution and does not allow for accurate reweighting. In order to achieve both unconstrained enhanced sampling and accurate energetic reweighting for free energy calculations of large biomolecules like proteins, Gaussian accelerated molecular dynamics (GaMD) has been developed by applying a harmonic boost potential to smooth the biomolecular potential energy surface.[11] GaMD greatly reduces the energy barriers and accelerates conformational transitions and ligand binding by orders of magnitude.[11] Moreover, because the boost potential follows Gaussian distribution, the original free energy profiles of biomolecules can be recovered through cumulant expansion to the second order.[11] GaMD solves the long-standing energetic reweighting problem as encountered in the previous aMD method[8] and allows us to characterize complex biomolecular conformational changes quantitatively.[11] Compared with other enhanced sampling methods such as the metadynamics[12] and adaptive biasing force (ABF),[13] GaMD does not require predefined CVs, which is advantageous for studying “free” protein folding and ligand binding processes[11] and efficient free energy calculations of large biomolecules.[11] Here, we have implemented GaMD in the NAMD package on top of its existing aMD feature. The potential statistics of simulated biomolecules are collected from short cMD and used to construct the harmonic boost potential. The implemented GaMD in NAMD will be demonstrated on three model systems that have been extensively studied earlier: alanine dipeptide,[10,11,14] the chignolin fast-folding protein,[5b,11] and the M3 muscarinic G protein-coupled receptor (GPCR) bound by the acetylcholine (ACh) endogenous agonist.[6,15]

Methods

Theory

GaMD enhances the conformational sampling of biomolecules by adding a harmonic boost potential to smooth the system potential energy surface when the system potential is lower than a reference energy E:[11]where k is the harmonic force constant. The two adjustable parameters E and k are automatically determined by applying the following three criteria. First, for any two arbitrary potential values and found on the original energy surface, if , ΔV should be a monotonic function that does not change the relative order of the biased potential values, i.e., . Second, if , the potential difference observed on the smoothed energy surface should be smaller than that of the original, i.e., . By combining the first two criteria and plugging in the formula of and ΔV, we obtainwhere Vmin and Vmax are the system minimum and maximum potential energies. To ensure that eq is valid, k has to satisfy: . Let us define , then 0 < k0 ≤ 1. Third, the standard deviation of ΔV needs to be small enough (i.e., narrow distribution) to ensure accurate reweighting using cumulant expansion to the second order:[10] σΔ = k(E – Vavg)σ ≤ σ0, where Vavg and σ are the average and standard deviation of the system potential energies, σΔ is the standard deviation of ΔV with σ0 as a user-specified upper limit (e.g., 10kBT) for accurate reweighting. When E is set to the lower bound E = Vmax according to eq , k0 can be calculated asAlternatively, when the threshold energy E is set to its upper bound E = Vmin + 1/k, k0 is set toif k0″ is found to be between 0 and 1. Otherwise, k0 is calculated using eq . For energetic reweighting of GaMD simulations, the probability distribution along a selected reaction coordinate A() is written as p*(A), where denotes the atomic positions {1, ..., }.Given the boost potential ΔV() of each frame, p*(A) can be reweighted to recover the canonical ensemble distribution, p(A), aswhere M is the number of bins, β = 1/kBT and ⟨eβΔV(⟩ is the ensemble-averaged Boltzmann factor of ΔV() for simulation frames found in the jth bin. In order to reduce the energetic noise, the ensemble-averaged reweighting factor can be approximated using cumulant expansion:[16]where the first three cumulants are given byAs shown earlier, when the boost potential follows near-Gaussian distribution, cumulant expansion to the second order provides the more accurate reweighting than the exponential average and Maclaurin series expansion methods.[10] Finally, the reweighted free energy is calculated as .

Implementation

Similar to the aMD implemented in NAMD,[14] three modes are available for applying boost potential to biomolecules in GaMD: (1) boosting the dihedral energetic term only, (2) boosting the total potential energy only, and (3) boosting both the dihedral and total potential energetic terms (i.e., “dual-boost”). The major code modification is to extend the aMD function in NAMD 2.11 to include the boost potential calculation used in GaMD (Appendix A). As described in the previous section, GaMD boost potential is computed based on statistics of the system potential such as the minimum, maximum, average and standard deviation. Therefore, three stages of simulation are needed to collect the potential statistics. They include the (i) cMD, (ii) equilibration, and (iii) production stages. The program first collects potential statistics from a short cMD run. Subsequently, a boost potential is added to the system in the equilibration stage while update of the potential statistics continues. During this stage, the boost potential applied in each step is computed based on the energetic statistics collected up to that particular step. After the equilibration stage, the statistics collected is assumed to be sufficient to represent the potential energy landscape of interest. Hence, the potential statistics are fixed to calculate the boost potential for running the production simulation. Note that in both the cMD and equilibration stages, there are a small number of steps at the beginning of each stage during which we do not collect statistics. These steps, named preparation steps, are performed to allow the system to adapt to the simulation environment. The program starts collecting statistics of the potential energies after the preparation steps.

Simulation Protocols and Benchmarks

For alanine dipeptide and chignolin, the AMBER ff99SB force field was used and the simulation systems were built using the Xleap module in the AMBER package[17] as described previously.[9,11] By solvating the structures in a TIP3P[18] water box that extends 8–10 Å from the solute surface, the alanine dipeptide system contained 630 water molecules and 2211 waters for chignolin. The total number of atoms in the two systems is 1912 and 6773 for alanine dipeptide and chignolin, respectively. Periodic boundary conditions were applied for the simulation systems. Bonds containing hydrogen atoms were restrained with the SHAKE algorithm[19] and a 2 fs time step was used. Weak coupling to an external temperature and pressure bath was used to control both temperature and pressure.[20] The electrostatic interactions were calculated using the particle mesh Ewald (PME) summation[21] with a cutoff of 8.0 Å for long-range interactions. After the initial energy minimization and thermalization as described earlier,[11] dual-boost GaMD was applied to simulate the two systems. The system threshold energy E for applying the boost potential was set to Vmax. The default parameter values were used for the GaMD simulations except stated otherwise. For alanine dipeptide, statistics of the system potential were first collected from an initial 2 ns cMD run, followed by a 6 ns equilibration run. Based on the statistics collected, the threshold energy E and harmonic force constant k were computed automatically according to eq . Finally, three independent 30 ns production runs were performed with different randomized initial atomic velocities. For chignolin, the dual-boost GaMD simulation includes 2 ns cMD, 50 ns equilibration after adding the boost potential, and then three independent 300 ns production runs. For the M3 muscarinic GPCR, the system was set up following the same protocol as used previously.[6,7d] The tiotropium (TTP)-bound X-ray structure (PDB: 4DAJ) of the M3 receptor[22] was used. After removal of TTP, the M3 receptor was inserted into a palmitoyl-oleoyl-phosphatidyl-choline (POPC) bilayer with all overlapping lipid molecules removed using the Membrane plugin and solvated in a water box using the Solvate plugin in VMD.[23] Four ACh ligand molecules were placed at least 40 Å away from the receptor orthosteric site in the bulk solvent. The system charges were then neutralized with 18 Cl– ions. The simulation systems of the M3 receptor initially measured about 80 Å × 87 Å × 97 Å with 130 lipid molecules, ∼11 200 water molecules and a total of ∼55 500 atoms. Initial energy minimization and thermalization of the M3 receptor system follow the same protocol as used in the previous study.[6] GaMD simulation was then performed using the dual-boost scheme with the threshold energy E set to Vmax. The simulation included 2 ns cMD, 50 ns equilibration after adding the boost potential and then three independent production runs with randomized atomic velocities (one for 400 ns and another two for 300 ns). The GaMD simulations were carried out using NAMD 2.11 on the Gordon supercomputer at the San Diego Supercomputing Center. Excellent scalability was obtained as shown in Figure S1. For the M3 muscarinic GPCR system, benchmark simulations showed that GaMD ran at ∼10 ns/day with 64 CPUs and up to ∼61 ns/day with 640 CPUs, which were ∼8–11% slower than the corresponding cMD runs. This performance is very similar to that of the conventional aMD implemented in NAMD.[14] GaMD production frames were saved every 0.1 ps. The VMD[23] and CPPTRAJ[24] tools were used for trajectory analysis. For chignolin, the root-mean square deviations (RMSD) of the Cα atoms relative to the native NMR structure (PDB: 1UAO) and the protein radius of gyration, Rg, were calculated. For the M3 muscarnic receptor, the density based spatial clustering of applications with noise (DBSCAN) algorithm[25] was applied to cluster the diffusing ligand molecules for identifying the highly populated binding sites. Finally, the PyReweighting toolkit[10] was applied to compute the potential of mean force (PMF) profiles of the backbone dihedrals Φ and Ψ in the alanine dipeptide, the (RMSD, Rg) of chignolin and structural clusters of the diffusing ACh in the M3 receptor system.

Results

Free Energy Profiles of the Alanine Dipeptide

With the three independent 30 ns GaMD production trajectories of the alanine dipeptide (Figure A), energetic reweighting was applied to calculate the free energy profiles of the backbone dihedrals Φ and Ψ. Analysis of the system boost potential showed that it followed Gaussian distribution with low anharmonicity (7.18 × 10–3; Figure B). The boost potential average is 6.93 and 1.87 kcal/mol for the standard deviation. With this, the cumulant expansion to the second order was applied for the reweighting.

Figure 1

(A) Schematic representation of backbone dihedrals Φ and Ψ in alanine dipeptide. (B) Distribution of the boost potential ΔV applied in the GaMD simulations with anharmonicity equal to 7.18 × 10–3. (C–D) Potential of mean force (PMF) profiles of the (C) Φ and (D) Ψ dihedrals calculated from three 30 ns GaMD simulations combined using cumulant expansion to the second order. (E) 2D PMF profile of backbone dihedrals (Φ, Ψ). The low energy wells are labeled corresponding to the right-handed α helix (αR), left-handed α helix (αL), β-sheet (β), and polyproline II (PII) conformations. (F) Distribution anharmonicity of ΔV of frames found in each bin of the PMF profile. In comparison, the reweighted PMF profiles obtained from 30 ns GaMD trajectories agree quantitatively with the original profiles from much longer 1000 ns cMD simulation. Although the GaMD derived PMF profile of Φ exhibits moderate fluctuations near the energy barrier at 0° and slightly elevated free energy well centered at ∼50° (Figure C), it essentially overlaps with the original profile in the other regions, similar for the entire PMF profile of Ψ (Figure D). In contrast, three cMD simulations did not properly sample the energy barriers of Φ at 0° and Ψ at 120° as shown in Figures S2A and S2B, respectively. For Φ, the energy well centered at 60° obtained from the 30 ns cMD simulations was higher than that from 1000 ns cMD simulation. Therefore, while cMD simulations performed for 30 ns are poorly converged for alanine dipeptide, GaMD simulations of the same length yielded significantly improved free energy profiles that agree quantitatively with those of the 1000 ns cMD simulation. In addition, we calculated a 2D PMF of (Φ, Ψ) in alanine dipeptide by reweighting the three 30 ns GaMD trajectories combined. As shown in Figure E, five free energy wells were identified in the reweighted PMF profile of (Φ, Ψ), which are centered around (−144°, 0°) and (−72°, −18°) for the right-handed α helix (αR), (48°, −6°) for the left-handed α helix (αL), (−150°, 156°) for the β-sheet, and (−72°, 162°) for the polyproline II (PII) conformation (Figure E). Their corresponding minimum free energies are estimated as 0, 0.47, 1.82, 1.44, and 2.35 kcal/mol, respectively. In addition, the distribution anharmonicity of ΔV of frames clustered in each bin of the 2D PMF is smaller than 0.10 in all low-energy regions (Figure F), suggesting that reweighting using second order cumulant expansion is a reasonable approximation. Indeed, the reweighted 2D PMF profile obtained from three 30 ns GaMD trajectories (Figure E) is very similar to that obtained from 1000 ns cMD (Figure S2D), but not for the 30 ns cMD simulations (Figure S2C). Therefore, GaMD accurately samples the free energy profiles of alanine dipeptide within much shorter simulation time compared with cMD, demonstrating the efficiency and accuracy of GaMD for the biomolecular free energy calculations.

Folding of Chignolin

Started from an extended conformation of chignolin, simulations using GaMD implemented in NAMD 2.11 were able to capture complete folding of the protein into its native structure within 300 ns. The RMSD obtained between the simulation-folded chignolin and NMR experimental native structure (PDB: 1UAO) reaches a minimum of 0.2 Å (Figure A). The system boost potential applied in the GaMD simulations followed Gaussian distribution with the anharmonicity equal to 9.66 × 10–3 (Figure B). The average and standard deviation of the boost potential are 11.2 and 2.8 kcal/mol, respectively. During the three independent 300 ns GaMD simulations, chignolin folded into the native conformational state with RMSD < 2 Å and unfolded repeatedly in two of the simulations. It remained in the folded state after rapid folding within ∼20 ns in the third simulation (Figure C). Upon folding, the chignolin showed decrease of the radius of gyration, Rg, to 4.2 Å (Figure D). The average folding time of chignolin obtained from the GaMD simulation is ∼28 ns, which is significantly shorter than the 600 ns folding time obtained from the previous long-time scale cMD simulations[26] (i.e., ∼30 times speedup).

Figure 2

Folding of chignolin captured in the GaMD simulations: (A) comparison of simulation-folded chignolin (blue) with the PDB (1UAO) native structure (red) that exhibits 0.2 Å RMSD, (B) distribution of the boost potential ΔV with anharmonicity equal to 9.66 × 10–3, (C) the RMSD of chignolin between simulation snapshots and the PDB native structure, and (D) the radius of gyration, Rg, of chignolin calculated from the three independent 300 ns GaMD simulations. (E) 2D (RMSD, Rg) PMF calculated by reweighting the three 300 ns GaMD simulations combined. (F) Distribution anharmonicity of ΔV of frames found in each bin of the PMF profile. Based on Gaussian distribution of the boost potential, cumulant expansion to the second order was applied to reweight the three 300 ns GaMD simulations of chignolin combined. A 2D PMF profile was calculated for the protein RMSD relative to the PDB native structure and the radius of gyration, (RMSD, Rg) as shown in Figure E. The reweighted PMF allowed us to identify the folded (“F”) and intermediate (“I”) conformational states, which correspond to the global energy minimum at (1.0 and 4.0 Å) and a low-energy well centered at (4.5 and 5.5 Å), respectively. Figure F plots the distribution anharmonicity of ΔV for frames found in each bin of the 2D PMF as shown in Figure E. The anharmonicity exhibits values smaller than 0.05 in the simulation sampled conformational space, suggesting that the boost potential follows Gaussian distribution for proper reweighting using cumulant expansion to the second order. In summary, GaMD enables efficient enhanced sampling and free energy calculations of protein folding as demonstrated on the chignolin.

Ligand Binding to a Muscarinic G Protein-Coupled Receptor

Finally, the GaMD implemented in NAMD 2.11 was demonstrated on binding of the ACh endogenous agonist to the M3 muscarinic GPCR (Figure A). The M3 muscarinic receptor is widely expressed in human tissues and a key seven-transmembrane (TM) GPCR that has been targeted for treating various human diseases, including cancer,[27] diabetes,[28] and obesity.[29] Three independent GaMD simulations were performed on the M3 receptor, one for 400 ns and another two for 300 ns. As shown in Figure B, the system boost potential follows Gaussian distribution with anharmonicity equal to 1.33 × 10–2. The average and standard deviation of the boost potential are 10.9 and 3.0 kcal/mol, respectively. Such narrow distribution will ensure accurate reweighting for free energy calculation using cumulant expansion to the second order.

Figure 3

Binding of the acetylcholine (ACh) endogenous agonist to the M3 muscarinic G-protein-coupled receptor simulated via GaMD: (A) Schematic representation of the computational model, in which the receptor is shown in ribbons (orange), lipid in sticks, ions in small spheres, and four ligand molecules in large spheres. (B) Distribution of the boost potential ΔV with anharmonicity equal to 1.33 × 10–2. (C) Probability distribution of the ACh (the N atom in blue dots) diffusing in the bulk solvent and bound to the M3 receptor (orange ribbons), in which the Glide docking pose of ACh is shown in green sticks. (D) RMSD of the diffusing ACh relative to the Glide docking pose calculated from the 400 ns GaMD simulation. (E) Ten lowest energy structural clusters of ACh that are labeled and colored in a green–white–red (GWR) scale according to the PMF values obtained from reweighting of the GaMD simulation. During the 400 ns GaMD simulation of the M3 muscarinic receptor, ACh was observed to enter the receptor and then bind to the receptor endogenous ligand-binding (“orthosteric”) site (Figure C). Highly populated clusters were identified for the ligand in the extracellular vestibule and orthosteric site of the receptor, while the ligand diffuses nearly homogeneously in the bulk solvent. Note that periodic boundary conditions were applied on the simulation system and thus ACh diffused to the cytoplasmic side of the lipid membrane, which may not occur in the real cells. Nonetheless, ACh entered the receptor from only the extracellular side, recapitulating the first step of GPCR-mediated cellular signaling machinery. Figure D plots the RMSD of the four diffusing ACh molecules relative to the ligand binding pose predicted from Glide docking[30] in the orthosteric site. The ACh-3 molecule was observed to bind the extracellular vestibule with ∼10 Å RMSD, dissociate completely from the receptor, rebind to the extracellular vestibule at ∼200 ns and then enter the receptor to the orthosteric pocket at ∼270 ns. It finally rearranged its conformation in the orthosteric pocket, reached a minimum RMSD of 2.0 Å at ∼340 ns, and stayed bound in the orthosteric site until the end of the 400 ns GaMD simulation. Moreover, during the dissociation of the ACh-3, another ligand molecule (ACh-2) bound briefly to the receptor extracellular vestibule during ∼125–180 ns. Similar observations were obtained in the other 300 ns GaMD simulations of the M3 receptor as shown in Figures S3 and S4, during which different ACh molecules were able to bind the extracellular vestibule but could not reach the orthosteric site within the limited simulation time. In order to obtain a quantitative picture of the ligand binding pathway, the DBSCAN algorithm[25] was applied to cluster trajectory snapshots of four diffusing ligand molecules from the 400 ns GaMD simulation. Energetic reweighting[10,11] was then applied to each of the ligand structural clusters to recover the original free energy. Ten structural clusters with the lowest free energies are shown in Figure E. Global free energy minimum (0.0 kcal/mol) was found for cluster “C1” in the orthosteric site. The second lowest energy minimum was identified for cluster “C2” (0.12 kcal/mol) located in the extracellular vestibule formed between ECL2/ECL3. Moreover, cluster “C3” that exhibits a different conformation compared with cluster “C1” and higher free energy (0.33 kcal/mol) was also identified in the orthosteric pocket. In addition to cluster “C2”, clusters “C4” with 0.45 kcal/mol, “C6” with 0.51 kcal/mol, “C8” with 1.23 kcal/mol, and “C10” with 1.96 kcal/mol were also identified in the extracellular vestibule, in which the positively charged N atom of the ligand interacts with residue Trp5257.35 through cation−π interactions. The residue superscripts denote the Ballesteros–Weinstein (BW) numbering of GPCRs.[31] Three clusters of higher free energies, “C5” with 0.50 kcal/mol, “C7” with 0.94 kcal/mol, and “C9” with 1.50 kcal/mol, appear to connect “C1” in the orthosteric pocket and “C2” in the extracellular vestibule. Therefore, structural clusters “C1”, “C3” ↔ “C7”, “C5”, “C9” ↔ “C2”, “C4”, “C6”, “C8”, and “C10” appear to represent an energetically preferred pathway for the endogenous agonist binding to the M3 muscarinic receptor.

Discussion

By adding a harmonic boost potential to the potential energy surface, GaMD provides both unconstrained enhanced sampling and free energy calculation of biomolecules. Important statistical properties of the system potential, such as the average, maximum, minimum, and standard deviation values, are used to calculate the simulation acceleration parameters, particularly the threshold energy E and force constant k for applying the boost potential. In this study, we have implemented GaMD in the NAMD software version 2.11. GaMD computes the potential boost according to statistics of the system potential collected during the cMD and equilibration stages. It is worth noting that the statistics collection in GaMD-NAMD slightly differs from the previous version of GaMD implemented in AMBER.[11] In the AMBER version, the average and standard deviation of potential energies are calculated in every “ntave” steps, a native function available in the AMBER program.[32] In the NAMD version of GaMD, the average and standard deviation of the potential are calculated using the potential values collected up to the current step (see details in Appendix A).[33] The two variables are updated every step until the end of the cMD and equilibration stages. Moreover, similar to the implementation of conventional aMD in NAMD, the LJ correction term is not included in the total potential energy calculation in GaMD. With the implementation of GaMD in NAMD, we have demonstrated the code on three model systems: the alanine dipeptide, chignolin (a globular protein) and the M3 muscarinic GPCR (a membrane protein). For the alanine dipeptide biomolecular model system, short GaMD simulations performed for only 30 ns were able to reproduce highly accurate free energy profiles of the backbone dihedrals that may need as long as 1000 ns cMD simulation to converge. The free energy errors were almost negligible except the elevated free energy well of Φ near 50° by ∼0.5 kcal/mol and slight fluctuations in the energy barriers (particularly Φ at 0° and Ψ at −120°; Figure ). In contrast, cMD simulations lasting 30 ns hardly sample these free energy barriers and exhibit poor convergence. Therefore, GaMD-NAMD greatly accelerates the conformational sampling and accurate free energy calculation of the alanine dipeptide. For the chignolin, the GaMD-NAMD simulations were able to fold the protein rapidly. In two of the three 300 ns GaMD simulations, chignolin undergoes both folding and unfolding repeatedly (Figure C). Compared with the average folding time obtained from long-time scale cMD simulations (600 ns),[26] GaMD folds the protein within ∼28 ns, i.e., ∼30 times faster. Unlike the previous GaMD-AMBER simulations,[11] the fully unfolded state of chignolin does not appear as a low-energy well in the reweighted free energy profile obtained from the present GaMD-NAMD simulations. This behavior will be subject to further investigation in future GaMD studies. Nonetheless, in addition to sampling the folded state in the global free energy minimum, the GaMD-NAMD simulations also captured the intermediate state during the folding of the protein. This is consistent with the previous long-time scale cMD[26] and aMD[5b] simulations. Finally, we have demonstrated GaMD-NAMD on ligand binding to the M3 muscarinic GPCR as a model membrane protein system. While the ACh endogenous agonist binds only transiently to the receptor extracellular vestibule in two 300 ns GaMD simulations, the ligand enters the receptor and binds to the target orthosteric site in a 400 ns GaMD simulation. Although in principle multiple binding and unbinding events may be needed in order to compute converged ligand binding free energy, structural clustering and reweighting of the GaMD simulation allows us to identify energetically preferred binding sites and pathway of the diffusing ligand. Particularly, the lowest energy cluster of ACh is identified in the orthosteric site, in excellent agreement with the Glide docking pose. The second lowest energy cluster is located in the extracellular vestibule, with the positively charged N atom of ACh forming cation−π interaction with the receptor residue Trp5257.35. This is consistent with previous extensive experimental and computational studies that the extracellular vestibule of class A GPCRs acts as a metastable intermediate site during binding of orthosteric ligands.[6,15] The energetically preferred pathway of agonist binding to the M3 receptor identified from the current GaMD-NAMD simulation is similar to that found in previous long-time scale cMD,[34] aMD,[6] and GaMD simulations[35] of class A GPCRs. GaMD is thus promising to predict low-energy conformations of ligand binding and serve as a useful tool in structural biology and drug discovery. Moreover, GaMD mainly accelerates transitions across enthalpic energy barriers because it is based on boosting the system potential energy surface. However, GaMD can be potentially combined with the parallel tempering[36] and replica exchange[37] algorithms for further enhanced sampling over entropic barriers as discussed earlier.[11] In summary, we have implemented GaMD in NAMD 2.11 that shows excellent scalability for supercomputer simulations of large biomolecules.[38] The updated source code files of NAMD 2.11 for implementing GaMD are now publicly available at http://gamd.ucsd.edu. The implementation shall be released in the upcoming version 2.12 of NAMD. It is complementary to the original implementation of GaMD in the graphics processing unit (GPU) version of the AMBER software[11] that runs extremely fast simulations with one or a small number of GPU cards.[17c,32] As demonstrated on selected model systems, results of the current work shall facilitate the applications of GaMD in enhanced sampling and free energy calculations of a wide range of large biomolecules, such as proteins, lipid membrane, nucleic acids, virus particles, and cellular structures.

40 in total

1. Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening.

Authors: Thomas A Halgren; Robert B Murphy; Richard A Friesner; Hege S Beard; Leah L Frye; W Thomas Pollard; Jay L Banks
Journal: J Med Chem Date: 2004-03-25 Impact factor: 7.446

2. Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy.

Authors: Richard A Friesner; Jay L Banks; Robert B Murphy; Thomas A Halgren; Jasna J Klicic; Daniel T Mainz; Matthew P Repasky; Eric H Knoll; Mee Shelley; Jason K Perry; David E Shaw; Perry Francis; Peter S Shenkin
Journal: J Med Chem Date: 2004-03-25 Impact factor: 7.446

3. How fast-folding proteins fold.

Authors: Kresten Lindorff-Larsen; Stefano Piana; Ron O Dror; David E Shaw
Journal: Science Date: 2011-10-28 Impact factor: 47.728

Review 4. Studying functional dynamics in bio-molecules using accelerated molecular dynamics.

Authors: Phineus R L Markwick; J Andrew McCammon
Journal: Phys Chem Chem Phys Date: 2011-10-21 Impact factor: 3.676

5. Activation and dynamic network of the M2 muscarinic receptor.

Authors: Yinglong Miao; Sara E Nichols; Paul M Gasper; Vincent T Metzger; J Andrew McCammon
Journal: Proc Natl Acad Sci U S A Date: 2013-06-18 Impact factor: 11.205

6. Structure and dynamics of the M3 muscarinic acetylcholine receptor.

Authors: Andrew C Kruse; Jianxin Hu; Albert C Pan; Daniel H Arlow; Daniel M Rosenbaum; Erica Rosemond; Hillary F Green; Tong Liu; Pil Seok Chae; Ron O Dror; David E Shaw; William I Weis; Jürgen Wess; Brian K Kobilka
Journal: Nature Date: 2012-02-22 Impact factor: 49.962

7. Large-scale conformational changes of Trypanosoma cruzi proline racemase predicted by accelerated molecular dynamics simulation.

Authors: César Augusto F de Oliveira; Barry J Grant; Michelle Zhou; J Andrew McCammon
Journal: PLoS Comput Biol Date: 2011-10-13 Impact factor: 4.475

8. Free energy landscape of G-protein coupled receptors, explored by accelerated molecular dynamics.

Authors: Yinglong Miao; Sara E Nichols; J Andrew McCammon
Journal: Phys Chem Chem Phys Date: 2014-01-21 Impact factor: 3.676

9. E9-Im9 colicin DNase-immunity protein biomolecular association in water: a multiple-copy and accelerated molecular dynamics simulation study.

Authors: Riccardo Baron; Sergio E Wong; Cesar A F de Oliveira; J Andrew McCammon
Journal: J Phys Chem B Date: 2008-12-25 Impact factor: 2.991

10. Gaussian Accelerated Molecular Dynamics: Unconstrained Enhanced Sampling and Free Energy Calculation.

Authors: Yinglong Miao; Victoria A Feher; J Andrew McCammon
Journal: J Chem Theory Comput Date: 2015-07-14 Impact factor: 6.006

44 in total

1. Peptide Gaussian accelerated molecular dynamics (Pep-GaMD): Enhanced sampling and free energy and kinetics calculations of peptide binding.

Authors: Jinan Wang; Yinglong Miao
Journal: J Chem Phys Date: 2020-10-21 Impact factor: 3.488

2. Scalable molecular dynamics on CPU and GPU architectures with NAMD.

Authors: James C Phillips; David J Hardy; Julio D C Maia; John E Stone; João V Ribeiro; Rafael C Bernardi; Ronak Buch; Giacomo Fiorin; Jérôme Hénin; Wei Jiang; Ryan McGreevy; Marcelo C R Melo; Brian K Radak; Robert D Skeel; Abhishek Singharoy; Yi Wang; Benoît Roux; Aleksei Aksimentiev; Zaida Luthey-Schulten; Laxmikant V Kalé; Klaus Schulten; Christophe Chipot; Emad Tajkhorshid
Journal: J Chem Phys Date: 2020-07-28 Impact factor: 3.488

3. Ligand Binding Pathways and Conformational Transitions of the HIV Protease.

Authors: Yinglong Miao; Yu-Ming M Huang; Ross C Walker; J Andrew McCammon; Chia-En A Chang
Journal: Biochemistry Date: 2018-02-15 Impact factor: 3.162

4. Gaussian Accelerated Molecular Dynamics: Theory, Implementation, and Applications.

Authors: Yinglong Miao; J Andrew McCammon
Journal: Annu Rep Comput Chem Date: 2017-08-10

5. Pathway and mechanism of drug binding to chemokine receptors revealed by accelerated molecular simulations.

Authors: Shristi Pawnikar; Yinglong Miao
Journal: Future Med Chem Date: 2020-06-09 Impact factor: 3.808

6. Probing the Molecular Mechanism of Rifampin Resistance Caused by the Point Mutations S456L and D441V on Mycobacterium tuberculosis RNA Polymerase through Gaussian Accelerated Molecular Dynamics Simulation.

Authors: Qianqian Zhang; Shuoyan Tan; Tong Xiao; Hongli Liu; Syed Jawad Ali Shah; Huanxiang Liu
Journal: Antimicrob Agents Chemother Date: 2020-06-23 Impact factor: 5.191

7. Dynamics and Location of the Allosteric Midazolam Site in Cytochrome P4503A4 in Lipid Nanodiscs.

Authors: Michelle Redhair; John C Hackett; Robert D Pelletier; William M Atkins
Journal: Biochemistry Date: 2020-01-27 Impact factor: 3.162

8. Identification of SLAC1 anion channel residues required for CO₂/bicarbonate sensing and regulation of stomatal movements.

Authors: Jingbo Zhang; Nuo Wang; Yinglong Miao; Felix Hauser; J Andrew McCammon; Wouter-Jan Rappel; Julian I Schroeder
Journal: Proc Natl Acad Sci U S A Date: 2018-10-09 Impact factor: 11.205

Review 9. Recent advances in computational studies of GPCR-G protein interactions.

Authors: Jinan Wang; Yinglong Miao
Journal: Adv Protein Chem Struct Biol Date: 2019-01-03 Impact factor: 3.507

10. Mechanistic Insights into Specific G Protein Interactions with Adenosine Receptors.

Authors: Jinan Wang; Yinglong Miao
Journal: J Phys Chem B Date: 2019-07-22 Impact factor: 2.991