Literature DB >> 27807980

Convergence and Sampling in Determining Free Energy Landscapes for Membrane Protein Association.

Jan Domański^1,2, George Hedger¹, Robert B Best², Phillip J Stansfeld¹, Mark S P Sansom¹.

Abstract

Potential of mean force (PMF) calculations are used to characterize the free energy landscape of protein-lipid and protein-protein association within membranes. Coarse-grained simulations allow binding free energies to be determined with reasonable statistical error. This accuracy relies on defining a good collective variable to describe the binding and unbinding transitions, and upon criteria for assessing the convergence of the simulation toward representative equilibrium sampling. As examples, we calculate protein-lipid binding PMFs for ANT/cardiolipin and Kir2.2/PIP2, using umbrella sampling on a distance coordinate. These highlight the importance of replica exchange between windows for convergence. The use of two independent sets of simulations, initiated from bound and unbound states, provide strong evidence for simulation convergence. For a model protein-protein interaction within a membrane, center-of-mass distance is shown to be a poor collective variable for describing transmembrane helix-helix dimerization. Instead, we employ an alternative intermolecular distance matrix RMS (DRMS) coordinate to obtain converged PMFs for the association of the glycophorin transmembrane domain. While the coarse-grained force field gives a reasonable Kd for dimerization, the majority of the bound population is revealed to be in a near-native conformation. Thus, the combination of a refined reaction coordinate with improved sampling reveals previously unnoticed complexities of the dimerization free energy landscape. We propose the use of replica-exchange umbrella sampling starting from different initial conditions as a robust approach for calculation of the binding energies in membrane simulations.

Entities: Chemical

Mesh：

Substances：

Year: 2016 PMID： 27807980 PMCID： PMC5402295 DOI： 10.1021/acs.jpcb.6b08445

Source DB: PubMed Journal: J Phys Chem B ISSN： 1520-5207 Impact factor: 3.466

Introduction

Biological membranes are a complex mixture of multiple species of lipids and proteins, the interactions of which play a key role in cell function. The balance of interactions between different components within the membrane also determines its longer range (nano to microscale) structure. Of particular interest are the intramembrane interactions of proteins with lipids[1] and adjacent proteins, the latter playing key roles in membrane protein folding[2] and function.[3] Molecular simulations have the potential to provide a detailed and quantitative understanding of protein and lipid interactions in membranes. In particular, simulations permit the computation of free energy landscapes or potentials of mean force (PMF) of interactions between the different membrane components (e.g., refs (4) and (5)). By enabling binding free energies to be computed, PMFs can also provide mechanistic insights into biologically relevant intermolecular interactions, which in turn can be compared to available experimental measurements. However, determining free energies for molecules embedded in lipid membranes is particularly challenging. For example, the lateral diffusion coefficient of common lipids in lipid bilayers is two to 3 orders of magnitude smaller than that of water.[6] Therefore, sampling methods based on molecular dynamics will be impeded by the resulting slow diffusion of any molecule embedded in the membrane, a situation which is exacerbated in complex and crowded membrane environments.[7] To calculate the binding free energy surface between two species, a sufficiently representative sample of the relevant configurations of the system have to be visited such that the relative weights of different free energy basins can to be established. Determining such a sample, and when it is sufficient (or “converged”), form the central task of any method for computing free energies. While incompletely converged simulations may provide insights into the location of stable states on a free energy surface, it is still possible that the relative free energies of different states on that surface may even be qualitatively incorrect. These challenges are starting to be recognized in simulations of membrane-systems.[8] Therefore, for quantitative interpretation, and for comparison with experiment, it is essential to obtain converged and therefore reproducible results. Enhanced sampling methods have been widely used to address the sampling challenge in membrane simulations. Here we focus in particular on those based on a reaction coordinate, or collective variable (CV), which include, e.g., umbrella sampling,[9] metadynamics,[10] and adaptive bias force.[11] Replica exchange (RE) umbrella sampling (US) is particularly advantageous due to straightforward setup and faster convergence (relative to standard umbrella sampling) due to exchanges between replicas.[12] This sampling method has been applied to calculate protein–lipid and protein–protein binding free energies in both all-atom (AA) and coarse-grained (CG) simulations.[13−15] Umbrella sampling requires a collective variable (CV) that defines the binding/unbinding transition of the interaction. The most commonly used CV is a center-of-mass distance between two species (protein–lipid or protein–protein) within the membrane, which is used to define the umbrella windows. The initial configurations of the simulation windows are generally selected by translation along this reaction coordinate. Discarding an initial fraction of a simulation sufficient to allow for equilibration, a reweighting can be performed, using tools such as WHAM[16,17] or, equivalently, MBAR,[18] recovering the unbiased potential of mean force after subtracting the effect of the umbrella potentials. It is often assumed that if this PMF fails to change with increased simulation time, then the PMF calculation is converged. Of course, this is a necessary but insufficient test when calculating PMFs for complex membrane systems, as we will illustrate. Here we present three examples of typical membrane PMF calculations, two for protein–lipid (one simple, one more complex) and one for protein–protein interactions within a lipid bilayer (Figure ). We use the coarse-grained (CG) Martini force field which has been employed in a number of such studies recently (recently reviewed in ref (19)). Our examples are the mitochondrial transport protein ANT binding to cardiolipin (CL),[20] the potassium channel Kir 2.2 binding to phosphatidyl inositol 4,5-bisphosphate (PIP2),[21] and dimerization of the transmembrane (TM) helix of glycophorin A (GpA),[22] the latter being the prototypical system for studying protein–protein interactions within a membrane. We study the dimerization of GpA and its mutants, using a new collective variable, based on the intermolecular contact matrix in the GpA dimer, to obtain a well-converged PMF. We provide strong evidence of convergence in both cases: starting the simulations with all replicas in a bound or all in an unbound initial configuration has no effect on the result. The importance of exchanges between windows is also highlighted, with absence of exchanges resulting in slower convergence. We compare our simulation results with experimental data, allowing us to explore limitations of the current CG force field.

Figure 1

(A) Possible choices for initialization of umbrella sampling simulations: “default” (cyan) corresponds to replicas starting close to the bias center for each window; “bound” (blue) corresponds to all the replicas close to one end of the collective variable (CV) range; and “unbound” (red) corresponds to all replicas close to the other end of the CV range. (B) The three simulation systems explored in this article: ANT, the adenine nucleotide transporter (in blue) interacting with a cardiolipin (CL; red) molecules; Kir2.2, an inward rectifier potassium channel (gray) with a subunit (blue) interacting with a PIP2 (red) molecule; and GpA, the glycophorin TM transmembrane (TM) helix dimer, with the two TM helices (i.e., monomers) in red and blue. In each case the lipid bilayer is indicated by its phosphate particles (in green) with the remainder of the lipid molecules and the water on either side of the bilayer are omitted for clarity. For each system bound and unbound system snapshots are shown, in the upper and lower panel, respectively.

Methods

Molecular Simulation Methods

Molecular dynamics was performed in GROMACS (version 4.6.7; www.gromacs.org),[23] using a time step of 20 fs. Semi-isotropic Berendsen pressure coupling[24] was used for equilibration with coupling constant of 12 ps and Parrinello–Rahman[25] coupling with coupling constant of 12 ps for production simulations with a reference pressure of 1 bar in both XY and Z directions. Similarly, a Berendsen thermostat was used for equilibration with coupling constant of 1 ps and a stochastic velocity rescaling temperature control[26] for production with characteristic time of 1 ps, keeping the temperature constant around 310 K. A cutoff of 1.2 nm was used for Lennard-Jones (LJ) interactions, with the potential switched off smoothly between 0.9 and 1.2 nm. Coulombic interactions were calculated using the Particle Mesh Ewald (PME) method,[27] with a grid spacing of 0.12 nm and a real-space cutoff of 1.2 nm. These settings were consistently used in the ANT/CL, KIK/PIP2, and glycophorin simulations.

Force Field

The CG Martini force field was used for the simulations: for ANT/CL and GpA, version 2.1 of Martini was used[28−30] with a 4-tail bead POPC model. For Kir2.2/PIP2 version 2.2 was used[31] with the older 5-tail bead POPC model.[32] PIP2 lipid parameters taken from ref (33) and cardiolipin parameters were obtained from ref (34).

Umbrella Sampling

The PLUMED2 package[35,36] (2.2-hrex, compiled from https://github.com/GiovanniBussi/plumed2/tree/v2.2-hrex) was used to patch GROMACS 4.6.7 and define the collective variables and perform the biasing. Replica exchanges are evaluated using the Boltzmann criterion.where the probability of exchange between two replica windows 1 and 2 with umbrella potentials U1 and U2 is dependent on the difference Δ between the energies of the protein configurations X1 and X2 (those in windows 1 and 2 before exchange) in these potentials before and after the trial exchange (, where kB is Boltzmann’s constant and T is the absolute temperature). Exchanges are determined according to a Metropolis criterion, always being accepted if e–Δ ≥ 1 and if e–Δ < 1 being accepted if a random number r drawn from a uniform distribution on [0,1) is less than e–Δ. In this work, we have applied the final umbrella potentials to each window. In some cases, this may generate large forces due to the system being out of equilibrium. This can readily be overcome by including an appropriate initialization protocol (e.g., equilibrating the umbrella closest to the initial condition first, then using this to equilibrate the next umbrella and so on). If the landscape is rougher (as in an all atom simulation), it will be harder to converge the two initial conditions, although “slow growth” methods, e.g., ref (37), may be helpful in the situation. Even if convergence is not achieved, the comparison will expose this fact, which otherwise would be overlooked.

Analysis and Visualization

MDAnalysis[38] was used to automate plumed input generation and GromacsWrapper[39] was used for topology file manipulation. VMD[40] was used to produce the molecular renderings.

ANT/Cardiolipin

The system was setup as described in ref (20) with a simple POPC bilayer assembled around the position restrained protein (with force constant of 1000 kJ/mol/nm2). The final system contained 279 POPC molecules, a single cardiolipin molecule in the upper leaflet, and over 4200 coarse-grained water particles. Randomly selected water particles were converted into sodium and chloride ions to neutralize the system and mimic the experimentally used buffer. Initial configurations for the umbrella sampling windows were generated via a steered MD (SMD) molecular dynamics simulation. A collective variable (CV) was defined to permit sampling along a path orthogonal to the binding site surface (Figure A). This CV corresponded to the distance between the CG glycerol particle (GL0) of cardiolipin and the backbone bead of the Proline in the conserved Px[D/E]xx[K/R] motif, found on the odd-numbered TM helices and present in all members of this protein family (Figure A). We have chosen to sample along this linear coordinate because it is most relevant to binding to the known native binding sites, whereas using a radial coordinate would also average over non-native binding. In addition, a linear coordinate greatly facilitates sampling. To prevent the protein from rotating within the bilayer, the 3 Proline backbone particles of the three equivalent sites on ANT were position restrained (400 kJ/mol/nm2 on the X and Y dimensions). To keep the CL lipid molecule from diffusing away from the line coordinate, an additional harmonic restraint is defined with a force constant of 1000 kJ/mol/nm2, keeping the Y-component of the CV close to a fixed value. Umbrella sampling was set up with 32 windows, spaced linearly between 1.6 and 4.5 nm on the cardolipin headgroup–protein center-of-mass distance, with a force constant of 1000 kJ/mol/nm2. Exchanges were attempted every 10000 steps.

Figure 2

Cardiolipin binding to ANT, providing an example of calculating a PMF for a lipid interacting with a specific binding site on an integral membrane protein. (A) Representative structures of the closest (protein lipid COM distance =1.5 nm) and furthest (4.5 nm) replicas in the cardiolipin-ANT binding simulations. (B) PMFs starting from 3 different initial configurations (default in cyan, unbound in orange, bound in blue; black crosses indicate the final PMF for comparison; see Figure for definitions), computed after 3, 5, and 7 μs of the simulation (within each case the first half of the simulation discarded as equilibration). Thus, the final PMF contains data pooled from 2 replica exchange simulations, corresponding to the starting bound and unbound initial conditions. (C) Root mean square difference between the two PMFs started from “all bound” and “all unbound” configurations as a function of time. (D) PMF RMSD to the final PMF of simulations started from the default initial condition and either allowing for replica exchange (labeled replex) or without exchanges (labeled no replex).

Kir2.2

For computational efficiency, the wild-type Kir structure PDB 3SPI(21) was truncated in PyMol,[41] removing residues before Asp-61 and after Thr-191, preserving Arg-186, which coordinates the 5′ phosphate in this structure. This residue was then mutated to Ala in the mutant system. The martinize script was used to obtain a coarse-grained topology in each case and an elastic network was generated to stabilized the protein structure in simulation. The protein was inserted into a POPC bilayer, solvated, and equilibrated using the MemProtMD pipeline.[42] Position restraints with a force constant of 1000 kJ/mol/nm2 were used in equilibration to keep the protein from moving. The short-tail PIP2 molecules from the 3SPI crystal structure were used as the starting conformation for the bound conformation of PIP2. Sodium and chloride ions were added to 0.15 M to mimic the buffer generally used experimentally. The final system contained over 12 000 particles with 4 PIP2 lipids, 360 POPC lipids, and over 6000 water particles. The collective variable was chosen to be the distance between the PIP2 headgroup and the center of mass of the nearest Kir2.2 chain. Sixteen umbrella sampling windows were setup, with the CV linearly spaced between 2.6 and 4.5 nm with a force constant of 500 kJ/mol/nm2 (Figure A). Exchanges between replicas were attempted every 10 000 steps. The initial structure for all windows of the “start bound” simulation was 3SPI. After 50 ns of umbrella sampling, a frame from the last window was used to initialize all windows of the “start unbound” simulation. The distance is roughly parallel to the X-axis of the simulation box and so an additional bias is applied to prevent the lipid from diffusing in Y, with a force constant of 100 kJ/mol/nm2. To prevent the Kir from rotating in the bilayer, a position restraint was applied to the backbone particles of the protein, with a force constant of 100 kJ/mol/nm2 on X- and Y-coordinates.

Figure 3

PIP2 binding to Kir2.2, providing an example of a PMF for a charged lipid interacting with a specific binding site on an integral membrane protein. (A) Representative structures of the PIP2 bound (PIP–Kir2.2 distance 2.7 nm) and PIP2 unbound (3.4 nm) replicas in the PIP2–Kir2.2 binding simulations. (B,C) PMFs along a CV defined by the PIP2 headgroup distance to the Kir monomer center of mass. PMFs are shown starting from two initial configurations: with lipid bound in all replicas (blue) or unbound (orange) PMFs are shown for (B) the wild-type protein and (C) a mutant protein (R186A) with reduced PIP2-binding affinity mutant. Instead of using an initial steered MD simulation to generate the unbound configurations (as was done for the ANT simulations) we relied on a short umbrella sampling run starting from bound with the umbrella centered far away from the protein. While this resulted in large initial forces, these quickly taper off as the lipid moves closer to the target window distance. Because the coarse-grained energy landscape is relatively smooth this has no adverse effect.

GpA

The solution NMR structure of GpA in a detergent micelle (PDB id 1AFO(22)) was used as a starting structure for the simulation. The protein was inserted into a POPC bilayer using g_membed.[43] The system contained 125 POPC lipids and over 1400 coarse-grained water particles. Umbrella sampling simulations were setup with 16 windows, spaced linearly between DRMS = 0 and 2.5 nm relative to the native, with a force constant of 100 kJ/mol/nm2. Exchanges were attempted every 1000 steps. All the windows of start bound simulations were initialized from the 1AFO structure. After 50 ns, frames from each window were used to initialize the “start default” simulation, and the last window frame was used to initialize the “start unbound” simulation. This is similar to what was done in Kir2.2 simulations, as described in the previous section.

Definition of DRMS

Distance root-mean-square displacement DRMS between two configurations XA and XB is defined below.where d(xia,xja) is the distance between the coordinates xi and xj of atoms i and j in configuration XA and the sum runs over all atom pairs (i,j) in a specified contact list. Because all-to-all distances have to be considered this becomes computationally expensive with increasing number of particles. Therefore, pairs of atoms are only included in the contact list if those atoms are within some cut off distance in the “native” bound state. Here we use a lower and upper cut offs of 0.1 and 0.6 nm, respectively. We then further restrict the atoms used in the calculating of the collective variable by taking only pairs involving one atom from each molecule (for glycophorin this becomes interhelical DRMS). Only the transmembrane region between Glu-72 and Ile-95 was used to calculate the interhelical DRMS (both backbone and side-chain beads were included).

Calculation of the Dimerization Kd from the Potential of Mean Force

The dissociation constant (Kd) for the GpA dimer can be defined aswhere [A] is monomer concentration (per lipid area) when the system is in monomers and [AA] is the dimer concentration when the helices are dimerized and y is the fraction of time it is bound. Eq can be derived by requiring either (i) that the time- or ensemble-averaged chemical potential of the monomer state and dimer state should be equal at equilibrium (and neglecting pressure contributions to free energy) or (ii) that the time-averaged forward and reverse fluxes for dimerization should be equal at equilibrium. Since in this case, , where σ is the lipid area (defined as the circle whose radius (L) is the mean helix–helix distance in the last window of the umbrella sampling simulation), A reference concentration of 1 molecule per nm2 is used. The standard free energy of dissociation is then ΔG0 = −RTln Kd. An analogous expression was derived by De Jong et al.[44] (see also[45]) for heterodimers. To determine the fraction bound (y), one has to integrate the PMF using some suitable dividing surface.where β = 1/(kBT), kB is Boltzmann’s constant, and T is the absolute temperature. dc is the dividing value of DRMS separating bound and unbound state, chosen as 2.0 nm, and where the free energy value starts decreasing toward bound; the value of y is not highly sensitive to this choice.

Results and Discussion

Protein–Lipid Interactions: A Biologically Relevant Lipid

Protein–lipid interactions play a key role in membrane protein stability and function[1] and so it is important to be able to characterize the strength of such interactions via PMF calculations. A well-defined interaction between a lipid molecule and a protein is provided by that between cardiolipin (CL) and ANT (Figure B). The ADP/ATP carrier (AAC1/ANT) is located in the inner membrane of the mitochondria. It possesses a 3-fold pseudosymmetric structure with 3 CL binding sites which have previously been characterized structurally.[46,47] From a sampling perspective, this is an interesting system because CL is an anionic 4-tailed lipid, and thus might be anticipated to be more challenging to sample in terms of its interactions with a membrane protein than a neutral 2-tailed lipid. A collective variable (CV) was defined to permit sampling along a path orthogonal to the binding site surface (Figure A; see Methods Section for details). Two independent sets of simulations were performed, with the CL molecule either initially bound (d = 1.7 nm) to site 1 on ANT, or unbound (d = 4.5 nm). The bound configuration was the most representative structure of an ANT/CL complex from a long equilibrium, unbiased MD simulation in which the cardiolipin spontaneously bound to the site, which corresponds closely to the crystallographically observed configuration of CL at this site.[20] An additional independent simulation was prepared in which the CL molecules were initially positioned with uniform spacing along the umbrella coordinate (see Figure A for a schematic representation of these three initial starting conditions). The unbound configurations were generated using accelerated MD, starting from the bound configuration, and applying a force pushing the CL out and away from its binding site. The umbrella sampling simulation was unbiased using WHAM,[16,17] yielding a PMF with 2 distinct minima, at d = 1.5 nm and d = 2.0 nm, with well-depths of 20 and 10 kJ/mol, respectively (Figure B). The PMFs are well-converged, within 1 kJ/mol of each other after 7 μs. The progress of convergence can be monitored via calculating the root-mean-square difference between a PMF derived from simulations started with all replicas bound and a PMF starting from all unbound. This PMF RMSD becomes <1 kJ/mol after 5 μs (Figure C, upper plot). Up to this difference, the PMF is thus insensitive to the initial conditions. An overlap of distance histograms from the different umbrella windows is shown in SI Figure 1A. Multiple binding and unbinding transitions are observed in the continuous trajectories obtained by following the replicas through exchanges (SI Figure 1B). Convergence is aided by allowing for exchange between replicas, which is attempted every 10000 steps and accepted via an analogous criterion to that for temperature replica exchange.[12] In the absence of such exchanges, the PMF RMSD remains ∼2 kT even after 7 μs, with no sign of convergence (Figure C, lower plot). The two minima (A and B in Figure B) correspond to configurations of ANT with bound or partially bound CL, respectively. Representative frames show the headgroup particles of CL bound tightly into the ANT in minimum A, while in minimum B the phospholipid headgroup is about 0.5 nm further away from the binding pocket surface. While minimum A is considerably deeper than B, it can only be accessed by crossing a ∼ 5 kJ/mol barrier. Assessing the relative depth of these two minima would be very difficult using standard unbiased MD simulation, and presents a sampling challenge to the replica exchange umbrella sampling approach. We are therefore confident that, from the umbrella sampling with replica exchange, we have a well converged PMF for the interaction of a lipid with a membrane protein. We note that the duration of our simulations (5–7 μs) is longer than often used for determining PMFs in a membrane environment (e.g., 1 μs[5]). While we find that the well-depth for the deeper minimum is within 2.5–5 kJ/mol of the final value, had the simulation been run for 1 μs, the rest of the PMF may not be well converged. We also note that a commonly used criterion, the PMF well-depth not changing over time is a necessary but not sufficient to demonstrate that a simulation is converged. This is especially the case in membrane protein/lipid simulations where “memory effects” of the initial conditions are likely to play a key role in determining convergence. Furthermore, it appears that the simulations started from the “bound” state converge much more quickly to the final PMF than those started from “unbound”. This is most likely related to the chosen coordinate not uniquely defining the bound state: thus, applying a force to separate the protein and lipid is effective in producing representative unbound states (since distance is sufficient to define fully unbound states). However, pulling the molecules together can create non-native states at the desired intermolecular separation which evolve more slowly to the correctly bound state as the bias force is not pushing non-native bound states toward the native-like bound or partially bound states. Note that this faster convergence starting from bound would only be expected if the most stable bound state (in the context of the force field) was used to initialize the runs.

Protein–Lipid Interactions: A More Complex Case, Kir/PIP2

PIP2 is a key lipid molecule involved in regulation and/or recognition in many membrane processes,[48] including the activity of ion channels.[49] It has a somewhat more complex polyanionic headgroup than CL. Here we explore the PMF of PIP2 interactions with the TM domain of the Kir2.2 potassium channel. Both structural and functional experimental data are available for this interaction,[49] as well as simulation studies which demonstrate that extended equilibrium CG simulations will permit PIP2 to find its experimentally observed binding site (Figure B), in which the anionic headgroup binds to basic residues on the protein surface close to bilayer/water interface.[33,50] A number of binding site mutants exists, with reduced apparent affinity for PIP2.[51,52] Thus, this system provides both a further opportunity to explore convergence of protein/lipid PMFs, and to explore the sensitivity of such (CG) PMFs to mutations in key lipid-binding side chains. As detailed above, the CV was defined as a distance between the PIP2 headgroup and the center of mass of the Kir2.2 subunit containing the binding site (Figure A). For wild-type (WT) Kir2.2, the potential of mean force shows a single minimum, 45 kJ/mol deep, with the PIP2 bound at the native site observed in the crystal structure (Figure B). Comparison of the bound/unbound profiles demonstrates convergence of the PMF to within an RMSD of 2.5–5 kJ/mol after 4–5 μs umbrella sampling with replica exchange. The depth of the energy minimum, corresponding to interaction of PIP2 with a binding site containing 6 basic side chains, is comparable to that of PIP2 interacting with a juxtamembrane binding site of the EGFR (−42 kJ/mol), a (less structured) binding site which contains 5 basic side chains.[5] The Kir2.2 mutant R186A (which has a reduced apparent affinity for PIP2) has one fewer arginine in the PIP2 binding site, but still can be crystallized with a short-chain PIP2 molecule bound.[21] The PMF shows a rather broad well, with depth of about 32 kJ/mol (Figure C). Representative structures reveal that partially unbound and fully bound configurations of the PIP2 headgroup at the mutant binding site have about the same free energy of interaction (Figure C, inset), which is ca. 70% of that of PIP2 bound to the WT channel. Thus, the PMF suggests that the effect of the R186A mutation may be more complex than simply a reduction in affinity of the site for PIP2. To better understand the role of electrostatics in the Kir2.2–PIP2 interaction, we repeated the calculation with a PIP2 molecule in which the phosphate particle charges had been neutralized (SI Figure 2). This reduces the well-depth by about 2-fold, but interestingly the energy minimum remains at about the same position on the CV.

Protein–Protein Interactions in a Membrane

Protein–protein interactions, and in particular interactions between TM helices within a bilayer, play key roles in folding,[53] stability,[54] and function[3] of membrane proteins. Thus, it is of considerable interest to understand the free energy landscapes of membrane protein–protein interactions. It is also anticipated to be more challenging both in terms of selection of a suitable CV and of establishing convergence than is the case for even protein/lipid interactions in a membrane. One of the best studied examples of protein–protein interactions in a membrane, both experimentally[22,55−62] and computationally[4,45,63−68] is that of dimerization of the glycophorin (GpA) TM domain (Figure B). Both NMR[22] and more recent crystallographic[62] data reveal the GpA TM domain to form a parallel TM helix dimer with an interhelix crossing angle of ca. −20° and a helix/helix interface containing a key glycine-rich sequence motif. Both the wild-type structure and dimerization-disrupting mutants have been characterized via a range of biochemical methods[55−60] and also in silico, the latter both at the coarse-grained[4,66,67] and all-atom[45,68] resolution. It is therefore an ideal system to probe protein–protein interactions within a membrane. A “good” CV describing TM helix dimerization in a membrane should allow the dimer to be separated, and also allow us to readily distinguish native from near-native dimer configurations. We defined a set of intermolecular (i.e., between the TM helix monomers) native distances in the GpA TM dimer structure. Using those distances as a reference we then define a distance root-mean-square displacement (DRSM) relative to the native TM dimer configuration. As previously (see above) three independent sets of simulations were run: one starting with all windows in the bound configuration; one starting with all the windows in unbound configuration; and one with the windows spaced at uniform intervals along the CV. The initial “bound” structure was a CG model derived directly from the PDB 1AFO solution NMR structure of GpA in a detergent micelle.[22] The PMF thus obtained (Figure A) could be shown to be independent of the initial configuration and converges around 6 μs (as judged from the PMF RMSD plot; SI Figure 3) to within 2.5 kJ/mol between the simulations. The dimerization landscape has 4 minima, with a well-depth of about 35 kJ/mol relative to the unbound state. Notably, the native minimum (labeled 1 in Figure A) is not the lowest free energy configuration, but rather a near-native minimum (labeled 2 in Figure A) is more stable. Projecting the PMF from sampling along the DRMS CV onto the helix–helix center of mass distance (D), we find that a non-native dimerized state, with a helix–helix separation of 0.9 nm (versus 0.6 nm for the native) is the free energy minimum (Figure B). Minima 1 and 2 correspond to native and near-native configurations (characterized by helix–helix crossing angles of −25° and −20° compared to −23° and −20° in the 1AFO NMR structure and the 5EH4 crystal structure, respectively). Minima 3 and 4 represent non-native configurations, with the helices more closely parallel to one other (Figure C), and crossing angles of −15° and −5°, respectively. The simulations are well converged, with multiple dimerization and dissociation events (SI Figure 3).

Figure 4

Dimerization of the TM helix of glycophorin A (GpA), providing an example of a protein–protein interaction within a membrane. (A) GpA dimerization showing a PMF along a reaction coordinate (CV) defined by the intermolecular DRMS to the native dimer structure (PDB id 1AFO). Three initial conditions used, which had either all replicas in the native configuration (“bound”, blue), all replicas initially dissociated (“unbound”, orange), or replicas linearly spaced apart (“default”, cyan). Observed 4 energy minima are labeled 1–4, corresponding to DRMS values of about 0.1, 0.3, 0.55, and 0.75 nm, respectively. (B) Reweighted PMF along a CV corresponding to the interhelical distance (the native structure distance is in red at ∼0.6 nm). (C) Representative structures (in gray; wireframe backbone) from each of the four minima labeled in the PMF in (A). The two subunits of the experimental (PDB id 1AFO) are shown in blue and in red. (D) Crossing angle distributions for the structures seen in PMF minima labeled in (A), with crossing angle defined as a torsion angle between flanking backbone atoms of residues number 78–88. This finding was somewhat unexpected. Previous CG PMFs for GpA dimerization, whether by umbrella sampling MD[4] or by a Monte Carlo approach[67] were calculated along helix–helix distance. While the DRMS and D behave similarly at large (>1.5 nm, Figure B, SI Figure 4) separations, at short distances D is unable to discriminate between closely related native and non-native dimer configurations. Thus, D is a relatively poor collective variable since these native, near native, and non-native states are clearly separated by a free energy barrier when projected onto DRMS (Figure A). Projecting our PMF onto a 2D surface defined by the DRMS and D plane confirms that D collapses together a number of states which are separated by DRMS (Figure B). A key question is whether the observed non-native states are in fact experimentally relevant, or simply due to the limited resolution that can be achieved with the coarse-grained model. While the existing experimental structures all favor a very tightly defined native structure,[62] that does not rule out the possibility that there may be an appreciable population of other states at equilibrium (although it seems unlikely). A second method of validating the results is compared with dissociation constants determined from recent steric trap methods, which allow an estimate of the dimer stability in a lipid bilayer.[60] In order to compute Kd from the PMF, we need to define what is meant by the bound state. If we provisionally define all configurations in which the TM helices are separated by a distance of less than 2.3 nm (corresponds to DRMS of ∼2.0 nm) (since the steric trap method would not distinguish native from non-native states if the helices have a native separation), we obtain a Kd of 1.1 × 10–6 molecules per nm2, compared with the experimental estimate of 1.3 × 10–9 molecules nm2,[60] an acceptable difference considering the simplicity of the simulation model, which lacks detailed specific interactions, such as hydrogen bonds. The PMF also shows that the non-native bound states constitute the vast majority of the bound population, which might suggest that they are in fact relevant.

Figure 5

(A) Schematic/cartoon of D vs DRMS as “poor” and “good” CVs. While at far separation both CVs describe the system similarly, at close separation, D collapses the phase space and can correspond to a number of very different configurations, while DRMS is able to distinguish native from near-native helix–helix packing arrangements. (B) 2D PMF along D and DRMS, obtained by reweighing the original 1D umbrella sampling simulation along DRMS. Inset (C; magnified) shows the minima 1–4 which have different DRMS values and essentially the same D. As an alternative, more stringent, experimental test of our free energy landscape, we therefore consider the effects of mutations. The tight packing of GpA TM helices is mediated via a conserved GXXXG motif.[69] Mutations that introduce a bulky side-chain residue in place of one of the Gly residues are known to disrupt dimerization.[63] Thus, both G83L and G83I prevent dimerization.[55,57,58,60] We have simulated the G83L mutant dimer to obtain PMFs as before starting either from bound or unbound configurations; because isoleucine and leucine are represented in an almost identical way in the Martini force field, the results for G83I are expected to be very similar. The Kd computed from these PMFs is indistinguishable from that of wild-type, strongly suggesting that the populated binding modes are not representative of the experimental distribution. We can obtain further insight from the detailed differences between the PMFs. The G83L PMF shows that in fact the most native-like minimum (minimum number 1 in Figure ) is destabilized, albeit by 5 ± 2.5 kJ/mol relative to the WT PMF (Figure A), compared with 19 kJ/mol in experiment.[60] The quantitative discrepancy may be due to the coarse-grained nature of the model. In addition, the range of experimental constructs (based largely on a chimeric protein formed by fusion of the GpA TM domain with staphylococcal nuclease) and environments (detergent micelles vs in vitro lipid bilayers vs bacterial cell membranes) makes quantitative comparisons difficult. Nonetheless, the reproduction of a destabilizing effect is consistent with experiment and intuition, based on the published structures.[22,62] There is also a similar destabilization of the native-like states with lower DRMS to the experimental structure. On the other hand, there is virtually no change in free energy for the minimum between 0.7 and 0.8 nm. This is because residue 83 has a smaller role in the binding interface in this case. Because this minimum (number 4 for the mutant, in which the helix crossing angles are bimodally distributed with peaks at −20 and 20 degrees (SI Figure 5), is the most stable bound species in the mutant, this allows the Kd to remain essentially unchanged in the G83L mutant simulations. Overall these results suggest that, at the least, the population of stable non-native dimer states is too large in the simulation, particularly for those with largest DRMS from native.

Figure 6

Effects of a “disruptive” mutation on the dimerization of the TM helix of GpA. (A) PMF for dimerization of a mutant (G83L) of GpA (compare with the wild type PMF in Figure A). The three energy minima labeled correspond to the structures in Figure C. Two initial conditions were used, which had either all replicas in the native configuration (“bound”, blue), or all replicas initially dissociated (“unbound”, orange). (B) Comparison of the population distributions derived from the wild-type and G83L GpA PMFs. From these one may estimate a ΔΔG of 0 or 4 kJ/mol (with unbound state defined at 2.0 nm or 0.2 DRMS respectively) for destabilization of the native dimer by the G83L mutation. We also explored the robustness of the GpA TM helix dimer PMF to variations in the CG model applied. Thus, the results discussed above used the Martini 2.1 force field; recalculation using the more recent version, Martini 2.2,[31] does not seem to change the PMF appreciably (SI Figure 6). We also explored whether the use of PME vs shifted electrostatics within Martini, and the effect of different restraints on secondary structure (dihedral vs elastic network). The resultant PMFs (SI Figure 7) suggest that the results are relatively robust to the restraints, and that the use of PME leads to some relative stabilization of the native dimer structure. Thus, we are confident we have achieved a well sampled and converged PMF within the limitations of the Martini force field.

Conclusions

In summary, we show how PMF calculations for membrane systems, for both protein–lipid and protein–protein interactions, benefit from stronger tests of convergence than is sometimes the case. Exchanges between umbrella sampling windows speed-up convergence and independent initial conditions provide an easily interpretable condition for convergence. Additionally, care must be taken in selection of suitable CV. For certain systems, such as lipid–protein interactions, a simple center of mass distance appears sufficient to achieve good sampling. However, for protein–protein interactions in a membrane, we implemented a different collective variable, namely an intermolecular distance matrix RMSD. We have illustrated our case with an example of the interaction of an integral membrane protein with cardiolipin, a topic of general importance as the energetics of such interactions have been explored by CG MD for a number of proteins from mitochondrial membranes, e.g., refs (70), (71), (20), and (72). We have extended this analysis to a well-characterized interaction of PIP2 with an inward rectifier potassium channel. This is also of broad relevance given recent evidence for a role on PIP2 interactions in regulating a number of ion channel proteins.[49] In particular, we show that with replica exchange and CG simulations of >5 μs, one can obtain demonstrably converged protein/single lipid PMFs, thus allowing exploration of the effects of binding site mutations and/or changes in lipid species on the strength of lipid/protein interactions. These calculations use the Martini CG force field: it will be of interest in the future to attempt such free energy calculations with an all-atom force field. We anticipate that in this case establishing convergence will be more demanding, as the free energy landscape will likely be more rugged. Using our DRMS collective variable, we find that ΔG for GpA dissociation in Martini 2.1 is in agreement with experiment (values ranging from −24 kJ/mol[73] to −51 kJ/mol[60]). For the native state the well-depth observed is −28 kJ/mol, which compares well with other PMF-based estimates of −30 kJ/mol[67] and −40 kJ/mol[4] from CG simulations, and of −44 kJ/mol[45] and −60 kJ/mol[68] from all-atom simulations. However, while the CG force field can qualitatively reproduce the effect of a dimer-disrupting mutation on the native bound conformation, it does not appreciably alter the dimer stability. The small effect of the mutation on stability is due to the large population of non-native bound states that are not much affected by the mutation. That the CG force field is not quantitatively successful in capturing the effects of such mutations is perhaps not surprising given the role of interactions which are not explicitly present in the CG energy function, e.g., Cα-H···O H-bonds, in stabilizing the GpA native helix dimer[62,74] over non-native bound states. The use of the DRMS reveals a more complex energy landscape than was previously seen for GpA dimerization, such that a near-native dimer is the most stable configuration. Comparable complexities of an apparently simple free energy landscape for TM helix interactions have recently been seen for CG metadynamics studies of dimerization of the EGFR TM domain.[75] A clear future extension of this study, especially in terms of TM helix–helix interactions, will be to obtain demonstrably converged PMFs from all-atom simulations. Two all-atom PMFs of GpA TM helix dimerization are available[45,68] but would merit revisiting in the context of the greater complexity of the (CG) free energy landscape revealed via the use of the DRMS as a CV. This will doubtless prove challenging due to much larger number of degrees of freedom in all-atom simulations and the high viscosity of the lipid bilayer.

62 in total

1. The Calpha ---H...O hydrogen bond: a determinant of stability and specificity in transmembrane helix interactions.

Authors: A Senes; I Ubarretxena-Belandia; D M Engelman
Journal: Proc Natl Acad Sci U S A Date: 2001-07-31 Impact factor: 11.205

2. Thermodynamics of glycophorin A transmembrane helix dimerization in C14 betaine micelles.

Authors: Karen G Fleming; Cha-Chi Ren; Abigail K Doura; Matthew E Eisley; Felix J Kobus; Ann Marie Stanley
Journal: Biophys Chem Date: 2004-03-01 Impact factor: 2.352

3. Improved Parameters for the Martini Coarse-Grained Protein Force Field.

Authors: Djurre H de Jong; Gurpreet Singh; W F Drew Bennett; Clement Arnarez; Tsjerk A Wassenaar; Lars V Schäfer; Xavier Periole; D Peter Tieleman; Siewert J Marrink
Journal: J Chem Theory Comput Date: 2012-11-28 Impact factor: 6.006

4. Insights into the recognition and association of transmembrane alpha-helices. The free energy of alpha-helix dimerization in glycophorin A.

Authors: Jérôme Hénin; Andrew Pohorille; Christophe Chipot
Journal: J Am Chem Soc Date: 2005-06-15 Impact factor: 15.419

Review 5. Perspective on the Martini model.

Authors: Siewert J Marrink; D Peter Tieleman
Journal: Chem Soc Rev Date: 2013-08-21 Impact factor: 54.564

6. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit.

Authors: Sander Pronk; Szilárd Páll; Roland Schulz; Per Larsson; Pär Bjelkmar; Rossen Apostolov; Michael R Shirts; Jeremy C Smith; Peter M Kasson; David van der Spoel; Berk Hess; Erik Lindahl
Journal: Bioinformatics Date: 2013-02-13 Impact factor: 6.937

Review 7. PIP(2) and proteins: interactions, organization, and information flow.

Authors: Stuart McLaughlin; Jiyao Wang; Alok Gambhir; Diana Murray
Journal: Annu Rev Biophys Biomol Struct Date: 2001-10-25

8. The effect of cholesterol on the lateral diffusion of phospholipids in oriented bilayers.

Authors: Andrey Filippov; Greger Orädd; Göran Lindblom
Journal: Biophys J Date: 2003-05 Impact factor: 4.033

9. Alchembed: A Computational Method for Incorporating Multiple Proteins into Complex Lipid Geometries.

Authors: Elizabeth Jefferys; Zara A Sands; Jiye Shi; Mark S P Sansom; Philip W Fowler
Journal: J Chem Theory Comput Date: 2015-05-14 Impact factor: 6.006

10. PIP(2)-binding site in Kir channels: definition by multiscale biomolecular simulations.

Authors: Phillip J Stansfeld; Richard Hopkinson; Frances M Ashcroft; Mark S P Sansom
Journal: Biochemistry Date: 2009-11-24 Impact factor: 3.162

37 in total

1. Characterization of Lipid-Protein Interactions and Lipid-Mediated Modulation of Membrane Protein Function through Molecular Simulation.

Authors: Melanie P Muller; Tao Jiang; Chang Sun; Muyun Lihan; Shashank Pant; Paween Mahinthichaichan; Anda Trifan; Emad Tajkhorshid
Journal: Chem Rev Date: 2019-04-12 Impact factor: 60.622

2. Inserting Small Molecules across Membrane Mixtures: Insight from the Potential of Mean Force.

Authors: Alessia Centi; Arghya Dutta; Sapun H Parekh; Tristan Bereau
Journal: Biophys J Date: 2020-02-04 Impact factor: 4.033

3. Accelerating Membrane Simulations with Hydrogen Mass Repartitioning.

Authors: Curtis Balusek; Hyea Hwang; Chun Hon Lau; Karl Lundquist; Anthony Hazel; Anna Pavlova; Diane L Lynch; Patricia H Reggio; Yi Wang; James C Gumbart
Journal: J Chem Theory Comput Date: 2019-07-02 Impact factor: 6.006

4. Molecular mechanism of CD44 homodimerization modulated by palmitoylation and membrane environments.

Authors: Ziyi Ma; Sai Shi; Meina Ren; Chunli Pang; Yong Zhan; Hailong An; Fude Sun
Journal: Biophys J Date: 2022-06-22 Impact factor: 3.699

5. Prediction of CB[8] host-guest binding free energies in SAMPL6 using the double-decoupling method.

Authors: Kyungreem Han; Phillip S Hudson; Michael R Jones; Naohiro Nishikawa; Florentina Tofoleanu; Bernard R Brooks
Journal: J Comput Aided Mol Des Date: 2018-08-06 Impact factor: 3.686

Review 6. Perturbations of Native Membrane Protein Structure in Alkyl Phosphocholine Detergents: A Critical Assessment of NMR and Biophysical Studies.

Authors: Christophe Chipot; François Dehez; Jason R Schnell; Nicole Zitzmann; Eva Pebay-Peyroula; Laurent J Catoire; Bruno Miroux; Edmund R S Kunji; Gianluigi Veglia; Timothy A Cross; Paul Schanda
Journal: Chem Rev Date: 2018-02-28 Impact factor: 60.622

Review 7. Microscopic view of lipids and their diverse biological functions.

Authors: Po-Chao Wen; Paween Mahinthichaichan; Noah Trebesch; Tao Jiang; Zhiyu Zhao; Eric Shinn; Yuhang Wang; Mrinal Shekhar; Karan Kapoor; Chun Kit Chan; Emad Tajkhorshid
Journal: Curr Opin Struct Biol Date: 2018-07-23 Impact factor: 6.809

8. Guardians of the Cell: State-of-the-Art of Membrane Proteins from a Computational Point-of-View.

Authors: Nícia Rosário-Ferreira; Catarina Marques-Pereira; Raquel P Gouveia; Joana Mourão; Irina S Moreira
Journal: Methods Mol Biol Date: 2021

9. Molecular simulation studies of the interactions between the human/pangolin/cat/bat ACE2 and the receptor binding domain of the SARS-CoV-2 spike protein.

Authors: Shaojie Ma; Hui Li; Jun Yang; Kunqian Yu
Journal: Biochimie Date: 2021-05-11 Impact factor: 4.079

Review 10. Computational Modeling of Realistic Cell Membranes.

Authors: Siewert J Marrink; Valentina Corradi; Paulo C T Souza; Helgi I Ingólfsson; D Peter Tieleman; Mark S P Sansom
Journal: Chem Rev Date: 2019-01-09 Impact factor: 72.087