George Hedger1, David Shorthouse1,2, Heidi Koldsø1,3, Mark S P Sansom1. 1. Department of Biochemistry, University of Oxford , South Parks Road, Oxford OX1 3QU, United Kingdom. 2. MRC Cancer Unit, University of Cambridge , MRC Research Centre, Box 197, Cambridge CB2 0X1, United Kingdom. 3. D. E. Shaw Research , 120 West 45th Street, 39th floor, New York, New York 10036, United States.
Abstract
Lipid molecules can bind to specific sites on integral membrane proteins, modulating their structure and function. We have undertaken coarse-grained simulations to calculate free energy profiles for glycolipids and phospholipids interacting with modulatory sites on the transmembrane helix dimer of the EGF receptor within a lipid bilayer environment. We identify lipid interaction sites at each end of the transmembrane domain and compute interaction free energy profiles for lipids with these sites. Interaction free energies ranged from ca. -40 to -4 kJ/mol for different lipid species. Those lipids (glycolipid GM3 and phosphoinositide PIP2) known to modulate EGFR function exhibit the strongest binding to interaction sites on the EGFR, and we are able to reproduce the preference for interaction with GM3 over other glycolipids suggested by experiment. Mutation of amino acid residues essential for EGFR function reduce the binding free energy of these key lipid species. The residues interacting with the lipids in the simulations are in agreement with those suggested by experimental (mutational) studies. This approach provides a generalizable tool for characterizing the interactions of lipids that bind to specific sites on integral membrane proteins.
Lipid molecules can bind to specific sites on integral membrane proteins, modulating their structure and function. We have undertaken coarse-grained simulations to calculate free energy profiles for glycolipids and phospholipids interacting with modulatory sites on the transmembrane helix dimer of the EGF receptor within a lipid bilayer environment. We identify lipid interaction sites at each end of the transmembrane domain and compute interaction free energy profiles for lipids with these sites. Interaction free energies ranged from ca. -40 to -4 kJ/mol for different lipid species. Those lipids (glycolipidGM3 and phosphoinositide PIP2) known to modulate EGFR function exhibit the strongest binding to interaction sites on the EGFR, and we are able to reproduce the preference for interaction with GM3 over other glycolipids suggested by experiment. Mutation of amino acid residues essential for EGFR function reduce the binding free energy of these key lipid species. The residues interacting with the lipids in the simulations are in agreement with those suggested by experimental (mutational) studies. This approach provides a generalizable tool for characterizing the interactions of lipids that bind to specific sites on integral membrane proteins.
The complexity and
diversity of cell membrane compositions suggest
that lipids may bind to membrane proteins at specific sites and modulate
their function. Only within the past decade have structure determination
techniques begun to achieve the resolution required to discern specifically
bound lipids at atomic resolution, and we now possess over 100 structures
of membrane proteins suggested to contain bound lipid molecules.[1] In a number of cases, structural data have been
coupled to functional analysis, and it is now clear that several classes
of membrane protein, including, e.g., K+ channels, receptor
tyrosine kinases (RTKs), and G-protein coupled receptors, are regulated
via selective interactions with specific lipids.[2−4] Therefore, there
is a need to better characterize these interactions to fully understand
the influence of the surrounding membrane environment on membrane
protein structure and function.[5]The epidermal growth factor receptor (EGFR/ErbB1) is perhaps the
best characterized member of the human RTK family. It resides in the
plasma membrane, where it serves an essential role in transmitting
information on the cellular environment to intracellular signaling
networks, which subsequently elicit a response.[6] Mutations of the receptor have been implicated in a variety
of cancers and thus the EGFR is a target for therapeutic intervention.[7] Of particular interest, EGFR is known to be modulated
by its surrounding lipid environment. The emerging picture is that
this modulation may occur via specific lipid interactions,[4,8] in addition to the more general influences of lipids such as cholesterol
on the biophysical properties of the membrane.[9]Molecular simulation techniques provide a powerful tool for
exploring
lipid interactions with membrane proteins[10] and have previously been used to successfully predict lipid interaction
sites on cytochrome C oxidase,[11] aquaporins,[12] and Kir channels.[13] However, there is a need to quantify these interactions to enable
predictions of the effects of protein mutation and to provide a fuller
understanding of the lipid selectivity of proteins such as EGFR. Molecular
simulations and potential of mean force[11] (PMF) calculations represent one such route for the quantitative
exploration of lipid–protein interactions. PMF methods are
increasingly being applied to better understand the molecular basis
of a range of biomolecular interactions (e.g., refs (11) and (14−17)) and allow one to describe the free energy profile along a particular
reaction coordinate. Sufficient sampling of configurational space
along this reaction coordinate is necessary for the accurate calculation
of a free energy profile. Attaining sufficient sampling and convergence
of the calculation is not an insignificant challenge,[18] which can inhibit its application to complex systems such
as membranes. These challenges may be addressed through application
of specialized sampling techniques such as umbrella sampling[19] and adaptive force biasing.[20] An additional approach is to reduce the granularity of
the system (and hence the number of degrees of freedom that must be
sampled) through use of coarse-grained (CG) models, which allow an
increase in accessible simulation times of several orders of magnitude.[21] A number of comparative studies have indicated
that estimates of the free energy for removal of lipid molecules from
a bilayer obtained from CG and all-atom simulations are in agreement
with one another and with experiment.[22,23] On the basis
of these considerations, we chose to employ umbrella sampling together
with the MARTINI CG force field (see ref (21) and references therein), which was originally
parametrized based on comparison with experimental and all-atom simulation
partitioning free energies.In addition to its intrinsic biological
importance, the EGFR provides
a well-characterized model system for the study of protein–lipid
interactions. The full-length protein consists of an extracellular
ligand binding ectodomain, a single transmembrane (TM) helix, a basic
juxtamembrane (JM) region, and an intracellular protein kinase domain
(Figure ).[6] The functional activity of the receptor is known
to be modulated by a glycolipid (monosialodihexosylganglioside
or GM3) and by phosphatidyl inositol-4,5-bisphosphate (PIP2).[4,8] However, despite a number of biochemical and biophysical
studies, the structural and energetic bases of the interactions of
these lipids with the EGFR TM and JM regions remain to be fully characterized.
This is in part due to the experimental challenges involved in studying
the behavior and interactions of lipids with proteins. In this study,
we have employed CG simulations and umbrella sampling to explore the
free energy landscape of interactions of a model of the wild-type
(WT) and mutant EGFR TM–JM dimer with a range of different
lipid species present within biological membranes.
Figure 1
Overview of EGF receptor
structure and sequence. (A) Composite
structural diagram of dimeric EGFR based on the extracellular domain
(ECD) crystal structure (PDB ID: 3NJP), the transmembrane (TM) and juxtamembrane
(JM) domains NMR structure (PDB: 2M20), and the asymmetric tyrosine kinase
crystal structure (PDB IDs: 2GS6, 3GOP, and 2JIU).
(B) NMR-derived structure of the TM–JM dimer used as the basis
of the simulations. The atomistic (AT) structure was converted to
a coarse-grained (CG) model (basic residues in blue, acidic residues
in red, polar residues in green, and hydrophobic residues in gray).
(C) Sequence of the TM helix and JM region, with the residue numbering
as in the processed mature protein sequence (UniProt entry P00533). An addition
of +24 should be made to convert to the preprocessed numbering.
Overview of EGF receptor
structure and sequence. (A) Composite
structural diagram of dimeric EGFR based on the extracellular domain
(ECD) crystal structure (PDB ID: 3NJP), the transmembrane (TM) and juxtamembrane
(JM) domains NMR structure (PDB: 2M20), and the asymmetric tyrosine kinase
crystal structure (PDB IDs: 2GS6, 3GOP, and 2JIU).
(B) NMR-derived structure of the TM–JM dimer used as the basis
of the simulations. The atomistic (AT) structure was converted to
a coarse-grained (CG) model (basic residues in blue, acidic residues
in red, polar residues in green, and hydrophobic residues in gray).
(C) Sequence of the TM helix and JM region, with the residue numbering
as in the processed mature protein sequence (UniProt entry P00533). An addition
of +24 should be made to convert to the preprocessed numbering.
Methods
NMR Structure-Derived Molecular
Model
The molecular
model of the EGFR helix dimer used was derived from the experimentally
determined NMR structure (PDB ID: 2M20).[24] Two native
methionines (M626 and M644) had been mutated to prevent chemical cleavage
of the protein during purification. These residues were restored to
the WT residues using the mutagenesis tool implemented in PyMOL (https://www.pymol.org/). The sequence and structure of the
model used in simulations are indicated in Figure . Models of the K618G and R645–7N
mutants were also obtained using the PyMOL mutagenesis tool. Energy
minimized atomistic (AT) models were converted to CG representation
using the MARTINI2.2 force field.[25] Assignment
of dihedral restraints to model the TM and short JM-A helices was
achieved in an automated fashion using the martinize.py workflow,
based on DSSP assignment. The N- and C-terminus of each monomer were
modeled with neutral charge.
CG Simulation Details
Simulations
were performed using
the GROMACS (www.gromacs.org) 5.0 and 4.6 simulation packages.[26] During the initial set of CG simulations, 50
ns self-assembly simulations[27] were performed
to allow formation of a PC bilayer consisting of 700 PClipid molecules
around the TM region of the dimer. A locally developed script[28] was used to exchange PC molecules for other
lipids to form an asymmetric bilayer containing 17 GM3 molecules (in
the outer leaflet, corresponding to ca. 2.5% of all lipids within
the system) and 17 PIP2 molecules (in the inner leaflet,
corresponding to ca. 2.5% of all lipids within the system). This provides
a simplified model of the distribution and glycolipid/PI content of
a plasma membrane.[28,29] Five-thousand steps of steepest
descent energy minimization were applied to relax the system, followed
by a 10 ns equilibration simulation. Two microsecond production runs
were performed with different random initial velocity seeds. Temperature
was maintained at 310 K utilizing a V-rescale thermostat[30] with a coupling constant of τt = 1 ps. Pressure was controlled at 1 bar using a Parrinello–Rahman
barostat[31] with a coupling constant of
τp = 5 ps and a compressibility of 3 × 10–4 bar–1. Protein, lipids, and solvent
(water + ions) were coupled independently. Electrostatics and van
der Waals interactions were shifted between 0 and 1.2 nm and 0.9 and
1.2 nm, respectively, and an integration time step of 20 was applied.
Within umbrella sampling simulations, particle mesh Ewald was applied
to improve modeling of long-range electrostatics.[32] Covalent bonds were constrained to their equilibrium values
using the P-LINCS algorithm.[26,33] All simulations were
run in the presence of conventional MARTINI water[22] and neutralized using 0.15 M NaCl. The MARTINI force field
was used to describe all system components, other than PIP2 and GM3, for which we employed locally parametrized versions previously
described by us.[28,34] Simulations were visualized using
VMD.[35] Contact analysis was performed using
VMD. Contacts were calculated between each residue of the protein
and the PO3 bead of the PIP2lipid headgroup, and the B1A
bead of the GM3 headgroup (see Figure S1 for lipid bead names). A 6 Å cutoff was applied to define a
contact. Radial distribution functions for each lipid headgroup relative
to the protein were calculated using the PO3 particle of PIP2, PO4 particle of PC, and B1A particle of GM3.
PMF Calculations
PMF is a key concept in statistical
mechanics,[36] which allows one to describe
the free energy profile along a particular reaction coordinate (d), and is derived from the average distribution function p(d) along this reaction coordinate. A
key challenge in calculating a PMF via MD simulation is attaining
sufficient sampling of configurational space.[37] This may be addressed via a number of sampling techniques including
umbrella sampling[19] and adaptive force
biasing.[20] We have applied umbrella sampling
in conjunction with a CG model to calculate PMF profiles for lipid
interactions with a TM helix dimer.[36]The last CG snapshot with a single bound lipid of interest was extracted
from a 2 μs production run. A 50 ns self-assembly simulation
was performed to permit formation of a PC bilayer around the dimer
and the bound lipid. Position restraints were applied to the protein
and the lipid of interest during the self-assembly, using a force
constant of 1000 kJ/mol/nm2. Protein side chains were subsequently
relaxed during a 10 ns equilibration with position restraints (400
kJ/mol/nm2 in the xy plane) only applied
to the G625 and G628 backbone beads to prevent rotation and translation
of the protein. All lipids of interest remained bound over this time
course, other than PC, to which a weak positional restraint of 100
kJ/mol/nm2 was applied to the PO4 bead (Figure S1). Steered MD (SMD) simulations were performed to
generate a series of configurations along a reaction coordinate ranging
from the lipid bound to lipid unbound states (free energy profile
reaches a plateau). The lipid was pulled at a rate of 0.1 nm/ns over
a distance of 2–3 nm using a force constant of 1000 kJ/mol/nm2, with the distance from the center-of-mass of G625 and G628
of the N-terminal dimerization interface of the EGFR model and a single
bead of the lipid of interest defined as the 1D reaction coordinate
(B1A for GM3; PO4 for PC, PS, and PG; and PO3 for PIP2).
The translational and rotational motions of the protein were reduced
through application of position restraints to the backbone beads of
G625 and G628 using a force constant of 400 kJ/mol/nm2,
with the z coordinate remaining unrestrained. For
all inner leaflet SMD simulations, these restraints were also applied
to M644 at the C-terminus of each helix to prevent the C-terminal
portion of the helix from “following” the lipid. Additionally,
a weak positional restraint of 100 kJ/mol/nm2 was applied
to a single bead of the lipid headgroup to limit its motional freedom
along the y coordinate, in a manner similar to Arnarez
et al.[11,38] The y coordinate within
our system is defined as is the direction perpendicular to the reaction
coordinate (x) and to the bilayer normal (z), and application of such restraints thus ensures sampling
of the 1D reaction coordinate and prevents transition to 2D sampling
and the associated convergence challenges. Application of these position
restraints to the dimer, together with the standard dihedral restraints
inherent to the MARTINI force field,[39] prevented
any major helical transitions or conformational rearrangements within
the TM region of the dimer, whereas the flexible +4 N-terminus and
JM were modeled in an unrestrained fashion and could freely interact
with the membrane. The subject lipid was treated separately to bulk
lipids for temperature and pressure coupling. Snapshots were extracted
from the SMD simulation, and these configurations were used as input
for umbrella sampling simulations. Windows were spaced asymmetrically
with ∼0.05 nm spacing at low-center-of-mass separations and
∼0.1 nm in the bulk. Each window was run for between 500 and
1000 ns. Relative lipid–protein separations within each umbrella
sampling window were maintained through application of a harmonic
umbrella potential between the center-of-mass of G625 and G628 residues
of the N-terminal dimerization interface and a single lipid headgroup
bead, using a force constant of 1000 kJ/mol/nm2. Additionally,
the same position restraints used in the SMD simulations were applied
in the umbrella sampling simulations. Within each window, temperature
and pressure were controlled using the Berendsen thermostat and Berendsen
barostat.[40] The GROMACS implementation
(g_wham) of the weighted histogram analysis method
(WHAM) was used to combine and unbias the umbrella potentials, and
the Bayesian bootstrapping method was utilized to estimate the error
on each free energy profile.[41] Convergence
was analyzed by comparing free energy profiles computed from nonintersecting
periods of simulation time. On the basis of these observations, the
first 100 ns of simulation time was excluded from analysis as equilibration.
The possible introduction of bias caused by taking system configurations
generated by pulling the lipid away from the dimer
was tested by taking the end point of a 1 μs umbrella sampling
simulation at the maximum protein–lipid separation and pulling
the lipid “backward” toward the dimer
from the bulk bilayer along the same 1D reaction coordinate. This
backward trajectory was then used to generate a new set of umbrella
sampling simulations conducted in the same manner described previously.
Results
Identification of Lipid Interaction Hotspots
Lipid
interactions were explored using a CG molecular model of the TM–JM
dimer of EGFR based on an NMR structure (PDB ID: 2M20).[24] The monomeric unit consisted of a 4 residue N-terminal
region, a 23 residue TM helix, and a 32 residue JM region (Figure ). This model was
embedded in a phosphatidylcholine (PC) bilayer via self-assembly
simulations.[27] Selected PClipid molecules
were subsequently exchanged for other lipid species[28] to form an asymmetric bilayer containing 2.5% GM3 (in the
outer leaflet) and 2.5% PIP2 (in the inner leaflet), providing
a simplified model of the distribution of these lipids in mammalian
cell membranes.[29,42] CG-MD simulations were performed
using the MARTINI force field.[22,25,39] Five repeat simulations of the dimer in the asymmetric membrane
were run, each 2 μs in duration (see Methods for full simulation details). The number of contacts formed over
the course of the simulations between each protein residue and the
headgroup of each lipid species was calculated. Within the outer leaflet,
GM3 exhibited a strong preference for interaction with N-terminal
membrane proximal residue K618, whereas within the inner leaflet,
PIP2 showed high frequencies of interaction with R645,
R646, and R647, as well as with R656 within the cytoplasmic JM region
(Figure ). These lipid
interaction hotspots identified by our simulations agree well with in vivo mutagenesis studies and in vitro functional assays that have shown that these residues are essential
for the sensitivity of EGFR to GM3 and PIP2.[4,8] Visual inspection of the trajectory revealed that once associated
each PIP2lipid molecule remained bound over the 2 μs
duration of the simulation. In contrast, the extracellular leaflet
GM3lipids exhibited a more dynamic pattern of transient receptor
interactions such that multiple association and dissociation events
were observed (Figure A).
Figure 2
Protein–lipid interactions within a model bilayer. Lipid
interaction hotspots identified by CG simulations of the TM–JM
dimer in a PC bilayer containing 5% GM3 in the extracellular leaflet
and 5% PIP2 in the intracellular leaflet. These are shown
in (A) by a contact matrix of the dynamic pattern of interaction formed
between the peptide and a single bead of the headgroup moieties of
PIP2 and GM3 over time (see Methods for full details of contact analysis). (B) TM helix dimer (snapshot
from a CG simulation) with each residue colored from white (no interaction
with GM3 or PIP2 headgroup) to red (high degree of interaction).
Residues that formed the highest levels of contact with lipid molecules
within each leaflet are labeled. Contacts were calculated over 5 ×
2 μs simulation repeats, using a 6 Å cutoff.
Protein–lipid interactions within a model bilayer. Lipid
interaction hotspots identified by CG simulations of the TM–JM
dimer in a PC bilayer containing 5% GM3 in the extracellular leaflet
and 5% PIP2 in the intracellular leaflet. These are shown
in (A) by a contact matrix of the dynamic pattern of interaction formed
between the peptide and a single bead of the headgroup moieties of
PIP2 and GM3 over time (see Methods for full details of contact analysis). (B) TM helix dimer (snapshot
from a CG simulation) with each residue colored from white (no interaction
with GM3 or PIP2 headgroup) to red (high degree of interaction).
Residues that formed the highest levels of contact with lipid molecules
within each leaflet are labeled. Contacts were calculated over 5 ×
2 μs simulation repeats, using a 6 Å cutoff.We sought to quantify the strength of the observed
lipid–protein
interactions and test the importance of residue identity at the interaction
sites identified as hotspots through calculation of the PMF (i.e.,
free energy profiles) for lipid association with both WT and mutant
receptors. A final CG snapshot with a single bound lipid of interest
at either of the previously determined sites was extracted from the
2 μs simulation trajectory to provide starting structures for
the PMF calculations. In each case, the snapshot corresponded to a
structure in which the lipid of interest was bound between the two
TM helices, approximately equidistant from each helix axis. In the
case of PIP2, the flexible region of the JM after R647
was removed to leave just the first three JM residues (R645–7)
present within a truncated JM region. This enabled us both to expedite
data collection and to combat the technical challenge of achieving
convergence in PMF calculations with a highly flexible JM region.
A new PC bilayer was subsequently formed via self-assembly simulations
around the position-restrained protein (see Methods for details of restraints used) and the bound lipid. The TM region
of the dimer was thus conformationally “frozen” in terms
of degrees of freedom, whereas the flexible +4 N-terminus and C-terminal
JM regions remained unrestrained and free to dynamically interact
with the membrane. The choice of membrane composition served to combat
the additional convergence challenges that arise from calculations
in more complex multicomponent bilayers. To generate a 1D reaction
coordinate ranging from the lipid-bound state to the corresponding
lipid (GM3 or PIP2) being free in the PC bilayer, a steered
MD simulation was performed in which a force was applied to pull the
lipid of interest away from the restrained TM helix dimer over a distance
of 2–3 nm (Figure A,B). A series of 20–40 system configurations were
extracted along the reaction coordinate, each spaced by ca. 0.1 nm.
These configurations were used as starting points for umbrella sampling
simulations. The subject lipid was maintained on the 1D reaction coordinate
during umbrella sampling through application of lateral position restraints
in a manner previously described in refs (11) and (38). Simulations were performed using Gromacs 4.6,[26] and free energy profiles were constructed using
the g_wham utility.[41] (See
the Methods for full details of how PMF calculations
were conducted.)
Figure 3
Reaction coordinate for probing the energetics of lipid
interactions
with the EGFR TM–JM dimer in a lipid bilayer. (A) Schematic
illustration of a dimer in a membrane, with the arrows showing the
approximate pathways (i.e., reaction coordinates) for evaluating the
free energy profiles (i.e., potentials of mean force, PMFs) of GM3
and PIP2 interactions with the protein. (B) Reaction coordinate
for exploring the energetics of protein–lipid interactions
illustrated for GM3. The reaction coordinate is defined as the separation
along the x coordinate of the center-of-mass (red)
of the two glycine residues within the N-terminal GxxGA motif of the
dimer and of the B1A particle (red) of GM3.
Reaction coordinate for probing the energetics of lipid
interactions
with the EGFR TM–JM dimer in a lipid bilayer. (A) Schematic
illustration of a dimer in a membrane, with the arrows showing the
approximate pathways (i.e., reaction coordinates) for evaluating the
free energy profiles (i.e., potentials of mean force, PMFs) of GM3
and PIP2 interactions with the protein. (B) Reaction coordinate
for exploring the energetics of protein–lipid interactions
illustrated for GM3. The reaction coordinate is defined as the separation
along the x coordinate of the center-of-mass (red)
of the two glycine residues within the N-terminal GxxGA motif of the
dimer and of the B1A particle (red) of GM3.
Binding of GM3 within the Extracellular Leaflet
Free
energy profiles were calculated for GM3 and for phosphatidylglycerol
(PG) interaction with the WT dimer within the outer leaflet and for
GM3 interactions with a mutant dimer (K618G) (Figure ). Additionally, the importance of the N-acetyl
neuraminic acid (Neu5Ac) moiety of GM3 for protein interactions was
tested by calculation of free energy profiles for GM3 lacking the
Neu5Ac moiety (GM3 (−Neu5Ac)). GM3 was found to interact with
the EGFR dimer with an overall minimum (state I) free energy of −9
kJ/mol. A second local energy minimum (state II) with a free energy
of −7 kJ/mol was also observed (Figure A). These minima both represent bound states
of GM3. In state II, the lipid headgroup is erect and parallel to
the membrane normal. In contrast, in state I the GM3lipid headgroup
is tilted and penetrates further into the bilayer (Figures A, S2, and S3). This enables formation of optimal interactions between
the N-terminal four residues of the helix dimer and the GM3 headgroup.
In state I, there is a degree of local bilayer deformation seen as
bilayer thinning (Figure B,C). This local deformation is absent from state II. The
second minimum thus most likely represents an energetically favorable
adjustment of the tilt angle and depth of GM3 within the lipid bilayer
upon moving away from the protein and the consequent release of local
deformation in the surrounding PC bilayer. The ability of PMF calculations
to detect these subtle differences in the behavior of a single lipid
illustrates the sensitivity of the method. Mutation of residue K618
to glycine decreased the binding affinity of GM3 by 4 kJ/mol compared
to the WT (Figure ). In addition, removal of the Neu5Ac sugar from the GM3 headgroup
caused a reduction in binding affinity of 5 kJ/mol (Figure ). Both the K618G mutant and
GM3 (−Neu5Ac) truncation remove one of the formal charge components
of the interaction between the lysine side chain and the carboxylate
group of GM3. The reduction in binding affinity by ∼50% demonstrates
that both the cationic lysine side chain at position 618 and the anionic
headgroup moiety of the glycolipid are significant contributors to
the GM3–EGFR interaction. In agreement with this prediction, in vitro assays have shown that treatment of GM3 with neuraminidase
(which cleaves Neu5Ac from the GM3 headgroup) abolishes its ability
to modulate receptor activity and that K618 is likewise essential
for GM3 interaction and the sensitivity of the receptor to GM3 levels
within membranes.[4] Significantly, the simple
anionic lipidPG exhibited a weaker binding affinity than GM3 (Figure ), despite both lipids
bearing a formal charge of −1. This suggests that the additional
polar moieties of the GM3 headgroup play a role in providing interaction
sites that allow EGFR to form selective interactions with GM3. As
anticipated, the zwitterionic lipidPC exhibited no significant interaction
(Figure S4).
Figure 4
Outer leaflet free energy
profiles. (A) GM3, (B) GM3 (−Neu5Ac),
i.e., GM3 from which the terminal N-acetyl neuraminic acid has been
removed, (C) GM3 and the K618G mutant dimer, and (D) phosphatidylglycerol
(PG). The standard deviation estimated from bootstrapping is shown
as the shaded area behind the curve.
Figure 5
Repositioning of GM3 at the protein–lipid interface. (A)
Position of the GM3 molecule (purple to red) relative to the upper
leaflet of the lipid bilayer (gray) as a function of the displacement, x, along the reaction coordinate. The inset shows the position
along the bilayer normal (z) for the B1A particle
(red) of GM3 as a function of the x coordinate of
the lipid. (B,C) Density map of the location of the PC head groups
averaged over the course of the first umbrella sampling window simulation
(i.e., that with the GM3 closest to the protein). A degree of bilayer
thinning is observed around the dimer and bound lipid. The approximate
locations of the protein and GM3 are indicated in (B).
Outer leaflet free energy
profiles. (A) GM3, (B) GM3 (−Neu5Ac),
i.e., GM3 from which the terminal N-acetyl neuraminic acid has been
removed, (C) GM3 and the K618G mutant dimer, and (D) phosphatidylglycerol
(PG). The standard deviation estimated from bootstrapping is shown
as the shaded area behind the curve.Repositioning of GM3 at the protein–lipid interface. (A)
Position of the GM3 molecule (purple to red) relative to the upper
leaflet of the lipid bilayer (gray) as a function of the displacement, x, along the reaction coordinate. The inset shows the position
along the bilayer normal (z) for the B1A particle
(red) of GM3 as a function of the x coordinate of
the lipid. (B,C) Density map of the location of the PC head groups
averaged over the course of the first umbrella sampling window simulation
(i.e., that with the GM3 closest to the protein). A degree of bilayer
thinning is observed around the dimer and bound lipid. The approximate
locations of the protein and GM3 are indicated in (B).
PIP2 Binding to the EGFR TM Domain
Within
the inner leaflet of the bilayer, free energy profiles were computed
for the interactions of PIP2–5, PC, and
PS. The importance of PIP2ionization state[43] was then evaluated by calculation of free energy
profiles for PIP2–3 and PIP20, and the significance of the side chain properties at
positions within the JM was investigated by calculation of a profile
for the interaction of PIP2–5 with a
R645–7N mutant dimer (Figure ). In contrast to the relatively small interaction
free energies observed within the upper leaflet, PIP2–5 bound to the dimer with a free energy of −42
kJ/mol, whereas PIP2–3 was bound with
a free energy of −35 kJ/mol. The significantly deeper wells
observed in these profiles are consistent with observations from the
contact analysis (Figure ) and with visual inspection of trajectories revealing that
PIP2 lipids remained bound for the duration of the initial
2 μs CG simulations. In vivo, PIP2 lipids are suggested to exist in a mixture of the −3, −4,
and −5 ionization states,[44] although
the exact charge may depend on the lipid microenvironment.[45] Thus, we would expect the interaction free energy
of PIP2 with the dimer to be about −40 kJ/mol, i.e.,
4 times stronger than the interaction with GM3 in the opposite leaflet
of the bilayer. Mutation of residues R645–7 within the JM cationic
cluster of each monomer to three asparagines resulted in a significant
reduction in the strength of interaction to about −10 kJ/mol.
The free energy of the PIP2 interaction with this asparagine
(NNN) mutant is comparable of that of phosphatidylserine (PS) with
the WT receptor (Figure ), demonstrating that these residues are essential for the strong
binding of PIP2 lipids as well as the selectivity of EGFR
for this lipid over other species present within cell membranes. Furthermore,
as expected, these mutations show that the formal charge interactions
between PIP2 phosphate groups and the basic arginine residues
contribute to a significant portion of the binding energy. In support
of this, the interaction of neutral charge state PIP2 lipids
was ca. 30 kJ/mol weaker compared to that of the −5 ionization
state (Figure ). The
residues that we identify as important for protein–lipid interactions
have been experimentally shown to be required for normal receptor
function, with R-to-N and R-to-A mutants exhibiting significantly
impaired activity. In addition, basic residues within the early JM
region have been shown to be crucial in facilitating PIP2 interactions and the ability of PIP2 lipids to modulate
receptor activity in vivo.[8,46]
Figure 6
Inner
leaflet free energy profiles. (A) Free energy profiles for
PIP2–5, PIP2–3, and PIP20 interactions with the WT protein.
(B) Profiles for PS and PC interactions with the WT protein and PIP2 interactions with the R645–7N mutant. The standard
deviation estimated from bootstrapping is shown as the shaded area
behind the curve.
Inner
leaflet free energy profiles. (A) Free energy profiles for
PIP2–5, PIP2–3, and PIP20 interactions with the WT protein.
(B) Profiles for PS and PC interactions with the WT protein and PIP2 interactions with the R645–7N mutant. The standard
deviation estimated from bootstrapping is shown as the shaded area
behind the curve.When judging the utility
of PMF calculations, it is important to
consider their convergence.[18] The convergence
of the PIP2–5 PMF calculations was judged
by comparison of energy profiles calculated from multiple nonoverlapping
segments of simulation time (Figure A). The well depth oscillated around −42 kJ/mol,
with no overall trend during the 1 μs simulation window (30
× 1 μs for the whole profile). The efficient convergence
seen in these profiles is likely a function of the CG model that we
employed, as the reduction in degrees of freedom[22] allows the system to attain convergence on faster time
scales. The approach was found to be reproducible to within 5 kJ/mol
(i.e., ca. 2 kT) by comparison of three independently calculated PMFs
of the PIP2–5 system (Figure B). The direction of the force
applied in the initial steered MD simulation (i.e., whether the 1D
reaction coordinate was generated by pulling the lipid away from the
protein or toward it) was also tested and did not show any significant
influence on the profile obtained (Figure C), indicating the absence of a hysteresis.
Figure 7
Convergence
of the free energy landscape. (A) Free energy profiles
for the interaction of PIP2–5 with the
WT dimer, each calculated from nonoverlapping 100 ns segments of the
simulation time within each 1 μs window. (B) Three PIP2–5 profiles, each calculated from an independent
repeat. In each case, a different structure was extracted from an
equilibrium simulation, and a new set of SMD simulations and umbrella
sampling calculations was conducted using that structure as the starting
point. The well depths span a range of ca. 5 kJ/mol. (C) Profile obtained
by pulling a PIP2–5 lipid from the bulk
membrane “backward” along the 1D coordinate toward the
protein during an SMD simulation and using that trajectory to define
the reaction pathway for a new set of umbrella sampling windows.
Convergence
of the free energy landscape. (A) Free energy profiles
for the interaction of PIP2–5 with the
WT dimer, each calculated from nonoverlapping 100 ns segments of the
simulation time within each 1 μs window. (B) Three PIP2–5 profiles, each calculated from an independent
repeat. In each case, a different structure was extracted from an
equilibrium simulation, and a new set of SMD simulations and umbrella
sampling calculations was conducted using that structure as the starting
point. The well depths span a range of ca. 5 kJ/mol. (C) Profile obtained
by pulling a PIP2–5 lipid from the bulk
membrane “backward” along the 1D coordinate toward the
protein during an SMD simulation and using that trajectory to define
the reaction pathway for a new set of umbrella sampling windows.In addition to more specific protein–lipid
interactions,
we have previously revealed lipid clustering induced by disordered
JM regions in simulations of the monomeric TM–JM domains of
RTKs[47] and cytokine receptors.[48] Comparable clustering of anionic lipids was
seen in the current simulations of the dimeric EGFR TM–JM (Figure A,B). Spherical radial
distribution functions (RDFs) for each lipid headgroup relative to
the protein exhibited peaks at 0.5 nm, a second peak extending out
to 1 nm, and a third weaker peak at 1.5 nm, indicating a degree of
local lipid reorganization and enrichment of certain lipid species
around the protein dimer. In the outer leaflet, the RDF revealed clustering
of GM3; in the inner leaflet, clustering of PIP2 is seen.
The degree of local clustering (i.e., the first peak in the RDF) is
greater for PIP2 than for GM3, reflecting the stronger
electrostatic interactions for the underlying interaction, as seen
in the free energy profiles.
Figure 8
Clustering of GM3 and PIP2 lipids.
Clustering of (A)
GM3 in the outer leaflet and (B) PIP2 in the inner leaflet
around the TM + JM dimer. In each case, the radial distribution function
(RDF; g(r)) for the headgroup of
PC (black) and of GM3 (green) or PIP2 (purple) is shown.
Calculations were performed over 5 × 2 μs of CG simulation,
with the standard deviation indicated as the shaded area behind each
curve. The area under each curve is normalized to unity to aid comparison
between the different lipid species. The two insets are density maps
showing the average spatial occupancy of each lipid headgroup over
the outer or inner leaflet surface. GM3 is indicated in green, PIP2 is in purple, and lysine or arginine density is in cyan.
Calculations were performed over 5 × 2 μs simulation repeats.
In these images, the protein has been omitted to show the lysine and
arginine density maps more clearly.
Clustering of GM3 and PIP2 lipids.
Clustering of (A)
GM3 in the outer leaflet and (B) PIP2 in the inner leaflet
around the TM + JM dimer. In each case, the radial distribution function
(RDF; g(r)) for the headgroup of
PC (black) and of GM3 (green) or PIP2 (purple) is shown.
Calculations were performed over 5 × 2 μs of CG simulation,
with the standard deviation indicated as the shaded area behind each
curve. The area under each curve is normalized to unity to aid comparison
between the different lipid species. The two insets are density maps
showing the average spatial occupancy of each lipid headgroup over
the outer or inner leaflet surface. GM3 is indicated in green, PIP2 is in purple, and lysine or arginine density is in cyan.
Calculations were performed over 5 × 2 μs simulation repeats.
In these images, the protein has been omitted to show the lysine and
arginine density maps more clearly.
Limitations
It is important to consider the possible
limitations of the model that we employed. In particular, we might
expect that inclusion of the extracellular and intracellular domains
of the EGFR (along with the glycans that decorate the extracellular
domain[49]) to exert some effect on the free
energy landscape of lateral lipid interaction with the TM domain.
However, the use of RTK TM domain constructs and mutational studies
within these systems have previously been shown to be relevant to
the full-length receptor.[50] Furthermore,
by its nature, the CG MARTINI model involves a degree of approximation
arising from loss of atomic detail and is likely to cause deviation
from binding free energies estimated from AT simulations and/or measured
experimentally. In particular, some local changes in secondary structure
might be induced upon binding of regulatory lipids. The standard protein
backbone dihedral restraints employed within the MARTINI model[39] and the lack of hydrogen-bond directionality
arising from coarse-graining the system mean that possible lipid-induced
conformational changes are not be captured within our calculations.
Along the same line, it is possible that protein and lipid protonation
states may change as a function of the reaction coordinate. To capture
such effects, constant-pH simulations would be required (see, e.g.,
ref (51)). In lieu
of modeling dynamic protonation, some indication of the possible effects
of protonation state may be provided by our choice to calculate PIP2 free energy profiles for three different ionization states,
along with neutralizing protein and lipid mutagenesis. The standard
MARTINI water model[22] is likely to oversimplify
the role of water in these interactions, particularly for lipids with
large solvent-exposed headgroups such as GM3. Polarizable water models[52] have been suggested to improve the accuracy
of free energy calculations for partitioning amino acid side chains
into hydrophobic environments. However, there remains an incomplete
understanding of its influence on lipid–water and protein–water
interactions, together with a substantial increase in the convergence
time associated with its use (see, e.g., ref (53)).Despite these
limitations, we note that the model has been shown to correctly predict
lipid interaction sites on a number of membrane proteins,[10−13] and also that the MARTINI force field may be considered to be lipid-oriented
as it was parametrized based on the reproduction of experimental partitioning
free energies and atomic simulation parameters.[39] For example, free energy profiles for phospholipid removal
from a DPPC bilayer show reasonable agreement in all-atom and CG simulations,
which have respectively been used to obtain estimates of ca. −80
and −90 kJ/mol.[22,54] These values may in turn be compared
with an experimental estimate of ca. −70 kJ/mol derived from
the critical micelle concentration.[14] Turning
to proteins, free energy profiles for the dimerization of the TM domain
of glycophorin A have yielded well-depth estimates of −30 to
−40 kJ/mol[55,56] in MARTINI CG and of −45
to −60 kJ/mol[57,58] in all-atom simulations. Considering
the differences in bilayer composition, helix length, and the errors
associated with both methods, this suggests that broad agreement is
seen between the two levels of granularity for both lipid–lipid
and protein–protein interactions. Taking this together with
a significant reduction in the sampling challenges that are routinely
faced in AT simulations,[18] we expect the
CG free energy profiles that we calculate to provide a reasonable
first approximation of EGFR–lipid interaction free energies
and of the relative effects of protein mutation.
Discussion
We have demonstrated that simulations can reveal both the strength
and specificity of the interactions of regulatory lipids with the
TM domain of a RTK. We show that K618, R645, R646, and R647 play an
important role in EGFR interactions with modulatory lipidsGM3 and
PIP2. Comparison to experimental mutations confirms that
CG simulations are indeed capable of identifying lipid interaction
hotspots on integral membrane proteins. The significant (ca. 4-fold)
difference in receptor interaction energies of modulatory lipidsGM3
and PIP2 and the differences in the dynamics of these interactions
are suggestive of possible differential modes of receptor modulation.
To date, these lipids have been proposed to modulate EGFR activity
via influences on dimerization propensity,[4] direct conformational stabilization and orientation of intracellular
JM and kinase regions,[59,60] and larger scale effects on receptor
clustering at the cell surface.[47,61] RTK helix dimerization
energies have previously been estimated to be on the order of ca.
−20 kJ/mol for EGFR[62] to ca. −60
kJ/mol for EphA1.[63] The strength of interactions
of PIP2 lipids revealed in the current simulations (ca.
−40 kJ/mol) is supportive of the notion that lipid modulation
of EGFR activity could occur via direct conformational stabilization
and effects on EGFR helix dimerization rather than solely via effects
on receptor clustering.[61] In contrast,
the weaker and more transient interactions of GM3 that we observe
may be consistent with a more raft-like model of receptor modulation,
which requires only weak lateral interactions between receptors and
lipids.[64] This is consistent with observations
that GM3 inhibits EGFR activity only within lipid-disordered/lipid-ordered
(i.e., raft-like) membranes containing sphingomyelin and cholesterol[4] and not in lipid-disordered phase bilayers. Characterization
of the underlying energetics of EGFR–lipid interactions is
thus an essential first step toward a mechanistic understanding of
these interactions. In particular, a key prediction arising from our
results is that the comparable magnitudes of PIP2 binding
energies and those of helix dimerization indicate that PIP2 lipids may be able to modulate receptor dimerization by direct competition
with TM helices, as suggested by, e.g., Stangl and Schneider.[65] This could occur both by influencing the relative
monomer/dimer populations of the EGFR and by driving possible conformational
transitions within existing dimers, as hinted at by recent structural
studies of a presumed inactive state of the TM domain.[66] This prediction could be tested either via reconstitution
of the EGFR in membranes of defined lipid composition[4] or via modulation of the lipid composition of the membranes
of living cells.[67] Depending on the conformation
the extracellular domain adopts and the type of glycosylation, a number
of glycans can be positioned spatially close to the N-termini of the
helices with which GM3 interacts.[49] It
therefore seems likely that in certain states these glycans may influence
lateral lipid interaction with the TM domain. Further combined simulation
and experimental studies would be welcome in elucidating the influence
of glycans and their relationship with regulatory lipid species and
the TM domain. Ultimately, structural studies may reveal lipid-binding
sites at high resolution and provide further insight into the mechanism
of lipid modulation.
Conclusions
In summary, this well-studied
model system allows us to demonstrate
that calculation of free energy profiles for lipid interactions via
CG simulation represents a powerful generalizable tool for the dissection
and prediction of protein–lipid interactions and selectivity
within cell membranes. A comparable approach has been previously applied
to probe the specific binding of cardiolipin to cytochrome C oxidase[11] and, more recently, for the binding of other
lipid species.[38] Taken together, these
studies suggest that the CGlipid PMF methodology may be generally
applicable to a wide range of membrane proteins and lipid types. This
approach provides important biophysical quantification of the relative
strengths of lipid interaction at specific binding sites and thus
complements structural insights from crystallographic,[68] computational,[69] and
biophysical[70] approaches.
Authors: Djurre H de Jong; Gurpreet Singh; W F Drew Bennett; Clement Arnarez; Tsjerk A Wassenaar; Lars V Schäfer; Xavier Periole; D Peter Tieleman; Siewert J Marrink Journal: J Chem Theory Comput Date: 2012-11-28 Impact factor: 6.006
Authors: Michael Zocher; Cheng Zhang; Søren G F Rasmussen; Brian K Kobilka; Daniel J Müller Journal: Proc Natl Acad Sci U S A Date: 2012-11-14 Impact factor: 11.205
Authors: W F Drew Bennett; Justin L MacCallum; Marlon J Hinner; Siewert J Marrink; D Peter Tieleman Journal: J Am Chem Soc Date: 2009-09-09 Impact factor: 15.419
Authors: Anton Arkhipov; Yibing Shan; Rahul Das; Nicholas F Endres; Michael P Eastwood; David E Wemmer; John Kuriyan; David E Shaw Journal: Cell Date: 2013-01-31 Impact factor: 41.582