T Bertie Ansell1, Luke Curran1, Michael R Horrell1, Tanadet Pipatpolkai1,2, Suzanne C Letham1,3, Wanling Song1, Christian Siebold4, Phillip J Stansfeld5, Mark S P Sansom1, Robin A Corey1. 1. Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, U.K. 2. Department of Physiology, Anatomy & Genetics, University of Oxford, South Parks Road, Oxford, OX1 3PT, U.K. 3. Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford, OX1 3RE, U.K. 4. Division of Structural Biology, Wellcome Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, OX3 7BN, U.K. 5. School of Life Sciences and Department of Chemistry, University of Warwick, Coventry, CV4 7AL, U.K.
Abstract
Specific interactions of lipids with membrane proteins contribute to protein stability and function. Multiple lipid interactions surrounding a membrane protein are often identified in molecular dynamics (MD) simulations and are, increasingly, resolved in cryo-electron microscopy (cryo-EM) densities. Determining the relative importance of specific interaction sites is aided by determination of lipid binding affinities using experimental or simulation methods. Here, we develop a method for determining protein-lipid binding affinities from equilibrium coarse-grained MD simulations using binding saturation curves, designed to mimic experimental protocols. We apply this method to directly obtain affinities for cholesterol binding to multiple sites on a range of membrane proteins and compare our results with free energies obtained from density-based equilibrium methods and with potential of mean force calculations, getting good agreement with respect to the ranking of affinities for different sites. Thus, our binding saturation method provides a robust, high-throughput alternative for determining the relative consequence of individual sites seen in, e.g., cryo-EM derived membrane protein structures surrounded by an array of ancillary lipid densities.
Specific interactions of lipids with membrane proteins contribute to protein stability and function. Multiple lipid interactions surrounding a membrane protein are often identified in molecular dynamics (MD) simulations and are, increasingly, resolved in cryo-electron microscopy (cryo-EM) densities. Determining the relative importance of specific interaction sites is aided by determination of lipid binding affinities using experimental or simulation methods. Here, we develop a method for determining protein-lipid binding affinities from equilibrium coarse-grained MD simulations using binding saturation curves, designed to mimic experimental protocols. We apply this method to directly obtain affinities for cholesterol binding to multiple sites on a range of membrane proteins and compare our results with free energies obtained from density-based equilibrium methods and with potential of mean force calculations, getting good agreement with respect to the ranking of affinities for different sites. Thus, our binding saturation method provides a robust, high-throughput alternative for determining the relative consequence of individual sites seen in, e.g., cryo-EM derived membrane protein structures surrounded by an array of ancillary lipid densities.
Eukaryotic
integral membrane proteins participate in a range of
essential cellular functions including signaling, adhesion, solute
transport, and ion homeostasis. Membrane proteins are inserted in
a lipid bilayer, the composition of which varies between cellular
compartments, metabolic state, and intramembrane localization.[1,2] Specific interactions of lipids with proteins have been observed
both experimentally and in molecular dynamics (MD) simulations[3−5] and can alter protein functionality by, e.g., allosteric modulation[6−8] or bridging protein–protein oligomerization.[9,10]Structural elucidation of specific protein–lipid interactions
has been aided by advances in cryo-electron microscopy (cryo-EM).[11,12] However, distinguishing the molecular identity of lipid-like densities
can be challenging, and it is limited to higher resolution examples.[13,14] Differentiating between phospholipid and sterol densities is somewhat
easier due to their distinct shapes. In mammalian cell membranes the
most abundant sterol is cholesterol, whereas in yeast and plant cell
membranes it is ergosterol and phytosterol, respectively.[15] Cholesterol is typically present at concentrations
of 30–40%,[16,17] although this may vary across
different regions of the membrane, and is higher in sphingolipid enriched
areas.[18] Cholesterol has been shown to
bind and modulate a broad range of membrane proteins including G-protein
coupled receptors (GPCRs), ion channels, and solute transporters.[6,19−23] Recent cryo-EM structures have revealed several sterol-like densities
surrounding protein transmembrane domains (TMDs). In these instances,
the bound density is cholesterol, copurified from the native bilayer,[24−26] or it may correspond to cholesterol derivatives, such as cholesterol
hemisuccinate (CHS), which are added during purification.[27,28] Often multiple cholesterol binding sites are observed within the
same structure.[25] For example, a recent
structure of the serotonin receptor, 5-HT1A (Protein Data
Bank (PDB) ID 7E2X), revealed 10 cholesterol molecules surrounding the TMD, including
one partially buried cholesterol adjacent to the orthosteric ligand
pocket.[29] There is therefore a clear need
to understand and characterize the relative affinities of multiple
cholesterol binding sites on the same protein. However, this remains
experimentally challenging, and there is a paucity of quantitative experimental biophysical data for cholesterol binding to, e.g.,
GPCRs[30,31] and other membrane proteins.Equilibrium
MD simulations have been used extensively to expand
on the information provided from structural analyses, to study protein–lipid
interaction patterns, and to obtain detailed insights into specific
binding sites.[4,32,33] In addition, biased-sampling simulations, such as potential of mean
force (PMF) calculations, free energy perturbation, and metadynamics
simulations, have been used to obtain lipid binding free energies,
supplementing available experimental data on lipid binding affinities.[34] These biased simulations are often performed
after initial equilibrium MD simulations, therefore requiring additional
computing resources and an iterative process to select suitable reaction
coordinates. This limits the applicability of such approaches to high-throughput,
automated pipelines, for example, MemProtMD.[35,36] To circumvent these limitations, efforts have been made to derive
protein–lipid binding affinities directly from equilibrium
MD simulations. These have the advantage that multiple lipid sites
can be simultaneously examined, such as in studies using 2D density
distributions of cholesterol surrounding the A2A and/or
β2 adrenergic receptors, taken from either atomistic[37] or coarse-grained (CG) MD simulations.[38] Additionally, complex lipid interaction profiles[32] can be more readily determined, such as applied
in a “density-threshold” approach with the nicotinic
acetylcholine receptor in a mixed lipid environment.[39] This can also be achieved with biased simulations, but
it requires additional simulations for each lipid species studied.[40] However, it remains unclear how accurate equilibrium
methods are for obtaining binding affinities and whether full convergence
is feasible within the limits of current MD simulations.Here,
we present a method for obtaining apparent dissociation constants
(Kdapp’s) directly from
equilibrium MD simulations. We apply this method to rank the strength
of binding sites for cholesterol on three representative membrane
proteins: an ATP-dependent pump (P-glycoprotein; P-gp; see below for
further details), a sterol receptor/transporter protein (Patched1;
PTCH1), and a member of the TRP family of ion channels (polycystin-2;
PC2) (Figure ). We
investigate whether the site rankings derived from this approach are
comparable with existing equilibrium and nonequilibrium methods. We
also study whether these differences are maintained in the presence
of higher (i.e., physiological) membrane concentrations of cholesterol.
We illustrate the utility of our robust method for determining the
relative affinities of multiple cholesterol sites on a membrane protein
via its application to the serotonin receptor (5-HT1A),
a GPCR structure recently determined by cryo-EM with 10 cholesterol
molecules bound.[29]
Figure 1
Membrane proteins which
bind cholesterol. Coarse-grained (CG) representations
of the structures of a transporter (P-glycoprotein; P-gp; PDB ID 7A65, subunit A), a receptor
(Patched1; PTCH1; PDB ID 6RVD, subunit A), an ion channel (human polycystin-2; PC2;
PDB ID 6T9N,
subunits A–D), and a GPCR (5-hydroxytrptamine/serotonin receptor;
5-HT1A; PDB ID 7E2X, subunit R), embedded in a phosphatidylcholine (PC;
60%) and cholesterol (40%) lipid bilayer. PC phosphate beads are shown
as gray spheres, cholesterol is shown in QuickSurf representation
in cyan, and proteins are in yellow. Extracellular (EC) and intracellular
(IC) leaflets are labeled. The inset shows corresponding atomistic
(AT) and CG representations of cholesterol with the β3-hydroxyl group (equivalent to the ROH bead at CG resolution) in
red.
Membrane proteins which
bind cholesterol. Coarse-grained (CG) representations
of the structures of a transporter (P-glycoprotein; P-gp; PDB ID 7A65, subunit A), a receptor
(Patched1; PTCH1; PDB ID 6RVD, subunit A), an ion channel (human polycystin-2; PC2;
PDB ID 6T9N,
subunits A–D), and a GPCR (5-hydroxytrptamine/serotonin receptor;
5-HT1A; PDB ID 7E2X, subunit R), embedded in a phosphatidylcholine (PC;
60%) and cholesterol (40%) lipid bilayer. PC phosphate beads are shown
as gray spheres, cholesterol is shown in QuickSurf representation
in cyan, and proteins are in yellow. Extracellular (EC) and intracellular
(IC) leaflets are labeled. The inset shows corresponding atomistic
(AT) and CG representations of cholesterol with the β3-hydroxyl group (equivalent to the ROH bead at CG resolution) in
red.
Methods
Equilibrium Coarse-Grained
MD Simulations
Structures
of human PC2 (PDB ID 6T9N, subunits A–D),[41] PTCH1 (PDB ID 6RVD, subunit A),[42] P-gp (PDB ID 7A65, subunit A),[43] and 5-HT1A (PDB ID 7E2X, subunit R) were obtained from the Protein
Data Bank (PDB). Nonprotein components were removed and loops were
modeled using MODELER 9.20,[44] for Q296–N305
of PC2 (for each subunit) and L608–L732 PTCH1 (using a nine
residue linker as previously described[42]). Proteins were converted to CG resolution using martinize.py[45] with an ElNeDyn 2.2[46] elastic network applied (spring force constant = 500 kJ mol–1 nm–2, cutoff = 0.9 nm). For PC2
the elastic network was applied to each subunit separately.The MARTINI2.2[47] force field was used
to describe all components. Proteins were embedded in a symmetric
1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine
(POPC)/cholesterol bilayer using insane.py[48] (Figure ). The following
cholesterol concentrations were used with the remaining bilayer composed
of POPC: 1, 2.5, 5, 10, 15, 30, and 40% cholesterol. Cholesterol was
modeled using the virtual site parameters.[49] insane.py was also used to solvate the system with MARTINI water[45] before neutralization and addition of ions to
∼0.15 M NaCl. Each replica was independently energy minimized
using the steepest-decent method and equilibrated in 2 × 100
ns steps with restraints applied to the backbone beads.Each
protein was simulated for 5 × 5 μs in each bilayer
composition (7 bilayer compositions × 5 replicates = 175 μs
per system) using the GROMACS 2018 and 2019 simulation packages (www.gromacs.org). A 20 fs time
step was used, and periodic boundary conditions were applied. The
temperature was maintained at 310 K with a V-rescale thermostat[50] and a τT coupling constant
of 1.0 ps. A Parrinello–Rahman barostat[51] was used to maintain the pressure at 1 bar with a τP value of 12 ps and a compressibility of 3 × 10–4 bar–1. Electrostatic interactions were cut off
at 1.1 nm by using the reaction-field method, and Lennard-Jones interactions
were cut off at 1.1. nm by using the potential-shift Verlet method.
Bonds were constrained to their equilibrium values by using the LINCS
algorithm.[52]
Binding Site Identification
Interactions of cholesterol
with each protein were calculated using PyLipID (github.com/wlsong/PyLipID).[53] Cholesterol interaction occupancy
is defined as the fraction of simulation time where any bead of cholesterol
is in contact with any bead of a protein residue, with a 0.55 nm/1.0
nm double cutoff used to define lipid contacts. PyLipID was also used
to identify cholesterol binding sites by using a community analysis
approach to group residues which simultaneously interact with a bound
cholesterol over the course of the trajectories. This method is described
in detail elsewhere[54,55] and has been applied to a number
of recent examples to characterize lipid binding sites and kinetics.[56−58] Since the residue composition of sites A and B varied slightly with
the percent cholesterol present in the bilayer, we selected six residues
from each site, contacts to which were maintained across all cholesterol
concentrations, and used these six residues in our subsequent analysis
(Supplementary Figure 1). The positions
of sites A and B in relation to all identified cholesterol binding
sites are shown in Supplementary Figure 2 for the 40% cholesterol simulations. For 5-HT1A, sites
were defined as six residues in proximity to each of the 10 modeled
cholesterol densities in the structure (Supplementary Table 1).
Binding Saturation Curves
To define
specific interactions
of cholesterol with a membrane protein, we calculated the mean occupancy
(eq ) of the six selected
site residues, as reported by PyLipID,[53] across all cholesterol concentrations. F indicates the number of frames cholesterol is bound
to a given residue, Ft is the total number
of frames, and n indicates the total number of residues;
e.g., n = 6 for interactions with a site. Nonspecific
interactions were obtained by calculating the mean occupancy of residues
which, in the 40% cholesterol system, had interactions within the
30–50% range. Further details regarding definitions of specific/nonspecific
interactions are included in the Supporting Information and Supplementary Figure 3.Binding saturation curves were plotted
with GraphPad Prism 9.0.2 for MacOS (www.graphpad.com). The apparent dissociation constant for cholesterol
binding (Kdapp) was calculated
by fitting the data to eq , assuming site occupancies are a result of specific interactions
at one site on the protein. No constraints were used in calculation
of the Kdapp values. Errors
are reported as the standard error of the mean among five independent
repeats.The concentration
of free cholesterol ([CHOL]free) was
derived from the mean number of cholesterol molecules >0.8 nm from
the protein surface (unbound cholesterol) as a fraction of the total
number of unbound lipids (POPC and cholesterol) across simulations.
Thus, our computational saturation curves circumvent approximations
of free and total ligand pools often used experimentally. Note that
the Bmaxapp values are not
reported here (see the Supporting Information).Convergence analyses were performed by re-running the fitting
protocol
with fewer simulations (Supplementary Figure 4) or reducing the length of the trajectory (Supplementary Figure 5).
Density Analysis
We adapted a previously
described
method used to obtain free energy values for protein–cholesterol
interactions from 2D lipid density profiles observed in simulations.[37,38] The free energy (ΔG) can be then obtained
by comparing the density of cholesterol bound at a specific site (ρsite) to the mean lipid density in bulk (ρbulk) (eq ). R denotes the gas constant in kJ mol–1 K–1 (8.314 × 10–3) and T is the temperature in kelvin.Our method utilizes the same underlying
approach but extends the analysis to three dimensions (density in xyz) as opposed to averaging across the bilayer normal (density
in xy). Full details of processing of the density
data are provided in the Supporting Information, summarized below. Density analysis was performed with the DensityAnalysis
tool implemented in MDAnalysis[59,60] (www.mdanalysis.org) using an
in-house script. Grid dimensions were fixed, and the grid center was
defined as the center of mass of the protein transmembrane domain.
The bin size was 0.1 nm. Three-dimensional ρsite and ρbulk values
were obtained by masking specific regions of the density array (Supplementary Figure 6). These values were then
converted directly to free energy values by using eq .
Potential of Mean Force
Calculations
The setup and
analysis of PMF calculations was assisted by the pmf.py tool (DOI: https://doi.org/10.5281/zenodo.3592318).[34] CG PMF calculations were performed
as described previously[34] in bilayers containing
30% cholesterol. Briefly, a 1D reaction coordinate was generated by
pulling between the cholesterol center of mass and the backbone bead
of a site residue. Windows at 0.05 nm spacing along the reaction coordinate
were simulated for 1 μs each with a 1000 kJ mol–1 nm–2 umbrella potential used to limit cholesterol
movement along this coordinate. Free energy profiles were obtained
by using the weighted-histogram analysis method (WHAM)[61] implemented in GROMACS with 2000 rounds of Bayesian
bootstrapping, discarding the first 200 ns of each window. Further
details are provided in the Supporting Information, and convergence of free energy values is shown in Supplementary Figure 7.
Site Membrane Exposure
The membrane exposure fraction
was defined as the number of lipid contacts within 0.6 nm of a bound
cholesterol divided by the number of total contacts (protein and lipid)
to the site cholesterol, as calculated by using MDAnalysis[59,60] across the simulations.
Results
We set
out to determine if equilibrium MD simulations are able
to not only identify specific interactions of protein with lipids
but also to rank the affinities of different sites, and to evaluate
how well these estimates compare to biased simulations. We also wanted
to assess whether values obtained from simulations were affected by
the lipid concentration in the membrane.Using equilibrium CG
MD simulations, we constructed binding saturation
curves, where the total cholesterol concentration was varied, and
the mean occupancy of six residues in each specified binding site
was determined across the concentrations of free cholesterol. The
idea of this method was to mimic ligand binding assays used experimentally
to produce binding saturation curves.[62,63]To help
with the convergence of these calculations, we chose to
study the lipid cholesterol, which has been demonstrated to have relatively
fast binding and/or dissociation kinetics compared to other lipids
(e.g., anionic lipids such as cardiolipin and phosphatidylinositols)
and hence is more amenable to sampling of multiple sites within a
given simulation.[33] In addition, the thermodynamics
of protein–cholesterol interactions have been extensively studied
in both atomistic and CG simulations, with the use of both biased
and unbiased methods.[64] These free energy
estimates therefore provide a good benchmark against which to compare
our results.Three human integral membrane proteins were selected
to evaluate
our analysis of cholesterol interactions: the ATP-dependent efflux
pump P-glycoprotein (P-gp), the proposed sterol receptor/transporter
protein Patched1 (PTCH1), and the transient receptor potential (TRP)
ion channel polycystin-2 (PC2) (Figure ). In each case, cholesterol has been suggested to
play a role in protein function by either allosteric modulation or
direct involvement in the protein’s biological process.
Comparative
Methods for Determining Cholesterol Binding Affinities
with P-gp
Cholesterol has been shown to alter both the drug
binding properties[65] and ATP-mediated export
rates[65,66] of P-gp. In addition, P-gp localizes in
sphingomyelin/cholesterol enriched regions in the cell,[67,68] further supporting a role for cholesterol in modulating P-gp function.Previously reported CG simulations of cholesterol binding to a
human P-gp homology model observed cholesterol binding to multiple
sites including between TM10/TM12 and TM7/TM8, which were suggested
to have different free energy values (as reported by PMF calculations).[69] The TM7/TM8 site was also observed in atomistic
simulations of mouse P-gp.[70] We used PyLipID[53] (see Methods for details)
to identify two cholesterol binding sites from equilibrium simulations.
Our simulations, initiated from the recently solved human P-gp structure,[43] replicated the two aforementioned cholesterol
binding sites from the homology model simulations. Thus, site A corresponds
to cholesterol bound between TM10/TM12 and site B corresponds to cholesterol
bound between TM7/TM8 (Figure A, Supplementary Figure 1).
Figure 2
Cholesterol
binding to P-gp. (A) Cholesterol interaction sites
A (blue) and B (red), identified using equilibrium simulations (5
× 5 μs at each cholesterol concentration) in PC:Chol 60:40
followed by analysis using PyLipID (github.com/wlsong/PyLipID).[53] The sites are shown mapped onto the
structure of the P-gp (7A65, subunit A) TMD. Residues involved in cholesterol
interactions in the 40% cholesterol simulations are shown as spheres
scaled according to cholesterol residence times. The six residues
selected (which were conserved across all cholesterol concentrations)
are labeled (opaque), whereas the remaining residues constituting
the site (in 40% cholesterol) are transparent. (B) Binding saturation
curves for cholesterol binding to sites A and B across a range of
cholesterol concentrations. Site occupancy was defined as the mean
occupancy of the six site residues in (A). Error bars correspond to
standard deviations. Nonspecific interactions were calculated from
mean occupancies of specified residues with 30–50% occupancy
in the 40% cholesterol simulations. (C) Free energy landscapes from
potential of mean force (PMF) calculations for sites A and B from
simulations in bilayers containing 30% cholesterol. Bootstrapping
errors are shown in gray. (D) Free energies of binding derived from
probabilities of cholesterol bound at site A (blue) or B (red) relative
to the bulk probability calculated from 3D density plots of cholesterol
localized surrounding P-gp.
Cholesterol
binding to P-gp. (A) Cholesterol interaction sites
A (blue) and B (red), identified using equilibrium simulations (5
× 5 μs at each cholesterol concentration) in PC:Chol 60:40
followed by analysis using PyLipID (github.com/wlsong/PyLipID).[53] The sites are shown mapped onto the
structure of the P-gp (7A65, subunit A) TMD. Residues involved in cholesterol
interactions in the 40% cholesterol simulations are shown as spheres
scaled according to cholesterol residence times. The six residues
selected (which were conserved across all cholesterol concentrations)
are labeled (opaque), whereas the remaining residues constituting
the site (in 40% cholesterol) are transparent. (B) Binding saturation
curves for cholesterol binding to sites A and B across a range of
cholesterol concentrations. Site occupancy was defined as the mean
occupancy of the six site residues in (A). Error bars correspond to
standard deviations. Nonspecific interactions were calculated from
mean occupancies of specified residues with 30–50% occupancy
in the 40% cholesterol simulations. (C) Free energy landscapes from
potential of mean force (PMF) calculations for sites A and B from
simulations in bilayers containing 30% cholesterol. Bootstrapping
errors are shown in gray. (D) Free energies of binding derived from
probabilities of cholesterol bound at site A (blue) or B (red) relative
to the bulk probability calculated from 3D density plots of cholesterol
localized surrounding P-gp.Occupancies for both site A and site B increased nonlinearly with
cholesterol concentration, as would be anticipated for well-defined,
saturable binding sites (Figure B). We observe a rapid increase in site A occupancy
compared to site B as cholesterol concentration is increased. The Kdapp of P-gp site A (Kdapp = 0.8 ± 0.2%) is substantially lower
(i.e., has a higher affinity) than that for site B (Kdapp = 16.0 ± 0.8%). This suggests the
cholesterol binding affinities of the two sites on P-gp are not equal,
as is also exemplified by the variability in cholesterol affinities
reported in other studies.[64]We next
performed PMF calculations to validate our observed differences
in site affinities from our binding saturation method. For both sites
we observe defined energetic wells at low reaction coordinate values,
consistent with PMF profiles of other cholesterol binding sites.[64] We obtain free energy well depths of −13
± 2 kJ mol–1 for cholesterol binding to site
A and of −6 ± 2 kJ mol–1 for site B
(Figure C). Our PMF
values are in agreement with the relative affinities of sites obtained
from previous calculations on the P-gp homology model.[69] Thus, both binding saturation and PMF calculations
rank the sites in the same order.We then assessed whether density-based
equilibrium free energy
methods could also be used to observe quantitative differences in
site binding affinities. Interestingly, for our density analysis,
despite a strong difference at very low cholesterol, our free energy
values for site A and site B converge at approximately −10
kJ mol–1 in 40% cholesterol. This suggests some
sensitivity of the method to the lipid concentration chosen for the
simulation (Figure D).
Extending Analysis of Cholesterol Affinities to Other Protein
Examples: PTCH1 and PC2
To test the applicability of our
methods to other membrane proteins, we applied the same protocol described
above in detail for P-gp to two other proteins: the receptor/transporter
PTCH1 and the ion channel PC2. These are described in succession in
the following sections.
PTCH1
Recent structural studies
of PTCH1 have identified
multiple sterol binding sites on the TMD and bound within the extracellular
domain (ECD).[71−75] In addition, novel biochemical and CRISPR-based assays suggest PTCH1
alters the abundance of accessible cholesterol,[74,76] which collectively has led to the growing consensus that PTCH1 may
function as a cholesterol transporter.[77]For PTCH1, cholesterol binding sites were selected because
sterol-like densities have been observed in proximity to both sites
in cryo-EM structures[72,75] (Figure A, Supplementary Figure 1). Site A is localized within a structurally conserved domain
formed by TM2-6 called the sterol-binding domain (SSD). Site B is
situated between TM7/TM12 of PTCH1. In addition to the observed structural
densities, both sites are situated at the exit points of tunnels extending
through the ECD, characterized in previous atomistic simulations,
and are therefore suggested to form local cholesterol binding sites
for coordination of transport between the ECD and membrane.[42]
Figure 3
Cholesterol binding to PTCH1. As in Figure for PTCH1 (6RVD, subunit A). (A) Residues comprising
cholesterol interaction sites A (blue) and B (red) on PTCH1. (B) Binding
saturation curves for cholesterol binding to sites A and B as the
concentration of cholesterol is varied. (C) Free energy profiles from
PMF calculations for cholesterol binding to sites A and B on PTCH1.
(D) Free energies derived from the probability of cholesterol binding
to sites relative to in bulk, obtained using a density-based approach.
For full methodological details see Figure .
Cholesterol binding to PTCH1. As in Figure for PTCH1 (6RVD, subunit A). (A) Residues comprising
cholesterol interaction sites A (blue) and B (red) on PTCH1. (B) Binding
saturation curves for cholesterol binding to sites A and B as the
concentration of cholesterol is varied. (C) Free energy profiles from
PMF calculations for cholesterol binding to sites A and B on PTCH1.
(D) Free energies derived from the probability of cholesterol binding
to sites relative to in bulk, obtained using a density-based approach.
For full methodological details see Figure .We again see a strong difference in the binding saturation curves
for site A (Kdapp = 6.8 ±
0.3%) and site B (Kdapp = 46
± 2%) (Figure B), suggesting that site A has a far higher apparent affinity than
site B, although the difference between sites was somewhat less than
seen in P-gp.This difference is also reflected in our PMF calculations,
from
which we obtain free energy well depths of −18 ± 3 kJ
mol–1 for cholesterol binding to site A and −6
± 1 kJ mol–1 for site B (Figure C). This suggests that we can obtain qualitative
agreement between the ranking of site affinities using binding saturation
curves and density analysis compared to PMFs and that the magnitudes
can be compared between proteins and appear to reflect genuine differences
in site affinities.These differences are reflected in our 3D
density analysis, albeit
with a muted difference between the sites, which at >15% cholesterol
gives values of about −10 kJ mol–1 for site
A and −8 kJ mol–1 (40% cholesterol) for site
B (Figure D).
PC2
A combined cryo-EM and MD study of PC2 identified
cholesterol-like density located between the voltage-sensing-like
domain (VSLD) and the pore helices, which coincided with a cholesterol
binding site seen in CG simulations.[41] In
addition, both PC2 and PTCH1 localize within the primary cilia of
cells where levels of accessible cholesterol are regulated and where
cholesterol has been shown to play roles in initiating intracellular
signaling pathways.[78]As before,
we identified cholesterol binding sites on PC2 and constructed binding
saturation curves. Site A on TM3/TM4 (Figure A, Supplementary Figure 1) has previously been identified from a combined structural
and simulation study.[41] Site B is on the
interface of TM1/TM4. We again see differences in site affinity between
site A (Kdapp =11 ± 1%)
and site B (Kdapp = 49 ±
9%) from our saturation curves (Figure B). The Kdapp for site A is higher than that observed for P-gp or PTCH1.
Figure 4
Cholesterol
binding to PC2. As in Figure for PC2 (6T9N, subunits A–D). For clarity, only
subunit A of the PC2 homotetramer is shown in A, with the pore-lining
helices of PC2 in darker gray compared to the voltage-sensing-like
domain (VSLD). (A) Residues comprising cholesterol interaction sites
A (blue) and B (red) on PC2. (B) Binding saturation curves for cholesterol
binding to sites A and B as the concentration of cholesterol is varied.
(C) Free energy profiles from PMF calculations for cholesterol binding
to sites A and B on PC2. (D) Free energies derived from the probability
of cholesterol binding to sites relative to in bulk, obtained using
a density-based approach. For full methodological details see Figure .
Cholesterol
binding to PC2. As in Figure for PC2 (6T9N, subunits A–D). For clarity, only
subunit A of the PC2 homotetramer is shown in A, with the pore-lining
helices of PC2 in darker gray compared to the voltage-sensing-like
domain (VSLD). (A) Residues comprising cholesterol interaction sites
A (blue) and B (red) on PC2. (B) Binding saturation curves for cholesterol
binding to sites A and B as the concentration of cholesterol is varied.
(C) Free energy profiles from PMF calculations for cholesterol binding
to sites A and B on PC2. (D) Free energies derived from the probability
of cholesterol binding to sites relative to in bulk, obtained using
a density-based approach. For full methodological details see Figure .From PMF calculations we obtain a free energy well depth
of −13
± 3 kJ mol–1 for cholesterol binding to site
A, in agreement with a previously reported value of −12 ±
3 kJ mol–1 for cholesterol binding to this site
obtained by a similar method[41] (Figure C). For site B we
obtain a free energy value of −7 ± 1 kJ mol–1, consistent with differences in site affinities from the saturation
curves.Finally, from the density analysis we observe stabilization
of
the free energy values at >15% cholesterol, corresponding to approximately
−10 and −9 kJ mol–1 for sites A and
B, respectively (Figure D).
Affinities of Multiple Cholesterol Sites
on One Protein
We sought to assess whether the binding saturation
method could be
applied to a membrane protein with several bound cholesterol molecules,
exploiting the method’s ability to obtain multiple Kdapp’s from the same simulation
data set. For this we chose a recent structure of the 5-HT1A GPCR, determined in complex with 10 cholesterol molecules.[29] Here, we used the structurally observed cholesterol
densities to define the position of the binding sites (site IDs as
numbered in ref (29), Supplementary Table 1) and constructed
binding saturation curves for each site.We observe saturable
binding curves for all 10 sites, validating the position of modeled
cholesterol densities in the 5-HT1A structure[29] (Figure A). All except one site (S3) had Kdapp’s ranging between 4 and 9%, similar
to site A Kdapp’s for
P-gp, PTCH1, and PC2. These binding sites could be further separated
into two subcategories: “strong” sites with Kdapp’s of 4–5% (S2, S4, S7, S8, S11) (Figure B, blue; Supplementary Table 1) and “moderate”
sites with Kdapp’s of
8–9% (S1, S5, S9, S12) (Figure B, lime)
which had distinct binding saturation profiles (Figure A). The remaining site (S3) (Figure B, orange) had an
intermediate affinity (Kdapp = 20%) relative to sites A and B of P-gp, PTCH1, and PC2, suggestive
of a “medium” affinity site. Thus, using a single set
of simulations, we are able to rank the respective affinities of the
10 cholesterol molecules bound to 5-HT1A.
Figure 5
Binding saturation method
as applied to 10 cholesterol sites on
5-HT1A. (A) Binding saturation curves for 10 cholesterol
binding sites on 5-HT1A (7E2X, subunit R). Site occupancies were obtained
from the mean occupancy of six residues in proximity to the modeled
cholesterols (Supplementary Table 1), as
obtained using PyLipID.[53] Sites are colored
according to the relative strength as given by the obtained Kdapp values (“strong”,
blue; “moderate”, lime; “medium”, orange).
(B) Structure of apo 5-HT1A used to obtain the binding
saturation curves in (A). Cholesterol molecules are shown as sticks
colored according to the relative site affinity (see (A)) and Kdapp values for each site indicated.
Site IDs (S1–5, S7–9, S11–12) correspond to those in Figure 2f of Xu et al.[29] The modeled phosphatidylinositol is shown in gray stick
representation for reference.
Binding saturation method
as applied to 10 cholesterol sites on
5-HT1A. (A) Binding saturation curves for 10 cholesterol
binding sites on 5-HT1A (7E2X, subunit R). Site occupancies were obtained
from the mean occupancy of six residues in proximity to the modeled
cholesterols (Supplementary Table 1), as
obtained using PyLipID.[53] Sites are colored
according to the relative strength as given by the obtained Kdapp values (“strong”,
blue; “moderate”, lime; “medium”, orange).
(B) Structure of apo 5-HT1A used to obtain the binding
saturation curves in (A). Cholesterol molecules are shown as sticks
colored according to the relative site affinity (see (A)) and Kdapp values for each site indicated.
Site IDs (S1–5, S7–9, S11–12) correspond to those in Figure 2f of Xu et al.[29] The modeled phosphatidylinositol is shown in gray stick
representation for reference.
Discussion
Increasingly, structures and simulations reveal
a range of lipids
bound to sites on the TMDs of membrane proteins.[25,29,32] Nevertheless, challenges with structural
interpretation prevail when attempting to assign meaning to bound
lipids in a biological context and/or for protein function. Ranking
the affinity of lipid sites can aid this interpretation by establishing
which sites may be more relevant/prevalent in a biological context.We compared the affinities of two cholesterol sites on each of
P-gp, PTCH1, and PC2 using equilibrium and biased MD simulations.
Calculating the difference in apparent site free energies (ΔΔGapp) between sites A and B (eq ) reveals good agreement between
the ranking of site affinities derived from PMF calculations and from
our binding saturation method (Figure ), i.e., ΔΔGapp < 0 for both methods. Furthermore, the magnitude of ΔΔGapp between methods agrees well for P-gp and
PC2, which yield ΔΔGapp values
of −4 to −8 kJ mol–1. ΔΔGapp values for PTCH1 differ somewhat between
methods, which we attribute to a site A free energy much greater than
observed for cholesterol binding to other proteins by PMF calculations.[64] Thus, this suggests our binding saturation method
can accurately rank the order and magnitude of site affinities when
compared to robust free energy methods.
Figure 6
Difference
in apparent free energies of binding, site A –
site B. For each protein, ΔΔGapp (site A – site B) is shown, estimated from the difference
in PMF well depth (black; Figures C, 3C, and 4C) and from the difference in ΔGapp = −RT ln Kdapp (light blue; using Kdapp values obtained from fitting the binding saturation
curves in Figures B, 3B, and 4B). PMF
errors were calculated in quadrature (total error = ) since PMFs for sites A and B are independent.
Difference
in apparent free energies of binding, site A –
site B. For each protein, ΔΔGapp (site A – site B) is shown, estimated from the difference
in PMF well depth (black; Figures C, 3C, and 4C) and from the difference in ΔGapp = −RT ln Kdapp (light blue; using Kdapp values obtained from fitting the binding saturation
curves in Figures B, 3B, and 4B). PMF
errors were calculated in quadrature (total error = ) since PMFs for sites A and B are independent.It is worth noting is that we do not see strict
agreement between
all obtained Kdapp and PMF
values. One notable example of this is the P-gp site A, which gives
the strongest Kdapp but not
the highest PMF value. Interestingly, previous PMFs for this system
have been much higher,[69] and more in line
with our Kdapp. This points
to a variance in the PMF calculations depending on the conditions
used, something we do not expect to factor into our Kdapp calculations.The in silico binding saturation method circumvents
two approximations that are routinely applied in equivalent experimental
procedures. First, our cholesterol binding occupancies are specific
to the site of interest, avoiding complications created by conflating
micro and macro dissociation constants, and second, the concentration
of free cholesterol can be directly calculated rather than approximating
to the total cholesterol concentration. This allows for differences
in site affinities to be observed over a range of physiologically
relevant free cholesterol concentrations. We note that care should
be taken when determining site affinities from density-based equilibrium
methods as these appeared to show some sensitivity to the overall
lipid concentration in the membrane (Figures D, 3D, and 4D) and/or the degree of site sampling at low percent
cholesterol.We note that, for PTCH1 and P-gp, the free energy
values obtained
from PMFs were broadly similar at different membrane cholesterol concentrations
(Supplementary Figure 8). This suggests
that the site A free energy value was not affected by whether site
B was occupied (as in 30% cholesterol, Figures C, 3C, and 4C) or unoccupied (in 0% cholesterol, Supplementary Figure 8). Conversely, the PC2
site PMF was considerably higher at 30% cholesterol than at 0% cholesterol.
This is potentially due to allosteric coupling between cholesterol
sites. However, it is unclear how well this could be modeled in CG
simulations with an elastic network model. Alternatively, there may
be direct cooperativity between cholesterol molecules, with overlapping
cholesterol sites. Either way, this makes interpretation of any value
taken from free energy calculations difficult, as the presence or
absence of other cholesterol could impact the finding. The presented
binding saturation method, however, implicitly takes this into consideration.The computational cost and ease of setup are key considerations
if we wish to investigate affinities using high-throughput simulation
methods applied to a wide range of membrane proteins. For the PMFs,
each site on a given protein was simulated for approximately 65 ×
1 μs umbrella sampling windows, in addition to the initial steered
MD simulations (∼0.03 μs), from which those windows were
derived. Thus, approximately 130 μs of CG simulation time was
used in the PMF calculations to derive the two site affinities on
a protein, at a single cholesterol concentration.For the binding
saturation method, the total CG simulation time
for both sites was 175 μs (5 × 5 μs at 7 lipid concentrations)
across all free cholesterol concentrations. The binding saturation
method was surprisingly robust, with as few as two replicas required
to reach Kdapp convergence
and one replica sufficient to observe qualitative differences in site
affinities (Supplementary Figure 4). In
addition, 3 μs per simulation appears to be sufficient for convergence
(Supplementary Figure 5). Therefore, the
total simulation time could be reduced to 21–42 μs and
still yield quantitative differences in site affinities. Furthermore,
the number of residues used to define the high affinity site (site
A) could be reduced from six to one, reducing the amount of user input
required in simulation analysis (Supplementary Figure 9). The lower affinity site (site B) was more sensitive
to the number of site residues, as expected for weaker site binding.
Setting up equilibrium simulations is more amenable to automation
compared to the careful selection of reaction coordinates required
in biased methods, making the former approach suitable for use in
high-throughput pipelines. Crucially, equilibrium methods allow multiple
site affinities to be obtained from the same simulation data set,
meaning that analyses could be extended to many sites in the same
system for the same computational cost. We exemplify this here, to
obtain relative affinities of 10 cholesterol sites on the 5-HT1A receptor. From binding saturation curves, we can group these
sites into three categories corresponding to “strong”,
“moderate”, and “medium” affinity sites
compared to Kdapp’s
of sites on P-gp, PTCH1, and PC2 (Figure ). Performing 10 equivalent PMF calculations
would require ∼650 μs of simulation time, ∼4 times
the equilibrium CG simulation time used here. Thus, the binding saturation
method is a suitable alternative for investigating site affinities,
yielding tractable and accurate results with modest input required
from the user.
What Dictates Differences in Cholesterol Binding Affinities?
We sought to understand whether key structural features between
site A and site B underpin the observed differences in site affinities
(Figure ). For P-gp
and PTCH1, the membrane exposure of the bound cholesterol was lower
for site A than for site B (Figure A). This suggests that the more buried the cholesterol
site is, the higher the observed affinity. That said, the degree of
site exposure to the surrounding membrane was not sufficient to fully
describe differences in site affinity for PC2, where both sites were
similarly buried but the affinities were different. For PC2, the presence
of a polar residue (Q557) in proximity to the hydroxyl (ROH) bead
of cholesterol appears to enhance the affinity of cholesterol binding
to site A (Figure B). Equivalent polar residues are not present in site B (Figure A). A polar residue
was also present in site A of P-gp and that of PTCH1 adjacent to the
cholesterol ROH bead (Figure B). Thus, “strong” cholesterol binding is enhanced
by both the pocket-like nature of a binding cavity and polar residues
in direct contact with the lipid headgroup.
Figure 7
Molecular basis of observed
differences in site affinities.(A)
Fractional membrane exposure of the bound site A and site B cholesterols
for P-gp, PTCH1, and PC2 across simulations. Membrane exposure was
defined as the number of lipid contacts within 0.6 nm of the bound
cholesterol divided by the total number of contacts (protein and lipid)
within 0.6 nm. Error bars indicate the standard error of the mean
across replicates. (B) Binding pose of cholesterol bound to sites
A (blue) and B (red) of P-gp, PTCH1, and PC2, as obtained using PyLipID[53] from our CG simulation data. Site cholesterols
are shown bound to the surface (all beads, white) of the proteins,
indicating differential burial of the site cholesterols. The locations
of polar residues in proximity to the ROH bead of site A cholesterols
are indicated in purple.
Molecular basis of observed
differences in site affinities.(A)
Fractional membrane exposure of the bound site A and site B cholesterols
for P-gp, PTCH1, and PC2 across simulations. Membrane exposure was
defined as the number of lipid contacts within 0.6 nm of the bound
cholesterol divided by the total number of contacts (protein and lipid)
within 0.6 nm. Error bars indicate the standard error of the mean
across replicates. (B) Binding pose of cholesterol bound to sites
A (blue) and B (red) of P-gp, PTCH1, and PC2, as obtained using PyLipID[53] from our CG simulation data. Site cholesterols
are shown bound to the surface (all beads, white) of the proteins,
indicating differential burial of the site cholesterols. The locations
of polar residues in proximity to the ROH bead of site A cholesterols
are indicated in purple.For 5-HT1A, almost all sites contained polar residues
in proximity to the cholesterol hydroxyl, reflected in the low Kdapp values obtained from the binding
saturation curves (Figure , Supplementary Table 1). Higher
affinity sites are more likely to persist during the relatively harsh
purification and solubilization process used to obtain membrane protein
structures by cryo-EM, consistent with the high affinity of sites
observed on 5-HT1A. One site on 5-HT1A (S1) is proposed to stabilize the orthosteric ligand binding
pocket and regulate binding of aripiprazole to the receptor.[29] We obtained a Kdapp value of 8% for this site which, while possessing
a reasonably high affinity, was not the strongest cholesterol binding
site on 5-HT1A. Reduced affinity at S1 may assist
dynamic binding/unbinding of cholesterol to this site compared to
a constitutively occupied, higher affinity, binding site.Perhaps
a more intriguing question is why, from a functional perspective,
membrane proteins might show differences in cholesterol affinities
across their surfaces. Differential site affinities on proteins could
be utilized, e.g., for cholesterol-dependent differences in protein
regulation. PC2 and PTCH1 localize to the primary cilia, where the
abundance of accessible membrane cholesterol is highly regulated.[79] Changes in cilia cholesterol levels coincide
with activation levels of key signaling pathways and to the subcellular
localization of PTCH1.[76,80] In addition, the abundance of
membrane cholesterol within organelles increases between the endoplasmic
reticulum and the plasma membrane.[17] Cholesterol
binding/unbinding to sites could therefore aid in protein trafficking
to its native membrane environment. For example, the dynamic localization
of SNARE proteins within the trans-Golgi network and endosomes is
affected by membrane cholesterol abundance, affecting SNARE recycling
between membranes.[81]One factor not
considered in the study is the ability of other
specific lipids to influence the affinity of a different lipid to
a site. For example, the presence of PIP2 in a complex
membrane environment enhances the affinity of PS binding to the Kir2.2
channel.[54] Additionally, we do not assess
the relative affinity of different lipids binding to the same site
as has been investigated in a recent study of the Kir6.2 channel.[40] Future work will seek to evaluate how changes
in cholesterol concentrations influence site affinities within the
context of more realistic membrane environments and to assess roles
lipid synergy might play in affinity modulation.In summary,
we have evaluated the binding affinities of cholesterol
to two sites on a range of proteins, drawing comparisons between well-established
PMF and equilibrium methods. We describe a novel binding saturation
curve method for obtaining affinities from equilibrium simulations,
intended to imitate experimental binding assays. This method was also
applied to simultaneously probe the affinities of 10 cholesterol binding
sites on a protein, demonstrating how the method could be scaled for
automated and/or high-throughput analysis. The binding saturation
method accurately ranks the order and relative magnitude of site affinities
when compared to PMF calculations and could be readily applied to
study affinities of other lipid/ligand binding events.
Authors: Lucien E Weiss; Ljiljana Milenkovic; Joshua Yoon; Tim Stearns; W E Moerner Journal: Proc Natl Acad Sci U S A Date: 2019-02-28 Impact factor: 11.205
Authors: Wanling Song; Robin A Corey; T Bertie Ansell; C Keith Cassidy; Michael R Horrell; Anna L Duncan; Phillip J Stansfeld; Mark S P Sansom Journal: J Chem Theory Comput Date: 2022-01-12 Impact factor: 6.578