Huafeng Xu1, Timothy Palpant1, Cody Weinberger1, David E Shaw1,2. 1. D. E. Shaw Research, New York, New York 10036, United States. 2. Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, United States.
Abstract
A key step in the emergence of human pandemic influenza strains has been a switch in binding preference of the viral glycoprotein hemagglutinin (HA) from avian to human sialic acid (SA) receptors. The conformation of the bound SA varies substantially with HA sequence, and crystallographic evidence suggests that the bound SA is flexible, making it difficult to predict which mutations are responsible for changing HA-binding preference. We performed molecular dynamics (MD) simulations of SA analogues binding to various HAs and observed a dynamic equilibrium among structurally diverse receptor conformations, including conformations that have not been experimentally observed. Using one such novel conformation, we predicted─and experimentally confirmed─a set of mutations that substantially increased an HA's affinity for a human SA analogue. This prediction could not have been inferred from the existing crystal structures, suggesting that MD-generated HA-SA conformational ensembles could help researchers predict human-adaptive mutations, aiding surveillance of emerging pandemic threats.
A key step in the emergence of human pandemic influenza strains has been a switch in binding preference of the viral glycoprotein hemagglutinin (HA) from avian to human sialic acid (SA) receptors. The conformation of the bound SA varies substantially with HA sequence, and crystallographic evidence suggests that the bound SA is flexible, making it difficult to predict which mutations are responsible for changing HA-binding preference. We performed molecular dynamics (MD) simulations of SA analogues binding to various HAs and observed a dynamic equilibrium among structurally diverse receptor conformations, including conformations that have not been experimentally observed. Using one such novel conformation, we predicted─and experimentally confirmed─a set of mutations that substantially increased an HA's affinity for a human SA analogue. This prediction could not have been inferred from the existing crystal structures, suggesting that MD-generated HA-SA conformational ensembles could help researchers predict human-adaptive mutations, aiding surveillance of emerging pandemic threats.
For a strain of influenza
to become pandemic, a non-human—often
avian—strain must acquire mutations that enable it to infect
and transmit between humans.[1−3] The infection of a host cell by
an influenza virus begins when the glycoprotein hemagglutinin (HA)
on the viral surface binds to HA-specific receptors [i.e., sialic
acid (SA)—containing glycoproteins and glycolipids] on the
host cell.[4] Human transmissibility requires
both that the viral HAs have sufficiently strong affinity for the
human SA receptors on the target cells to establish adequate adhesion,[5,6] and that they have sufficiently weak affinity for respiratory tract
mucins to avoid sequestration and reach the target cells.[7] Human-transmissible strains are associated with
mutations in HA that switch its binding specificity from avian to
human receptors.[8−13] Such mutations typically increase the binding affinity to receptors
whose terminal SA is linked to the penultimate sugar by an α2,6
glycosidic bond (prevalent on epithelial cells in the human trachea,
the site of human infection) and decrease binding affinity to receptors
whose terminal SA uses an α2,3 linkage (prevalent in respiratory
tract mucins and in avian enteric tracts, the latter being the primary
site of avian infection).[14,15]Various distinct
sets of mutations have caused HAs of different
subtypes, which have substantially different amino acid sequences[16] (Figure ), to switch binding preference from avian to human SAs in
past pandemics.[17] Although X-ray crystallography,
biochemical studies, and computational studies[18−20] have shed some
light on the structural and thermodynamic basis of these switching
mutations,[9,21−28] it has been a challenge in studies of past affinity-switching strains
to identify the key sequence differences between subtypes that allowed
a given set of mutations to confer human-SA binding preference to
HA of one subtype but not another, and it remains difficult to predict
which sets of mutations would enable an HA of a given sequence to
switch binding preference in the future.[29]
Figure 1
Multiple-sequence
alignment is shown for the 130-loop, the 150-loop,
the 190-helix, and the 220-loop, which form the binding pocket, in
HAs of different subtypes. (a) Sequence logo representation of HAs
of H1, H2, H3, H5, and H7 subtypes. The height of the letter indicates
the frequency that the corresponding amino acid is observed at that
sequence position. There is a substantial sequence variation in the
binding pocket within a subtype and even more difference across different
subtypes. (b) Sequences of the HAs of the DK76, IN05, and SH13 strains,
and some of the variants studied in this work. Highlighted in red
are the mutations that enabled the HA to switch binding preference
from avian receptors to human receptors. In the H1N1 pandemic strain,
the pair of mutations E190D,G225D in HA led to the switch in binding
preference,[40] whereas in the H2N2 and H3N2
pandemic strains, the pair of mutations Q226L,G228S caused the switch.
The latter pair of mutations can also confer preferential binding
to human receptors when introduced into HA of the H5N1 IN05 strain.[38,39] No HA of H2 or H3 subtypes, however, has been reported to switch
binding preference with the E190D,G225D mutations, and this pair of
mutations has been shown not to switch the binding preference when
introduced into an HA of H5 subtype.[8] Conversely,
no HA of the H1 subtype has been reported to switch its binding preference
by the Q226L,G228S mutations; an additional pair of mutations A227S,P186N,
identified in this work, were required for DK76 HA to do so. The full
sequences of DK76, IN05, SH13, and their mutants studied in this work
are provided in the Supporting Information.
Multiple-sequence
alignment is shown for the 130-loop, the 150-loop,
the 190-helix, and the 220-loop, which form the binding pocket, in
HAs of different subtypes. (a) Sequence logo representation of HAs
of H1, H2, H3, H5, and H7 subtypes. The height of the letter indicates
the frequency that the corresponding amino acid is observed at that
sequence position. There is a substantial sequence variation in the
binding pocket within a subtype and even more difference across different
subtypes. (b) Sequences of the HAs of the DK76, IN05, and SH13 strains,
and some of the variants studied in this work. Highlighted in red
are the mutations that enabled the HA to switch binding preference
from avian receptors to human receptors. In the H1N1 pandemic strain,
the pair of mutations E190D,G225D in HA led to the switch in binding
preference,[40] whereas in the H2N2 and H3N2
pandemic strains, the pair of mutations Q226L,G228S caused the switch.
The latter pair of mutations can also confer preferential binding
to human receptors when introduced into HA of the H5N1 IN05 strain.[38,39] No HA of H2 or H3 subtypes, however, has been reported to switch
binding preference with the E190D,G225D mutations, and this pair of
mutations has been shown not to switch the binding preference when
introduced into an HA of H5 subtype.[8] Conversely,
no HA of the H1 subtype has been reported to switch its binding preference
by the Q226L,G228S mutations; an additional pair of mutations A227S,P186N,
identified in this work, were required for DK76 HA to do so. The full
sequences of DK76, IN05, SH13, and their mutants studied in this work
are provided in the Supporting Information.This difficulty arises in part
because of the conformational heterogeneity
of SAs when bound to different HA variants: although the terminal N-acetylneuraminic acid (Neu5Ac) adopts the same conformation
in all human and avian HA–SA complexes (Figure S1a,b), the linked penultimate galactose (Gal) and
the third sugar—often N-acetylglucosamine
(GlcNAc)—of SAs can assume many different conformations when
bound to different HA variants (Figure S2). Switching mutations in past pandemic strains have primarily affected
HA interactions with Gal and GlcNAc (Figure S1c,d), and it is thus likely that predicting affinity-switching mutations
in HA would require knowing the range of possible binding conformations
of the SA’s terminal three sugars.Furthermore, the GlcNAc
and Gal moieties of human SAs are only
partially resolved in many crystal structures of HA–SA complexes,
hinting that human receptors might remain flexible and adopt a wide
range of conformations while bound to HA. In existing crystal structures,
human SAs adopt a more diverse range of binding conformations across
complexes with different HAs than avian SAs do (Figure S2), suggesting that human receptors may be especially
flexible. It would be difficult, however, to structurally characterize
flexible SA conformations in a given HA–SA complex using X-ray
crystallography. Atomic-level molecular dynamics (MD) simulations
offer a computational route for investigating such binding conformations
as recent developments in the simulation methodology and computer
hardware have enabled MD simulations to successfully model the kinetics
and thermodynamics of protein–ligand binding, including protein–protein
association.[30−32]Here, we report the results of long-timescale
MD simulations[33,34] of binding between a number of
HA variants and analogues of SAs.
In our simulations, we observed that human SAs in complex with HA
remained in a dynamic equilibrium among a diverse ensemble of conformations,
adopting all of the crystallographically determined binding conformations
and multiple novel binding conformations. Based on one of these novel
binding conformations, we predicted the effects of a number of previously
unexamined HA mutations on HA–SA binding affinity, including
a set of mutations that we predicted would substantially increase
the binding affinity of an avian HA for a human receptor analogue.
These predictions, which could not have been made using existing crystal
structures, were then tested using microscale thermophoresis (MST)
experiments. The results of these experiments were consistent with
our predictions. Our results thus suggest that ensembles of binding
conformations generated by MD might be used by researchers to help
predict potential human-adaptive mutations in avian HA, which could
potentially assist in the monitoring of future pandemic threats.
Results
and Discussion
We simulated the binding of α2,3-linked
sialyl lactosamine
(3-SLN, Figure S1e), a trisaccharide analogue
of avian receptors, and of α2,6-linked sialyl lactosamine (6-SLN, Figure S1f), a trisaccharide analogue of human
receptors, to a diverse set of variants (Table S1) of HAs from three different subtypes: A/duck/Alberta/35/1976
(DK76) of the H1 subtype, A/Indonesia/5/2005 (IN05) of the H5 subtype,
and A/Shanghai/2/2013 (SH13) of the H7 subtype. We also simulated
the binding of another human receptor analogue, the pentasaccharide
506-SLN (also known as LSTc), to a subset of the HA variants (we denote
an HA variant by its strain of origin and, in the superscript, its
amino acid mutations from the wild type; a mutation is abbreviated
by the one-letter code of the amino acid in the wild type, followed
by the HA residue number according to the numbering in the H3 subtype,
followed by the one-letter code of the amino acid in the mutant. The
simulations are summarized in Table S7).
The receptor analogues 3-SLN, 6-SLN, and 506-SLN represent the receptors’
terminal sugars that interact directly with HA, and they have been
commonly used in structural and biochemical studies of receptor-HA
binding.In our simulations, 3-SLN and 6-SLN—and 506-SLN
for some
HA variants—spontaneously bound to and unbound from HA (Figures a,b and S3, Movie S1); when
3-SLN, 6-SLN, and 506-SLN were bound to an HA molecule, they recapitulated
the crystallographic binding conformation of Neu5Ac and the interactions
conserved across all HA–SA complexes. In every one of the simulated
HA–SA complexes, the parts of the SAs other than Neu5Ac remained
flexible and adopted a wide range of conformations, including novel
conformations previously unseen in crystal structures of any HA–SA
complexes (the simulated conformations observed for 6-SLN and 506-SLN
will be the focus of our analysis because the primary goal of this
work is to predict HA mutations that may increase affinity for human
receptors).
Figure 2
Receptor analogues reversibly bound to and unbound from HA in MD
simulations, and the kinetic rates can be estimated from the binding
and unbinding times. (a,b) Time traces of the root-mean-square deviation
(RMSD) of Neu5Ac from its bound pose in crystal structures, taken
from individual representative simulations of (a) 3-SLN and (b) 6-SLN
binding to the RBD of the DK76 HA and two of its mutants. The times
of binding (i.e., when the RMSD fell below 1.5 Å) and unbinding
(i.e., when the RMSD rose above 8 Å) are indicated by red and
blue arrows, respectively, and the time intervals when the receptor
analogue was bound are indicated by the green shade under the curve.
(c,d) Kinetic rates, ka and kd, for (c) 3-SLN and (c) 6-SLN binding to different HA
variants, estimated from the MD simulations. Since the rates are plotted
on a logarithmic scale, the difference between ka and kd, indicated by the height
of the gray area, gives the equilibrium association constant. The
rates and the statistical errors are estimated with the assumption
that the binding and unbinding events followed Poisson statistics.
The RMSD time traces of all of the MD simulations used to estimate
the ka and kd values are shown in Figure S3. Absence
of the error bar on ka indicates that
the value is an upper bound, reflecting the fact that no spontaneous
binding event occurred in the corresponding simulations, and that
based on the lengths of the simulations, we can thus estimate that
there is only a P = 0.02 probability that ka is greater than the given value (see the Supporting Information for details).
Receptor analogues reversibly bound to and unbound from HA in MD
simulations, and the kinetic rates can be estimated from the binding
and unbinding times. (a,b) Time traces of the root-mean-square deviation
(RMSD) of Neu5Ac from its bound pose in crystal structures, taken
from individual representative simulations of (a) 3-SLN and (b) 6-SLN
binding to the RBD of the DK76 HA and two of its mutants. The times
of binding (i.e., when the RMSD fell below 1.5 Å) and unbinding
(i.e., when the RMSD rose above 8 Å) are indicated by red and
blue arrows, respectively, and the time intervals when the receptor
analogue was bound are indicated by the green shade under the curve.
(c,d) Kinetic rates, ka and kd, for (c) 3-SLN and (c) 6-SLN binding to different HA
variants, estimated from the MD simulations. Since the rates are plotted
on a logarithmic scale, the difference between ka and kd, indicated by the height
of the gray area, gives the equilibrium association constant. The
rates and the statistical errors are estimated with the assumption
that the binding and unbinding events followed Poisson statistics.
The RMSD time traces of all of the MD simulations used to estimate
the ka and kd values are shown in Figure S3. Absence
of the error bar on ka indicates that
the value is an upper bound, reflecting the fact that no spontaneous
binding event occurred in the corresponding simulations, and that
based on the lengths of the simulations, we can thus estimate that
there is only a P = 0.02 probability that ka is greater than the given value (see the Supporting Information for details).To give confidence that the glycan force field parameters[35] used in this work are sufficient to yield accurate
conformational ensembles of SAs, we simulated the receptor analogues
6-SLN, 506-SLN, and 503-SLN (also known as LSTa), which have been
previously characterized using nuclear magnetic resonance (NMR),[36,37] in aqueous solution, and we compared the nuclear Overhauser effect
and J3-coupling parameters estimated from
these simulations to the NMR measurements. The good agreement between
the computational estimates and the experimental values (Tables S4–S6)—in addition to the
fact that our simulations reproduced all known SA-binding conformations
observed in crystallography—suggests that the receptor conformations
in our simulations are likely realistic.From the binding and
unbinding events in the long-timescale simulations,
the kinetic rates of association (ka)
and dissociation (kd), and the equilibrium
dissociation constant (KD), can be estimated
(Figure c,d and Table ; see the Supporting Information for the estimation method).
Although such predicted KD values have
large statistical uncertainties and may differ in absolute terms from
experimental values (as discussed in more detail below), they are
consistent with the previously observed mutational patterns of human
adaptation: IN05 of the H5 subtype gains affinity to human receptors
by the mutations Q226L,G228S,[38,39] whereas DK76 of H1
subtype does not; the latter switches its binding preference from
avian to human receptors by the mutations E190D,G225D.[40] These results also suggest that our simulations
are able to capture the changes in HA–SA interactions that
result from HA mutations.
Table 1
Equilibrium Dissociation
Constants
(KD) of Different Receptor Analogues Binding
to the IN05 and DK76 HA Variantsa
HA
receptor
KD,MD (10–3 M)
KD,MST (10–3 M)
IN05
3-SLN
20.745.4
N.B.
6-SLN
20074544
N.B.
506-SLN
0.35 ± 0.02
IN05Q226L,G228S
3-SLN
21.23.3
N.B.
6-SLN
201233
0.055 ± 0.004
506-SLN
0.21 ± 0.01
IN05Q226L,G228S,S227A
6-SLN
18296344
1.6 ± 0.1b
DK76
3-SLN
0.10.040.3
0.64 ± 0.06
6-SLN
106.116.5
N.B.
506-SLN
100.0110966
0.088 ± 0.009
DK76E190D,G225D
3-SLN
0.90.51.6
N.B.
6-SLN
0.10.040.3
0.20 ± 0.02
506-SLN
0.08 ± 0.01
DK76Q226L,G228S
3-SLN
502791
N.B.
6-SLN
9037219
N.B.
506-SLN
401985
0.11 ± 0.01
DK76Q226L,G228S,P186N,A227S
3-SLN
20.745.4
0.04 ± 0.03
6-SLN
0.20.070.5
0.06 ± 0.02
506-SLN
301181.5
0.23 ± 0.05
DK76Q226L,G228S,P186N
3-SLN
207.454
6-SLN
63.69.9
N.B.
DK76Q226L,G228S,A227S
3-SLN
103.727
6-SLN
10013.5739
N.B.
Only the RBD of
HA was included
in the simulations, whereas the full HA trimer was used in the MST
assays. The KD results from MST are thus
apparent dissociation constants of the receptor analogue binding to
the HA trimer, whereas those from MD are dissociation constants of
binding to the monomeric RBD. “N.B.” indicates no detectable
binding in MST; blank cells indicate that the corresponding MD simulation
or MST experiment was not performed. For the MD results, the statistical
uncertainties are reported in the superscript and subscript, which
correspond to the upper and lower bounds, respectively, of the 68.3%
confidence interval of the estimates [i.e., at one standard error
in the estimate of ln(KD,MD)].
The KD,MST value for IN05Q226L,G228S,S227A was estimated from fluorescence
quenching as no binding was detectable from thermophoresis.
Only the RBD of
HA was included
in the simulations, whereas the full HA trimer was used in the MST
assays. The KD results from MST are thus
apparent dissociation constants of the receptor analogue binding to
the HA trimer, whereas those from MD are dissociation constants of
binding to the monomeric RBD. “N.B.” indicates no detectable
binding in MST; blank cells indicate that the corresponding MD simulation
or MST experiment was not performed. For the MD results, the statistical
uncertainties are reported in the superscript and subscript, which
correspond to the upper and lower bounds, respectively, of the 68.3%
confidence interval of the estimates [i.e., at one standard error
in the estimate of ln(KD,MD)].The KD,MST value for IN05Q226L,G228S,S227A was estimated from fluorescence
quenching as no binding was detectable from thermophoresis.When the human receptor analogues
(6-SLN and 506-SLN) were bound
to any of the simulated HA variants, the Gal and GlcNAc moieties assumed
a diverse set of conformations. The receptors rapidly interconverted
among structurally dissimilar binding conformations (Figure a), such as the cis and trans
configurations around the glycosidic linkage between Neu5Ac and Gal[9,22] (Table S2). Each human receptor analogue
bound to any individual HA variant visited the majority of all the
binding conformations that have been observed in the crystal structures
of a variety of receptor analogues bound to diverse HA variants of
different subtypes (Figure S4). The binding
conformations of the receptors in our simulations can be divided into
distinct clusters (Figures b and S5–S7; clustering
is described in the “Conformational Clustering” subsection of the Methods Summary). The occupancy in each cluster varied by HA variant (Figures c and S7b) and between 6-SLN and 506-SLN. The dominant binding conformation
in the simulations was not always the crystallographic conformation
of the corresponding HA variant (Figure S5). In addition to the conformations previously observed in crystal
structures, both human receptor analogues also bound to HA variants
in novel conformations (Figures b and S7a). Human receptors
displayed less conformational flexibility when bound to human-adapted
HA mutants than to the corresponding wild-type avian HAs; a given
human receptor bound to a human-adapted mutant predominantly occupied
a single conformation, one which in some cases was only rarely visited
in its complex with the wild-type avian HA.
Figure 3
Receptor analogues adopted
a diverse and dynamic ensemble of conformations
when bound to HA. To characterize these conformations, we projected
each conformation onto five torsional angles: around the three bonds
connecting Gal to Neu5Ac and around the two bonds connecting GlcNac
to Gal (Figure S1c,d). (a) Time trace of
the five torsional angles (see Figure S1f) and the conformational-cluster assignment in one representative
simulation of 6-SLN in complex with DK76, showing the transitions
between different conformations. Discontinuity in time reflects the
intervals when 6-SLN unbound. (b) Binding conformations of the receptors
in our simulations can be divided into eight clusters based on the
three torsional angles ϕ1, ψ1, and
ω, which determine the position of the Gal ring. The receptor
conformation corresponding to the center of each cluster is shown
here in thick sticks. Crystal structures, as shown in thin lines,
are assigned to the cluster with the smallest RMSD between its center
and the crystal structure, provided that the RMSD is also smaller
than 1.5 Å. The conformations in cluster 0 have not yet been
observed in crystal structures. (c) Occupancy of different clusters
of conformations by human receptor analogues bound to different HA
variants. The occupancy varies among HA variants and between 6-SLN
and 506-SLN.
Receptor analogues adopted
a diverse and dynamic ensemble of conformations
when bound to HA. To characterize these conformations, we projected
each conformation onto five torsional angles: around the three bonds
connecting Gal to Neu5Ac and around the two bonds connecting GlcNac
to Gal (Figure S1c,d). (a) Time trace of
the five torsional angles (see Figure S1f) and the conformational-cluster assignment in one representative
simulation of 6-SLN in complex with DK76, showing the transitions
between different conformations. Discontinuity in time reflects the
intervals when 6-SLN unbound. (b) Binding conformations of the receptors
in our simulations can be divided into eight clusters based on the
three torsional angles ϕ1, ψ1, and
ω, which determine the position of the Gal ring. The receptor
conformation corresponding to the center of each cluster is shown
here in thick sticks. Crystal structures, as shown in thin lines,
are assigned to the cluster with the smallest RMSD between its center
and the crystal structure, provided that the RMSD is also smaller
than 1.5 Å. The conformations in cluster 0 have not yet been
observed in crystal structures. (c) Occupancy of different clusters
of conformations by human receptor analogues bound to different HA
variants. The occupancy varies among HA variants and between 6-SLN
and 506-SLN.The above results may partly explain
the difficulty in predicting
human-adapting mutations based on crystal structures. They suggest
that, unlike other examples of specific biomolecular binding (which
are normally associated with a unique, well-defined binding conformation),
a given human receptor binds to each HA variant in a number of distinct
binding conformations. Crystal structures of a receptor in complex
with different HA variants may thus give an indication of the range
of the receptor’s possible binding conformations in a fluid
ensemble, as has been suggested for other biomolecular complexes.[41]Next, we explore whether the heterogeneous
binding conformations
sampled in our MD simulations allow us to predict HA mutations that
might affect binding to human receptors but cannot be inferred from
crystal structures alone, hypothesizing that new favorable interactions
between HA and SA might form in some of the predicted novel conformations.
We use long-timescale MD simulations to estimate, and MST experiments[42] to measure, the effects of the predicted mutations
on the binding affinities between the receptor analogues and HA variants
(the sequences of the HA variants used in the MST experiments are
listed in Table S3).In this work,
we examine the conformation of cluster 0, which we
term the cis/g/tg conformation because its first torsional angles
are ϕ1 ≈ −π/3(cis), ϕ1 ≈ −π/2 (gauche), and ω ≈
π (trans-gauche). In this conformation, which to our knowledge
has not previously been observed experimentally or in simulation,
the receptor exits the binding pocket over the 220 loop (Figure a). The glutamate
at position 190 (E190) forms a hydrogen bond with the O2 hydroxyl of galactose, and in the IN05 and SH13 HA variants, the
serine at position 227 (S227) forms a hydrogen bond with the terminal
hydroxyl of GlcNAc in 6-SLN. These two interactions, which do not
appear in any previously published crystal structure (Figure b), may contribute to the increased
affinity that the mutant IN05Q226L,G228S, from the mammal-transmissible
H5 strain, has for 6-SLN compared to wild-type IN05 (Figure c,d and Table ).[9] We thus predicted
that mutating Ser227 in IN05Q226L,G228S—a position
where no mutation has been previously reported to affect receptor
binding in any HA—to Ala would weaken binding to 6-SLN. Introducing
the mutation S227A in IN05Q226L,G228S indeed reduces its
binding affinity for 6-SLN in our MD simulations (Table ).
Figure 4
Human receptor bound
to HA containing the Q226L,G228S mutations
in a novel conformation not yet seen in X-ray crystallography. (a)
Representative snapshot of the novel conformation of human receptor
binding to an HA containing the Q226L,G228S mutations, taken from
a simulation of 6-SLN bound to IN05Q226L,G228S. 6-SLN is
shown in orange sticks; the characteristic hydrogen bonds in this
conformation are highlighted by green dashed lines. (b) Structures
of human receptor analogues in complex with HAs of identical or very
similar sequences (PDB IDs: 4K64, 4K67, 4BGX, 4CQU, 4CQR, 4BH3, 4KDO). The spheres are
water molecules present in the crystal structures. S227 does not form
any direct or water-mediated hydrogen bonds with the human receptors,
nor does E190 with the galactose, in any of the crystal structures:
in the crystallographic structures that we have examined (listed on
the x-axis of Figure S4), the minimum distance between the side chain of any residue at
position 227 and the bound receptor analog is 4.2 Å, and that
between the side chain of E190 and the O2 of galactose
is 5.2 Å. (c) Probabilities that the hydrogen bonds between S227
and the terminal hydroxyl of GlcNac (top) and between E190 and the
O2 hydroxyl of Gal (bottom) formed in the receptor-HA complex
during our simulations.
Human receptor bound
to HA containing the Q226L,G228S mutations
in a novel conformation not yet seen in X-ray crystallography. (a)
Representative snapshot of the novel conformation of human receptor
binding to an HA containing the Q226L,G228S mutations, taken from
a simulation of 6-SLN bound to IN05Q226L,G228S. 6-SLN is
shown in orange sticks; the characteristic hydrogen bonds in this
conformation are highlighted by green dashed lines. (b) Structures
of human receptor analogues in complex with HAs of identical or very
similar sequences (PDB IDs: 4K64, 4K67, 4BGX, 4CQU, 4CQR, 4BH3, 4KDO). The spheres are
water molecules present in the crystal structures. S227 does not form
any direct or water-mediated hydrogen bonds with the human receptors,
nor does E190 with the galactose, in any of the crystal structures:
in the crystallographic structures that we have examined (listed on
the x-axis of Figure S4), the minimum distance between the side chain of any residue at
position 227 and the bound receptor analog is 4.2 Å, and that
between the side chain of E190 and the O2 of galactose
is 5.2 Å. (c) Probabilities that the hydrogen bonds between S227
and the terminal hydroxyl of GlcNac (top) and between E190 and the
O2 hydroxyl of Gal (bottom) formed in the receptor-HA complex
during our simulations.In our MST measurements
(Table ), 6-SLN has
no detectable binding to the wild-type
IN05 HA, but it reproducibly bound to the Q226L,G228S mutant with KD = 0.055 ± 0.004 mM, confirming that this
pair of mutations increases the HA’s binding affinity for 6-SLN.
When the S227A mutation is added, 6-SLN binding to HA can no longer
be detected by thermophoresis, but the KD value is estimated to be 1.6 ± 0.1 mM based on fluorescence
quenching, a decrease in binding affinity consistent with our predictions
above.To identify new gain-of-binding mutations that exploit
the novel
cis/g/tg conformation, we studied DK76 HA of the H1 subtype, in which
(unlike in IN05) the Q226L,G228S mutations do not improve binding
to 6-SLN (Figures d and 5; Table ). In our simulations, 6-SLN did, however, bind in
the cis/g/tg conformation to DK76Q226L,G228S, though it
occupied this conformation less than it did when binding to IN05Q226L,G228S (Figure c). We predicted that an additional pair of mutations, A227S
and P186N, would establish hydrogen bonds between S227 and GlcNAc
and between N186 and E190 in the cis/g/tg conformation, increasing
the binding affinity of DK76Q226L,G228S for 6-SLN.
Figure 5
MST measurements
of the binding affinities of 3-SLN and 6-SLN to
selected DK76 HA variants are shown. (a) Titration curves for 3-SLN
(left) and 6-SLN (right) binding to DK76 wild-type, DK76E190D,G225D, DK76Q226L,G228S, and DK76Q226L,G228S,P186N,A227S. In the cases of detectable binding, the normalized fluorescence
decreases with the receptor analogue concentration. The blue and orange
points represent two replicate experiments, and the black curves are
fits of the mass-action law that yield the estimated dissociation
constants, KD, in Table . The experimental binding affinities are
compared to the values obtained from MD simulations for 3-SLN (b)
and 6-SLN (c). Binding affinities below the detection limit of MST
are shown in the gray box as rectangles at their respective values
estimated by MD. Binding affinities for DK76 variants are colored
blue; those for IN05 variants, orange. MST and MD are in good agreement
with regard to the effects of the mutations (indicated by the arrows),
with the exception of the DK76 mutations Q226L,G228S,P186N,A227S on
3-SLN binding (indicated by the dashed arrow).
MST measurements
of the binding affinities of 3-SLN and 6-SLN to
selected DK76 HA variants are shown. (a) Titration curves for 3-SLN
(left) and 6-SLN (right) binding to DK76 wild-type, DK76E190D,G225D, DK76Q226L,G228S, and DK76Q226L,G228S,P186N,A227S. In the cases of detectable binding, the normalized fluorescence
decreases with the receptor analogue concentration. The blue and orange
points represent two replicate experiments, and the black curves are
fits of the mass-action law that yield the estimated dissociation
constants, KD, in Table . The experimental binding affinities are
compared to the values obtained from MD simulations for 3-SLN (b)
and 6-SLN (c). Binding affinities below the detection limit of MST
are shown in the gray box as rectangles at their respective values
estimated by MD. Binding affinities for DK76 variants are colored
blue; those for IN05 variants, orange. MST and MD are in good agreement
with regard to the effects of the mutations (indicated by the arrows),
with the exception of the DK76 mutations Q226L,G228S,P186N,A227S on
3-SLN binding (indicated by the dashed arrow).Indeed, in our MD simulations, 6-SLN bound predominantly in the
cis/g/tg conformation to this quadruple mutant DK76Q226L,G228S,A227S,P186N (Figure c), and
the above two hydrogen bonds formed for the majority of the time that
6-SLN was bound (Figure c). The quadruple mutant bound 6-SLN with both increased ka and decreased kd (Figure ) compared
to wild-type DK76, as estimated from the simulated binding equilibrium;
its estimated affinity for 6-SLN increased by a factor of 50 (Table ). We again performed
MST measurements to corroborate the results and found that 6-SLN bound
to the quadruple mutant with KD = 0.06
± 0.03 mM, compared to no detectable binding to the DK76 wild
type, qualitatively consistent with the predicted affinity increase
(Figure and Table ). Our simulations
also predicted, and our MST experiments confirmed, that eliminating
either the A227S or P186N mutation leads to a substantial reduction
in affinity for 6-SLN (Table ), suggesting that both the S227-GlcNAc and the E190-Gal hydrogen
bonds contribute to the increased affinity for the human receptor.The different binding-conformation distributions of 6-SLN and 506-SLN
(Figures c and S5) suggest that HA mutations can have different
effects depending on which receptor analogue they bind to. 506-SLN
in complex with IN05 variants visited a different set of conformations
than 6-SLN did in our simulations (Figure S5), preferred the conformation that is observed in the crystal structure
of 506-SLN in complex with IN05Q226L,G228S,[23] and rarely adopted the cis/g/tg conformation
(Figure c) or formed
the hydrogen bond between E190 and the O2 hydroxyl of Gal
(Figure c). 506-SLN
and complete, natural receptors (unlike the analogue 6-SLN) lack the
terminal hydroxyl on GlcNac and thus cannot form a hydrogen bond,
as a donor, with S227. We thus predicted that the additional mutations
A227S,P186N, which improved the affinity of DK76Q226L,G228S for 6-SLN, would not do so in the case of 506-SLN; our MST measurements
also confirmed this prediction (Table ). Our results are consistent with the experimental
observation that an HA molecule can have different avidities for different
α2,6-linked glycans,[11,12,24] and imply that different HA mutants might engage SA receptors of
different glycan compositions on host cells.[6]We note that the KD values measured
by our MST assays are systematically lower than previously reported
values[9,25] for HA–SA binding. There are important
differences between the experiments that may account for these quantitative
differences: the HA trimer concentrations, for example, are much higher
in NMR experiments (∼30 μM)[25] than that in MST experiments (∼200 nM). At high HA and SA
concentrations, nonspecific SA binding to HA may compete against the
weak specific binding, thus reducing the effective SA concentration
in solution and decreasing the measured binding affinity. Nonspecific
SA binding to the receptor-binding domains (RBDs) occurred in our
MD simulations, as can be seen in Figure S3 as the RMSD hovering (with small fluctuations) around a large value
(>8 Å) for an extended length of time, in some cases (e.g.,
6-SLN
binding to DK76Q226L,G228S and IN05Q226L,G228S,S227A) for cumulatively longer periods than specific binding to the orthosteric
pocket.The buffers also differ between our MST assay and the
previous
experiments: Most notably, Mg2+ is present in our MST buffer
but not in the buffers of the previous experiments, whereas phosphate
is used in the buffers of the previous experiments but not in our
MST buffer. Because SA contains a charged carboxylic acid in Neu5Ac
and multiple polar hydroxyls, it is unsurprising that the difference
in the ionic co-solvents might lead to a significant difference in
the measured affinities. Furthermore, our MST measurements used the
label-free method, which avoids the non-specific labeling of HA that
can potentially interfere with SA binding and result in a lower measured
affinity. Despite the quantitative differences in the absolute KD values for previously characterized mutations,[9] the results agree in the directional changes
in the binding affinities, which are more relevant than absolute KD values for identifying affinity-switching
mutations. We thus believe that our predictions of the affinity-switching
mutations will hold true under different experimental conditions.Also worth noting are the quantitative differences between the
MD-estimated and the MST-measured KD values.
Although our MD simulations were long enough for some of the HA–SA
pairs to yield relatively precise estimates of KD values, they were still too short for pairs with slow binding
or unbinding kinetics to have numbers of binding and unbinding events
sufficient to produce such estimates. The 95% confidence intervals
for the estimated KD values of 3-SLN binding
to DK76 (0.01–0.7 mM) and to DK76Q226L,G228S,P186N,A227S (0.3–14.2 mM), for example, overlap, which can confound the
prediction of the direction of affinity change. Furthermore, our MD
simulations modeled the binding of SA to an HA RBD monomer, whereas
MST measured the binding of SA to the HA trimer, implying that MD
would systematically underestimate KD by
a factor of 3 compared to MST. In addition, the co-solvents in the
MST buffer are not included in the simulations (a common limitation
for MD simulations, as many of the co-solvents and bivalent ions do
not have well-parameterized force field models), which, as discussed
above, could contribute a systematic difference in calculated KD values. Insufficient accuracy of the force
field may introduce additional deviations. Nonetheless, the affinity-switching
mutations identified by the MD-predicted KD changes are largely in agreement with the MST measurements. This
finding suggests that long timescale MD simulations can be used not
only to generate SA-binding conformations for prediction of potential
human-adaptive HA mutations, but also to provide an initial assessment
of the effect of such mutations on the HA–SA binding affinities.A notable feature of the simulations we report is their timescale,
which is orders-of-magnitude longer than previous simulations of HA–SA
complexes.[27,43] Due to their length, our simulations
are able to generate SA-binding conformations not previously reported
in MD simulations or in crystallography, and to give statistically
converged estimates of the relative populations of SA-binding conformations
to a wide range of HA variants, including H1, H5, and H7 subtypes.
This represent new structural and thermodynamic information that can
be used in structure-based prediction of potential affinity-switching
mutations, as illustrated in this work. In addition, our long-timescale
MD simulations enabled the unbiased simulations of SA binding to and
unbinding from HA, yielding estimates of the corresponding kinetic
rate constants, which are difficult to measure experimentally for
these weak complexes.Taken together, our simulations suggest
that a human receptor binds
to a given avian HA in a diverse ensemble of conformations, and that
specificity-switching mutations induce new favorable interactions
in one of these conformations, stabilizing that conformation and promoting
affinity for the human receptor. The conformational diversity of the
bound receptors increases the number of viable mutations, potentially
facilitating the human adaptation of the influenza virus. The conformations
generated by our MD simulations may potentially serve as starting
points for further analyses,[27,43] such as free-energy
calculations[29] and for rational, structure-based
protein design,[44] to help identify influenza
strains on the verge of human transmission.
Methods Summary
Molecular Dynamics
Simulations
The starting structures
of the RBDs of the HA molecules were taken from the corresponding
crystal structures (DK76 from PDB ID: 2WRH; IN05 from PDB ID: 4K64; and SH13 from PDB ID: 4LKG); the mutations
corresponding to different variants were introduced and modeled using
Maestro software.[45] The RBD of DK76 in
our simulation and in the crystal structure 2WRH differ from the
RBD of A/duck/Alberta/35/76 (GenBank accession number: AF091309) by
a deletion of threonine at position 132, outside the binding pocket.
Comparison with a closely related sequence with the additional threonine
at position 132, in the PDB structure 3HTQ, suggests that the additional threonine
creates a bulge outside the binding pocket and does not impact the
pocket’s structure. Structural models of the receptor analogues
(3-SLN, 6-SLN, and 506-SLN) were built using the GLYCAM web server.[46] The Amber99SB-ILDN force field[47] (which builds upon other modifications[48,49] to Amber99[50]) was used for the proteins,
the GLYCAM06[35] force field was used for
the polysaccharides (3-SLN, 6-SLN, and 506-SLN), and the TIP3P[51] water model was used. Isobaric–isothermal
simulations, as described previously,[52] were run at a temperature of 310 K and a pressure of 1 atm, using
Anton-specialized hardware.[34]
Conformational
Clustering
The SA conformations sampled
in the MD simulations were clustered using the density peak[53] method. In the clustering, the distance between
two conformations was calculated as , where
Δ(θ)
= θ + 2π⌊(1 – θ/π)/2⌋
(⌊x⌋ being the floor function that
gives the largest integer no greater than x) wraps
an angle with periodicity 2π into the interval [−π,π).
The glycosidic angles ϕ1, ψ1, and
ω are illustrated in Figure S1f.
Bound conformations from all simulations (Table S7) were included in determining the conformational clusters.
Bound conformations from all simulations for each HA–SA pair
were then used to compute the occupancy of each cluster for the SA
bound to the HA variant. The relative variance (i.e., the variance
of a value divided by the square of the mean value) in the occupancy
estimate is equal to the inverse of the number of times that the SA
transitioned into and out of a conformation; the large number of transitions
observed in our set of MD simulations (see, e.g., Figure a) enabled us to estimate the
occupancy to good precision.
Expression and Purification of HA Proteins
and MST
Trimers of HA variants were produced, and binding
to 3-SLN, 6-SLN,
and 506-SLN was measured, using MST by Crelux (now part of WuXi AppTec)
through contracted research (the experimental details are given in
the Supporting Information.) To eliminate
bias in data interpretation, Crelux was not informed of the predictions
of the MD simulations at any point in the course of the research.
Authors: S J Gamblin; L F Haire; R J Russell; D J Stevens; B Xiao; Y Ha; N Vasisht; D A Steinhauer; R S Daniels; A Elliot; D C Wiley; J J Skehel Journal: Science Date: 2004-02-05 Impact factor: 47.728
Authors: James Stevens; Ola Blixt; Terrence M Tumpey; Jeffery K Taubenberger; James C Paulson; Ian A Wilson Journal: Science Date: 2006-03-16 Impact factor: 47.728
Authors: Xiaoli Xiong; Peter J Coombs; Stephen R Martin; Junfeng Liu; Haixia Xiao; John W McCauley; Kathrin Locher; Philip A Walker; Patrick J Collins; Yoshihiro Kawaoka; John J Skehel; Steven J Gamblin Journal: Nature Date: 2013-04-24 Impact factor: 49.962