Jacob D Durrant1, Robin M Bush2, Rommie E Amaro1. 1. Department of Chemistry & Biochemistry and the National Biomedical Computation Resource, University of California San Diego , La Jolla, California 92093, United States. 2. Department of Ecology & Evolutionary Biology, University of California Irvine , Irvine, California 92697, United States.
Abstract
Deletions in the stalk of the influenza neuraminidase (NA) surface protein are associated with increased virulence, but the mechanisms responsible for this enhanced virulence are unclear. Here we use microsecond molecular dynamics simulations to explore the effect of stalk deletion on enzymatic activity, contrasting NA proteins from the A/swine/Shandong/N1/2009 strain both with and without a stalk deletion. By modeling and simulating neuraminidase apo glycoproteins embedded in complex-mixture lipid bilayers, we show that the geometry and dynamics of the neuraminidase enzymatic pocket may differ depending on stalk length, with possible repercussions on the binding of the endogenous sialylated-oligosaccharide receptors. We also use these simulations to predict previously unrecognized druggable "hotspots" on the neuraminidase surface that may prove useful for future efforts aimed at structure-based drug design.
Deletions in the stalk of the influenzaneuraminidase (NA) surface protein are associated with increased virulence, but the mechanisms responsible for this enhanced virulence are unclear. Here we use microsecond molecular dynamics simulations to explore the effect of stalk deletion on enzymatic activity, contrasting NA proteins from the A/swine/Shandong/N1/2009 strain both with and without a stalk deletion. By modeling and simulating neuraminidase apo glycoproteins embedded in complex-mixture lipid bilayers, we show that the geometry and dynamics of the neuraminidase enzymatic pocket may differ depending on stalk length, with possible repercussions on the binding of the endogenous sialylated-oligosaccharide receptors. We also use these simulations to predict previously unrecognized druggable "hotspots" on the neuraminidase surface that may prove useful for future efforts aimed at structure-based drug design.
There is an urgent
need to better understand the molecular factors
that govern influenza A virus (IAV) virulence. Seasonal (human-adapted)
IAV is a major cause of morbidity and mortality each year, but the
pandemic potential of IAV strains that originate in animal hosts poses
an even more serious threat to public health. There have been four
influenza pandemics in the past century, in 1918, 1957, 1968, and
2009. The precise origin of the 1918 pandemic strain, which killed
roughly 50 million people, is unclear. However, the three subsequent
pandemic strains resulted from the process of genetic reassortment,
in which gene segments from different viruses mix during host coinfection
to produce novel viral strains.[1]There is much concern that a new pandemic strain might result from
reassortment involving the highly pathogenic H5N1 and H7N9 avian influenza
viruses currently circulating in Asia. The majority of these viruses
undergo a deletion in the stalk of the neuraminidase surface protein,
a characteristic that is associated with virulence when causing outbreaks
in domestic poultry.[2] It is thus vitally
important to understand the effect of stalk deletion on viral function.Influenza A strains are named according to the alleles coding for
their two surface antigens, the hemagglutinin (HA) and neuraminidase
(NA) glycoproteins. These alleles are assigned numbers based on the
order of their discovery.[3] Hemagglutinin
(HA, 18 antigenically distinct alleles identified to date) facilitates
viral adhesion to human host cells by binding to cellular sialylated-oligosaccharide
receptors, and neuraminidase (NA, 11 antigenically distinct alleles)
facilitates viral release by cleaving sialic acid linkages. A number
of factors modulate NA and HA activities. For example, hyperglycosylation
and active-site mutation tend to reduce HA/sialic acid binding. NA
activity is generally proportional to the length of the highly variable
NA stalk,[2,4−9] and recent neuraminidase stalk deletions often produce highly pathogenic
avian influenza viruses such as H5N1.[2]Influenza virulence is determined in large part by the balance
between NA and HA activities at the virion surface. Recent viral strains
with stalk-deletion NA glycoproteins (NAdel) are more virulent
because the reduced NAdel activity alters this balance.
For example, the virulence of both a pandemic 2009 H1N1 virus and
a more recent 2013 H7N9 strain was enhanced when deletion mutations
containing 20 and 5 amino acids, respectively, were introduced into
the corresponding NA stalks.[2,4−13] In these strains, reduced NA activity due to a stalk deletion may
enhance viral infectivity by reducing oligosaccharide-receptor cleavage
and thus increasing the number of receptors available for HA binding.
This strategy is particularly advantageous when HA activity is itself
compromised (e.g., due to hyperglycosylation, active-site mutations,
etc.), leading many to express concerns that a current H5N1 strain
might obtain a long-stalk NA through reassortment with a human-adapted
virus. Optimal viral replication is achieved only when the activities
of NA and HA are ideally complementary.[2,4−7,9]The mechanism by which reduced
NA stalk height impacts NA activity
remains uncertain. One prevailing theory suggests that NAdel glycoproteins have reduced sialidase activity relative to wild type
(NAwt) because their diminished height hinders access to
cellular sialylated-oligosaccharide receptors (i.e., the “limited
access” theory).[4,8,9] The
implication is that a towering canopy of HA effectively buries the
short-stalk NAdel such that cell-bound oligosaccharides
simply cannot reach the NA enzymatic pockets. While there is merit
to the limited-access theory, especially in cases where the stalk
length is profoundly reduced, experiment and computational modeling
suggest it may not fully explain the reduced NAdel activity.We hypothesize that, in addition to any limited-access effects,
NA stalk length also reduces NA enzymatic activity by altering the
binding affinity of sialic acid, the native substrate, to the enzymatic
pocket. Given that the stalk is roughly 40 Å from the active
site, we propose an indirect mechanism by which stalk length impacts
binding-pocket dynamics at a distance. Other studies have demonstrated
that the NA enzymatic pocket is highly flexible and so samples many
conformations. For example, X-ray crystallography and molecular dynamics
simulations of the isolated neuraminidase catalytic domain suggest
that key loops surrounding the active site (i.e., the 150- and 430-loops)
are highly mobile, permitting large cavities to form immediately adjacent
to the main pocket.[14−19] Given that the enzyme is known to adopt a variety of conformational
states, it is reasonable to hypothesize that stalk length may have
a quasi-allosteric impact on ligand binding affinity by shifting the
binding-pocket conformational population toward a subtly different
distribution.[20−23]To properly study the impact of stalk length, we built a full
model
of 2009 H1N1neuraminidase as well as a model that included a 20 amino
acid stalk-deletion mutation similar to one known to enhance virulence
in a recent strains.[24] Our approach was
inspired by Blumenkrantz et al.,[25] who
replaced the long-stalk NAwt of a 2009 H1N1 pandemic virus
with a short-stalk H5N1NAdel. They attributed the resulting
decrease in transmissibility at least in part to a reduction in the
ability of the short-stalk NAdel to cleave tethered substrates
and penetrate the host mucus barrier. Rather than exactly replicating
the work of Blumenkrantz et al., who inserted a foreign NA (from H5N1)
into a H1N1 virus, we made our deletion directly in the H1N1NAwt stalk itself. Our models were constructed using data from
both X-ray crystallography (used to model the enzymatic head) and
also protein–protein docking and modeling (used to model the
stalk, transmembrane, and intravirion domains, which have never been
crystallized). All NA models were embedded in complex-mixture lipid
bilayers (Figure ).
Figure 1
Equilibrated
neuraminidase models used in the current study: (A)
wild-type (long-stalk) NAwt and (B) stalk-deletion NAdel. The four domains of the glycoprotein are indicated.
Equilibrated
neuraminidase models used in the current study: (A)
wild-type (long-stalk) NAwt and (B) stalk-deletion NAdel. The four domains of the glycoprotein are indicated.Subsequent atomistic molecular
dynamics simulations of these systems
support the theory that the NA stalk length impacts the dynamics of
the NA binding pocket and therefore the affinity for the endogenous
substrate. Furthermore, these simulations reveal previously uncharacterized
druggable “hotspots” that may open up additional avenues
for structure-based drug design. Previous simulations have included
only the NA head; our models, which include additional NA domains
as well as the associated lipid bilayer, allow us to take the next
steps toward more realistic atomic simulations of viral components
Materials
and Methods
Constructing the Neuraminidase Models
To generate a
complete model of long-stalk (wild-type) apo NAwt, a monomer
of the tetrameric NA head was built by homology using Schrödinger’s
Prime and the NA sequence of the 2009 H1N1 pandemic reference strain
A/California/04/2009 (GenBank accession FJ966084), which codes for
a NA protein identical to that of the A/England/195/09 virus used
by Blumenkrantz et al.[25] Default Prime
parameters were used. The monomeric homology model, based on the 3NSS
crystal structure,[26] was missing a single
lysine residue on the C terminus, which was added manually. The appropriate
tetramer was formed by aligning copies of the homology-model monomer
to chains A–D of the 2HU4 crystal structure[27] using MultiSeq,[28] as implemented
in VMD.[29] The resulting tetrameric head
was then processed with Schrödinger Maestro’s Protein
Preparation Wizard and subjected to repeated minimization in the Maestro
environment. Calcium cations were added using the 3CL0 structure[30] as a guiding template. Coordination was manually
inspected and corrected when necessary.In contrast, the stalk
and transmembrane domains were built entirely de novo. Both SSPro
4.5,[31] a neural-network-based algorithm
for predicting secondary structure from primary sequence, and ABTMpro,[32] a SVM algorithm for determining whether a given
transmembrane segment is α helical or β barrel, suggested
that the stalk has an α-helical structure. We therefore used
the Amber XLEAP module to generate a single α helix with the
appropriate long-stalk sequence.RosettaDock 3.4[33−37] was used to assemble four identical copies of this helix into a
four-helix bundle, one helix at a time. At each step, ∼100 000
dockings were performed. The top 200 best-scoring structures were
clustered by root-mean-square distance (RMSD) using cutoffs ranging
from 0.75 to 3.0 Å, and representative dockings were examined
visually. The top representative docked pose that positioned the newly
introduced helix in the expected parallel conformation was retained.
When no such pose was identified by clustering, all docked poses were
examined sequentially, starting with the pose that had the highest
score, until an acceptable pose was identified. The resulting four-helix
bundle was then processed with Schrödinger Maestro’s
Protein Preparation Wizard and subjected to repeated minimization
in the Maestro environment.Each monomer of the NAwt model contained two cysteine
residues. We assumed that the cysteine residues located in the extra-virion
region of the stalk likely participate in interchain disulfide bonds,[38] but that the cysteine residues located in the
transmembrane region likely do not. The appropriate bonds were built
in Maestro. The tetrameric-head and stalk/transmembrane-region models
were next positioned near each other using VMD. The combined models
were subsequently loaded into Maestro and again subjected to multiple
minimizations.Finally, Maestro was used to build and position
the residues of
the short intravirion topological domain. Based on the assumption
that this domain lies at the interface of the inner membrane leaf
and bulk intravirion water, all hydrophobic side chains were manually
pointed upward toward the bilayer, and all polar side chains were
pointed downward.To generate a model of apo NA with a virulence-enhancing
stalk
deletion (hereafter called NAdel), the appropriate mutant
amino acid sequence was first identified.[24] This sequence suggested that the stalk residues Cys49 to Asn68 should
be deleted. These residues were manually removed from the NAwt model, and the flanking NA segments were juxtaposed and fused using
VMD and Schrödinger’s Maestro, followed by multiple
minimizations as needed.
Constructing the Bilayer
Our efforts
to construct a
complex-mixture lipid-membrane model were guided by a recent paper
describing the composition of the influenza viral envelope.[39] We identified and classified all listed membrane
components with mol % > 1.5, taking care to keep the composition
of
the inner and outer leaflets separate. As our ultimate goal was to
perform molecular dynamics simulations of this system, we were limited
to the lipids of the CHARMM 36 force field.[40,41] Consequently, we had to substitute some experimentally identified
lipids with available lipid models. 3-palmitoyl-2-oleoyl-d-glycero-1-phosphatidylethanolamine (POPE) was used for any phosphatidylethanolamine
(PE) variant, 2,3-distearoyl-d-glycero-1-phosphatidylserine
(DSPS) for phosphatidylserine (PS), 3-palmitoyl-2-oleoyl-d-glycero-1-phosphatidylglycerol (POPG) for the Forssman glycolipid
hapten, and 3-palmitoyl-2-oleoyl-d-glycero-1-phosphatidylcholine
(POPC) for sphingomyelin (SM). Cholesterol and phosphatidylcholine,
also major components of the envelope membrane, are included in the
CHARMM 36 force field.[40] Ultimately, an
appropriate planar bilayer model was built using CHARMM-GUI.[42,43] The NA models were then positioned within this generated bilayer,
and clashing lipid residues were deleted with PyMolecule.[44]In preparation for molecular dynamics
simulations, we resolved multiple steric clashes in both the long-stalk
and stalk-deletion systems using serial iterations of minimization
and geometry optimization in Schrödinger’s Maestro,
coupled with manual atomic and molecular manipulation in VMD.[29]
Solvation and Parameterization
Partial
charges were
assigned to all atoms according to the specifications of the CHARMM22[45,46] all-hydrogen force field for proteins and the CHARMM36[40,41] all-hydrogen force field for lipids. The VMD[29] plug-in cionize(47) was used to position sodium and chloride ions as required to bring
the systems to electrical neutrality and to simulate a 20 nM solution.
Limited manual adjustments to the positions of some ions that had
been placed far from any protein or lipid molecule were made.For both the NAwt and NAdel systems, Amber’s
XLEAP module[48] was used to generate a water
box extending 15 Å beyond any atom. Water molecules were subsequently
removed if they were not positioned directly above the bilayer, so
that the 15 Å margin was ultimately retained only along the Z
axis, perpendicular to the bilayer itself. The NAwt and
NAdel systems had 366 293 and 351 699 atoms,
respectively.
Molecular Dynamics Simulations
PSFGEN
was used to fully
parametrize each system according to the CHARMM22,[45,46] CHARMM36,[40] and TIP3P[49] force fields for proteins, lipids, and water molecules,
respectively. NAMD 2.9[50] running on the
Stampede supercomputer at the Texas Advanced Computing Center was
used for all MD simulations. Periodic boundary conditions were employed
with the particle mesh Ewald method to account for electrostatic effects
(smoothing cutoff: 14 Å). Because of observed instabilities when
our standard minimization protocols were used, we subjected the initial
NAwt system to only 1000 steps of unrestrained conjugate
gradient minimization, with the first and max initial steps for the
line minimizer reduced to 1.0 × 10–20 and 1.0
× 10–10, respectively. Minimization of the
NAdel system was more straightforward. The structure was
minimized in four distinct steps: hydrogen atoms were first relaxed
for 5 000 steps; hydrogen atoms, water molecules, and ions
were next relaxed for 5 000 steps; hydrogen atoms, water molecules,
ions, protein side chains, and lipid tails were then relaxed for 10 000
steps; and, finally, all atoms were relaxed for 25 000 steps.Following minimization, each system was subjected to a constrained
equilibration using an NPT ensemble at 310 K. The
equilibration was realized in four steps, using stepwise harmonic-constraint
force constants of 4, 3, 2, and 1 kcal/mol/Å2 on the
protein backbone. For each force constant (1 fs time step), 250 000
steps of MD simulation were performed. Langevin dynamics were applied
to maintain the temperature, and a modified Langevin piston Nosé–Hoover
thermostat was used to maintain 1 atm pressure.Each system
was next subjected to an unconstrained equilibration.
The NAwt model underwent an additional ∼5 million
2 fs steps of unconstrained simulation, followed by another ∼21
million 1 fs steps. The NAdel model underwent an additional
∼5.5 million 2 fs steps of unconstrained simulation, followed
by another 36.5 million 1 fs steps.Satisfied that both systems
were sufficiently equilibrated, five
distinct 100 ns productive runs were performed for each (∼100
million 1 fs steps) using distinct random seeds in order to sample
diverse protein configurations. Snapshots were saved every 50 ps for
subsequent analysis. (5 simulations) × (100 ns per simulation)
× (2 systems) × (4 monomers per system) = 4 μs of
monomeric simulation total. 5000 and 7000 frames were extracted from
the last 50% and 70% of the tetrameric NAwt and NAdel simulations, respectively, for subsequent analysis.For comparison purposes, we also analyzed an archived simulation
of the isolated apo NA head (hereafter called NAhead),
described previously.[14] In brief, the NA
head (PDB ID: 3NSS(26)) was simulated for 100 ns using the
AMBER99SB force field, yielding 400 ns of monomeric trajectory. For
the current study, we extracted 5000 frames from this simulation for
subsequent analysis.
Root-Mean-Square-Distance Analysis of Binding-Pocket
Residues
We used root-mean-square-distance (RMSD) calculations
to judge
the geometric similarity between the primary NA enzymatic pockets
of crystallographic holo and simulated apo models. We first identified
active-site residues as those that came within 5.0 Å of a crystallographic
oseltamivir ligand (PDB ID: 2HU4:A, H5N1). These residues are Arg118, Glu119, Asp151,
Arg152, Arg156, Trp178, Ser179, Ile222, Arg224, Glu227, Ser246, Glu276,
Glu277, Arg292, Asn294, Tyr347, Gly348, Arg371, and Tyr406. For the
purposes of our RMSD analysis, we discarded Tyr347 because that tyrosine
is an asparagine in H1N1NA. Hydrogen atoms were also removed prior
to performing all RMSD calculations; RMSD alignment was performed
only on the remaining heavy atoms.
Principal Component Analysis
We performed principal
component analysis (PCA) on these same active-site heavy atoms. After
RMSD aligning the NAwt trajectories by these atoms using
VMD, we used the python modules scikit-learn[51] and numpy[52−54] to calculate their first two principal components
and to project each trajectory frame onto those components. The frames
of the NAdel trajectory were similarly projected onto the
same two NAwt principal components to facilitate comparison.
Volumetric Analysis
The POVME algorithm[55] was used to measure the volume of the primary
sialic-acid-binding pocket over the course of the NAwt,
NAdel, and NAhead simulations. To calculate
the volume of the binding pocket for a given trajectory frame, we
monitored the space within a pocket-centered sphere of radius 16.0
Å. In all cases, the pocket volume was calculated from the portion
of the sphere that was unoccupied by protein atoms. Only volumetric
regions contiguous with the pocket were included. Grid spacing and
padding parameters were set to 1.0 and 1.09 Å, respectively.
Root-Mean-Square Fluctuations
All-atom, residue root-mean-square
fluctuations (RMSF)[56] were calculated for
each monomer of each simulation. For the NAwt and NAdel simulations, there were 20 associated monomeric simulations
(5 runs × 4 monomers). For the NAhead simulation,
there were 4 associated monomeric simulations (1 run × 4 monomers).
In all cases, the monomeric trajectories were first aligned by the
α carbons of residues Ser90 to Asp469 (i.e., the residues of
the NA head). The residue center-of-mass RMSF values were calculated
for each monomeric simulation separately.
Identifying Druggable Hotspots
To identify druggable
hotspots, we first identified representative protein conformations
from the MD simulations. The heads of the aligned monomeric trajectories
were subjected to RMSD clustering,[57] as
implemented in the GROMACS computer package.[58] We adjusted the RMSD cutoff until the trajectory frames were separated
into five clusters. The required cutoffs were 1.72, 1.74, and 1.74
Å for the NAwt, NAdel, and NAhead simulations, respectively. The centroid members of each of these
clusters were considered representative of the corresponding simulated
model.The 15 representative NA structures (3 models ×
5 cluster centroids) were then submitted to the FTMAP server[59,60] to identify druggable hotspots. To simplify the analysis, all the
docked molecular probes associated with each simulation (NAwt, NAdel, and NAhead) were considered collectively.
To identify druggable hotspots that were specific to a given model,
the associated docked probes were superimposed, and any probe atom
associated with one model that came within 5 Å of any probe atom
associated with the other was deleted. The remaining probe atoms congregated
at hotspots that were unique to their respective models.
Results
and Discussion
The “Limited-Access” Theory
As described
in the Introduction, the limited-access theory
holds that stalk-deletion neuraminidase glyocproteins (NAdel) have reduced sialidase activity relative to NAwt because
their diminished height hinders access to cellular sialylated-oligosaccharide
receptors. The fact that NA enzymatic activity varies depending on
ligand bulkiness is perhaps the most compelling evidence in favor
of this theory. Studies have shown that short-stalk NAdel glycoproteins have significantly reduced activity when a bulky substrate
(e.g., fetuin) is used,[61] presumably because
the large substrate cannot physically reach the NA enzymatic sites.
In contrast, NA activity is largely stalk-independent when cleaving
the small molecule MUNANA (essentially a single sialic acid with a
4-methylcoumarin fluorophore attached to the 2-carbon),[62,63] apparently because this smaller substrate is not subject to the
same large-scale steric hindrance.However, several lines of
evidence suggest that the limited-access theory may not fully explain
the reduced NAdel activity. First, MUNANA is very different
than the endogenous sialylated-oligosaccharide receptor. Binding to
these receptors, which are extended, large, and cell bound, is likely
diffusion limited; in contrast, MUNANA is a relatively small molecule
with an affinity that is likely little dependent on kon. While MUNANA may be a fine substrate for measuring
relative NA enzymatic activity in the context of a small-molecule
screen, it is arguably an inadequate surrogate for the actual receptor.Second, fetuin (α-2-HS-glycoprotein) is also very different
than the endogenous receptor, and the theory that its bulk alone prevents
it from accessing the NAdel enzymatic sites is hardly certain.
We note that all available monoclonal antibodies known to target NA
antigenic sites bind equally well to virions with NAdel and NAwt glycoproteins,[61] despite
the fact that antibodies are much larger than fetuin (∼160
and ∼1000 kDa for IgG and IgM, respectively, vs ∼40
kDa for fetuin).[61] This evidence suggests
that access to NAdel is not terribly hindered. Furthermore,
recent cryo-electron microscopy studies of whole-virion particles
demonstrate that the NA glycoproteins are not evenly interspersed
among the more prevalent HA’s as the limited-access theory
suggests; rather, the NA particles tend to cluster together on the
viral surface.[64] Even when stalk mutations
substantially reduce the NA height relative to HA, this clustering
effect alone should in theory preserve much access to the NA enzymatic
head.Computational evidence likewise suggests that simple steric
considerations
are not solely responsible for reduced NAdel activity.
Our models of 2009 H1N1 glycoproteins demonstrate that each stalk
amino acid contributes only ∼1.2 Å to the total NA height
(data not shown). The 2013 H7N9 five-amino-acid stalk deletion with
enhanced virulence mentioned in the Introduction therefore reduced the NA height by only ∼6 Å (roughly
4%).[13] More substantial stalk deletions
do not necessarily force the NA enzymatic head to retreat entirely
behind a towering canopy of HA glycoproteins either. In our models
of H1N1 HA and NA, we found the height of NAwt was nearly
equal to that of HA (149 vs 154 Å, respectively). When a virulence-enhancing
20 amino acid NA stalk deletion[24] was modeled
(NAdel), the height of the NA fell to only 125 Å,
still 81% of the HA height. Given that the NA head is itself a boxlike
structure 80 Å square, the resulting 24 Å height reduction
seems unlikely to substantially interfere with access.We suggest
that, beyond limited-access effects, actual differences
in the affinities of NAwt and NAdel for the
endogenous sialylated-oligosaccharide receptors contribute to the
differences in enzymatic activity. As support for our reduced-affinity
hypothesis, we analyze atomistic molecular dynamics simulations of
three NA models: five 100 ns simulations of 2009 H1N1NAwt; five 100 ns simulations of 2009 H1N1NA with a 20-amino-acid deletion
in the stalk (NAdel[24]); and
one 100 ns simulation of the isolated H1N1NA head, NAhead (Figure ). These
simulations indicate that stalk length does in fact impact the dynamics
of the enzymatic sialic acid binding pocket, suggesting a potential
mechanism for the proposed stalk-length-dependent differences in affinity.
To
verify that the simulations of the three NA systems (i.e., NAwt, NAdel, and NAhead) had properly equilibrated,
we RMSD aligned each tetrameric trajectory by its Cα atoms using VMD.[29] The RMSD was then
calculated between each trajectory frame and a corresponding reference
structure (Figure S1). Despite extensive
pre-equilibration, the RMSD values of the whole-glycoprotein NAwt and NAdel simulations continued to drift slightly
throughout the productive runs.To further investigate the source
of this drift, we calculated separate Cα RMSD values
for the residues of the NA head (Figure ), the lower (membrane-embedded) stalk, and
the upper stalk. The dynamics of the head and lower stalk were in
fact stable in both the NAwt and NAdel simulations.
It was the upper stalk that was dynamically unstable. Given that the
upper stalk serves as a linker connecting two fairly ridged bodies
(the lower stalk and the head), it is reasonable to suppose that this
region is legitimately hypermobile.
Figure 2
RMSDs of all trajectory monomers to their
respective reference
structures. In all cases, only the Cα of the head
domain (i.e., the Cα of all residues homologous to
the 388 residues present in the 3NSS crystal structure) were considered.
(A) NAwt (5 tetrameric trajectories × 4 monomers per
tetramer = 20 runs); (B) NAdel (5 tetrameric trajectories
× 4 monomers per tetramer = 20 runs); and (C) NAhead (1 tetrameric trajectory × 4 monomers per tetramer = 4 runs).
RMSDs of all trajectory monomers to their
respective reference
structures. In all cases, only the Cα of the head
domain (i.e., the Cα of all residues homologous to
the 388 residues present in the 3NSS crystal structure) were considered.
(A) NAwt (5 tetrameric trajectories × 4 monomers per
tetramer = 20 runs); (B) NAdel (5 tetrameric trajectories
× 4 monomers per tetramer = 20 runs); and (C) NAhead (1 tetrameric trajectory × 4 monomers per tetramer = 4 runs).Subsequent analyses (described
below) focused primarily on the
residues of the NA binding pocket, located on the enzymatic head.
It is therefore encouraging that the dynamics of this region equilibrated.
Given the instability of the upper stalk, however, we chose to discard
the initial 50% and 30% of the NAwt and NAdel trajectories, respectively, to ensure that subsequent analysis focused
on the most equilibrated portion of the simulations. In contrast,
NAhead, based on the 3NSS crystal structure,[26] equilibrated far faster. We therefore discarded
only the first 20% of the NAhead simulation (Figures and S1).
RMSD Analysis: Judging Binding-Pocket Dynamics
We next
sought to characterize the effects of the stalk on binding-pocket
dynamics by comparing the geometries of our simulated apo binding
sites to that of holo (ligand-bound) NA. We calculated the RMSD values
between the simulated binding-pocket conformations and that of a reference
crystallographic structure, 2HU4[27] (H5N1),
which shares in common with H1N1 a total of 18 out of 19 binding-pocket
residues. In calculating these RMSD values, the single differing residue
was ignored.A histogram of the RMSD values suggests that the
long-stalk NAwt pocket exists in two conformational states
(Figure a), with RMSDs
to the apo (reference) conformation of roughly 1.8 and 2.7 Å,
respectively. In contrast, the pocket conformations sampled during
the NAdel and NAhead simulations were not binary,
suggesting the pocket is less flexible in these systems.
Figure 3
Dynamics of
the sialic-acid-binding pocket. (A) Simulated apo binding-pocket
conformations are compared to the crystallographic (2HU4) holo conformation.
Geometric similarity was judged by calculating the RMSDs between simulated
and crystallographic pocket residues. The NAwt pocket assumes
two distinct conformational states. (B) Dynamics of the arginine triad.
Geometric similarity was judged by calculating the RMSDs between simulated
and crystallographic triad conformations. Note that the NAwt simulation again assumes two conformational states. (C) Representative
NAwt arginine-triad conformations sampled from the 2.4
Å state (in yellow) and the 3.4 Å state (in red). The crystallographic
position of the triad residues is shown in thick, white licorice.
An oseltamivir molecule has been positioned within the binding pocket
for reference (in orange), though the simulations did not include
any ligand. (D) The simulated conformations of NAwt and
NAdel active-site heavy atoms, in black and red, respectively,
projected onto the first two principal components of the NAwt simulation. The NAwt site assumes a greater variety of
conformations, in harmony with the RMSD analysis.
Dynamics of
the sialic-acid-binding pocket. (A) Simulated apo binding-pocket
conformations are compared to the crystallographic (2HU4) holo conformation.
Geometric similarity was judged by calculating the RMSDs between simulated
and crystallographic pocket residues. The NAwt pocket assumes
two distinct conformational states. (B) Dynamics of the arginine triad.
Geometric similarity was judged by calculating the RMSDs between simulated
and crystallographic triad conformations. Note that the NAwt simulation again assumes two conformational states. (C) Representative
NAwtarginine-triad conformations sampled from the 2.4
Å state (in yellow) and the 3.4 Å state (in red). The crystallographic
position of the triad residues is shown in thick, white licorice.
An oseltamivir molecule has been positioned within the binding pocket
for reference (in orange), though the simulations did not include
any ligand. (D) The simulated conformations of NAwt and
NAdel active-site heavy atoms, in black and red, respectively,
projected onto the first two principal components of the NAwt simulation. The NAwt site assumes a greater variety of
conformations, in harmony with the RMSD analysis.An arginine triad formed by Arg118, Arg292, and Arg371 is
a key
feature of the enzymatic pocket. This triad forms a positively charged
arginine nest that surrounds the negatively charged carboxylate groups
of the endogenous substrate as well as all FDA-approved NA inhibitors.
As with the whole pocket, the dynamics of this key triad are similarly
dependent on stalk length. The average RMSD between the atom positions
of the simulated apo triad configurations and the holo 2HU4[27] conformation were 2.5 and 2.4 Å for the
NAwt and NAdel simulations, respectively. A t-test led us to reject the null hypothesis that these two
averages are statistically equal (p = 0.0).A histogram of the arginine-triad RMSD values suggests that, unlike
the NAdel triad, the NAwt triad again assumes
two distinct conformations (Figure b). To further characterize these two NAwt states, we visually inspected representative conformations from
each (Figure c). In
the state with RMSD = ∼ 2.4 Å, Arg371 is near its crystallographic
conformation, and Arg292 is displaced only slightly in the direction
opposite of Arg371. In contrast, in the state with RMSD = ∼3.4
Å, Arg371 is generally displaced in the direction of Arg118,
and Arg292 is even further from Arg371. Arg118 possesses roughly the
same conformation in both states.To further verify that the
NAwt pocket assumes a greater
variety of conformational states than the NAdel pocket,
we next performed a principal component analysis (PCA) of the NAwt active-site heavy atoms (Figure d). Pocket conformations from both the NAwt and NAdel simulations were projected onto the
first two principal components. The NAwt pocket (in black)
sampled the same region of PCA space as the NAwt simulation
(in red), but the NAwt pocket additionally sampled other
conformations not seen in the NAdel simulation, confirming
that the NAwt pockets adopted a greater number of conformational
states.
Stalk-Length Effects on the Movements of Specific Binding-Pocket
Residues
Having studied the impact of stalk length on binding-pocket
dynamics generally, we next considered the dynamics of specific binding-pocket
residues. We calculated the associated all-atom, residue root-mean-square
fluctuations[56] for each of the 20 monomeric
trajectories associated with the NAwt and NAdel simulations. For each residue, we compared the ensembles of NAwt and NAdel RMSF values using a t-test. A number of residues exhibited statistically significant average
RMSF differences when a liberal significance level was used (t-test, α = 0.10). Most interestingly, the RMSF values
of two of the three arginine residues of the key active-site triad
(Arg371 and Arg118) differed significantly (Table ). This analysis again implicates the arginine
triad and 371 loop as contributing to differential, stalk-length-dependent
binding-pocket dynamics.
Table 1
Binding-Pocket Residues
with RMSF
Values That Differed Significantly between the NAwt and
NAdel Simulations (p < 0.10)a
NAwt
NAdel
p value
NAhead
Arg118 (triad)
0.65 ± 0.09
0.75 ± 0.21
0.06
0.95 ± 0.43
Glu119
0.67 ± 0.17
0.74 ± 0.10
0.10
1.02 ± 0.23
Ile222
0.82 ± 0.05
0.85 ± 0.06
0.09
0.84 ± 0.02
Ser246
1.24 ± 0.28
1.40 ± 0.23
0.06
1.30 ± 0.24
Glu277
0.58 ± 0.20
0.67 ± 0.13
0.07
0.68 ± 0.18
Asn347
1.04 ± 0.11
1.14 ± 0.19
0.06
0.83 ± 0.07
Arg371 (371 loop, triad)
1.24 ± 0.37
1.44 ± 0.32
0.09
0.98 ± 0.21
Data from the
corresponding NAhead (stalk-absent) simulations are also
included. The average
RMSF value is shown plus or minus the standard deviation. The residue-specific
populations of all NAwt and NAdel RMSF values
were compared using t-tests.
Data from the
corresponding NAhead (stalk-absent) simulations are also
included. The average
RMSF value is shown plus or minus the standard deviation. The residue-specific
populations of all NAwt and NAdel RMSF values
were compared using t-tests.To better visualize differences in active-site dynamics,
we colored
the residues of a NAwt conformation by ΔRMSF = RMSFwt – RMSFdel (Figure ). The 371, 406, and 430 loops were generally
more flexible in the NAwt simulation (shown in red). However,
a portion of the 150 loop was more flexible in the NAdel simulation (shown in blue).
Figure 4
NA active-site residues colored by ΔRMSF
= RMSFwt – RMSFdel, from −0.2
to 0.2. Residues in
red and blue were more flexible in the NAwt and NAdel simulations, respectively. The backbones of key binding-pocket
loops are subtly outlined. A crystallographic oseltamivir molecule
is shown for reference, though no ligand was included in the simulations.
NA active-site residues colored by ΔRMSF
= RMSFwt – RMSFdel, from −0.2
to 0.2. Residues in
red and blue were more flexible in the NAwt and NAdel simulations, respectively. The backbones of key binding-pocket
loops are subtly outlined. A crystallographic oseltamivir molecule
is shown for reference, though no ligand was included in the simulations.
Stalk Length May Subtly
Alter Active-Site Druggability
Finally, we also used our
molecular dynamics simulations to study
NA druggability. To get an initial impression of the full extent of
pocket dynamics, we first performed a volumetric analysis of the primary
sialic-acid-binding pocket using the POVME algorithm.[55] The trajectory frames with both the largest and smallest
volumes from the NAwt, NAdel, and NAhead simulations were compared separately (Figure ). In all cases, the pocket fluctuated between
a closed conformation (Figure a) and an open conformation (Figure b), suggesting a highly dynamic pharmacophore.
Interestingly, the NAwt pocket tended to be larger than
the NAdel pocket (Figure c, p = 0.0 per a t-test), implying possible differences in the pharmacophore distributions
of these two NA variants that may have important implications for
drug discovery. To further study variant-dependent differences in
NA-pocket druggability. FTMAP seeks to replicate in silico the experimental
multiple solvent crystal structures (MSCS) method developed by Mattos
et al.[65,66] In brief, the FTMAP algorithm uses computer
docking to flood the surface of a protein with small organic molecular
probes. These probes are then clustered to identify regions of the
protein surface that are most likely to bind small molecules.
Figure 5
Volumetric
analysis. (A) The smallest and (B) largest binding-pocket
conformations sampled by the NAwt simulation. The smallest
and largest conformations from the NAdel and NAhead simulations were similar (data not shown). Sialic acid molecules
(in licorice) have been placed in each pocket for reference, though
the simulations included no ligand. () Histograms of the volumes sampled
over the course of the NAwt and NAdel simulations,
in black and gray, respectively. The NAwt pocket tended
to be larger (p = 0.0).
Volumetric
analysis. (A) The smallest and (B) largest binding-pocket
conformations sampled by the NAwt simulation. The smallest
and largest conformations from the NAdel and NAhead simulations were similar (data not shown). Sialic acid molecules
(in licorice) have been placed in each pocket for reference, though
the simulations included no ligand. () Histograms of the volumes sampled
over the course of the NAwt and NAdel simulations,
in black and gray, respectively. The NAwt pocket tended
to be larger (p = 0.0).We used FTMAP to identify druggable hotspots on the surfaces
of
five representative NA structures extracted from each of our molecular
dynamics simulations. For the most part, the druggable regions of
the enzymatic site did not differ substantially between the NAwt and NAdel simulations. However, FTMAP did position
some molecular probes near a 430-loop-adjacent region of the NAwt pocket that was not deemed “druggable” when
the NAdel conformations were considered, suggesting subtle
differences that may be pharmacologically relevant (Figure a).
Figure 6
Unique druggable hotspots
(outlined). The 371, 430, and 150 loops
are shown in mauve, pink, and green, respectively. (A) A druggable
hotspot was identified near the 430 loop of the NAwt pocket.
(B) A second druggable hotspot was identified beneath the 371 loop
of both the NAwt and NAdel pockets. (C) A large
druggable hotspot was identified near the base of the NAwt and NAdel heads.
Unique druggable hotspots
(outlined). The 371, 430, and 150 loops
are shown in mauve, pink, and green, respectively. (A) A druggable
hotspot was identified near the 430 loop of the NAwt pocket.
(B) A second druggable hotspot was identified beneath the 371 loop
of both the NAwt and NAdel pockets. (C) A large
druggable hotspot was identified near the base of the NAwt and NAdel heads.The NAwt and NAdel hotspots were then
merged
and similarly compared to the NAhead hotspots. Surprisingly,
in both stalk-inclusive NA simulations (NAwt and NAdel), the 371 loop occasionally flipped upward, opening a novel
druggable hotspot that was not apparent when the representative structures
of the NAhead simulation were considered (Figure b). This pocket has been described
previously,[19,67] but the NAwt and NAdel simulations confirm its importance. To date, no known NA
inhibitor exploits this pocket, and it has not been the focus of previous
CADD efforts. Finally, another large druggable hotspot located at
the base of the NA head was unique to the stalk-inclusive (i.e., NAwt and NAdel) simulations (Figure c).
Conclusion
Factors
that alter the sialic acid binding/association mechanism
may profoundly alter NA affinity and activity in additional ways not
explained by the “theory of limited access.” The length
of the stalk appears to influence the dynamics of the 150, 430, 371,
and 406 binding-pocket loops. The 371 loop and the critical arginine
catalytic triad are particularly affected.As a consequence,
the apo NAwt pocket may adopt a greater
variety of conformational states than does that of NAdel. Specifically, the NAdel pocket tends to sample a single
conformational state that is generally similar to that of the holo
conformation. In contrast, NAwt additionally samples more
distant pocket geometries (Figure ).While the impact of stalk length on binding-pocket
dynamics may
well be partially responsible for the reduced NAdel affinity,
we note also that stalk length likely has a profound influence on
the long-range electrostatics of the whole-virion surface, which could
also impact kon. Future efforts will focus
on elucidating stalk-length-dependent differences in this effect as
well.Finally, the predicted druggable hotspots evident in the
stalk-inclusive
simulations also suggest novel and urgently needed opportunities for
drug discovery. Prior to the 2009 H1N1 pandemic, seasonal H1N1 in
the United States was almost entirely oseltamivir resistant.[68] Fortunately, that pandemic strain itself remained
largely susceptible to oseltamivir,[69] but
given the viral proclivity for rapid mutation and adaptation, multidrug
combinatorial therapies may be necessary in the future.[68]These novel hotspots also provide means
for experimentally testing
some of the hypotheses generated in the current study. Virtual screening
into the NA pocket conformations sampled over the course of the NAwt and NAdel simulations could identify compounds
that are uniquely suited to the corresponding pocket geometries. If
subsequent experimental validation confirms binding, X-ray crystallography
could be used to determine if the novel ligand(s) stabilize one of
the predicted NA conformations.
Authors: Jeffrey J Gray; Stewart Moughon; Chu Wang; Ora Schueler-Furman; Brian Kuhlman; Carol A Rohl; David Baker Journal: J Mol Biol Date: 2003-08-01 Impact factor: 5.469
Authors: Markus Christen; Philippe H Hünenberger; Dirk Bakowies; Riccardo Baron; Roland Bürgi; Daan P Geerke; Tim N Heinz; Mika A Kastenholz; Vincent Kräutler; Chris Oostenbrink; Christine Peter; Daniel Trzesniak; Wilfred F van Gunsteren Journal: J Comput Chem Date: 2005-12 Impact factor: 3.376
Authors: Dima Kozakov; David R Hall; Gwo-Yu Chuang; Regina Cencic; Ryan Brenke; Laurie E Grove; Dmitri Beglov; Jerry Pelletier; Adrian Whitty; Sandor Vajda Journal: Proc Natl Acad Sci U S A Date: 2011-08-01 Impact factor: 11.205
Authors: Xueyong Zhu; Hua Yang; Zhu Guo; Wenli Yu; Paul J Carney; Yan Li; Li-Mei Chen; James C Paulson; Ruben O Donis; Suxiang Tong; James Stevens; Ian A Wilson Journal: Proc Natl Acad Sci U S A Date: 2012-09-24 Impact factor: 11.205
Authors: Suxiang Tong; Xueyong Zhu; Yan Li; Mang Shi; Jing Zhang; Melissa Bourgeois; Hua Yang; Xianfeng Chen; Sergio Recuenco; Jorge Gomez; Li-Mei Chen; Adam Johnson; Ying Tao; Cyrille Dreyfus; Wenli Yu; Ryan McBride; Paul J Carney; Amy T Gilbert; Jessie Chang; Zhu Guo; Charles T Davis; James C Paulson; James Stevens; Charles E Rupprecht; Edward C Holmes; Ian A Wilson; Ruben O Donis Journal: PLoS Pathog Date: 2013-10-10 Impact factor: 6.823
Authors: Peter Eugene Jones; Carolina Pérez-Segura; Alexander J Bryer; Juan R Perilla; Jodi A Hadden-Perilla Journal: Curr Opin Virol Date: 2021-08-28 Impact factor: 7.121
Authors: Rommie E Amaro; Pek U Ieong; Gary Huber; Abigail Dommer; Alasdair C Steven; Robin M Bush; Jacob D Durrant; Lane W Votapka Journal: ACS Cent Sci Date: 2018-11-09 Impact factor: 14.553