The d-Ala:d-Lac ligase, VanA, plays a critical role in the resistance to vancomycin. Indeed, it is involved in the synthesis of a peptidoglycan precursor to which vancomycin cannot bind. The reaction catalyzed by VanA requires the opening of the so-called "ω-loop", so that the substrates can enter the active site. Here, the conformational landscape of VanA is explored by an enhanced sampling approach: temperature-accelerated molecular dynamics (TAMD). Analysis of the molecular dynamics (MD) and TAMD trajectories recorded on VanA permits a graphical description of the structural and kinetic aspects of the conformational space of VanA, where the internal mobility and various opening modes of the ω-loop play a major role. The other important feature is the correlation of the ω-loop motion with the movements of the opposite domain, defined as containing the residues A149–Q208. Conformational and kinetic clusters have been determined, and a path describing the ω-loop opening was extracted from these clusters. The determination of this opening path, as well as the relative importance of hydrogen bonds along the path, permits one to propose some key residue interactions for the kinetics of the ω-loop opening.
The development of bioinformatics
was initially driven not
only by the enormous quantity of data that the biologist community
was able to produce during the last decades, but also by the necessity
of finding approaches to organize and better analyze these huge datasets.
Although the protein structures constitute small datasets with respect
to many other data encountered in biology, they nevertheless represent
a challenge for the data analysis, as the relative positions of atomic
coordinates in a protein structure take values in the continuous three-dimensional
(3D) space. The large variability of protein features is obvious from
the variety of physicochemical properties among a given family of
proteins.[1] Furthermore, the full understanding
of a protein function requires, in addition to the knowledge of its
structure, the knowledge of the internal dynamics and thus of the
conformational landscape of the protein, which correspond to large
datasets.

Graphs are traditionally used for modeling biological datasets, for example
for the analysis of protein–protein and molecular interaction
networks,[2−9] for the description of drug function,[10−16] for the description of interactions within a protein,[17−19] and for the description of the hierarchy of local minima in the conformational
space.[20−22] In the description of protein conformational space,
the determination of such a graph is hampered by the need to (i) simplify
the protein local geometry without loss of information and (ii) find
a generic approach for graph determination, while preserving the specificity
of each protein. In contrast, the description of protein structure
and dynamics through graphs would allow one to (i) relate structure
description, conformational variability, and protein function; (ii)
unify the structural and dynamical representations; and (iii) obtain,
for a given protein, a model that could be interfaced with the graphs
described at the cellular level, such as the interactome network.[23]

In order to investigate the points quoted above, we have used several processing tools to describe the graphs underlying the structural and dynamical features of the d-Ala:d-Lac (VanA) ligase:
(i) the temperature-accelerated molecular dynamics (TAMD),[29−41] an enhanced sampling approach, to explore the conformational space;
(ii) the self-organizing maps,[24] to convert the conformational space into a two-dimensional (2D) map;
(iii) the Louvain greedy algorithm,[25] to determine kinetic clusters in the conformational space;
(iv) the Girvan–Newman algorithm, to determine contact communities within the protein structure, which was already used on other structural objects;[26,27] and
(v) the analysis of hydrogen bonds within the protein structure, using a machine-learning approach (Random Forest[28]).

The d-Ala:d-Lac ligase (VanA) is present
in cases of resistance to the glycopeptide antibiotic vancomycin in Enterococcus faecium and Staphylococcus aureus.[42,43] VanA synthesizes a modified precursor, d-Ala-d-Lac, instead of the usual d-Ala-d-Ala, which is synthesized by a d-Ala:d-Ala ligase.[44] This depsipeptide is then fixed at the end of the N-acetyl-muramyl-l-Ala-d-Glu-l-Lys-d-Ala-d-Lac monomers involved in the building of the peptidoglycan, giving rise to a fully efficient cell wall while preventing the binding of vancomycin.

The X-ray crystallographic structure of VanA[45] (Figure 1a) includes the N-terminal domain (residues A2–G121, shown in blue), the central domain (residues C122–S211, shown in red and yellow), and the C-terminal domain (residues G212–A342, shown in black and green). The ω-loop (shown in green in Figure 1a, residues L236–A256) is part of the C-terminal domain and closes the binding site where the ligase enzymatic reaction occurs. The two-layer β-sandwich (residues A149–Q208) is a region opposite to the ω-loop in the structure and is colored yellow in Figure 1a. It was called the “opposite domain” in a previous work.[46] The binding site is located at the interface between the N-terminal, central, and C-terminal domains. Concerted motions of the opposite domain and of the ω-loop allow the opening of the binding cavity to release the product of the catalytic reaction and accept new ligands.[46]
Figure 1
(a) Three-dimensional (3D) view of the X-ray crystallographic structure
of VanA, colored according to its domains: the N-terminal [A2–G121]
shown in blue, the C-terminal [G212–A342] shown in black, which
includes the ω-loop [L236–A256] shown in green, and the
central domain [C122–S211] shown in red, which includes the
opposite domain [A149–Q208] shown in yellow. The disulfide
bridge C52–C64, located in the N-terminal domain, is shown
with magenta labels (bottom right). (b) Localization of the collective
variables (CV) used for the different TAMD calculations on a cartoon
view of VanA extracted at the end of a 10 ns MD trajectory. The three
structural CV are shown in orange and the five CV obtained from contact
communities calculations are shown in cyan.
The bioinformatics approaches described above have been applied
to MD and TAMD trajectories recorded on VanA. Several graph models
describing the structural architecture, internal dynamics, and the
opening of the ω-loop, have been established. These models give
an extended view of the structural and dynamical features of VanA
and agree with the experimental knowledge available for the protein
function.
Materials and Methods
Molecular Dynamics Simulation
The starting point of the simulations was the X-ray crystallographic structure of the d-Ala:d-Lac ligase (VanA) from Enterococcus faecium BM4147 (PDB ID: 1E4E).[45] The co-crystallized ligands, ADP and phosphinate (1(S)-aminoethyl-(2-carboxypropyl)phosphoryl-phosphinic acid), located in the active site were removed. The C52–C64 disulfide bridge observed in the crystal was disrupted to be as close as possible to the physiological state of the d-Ala:d-Ala ligase.[47]

The CHARMM22 force field including the correction map (CMAP)[48,49] was used. The system was neutralized with five Na+ counterions. Explicit TIP3P[50] solvent water molecules were added to the system using a cutoff of 10 Å. The solvated system includes 13585 water molecules. The molecular dynamics (MD) and the temperature-accelerated molecular dynamics (TAMD) trajectories were recorded using NAMD 2.7b2.[51] A cutoff of 12 Å and a switching distance of 10 Å were defined for nonbonded interactions. Long-range electrostatic interactions were calculated with the Particle Mesh Ewald (PME) protocol.[52]

Before starting the initial MD trajectories, the system was initialized in the following way. It was first minimized using 1000 steps, then thermalized by heating from 0 to 300 K over 30 ps, with a time step of 1 fs. The system was then equilibrated in the NPT ensemble for 100 ps with a time step of 2 fs before a 40 ns MD simulation.

The analyzed trajectories were recorded in the NPT ensemble with periodic boundary conditions. The temperature was maintained at 300 K using a Langevin thermostat,[53] and the pressure was regulated at 1 atm using the Nosé–Hoover Langevin piston method.[54,55] The SHAKE algorithm[56] kept all covalent bonds involving hydrogens rigid, so an integration time step of 2 fs was used for all MD simulations. Atomic coordinates were saved every picosecond.
TAMD Simulations
At the end of the first 10 ns of the MD trajectory, five independent 30-ns temperature-accelerated molecular dynamics (TAMD) simulations were launched (Table S2 in the Supporting Information). TAMD is an enhanced sampling approach based on the parallel evolution of the protein coordinates x in a classical MD simulation and of the target values z for the collective variables θα(x):

$$
\begin{aligned}
M\ddot{x} &= -\nabla_x V(x) - \kappa \sum_{\alpha} \left(\theta_\alpha(x) - z_\alpha\right) \nabla_x \theta_\alpha(x) - \gamma M \dot{x} + \sqrt{2\beta^{-1}\gamma M}\,\eta^{x}(t) \\
\bar{\gamma}\,\dot{z} &= \kappa \left(\theta(x) - z\right) + \sqrt{2\bar{\beta}^{-1}\bar{\gamma}}\,\eta^{z}(t)
\end{aligned}
\tag{1}
$$

where x are the physical variables (atomic coordinates) of the system, θ(x) are the collective variables, and z are the instantaneous target values of the collective variables. M is the mass matrix, V(x) is the empirical classical potential of the system, $\eta^{p}(t)$ denotes white noise (i.e., Gaussian processes with mean 0 and covariance $\langle \eta^{p}_{\alpha}(t)\,\eta^{p}_{\alpha'}(t') \rangle = \delta_{\alpha\alpha'}\,\delta(t - t')$, with p = x, z), κ > 0 is the so-called spring force constant, γ and γ̅ > 0 are the friction coefficients of the Langevin thermostats, β–1 = kBT, and β̅–1 = kBT̅, where kB is the Boltzmann constant and T and T̅ represent the temperatures.

Equation 1 describes the motion of x and z under the extended potential

$$
U_\kappa(x, z) = V(x) + \frac{\kappa}{2} \sum_{\alpha} \left(\theta_\alpha(x) - z_\alpha\right)^2
\tag{2}
$$

It was shown in ref 29 that, by adjusting the parameter κ so that z(t) ≈ θ(x(t)), and the friction coefficient γ̅ so that z moves more slowly than x, one can generate a trajectory z(t) in z-space that effectively moves at the artificial temperature T̅ on the free-energy hypersurface F(z), which is defined at the physical temperature T. Hence, by construction, the limiting equation for z(t) in eq 1 samples the distribution $e^{-\bar{\beta}F(z)}$. Using T̅ > T in eq 1 then accelerates the exploration of the free-energy landscape by the z(t) trajectory, as energy barriers can be crossed more easily.

The value of the artificial friction γ̅ on the z variables can be determined following the principle that the separation of time scales between x and z must be such that the x have time to equilibrate before the z values move substantially. In practice, we proceeded as suggested in ref 57, i.e., we ran short standard MD trajectories with the collective variables restrained at θ(x) = z, with z fixed, and monitored the mean force estimators G_j(N) defined for each collective variable j as

$$
G_j(N) = \frac{1}{N} \sum_{n=1}^{N} \kappa \left(\theta_j(x(t_n)) - z_j\right)
\tag{3}
$$

where θ_j(x(t_n)) is the instantaneous value at time t_n of the collective variable j. The time required for G_j(N) to reach a plateau (see Figure S1 in the Supporting Information) allows one to extract the characteristic relaxation time of the Cartesian variables at a fixed value of the variables z, and hence an estimate of γ̅ that ensures the time-scale separation γ̅/γ. As the estimator (described in eq 3) converges in 5000 simulation time steps (time step of 0.002 ps), a friction γ̅ of 50 ps–1, corresponding to a characteristic time of 0.02 ps, is sufficient to allow system relaxation.

The TAMD approach
was implemented in NAMD using a Tcl script.[39,57] In TAMD, the evolution of the usual MD equation, at 300 K, was coupled to the evolution of the collective variables at a much higher temperature. Several sets of collective variables were used, which were all geometric centers located in different protein regions.

The friction coefficient, γ = 0.5 ps–1, and the physical thermal energy, β–1 = 0.6 kcal/mol, are the parameters of the conventional Langevin thermostat, which allow one to obtain a simulation temperature of 300 K. The restraint force constant is set to κ = 100 kcal/(mol Å2).

TAMD trajectories were run using a value of 20 kcal mol–1 for the artificial thermal energy β̅–1 of the Langevin thermostat attached to the collective variables. This thermal energy corresponds to an artificial temperature T̅ of 10 060 K. Despite the high temperature used for the Langevin thermostat attached to the collective variables, the folded structure of VanA is not expected to be destabilized, as a large friction (γ̅ = 50 ps–1) is used for this thermostat, along with a high force constant (κ = 100 kcal/(mol Å2)) that restrains the collective variables θ(x) to their target values z. In that way, we reduce the risk of system instability due to large deviations of the collective variables θ(x) from their target values z.
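The coupled evolution of x and z described in this section can be illustrated on a toy model. The following is only a minimal sketch, assuming a one-dimensional double-well potential, the collective variable θ(x) = x, reduced units, and a plain Euler–Maruyama integrator; none of these choices come from the NAMD/Tcl implementation used in this work, and the function names are hypothetical. The helper `thermal_energy_to_temperature` simply converts a thermal energy in kcal/mol into kelvin.

```python
import math
import random

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def thermal_energy_to_temperature(energy_kcal_per_mol):
    """Convert a thermal energy k_B*T (kcal/mol) into a temperature (K)."""
    return energy_kcal_per_mol / KB_KCAL_PER_MOL_K


def dV(x):
    """Derivative of the toy double-well potential V(x) = (x**2 - 1)**2."""
    return 4.0 * x * (x * x - 1.0)


def run_toy_tamd(n_steps=20000, dt=1e-3, mass=1.0, kappa=100.0,
                 gamma=0.5, gamma_bar=50.0, kT=0.2, kT_bar=2.0, seed=0):
    """Integrate a 1D toy TAMD system (reduced units, illustrative values).

    x follows Langevin dynamics at thermal energy kT under V(x) plus the
    harmonic coupling to z; z follows overdamped dynamics at kT_bar.
    """
    rng = random.Random(seed)
    x, v, z = -1.0, 0.0, -1.0
    xs, zs = [], []
    # Noise amplitudes of the Euler-Maruyama increments.
    sigma_v = math.sqrt(2.0 * gamma * kT * dt / mass)
    sigma_z = math.sqrt(2.0 * kT_bar * dt / gamma_bar)
    for _ in range(n_steps):
        # Physical force plus restraint pulling theta(x) = x towards z.
        force = -dV(x) - kappa * (x - z)
        v += dt * (force / mass - gamma * v) + sigma_v * rng.gauss(0.0, 1.0)
        x += dt * v
        # Slow variable z, pulled towards theta(x) and heated at kT_bar.
        z += dt * kappa * (x - z) / gamma_bar + sigma_z * rng.gauss(0.0, 1.0)
        xs.append(x)
        zs.append(z)
    return xs, zs


if __name__ == "__main__":
    xs, zs = run_toy_tamd()
    crossings = sum(1 for a, b in zip(zs, zs[1:]) if a * b < 0.0)
    print("sign changes of z (barrier crossings):", crossings)
    print("T_bar for 20 kcal/mol:", round(thermal_energy_to_temperature(20.0)), "K")
```

In this sketch, raising `kT_bar` relative to `kT` lets z cross the barrier at x = 0 more often, while the stiff coupling κ keeps x close to z, which is the qualitative behavior exploited by TAMD.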
Determination of Contact Communities
The following method has been used to determine the contact communities of VanA along each recorded trajectory. At each trajectory frame, a contact is set up for all α-carbon pairs closer than 12 Å,[58] and the frequency of contacts is calculated along the trajectory. The protein structure is then considered as a graph, where the Cα atoms of the residues constitute the vertices and the edges are weighted by the frequency of contacts between Cα atoms along the trajectories. An absence of contact is modeled as a nonexisting edge. The Girvan–Newman algorithm,[59] as implemented in Python, allows one to divide the graph iteratively into contact communities. First, all possible shortest paths are calculated between the Cα atoms, and the betweenness of each edge, which is defined as the number of shortest paths crossing this edge, is computed. The algorithm then removes the edge exhibiting the highest betweenness and includes the two edge vertices into the same community. The betweenness of all edges affected by the removal is recalculated. Several runs of the algorithm are performed to remove the edge of highest betweenness until no edges remain. At the end of the process, the initial dynamic map of frequency of contacts has been split into contact communities of amino acids that are strongly connected.
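The procedure above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the code used for VanA: the function names are hypothetical, the edge betweenness is computed on an unweighted graph (contact frequencies would first have to be thresholded or otherwise turned into edges), and the splitting stops once a requested number of connected components is reached, which is the usual Girvan–Newman stopping rule.

```python
import math
from collections import deque
from itertools import combinations


def contact_frequencies(frames, cutoff=12.0):
    """frames: list of frames, each a list of (x, y, z) C-alpha positions.
    Returns {(i, j): fraction of frames where atoms i < j are within cutoff}."""
    counts = {}
    for frame in frames:
        for i, j in combinations(range(len(frame)), 2):
            if math.dist(frame[i], frame[j]) < cutoff:
                counts[(i, j)] = counts.get((i, j), 0) + 1
    return {pair: c / len(frames) for pair, c in counts.items()}


def edge_betweenness(adj):
    """Unweighted edge betweenness (Brandes' algorithm); adj maps node -> set."""
    score = {}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate path fractions on edges
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge = (min(v, w), max(v, w))
                score[edge] = score.get(edge, 0.0) + c
                delta[v] += c
    return score


def components(adj):
    """Connected components of the graph, as a list of sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman(edges, nodes, n_communities):
    """Remove highest-betweenness edges until n_communities components exist."""
    adj = {v: set() for v in nodes}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    comps = components(adj)
    while len(comps) < n_communities and any(adj.values()):
        score = edge_betweenness(adj)
        i, j = max(score, key=score.get)
        adj[i].discard(j)
        adj[j].discard(i)
        comps = components(adj)
    return comps
```

A possible use is to keep the pairs returned by `contact_frequencies` whose frequency exceeds a chosen threshold and to pass them as `edges` to `girvan_newman`; the threshold is an assumption of this sketch, since the text weights edges by frequency instead.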
Conformational Analysis of the Simulations Using SOM
The Self-Organizing Maps (SOM) approach[24,60,61] was used to cluster the conformations
generated along MD and TAMD trajectories. The SOM algorithm allows
the mapping of the conformational space on a periodic subspace of
reduced dimensions: a 50 × 50 map. 341 × 341 pairwise square
Euclidean distance matrices D were calculated for
the 341 Cα atoms of VanA, for each frame of the trajectory.
To compress the data, a covariance matrix C was computed
from each D. Its four eigenvectors, corresponding
to the first four significant eigenvalues N were kept. For each trajectory frame t, the resulting compressed 4 × 341 matrix D · V, stored as a vector V, contains the conformational descriptors and is used to cluster
the protein conformations.[61]

The SOM was trained in two phases with the following parameters: (i) a map size of 50 × 50 with periodic boundaries, initialized randomly, with a constant learning rate of 0.5 and a radius of 6.250 for the first phase (180 000 iterations), and (ii) an exponential decrease of the learning rate (starting at 0.25) and of the radius (starting at 3.125) for the second phase (360 000 iterations). After the random initialization of the map, the vectors of conformational descriptors V described above were presented to the map in random order,[46] and the neuron closest to the presented V was updated, as well as the neighbor neurons, to preserve the coherence of the clustering. At the end of the calculation, each neuron of the SOM contains an average vector ⟨V⟩ corresponding to a mixture of clustered protein conformations.

The Unified distance matrix (U-matrix) representation was computed to display the SOM topology on a bidimensional matrix. In the U-matrix, each node shows the local similarity between the corresponding neighboring SOM neurons, i.e., the mean distance between the node and its eight neighbors. A flooding algorithm was then used to aggregate the U-matrix basins and to reject to the outside the regions corresponding to dissimilar neurons, leading to a continuous map representation while preserving the inherent SOM topology.[61]
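The U-matrix definition given above (mean distance between a neuron and its eight neighbors on a periodic map) can be written compactly. The sketch below is illustrative only: it assumes the trained map is available as a nested list of weight vectors, and the function name `u_matrix` is hypothetical rather than taken from the SOM software used here.

```python
import math


def u_matrix(som):
    """Compute the U-matrix of a periodic (toroidal) SOM.

    som[r][c] is the weight vector of the neuron at row r, column c.
    Each output value is the mean Euclidean distance between a neuron
    and its eight neighbors, with indices wrapped around the map edges.
    """
    n_rows, n_cols = len(som), len(som[0])
    umat = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # skip the neuron itself
                    neighbor = som[(r + dr) % n_rows][(c + dc) % n_cols]
                    total += math.dist(som[r][c], neighbor)
            umat[r][c] = total / 8.0
    return umat


if __name__ == "__main__":
    # 4 x 4 toy map: all neurons identical except one outlier at (1, 1).
    toy = [[(0.0, 0.0) for _ in range(4)] for _ in range(4)]
    toy[1][1] = (3.0, 4.0)
    for row in u_matrix(toy):
        print(row)
```

Low U-matrix values mark basins of similar neurons, and high values mark the borders between them, which is what the flooding step then exploits.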
Graph Processing of the Self-Organizing Maps
The SOM were additionally processed in two ways in order to determine graphs describing (i) the kinetics of the conformational space sampled and (ii) the opening path between the closed and open conformations of VanA.

The graph related to the kinetics of the conformational space sampled was determined in the following way. A transition matrix is built from the SOM. The SOM neurons define the microstates, and each structure along a given MD or TAMD trajectory is assigned to a given neuron. The element T_ij of the transition matrix, depicting the transition between neurons i and j, is defined as the number of i → j transitions divided by the number of starts from neuron i. The transition matrix can be represented as a weighted graph, with the weight of the edge ij being given by T_ij.

The obtained graph is then partitioned using the greedy algorithm of Louvain,[25] in order to maximize the graph modularity. The modularity is a value between −1 and +1, measuring the density of edges inside the partitions, compared to the density of edges outside the partitions. The greedy algorithm of Louvain optimizes the modularity in two phases.

In the first phase, each SOM neuron is initially assigned to its own kinetic cluster. Then, for each SOM neuron u, the variation of modularity is evaluated when u is removed from its cluster and placed in the cluster of each of its neighbors. The neuron u is moved to the cluster giving the largest gain of modularity; if no gain of modularity is possible, u remains in its cluster. In the second phase, a new graph is built by merging the SOM neurons belonging to the same cluster. The weights of the resulting graph are computed by summing the weights of the links between nodes in the corresponding two clusters.

The opening path between the VanA states displaying open and closed ω-loops was determined in the following way. Edges between SOM neurons were weighted by the value of the corresponding element of the U-matrix, which measures the local similarity between protein conformations. The starting point was the SOM node u corresponding to the starting point of all trajectories, with a closed ω-loop. The final point of the path was chosen as the medoid of the SOM kinetic cluster 15, which will be described below. The medoid is the neuron whose average distance to all the neurons in the cluster is minimal.

The shortest path is computed using the Dijkstra algorithm,[62] using the similarity between neurons as a distance. Finally, the path defined from SOM neurons was converted to a series of VanA conformations by replacing each neuron by the VanA conformation exhibiting the smallest Euclidean distance between its vector of conformational descriptors V and the average vector ⟨V⟩ of the neuron.
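Two of the graph constructions of this section, the row-normalized transition matrix and the Dijkstra shortest path, can be sketched as follows. This is a minimal illustration with hypothetical function names and data layouts (a list of neuron indices per trajectory, and a dictionary of edge costs); it is not the code used in this work, and it treats self-transitions i → i like any other transition, which is an assumption.

```python
import heapq
from collections import Counter, defaultdict


def transition_matrix(neuron_sequence):
    """T[i][j] = (number of i -> j transitions) / (number of starts from i).

    neuron_sequence lists, frame after frame, the SOM neuron to which each
    conformation of ONE trajectory is assigned.
    """
    counts = defaultdict(Counter)
    for i, j in zip(neuron_sequence, neuron_sequence[1:]):
        counts[i][j] += 1
    return {i: {j: n / sum(row.values()) for j, n in row.items()}
            for i, row in counts.items()}


def dijkstra_path(edges, start, goal):
    """Shortest path on a graph given as {node: {neighbor: cost >= 0}}.

    Returns (total cost, list of nodes from start to goal), or
    (inf, []) when the goal cannot be reached.
    """
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # outdated heap entry
        for neighbor, weight in edges.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if goal not in best:
        return float("inf"), []
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]
```

When several trajectories are analyzed, the transition counts would be accumulated per trajectory before normalization, so that no artificial transition is created between the end of one trajectory and the start of the next. For the opening path, the edge costs would be the U-matrix values between neighboring neurons.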
Analysis of Hydrogen Bonds within VanA
The path describing the ω-loop opening has been analyzed to detect the most critical hydrogen bonds for the conformational change. For that purpose, along the opening path, a representative conformation was extracted from each kinetic cluster obtained above using the Louvain greedy algorithm.[25] This representative conformation was chosen as the medoid of the path conformations belonging to this kinetic cluster.

On each of these VanA conformations, hydrogen bonds have been detected using criteria based on a survey of small-molecule crystal structures.[63] This analysis was performed using the UCSF Chimera package,[64] producing 1623 hydrogen bonds. A hydrogen bond is considered to be established if the donor–acceptor and the hydrogen–acceptor distances are smaller than 4.0 and 3.0 Å, respectively.

A Random Forest (RF)[28] machine learning approach was used to calculate the importance of each hydrogen bond for predicting to which kinetic cluster the representative conformation belongs. The information on established and disrupted hydrogen bonds was encoded as a Boolean vector for each conformation populating the path. The hydrogen bonds were indexed by protein residue numbers. The Boolean vectors were used as descriptors to train the RF. The predicted value for each vector was the identifier of the kinetic cluster.

The RF calculation was performed using the Python package scikit-learn (scikit-learn.org). The number of trees in the forest was set to 10, with a Gini criterion[28] to measure the quality of a split. The number of features used when searching for the best split was set to 40, which is approximately the square root of the length of the Boolean vectors (√1623 ≈ 40). The trees are expanded until all leaves are pure. Once the training was done, the importance of each hydrogen bond to define a kinetic cluster was computed.
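To make the Gini criterion more concrete, the sketch below computes the decrease in Gini impurity obtained when a set of conformations is split on a single Boolean hydrogen-bond feature. It is only an illustration written with the standard library and hypothetical function names: it is not a Random Forest and does not reproduce the scikit-learn calculation, whose impurity-based importances average such decreases over all the splits of all the trees.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k**2 of a list of cluster labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_gain(feature, labels):
    """Decrease in Gini impurity when splitting on one Boolean feature.

    feature[k] is True if the hydrogen bond is established in conformation k;
    labels[k] is the kinetic cluster of conformation k.
    """
    present = [y for f, y in zip(feature, labels) if f]
    absent = [y for f, y in zip(feature, labels) if not f]
    weighted = (len(present) * gini_impurity(present)
                + len(absent) * gini_impurity(absent)) / len(labels)
    return gini_impurity(labels) - weighted


def features_per_split(n_features):
    """Square-root rule for the number of features tried at each split."""
    return round(math.sqrt(n_features))


if __name__ == "__main__":
    clusters = [1, 1, 2, 2]
    print(gini_gain([True, True, False, False], clusters))  # informative bond
    print(gini_gain([True, False, True, False], clusters))  # uninformative bond
    print(features_per_split(1623))
```

A hydrogen bond whose presence separates the kinetic clusters well gives a large gain, whereas a bond that is present or absent regardless of the cluster gives a gain close to zero.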
Ligand Docking Procedure and GBSA Scoring
The substrates, ATP, d-Ala, d-Lac, d-alanyl-phosphate (d-Ala(P)), the transition-state analogue phosphinate or PHY, the product of the reaction, d-Ala-d-Lac, and the allosteric binder,[65] were formatted in mol2 with Chimera 1.4[64] and MarvinSketch 5.1 (www.chemaxon.com/products/marvin/marvinsketch) for docking.

UCSF DOCK 6.5[66−68] was used to perform ligand docking on VanA conformations along the opening path obtained as described at the end of the section “Graph Processing of the Self-Organizing Maps”.

Chimera[64] was used to add hydrogens, check atom assignment, and assign partial charges consistent with the AMBER-ff99SB force field.[69] Chimera was also used to produce mol2 format files for the ligands and the selected conformations of the receptor. The DMS software program[70,71] generated the molecular surface of the receptor, using a probe radius of 1.4 Å. Spheres then were calculated around the receptor with the DOCK 6.5 command “sphgen” with probe radius values varying between 1.4 Å and 4 Å.[72] Spheres were selected within a radius of 10 Å around the geometric center defined by the residues E15, K170, R289, N303, E304, and N306, which are close to the positions observed for the ligands (ADP, phosphinate) in 1E4E. The grid encoding van der Waals and electrostatic interactions was precalculated with the “grid” tool[72] in a box containing the selected spheres. The DOCK program builds up to 500 flexible ligand docking orientations on the precalculated “grid” interaction map. The ligand poses were then re-scored with the Hawkins Molecular Mechanics Generalized Born Surface Area (MM-GBSA) score,[73−77] as implemented in UCSF DOCK 6.5. The best scoring solution was kept for each protein–ligand pair.
Results
Choice of Collective Variables from the Structural and Community Domains of VanA
The use of the enhanced sampling approach TAMD requires the definition of collective variables. In the present work, these variables were chosen as geometric centers of α-carbons located in various VanA regions. These regions were detected (Table 1) from an analysis of the X-ray crystallographic structure of VanA (PDB ID: 1E4E) or from the contact communities determined by the Girvan–Newman algorithm, as described in the section “Determination of Contact Communities”. Starting from these regions, two sets of geometric centers were determined (see Table S1 in the Supporting Information): structural collective variables (CVN-Xr, CVO-Xr, and CVω-Xr) and dynamical collective variables (CVω-Com, CVE0-Com, CVE1-Com, CVM-Com, and CVO-Com). Five independent 30-ns temperature-accelerated molecular dynamics (TAMD) simulations were launched using various combinations of both sets of collective variables (see Table S2 in the Supporting Information).
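A collective variable of this kind, the geometric center of a set of α-carbons, can be computed as in the sketch below. The function names and the dictionary layout of the coordinates are hypothetical, and the residue range used in the example (the ω-loop, residues 236–256) is only an illustration: the exact atom sets of each collective variable are those listed in Table S1, which may differ from the full domain ranges.

```python
def residue_ranges(*ranges):
    """Expand inclusive (first, last) residue ranges into a list of numbers."""
    residues = []
    for first, last in ranges:
        residues.extend(range(first, last + 1))
    return residues


def geometric_center(ca_coords, residues):
    """Geometric center of the C-alpha atoms of the selected residues.

    ca_coords maps a residue number to the (x, y, z) position of its
    C-alpha atom; residues missing from ca_coords are ignored.
    """
    selected = [ca_coords[r] for r in residues if r in ca_coords]
    if not selected:
        raise ValueError("no C-alpha atom found for the selected residues")
    n = len(selected)
    return tuple(sum(p[k] for p in selected) / n for k in range(3))


if __name__ == "__main__":
    # Toy coordinates for three residues (not taken from 1E4E).
    toy_coords = {1: (0.0, 0.0, 0.0), 2: (2.0, 0.0, 0.0), 3: (4.0, 6.0, 0.0)}
    print(geometric_center(toy_coords, residue_ranges((1, 3))))
    # Residue numbers of the omega-loop region, as an example selection.
    print(len(residue_ranges((236, 256))))
```

Because the geometric center is a linear function of the atomic positions, its gradient with respect to each selected atom is simply 1/n along each axis, which keeps the restraint forces of the TAMD coupling simple to evaluate.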
Table 1. Definition of the Different Domains of Protein VanA^a

| domain | residues | determination method |
|---|---|---|
| N-terminal-Xr | 2–121 | structural |
| C-terminal-Xr | 212–342 | structural |
| Central-Xr | 122–211 | structural |
| Opposite-Xr | 149–208 | structural |
| Omega-Xr | 236–256 | structural |
| Ends_0-Com | 2–7, 30–39, 69–78, 88–95, 108–120, 330–342 | communities |
| Ends_1-Com | 8–29, 40–68, 79–87, 96–103, 310–313 | communities |
| Middle-Com | 104–107, 121–147, 220–226, 277–289, 303–309 | communities |
| Opposite-Com | 148–210 | communities |
| ω-Com | 211–219, 227–276, 290–302, 314–329 | communities |

^a The first five domain definitions are derived from the analysis of the X-ray (Xr) crystallographic structure 1E4E.[45] The last five domain definitions are the communities obtained using the Girvan–Newman algorithm on the 30-ns MD trajectory.
The structural
collective variables CVN-Xr, CVO-Xr, and CVω-Xr (Table S1 and Figure 1b) were respectively defined on the N-terminal domain, opposite domain, and ω-loop, chosen from a direct observation of the PDB structure 1E4E. This choice is supported by several observations on X-ray crystallographic structures and MD trajectories.[45−47] First, the ω-loop, containing CVω-Xr, displays diverse orientations in X-ray crystallographic structures of d-Ala:d-Ala ligases.[45] Second, the opposite region (residues 149–208) was chosen to define CVO-Xr, as this region moves apart from the protein core, as published in a previous work.[46]

The dynamical collective variables were derived from the contact communities calculated using the Girvan–Newman algorithm along a 30-ns MD trajectory: these communities are described in more detail below. The corresponding geometric centers are located in the ω-loop (CVω-Com), in the N-terminal and C-terminal domains (CVE0-Com, CVE1-Com), and in the middle (CVM-Com) and opposite (CVO-Com) domains (see Table S1 and Figure 1b).

The contact community analysis based on the Girvan–Newman algorithm allowed one to divide VanA into five communities in both the MD and the TAMD simulations, except in TAMD_ON, where four communities were observed (see Figure 2). These communities vary from one simulation to another, but involve similar protein regions for all trajectories (see Table S3 in the Supporting Information), even though different sets of collective variables were used during each TAMD trajectory. The two communities Ends_0-Com and Ends_1-Com are interlaced in the protein sequence, and contain residues from the structural definition of the N- and C-terminal regions. The Opposite-Com community is located in the opposite domain, while the ω-Com community corresponds to the ω-loop and part of the C-terminal domain. The last community, Middle-Com (see Table S3), located in the middle of the protein and partially superimposed with the central structural domain Central-Xr (Table 1), is detected in all trajectories except TAMD_ON. The definitions of the contact communities are slightly different from the definitions of the structural domains, except for Opposite-Com, which is almost superimposed on the domain Opposite-Xr (Table 1). The good fit of Opposite-Com to Opposite-Xr is expected, as the opposite domain was previously detected from an analysis of MD trajectories.[46]
Figure 2
Communities
determined by the Girvan–Newman algorithm[59] along the MD and TAMD trajectories recorded
on VanA. The same color code was kept for the communities both on
the 3D structures and on the graphs: the communities mainly located
in the N-terminal region (numbers 0 and 1) are shown in blue and red;
the Middle (number 2) community is shown in magenta, if it exists;
the Opposite region is shown in yellow (number 3); the ω-loop
and the main part of the C-terminal are shown in green (number 4).
Projection of the communities calculated on a 30-ns trajectory of
VanA for (a) MD, (c) TAMD_ON, (e) TAMD_ωN, (g) TAMD_OωN,
(i) TAMD_MD, and (k) TAMD_5CV. Also shown is a graph of the interconnectivity
calculated between the different communities for (b) MD, (d) TAMD_ON,
(f) TAMD_ωN, (h) TAMD_OωN, (j) TAMD_MD, and (l) TAMD_5CV.
The collective variables (CV) used for TAMD trajectories are represented
by orange balls when they were derived from structural calculations
and cyan balls if they were obtained from the communities calculations.
The contact communities graph
is connected by edges (Figure ), which depict the frequency
of contact between α-carbons belonging to two different communities.
The larger the frequency, the thicker the edge.[26,27] Thus, the edge thickness gives a qualitative indication of the relative
influences that the communities have on each other. Overall, the same
pattern of influences between communities is observed in all trajectories
(Figure ). The community
corresponding to the ω-loop is always strongly linked with the
opposite community, as reflected by the high betweenness. This communication
is mostly mediated by the middle community (in purple). The opposite
domain is itself connected to the Ends communities detected into the
N- and C-terminal domains (shown in red and blue in Figure ).The definitions of
structural and dynamical collective variables and
of contact communities determined on the trajectory TAMD_ωN
are depicted (Figure ) using a color code. The definitions corresponding to the opposite
domain (yellow) and to the ω-loop (green) are similar for the
three sets of definitions. Also, similar middle or central domains
(magenta) are detected between dynamical collective variables and
contact communities.
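The edge weights of the community graph described above can be illustrated with a minimal sketch. It assumes that the weight of an edge is the fraction of frames in which at least one pair of α-carbons from the two communities lies within a distance cutoff; this definition, the 8 Å cutoff, the toy coordinates, and the community labels are all assumptions made for illustration and are not taken from the analysis reported here.

```python
import math
from collections import defaultdict
from itertools import combinations

def community_contact_frequency(frames, community_of, cutoff=8.0):
    """Fraction of frames in which at least one C-alpha pair links two communities.

    frames: list of frames, each a list of (x, y, z) C-alpha coordinates.
    community_of: community label of each residue (hypothetical labels).
    cutoff: contact distance in angstroms (assumed value).
    """
    counts = defaultdict(int)
    for coords in frames:
        linked = set()
        for i, j in combinations(range(len(coords)), 2):
            ci, cj = community_of[i], community_of[j]
            if ci == cj:
                continue  # only contacts between different communities count
            if math.dist(coords[i], coords[j]) <= cutoff:
                linked.add(tuple(sorted((ci, cj))))
        for pair in linked:
            counts[pair] += 1
    return {pair: n / len(frames) for pair, n in counts.items()}

# Toy data: 4 residues, 2 frames, 3 communities.
frames = [
    [(0, 0, 0), (5, 0, 0), (20, 0, 0), (24, 0, 0)],
    [(0, 0, 0), (5, 0, 0), (11, 0, 0), (30, 0, 0)],
]
community_of = ["omega", "middle", "opposite", "opposite"]
weights = community_contact_frequency(frames, community_of)
```

In this toy example the ω and opposite communities never touch directly, so the only route between them passes through the middle community, which mimics the mediation pattern discussed in the text.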
Figure 3
Definition of collective variables (CV) and of contact
communities
displayed on the VanA sequence. The first line contains the definition
of structural collective variables (CVN-Xr, CVO-Xr, CVω-Xr:
see Table S1) determined from an analysis
of the structure 1E4E. The second line contains the definition of dynamical collective
variables (CVE0-Com, CVE1-Com, CVM-Com, CVO-Com, CVω-Com: Table S1) determined from a community analysis
using the Girvan–Newman algorithm over the 30-ns MD trajectory.
The third line contains the definition of communities (Ends_Oc, Ends_1c,
Middle_c, Opposite_c, ω_c: see Table S3) determined by the Girvan–Newman algorithm on the trajectory
TAMD_ωN. The following color code is used. For the structural
CV: CVN-Xr (blue), CVO-Xr (yellow), and CVω-Xr (green). For
the dynamical CV: CVE0-Com (blue), CVE1-Com (red), CVM-Com (magenta),
CVO-Com (yellow), and CVω-Com (green). For the TAMD_ωN
communities: Ends_Oc (blue), Ends_1c (red), Middle_c (magenta), Opposite_c
(yellow), and ω_c (green).
Conformational Clustering of the Conformational Landscape
The existence of α helices and β strands
has been monitored along the MD and TAMD trajectories (see Table S4 in the Supporting Information). Most
of the secondary structure elements are present more than 80% of the
time, with the exception of five β-strands, which are destabilized
in the MD as well as in the TAMD trajectories. Thus, the folded structure
of VanA is not specifically altered by the use of TAMD, as
already noted in section .
The 180 000 frames of VanA
generated either along the MD or TAMD trajectories were subjected
to a SOM clustering.[46,61] The analysis of SOM permits one
to determine six clusters of conformations (see Figure ). For each cluster, the average VanA conformation
has been drawn in tube representation, where the tube width and color
depend on the conformational local variability (root-mean-square fluctuation
(RMSF), Å) within the cluster. The color varies from blue (RMSF
close to 1 Å) to red, corresponding to the maximal fluctuation
in a given cluster (e.g., cluster 1, 13 Å; cluster 2, 13.3 Å;
cluster 3, 15.7 Å; cluster 4, 7.9 Å; cluster 5, 8.0 Å;
cluster 6, 8.4 Å). A permanent feature of the entire conformational
landscape of VanA is the large internal mobility of the ω-loop.
This is consistent with the simulated apo form of VanA: the tendency
of the ω-loop to open is expected to play an important role in substrate
processing.
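The per-residue RMSF used to color and scale the cluster representations can be sketched as follows. This is a minimal illustration, assuming that the frames of a cluster have already been superimposed on a common reference; the function name and the toy coordinates are hypothetical.

```python
import math

def rmsf_per_residue(frames):
    """RMSF of each residue around its mean position.

    frames: list of frames, each a list of (x, y, z) coordinates,
    assumed to be already superimposed on a common reference.
    """
    n_frames = len(frames)
    n_res = len(frames[0])
    rmsf = []
    for r in range(n_res):
        # Mean position of residue r over all frames.
        mean = [sum(f[r][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that mean position.
        msd = sum(
            sum((f[r][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        rmsf.append(math.sqrt(msd))
    return rmsf

# Toy cluster: residue 0 is rigid, residue 1 moves by 4 angstroms.
frames = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (14.0, 0.0, 0.0)],
]
values = rmsf_per_residue(frames)
```

Applied cluster by cluster, such a profile singles out the most mobile residues of each cluster, as is done for the ω-loop in the text.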
Figure 4
Clustering of VanA conformations sampled along MD and TAMD trajectories,
using SOM. The root mean square deviation (RMSD) from the starting
conformation of the trajectories is shown in a purple–green heat map
(in Å). The conformation sets associated with the medoid of each
cluster are depicted in putty cartoons. On the cartoons, the root-mean-square
fluctuation (RMSF) of the backbone is represented by the width of
the main chain and by a blue–green–red color scale corresponding
to the RMSF values within the corresponding SOM cluster.
Cluster 4 contains the starting point of MD and
TAMD trajectories.
The average conformation of this cluster is characterized by three
regions displaying large local RMSF: the ω-loop, the opposite
domain, and three loops [residues I43–V48], [residues P71–H76],
[residues N83–H84].
A first series of clusters, represented
by clusters 1, 2, and 3,
displays significant opening of the ω-loop, with the loop being
the most open in clusters 1 and 3. In all of these clusters, the protein
internal mobility remains concentrated on the ω-loop (with maximal
RMSF values of 13 Å in cluster 1 and 15.7 Å in cluster 3)
and the other regions are much less mobile, except the opposite domain
(maximal RMSF value of 8.0 Å), the other maxima remaining ∼4–5
Å. Thus, after only 30 ns of simulation, the TAMD trajectories
have been able to reach conformations displaying a wide opening of
the ω-loop. These conformations are similar to the X-ray crystallographic
structures published for the TtDdl d-Ala:d-Ala ligase
(PDB ID: 2YZG).[47]
The second series of clusters,
represented by clusters
5 and 6, displays conformations with a semiopen or semiclosed ω-loop,
similar to the X-ray crystallographic structure of the d-Ala:d-Ala ligase in ref (47) (PDB ID: 2ZDG). The averaged conformations of clusters 5 and 6 display large mobility
of the ω-loop, as well as that of a few regions of the protein:
the opposite domain and the three loops previously detected in cluster
4: [residues I43–V48], [residues P71–H76], [residues
N83–H84].
The various trajectories explored the U-matrix
differently (see Figure ). The largest cluster,
cluster 4, was sampled by the different trajectories, but each one
sampled distinct areas. The MD trajectory explored mainly cluster
4, keeping the coordinate RMSD value as low as 2.5 Å, with respect
to the starting point (Figure ), and performing few incursions into cluster 6. This result
agrees with the previously recorded MD trajectories in the absence
of the disulfide bridge C52–C64.[46]
Figure 5
Detailed exploration of the SOM by each trajectory. The starting
points are shown in pink and the ending ones are shown in magenta.
The blue–green–red color scale represents the local
root-mean-square deviation (RMSD) of each structure from the starting
structure (values in Å).
Although all TAMD trajectories started from the
same conformation,
the different choices for the collective variables, as well as the
random evolution of MD simulations, induced distinct explorations
of the conformational space. In that respect, three main behaviors
were observed. The trajectories TAMD_ON and TAMD_5CV visited mainly
cluster 4, containing the starting conformation. The trajectories
TAMD_ωN and TAMD_OωN explored clusters 1, 2, and 3, corresponding
to the opening of the ω-loop. The trajectory TAMD_MN explored
clusters 5 and 6. Therefore, it seems that the geometric center of
the ω-loop is a required collective variable to obtain the loop
opening. Frames extracted from TAMD_ωN are plotted in Figure S2 in the Supporting Information and
reveal that, before the full opening, the ω-loop undergoes
a sideways movement.
Overall, the cluster analysis of MD and
TAMD trajectories provides
an exploration of several possible models for ω-loop mobility.
Indeed, protein conformations with a fully open loop are obtained along
with conformations displaying a mobile, closed ω-loop, corresponding
to several conformational states explored by apo VanA.
Kinetic Clustering of the VanA Conformational Space
The opening of the VanA binding cavity was monitored
by following the values of two angles defined between the centers
of mass of the entire
protein VanA (C), of the opposite domain (O), of the N-terminal region (N),
and of the ω-loop (ω) (Figure a). The values of the two angles were projected
on the U-matrix (see Figures b and 6c). An increased value of the first angle corresponds to an opening of the ω-loop,
while an increased value of the second corresponds to a displacement
of the opposite
domain away from the VanA structure core.
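An angle between centers of mass can be computed as in the following sketch. The choice of the vertex is an assumption made only for illustration: the angle is measured here at the protein center C, between a domain center (ω or O) and the N-terminal center N. The function names, masses, and coordinates are hypothetical.

```python
import math

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    )

def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    u = [a[k] - vertex[k] for k in range(3)]
    v = [b[k] - vertex[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))

# Toy geometry (assumed vertex at C, see the note above).
c_center = (0.0, 0.0, 0.0)
n_center = center_of_mass([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
omega_center = (1.0, 1.0, 0.0)
omega_angle = angle_deg(omega_center, c_center, n_center)
```

Evaluating such angles frame by frame and averaging them per SOM neuron would give maps of the kind projected on the U-matrix.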
Figure 6
(a) Tube representation
of VanA with the ω-loop in green
and the opposite domain in yellow. Their centers of mass are marked
with balls of the same color and are called ω and
O, respectively. The center of mass of the entire protein VanA is called C (shown
in red) and the center of mass of the N-terminal region is called N
(shown in blue). (b, c) Projections of the two angles on the SOM using
a purple–green heat map, one angle per panel (panels (b) and (c)). The angles
are expressed in
degrees.
Some of the structural clusters
previously determined from the
SOM analysis (Figure ) display homogeneous angle values while other clusters show much
more heterogeneous values (see Figures b and 6c).
Cluster 3,
which contains some of the most open conformations of
VanA (Figure ), is
very homogeneous. It exhibits the widest opening (∼55°)
for the opposite-domain angle (Figure c), while the ω-loop angle (Figure b) is shrunk
to a value of ∼52°, showing
the opposite domain moving apart, with respect to the protein core,
while the ω-loop is still closed. Unlike cluster 3, clusters
1 and 2, containing open ω-loops, display quite heterogeneous
angle values. The two angles are mostly mirrored, with large values of one
(green regions in Figure c) corresponding to small values of the other (violet regions in Figure b) and vice versa. This is
the sign of an anticorrelation between the ω-loop and opposite
domain displacements. Nevertheless, some regions of Figures b and 6c in clusters 1 and 2 display the same color, corresponding to simultaneous
shrinkage or expansion of the two protein domains. For the conformations
displaying the most closed ω-loop, sampled in clusters 4, 5,
and 6, both angles show little opening.
To analyze the kinetics of the conformational exchange in
VanA,
the protein conformations were clustered by the Louvain greedy algorithm,
taking into account the time order of the dynamic simulations, as
described in section 2. In that way, 15 individual
kinetic clusters were determined (see Figure ). The conformations populating each kinetic
cluster were sampled along the same trajectory, which is a sign that
the different TAMD trajectories explored various aspects of the conformational
kinetics. The division of the SOM according to the kinetic clusters (Figure ) displays patterns
quite similar to the ones observed for the projection of either angle on the SOM (see Figures b and 6c), which indicates
that the overall system kinetics is mainly determined by these angle
variations. However, the kinetic clustering brings additional information,
with respect to the conformational clustering performed by SOM. Indeed,
three clusters (1, 5, and 7) display nonconnected regions on the SOM,
respectively labeled 1 and 1′; 5, 5′, and 5″;
and 7, 7′, and 7″ in Figure , revealing a fast conformational
equilibrium between distinct conformational regions. The representative
conformations extracted from the nonconnected regions of each of the three
clusters display conformational variability in precise regions of
VanA, such as the L and ω loops and the opposite domain (O). Different
types of movements for these regions are observed within the three
clusters, as shown by the superimposed representative conformations
(Figure ).
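The idea behind the kinetic clustering can be illustrated with a short sketch that builds a transition graph from the time-ordered sequence of SOM neurons visited by a trajectory, and evaluates the modularity of a candidate partition. This is not the Louvain algorithm itself, only the objective function that Louvain optimizes greedily. The symmetric counting of transitions, the omission of self-transitions, and the toy sequence are assumptions made for illustration.

```python
from collections import Counter

def transition_counts(neuron_sequence):
    """Count transitions between different neurons in a time-ordered sequence.

    Transitions are treated as undirected, and frames that stay on the
    same neuron are ignored (both are simplifying assumptions).
    """
    counts = Counter()
    for a, b in zip(neuron_sequence, neuron_sequence[1:]):
        if a != b:
            counts[frozenset((a, b))] += 1
    return counts

def modularity(counts, cluster_of):
    """Newman modularity Q of a partition of the weighted transition graph."""
    total = sum(counts.values())  # total edge weight m
    strength = Counter()          # weighted degree of each neuron
    for edge, w in counts.items():
        for node in edge:
            strength[node] += w
    # Weight of the edges whose two ends lie in the same cluster.
    intra = sum(
        w for edge, w in counts.items()
        if len({cluster_of[n] for n in edge}) == 1
    )
    cluster_strength = Counter()
    for node, s in strength.items():
        cluster_strength[cluster_of[node]] += s
    return intra / total - sum(
        (s / (2 * total)) ** 2 for s in cluster_strength.values()
    )

# Toy trajectory: fast exchange within {0, 1} and within {2, 3}, one crossing.
sequence = [0, 1, 0, 1, 2, 3, 2, 3]
counts = transition_counts(sequence)
q_two = modularity(counts, {0: "A", 1: "A", 2: "B", 3: "B"})
q_one = modularity(counts, {0: "A", 1: "A", 2: "A", 3: "A"})
```

The partition that groups the rapidly exchanging neurons together has the higher modularity, which is why neurons that are distant on the SOM can still end up in the same kinetic cluster.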
Figure 7
Kinetic clustering
of the VanA conformation using the Louvain greedy
algorithm on the SOM neurons. A given color is associated with each
of the 15 obtained clusters. For the three clusters that include nonconnected
regions (1, 5, and 7), the disconnected regions are labeled, respectively,
as 1 and 1′, 5 to 5″, and 7 to 7″. The
representative conformations corresponding to each disconnected region
are drawn superimposed in cartoons.
A Path Describing the ω-Loop Opening
Starting from the kinetic clustering of the SOM and using a procedure
described in section , a path relating the conformations of VanA with closed and open
ω-loop has been traced on the U-matrix (see Figure a). The opening path starts
from the kinetic cluster 5′ (Figure ), passes through clusters 7′, 2,
and 3, and ends up in cluster 15. The conformations sampled along
this path correspond to a slight translational move of the ω-loop
(conformational cluster 2 in Figure ) and then to a rotation of the loop on the side (conformational
cluster 1 in Figure ). Note that the path through conformational clusters 2 and 1 presents
the advantage of permitting a large opening, which allows the substrates
to easily enter into the active site.
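Linking cluster medoids through a minimum spanning tree can be sketched as follows. The sketch uses Prim's algorithm on a pairwise distance matrix; whether the procedure followed here relies on this algorithm or on this distance is not stated in this section, so both are assumptions, and the labels and distances are invented for illustration.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix; returns (i, j) edges."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge leaving the current tree.
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

# Hypothetical medoids and pairwise distances (for example, RMSD in angstroms).
labels = ["A", "B", "C", "D"]
dist = [
    [0.0, 1.0, 4.0, 6.0],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 3.0],
    [6.0, 5.0, 3.0, 0.0],
]
tree = [(labels[i], labels[j]) for i, j in minimum_spanning_tree(dist)]
```

When the resulting tree is a simple chain, as in this toy case, it can be read directly as an ordered path between the closed and open ends.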
Figure 8
(a) Opening path traced on the U-matrix.
The medoids of each cluster,
labeled from A to F, are shown in red, and their minimum spanning link
is also shown in red. (b) GBSA score (in kcal/mol) for the molecules involved
in the enzymatic reaction: the substrates d-Ala, d-Ala(P), and d-Lac; the homologue of the reaction intermediate, PHY;
the product of the enzymatic reaction, d-Ala-d-Lac;
and an allosteric inhibitor.[65] The GBSA
score is plotted along the conformations labeled from 0 to 60, extracted
from the opening path.
Since the opening of the VanA binding site is directly related
to the protein function, we analyzed the path with respect to the
interaction of VanA with the substrates, inhibitors, and reaction
intermediate. The relative importance of hydrogen bonds within VanA
along the path was then statistically evaluated and connected to
experimental observations.
Several ligands (d-Ala, d-Ala(P), d-Lac, PHY, d-Ala-d-Lac,
and an allosteric inhibitor[65]) were docked
into the VanA conformations extracted
from the path, and the poses were scored using the GBSA interaction energy
(Figure b),[75,76] according to the procedure described in section . The score profile displayed by the allosteric
inhibitor (green curve in Figure b) is quite negative and constant. Similarly, the score
profile of d-Ala (red curve in Figure b) is also negative and does not display
much variation along the path, which is in agreement with the fact
that d-Ala is not specific to VanA, but rather binds to all
proteins of the d-Ala:d-Lac ligase family. In contrast,
the other ligands (d-Ala(P), d-Lac, PHY, and d-Ala-d-Lac) all display profiles that become mostly
negative in cluster E of the path, after the ω-loop opening
(see Figure b). Before
this opening, the reaction product d-Ala-d-Lac (orange
curve in Figure b)
displays repulsion for VanA, which agrees with the release of the
product after reaction. The reaction intermediate, PHY, displays
a behavior similar to that of the other compounds.
Six conformations,
labeled A to F, were picked in each of the
kinetic clusters crossed by the path (Figure a). On these conformations, a Random Forest
approach, described in section , was used to determine the relative importance of hydrogen
bonds for the kinetic cluster prediction (Figure ). The most important hydrogen bonds are
mainly located in the N-terminal domain, in the opposite domain, and
in the ω-loop, which reflects the displacements of these domains
described above. In addition, some important hydrogen bonds are observed
in the C-terminal region.
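The notion of hydrogen-bond importance for predicting the kinetic cluster can be illustrated with the impurity decrease obtained by splitting the conformations on a single bond. This is a deliberately simplified sketch: it computes the Gini decrease of one split per feature, the quantity that a Random Forest accumulates over many trees, and is not a Random Forest. The bond names, presence flags, and cluster labels are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_decrease(feature, labels):
    """Impurity decrease obtained by splitting on one binary feature."""
    present = [y for x, y in zip(feature, labels) if x]
    absent = [y for x, y in zip(feature, labels) if not x]
    n = len(labels)
    weighted = (len(present) * gini(present) + len(absent) * gini(absent)) / n
    return gini(labels) - weighted

# Hypothetical data: kinetic cluster of four conformations and the
# presence (1) or absence (0) of two hydrogen bonds in each of them.
clusters = ["C", "C", "E", "E"]
hbonds = {
    "hb_x": [1, 1, 0, 0],  # tracks the cluster exactly
    "hb_y": [1, 0, 1, 0],  # unrelated to the cluster
}
importance = {name: gini_decrease(f, clusters) for name, f in hbonds.items()}
```

A bond whose formation or breaking coincides with the change of kinetic cluster receives a high score, whereas a bond that fluctuates independently of the cluster receives a score of zero.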
Figure 9
Most important hydrogen bonds for the prediction
of the kinetic
cluster along the opening path. The protein structure is displayed
in trace, with the Opposite domain (residues [149–208]) colored
orange and the ω-loop (residues [236–256]) colored green.
The hydrogen bonds within the ω-loop and the opposite domain
are colored cyan, and the hydrogen bonds between these protein domains
and other protein regions are colored red. The other hydrogen bonds
are gray.
The hydrogen bonds connecting
residues from different regions have
been colored red in Figure . From this outline, the breaking of interactions between
protein domains can be followed along the opening path in order to
give a description of the kinetic events. The two interactions E250–K22
(between the ω-loop and the N-terminal region) and E207–Y137 (between
the opposite domain and the N-terminal region) are broken in the protein
conformation labeled C (Figure ). On the other hand, hydrogen bonds E207–Y137, K203–D132,
R174–D105, and, to a lesser extent, R174–E104 are formed
in the two conformations E and F at the end of the path. The change
from the first set of hydrogen bonds to the second set gives a description
of the opening involving only a few residues, and can be compared to
the patterns of experimental mutations observed for VanA.
The
E250A mutation induces a slight decrease in experimental catalytic
efficiency,[78] which would agree with the
importance of the E250–K22 interaction along the opening of
the ω-loop. The limited decrease observed experimentally
could arise from a possible reorganization of the VanA structure,
which would be due to the presence of residues compensating for the
mutation effect. Besides, in the X-ray crystallographic structure
of VanA,[45] it was observed that the residues
E15, S177, and H244 are involved in a network of hydrogen bonds preventing
the entrance of water molecules that could impair the catalytic reaction
by hydrolyzing the ligands. The residues K22 and E250, detected in
the present analysis, are located, respectively, in the vicinity of
E15 and H244, and could play a similar role.
The analysis of
the trajectories in the framework of graph theory has
permitted the determination of an opening path of VanA, allowing the
entrance of substrates in the binding site. The path found agrees
with the interaction energy profiles observed for various VanA ligands.
The relative importance of hydrogen bonds is supported by some experimental
observations.
Discussion
The d-Ala:d-Lac ligase VanA was analyzed by molecular
modeling and various algorithmic tools, in order to obtain a phenomenological
description of the protein internal dynamics and conformational landscape,
based on graph models.
The comparison of MD and TAMD trajectories
reveals the efficiency
of TAMD to perform enhanced sampling of the protein conformational
space. As expected, the regions of conformational space explored during
TAMD trajectories are closely dependent on the collective variables
used. In particular, the opening of the ω-loop seems to be favored
if a geometric center of the ω-loop is included in the collective
variables. The exploration of the conformational landscape has permitted
us to describe two different modes of ω-loop opening: in one
mode, ω opens through a translation, whereas in the other, a
translation of ω is followed by a rotation.
The partial
opening of the ω-loop has been previously[46] observed spontaneously in MD trajectories in
the presence of the crystallographic disulfide bridge C52–C64.[45] The motion of the opposite domain, closely related
to the opening of the active site, was also observed in these MD trajectories.
One should note that Roper et al.[45] mentioned
that this disulfide bridge was unexpected, because VanA is a bacterial
intracellular enzyme that should operate in a reducing environment
incompatible with the formation of the bridge. The enhanced sampling
approach taken here made it possible to observe the opening in the
absence of the disulfide bridge. The dynamic features observed along
the opening path, such as the mobility of the opposite domain, are similar
to the observations previously made[46] in
the presence of the disulfide bridge.
The protein internal dynamics
along the opening of the active site
seems to be closely related to the relative mobility of the ω-loop
and of the opposite domain, as shown by the conformational clustering
(Figure ), by the
importance of the two angles (Figure ) in describing the protein kinetics (Figure ), and by the analysis
of hydrogen
bonds along the opening path (Figures and 9).
MD and TAMD trajectories
of d-Ala:d-Lac ligase
VanA have been analyzed using various algorithms. Graph models describe
the protein architecture and behavior in the conformational landscape,
as well as along the conformational change related to the opening
of the ω-loop.
The contact communities detected by analysis
of the contacts along
the trajectories display a pattern of connections relating the ω-loop
to the middle domain, which acts as a hub to establish connection
to the opposite and the N- and C-terminal domains. This pattern is
conserved in most of the trajectories, whereas contrasting internal
dynamics are observed in these protein regions over the conformational
space (Figure ). Indeed,
the ω-loop is always quite mobile whereas other protein regions
display large (clusters 5 and 6) to small (clusters 1, 2 and 3) internal
mobility (Figure ).
The various graphs obtained on the contact communities, or on the
SOM, display characteristics similar to those observed in other bioinformatics
graphs obtained in different contexts, for example, the hub, Middle-Com,
observed in the graph of contact communities (Figure ). Such hubs have also been observed in protein–protein
interaction networks.[79] The graph of hydrogen
bonds along the opening path reveals that all residues establishing
discriminating hydrogen bonds are connected to fewer than four other residues
(Figure ), a “small world” property
also encountered in chemo-informatics
networks based on the ligand-set similarities.[80]
Several approaches have been proposed in the literature
to describe
the conformational space of proteins as a graph of local minima. The
analysis performed in ref (22) is based on Principal Component Analysis (PCA) of protein
motion. However, the PCA-based analysis detects only linear correlations,
whereas SOM can capture nonlinear correlations. The method proposed
here is related to the Conformational Space Network (CSN), which was
proposed by Yin et al.[21] However, these
authors used discrete structural classes to cluster conformations. Similarly,
in ref (20), the structures
were clustered using an all-atom RMSD cutoff of 2.0 Å. In the
present paper, we defined the so-called microstates as the elements
of the SOM grid. This avoids having to define arbitrary structural
classes to cluster the conformations. In addition, from an analysis
of conformational transitions between SOM neurons, a method to detect
the kinetic clusters is proposed, which reveals fast conformational
exchange.
The graphs proposed here could be used in a systematic
way in proteins
for which structural information can be obtained, in order to insert
these protein structural graphs into larger graphs such as the ones observed
in protein–protein interaction networks. Such model stacking
would make it possible to relate phenotypic information directly to physicochemical
interactions at the atomic level.
In the case of VanA, the graphs
provide a model of the open/closed
motion of the ω-loop, allowing one to synthesize
various pieces of information. The influence of specific residues and/or
conformations in such graphs provides candidates for directed mutagenesis
studies.
The MD and TAMD trajectories allow an exploration
of the VanA
conformational space, which leads to the observation of the ω-loop
opening. As the closed loop blocks the entrance of the active site,
understanding the way the loop opens gives a qualitative view
of the kinetics of the VanA enzymatic function. In the enhanced sampling
approach, the time scale of opening events observed along TAMD trajectories
is biased and cannot be used to give quantitative information on the
opening kinetics. However, the conformations extracted
along the opening path of the ω-loop can be used for docking
purposes. Indeed, during the ω-loop opening, the entire architecture
of the VanA structure, as well as the active site geometry, changes.
Docking ligands on the active site pocket modified by the ω-loop
opening would block this site into an inactive conformation and would
orient the docking prediction toward effective inhibitors of the VanA
function. The protein conformations sampled during the opening path
are available from the authors upon request.
Conclusion
The d-Ala:d-Lac ligase VanA has been exhaustively
investigated by molecular dynamics and enhanced sampling simulations,
in order to propose outlines of (i) protein architecture and (ii)
protein conformational landscape. These two types of analyses have
been conducted in parallel and give consistent results. The conformational
landscape of VanA is characterized by a large mobility of the ω-loop,
which displays different translational and rotational motions, with
respect to the remaining part of the protein. This conformational
view of the landscape is completed by a slightly different kinetic
view, which fully agrees with an angular description of the relative
mobility of the opposite domain and ω-loop. The importance of
the relative motions of the opposite domain and ω-loop is further
reinforced by the contact communities analysis of the protein structure,
showing a large influence between these two regions. Overall, the
numerical and statistical tools used here provide parallel descriptions
of the protein structure and of the protein conformational landscape,
which are in global agreement.