Junchao Xia1, Ronald M Levy. 1. Department of Chemistry and Chemical Biology, Rutgers, the State University of New Jersey , 610 Taylor Road, Piscataway, New Jersey 08854, United States.
Abstract
The Crk adaptor proteins play a central role as a molecular timer for the formation of protein complexes including various growth and differentiation factors. The loss of regulation of Crk results in many kinds of cancers. A self-regulatory mechanism for Crk was recently proposed, which involves domain-domain rearrangement. It is initiated by a cis-trans isomerization of a specific proline residue (Pro238 in chicken Crk II) and can be accelerated by Cyclophilin A. To understand how the proline switch controls the autoinhibition at the molecular level, we performed large-scale molecular dynamics and metadynamics simulations in the context of short peptides and multidomain constructs of chicken Crk II. We found that the equilibrium and kinetic properties of the macrostates are regulated not only by the local environments of specified prolines but also by the global organization of multiple domains. We observe the two macrostates (cis closed/autoinhibited and trans open/uninhibited) consistent with NMR experiments and predict barriers. We also propose an intermediate state, the trans closed state, which interestingly was reported to be a prevalent state in human Crk II. The existence of this macrostate suggests that the rate of switching off the autoinhibition by Cyp A may be limited by the relaxation rate of this intermediate state.
Proline cis–trans
isomerization describes two distinct states
(0° for cis and ±180° for trans; see
Figure 1) of the backbone dihedral angle ω
(defined by Cα–C–N–Cα) found in
X-Pro peptide bonds. Proline isomerization is one important way to achieve
large conformational changes and reach various macrostates of multidomain
proteins without modifying the covalent structure.[1−4] Conformational changes resulting
from proline switching are crucial for controlling protein activity in
many biological processes, including cell signaling,[5−8] neurodegeneration,[9] channel gating,[10] gene regulation,[11] and others.[12−16]
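The definition of ω above can be made concrete with a short calculation. The following is a minimal sketch in plain Python, assuming idealized planar coordinates rather than real Crk atoms; the function names and the 90° cis/trans cutoff are illustrative choices, not taken from the paper.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def omega_deg(ca_prev, c_prev, n_pro, ca_pro):
    """Dihedral angle (degrees) for the atom sequence Ca(i-1)-C(i-1)-N(i)-Ca(i)."""
    b0 = _sub(ca_prev, c_prev)
    b1 = _sub(n_pro, c_prev)
    b2 = _sub(ca_pro, n_pro)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the C-N bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def classify(omega, cutoff=90.0):
    """Label a peptide bond as cis (|omega| < cutoff) or trans; the cutoff is an assumed convention."""
    return "cis" if abs(omega) < cutoff else "trans"

# Idealized planar geometries (arbitrary units), not real Crk coordinates.
cis_angle = omega_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans_angle = omega_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
print(classify(cis_angle), classify(trans_angle))
```

Only the magnitude of the angle matters for the cis/trans label, so the sign convention of the dihedral does not affect the classification.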
Figure 1
Domain organization and proline switch of chicken Crk
II protein.
(a) Schematic diagram of the domain arrangement of Crk. Pro238 can
behave as a regulation switch through the cis–trans isomerization.
Tyr222 can be phosphorylated by the enzyme Abl. l-SH3C and
CrkSLS represent the one-domain (residue 220 to 297) and
two-domain (residue 135 to 297) systems respectively studied in this
paper. (b) cis–trans isomerization about the prolyl Gly237-Pro238
bond (a case of Xxx-Pro peptide). The corresponding dihedral angle
ω is defined by four atoms (Cα237–C237–N238–Cα238
as shown in the red line).
Many experimental studies[17−21] and theoretical investigations[22−33] on short-peptides containing only one proline residue show that
the trans state population predominates (>95%) and the free energy
barriers to rotation about the torsion angle ω are relatively
large (∼20 kcal/mol), although the populations and the barriers
can be adjusted due to different substitutions and side chain effects.[18,34−38] The well-known preference for the trans conformation has also been
found through statistical analysis of proline-containing proteins
from the Protein Data Bank,[30,39,40] and has been attributed to the steric effects[18] of ring atoms, although other contributions including electronic
effects[20] may exist. Recent results also
revealed that the equilibrium and exchange rates between macrostates
vary considerably from short polyproline peptides to large proteins
due to changes in the local and global environments.[5,6,41−47] For example, a recent ion mobility-mass spectrometry (IM-MS) study
found that the nonapeptide bradykinin (containing three proline residues
at positions 2, 3, and 7) adopts up to 10 metastable states depending
on the solution composition, and that the multiple structures are associated
with different combinations of cis and trans states of the three
proline residues.[44,45] Förster resonance energy
transfer (FRET) experiments on polyprolines with 1–10 residues
have also revealed structural heterogeneity with subpopulations that
do not interconvert on time scales from nano- to milliseconds.[46,47] For large proteins, although statistical surveys of X-ray structures
of nonredundant chains from the Protein Data Bank found that around
95% are in the trans configuration,[30,39,40] recent studies show that the population of the cis
state can be dramatically increased for some systems such as staphylococcal
nuclease,[48] 5-HT3 receptor,[10,22] and Crk adaptor proteins.[5,6]

The Crk family
of adaptor proteins is believed to act as a molecular
bridge to form protein complexes by recruiting downstream effectors
to upstream phosphorylated tyrosine motifs.[49,50] Crk proteins are expressed in most tissues and mediate timely formation
of protein complexes including various growth and differentiation
factors.[51,52] Crk proteins are commonly overexpressed in
many human cancers.[53−56] Crk II is one of the five types of Crk adaptor proteins and consists
of three domains (Figure 1):[56,57] a single Src homology 2 (SH2) domain, an N-terminal Src homology
3 (SH3N) domain, and a C-terminal Src homology 3 (SH3C) domain. Between the SH3N and SH3C domains
there is an approximately 50-residue long linker containing a specific
tyrosine residue (Tyr222 in chicken Crk II) that can be phosphorylated
by Abl kinase.[58] The SH2 domain binds
phosphorylated tyrosine motifs with a consensus
sequence of pTyr-x-x-Pro.[59,60] The SH3N domain binds proline-rich motifs of the polyproline II (PPII) subtype
with the consensus Pro-x-x-Pro-x-(Lys, Arg) (e.g., Abl kinase).[61] The SH3C domain, in contrast, does not bind
to these canonical PPII motifs due to the lack of aromatic residues
at the binding surface.[62] Nevertheless, recent
NMR experiments (more details below) show that the SH3C domain plays a critical role in a new autoinhibition mechanism.[5,6]

The regulation of Crk II is achieved through an autoinhibition
mechanism by reorganizing the arrangement of the three Crk II domains.[56,57] Early studies suggested that the regulation of Crk is accomplished
by phosphorylation,[63] namely, that Abl
kinase recruited by the SH3N domain can phosphorylate Tyr222
in the linker and promote the formation of an intramolecular interaction
between the SH2 domain and Tyr222, which further decreases the availability
of the SH2 domain for other tyrosine-phosphorylated binding partners.
In contrast, recent findings showed that the unphosphorylated human Crk II adopts a compact structure in which protein binding to the SH3N domain is partially occluded by the linker region and autoinhibition
is achieved.[64] Recently a new autoregulatory
mechanism has been proposed based on a study of the one-domain and
two-domain subsystems of chicken Crk II, recognizing the importance
of the SH3C domain.[5,6] The cis–trans
transition of a proline (Pro238) results in two distinct macrostates:
one is an autoinhibited (closed, 90%) cis state in which the canonical
binding site of SH3N favors hydrophobic contacts with the
SH3C domain, and the other is an uninhibited (open, 10%)
trans state in which no interaction exists between the two SH3 domains,
and the SH3N is accessible for binding to other molecules
(e.g., Abl kinase). In other words, the SH3C domain is able to
block the binding of other ligands to the SH3N domain by
reorganizing itself and forming hydrophobic contacts with
the SH3N domain.[5,6] NMR experiments from different
groups also revealed that this autoregulatory mechanism might not
be conserved across species.[5,6,64−66]

Due to the high barrier (∼20 kcal/mol) and the long time scale (>seconds) of proline cis–trans switching, it is
difficult to observe the exchange dynamics directly from experiments;
furthermore, the transition paths by which Crk switches between the
autoinhibited and uninhibited states are unknown. Some structural
and dynamic information about the individual macrostates can be obtained
from chemical shifts and fast NMR relaxation experiments.[5,6,64−66]

Computational
methods are a complementary way to find the missing
components of the autoregulatory mechanisms, but standard molecular
dynamics (MD) simulations using high performance resources can only
reach time scales on the order of microseconds.[67] For simulations on the order of milliseconds and beyond,
we need to resort to more advanced sampling methods. Several MD studies
using accelerated methods[23,24,26,27,29,31,42,43,68,69] have examined the proline cis–trans isomerization
in short peptides. Metadynamics[70,71] is a recently developed
technique for accelerating simulations using a time-dependent biasing
potential acting on certain prechosen chemical reaction coordinates.
Metadynamics simulations have also been performed for short peptide
fragments of the 5-HT3 receptor[22] and of the HIV-1 capsid protein (CA) and its complex with Cyclophilin
A (Cyp A),[72] although few simulations of
large proteins have been reported.

Due to the multidomain structures
of Crk proteins, extending previous
results from short peptides to large systems is challenging since
no domain–domain interactions were considered in those studies.
To understand the role of the proline switch on large Crk signaling
proteins at the molecular level, we performed extensive standard MD
and metadynamics simulations for short proline-containing peptides
and the large protein systems related to Crk II proteins. Our simulations
suggest the existence of a new intermediate state in which Pro238
has switched “on”, but Crk remains in the autoinhibited
“off” conformation due to the existence of stabilizing
hydrophobic contacts between the N- and C-terminal SH3 domains. We
speculate on the possible effects of this new intermediate on the
kinetics of autoinhibition.
Materials and Methods
Molecular Dynamics Simulations
All MD and metadynamics
simulations of short peptides and Crk proteins were performed using
the molecular simulation package GROMACS,[73,74] an improved version of the AMBER 99SB[75] force
field from the Shaw group (AMBER99SB-ILDN in GROMACS), and the SPC/E[76] explicit solvent model, at the experimental
temperature of 298.15 K. The initial conformations for the simulations
were taken from the NMR structures of Crk proteins in the PDB and
were capped by acetyl (ACE) and N-methyl (NME) groups:
the fragment Gly237-Pro238-Phe239 of PDB 2L3P for ACE-GPF-NME, ACE-GP-NME, and ACE-P-NME;
PDB 2L3P for
the cis linker-SH3C; PDB 2L3Q for the trans linker-SH3C;
and PDB 2L3S for the SH3N-linker-SH3C. These systems (in
their NMR conformations) were solvated in a truncated octahedron of
SPC/E water molecules with a shortest distance from the solute to the box surfaces
of 1.5 nm (for the three short peptides) and 2.5 nm (for the one-domain and
two-domain Crk proteins). All production simulations
were run under periodic boundary conditions in the constant volume, temperature,
and number of particles (NVT) ensemble. Before the production runs,
sequential equilibration stages in the NVT (1.0 ns), NPT (1.0 ns),
and three further NVT (1.0 ns each) ensembles were performed to bring the
systems to the desired temperatures and volumes. During the first
two equilibration stages, position restraints with force
constants of 1000 kJ/(mol·nm^2) were applied to the
solute. The position restraints were released gradually over the last
three NVT equilibration stages, with force constants of 100, 10,
and 1 kJ/(mol·nm^2). The integration time step was 0.002
ps. In all simulations, the LINCS algorithm was used to constrain bonds
involving hydrogen. The nonbonded cutoff for evaluating electrostatic
and van der Waals forces was set to 1.0 nm. To deal with long-range
electrostatic interactions, the PME algorithm was used with the default
settings, including a grid spacing of 0.12 nm.
Metadynamics Simulations
Metadynamics[70,71] drives
the system out of a local free energy minimum and lets it explore
other free energy minima by continuously adding a history-dependent
biasing potential that forces the dynamics to explore conformations
that were not previously visited. Compared with the equations of motion
in standard MD simulations, there are additional forces acting on the
atoms, derived from the history-dependent biasing potential,
which is accumulated as a sum of Gaussian functions,

$$V_G(s(x), t) = w \sum_{t' = \tau_G, 2\tau_G, \ldots;\; t' < t} \exp\left( -\sum_{j=1}^{M} \frac{[s_j(x) - s_j(x(t'))]^2}{2\delta_j^2} \right) \qquad (1)$$

where s_j(x) is the jth of M chemical reaction
coordinates (collective variables) constructed
from the atomistic coordinates x. The constants w and δ_j define the height
and width of the Gaussian functions added at constant time intervals
of τG. Well-tempered metadynamics[77] is one of many improved versions of metadynamics, in which
the constant height of the Gaussian functions is replaced
by a time-dependent one. In equation 1, w then becomes

$$w = w_0 \exp\left( -\frac{V_G(s(x), t)}{k_B \Delta T} \right) \qquad (2)$$

where w0 is the
initial height of the Gaussian functions and kBΔT defines a characteristic energy. Over a long
time, the free energy can be estimated from the converged biasing
potential as

$$F(s) = -\frac{T + \Delta T}{\Delta T} V_G(s, t \to \infty) \qquad (3)$$

where the scaling factor B = (T + ΔT)/T is the ratio
of the fictitious higher temperature T + ΔT to the normal temperature T, so that (T + ΔT)/ΔT = B/(B − 1).

The
PLUMED[78] (version 1.3) package, which implements metadynamics, was patched into
GROMACS 4.5.5. Our well-tempered metadynamics simulations were started
after the five corresponding standard MD simulations had finished, using
an initial Gaussian height w0 = 0.4, a scaling factor B = 10, and a Gaussian deposition
interval τG = 1 ps. The Gaussian width δ was 0.02
nm for distance variables and 0.1 rad for dihedral variables. These values
were chosen so as to reach a balanced trade-off between a fast exploration
of the conformations and good accuracy in the reconstructed free energy,
and are close to the parameters chosen for other protein systems.[72,79]
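The well-tempered deposition rule can be illustrated on a single collective variable. The sketch below is a toy model, not the PLUMED implementation: it assumes that the scaling factor B equals (T + ΔT)/T, that energies are in kJ/mol (the source does not state the unit of w0), and it ignores the periodicity of a dihedral variable. All function names are hypothetical.

```python
import math

def bias_potential(s, centers, heights, delta):
    """History-dependent bias at CV value s: a sum of the Gaussians deposited so far."""
    return sum(w * math.exp(-((s - c) ** 2) / (2.0 * delta ** 2))
               for c, w in zip(centers, heights))

def deposit_gaussian(s_now, centers, heights, w0, delta, kT, bias_factor):
    """Deposit one well-tempered Gaussian at s_now and return its height."""
    # Assumed relation: bias_factor B = (T + dT)/T, so k_B*dT = (B - 1) * k_B*T.
    k_delta_t = (bias_factor - 1.0) * kT
    height = w0 * math.exp(-bias_potential(s_now, centers, heights, delta) / k_delta_t)
    centers.append(s_now)
    heights.append(height)
    return height

def free_energy_estimate(s, centers, heights, delta, bias_factor):
    """Well-tempered estimate F(s) = -(B / (B - 1)) * V_G(s), up to an additive constant."""
    return -bias_factor / (bias_factor - 1.0) * bias_potential(s, centers, heights, delta)

# Toy run: a CV stuck at s = 0 rad, so every Gaussian lands on the same spot.
kT = 0.0083144626 * 298.15  # kJ/mol at 298.15 K (energy unit is an assumption)
centers, heights = [], []
for _ in range(200):
    deposit_gaussian(0.0, centers, heights, w0=0.4, delta=0.1, kT=kT, bias_factor=10.0)
print(heights[0], heights[-1])
```

Because each new Gaussian is scaled down by the bias already accumulated at that point, the deposited heights shrink as a basin fills, which is what lets the bias converge instead of growing without bound.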
Umbrella Sampling Corrections
The accuracy of the estimated
free energy landscapes can be improved further by additional umbrella
sampling using the free energy surface constructed from the metadynamics
simulation as a biasing potential.[80] Corrections
to the free energy were obtained from the probability distributions P(s) of
the biased molecular dynamics trajectories as

$$\Delta F(s) = -k_B T \ln P(s) \qquad (4)$$

where kB is Boltzmann’s
constant and T is the temperature. The final free energy
landscapes become

$$F(s) = F_{meta}(s) + \Delta F(s) \qquad (5)$$

where F_meta(s) is the free energy estimated from the metadynamics simulation.

In this report we used metadynamics
simulations to obtain low-resolution approximations of the potentials
of mean force (PMFs) and applied them as time-independent biasing
potentials in umbrella sampling simulations to obtain fine-resolution
PMFs. The umbrella sampling simulations allow some local
relaxation along degrees of freedom other than the predefined
reaction coordinates and permit equilibration of the canonical
ensembles. The simulation times for umbrella sampling were chosen
such that the populations integrated over all major basins
no longer changed significantly.
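The correction step can be sketched for a binned one-dimensional PMF. This is a minimal illustration that assumes the umbrella run is biased by the negative of the metadynamics free energy, so that the corrected profile is the metadynamics estimate minus kT ln P per bin; the three-bin data are hypothetical and the function name is illustrative.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def corrected_pmf(f_meta, counts, temperature=298.15):
    """Combine a metadynamics PMF with the histogram of a run biased by -f_meta.

    f_meta[k]: metadynamics free energy in bin k (kcal/mol)
    counts[k]: number of umbrella-sampling snapshots in bin k
    Returns F[k] = f_meta[k] - kT ln P[k], shifted so the minimum is zero;
    bins that were never visited are returned as None.
    """
    kT = KB_KCAL * temperature
    total = float(sum(counts))
    raw = [f - kT * math.log(c / total) if c > 0 else None
           for f, c in zip(f_meta, counts)]
    f_min = min(v for v in raw if v is not None)
    return [None if v is None else v - f_min for v in raw]

# Hypothetical three-bin example: the biased run visits bin 1 twice as often
# as the others, so the metadynamics estimate there was too high by kT ln 2.
f_meta = [0.0, 3.0, 1.0]
pmf = corrected_pmf(f_meta, counts=[1000, 2000, 1000])
print(pmf)
```

If the metadynamics estimate were exact, the biased histogram would be flat and the correction would reduce to a constant shift.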
Definition of Chemical Reaction Coordinates
The center
of mass calculations involved only the Cα atoms of the corresponding
groups, including the linker, SH3N, and SH3C domains.
The peptide ω dihedral angle was calculated from the four atoms
Cα–C–N–Cα in the two neighboring
residues.

The number of hydrophobic contacts between two groups
was computed as

$$N_{hc} = \sum_i \sum_j \frac{1 - (r_{ij}/r_0)^n}{1 - (r_{ij}/r_0)^m} \qquad (6)$$

The summations over i and j run
through all nonpolar side-chain carbon atoms belonging to the two different
groups. r_ij = |r_i − r_j| − d0 involves the distance between the ith and jth atoms. r0 and d0 are two preselected distances (0.4 and 0.2
nm in this paper) defining the switching function when combined with
the other two constants (n and m), whose values are 6 and 12, respectively, in our calculations.

Similarly, the number of intramolecular hydrogen bonds between
a group of donors and a group of acceptors is defined as

$$N_{hb} = \sum_i \sum_j \frac{1 - (d_{ij}/r_0)^n}{1 - (d_{ij}/r_0)^m} \qquad (7)$$

where i and j are O and H atoms belonging to two different groups, respectively. d_ij = |r_i − r_j| is the distance between the ith and jth atoms. r0 (= 0.27 nm) is the preselected distance, and n (= 6) and m (= 12) are two constants that determine
the steepness of the switching function. The values defining the
chemical coordinates are similar to those used in previous work by other groups.[72,81]
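The contact and hydrogen-bond counts both rest on a smooth switching function of the interatomic distance. The sketch below assumes the rational form (1 − x^n)/(1 − x^m) with x = (r − d0)/r0, and assumes that the function is set to 1 for r ≤ d0; the coordinates are made up and the function names are illustrative.

```python
import math

def switching(r, r0, d0=0.0, n=6, m=12):
    """Rational switching function: close to 1 for short distances, decaying to 0."""
    x = (r - d0) / r0
    if x <= 0.0:
        return 1.0              # assumed behavior for r <= d0
    if abs(x - 1.0) < 1e-9:
        return n / m            # analytic limit of the 0/0 form at x = 1
    return (1.0 - x ** n) / (1.0 - x ** m)

def contact_number(coords_a, coords_b, r0=0.4, d0=0.2, n=6, m=12):
    """Sum of switching-function values over all atom pairs between two groups (nm)."""
    return sum(switching(math.dist(a, b), r0, d0, n, m)
               for a in coords_a for b in coords_b)

# Hypothetical coordinates in nm: one close pair and one distant pair.
group_a = [(0.0, 0.0, 0.0)]
group_b = [(0.1, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(contact_number(group_a, group_b))
```

With r0 = 0.27 nm and d0 = 0 the same function gives the hydrogen-bond count described in the text. A smooth function is used rather than a hard cutoff so that the collective variable has continuous derivatives, which the biasing forces require.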
Reweighting 2D PMFs for 1D PMFs
In principle, the free
energy surfaces (or potentials of mean force) for some variables other
than the preselected chemical reaction coordinates for the biasing
potentials can be reconstructed from the well-tempered metadynamics.[82] However, this requires many sampling snapshots
between two consecutive updates of the biasing potentials. In this
paper, we obtained the 1D PMF (in s) from the 2D
PMFs (in x, y) in an alternative
way, by reweighting the biased MD trajectory from the umbrella sampling correction
(a time series of (s, x, y)) in a manner similar to WHAM.[72,83] The unbiased probability distribution can be calculated as

$$P^0_k(s) = \frac{\sum_{i \in k} \exp[\beta V_B(x_i, y_i)]}{\sum_{i} \exp[\beta V_B(x_i, y_i)]} \qquad (8)$$

where
β = 1/kBT, i in the numerator runs over all snapshots
with s values in the range of
bin k, the sum in the denominator runs over all snapshots, and VB(x, y) is the 2D biasing potential at (x, y). The superscript 0 denotes the unbiased (real) distribution,
in contrast to 1 for the biased one. Finally, the reweighted free energy
can be obtained using eq 4 from the unbiased
probability distribution.
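The reweighting procedure can be sketched as a weighted histogram in which each snapshot carries the weight exp(βV_B). This is an illustrative implementation under that assumption; the trajectory below is synthetic and the function names are hypothetical.

```python
import math

def reweight_1d(s_values, bias_values, bin_edges, kT):
    """Unbiased probability per bin in s from a trajectory biased by V_B(x, y).

    s_values[i]:    value of the new variable s at snapshot i
    bias_values[i]: biasing potential V_B(x_i, y_i) felt at snapshot i
    """
    nbins = len(bin_edges) - 1
    weights = [0.0] * nbins
    v_max = max(bias_values)  # constant shift for numerical stability; cancels on normalization
    for s, v in zip(s_values, bias_values):
        for k in range(nbins):
            if bin_edges[k] <= s < bin_edges[k + 1]:
                weights[k] += math.exp((v - v_max) / kT)
                break
    total = sum(weights)
    return [w / total for w in weights]

def pmf_from_probabilities(probabilities, kT):
    """F_k = -kT ln P_k, shifted so that the lowest bin is zero (empty bins -> None)."""
    raw = [-kT * math.log(p) if p > 0 else None for p in probabilities]
    f_min = min(v for v in raw if v is not None)
    return [None if v is None else v - f_min for v in raw]

# Synthetic example: equal biased counts in two bins, but the second bin
# carries a bias of kT ln 3, so its unbiased weight is three times larger.
kT = 0.0019872041 * 298.15  # kcal/mol
s_traj = [0.5] * 4 + [1.5] * 4
v_traj = [0.0] * 4 + [kT * math.log(3.0)] * 4
p0 = reweight_1d(s_traj, v_traj, bin_edges=[0.0, 1.0, 2.0], kT=kT)
print(p0, pmf_from_probabilities(p0, kT))
```

Snapshots that fall outside the bin range are simply skipped in this sketch, so the probabilities are normalized over the binned region only.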
Results and Discussion
Trans State Is Predominant for Short Proline Peptides
Figure 2 displays the free energy
as a function of the dihedral angle (one-dimensional PMFs in ω)
involved in the proline cis–trans transition for three peptide
fragments with one to three amino acid residues from chicken Crk II (PDB: 2L2P): ACE-P238-NME,
ACE-G237P238-NME, and ACE-G237P238F239-NME. Adding the two neighboring
residues does not have a pronounced effect on the free energy
landscape of ω: the trans state (ω = ±180°)
is the global free energy minimum and predominates (population >99%);
the cis state has a free energy 3–5 kcal/mol higher than that
of the trans state; the transition barriers from the trans to the cis
state are between 15 and 20 kcal/mol (see Table 1 for the exact values).
Figure 2
Free energy of short peptides as a function
of ω dihedral
angle of the proline obtained from the metadynamics simulations. The
initial structures of these three peptide fragments with one to three
amino acid residues are from the first model of NMR derived chicken
Crk II (PDB: 2L2P) (ACE-P238-NME, ACE-G237P238-NME, and ACE-G237P238F239-NME). The
red line shows the torsion potential energy function of the AMBER force
field,[75] V(ϕ) = V2[1 + cos(2ϕ – 180°)]/2, with V2 = 20.
Table 1
Free Energy Differences and Barriers for the Short Proline-Containing Peptides (in Units of kcal/mol) from Metadynamics Simulations
peptides       Fcis – Ftrans   ΔFtrans→cis   ΔFcis→trans
ACE-P-NME      4.2             17.5          13.3
ACE-GP-NME     3.6             17.0          13.4
ACE-GPF-NME    3.5             16.6          13.1
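The free energy differences in Table 1 can be converted into equilibrium populations. The sketch below assumes a simple two-state (cis/trans) Boltzmann model at 298.15 K, which ignores the width of the basins; the function name is illustrative.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def cis_population(delta_f, temperature=298.15):
    """Two-state cis fraction for delta_f = F_cis - F_trans in kcal/mol."""
    return 1.0 / (1.0 + math.exp(delta_f / (KB_KCAL * temperature)))

# F_cis - F_trans values from Table 1 (kcal/mol).
table_1 = {"ACE-P-NME": 4.2, "ACE-GP-NME": 3.6, "ACE-GPF-NME": 3.5}
trans_fraction = {name: 1.0 - cis_population(df) for name, df in table_1.items()}
for name, fraction in trans_fraction.items():
    print(f"{name}: trans fraction {fraction:.4f}")
```

Under this two-state assumption all three peptides come out with a trans fraction above 99%, in line with the populations quoted in the text.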
As a model peptide
for proline cis–trans isomerization,
ACE-P-NME has been extensively studied experimentally[17−21] and using computational methods, including classical molecular simulations[22−24,26,27,29] and quantum mechanics.[25,28,30,84,85] The general results of ACE-P-NME from our metadynamics
simulation are in reasonable agreement with the experimental data[17−21] (with energy differences of ∼1.5 kcal/mol and barriers of
∼20 kcal/mol), although the free energy difference between
the trans and cis state is overestimated and the barrier is underestimated.
It should be noted that these values could likely be improved
by using recently optimized force field parameters.[26]
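To see why barriers of this size put isomerization out of reach of standard MD, a rough rate estimate helps. The sketch below uses the Eyring transition-state expression with a transmission coefficient of 1; this is an assumption made here for illustration, not a method used in the paper, and real rates in solution may differ noticeably.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J s
R_KCAL = 0.0019872041   # gas constant, kcal/(mol K)

def eyring_rate(barrier_kcal, temperature=298.15, transmission=1.0):
    """Transition-state-theory rate (1/s) for a free energy barrier in kcal/mol."""
    prefactor = K_B * temperature / H
    return transmission * prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))

for barrier in (17.5, 20.0):  # simulated trans-to-cis barrier and the ~20 kcal/mol experimental scale
    rate = eyring_rate(barrier)
    print(f"barrier {barrier} kcal/mol: rate {rate:.3g} 1/s, waiting time {1.0 / rate:.3g} s")
```

Even the lower simulated barrier gives waiting times on the order of a second, many orders of magnitude beyond the microsecond reach of unbiased simulations.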
Two Macrostates of Crk Linker-SH3C with Distinct Sets of Hydrophobic Contacts
Previous NMR results[5,6] showed that the cis–trans isomerization of Pro238 in Crk
results in two distinct equally populated conformational states of
the linker-SH3C fragment of the chicken Crk II protein.
Figure 1S in the Supporting Information shows the energy landscapes of Crk l-SH3C in two dimensions:
(1) the center of mass distance between the second half of the linker
(residues 231 to 238) and the SH3C domain (residues 239
to 297) and (2) the ω dihedral angle of P238 defined as in Figure 1, constructed from the 200 ns standard MD simulations
started from the cis state (first model of PDB: 2L3P) and the trans state
(first model of PDB: 2L3Q) respectively. The cis and trans macrostates are stable during the
simulations, and no transitions occur during the 200 ns period.
Comparison of the residue contact maps (Figure 2S) calculated from the NMR models with those from MD snapshots reveals
that most linker–domain interactions from NMR are observable
in the MD simulations. This confirms that both stable structures from
the MD simulations are consistent with the NMR results. Further analysis
of the structures from simulations will be presented below.

Due to the intrinsically slow interconversion rate between the two
conformational states, no transitions were observed in the standard
MD simulation. In contrast, in the metadynamics simulation of Crk
l-SH3C many transitions are observed. The PMF along the
same two dimensions as in Figure 1S is
displayed in Figure 3. The metadynamics simulation
explored more high energy regions in addition to the low-energy cis
and trans basins sampled in the two standard MD simulations. The low
energy conformers for each macrostate are shown in Figure 3S. The residue contact maps calculated from these
low energy structures are exhibited in Figure
4S; they closely resemble the maps from the NMR models and
also from the standard MD simulations in Figure
2S.
Figure 3
(a) PMF of linker-SH3C in 2D (the center of mass distance
between the second half of the linker (residue 230 to 238) and the
SH3C domain (residue 239 to 297) and the ω dihedral
angle of P238) from the 60 ns metadynamics simulation started from
the cis state (first model of PDB: 2L3P) and corrected by the 100 ns umbrella
sampling simulation with the fixed biasing potential from the metadynamics.
Two minima, (12.4, 2.4°) and (14.5,180°), represent the
two macrostates, the cis and the trans state, respectively. (b) Schematic
representation of the free energy path for the cis to trans transition
of linker-SH3C extracted from panel a. The numbers show
the free energy values in kcal/mol and the positions in 2D PMF of
panel a.
More detailed comparison of Crk
l-SH3C structures from
the NMR with that from the metadynamics simulations is presented in Figure 5S. The linker–domain (type A)
and linker–linker interactions (type B) in the structures representing
the two free energy minima (macrostates) are well conserved from the
NMR models.[5] Further analysis of the residues
within the interaction interfaces reveals that the cis–trans
isomerization results in two stable macrostates with two different
sets of linker–domain interactions. In the cis state, the major
hydrophobic interaction region includes two residues from the second
half linker (Leu231 and Leu234) and five from the SH3C domain
(Ala241, Val267, Trp276, Phe289, and Val292). In contrast, the major
hydrophobic contacts in the trans state involve four residues from
the linker (Ala223, Ile227, Pro230, and Pro232) and five from the
SH3C domain (Phe239, Lys269, Ile270, Trp276, and Leu294).
Moreover, the linker–linker interactions are also different.
Pro230 and Pro232 from the second half linker form hydrophobic contacts
with the domain in the trans state instead of with Pro225 and Ile227
from the first half linker in the cis state. In addition to these
hydrophobic interactions, a hydrogen bond can be found between Asn236
(from the linker) and Gln297 (from the SH3C domain) in
the trans state. From Figure 4 we can see that
this rearrangement of the hydrophobic interactions is associated with
a substantial change in the environment of Phe239: it is exposed
to the solvent in the cis state but buried in the hydrophobic
core in the trans state. As described below, the conformation of Phe239
in the cis state promotes the association between the SH3C and SH3N domains in the case of the two-domain system.
Figure 4
Structure comparison of the two macrostates of Crk l-SH3C from the metadynamics
simulation shown in Figure 3 (blue: cis
state; red: trans state). (a) The cis–trans
isomerization of Pro238 (shown as the blue sticks and balls in the
cis state and red in the trans state) results in macrostates with
distinct linker–domain interactions. The linkers (residue Gly220
to Pro238) are displayed in light color. (b) The residue Phe239 of
trans state (shown as red bonds) is buried into the hydrophobic interaction
surface (shown as transparent surface) between the linker (Ile227,
Pro230 and Pro232) and the SH3C domain (Phe239, Lys269,
Ile270, Asn271, Trp276, and Leu294). Instead, it is exposed to solvent
in the cis state (shown as blue bonds).
Integrating the PMF in Figure 3 results
in the following populations: 38% cis and 62% trans, which is
in reasonable agreement with the NMR results (50%/50%). The transition
barriers are calculated to be around 17.0 kcal/mol, which is close
to the values for the short peptides above and consistent with the slow
rate of cis–trans isomerization.
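The population analysis amounts to summing Boltzmann weights over the grid points of each basin. The sketch below is a minimal version that assumes a uniformly spaced grid and splits the states at |ω| = 90°; the one-row PMF is a made-up example constructed to reproduce a 38%/62% split, not the simulated surface of Figure 3.

```python
import math

def basin_populations(pmf, omega_grid, kT, cis_cutoff=90.0):
    """Cis and trans fractions from a 2D PMF on a uniform grid.

    pmf[i][j]: free energy at distance bin i and omega bin j (same unit as kT)
    omega_grid[j]: omega value in degrees for column j
    """
    f_min = min(min(row) for row in pmf)
    cis = trans = 0.0
    for row in pmf:
        for f, omega in zip(row, omega_grid):
            weight = math.exp(-(f - f_min) / kT)
            if abs(omega) < cis_cutoff:
                cis += weight
            else:
                trans += weight
    total = cis + trans
    return cis / total, trans / total

# Hypothetical single-row PMF chosen so that the cis bin lies kT ln(62/38) above the trans bin.
kT = 0.0019872041 * 298.15  # kcal/mol at 298.15 K
toy_pmf = [[kT * math.log(62.0 / 38.0), 0.0]]
p_cis, p_trans = basin_populations(toy_pmf, omega_grid=[0.0, 180.0], kT=kT)
print(p_cis, p_trans)
```

Integrating whole basins rather than comparing only the minima accounts for basin width, which matters when one state spans a broader range of domain separations than the other.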
Three Macrostates of the Crk Two-Domain SH3N-Linker-SH3C System
Previous chemical shift and relaxation analyses[5,6] found
that Crk SH3N-linker-SH3C (CrkSLS) exists in two conformations in solution due to the proline isomerization
switching of Pro238: a major one (with population of ∼90%)
in which the switch is in the autoinhibited cis state and the two
SH3 domains interact with each other (the closed conformation), and a minor
one (∼10%) in which the switch is in the trans state and the
two domains do not have direct interactions. In comparison with the
short peptides and the one-domain linker-SH3C construct,
the population of the cis state becomes predominant in the larger
construct, which contains both domains. Due to the interactions between
the SH3C and SH3N domains, the binding of PPII
ligands to the SH3N domain is blocked, which results in
a biologically autoinhibited state. Our simulations provide new insights
into the structural transformation between the autoinhibited and uninhibited
macrostates.

Figure 6S displays the
PMF of Crk SH3N-l-SH3C along the reaction coordinates
including the center of mass distance between the SH3N and
SH3C domain and the ω dihedral angle of P238, constructed
from the 200 ns standard MD simulation started from the closed cis
state (first model of PDB: 2L3S). The cis macrostate is stable during the simulations
and no transition can be observed during the 200 ns period. The residue
contact patterns on the map in Figure 7S (especially for the domain–domain interactions) calculated
from the MD conformations are very similar to those from the NMR models,
implying that the extensive domain–domain interactions from
NMR are reproduced in the MD simulation.

The 2D PMF from the
metadynamics simulation (Figure 5) and the
corresponding population analysis, obtained by integrating
over the entire basins of the individual macrostates, instead
reveal the existence of three macrostates (see Figure 8S for 20 low-energy conformations for each state).
The population of the cis state (cis closed/autoinhibited) is 89%,
very close to the population derived from NMR data (∼90%).[5,6] The trans state has two subgroup distributions: besides one macrostate
(4.5%, trans open/uninhibited, already found in NMR) with a wide-distribution
of domain–domain distance, we found an intermediate metastable
trans state (6.5%, trans closed) with the distribution of domain–domain
distance similar to that of the cis closed state (see Figure 5). Although previous NMR experiments on chicken Crk II[5,6] did not report the existence of a trans
closed state, corresponding experiments on human Crk II[64] in fact reported a trans closed state while
the cis closed state was not observed. Our results suggest that Crk
II can exist in both trans and cis closed states and that the most
prevalent state may depend on the species.
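The three populations can be expressed as relative free energies of the macrostates. The sketch below simply applies ΔF = −kT ln(p/p_ref) at 298.15 K to the populations quoted above, using the most populated state as the reference; the function name is illustrative.

```python
import math

def relative_free_energies(populations, kT):
    """Free energy of each state relative to the most populated one (same unit as kT)."""
    reference = max(populations.values())
    return {name: -kT * math.log(p / reference) for name, p in populations.items()}

kT = 0.0019872041 * 298.15  # kcal/mol at 298.15 K
populations = {"cis closed": 0.89, "trans closed": 0.065, "trans open": 0.045}
delta_f = relative_free_energies(populations, kT)
for name, value in delta_f.items():
    print(f"{name}: {value:.2f} kcal/mol")
```

Both trans states lie roughly 1.5 to 1.8 kcal/mol above the cis closed state, so a modest change in domain–domain or linker interactions could plausibly shift which closed state dominates.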
Figure 5
(a) PMFs of Crk SH3N-l-SH3C in 2D (the center
of mass distance between the SH3N and SH3C domains
and the ω dihedral angle of P238) from the 50 ns metadynamics
simulation started from the cis state (first model of PDB: 2L3S) and corrected by
the 250 ns umbrella sampling simulation with the fixed biasing potential
from the metadynamics. Three minima can be found at (25.8, 9.1°),
(26.2, 180°), and (40, 180°), representing the three macrostates:
the cis closed, the trans closed, and the trans open state, respectively.
(b) Schematic representation of the free energy path for the cis closed
to trans closed and to trans open transition extracted from panel
a. The numbers show the free energy values in kcal/mol and the positions
in 2D PMF of panel a.
The Two Closed States of CrkSLS Have a Similar Domain–Domain Interface but the Flexible Linker Configuration Is Different
The residue contact maps (Figure 9S) of
the cis closed and trans closed states calculated from the simulation
snapshots share very similar patterns of domain–domain interactions
with that of NMR models of the cis closed state (PDB: 2L3S), indicating a similar
domain–domain interface. The detailed structure comparison
of the cis closed state between the NMR and metadynamics simulation
is shown in Figure 10S. Both of them have
the same intermolecular interactions utilizing the hydrophobic contacts
which include the canonical binding site of SH3N (aromatic
residues Phe142, Phe144, Trp170, and Tyr187 along with Pro184 and
Pro186) and three other hydrophobic residues from SH3C (Pro238,
Phe239, and Ile270). In addition, Gln169 in SH3N and Lys266
in SH3C form a hydrogen bond, and Asp143 in SH3N and Lys269 in SH3C make a salt bridge in the cis closed
state.
Figure 6a shows the structure
comparison of three macrostates of CrkSLS from the metadynamics
simulation. Both closed (cis/trans) states have very similar domain–domain
orientations and interactions although the flexible linker has different
arrangements. In contrast, the domain–domain interactions are
lost in the trans open state, which results in a wide distribution
of domain–domain distances (Figure 5a) due to the effects of the flexible linker. Some important differences
also exist between the two closed states, which become clearer when
the domain–domain interface is examined (Figure 6b) and reweighting is performed to create 1D free energy curves
as a function of the number of hydrophobic interactions between the
SH3N and SH3C domains (Figure 7a), the number of hydrogen bonds between Gln169 and Lys266
(Figure 7b), and the distance between Asp143
and Lys269 forming a salt bridge (Figure 7c).
From Figure 6b, we can see that the cis to
trans transition of Pro238 disturbs the hydrophobic interface, induces
an outward motion of Tyr187 from the interface, and reduces the number
of hydrophobic contacts between the two domains (Figure 7a). At the same time, the transition also breaks other interactions
between the two domains, including the hydrogen bond between Gln169 and
Lys266 (smaller number in Figure 7b) and the
salt bridge between Asp143 and Lys269 (larger distance in Figure 7c). From Figure 6b, it can
be seen that Phe239 of the cis closed state is no longer buried in
the SH3C domain but is exposed to SH3N, as in the
cis state of the linker-SH3C construct. Such exposure of
Phe239 helps to form the hydrophobic contact interface between the
two domains as found by the NMR structure analysis.[6] From Figure 7b,c, we can see that
the cis closed state is not the global free energy minimum when projected
along either of these coordinates, which is not consistent with the
2D PMF in Figure 5 and the NMR results.[6] This implies that the hydrogen bond between Gln169
and Lys266 and the salt bridge between Asp143 and Lys269 are not
the major contributors to the stability of the cis closed state. In contrast,
from Figure 7a, it can be seen that the cis
closed state is the global minimum, in agreement with the 2D PMF in
Figure 5 and the NMR results,[6] indicating that the hydrophobic contacts between
the SH3C and SH3N domains are the major contributor
to the stability of the cis closed state. Although the 2D PMF
shown in Figure 5 represents a projection of
the high-dimensional macrostate transition onto preselected reaction
coordinates, the corresponding macrostate populations and transition
barriers are in good agreement with those derived from experiments.[6] This suggests that the 2D PMF analysis captures
the major characteristics of the cis–trans transition of Crk
II.
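The reweighting used to obtain 1D free energy curves along other coordinates can be sketched as follows. This is a minimal sketch assuming frames sampled under a fixed (time-independent) bias, as in the umbrella sampling correction run, so that each frame carries a weight exp(+V_bias/kT). The function name `reweighted_pmf_1d`, its arguments, and the bin count are hypothetical, and the exact reweighting procedure used for Figure 7 is not specified in this section.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def reweighted_pmf_1d(cv_values, bias_energies, bins=50, temperature=300.0):
    """Unbiased 1D free energy along an arbitrary coordinate.

    cv_values     : per-frame value of the coordinate of interest (for
                    example, a contact count or a salt-bridge distance)
    bias_energies : per-frame static bias energy in kcal/mol
    Returns (bin_centers, free_energy) with the minimum shifted to zero;
    bins that were never visited are reported as +inf.
    """
    kt = KB * temperature
    bias = np.asarray(bias_energies, dtype=float)
    # Frames sampled under a static bias V are reweighted by exp(+V / kT);
    # subtracting the maximum bias avoids overflow in the exponential.
    weights = np.exp((bias - bias.max()) / kt)
    hist, edges = np.histogram(cv_values, bins=bins, weights=weights)
    centers = 0.5 * (edges[:-1] + edges[1:])
    with np.errstate(divide="ignore"):
        free = -kt * np.log(hist)
    free -= free[np.isfinite(free)].min()
    return centers, free
```

Applying the same per-frame weights to different coordinates gives mutually consistent 1D projections, which is what allows curves such as those in Figure 7a–c to be compared against one another.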
Figure 6
Comparison of the three structural minima of Crk SH3N-l-SH3C from the metadynamics simulation. (a) The 3D structures representing
the three macrostates (cis closed, blue; trans closed,
red; trans open, orange), aligned using the SH3N domain. (b) Domain–domain interaction surfaces of
both the cis closed (blue) and the trans closed (red) states. The transparent
surface area shows the interaction regions (for the cis
closed state), which include the canonical binding site of SH3N (aromatic residues Phe142, Phe144, Trp170, and Tyr187 along with
Pro184 and Pro186), three hydrophobic residues from SH3C (Pro238, Phe239, and Ile270), two residues (Gln169 and Lys266)
forming a hydrogen bond, and two other residues (Asp143 and Lys269)
involved in a salt bridge. Tyr187, Pro238, Phe239, and Lys269 are
displayed as bonds.
Figure 7
1D PMFs of SH3N-l-SH3C reweighted from the 2D
PMF as a function of (a) the number of hydrophobic interactions between
the SH3N and SH3C domains, (b) the number of
hydrogen bonds between Gln169 and Lys266, (c) the distance between
Asp143 and Lys269 for a salt bridge, and (d) the dihedral angle ω
of Pro238.
Possible Mechanism for the Catalysis of Crk Signaling by Cyp A
The cis–trans transition of proline can be catalyzed
by peptidyl-prolyl isomerases (PPIases)[86,87] such as Pin1,
Cyp A, and FKBP. The experimental results[6] show that Cyp A is able to accelerate the transition from the closed
to the open state for the two-domain Crk II construct CrkSLS by several thousand-fold. The mechanisms of catalysis at the molecular
level are still not clear, although several classical[72,88−90] and quantum simulations[91−94] have been performed for Cyp A
and other prolyl isomerases. Previous theoretical studies of Cyp A[72,88−90,92,93,95,96] focused only on short peptide–enzyme complexes and searched for
the transition states and paths with low energy barriers from the
local structure rearrangements. It is difficult to extend these results
to large protein systems that have extensive domain–domain
interactions. For the case of Crk II, the domain–domain interactions
have to be broken for the transition from the closed to the open state,
as shown in Figures 6 and 7.
As shown in Figure 5, the free
energy barrier for the transition from the cis closed to the trans
closed state is ∼16 kcal/mol, and ∼15 kcal/mol for the reverse
transition. The major component of the barrier is the intrinsic proline
peptide torsion energy. In contrast, the free energy barrier from
the trans closed to the trans open state is much lower (∼7 kcal/mol)
and arises from breaking the favorable domain–domain interactions, mostly
the dissociation of hydrophobic contacts as analyzed above.
There are two possible control points that determine the “timing
mechanism” of Crk II proteins: (1) proline isomerization and
(2) breaking hydrophobic contacts. This suggests that the mechanism
of enzyme catalysis for the Crk proteins occurs in two steps: first
a PPIase such as Cyp A acts locally on the Pro238 of the cis closed
state, accelerating the cis to trans transition in a way similar to
that of short peptides; the transition disturbs the domain–domain
interactions and results in an intermediate state (trans closed state).
The subsequent relaxation from the trans closed to the trans open
state (with a much smaller barrier) breaks the hydrophobic contacts
further and completes the transition. The first step is required to
overcome the high barrier (∼16 kcal/mol) from the torsion energy
of ω of Pro238 as is the case for short peptides. This step
is catalyzed by Cyp A by forming the peptide-enzyme transition state,
which lowers the height of the barriers (by 6–13 kcal/mol).[72,88−90,92,93,95,96] In contrast, the second step is required to overcome a barrier (∼7
kcal/mol) due to the domain–domain interactions. The need to
break the domain–domain interaction (in the second step) determines
an intrinsic rate to turn off the autoinhibition. No matter how efficiently
a PPIase accelerates the transition from the cis closed to the trans
closed state, the relaxation of the domain–domain interactions
defines an upper limit to the reaction rate. That is, no enzyme can
accelerate switching off the autoinhibition with a rate faster than
the self-relaxation rate in the second step. This limit becomes very
important when a PPIase can decrease the barrier of proline isomerization
below 7 kcal/mol. Furthermore, we suggest that mutations that destabilize
the domain–domain association can further accelerate the timing
mechanism.
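The rate-limit argument above can be made concrete with a small Python sketch of the sequential scheme cis closed ⇌ trans closed → trans open. It uses the barriers quoted in the text (∼16, ∼15, and ∼7 kcal/mol) and, as a loud assumption, converts them to rates with the Eyring prefactor k_BT/h. Realistic prefactors for protein conformational changes are expected to be much smaller, so the absolute times are not meaningful and only the qualitative saturation is. The function names and the 300 K temperature are illustrative, and the enzyme is assumed to lower the isomerization barrier equally in both directions.

```python
import math

R_KCAL = 0.0019872041       # gas constant in kcal/(mol K)
KB_OVER_H = 2.083661912e10  # Boltzmann constant / Planck constant, 1/(s K)


def eyring_rate(barrier_kcal, temperature=300.0):
    """Rate in 1/s from a free energy barrier, ASSUMING a k_B*T/h prefactor."""
    return KB_OVER_H * temperature * math.exp(-barrier_kcal / (R_KCAL * temperature))


def mean_opening_time(k_iso, k_iso_rev, k_relax):
    """Mean first-passage time from cis closed to trans open for the
    linear scheme  cis closed <-> trans closed -> trans open."""
    return 1.0 / k_iso + k_iso_rev / (k_iso * k_relax) + 1.0 / k_relax


def opening_time_with_catalysis(barrier_drop_kcal, temperature=300.0):
    """Opening time when the enzyme lowers the isomerization barrier by
    barrier_drop_kcal in both directions (assumption); the ~7 kcal/mol
    domain-domain relaxation step is left untouched."""
    k_iso = eyring_rate(16.0 - barrier_drop_kcal, temperature)
    k_iso_rev = eyring_rate(15.0 - barrier_drop_kcal, temperature)
    k_relax = eyring_rate(7.0, temperature)
    return mean_opening_time(k_iso, k_iso_rev, k_relax)
```

In this scheme the first term, 1/k_iso, shrinks as the enzyme lowers the isomerization barrier, whereas the remaining two terms depend only on the ratio k_iso_rev/k_iso and on k_relax. The opening time therefore can never fall below 1/k_relax, which is the upper limit on the switching rate discussed above.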
Conclusion
In this work, we presented
results from molecular dynamics and
metadynamics simulations for several Crk II protein constructs. The
ability to surmount large free energy barriers and accelerate
large conformational changes using metadynamics sampling allows us
to investigate the autoregulatory mechanism of Crk, which results
from the cis–trans isomerization of the proline switch at Pro238.
The cis–trans transition of Pro238 induces linker–domain
and domain–domain reorganization and results in three distinct
macrostates. The large increase in the cis macrostate population can
be explained by linker–linker, linker–domain, and domain–domain
interactions, which stabilize the closed cis state primarily by hydrophobic
contacts but also by the formation of hydrogen bonds and salt bridges.
In contrast, the high barrier to the cis–trans transitions
is mainly associated with the torsion energy of the intrinsic peptide
ω dihedral angle of Pro238. For the Crk SH3N-linker-SH3C construct, our simulations predict an intermediate state,
the trans closed state, in which the domain–domain interface
is disturbed, the number of hydrophobic contacts is reduced, and the
interdomain hydrogen bond and salt bridge are broken, compared with
the stable cis closed (autoinhibited) state. Based on the existence
of this intermediate state, we propose that the acceleration
of switching off the autoinhibition by Cyp A has two distinct steps:
(1) the enzyme binds to the Pro238 region of the cis closed state and
catalyzes the transition of the proline switch, which results in
an intermediate state (the trans closed state); (2) subsequently, the intermediate
state relaxes to the trans open state over a much smaller barrier
and completes the activation process.
The full Crk II proteins
have three domains, SH2, SH3N, and SH3C. NMR
analysis[6] of the
chicken Crk SH3N-linker-SH3C two-domain construct
has shown that the SH2 domain does not have significant effects on
the other two domains. For Crk II proteins from other species[64,65] and Crk-like proteins,[66] this assumption
may not be valid. Recent NMR experiments show that Crk-like and phosphorylated
Crk-like proteins have markedly different domain–domain structures
and are regulated in a distinct manner, which may involve the SH2
domain.[66] NMR results on mouse Crk II[65] and human Crk II[64] also showed that autoinhibition can be accomplished
without cis–trans isomerization through a combination of intradomain
and interdomain, or linker–domain, rearrangements. NMR experiments
on human Crk II[64] also revealed the existence
of a predominant trans closed state, a state that we have also observed for chicken Crk II. Our simulations show that both closed states can coexist in
chicken Crk II, and we suggest that the predominant state may depend
on the species. Besides these structural reorganizations, phosphorylation
of specific tyrosines is an alternative mechanism of autoregulation.[63] Simulations of phosphorylated systems will also
shed new light on the mechanisms of autoinhibition.[97]
In this report, we were not able to provide exact time scales
for the transitions of Crk II, although no comparable results are available
from experiments either. The metadynamics simulations accelerated the convergence
of equilibrium properties, but direct information about the real dynamics
is lost. However, the barrier of ∼16 kcal/mol for the cis closed
to trans closed transition is reasonably close to the experimentally
estimated limit for the time scale (>seconds).[6] The substantially reduced barrier for the trans closed
to trans open conformational change is also consistent with the several
thousand-fold acceleration of the cis–trans transition by the
Cyp A enzyme.[6] To obtain further information
about the kinetics and pathways, we plan to build Markov state models
starting from the metadynamics simulations in order to generate stochastic
trajectories for sampling the transitions,[98,99] which can be used to investigate transition paths systematically
for the activation of Crk II.