Circular permutations usually retain the native structure and function of a protein while inevitably perturbing its folding dynamics. By using simulations with a structure-based model and a rigorous methodology to determine free-energy surfaces from trajectories, we evaluate the effect of a circular permutation on the free-energy landscape of the protein T4 lysozyme. We observe changes which, although subtle, largely affect the cooperativity between the two subdomains. Such a change in cooperativity has been previously experimentally observed and recently also characterized using single molecule optical tweezers and the Crooks relation. The free-energy landscapes show that both the wild type and circular permutant have an on-pathway intermediate, previously experimentally characterized, in which one of the subdomains is completely formed. The landscapes, however, differ in the position of the rate-limiting step for folding, which occurs before the intermediate in the wild type and after in the circular permutant. This shift of transition state explains the observed change in the cooperativity. The underlying free-energy landscape thus provides a microscopic description of the folding dynamics and the connection between circular permutation and the loss of cooperativity experimentally observed.
A circular permutation is a
rearrangement of the connectivity of a protein obtained by linking
the N- and C-termini of a protein with a peptide linker and creating
new termini elsewhere. Circular permutations can occur naturally and
are an important mechanism through which evolution has used stable
folds to create new ones.[1] Circular permutations
can also be introduced by protein engineering, for example, to manipulate
protein scaffolds, improve catalytic activity and modulate affinity.[2] Circular permutations selectively perturb the
folding dynamics without affecting the native structure. For this
reason, they have been used to probe the effect of chain connectivity
on the folding mechanism of proteins.[3−5]

Here, we focus on the lysozyme from phage T4 (Figure 1), which is a 164-residue protein with two structural
subdomains
connected by a long α-helix. The α/β N-domain is
continuous (residues 13–59), whereas the all-α C-domain
is discontinuous because it also contains the re-entrant N-terminal
helix (residues 1–12) that forms the domain with residues 60–164.
The folding of T4 lysozyme has been extensively investigated experimentally.[7−12] The specific role of its discontinuous subdomain has been experimentally
probed by studying circular permutants of the protein.[13−17] These studies showed that T4 lysozyme folds through an on-pathway
intermediate that occurs after the rate-limiting step. A two-domain
protein with discontinuous subdomains is an example of a naturally
occurring circular permutation, very likely introduced through evolution
to improve the folding property of the protein, while preserving structure
and function. In fact, the presence of discontinuous subdomains among
proteins has been suggested as one strategy to enhance subdomain coupling
and, thus, folding cooperativity.[18] Recently,
this change in cooperativity between subdomains has been further demonstrated,
using a combination of optical tweezers and protein engineering: in
a circular permutant in which the contiguity of two subdomains in
the sequence is re-established, there is loss of cooperativity and
coupling between subdomains.[18]
Figure 1
Structure of the cysteine
free WT*T4L (A) and CP13*T4L (B). For
both species, the C-domain is on the top and the N-domain is on the
bottom. The C-terminal part of each subdomain is shown in red; the
N-terminal part is in blue. Thus, the A-helix (green dashed circle),
which encompasses residues 1–12 in WT*T4L, is displayed in
red for both WT*T4L and CP13*T4L; it is linked to the N-domain in A but
no longer in B. The respective discontinuity and continuity
of WT*T4L and CP13*T4L are clear when one observes their sequences
(bottom). Protein structures (PDB ID: 3DKE) have been rendered with PyMOL.[6]
The
description of the folding mechanism of a protein is challenging
because of the large dimensionality of the problem and its stochasticity.
In principle, folding can be described as diffusion on a free-energy
landscape.[19−21] The question we here try to answer is, how does a
circular permutation affect the free-energy landscape of a protein?
The free-energy landscape is not directly accessible by experiment
because of the limited spatial and time resolution. Techniques such
as φ-value analysis[22] have been exploited
to infer the effects of circular permutations on the folding mechanism.[5] Such experiments provide insight at residue level
on the folding pathway but do not clarify the effect of the circular
permutation on the free-energy landscape.

Atomistic simulation,
unlike any experimental technique, can provide
information at ångström spatial resolution and femtosecond time
resolution on the sequence of events that take place during a folding
event.[23] Simulation is also the only practical
way to determine directly intricate details about a free-energy landscape.
Simulation results depend on the model and force field used. The most
accurate force fields are also computationally more demanding, and
equilibrium folding simulations can only be performed for small fast-folding
proteins using purpose-designed computers.[24] Coarse-grained or simplified models, such as minimally frustrated,
structure-based ones, turn out to be valuable when accurate sampling
of the conformation space accessible is required.[25] Any attempt to characterize the folding free-energy landscape
of a protein requires accurate sampling of the slowest event, which
consists in the crossing of the major free energy barrier between
the native conformation and denatured ones. Thus, unless one is looking
at small and fast-folding proteins,[24] coarse-grained,
structure-based models, such as the one used here, are for the time
being unavoidable. The availability of experimental high-resolution
measurements to validate the results from such simplified models becomes
necessary if the conclusions are to be generalized to the real system
being modeled.

Only a small number of simulation studies have
focused on the effect
of circular permutations on folding dynamics, and even fewer have
addressed the effects of such perturbation on the free-energy landscape.
In most cases, simplified models of proteins, such as lattice models,
have been used.[26] Itoh et al. applied
an Ising model and analyzed the free-energy landscape using the number
of native contacts as a reaction coordinate.[27]

Here, we use molecular dynamics simulations with a structure-based,
minimally frustrated model to determine the free-energy landscape
of both wild type (WT*T4L (Figure 1A)) and
a circular permutant (CP13*T4L (Figure 1B))
of a cysteine-free variant of T4 lysozyme.[28] Good agreement with experimental measurements justifies the use
of a structure-based model in this specific study. Key to our approach
is the use of a method[29−32] to determine a low-dimensional projection of the free energy that
is rigorous and general and that provides results that are easily
interpretable.

The free-energy profile (FEP) of WT*T4L shown
in Figure 2 provides a description of the main
features of
WT*T4L folding. Three main basins are clearly identifiable: the denatured,
the intermediate, and the native basins (labeled D, I, and N respectively).
The kinetics can be modeled as

D ⇌ I ⇌ N

Structures belonging
to the denatured basin
are highly diverse and lack secondary structure. Further on the folding
pathway, the protein encounters an intermediate state, which has been
experimentally observed.[12,14,33] This intermediate presents an average radius of gyration of 18.7
Å, as compared with 16.6 Å for the native state and 26 Å
for the denatured state. Projections of the native and intermediate
basins onto plots of the radius of gyration versus number of native
contacts (Figure 2) show that the intermediate
exhibits a well-folded C-domain, whereas the N-domain is mostly unfolded.
This feature tallies with the intermediate observed experimentally.[33]
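The two structural indicators used above, the radius of gyration and the number of native contacts, can be computed from bead coordinates along a trajectory. The following is a minimal sketch under stated assumptions, not the analysis code used in this work: it assumes one bead per residue with equal weights, and the contact definition (the `cutoff`, `min_separation`, and `tolerance` parameters) is hypothetical and chosen only for illustration.

```python
import numpy as np

def radius_of_gyration(coords):
    """Radius of gyration of equally weighted beads; coords has shape (n_beads, 3)."""
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))

def native_contact_pairs(native_coords, cutoff=8.0, min_separation=3):
    """Index pairs (i, j), j - i >= min_separation, closer than `cutoff` in the native structure.

    The cutoff and sequence separation are illustrative assumptions.
    """
    n = len(native_coords)
    diff = native_coords[:, None, :] - native_coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(n, k=min_separation)
    mask = dist[i, j] < cutoff
    return np.stack([i[mask], j[mask]], axis=1)

def count_native_contacts(coords, pairs, native_coords, tolerance=1.2):
    """Number of native pairs whose distance is at most `tolerance` times the native distance."""
    d = np.linalg.norm(coords[pairs[:, 0]] - coords[pairs[:, 1]], axis=1)
    d0 = np.linalg.norm(native_coords[pairs[:, 0]] - native_coords[pairs[:, 1]], axis=1)
    return int((d <= tolerance * d0).sum())
```

Per-subdomain values, as in the projections of Figure 2, would follow from passing only the rows of `coords` that belong to one subdomain (for example, residues 13–59 for the N-domain).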
Figure 2
FEP of WT*T4L along the optimal reaction coordinate. The
different
basins (native, intermediate, and denatured) and transition states
are shown as structures representative of the corresponding ensemble
and colored in a blue to red scale according to increasing B-factor
values. A projection of each basin onto the radius of gyration and
number of native contacts is shown for the native basin (N), the intermediate
basin (I), and the denatured basin (D) for the whole protein (greens),
N-domain (blues), and C-domain (reds); darker colors mean higher probability.
In the native basin (N), both subdomains and the whole protein are
completely formed; the intermediate basin (I) has a folded C-domain
and an unfolded N-domain; in the denatured basin (D), both subdomains
are unfolded.
The transition between
states D and I corresponds to the highest free-energy
barrier on the landscape (about 4 kBT).
As a consequence, the crossing of the transition state TSD–I is the rate-limiting step in the folding reaction, with a mean first
passage time (MFPT) of 7.4 μs. The present result agrees with
the experimental finding that places the formation of the intermediate
species after the rate-limiting step.[12,14,33] An important feature of the transition state TSD–I is the native-like placement of the A-helix and
the whole of the C-domain.[17,18,33] The height of the barrier between D and I can be ascribed at least
in part to the loss in entropy upon the docking of the A-helix to
the C-domain. After the formation of the intermediate, we find an
additional transition state (TSI–N), characterized by a smaller free-energy barrier (about 2.5 kBT), between the intermediate
and the native state. This relatively faster step (MFPT 2.3 μs)
corresponds to the structuring of the N-domain that completes the
folding of the protein.

To test the role of discontinuous subdomains in
folding dynamics,
we have performed analogous simulations for the circular permutant
CP13*T4L, in which the A-helix is covalently attached to the C-domain.
The resulting FEP is shown in Figure 3. The
free-energy landscape of CP13*T4L is qualitatively similar to that
of the wild type enzyme, with three distinguishable basins (D, I,
and N) and two transition states (TSD–I and TSI–N). The observation that the circular permutant folds
in a manner similar to the wild-type is in agreement with experimental
evidence.[14,33,34]
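Because both variants share the same three-basin picture (D, I, and N), mean first passage times between basins can be estimated directly from a trajectory once each frame has been assigned to a basin. The sketch below is an illustration only and is not the procedure used in this work: the array `labels` (one state label per saved frame), the frame spacing `dt`, and the convention of starting the clock at the first entry into the source basin are all assumptions.

```python
import numpy as np

def mean_first_passage_time(labels, source, target, dt=1.0):
    """Average time from first entry into `source` until `target` is first reached.

    `labels` is a sequence of per-frame state labels (hypothetical input);
    `dt` is the time between frames. Returns NaN if no passage is observed.
    """
    labels = np.asarray(labels)
    times = []
    start = None  # frame index at which the current passage started
    for t, state in enumerate(labels):
        if start is None and state == source:
            start = t
        elif start is not None and state == target:
            times.append((t - start) * dt)
            start = None
    return float(np.mean(times)) if times else float("nan")
```

For example, `mean_first_passage_time(labels, "D", "I", dt)` and `mean_first_passage_time(labels, "I", "N", dt)` would give the two step times whose relative size identifies the rate-limiting step.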
Figure 3
FEP of CP13*T4L along the natural optimal reaction
coordinate and
representation of the different basins (native, intermediate, and
denatured); see caption to Figure 2, where
the analogous result for WT*T4L is shown. It is clear that
the rate-limiting step for folding corresponds to overcoming the large
barrier between I and N. Despite this major difference, the features
of the various states are remarkably similar; in particular, in I,
the C-domain is formed, and the N-domain is not. Interestingly, the
N-domain is more disordered than in the case of WT*T4L as a consequence
of the lack of the docked re-entrant helix.
Despite
the apparent similarity in the folding landscape of the
wild type and the circular permutant T4 lysozyme, the folding mechanism
has changed. One interesting observation comes from structural indicators
such as the radius of gyration and number of native contacts of the
three identified states of the FEP. The three lower panels of Figure 2 and Figure 3 show the distributions
of such properties for the three states of each variant. In the native
state (N), both the radius of gyration and the number of contacts
are narrowly distributed; on the contrary, in the denatured state
(D), they are broadly distributed, and similarly so in the wild-type
and the circular permutant. In the intermediate (I), the C-terminal
domain has a native-like radius of gyration and number of contacts
in both variants; the N-terminal domain, instead, is considerably
disordered, as previously inferred from native state hydrogen exchange.[35] In the intermediate of the circular permutant,
the radius of gyration and number of native contacts of the N-terminal
domain vary as broadly as in the denatured state. In the wild-type,
the N-terminal domain is also considerably disordered in the intermediate,
but less than in the denatured state because the folding of the C-domain
includes the A-helix and the latter keeps the N-terminus anchored
to the folded C-domain.

These considerations provide a microscopic
explanation of the shift
in the rate-limiting step upon circular permutation. In the circular
permutant, although the D → I transition involves the ordering
of the C-domain, it does not involve a large loss of entropy, as in
the wild type, where the folding of the C-domain restrains the dynamics
of the N-domain through the A-helix. A relatively more stable state,
I, for the circular permutant results in a higher free-energy barrier
between I and N (MFPT ≈ 22.2 μs); the barrier between
D and I (MFPT ≈ 2.7 μs) is much lower. As a result,
the main free-energy barrier shifts relative to that observed
for the wild type: it is now located at the I → N transition
and involves the formation of the N-domain and the achievement of
the correct overall topology.

To quantify the cooperativity
between the subdomains, we have used
quasiharmonic principal component analysis to extract the essential
dynamics space of each subdomain for both WT*T4L and CP13*T4L.[36] The cross-correlation between the projection
of the individual subdomains’ trajectories onto the lowest-frequency
mode is much larger for WT*T4L (0.6) than for CP13*T4L
(0.02). This means that slow motions are correlated between subdomains
for the wild-type and not for the circular permutant. These findings
are consistent with the experimental evidence of the two-subdomain
cooperativity during the folding of the wild-type T4 lysozyme[10,11,13,14] and the loss of coupling in the circular permutant.[14,15,17,18,33]

We thus conclude that the construction
of the circular permutant
of T4 lysozyme has a major effect on the folding cooperativity of
this protein. Although there is broad experimental evidence of the
relation between circular permutation and interdomain cooperativity,
the present simulations show that this arises from a subtle change
in the folding landscape. The source of this cooperativity is found
in the transition state, which structurally enforces the communication between the two subdomains through the docking
of the A-helix to the rest of the C-domain. In the wild type protein,
since the formation of the transition state is an early event in the
folding pathway, this connection between subdomains (i.e., cooperativity)
is present throughout the folding process.

The structure-based
potential that makes possible the thorough
sampling of the folding process turns out to be a good choice. Important
features of the system (such as the existence and essential characteristics
of the intermediate) are in excellent agreement with the experiment.[14,15,18,33] Here, we want to stress that the features of the model system and
their relation with the experimental properties of T4L could not have
been extracted from the simulation without the specific algorithm
we used to construct the optimal reaction coordinate and the corresponding
free-energy landscape. An optimal reaction coordinate not only preserves
the diffusive dynamics of the system[29−32] but also provides an intuitive
representation of the results of the simulation. The approach uses
no adjustable parameters apart from those used in the optimization
of the reaction coordinate and analyzes the free-energy landscape
with any number of transition states (see the Supporting Information). The good agreement between the dynamics
on the profile and the unprojected dynamics (i.e., the MFPTs computed
from diffusion on the profile and directly from the trajectories agree)
confirms the robustness of the choice of reaction coordinate. Alternative
approaches (for example, the method proposed by Best and Hummer[37]) provide similar results in the case of a single
transition state but fail when multiple transitions are present, as
in the case studied here.

The approach used to determine free
energy landscapes from molecular
dynamics trajectories is rigorous and here generalized to nonequilibrium
trajectories. For the specific case of T4 lysozyme, we observe that
a circular permutant in which the two domains are continuous folds
less cooperatively, as previously shown experimentally. This change
is reflected by a subtle change in the free energy landscape, that
is, the shift of the rate-limiting barrier to after an intermediate state;
the latter, although present in both species, is relatively more populated
in the circular permutant.

For the two-discontinuous-domain
protein dihydrofolate reductase,
it has been shown that several circular permutants, including one
that makes the two domains continuous,[3,38] preserve the
folding ability of the protein. A study of the effect of a circular permutation
on the folding mechanism of the ∼100 amino acid single-domain protein
S6[4] has shown how the folding ability of
a protein can be preserved while the folding mechanism can be dramatically
changed; for protein S6, as for T4 lysozyme investigated in the present
work, the wild type proteins fold more cooperatively than the respective
circular permutants.

It seems plausible, as suggested by Shank
et al.,[18] that proteins have evolved
to select topologies
that allow for more cooperative folding to avoid kinetic trapping
and misfolding. In multidomain proteins, where independently evolved
domains join together to perform new functions,[39] coupling between domains through circular permutation
may be an evolutionary strategy to improve their folding. Indeed,
it has been shown that circular permutations play an important role
in protein evolution,[40] particularly for
multidomain proteins.[41]

For the specific
case of T4 lysozyme, we have shown that the radical
change in topology induced by a circular permutation that turns the
two domains from continuous to discontinuous does not involve a complete
change of the free energy landscape. Although this may not be true
in general, it appears that enhanced cooperativity through domain
coupling may be obtained by preserving the essential features of the
free-energy landscape of an ancestor protein where the subdomains
were continuous.