
Explicit Characterization of the Free Energy Landscape of pKID-KIX Coupled Folding and Binding.

Abstract

The most fundamental aspect of the free energy landscape of proteins is that it is globally funneled such that protein folding is energetically biased. Then, what are the distinctive characteristics of the landscape of intrinsically disordered proteins, apparently lacking such energetic bias, that nevertheless fold upon binding? Here, we address this fundamental issue through the explicit characterization of the free energy landscape of the paradigmatic pKID-KIX system (pKID, phosphorylated kinase-inducible domain; KIX, kinase interacting domain). This is done based on unguided, fully atomistic, explicit-water molecular dynamics simulations with an aggregated simulation time of >30 μs and on the computation of the free energy that defines the landscape. We find that, while the landscape of pKID before binding is considerably shallower than the one for a protein that autonomously folds, it becomes progressively more funneled as the binding of pKID with KIX proceeds. This explains why pKID is disordered in a free state and why the binding of pKID with KIX is a prerequisite for pKID's folding. In addition, we observe that the key event in completing the pKID-KIX coupled folding and binding is the directed self-assembly where pKID is docked upon the KIX surface to maximize the surface electrostatic complementarity, which, in turn, requires pKID to adopt the correct folded structure. This key process shows up as the free energy barrier in the pKID landscape separating the intermediate nonspecific complex state and the specific complex state. The present work not only provides a detailed molecular picture of the coupled folding and binding of pKID but also expands the funneled landscape perspective to intrinsically disordered proteins.


Year: 2019 PMID: 31482116 PMCID: PMC6716127 DOI: 10.1021/acscentsci.9b00200

Source DB: PubMed Journal: ACS Cent Sci ISSN: 2374-7943 Impact factor: 14.553

Introduction

The free energy landscape underlies the thermodynamics and kinetics of any molecular process in solution. It is the graph of the free energy (denoted f) over the configuration space, each point of which (collectively abbreviated as r) is specified by the coordinates of all atoms constituting the molecule of interest.[1−3] (Figure 1 illustrates three types of “free energies” used in different contexts in the literature, to discriminate f from the others.) The free energy f is a sum of the potential energy and the solvation free energy,[1−3] and the landscape defined by f is a natural extension of the potential energy landscape or surface to systems in which solvation effects cannot be neglected. f determines the thermodynamic properties of the system through the partition function, $Z = \int dr\, e^{-\beta f(r)}$, in which β denotes the inverse temperature.[1−3] It also dictates the kinetic properties, as the force −∂f(r)/∂r_i, under the influence of the frictional and thermal random forces, governs the equation of motion for each (say, ith) constituent atom.[2] The concept of a free energy landscape has, in particular, played a key role in advancing our understanding of protein folding. For example, the basic assumption behind the well-known paradox by Levinthal[4] is that all possible protein configurations are equally probable, which amounts to assuming constant f(r), i.e., a flat free energy landscape. It was a keen insight that the paradox is resolved if the free energy landscape is not flat but is globally funneled, i.e., if there is an overall negative slope (∂f(r)/∂r < 0) toward the folded state.[5,6] The funneled landscape perspective has since served as a conceptual framework not only for interpreting folding experiments[7,8] but also for understanding a variety of processes, including biomolecular recognition and protein misfolding and aggregation.[9−12]

Figure 1

Illustration of the free energy landscape defined by f(Q) (a), the free energy profile F(Q) (b), and the thermodynamic free energy for the folded (Ff) and unfolded state (Fu) (c) based on the ∼400 μs folding–unfolding simulation trajectory of HP-35. The free energy landscape f(Q) was constructed after computing f(r) for ∼2 × 10⁶ individual configurations saved with a 200 ps interval. The free energy profile F(Q) was obtained from a histogram of Q values sampled in the simulation. The dashed vertical lines in panel b indicate the positions of Qu = 0.20 and Qf = 0.89 that characterize the unfolded-state (Q < Qu) and folded-state (Q > Qf) regions. The folding free energy ΔF = Ff − Fu was estimated from the population ratio of the folded (Q > Qf) and unfolded (Q < Qu) configurations.

New attempts have emerged in recent years that aim to go beyond the conceptual level and to quantify the free energy landscape.[13,14] However, this is challenging not only because it requires sampling relevant configurations associated with the process of interest but also because efficiently computing the free energy f for all of those individual configurations is nontrivial. Recent advances in computing power and simulation software have opened up the possibility of providing fully atomistic long-time simulation trajectories of protein folding and protein–protein association.[15−17] An analysis method has also been developed for evaluating thermodynamic functions (including f) along protein configurational changes, which has been utilized to derive a thermodynamic perspective on protein aggregation.[18] Combining these developments, it is now possible to construct the free energy landscape from first principles (Figure 1a provides such an example).
Coupled folding and binding is an intriguing process in which an intrinsically disordered protein (IDP) folds into an ordered structure upon binding with a partner protein.[19−21] Such conformational flexibility is central to the functions of numerous IDPs, enabling them to serve as molecular switches and hubs in biological networks.[22,23] A number of questions naturally arise concerning the fundamental aspects of this process, such as the following: (1) What distinguishes an IDP from proteins that autonomously fold? (2) How are those distinguishing characteristics of an IDP altered, when its binding partner approaches, so that its folding becomes possible? (3) Does the folding occur before, after, or concomitantly with binding? It will be illuminating in this regard to explicitly characterize the free energy landscape of coupled folding and binding, as the molecular mechanisms that address these questions are expected to manifest themselves as its topographic characteristics. In addition, it is of significant interest to explore to what extent the funneled landscape perspective applies also to intrinsically disordered proteins. Here, we investigate the free energy landscape of the phosphorylated kinase-inducible domain (pKID; residues 116–149) of the CREB protein in the presence of its binding partner, the kinase interacting domain (KIX; residues 586–672) of the CREB binding protein. This is a paradigmatic system that exhibits coupled folding and binding (disordered pKID folds into a structure possessing two α helices, termed αA and αB, upon binding to KIX; Figure 2a) and has been intensively investigated through experimental and computational studies.[24−42] We carry out spontaneous binding simulations starting from disordered pKID placed at a large separation from KIX, to sample the configurations during the course of the coupled folding and binding. 
To the best of our knowledge, this is the first unguided, fully atomistic pKID–KIX binding simulation that reaches the specific pKID–KIX complex structure. For each of the simulated configurations (r), we compute f(r) by applying the molecular integral-equation theory for solvation free energy and then construct the free energy landscape. While rigorous calculation methods for f(r), such as free energy simulations, are available, the use of an approximate solvation theory is inevitable here since f(r) needs to be computed for millions of simulated configurations. We analyze in detail how the topography of the free energy landscape varies as pKID binds to KIX. Supplemented with structural analysis of the simulation trajectory, this analysis provides a detailed molecular picture of the coupled folding and binding of the pKID–KIX system.

Figure 2

(a) Folding of the disordered pKID upon binding to KIX. The structure of the free pKID is taken from the simulation and those of the free KIX and the pKID–KIX complex from the NMR complex structure (PDB entry 2LXT). (b) Sequence of the pKID region and the locations of the two α helices (αA and αB) and of the phosphorylation site. (c) Fraction of native intermolecular contacts Qint (top panel) and fraction of native intra-pKID contacts QpKID (bottom panel) from the successful 10 μs spontaneous binding trajectory (only the results up to 3 μs are displayed here to facilitate the visibility; the whole results are shown in Figure S3). The time regions are colored as follows: the initial diffusive stage (0–490 ns) colored red; the binding stage (490–540 ns) colored green; the initial folding stage (540–900 ns) colored magenta; the intermediate stage (900–2250 ns) colored orange; and the final specific complex stage (>2250 ns) colored cyan. Qint in the binding stage and QpKID in the initial folding stage are magnified in the middle panels. (d) Selected structures along with (Qint, QpKID) values and the angle between the pKID αA and αB helices.

Results

Free Energy That Defines the Free Energy Landscape

Before embarking on an analysis of pKID–KIX binding, we would like to elaborate on the free energy f(r) that defines the free energy landscape. We feel this is necessary because another free energy, F(Q) = −kBT log P(Q), also referred to as the potential of mean force and associated with the probability distribution P(Q) of a certain order parameter (or reaction coordinate) Q, has often been identified in the literature as the “free energy landscape”; its definition, however, is distinct from the one introduced in refs 5 and 6 for discussing the funneled landscape. It is therefore instructive to refer to the relationship between f(r), F(Q), and the conventional thermodynamic free energy F. In Figure 1, we illustrate these free energies, which were computed on the basis of the ∼400 μs folding–unfolding simulation trajectory of the villin headpiece subdomain (HP-35) provided by D. E. Shaw Research.[43] HP-35 was chosen as it is a 35-residue α-helical protein that autonomously folds[44] and, hence, serves as a good reference system in discussing the free energy landscape of the 34-residue pKID. The function f(r) emerges in an attempt to express the partition function Z of the system solely in terms of the configuration r of a solute molecule of interest (a protein here), $Z = \int dr\, e^{-\beta f(r)}$, by integrating out all the degrees of freedom associated with the rest (regarded as solvent) of the system (see Methods in the SI). f(r) = Eu(r) + Gsolv(r) is given by a sum of the solute potential energy (Eu) and the solvation free energy (Gsolv). As Gsolv is involved, it is more appropriate to refer to f(r) as the free energy rather than the potential energy. It is essential to recognize here that f(r) is defined for a single individual configuration r. As such, it carries no solute configurational entropy. $P(r) = e^{-\beta f(r)}/Z$ can be interpreted as the probability of observing a specific configuration r. 
Therefore, f(r) is directly connected to Levinthal’s argument: in fact, the assumption that all possible protein configurations are equally probable amounts to assuming a constant f(r). Correspondingly, f(r) is precisely the quantity with which the funneling concept has been introduced.[5,6] Here and in the following, we shall use f(Q), which is an arithmetic average of f(r) over the configurations conforming to Q = Q(r), in drawing the free energy landscape, as f(r) is defined on a high-dimensional configuration space and is not suitable for visualization. We observe from Figure 1a that the free energy landscape f(Q) indeed exhibits the funneled character as predicted by the landscape theory: the overall negative slope (df(Q)/dQ < 0) toward the folded state serves as a proxy of the genuine funneledness (∂f(r)/∂r < 0). The free energy F(Q) = −kBT log P(Q) is associated with the restricted partition function, P(Q) = Z(Q)/Z with $Z(Q) = \int dr\, e^{-\beta f(r)}$, in which the integration is over the configurations satisfying Q = Q(r). As a number of solute configurations contribute to F(Q), it carries the solute configurational entropy, to be denoted as Sconfig(Q). Indeed, f(r) and F(Q) are related via F(Q) = f(Q) − TSconfig(Q).[5,45] The free energy profile F(Q) (Figure 1b) looks quite different from the free energy landscape f(Q) (Figure 1a). In particular, the transition-state barrier shows up in F(Q). Given the relation F(Q) = f(Q) − TSconfig(Q) and the funneled character of f(Q), the barrier has a purely entropic origin.[46,47] This is one of the major differences between protein folding and simple chemical reactions of small molecules:[5] in the latter, the barrier appears already in f(Q) as it stems from the high potential energy of a fairly well-defined transition-state structure. 
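The relation between f(Q), F(Q), and the entropic term can be made concrete with a short sketch. It assumes per-frame arrays of Q values and f(r) values are already available; the function name `landscape_and_profile`, the bin count, and the default temperature are illustrative choices, not taken from the article. Because F(Q) from a histogram is defined only up to an additive constant, the entropic term returned here is also meaningful only up to a constant.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def landscape_and_profile(q, f, temperature=300.0, n_bins=50):
    """Return bin centers, f(Q), F(Q), and T*S_config(Q) (up to a constant).

    q : per-frame order parameter values in [0, 1]
    f : per-frame free energies f(r) in kcal/mol
    """
    q = np.asarray(q, dtype=float)
    f = np.asarray(f, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])

    counts, _ = np.histogram(q, bins=edges)
    f_sum, _ = np.histogram(q, bins=edges, weights=f)

    with np.errstate(divide="ignore", invalid="ignore"):
        f_of_q = f_sum / counts                       # arithmetic average of f(r) per bin (nan if empty)
        prob = counts / counts.sum()                  # P(Q) from the histogram
        big_f = -KB * temperature * np.log(prob)      # F(Q) = -kB T log P(Q) (inf if empty)

    ts_config = f_of_q - big_f                        # from F(Q) = f(Q) - T S_config(Q)
    return centers, f_of_q, big_f, ts_config
```

Note that f(Q) needs only the per-bin average, whereas F(Q) depends on how often each bin is visited and therefore requires equilibrated sampling.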
F(Q) can be connected to the thermodynamic free energy F by recognizing that the folded (X = f) and unfolded (X = u) states can be characterized by the respective regions, Q > Qf and Q < Qu,[48] along the reaction coordinate. One can then show that $F_X = \langle F(Q) \rangle_X - T S_X^{\mathrm{config,distr}}$. Here, $\langle F(Q) \rangle_X$ refers to an average over the Q region associated with state X, and $S_X^{\mathrm{config,distr}}$ is the configurational entropy originating from the presence of the distribution of Q values in state X. This explains why the difference between the folded- and unfolded-state minima of F(Q) (indicated by ⟨F(Q)⟩f and ⟨F(Q)⟩u in Figure 1b) does not correspond to the thermodynamic folding free energy ΔF = Ff − Fu. In fact, an inspection of Figure 1b suggests that the folding is slightly exergonic (⟨F(Q)⟩f < ⟨F(Q)⟩u), which however contradicts the estimated ΔF = +0.66 kcal/mol. This is because the width of F(Q) also contributes to the thermodynamic configurational entropy: the unfolded-state region of F(Q) is wider than the folded-state region, which results in a larger $S^{\mathrm{config,distr}}$ for the former and leads to Ff > Fu as shown in Figure 1c. The free energies f(r), F(Q), and F constitute a natural series ranging from microscopic (defined for individual configurations r) to intermediate (defined for a set of configurations r satisfying Q = Q(r)) to macroscopic (defined for a state characterized by a range of Q). The equilibration with respect to the solute configurations is therefore essential for a reliable determination of F(Q) and F. On the other hand, this does not apply to f(r) (and f(Q) therefrom) as it is defined for individual solute configurations; i.e., f(r), rather than being an equilibrium property, describes the topography of the configuration space. Indeed, quite a similar free energy landscape for HP-35 can be obtained from just an unequilibrated, single folding portion of the simulation trajectory (Figure S1). 
This holds even though individual folding pathways of HP-35 are quite heterogeneous.[43] Furthermore, since equilibration with respect to solute configurations is not involved, f(r) is more relevant to nonequilibrium dynamical processes. Indeed, while f(r) was introduced above through a thermodynamic argument, it also shows up in the dynamics equation for the solute atoms, in which the force −∂f(r)/∂r_i acts together with the frictional and random forces; this equation can be derived by eliminating the solvent degrees of freedom from Newton’s equations of motion through the use of a projection-operator formalism (see Methods in the SI for an outline of the derivation). Here, the bare friction coefficient γ0 and the random force R are connected by the fluctuation–dissipation theorem. This equation clearly indicates that the solute dynamics is dictated by f(r). It also ensures that, after long times, the solute configuration r is populated in proportion to $e^{-\beta f(r)}$.
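For orientation, a dynamics equation of the kind described here is commonly written in Langevin form. The expression below is a generic sketch built from the quantities named in the text (the force derived from f, the bare friction coefficient γ0, and the random force R); the atomic mass m_i, the Cartesian indices α and β, and the exact way γ0 enters are assumptions, and the article's own equation (see Methods in the SI) may differ in detail.

```latex
m_i \frac{d^2 r_{i\alpha}(t)}{dt^2}
  = -\frac{\partial f(r)}{\partial r_{i\alpha}}
    - \gamma_0 \frac{d r_{i\alpha}(t)}{dt}
    + R_{i\alpha}(t),
\qquad
\langle R_{i\alpha}(t)\, R_{j\beta}(t') \rangle
  = 2 \gamma_0 k_B T \, \delta_{ij}\, \delta_{\alpha\beta}\, \delta(t - t')
```

With this balance between friction and noise, the stationary distribution of the configuration is proportional to $e^{-\beta f(r)}$, consistent with the long-time statement in the text.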

Structural Changes along the Coupled Folding and Binding

We carried out spontaneous pKID–KIX binding simulations at 300 K and 1 bar. The initial disordered pKID structure was taken from a separate 1 μs simulation of the free pKID which was performed beforehand (Figure S2): the average α-helical contents of the αA (63.4 ± 22.1%) and αB (9.0 ± 14.3%) regions from our simulation are in fair agreement with the experimental observations (αA ∼ 50–60% and αB ∼ 15%) for the free pKID.[25] (We note, in light of these experimental and simulation results for the free pKID, that it is mainly the folding of the αB helix that occurs when pKID binds to KIX. We also remark here that the angle between the pKID αA and αB helices is ∼90° in the experimental pKID–KIX complex structure.[28]) The disordered pKID taken from our free pKID simulation and KIX from the NMR complex structure were initially placed with a ∼60 Å center-of-mass distance, and we applied no artificial attraction force between them during our simulations. In total, 10 independent binding simulations, each at least 1.5 μs long, were carried out: two of them were extended to 3 μs, and one of them was extended to 10 μs. (All the simulations reported in the present work are summarized in Table S1.) The 10 μs simulation resulted in the successful folding of pKID upon binding (the minimum Cα root-mean-square deviation of the folded pKID from the corresponding NMR structure is 1.0 Å), and we start by analyzing this successful trajectory. The binding of pKID with KIX and the folding of pKID are monitored via the fraction of native intermolecular contacts (Qint) and the fraction of native intra-pKID contacts (QpKID), respectively (Figure 2c). Selected structures along the trajectory are also displayed (Figure 2d). In the initial diffusive stage (from 0 to 490 ns; the time region colored red), pKID diffuses and approaches KIX and then explores the surface of KIX with disordered structures (QpKID ∼ 0.5). 
Certain intermolecular contacts are formed, as can be inferred from the displayed structures, but these are mostly non-native ones, and Qint in this stage remains close to 0. The native intermolecular contacts start to form at 490 ns through the anchoring of the disordered pKID αB region onto the KIX surface, and Qint increases to ∼0.4 by 540 ns (the binding stage; the time region colored green). Then, we observe the folding of the pKID αB helix (dashed magenta ellipses in Figure 2d), and QpKID increases from ∼0.5 to ∼0.9 in the initial folding stage (from 540 to 900 ns; the time region colored magenta). However, the folding of pKID is still incomplete in that the angle between the αA and αB helices (∼60° to 70°) significantly deviates from the one (∼90°) found in the experimental folded structure, and such pKID structures shall be referred to as the “misfolded” structures. Correspondingly, the native intermolecular contacts are not fully formed between the misfolded pKID and KIX (Qint ∼ 0.3–0.4). We term the stage characterized by the misfolded pKID and deficient intermolecular contacts the intermediate stage (from 900 to 2250 ns; the time region colored orange). This stage is terminated by the unfolding of the αB helix (dashed orange ellipse in Figure 2d), which is followed by a refolding that accompanies the widening of the αA–αB angle toward ∼90°. Such conformational changes are more clearly reflected in the pKID αB helix content and the αA–αB angle along the trajectory presented in Figure S4. Finally, the system enters the stage where the pKID–KIX complex takes structures close to the experimental one (>2250 ns; the time region colored cyan).
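The two progress coordinates used above are fractions of native contacts. The precise contact definition is not given in this section, so the following is only a minimal sketch under a simple assumption: a native contact is an atom pair closer than a cutoff in the reference (NMR) structure, and it counts as formed in a frame when the pair distance is within a tolerance factor of that cutoff. The function names, the 4.5 Å cutoff, and the 1.2 tolerance are hypothetical.

```python
import numpy as np

def native_contact_pairs(ref_a, ref_b, cutoff=4.5):
    """Return index pairs (i, j) whose distance in the reference structure is below cutoff.

    ref_a, ref_b : (N, 3) and (M, 3) coordinate arrays (e.g., pKID and KIX heavy atoms), in Å.
    """
    ref_a = np.asarray(ref_a, dtype=float)
    ref_b = np.asarray(ref_b, dtype=float)
    dist = np.linalg.norm(ref_a[:, None, :] - ref_b[None, :, :], axis=-1)
    return np.argwhere(dist < cutoff)

def fraction_native(frame_a, frame_b, pairs, cutoff=4.5, tolerance=1.2):
    """Fraction of the native pairs that are in contact in one simulation frame."""
    if len(pairs) == 0:
        raise ValueError("no native contacts defined")
    frame_a = np.asarray(frame_a, dtype=float)
    frame_b = np.asarray(frame_b, dtype=float)
    dist = np.linalg.norm(frame_a[pairs[:, 0]] - frame_b[pairs[:, 1]], axis=-1)
    return float(np.mean(dist < tolerance * cutoff))
```

Calling `fraction_native` on every saved frame yields a time series analogous to Qint; an intra-pKID version (for QpKID) would additionally need to exclude pairs that are close in sequence.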

Free Energy Landscape of the Coupled Folding and Binding

The presence of various distinctive stages during the pKID–KIX coupled folding and binding is manifested in the underlying free energy f. We computed f along the successful 10 μs spontaneous pKID–KIX binding trajectory (Figure 3a; only the result up to 3 μs is displayed so that the various stages remain visible, and the result for the entire time range is shown in Figure S5). Here, f = fpKID + Δfint is a sum of the free energy fpKID for the folding of pKID and Δfint = ΔEint + ΔGsolv associated with the pKID–KIX binding. In the latter, ΔEint is the direct (gas-phase) pKID–KIX interaction potential, and ΔGsolv is the solvent-induced one defined by Gsolv(pKID:KIX) − [Gsolv(pKID) + Gsolv(KIX)].[49] Several observations can be made from Figure 3a. Overall, f decreases as the system goes through the initial diffusive (the time region colored red), intermediate (orange), and final (cyan) stages. f decreases when the pKID folding occurs in the initial folding stage (the time region colored magenta), and the free energy barrier is discernible at the end of the intermediate stage (the time region colored orange). However, a more illuminating plot can be obtained by combining f (Figure 3a) with Qint and QpKID (Figure 2c): that is, the free energy landscape f(Qint, QpKID) of coupled folding and binding expressed over the binding (Qint) and folding (QpKID) coordinates.

Figure 3

(a) Free energy f = fpKID + Δfint versus the simulation time from the successful 10 μs spontaneous pKID–KIX binding trajectory. The time regions are colored as in Figure 2c to highlight the various stages of the binding process (only the result up to 3 μs is displayed here; the whole result is presented in Figure S5). (b) Free energy landscape f(Qint, QpKID) constructed from the successful binding trajectory. The trajectory projected onto the landscape is drawn with a solid line, whose portions are colored as in panel a. (c) Projection of f(Qint, QpKID) onto the QpKID axis. The dashed lines denote the linear fits to the respective curves of the same color. (d) Folding landscape of pKID constructed from the slopes of the curves shown in panel c (see Figure S6 for details of its construction).

The free energy landscape constructed from the successful 10 μs pKID–KIX binding simulation is shown in Figure 3b. The landscape f(Qint, QpKID) was obtained by dividing the configuration space spanned by 0 ≤ Qint ≤ 1 and 0 ≤ QpKID ≤ 1 into 2500 small areas after discretizing each of Qint and QpKID into 50 bins and then computing the average f value from those configurations that belong to each small area. The trajectory projected onto the landscape is drawn with a solid line, whose portions are colored as before to highlight the various distinctive stages. In the initial diffusive stage (colored red), the system exhibits the “vertical” diffusive dynamics along the folding (QpKID) coordinate that is restricted to the Qint ≈ 0 region. This is followed by the binding stage (green) that mostly occurs “horizontally” along the Qint coordinate. Then, the folding of pKID occurs (magenta) nearly vertically along the folding (QpKID) direction at Qint ∼ 0.3–0.4. What distinguishes the stage before binding, where pKID remains disordered, from the stage after binding, where the folding of pKID occurs? 
To elucidate this fundamental point, we projected the landscape f(Qint, QpKID) onto the QpKID axis to estimate the landscape’s slopes along the folding (QpKID) direction (Figure 3c) and then constructed the folding landscape of pKID, shown in Figure 3d, based on those slopes. (Our focus in the “funneled” diagram displayed in Figure 3d is the landscape’s slope, for which the funnel concept was originally introduced.[5] In this regard, we note that, unlike typical funnel diagrams in which the width of the funnel is meant to represent the configurational entropy, the width in Figure 3d is associated with the presence of the binding direction.) As the slope of the landscape characterizes the landscape’s funneledness, measuring the strength of the energetic bias toward the folded state, Figure 3d clearly indicates (i) that pKID remains disordered before binding as the landscape is not steep enough to allow it to fold, (ii) that the landscape becomes progressively more funneled as the binding proceeds, and (iii) that the folding of pKID occurs when the landscape is steep enough. Indeed, although the free energy landscape of pKID in a free environment is significantly shallower than that of HP-35, which autonomously folds (we note in this regard that the pKID landscape before binding agrees well with the one from the free pKID simulation; see Figure S7), the slope of the pKID landscape in the KIX environment becomes comparable to that of the HP-35 landscape, as demonstrated in Figure S8. In the intermediate stage (colored orange in Figure 3b), pKID exists in its misfolded state as mentioned above. The unfolding and refolding of the misfolded pKID that occur at the end of the intermediate stage appear as the barrier region on the landscape preceding the final specific complex stage (colored cyan in Figure 3b). Remarkably, we observe an additional increase in the slope of the landscape in the final stage (Figure 3c,d). 
This provides further energetic bias to complete the folding of pKID, and the resulting folded state is lower in free energy than the misfolded state.
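The construction described in this section (a 50 × 50 binned average of f over the two coordinates, followed by linear fits along the folding direction) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names are hypothetical, and selecting frames for a slope estimate through a boolean `mask` (for example, all frames of one stage) is an assumed stand-in for the projection detailed in Figure S6.

```python
import numpy as np

def binned_mean(q_int, q_fold, f, n_bins=50):
    """Average f in each cell of an n_bins x n_bins grid over [0, 1] x [0, 1].

    Returns the (n_bins, n_bins) array of cell averages (nan for empty cells)
    together with the bin edges along each axis.
    """
    rng = [(0.0, 1.0), (0.0, 1.0)]
    counts, x_edges, y_edges = np.histogram2d(q_int, q_fold, bins=n_bins, range=rng)
    f_sum, _, _ = np.histogram2d(q_int, q_fold, bins=n_bins, range=rng, weights=f)
    with np.errstate(invalid="ignore"):
        mean = f_sum / counts
    return mean, x_edges, y_edges

def folding_slope(q_fold, f, mask, n_bins=50):
    """Slope of a linear fit of bin-averaged f versus QpKID for the frames selected by mask."""
    q_sel = np.asarray(q_fold, dtype=float)[mask]
    f_sel = np.asarray(f, dtype=float)[mask]
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    counts, _ = np.histogram(q_sel, bins=edges)
    f_sum, _ = np.histogram(q_sel, bins=edges, weights=f_sel)
    filled = counts > 0                       # fit only over bins that were visited
    slope, _intercept = np.polyfit(centers[filled], f_sum[filled] / counts[filled], 1)
    return slope
```

A more negative slope from `folding_slope` corresponds to a more strongly funneled landscape along the folding direction, which is the quantity compared between stages in the text.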

Native Contact Formation during the Binding of pKID to KIX

So far, we have primarily focused on the folding of pKID, i.e., the folding (QpKID) direction of the free energy landscape. Here, we investigate the molecular details of the formation of native intermolecular contacts during the spontaneous binding of pKID to KIX, i.e., the horizontal (Qint) direction of the landscape. Prior to this investigation, we performed six independent 1 μs equilibrium pKID–KIX complex simulations to analyze the nature of the native intermolecular contacts. Table S2 lists the representative contacts obtained therefrom. We identified the following three types of native intermolecular contacts (Figure 4a): the hydrophobic contacts between residues that belong to the pKID αB helix and the KIX α3 helix; the contacts involving the phosphorylated Ser-133; and the electrostatic contacts that reflect the alternating local electrostatic complementarity between the binding surfaces. The formation of these contacts during the spontaneous binding is shown in Figure 4b.

Figure 4

(a) Amino acid residues forming representative intermolecular hydrophobic contacts (left); those forming the intermolecular contacts with the phosphorylated Ser-133 (middle); and those associated with the alternating local surface electrostatic complementarity (right). In the electrostatic potential surface representation, pKID is rotated such that the binding areas are facing the reader. (b) Presence (indicated by a dot) of the intermolecular residue contacts versus the simulation time. The time regions are colored as in Figure 2c (only the results up to 3 μs are displayed here; the whole results are presented in Figure S9).

We find that the successful spontaneous binding is initiated by the contact formation between the hydrophobic residues of pKID and Tyr-658 of KIX (the time region colored red in Figure 4b). This is followed by the formation of the contacts involving the phosphorylated Ser-133 (the time region colored green). In the initial folding and intermediate stages (the time regions colored magenta and orange), the hydrophobic contacts and the contacts involving the phosphorylated Ser-133 are only partially formed, and the electrostatic contacts associated with the local electrostatic complementarity are barely formed. This reflects the fact that the misfolded pKID structures in these stages, in which the pKID αA–αB angle is ∼60–70°, are not optimal for making intermolecular contacts with KIX. Indeed, after the unfolding/refolding of the misfolded pKID that occurs at the end of the intermediate stage, resulting in a pKID αA–αB angle of ∼90°, most of the native intermolecular contacts are formed (the time region colored cyan). In particular, the aforementioned electrostatic contacts are now present. This is because, for those electrostatic contacts to be formed, pKID must be present on the KIX surface not only at the right place but also with the correctly folded structure. 
Such electrostatic interactions should therefore be crucial for the pKID–KIX binding specificity. Those intermolecular contacts formed only after the unfolding/refolding of the misfolded structure are responsible for the additional increase in the landscape slope discussed above in connection with Figure 3c,d. Indeed, the free energy landscape becomes comparable to the one from the equilibrium complex simulations only after the formation of those contacts as demonstrated in Figure S7.
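The αA–αB angle that separates misfolded (∼60–70°) from correctly folded (∼90°) pKID can be estimated from helix axes. How the angle was actually measured is not stated in this section; the sketch below assumes each helix axis is the principal axis of its Cα coordinates, oriented from the N- to the C-terminal end. The function names are hypothetical.

```python
import numpy as np

def helix_axis(ca_coords):
    """Unit vector along the principal axis of a helix's C-alpha coordinates (N to C direction)."""
    xyz = np.asarray(ca_coords, dtype=float)
    centered = xyz - xyz.mean(axis=0)
    # The first right-singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    if np.dot(axis, xyz[-1] - xyz[0]) < 0:    # fix the sign so the axis points N -> C
        axis = -axis
    return axis

def interhelix_angle(ca_helix_a, ca_helix_b):
    """Angle in degrees between the axes of two helices."""
    cos_angle = np.dot(helix_axis(ca_helix_a), helix_axis(ca_helix_b))
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))
```

Applied frame by frame to the Cα atoms of the αA and αB regions, this gives an angle time series of the kind referred to in the text.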

Free Energy Landscape of the Nonspecific Binding

We now turn our attention to the results based on the other nine spontaneous binding simulations. These are “unsuccessful” in the sense that the native intermolecular contacts barely formed (Qint remains close to 0) in them; the folding of pKID (to misfolded states) can nevertheless occur, as we demonstrate here. By combining all of those trajectories, we constructed the free energy landscape f(ncontact, QpKID) (surface colored yellow in Figure 5a). Here, we adopted the number of intermolecular heavy atom contacts (ncontact) as the binding coordinate, and the landscape f(ncontact, QpKID) was obtained by dividing the configuration space spanned by 0 ≤ ncontact ≤ 400 and 0 ≤ QpKID ≤ 1 into 50 × 50 = 2500 small areas and then computing the average f value for each small area; we also calculated the standard deviation σf(ncontact, QpKID) of f values sampled in each small area for an error estimate. We find that, also in these nonspecific bindings, the landscape becomes progressively more funneled as the binding proceeds; this can be inferred from the sections along QpKID taken at ncontact = 0, 80, and 160 (dashed lines in Figure 5a). The folding landscapes of pKID constructed from these sections (Figure 5b) visualize this finding. This indicates that the folding of pKID can occur on the KIX surface even in nonspecific complexes.

Figure 5

(a) Free energy landscape f(ncontact, QpKID) constructed from the nine unsuccessful pKID–KIX binding trajectories (surface colored yellow). Sections along QpKID taken at ncontact = 0, 80, and 160 are shown by red, green, and blue dashed lines, respectively. (b) Folding landscape of pKID constructed from these sections. (c) Comparison of f(ncontact, QpKID) for the unsuccessful (surface colored yellow) and successful (brown) trajectories. The latter landscape is a redrawing of the one in Figure 3b with the binding coordinate ncontact and using only the trajectory data up to the intermediate stage (<2250 ns). (d) Standard deviation computed from the unsuccessful trajectories and the absolute difference between the landscapes for the unsuccessful and successful trajectories. (e) A selected trajectory projected onto the landscape is drawn with a solid line that is colored red, orange, and cyan every 1000 ns. (f) Corresponding result based on another trajectory.

Indeed, although Figure 5a looks dissimilar from Figure 3b, the free energy landscapes for the unsuccessful and successful trajectories are in fact comparable to each other. This is demonstrated in Figure 5c, in which the landscape from the successful trajectory redrawn with ncontact as the binding coordinate (surface colored brown) is superposed on top of the one from the unsuccessful trajectories (yellow). For a meaningful comparison, only the successful trajectory data up to the intermediate stage (<2250 ns) are used in the redrawing since the final specific complex stage, where the free energy f becomes considerably lower, is absent in the unsuccessful trajectories. Figure 5d shows that the two landscapes agree within statistical errors estimated by σf(ncontact, QpKID). 
Thus, topographic characteristics of the landscape associated with the initial folding of pKID are largely independent of the nature (native or non-native) of intermolecular contacts; i.e., they do not significantly depend on the angle of approach of pKID to KIX. Projections of the two selected nonspecific binding trajectories onto the landscape are drawn with solid lines in Figure e,f: those of the other seven trajectories are presented in Figure S10. The final complex structures from these simulations are also displayed. We find that pKID is bound at various sites on the KIX surface in these nonspecific complexes. In three of the nine unsuccessful trajectories (two shown in Figure e,f and one in Figure S10), the folding of pKID into helical structures is observed. However, all of these “folded” structures are in fact misfolded ones, in which the angle between the αA and αB helices is ∼60–70°. This corroborates the finding that the electrostatic interactions discussed in connection with Figure a, lacking in nonspecific complexes, are the crucial interactions in widening the αA–αB angle toward ∼90° to complete the coupled folding and binding of pKID.
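The αA–αB angle used above to tell misfolded from correctly folded structures can be estimated from atomic coordinates. The following is a minimal sketch under stated assumptions: each helix axis is taken as the first principal axis of its Cα coordinates, oriented from the N- to the C-terminus. The text does not state how the angle was actually measured, and the function names and the synthetic `ideal_helix` input are hypothetical.

```python
import numpy as np

def helix_axis(ca_coords):
    """Unit vector along a helix, from the first principal axis of its C-alpha atoms."""
    ca = np.asarray(ca_coords, dtype=float)
    centered = ca - ca.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    # Resolve the sign ambiguity: point from the first to the last residue.
    if np.dot(axis, ca[-1] - ca[0]) < 0:
        axis = -axis
    return axis

def interhelix_angle_deg(ca_a, ca_b):
    """Angle in degrees between the axes of two helices."""
    cos_angle = np.clip(np.dot(helix_axis(ca_a), helix_axis(ca_b)), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))

def ideal_helix(n_res=12, radius=2.3, rise=1.5, turn_deg=100.0):
    """Synthetic C-alpha trace of an ideal alpha helix along z (test input only)."""
    i = np.arange(n_res)
    theta = np.radians(turn_deg) * i
    return np.column_stack([radius * np.cos(theta), radius * np.sin(theta), rise * i])

# Usage: a second helix obtained by rotating the first one by 90 degrees about y.
helix_a = ideal_helix()
rot_y_90 = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
helix_b = helix_a @ rot_y_90.T
angle = interhelix_angle_deg(helix_a, helix_b)  # close to 90 for this construction
```

For short or strongly bent helices the principal axis is only an approximation, so angles from such a sketch are best read as indicative rather than exact.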

Discussion

A helical structure is, in general, not stable by itself: short helices taken out from stable globular proteins are found to be unstable when isolated.[50] Additional stabilizing interactions of the helical structure must, therefore, be present in stable globular proteins. For example, all of the three α helices in HP-35 are tightly in contact with its hydrophobic core.[44] On the other hand, intrinsically disordered proteins are generally characterized by a high content of polar and charged residues and by a low population of hydrophobic residues.[51,52] As such, most of the residues in the pKID αA helix are either polar or charged, and the bulky hydrophobic residues are somewhat localized in the pKID αB helix region (Figure b); therefore, a ternary hydrophobic core that would stabilize the two-helix structure of the folded pKID cannot be formed. Such sequence characteristics of pKID clearly manifest themselves in the constructed free energy landscape: the landscape of pKID before binding is not steep enough to allow pKID to fold, and this is because pKID lacks additional stabilizing structural entities such as a hydrophobic core. Upon binding with KIX, intermolecular interactions previously unavailable come into play. In particular, a hydrophobic core can now be formed intermolecularly at the binding interface (Figure a), which contributes to stabilizing the helical structure of pKID. It is therefore natural for the free energy landscape along the folding direction to become progressively more funneled as the binding proceeds. Thus, the free energy landscape constructed for the pKID–KIX system clearly shows that the folding of pKID occurs after pKID binds to KIX. This agrees with the experimental observation for this system.[27] We also find from the successful binding simulation that the binding is initiated by the anchoring of the disordered pKID αB helix to the KIX surface, followed by the binding of the pKID αA helix (Figure d). 
Such a sequence of events has been observed in previous experimental[27,32] and computational[33,37] studies. In this regard, we recognize an important role played by Tyr-658 of KIX in the early stages (Figure b). In fact, this residue is known to be the critical residue in pKID–KIX binding.[24] We also observe that the initial folding of pKID to misfolded states occurs even in nonspecific complexes at various sites on the KIX surface (Figure e,f and Figure S10). This is because the associated landscape topography is found to be largely independent of the nature (native or non-native) of intermolecular contacts (Figure c). This supports the conclusion of an earlier computational study indicating that non-native contacts can also help the coupled folding and binding of IDPs.[53] The key event, however, in completing the pKID–KIX coupled folding and binding elucidated in the present study is the correct orientation of the two α helices of pKID upon docking onto the surface of KIX, which should adopt a conformation where an angle of ∼90° exists between them. The latter conformation is stabilized by electrostatic interactions (Figure a), which, in turn, require pKID to adopt the correct folded structure. Low-free-energy local structures and non-native intermolecular interactions formed in intermediate nonspecific complexes must be broken toward such a directed self-assembly, which leads to an increase in free energy. Indeed, we observe a free energy barrier in the pKID landscape that separates the misfolded intermediate and the final folded states (Figure b,d). The presence of the metastable intermediate state in the pKID landscape is remarkable, as such an intermediate state is absent in the free energy landscape of most small proteins that exhibit the two-state folding behavior[47] (see, e.g., Figure a for HP-35). 
Our finding is corroborated by the experimental observation that a three-state model is necessary to account for the kinetics of the pKID–KIX coupled folding and binding.[27,31] Finally, caveats must be stated because, due to the high computational cost for carrying out spontaneous pKID–KIX binding simulations at a fully atomistic level, a number of analyses presented here are based on a single successful trajectory. For example, one could imagine that the native intermolecular contacts could have assembled in an order different from the one shown in Figure b, and even that the order observed in this work may not be the most common pathway. Furthermore, we cannot exclude a possibility that a barrier that separates the nonspecific and specific complex states may arise from configurational entropy as in protein folding (cf. Figure a,b). However, the landscape characteristics elucidated in the present work would not be significantly altered even if additional successful trajectories could be gained. Indeed, topographic characteristics (e.g., the landscape slope) of the configuration space are well-preserved even when underlying processes might be heterogeneous, as we argued in connection with Figure a and Figure S1 on the basis of a much longer simulation. We also observe that the landscape topography associated with the initial folding of pKID does not significantly depend on how pKID approaches KIX (Figure c), and that the landscape at the final specific complex state is consistent with the one obtained from the independent, equilibrium pKID–KIX complex simulations (Figure S7).

Conclusions

We demonstrate in the present work that molecular insights into the coupled folding and binding of an intrinsically disordered protein can be gained through an analysis of the free energy landscape. This is done by constructing the free energy landscape of the paradigmatic pKID–KIX system based on fully atomistic molecular dynamics simulations and the direct computation of the free energy that defines the landscape. We find that the free energy landscape of pKID before binding is not steep enough to allow pKID to fold by itself. This explains why pKID is disordered in a free state. Binding with KIX brings about considerable modifications to the landscape topography. In particular, the landscape along the pKID folding direction becomes progressively more funneled as the binding proceeds. This indicates that the folding of pKID occurs only after pKID binds to KIX. Additionally, we detect the presence of a metastable intermediate state in the landscape. This implies that the pKID–KIX coupled folding and binding involves at least three states. By linking these landscape characteristics to the structural details of the simulation trajectories, we provide a molecular mechanism by which disordered pKID folds upon binding to KIX. The present work further shows that the funneled landscape perspective, originally developed for ordered proteins, is enlightening also in advancing our understanding of intricate processes that involve intrinsically disordered proteins.

50 in total

1. Entropic barriers, transition states, funnels, and exponential protein folding kinetics: a simple model.

Authors: D J Bicout; A Szabo
Journal: Protein Sci Date: 2000-03 Impact factor: 6.725

Review 2. Flexible nets. The roles of intrinsic disorder in protein interaction networks.

Authors: A Keith Dunker; Marc S Cortese; Pedro Romero; Lilia M Iakoucheva; Vladimir N Uversky
Journal: FEBS J Date: 2005-10 Impact factor: 5.542

Review 3. The interplay between structure and function in intrinsically unstructured proteins.

Authors: Peter Tompa
Journal: FEBS Lett Date: 2005-04-08 Impact factor: 4.124

Review 4. The experimental survey of protein-folding energy landscapes.

Authors: Mikael Oliveberg; Peter G Wolynes
Journal: Q Rev Biophys Date: 2006-06-19 Impact factor: 5.318

Review 5. Thermodynamics of protein folding: a microscopic view.

Authors: Themis Lazaridis; Martin Karplus
Journal: Biophys Chem Date: 2003 Impact factor: 2.352

6. Mechanism of coupled folding and binding of an intrinsically disordered protein.

Authors: Kenji Sugase; H Jane Dyson; Peter E Wright
Journal: Nature Date: 2007-05-23 Impact factor: 49.962

7. Cooperativity in transcription factor binding to the coactivator CREB-binding protein (CBP). The mixed lineage leukemia protein (MLL) activation domain binds to an allosteric site on the KIX domain.

Authors: Natalie K Goto; Tsaffrir Zor; Maria Martinez-Yamout; H Jane Dyson; Peter E Wright
Journal: J Biol Chem Date: 2002-08-29 Impact factor: 5.157

8. Atomistic details of the disordered states of KID and pKID. Implications in coupled binding and folding.

Authors: Debabani Ganguly; Jianhan Chen
Journal: J Am Chem Soc Date: 2009-04-15 Impact factor: 15.419

Review 9. Linking folding and binding.

Authors: Peter E Wright; H Jane Dyson
Journal: Curr Opin Struct Biol Date: 2009-01-20 Impact factor: 6.809

10. Binding-induced folding of a natively unstructured transcription factor.

Authors: Adrian Gustavo Turjanski; J Silvio Gutkind; Robert B Best; Gerhard Hummer
Journal: PLoS Comput Biol Date: 2008-04-11 Impact factor: 4.475
