Andrey Alekseenko1,2, Olga Kononova1,2, Yaroslav Kholodov3,4, Kenneth A Marx1, Valeri Barsegov1,2. 1. Department of Chemistry, One University ave, University of Massachusetts, Lowell, Massachusetts, 01854. 2. 9 Institutskiy per, Department of Control and Applied Mathematics Moscow Institute of Physics and Technology, Moscow region, 141700, Russia. 3. 1 Universitetskaya st, Department of Information Systems Innopolis University, Innopolis, Tatarstan, 420500, Russia. 4. 19/18, 2-nd Brestskaya st, Department of Applied Mathematics Institute of Computer Aided Design of the Russian Academy of Sciences, Moscow, 123056, Russia.
Abstract
Hydrodynamic interactions (HI) are incorporated into Langevin dynamics of the Cα -based protein model using the Truncated Expansion approximation (TEA) to the Rotne-Prager-Yamakawa diffusion tensor. Computational performance of the obtained GPU realization demonstrates the model's capability for describing protein systems of varying complexity (10(2) -10(5) residues), including biological particles (filaments, virus shells). Comparison of numerical accuracy of the TEA versus exact description of HI reveals similar results for the kinetics and thermodynamics of protein unfolding. The HI speed up and couple biomolecular transitions through cross-communication among protein domains, which result in more collective displacements of structure elements governed by more deterministic (less variable) dynamics. The force-extension/deformation spectra from nanomanipulations in silico exhibit sharper force signals that match well the experimental profiles. Hence, biomolecular simulations without HI overestimate the role of tension/stress fluctuations. Our findings establish the importance of incorporating implicit water-mediated many-body effects into theoretical modeling of dynamic processes involving biomolecules.
Hydrodynamic interactions (HI) are incorporated into Langevin dynamics of the Cα -based protein model using the Truncated Expansion approximation (TEA) to the Rotne-Prager-Yamakawa diffusion tensor. Computational performance of the obtained GPU realization demonstrates the model's capability for describing protein systems of varying complexity (10(2) -10(5) residues), including biological particles (filaments, virus shells). Comparison of numerical accuracy of the TEA versus exact description of HI reveals similar results for the kinetics and thermodynamics of protein unfolding. The HI speed up and couple biomolecular transitions through cross-communication among protein domains, which result in more collective displacements of structure elements governed by more deterministic (less variable) dynamics. The force-extension/deformation spectra from nanomanipulations in silico exhibit sharper force signals that match well the experimental profiles. Hence, biomolecular simulations without HI overestimate the role of tension/stress fluctuations. Our findings establish the importance of incorporating implicit water-mediated many-body effects into theoretical modeling of dynamic processes involving biomolecules.
The solvent environment affects the dynamic properties of biomolecules, but the issue of hydrodynamic coupling between various segments within the same biological macromolecule has not received much attention. Although hydrodynamic interactions (HI) affect the properties of condensed phase systems, there are only a handful of reports in the literature on the effects of these interactions in proteins. Recent theoretical studies have mainly focused on how hydrodynamic effects influence the diffusion of biomolecules,1, 2 protein ensembles’ association,3, 4 and protein folding.5 Very few studies have focused on how hydrodynamic interactions affect the kinetics and thermodynamics of protein unfolding. In a recent study,6 the investigators addressed the role of hydrodynamic interactions on protein unfolding, but the example chosen was a small protein‐ubiquitin. Therefore, a number of interesting questions remain. For example, it is not known how hydrodynamic interactions influence the unfolding pathways and mechanism(s) in multidomain proteins under tension. Little is known about the extent of domain–domain coupling in multidomain proteins in the presence of hydrodynamic interactions. More intriguing questions concern the effect of hydrodynamic interactions on the biophysical properties of large protein complexes. It is not yet known how hydrodynamic interactions modulate any large system's dynamic response to external mechanical factors, where examples of such important biological assemblies include microtubules, cell organelles, and animal and plant virus particles, etc.Langevin dynamics simulations based on simplified models of biomolecules are widely used to access large‐amplitude and long‐timescale molecular motions. In these simulations, the description of solvent effects is typically limited to stochastic collisions implicitly reflecting the thermal environment. These are modeled by random forces acting on each particle and in each dimension x, y, and z in the three‐dimensional space. Although the collective behavior of solvent molecules can dramatically alter the dynamics of the whole system, in coarse‐grained modeling approaches this behavior is usually neglected. Here, we employ the Langevin dynamics simulations of the Cα‐based Self Organized Polymer (SOP) model of a polypeptide chain7, 8 to explore the influence of solvent induced many‐body effects on dynamic structural transitions in protein assemblies. A variety of biochemical systems have been described using this approach.7, 9, 10, 11We have ported the SOP‐based simulations to Graphics Processing Units (GPUs).12, 13 The computational acceleration attained on a GPU with the SOP‐GPU program (see Fig. S1 in the Supporting Information) has enabled us to address the biological N‐body problem and to probe dynamic structural transitions in: fibrin monomer and oligomers (N ≈ 103–104 amino acids),14, 15 microtubule polymers (N ≈ 105),16 and protein shells of the virus particle CCMV and the bacteriophage HK97 (N ≈ 104–105).13, 17, 18 Here, we employ the SOP‐model to account for the hydrodynamic interactions between the Cα‐particles with all the computational subroutines fully implemented on a GPU. In the past decade, GPUs have transformed from being an obscure rapid visualization oriented technical development to becoming integral components of modern supercomputers with wide software support. The utilization of GPUs for Molecular Dynamics (MD) based scientific applications began in 2007,19 but now successful MD simulation packages have partial or full GPU support including HOOMD‐blue,20 ACEMD,21 AMBER,22, 23 and GROMACS.24A simple method to account for hydrodynamic interactions in Langevin dynamics simulations was proposed by Ermak and McCammon.25 A downside of this approach is the requirement to perform computationally expensive Cholesky decomposition to obtain the correlation matrix for random forces at each step of numerical integration. Several methods have been introduced to leverage this difficulty, including the Fixman approximation26 and Krylov subspace methods,27 but the issues remain. Here, we propose to use an approximate treatment of the hydrodynamic effects based on the Truncated Expansion approximation (TEA).28 The obtained realization of the SOP model with hydrodynamics is fully implemented on a GPU and is incorporated into the SOP‐GPU package. We profiled the computational performance of the SOP‐GPU program with HI option for a range of biomolecular systems of varying size (N = 101–105 amino acids) to demonstrate that the program enables one to complete simulation runs in reasonable wall‐clock time. We also compared the numerical accuracy of TEA‐based treatment of HI against the results obtained with the Cholesky decomposition by performing Umbrella Sampling calculations of the unfolding free energy landscape for small proteins and by analyzing the unfolding kinetics of a generic multidomain protein.Excellent agreement between the approximate and exact results has enabled us to carry out a comprehensive study of the influence of HI on a number of different processes, including the forced unfolding of a large multidomain protein, and the deformation of a biological filament. The results obtained illuminate the critical importance played by solvent‐induced correlations influencing finite‐rate dynamic processes involving biomolecules. Our results also necessitate the development of scalable computational modeling schemes for the accurate description of force generating properties of proteins, in which solvent‐mediated many‐body effects are accounted for. Utilizing these hydrodynamic interactions will allow better theoretical modeling of biological assemblies of all sizes to compare with both single‐molecule force‐based experiments as well as descriptions of biomolecular systems in the context of their cellular mechanical environment.
Methods
Hydrodynamic interactions
In Langevin Dynamics simulations in the overdamped limit, the x‐, y‐, and z‐coordinates of N particles of the system in the three‐dimensional space r, where index α runs through all the degrees of freedom for all N particles (α = 1,…, 3N), are propagated forward in time according to the Langevin equation of motion:In eq. (1),
is the amplitude of random force, Δt is the timestep, k
B
T is the temperature, and γ is the friction coefficient, which is given by the Einstein‐Stokes equation γ = 6πηa, with water viscosity η = 7.0 × 105 pN ps/nm and the particle's size a. In eq. (1),
is the molecular force acting on the α‐degree of freedom (U is the potential energy). In this approach, which ignores the hydrodynamic coupling of degrees of freedom, all particles are described by the same diffusion coefficient,
, and stochastic kicks from solvent molecules are modeled using N(0,1)‐distributed random forces g.To account for solvent‐mediated many‐body effects, one can use an approach due to Ermak and McCammon.25 In this approach, the equation of motion eq. (1) is transformed into the following equation:In eq. (2), the hydrodynamic tensor D, a real 3N × 3N matrix in which an entry D describes a contribution to the diffusion of degree of freedom α from the degree of freedom β. Tensor D can be represented by an N × N matrix of 3 × 3 submatrices D
, each corresponding to a pair of particles i and j. In eq. (2), B = D
1/2; therefore, for the correct distribution of random forces a real 3N × 3N matrix B must satisfy the condition B
τ
B = D, where the superscript τ represents the transpose of a matrix. When D is diagonal with the equal matrix elements D = k/γ, we recover eq. (1).The simplest explicit form of the hydrodynamic tensor D was first proposed by Oseen.29 However, since this quantity is not always positively defined, the correct matrix B may not exist for some system's configurations. In this work, we use the Rotne–Prager–Yamakawa (RPY) tensor30, 31 which is a positive‐definite quantity. The submatrices D
of RPY tensor are given byIn eqs. (3a), (3b), (3c), I is the identity matrix of rank 3 and a is the hydrodynamic radius of the particle. In this work, we use the same value of a for all particles.
Truncated expansion approximation
There are several methods for numerical computation of matrix D, including a fast multipole method32 and Ewald summation method.33 In this work, however, we utilized an approach in which the elements of the diffusion tensor D are computed on a pairwise basis. For each pair of residues i and j, the submatrix D
is computed directly according to eqs. (3a), (3b), (3c). This approach has been extensively tested in conjunction with various algorithms for finding the matrix B,27, 28, 34 and it does not require the periodic boundary conditions. Also, a straightforward way of computing B is Cholesky decomposition of D, but this method requires a total of O(N
3) arithmetical operations and O(N
2) memory space to store the whole matrix D. Clearly, such an approach is not most efficient for an implementation on a GPU, due to memory limitations. Several methods for approximate calculation of the matrix B have been developed, including the Fixman approximation,26 Krylov subspace methods,27 and the Truncated Expansion approximation.28 In the Fixman method,26 the square root of D is approximated using Chebyshev polynomials p(D). Here, we only need a product B·g ≈ p(D)g, which can be computed more efficiently than p(D) itself. This method is iterative and requires estimation of the extreme eigenvalues of the matrix D for fast convergence. The Krylov subspace method27 is similar to the Fixman method, but uses polynomials over Krylov subspace {g, Dg, …, D
m−1
g}. Although the use of Krylov subspace leverages the need to estimate the extreme eigenvalues, this method requires the whole matrix D as the input and several iterations to achieve convergence.Therefore, in this work we employed the Truncated Expansion approximation (TEA).28 This is by far the fastest method for calculation of the matrix D,27 and it can be adapted to compute the matrix elements on a pairwise basis. The TEA method is less accurate than other methods available,27, 34 but given the coarse‐grained nature of the SOP model, we accept this trade‐off to achieve a significant computational speed‐up, while also maintaining a reasonable level of accuracy. In the TEA approach, the matrix elements of B can be rewritten as
, and eq. (2) can be recast as:28
whereIn eqs. (4) and (5), C and b′ are given by
where
. Equations (5), (6), (7) allows one to parallelize the integration algorithm on a GPU. The memory demand associated with using TEA scales with system size as O(N), as opposed to O(N
2) for Cholesky decomposition. The memory cost is small compared to the amount of memory required to store the lists of pairs of interacting particles.
Cholesky decomposition
We used the Cholesky decomposition method to assess the accuracy of the TEA method. Cholesky decomposition allows one to calculate the matrix elements of the diffusion matrix exactly. In this method, a decomposition of a positive‐definite matrix D is employed to find the upper triangular matrix B so that the matrix D can be written as the product D = B
τ
B. Numerically, this is achieved by solving a system of N(N + 1)/2 linear equations. It takes O(N
3) operations using the Cholesky–Banachiewicz or Cholesky–Crout algorithms to solve these equations, which differ only in their access pattern. We used the CPU implementation of Cholesky decomposition available in the GNU Scientific Library.35 For the free energy calculations, we created our own GPU‐based realization of Cholesky decomposition, which is now part of the SOP‐GPU package.
SOP model
We employed the native topology‐based SOP model of a polypeptide chain.7, 8 In the SOP model, each protein residue is represented by a bead, positioned at the residue's Cα‐atom. The explicit form of the potential energy U
SOP expressed in terms of the coordinates of the Cα‐atoms {r} = r of residues 1, 2, …, N is given byIn eq. (8), the first term is the finite extensible nonlinear elastic (FENE) potential, which describes the backbone chain connectivity:
where k = 14 N/m is the spring constant, and the tolerance in the change of the covalent bond distance is R
0 = 2 Å. The distance between the next‐neighbor residues i and i + 1, is r, and
is its value in the native structure. The second term in eq. (8) takes into account the noncovalent (or nonbonded) interactions that stabilize the native state (structure). To describe these interactions, we use the Lennard–Jones potential:In eq. (10), we assume that if noncovalently linked residues i and j (|i−j| > 2) are within the cut‐off distance of 8 Å in the native state, then Δ =1; and Δ = 0 otherwise. The value of ε quantifies the strength of the nonbonded (noncovalent) interactions. In the SOP model, parameter ε sets the energy scale. The third term in eq. (8) accounts for the remaining non‐native (nonbonded) interactions, which are treated as repulsive, including the bond angle between the residues i, i + 1, and i + 2:
where parameters ε = 1 kcal/mol and σ = 3.8 Å define the strength and range of the repulsion.
Model systems
The following model systems were used: (i) a fragment of an α‐helical chain (α‐peptide), (ii) a single WW‐domain (WW), (iii) a trimer of linearly connected WW‐domains (WW3), (iv) a γ‐γ‐crosslinked double‐D fragment from human fibrin(ogen) (double‐D fragment), (v) a single microtubule (MT) protofilament, (vi) the nanocompartment encapsulin, and (vii) the bacteriophage HK97 capsid. The structures of these systems were taken from the Protein Data Base (PDB); see Figure 1. A summarized description of all the model systems, that is, system size, number of native contacts, and the total number of residue pairs for the calculation of diffusion tensor D, is given in Table 1.
Figure 1
Computational performance of the SOP‐GPU package with an approximate treatment of hydrodynamic interactions (TEA HI) and without hydrodynamics (no HI). The computational speed (number of integration steps per second) of an end‐to‐end realization of the SOP‐GPU program implemented on a GPU Tesla K40 is estimated for a number of systems arranged in the order of their increasing size N (Table 1): i) α‐peptide, ii) WW‐domain, iii) trimer WW3, iv) double‐D fragment from fibrin, v) MT protofilament, vi) encapsulin shell, and vii) HK97 virus capsid. The computational speed is compared using: a) single‐run‐per‐GPU approach without the HI (solid line) and with the TEA HI (dash‐dotted line), and b) many‐runs‐per‐GPU approach without the HI (dashed line) and with the TEA HI (dotted line). The inset plot shows the associated GPU memory usage vs. N for using the single‐run‐per‐GPU approach (results of simulations with and without HI are indistinguishable on this scale).
Table 1
Summarized description of and results of performance measurements for α‐peptide, WW‐domain, trimer WW3, double‐D fragment, microtubule (MT) protofilament, encapsulin shell, and bacteriophage HK97 capsid.
System
PDB code
System size N
Number of native contacts
Number of bead pairs
Optimal number of trajectories, sopt
α‐peptide
3GHG
22
32
992
350
WW‐domain
1PIN
38
75
1406
170
trimer WW3
1PIN
144
225
12,882
70
double‐D fragment
1FZB
1062
3894
1,126,782
9
MT protofilament
1JFF
6928
21734
46,990,256
2
Encapsulin
3DKT
15,840
75138
890,813,562
1
bacteriophage HK97
1FT1
115,140
352224
13,257,104,460
1
Shown are the system size (number of amino acids), number of native contacts stabilizing the native state, number of pairs of beads (number of operations in the calculation of RPY tensor), and the optimal number of simulation runs on a GPU (Tesla K40) with the many‐runs‐per‐GPU option.
Computational performance of the SOP‐GPU package with an approximate treatment of hydrodynamic interactions (TEAHI) and without hydrodynamics (no HI). The computational speed (number of integration steps per second) of an end‐to‐end realization of the SOP‐GPU program implemented on a GPU Tesla K40 is estimated for a number of systems arranged in the order of their increasing size N (Table 1): i) α‐peptide, ii) WW‐domain, iii) trimer WW3, iv) double‐D fragment from fibrin, v) MT protofilament, vi) encapsulin shell, and vii) HK97 virus capsid. The computational speed is compared using: a) single‐run‐per‐GPU approach without the HI (solid line) and with the TEAHI (dash‐dotted line), and b) many‐runs‐per‐GPU approach without the HI (dashed line) and with the TEAHI (dotted line). The inset plot shows the associated GPU memory usage vs. N for using the single‐run‐per‐GPU approach (results of simulations with and without HI are indistinguishable on this scale).Summarized description of and results of performance measurements for α‐peptide, WW‐domain, trimer WW3, double‐D fragment, microtubule (MT) protofilament, encapsulin shell, and bacteriophage HK97 capsid.Shown are the system size (number of amino acids), number of native contacts stabilizing the native state, number of pairs of beads (number of operations in the calculation of RPY tensor), and the optimal number of simulation runs on a GPU (Tesla K40) with the many‐runs‐per‐GPU option.The α‐peptide – a 22‐residue peptide from the α‐chain of human fibrinogen (residues 172–19436;) was used in the calculation of the free energy landscape for the α‐type protein. The WW‐domain of mitotic rotamase Pin1,37 a 34‐residue long single protein domain, was used in the calculation of the free energy for the β‐sheet protein. The WW‐trimer WW—used as a prototype of a multidomain protein, was constructed by connecting head‐to‐tail three identical WW‐domains with glycine linkers, which were also attached to the N‐ and C‐termini of WW3. We used WW3 to explore the effect of hydrodynamic interactions on protein forced unfolding. The double‐D fragment from human fibrin(ogen)38 consists of two identical D‐regions from the abutted fibrin monomers connected through the γ‐γ crosslinking sites. We used the double‐D fragment to explore the influence of hydrodynamics on structural transitions in large‐size protein assemblies. The MT protofilaments are long threads of longitudinally arranged α‐ and β‐tubulin heterodimers (439 and 427 amino acids, respectively), which form long MT polymers in eukaryotic cells.39 We used an 8‐dimers long MT protofilament, employing nanoindentations in silico, to study the effects of hydrodynamic interactions on dissociation of longitudinal αβ‐tubulin bonds. The Encapsulin and Bacteriophage HK97 are examples of regular geometry biological protein assemblies.40, 41 We used encapsulin and HK97 (Head II state) to test the SOP‐GPU (with HI option) capability (computational speed, memory usage).
Applications
Langevin dynamics
The biomolecular dynamics were obtained by performing numerical integration of the Langevin equations of motion for each particle's position r
(i = 1, 2, …, N) in the overdamped limit without HI [eq. (1)] and with HI [eq. (2)] using the first‐order integration scheme. We implemented the Truncated Expansion based approximate treatment and the Cholesky decomposition based exact treatment of HI. The molecular forces were calculated from the total potential energy U = U
SOP [eqs. (8), (9), (10), (11)]. The Langevin equations were propagated with the time step Δt = 0.08τ ≈ 20 ps, where τ. Here, τ
2
/ε
1/2 = 3 ps, ζ = 50 is the dimensionless friction constant for a residue in water (η = ζm/τ), and m ≈ 3 × 10−22 g is the average residue mass.42 All the simulations with and without the HI were carried out at 300 K using the bulk water viscosity, which corresponds to the friction coefficient η = 7.0 × 105 pN ps/nm. Numerical values of the hydrodynamic radius a vary in the literature from 1.5 Å5 to 5.3 Å1. The TEA treatment of hydrodynamics correctly handles overlaps,34 but the RPY tensor works better at describing the nonoverlapping particles. Since the Cα−Cα distance in a polypeptide chain is 3.8 Å and due to self‐avoidance, we set a = 1.9 Å.
Umbrella sampling simulations
To map the free energy landscape for unfolding of the WW‐domain and α‐peptide, we utilized the Umbrella Sampling technique (Supporting Information),43 which is widely used to access the thermodynamics of biomolecules.44, 45, 46 Using Umbrella Sampling requires generation and subsequent equilibration of initial conformations (windows), which span the entire range of reaction coordinate (elongation X). The potential of mean force was constructed using the Weighted Histogram Analysis Method.44, 47, 48 To generate a set of initial structures for the α‐peptide and the WW‐domain, we ran 200 4‐ms‐long equilibration simulations for each molecule. We carried out force‐ramp simulations by constraining the N‐terminus and applying the pulling force to the C‐terminus through a virtual cantilever spring κ = 0.05 kcal/mol/Å2 ≈ 35 pN/nm moving with velocity ν = 2.5 μm/s in the direction of X. We generated 1450 sampling windows of width 0.1 Å, corresponding to a 5 ‐ 150 Å interval of X for the WW‐domain; and 630 windows of width 0.1 Å, corresponding to a 25 ‐ 88 Å interval of X for the α‐peptide. These conformations were used in 10‐ms equilibration runs. We carried out the Umbrella Sampling simulations without HI, and with HI using the exact (Cholesky) and approximate (Truncated Expansion) treatments.
Protein forced unfolding
In force‐clamp simulations, which mimic force‐clamp (e.g., AFM‐based) measurements, the Cα‐atom of the molecule at the N‐terminus was constrained and the constant pulling force
= f
with the magnitude f was applied to the Cα‐atom at the molecule's C‐terminus in the direction coinciding with the end‐to‐end vector . In the force‐clamp simulations for WW3, the N‐terminal Cα‐atom in the first WW‐domain was kept fixed whereas f was applied to the C‐terminal Cα‐atom of the third domain (Fig. 1). We generated 1000 runs for each force from f = 100 pN to 320 pN, with 20 pN increment. The unfolding time for each WW‐domain was defined as the first time at which the end‐to‐end distance of the domain had exceeded 90 Å. Force‐ramp simulations were carried out using the time‐dependent force f(t) = κ(ν‐Δx), where ν is the velocity of the cantilever base, represented by a virtual particle, κ is cantilever spring constant, and Δx is the displacement of the pulled residue from its initial position.12 For double‐D fragment, we constrained/pulled γCys139 in the first/second domain D (Fig. 1). We set the pulling velocity to either ν = 2.5 or 25 μm/s, and the cantilever spring constant to κ = 35 pN/nm. The output was used to profile the unfolding force (F) as a function of molecular extension (X) or the FX‐curves.
Nanoindentations in silico
For nanoindentations in silico on the MT protofilament, we harmonically connected the cantilever base with the spherical bead of radius R
tip mimicking the cantilever tip. The tip interacted with the particle via the repulsive Lennard–Jones potential:
which produced an indentation on the particle's surface. In eq. (12), r and r
tip are coordinates of the i‐th residue and the center of the tip, respectively; ε
tip = 4.18 kJ/mol, and σ
tip = 1.0 Å. For the cantilever tip‐sphere (Fig. 6), we solved the following Langevin equation:
where
is the initial position of the tip center, and the friction coefficient γ corresponds to the viscosity η = 7.0 × 106 pN ps/nm. To generate the dynamics of a biological particle, we solved numerically eqs. (1) (without HI) or (2) (with HI), eqs. (8), (9), (10), (11) for the particle, and eqs. (12) and (13) for the tip of radius R
tip = 10 nm. The cantilever base moving with constant velocity (v) exerted the compressive force = f(t)
in the direction perpendicular to the particle surface; f(t) = r increased linearly in time t with the rate r (ν = 3, 10, and 30 μm/s and κ = 50 pN/nm). We monitored the tip position X, which defines the extent of deformation; the resisting force F was calculated using the energy output. To bend the MT protofilament, we constrained the Cα‐atoms at the N‐terminus of the first monomer (residues 248, 253, 257, 262, 325, 326, 329, 348, 349) and at the C‐terminus of the last monomer (residues 98, 176, 177, 180, 221, 224, 225, 403, 407).
Figure 6
The dynamics of deformation of MT protofilament (Fig. 1; Table 1). Shown are the force (F)–deformation (X) curves obtained from in silico nanoindentations with compressive force increasing with ν = 3 μm/s (panel a), 10 μm/s (panel b), and 30 μm/s (panel c). The cantilever spring constant is κ = 50 pN/nm, and the cantilever tip radius is R
tip
= 10 nm. Compared are the FX‐spectra from simulations with no HI (black curves) and TEA HI (blue curves). In panel a, snapshots corresponding to the similarly numbered regions 1, 2, and 3 in the FX‐spectra show the deformation progress, from the bent structure 1 (ascending part of the FX‐spectrum), to the bent structure 2 before the MT protofilament dissociation (force maximum in the FX spectrum), and to the disrupted structure 3 (local minimum in the FX spectrum). In panel c, the locations on the FX‐curves numbered 1–4 correspond to the snapshots displayed in Figure 7. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Results
GPU‐based implementation of hydrodynamic interactions
The SOP‐GPU with HI option can be run in the one‐run‐per‐GPU mode (for large systems when N is comparable with the number of ALUs) and in the many‐runs‐per‐GPU mode (for small systems when N is much smaller than the number of ALUs).12 In our implementation of the TEA algorithm, all computations described in eqs. (3)–(7) are performed on a single GPU. First, we compute molecular forces F and random forces g for all particles. Next, we perform the numerical integration of the Langevin equations of motion using one thread of execution for each particle. At this stage, we compute the elements of tensor D for each particle by performing the summation over contributions from all particles [eq. (4)]. In this approach, each element of D is calculated twice, but we avoid synchronization issues and the need to store the whole matrix, which would be impossible given the relatively limited amount of memory in contemporary GPUs (up to 12 GiB). Here, the only operation performed on a CPU is the averaging of the off‐diagonal elements of D [parameter ε in eq. (7)]. Also, because the positions of amino acid residues in 3D‐space change gradually, parameter b' [eq. (7)] does not need to be updated frequently. In our implementation, we update b' every ∼10–50 steps. More frequent updates do not improve accuracy and less frequent updates do not reduce significantly the computational time.We also implemented the exact treatment of HI where we used the modified Cholesky–Banachiewicz algorithm, in which each thread computes a single row of the matrix B. This requires that the whole matrix is processed by a single CUDA block (a group of threads sharing same memory) for synchronization purposes. In the developed implementation, we utilized shared memory for efficiency. While this approach works well when using the many‐runs‐per‐GPU approach, which allows us to process simultaneously as many blocks as there are trajectories, it places an upper limit on system size due to the number of available threads in a single block: 341 residues on a GPU with CC 2.0 and later or 170 residues for CC lower than 2.0. Leveraging this would result in significant complication of the CUDA code, and simulations with N > 102 residues in centisecond timescales would not be feasible.
Computational performance
We assessed the computational performance of our implementation of the TEA algorithm on a GPU Tesla K40 (from NVIDIA) for the following systems: α‐peptide, WW‐domain, trimer WW3, double‐D fragment, MT protofilament, encapsulin shell, and HK97 capsid (Table 1). In the first set of benchmark tests, we performed long equilibrium simulations (107 integration steps) in the one‐run‐per‐GPU mode for each system at T = 300 K with (TEAHI) and without (no HI) hydrodynamic interactions. We evaluated the computation speed (number of integration steps per second) and memory demand in the one‐run‐per‐GPU mode. The profiles of computational speed and memory usage vs. system size N are compared in Fig. 1. When HI are switched off, the computational speed scales roughly linearly with N. This is because the computational speed is mostly determined by force calculations and by numerical integration of Langevin equations, both having the O(N) complexity. Because the pair list generation, which has the O(N
2) complexity, is performed infrequently (every 105 steps), it barely affects the runtime. Accounting for HI with the TEA approach results in the increase of algorithm complexity to O(N
2) operations (Table 1). Since the integration of the Langevin equations is performed at each step, the runtime scales as ∼N
2, which translates to a more rapid decrease of the computational time with N (Fig. 1). For small systems (101–103), the TEAHI simulations slow down several‐fold. For large‐size systems (∼104–105), the runtime penalty is one‐two orders of magnitude.Next, we performed the same equilibrium simulations, but in the many‐runs‐per‐GPU mode. There is the question of the optimal number of independent trajectories (s
opt) that can be run concurrently on a GPU. In principle, s
opt can be estimated using the formula:
which connects the total number of CUDA cores M
cores, the number of threads per GPU multiprocessor M
th, and the number of cores per GPU multiprocessor P. For Tesla K40 (based on Kepler architecture with compute capability 3.x), M
cores = 2880, M
th = 512, and P = 192. Hence, for the α‐peptide (N = 22 residues) s
opt ≈ 350. Because, eq. (14) provides only an estimate of s
opt, we performed equilibrium simulations described above in which we varied the number of trajectories s. Supporting Information Fig. S2 demonstrates that for small systems (N = 101–103 residues), the computational speed barely changes when s is small. This is because to fully load a GPU one needs to use up all the multiprocessors, which takes place when M
th
·M
cores
/P ≈ 7500 threads. For example, the computational speed is ∼1.7 × 104–2.5 × 104 steps/second for the α‐peptide when s < 350, ∼1.3 × 104–1.7 × 104 steps/second for the WW‐domain when s < 170, ∼0.8 × 104–1.1 × 104 steps/second for WW3 when s < 70, and ∼0.3 × 104–0.5 × 104 steps/second for the double‐D fragment when s < 9. We find that all the profiles have a similar initial slope, but after passing some optimal value s = s
opt the computational speed starts decreasing more rapidly with s (Supporting Information Fig. S2). The GPU is now fully loaded and starts slowing down with each additional run. We used this crossover point to estimate s
opt. For large systems (N = 104–105), the GPU is loaded when s ≈1, and the initial weak dependence on s is not observed.The values of s
opt (Table 1) are the same for the simulations with and without HI, which shows that accounting for the hydrodynamic effects does not change the total numbers of threads of execution. The values of s
opt obtained were next used to carry out the equilibrium simulations (107 integration steps) in the many‐runs‐per‐GPU mode to estimate the average computational speed. Figure 1 shows similar values of the computational speeds attained with the many‐runs‐per‐GPU and one‐run‐per‐GPU approaches, regardless of whether HI are accounted for or ignored. This means that the SOP‐GPU program in the many‐runs‐per‐GPU mode can be utilized to generate 10–100 independent runs with or without HI. The associated memory usage per trajectory for simulations with HI is very close to that without HI (the inset in Fig. 1), which means that biomolecular simulations with HI are not at all limited by the low memory of graphics cards.
Numerical accuracy
Unfolding energy landscape
To assess the numerical accuracy of TEA implementation, we profiled the unfolding free energy and unfolding force for the α‐peptide and the WW‐domain (Fig. 1) ΔG and F, as a function of the end‐to‐end distance X. Figure 2 compares ΔG vs. X and F vs. X obtained with no HI, with exact HI, and with TEAHI. Because HI should not influence the unfolding thermodynamics, the results of calculations of ΔG and F should not depend on whether HI are accounted for or ignored. We see that the profiles of ΔG vs. X (and F vs. X) for all three sets of simulations practically collapse onto the same curve both for the WW‐domain and the α‐peptide. Small deviations of the profiles of ΔG(X) and F(X) obtained with TEAHI from the results obtained using exact HI are due to insufficient sampling (10 ms) of resulting conformations and the very slow but finite rate of change of X.
Figure 2
Thermodynamics of unfolding of the WW‐domain (panel a) and the α‐peptide (panel b); see Figure 1. Compared are the profiles of unfolding force F vs. end‐to‐end distance X (solid curves; left y‐axes) and the unfolding free energy change ΔG vs. X (dashed curves; right y‐axes) obtained with exact HI, TEA HI, and no HI treatments of hydrodynamics. Snapshots show the WW‐domain and α‐peptide in their folded and force‐unfolded states.
The WW domain represents a paradigm for describing folding and unfolding of the β‐sheet proteins. The calculated profile of ΔG confirms that the WW domain is a two‐state folder (Fig. 2a)49 following the single‐step kinetics of unfolding, F→U, from the native (folded) state F to the unfolded state U. By contrast, the profile of ΔG for the α‐peptide shows that this is a downhill folder (Fig. 2b). We also calculated the average ΔG for unfolding of WW‐domain and α‐peptide from their folded state with X = 2.7 nm (WW‐domain) and X = 3.5 nm (α‐peptide) to the globally unfolded state with X = 8.5 nm (WW‐domain) and X = 6.8 nm (α‐peptide). For the WW‐domain, ΔG = 40.9 kcal/mol (no HI), 41.3 kcal/mol (exact HI), and 40.5 kcal/mol (TEAHI). For the α‐peptide, ΔG = 24.0 kcal/mol (no HI), 24.7 kcal/mol (exact HI), and 23.4 kcal/mol (TEAHI). Hence, the results of exact and approximate treatments of hydrodynamics agree very well.Thermodynamics of unfolding of the WW‐domain (panel a) and the α‐peptide (panel b); see Figure 1. Compared are the profiles of unfolding force F vs. end‐to‐end distance X (solid curves; left y‐axes) and the unfolding free energy change ΔG vs. X (dashed curves; right y‐axes) obtained with exact HI, TEAHI, and no HI treatments of hydrodynamics. Snapshots show the WW‐domain and α‐peptide in their folded and force‐unfolded states.Order statistics of the forced unfolding times from force‐clamp simulations for WW3 (Fig. 1; Table 1). The average unfolding times for the 1st, 2nd, and 3rd unfolding transition t
1:3 (panel a), t
2:3 (panel b), and t
3:3 (panel c) vs. constant force f. Compared are the results of simulations with no HI (black circles), exact HI (red triangles), and TEAHI (blue squares). Averages and standard deviations are calculated over 103 trajectories (data for exact‐HI simulations are shifted along the x‐axis for clarity). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Unfolding kinetics
To explore the effects of hydrodynamic interactions on the kinetics of protein unfolding and to probe further the accuracy of TEA‐based versus exact treatment of HI, we carried out force‐clamp simulations for WW3 (Fig. 1). Because in the trimer WW3 any WW‐domain can unfold at any given time with equal probability (domains are identical),50 the unfolding kinetics can be characterized by analyzing the statistics of ordered unfolding time variates {t} (order statistics), where i = 1, 2, 3.51 We calculated the average first, second, and third unfolding times t, t, and t using the results of simulations with exact HI, TEAHI, and no HI treatments of hydrodynamics. The profiles of order statistics data vs. f (Fig. 3) show that the average 1st, 2nd, and 3rd unfolding times all decrease with force. Surprisingly, the unfolding times from the simulations with exact HI and TEAHI are ∼5–10 times shorter than the unfolding times from no HI simulations. Hence, hydrodynamic coupling accelerates unfolding of protein domains. Our results for ∼10−3–10−5 s unfolding times (Fig. 3) also correlate well with the ∼103–105 s−1 unfolding rate constants from temperature jump and urea denaturation experiments.52, 53 The unfolding‐time profiles obtained with exact HI and TEAHI practically overlap in the entire force range, implying a very good agreement between these two approaches (Fig. 3). Hence, the TEA treatment of HI offers an accurate description of the protein unfolding kinetics.
Figure 3
Order statistics of the forced unfolding times from force‐clamp simulations for WW3 (Fig. 1; Table 1). The average unfolding times for the 1st, 2nd, and 3rd unfolding transition t
1:3 (panel a), t
2:3 (panel b), and t
3:3 (panel c) vs. constant force f. Compared are the results of simulations with no HI (black circles), exact HI (red triangles), and TEA HI (blue squares). Averages and standard deviations are calculated over 103 trajectories (data for exact‐HI simulations are shifted along the x‐axis for clarity). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
In our previous study, we showed that mechanical factors couple protein domains in multidomain proteins.54 We estimated pair‐wise correlations between the unfolding times of WW‐domains in WW3
t, I = 1, 2, 3 (parent data). Using statistically representative sets (1000 runs), we calculated for each IJ‐pair of WW‐domains (I, J = 1, 2, and 3) the Pearson correlation coefficient:
and the Spearman rank correlation coefficient:In eqs. (15) and (16),
is the unfolding time of the Ith domain observed in the nth trajectory,
is the rank of
, respectively, and <…> denotes ensemble averaging. The force profiles of p and s (−1 < p, s < 1) are displayed in Figure 4. We observe very strong negative correlations between the unfolding transitions in the 1st WW‐domain (constrained) and 2nd domain (in the middle) reflected in large negative values of p = −0.3–0.7 and s = −0.3–0.6, and weak negative correlations between the unfolding transitions in the 1st and 3rd WW‐domain (pulled) corresponding to small negative values of p = s = −0.05–0.15, and p = −0.05–0.2 and s = −0.05–0.15 in the 150–300 pN force range (Fig. 4). Negative correlations between domains I and J imply that the longer (shorter) unfolding times for domain I correspond to the shorter (longer) unfolding times for domain J on average. These very similar results correspond to exact‐HI and TEA‐HI simulations. In contrast, no‐HI simulations predict that correlations between the unfolding transitions in the 1st and 2nd domains change from large negative (p = s = −0.3–0.6 for f = 100–175 pN) to large positive (p = 0.1–0.4 and s = 0.3–0.6 for f = 175–300 pN), and that correlations between the unfolding times for the 1st and 3rd domains and 2nd and 3rd domains are negligible in the 100–250 pN force range but weak and positive for f > 250 pN (Fig. 4). Positive correlations imply that the longer (shorter) unfolding times for domain I correspond to the longer (shorter) unfolding times for domain J on average. Hence, the results obtained demonstrate that biomolecular simulations with HI ignored incorrectly describe dynamic coupling of protein domains in multidomain proteins.
Figure 4
Pairwise correlations characterizing dynamic coupling of the unfolding transitions in WW‐domains (Fig. 3) from pulling simulations for WW3 with constant force f (Fig. 1; Table 1). Shown are the profiles of Spearman rank correlation coefficient s (panels a–c) and Pearson correlation coefficient p (panels d–f), I ≠ J = 1, 2, 3, calculated using the unfolding times for the 1st and 2nd WW domains (s and p), 1st and 3rd domains (s and p), and 2nd and 3rd domains (s and p). The results are compared for simulations with no HI (black circles), exact HI (red triangles) and TEA HI (blue squares) treatment of hydrodynamics (error bars indicate 95% confidence interval). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Pairwise correlations characterizing dynamic coupling of the unfolding transitions in WW‐domains (Fig. 3) from pulling simulations for WW3 with constant force f (Fig. 1; Table 1). Shown are the profiles of Spearman rank correlation coefficient s (panels a–c) and Pearson correlation coefficient p (panels d–f), I ≠ J = 1, 2, 3, calculated using the unfolding times for the 1st and 2nd WW domains (s and p), 1st and 3rd domains (s and p), and 2nd and 3rd domains (s and p). The results are compared for simulations with no HI (black circles), exact HI (red triangles) and TEAHI (blue squares) treatment of hydrodynamics (error bars indicate 95% confidence interval). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Unfolding pathways
We also analyzed the unfolding pathways. A pathway I‐J‐K is a sequence of unfolding transitions, in which the WW‐domains I, J, and K have unfolded first, second, and third, respectively. The pathway probabilities are accumulated in Table 2. The exact‐HI and TEA‐HI simulations display very similar pathway probabilities, which differ from probabilities for no‐HI simulations. For example, the no‐HI simulations predict that 3‐2‐1 and 3‐1‐2 are the most dominant pathways in the 100‐280 pN force range. For f = 100 pN, the pathways 3‐2‐1 and 3‐1‐2 are detected, respectively, 30.4% and 23% of the time. Yet, the exact‐HI and TEA‐HI simulations display comparable pathway probabilities for all pathways, that is, ∼14–23% (exact‐HI) and ∼11–18% (TEA‐HI). Although at higher forces pathways 3‐2‐1 and 3‐1‐2 become increasingly more dominant in all three simulation schemes, at f = 220 pN, simulations with HI predict 80% (pathway 3‐2‐1) vs. 20% (pathway 3‐1‐2) for the exact‐HI and 83.6% (pathway 3‐2‐1) vs. 16.4% (pathway 3‐1‐2) for the TEA‐HI. At this 220 pN force, the no‐HI simulations predict 99% (pathway 3‐2‐1) vs. 1% (pathway 3‐1‐2). Hence, in terms of pathway probabilities, the results of simulations with and without HI disagree in the entire 100–280 pN force range. This can also be seen in the scatter plots in Supporting Information Figs. S3 and S4, which demonstrate that the TEA‐HI simulations better represent the unfolding pathways not predicted in the no‐HI simulations. Hence, the neglect of hydrodynamic interactions leads to inaccurate descriptions of unfolding pathways.
Table 2
Unfolding pathways for the trimer WW3.
Pathway probability, %
Force, pN
HI
1‐2‐3
1‐3‐2
2‐1‐3
2‐3‐1
3‐1‐2
3‐2‐1
100
exact HI
19.0
14.3
14.3
23.8
11.9
16.7
160
exact HI
0.0
1.0
0.0
5.6
32.4
61.0
220
exact HI
0.0
0.0
0.0
0.1
18.7
81.2
280
exact HI
0.0
0.0
0.0
0.0
5.9
94.1
100
TEA HI
11.0
18.3
15.1
16.6
21.6
17.4
160
TEA HI
0.0
0.9
0.0
2.3
36.0
60.8
220
TEA HI
0.0
0.0
0.0
0.0
16.4
83.6
280
TEA HI
0.0
0.0
0.0
0.0
4.4
95.6
100
no HI
7.8
11.0
8.5
19.3
23.0
30.4
160
no HI
0.0
0.0
0.0
0.0
20.2
79.8
220
no HI
0.0
0.0
0.0
0.0
1.0
99.0
280
no HI
0.0
0.0
0.0
0.0
0.0
100.0
Shown are the probabilities of observing a particular sequence of unfolding transitions in WW‐domains calculated from the force‐clamp pulling simulations with exact HI, TEA HI, and no HI treatments of hydrodynamic interactions for each value of the constant force f = 100, 160, 220, and 280 pN.
Unfolding pathways for the trimer WW3.Shown are the probabilities of observing a particular sequence of unfolding transitions in WW‐domains calculated from the force‐clamp pulling simulations with exact HI, TEAHI, and no HI treatments of hydrodynamic interactions for each value of the constant force f = 100, 160, 220, and 280 pN.
Forced unfolding of double‐D fragment
The very good agreement we obtained comparing the results of exact vs. approximate treatments of hydrodynamics for WW3, has enabled us to explore the effect of hydrodynamic interactions on the dynamics of forced unfolding of a large protein complex–the double‐D fragment (Fig. 1). We demonstrated in our previous study15 that the noncovalent D‐D interface formed by two neighboring fibrin monomers exhibit complex multistep unfolding transitions. We performed the force‐ramp simulations of double‐D fragment with HI (TEAHI) and without HI (no HI) and compared these results obtained with slow and fast cantilever velocities ν = 2.5 and 25 μm/s. The force (F)‐extension (X) profiles (Fig. 5) show that the corresponding FX curves differ only in minute details under slow force loading, but significant differences become manifest under faster loading. Interestingly, the FX‐spectra obtained with HI are more sawtooth‐like, that is, have more pronounced force peaks with higher peak‐forces and deeper valleys, and show much lower tension noise level and a lower lying baseline. The FX curves from TEA‐HI simulations with 2.5 and 25 μm/s cantilever velocity have the baseline running 10–20 pN and 100–200 pN lower, respectively, than the FX curves from no‐HI simulations. We also performed structure analysis using the output from TEA‐HI and no‐HI simulation, and found no substantial differences in the unfolding mechanisms. This can also be learned from the similar QX‐profiles (Fig. 5).
Figure 5
The unfolding transitions in double‐D fragment (Fig. 1; Table 1). Shown are the unfolding force spectra, that is, force (F)–extension (X) curves (left y‐axes), and the profiles of native contacts Q vs. X (right y‐axes) from the force‐ramp simulations with the pulling force increasing with ν = 2.5 μm/s (panel a) and 25 μm/s (panel b). We used the cantilever spring constant κ = 35 pN/nm. Compared are the FX‐and QX‐curves from simulations with no HI (black curves) and TEA HI (blue curves). In panel a, the first small ∼40 pN force peak at the ∼12 nm distance corresponds to the dissociation of D‐D interface (marked by the asterisk). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
The unfolding transitions in double‐D fragment (Fig. 1; Table 1). Shown are the unfolding force spectra, that is, force (F)–extension (X) curves (left y‐axes), and the profiles of native contacts Q vs. X (right y‐axes) from the force‐ramp simulations with the pulling force increasing with ν = 2.5 μm/s (panel a) and 25 μm/s (panel b). We used the cantilever spring constant κ = 35 pN/nm. Compared are the FX‐and QX‐curves from simulations with no HI (black curves) and TEAHI (blue curves). In panel a, the first small ∼40 pN force peak at the ∼12 nm distance corresponds to the dissociation of D‐D interface (marked by the asterisk). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]The disruption of the D‐D interface induces a conformational transition from the “closed” state with hidden D‐D junction to the “open” state with the D‐D interface fully disrupted (see the first small ∼35–40 pN force peak at the ∼12 nm separation distance in Fig. 5a). The following six high‐amplitude force peaks correspond to sequential unraveling of the γ nodules in the globular D regions (three force peaks for each D region).15 Under fast loading (25 μm/s), the last two transitions are resolved in the TEA‐HI simulations and are not resolved in the no‐HI simulations (Fig. 5b). Hence, accounting for the HI makes the FX profiles look sharper and more experimental‐like, in which tension fluctuations are much reduced. Yet, the inclusion of HI does not alter the unfolding pattern.
Bending deformation of MT protofilament
Next, we explored the influence of HI on the dynamics of bending deformations of biological particles using a short 8 αβ‐tubulin dimer long microtubule (MT) protofilament (Fig. 1). We performed in silico nanoindentations of the MT protofilament with and without HI and compared the results obtained with slow, intermediate, and fast cantilever velocities ν = 3, 10, and 30 μm/s (see Supporting Movie S1).The force (F)‐deformation (X) spectra (FX curves) are displayed in Figure 6. The differences in the lineshapes obtained with and without HI grow with the loading rate and become significant under fast loading. At ν = 3 μm/s, the inclusion of HI only slightly affects the values of critical force F* and critical distance X (Fig. 6), that is, X ≈ 6.3 nm and F ≈ 0.19 nN (no‐HI) vs. X ≈ 7.3 nm and F ≈ 0.21 nN (TEA‐HI). Here, the critical force corresponds to the largest deformation before the breakage occurs (snapshots 2 and 3 in Fig. 6a). With the inclusion of HI, the force peak becomes sharper and the baseline is less noisy and runs lower. Increasing v to 10 μm/s barely changes the FX spectrum for the TEA‐HI simulations in terms of the lineshape and the values of F and X, but results in a huge line‐broadening in the spectrum for the no‐HI simulations (Fig. 6b). Under fast loading at ν = 30 μm/s, the force peak barely shifts in the spectrum from the TEA‐HI simulations, but the spectrum from the no‐HI simulations becomes featureless (Fig. 6c). We extracted the average initial slope dF/dX, which was used to calculate the flexural rigidity EI (see Supporting Information). We found that dF/dX = 30.6 pN nm (TEAHI) vs. 31.1 pN nm (no HI), and that EI = 20,500 pN nm2 (TEAHI) vs. 43,800 pN nm2 (no HI). Hence, the neglect of hydrodynamic interactions leads to a two‐fold overestimation of the flexural rigidity of the MT protofilament.The dynamics of deformation of MT protofilament (Fig. 1; Table 1). Shown are the force (F)–deformation (X) curves obtained from in silico nanoindentations with compressive force increasing with ν = 3 μm/s (panel a), 10 μm/s (panel b), and 30 μm/s (panel c). The cantilever spring constant is κ = 50 pN/nm, and the cantilever tip radius is R
tip
= 10 nm. Compared are the FX‐spectra from simulations with no HI (black curves) and TEAHI (blue curves). In panel a, snapshots corresponding to the similarly numbered regions 1, 2, and 3 in the FX‐spectra show the deformation progress, from the bent structure 1 (ascending part of the FX‐spectrum), to the bent structure 2 before the MT protofilament dissociation (force maximum in the FX spectrum), and to the disrupted structure 3 (local minimum in the FX spectrum). In panel c, the locations on the FX‐curves numbered 1–4 correspond to the snapshots displayed in Figure 7. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]To probe the origin of these differences, we performed the principal component analysis (see Supporting Information) using the structure output from the TEA‐HI and no‐HI nanoindentations of the MT protofilament under fast force loading with ν = 30 μm/s (Fig. 6c). The first mode, which accounts for ∼83% of bending deformation, corresponds to the out‐of‐plane displacement of the MT protofilament beam in the direction of applied force (see Supporting Information Movie S2 and Fig. 6a), but the extent of bending is larger in the presence of HI. Hence, hydrodynamic interactions couple the displacements of relevant elements of the particle's structure. The increased amplitude of these collective displacements corresponds to the decrease of the deformation force (Fig. 6c).
Discussion
Simplified models of biomolecules allow researchers to perform meaningful simulations on biologically relevant time scales.55 The SOP model of biomolecules (RNA and proteins)7, 8 is a native topology based coarse‐grained model, which, in conjunction with the computational acceleration available on a GPU,12, 13 enables one to follow the dynamics of biomolecular systems in the relevant experimental subsecond timescale (Supporting Information Fig. S1). To harness the computational power of graphic processors, the SOP‐GPU package utilizes the CUDA Toolkit (from NVIDIA). The SOP model has already been used by many research groups in the computational exploration of a broad range of biological systems.13, 15, 16, 17, 18, 50, 51, 54, 56, 57, 58, 59, 60There are limitations of the SOP model as with any model: (i) the atomic details are ignored; (ii) only native contacts are accounted for; (iii) there are no side chains (in the Cα‐model); and (iv) electrostatic interactions are not described explicitly. Conversely, laser optical tweezers and atomic force microscopy cannot resolve structural transitions on the sub‐nanometer lengthscale; the 10–20 nm indenter size used in nanomanipulations with AFM makes the atomic structure details irrelevant. Although unfolding of proteins occurs via sequential unraveling of the tertiary and then secondary structure blocks formulating the native topology, in mechanical deformation of whole biological particles, the overall geometry of the particle's structures and specific arrangements of large building blocks (e.g., capsomers in virus shells and protofilaments in MT polymers) play an important role. Hence, it is sufficient to model only contacts that reinforce the native structure of protein domains. Although the side chains are not included into the SOP model, their influence is implicit in the long 8 Å cut‐off distance [eqs. (10) and (11)]. Also, the explicit treatment of side chains is feasible with the Cα‐Cβ version of the SOP model, and the inclusion of electrostatic interactions is possible with the screened Coulomb potential.61 Notwithstanding the limitations of the SOP model summarized above, the model has the following major advantages. The simplicity of the SOP model allows using the relevant experimental force‐loading rate conditions to describe the biomolecular transitions in the diffusion‐controlled limit. The simple form of the energy function [eqs. (8), (9), (10), (11), which involves the potential energy terms also used in more detailed descriptions [e.g., FENE potential; see eq. (9)], makes it possible to implement the SOP model on a GPU. Owing to a huge computational acceleration attainable on a GPU, this opens a new direction of research of the force‐generating properties of entire biological particles (viruses, cellular organelles, filaments, etc.)15, 16, 17, 18, 55To account for solvent‐mediated many‐body effects in Langevin dynamics schemes, here, we have employed the Truncated Expansion approximation for calculation of the Rotne–Prager–Yamakawa hydrodynamic tensor D. We assessed the computational performance of the end‐to‐end realization of Langevin dynamics with HI on a GPU by profiling the computational speed and memory demand (Fig. 1 and Supporting Information Fig. S2) for a range of model systems (N = 101–105 residues; Table 1). Benchmark testing showed that these systems can be computationally described with the HI in reasonable wall‐clock time and that these simulations are not limited by the low memory of graphics cards. To generate 1 ms of Langevin dynamics (with time step of 20 ps) with TEAHI on a single GPU Tesla K40 using the many‐runs‐per‐GPU mode, it would take ∼2 h for the D‐dimer (s
opt = 9); and ∼43 hours for the 8‐dimers long MT protofilament (s
opt = 2). It takes 5–20 times longer, depending on system size, to simulate the dynamics with HI (Fig. 1), but the global transitions in biomolecules occur on a faster timescale in the presence of HI (Figs. 3, 5, and 6, Supporting Information Figs. S3 and S4). For example, in the force‐clamp simulations for the WW‐domain (Fig. 3), the three‐fold computational slowdown is more than balanced by the approximately four‐fold speed‐up of unfolding kinetics.We assessed the numerical accuracy of our GPU‐based realization of the TEA approach to describing the HI by comparing the results of free energy calculations and protein forced unfolding simulations generated with the approximate and exact treatments of hydrodynamics. The profiles of unfolding free energy for the TEA‐HI and exact‐HI treatments for the WW‐domain and the α‐peptide practically collapse onto the same curve (Fig. 2). The thermodynamic state functions (ΔG) characterizing forced unfolding in these model proteins estimated based on these two approaches agree very well. The small size of the trimer WW3, used as an example of a multidomain protein, allowed us to directly compare how simulations with the exact and approximate treatments of HI describe the protein unfolding. Analysis of the unfolding kinetics for WW3 revealed overlapping profiles of ordered unfolding times (Fig. 3), as well as Pearson correlation and Spearman rank correlation coefficients (Fig. 4), and the very similar values of pathway probabilities (Table 2) obtained with the exact and TEA treatments of HI. These results imply an almost quantitative agreement between the results of exact vs. TEA treatments of HI. Hence, the TEA approach offers a convenient, yet accurate description of the thermodynamics and kinetics of protein unfolding.The main results from this part of our studies are the following. Hydrodynamic interactions: (i) accelerate the kinetics of unfolding transitions in protein domains in multidomain proteins and polyproteins (Fig. 3 and Supporting Information Figs. S3 and S4) in agreement with experiments; (ii) couple these transitions when the tension is sufficiently high (> 160–200 pN for WW‐domain; Fig. 4), thereby making them interdependent in agreement with our previous theoretical study54; and (iii) influence the kinetic partitioning of unfolding transitions into various pathways (Table 2).These results can be understood by considering the dynamics of tension propagation in a polypeptide chain together with the kinetics of protein forced unfolding. When the pulling force is small, that is, it barely exceeds the threshold force for unfolding, the timescale of tension propagation and redistribution is much shorter than the unfolding kinetics. Under these conditions, the effect of hydrodynamic interactions between residues in the same or different WW‐domains is negligible, and they unravel nearly independently, with (almost) equal probability regardless of their position in the trimer WW3. Hence, any of the six pathways (1‐2‐3, 1‐3‐2, 2‐1‐3, 2‐3‐1, 3‐1‐2, 3‐2‐1) should be observed with the equal probability of 1/6. When the pulling force is large, the hydrodynamic effects become important. Because HI are a kinetic effect, that is, they couple the displacements of various residues, the native contacts stabilizing the WW‐folded state disrupt roughly at the same time. The larger the force, the more time‐coordinated are the dissociations of native contacts. In a sense, the WW‐domains become destabilized kinetically, which shortens their lifetime and accelerates the unfolding rate (Fig. 3), but not thermodynamically. Recall that the WW‐domains are equally stable thermodynamically with and without HI (Fig. 2). At large force, the timescale of tension propagation becomes comparable with the unfolding kinetics, and the domains closer to the point of force application (i.e., the 3rd and 2rd WW‐domains) unfold more rapidly. This is the cause of kinetic partitioning and emergence of dominant pathways 3‐2‐1 and 3‐1‐2 at large forces (Table 2). However, because tension propagation is more rapid and tension redistribution is more uniform in the presence of HI, the kinetic partitioning becomes stronger at larger forces (compared to when HI are off) when the unfolding kinetics exceeds the rate of tension propagation.The excellent agreement we obtained between the results of approximate and exact treatments of HI, has enabled us to explore the influence of hydrodynamic coupling on dynamic processes involving large biomolecular complexes comprised of a number of protein building blocks (Table 1). We studied the dynamics of protein forced unfolding for the D‐dimer and the dynamics of deformation of an MT protofilament (Fig. 1; Table 1). The results from this part of our studies showed that the inclusion of HI: (iv) modulates the deformation dynamics of a biological particle, which become more deterministic (i.e., less variable) with the reduced role of tension or stress fluctuations; (v) results in the force‐extension/deformation spectra (FX‐curves) that are drastically different from the corresponding FX‐curves collected in the absence of HI, especially in the limit of fast loading (Figs. 5 and 6). However, (vi) the HI do not change the physical picture underlying the unfolding/deformation. The FX‐curves obtained using the TEA treatment of hydrodynamics (Figs. 5) look much more similar to the typical experimental FX‐spectra15 compared to the results obtained with the no‐HI approach. The force peaks, which quantify the limits of the system's stability to the applied pulling/compressive force, are more distinguishable and they are now reminiscent of the experimental sawtooth‐like force signals due to the lower running baseline and weaker noise level. The lower lying baseline in the FX‐spectra means that faster tension (or stress) propagation and a more uniform tension (or stress) distribution exists in the D‐dimer (MT protofilament); see Figure 5 (Fig. 6). This leads to a smaller unfolding (deformation) force reading for the D‐dimer (MT protofilament). Weaker noise levels imply reduced mechanical perturbations experienced by the system due to the external tension/stress factor. This is because hydrodynamic interactions couple structural elements, thereby making them able to undergo correlated displacements and essentially mask the mechanical fluctuations. These correlated displacements also echo with the coupled unfolding transitions observed in the trimer WW3 in the presence of HI that we discussed above. Stated differently, in the presence of HI the displacements are less individual (or local) and more collective (or global). Hence, the inclusion of HI accelerates tension or stress propagation and/or redistribution, and tension or stress fluctuations are much reduced.Surprisingly, the inclusion or neglect of HI does not change qualitatively the mechanism of global transitions (unfolding, deformation, and breakage). This conclusion is uniformly consistent with the results of structural analyses for the D‐dimer unraveling, and MT protofilament deformation and breakage. The most striking evidence supporting this notion comes from the principal component analysis of nanoindentations of the MT protofilament. When the HI are switched on, the out‐of‐plane deformations of the αβ‐tubulin dimers at the clamped ends and around the indenter (first principal mode) become strongly coupled (Supporting Information Movie S2), which makes these displacements time‐coordinated. This explains the lower lying FX‐curves (Fig. 6), and smaller resulting values of the deformation force for a more bendable beam structure. This last point can be better understood by comparing the structures of MT protofilament and deformation forces (F) for the same values of deformation X = 2, 5, 7.3, and 13.3 nm compared in Figure 7 for fast force loading (ν = 30 μm/s) with and without the HI. We observe the very different values of deformation forces: F = 250 pN (no HI) vs. 40 pN (TEAHI) for X = 2 nm (weakly bent conformation); F = 400 pN (no HI) vs. 120 pN (TEAHI) for X = 5 nm (weakly bent conformation); F = 450 pN (no HI) vs. 210 pN (TEAHI) for X ≈ 7 nm (strongly bent conformation); and F = 490 pN (no HI) vs. 75 pN (TEAHI) for X ≈ 13 nm (dissociated state). Taking into account the ∼64 nm size of 8 αβ‐tubulin dimers long MT protofilament (Fig 1), we can appreciate the fact that in the presence of HI the displacements of the relevant elements of MT protofilament structure are coordinated on a truly global scale.
Figure 7
Structural snapshots explaining detectable differences in the dynamics of deformation of the 8 αβ‐dimers long MT protofilament fragment (Table 1) in the absence of HI (no HI; structures 1a, 2a, 3a, and 4a on the left) and in the presence of HI (TEA HI; structures 1b, 2b, 3b, and 4b on the right). The snapshots for X = 2, 5, 7.3, and 13.3 nm deformation correspond to the FX‐curves in Figure 6c. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Structural snapshots explaining detectable differences in the dynamics of deformation of the 8 αβ‐dimers long MT protofilament fragment (Table 1) in the absence of HI (no HI; structures 1a, 2a, 3a, and 4a on the left) and in the presence of HI (TEAHI; structures 1b, 2b, 3b, and 4b on the right). The snapshots for X = 2, 5, 7.3, and 13.3 nm deformation correspond to the FX‐curves in Figure 6c. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]Our finding that the neglect or inclusion of HI does not alter the mechanism of transitions correlates well with our conclusion that hydrodynamic coupling is entirely a kinetic effect, which mainly correlates the residues undergoing concerted displacements. To illustrate this point, we used an approach described in Ref. [
62] to calculate the local (position‐dependent) diffusion coefficient D along the reaction coordinate X,In eq. (17),
is the average value of X, that is, the end‐to‐end distance for the D‐dimer (Fig. 5) and deformation for the MT protofilament (Fig. 6),
is the variance, and
is the characteristic time (
). The results of calculation of D for the D‐dimer and MT protofilament are presented in Figure 8. Both for simulations with and without the HI, the profiles of D and F vs. X are peaked around the critical values of X, which mark these global transitions: unfolding events in the D‐dimer; and breakage of the MT protofilament. The higher lying baseline and the peaks of D correspond to simulations with HI (Fig. 8). Hence, the large‐amplitude global transitions are accelerated many‐fold when the hydrodynamic interactions are switched on.
Figure 8
Profiles of the position‐dependent diffusion coefficient D (left y‐axis) vs. X for the double‐D fragment unfolding (panel a), and the MT protofilament deformation (panel b) based on experiments in silico with TEA HI (blue curves) and with no HI (black curves). Also shown are the corresponding FX‐curves from Figures 5a, and 6a (dashed curves; right y‐axis), respectively, for the D‐dimer (panel a), and MT protofilament (panel b). For the D‐dimer, we only display the initial 60 nm portions of the D‐ and FX‐curves. Thin vertical lines correlate the critical values of reaction coordinate X in FX‐ and D‐curves marking the global transitions. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Profiles of the position‐dependent diffusion coefficient D (left y‐axis) vs. X for the double‐D fragment unfolding (panel a), and the MT protofilament deformation (panel b) based on experiments in silico with TEAHI (blue curves) and with no HI (black curves). Also shown are the corresponding FX‐curves from Figures 5a, and 6a (dashed curves; right y‐axis), respectively, for the D‐dimer (panel a), and MT protofilament (panel b). For the D‐dimer, we only display the initial 60 nm portions of the D‐ and FX‐curves. Thin vertical lines correlate the critical values of reaction coordinate X in FX‐ and D‐curves marking the global transitions. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
Conclusion
Notwithstanding the huge computational speed‐up attained when Langevin simulations are ported to a GPU,12, 13 it is not yet possible to describe the hydrodynamic interactions exactly in reasonable wall‐clock time, even for a relatively small system of N = 103 residues (D‐dimer; Table 1) and even at the one‐bead‐per‐residue level. This situation is due primarily to prohibitively expensive numerical costs associated with the computation of the Rotne–Prager–Yamakawa hydrodynamic tensor. Here, we implemented an algorithm for fast yet accurate calculation of the hydrodynamic tensor on a GPU, which is based on the Truncated Expansion approximation.28 This development allowed us to carry out a comprehensive study of the influence of hydrodynamic interactions on the biomechanical and force‐generating properties for a range of biomolecular systems from small proteins to large‐size protein assemblies.Biomolecular simulations of protein forced unfolding with the hydrodynamic interactions neglected lead to an inaccurate description of the unfolding kinetics and pathways and a significant underestimate of the role of dynamic coupling of protein domains forming a multidomain protein. The visual differences in the force‐extension/deformation spectra and corresponding sizeable disagreement, for example, in the values of slope and critical forces for unfolding/deformation can potentially lead to erroneous estimation of the physico‐chemical and thermodynamic characteristics. These problems can be alleviated using TEA treatment of hydrodynamics. Undoubtedly, hydrodynamic interactions represent a more accurate physical picture of the properties of protein domains imparted by solvent to biomolecules undergoing structural transformation, whether this is as part of their functional mechanism in the biological environment, for example, kinesin walking on the microtubule61 or during an experimental pulling/deformation/indentation measurement.The running of Langevin simulations with implicit approximate or exact treatments of hydrodynamic interactions on a GPU is now possible with the SOP‐GPU package fully available for downloading at https://github.com/BarsegovGroup/SOP‐GPU.Supporting InformationClick here for additional data file.Supporting Information Movie S1Click here for additional data file.Supporting Information Movie S2Click here for additional data file.
Authors: Moritz Mickler; Ruxandra I Dima; Hendrik Dietz; Changbong Hyeon; D Thirumalai; Matthias Rief Journal: Proc Natl Acad Sci U S A Date: 2007-12-13 Impact factor: 11.205