Literature DB >> 24571643

Minimal model of self-assembly: emergence of diversity and complexity.

Abstract

Molecular self-assembly is ubiquitous in nature, yet prediction of assembly pathways from fundamental interparticle interactions has yet to be achieved. Here, we introduce a minimal self-assembly model with two attractive and two repulsive beads bound into a tetrahedron. The model is associated with a single parameter η defined as the repulsive to attractive interaction ratio. We explore self-assembly pathways and resulting assembly morphologies for different η values by discrete molecular dynamics. Our results demonstrate that η governs the assembly dynamics and resulting assembly morphologies, revealing an unexpected diversity and complexity for 0.5 ≤ η < 1. One of the key processes that governs the assembly dynamics is assembly breakage, which emerges spontaneously at η > 0 with the breakage rate increasing with η. The observed assembly pathways display a broad variety of assembly structures characteristic of aggregation of amyloidogenic proteins, including quasi-spherical oligomers that coassemble into elongated protofibrils, followed by a conversion into ordered polymorphic fibril-like aggregates. We further demonstrate that η can be meaningfully mapped onto amyloidogenic protein sequences, with the majority of amyloidogenic proteins characterized by 0.5 ≤ η < 1. Prion proteins, which are known to form highly breakage-prone fibrils, are characterized by η > 1, consistent with the model predictions. Our model thus provides a theoretical basis for understanding the universal aspects of aggregation pathways of amyloidogenic proteins relevant to human disease. As the model is not specific to proteins, these findings represent an important step toward understanding and predicting assembly dynamics of not only proteins but also viruses, colloids, and nanoparticles.

Entities: Chemical Disease Gene Species

Mesh：

Substances：

Year: 2014 PMID： 24571643 PMCID： PMC4324428 DOI： 10.1021/jp412819j

Source DB: PubMed Journal: J Phys Chem B ISSN： 1520-5207 Impact factor: 2.991

Introduction

Molecular self-assembly, a spontaneous association of disordered components into an ordered supramolecular structure, is responsible for the formation of complex biological systems and is becoming increasingly important in material sciences striving to develop novel biomaterials with a great diversity of biochemical and physical properties. Very little is known about the mechanisms underlying the emergence of ordered structures from disordered components. Recently, a remarkable diversity of polyhedra that self-assemble through excluded volume interactions at high packing fractions into entropic crystals with various degrees of crystalline order was reported.[1] At the opposite spectrum of a high packing fraction is protein aggregation, which typically occurs in vivo at nanomolar and in vitro at micromolar concentrations and must be consequently facilitated by attractive intermolecular interactions. Aberrant protein aggregation is at the core of many age-triggered diseases, such as Alzheimer’s, Parkinson’s, and Huntington’s disease, amyotrophic lateral sclerosis, type II diabetes, systemic amyloidoses, and others.[2] These amyloid proteins do not share any obvious aspects of the primary structure yet they self-assemble into cytotoxic low-molecular weight oligomers and form fibrils with a common cross-β structure.[3] Inherent toxicity of amyloid assemblies that cause the disease implies a common assembly mechanism;[4] however, assembly pathways are not well understood.[5] Here, we introduce a minimal model of self-assembly that unifies common features of protein amyloidogenesis and can be meaningfully mapped onto sequences of amyloid proteins. Because the model is not protein specific, the model predictions extend to self-assembly systems beyond the aggregation of amyloidogenic proteins.

Methods

Model Construction

A self-assembly model with a minimal number of beads and interparticle interactions is constructed as following. A one-bead molecule does not allow for the implementation of both attractive and repulsive interactions simultaneously. A two- or three-bead molecule is either anisotropic or planar, which imposes a priori geometric restrictions on self-assembling molecules. A three-dimensional molecule can be formed by a minimum of four beads, which we place at the four vertices of a tetrahedron (Figure 1A). Each tetrahedron molecule thus represents a molecule (protein monomer) comprising four beads each of a diameter D connected by four covalent bonds of identical lengths d (Figure 1A,B). Each bead within a tetrahedron molecule is assigned either an attractive (hydrophobic) or repulsive (hydrophilic) character, resulting in three possible model variants: (i) one hydrophilic and three hydrophobic beads, (ii) three hydrophilic and one hydrophobic bead, and (iii) two hydrophobic and two hydrophilic beads. Consistent with discrete molecular dynamics (DMD) requirements (see below), the effective intermolecular hydrophobic attraction among hydrophobic beads and effective intermolecular hydrophilic repulsion among hydrophilic beads of different molecules are modeled by square-well potentials with a depth Eh and a height Ep, respectively, (Figure 1C). A hydrophobic and a hydrophilic bead interact through an excluded volume only. The ratio of the two potential energies, η = Ep/Eh, is hereafter referred to as the hydropathy ratio. In variant (i), molecules aggregate into a single densely packed quasi-spherical assembly (data not shown). In variant (ii), molecules form stable dimers and trimers that do not assemble further (data now shown). In variant (iii) (Figure 1A), molecules exhibit complex assembly dynamics, which we examine for several values of the hydropathy ratio.

Figure 1

Definition of the tetrahedron model. (A) A tetrahedron molecule comprises two hydrophobic (red) and two hydrophilic (blue) beads of a diameter D located at the vertices of a tetrahedron with the edge length d. (B) An “infinite” square-well intramolecular potential models covalent bonds among the four beads in the tetrahedron molecule. The bond length d can vary between r1 = 1.56D and r2 = 1.61D. (C) Effective attractive (red) and repulsive (blue) intermolecular potentials between pairs of hydrophobic and hydrophilic beads, respectively. The distance rmin = D corresponds to a sum of the van Der Waals radii of the two interacting beads and rmax = 2.22D is the interaction range distance. The strengths of the effective hydrophobic attractive and hydrophilic repulsive potential are denoted by Eh and Ep, respectively.

Discrete Molecular Dynamics

When interparticle potentials are approximated by a square-well or a combination of square-well potentials, molecular dynamics can be reduced to an efficient, event-driven discrete molecular dynamics (DMD). DMD was initially applied to simulate a system of hard spheres.[6,7] More recently, several low, intermediate, and high resolution protein models combined with DMD provided important insights into protein folding and assembly.[8−17] Here, we apply DMD to examine pathways of self-assembly of the tetrahedron model. In each simulation, 1000 tetrahedron molecules occupy the simulation box with an edge length of 111D, which corresponds to a volume fraction of 2.80 ± 0.13 × 10–3. Periodic boundary conditions are used and temperature is controlled by the Berendsen thermostat.[18] We examine self-assembly at five hydropathic ratios η = Ep/Eh = 0, 0.5, 0.75, 1, 1.25. For each η, 1000 tetrahedron molecules are placed into a cubic lattice in the simulation box, followed by high-temperature (kBT ≫ Eh) DMD simulations, which produce five distinct initial ensembles of randomly distributed non-interacting tetrahedron molecules. Thus obtained ensembles, five for each of the five η values, resulting in 25 ensembles in total, are then used as initial configurations for the production runs. Here, a configuration is defined as an ensemble of coordinates of all tetrahedron molecules in a given trajectory at a given simulation time. Production runs are 10 × 106 (η = 0) or 50 × 106 (η > 0) simulation steps long, and configurations are recorded every 104 simulation steps, resulting in 5 × 1000 = 5000 configurations (η = 0) or 5 × 5000 = 25 000 configurations (η > 0). The time scale of one simulation step, Δt, can be expressed in terms of the spatial resolution, Δx ∼ D ≈ 10–9 m (thus approximating a protein by a tetrahedron with a circumsphere radius of ∼1 nm), and thermal energy using the equipartition theorem,where we chose for the mass of one bead mB ∼ 10–22 kg, equivalent to a sequence of ∼50 amino acids or a ∼ 200-residue protein corresponding to each tetrahedron molecule. We use the thermal energy at a physiological temperature 310 K, kBT ≈ 4 × 10–21J, resulting in Δt ≈ 16 × 10–9 s and the total simulation time per trajectory at nonzero η values (50 × 106 simulation steps) of ∼8 × 10–3 s. Note that the unit of length, D, and the mass of each bead, mB, can be adjusted to the size and molecular mass of the protein of interest, which consequently affects the time scale, Δt ∝ D(mB)1/2, such that without considering the actual protein sequences, self-assembly of a larger protein occurs on a longer time scale than self-assembly of a smaller protein.

Probability Distribution of Assembly Sizes

The assembly size is defined as the number of molecules within a connected cluster. A recursive algorithm is used to identify tetrahedron molecules that belong to each cluster. A tetrahedron molecule is identified as a part of the cluster if any of its four beads is at a distance ≤2.22D from at least one of the beads of the molecules within the cluster. To derive three-dimensional plots displaying the time evolution of assembly sizes for each trajectory, assembly size distributions are calculated by using a binning interval of 5 × 105 simulation steps. The probability distribution of assembly sizes is calculated by using a 106 simulation steps-wide window, resulting in 100 ensembles, each comprising 1000 tetrahedron molecules in various assembly states. Then, the average over the five probability distributions is taken, and the corresponding standard error of the mean (SEM) values are calculated. If an assembly size appears only once in a single trajectory, the corresponding SEM value is set to zero. By moving the 106 simulation steps-wide window along the assembly trajectories between 0 and 50 × 106 simulation steps, time evolution of the size distribution for each η is derived.

Order Parameter

To quantify the degree of orientational order within the assemblies of different sizes acquired for different η values, we calculate the order parameter S of each assembly. Each molecule in the assembly is characterized by an axis that connects its two hydrophobic (red) beads. The angle θ between the axis of each molecule and the long axis of the elongated assembly, determined by the principal component analysis,[19] is then used to calculate S as the average:over all molecules in the assembly. Note that due to periodic boundary conditions, some assemblies are split in two or more clusters, which are separated by the linear size of the cubic simulation box in either of the three directions. To accurately calculate the order parameter, these split clusters are merged into a single cluster prior to the order parameter calculation. The order parameter calculation is performed for each assembly of size larger than 5 within 5000 (η = 0) or 25 000 (η > 0) recorded configurations, each containing 1000 tetrahedron molecules. For each ensemble of assemblies of a size ≥6, the average order parameter and SEM values are calculated by taking into account the order parameter values of all assemblies of a given size. For assemblies comprising 100 or more tetrahedron molecules, a running window along the assembly axis containing 50 molecules is used to calculate the local order parameter, followed by an average over all local order parameters.

Results

Here a minimal model of self-assembling molecules, each characterized by a combination of attractive and repulsive interactions, is examined. Generally, any molecule prone to aggregation is characterized by attractive and repulsive interactions due to van der Waals and electrostatic interactions. We adopt in the following protein folding and aggregation terminology when discussing the origin of attractive and repulsive interactions. In proteins, attractive and repulsive interactions stem from the hydropathic (polar versus nonpolar) and electrostatic (charged versus uncharged) nature of individual residues. We posit that in addition to the effective hydrophobic attraction, the effective hydrophilic and/or electrostatic repulsion due to solvation of polar side chains by water molecules plays a critical role in prediction of protein assembly pathways and structures. We then construct a tetrahedron model with two hydrophobic and two hydrophilic beads as described in the Model Construction section (Methods), in which the hydropathy ratio η = Ep/Eh between the repulsive Ep and attractive Eh potential energies is introduced as the only model parameter (Figure 1). Self-assembly of the tetrahedron model is studied in the absence of repulsion among the hydrophilic beads (η = 0) and at four different hydropathy ratios (η = 0.5, 0.75, 1, and 1.25). To avoid a comparison of the assembly pathways of proteins with vastly different solubilities, the energy scale Eh is adjusted for each η to keep the final monomer concentration within the range of 10–16% (Table 1). For each η, five trajectories of self-assembly from initially monomeric ensembles of 1000 tetrahedron molecules are acquired and analyzed. For each trajectory, time evolution of assembly sizes is monitored and displayed as a three-dimensional plot (Figure 2), in which each assembly is also characterized by an order parameter as described in the Order Parameter section (Methods), measuring a degree of the orientational order of molecules within the assembly (Figure 2). Below, we refer to small, quasi-spherical assemblies as oligomers to distinguish them from elongated, curvilinear assemblies (protofibrils) and more ordered assemblies (fibrils).

Table 1

Energy Unit and Steady-State Monomer, Dimer, and Trimer Concentrationsa

η = E_p/E_h	E_h [k_BT]	[monomer] [%]	[dimer] [%]	[trimer] [%]
0.00	0.97	14.027 ± 0.229	0.408 ± 0.028	0.014 ± 0.004
0.50	1.16	11.000 ± 0.383	0.272 ± 0.024	0.007 ± 0.003
0.75	1.27	10.737 ± 0.412	0.249 ± 0.016	0.007 ± 0.001
1.00	1.37	12.768 ± 0.268	0.430 ± 0.031	0.025 ± 0.005
1.25	1.45	15.687 ± 0.095	0.886 ± 0.016	0.052 ± 0.009

The potential energy Eh, associated with the effective hydrophobic attraction, which represents a unit energy in our model is adjusted for each hydropathy ratio η to yield comparable steady-state monomer concentration (solubility). For each η, steady-state monomer, dimer, and trimer concentrations and their SEM values are calculated by averaging over monomer, dimer, and trimer concentrations for each trajectory within 9–10 × 106 steps (η = 0) or 49–50 × 106 steps (η > 0), followed by averaging over the five trajectories.

Figure 2

Assembly pathways of all trajectories. Three-dimensional plots showing probability distributions P(n) of different assembly sizes n as a function of the simulation time t during the process of assembly starting from an ensemble of spatially separated (monomeric) tetrahedron molecules. Three–dimensional plots for each of the five trajectories, labeled as S1, S2, S3, S4, and S5, for each of the five η values are displayed. The average order parameter S that characterizes each assembly is color coded as shown on the color scale. The potential energy Eh, associated with the effective hydrophobic attraction, which represents a unit energy in our model is adjusted for each hydropathy ratio η to yield comparable steady-state monomer concentration (solubility). For each η, steady-state monomer, dimer, and trimer concentrations and their SEM values are calculated by averaging over monomer, dimer, and trimer concentrations for each trajectory within 9–10 × 106 steps (η = 0) or 49–50 × 106 steps (η > 0), followed by averaging over the five trajectories.

Amorphous Assembly at η = 0

Figure 3A shows time evolution of assembly sizes for four representative trajectories corresponding to η = 0, 0.5, 1, and 1.25, respectively. At η = 0, the initial hydrophobic collapse driving the assembly occurs rapidly, bringing the monomer concentration from the initial 100% to the final value of ∼14% (Table 1). Within the first 106 simulation steps (Figure 4A), initial monomers rapidly assemble into an aggregate comprising of 800–900 molecules, which remains in a steady-state equilibrium with monomers (Figure 3A, left) and small populations of dimers and trimers (Table 1). All assemblies have a relatively low order parameter as color-coded in Figure 3A (left). The assembly proceeds through formation of quasi-spherical oligomers, further elongating into larger curved tubules, which collapse onto themselves, exposing nonhydrophobic (blue) and shielding hydrophobic (red) beads from the “solvent”, and eventually form a large quasi-spherical aggregate with an overall porous morphology (Figure 3A, right). The high bending propensity of these aggregates is reflected in their low order parameter for all five trajectories, which display almost identical assembly pathways (Figure 2).

Figure 3

Figure 4

Time evolution of assembly size distributions. The average assembly size distributions at η equal to (A) 0, (B) 0.5, (C) 1, and (D) 1.25 steadily evolve from the initial to the later assembly stages. Note that for η = 0, the size distribution reaches steady state within the initial 106 simulation steps. The error bars correspond to SEM values.

Time evolution of assembly pathways. Three-dimensional plots showing the assembly size distribution or probability (z-axis) of all assembly sizes (x-axis) at different times (y-axis) for representative trajectories (left) and their assembly pathways with characteristic morphologies (right) for simulations at (A) η = 0; (B) η = 0.5; (C) η = 1; and (D) η = 1.25. The color code in (A) is used to characterize the order parameter of each assembly in the three–dimensional plots. Note that the conformations are not displayed on the same scale: The final aggregates along assembly pathways for η = 0, 0.5 (A,B, right) are considerably larger than those for η = 1 and 1.25 (C,D, right), however, their lateral dimensions are comparable. Time evolution of assembly size distributions. The average assembly size distributions at η equal to (A) 0, (B) 0.5, (C) 1, and (D) 1.25 steadily evolve from the initial to the later assembly stages. Note that for η = 0, the size distribution reaches steady state within the initial 106 simulation steps. The error bars correspond to SEM values.

Transient Oligomers and Protofibril-like Assemblies at η = 0.5

Simulations at η = 0.5 can be characterized by two assembly stages. The initial assembly stage lasts ∼10 – 20 × 106 simulation steps (Figure 3B, left), which is an order of magnitude longer than the hydrophobic collapse observed for η = 0. During this initial stage, smaller quasi-spherical oligomers coexist with larger elongated aggregates and a rapid elongation of large aggregates is mostly driven by oligomer addition. The assembly size distribution averaged over all five trajectories (Figure 2) reveals a large number of oligomers comprising 30–70 tetrahedron molecules forming within ∼106 simulation steps (Figure 4B, black curve). These oligomers then coalesce into larger assemblies comprising 100–140 tetrahedron molecules at 1–2 × 106 simulation steps (Figure 4B, blue curve). After this initial assembly stage (at ∼20 × 106 simulation steps), oligomers disappear, marking the onset of the late assembly stage. In the late assembly stage, the elongated aggregates either increase or decrease in length at a rate distinctly lower than the aggregation rate of the initial assembly stage, while persisting in a steady-state equilibrium with the monomer population and minor populations of dimers and trimers (Table 1). Large aggregates at this late stage increase or decrease in size predominantly by monomer association and dissociation. In addition to this predominant growth dynamics, two additional albeit less frequent processes are noted: (a) merging of two elongated assemblies into a single larger elongated assembly and (b) breaking of an elongated aggregate into two smaller aggregates, which continue to either increase or decrease in length. The second process of aggregate breakage results in smaller assemblies of various sizes (Figure 4B, cyan, green, orange, and red curves). The morphology of the assemblies observed in simulation at η = 0.5 differs from those observed for η = 0. All assemblies at η = 0.5 have a considerably larger order parameter than those observed in simulations at η = 0. The order parameter of assemblies increases with their size (Figure 3B, left). Elongated aggregates are considerably less prone to bending than those observed for η = 0, but display kinks and can be slightly bent (Figure 3B, right). Unlike trajectories at η = 0, trajectories at η = 0.5 display significant variations (Figure 2).

Breakage-Dominated Assembly at η ≥ 1

A two-stage assembly process observed for η = 0.5 is absent from simulations at η = 1. All trajectories acquired at η = 1 reach a steady state rapidly within 106 simulation steps. The assembly pathways appear to be characterized by a single assembly growth rate (Figure 2). This rate is slightly lower than the one characterizing the initial assembly stage but notably larger than the elongation rate observed in the late assembly stage at η = 0.5. Importantly, the assembly dynamics at η = 1 is dominated by assembly breakage, which critically reduces the size of the largest aggregate. Consequently, the largest aggregate in the trajectory shown in Figure 3C (left) comprises ∼400 tetrahedron molecules. Time evolution of the assembly size distributions reveals an abundance of quasi-spherical oligomers made of 20–30 molecules that form within the first 106 simulation steps (Figure 4C, black curve). The corresponding peak in the assembly size distribution decreases and broadens at longer simulation times (Figure 4C, blue, cyan, green, orange, and red curves). Unlike in simulations at η = 0.5, where the oligomer assemblies disappear after the initial assembly stage, the assemblies of ∼20–30 molecules persist at all simulation times, and give rise to assemblies of ∼60 – 80 and ∼100 – 120 tetrahedron molecules, respectively (Figure 4C, red curve). This multimodal character of the assembly size distribution indicates that larger assemblies form through coalescence of smaller oligomers rather than through monomer addition. The steady–state pool of assemblies smaller than ∼120 is a cause of a rapid growth of larger aggregates. A large breakage rate characteristic for η = 1 results in a broad distribution of assembly sizes up to ∼550 (Figure 4C, green, orange, red curves). Similar to simulations at η = 0 and η = 0.5, molecules self-assemble initially into smaller quasi-spherical oligomers, followed by formation of elongated aggregates. The largest elongated aggregates show several kinks (Figure 3C, right). Some trajectory-to-trajectory variability is observed at η = 1, reflected in a variable size of the largest aggregate (Figure 2). Simulations at η = 1.25 are characterized by a steady state dynamics, which is reached within ≤106 simulation steps (Figure 3D, left). The rate of breakage is significantly higher than at η = 1, thus abolishing aggregates larger than ∼120. Consequently, a steady-state assembly size distribution centered at the assembly size 25 with additional broader and less significant peaks centered at sizes ∼50 and ∼90, respectively, is observed (Figure 4D, red curve). The assembly proceeds through formation of smaller oligomers, followed by formation of elongated aggregates without any notable kinks or bending (Figure 3D, right). No significant trajectory-to-trajectory variability is observed (Figure 2).

Complex and Diverse Assembly Pathways at η = 0.75

The most variable assembly kinetics is observed for 0.5 ≤ η ≤ 1, where the competition between the effective hydrophobic attraction and hydrophilic repulsion combined with thermal fluctuations results in a complex competing dynamics, as demonstrated by simulations at η = 0.75, which display the largest trajectory-to-trajectory variability (Figure 2). Assembly dynamics shows characteristics of a two-stage aggregation process (observed for η = 0.5) as well as a relatively high breakage rate (observed for η = 1). Unique to simulations at η = 0.75 is a simultaneous occurrence of the fast and slow assembly dynamics, where one aggregate displays a fast growth through oligomer addition and the other aggregate displays a slow dynamics through monomer association and dissociation (Figure 5A, black double-sided arrow). Although the fast aggregation is more frequent earlier in the assembly process, it is sporadically observed at later assembly stages as well, depending on the amount of oligomers available for coalescence with larger elongated aggregates (Figure 2). Figure 6 shows a merging process, during which the smaller assembly attaches itself to the end of the larger aggregate (C), temporarily forms a kink (D), and is subsequently integrated into the larger aggregate, eventually eliminating the kink (F).

Figure 5

Figure 6

Time evolution of two merging assemblies for η = 0.75. (A–C) Large elongated multi-domain assembly and a small aggregate (inside a green circle) that are initially spatially separated, (D–F) merge into a single aggregate. The time frames correspond to (A) 42.33 × 106; (B) 42.34 × 106; (C) 42.35 × 106; (D) 42.36 × 106; (E) 42.38 × 106; and (F) 44.37 × 106 simulation steps. Note that within the 2 × 106 simulation steps between the last two time frames (E and F), the initially formed kink between the two merged assemblies smooths out as the smaller assembly integrates into the structure of the larger aggregate.

Characterization of the assembly at η = 0.75. (A) A three-dimensional plot showing the assembly size distribution or probability (z-axis) of all assembly sizes (x-axis) at different times (y-axis) of the assembly process at η = 0.75. The color code on the right denotes the order parameter of each assembly. (B) Assembly size distributions at several early (top) and later (bottom) assembly stages are displayed. (C–E) The assembly pathway and corresponding morphologies observed in simulations at η = 0.75. (C) Initially, tetrahedron molecules form small quasi-spherical oligomers, which further assemble into elongated protofibril-like aggregates. Finally, a large elongated assembly with distinct structural domains (I–IV), including a sandwich-like ordered structure (II), emerges. (D) A closeup of the four distinct domains with the respective cross sections, displaying distinct degrees of lateral ordering. (E) Time evolution of the sandwich-like ordered structure (II). Time evolution of two merging assemblies for η = 0.75. (A–C) Large elongated multi-domain assembly and a small aggregate (inside a green circle) that are initially spatially separated, (D–F) merge into a single aggregate. The time frames correspond to (A) 42.33 × 106; (B) 42.34 × 106; (C) 42.35 × 106; (D) 42.36 × 106; (E) 42.38 × 106; and (F) 44.37 × 106 simulation steps. Note that within the 2 × 106 simulation steps between the last two time frames (E and F), the initially formed kink between the two merged assemblies smooths out as the smaller assembly integrates into the structure of the larger aggregate. In the early assembly stage, we observe oligomers comprising ∼30 molecules (Figure 5B, top graph), which gradually decrease in abundance at later assembly stages at the expense of assemblies comprising up to 150 molecules as well as considerably larger aggregates comprising up to ∼750 molecules (Figure 5B, bottom graph). This broad distribution of assembly sizes is caused by breakage, which is more frequent than for η = 0.5 but less frequent than for η = 1. Assembly into smaller quasi-spherical oligomers is followed by formation of elongated curvilinear protofibrils that eventually evolve into large multi-domain aggregates, separated by kinks (Figure 5C). Figure 5D shows four structurally distinct domains of a polymorphic aggregate that forms following the pathway described above. Domains I, III, and IV are cylindrically symmetric yet display variable degrees of lateral order from the most disordered (IV) to the most ordered (I), as shown in the corresponding profiles (Figure 5D, bottom). Domain II, which is not cylindrically symmetric, appears to be the most ordered of the four domains and displays a sandwich-like profile. Molecules in domain II are arranged into a double layer of 6–8 molecules in width. This highly structured domain emerges from a less ordered domain and spreads to neighboring regions in the course of the simulation (Figure 5E).

Morphology of the Assemblies Depends on the Hydropathy Ratio

The assembly kinetics in our model is driven by a competition between spontaneous aggregation and disaggregation, controlled by the hydropathic ratio η, which affects the assembly size distribution, breakage rate, and assembly morphologies. The orientational order parameter of assemblies increases with the assembly size for each η (Figure 7A). In addition, pentacosamer (25-mer) conformations are more ordered and less prone to bending as η increases (Figure 7B).

Figure 7

Order parameter and characteristic pentacosamer conformations. (A) The average order parameter as a function of the assembly size for the five different hydropathy ratios η. For each assembly size, the order parameter is calculated for a window containing a connected cluster of 50 tetrahedron molecules and then averaged over all windows along the assembly axis. The error bars correspond to SEM values. (B) Characteristic pentacosamer (25-mer) conformations at different values of η.

Mapping to Natively Unfolded Amyloid Proteins

Complex and diverse self-assembly dynamics observed for 0.5 < η < 1 results in multi-domain aggregates, consistent with in vitro observed molecular-level polymorphism of amyloid fibrils,[20] with bends and kinks (Figure 8A) that strongly resemble the morphology of in vitro amyloid fibers (Figure 8B). Can our minimal protein model thus provide insights into in vitro amyloid formation? Amyloid proteins assemble into cross-β fibrils, stabilized by intermolecular hydrogen bonding along the fibril axis.[21−23] Yet, protein aggregation can occur in the absence of hydrogen bonding. In sickle cell anemia, sickle hemoglobin aggregates into polymers, causing a characteristic sickle-cell shape of red blood cells.[24] Whereas free energy calculations need to account for hydrogen bonding, our model reveals that a multi-stage assembly process that proceeds through formation of quasi-spherical and elongated mesoscopic-size aggregates, which eventually undergo a structural conversion into polymorphic aggregates, occurs also in the absence of hydrogen bonding.

Figure 8

Comparison of in silico and in vitro morphologies. (A) A long in silico aggregate before (upper) and after (lower) the breakage occurring at the position marked by black arrows (DMD simulations at η = 0.75). Cross-sections to the left and to the right of the breakage with mismatched morphologies are displayed at the bottom. (B) Transmission electron micrograph of Aβ42 fibers characterized by several kinks (green arrows) and breakages (red arrows). The image is a courtesy of Drs. Louise C. Serpell, Julian Thorpe, and Thomas L. Williams. The multi-stage assembly mechanism observed in our model is consistent with previously reported theoretical and experimental findings on amyloidogenic proteins. A recent theoretical study postulated both association and breakage processes to predict a variety of distinct aggregation pathways in agreement with experimental data.[25] Experimental studies on amyloid β-protein (Aβ), one of the most studied proteins due to its relevance to Alzheimer’s disease, which is the leading cause of dementia in elderly, showed that its two predominant alloforms, Aβ40 and Aβ42, form oligomers, which further assemble into protofibrils and finally convert into fibrils.[26,27] The existence of a nucleated conversion from prefibrillar into fibrillar assemblies was recently confirmed through FlAsH monitoring of Cys-Cys-Aβ aggregation.[28] A structural conversion from within a molten collapsed intermediates into a structurally ordered fibril was also shown to be consistent with experimental data on yeast Sup35 prion protein.[29] A large group of amyloidogenic proteins is natively unfolded.[30] A tetrahedron molecule in our model represents a monomer that retains its tertiary structure throughout the assembly process. As such, it can be identified as an aggregation-prone monomeric state of a natively unfolded protein. Whereas the internal degrees of freedom associated with intrinsically disordered nature of natively unfolded protein monomers are not incorporated in our model, the existence of such disorder implies that most amino acids in the sequence will be for some time exposed to the solvent, analogous to the four beads in the tetrahedron model. Consequently, it is reasonable to map the ratio of hydrophilic to hydrophobic amino acids, Np/Nh, in the sequence of natively unfolded amyloidogenic proteins to the hydropathy ratio η of our model. In this mapping, we implicitly assume that hydrophobic collapse dominates protein self-assembly and that hydrogen bonds, which are strongly directional, form among peptide regions only after these regions have been driven into proximity by effective hydrophobic interactions. This mapping also does not distinguish between charged and noncharged hydrophilic amino acids and thus all charged hydrophilic amino acids effectively contribute to repulsive interactions. In aqueous solutions, electrostatic interactions among charged amino acids depend on pH and ionic concentrations that may under specific conditions modify the overall repulsive to attractive interaction ratio. However, most protein sequences are dominated by noncharged amino acids and salt bridges between oppositely charged amino acids tend to be exposed to the solvent and rarely stabilize protein structures in water.[31,32] Thus, attractive electrostatic interactions are mostly not expected to significantly contribute to inter-residue interactions during protein self-assembly. By employing several hydropathy scales previously reported by Chothia,[33] Janin,[34] Kyte and Doolittle,[35] Eisenberg et al.,[36] and Engelman et al.,[37] we identified the hydrophilic and hydrophobic amino acids within each scale, counted the number of hydrophobic and hydrophilic amino acids in the sequence, and calculated Np/Nh for several natively unfolded proteins, known to form amyloid fibrils (Table 2). A histogram of the Np/Nh values from Table 2 demonstrates that the Np/Nh values are distributed in the range 0.5 < Np/Nh < 2 (Figure 9). The highest Np/Nh values are found for the human prion, prion-like yeast-Sup35, and tau (Table 2).

Table 2

Natively Unfolded Amyloidogenic Proteins and Their Np/Nh Valuesa

protein	ID	length	Chothia[33]	Janin[34]	Kyte-Doolittle[35]	Eisenberg[36]	Engleman[37]	average
ABri[45]	–	34	0.60	0.93	1.00	1.08	1.00	0.92
ADan[46]	–	34	0.56	0.75	0.88	0.86	0.88	0.79
α-Synuclein	1XX8(43)	140	0.68	0.96	0.98	0.71	0.98	0.86
Amylin	2KB8(43)	37	0.71	0.64	0.71	0.64	0.71	0.68
Aβ(1–40)	1BA4(43)	40	0.53	0.65	0.82	0.48	0.82	0.66
Aβ(1–42)	1IYT(43)	42	0.47	0.58	0.74	0.44	0.74	0.59
Apolipoprotein-A1	P02647[44]	242	1.03	1.26	1.38	1.12	1.32	1.22
Calcitonin	2GLH(43)	33	1.00	1.00	1.12	0.89	1.12	1.03
Human Prion	1QLX(43)	210	1.71	1.35	1.86	0.70	1.57	1.44
Huntingtin	3IO4(43)	449	0.88	0.90	0.97	0.76	0.92	0.89
Tau[47]	–	441	1.13	1.33	1.44	0.93	1.44	1.25
Transthyretin	1BM7(43)	127	0.58	0.71	0.84	0.59	0.80	0.70
Yeast-Sup35	P05453[44]	685	1.24	1.32	1.41	1.04	1.38	1.28
								0.86 ± 0.36

The ratio of hydrophilic to hydrophobic amino acids (Np/Nh) for each protein is calculated within each of the five distinct hydropathy scales. Proteins are identified by their names and/or ID codes from Protein Data Bank[43] or Universal Protein Resource.[44] For the three proteins without the ID codes, the source references are cited next to the protein name in the first column. The length of proteins is given in terms of the number of amino acids in the sequence (third column). The averages over the five Np/Nh values corresponding to each of the five hydropathy scales for each protein are shown in the last column. The average and standard deviation of the average Np/Nh values per protein of the last column are displayed at the bottom right (bold font).

Figure 9

Histogram of the hydropathic ratios for natively unfolded amyloid proteins. Histogram of the ratio of hydrophilic to hydrophobic number of residues, Np/Nh, in the sequence of natively unfolded amyloidogenic proteins. The ratio of hydrophilic to hydrophobic amino acids (Np/Nh) for each protein is calculated within each of the five distinct hydropathy scales. Proteins are identified by their names and/or ID codes from Protein Data Bank[43] or Universal Protein Resource.[44] For the three proteins without the ID codes, the source references are cited next to the protein name in the first column. The length of proteins is given in terms of the number of amino acids in the sequence (third column). The averages over the five Np/Nh values corresponding to each of the five hydropathy scales for each protein are shown in the last column. The average and standard deviation of the average Np/Nh values per protein of the last column are displayed at the bottom right (bold font). High values of η > 1 in our model result in a high breakage rate and relatively short and ordered aggregates. Prion proteins are known to form brittle aggregates, prone to breakage, which is associated with their infectious nature.[38] Substantial evidence indicates that tau protein displays prion-like characteristics by initially aggregating in a few nerve cells in discrete brain areas and then self–propagating and spreading to distant brain regions.[39] Most Np/Nh values from Table 2 fall into the range 0.5 < Np/Nh < 1, for which our model predicts complex and diverse assembly dynamics with a broad assembly size distribution. These proteins include Aβ40 and Aβ42 associated with Alzheimer’s disease, amylin associated with diabetes mellitus type 2, transthyretin associated with systemic amyloidosis, α-synuclein associated with Parkinson’s disease, and huntingtin associated with Huntington’s disease. Interestingly, Aβ42, which aggregates faster and forms larger assemblies than Aβ40 under equivalent in vitro conditions, has a lower Np/Nh value of 0.59 than Aβ40 with Np/Nh = 0.66, consistent with our model’s maximal assembly size decreasing with η in the range 0.5 < η < 1. For η ≥ 0.75, assembly breakage in our model creates a steady-state pool of oligomers, which form from smaller and merge into larger assemblies. Assuming that oligomers are the key toxic species responsible for pathogenesis, even if lifetimes of oligomers are short, the assembly dynamics that produces a constant pool of oligomers is a persistent source of toxicity, which provides a plausible mechanism through which a metastable assembly with a short lifetime might exert toxicity over a longer time period.

Conclusions and Discussion

In summary, our tetrahedron model predicts complex self-assembly into quasi-spherical oliogmers, curvilinear protofibrils, and multi-domain aggregates, characteristic of in vitro observed amyloid formation by natively unfolded amyloidogenic proteins. Spontaneous assembly breakage and merging, assembly size distributions, and a diversity of assembly morphologies in our model are controlled by a single parameter, the hydropathic ratio η, which can be meaningfully mapped onto the hydrophilic to hydrophobic ratio of the sequence of amyloidogenic proteins. We show that most natively unfolded amyloidogenic proteins are characterized by 0.5 < η < 1, for which our model predicts the most complex and diverse assembly dynamics. There is no consensus on the size and structure of toxic oligomers. In Alzheimer’s disease, different Aβ pre-fibrillar species were reported in vitro and in vivo, depending on the methodologies and experimental settings.[40] Moreover, relatively minor changes in the sequence between Aβ40 and Aβ42 result in distinct assembly pathways[27] and oligomer structures[10] that likely underlie distinct cytotoxic properties of the two peptides, with Aβ42 that is genetically more strongly associated with Alzheimer’s disease than Aβ40. Similarly, a single amino acid mutation of Aβ can cause earlier onset of the disease and/or altered pathology.[41] A seeming contradiction between two sides of the protein aggregation puzzle: (a) a large class of proteins with variable sequences that aggregate into amyloid fibrils and (b) small changes in the sequence and/or solvent conditions that significantly alter the assembly pathways and structures, can be reconciled within our model by noting that both sequence and solvent modifications alter an overall hydropathic nature of the protein, thus η, in the range, where the assembly dynamics is the most sensitive. Short hydrophobic peptides incubated with Aβ42 were recently reported to inhibit Aβ42-induced toxicity in cell cultures[42] and subsequent computer simulations of Aβ42 assembly in the presence of these inhibitors elucidated the structure of the resulting large amorphous hetero-oliogmers.[16] By binding hydrophobic peptides to Aβ42, the overall hydropathic ratio of hetero-assemblies, η, is lowered to η ≪ 0.5, thereby biasing aggregation toward presumably less toxic amorphous aggregates. The hydropathy parameter might thus be critical for a deeper understanding of protein aggregation dynamics and developing drugs that aim to alter aggregation pathways in a way to minimize the damage caused by cytotoxic oligomeric assemblies. While the discussion described above is specific to protein aggregation, our model and its predictions are applicable to a wide range of self–assembling systems and elucidate a critical role of the repulsive to attractive intermolecular interaction ratio in supramolecular structure prediction.

41 in total

1. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

Review 2. Protein folding and misfolding.

Authors: Christopher M Dobson
Journal: Nature Date: 2003-12-18 Impact factor: 49.962

3. Molecular dynamics simulations of spontaneous fibril formation by random-coil peptides.

Authors: Hung D Nguyen; Carol K Hall
Journal: Proc Natl Acad Sci U S A Date: 2004-11-08 Impact factor: 11.205

4. Folding events in the 21-30 region of amyloid beta-protein (Abeta) studied in silico.

Authors: Jose M Borreguero; Brigita Urbanc; Noel D Lazo; Sergey V Buldyrev; David B Teplow; H Eugene Stanley
Journal: Proc Natl Acad Sci U S A Date: 2005-04-18 Impact factor: 11.205

Review 5. Tau aggregation is driven by a transition from random coil to beta sheet structure.

Authors: Martin von Bergen; Stefan Barghorn; Jacek Biernat; Eva-Maria Mandelkow; Eckhard Mandelkow
Journal: Biochim Biophys Acta Date: 2004-11-12

6. A simple method for displaying the hydropathic character of a protein.

Authors: J Kyte; R F Doolittle
Journal: J Mol Biol Date: 1982-05-05 Impact factor: 5.469

7. A decamer duplication in the 3' region of the BRI gene originates an amyloid peptide that is associated with dementia in a Danish kindred.

Authors: R Vidal; T Revesz; A Rostagno; E Kim; J L Holton; T Bek; M Bojsen-Møller; H Braendgaard; G Plant; J Ghiso; B Frangione
Journal: Proc Natl Acad Sci U S A Date: 2000-04-25 Impact factor: 11.205

Review 8. Solid-state NMR studies of amyloid fibril structure.

Authors: Robert Tycko
Journal: Annu Rev Phys Chem Date: 2011 Impact factor: 12.703

9. Ab initio folding of proteins with all-atom discrete molecular dynamics.

Authors: Feng Ding; Douglas Tsao; Huifen Nie; Nikolay V Dokholyan
Journal: Structure Date: 2008-07 Impact factor: 5.006

10. Discrete molecular dynamics study of oligomer formation by N-terminally truncated amyloid β-protein.

Authors: Derya Meral; Brigita Urbanc
Journal: J Mol Biol Date: 2013-03-13 Impact factor: 5.469

4 in total

1. Amyloid-β (Aβ42) Peptide Aggregation Rate and Mechanism on Surfaces with Widely Varied Properties: Insights from Brownian Dynamics Simulations.

Authors: Timothy Cholko; Joseph Barnum; Chia-En A Chang
Journal: J Phys Chem B Date: 2020-06-26 Impact factor: 2.991