Bogdan Barz1, Brigita Urbanc. 1. Department of Physics, Drexel University , Philadelphia, Pennsylvania 19104, United States.
Abstract
Molecular self-assembly is ubiquitous in nature, yet prediction of assembly pathways from fundamental interparticle interactions has yet to be achieved. Here, we introduce a minimal self-assembly model with two attractive and two repulsive beads bound into a tetrahedron. The model is associated with a single parameter η defined as the repulsive to attractive interaction ratio. We explore self-assembly pathways and resulting assembly morphologies for different η values by discrete molecular dynamics. Our results demonstrate that η governs the assembly dynamics and resulting assembly morphologies, revealing an unexpected diversity and complexity for 0.5 ≤ η < 1. One of the key processes that governs the assembly dynamics is assembly breakage, which emerges spontaneously at η > 0 with the breakage rate increasing with η. The observed assembly pathways display a broad variety of assembly structures characteristic of aggregation of amyloidogenic proteins, including quasi-spherical oligomers that coassemble into elongated protofibrils, followed by a conversion into ordered polymorphic fibril-like aggregates. We further demonstrate that η can be meaningfully mapped onto amyloidogenic protein sequences, with the majority of amyloidogenic proteins characterized by 0.5 ≤ η < 1. Prion proteins, which are known to form highly breakage-prone fibrils, are characterized by η > 1, consistent with the model predictions. Our model thus provides a theoretical basis for understanding the universal aspects of aggregation pathways of amyloidogenic proteins relevant to human disease. As the model is not specific to proteins, these findings represent an important step toward understanding and predicting assembly dynamics of not only proteins but also viruses, colloids, and nanoparticles.
Molecular self-assembly is ubiquitous in nature, yet prediction of assembly pathways from fundamental interparticle interactions has yet to be achieved. Here, we introduce a minimal self-assembly model with two attractive and two repulsive beads bound into a tetrahedron. The model is associated with a single parameter η defined as the repulsive to attractive interaction ratio. We explore self-assembly pathways and resulting assembly morphologies for different η values by discrete molecular dynamics. Our results demonstrate that η governs the assembly dynamics and resulting assembly morphologies, revealing an unexpected diversity and complexity for 0.5 ≤ η < 1. One of the key processes that governs the assembly dynamics is assembly breakage, which emerges spontaneously at η > 0 with the breakage rate increasing with η. The observed assembly pathways display a broad variety of assembly structures characteristic of aggregation of amyloidogenic proteins, including quasi-spherical oligomers that coassemble into elongated protofibrils, followed by a conversion into ordered polymorphic fibril-like aggregates. We further demonstrate that η can be meaningfully mapped onto amyloidogenic protein sequences, with the majority of amyloidogenic proteins characterized by 0.5 ≤ η < 1. Prion proteins, which are known to form highly breakage-prone fibrils, are characterized by η > 1, consistent with the model predictions. Our model thus provides a theoretical basis for understanding the universal aspects of aggregation pathways of amyloidogenic proteins relevant to human disease. As the model is not specific to proteins, these findings represent an important step toward understanding and predicting assembly dynamics of not only proteins but also viruses, colloids, and nanoparticles.
Molecular self-assembly,
a spontaneous association of disordered
components into an ordered supramolecular structure, is responsible
for the formation of complex biological systems and is becoming increasingly
important in material sciences striving to develop novel biomaterials
with a great diversity of biochemical and physical properties. Very
little is known about the mechanisms underlying the emergence of ordered
structures from disordered components. Recently, a remarkable diversity
of polyhedra that self-assemble through excluded volume interactions
at high packing fractions into entropic crystals with various degrees
of crystalline order was reported.[1] At
the opposite spectrum of a high packing fraction is protein aggregation,
which typically occurs in vivo at nanomolar and in vitro at micromolar
concentrations and must be consequently facilitated by attractive
intermolecular interactions. Aberrant protein aggregation is at the
core of many age-triggered diseases, such as Alzheimer’s, Parkinson’s,
and Huntington’s disease, amyotrophic lateral sclerosis, type
II diabetes, systemic amyloidoses, and others.[2] These amyloid proteins do not share any obvious aspects of the primary
structure yet they self-assemble into cytotoxic low-molecular weight
oligomers and form fibrils with a common cross-β structure.[3] Inherent toxicity of amyloid assemblies that
cause the disease implies a common assembly mechanism;[4] however, assembly pathways are not well understood.[5] Here, we introduce a minimal model of self-assembly
that unifies common features of protein amyloidogenesis and can be
meaningfully mapped onto sequences of amyloid proteins. Because the
model is not protein specific, the model predictions extend to self-assembly
systems beyond the aggregation of amyloidogenic proteins.
Methods
Model Construction
A self-assembly model with a minimal
number of beads and interparticle interactions is constructed as following.
A one-bead molecule does not allow for the implementation of both
attractive and repulsive interactions simultaneously. A two- or three-bead
molecule is either anisotropic or planar, which imposes a
priori geometric restrictions on self-assembling molecules.
A three-dimensional molecule can be formed by a minimum of four beads,
which we place at the four vertices of a tetrahedron (Figure 1A). Each tetrahedron molecule thus represents a
molecule (protein monomer) comprising four beads each of a diameter D connected by four covalent bonds of identical lengths d (Figure 1A,B). Each bead within
a tetrahedron molecule is assigned either an attractive (hydrophobic)
or repulsive (hydrophilic) character, resulting in three possible
model variants: (i) one hydrophilic and three hydrophobic beads, (ii)
three hydrophilic and one hydrophobic bead, and (iii) two hydrophobic
and two hydrophilic beads. Consistent with discrete molecular dynamics
(DMD) requirements (see below), the effective intermolecular hydrophobic
attraction among hydrophobic beads and effective intermolecular hydrophilic
repulsion among hydrophilic beads of different molecules are modeled
by square-well potentials with a depth Eh and a height Ep, respectively, (Figure 1C). A hydrophobic and a hydrophilic bead interact
through an excluded volume only. The ratio of the two potential energies,
η = Ep/Eh, is hereafter referred to as the hydropathy ratio. In variant (i),
molecules aggregate into a single densely packed quasi-spherical assembly
(data not shown). In variant (ii), molecules form stable dimers and
trimers that do not assemble further (data now shown). In variant
(iii) (Figure 1A), molecules exhibit complex
assembly dynamics, which we examine for several values of the hydropathy
ratio.
Figure 1
Definition of the tetrahedron model. (A) A tetrahedron molecule
comprises two hydrophobic (red) and two hydrophilic (blue) beads of
a diameter D located at the vertices of a tetrahedron
with the edge length d. (B) An “infinite”
square-well intramolecular potential models covalent
bonds among the four beads in the tetrahedron molecule. The bond length d can vary between r1 = 1.56D and r2 = 1.61D. (C) Effective attractive (red) and repulsive (blue) intermolecular potentials between pairs of hydrophobic and hydrophilic
beads, respectively. The distance rmin = D corresponds to a sum of the van Der Waals radii
of the two interacting beads and rmax =
2.22D is the interaction range distance. The strengths
of the effective hydrophobic attractive and hydrophilic repulsive
potential are denoted by Eh and Ep, respectively.
Definition of the tetrahedron model. (A) A tetrahedron molecule
comprises two hydrophobic (red) and two hydrophilic (blue) beads of
a diameter D located at the vertices of a tetrahedron
with the edge length d. (B) An “infinite”
square-well intramolecular potential models covalent
bonds among the four beads in the tetrahedron molecule. The bond length d can vary between r1 = 1.56D and r2 = 1.61D. (C) Effective attractive (red) and repulsive (blue) intermolecular potentials between pairs of hydrophobic and hydrophilic
beads, respectively. The distance rmin = D corresponds to a sum of the van Der Waals radii
of the two interacting beads and rmax =
2.22D is the interaction range distance. The strengths
of the effective hydrophobic attractive and hydrophilic repulsive
potential are denoted by Eh and Ep, respectively.
Discrete Molecular Dynamics
When interparticle potentials
are approximated by a square-well or a combination of square-well
potentials, molecular dynamics can be reduced to an efficient, event-driven
discrete molecular dynamics (DMD). DMD was initially applied to simulate
a system of hard spheres.[6,7] More recently, several
low, intermediate, and high resolution protein models combined with
DMD provided important insights into protein folding and assembly.[8−17]Here, we apply DMD to examine pathways of self-assembly of
the tetrahedron model. In each simulation, 1000 tetrahedron molecules
occupy the simulation box with an edge length of 111D, which corresponds to a volume fraction of 2.80 ± 0.13 ×
10–3. Periodic boundary conditions are used and
temperature is controlled by the Berendsen thermostat.[18] We examine self-assembly at five hydropathic
ratios η = Ep/Eh = 0, 0.5, 0.75, 1, 1.25. For each η, 1000 tetrahedron
molecules are placed into a cubic lattice in the simulation box, followed
by high-temperature (kBT ≫ Eh) DMD simulations, which
produce five distinct initial ensembles of randomly distributed non-interacting
tetrahedron molecules. Thus obtained ensembles, five for each of the
five η values, resulting in 25 ensembles in total, are then
used as initial configurations for the production runs. Here, a configuration
is defined as an ensemble of coordinates of all tetrahedron molecules
in a given trajectory at a given simulation time. Production runs
are 10 × 106 (η = 0) or 50 × 106 (η > 0) simulation steps long, and configurations are recorded
every 104 simulation steps, resulting in 5 × 1000
= 5000 configurations (η = 0) or 5 × 5000 = 25 000
configurations (η > 0).The time scale of one simulation
step, Δt, can be expressed in terms of the
spatial resolution, Δx ∼ D ≈ 10–9 m (thus approximating a protein
by a tetrahedron with a circumsphere
radius of ∼1 nm), and thermal energy using the equipartition
theorem,where we chose
for the mass of one bead mB ∼ 10–22 kg, equivalent
to a sequence of ∼50 amino acids or a ∼ 200-residue
protein corresponding to each tetrahedron molecule. We use the thermal
energy at a physiological temperature 310 K, kBT ≈ 4 × 10–21J, resulting in Δt ≈
16 × 10–9 s and the total simulation time per
trajectory at nonzero η values (50 × 106 simulation
steps) of ∼8 × 10–3 s. Note that the
unit of length, D, and the mass of each bead, mB, can be adjusted to the size and molecular
mass of the protein of interest, which consequently affects the time
scale, Δt ∝ D(mB)1/2, such that without considering
the actual protein sequences, self-assembly of a larger protein occurs
on a longer time scale than self-assembly of a smaller protein.
Probability Distribution of Assembly Sizes
The assembly
size is defined as the number of molecules within a connected cluster.
A recursive algorithm is used to identify tetrahedron molecules that
belong to each cluster. A tetrahedron molecule is identified as a
part of the cluster if any of its four beads is at a distance ≤2.22D from at least one of the beads of the molecules within
the cluster. To derive three-dimensional plots displaying the time
evolution of assembly sizes for each trajectory, assembly size distributions
are calculated by using a binning interval of 5 × 105 simulation steps. The probability distribution of assembly sizes
is calculated by using a 106 simulation steps-wide window,
resulting in 100 ensembles, each comprising 1000 tetrahedron molecules
in various assembly states. Then, the average over the five probability
distributions is taken, and the corresponding standard error of the
mean (SEM) values are calculated. If an assembly size appears only
once in a single trajectory, the corresponding SEM value is set to
zero. By moving the 106 simulation steps-wide window along
the assembly trajectories between 0 and 50 × 106 simulation
steps, time evolution of the size distribution for each η is
derived.
Order Parameter
To quantify the degree of orientational
order within the assemblies of different sizes acquired for different
η values, we calculate the order parameter S of each assembly. Each molecule in the assembly is characterized
by an axis that connects its two hydrophobic (red) beads. The angle
θ between the axis of each molecule and the long axis of the
elongated assembly, determined by the principal component analysis,[19] is then used to calculate S as the average:over all molecules
in the assembly. Note that
due to periodic boundary conditions, some assemblies are split in
two or more clusters, which are separated by the linear size of the
cubic simulation box in either of the three directions. To accurately
calculate the order parameter, these split clusters are merged into
a single cluster prior to the order parameter calculation. The order
parameter calculation is performed for each assembly of size larger
than 5 within 5000 (η = 0) or 25 000 (η > 0)
recorded
configurations, each containing 1000 tetrahedron molecules. For each
ensemble of assemblies of a size ≥6, the average order parameter
and SEM values are calculated by taking into account the order parameter
values of all assemblies of a given size. For assemblies comprising
100 or more tetrahedron molecules, a running window along the assembly
axis containing 50 molecules is used to calculate the local order
parameter, followed by an average over all local order parameters.
Results
Here a minimal model of self-assembling molecules,
each characterized
by a combination of attractive and repulsive interactions, is examined.
Generally, any molecule prone to aggregation is characterized by attractive
and repulsive interactions due to van der Waals and electrostatic
interactions. We adopt in the following protein folding and aggregation
terminology when discussing the origin of attractive and repulsive
interactions. In proteins, attractive and repulsive interactions stem
from the hydropathic (polar versus nonpolar) and electrostatic (charged
versus uncharged) nature of individual residues. We posit that in
addition to the effective hydrophobic attraction, the effective hydrophilic
and/or electrostatic repulsion due to solvation of polar side chains
by water molecules plays a critical role in prediction of protein
assembly pathways and structures. We then construct a tetrahedron
model with two hydrophobic and two hydrophilic beads as described
in the Model Construction section (Methods),
in which the hydropathy ratio η = Ep/Eh between the repulsive Ep and attractive Eh potential
energies is introduced as the only model parameter (Figure 1).Self-assembly of the tetrahedron model
is studied in the absence
of repulsion among the hydrophilic beads (η = 0) and at four
different hydropathy ratios (η = 0.5, 0.75, 1, and 1.25). To
avoid a comparison of the assembly pathways of proteins with vastly
different solubilities, the energy scale Eh is adjusted for each η to keep the final monomer concentration
within the range of 10–16% (Table 1).
For each η, five trajectories of self-assembly from initially
monomeric ensembles of 1000 tetrahedron molecules are acquired and
analyzed. For each trajectory, time evolution of assembly sizes is
monitored and displayed as a three-dimensional plot (Figure 2), in which each assembly is also characterized
by an order parameter as described in the Order
Parameter section (Methods), measuring a degree of the orientational
order of molecules within the assembly (Figure 2). Below, we refer to small, quasi-spherical assemblies as oligomers
to distinguish them from elongated, curvilinear assemblies (protofibrils)
and more ordered assemblies (fibrils).
Table 1
Energy Unit and Steady-State
Monomer,
Dimer, and Trimer Concentrationsa
η = Ep/Eh
Eh [kBT]
[monomer]
[%]
[dimer] [%]
[trimer] [%]
0.00
0.97
14.027 ± 0.229
0.408 ± 0.028
0.014 ± 0.004
0.50
1.16
11.000 ± 0.383
0.272 ± 0.024
0.007 ± 0.003
0.75
1.27
10.737 ± 0.412
0.249 ± 0.016
0.007 ± 0.001
1.00
1.37
12.768 ± 0.268
0.430 ± 0.031
0.025 ± 0.005
1.25
1.45
15.687 ± 0.095
0.886 ± 0.016
0.052 ± 0.009
The potential
energy Eh, associated with the effective
hydrophobic
attraction, which represents a unit energy in our model is adjusted
for each hydropathy ratio η to yield comparable steady-state
monomer concentration (solubility). For each η, steady-state
monomer, dimer, and trimer concentrations and their SEM values are
calculated by averaging over monomer, dimer, and trimer concentrations
for each trajectory within 9–10 × 106 steps
(η = 0) or 49–50 × 106 steps (η
> 0), followed by averaging over the five trajectories.
Figure 2
Assembly pathways of
all trajectories. Three-dimensional plots
showing probability distributions P(n) of different assembly sizes n as a function of
the simulation time t during the process of assembly
starting from an ensemble of spatially separated (monomeric) tetrahedron
molecules. Three–dimensional plots for each of the five trajectories,
labeled as S1, S2, S3, S4, and S5, for each of the five η values
are displayed. The average order parameter S that
characterizes each assembly is color coded as shown on the color scale.
Assembly pathways of
all trajectories. Three-dimensional plots
showing probability distributions P(n) of different assembly sizes n as a function of
the simulation time t during the process of assembly
starting from an ensemble of spatially separated (monomeric) tetrahedron
molecules. Three–dimensional plots for each of the five trajectories,
labeled as S1, S2, S3, S4, and S5, for each of the five η values
are displayed. The average order parameter S that
characterizes each assembly is color coded as shown on the color scale.The potential
energy Eh, associated with the effective
hydrophobic
attraction, which represents a unit energy in our model is adjusted
for each hydropathy ratio η to yield comparable steady-state
monomer concentration (solubility). For each η, steady-state
monomer, dimer, and trimer concentrations and their SEM values are
calculated by averaging over monomer, dimer, and trimer concentrations
for each trajectory within 9–10 × 106 steps
(η = 0) or 49–50 × 106 steps (η
> 0), followed by averaging over the five trajectories.
Amorphous Assembly at η = 0
Figure 3A shows time evolution of assembly
sizes for four representative
trajectories corresponding to η = 0, 0.5, 1, and 1.25, respectively.
At η = 0, the initial hydrophobic collapse driving the assembly
occurs rapidly, bringing the monomer concentration from the initial
100% to the final value of ∼14% (Table 1). Within the first 106 simulation steps (Figure 4A), initial monomers rapidly assemble into an aggregate
comprising of 800–900 molecules, which remains in a steady-state
equilibrium with monomers (Figure 3A, left)
and small populations of dimers and trimers (Table 1). All assemblies have a relatively low order parameter as
color-coded in Figure 3A (left). The assembly
proceeds through formation of quasi-spherical oligomers, further elongating
into larger curved tubules, which collapse onto themselves, exposing
nonhydrophobic (blue) and shielding hydrophobic (red) beads from the
“solvent”, and eventually form a large quasi-spherical
aggregate with an overall porous morphology (Figure 3A, right). The high bending propensity of these aggregates
is reflected in their low order parameter for all five trajectories,
which display almost identical assembly pathways (Figure 2).
Figure 3
Time evolution of assembly pathways. Three-dimensional
plots showing
the assembly size distribution or probability (z-axis)
of all assembly sizes (x-axis) at different times
(y-axis) for representative trajectories (left) and
their assembly pathways with characteristic morphologies (right) for
simulations at (A) η = 0; (B) η = 0.5; (C) η = 1;
and (D) η = 1.25. The color code in (A) is used to characterize
the order parameter of each assembly in the three–dimensional
plots. Note that the conformations are not displayed on the same scale:
The final aggregates along assembly pathways for η = 0, 0.5
(A,B, right) are considerably larger than those for η = 1 and
1.25 (C,D, right), however, their lateral dimensions are comparable.
Figure 4
Time evolution of assembly size distributions.
The average assembly
size distributions at η equal to (A) 0, (B) 0.5, (C) 1, and
(D) 1.25 steadily evolve from the initial to the later assembly stages.
Note that for η = 0, the size distribution reaches steady state
within the initial 106 simulation steps. The error bars
correspond to SEM values.
Time evolution of assembly pathways. Three-dimensional
plots showing
the assembly size distribution or probability (z-axis)
of all assembly sizes (x-axis) at different times
(y-axis) for representative trajectories (left) and
their assembly pathways with characteristic morphologies (right) for
simulations at (A) η = 0; (B) η = 0.5; (C) η = 1;
and (D) η = 1.25. The color code in (A) is used to characterize
the order parameter of each assembly in the three–dimensional
plots. Note that the conformations are not displayed on the same scale:
The final aggregates along assembly pathways for η = 0, 0.5
(A,B, right) are considerably larger than those for η = 1 and
1.25 (C,D, right), however, their lateral dimensions are comparable.Time evolution of assembly size distributions.
The average assembly
size distributions at η equal to (A) 0, (B) 0.5, (C) 1, and
(D) 1.25 steadily evolve from the initial to the later assembly stages.
Note that for η = 0, the size distribution reaches steady state
within the initial 106 simulation steps. The error bars
correspond to SEM values.
Transient Oligomers and Protofibril-like Assemblies at η
= 0.5
Simulations at η = 0.5 can be characterized by
two assembly stages. The initial assembly stage lasts ∼10 –
20 × 106 simulation steps (Figure 3B, left), which is an order of magnitude longer than the hydrophobic
collapse observed for η = 0. During this initial stage, smaller
quasi-spherical oligomers coexist with larger elongated aggregates
and a rapid elongation of large aggregates is mostly driven by oligomer
addition. The assembly size distribution averaged over all five trajectories
(Figure 2) reveals a large number of oligomers
comprising 30–70 tetrahedron molecules forming within ∼106 simulation steps (Figure 4B, black
curve). These oligomers then coalesce into larger assemblies comprising
100–140 tetrahedron molecules at 1–2 × 106 simulation steps (Figure 4B, blue curve).
After this initial assembly stage (at ∼20 × 106 simulation steps), oligomers disappear, marking the onset of the
late assembly stage. In the late assembly stage, the elongated aggregates
either increase or decrease in length at a rate distinctly lower than
the aggregation rate of the initial assembly stage, while persisting
in a steady-state equilibrium with the monomer population and minor
populations of dimers and trimers (Table 1).
Large aggregates at this late stage increase or decrease in size predominantly
by monomer association and dissociation. In addition to this predominant
growth dynamics, two additional albeit less frequent processes are
noted: (a) merging of two elongated assemblies into a single larger
elongated assembly and (b) breaking of an elongated aggregate into
two smaller aggregates, which continue to either increase or decrease
in length. The second process of aggregate breakage results in smaller
assemblies of various sizes (Figure 4B, cyan,
green, orange, and red curves). The morphology of the assemblies observed
in simulation at η = 0.5 differs from those observed for η
= 0. All assemblies at η = 0.5 have a considerably larger order
parameter than those observed in simulations at η = 0. The order
parameter of assemblies increases with their size (Figure 3B, left). Elongated aggregates are considerably
less prone to bending than those observed for η = 0, but display
kinks and can be slightly bent (Figure 3B,
right). Unlike trajectories at η = 0, trajectories at η
= 0.5 display significant variations (Figure 2).
Breakage-Dominated Assembly at η ≥ 1
A
two-stage assembly process observed for η = 0.5 is absent from
simulations at η = 1. All trajectories acquired at η =
1 reach a steady state rapidly within 106 simulation steps.
The assembly pathways appear to be characterized by a single assembly
growth rate (Figure 2). This rate is slightly
lower than the one characterizing the initial assembly stage but notably
larger than the elongation rate observed in the late assembly stage
at η = 0.5. Importantly, the assembly dynamics at η =
1 is dominated by assembly breakage, which critically reduces the
size of the largest aggregate. Consequently, the largest aggregate
in the trajectory shown in Figure 3C (left)
comprises ∼400 tetrahedron molecules. Time evolution of the
assembly size distributions reveals an abundance of quasi-spherical
oligomers made of 20–30 molecules that form within the first
106 simulation steps (Figure 4C,
black curve). The corresponding peak in the assembly size distribution
decreases and broadens at longer simulation times (Figure 4C, blue, cyan, green, orange, and red curves). Unlike
in simulations at η = 0.5, where the oligomer assemblies disappear
after the initial assembly stage, the assemblies of ∼20–30
molecules persist at all simulation times, and give rise to assemblies
of ∼60 – 80 and ∼100 – 120 tetrahedron
molecules, respectively (Figure 4C, red curve).
This multimodal character of the assembly size distribution indicates
that larger assemblies form through coalescence of smaller oligomers
rather than through monomer addition. The steady–state pool
of assemblies smaller than ∼120 is a cause of a rapid growth
of larger aggregates. A large breakage rate characteristic for η
= 1 results in a broad distribution of assembly sizes up to ∼550
(Figure 4C, green, orange, red curves). Similar
to simulations at η = 0 and η = 0.5, molecules self-assemble
initially into smaller quasi-spherical oligomers, followed by formation
of elongated aggregates. The largest elongated aggregates show several
kinks (Figure 3C, right). Some trajectory-to-trajectory
variability is observed at η = 1, reflected in a variable size
of the largest aggregate (Figure 2).Simulations at η = 1.25 are characterized by a steady state
dynamics, which is reached within ≤106 simulation
steps (Figure 3D, left). The rate of breakage
is significantly higher than at η = 1, thus abolishing aggregates
larger than ∼120. Consequently, a steady-state assembly size
distribution centered at the assembly size 25 with additional broader
and less significant peaks centered at sizes ∼50 and ∼90,
respectively, is observed (Figure 4D, red curve).
The assembly proceeds through formation of smaller oligomers, followed
by formation of elongated aggregates without any notable kinks or
bending (Figure 3D, right). No significant
trajectory-to-trajectory variability is observed (Figure 2).
Complex and Diverse Assembly Pathways at
η = 0.75
The most variable assembly kinetics is observed
for 0.5 ≤
η ≤ 1, where the competition between the effective hydrophobic
attraction and hydrophilic repulsion combined with thermal fluctuations
results in a complex competing dynamics, as demonstrated by simulations
at η = 0.75, which display the largest trajectory-to-trajectory
variability (Figure 2). Assembly dynamics shows
characteristics of a two-stage aggregation process (observed for η
= 0.5) as well as a relatively high breakage rate (observed for η
= 1). Unique to simulations at η = 0.75 is a simultaneous occurrence
of the fast and slow assembly dynamics, where one aggregate displays
a fast growth through oligomer addition and the other aggregate displays
a slow dynamics through monomer association and dissociation (Figure 5A, black double-sided arrow). Although the fast
aggregation is more frequent earlier in the assembly process, it is
sporadically observed at later assembly stages as well, depending
on the amount of oligomers available for coalescence with larger elongated
aggregates (Figure 2). Figure 6 shows a merging process, during which the smaller assembly
attaches itself to the end of the larger aggregate (C), temporarily
forms a kink (D), and is subsequently integrated into the larger aggregate,
eventually eliminating the kink (F).
Figure 5
Characterization of the assembly at η
= 0.75. (A) A three-dimensional
plot showing the assembly size distribution or probability (z-axis) of all assembly sizes (x-axis)
at different times (y-axis) of the assembly process
at η = 0.75. The color code on the right denotes the order parameter
of each assembly. (B) Assembly size distributions at several early
(top) and later (bottom) assembly stages are displayed. (C–E)
The assembly pathway and corresponding morphologies observed in simulations
at η = 0.75. (C) Initially, tetrahedron molecules form small
quasi-spherical oligomers, which further assemble into elongated protofibril-like
aggregates. Finally, a large elongated assembly with distinct structural
domains (I–IV), including a sandwich-like ordered structure
(II), emerges. (D) A closeup of the four distinct domains with the
respective cross sections, displaying distinct degrees of lateral
ordering. (E) Time evolution of the sandwich-like ordered structure
(II).
Figure 6
Time evolution of two merging assemblies for
η = 0.75. (A–C)
Large elongated multi-domain assembly and a small aggregate (inside
a green circle) that are initially spatially separated, (D–F)
merge into a single aggregate. The time frames correspond to (A) 42.33
× 106; (B) 42.34 × 106; (C) 42.35
× 106; (D) 42.36 × 106; (E) 42.38
× 106; and (F) 44.37 × 106 simulation
steps. Note that within the 2 × 106 simulation steps
between the last two time frames (E and F), the initially formed kink
between the two merged assemblies smooths out as the smaller assembly
integrates into the structure of the larger aggregate.
Characterization of the assembly at η
= 0.75. (A) A three-dimensional
plot showing the assembly size distribution or probability (z-axis) of all assembly sizes (x-axis)
at different times (y-axis) of the assembly process
at η = 0.75. The color code on the right denotes the order parameter
of each assembly. (B) Assembly size distributions at several early
(top) and later (bottom) assembly stages are displayed. (C–E)
The assembly pathway and corresponding morphologies observed in simulations
at η = 0.75. (C) Initially, tetrahedron molecules form small
quasi-spherical oligomers, which further assemble into elongated protofibril-like
aggregates. Finally, a large elongated assembly with distinct structural
domains (I–IV), including a sandwich-like ordered structure
(II), emerges. (D) A closeup of the four distinct domains with the
respective cross sections, displaying distinct degrees of lateral
ordering. (E) Time evolution of the sandwich-like ordered structure
(II).Time evolution of two merging assemblies for
η = 0.75. (A–C)
Large elongated multi-domain assembly and a small aggregate (inside
a green circle) that are initially spatially separated, (D–F)
merge into a single aggregate. The time frames correspond to (A) 42.33
× 106; (B) 42.34 × 106; (C) 42.35
× 106; (D) 42.36 × 106; (E) 42.38
× 106; and (F) 44.37 × 106 simulation
steps. Note that within the 2 × 106 simulation steps
between the last two time frames (E and F), the initially formed kink
between the two merged assemblies smooths out as the smaller assembly
integrates into the structure of the larger aggregate.In the early assembly stage, we observe oligomers
comprising ∼30
molecules (Figure 5B, top graph), which gradually
decrease in abundance at later assembly stages at the expense of assemblies
comprising up to 150 molecules as well as considerably larger aggregates
comprising up to ∼750 molecules (Figure 5B, bottom graph). This broad distribution of assembly sizes is caused
by breakage, which is more frequent than for η = 0.5 but less
frequent than for η = 1. Assembly into smaller quasi-spherical
oligomers is followed by formation of elongated curvilinear protofibrils
that eventually evolve into large multi-domain aggregates, separated
by kinks (Figure 5C). Figure 5D shows four structurally distinct domains of a polymorphic
aggregate that forms following the pathway described above. Domains
I, III, and IV are cylindrically symmetric yet display variable degrees
of lateral order from the most disordered (IV) to the most ordered
(I), as shown in the corresponding profiles (Figure 5D, bottom). Domain II, which is not cylindrically symmetric,
appears to be the most ordered of the four domains and displays a
sandwich-like profile. Molecules in domain II are arranged into a
double layer of 6–8 molecules in width. This highly structured
domain emerges from a less ordered domain and spreads to neighboring
regions in the course of the simulation (Figure 5E).
Morphology of the Assemblies Depends on the Hydropathy Ratio
The assembly kinetics in our model is driven by a competition between
spontaneous aggregation and disaggregation, controlled by the hydropathic
ratio η, which affects the assembly size distribution, breakage
rate, and assembly morphologies. The orientational order parameter
of assemblies increases with the assembly size for each η (Figure 7A). In addition, pentacosamer (25-mer) conformations
are more ordered and less prone to bending as η increases (Figure 7B).
Figure 7
Order parameter and characteristic pentacosamer conformations.
(A) The average order parameter as a function of the assembly size
for the five different hydropathy ratios η. For each assembly
size, the order parameter is calculated for a window containing a
connected cluster of 50 tetrahedron molecules and then averaged over
all windows along the assembly axis. The error bars correspond to
SEM values. (B) Characteristic pentacosamer (25-mer) conformations
at different values of η.
Order parameter and characteristic pentacosamer conformations.
(A) The average order parameter as a function of the assembly size
for the five different hydropathy ratios η. For each assembly
size, the order parameter is calculated for a window containing a
connected cluster of 50 tetrahedron molecules and then averaged over
all windows along the assembly axis. The error bars correspond to
SEM values. (B) Characteristic pentacosamer (25-mer) conformations
at different values of η.
Mapping to Natively Unfolded Amyloid Proteins
Complex
and diverse self-assembly dynamics observed for 0.5 < η <
1 results in multi-domain aggregates, consistent with in vitro observed
molecular-level polymorphism of amyloid fibrils,[20] with bends and kinks (Figure 8A)
that strongly resemble the morphology of in vitro amyloid fibers (Figure 8B). Can our minimal protein model thus provide insights
into in vitro amyloid formation? Amyloid proteins assemble into cross-β
fibrils, stabilized by intermolecular hydrogen bonding along the fibril
axis.[21−23] Yet, protein aggregation can occur in the absence
of hydrogen bonding. In sickle cell anemia, sickle hemoglobin aggregates
into polymers, causing a characteristic sickle-cell shape of red blood
cells.[24] Whereas free energy calculations
need to account for hydrogen bonding, our model reveals that a multi-stage
assembly process that proceeds through formation of quasi-spherical
and elongated mesoscopic-size aggregates, which eventually undergo
a structural conversion into polymorphic aggregates, occurs also in
the absence of hydrogen bonding.
Figure 8
Comparison of in silico and in vitro morphologies.
(A) A long in
silico aggregate before (upper) and after (lower) the breakage occurring
at the position marked by black arrows (DMD simulations at η
= 0.75). Cross-sections to the left and to the right of the breakage
with mismatched morphologies are displayed at the bottom. (B) Transmission
electron micrograph of Aβ42 fibers characterized by several
kinks (green arrows) and breakages (red arrows). The image is a courtesy
of Drs. Louise C. Serpell, Julian Thorpe, and Thomas L. Williams.
Comparison of in silico and in vitro morphologies.
(A) A long in
silico aggregate before (upper) and after (lower) the breakage occurring
at the position marked by black arrows (DMD simulations at η
= 0.75). Cross-sections to the left and to the right of the breakage
with mismatched morphologies are displayed at the bottom. (B) Transmission
electron micrograph of Aβ42 fibers characterized by several
kinks (green arrows) and breakages (red arrows). The image is a courtesy
of Drs. Louise C. Serpell, Julian Thorpe, and Thomas L. Williams.The multi-stage assembly mechanism
observed in our model is consistent
with previously reported theoretical and experimental findings on
amyloidogenic proteins. A recent theoretical study postulated both
association and breakage processes to predict a variety of distinct
aggregation pathways in agreement with experimental data.[25] Experimental studies on amyloid β-protein
(Aβ), one of the most studied proteins due to its relevance
to Alzheimer’s disease, which is the leading cause of dementia
in elderly, showed that its two predominant alloforms, Aβ40
and Aβ42, form oligomers, which further assemble into protofibrils
and finally convert into fibrils.[26,27] The existence
of a nucleated conversion from prefibrillar into fibrillar assemblies
was recently confirmed through FlAsH monitoring of Cys-Cys-Aβ
aggregation.[28] A structural conversion
from within a molten collapsed intermediates into a structurally ordered
fibril was also shown to be consistent with experimental data on yeastSup35prion protein.[29]A large group
of amyloidogenic proteins is natively unfolded.[30] A tetrahedron molecule in our model represents
a monomer that retains its tertiary structure throughout the assembly
process. As such, it can be identified as an aggregation-prone monomeric
state of a natively unfolded protein. Whereas the internal degrees
of freedom associated with intrinsically disordered nature of natively
unfolded protein monomers are not incorporated in our model, the existence
of such disorder implies that most amino acids in the sequence will
be for some time exposed to the solvent, analogous to the four beads
in the tetrahedron model. Consequently, it is reasonable to map the
ratio of hydrophilic to hydrophobic amino acids, Np/Nh, in the sequence of natively
unfolded amyloidogenic proteins to the hydropathy ratio η of
our model. In this mapping, we implicitly assume that hydrophobic
collapse dominates protein self-assembly and that hydrogen bonds,
which are strongly directional, form among peptide regions only after
these regions have been driven into proximity by effective hydrophobic
interactions. This mapping also does not distinguish between charged
and noncharged hydrophilic amino acids and thus all charged hydrophilic
amino acids effectively contribute to repulsive interactions. In aqueous
solutions, electrostatic interactions among charged amino acids depend
on pH and ionic concentrations that may under specific conditions
modify the overall repulsive to attractive interaction ratio. However,
most protein sequences are dominated by noncharged amino acids and
salt bridges between oppositely charged amino acids tend to be exposed
to the solvent and rarely stabilize protein structures in water.[31,32] Thus, attractive electrostatic interactions are mostly not expected
to significantly contribute to inter-residue interactions during protein
self-assembly.By employing several hydropathy scales previously
reported by Chothia,[33] Janin,[34] Kyte and
Doolittle,[35] Eisenberg et al.,[36] and Engelman et al.,[37] we identified the hydrophilic and hydrophobic amino acids within
each scale, counted the number of hydrophobic and hydrophilic amino
acids in the sequence, and calculated Np/Nh for several natively unfolded proteins,
known to form amyloid fibrils (Table 2). A
histogram of the Np/Nh values from Table 2 demonstrates
that the Np/Nh values are distributed in the range 0.5 < Np/Nh < 2 (Figure 9). The highest Np/Nh values are found for the humanprion, prion-like yeast-Sup35,
and tau (Table 2).
Table 2
Natively Unfolded
Amyloidogenic Proteins
and Their Np/Nh Valuesa
protein
ID
length
Chothia[33]
Janin[34]
Kyte-Doolittle[35]
Eisenberg[36]
Engleman[37]
average
ABri[45]
–
34
0.60
0.93
1.00
1.08
1.00
0.92
ADan[46]
–
34
0.56
0.75
0.88
0.86
0.88
0.79
α-Synuclein
1XX8(43)
140
0.68
0.96
0.98
0.71
0.98
0.86
Amylin
2KB8(43)
37
0.71
0.64
0.71
0.64
0.71
0.68
Aβ(1–40)
1BA4(43)
40
0.53
0.65
0.82
0.48
0.82
0.66
Aβ(1–42)
1IYT(43)
42
0.47
0.58
0.74
0.44
0.74
0.59
Apolipoprotein-A1
P02647[44]
242
1.03
1.26
1.38
1.12
1.32
1.22
Calcitonin
2GLH(43)
33
1.00
1.00
1.12
0.89
1.12
1.03
Human Prion
1QLX(43)
210
1.71
1.35
1.86
0.70
1.57
1.44
Huntingtin
3IO4(43)
449
0.88
0.90
0.97
0.76
0.92
0.89
Tau[47]
–
441
1.13
1.33
1.44
0.93
1.44
1.25
Transthyretin
1BM7(43)
127
0.58
0.71
0.84
0.59
0.80
0.70
Yeast-Sup35
P05453[44]
685
1.24
1.32
1.41
1.04
1.38
1.28
0.86 ± 0.36
The ratio of hydrophilic to hydrophobic
amino acids (Np/Nh) for each protein is calculated within each of the five distinct
hydropathy scales. Proteins are identified by their names and/or ID
codes from Protein Data Bank[43] or Universal
Protein Resource.[44] For the three proteins
without the ID codes, the source references are cited next to the
protein name in the first column. The length of proteins is given
in terms of the number of amino acids in the sequence (third column).
The averages over the five Np/Nh values corresponding to each of the five hydropathy
scales for each protein are shown in the last column. The average
and standard deviation of the average Np/Nh values per protein of the last column
are displayed at the bottom right (bold font).
Figure 9
Histogram of the hydropathic
ratios for natively unfolded amyloid
proteins. Histogram of the ratio of hydrophilic to hydrophobic number
of residues, Np/Nh, in the sequence of natively unfolded amyloidogenic proteins.
Histogram of the hydropathic
ratios for natively unfolded amyloid
proteins. Histogram of the ratio of hydrophilic to hydrophobic number
of residues, Np/Nh, in the sequence of natively unfolded amyloidogenic proteins.The ratio of hydrophilic to hydrophobic
amino acids (Np/Nh) for each protein is calculated within each of the five distinct
hydropathy scales. Proteins are identified by their names and/or ID
codes from Protein Data Bank[43] or Universal
Protein Resource.[44] For the three proteins
without the ID codes, the source references are cited next to the
protein name in the first column. The length of proteins is given
in terms of the number of amino acids in the sequence (third column).
The averages over the five Np/Nh values corresponding to each of the five hydropathy
scales for each protein are shown in the last column. The average
and standard deviation of the average Np/Nh values per protein of the last column
are displayed at the bottom right (bold font).High values of η > 1 in
our model result in a high breakage
rate and relatively short and ordered aggregates. Prion proteins are
known to form brittle aggregates, prone to breakage, which is associated
with their infectious nature.[38] Substantial
evidence indicates that tau protein displays prion-like characteristics
by initially aggregating in a few nerve cells in discrete brain areas
and then self–propagating and spreading to distant brain regions.[39] Most Np/Nh values from Table 2 fall into the range 0.5 < Np/Nh < 1, for which our model predicts complex
and diverse assembly dynamics with a broad assembly size distribution.
These proteins include Aβ40 and Aβ42 associated with Alzheimer’s
disease, amylin associated with diabetes mellitus type 2, transthyretin
associated with systemic amyloidosis, α-synuclein associated
with Parkinson’s disease, and huntingtin associated with Huntington’s
disease. Interestingly, Aβ42, which aggregates faster and forms
larger assemblies than Aβ40 under equivalent in vitro conditions,
has a lower Np/Nh value of 0.59 than Aβ40 with Np/Nh = 0.66, consistent with our
model’s maximal assembly size decreasing with η in the
range 0.5 < η < 1. For η ≥ 0.75, assembly
breakage in our model creates a steady-state pool of oligomers, which
form from smaller and merge into larger assemblies. Assuming that
oligomers are the key toxic species responsible for pathogenesis,
even if lifetimes of oligomers are short, the assembly dynamics that
produces a constant pool of oligomers is a persistent source of toxicity,
which provides a plausible mechanism through which a metastable assembly
with a short lifetime might exert toxicity over a longer time period.
Conclusions and Discussion
In summary, our tetrahedron model
predicts complex self-assembly
into quasi-spherical oliogmers, curvilinear protofibrils, and multi-domain
aggregates, characteristic of in vitro observed amyloid formation
by natively unfolded amyloidogenic proteins. Spontaneous assembly
breakage and merging, assembly size distributions, and a diversity
of assembly morphologies in our model are controlled by a single parameter,
the hydropathic ratio η, which can be meaningfully mapped onto
the hydrophilic to hydrophobic ratio of the sequence of amyloidogenic
proteins. We show that most natively unfolded amyloidogenic proteins
are characterized by 0.5 < η < 1, for which our model
predicts the most complex and diverse assembly dynamics.There
is no consensus on the size and structure of toxic oligomers.
In Alzheimer’s disease, different Aβ pre-fibrillar species
were reported in vitro and in vivo, depending on the methodologies
and experimental settings.[40] Moreover,
relatively minor changes in the sequence between Aβ40 and Aβ42
result in distinct assembly pathways[27] and
oligomer structures[10] that likely underlie
distinct cytotoxic properties of the two peptides, with Aβ42
that is genetically more strongly associated with Alzheimer’s
disease than Aβ40. Similarly, a single amino acid mutation of
Aβ can cause earlier onset of the disease and/or altered pathology.[41] A seeming contradiction between two sides of
the protein aggregation puzzle: (a) a large class of proteins with
variable sequences that aggregate into amyloid fibrils and (b) small
changes in the sequence and/or solvent conditions that significantly
alter the assembly pathways and structures, can be reconciled within
our model by noting that both sequence and solvent modifications alter
an overall hydropathic nature of the protein, thus η, in the
range, where the assembly dynamics is the most sensitive.Short
hydrophobic peptides incubated with Aβ42 were recently
reported to inhibit Aβ42-induced toxicity in cell cultures[42] and subsequent computer simulations of Aβ42
assembly in the presence of these inhibitors elucidated the structure
of the resulting large amorphous hetero-oliogmers.[16] By binding hydrophobic peptides to Aβ42, the overall
hydropathic ratio of hetero-assemblies, η, is lowered to η
≪ 0.5, thereby biasing aggregation toward presumably less toxic
amorphous aggregates. The hydropathy parameter might thus be critical
for a deeper understanding of protein aggregation dynamics and developing
drugs that aim to alter aggregation pathways in a way to minimize
the damage caused by cytotoxic oligomeric assemblies. While the discussion
described above is specific to protein aggregation, our model and
its predictions are applicable to a wide range of self–assembling
systems and elucidate a critical role of the repulsive to attractive
intermolecular interaction ratio in supramolecular structure prediction.
Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971
Authors: Jose M Borreguero; Brigita Urbanc; Noel D Lazo; Sergey V Buldyrev; David B Teplow; H Eugene Stanley Journal: Proc Natl Acad Sci U S A Date: 2005-04-18 Impact factor: 11.205
Authors: R Vidal; T Revesz; A Rostagno; E Kim; J L Holton; T Bek; M Bojsen-Møller; H Braendgaard; G Plant; J Ghiso; B Frangione Journal: Proc Natl Acad Sci U S A Date: 2000-04-25 Impact factor: 11.205