The nucleosome core particle (NCP) is a large complex of 145-147 base pairs of DNA and eight histone proteins and is the basic building block of chromatin that forms the chromosomes. Here, we develop a coarse-grained (CG) model of the NCP derived through a systematic bottom-up approach based on underlying all-atom MD simulations to compute the necessary CG interactions. The model produces excellent agreement with known structural features of the NCP and gives a realistic description of the nucleosome-nucleosome attraction in the presence of multivalent cations (Mg(H2O)62+ or Co(NH3)63+) for systems comprising 20 NCPs. The results of the simulations reveal structural details of the NCP-NCP interactions unavailable from experimental approaches, and this model opens the prospect for the rigorous modeling of chromatin fibers.
The nucleosome core particle (NCP) is a large complex of 145-147 base pairs of DNA and eight histone proteins and is the basic building block of chromatin that forms the chromosomes. Here, we develop a coarse-grained (CG) model of the NCP derived through a systematic bottom-up approach based on underlying all-atom MD simulations to compute the necessary CG interactions. The model produces excellent agreement with known structural features of the NCP and gives a realistic description of the nucleosome-nucleosome attraction in the presence of multivalent cations (Mg(H2O)62+ or Co(NH3)63+) for systems comprising 20 NCPs. The results of the simulations reveal structural details of the NCP-NCP interactions unavailable from experimental approaches, and this model opens the prospect for the rigorous modeling of chromatin fibers.
Eukaryotic
cells confine long genomic DNA (about 2 meters in humans)
in a small micrometer-sized nucleus through a hierarchy of DNA compaction.
The nucleosome, formed by histone proteins and double-stranded DNA,
is the first level of eukaryotic DNA compaction. Its central part,
the nucleosome core particle (NCP), is a wedge-shaped complex comprised
of 145–147 base pairs (bp) of DNA left-handedly wrapped around
a histone octamer (HO) consisting of two copies of the histone proteins
H2A, H2B, H3, and H4.[1] Each core histone
has a “histone tail,” an unstructured, positively charged
N-terminal domain; histone H2A also has a positively charged tail
at its C terminus.[2] The tails play crucial
roles in NCP–NCP interaction and the organization of higher-level
structures. NCPs are connected by linker DNA of variable length, forming
nucleosome arrays and higher-level chromatin structures. The higher-level
structures of chromatin, such as the “30 nm fiber,”
are less known. Numerous models have been proposed and constantly
refined to describe chromatin structure beyond nucleosome arrays.[3] There is also a continuous debate about the relevance
of the in vitro and in silico generated models to the actual structural
and dynamic features of the folded chromatin in vivo.[3−5] However, it is essential to comprehend that a myriad of in vivo
and in vitro structures of packed chromatin are possible due to the
primary attractive NCP–NCP interaction with secondary contributions
from the other nuclear proteins such as linker histones, transcription
factors, topoisomerases, chromatin remodeling complexes, etc.A number of experimental in vitro studies have been conducted on
isolated NCPs to understand the organization of NCPs in chromatin.[6−11] It has been established that histone tails mediate NCP–NCP
interaction in a salt-dependent manner, highlighting the polyelectrolyte
nature of DNA and the histone proteins.[10] Furthermore, studies have elucidated different structural organizations
of NCPs in condensed phases, the phase diagrams of NCP aggregation,
and details of NCP–NCP interactions induced by the presence
of multivalent cations (such as Mg2+) and high salt.[10,12,13] It was found that NCP aggregation
depends on the charge and nature of the cation.[10,11] Even though the importance of electrostatic interactions in the
structure and dynamics of chromatin is recognized, our understanding
of the detailed mechanism is still relatively vague, often depicted
by classical polymer theory and highly simplified electrolyte models,
such as the Debye–Hückel theory. Additionally, our knowledge
of the roles of histone tails in NCP–NCP interaction and chromatin
structure was mainly obtained from the effects of mutated and chemically
modified histones. The histone tails, highly dynamic, are not resolvable
with most imaging techniques.[14] Hence,
the interaction details involving histone tails are hidden from direct
experimental observation.Since as early as the 1990s,[15,16] numerous modeling and
simulation efforts have been made to address the features unattainable
through experiments (mentioned above). In early works, individual
NCPs in chromatin are modeled as simple solid spheres[16−19] or ellipsoids.[20,21] Spherocylinder approximation[22,23] for the NCP shape and volume became the preferred model after the
high-resolution NCP structure was determined.[24,25] These simple models (for instance, the “two-angle”
model[15,19]) are useful in studying the chromatin structure
with a few degrees of freedom. However, the significant role of histone
tails in NCP–NCP interaction has often been neglected or only
crudely assessed. Additionally, detailed interaction features such
as the presence of the acidic patch on the HO core surface were omitted
or replaced by featureless uniform interactions. Higher-resolution
models, such as the explicit tail model,[26] were developed to mitigate the former issue. An NCP model with refined
electrostatic features and a detailed molecular surface has been developed
in the Schlick group,[27−30] which addresses the latter deficiency in NCP modeling. Nevertheless,
these models cannot describe all physical interactions mediated by
multivalent ions and cellular components. In recent years, a few residue-based
coarse-grained (CG) NCP models[31−36] have been developed as computational capability advances. These
models usually adopt bonds to maintain the NCP structure and simple
interaction terms, such as Lennard–Jones interactions with
mostly empirical top-down derived parameters, to account for the nonbonded
interactions. At such level, not only did structural changes accompanied
slow dynamic processes, such as DNA unwrapping,[35,36] but also NCP–NCP interactions due to ion mediated attraction[31] can be described by the models.In this
paper, we present a newly developed, residue-based CG NCP
model derived through systematic bottom-up coarse-graining based on
the inverse Monte Carlo (IMC) approach.[37] The IMC and related approaches such as iterative Boltzmann inversion
(IBI)[38] are referred to as structure-based
coarse-graining. In such bottom-up coarse-graining, no empirical parameters
are introduced other than the existing all-atom force field parameters
of the reference system. We adopt the all-atom CHARMM27 force field[39] to build our primary all-atom MD simulation
systems. The coarse-grained NCP model is parameterized by the IMC
method,[37] in which structural features
of the reference model, manifested by the CG particle pair radial
distribution functions (RDFs) and distributions of intramolecular
bonds and angles, are used to determine interaction potentials of
the CG model so that the CG model reproduces these structural features.
The resulting NCP model is then validated by comparison with data
from in vitro NCP experiments.[38]A complicating circumstance for systematic bottom-up coarse-graining
of NCP is that the NCP is very large for representative atomistic
simulations. This makes it very difficult, if not impossible, to run
one-step coarse-graining using all-atom MD simulations of the NCP
to generate converged structural properties for all interacting species
(DNA, protein, and ions). To compute the necessary CG interactions,
we overcame this difficulty by performing multiple all-atom MD simulations,
each containing a subset of the interacting molecules under the same
thermodynamic conditions. Subsequently, the complete CG NCP model
is pieced together by inheriting interaction potentials from these
subsystems, following the testing of good transferability under the
same thermodynamic conditions. We show that our new CG model reproduces
well-known structural features of the NCP. Furthermore, the new CG
NCP model produces a realistic representation of NCP aggregation induced
by multivalent cations. To the best of our knowledge, this work is
the first application of a physically rigorous bottom-up coarse-graining
to the NCP, a macromolecular complex, and ionic interactions. Additionally,
this work is a stepping stone for future physics-based chromatin models
combined with an appropriate linker DNA model. We expect such chromatin
models based on physics-based interactions to reveal details in chromatin
structure, which provide important new insights into chromatin biological
functioning.
Methods
The NCP
Model
Coarse-graining of
biomolecular systems is always a compromise between the model’s
accuracy and computational efficiency. We aim to develop a CG NCP
model that contains details of the most important interactions applicable
for simulations of a few tens of NCPs to model NCP phase separation
to an ordered aggregated state in the presence of multivalent cations,
which can later be developed into a chromatin model. We choose the
resolution of one bead per amino acid for protein and five beads for
every two base pairs of DNA, as depicted in Figure , using bottom-up coarse-graining to compute
all interaction terms. The nucleosomal DNA is coarse-grained using
our previously validated DNA model,[40,41] where five
beads represent every two base pairs. Four of the five beads represent
the phosphate group bearing −1e charge. All other atoms in
these two base pairs are represented by the fifth bead, located at
the center of mass (COM) of these atoms. Ions are modeled explicitly,
with one bead representing one ion for all ion species.
Figure 1
All-atom representation
(left) and CG representation (right) of
the NCP. The all-atom representation is built using the NCP crystal
structure (PDB: 1KX5),[25] where the globular domains of histone
proteins are in colors; DNA is colored gray. The CG NCP structure
is shown with positively and negatively charged amino acids colored
in blue and red, respectively. Uncharged amino acid beads are displayed
as white spheres. The beads representing the phosphate groups in DNA
are in orange, while the central DNA beads are shown as green balls.
All-atom representation
(left) and CG representation (right) of
the NCP. The all-atom representation is built using the NCP crystal
structure (PDB: 1KX5),[25] where the globular domains of histone
proteins are in colors; DNA is colored gray. The CG NCP structure
is shown with positively and negatively charged amino acids colored
in blue and red, respectively. Uncharged amino acid beads are displayed
as white spheres. The beads representing the phosphate groups in DNA
are in orange, while the central DNA beads are shown as green balls.The histone proteins are coarse-grained with a
resolution of one
CG site per one amino acid residue. All amino acids of the histone
proteins are represented by beads located at the center of mass (COM)
of each amino acid. To simplify the determination and subsequent use
of the CG potentials, the 20 natural amino acids are grouped into
five bead types corresponding to polar (POL), nonpolar (NPL), positively
charged (POS), negatively charged (NEG), and glycine (GLY) amino acids.
POS and NEG beads bear charges +e and −e, respectively. Glycine
is assigned to a dedicated type because of its small size and higher
degrees of backbone freedom. The assignment of amino acid types into
CG bead types is listed in Table S1. The
same potential functions describe interactions of amino acids belonging
to the same type and with other components of the system (DNA and
ions). We note that compromise has been made in the bead type definition
to balance performance and accuracy. For instance, it has been recently
found that even similar residues, such as lysine and arginine, can
establish distinct interactions under certain conditions and further
affect macroscopic phenomena.[42] We have
conducted a test where lysine and arginine are mapped to two respective
CG types. The resulting CG nonbonded potentials (Figure S1) are similar, justifying our bead type definition.
This CG bead type definition allows the distinction of essential interactions,
including electrostatic, hydrophobic, and hydrophilic, while keeping
a minimum number of bead types, avoiding failure in the coarse-graining
practice due to the high complexity of the CG model.The total
potential energy defining our NCP model contains four
terms:The bonded and angular interactions Ubond, Uangle for DNA, and histone
tails are
given in a tabulated form, with linear interpolation of potential
energy between data points. No angle interaction term is included
within the histone core region as it is modeled as an elastic network
(discussed below). Angle types are determined by examining X-AA-X
angles distributions (“X” stands for any amino acid)
for all amino acid types (AA) in the DNA-peptide all-atom simulation
system (discussed below). It was found that the angle at proline presents
distinctive X-AA-X angle distributions compared with other amino acids.
The angle at glycine shows a peak between 120 and 140 degrees instead
of a plateau shown by other amino acids (Figure S2). In the end, three angle types along the peptide chain
are defined for X-AA-X angles, when AA is proline (denoted Angle_PRO),
glycine (Angle_GLY), or other amino acid residues (Angle_Other).The nonbonded interaction is comprised of electrostatic interaction, Uelec, and short-range nonbonded interaction, Usr. Electrostatic interactions are determined
by integer charges placed on charged amino acids, phosphate groups
of the DNA backbone, and explicit ions. The electrostatic interaction
follows Coulomb’s law with a constant relative permittivity,
ϵ = 78 (eq ).where q and q are charges on bead i and j and r denotes the distance between beads i and j. The long-range electrostatic interactions are treated
by the Particle Mesh Ewald (PME)[43] or the
Particle–Particle Particle–Mesh (PPPM)[44] methods.The short-range nonbonded interactions, Usr, are represented by tabulated pair potentials.
Together
with bonded potentials, these potentials are generated by the IMC
calculation using reference RDFs and distributions of bonds and angles
computed in atomistic simulations. For 1–2 and 1–3 bound
neighbors, both electrostatic and short-range nonbonded interactions
are excluded, whereas the 1–4 and longer interactions are preserved
and described in the same way as nonbonded interactions between CG
sites of the corresponding types.We take additional considerations
for the wedge shape of the histone
core and the highly charged nature of NCP, which are paramount in
reproducing the structures of NCP assembly, nucleosome arrays, and
chromatin.[45] Although the nonbonded interactions
are parametrized from the atomistic model by structure-based coarse-graining,
the histone core structure is not guaranteed to be preserved due to
inherent limitations of the underlying all-atom MD simulations. Hence,
we use an elastic network model for the histone core since its conformation
is much less flexible than the other NCP domains and structurally
well-defined. The definition of histone tail regions and core regions
is listed in Table S2. The elastic network
is built based on bead distances in the experimentally available crystal
structure. Any pair of beads with a distance less than 7 Å is
connected by a harmonic bond with equilibrium length inherited from
the NCP crystal structure:where a uniform bond strength
across all core bonds, kb, is set to 5
kBT/Å2. Testing other kb values did not produce noticeable changes.For
short-range nonbonded interactions, the core–core terms
are treated as excluded volume interactions presented as a repulsive
part of the Lennard–Jones potential:in which σ = (σ + σ)/2. σ is determined by the effective radius of each bead type listed in Table S3. ϵ is set to 1.0 kBT.
Reference All-Atom MD Simulations
The nucleosome is a large DNA–protein complex, which is difficult
to sample directly with the all-atom representation to get reliable
and converged structural distributions necessary for bottom-up coarse-graining.
Furthermore, the CG NCP model contains many different types of CG
sites, which creates difficulties in the bottom-up coarse-graining.
We, therefore, implement a “divide and conquer” strategy
in which we set up several subsystems, each of which is simpler than
a whole NCP. However, taken together, these systems cover all the
interactions of the NCP with the ionic environment. Thus, we set up
several all-atom MD simulations with subsets of components that will
form our CG NCP model. To build a model that is capable of thoroughly
investigating the structure and dynamics of NCP and NCP–NCP
interaction at the CG level, we aimed to obtain a complete set of
interaction potentials for DNA, histone tails, monovalent ions (K+, Na+, and Cl–), and multivalent
cations, including hydrated magnesium (Mg(H2O)62+) and cobalt(III)-hexammine (CoHex3+). Among
these components, distributions of monovalent ions are usually easy
to evaluate. Hence, monovalent ions are present in all reference all-atom
subsystems. As all our nonbonded interaction terms are pairwise, we
can cover all interactions terms among the three other major components
(DNA, histone tails, and multivalent ions) with three simulation setups
under the same thermodynamic conditions, each containing a pair of
the components, namely DNA-multivalent ions, DNA-peptide, and peptide-multivalent
ions.The setup of all all-atom simulations is summarized in Table S4. This study uses the CHARMM27 force
field[39] to provide reference structural
information necessary for the bottom-up coarse-graining. It was shown
previously that the CHARMM27 force field provides a better and more
stable DNA structure than later updated CHARMM36.[41] We emphasize that the current CG NCP model is pairwise
additive as the underlying all-atom model. Higher-order interactions,
such as the multibody effect, are not included, which would not be
a problem for models at such a scale. Below, we describe the atomistic
simulations of each subsystem separately.
DNA-Multivalent
Cation Simulations
In our previous work[41,46] focused on DNA-multivalent cation
interactions, we parameterized a well-performing CG DNA model using
IMC. Briefly, the all-atom reference simulations of DNA-multivalent
cation systems were set up with four 36 bp long DNA double helices,
whose sequences can be found in Table S5. Either Mg(H2O)62+ or CoHex3+ ions were added to reach a total concentration of about
50 mM. It should be noted that upon NCP aggregation and formation
of the aggregated concentrated NCP bundles, positive multivalent salt
ions are enriched in the NCP phase. Hence, the mean volume average
Mg2+ ion concentration in the simulation cell (50 mM) corresponds
to a significantly lower experimental bulk concentration.[47]The optimized hydrated Mg(H2O)62+ model[48] and
optimized CoHex3+ model[40] were
used for multivalent ions. About 50 mM K+ and 35 mM Na+ ions were added to provide a background salt environment
close to physiological conditions. Equal amounts of K+ and
Na+ were added in the case of the Mg(H2O)62+ simulation to neutralize the charge difference
between these two systems. We note that although other models of multivalent
cations are available, such as the Mg2+ model by the Schwierz
group,[49] the multivalent cation models
used in the current study are well tested with double-helical DNA.
They have been proven to reliably reproduce solution structures.[40,48] Three independent simulations were performed for the subsystem with
CoHex3+ ions, having different initial positions and velocities.
The CG potentials between all pairs of CG sites were determined by
the IMC method. We continue to use this model in the current CG NCP
modeling.
DNA-Peptide Simulation
In order
to extract interaction potentials between DNA and amino acids, we
computed the RDFs between CG sites of DNA and amino acids with all-atom
simulations of DNA-peptide systems. Based on the sequence of histone
tails, we designed 48 peptides, each 8 amino acids long, containing
all combinations of tripeptides found in the histone tails. The sequences
of the simulated peptides can be found in Table S6.The simulation box was built with eight double-helical
DNA molecules of 16 bp each and one copy of each of the 48 designed
peptides. Sequences of the DNA double helices are listed in Table S7. Although there is one copy of each
peptide in the simulation, the relevant degrees of freedom that are
subjected to sampling are represented by multiple molecules of peptide
(Table S7). Hence, the sampling is not
hindered by the number of a specific peptide. About 65 mM K+ and 130 mM Na+ ions with the neutralizing amount of Cl– ions were added to the system. The cubic simulation
box size was approximately 15.1 nm in each dimension. Ten independent
simulations of 1.5 μs, each with randomized initial positions
and velocities, were carried out to obtain a reliable sampling of
spatial distributions.
Peptide-Multivalent Cation
Simulations
The third set of atomistic simulations was carried
out to obtain
the interaction potentials between amino acids and multivalent cations.
We simulated the same 48 designed peptides (Table S6) as in the DNA-peptide setup, in the presence of multivalent
cations, i.e., Mg(H2O)62+ or CoHex3+. Each simulation contains one copy of all 48 peptides. The
number of multivalent cations was chosen so that the charge carried
by multivalent cations corresponds to three-quarters of the charge
carried by all the peptides. Additionally, K+ and Na+ ions were inserted to reach a concentration of about 20 mM
for both types of monovalent ions. The appropriate amount of Cl– ions is used to neutralize the system. The simulations
were carried out in a cubic box of 15.2 nm, each resulting in a 1.0
μs long trajectory.
All-Atom MD Simulation
Protocol
The all-atom MD simulations were conducted with
the GROMACS package.[50] The initial configurations
were generated by
randomly placing DNA, peptides, and/or multivalent cations in a cubic
simulation box. The standard procedure of adding water molecules followed
by monovalent ions is performed with tools provided by the GROMACS
software. Following a short energy minimization, which removes high-energy
contacts, equilibration of temperature and volume is conducted in
two stages. In the first stage, the temperature is equilibrated under
a Berendsen thermostat[51] at a constant
volume. In the second stage, the simulation system is subjected to
a constant temperature; constant pressure equilibration which is achieved
using the Berendsen thermostat and barostat.[51] All production runs are conducted under an isothermal–isobaric
ensemble at 298.15 K and 1.0 bar, realized by the velocity rescale
thermostat[52] and Parrinello–Rahman
barostat.[53,54]Depending on the composition of each
subsystem, the required time of simulation differs from a few hundred
to a few thousand nanoseconds. In simulations with aggregating components,
i.e., DNA-Co and DNA-Peptide subsystems, the molecular configuration
becomes static once equilibrium is reached as no other advanced enhanced-sampling
technique is used here. Multiple trajectories started from different
particle positions, and velocities are simulated for such subsystems
to resolve the problem of insufficient sampling with a single trajectory.
In total, we have 3 trajectories of the DNA-Co subsystem and 10 trajectories
of the DNA-peptide subsystem. Other simulations were done with a single
run. All trajectories are at least 1.0 μs long to allow diffusion
of big DNA and peptide molecules to reach equilibrium configurations.
All structural properties (RDFs, distributions of bonds, and angles)
are calculated from the last 500 ns in each trajectory.
Derivation of the CG Effective Potential
Except for
the histone core–core interactions described
by the elastic network model, all other interactions between CG sites
were calculated systematically and rigorously by the IMC method[37,55]. In the IMC practice, the goal is to reproduce the average structural
properties of the all-atom system, such as RDFs between nonbonded
CG beads and bond length and angle distributions for bonded beads.
Based on the mapping rules defined earlier, these structural properties
are straightforwardly calculated from the all-atom trajectories for
the respective simulations. For simulations produced multiple trajectories,
the structural properties are averaged over all trajectories with
equal weights. The final effective potentials for the CG model are
derived iteratively with the MagiC software.[55,56] We note that for the calculated interaction terms, no other empirical
parameters are introduced except those in the CHARMM27 force field.
The derived potentials are tabulated and not prescribed by any specific
functional form.
Assembling the CG NCP Model
The final
CG NCP model is built by combining interaction terms from the subsystems
mentioned above. Note that some of the interaction terms are available
from different atomistic simulations. For example, monovalent ion-monovalent
ion potentials are available from all three series of atomistic simulations;
peptide–peptide interactions are available from DNA-peptide
and peptide-multivalent ions simulations. Potentials between the same
CG sites obtained in different simulations are typically similar but
still different because of statistical uncertainty and the principal
dependence of bottom-up derived effective potential on the simulation
conditions (see Figure S3 and the discussion
on transferability of CG potentials in Section below). While composing the final CG NCP
model, when more than one effective potential for a specific CG-site
pair is available, we pick the one that corresponds best to the NCP
conditions. Table S8 of the Supporting
Information lists all CG-site pairs in the final CG NCP model and
references to atomistic simulations from which the respective CG potential
was derived.Preliminary test runs showed that usage of only
nonbonded CG DNA-histone interactions results in the overly large
distance between the nucleosomal DNA and the histone core, leading
to DNA detachment and unwrapping. This could happen due to two reasons.
First, the CG DNA model is derived using relatively short DNAs (36
bp), which mostly keep a straight conformation in the atomistic simulations,
while in an NCP, the DNA is significantly curved. Secondly, the DNA–amino
acid nonbonded interaction potential might not be optimal for DNA-histone
core contacts as it is derived from DNA and flexible peptides. We
resolve the issue by adding harmonic bonds between the central beads
of the nucleosomal DNA and their respective closest amino acid beads
on the histone core. Exceptions are made for the two central beads
at each end of the DNA helix to allow some “breathing”
motion. The equilibrium distances of these bonds are taken from the
NCP crystal structure. The same bond strength as for the bonds in
histone core (5 kBT/Å2) is used. These
bonds provide proper localization of nucleosomal DNA at the histone
core, providing possibilities for restricted local motions. To test
the effect of added DNA-histone bonds, an alternative model with fewer
DNA-histone core bonds (two fewer bonds at each DNA end) is examined
(Figure S4). Not surprisingly, fewer DNA-histone
core bonds result in a somewhat larger NCP size due to partial DNA
unwrapping. Overall, our current model is suitable for studying nucleosome–nucleosome
interactions and NCP phase separation. Although DNA unwrapping is
crucial in determining chromatin structure, our reference data may
not at present be sufficient to model such motion. We expect that
data from other simulations or experiments can be integrated into
this model in the future so that both DNA unwrapping structure and
dynamics can be modeled.
CG MD Simulations
All CGMD simulations
are performed with Hamiltonian dynamics under a canonical ensemble
with the LAMMPS software under the regulation of velocity rescale
thermostat.[52] Initial configurations are
all generated by randomly placing components of the simulation system
into cubic boxes. Simulations are initiated with a tight thermostat
(temperature temping parameter being 10 ps) and a 1 fs time step.
The time step is then increased to 2 fs and finally to 5 fs while
relaxing the temperature damping parameter to 1000 ps. The production
simulations are performed with a 5 fs time step under 310 K. The PPPM[44] algorithm accounts for long-range electrostatic
interactions.CGMD simulations were carried out with a single
NCP and in a system with 20 NCPs. A single NCP was simulated in the
presence of a varying concentration of monovalent ions in order to
validate the structural features of the developed CG NCP model. To
study NCP–NCP interactions and their dependence on ionic conditions,
a series of CG MD simulation runs were performed with 20 NCPs and
monovalent (mixture of K+ and Na+, K-20NCP system),
divalent (Mg(H2O)62+ at 3.5, 7.1,
and 10.6 mM, Mg-20NCP system), and three-valent (CoHex3+ at 2.3, 4.7, and 7 mM, Co-20NCP system) cations (see exact composition
in Table S9). The concentrations of Mg(H2O)62+ and CoHex3+ ions were
chosen to be 0.5, 1, and 1.5 times the amount of counterions corresponding
to the neutralization of the NCP negative charge. In the Mg-20NCP
and Co-20NCP simulations, NCPs may aggregate due to the condensing
effect of multivalent cations. Therefore, an annealing procedure is
applied to improve sampling and avoid trapping in local minima. Specifically,
after equilibration, the system temperature is raised from 310 K to
an elevated one in 2 ns. The simulation is kept at the hot temperature
for 16 ns before returning to 310 K in 2 ns. The subsequent NVT simulation
is conducted for 50–100 ns depending on how fast the system
reaches equilibrium. The annealing cycle is repeated 10–20
times. The elevated temperature is simulation-specific depending on
the composition of each simulation box (see Table S9).
Results and Discussion
We begin with a description of the results of the all-atom MD simulations.
Then, the data of the bottom-up coarse-graining will be presented,
with discussions on the transferability of the interaction potentials.
Finally, we describe the results of CGMD simulations of the developed
CG NCP model, showing the structural properties of individual NCP
and NCP aggregates in the presence of multivalent cations.
All-Atom Simulations
In order to
extract the effective CG potentials between the components of the
CG NCP system, five subsystems were constructed and simulated under
an isothermal–isobaric ensemble. The system composition of
each simulation is given in Table S4. Representative
snapshots of each subsystem at the end of simulations are shown in Figure . In agreement with
the behavior predicted from experiments,[57] DNA aggregation does not occur in the presence of Mg2+ ions, while the multivalent ions of charge +3 or larger induce DNA–DNA
attraction demonstrated by the formation of bundles of DNA fibers.
Figure 2
Representative
snapshots of (a) DNA-Co, (b) DNA-Mg, (c) DNA-Peptide,
(d) Peptide-Co, and (e) Peptide-Mg subsystems showing configurations
at the end of simulations. DNA is colored red, while peptide chains
are colored blue. Periodic images are colored gray.
Representative
snapshots of (a) DNA-Co, (b) DNA-Mg, (c) DNA-Peptide,
(d) Peptide-Co, and (e) Peptide-Mg subsystems showing configurations
at the end of simulations. DNA is colored red, while peptide chains
are colored blue. Periodic images are colored gray.Generation of well-converged equilibrium RDFs is critically
important
in structure-based coarse-graining. We quantify the RDF convergence
by calculating the root mean square fluctuations (RMSF) of the nonbonded
RDF in each simulation trajectory. The RMSF of the nonbonded RDF is
defined aswhere g is
the nonbonded RDF calculated from a 50 ns section of the simulation, gfin is the final nonbonded RDF calculated from
the last 500 ns of the simulation, and N is the total
number of entries of the nonbonded RDF in discrete representation.In Figure , the
RMSFs of nonbonded RDFs are plotted as functions of simulation time
for each trajectory for the DNA-Co and DNA-Peptide subsystems (for
RMSF plots of all subsystems, see Figure S5). The RDF has converged well in each trajectory, as shown by the
convergence of RMSF to a low value. Notably, during the last 500 ns
of each trajectory, the RMSF of RDF is stable at a low value, which
validates our choice of using this section of the trajectories for
the subsequent CG modeling.
Figure 3
Root mean square fluctuation of block-averaged
RDF relative to
the final RDF for DNA-Co (top) and DNA-Peptide (bottom) subsystems.
Final RDF is calculated with the last 500 ns of each trajectory.
Root mean square fluctuation of block-averaged
RDF relative to
the final RDF for DNA-Co (top) and DNA-Peptide (bottom) subsystems.
Final RDF is calculated with the last 500 ns of each trajectory.
Deriving CG Potentials
The CG effective
potentials are derived by the IMC method,[37] which takes the RDFs and bond distributions from the atomistic simulations
and finds the set of potentials that reproduces the input. A detailed
account of the application of the IMC method has been given earlier
for the DNA-Co subsystem.[46] Below, we give
a brief description of the procedure.For each subsystem, atomistic
trajectories are first mapped to a coarse-grained description according
to the mapping scheme of the specific CG model. The RDFs between nonbonded
CG sites and distribution of bond lengths and angles, plotted in Figures S6–S10 for each subsystem, are
computed from the last 500 ns of all trajectories. These functions
(RDFs and bonded distributions) are taken as input by the Magic-3
software[55] to conduct IMC computations.
Within this approach, the system is first simulated with a trial potential.
Then, comparing the computed RDFs with reference RDFs obtained in
the atomistic simulations, IBI or IMC algorithms are used to compute
a correction to the potential. The potential is improved iteratively
until an agreement between simulated and reference RDFs is reached
within the statistical error of the simulations. In the inverse procedure,
the electrostatic part of the interaction is not changed and treated
by the Ewald summation method,[43] while
short-range nonbonded and bonded interactions are varied.Typically,
the iterative procedure started from zero nonbonded
potentials and Boltzmann inversed bonded potentials. The first 10
iterations were done according to the IBI scheme.[38] Subsequent iterations proceed with IMC until full convergence,[37] which typically takes 30–50 iterations.
Complete sets of final short-range potentials for each subsystem are
provided in Figures S6–S10, with
corresponding RDFs in the Supporting Information. It should be noted that with the systematic modeling of IMC, cross
correlations among interaction terms are accounted for so that the
individual term does not necessarily predict the corresponding RDF.
For example, the P–P nonbonded potential in the DNA-Mg system
(Figure S6) is more attractive than that
in the DNA-Co system (Figure S7). However,
the P–P RDF in the DNA-Mg system is significantly lower as
DNA molecules are not aggregating. With correlations among all interaction
terms, the IMC-derived potential reproduces the reference RDF within
statistical error.In our “divide and conquer”
strategy, a particular
interaction term can be derived from more than one subsystem. A selected
set of short-range nonbonded effective potentials from different subsystems
is plotted in Figure for comparison. A complete comparison of all such potential terms
is shown in Figure S3. Figure and Figure S3 show that the difference in potential energy functions from
different subsystems is minimal for terms related to ions, especially
for the monovalent ions (e.g., K–K potential in Figure ), which demonstrates the transferability
of the effective potential. Most of the other potentials have the
same features (peaks and wells), and the only difference is in the
height of the peaks and the depth of the wells. Such terms include
amino acid–amino acid (e.g., POS–POL potential in Figure ) and amino acid–monovalent
ion (e.g., POL–K potential in Figure ).
Figure 4
Selected short-range potential functions obtained
by the IMC method
from different subsystems. See Figure S3 in the Supporting Information for a complete set of such potential
terms.
Selected short-range potential functions obtained
by the IMC method
from different subsystems. See Figure S3 in the Supporting Information for a complete set of such potential
terms.The most significant variation
of the potentials is observed for
the DNA–DNA nonbonded terms, for instance, the P–P potential
in Figure . The notable
variance in DNA–DNA potentials is due to the different aggregated
configurations exhibited in the simulated subsystems. CoHex3+ is a potent DNA condensing agent,[58,59] while the
positively charged peptides can also induce DNA aggregation.[57,60] The question arises regarding what potentials should we use to construct
the CG NCP model. As the difference in DNA–DNA potentials originates
from the difference in composition of the subsystems, we follow the
principle to use the effective potentials derived from the subsystem
closest to our target system. Since our target system, the CG NCP,
is a DNA-protein complex with DNA surrounded primarily by the histones
(both core and tails), potential functions from the DNA-Peptide subsystem
are preferred. The final CG NCP model is built by combining effective
potentials from the subsystems according to a similar principle. The
chosen combination of effective potential terms for all nonbonded
interactions is summarized in Table S8.
Structural Features of the CG NCP Model
In order to validate our new CG NCP model, we first perform MD
simulations of a single CG NCP at 310 K. The simulations are conducted
in 30 nm cubic boxes within the canonical ensemble. In addition to
the monovalent ions neutralizing the NCP charge, monovalent salt from
0 to 150 mM is added (the amounts of Na+ and K+ are equal). We examine the salt dependence of three parameters of
the CG NCP structure in a single NCP system: the root mean square
deviation (RMSD) relative to the reference crystal structure (PDB:
1KX5), the maximal dimension (Dmax), and
the radius of gyration (Rg) of the NCP.
These properties are plotted in Figure and compared with the experimental data.
Figure 5
Structural
features of CG NCP. Structural properties obtained in
equilibrium simulation of a single NCP are presented with box plots,
including the RMSD relative to the crystal structure (a), maximal
dimension (Dmax in b), and radius of gyration
(Rg in c). The edges on each box plot
represent the 25th and 75th percentile. Mean values and medians are
shown as magenta and cyan lines in the boxes, respectively. Whiskers
are extended for 1.5 times the interquartile range in each direction.
A shaded distribution curve is plotted alongside each box to illustrate
the shape of the distribution for each dataset. RMSD values are calculated
and combined for DNA and histone core (as defined in Table S2). Dmax and Rg are plotted together with experimental data from ref (61). All properties are shown
as functions of salt concentration.
Structural
features of CG NCP. Structural properties obtained in
equilibrium simulation of a single NCP are presented with box plots,
including the RMSD relative to the crystal structure (a), maximal
dimension (Dmax in b), and radius of gyration
(Rg in c). The edges on each box plot
represent the 25th and 75th percentile. Mean values and medians are
shown as magenta and cyan lines in the boxes, respectively. Whiskers
are extended for 1.5 times the interquartile range in each direction.
A shaded distribution curve is plotted alongside each box to illustrate
the shape of the distribution for each dataset. RMSD values are calculated
and combined for DNA and histone core (as defined in Table S2). Dmax and Rg are plotted together with experimental data from ref (61). All properties are shown
as functions of salt concentration.First, we calculated the RMSD for DNA, histone core, and combined
parts (Figure a).
The histone core shows a low average RMSD of about 0.2 nm for all
salt concentrations. It confirms the stability of the histone core
structure as expected from the constructed elastic network model.
The RMSD values of the nucleosomal DNA are about 0.6 nm. This is acceptable because the bonded interactions in DNA are
derived for straight (albeit flexible) DNA, and fluctuations of the
DNA structure are inherited from the underlying MD simulations. The
overall RMSD for the DNA and the histone core region is about 0.5 nm. We conclude that the model well preserves the structure
of the nucleosomal DNA and the histone core. As anticipated, there
is no dependence of the RMSD on monovalent salt concentration.To estimate the salt dependence of the overall size of the NCP
within our model, taking into account the flexible tails, we calculate
the values of Dmax (Figure b) and Rg (Figure c). Both parameters
increase slightly with the increase of salt, which can be explained
by the increasing extension and mobility of the histone tails caused
by weakening the tail–DNA interaction due to the screening
effect of salt ions. This contribution should be the dominant contribution
to salt dependence, considering the stable size of the DNA and histone
core (Figure a). Small-angle
X-ray scattering (SAXS) experiments reported Rg values of the NCP in the range 4.15–4.30 nm and Dmax between 11.5 and 13.5 nm for salt concentrations
in the range of 0–150 mM.[61] The Rg values obtained in our simulations (change
of mean value from 4.08 to 4.13 nm in the same salt range, 0–150
mM) agree very well with experimental data. However, in our CG NCP
simulations, the mean Dmax value increases
from 14.5 to 15.7 nm; both numbers are larger than the experimental
values, while the salt-dependent change is smaller. We can explain
the difference in the Dmax values by the
fact that the experimental determination of the Dmax relies on the quality of the SAXS spectra that are
a product of several measurements with inherited experimental uncertainties
and noise (subtraction of the contributions from the background solution
and cell walls). In particular, higher Dmax values obtained in the CG MD simulations can be explained by the
contribution from the flexible histone tails sampling the space further
away from NCP. In the SAXS experiments, this contribution is at the
noise level due to the small contrast in electron density between
the tails and the solvent, and this contribution may not be accounted
for.We can conclude from single NCP simulations that the new
CG NCP
model is structurally sound under the limitation of the underlying
force field. Our CG NCP model agrees well with the experimental data
for NCP solutions at low to physiological monovalent salt concentrations.
The NCP in solution has a stable structure of the DNA and the histone
core, with the histone tails collapsing on the DNA in a salt-dependent
manner in agreement with the experiment.
NCP Aggregation
in the Presence of Multivalent
Cations
The primary aim of developing the new CG NCP model
is to describe NCP–NCP interactions and particularly aggregation
properties under different conditions. We, therefore, performed a
series of CG MD simulations of 20 NCPs with different mono- or multivalent
cations and background added salt (KCl and NaCl) at 15 mM (Table S9). The simulations were conducted with
Hamiltonian dynamics until the equilibrium state was well sampled.
When NCP aggregation occurs in simulations with multivalent ions,
a simulated annealing procedure is implemented to improve sampling
as described in the Methods section.It is well known from in vitro experiments that multivalent ions
can induce phase separation of NCP from the solution.[10,12,13] In particular, the CoHex3+ ion is a highly potent condensing agent and promotes the
formation of hexagonal columnar NCP phases for a wide range of ion
concentrations.[10,12,13] Even though this counterion lacks biological significance, it provides
an excellent model system for validating the model’s capacity
to reproduce observed behavior. Our CG model correctly reproduces
the multivalent ion-induced NCP aggregation phenomenon in the MD simulations.
After reaching equilibrium, NCPs stay in the solution phase when only
monovalent ions are present (Figure a). In the presence of Mg(H2O)62+ (Figure b) or CoHex3+ (Figure c), all 20 NCPs aggregate, forming a single cluster.
This aggregation behavior is in excellent agreement with experimental
data demonstrating precipitation and formation of various aggregated
ordered phases of NCP.[6−13]
Figure 6
CG
MD simulations of the systems with 20 NCPs and K+, Mg(H2O)62+ (7.1 mM), and CoHex3+ (4.7 mM) cations. Snapshots of typical equilibrium states
in K-20NCP (a), Mg-20NCP (b), and Co-20NCP (c) simulations are shown.
Histone tails and monovalent ions are omitted in the snapshots for
clarity. Mg(H2O)62+ ions are shown
as blue balls, whereas CoHex3+ ions are in magenta. (d)
The histone core–histone core RDF. NCP–NCP interaction
shows different modes as indicated by the RDF. (e) The NCP–NCP
effective potentials derived from the RDFs are displayed in panel
(d). (f) NCP valency is calculated according to eq . Error bars are standard deviations.
CG
MD simulations of the systems with 20 NCPs and K+, Mg(H2O)62+ (7.1 mM), and CoHex3+ (4.7 mM) cations. Snapshots of typical equilibrium states
in K-20NCP (a), Mg-20NCP (b), and Co-20NCP (c) simulations are shown.
Histone tails and monovalent ions are omitted in the snapshots for
clarity. Mg(H2O)62+ ions are shown
as blue balls, whereas CoHex3+ ions are in magenta. (d)
The histone core–histone core RDF. NCP–NCP interaction
shows different modes as indicated by the RDF. (e) The NCP–NCP
effective potentials derived from the RDFs are displayed in panel
(d). (f) NCP valency is calculated according to eq . Error bars are standard deviations.We analyze the structure of the NCP aggregates
by plotting RDFs
between the COM of the NCP histone cores (Figure d). In monovalent salt (K-20NCP system, black
line in Figure d),
the RDF indicates the absence of the close NCP–NCP contacts.
Overall, NCPs repel each other in the presence of only monovalent
ions. Mere occasional NCP contacts occur mediated by histone tail
bridging and cation screening. In contrast, high-intensity peaks in
the core–core RDFs are observed in the Mg-20NCP and Co-20NCP
simulations. In the simulation with Mg(H2O)62+ (red curve in Figure d), two major (at 6.45 and 8.12 nm) and two minor (at
10.8 and 14.2 nm) peaks are observed. The first peak at 6.45 nm corresponds
to the stacking coordination of NCPs, where the flat surfaces of the
two NCPs are in close contact. The second peak at 8.12 nm corresponds
to the perpendicular coordination of the two NCPs, where DNA from
one NCP is in close contact with the histone core from another NCP.
The third peak at 10.8 nm represents the side-by-side positions of
the NCPs with close DNA–DNA interaction, where the distance
between the NCPs is approximately equal to the diameter of the particle.The core–core RDF calculated from the Co-20NCP simulation
(green curve in Figure d) indicates a somewhat different internal structure of the NCP aggregate.
There are two major peaks at 5.65 and 9.53 nm. The first peak corresponds
to the NCP stacking, 0.8 nm closer than in the Mg-20NCP simulation.
This tighter NCP packing can be attributed to the higher CoHex3+ charge that results in the effective screening of the DNA–DNA
repulsion, allowing the almost complete overlapping of the stacked
NCPs. In the Mg-20NCP system, NCPs in the stack are shifted relative
to each other to minimize DNA–DNA proximity. Similar to the
third peak (10.8 nm) in the Mg-20NCP RDF, the second peak in the Co-20NCP
system is for the side-by-side NCP contact due to the DNA–DNA
interactions. In the Co-20NCP system, the perpendicular NCP coordination
as well as the contact between the DNA on the NCP side and the histone
core, observed in the Mg-20NCP system at 8.12 nm, is diminished. This
can be explained by the ability of the CoHex3+ ions to
cause a stronger DNA–DNA attraction than Mg(H2O)62+. As a result, NCP stacking becomes overlapping,
and side-by-side orientation dominates over the perpendicular NCP
coordination.Further simulations have shown that the position
and magnitude
of the RDF peaks depend on the multivalent ion concentrations. As
shown in Figure S11, higher multivalent
ion concentration generally results in the RDF peaks being shifted
to the left, suggesting closer contacts between NCPs into more compact
aggregates. Our future works will elaborate on the details of multivalent
ion-induced NCP condensation at different ionic conditions, including
an analysis of the aggregated structures.To describe the physical
picture of the NCP behavior from the core–core
RDFs (Figure d), we
used the IMC method to calculate the effective NCP–NCP pair
potentials (Figure e). The curves in Figure e show dependencies of the effective NCP–NCP interaction
energy as a function of the core–core distance at different
ionic environments. The steep repulsive sections of the curves at
a short distance show that the effective size of the NCP increases
in the order Co-20NCP < Mg-20NCP < K-20NCP. When only monovalent
ions are present, shallow energy minima are seen (black line in Figure e), revealing that
NCP contacts are transient and only distant weak correlations in the
NCP positions are present due to NCP–NCP repulsion. Multivalent
cations screen NCP–NCP repulsion, ultimately leading to NCP–NCP
attraction and aggregation. Notably, the effective potential minima
determine the equilibrium NCP–NCP distances and define the
internal structure of the condensed phase. In the presence of Mg(H2O)62+ ions, the NCP–NCP potential
shows two major minima corresponding to the two major peaks in the
RDF. The first minimum at 6.45 nm, measuring the NCP–NCP face-to-face
interaction, coincides with the experimentally measured value by Funke
et al.[62] The depth of this minimum (−1.0
kCal/mol) is in an acceptable range compared to the experimental value
(−1.4 kCal/mol)[62] considering the
difference in Mg(H2O)62+ concentration
(7 mM vs 11 mM) and buffer used in the experiment. This again illustrated
the advantage of the bottom-up-derived nonbonded potential in modeling
NCP–NCP interactions. Interestingly, in the Co-20NCP system,
the second minimum of the potential curve is deeper and broader than
all other potential wells. This can be attributed to the strong DNA–DNA
attraction induced by CoHex3+, leading to the coexistence
of several types of NCP–NCP contacts with distances within
the range of this potential well (illustrated in Figure ).
Figure 7
Summary of NCP–NCP
binding modes in the multivalent ion-induced
NCP aggregates. The most populated NCP–NCP contacts correspond
to the minima in the core–core effective potential function
(Figure e). The NCP
is shown as simplified shapes representing its cross section with
green circles for the DNA wrapped around the histone core (red circle).
Summary of NCP–NCP
binding modes in the multivalent ion-induced
NCP aggregates. The most populated NCP–NCP contacts correspond
to the minima in the core–core effective potential function
(Figure e). The NCP
is shown as simplified shapes representing its cross section with
green circles for the DNA wrapped around the histone core (red circle).The spatial distribution function (SDF) of DNA
visualizes how NCPs
are arranged relative to each other. Figure shows that DNA densities below and above
the central NCP are observed in the condensed clusters of the Mg-20NCP
and Co-20NCP systems, confirming that the first peak in the RDFs (Figure d) corresponds to
the NCP–NCP stacking. However, the face-to-side NCP–NCP
contact is only seen in the Mg-20NCP system (left column of Figure ). The perpendicular
rings of DNA density above and below the reference NCP clearly show
the face-to-side NCP contact. Diffuse areas on the side of the central
NCP in the Co-20NCP system (right column in Figure ) correspond to multiple orientations of
the neighboring NCPs in the condensed phase, all within a distance
covered by the wide second well of the core–core effective
potential (green curve in Figure e).
Figure 8
SDFs of DNA in multivalent ion-induced NCP aggregates.
In the Mg-20NCP
(left) and Co-20NCP (right) systems, NCP–NCP stacking is revealed
by the densities above and below the central NCP. In the Mg-20NCP
system, the “arcs” above and beneath the central NCP
indicate a population of perpendicular NCP–NCP orientations.
In the Co-20NCP system, the densities of the NCP lateral side opposite
reflect contacts between DNA on the lateral surface of the wedge-shaped
NCP cylinder. The absence of the SDF density at the DNA entry–exit
location of the NCP indicates that most particles orient this part
toward the condensate surface. See the Supporting Information for 3D animations.
SDFs of DNA in multivalent ion-induced NCP aggregates.
In the Mg-20NCP
(left) and Co-20NCP (right) systems, NCP–NCP stacking is revealed
by the densities above and below the central NCP. In the Mg-20NCP
system, the “arcs” above and beneath the central NCP
indicate a population of perpendicular NCP–NCP orientations.
In the Co-20NCP system, the densities of the NCP lateral side opposite
reflect contacts between DNA on the lateral surface of the wedge-shaped
NCP cylinder. The absence of the SDF density at the DNA entry–exit
location of the NCP indicates that most particles orient this part
toward the condensate surface. See the Supporting Information for 3D animations.In a recent paper,[35] the concept of
NCP “valency” has been suggested as a useful number
counting NCP–NCP contacts in chromatin. The NCP valency is
defined as the average number of NCP–NCP contacts per each
NCP in the system:where i and j are indices of NCP, Nt is
the number of sampled snapshots, and Nn is the number of NCP in the simulation. The NCP–NCP contact
(C(t)) is determined
by a distance criterionwhere dmin is the minimum distance between
NCP i and j (excluding histone tails).
The
statistics of NCP valency is shown in Figure f. As only occasional contacts occur in the
simulation with monovalent ions, NCP valency in this system is close
to zero. The high valency number reflects numerous “soft”
contacts with various NCP–NCP orientations observed in the
system with Mg(H2O)62+. In contrast,
aggregation of the NCPs caused by the CoHex3+ ions results
in a more organized and structurally defined structure with a smaller
number of distinct NCP–NCP contacts reflected by a lower valency
number. The observed picture is reminiscent of the recent observation
that in compact nucleosome arrays, a “trade” between
entropy and energy contributions to the free energy of the condensed
phase leads to the higher NCP valency in the disordered but compact
arrays at physiological salt relative to the highly organized low-NCP
valency folded fibers under low-salt conditions.[35]Furthermore, we examine the distribution of the histone
tails around
an NCP in the Mg-20NCP and Co-20NCP systems. The SDFs of the histone
tails are calculated from the equilibrated parts of the CG MD trajectories.
The resulting SDFs are shown in Figure , with isosurfaces corresponding to an equal probability
of finding tail beads in the Mg-20NCP and Co-20NCP systems. Comparing
the SDFs of the NCP, “Own Tails” shows a slightly larger
volume of SDF in the Co-20NCP simulation. This can be explained by
the tail-mediated DNA–DNA attraction in the NCP stacks; in
the Co-20NCP system, the histone tails are more likely to be spread
between NCPs. In addition, we can directly observe histone tail-mediated
NCP–NCP interaction by plotting the SDFs of the neighboring
(“Others’ Tails”) NCP tails around a central
NCP. The H4 (green) and H2A (yellow) tails have close contact with
neighboring NCPs both in the Mg-20NCP and Co-20NCP systems (bottom
row in Figure ). These
tails mediate NCP–NCP stacking by screening DNA–DNA
repulsion at the top and bottom of the NCP cylinder. The H2B tails
(red) are also near the neighbors’ DNA in the condensed Co-20NCP
but not in the Mg-20NCP aggregate. Last, the H3 tails are away from
the neighboring NCPs in the condensed NCPs, indicating that they do
not make contacts with neighbors contributing to the NCP–NCP
interaction by screening the charge of their own NCP.
Figure 9
SDFs of the histone tails
around the NCP calculated from the equilibrated
parts of the CG MD trajectories in the Mg-20NCP and Co-20NCP systems.
Threshold values used to draw the transparent isosurfaces are the
same for both systems. The SDFs of the H3 tails are colored blue,
the H4 tails are green, the H2A tails are yellow, and the H2B tails
are red; the DNA and core histone beads of the central NCP are shown
as gray spheres. The top and side views of the central NCP are shown.
See the Supporting Information for the
3D animation of these distributions.
SDFs of the histone tails
around the NCP calculated from the equilibrated
parts of the CG MD trajectories in the Mg-20NCP and Co-20NCP systems.
Threshold values used to draw the transparent isosurfaces are the
same for both systems. The SDFs of the H3 tails are colored blue,
the H4 tails are green, the H2A tails are yellow, and the H2B tails
are red; the DNA and core histone beads of the central NCP are shown
as gray spheres. The top and side views of the central NCP are shown.
See the Supporting Information for the
3D animation of these distributions.Finally, the SDFs of the Mg(H2O)62+ and CoHex3+ ions around NCP were calculated and shown
as colored volumes in Figure . The data shows that Mg(H2O)62+ ions preferably associate with the DNA and the acidic patch on the
histone core, whereas CoHex3+ ions concentrate predominantly
near the DNA. At the same time, monovalent cations are excluded from
the space occupied by the multivalent ions, with CoHex3+ ions exerting a more substantial exchange effect than Mg(H2O)62+ near the DNA (see Figure S12). This change in ionic composition near the NCP
leads to an observed density of the Cl– ions in
the central cavity of the histone core (Figure S12), which suggests a change in the electrostatic properties
of the NCP. This may serve a role in the face-to-face contact of NCPs.
In the Mg-20NCP system, monovalent ions are exchanged both near the
DNA and the so-called acidic patch on the surface of the HO core,
resulting in uniformly lower monovalent ion concentration near the
NCP. In the Co-20NCP system, there is a low probability of finding
monovalent cations near the DNA, while some presence is observed near
the acidic patch. Hence, we can infer that the charge screening by
the Mg(H2O)62+ ions is spatially
spread out, while the CoHex3+ ions mainly neutralize nucleosomal
DNA and mediate DNA–DNA attraction. This explains the differences
in the NCP–NCP contacts observed in the Mg-20NCP and Co-20NCP
systems.
Figure 10
Top view (top row) and side view (bottom row) of the distribution
of multivalent ions surrounding the NCP. Yellow regions in the top
view panels indicate the approximate positions of the acidic patch.
Top view (top row) and side view (bottom row) of the distribution
of multivalent ions surrounding the NCP. Yellow regions in the top
view panels indicate the approximate positions of the acidic patch.Structural analysis of the charge–charge
contacts in the
NCP shows that the surface of the histone core exposed to the solvent
is net negatively charged in the presence of monovalent salt, while
the net charge of the core as a whole is net-positive.[14,63] Most of the positively charged amino acids of the globular part
of the histone core are neutralized by the wrapped DNA. The negative
charge on the core surface is located in the two acidic patches on
the two opposing flat core surfaces. Consequently, the interaction
between these contact surfaces of the NCPs is repulsive in the absence
of multivalent ions. From this observation, it follows that participation
of the histone positively charged tails, multi- (Mg2+,
CoHex3+) and monovalent (K+) ions are required
for the DNA–DNA and the core–core interactions to become
attractive and enable the close stacking of nucleosomes on top of
each other illustrated in Figure . These interactions are indeed seen in the SDF distributions
(Figures and 10).In summary, CG MD simulations show that
the geometry of the NCP–NCP
contacts and structure of the NCP condensed phase depend on the ionic
conditions and the cation charge and nature. The lower-strength NCP–NCP
attraction induced by Mg(H2O)62+,
with a screening effect on both nucleosomal DNA and histone core,
allows multiple contact modes, including perpendicular NCP–NCP
orientation, and leads to the higher NCP valency. The stronger CoHex3+–DNA interaction facilitates inter-nucleosome DNA–DNA
attraction, limiting the number of ways of arranging the NCP around
a particular NCP, producing more ordered condensed phases in agreement
with experiments. This difference can explain the experimental observation[10] that NCP precipitates induced by CoHex3+ are more ordered than those induced by Mg(H2O)62+.
Conclusions
We have
developed a CG NCP model within a bottom-up approach, establishing
its parameters exclusively from atomistic simulations with no adjustable
nonbonded parameters. For such a complex system as the NCP, it is
extremely difficult, if not impossible, to follow a direct bottom-up
approach, i.e., sampling the conformational space of several NCPs
at the all-atom level with the subsequent derivation of effective
CG potentials. We adopted a “divide and conquer” strategy,
where the effective CG potentials are extracted from a set of all-atom
MD simulations with subsets of components of the NCP subsequently
integrated into the CG NCP model. MD simulations of the resulting
CG NCP model have shown that the structure of the NCP is well preserved,
and basic NCP structural parameters agree well with experimental SAXS
data.In developing the current CG NCP model, we intentionally
did not
address the influence of DNA sequence on the NCP properties. Numerous
experimental studies (mainly using SAXS and X-ray diffraction methods)
have proved that the phase behavior of NCP does not show sequence
dependence for cell-extracted NCPs with mixed DNA sequence or for
the in vitro reconstituted NCPs with sequences of high-affinity positioning
“601” DNA, α-satellite DNA, and telomeric DNA
(see for example refs (10, 13, 64, 65)). The present approach can be updated to include DNA sequence effects
by using the IMC method to extract effective potentials defined for
the DNA P and D beads for all unique two-base pair sequence combinations.
However, this is outside the scope of the present work since we focus
on NCP aggregation and phase behavior, which, as articulated above,
do not show any sequence dependence.Simulations with multiple
NCPs have shown that the developed model
correctly reproduces multivalent cation-induced NCP condensation.
Further studies of NCP–NCP interaction at different ionic conditions
can be directly carried out with the current CG NCP model. Additionally,
a CG model of a nucleosome array with linker DNA can be built similarly
using CG interaction potentials obtained in this work. Using this
newly developed CG NCP model, we intend to carry out a systematic
investigation of the structural properties of NCP aggregates in the
presence of different concentrations of di- and trivalent ions.
Authors: Nikolay Korolev; Abdollah Allahverdi; Ye Yang; Yanping Fan; Alexander P Lyubartsev; Lars Nordenskiöld Journal: Biophys J Date: 2010-09-22 Impact factor: 4.033