We introduce MADna, a sequence-dependent coarse-grained model of double-stranded DNA (dsDNA), where each nucleotide is described by three beads localized at the sugar, at the base moiety, and at the phosphate group, respectively. The sequence dependence is included by considering a step-dependent parametrization of the bonded interactions, which are tuned in order to reproduce the values of key observables obtained from exhaustive atomistic simulations from the literature. The predictions of the model are benchmarked against an independent set of all-atom simulations, showing that it captures with high fidelity the sequence dependence of conformational and elastic features beyond the single step considered in its formulation. A remarkably good agreement with experiments is found for both sequence-averaged and sequence-dependent conformational and elastic features, including the stretching and torsion moduli, the twist-stretch and twist-bend couplings, the persistence length, and the helical pitch. Overall, for the inspected quantities, the model has a precision comparable to atomistic simulations, hence providing a reliable coarse-grained description for the rationalization of single-molecule experiments and the study of cellular processes involving dsDNA. Owing to the simplicity of its formulation, MADna can be straightforwardly included in common simulation engines. Particularly, an implementation of the model in LAMMPS is made available on an online repository to ease its usage within the DNA research community.
We introduce MADna, a sequence-dependent coarse-grained model of double-stranded DNA (dsDNA), where each nucleotide is described by three beads localized at the sugar, at the base moiety, and at the phosphate group, respectively. The sequence dependence is included by considering a step-dependent parametrization of the bonded interactions, which are tuned in order to reproduce the values of key observables obtained from exhaustive atomistic simulations from the literature. The predictions of the model are benchmarked against an independent set of all-atom simulations, showing that it captures with high fidelity the sequence dependence of conformational and elastic features beyond the single step considered in its formulation. A remarkably good agreement with experiments is found for both sequence-averaged and sequence-dependent conformational and elastic features, including the stretching and torsion moduli, the twist-stretch and twist-bend couplings, the persistence length, and the helical pitch. Overall, for the inspected quantities, the model has a precision comparable to atomistic simulations, hence providing a reliable coarse-grained description for the rationalization of single-molecule experiments and the study of cellular processes involving dsDNA. Owing to the simplicity of its formulation, MADna can be straightforwardly included in common simulation engines. Particularly, an implementation of the model in LAMMPS is made available on an online repository to ease its usage within the DNA research community.
Sequence-dependent
conformational and elastic properties of DNA
are of the utmost importance for its regulation in vivo, as they directly
affect DNA–protein interactions.[1] The detailed shape of a DNA fragment and its deformability are indeed
key determinants of protein recognition and binding.[1,2] For instance, the unique conformational properties of A-tracts are
known to affect nucleosomal organization,[3] as well as DNA replication and recombination.[4] Analogously, TATA-box elements present a strong deviation
from the canonical B-DNA conformation, which is exploited to enhance
binding by the TATA-box binding protein.[1] Moreover, proteins continuously exert mechanical stress on bound
DNA. For example, torsion is applied on DNA by topoisomerases and
polymerases[5,6] and plays a central role in chromatin remodeling.[7] Also, DNA stretching is relevant in vivo, e.g.,
in the action of recombinases[8] or for site
recognition in nucleosomes.[9]Together
with the practical implications of DNA elasticity in nanotechnological
applications,[10,11] this fundamental interest has
prompted a conspicuous amount of experimental efforts devoted to the
detailed characterization of DNA conformational ensemble and elasticity.
For the study of conformers, classical crystallographic and NMR studies
are nowadays being complemented by X-ray interferometry[12,13] and by single-molecule imaging with atomic force microscopy (AFM)[14,15] or cryoelectron microscopy.[16] Single-molecule
techniques like AFM, optical tweezers, and magnetic tweezers are employed
to assess the elastic properties of long DNA molecules, providing
stiffness values for various mechanical perturbation modes such as
stretching,[17−21] twisting,[20,22−27] bending,[17−19,21,28−30] and the coupling between them.[20,31−35] At short length scales, the interpretation of experiments becomes
cumbersome due to both theoretical and experimental challenges. Indeed,
direct AFM imaging has provided conflicting results on the bendability
of short DNA fragments.[36,37] For other experimental
approaches such as cyclization assays,[38−41] the elastic parameters can be
obtained only by interpreting the data within specific theoretical
frameworks, and conflicting conclusions have been reported based on
different physical assumptions.[12,42−44]Molecular simulations provide a valuable means to complement
experimental
studies, particularly in view of the discrepancies mentioned above.
All-atom molecular dynamics has been successfully employed to capture
DNA elastic and conformational features[45−49] as well as their dependence on sequence.[35,50−53]An evident limit of atomistic simulations originates from
the associated
high computational cost, which puts severe boundaries on the length
and time scales that can be simulated. The need to overcome this barrier
has fostered the development of coarse-grained approaches, where one
selects only the degrees of freedom relevant to the problem at hand.
An elegant solution is implemented in the software packages cgDNA[54] and MC-eNN,[55] where
the nucleotides are represented as rigid frames and the energy of
the system is described by means of a stiffness matrix. This approach
is extremely efficient from a computational perspective and enables
assessing the conformational ensemble of long DNA chains.[56] However, an important drawback lies in the absence
of an explicit description of the backbone and the related difficulty
in interfacing this representation with external perturbations present
in most systems of interest, e.g., mechanical stress, confinement,
or a binding protein. A good compromise between computational efficiency
and modeling flexibility is achieved by considering coarse-grained
models in which effective particles represent the various moieties
along the DNA molecule.[57−65] These models have been employed to describe a wide palette of systems
involving DNA, including, e.g., DNA origami,[66] nucleosomes,[67] and cyclization assays.[44]Despite this success, the coarse-grained
models of the latter kind
present in the literature do not satisfactorily capture all the main
elastic features of double-stranded DNA at once (see, e.g., refs (62 and 68)). In this work, we fill this
gap by introducing the mechanically-accurate DNA (MADna) model, a
sequence-dependent coarse-grained model whose parameters are tuned
to reproduce local conformational features of double-stranded DNA
and the stiffness of the various mechanical perturbation modes obtained
via all-atom molecular dynamics.[52] We show
that the model satisfactorily reproduces the atomistic values also
for DNA fragments different than the ones employed for fitting, as
well as experimental data from the literature. Particularly, and at
variance with existing models, our coarse-grained description captures
the negative twist–stretch coupling recently unveiled by single-molecule
force spectroscopy[31,32] as well as the experimentally
determined sequence dependence of dsDNA helical pitch and persistence
length.[28]The full details of MADna
and the definitions of the various physical
and geometrical quantities are provided in Methods. Nevertheless, they are shortly introduced also in Results and Discussion, making this section somewhat self-standing.
In this way, the main findings of this work are presented succinctly,
leaving the interested reader to look up the technical details within
the Methods section.Finally, we remark
that an online repository is available (https://github.com/saassenza/MADna/tree/main/LAMMPS) where MADna has been implemented in LAMMPS[69] without the need to introduce additional custom packages. The repository
also contains a molecular builder which automatically creates the
initial coordinates and topology for simulation in LAMMPS.
Methods
Coarse-Grained
Model
MADna describes a molecule of
double-stranded DNA (dsDNA) by considering three effective particles
per nucleotide, located in the geometric centers of the phosphate
group, the sugar, and the base, as proposed in the past in the 3SPN[57] and TIS[65] models.
In Figure , two cartoons
are reported to compare the atomistic and coarse-grained description
of a representative sequence. Further details on the coarse-graining
procedure are reported in section S1 in the Supporting Information. In spite of the similar level of coarse-graining,
MADna diverges from the other models in terms of the choice of bonded
interactions both in the way they are built and in their parametrization.
Particularly, our choice of bonded interactions enables capturing
the structural heterogeneity predicted by atomistic simulations, as
will be showed below when we comment on the results in Figure and Figure i. On the other hand, we chose to fix the
double-stranded structure by means of bonded interactions, which simplifies
the implementation of the model (for instance, MADna does not need
modulation factors to account for anisotropic interactions) but makes
it less general than the other ones as it cannot account for DNA thermodynamics.
In this regard, MADna is more akin to the MARTINI representation of
double-stranded DNA, where the double helix is maintained through
a suitably defined elastic network.[64] We
will add the possibility of strand separation in future versions of
MADna. As for the parametrization, MADna extracts information from
a large data set provided by atomistic simulations, while both TIS
and 3SPN are parametrized by means of experiments. Experimental results
are of course the ultimate source of information, but we decided to
tune the parameters of MADna from atomistic simulations based on both
the far richer amount of available information and their demonstrated
accuracy in capturing local conformational and elastic features of
dsDNA. As we show in the Results and Discussion section, this choice is rewarded with a very good performance of MADna in
capturing many distinct experimental observations on the structure
and mechanical response of dsDNA.
Figure 1
Atomistic (left) and coarse-grained (right)
description for a representative
dsDNA molecule with leading sequence 5′-CGCTACTTCGAGG-3′
in the B-DNA form as obtained by employing the NAB software.[70] In the coarse-grained cartoon, the color code
is the following: sugar ↔ black; phosphate group ↔ red;
adenine ↔ green; cytosine ↔ cyan; guanine ↔ yellow;
thymine ↔ orange. The size of each bead is proportional to
the WCA radius of the corresponding moiety.
Figure 2
Average
values obtained by coarse-graining the atomistic simulations
for the angles SBB (a) and the dihedrals SBBS (b). Vertical lines separate the distinct sequences,
which are listed in Section S3.1 in the Supporting Information. In the legends, the various labels correspond
to the particular bases involved in the local conformation under consideration.
For instance, SAT considers a SBB angle in which an adenine is bound to the sugar. Note also that
the dihedrals are symmetric with respect to the inversion of the involved
bases, as it just corresponds to changing the arbitrary reference
strand.
Figure 5
Scatter plot comparing
atomistic and coarse-grained results at f = 1 pN
for various structural features: crookedness β
(panel a); helical diameter (b); helical rise (c); helical twist (d);
width of major (e) and minor (f) groove; depth of major (g) and minor
(h) groove; SBBS dihedrals (i). Training and testing sequences are
denoted by red circles and green diamonds, respectively. Black lines
indicate the bisector of the first and third quadrant. For each panel,
the Pearson coefficient indicating the linear correlation between
the two data sets is reported. The atomistic results were obtained
by coarse-graining the trajectories obtained from all-atom simulations
and performing the analysis reported in the Methods.
Atomistic (left) and coarse-grained (right)
description for a representative
dsDNA molecule with leading sequence 5′-CGCTACTTCGAGG-3′
in the B-DNA form as obtained by employing the NAB software.[70] In the coarse-grained cartoon, the color code
is the following: sugar ↔ black; phosphate group ↔ red;
adenine ↔ green; cytosine ↔ cyan; guanine ↔ yellow;
thymine ↔ orange. The size of each bead is proportional to
the WCA radius of the corresponding moiety.Average
values obtained by coarse-graining the atomistic simulations
for the angles SBB (a) and the dihedrals SBBS (b). Vertical lines separate the distinct sequences,
which are listed in Section S3.1 in the Supporting Information. In the legends, the various labels correspond
to the particular bases involved in the local conformation under consideration.
For instance, SAT considers a SBB angle in which an adenine is bound to the sugar. Note also that
the dihedrals are symmetric with respect to the inversion of the involved
bases, as it just corresponds to changing the arbitrary reference
strand.
Bonded Interactions
The double-stranded
topology is
fixed once and for all and is maintained by introducing several two-,
three-, and four-body bonded interactions (from here on, we will refer
to them as bonds, angles, and dihedrals, respectively), which connect
beads within the same strand as well as providing interstrand links.Before going into the details of the bonded interactions considered,
we first proceed to clarify the nomenclature used in this section.
In order to label the bonded interactions, we indicate the sugar,
phosphate, and base beads as S, P, and B, respectively. In this way,
a bond between, e.g., a sugar and a base is indicated as SB. Whenever
the bonded interaction runs along the 5′-3′ direction
of a strand, we add a corresponding tag at the two ends of the label.
For instance, a bond between a sugar and a phosphate in the 5′-3′
direction is indicated as 5′-SP-3′. Analogously, we
denote as 5′-SPSB-3′ a dihedral involving a sugar, a
phosphate, the following sugar, and the base attached to it (ordered
according to the 5′-3′ direction of the strand). In
order to keep in place the double-stranded structure, some bonded
interactions involve both strands. Such bonded interactions are marked
by underlining their label. For instance, the bonds accounting for
Watson–Crick base pairing are indicated as BB-WC (in this specific case we also added “WC” in order
to clearly distinguish these bonds from 5′-BB-3′, which
account for stacking interactions). As another example, we denote
as 5′-PSBB-3′ a dihedral involving
in the 5′-3′ direction a phosphate, a sugar, the attached
base, and the base paired to it. We further note that, due to the
asymmetry of the 5′-3′ direction, the order in which
the same beads are considered is important. For instance, the bonds
5′-SP-3′ and 5′-PS-3′ are different from
each other. This can be clearly seen by comparing the corresponding
entries in Table S2 in the Supporting Information, where we show the equilibrium values computed by coarse-graining
the atomistic simulations used to parametrize MADna (see below). Analogously,
the dihedrals 5′-SPSP-3′ and 5′-PSPS-3′
are different from each other, as can be seen from Table S4 in the Supporting Information.For the bonds potential Ubond, a harmonic
function was considered:where r is the distance
separating
the two beads connected by the bond, kbond is the elastic constant specific to the type of bond considered,
and r0 is the corresponding equilibrium
distance. An analogous formula was used for the angle potential Uangle:where θ is
the angle characterizing
the three-body bonded interaction, kangle is the bending constant relative to the type of angle considered,
and θ0 is the equilibrium value of θ. Finally,
the following formula was chosen for the dihedral potential Udihedral:where ϕ is the dihedral angle, kdihedral is the elastic constant of the particular
type of dihedral considered, and ϕ0 = ϕmin – 180°, with ϕmin being the
equilibrium dihedral angle. This choice for the bonded interactions
relies on the observation of normal distributions for most of the
corresponding observables (see Section S2.2 in the Supporting Information).The general guideline in choosing
how to build the topology was
to obtain a minimal set of bonded interactions which reproduces the
structural features obtained in all-atom simulations. With this spirit,
we considered only bonded interactions involving up to a single step
and implemented the sequence dependence by setting different values
of the parameters according to the particular step under consideration.
For instance, if along the sequence we have a CT step, the bond 5′-BB-3′
accounting for the stacking interaction is implemented by means of eq , where kbond and r0 are equal to the
values assigned to a CT step. Analogously, the dihedral 5′-PSBB-3′ is implemented by means of eq , where kdihedral and ϕ0 are set to the values corresponding to CT.
The same is repeated for all the step-dependent bonded interactions
considered, so that 16 different values are considered for the corresponding
parameters depending on the involved step. Some local features are
symmetric with respect to the 5′-3′ direction. The corresponding
bonded interactions were then implemented as being dependent only
on the corresponding base pair. For instance, for the bond SB we considered
four possibilities (A, C, G, and T), while for the bond BB-WC there were just two choices (AT or CG). Yet, this
assumes that these local features do not depend on the neighboring
steps, which is not always the case. As an example, we report in Figure the averages obtained
by coarse-graining the atomistic simulations for the angles SBB (Figure a) and the dihedrals SBBS (Figure b). The angles SBB are obtained by considering a sugar, the base attached
to it, and the corresponding Watson–Crick partner. For instance,
we indicate as SAT the angle formed by a sugar,
the adenine attached to it, and the thymine paired with the adenine.
As for the dihedrals SBBS, we similarly consider
a Watson–Crick base pair and the corresponding sugars. In Figure , the points are
color-coded according to a classification based on a single base pair.
Hence, four possible choices are present for SBB, depending on the base being attached to the sugar (A, C, G, T),
while only two choices are available for SBBS, corresponding to the Watson–Crick base pairs AT and CG.
As shown in Figure a, the data for SBB nicely cluster around
their average values, indicating that this angle is in practice independent
of the rest of the sequence of the hosting DNA molecule and that it
can thus be considered for bonded interactions depending only on the
base pair. In contrast, the dihedrals SBBS show
a wide variability, particularly in the case of CG base pairs (Figure b). This demonstrates
that these dihedrals depend on their neighborhood and are thus not
apt for implementing bonded interactions depending only on the base
pair. As we show in the Results and Discussion, the heterogeneity of the SBBS dihedrals
is quantitatively captured by MADna as an emergent property resulting
from the interplay between the other bonded interactions. We also
note that this is a distinguishing feature of MADna when compared
to the TIS[65] and 3SPN2[71] models, where instead the base-pair-dependent SBBS interactions are introduced as part of the model,
thus preventing the possibility of capturing the heterogeneity suggested
by the atomistic simulations.The choice of the bonded interactions
was obtained in practice
by trial and error: starting from a far too simplistic description,
where the beads were barely kept together by a minimal set of bonds,
we incrementally added bonded interactions. At each iteration, we
ran simulations with the coarse-grained model and computed the averages
obtained for local conformational features that were not yet directly
implemented. We then compared these averages with the results obtained
by coarse-graining the atomistic trajectories. If the comparison was
not good, we then proceeded to add other bonded interactions. The
iterative procedure stopped when we reached a satisfactory comparison.
We also considered an opposite approach, where we started by considering
far too many bonded interactions and deleted some of them until further
eliminations resulted in disrupting the structure. The process has
some arbitrarity in the properties being monitored, the order in which
we add bonded interaction, etc. Yet, the high fidelity of the coarse-grained
model in reproducing the atomistic values (as shown in the Results and Discussion) validates the final choice
of bonded parameters that we considered. The final set of employed
bonds, angles, and dihedrals is reported in Figure , together with some representative examples
based on the dsDNA molecule sketched in Figure . Some of these bonded interactions are directly
related to physically meaningful features, such as hydrogen bonds
between Watson–Crick pairs (labeled as BB-WC in Figure ) and
stacking interactions (5′-BB-3′ in Figure ). Summing up, there are in
total 202 distinct bonded interactions in the model. Further details
are reported in Section S2 in the Supporting Information.
Figure 3
List of the various bonded interactions considered in the model,
together with representative examples based on the same molecule as
in Figure . Step-dependent
bonded interactions are indicated by the presence of the tags 5′
and 3′ in their label. For interstrand interactions, the corresponding
label is underlined. The letters present in the labels indicate sugars
(S), phosphate groups (P), or generic bases (B). For clarity, all
the selected examples involve beads belonging to the same strand,
whose 5′-3′ direction is indicated by the arrow in the
top-left panel.
List of the various bonded interactions considered in the model,
together with representative examples based on the same molecule as
in Figure . Step-dependent
bonded interactions are indicated by the presence of the tags 5′
and 3′ in their label. For interstrand interactions, the corresponding
label is underlined. The letters present in the labels indicate sugars
(S), phosphate groups (P), or generic bases (B). For clarity, all
the selected examples involve beads belonging to the same strand,
whose 5′-3′ direction is indicated by the arrow in the
top-left panel.Importantly, since all the bonded
interactions involve at most
two consecutive base pairs, any feature at scales larger than a single
step is an emergent property originating from the propagation of the
local interactions in combination with the electrostatic repulsion
between phosphates (see below).
Excluded Volume
Excluded-volume interactions were implemented
by means of a Weeks–Chandler–Andersen (WCA) potential,
i.e., by retaining the repulsive part of a Lennard-Jones interaction:In the previous formula, r is the distance between
the two particles, ϵ = 1
kcal/mol, while σ depends on the bead considered, as reported
in Table S1 in the Supporting Information.
Electrostatics
In order to account for the presence
of charges along the backbone, a charge q·e0 was assigned to the beads corresponding to
phosphate groups, with e0 = 1.6 ×
10–19 C being the elementary charge. Although each
phosphate carries a unit negative charge, in order to account for
counterions condensation an effective reduced value q = −0.6 was considered.[65,71,72] The salt-induced electrostatic screening was modeled via a Debye–Hückel
interaction:[73]In the previous formula, ϵ0 = 8.859
× 10–12 F/m is the absolute permittivity;
ϵr = 78.3 is the relative permittivity of water;
and lD is the Debye length, defined aswhere kB = 1.38
× 10–23 J/K is Boltzmann’s constant; T is the temperature of the system in Kelvin; NA = 6.022 × 1023 mol–1 is Avogadro’s constant; and I is the ionic
strength of the solution in mM.
Determination of Parameters
The parameters of the various
sequence-dependent bonded interactions were tuned in order to reproduce
the results of atomistic simulations from ref (52), which were performed
in AMBER14[74] with the parm99 force field[75] including the bsc0 modifications.[76] In those simulations, dsDNA molecules with different
sequences were pulled under the action of forces ranging from 1 to
20 pN. The main details of the atomistic simulations are reported
in Section S2.1 in the Supporting Information. Here, we used a subset of the atomistic simulations covering all
possible base steps (we refer to this subset as “training sequences”,
see Section S3.1 in the Supporting Information for the sequences) to determine the bonded parameters, while the
rest of sequences were employed as test cases to benchmark the coarse-grained
model (“testing sequences”, also reported in Section
S3.1 in the Supporting Information). It
has to be noted that, although parm99+bsc0 simulations reproduce well
most structural features determined in experiments,[77] more precise modifications have been recently introduced,
namely bsc1[78] and OL15.[79] Our reason to choose bsc0 was rooted in the convenience
of having the atomistic data already available in our group,[46,52] together with the reasonable precision of these modifications.[77] Future developments of MADna will include the
reparameterization of the model via atomistic simulations based on
bsc1 or OL15.A first estimation of the parameters was obtained
via Boltzmann inversion of the all-atom simulations performed at 1
pN. In this regard, the atomistic trajectories were first coarse-grained
according to the three-beads representation. Then, for each bond,
angle, and dihedral the ensemble averages were computed from the coarse-grained
trajectories, which enabled fixing the values of r0, θ0, and ϕ0. Analogously,
the elastic constants were determined in order to reproduce the size
of fluctuations starting from eqs , 2, and 3. Further details on this procedure can be found in Section S2.2
in the Supporting Information.Coarse-grained
simulations performed using the obtained force field
showed that the set of parameters obtained by Boltzmann inversion
provides reasonable values for the elastic constants (see Figure S1
in the Supporting Information). Nevertheless,
some of the parameters were further tuned in order to improve the
quantitative agreement with the atomistic simulations, focusing on
reproducing the elastic constants of the training sequences (see Results and Discussion and Section S2.2 in the Supporting Information).
Molecular Dynamics
Simulations
All the simulations
were performed in LAMMPS (http://lammps.sandia.gov[69]). The temperature T = 300 K was maintained through a Langevin thermostat with damping
constant τdamp = 20 ps. The integration step was
set to 20 fs.
Benchmark Simulations
In order to benchmark the coarse-grained
model, we performed pulling simulations of the training and testing
sequences following the same protocol as in refs (46 and 52). A pulling force f was applied to the center of mass ξ2 of the sugars belonging to the second
base pair. Analogously, an opposite force −f was
applied to the center of mass ξ of the second-to-last base pair (lseq is the length of the sequence) . The direction
of the forces was taken along the line connecting ξ2 and ξ. Simulations were performed for f = 1, 5, 10, 15, and 20 pN. For each simulation, the system
was initialized by considering the average structure created by the
molecular builder (defined below), and the equations of motion were
integrated for 20 ns. In order to ensure full equilibration, only
the last 10 ns were considered for analysis. The convergence of a
representative simulation is reported in Figure S8 in the Supporting Information. For each sequence and
force, 100 independent simulations were performed. The ionic strength I was set at the same value as in the atomistic simulations
from refs (46 and 52) and was
computed as I = Nions/NAV, where Nions is the number of counterions and V is the volume of the simulation box.
Persistence-Length
Simulations
For the computation
of the sequence-averaged persistence length, we considered 20 random
sequences made of 100 base pairs. The sequences are listed in Section
S3.2 in the Supporting Information. For
each realization, the molecule was initialized by considering the
average structure obtained by the molecular builder (see below), and
the simulation was run for 500 ns. In order to ensure full equilibration,
the first 100 ns were discarded from the analysis. The convergence
of a representative trajectory is reported in Figure S9 in the Supporting Information. For each sequence, 10
independent simulations were performed. The ionic strength was set
at I = 150 mM.In a second set of simulations,
we studied the sequence dependence of persistence length and helical
pitch and compared our results with the experiments from ref (28). We considered 14 sequences
of 100 base pairs obtained as the central fragments of the corresponding
experimental ones.[28] The sequences are
listed in Section S3.2 in the Supporting Information. The same protocol as for the sequence-averaged study was applied.
The ionic strength was set at I = 1000 mM to ease
the comparison with the predictions of CGDNA.[56] The convergence of a representative trajectory is reported in Figure
S10 in the Supporting Information. Simulations
were performed with MADna and, for comparison, with oxDNA2[80] (using the LAMMPS implementation[81] and considering sequence-dependent stacking
interactions) and the sequence-dependent model 3SPN2C.[82]
Stretch–Torsion Simulations
In order to determine
the elastic constants, a simulation setup was considered where a 40bp
dsDNA molecule was subjected to the simultaneous action of a pulling
force f and a torque τ directed along the z axis and with constant magnitude.For each simulation,
the molecule was initialized via the molecular builder and was aligned
with the z axis. The two bottom base pairs and the
corresponding sugars and phosphates were tethered to their initial
position via harmonic constraints with constant 100 kcal/mol Å2. A constant pulling force f oriented along
the positive z direction was exerted on the center
of the sugars belonging to the second base pair counting from the
top. A harmonic constraint with constant 100 kcal/mol Å2 was also applied to the x and y coordinates of the center to keep the molecule aligned with the z axis and avoid spurious effects originating from the overall
orientational entropy.[83] Finally, a constant
torque τ was imposed on the two top base pairs and the corresponding
sugars and phosphates.Simulations were performed at τ
= 0 for values of the force f = 2, 4, 6, 8, 10, 15,
20, 25, 30, 35, and 40 pN and at f = 2 pN for values
of the torque τ = 0, 5, 10, 15,
20, 25, and 30 pN·nm. To account for sequence-induced heterogeneity,
five random sequences were generated, which we refer to as sequences
ST1, ..., ST5 (the sequences are listed in Section S3.3 in the Supporting Information). Moreover, to check the
dependence of the elastic constants on the length of the dsDNA molecule,
we also performed simulations on shorter sequences containing 20 base
pairs, which we refer to as sequences ST1-short, ..., ST5-short (the
sequences are listed in Section S3.3 in the Supporting Information). For each combination of sequence, force, and
torque, 10 independent simulations of length 250 ns were run. The
first 50 ns were not considered for analysis in order to let the system
equilibrate. The convergence of a representative trajectory is reported
in Figure S11 in the Supporting Information. Simulations were performed with MADna, oxDNA2, and 3SPN2C. In all
cases, the ionic strength was set at 150 mM. The four base pairs at
each end were discarded from the analysis.Finally, in order
to study the effect of sequence on the elasticity,
we considered another set of simulations with phased A-tracts, which
are listed in Section S3.3 in the Supporting Information. The same protocol as for the rest of the simulations was applied.
Twist–Bend Coupling Simulations
In order to
determine the softer response to torque due to the coupling between
twist and bending, we performed simulations similar to those of the
previous section. Here, 150bp long sequences were placed under the
action of a pulling force directed along the z axis
and acting on one end of the molecule. The other end was tethered
with the same protocol as in the previous section. Note that the pulled
end is here free to move in the x and y directions in order to let the molecule bend.Simulations
were performed for values of the force f in steps
of 0.05 pN within the range 0.05 pN ≤ f ≤
0.50 pN and in steps of 0.25 pN within the range 0.50 pN ≤ f ≤ 2.50 pN. Three different sequences were considered,
which we refer to as TB1, TB2, and TB3 (the sequences are listed in
Section S3.4 in the Supporting Information). For each combination of sequence and force, three independent
simulations of length 2000 ns were run. The first 1000 ns were not
considered for analysis in order to let the system equilibrate. The
convergence of a representative trajectory is reported in Figure S12
in the Supporting Information. Simulations
were performed with MADna, 3SPN2C, and oxDNA2.
Definitions
and Protocols
Basic Definitions
We will indicate
as P, S, and B the position
vectors
of the beads corresponding to the ith phosphate group,
sugar, and base, respectively, and located on strand s = 1, 2. The index i is counted in the 5′-3′
direction of the arbitrarily chosen strand 1. Note that i = 1, ..., n for sugar and base beads, while i = 2, ..., n for the phosphate groups,
where n is the total number of base pairs (compare Figure ). Hence, there are
in total N = 2[(n – 1) + n + n] = 6n – 2
beads, where the factor of 2 is included to account for the presence
of the two strands. Handles were always discarded from the analysis,
so the various indexes introduced here refer to the subfragment under
consideration.
Crookedness
The crookedness β
accounts for the
deviation of a dsDNA molecule from a straight line[52] (Figure b). For short fragments (such as the training and testing sequences
considered here), thermally induced bending is negligible; hence,
in this case β effectively quantifies the spontaneous bending
of the molecule. The definition of β is here modified with respect
to its original formulation to adapt it to the coarse-grained context.
The contour length L of the line connecting the base-pair
centers along the molecule is L = ∑ |Γ – Γ|, where Γ ≡ (B1, + B2,)/2. The crookedness
β is defined asIf the line connecting the centers is perfectly
straight, then β = 0. The more curved the line, the larger the
corresponding β. As a reference, we computed the crookedness
for the conformations obtained via NAB[70] for the training and testing sequences, obtaining an average value
β ≃ 0.17.
Figure 4
Sketches showing pictorially the definitions to characterize
the
geometry of DNA. (a) The helical axis is obtained as the axis of the
best-fitting cylinder. (b) The crookedness is obtained by computing
the arccosine of the ratio between the end-to-end distance (black
arrow) and the contour-length of DNA (red line). These quantities
are obtained along the line formed by the points Γ, which are determined as the centers between bases
belonging to the same pair (inset). (c) Grooves are defined by considering
the lines interpolating the phosphate beads (brown and black lines).
For any couple of phosphates on the first strand (P1, and P1, in this case), we define the midpoint (M). From the midpoint, we find the closest
points on the second strand. The groove widths are obtained as the
corresponding minimum distances (orange segments), suitably shifted
to account for the excluded volume of the backbone. (d) The h-twist
is defined by considering the vectors ζ ≡ S2, – S1, joining
the two sugars within each base. The vectors ζ are projected onto the plane perpendicular
to the helical axis, thus obtaining ζ. The h-twist is then defined as the angle depicted
in red, corresponding to cos h-twist = ζ·ζ·e) The h-rise is defined by considering the geometrical
centers of the sugars ξ ≡
(S1, + S2,)/2 and projecting the vector separating
two consecutive centers onto the helical axis, thus obtaining h-rise
= (ξ – ξ)·ĥ, corresponding to the
red segment in the figure.
Sketches showing pictorially the definitions to characterize
the
geometry of DNA. (a) The helical axis is obtained as the axis of the
best-fitting cylinder. (b) The crookedness is obtained by computing
the arccosine of the ratio between the end-to-end distance (black
arrow) and the contour-length of DNA (red line). These quantities
are obtained along the line formed by the points Γ, which are determined as the centers between bases
belonging to the same pair (inset). (c) Grooves are defined by considering
the lines interpolating the phosphate beads (brown and black lines).
For any couple of phosphates on the first strand (P1, and P1, in this case), we define the midpoint (M). From the midpoint, we find the closest
points on the second strand. The groove widths are obtained as the
corresponding minimum distances (orange segments), suitably shifted
to account for the excluded volume of the backbone. (d) The h-twist
is defined by considering the vectors ζ ≡ S2, – S1, joining
the two sugars within each base. The vectors ζ are projected onto the plane perpendicular
to the helical axis, thus obtaining ζ. The h-twist is then defined as the angle depicted
in red, corresponding to cos h-twist = ζ·ζ·e) The h-rise is defined by considering the geometrical
centers of the sugars ξ ≡
(S1, + S2,)/2 and projecting the vector separating
two consecutive centers onto the helical axis, thus obtaining h-rise
= (ξ – ξ)·ĥ, corresponding to the
red segment in the figure.
Helical Parameters
Due to the limited information on
local coordinates, the definition of the helical parameters has to
be adapted to the coarse-grained framework. Particularly, the representation
of the bases as single beads prevents defining the local reference
frames, which are crucial for the standard definitions of helical
parameters.[84,85] Taking advantage of the short
length of the training and testing sequences, the helical axis ĥ in this case was defined globally as the axis of
the cylinder best fitting the position of phosphates in the double
helix (Figure a; see
Section S4.1 in the Supporting Information for the technical details), which was also employed to estimate
the diameter of the DNA. In order to define the h-twist, we first
introduce the sugar separation vector ζ ≡ S2,–S1, (Figure d). The h-twist for
the step i, i + 1 was defined by
cos(h-twist) ≡ ζ̂·ζ̂, where ζ̂ ≡ ζ/|ζ| and ζ ≡ ζ – (ζ·ĥ) ĥ is the projection of ζ onto the plane perpendicular to the helical axis. Following
the usual convention,[84] the sign of the
h-twist was determined according to the sign of the product (ζ × ζ)·ĥ. The employment
of the positions of the sugars in the definition of the h-twist was
motivated by the high correlation found with the standard definition
in 3DNA[84] (see Figure S2 in the Supporting Information). Finally, the h-rise
for the step i, i + 1 was computed
as h-rise = (ξ–ξ)·ĥ, where ξ ≡ (S1, + S2,)/2 is the center of the sugars of base pair i (Figure e).
Grooves
For a visual description of the definition
of groove geometry, see Figure c. The quantification of the geometry of grooves was assessed
in a similar way as in Curves+.[85] The positions
of the phosphate groups in each strand were interpolated via centripetal
Catmull-Rom splines.[86] Particularly, the
helical fragment connecting P and P was interpolated by considering P, P, P, and P as control points.
For each of the terminal phosphates, the missing external control
point was built by extrapolation of the h-rise and h-twist of the
last step. In order to determine the groove geometry corresponding
to base pair i, we first considered the midpoint M1, between P1, and P1, along the interpolated curve. Then, starting from the analogous
midpoint M2, on the second
strand, we followed the corresponding interpolating curve in both
directions while computing the distance from M1, until a local minimum in the distance was first
encountered at two points denoted as F+, and F–,. From the two identified minimal distances, we subtracted
0.58 nm to account for the van der Waals radius of the backbone[85] and assigned the resulting values to the width
of the major and minor groove. In order to compute the depth of each
groove, we considered the midpoints on the width vectors, i.e., 0.5(M1, + F+,) and 0.5(M1, + F–,) and
computed their distance from the base-pair center Γ. Again, in order to account for the
van der Waals radii, we estimated the final groove depths by subtracting
0.35 nm from the calculated distance.[85]
Determination of Elastic Constants
In order to interpret
single-molecule experiments, dsDNA is often modeled as an elastic
rod. When a pulling force f and a torque τ
are exerted along the direction of the rod, the elastic energy U reads[22,31]where S is
the stretching
modulus; C is the twist modulus; L is the extension; θ is the twist in radians; ΔL ≡ L – L0; and Δθ ≡ θ
– θ0, with L0 and
θ0 being the equilibrium extension and twist at zero
force and torque. On the right-hand side of eq , the first and second terms are stretching
and twisting energy, respectively; the third term is the stretch–twist
coupling energy; the fourth and fifth terms are the work performed
by the external force and torque, respectively.Simultaneous
minimization with respect to extension and force (∂U/∂ΔL = 0 and ∂U/∂Δθ = 0) leads to
the following equations for the elastic equilibrium:where S̃ ≡ S – g2/C is the effective stretching modulus. From the previous formulas,
the sign of g can be immediately ascertained by looking
at the behavior of ΔL as a function of τ
at constant force, with positive and negative correlation implying g < 0 and g > 0, respectively (an
analogous
response is obtained at constant τ for Δθ as a function of f). For a quantitative assessment
of the elastic constants, three equations are needed. Different strategies
were pursued for the two simulation protocols considered in this work.
Benchmark
Simulations
In this case, the torsion angle
is estimated as θ = ∑h-twist. The extension
is computed as L = |Γ – Γ1|. Moreover,
no torque is applied, τ = 0. From eq , one has ΔL = A1f and Δθ = A2f, with A1 ≡ L0/S̃ and A2 ≡ −gL0/CS̃. The third equation is provided
by applying the equipartition theorem to eq and neglecting the twist–stretch coupling
term.[31] This is not a problem for the present
purposes since it affects equally the analysis of both atomistic and
coarse-grained results, thus not jeopardizing the quality of their
comparison, although this detail should be kept in mind for the physical
interpretation of the results. Application of the equipartition theorem
gives var(Δθ) = ⟨(Δθ)2⟩ – ⟨Δθ⟩2 = kBTL0/C. Inverting these
formulas, one findsFitting the linear dependence of L and θ on the pulling force f, we computed
the constants L0, θ0, A1, and A2. var(Δθ)
was instead calculated directly from the simulations performed at f = 1 pN. Plugging the values of these quantities into eq , we finally obtained
the values of S̃, C, and g reported in Figure .
Figure 6
Scatter plot comparing
atomistic and coarse-grained results for
the effective stretching modulus S̃ (panel
a), the crookedness rigidity kβ (b),
the torsion modulus C (c), and the twist–stretch
coupling constant g (d). Training and testing sequences
are denoted by red circles and green diamonds, respectively. Black
lines indicate the bisector of the first quadrant. For each panel,
the Pearson coefficient indicating the linear correlation between
the data sets corresponding to the testing sequences is reported.
As for the crookedness constant kβ, following ref (52) we define it via the relation cos β =
cos β0(1 + f/kβ), where β0 is the crookedness at zero
force. kβ is then suitably extracted
from the slope of cos β versus f.
Stretch–Torsion
Simulations
Taking advantage
of the directionality imposed by the external force, the torsion angle
in this case is computed as θ = ∑ψ, where the angle ψ is calculated as the h-twist but
considering the rejection of ζ from the direction of the force instead of the helical axis.
In the same way as done above, for τ = 0 one has ΔL = A1f, with A1 ≡ L0/S̃. At variance with the benchmark simulations, the
presence of an imposed torque enables deriving all the equations without
resorting to the analysis of the fluctuations. Indeed, for a constant
pulling force f = f0,
ΔL = constant + A3τ and Δθ = constant + A4τ, with A3 = −gL0/CS̃ and A4 = L0S/CS̃ = L0/C(1 + g2/CS̃). From the knowledge of the constants A1, A3, and A4 one can obtain the elastic constants asFitting the linear dependence of
ΔL on the pulling force f at
τ = 0,
we computed the constants L0 and A1. Analogously, fitting of ΔL and Δθ versus τ at f = 2 pN enabled
the computation of A3 and A4. Plugging the values of these quantities into eq , we finally obtained
the values of S̃, C, and g reported in Figure .
Figure 7
(a) Schematic description
of the simulation setup for the stretch–torsion
simulations, prescribing a constant force and torque applied along
a fixed direction. (b) Twist response to the external force in the
absence of imposed torque for MADna (orange circles), oxDNA2 (green
squares), 3SPN2C (blue diamonds), and rotor-bead experiments[31] (gray triangles). The black dashed line indicates
Δθ = 0; therefore, overwinding and unwinding
responses are characterized by points lying above and below the line,
respectively. The simulation data correspond to the average of the
five sequences considered. (c) Effective stretching modulus S̃, torsion modulus C, and twist–stretch
coupling constant g obtained by the three models
for the five sequences reported in Section S3.3 in the Supporting Information. Symbols are the same
as in panel b. The gray triangles correspond to experimental measures
obtained for unrelated sequences in refs (17−20) for S̃, refs (20, 27, 31, and 92) for C, and refs (20, 31, 32, and 93) for g. In the case of g, the dashed black line
denotes g = 0, thus separating two regimes characterized
by qualitatively different twist–stretch coupling.
Computation of Persistence Length
In order to compute
the persistence length, for each base pair i we considered
the geometrical center ξ = (S1, + S2,)/2 of the sugar beads. By introducing
the displacement vector R ≡ ξ – ξ, the ith tangent vector is defined as t̂ ≡ R/|R|. The contour length l separating
the points of application of two consecutive tangent vectors is computed
as the average modulus of the displacement vector. The overall correlation
function cp is defined aswhere s is the contour length,
while ⟨···⟩th, denotes the average over both the conformational ensemble
and the value chosen for i. Therefore, cp computes the decay of the orientation of the tangent
vectors by taking thermal fluctuations into account.Analogously,
the static correlation function cs is
defined aswhere min is the ith tangent vector computed on the minimum-energy
structure and ⟨···⟩ denotes the average over the value chosen for i. The static correlation function quantifies the decay of the orientation
of tangent vectors along the conformation corresponding to the ground
state. For each sequence, the minimum-energy conformation was estimated
by simulated annealing: starting from the average structure obtained
via the molecular builder, we ran 100 short simulations at 300 K,
after which the temperature was quenched at 0 K, while maintaining
the Debye length corresponding to 300 K (this ensured that the minimum-energy
configuration corresponded to the electrostatics at the temperature
of interest). The ground state was then estimated as the final conformation
of the quenched simulation attaining the smallest value of the total
energy.Finally, following ref (56), the dynamic correlation function is defined
asThis
is in fact equivalent to impose, for
each sequence, that the harmonic relation 1/lp = 1/ls + 1/ld proposed by Trifonov and co-workers holds exactly.[56,87]
Errors
Indeterminacies on the computed quantities were
estimated according to the procedures below. Note that error bars
are not present in the figures whenever they are smaller than the
size of symbols.
Observables
For all coarse-grained
simulations, several
independent realizations were performed for each case, whose number
depends on the particular set of simulations considered (see above).
For each computed observable, the indeterminacy on the average was
obtained as the standard error of the mean computed across the independent
realizations. As for var(Δθ), the standard
error of the variance was considered. In the case of the atomistic
trajectories, the time series of each observable was decorrelated
by performing a block analysis, from which the error was then estimated.[88]
Elastic Constants
In order to account
for the errors
on the elastic constants, the following procedure was employed. In
all cases, the constants were obtained starting from a linear relation y = ax + y0, where x is either the force or the torque, while y is a certain observable (e.g., the extension). Let be the average values
of the variables, and let
dy be the error associated
with y (x does not have an associated error).
Assuming the values of y to be independent and distributed
normally, we iteratively considered sets of variables y̅ extracted from Gaussians with
average y and standard
deviation dy, where indicates
the particular realization considered.
For each such realization, we obtained the fitting parameters aα and y0,α by linear regression. The mean slope was finally obtained as the
average , while the associated error was estimated
as the corresponding standard deviation. We checked numerically that
a satisfying convergence was obtained for .
Molecular Builder
The results obtained
for the training
sequences were employed to devise a molecular builder, which provides
sequence-dependent average structures to be used as initial configurations.
For each step XY, with both X and Y chosen among the four bases A,
C, G, and T, we considered the simulations run for the training sequence
which contains it. For instance, the sequence PolyAA was considered
for the steps AA and TT. The corresponding trajectories were aligned
in order to minimize the root-mean-square deviation (rmsd) of the
central step XY, computed according to the Kabsch algorithm.[89] After this operation, the average coordinates
of the 10 beads belonging to the step (four sugars, four bases, and
two phosphates) were computed. These coordinates can be employed to
build the average structure of a molecule, where the junctions between
adjacent steps are aligned in order to minimize the rmsd of the overlapping
base pair. As an example, let us consider the molecule with sequence
5′-ACT-3′. Two steps are present: AC and CT. The molecule
is initiated by considering the average coordinates of the step AC.
At this point, we note that the four beads (two sugars and two bases)
belonging to the second base pair of the step AC are the same as the
four beads belonging to the first base pair of the next step, CT.
The average coordinates of the latter are then translated and rotated
in order to minimize the rmsd between the two sets of coordinates
for the overlapping beads. The final coordinates for the four overlapping
beads are computed as the mean of the two aligned sets. This procedure
can be iterated for molecules of arbitrary length. As a last step,
the final structure is translated so that the origin corresponds to
the first sugar bead and is subsequently aligned to the z axis. The molecular builder is provided at https://github.com/saassenza/MADna/tree/main/LAMMPS and, using as input only the sequence, creates both the initial
coordinates and the double-stranded topology in LAMMPS format.
Results
and Discussion
Summary of the Coarse-Grained Model
In line with previous
work,[57,65] the coarse grain considered in MADna describes
each nucleotide by means of three beads, each centered on the sugar,
phosphate group, and base, respectively (Figure ). The beads interact with each other via
steric interactions and, in the case of phosphate groups, electrostatic
repulsion. The double-stranded molecular structure is maintained by
introducing bonded interactions (eqs , 2, and 3 and Figure ), with
parameters depending on the sequence up to the level of a single step.
The parameters were determined by Boltzmann inversion of atomistic
trajectories from the literature,[52] whose
sequences encompass all the possible steps (hereby referred to as
training sequences, see section S3.1 in the Supporting Information for details). Further tuning was performed in order
to reproduce the elastic response to an external pulling force computed
for the same set of sequences. Full details of the model and the coarse-graining
procedure are found in the Methods and in
Sections S1 and S2 in the Supporting Information.
MADna Reproduces Conformational
Features from Atomistic Simulations
In order to benchmark
the optimized simulation setup, we proceeded with a systematic comparison
between coarse-grained predictions and results from atomistic simulations.
We performed coarse-grained simulations of dsDNA molecules under the
action of a pulling force f ranging in the interval
1–20 pN, following the same protocol as in refs (46 and 52) (see Methods for details on the implementation).
Apart from the training sequences, we considered a second, independent
set of molecules (testing sequences), which has also been studied
in refs (46 and 52). The testing
sequences are reported in Section S3.1 in the Supporting Information and include biologically relevant structures
as well as synthetic A-tracts fragments. Both conformational and elastic
properties were considered for the analysis.The study of various
conformational quantities is reported in Figure . For each observable, a scatter plot is
depicted to compare coarse-grained and atomistic results at the smallest
force considered, f = 1 pN. Training and testing
sequences are indicated by red circles and green diamonds, respectively,
while the black line indicates the bisector of the first and third
quadrant. Overall, an excellent agreement is found, as indicated by
the localization of the plotted points in the vicinity of the bisector
and by the large values attained by the Pearson coefficient for all
the considered quantities. In Figure a, we plot the crookedness β,[52] a dimensionless parameter accounting for the global deviation
of the helical axis with respect to a straight line, i.e., the presence
of a spontaneous curvature. Larger values of β correspond to
more curved structures, and it was found that A-tracts show the straightest
conformations.[52] In Figure b, we analyze the helical diameter. Also
in this case, the coarse-grained model faithfully captures the dependence
on sequence within a range spanning about 0.2 nm, although a slight
overestimation of the atomistic values can be appreciated. In Figure c–h we plot
the comparison of various helical features, namely h-rise (Figure c), h-twist (Figure d), and groove geometry
(Figure e–h).
The coarse -grained predictions closely follow the sequence dependence
of the atomistic values, although a systematic overestimation of about
0.2 nm is present in the case of the depth of the major groove (Figure g). Finally, in Figure i we compare the
values for the local dihedral SBBS, i.e., the dihedral formed by the
sugars and bases within each base pair (inset in Figure i). Again, MADna reproduces
with high precision the large variability induced by sequence heterogeneity,
showing its ability to capture features emerging from the interaction
between neighboring base pairs.Scatter plot comparing
atomistic and coarse-grained results at f = 1 pN
for various structural features: crookedness β
(panel a); helical diameter (b); helical rise (c); helical twist (d);
width of major (e) and minor (f) groove; depth of major (g) and minor
(h) groove; SBBS dihedrals (i). Training and testing sequences are
denoted by red circles and green diamonds, respectively. Black lines
indicate the bisector of the first and third quadrant. For each panel,
the Pearson coefficient indicating the linear correlation between
the two data sets is reported. The atomistic results were obtained
by coarse-graining the trajectories obtained from all-atom simulations
and performing the analysis reported in the Methods.The close agreement between coarse
grain and atomistic simulations
indicates the high accuracy of MADna in describing the conformational
properties of dsDNA. Several striking features are worth mentioning.
First, as mentioned above, the testing sequences were not employed
to build the model, so that in this case the coarse-grained results
are pure predictions. Second, none of the quantities reported in Figure were directly used
to build the model; thus, a quantitative agreement is not trivial
also in the case of the training sequences. Third, these observables
are related to different scales, encompassing a single base pair (SBBS),
a base step (h-twist and h-rise), multiple-step geometry (grooves),
and the molecule as a whole (β and diameter). Fourth, the selected
quantities have a stark dependence on sequence, as shown by their
wide range of variability, indicating that the model captures also
this intrinsic heterogeneity. Particularly striking is the case of
the dihedral SBBS: despite the only two possible choices for the bases
(the Watson–Crick pairs AT and CG), this quantity displays
a strong variability, ranging from −30 to 10 deg (Figure i). This is evidently
an emergent behavior induced by the interaction of these base pairs
with their neighbors and is perfectly reproduced by the coarse-grained
model.
MADna Reproduces Elastic Constants Obtained from Atomistic Simulations
Next, we focused on the elastic properties of training and testing
sequences (Figure ). The effective stretching modulus S̃ (Figure a) and the crookedness elastic constant kβ (Figure b) are related to the elastic response of extension and crookedness
to the external force, respectively, and are obtained as the slopes
of the corresponding observables as a function of f (see Methods and Figure S3 in the Supporting Information). The torsional modulus C (Figure c) accounts for the change of the h-twist upon application of an
external torque. For the present setup, C was computed
by analyzing the fluctuations of the h-twist for f = 1 pN. Finally, the twist–stretch coupling constant g (Figure d) quantifies the torsional response to the external force. The negative
sign displayed by g (Figure d) implies that the molecule overwinds when
stretched, in agreement with experimental observations.[31,32] Precise definitions of the elastic constants and their computation
are reported in the Methods.Scatter plot comparing
atomistic and coarse-grained results for
the effective stretching modulus S̃ (panel
a), the crookedness rigidity kβ (b),
the torsion modulus C (c), and the twist–stretch
coupling constant g (d). Training and testing sequences
are denoted by red circles and green diamonds, respectively. Black
lines indicate the bisector of the first quadrant. For each panel,
the Pearson coefficient indicating the linear correlation between
the data sets corresponding to the testing sequences is reported.As discussed in the Methods and in Section
S2 in the Supporting Information, the values
of S̃, kβ, C, and g for the training sequences
were used to refine the parameters of the bonded interactions obtained
by the Boltzmann inversion of the atomistic trajectories. Due to the
large number of bonded parameters, we opted for a pragmatic approach
and adjusted a minimal subset of parameters. Particularly, we empirically
observed that a major impact on the elastic constants was obtained
by tuning the rigidities of bond 5′-BB-3′, angle 5′-PSB-3′,
and dihedral 5′-SPSP-3′. Interestingly, these three
bonded interactions have a clear physical meaning (compare Figure ). Indeed, the bond
5′-BB-3′ accounts for stacking interactions, which are
likely to affect the stretching stiffness[52] as well as the coupling between twist and stretch deformations.
Moreover, the range of values encompassed by the angle 5′-PSB-3′
is related to the flexibility of the sugar pucker, which has been
shown to play a key role in dsDNA elasticity.[46] Finally, the 5′-SPSP-3′ dihedral is likely to affect
the backbone response and was found to be the quantity most affecting
the crookedness rigidity. In this context, it is also worth mentioning
that the BB-WC elastic constants for the pairs
AT and CG approximately follow the ratio 2:3 (see Table S2 in the Supporting Information). The BB-WC bonds account for the interactions between Watson–Crick base
pairs; hence, this ratio nicely reflects the presence of two and three
hydrogen bonds for AT and CG, respectively.In Figure , we
show the scatter plots comparing S̃, kβ, C, and g for both the training and the testing sequences, while the corresponding
statistical indicators are reported in Table . MADna closely captures the average and
sequence-induced standard deviation of all the elastic quantities
(Table ). While this
is expected for the training sequences, which were used to refine
the model, the excellent agreement found for the testing sequences
(average within 6% and standard deviation within 30% of the atomistic
values) indicates the good performance of our model. The capacity
of MADna to describe the detailed sequence dependence can be assessed
by looking at Figure . MADna precisely accounts for S̃ and kβ (Figure a and Figure b, respectively), as indicated by the large values obtained
for the Pearson coefficient (calculated considering only the testing
sequences). In contrast, despite reproducing the average and standard
deviation of C, MADna does not capture its detailed
sequence dependence, as indicated by the low value of the Pearson
coefficient. While this can be due to an intrinsic limit of our model,
we note that also the input data set might be at fault: in contrast
with S̃ and kβ, which are obtained by considering average values of the corresponding
observables, C is determined by computing the fluctuations
of the cumulated twist. Fluctuations are knowingly slower to converge
when compared to averages, suggesting that the lack of accuracy of
MADna might be due to sampling limitations of the atomistic simulations.
Since to our knowledge there are currently no experiments addressing
the sequence dependence of the twist modulus, and given the excellent
performance of MADna in capturing the average value of C (see Figure c below),
we decided to leave a thorough assessment of the matter to future
investigation. Finally, the twist–stretch coupling g displays a good linear correlation. The modest value of
the Pearson coefficient is an expected side effect of the large indeterminacy
of the atomistic results, for which the error bars have sizes comparable
to the dispersion of the average values. It has to be noted that some
outliers are present; in particular for sequence A8T the coarse-grained
model predicts a twist–stretch coupling markedly different
from the atomistic result g ≃ 0. Nevertheless,
as discussed in Section S5.1 in the Supporting Information, this disagreement is likely to originate from
a possible lack of convergence of the atomistic simulations at large
forces for this specific sequence. Based on this observation, we excluded
this point for the quantitative evaluation of the agreement between
the two data sets by means of the Pearson coefficient reported in Figure d and for the corresponding
statistical indicators in Table .
Table 1
Comparison between the Average Elastic
Constants Obtained by MADna and Atomistic Simulations (Compare Figure )a
training
sequences
testing sequences
S̃ (pN)
all atom
1214 (475)
1358 (481)
coarse-grained
1188 (477)
1307 (490)
kβ (pN)
all atom
1617 (1060)
1901 (1226)
coarse-grained
1609 (1014)
1790 (894)
C (pN·nm2)
all atom
445 (77)
392 (36)
coarse-grained
435 (77)
409 (26)
g (pN·nm)
all atom
–216 (63)
–229 (87)
coarse-grained
–219 (62)
–228 (60)
In parentheses
the standard deviations
are reported.
In parentheses
the standard deviations
are reported.(a) Schematic description
of the simulation setup for the stretch–torsion
simulations, prescribing a constant force and torque applied along
a fixed direction. (b) Twist response to the external force in the
absence of imposed torque for MADna (orange circles), oxDNA2 (green
squares), 3SPN2C (blue diamonds), and rotor-bead experiments[31] (gray triangles). The black dashed line indicates
Δθ = 0; therefore, overwinding and unwinding
responses are characterized by points lying above and below the line,
respectively. The simulation data correspond to the average of the
five sequences considered. (c) Effective stretching modulus S̃, torsion modulus C, and twist–stretch
coupling constant g obtained by the three models
for the five sequences reported in Section S3.3 in the Supporting Information. Symbols are the same
as in panel b. The gray triangles correspond to experimental measures
obtained for unrelated sequences in refs (17−20) for S̃, refs (20, 27, 31, and 92) for C, and refs (20, 31, 32, and 93) for g. In the case of g, the dashed black line
denotes g = 0, thus separating two regimes characterized
by qualitatively different twist–stretch coupling.
Comparison with Experiments: Sequence-Averaged Persistence Length
The persistence length lp quantifies
the bending rigidity of a polymer and is possibly the most characterized
mechanical property of dsDNA.[16−19,21,23,29,37,49] In its classical formulation from polymer
theory, lp corresponds to the length of
the fragment to be considered in order to observe significant thermally
induced bending effects.[90] More technically, lp is defined as the decay length of the thermally
averaged tangent-vector correlation function cp, that is, cp = exp(−s/lp), where s is the contour length of the polymer fragment under consideration.[90] This definition relies on the assumption that
the minimum-energy conformation for the polymer is that of a straight
rod. In the case of dsDNA, according to the sequence a spontaneous
curvature may be present. In order to account separately for intrinsic
bending and thermal fluctuations, it is customary to distinguish the
static (ls) and dynamic (ld) contributions to the persistence length, which characterize
the decay length of the suitably defined correlation functions cs = exp(−s/ls) (computed from the minimum-energy structure) and cd = exp(−s/ld).[56] The three lengths are
approximately related by a harmonic sum, 1/lp ≃ 1/ls+1/ld.[87]In order to compute
the persistence length and its contributions, we performed simulations
for 20 random sequences of length 100 base pairs. The minimum-energy
structures (see Methods) and the equilibrated
trajectories were then employed to compute the correlation functions cp, cs, and cd according to eqs , 13, and 14, respectively. The results are reported in Figure
S5 in the Supporting Information for various
definitions of the tangent vector. Here, we focus on the results of
one of such definitions, according to which the tangent vectors are
obtained by joining the geometrical centers of sugars of base pairs
separated by 10 steps (see Methods). Fitting
the correlation function cp with an exponential
decay resulted in computed values of the persistence length lp ranging between 46 and 64 nm, with an average
equal to lp = 56 ± 1 nm. This is
in good agreement with experimental values on random sequences and
standard ionic conditions (45–55 nm[16−19,21,23,29,37,49]), particularly since lp was not employed in the parametrization of
the model. Other coarse-grained models give predictions for lp within the experimental range,[60,62,65,68,71] although in most of these works the persistence
length of double-[60,68,71] or single-stranded[65] DNA was employed
as a target quantity in the construction of the force field. In the
present case, the slight overestimation of lp is in line with previous results of atomistic simulations,
where an average lp = 57 ± 3 was
found.[49]As expected,[56] for each sequence the
static correlation function c is found to decrease more slowly than cp (Figure S5 in the Supporting Information) since it lacks the disordering action of thermal fluctuations.
The corresponding static persistence length ls was computed by fitting cs via
an exponential decay, in analogy to cp. The obtained values vary widely, ranging from 199 to 796 nm, with
an average equal to ls = 485 ± 42
nm. This is somewhat smaller than previous estimations (ls = 576 ± 191 nm[49]), although
we note that this value varies wildly by considering different sequences
and, even for the same sequences, by employing different definitions
(see below). As discussed in ref (49), its large heterogeneity may rationalize the
markedly different results reported in experiments, where values of ls as different as 130 nm and >1000 nm have
been
estimated.[16,91]Fitting cd via an exponential decay
yields values for the dynamic persistence length ld ranging between 56 and 77 nm, with an average equal
to ld = 64 ± 1 nm. This prediction
is in line with a previous report based on atomistic simulations[49] (ld = 64.7 ±
1.4 nm) and lies between the results obtained from cryoelectron microscopy[16] (ld = 82 ±
15 nm) and cyclization experiments[91] (ld = 50 ± 1 nm). The difference between
the two experimental values might be ascribed to the different techniques
employed, for which future simulations with the present model may
shed some light.As a final consideration, it has to be noted
that the estimation
of the persistence length may vary according to the definition of
the tangent vectors.[56] Alternative choices
are considered in Section S6 in the Supporting Information, yielding average values in the ranges lp = 56–63 nm, ls = 485–1641 nm, and ld =
64–69 nm. This indicates the robustness of the values of lp and ld with respect
to the definition of the tangent vectors. The average value of ls is sensitive to the definition, although the
observed variation can be partly ascribed to the intrinsic large variability
characterizing this quantity.
Comparison with Experiments:
Sequence-Averaged Elastic Constants
A further set of simulations
was devoted to determine the sequence-averaged
elastic constants. These quantities were already estimated for the
training and testing sequences (Figure ), but their quantitative determination is likely to
depend on the details of the microscopic definitions, particularly
for the h-twist. In order to enable a quantitative comparison with
experimental values, we designed a simulation framework which avoids
relying on such definitions and which is more akin to single-molecule
experimental setups.As shown schematically in Figure a, in this set of simulations
a dsDNA molecule is subjected to a constant force f and torque τ acting along the z axis. This
enables defining unequivocally the twist angle θ starting from
the projection of the base pairs onto the xy plane,
in analogy with single-molecule experiments based on rotor beads.[31] The linear response of the extension L as a function of f provides access to
the effective stretch modulus S̃. Analogously,
the torsion modulus C can be computed from the dependence
of θ on τ. Finally, the twist–stretch coupling g can be obtained by looking at the cross-dependence, i.e.,
the dependence of L on τ or the response of
θ to changes in f. The quantitative details
can be found in the Methods.We performed
simulations for five randomly generated sequences
of length equal to 40 base pairs at several values of f and τ (see Methods for full details
and Section S3.3 in the Supporting Information for the sequences, which are named ST1, ..., ST5). With this choice
of sequence length, the molecules are long enough so that several
turns of the double helix are present but short enough to neglect
the effect of bending. For comparison, we also run simulations for
the same set of sequences and parameters by employing the two most
widely used coarse-grained models from the literature, namely oxDNA2[80,81] and the sequence-dependent model 3SPN2C.[82] From a qualitative perspective, a marked difference between MADna
and the other two models becomes evident when analyzing the change
in twist Δθ as a function of the force
(Figure b). Indeed,
the present model prescribes that dsDNA overwinds when stretched (Δθ
> 0), which is in agreement with experimental observations.[31,32] In contrast, neither oxDNA2 nor 3SPN2C capture this feature, predicting
unwinding upon pulling (in the case of oxDNA this fact was already
observed in the original publication[68]).
To our knowledge, no other coarse-grained model available in the literature
has been shown to predict this unintuitive behavior of dsDNA.MADna shows better agreement with experiments also from a quantitative
standpoint. As shown in Figure c and Table , it predicts values for S̃, C, and g in
good quantitative agreement with the experiments. As for the other
models, oxDNA2 shows a similar performance for C,
while it tends to overestimate S̃. Coherently
with the results shown in Figure b, the wrong sign for g is found for
both oxDNA2 and 3SPN2C, with the latter showing in general a significant
overestimation of the elastic constants. The results reported in Figure c and Table correspond to forces larger
than 10 pN, which was chosen as a reasonable threshold to avoid bending
effects. Nevertheless, performing the analysis with the full set of
forces resulted in a small change in S̃ (roughly
10%) and in virtually no change in C and g. It is also worth mentioning that the range considered
for the torque (0–30 pN·nm) corresponds to supercoiling
densities below the threshold value σ ≃ 0.05 usually
associated with torque-induced denaturation,[22] hence making the results obtained for MADna relevant for real dsDNA
molecules within the whole range of applied torques. Particularly,
we computed the supercoiling density as σ = Δθ/θ0, since the lack of bending results in the absence
of relevant writhe. For the largest torque applied (τ = 30 pN·nm),
we found σ ≃ 0.046 for MADna, σ ≃ 0.043
for oxDNA2, and σ ≃ 0.014 for 3SPN2C, which is lower
than for the other two models due to the larger value of the twist
modulus C.
Table 2
Elastic Constants
Obtained by the
Various Models and Their Comparison with Experiments (see Figure c)a
S̃ (pN)
C (pN·nm2)
g (pN·nm)
experiment
1045 ± 92
436 ± 17
–90 ± 10
MADna
1038 ± 21
386 ± 3
–125 ± 6
oxDNA2
1448 ± 5
399 ± 1
+120 ± 1
3SPN2C
2589 ± 71
1180 ± 10
+514 ± 36
Experimental
values are obtained
as averages of the results reported in refs (17−20) for S̃, refs (20, 27, 31, and 92) for C, and refs (20, 31, 32, and 93) for g, while the corresponding indeterminacies are computed
as the standard error of the mean.
Experimental
values are obtained
as averages of the results reported in refs (17−20) for S̃, refs (20, 27, 31, and 92) for C, and refs (20, 31, 32, and 93) for g, while the corresponding indeterminacies are computed
as the standard error of the mean.The values of g obtained for MADna
in Table and Table differ by a factor
of 2 with
respect to each other. This striking disagreement appears to be too
large to be simply ascribed to the different sequences considered
or the different definitions employed for the twist angle. In this
regard, based on atomistic simulations it has been suggested that
the stretching modulus depends on the size of the fragment under consideration.[94] Since in the present case we are comparing sequences
of size 20 and 40 base pairs, it might be that a similar effect is
present for g. In order to check this hypothesis,
we performed another set of simulations following the same protocol
as in the present section but considering sequences made of 20 base
pairs, which are listed as ST1-short, ..., ST5-short in Section S3.3
in the Supporting Information. The resulting
elastic constants are reported in Figure S6 in the Supporting Information. Particularly, we found that for the
short sequences g = −258 ± 16 pN·nm,
which is indeed in line with the results reported in Figure . This confirms the presence
of size effects in MADna for the prediction of elastic constants.
To our knowledge, this is the first instance in which a length dependence
for g has been predicted. This feature might explain
the systematic overestimation of g in atomistic simulations
that are restricted to short sequences.[46]Another interesting effect observed in experiments is the
coupling
between twist and bending, which is relevant at pulling forces up
to a few piconewtons.[33,34] As a consequence of this coupling,
the twist of the DNA molecule appears to be softer; i.e., the effective
twist modulus is Ceff < C. In order to study this feature, we considered a set of simulations
involving DNA molecules of 150 base pairs, which is comparable to
the persistence length, thus enabling the presence of bending fluctuations.
The simulation protocol was similar to the case just analyzed, although
here no torque was applied. We considered three independent sequences
which are reported in Section S3.4 in the Supporting Information. For each sequence, simulations were performed
for MADna, oxDNA2, and 3SPN2C at pulling forces ranging between 0.05
and 2.5 pN. Further simulation details can be found in the Methods. In Figure we report the results obtained and compare them with
the experimental values from refs (33 and 34). In line
with the trend observed in Figure , MADna and oxDNA2 follow quite closely the experimental
curve, while 3SPN2C systematically overestimates the value of Ceff.
Figure 8
Effective torsion modulus Ceff as a
function of force for MADna (orange circles), oxDNA2 (green squares),
3SPN2C (blue diamonds), and experiments (gray triangles). Experimental
data were extracted from refs (33 and 34).
Effective torsion modulus Ceff as a
function of force for MADna (orange circles), oxDNA2 (green squares),
3SPN2C (blue diamonds), and experiments (gray triangles). Experimental
data were extracted from refs (33 and 34).
Comparison with Experiments: Sequence-Dependent
Conformation
and Elasticity
Having characterized and benchmarked the sequence-averaged
mechanical properties of MADna, we now turn to sequence-dependent
features. In this regard, we considered the sequences studied experimentally
in ref (28), where
the authors characterized the sequence-dependent persistence length
and helical repeat by means of cyclization experiments.We performed
simulations for DNA molecules of 100 base pairs obtained by taking
the central parts of 14 different experimental sequences as listed
in Section S3.2 in the Supporting Information. For each simulated sequence, we computed the helical pitch by dividing
the cumulated helical twist by 2π, while for the persistence
length we employed the same approach as above. In Figure we report the comparison between
experiments and MADna predictions. As the plots show, there is a high
correlation between the two data sets, thus indicating that MADna
satisfactorily captures the sequence dependence of two key features
of DNA conformation (helical pitch) and elasticity (persistence length).
From a quantitative perspective, we see from Figure a that the simulated values for the pitch
are larger than the experimental ones. Nonetheless, a quantitative
comparison has to be performed with care. Indeed, the experimental
values were obtained indirectly by analyzing cyclization data by means
of a wormlike chain with twist.[28] Although
the relative differences observed in the experimental results for
the different sequences are robust, the values depend on the validity
of this model down to the scale of single steps and on the value assigned
to the twist modulus when fitting the data. Moreover, it is likely
that the details of the definition of the h-twist in the simulations
affect the quantitative determination of the helical pitch. To our
knowledge, this is the first instance in which these structural data
have been compared to predictions from simulations. As for the persistence
length, the strong correlation found has to be mitigated by the slight
quantitative overestimation of lp in simulations,
in line with the results reported above for the sequence-averaged
persistence length. Moreover, the values of lp from simulations appear to be more heterogeneous than in
experiments. While this can be ascribed to intrinsic limitations of
MADna, it has to be noted that the smaller dispersion of values determined
from experiments might originate from some assumptions made in their
analysis. Indeed, experimental persistence lengths were obtained by
fitting j-factors from cyclization experiments, where
the theoretical formulas depend on lp,
the helical pitch, the twist modulus C, and the contour
length L.[28] The data reported
in ref (28) were obtained
by assuming sequence-independent C and L, which neglects two sources of sequence-driven heterogeneity, suggesting
that the values obtained in experiments might underestimate the actual
range of variability of lp.
Figure 9
Comparison
between experimental values and MADna predictions for
the sequence dependence of the helical pitch (a) and the persistence
length lp (b). The lines correspond to
the linear fits of the scatter plots and are included as a guide to
the eye. The value of the Pearson coefficient is reported in each
plot.
Comparison
between experimental values and MADna predictions for
the sequence dependence of the helical pitch (a) and the persistence
length lp (b). The lines correspond to
the linear fits of the scatter plots and are included as a guide to
the eye. The value of the Pearson coefficient is reported in each
plot.For comparison, we also performed
simulations for oxDNA2 and 3SPN2C,
for which the results are reported in Figure S7 in the Supporting Information. We found that oxDNA2
predictions do not correlate with experiments for either the helical
pitch or the persistence length. This is expected, since in this model
the sequence dependence is implemented by tuning the base-pairing
and stacking interactions to account for thermodynamic data but not
for the elasticity of DNA. In the case of 3SPN2C, we found a weak
correlation for the helical pitch but a strong correlation for the
persistence length. This high correlation was also expected, since
the persistence-length data from ref (28) were used to parametrize the model.[82] For completeness, we mention that also CGDNA
has been used to reproduce these experimental persistence-length data,
for which a similar correlation as for MADna was found (r = 0.73).[56]As a further test, we
performed stretching simulations for phased
A-tracts, i.e., dsDNA molecules obtained by alternating fragments
of consecutive adenines and random sequences, with each fragment having
a length of 5–10 base pairs. It was experimentally found that
the stretching modulus for such molecules is roughly 50% larger than
for random sequences.[21] We run some stretching
simulations for five phased A-tracts of 40 base pairs, whose sequences
were extracted as fragments of the experimental ones[21] and are reported as A-tract-1, ..., A-tract-5 in Section
S3.3 in the Supporting Information. Simulation
protocols and analysis were the same as in the case of random sequences
studied above. The results from the three models were S̃ = (1402 ± 48) pN for MADna, S̃ = (1444
± 2) pN for oxDNA2, and S̃ = (2793 ±
67) pN for 3SPN2C. By comparing these values with the results reported
in Table , we thus
conclude that MADna predicts an increase of roughly 35%, oxDNA2 does
not predict any increase, and 3SPN2C predicts a 8% increase. As expected,
the absence of elasticity-oriented sequence dependence for oxDNA2
prevents it from capturing this feature. The prediction of MADna is
the one most closely resembling the significant increase observed
experimentally, although still underestimating its extent.
Strengths
and Limits
The comparison between MADna,
oxDNA2, 3SPN2C, and experiments can be summarized as follows. MADna
performs in general better at capturing both sequence-averaged and
sequence-dependent conformational and elastic features. As for the
other two models, oxDNA2 reproduces well the sequence-averaged properties,
but it cannot account for sequence dependence. In contrast, 3SPN2C
captures well the sequence dependence of elasticity in the case of
the persistence length, while its accuracy for the change in stretch
modulus or conformation is more limited. Moreover, it tends in general
to overestimate the magnitude of the elastic constants.The
accuracy of MADna in addressing the sequence dependence of conformational
and elastic properties of dsDNA appoints it as the ideal choice to
interpret the outcome of single-molecule experiments, as well as to
study the conformational changes induced by mechanic stress, which
are relevant for many systems in vivo. Moreover, it can be employed
as a virtual laboratory to test analytical theories on DNA elasticity.[95,96] Nonetheless, at the present stage MADna cannot account for breaking
events such as the formation of kinks or local melting. Hence, it
is important to assess the relevance of such mechanisms for the system
under study before drawing conclusions based on simulations performed
with the present model. We are currently working to surpass such limitations,
in order to further extend the palette of possible systems which can
be analyzed through the lens of MADna.Further limitations are
introduced by the use of an implicit solvent,
which prevents studying DNA solvation. Similarly, the use of reduced
charges interacting via the Debye–Hückel potential does
not account for ion–ion correlations, which are particularly
relevant for multivalent ions.[97] We note
that MADna shares these limitations with all the coarse-grained models
based on a similar philosophy, including, e.g., oxDNA2 and 3SPN2C.
In view of future modeling of the interaction of DNA with proteins
by means of MADna, the potential obstacle provided by these limitations
can however be overcome by considering a distance-dependent dielectric
constant for the electrostatic interaction between charged sites on
DNA and proteins.[98]
LAMMPS Implementation and
Availability
The standard
potentials employed in MADna enable a straightforward implementation
of the model in all common Molecular Dynamics simulators. The molecular
builder, the main scripts used for the analysis, and some sample scripts
for simulation in LAMMPS are provided at https://github.com/saassenza/MADna/tree/main/LAMMPS. The molecular builder takes as input the sequence and provides
both the topology and the initial coordinates in LAMMPS format. In
order to run MADna in LAMMPS (at the time of writing, the stable version
is 29Sep2021), the latter has to be built with the packages MOLECULE,
EXTRA-MOLECULE, and EXTRA-PAIR.
Conclusion
We
have introduced MADna, a novel coarse-grained model for the
simulation of double-stranded DNA. MADna captures the sequence dependence
of conformational and elastic properties with accuracy comparable
to that of atomistic simulations. Key conformational features which
closely follow atomistic results include the main helical parameters,
the groove geometry, the diameter of the double helix, and the spontaneous
curvature as quantified by the crookedness. The model also predicts
sequence-averaged and sequence-dependent elasticity and conformation
in agreement with experimental results for a wide set of features,
namely the stretching and torsion moduli, the counterintuitive negative
twist–stretch coupling, the twist–bend coupling, the
persistence length, and the helical pitch. At the present stage, the
model does not account for breaking events, which are being addressed
in ongoing work. The implementation of MADna in Molecular Dynamics
software is straightforward due to the common potentials employed.
Sample scripts for LAMMPS and a molecular builder are openly provided
on GitHub.Due to both its accuracy and its simplicity of use,
we believe
that MADna provides a significant addition to the toolbox of coarse-grained
simulations and will enable precise theoretical studies of a wide
set of large-scale DNA phenomena.
Authors: David A Case; Thomas E Cheatham; Tom Darden; Holger Gohlke; Ray Luo; Kenneth M Merz; Alexey Onufriev; Carlos Simmerling; Bing Wang; Robert J Woods Journal: J Comput Chem Date: 2005-12 Impact factor: 3.376
Authors: Jeff Gore; Zev Bryant; Marcelo Nöllmann; Mai U Le; Nicholas R Cozzarelli; Carlos Bustamante Journal: Nature Date: 2006-07-12 Impact factor: 49.962
Authors: Alberto Pérez; Iván Marchán; Daniel Svozil; Jiri Sponer; Thomas E Cheatham; Charles A Laughton; Modesto Orozco Journal: Biophys J Date: 2007-03-09 Impact factor: 4.033
Authors: Gordon S Freeman; Joshua P Lequieu; Daniel M Hinckley; Jonathan K Whitmer; Juan J de Pablo Journal: Phys Rev Lett Date: 2014-10-14 Impact factor: 9.161
Authors: Victor Velasco-Berrelleza; Matthew Burman; Jack W Shepherd; Mark C Leake; Ramin Golestanian; Agnes Noy Journal: Phys Chem Chem Phys Date: 2020-08-20 Impact factor: 3.676
Authors: Franziska Kriegel; Christian Matek; Tomáš Dršata; Klara Kulenkampff; Sophie Tschirpke; Martin Zacharias; Filip Lankaš; Jan Lipfert Journal: Nucleic Acids Res Date: 2018-09-06 Impact factor: 16.971