A promising anticancer therapeutic strategy is the stabilization of telomeric G-quadruplexes using G-quadruplex-binding small molecules. Although many G-quadruplex-specific ligands have been developed, their low potency and selectivity to G-quadruplexes over duplex remains unsolved. Recently, a crystal structure of a telomeric 3' quadruplex-duplex hybrid was reported and the quadruplex-duplex interface was suggested to a good target to address the issues. However, there are no high-resolution complex structures reported for G-quadruplex ligands except for a docked BSU6037. In this study, molecular dynamic (MD) binding simulations with a free ligand were used to study binding poses and dynamics of four representative ligands: telomestatin, TMPyP4, BSU6037, and BRACO19. The MD data showed that BSU6037 was able to fully intercalate into the interface whereas TMPyP4 and BRACO19 could only maintain partial intercalation into the interface and telomestatin only binds at the quadruplex and duplex ends. Both linear ligands, BSU6037 and BRACO19, were able to interact with the interface, yet they were not selective over duplex DNA. The DNA geometry, binding modes, and binding pathways were systematically characterized, and the binding energy was calculated and compared for each system. The interaction of the ligands to the interface was by the means of an induced-fit binding mechanism rather than a lock-key mechanism, consisting of the DNA unfolding at the interface to allow entrance of the drug and then the refolding and repacking of the DNA and the ligand to further stabilize the G-quadruplex. On the basis of the findings in this study, modifications were suggested to optimize the interface binding for TMPyp4 and telomestatin.
A promising anticancer therapeutic strategy is the stabilization of telomeric G-quadruplexes using G-quadruplex-binding small molecules. Although many G-quadruplex-specific ligands have been developed, their low potency and selectivity to G-quadruplexes over duplex remains unsolved. Recently, a crystal structure of a telomeric 3' quadruplex-duplex hybrid was reported and the quadruplex-duplex interface was suggested to a good target to address the issues. However, there are no high-resolution complex structures reported for G-quadruplex ligands except for a docked BSU6037. In this study, molecular dynamic (MD) binding simulations with a free ligand were used to study binding poses and dynamics of four representative ligands: telomestatin, TMPyP4, BSU6037, and BRACO19. The MD data showed that BSU6037 was able to fully intercalate into the interface whereas TMPyP4 and BRACO19 could only maintain partial intercalation into the interface and telomestatin only binds at the quadruplex and duplex ends. Both linear ligands, BSU6037 and BRACO19, were able to interact with the interface, yet they were not selective over duplex DNA. The DNA geometry, binding modes, and binding pathways were systematically characterized, and the binding energy was calculated and compared for each system. The interaction of the ligands to the interface was by the means of an induced-fit binding mechanism rather than a lock-key mechanism, consisting of the DNA unfolding at the interface to allow entrance of the drug and then the refolding and repacking of the DNA and the ligand to further stabilize the G-quadruplex. On the basis of the findings in this study, modifications were suggested to optimize the interface binding for TMPyp4 and telomestatin.
A promising anticancer
therapeutic strategy is the stabilization
of telomeric G-quadruplexes using quadruplex-binding small molecules.[1−4] G-quadruplexes are formed from the stacking of guanine (G) tetrads
via Hoogsteen base pairing in G-rich sequences.[3−12] Human telomeres appearing at the end of chromosomes consist of a
3′ single-stranded TTAGGG repeat overhang.[2,10−18] This TTAGGG repeat overhang has the ability to form G-quadruplexes.
During cell division, the (TTAGGG) sequence
is progressively shortened until a growth-arrest state and eventual
senescence, the complete stopping of cell division, is reached.[4,12,16,18−22] When senescence is bypassed due to the absence of cell cycle checkpoints,
a cell then reaches a state of crisis where it undergoes uncontrolled
growth.[2,4,9,12,14,19,22,23] The overexpression of telomerase, a reverse transcriptase that lengthens
telomeres,[2−4,9,12,18,19,24] results in the bypass of senescence and
thus indefinite growth of cancerous cells.[9,10,12,13,17−19,25] G-quadruplexes have been shown to inhibit telomerase by making the
3′ overhang inaccessible to the telomerase RNA template.[9,16,26−29] It has also been shown that G-quadruplex
specific ligands compete with the protection of telomere protein 1
binding, where the binding of the ligands leads to a series of events:
the uncapping of telomeres, DNA damage response signaling, and eventual
apoptosis of the mutated cells.[23,30−32] Consequently, the stabilization of telomeric G-quadruplexes is a
viable therapeutic strategy to treat cancers.[12] Furthermore, the over-representation of G-quadruplexes in cancer
cells[21,33−37] makes them cancer-specific targets.[31] Because of this, many studies have been performed on the
telomeric and promoter G-quadruplexes with various G-quadruplex specific
ligands to inhibit cancer cell growth.[1,2,9,15,16,22,27,38−47]The class of anticancer drugs in this paper is widely represented
by two structural characterizations: cyclical and linear. Four ligands
(telomestatin, TMPyP4, BSU6037, and BRACO19 in Figure B–E) are used to exemplify ligands
of both structural families within this class of drugs. Telomestatin
and TMPyP4 are representative of the cyclical ligands; BSU6037 and
BRACO19 are representative of the linear ligands. Telomestatin,[48−51] TMPyP4,[52−55] and BRACO19[56−71] were chosen explicitly because they are the most studied ligands
that are potent against various cancer cell lines. The ligand BSU6037
was previously docked to the quadruplex–duplex interface by
Krauss et al.[46]
Figure 1
Structures of the telomeric
DNA quadruplex–duplex hybrid
(A), telomestatin (B), TMPyP4 (C), BSU6037 (D), and BRACO19 (E).
Structures of the telomeric
DNA quadruplex–duplex hybrid
(A), telomestatin (B), TMPyP4 (C), BSU6037 (D), and BRACO19 (E).A large variety of G-quadruplex
specific ligands have been developed
on the following criteria: an aromatic core that allows π–π
stacking to the G-quartet and a side chain that allows electrostatic
interactions with the sugar-phosphate backbone and ensures solubility.[3,4,10−12,17,18,29,31,72] For a ligand to target G-quadruplexes successfully, its binding
equilibrium constant must be at least 106 M–1 and it also must be selective over duplex binding by at least 2
orders of magnitude. The binding properties of three leading ligands
(telomestatin, TMPyP4, and BRCO19) is listed in Table S1 of the Supporting Information. Telomestatin, a natural
product, shows the highest potency and telomerase inhibition, and
it is also 70 times more selective toward G-quadruplexes than duplex
DNA.[26,73] Despite this, telomestatin’s hydrophobicity
and low water solubility impedes its bioavailability, which is an
integral feature in successful drug design.[3,31,74] TMPyP4 is a synthetic ligand designed for
G-quadruplex stabilization with an affinity of 20 × 106 M–1 and a TRAP-LIG EC50 (TRAP: the
telomere repeat amplification protocol for telomerase, which is used
for determination of telomerase activity) of 8.9 μM but is only
2 times more selective to the G-quadruplex than duplex DNA.[21,75] The synthetic acridine, BRACO19, has been reported to have promising
potency, KG4 = 30 × 106 M–1, and telomeric inhibition, TRAP-LIG EC50 = 6.3 μM, with 10 times more selectivity over duplex
DNA.[31,32,75] However, BRACO19
has low membrane permeability and decomposes quickly, which decreases
the binding of BRACO19 to G-quadruplexes.[76,77] It is clear that potency, selectivity, and other properties need
to be further improved to convert these compounds to therapeutic agents.The detailed binding poses of multiple G-quadruplex ligands have
been experimentally and computationally characterized; these studies
suggested two well-known binding poses: end-to-end stacking (top or
bottom stacking) and groove binding.[9,10,21,22,31,33,41,43−45,78−80] The main binding interactions have been shown to
be π–π stacking of the aromatic core of small ligands
with the G-quartet, which is best achieved in end-to-end stacking.[74] However, since terminal G-quartets are present
in the majority of G-quadruplexes, achieving selectivity to a distinct
G-quadruplex has become another issue.A recent work by Krauss
and others involving G-quadruplexes has
begun to focus on targeting the telomeric 3′ junction formed
between a G-quadruplex and duplex DNA as a solution to increase the
binding selectivity and potency to G-quadruplexes over duplexes for
G-quadruplex ligands.[46] In their work,
a crystal structure of a telomeric 3′ quadruplex–duplex
hybrid formed from putative parallel G-quadruplex and duplex DNA was
reported (Figure A)
and the junction was suggested as a more specific G-quadruplex site
that can be targeted by G-quadruplex ligands.[46] To probe the ligand binding at the quadruplex–duplex interface,
a diverse range of G-quadruplex ligands (BSU6037, BSU6039, BRAC-19,
TMPyP4, and FC4ND10) were selected for ligand soaking experiments
with preformed crystals of the quadruplex–duplex hybrid. Ligands
that did not significantly degrade crystal quality but appeared to
bind to the construct were TMPyP4, Cu-porphyrin, BSU6037, and FC4ND10.
However, the binding pose at the interfaces was not obtained due to
different reasons. Binding of TMPyP4 caused a large decrease in diffraction
quality (from about 3.0 to 4.3 Å resolution) and a slight change
in cell dimensions. Analysis of different Fourier maps did not show
the presence of BSU6037 or Cu-porphyrin bound to the quadruplex–duplex
hybrid, and FC4ND10 was observed to bind in the duplex region. Since
no structural data is available on ligand binding at the quadruplex–duplex
interface, BSU6037 was docked to a pseudo-ligand-binding site at the
interface that was generated by rotating the phosphodiester backbone
of T17, while maintaining strand polarity of the G-rich strand. Although
the obtained binding pose of BSU6037 is closely analogous to that
observed on a crystal complex structure formed between BSU6039 and
a bimolecular quadruplex (1L1H),[81] lack
of structural flexibility in the docking may limit induced-fit binding
modes.This study aims at furthering molecular modeling investigations
with four representative ligands (telomestatin, TMPyP4, BSU6037, and
BRACO19 in Figure B–E) binding to the telomeric 3′ quadruplex–duplex
construct in hopes of characterizing binding poses and their binding
mechanisms using molecular dynamics (MD) binding simulation with a
free ligand. The MD simulations probed the structure, dynamics, and
interaction of the G-quadruplex–duplex construct with the ligands
in high spatial and temporal resolution.[82] Detailed process information was used to decipher the binding mechanism.
Specifically, this study utilized the latest AMBER DNA force field
(OL15) and drug (GAFF2) force fields to simulate the binding process
between the ligands and the human telomeric G-quadruplex–duplex
construct from an initially unbound state (where the ligand is 10
Å away from the DNA). The binding modes and pathways of each
ligand to the G-quadruplex–duplex DNA were characterized, and
then the molecular mechanics generalized born/surface area (MM-GBSA)
binding energies of each binding pose were compared. Three major modes
of binding were observed for TMPyP4, BSU6037, and BRACO19: interface
intercalation, binding to the quadruplex, and binding to the duplex.
No intercalation binding mode was observed for telomestatin, but suggestions
on how to alter the telomestatin scaffold to allow for interface intercalation
were given. The MM-GBSA binding energy analysis indicates that the
charged aromatic TMPyP4 scaffold was better than both the uncharged
aromatic structures telomestatin and linear structures (BSU6037 and
BRACO19). Additionally, an implication on designing a better TMPyP4
analog was also discussed (Table ).
Table 1
Molecular Dynamics Simulation Runs
system IDa
DNA
no. of ligands
no. of runs
no. of water
molecules
ions
box size
(Å)b
drug initial
pose
NPT eq. (ns)
NVT (ns)
total time
(μs)
1
1 (5dww)
0
3
6242
30 K+
67.7
N/A
1
1999
6
2
0
1
5
1491
0
31.8
free
1
499
2.5
3
0
1
5
1043
4 Cl–
37.4
free
1
499
2.5
4
0
1
5
2065
2 Cl–
45.8
free
1
499
2.5
5
0
1
5
1487
3 Cl–
41.5
free
1
499
2.5
6
1 (5dww)
1
19 + 1
18 926
30 K+
94.1
free
1
499/1999
11.5
7
1 (5dww)
1
19 + 1
11 298
26 K+
80.2
free
1
499/1999
11.5
8
1 (5dww)
1
19 + 1
12 131
28 K+
82.0
free
1
499/1999
11.5
9
1 (5dww)
1
19 + 1
15 541
27 K+
88.4
free
1
499 1999
11.5
System 1 refers to the free DNA-only
and systems 2–5 refer to the free ligand-only simulations (2–5:
telomestatin, TMPyP4, BSU6037, and BRACO19, respectively). Systems
6–9 refer to the free DNA plus free ligand simulations (6–9:
telomestatin, TMPyP4, BSU6037, and BRACO19, respectively).
Triclinic box equivalent to the
true truncated octahedral box.
System 1 refers to the free DNA-only
and systems 2–5 refer to the free ligand-only simulations (2–5:
telomestatin, TMPyP4, BSU6037, and BRACO19, respectively). Systems
6–9 refer to the free DNA plus free ligand simulations (6–9:
telomestatin, TMPyP4, BSU6037, and BRACO19, respectively).Triclinic box equivalent to the
true truncated octahedral box.
Results
Scaffold
of the Quadruplex–Duplex Hybrid Was Maintained
in the DNA-Only Simulations
To validate the force field for
DNA, the DNA-only system was constructed from the crystal structure
(Protein Data Bank (PDB) ID: 5DWW) for stability simulations. Three independent runs
for this DNA-only system were carried for 2.0 μs. DNA backbone
root-mean-square deviation (RMSD) plots and the last snapshots are
shown in Figures S3 and S6, respectively. Clearly, the DNA hybrid was stable in all
three runs, as indicated by the small root-mean square deviation of
the backbone from the starting crystal structure (RMSD = 2–3
Å). Neither of the two K+ ions inside the G-quadruplex
moved out, and all of them maintained the initial positions in three
runs.To show more structural details, a representative trajectory
and its order parameters are shown in Figure . First, the G-quadruplex was stable, indicated
by maintaining ∼8, ∼9, and ∼8 hydrogen bonds
for the top, middle, and bottom G4 layers. Second, the interface was
stable, indicated by ∼3 and ∼1 hydrogen bonds for the
tribase layer and the A7–T19 pair. Third, the duplex part was
stable, indicated by maintaining ∼2 hydrogen bonds for each
base pair in the duplex part. Fourth, the DNA backbone was stable,
indicated by a small RMSD of ∼2.5 Å. Fifth, the two K+ ions inside the quadruplex were stable, indicated by maintaining
the K+ to K+ distance of ∼3.8 Å,
with a small fluctuation. Sixth, the calculated relative conformation
energy using MM-GBSA showed fluctuation but was flat overall.
Figure 2
Representative
trajectory of the DNA-only simulation and the order
parameter plot illustrating the breaking and reforming of hydrogen
bonds (at the quadruplex (1), interface (2), and duplex (3)), RMSD
(Å) with reference to the final structure (4), K+ to
K+ distance (5), and the MM-GBSA binding energy (ΔE in kcal/mol) (6). 5′ and 3′ of the DNA chain
are indicated by a red and blue ball, respectively. K+ ions
are indicated by a yellow ball.
Representative
trajectory of the DNA-only simulation and the order
parameter plot illustrating the breaking and reforming of hydrogen
bonds (at the quadruplex (1), interface (2), and duplex (3)), RMSD
(Å) with reference to the final structure (4), K+ to
K+ distance (5), and the MM-GBSA binding energy (ΔE in kcal/mol) (6). 5′ and 3′ of the DNA chain
are indicated by a red and blue ball, respectively. K+ ions
are indicated by a yellow ball.
Geometric Characterization of the G-Quadruplex Part Showed a
Right-Handed Helical Twist of the Three G4 Layers, the Ligand Binding
Can Make Subtle Changes
Although the scaffold of a G-quadruplex
has been extensively studied and classified into parallel, antiparallel,
and hybrid class, the inter-G4 layer geometry from the MD simulations
has not been thoroughly examined. From the top view of the G-quadruplex
part (Figure S1a), the right-handed helical
rotation of G4 layers was obviously identified. Analogous to the right-handed
helical rise and twist in B-form DNA, this work has defined and calculated
the rise, H-rise, and H-twist, as detailed in the method section for
the crystal structure and five simulated systems (Table S6). For the DNA-only simulated system, the rise, H-rise,
and H-twist averaged over the two layer steps (i.e., bottom to middle
and middle to top) are ∼3.44 Å, ∼2.09 Å, and
∼17.6°, respectively, which are close to the values of
the crystal structures (∼3.42 Å, ∼1.13 Å,
and ∼10.7°). The slightly larger differences for H-rise
and H-twist may be attributed to the crystal packing in the condense
phase or the force-field artifact. Further comparison to the solution
NMR structure will clarify this issue. This is supported by a recent
bioinformatics study on 74 G-quadruplex structures in the PDB, suggesting
that the distribution of the twist angle between the two adjacent
G-quartets for biomolecular parallel G-quadruplexes is ranged from
10 to 35°, with a maximum at 31 ± 3°.[83] Nonetheless, H-rise and H-twist in G-quadruplex are smaller
than those in standard B-DNA (3.32 Å and 34.3° for H-rise
and H-twist, respectively). Interestingly, the stacking of telomestatin
on the top of the G-quadruplex (Figure ) did not change these parameters much (∼3.37
Å, ∼1.79 Å, and ∼15.8°) neither did the
binding of BSU6037 (Figure ) at the G-quadruplex–duplex interface (∼3.44
Å, ∼1.92 Å, and ∼16.9°) but the interaction
from TMPyP4 and BRACO19 changed H-rise to from ∼2.09 to ∼3
Å. Thus, ligand binding can slightly modulate the G-quadruplex
interlayer geometry.
Figure 5
Representative trajectory
of the top stacking mode of telomestatin
(run 9) and the order parameter plot, illustrating the breaking and
reforming of hydrogen bonds (at the quadruplex (1), interface (2),
and duplex (3)), drug-base dihedral angle (4), ligand (black)/DNA
(red) RMSD (Å) with reference to the final structure (5), ligand
center-to-DNA center distance (black) and K+-to-K+ distance (red) (6), and the MM-GBSA binding energy (ΔE in kcal/mol) (7). 5′ and 3′ of the DNA chain
are indicated by a red and blue ball, respectively. K+ ions
are indicated by yellow balls.
Figure 7
BSU6037, run 19, trajectory snapshots
and the order parameter plot,
illustrating the breaking and reforming of hydrogen bonds (at the
quadruplex (1), interface (2), and duplex (3)), drug-base dihedral
angle (4), ligand (black)/DNA (red) RMSD (Å) with reference to
the final structure (5), ligand center-to-DNA center distance (black)
and K+-to-K+ distance (red) (6), and the MM-GBSA
binding energy (ΔE in kcal/mol) (7). 5′
and 3′ of the DNA chain are indicated by a red and blue ball,
respectively. K+ ions are indicated by yellow balls.
Ligand Binding Did Not Change the Interaction
between K+ and the G-Quadruplex in Cases When K+ Did Not
Move Out
Potassium ions play an important role in stabilizing
G-quadruplexes. The distances between a potassium ion and eight neighboring
oxygen atoms and the oxygen–oxygen distances were calculated,
and these values are shown in Figure S1c,d for the DNA-only system. The values for the crystal structure and
other four DNA–ligand systems are shown in Tables S2 and S3. The average potassium–oxygen distance
difference between the crystal structure and the simulated DNA-only
system was within 0.16 Å, which is comparable to the structural
fluctuation (±0.18 Å) observed in the MD simulations. The
average oxygen–oxygen distance difference between the crystal
structure and simulated DNA-only system was within 0.34 Å, which
is slightly larger than the structural fluctuation (±0.21 Å).
Thus, the simulation well reproduced these distances. The average
potassium–oxygen distance difference between the simulated
DNA-only system and DNA–ligand systems was within 0.05 Å,
which is much smaller than the structural fluctuation (±0.18
Å) observed in the MD simulations. The average oxygen–oxygen
distance difference between the simulated DNA-only system and DNA–ligand
systems was within 0.03 Å, which is slightly larger than the
structural fluctuation (±0.21 Å). Thus, ligand binding did
not change the interaction between K+ and the G-quadruplex
when K+ did not move out.
Geometric Characterization
of the Duplex Part Showed That a
Normal B-DNA Geometry for Most Base Pairs Was Maintained, the Ligand
Binding on the Top of the G-Quadruplex or at the G-Quadruplex–Duplex
Interface Did Not Change the Duplex Geometry
Various geometry
parameters including base pair-axis, base pair-step, and paired base–base
parameters were calculated, as detailed in the method section, to
characterize the duplex part for the crystal structure and the five
simulated systems (Table S4). The values
from the five simulated systems generally agree with the values from
the crystal structure. For example, the crystal structure has average
values of ∼3.3 Å and ∼31° for H-rise and H-twist,
respectively. The H-rise for the DNA-only system, DNA–telomestatin
system (top binding), DNA–TMPyP4 (interface interacting), DNA–BSU6037
(interface binding), and DNA–BRACO19 (interface interacting)
is ∼2.8, ∼3.4, ∼3.0, ∼3.0, and ∼3.1
Å. The H-twists for the DNA-only system, DNA–telomestatin
system (top binding), DNA–TMPyP4 (interface interacting), DNA–BSU6037
(interface binding), and DNA–BRACO19 (interface interacting)
is ∼30, ∼33, ∼30, ∼30, and ∼32°.
All of these values are close to the values of 3.3 Å and 34°
for H-rise and H-twist of B-DNA helix, respectively. Thus, the helical
structure of the duplex was maintained throughout the MD simulations.
This is also supported from the DNA backbone dihedral angle and pucker
calculations (Table S5): although there
are some subtle differences, most values are close to each other.
Root-Mean-Square Fluctuation (RMSF) Data Showed That the Three
loops (T4, T8, and T12–T13) in the G-Quadruplex Part (1–16)
and Interface Region (T17–A19, T7–A8) and the Two Termini
of the Duplex Part (A25 and A2) had Larger Local Fluctuations Than
Those of the Four G Strands (G1–G3, G5–G7, G9–G11,
and G14–G16) for the DNA-Only System
Figure shows RMSF plot for the five
systems. As to the DNA-only system (black line), five peaks were identified,
corresponding to T4, T8, T12–T13, T18, and A25–A2. In
contrast, the other regions (G1–G3, G5–G7, G9–G11,
G14–G16, G20–T24, and A3–C6) show lower RMSF
values. Thus, although the four G4 strands and two helical strands
had lower structural fluctuation, the three loops in the G-quadruplex
part, the interface region, and the termini of the duplex part had
larger structural fluctuation. The G-quadruplex top binding by telomestatin
(red line) slightly reduced the RMSF values at the five peaks, in
particular, the T12–T13 loop peak and the A24–A2 terminal
peak, indicating the telomestatin might slightly stabilize the overall
structure. The interface interaction by TMPyP4 (green line) slightly
reduced the first three peaks but significantly reduced the fourth
peak in the interface region. The interface interaction by BRACO19
(blue line) slightly reduced the first three peaks but significantly
reduced the fourth and fifth peaks. Thus, the binding by TMPyP4 and
BRACO19 stabilized the interface region. The interface binding by
BSU6037 (purple line) slightly increased the first, second, and fifth
peaks but significantly reduced the third peak and shifted the fourth
peak to G20–C21 region. These changes can be attributed to
the change of the interface by the BSU6037 interface binding (Figure ).
Figure 3
Root-mean-square fluctuation
(RMSF) plot of the five systems (DNA-only,
top binding of telomestatin, interface intercalation mode of BSU6037,
and the interface interacting mode of TMPyP4 and BRACO19). The x axis refers to the residue number of the DNA quadruplex
hybrid structure (Figure a); the residue 1 of the second chain was missed in the crystal
structure.
Root-mean-square fluctuation
(RMSF) plot of the five systems (DNA-only,
top binding of telomestatin, interface intercalation mode of BSU6037,
and the interface interacting mode of TMPyP4 and BRACO19). The x axis refers to the residue number of the DNA quadruplex
hybrid structure (Figure a); the residue 1 of the second chain was missed in the crystal
structure.
Three Major Binding Modes
Were Observed for TMPyP4, BSU6037,
and BRACO19: Interface Interaction, Binding to the Quadruplex, and
Binding to the Duplex, but No Interface Interaction Mode Was Observed
for Telomestatin
Starting from the unbound state, 20 binding
simulation runs (500 ns of each) were carried out using each ligand.
The RMSD of the DNA backbone heavy showed that a majority of the trajectories
for each complex reached and remained in a stable form (Figure S4). Those that experienced a larger fluctuation
in RMSD values were those where the DNA backbone started to slightly
unravel and unfold. The convergence of the binding simulations was
confirmed, as described in the Methods section,
with the simulation systems reaching a relatively steady state where
the DNA–ligand complex retained 20 or more atom contacts (Figure S5). A table to summarize the number of
trajectories in each final binding pose as well as a figure of the
final binding poses of each trajectory is available in Figures S7 and S8–S11, respectively.The most stable complexes (Figure S13)
of the 20 trajectories were extracted and categorized into structural
families on the basis of a clustering analysis for each ligand, as
described in the methods section. By setting the threshold of 1% population,
6 structural families were identified for telomestatin (Figure S14), 9 were for TMPyP4 (Figure S15), 10 were for BSU6037 (Figure S16), and 12 were for BRACO19 (Figure S17). Then, these families were further merged into three binding modes:
interface interacting, quadruplex side binding, and duplex side binding.
After analyzing each of the modes, it was discovered that only TMPyP4,
BSU6037, and BRACO19 were observed in interface interacting, quadruplex
binding, and duplex binding modes: telomestatin was not able to bind
to or interact with the interface (Figure ), and BSU6037 was the only ligand able to
fully intercalate into the interface (Figure ). The results, including percentages, of
the most abundant clusters are reported in Figure . For telomestatin, overbinding to ends of
the DNA duplex could be a simulation artifact due to the missing T1
in the crystal structure. The hydrogen bond networks of the interface
interacting modes as well as detailed two-dimensional ligand–DNA
interactions of the major binding modes are presented in Figures S18 and S19, respectively.
Figure 4
Major binding
modes between ligands and human telomeric quadruplex–duplex
(PDB ID: 5DWW). 5′ and 3′ of the DNA chain are indicated by a red
and blue ball, respectively. K+ ions are indicated by yellow
balls. Overall population abundance (MM-GBSA binding energy) of each
binding mode is annotated.
Major binding
modes between ligands and human telomeric quadruplex–duplex
(PDB ID: 5DWW). 5′ and 3′ of the DNA chain are indicated by a red
and blue ball, respectively. K+ ions are indicated by yellow
balls. Overall population abundance (MM-GBSA binding energy) of each
binding mode is annotated.From the trajectory and clustering analysis, it is clear
that unlike
BSU6037, the ligands TMPyP4 and BRACO19 did not fully intercalate
into interface. Instead, TMPyP4 and BRACO19’s side chains extended
into the interface to make contact. As the interface opened up, the
drugs were able to move further into the interface and the interactions
with the interface were enhanced as the DNA deformed, which suggests
an induced fit binding mechanism. However, despite the utilization
of an induced fit binding mechanism, the final binding mode remained
a partial interaction and not full intercalation for both TMPyP4 and
BRACO19.
MM-GBSA Binding Energy Data Showed That the Interface Interaction
Mode of TMPyP4 Was the Most Energetically Favorable, the G-Quadruplex
Binding Mode of Telomestatin Was Much More Favorable Than Its Duplex
Binding Mode, and the Interface Interactions of BSU6037 and BRACO19
Were Comparable to the Duplex Binding Mode
To examine the
relative stability for the three binding modes observed, MM-GBSA binding
energy calculations were conducted on each binding mode for each complex
(Table ). TMPyP4 had
the highest binding energy for interface interaction, −96.2
± 6.8 kcal/mol, which was 13.3 kcal/mol higher than that for
G-quadruplex binding and 17.7 kcal/mol higher than that for duplex
binding. These values indicate that TMPyP4 strongly interacts with
the interface and is selective over the duplex DNA. Telomestatin did
not interact with the interface but had a much stronger binding energy
and selectivity to the G-quadruplex (−70.4 ± 5.1 kcal/mol)
over the duplex (−30.3 ± 5.0 kcal/mol). BSU6037 and BRACO19
both had similar interface interaction and duplex binding energies
and a lower comparative G-quadruplex binding energy. To further understand
the nature of binding, the binding energy was broken down into van
der Waals (VDW) interactions, hydrophobic interactions (SUR), electrostatic
interactions (GBELE), and the conformational energy change induced
from the complex forming (CONF) (Table ). As shown in Table , a majority of the binding energy is contributed by
the VDW energy and the conformational energy change. As expected,
the top stacking for telomestatin had a higher VDW energy than the
duplex binding; this is likely due to the uniplanar structure of telomestatin,
making stacking a more favorable binding choice. TMPyP4 had the largest
VDW energy for the interface interaction mode, which could be tied
to its charged planar structure. Both linear structures, BSU6037 and
BRACO19, had higher VDW energies for the duplex and interface than
the G-quadruplex, which could be explained by the fact that the linear
structures can better bind to the duplex groove or to the interface
pocket than to the quadruplex loops or top (stacked). Both planar
structures, telomestatin and TMPyP4, had larger conformational energies
for the G-quadruplex than either the interface or duplex, likely due
to their ability to stack on top of the quadruplex to further stabilize
the complex. These large conformational binding energies further support
the induced fit binding mechanism of the ligands to the quadruplex–duplex
DNA. Each of the four ligands varied in their respective charged state;
telomestatin had no charge (charge state of 0), TMPyP4 carried the
largest charge (+4), BSU6037 had a charge state of +2, and BRACO19
had a charge state of +3. The MM-GBSA binding energy data revealed
that BRACO19 exhibited the strongest electrostatic interactions (ΔEGBELE) with the hybrid quadruplex–duplex
DNA, indicating that a charge state of +3 is the best.
Table 2
MM-GBSA Binding Energy of Ligands
to the Quadruplex–Duplex DNAa,b,c,d,e,f
ligands (charge state)
binding
pose
ΔVDW
ΔSUR
ΔGBELE
ΔCONF
ΔTOT
ΔΔE
telomestatin (0)
G-quadruplex
–50.1 ± 2.0
–2.7 ± 0.1
18.3 ± 1.4
–35.9 ± 4.5
–70.4 ± 5.1
25.8
duplex
–41.4 ± 5.4
–2.9 ± 0.3
16.4 ± 2.8
–2.4 ± 3.5
–30.3 ± 5.0
65.9
TMPyP4 (+4)
interface
–68.2 ± 3.7
–5.1 ± 0.3
–11.4 ± 1.6
–11.5 ± 5.8
–96.2 ± 6.8
0.0
G-quadruplex
–51.8 ± 3.1
–3.4 ± 0.2
–12.6 ± 1.8
–15.1 ± 2.9
–82.9 ± 3.7
13.3
duplex
–68.4 ± 3.9
–4.8 ± 0.2
–8.0 ± 2.0
3.5 ± 6.3
–78.5 ± 6.8
17.7
BSU6037 (+2)
interface
–52.9 ± 5.6
–4.2 ± 0.6
–6.0 ± 4.0
–29.2 ± 0.8
–92.3 ± 5.6
3.9
G-quadruplex
–15.8 ± 6.7
–1.6 ± 0.7
–8.9 ± 3.6
–33.2 ± 1.0
–59.5 ± 6.8
36.7
duplex
–52.9 ± 3.9
–4.7 ± 0.3
–7.3 ± 2.8
–20.0 ± 0.3
–90.6 ± 4.5
5.6
BRACO19 (+3)
interface
–37.7 ± 3.1
–4.4 ± 0.3
–19.1 ± 3.3
–26.1 ± 2.8
–87.3 ± 3.4
8.9
G-quadruplex
–44.8 ± 3.5
–3.3 ± 0.4
–10.7 ± 3.2
–24.5 ± 5.9
–77.7 ± 6.1
18.5
duplex
–42.2 ± 8.9
–4.4 ± 0.7
–14.5 ± 5.3
–23.2 ± 4.3
–84.3 ± 9.0
11.9
ΔVDW = change of
VDW energy in gas phase upon complex formation (unit: kcal/mol).
ΔSUR = change
of
energy due to surface area change upon complex formation (unit: kcal/mol).
ΔEBELE = change of GB reaction field energy + gas phase elec.
energy upon
complex formation (unit: kcal/mol).
ΔCONF = change
of conformational energy upon complex formation (unit: kcal/mol).
ΔTOT = ΔVDW +ΔSUR + ΔEBELE + ΔCONF change of potential energy in
water upon complex formation (unit: kcal/mol).
ΔΔE = ΔTOT– (−96.2).
ΔVDW = change of
VDW energy in gas phase upon complex formation (unit: kcal/mol).ΔSUR = change
of
energy due to surface area change upon complex formation (unit: kcal/mol).ΔEBELE = change of GB reaction field energy + gas phase elec.
energy upon
complex formation (unit: kcal/mol).ΔCONF = change
of conformational energy upon complex formation (unit: kcal/mol).ΔTOT = ΔVDW +ΔSUR + ΔEBELE + ΔCONF change of potential energy in
water upon complex formation (unit: kcal/mol).ΔΔE = ΔTOT– (−96.2).Statistics of the interface interacting
trajectories are presented
in Table . This table
provides insight on the overall binding pathways for the trajectories
that maintained a final interface interaction binding pose. Some trajectories
were not included from this analysis, despite making contact with
the interface at some point during the trajectory. For TMPyP4 runs
3 (Figure S22) and 5 (Figure ), the initial contact with
the DNA is with a residue on a side loop of the quadruplex. In both
cases, within 40 ns after the initial contact, the ligand relocates
closer to the interface where its aromatic side chains first make
contact with interface residues. The ligand makes minor adjustments
to maximize interactions with the interface but remains in the same
binding mode for the remainder of the trajectory. BSU6037 system runs
7 (Figure S25) and 19 (Figure ) make initial contact with
the loop residues on the quadruplex region before interacting with
residues of the duplex region. In both trajectories, the ligand proceeds
to interact with terminal residue of the interface, which induces
conformational changes to the DNA, moving the ligand inside of the
binding pocket, increasing its interaction with the interface until
ligand stacking is achieved. BRACO19 made initial contact with the
duplex in runs 6 (Figure ), 8 (Figure S28), and 10 (Figure S29). After initial contact, the side
chains of BRACO19 made contact with the interface while still bound
to the duplex until the final pose was achieved.
Table 3
Statistics of the Interface Interacting
Trajectoriesa
system
total interface
interactions
quadruplex
to interface
duplex to
interface
quadruplex + duplex to interface
TMPyP4
2
2
BSU6037
2
2
BRACO19
3
3
Provided is the overall binding
pathway taken by each ligand in its interface interacting mode.
Figure 6
Representative trajectory
of the interface interacting mode of
TMPyP4 (run 5) and the order parameter plot, illustrating the breaking
and reforming of hydrogen bonds (at the quadruplex (1), interface
(2), and duplex (3)), drug-base dihedral angle (4), ligand (black)/DNA
(red) RMSD (Å) with reference to the final structure (5), ligand
center-to-DNA center distance (black) and K+-to-K+ distance (red) (6), and the MM-GBSA binding energy (ΔE in kcal/mol) (7). 5′ and 3′ of the DNA chain
are indicated by a red and blue ball, respectively. K+ ions
are indicated by yellow balls.
Figure 9
Suggested
modifications for telomestatin and TMPyP4.
Provided is the overall binding
pathway taken by each ligand in its interface interacting mode.
To Characterize the Binding Pathway of the
Two/Three Modes of
Each Ligand, Seven Order Parameters Were Calculated for Representative
Trajectories, As Described in the Methods Section
(Figures , Figures –8, and S20–S31)
The plots are included in the supporting document for telomestatin
(the quadruplex binding mode (Figure S20) and the duplex binding mode (Figure S21)), TMPyP4 (the interface binding mode (Figure S22), the G-quadruplex binding mode (Figure S23), and the duplex binding mode (Figure S24)), BSU6037 (the interface binding mode (Figure S25), the G-quadruplex binding mode (Figure S26), and the duplex binding mode (Figure S27)), and BRACO19 (the interface binding mode (Figures S28 and S29), the G-quadruplex binding
mode (Figure S30), and the duplex binding
mode (Figure S31)). A representative trajectory
for each ligand (the quadruplex binding mode for telomestatin and
the interface interacting mode for TMPyP4, BSU6037, and BRACO19) was
selected to be extended from 500 to 2000 ns for checking the stability
of these binding modes. The information gathered from the 2000 ns
simulations is reported in Figures –8. The occupancy time for the final stable binding mode was identified
from the RMSD plot with reference to the final snapshot: ∼1800
ns for telomestatin, ∼1750 ns for TMPyP4, ∼1200 ns for
BSU6037, and ∼1200 ns for BRACO19.
Figure 8
BRACO19, run 06, trajectory snapshots and the order parameter
plot,
illustrating the breaking and reforming of hydrogen bonds (at the
quadruplex (1), interface (2), and duplex (3)), drug-base dihedral
angle (4), ligand (black)/DNA (red) RMSD (Å) with reference to
the final structure (5), ligand center-to-DNA center distance (black)
and K+-to-K+ distance (red) (6), and the MM-GBSA
binding energy (ΔE in kcal/mol) (7). 5′
and 3′ of the DNA chain are indicated by a red and blue ball,
respectively. K+ ions are indicated by yellow balls.
Representative trajectory
of the top stacking mode of telomestatin
(run 9) and the order parameter plot, illustrating the breaking and
reforming of hydrogen bonds (at the quadruplex (1), interface (2),
and duplex (3)), drug-base dihedral angle (4), ligand (black)/DNA
(red) RMSD (Å) with reference to the final structure (5), ligand
center-to-DNA center distance (black) and K+-to-K+ distance (red) (6), and the MM-GBSA binding energy (ΔE in kcal/mol) (7). 5′ and 3′ of the DNA chain
are indicated by a red and blue ball, respectively. K+ ions
are indicated by yellow balls.
Top Stacking Binding Mode of Telomestatin Provides Support for
the Selectivity of This Ligand to the G-Quadruplex DNA over Duplex
DNA
Telomestatin was unable to bind to or interact with the
interface in any of the simulation runs, but it did bind very selectively
to the top of the quadruplex over the duplex, which is qualitatively
consistent with the experimentally observed 70-fold selectivity (Table S1). In the representative trajectory (Figure ), telomestatin initially
contacted the DNA at 14 ns, binding to the side loop of the quadruplex.
It released from this position at 67 ns and randomly searched for
another binding pose until 105 ns, where it slightly moved up and
rebound to the quadruplex side loop. It moved out and stacked on top
of the quadruplex at 165 ns and remained in this top stacking binding
mode for the remaining time. No significant changes in the number
of hydrogen bonds were observed in either the G-quadruplex, duplex,
or interface, demonstrating that it was not an induced fit binding
mechanism. The two potassium cations maintained their position and
distance throughout the trajectory. The MM-GBSA calculated binding
energy revealed that telomestatin top stacking to the quadruplex (∼−70
kcal/mol) was more favorable than binding to the side loop of the
quadruplex (∼−30 kcal/mol), which can help explain why
this interconversion occurred.
Binding Mechanism of the
Ligands to the Interface Seems to Be
via an Induced Fit Binding Mechanism Rather Than a “Lock–Key”
Binding Mechanism
The interface interaction modes for TMPyP4,
BSU6037, and BRACO19 are presented in Figures –8. The pattern changes in the number of hydrogen
bonds for the interaction of the three ligands at the interface are
consistent with the trajectories (Figures –8). A decrease in the number of hydrogen bonds among the base triad
(red) and base pair (green) was observed during initial ligand–interface
interaction. The hydrogen bonds for the A7–T19 base pair were
completely disrupted for each interface interaction. The energetic
cost of breaking these hydrogen bonds was likely compensated by the
favorable ligand–interface interaction energy gain. During
the interactions, a portion of the ligands were also extended to group
bind to the sides of the DNA backbone adjacent to the interface. The
stacking between the bases and the interacting portion of the ligands
in addition to the group binding between the extended ligand portion
and the DNA is likely further stabilizing the complex.Representative trajectory
of the interface interacting mode of
TMPyP4 (run 5) and the order parameter plot, illustrating the breaking
and reforming of hydrogen bonds (at the quadruplex (1), interface
(2), and duplex (3)), drug-base dihedral angle (4), ligand (black)/DNA
(red) RMSD (Å) with reference to the final structure (5), ligand
center-to-DNA center distance (black) and K+-to-K+ distance (red) (6), and the MM-GBSA binding energy (ΔE in kcal/mol) (7). 5′ and 3′ of the DNA chain
are indicated by a red and blue ball, respectively. K+ ions
are indicated by yellow balls.
TMPyP4 Interface Interaction Follows a Base “Flip-Insertion”
Binding Mechanism
TMPyP4 demonstrates a subtle induced fit
binding mechanism. In the representative trajectory (Figure ), TMPyP4 made initial contact
with the DNA at the side loop of the quadruplex at 4 ns. At 37 ns,
TMPyP4 shifted down toward the top of the duplex, with a region of
the ligand making contact with the A7 and T19 base pair of the interface,
breaking the two hydrogen bonds shared between them. By examining
the trajectory, we discovered that the initial interface contact is
concurrent with the flipping out of base T19, creating a binding pocket
to allow entrance of the ligand into the interface region. One hydrogen
bond between the base triad (T8, A18, and T17) also gets disrupted
by this initial interface contact. TMPyP4 moves further into the interface
at 255 ns, completely disrupting all hydrogen bonds within the base
triad. These bases that made up the base triad in the interface flip
out and bond to the aromatic regions of TMPyP4, such as the methylpiperidine
side chains positioned around the central aromatic ring, rather than
to each other. After this interaction, the top layer of the quadruplex
has an increase in the number of hydrogen bonds but it also exhibits
a higher fluctuation in them; these large fluctuations indicate that
when TMPyP4 moves further into the interface and maintains a partial
intercalation, it does not stabilize the G-quadruplex as much as the
initial interface interaction does. The middle layer of the quadruplex
experiences a loss of a hydrogen bond during this same period. No
significant change in hydrogen bonds is present within the duplex,
as the conformation of this region does not undergo any major changes.
The top potassium ion moved out at 339 ns, whereas the bottom potassium
ion remained stable throughout the course of the trajectory.
Interface
Intercalation of BSU6037 Requires the DNA To Open
Up at the Interface Region To Provide the Ligand Access to the Binding
Site
The binding mechanism of BSU6037 to the interface is
particularly interesting as it clearly demonstrates an induced fit
binding mechanism (Figure ). In the representative trajectory, BSU6037
initially binds to the side of the duplex until 82 ns where it makes
the first contact with the interface residues. BSU6037 binds to terminal
residue A8 that flips into the core of the interface, moving BSU6037
closer into the interface where it is able to interact with interface
residue T19 where it remains until 300 ns. At 301 ns, the duplex begins
to unravel until 483 ns where it is approximately 90° from the
starting position. At 458 ns, the top potassium cation leaves the
quadruplex. With BSU6037 in the interface, the duplex reforms itself
until complete at 584 ns where BSU6037 remains stacking in the interface
for the remainder of the trajectory. Under further examination, it
was discovered that when the BSU6037 ligand bonded to the interface
of the DNA, it forced the bonds in the base pair and base triad to
break, unfolding and opening the DNA up, which allowed the ligand
into the binding pocket. The binding energy of the interface (−92.3
± 5.6 kcal/mol) was much larger than that of the G-quadruplex
(−59.5 ± 6.8 kcal/mol), which could explain why after
BSU6037 made an initial contact with the quadruplex it inserted itself
into the interface instead. These conformational changes that take
place during binding demonstrate the induced fit binding mechanism
that the ligand (BSU6037) produces.BSU6037, run 19, trajectory snapshots
and the order parameter plot,
illustrating the breaking and reforming of hydrogen bonds (at the
quadruplex (1), interface (2), and duplex (3)), drug-base dihedral
angle (4), ligand (black)/DNA (red) RMSD (Å) with reference to
the final structure (5), ligand center-to-DNA center distance (black)
and K+-to-K+ distance (red) (6), and the MM-GBSA
binding energy (ΔE in kcal/mol) (7). 5′
and 3′ of the DNA chain are indicated by a red and blue ball,
respectively. K+ ions are indicated by yellow balls.
BRACO19 Also Interacted
with the Interface Region
BRACO19
demonstrates another subtle induced fit binding mechanism. In the
representative trajectory (Figure ), BRACO19 bound to the major
groove of the duplex part at 3 ns where the N,N-dimethylcyclohexylamine side chain made the initial contact
with the side of the quadruplex where it remained until 107 ns. At
107 ns, a side chain of BRACO19 began making contact with the interface
residues T17, A18, and T19. This side chain binding to the interface
region forced base A7 to flip out, eliminating the hydrogen bonds
formed by the A7–T19 base pair so that the drug moved up and
interacted with the interface region until 767 ns. At 767 ns, the
interface region bends open, allowing the N,N-dimethylcyclohexylamine side chain of BRACO19 to reposition
slightly deeper into the interface region. BRACO19 maintained this
interaction with the interface region throughout the simulation; the
G-quadruplex part was stable through the trajectory. The top potassium
cation moved out of the quadruplex at 1636 ns. MM-GBSA calculations
determined that the binding energy fluctuated between −80 and
−90 kcal/mol.BRACO19, run 06, trajectory snapshots and the order parameter
plot,
illustrating the breaking and reforming of hydrogen bonds (at the
quadruplex (1), interface (2), and duplex (3)), drug-base dihedral
angle (4), ligand (black)/DNA (red) RMSD (Å) with reference to
the final structure (5), ligand center-to-DNA center distance (black)
and K+-to-K+ distance (red) (6), and the MM-GBSA
binding energy (ΔE in kcal/mol) (7). 5′
and 3′ of the DNA chain are indicated by a red and blue ball,
respectively. K+ ions are indicated by yellow balls.
Discussion
G-quadruplex
DNA has become an important therapeutic target recently
as it has been found to be over-represented in a wide variety of cancers.[84] Many ligands targeting G-quadruplexes have been
developed; however, selectivity to a specific G-quadruplex over DNA
duplex is yet to be achieved at a level acceptable for therapeutic
use. Recently, the junction between the G-quadruplex and duplex hybrid
DNA (PDB: 5DWW) in the 3′ end of telomeric overhang was targeted as a promising
opportunity to improve upon G-quadruplex selectivity and ligand potency.[46] Yet, there are no detailed molecular interactions
or high-resolution complex structures reported for most of the ligands
except for BSU6037,[46] which has been docked
to the crystal structure of the G-quadruplex–duplex hybrid.This study was done to gain molecular insights into the binding
of telomestatin, TMPyP4, BSU6037, and BRACO19 to the quadruplex–duplex
DNA by using binding molecular dynamic simulations with a free ligand.
The free ligand simulation allowed for more dynamics and flexibility
within both the DNA and ligand, which in turn showed that ligand interaction
with the interface was by means of an induced fit binding mechanism
rather than a lock–key mechanism. The MM-GBSA binding energy
calculations (Table ) showed that the ligands binding to the DNA induced a large change
in each complex’s conformational energy, which was a major
contributor to the high total energies. Large conformational binding
energies are concurrent with an induced fit binding mechanism occurring.
The DNA subtly opening up at the interface region makes room for the
drug interactions (Figures –8). Once the insertion
is completed, the interface closes and the complex is stabilized.
Energetically, the loss of hydrogen bonds at the interface is compensated
by the initial ligand–DNA interaction and final π–π
interactions once the induced fit mechanism is completed. Of course,
more experimental evidence is required to support this interface open-close
mechanism.Out of multiple major binding poses for each system,
our MM-GBSA
binding energy calculations indicate that interface interactions were
the most stable binding pose. The large magnitude of the binding energy
in combination with the long time (2000 ns) stability suggests that
the binding is likely enthalpically driven, with a minor contribution
in regard to the entropic binding energy. The binding energy decomposition
analysis further suggests that the VDW and CONF terms make the largest
contributions to the total binding energy. This indicates that drug
specific packing optimization is a possible way to further stabilize
G-quadruplexes.MM-GBSA binding energy calculations indicate
that the most stable
binding pose for telomestatin was top stacking to the G-quadruplex;
its large uncharged planar structure was better suited to stacking
on top of the G-quadruplex, as top stacking results in the most favorable
packing interaction. However, telomestatin did not bind to or interact
with the interface. Because the other three charged ligands were found
to be able to interact with the interface, it is speculated that lack
of charge for telomestatin might hinder its ability to interact with
the DNA backbone for opening the binding pocket and interacting with
the interface. If this is true, the findings of this work suggest
that to allow telomestatin to bind at the interface region and maintain
its high selectivity over duplex DNA, the planar scaffold needs to
incorporate some small charged side chains (Figure top). These charged side chains can disrupt the DNA backbone
interactions, thereby destabilizing the DNA scaffold enough to allow
a binding pocket to open up and permit ligand insertion into the interface.Suggested
modifications for telomestatin and TMPyP4.To test this, induced fit docking was performed on the original
ligand, telomestatin, as well as a modified ligand containing two
charged side chains (Figure S32a–d). The results of the induced fit docking revealed there was a −3.1
kcal/mol difference between the original ligand (−7 kcal/mol)
and the modified ligand (−10.1 kcal/mol) binding energies.
This supports the proposed modification that adding a small charged
side chain would increase the ability of the ligand to insert into
the interface. Nonetheless, further simulation and experimentation
are required to verify the proposed modification.TMPyP4 exhibited
the highest interface MM-GBSA binding energy (−96.2
kcal/mol), which was the strongest binding energy of all binding modes
among the four ligands. In fact, the MM-GBSA binding energy shows
that the interface interaction mode is significantly more stable for
TMPyP4 than the quadruplex and duplex binding modes by 13.3 and 17.7
kcal/mol, respectively, indicating that the interface mode may significantly
contribute to the higher affinity as well as a higher selectivity
to the 3′ telomeric quadruplex–duplex junction. However,
TMPyP4 has a preference to bind to diad base over quartet due to its
size and structure and it only interacts with the interface partially
when the interface is completely distorted. It is clear that the size
and charge of TMPyP4 hindered the ligands ability to fully intercalate
into the interface so that the central chromophore could interact
with both the bottom quartet of the G4 as well as the top diad of
the duplex inside of the interface simultaneously; this was also the
case for BRACO19. Therefore, it is speculated that the methyl groups
of TMPyP4 should be removed from the aromatic side chains to increase
stability and allow the ligand to move deeper into the interface to
increase ligand–DNA interactions. The proposed structure (Figure bottom) is designed
to keep TMPyP4 intact and to increase packing interactions with a
deeper insertion of the drug into to the interface (Figure S32E–H). This could potentially create more
potent and selective ligands to bind to the interface of the G-quadruplex–duplex
hybrid DNA. As a result of induced fit docking, the proposed structure
had a docking score of −12.81 kcal/mol, which is 0.37 kcal/mol
stronger than that of the original structure. The proposed TMPyP4
analog showed an increase in its ability to remain planar and increase
the interactions with the interface (Figure 32H). This supports the proposed modification to the original ligand
to increase the ligands ability to move deeper into the interface.
Nonetheless, further simulation and experimentation are required to
verify the proposed modification.The MM-GBSA binding energy
analysis revealed that BSU6037 bound
to the interface and duplex at comparable binding energies (∼−91.5
kcal/mol), which was about 30 kcal/mol stronger than binding to the
quadruplex part. Similarly, BRACO19 interacted with the interface
and bound at the duplex at comparable binding energies (∼−86.6
kcal/mol), which was around 10 kcal/mol stronger than binding to the
quadruplex. Not only do these ligands lack selectivity over duplex
DNA, they also bind to the interface weaker than TMPyP4 between 3.9
and 8.9 kcal/mol. Interestingly, despite BSU6037’s ability
to intercalate into the interface, the partial interaction between
TMPyP4 and the interface remains more energetically favorable. These
findings indicate that linear structures, even those that contain
charged side chains, lack the selectivity and potency needed to be
a viable G-quadruplex stabilizer. Therefore, future ligand development
should focus on charged aromatic scaffolds that optimize packing and
π–π interactions within the interface and include
charged side chains that destabilize the DNA backbone to allow the
ligand entrance to the interface binding pocket.The free binding
simulations also revealed that BSU6037 had a different
binding pose than the one described by Krauss et al. in 2016. Krauss
et al. found that a pseudo binding site for BSU6037 could be constructed
at the interface region by disrupting T17 of the TAT base triad, which
consists of bases T17(A)–A18(A)–T8(B), to expose the
surface of the bottom G-quartet; specifically, they thought that if
T17 of the base triad flipped out, the ligand would have access to
the bottom of the quadruplex.[46] When they
docked BSU6037 into this pseudo binding site, the ligand fit directly
below the bottom G-quartet, with the ligand oriented so that it laid
in-plane with the A7–T19 base pair. However, the free ligand
molecular dynamic simulations in this work did not show BSU6037 binding
in this position. BSU6037 destabilized DNA at the interface region,
causing bases T7 and A8 to break apart from their respective partners
and flip out, which in turn unfolded the DNA at the interface. BSU6037
fit into the opened binding pocket, and the DNA refolded around it
(Figure ). Furthermore,
the simulations resulted in BSU6037 experiencing slightly different
interactions then what was observed in the docked experiments. In
the previous docking experiment, the central planar regions of BSU6037
exhibited π–π stacking interactions with bases
G11, G16, and T17, the center protonated nitrogen of BSU6037 had a
hydrogen bond with the N1 atom of base T19, the side chain amidenitrogen
had a hydrogen bond with the carbonyl oxygen atom (O4) of base A8,
and the other side-chain amidenitrogen had a hydrogen bond with the
N3 atom of base A18.[46] In this work, π–π
stacking interactions with the central planar regions of BSU6037 and
bases G11 and A8 on the 3′ end of the shorter DNA strand were
observed, as well as a hydrogen bond was formed between the side-chain
amidenitrogen and base T17. This change in ligand–DNA interactions
is likely brought on by the different binding location of BSU6037.Coordination of O6 carbonyls by cations within the ion pore is
known to be a major factor in stabilizing the quadruplex structure.
Successful stabilization from the cations has reportedly maintained
an average interquartet distance of 3.3 Å between the bipyramidal
antiprism that is formed.[83] Zheng and co-workers[85] have shown that single-strand G-rich sequence
can fold into a G-quadruplex with very low K+-ion concentration
(0.01–01 mM), indicating strong binding of K+. To
look into this feature, distance calculations were performed to define
some parameters of the K+ cations relative to the surrounding
G4 DNA. Analyzing the oxygen–potassium and oxygen–oxygen
distance calculations (Figure S1; Tables S2, S3, and S6) revealed that each residue was able to maintain an
average oxygen–oxygen distance close to 3.3 Å during the
simulations. The similar mean values and very low standard deviations
for both calculations indicate that the simulation parameters for
the K+ cations are appropriately set despite the K+ cation moving out of the ion pore in some ligand binding
trajectories. Because none of the K+ moved out in three
2.0 μs simulations of the DNA-only system, the ligand binding
might lead to the increase of the dynamic fluctuation of the quadruplex
and the K+-ion pore. In fact, the trajectory analysis indicates
that the moving out of a K+ ion occurred at the induced
fit binding process with a large deformation of the DNA structure.
It is suspected that further extending the simulation period may show
the reversible binding of the K+ cation into the ion pore
as the DNA structure is stabilized from the ligand binding. In future
work, a more detailed analysis can be performed to elucidate the dynamic
properties of the reversible binding of K+ cations.Literature suggested the H-rise and H-twist geometry parameters
were the most important parameters to analyze the overall helical
structure. Despite a number of works defining duplex DNA, there has
been little research into the helical rise and rotation of the quadruplex
structure from the simulations. This research was able to identify
average values for the H-rise (2.4 Å) and H-twist (20 Å)
of the bases within the three quartets of the DNA G-quadruplex, as
well as reproduce an average value for the rise parameter (3.5 Å).
A common feature defining helical B-DNA is maintaining average values
of 3.32 Å and 34.3° for H-rise and H-twist, respectively.
The results of the H-rise and H-twist helical geometry parameter analysis
revealed comparable values for each of the systems being analyzed,
providing quantitative support that the helical structure was maintained
throughout the simulations.In addition to the interface interacting
mode, a top stacking mode
on the parallel G-quadruplex part for TMPyP4 and BRACO19 was also
observed (Figure ).
Interestingly, their complex crystal structure with a two-stranded
parallel human telomeric G-quadruplex are available in the PDB databank.[32,86] (Figure ). Although
the number of strands forming the parallel G-quadruplex is different,
the binding mode from the simulations shows a good similarity to the
crystal structure. This supports the simulation methodology in this
study.
Figure 10
Comparison of the G-quadruplex end binding mode from the ligand
binding simulations and from the crystallography studies. 5′
and 3′ are represented by red and blue balls, respectively.
Potassium cations are represented as yellow balls.
Comparison of the G-quadruplex end binding mode from the ligand
binding simulations and from the crystallography studies. 5′
and 3′ are represented by red and blue balls, respectively.
Potassium cations are represented as yellow balls.
Conclusions
Computational methods
have become essential in drug discovery as
they provide the detailed structural information that experimental
results may lack. Although many G-quadruplex ligands have been designed,
they still lack the potency and selectivity to a specific G-quadruplex
over duplex DNA that is required for therapeutic use. A promising
opportunity to improve upon both selectivity and potency is by targeting
the recently solved junction between G-quadruplex–duplex hybrid
DNA in the 3′ end of telomeric overhang. In this study, free
ligand molecular dynamic simulations were utilized in addition to
MM-GBSA binding energy calculations to evaluate the binding modes
and mechanisms of telomestatin, TMPyP4, BSU6037, and BRACO19, which
have had little research previously done. Three different major binding
modes were identified: interface interacting, quadruplex side binding,
and duplex side binding. The MM-GBSA binding energy analysis revealed
that of the three binding modes, the interactions with the interface
region of the quadruplex–duplex hybrid DNA would result in
the most stable structure. The MM-GBSA binding energy analysis also
revealed that TMPyP4 has the strongest binding energy to the interface
region. The trajectory analysis revealed that TMPyP4, BSU6037, and
BRACO19 were able to interact with the interface via an induced-fit
mechanism, consistent with subtly opening near the interface and repacking
after ligand insertion. However, BSU6037 was the only ligand able
to remain fully intercalated into the interface region. Both linear
ligands BSU6037 and BRACO19 were not selective over duplex DNA. The
results of the induced fit docking suggest that removal of the methyl
groups from the aromatic side chains of TMPyP4 could allow the central
ring to interact deeper within the interface region; the docking also
suggests that adding some small charged side chains to telomestatin’s
aromatic scaffold may allow this ligand to interact with the interface
region. With these findings, it is believed that further research
in increasing the binding stability could result in promising telomeric
G-quadruplex–duplex stabilizers.
Methods
Simulation
System
Three types of systems were constructed:
a DNA-only, a ligand-only, and four DNA + ligand (Table ). The unbound DNA–ligand
system was constructed using the X-ray crystallography solved human
telomeric G-quadruplex–duplex structure (PDB ID: 5DWW(46)), including the K+ ions inside the G-quadruplex
and a ligand molecule that was at least 10 Å away from the DNA.
The 10 Å distance makes sure that there are at least three layers
of water molecules separating the ligand from the DNA; therefore,
there is not much interaction between the ligand and the DNA (i.e.,
an unbound state). This setup allowed the probing of the binding pathways
and binding mechanisms. This unbound system (DNA plus ligand) was
solvated in a water box of truncated octahedron with 10 Å water
buffer plus K+ as counter ions to neutralize the system,
leading to ∼120 mM of K+ cation concentration. The
latest version of the AMBER DNA OL15 ff99 (i.e., parm99bsc0[87] + χOL4[88] + ε/ζOL1[89] + βOL1[90] updates) was applied to represent
the DNA, TIP3P model[91] was used to represent
water, and the recently developed K+ model by Cheatham
group was used for the system.[92] The partial
charges for each ligand molecule were obtained using standard AMBER
protocol: the electrostatic potential of the ligand molecule was obtained
at the HF/6-31G* level after geometry optimization at the same level;
the electrostatic potential using the restrained electrostatic potential
method determined the partial charges,[93] and other force-field parameters were taken from the AMBER GAFF2[94] force field. The ligand force field in Mol2
format can be found in the supporting document (Figure S33). These AMBER force fields are the most common
ones used in nucleic acid simulations including G-quadruplexes.[80,82,88,95−99]
Simulation Protocols
Three runs for the DNA-only, five
runs for the ligand-only, and the 20 simulation runs for each unbound
DNA–ligand system (Table ) were conducted using the GPU version of the AMBER
14 simulation package.[100] The detailed
protocol followed this groups’ early studies.[101,102] Each unbound DNA–ligand system underwent a 500 ps prerun
at 500 K to ensure the position and orientation of the free ligand
was randomized before a production run at 300 K; during this prerun,
the receptors’ position remained fixed. After the energy minimization
for the initial structure of a system, 3, 5, or 20 independent runs
were performed by assigning different initial velocities to the atoms
of the system; this was followed by a Maxwell–Boltzmann distribution
using a different random seed. The multiple independent runs for the
DNA–ligand system allow for a better sampling of binding poses
and pathway. A production run of 500 ns at 300 K included a short
1.0 ns molecular dynamics run in the NPT ensemble mode (constant pressure
and temperature) to equilibrate the system density in which the DNA
and ligand were subjected to Cartesian restraints (1.0 kcal/mol Å)
and 499.0 ns dynamics in the equivalent NVT ensemble mode (constant
volume and temperature) (Table ). SHAKE[103] was applied to constrain
all bonds connecting hydrogen atoms, enabling a 2.0 fs time step in
the simulations. The particle-mesh Ewald method[104] was used to treat long-range electrostatic interactions
under periodic boundary conditions (charge grid spacing of ∼1.0
Å, the fourth order of the B-spline charge interpolation, and
direct sum tolerance of 10–5). The cutoff distance
for short-range nonbonded interactions was 10 Å, with the long-range
van der Waals interactions based on a uniform density approximation.
To reduce the computation, nonbonded forces were calculated using
a two-stage RESPA approach[105] where the
short-range forces were updated every step and the long-range forces
were updated every two steps. Temperature was controlled using the
Langevin thermostat with a coupling constant of 2.0 ps. The trajectories
were saved at 50.0 ps intervals for analysis.
Convergence of Simulations
The root-mean-square deviation
(RMSD) of DNA/ligand backbone heavy was calculated against the starting
structure. The flat and small average RMSDs (Figures S3 and S4) over all runs indicate that these systems were stable.
Atom contacts between the DNA fragment and the drug molecule were
calculated using an atom-to-atom distance cutoff of 3.0 Å. The
simulation systems reached a steady state, as indicated by the stable
average contact number (Figure S5) over
all runs. A stable complex was defined as a complex with the number
of atom contacts between the DNA fragment and the drug molecule greater
than 20. The last snap shots of the apo form are shown (Figure S6).The free drugs bound to different
sites, as shown in the last snapshots for the 20 runs (Figures S8–S11), indicate a good sampling
of binding sites.
K+ Position
Distances
were calculated to
determine the position of the K+ cation relative to the
surrounding DNA G-quadraplex (Figure S1 and Table S6) because the electrostatic interactions between the K+ cation and the partially negative oxygen atom on each residue
of the surrounding G-quartet are critical in stabilizing the G-quadruplex
structure. For the apo DNA form, the distance between the oxygen on
each residue and the nearest K+ cation (Figure S1c), as well as the distance between the oxygen of
each residue and the oxygen of the nearest residue (Figure S1d) was calculated. These oxygen–oxygen and
oxygen–potassium distances per residue were also calculated
for the four ligand binding simulations (Tables S2 and S3).
G-Quadruplex Parameters
To understand
inter-G4 layer
geometry, the rise, H-rise, and H-twist parameters were calculated.
The rise parameter was defined as the distance (Å) between the
centers of the guanine bases, excluding the sugar phosphate backbone,
of the lower G4 layer to the center of the guanine bases on the G4
layer above. The H-rise parameter calculated the projection of the
G4 layer rise with respect to the Z axis, which was
defined as the vertical axis through the two K+ cations
located inside the G4-ion channel. The H-twist parameter calculated
the rotational angle of the bases with respect to the aforementioned Z axis. The rotation in the H-twist parameter was calculated
from the bottom, meaning that the layer above would be used as a reference,
measuring the degree of clockwise rotation required of the lower layer
to align with the position of the reference layer (Table S6).
DNA Duplex Parameters
Various helical
parameters were
calculated using Curves+[106] to characterize
the base pairing in the helical portion of the DNA (Table S4). The base pair-axis parameters were used to quantify
the rigid-body transformation of the base pair reference frame and
the reference frame of the local helical axis. The base pair-axis
parameters were calculated by measuring the movement of the bases
toward the grooves (x-displacement), movement of
the bases perpendicular to the grooves (y-displacement),
rotation around the short axis of the base pairs (inclination), and
rotation around the long axis of the base pairs (tip). Intrabase pair
parameters were calculated using the rigid-body transformation from
one base reference system to its respective paired base reference
system. The three translational parameters calculated measured the
change in distance with respect to the vector pointing to the major
groove (shear), with respect to the vector pointing along the long
axis of the base pair (stretch), and with respect to the vector of
the base pair (stagger). Three rotational parameters were also calculated
that measured the rotation with respect to the vector pointing into
the major groove (buckle), pointing along the long axis of the base
pair (propel), and with respect to the vector normal to the base pair
(opening). Another set of parameters calculated was the interbase
pairing parameters that calculated the rigid-body transformations
between base pairing steps. The interbase pair parameters calculated
included three translational parameters (shift, slide, and rise) and
three translational parameters (tilt, roll, and twist), along with
two additional parameters that considered the rigid body movement
of each step relative to the overall helical axis (H-rise and H-twist).
DNA Backbone Parameters
Backbone torsional angles (α,
β, γ, δ, ε, ζ, χ, Pucker) (Figure S12) were calculated to further characterize
the conformational changes of the DNA in each system (Table S5). The standard backbone dihedral angles
(α, β, γ, δ, ε, and ζ) were measured
around the covalent bonds of the sugar ring and χ about the
glycosidic bond. The sugar puckering was calculated by determining
the position of the phosphorus atoms relative to the sugars.
Structural
Fluctuation of the DNA
For each system,
the root-mean-square fluctuations (RMSFs) of each individual residue
of DNA were calculated to characterize the local structural fluctuation
(Figure ).
Binding
Mode Identification
Because the DNA backbone
was relatively stable in the binding process, we aligned the DNA backbone
of the stable complexes from the trajectories by a least square fitting.
The aligned complexes were clustered into different structural families
on the basis of the 2 Å pair-wise RMSD cutoff of the drug molecule
only using the Daura algorithm,[107] in which
the number of neighboring structures was calculated for every structure
on the basis of the RMSD cutoff. The structure with the largest number
of neighbors together with its neighboring structures was removed
to form a structure family, and the process continued for the remaining
structures until all structures were assigned to a structural family
(Figure S13). The centroid structure (i.e.,
the structure having the largest number of neighbors in the structural
family) was used to represent the family. The centroid structures
of populated structural families (>1% of total structure population)
are shown in Figures S14–S17. On
the basis of visual inspection, the centroid structures were further
merged into superfamilies corresponding to major binding modes (interface,
quadruplex, and duplex binding).
Parameters for Characterizing
DNA–Drug Binding Pathway
Seven order parameters were
calculated to characterize the DNA–drug
binding process: hydrogen bond analysis at the quadruplex, interface,
and duplex, drug-base dihedral angle, ligand RMSD (Å), center-to-center
distances (Å), and MM-GBSA binding energy (ΔE in kcal/mol). The distance cutoff between H-donor and H-acceptor
was 3.5 Å, and the donor-H-acceptor angle cutoff was 120°.
The hydrogen bonds for the G-quadruplex were calculated for the top
(red; G1, G5, G9, and G14), middle (green; G2, G6, G10, and G15),
and bottom (blue; G3, G7, G11, and G16) G-tetrads over the course
of the trajectories (Figure A). The hydrogen bonds for the interface were calculated for
the top/three-base layer (red, T8, T17, and A18) and bottom/two-base
layer (green, A7 and T19) over the course of the trajectories. The
hydrogen bonds for the duplex were calculated for the first (red,
C6, and G20), second (green; G5–C21), and third (blue; C4–G22)
base pairs below the interface, over the course of the trajectories.
The ligand-base dihedral angle was defined as the dihedral angle between
the plane of the DNA layer that is closest to the drug and the drug’s
ring plane. Atom contact number between the DNA and the drug molecule
was calculated using an atom-to-atom distance cutoff of 3.0 Å.
Ligand–DNA distance was defined as the distance from the DNA
center to the drug molecule center, indicated by a black line. Potassium-ion
distance was defined as the distance from the center to center of
the two potassium ions inside the G-quadruplex, indicated by a red
line.The molecular mechanics generalized born/surface area
(MM-GBSA)[108] module in the AMBER package
(GB1 model with Bondi radii set, salt of 0.2 M, and surface tension
of 0.0072 kcal/mol/Å2) was used to analyze the energetics
of the bound complexes to avoid the large energy fluctuation of explicit
solvent. A recent evaluation study by Case and co-worker has concluded
that realistic DNA modeling can be obtained with GB implicit solvation
methods and the GB1 model performs better than GB2-OBC1 and GB5-OBC2.[109] Note that the two K+ ions inside
the G-quadruplex–duplex DNA were kept in MM-GBSA calculations
because they were a part of the complex structure. The MM-GBSA binding
energy for a system was calculated from three simulations:[110] ligand only, DNA only, and DNA–ligand
complex using eq . It
has four components in the eq : van der Waals (VDW) interaction energy, hydrophobic interaction
energy (SUR), electrostatic interaction (GBELE), and the change of
the conformation energy for DNA and ligand. These terms were calculated
using eqs and 4.Absolute binding free-energy calculations
can be very useful in determining binding site selectivity in G-quadruplex
DNA, which is not easily obtained through experimental methods.[111] However, due to the high computation cost of
the absolute free-energy calculation, the relative free-energy calculation
with implicit solvent model can be useful for screening a large number
of ligands. A recent study has shown that GB models make good prediction
even for charged molecules when the relative solvation free energy
is considered.[112] Systematic benchmarking
studies up to 1864 crystal complexes have shown that relative MM-GBSA
binding energy is a powerful tool to rank ligand binding affinity.[113−117] This is also supported by our previous studies, in which the binding
energy of between anticancer drug doxorubicin and a B-DNA fragment
(d(CGATCG)2),[98] between anticancer drug
daunomycin and a TGGGT G-quadruplex DNA,[118] between RHPS4 and human telomeric G-quadruplexes/duplex,[110] between telomestatin and a hybrid telomeric
G-quadruplex,[45] and between BRACO19 and
a parallel-stranded telomeric G-quadruplex[119] were successfully assessed by this MM-GBSA protocol.
Authors: Yang Mei; Zhong Deng; Olga Vladimirova; Nitish Gulve; F Brad Johnson; William C Drosopoulos; Carl L Schildkraut; Paul M Lieberman Journal: Sci Rep Date: 2021-02-10 Impact factor: 4.996