The Structure and Function of DNA G-Quadruplexes.

Jochen Spiegel, Santosh Adhikari, Shankar Balasubramanian.

Abstract

Guanine-rich DNA sequences can fold into four-stranded, noncanonical secondary structures called G-quadruplexes (G4s). G4s were initially considered a structural curiosity, but recent evidence suggests their involvement in key genome functions such as transcription, replication, genome stability, and epigenetic regulation, together with numerous connections to cancer biology. Collectively, these advances have stimulated research probing G4 mechanisms and consequent opportunities for therapeutic intervention. Here, we provide a perspective on the structure and function of G4s with an emphasis on key molecules and methodological advances that enable the study of G4 structures in human cells. We also critically examine recent mechanistic insights into G4 biology and protein interaction partners and highlight opportunities for drug discovery.
© 2019 The Author(s).

Keywords:  DNA; G-quadruplex; G4; drug discovery; nucleic acids; secondary structure

Year:  2020        PMID: 32923997      PMCID: PMC7472594          DOI: 10.1016/j.trechm.2019.07.002

Source DB:  PubMed          Journal:  Trends Chem        ISSN: 2589-5974


Beyond the DNA Double Helix

A chemist’s perspective on the function of a molecule, or a system of molecules, is typically led by a consideration of how molecular structure dictates function. The most widely recognised DNA structure is that of the classical DNA double helix [1], which defines a structural basis for the genetic code via defined base-pairing. Yet, it is evident that DNA is structurally dynamic and capable of adopting alternative secondary structures. One such class of DNA secondary structure is the four-stranded G-quadruplex (G4). Herein, we discuss some of the key scientific history that has shaped our understanding of this structural motif, its probable functions in biology, and major unanswered questions that remain to be solved.

G4 Structure

The capacity for guanylic acid derivatives to self-aggregate was noted over a century ago [2]. Some 50 years later, fibre diffraction revealed that guanylic acids form four-stranded, right-handed helices, leading to a proposed model in which the strands are stabilised via Hoogsteen hydrogen-bonded guanines to form co-planar G-quartets 3, 4, 5. Subsequent biophysical studies using DNA oligonucleotides with sequences from immunoglobulin switching regions or telomeres (see Glossary) showed stable formation of G4 structures under near-physiological conditions in vitro 6, 7. Stacks of G-quartets are stabilised by cations centrally coordinated to O6 of the guanines with stabilising preference for monovalent cations in the order K+ > Na+ > Li+ (Figure 1A) [8]. G4s can be unimolecular or intermolecular and can adopt a wide diversity of topologies arising from different combinations of strand direction (Figure 1D–F), as well as length and loop composition 9, 10. Structural studies using X-ray crystallography and NMR spectroscopy have provided detailed insights into the structure of DNA G4s primarily based on the human telomeric repeat (Figure 1B,C) [11] or sequences derived from the promoter regions of certain human genes such as MYC or KIT 12, 13. Based on biophysical studies on different G4 structures, algorithms using sequence motifs such as G(≥3)-N(x)-G(≥3)-N(x)-G(≥3)-N(x)-G(≥3), that is, four runs of at least three guanines separated by loops of x nucleotides, were developed and deployed to predict putative G4 structures in genomic DNA 14, 15. Early models assumed loop lengths no longer than seven nucleotides and required four continuous stretches of Gs. Subsequently, several G4 structures with longer loop lengths and discontinuities in G-stretches causing bulges were observed 16, 17, leading to alternative predictive models [18]. The recent availability of large datasets on G4 formation has enabled the application of machine learning to predict G4 forming propensity [19]. 
Further considerations include the effects of molecular crowding [20] and DNA base modifications, such as cytosine methylation [21] and guanine oxidation [22], on the stability of G4 structures.
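The motif-based prediction described above can be illustrated with a minimal sketch. It assumes the early model stated in the text (four runs of at least three guanines, loops of one to seven nucleotides) and scans only the strand it is given. The function name and the regular expression are illustrative assumptions and are not taken from any of the cited algorithms.

```python
import re

# Early canonical model: four runs of >=3 guanines separated by
# loops of 1-7 nucleotides. The lookahead reports overlapping
# candidates instead of skipping past each match.
G4_MOTIF = re.compile(
    r"(?=(G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}))"
)

def find_putative_g4s(seq):
    """Return (start, matched_sequence) pairs for putative G4 motifs.

    Only the supplied strand is scanned; G4s on the complementary
    strand would require scanning the reverse complement as well.
    """
    seq = seq.upper()
    return [(m.start(1), m.group(1)) for m in G4_MOTIF.finditer(seq)]

# Four copies of the human telomeric repeat TTAGGG
print(find_putative_g4s("TTAGGGTTAGGGTTAGGGTTAGGG"))
# -> [(3, 'GGGTTAGGGTTAGGGTTAGGG')]
```

A pattern of this kind misses the longer loops and bulged G-tracts mentioned above, which is one reason alternative predictive models were later proposed.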
Figure 1

G-Quadruplex (G4) Structures.

(A) Structure of a G-quartet formed by the Hoogsteen hydrogen-bonded guanines and central cation (coloured green) coordinated to oxygen atoms. Crystal structure of human telomeric G4s (Protein Data Bank: 1KF1): (B) top view and (C) side view; backbone is represented by grey tube and the structures are colour-coded by atoms. Schematic representation of unimolecular G4s based on the strand direction: (D) parallel, (E) anti-parallel, and (F) hybrid with a bulge.

Small Molecules that Bind G4s

Replicative immortality is a hallmark of cancer and several studies have suggested that cancer cells achieve unlimited proliferation by protecting ends of chromosomes [23]. Telomerase is a reverse transcriptase enzyme that adds repeat segments to the 3′-end of telomeric DNA and is highly expressed in most cancers 23, 24. One way to inhibit the action of telomerase was achieved using small molecules that sequester telomeric ends as stable, liganded G4 structures, which are thought to render telomeric ends inaccessible for telomerase-mediated extension [25]. An X-ray structure of the small molecule daunomycin in complex with the G4 formed by four strands of d(TGGGGT) confirmed stacking of small molecules on a terminal G-quartet [26]. Several structures based on NMR spectroscopy and X-ray crystallography of small molecule-G4 complexes have since been reported [27]. The concept of targeting telomeric G4 structures was extended to G4s located in gene promoters. The cationic porphyrin TmPyP4, which binds to G4s in vitro (but also binds duplex DNA), was shown to inhibit transcription of the oncogene MYC by a mechanism proposed to involve a G4 target in the nuclease hypersensitivity element (NHE) in the MYC promoter [28]. Since then, a variety of different G4-targeted ligands have been described to modulate the expression of genes carrying a sequence capable of forming a G4 in their respective promoters. So far, few studies have investigated transcriptional changes on a genome-wide level [29]. More carefully designed controls will be needed to assess whether a particular G4 is in fact the main biological target or if changes in target gene expression are a result of the ligand binding to other genomic (G4 or non-G4) targets. 
The central hypothesis would be strengthened by more explicit evidence for G4 ligands actually engaging with G4 structures in the promoters of affected genes in cells, for instance, by employing methods that enable the genome-wide mapping of ligand binding sites in native chromatin 30, 31. To date, around 1000 small molecules targeting G4 structures have been reported in the G-Quadruplex Ligands Database (http://www.g4ldb.org/) [32], with some examples of widely used ligands shown in Figure 2B. Small molecule G4 binders generally have an aromatic surface for π-π stacking with G-tetrads, a positive charge or basic groups to bind to loops or grooves of the G4, and steric bulk to prevent intercalation with double-stranded DNA [33]. To improve G4 binding, the aromatic ring count, positive charge, and number of hydrogen bond donors generally exceed what would be optimal for a small molecule with good pharmacokinetic properties 34, 35. Contrary to the classical perspective, it is noteworthy that a G4 ligand that lacks traditional ‘drug-like’ properties has shown significant accumulation and efficacy in tumour xenografts of human cancers [29]. The X-ray crystal structure of the G4 ligand MM41 bound to a human telomeric G4 (Figure 2A) suggests that certain structural features of G4 ligands can exploit additional interactions in the groove regions of the G4 structure [36]. Interactions with the groove and with the backbone phosphates do not require a flat aromatic structure. Therefore, compounds with reduced planarity (or high fsp3) and interactions with the grooves and/or backbone phosphates may have merit for targeting G4s. Structure–activity relationship studies on G4 ligands, controlling physicochemical properties such as planarity (fsp3), polarity (total polar surface area), lipophilicity (LogD), and rotatable bonds would enable an optimum balance between G4 binding, solubility, and permeability. 
In the targeting of structured RNA elements, it has previously been realised that increased planarity and strong π-stacking lead to off-target activity and that such molecules are generally difficult to improve via further medicinal chemistry [37].
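As an illustration of the planarity descriptor mentioned above, fsp3 is commonly defined as the fraction of carbon atoms in a molecule that are sp3-hybridised. The sketch below assumes the carbon counts are already known. The function name is hypothetical, and the example molecules are simple hydrocarbons chosen for clarity rather than G4 ligands from the text.

```python
def fsp3(n_sp3_carbons, n_carbons):
    """Fraction of sp3-hybridised carbons (0 = fully flat, 1 = fully saturated)."""
    if n_carbons <= 0:
        raise ValueError("molecule must contain at least one carbon")
    if not 0 <= n_sp3_carbons <= n_carbons:
        raise ValueError("sp3 carbon count must lie between 0 and n_carbons")
    return n_sp3_carbons / n_carbons

print(fsp3(0, 6))  # benzene: no sp3 carbons -> 0.0
print(fsp3(6, 6))  # cyclohexane: all carbons sp3 -> 1.0
print(round(fsp3(1, 7), 3))  # toluene: one methyl carbon out of seven -> 0.143
```

In practice such descriptors are usually computed from a structure file with a cheminformatics toolkit. The point here is only that a higher value indicates a less planar scaffold.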
Figure 2

G-Quadruplex (G4) Ligands.

(A) Crystal structure of a naphthalene diimide, MM41 bound to an intramolecular human telomeric DNA G4, colour-coded by atoms; water molecules are shown as red spheres, MM41 carbon atoms are coloured green, surface of the G4 is coloured light grey (Protein Data Bank: 3UYH). (B) Structures of selected widely used G4 ligands.

G4 Detection and Mapping

Computational algorithms have predicted over 370 000 G4 sequence motifs in the human genome, with a general enrichment observed in regions associated with genome regulation, such as telomeres, promoters, and 5′ untranslated regions 14, 38. G4s were first detected in vivo using a G4 structure-specific antibody to stain G4s in the telomeres of ciliates, whereby telomeric G4 structure formation was observed to be dynamically controlled through protein interactions in a cell cycle-dependent manner 39, 40. Subsequently, G4s were visualised in human cells 41, 42, 43, 44 and cancer tissue [45]. Antibodies have been used to monitor the behaviour of G4 structures in human cell lines upon ligand treatment together with depletion of the G4 resolving helicase FANCJ 41, 42, revealing increased numbers of G4 foci in nuclei after pyridostatin treatment [41], or after telomestatin treatment and FANCJ depletion in chicken DT40 cells [42]. Telomeric BG4 foci colocalise with human telomerase, suggesting the enzyme is recruited to G4 structures (Figure 3A) [43].
Figure 3

Detection and Mapping of DNA G-Quadruplexes (G4s).

(A) Visualisation of G4s in fixed or live cells using structure-specific antibodies as well as labelled or intrinsically fluorescent G4 ligands. The number of detected G4 foci can be increased by small molecule treatment or helicase depletion. (B) High-throughput sequencing of G4s in human genomic DNA (G4-Seq). Two consecutive sequencing runs, under normal and G4 stabilising conditions, provide a reference map and detect G4-dependent polymerase stalling, respectively. (C) Chromatin immunoprecipitation employing antibodies against endogenous G4-binding proteins followed by next-generation sequencing (ChIP-seq). Genomic occupancy of endogenous proteins is used to infer putative G4 sites. (D) Treatment with G4 ligands induces G4-dependent DNA damage. ChIP-seq of DNA damage markers in combination with deep sequencing detects G4-associated regions. (E) Permanganate oxidation of nucleotides in transiently unwound regions traps the unpaired state, resulting in sensitivity to a single-strand specific nuclease. Computational prediction is then used to infer the type of underlying non-B-DNA structures based on sequence context. (F) G4-specific chromatin immunoprecipitation and next-generation sequencing (G4 ChIP-seq). A G4-specific antibody (e.g., BG4) is used to precipitate G4 structures directly from native chromatin and is identified by deep sequencing.

Besides antibodies, small molecules have also been used to detect G4s in cells. Early studies with radiolabelled G4 ligands also showed localisation at telomeres [46]. Subsequently, G4s were detected in human cells by treatment with alkyne functionalised G4 ligands followed by cell fixation and coupling to fluorophores via copper-catalysed azide-alkyne cycloaddition [47] or strain-promoted azide-alkyne cycloaddition [48]. 
Intrinsically fluorescent molecules that display different fluorescent emission or excitation maxima and fluorescent decay lifetimes upon binding to G4s have been developed for imaging live cells. The uptake of these molecules to the nucleus and a displacement by the G4 ligand pyridostatin have been monitored in living cells via fluorescence microscopy 49, 50. G4 specificity was further corroborated by colocalisation with the G4 structure-specific antibody BG4 in fixed cells [50]. The adaptation of next-generation sequencing has enabled the mapping of G4 structures in genomes. An in vitro reference map of G4s in purified human genomic DNA was obtained using differential G4-stabilising conditions (i.e., ligands or cations) to recognise G4-specific DNA polymerase stalling sites during next-generation whole genome sequencing (G4-seq, see Figure 3B) [51]. The G4-seq study identified more than 700 000 G4 sites in the human genome, exceeding some earlier predictions. Many noncanonical G4s were identified that comprised longer loops, as few as two G-tetrads or bulges caused by discontinuous G-tracts [16]. A recent study extended G4-seq to a variety of other organisms to generate G4 maps and reveal a strong potential for G4 formation in promoters that is particular to mammals (mouse, human), but mostly absent in the other organisms studied [52]. It is important to consider the effects of chromatin and all its associated proteins on G4 stability and formation, which are unaccounted for in maps generated by G4-seq or computational predictions. Efforts have been made to probe the native G4 landscape in vivo. G4 formation has been inferred from chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq) experiments that use antibodies against proteins that are known to bind G4s in vitro (Figure 3C). 
Enrichment of genomic regions comprising computationally predicted G4 structures was observed for α-thalassemia mental retardation X-linked (ATR-X) [53] and for the XPB and XPD helicases [54], yeast telomere binding protein RAP1-interacting factor 1 (Rif1) [55], and yeast PIF1 helicase [56]. Such studies are consistent with G4 formation in vivo and suggest that the function of the respective proteins is linked to being physically associated with G4s. The G4 stabilising ligand pyridostatin generates DNA double-strand breaks in cells. The sites of pyridostatin-induced strand breaks were determined by ChIP-sequencing of γH2AX, a phosphorylated protein that marks strand break sites, and were found to occur predominantly at predicted G4 regions of the cellular genome (Figure 3D) [47]. In a similar approach, ChIP-seq of RAD51, which provides a narrower ChIP-seq signal at DNA damage sites, identified ∼3000 genomic targets of the G4 ligand CX-5461 [57]. Genome-wide potassium permanganate-dependent nuclease footprinting, which identifies single-stranded, non-B DNA, was performed in mouse B cells and combined with computational analysis to discern the type and enrichment of different non-B DNA structures (Figure 3E) [58]. This approach revealed around 20 000 hypersensitivity sites featuring G4 motifs and suggested a transcription-dependent formation of the noncanonical DNA structures, when comparing resting and lipopolysaccharide-interleukin 4 activated B cells. The BG4 antibody that binds a wide range of G4 structural types with broad selectivity and high affinity was recently used to map endogenous G4 structures in fixed chromatin from human epidermal keratinocytes (NHEKs) and from spontaneously immortalised HaCaT keratinocytes (G4 ChIP-seq, Figure 3F) [59]. In this study, ∼10 000 G4s were uncovered in precancerous HaCaT cells, while only ∼1000 were detected in the ‘normal’ counterpart NHEK cells. 
This represents only 1% of the sites with the capability to form G4s as observed in G4-Seq [51], suggesting a strong influence of the local chromatin context and other associated proteins on G4 formation in cells. The majority of G4s were found in nucleosome-depleted chromatin regions and were enriched in regulatory regions. In addition, G4s were particularly enriched at promoters of highly transcribed genes [59]. Mapping G4 landscapes in three other cancer cell lines revealed only moderate overlap, indicating a strong cell-type specificity [60]. In an alternative ChIP-seq approach, the G4 antibody, D1, was directly expressed in human cervical carcinoma cells as a GFP-fusion protein. In agreement with the BG4 study, a majority of the G4 signals were found at transcription start sites, in introns and intergenic regions [44]. Notably, ca. 15% of G4s were observed in exons, which were not found by BG4 ChIP-seq [59]. This may reflect different specificities of both G4 antibodies, differences in cell-type specific G4 landscapes, or competition of the D1 antibody with endogenous G4-binding proteins to reveal masked G4 structures.
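The comparison between in vitro G4-seq sites and G4 ChIP-seq peaks discussed above amounts to asking what fraction of a reference set of genomic intervals is overlapped by at least one peak. The following is a minimal sketch under stated assumptions: intervals are half-open (chromosome, start, end) tuples, the coordinates are invented toy data, and the function name is hypothetical. Real analyses would normally use dedicated interval tools rather than this quadratic scan.

```python
def fraction_overlapped(reference_sites, peaks):
    """Fraction of reference intervals overlapped by at least one peak.

    Both arguments are lists of (chrom, start, end) tuples with
    half-open coordinates, i.e. start is included and end is excluded.
    """
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))

    if not reference_sites:
        return 0.0

    hits = 0
    for chrom, start, end in reference_sites:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(p_start < end and start < p_end
               for p_start, p_end in peaks_by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(reference_sites)

# Toy data only: four reference sites, one of which is covered by a peak.
reference = [("chr1", 100, 130), ("chr1", 500, 520),
             ("chr2", 50, 80), ("chr2", 900, 940)]
peaks = [("chr1", 90, 110), ("chr3", 50, 80)]
print(fraction_overlapped(reference, peaks))  # -> 0.25
```

With the approximate counts quoted in the text, ∼10 000 sites detected in chromatin out of more than 700 000 in vitro sites corresponds to a fraction on the order of 1%.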

Natural G4-Binding Proteins

The observation of G4s enriched at genome regulatory sites, and that G4 formation in cells is dynamic and dependent on cell type and state, suggests that the G4 landscape can be regulated by cellular proteins. Indeed, many natural proteins have been identified that interact with G4s (see G4 Interacting Proteins Database, http://bsbe.iiti.ac.in/bsbe/ipdb/) [61]. The identification of G4 associated proteins has mostly relied on affinity proteomics experiments employing RNA and DNA G4 oligomers as baits to isolate G4 interactors from cellular extracts 62, 63, 64, 65. Other approaches involve computational analysis of genomic protein binding sites to assess the enrichment of predicted G4 motifs within these binding sites 54, 66, or meta-analysis, in which shared structural features of G4-binding protein domains were compared to predict new putative G4 interactors 67, 68. In a recent study, a series of protein affinity pull-downs from nuclear lysates with oligonucleotide baits at different concentrations was combined with isobaric tandem-mass-tag labelling and mass spectrometry. This revealed apparent dissociation constants and binding profiles towards different G4 and transcription factor consensus sequences for hundreds of nuclear proteins in parallel [69]. To date, over a dozen specialised helicases have been identified that target DNA G4s with up to picomolar affinity (reviewed in [70]). Recent studies have provided structural and mechanistic insights into G4 recognition and unwinding. Single-molecule imaging studies revealed a common repetitive unfolding mechanism for the specialised helicases DHX36 (also known as RHAU), Bloom (BLM), and Werner (WRN), albeit with different substrate specificity, and showed the capacity to displace G4-stabilising ligands 71, 72. 
A cocrystal structure of Bos taurus DHX36 helicase bound to the MYC promoter G4 revealed a mode of binding in which the G4 makes contacts with a DHX36-specific motif (DSM) and the C terminal OB-fold domain (Figure 4) [73]. Strikingly, the residues of the α-helical DSM create a nonplanar hydrophobic surface that stacks on the top of a mixed quartet of the quadruplex, which forms after partial unfolding of the original G4 by the protein. The binding mode is somewhat reminiscent of that suggested for most small molecule G4 ligands, but suggests that absolute planarity of the ligand may not be needed [73]. These findings are complemented by the crystal structure of Drosophila melanogaster DHX36, revealing a positively charged pocket that binds and destabilises the G4 [74].
Figure 4

Crystal Structure of DHX36 Bound to the c-Myc G4 (Protein Data Bank: 5VHE).

Overall structure is shown in cartoon representation with the domain organisation of DHX36 colour-coded (top). The α-helical DHX36-specific motif (DSM) stacks on the 5′-quartet. The residues Ile65, Tyr69, and Ala70 form a nonpolar surface similar to the proposed binding mode of most small molecule G-quadruplex (G4) ligands (bottom).

In accordance with the G4 forming potential of the telomere repeat sequence, many telomere-associated proteins have been shown to interact with G4 structures [63]. For instance, POT1 [75] and RPA [76], two components of the shelterin complex, as well as the human CST complex [77], are able to bind and unwind telomeric G4 structures and assist the action of telomerase. As G4s are strongly enriched at promoters, it is unsurprising that several transcription factors and transcriptional coactivators bind sites containing predicted G4 motifs, many of which show potential to bind or unwind G4 structures in vitro. In particular, G4 structures at promoter regions of oncogenes represent the most closely studied G4 sites. The nonmetastatic factor NM23-H2 has been reported to recognise and unwind the MYC promoter G4 [78]. Conversely, binding by nucleolin was reported to stabilise the G4 structure resulting in reduced transcription at this site [65]. Similarly, pull-down and chromatin immunoprecipitation experiments showed that Myc-associated zinc finger (MAZ) and poly(ADP-ribose) polymerase 1 (PARP-1) proteins bind the G4 structure found upstream of the KRAS transcription start site [79]. However, hnRNP A1 and MAZ were shown to bind and unfold the KRAS and HRAS promoter G4 in vitro, respectively 80, 81. Chromatin binding sites of the transcriptional helicases XPB and XPD are enriched in predicted G4 motifs. 
Both helicases associate with G4s in vitro, while XPD can also resolve G4 structures, suggesting that these proteins may affect transcription via a G4-associated mechanism [54]. Furthermore, various epigenetic and chromatin remodelling enzymes selectively bind DNA G4 oligomers [69]. Genomic binding sites of the chromatin remodelling protein ATR-X are enriched at GC-rich tandem repeats and CpG islands with the potential to fold G4 structures, and ATR-X loss is implicated in G4-dependent replication stress, DNA damage, and copy number alterations 53, 82. The DNA methyltransferase enzymes (DNMTs), which catalyse the formation of 5-methylcytosine at CpG dinucleotides in mammalian cells, bind to G4 structures with subnanomolar affinity in vitro 83, 84. DNMT1 shows stronger interaction with G4s compared with duplex DNA and loses enzyme activity upon G4 binding. Maps of DNMT1 chromatin binding sites and endogenous G4s in human K562 leukaemia cells identified using ChIP-seq and G4 ChIP-seq, respectively, revealed that at CpG islands most G4s occur where DNMT1 is bound, leading to a proposal that G4s regulate DNA methylation by sequestering DNMTs [84]. Conversely, DNA methylation can influence G4 topology [85] and may modulate binding by other G4-associated proteins. For instance, CpG methylation at the hTERT gene promoter was suggested to induce G4 formation, resulting in displacement of the CCCTC-binding factor (CTCF) and elevated transcription [86]. G4-protein interactions may provide a means to recruit machinery to specific parts of the genome to influence a wide range of cellular processes. Given that the G4 landscape is dynamic and dependent on the functional state of cells, proteins may be responsible for regulating G4 structural dynamics throughout the genome [60]. 
Approaches that can reveal the composition and dynamics of chromatin-associated protein complexes 87, 88 will be needed to uncover details of the proteins that constitute the G4 interactome, which in turn may present opportunities for small molecule modulation of these interactions.

Biological Role of G4s

The localisation of G4s at regions that regulate genome function has implicated G4s in a range of biological processes. The finding that G-rich telomeric repeats form G4 structures [6] suggested a mechanistic link with telomerase-mediated extension of telomeres, prompting an exploration of G4 stabilising ligands that may inhibit the growth of cancer cells by interfering with telomere maintenance (reviewed in [89]). For instance, the mouse regulator of telomere elongation helicase 1 (RTEL1) was shown to maintain telomere integrity by unwinding telomeric G4s [90]. Telomeric G4s had originally been suggested to impair telomerase function [91], but might also be important for telomerase recruitment [43]. G4 structures are strongly associated with genomic and epigenetic instability, particularly when they are not efficiently regulated. The yeast Pif1 helicase was shown to prevent G4-mediated genomic instability and prevent DNA strand breaks 92, 93. Human Pif1 is recruited to DNA double-strand break sites to promote homologous recombination (HR) at sequences predicted to form G4s. Treatment with G4 stabilising ligands can impair Pif1 functionality, which can be rescued by PIF1 overexpression [94]. G4s have also been suggested to function as potential sensor or trapping sites of oxidative DNA damage caused by reactive oxygen species. Incorporation of 8-oxoG can affect stability of promoter G4 structures, which resulted in altered expression levels in reporter gene assays 95, 96, 97. Moreover, 8-oxoG modification was shown to disrupt the formation of telomeric DNA G4s [98] and to promote telomerase activity [99]. Various models have been postulated for the biological role of G4 structures during DNA replication 100, 101. G4 structures can act as an obstacle to replication fork progression when helicases are disrupted 93, 102. Introduction of G4 forming sequences at the BU-1 locus of DT40 chicken cells resulted in pausing at the G4 sites (Figure 5). 
Additional replication stress due to nucleotide pool depletion decoupled the replication machinery from restoration of the parental histones, ultimately leading to altered gene expression [103]. Based on the same mechanism, small molecule stabilisation of G4 structures induced replication-dependent loss of epigenetic information, showing that G4s can serve as obstacles to the replication machinery [104]. Conversely, mapping the genome-wide location of replication origins using deep-sequencing of short nascent strands in four different human cell lines uncovered enrichment of predicted G4 sequences, suggesting G4s may support initiation of DNA replication [105].
Figure 5

G-Quadruplex (G4) Can Induce Epigenetic Reprogramming.

Unresolved G4 structures (e.g., G4 ligand treatment, impaired helicases) on the leading strand may promote uncoupling of DNA synthesis from opening of the replication fork and disrupt histone recycling. Original histone modifications (blue) are lost and replaced with new histones (brown), resulting in epigenetic reprogramming upstream of G4 sites.

Treatment with G4-stabilising small molecules resulted in reduced mRNA levels at genes that contain G4 sequences in their respective promoters, such as the proto-oncogenes MYC [28] and KRAS [106], supporting the hypothesis that G4s constitute an impediment to the progression of transcription machinery. However, G4-stabilising ligands can also lead to DNA damage and recruit associated response mechanisms 47, 57. Transcriptional changes at G4s may therefore be a result of either direct G4 stabilisation or DNA damage-mediated transcriptional repression. Aberrant function of the G4-resolving helicases WRN and BLM results in altered transcription of genes containing G4 motifs in their promoter region, corroborating a link between G4s and transcription 107, 108. Moreover, BLM-mutated cells derived from Bloom syndrome patients show high rates of sister chromatid exchange at sites of G4 motifs in transcribed genes [109]. Notably, these helicases also process duplex DNA, so not all the observed changes may be linked to G4 structures. Where G4 structures form, the opposite DNA strand cannot form Watson-Crick base pairing. The C-rich opposite strand may be single stranded and potentially complexed with single-stranded binding proteins 110, 111. It has also been postulated that secondary structures might form on the opposite C-rich strand. Intercalated motif (i-motif) structures can be formed via stacks of intercalating hemiprotonated C-neutral C base pairs (C+:C), and are generally stabilised at slightly acidic pH [112]. 
Recent experiments have used an i-motif-specific antibody to image i-motifs in the nuclei of fixed human cells [113]. Furthermore, i-motif formation is cell cycle dependent, peaking at late G1 phase, whereas G4 formation was maximal during S phase [41], suggesting i-motifs and G4s may have different dependencies or even be mutually exclusive [114]. Indeed, a previously proposed model for the MYC promoter region suggested that negative superhelicity initiated at the proximal promoter travels upstream to mechanically perturb DNA structure [115] and it was suggested that a G4 and an i-motif can each be bound and stabilised by different specialised proteins to play opposite roles in the control of MYC gene transcription [116]. Approaches to map endogenous G4 structures such as permanganate footprinting, G4 ChIP-seq, and immunofluorescence with G4 antibodies have been consistent with G4 structures marking actively transcribing genes in human cells 58, 59. In addition, G4s can contribute to hypomethylation of CpG islands in promoter regions, which also contributes to elevated gene expression [84]. Further work is needed to unravel the mechanistic and molecular details of how G4s influence transcription.

Intervention and Therapeutics

G4s are associated with processes and control mechanisms that are important for the biology and growth of cancer cells, including telomere biology, transcriptional regulation of cancer-related genes, replication, and genome instability. Furthermore, G4 motifs are generally over-represented in cancer-promoting genes [51,59] and experimental data have shown more G4 structures in cancer states than in normal states, which may favour G4s as molecular targets in cancer. For example, antibody staining of stomach and liver cancer tissues showed higher levels of G4 foci compared with normal tissue [45]. Also, G4 ChIP-seq detected tenfold more G4 sites in cancer-like, immortalised HaCaT cells than in their normal human keratinocyte counterpart [59,60].

G4 ligands such as pyridostatin and RHPS4 cause DNA double-strand breaks in cancer cells and activate DNA repair pathways [47,117]. Cancer cells with impaired repair pathways are sensitive to G4 ligands. For example, cancer cells deficient in BRCA2, a vital component of the homologous recombination (HR) repair pathway, are sensitive to pyridostatin derivatives and RHPS4 [118,119]. Similarly, knockdown of PARP1, a key regulator of the nonhomologous end joining (NHEJ) repair pathway, resulted in cancer growth inhibition by RHPS4 in both cellular and xenograft models [117]. Notably, pyridostatin was also toxic to HR-deficient cells resistant to olaparib, a PARP inhibitor [119]. This result suggests that G4 ligands may be an effective therapeutic approach in HR-deficient cancers resistant to PARP inhibitors. Indeed, the G4 ligand CX-5461 is currently in human clinical trials for breast cancer patients with BRCA1/2 germline aberrations [57]. Moreover, G4 ligands have also shown synergy with DNA damaging therapies in ATR-X-deficient glioma cell models [82]. G4 ligands have also been successfully used in combination with inhibitors of key proteins involved in the DNA repair pathway.
Pyridostatin synergises with NU7441, an inhibitor of DNA-PK, which is an important kinase in the NHEJ pathway [118]. The combination of RHPS4 and the PARP1 inhibitor GPI 15427 showed a 50% reduction in tumour growth in mice bearing HT29 colon tumour xenografts, compared with 2% and 30% for GPI 15427 and RHPS4 alone, respectively [117].

Another approach, described above, is to target G4-containing genes involved in cancer progression and manipulate their transcription. While the concept of regulating expression of individual genes has been explored using several small molecules [28,106], it is reasonable to assume that most G4 ligands will bind to other G4s and potentially regulate expression of many genes. This multigene targeting approach can result in poly-pharmacology, affecting several pathways important for cancer progression. In support of this view, global transcriptional profiling revealed that treatment of human pancreatic ductal adenocarcinoma (PDAC) cell lines with the G4 ligand CM03 could induce global downregulation of many G4-containing genes [29]. Analysis of the downregulated genes identified G4-containing genes involved in pathways important for PDAC progression. CM03 reduced tumour growth in PDAC xenografts as well as in KPC mouse models at doses that did not cause any observable toxicity. It is possible that a particular G4 ligand may result in a distinct profile of affected genes for proteins involved in several pathways, suitable for targeting certain cancers. Global transcriptional profiling of cancer cells or tissues treated with a G4 ligand may identify the affected pathways and suggest the cancers best suited for efficacy studies. A better understanding of the pathways affected by G4 ligands may ultimately provide a rationale to consider combination therapy opportunities with other drugs.
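
To illustrate why the RHPS4 and GPI 15427 result is read as more than additive, the reported growth reductions (2%, 30%, and 50%) can be compared with the effect expected if the two agents acted independently. The sketch below assumes the Bliss independence model as the reference; this choice is an illustrative assumption, not necessarily the analysis used in the cited study, and the variable names are hypothetical.

```python
def bliss_expected(effect_a, effect_b):
    """Expected fractional effect of two independently acting agents.

    Bliss independence: E_ab = E_a + E_b - E_a * E_b, with effects
    expressed as fractions between 0 and 1.
    """
    return effect_a + effect_b - effect_a * effect_b


# Fractional tumour growth reductions quoted in the text above.
gpi_alone = 0.02       # GPI 15427 alone
rhps4_alone = 0.30     # RHPS4 alone
observed_combo = 0.50  # RHPS4 + GPI 15427

expected = bliss_expected(gpi_alone, rhps4_alone)
excess = observed_combo - expected

print(f"expected under independence: {expected:.3f}")
print(f"observed minus expected:     {excess:.3f}")
```

A positive excess means the observed combination effect exceeds the independence expectation, which is consistent with synergy under this model; a formal claim would also require replicate data and a statistical test.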

Concluding Remarks

During the past two to three decades, the study of DNA G4 structures has extended from biophysical and structural work to studies in biological models and systems of increasing sophistication. Collectively, these studies have advanced the hypothesis that the G4 DNA structure is intimately linked to a number of biological functions. While there is more work to be done to unravel the mechanistic details of how and why G4 structures influence biological function (see Outstanding Questions), there appears to be rationale and merit in considering G4s as promising molecular targets for future therapeutics.

Outstanding Questions

G4 ligands: can we develop small molecule G4 ligands with better pharmacokinetic properties that more effectively target G4s in cells? Is it possible to selectively target individual G4s or subclasses of G4s based on the molecular topology and sequence composition of the loops of G4s? Is it necessary, or desirable, to selectively target individual G4s to achieve therapeutic effects? How do we design suitable functional assays to screen for and validate direct G4 on-target effects in cells?

Detection and mapping: can we develop mapping approaches to detect G4s genome-wide in native chromatin with stranded information and at single G4 resolution? Do current antibody-based approaches provide an incomplete picture due to limited accessibility and competition with endogenous proteins? Can we develop detection methods to monitor dynamic G4-associated processes at high resolution in living cells?

G4 landscape: under which conditions do G4s form in cells? What is the chromatin context? What are the key protein regulators that mediate G4 formation and depletion? What is the crosstalk with other epigenetic features? What happens on the opposing strand and what is the interplay with other structural phenomena (e.g., i-motifs, Z-DNA, R-loops)?

G4s and transcription: what is the exact mechanism by which G4s affect transcription? Can we control transcription efficiently by manipulating particular G4s or by interfering with particular G4 interactors?

G4s and cancer: why are G4s enriched at certain cancer genes? Are G4 signatures suitable biomarkers for diagnostic applications? What pathways are G4s involved in and are there phenotypes that make patients more tractable to G4 ligand therapies? What synthetic lethalities can potentially be exploited for combination therapies?