Vanessa Meier-Stephenson.
Abstract
There are over 700,000 putative G4-quadruplexes (G4Qs) in the human genome, found largely in promoter regions, telomeres, and other highly regulated regions. Growing evidence links their presence to function in various cellular processes, in which cellular proteins interact with them, either stabilizing and/or anchoring upon them, or unwinding them to allow a process to proceed. Interest in understanding and manipulating the many processes regulated by these G4Qs has spawned a new area of small-molecule binder development, with attempts to mimic and block the associated G4-binding proteins (G4BPs). Despite the growing interest in these G4Qs, there are limited data (in particular, high-resolution structural information) on the nature of these G4Q-G4BP interactions and on what makes a G4BP selective for certain G4Qs, if indeed they are selective at all. This review summarizes the current literature on G4BPs with regard to their interactions with G4Qs, grouping them by binding mode, drawing conclusions about commonalities, and highlighting information on specific interactions where available.
Keywords: DNA–protein interactions; G4-quadruplexes; Quadruplex-binding proteins; RNA–protein interactions
Year: 2022 PMID: 35791380 PMCID: PMC9250568 DOI: 10.1007/s12551-022-00952-8
Source DB: PubMed Journal: Biophys Rev ISSN: 1867-2450
Fig. 1 Schematic of G4-quadruplexes (G4Qs), showing the planar G-tetrad (A), formed by Hoogsteen bonds and stabilized by a metal ion, typically potassium (K+); tetrads can stack upon one another in various orientations (B). These structures interact with various cellular proteins, which may bind in several different ways (C), including top-stacking (i), groove-binding (ii), and loop-binding (iii)
Fig. 2 Example of groove-binding mode: telomeric end-binding protein of Oxytricha nova (OnTEBP), a protozoan analogue of human POT1 protein (PDB 1JB7), in broad view (A) and close-up (B), showing Tyr142 in proximity to several of the G4Q guanosines, and residues Lys105 and Asn139 nearer to the phosphate backbone, facilitating H-bonding opportunities
Fig. 3 Example of top-stacking: DHX36 with c-MYC promoter region G4Q. A High-level orientation of the structural arrangement, showing the DSM helix sitting atop the G4Q and the lateral OB domain loop contacting the G4Q from the side, while the G4Q is pulled through the RecA-like domains (see text; PDB 5VHE); B DSM helix, showing Tyr69 oriented parallel with an upper guanosine from the tetrad, facilitating π-π stacking. Other hydrophobic residues make up the remainder of the downward-facing helical residues (Ile65, Trp68, and Ala70); C OB domain, showing the proximity that allows the extensive hydrogen-bonding network between the phosphate backbone of the G4Q and Lys860, Asn851, Gly853, and Lys855; D Independent study of the DHX36 DSM domain with c-MYC, showing a similar top-stacking binding mode (PDB 6Q6R)
Fig. 4 FMRP’s 13-amino acid β turn folding into the groove at the junction of duplex and G4Q DNA (PDB 5DE5). The uppermost amino acid, Arg15, interacts with the G7 and A17 nucleotides, which are not part of the G4Q structure. Binding is thought to promote stabilization of the G4Q (see text)
Fig. 5 Example of loop-binding mode: synthetic zinc finger Gq1 targeting a telomeric G4Q. Computationally derived model (PDB from Ladame et al. 2006), showing the overall arrangement (center) and key residues from A the first “finger,” whereby His125, Arg124, and Arg127 interact with the two outward-directed T12 and A13 nucleotides; B the second “finger,” where His153 and Thr156 bind the phosphate backbone of G10, while Arg142 wraps under to bind the other protruding nucleotide, T11; and C the third “finger,” where Ser175, Arg178, and Thr182 create extensive H-bonds with the phosphate backbone of the loop
G4-quadruplex binding proteins (G4BPs)
| G4BP | G4 sequence | Function, if known | Comments |
|---|---|---|---|
| Telomeric G4BPs | |||
| POT1 | (TTAGGG)n | Unfolding of G4Qs (and refolding with POT1-TPP1 complex) | Selective for anti-parallel G4Qs |
| RPA | (TTAGGG)n | Unfolds G4Qs | Unfolds both parallel and anti-parallel G4Qs |
| Human CST | TTAGGG.AATCCC | Unfolds G4Qs | Complex of 3 proteins; CTC1 contains the DNA-binding site |
| hnRNP A1 and UP1 | (TTAGGG)n | Unfolds G4Qs | Specifically, nYAGn sequences; binds RNA and DNA |
| BLM | (TTAGGG)n | Unfolds G4Qs (leading strand) in 3′-5′ | Needs a ssDNA spacer between G4Q and the replication fork to function; similar to WRN |
| WRN | (TTAGGG)n | Unfolds G4Qs (lagging strand) in 3′-5′ | Similar to BLM |
| BRCA1 | (TTAGGG)n | Binds G4Qs in vitro | |
| Pif1 | (TTAGGG)n | Unfolds G4Qs in 5′-3′ direction | |
| Replication G4BPs | |||
| FANCJ | | Unfolds G4Qs in 5′-3′ direction | Unwinds both intramolecular and intermolecular G4Qs |
| Promoter G4BPs | |||
| Sp1 | G4 (C) G3 (CC) G5 (C) G4 (TCCCGGC) G4 (CGG) (VEGF); CCCGGGCGGGCGCGAGGGAGGGGAGG (c-KIT); CGGGGCGGGGCGGGGGCGGGGGCG (HRAS) | Binds parallel and antiparallel G4Qs | |
| DHX36 | Many sequences | Unfolds G4Qs in 3′-5′ | Strong preference for binding parallel DNA and RNA G4Qs over antiparallel; requires trailing 3′ end |
| Nucleolin | TGGGGAGGGTGGGGAGGGTGGGGAAGG (c-MYC); (GGGGCC)n (HRE) | Stabilizes G4Q to suppress transcription | Preferentially binds parallel G4Qs but can bind both |
| NM23-H2 | TGGGGAGGGTGGGGAGGGTGGGGAAGG (c-MYC) | Unfolds G4Q to enable transcription | |
| MAZ | TGGGGAGGGTGGGGAGGGTGGGGAAGG (c-MYC); ACAGGGGTGTGGGG (Pur-1); TCGGGTTGCGGGCGCAGGGCACGGGCG and CGGGGCGGGGCGGGGGCGGGGGCG (HRAS); GGGAGGGAGGGAAGGAGGGAGGGAGGGA (KRAS) | Unfolds G4Qs to enable transcription | |
| PARP1 | C3G3CG3CGCGAG3AG4AG2 (c-KIT); TGGGGAGGGTGGGGAGGGTGGGGAAGG (c-MYC); GGGAGGGAGGGAAGGAGGGAGGGAGGGA (KRAS) | Recognizes parallel G4Qs; binding with c-KIT activates PARP1 | |
| XPD/XPB | | XPD: 5′-3′ direction; XPB: 3′-5′ direction | |
| hnRNP | GGGAGGGAGGGAAGGAGGGAGGGAGGGA (KRAS); GGGGTGGGGCCCTGCGAGGGCGGG (TRA2β) | Unfolds G4Q | |
| RNA G4BPs | |||
| hnRNP A1 | AACGAGGGAGGGAGGGAGAGGGAGAGA- (MMP16) AGCCGGGGGCUGGGCGGGGACCG GGCUUGU (ARPC2) | Unfolds G4Q | |
| DHX36 | Many sequences | ||
| DHX9 | Many sequences | Unfolds G4Qs in 3′-5′ | Preference for RNA substrates; requires a 3′ single-stranded tail for initial binding |
| FMRP | GUGUGGAAGGAGUGGCUGGGUUG (sc1) | Stabilizes the G4Q | |
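
The sequences in the table can be screened for putative G4Q-forming motifs with a simple pattern search. The sketch below assumes the commonly used heuristic of four runs of at least three guanines separated by loops of one to seven nucleotides; this pattern is an assumption and is not taken from the review, and the function name `find_putative_g4` is hypothetical. A match only marks a sequence as a candidate and says nothing about folding topology or protein binding.

```python
import re
from typing import List, Tuple

# Assumption: the widely used "G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+" heuristic.
# It is illustrative only and is not a rule stated in the review.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_putative_g4(seq: str) -> List[Tuple[int, str]]:
    """Return (start index, matched text) for each non-overlapping motif hit."""
    seq = seq.upper()
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    candidates = {
        "telomeric (TTAGGG)4": "TTAGGG" * 4,
        "c-MYC": "TGGGGAGGGTGGGGAGGGTGGGGAAGG",
        "sc1 (RNA)": "GUGUGGAAGGAGUGGCUGGGUUG",
    }
    for name, seq in candidates.items():
        hits = find_putative_g4(seq)
        print(name, "->", hits if hits else "no match under this heuristic")
```

Sequences built mostly from two-guanine runs, such as sc1, do not match this strict pattern even though the table lists them as G4Q-forming, which shows the limits of a purely motif-based screen.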