U3 region in the HIV-1 genome adopts a G-quadruplex structure in its RNA and DNA sequence.

Dorota Piekna-Przybylska, Mark A Sullivan, Gaurav Sharma, Robert A Bambara.

Abstract

Genomic regions rich in G residues are prone to adopt G-quadruplex structure. Multiple Sp1-binding motifs arranged in tandem have been suggested to form this structure in promoters of cancer-related genes. Here, we demonstrate that the G-rich proviral DNA sequence of the HIV-1 U3 region, which serves as a promoter of viral transcription, adopts a G-quadruplex structure. The sequence contains three binding elements for transcription factor Sp1, which is involved in the regulation of HIV-1 latency, reactivation, and high-level virus expression. We show that the three Sp1 binding motifs can adopt different forms of G-quadruplex structure and that the Sp1 protein can recognize and bind to its site folded into a G-quadruplex. In addition, a c-kit2 specific antibody, designated hf2, binds to two different G-quadruplexes formed in Sp1 sites. Since U3 is encoded at both viral genomic ends, the G-rich sequence is also present in the RNA genome. We demonstrate that the RNA sequence of U3 forms dimers with characteristics known for intermolecular G-quadruplexes. Together with previous reports showing G-quadruplex dimers in the gag and cPPT regions, these results suggest that integrity of the two viral genomes is maintained through numerous intermolecular G-quadruplexes formed in different RNA genome locations. Reconstituted reverse transcription shows that the potassium-dependent structure formed in U3 RNA facilitates RT template switching, suggesting that the G-quadruplex contributes to recombination in U3.

Year: 2014 PMID: 24735378 PMCID: PMC4007979 DOI: 10.1021/bi4016692

Source DB: PubMed Journal: Biochemistry ISSN: 0006-2960 Impact factor: 3.162

Recent cellular research revealed that G-quadruplexes formed in promoter regions of cancer-related genes regulate their expression.[1−11] Formation of this structure is often linked to inhibition of transcription, but stimulation of promoter activity was also demonstrated.[1,12−19] A G-quadruplex is assembled from two or more G-quartets with compact square structure, in which four guanines from different positions in a G-rich strand are held together by Hoogsteen hydrogen bonding (Figure 1). The G-quadruplexes differ by folding pattern, number of tetrads, size of nontetrad loops, and orientation of the strands in the quadruplex. In addition, whereas most reports show the structure core formed by guanines from G runs (two or more consecutive Gs), unprecedented and bulged G-quadruplexes were also reported with an isolated guanine involved in G-tetrad core formation.[20,21] The DNA sequence can adopt this non-B configuration when complementary strands are separated in the DNA duplex during transcription and replication. The genomic regions prone to adopt this structure are rich in G residues, and include telomeres and gene promoters. In the case of promoters, multiple Sp1-binding motifs arranged in tandem are often indicated by computational analyses to form G-quadruplexes, and promoters of cancer-related genes were shown to form this structure in Sp1 binding regions.[1,12−17]

Figure 1

Guanine-rich sequence of the HIV-1 U3 region of the provirus and in the RNA genome might fold into a G-quadruplex. According to QGRS Mapper, runs of G residues (shaded) in three Sp1 binding sites (in bold) in the virus promoter are capable of forming a G-quadruplex. Four guanines are connected through hydrogen bonding to form a single G-quartet stabilized by a monovalent cation (left). Layering of two or more G-quartets forms a G-quadruplex (right).

For RNA sequences, G-quadruplexes were detected in coding and noncoding regions of mRNA, and found to regulate protein synthesis.[22−25] Formation of this structure in introns was suggested to influence alternative splicing.[26−29] In HIV-1, sequences prone to adopt G-quadruplex structure are in a region of gag near DIS and in the central part of the genome near the cPPT.[30−34] In both locations, G-quadruplex formation was associated with dimerization of the homologous templates and increased rate of primer-strand transfers during reverse transcription, suggesting that in vivo the structure contributes to dimerization of the viral genomes that promotes recombination. The ability of the proviral DNA U3 region to adopt G-quadruplexes was recently reported by Perrone and co-workers, who described two parallel-like intramolecular G-quadruplexes, and showed that G-rich sequences of an NF-κB site together with G runs of Sp1 sites are involved in the quadruplex structure.[35] Going beyond this work, we explored the relationship between the formation of G-quadruplex structure in the U3 DNA Sp1 transcription factor binding site region and binding of Sp1. Our native gel analyses, c-kit2 antibody binding analyses, and CD spectra show that the region fully transforms into different forms of intramolecular G-quadruplexes, which likely include a mixture of parallel/antiparallel and/or hybrid configurations that constitute the Sp1 binding site. Additionally, we investigated whether G-quadruplex formation in U3 RNA promotes viral recombination and the implications for a general viral recombination mechanism mediated by periodically spaced genome linkages including those that occur in U3 together with previously reported linkages in the gag and cPPT regions.[30−32,34]

Experimental Procedures

Materials

DNA oligonucleotides and the HPLC purified RNA strand used for CD spectra analyses were purchased from Integrated DNA Technologies, Inc. (Coralville, IA). HIV-1 NC (55 amino acids) was generously provided by Dr. Robert J. Gorelick (NCI, Frederick, MD). HIV-1 reverse transcriptase (p66/p51 heterodimer) (RT) was purified as described previously.[36] The [γ-32P]ATP was purchased from PerkinElmer Life Sciences. Recombinant Sp1 protein was purchased from Active Motif (Carlsbad, CA). Sp1 polyclonal antibodies and antirabbit IgG, HRP-linked antibodies were purchased from Cell Signaling Technology, Inc. (Danvers, MA). The promoter sequences of viruses are from the HIV database (www.hiv.lanl.gov).

Preparation of RNA Templates

RNA molecules were transcribed in vitro (Ambion T7-MEGAshortscript kit; Applied Biosystems) from DNA templates amplified by PCR using Vent DNA polymerase (New England BioLabs, Inc.) and two overlapping oligomers with the sequence of the desired region. The following RNA strands were used in our studies: (a) For the reverse transcription assay, the RNA template with the region of three Sp1 binding sites (8960–9051 in the RNA genome) of NL4–3 HIV-1 was made from a DNA template synthesized with the oligomer pair 1/2. (b) For affinity selection analysis, the nontagged RNAs and poly(A) tagged RNAs were made from DNA generated using oligomers 1/3 and 1/4 representing the U3 region (8960–9037) of NL4–3 HIV-1, 5/6 and 5/7 representing the cPPT region (4309–4396) of NL4–3 HIV-1, 8/9 and 8/10 representing the gag region (290–403) of NL4–3 HIV-1, and 11/12 and 11/13 representing the gag region (303–415) of MAL HIV-1. (c) For the analysis of dimerization in a native gel, the RNA template of the U3 region (8960–9037) in NL4–3 HIV-1 was made from DNA generated using oligomers 1/3. (d) For strand transfer assays, the donor and acceptor templates representing the U3 region (8960–9051 and 8961–9033) of NL4–3 HIV-1 were made from DNA generated using oligomers 1/2 and 14/15, respectively. After transcription in vitro, the RNA templates were purified by polyacrylamide/urea gel electrophoresis and resuspended in water. RNAs were quantitated by UV absorption using a GeneQuant II from Amersham Biosciences.

DNA and RNA Oligonucleotides

Sequences of oligonucleotides are in Table S1 (Supporting Information).

Preparation of the 5′-Radiolabeled RNA Template and DNA Primer

DNA oligomers (16, a–e, c-kit, TBA, and TEL) were labeled at the 5′ end using T4 polynucleotide kinase (New England BioLabs) and [γ-32P]ATP (6000 Ci/mmol). The 5′-radiolabeled RNA template was prepared as follows. The gel-cleaned RNA template was treated with shrimp alkaline phosphatase (SAP, Fermentas) at 37 °C for 60 min and then incubated at 65 °C for 25 min to inactivate the enzyme. After cooling on ice, the reaction mixture was treated with [γ-32P]ATP (6000 Ci/mmol), 10× PNK buffer, and T4 polynucleotide kinase (New England BioLabs). After incubation for 1 h at 37 °C, the radiolabeled RNA and DNA primers were separated from unincorporated radionucleotides using a Micro Bio-Spin column (Bio-Rad).

Reverse Transcriptase (RT) Progression Assay

The assay was performed in two steps: (1) folding/annealing and (2) primer extension. For folding/annealing, the RNA or DNA templates (2 pmol) were mixed with 5′ end labeled primer DNA (2 pmol) in the presence of 0.5 M salt (KCl or LiCl) and 50 mM Tris-HCl (pH 8.0) in a volume of 20 μL. The mixtures were heated to 95 °C for 5 min and cooled slowly to room temperature. The RNA template with three Sp1 binding sites was synthesized from the PCR product primed with oligomers 1 and 2 (see above). As a DNA template with three Sp1 binding sites, we used oligomer 15. After the annealing/folding step, 2 μL of the mixture was taken for the RT-catalyzed primer extension reaction carried out in 25 μL at a final concentration of 50 mM Tris-HCl (pH 8.0), 50 mM KCl or LiCl, 1 mM DTT, 1 mM EDTA, 32 nM HIV-1 RT, 6 mM MgCl2, and 50 μM dNTPs. After 30 min of incubation at 37 °C, reactions were stopped with 1 volume of termination buffer (10 mM EDTA, pH 8.0, 90% formamide (v/v), and 0.1% each xylene cyanole and bromophenol blue). Extension products were resolved on a 6% polyacrylamide–8 M urea gel and analyzed using a PhosphorImager (GE Healthcare). Sizes of DNA products were estimated by using a 5′-radiolabeled 10 bp DNA ladder (Invitrogen).

Native Gel Analysis of Monomer G-Quadruplexes

A mixture of the 5′ end 32P-labeled (about 300,000 cpm) and unlabeled DNA oligonucleotides (a–e, c-kit, TBA, TEL) at a concentration of 4 μM and in a final volume of 25 μL was heated to 95 °C for 3 min, chilled, and incubated for 20 h at room temperature in a buffer containing 10 mM Tris-HCl, pH 7.5, 1 mM EDTA, and 100 mM KCl. Samples were subsequently mixed with one volume of loading buffer (30% glycerol in Tris-EDTA buffer), applied onto a 15% nondenaturing gel with 0.5× TBE, 10 mM KCl, and 1.5% glycerol, and run at 4 °C at 6.5 V/cm for 18 h. The gel was dried onto Whatman 3MM Chr paper and analyzed using a PhosphorImager (GE Healthcare).

Circular Dichroism

CD spectra were obtained at 25 °C over a wavelength range of 210–340 nm using an AVIV Circular Dichroism Spectrometer, Model 202. The RNA (oligomers 17 and 23) and DNA (oligomers a–e) samples were at a concentration of 4 or 20 μM, in 10 mM Tris-HCl, pH 7.5, 0.3 mM EDTA, and 100 mM KCl. Before analysis, the samples were heated to 90 °C for 10 min, cooled gradually at a rate of 1 °C per 5 min, and incubated at 4 °C overnight. Spectra were recorded using a quartz cell of 1 mm optical path length, with data collected every nanometer at a bandwidth of 1 nm. Each spectrum was recorded three times and baseline-corrected for signal contributions from the buffer. The data were processed with AVIV Biomedical Inc. software and reported as ellipticity (mdeg) versus wavelength (nm).
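
The scan handling described above (three recordings per spectrum, buffer baseline correction, one data point per nanometer from 210 to 340 nm) can be illustrated with a minimal Python sketch. This is not the AVIV software procedure. Point-by-point averaging of the three recordings is an assumption, and the function name and all numeric values below are hypothetical placeholders rather than measured data.

```python
from statistics import mean


def process_cd_scans(sample_scans, buffer_scan):
    """Average replicate CD scans point by point, then subtract the buffer baseline.

    sample_scans: list of scans, each a list of ellipticity values (mdeg)
    buffer_scan:  ellipticity (mdeg) of buffer alone on the same wavelength grid
    """
    n_points = len(buffer_scan)
    if any(len(scan) != n_points for scan in sample_scans):
        raise ValueError("all scans must share the same wavelength grid")
    # zip(*sample_scans) groups the replicate readings taken at each wavelength
    averaged = [mean(values) for values in zip(*sample_scans)]
    return [avg - base for avg, base in zip(averaged, buffer_scan)]


# Wavelength grid from the text: 210-340 nm, one point per nanometer.
wavelengths = list(range(210, 341))

# Placeholder values (NOT measured data): three flat replicate scans and a flat buffer scan.
scans = [[1.0] * len(wavelengths), [2.0] * len(wavelengths), [3.0] * len(wavelengths)]
buffer_only = [0.5] * len(wavelengths)

corrected = process_cd_scans(scans, buffer_only)
print(len(wavelengths), corrected[0])
```

With real data, each inner list would hold the ellipticity readings exported for one recording, in the same wavelength order as the buffer scan.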

Production of Phage Displaying hf2 scFv

The plasmid pIT2 with hf2 sequence was generously provided by Dr. Shankar Balasubramanian.[37] The hf2 phagemid was used to produce phages displaying the scFv after infection as previously described[38] except that the VCS M13 helper phage (Agilent) was used instead of the KM13 helper. The phages were isolated from the culture supernatants by PEG precipitation and resuspended in 50 mM K2HPO4 at pH 7.4 and 100 mM KCl containing 3% BSA.

Phage ELISA

The 10 μM stock solutions of biotinylated oligos b–e (20, 24, 25, and 26), single stranded DNA control (27), and c-kit2 were prepared in 10 mM Tris-HCl at pH 7.4 and 100 mM KCl. The samples were heated to 95 °C for 10 min and annealed overnight (over 14 h) at a rate of 0.1 °C/min down to room temperature in a buffer of 10 mM Tris-HCl at pH 7.4 and 100 mM KCl. In order to form a double stranded DNA, oligos 28 and 29 were mixed in a buffer of 10 mM Tris-HCl at pH 7.4 and 100 mM KCl, then heated at 95 °C for 10 min and cooled to room temperature at a rate of 1 °C/min. For phage ELISA, samples were diluted to 50 nM in 10 mM Tris-HCl at pH 7.4 and 100 mM KCl. The standard ELISA protocol was followed, but 50 mM K2HPO4 at pH 7.4 and 100 mM KCl (ELISA buffer) was used instead of PBS to maintain the G-quadruplex conformation. Pierce Streptavidin coated high binding capacity (HBC) strips were coated with biotinylated oligonucleotides for 1 h, then washed three times with ELISA buffer. Wells were blocked (3% BSA in ELISA buffer) for 1 h and then incubated with 50 μL of 2-fold serial dilutions of phages for 1 h. The transductional titer (ampicillin resistance) of the phages was about 7 × 10^11 transducing units/mL. After six washes (ELISA buffer), wells were incubated with a 1:3,000 dilution of anti-M13-HRP antibody (GE Healthcare) in ELISA buffer + 3% BSA for 1 h. After three washes (ELISA buffer), the ELISA was developed with the substrate TMB. Color development was terminated after 5–10 min with acid, and the absorbance at 450 nm was measured with a plate reader (Tecan).
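
The arithmetic behind the 2-fold phage dilution series can be sketched as follows. This is a rough illustration only. It assumes that the first well receives undiluted phage stock at the reported titer of about 7 × 10^11 transducing units/mL (written as 7e11 in Python float notation), and the number of wells is a hypothetical choice, since neither detail is stated in the protocol.

```python
def two_fold_series(stock_titer, n_wells, well_volume_ml=0.05):
    """Return (dilution_factor, titer_per_ml, units_per_well) for a 2-fold series.

    well_volume_ml defaults to 0.05 mL, i.e. the 50 uL added per well.
    """
    series = []
    for step in range(n_wells):
        factor = 2 ** step                 # 1, 2, 4, 8, ...
        titer = stock_titer / factor       # transducing units per mL
        series.append((factor, titer, titer * well_volume_ml))
    return series


STOCK_TITER = 7e11  # approximate transducing units/mL reported for the phage stock

# Assumption: 8 wells, with well 1 holding undiluted stock (hypothetical layout).
rows = two_fold_series(STOCK_TITER, n_wells=8)
for factor, titer, per_well in rows:
    print(f"1:{factor:<4d} {titer:.2e} TU/mL  {per_well:.2e} TU per 50 uL well")
```

If the series actually started from a prediluted stock, only `STOCK_TITER` would need to change.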

Pull-Down of Sp1 and Western Blotting

Sp1 was selected with biotinylated oligonucleotides and streptavidin-coated magnetic beads (Promega). The beads were washed three times with 0.5 mL of 0.5× SSC buffer and three times with buffer A (25 mM HEPES at pH 7.5, 12.5 mM MgCl2, 20% v/v glycerol, 0.1% v/v Nonidet P-40, 1 mM dithiothreitol, and 100 mM KCl) containing 3% BSA. Before the binding of biotinylated DNA to the beads, oligomers c (20) and c-kit (18) were incubated overnight in buffer A to form the G-quadruplex structure. In order to form dsDNA for protein selection, the pairs of oligomers 18/19 and 20/21 were incubated in buffer A in a ratio of 1:3. Biotinylated DNA samples (200 μL, 1 μM) were incubated with the beads for 30 min at room temperature. Beads were then washed three times with 500 μL of buffer A containing 3% BSA and blocked with the same buffer for 30 min. All subsequent procedures were performed at 4 °C. Sp1 protein (80 ng from Active Motif) was added to 500 μL of buffer A with 3% BSA. The mixture was added to the beads and incubated for 20 min, followed by washing six times with 200 μL of buffer A. The beads were resuspended in 20 μL of Laemmli buffer and boiled for 2 min. After removing the beads, the samples were separated on a 4–12% gradient Tris–Glycine polyacrylamide gel (BioRad) and transferred to a PVDF membrane. Sp1 was identified by immunoblotting using a rabbit polyclonal antibody diluted 1:5000. A goat antirabbit secondary antibody linked with HRP was used for chemiluminescent detection. For competition binding with Sp1, oligomer c and a nonspecific sequence (oligomer 22) were first incubated under G-quadruplex forming conditions (see above) and mixed with biotinylated dsDNA samples before incubation with buffers containing Sp1.

Affinity Selection with Oligo d(T)25 Magnetic Beads

About 40 pmol of poly(A)-tagged RNA and nontagged RNA templates were mixed (ratio 1:1) in the presence of 50 mM Tris-HCl (pH 8.0), 200 mM KCl, and 1 mM EDTA in a final volume of 20 μL. The mixtures were heated to 95 °C for 3 min, then chilled on ice and incubated at room temperature for 2 h. Before use, a 50 μL suspension of oligo d(T)25 magnetic beads (New England BioLabs) was washed once with binding buffer (20 mM Tris-HCl, pH 7.5, 500 mM KCl, and 1 mM EDTA), resuspended in 180 μL of binding buffer, and added to the RNA. The mixture was agitated at room temperature for 10 min and then placed in a magnetic rack to separate the magnetic beads from solution. The beads were washed once with binding buffer and three times with wash buffer (20 mM Tris-HCl, pH 7.5, 200 mM KCl, and 1 mM EDTA), each time for 1 min with gentle agitation. In order to elute the RNA, the beads were resuspended in 15 μL of elution buffer (20 mM Tris-HCl, pH 7.5, and 1 mM EDTA), incubated for 3 min at 95 °C, and placed in a magnetic rack to separate the magnetic beads from solution. One volume of loading buffer (10 mM EDTA, pH 8.0, 90% formamide (v/v), and 0.1% each of xylene cyanole and bromophenol blue) was added to the eluted solution, and the products were resolved in 6% polyacrylamide–8 M urea gels. The gels were stained with ethidium bromide.

Cation-Dependent Dimerization and Thermal Dissociation Analysis

Dimerization and melting experiments with RNA dimers of the U3 region were conducted in parallel for each reaction setting. The samples contained a mixture of 5′ end 32P-labeled (about 500,000 cpm) and unlabeled RNA at a concentration of 4 μM and a final volume of 6 μL. To form a dimer, the RNA was heated to 95 °C for 3 min, chilled, and incubated for 60 min at room temperature in a buffer containing 10 mM Tris-HCl, pH 7.5, 1 mM EDTA, and one of three different salts (KCl, NaCl, or LiCl), each at 1 M. After incubation, the mixtures were placed on ice, and one volume of 2× Tris-EDTA buffer was added. Aliquots of 15 μL were transferred to new tubes and incubated for 8 min at a specific temperature between 30 and 90 °C, then returned to the ice. Samples were mixed with one volume of loading dye (30% glycerol in 1× Tris-EDTA buffer and relevant dyes), and loaded onto 6% nondenaturing gels run at 4 °C at 7 V/cm for 3–4 h. All gels were dried onto Whatman 3MM Chr paper and analyzed using a PhosphorImager (GE Healthcare).

Strand Transfer Assay

The 5′ end labeled DNA primer 16 was heat-annealed to donor RNA by incubation at 95 °C for 5 min and slow cooling to 37 °C. The acceptor template was also present in the mixture. NC at 200% polymer substrate-coating level (100% NC is 7 nt of the polymer substrate per NC molecule) was added and incubated for 3 min. Next, the RT was added to the mixture and incubated for another 4 min to prebind the RT with the substrates, before reactions were initiated with MgCl2 and dNTPs. Primer, donor, and acceptor strands were mixed at a ratio of 2:1:1. The final reaction contained 50 mM Tris-HCl (pH 8.0), 50 mM KCl, 1 mM DTT, 1 mM EDTA, 32 nM HIV-1 RT, 6 mM MgCl2, 50 μM dNTPs, 16 nM donor RNA, 16 nM acceptor RNA, and 32 nM primer. For reactions in the presence of lithium ions, KCl was replaced with 50 mM of LiCl. Reactions were incubated at 37 °C, and terminated after 1, 5, 15, and 30 min with 1 volume of termination dye (10 mM EDTA, pH 8.0, 90% formamide (v/v), and 0.1% each xylene cyanole and bromophenol blue). Products were then resolved by 6% polyacrylamide–8 M urea gels and analyzed using a PhosphorImager (GE Healthcare) and ImageQuant software (version 2.1). Sizes of DNA products were estimated by using a 5′-radiolabeled 10 bp DNA ladder (Invitrogen).
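
The NC coating level quoted above (100% corresponds to one NC molecule per 7 nt of polymer substrate) implies a simple calculation of the NC concentration needed for 200% coating. The sketch below is a back-of-the-envelope illustration under stated assumptions: the donor and acceptor lengths are taken from the genome coordinates given for these templates (8960–9051 and 8961–9033) and ignore any extra nucleotides added during transcription, and the primer length is a hypothetical placeholder because the sequence of oligomer 16 is listed only in Table S1.

```python
NT_PER_NC_AT_100_PERCENT = 7  # from the text: 100% coating = 1 NC per 7 nt


def nc_concentration_nM(strands, coating_percent):
    """Estimate the NC concentration (nM) for a given coating level.

    strands: list of (concentration_nM, length_nt) tuples for every nucleic acid present
    """
    total_nt_nM = sum(conc * length for conc, length in strands)
    return total_nt_nM / NT_PER_NC_AT_100_PERCENT * coating_percent / 100.0


# Lengths inferred from the genome coordinates of the templates (an assumption).
donor_len = 9051 - 8960 + 1      # 92 nt
acceptor_len = 9033 - 8961 + 1   # 73 nt
PRIMER_LEN = 20                  # hypothetical; the real length is in Table S1

# Final concentrations from the text: 16 nM donor, 16 nM acceptor, 32 nM primer.
strands = [(16, donor_len), (16, acceptor_len), (32, PRIMER_LEN)]
nc_nM = nc_concentration_nM(strands, coating_percent=200)
print(f"NC needed for 200% coating: {nc_nM:.0f} nM")
```

Replacing `PRIMER_LEN` with the actual primer length gives the corresponding estimate for the real reaction.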

Results

G-Rich Sequences Capable of Forming G-Quadruplex Structure Are Present in the U3 Regions of Various HIV Species

Intensive research on G-quadruplexes showed that genomic regions prone to adopt this structure correlate particularly with gene promoters and telomeres, both rich in G residues. More importantly, cellular research revealed that G-quadruplexes in promoter regions of cancer-related genes regulate their expression.[1−11] Since sequences capable of forming G-quadruplexes were previously found in the HIV-1 genome, we wanted to determine whether this structure might also be formed in the viral promoter.[30−32,34] Our computational analyses with a software program designed to predict the formation of G-quadruplexes (QGRS Mapper[39]) revealed that the promoter sequence in HIV-1 NL4–3 U3 between −80 and −48 (+1 refers to the transcription start site at the beginning of R) has a high probability of adopting different forms of G-quadruplex structure (Figure 1). The same region was also predicted to form G-quadruplexes in the genomes of various isolates of HIV-1 (A, B, C, D, F, G, H, U, N, and O), HIV-2, and several SIV species closely related to HIV-1 (Table 1). This region shows a higher variation in primary sequence among different viruses than the protein coding regions. Yet, the region from all of these viruses retains the ability to form G-quadruplex structure as indicated by QGRS Mapper. This suggests that the G-rich part of the promoter is prone to adopt G-quadruplex structure and that the structure may regulate viral expression. The G-rich promoter sequence in HIV-1 contains five to seven G runs and three Sp1 binding sites. Some of these G runs are composed of only two guanines, suggesting that the putative G-quadruplex is composed of only two G-quartets. However, single G residues near G runs might also be included in a structure, which is called the bulged G-quadruplex. An extra G tetrad in a core formed with an isolated G residue (not a part of a G run) will produce a bulge with a non-G residue between tetrads.[21] Since in retroviruses the U3 is present in both proviral LTRs, the Sp1 binding sites should also be capable of forming a G-quadruplex at the 3′ end of the RNA genome (8996–9028 in RNA HIV-1 NL4–3).
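
The notion of G runs (two or more consecutive Gs) and of a sequence carrying enough of them to fold into a quadruplex can be illustrated with a small Python sketch. This is a deliberately simplified pattern search and not the QGRS Mapper algorithm, which also scores candidate motifs. The loop-length limit and both function names are hypothetical choices, and the human telomeric repeat is used as input only because the HIV-1 promoter sequence itself appears in Figure 1 rather than in the text.

```python
import re


def find_g_runs(sequence, min_len=2):
    """Return (1-based start, run) for each run of at least min_len consecutive Gs."""
    pattern = re.compile(r"G{%d,}" % min_len)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


def has_quadruplex_motif(sequence, min_len=2, max_loop=7):
    """Simplified check: four G runs separated by loops of 1..max_loop nucleotides.

    max_loop=7 is an illustrative limit, not a value taken from the paper.
    """
    run = "G{%d,}" % min_len
    loop = "[ACGTU]{1,%d}?" % max_loop
    motif = re.compile("(?:%s%s){3}%s" % (run, loop, run))
    return motif.search(sequence.upper()) is not None


# Human telomeric repeat, used purely as an illustrative input.
telomere = "GGGTTAGGGTTAGGGTTAGGG"
print(find_g_runs(telomere))
print(has_quadruplex_motif(telomere))
```

Setting `min_len=2` mirrors the observation that some G runs in the HIV-1 promoter contain only two guanines. Bulged quadruplexes that recruit an isolated G would not be detected by this simple pattern.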

Table 1

G-Rich Sequences in the Sp1 Binding Region in Different Immunodeficiency Viruses

Numbers indicate locations with reference to the transcription start site. G runs are shown in gray boxes.

G-Rich Sequences in U3 of HIV-1 RNA and Single Stranded DNA Form G-Quadruplex Structure

The distinctive signature of cation-dependent pauses during RT progression serves as a simple test to verify whether the G-rich region in a template sequence can form G-quadruplex structure.[33,34,40] In the presence of K+ or Na+, the G-quadruplex structure is formed in the template, and it pauses RT during cDNA synthesis. In the presence of lithium ions, the structure is not formed, so G-quadruplex pausing does not occur. With lithium, higher salt and template concentration can also induce G-quadruplex formation. The RT progression assay is therefore performed at low salt (50 mM) and template (16 nM) concentrations so that ion type-dependent pausing is clearly evident. Pausing caused by hairpin structures occurs with either ion, allowing it to be readily distinguished. In order to assess G-quadruplex formation in HIV-1 U3 RNA, a 92-nt long U3 sequence (8960–9051 in the RNA genome) was synthesized by transcription in vitro. For analysis of G-quadruplex formation in the corresponding single stranded DNA, we made an oligonucleotide with the sequence of the region from −116 to −25 in the HIV-1 promoter. HIV-1 RT-directed cDNA synthesis on these RNA and DNA strands was performed in the presence of K+ or Li+, and results in Figure 2 show that cation-dependent pauses of RT were observed on both templates. Two strong pauses of RT were produced solely in the presence of potassium on the RNA molecule. The first pause is located between the 20-nt and 30-nt markers. In the sequence, this region roughly corresponds to the first G run, which is at position 24. The second strong pause is observed close to the 30-nt marker and roughly corresponds to the second G run, which starts at position 29 on the sequence. A weaker third RT pause is also observed between the 30-nt and 40-nt markers and might correspond to the third G run, which starts at position 34 on the sequence. Since these RT pauses are not observed in the presence of lithium ions, they must be caused by a structure formed in the presence of potassium but not lithium. Such a structure is a G-quadruplex. The RT pause profiles for RNA and DNA look similar except for an additional pause site at the first G run encountered in the RNA template. The cation-independent RT pauses occurring in reactions done with either potassium or lithium ions likely result from hairpin structures. Hairpins are known to pause RT, and their formation is dependent on ionic strength but not the identity of the ions.[33]
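
The two numbering schemes used for this region, promoter positions relative to the transcription start site and positions in the RNA genome, can be cross-checked with a short Python sketch. The offset is not stated in the text. It is inferred here from the paired ranges quoted above (−80 to −48 with 8996–9028, and −116 to −25 with 8960–9051), so the helper names and the conversion should be read as assumptions that hold only for negative promoter positions within this U3 segment.

```python
# Paired ranges quoted in the text: (promoter start, promoter end, RNA start, RNA end)
PAIRED_RANGES = [
    (-80, -48, 8996, 9028),    # G-rich region with the three Sp1 sites
    (-116, -25, 8960, 9051),   # 92-nt template used in the RT progression assay
]


def infer_offset(ranges):
    """Return the single offset that maps every promoter endpoint to its RNA endpoint."""
    offsets = set()
    for prom_start, prom_end, rna_start, rna_end in ranges:
        offsets.add(rna_start - prom_start)
        offsets.add(rna_end - prom_end)
    if len(offsets) != 1:
        raise ValueError("paired ranges are not related by one constant offset")
    return offsets.pop()


OFFSET = infer_offset(PAIRED_RANGES)


def promoter_to_rna(position):
    """Convert a negative promoter position to the inferred RNA genome position."""
    return position + OFFSET


def span_length(start, end):
    """Inclusive length of a coordinate range."""
    return end - start + 1


print(OFFSET, span_length(8960, 9051), span_length(-116, -25))
```

Both quoted range pairs give the same offset and the same inclusive lengths, which is consistent with the RNA and DNA templates covering the same 92-nt segment.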

Figure 2

RT progression assay determining G-quartet formation in the Sp1 binding region. Cation-dependent pausing of reverse transcription at the guanine-rich elements in the U3 region was analyzed with RNA and DNA templates. A fragment of the RNA/DNA sequence of the Sp1 binding region is shown (top) with G-rich elements (shaded). Strong pauses of the RT near G-rich elements were observed in the presence of 50 mM of KCl, but not LiCl, indicating that these elements are involved in the formation of structure, which is stabilized by potassium ions but destabilized by lithium ions. This is indicative of a G-quadruplex. The cation-independent RT pauses are likely caused by hairpin structures. DNA primer, P; DNA marker, M; KCl, K; LiCl, Li.

To further confirm the formation of G-quadruplex structure in the HIV-1 U3 region, we analyzed the folding of RNA and DNA sequences by circular dichroism (CD) spectroscopy. CD spectroscopy is widely used to determine G-quadruplex formation in RNA and DNA. The CD spectrum of the sequence folded into a G-quadruplex with a parallel configuration has a positive peak near 263 nm and a negative peak at 241 nm. Similar peaks appear near 295 and 260 nm, respectively, for antiparallel G-quadruplexes, with an additional peak around 240 nm. The parallel configuration refers to the structure in which the 5′–3′ direction of all the strands that form G-quartets is the same. If one or more strands have a 5′–3′ direction opposite to the other strands forming G-quartets, the G-quadruplex is said to have adopted an antiparallel topology. However, G-quadruplexes are highly polymorphic, and more complicated forms can be adopted displaying altered CD profiles. CD spectra with two positive peaks around 295 and 263 nm might indicate that the sequence adopts parallel and antiparallel G-quadruplexes. However, bulged G-quadruplexes and hybrid configurations with strands in parallel and antiparallel in the single G-quadruplex also display a similar profile.[21,41] Different configurations are seen for G-quadruplexes folded by DNA molecules, but for RNA sequences, all G-quadruplexes described so far exhibit parallel topology. Our CD structural analysis of a 35-nt long RNA with the sequence of HIV-1 U3 (8995–9029) in 100 mM KCl showed a typical profile of a parallel G-quadruplex, with a positive and negative ellipticity at 263 and 241 nm, respectively (Figure 3). However, the CD spectrum of the DNA sequence showed a positive peak around 295 and a negative peak around 260 nm indicating a G-quadruplex with antiparallel topology, although it is possible that the structure is a hybrid with strands in antiparallel and parallel orientations since a peak at 240 nm is not clearly seen.[41] These results confirm that both RNA and single stranded DNA of the HIV-1 U3 region containing the Sp1 binding elements form G-quadruplexes but with different configurations.
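
The peak-position rules of thumb described above can be written down as a small decision function. The sketch below is only a mnemonic for those rules and not a substitute for structural analysis. The 5 nm tolerance, the function names, and the returned labels are hypothetical choices, and real spectra would first need peak picking, which is not shown.

```python
def near(value, target, tol):
    """True if value lies within tol nanometers of target."""
    return abs(value - target) <= tol


def classify_cd_profile(positive_peaks, negative_peaks, tol=5.0):
    """Assign a tentative G-quadruplex topology from CD peak positions (nm).

    Rules follow the text: positive ~263 / negative ~241 suggests parallel;
    positive ~295 / negative ~260 suggests antiparallel; two positive peaks
    near 295 and 263 suggest a mixture, a hybrid, or a bulged structure.
    """
    pos_263 = any(near(p, 263, tol) for p in positive_peaks)
    pos_295 = any(near(p, 295, tol) for p in positive_peaks)
    neg_241 = any(near(n, 241, tol) for n in negative_peaks)
    neg_260 = any(near(n, 260, tol) for n in negative_peaks)

    if pos_295 and pos_263:
        return "mixed parallel/antiparallel, hybrid, or bulged"
    if pos_263 and neg_241:
        return "parallel"
    if pos_295 and neg_260:
        return "antiparallel (or hybrid)"
    return "unassigned"


# Peak positions reported in the text for the U3 RNA and the single stranded DNA.
print(classify_cd_profile([263], [241]))   # RNA
print(classify_cd_profile([295], [260]))   # DNA
```

A profile that matches none of the rules is returned as "unassigned", reflecting the point that atypical spectra need other techniques to resolve.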

Figure 3

CD spectral analysis of the RNA and single stranded DNA with Sp1 binding sites in HIV-1. CD spectra indicate the formation of the parallel G-quadruplex for the RNA template and an antiparallel or hybrid G-quadruplex for single stranded DNA. For reference, a profile of the G-rich sequence (50% of Gs; GGGGGGAUUGUGUGGUACAGUGCAGAGA), which is unable to adopt G-quadruplex structure, is shown in gray.

Single Stranded DNA Containing the Three HIV-1 Sp1 Sites Adopts Different Monomer G-Quadruplex Configurations

Our cation-dependent RT progression analysis and CD spectra show that a G-quadruplex is formed by the HIV-1 promoter single stranded DNA template. This implies that the DNA sequence adopts a G-quadruplex configuration when complementary strands are separated in the DNA duplex during transcription. The G-rich sequences in promoters of cancer-related genes have been shown to form monomer quadruplex structures.[2,4,5,8−11] Since the G-quadruplex monomer is expected to have a biological significance in the HIV-1 promoter, we used native gel analysis to determine how efficiently it is formed over dimers and tetramers, which are intermolecular forms of G-quadruplex that do not require multiple G runs. In this approach, the 32P-labeled strand is first incubated under G-quadruplex folding conditions and then loaded into a native gel to separate monomers, dimers, and tetramers from nonfolded molecules. The gel contains KCl at a concentration of at least 10 mM to maintain the integrity of folded structures. The monomer G-quadruplex appears in the gel as the fastest migrating band, then the nonfolded sequence, whereas dimers and tetramers are slower migrating forms. Because the Sp1 binding elements of the HIV-1 promoter have seven G runs, to gain information on their individual roles in a G-quadruplex formation, we analyzed several DNA oligonucleotides covering different sets of four G runs. The sequences of the c-kit promoter (Sp1 site),[42] altered thrombin aptamer (TBA),[43] and telomere (TEL),[44] had been confirmed to fold into monomer G-quadruplexes and so were used as positive controls. Results in Figure 4A show that the single stranded DNAs with Sp1 binding elements migrate faster than expected when incubated in the G-quadruplex folding condition. Formation of G-quadruplex by shorter sequences (b–e) indicates that each of the seven G runs can be involved in the formation of a G tetrad. 
The nonfolded samples, which are samples incubated in the absence of salt, do not migrate according to their sizes and appear as smeared bands for oligomers d, e, and c-kit, indicating that 10 mM KCl in a gel is sufficient to induce their folding into G-quadruplex during electrophoresis. A KCl concentration of 10 mM has been reported to be sufficient to induce G-quadruplex folding in RNA and single stranded DNA.[4,34,45] In addition to monomers, oligomers d and e can also form dimers.

Figure 4

Single stranded DNA of the HIV-1 promoter with Sp1 binding sites folds completely into different G-quadruplex monomers, and each of seven G runs participates in the formation of G-quartets. (A) Native gel analysis showing that the three Sp1 sites adopt monomer G-quadruplexes with different sets of G runs (oligomers b–e). A sequence folded into a monomer G-quadruplex (F) migrates faster than expected for its size. The nonfolded (U) controls of the b, c, d, e, and c-kit also migrate faster and appear as smeared bands indicating that 10 mM KCl in the gel induced their folding into G-quadruplex during electrophoresis. The sequences of the c-kit promoter, altered thrombin aptamer (TBA) and telomere (TEL), were confirmed to fold into monomer G-quadruplexes and were used as positive controls. Dimeric G-quadruplexes (D) were observed for oligomers d and e. F, samples incubated in the presence of 100 mM KCl; F2, sample incubated in the presence of 250 mM KCl; and U, samples incubated in the absence of KCl. (B) CD spectra for shorter sequences b–e. Oligomers b and e display a profile known for mixed G-quadruplexes with either parallel or antiparallel configuration and also known for bulged and hybrid G-quadruplexes. CD profiles for oligomers c and d are similar to profiles of G-quadruplexes with antiparallel topology and hybrid or basket-type configuration, respectively.

The CD spectral analysis confirms the formation of G-quadruplexes for sequences b, c, and e (Figure 4B); however, the profiles differ for each sequence indicating that they exhibit different configurations. The CD spectra for oligomers b and e have two peaks around 295 and 260 nm, which might represent a mixture of parallel and antiparallel G-quadruplexes.[46] However, such profiles are also described for hybrid (3 + 1) and bulged G-quadruplexes.[21,41] The profile for oligomer d has a strong maximum around 295 nm and a minimum around 250 nm. Such a profile is generally not observed for G-quadruplex structures, although a few reports have attributed these spectral characteristics to hybrid conformations containing a mixture of both parallel and antiparallel strand orientations and to basket-type G-quadruplexes.[47,48] Other techniques, such as NMR spectroscopy, would need to be used to determine what structure is formed by oligomer d. The CD spectrum for oligomer c is similar to the profile describing G-quadruplexes with antiparallel configuration. These results indicate that the Sp1 binding region can easily and fully adopt different monomer G-quadruplexes. The ability of the sequence to adopt different G-quadruplex configurations was previously reported and might reflect that G-quadruplexes with different topology could have different functions.[49]

hf2 Antibody Recognizes Two G-Quadruplexes Formed in the Sp1 Binding Region of the HIV-1 Promoter

Engineered proteins, such as single-chain antibodies and the Gq1 zinc finger protein, have been previously generated as molecular probes to study G-quadruplex structures.[37,50] Using phage display technology, Fernando and co-workers generated an hf2 antibody with high binding affinity to the G-quadruplex formed by c-kit2 found in the promoter of the c-kit proto-oncogene.[37] Recent studies determined that this antibody also has binding affinity to G-quadruplexes formed in other genomic regions.[46] In order to determine whether hf2 would bind to structures formed in the Sp1 binding sites of HIV-1, we produced phages displaying hf2 antibodies on their surface and performed a phage ELISA against structures formed by oligomers b–e. Results showed that the hf2-displaying phage binds to G-quadruplexes formed by two HIV-1 sequences, oligomers b and c (Figure 5A). Serial dilutions of phages showed that the highest affinity of the hf2 antibody was for the G-quadruplex formed by the c-kit2 sequence, as expected (Figure 5B). A lower affinity was observed for structures formed by oligomers b and c, indicating that these G-quadruplexes must contain some structural features similar to those of the G-quadruplex formed by the c-kit2 sequence. Very weak binding was detected for the structure formed by oligomer d, whereas no binding was detected for oligomer e or for the two controls, single-stranded and double-stranded DNA. As an additional control, we performed a phage ELISA with phages displaying unrelated antibodies. No binding was observed for any of the analyzed sequences (data not shown). In summary, our results show that the hf2 antibody is able to recognize and bind to two G-quadruplex structures of the Sp1 binding region in the HIV-1 promoter. This provides direct evidence that oligomers b and c adopt G-quadruplex configurations. The lack of interaction between the hf2 antibody and oligomers d and e might suggest that neither sequence forms a G-quadruplex. However, although the hf2 antibody displayed the ability to recognize different G-quadruplexes, it likely does not recognize all configurations. Thus, other methods would have to be used to provide direct evidence that oligomers d and e adopt this structure.

Figure 5

Interactions between phages displaying the hf2 scFv and G-quadruplexes formed in the HIV-1 Sp1 sites. (A) Phage ELISA shows that the c-kit2 G-quadruplex-specific antibody recognizes two of four G-quadruplexes formed by sequences of the HIV-1 Sp1 sites. (B) Binding curves of hf2-DNA interactions with serial dilutions of phages show that hf2 binds oligomers b and c with lower affinity than it binds the c-kit2 G-quadruplex. No significant interactions were observed for oligomer d, and no binding was seen for oligomer e or for the two controls, single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA).

Sp1 Protein Binds to a G-Quadruplex Formed in the HIV-1 Promoter

Recent studies showed that the Sp1 binding sites of the c-kit promoter fold into a G-quadruplex that is recognized and bound by Sp1 protein.[42] The G-rich sequence of the c-kit Sp1 site and distribution of G runs are different from Sp1 sites in HIV-1. In order to determine whether Sp1 can also bind to an HIV-1 sequence folded into a G-quadruplex, we used a previously described affinity selection approach.[42] In this method, the pull-down of protein is performed with the 3′-biotinylated oligonucleotide with the sequence folded into G-quadruplex and immobilized to streptavidin-coated magnetic beads. The Sp1 protein used for affinity selection is at a concentration of 1.98 nM. As indicated above, the three Sp1 sites have the ability to adopt different G-quadruplex configurations, in which different sets of G runs participate in the formation of a G-quartet. However, for some configurations, the folding might involve G runs of two Sp1 binding sites leaving one site unfolded. To ensure that only one Sp1 site is available for protein binding in a pull-down of Sp1 and that it is folded, we used as bait a 21-nt sequence of oligomer c with the Sp1 site II surrounded by two G runs from Sp1 sites I and III (Figure 4). The native gel analysis shows that this sequence transforms completely into a monomer G-quadruplex (Figure 4A), and CD spectra indicate that the G-quadruplex is likely in an antiparallel configuration (Figure 4B). In assessing Sp1 binding to a G-quadruplex in the HIV-1 promoter, we used, as a positive control, an oligonucleotide with the sequence of the c-kit promoter for which the binding of Sp1 to its site folded as a G-quadruplex had been confirmed.[42] As a negative control, selection of Sp1 protein was done with beads not coupled to any DNA. As seen in the Western blots in Figure 6A, the Sp1 protein was pulled down with the HIV-1 Sp1 binding site II folded into a G-quadruplex with the same efficiency as with a G-quadruplex of the c-kit promoter. 
When the dsDNA of this sequence was used for Sp1 selection, the interaction of the protein with the DNA was disrupted by the presence of a 3- or 6-fold molar excess of oligomer c with the Sp1 site II folded into a G-quadruplex (Figure 6B). This indicates that the Sp1 site in a G-quadruplex form competes efficiently with dsDNA for binding to Sp1. Importantly, the interaction of Sp1 with dsDNA was not affected by the presence of a 6-fold molar excess of the nonspecific sequence (T)15CTA (ds/ss line in Figure 6B). In summary, these results confirm that Sp1 protein can recognize and bind to its binding element in the HIV-1 promoter folded into a G-quadruplex configuration.

Figure 6

Sp1 binds to a G-quadruplex in the HIV-1 promoter. (A) Sp1 is selected by a G-quadruplex (ss-GQ) with the sequence of the Sp1 site II of the HIV-1 promoter (top) with the same efficiency as by a G-quadruplex with the sequence of the c-kit promoter. Sp1 is also selected with the same efficiency by these sequences in dsDNA form. (B) A 3- or 6-fold molar excess of oligomer c with Sp1 site II (top) folded into a G-quadruplex competes efficiently with the dsDNA sequence for binding to Sp1. Sp1 binding to the dsDNA is not affected by the presence of a 6-fold molar excess of a nonspecific 18-nt sequence, (T)15CTA.

Dimerization of RNA Strands in the HIV-1 U3 Region

Since the two copies of the HIV-1 RNA genomes are held together at DIS, the ability of the RNA strands to fold into a G-quadruplex configuration raises the possibility that multiple G-quadruplex structures formed between the two viral RNA genomes support additional dimer contacts during reverse transcription. In order to determine whether an intermolecular G-quadruplex in the U3 region can be formed between two RNA strands, we first used our previously developed affinity selection approach to test if two homologous sequences of U3 can interact.[34] In this method, the RNA strands tagged with poly(A) tail are incubated with nontagged RNA molecules, and then the mixture is subjected to affinity selection with magnetic beads conjugated with oligo-d(T)25. The interaction between nucleic acids is measured by analyzing selected RNA molecules. The interacting partners are distinguished by their size in a denaturing gel stained with ethidium bromide. The quantity of selected nontagged template is expected to be lower since interactions will also occur between two tagged RNAs and two nontagged RNAs. In addition, the formation of an intermolecular G-quadruplex between two templates likely competes with the formation of intramolecular G-quadruplexes in both templates. The RNA strands corresponding to positions 8960–9037 of the HIV-1 NL4–3 RNA genome were synthesized with poly(A) tails and were then coincubated with equivalent RNA strands but devoid of a poly(A) sequence. Because the gag regions near DIS from HIV-1 MAL and NL4–3, and the cPPT region in NL4–3 were shown to form G-quadruplex dimers, we used RNA strands of these regions as positive controls.[30−32,34] The RNA strands with gag sequences (303–415 in HIV-1 MAL and 290–403 in HIV-1 NL4–3) did not include the DIS region. The RNA partners were combined and incubated to allow dimerization, and subsequently used for affinity selection with magnetic beads. 
As shown in Figure 7, the nontagged RNA strands of the U3 region were coselected with the corresponding poly(A) tagged RNAs, similar to results with the gag and cPPT regions, for which dimeric G-quadruplex formation was previously reported. The slower migration of the tagged U3 RNA results from the longer poly(A) tag used for this template. Nontagged RNA with the U3 region sequence was not selected by magnetic beads in the absence of the poly(A) tagged partner, demonstrating that the observed interactions do not result from nonspecific binding to the magnetic beads.

Figure 7

Affinity selection of HIV-1 RNAs enriched in G residues. Poly(A) tagged RNA templates are sequences elongated at the 3′ end with a poly(A) tag in order to select them with oligo(dT)25 magnetic beads. After incubation and affinity selection, the samples were analyzed in a denaturing gel stained with ethidium bromide. The nontagged RNAs (faster migrating species) of the gag region (RNA genomic sequence 303–415 of the MAL isolate and 290–403 of NL4-3), cPPT region (4309–4396 of NL4-3), and U3 region (8960–9037 of NL4-3) were selected with the corresponding poly(A) tagged RNA partners. No U3 region RNA was selected in the absence of a poly(A) tagged partner (lane C).

In order to determine whether RNA interactions in U3 have characteristics of intermolecular G-quadruplexes, the interactions were also investigated by native gel analysis. Previous studies showed that the ability of a test sequence to form a dimer through an intermolecular G-quadruplex increased with template and salt concentration; however, the yield of RNA dimers correlated inversely with the size of the monovalent cation (i.e., Li+ > Na+ > K+).[30,31,34] Thus, G-quadruplex RNA dimers can form more efficiently in the presence of a high concentration of LiCl than of KCl, although the complexes are less stable. To compare the cation-dependent association and thermal stability profiles of complexes formed by the G-rich U3 region, we used radiolabeled RNA templates and analyzed complexes in a native gel. A 78-nt HIV-1 fragment of the U3 region (8960–9037) at a concentration of 4 μM in buffer containing 1 M KCl, NaCl, or LiCl was heated to 95 °C, chilled on ice, and subsequently incubated for 1 h at room temperature. A 1 M concentration of salt was used to provide optimal conditions for G-quadruplex folding in the presence of lithium ions. The mixtures were then exposed for 8 min to different temperatures, and the stability of the complexes was analyzed in a native gel. The results in Figure 8A show that dimer complexes were formed more efficiently in the presence of LiCl and NaCl than in the presence of KCl. This observation is consistent with results obtained in previous studies for regions of gag and cPPT in HIV-1.[30,31,34] With higher temperature, all complexes were less stable, but the rate of dissociation was higher for complexes formed in the presence of lithium ions. For example, about 77% of complexes formed in the presence of potassium ions were still present at 70 °C, whereas about one-third of complexes formed in the presence of sodium and about half of those formed in the presence of lithium ions had dissociated at this temperature. The low level of RNA G-quadruplex dimers formed in the presence of potassium ions resulted from a slower rate of folding, and up to 50% yield of complexes could be achieved after 24 h of incubation at lower salt concentration (0.2 or 0.5 M) (Figure 8B). These results demonstrate that dimeric complexes formed by the RNA template of the U3 region display characteristics expected for intermolecular RNA G-quadruplex structures.
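The percentages quoted above (dimer yield, and the share of complexes still present at a given temperature) are simple ratios of band signals. The text does not state how they were computed, so the following Python sketch is only one plausible reading: it assumes the dimer yield is the dimer signal divided by the total signal in the lane, and that the fraction remaining is normalized to a reference (lowest temperature) sample. The function names and the numbers are hypothetical and are not data from the study.

```python
def dimer_fraction(dimer_signal, monomer_signal):
    """Fraction of RNA in the dimer band (assumed: dimer / (dimer + monomer))."""
    total = dimer_signal + monomer_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return dimer_signal / total


def fraction_remaining(fraction_at_temperature, fraction_at_reference):
    """Dimer fraction at a test temperature relative to a reference sample."""
    if fraction_at_reference <= 0:
        raise ValueError("reference fraction must be positive")
    return fraction_at_temperature / fraction_at_reference


# Hypothetical band intensities in arbitrary units (not data from the study).
reference = dimer_fraction(dimer_signal=50.0, monomer_signal=50.0)
heated = dimer_fraction(dimer_signal=25.0, monomer_signal=75.0)

print(f"dimer yield at reference: {100 * reference:.0f}%")
print(f"complexes remaining after heating: {100 * fraction_remaining(heated, reference):.0f}%")
```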

Figure 8

Cation-dependent association and thermal stability of the RNA dimer formed by a sequence with the Sp1 sites of HIV-1 U3. (A) RNA dimers were allowed to form for 1 h at a template concentration of 4 μM in buffers containing 1 M KCl, NaCl, or LiCl. One volume of Tris-EDTA buffer was added, and 15-μL aliquots were incubated at the indicated temperatures for 8 min. Thermal stabilities of RNA dimers were measured by analyzing samples in nondenaturing gels run at 4 °C. Higher yield of RNA dimers formed in the presence of 1 M LiCl and their lower thermal stability are characteristics of complexes formed through intermolecular G-quadruplexes. (B) The yield of RNA dimers increases after overnight incubation in the presence of potassium ions and at a lower salt concentration (0.2 and 0.5 M).

G-Quadruplex in RNA Facilitates RT in Switching Templates during Reverse Transcription

The folding of an RNA sequence into a structure that can pause RT is linked to an increased rate of viral recombination. Our previous studies showed that under conditions that encourage the formation of G-quadruplex structure, the rate of template switching during reverse transcription increased for the G-rich regions in gag and near the cPPT, suggesting that the structure increases the efficiency of recombination.[33,34] These observations are in agreement with studies in vivo on the distribution of recombination breakpoints in the HIV-1 genome, which show a higher recombination rate in these two regions.[51−53] The U3 region of the RNA genome is also a site of frequent recombination events, which generally occur upstream of the Sp1 binding sites.[54] Significantly, studies revealed that efficient crossovers in U3 rely on the presence of the 150-nt long sequence at the 3′ end of U3, containing the G-rich elements of the Sp1 binding sites.[55] In order to evaluate the influence of G-quadruplex formation on strand transfer efficiency (the recombination reaction), we used a reconstituted system consisting of HIV-1 RT, HIV-1 nucleocapsid protein (NC), a primer (DNA oligonucleotide), and two RNA templates representing the two copies of the HIV-1 RNA genome (Figure 9A). To initiate the reaction, a 32P-labeled DNA primer was annealed to the RNA template (HIV-1 genome sequence 8960–9054), denoted here as donor RNA. The second RNA template, denoted as acceptor RNA, had the HIV-1 sequence 8961–9033 and shared homology with the donor RNA over 72-nt of the Sp1 binding region. The 5′ end of the acceptor RNA was elongated with GGAAAAAAAAAA so that transfer products (TP) could be separated and distinguished on a denaturing gel from DNA fully extended on the donor RNA (DE). End transfer of the DE was prevented because our acceptor RNA did not share homology with two nucleotides at the 5′ end of the donor RNA (a circle in Figure 9A). 
Thus, all transfers to the acceptor RNA originated only from internal regions of the donor template, as they do in vivo during reverse transcription over the U3 region.

Figure 9

Formation of the structure stabilized by potassium ions in the HIV-1 U3 region facilitates RT template switching during reverse transcription. (A) Reconstituted system to analyze the influence of G-rich elements on strand transfer during HIV-1 minus strand DNA synthesis in vitro. Donor and acceptor RNA templates represent two copies of the viral RNA genome; reverse transcription is initiated from a 32P-labeled DNA primer annealed to the donor RNA. The acceptor RNA does not share homology (circle) with two nucleotides at the 5′ end of the donor RNA. TP, transfer product; DE, donor extension product; and P, DNA primer. (B) A time course of strand transfer reactions performed in the presence of potassium and lithium ions. Samples were collected at 1, 5, 15, and 30 min after the reaction was initiated. Formation of a potassium-dependent structure, anticipated to be a G-quadruplex, in the RNA template paused the RT during minus strand DNA synthesis and influenced the yield of the final products. The transfer efficiency decreased by about 37% in reactions with lithium ions, presumably because the templates could not form a G-quadruplex.

The strand transfer assays were performed in the presence of a low concentration of either K+ or Li+. The monovalent cations added to the reaction do not significantly affect the enzymatic activity of RT.[56,57] The major cation-dependent pause sites of RT synthesis were clearly visible within the G-rich region in the presence of K+ but not Li+ (Figure 9B). An ion-independent RT pause, likely resulting from a hairpin structure, was also visible. The strong pauses in potassium, presumed to result from G-quadruplex formation, evidently caused dissociation of the RT, since fewer final products were made than in reactions in lithium, where no RT pauses were seen in the G-rich region. The transfer efficiency of reactions was calculated by comparison of values of donor extension products and transfer products using the following formula: transfer efficiency = 100 × TP/(TP + DE). As expected, the elimination of ion-dependent RT pauses by lithium caused a drop in transfer efficiency of about 37%. Since ion-dependent RT pauses are associated with G-quadruplex formation, the increase in template switching likely resulted from RT encountering a G-quadruplex. This is consistent with a general observation that a structure capable of pausing the RT also facilitates transfer reactions; however, the effect observed here is less striking than for two other G-quadruplex forming sequences in a region of gag and cPPT.[33,34] In summary, these results show that G-quadruplex(es) formed at the 3′ end of the viral RNA genome are likely contributors to the increased recombination rate in the U3 region.
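The transfer efficiency formula given above, 100 × TP/(TP + DE), can be written as a short Python function. This is a minimal sketch: the function names and the band intensities are hypothetical, and the `relative_change` helper assumes that a reported drop is expressed relative to the potassium value, which the text does not state explicitly.

```python
def transfer_efficiency(tp, de):
    """Transfer efficiency in percent: 100 * TP / (TP + DE).

    tp: signal of transfer products; de: signal of donor extension products.
    """
    total = tp + de
    if total <= 0:
        raise ValueError("TP + DE must be positive")
    return 100.0 * tp / total


def relative_change(reference, value):
    """Percent change of value relative to reference (negative means a drop).

    Treating the drop as relative to the reference is an assumption here.
    """
    if reference == 0:
        raise ValueError("reference must be nonzero")
    return 100.0 * (value - reference) / reference


# Hypothetical band intensities in arbitrary units (not data from the study).
efficiency_k = transfer_efficiency(tp=30.0, de=70.0)
efficiency_li = transfer_efficiency(tp=20.0, de=80.0)

print(f"K+ : {efficiency_k:.1f}%")
print(f"Li+: {efficiency_li:.1f}%")
print(f"relative change: {relative_change(efficiency_k, efficiency_li):.1f}%")
```

Because the efficiency is a ratio within each lane, it is insensitive to differences in total product yield between the potassium and lithium reactions, which matters here since fewer final products were made in potassium.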

Discussion

Many promoters have multiple runs of G residues, including those with several G-rich Sp1 transcription factor binding motifs in tandem. Genome-wide computational analyses have suggested that such tandem Sp1 binding sites form G-quadruplex structures.[58,59] Direct evidence that Sp1 binding elements form G-quadruplexes that regulate gene expression has already been derived from studies of the c-myc promoter.[1,12,15−17] The HIV-1 promoter has three Sp1 binding elements, and our analysis of the region in U3 revealed that the sequence readily adopts different G-quadruplex configurations within both its RNA and single-stranded DNA forms. These results predict that during viral infection the G-quadruplex structure forms in both the RNA genome and the promoter of the HIV-1 provirus, and presumably has important biological functions in both environments. In fact, Perrone and co-workers recently published results showing that HIV-1 promoter activity is impaired in the presence of a G-quadruplex binding ligand, indicating that the structures do have a role in regulating virus expression.[35] The group also demonstrated that different G-quadruplexes are formed in the U3 region of DNA and that some of them involve G runs of Sp1 sites together with a G run in the NF-κB site, which was not a subject of our studies. Together, however, their results and ours show that the entire G-rich transcription factor binding region in the HIV-1 proximal promoter adopts different forms of G-quadruplex structure. The variation of genomic sequence among various viral species is higher for the U3 region containing the Sp1 binding sites than for the protein coding regions. However, according to our computational analysis, the sequence variants all retain the ability of the Sp1 binding sites to form G-quadruplex structure. 
This observation additionally supports the conclusion that G-quadruplex structure is an important element of the HIV-1 promoter region that is maintained by a strong evolutionary pressure. Because the G-quadruplex structure forms in the Sp1 transcription factor binding region, it is likely that it participates in the regulation of promoter function. Recent studies of the c-kit promoter showed that Sp1 protein can bind to its sequence folded into a G-quadruplex.[42] The sequence and G run distribution in HIV-1 Sp1 sites differ from those of the Sp1 binding region in the c-kit promoter. However, our results also show Sp1-G-quadruplex interactions, indicating that Sp1 could regulate HIV-1 promoter activity through binding dsDNA or G-quadruplex structure and that these binding characteristics might be linked with different functions of Sp1. Moreover, we found that the G-rich HIV-1 sequence with three Sp1 binding sites can adopt alternative configuration G-quadruplexes with different interacting sets of G runs and that antibodies specific to the c-kit2 G-quadruplex recognize two of these forms. Such diversity suggests that the G-rich Sp1 binding region uses its ability to adopt several non-B DNA configurations as a complex switching mechanism involving subtle Sp1 interaction differences to fine-tune transcriptional output. Why might such complex alternative protein–DNA interactions be necessary? Sp1 is a strong activator of HIV-1 expression, but Sp1 activity is also associated with maintaining virus latency.[60,61] Importantly, the Sp1 binding elements are within a region of the HIV-1 promoter that remains nucleosome-free despite hypermethylation of two CpG islands that flank the HIV-1 transcription start site.[62,63] As a result, the region remains accessible to transcription regulators, any of which might act through G-quadruplex structure. 
G-quadruplex binding agents and proteins were shown to suppress cellular promoter activity by stabilizing a G-quadruplex formed in the Sp1 binding region, but unwinding of the structure reactivated gene expression.[1,12−17] Interestingly, stimulation of cellular promoter activity through G-quadruplex structure was recently also demonstrated.[18,19] However, recently published results by Perrone and co-workers indicate that G-quadruplexes formed in HIV-1 promoter have a rather inhibitory effect on viral transcription.[35] Formation of G-quadruplexes in the HIV-1 RNA genome has previously been demonstrated in gag near the RNA 5′ end and recently in the central region of the genome near the cPPT.[30−32,34] Significantly, both sequences can form dimer complexes through intermolecular G-quadruplex structure, indicating that G-quartets might be formed that link the two RNA genomes. Our results show that short RNA molecules with HIV-1 sequences containing the Sp1 binding sites can also dimerize and that their interactions have characteristics of the intermolecular G-quadruplex. This suggests that the U3 region is an additional point of contact in a multiply linked genome dimer and together with G-rich sequences in gag and cPPT helps to maintain interaction of the HIV-1 genomes along their whole sequence. Apart from dimerization through the DIS region, additional interactions through intermolecular G-quadruplexes would explain the ability of viral RNA genomes to maintain a dimeric configuration when binding-disruptive mutations are introduced in DIS.[64,65] Moreover, additional contacts would also be necessary to keep the two 9kb-long RNA sequences in proximity for efficient reverse transcription and the observed widely distributed hot spots for recombination. 
The ability of the G-rich sequence in U3 to form a dimer also means that a contact point at the 3′ end provides equal accessibility of either of the 3′ ends of the copackaged RNA genomes for minus strong stop DNA transfer. Indeed, studies showed that transfers of the minus strand DNA primer occur with the same frequency to each 3′ genomic end.[54] Consistent with these expectations, our results demonstrate that the G-quadruplex formed by the Sp1 binding sites is likely a structural element contributing to the increased recombination rate in U3. Detailed measurements of strand transfers at the ends of HIV-1 showed that 11 of 86 analyzed clones (12%) underwent homologous recombination in the U3 region, and all crossovers occurred upstream of the G-rich region of the Sp1 binding sites.[54] Other results showed that recombination in U3 dropped significantly in tests conducted with a template construct missing a 150-nt long sequence containing the putative dimerization site at the 3′ end, suggesting the existence of sequence elements in this region that are crucial for efficient homologous recombination in this region.[55] Our analysis of minus strand transfer using a reconstituted system shows that the transfer is cation-dependent and that transfer efficiency decreases in the presence of lithium ions, which destabilize G-quadruplex structure, although the effect is less dramatic than observed for G-rich regions of gag and cPPT. This suggests that G-quadruplex(es) formed in the region of Sp1 binding sites are one of the factors causing increased recombination in U3. 
The correlation between potential G-quadruplex formation and hot spots for recombination was also found in the gag and cPPT regions and is in agreement with previous analyses in vitro showing that G-quadruplexes facilitate RT template switching during reverse transcription.[33,34,66] In summary, our current studies, combined with previous work, show that G-rich sequences in the HIV-1 genome are capable of forming G-quadruplexes in both RNA and DNA forms of the genome. The distribution of recombination hot spots correlates with the sites where G-quadruplexes are formed, and reconstituted systems confirm that these structures facilitate strand transfers. The concept of G-quadruplexes regulating the activity of the HIV-1 promoter is new, and determining how and when G-quadruplexes regulate HIV-1 transcription would enhance our understanding of HIV-1 latency and reactivation, which might help to identify a new molecular target for therapeutic reactivation of virus replication.

66 in total

1. Demonstration that drug-targeted down-regulation of MYC in non-Hodgkins lymphoma is directly mediated through the promoter G-quadruplex.

Authors: Robert V Brown; Forest L Danford; Vijay Gokhale; Laurence H Hurley; Tracy A Brooks
Journal: J Biol Chem Date: 2011-09-28 Impact factor: 5.157

2. HIV-1 nucleocapsid protein increases strand transfer recombination by promoting dimeric G-quartet formation.

Authors: Wen Shen; Robert J Gorelick; Robert A Bambara
Journal: J Biol Chem Date: 2011-07-07 Impact factor: 5.157

3. G-quadruplex structures in TP53 intron 3: role in alternative splicing and in production of p53 mRNA isoforms.

Authors: Virginie Marcel; Phong L T Tran; Charlotte Sagne; Ghyslaine Martel-Planche; Laurence Vaslin; Marie-Paule Teulade-Fichou; Janet Hall; Jean-Louis Mergny; Pierre Hainaut; Eric Van Dyck
Journal: Carcinogenesis Date: 2010-11-26 Impact factor: 4.944

4. The C-terminus of nucleolin promotes the formation of the c-MYC G-quadruplex and inhibits c-MYC promoter activity.

Authors: Verónica González; Laurence H Hurley
Journal: Biochemistry Date: 2010-10-21 Impact factor: 3.162

5. Putative DNA G-quadruplex formation within the promoters of Plasmodium falciparum var genes.

Authors: Nicolas Smargiasso; Valérie Gabelica; Christian Damblon; Frédéric Rosu; Edwin De Pauw; Marie-Paule Teulade-Fichou; J Alexandra Rowe; Antoine Claessens
Journal: BMC Genomics Date: 2009-08-06 Impact factor: 3.969

6. 5'-UTR G-quadruplex structures acting as translational repressors.

Authors: Jean-Denis Beaudoin; Jean-Pierre Perreault
Journal: Nucleic Acids Res Date: 2010-06-22 Impact factor: 16.971

7. A recombination hot spot in HIV-1 contains guanosine runs that can form a G-quartet structure and promote strand transfer in vitro.

Authors: Wen Shen; Lu Gao; Mini Balakrishnan; Robert A Bambara
Journal: J Biol Chem Date: 2009-10-12 Impact factor: 5.157

8. Identification and characterization of nucleolin as a c-myc G-quadruplex-binding protein.

Authors: Verónica González; Kexiao Guo; Laurence Hurley; Daekyu Sun
Journal: J Biol Chem Date: 2009-07-06 Impact factor: 5.157

9. A non-canonical DNA structure is a binding motif for the transcription factor SP1 in vitro.

Authors: Eun-Ang Raiber; Ramon Kranaster; Enid Lam; Mehran Nikan; Shankar Balasubramanian
Journal: Nucleic Acids Res Date: 2011-10-22 Impact factor: 16.971

10. Epigenetic regulation of HIV-1 latency by cytosine methylation.

Authors: Steven E Kauder; Alberto Bosque; Annica Lindqvist; Vicente Planelles; Eric Verdin
Journal: PLoS Pathog Date: 2009-06-26 Impact factor: 6.823
