Targeting G-quadruplex Forming Sequences with Cas9.

Hamza Balci^1,2, Viktorija Globyte², Chirlmin Joo².

Abstract

Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) proteins, particularly Cas9, have provided unprecedented control over targeting and editing specific DNA sequences. If the target sequences are prone to folding into noncanonical secondary structures, such as G-quadruplex (GQ), the conformational states and activity of the CRISPR-Cas9 complex may be influenced, but the impact has not been assessed. Using single molecule FRET, we investigated structural characteristics of the complex formed by CRISPR-Cas9 and target DNA, which contains a potentially GQ forming sequence (PQS) in either the target or the nontarget strand (TS or NTS). We observed different conformational states and dynamics depending on the stability of the GQ and the position of PQS. When PQS was in NTS, we observed evidence for GQ formation for both weak and stable GQs. This is consistent with R-loop formation between TS and crRNA releasing NTS from Watson-Crick pairing and facilitating secondary structure formation in it. When PQS was in TS, R-loop formation was adequate to maintain a weak GQ in the unfolded state but not a GQ with moderate or high stability. The observed structural heterogeneity within the target dsDNA and the R-loop strongly depended on whether the PQS was in TS or NTS. We propose that these variations in the complex structures have functional implications for Cas9 activity.

Year: 2021 PMID: 33769784 PMCID: PMC8056391 DOI: 10.1021/acschembio.0c00687

Source DB: PubMed Journal: ACS Chem Biol ISSN: 1554-8929 Impact factor: 5.100

Introduction

The capability to target desired DNA/RNA sequences or secondary structures with high specificity is crucial for many scientific, technological, and medical applications. Various approaches have been employed to achieve this goal including small molecules, nucleic acids, peptides, and proteins that recognize these sequences or structures. However, low specificity has consistently hindered widespread use of these approaches in complex settings, such as mammalian cells, where the large genome size or presence of similar structures demands high specificity. In this context, the repurposing of an adaptive prokaryotic immune system from Streptococcus pyogenes into a potent genome targeting and editing tool has been one of the most important scientific developments of recent decades. This immune system, clustered regularly interspaced short palindromic repeats (CRISPR), consists of an array of viral-derived DNA fragments (spacers) collected from previous attacks by various mobile genetic elements.[1−3] CRISPR-associated (Cas) proteins, CRISPR-RNA (crRNA), which guides Cas proteins to the target sequence, and trans-activating crRNA (tracrRNA) are other important agents of this system. For target DNA to be cleaved, near-perfect complementarity between the spacer in crRNA and the target DNA is required. In addition, a protospacer adjacent motif (PAM) needs to be in the immediate vicinity of the target sequence.[4,5] For Cas9 derived from S. pyogenes, the target sequence is 20-nt long and the PAM sequence is NGG.[6] Also, Cas9 effectively functions with a single guide RNA (sgRNA) which combines tracrRNA and crRNA.[7] The capabilities of various Cas proteins and their interacting partners over a broad range of applications are still actively researched.[8,9] However, Cas9 has been the most widely used system due to the simplicity of the CRISPR–Cas9 complex and its high specificity.[10] In addition to wild-type Cas9 from different bacteria, engineered mutants that promise higher specificity, tighter binding, enhanced nuclease activity, or reduced or disabled cleavage activity, such as endonuclease-dead Cas9 (dCas9), have been generated.[11−13] Despite the vast amount of information and know-how that has been accumulated on CRISPR–Cas9 in the last several years, the capabilities and limitations of this system in targeting DNA sequences that can form secondary structures have not been systematically investigated. Secondary structures such as G-quadruplex (GQ) structures, hairpins, and various loops (R-, D-, T-loops) are physiologically significant and have a high propensity to form when the double stranded DNA (dsDNA) is unwound during transcription, replication, or repair. CRISPR–Cas9 also targets one of the strands of dsDNA (target strand, named TS) with crRNA, which results in an R-loop between TS and crRNA,[14] while the other strand (nontarget strand, named NTS) is released and may fold into alternative secondary structures. In addition, the complementarity between TS and NTS needs to be broken for crRNA and TS to hybridize. During this transition, TS with an appropriate sequence might also transiently fold into alternative structures, especially if crRNA and TS are not perfectly complementary. Such transient or persistent secondary structures could influence target recognition,[15] binding stability, conformational dynamics, and cleavage activity of CRISPR–Cas9.
Considering the abundance of sequences that could form secondary structures, it is critical to understand how they influence CRISPR–Cas9 complex structure and function. GQs form in guanine-rich regions of the genome[16] and are characterized by stacked G-tetrad layers, which contain a guanine (G) in each corner. G-tetrads are stabilized by Hoogsteen hydrogen bonds between the guanines and monovalent cations that intercalate between them. In addition to numerous demonstrations of their formation,[17,18] both DNA and RNA GQs have been visualized in human cells and were shown to be modulated during the cell cycle.[19,20] GQs are generally thermally more stable than the corresponding dsDNA formed by Watson–Crick base pairing[21] and require protein activity to be destabilized and unfolded.[22−25] The inability to unfold or destabilize GQs impedes replication machinery and results in elevated levels of DNA breaks and genomic instability.[26−28] The most prominent sites for potentially GQ forming sequences (PQS) are the telomeric overhangs and genomic regions involved in transcription or translation level gene expression regulation. The 3′ telomeric overhang in human cells has ∼200-nucleotide (nt)-long GGGTTA repeats, which form multiple 3-layered GQs. Unless unfolded, these GQs prevent telomere elongation by inhibiting telomerase,[29] making them prominent drug targets in cancer therapy.[30] In addition to telomeres, genome-wide computational studies and high-throughput sequencing have identified several hundred thousand PQSs in the human genome.[31,32] The PQSs are significantly enriched in the human genome compared to S. cerevisiae, ∼40-fold when normalized with respect to genome size, suggesting a functional significance in higher organisms.[33] While the coding regions are in general poor in PQS, promoters, especially the immediate vicinity of the transcription start site (TSS), are rich in PQS, suggesting a role in transcription level regulation of gene expression.[34] About 50% of human genes contain a PQS within 1000 nt upstream of the TSS,[32] and interestingly PQSs are more prevalent in promoters of oncogenes and regulatory genes, such as transcription factors, compared to housekeeping genes.[35] The thermodynamic stability of GQs can be greatly modulated while keeping the overall length of the sequence similar and within the ∼20 nt target sequence of Cas9 by modulating the loop length, loop organization, or the number of G-tetrad layers.[21,36] In general, longer loops reduce GQ stability while more G-tetrad layers increase it. To illustrate, the PQS commonly used as a thrombin binding aptamer (TBA-GQ: GGTTGGTGTGGTTGG)[37] and the PQS that is used as an HIV integrase inhibitor (3L1L-GQ: GGGTGGGTGGGTGGG)[38,39] are both only 15-nt long but have thermal melting temperatures (Tm) that differ by >45 °C under physiological ion and pH conditions (Tm = 51 °C for TBA-GQ while Tm > 95 °C for 3L1L-GQ). TBA-GQ has only two G-tetrad layers and relatively long loops while 3L1L-GQ has three G-tetrad layers and 1-nt loops. We and others have demonstrated that such variations in structure influence the stability of GQs against ssDNA binding proteins and helicases.[18,24,40,41] TBA-GQ and 3L1L-GQ will be used in this study as representatives of low and high stability GQ, respectively. In addition, the GQ formed by human telomeric sequence, hGQ, GGGTTAGGGTTAGGGTTAGGG, will serve as the GQ with intermediate stability (Tm = 68 °C).[40] Using smFRET, we investigated complex formation and dynamic interactions between CRISPR–Cas9 and target dsDNA that contains one of these PQSs in either TS or NTS.
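As an illustration of how the three PQSs named above differ at the sequence level, the following minimal Python sketch checks each one against a generic "four G-runs separated by three loops" pattern. The regular expression, the loop-length limit of 7 nt, and the function names are assumptions chosen for illustration; they are not the search criteria of the genome-wide studies cited above.

```python
import re

def pqs_pattern(min_g_run=3, max_loop=7):
    """Regex for four G-runs separated by three loops (illustrative motif only)."""
    g_run = f"G{{{min_g_run},}}"
    loop = f"[ACGT]{{1,{max_loop}}}"
    return re.compile(f"{g_run}(?:{loop}{g_run}){{3}}")

def is_pqs(seq, min_g_run=3, max_loop=7):
    """Return True if the sequence contains the motif anywhere."""
    return pqs_pattern(min_g_run, max_loop).search(seq.upper()) is not None

# Sequences as quoted in the Introduction.
SEQUENCES = {
    "TBA-GQ": "GGTTGGTGTGGTTGG",          # two G-tetrad layers
    "3L1L-GQ": "GGGTGGGTGGGTGGG",         # three layers, 1-nt loops
    "hGQ": "GGGTTAGGGTTAGGGTTAGGG",       # human telomeric repeat
}

if __name__ == "__main__":
    for name, seq in SEQUENCES.items():
        print(name, "3-layer motif:", is_pqs(seq, 3), "2-layer motif:", is_pqs(seq, 2))
```

With a minimum G-run of three, only 3L1L-GQ and hGQ match; TBA-GQ matches once the minimum run is relaxed to two, in line with its two G-tetrad layers.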

Materials and Methods

DNA/RNA Constructs

Table S1 lists sequences of the DNA/RNA constructs and locations of Cy3/Cy5. The DNA oligonucleotides were purchased from Ella Biotech GmbH (Germany). RNA oligos were purchased from Dharmacon-USA (now part of Horizon Discovery, UK) or IBA GmbH (Germany). All oligonucleotides were HPLC purified by vendors. Labeling with Cy3/Cy5 was performed in the lab using a previously published protocol.[42] The double stranded DNA constructs were created by annealing TS and NTS at 95 °C for 5 min followed by slow cooling to RT at 0.2 °C/min. Figure 1 shows schematics of constructs that have a PQS in either TS or NTS. For clarity, these cases will be referred to as “[PQS in TS]” or “[PQS in NTS],” and in certain cases PQS will be replaced with the name of the GQ construct, such as “[hGQ in TS]” to describe having the hGQ sequence in the target strand. In one of the labeling schemes, the donor (Cy3) was placed on TS while the acceptor (Cy5) was on crRNA (Figure 1A–D). The fluorophore positions are kept at consistent separations for all DNA constructs with PQS and the reference construct that does not include a PQS. This arrangement enabled probing the complex between TS and crRNA. Following a smFRET assay (Supporting Information Figure S1), several different sites on TS were tested to identify a location for Cy3 such that the main FRET peak is in the EFRET ≈ 0.6–0.7 range, which should make it sensitive to conformational changes in the complex. Sites several nts up or downstream of the optimal location selected for this study resulted in either a very high or very low FRET where this sensitivity was lost (Supporting Information Figure S2). Depending on whether the PQS is in TS or NTS and whether the GQ forms in either strand or the crRNA, different complex conformations are possible (Figure 1A–D), which should result in different FRET levels. In the second labeling scheme, the Cy3 is placed on TS while Cy5 is placed on NTS within the loop region of PQS (Figure 1E).
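The preference for a main peak near EFRET ≈ 0.6–0.7 can be rationalized with the standard Förster relation, E = 1/(1 + (r/R0)^6), whose slope with respect to distance is steepest at intermediate efficiencies. The sketch below locates that steepest point numerically. The Förster radius of 5.4 nm is an assumed, commonly quoted order-of-magnitude value for Cy3/Cy5 and is not taken from this study; the function names are illustrative.

```python
def fret_efficiency(r_nm, r0_nm=5.4):
    """Foerster relation E = 1 / (1 + (r/R0)^6); the R0 default is an assumption."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

def distance_sensitivity(r_nm, r0_nm=5.4):
    """Analytic |dE/dr| in units of 1/nm."""
    x = r_nm / r0_nm
    return 6.0 * x ** 5 / (r0_nm * (1.0 + x ** 6) ** 2)

def most_sensitive_distance(r0_nm=5.4, steps=10000):
    """Scan 0.5*R0 to 1.5*R0 and return the distance where |dE/dr| is largest."""
    best_r, best_s = None, -1.0
    for i in range(steps + 1):
        r = r0_nm * (0.5 + i / steps)
        s = distance_sensitivity(r, r0_nm)
        if s > best_s:
            best_r, best_s = r, s
    return best_r

if __name__ == "__main__":
    r_best = most_sensitive_distance()
    print(f"steepest point: r = {r_best:.2f} nm, E = {fret_efficiency(r_best):.2f}")
```

Analytically, |dE/dr| peaks where (r/R0)^6 = 5/7, which corresponds to E = 7/12 ≈ 0.58 regardless of R0, so a main peak in the 0.6–0.7 range sits close to the most distance-sensitive part of the curve, whereas very high or very low FRET values fall on its flat ends.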

Figure 1

Schematics demonstrating different potential conformations of CRISPR–Cas9 and dsDNA complex. The PQS is in either TS or NTS. In A–D, the donor fluorophore (Cy3-green) is on TS and acceptor fluorophore (Cy5-red) is on crRNA. In E, the donor is on TS and acceptor on NTS, on different sides of PQS. When PQS is in NTS, it also must be in crRNA, which raises the possibility of GQ formation in crRNA as shown in D. For all these different conformations, GQ may or may not fold or transition between different states. (F) A schematic of slide surface and laser excitation in TIR mode for smFRET measurements. A schematic of the steps of the FRET assay is shown in Supporting Information Figure S1.

SmFRET Assay and Setup

The protocols for purification and biotinylation of Streptococcus pyogenes Cas9 (SpCas9), labeling of nucleic acids with Cy3/Cy5, sample preparation for smFRET assay, and data acquisition and analysis have been detailed in earlier studies by Globyte et al.[43,44] Briefly, quartz slides and glass coverslips were cleaned and coated with a larger (5000 Da, 97.5% PEG + 2.5% biotin-PEG) and smaller (333 Da) polyethylene glycol (PEG) molecules following a published double PEGylation protocol. After forming the microfluidic channels, the surfaces were also treated with 5% Tween-20 to reduce nonspecific binding. To activate the CRISPR–Cas9 complex, biotin-SpCas9 (1 nM), crRNA (2 nM), and tracrRNA (12 nM) were mixed in Buffer A (100 mM KCl, 50 mM Tris-HCl (pH 7.5), 10 mM MgCl2, and 1 mM DTT) and incubated at 37 °C for 20 min. During this incubation, 0.1 mg mL–1 streptavidin was added to the microfluidic channel. After 2 min of incubation, the excess streptavidin was washed away. At the end of the 20 min activation, the CRISPR–Cas9 complex was diluted two times in Buffer A and introduced to the microfluidic channel. After 2 min of incubation, the channel was washed with imaging buffer (50 mM Tris-HCl (pH 7.5), 150 mM KCl, 0.8% w/v glucose, 2 mM MgCl2, 1 mM Trolox, 1 mg mL–1 glucose oxidase [Sigma], 170 μg/mL catalase [Merck]). In the case of constructs with the first labeling scheme where Cy5 is attached to crRNA, Cy5 molecules were excited with a red laser, and the surface density of bound CRISPR–Cas9 complexes was confirmed. This initial red-excitation step was not performed for constructs that did not have a fluorophore on crRNA. Then, target dsDNA (8 nM) was introduced to the channel and image acquisition started. A schematic summarizing these steps of the smFRET assay is shown in Supporting Information Figure S1. 
For measurements performed in LiCl, KCl was replaced with equimolar LiCl in Buffer A, the imaging buffer, and all buffers used to dilute biotin-SpCas9 to ensure that oligos containing PQS are not exposed to any significant concentration of KCl as this facilitates GQ formation. A custom-built prism-type TIRF instrument was used to collect smFRET data. Short (15 frames) and long (500–2000 frames) movies were collected at 100–300 ms integration time. An Olympus IX-71 microscope equipped with a 60× (NA 1.20) water-immersion objective (Olympus) and an EMCCD camera (Andor Ixon Ultra) formed the main components of the instrument. The donor fluorophores were excited with a 532 nm diode laser. A 635 nm laser was used to directly excite the acceptor molecules. All histograms were created by trace-by-trace analysis where data on each molecule were inspected and background subtracted. The numbers of molecules are given in figure captions and were on average a few hundred per histogram, except the [3L1L-GQ in NTS] construct (Figure 2), where complications in complex formation resulted in a significantly smaller number of molecules. This might be due to the inability to prevent formation of a very stable GQ in crRNA before the activation step.
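The trace-by-trace histogram construction described above can be sketched as follows. The text does not state which efficiency formula or correction factors were applied, so this minimal Python example assumes the common apparent (proximity-ratio) definition, E = I_A / (I_D + I_A), computed from background-subtracted intensities without leakage or detection-efficiency corrections. All function and parameter names are hypothetical.

```python
def apparent_fret(donor, acceptor, donor_bg=0.0, acceptor_bg=0.0):
    """Apparent FRET efficiency from background-subtracted intensities.

    Assumes the uncorrected proximity ratio; returns None when the
    total signal is not positive (e.g., after photobleaching).
    """
    d = donor - donor_bg
    a = acceptor - acceptor_bg
    total = d + a
    if total <= 0:
        return None
    return a / total

def fret_histogram(values, n_bins=20, lo=0.0, hi=1.0):
    """Count efficiencies into equal-width bins over [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if v is None or v < lo or v > hi:
            continue  # skip undefined or out-of-range points
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

if __name__ == "__main__":
    # Made-up intensity pairs (donor, acceptor) with a shared background of 100.
    frames = [(400, 800), (450, 750), (900, 300), (100, 100)]
    effs = [apparent_fret(d, a, 100, 100) for d, a in frames]
    print(effs)
    print(fret_histogram(effs, n_bins=10))
```

The intensity values in the usage example are invented for demonstration and do not correspond to any measurement in this study.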

Figure 2

(A) A schematic of the complex. Donor Cy3 (green ball) is on TS, and acceptor Cy5 is on crRNA. (B) Schematics of TS, NTS, and crRNA sequences and positions of Cy3 and Cy5 (green and red balls). The PAM sequence is indicated with blue fonts, PQS with orange fonts, and complementary C-rich sequence with cyan fonts. The labeling positions for Figures 3 and 4 are indicated with green (Cy3) and red (Cy5) rectangles. (C) SmFRET histograms for the reference sample that does not contain a PQS (top panel), TBA-GQ (second panel), hGQ (third panel), and 3L1L-GQ (bottom panel). All experiments were performed in KCl. [PQS in TS] data are shown with gray filled columns, while [PQS in NTS] data are shown with blue empty columns. The contrast between [PQS in TS] and [PQS in NTS] cases is particularly prominent for the 3L1L-GQ construct (bottom). The numbers of molecules in each histogram are N = 602 for [TBA-GQ in TS] and N = 603 for [TBA-GQ in NTS]; N = 1237 for [hGQ in TS] and N = 539 for [hGQ in NTS]; N = 582 for [3L1L-GQ in TS] and N = 29 for [3L1L-GQ in NTS]; and N = 461 for the Reference-No PQS sample. (D) Example smFRET traces demonstrating dynamics in the [TBA-GQ in NTS] construct.

Figure 3

3L1L-GQ is targeted by biotin-Cas9 while the GQ stability is modulated by maintaining KCl or LiCl in the environment. (A) Schematic showing the labeling scheme where the donor is on TS and the acceptor on NTS, on opposite sides of PQS, making FRET sensitive to GQ conformational states. Linearized DNA and crRNA constructs showing fluorophore positions are shown in Figure 2B (red and green rectangles). (B) For [PQS in TS], the distributions for KCl and LiCl are similar. (C) For [PQS in NTS], a high FRET peak (EFRET = 0.9) that appears in KCl is absent in LiCl data, while the population of the low FRET peak (EFRET = 0.4) is significantly higher. These data are consistent with the formation of a more stable GQ in KCl, resulting in the higher FRET peak, while unfolded GQ gives rise to the lower FRET peak. The numbers of molecules in the histograms are as follows: N = 179 for [PQS in TS] and N = 189 for [PQS in NTS] for KCl data, while N = 301 for [PQS in TS] and N = 36 for [PQS in NTS] for LiCl data.

Figure 4

(A) SmFRET histograms for TBA-GQ (top), hGQ (middle), and 3L1L-GQ (bottom) when PQS is in TS (gray filled bins) or NTS (blue empty bins). All experiments were performed in KCl. Cy3 was on TS and Cy5 on NTS, and the fluorophore positions are shown by red and green rectangles in Figure 2B. The numbers of molecules in histograms were as follows: N = 201 for [TBA-GQ in TS] and N = 263 for [TBA-GQ in NTS]; N = 511 for [hGQ in TS] and N = 453 for [hGQ in NTS]; N = 179 for [3L1L-GQ in TS] and N = 189 for [3L1L-GQ in NTS]. (B) Example smFRET traces demonstrating dynamics in [TBA-GQ in TS] (top) and [3L1L-GQ in TS] (bottom) constructs.

Results and Discussion

Figure 2 shows smFRET data on GQ constructs that have a TBA-GQ, hGQ, or 3L1L-GQ in either TS or NTS, in addition to a reference construct that does not contain a PQS in either strand. The donor was on TS and the acceptor on crRNA as shown in Figure 2A, where GQ formation is not shown for simplicity. The separation between donor and acceptor was the same for all constructs except for the [hGQ in TS] case, which was one bp different from the others. The corresponding separation in the reference sample was also similar to that in PQS constructs, so FRET histograms can be directly compared. Figure 2B shows a schematic of labeling positions. The data on [TBA-GQ in NTS] are different from those for [TBA-GQ in TS] or the reference sample (Figure 2C). If a GQ does not form in NTS, the FRET distributions for [PQS in TS], [PQS in NTS], and the reference sample should be very similar since fluorophore positions are the same. Therefore, the observed difference can be attributed to GQ formation. This suggests that even TBA-GQ, the weakest of the GQs, can fold and modify the complex structure if it is in NTS. Since NTS is freed from Watson–Crick pairing by R-loop formation between TS and crRNA, GQ may more readily form in NTS. The data for [TBA-GQ in NTS] show two prominent peaks, which might be due to different folding states of GQ. The higher FRET peak would be consistent with the folded GQ state since this peak is not observed in the reference construct. The data on [hGQ in NTS] and [3L1L-GQ in NTS] support this interpretation as they show a systematic transition to higher FRET levels as the GQ gets more stable (blue histograms in Figure 2C). While the FRET distribution for [hGQ in NTS] is broad, suggesting structural heterogeneity, the higher FRET states are clearly more populated compared to [TBA-GQ in NTS]. The data on [3L1L-GQ in NTS] show a single high-FRET peak, in agreement with folding of this very stable GQ in NTS.
In all the studied cases, the complexes demonstrate significant dynamics, as exemplified in single molecule time traces in Figure 2D. The [TBA-GQ in TS] data show a similar FRET distribution to that of the reference sample (top two panels in Figure 2C). This suggests that R-loop formation between TS and crRNA prevents GQ formation in TS for this weak GQ. The [hGQ in TS] and [3L1L-GQ in TS] data (Figure 2C, third and fourth panels) are significantly different from those on reference and TBA-GQ constructs. The distributions for both hGQ and 3L1L-GQ are much broader than all other distributions. This suggests that the stability of the CRISPR–Cas9 complex is significantly lower when a moderate to high stability GQ is placed in TS, where R-loop formation must compete with folding of the GQ. Further supporting this observation, the lower FRET states become more populated as the stability of GQ is increased for [PQS in TS] cases, an opposite trend to [PQS in NTS] cases. Considering the fluorophores are on TS and crRNA, the transition to lower FRET states might be due to displacement of crRNA from the complex or an overall distortion in the CRISPR–Cas9 complex because of GQ formation in TS. These data demonstrate that there are limitations on the type and stability of secondary structures that can be maintained in an unfolded state by CRISPR–Cas9 and the R-loop. It is known that different monovalent cations stabilize the GQ at different levels: while K+ is very effective, Li+ is a weak stabilizer.[21] This property allows modulating the stability of GQ without changing the overall ionic strength of the environment. For the reference sample which does not contain a PQS, LiCl and KCl give rise to very similar smFRET distributions (Supporting Information Figure S3). However, the distributions for the high stability 3L1L-GQ construct are significantly different in KCl vs LiCl (Supporting Information Figure S4).
For the [3L1L-GQ in TS] case, the smFRET distribution is very broad in the presence of KCl (Figure 2C), which was interpreted as the GQ preventing stable R-loop formation. In the presence of LiCl, the distribution is more concentrated at higher FRET levels (Supporting Information Figure S4), suggesting that R-loop formation is inhibited to a lesser extent by a lower stability GQ. For the [3L1L-GQ in NTS] case, the distribution was dominated by a single high-FRET peak in the presence of KCl (Figure 2C), which was interpreted as the conformation space being dominated by GQ formation in NTS. In the presence of LiCl, the distribution is not dominated by a single peak, but multiple conformations are present (Supporting Information Figure S4), as would be expected from a lower stability GQ. To illustrate, the distribution for [3L1L-GQ in NTS] in the presence of LiCl resembles the distribution for [TBA-GQ in NTS] in the presence of KCl. Previous studies have demonstrated that the HNH domain, an endonuclease domain that cleaves TS (Supporting Information Figure S5), samples multiple conformations before docking into the cleavage-active state and that divalent cations, such as Mg2+, are required for Cas9 to remain in this conformation.[45] To test the potential impact of such conformational changes, we performed studies in the absence and presence (2 mM) of MgCl2 (Supporting Information Figure S5), which demonstrated very similar FRET distributions, suggesting HNH domain conformations are not the dominant factor for the broad distributions. The hGQ and 3L1L-GQ constructs contain repeating sequences of GGGTTA and GGGT, respectively. This symmetry raises the question of whether hybridization of crRNA with 1–3 of such repeats, instead of full hybridization with four repeats, could be the cause of some of the broadening in the FRET histograms. Several arguments can be made against this possibility.
First, some of the sharpest histograms were observed for [hGQ in NTS] and [3L1L-GQ in NTS]. Partial hybridization between TS and crRNA should have resulted in broad histograms for these constructs as well. Second, the symmetric PQS in the 3L1L-GQ construct is 15-nt long, and the last 5 nt of the target sequence in the PAM-distal region break the symmetry between the ends. If the first GGGT repeat from the PAM-proximal region is skipped, the crRNA will not be able to hybridize with these 5 nt at the PAM-distal region either, resulting in an 11-bp-long R-loop. Skipping two GGGT repeats results in a 7-bp R-loop. Both 7-bp- and 11-bp-long R-loops would be significantly less stable than the 20-bp R-loop of the full hybridization case. Following the same arguments for hGQ constructs, skipping one or two GGGTTA repeats will result in 14-bp- and 8-bp-long R-loops. Given these, we would have expected the competition between partial and full hybridization to be more dominant in hGQ than in 3L1L-GQ; however, the histograms of the latter are broader than those of the former, arguing against this scenario. To monitor the GQ folding state and dynamics, we moved the fluorophores to TS and NTS on different sides of PQS, as shown in Figure 3A. We initially tested placing both the donor and acceptor outside of PQS; however, this resulted in distributions peaked at very low FRET levels (Supporting Information Figure S6), which is not ideal for detecting different structural features. Therefore, the acceptor was moved within the last loop of each PQS (Figure 2B and Table S1), which should maintain structural symmetry between different PQSs. However, as hGQ (21 nt) is longer than TBA-GQ and 3L1L-GQ (both 15 nt), the separation between donor–acceptor fluorophores was greater for hGQ compared to the others (22 bp vs 16 bp).
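The partial-hybridization arithmetic above can be reproduced with a short sketch. The span of repetitive sequence available for register-shifted pairing is taken as 15 nt for 3L1L-GQ, as stated in the text; for hGQ it is assumed to be 20 nt, a value inferred from the quoted 14-bp and 8-bp R-loops rather than stated explicitly. The function name and parameters are hypothetical.

```python
def partial_rloop_bp(repetitive_span_nt, repeat_nt, skipped_repeats):
    """R-loop length left when crRNA pairs in a register shifted by whole repeats."""
    remaining = repetitive_span_nt - repeat_nt * skipped_repeats
    return max(remaining, 0)

# (repetitive span within the target, repeat unit length); hGQ span is an assumption.
CONSTRUCTS = {
    "3L1L-GQ": (15, 4),  # GGGT repeats
    "hGQ": (20, 6),      # GGGTTA repeats
}

if __name__ == "__main__":
    for name, (span, unit) in CONSTRUCTS.items():
        lengths = [partial_rloop_bp(span, unit, k) for k in (1, 2)]
        print(name, "R-loop after skipping 1 or 2 repeats:", lengths, "bp")
```

Under these assumptions the shifted registers leave 11 and 7 bp for 3L1L-GQ and 14 and 8 bp for hGQ, compared with 20 bp for full hybridization.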
To establish the FRET levels in these dual-labeled dsDNA constructs before they are targeted by CRISPR–Cas9, we created biotinylated versions of these dual-labeled DNA constructs and immobilized them on the surface. These measurements showed low FRET peaks in both KCl and LiCl consistent with fully formed duplex DNA, i.e., unfolded GQ, before it is targeted by CRISPR–Cas9 (Supporting Information Figure S7). The peak positions for each construct were very similar in KCl and LiCl, further supporting the absence of GQ before the construct is targeted by CRISPR–Cas9. Figure 3 demonstrates comparative studies in KCl and LiCl for this fluorophore arrangement after the dsDNA is targeted by CRISPR–Cas9. Similar to the measurements in Figure 2, the CRISPR complex with biotinylated Cas9 was immobilized on the surface, and dual-labeled target dsDNA (not biotinylated) was introduced in the chamber.
While the distributions for KCl and LiCl are similar for [3L1L-GQ in TS] (Figure 3B), there are significant differences for [3L1L-GQ in NTS] (Figure 3C). In Figure 3C, a major high FRET peak (EFRET ≈ 0.9) is present in KCl but not LiCl, while the population of a low FRET peak (EFRET ≈ 0.4) is significantly higher in LiCl. More prominent GQ formation in KCl compared to LiCl would suggest that the high FRET peak is due to GQ formation, while the low FRET peak is due to unfolding of the GQ. The similarity of distributions in KCl and LiCl for [3L1L-GQ in TS] suggests that the variations in GQ stability are not adequate to result in significantly different structures for this case. Interestingly, the distribution for [3L1L-GQ in TS] is significantly narrower when both fluorophores are on the dsDNA (Figure 3B) compared to the case when one is on TS and the other on crRNA (Figure 2C, bottom panel). This would suggest that in the presence of a very stable GQ in TS, the R-loop between crRNA and TS is very dynamic and explores many conformations while the target region in dsDNA is to a certain extent immune to these dynamics and explores fewer conformations. Figure 4 shows comparative data in KCl on all three GQ constructs using the arrangement of fluorophores where donor and acceptor are on TS and NTS, respectively (see Figure 3A for a schematic), and CRISPR–Cas9 is immobilized on the surface. For all studied constructs, the FRET levels were significantly higher than those observed before the dsDNA was targeted by CRISPR–Cas9 (Supporting Information Figure S7). Since their donor–acceptor separations are the same (16 bp), the distributions for TBA-GQ (top panel of Figure 4A) and 3L1L-GQ (bottom panel of Figure 4A) can be directly compared. Despite some variation in their spread, the distributions for [3L1L-GQ in TS] and [TBA-GQ in TS] are surprisingly similar.
This contrasts with the significant difference between the two cases in Figure 2, where the fluorophores are on TS and crRNA and FRET is sensitive to conformational variations between these two strands. The same conclusions are valid for the hGQ construct, i.e., the distribution for [hGQ in TS] is significantly narrower in Figure 4 compared to that in Figure 2. These observations suggest that having a moderate to high stability GQ in TS primarily results in structural instability between crRNA and TS, possibly within the R-loop, while the target dsDNA maintains a more stable structure. For [TBA-GQ in NTS], two clearly distinguishable peaks are observed, which might represent different folding states of GQ. The distribution for [3L1L-GQ in NTS] shows multiple peaks and a broader distribution, suggesting a structurally more heterogeneous system. This is again in contrast to the distribution in Figure 2, where a single high-FRET peak was observed for [3L1L-GQ in NTS]. The same conclusion is also valid for [hGQ in NTS], where the distribution in Figure 4 is significantly broader than that for the same case in Figure 2. These observations suggest that having a moderate to high stability GQ in NTS results in significant structural heterogeneity within the target dsDNA while the complex between TS and crRNA remains relatively stable.
These conclusions are also supported by the broader distributions observed for all [PQS in NTS] cases compared to [PQS in TS] cases in Figure 4, i.e., having a PQS in NTS results in greater structural heterogeneity within the target dsDNA compared to having the same PQS in TS. In all cases we studied, the smFRET traces are dynamic and demonstrate frequent transitions between different FRET levels (Figure 4B).

Conclusions

We demonstrate that the position of PQS and stability of the GQ influence the conformations and structural heterogeneities experienced by the CRISPR–Cas9 complex. In the first set of measurements, the donor and acceptor were placed on TS and crRNA, which made FRET sensitive to conformational changes between TS and crRNA, i.e., the R-loop. The [PQS in TS] case for a weak GQ results in similar conformations to a reference construct that does not contain a PQS, which suggests that this weak GQ can be destabilized by R-loop formation. However, when the same PQS is placed in NTS, the resulting conformations are significantly different from those observed for the reference construct. This suggests that when in NTS, even a weak GQ can fold within the CRISPR–Cas9 complex and give rise to variations in the complex structure. A GQ with higher stability creates a significant disturbance in the complex structure even when in TS, suggesting R-loop formation is not adequate to maintain such structures in the unfolded state. The broader FRET histograms for [hGQ in TS] and [3L1L-GQ in TS] cases shown in Figure 2, suggesting more heterogeneous complex structures, are a clear manifestation of this. We also observed persistent dynamics within the CRISPR–Cas9 complex during these interactions. In the second set of measurements, the donor was on TS and the acceptor on NTS, on opposite sides of PQS, which made FRET more sensitive to conformational changes due to GQ folding dynamics. In this arrangement, [PQS in NTS] cases showed more heterogeneous structures compared to [PQS in TS] cases for all constructs. This suggests that [PQS in NTS] creates a more significant disturbance for the target dsDNA structure compared to [PQS in TS], which may be explained by the latter having to compete with the R-loop while the former is relatively less inhibited from attaining alternative secondary structures.
These observations were made possible by optimizing the positions of donor/acceptor fluorophores on TS, NTS, and crRNA. Having established these structural and dynamic variations introduced by PQS, it will be critical to understand how they impact CRISPR–Cas9 activity in terms of target recognition, R-loop progression and stability, and target dsDNA cleavage. The understanding attained for GQs will likely have implications for other secondary structures that might form within the sequences targeted by CRISPR–Cas9.
