
Increasing the specificity of CRISPR systems with engineered RNA secondary structures.

D Dewran Kocak^1,2, Eric A Josephs^3,4, Vidit Bhandarkar^1,2, Shaunak S Adkar^1,2, Jennifer B Kwon^2,5, Charles A Gersbach^6,7,8.

Abstract

CRISPR (clustered regularly interspaced short palindromic repeat) systems have been broadly adopted for basic science, biotechnology, and gene and cell therapy. In some cases, these bacterial nucleases have demonstrated off-target activity. This creates a potential hazard for therapeutic applications and could confound results in biological research. Therefore, improving the precision of these nucleases is of broad interest. Here we show that engineering a hairpin secondary structure onto the spacer region of single guide RNAs (hp-sgRNAs) can increase specificity by several orders of magnitude when combined with various CRISPR effectors. We first demonstrate that designed hp-sgRNAs can tune the activity of a transactivator based on Cas9 from Streptococcus pyogenes (SpCas9). We then show that hp-sgRNAs increase the specificity of gene editing using five different Cas9 or Cas12a variants. Our results demonstrate that RNA secondary structure is a fundamental parameter that can tune the activity of diverse CRISPR systems.

Substances: RNA, Guide; RNA

Year: 2019 PMID: 30988504 PMCID: PMC6626619 DOI: 10.1038/s41587-019-0095-1

Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 54.908

CRISPR-Cas systems are adaptive immune systems in bacteria and archaea, and have proven to be robust genome editing platforms [1]. Efforts to repurpose CRISPR-Cas systems for genome editing have largely focused on class 2 CRISPR systems because of their simplicity. While class 1 systems use multi-protein complexes to target nucleic acids, class 2 systems use a single Cas protein, termed the Cas effector, which can be easily reconstituted and harnessed for a variety of applications [2]. The arms race between viruses and prokaryotes has driven immense genetic diversity of Cas effectors. Each Cas effector has unique properties (e.g. nucleic acid preference, PAM requirements, size of the Cas effector) that endow it with particular advantages and disadvantages. The identification and characterization of class 2 CRISPR systems is thus an active area of research with the overarching goal of finding Cas effectors with novel or improved properties [3-5]. Since the initial characterization of SpCas9, the number of Cas effectors active in mammalian cells has expanded to include compact Cas9 effectors from the type II CRISPR systems, Cas12a (previously Cpf1) effectors with A/T rich PAMs from type V systems, and RNA-targeting Cas13 variants [6-12]. Although these nucleases are versatile tools for gene editing outside of their native environments, they also have off-target effects, leading to unintended DNA breaks at sites with imperfect complementarity to the spacer sequence [13-15]. Thus, improving the specificity of these nucleases is a critical goal, especially for gene therapy applications [16]. Methods to increase the specificity of class 2 CRISPR systems through rational design have largely focused on SpCas9 and have adopted two general strategies. The first strategy is to create an AND-gate that requires coordinate binding of two Cas9 molecules, imposing a stricter requirement for nuclease activity [17-20]. 
The second strategy is to reduce the energetics of DNA interrogation by the Cas9-sgRNA complex, which results in an overall increase in specificity [21-25]. The second strategy is particularly attractive because, unlike the first strategy, it does not increase the number of components of the gene editing system. This simplifies gene delivery, which is often a critical barrier. While previous efforts from either strategy were successful, they suffer from one or more of a variety of limitations, including incompatibility with viral packaging constraints, increasing the number of components of the system, and requiring extensive protein engineering. Recent studies that employ directed evolution rather than rational design have yielded many new variants with improved properties [26-28]. However, it remains to be seen which of these many approaches will have general applicability across CRISPR systems. Thus there is a need for a simple method for increasing specificity of diverse CRISPR systems. Employing rational design and adopting the second strategy, we hypothesized that engineering the sgRNA might serve as a means to regulate diverse CRISPR systems. Specifically, we engineered RNA secondary structure onto the spacer by extending a designed hairpin on the 5’ end of the sgRNA (hp-sgRNA). The resulting hairpin structure could then serve as a steric and energetic barrier to R-loop formation. We hypothesized that by adjusting the strength of the secondary structure, R-loop formation could proceed to completion at the on-target site, but could be impeded at off-target sites, which have reduced energetics due to RNA-DNA mispairing. Because R-loop formation is the critical process governing the conformational change of SpCas9 to an active nuclease [29, 30], this would block off-target nuclease activity and result in an increase in specificity. 
Since CRISPR endonucleases accommodate a nucleic acid duplex within their binding channel, we hypothesized that the RNA-RNA duplexes of hp-sgRNAs could also be accommodated without interfering with formation of the sgRNA-protein complex. Moreover, hp-sgRNAs are simple to design and produce: RNA-hairpins generally follow Watson-Crick base-pairing guidelines and sgRNA production methods are rapid and inexpensive.

Results

Design considerations for hp-sgRNAs

RNA can fold into many different complex structures. For our initial engineered structures we adopted the RNA hairpin, a fundamental structural unit in many RNA molecules [31]. RNA hairpins are composed of two components, stems and loops, which we create by extending the PAM-distal end of the spacer to generate hp-sgRNAs (Figure 1a). All designs were informed through the use of in silico structure determination and only spacer sequences were used for these predictions (i.e. structural sequences in the tracrRNA or crRNAs were excluded).

Figure 1.

Engineered RNA Secondary Structures Tune the Activity of dCas9-P300

(a) Structure of the wild-type sgRNA for SpCas9 and design parameters of hp-sgRNAs. (b) Gene activation of IL1RN using hp-sgRNAs with varying stem lengths, measured by qRT-PCR. Hairpin sgRNAs are abbreviated as ‘hp’, non-structured controls are abbreviated as ‘ns’, and numbers indicate the number of nucleotides added 5’ of the spacer. Data are shown as fold increase relative to the control sample, which was transfected with dCas9-P300 only. Error bars represent s.e.m. for n=3. All hp-sgRNA variants show significant activation over control, P < 0.005 using a two-sided t test after a global one-way ANOVA. (c) Replotting the mean of each group in (b) as a function of the predicted folding energy of each hp-sgRNA’s engineered secondary structure. Trends in the data are annotated for clarity (e.g. “Region 1”). The sequences of all sgRNAs used are listed in Table S1.

We expected thermodynamic stability of the secondary structure to be an influential characteristic of hp-sgRNAs. However, there are many variables one can use to create different structures with similar stability (Figure 1a). The stem can be placed along any area of the 20-nucleotide spacer, which may have variable effects on R-loop formation kinetics. Stem lengths, the major determinant of hairpin stability, can also be varied. In order to modulate stability but not necessarily overall hp-sgRNA structure, non-canonical rG-rU base-pairs can be substituted for potential rG-rC/rA-rU sites in the stems. Many RNA hairpins found in nature utilize 5’-ANYA-3’ or 5’-UNCG-3’ tetraloops, which have favorable base-stacking behavior [32]. We utilize these tetraloops for our initial structures, but one can also use part of the spacer itself for the hairpin loop. In this study all of these variables were used to generate hp-sgRNAs. Furthermore, to control for any effects of sgRNA length, we also designed non-structured-sgRNAs (ns-sgRNAs), which have extensions to the spacer but whose extensions are not predicted to form any secondary structures.
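The simplest layout described above (a 5’ extension that pairs with the PAM-distal end of the spacer, closed by an external tetraloop) can be sketched in a few lines of Python. This is a minimal illustration and not the authors’ design pipeline: the function name `design_hp_spacer`, the example spacer, and the UUCG loop (one member of the 5’-UNCG-3’ family) are assumptions made for this example. The sketch does not predict folding energies and does not handle rG-rU substitutions or internal loops.

```python
# Hypothetical helper: builds an hp-sgRNA spacer by prepending a stem and loop.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def design_hp_spacer(spacer: str, stem_len: int, loop: str = "UUCG") -> str:
    """Prepend a 5' extension that can pair with the first `stem_len`
    nucleotides (the PAM-distal end) of an SpCas9-style spacer.

    Layout, 5' to 3': [stem strand] [loop] [original spacer].
    """
    spacer = spacer.upper().replace("T", "U")
    if not 0 < stem_len <= len(spacer):
        raise ValueError("stem_len must be between 1 and the spacer length")
    # The nucleotide just before the loop pairs with the first spacer
    # nucleotide, so the stem strand is the reverse complement of that segment.
    stem_strand = reverse_complement(spacer[:stem_len])
    return stem_strand + loop + spacer


# Illustrative 20-nt spacer (an assumption for this example only).
example_spacer = "GAGUCCGAGCAGAAGAAGAA"
hp_spacer = design_hp_spacer(example_spacer, stem_len=5)
print(hp_spacer)
```

A length-matched non-structured control would replace the stem strand with a sequence that is not predicted to pair. Confirming either design still requires a structure-prediction tool, which is outside the scope of this sketch.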

hp-sgRNAs regulate a SpCas9-based transcriptional activator

We first tested the effect of predicted hp-sgRNA structures on Cas9 binding to DNA. Critically, we wanted to analyze this interaction in human cells, where reports have shown that extensions to the 5’ end of the sgRNA can be processed back to lengths of the native spacer [19, 33]. We thus decided to utilize nuclease-inactive dCas9-based transcriptional activators [34, 35], where endogenous gene activation can serve as a sensitive measure of dCas9 binding to target DNA. For our initial hp-sgRNA designs, we used a tetraloop that is external to the 20-nucleotide spacer and placed the hairpin stems on the PAM-distal end of the spacer using canonical Watson-Crick base pairing. We used a spacer that targets the endogenous promoter of IL1RN, a gene we have previously activated with high efficiency [34, 35]. Transfecting sgRNA variants and a dCas9-P300 transactivator into human cells, we observed that hp-sgRNAs can tune gene activation at the target locus (Fig. 1b), suggesting modulation of dCas9 binding. We observed a generally regular relationship between length of the hp-sgRNA spacer extension and impact on dCas9 binding (Fig. 1b). The only irregularity was observed with hp15, which has an unpaired 5’ guanine, necessitated by the U6 promoter. Replotting the activity of each hp-sgRNA variant as a function of the thermodynamic stability of its predicted structure, we observed a monotonic decrease of gene activation over four orders of magnitude (Fig. 1c). These data provide evidence that the predicted RNA structures form in human cells and demonstrate that the in silico predicted free energy of the structures is an accurate predictor of their regulatory effect on dCas9 binding to genomic DNA target sites. Notably, ns-sgRNAs did not decrease transactivation to the same degree as hp-sgRNAs, indicating that hairpin formation, and not simply sgRNA extension, was responsible for modulating dCas9 binding. 
However, on average, ns-sgRNAs caused a ~2.8-fold reduction in gene activation when compared to the unmodified guide (WT sgRNA). This is consistent with other evidence of spacer length having significant effects on the efficiency of dCas9-based transcriptional regulators [36], underscoring the need to control for guide length when measuring the effects of sgRNA secondary structure. In fact, length effects may be the underlying cause for the observation that sgRNAs with guanine-dinucleotide extensions have increased specificity [37]. These data describe nonlinear effects of 5’ sgRNA extensions on SpCas9 binding to DNA, dependent on both the length and secondary structure of the spacer. This relationship is characterized by three key regions in the data (Fig. 1c). First, extensions to the 20 nt spacer cause a decrease in overall binding energy that is independent of secondary structure (Fig. 1c, ‘Region 1’). Second, extensions that form weaker predicted secondary structures do not seem to have measurable effects on SpCas9 binding beyond those caused by length effects (Fig. 1c, ‘Region 2’), however it is possible that R-loop formation is still being inhibited in this region [38, 39]. Finally, more stable hairpins cause measurable decreases in Cas9 binding as a function of the strength of the hp-sgRNA’s secondary structure (Fig. 1c, ‘Region 3’). Further, these decreases in activity occur as the hairpin extends into the seed region of the sgRNA that is critical for initiating the interaction between Cas9 and a target. The trend of hairpin structure modulating targeted gene activation was corroborated at two additional gene targets in human cells (Supplementary Fig. 1). Although we ascribe the changes in gene activation to modulation of R-loop formation by hp-sgRNAs, previous studies showed by Northern blot that 5’ extensions to sgRNAs were efficiently processed to 20 nt spacers [19, 33]. 
To control for both processing of the hairpins and expression of sgRNA variants, we repeated this experiment, harvested total RNA, and performed sample-matched measurements of IL1RN and sgRNA expression by RT-qPCR, and 5’ sgRNA processing by 5’RACE followed by RNA-seq (Supplementary Fig. 2a,b). Patterns in IL1RN gene activation were faithfully replicated (Supplementary Fig. 2c,d). We observed no correlation between hp-sgRNA expression and hp-sgRNA activity (Supplementary Fig. 2e,f). In contrast to the previous reports, we observed that hp-sgRNAs are moderately to minimally processed, with stronger predicted secondary structures undergoing less processing (Supplementary Fig. 2g, range 0.8–48% processed). The corresponding non-structured sgRNAs had a much higher rate of processing (Supplementary Fig. 2h, range 52–79% processed). We observed no clear association between the level of hp-sgRNA processing and IL1RN transactivation (Supplementary Figure 2i,j). These data suggest that hp-sgRNAs are maintained in cells and can be accommodated within the Cas9 binding pocket where they are protected from processing.

Kinetic modeling of R-loop formation

The differences in behavior between hp-sgRNAs and ns-sgRNAs indicate that the secondary structure of the spacer is a critical determinant of CRISPR activity. To gain a better understanding of how spacer secondary structure might affect SpCas9 behavior, we applied a kinetic model of R-loop formation and generalized it to accommodate any species of mismatches, an arbitrary number of mismatches, and RNA secondary structure (Fig. 2a) [29]. Strand invasion is represented as a series of 20 discrete states and the probability of exchange between states is governed by three energetic processes: 1) hybridization or melting of the genomic target (DNA-DNA), 2) the hybridization or melting of the spacer to the genomic target (RNA-DNA), and 3) the breaking or forming of spacer secondary structure (RNA-RNA). This approach defines the kinetics of R-loop formation entirely in terms of empirically measured thermodynamic values of nucleic acid pairs (See Methods).
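To make the structure of such a model concrete, the following is a minimal sketch of a birth-death chain over R-loop states. It is not the model from the Methods: the rate rule (a Metropolis-style rule that satisfies detailed balance), the attempt rate `k0`, the dissociation rate `k_off`, the temperature, and all energies in the example are placeholder assumptions. The sketch assumes invasion starts at the PAM-proximal end and that the complex can dissociate only from the state with no R-loop.

```python
import math

KT = 0.616  # kcal/mol at roughly 37 °C (assumed temperature)


def step_energies(dg_rna_dna, dg_dna_dna, dg_rna_rna, rho=1.0):
    """Free-energy change of extending the R-loop by one base pair.

    Each input lists the pairing energy (negative = stabilizing) at one
    spacer position, ordered from the PAM-proximal end. Advancing one step
    forms an RNA-DNA pair and melts a DNA-DNA pair and, where present, an
    RNA-RNA stem pair. rho scales the secondary-structure term (0 = ignored).
    """
    return [rd - dd - rho * rr
            for rd, dd, rr in zip(dg_rna_dna, dg_dna_dna, dg_rna_rna)]


def mean_lifetime(step_dg, k0=1.0, k_off=1.0, kt=KT):
    """Mean time to dissociation when starting from state 0 (no R-loop).

    States 0..N count the R-loop base pairs formed. The expected absorption
    times T_i satisfy out_i*T_i - f_i*T_(i+1) - r_i*T_(i-1) = 1, a
    tridiagonal system solved here with the Thomas algorithm.
    """
    n = len(step_dg) + 1
    # Downhill steps proceed at k0; uphill steps are slowed by exp(-dG/kT).
    fwd = [k0 * math.exp(-max(dg, 0.0) / kt) for dg in step_dg] + [0.0]
    rev = [0.0] + [k0 * math.exp(min(dg, 0.0) / kt) for dg in step_dg]
    diag = [fwd[i] + rev[i] for i in range(n)]
    diag[0] += k_off  # dissociation is only possible from state 0

    # Forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -fwd[0] / diag[0]
    d_prime[0] = 1.0 / diag[0]
    for i in range(1, n):
        denom = diag[i] + rev[i] * c_prime[i - 1]
        c_prime[i] = -fwd[i] / denom
        d_prime[i] = (1.0 + rev[i] * d_prime[i - 1]) / denom

    # Back substitution.
    times = [0.0] * n
    times[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        times[i] = d_prime[i] - c_prime[i] * times[i + 1]
    return times[0]


# Placeholder energies (kcal/mol) for a 20-nt spacer; not measured values.
rna_dna = [-2.0] * 20
dna_dna = [-1.5] * 20
no_hairpin = [0.0] * 20
hairpin = [0.0] * 12 + [-2.0] * 8  # stem over the 8 PAM-distal positions

wt_lifetime = mean_lifetime(step_energies(rna_dna, dna_dna, no_hairpin))
hp_lifetime = mean_lifetime(step_energies(rna_dna, dna_dna, hairpin))
print(wt_lifetime, hp_lifetime)
```

With these placeholder numbers the hairpin makes the last eight invasion steps energetically uphill, so the simulated lifetime is shorter than for the unstructured spacer. Setting `rho=0` in `step_energies` removes the secondary-structure term.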

Figure 2.

Spacer Secondary Structure Improves the Performance of a Kinetic Model of R-loop formation

(a) Schema of kinetic model of R-loop formation. Left panel: modeled molecular interactions. The target DNA is shown in green and sgRNA spacer is shown in red with both a mismatch and RNA secondary structure. Center panel: distinct states representing degree of R-loop formation by the spacer. The forward and reverse rates between states are calculated using the free energy differences between states (see methods). Right panel: Q matrix of forward and reverse reaction rates. The starting state of the simulation is represented by vector α0. (b) Correlation between model-based predictions of binding lifetime and the ChIP-seq intensity [40, 41]. Model was initiated with a pre-formed R-loop. For each gRNA, log(L) was correlated (Pearson) with log(ChIP-seq intensity), and these correlations combined using Fisher’s method, n=12,181. (c) Correlation coefficients with (ρ=1) and without (ρ=0) energetic contributions from spacer secondary structure, for various starting states. Plots show the calculated Pearson correlation coefficient, and error bars are 95% confidence intervals. (d) Simulated values of the mean binding lifetimes for sgRNA variants, shown in Figure 1b, plotted against their activation of the IL1RN gene, n=12.

To test the model, we used previously reported ChIP-seq data of 16 sgRNAs and 12,181 called binding sites [40, 41]. We simulated the mean residence time of each of the 16 sgRNAs to each of the reported binding sites, compared this simulation to the measured ChIP-Seq signal, and combined correlations across sgRNAs using Fisher’s method. We find correlation coefficients of 0.285 (95% confidence: 0.252, 0.317) when the simulation is initiated at the PAM-proximal site and a correlation of 0.380 (95% confidence: 0.349, 0.410) if initiated with a pre-formed R-loop (Fig. 2b). These correlations were higher than the previously reported best performing feature, chromatin accessibility [40]. The predictive power of our model demonstrates that the dynamics of R-loop formation play an important role in Cas9 binding to DNA. To determine the contribution of spacer secondary structure to the model’s predictive power, we removed the energetic terms for RNA folding from the reaction rates. We observed a decrease in correlation from 0.285 to 0.194 (95% confidence: 0.160, 0.228) if the simulation is initiated at the PAM-proximal nucleotides or from 0.380 to 0.273 (95% confidence: 0.240, 0.305) if the simulation is initiated with the R-loop already pre-formed (Fig. 2c). Finally, we performed simulations to predict the behavior of the hp-sgRNA variants used to modulate the expression of the IL1RN promoter in Figure 1 (Figure 2d). We found a strong correlation, 0.915, between estimated binding lifetime and fold increase in gene expression. Collectively, these findings suggest that spacer secondary structure influences Cas9 binding activity by modulating invasion kinetics and stability of the R-loop, key determinants of nucleolytic activation of SpCas9 [30].
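The text names Fisher’s method but does not spell out how the per-sgRNA correlations were pooled. One common way to pool Pearson coefficients is the Fisher z-transformation, sketched below under that assumption. The function name `pool_correlations`, the weighting by n − 3, and the example values are illustrative, and the sketch is not claimed to reproduce the intervals reported above.

```python
import math


def pool_correlations(results, z_crit=1.96):
    """Pool Pearson correlations via the Fisher z-transformation.

    `results` is a list of (r, n) pairs: a correlation and the number of
    points it was computed from (n must exceed 3). Returns the pooled
    correlation and an approximate 95% confidence interval.
    """
    weights = [n - 3 for _, n in results]
    if any(w <= 0 for w in weights):
        raise ValueError("each correlation needs more than 3 data points")
    total = sum(weights)
    # Average in z-space, weighting each guide by its inverse variance.
    z_mean = sum(w * math.atanh(r) for (r, _), w in zip(results, weights)) / total
    half_width = z_crit / math.sqrt(total)
    return (math.tanh(z_mean),
            math.tanh(z_mean - half_width),
            math.tanh(z_mean + half_width))


# Hypothetical per-guide correlations and site counts, for illustration only.
pooled, lower, upper = pool_correlations([(0.25, 800), (0.35, 1200), (0.30, 500)])
print(f"pooled r = {pooled:.3f} (95% CI {lower:.3f}, {upper:.3f})")
```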

hp-sgRNAs increase the gene editing specificity of SpCas9

We next assessed the effect of spacer secondary structure on SpCas9 nuclease activity. We hypothesized that hairpin structures could increase nuclease specificity by modulating R-loop formation without necessarily altering binding to target sites [29, 30]. Thus, for hp-sgRNAs designed for the SpCas9 nuclease, we generally chose hairpins with predicted free energies weaker than −15 kcal/mol, i.e. within Region 2 of Figure 1c, since any further increase in hairpin stability resulted in significant decreases in SpCas9 binding to its on-target site. To assess the effects of engineered hp-sgRNAs on the nuclease activity and specificity of Cas9 in human cells, we chose spacers that have large numbers of well-characterized off-target sites [42]. We generated a variety of hp-sgRNAs for these spacers in which we varied several structural characteristics, including the use of external or internal loops and PAM-distal or PAM-proximal stem placement. We measured indel frequency at on-target and off-target sites for each spacer and compared the activity of these hp-sgRNAs with activities of both unextended sgRNAs (WT-sgRNA) and truncated-sgRNAs (tru-sgRNA) [23]. We observed a number of hp-sgRNA designs with on-target activities comparable to WT-sgRNAs and reduced off-target activity, comparable to tru-sgRNAs (Fig. 3a–c, Supplementary Figs. 3–6). We defined a specificity metric by dividing on-target mutation rates by the sum of all off-target mutation rates. All designed hp-sgRNAs significantly increased the specificity of SpCas9, on par with increases observed with tru-sgRNAs (Fig. 3d, Supplementary Fig. 6e). Hp-sgRNA 7 of the EMX1.1 spacer, which had the highest fold-increase in specificity, had both a spacer truncation and designed secondary structure, suggesting that these approaches may be combined in some cases (Supplementary Fig. 6e). We observed that tru-sgRNAs increased off-target activity at 8 of the 37 off-target loci (Fig. 3a–c). 
This increase may be due to the decreased sequence complexity of tru-sgRNAs and was not observed for any hp-sgRNA variants, consistent with hp-sgRNAs behaving in an entirely inhibitory manner (Fig. 3a–c, Supplementary Fig. 6a–c). Collectively these results show that hp-sgRNAs can increase the specificity of SpCas9 nuclease by multiple orders of magnitude.
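The specificity metric defined above is a simple ratio, and the following sketch shows one way to compute it and the fold change between two guides. The function names and the indel percentages are hypothetical. Returning infinity when no off-target editing is detected is an assumption of this sketch, since the text does not say how that case was handled.

```python
import math


def specificity(on_target_rate, off_target_rates):
    """On-target indel rate divided by the summed off-target indel rates."""
    off_total = sum(off_target_rates)
    if off_total == 0:
        return math.inf  # assumption: no detectable off-target editing
    return on_target_rate / off_total


def fold_increase(variant_specificity, reference_specificity):
    """Fold change in specificity of a variant relative to a reference guide."""
    return variant_specificity / reference_specificity


# Hypothetical indel percentages, for illustration only.
wt = specificity(40.0, [10.0, 6.0, 4.0])
hp = specificity(36.0, [0.5, 0.25, 0.25])
print(wt, hp, fold_increase(hp, wt))
```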

Figure 3.

Hairpin-sgRNAs Increase the Specificity of SpCas9 in Human Cells

(a,b,c) On-target and off-target mutation rates for sgRNA variants targeting the EMX1 and VEGFA genes, measured by deep sequencing. ‘Percent modified’ indicates percentage of reads containing indels compared to the wild-type sequence (mean + s.e.m., n=3).

Wild-type sgRNAs (‘WT’) generated significant editing activity at all off-target sites, except for VEGFA spacer 2 at OT10 (P < 0.01). Hairpin sgRNAs show significant decreases in activity at all measured off-target sites when compared to wild-type sgRNA, (P < 0.05). Hypothesis testing using a one-sided Fisher exact test with pooled read counts, adjusting for multiple comparisons using the Benjamini–Hochberg method. (d,e) On-target activity and specificity metric for different sgRNA variants. Samples labeled as ‘hairpin’ use the same hairpin variant listed in panels (a,b,c). The specificity metric is defined as on-target indel rate divided by the sum of all off-target indel-rates (mean + s.e.m., n=3).

The sequences of sgRNA variants are listed in Table S1. The predicted structures of hp-sgRNAs are displayed in Supplementary Figures 3–5.

To test whether the 5’ extensions of hp-sgRNAs might lead to any new off-target cleavage events beyond what had previously been identified for the corresponding WT-sgRNAs, we performed CIRCLE-seq, an unbiased in vitro method to determine genome-wide cleavage events [43]. We performed CIRCLE-seq using the EMX1.1 spacer with WT-, truncated-, and hairpin-sgRNA variants; off-targets were reliably identified across replicates for each sgRNA variant (Supplementary Fig. 7a,b,c,d). Compared to WT-sgRNA, the tru-sgRNA eliminated 77 off-target sites but also had 25 unique off-target sites that were reproducibly detected using CIRCLE-seq (Supplementary Fig. 8a, Supplementary Fig. 9a,b). In contrast, the hp-sgRNA eliminated 124 off-target sites found with the WT-sgRNA and generated no unique off-target sites (Supplementary Fig. 8b, Supplementary Fig. 9a,c). We next sought insight into the mechanism of specificity increases driven by hp-sgRNAs, in particular, whether this was a result of decreased binding to DNA. We performed ChIP-qPCR to measure the relative enrichment of the nuclease-null dSpCas9 at on-target vs. off-target sites using the same EMX1 spacer tested with nuclease-active SpCas9. We observed that both the hp-sgRNAs and tru-sgRNA yielded similar levels of dCas9 occupancy at the on-target site (Fig. 4a). Interestingly, hp-sgRNA 2 did not measurably decrease dCas9 occupancy at any of the measured off-target sites relative to the WT-sgRNA (Fig. 4b,c,d), even though nuclease activity was reduced at these sites by an order of magnitude or more (Fig. 4e, Supplementary Fig. 6b). This suggests that, similar to high-fidelity Cas9 variants [24], hp-sgRNAs do not mediate specificity increases through a decrease in binding. Hp-sgRNA 7 had more variable behavior, which we attribute to the combination of a hairpin and a truncated spacer.

Figure 4.

Hairpin-sgRNAs Retain Binding Activity at Off-Target Loci

(a) dCas9 enrichment at the on-target site using sgRNA variants containing EMX1 spacer 1 by ChIP-qPCR. The WT sgRNA sample had significant enrichment over control, P<0.001. The tru-sgRNA and hp-sgRNAs showed a decreased enrichment relative to WT-sgRNA, P<0.05. (b,c,d) dCas9 enrichment at designated off-target sites using sgRNA variants containing EMX1 spacer 1 by ChIP-qPCR. Hp-sgRNAs were also assayed for editing activity with nuclease active SpCas9 (Supplementary Figure 6b), and their predicted secondary structure is shown in Figure S1. (e) Off-target editing rates, as shown on Supplementary Figure 5B, as a function of corresponding DNA binding as measured by ChIP-qPCR. Hairpin 2, when compared to WT, showed significantly decreased editing activity at off-target sites (P<0.05×10^−18), but showed no significant decreases in ChIP enrichment (mean + s.e.m., n=3).

P-values for ChIP-qPCR data were calculated using a post-hoc Tukey-test after a global one-way ANOVA. For editing activity, hypothesis testing was carried out using a one-sided Fisher exact test with pooled read counts, adjusting for multiple comparisons using the Benjamini–Hochberg method.

All fold enrichments are relative to transfection of a control sgRNA plasmid targeted to the IL1RN promoter and normalized to a region of the β-Actin locus.

The sequences of sgRNA variants are listed in Table S1.

hp-sgRNAs increase specificity of Cas9 and Cas12a variants

We next tested whether hp-sgRNA designs can be extended to other CRISPR systems. In particular, we were interested in SaCas9 because its compact size facilitates delivery by AAV vectors and is therefore of significant interest for gene therapy applications [6, 44]. While SaCas9 and SpCas9 have many analogous domains and a similar bilobed structure, they share only 17% sequence similarity [45]. Focusing on SaCas9 and SaCas9-KKH, a relaxed PAM variant, we designed hp-sgRNAs of varying stem lengths using target sites with previously characterized off-target effects [6, 13]. We delivered sgRNA variants with each SaCas9 to human cells and assayed for nuclease activity at on-target and off-target loci. Similar to SpCas9, SaCas9 activity is tuned by hp-sgRNAs according to the strength of predicted secondary structure (Fig. 5a,b, Supplementary Fig. 10a,b,c). Truncated sgRNAs of varying length were also used, though they did not eliminate off-target activity without severely impacting on-target activity; shorter truncations resulted in complete abrogation of off-target and on-target nuclease activity (Fig. 5a,b, Supplementary Fig. 10a,b,c; data not shown).

Figure 5.

Hairpin-sgRNAs and -crRNAs Increase the Specificity of Various Cas Effectors

(a,b,c,d) On-target and off-target nuclease activity of sgRNA and crRNA variants with SaCas9, SaCas9-KKH, LbCas12a, and AsCas12a, respectively. Plasmids encoding the Cas effector and the sgRNA or crRNA variant were transfected into human cells and mutational activity was measured using the Surveyor nuclease assay. Representative gels are shown from optimizations that were performed one to three times. Optimized structures were further investigated with deep sequencing in Figure 6.

Aliases for sgRNA variants are listed above each lane and are detailed in Table S1. Wild-type sgRNAs are referred to as ‘WT’, truncated sgRNAs/crRNAs are abbreviated as ‘Tru’, hairpin-sgRNAs/crRNAs are abbreviated as ‘Hp’, and nonstructured controls are abbreviated as ‘N.s.’. For LbCas12a, off-target activity was generated by introducing a mismatch in the sgRNA spacer, as shown, and is referred to as a “pseudo-off-target”. (e,f,g,h) Predicted structure of optimized hairpin-sgRNA spacers, arrows indicate 3’ end of RNA.

The sequences of the sgRNA variants are listed in Table S1. (i) Normalized nuclease activity of wild-type sgRNAs and various hp-sgRNAs, plotted against their predicted free energy of secondary structure folding. Data from panels A-D were normalized to the wild-type sgRNA activity at the corresponding on-target (solid line) or off-target site (dotted line).

We next tested whether hp-sgRNAs could be applied to type V Cas12a nucleases. While SpCas9 and Cas12a share a bilobed architecture, they share no structural or sequence homology other than a single RuvC domain [46]. Cas12a nucleases are unique in that they can process their own crRNAs, and these crRNAs are sufficient for Cas12a target recognition and cleavage [47]. Cas12a recognizes its crRNA via a hairpin that is at the 5’ end of the crRNA and the spacer is at the 3’ end: the reverse orientation relative to Cas9 sgRNA structure. Target recognition by Cas12a and R-loop formation mechanisms are also reversed when comparing to that of Cas9: the PAM sequence is located 5’ of the target sequence and R-loop formation of the target strand proceeds 3’ to 5’. Despite these many differences, we hypothesized that the activity of Cas12a nucleases could also be regulated by spacer secondary structure. Using a spacer with previously characterized off-target sites [14, 15, 48], we designed hp-crRNAs with varying structural stability. We observed that both AsCas12a and LbCas12a activity can be regulated by spacer secondary structure and that off-target activity can be reduced without altering on-target activity by tuning the strength of the secondary structure (Fig. 5c,d Supplementary Fig. 11a,b,c). Truncated crRNAs did not consistently result in specificity increases for either AsCas12a or LbCas12a, indicating this strategy might not be consistently translatable to Cas12a nucleases (Fig. 5c–d, Supplementary Fig. 11a,b,c). Shorter truncations of the spacer resulted in complete abrogation of off-target and on-target nuclease activity. We observed that hp-crRNAs influence the activity of Cas12a nucleases according to the strength of the secondary structure, consistent with the effect of hp-sgRNAs on SpCas9 and SaCas9 activity (Fig. 5c,d, Supplementary Fig. 11a,b,c). 
Significantly, as predicted folding energy increases, decreases in gene editing activity occur preferentially at off-target loci, allowing for increases in specificity (Fig. 5i). In order to confirm that increases in specificity are caused by RNA secondary structures, we generated ns-sgRNAs for hp-sgRNAs used with Cas9 and Cas12a effectors. For each Cas effector we chose hp-sgRNA variants that maintained on-target activity but had the most stable predicted free energy. We delivered these sgRNA variants with their respective Cas nuclease and used deep sequencing to assay mutational rates at both on-target and off-target loci (Fig. 6a,b,c,d,e). Across 12 spacer sequences and five different Cas9 or Cas12a variants, hp-sgRNAs increased specificity by an average of 55-fold (median 12-fold) compared to unmodified sgRNAs and 9-fold compared to length-matched non-structured control sgRNAs (Fig. 6f, Supplementary Fig. 12). Hp-sgRNAs showed particular sensitivity to off-targets with multiple mismatches (Supplementary Fig. 13).

Figure 6.

RNA Secondary Structure Drives the Specificity Increases Observed with hp-sgRNAs

(a,b,c,d,e) Nuclease activity of hp-sgRNAs/crRNAs and corresponding non-structured controls in human cells; sgRNA variants were applied with SaCas9, SaCas9-KKH, and AsCas12a, respectively. Deep sequencing was used to measure editing activity of Cas effector-sgRNA pairs.

Wild-type sgRNAs (‘WT’) induced significant editing activity at all off-target sites (P < 0.01×10^−7). Hp-sgRNAs/crRNAs significantly reduced editing activity at all examined off-target sites when compared to wild-type sgRNA/crRNA (P < 0.05×10^−9). Hypothesis testing was carried out using a one-sided Fisher exact test with pooled read counts, adjusting for multiple comparisons using the Benjamini–Hochberg method. (f) Specificity metric for sgRNA variants applied with the indicated Cas effector (mean + s.e.m., n=3). The gene target of each spacer is listed on the x-axis.

To further ensure that the specificity increases were due to modulation of kinetics of R-loop formation, rather than changes to expression or stability that could occur within transfected cells, we completed in vitro assays for nuclease activity and DNA-binding. For in vitro nuclease activity, we digested PCR amplicons containing the on-target EMX1 spacer 1, EMX1 spacer 2, or DNMT1 spacer 1, by defined concentrations of purified SpCas9, SaCas9, or AsCas12a protein, respectively, complexed with corresponding chemically synthesized WT-, hp-, or ns-sgRNAs (Supplementary Fig. 14). At the on-target sites, the activity of the hp-gRNAs was reduced by 85%, 59%, and 69% relative to activity of WT-gRNAs at the on-target sites for SpCas9, SaCas9, and AsCas12a, respectively, compared to a reduction of 12% and increases of 35% and 6% with the corresponding ns-gRNAs. The significant reduction of activity of hp-gRNAs at on-target sites in vitro, but not in cells (Figs. 3b,d, 6a,c), may be the result of the short time frame of the assay or other differences with the intracellular environment in which these particular hairpin structures were selected. We also tested identical digestion reactions with PCR amplicons containing the corresponding off-target 1 (OT1) spacer sequence. At the off-target sites, hp-gRNAs also showed decreases of 91%, 79%, and 67% relative to WT-gRNAs, compared to decreases of 88%, 38%, and 0% for the ns-gRNAs. To assay DNA-binding, we used atomic force microscopy to directly image and quantify interactions of the same combinations of Cas effectors and gRNAs at on-target and off-target sequences (Supplementary Fig. 15). These analyses showed that only hp-gRNAs, and not ns-gRNAs, robustly and reproducibly decreased occupancy at off-target sites relative to the on-target site. Collectively, these data support that, under controlled conditions of in vitro reactions, hairpin structure – and not simply any 5’ extension – modulates CRISPR activity.

Discussion

CRISPR-Cas endonucleases did not evolve to function for highly specific gene editing of mammalian genomes, and cases of off-target activity have been reported for the majority of CRISPR endonucleases tested so far in human cells. Additionally, the discovery of novel CRISPR systems with potential biotechnological applications is occurring at a steady pace. Hence, there is a need to improve the performance of CRISPR endonucleases that is robust and can be applied easily across CRISPR systems. The rational design of hp-sgRNAs as characterized in this study is a promising method to meet this need. For five of the most commonly applied Cas effectors, utilizing well-characterized off-target sites, we demonstrate that rationally designed RNA secondary structures increase specificity by an average of 55-fold. Moreover, despite the widely ranging biochemical properties of each Cas effector used, we observe consistent behavior of hp-sgRNAs, where CRISPR activity is inhibited as a function of the stability of the secondary structure. The strategy used in this study was inspired by previous efforts, which aimed to increase nuclease specificity by weakening direct interactions between Cas9 and the DNA [21, 22]. While we do not directly determine the mechanism of hp-sgRNA-driven specificity increases, we hypothesize that it occurs through inhibition of R-loop kinetics, which inhibits the structural transitions of the CRISPR endonuclease that are necessary for activity at off-target sites [30]. The evidence for this is three-fold. First, using ChIP-qPCR we show that hp-sgRNAs do not decrease dCas9 binding at off-target sites, even when nuclease activity is reduced by orders of magnitude (Fig. 4e). This is evidence that nuclease activity is diminished by the inhibition of full R-loop formation. 
Second, because RNA-DNA duplexes are regularly accommodated in the central binding channel of CRISPR endonucleases, it is likely that RNA-RNA duplexes are similarly accommodated without interfering with RNP complex formation. This is supported by evidence that sgRNAs with significant spacer secondary structure could readily complex with SpCas9 [49]. Finally, the predictive power of our kinetic model supports its principal hypothesis: that R-loop formation is a kinetic process that is modulated by RNA secondary structures. Collectively, these points suggest that sgRNA-endonuclease complex levels are maintained and that observed specificity increases are caused by secondary-structure-mediated inhibition of R-loop formation, limiting the conformational change to an activated endonuclease at off-target sites. Our study considers R-loop formation as the central process governing CRISPR nuclease activity: its modulation allows for more specific genome editing and its modeling facilitates predictions of CRISPR activity. Improvements to the modeling of this process would be broadly useful for in silico prediction of off-target effects and for designing functional hp-sgRNAs a priori. As our model approximates this behavior using thermodynamic parameters of nucleic acids derived from in vitro data, further refinement of our understanding of RNA-DNA interactions and mispairing within the catalytic environment of different CRISPR endonucleases will likely improve its predictive and design performance. Recent methods using massively parallel assessment of CRISPR endonuclease binding and catalysis could provide attractive data sets for model refinement [50, 51]. To our knowledge, this is the first study to demonstrate a method to increase specificity across diverse CRISPR systems. Future studies will be useful to determine whether hp-sgRNAs can similarly regulate new Cas12, Cas13, or Cas14 effectors [4, 5, 11, 52, 53]. 
The hp-sgRNA secondary structures that regulate specificity may be combined with other methods of sgRNA engineering to modulate activity, specificity, and orthogonality [54-56]. sgRNA engineering, in conjunction with careful spacer choice and optimized gene delivery, could enable higher specificity of CRISPR nucleases for next-generation genome editing and facilitate realizing the potential of CRISPR for sensitive therapeutic and diagnostic applications.

Online Methods

Plasmids and oligonucleotides

Expression plasmids for the Cas effectors and their respective gRNAs were obtained through Addgene (Addgene #41815, 47108, 65776, 70708, 70709, 78741, 78742, 78743, 78744); crRNA sequences are listed in Table S1 and oligonucleotide sequences are found in Table S2. To create sgRNA plasmids, oligonucleotides containing the target sequences were obtained from IDT, hybridized, phosphorylated and cloned in the appropriate plasmids using BbsI or BsmBI sites. All hp-sgRNA designs were informed through the use of in silico structure determination and only spacer sequences were used for these predictions (i.e. structural sequences in the tracrRNA or crRNAs were excluded) [57].

Human cell culture and transfection

HEK293T cells were obtained from the American Type Culture Collection (ATCC) through the Duke University Cancer Center Facilities and were maintained in DMEM supplemented with 10% FBS and 1% penicillin-streptomycin at 37 °C with 5% CO2. HEK293T cells were transfected with Lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions. Transfection efficiencies were routinely higher than 80%, as determined by fluorescence microscopy after delivery of a control eGFP expression plasmid. All transfections were performed in 24-well cell culture plates that were coated with a 1:10 dilution of poly-l-lysine (Sigma, P8920). On day 1, cell culture plates were coated and 200,000 cells were seeded per well. On day 2, the medium was replaced with Opti-MEM and cells were transfected with 800 ng of plasmid (600 ng of Cas effector, 200 ng of sgRNA) and 2 μL of Lipofectamine 2000. On day 3, the medium was changed to DMEM supplemented with 10% FBS and 1% penicillin-streptomycin. Cells were harvested for downstream analysis on day 5.

Surveyor assays

The region surrounding the sgRNA or crRNA target site was amplified by PCR with the AccuPrime PCR kit (Invitrogen) and 50–200 ng of genomic DNA as template using primers listed in Table S3. The PCR products were melted and reannealed using the temperature program: 95°C for 180 s, 85°C for 20 s, 75°C for 20 s, 65°C for 20 s, 55°C for 20 s, 45°C for 20 s, 35°C for 20 s and 25°C for 20 s with a 0.1°C/s decrease rate in between steps. This allows the formation of mutant and wild-type DNA strands with the consequent formation of distorted duplex DNA. Without purifying the PCR product, 18 μl of the reannealed duplex were combined with 2 μl of the Surveyor nuclease (IDT), which cleaves DNA duplexes at the sites of distortions created by either bulges or mismatches, and 1 μl of enhancer solution. This reaction was incubated at 42°C for 60 min and then separated on a 10% TBE polyacrylamide gel. The gels were stained with ethidium bromide and quantified using ImageLab (BioRad) [58].

Deep sequencing

Genomic DNA was purified from cells using the DNeasy kit (Qiagen). Biological replicates were generated from three separate transfections for each experimental condition. On-target and off-target sites were amplified using 100 ng of genomic DNA with AccuPrime polymerase (Invitrogen). Primers are listed in Table S3. For some regions, 4% v/v DMSO was used in the PCR for efficient amplification. PCR primers included Nextera adapters for binding to Illumina flowcells. Using a second round of PCR, group-specific barcodes were added. The resulting PCR products were purified using Agencourt AMPure beads (Beckman Coulter), quantified using a Qubit Fluorimeter (Thermo Fisher), pooled, and sequenced with 150 bp paired-end reads on an Illumina MiSeq instrument. CRISPResso was used for sequence analysis [59]. Reads were first trimmed to remove adapter sequences and then filtered using a minimum average quality score of 30. Paired reads were then merged using the FLASH method to create a single sequence of higher quality; a minimum overlap of 40 bp was used. CRISPRessoPooled was then used to demultiplex reads and quantify NHEJ rates. A minimum identity score of 80 was used for demultiplexing. Only insertions and deletions were used in calling CRISPR-generated NHEJ events, since CRISPR-based gene editing largely causes indels and not substitutions. Each biological replicate had a minimum of 1,500 reads per locus; the average was approximately 20,000 reads per replicate per locus. Hypothesis testing was carried out using a one-sided Fisher exact test on the pooled read counts of three biological replicates. P values were adjusted for multiple comparisons using the method of Benjamini and Hochberg.
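As an illustration of the statistical step described above, the following Python sketch applies a one-sided Fisher exact test to pooled read counts and then adjusts the P values with the Benjamini–Hochberg procedure. It is not the analysis code used in this study: the function names and all read counts are hypothetical, and the test is written out directly from the hypergeometric distribution using only the standard library.

```python
from math import comb


def fisher_one_sided(edited_a, total_a, edited_b, total_b):
    """P value for the alternative 'group A has a higher edited fraction than B'.

    Sums the hypergeometric upper tail P(X >= edited_a) for a 2 x 2 table of
    edited/unedited reads in two groups.
    """
    edited = edited_a + edited_b
    total = total_a + total_b
    upper = min(edited, total_a)
    tail = sum(
        comb(edited, k) * comb(total - edited, total_a - k)
        for k in range(edited_a, upper + 1)
    )
    return tail / comb(total, total_a)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical pooled counts: (WT edited, WT total, hp edited, hp total).
sites = {
    "OT1": (120, 2000, 15, 2000),
    "OT2": (40, 1800, 31, 1750),
    "OT3": (9, 1600, 2, 1900),
}
raw = [fisher_one_sided(*counts) for counts in sites.values()]
adjusted = benjamini_hochberg(raw)
for name, p_raw, p_adj in zip(sites, raw, adjusted):
    print(f"{name}: P = {p_raw:.3g}, BH-adjusted P = {p_adj:.3g}")
```

Exact integer arithmetic keeps the tail sum accurate for small P values, at the cost of speed when read counts are very large; a statistics library would be the practical choice for full-depth data.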

Quantitative reverse transcription–PCR

IL1RN activation experiment: Cells were transfected as described above. RNA was isolated using the RNeasy Plus RNA isolation kit (Qiagen). cDNA synthesis was performed using the SuperScript VILO cDNA Synthesis Kit (Invitrogen). Real-time PCR using SYBR green Fastmix (Quanta BioSciences) was performed with the CFX96 Real-Time PCR Detection System (Bio-Rad) with oligonucleotide primers reported in Table S3 that were designed using Primer3Plus software and purchased from IDT. Primer specificity was confirmed by agarose gel electrophoresis and melting curve analysis. Reaction efficiencies over the appropriate dynamic range were calculated to ensure linearity of the standard curve. The results are expressed as fold-increase in mRNA expression of the gene of interest normalized to GAPDH expression by the ΔΔCt method. HBG1 and IL1B activation experiments: The day before transfection, HEK293T cells were plated at 10⁵ cells/well in a 96-well plate coated with poly-l-lysine. On the day of transfection, DMEM was aspirated and 100 μL of Opti-MEM was added to each well. Each well was then transfected with 400 ng of plasmid (300 ng of dCas9-P300 and 100 ng of sgRNA). Plasmids were brought to 25 μL with Opti-MEM. A separate mixture was made of 24.5 μL Opti-MEM and 0.5 μL Lipofectamine 2000 and this was combined with the 25 μL plasmid mixture. The 50 μL solution was incubated for 5 minutes and pipetted slowly onto each well. The medium was changed the next day to DMEM supplemented with 10% FBS and penicillin-streptomycin. Cells were harvested using the Cells-to-CT 1-Step TaqMan Kit and analyzed with TaqMan gene expression assays (ThermoFisher).
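The ΔΔCt normalization mentioned above can be written out in a few lines. The sketch below is only an illustration: the function name and the Ct values are hypothetical, and it assumes that each PCR cycle doubles the product (roughly 100% amplification efficiency).

```python
def fold_change(ct_gene_sample, ct_ref_sample, ct_gene_control, ct_ref_control):
    """Relative expression by the ddCt method, assuming ~100% PCR efficiency."""
    # dCt: gene of interest normalized to the reference gene within each condition.
    d_sample = ct_gene_sample - ct_ref_sample
    d_control = ct_gene_control - ct_ref_control
    # ddCt: treated sample relative to control; each cycle is a factor of 2.
    return 2.0 ** (-(d_sample - d_control))


# Hypothetical Ct values: IL1RN and GAPDH in activated versus control cells.
activation = fold_change(24.0, 18.0, 30.0, 18.0)
print(activation)  # prints 64.0
```

A target that crosses threshold six cycles earlier than in the control, with an unchanged reference gene, therefore corresponds to a 64-fold increase.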

Sample-matched 5’ RACE and sgRNA expression measurements

Cells were grown and transfected as described above. Cells were harvested using the miRNeasy kit (Qiagen) and on-column DNase digestion was performed to remove any remaining plasmid DNA. RNA concentrations were then measured and normalized by dilution. For measurement of IL1RN gene activation and sgRNA expression, cDNA was created using the SuperScript VILO cDNA Synthesis Kit. Primers for the sgRNA RT-qPCR were designed to bind the spacer region and the end of the sgRNA scaffold. RT-qPCR was carried out as described above. 5’ RACE was carried out on the RNA samples using Maxima H Minus Reverse Transcriptase (EP0753, Thermo Fisher). Both the template-switch primer and the sgRNA-specific RT primer were ordered from IDT. The RT primer included a 10 nt random barcode that serves as a unique molecular identifier (UMI). Reactions were run using the manufacturer’s protocol with slight modification. Specifically, 1 μg total RNA, 0.2 pmol RT primer, 50 pmol template-switch primer, 1 μL 10 mM dNTP mix, and 4 μL 5x RT buffer were combined and brought to 19.5 μL with water. The mixture was incubated at 85 °C for 2 minutes to disrupt RNA secondary structure. The temperature was then brought down to 55 °C, 0.5 μL RTase was added, and the reaction was incubated at 55 °C for 30 minutes and terminated by incubating at 85 °C for 5 minutes. 1 μL of each reaction was then used in a 50 μL PCR to enrich for the desired product, add barcodes, and add i5 and i7 Illumina adapters. PCR product was run on an agarose gel to confirm expected product lengths. The desired sgRNA cDNAs were purified using a 0.9x SPRI bead cleanup, concentrations were measured using the high-sensitivity Qubit assay, and samples were pooled and run on an Illumina MiSeq instrument. Samples were sequenced using 150 bp single-end reads at an average depth of approximately 100,000 reads per replicate. Any reads without an exact 76 nucleotide sgRNA scaffold sequence were discarded. 
UMI sequences were used to remove any events that might result from PCR duplication. After these two filters, each sample had an average of 47,675 reads, with a minimum of 8,092. Spacer lengths were then calculated using the locations of the sgRNA scaffold and template-switch sequence as anchors. Finally, the frequency of each observed spacer length was determined for each sample.
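The read-processing logic described above (scaffold filter, UMI de-duplication, anchor-based spacer length) can be sketched as follows. This is a simplified illustration rather than the pipeline used here: the function name, the short placeholder anchor sequences, and the toy reads are hypothetical, each read is assumed to arrive already paired with its UMI, and duplicates are assumed to be reads sharing both UMI and spacer sequence.

```python
from collections import Counter


def spacer_length_frequencies(records, template_switch, scaffold):
    """Return {spacer length: fraction of de-duplicated reads}.

    records is an iterable of (read_sequence, umi) pairs.
    """
    seen = set()
    lengths = Counter()
    for read, umi in records:
        ts_pos = read.find(template_switch)
        if ts_pos < 0:
            continue  # no template-switch anchor
        start = ts_pos + len(template_switch)
        end = read.find(scaffold, start)
        if end < 0:
            continue  # no exact scaffold match: discard the read
        spacer = read[start:end]
        key = (umi, spacer)
        if key in seen:
            continue  # likely PCR duplicate
        seen.add(key)
        lengths[len(spacer)] += 1
    total = sum(lengths.values())
    if total == 0:
        return {}
    return {n: count / total for n, count in sorted(lengths.items())}


# Placeholder anchors (not the real 76 nt scaffold or template-switch oligo).
TS = "AAGCAGTGG"
SCAFFOLD = "GTTTTAGAGC"
reads = [
    (TS + "ACGT" * 5 + SCAFFOLD, "UMI1"),
    (TS + "ACGT" * 5 + SCAFFOLD, "UMI1"),          # duplicate, dropped
    (TS + "ACGT" * 5 + SCAFFOLD, "UMI2"),
    (TS + "ACGT" * 5 + "ACGTA" + SCAFFOLD, "UMI3"),  # 25 nt spacer
    (TS + "ACGT" * 5 + "GTTTTAGAGG", "UMI4"),       # scaffold mismatch, dropped
]
frequencies = spacer_length_frequencies(reads, TS, SCAFFOLD)
print(frequencies)
```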

CIRCLE-Seq

CIRCLE-Seq libraries were generated largely as previously described [60]. Large quantities of HEK293T gDNA were harvested as follows. 6 ml of NK Lysis Buffer (50 mM Tris, 50 mM EDTA, 1% SDS, pH 8) and 30 μl of 20 mg/ml Proteinase K (QIAGEN 19131) were used to resuspend 5 × 10⁷ cells. This lysate was incubated at 55°C overnight. The next day, 30 μl of 10 mg/ml RNase A (QIAGEN 19101) was added to the lysed sample. The sample was vortexed and incubated at 37°C for 30 min. Samples were cooled on ice before addition of 2 ml of pre-chilled 7.5 M ammonium acetate (Sigma A1542) to precipitate proteins. The samples were vortexed and centrifuged at ≥ 10,000 × g for 10 min, and the supernatant was carefully decanted into a new 15 ml conical tube. Then 6 ml of 100% isopropanol was added, and the tube was inverted several times and centrifuged at ≥ 10,000 × g for 10 min. Genomic DNA was visible as a small white pellet in each tube. The supernatant was discarded, 5 ml of freshly prepared 80% ethanol was added to wash the pellet, and the tube was centrifuged at ≥ 10,000 × g for 1 min. The supernatant was carefully discarded, and the pellet was air dried for 30 minutes and finally resuspended in TE buffer. Approximately 50–100 μg of starting gDNA was needed to generate enough circles for each CIRCLE-seq reaction. Using a Diagenode Bioruptor XL sonicator at 4 °C, gDNA was sonicated to an average size of approximately 500 bp, with a visible range of 200–1000 bp, as determined by agarose gel electrophoresis. The enzymatic procedure to generate circles was carried out as previously described [60]. For the in vitro digest of the circles, sgRNAs were synthesized by Synthego and SpCas9 was purchased from New England Biolabs. Library production was carried out as previously described [60]. Libraries were quantified using a Qubit Fluorimeter (Thermo Fisher), pooled, and sequenced with 150 bp paired-end reads on an Illumina MiSeq instrument. 
CIRCLE-seq read counts were obtained using previously described methods and software [43]. The following parameters were used for running the CIRCLE-seq pipeline: read threshold of 4, window_size of 3, mapq threshold of 50, start threshold of 1, gap threshold of 3, and mismatch threshold of 6.

ChIP-qPCR

ChIP experiments were performed in biological triplicate, starting from independent cell transfections, and harvested 3 days after transfection. For each replicate, 2 × 10⁷ nuclei were resuspended in 1 mL of RIPA buffer (1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS in PBS at pH 7.4). Samples were sonicated using a Diagenode Bioruptor XL sonicator at 4 °C to fragment chromatin to 200–500-bp segments. Insoluble components were removed by centrifugation for 15 min at 15,000 r.p.m. 5 μg of FLAG M2 antibody (F1804) was conjugated with sheep anti-mouse IgG magnetic beads (Life Technologies, 11203D/11201D). Sheared chromatin in RIPA buffer was then added to the antibody-conjugated beads and incubated on a rotator overnight at 4 °C. After incubation, beads were washed five times with an LiCl wash buffer (100 mM Tris, pH 7.5, 500 mM LiCl, 1% NP-40, 1% sodium deoxycholate), and remaining ions were removed with a wash in 1 mL of TE (10 mM Tris-HCl, pH 7.5, 0.1 mM Na2-EDTA) at 4 °C. Chromatin and antibodies were eluted from beads by incubation for 1 h at 65 °C in immunoprecipitation elution buffer (1% SDS, 0.1 M NaHCO3) followed by overnight incubation at 65 °C to reverse formaldehyde cross-links. DNA was purified using MinElute DNA purification columns (Qiagen). qRT-PCR using SYBR green Fastmix (Quanta BioSciences) was performed with the CFX96 Real-Time PCR Detection System (Bio-Rad) and the oligonucleotide primers reported in Table S3. 100 pg of ChIP DNA was loaded into each reaction. The results are expressed as a fold-increase of signal at the target locus normalized to signal of a region in the β-Actin locus using the ΔΔCt method.

Kinetic R-loop formation simulations

A first-principles, biophysical simulation of sgRNA invasion of a DNA duplex was performed in MATLAB by modeling the process as a one-dimensional random walk in a position-dependent potential [29], formulated as a continuous-time Markov chain. The position-dependent potential is determined by the nearest-neighbor dependent DNA:DNA binding free energies [61], RNA:DNA binding free energies [62], and guide RNA secondary structure free energies that are disrupted or restored as invasion progresses or recedes. Here we have generalized the model to estimate sgRNA residence time at spacers with arbitrary numbers and species of mismatches, and to account for effects of spacer secondary structure on invasion kinetics. The gRNA is base-paired with the spacer up to spacer site m (2 ≤ m ≤ 20). At each state m, the gRNA is assumed to be in quasi-equilibrium with the DNA, such that at perfectly matched spacer sites the forward rate (rate of additional guide RNA invasion; m to m + 1), $v_{\mathrm{forward}}$, is estimated using the symmetric approximation to be $\exp\left(-\left(\Delta G^{\circ}_{\mathrm{RNA:DNA}}(m+1) - \Delta G^{\circ}_{\mathrm{DNA:DNA}}(m+1) - \Delta G^{\circ}_{\mathrm{RNA,SS}}(m+1)\right)/2RT\right)$, where R is the gas constant, T is the temperature (here 37 °C, to correspond with the parameter set we used), and the factor of 1/2 is included to satisfy detailed balance. $\Delta G^{\circ}_{\mathrm{RNA:DNA}}(m+1)$ is the free energy of the base-pairing between the RNA and DNA target at site m + 1. $\Delta G^{\circ}_{\mathrm{DNA:DNA}}(m+1)$ is the free energy of the base-pairing between the spacer and its complementary DNA strand. $\Delta G^{\circ}_{\mathrm{RNA,SS}}(m+1)$ is the difference in free energies between the predicted structures of the 20 − m − 1 uninvaded nucleotides of the gRNA at site m + 1 and the 20 − m uninvaded nucleotides of the gRNA at site m. The reverse rate, $v_{\mathrm{reverse}}$, was calculated similarly as $\exp\left(-\left(\Delta G^{\circ}_{\mathrm{DNA:DNA}}(m-1) - \Delta G^{\circ}_{\mathrm{RNA:DNA}}(m-1) + \Delta G^{\circ}_{\mathrm{RNA,SS}}(m-1)\right)/2RT\right)$. At m = 1, the gRNA irreversibly falls off the DNA (m = 1 acts as an absorbing state). 
RNA secondary structure free energy was calculated using the rnafold function in MATLAB [63, 64]. To estimate transition rates from site m in the presence of mismatched nucleotides, the next complementary site n is identified, and $\Delta G^{\circ}_{\mathrm{MM}}(n)$ is estimated from the difference in free energies between the gRNA (R)-DNA target (P) duplex from sites 1 to m and the gRNA-DNA target duplex from sites 1 to n. These duplex free energies were calculated using the MATLAB rnafold function using the sequence R–UUUU–P, with the minimum loop size (in nucleotides) set to 4. The forward rate was then calculated as $\exp\left(-\left(\Delta G^{\circ}_{\mathrm{MM}}(n) - \sum_{k} \Delta G^{\circ}_{\mathrm{DNA:DNA}}(k) - \Delta G^{\circ}_{\mathrm{RNA,SS}}(k)\right)/2RT\right)$, and similarly for the reverse. The forward and reverse rates were calculated and assembled into a 19 × 19 Q-matrix (Q) [65], and the mean lifetime L of the gRNA-spacer interaction was calculated as $L = -\alpha_0 Q^{-1} \mathbf{1}$, where $\mathbf{1}$ is a 19-element column vector with all values equal to 1 and $\alpha_0$ is a 19-element row vector containing the fractional population of initial states (m = 2 to 20). These simulations were performed for all 16 gRNAs and 12,181 ChIP-seq hits using the published data sets from Kuscu et al. [66] and Wu et al. [40]. For each gRNA, log(L) was correlated (Pearson) with log(ChIP-seq count normalized to on-target site), and these correlations were combined using Fisher’s method.
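To make the mean-lifetime calculation described above concrete, the following Python sketch builds the generator matrix for a 19-state invasion walk with an absorbing state at m = 1 and evaluates L = −α0 Q⁻¹ 1 by solving Q t = −1. It is a toy illustration, not the MATLAB model used in this study: the per-step free energies, the position and size of the "hairpin" penalty, and all function names are hypothetical, rates are in arbitrary units, and no nearest-neighbor parameters or rnafold predictions are used.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal/(mol*K)
T_KELVIN = 310.15   # 37 degrees C


def step_rate(delta_g):
    """Symmetric-approximation rate exp(-dG / 2RT) for one step (arbitrary units)."""
    return math.exp(-delta_g / (2.0 * R_KCAL * T_KELVIN))


def build_q(forward, reverse):
    """Generator over transient states; stepping back from state 0 is absorption."""
    n = len(reverse)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        f = forward[i] if i < n - 1 else 0.0  # the last state cannot advance
        q[i][i] = -(f + reverse[i])
        if i < n - 1:
            q[i][i + 1] = f
        if i > 0:
            q[i][i - 1] = reverse[i]
    return q


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mean_lifetime(q, alpha0):
    """L = -alpha0 Q^{-1} 1, computed by solving Q t = -1 and weighting by alpha0."""
    t = solve(q, [-1.0] * len(q))
    return sum(a * ti for a, ti in zip(alpha0, t))


def lifetime_from_steps(dg):
    """dg[i] is the hypothetical free-energy change of the step arriving at state i."""
    forward = [step_rate(dg[i + 1]) for i in range(len(dg) - 1)] + [0.0]
    reverse = [step_rate(-g) for g in dg]
    alpha0 = [1.0] + [0.0] * (len(dg) - 1)  # all molecules start at m = 2
    return mean_lifetime(build_q(forward, reverse), alpha0)


N_STATES = 19  # m = 2..20
dg_wt = [-0.5] * N_STATES
# Toy secondary-structure penalty (+1.0 kcal/mol) on six arbitrary steps.
dg_hp = [g + (1.0 if 9 <= i < 15 else 0.0) for i, g in enumerate(dg_wt)]
print(lifetime_from_steps(dg_wt), lifetime_from_steps(dg_hp))
```

In this toy setting the penalized walk has the shorter mean lifetime, which is the qualitative behavior the model attributes to spacer secondary structure; the magnitudes carry no quantitative meaning.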

Protein Purification

Plasmids encoding SpCas9 and SaCas9 were transformed into Rosetta 2 (DE3) competent cells. Clones were used to inoculate 25 mL starter cultures. Starter cultures were grown overnight, spun down, and used to inoculate 1 L cultures. Inoculated 1 L cultures were grown for 5 hours at 25 °C, after which the temperature was dropped to 16 °C and expression was induced with 0.1 mM IPTG. Induced cultures were grown for another 12 hours at 16 °C. Cells were harvested by centrifugation at 4,000 × g and stored at −80 °C. Cell pellets were resuspended in 30 mL of Lysis Buffer (50 mM Tris-HCl, 500 mM NaCl, 10 mM MgCl2, 10% v/v glycerol, 0.2% Triton X-100, 1 mM PMSF). The cell suspension was lysed by sonication at 30% duty for 5 minutes. The suspension was then centrifuged for 30 minutes at 12,000 × g. The supernatant was then taken and incubated with Ni-NTA resin (Qiagen) for 30 minutes under gentle agitation. The resin was then loaded onto a column, washed with Wash Buffer (35 mM imidazole, 50 mM Tris-HCl, 500 mM NaCl, 10 mM MgCl2, 10% v/v glycerol), and eluted with Elution Buffer (120 mM imidazole, 50 mM Tris-HCl, 500 mM NaCl, 10 mM MgCl2, 10% v/v glycerol). Ultracel-30k centrifugal filters were then used to exchange the buffer to Storage Buffer (50 mM Tris-HCl, 500 mM NaCl, 10 mM MgCl2, 10% v/v glycerol). The samples were then aliquoted and frozen at −80 °C.

In vitro digestion

Regions of interest were amplified using PCR from HEK293T genomic DNA and purified using SPRI beads. Cas9 and sgRNA were combined and incubated for 10 minutes at room temperature at a 1:1 molar ratio. The Cas9-sgRNA complex was then combined with DNA at a 10:1 molar excess of RNP in NEB buffer 2.1. The reaction was incubated at 37 °C for one hour after which Gel Loading Dye, Purple (6X) (NEB #B7024S) was added. In order to fully dissociate Cas9-DNA interactions the reaction was heated to 90 °C and cooled. The reaction was then resolved on a 2% agarose gel.

Atomic force microscopy

Atomic force microscopy (AFM) was performed in air as previously described; see Ref. [6] for details. Imaging was performed using a Bruker Nanoscope V Multimode with RTSEP (Bruker) probes (nominal spring constant, 40 N/m; resonance frequency, 300 kHz). Prior to experiments, protein and guide RNAs were mixed in a 1:1.5 ratio for 10 min in a buffer chosen to limit DNA cleavage but not DNA binding (20 mM Tris-HCl (pH 7.5), 100 mM potassium glutamate, 5 mM CoCl2, and 0.4 mM TCEP) [67]. SpCas9 and SaCas9 proteins were purified as described above, AsCas12a was purchased from IDT, and all sgRNAs/crRNAs were purchased from Synthego. Protein and DNA were mixed in a solution of working buffer for at least 10 min at room temperature, deposited for 8 s on freshly cleaved mica (Ted Pella, Inc.) that had been treated with 3-aminopropylsiloxane, as previously described [68], rinsed with ultra-pure (>17 MΩ) water, and dried in air. Proteins were centrifuged briefly prior to incubation with DNA. At least three preparations for each experimental condition were imaged and analyzed. Images were acquired with a pixel resolution of 1024 × 1024 over 2.75-micron square areas or 2048 × 2048 over 5.5-micron square areas at 1.5 lines/s for each sample. Image analysis to determine the distribution of binding sites along the DNA was performed as described previously. Apparent dissociation constants of CRISPR proteins were determined using the method pioneered by Yang et al. [69], adapted as previously described [29]. Consensus structures of images of CRISPR proteins were determined by performing a reference-free alignment as previously described [5].
