DNA-binding-domain fusions enhance the targeting range and precision of Cas9.

Mehmet Fatih Bolukbasi1,2, Ankit Gupta1, Sarah Oikemus1, Alan G Derr3, Manuel Garber3,4, Michael H Brodsky1, Lihua Julie Zhu1,3,4, Scot A Wolfe1,2.   

Abstract

The CRISPR-Cas9 system is commonly used in biomedical research; however, the precision of Cas9 is suboptimal for applications that involve editing a large population of cells (for example, gene therapy). Variations on the standard Cas9 system have yielded improvements in the precision of targeted DNA cleavage, but they often restrict the range of targetable sequences. It remains unclear whether these variants can limit lesions to a single site in the human genome over a large cohort of treated cells. Here we show that by fusing a programmable DNA-binding domain (pDBD) to Cas9 and attenuating Cas9's inherent DNA-binding affinity, we were able to produce a Cas9-pDBD chimera with dramatically improved precision and an increased targeting range. Because the specificity and affinity of this framework can be easily tuned, Cas9-pDBDs provide a flexible system that can be tailored to achieve extremely precise genome editing at nearly any genomic locus.

Year:  2015        PMID: 26480473      PMCID: PMC4679368          DOI: 10.1038/nmeth.3624

Source DB:  PubMed          Journal:  Nat Methods        ISSN: 1548-7091            Impact factor:   28.547


Introduction

The CRISPR-Cas9 genome engineering system is revolutionizing biological sciences due to its simplicity and efficacy[1-3]. The most commonly studied Cas9 nuclease originates from Streptococcus pyogenes (SpCas9)[4]. SpCas9 and its associated guide RNA license a DNA sequence for cleavage based on two stages of sequence interrogation[4-8] (Supplementary Fig. 1): 1) compatibility of the PAM element with the specificity of the PAM-interacting domain, and 2) complementarity of the guide RNA sequence with the target site. Because it is straightforward to program Cas9 to cleave a desired target site through incorporation of a complementary single guide RNA (sgRNA)[4], the primary constraint on Cas9 targeting is the presence of a compatible PAM element[4,9,10]. The PAM-interacting domain of wild-type SpCas9 preferentially recognizes a nGG element[4], although it can inefficiently utilize other PAM sequences (e.g. nAG, nGA)[9,11]. The simplicity of the SpCas9-sgRNA system allows facile editing of genomes in a variety of organisms and cell lines[1-3]. The precision of SpCas9 is sub-optimal for most gene therapy applications involving editing of a large population of cells[12,13]. Numerous studies have demonstrated that SpCas9 can cleave the genome at unintended sites[9,14-20], with some guides acting at more than 100 off-target sites[17]. Recent genome-wide analyses of SpCas9 precision indicate that the majority of genomic loci that differ at two nucleotides from the guide RNA sequence and a subset of genomic loci that differ at three nucleotides are cleaved with moderate activity[17-20]. For some guides, off-target sites that differ by up to six nucleotides can be inefficiently cleaved[17-20] and bulges can be accommodated within the sgRNA:DNA heteroduplex[15]. In this light, we assessed the general frequency of potential off-target sites with three or fewer mismatches for SpCas9 guide RNAs in exons or promoter regions using CRISPRseek[21,22]. 
We found that the vast majority of guides (~98% in exons and ~99% in promoters) have one or more off-target sites with 3 or fewer mismatches (Supplementary Fig. 1), and thus are likely to have some level of off-target activity. Because off-target breaks have the potential to cause both local mutagenesis and genomic rearrangements (e.g. segmental deletions, inversions and translocations)[17,18,23,24], the resulting collateral damage from SpCas9 treatment could have adverse consequences in therapeutic applications. Reduced off-target cleavage rates have been reported with several modifications to the structure or delivery of the CRISPR-Cas9 system. Examples include: changing guide sequence length and composition[25,26]; employing a pair of Cas9 nickases[26-28] or FokI-dCas9 nucleases[10,29]; inducible assembly of split Cas9[30-33]; Cas9 PAM variants with enhanced specificity[34]; and delivery of Cas9:sgRNA ribonucleoprotein complexes[35-37]. However, it remains unknown whether these variations can restrict cleavage to a single site within the human genome over a large cohort of treated cells[12,38]. In addition, some of the most promising approaches (e.g. paired nickases or dimeric FokI-dCas9) restrict the targetable sequence space by requiring the proximity of two sequences compatible with Cas9 recognition. We envisioned an improved Cas9 platform, where the precision of target recognition would be augmented by the incorporation of a programmable DNA-binding domain (pDBD), such as Cys2-His2 zinc finger proteins (ZFPs)[39] or transcription activator-like effectors (TALEs)[40] (Fig. 1a and Supplementary Fig. 2). Both of these pDBD platforms can be programmed to recognize nearly any sequence within the genome[39-42]. Indeed, pDBDs have been employed with great success as targeting domains for programmable nucleases by incorporating a non-specific FokI nuclease domain (ZFNs[39] and TALENs[40]) or sequence-specific nuclease domains (e.g. megaTAL[43]). 
One favorable characteristic of pDBDs is their inherent modularity, whereby specificity and affinity can be rationally tuned by adjusting the number and composition of incorporated modules and the linkage between modules[44,45]. Here, we demonstrate that the fusion of a pDBD to a mutant SpCas9 with attenuated DNA-binding affinity generates a chimeric nuclease with broader sequence targeting range and dramatically improved precision. This SpCas9-pDBD platform has favorable properties for genome engineering applications. In addition, our analysis of these SpCas9-pDBD chimeras provides new insights into the barriers involved in licensing target site cleavage by a SpCas9:sgRNA complex.
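The survey of potential off-target sites described above (sites with three or fewer mismatches to a guide) can be illustrated with a minimal Python sketch. This is not CRISPRseek, which is an R/Bioconductor package with its own search and scoring; the function names, the toy sequences, and the restriction to the forward strand and an nGG PAM are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(genome: str, guide: str, max_mismatches: int = 3):
    """Scan the forward strand for protospacers followed by an nGG PAM.

    Returns (position, protospacer, mismatches) for every window whose
    Hamming distance to the guide is at most max_mismatches.
    """
    genome, guide = genome.upper(), guide.upper()
    n = len(guide)
    hits = []
    # Each window needs n protospacer bases plus a 3-nt PAM downstream.
    for i in range(len(genome) - n - 2):
        protospacer = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue
        mm = hamming(guide, protospacer)
        if mm <= max_mismatches:
            hits.append((i, protospacer, mm))
    return hits


# Toy data (hypothetical sequences, not taken from the paper).
guide = "GATTACAGATTACAGATTAC"
off_target = "CATTACAGATAACAGATTAC"  # two mismatches relative to the guide
genome = "TTT" + guide + "TGG" + "AAAA" + off_target + "AGG" + "CC"
hits = find_candidate_sites(genome, guide)
for position, site, mismatches in hits:
    print(position, site, mismatches)
```

A genome-scale search would also scan the reverse strand and use an indexed search rather than a linear scan; the sketch only shows how a mismatch threshold defines the candidate set.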
Figure 1

Development of a SpCas9-pDBD framework. (a) Schematic of the SpCas9:sgRNA system fused to a pDBD (orange) that recognizes a binding site 3′ to the PAM. (b) (Top) Schematic of the pDBD binding site orientation and spacing parameters examined. The position and 5′ to 3′ orientation of the pDBD binding site relative to the PAM element of the SpCas9 binding site is represented by an orange arrow (Watson or Crick). (Bottom) Activity profile of SpCas9 (blue, on an nGG or nAG PAM), SpCas9-Zif268 (red, nAG PAM) or SpCas9-TAL268 (brown, nAG PAM) in the GFP reporter assay on a common sgRNA target site. pDBD site orientation is either Watson (W) or Crick (C), and spacing is 5, 8, 11 or 14 bp from the PAM. (c) Activity profile of SpCas9 (blue) or SpCas9-Zif268 (red) in the GFP reporter assay on a common target site with different PAM sequences and a neighboring Zif268 site. (d) (top) SpCas9 or SpCas9-Zif268 programmed independently with four different sgRNAs targeting four different genomic sites with neighboring Zif268 binding sites (highlighted in orange). (bottom) SpCas9 cuts efficiently only at the target site with a nGG PAM, but SpCas9-Zif268 cuts efficiently at additional target sites with nAG, nGA or nGC PAMs. Genomic regions were PCR-amplified, and lesions (indicating cleavage and mutagenic NHEJ) were detected by T7 Endonuclease I (T7EI) assay. (e) Quantification of lesion frequencies from three independent biological replicates performed on different days in HEK293T cells. Error bars indicate standard error of the mean.

Results

Defining the properties of the SpCas9-pDBD framework

To define the parameters necessary for the function of a SpCas9-pDBD chimera, we assayed cleavage of a Cas9:sgRNA target site with a suboptimal nAG PAM using a plasmid reporter assay[46]. We examined the ability of a pDBD fusion (ZFP or TALE) to SpCas9 to enhance nuclease activity when the pDBD binding sites are located at different positions and orientations relative to the Cas9 target site (Fig. 1b). In pilot experiments, the most robust activity was observed using a C-terminal fusion of a ZFP or a TALE to SpCas9 when the pDBD binding sites are positioned 3′ to the PAM element (M.F.B. and S.A.W. unpublished results). Both SpCas9-ZFP and SpCas9-TALE proteins increased nuclease activity on a nAG PAM target to a level comparable to wild-type SpCas9 activity on a nGG PAM (Fig. 1b) while being expressed at similar levels (Supplementary Fig. 3). SpCas9-pDBD nuclease activity remained dependent on the length of the guide sequence (Supplementary Fig. 4), confirming that the chimera retains the guide-dependent licensing stage for sequence cleavage. To define the functional PAM elements for SpCas9-pDBD, we examined activity at each of the 16 possible sequence combinations. In contrast to wild type SpCas9, SpCas9-pDBD displayed high activity for nAG, nGA, nGC as well as the standard nGG PAM (Fig. 1c and Supplementary Fig. 5). Accounting for reverse complements of the functional PAM elements, the SpCas9-pDBD chimeras can recognize seven of the 16 possible dinucleotide sequence combinations. The increased targeting range for SpCas9-pDBDs was also observed at genomic target sites (Fig. 1d, e). Because of the smaller size of SpCas9-ZFPs relative to SpCas9-TALEs - conferring advantages for certain viral delivery systems[47] - we have focused primarily on SpCas9-ZFP chimeras for the immediate development of this platform.
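The count of seven targetable dinucleotides follows from taking the four functional PAM dinucleotides (GG, AG, GA and GC) together with their reverse complements, since a site can be targeted on either strand. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative only.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Dinucleotides following the 'n' of the PAMs reported as highly active
# for SpCas9-pDBD (nGG, nAG, nGA, nGC).
FUNCTIONAL_PAMS = {"GG", "AG", "GA", "GC"}


def targetable_dinucleotides(pams):
    """Dinucleotides usable on either strand: each PAM plus its reverse complement."""
    return set(pams) | {reverse_complement(p) for p in pams}


all_dinucleotides = {"".join(p) for p in product("ACGT", repeat=2)}
targetable = targetable_dinucleotides(FUNCTIONAL_PAMS)
print(sorted(targetable), len(targetable), "of", len(all_dinucleotides))
```

GC is its own reverse complement, which is why four PAMs yield seven rather than eight distinct dinucleotides.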

Attenuating the DNA-binding activity of SpCas9

The fusion of a pDBD to SpCas9 should increase nuclease precision if target cleavage is dependent on DNA recognition by the pDBD. To achieve this, we attenuated the DNA-binding affinity of SpCas9 by independently mutating the key PAM recognition residues (Arg1333 and Arg1335)[7] to either Lysine or Serine (Fig. 2a and Supplementary Fig. 6). In the plasmid reporter assay, all four mutations reduced the nuclease activity of SpCas9 to background levels. A ZFP fusion in the presence of a complementary binding site restored nuclease activity in all mutants except R1335S (SpCas9MT4) (Fig. 2b). Interestingly, we found that R1335K (SpCas9MT3) lacked activity with the nAG PAM even as a SpCas9-ZFP fusion. This prompted a broader assessment of PAM specificity for the three active SpCas9-ZFP mutants, which revealed a preference for alternate PAMs that preserved the remaining arginine-guanine interaction[7] (i.e. R1333 mutants prefer nnG PAMs, whereas the R1335K mutant prefers nGn PAMs; Supplementary Fig. 6). The activity of each SpCas9 mutant was also characterized on compatible genomic target sites with an nGG PAM. R1333K (SpCas9MT1) retained independent activity on a subset of target sequences, whereas R1333S (SpCas9MT2) and R1335K (SpCas9MT3) displayed only background activity, which could be restored to wild type levels in the presence of a ZFP fusion (Fig. 2c,d and Supplementary Figs. 7 and 8). To confirm that the ZFP-dependent restoration of activity is general, we assessed the nuclease activity of three additional SpCas9MT3-ZFP fusions, two of which restore nuclease function (Supplementary Fig. 9 and Supplementary Table 1). Thus, altering the affinity of PAM recognition through mutation generates SpCas9 variants that are dependent on the attached pDBD for efficient function. This pDBD dependence establishes a third stage of target site licensing for our SpCas9MT3-pDBDs, which should increase their precision.
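The PAM preferences described above (nGG for wild-type SpCas9, nnG for the R1333 mutants and nGn for R1335K) can be written as simple patterns. The Python sketch below treats each preference as a hard match, which is a simplification: the measured activities are graded, and the dictionary keys and function name are hypothetical labels rather than anything defined in the paper.

```python
# Preferred PAM pattern per variant, with N matching any base.
# These hard patterns are an illustrative simplification of graded preferences.
PAM_PREFERENCE = {
    "WT": "NGG",
    "R1333K (MT1)": "NNG",
    "R1333S (MT2)": "NNG",
    "R1335K (MT3)": "NGN",
}


def pam_matches(pattern: str, pam: str) -> bool:
    """True if a 3-nt PAM fits the pattern, where N is a wildcard."""
    if len(pattern) != len(pam):
        raise ValueError("pattern and PAM must have equal length")
    return all(p == "N" or p == b for p, b in zip(pattern.upper(), pam.upper()))


# An nAG PAM keeps the third-position guanine but not the second-position one,
# so it fits the R1333 mutant pattern but not the R1335K pattern.
for variant, pattern in PAM_PREFERENCE.items():
    print(variant, pattern, pam_matches(pattern, "TAG"))
```

Under this simplification an nAG PAM is incompatible with R1335K, in line with the observation that SpCas9MT3 lacked activity on the nAG reporter even as a ZFP fusion.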
Figure 2

Attenuating nuclease activity of SpCas9. (a) Four PAM-interacting amino acids neighboring the nGG PAM (magenta) in the structure of SpCas9[7]. Arginines at positions 1333 and 1335 were mutated to attenuate DNA-binding affinity of SpCas9. (b) Activity profile of SpCas9 (blue) or SpCas9-Zif268 (red) bearing lysine or serine substitutions at positions 1333 or 1335 in the PAM interaction domain in comparison to wild-type (WT) SpCas9. Reporter assays were performed in HEK293T cells. Bar heights represent means from three independent biological replicates performed on different days. Error bars indicate standard error of the mean. (c) T7 Endonuclease I (T7EI) assays on PCR products spanning a genomic target site (underlined) with an NGG PAM (magenta) and neighboring Zif268 site (orange) for SpCas9 or SpCas9 mutants with or without a Zif268 fusion. For SpCas9MT2 & SpCas9MT3, robust nuclease activity is only observed when Zif268 is fused to the C-terminus. The gel image is representative of T7EI assays at this genomic target site, where cleaved products are noted by magenta arrowheads. (d) Quantification of average T7EI-based lesion rates at the PLXNB2 locus from three independent biological replicates performed on different days in HEK293T cells (Supplementary Fig. 7). Error bars indicate standard error of the mean.

Assessing the precision of SpCas9-ZFP fusions

Next we compared the precision of SpCas9-ZFPs to SpCas9 using sgRNAs with previously defined off-target sites[14,25]. We programmed three different four-finger ZFPs to recognize 12 base pair sequences neighboring the TS2, TS3 or TS4 sgRNA target sites for use as SpCas9MT3-ZFP fusions (Fig. 3a). The activity of SpCas9, SpCas9MT3 and SpCas9MT3-ZFPTS2:TS3:TS4 with the corresponding sgRNA was compared at each target site. In all cases SpCas9MT3 dramatically decreased cleavage efficiencies, which were restored by the cognate ZFP fusion (Fig. 3b). The activity of SpCas9MT3-ZFP was dependent on the presence of both a cognate sgRNA and ZFP (Fig. 3c). Consistent with the dependence on ZFP binding, truncation of one zinc finger from either end of ZFPTS3 reduced the activity of SpCas9MT3-ZFPTS3 at the TS3 target site, and the removal of two zinc fingers abrogated activity (Supplementary Fig. 10). The additional stage of target site licensing supplied by the pDBD dramatically increased the precision of SpCas9MT3-ZFPTS3 relative to wild type SpCas9; lesion rates at the most active off-target site (OT3-2) for sgRNATS3 were 22% by T7EI assay with wild type Cas9, but were undetectable with SpCas9MT3-ZFPTS3 (Fig. 3c). We also programmed two TALE arrays to target SpCas9MT3 to TS3 and TS4 (TALETS3 and TALETS4). Nuclease activity at the TS3 site but not TS4 can be restored by the related SpCas9MT3-TALE fusion (Supplementary Fig. 11).
Figure 3

SpCas9MT-ZFP chimeras have improved precision. (a) Sequences of Target Site 2 (TS2), Target Site 3 (TS3) and Target Site 4 (TS4) for the SpCas9:sgRNAs described by Joung and colleagues[14,25]. The 12 bp ZFP binding sites for TS2, TS3 and TS4 are highlighted in cyan, red and teal, respectively, with the arrow indicating the strand that is bound. (b) Lesion rates determined by T7EI assay for SpCas9, SpCas9MT3 and SpCas9MT3-ZFP at TS2, TS3 and TS4. Data are from three independent biological replicates performed on different days in HEK293T cells. Error bars indicate standard error of the mean. (c) Representative T7EI assay comparing lesion rates at TS3 and off-target site 2 (OT3-2)[25] for various SpCas9-chimera:sgRNA combinations. The activity at the target site for SpCas9MT3-ZFP is dependent on the cognate sgRNA and ZFP, where SpCas9MT3-ZFPTS3 can discriminate between TS3 and OT3-2. (d) Genomic target site cleavage activity by SpCas9, SpCas9WT-ZFPTS3 and SpCas9MT3-ZFPTS3 in response to dinucleotide mismatches placed at different positions within the guide sequence targeting the TS3 site (Supplementary Table 2). (Top) T7EI assay data from PCR products spanning TS3 site in three independent biological replicates performed on different days in HEK293T cells. Error bars indicate standard error of the mean. (Bottom) Schematic indicating the position of the dinucleotide mismatches across the guide sequence. SpCas9MT3-ZFPTS3 displays superior discrimination to SpCas9 for dinucleotide mismatches in the sgRNA recognition sequence.

To examine the catalytic tolerance of the SpCas9MT3-ZFPTS3:sgRNA complex to mismatches between the guide and a target sequence, we utilized a set of guides that progressively shift blocks of 2 base mismatches from the 5′ to the 3′ end of the guide sequence. SpCas9MT3-ZFPTS3 has a lower tolerance for mismatches between the guide and target site relative to SpCas9WT, whereas SpCas9WT-ZFPTS3 appears to modestly increase the tolerance for mismatches (Fig. 3d and Supplementary Table 2). SpCas9MT3-ZFPs also exhibit reduced activity with truncated sgRNAs[25] (Supplementary Fig. 12), consistent with the requirement for a higher degree of guide-target site complementarity to achieve efficient cleavage.
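The mismatch-scanning design described above can be sketched in Python as a generator of guide variants, each carrying one 2-base mismatch block that moves from the 5′ end to the 3′ end. The actual substitutions used are listed in Supplementary Table 2 and are not reproduced here; replacing each base with its complement, stepping in non-overlapping blocks, and the example guide sequence are all assumptions for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_block_guides(guide: str, block: int = 2, step: int = 2):
    """Return (1-based start position, variant guide) pairs.

    Each variant differs from the input at `block` consecutive positions.
    Substituting the complementary base is an assumption of this sketch.
    """
    guide = guide.upper()
    variants = []
    for start in range(0, len(guide) - block + 1, step):
        swapped = "".join(COMPLEMENT[b] for b in guide[start:start + block])
        variants.append((start + 1, guide[:start] + swapped + guide[start + block:]))
    return variants


# Hypothetical 20-nt guide; not the TS3 guide sequence.
example_guide = "GATTACAGATTACAGATTAC"
variants = mismatch_block_guides(example_guide)
for position, variant in variants:
    print(position, variant)
```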

Deep sequencing analysis of off-target activity

To more broadly assess improvements in precision, we deep-sequenced PCR products spanning previously defined off-target sites for sgRNATS2:TS3:TS4[14,25], as well as several additional genomic loci that have favorable ZFPTS2:TS3:TS4 recognition sites and have some complementarity to the TS2, TS3 or TS4 guide sequences (43 total; Supplementary Tables 3 and 4). We compared the nuclease activity of SpCas9, SpCas9MT3, SpCas9WT-ZFPTS2:TS3:TS4 and SpCas9MT3-ZFPTS2:TS3:TS4 at these off-target sites, and found that SpCas9MT3-ZFPTS2:TS3:TS4 dramatically increased the precision of target site cleavage (Fig. 4a). In most cases, utilizing SpCas9MT3-ZFPTS2:TS3:TS4 reduced lesion rates at off-target sites to background levels, resulting in improvements in the Specificity Ratio of up to 150-fold (Fig. 4b). Only one off-target site (OT2-2), which has a neighboring sequence that is similar to the expected ZFPTS2 recognition sequence (Supplementary Fig. 13), still displays high lesion rates. One other site (OT2-6) displays some residual activity for both SpCas9MT3 and SpCas9MT3-ZFPTS2 that is above the background error rate within our sequencing data. Overall, these data demonstrate a dramatic enhancement in precision for SpCas9MT3-ZFPs relative to standard SpCas9 at previously defined active off-target sites.
Figure 4

Deep sequencing analysis of SpCas9MT3-ZFP chimera precision. (a) Lesion rates for target sites and off-target sites with statistically significant activity (Supplementary Table 3) assayed by deep sequencing PCR products spanning each genomic locus for SpCas9 (blue), SpCas9MT3 (light blue), SpCas9WT-ZFP (red) and SpCas9MT3-ZFP (pink). Error bars indicate standard error of the mean. (b) Improvement in precision of SpCas9MT3-ZFP relative to SpCas9WT as measured by the relative Specificity Ratio of target site lesion rate relative to each off-target lesion rate (Specificity Ratio = Target site lesion rate/Off-target lesion rate). (c) Comparison of average lesion rates at TS2 and OT2-2 determined by T7EI assay for SpCas9WT and SpCas9MT3-ZFPTS2 variants that alter the number of zinc fingers or change them completely (TS2*). The binding site for the ZFPTS2* is indicated in green. Removing finger 1 (F2-4) or finger 4 (F1-3) from the four finger TS2 ZFP array (F1-4) at most modestly impacts target site activity, but it dramatically improves precision. Data are from three independent biological replicates performed on different days in HEK293T cells (Supplementary Figure 14). Error bars indicate standard error of the mean.
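The Specificity Ratio defined in panel (b) is the target site lesion rate divided by the off-target lesion rate, and the improvement is the ratio of that quantity for SpCas9MT3-ZFP to the same quantity for SpCas9WT. A minimal Python sketch of the calculation follows; the lesion rates in the example are hypothetical numbers chosen for easy arithmetic, not measurements from this study, and raising an error on a zero off-target rate is a choice made for this sketch.

```python
def specificity_ratio(target_rate: float, off_target_rate: float) -> float:
    """Specificity Ratio = target site lesion rate / off-target lesion rate."""
    if off_target_rate <= 0:
        # A rate at or below zero has no defined ratio; how background-level
        # rates are floored is not specified here, so this sketch refuses.
        raise ValueError("off-target lesion rate must be positive")
    return target_rate / off_target_rate


def fold_improvement(wt_target, wt_off, fusion_target, fusion_off) -> float:
    """Relative Specificity Ratio of the fusion nuclease versus wild type."""
    return specificity_ratio(fusion_target, fusion_off) / specificity_ratio(wt_target, wt_off)


# Hypothetical lesion rates in percent.
wt_ratio = specificity_ratio(40.0, 10.0)
fusion_ratio = specificity_ratio(32.0, 0.5)
improvement = fold_improvement(40.0, 10.0, 32.0, 0.5)
print(wt_ratio, fusion_ratio, improvement)
```

Because the ratio normalizes to on-target activity, a fusion with somewhat lower target site activity can still show a large improvement when its off-target rate falls much further.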

One potential advantage of the SpCas9-pDBD system over other Cas9 platforms is the ability to rapidly tune the affinity and specificity of the attached pDBD to further optimize its precision. Consequently, we sought to improve the precision of SpCas9MT3-ZFPTS2 by truncating the ZFP to reduce its affinity for target site OT2-2. Constructs with a truncation of either of the terminal zinc fingers display high activity at the target site (Fig. 4c). However, these truncations reduced or eliminated off-target activity at OT2-2, reflecting a profound improvement in the precision of SpCas9MT3-ZFPTS2 (Fig. 4c and Supplementary Fig. 14). Similarly, utilization of a ZFP (TS2*) that recognizes an alternate sequence neighboring the TS2 guide target site also abolishes off-target activity at OT2-2, confirming that cleavage by SpCas9MT3 at this off-target site is ZFP dependent (Fig. 4c & Supplementary Fig. 4). Given the improvements in precision realized by these simple adjustments in the composition of the ZFP, it should be possible to achieve even greater enhancements in precision via more focused modification of the ZFP composition and the linker connecting it to SpCas9. Finally, we employed GUIDE-seq[17] to provide an unbiased assessment of the propensity for SpCas9MT3-ZFPs to cleave at alternate off-target sites within the genome. Using a modified version of the original protocol and bioinformatics pipeline, we assessed genome-wide DSB induction by SpCas9 and the SpCas9MT3-ZFPTS2:TS3:TS4 fusions (see Methods). This analysis reveals a dramatic enhancement of the precision of the SpCas9MT3-ZFPs for all three target sites (Fig. 5 and Supplementary Table 5). For SpCas9MT3-ZFPTS3 and SpCas9MT3-ZFPTS4 we did not detect nuclease-dependent oligonucleotide capture at any site besides the target site. For SpCas9MT3-ZFPTS2, which retains two active off-target sites that overlap with SpCas9, there is a dramatic reduction in cleavage activity at all of the alternate sequences. 
In addition, there is one new weak off-target site (OTG2-42) for SpCas9MT3-ZFPTS2. These data demonstrate that the utilization of the SpCas9MT3-ZFP fusion reduces cleavage at wild type SpCas9 off-target sites without generating a new class of highly active ZFP-mediated off-target sites.
Figure 5

Genome-wide off-target analysis of SpCas9MT3-ZFPs by GUIDE-seq[17]. (a) Number of off-target sites with nuclease activity detected for SpCas9WT (blue) and SpCas9MT3-ZFP (red) with TS2, TS3 and TS4 guides. (b–d) Number of unique reads captured by GUIDE-seq for nuclease active sites within the genome (TS2, TS3 or TS4 target site (bold) and off-target sites). Previously defined off-target sites are colored black[14,17] and potential new off-target sites that were identified in this analysis are colored green for SpCas9WT or orange for SpCas9MT3-ZFP. Some sites (e.g. OGT2-10 & OGT2-20) contain only reads from a single library for SpCas9MT3-ZFP, so are not binned as off-target sites in Fig. 5a. Detailed information on the sites and counts is presented in Supplementary Table 5. (e) Model of the three stages of target site licensing that are necessary for SpCas9MT3-pDBD to cleave DNA. Due to the modification of SpCas9 (mutation indicated by yellow star), the efficient engagement of a sequence for PAM recognition or guide RNA complementarity requires the presence of a neighboring DNA sequence that can be bound by the attached pDBD. This requirement for pDBD binding adds a third stage of target site licensing for efficient cleavage.

Discussion

Our analysis of the activity of SpCas9-pDBD chimeras provides important new insights into the mechanism of target site licensing by SpCas9 and methods to exploit this mechanism to improve precision. Fusion of a pDBD to SpCas9 allows efficient utilization of a broader repertoire of PAM sequences by SpCas9. However, even for SpCas9-pDBDs there remains a dichotomy between functional and inactive PAMs. The broader targeting range of SpCas9-pDBDs likely reflects the bypass of a kinetic barrier to R-loop formation that follows PAM recognition, as proposed by Seidel and colleagues[6]. We believe that the pDBD tethering of SpCas9 achieves activity at a target site containing a sub-optimal PAM by increasing the effective concentration of SpCas9 around the target site and hence, stabilizing the SpCas9-PAM interaction[48]. For wild type SpCas9, only high affinity (nGG) PAM sites consistently have sufficient residence time to facilitate efficient progression to R-loop formation, but pDBD tethering increases the likelihood that SpCas9:sgRNA can overcome this barrier at sub-optimal PAMs. Our data also support an allosteric licensing mechanism, as described by Doudna and colleagues[5], which likely restricts Cas9 nuclease activity for the majority of sequence combinations in the PAM element even with the increased local concentration afforded by pDBD tethering. The enhanced sensitivity to guide-target site heteroduplex stability observed for our SpCas9MT3-ZFPTS3 chimera (Fig. 3d and Supplementary Fig. 12) further supports the interplay between PAM recognition and guide complementarity in the licensing of nuclease activity. We find that mutations to the SpCas9 PAM interacting domain introduce a third stage of licensing (pDBD site recognition) for efficient target site cleavage within the SpCas9MT-pDBD system (Fig. 5e). 
The weakened interaction between mutant Cas9 and the PAM sequence now necessitates increased effective concentration for nuclease function that is achieved by the high affinity interaction of the tethered pDBD with its target site. This combination dramatically improves precision as assessed using targeted deep-sequencing and GUIDE-seq analysis. Compared with the previous GUIDE-seq analysis of the TS2, TS3 and TS4 targets for SpCas9, we detect five, three and three of the top 5 off-target sites that were previously described[17]. The discrepancy between these studies could be due to our lower sequencing depth, the use of an alternate cell line, or different delivery methods. Nonetheless, from our analysis we can exclude the presence of a new class of highly active off-target sites that are generated by the fusion of the ZFP to Cas9. This system has important advantages over other previously described Cas9 variant systems that improve precision[10,25-30]. The SpCas9MT-pDBD system increases the targeting range of the nuclease by expanding the repertoire of highly active PAM sequences. This is in contrast to dimeric systems (dual nickases or FokI-dCas9 nucleases) that have a more restricted targeting range due to the requirement for a pair of compatible target sequences. Moreover, our system should be compatible with either of these dimeric nuclease variants, providing a further potential increase in precision while also expanding the number of compatible target sites for these platforms. In addition, the affinity and the specificity of the pDBD component can also be easily tuned to achieve the desired level of nuclease activity and precision for demanding gene therapy applications. We programmed our SpCas9-ZFPs targeting TS2, TS3 or TS4 with four-finger ZFPs, as we believed that these would have the optimal balance of specificity and affinity. In the case of SpCas9MT3-ZFPTS3 this proved prudent (Supplementary Fig. 10). 
However, for SpCas9MT3-ZFPTS2, improved precision was achieved by utilizing a three-finger ZFP, demonstrating the flexibility provided by modular pDBDs. (See the Supplementary Discussion for more details on ZFP design for Cas9-ZFPs and our website [http://mccb.umassmed.edu/Cas9-pDBD_search.html] for assistance with the identification of target sites and compatible ZFP sequences.) In addition to tuning the pDBD, further optimization of the linker length and its composition can provide improvements in precision (and potentially activity) by further restricting the relative orientation and spacing of the SpCas9 and pDBD. Finally, it should be possible to generate Cas9-pDBD fusions for Cas9 orthologs from other species that have superior characteristics for gene therapy applications (e.g. more compact Cas9 nucleases[49,50] for viral delivery). Ultimately, for gene therapy applications where precision, activity and target site location are of paramount importance, the expanded targeting range and precision achieved by the Cas9-pDBD framework provide a potent platform for the optimization of nuclease-based reagents that cleave a single target site in the human genome.

Methods

Plasmid Constructs

Our SpCas9-pDBD experiments employed the following plasmids: All sgRNAs are expressed via a U6 promoter from pLKO1-puro[51]. All SpCas9 and SpCas9-DBD fusions are expressed via the pCS2-Dest gateway plasmid under the chicken beta-globin promoter[52]. ZFPs were assembled as gBlocks (Integrated DNA Technologies) from finger modules based on previously described recognition preferences[53,54]. ZFPs were cloned into a pCS2-Dest-SpCas9 plasmid backbone through BspEI and XhoI sites. TALEs were assembled via golden gate assembly[55] into our JDS TALE plasmids[56]. Assembled TALEs were cloned into BbsI digested pCS2-Dest-SpCas9-TALEntry backbone through Acc65I and BamHI sites. Sequences of the SpCas9-pDBDs are listed in Supplementary Figure 15, and these plasmids will be deposited at Addgene for distribution to the community. Plasmid reporter assays of nuclease activity utilized the restoration of GFP activity through SSA-mediated repair of an inactive GFP construct using the M427 plasmid developed by the Porteus laboratory[46]. SpCas9 target sites were cloned into plasmid M427 via ligation independent methods following SbfI digestion. Mutations in the PAM interacting domain of SpCas9 were generated by cassette mutagenesis.

Cell Culture Assay

Human Embryonic Kidney (HEK293T) cells were obtained from our collaborator M. Green (UMass Medical School) and are cultured in high glucose DMEM with 10% FBS and 1% Penicillin and Streptomycin (Gibco) in a 37°C incubator with 5% CO2. These cells were neither verified nor tested for mycoplasma contamination. For transient transfection, we used early to mid-passage cells (passage number 5–25). Approximately 1.6 × 10^5 cells are transfected with 50 ng SpCas9-pDBD-expressing plasmid, 50 ng sgRNA expressing plasmid and 100 ng mCherry plasmid via Polyfect transfection reagent (Qiagen) in 24-well format according to the manufacturer’s suggested protocol. For the SSA-reporter assay, 150 ng M427 SSA-reporter plasmid is also included in the co-transfection mix.

Western Blot

HEK293T cells are transfected with 500 ng Cas9 and 500 ng sgRNA expressing plasmid in a 6-well plate by Lipofectamine 3000 transfection reagent (Invitrogen) according to the manufacturer’s suggested protocol. 48 hours after transfection, cells are harvested and lysed with 100 μl RIPA buffer. 8 μl of cell lysate is used for electrophoresis and blotting. The blots are probed with anti-HA (Sigma #H9658) and anti-alpha-tubulin (Sigma #T6074) primary antibodies; then HRP conjugated anti-mouse IgG (Abcam #ab6808) and anti-rabbit IgG secondary antibodies, respectively. Visualization employed Immobilon Western Chemiluminescent HRP substrate (EMD Millipore #WBKLS0100).

Flow cytometry Reporter Assay

48 hours post-transfection, cells are trypsinized and harvested into a microcentrifuge tube. Cells are centrifuged at 500 × g for 2 minutes, washed once with 1 x PBS, recentrifuged at 500 × g for 2 minutes and resuspended in 1 x PBS for flow cytometry (Becton Dickinson FACScan). For FACS analysis, 10,000 events are counted from each sample. To minimize the effect of differences in the efficiency of transfection among samples, cells are initially gated for mCherry expression, and the percentage of EGFP-expressing cells (nuclease positive events) is quantified within mCherry positive cells. All of the experimental replicates are performed in triplicate on different days with data reported as mean values with error bars indicating the standard error of the mean.
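The gating logic described above (count EGFP-positive events only among mCherry-positive events, then report the mean and standard error across three replicates) can be sketched in Python as follows. The intensity thresholds, event tuples and replicate values are hypothetical; in practice gates are set against untransfected and single-color controls in the cytometry software.

```python
from math import sqrt
from statistics import mean, stdev


def percent_gfp_in_mcherry(events, mcherry_gate: float, gfp_gate: float) -> float:
    """Percentage of EGFP-positive events among mCherry-positive events.

    `events` is an iterable of (mcherry_intensity, gfp_intensity) pairs.
    """
    gated = [gfp for mcherry, gfp in events if mcherry >= mcherry_gate]
    if not gated:
        raise ValueError("no mCherry-positive events to normalize against")
    return 100.0 * sum(gfp >= gfp_gate for gfp in gated) / len(gated)


def mean_and_sem(values):
    """Mean and standard error of the mean for replicate measurements."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical events: the third is untransfected and excluded by the mCherry gate.
events = [(500, 800), (500, 10), (5, 900), (700, 900), (650, 20)]
percent = percent_gfp_in_mcherry(events, mcherry_gate=100, gfp_gate=100)
replicate_mean, replicate_sem = mean_and_sem([10.0, 20.0, 30.0])
print(percent, replicate_mean, replicate_sem)
```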

Genomic targeting analysis with T7EI

72 hours post-transfection, cells are harvested and genomic DNA is extracted via DNeasy Blood and Tissue kit (Qiagen) according to the manufacturer’s suggested protocol. 50 ng of input DNA is PCR-amplified using “T7EI primers” that are specific for each genomic region (Supplementary Table 4) with Phusion High Fidelity DNA Polymerase (New England Biolabs): (98°C, 15s; 67°C, 25s; 72°C, 18s) for 30 cycles. 10 μl of a PCR product is hybridized and treated with 0.5 μl T7 Endonuclease I (New England Biolabs) in 1 x NEB Buffer 2 for 45 minutes[57]. The samples are run on a 2.5% agarose gel and quantified with ImageJ software[58]. Indel percentages are calculated as previously described[57]. Experiments for T7EI analysis are performed in triplicate on different days with data reported as mean values with standard error of the mean.
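The text defers the indel calculation to the cited reference and does not state the formula. A formula commonly used for mismatch-cleavage assays such as T7EI is indel % = 100 × (1 − sqrt(1 − f)), where f is the fraction of band intensity in the cleaved products; whether this is exactly the calculation in ref. 57 is an assumption of the sketch below, and the band intensities are hypothetical.

```python
from math import sqrt


def t7e1_indel_percent(uncut: float, cut_bands) -> float:
    """Estimate indel frequency from gel band intensities.

    Uses indel % = 100 * (1 - sqrt(1 - f)), with f the cleaved fraction.
    The square root accounts for heteroduplexes forming by random
    reannealing of mutant and wild-type strands.
    """
    cleaved = sum(cut_bands)
    total = uncut + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    fraction_cleaved = cleaved / total
    return 100.0 * (1.0 - sqrt(1.0 - fraction_cleaved))


# Hypothetical intensities: parental band 64, two cleavage products 18 each.
indel = t7e1_indel_percent(64.0, [18.0, 18.0])
print(round(indel, 2))
```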

Targeted deep-sequencing based off-target analysis for SpCas9-pDBDs

To generate each amplicon, we used a two-step PCR amplification approach to first amplify the genomic segments and then install the barcodes and indexes. In the first step, we used “locus-specific primers” bearing common overhangs with complementary tails to the TruSeq adaptor sequences (Supplementary Table 4). 50 ng of input DNA is PCR-amplified with Phusion High Fidelity DNA Polymerase (New England Biolabs): (98°C, 15s; 67°C, 25s; 72°C, 18s) for 30 cycles. 5 μl of each PCR reaction is gel-quantified by ImageJ[58] against a reference ladder, and equal amounts from each genomic locus PCR are pooled for each treatment group (15 different treatment groups). The pooled PCR products from each group are run on a 2% agarose gel, and the DNA of the expected product size (between 100 and 200 bp) is extracted and purified via the QIAquick Gel Extraction Kit (Qiagen). In the second step, the purified pool from each treatment group was amplified with a “universal forward primer and an indexed reverse primer” to reconstitute the TruSeq adaptors (Supplementary Table 4). 2 ng of input DNA is PCR-amplified with Phusion High Fidelity DNA Polymerase (New England Biolabs): (98°C, 15s; 61°C, 25s; 72°C, 18s) for 9 cycles. 5 μl of each PCR reaction is gel-quantified by ImageJ[58], and then equal amounts of the products from each treatment group are mixed and run on a 2% agarose gel. Full-size products (~250 bp in length) are gel-extracted and purified via the QIAquick Gel Extraction Kit (Qiagen). The purified library was deep sequenced using a paired-end 150 bp MiSeq run. Sequences from each genomic locus within a specific index were identified based on a perfect match to the final 11 bp of the proximal genomic primer used for locus amplification (Supplementary Table 6).
Insertions or deletions in the SpCas9 target region were defined based on the distance between a “prefix” sequence at the 5′ end of each off-target site (typically 10 bp) and a “suffix” sequence at the 3′ end of each off-target site (typically 10 bp)[59], where there were typically 33 bp between these elements in the unmodified locus (Supplementary Table 6). Distances that were greater than expected were binned as “insertions (I)”, and distances that were shorter were binned as “deletions (D)”. Reads that did not contain the suffix sequence were marked as undefined (U). For some loci the background sequencing error rate is high; for example, for OT2-1 a homopolymer sequence in the guide region leads to a high error rate. All statistical analyses were performed using R, a system for statistical computation and graphics[60]. Log odds ratios of lesion were calculated for the on-target and off-target sites of each individual Cas9 treatment group vs. the untreated control for each of the three independent experiments. A t-test was applied to assess whether the log odds ratio is significantly different from 0, i.e., whether there is a significant difference in lesion odds between each individual Cas9 treatment group and the untreated control for the on-target and off-target sites. Odds ratios and their 99% confidence intervals were obtained by exponentiating the estimated log odds ratios and their 99% confidence intervals. These analyses were also applied to the sum of the lesion rates across all three replicates (combined). To adjust for multiple comparisons, p-values were adjusted using the Benjamini-Hochberg (BH) method[61]. Only loci that have significant BH-adjusted p-values in the combined data for the treatment group relative to the control are considered significant.
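
The prefix/suffix binning and the lesion odds comparison can be illustrated with a short Python sketch. The analysis in the paper was performed in R; the code below is an independent illustration, not the authors' pipeline. The function names, the "WT" label for unmodified reads, and the choice to mark reads lacking the prefix as undefined are assumptions. The log odds ratio here is the plain definition with no continuity correction, so it fails when any count is zero.

```python
import math


def classify_read(read, prefix, suffix, expected_gap=33):
    """Bin a read as insertion (I), deletion (D), unmodified (WT) or undefined (U).

    expected_gap is the prefix-to-suffix distance in the unmodified locus
    (typically 33 bp in the text; locus-specific in practice).
    """
    start = read.find(prefix)
    if start < 0:
        return "U"  # assumption: the source only specifies U for a missing suffix
    gap_start = start + len(prefix)
    end = read.find(suffix, gap_start)
    if end < 0:
        return "U"
    gap = end - gap_start
    if gap > expected_gap:
        return "I"
    if gap < expected_gap:
        return "D"
    return "WT"


def log_odds_ratio(lesions_treated, total_treated, lesions_control, total_control):
    """Log odds ratio of lesion reads, treated group vs. untreated control."""
    odds_treated = lesions_treated / (total_treated - lesions_treated)
    odds_control = lesions_control / (total_control - lesions_control)
    return math.log(odds_treated / odds_control)
```

Counting the I and D reads per locus as lesions and feeding those counts, together with the total read counts, into `log_odds_ratio` yields one value per replicate, which is the quantity the t-test is then applied to.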

GUIDE-Seq off-target analysis for SpCas9-pDBDs

We performed GUIDE-Seq with some modifications to the original protocol[17]. Importantly, there is an error in the original publication with regard to the GSP1 and GSP2 primer sets, which listed incompatible combinations. It was necessary to properly assort the primer sets for the positive (+) and negative (−) strands to get successful library amplification:

Nuclease_off_+_GSP1: GGATCTCGACGCTCTCCCTGTTTAATTGAGTTGTCATATGTTAATAAC (+)
Nuclease_off_−_GSP1: GGATCTCGACGCTCTCCCTATACCGTTATTAACATATGACA (−)
Nuclease_off_+_GSP2: CCTCTCTATGGGCAGTCGGTGATTTGAGTTGTCATATGTTAATAACGGTA (+)
Nuclease_off_−_GSP2: CCTCTCTATGGGCAGTCGGTGATACATATGACAACTCAATTAAAC (−)

In addition, our protocol differed from the published protocol[17] in the following manner: In a 24-well format, HEK293T cells are transfected with 250 ng Cas9, 150 ng sgRNA, 50 ng GFP, and 10 pmol of annealed GUIDE-Seq oligonucleotide using Lipofectamine 3000 transfection reagent (Invitrogen) according to the manufacturer’s suggested protocol. 48 hours post-transfection, genomic DNA was extracted via the DNeasy Blood and Tissue kit (Qiagen) according to the manufacturer’s suggested protocol. Library preparations are done with the original adaptors according to protocols described by the Joung laboratory[17], where each library was barcoded for pooled sequencing. The barcoded, purified libraries were deep sequenced as a pool using two paired-end 150 bp MiSeq runs. Reads containing the identical molecular index and identical starting 8 bp elements on Read1 were pooled into one unique read. The initial 30 bp and the final 50 bp of the unique Read2 sequences were clipped to remove adapter and low-quality sequences and then mapped to the human genome (hg19) using Bowtie2. Peaks containing mapped unique reads were identified using the pile-up program ESAT (http://garberlab.umassmed.edu/software/esat/) with a window of 25 bp and a 15 bp overlap.
Neighboring windows that were on different strands of the genome and less than 50 bp apart were merged using the Bioconductor package ChIPpeakAnno[62,63]. Peaks that were present with multiple different guides (hotspots[17]) or that did not contain unique reads for both sense and antisense libraries[17] were discarded. The remaining peaks were searched for sequence elements complementary to the nuclease target site using CRISPRseek[21]. Only peaks that harbor a sequence with fewer than 7 mismatches to the target site were considered potential off-target sites. These regions are reported in Supplementary Table 5, and the numbers of reads from the sense and antisense libraries were combined into the final read number.
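
Two steps of this pipeline, collapsing duplicate reads and applying the mismatch cutoff, lend themselves to a short Python sketch. This is a simplified illustration under stated assumptions, not the authors' code: the function names and the tuple representation of reads are hypothetical, and the mismatch check compares two equal-length sequences directly, whereas the study used CRISPRseek to search each peak.

```python
def collapse_unique_reads(reads):
    """Collapse reads sharing a molecular index and the first 8 bp of Read1.

    reads: iterable of (molecular_index, read1_sequence) tuples.
    Returns one representative (the first seen) per unique key.
    """
    unique = {}
    for molecular_index, read1 in reads:
        key = (molecular_index, read1[:8])
        unique.setdefault(key, (molecular_index, read1))
    return list(unique.values())


def count_mismatches(site, target):
    """Number of positions at which two equal-length sequences differ."""
    if len(site) != len(target):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(site, target))


def is_potential_off_target(site, target, max_mismatches=6):
    """Apply the 'fewer than 7 mismatches' cutoff described in the text."""
    return count_mismatches(site, target) <= max_mismatches
```

Collapsing on the molecular index keeps PCR duplicates from inflating the read count at a site, so the final read number reflects independent tag-integration events.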

CRISPRseek analysis of potential off-target sites for SpCas9 sgRNAs

Human hg19 exon and promoter sequences were fetched using the Bioconductor packages ChIPpeakAnno[62,63] and TxDb.Hsapiens.UCSC.hg19.knownGene. A subset of 16500 exons and 192 promoter sequences of 2 kb each was selected for sgRNA searching and genome-wide off-target analysis using the Bioconductor package CRISPRseek[21,22] with the default settings (both nGG and nAG PAMs are allowed) except BSgenomeName = BSgenome.Hsapiens.UCSC.hg19, annotateExon = FALSE, outputUniqueREs = FALSE, exportAllgRNAs = “fasta” and fetchSequence = FALSE. After excluding sgRNAs with on-target and/or off-target sites in the haplotype blocks, 124793 unique sgRNAs from exon sequences and 55687 unique sgRNAs from promoter sequences were included in the analysis. Each guide was binned based on either the off-target site with the fewest mismatches to the guide sequence or the sum of the off-target scores for the top 10 off-target sites. The fraction of guides in each bin for exons or promoters is displayed as a pie chart.
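
The first binning scheme, by the closest off-target site, can be sketched as follows. This Python example is illustrative only and does not call CRISPRseek; the input format (a mapping from guide to the mismatch counts of its off-target sites), the function name, and the "none" label for guides without any reported off-target site are assumptions.

```python
from collections import Counter


def bin_guides_by_closest_off_target(off_targets_by_guide, no_hit_label="none"):
    """Fraction of guides per bin, binned by the fewest-mismatch off-target site.

    off_targets_by_guide: dict mapping a guide identifier to a list of
    mismatch counts, one per predicted off-target site (hypothetical format).
    """
    bins = Counter()
    for mismatches in off_targets_by_guide.values():
        # A guide is assigned to the bin of its closest off-target site.
        label = min(mismatches) if mismatches else no_hit_label
        bins[label] += 1
    total = sum(bins.values())
    return {label: count / total for label, count in bins.items()}
```

The returned fractions correspond to the slices of one pie chart; running the function separately on exon-derived and promoter-derived guides would give the two charts.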

Reproducibility

No statistical methods were used to predetermine sample size, and the investigators were not blinded to allocation during experiments and outcome assessment.