
Targeted transcriptional modulation with type I CRISPR-Cas systems in human cells.

Adrian Pickar-Oliver^1,2, Joshua B Black^1,2, Mae M Lewis^1,2, Kevin J Mutchnick^1,2, Tyler S Klann^1,2, Kylie A Gilcrest^1,2, Madeleine J Sitton^1,2, Christopher E Nelson^1,2, Alejandro Barrera^3, Luke C Bartelt^2,3, Timothy E Reddy^2,3,4, Chase L Beisel^5,6,7, Rodolphe Barrangou^8, Charles A Gersbach^9,10,11.

Abstract

Class 2 CRISPR-Cas systems, such as Cas9 and Cas12, have been widely used to target DNA sequences in eukaryotic genomes. However, class 1 CRISPR-Cas systems, which represent about 90% of all CRISPR systems in nature, remain largely unexplored for genome engineering applications. Here, we show that class 1 CRISPR-Cas systems can be expressed in mammalian cells and used for DNA targeting and transcriptional control. We repurpose type I variants of class 1 CRISPR-Cas systems from Escherichia coli and Listeria monocytogenes, which target DNA via a multi-component RNA-guided complex termed Cascade. We validate Cascade expression, complex formation and nuclear localization in human cells, and demonstrate programmable CRISPR RNA (crRNA)-mediated targeting of specific loci in the human genome. By tethering activation and repression domains to Cascade, we modulate the expression of targeted endogenous genes in human cells. This study demonstrates the use of Cascade as a CRISPR-based technology for targeted eukaryotic gene regulation, highlighting class 1 CRISPR-Cas systems for further exploration.

Substances: RNA, Guide

Year: 2019 PMID: 31548729 PMCID: PMC6893126 DOI: 10.1038/s41587-019-0235-7

Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 54.908

Introduction

The ability to modulate and perturb genetic information is indispensable for studying gene function and elucidating biological mechanisms. Targetable DNA-binding proteins that modify genomes at specific loci have led to tremendous advances in science, biotechnology, and medicine[1]. Specifically, the development of genome engineering tools based on class 2 CRISPR-Cas systems, which use a single effector protein such as Cas9 or Cas12, has revolutionized the field due to the ease of use of this technology with a vast range of applications[2]. In fact, bioinformatics analyses have further revealed a diversity of CRISPR-Cas systems, and the most recent classification encompasses six major types (I through VI)[3]. However, single-effector class 2 systems (types II, V and VI) have been primarily used for nucleic acid targeting in eukaryotes, despite multi-subunit class 1 systems (types I, III, and IV) comprising about 90% of all identified systems across bacteria and archaea[4]. The continued efforts to discover and develop single-component class 2 CRISPR effectors beyond Cas9-based type II systems have resulted in new technologies with specific advantages or applications. For example, the Cas12-based type V[5] and Cas13-based type VI[6] CRISPR-Cas systems of class 2 have distinct targeting and editing mechanisms. Here, we describe the development of type I systems, which account for more than 50% of all identified CRISPR-Cas loci, for use as programmable transcriptional activators and repressors in mammalian cells. Type I CRISPR-Cas systems use the signature Cas3 nuclease-helicase to eliminate invading DNA, and are further divided into eight subtypes (I-A to I-G and I-U) based on related, but subtype-specific, accessory cas genes[3,4,7]. The well-studied prototypical type I-E system of Escherichia coli K12 consists of eight cas genes and a downstream CRISPR array[8-10]. 
Following transcription of the CRISPR array, the 29-bp repeat sequences flanking the variable spacer sequences are cleaved during crRNA biogenesis by the Cas6e endoribonuclease[10]. Together, five protein subunits (Cas8e, Cse2, Cas7, Cas5, and Cas6, previously referred to as CasA, CasB, CasC, CasD, and CasE, respectively)[4] and the mature 61-nt crRNA form the 405-kDa CRISPR-associated complex for antiviral defense (Cascade)[11-13] (Figure 1A). Unlike type II CRISPR systems, there is no tracrRNA required for effector complex formation. For type I CRISPR systems, CasE processes the CRISPR array and targeting relies on a single crRNA. To bind to a target, Cascade surveys the DNA to find a protospacer-adjacent motif (PAM) upstream of a target sequence with complementarity to the crRNA spacer sequence (Figure 1B). We sought to harness this prokaryotic immune system for genome targeting in eukaryotes.

Figure 1.

EcoCascade expression and complex formation in human cells.

(a) Schematic of type I-E CRISPR-Cas system in E. coli K-12 showing EcoCascade stoichiometry and crRNA processing. Genes encoding proteins comprising the EcoCascade complex are presented in different colors. CRISPR repeats are indicated with a red diamond. Cas6 cleaves the primary CRISPR RNA (crRNA) transcript at the regions indicated with blue arrows yielding mature crRNAs. (b) Schematic representation of processed crRNA with 5’ PAM recognition and base pairing at the DNA target site. (c) Cas subunits driven by human cytomegalovirus (CMV) promoter and pre-processed crRNA driven by U6 promoter for expression and processing in mammalian cells. (d) Western blot showing expression of human codon-optimized individual Cas proteins and GAPDH loading control in human HEK293T cells. (* indicates second round of codon optimization). (e) Co-IP and western blot showing crRNA-dependent Cascade formation following co-transfection with V5-tagged Cas7 and Flag-tagged Cas8e, Cse2, Cas5, and Cas6 (immunoprecipitation with α-V5, and detection with α-V5 and α-Flag). (f) Immunofluorescence imaging showing that engineering Cas subunits with NLSs enables import into the human nucleus. Red indicates Cas subunit; scale bar, 25 μm. For d-f, two independent experiments were conducted with similar results. All samples were processed at 3 days post-transfection. CMV, human cytomegalovirus; CoIP, Co-immunoprecipitation; NLS, nuclear localization signal.

Results

Expressing the Cascade complex in human cells

To repurpose Cascade for use in mammalian cells, we used a CMV promoter to express each Cascade subunit of the E. coli K12 system (EcoCascade). Based on available structural information[13-16], N-terminal Flag epitope tags and nuclear localization signals (NLSs) were attached to each EcoCascade construct. We used the RNA polymerase III U6 promoter to express target spacers flanked by full repeat sequences for crRNA processing by CasE (Figure 1C)[17]. Heterologous expression of all EcoCascade constructs was confirmed by Western blot following transfection of each plasmid individually into human HEK293T cells (Figure 1D). The cas genes were originally optimized based on human codon usage, but variable expression of these cas constructs indicated a need for additional codon-optimization of Cas5, which was performed by ATUM/DNA2.0 (DNA sequences provided in Supplementary Information). To determine if EcoCascade complex formation occurred in human cells, the six plasmids encoding each of the five Cas subunits and the crRNA cassette were co-transfected into HEK293T cells. Co-immunoprecipitation by pull-down of the V5 epitope on Cas7 and blotting for the Flag epitope on the other subunits confirmed proper complex assembly (Figure 1E). Interestingly, EcoCascade complexes can be purified from bacteria in the absence of a crRNA[18]; however, we observed EcoCascade formation in human cells only in the presence of a crRNA (Figure 1E). Additionally, nuclear localization of each subunit with the single NLS was confirmed by immunofluorescence staining of transfected HEK293T cells (Figure 1F).

Programmable transcriptional activation by EcoCascade-p300

Next, we sought to repurpose EcoCascade for CRISPR-based programmable transcriptional activation (CRISPRa) in mammalian cells. Transcriptional modulation using class 2 CRISPR systems has been achieved by introducing point mutations into the endo-nucleolytic domains of single-component effectors to maintain binding but not cleavage of the target DNA[19]. For example, the nuclease-deficient Cas9 (dCas9) can be engineered to function as a synthetic transcriptional activator by genetically fusing it to a transactivation or epigenome-modifying domain[20-24]. In the natural type I CRISPR-Cas immune system, target site recognition by Cascade leads to recruitment of the Cas3 nuclease to eliminate target DNA. However, deletion of cas3 from the endogenous type I-E system in E. coli was utilized for a CRISPR interference (CRISPRi) strategy in bacteria by permitting Cascade to bind target DNA and block transcription without DNA degradation[25]. Therefore, we hypothesized that Cascade can be repurposed as a programmable DNA-binding technology in eukaryotes by neglecting to express cas3. To repurpose EcoCascade as a programmable transcriptional activator, we explored the various Cas-effector subunits for tethering of the activation domain. We previously demonstrated robust endogenous gene activation with dCas9 fused to the catalytic core domain of the human acetyltransferase p300[23]. The EcoCascade system contains five Cas subunits available at various stoichiometry (Figure 1A), providing versatile options for synthetic fusions to transcriptional regulatory domains and other modular engineering strategies. Following heterologous expression of EcoCascade with p300 fused to Cas8e, Cse2, Cas5, or Cas6, we confirmed EcoCascade complex formation by co-immunoprecipitation (Figure 2A). 
The ability of each of these subunits to accommodate the fusion of the large p300 core catalytic domain (72 kDa) without abrogating complex formation suggests that the modular Cascade complex could be particularly useful for multiplexed targeting of regulatory domains at specific loci.

Figure 2.

EcoCascade activates transcription of endogenous genes in human cells.

(a) Co-IP showing EcoCascade formation following co-transfection with plasmids encoding the crRNA, Flag-tagged EcoCascade subunits with V5-tagged Cas7, and various Cas-p300 fusions. IP performed with α-V5 and IB performed with α-V5 and α-Flag. Two independent experiments were conducted with similar results. (b) Schematic of the IL1RN locus along with tiled IL1RN crRNA target sites. H3K27ac enrichment from the ENCODE Consortium is indicated with the vertical range set to 400 to indicate regulatory regions. The two IL1RN ChIP-qPCR amplicons are shown in corresponding locations. (c) Relative IL1RN expression following co-transfection of individual crRNAs (100 ng) and EcoCascade (50 ng Cas8e, 100 ng Cse2, 50 ng Cas7, 250 ng Cas5*) with Cas6-p300 (50 ng). (n=3 biological independent samples; mean ± SEM). (d) Relative IL1RN expression following co-transfection of Ctrl crRNA or cr26 and EcoCascade with various Cas-p300 effectors. (n=3 biological independent samples; mean ± SEM). (e) ChIP-qPCR enrichment following co-transfection of individual crRNAs with EcoCascade and Cas6-p300. IP performed with α-Flag and qPCR performed with primers for amplicon regions designated in b. (n=3 biological independent samples; mean ± SEM; bars indicate mean fold enrichment). (f) Schematic of the HBG locus along with HBG crRNA target sites. (g) Relative HBG expression following co-transfection of individual crRNAs and EcoCascade with Cas6-p300. (n=3 biological independent samples; mean ± SEM). All samples processed at 3 days post transfection. (Tukey-test following log transformation, **P<0.001 and *P<0.05 compared to Ctrl crRNA). Numbers above bars indicate mean relative expression. TSS, Transcription start site; Ctrl crRNA, Control crRNA; IP, immunoprecipitation; IB, immunoblot; ChIP, chromatin immunoprecipitation; qPCR, quantitative PCR.

To test programmable endogenous gene activation in human cells, a panel of crRNAs was generated tiling the endogenous IL1RN promoter at protospacer targets downstream of known PAMs (5’-AAG, AGG, ATG, GAG, TAG-3’)[10,26-28] (Figure 2B, Table S1). Co-transfection of HEK293T cells with plasmids encoding EcoCascade with Cas6-p300 and individual crRNAs revealed robust IL1RN activation with many crRNAs, including >3,000-fold IL1RN activation with cr26 (**P<0.001, Figure 2C). Importantly, cr26 with EcoCascade lacking a p300 domain, or cr26 alone did not activate IL1RN (Figure S1), suggesting target-specific activation by EcoCascade-p300. Based on EcoCascade stoichiometry (Figure 1A) and relative Cas construct expression (Figure 1D), we optimized the relative masses of transfected plasmids to maximize gene activation (Figure S2). Additionally, the transactivation potential of all Cas-p300 fusions was explored with cr26. Relative to heterologous expression with a crRNA targeted to a control locus, EcoCascade containing Cas8e-p300 or Cas6-p300 displayed significant transactivation of IL1RN (*P<0.05, Figure 2D).
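The PAM-anchored tiling described above can be illustrated with a short Python sketch. This is not the authors' design pipeline: the function names are hypothetical, the 32-nt protospacer length is an assumption inferred from the 61-nt mature crRNA and 29-nt repeat mentioned in the Introduction, and the sketch assumes the PAM is read 5' to 3' on the same strand as the protospacer that immediately follows it.

```python
# PAMs listed in the text for EcoCascade (type I-E).
ECO_PAMS = ("AAG", "AGG", "ATG", "GAG", "TAG")
# Assumption: 61-nt crRNA minus 29 nt of repeat-derived sequence.
SPACER_LEN = 32

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_protospacers(seq, pams=ECO_PAMS, spacer_len=SPACER_LEN):
    """List (strand, pam_start, pam, protospacer) for every PAM that is
    followed by a full-length protospacer.

    pam_start is 0-based on the scanned strand; for "-" hits it refers
    to coordinates on the reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        last_start = len(s) - 3 - spacer_len
        for i in range(last_start + 1):
            pam = s[i:i + 3]
            if pam in pams:
                hits.append((strand, i, pam, s[i + 3:i + 3 + spacer_len]))
    return hits
```

The same function could be applied to the LmoCascade design described later in the text by passing `pams=("CCA",)` and `spacer_len=36`. Candidate sites found this way would still need empirical screening, since activation varied widely among crRNAs with valid PAMs.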

Versatility of EcoCascade for targeted gene activation

To investigate EcoCascade-p300 interactions at the target locus, we performed chromatin immunoprecipitation with an anti-Flag antibody followed by quantitative PCR (ChIP-qPCR) of two amplicons adjacent to the target site (Figure 2B). We observed significant enrichment of the target regions in EcoCascade-p300 samples co-transfected with cr25 or cr26 compared to a control crRNA (**P<0.001, Figure 2E). These results further confirm EcoCascade as a programmable DNA-binding platform for efficient targeting of specific loci in the human genome. Intriguingly, we observed enhanced enrichment of IL1RN signal in samples treated with Flag-tagged EcoCascade-p300 and an IL1RN-targeted crRNA compared to samples treated with Flag-tagged dCas9-p300 and an IL1RN-targeted single-guide RNA (sgRNA) (Figure S3). These results may indicate increased occupancy by EcoCascade relative to dCas9 but could also be the result of differences in epitope avidity or presentation. Targeted endogenous IL1RN activation was also achieved by tethering Cas6 to the tripartite activator, VP64-p65-Rta (VPR)[24], although both p300 and VPR tethered to Cas6 led to reduced activation levels compared to fusion to dCas9 (Figure S4). To assess activation of other endogenous targets in the human genome, we targeted the HBG promoter with EcoCascade-p300 (Figure 2F) and observed robust transactivation (Figure 2G). The natural function of Cascade to process crRNAs suggests the possibility of using arrayed spacers for multiplexed genome engineering. By generating a CRISPR array containing multiple crRNA spacers that target both IL1RN and HBG, we demonstrated multiplexed activation of endogenous genes (Figure S5). Together, these results demonstrate the potential for repurposing type I-E Cascade systems as versatile programmable transcriptional activators in mammalian cells.
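The arrayed-spacer layout used for multiplexing, in which each spacer is flanked by full repeats so that it can be processed into an individual crRNA, can be sketched as simple string assembly. The function name is hypothetical, and the repeat below is a 29-nt placeholder rather than the actual E. coli repeat sequence, which is not given in this text.

```python
# Placeholder only: 29 nt to match the repeat length stated in the text,
# NOT the real E. coli K12 repeat sequence.
PLACEHOLDER_REPEAT = "N" * 29


def build_crispr_array(repeat, spacers):
    """Assemble repeat-spacer-repeat-...-repeat so every spacer is
    flanked by a full repeat on both sides."""
    if not spacers:
        raise ValueError("at least one spacer is required")
    parts = [repeat]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(repeat)
    return "".join(parts)
```

An array carrying n spacers therefore contains n + 1 repeats; for example, one IL1RN spacer and one HBG spacer would be separated and flanked by three repeats in total.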

Highly specific crRNA-dependent EcoCascade-p300 targeting

Assessing the specificity of genome and epigenome engineering tools is essential for their successful application in basic research and medicine. To date, there have been varied reports regarding the specificity of Cas9-based genome and epigenome editing technologies[29-31], which has led to the development of a variety of strategies to improve specificity[32]. To quantify the genome-wide binding specificity of EcoCascade-p300, we performed ChIP-seq using the FLAG epitope fused to the N terminus of each of the EcoCascade subunits. Binding of EcoCascade-p300 was highly enriched at the IL1RN promoter when targeting with cr25 and cr26, with no detectable binding observed with a HBE1-targeting control crRNA (Figure 3A). Interestingly, the strength of binding signal was comparable between cr25 and cr26, even though they differed substantially in their induction of IL1RN transcription (Figure 2C). With a genome-wide false discovery rate (FDR) < 0.001, there were a few off-target differential binding sites observed when comparing cr25 and cr26 to Ctrl crRNA (Figure 3B–C). These were all substantially weaker sites compared to the signal at the IL1RN locus, with the exception of one genomic window located on chromosome 6 about 5 kilobases adjacent to TAAR4P that was significantly enriched only with cr26 (Figure 3C and Figure S6A). However, we could not readily detect a seed sequence complementary to cr26 within this region. A site near the UBB locus was enriched in the control crRNA-treated sample relative to both cr25 and cr26, indicating a possible off-target site of this crRNA (Figure 3B–C).

Figure 3.

Genome-wide specificity of EcoCascade-p300 targeted to IL1RN.

(a) ChIP-seq tracks for binding of Flag-tagged EcoCascade-p300 targeted to the IL1RN promoter with cr25 and cr26, compared with binding of EcoCascade-p300 with Ctrl crRNA. An ENCODE H3K27ac track is included to highlight the regulatory regions. (b,c) MA plots for the differential analyses of global binding activity including comparisons of EcoCascade-p300 targeted by (b) cr25 and (c) cr26 versus EcoCascade-p300 with HBE1-targeting control crRNA in HEK293T cells. Red data points indicate FDR < 0.001 by differential DESeq2 analysis using Wald test p-values. (d) MA plots for differential expression analyses comparing EcoCascade-p300 targeted by cr26 versus GFP-targeting control crRNA in HEK293T cells. Red data points indicate FDR < 0.01 by differential expression analysis using Wald test p-values. (n=3 biological independent samples). Ctrl crRNA, Control crRNA.

To evaluate the specificity of crRNA-mediated endogenous gene activation with EcoCascade-p300, we performed RNA-seq to quantify transcriptome-wide changes when targeting IL1RN with cr26 or with a GFP-targeting control crRNA (Figure 3D). Targeted activation was highly specific to the target gene when EcoCascade-p300 was co-expressed with cr26, with detection of a modest change to only six other genes (Figure 3D). However, we did observe that the addition of the p300 domain to EcoCascade resulted in significant off-target transcriptional changes when compared to EcoCascade alone (Figure S6B). Given the highly specific DNA-targeting by EcoCascade (Figure 3B–C) and cr26-dependent activation of IL1RN (Figure 3D), these results may indicate non-specific crRNA-independent effects of overexpression of the p300 acetyltransferase fused to Cas6. Collectively, this genome-wide specificity analysis demonstrates highly specific crRNA-dependent targeting of EcoCascade in mammalian cells.

Expanding the Cascade toolbox with LmoCascade

Beyond the well-characterized EcoCascade system, bioinformatic analyses have revealed a plethora of additional type I CRISPR-Cas systems. To explore the potential for repurposing other Cascade complexes, we extended our results with the model type I-E EcoCascade to repurposing the type I-B CRISPR-Cas system of L. monocytogenes Finland_1998 (LmoCascade) (Figure 4A). Expression of all subunits was confirmed in HEK293T cells following human codon-optimization with N-terminal Flag epitope tags and NLSs attached to each LmoCascade construct (Figure 4B). To repurpose LmoCascade as a programmable transcriptional activator, we fused the catalytic core domain of p300 to Cas6, the predicted EcoCascade Cas6 ortholog. To test programmable endogenous gene activation in human cells, a panel of crRNAs, with the predicted spacer length of 36 nucleotides, was generated tiling the endogenous IL1RN promoter at protospacer targets downstream of the known PAM (5’-CCA-3’)[33] (Figure 4C). Co-transfection of HEK293T cells with plasmids encoding LmoCascade with Cas6-p300 and individual crRNAs revealed robust IL1RN activation with most crRNAs (**P<0.001, Figure 4D). Additionally, the transactivation potential of all Cas-p300 fusions was explored with cr03. Relative to heterologous expression with a control crRNA, LmoCascade containing Cas8b2-p300, Cas5-p300 or Cas6-p300 displayed significant transactivation of IL1RN (**P<0.001, Figure 4E). In contrast to the panel of IL1RN-targeting crRNAs for EcoCascade (Figure 2C), almost all of the LmoCascade crRNAs, as well as three of the four Cas-p300 effector fusions, achieved significant IL1RN activation (Figure 4D–E). To further assess crRNA-dependent activation of other endogenous targets in the human genome, we targeted the HBG promoter with LmoCascade-p300 and observed robust transactivation (Figure S7). 
These results demonstrate the potential to broaden our fundamental knowledge of type I CRISPR systems as we expand the CRISPR engineering toolbox by repurposing less characterized systems for use in mammalian cells.

Figure 4.

LmoCascade activates transcription of IL1RN gene in human cells.

(a) Schematic of type I-B CRISPR-Cas system in L. monocytogenes Finland_1998 showing predicted LmoCascade stoichiometry and crRNA processing. Genes encoding proteins comprising the LmoCascade complex are presented in different colors. CRISPR repeats are indicated with a red diamond. The primary crRNA transcript is predicted to be cleaved at the regions indicated with blue arrows yielding mature crRNAs. (b) Western blot showing expression of human codon-optimized individual Cas proteins and Actin loading control in human HEK293T cells. Two independent experiments were conducted with similar results. (c) Schematic of the IL1RN locus along with tiled IL1RN crRNA target sites. (d) Relative IL1RN expression following co-transfection of individual crRNAs and LmoCascade with Cas6-p300. (e) Relative IL1RN expression following co-transfection of Ctrl crRNA or cr03 and LmoCascade with various Cas-p300 effectors. All samples were processed at 3 days post transfection. (Tukey-test following log transformation, **P<0.001 compared to Ctrl crRNA, n=3 biological independent samples; error bars, SEM). Numbers above bars indicate mean relative expression. TSS, Transcription start site; Ctrl crRNA, Control crRNA; qPCR, quantitative PCR.

Targeted gene repression by stable LmoCascade-KRAB expression

In addition to harnessing type I CRISPR systems for programmable transcriptional activation, we sought to take advantage of steric hindrance by the large Cascade complex, the strong binding of Cascade to target DNA[34], and tethering of transcriptional repressor domains such as KRAB[20,31,35], to repurpose LmoCascade for targeted transcriptional repression in eukaryotic cells. To achieve stable expression of LmoCascade-KRAB in mammalian cells, we generated lentivirus expression constructs for each Cascade subunit, including the addition of a T2A-BlasticidinR sequence downstream of Cas6-KRAB (Figure S8A). Following co-transduction of a K562-HBE1-mCherry endogenous gene reporter cell line[36], LmoCascade-KRAB expressing cells were selected with blasticidin S, followed by clonal isolation and expansion. Clone 2 was selected following confirmation of LmoCascade-KRAB expression by Western blot analysis (Figure S8B). To test programmable endogenous gene repression in human cells, a panel of crRNAs was generated tiling the endogenous 5’-untranslated region of HBE1 (Figure 5A). Lentiviral expression constructs were generated for stable, independent expression of a crRNA and an eGFP-2A-PuroR selection cassette (Figure 5B). Transduction, selection, and expansion of K562-HBE1-mCherry cells expressing LmoCascade-KRAB (Figure 5C) revealed HBE1 transcriptional repression with all crRNAs (Figure 5D). To further assess the repressive capacity of LmoCascade-KRAB, protein expression was evaluated by flow cytometry analysis of the HBE1-mCherry reporter (Figure 5E–F). Significant reduction in mCherry fluorescence was observed for all crRNA targets (*P<0.05, Figure 5E), including robust repression in 5 of 6 crRNAs (**P<0.001, Figure 5E). These results demonstrate the potential for repurposing type I CRISPR systems as programmable transcriptional repressors in mammalian cells.

Figure 5.

LmoCascade represses transcription of HBE1 gene in human cells.

(a) Schematic of the HBE1 locus along with tiled HBE1 crRNA target sites. (b) Schematic of lentiviral expression constructs with dual promoters for co-expression of crRNAs with eGFP and PuromycinR. (c) Experimental timeline for transduction of monoclonal K562-HBE1-2A-mCherry cells with stable LmoCascade-KRAB expression. Cells were transduced with individual crRNAs, selected for two days with puromycin and harvested on day seven. (d) Relative HBE1 expression measured by qPCR. (e) Relative MFI of mCherry measured by flow cytometry. (f) Representative flow cytometry histogram. For d-f, two independent experiments were conducted with similar results. (Tukey-test, **P<0.001 compared to Ctrl crRNA, n=3 biological independent samples; error bars, SEM). Numbers above bars indicate mean relative expression. TSS, Transcription start site; Ctrl crRNA, Control crRNA; qPCR, quantitative PCR. MFI, mean fluorescence intensity.

Discussion

In summary, these results show that Cascade from type I-E and type I-B CRISPR systems can be reprogrammed for RNA-guided transcriptional modulation in human cells. This new class of genome engineering tools has potential benefits that expand the CRISPR engineering toolbox. For example, the promiscuous PAM recognition of type I-E EcoCascade (5’-AAG, AGG, ATG, GAG, TAG-3’), and the additional PAM recognition of type I-B LmoCascade (5’-CCA-3’), located 5’ of the spacer in contrast to the 3’ PAM of type II systems[16,37], enables a larger set of available CRISPR target sequences. By tiling crRNAs along endogenous promoters, our studies revealed PAM-independent (Table S1) variation in transactivation potential of crRNAs (Figures 2B–C, 2F–G, 4C–D). The reason for these differences remains to be elucidated but is similar to differences observed among gRNAs and crRNAs for class 2 systems. Ineffective class 2 effectors can gain activity when co-targeting dCas9 to proximal sites[38], suggesting dCas9 increases accessibility of these sites. Similarly, we observed 16-fold enhanced transactivation with EcoCascade-p300 and cr22 when dCas9 was co-targeted to IL1RN (Figure S9). These results suggest a narrower ability for Cascade to recognize genomic DNA targets in eukaryotes; however, this may also serve as a mechanism to increase targeting specificity. Additionally, the preservation of complex formation observed after effector tethering suggests opportunities to utilize the stoichiometry of the Cascade complex for exploring synergistic activities of multiple effector domains. For dCas9, combinatorial targeting by tethering KRAB and DNA methyltransferases has been used to achieve heritable transcriptional silencing[39]. Additionally, the stoichiometry and architecture of Cascade have been tuned in bacteria by altering crRNA protospacer length[25,40]. 
The several cas genes involved and their various corresponding Cas proteins also present individual opportunities to append molecules and functional domains with increased flexibility. Beyond repurposing type I CRISPR systems for targeted transcriptional modulation, we also anticipate that Cascade subunits can be tethered to endonuclease effectors, such as the catalytic domains of the homodimeric FokI and monomeric I-TevI[41-43] endonucleases, for programmable editing via generation of double-stranded breaks or single-stranded nicks in genomic DNA. In fact, we have observed indels characteristic of double-strand break repair following delivery of Cascade-I-TevI fusions in preliminary experiments (Figure S10). Alternatively, Cascade can be co-expressed with the Cas3 helicase-nuclease to generate a spectrum of long-range chromosomal deletions[44]. Targeted transcriptional modulation is important for perturbing gene function, designing gene regulatory networks, investigating the function of distal regulatory elements, manipulating cellular and organismal phenotypes, and inducing therapeutic changes to gene expression. Cascade complexes from type I CRISPR-Cas systems provide a novel and widely applicable RNA-guided platform for targeting DNA sequences in eukaryotes.

Online Methods

Mammalian expression plasmid construction

E. coli K-12 Cascade sequences were originally codon-optimized using human codon usage tables from Integrated DNA Technologies (IDT), synthesized as gene blocks, and integrated into expression plasmids containing a CMV-driven cassette by Gibson cloning strategies. ATUM/DNA2.0 synthesized the second-round human codon-optimized constructs using proprietary methods. See Supplementary Text for gene sequences of E. coli K-12 Cascade constructs. For crRNA expression, a cloning vector was constructed (pAPcrRNA_Eco) with a U6-driven cassette and digested with SacII and XhoI. To insert repeat-spacer pairs, oligonucleotides encoding the palindromic repeat and crRNA spacers were annealed, 5’ phosphorylated with PNK and ligated into digested pAPcrRNA_Eco. See Supplementary Figure S11 for an illustration of the cloning scheme. The control crRNA used throughout this study targets Cascade to an irrelevant control locus at HBE1. See Supplementary Table S3 for Cas9 gRNA target sequences[23].

L. monocytogenes Finland_1998 Cascade sequences were synthesized by ATUM/DNA2.0 as human codon-optimized constructs using proprietary methods. See Supplementary Text for gene sequences of L. monocytogenes Finland_1998 Cascade constructs. For crRNA expression, a cloning vector was constructed (pAPcrRNA_Lmo) with a U6-driven cassette and digested with SacII and AgeI. To insert repeat-spacer pairs, oligonucleotides encoding the palindromic repeat and crRNA spacers were annealed, 5’ phosphorylated with PNK and ligated into digested pAPcrRNA_Lmo. See Supplementary Figure S12 for an illustration of the cloning scheme. As with EcoCascade, the control crRNA targets Cascade to an irrelevant control locus at HBE1.

Plasmids used throughout this study are available through Addgene (Plasmid IDs: 106270–106276, 126481–126494, 126501).

Cell culture and transfections

HEK293T cells were maintained in Dulbecco’s Modified Eagle’s Medium (Invitrogen) with 10% Fetal Bovine Serum (Sigma) and 1% penicillin-streptomycin (Gibco). K562 cells were maintained in RPMI-1640 Medium (Invitrogen) with 10% Fetal Bovine Serum (Sigma) and 1% penicillin-streptomycin (Gibco). Cells were incubated at 37°C with 5% CO2. Lentivirus was produced in HEK293Ts using Lipofectamine 3000 (Invitrogen). All other transfections were performed using Lipofectamine 2000 (Invitrogen) according to the manufacturer’s protocol.

Immunofluorescence staining

Cells were passaged and transfected with 100 ng plasmid DNA on coverslips in 24-well plates. At three days post-transfection, cells were washed with PBS and fixed with 4% paraformaldehyde (Sigma). Cells were incubated with blocking buffer (5% goat serum, 0.2% Triton X-100 in PBS) then incubated with mouse anti-Flag (1:200 dilution, Sigma, M2 clone), followed by incubation with goat anti-mouse Alexa Fluor 647 (1:200 dilution, Life Technologies, A21236), and DAPI nucleic acid stain (Invitrogen). Cells were imaged with a Leica DMI 3000 B fluorescence microscope. Exposure time was set by the fluorescence of the lowest-expressed construct and maintained for all samples.

Western blot and co-immunoprecipitation

For protein analysis, HEK293T cells were transfected with 2 μg of individual Cas constructs in 6-well plates. After three days, cells were lysed in RIPA buffer (Sigma) with a proteinase inhibitor cocktail (Roche). Samples were centrifuged at 12,000 rpm for 5 min and the supernatant was isolated and quantified using a bicinchoninic acid (BCA) assay protein standard curve (Thermo Scientific) on the BioTek Synergy 2 Multi-Mode Microplate Reader. Mixed with NuPAGE loading buffer (Invitrogen) and 5% β-mercaptoethanol, 25 μg protein was heated at 100°C for 10 min. Samples were loaded into 10% NuPAGE Bis-Tris gels (Invitrogen) with MES buffer (Invitrogen) and electrophoresed for 70 min at 200 V on ice. Protein was transferred to nitrocellulose membranes for 1 hour in 1X tris-glycine transfer buffer containing 10% methanol and 0.01% SDS at 4°C at 400 mA. The blot was blocked at room temperature for 30 min in 5% milk-TBST (50 mM Tris, 150 mM NaCl and 0.1% Tween-20) and incubated with mouse anti-Flag (1:1,000 dilution, Sigma, M2 clone) in 5% milk-TBST at 4°C overnight. Blots were then washed in TBST and incubated with goat anti-mouse-conjugated horseradish peroxidase (1:2,500 dilution, Sigma) in 5% milk-TBST for 45 min at room temperature. Blots were washed in TBST then visualized using Western-C ECL substrate (Bio-Rad) on a ChemiDoc XRS+ System (Bio-Rad). Blots were stripped with Restore PLUS Western Blot Stripping Buffer (Thermo Scientific), blocked, and re-blotted with rabbit anti-GAPDH (1:1,000 dilution, Cell Signaling, 14C10) or anti-Actin (1:1,000 dilution, Sigma, A2066) and goat anti-rabbit-conjugated horseradish peroxidase (1:2,500 dilution, Sigma). Blots were visualized again using the methods described above. For co-immunoprecipitation analysis, co-transfections were completed using a V5-Cas7 construct. HEK293T cells were co-transfected in 6-well plates with 425 ng of each Cas construct and crRNA for 2.25 μg total plasmid DNA per condition. 
At three days post transfection, cells were lysed with IP lysis buffer (Thermo Scientific) with a proteinase inhibitor cocktail (Roche). Samples were centrifuged at 12,000 rpm for 5 min and the supernatant was isolated and subjected to immunoprecipitation using goat anti-V5-agarose conjugate (10 μl, Abcam, ab1229) at 4°C overnight. The IP products were washed three times with IP lysis buffer, mixed with NuPAGE loading buffer and 5% β-mercaptoethanol, and heated at 100°C for 10 min. Samples were loaded into 10% NuPAGE Bis-Tris gels, and resolved as described above. Blots were blocked, incubated with mouse anti-Flag (1:1,000 dilution, Sigma, M2 clone) and mouse anti-V5 (1:40,000 dilution, Abcam, SV5-Pk1 clone) then with goat anti-mouse-conjugated horseradish peroxidase (1:2,500 dilution, Sigma). Blots were visualized as described above.
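The BCA quantification and 25 μg loading step used for the western blots can be illustrated with a short calculation. The sketch below is only an assumption about how such a standard curve might be applied: it fits a straight line to hypothetical standards (real BCA curves can deviate from linearity), ignores any sample dilution, and uses invented absorbance values and function names that do not come from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to get protein concentration (ug/ul)."""
    return (absorbance - intercept) / slope


def volume_for_mass(target_ug, conc_ug_per_ul):
    """Volume of lysate (ul) that contains the target protein mass."""
    return target_ug / conc_ug_per_ul


# Hypothetical standards: concentration (ug/ul) and measured absorbance.
standard_conc = [0.0, 0.25, 0.5, 1.0, 2.0]
standard_abs = [0.1, 0.225, 0.35, 0.6, 1.1]

slope, intercept = fit_line(standard_conc, standard_abs)
sample_conc = concentration_from_absorbance(0.85, slope, intercept)
load_volume = volume_for_mass(25.0, sample_conc)  # 25 ug per lane, as in the text
```

With these made-up standards, a lysate reading of 0.85 corresponds to about 1.5 μg/μl, so roughly 16.7 μl would carry 25 μg of protein.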

RNA analysis

For quantitative PCR (qPCR), HEK293T cells were co-transfected with individual crRNAs (100 ng) and EcoCascade (50 ng Cas8e, 100 ng Cse2, 50 ng Cas7, 250 ng Cas5, and 50 ng Cas6-p300) or LmoCascade (150 ng Cas8b2, 50 ng Cas7, 75 ng Cas5, and 150 ng Cas6-p300) in 24-well plates. After three days, total RNA was isolated using QIAshredder and QIAGEN RNeasy kits (Qiagen). Reverse transcription was carried out using 500 ng total RNA per sample in a 10 μl reaction using the SuperScript VILO Reverse Transcription Kit (Invitrogen). Each qPCR reaction used 1.0 μl of cDNA with Perfecta SYBR Green Fastmix (Quanta Biosciences) and was run on the CFX96 Real-Time PCR Detection System (Bio-Rad). All sequences for qPCR primers can be found in Supplementary Table S3. All qPCR data are presented as fold change in RNA normalized to Gapdh expression and relative to samples targeting Cascade with a crRNA targeted to an irrelevant control locus at HBE1.
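The text does not state how the fold change was computed. One common way to obtain a fold change that is normalized to a reference gene and expressed relative to a control sample is the 2^-ΔΔCt calculation, sketched here under the assumption of roughly 100% amplification efficiency. The Ct values and the function name are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt approach.

    Normalizes the target gene to a reference gene (for example Gapdh)
    within each sample, then compares the treated sample with the
    control sample (for example the control crRNA condition).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies 5 cycles earlier in the
# treated sample than in the control, with an unchanged reference gene.
activation = fold_change_ddct(24.0, 18.0, 29.0, 18.0)
unchanged = fold_change_ddct(29.0, 18.0, 29.0, 18.0)
```

In this made-up example the treated sample shows a 32-fold increase, and a sample identical to the control gives a fold change of 1.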

RNA sequencing

HEK293T cells were co-transfected with 3 μg total plasmid in 6-well plates. After three days, cells were washed twice with PBS and 350 μl of a 1:10 mixture of β-mercaptoethanol and Buffer RLT (Qiagen) was added to each well. While on ice, cells were lysed. RNA was quantified using a Nanodrop instrument, and RNA quality was assessed using an Agilent TapeStation 2200 with RNA ScreenTape (Agilent). Using 1 μg of total RNA input, stranded mRNA sample preparation was performed with the Illumina TruSeq Stranded mRNA Library Prep Kit (Illumina) following the manufacturer’s protocol, except that the enzymatic fragmentation time was reduced from 8 min to 1 min. Libraries were sequenced at the Duke GCB Sequencing Core as 51-cycle paired-end runs (51PE) on an Illumina HiSeq 4000 platform. Reads were aligned against the human reference genome GRCh38 using the aligner STAR v2.4.1a[45], following the proposed 2-pass strategy in which a splice junction database is identified first to improve the overall mapping quality. Gene counts were estimated with featureCounts from the subread package v1.4.6-p6[46], using gene annotations from Refseq[47] and allowing for multimapping reads (-M and --fraction arguments). Differential expression analyses were performed using the DESeq2 package[48], filtering out non-expressed genes and fitting a negative binomial generalized linear model to find significantly differentially expressed genes.
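The effect of counting multimapping reads fractionally can be shown with a toy example. With the -M and --fraction arguments, each reported alignment of a read that maps to x locations contributes 1/x of a count rather than a full count. The sketch below mimics only that weighting; it is not featureCounts itself, it ignores alignments that overlap no gene, and the read and gene names are hypothetical.

```python
from collections import defaultdict


def fractional_counts(read_alignments):
    """Assign each alignment of a read a weight of 1 / (number of alignments).

    read_alignments maps a read identifier to the list of genes hit by
    its reported alignments (one entry per alignment).
    """
    counts = defaultdict(float)
    for genes in read_alignments.values():
        if not genes:
            continue  # unaligned read contributes nothing
        weight = 1.0 / len(genes)
        for gene in genes:
            counts[gene] += weight
    return dict(counts)


# Hypothetical reads: one unique mapper, one 2-way and one 4-way multimapper.
example = {
    "read1": ["geneA"],
    "read2": ["geneA", "geneB"],
    "read3": ["geneA", "geneB", "geneC", "geneD"],
}
gene_counts = fractional_counts(example)
```

Here geneA receives 1 + 0.5 + 0.25 = 1.75 counts, and the total across genes still equals the number of reads.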

Chromatin immunoprecipitation qPCR

HEK293T cells were transfected with 40 μg total plasmid in 15 cm dishes. After three days, cells were fixed in 1% formaldehyde for 10 min at room temperature. The reaction was quenched with 0.125 M glycine and the cells were lysed using Farnham lysis buffer with a protease inhibitor cocktail (Roche). Nuclei were collected by centrifugation at 2,000 rpm for 5 min at 4°C and lysed in RIPA buffer with a protease inhibitor cocktail (Roche). Chromatin was sonicated using a Bioruptor Sonicator (Diagenode, model XL) and immunoprecipitated using anti-Flag (Sigma, M2). The formaldehyde crosslinks were reversed by heating overnight at 65°C and genomic DNA fragments were purified using a spin column. For qPCR, 500 pg of ChIP’d DNA was used per reaction. qPCR was performed as described above. The data are presented as fold change in gDNA normalized to a region of the β-actin locus and relative to samples targeting Cascade with the control crRNA mentioned above. All sequences for qPCR primers can be found in Supplementary Table S4.

Chromatin immunoprecipitation sequencing

Reverse-crosslinked ChIP’d DNA was cleaned using PCR purification columns (Qiagen). DNA concentration was determined using Qubit dsDNA High Sensitivity and Broad Range assay kit (Invitrogen). To prepare libraries for Illumina sequencing, 7 ng input of ChIP’d DNA was used with the Hyper Prep kit (Kapa Biosystems). After library preparation, samples were barcoded with Illumina TruSeq indexes and normalized to 10 nM. Final libraries were pooled and run on a HiSeq 4000 to generate ~20 million 50 bp single-end reads per sample. Sequences for TruSeq Illumina adapters were removed from the raw reads using Trimmomatic v0.32[49]. Then, reads were aligned using Bowtie v1.0.0[50], reporting the best alignment with up to 2 mismatches (parameters --best --strata -v 2). Duplicates were removed using Picard MarkDuplicates v1.130, while low mappability or blacklisted regions identified by the ENCODE project[51] were filtered out from the final BAM files. Using the sequenced input controls, binding regions were identified using the callpeak function in MACS2 v2.1.0.20151222[52]. For the differential binding analysis, a union peakset was first computed by merging individual peak calls using bedtools2 v2.25.0[53]. Then, reads in peaks were estimated using featureCounts and the difference in binding was assessed with DESeq2.

For the genomic window of 562 bp on chromosome 6 near the TAAR4P pseudogene displaying significant differential binding for cr25 compared with Ctrl crRNA, we searched for both global and local alignments of the 32 bp cr26 sequence. Using the water program in the EMBOSS v6.6.0 package[54], which implements the Smith-Waterman algorithm for local alignment, we were only able to map 8 nucleotides (26.6% of the cr26 sequence). When looking for end-to-end alignments with the needle program, which implements the Needleman-Wunsch algorithm, the best alignment contained 5 nucleotides on the 5’ end of the cr26 protospacer sequence and 16 mismatches (50% sequence identity). By contrast, when aligning cr26 to the IL1RN gene, a nearly perfect continuous match (30/32 bp, 94% sequence identity) starting at the 5’ end occurs in the genomic location chr2:113117890–113117919. This difference suggests an alternative mode of binding for this potential off-target site.
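To make the local-alignment step concrete, the following is a minimal Smith-Waterman sketch. It is not the EMBOSS water implementation: it assumes a simple match/mismatch score and a linear gap penalty (water uses a substitution matrix with affine gaps), and the sequences are toy strings rather than the crRNA or genomic sequences from the study. The `percent_identity` helper is a hypothetical name for the matches-over-length ratio quoted in the text.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells are floored at 0 so alignments can restart.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(matches, length):
    """Matching positions as a percentage of the query length."""
    return 100.0 * matches / length


# Toy sequences sharing an 8-nt core flanked by unrelated bases.
best_score, aligned_query, aligned_target = smith_waterman(
    "TTTTACGTACGT", "GGACGTACGTCC")
```

For the toy pair, the local alignment recovers only the shared `ACGTACGT` core. By the same ratio used in the text, 30 matching positions out of 32 corresponds to 93.75%, which rounds to the 94% identity reported for the IL1RN site.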

Lentiviral transduction

K562-HBE1-mCherry reporter cells[36] were co-transduced with lentivirus expressing LmoCascade subunits and Cas6-KRAB-2A-BlastR. Transduced cells were selected with blasticidin S (Invitrogen) at a concentration of 10 μg/ml, then clonally isolated and expanded. For protein analysis, cells were lysed in RIPA buffer (Sigma) with a protease inhibitor cocktail (Roche), and a western blot was completed using 25 μg protein and mouse anti-Flag (1:1,000 dilution, Sigma, M2 clone) using the methods described above. LmoCascade-KRAB clone 2 cells were transduced with lentivirus expressing individual crRNAs and selected with puromycin (Sigma) at a concentration of 1 μg/ml. After seven days, cells were harvested and washed twice with 2 mM EDTA (ThermoFisher) and 0.5% bovine serum albumin (Sigma) in PBS. Flow cytometry was performed with an SH800 (Sony Biotechnology). Total RNA was isolated and used for qPCR analysis using the methods described above. The qPCR data are presented as fold change in RNA normalized to Gapdh expression and relative to samples targeting Cascade with a crRNA targeted to an irrelevant control locus at IL1RN.

Deep sequencing and indel analysis

HEK293T cells were co-transfected with 600 ng total plasmid in 24-well plates. After four days, genomic DNA was isolated using the QIAGEN DNeasy Blood and Tissue kit (Qiagen). In a first-round PCR, 100 ng of genomic DNA was amplified with genome-specific primers. A second round of PCR was used to add experimental barcodes and Illumina flow cell binding sequences. The resulting sequence libraries were diluted to 2 nM, pooled, and sequenced with 150 bp paired-end reads on an Illumina MiSeq instrument. Samples were demultiplexed and analyzed for insertions and deletions with a local distribution of CRISPResso[55] with default parameters and a 30 bp window around the predicted I-TevI 5’-CNNNG-3’ cut sites. All primers for genomic DNA amplification can be found in Supplementary Table S5.
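As an illustration of how candidate 5’-CNNNG-3’ motifs and their surrounding windows might be located in an amplicon, the sketch below scans a sequence with a regular expression. It rests on several assumptions not taken from the text: the 30 bp window is treated as 15 bp on each side of the motif center, only the given strand is searched, and the function name and example sequence are hypothetical. It is not part of CRISPResso.

```python
import re

# Lookahead so that overlapping CNNNG motifs are all reported.
CNNNG = re.compile(r"(?=(C[ACGT]{3}G))")


def motif_windows(amplicon, window=30):
    """Return (motif_start, window_start, window_end) for each CNNNG motif.

    Coordinates are 0-based, with window_end exclusive, and windows are
    clipped to the amplicon boundaries.
    """
    seq = amplicon.upper()
    half = window // 2
    results = []
    for hit in CNNNG.finditer(seq):
        start = hit.start()
        center = start + 2  # middle base of the 5-nt motif (assumption)
        results.append((start,
                        max(0, center - half),
                        min(len(seq), center + half)))
    return results


single = motif_windows("AAAACTTTGAAAA")
overlapping = [start for start, _, _ in motif_windows("CCAAGG")]
```

The short example sequences contain one motif and two overlapping motifs, respectively. On a real amplicon the windows would rarely need clipping.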

Statistical analysis

All data were analyzed with two to three biological replicates and are presented as mean ± SEM. Logarithmic transformations were completed prior to statistical analysis where indicated. All p values were calculated by global one-way ANOVA with Tukey post hoc tests (α=0.05).
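The summary statistics named above can be sketched with the Python standard library. The example computes mean ± SEM for a group and the F statistic of a one-way ANOVA. It stops short of the p value and the Tukey post hoc comparisons, which need an F distribution and a studentized range distribution from a statistics package. The replicate values are invented for illustration.

```python
import math
import statistics


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical replicate values for three conditions; where a logarithmic
# transformation is indicated, math.log10 would be applied to the raw
# fold changes before this step.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
group_mean, group_sem = mean_sem(groups[0])
f_stat = one_way_anova_f(groups)
```

For these made-up groups the first condition has a mean of 2 with an SEM of about 0.58, and the ANOVA F statistic is 13 on 2 and 6 degrees of freedom.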

55 in total

1. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system.

Authors: Bernd Zetsche; Jonathan S Gootenberg; Omar O Abudayyeh; Ian M Slaymaker; Kira S Makarova; Patrick Essletzbichler; Sara E Volz; Julia Joung; John van der Oost; Aviv Regev; Eugene V Koonin; Feng Zhang
Journal: Cell Date: 2015-09-25 Impact factor: 41.582

2. Applications of CRISPR technologies in research and beyond.

Authors: Rodolphe Barrangou; Jennifer A Doudna
Journal: Nat Biotechnol Date: 2016-09-08 Impact factor: 54.908

Review 3. Diversity, classification and evolution of CRISPR-Cas systems.

Authors: Eugene V Koonin; Kira S Makarova; Feng Zhang
Journal: Curr Opin Microbiol Date: 2017-06-09 Impact factor: 7.934

4. Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin.

Authors: Alexander Bolotin; Benoit Quinquis; Alexei Sorokin; S Dusko Ehrlich
Journal: Microbiology Date: 2005-08 Impact factor: 2.777

Review 5. An updated evolutionary classification of CRISPR-Cas systems.

Authors: Kira S Makarova; Yuri I Wolf; Omer S Alkhnbashi; Fabrizio Costa; Shiraz A Shah; Sita J Saunders; Rodolphe Barrangou; Stan J J Brouns; Emmanuelle Charpentier; Daniel H Haft; Philippe Horvath; Sylvain Moineau; Francisco J M Mojica; Rebecca M Terns; Michael P Terns; Malcolm F White; Alexander F Yakunin; Roger A Garrett; John van der Oost; Rolf Backofen; Eugene V Koonin
Journal: Nat Rev Microbiol Date: 2015-09-28 Impact factor: 60.633

6. In vitro reconstitution of an Escherichia coli RNA-guided immune system reveals unidirectional, ATP-dependent degradation of DNA target.

Authors: Sabin Mulepati; Scott Bailey
Journal: J Biol Chem Date: 2013-06-11 Impact factor: 5.157

7. RNA editing with CRISPR-Cas13.

Authors: David B T Cox; Jonathan S Gootenberg; Omar O Abudayyeh; Brian Franklin; Max J Kellner; Julia Joung; Feng Zhang
Journal: Science Date: 2017-10-25 Impact factor: 47.728

Review 8. ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering.

Authors: Thomas Gaj; Charles A Gersbach; Carlos F Barbas
Journal: Trends Biotechnol Date: 2013-05-09 Impact factor: 19.536

9. Short motif sequences determine the targets of the prokaryotic CRISPR defence system.

Authors: F J M Mojica; C Díez-Villaseñor; J García-Martínez; C Almendros
Journal: Microbiology Date: 2009-03 Impact factor: 2.777

10. Three CRISPR-Cas immune effector complexes coexist in Pyrococcus furiosus.

Authors: Sonali Majumdar; Peng Zhao; Neil T Pfister; Mark Compton; Sara Olson; Claiborne V C Glover; Lance Wells; Brenton R Graveley; Rebecca M Terns; Michael P Terns
Journal: RNA Date: 2015-04-22 Impact factor: 4.942
