Adrian Pickar-Oliver1,2, Joshua B Black1,2, Mae M Lewis1,2, Kevin J Mutchnick1,2, Tyler S Klann1,2, Kylie A Gilcrest1,2, Madeleine J Sitton1,2, Christopher E Nelson1,2, Alejandro Barrera3, Luke C Bartelt2,3, Timothy E Reddy2,3,4, Chase L Beisel5,6,7, Rodolphe Barrangou8, Charles A Gersbach9,10,11. 1. Department of Biomedical Engineering, Duke University, Durham, NC, USA. 2. Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA. 3. Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA. 4. University Program in Genetics and Genomics, Duke University, Durham, NC, USA. 5. Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC, USA. 6. Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz-Centre for Infection Research (HZI), Würzburg, Germany. 7. Medical Faculty, University of Würzburg, Würzburg, Germany. 8. Department of Food, Bioprocessing and Nutrition Sciences, North Carolina State University, Raleigh, NC, USA. 9. Department of Biomedical Engineering, Duke University, Durham, NC, USA. charles.gersbach@duke.edu. 10. Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA. charles.gersbach@duke.edu. 11. Department of Surgery, Duke University Medical Center, Durham, NC, USA. charles.gersbach@duke.edu.
Abstract
Class 2 CRISPR-Cas systems, such as Cas9 and Cas12, have been widely used to target DNA sequences in eukaryotic genomes. However, class 1 CRISPR-Cas systems, which represent about 90% of all CRISPR systems in nature, remain largely unexplored for genome engineering applications. Here, we show that class 1 CRISPR-Cas systems can be expressed in mammalian cells and used for DNA targeting and transcriptional control. We repurpose type I variants of class 1 CRISPR-Cas systems from Escherichia coli and Listeria monocytogenes, which target DNA via a multi-component RNA-guided complex termed Cascade. We validate Cascade expression, complex formation and nuclear localization in human cells, and demonstrate programmable CRISPR RNA (crRNA)-mediated targeting of specific loci in the human genome. By tethering activation and repression domains to Cascade, we modulate the expression of targeted endogenous genes in human cells. This study demonstrates the use of Cascade as a CRISPR-based technology for targeted eukaryotic gene regulation, highlighting class 1 CRISPR-Cas systems for further exploration.
The ability to modulate and perturb genetic information is indispensable for
studying gene function and elucidating biological mechanisms. Targetable DNA-binding
proteins that modify genomes at specific loci have led to tremendous advances in
science, biotechnology, and medicine[1]. Specifically, the development of genome engineering tools based
on class 2 CRISPR-Cas systems, which use a single effector protein such as Cas9 or
Cas12, has revolutionized the field due to the ease of use of this technology with a
vast range of applications[2]. In
fact, bioinformatics analyses have further revealed a diversity of CRISPR-Cas
systems, and the most recent classification encompasses six major types (I through
VI)[3]. However,
single-effector class 2 systems (types II, V and VI) have been primarily used for
nucleic acid targeting in eukaryotes, despite multi-subunit class 1 systems (types
I, III, and IV) comprising about 90% of all identified systems across bacteria and
archaea[4]. The continued
efforts to discover and develop single-component class 2 CRISPR effectors beyond
Cas9-based type II systems have resulted in new technologies with specific
advantages or applications. For example, the Cas12-based type V[5] and Cas13-based type VI[6] CRISPR-Cas systems of class 2 have distinct
targeting and editing mechanisms. Here, we describe the development of type I
systems, which account for more than 50% of all identified CRISPR-Cas loci, for use
as programmable transcriptional activators and repressors in mammalian cells.
Type I CRISPR-Cas systems use the signature Cas3 nuclease-helicase to
eliminate invading DNA, and are further divided into eight subtypes (I-A to I-G and
I-U) based on related, but subtype-specific, accessory cas
genes[3,4,7]. The
well-studied prototypical type I-E system of Escherichia coli K12
consists of eight cas genes and a downstream CRISPR array[8-10]. Following transcription of the CRISPR array, the 29-bp
repeat sequences flanking the variable spacer sequences are cleaved during crRNA
biogenesis by the Cas6e endoribonuclease[10]. Together, five protein subunits (Cas8e, Cse2, Cas7, Cas5,
and Cas6, previously referred to as CasA, CasB, CasC, CasD, and CasE,
respectively)[4] and the
mature 61-nt crRNA form the 405-kDa CRISPR-associated complex for antiviral defense
(Cascade)[11-13] (Figure 1A). Unlike type II CRISPR systems, there is no tracrRNA required
for effector complex formation. For type I CRISPR systems, CasE processes the CRISPR
array and targeting relies on a single crRNA. To bind to a target, Cascade surveys
the DNA to find a protospacer-adjacent motif (PAM) upstream of a target sequence
with complementarity to the crRNA spacer sequence (Figure 1B). We sought to harness this prokaryotic immune system for
genome targeting in eukaryotes.
Figure 1.
EcoCascade expression and complex formation in human cells.
(a) Schematic of type I-E CRISPR-Cas system in E.
coli K-12 showing EcoCascade stoichiometry and crRNA processing.
Genes encoding proteins comprising the EcoCascade complex are presented in
different colors. CRISPR repeats are indicated with a red diamond. Cas6 cleaves
the primary CRISPR RNA (crRNA) transcript at the regions indicated with blue
arrows yielding mature crRNAs. (b) Schematic representation of
processed crRNA with 5’ PAM recognition and base pairing at the DNA
target site. (c) Cas subunits driven by human cytomegalovirus (CMV)
promoter and pre-processed crRNA driven by U6 promoter for expression and
processing in mammalian cells. (d) Western blot showing expression
of human codon-optimized individual Cas proteins and GAPDH loading control in
human HEK293T cells. (* indicates second round of codon optimization).
(e) Co-IP and western blot showing crRNA-dependent Cascade
formation following co-transfection with V5-tagged Cas7 and Flag-tagged Cas8e,
Cse2, Cas5, and Cas6 (immunoprecipitation with α-V5, and detection with
α-V5 and α-Flag). (f) Immunofluorescence imaging
showing engineered Cas subunits with NLSs enables import into the human nucleus.
Red indicates Cas subunit; scale bar, 25μm. For d-f, two
independent experiments were conducted with similar results. All samples
processed at 3 days post-transfection. CMV, human cytomegalovirus; CoIP,
Co-immunoprecipitation; NLS, nuclear localization signal.
Results
Expressing the Cascade complex in human cells
To repurpose Cascade for use in mammalian cells, we used a CMV promoter
to express each Cascade subunit of the E. coli K12 system
(EcoCascade). Based on available structural information[13-16], N-terminal Flag epitope tags and nuclear localization
signals (NLSs) were attached to each EcoCascade construct. We used the RNA
polymerase III U6 promoter to express target spacers flanked by full repeat
sequences for crRNA processing by CasE (Figure
1C)[17]. Heterologous
expression of all EcoCascade constructs was confirmed by Western blot following
transfection of each plasmid individually into human HEK293T cells (Figure 1D). The cas genes
were originally optimized based on human codon usage, but variable expression of
these cas constructs indicated a need for additional
codon-optimization of Cas5, which was performed by ATUM/DNA2.0
(DNA sequences provided in Supplementary Information). To determine if EcoCascade complex
formation occurred in human cells, the six plasmids encoding each of the five
Cas subunits and the crRNA cassette were co-transfected into HEK293T cells.
Co-immunoprecipitation by pull-down of the V5 epitope on Cas7 and blotting for
the Flag epitope on the other subunits confirmed proper complex assembly (Figure 1E). Interestingly, EcoCascade
complexes can be purified from bacteria in the absence of a crRNA[18]; however, we observed
EcoCascade formation in human cells only in the presence of a crRNA (Figure 1E). Additionally, nuclear
localization of each subunit with the single NLS was confirmed by
immunofluorescence staining of transfected HEK293T cells (Figure 1F).
Programmable transcriptional activation by EcoCascade-p300
Next, we sought to repurpose EcoCascade for CRISPR-based programmable
transcriptional activation (CRISPRa) in mammalian cells. Transcriptional
modulation using class 2 CRISPR systems has been achieved by introducing point
mutations into the endo-nucleolytic domains of single-component effectors to
maintain binding but not cleavage of the target DNA[19]. For example, the nuclease-deficient
Cas9 (dCas9) can be engineered to function as a synthetic transcriptional
activator by genetically fusing it to a transactivation or epigenome-modifying
domain[20-24]. In the natural type I
CRISPR-Cas immune system, target site recognition by Cascade leads to
recruitment of the Cas3 nuclease to eliminate target DNA. However, deletion of
cas3 from the endogenous type I-E system in E.
coli was utilized for a CRISPR interference (CRISPRi) strategy in
bacteria by permitting Cascade to bind target DNA and block transcription
without DNA degradation[25].
Therefore, we hypothesized that Cascade can be repurposed as a programmable
DNA-binding technology in eukaryotes by neglecting to express
cas3.
To repurpose EcoCascade as a programmable transcriptional activator, we
explored the various Cas-effector subunits for tethering of the activation
domain. We previously demonstrated robust endogenous gene activation with dCas9
fused to the catalytic core domain of the human acetyltransferase p300[23]. The EcoCascade system
contains five Cas subunits available at various stoichiometry (Figure 1A), providing versatile options for synthetic
fusions to transcriptional regulatory domains and other modular engineering
strategies. Following heterologous expression of EcoCascade with p300 fused to
Cas8e, Cse2, Cas5, or Cas6, we confirmed EcoCascade complex formation by
co-immunoprecipitation (Figure 2A). The
ability of each of these subunits to accommodate the fusion of the large p300
core catalytic domain (72 kDa) without abrogating complex formation suggests that
the modular Cascade complex could be particularly useful for multiplexed
targeting of regulatory domains at specific loci.
Figure 2.
EcoCascade activates transcription of endogenous genes in human
cells.
(a) Co-IP showing EcoCascade formation following
co-transfection with plasmids encoding the crRNA, Flag-tagged EcoCascade
subunits with V5-tagged Cas7, and various Cas-p300 fusions. IP performed with
α-V5 and IB performed with α-V5 and α-Flag. Two independent
experiments were conducted with similar results. (b) Schematic of
the IL1RN locus along with tiled IL1RN crRNA
target sites. H3K27ac enrichment from the ENCODE Consortium is indicated with
the vertical range set to 400 to indicate regulatory regions. The two
IL1RN ChIP-qPCR amplicons are shown in corresponding
locations. (c) Relative IL1RN expression following
co-transfection of individual crRNAs (100 ng) and EcoCascade (50 ng Cas8e, 100
ng Cse2, 50 ng Cas7, 250 ng Cas5*) with Cas6-p300 (50 ng). (n=3 biologically
independent samples; mean ± SEM). (d) Relative
IL1RN expression following co-transfection of Ctrl crRNA or
cr26 and EcoCascade with various Cas-p300 effectors. (n=3 biologically independent
samples; mean ± SEM). (e) ChIP-qPCR enrichment following
co-transfection of individual crRNAs with EcoCascade and Cas6-p300. IP performed
with α-Flag and qPCR performed with primers for amplicon regions
designated in b. (n=3 biologically independent samples; mean ±
SEM; bars indicate mean fold enrichment). (f) Schematic of the
HBG locus along with HBG crRNA target
sites. (g) Relative HBG expression following
co-transfection of individual crRNAs and EcoCascade with Cas6-p300. (n=3
biologically independent samples; mean ± SEM). All samples were processed at 3
days post transfection. (Tukey-test following log transformation,
**P<0.001 and *P<0.05
compared to Ctrl crRNA). Numbers above bars indicate mean relative expression.
TSS, Transcription start site; Ctrl crRNA, Control crRNA; IP,
immunoprecipitation; IB, immunoblot; ChIP, chromatin immunoprecipitation; qPCR,
quantitative PCR.
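The caption above reports a Tukey test applied after log transformation. The transformation matters because fold changes spanning several orders of magnitude are multiplicative, and taking logarithms makes them additive before group comparison. The following Python sketch illustrates only that pre-processing step. It is not the authors' analysis code: the function name `log_summary` is hypothetical, the input values are made-up illustrative numbers rather than data from this study, and the Tukey comparison itself is not implemented here.

```python
import math
import statistics

def log_summary(fold_changes):
    """Summarize replicate fold changes on the log10 scale.

    Returns (mean of log10 values, SEM of log10 values, geometric mean).
    Hypothetical helper for illustration only.
    """
    logs = [math.log10(x) for x in fold_changes]
    mean_log = statistics.mean(logs)
    # Sample standard deviation divided by sqrt(n) gives the SEM.
    sem_log = statistics.stdev(logs) / math.sqrt(len(logs))
    # Back-transforming the mean of logs yields the geometric mean.
    return mean_log, sem_log, 10 ** mean_log

# Illustrative replicate values (NOT data from the study).
mean_log, sem_log, geo_mean = log_summary([10.0, 100.0, 1000.0])
print(mean_log, sem_log, geo_mean)
```

The log-scale values produced this way would then be passed to whichever multiple-comparison procedure is used.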
To test programmable endogenous gene activation in human cells, a panel
of crRNAs was generated tiling the endogenous IL1RN promoter at
protospacer targets downstream of known PAMs (5’-AAG, AGG, ATG, GAG,
TAG-3’)[10,26-28] (Figure
2B, Table
S1). Co-transfection of HEK293T cells with plasmids encoding EcoCascade
with Cas6-p300 and individual crRNAs revealed robust IL1RN
activation with many crRNAs, including >3,000-fold IL1RN
activation with cr26 (**P<0.001, Figure 2C). Importantly, cr26 with EcoCascade lacking
a p300 domain, or cr26 alone did not activate IL1RN (Figure S1), suggesting
target-specific activation by EcoCascade-p300. Based on EcoCascade stoichiometry
(Figure 1A) and relative Cas construct
expression (Figure 1D), we optimized the
relative masses of transfected plasmids to maximize gene activation (Figure S2). Additionally,
the transactivation potential of all Cas-p300 fusions was explored with cr26.
Relative to heterologous expression with a crRNA targeted to a control locus,
EcoCascade containing Cas8e-p300 or Cas6-p300 displayed significant
transactivation of IL1RN (*P<0.05,
Figure 2D).
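The crRNA tiling described above amounts to scanning a promoter sequence for a permitted PAM and taking the adjacent downstream bases as the protospacer. The Python sketch below shows one way this could be done, under stated assumptions. The function names are hypothetical. The 32-nt EcoCascade spacer length is inferred from the 61-nt mature crRNA and 29-nt repeat rather than stated directly in the text. The example sequence is synthetic. Only the supplied strand is scanned, and because PAM strand conventions differ between reports, the orientation should be checked against the cited PAM studies before any real use.

```python
# PAM sets as quoted in the text; all helper names are hypothetical.
ECO_PAMS = {"AAG", "AGG", "ATG", "GAG", "TAG"}  # type I-E EcoCascade
LMO_PAMS = {"CCA"}                               # type I-B LmoCascade

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement, for scanning the opposite strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def find_protospacers(seq, pams, spacer_len):
    """List (pam_start, pam, protospacer) for each PAM that lies
    immediately 5' of a full-length protospacer on the given strand."""
    seq = seq.upper()
    pam_len = 3
    hits = []
    # Last index at which a PAM still leaves room for a full protospacer.
    last_start = len(seq) - pam_len - spacer_len
    for i in range(last_start + 1):
        pam = seq[i:i + pam_len]
        if pam in pams:
            start = i + pam_len
            hits.append((i, pam, seq[start:start + spacer_len]))
    return hits

# Synthetic example sequence (not a real promoter).
example = "CCAAG" + "ACGT" * 8 + "TT"
eco_hits = find_protospacers(example, ECO_PAMS, spacer_len=32)  # assumed 32 nt
lmo_hits = find_protospacers(example, LMO_PAMS, spacer_len=36)  # 36 nt per text
print(eco_hits)
print(lmo_hits)
```

Calling `find_protospacers(reverse_complement(example), ...)` would scan the opposite strand with the same logic.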
Versatility of EcoCascade for targeted gene activation
To investigate EcoCascade-p300 interactions at the target locus, we
performed chromatin immunoprecipitation with an anti-Flag antibody followed by
quantitative PCR (ChIP-qPCR) of two amplicons adjacent to the target site (Figure 2B). We observed significant
enrichment of the target regions in EcoCascade-p300 samples co-transfected with
cr25 or cr26 compared to a control crRNA (**P<0.001,
Figure 2E). These results further
confirm EcoCascade as a programmable DNA-binding platform for efficient
targeting of specific loci in the human genome. Intriguingly, we observed
enhanced enrichment of IL1RN signal in samples treated with
Flag-tagged EcoCascade-p300 and an IL1RN-targeted crRNA
compared to samples treated with Flag-tagged dCas9-p300 and an
IL1RN-targeted single-guide RNA (sgRNA) (Figure S3). These results may
indicate increased occupancy by EcoCascade relative to dCas9 but could also be
the result of differences in epitope avidity or presentation. Targeted
endogenous IL1RN activation was also achieved by tethering Cas6
to the tripartite activator, VP64-p65-Rta (VPR)[24], although both p300 and VPR tethered to
Cas6 led to reduced activation levels compared to fusion to dCas9 (Figure S4). To assess
activation of other endogenous targets in the human genome, we targeted the
HBG promoter with EcoCascade-p300 (Figure 2F) and observed robust transactivation (Figure 2G). The natural function of Cascade
to process crRNAs suggests the possibility of using arrayed spacers for
multiplexed genome engineering. By generating a CRISPR array containing multiple
crRNA spacers that target both IL1RN and HBG,
we demonstrated multiplexed activation of endogenous genes (Figure S5). Together, these results
demonstrate the potential for repurposing type I-E Cascade systems as versatile
programmable transcriptional activators in mammalian cells.
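The arrayed-spacer design mentioned above, in which each spacer is flanked by full repeats so that the Cas6 endoribonuclease can process the transcript into individual crRNAs, can be sketched as simple string assembly. This is an illustrative layout only. The function name is hypothetical, and the repeat and spacers are placeholder strings, not the sequences used in the study (those are in the Supplementary Information).

```python
def build_crispr_array(repeat, spacers):
    """Assemble repeat-spacer-repeat-...-repeat so that every spacer
    is flanked on both sides by a full repeat."""
    if not spacers:
        raise ValueError("at least one spacer is required")
    parts = [repeat]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(repeat)
    return "".join(parts)

# Placeholders only: a 29-character stand-in repeat and two
# 32-character stand-in spacers (e.g. one per target gene).
REPEAT = "R" * 29
array = build_crispr_array(REPEAT, ["A" * 32, "C" * 32])
print(len(array))
```

With one spacer the same function yields the single repeat-spacer-repeat cassette used for individual crRNAs.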
Highly specific crRNA-dependent EcoCascade-p300 targeting
Assessing the specificity of genome and epigenome engineering tools is
essential for their successful application in basic research and medicine. To
date, there have been varied reports regarding the specificity of Cas9-based
genome and epigenome editing technologies[29-31], which
has led to the development of a variety of strategies to improve
specificity[32]. To
quantify the genome-wide binding specificity of EcoCascade-p300, we performed
ChIP-seq using the FLAG epitope fused to the N terminus of each of the
EcoCascade subunits. Binding of EcoCascade-p300 was highly enriched at the
IL1RN promoter when targeting with cr25 and cr26, with no
detectable binding observed with a HBE1-targeting control crRNA
(Figure 3A). Interestingly, the
strength of binding signal was comparable between cr25 and cr26, even though
they differed substantially in their induction of IL1RN
transcription (Figure 2C). With a
genome-wide false discovery rate (FDR) < 0.001, there were a few
off-target differential binding sites observed when comparing cr25 and cr26 to
Ctrl crRNA (Figure 3B–C). These were all substantially weaker sites compared
to the signal at the IL1RN locus, with the exception of one
genomic window located on chromosome 6 about 5 kilobases adjacent to
TAAR4P that was significantly enriched only with cr26
(Figure 3C and Figure S6A). However, we could not
readily detect a seed sequence complementary to cr26 within this region. A site
near the UBB locus was enriched in the control crRNA-treated
sample relative to both cr25 and cr26, indicating a possible off-target site of
this crRNA (Figure 3B–C).
Figure 3.
Genome-wide specificity of EcoCascade-p300 targeted to
IL1RN.
(a) ChIP-seq tracks for binding of Flag-tagged
EcoCascade-p300 targeted to the IL1RN promoter with cr25 and
cr26, compared with binding of EcoCascade-p300 with Ctrl crRNA. An ENCODE
H3K27ac track is included to highlight the regulatory regions.
(b,c) MA plots for the differential analyses of global binding
activity including comparisons of EcoCascade-p300 targeted by (b)
cr25 and (c) cr26 versus EcoCascade-p300 with
HBE1-targeting control crRNA in HEK293T cells. Red data points
indicate FDR < 0.001 by differential DESeq2 analysis using Wald test
p-values. (d) MA plots for differential expression analyses
comparing EcoCascade-p300 targeted by cr26 versus GFP-targeting
control crRNA in HEK293T cells. Red data points indicate FDR < 0.01 by
differential expression analysis using Wald test p-values. (n=3 biologically
independent samples). Ctrl crRNA, Control crRNA.
To evaluate the specificity of crRNA-mediated endogenous gene activation
with EcoCascade-p300, we performed RNA-seq to quantify transcriptome-wide
changes when targeting IL1RN with cr26 or with a
GFP-targeting control crRNA (Figure 3D). Targeted activation was highly specific to the target
gene when EcoCascade-p300 was co-expressed with cr26, with detection of a modest
change to only six other genes (Figure 3D).
However, we did observe that the addition of the p300 domain to EcoCascade
resulted in significant off-target transcriptional changes when compared to
EcoCascade alone (Figure
S6B). Given the highly specific DNA-targeting by EcoCascade (Figure 3B–C) and cr26-dependent activation of IL1RN (Figure 3D), these results may indicate
non-specific crRNA-independent effects of overexpression of the p300
acetyltransferase fused to Cas6. Collectively, this genome-wide specificity
analysis demonstrates highly specific crRNA-dependent targeting of EcoCascade in
mammalian cells.
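The MA plots referenced in Figure 3 place each genomic window or gene by its log ratio between two conditions (M) against its average signal (A). The Python sketch below uses the classic definition of these two quantities. It is not the DESeq2 implementation, whose MA plot shows the mean of normalized counts on the x-axis and the model-estimated log2 fold change on the y-axis. The function name and the pseudocount default are assumptions made for illustration.

```python
import math

def ma_values(count_a, count_b, pseudocount=1.0):
    """Return (M, A) for one feature under the classic definition:
    M = log2(a) - log2(b), A = mean of log2(a) and log2(b).
    The pseudocount avoids log2(0) and is an illustrative choice."""
    log_a = math.log2(count_a + pseudocount)
    log_b = math.log2(count_b + pseudocount)
    return log_a - log_b, (log_a + log_b) / 2.0

# Toy counts: targeted crRNA sample vs control crRNA sample.
m, a = ma_values(8, 2, pseudocount=0.0)
print(m, a)
```

Features with M near zero are unchanged between conditions, so a specific targeting experiment is expected to show few points far from the horizontal axis.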
Expanding the Cascade toolbox with LmoCascade
Beyond the well-characterized EcoCascade system, bioinformatic analyses
have revealed a plethora of additional type I CRISPR-Cas systems. To explore the
potential for repurposing other Cascade complexes, we extended our results with
the model type I-E EcoCascade to repurposing the type I-B CRISPR-Cas system of
L. monocytogenes Finland_1998 (LmoCascade) (Figure 4A). Expression of all subunits was confirmed
in HEK293T cells following human codon-optimization with N-terminal Flag epitope
tags and NLSs attached to each LmoCascade construct (Figure 4B). To repurpose LmoCascade as a programmable
transcriptional activator, we fused the catalytic core domain of p300 to Cas6,
the predicted EcoCascade Cas6 ortholog. To test programmable endogenous gene
activation in human cells, a panel of crRNAs, with the predicted spacer length
of 36 nucleotides, was generated tiling the endogenous IL1RN
promoter at protospacer targets downstream of the known PAM
(5’-CCA-3’)[33] (Figure 4C).
Co-transfection of HEK293T cells with plasmids encoding LmoCascade with
Cas6-p300 and individual crRNAs revealed robust IL1RN
activation with most crRNAs (**P<0.001, Figure 4D). Additionally, the transactivation
potential of all Cas-p300 fusions was explored with cr03. Relative to
heterologous expression with a control crRNA, LmoCascade containing Cas8b2-p300,
Cas5-p300 or Cas6-p300 displayed significant transactivation of
IL1RN (**P<0.001, Figure 4E). In contrast to the panel of
IL1RN-targeting crRNAs for EcoCascade (Figure 2C), almost all of the LmoCascade crRNAs, as
well as three of the four Cas-p300 effector fusions, achieved significant
IL1RN activation (Figure
4D–E). To further assess
crRNA-dependent activation of other endogenous targets in the human genome, we
targeted the HBG promoter with LmoCascade-p300 and observed
robust transactivation (Figure
S7). These results demonstrate the potential to broaden our
fundamental knowledge of type I CRISPR systems as we expand the CRISPR
engineering toolbox by repurposing less characterized systems for use in
mammalian cells.
Figure 4.
LmoCascade activates transcription of IL1RN gene in human
cells.
(a) Schematic of type I-B CRISPR-Cas system in L.
monocytogenes Finland_1998 showing predicted LmoCascade
stoichiometry and crRNA processing. Genes encoding proteins comprising the
LmoCascade complex are presented in different colors. CRISPR repeats are
indicated with a red diamond. The primary crRNA transcript is predicted to be
cleaved at the regions indicated with blue arrows yielding mature crRNAs.
(b) Western blot showing expression of human codon-optimized
individual Cas proteins and Actin loading control in human HEK293T cells. Two
independent experiments were conducted with similar results. (c)
Schematic of the IL1RN locus along with tiled
IL1RN crRNA target sites. (d) Relative
IL1RN expression following co-transfection of individual
crRNAs and LmoCascade with Cas6-p300. (e) Relative
IL1RN expression following co-transfection of Ctrl crRNA or
cr03 and LmoCascade with various Cas-p300 effectors. All samples were processed
at 3 days post transfection. (Tukey-test following log transformation,
**P<0.001 compared to Ctrl crRNA, n=3 biologically
independent samples; error bars, SEM). Numbers above bars indicate mean relative
expression. TSS, Transcription start site; Ctrl crRNA, Control crRNA; qPCR,
quantitative PCR.
Targeted gene repression by stable LmoCascade-KRAB expression
In addition to harnessing type I CRISPR systems for programmable
transcriptional activation, we sought to take advantage of steric hindrance by
the large Cascade complex, the strong binding of Cascade to target DNA[34], and tethering of
transcriptional repressor domains such as KRAB[20,31,35], to repurpose
LmoCascade for targeted transcriptional repression in eukaryotic cells. To
achieve stable expression of LmoCascade-KRAB in mammalian cells, we generated
lentivirus expression constructs for each Cascade subunit, including the
addition of a T2A-BlasticidinR sequence downstream of Cas6-KRAB
(Figure S8A).
Following co-transduction of a K562-HBE1-mCherry endogenous gene reporter cell
line[36],
LmoCascade-KRAB expressing cells were selected with blasticidin S, followed by
clonal isolation and expansion. Clone 2 was selected following confirmation of
LmoCascade-KRAB expression by Western blot analysis (Figure S8B).
To test programmable endogenous gene repression in human cells, a panel
of crRNAs was generated tiling the endogenous 5’-untranslated region of
HBE1 (Figure 5A).
Lentiviral expression constructs were generated for stable, independent
expression of a crRNA and an eGFP-2A-PuroR selection cassette (Figure 5B). Transduction, selection, and
expansion of K562-HBE1-mCherry cells expressing LmoCascade-KRAB (Figure 5C) revealed HBE1
transcriptional repression with all crRNAs (Figure
5D). To further assess the repressive capacity of LmoCascade-KRAB,
protein expression was evaluated by flow cytometry analysis of the HBE1-mCherry
reporter (Figure 5E–F). Significant reduction in mCherry fluorescence was
observed for all crRNA targets (*P<0.05, Figure 5E), including robust repression in 5 of 6
crRNAs (**P<0.001, Figure
5E). These results demonstrate the potential for repurposing type I
CRISPR systems as programmable transcriptional repressors in mammalian
cells.
Figure 5.
LmoCascade represses transcription of HBE1 gene in human
cells.
(a) Schematic of the HBE1 locus along with
tiled HBE1 crRNA target sites. (b) Schematic of
lentiviral expression constructs with dual promoters for co-expression of crRNAs
with eGFP and PuromycinR. (c) Experimental timeline for
transduction of monoclonal K562-HBE1-2A-mCherry cells with stable
LmoCascade-KRAB expression. Cells were transduced with individual crRNAs,
selected for two days with puromycin and harvested on day seven.
(d) Relative HBE1 expression measured by qPCR.
(e) Relative MFI of mCherry measured by flow cytometry.
(f) Representative flow cytometry histogram. For
d-f, two independent experiments were conducted with similar
results. (Tukey-test, **P<0.001 compared to Ctrl crRNA,
n=3 biologically independent samples; error bars, SEM). Numbers above bars
indicate mean relative expression. TSS, Transcription start site; Ctrl crRNA,
Control crRNA; qPCR, quantitative PCR. MFI, mean fluorescence intensity.
Discussion
In summary, these results show that Cascade from type I-E and type I-B
CRISPR systems can be reprogrammed for RNA-guided transcriptional modulation in
human cells. This new class of genome engineering tools has potential benefits that
expand the CRISPR engineering toolbox. For example, the promiscuous PAM recognition
of type I-E EcoCascade (5’-AAG, AGG, ATG, GAG, TAG-3’), and the
additional PAM recognition of type I-B LmoCascade (5’-CCA-3’), located
5’ of the spacer in contrast to the 3’ PAM of type II
systems[16,37], enables a larger set of available CRISPR
target sequences.
By tiling crRNAs along endogenous promoters, our studies revealed
PAM-independent (Table S1)
variation in transactivation potential of crRNAs (Figures 2B–C, 2F–G, 4C–D). The reason for these differences remains to be elucidated but is similar
to differences observed among gRNAs and crRNAs for class 2 systems. Ineffective
class 2 effectors can gain activity when co-targeting dCas9 to proximal
sites[38], suggesting dCas9
increases accessibility of these sites. Similarly, we observed 16-fold enhanced
transactivation with EcoCascade-p300 and cr22 when dCas9 was co-targeted to
IL1RN (Figure
S9). These results suggest a narrower ability for Cascade to recognize
genomic DNA targets in eukaryotes; however, this may also serve as a mechanism to
increase targeting specificity.
Additionally, the preservation of complex formation observed after effector
tethering suggests opportunities to utilize the stoichiometry of the Cascade complex
for exploring synergistic activities of multiple effector domains. For dCas9,
combinatorial targeting by tethering KRAB and DNA methyltransferases has been used
to achieve heritable transcriptional silencing[39]. Additionally, the stoichiometry and architecture of
Cascade have been tuned in bacteria by altering crRNA protospacer length[25,40]. The several cas genes involved and their
various corresponding Cas proteins also present individual opportunities to append
molecules and functional domains with increased flexibility.
Beyond repurposing type I CRISPR systems for targeted transcriptional
modulation, we also anticipate that Cascade subunits can be tethered to endonuclease
effectors, such as the catalytic domains of the homodimeric FokI and monomeric
I-TevI[41-43] endonucleases, for programmable editing via
generation of double-stranded breaks or single-stranded nicks in genomic DNA. In
fact, we have observed indels characteristic of double-strand break repair following
delivery of Cascade-I-TevI fusions in preliminary experiments (Figure S10). Alternatively, Cascade can
be co-expressed with the Cas3 helicase-nuclease to generate a spectrum of long-range
chromosomal deletions[44].
Targeted transcriptional modulation is important for perturbing gene
function, designing gene regulatory networks, investigating the function of distal
regulatory elements, manipulating cellular and organismal phenotypes, and inducing
therapeutic changes to gene expression. Cascade complexes from type I CRISPR-Cas
systems provide a novel and widely applicable RNA-guided platform for targeting DNA
sequences in eukaryotes.
Online Methods
Mammalian expression plasmid construction
E. coli K-12 Cascade sequences were originally
codon-optimized according to human codon usage tables using Integrated DNA Technologies
(IDT), synthesized as gene blocks and integrated into expression plasmids
containing a CMV-driven cassette by Gibson cloning strategies. ATUM/DNA2.0
synthesized the second-round human codon-optimized constructs using proprietary
methods. See Supplementary
Text for gene sequences of E. coli K-12 Cascade
constructs. For crRNA expression, a cloning vector was constructed
(pAPcrRNA_Eco) with a U6-driven cassette and digested with SacII and XhoI. To
insert repeat-spacer pairs, oligonucleotides encoding the palindromic repeat and
crRNA spacers were annealed, 5’ phosphorylated with PNK and ligated into
digested pAPcrRNA_Eco. See Supplementary Figure S11 for an illustration of the cloning scheme.
The control crRNA used throughout this study targets Cascade to an irrelevant
control locus at HBE1. See Supplementary Table S3 for Cas9
gRNA target sequences[23].
L. monocytogenes Finland_1998 Cascade sequences were
synthesized by ATUM/DNA2.0 as human codon-optimized constructs using proprietary
methods. See Supplementary
Text for gene sequences of L. monocytogenes
Finland_1998 Cascade constructs. For crRNA expression, a cloning vector was
constructed (pAPcrRNA_Lmo) with a U6-driven cassette and digested with SacII and
AgeI. To insert repeat-spacer pairs, oligonucleotides encoding the palindromic
repeat and crRNA spacers were annealed, 5’ phosphorylated with PNK and
ligated into digested pAPcrRNA_Lmo. See Supplementary Figure S12 for an
illustration of the cloning scheme. The control crRNA used throughout this study
targets Cascade to an irrelevant control locus at HBE1.
Plasmids used throughout this study are available through Addgene (Plasmid IDs:
106270–106276, 126481–126494, 126501).
Cell culture and transfections
HEK293T cells were maintained in Dulbecco’s Modified
Eagle’s Medium (Invitrogen) with 10% Fetal Bovine Serum (Sigma) and 1%
penicillin-streptomycin (Gibco). K562 cells were maintained in RPMI-1640 Medium
(Invitrogen) with 10% Fetal Bovine Serum (Sigma) and 1% penicillin-streptomycin
(Gibco). Cells were incubated at 37°C with 5% CO2. Lentivirus
was produced in HEK293T cells using Lipofectamine 3000 (Invitrogen). All other
transfections were performed using Lipofectamine 2000 (Invitrogen) according to
the manufacturer’s protocol.
Immunofluorescence staining
Cells were passaged and transfected with 100 ng plasmid DNA on
coverslips in 24-well plates. At three days post-transfection, cells were washed
with PBS and fixed with 4% paraformaldehyde (Sigma). Cells were incubated with
blocking buffer (5% goat serum, 0.2% Triton X-100 in PBS) then incubated with
mouse anti-Flag (1:200 dilution, Sigma, M2 clone), followed by incubation with
goat anti-mouse Alexa Fluor 647 (1:200 dilution, Life Technologies, A21236), and
DAPI nucleic acid stain (Invitrogen). Cells were imaged with a Leica DMI 3000 B
fluorescence microscope. Exposure time was set by the fluorescence of the
lowest-expressed construct and maintained for all samples.
Western blot and co-immunoprecipitation
For protein analysis, HEK293T cells were transfected with 2 μg of
individual Cas constructs in 6-well plates. After three days, cells were lysed
in RIPA buffer (Sigma) with a proteinase inhibitor cocktail (Roche). Samples
were centrifuged at 12,000 rpm for 5 min and the supernatant was isolated and
quantified using a bicinchoninic acid (BCA) assay protein standard curve (Thermo
Scientific) on the BioTek Synergy 2 Multi-Mode Microplate Reader. Mixed with
NuPAGE loading buffer (Invitrogen) and 5% β-mercaptoethanol, 25 μg
protein was heated at 100°C for 10 min. Samples were loaded into 10%
NuPAGE Bis-Tris gels (Invitrogen) with MES buffer (Invitrogen) and
electrophoresed for 70 min at 200 V on ice. Protein was transferred to
nitrocellulose membranes for 1 hour in 1X tris-glycine transfer buffer
containing 10% methanol and 0.01% SDS at 4°C at 400 mA. The blot was
blocked at room temperature for 30 min in 5% milk-TBST (50 mM Tris, 150 mM NaCl
and 0.1% Tween-20) and incubated with mouse anti-Flag (1:1,000 dilution, Sigma,
M2 clone) in 5% milk-TBST at 4°C overnight. Blots were then washed in
TBST and incubated with goat anti-mouse-conjugated horseradish peroxidase
(1:2,500 dilution, Sigma) in 5% milk-TBST for 45 min at room temperature. Blots
were washed in TBST then visualized using Western-C ECL substrate (Bio-Rad) on a
ChemiDoc XRS+ System (Bio-Rad). Blots were stripped with Restore PLUS Western
Blot Stripping Buffer (Thermo Scientific), blocked, and re-blotted with rabbit
anti-GAPDH (1:1,000 dilution, Cell Signaling, 14C10) or anti-Actin (1:1,000
dilution, Sigma, A2066) and goat anti-rabbit-conjugated horseradish peroxidase
(1:2,500 dilution, Sigma). Blots were visualized again using the methods
described above.
For co-immunoprecipitation analysis, co-transfections were completed
using a V5-Cas7 construct. HEK293T cells were co-transfected in 6-well plates
with 425 ng of each Cas construct and crRNA for 2.25 μg total plasmid DNA
per condition. At three days post-transfection, cells were lysed with IP lysis
buffer (Thermo Scientific) with a proteinase inhibitor cocktail (Roche). Samples
were centrifuged at 12,000 rpm for 5 min and the supernatant was isolated and
subjected to immunoprecipitation using goat anti-V5-agarose conjugate (10
μl, Abcam, ab1229) at 4°C overnight. The IP products were washed
three times with IP lysis buffer, mixed with NuPAGE loading buffer and 5%
β-mercaptoethanol, and heated at 100°C for 10 min. Samples were
loaded into 10% NuPAGE Bis-Tris gels, and resolved as described above. Blots
were blocked, incubated with mouse anti-Flag (1:1,000 dilution, Sigma, M2 clone)
and mouse anti-V5 (1:40,000 dilution, Abcam, SV5-Pk1 clone) then with goat
anti-mouse-conjugated horseradish peroxidase (1:2,500 dilution, Sigma). Blots
were visualized as described above.
RNA analysis
For quantitative PCR (qPCR), HEK293T cells were co-transfected with
individual crRNAs (100 ng) and EcoCascade (50 ng Cas8e, 100 ng Cse2, 50 ng Cas7,
250 ng Cas5, and 50 ng Cas6-p300) or LmoCascade (150 ng Cas8b2, 50 ng Cas7, 75
ng Cas5, and 150 ng Cas6-p300) in 24-well plates. After three days, total RNA
was isolated using QIAshredder and QIAGEN RNeasy kits (Qiagen). Reverse
transcription was carried out using 500 ng total RNA per sample in a 10
μl reaction using the SuperScript VILO Reverse Transcription Kit
(Invitrogen). Per qPCR reaction, 1.0 μl of cDNA was used with PerfeCTa
SYBR Green FastMix (Quanta Biosciences) and run on the CFX96 Real-Time PCR
Detection System (Bio-Rad). All sequences for qPCR primers can be found in Supplementary Table S3.
All qPCR data are presented as fold change in RNA normalized to
Gapdh expression and relative to samples targeting Cascade
with a crRNA targeted to an irrelevant control locus at
HBE1.
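The normalization described above is commonly computed with the 2^-ΔΔCt method. The text does not name the exact formula, so the following is only a minimal sketch under that assumption (and assuming roughly 100% amplification efficiency). The function name and the Ct values are hypothetical and serve only as an illustration.

```python
def fold_change(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method (an assumption, not stated in the text).

    ct_target, ct_ref: Ct of the gene of interest and of the reference gene
    in the sample receiving a targeting crRNA.
    ct_target_ctrl, ct_ref_ctrl: the same two Ct values in the sample
    receiving the control crRNA.
    """
    d_ct_sample = ct_target - ct_ref              # normalize to the reference gene
    d_ct_control = ct_target_ctrl - ct_ref_ctrl   # same normalization for the control
    dd_ct = d_ct_sample - d_ct_control            # express relative to the control crRNA
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target gene crosses threshold 5 cycles earlier
# in the targeted sample, with an unchanged reference gene.
print(fold_change(24.0, 18.0, 29.0, 18.0))
```

A fold change of 1 corresponds to no difference from the control crRNA sample.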
RNA sequencing
HEK293T cells were co-transfected with 3 μg total plasmid in 6-well
plates. After three days, cells were washed twice with PBS and 350 μl of a 1:10
mixture of β-mercaptoethanol and Buffer RLT (Qiagen) was added to each
well. While on ice, cells were lysed. RNA was quantified using a Nanodrop
instrument, and RNA quality was assessed using an Agilent TapeStation 2200 with
RNA ScreenTape (Agilent). Using 1 μg of total RNA input, stranded mRNA sample
preparation was performed with the Illumina TruSeq Stranded mRNA Library Prep
Kit (Illumina) following the manufacturer’s protocol, except that the enzymatic
fragmentation time was reduced from 8 min to 1 min. Libraries were sequenced at
the Duke GCB Sequencing Core as 51-cycle paired-end runs (51PE) on an Illumina
HiSeq 4000 platform. Reads were aligned against the human reference genome
GRCh38 using the aligner STAR v2.4.1a[45] following the proposed 2-pass strategy to first
identify a splice junction database to improve the overall mapping quality. Gene
counts were estimated with featureCounts from the subread package
v1.4.6-p6[46], using
gene annotations from Refseq[47]
and allowing for multimapping reads (-M and --fraction arguments). Differential
expression analyses were performed using the DESeq2 package[48], filtering out non-expressed genes and
fitting a negative binomial generalized linear model to find significantly
differentially expressed genes.
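To illustrate what fractional counting of multimapping reads means, the toy sketch below gives each reported alignment of a read a weight of 1/n, where n is the number of alignments reported for that read. This is a simplified model written for illustration, not featureCounts itself; the input format (read ID, gene) and all names are hypothetical.

```python
from collections import Counter, defaultdict


def fractional_gene_counts(alignments):
    """Sum fractional counts per gene.

    alignments: list of (read_id, gene) pairs, one pair per reported alignment.
    A read with n reported alignments contributes 1/n to each gene it hits.
    """
    alignments = list(alignments)
    hits_per_read = Counter(read_id for read_id, _ in alignments)
    counts = defaultdict(float)
    for read_id, gene in alignments:
        counts[gene] += 1.0 / hits_per_read[read_id]
    return dict(counts)


# Hypothetical input: r2 maps to two genes, so it adds 0.5 to each.
toy = [("r1", "GENE_A"), ("r2", "GENE_A"), ("r2", "GENE_B"), ("r3", "GENE_B")]
print(fractional_gene_counts(toy))
```

With this scheme the total count across genes still equals the number of reads, which keeps library sizes comparable between samples.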
Chromatin immunoprecipitation qPCR
HEK293T cells were transfected with 40 μg total plasmid in 15 cm
dishes. After three days, cells were fixed in 1% formaldehyde for 10 min at room
temperature. The reaction was quenched with 0.125 M glycine and the cells were
lysed using Farnham lysis buffer with a protease inhibitor cocktail (Roche).
Nuclei were collected by centrifugation at 2,000 rpm for 5 min at 4°C and
lysed in RIPA buffer with a protease inhibitor cocktail (Roche). Chromatin was
sonicated using a Bioruptor Sonicator (Diagenode, model XL) and
immunoprecipitated using anti-Flag (Sigma, M2). The formaldehyde crosslinks were
reversed by heating overnight at 65°C and genomic DNA fragments were
purified using a spin column. For qPCR, 500 pg of ChIP’d DNA was used per
reaction. qPCR was performed as described above. The data are presented as fold
change gDNA normalized to a region of the β-actin locus and relative to
samples targeting Cascade with the control crRNA mentioned above. All sequences
for qPCR primers can be found in Supplementary Table S4.
Chromatin immunoprecipitation sequencing
Reverse crosslinked ChIP’d DNA was cleaned using PCR purification
columns (Qiagen). DNA concentration was determined using Qubit dsDNA High
Sensitivity and Broad Range assay kit (Invitrogen). To prepare sequencing
libraries for Illumina sequencing, 7 ng input of ChIP’d DNA was used
with the Hyper Prep kit (Kapa Biosystems). After library preparation, samples
were barcoded with Illumina TruSeq indexes and normalized to 10 nM. Final
libraries were pooled and run on a HiSeq 4000 to generate ~20 million, 50 bp
single-end reads per sample.
Sequences for TruSeq Illumina adapters were removed from the raw reads
using Trimmomatic v0.32[49].
Then, reads were aligned using Bowtie v1.0.0[50], reporting the best alignment with up to 2 mismatches
(parameters --best --strata -v 2). Duplicates were removed using Picard
MarkDuplicates v1.130, while low mappability or blacklisted regions identified
by the ENCODE project[51] were
filtered out from the final BAM files. Using the sequenced input controls,
binding regions were identified using the callpeak function in MACS2
v2.1.0.20151222[52]. For
the differential binding analysis, first, a union peakset was computed merging
individual peak calls using bedtools2 v2.25.0[53]. Then, reads in peaks were estimated
using featureCounts and the difference in binding was assessed with DESeq2.
For the genomic window of 562 bp on chromosome 6 near the
TAAR4P pseudogene displaying significant differential
binding for cr26 compared with Ctrl crRNA, we searched for both global and local
alignments of the 32 bp cr26 sequence. Using the water program in the EMBOSS
v6.6.0 package[54], which
implements the Smith-Waterman algorithm for local alignment, we were only able
to map 8 nucleotides (25% of the cr26 sequence). When looking for end-to-end
alignments with the needle program implementing the Needleman-Wunsch algorithm,
the best alignment contained 5 nucleotides on the 5’ end of the cr26
protospacer sequence and 16 mismatches (50% sequence identity). By contrast,
when aligning cr26 to the IL1RN gene, a nearly perfect
continuous match (30/32 bp, 94% sequence identity) starting at the 5’ end
occurs in the genomic location chr2:113117890–113117919. This difference
suggests an alternative mode of binding for this potential off-target site.
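The local-alignment search described above can be illustrated with a small Smith-Waterman sketch. It uses a simple match/mismatch score and a linear gap penalty chosen for illustration; these are assumptions and not the scoring matrix or gap model of the EMBOSS water program. The sequences in the example are hypothetical, and the helper percent_identity only reproduces the arithmetic behind the identity figures quoted in the text.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the scoring matrix; scores are floored at 0 (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(matches, length):
    """Identity as a percentage of the full query length."""
    return 100.0 * matches / length


# Hypothetical toy sequences sharing only the core "ACGT".
print(smith_waterman("TTACGTAA", "GGACGTCC"))
# Arithmetic behind the quoted identities for a 32 bp spacer.
print(percent_identity(30, 32), percent_identity(16, 32))
```

A global (Needleman-Wunsch) alignment differs mainly in that scores are not floored at zero and the traceback spans both sequences end to end.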
Lentiviral transduction
K562-HBE1-mCherry reporter cells[36] were co-transduced with lentivirus expressing
LmoCascade subunits and Cas6-KRAB-2A-BlastR. Transduced cells were selected with
blasticidin S (Invitrogen) at a concentration of 10 μg/ml then clonally
isolated and expanded. For protein analysis, cells were lysed in RIPA buffer
(Sigma) with a proteinase inhibitor cocktail (Roche) and a western blot was
completed using 25 μg protein and mouse anti-Flag (1:1,000 dilution,
Sigma, M2 clone) using the methods described above.
LmoCascade-KRAB clone 2 cells were transduced with lentivirus expressing
individual crRNAs and selected with puromycin (Sigma) at a concentration of 1
μg/ml. After seven days, cells were harvested, washed twice with 2 mM
EDTA (ThermoFisher) and 0.5% bovine serum albumin (Sigma) in PBS. Flow cytometry
was performed with an SH800 (Sony Biotechnology). Total RNA was isolated and used for
qPCR analysis using the methods described above. The qPCR data are presented as
fold change in RNA normalized to Gapdh expression and relative
to samples targeting Cascade with a crRNA targeted to an irrelevant control
locus at IL1RN.
Deep sequencing and indel analysis
HEK293T cells were co-transfected with 600 ng total plasmid in 24-well
plates. After four days, genomic DNA was isolated using QIAGEN DNeasy Blood and
Tissue kit (Qiagen). In a first round PCR, 100 ng of genomic DNA was amplified
with genome specific primers. A second round of PCR was used to add experimental
barcodes and Illumina flow cell binding sequences. The resulting sequence
libraries were diluted to 2 nM, pooled, and sequenced with 150 bp paired-end
reads on an Illumina MiSeq instrument. Samples were demultiplexed and analyzed
for insertions and deletions with a local distribution of CRISPResso[55] with default parameters and a
30 bp window around the predicted I-TevI 5’-CNNNG-3’ cut sites.
All primers for genomic DNA amplification can be found in Supplementary Table S5.
Statistical analysis
All data were analyzed with two to three biological replicates and are presented
as mean ± SEM. Logarithmic transformations were completed prior to
statistical analysis where indicated. All p values were calculated by global one-way
ANOVA with Tukey post hoc tests (α=0.05).
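As a rough illustration of the summary statistics named above, the sketch below computes mean ± SEM and the F statistic of a one-way ANOVA using only the Python standard library. The replicate values are hypothetical, the log10 transform is shown only as one possible choice of logarithm, and the Tukey post hoc step is omitted because it requires the studentized range distribution, which a statistics package would normally supply.

```python
import math
import statistics


def mean_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of replicate groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical fold-change values for three conditions, three replicates each.
conditions = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(mean_sem(conditions[0]))
print(one_way_anova_f(conditions))

# Optional log transform before testing (log10 is an assumed choice).
log_conditions = [[math.log10(v) for v in g] for g in conditions]
print(one_way_anova_f(log_conditions))
```

The F statistic would then be compared against the F distribution with the corresponding degrees of freedom to obtain a p value.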