Chance M Nowak1,2, Seth Lawson1, Megan Zerez1,2, Leonidas Bleris3,2,4. 1. Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX 75080, USA. 2. Center for Systems Biology, The University of Texas at Dallas, Richardson, TX 75080, USA. 3. Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX 75080, USA bleris@utdallas.edu. 4. Bioengineering Department, The University of Texas at Dallas, Richardson, TX 75080, USA.
Abstract
The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) system allows a single guide RNA (sgRNA) to direct a protein with combined helicase and nuclease activity to the DNA. Streptococcus pyogenes Cas9 (SpCas9), a CRISPR-associated protein, has revolutionized our ability to probe and edit the human genome in vitro and in vivo. Arguably, the true modularity of the Cas9 platform is conferred through the ease of sgRNA programmability as well as the degree of modifications the sgRNA can tolerate without compromising its association with SpCas9 and function. In this review, we focus on the properties and recent engineering advances of the sgRNA component in Cas9-mediated genome targeting.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) loci are present in prokaryotes, including both bacteria and archaea (1,2), and are primarily characterized by direct repeat sequences interspaced by similarly sized variable sequences (3–6). Early investigations into the nature of the repeat and variable sequences revealed that CRISPR and the CRISPR-associated proteins (Cas) work in tandem to recognize and cleave invading foreign DNA (1–8). The characterization of CRISPR-Cas as a type of prokaryotic immune system laid the groundwork for what has now become a powerful tool for various applications well outside the original biological context (9–16).
The CRISPR-Cas systems are diverse in prokaryotes, and as such, are divided into two classes based on whether the effector complex is multimeric (Class 1) or monomeric (Class 2). The two classes are in turn subdivided into six types with types I, III and IV belonging to Class 1, and types II, V and VI belonging to Class 2 (17,18). While the primary function of conferring acquired immunity to invading foreign nucleic acids is conserved between types, the individual components required to carry out this function vary. For the purposes of this review we largely focus on the RNA components from the CRISPR type II-A SpCas9 system.
In prokaryotes, the CRISPR-Cas system functions as a microbial analog to the acquired (adaptive) immune system present in higher organisms (5,19,20). The variable sequences of the CRISPR array, known as spacers, are relics of previous infectious events whereby fragments of invading DNA, or protospacers, have been captured and integrated into the host genome at the CRISPR locus to serve as an immunological memory (8). Once a new protospacer has been integrated into the CRISPR array, the entire array can be transcribed into pre-crRNA and processed into mature crRNA.
The processing of pre-crRNA into mature crRNA is distinct in type II CRISPR systems in that it relies on the presence of trans-activating crRNAs (tracrRNAs) that hybridize with the pre-crRNA through complementary base pairing to the repeat regions (21–23). RNase III, a dimeric endoribonuclease that cleaves double-stranded RNA, then recognizes the pre-crRNA:tracrRNA hybrid and cleaves individual crRNA:tracrRNA hybrids from the primary CRISPR array transcript (21). Ultimately the crRNA-tracrRNA hybrid spacer sequence (Figure 1A) is trimmed down to 20 nucleotides (21) before tightly associating with the SpCas9 nuclease and forming the catalytically active ribonucleoprotein (RNP) complex used for targeted DNA cleavage (19,20).
Figure 1.
Streptococcus pyogenes CRISPR-SpCas9 guide RNA anatomy. (A) Endogenous CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA). The spacer sequence (orange) is 20 nucleotides in length; the repeat sequence (green) is 22 nucleotides in length and base pairs with the tracrRNA complementary region (blue). The 3′ handle region (purple) has functional significance for structure-dependent recognition by SpCas9. (B) The synthetic sgRNA retains the dual-tracrRNA:crRNA secondary structure via a fusion of the 3′ end of the crRNA to the 5′ end of the tracrRNA with an engineered tetraloop. (C) Individual functional modules of the sgRNA (sgRNA structure adapted from Briner et al., 2014). The 5′ spacer sequence dictates SpCas9 localization within the genome. The lower stem is formed by the duplex between the CRISPR repeat sequence from the crRNA and the region of complementarity in the tracrRNA. SpCas9 interacts with the upper and lower stems in a sequence-independent manner, whereas the bulge interactions with SpCas9 appear to be sequence-dependent. The nexus contains both sequence and structural features necessary for DNA cleavage and lies at the center of the sgRNA:SpCas9 interactions. The nexus also forms a junction between the sgRNA and both SpCas9 and the target DNA. The terminal hairpins assist in stabilizing the sgRNA and support stable complex formation with SpCas9. The hairpins can also tolerate large deletion mutations and still exhibit cleavage activity.
An indispensable aspect of any immune system is the ability to distinguish self from non-self; in other words, the components of the immune system must be able to recognize molecules that do not originate from the host. The SpCas9 CRISPR system achieves this distinction through the recognition of a protospacer adjacent motif (PAM), which is a short G-rich oligonucleotide sequence downstream of the target DNA (8,24).
This feature is crucial for targeted DNA cleavage, as the corresponding spacer in the CRISPR array is identical to the target DNA, and would otherwise be cleaved. It is not until after SpCas9 scans invading foreign DNA for the PAM sequence 5′NGG that complementary base pairing between the target DNA and crRNA can occur and trigger targeted DNA cleavage (25,26).
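The PAM requirement described above can be illustrated with a short Python sketch. It is only a minimal illustration, not a guide design tool: the function names `reverse_complement` and `find_spcas9_sites` are hypothetical, the input is assumed to be a plain DNA string containing only A, C, G, T or N, and the scan simply reports every 20-nucleotide protospacer that is immediately followed by a 5′-NGG-3′ PAM on either strand.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T/N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_spcas9_sites(dna: str, spacer_len: int = 20):
    """List candidate SpCas9 target sites as (strand, start, protospacer, pam).

    A site is a protospacer of `spacer_len` nucleotides directly followed
    by an NGG PAM. `start` is counted from the 5' end of the scanned strand,
    so minus-strand positions refer to the reverse-complemented sequence.
    """
    # Zero-width lookahead so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    sites = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

A sequence lacking NGG on both strands returns an empty list, mirroring the observation that a spacer stored in the CRISPR array, which is not followed by a PAM, is not treated as a target.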
Cas9, the protein
The high-resolution crystal structure of SpCas9 in complex with a single guide RNA (sgRNA) and its cognate target DNA, obtained by Nishimasu et al., identified key functional interactions that govern the molecular mechanism of SpCas9-mediated DNA cleavage. The crystal structure revealed that SpCas9 has a bilobed architecture composed of a Recognition lobe (REC) and a Nuclease lobe (NUC), and the site of heteroduplex formation between the sgRNA and its cognate target DNA is a positively charged cleft at the interface between the two lobes (27,28). The REC lobe is composed of an α-helical region termed the bridge helix domain that recognizes the ‘seed’ region (the 10–12 PAM-proximal nucleotides of the guide region) of the sgRNA through salt bridges with the sgRNA backbone, a REC1 domain that recognizes the repeat:anti-repeat duplex of the sgRNA and a REC2 domain that does not interact with the guide:target heteroduplex. The NUC lobe is composed of a RuvC catalytic nuclease domain that cleaves the non-complementary strand of the target DNA, an HNH catalytic nuclease domain that cleaves the complementary strand of the target DNA, and a PAM-interacting (PI) domain that recognizes the 5′NGG PAM on the non-complementary strand (22,23,27,28).
SpCas9 interacts with the sgRNA in both sequence-dependent and sequence-independent manners: the guide region is recognized in a sequence-independent manner, whereas SpCas9 recognition of the sgRNA repeat:anti-repeat duplex involves sequence-dependent interactions (27). Additional information on the base pair interactions and details regarding conformational changes due to sgRNA and target DNA/PAM recognition can be found in recent literature (27–30).
CRISPR Class 2, a brief overview
Among Class 2 Type II proteins there is a large number of different Cas9 orthologs (31). Here, we include a brief overview of some of the various Cas9 orthologs used in mammalian genome editing.
The Type II-A Cas9 ortholog from Staphylococcus aureus (SaCas9) is a notable alternative to SpCas9. SaCas9 has structural similarity to SpCas9, notwithstanding its 17% sequence identity (32), and was shown to edit genomes with efficiency similar to SpCas9 (33). Less popular are the co-expressed StCas9s from Streptococcus thermophilus located on CRISPR loci 1 or 3. These orthologs are only slightly smaller than SpCas9 and have reduced cleavage activity in human cells (10,34). Type II-C Cas9s have shown poor dsDNA cleavage efficiency and a propensity to cleave off-target, and are not commonly used for genome editing (35). However, NmCas9 from Neisseria meningitidis has successfully been used for genome engineering in human pluripotent stem cells (36–38).
Other types within the Class 2 CRISPR-Cas repertoire include the Type V system Cpf1. Cpf1 targets a 5′ T-rich PAM upstream of the target DNA, generates a five-nucleotide 5′ overhang and does not require a tracrRNA (39). AsCpf1 from Acidaminococcus sp. recognizes a 5′ TTTN PAM and has shown robust activity in human cells (39). The Class 2 Type VI CRISPR system consists of only one member, C2c2 from Leptotrichia shahii, which cleaves RNA rather than DNA, using a one-part guide (similar to Cpf1) (18,40).
Our understanding of other Class 2 CRISPR-Cas systems is growing rapidly, and new systems are being discovered and adapted for genome editing. The sequences of the sgRNAs as well as the PAMs from the Class 2 CRISPR-Cas proteins mentioned above are listed in Table 1. While this review will focus on guide RNA engineering advances, there are a number of developments in the protein engineering field of Cas9 that cannot be recapitulated by sgRNA engineering alone and warrant mention. For example, human codon optimization has been performed on SpCas9 (9), SaCas9 (41), NmCas9 (42), St1Cas9 (42), St3Cas9 (43) and Cpf1 (39). Likewise, catalytically inactive and nicking versions of these proteins have been generated (23,32,42,44–46). Further, mutagenesis of PAM-interacting regions of SpCas9 and SaCas9 has resulted in altered PAM mutants (41,47), and mutational neutralization of key charged residues increased target specificity and decreased off-target effects by SpCas9 (48,49).
Table 1.
CRISPR-associated programmable nucleases and cognate sgRNA
Guide RNA production and multiplexing
Endogenous Type II CRISPR RNA components require extensive processing before becoming functional (22,23). The first effort to recapitulate the bacterial CRISPR system in mammalian cells involved the delivery of SpCas9, SpRNase III, the tracrRNA and the pre-crRNA array, which contained the spacer sequence flanked by direct repeats (10). Interestingly, the inclusion of SpRNase III was found to be unnecessary for cleavage of the target DNA sequence in mammalian cells (10).
A key advance in CRISPR programmability came with the engineering of the chimeric sgRNA (23). The chimeric sgRNA (Figure 1B) is a single transcript that retains the dual-tracrRNA:crRNA secondary structure via a fusion of the 3′ end of the crRNA to the 5′ end of the tracrRNA with an engineered tetraloop (23). Results from systematic mutational analysis of 77 sgRNA variants revealed that the sgRNA is composed of six structural modules (Figure 1C): the spacer, the lower stem, the bulge, the upper stem, the nexus and the hairpins (50).
The most common approach to produce sgRNAs in human cells is using the human U6 RNA polymerase III (RNAP III) promoter (9,10,16). This constitutive RNAP III promoter allows the sgRNA transcript to escape post-transcriptional modifications that are coupled to RNAP II transcription (such as 5′ methyl capping and polyadenylation), which would otherwise result in its export out of the nucleus (51,52). Alternatively, the type III RNAP III promoter H1, which requires a purine nucleotide transcriptional start site, can be used (53). Notably, both the U6 and H1 promoters can be combined with transcriptional response elements to facilitate inducible sgRNA production (54,55).
RNAP II mediated sgRNA expression has been utilized by placing the sgRNA downstream of a minimal Cytomegalovirus promoter (mCMV) followed by a minimal polyadenylation sequence, with the entire sequence under the inducible control of the tetracycline response element (56). Additionally, RNAP II mediated sgRNA production was also demonstrated by generating an artificial intron in a fluorescent reporter gene by flanking the sgRNA with the appropriate splice sites (56). However, considering that intronic RNAs typically have a short half-life (57,58), the stability of the sgRNA product has to be further examined.
RNAP II based sgRNA production can be combined with strategies that exploit RNA binding proteins and utilize RNA secondary structures for improved efficiency and multiplexed sgRNA production (59). One approach is based on flanking the sgRNA with a 28 nucleotide hairpin that is recognized by the endoribonuclease Csy4. Csy4 cleaves immediately downstream of the hairpin and remains bound to the upstream secondary structure (59–62) (Figure 2A). Critically, it was shown that Csy4 binding can assist with stabilizing intronic sgRNAs (59). The hairpin can also be embedded in the 3′ untranslated region (3′UTR) of a protein coding transcript with an additional RNA module that forms a 3′ triple helical structure (triplex) that stabilizes RNAs lacking poly(A) tails (59,63).
Figure 2.
sgRNA multiplexing strategies. (A) RNA endonuclease Csy4 recognizes a 28 nucleotide sequence flanking the sgRNA sequence and cleaves after the 20th nucleotide while remaining bound to the upstream region. This production strategy allows for RNAP II mediated transcription via a CMV promoter and polyadenylation signal. (B) The cis-acting hammerhead and HDV ribozymes, flanking the 5′ and 3′ ends of the sgRNA, respectively, allow for self-cleaving production of sgRNAs and are not dependent on the presence of an exogenous protein. This production strategy also allows for RNAP II mediated transcription via a CMV promoter and polyadenylation signal. (C) Polycistronic tRNA–gRNA architecture allows the production of multiple sgRNAs from a single synthetic gene. The endogenous RNases RNase P and RNase Z cleave the 5′ leader and 3′ trailer sequences at specific sites, respectively. This production strategy relies on the presence of an RNAP III promoter and terminator sequence, but achieves multiple sgRNA production via internal RNAP III promoter elements intrinsic to tRNA genes.
RNAP II sgRNA production can also be carried out with ribozyme-flanked sgRNA cassettes, in which the 5′ target sequence is fused to the self-cleaving Hammerhead (HH) ribozyme and the 3′ end of the scaffold sequence is fused to the Hepatitis delta virus (HDV) ribozyme (59,64) (Figure 2B). The triplex structure can also be used in stabilizing transcripts originating upstream from the ribozyme-produced sgRNA constructs (59).
Despite these attempts to produce sgRNA from RNAP II promoters, RNAP III promoters remain the predominant choice in recent papers. RNAP III transcribes tRNAs that must undergo processing in the nucleoplasm. By constructing a polycistronic tRNA–gRNA architecture, enzymes used in the endogenous eukaryotic tRNA processing machinery (i.e. RNase P and Z) can cleave the tRNA fragments out of the transcript, producing functional and multiplexed sgRNAs (65) (Figure 2C). In Table 2 we include the DNA sequences for the modified sgRNA constructs presented in Figures 2–4.
Table 2.
sgRNA modification sequences
Figure 4.
Incorporating RNA aptamer sequences into sgRNA. (A) MS2 loops that selectively bind MCP incorporated into the sgRNA 3′ end. (B) MS2 loops that selectively bind MCP incorporated into both the sgRNA tetraloop and the first hairpin. (C) MS2 loop incorporated in the 3′ end in conjunction with a second aptamer hairpin, f6, that has been selected to bind MCP. (D) CRISPRainbow sgRNA that utilizes three RNA binding protein-aptamer systems. The N22 peptide, fused to red fluorescent protein, binds the BoxB aptamer; MCP, fused to blue fluorescent protein, binds the MS2 aptamer; and PCP, fused to green fluorescent protein, binds the PP7 aptamer. Using different aptamers to bind red, green and blue fluorescent proteins, the CRISPRainbow system creates seven different scaffolds that can be imaged as individual combinations of the primary colors. (E) sgRNA variant similar to the structure shown in Figure 4B, but also utilizing a truncated spacer to achieve sgRNA multiplexing schemes that allow for both gene knockout and activation with catalytically active SpCas9.
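The count of seven CRISPRainbow scaffolds in panel (D) follows from taking every non-empty combination of the three primary colors. The Python sketch below only illustrates that arithmetic; the names `APTAMER_FOR_COLOR` and `color_combinations` are hypothetical, and the color-to-aptamer pairing is taken from the caption above.

```python
from itertools import combinations

# Color-to-aptamer pairing as stated in the Figure 4D caption.
APTAMER_FOR_COLOR = {"red": "BoxB", "green": "PP7", "blue": "MS2"}


def color_combinations(colors=("red", "green", "blue")):
    """Return every non-empty combination of the given colors."""
    combos = []
    for size in range(1, len(colors) + 1):
        combos.extend(combinations(colors, size))
    return combos


if __name__ == "__main__":
    for combo in color_combinations():
        aptamers = [APTAMER_FOR_COLOR[color] for color in combo]
        print(" + ".join(combo), "->", ", ".join(aptamers))
```

Three single colors, three pairs and one triple give the seven distinguishable sgRNA species.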
sgRNA production can also be achieved in vitro by appending a T7 RNA polymerase promoter to the 5′ end of the spacer region. Similar to transcription from the human RNAP III U6 promoter, transcription by T7 RNAP requires 1–2 guanine residues directly upstream of the spacer sequence (66). In vitro-transcribed (IVT) sgRNA can be microinjected into embryos along with mRNA encoding the SpCas9 ORF (67), or IVT sgRNAs can be transfected with purified SpCas9 protein (68). Additionally, chemically synthesized sgRNAs with chemically modified nucleotides, transfected into human primary cells with either SpCas9 mRNA or purified protein, have been shown to enhance target specificity (69).
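The guanine requirement at the start of the transcript can be handled when a spacer is prepared for cloning or in vitro transcription. The following Python sketch shows one simple convention and is not a prescribed protocol: the helper name `with_five_prime_g` is hypothetical, and it assumes that a spacer already beginning with G needs no extra nucleotide, whereas any other spacer receives a single prepended G.

```python
def with_five_prime_g(spacer: str) -> str:
    """Return the spacer with a 5' guanine, prepending one G if it is absent.

    Assumption for illustration: a native leading G is treated as sufficient
    for transcription initiation, so only spacers starting with A, C or T
    are extended by one nucleotide.
    """
    spacer = spacer.upper()
    if spacer.startswith("G"):
        return spacer
    return "G" + spacer
```

Note that a prepended G lengthens the 5′ end of the guide by one nucleotide that may not match the target, which is worth keeping in mind when comparing guides of different lengths.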
Guide RNA sequence modification and function
A systematic mutational analysis of the sgRNA (50) revealed that the bulge and nexus are the most sensitive to disruption and are necessary for DNA cleavage. Replacement of the bulge with perfectly complementary base pairing abrogates DNA cleavage. Also, substituting a pair of guanine nucleotides that form the base of the nexus structure with two cytosine nucleotides completely abolished cleavage activity. Interestingly, the upper stem can withstand large deletion mutations and still exhibit DNA cleavage activity (23,27,50) (Figure 3A). Alternatively, extensions to the stemloop were found to increase sgRNA stability and enhance its assembly with catalytically inactive SpCas9 (dSpCas9) (70–72).
Figure 3.
SpCas9 sgRNA mutational variants. (A) sgRNA variant in which the entire upper stem is removed and the bulge is replaced by a tetraloop; the variant retains cleavage activity, suggesting that the upper stem may be dispensable. (B) sgRNA variant in which the spacer sequence is truncated from the canonical 20 nucleotides down to 14–15 nucleotides, which allows catalytically active SpCas9 to bind its target DNA without cleaving it. (C) sgRNA variant in which a putative RNAP III terminator sequence is removed from the lower stem by an A-U base pair flip and the upper stem is extended; both changes increase sgRNA stability and enhance its assembly with SpCas9.
sgRNA targeting efficacy has been shown to vary between guides (67,73–78). There are many factors that can contribute to off-target effects, such as chromatin accessibility (78), nucleotide composition of the guide (79,80) and length of the guide (81). Indeed, deletions to the sgRNA have proved beneficial: truncating the 5′ end of the sgRNA so that it retains 17 or 18 nucleotides of complementarity to the target sequence decreases off-target effects. Utilization of truncated sgRNAs (tru-sgRNAs) by RNA-guided nucleases (RGNs) such as SpCas9 (tru-RGNs) decreased undesired cleavage at known off-target sites by several orders of magnitude (81). Truncations of the spacer sequence down to 15 nucleotides abolish SpCas9 cleavage activity, though the enzyme still retains genomic targeting function (81,82) (Figure 3B). Notably, 5′ mismatches and truncations to as few as 11 nucleotides have been shown not to significantly compromise dSpCas9 binding activity (82).
The truncated sgRNA can be used in conjunction with a catalytically active SpCas9 fused to a transactivation domain such as VPR, for multiplexed sgRNA schemes that require both targeted activation and targeted cleavage with a single SpCas9 enzyme (83). Moreover, truncated sgRNAs were utilized for genomic imaging and found to outperform their longer spacer counterparts when targeting repetitive elements (70,72). Other sgRNA augmentations that increased genomic labeling efficiency include mutations that disrupt the putative Pol-III terminator (four consecutive U's) in the sgRNA stemloop, as well as the previously mentioned stemloop extension (70,72) (Figure 3C). Both modifications helped to improve sgRNA-dSpCas9 assembly and increase the sgRNA stability.
The versatility of sgRNA is exemplified by the ability to incorporate separate RNA secondary structures into the sgRNA scaffold without compromising its association with SpCas9. Indeed, Shechner et al. established a platform dubbed CRISPR-Display that utilizes catalytically inactive SpCas9 for targeted localization of large RNA cargos (84). The CRISPR-Display platform enabled the incorporation of large non-coding RNA (lncRNA) domains such as the repressive Xist A-repeat domains (85), and enhancer-transcribed RNAs (86) up to 4.8 kb (87) into the sgRNA. While the effects of lncRNA-mediated transcriptional repression and activation were modest, the CRISPR-Display platform introduced a novel approach for studying lncRNA function.
MS2 bacteriophage coat proteins (MCP) can dimerize and selectively bind a specific RNA hairpin-forming aptamer (88,89). Effector domains such as transcriptional activators, transcriptional repressors and epigenetic demethylators have been fused to the MCP protein. By engineering MS2 loops onto the sgRNA, dCas9 can localize to a specific locus and recruit the corresponding effector fusions (32,42,46,56,82,90–92). MS2 loops have been added to the 3′ end of the sgRNA (Figure 4A), as well as engineered into the tetraloop and 1st hairpin with successful transactivation of target genes (84,90,91) (Figure 4B).
Conversely, 5′ additions of single MS2 or PP7 loops were degraded and failed to achieve transactivation (91), but 5′ additions of multiple tandem MS2 and PP7 loops were found to be intact and achieved modest activation (84). For 3′ appended loops, 2–20 bp long linkers have been shown to function, with increasing linker length correlated to decreased stability and recruitment efficiency. Additions of 3X MS2 loops to the 3′ end have been attempted, but increasing the number of MS2 loops corresponded to a decrease in transactivation and stability (91). To address this problem, a double stranded linker was developed to enhance the stability of the scaffold, which resulted in a notable increase in MCP-VP64-mediated transactivation for sgRNAs bearing two MS2 loops (91). The design was further improved by replacing the second MS2 loop with an in vitro selection-derived aptamer that also binds to MCP so as to reduce misfolding between the aptamers (91,93) (Figure 4C). It is worth noting that this construct was developed and tested in yeast, but such transcriptional control should be achievable in mammalian cells.
Alternatively, Konermann et al. have engineered MS2 loops into the tetraloop and 1st hairpin of the sgRNA separately (sgRNA 1.1 and 1.2, respectively) and within the same construct (sgRNA 2.0), showing substantially increased activity when using both modifications. Another study was able to demonstrate epigenetic editing through utilization of the sgRNA 2.0 construct (90) in which the MCP was fused to the catalytic domain of the Tet1 dioxygenase protein via a flexible linker that enables site directed demethylation of endogenous promoters in mammalian cell cultures (92).
Similar to the MCP-MS2 system are other RNA binding protein-aptamer systems (e.g. PCP-PP7, Com-com and N22-BoxB) that have been used in sgRNA scaffold design. By utilizing several sgRNAs in a multiplexed targeting scheme, combinatorial sets of transcriptional control sgRNAs at endogenous loci can be used to engineer complex regulatory networks (10,59,94). Moreover, RNA aptamer systems with distinct effector fusions can be appended to the same sgRNA molecule and function in parallel without crosstalk (71,72,91). This was exemplified in the CRISPRainbow system in which multiple fluorescent proteins were fused to distinct RNA aptamer-binding proteins with their respective RNA aptamer sequence incorporated into a combination of the tetraloop, the 1st hairpin and/or the 3′ end, resulting in a total of seven different sgRNA species for multiplexed imaging of genomic loci (72) (Figure 4D).
Notably, the multiplexing efforts described here have largely been performed with the catalytically inactive dSpCas9. As mentioned previously, the catalytically active SpCas9 has been shown to bind but not cleave its target DNA sequence when guided by truncated sgRNAs (82,83).
Therefore, orthogonal control can be conferred through the length and the specific sequence of the sgRNA spacer, as well as the type of RNA aptamer incorporated onto the sgRNA secondary structure. This allows for the concurrent transactivation and site-specific DNA cleavage at desired genomic loci using the catalytically active SpCas9 (82,83) (Figure 4E).
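The length-dependent behavior summarized in this section can be written as a small lookup. The Python sketch below is a deliberate simplification that uses only the thresholds quoted in the text (17 nucleotides or more supports cleavage; 15 nucleotides or fewer supports binding without cleavage); the function name `expected_activity` and the category labels are hypothetical, and a 16-nucleotide spacer is reported as unspecified because the text does not describe it.

```python
def expected_activity(spacer: str) -> str:
    """Classify a spacer for catalytically active SpCas9 by length alone.

    Thresholds follow the lengths quoted in the review text; real activity
    also depends on sequence, mismatches and chromatin context.
    """
    length = len(spacer)
    if length >= 17:
        return "cleave"        # canonical 20 nt and 17-18 nt tru-sgRNAs
    if length <= 15:
        return "bind_only"     # binding retained, cleavage abolished
    return "unspecified"       # 16 nt is not described in the text


def split_guides(spacers):
    """Group spacers by expected activity for a multiplexed design."""
    groups = {"cleave": [], "bind_only": [], "unspecified": []}
    for spacer in spacers:
        groups[expected_activity(spacer)].append(spacer)
    return groups
```

In a multiplexed scheme with a single catalytically active SpCas9, guides in the `bind_only` group would be the ones carrying aptamers for effector recruitment, while guides in the `cleave` group direct site-specific cleavage.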
CONCLUSION
The Cas9:sgRNA technology has revolutionized our ability to probe and edit the human genome in vitro and in vivo. The ability to reliably control and modify the human genome is expected to be instrumental toward unraveling disease properties and developing novel therapeutics. Engineering increasingly sophisticated functions in cells will rely on the continued expansion of the CRISPR toolkit and the rational engineering of novel guide RNAs. Novel engineered sgRNA constructs will not only point to a specific genomic address but also define the desired function (82,91). As more diverse CRISPR-associated proteins appear in the literature (17,39,40), we expect current advances and observations on the SpCas9 sgRNAs to readily inform RNA engineering for Cas9 orthologs and other CRISPR-associated RNPs.