V Edwin Hillary1, S Antony Ceasar2. 1. Department of Biosciences, Rajagiri College of Social Sciences, 683 104, Cochin, Kerala, India. 2. Department of Biosciences, Rajagiri College of Social Sciences, 683 104, Cochin, Kerala, India. antony_sm2003@yahoo.co.in.
Abstract
The clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein (CRISPR/Cas) system has transformed life science research, offering enormous options for manipulating, detecting, imaging, and annotating specific DNA or RNA sequences of diverse organisms. This system incorporates fragments of foreign DNA (spacers) into CRISPR cassettes, which are transcribed as CRISPR arrays and then processed to make guide RNA (gRNA). The CRISPR arrays are accompanied by genes that encode Cas proteins. Cas proteins provide the enzymatic machinery required for acquiring new spacers targeting invading elements. Owing to their programmable sequence specificity, numerous Cas proteins such as Cas9, Cas12, Cas13, and Cas14 have been exploited to develop new tools for genome engineering. Cas variants have stimulated genetic research and propelled the CRISPR/Cas tool for manipulating and editing nucleic acid sequences in living cells of diverse organisms. This review provides details on the two classes (class 1 and 2) of the CRISPR/Cas system and on the mechanisms of the Cas proteins discovered so far, including Cas12, Cas13, and Cas14. In addition, we discuss the pros and cons and recent applications of various Cas proteins in diverse fields, including those used to detect viruses like severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). This review enables researchers to gain knowledge of various Cas proteins and their applications, which have the potential to be used in next-generation precise genome engineering.
Genome editing, or gene editing, is a widely used technology in medicine, therapeutic drug development, infectious disease research, and agricultural biotechnology. Genome-editing tools are used to study the precise function of a gene by cutting and altering a programmed locus through insertion, deletion, or replacement of targeted bases. Initially, conventional gene-editing techniques such as homologous recombination (HR) were used for gene inactivation, but HR was extremely inefficient and labor-intensive [1]. Later, targeted gene knock-down by RNA interference (RNAi) gave researchers a rapid, low-cost way to silence a gene of interest and study its function. However, RNAi cannot completely knock down the targeted sequence, produces unpredictable off-target effects, and delivers only temporary or partial inhibition of gene function [2].
Genome editing requires programmable, sequence-specific endonucleases that produce single-stranded breaks (SSBs) or double-stranded breaks (DSBs) at the targeted site, which the endogenous repair mechanisms then fill [3]. These breaks are repaired by one of two major mechanisms: (1) homology-directed repair (HDR) or (2) non-homologous end joining (NHEJ) [4]. Several genome-editing tools were developed earlier to create site-specific DNA breaks, such as meganucleases or homing endonucleases [5], zinc-finger nucleases (ZFNs) [6], and transcription activator-like effector nucleases (TALENs) [7]. However, these tools demand laborious cloning and protein construction to make DSBs, which hinders their routine use in genome editing. In 2012, researchers developed genome-editing tools based on the clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated protein (Cas) system [8].
The CRISPR/Cas system mediates adaptive immunity against phages and plasmids. Because of its simple design, cost-effectiveness, and low labor requirements, the research community quickly adopted the CRISPR/Cas system as a user-friendly and robust RNA-guided DNA-targeting tool for genome editing in various species [8-11]. Based on mechanism, CRISPR systems are divided into two main classes (1 and 2) and six types (I-VI) [12]. Of these, types I to III have been studied extensively, whereas types IV and VI were discovered recently. Types I, II, and V cut DNA, type VI cleaves RNA, type III cleaves both DNA and RNA, and the cleavage activity of type IV has not yet been identified [13].
Apart from targeted cleavage of dsDNA, the Cas9 protein has been used for diverse applications such as fluorescent imaging, base editing, and transcriptional activation. Like Cas9, the Cas10 protein is also used in applications such as fluorescent imaging, base editing, and RNA tracking. As an alternative to Cas9 and Cas10, the Cas12 protein enhanced genome-editing efficiency by targeting only T-rich motifs without requiring tracrRNA. The Cas12 system has thus expanded editing applications such as base editing and the detection of transcriptional variations. The Cas13 protein is also used for diverse applications such as imaging, base editing, and detection of transcriptional variations. Recently, the Cas14 protein has advanced genome-editing efficiency without needing a protospacer adjacent motif (PAM) and performs transcriptional regression and base editing. This review describes the Cas variants of the two classes of CRISPR systems used for genome editing.
History of CRISPR/Cas System
A group of researchers detected CRISPRs while analyzing the gene responsible for isozyme conversion of alkaline phosphatase (iap) in the E. coli K-12 strain [14]. At the 3′ end of the iap gene, they identified a genomic region containing a series of distinctive 32-nucleotide sequences flanked by invariant palindromic repeats [14]. Later, similar sequences were found in other E. coli strains and in Enterobacteria (Shigella dysenteriae and Salmonella enterica) [15]. Similarly, while studying Mycobacterium tuberculosis strains, researchers identified 36 bp repeats interspaced with unique spacers of 35–41 bp [16].
In subsequent work, CRISPR arrays were found in Haloferax mediterranei and Streptococcus thermophilus, and the same arrangement was identified in 90% of bacterial and 40% of archaeal genomes [17]. This unusual genomic arrangement turned out to be the first outline of the CRISPR array, but because the necessary genome sequence data were lacking, the biological function of CRISPR remained elusive. In 2002, Jansen et al. described the CRISPR-associated genes [18]. Following this, several CRISPR/Cas genes were identified. In 2005, spacer sequences were identified in many genomes, and unique spacer regions were found within the CRISPR array [19]. These results showed that CRISPR is an adaptive immune system that defends prokaryotic cells against phage infection through an RNA-guided process.
Classification of CRISPR/Cas System
The CRISPR system has been classified into two major classes. In class 1 systems, RNA-guided target cleavage requires several effector proteins, whereas class 2 systems require only one RNA-guided endonuclease to cleave DNA sequences [12, 20]. Class 1 is divided into types I, III, and IV, and class 2 into types II, V, and VI [21, 22]. In the type I system, the CRISPR/Cas locus contains the Cas3 signature gene, which encodes a large protein with a helicase that unwinds DNA–DNA and RNA–DNA duplexes [23]. The type II locus encodes a multidomain protein that targets and cleaves dsDNA [8]. The type III CRISPR/Cas locus possesses the Cas10 signature gene, which encodes a multidomain protein with a palm domain that targets and cleaves ssDNA [24]. The type IV system contains CRISPR-associated splicing factor 1 (Csf1), which encodes a ribonucleic protein, but the detailed function of this system is yet to be identified [21]. The type V locus possesses the Cas12 signature gene (also known as CRISPR from Prevotella and Francisella 1 (Cpf1), C2c1, or C2c3), which encodes a RuvC domain (named after an E. coli protein involved in DNA repair) that cleaves both dsDNA and ssDNA [25]. Type VI contains Cas13 (C2c2), which encodes higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domains that cleave ssRNA [26] (Table 1).
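The classification described in this section can be captured in a small lookup table. The following Python sketch is only an illustration: the names `CrisprType`, `CRISPR_TYPES`, and `types_in_class` are hypothetical, and the entries simply transcribe what this section and Table 1 state, with "unknown" marking what the text leaves open.

```python
from typing import List, NamedTuple


class CrisprType(NamedTuple):
    crispr_class: int  # 1 = several effector proteins, 2 = one effector protein
    signature: str     # signature protein named in this section
    target: str        # nucleic acid cleaved, as stated in this section


# Entries transcribed from this section; nothing here is new data.
CRISPR_TYPES = {
    "I": CrisprType(1, "Cas3", "DNA"),
    "II": CrisprType(2, "Cas9", "dsDNA"),
    "III": CrisprType(1, "Cas10", "ssDNA"),
    "IV": CrisprType(1, "Csf1", "unknown"),
    "V": CrisprType(2, "Cas12", "dsDNA and ssDNA"),
    "VI": CrisprType(2, "Cas13", "ssRNA"),
}


def types_in_class(crispr_class: int) -> List[str]:
    """Return the type names that belong to the given class."""
    return [name for name, entry in CRISPR_TYPES.items()
            if entry.crispr_class == crispr_class]


def needs_single_effector(type_name: str) -> bool:
    """Class 2 systems need only one RNA-guided endonuclease."""
    return CRISPR_TYPES[type_name].crispr_class == 2


print(types_in_class(1))
print(needs_single_effector("V"))
```

Keeping the class as a field of each type, rather than nesting types under classes, makes it easy to query in either direction.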
Table 1
Classification of the CRISPR/Cas system: protein, target molecule, mechanism of spacer acquisition, pre-CRISPR processing, self vs non-self discrimination, effectors, host organism, and references

| Class | Type | Protein | Target | Spacer acquisition strategy | Name of CRISPR/Cas system | Pre-CRISPR processing | Self vs non-self discrimination | Effectors of CRISPR system | Host organism | References |
|---|---|---|---|---|---|---|---|---|---|---|
| Class 1 | I | Cas3 | ssDNA | Cas1/Cas2/Cas4 | Cas7, Cas5, Cas8, and Cas3 | Cas6 | PAM | Cas3, Cascade, and crRNA | E. coli | [23] |
| Class 1 | III | Cas10 | ssDNA | Cas1/Cas2 | Cas7, Cas5, and Cas1 | Cas6 | CRISPR repeat | Cmr/Csm, crRNA, and Cas10 | S. epidermidis | [24] |
| Class 1 | IV | Csf1 | – | NA | Cas7, Cas5, and Csf1 | – | – | – | – | [21] |
| Class 2 | II | Cas9 | dsDNA | Cas1/Cas2/Cas4 | Cas9 | RNase III and tracrRNA | PAM | Cas9, tracrRNA, and crRNA | S. thermophilus and S. pyogenes | [8] |
| Class 2 | V | Cpf1 | ssDNA and dsDNA | Cas1/Cas2/Cas4 | Cas12 | Cpf1 | PAM | Cpf1, crRNA, and tracrRNA | F. novicida | [25] |
| Class 2 | VI | C2c2 | ssRNA | Cas1/Cas2 | Cas13 | – | – | C2c1 and crRNA | – | [26] |
The CRISPR/Cas system defends against viruses or foreign genetic material in three stages [27] (Fig. 1). In the first stage, protospacers are incorporated into the host CRISPR locus as spacers between crRNA repeats [28]. In the second stage, Cas proteins are expressed, the spacers are transcribed into pre-crRNA, and the pre-crRNA is cleaved by Cas proteins into functional, mature crRNA [28]. In the third stage, the Cas protein recognizes the target with the help of the crRNA and cleaves the invading genome [28]. Many CRISPR systems depend on the presence of a sequence-specific PAM adjacent to the crRNA-specific site within the target genome.
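The three stages can be sketched as a toy string model in Python. This is a deliberately simplified illustration and not a biological simulation: the repeat sequence, the function names, and the 8-nucleotide spacer length are all hypothetical; mature crRNAs are modeled as bare spacers, and PAM recognition is ignored.

```python
REPEAT_DNA = "GTTTTAGA"  # hypothetical repeat sequence, for illustration only


def adapt(spacers, invader_dna, start, length=8):
    """Stage 1 (adaptation): copy a protospacer from the invader into the array."""
    protospacer = invader_dna[start:start + length]
    return [protospacer] + spacers  # the new spacer is placed first in the array


def biogenesis(spacers):
    """Stage 2 (crRNA biogenesis): transcribe the array and cut it into crRNAs."""
    array_dna = REPEAT_DNA + "".join(s + REPEAT_DNA for s in spacers)
    pre_crrna = array_dna.replace("T", "U")  # transcription, modeled as T -> U
    repeat_rna = REPEAT_DNA.replace("T", "U")
    # Simplification: cleavage removes the repeats entirely.
    return [piece for piece in pre_crrna.split(repeat_rna) if piece]


def interfere(crrnas, dna):
    """Stage 3 (interference): return (crRNA, position) of the first match, else None."""
    for crrna in crrnas:
        position = dna.find(crrna.replace("U", "T"))
        if position != -1:
            return crrna, position
    return None


phage = "ATGCCGTACGGATTACCAGGTTCA"
spacers = adapt(["CCATGGCA"], phage, start=5)
crrnas = biogenesis(spacers)
print(crrnas)
print(interfere(crrnas, phage))
print(interfere(crrnas, "AAAAAAAAAAAAAAAA"))
```

A second encounter with the same phage is recognized because its sequence now matches a stored spacer, whereas unrelated DNA yields no match.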
Fig. 1
CRISPR/Cas adaptive immunity system. The three stages, CRISPR adaptation (stage 1), CRISPR RNA biogenesis (stage 2), and CRISPR interference (stage 3), are schematically illustrated. During the adaptation stage, injection of viral genetic material into bacterial cells triggers the Cas1 and Cas2 adaptation module proteins, which cleave the invading sequences into fragments (spacers) that are then incorporated into the CRISPR array. During the CRISPR RNA biogenesis stage, the CRISPR array is transcribed into a precursor of crRNA molecules (pre-crRNA), which is then cleaved into mature crRNAs. These mature crRNAs form effector complexes with Cas proteins. When a foreign genetic sequence matches a CRISPR spacer, the matching crRNA binds the invading strand, which is then cleaved with the help of the Cas nuclease (CRISPR interference stage)
CRISPR/Cas System
Scientists have discovered a novel microbial defense system that protects cells from viruses and mobile genetic elements. This defense mechanism, found in bacteria and archaea, is referred to as the CRISPR/Cas system. By integrating DNA sequences identical to those of previous invaders into their genome, bacteria and archaea generate a cellular memory of them [29]. These acquired sequences allow them to recognize viruses or mobile genetic invaders and degrade the invading sequences, so that the system works as an adaptive immune system [29, 30].
CRISPR immunity proceeds in distinct phases. In the adaptation phase, bacteria and archaea gain cellular memory of the invading phages [30]. Phage genome sequences are integrated into the CRISPR locus of the bacterial or archaeal genome, which is composed of 24 to 47 base pair (bp) repeats separated by spacers. The unique loci of the CRISPR system were first discovered in 1987 [14]. In 2005, Bolotin et al. identified the function of CRISPR (Cas genes encode enzymes that fragment DNA) in protecting bacteria and archaea from foreign invaders [31]. Later, scientists revealed that the CRISPR/Cas system works as an adaptive immune system [32], and it was eventually adopted as a versatile system for RNA-programmable genome editing [8].
Cas Proteins of the CRISPR System
Cas proteins have gained popularity in the research community for broad genome engineering applications and are currently used in diverse fields, including biotechnology, agriculture, and medical research. Recently discovered programmable Cas proteins such as Cas12, Cas13, and Cas14 have improved the precision of CRISPR/Cas-mediated genome editing (Table 2). The mechanisms of these Cas proteins, their pros and cons, and their applications are summarized in detail below.
Table 2
Details of various CRISPR/Cas proteins: host organism, sgRNA size, PAM sequence, target molecule, and cut site

| Protein name | Host organism | sgRNA sequence size | PAM sequence | Target | Cut site | References |
|---|---|---|---|---|---|---|
| Cas9 | S. pyogenes | 20 | 5ʹ-NGG-3ʹ | dsDNA | 5ʹ of PAM | [33] |
| Cas9 | S. pyogenes | – | 5ʹ-NAC, NTG, NTT, and NCG-3ʹ | DNA | 5ʹ of PAM | [34] |
| Cas9 | F. novicida | 20 | 5ʹ-NGG-3ʹ | DNA | 5ʹ of PAM | – |
| Cas9 | S. aureus | 21 | 5ʹ-NNGRRT-3ʹ | DNA | 5ʹ of PAM | [35] |
| Cas9 | Neisseria meningitidis | 24 | 5ʹ-NNNNGATT-3ʹ | DNA | 5ʹ of PAM | [35] |
| Cas9 | S. thermophilus | 20 | 5ʹ-NNAGAAW-3ʹ | DNA | 5ʹ of PAM | – |
| Cas9 | S. thermophilus | 20 | 5ʹ-NGGNG-3ʹ | DNA | 5ʹ of PAM | [36] |
| Cas9 | Campylobacter jejuni | 22 | NNNNACAC and NNNRYAC | DNA | 5ʹ of PAM | [37] |
| C2c1 | Alicyclobacillus acidoterrestris | 20 | T-rich PAM | DNA | 5ʹ of PAM | [38] |
| Cpf1 | Prevotella and Francisella | 20 | TTTV | DNA | 5ʹ of PAM | [25] |
| Cpf1 | Acidaminococcus sp. | 24 | 5ʹ-TTTN-3ʹ | DNA | 3ʹ of PAM | [39] |
| Cas12a | Acidaminococcus sp. | – | Thymine-rich PAM | DNA | 5ʹ of PAM | [37] |
| Cas13 | Lb | 28 | Non-G nucleotide at the 3ʹ protospacer flanking site (PFS) | ssRNA | – | [26] |
| Cas14 | Uncultivated archaea | – | – | ssDNA | – | [40] |
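The PAM sequences in Table 2 are written with IUPAC nucleotide codes (N = any base, R = A or G, Y = C or T, W = A or T, V = A, C, or G). A minimal Python sketch of how such a PAM can be matched against a DNA sequence is given below; the dictionary keys and function names are hypothetical labels chosen for illustration, and only a subset of the PAMs from Table 2 is included.

```python
import re

# IUPAC nucleotide codes used by the PAMs in Table 2, as regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "V": "[ACG]", "N": "[ACGT]",
}

# A subset of the PAMs listed in Table 2, keyed by protein and host.
PAMS = {
    "Cas9 (S. pyogenes)": "NGG",
    "Cas9 (S. aureus)": "NNGRRT",
    "Cas9 (N. meningitidis)": "NNNNGATT",
    "Cas9 (S. thermophilus)": "NNAGAAW",
    "Cpf1 (Prevotella and Francisella)": "TTTV",
}


def pam_pattern(pam):
    """Translate an IUPAC PAM such as 'NNGRRT' into a regular expression."""
    return "".join(IUPAC[base] for base in pam)


def matches_pam(pam, site):
    """True if `site` (same length as `pam`) satisfies the PAM."""
    return re.fullmatch(pam_pattern(pam), site) is not None


def find_pams(pam, sequence):
    """Return every start index where the PAM occurs, including overlaps."""
    lookahead = re.compile("(?=" + pam_pattern(pam) + ")")
    return [match.start() for match in lookahead.finditer(sequence)]


print(matches_pam(PAMS["Cas9 (S. aureus)"], "ACGAGT"))
print(find_pams(PAMS["Cas9 (S. pyogenes)"], "AGGGTCC"))
```

The lookahead in `find_pams` is used so that overlapping PAMs, which are common for short motifs such as NGG, are all reported.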
Cas1 and Cas2 Proteins
Cas1 and Cas2 are generally conserved proteins in the prokaryotic adaptive immune system [28]. CRISPR/Cas system consists of a CRISPR array containing the repeats (∼30 to 40 bp) separated by spacers, an adjacent module comprising Cas1 and Cas2, and the distinctive signature proteins [41].
Mechanism of Cas1 and Cas2 Proteins
Cas1 and Cas2 belong to the Type II CRISPR system found in E. coli [42]. The E. coli Cas1-Cas2 complex mediates spacer acquisition in vivo; however, the molecular mechanism by which these proteins act during immunity is unclear [43]. Cas1 and Cas2 form an integrase complex consisting of two distal Cas1 dimers bridged by a Cas2 dimer [44]. The Cas1 and Cas2 proteins bind the pre-spacer as twin-folded DNA. The pre-spacer is integrated at the leader-proximal end of the CRISPR array, guided by the leader sequence and a pair of inverted repeats inside the CRISPR repeat [45].
Applications of Cas1 and Cas2 Proteins
Cas1 and Cas2 are the conserved proteins among all CRISPR/Cas systems. The molecular mechanism behind the Cas1 and Cas2 proteins is still unclear. Hence, researchers have not applied these proteins for genome-editing in various fields.
Pros and Cons of Cas1 and Cas2 Proteins
So far, no genome editing studies have been demonstrated via Cas1 and Cas2 proteins; therefore, significant merits and demerits of Cas1 and Cas2 proteins should be evaluated before applying these in diverse fields.
Cas9 Protein
Cas9 is a protein associated with the CRISPR adaptive immune system of S. pyogenes and is referred to as SpyCas9. SpyCas9 is a large, multifunctional, multidomain protein of 1368 amino acids that acts as a DNA endonuclease in natural and artificial CRISPR/Cas systems.
Mechanism of Cas9 Protein
The mechanism of Cas9 protein has been studied extensively. Cas9 has six domains: (1) recognition lobe I (REC I), (2) REC II, (3) arginine-rich bridge helix, (4) PAM-interacting, (5) HNH, and (6) RuvC. REC I is the major domain responsible for binding the gRNA; the function of REC II has not been studied [46]. The arginine-rich bridge helix initiates cleavage activity upon binding to targeted sequences. The PAM-interacting domain confers PAM specificity and is responsible for binding the target sequence. HNH and RuvC are the nuclease domains that cleave the target sequence [46, 47] (Fig. 2).
Fig. 2
Schematic illustration of the CRISPR/Cas9 mechanism. A The Cas9 protein contains six domains (recognition lobe I (REC I), REC II, arginine-rich bridge helix, PAM-interacting, HNH, and RuvC). REC I is the major domain responsible for binding the gRNA; the function of REC II has not been studied. The arginine-rich bridge helix initiates cleavage activity upon binding to targeted sequences. The PAM-interacting domain confers PAM specificity and is responsible for binding the target sequence. HNH and RuvC are the nuclease domains that cleave the target sequence. In the absence of gRNA, the Cas9 protein remains inactive. B The programmed gRNA binds Cas9 and induces changes in the protein that convert inactive Cas9 into its active form. Once activated, Cas9 searches for the target sequence by binding sequences that match the PAM (5′-NGG-3′). Cas9 then generates DSBs 3 bp upstream of the PAM using its HNH and RuvC domains
In the absence of gRNA, the Cas9 protein remains inactive. The engineered gRNA forms a T-shape comprising one tetraloop and three stem-loops [8, 46]. The gRNA is programmed so that its 5′ end is complementary to the target sequence. Binding of the programmed gRNA induces conformational changes in Cas9 that convert the inactive protein into its active form. Once activated, Cas9 searches for the target sequence by binding sequences that match the PAM (5′-NGG-3′), and then cuts the dsDNA 3 bp upstream of the PAM using its HNH and RuvC domains. The HNH domain cleaves the DNA strand that is complementary to the 20-nucleotide guide sequence of the crRNA (the target strand), and the RuvC domain cleaves the opposite strand (the non-target strand). To cleave target DNA, the SpyCas9 system recognizes a short 'seed' sequence adjacent to a PAM containing the 5′-NGG-3′ dinucleotide [8]. The dual tracrRNA:crRNA was fused into a single guide RNA (sgRNA), which directs the CRISPR/Cas9 system to cut the targeted dsDNA or ssDNA sequences [8].
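The targeting rule described above (a 20-nucleotide target immediately followed by a 5′-NGG-3′ PAM, with the cut 3 bp upstream of the PAM) can be sketched in a few lines of Python. This is a minimal illustration rather than a guide-design tool: the function names are hypothetical, only the strand that is passed in is scanned, and the caller is assumed to scan the reverse complement separately for sites on the other strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement, for scanning the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_sites(seq, guide_len=20):
    """List (protospacer, pam, cut_index) for every NGG-adjacent site in seq.

    cut_index is the position in seq where the blunt cut falls,
    3 bp upstream of the PAM.
    """
    sites = []
    for pam_start in range(guide_len, len(seq) - 2):
        pam = seq[pam_start:pam_start + 3]
        if pam[1:] != "GG":  # the first PAM base (N) can be anything
            continue
        protospacer = seq[pam_start - guide_len:pam_start]
        sites.append((protospacer, pam, pam_start - 3))
    return sites


def guide_rna(protospacer):
    """Guide sequence as RNA: same bases as the protospacer, with U for T."""
    return protospacer.replace("T", "U")


dna = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for protospacer, pam, cut in find_spcas9_sites(dna):
    print(guide_rna(protospacer), pam, cut)
```

The guide is reported with the same sequence as the protospacer on the scanned strand, because the crRNA pairs with the opposite (target) strand.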
Application of Cas9 Protein
Cas9 protein holds great promise for efficient and targeted genome engineering in research, medicine, and biotechnology. Because the genome of many species can be modified simply by designing a gRNA sequence, Cas9 enables large-scale genome editing experiments to probe genome function more readily than traditional gene-editing techniques such as TALENs and ZFNs. Further, Cas9 can be converted into an RNA-guided homing device (dCas9) by inactivating its nuclease domains to slow down transcription. Fusing effectors to Cas9 proteins or sgRNAs to alter the transcription state of specific genomic loci, or to rearrange the genome, can significantly expand the genome engineering modifications achievable with the CRISPR/Cas9 system. The CRISPR/Cas9 system has been widely adopted by researchers and applied in diverse organisms, including microbes [48], plants [49-51], animals [52], insects [53, 54], and human cell lines [55] (summarized in Fig. 3).
Fig. 3
Applications of the CRISPR/Cas9 system. The CRISPR/Cas9 system has revolutionized genome engineering: its accuracy, rapidity, and affordability permit its use in a nearly limitless range of applications. Since its discovery, researchers have been using the CRISPR/Cas9 system to cure diseases, discover new treatments, and advance precision medicine. Beyond treating human diseases, CRISPR/Cas9 is also being used to study the biology of model and non-model insects, for somatic genome editing, for manufacturing biofuels, and for engineering better crops (rice, wheat, etc.). These advances made possible by the invention of the CRISPR/Cas9 system will change the lives of people globally
Pros and Cons of Cas9 Protein
The main advantage of the CRISPR/Cas9 system is its simple design and higher efficiency compared with the existing ZFN and TALEN systems [8]. Multiplexed genome editing is another significant advantage of Cas9 and can be achieved by using multiple sequence-specific gRNAs simultaneously [8].
Despite numerous advances, the CRISPR/Cas9 system has several disadvantages and has raised questions about the risks associated with editing. For example, the gRNA that guides Cas9 to cleave dsDNA or ssDNA can produce off-target as well as on-target mutations in the targeted organisms [9, 55]. Even the best available CRISPR/Cas system that uses HDR induces undesirable mutations. These off-target effects can be reduced by using a modified Cas9 version, the Cas9 nickase (nCas9), which nicks only one strand rather than generating DSBs [55]. However, nCas9 cannot eliminate off-target effects entirely, so upgraded Cas9 versions will be required in the future.
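Off-target risk is commonly screened in silico by looking for genomic sites that resemble the guide. The Python sketch below illustrates one simple version of this idea under stated assumptions: it counts mismatches position by position, considers only sites followed by an NGG PAM on the scanned strand, and uses a hypothetical threshold of three mismatches. It is not the scoring method of any particular tool, and the function names are invented for illustration.

```python
def count_mismatches(guide, site):
    """Number of positions at which guide and site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide, site) if g != s)


def find_candidate_sites(guide, genome, max_mismatches=3):
    """Return (position, site, mismatches) for NGG-adjacent sites close to the guide.

    A result with 0 mismatches is the intended on-target site; results with
    1..max_mismatches mismatches are candidate off-target sites.
    """
    n = len(guide)
    hits = []
    for i in range(len(genome) - n - 2):
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":  # require an NGG PAM right after the site
            continue
        site = genome[i:i + n]
        mismatches = count_mismatches(guide, site)
        if mismatches <= max_mismatches:
            hits.append((i, site, mismatches))
    return hits


guide = "GACGTTAGCCATGCAATCGT"
near_match = "GACATTAGCCATGCACTCGT"  # differs from the guide at two positions
genome = "AA" + guide + "TGG" + "CATCAT" + near_match + "AGG" + "TT" + guide + "TAA"
for position, site, mismatches in find_candidate_sites(guide, genome):
    print(position, site, mismatches)
```

In this toy genome the last copy of the guide sequence is not reported, because it is not followed by an NGG PAM.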
Cas12 Protein
Cas12 is a versatile protein with more dynamic applications, including epigenome editing. It belongs to the type V CRISPR system [56] and has recently emerged as an RNA-guided DNA endonuclease that offers an alternative to the Cas9 protein for genome editing [25]. Cas12 proteins were isolated from Acidaminococcus species (AsCas12a) and Lachnospiraceae bacterium (LbCas12a), where they fight invading viruses. Cas9 protein discriminates more accurately against mismatches within the initial ~10 bp of the RNA–DNA helix proximal to the PAM sequence [25]. Moreover, unlike Cas9, Cas12 can process the precursor crRNA by itself and does not require tracrRNA or RNase III. This property has enabled researchers to use Cas12 for multiplex genome editing [25].
Mechanism of Cas12 Protein
The Cas12 protein requires only a crRNA to cut ssDNA and dsDNA efficiently. It contains RuvC and nuclease lobe (NUC) domains for cleavage activity [56]. Like Cas9, Cas12 recognizes a potential target site next to a PAM sequence. On encountering such a site, it initiates an R-loop, in which the crRNA base-pairs with the target DNA strand. During this step, Cas12 matches < 17 bp of the target sequence, which leads to R-loop formation [47]. Once the R-loop has formed, Cas12 uses its active RuvC domain to cleave the non-target strand with the help of the PAM sequence [56] (Fig. 4). However, the role of the RuvC domain of Cas12 in cleaving the target DNA strand has not yet been clearly studied.
Fig. 4
Schematic illustration of the CRISPR/Cas12 mechanism. The Cas12 protein requires only a crRNA to generate DSBs. Cas12 cleaves the target region next to a PAM sequence (CTA, TTN, TTTN) with the help of the RuvC and nuclease lobe (NUC) domains. On encountering a target site, it initiates an R-loop, in which the crRNA base-pairs with the target DNA strand. During this step, Cas12 matches < 17 bp of the target sequence, which leads to R-loop formation. Once the R-loop has formed, Cas12 uses its active RuvC domain to generate a staggered cut in the non-target strand with the help of the PAM sequence
Application of Cas12 Protein
CRISPR/Cas12 is a more concise system: it cleaves dsDNA or ssDNA via the RuvC domain and does not require tracrRNA. Recently, Doudna's group introduced a novel CRISPR/Cas diagnostic tool based on the Cas12 protein, termed DNA endonuclease-targeted CRISPR trans-reporter (DETECTR) (Fig. 5b) [57]. DETECTR uses a type V enzyme to cleave ssDNA in a three-stage process: (1) the Cas12a protein and a target-specific crRNA are combined with DNA reporter probes; once the crRNA recognizes its target sequence, Cas12a turns on its collateral activity and cleaves the target ssDNA or dsDNA. (2) The DNA probes are bound to a fluorophore and a quencher molecule. (3) Degradation of the DNA probes separates the fluorophore from the quencher, producing a robust fluorescent signal that reports cleavage of the targeted ssDNA or dsDNA [58]. In addition, this system detected even a single molecule of viral particles (Fig. 5b). For instance, Chen et al. (2018) used DETECTR to detect human papillomaviruses (HPV) [57]. In brief, they combined the non-specific ssDNA cleavage activity of Cas12 with DETECTR and differentiated several HPV16 and HPV18 strains in crude DNA extracts of clinical samples within 1 h. Furthermore, researchers detected African swine fever virus (ASFV) using the CRISPR/Cas12 system [59]. They combined a fluorescence-based point-of-care (POC) system with CRISPR/Cas12 and detected ASFV within 2 h. From this result, they reported that CRISPR/Cas12 is highly specific and can detect differences as small as a single nucleotide in a targeted virus [59].
Recently, DETECTR has also been used to detect the novel severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Mammoth Biosciences, Inc. targeted the nucleocapsid (N) and envelope (E) genes of SARS-CoV-2 and achieved virus detection within 1 h [60]. In brief, they generated Cas12 gRNAs that specifically target SARS-CoV-2, optimized the DETECTR assay for the E and N genes, and used this upgraded assay to detect SARS-CoV-2 on a lateral flow strip within 1 h. Similarly, researchers from Argentina and CASPR Biotech used a quick and portable SARS-CoV-2 diagnostic method based on the CRISPR/Cas12 system [61]. They collected saliva samples from COVID-19 patients and reported that the naturally occurring proteins in saliva had no inhibitory effect on the CRISPR/Cas12-based paper strip assay [61]. Additionally, a Chinese institution used the CRISPR/Cas12-based DETECTR system to confirm clear detection of SARS-CoV-2 [62]. They engineered gRNAs targeting the orf1a, orf1b, N, and E genes of SARS-CoV-2 (Wuhan-Hu-1 strain) and related viruses, in which they detected single nucleotide polymorphisms (SNPs). From these results, the researchers stated that CRISPR/Cas12-based detection could be employed for the efficient and rapid diagnosis of COVID-19. Jiang et al. (2021) recently developed a magnetic-pull-down-assisted colorimetric (M-CDC) technique coupled to the CRISPR/Cas12 system to detect SARS-CoV-2, using gold nanoparticle (AuNP) probes. They screened 41 viral samples and reported that M-CDC is a useful technique for screening SARS-CoV-2 variants without requiring advanced instruments [63]. These applications of the Cas12 nuclease have enabled scientists to broaden the scope of genome editing in various fields, including the diagnosis of novel viruses [64-66].
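The logic of a DETECTR-style readout can be illustrated with a toy Python model. Everything numerical here is an assumption made for illustration: the function names are hypothetical, target recognition is reduced to finding the crRNA spacer sequence immediately 3′ of a TTTV PAM on the scanned strand, and reporter cleavage is modeled as a first-order process with an arbitrary rate constant rather than measured kinetics.

```python
import math
import re

# TTTV PAM as a lookahead so that overlapping occurrences are all examined.
TTTV = re.compile("(?=TTT[ACG])")


def target_recognized(sample_dna, spacer):
    """True if the spacer sequence lies directly 3' of a TTTV PAM in the sample."""
    for match in TTTV.finditer(sample_dna):
        if sample_dna.startswith(spacer, match.start() + 4):
            return True
    return False


def reporter_signal(sample_dna, spacer, minutes, rate_per_min=0.1):
    """Fraction of fluorophore-quencher reporters cleaved after `minutes`.

    Collateral cleavage is switched on only when the target is recognized;
    rate_per_min is a hypothetical constant, not a measured value.
    """
    if not target_recognized(sample_dna, spacer):
        return 0.0
    return 1.0 - math.exp(-rate_per_min * minutes)


spacer = "GATCGTTACGCTAGCTAGCA"
positive_sample = "GGCA" + "TTTA" + spacer + "CC"
negative_sample = "GGCA" + "GGGA" + spacer + "CC"  # same sequence, no PAM

print(round(reporter_signal(positive_sample, spacer, minutes=60), 3))
print(reporter_signal(negative_sample, spacer, minutes=60))
```

The model captures only the qualitative behavior described above: no target recognition means no collateral cleavage and therefore no fluorescence, while a recognized target gives a signal that grows with time.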
Fig. 5
a and b Mechanism of the SHERLOCK and DETECTR systems. (A) Targeted double-stranded DNA (dsDNA) or RNA is amplified with recombinase polymerase amplification (RPA) or reverse transcription (RT)-RPA. The RPA is coupled with T7 transcription to convert the amplified target to RNA for detection by the Cas13 system. This amplification step, combined with the reporter probe, enables specific high-sensitivity enzymatic reporter unlocking (SHERLOCK) to detect the targeted sequence. (B) In DNA endonuclease-targeted CRISPR trans reporter (DETECTR), DNA is amplified with RPA. The Cas12 system pairs with the single-stranded DNA (ssDNA) of interest, and the DNase activity of the Cas12 system is initiated. This amplification step, combined with the reporter probe, enables DETECTR to detect the targeted sequence
Pros and Cons of Cas12 Protein
Like Cas9, the Cas12 protein has also been considered a stand-alone member of the CRISPR family for genome editing. In most circumstances, however, Cas12 was deemed superior to Cas9 because it generates staggered DSBs and promotes the HDR repair mechanism rather than both NHEJ and HDR [67]. The Cas12 system also overcomes disadvantages associated with existing diagnostic strategies. For example, diagnosing SARS-CoV-2 by quantitative RT-PCR (qRT-PCR) takes longer to return results [66], whereas the CRISPR/Cas12-based DETECTR detects SARS-CoV-2 within 1 h [66]. These results demonstrate an advantage of the CRISPR/Cas12 system, which could also be used to detect newly emerging viruses in the future.
The rapid advancement of the CRISPR/Cas12 system for genome editing has proved revolutionary for the life sciences. Despite its wide range of applications, the CRISPR/Cas12 system has several significant limitations. For example, it depends on the host cell's DNA repair machinery, with or without a repair template [68]. Although the system has been used successfully to insert DNA accurately into desired genomic loci, its effectiveness varies with cell type. DNA repair through HDR is also associated with active cell division, making these tools ineffective in non-dividing cells (for example, neurons) [69]. Ongoing research aims to tailor the Cas12 system further to ensure accurate DNA insertion into the targeted genome. Apart from this drawback, the system is widely applicable, and continuing efforts aim to generate enhanced CRISPR/Cas12 tools for robust genome engineering.
Cas13 Protein
The most recently discovered Cas protein is Cas13. The CRISPR/Cas13 system functions as an ‘adaptive’ immune system in archaea and bacteria to defend against invading RNA elements [70]. The Cas13 protein family contains two subtypes: (1) Cas13a from the bacterium Leptotrichia shahii (LshCas13a), which was formerly known as C2c2 and belongs to type VI, and (2) Cas13b from Prevotella sp. (PspCas13b), which also belongs to the type VI CRISPR/Cas system. This system targets and cleaves only ssRNA, not ssDNA or dsDNA [71].
Mechanism of Cas13a Protein
Like Cas12, the Cas13a protein is activated by a single crRNA generated through pre-crRNA processing. The Cas13a complex comprises the crRNA, NUC lobes, and two higher eukaryotes and prokaryotes nucleotide-binding (HEPN) RNase domains for targeting RNA (Fig. 6). LshCas13a cleaves ssRNA upon recognizing a target sequence (22–28 nt) complementary to the crRNA spacer [72, 73]. The target sequence is flanked at the 3′-end by a protospacer-flanking site (PFS), which is biased toward adenosine (A), uracil (U), and cytosine (C). LshCas13a and the crRNA bind together and cleave the target region of the ssRNA without a tracrRNA.
Fig. 6
Schematic illustration of the CRISPR/Cas13a mechanism. The Cas13a protein is activated by a single crRNA. The Cas13a complex comprises the crRNA, NUC lobes, and two higher eukaryotes and prokaryotes nucleotide-binding (HEPN) RNase domains for targeting RNA. Cas13a cleaves ssRNA upon recognizing a target sequence (22–28 nt) complementary to the crRNA spacer. The target sequence is flanked by a protospacer-flanking site (PFS) at the 3′-end; Cas13a and the crRNA bind together and cleave the target region of the ssRNA without a tracrRNA
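The targeting rule described for LshCas13a (a 22–28 nt target followed at the 3′-end by a PFS biased toward A, U, or C) can be sketched as a simple sequence scan. This is only an illustrative sketch: the function names, the default window length, and the strict exclusion of a G at the PFS position are assumptions made here, not a published guide-design method.

```python
from typing import Iterator, Tuple

# Assumption taken from the text: a 3' PFS of A, U, or C is permissive.
PERMISSIVE_PFS = frozenset("AUC")
_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def crrna_spacer(protospacer: str) -> str:
    """Return the reverse complement (RNA alphabet) of a target protospacer."""
    return "".join(_COMPLEMENT[base] for base in reversed(protospacer))


def find_cas13a_sites(rna: str, spacer_len: int = 28) -> Iterator[Tuple[int, str, str]]:
    """Yield (start, protospacer, pfs) for each window followed by a permissive PFS."""
    if not 22 <= spacer_len <= 28:
        raise ValueError("the text gives a target length of 22-28 nt")
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    for start in range(len(rna) - spacer_len):
        pfs = rna[start + spacer_len]  # base immediately 3' of the window
        if pfs in PERMISSIVE_PFS:
            yield start, rna[start:start + spacer_len], pfs
```

Each reported window could then be passed to `crrna_spacer` to obtain a complementary spacer sequence. Real guide selection would also need to consider target accessibility and off-target matches, which this sketch ignores.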
Mechanism of Cas13b Protein
The Cas13b protein is more precise than the Cas13a protein because its RNA target must be flanked by a PFS of A, U, or G at the 5′ end and by a PAM (NAN/NNA) at the 3′ end [74]. The Cas13b protein is associated with the mature crRNA. The CRISPR/Cas13b complex searches for the target ssRNA and induces conformational changes at the ssRNA target, resulting in nonspecific RNA cleavage [75]. The mechanism of the Cas13b protein is not yet fully understood, but scientists have tested its ability to perform RNA editing (Fig. 7).
Fig. 7
Schematic illustration of the CRISPR/Cas13b mechanism. The Cas13b protein is associated with the mature crRNA. This CRISPR/Cas13b complex searches for the target ssRNA and induces precise conformational changes at the ssRNA target with the help of the protospacer-flanking site (PFS), which flanks the RNA target at the 5′ end, and the PAM sequence (NAN/NNA) at the 3′ end, resulting in nonspecific RNA cleavage
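The double flanking requirement stated above for Cas13b (A, U, or G at the 5′ end and NAN/NNA at the 3′ end) can be expressed as a small check. The function name and the decision to treat both conditions as strictly required are assumptions made for illustration; the rule itself is taken only from the text.

```python
def cas13b_flanks_ok(five_prime_base: str, three_prime_triplet: str) -> bool:
    """Check the flanking rule from the text: 5' base in A/U/G and 3' triplet NAN or NNA."""
    five = five_prime_base.upper().replace("T", "U")
    three = three_prime_triplet.upper().replace("T", "U")
    if len(five) != 1 or len(three) != 3:
        raise ValueError("expected one 5' base and a three-base 3' motif")
    five_ok = five in "AUG"  # i.e. anything except C
    three_ok = three[1] == "A" or three[2] == "A"  # NAN or NNA
    return five_ok and three_ok
```

Compared with the single 3′ PFS check for Cas13a, requiring both flanks to match leaves fewer permissible sites, which is one way to read the statement that Cas13b targeting is more constrained.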
Application of Cas13 Protein
Cas13 proteins have wide applications in various genome-editing and diagnostic fields. Beyond applications such as base editing, Cas13a is also used for single-nucleotide detection at any site on the target sequence [25]. Like the DETECTR system, which depends on the activity of the Cas12 protein, the specific high-sensitivity enzymatic reporter unlocking (SHERLOCK) technique depends on the Cas13 protein (Fig. 5a) [76, 77]. The SHERLOCK system works by combining the targeted RNA fragments with Cas13 crRNA and fluorescent RNA probes. If the corresponding target sequence is present in the sample, Cas13 recognizes it via the crRNA and cleaves the fluorescent RNA probes, separating the fluorophore from the quencher. The target is thus detected via a fluorescent signal [73]. In addition, when SHERLOCK is combined with reverse transcription-recombinase polymerase amplification (RT-RPA) or isothermal RPA, the crRNA-Cas13a complex binds and cleaves the target sequence with high specificity [72]. SHERLOCK is used for various purposes, including RNA detection, sensitive detection of nucleic acid contamination, and general RNA/DNA quantitation in place of specific qPCR assays [77].
Additionally, the Cas13 nuclease could potentially track allele-specific expression of transcripts or disease-associated mutations in cells. These attractive features opened new avenues for discriminating single-nucleotide changes with high sensitivity in many organisms, including viruses (Fig. 5a). For instance, Gootenberg et al. (2018) detected E. coli and Pseudomonas aeruginosa using DNA isolated from these strains with the SHERLOCK system. They also distinguished isolates of Klebsiella pneumoniae carrying two types of resistance genes, K. pneumoniae carbapenemase (KPC) and New Delhi metallo-β-lactamase 1 [70]. Furthermore, SHERLOCK differentiated African and American strains of the Zika virus, dengue virus (DENV) serotypes, and cancer-related DNA targets.
In addition, SHERLOCK detected Zika virus and DENV with higher sensitivity within 30 min. Researchers also developed a unique technique for detecting viral infection by combining SHERLOCK with heating unextracted diagnostic samples to obliterate nucleases (HUDSON) [75]. Within 2 h, the updated SHERLOCK identified DENV in a patient's saliva, blood, and serum. Building on these results, SHERLOCK Biosciences, Inc. and Mammoth Biosciences, Inc. used the SHERLOCK technique to detect SARS-CoV-2. They targeted the S and Orf1ab genes and detected the SARS-CoV-2 RNA sequence within an hour. Metsky et al. (2020) also detected SARS-CoV-2 RNA using Cas13a proteins, combining synthetic RNA targets with fluorescent visualizers [76]. To test the effectiveness of SARS-CoV-2 detection, Rauch et al. (2020) recently used the Cas13-based, Rugged, Equitable, Scalable Testing (CREST) method [77]. This technology uses portable, lightweight LED visualizers built with plastic filters. Researchers had also used lateral flow immunochromatography strips to detect target RNAs, but these strips are expensive. They therefore built on a P51 cardboard fluorescence visualizer, which was less expensive and could detect 10 copies of target RNA per millilitre. They also used a smartphone camera to capture data, which were then uploaded to the cloud to enable point-of-care (POC) testing [77]. This approach improves CRISPR-based nucleic acid detection and demonstrates its power to change diagnostic and surveillance efforts through direct screening, even of a vast population. Later, Tian et al. (2020) developed a CRISPR/Cas13a test based on single-molecule RNA diagnosis, which eliminated the reverse transcription and nucleic acid amplification steps [78]. They stated that SARS-CoV-2 would be easily detectable using the CRISPR/Cas13a test [78, 79].
These advantages of the Cas13 protein have enabled researchers to target previously non-targetable sequences in diverse organisms [80-82].
Pros and Cons of Cas13 Protein
Like Cas9 and Cas12, CRISPR/Cas13 is a robust, precise, and versatile system, in this case for targeting RNA, and it opens novel research horizons in diverse fields. Compared with earlier RNA manipulation systems, CRISPR/Cas13 offers numerous advantages. For example, its modular construction, consisting of a single protein effector module and an RNA guide module, allows easy and quick design as well as significant scalability, since whole libraries of guide RNAs can be produced [83]. The recently discovered Cas13 mutant versions (dCas13, Cas13x), which function as programmable RNA-binding proteins, efficiently direct different effectors to specific RNAs to induce specific mutations [83]. Owing to its inherent crRNA biogenesis, Cas13 can target multiple RNAs precisely. Unlike RNAi, CRISPR/Cas13-mediated modifications are not limited to cytoplasmic transcripts. In addition, Cas13 enables faster downregulation of gene expression by directly knocking down cytoplasmic mRNA transcripts [83]. Recently, CRISPR/Cas13-based SHERLOCK and SHERLOCKv2 have also played a vital role in developing novel molecular diagnostic tools to detect viruses, including SARS-CoV-2. Despite these major advantages, CRISPR/Cas13 suffers from off-target effects [55], which are the major drawback of this system. Future research should help overcome this obstacle and aid in developing novel RNA knockdown approaches with greater specificity and efficiency.
Cas14 Protein
Doudna’s group searched for further Cas systems in nature similar to the known types (Cas9, Cas12, and Cas13) [84]. They created a metagenomic database of bacterial genomes and searched it for uncharacterized Cas genes [84]. Surprisingly, they found genes encoding Cas14, a smaller Cas protein with a molecular weight of 40–70 kDa. At 400–700 amino acids, Cas14 is considerably smaller than the other characterized Cas proteins [84]. Doudna’s lab reported that this compact Cas14 protein can target ssDNA without a PAM [84].
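The two size figures quoted above (400–700 amino acids and 40–70 kDa) can be cross-checked with a rough calculation. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the source; the function name is likewise only illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption, not from the source).
AVG_RESIDUE_DA = 110


def approx_mass_kda(n_residues: int) -> float:
    """Estimate protein mass in kDa from residue count using the average residue mass."""
    return n_residues * AVG_RESIDUE_DA / 1000
```

With this approximation, 400 and 700 residues correspond to about 44 and 77 kDa, which is broadly in line with the quoted 40–70 kDa range.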
Mechanism of Cas14 Protein
Cas14 cleaves ssDNA and confers immunity against viruses with ssDNA genomes and against mobile genetic elements (MGEs) [84]. The Cas14 protein recognizes ssDNA, mediates seed sequence interaction with the target, and cleaves ssDNA but not dsDNA or ssRNA. Like Cas9, the Cas14 protein requires both a tracrRNA and a crRNA to target ssDNA (Fig. 8). Cleavage by Cas14 is more specific than that of the Cas9, Cas12, and Cas13 proteins and does not require a PAM region [40]. This system therefore appears to meet the criteria for high-fidelity genome editing.
Fig. 8
Schematic illustration of the CRISPR/Cas14 mechanism. The Cas14 protein requires both a tracrRNA and a crRNA to target ssDNA. Cas14 recognizes the ssDNA with the help of the tracrRNA and crRNA, mediates seed sequence interaction with the target ssDNA, and cleaves ssDNA but not dsDNA or ssRNA. Cleavage by Cas14 is more specific than that of the Cas9, Cas12, and Cas13 proteins and does not require a PAM region
Application of Cas14 Protein
The CRISPR/Cas14 system is now regarded as superior to the Cas13 system. Researchers have combined the CRISPR/Cas14 system with DETECTR as a diagnostic approach for high-fidelity detection of ssDNA. Harrington et al. (2018) first used the CRISPR/Cas14 system with the DETECTR technique. They designed gRNA targeting the HECT and RLD Domain Containing E3 Ubiquitin Protein Ligase 2 (HERC2) gene in human saliva samples from individuals carrying the blue-eye single nucleotide polymorphism (SNP). Cas14 showed strong activation upon recognizing the blue-eye SNP, whereas the Cas12 system failed to detect it [40]. This result represents a cost-effective method for screening pathogenic mutations and provides a great opportunity for mapping candidate genes associated with different pathogens. Another group proposed Cas14 for viral diagnostics in combination with simplified nucleic acid extraction, which avoids complicated sample processing such as heating unextracted diagnostic samples to obliterate nucleases (HUDSON). The results suggested that Cas14 could provide an advanced platform for viral screening in the future [72, 79]. This novel CRISPR/Cas14 system meets the criteria of being user-friendly, sensitive, and highly specific for future genome engineering. It can also be harnessed for other purposes, including diagnostic applications such as phylogenetic association (new viruses), epidemiological association with other pathogens, and taxonomic analysis [85].
Pros and Cons of Cas14 Protein
The Cas14 protein, which edits dsDNA, ssDNA, and RNA, holds various advantages over the traditional Cas9 protein. For example, at around 500 amino acids the Cas14 protein is very small, so it could be delivered to target tissues more easily than the Cas9 protein [86]. Additionally, the selectivity of the Cas14 protein improves the fidelity of single nucleotide polymorphism (SNP) detection [83]. Finally, the Cas14 protein has less restrictive PAM requirements (it uses only T-rich sequences) [83], allowing it to edit more genomic target sequences than the Cas9 protein. However, only a few studies have been reported so far, and more extensive work is needed to establish Cas14 in different fields of research.
Conclusion and Future Perspectives
The exploitation of CRISPR/Cas tools has produced a breakthrough in genome engineering in the last few years. The discovery of Cas proteins has revolutionized natural science research and enabled advances in basic research, therapeutics, and diagnostics. Ease of use, the need for very little equipment, and low cost have led many laboratories to use CRISPR/Cas-based systems to study gene function in various organisms. The initial discovery of the Cas9 protein, which took the spotlight in the CRISPR system, encouraged researchers to look for other Cas variants such as the Cas12, Cas13, and Cas14 proteins. These advances in Cas proteins have provided exciting platforms in genome engineering, including detecting RNA viruses from plants, animals, and humans and treating infections. Several studies are currently underway to find novel Cas variants in nature. However, numerous aspects of the mechanisms of Cas proteins are still insufficiently understood. Therefore, understanding the molecular mechanisms behind the Cas proteins, identifying PAM-less Cas proteins, and achieving accurate targeting specificity to minimize off-target effects will increase the potential of Cas proteins for precise genome engineering. In addition, identifying the uncharacterized advanced Cas proteins that may still exist in many bacteria or archaea could revolutionize multiple fields, including the diagnosis of novel viruses, therapeutics, agriculture, and breeding. If these proteins are fully characterized and their genome-editing abilities improved, this might kick-start a new CRISPR “fever”, bringing influential and groundbreaking CRISPR technologies into mainstream use in the near future.
Authors: Kirill A Datsenko; Ksenia Pougach; Anton Tikhonov; Barry L Wanner; Konstantin Severinov; Ekaterina Semenova Journal: Nat Commun Date: 2012-07-10 Impact factor: 14.919
Authors: Blake Wiedenheft; Kaihong Zhou; Martin Jinek; Scott M Coyle; Wendy Ma; Jennifer A Doudna Journal: Structure Date: 2009-06-10 Impact factor: 5.006