Adnan Asadbeigi, Milad Norouzi, Mohammad Sadegh Vafaei Sadi, Mojtaba Saffari, Mohammad Reza Bakhtiarizadeh.
Abstract
The efficiency of a CRISPR-Cas system depends heavily on a well-designed CRISPR RNA (crRNA). To facilitate the use of various types of CRISPR-Cas systems, computational tools are needed that design crRNAs across different CRISPR-Cas systems and provide off-target analysis. Numerous crRNA design tools have been developed, but nearly all of them are dedicated to designing crRNAs for genome editing. Hence, we developed CaSilico, a tool that meets the needs of both beginners and experts and was motivated by the limitations of current tools in designing crRNAs for the Cas12, Cas13, and Cas14 CRISPR-Cas systems. The tool considers a comprehensive list of the principal rules, not yet well described, for designing crRNAs for these types. Using important features such as mismatch tolerance rules, self-complementarity, GC content, frequency of the cleaving base around the target site, target accessibility, and the PFS (protospacer flanking site) or PAM (protospacer adjacent motif) requirement, CaSilico searches for all potential crRNAs in a user-input sequence. These features help users rank all crRNAs for a sequence and make an informed decision about whether a crRNA is suited to an experiment. The tool is flexible enough to let users tune key parameters governing crRNA design and off-target identification, which can increase the chances of a successful CRISPR-Cas experiment. CaSilico outperforms previous crRNA design tools in the following aspects: 1) supporting any reference genome/gene/transcriptome for which a FASTA file is available; 2) designing crRNAs that simultaneously target multiple sequences through conserved region detection among a set of sequences; 3) considering new CRISPR-Cas subtypes; and 4) reporting a list of different features for each candidate crRNA, which can help the user select the best one. Given these capabilities, CaSilico addresses end-user concerns arising from the use of sophisticated bioinformatics algorithms and has a wide range of potential research applications, especially in the design of crRNAs for pathogen diagnosis. CaSilico was successfully applied to design crRNAs for different genes in the SARS-CoV-2 genome, and some of these crRNAs have been experimentally tested in previous studies.
Keywords: Cas12; Cas13; gene editing; genome engineering; guide RNA
Year: 2022 PMID: 36017348 PMCID: PMC9395711 DOI: 10.3389/fbioe.2022.957131
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Summary of commonly used crRNA design tools.
| Tool | Cas enzyme | Species support | Sequence ID* search | Off-target search | Result visualization | Reference |
|---|---|---|---|---|---|---|
| CRISPOR | Cas9 and Cas12 variants | Many | No | Yes | Yes | |
| CHOPCHOP | Cas9, Cas12, and Cas13 variants | Many | Yes | Yes | Yes | |
| Cas13design | Cas13d variant | Many | Yes | Yes | No | |
| E-CRISP | SpCas9 variant | Many | Yes | Yes | No | |
| CRISPick | SpCas9, SaCas9, AsCas12a, and enAsCas12a variants | Human, mouse, and rat | Yes | Yes | No | |
| GUIDES | SpCas9 variant | Human and mouse | Yes | Yes | Yes | |
| Microsoft Research CRISPR | SpCas9 variant | Human | Yes | Yes | No | |
| Cas-OFFinder | Cas9 and Cas12 variants | Many | No | Yes | No | |
| Off-Spotter | SpCas9, CjCas9, and SaCas9 variants | Human, mouse, and yeast | No | Yes | Yes | |
Table shows class 2 CRISPR-Cas system variants for which CaSilico designs crRNA.
| CRISPR-Cas system | Cas protein type | Corresponding organism | Application |
|---|---|---|---|
| VI-A | Cas13a/C2c2 | | Nucleic acid detection, vaccination, transcript targeting, and SNP detection |
| VI-B | Cas13b/C2c6 | Prevotella sp. P5-125 | Nucleic acid detection, vaccination, transcript targeting, and SNP detection |
| VI-D | Cas13d | | Nucleic acid detection, vaccination, transcript targeting, and SNP detection |
| V-A | Cas12a/Cpf1 | Lachnospiraceae bacterium ND2006 | Nucleic acid detection and genome editing |
| V-B | Cas12b/C2c1 | | Nucleic acid detection and genome editing |
| V-F1 | Cas14a/Cas12f1 | Uncultured archaea | SNP detection |
FIGURE 1. CaSilico workflow. (A) CaSilico accepts a single sequence or a set of DNA or RNA sequences to be scanned for crRNA design. (B) When more than one sequence is given as input, the conserved regions among them are automatically detected using a conservation threshold and one of two approaches for identifying conserved regions. (C) A sliding window (stride of 1 nt) is moved across the single sequence or the conserved region of multiple sequences to specify potential target sites. (D) CaSilico applies multiple criteria for crRNA design, performs off-target analysis, and returns its output in an interactive graphical interface and in files such as the MSA and secondary structure (E, F).
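Step (C) of the workflow, the sliding window with a stride of 1 nt, can be sketched in a few lines of Python. This is a minimal illustration and not CaSilico's actual code: the function name `sliding_windows`, the `spacer_len` parameter, and the toy input sequence are all hypothetical.

```python
def sliding_windows(sequence, spacer_len, stride=1):
    """Yield (start, window) for every spacer-length window in the sequence.

    Assumption: `start` is a 0-based offset into `sequence`; the spacer
    length is supplied by the caller and is not taken from CaSilico.
    """
    for start in range(0, len(sequence) - spacer_len + 1, stride):
        yield start, sequence[start:start + spacer_len]


# Toy RNA sequence (17 nt) scanned with a hypothetical 10 nt window.
example_sequence = "AUGGCUAGCUAGGAUCC"
windows = list(sliding_windows(example_sequence, spacer_len=10))
```

With a stride of 1, a sequence of length n yields n − spacer_len + 1 candidate target sites, which is consistent with the candidate counts for PAM-free subtypes growing roughly linearly with gene length.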
FIGURE 2. Schematic structure of the type V and type VI subtypes of the class 2 CRISPR-Cas system. Generalized loci and their main details are illustrated for (A) V-A, (B) V-B1, (C) V-F1, (D) VI-A, (E) VI-B, and (F) VI-D. LwaCas13a requires spacers of at least 20 nt, but shorter spacers have lower efficiency. AapCas12b and Cas14a are dual-RNA-guided endonucleases that require association with a structural accessory RNA, the transactivating CRISPR RNA (tracrRNA), for their activity. This image was created using www.biorender.com.
Principles of crRNA designing for type VI-A/B/D CRISPR-Cas system.
| Rule No | Parameter |
|---|---|
| Positional rules | |
| | Perfect base pairing with target region |
| 1–1 | Allowing single mismatches in the spacer |
| Type VI-A | The central (seed) region, base pairs 13 to 24 of the LwaCas13a target, is intolerant to single mismatches |
| Type VI-B | The central (seed) region, base pairs 12 to 26 of the PspCas13b target, is intolerant to single mismatches |
| Type VI-D | The central (seed) region, base pairs 2 to 8 of the RfxCas13d target, is intolerant to single mismatches |
| 1–2 | Allowing consecutive or nonconsecutive double mismatches in the spacer |
| Type VI-A | The central (seed) region, base pairs 8 to 27 of the LwaCas13a target, is intolerant to double mismatches (mismatches exactly at the 5′ or 3′ end are preferred) |
| Type VI-B | The central (seed) region, base pairs 12 to 29 of the PspCas13b target, is intolerant to double mismatches (mismatches exactly at the 5′ or 3′ end are preferred) |
| Type VI-D | Double mismatches exactly at the 5′ or 3′ end are acceptable |
| | Protospacer flanking site or sequence (PFS) requirement |
| | LwaCas13a lacks a PFS. |
| | In PspCas13b, a base other than D (G, A, or U) at the first and second sites of the 5′ PFS inhibits single-stranded RNA cleavage, whereas NAN or NNA at the 3′ PFS enhances this activity |
| | The PFS may not be necessary for RfxCas13d |
| | Target nucleotide content |
| | Cleavage preferentially occurs in uracil-rich regions (poly UU/AU; at uracil bases) for LwaCas13a and RfxCas13d and in adenine-rich regions (at adenine bases) for PspCas13b |
| | GC content |
| Thermodynamics rules | |
| | Target site accessibility |
| | No stable secondary structure in the target region. This feature is important for types VI-A/B/D, which target RNA |
| | Self-complementarity |
| BLAST rules | |
| | Searching for off-target effects in the target organism |
| | Searching for off-target effects in other organisms |
G-U wobble base pairing is allowable.
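The single-mismatch rule (1–1) and the G-U wobble allowance above can be expressed as a small check. The sketch below is an illustration under stated assumptions, not CaSilico's implementation: positions are assumed to be 1-based and counted from the 5′ end of the spacer, the candidate spacer is compared against a hypothetical `perfect_spacer` (the fully complementary spacer for the target), and all function names are invented for this example.

```python
# Seed regions intolerant to single mismatches, taken from the table above.
# Assumption: positions are 1-based, counted from the 5' end of the spacer.
SINGLE_MISMATCH_SEED = {
    "VI-A": (13, 24),  # LwaCas13a
    "VI-B": (12, 26),  # PspCas13b
    "VI-D": (2, 8),    # RfxCas13d
}

# G-U wobble: a spacer U can still pair with a target G (where C is expected),
# and a spacer G can still pair with a target U (where A is expected).
WOBBLE = {("C", "U"), ("A", "G")}  # (expected base, observed base)


def mismatch_positions(spacer, perfect_spacer):
    """Return 1-based positions that fail to pair, ignoring G-U wobble pairs."""
    if len(spacer) != len(perfect_spacer):
        raise ValueError("spacer and perfect_spacer must have equal length")
    return [
        i
        for i, (obs, exp) in enumerate(zip(spacer, perfect_spacer), start=1)
        if obs != exp and (exp, obs) not in WOBBLE
    ]


def single_mismatch_tolerated(spacer, perfect_spacer, subtype):
    """True for a perfect match or for one mismatch outside the seed region."""
    positions = mismatch_positions(spacer, perfect_spacer)
    if not positions:
        return True
    if len(positions) > 1:
        return False  # double mismatches fall under rule 1-2, not handled here
    lo, hi = SINGLE_MISMATCH_SEED[subtype]
    return not (lo <= positions[0] <= hi)
```

For example, a single mismatch at position 5 would be rejected for type VI-D (seed 2 to 8) but accepted for type VI-A (seed 13 to 24) under these assumptions.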
Principles of crRNA designing for type V-A/B/F1 CRISPR-Cas system.
| Rule No | Parameter |
|---|---|
| Positional rules | |
| | Perfect base pairing with target region |
| 1–1 | Allowing single and consecutive or nonconsecutive double mismatches in the spacer |
| Type V-A | The central (seed) region, base pairs 1 to 6 of the LbCas12a target, is intolerant to single and consecutive or nonconsecutive double mismatches |
| Type V-B | Single and consecutive or nonconsecutive double mismatches at any position of the AapCas12b target are acceptable |
| Type V-F1 | The central (seed) region, base pairs 9 to 16 of the Cas14a target, is intolerant to single and consecutive or nonconsecutive double mismatches |
| | Protospacer adjacent motif (PAM; CRISPR motif) requirement |
| | The PAM for target DNA recognition is 5′-TTTV-3′ for LbCas12a and 5′-TTN-3′ for AapCas12b, located upstream of the target sequence. A PAM is not necessary for Cas14a |
| | GC content |
| Thermodynamics rules | |
| | Target site accessibility |
| | No stable secondary structure in the target region. This feature is important for type V-F1, which targets ssDNA. |
| | Self-complementarity |
| BLAST rules | |
| | Searching for off-target effects in the target organism |
| | Searching for off-target effects in other organisms |
G-U wobble base pairing is allowable.
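The PAM requirement in the table above (5′-TTTV-3′ for LbCas12a, 5′-TTN-3′ for AapCas12b, upstream of the target) lends itself to a regular-expression scan. The following is a minimal sketch, not CaSilico's method: it scans only the given strand, the function name `find_protospacers` is hypothetical, and the default 20 nt spacer length is an arbitrary example value.

```python
import re

# IUPAC codes expanded by hand: V = A, C, or G; N = any base.
PAM_PATTERNS = {
    "V-A": "TTT[ACG]",  # LbCas12a, 5'-TTTV-3'
    "V-B": "TT[ACGT]",  # AapCas12b, 5'-TTN-3'
}


def find_protospacers(dna, subtype, spacer_len=20):
    """Return (pam_start, pam, protospacer) for each PAM followed by a target.

    A lookahead is used so that overlapping PAM sites are all reported.
    Assumption: only the strand passed in is scanned; `pam_start` is 0-based.
    """
    pam = PAM_PATTERNS[subtype]
    pattern = re.compile(rf"(?=({pam})([ACGT]{{{spacer_len}}}))")
    return [
        (m.start(), m.group(1), m.group(2))
        for m in pattern.finditer(dna.upper())
    ]
```

Because TTN is less restrictive than TTTV, the same sequence yields more V-B than V-A candidates, which matches the direction of the candidate counts reported for these two subtypes.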
FIGURE 3. CaSilico presents the results in an interactive table interface. (A) The result table shows (a) candidate crRNAs from 5′ to 3′, (b) the allele frequency of each consensus base across the window, (c) the location of the target site on the (consensus) sequence, (d) the GC content of the spacer, (e) the number of bases with an allele frequency below the conservation threshold, (f) the position of each mismatch on the spacer sequence, (g) the other bases at a polymorphic position, (h) the average conservation score of all positions, (i, j) the proportion of Us or As in 100 nt windows upstream and downstream of the spacer target site, normalized by the proportion of Us or As throughout the gene, (k) the number of unpaired bases in the target site divided by the length of the target site, and (l) the frequency of all unpaired bases in 100 nt windows upstream and downstream of the target site. (B) CaSilico reports all candidate off-targets for the input organisms, allowing for some mismatches, in another interactive table that is opened by clicking a link. This table also reports the gene in which each off-target is located. (C) For each candidate crRNA, CaSilico provides a link in the plots column of the result table that opens a graphical view of (a) the spacer dot-plot, (b) the predicted self-complementarity of the spacer and crRNA, and (c) the crRNA information plot.
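Two of the reported features, the GC content of the spacer (d) and the normalized proportion of Us or As in the flanking 100 nt windows (i, j), are simple ratios. The sketch below shows one plausible way to compute them; it is an assumption-laden illustration rather than CaSilico's code. In particular, the function names are hypothetical, GC content is expressed as a percentage, and the upstream and downstream windows are scored separately.

```python
def gc_content(seq):
    """GC content of a sequence as a percentage (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def normalized_fraction(window, gene, base):
    """Fraction of `base` in `window` divided by its fraction in the gene."""
    overall = gene.count(base) / len(gene)
    if not window or overall == 0:
        return 0.0
    return (window.count(base) / len(window)) / overall


def flank_enrichment(gene, site_start, site_end, base="U", flank=100):
    """Return (upstream, downstream) normalized base fractions around a site.

    Assumption: the site is the 0-based half-open interval [site_start, site_end),
    and flanks are truncated at the ends of the gene.
    """
    gene = gene.upper()
    upstream = gene[max(0, site_start - flank):site_start]
    downstream = gene[site_end:site_end + flank]
    return (
        normalized_fraction(upstream, gene, base),
        normalized_fraction(downstream, gene, base),
    )
```

A value above 1 means the flank is richer in the chosen base than the gene as a whole; a value below 1 means it is poorer.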
Result of crRNA designing for ten genes of SARS-CoV-2 by CaSilico.
| Subtype | Criterion | Off-target mismatches | Nsp3 | 3CLpro | Nsp8 | Nsp9 | Nsp10 | RdRp | Nsp14 | S | E | N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VI-A | Candidate | | 5,663 | 867 | 567 | 312 | 390 | 2,636 | 1,542 | 3,539 | 201 | 1,129 |
| | GC content between 40 and 70% | | 1,293 | 298 | 206 | 176 | 220 | 718 | 550 | 1,104 | 67 | 908 |
| | Local U-rich ≥ 0.25 | | 5,518 | 867 | 525 | 312 | 390 | 2,636 | 1,532 | 3,526 | 201 | 3 |
| | Protospacer accessibility ≥ 0.5 | | 693 | 114 | 75 | 64 | 69 | 275 | 278 | 692 | 66 | 344 |
| | Off-target | No mismatch | 9 | 0 | 0 | 16 | 2 | 152 | 3 | 20 | 71 | 149 |
| | | One mismatch | 12 | 19 | 23 | 13 | 32 | 240 | 27 | 40 | 26 | 69 |
| | | Two mismatches | 7 | 8 | 8 | 8 | 5 | 53 | 7 | 18 | 2 | 31 |
| VI-B | Candidate | | 3,865 | 580 | 377 | 204 | 252 | 1,745 | 1,027 | 2,303 | 124 | 622 |
| | GC content between 40 and 70% | | 1,422 | 312 | 190 | 146 | 174 | 725 | 546 | 1,003 | 51 | 544 |
| | Local A-rich ≥ 0.25 | | 3,665 | 540 | 376 | 204 | 249 | 1,736 | 978 | 2,059 | 0 | 622 |
| | Protospacer accessibility ≥ 0.5 | | 106 | 68 | 40 | 31 | 44 | 93 | 166 | 253 | 33 | 191 |
| | Off-target | No mismatch | 2 | 0 | 0 | 6 | 0 | 95 | 0 | 9 | 37 | 64 |
| | | One mismatch | 5 | 9 | 6 | 4 | 13 | 121 | 13 | 17 | 19 | 37 |
| | | Two mismatches | 4 | 8 | 12 | 7 | 4 | 45 | 5 | 19 | 0 | 32 |
| VI-D | Candidate | | 5,730 | 883 | 573 | 318 | 396 | 2,697 | 1,553 | 3,643 | 207 | 1,159 |
| | GC content between 40 and 70% | | 2,103 | 438 | 288 | 217 | 248 | 1,131 | 757 | 1,591 | 90 | 965 |
| | Local U-rich ≥ 0.25 | | 5,591 | 883 | 538 | 316 | 396 | 2,697 | 1,549 | 3,636 | 207 | 9 |
| | Protospacer accessibility ≥ 0.5 | | 828 | 159 | 106 | 78 | 90 | 344 | 335 | 837 | 52 | 369 |
| | Off-target | No mismatch | 27 | 12 | 8 | 33 | 16 | 325 | 22 | 50 | 87 | 214 |
| | | One mismatch | 81 | 55 | 70 | 16 | 62 | 420 | 86 | 100 | 33 | 98 |
| | | Two mismatches | 16 | 12 | 12 | 6 | 9 | 89 | 18 | 27 | 6 | 28 |
| V-A | Candidate | | 274 | 29 | 21 | 14 | 13 | 118 | 55 | 180 | 7 | 39 |
| | GC content between 40 and 70% | | 117 | 16 | 16 | 7 | 9 | 50 | 35 | 93 | 3 | 32 |
| | Off-target | No mismatch | 0 | 0 | 0 | 0 | 0 | 13 | 2 | 3 | 1 | 10 |
| | | One mismatch | 3 | 4 | 6 | 0 | 2 | 15 | 3 | 3 | 2 | 3 |
| | | Two mismatches | 9 | 1 | 2 | 2 | 2 | 25 | 10 | 13 | 2 | 4 |
| | | Three mismatches | 30 | 9 | 3 | 3 | 3 | 27 | 13 | 16 | 1 | 3 |
| V-B | Candidate | | 1,268 | 175 | 112 | 52 | 64 | 535 | 267 | 810 | 45 | 211 |
| | GC content between 40 and 70% | | 544 | 103 | 65 | 31 | 45 | 242 | 151 | 394 | 24 | 179 |
| | Off-target | No mismatch | 3 | 2 | 2 | 3 | 0 | 58 | 5 | 17 | 14 | 46 |
| | | One mismatch | 26 | 18 | 17 | 2 | 16 | 111 | 23 | 27 | 9 | 17 |
| | | Two mismatches | 85 | 23 | 22 | 9 | 13 | 140 | 51 | 63 | 10 | 44 |
| | | Three mismatches | 274 | 49 | 27 | 17 | 20 | 207 | 77 | 191 | 4 | 36 |
| V-F1 | Candidate | | 5,720 | 883 | 575 | 320 | 398 | 2,688 | 1,554 | 3,636 | 209 | 1,171 |
| | GC content between 40 and 70% | | 2,438 | 483 | 312 | 226 | 52 | 1,259 | 838 | 1,809 | 105 | 991 |
| | Protospacer accessibility ≥ 0.5 | | 941 | 180 | 114 | 74 | 39 | 411 | 348 | 863 | 52 | 378 |
| | Off-target | No mismatch | 43 | 20 | 16 | 39 | 0 | 381 | 36 | 74 | 95 | 242 |
| | | One mismatch | 124 | 54 | 63 | 23 | 6 | 394 | 107 | 122 | 28 | 106 |
| | | Two mismatches | 371 | 78 | 50 | 37 | 12 | 385 | 141 | 219 | 10 | 117 |
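The thresholds used as row labels in the table above (GC content between 40 and 70, local U- or A-richness ≥ 0.25, protospacer accessibility ≥ 0.5) can be applied as simple filters over candidate records. The sketch below is a hypothetical illustration: the `Candidate` class, its field names, and the choice to treat the bounds as inclusive and to combine all three criteria are assumptions, whereas the table counts each criterion separately.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate crRNA; field names are invented."""
    spacer: str
    gc_percent: float      # GC content of the spacer, in percent
    local_richness: float  # normalized local U (or A) proportion
    accessibility: float   # fraction of unpaired bases in the target site


def passes_filters(c, gc_range=(40.0, 70.0), min_richness=0.25,
                   min_accessibility=0.5):
    """True if the candidate meets all three thresholds (bounds inclusive)."""
    return (
        gc_range[0] <= c.gc_percent <= gc_range[1]
        and c.local_richness >= min_richness
        and c.accessibility >= min_accessibility
    )


def filter_candidates(candidates, **thresholds):
    """Keep only the candidates that pass every threshold."""
    return [c for c in candidates if passes_filters(c, **thresholds)]
```

Keeping the thresholds as keyword arguments mirrors the abstract's point that key design parameters are tunable by the user.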