Abstract
Guide RNA design for CRISPR genome editing of gene families is challenging: good candidate sgRNAs are usually given low scores precisely because they match several locations in the genome, so time-consuming manual evaluation of targets is required. To address this issue, I have developed ARES-GT, a local command-line tool written in Python and compatible with any operating system. ARES-GT selects candidate sgRNAs that match multiple input query sequences, in addition to candidate sgRNAs that specifically match each query sequence. It also accepts unmapped contigs as well as complete genomes, so any genome provided by the user can be used and intraspecies allelic variability and individual polymorphisms can be handled. ARES-GT is available at GitHub (https://github.com/eugomin/ARES-GT.git).
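The idea of separating candidates shared by several query sequences from candidates found in only one can be sketched in a few lines of Python. This is a minimal illustration and not the ARES-GT code: it considers only Cas9 with an NGG PAM and exact matches, it does not search a genome for offtargets, and the function names and the toy sequences are hypothetical.

```python
import re
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def cas9_protospacers(seq):
    """Set of 20-nt protospacers followed by an NGG PAM, on both strands."""
    seq = seq.upper()
    found = set()
    for strand in (seq, reverse_complement(seq)):
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(r"(?=([ACGT]{20})[ACGT]GG)", strand):
            found.add(match.group(1))
    return found


def split_candidates(queries):
    """queries maps a name to a sequence; returns (shared, specific) dicts."""
    hits = defaultdict(set)
    for name, seq in queries.items():
        for guide in cas9_protospacers(seq):
            hits[guide].add(name)
    shared = {g: names for g, names in hits.items() if len(names) > 1}
    specific = {g: names for g, names in hits.items() if len(names) == 1}
    return shared, specific


# Toy sequences (hypothetical): both contain the same protospacer.
guide = "ACGTACGTACGTACGTACGT"
queries = {
    "geneA": "TT" + guide + "TGG" + "AA",
    "geneB": "CC" + guide + "AGG" + "TT",
}
shared, specific = split_candidates(queries)
```

Here `shared` holds the guides present in more than one query, which are the ones a conventional scoring scheme would penalize, while `specific` holds guides found in a single query. A real workflow would still need to check both groups against the whole genome.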
Year: 2020 PMID: 33085710 PMCID: PMC7577430 DOI: 10.1371/journal.pone.0241001
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. sgRNA targets in CBF genes.
A) Genomic distribution of CBF genes on Arabidopsis thaliana chromosomes 4 and 5. Location of Cas9 (B) and Cas12a (C) candidates with multiple CBF gene targets. (*) Asterisks mark candidates that correspond to previously reported sgRNAs (Cho et al., 2017).
Multiple targets Cas9 candidates for AtCBF genes.
All possible genome targets and offtargets (with ARES-GT thresholds: L0 = 4 and L1 = 3) of each candidate are listed with indication of genome coordinates (TAIR v10) and whether it corresponds to a CBF gene. In alignments, black boxes mark mismatches and a space separates PAM (NGG or NAG) from sequence. Differences in the “N” position in the PAM are not marked.
| Candidate ID | CBF gene | Chr | Start | End | Strand |
|---|---|---|---|---|---|
| Cas9AtCBF1_014 | | 4 | 13015820 | 13015842 | + |
| | | 4 | 13022305 | 13022327 | + |
| | | 4 | 13018737 | 13018759 | + |
| Cas9AtCBF1_015 | | 4 | 13015825 | 13015847 | + |
| | | 4 | 13022310 | 13022332 | + |
| Cas9AtCBF1_018 | | 4 | 13015920 | 13015942 | + |
| | | 4 | 13022405 | 13022427 | + |
| | | 5 | 21117612 | 21117634 | + |
| | | 4 | 13018837 | 13018859 | + |
| Cas9AtCBF1_019 | | 4 | 13015921 | 13015943 | + |
| | | 4 | 13022406 | 13022428 | + |
| | ---- | 1 | 1597274 | 1597296 | + |
| | | 4 | 13018838 | 13018860 | + |
| Cas9AtCBF1_051 | | 4 | 13015738 | 13015760 | - |
| | | 4 | 13022223 | 13022245 | - |
| Cas9AtCBF1_056 | | 4 | 13015831 | 13015853 | - |
| | | 4 | 13022316 | 13022338 | - |
| | | 4 | 13018748 | 13018770 | - |
| Cas9AtCBF1_061 | | 4 | 13015900 | 13015922 | - |
| | | 4 | 13022385 | 13022407 | - |
| | | 4 | 13018817 | 13018839 | - |
| Cas9AtCBF1_062 | | 4 | 13015901 | 13015923 | - |
| | | 4 | 13022386 | 13022408 | - |
| | | 4 | 13018818 | 13018840 | - |
| Cas9AtCBF1_063 | | 4 | 13015908 | 13015930 | - |
| | | 4 | 13022393 | 13022415 | - |
| | ---- | 2 | 6123419 | 6123441 | - |
| Cas9AtCBF1_064 | | 4 | 13015929 | 13015951 | - |
| | | 4 | 13022414 | 13022436 | - |
| | | 4 | 13018846 | 13018868 | - |
| | ---- | 1 | 4290740 | 4290762 | - |
| | ---- | 1 | 23368054 | 23368076 | - |
| | | 5 | 21117621 | 21117643 | - |
| Cas9AtCBF2_081 | | 4 | 13015760 | 13015782 | + |
| | | 4 | 13018677 | 13018699 | + |
| | | 5 | 21117452 | 21117474 | + |
| Cas9AtCBF2_123 | | 4 | 13015754 | 13015776 | - |
| | | 4 | 13018671 | 13018693 | - |
| | | 4 | 13022239 | 13022261 | - |
| Cas9AtCBF2_124 | | 4 | 13015759 | 13015781 | - |
| | | 4 | 13018676 | 13018698 | - |
| | | 4 | 13022244 | 13022266 | - |
Multiple targets Cas12a candidates for AtCBF genes.
All possible genome targets and offtargets (with ARES-GT thresholds: L0 = 4 and L1 = 3) of each candidate are listed with indication of genome coordinates (TAIR v10) and whether it corresponds to a CBF gene. In alignments, black boxes mark mismatches and a space separates PAM (TTTN) from sequence. Differences in the “N” position in the PAM are not marked.
| Candidate ID | CBF gene | Chr | Start | End | Strand |
|---|---|---|---|---|---|
| Cas12aAtCBF1_011 | | 4 | 13015814 | 13015837 | - |
| | | 4 | 13022299 | 13022322 | - |
| Cas12aAtCBF1_012 | | 4 | 13015827 | 13015850 | - |
| | | 4 | 13022312 | 13022335 | - |
| | ---- | 1 | 27242286 | 27242310 | + |
| | ---- | 3 | 8296023 | 8296047 | + |
| | ---- | 5 | 17806910 | 17806934 | + |
| | ---- | 5 | 21618544 | 21618567 | - |
| | ---- | 4 | 7932903 | 7932927 | + |
| | ---- | 4 | 10190722 | 10190745 | - |
| | | 4 | 13018744 | 13018767 | - |
| Cas12aAtCBF1_014 | | 4 | 13015902 | 13015925 | - |
| | | 4 | 13022387 | 13022410 | - |
| | | 5 | 21117594 | 21117617 | - |
| Cas12aAtCBF1_015 | | 4 | 13015924 | 13015947 | - |
| | | 4 | 13022409 | 13022432 | - |
| | | 4 | 13018841 | 13018864 | - |
| | | 5 | 21117616 | 21117639 | - |
| Cas12aAtCBF1_017 | | 4 | 13016031 | 13016054 | - |
| | | 4 | 13018948 | 13018971 | - |
| | | 4 | 13022507 | 13022530 | - |
| | ---- | 1 | 8279033 | 8279056 | - |
| | ---- | 3 | 9399469 | 9399493 | + |
| Cas12aAtCBF1_018 | | 4 | 13016032 | 13016055 | - |
| | | 4 | 13018949 | 13018972 | - |
| | | 4 | 13022508 | 13022531 | - |
| | ---- | 1 | 9505057 | 9505081 | + |
| Cas12aAtCBF1_019 | | 4 | 13018950 | 13018973 | - |
| | | 4 | 13022509 | 13022532 | - |
| Cas12aAtCBF1_024 | | 4 | 13015842 | 13015865 | + |
| | | 4 | 13022327 | 13022350 | + |
| | ---- | 3 | 8296020 | 8296043 | - |
| Cas12aAtCBF1_028 | | 4 | 13015913 | 13015936 | + |
| | | 4 | 13022398 | 13022421 | + |
| | ---- | 5 | 16311156 | 16311179 | + |
| Cas12aAtCBF1_029 | | 4 | 13015917 | 13015940 | + |
| | | 4 | 13022402 | 13022425 | + |
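The alignment convention described in the captions (mismatches marked, the PAM set apart by a space, differences at the "N" position of the PAM left unmarked) can be illustrated with a small text renderer. This is only a sketch under stated assumptions: the function name, the `*` and `.` marks, and the example sequences are hypothetical and are not ARES-GT output.

```python
def render_alignment(candidate, site, pam_pattern, pam_first=False):
    """Return (text, mismatches) for a candidate aligned to a genomic site.

    candidate, site: protospacer plus PAM, same length and orientation.
    pam_pattern: e.g. "NGG" or "NAG" (Cas9), "TTTN" (Cas12a).
    pam_first: True when the PAM precedes the protospacer (Cas12a).
    """
    k = len(pam_pattern)
    if pam_first:
        cand_pam, cand_seq = candidate[:k], candidate[k:]
        site_pam, site_seq = site[:k], site[k:]
    else:
        cand_seq, cand_pam = candidate[:-k], candidate[-k:]
        site_seq, site_pam = site[:-k], site[-k:]

    # '*' marks a mismatch in the protospacer, '.' a match.
    seq_marks = "".join("." if a == b else "*" for a, b in zip(cand_seq, site_seq))
    # In the PAM, the position written as N in the pattern is never marked.
    pam_marks = "".join(
        "." if p == "N" or a == b else "*"
        for a, b, p in zip(cand_pam, site_pam, pam_pattern)
    )

    if pam_first:
        lines = (f"{site_pam} {site_seq}", f"{pam_marks} {seq_marks}")
    else:
        lines = (f"{site_seq} {site_pam}", f"{seq_marks} {pam_marks}")
    return "\n".join(lines), seq_marks.count("*")


# Hypothetical Cas9 example: one protospacer mismatch, PAM TGG versus CAG.
text, mismatches = render_alignment(
    "ACGTACGTACGTACGTACGT" + "TGG",
    "ACGTACGTACCTACGTACGT" + "CAG",
    "NGG",
)
print(text)
```

For a Cas12a candidate the same function would be called with `pam_pattern="TTTN"` and `pam_first=True`, so that the PAM is printed before the protospacer.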
Multiple targets Cas9 and Cas12a candidates for ChCBF genes.
All possible genome targets and offtargets (with ARES-GT thresholds: L0 = 4 and L1 = 3) of each candidate are listed with indication of genome coordinates (Cardamine hirsuta v1.0) and whether it corresponds to a CBF gene. In alignments, black boxes mark mismatches and a space separates PAM (NGG/NAG or TTTN) from sequence. Differences in the “N” position in the PAM are not marked.
| Candidate ID | CBF gene | Chr | Start | End | Strand |
|---|---|---|---|---|---|
| Cas9ChCBF1_004 | ChCBF2 | 4 | 6514798 | 6514820 | + |
| | ChCBF1 | 7 | 17908883 | 17908905 | - |
| Cas9ChCBF1_010 | ChCBF2 | 4 | 6514878 | 6514900 | + |
| | ChCBF1 | 7 | 17908803 | 17908825 | - |
| Cas9ChCBF1_018 | ChCBF2 | 4 | 6514910 | 6514932 | + |
| | ChCBF1 | 7 | 17908771 | 17908793 | - |
| | ChCBF3 | 8 | 13812274 | 13812296 | - |
| | ---- | 5 | 18638271 | 18638293 | - |
| | ---- | 5 | 21152837 | 21152859 | - |
| Cas9ChCBF1_013 | ChCBF2 | 4 | 6514915 | 6514937 | + |
| | ChCBF1 | 7 | 17908766 | 17908788 | - |
| | ---- | 8 | 18333140 | 18333162 | - |
| | ---- | 1 | 5556241 | 5556263 | + |
| | ---- | 1 | 370416 | 370438 | + |
| | ChCBF3 | 8 | 13812269 | 13812291 | - |
| Cas9ChCBF1_033 | ChCBF2 | 4 | 6515264 | 6515286 | + |
| | ChCBF1 | 7 | 17908390 | 17908412 | - |
| | NSCAFA. | 444 | 2316 | 2338 | + |
| Cas9ChCBF1_036 | ChCBF2 | 4 | 6514793 | 6514815 | - |
| | ChCBF1 | 7 | 17908888 | 17908910 | + |
| Cas9ChCBF1_043 | ChCBF2 | 4 | 6514880 | 6514902 | - |
| | ChCBF1 | 7 | 17908801 | 17908823 | + |
| Cas9ChCBF1_044 | ChCBF2 | 4 | 6514909 | 6514931 | - |
| | ChCBF1 | 7 | 17908772 | 17908794 | + |
| | ChCBF3 | 8 | 13812275 | 13812297 | + |
| Cas9ChCBF1_056 | ChCBF2 | 4 | 6515266 | 6515288 | - |
| | ChCBF1 | 7 | 17908388 | 17908410 | + |
| | ---- | 2 | 8347578 | 8347600 | + |
| Cas9ChCBF1_057 | ChCBF2 | 4 | 6515269 | 6515291 | - |
| | ChCBF1 | 7 | 17908385 | 17908407 | + |
| | ---- | 1 | 17089187 | 17089209 | + |
| | ---- | 5 | 5225681 | 5225703 | - |
| Cas12aChCBF1_018 | ChCBF2 | 4 | 6514830 | 6514853 | + |
| | ChCBF1 | 7 | 17908848 | 17908871 | - |
| | ChCBF3 | 8 | 13812351 | 13812374 | - |
| Cas12aChCBF1_029 | ChCBF2 | 4 | 6515260 | 6515283 | + |
| | ChCBF1 | 7 | 17908391 | 17908414 | - |
| Cas12aChCBF1_030 | ChCBF2 | 4 | 6515261 | 6515284 | + |
| | ChCBF1 | 7 | 17908390 | 17908413 | - |
Effect of intraspecies variability on the number of Cas9 and Cas12a candidates targeting multiple or unique AtCBF genes.
Sequence variability in the CBF genes of different Arabidopsis thaliana accessions changes the number of candidates that can match multiple targets, because of SNPs in the 20 nucleotides of the guide and also SNPs affecting the PAM sequence. Whether the standard Col-0 reference genome (TAIR v10) or the genome of the corresponding accession is used affects the identification of offtargets, and therefore the correct identification of specific (unique) candidates matching only one CBF gene. The column “exclusive” gives the number of specific candidates that are listed only when the corresponding reference genome is used.
| Multiple-target Cas9 | Multiple-target Cas12a | Reference | Unique Cas9: total | Unique Cas9: exclusive | Unique Cas12a: total | Unique Cas12a: exclusive |
|---|---|---|---|---|---|---|
| 13 | 10 | | 96 | - | 34 | - |
| 13 | 9 | | 100 | 3 | 37 | 2 |
| | | | 105 | 8 | 41 | 6 |
| 13 | 10 | | 100 | 4 | 33 | 2 |
| | | | 101 | 5 | 31 | 0 |
| 11 | 9 | | 102 | 6 | 34 | 3 |
| | | | 107 | 11 | 37 | 6 |
| 13 | 10 | | 101 | 2 | 32 | 1 |
| | | | 101 | 2 | 31 | 0 |
| 18 | 6 | | 99 | 8 | 32 | 2 |
| | | | 103 | 12 | 33 | 3 |
| 13 | 10 | | 102 | 3 | 32 | 0 |
| | | | 105 | 6 | 34 | 2 |
| 13 | 10 | | 101 | 6 | 31 | 2 |
| | | | 102 | 7 | 31 | 2 |
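The "total" and "exclusive" counts can be read as simple set operations on the lists of unique candidates obtained with each reference genome. The sketch below assumes each run yields a set of candidate identifiers; the identifiers and the helper name are hypothetical and the values are not taken from the table.

```python
def exclusive(listed_with_this_reference, listed_with_other_reference):
    """Candidates reported as unique only when this reference genome is used."""
    return set(listed_with_this_reference) - set(listed_with_other_reference)


# Hypothetical candidate IDs, for illustration only.
unique_vs_col0 = {"cand_001", "cand_002", "cand_003"}
unique_vs_accession = {"cand_002", "cand_003", "cand_004", "cand_005"}

only_col0 = exclusive(unique_vs_col0, unique_vs_accession)
only_accession = exclusive(unique_vs_accession, unique_vs_col0)

# Total and exclusive counts for each reference genome.
print(len(unique_vs_col0), len(only_col0))
print(len(unique_vs_accession), len(only_accession))
```

A candidate that appears in only one of the two sets is one whose offtarget profile depends on which genome was searched, which is the situation the caption warns about.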
Effect of intraspecies variability on the identification of targets and possible offtargets.
For each example, the upper row shows the targets and offtargets listed by ARES-GT (with thresholds L0 = 4 and L1 = 3) for each reference genome. SNP differences between genomes that explain why some targets or offtargets are not detected are shown in the lower row (separated by a dashed line) as red boxes. Black boxes mark mismatches with the candidate sequence.
| Candidate ID | Gene | Chr | Start | End | Strand |
|---|---|---|---|---|---|
| C24_Cas12aCBF1_019 | C24CBF2 | C24_4 | 13745457 | 13745480 | - |
| | C24CBF3 | C24_4 | 13748381 | 13748404 | - |
| | C24CBF1 | C24_4 | 13751940 | 13751963 | - |
| | ---- | C24_3 | 4670219 | 4670243 | + |
| | ColCBF3 | Col_4 | 13018950 | 13018973 | - |
| | ColCBF1 | Col_4 | 13022509 | 13022532 | - |
| | ColCBF2 | Col_4 | 13016046 | 13016068 | - |
| | ---- | Col_3 | 4673610 | 4673633 | + |
| Eri_Cas12aCBF1_017 | EriCBF2 | Eri_4 | 12981374 | 12981397 | - |
| | EriCBF3 | Eri_4 | 12984307 | 12984330 | - |
| | EriCBF1 | Eri_4 | 12987866 | 12987889 | - |
| | ColCBF2 | Col_4 | 13016031 | 13016054 | - |
| | ColCBF3 | Col_4 | 13018948 | 13018971 | - |
| | ColCBF1 | Col_4 | 13022507 | 13022530 | - |
| | ---- | Col_1 | 8279033 | 8279056 | - |
| | ---- | Col_3 | 9399469 | 9399493 | + |
| | ---- | Eri_1 | 8194484 | 8194507 | - |
| | ---- | Eri_3 | 9400735 | 9400758 | + |