Aaron McKenna, Jay Shendure.
Abstract
BACKGROUND: Genome-wide knockout studies, noncoding deletion scans, and other large-scale studies require a simple and lightweight framework that can quickly discover and score thousands of candidate CRISPR guides targeting an arbitrary DNA sequence. While several CRISPR web applications exist, there is a need for a high-throughput tool to rapidly discover and process hundreds of thousands of CRISPR targets.
Keywords: CRISPR; Cas9; Cpf1; Deletion scan; Guide library; Off-target; On-target
Year: 2018 PMID: 29976198 PMCID: PMC6033233 DOI: 10.1186/s12915-018-0545-0
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
A sample of computational times (h:m:s) required to build a FlashFry database for versions of the Caenorhabditis elegans, human, mouse, and Drosophila melanogaster genomes for common CRISPR enzymes. All timing analyses were run with default FlashFry parameters on an Amazon r4.large instance, limited to 8 GB of memory and one CPU core.
| Genome | CRISPR/Cas9 (NGG) | CRISPR/Cas9 (NGG/NAG) | Cpf1 (TTTN) |
|---|---|---|---|
| Caenorhabditis elegans | 0:1:28 | 0:2:55 | 0:2:13 |
| Human—hg38 | 1:15:15 | 2:42:27 | 0:56:23 |
| Mouse—mm10 | 1:02:13 | 2:21:11 | 0:43:13 |
| Drosophila melanogaster | 0:2:38 | 0:5:01 | 0:2:14 |
Fig. 1 Discovery and scoring of CRISPR target sites. FlashFry schematic. The genome of interest is scanned for targets that match the PAM of the specified CRISPR enzyme. These genomic targets are then aggregated and bit-encoded into a database of compressed bins, sorted by their prefix. This database can then be searched by comparing the prefix of a candidate target against the prefix of the bins, and bins within the allowed mismatch (orange) can be examined for individual off-targets in the genome. The resulting off-target list is aggregated and used by various scoring metrics.
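The scheme in the Fig. 1 caption can be illustrated with a minimal Python sketch. This is not FlashFry's implementation: the function names, the 4-base bin prefix, the 2-bit encoding, and the restriction to forward-strand NGG sites are all assumptions made for illustration. It shows only the core idea, that a bin whose prefix already exceeds the mismatch budget can be skipped without examining any of its targets.

```python
from collections import defaultdict

# All names and constants below are hypothetical, chosen for illustration only.
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
PREFIX_LEN = 4  # assumed bin-prefix length


def find_targets(genome, guide_len=20):
    """Yield (position, target) for forward-strand sites followed by an NGG PAM."""
    for i in range(len(genome) - guide_len - 2):
        target = genome[i:i + guide_len]
        pam = genome[i + guide_len:i + guide_len + 3]
        if pam[1:] == "GG" and set(target) <= set("ACGT"):
            yield i, target


def encode(seq):
    """Pack a DNA string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_BITS[base]
    return value


def mismatches(a, b, length):
    """Count differing bases between two 2-bit-encoded sequences."""
    diff = a ^ b
    count = 0
    for _ in range(length):
        if diff & 0b11:  # any set bit in this pair means the bases differ
            count += 1
        diff >>= 2
    return count


def build_database(genome, guide_len=20):
    """Group encoded targets into bins keyed by prefix, sorted by prefix."""
    bins = defaultdict(list)
    for pos, target in find_targets(genome, guide_len):
        bins[target[:PREFIX_LEN]].append((encode(target), pos))
    return dict(sorted(bins.items()))


def find_off_targets(database, guide, max_mismatches):
    """Return (position, mismatch_count) for targets within the mismatch budget."""
    guide_bits = encode(guide)
    hits = []
    for prefix, entries in database.items():
        prefix_mm = sum(1 for x, y in zip(prefix, guide) if x != y)
        if prefix_mm > max_mismatches:
            continue  # the whole bin is pruned on its prefix alone
        for bits, pos in entries:
            mm = mismatches(bits, guide_bits, len(guide))
            if mm <= max_mismatches:
                hits.append((pos, mm))
    return hits
```

In this sketch, a candidate guide is compared base by base only against targets in bins that survive the prefix check. A batch search over many guides could visit each bin once and test all guides against it, which is the kind of single-pass aggregation the Fig. 2 caption credits for FlashFry's scaling.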
Fig. 2 Comparison of the runtimes and memory usage of common CRISPR target discovery tools over an increasing number of targets and permitted mismatches. Five random CRISPR guide sets were run for each target-count (x-axis) and permitted mismatch level (y-axis). Plotted is the mean runtime, with standard deviation bars, for each set of replicates. a Running time per sequence for increasing numbers of target sites and b their corresponding memory usage. FlashFry benefits from aggregating all guide-to-genome comparisons in one pass of the database, matching BWA’s performance at hundreds of targets for five mismatches, and thousands of targets at four mismatches. Only BWA and FlashFry were run for the 10,000 and 100,000 target searches.