Daniel B Graham, David E Root.
Abstract
CRISPR-based approaches have quickly become a favored method to perturb genes to uncover their functions. Here, we review the key considerations in the design of genome editing experiments, and survey the tools and resources currently available to assist users of this technology.
Year: 2015 PMID: 26612492 PMCID: PMC4661947 DOI: 10.1186/s13059-015-0823-x
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Components of the CRISPR-Cas9 system. Streptococcus pyogenes Cas9 (SpCas9) forms a complex with a chimeric single guide RNA (sgRNA) comprising a spacer that hybridizes with the genomic target site, and a scaffold RNA termed tracrRNA required for complex formation. The protospacer adjacent motif (PAM) is required for sequence specificity of SpCas9-mediated endonuclease activity against genomic DNA.
Fig. 2 Genetic perturbations enabled by engineered CRISPR/Cas9 systems. a Knockout approaches generate loss-of-function (LOF) alleles by means of insertion/deletion (indel) mutations incurred by erroneous repair of DNA double-strand breaks by nonhomologous end joining (NHEJ). b Knock-in approaches aim to introduce defined mutations [e.g., an insertion or single-nucleotide polymorphism (SNP)] encoded by repair templates that exploit endogenous homology-directed repair (HDR) mechanisms. c Transcriptional inhibition with CRISPR interference (CRISPRi) employs endonuclease-dead Cas9 (dCas9), or transcriptional repressors fused to dCas9, to suppress gene transcription. d Overexpression with CRISPR activation (CRISPRa) employs transcriptional activators fused to dCas9 to activate gene transcription. In addition, single guide RNAs (sgRNAs) have been engineered that contain aptamers to recruit additional transcriptional activator complexes.
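The target-site logic described in the Fig. 1 caption (a spacer-length protospacer immediately followed by a PAM) can be illustrated with a minimal Python sketch. It assumes a 20-nt spacer and the SpCas9 NGG PAM, scans both strands of a plain DNA string, and is not the algorithm of any tool listed in the tables; the names `Target`, `reverse_complement` and `find_ngg_targets` are hypothetical.

```python
from typing import List, NamedTuple

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Target(NamedTuple):
    start: int   # 0-based start of the whole site (spacer + PAM) on the forward strand
    strand: str  # "+" or "-"
    spacer: str  # protospacer sequence, 5' to 3' on the targeted strand
    pam: str     # 3-nt PAM immediately 3' of the protospacer


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ngg_targets(seq: str, spacer_len: int = 20) -> List[Target]:
    """List every protospacer followed by an NGG PAM on either strand.

    spacer_len=20 is an assumption for illustration, not a rule from the source.
    """
    seq = seq.upper()
    site_len = spacer_len + 3
    targets = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - site_len + 1):
            pam = s[i + spacer_len:i + site_len]
            if pam[1:] == "GG":
                # Convert minus-strand offsets back to forward-strand coordinates.
                start = i if strand == "+" else len(s) - site_len - i
                targets.append(Target(start, strand, s[i:i + spacer_len], pam))
    return targets
```

Calling `find_ngg_targets` on a short amplicon returns candidate sites only; it says nothing about on-target efficacy or off-target risk, which are the properties the tools reviewed here try to estimate.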
Tools for the design of guide RNAs
| Tool | Website | Ref. | Input (a) | Output (b) | Throughput (c) | Use case | Genomes | Options | Validation | Strengths | Weaknesses | Access |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sgRNA Designer | | [ | Ensembl transcript IDs or nucleotide sequences | Activity-ranked sgRNA, exon, percentage protein sequence C-terminal of target site | Medium/high | Find target sequences for up to ten transcripts (small-batch mode); good for generating a lot of candidate guides quickly | Human, mouse | N/A | Meta-analysis of genome-wide CRISPR screens | Ease of use, on-target efficacy based on experiment data and validation | No OT prediction | Web/local |
| CRISPR MultiTargeter | | [ | Gene/transcript ID or sequence | Activity-ranked sgRNA based on sgRNA Designer (see above). Reports percentage GC and Tm | Low | Find all target sequences for a single sequence; find all common target sequences for all transcripts for a given gene; find unique/non-unique target sequences providing multiple sequences or similar gene types | 12 common genomes | 5′ G or GG. Target length. PAM NGG or user specified. Paired sgRNA | | Ease of use. Many options, including any PAM. Has multiple modes, separated out, that could be useful | No OT prediction. Multiple modes can be complicated | Web |
| Cas9 Design | | [ | Input sequence or FASTA file | Target sequence and exact matches in reference genome. Percentage AT, predicted RNA folding structure for sgRNA | Low | Find target sequences for a single sequence | Ten common genomes | Target length. User-specified scaffold RNA for structural predictions | | Can be used to identify potentially problematic hairpin structures in sgRNAs | No on-target efficacy prediction. No OT prediction | Web |
| SSFinder | | [ | FASTA file | All potential NGG-PAM guides | High | Find target sequences for any number of FASTA sequences | N/A | None | | Very simple input/output requirements. Works quickly on a small number of sequences. High throughput possible | No options. No on-target or off-target information | Python script |
| Cas OFFinder | | [ | sgRNA | Lists OT sites with number and position of MM | Medium/high | Find comprehensive off-target information for one or more guide sequences; must run as script | Approximately 20 common genomes | Alternative PAMs, tolerance for MMs | | Ease of use. Analyzes multiple sgRNAs in batch. OT sites with up to nine MMs and two bulges | Does not indicate whether OTs are in CDS. Does not account for identity of MMs | Web/local |
(a) The input data are gene sequence or sgRNA sequence. (b) The output is sgRNA sequence or off-target sites. (c) ‘Low’: input format and run times support one-gene-at-a-time or one-guide-at-a-time queries. ‘Medium’: input format and run times support small batches of genes or sgRNAs, tens to hundreds of queries. ‘High’: input format and run times support genome-scale queries.
Abbreviations: CDS coding sequence, CRISPR clustered regularly interspaced short palindromic repeats, MM mismatch, N/A not applicable, OT off-target, PAM protospacer adjacent motif, sgRNA single guide RNA, Tm melting temperature
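Several entries in the table report off-target sites by the number and position of mismatches (MMs). A minimal sketch of that comparison is given below, assuming the guide and the candidate site are already aligned, have equal length and contain no bulges. The function name `summarize_mismatches` is hypothetical, and the 12-nt PAM-proximal seed is an illustrative default only, since the tools surveyed use different seed definitions.

```python
from typing import Dict


def summarize_mismatches(guide: str, site: str, seed_len: int = 12) -> Dict[str, object]:
    """Compare a guide with one candidate site of equal length.

    Positions are 1-based from the 5' end of the spacer, so the PAM-proximal
    seed is taken to be the last `seed_len` positions (an assumption here).
    """
    guide, site = guide.upper(), site.upper()
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    seed_start = len(guide) - seed_len + 1  # first 1-based position inside the seed
    # Record position and identity (guide base, site base) of every mismatch.
    mismatches = [
        (pos, g, s)
        for pos, (g, s) in enumerate(zip(guide, site), start=1)
        if g != s
    ]
    in_seed = sum(1 for pos, _, _ in mismatches if pos >= seed_start)
    return {"total": len(mismatches), "in_seed": in_seed, "mismatches": mismatches}
```

The sketch keeps the identity of each mismatched base pair so that a downstream score could weight it, although no such weighting is implemented here.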
All-in-one packages for the design of guide RNAs and prediction of off-target effects
| Tool | Website | Ref. | Input | Output | Throughput (a) | Use case | Genomes | Options (b) | Validation | Strengths | Weaknesses | Access |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CCTop | http://crispr.cos.uni-heidelberg.de/ | [ | Sequences 23 to 500 bp (similar to crispr.mit input); also has single or batch mode | Scores OTs and ranks sgRNA by OTs. OT sites; position with respect to CDS. Number and position of MMs | Low | Find target sequences for a sequence or sequences (has batch mode); good for generating a lot of candidates with comprehensive on/off-target information | Approximately 15 common organisms; only a single human build available | NGG or NRG PAMs for on- and off-target. 5′ G or GGa. MM tolerance, total and in core. Annotated OT sites with RefSeq IDs, if applicable | On- and off-target efficacy in vitro. For one guide | Ease of use, many options. Comprehensive and easy-to-understand output | No on-target efficacy prediction. Does not account for identity of MMs. Regular mode is relatively fast. Advanced mode is slow | Web |
| CHOPCHOP | | [ | Target transcript ID or sequence (raw or chromosomal position); also allows gene input | Scores OT and ranks sgRNA by off-targets, GC content, genomic map, position with respect to CDS, primers for validation, RE sites. Off-target sites 0–2 MM | Low/medium | Find target sequences for a single sequence/gene/transcript; good for generating a lot of guides for a single target quickly | Approximately 20 common organisms, plus two most recent builds for human | 5′ GN, GG no TTTT, CDS, junctions, alternative PAMs; specify restrictions for position of mismatch, e.g., nine-nucleotide 5′ to PAM (seed) | | Ease of use, flexible options, downloadable results | No on-target efficacy prediction. OT limited to zero to two mismatches. Does not account for identity of mismatches | Web |
| Crispr.mit | | [ | Sequence or FASTA files; single or batch mode | Scores OT and ranks sgRNA by OTs, paired sgRNAs for nickase, OT sites | Low/medium | Find target sequences for a sequence or sequences (has batch mode); good for generating a lot of candidates with comprehensive on/off-target information | Approximately 15 common organisms; only a single human build available | Paired sgRNAs | | Ease of use. Scores OTs for up to four MMs and provides positions. Specifies OTs in genes versus intergenic sequences | Handles short sequences 23–500 bp, although 250 bp is actual upper limit. Very slow. No efficacy metric. Does not account for identity of mismatches. Occasionally misses OT sites with no MMs | Web |
| WU-CRISPR | | [ | Gene symbol or 24–30,000 bp sequence | List of sgRNA ranked by efficacy score | Low | Find target seqs based on efficacy score and absence of OT perfect seed match | Mouse and human | None | | Ease of use. On-target efficacy scores based on re-analysis from [ | OT exclusion: perfect 13-nt seed match or >85 % similarity of 20-nt sequence to exome. Does not account for identity of mismatches | Web |
| GT-Scan | | [ | Sequence or FASTA file. Genomic coordinates | sgRNA, genomic sites with zero to three mismatches. Links to genome browser | Low | Find target sequences and OTs for a single sequence | Approximately 20 common organisms; only a single human build available | Many user-defined OT rules and filters. Alternative PAMs | | Ease of use. Many filters to define OT rules | Has trouble finding exact matches in genome. Does not account for identity of mismatches | Web |
| CROP-IT | | [ | sgRNA | Lists OT sites. Scoring for OT sites. Number, position, identity of MMs. Genomic position, CDS gene name if relevant. Email results | Low | Find off-target information for a guide sequence | Mouse and human | Cas9 binding sites versus predicted cleavage sites. NGG or NNG PAMs | | Ease of use. Provides gene name if OT is in exon | Analyzes one sgRNA at a time. Does not account for identity of mismatches | Web |
| Cas OT | | [ | FASTA files | sgRNA and OT sites | Low/medium | Target sequences and OTs | Provided by user | Several OT rules and restrictions | | Many options. Alternative PAMs, paired sgRNAs, 5′ G | Programming knowledge necessary or helpful. Does not account for identity of mismatches | Perl script |
| CRISPRseek | | [ | Software package | Candidate sgRNAs with a variety of scores, dependent on parameters | High | Find target sequences and OT sites for multiple sequences; performs both nickase and paired guide design | Several common genomes | Many options. Very comprehensive in terms of design and scoring | | Many options | Very laborious to both install and operate, despite having extensive documentation. Does not account for identity of mismatches | Bioconductor package in R |
| ZiFiT | | [ | Input sequence | sgRNA, OTs with zero to three MMs. Position and identity of MM. Genomic position of OT | Low | Find target sequences and OTs for a single sequence | Nine common genomes | 5′ G or GG | | Ease of use | No on-target efficacy prediction. Does not account for identity of mismatches | Web |
| E-CRISP | | [ | Gene symbol/sequence | Many options for output in advanced mode; table provides sequence, three-part score (specificity, annotation of gene target regions hit, and on-target efficacy), context, number of hits | Low | Find target sequences for a single gene or sequence; performs guide evaluation as well (not tested); also Cas9 nickase design for paired sgRNAs | >30 genomes | Basic mode: user defines MM tolerance, PAM sequence. Advanced mode: many options for OT specs | | Lots of options, fast results, summary of all designs found | So many options; could be confusing. Does not account for identity of mismatches | Web |
| CRISPR Direct | | [ | Transcript/genome location/nucleotide sequence | Table with target position, sequence plus context, some sequence info (GC content, Tm, poly-T), target counts plus downloads | Low | Find target sequences for a single transcript/sequence with some limited off-target info | 20 common genomes | PAM type | | Very fast, visual display of target sequence and OT info | No options, no on-target efficacy, OT matches limited to number of target sites with 20/12/8-mer plus PAM matches. Does not account for identity of mismatches | Web |
| COD | | N/A | 23–400 bp input sequence | GenBank file and CSV file. OT scoring. Position, identity, number of MMs. Genomic position of OT site with link to graphic | Low | Find target sequences for an input sequence. Predicts and ranks OT sites | 23 common genomes | Length of target. OT stringency. NGG and/or NAG for OT | | Ease of use. OT scoring | No on-target prediction. Does not account for identity of mismatches. Slow | Web |
| sgRNAcas9 | | [ | Software package | Multiple files that include all possible designs for a given sequence plus information to filter based on cloning or on-target efficacy, OT information | High | Find target sequences for a sequence or sequences (has batch mode); good for generating a lot of candidates, with some limited on-target or OT information | User can provide any genome reference file | Can generate single or paired sgRNAs; many options for OT stringency (number of mismatches and number of offsets); several options for ease of cloning | | Can generate several candidates, with some (but not a comprehensive set of) options with respect to cloning ease and efficiency and OT matching | Difficult to use. No clear on-target efficacy score. Supports only NGG PAM currently. Does not account for identity of mismatches | Local (Perl script) |
| DNA 2.0 gRNA Design Tool | | N/A | Gene, locus, sequence | Display of targets; table: hit position, target sequence in context, score, overlapping gene info, number of splice variant targets | Low | Find target sequences for a single gene or sequence; performs Cas9 and nickase design | Human, mouse, | PAM type | | Simple interface, very fast results, simple and clear output | Tied in to commercial site; output has very few data points, and unclear what is available with respect to scoring. Does not account for identity of mismatches | Web/commercial |
| Cas-Designer | | N/A | Nucleotide sequence or FASTA | Target sequence and OT. Zero to two MMs and one bulge. Percentage GC. Link to Ensembl for OT sites | Low | Find target sequences. Uses Cas-OFFinder and Microhomology Predictor for OT searching | 30 genome builds | Accounts for bulge MMs. Alternative PAMs | | Ease of use. Fast. Accommodates bulges in OT prediction | No on-target efficacy score. OT prediction does not account for identity | Web |
| sgRNA Scorer 1.0 | | [ | Nucleotide sequence. FASTA files up to 10 kb sequences | Target sequence with activity score. Number of OT sites with genomic coordinates | Low | Find target sequences. OT searching using CasFinder. On-target activity scoring using support vector machine (SVM) model | 12 common genomes | | | On-target efficacy score. Offers a precomputed list of target sequences for all human and mouse genes | Slow. Email output. OT prediction does not account for identity | Web |
| Protospacer | | [ | Many inputs. Gene ID, genomic coordinates, etc. | Target sequence with activity score. Percentage GC, OT sites, positions, identities | Medium/high | Find target sequences, OT sites, prioritize sgRNAs | Provided by user | Accounts for MMs. Alternative PAMs | | On-target efficacy score based on sgRNA Designer rules (Table | Requires extensive setup. Not obvious where OTs fall with respect to CDS. OT prediction does not account for identity | Software, |
(a) ‘Low’: input format and run times support one-gene-at-a-time or one-guide-at-a-time queries. ‘Medium’: input format and run times support small batches of genes or sgRNAs, tens to hundreds of queries. ‘High’: input format and run times support genome-scale queries. (b) Options for sgRNA sequence criteria: alternative PAMs. Require 5′ G to promote Pol III-dependent transcription from the U6 promoter, or 5′ GG for in vitro transcription using the T7 polymerase. Avoid TTTT, which signals Pol III transcriptional termination.
Abbreviations: CDS coding sequence, MM mismatch, N/A not applicable, OT off-target, PAM protospacer adjacent motif, sgRNA single guide RNA, Tm melting temperature
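The sgRNA sequence criteria in the table footnote (a 5′ G for U6-driven expression, a 5′ GG for T7 in vitro transcription, and no TTTT run) are simple enough to check programmatically. The sketch below is one possible way to do so; the function name `check_spacer`, the `promoter` keyword and the returned field names are hypothetical. It also reports the GC fraction, which several tools list in their output, but it sets no GC threshold because the source gives none.

```python
from typing import Dict

# 5' nucleotides required by each expression system, as stated in the table footnote.
REQUIRED_5PRIME = {"U6": "G", "T7": "GG"}


def check_spacer(spacer: str, promoter: str = "U6") -> Dict[str, object]:
    """Report basic sequence criteria for one spacer written as DNA (A/C/G/T)."""
    spacer = spacer.upper()
    if not spacer:
        raise ValueError("spacer must not be empty")
    if promoter not in REQUIRED_5PRIME:
        raise ValueError(f"unknown promoter: {promoter!r}")
    gc_count = spacer.count("G") + spacer.count("C")
    return {
        # True when the spacer starts with the nucleotides the promoter needs.
        "has_required_5prime": spacer.startswith(REQUIRED_5PRIME[promoter]),
        # TTTT can act as a Pol III termination signal, so flag it.
        "has_tttt": "TTTT" in spacer,
        "gc_fraction": gc_count / len(spacer),
    }
```

A spacer that fails the 5′ check is not necessarily unusable; the sketch only flags the criterion so that a user can decide how to handle it.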
Fig. 3 Summary of experimental options for validating CRISPR edits at the target site and off-target sites, highlighting the varying degrees of comprehensiveness that can be achieved.