Nathan Wong, Weijun Liu, Xiaowei Wang.
Abstract
The CRISPR/Cas9 system has been rapidly adopted for genome editing. However, one major issue with this system is the lack of robust bioinformatics tools for design of single guide RNA (sgRNA), which determines the efficacy and specificity of genome editing. To address this pressing need, we analyze CRISPR RNA-seq data and identify many novel features that are characteristic of highly potent sgRNAs. These features are used to develop a bioinformatics tool for genome-wide design of sgRNAs with improved efficiency. These sgRNAs as well as the design tool are freely accessible via a web server, WU-CRISPR (http://crispr.wustl.edu).
Year: 2015 PMID: 26521937 PMCID: PMC4629399 DOI: 10.1186/s13059-015-0784-0
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1. Structural characteristics of sgRNAs. (a) Secondary structure of the sgRNA. The 20-nucleotide guide sequence is complementary to the target sequence and resides at the 5′ end of the sgRNA. The highlighted nucleotides could potentially base pair, leading to an extended stem-loop structure. (b) Statistical significance of position-specific nucleotide accessibility of functional sgRNAs compared with non-functional sgRNAs. (c) Comparison of position-specific nucleotide accessibilities between functional and non-functional sgRNAs.
Fig. 2. Thermodynamic properties of the guide sequence (gRNA). Functional and non-functional gRNAs were compared in the analysis. (a) Structural stability of the gRNA as evaluated by self-folding free energy (ΔG). (b) Structural stability of the gRNA/target sequence duplex as evaluated by free energy calculation.
Significant base counts in functional gRNAs
| Mono- or dinucleoside count | Enrichment ratio^a | P value^b |
|---|---|---|
| A | 1.39 | 9.3E–18 |
| U | 0.89 | 1.9E–03 |
| G | 0.92 | 6.2E–03 |
| C | 0.95 | 5.5E–02 |
| GG | 0.64 | 2.3E–11 |
| AG | 1.43 | 1.3E–09 |
| CA | 1.38 | 6.7E–09 |
| AC | 1.47 | 1.2E–08 |
| UU | 0.59 | 7.5E–08 |
| UA | 1.84 | 1.1E–07 |
| GC | 0.77 | 3.2E–06 |
^a The enrichment ratio was determined by comparing the average nucleoside counts of functional gRNAs to those of non-functional gRNAs. ^b The P values were calculated with Student's t-test.
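As a rough illustration of how values like those in this table could be derived, the sketch below computes an enrichment ratio and a pooled-variance Student's t statistic for a single mono- or dinucleotide. The function names, the toy sequences, and the choice to count overlapping occurrences are assumptions made for illustration, not details taken from the paper. Converting the t statistic into a P value needs a t-distribution routine, which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def count_motif(seq, motif):
    """Count occurrences of a motif, allowing overlaps (an assumption)."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def enrichment_ratio(functional, non_functional, motif):
    """Mean motif count in functional gRNAs divided by the mean in non-functional ones."""
    f = mean(count_motif(s, motif) for s in functional)
    n = mean(count_motif(s, motif) for s in non_functional)
    return f / n  # raises ZeroDivisionError if the motif never occurs in non_functional


def student_t(functional, non_functional, motif):
    """Two-sample Student's t statistic with pooled variance."""
    a = [count_motif(s, motif) for s in functional]
    b = [count_motif(s, motif) for s in non_functional]
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Toy 4-nt sequences, purely hypothetical; real guides are 20 nt long.
toy_functional = ["AAGU", "AACU"]
toy_non_functional = ["AGGU", "GGCU"]
ratio_a = enrichment_ratio(toy_functional, toy_non_functional, "A")
t_a = student_t(toy_functional, toy_non_functional, "A")
```

A ratio above 1 means the motif is more frequent in functional gRNAs (as for A or UA in the table), and a ratio below 1 means it is depleted (as for GG or UU).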
Fig. 3. Evaluation of the gRNA prediction model by receiver operating characteristic (ROC) curves. Two cross-validation strategies were employed: tenfold cross-validation and gene-based cross-validation.
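The caption does not spell out how the gene-based split was built. One plausible reading, assumed here, is that all gRNAs targeting the same gene are held out together, so the model is never tested on a gene it saw during training. The sketch below generates such splits; the function name and the toy gene labels are hypothetical.

```python
from collections import defaultdict


def gene_based_folds(genes):
    """Yield (held_out_gene, train_indices, test_indices), one fold per gene.

    `genes` lists the target gene of each gRNA, in dataset order.
    """
    by_gene = defaultdict(list)
    for index, gene in enumerate(genes):
        by_gene[gene].append(index)
    for gene, test_indices in by_gene.items():
        train_indices = [i for i, g in enumerate(genes) if g != gene]
        yield gene, train_indices, test_indices


# Hypothetical labels: five gRNAs targeting three genes.
toy_genes = ["geneA", "geneA", "geneB", "geneC", "geneC"]
folds = list(gene_based_folds(toy_genes))
```

Unlike a random tenfold split, this grouping keeps gRNAs of one gene from appearing on both sides of a fold, which gives a stricter estimate of how the model generalizes to unseen genes.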
gRNA feature filters that were applied before the SVM modeling process
| Filtered features | Excluded value | Enrichment ratio for non-functional gRNA |
|---|---|---|
| gRNA folding (∆G) | < −8 kcal/mol | 15.8 |
| Duplex binding (∆G) | < −22 kcal/mol | 3.5 |
| GC content | >80 % | 30.7 |
| UUU in the seed region | True | 10.5 |
| Repetitive bases | True | 4.2 |
| Position 19 | U | 2.6 |
| Position 20 | C or U | 2.5 |
Free energy (∆G) was calculated by RNAfold for gRNA self-folding and by the nearest neighbor method for binding stability of gRNA–target duplex
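The filters in this table can be sketched as a simple pre-screening function. This is a minimal illustration, not the WU-CRISPR implementation: the two free energy values are taken as precomputed inputs (for example from RNAfold and a nearest neighbor calculation), and the seed-region length (`seed_len`) and the run length that counts as "repetitive bases" (`min_run`) are hypothetical parameters, because the table does not define them.

```python
from itertools import groupby


def longest_run(guide):
    """Length of the longest stretch of identical consecutive bases."""
    return max(len(list(group)) for _, group in groupby(guide))


def prefilter_reasons(guide, fold_dg, duplex_dg, seed_len=6, min_run=4):
    """Return the list of filters that a 20-nt guide fails (empty list = passes).

    fold_dg and duplex_dg are free energies in kcal/mol, computed elsewhere.
    seed_len and min_run are illustrative assumptions, not values from the table.
    """
    guide = guide.upper().replace("T", "U")
    if len(guide) != 20:
        raise ValueError("expected a 20-nucleotide guide sequence")
    reasons = []
    if fold_dg < -8.0:
        reasons.append("gRNA folding")
    if duplex_dg < -22.0:
        reasons.append("duplex binding")
    gc = sum(base in "GC" for base in guide) / len(guide)
    if gc > 0.80:
        reasons.append("GC content")
    if "UUU" in guide[-seed_len:]:
        reasons.append("UUU in seed region")
    if longest_run(guide) >= min_run:
        reasons.append("repetitive bases")
    if guide[18] == "U":  # position 19, 1-based
        reasons.append("position 19")
    if guide[19] in "CU":  # position 20, 1-based
        reasons.append("position 20")
    return reasons


# Hypothetical guides and free energy values, for illustration only.
passing = prefilter_reasons("GACGUACGAUCGAUGCAUGA", fold_dg=-2.0, duplex_dg=-18.0)
failing = prefilter_reasons("GGCGCGCCGGCGCGGCGCUU", fold_dg=-10.0, duplex_dg=-25.0)
```

Returning the failed filter names rather than a single boolean makes it easy to see which criterion removed a candidate before any model scoring takes place.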
Fig. 4. Validation of WU-CRISPR using independent experimental data. Precision-recall curves were constructed to evaluate the performance of WU-CRISPR and three other bioinformatics algorithms for sgRNA design.