Josh Tycko, Michael Wainberg, Georgi K Marinov, Oana Ursu, Gaelen T Hess, Braeden K Ego, Amy Li, Alisa Truong, Alexandro E Trevino, Kaitlyn Spees, David Yao, Irene M Kaplow, Peyton G Greenside, David W Morgens, Douglas H Phanstiel, Michael P Snyder, Lacramioara Bintu, William J Greenleaf, Anshul Kundaje, Michael C Bassik.
Abstract
Pooled CRISPR-Cas9 screens are a powerful method for functionally characterizing regulatory elements in the non-coding genome, but off-target effects in these experiments have not been systematically evaluated. Here, we investigate Cas9, dCas9, and CRISPRi/a off-target activity in screens for essential regulatory elements. The single guide RNAs (sgRNAs) with the largest effects in genome-scale screens for essential CTCF loop anchors in K562 cells were not those that disrupted gene expression near the on-target CTCF anchor. Rather, these sgRNAs had high off-target activity that, while only weakly correlated with absolute off-target site number, could be predicted by the recently developed GuideScan specificity score. Screens conducted in parallel with CRISPRi/a, which do not induce double-stranded DNA breaks, revealed that a distinct set of off-targets also causes strong confounding fitness effects with these epigenome-editing tools. Promisingly, filtering of CRISPRi libraries using GuideScan specificity scores removed these confounded sgRNAs and enabled identification of essential regulatory elements.
Year: 2019 PMID: 31492858 PMCID: PMC6731277 DOI: 10.1038/s41467-019-11955-7
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 GuideScan specificity filtering of a genome-scale CRISPR-Cas9 screen for essential CTCF loop anchors. a Schematic of the CTCF loop anchor motif screen, with 2 to 5 sgRNAs targeting each CTCF motif. b Fitness effects are reproducible between independently transduced biological replicates of the screen. sgRNAs targeting essential gene exons or the BCR-ABL amplification drop out during the growth screen, as expected. Guide enrichment values are the log2(fold-change) of an sgRNA’s sequencing counts after the screen relative to the original plasmid pool, computed with the casTLE screen analysis software[5]. c The growth effects of CTCF motif-targeting sgRNAs are validated in individual competitive growth assays after lentiviral delivery of single guides to K562-Cas9 cells. Error bars are the standard deviation of three technical replicates. d Comparison of sgRNA fitness effects with the number of off-target sites with 2-3 mismatches. Any sgRNAs with off-target sites with only 0 or 1 mismatch, as determined by the GuideScan search algorithm, are excluded. e Guides with low specificity scores are significantly enriched among CTCF motif-targeting guides with fitness effects. Fisher’s exact test provided the p-value for the association between fitness effect and specificity, using the 2 × 2 contingency table of the numbers of guides in each quadrant defined by the thresholds drawn as black lines. Numbers in corners correspond to the number of CTCF site-targeting guides (blue circles) in the quadrant. The off-target search was done with GuideScan, which retrieves all off-target locations with 2 or 3 mismatches to the sgRNA spacer. sgRNAs with >1 perfect match to the genome or >0 off-target locations with only 1 mismatch are not searchable within the GuideScan trie data structure and were excluded from this analysis. f Filtering for high-specificity scores removes all CTCF motifs with concordant evidence of fitness effects from multiple sgRNAs. 
Gray circles are screen biological replicates. Source data are available in the Source Data file.
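The two quantities named in panels b and e of Fig. 1 can be illustrated with a minimal sketch. This is not the casTLE implementation: the read-depth normalization, the pseudocount, and the function names log2_enrichment and fisher_exact_two_sided are assumptions made for illustration only. The Fisher's exact test is written from the hypergeometric distribution so that the example depends only on the Python standard library.

```python
import math


def log2_enrichment(count_after, count_plasmid, total_after, total_plasmid,
                    pseudocount=1.0):
    """log2 fold-change of one sgRNA's read fraction after the screen
    relative to the plasmid pool. Normalizing by library totals and adding
    a pseudocount are assumptions, not details taken from the source."""
    frac_after = (count_after + pseudocount) / total_after
    frac_plasmid = (count_plasmid + pseudocount) / total_plasmid
    return math.log2(frac_after / frac_plasmid)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)  # small tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical quadrant counts, not values from the figure.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

In this sketch the four arguments of fisher_exact_two_sided stand for the guide counts in the four quadrants formed by the fitness-effect and specificity-score thresholds.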
Fig. 2 Low-specificity sgRNAs confound identification of essential motifs in a dense-tiling screen of loop anchors and enhancers. a A dense-tiling Cas9 growth screen was performed with sgRNAs densely tiling two types of regions: (1) 1 kb windows around select hit and non-hit CTCF loop anchors from the CTCF motif screen and (2) two enhancers of GATA1, previously called eGATA1 and eHDAC6. b As a positive control, we verified that the dense-tiling screen correctly maps the boundaries of exons of essential genes with high-specificity sgRNAs. Each point is the average enrichment of two biological replicates and the bar is the standard error. c Dense-tiling screen results from a 1 kb region centered on a motif that was a false positive hit in the original motif-targeting screen (targeted with sgRNAs 15776 and 15777 and also shown in Fig. 1 and Supplementary Fig. 1). All evidence for the essentiality of a CTCF motif comes from low-specificity sgRNAs. Motifs in ChIP-seq peaks are shown as black boxes and CTCF motifs as green boxes. d Dense-tiling screen results from two regions containing enhancers of the essential gene GATA1. sgRNAs selected for validation studies are labeled (e.g., 1L represents the first sgRNA with a low specificity score). ChromHMM is colored according to the 15-state scheme[76] (briefly, reds are predicted promoter states, yellows are enhancer states, and greens are other transcriptionally active states). e The enhancer motif-targeting sgRNAs identified in d do not significantly decrease GATA1 expression according to qPCR (p > 0.05, ANOVA). Each dot is an sgRNA infection biological replicate. f The sgRNAs identified in d do not significantly decrease GATA1 protein expression according to Western blot. g The sgRNAs identified in d do not significantly decrease GATA1 protein expression according to flow cytometry for GATA1 protein level. Additional validation data are shown in Supplementary Fig. 4. Source data are available in the Source Data file.
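The per-guide summary described in panel b (average enrichment of two biological replicates, with the standard error as the bar) can be sketched as follows. The use of the sample standard deviation (n − 1 denominator) and the function name summarize_replicates are assumptions; the source does not state which estimator was used.

```python
import math
import statistics


def summarize_replicates(enrichments):
    """Return (mean, standard error) of replicate log2 enrichment values.

    The standard error is the sample standard deviation divided by the
    square root of the number of replicates (an assumed convention)."""
    n = len(enrichments)
    if n < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    mean = statistics.mean(enrichments)
    std_err = statistics.stdev(enrichments) / math.sqrt(n)
    return mean, std_err


# Hypothetical log2 enrichments for one sgRNA in two replicates.
mean, std_err = summarize_replicates([-2.0, -4.0])
```

With exactly two replicates, this standard error reduces to half the absolute difference between the two values.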
Fig. 3 GuideScan specificity filtering of CRISPRi library reduces false positives. a Four parallel screens were conducted tiling the loci of essential growth genes GATA1, MYB, and ZMYND8 using the four platforms Cas9, CRISPRa, CRISPRi and dCas9. b Zoomed-in view of screen data around essential gene GATA1. Highlighted are regulatory elements with known effects on cell growth: enhancers eGATA1 and eHDAC6, and the GATA1 transcription start site. ChromHMM is colored according to the 15-state scheme[76] (briefly, reds are predicted promoter states, yellows are enhancer states, and greens are other transcriptionally active states). Each point is the average enrichment of two screen biological replicates and the bar is the standard error. c Enrichment of growth effects among low-scoring sgRNAs with no perfectly matching and no 1-mismatch off-target sites. p-value from the Fisher’s exact test for the 2 × 2 table with quadrants as drawn and guide counts as labeled in the corners; these counts include all the sgRNAs regardless of the categories indicated in colors. d Clustering of low-specificity sgRNAs reveals that each perturbation has off-target activity that reduces cell fitness with a unique subset of the low-specificity sgRNAs. Shown is the subset of sgRNAs that are upstream of eGATA1 or downstream of eHDAC6 (i.e., sgRNAs with predominantly off-target effects) and that also have a strong guide enrichment ≤ −3 in at least one replicate. Color scale is the log2 fold-change guide enrichment. e Filtering of sgRNAs in panel b with GuideScan specificity scores reduces noise. f After filtering, the CRISPRi sgRNAs in peaks have validated effects on GATA1 expression by qPCR (p < 0.05, ANOVA). Each dot is an sgRNA infection biological replicate. g Effects of indicated sgRNAs on GATA1 protein expression measured by Western blot. h Effects of indicated sgRNAs on GATA1 protein expression measured by flow cytometry. 
Here, cells expressing an sgRNA and mCherry were co-cultured with the blank parental cell line, stained for GATA1 protein, and analyzed by flow cytometry. We then compared the distribution of GATA1 protein level between the mCherry+ and blank control cells from the same sample. Horizontal lines show the median and quartiles. Source data are available in the Source Data file.
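The two selections described in panels d and e of Fig. 3 (keeping guides by specificity score, and picking guides with enrichment ≤ −3 in at least one replicate) can be sketched as below. The record layout, the function names, and the example values are hypothetical; the default score cutoff of 0.2 is borrowed from the Fig. 4 caption and is an assumption when applied to this panel.

```python
def filter_by_specificity(guides, min_score=0.2):
    """Keep guides whose specificity score exceeds min_score."""
    return [g for g in guides if g["specificity"] > min_score]


def strong_dropouts(guides, cutoff=-3.0):
    """Keep guides with log2 enrichment at or below cutoff in any replicate."""
    return [g for g in guides if any(e <= cutoff for e in g["enrichment"])]


# Hypothetical guide records: a specificity score plus one log2 enrichment
# value per biological replicate.
guides = [
    {"id": "g1", "specificity": 0.05, "enrichment": [-3.4, -2.9]},
    {"id": "g2", "specificity": 0.60, "enrichment": [-0.2, 0.1]},
    {"id": "g3", "specificity": 0.45, "enrichment": [-3.1, -3.6]},
]

kept = filter_by_specificity(guides)   # drops the low-specificity guide g1
hits = strong_dropouts(kept)           # strong effects among the kept guides
```

Applying the specificity filter before calling hits means a strong dropout such as the hypothetical g1 never reaches the hit list, which is the kind of confounded guide the filtering is meant to remove.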
Fig. 4 High-specificity CRISPR-Cas9 screen designs for non-coding elements. a Distribution of GuideScan specificity scores for two non-coding libraries from this study and a gene-targeting library, in comparison to all possible sgRNAs. b Most TSSs can be targeted with multiple high-specificity sgRNAs. Fraction of TSSs in the ENCODE SCREEN database of ccREs that can be targeted with dCas9-based epigenome editors within a window of +/−100 bp, after filtering for GuideScan scores >0.2. c Fraction of motifs in TFBS motifs that can be targeted with sgRNAs with a cut site in the motif, after filtering out low-specificity sgRNAs.
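The coverage statistic in panel b of Fig. 4 can be sketched as the fraction of TSSs that have at least a given number of high-specificity sgRNAs inside a ±100 bp window. The tuple layouts, the function name targetable_fraction, and the min_guides parameter are hypothetical; only the window size and the score cutoff of 0.2 come from the caption.

```python
import bisect


def targetable_fraction(tss_positions, guides, window=100, min_score=0.2,
                        min_guides=1):
    """Fraction of TSSs with >= min_guides high-specificity guides nearby.

    tss_positions: list of (chrom, position)
    guides: list of (chrom, position, specificity_score)
    """
    # Index positions of guides that pass the score filter, per chromosome.
    passing = {}
    for chrom, pos, score in guides:
        if score > min_score:
            passing.setdefault(chrom, []).append(pos)
    for positions in passing.values():
        positions.sort()

    if not tss_positions:
        return 0.0

    targetable = 0
    for chrom, tss in tss_positions:
        positions = passing.get(chrom, [])
        # Count passing guides in the closed interval [tss - window, tss + window].
        left = bisect.bisect_left(positions, tss - window)
        right = bisect.bisect_right(positions, tss + window)
        if right - left >= min_guides:
            targetable += 1
    return targetable / len(tss_positions)


# Hypothetical coordinates and scores, not data from the study.
tss_list = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
guide_list = [("chr1", 950, 0.5), ("chr1", 1100, 0.3),
              ("chr1", 5050, 0.1), ("chr2", 401, 0.9)]
fraction = targetable_fraction(tss_list, guide_list)
```

Raising min_guides to 2 or more gives the stricter "multiple sgRNAs per TSS" reading of the caption.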