Traver Hart1, Amy Hin Yan Tong2, Katie Chan2, Jolanda Van Leeuwen2, Ashwin Seetharaman2, Michael Aregger2, Megha Chandrashekhar2, Nicole Hustedt3, Sahil Seth4, Avery Noonan2, Andrea Habsid2, Olga Sizova2, Lyudmila Nedyalkova2, Ryan Climie2, Leanne Tworzyanski2, Keith Lawson2, Maria Augusta Sartori2, Sabriyeh Alibeh2, David Tieu2,5, Sanna Masud2,5, Patricia Mero2, Alexander Weiss2, Kevin R Brown2, Matej Usaj2, Maximilian Billmann6, Mahfuzur Rahman6, Michael Constanzo2, Chad L Myers6, Brenda J Andrews2,5,7, Charles Boone2,5,7, Daniel Durocher3,5, Jason Moffat8,5,7.
Abstract
The adaptation of CRISPR/SpCas9 technology to mammalian cell lines is transforming the study of human functional genomics. Pooled libraries of CRISPR guide RNAs (gRNAs) targeting human protein-coding genes and encoded in viral vectors have been used to systematically create gene knockouts in a variety of human cancer and immortalized cell lines, in an effort to identify whether these knockouts cause cellular fitness defects. Previous work has shown that CRISPR screens are more sensitive and specific than pooled-library shRNA screens in similar assays, but currently there exists significant variability across CRISPR library designs and experimental protocols. In this study, we reanalyze 17 genome-scale knockout screens in human cell lines from three research groups, using three different genome-scale gRNA libraries. Using the Bayesian Analysis of Gene Essentiality algorithm to identify essential genes, we refine and expand our previously defined set of human core essential genes from 360 to 684 genes. We use this expanded set of reference core essential genes, CEG2, plus empirical data from six CRISPR knockout screens to guide the design of a sequence-optimized gRNA library, the Toronto KnockOut version 3.0 (TKOv3) library. We then demonstrate the high effectiveness of the library relative to reference sets of essential and nonessential genes, as well as other screens using similar approaches. The optimized TKOv3 library, combined with the CEG2 reference set, provides an efficient, highly optimized platform for performing and assessing gene knockout screens in human cell lines.
Keywords: CRISPR/Cas9; cancer cell lines; core essential genes; genetic screens
Year: 2017 PMID: 28655737 PMCID: PMC5555476 DOI: 10.1534/g3.117.041277
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 2. Effect of experimental design on screen performance. (A and B) Effect of number of gRNA per gene. (A) Subsets of the Sabatini library were randomly selected and evaluated using BAGEL. The fraction of CEG2 detected is plotted as a function of the number of gRNA per gene. Error bars represent SD of 10 random samples from the Sabatini library. (B) Incremental increase in the total number of essential genes per screen vs. incremental increase in the number of gRNA per gene. (C and D) Effect of number of replicates per experiment. The TKOv1 screen in HAP1 cells and the Yusa screen in HT29 cells, each screened at multiple timepoints, were reanalyzed using all combinations of one, two, or three replicates per screen. (C) The fraction of CEG2 reference essentials identified vs. the number of replicates. (D) The incremental increase in total number of essential genes as the number of replicates is increased.
Number of guide RNAs ranked by sequence score (SeqScore) and included in the TKOv3 library
| Rank | Class | SeqScore | No. of Candidate Guides | No. of gRNAs Added | Cumulative gRNAs in Library |
|---|---|---|---|---|---|
| 1 | 1 | >0.0 | 1501 | 679 | 679 |
| 2 | 1,2,3 | >0.85 | 286,415 | 63,834 | 64,513 |
| 3 | 1,2,3,4 | >0.85 | 307,059 | 1069 | 65,582 |
| 4 | 1,2,3 | 0.0–0.85 | 304,477 | 3476 | 69,058 |
| 5 | 1,2,3,4 | −1.0–0.85 | 837,136 | 1890 | 70,948 |
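The last column of the table is the running total of the "No. of gRNAs Added" column. The short Python check below reproduces it from the per-rank counts. It is only an arithmetic sanity check on the tabulated values, not a sketch of the library design procedure itself; the variable names are illustrative.

```python
from itertools import accumulate

# gRNAs added at each rank (1 through 5), copied from the table
added = [679, 63_834, 1_069, 3_476, 1_890]

# Running total, which should match the "Cumulative gRNAs in Library" column
cumulative = list(accumulate(added))
print(cumulative)  # [679, 64513, 65582, 69058, 70948]
```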
Figure 1. (A) List of CRISPR knockout screens used for this study. (B) Precision-recall curves for the screens in (A) using gold standards defined in Hart et al. Dashed lines represent low-performing screens that were excluded from further analysis. (C) Number of genes assayed by at least three gRNA per gene, across the 12 screens. (D) Number of genes classified as essential (BF ≥ 6, FDR ≤ 3%) across the 12 screens. (E) Fraction of screens in which a gene is classified as essential. Genes assayed in at least seven screens and essential in 85% of screens (red) are CEG2. (F) CEG2 (n = 684) is substantially larger and only overlaps CEG1 (n = 360; Hart et al.) by ∼50%. (G) Functional characterization of CEG2 (Core-v2) vs. CEG1 (Core-v1).
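The CEG2 selection rule in panels (D) and (E) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes that "essential in 85% of screens" means at least 85% of the screens in which the gene was assayed, and that a gene counts as essential in a screen when its Bayes factor is at least 6. The function name, the data layout, and the gene and screen names are hypothetical.

```python
def core_essentials(bf_by_gene, bf_threshold=6.0, min_screens=7, min_fraction=0.85):
    """Return genes meeting the (assumed) CEG2 rule.

    bf_by_gene maps gene -> {screen: Bayes factor}; a screen that did not
    assay the gene is simply absent from the inner dict.
    """
    core = []
    for gene, scores in bf_by_gene.items():
        if len(scores) < min_screens:
            continue  # assayed in too few screens
        n_essential = sum(1 for bf in scores.values() if bf >= bf_threshold)
        if n_essential / len(scores) >= min_fraction:
            core.append(gene)
    return sorted(core)


def _screens(n_essential, n_total):
    # Toy data: the first n_essential screens get BF 10, the rest BF -5
    return {f"screen{i}": (10.0 if i < n_essential else -5.0) for i in range(n_total)}


toy = {
    "GENE_A": _screens(7, 8),  # 87.5% of 8 screens: passes
    "GENE_B": _screens(6, 8),  # 75% of 8 screens: fails the fraction cutoff
    "GENE_C": _screens(5, 5),  # essential everywhere, but only 5 screens
}
print(core_essentials(toy))  # ['GENE_A']
```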
Figure 3. Nonessentials vs. nontargeting controls. The distribution of observed fold-changes of gRNA targeting nonessential genes (black) is compared to the distribution for nontargeting control gRNA (green), in the Sabatini screens of Jiyoye (A), K562 (B), KBM7 (C), and Raji (D) cells; P-value from t-test. For reference, the fold-change of gRNA targeting essential genes is also shown (red).
Figure 4. Sequence signature of high-performing guides. (A) Heatmap of the guide score derived from high-performing guides in TKOv1 screens. (B) Across the TKOv1 supplemental library in the HCT116 screen, gRNA targeting CEG2 with sequence scores in the top quartile (red) are compared with gRNA with scores in the bottom quartile (blue), and guides targeting nonessential genes are shown in black. (C–F) Similar plots for TKOv1 (HeLa screen), Yusa, Sabatini, and GeCKOv2 libraries.
Figure 5. Evaluation of TKOv3 library. (A) Precision-recall curves of TKOv1 and TKOv3 screens in HAP1 cells. (B) Comparison of essential genes in TKOv3 vs. TKOv1 and HAP1 essentials from Blomen et al. at 5% FDR. (C) TKOv3 guides targeting essential vs. nonessential genes in HAP1. Guides targeting essential genes, with fold-change <5th percentile of guides targeting nonessential genes, are defined as active guides. (D) The fraction of active guides (active guides targeting essential genes / all guides targeting essential genes) across the five libraries tested. (E) Distribution of sequence scores for all candidate gRNA sequences (n ∼ 2.5 million) compared to published CRISPR/SpCas9 libraries. (F) Overlap of gRNA sequences in the top three libraries by sequence score.
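The "active guide" definition in panels (C) and (D) amounts to a percentile cutoff followed by a ratio. The Python sketch below illustrates it on made-up numbers. It assumes fold-changes are on a log scale and uses a linearly interpolated percentile; the caption does not specify the percentile method, and the function names and toy values are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def active_fraction(essential_fc, nonessential_fc, q=5):
    """Fraction of essential-gene guides below the q-th percentile of nonessential guides."""
    cutoff = percentile(nonessential_fc, q)
    n_active = sum(1 for fc in essential_fc if fc < cutoff)
    return n_active / len(essential_fc)


# Toy fold-changes (illustrative only, not data from the paper)
nonessential_fc = [i / 10 for i in range(-10, 11)]  # -1.0 to 1.0 in steps of 0.1
essential_fc = [-3.0, -2.0, -1.5, -0.5, 0.1]

frac = active_fraction(essential_fc, nonessential_fc)
print(frac)  # 0.6: three of five guides fall below the cutoff of -0.9
```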