Francesco Iorio, Fiona M Behan, Emanuel Gonçalves, Shriram G Bhosle, Elisabeth Chen, Rebecca Shepherd, Charlotte Beaver, Rizwan Ansari, Rachel Pooley, Piers Wilkinson, Sarah Harper, Adam P Butler, Euan A Stronach, Julio Saez-Rodriguez, Kosuke Yusa, Mathew J Garnett.
Abstract
BACKGROUND: Genome editing by CRISPR-Cas9 technology allows large-scale screening of gene essentiality in cancer. A confounding factor when interpreting CRISPR-Cas9 screens is the high false-positive rate in detecting essential genes within copy-number-amplified regions of the genome. We have developed the computational tool CRISPRcleanR, which identifies and corrects gene-independent responses to CRISPR-Cas9 targeting. CRISPRcleanR uses an unsupervised approach based on the segmentation of single-guide RNA (sgRNA) fold-change values across the genome, without making any assumption about the copy number status of the targeted genes.
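The idea of segmenting sgRNA fold changes along the genome and then correcting whole segments can be illustrated with a toy sketch. This is not CRISPRcleanR's implementation: the recursive binary segmentation, the `min_len` and `min_gain` thresholds, and the choice to re-centre each segment on the chromosome-wide median are all assumptions made for illustration, and every function name is hypothetical. The input is a list of sgRNA logFCs already sorted by genomic position on one chromosome.

```python
from statistics import mean, median


def sse(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def segment(values, start=0, min_len=3, min_gain=1.0):
    """Recursive binary segmentation (illustrative, not the tool's method).

    Returns a list of (start, end) index pairs, end exclusive, covering the
    input. A split is accepted only if it lowers the squared error by at
    least `min_gain` and leaves at least `min_len` sgRNAs on each side.
    """
    n = len(values)
    best_gain, best_split = 0.0, None
    if n >= 2 * min_len:
        total = sse(values)
        for k in range(min_len, n - min_len + 1):
            gain = total - sse(values[:k]) - sse(values[k:])
            if gain > best_gain:
                best_gain, best_split = gain, k
    if best_split is None or best_gain < min_gain:
        return [(start, start + n)]
    left = segment(values[:best_split], start, min_len, min_gain)
    right = segment(values[best_split:], start + best_split, min_len, min_gain)
    return left + right


def correct(values, min_len=3, min_gain=1.0):
    """Shift each segment so its mean equals the chromosome-wide median.

    Assumption: the median is used as the 'unbiased' reference level.
    Differences between sgRNAs inside a segment are left untouched.
    """
    target = median(values)
    corrected = list(values)
    for lo, hi in segment(values, 0, min_len, min_gain):
        shift = mean(values[lo:hi]) - target
        for i in range(lo, hi):
            corrected[i] = values[i] - shift
    return corrected


# Toy chromosome: a block of uniformly depleted sgRNAs in the middle.
toy_logfc = [0.1, -0.1, 0.0, 0.1, -0.1, -1.6, -1.4, -1.5, -1.5, 0.0, 0.1, -0.1]
print(segment(toy_logfc))
print(correct(toy_logfc))
```

No copy number information enters the sketch, which mirrors the unsupervised framing of the abstract: a segment is flagged purely because neighbouring sgRNAs share a shifted logFC level.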
Keywords: Bias correction; CRISPR-Cas9; Cancer; Gene copy number; Genetic screens
Year: 2018 PMID: 30103702 PMCID: PMC6088408 DOI: 10.1186/s12864-018-4989-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Heterogeneous gene-independent responses to CRISPR-Cas9 targeting. a Average logFC values of sgRNAs within segments of equal CN (excluding FE and histones) for three cell lines. Each circle represents a CN segment of the indicated copy number. Asterisks mark the CN at which a significant difference (Welch's t-test, p < 0.05) is initially (starting point) and continuously (critical point) observed compared to logFC values at CN = 2. Box-plots show the median, inter-quartile ranges and 95% confidence intervals. b Same as for A but considering only non-expressed genes (FPKM < 0.05). c, d, e Segments of equal gene copy number and segments of equal sgRNA logFCs for selected chromosomes in three cell lines
Fig. 2 Unsupervised detection of segments of equal sgRNA logFCs and their correction. a and b Example segments of equal gene copy number and equal sgRNA logFC values detected and corrected by CRISPRcleanR in two cell lines. c logFC values of sgRNAs of the entire library for all cell lines grouped according to the copy number of their targeted gene before (left) and after (right) CRISPRcleanR correction. Box-plots show the median, inter-quartile ranges and 95% confidence intervals. d Recall curves of sgRNAs when classified as targeting amplified genes, amplified non-expressed genes, FE genes, and non-essential genes before and after CRISPRcleanR correction, for an example cell line (EPLC-272H). e Assessment of CRISPRcleanR correction comparing Recall at 5% FDR (top row) or area under the Recall curve (AURC, bottom row) of genes in six predefined gene sets based on their uncorrected or corrected logFCs (averaged across targeting sgRNAs)
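The two summary metrics named in panels d and e can be sketched as follows. The definitions used here are assumptions, since the caption does not spell them out: the recall curve is taken as the cumulative fraction of a gene set recovered while walking down a list ranked from most to least depleted, AURC as the mean height of that curve, and the 5% FDR cutoff as the deepest rank at which at most 5% of the reference calls so far are negatives (for example, FE genes as positives and non-essential genes as negatives). All function names are hypothetical.

```python
def recall_curve(ranked_ids, gene_set):
    """Cumulative fraction of `gene_set` recovered along a ranked list."""
    found, curve = 0, []
    for item in ranked_ids:
        found += item in gene_set
        curve.append(found / len(gene_set))
    return curve


def aurc(curve):
    """Area under the recall curve with the rank axis scaled to [0, 1]."""
    return sum(curve) / len(curve)


def recall_at_fdr(scores, positives, negatives, gene_set, fdr=0.05):
    """Recall of `gene_set` at a logFC cutoff fixed by reference sets.

    `scores` maps id -> logFC (lower means more depleted). The cutoff is the
    score at the deepest rank where negatives / (positives + negatives)
    among the reference calls so far is still <= `fdr`.
    Assumes every member of `gene_set` has an entry in `scores`.
    """
    tp = fp = 0
    cutoff = None
    for item in sorted(scores, key=scores.get):
        tp += item in positives
        fp += item in negatives
        if tp + fp and fp / (tp + fp) <= fdr:
            cutoff = scores[item]
    if cutoff is None:
        return 0.0
    hits = sum(1 for item in gene_set if scores[item] <= cutoff)
    return hits / len(gene_set)


# Toy example with made-up identifiers and logFCs.
scores = {"fe1": -3.0, "fe2": -2.5, "amp1": -2.0, "fe3": -1.5,
          "ne1": -1.0, "amp2": -0.2, "ne2": 0.1}
ranked = sorted(scores, key=scores.get)
amplified = {"amp1", "amp2"}
print(aurc(recall_curve(ranked, amplified)))
print(recall_at_fdr(scores, {"fe1", "fe2", "fe3"}, {"ne1", "ne2"}, amplified))
```

Under these definitions, a successful correction should lower both numbers for amplified and amplified non-expressed genes while leaving them largely unchanged for FE genes.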
Fig. 3 CRISPRcleanR is effective with multiple different sgRNA libraries. a Recall curves for three sgRNA libraries when classifying sgRNAs targeting amplified genes, amplified non-expressed genes, FE genes, and non-essential genes using sgRNA logFCs before (first row of plots) and after (second row of plots) CRISPRcleanR correction. b Variation of the area under the recall curve for sgRNAs targeting genes in six predefined sets, based on their uncorrected/corrected logFCs, across the three different libraries (one circle per library). c Segments within chromosome 8 of equal gene copy number juxtaposed to segments of equal sgRNA logFCs before and after CRISPRcleanR in HT-29 cells screened with three different sgRNA libraries. The position of MYC is shown with a blue line
Fig. 4 CRISPRcleanR retains overall essentiality profiles. a Example precision/recall curves in HuP-T3 cells for the indicated number of top depleted sgRNAs after CRISPRcleanR correction, classified based on their uncorrected sgRNA logFC rank position. b Area under the precision/recall curves defined as for A for all cell lines. Box-plots show the median, inter-quartile ranges and 95% confidence intervals
Fig. 5 CRISPRcleanR-corrected sgRNA counts and downstream analysis with MAGeCK. a and b Normalised counts of sgRNAs of the transfected libraries versus the control plasmid for FE and non-essential genes (first two rows of plots), CN amplified genes (third row) and CN non-expressed genes (fourth row), for two example cell lines before (first and third column) and after (second and fourth column) CRISPRcleanR correction. Essentialities for CN-amplified cancer driver genes such as MYC, ERBB2 and CCND1 are retained post correction. For the sake of readability, only genes with at least 10 copies have been highlighted. c Comparison of recall using MAGeCK for sgRNAs targeting genes in six predefined gene sets when using as input uncorrected and CRISPRcleanR-corrected sgRNA counts
Fig. 6 CRISPRcleanR enables detection of cancer gene dependencies. a Detection of EGFR and PIK3CA dependencies at the level of targeting sgRNAs in mutant cancer cell lines. Rank position of sgRNAs targeting the indicated genes before (top) and after (bottom) CRISPRcleanR correction. FE and non-essential genes are shown for comparison. b A CN-amplified region of chromosome 8 in the HT-29 cell line including MYC and 3 surrounding upstream/downstream genes. Expanded view of sgRNAs targeting MYC and its surrounding genes, with each gene identified by a different colour. The heatmaps (first 7 columns) show ranked positions of the sgRNAs targeting the 7 considered genes (blue bars) before (top heatmap) and after (bottom heatmap) CRISPRcleanR correction. The last two columns show rank positions for the sgRNAs targeting FE genes (second last column) and non-essential genes (last column). c Same as for B but considering a region on chromosome 16 in the NCI-H2170 cell line, including ERBB2 and four flanking upstream/downstream genes and CDK12