Glenn S Cowley, Barbara A Weir, Francisca Vazquez, Pablo Tamayo, Justine A Scott, Scott Rusin, Alexandra East-Seletsky, Levi D Ali, William FJ Gerath, Sarah E Pantel, Patrick H Lizotte, Guozhi Jiang, Jessica Hsiao, Aviad Tsherniak, Elizabeth Dwinell, Simon Aoyama, Michael Okamoto, William Harrington, Ellen Gelfand, Thomas M Green, Mark J Tomko, Shuba Gopal, Terence C Wong, Hubo Li, Sara Howell, Nicolas Stransky, Ted Liefeld, Dongkeun Jang, Jonathan Bistline, Barbara Hill Meyers, Scott A Armstrong, Ken C Anderson, Kimberly Stegmaier, Michael Reich, David Pellman, Jesse S Boehm, Jill P Mesirov, Todd R Golub, David E Root, William C Hahn.
Abstract
Using a genome-scale, lentivirally delivered shRNA library, we performed massively parallel pooled shRNA screens in 216 cancer cell lines to identify genes that are required for cell proliferation and/or viability. Cell line dependencies on 11,000 genes were interrogated by 5 shRNAs per gene. The proliferation effect of each shRNA in each cell line was assessed by transducing a population of 11M cells with one shRNA-virus per cell and determining the relative enrichment or depletion of each of the 54,000 shRNAs after 16 population doublings using Next Generation Sequencing. All the cell lines were screened using standardized conditions to best assess differential genetic dependencies across cell lines. When combined with genomic characterization of these cell lines, this dataset facilitates the linkage of genetic dependencies with specific cellular contexts (e.g., gene mutations or cell lineage). To enable such comparisons, we developed and provided a bioinformatics tool to identify linear and nonlinear correlations between these features.
Year: 2014 PMID: 25984343 PMCID: PMC4432652 DOI: 10.1038/sdata.2014.35
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Figure 1. Schematic representation of the schema used for pooled shRNA screening.
Figure 2. Assessment of data accuracy using DNA pools containing known relative proportions of DNA.
Two 45,000-shRNA pools were created by combining 4 subsets of the shRNA library plasmids (labeled in black, red, green, and blue) in a 1:1:1:1 ratio of concentrations for the ‘Reference pool’ and in a 1:4:16:64 ratio for the ‘Dilution pool.’ The observed separation of the 4 subsets of shRNAs according to their known relative proportions in the 2 pools illustrates the ability of (a) NGS and (b) Affymetrix arrays to deconvolve the pooled shRNA library.
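The separation expected between the four subsets follows directly from the stated mixing ratios. The following is a minimal sketch of that arithmetic, assuming that abundances are normalized to the total of each pool and that the same subsets appear in both pools; the function names are illustrative and are not taken from the paper.

```python
import math

# Mixing ratios stated in the caption for the four plasmid subsets.
reference_ratio = [1, 1, 1, 1]
dilution_ratio = [1, 4, 16, 64]


def pool_fractions(ratio):
    """Convert mixing ratios into the fraction of the pool each subset occupies."""
    total = sum(ratio)
    return [r / total for r in ratio]


def expected_log2_ratios(dilution, reference):
    """Expected log2(dilution / reference) abundance for each subset.

    Assumption: abundance is measured relative to the whole pool, so the
    size of each subset cancels when the two pools are compared.
    """
    d = pool_fractions(dilution)
    r = pool_fractions(reference)
    return [math.log2(di / ri) for di, ri in zip(d, r)]


expected = expected_log2_ratios(dilution_ratio, reference_ratio)
for ratio, value in zip(dilution_ratio, expected):
    print(f"subset mixed at {ratio:>2}: expected log2 ratio = {value:+.2f}")
```

Because each subset is diluted 4-fold relative to the next, adjacent subsets should sit 2 log2 units apart under these assumptions, which is the spacing a well-behaved deconvolution method would be expected to reproduce.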
Figure 3. Comparison of pooled screen measurements from sequencing deconvolution against individual shRNA proliferation tests.
The relative abundances (fold change values) of 350 shRNAs measured from sequencing deconvolution of four OVCAR-8 replicates (y-axis) are plotted against the relative abundance of OVCAR-8 cells (x-axis) infected with each shRNA encoded in a GFP+ plasmid, measured at 7 days post-infection in the competition assay [3]. The circled dot indicates the median value, boxes represent the 25th to 75th percentile, and whiskers extend to the full range of the data for those 4 replicates.
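As an illustration of the quantities in this figure, the sketch below computes a log2 fold change for one shRNA from read counts and summarizes four replicates by median, quartiles, and range. The normalization (reads per million with a pseudocount of 1) and all read counts are assumptions made for the example; the caption does not specify how fold change values were derived.

```python
import math
import statistics


def log2_fold_change(final_count, initial_count, final_total, initial_total,
                     pseudocount=1.0):
    """log2 of an shRNA's normalized abundance at the end vs. the start.

    Counts are scaled to reads per million. The pseudocount is an
    assumption that keeps shRNAs with zero reads finite.
    """
    final_rpm = (final_count + pseudocount) / final_total * 1e6
    initial_rpm = (initial_count + pseudocount) / initial_total * 1e6
    return math.log2(final_rpm / initial_rpm)


def summarize(values):
    """Median, quartiles, and full range, as drawn in a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


# Hypothetical read counts for one shRNA: (reads for the shRNA, total reads).
initial = (400, 10_000_000)
replicates = [(100, 10_000_000), (120, 12_000_000),
              (90, 9_000_000), (110, 10_000_000)]

fold_changes = [
    log2_fold_change(count, initial[0], total, initial[1])
    for count, total in replicates
]
summary = summarize(fold_changes)
print(summary)
```

Negative values indicate depletion of the shRNA over the course of the screen, and the spread across replicates corresponds to the box and whiskers drawn for each shRNA.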
Figure 4. Evaluation of batch effect from differences in screening conditions.
The first principal component (x-axis) was plotted against the second principal component (y-axis) using the shRNA-level data for all 216 cell lines. Each point is an individual cell line and is colored by (a) cancer type, (b) screener, (c) observed infection rate of each screen, (d) date of the PCR reaction, and (e) observed cell representation of each screen. Ellipses are drawn around colored groups with more than 5 examples to aid visualization.
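A batch effect check of this kind can be sketched with a principal component projection of a cell-line-by-shRNA matrix. The example below uses NumPy's SVD on synthetic data with an artificial two-batch shift; the matrix size, the shift, and the function name are assumptions for illustration and do not reflect the authors' pipeline.

```python
import numpy as np


def first_two_pcs(matrix):
    """Project samples (rows) onto their first two principal components.

    matrix: array of shape (n_cell_lines, n_shRNAs).
    Returns (scores, explained): scores has shape (n_cell_lines, 2) and
    explained holds the fraction of variance captured by PC1 and PC2.
    """
    centered = matrix - matrix.mean(axis=0)       # center each shRNA feature
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :2] * s[:2]                     # sample coordinates on PC1, PC2
    explained = (s ** 2) / np.sum(s ** 2)         # variance fraction per component
    return scores, explained[:2]


rng = np.random.default_rng(0)
# Synthetic stand-in: 12 "cell lines" x 50 "shRNAs", with the second half
# of the samples shifted to mimic a batch effect.
batch = np.array([0] * 6 + [1] * 6)
data = rng.normal(size=(12, 50)) + batch[:, None] * 3.0

scores, explained = first_two_pcs(data)
for label, (pc1, pc2) in zip(batch, scores):
    print(f"batch {label}: PC1 = {pc1:+.2f}, PC2 = {pc2:+.2f}")
```

If samples separate along the leading components by a technical variable (screener, PCR date) rather than by biology, that variable is a candidate batch effect; coloring the same scatter by each annotation, as in panels (a) to (e), makes the comparison direct.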
Figure 5. Assessment of reproducibility by measuring intra- and inter-replicate correlation.
(a) A boxplot of correlation between replicates (y-axis) plotted for each cell line (x-axis) shows the range of replicate-replicate correlations. The circled dot indicates the median value, boxes represent the 25th to 75th percentile and whiskers extend to the full range of the data not considered outliers for each cell line. A line indicating the threshold for passing quality control is in red. Histograms of (b) all intra-replicate correlations and (c) all inter-replicate (non-replicate) correlations show overall that replicate correlations are higher than non-replicate correlations. Colors indicate the percentile of signal in the initial DNA reference pool.
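The comparison in panels (b) and (c) can be sketched by splitting all pairwise correlations into pairs from the same cell line and pairs from different cell lines. The example below assumes Pearson correlation, a hypothetical threshold of 0.7, and a rule that every replicate pair must meet the threshold; the caption states none of these, and the profiles are made-up values.

```python
from itertools import combinations
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def split_correlations(profiles):
    """profiles: {cell_line: [replicate_profile, ...]}.

    Returns (within, between): correlations for pairs of replicates of the
    same cell line, and for pairs drawn from different cell lines.
    """
    labelled = [(line, rep) for line, reps in profiles.items() for rep in reps]
    within, between = [], []
    for (line_a, a), (line_b, b) in combinations(labelled, 2):
        (within if line_a == line_b else between).append(pearson(a, b))
    return within, between


def passes_qc(replicates, qc_threshold):
    """True if every replicate pair correlates at or above the threshold.

    Both the rule and the threshold value are assumptions for illustration.
    """
    return all(pearson(a, b) >= qc_threshold
               for a, b in combinations(replicates, 2))


# Hypothetical log fold change profiles over five shRNAs.
profiles = {
    "line_A": [[-2.0, -0.1, 0.3, -1.5, 0.0], [-1.8, 0.0, 0.2, -1.6, 0.1]],
    "line_B": [[0.1, -1.9, -0.2, 0.2, -1.2], [0.0, -2.1, -0.1, 0.3, -1.0]],
}

within, between = split_correlations(profiles)
print("within-line:", [round(r, 3) for r in within])
print("between-line:", [round(r, 3) for r in between])
print("line_A passes QC:", passes_qc(profiles["line_A"], qc_threshold=0.7))
```

Plotting the two lists as histograms gives the kind of comparison shown in panels (b) and (c).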