Laurent Guyon, Christian Lajaunie, Frédéric Fer, Ricky Bhajun, Eric Sulpice, Guillaume Pinna, Anna Campalans, J Pablo Radicella, Philippe Rouillier, Mélissa Mary, Stéphanie Combe, Patricia Obeid, Jean-Philippe Vert, Xavier Gidrol.
Abstract
Phenotypic screening monitors phenotypic changes induced by perturbations, including those generated by drugs or RNA interference. Currently used methods for scoring screen hits have proven problematic, particularly when applied to physiologically relevant conditions such as low cell numbers or inefficient transfection. Here, we describe the Φ-score, a novel scoring method for identifying phenotypic modifiers, or hits, in cell-based screens. Φ-score performance was assessed with simulations, a validation experiment and its application to gene identification in a large-scale RNAi screen. Using robust statistics and a variance model, we demonstrated that the Φ-score showed better sensitivity, selectivity and reproducibility than classical approaches. The improved performance of the Φ-score paves the way for cell-based screening of primary cells, which are often difficult to obtain from patients in sufficient numbers. We also describe a dedicated merging procedure that pools scores from small interfering RNAs targeting the same gene, so as to improve visualization and hit selection.
Year: 2015 PMID: 26382112 PMCID: PMC4585642 DOI: 10.1038/srep14221
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Comparison of scores using simulations and a validation experiment with controls only.
Simulated data (a,b). (a) Benchmarking scores; Area Under the Receiver Operating Characteristic Curve (AUC) as a function of average cell number. Simulations were performed with a lognormal distribution for cellular phenotypic values and a negative binomial distribution for the number of cells per “well”. 384 wells were simulated, and each perturbation was used in triplicate. Twenty-five perturbations out of 120 were active; each cell of an active perturbation has a 60% probability of having its initial fluorescence reduced by 30%. Each point was simulated 50 times; error bars correspond to the standard deviation of the 50 computed AUCs. The average computation time is given for each score (for the 250 cells per well condition). The Φ-score significantly outperforms the Z-score (Wilcoxon test P-value = 6.7 × 10⁻⁸ in the 250 cells per well condition). (b) Benchmarking scores; AUC as a function of the probability of transfection. Same parameters as in (a) but with an average of 150 cells per well. Validation experiment (c,d). Results (GFP per cell and perturbation) were resampled 1000 times with variable numbers of replicates and cells per well. For each resampling, an AUC was calculated for each score. (c) Histogram of the AUC difference between the Φ-score and the Z-score. (d) AUC difference as a function of average cell number. The color code corresponds to the Z-score AUC (a high Z-score AUC is less likely to be improved). Only resamplings with a Z-score AUC < 0.95 are kept. For low cell numbers, the Φ-score performs better than the Z-score. For high cell numbers, the two scores perform almost equally (AUC > 0.9).
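The simulation set-up in panel (a) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the lognormal parameters (0.0, 0.5), the negative binomial dispersion (5) and all function names are hypothetical, and the score is a plain robust Z-score baseline because the Φ-score itself is not defined in this record. The caption mentions 384 wells, whereas 120 perturbations in triplicate fill 360; the sketch simulates only those 360 wells.

```python
import math
import random
import statistics


def neg_binomial(rng, mean, dispersion):
    """Draw a cell count as the sum of `dispersion` geometric variables."""
    p = dispersion / (dispersion + mean)  # success probability that yields `mean`
    log_q = math.log(1.0 - p)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return sum(int(math.log(1.0 - rng.random()) / log_q) for _ in range(dispersion))


def simulate_well(rng, mean_cells, active, p_effect=0.6, reduction=0.3):
    """Return the log-fluorescence of every cell in one simulated well."""
    logs = []
    for _ in range(neg_binomial(rng, mean_cells, dispersion=5)):
        value = rng.lognormvariate(0.0, 0.5)  # assumed lognormal parameters
        if active and rng.random() < p_effect:
            value *= 1.0 - reduction  # affected cell loses 30% of its signal
        logs.append(math.log(value))
    return logs


def auc(active_scores, inactive_scores):
    """AUC as the fraction of (active, inactive) pairs ranked correctly."""
    wins = sum((a > b) + 0.5 * (a == b) for a in active_scores for b in inactive_scores)
    return wins / (len(active_scores) * len(inactive_scores))


def run_simulation(mean_cells=250, n_perturbations=120, n_active=25, replicates=3, seed=0):
    """Simulate one screen and return the AUC of a robust Z-score baseline."""
    rng = random.Random(seed)
    summaries = []
    for index in range(n_perturbations):
        wells = [simulate_well(rng, mean_cells, index < n_active) for _ in range(replicates)]
        well_means = [statistics.fmean(w) for w in wells if w]  # skip empty wells
        summaries.append(statistics.fmean(well_means) if well_means else None)
    observed = [s for s in summaries if s is not None]
    center = statistics.median(observed)
    mad = statistics.median(abs(s - center) for s in observed)
    # Robust Z-score, sign flipped so that a loss of fluorescence scores high;
    # a perturbation without any cell gets a neutral score of 0.
    scores = [0.0 if s is None else (center - s) / (1.4826 * mad) for s in summaries]
    return auc(scores[:n_active], scores[n_active:])
```

Calling `run_simulation(mean_cells=m)` for a range of `m` values and several seeds gives one AUC per run, which is the kind of quantity plotted against average cell number in panel (a).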
Figure 2. Φ-score and Z-score comparison with OGG1 screen.
(a) Φ-score histogram of positive (orange) and negative (purple) controls among the 138 plates. Z’ of the distributions was 0.58. (b) Same as (a) for the Z-score. Z’ = −0.19. (c) Primary screen, Φ-score as a function of the Z-score for each siRNA (normalized version of scores). In red, Loess estimation of the scores. (d) Primary screen, normalized Φ-score (Φn) as a function of the average cell number per siRNA. In red, Loess estimation; the shaded envelope corresponds to quantile regression at 1% and 99% with a moving window of 8. cor = Pearson correlation coefficient. CI = 95% Confidence Interval. (e) Same as (d) for the normalized Z-score (Zn). (f) Normalized Φ-score (Φn) per siRNA in the secondary screen as a function of the primary screen. The shaded envelope corresponds to quantile regression at 1% and 99% with a moving window of 8. (g) Same as (f) for the normalized Z-score (Zn).
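The Z’ values quoted for panels (a) and (b) summarize how well the positive and negative control distributions separate. The sketch below assumes the standard Z’-factor definition, Z’ = 1 − 3(σ_pos + σ_neg) / |μ_pos − μ_neg|, computed with sample means and standard deviations; the caption does not state which estimator was used, and the function name is hypothetical.

```python
import statistics


def z_prime(positive, negative):
    """Z'-factor of two control distributions (standard definition, assumed here)."""
    spread = 3.0 * (statistics.stdev(positive) + statistics.stdev(negative))
    separation = abs(statistics.fmean(positive) - statistics.fmean(negative))
    return 1.0 - spread / separation
```

A value close to 1 indicates well-separated controls, while a negative value, as reported for the Z-score in panel (b), means that the 3σ bands of the two control distributions overlap.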
Figure 3. Merged Φ-score and Z-score comparison with the OGG1 screen.
(a) Histogram of the sum of siRNA Φ-score signs for the top 500 hits ranked by absolute Φ-score: −3 (resp. −1) indicates that all three siRNAs (resp. two out of three) for a given gene have negative Φ-scores. A total of 74% of all of the 3 × 500 siRNAs are positive (p = 0.74). Above, P-values corresponding to all 3 siRNAs with the same sign. (b) Same as (a) for the Z-score, p = 0.68. (c) Merged Φ-score mΦ as a function of the merged Z-score mZ. Hits are highlighted in red (only mΦ hits), in blue (only mZ hits) and in purple (both hits). cor stands for Pearson correlation coefficient. (d) Venn diagram for positive hits. (e) Venn diagram for negative hits. (e) Histogram of mΦ. Orange and red colors highlight groups of 2 and 3 siRNA hits per gene with the same sign score, respectively. (f) Histogram of mΦ. Cyan and blue colors highlight groups of 2 and 3 siRNA hits per gene with the same sign score, respectively.
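The sign-sum statistic of panels (a) and (b) is easy to reproduce, and a simple reference point for it is the distribution expected if the three siRNA signs of a gene were independent, each positive with probability p. The sketch below shows only that independence baseline; how the P-values displayed in the figure were actually computed is not stated here, and both function names are hypothetical.

```python
def sign_sum(scores):
    """Sum of the signs of the siRNA scores of one gene (-3, -1, +1 or +3)."""
    return sum((s > 0) - (s < 0) for s in scores)


def null_sign_sum_probabilities(p_positive):
    """Probability of each sign sum if the three signs were independent."""
    p, q = p_positive, 1.0 - p_positive
    return {3: p**3, 1: 3 * p**2 * q, -1: 3 * p * q**2, -3: q**3}
```

With p = 0.74, independence would put about 40.5% of genes at +3 (0.74³) and about 1.8% at −3 (0.26³); an excess of same-sign triplets over these fractions suggests that siRNAs targeting the same gene tend to agree.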
Figure 4. Gene ontology enrichment.
(a) Φ-score with respect to the Z-score for the 88 genes localized in or part of the nuclear chromosome (Cellular Component). Hits specific to one of the scores are named. Red and blue indicate specific hits for the Φ-score and Z-score, respectively. The percentage of hits among the 88 genes and the associated P-values are given for each score in the lower-right box. The number of genes in each subgraph is shown in green. (b) Same as (a) for the 159 genes investigated in the screen and involved in chromatin modification (Biological Pathway). (c) Same as (a) for the 15 genes involved in histone acetyltransferase activity (Molecular Function). (d) For each Molecular Function ontology, the P-value of mΦ positive hits as a function of that of mZ positive hits (logarithmic scale). Here, only strong and/or highly confident positive hits are considered (merged score above +12). The lower right half corresponds to ontologies that are more significant for Φ-score hits; red dots correspond to a difference of at least three orders of magnitude. Enriched ontologies are similar; a few have been named on the graph. In the top right boxes (non-significant ontologies), only a random selection of ontologies is plotted to reduce clutter.
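Enrichment P-values of this kind are commonly obtained from an upper-tail hypergeometric test. The captions do not say which test was used, so the sketch below is an assumption about the general approach rather than the authors' procedure, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_in_term, term_size, total_hits, total_genes):
    """Upper-tail hypergeometric probability P(X >= hits_in_term)."""
    upper = min(term_size, total_hits)
    favorable = sum(
        comb(term_size, k) * comb(total_genes - term_size, total_hits - k)
        for k in range(hits_in_term, upper + 1)
    )
    return favorable / comb(total_genes, total_hits)
```

Here `term_size` would be the number of screened genes annotated with an ontology term (for example, the 88 nuclear chromosome genes), `total_hits` the number of hits in the whole screen, and `hits_in_term` the number of annotated genes among those hits.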