Steffen Renner, Christian Bergsdorf, Rochdi Bouhelal, Magdalena Koziczak-Holbro, Andrea Marco Amati, Valerie Techer-Etienne, Ludivine Flotte, Nicole Reymann, Karen Kapur, Sebastian Hoersch, Edward James Oakeley, Ansgar Schuffenhauer, Hanspeter Gubler, Eugen Lounkine, Pierre Farmer
Abstract
Multiplexed gene-signature-based phenotypic assays are increasingly used for the identification and profiling of small-molecule tool compounds and drugs. Here we introduce a method (provided as an R package) for quantifying the dose-response potency of a gene-signature as EC50 and IC50 values. Two signaling pathways were used as models to validate our methods: beta-adrenergic agonistic activity on cAMP generation (a dedicated dataset generated for this study) and the EGFR inhibitory effect on cancer cell viability. In both cases, potencies derived from multi-gene expression data were highly correlated with orthogonal potencies derived from cAMP and cell growth readouts, and were superior to potencies derived from single individual genes. Based on our results, we propose gene-signature potencies as a novel, valid alternative for the quantitative prioritization, optimization and development of novel drugs.
Year: 2020 PMID: 32541899 PMCID: PMC7295968 DOI: 10.1038/s41598-020-66533-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Introduction to gene-signature quantification methods. (a) Within the manuscript, we consider methods measuring the similarity of gene-signature changes relative to an active control (AC) or a neutral control (NC). An AC signature is the gene expression signature that is representative of a phenotype of interest, typically induced by a compound or genetic treatment. The NC signature represents the background state without compound treatment. Different ACs (with different AC signatures), e.g. representing different pathways, might result in different EC50 or IC50 values for the measured compounds. The effect of compounds on gene expression can be quantified relative to AC and NC with multiple approaches, e.g. the norm of the compound vector relative to NC (|A| = vec_norm in the manuscript), the effect size in the direction of the AC (|A| cos θ = scalar_projection_AC), or the angle between the compound vector and the AC vector (cos θ = cos_AC). The distance to the AC is another option (exemplified by the dashed circle around AC). (b) Two main characteristics of signature similarity can be distinguished: similar changes in the magnitude or similar changes in the direction of gene expression. The magnitude can be interpreted biologically as the efficacy, while the direction reflects which phenotypic change occurs, e.g. different pathways might result in different directions of change in gene expression. Note that (b) shows an effect in four-gene space and (a) shows effects in only two-gene space to better illustrate the concepts, even though all methods work on high-dimensional spaces.
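The three vector-based quantities named in panel (a) can be sketched in a few lines of Python. This is only a minimal illustration and not the code of the R package: it assumes that signatures are expressed relative to the NC (so the NC sits at the origin), the function names merely mirror the method names in the caption, and the two-gene values are made up for the example.

```python
import math


def dot(u, v):
    """Dot product of two equally long signature vectors."""
    return sum(a * b for a, b in zip(u, v))


def vec_norm(a):
    """Length |A| of the compound vector, measured from the NC at the origin."""
    return math.sqrt(dot(a, a))


def scalar_projection_ac(a, ac):
    """|A| cos(theta): the part of the compound effect that points along the AC."""
    return dot(a, ac) / vec_norm(ac)


def cos_ac(a, ac):
    """cos(theta): direction similarity only, independent of the effect size."""
    return dot(a, ac) / (vec_norm(a) * vec_norm(ac))


# Hypothetical two-gene signatures (illustrative values, not data from the study).
compound = [3.0, 4.0]
active_control = [10.0, 0.0]

print(vec_norm(compound))                              # 5.0
print(scalar_projection_ac(compound, active_control))  # 3.0
print(cos_ac(compound, active_control))                # 0.6
```

In this toy case the compound has an overall effect size of 5, of which 3 units point in the direction of the AC. The cosine of 0.6 describes the direction alone and would stay the same if the compound vector were scaled up or down.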
Overview of gene-signature quantification methods.
| Method | Description | Method class |
|---|---|---|
| cor_p_AC | Pearson correlation of the compound signature to the active control signature | direction |
| cor_s_AC | Spearman rank correlation of the compound signature to the active control signature | direction |
| cos_AC | Cosine of the compound signature to the active control signature | direction |
| cos_weight_AC | cos_AC * significance_weight. Idea: downweight cosine values for signatures with very small or non-significant amplitudes, which are likely caused by noise. The significance weight ranges from 0 to 1 and quantifies the significance of the signal amplitude of the compound gene-signature vector. Formula: min(1, mean(abs(gene rscores)) / 3). The mean of the absolute rscores of the signature readouts divided by 3 equals 1 if, on average, the rscores of the signature are 3 robust standard deviations away from the background. This is the threshold above which signals are considered strong enough not to be downweighted. This score requires the gene expression readouts to be scaled as rscores. An rscore gives the number of robust standard deviations by which the gene expression of a treatment deviates from the median of the untreated NC conditions (rscore = (fold change treatment - median fold change NC) / mad(NC)). | direction |
| dot_p_AC | Dot product of compound signature with the active control signature | direction&magnitude |
| scalar_projection_AC | Scalar projection of the signature to the active control signature | direction&magnitude |
| vec_norm | Norm of the compound signature vector | magnitude |
| euc_NC | Euclidean distance of compound signature from neutral control signature | magnitude |
| maha_NC | Mahalanobis distance of the compound signature from the neutral control signature | magnitude |
| num_readouts_changed | Number of readouts with signal different from background (abs(rscore of gene)> 3) | magnitude |
| euc_AC | Euclidean distance of the compound signature to the active control signature | AC_similarity |
| maha_AC | Mahalanobis distance of the compound signature to the active control signature | AC_similarity |
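
The rscore scaling and the two table entries that depend on it (cos_weight_AC and num_readouts_changed) can be sketched as follows. This is a hedged illustration of the formulas in the table, not the R package itself. The function names and example values are hypothetical, and the MAD is left unscaled (`constant=1.0`) as an assumption: R's `mad()` multiplies by 1.4826 by default, and the table does not say which convention is used.

```python
import math
import statistics


def mad(values, constant=1.0):
    """Median absolute deviation; constant=1.0 (unscaled) is an assumption."""
    med = statistics.median(values)
    return constant * statistics.median([abs(v - med) for v in values])


def rscore(treatment_fc, nc_fcs):
    """(fold change treatment - median fold change NC) / mad(NC) for one gene."""
    spread = mad(nc_fcs)
    if spread == 0:
        raise ValueError("MAD of the NC values is zero; the rscore is undefined")
    return (treatment_fc - statistics.median(nc_fcs)) / spread


def significance_weight(rscores):
    """min(1, mean(abs(rscores)) / 3), as given in the table."""
    mean_abs = sum(abs(r) for r in rscores) / len(rscores)
    return min(1.0, mean_abs / 3.0)


def cos_ac(signature, ac):
    """Cosine between the compound signature and the AC signature."""
    dot = sum(a * b for a, b in zip(signature, ac))
    norm_sig = math.sqrt(sum(a * a for a in signature))
    norm_ac = math.sqrt(sum(b * b for b in ac))
    return dot / (norm_sig * norm_ac)


def cos_weight_ac(signature, ac):
    """cos_AC downweighted for signatures with a weak overall amplitude."""
    return cos_ac(signature, ac) * significance_weight(signature)


def num_readouts_changed(rscores, threshold=3.0):
    """Number of genes with abs(rscore) strictly above the threshold."""
    return sum(1 for r in rscores if abs(r) > threshold)


# Hypothetical rscore signatures over four genes (illustrative values only).
ac_signature = [10.0, -10.0, 0.0, 20.0]
weak_compound = [1.5, -1.5, 0.0, 3.0]  # same direction as the AC, low amplitude

print(rscore(2.0, [0.9, 1.0, 1.1, 1.0, 1.2]))      # one gene, roughly 10
print(cos_ac(weak_compound, ac_signature))         # close to 1
print(cos_weight_ac(weak_compound, ac_signature))  # close to 0.5
print(num_readouts_changed(weak_compound))         # 0
```

The weak compound points in exactly the same direction as the AC, so its plain cosine is about 1. Its mean absolute rscore is only 1.5, so the significance weight halves the score, which is the downweighting of low-amplitude signatures described in the table.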
Figure 2. Comparison of EC50s from gene-signatures, single genes and cAMP for the beta agonists dataset. (a) Example of gene and gene-signature EC50s from representative methods compared to cAMP EC50s. Each point represents a compound. The dashed red lines indicate one log unit above and below the red line of equality. The gene and gene-signature EC50s shown are from signature one, except THBS1, which is from signature two. The data shown are from replicate two. Axes are log10 transformed. (b) Correlation of gene-signature and single-gene EC50s with cAMP EC50s. (c) PCA of the log-transformed EC50s of all compounds in the dataset for the cAMP, gene, and gene-signature summary methods. Colors in (a, b, c) follow the definition in (c). (d) Dose-dependent change of the genes in the gene-signature (left panel, with y-axis values > 50 not shown, orange dashed lines at three rscores indicating significant changes from the background), compared with the dose-dependent change in gene-signature summary score methods and cAMP for metaproterenol (right panel, boxes colored according to the concentrations shown in the left panel, dashed grey line at 100% activity).
Figure 3. Comparison of gene and gene-signature IC50s to growth rate inhibition GR50s of EGFR inhibitors. (a) Pearson correlation of representative methods and 20 individual gene IC50s to GR50s. (b) Comparison of representative EGFR gene-signature IC50s in MCF10A vs. GR50s. The dashed red lines indicate one log unit above and below the red line of equality.
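The two comparisons used in these figures, a Pearson correlation of log10-transformed potencies and the band of one log unit around the line of equality, can be illustrated with a short sketch. The function names and the potency values below are hypothetical and serve only to show the calculation; they are not results from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fraction_within_one_log_unit(reference, signature):
    """Share of compounds whose two potencies differ by at most a factor of 10."""
    hits = sum(
        1
        for ref, sig in zip(reference, signature)
        if abs(math.log10(sig) - math.log10(ref)) <= 1.0
    )
    return hits / len(reference)


# Hypothetical potencies in mol/L for four compounds (made-up values).
reference_ec50 = [1e-9, 1e-8, 1e-7, 1e-6]  # e.g. an orthogonal readout
signature_ec50 = [2e-9, 1e-8, 5e-7, 2e-5]  # e.g. a gene-signature method

log_ref = [math.log10(v) for v in reference_ec50]
log_sig = [math.log10(v) for v in signature_ec50]

print(pearson(log_ref, log_sig))
print(fraction_within_one_log_unit(reference_ec50, signature_ec50))  # 0.75
```

Working on the log10 scale matches the log-transformed axes of the scatter plots. The last compound differs by a factor of 20 and therefore falls outside the dashed band.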