Michael J Cox, Steffen Jaensch, Jelle Van de Waeter, Laure Cougnaud, Daan Seynaeve, Soulaiman Benalla, Seong Joo Koo, Ilse Van Den Wyngaert, Jean-Marc Neefs, Dmitry Malkov, Mart Bittremieux, Margino Steemans, Pieter J Peeters, Jörg Kurt Wegner, Hugo Ceulemans, Emmanuel Gustin, Yolanda T Chong, Hinrich W H Göhlmann.
Abstract
Phenomic profiles are high-dimensional sets of readouts that can comprehensively capture the biological impact of chemical and genetic perturbations in cellular assay systems. Phenomic profiling of compound libraries can be used for compound target identification or mechanism of action (MoA) prediction and other applications in drug discovery. To devise an economical set of phenomic profiling assays, we assembled a library of 1,008 approved drugs and well-characterized tool compounds manually annotated to 218 unique MoAs, and we profiled each compound at four concentrations in live-cell, high-content imaging screens against a panel of 15 reporter cell lines, which expressed a diverse set of fluorescent organelle and pathway markers in three distinct cell lineages. For 41 of 83 testable MoAs, phenomic profiles accurately ranked the reference compounds (AUC-ROC ≥ 0.9). MoAs could be better resolved by screening compounds at multiple concentrations than by including replicates at a single concentration. Screening additional cell lineages and fluorescent markers increased the number of distinguishable MoAs, but this effect quickly plateaued. There remains a substantial number of MoAs that were hard to distinguish from others under the current study's conditions. We discuss ways to close this gap, which will inform the design of future phenomic profiling efforts.
Year: 2020 PMID: 32764586 PMCID: PMC7411054 DOI: 10.1038/s41598-020-69354-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overview of experimental and analytical steps in the phenotypic screen. Fifteen reporter cell lines (three cell types × five marker combinations) were screened against a chemical library of 1,008 reference compounds that had been manually annotated with MoA information. Compounds were screened at four concentrations (0.3, 1, 3, and 9 µM), and live cells were imaged 24 h after treatment. Following segmentation, ~500 features were computed per cell, aggregated at the well level, and z-score normalized against vehicle (DMSO control). For each reporter cell line, the mRMR algorithm was then used to select an informative subset of non-redundant features that exhibited high reproducibility across replicate compound treatments. This subset is referred to as the imaging signature and describes the phenotypic response to each compound treatment. A compound treatment was deemed phenotypically active if its imaging signature was significantly different from that of the DMSO control and highly similar across replicates. To quantify the extent to which each MoA with ≥ 3 active members in a single reporter cell line could be distinguished from other MoAs, we computed AUC-ROC values based on ranking all active treatments by the Pearson correlations between their imaging signatures. We would like to acknowledge somersault 18:24 BV (www.somersault1824.com) for the graphic design of this figure.
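The normalization and ranking steps in this caption can be illustrated with a minimal sketch. It assumes that each active treatment of an MoA serves in turn as the query, that all other active treatments are ranked by Pearson correlation to it, and that the per-query AUC-ROC values are averaged. The caption does not spell out these details, so the function names (`zscore_vs_dmso`, `pearson`, `auc_roc`, `moa_auc`) and the toy signatures below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def zscore_vs_dmso(values, dmso_values):
    """Normalize well-level feature values against the DMSO control wells."""
    mu, sigma = mean(dmso_values), stdev(dmso_values)
    return [(v - mu) / sigma for v in values]


def pearson(x, y):
    """Pearson correlation between two equally long signatures."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def auc_roc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def moa_auc(signatures, moas, target):
    """Mean AUC-ROC over all query treatments annotated to `target` (assumed scheme)."""
    aucs = []
    for q, sig_q in enumerate(signatures):
        if moas[q] != target:
            continue
        others = [i for i in range(len(signatures)) if i != q]
        scores = [pearson(sig_q, signatures[i]) for i in others]
        labels = [moas[i] == target for i in others]
        aucs.append(auc_roc(scores, labels))
    return sum(aucs) / len(aucs)


# Hypothetical imaging signatures: three treatments per MoA, four features each.
signatures = [
    [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.5], [1.0, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0], [8.0, 6.0, 4.5, 2.0], [4.1, 3.0, 1.9, 1.0],
]
moas = ["MoA_A", "MoA_A", "MoA_A", "MoA_B", "MoA_B", "MoA_B"]
print(moa_auc(signatures, moas, "MoA_A"))
```

In this toy data the two MoAs have opposite feature trends, so same-MoA treatments always rank above the others. Real signatures would come from the mRMR-selected features, which this sketch does not attempt to reproduce.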
Figure 2. Compounds annotated to expressed targets and core-fitness genes are more likely to have phenotypic activity. (a) Heat map showing the percentage of compounds that induced significant phenotypic activity for MoAs with ≥ 6 compounds tested (complete data in Supplementary Table S3). (b) Compounds annotated to expressed targets were more likely to be active than compounds annotated to targets that are not expressed. Target expression was determined using microarray data. Each dark-grey bar shows the number of compounds that are active in ≥ 1 of the five cell lines within each cell type; each light-grey bar shows the number of compounds that are not active in any of the five cell lines within each cell type. Fisher’s exact test p values are indicated below the cell type names in the header. (c) Active compounds annotated to expressed targets tended to be active at lower concentrations than active compounds annotated to targets that are not expressed. The vertical axis shows the lowest active concentration across the five cell lines within each cell type for each active compound. Wilcoxon rank sum test p values are indicated below the cell type names in the header. (d) Compounds that target core-fitness genes were more likely to be active than compounds that target non-core-fitness genes. Only compounds annotated to ≥ 1 expressed target were considered. As in (b), target expression was determined using microarray data, and the dark-grey and light-grey bars have the same meaning as in (b). Fisher’s exact test p values are indicated below the cell type names in the header.
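The comparison in panels (b) and (d) reduces to a 2×2 contingency table (expressed versus not expressed, active versus inactive). A minimal sketch of a two-sided Fisher's exact test on such a table follows. The counts are hypothetical and are not taken from the figure, and `fisher_exact_two_sided` is an illustrative helper rather than the authors' code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are at most as probable as the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts for one cell type:
#                    active  inactive
# expressed target     40       10
# target not expr.     15       35
p_value = fisher_exact_two_sided(40, 10, 15, 35)
print(f"p = {p_value:.2e}")
```

A library routine such as `scipy.stats.fisher_exact` would normally be preferred; the standard-library version here only makes the calculation explicit.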
Figure 3. t-SNE map of all active treatments in the A549-ACTB-RAB5A cell line based on the Pearson correlation between their imaging signatures. Each point corresponds to a unique treatment. Selected MoAs are indicated by colours, 11 of which form distinct phenotypic clusters, while ABL1 inhibitors and VEGFR family inhibitors are examples of MoAs that showed diverse phenotypes and hence poor clustering. Circles indicating pheno-clusters were drawn manually. This figure was created in R version 3.6.1 (https://www.R-project.org/) using the ggplot2 package version 3.2.1 (https://ggplot2.tidyverse.org/).
Figure 4. Summary of how well each MoA can be distinguished from other MoAs based on compound ranking by phenotypic similarity. (a) Heat map showing AUC-ROC values for all MoAs with ≥ 5 active compounds in the respective cell line (complete data in Supplementary Fig. S8 and Supplementary Table S4); empty entries (white) correspond to MoAs with < 5 active compounds. (b) The number of distinguishable MoAs (AUC-ROC ≥ 0.9) increased with additional reporter cell lines, but this effect quickly plateaued. We considered all 83 MoAs (indicated by the dashed line) with ≥ 3 active compounds and ranked the cell lines by how many (additional) distinguishable MoAs they contributed.
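The ranking in panel (b) can be read as a greedy procedure: repeatedly pick the cell line that adds the most MoAs not yet distinguishable by the lines already chosen. The sketch below assumes this greedy reading, which the caption suggests but does not state outright. The cell line and MoA names are hypothetical placeholders.

```python
def rank_cell_lines(distinguishable):
    """Greedily order cell lines by how many additional MoAs each contributes.

    `distinguishable` maps a cell line name to the set of MoAs it resolves
    (for example, those with AUC-ROC >= 0.9). Returns a list of
    (cell_line, cumulative_number_of_MoAs) tuples.
    """
    remaining = dict(distinguishable)
    covered = set()
    order = []
    while remaining:
        # Sorting first makes ties resolve alphabetically and deterministically.
        best = max(sorted(remaining), key=lambda line: len(remaining[line] - covered))
        covered |= remaining.pop(best)
        order.append((best, len(covered)))
    return order


# Hypothetical input: which MoAs each reporter cell line can distinguish.
example = {
    "line_A": {"moa_1", "moa_2", "moa_3"},
    "line_B": {"moa_2", "moa_3", "moa_4"},
    "line_C": {"moa_3"},
}
ranking = rank_cell_lines(example)
print(ranking)
```

The cumulative counts flatten once the remaining cell lines only resolve MoAs that are already covered, which is the plateau the caption describes.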
Figure 5. Summary of the degree to which different experimental factors affected how well MoAs can be distinguished. (a) When using only the 9 µM concentration compared to all four concentrations, the AUC-ROC decreased by ≥ 0.05 for 32.1% (113) of the MoA/cell line pairs, whereas the AUC-ROC increased by ≥ 0.05 for only 3.4% (12) of the MoA/cell line pairs, resulting in a net AUC-ROC decrease of ≥ 0.05 for 28.7% (101) of the MoA/cell line pairs. Using only the 3 µM concentration resulted in a similar trend, but compared to the “9 µM only” case, more MoA/cell line pairs benefited from the exclusion of the other concentrations. In each case, all active treatments at the respective concentration(s) were considered for the AUC-ROC computation; for the “3 µM only” case this means substantially fewer compounds than for the “9 µM only” and “all concentrations” cases. (b) When using only the information from the BFP channel compared to using the information from all three fluorescent channels, we observed a net AUC-ROC decrease of ≥ 0.05 for only 13.5% of the MoA/cell line pairs, indicating that most of the information is contained in the BFP channel alone. Adding the GFP channel reduces the net loss to 6.1%; adding the RFP channel reduces the net loss to 1.2%. We did not observe strong patterns as to which kind of markers (structural versus signalling, bright versus dim) are most beneficial for MoA distinction, or for which MoAs distinction is particularly sensitive across cell types to the exclusion of particular markers. In all cases, the AUC-ROC is based on the exact same set of active treatments (active calling based on the “all three channels” signature). For the AUC-ROC analysis, we repeated the mRMR feature selection for each considered combination of channels (rather than just dropping features related to the excluded channel(s) from the “all three channels” signature) to allow an unbiased comparison between the different combinations of channels.
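The net figures in panel (a) follow from simple counting: pairs whose AUC-ROC dropped by ≥ 0.05 minus pairs whose AUC-ROC rose by ≥ 0.05 (113 − 12 = 101 pairs, or 32.1% − 3.4% = 28.7%). A minimal sketch of that bookkeeping is given below. The function name `net_auc_change` and the toy AUC-ROC values are hypothetical and serve only to show the arithmetic.

```python
def net_auc_change(full, reduced, threshold=0.05):
    """Count MoA/cell line pairs whose AUC-ROC shifts by at least `threshold`.

    `full` and `reduced` map a (moa, cell_line) pair to its AUC-ROC under the
    full condition and under the reduced condition, respectively.
    """
    decreased = increased = 0
    for pair, auc_full in full.items():
        # Rounding guards against floating-point noise right at the threshold.
        delta = round(reduced[pair] - auc_full, 9)
        if delta <= -threshold:
            decreased += 1
        elif delta >= threshold:
            increased += 1
    net = decreased - increased
    return {
        "decreased": decreased,
        "increased": increased,
        "net_decrease": net,
        "net_fraction": net / len(full),
    }


# Hypothetical AUC-ROC values for four MoA/cell line pairs.
full = {("moa_1", "line_A"): 0.95, ("moa_2", "line_A"): 0.90,
        ("moa_3", "line_B"): 0.70, ("moa_4", "line_B"): 0.85}
reduced = {("moa_1", "line_A"): 0.80, ("moa_2", "line_A"): 0.89,
           ("moa_3", "line_B"): 0.80, ("moa_4", "line_B"): 0.60}
summary = net_auc_change(full, reduced)
print(summary)
```

The same function applies to the channel comparison in panel (b) if `reduced` holds the AUC-ROC values obtained from a subset of channels.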