Sean M Gross1, Mark A Dane1, Rebecca L Smith1, Kaylyn L Devlin1, Ian C McLean1, Daniel S Derrick1, Caitlin E Mills2, Kartik Subramanian2, Alexandra B London3, Denis Torre3, John Erol Evangelista3, Daniel J B Clarke3, Zhuorui Xie3, Cemal Erdem4, Nicholas Lyons5, Ted Natoli5, Sarah Pessa5, Xiaodong Lu5, James Mullahoo5, Jonathan Li6, Miriam Adam6, Brook Wassie6, Moqing Liu1, David F Kilburn1, Tiera A Liby1, Elmar Bucher1, Crystal Sanchez-Aguila1, Kenneth Daily7, Larsson Omberg7, Yunguan Wang2, Connor Jacobson2, Clarence Yapp2, Mirra Chung2, Dusica Vidovic8,9,10, Yiling Lu11, Stephan Schurer8,9,10, Albert Lee12, Ajay Pillai13, Aravind Subramanian5, Malvina Papanastasiou5, Ernest Fraenkel5,6, Heidi S Feiler1,14, Gordon B Mills14,15, Jake D Jaffe5, Avi Ma'ayan3, Marc R Birtwistle4, Peter K Sorger2, James E Korkola1,14, Joe W Gray1,14, Laura M Heiser16,17.
Abstract
The phenotype of a cell and its underlying molecular state are strongly influenced by extracellular signals, including growth factors, hormones, and extracellular matrix proteins. While these signals are normally tightly controlled, their dysregulation leads to phenotypic and molecular states associated with diverse diseases. To develop a detailed understanding of the linkage between molecular and phenotypic changes, we generated a comprehensive dataset that catalogs the transcriptional, proteomic, epigenomic and phenotypic responses of MCF10A mammary epithelial cells after exposure to the ligands EGF, HGF, OSM, IFNG, TGFB and BMP2. Systematic assessment of the molecular and cellular phenotypes induced by these ligands comprises the LINCS Microenvironment (ME) perturbation dataset, which has been curated and made publicly available for community-wide analysis and development of novel computational methods (synapse.org/LINCS_MCF10A). In illustrative analyses, we demonstrate how this dataset can be used to discover functionally related molecular features linked to specific cellular phenotypes. Beyond these analyses, this dataset will serve as a resource for the broader scientific community to mine for biological insights, to compare signals carried across distinct molecular modalities, and to develop new computational methods for integrative data analysis.
Year: 2022 PMID: 36207580 PMCID: PMC9546880 DOI: 10.1038/s42003-022-03975-9
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
Fig. 1 Overview of experimental approach to assess the impact of microenvironmental factors.
a Map of LINCS data generation and analysis centers. b Schematic illustrating the experimental and analytical approaches to link molecular and cellular phenotypes. c Schematic of the experimental design, cell culture protocol, and sample harvest time points. d The experimental treatments, dosages, and assays deployed to generate the LINCS ME perturbation datasets. e Summary of the assays, time points, and features for the three experimental collections.
Fig. 2 Ligand treatments induce diverse phenotypic responses.
a Representative immunofluorescent images of ligand-induced cellular phenotypes at 48H. MCF10A cells were stained with Cell Mask to visualize cytoplasm. b Cartoon showing the image-based cellular phenotypes assessed from the immunofluorescence and live cell imaging assays. c–g Boxplots summarizing cellular phenotypes at time 0H (CTRL) and 48H after ligand addition from 8 biological replicates. Individual datapoints represent well-level means normalized to 0H. Circles are from collection 1 and triangles are from collection 2. The interquartile range is indicated by the box, with whiskers extending to no further than 1.5 times the interquartile range. Note that EdU positive proportion was not measured at 0H. Data in Supplementary Data 1. h Accumulated cell migration (colored lines) from 0-48H for 25 cell lineages (individual cells and one of their progeny if they divided). Circles indicate mitotic events. The solid black lines indicate the population average; the dotted gray line shows the average TGFB + EGF induced migration at 48H, which was the treatment that induced the greatest increase in cell migration. Data in Supplementary Data 2, 3.
Fig. 3 Six molecular assays reveal diverse dynamic responses to treatments.
a Line graphs show dynamic responses for 12 proteins measured in the RPPA assay under the different ligand treatments. b Heatmap of protein abundances as measured by RPPA. Rows represent abundance of 295 (phospho)proteins and are median-centered and hierarchically clustered. Columns represent individual replicate samples, ordered by treatment and time. Callouts show the 12 proteins from panel a. c UMAPs for each of the six molecular assays. Each dot represents data from an individual sample and is the 2-dimensional embedding of all features measured in the assay. Color indicates ligand treatment and size indicates time point. d Plot of the first two principal components (PCs) of RPPA assay. Variance in PC1 and PC2 is largely driven by ligand treatment and experimental time point, respectively. Data in Supplementary Data 6. e Analysis of RPPA covariates reveals the proportion of variance explained by sample replicate, experimental time point, and ligand treatment for each of the top seven principal components of the RPPA dataset. f Stacked bar graph shows a comparison of the information content of each molecular assay. Data in Supplementary Data 7.
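The row median-centering mentioned for the heatmap in panel b can be illustrated with a minimal sketch. It assumes a feature-by-sample matrix held as a list of rows; the function name `median_center_rows` and the toy values are hypothetical and do not come from the dataset or the authors' pipeline.

```python
from statistics import median


def median_center_rows(matrix):
    """Subtract each row's median from every value in that row.

    `matrix` is a list of rows (one row per feature, one column per sample).
    A new matrix is returned; the input is left unchanged.
    """
    centered = []
    for row in matrix:
        row_median = median(row)
        centered.append([value - row_median for value in row])
    return centered


# Toy example: two features measured in three samples (made-up values).
toy = [[1.0, 2.0, 6.0],
       [10.0, 10.0, 13.0]]
print(median_center_rows(toy))
```

After centering, each row has a median of zero, so the heatmap colors reflect deviation from a feature's typical level rather than its absolute abundance.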
Fig. 4 Assessment of ligand-induced molecular change.
a Barplot showing the number of features significantly modulated by each ligand treatment at 24H or 48H. Shading indicates whether induced features are unique to a particular treatment (dark) or induced by multiple treatments (light). Numbers above bars indicate the number of features uniquely induced over the total number of features induced. Data in Supplementary Data 8. b Heatmap showing pairwise correlations between molecular features induced by each ligand. Ligand responses from similar families are more highly correlated than those from unrelated families. c UpSet plot showing overlaps of induced transcription factor motifs among ligand treatments calculated from ATACseq data at 24H or 48H. Column heights represent the number of transcription factor motifs induced by the ligand(s) indicated with filled dots. Data in Supplementary Data 12. d Hallmark gene set enrichment scores computed from RNAseq data at 24H.
Fig. 5 Integrated analysis identifies co-regulated molecular modules.
a Heatmap showing the 14 integrative molecular modules for each ligand at 24H and 48H. Features are grouped by cluster. Biological interpretation for modules is indicated on the left; feature callouts for RPPA (R), CyCIF (C), ATACseq (A) are shown to the right. b Bubble plot shows the top enriched Reactome pathways in each module, computed from RNAseq features. Dot size indicates the gene ratio; dot color indicates FDR value. c Heatmap showing the five top-ranked ChEA3 transcription factor enrichments computed from the RNAseq features in each module (pink). Red border indicates transcription factor enrichments with a q-value below 0.2 (FDR-adjusted Fisher’s exact test). d–g Scatterplots show the relationships between module activity and quantitative phenotypic responses for selected pairs. Dot color indicates the ligand treatment and dot size indicates the time point. The black dotted line shows the linear fit, and the q-value of the fit is shown at the bottom of the plot.
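The q-value threshold in panel c can be made concrete with a small sketch of false discovery rate adjustment. The caption does not name the adjustment procedure, so the Benjamini-Hochberg method used here is an assumption, and the function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input.

    Assumption: BH is used only as a common FDR procedure; the source
    caption does not say which adjustment was applied.
    """
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        q_values[idx] = running_min
    return q_values


# Made-up p-values for four transcription factor enrichments.
p = [0.01, 0.04, 0.03, 0.5]
q = benjamini_hochberg(p)
flagged = [i for i, value in enumerate(q) if value < 0.2]
print(q, flagged)
```

With these made-up inputs the first three enrichments fall below the 0.2 cutoff and would receive the red border, while the fourth would not.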
Fig. 6 Module 10 is associated with cell cycle progression.
a Donut plot showing distribution of Module 10 features across assays. Transcription factors and kinases in the RNA gene set are called out to the right of the plot. b Line plot showing 6 of the Module 10 RPPA features. Data in Supplementary Data 5. c Plot of the top 10 most significantly enriched transcription factors inferred from the Module 10 RNAseq gene set. Data in Supplementary Data 17. d Bar plot shows the enrichment of Reactome superpathways from the Module 10 RNA gene set. Data in Supplementary Data 16. e Bubble plot showing the top 5 enriched Reactome subpathways from the Reactome Cell Cycle, DNA Repair, and DNA Replication superpathways. Dot color indicates q-value; dot size indicates the number of genes in Module 10 that are found in each gene set. f Heatmap showing expression of Seurat G1/S cell cycle genes in Module 10 (37 of 43 genes shared), sorted based on the EdU positive proportion. g Boxplot of mean Module 10 gene expression for a panel of breast cancer cell lines treated with three CDK4/6 inhibitors for 24H or an untreated control. Cell lines are ordered by abemaciclib GR50 (increasing). The interquartile range is indicated by the box, with whiskers extending to the minimum and maximum values. Data from Hafner et al.[87]. h Dot plot of mean Module 10 gene expression from 65 human breast cancer cell lines graphed against their mean doubling time. Cell lines are colored based on their breast cancer subtype classification. The line indicates the linear fit across all cell lines, with the 95% confidence interval represented by the gray shaded area. Data from Heiser et al.[10]. Figure data in Supplementary Data 15.
Fig. 7 Analysis of molecular modules identifies functional relationships between molecular and phenotypic responses to OSM.
a OSM induces the formation of cell clusters that undergo collective migration and merge to form large clusters. Representative tracks of OSM-induced cluster migration are shown from 24H to 48H after OSM treatment. Cluster outlines are colored by experimental time point. All images are set to the same scale. b Boxplot shows the mean expression of molecular features in Module 4 for each of the six ligand treatments. The boxplots' lower and upper hinges correspond to the first and third quartiles. The median is shown as the center line. The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the inter-quartile range, or distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value no further than 1.5 * IQR from the hinge. Data in Supplementary Data 15. c Barplot showing the top 5 enriched transcription factors inferred for the Module 2 genes in ChEA3. Data in Supplementary Data 17. d The JAK/STAT inhibitor Ruxolitinib inhibits cell growth in the presence of OSM. Line graph shows the relative number of cells across time. PBS (phosphate buffered saline) treatment serves as a control. e Barplot of the top 10 enriched pathways in Bioplanet using the Module 4 RNAseq gene set. Data in Supplementary Data 20. f OSM-induced collective migration is mediated by protease activity. Line graph shows the accumulated cluster migration distance after OSM +/− a protease inhibitor cocktail and its individual components including bestatin, E-64, aprotinin, and pepstatin A. Solid lines show the population average and gray shaded regions indicate 95% confidence intervals of the mean distance travelled at each time point. g False color phase contrast images at 48H show that bestatin inhibits the formation of large cell clusters when given in conjunction with OSM. Cells are colored red and the background is colored gray.
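The whisker rule described for panel b can be written out as a short sketch. It assumes quartiles computed with Python's `statistics.quantiles` using the "inclusive" method; plotting libraries may define hinges slightly differently, so results on real data could differ at the margins. The function name `tukey_box_stats` and the example values are hypothetical.

```python
from statistics import quantiles


def tukey_box_stats(values):
    """Return hinges, median, whisker ends, and outliers (1.5 * IQR rule)."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions shift the hinges.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    lower_whisker = min(v for v in data if v >= lower_fence)
    upper_whisker = max(v for v in data if v <= upper_fence)
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Made-up values with one point beyond the upper fence.
print(tukey_box_stats([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

In this toy case the upper whisker stops at 8 rather than 30, because 30 lies more than 1.5 * IQR above the third quartile and would be drawn as an individual point.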