Aubin Samacoits, Racha Chouaib, Adham Safieddine, Abdel-Meneem Traboulsi, Wei Ouyang, Christophe Zimmer, Marion Peter, Edouard Bertrand, Thomas Walter, Florian Mueller.
Abstract
RNA localization is a crucial process for cellular function and can be quantitatively studied by single molecule FISH (smFISH). Here, we present an integrated analysis framework to analyze sub-cellular RNA localization. Using simulated images, we design and validate a set of features describing different RNA localization patterns including polarized distribution, accumulation in cell extensions or foci, at the cell membrane or nuclear envelope. These features are largely invariant to RNA levels, work in multiple cell lines, and can measure localization strength in perturbation experiments. Most importantly, they allow classification by supervised and unsupervised learning at unprecedented accuracy. We successfully validate our approach on representative experimental data. This analysis reveals a surprisingly high degree of localization heterogeneity at the single cell level, indicating a dynamic and plastic nature of RNA localization.
Year: 2018 PMID: 30389932 PMCID: PMC6214940 DOI: 10.1038/s41467-018-06868-w
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Simulation of smFISH images. a smFISH experiment against GAPDH (probes labeled with Cy5) in HeLa cells shown as a maximum intensity projection along Z (MIP). Cell segmentation (green) was performed with CellMask™, nuclear segmentation (yellow) with DAPI signal. Projections of detected GAPDH mRNAs located between the dashed lines are displayed as indicated by the arrows as side panels. b Final 3D polygon of a cell (yellow) and its nucleus (blue). c Cumulative histogram of pooled expression level. d mRNA intensity distribution extracted from experimental data (KIF1C mRNA). Histogram is fitted with a skewed normal distribution (red). e Example of a simulated cell with random mRNA localization shown as a MIP. Outline of cell and nucleus in green and yellow, respectively. Scale bars 10 μm. f Cartoon illustrating the simulated seven non-random localization patterns. g Outlines of a cell and nucleus with mRNA positions of the pattern cell extension: experimental data (RAB13 mRNA) (left), simulated data with low mRNA density and moderate pattern strength (right)
Fig. 2 Analysis of simulated data. a Analysis with previously published mRNA detection approach and localization features. In the t-SNE plot, each dot is a cell colored with its localization pattern. b Confusion matrix for k-means clustering with 8 classes for data in a. Rows show identified classes, columns the known localization pattern. Numbers in each square indicate percentage of cells for a localization pattern (column) that was assigned to a given class (row). Off-diagonal elements are mis-classifications. c, d t-SNE projection and confusion matrix as in a and b but determined with our 3D spot detection with GMM and the new localization features. e Analysis of pooled simulated data (different pattern strength and expression levels). Analysis and t-SNE projection as in c. f Confusion matrix for data in e with k-means clustering performed on t-SNE projected data in six dimensions
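The confusion matrices described in Fig. 2 are normalized per column: each entry is the percentage of cells of a known pattern that ended up in a given cluster. A minimal sketch of that normalization follows, assuming cluster assignments are already available (for example from k-means). The function name `confusion_percentages` and the toy labels are hypothetical and are not taken from the paper's code.

```python
from collections import Counter


def confusion_percentages(true_patterns, cluster_ids):
    """Return {cluster: {pattern: percent}} with rows = clusters, columns = patterns.

    Each entry is the percentage of cells of one known pattern (column)
    that was assigned to one cluster (row), so every column sums to 100.
    """
    patterns = sorted(set(true_patterns))
    clusters = sorted(set(cluster_ids))
    pair_counts = Counter(zip(cluster_ids, true_patterns))
    pattern_totals = Counter(true_patterns)
    return {
        cluster: {
            pattern: 100.0 * pair_counts[(cluster, pattern)] / pattern_totals[pattern]
            for pattern in patterns
        }
        for cluster in clusters
    }


# Hypothetical toy data: six cells, two simulated patterns, two clusters.
true_patterns = ["foci", "foci", "foci", "foci", "random", "random"]
cluster_ids = [0, 0, 0, 1, 1, 1]

matrix = confusion_percentages(true_patterns, cluster_ids)
for cluster, row in matrix.items():
    print(cluster, row)
```

In this toy case one of the four "foci" cells lands in the cluster dominated by "random" cells, which would appear as an off-diagonal entry of 25% in the "foci" column.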
Table 1 List of localization features used in Figs. 3 and 4
| Feature ID | Feature description |
|---|---|
| 1 | Ripley: maximum |
| 2 | Ripley: max gradient [0,max] |
| 3 | Ripley: min gradient [max,end] |
| 4 | Ripley: value at mid-point between center and boundary |
| 5 | Ripley: Spearman correlation between Ripley and radius |
| 6 | Ripley: radius of max value |
| 7 | Polarization index |
| 8 | Dispersion index |
| 9–12 | Morph opening–enrichment ratio: 15, 30, 45, 60 pixels |
| 13 | Cell height: Spearman correlation with ZmRNA |
| 14 | Cell height: |
| 15 | Cell membrane: distance – mean |
| 16–19 | Distance membrane: quantile 5%, 10%, 20%, 50% |
| 20 | Nucleus: distance – mean |
| 21 | Cell centroid: distance – mean |
| 22 | Nucleus centroid: distance – mean |
| 23 | Ratio: mRNAs inside nucleus/outside nucleus |
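To make two of the simpler table entries concrete, the sketch below computes a mean distance to a centroid (as in features 21 and 22) and the ratio of mRNAs inside versus outside the nucleus (feature 23). It is an illustration under assumptions: the function names are hypothetical, distances are left unnormalized, and nuclear membership is supplied as precomputed boolean flags, whereas the paper's own implementation may normalize or compute these differently.

```python
import math


def mean_distance_to_centroid(points, centroid):
    """Mean Euclidean distance of mRNA positions to a reference centroid."""
    return sum(math.dist(p, centroid) for p in points) / len(points)


def nuclear_ratio(in_nucleus_flags):
    """Ratio of mRNAs inside the nucleus to mRNAs outside the nucleus.

    Returns math.inf when no mRNA lies outside the nucleus (assumed convention).
    """
    inside = sum(1 for flag in in_nucleus_flags if flag)
    outside = len(in_nucleus_flags) - inside
    return inside / outside if outside else math.inf


# Hypothetical mRNA coordinates (x, y, z) and a cell centroid at the origin.
mrna_positions = [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0), (6.0, 8.0, 0.0)]
cell_centroid = (0.0, 0.0, 0.0)
mean_dist = mean_distance_to_centroid(mrna_positions, cell_centroid)

# Hypothetical flags: one of four mRNAs lies inside the nucleus.
ratio = nuclear_ratio([True, False, False, False])

print(mean_dist, ratio)
```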
Fig. 3 Analysis of experimental data with unsupervised approaches. a t-SNE projection of the 23 localization features for smFISH experiments against 10 different genes. Each dot is one cell and is color-coded according to the gene. Images on the right are examples of DYNC1H1 and CEP192 cells with different localization patterns from the numbered regions in the t-SNE indicated with circles. b t-SNE plot as in a, but color-coded according to results of a k-means classification or spectral clustering. The latter finds a small cluster with strong intranuclear localization (lower plot, black arrow). c, d Examples of hierarchical clustering results. Each row is a cell, each column a localization feature (see list of features in Table 1). Plots on the right show the smoothed distribution of the most enriched genes in the clusters
Fig. 4 Supervised analysis of experimental data reveals localization heterogeneity. a Results of supervised random forest classification trained on 5 classes (mRNA foci, nuclear envelope in 2D or 3D, cell extension and random) from simulated data. Heatmap shows the majority voting results for all cells of each gene. Among the simulation classes, “Nuclear envelope 3D” is closest to the experimental class intranuclear. b Posterior probabilities for single cells of the indicated genes. c Example of individual DYNC1H1 cells and their posterior probability. d Scatter plot of Gini impurity calculated on average probability of a gene (population) against the average Gini impurity of individual cells of a gene (intra-cellular)
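The two quantities compared in Fig. 4d can be sketched with the standard definition of Gini impurity, 1 minus the sum of squared class probabilities. The code below is an illustrative sketch, not the authors' implementation: the function names and the toy posteriors are hypothetical, and the posteriors are assumed to come from any classifier that outputs per-cell class probabilities.

```python
def gini_impurity(probs):
    """Standard Gini impurity of a probability vector: 1 - sum(p_k^2)."""
    return 1.0 - sum(p * p for p in probs)


def population_and_cell_gini(cell_posteriors):
    """Return (population-level Gini, average per-cell Gini) for one gene.

    population-level: impurity of the class probabilities averaged over cells.
    per-cell average: mean of the impurities computed for each cell separately.
    """
    n_cells = len(cell_posteriors)
    n_classes = len(cell_posteriors[0])
    mean_probs = [
        sum(cell[k] for cell in cell_posteriors) / n_cells
        for k in range(n_classes)
    ]
    population = gini_impurity(mean_probs)
    per_cell = sum(gini_impurity(cell) for cell in cell_posteriors) / n_cells
    return population, per_cell


# Hypothetical gene with two cells, each confidently assigned to a different class.
cells = [[1.0, 0.0], [0.0, 1.0]]
population, per_cell = population_and_cell_gini(cells)
print(population, per_cell)
```

In this toy case the population-level impurity is high while the per-cell impurity is zero, which corresponds to cells that each show a clear pattern but differ from one another.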