Lucía Spangenberg, Alejandro Correa, Bruno Dallagiovanna, Hugo Naya.
Abstract
Post-transcriptional regulation of stem cell differentiation is far from being completely understood. Changes in protein levels are not fully correlated with the corresponding changes in mRNA levels; the observed differences might be partially explained by post-transcriptional regulation mechanisms, such as alternative polyadenylation. This would involve changes in protein binding, transcript usage, miRNAs and other non-coding RNAs. In the present work we analyzed the distribution of alternative transcripts during adipogenic differentiation and the potential role of miRNAs in post-transcriptional regulation. Our in silico analysis suggests a modest but consistent bias in 3′UTR lengths during differentiation, enabling fine-tuned transcript regulation via small non-coding RNAs. Including these effects in the analyses partially accounts for the observed discrepancies in the relative abundance of protein and mRNA.
Year: 2013 PMID: 24143171 PMCID: PMC3797115 DOI: 10.1371/journal.pone.0075578
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Heatmap of the residuals of the model protein logFC ~ RNA logFC.
Protein levels (logFC) of the set of secreted proteins are compared against the logFC of our data set and the residuals of the linear model analyzed; polysomal fraction (A) and total fraction (B). All time points are considered: day 1, 3, 5 and 7 (dendrogram on the top). Genes are on the rows (dendrogram on the left). Only data for genes with large absolute residuals are shown.
Linear model results for secreted and nuclear proteins at day 5.
[Table body lost in extraction. Surviving headers: SECRETOME and NUCLEAR, each with a Polysomal RNA column. Surviving row labels: logFC; miR-130a; miR-185*; miR-130b; miR-20b* (with one surviving value, 0.175); miR-130b, miR-558; miR-16-2*, miR-185*.]
Results of applying linear models to the day 5 data for secreted and nuclear proteins. Both RNA fractions are considered. For each subtable (e.g. secretome-polysomal), the first row shows the results of a linear model that does not consider any microRNA effect (the standard model: protein logFC vs. RNA logFC). The second and third rows show the values for univariate models, each including the effect of a single miRNA; we selected the two most significant miRNAs. The last row shows the best (multivariate) model as determined by its BIC value. In several cases the best model is not multivariate, since BIC penalizes the number of parameters.
means a significance level of .
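The model comparison described in this legend (a base model, models with one added miRNA term, and selection by BIC) can be sketched as follows. This is an illustrative sketch on synthetic data, not the authors' code: the helper names `fit_ols` and `bic`, the Gaussian form of the BIC, and all simulated values are assumptions made for the example.

```python
import numpy as np

def fit_ols(y, predictors):
    """Ordinary least squares with an intercept; returns coefficients and RSS."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

def bic(rss, n, k):
    """BIC of a Gaussian linear model with k coefficients (up to an additive constant)."""
    return n * np.log(rss / n) + k * np.log(n)

# Synthetic data (hypothetical): protein logFC depends on RNA logFC and on one
# miRNA usage score; a second miRNA score is pure noise.
rng = np.random.default_rng(0)
n = 200
rna = rng.normal(size=n)
mir_a = rng.normal(size=n)  # informative miRNA score
mir_b = rng.normal(size=n)  # uninformative miRNA score
prot = 0.5 * rna - 0.8 * mir_a + rng.normal(scale=0.3, size=n)

models = {
    "base": [rna],
    "base + mir_a": [rna, mir_a],
    "base + mir_b": [rna, mir_b],
    "base + mir_a + mir_b": [rna, mir_a, mir_b],
}
scores = {}
for name, preds in models.items():
    _, rss = fit_ols(prot, preds)
    scores[name] = bic(rss, n, k=len(preds) + 1)  # +1 for the intercept

best = min(scores, key=scores.get)  # the model with the lowest BIC is preferred
```

Because each extra coefficient adds `log(n)` to the score, a multivariate model is preferred only when the additional miRNA term reduces the residual sum of squares enough to offset the penalty.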
Significant miRNAs at day 5 as obtained from the linear univariate model.
| | Polysomal RNA | Total RNA |
| secreted | … | … |
| | miR-142-3p, miR-144, miR-148a | |
| | miR-148b, miR-150*, miR-152, miR-15a | miR-150*, miR-152, miR-15a |
| | miR-15b, miR-16, miR-190b, miR-195 | miR-190b, … |
| | … | … |
| | miR-29b, miR-29b-2*, miR-29c, miR-301a | miR-26b, … |
| | miR-301b, miR-302a, miR-302d, miR-338-5p | miR-29a, miR-29b, miR-29b-2* |
| | miR-33a, miR-33a*, miR-33b, miR-340 | miR-29c, miR-301a, miR-301b |
| | miR-486-5p, miR-509-5p, miR-510, miR-551b* | miR-338-5p, miR-33a, miR-33a* |
| | miR-553, miR-558, miR-569, miR-574-5p | miR-340, miR-361-5p |
| | miR-589*, miR-628-5p, miR-633, miR-672 | miR-486-5p, miR-509-5p |
| | miR-768-3p, miR-768-5p, miR-891b | miR-510, miR-551b*, miR-553 |
| | | miR-558, miR-569, miR-574-5p |
| | | miR-575, miR-582-3p, miR-587 |
| | | miR-589*, miR-604, miR-607 |
| | | miR-628-5p, miR-672 |
| | | miR-768-3p, miR-768-5p, miR-891b |
| nuclear | miR-100, miR-106b, miR-10b*, miR-185* | |
| | miR-346, miR-372, miR-378*, miR-587 | miR-193a-5p, miR-222*, miR-28-5p |
| | | miR-372, miR-433, miR-507 |
| | | miR-523, miR-548b-3p, miR-551b |
| | | miR-576-5p, miR-621, miR-885-5p |
Set of significant miRNAs in each data set. Underlined miRNAs correspond to those reported by Zhang et al. in a review of miRNAs involved in adipogenesis [31].
Figure 2. Bootstrap to assess our results for each RNA fraction and each protein set.
Bootstrap results for the total RNA fraction are shown in A (nuclear) and B (secretome); those for the polysomal fraction are shown in C (nuclear) and D (secretome). For each such pair of conditions, we performed a bootstrap analysis as explained in section 0.6. For each miRNA, we permute the values of the genes and calculate the explained variance of the resulting linear model. This procedure is repeated multiple times. The y-axis shows the fraction of repetitions in which the “true” miRNA wins over the random model. The x-axis lists all miRNAs. The colors, from red to green, represent the variance explained by the “true” model. The miRNAs that win almost every time (the tallest bars, almost reaching 1) also explain the most variance, and hence produce the best models (red).
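The permutation procedure in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple one-predictor regression, uses R² as the explained variance, and the names `r_squared` and `win_fraction`, the default of 500 permutations, and the simulated data are all hypothetical.

```python
import random

def r_squared(x, y):
    """Explained variance (R^2) of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def win_fraction(mirna, response, n_perm=500, seed=0):
    """Fraction of permutations in which the observed R^2 beats the permuted R^2."""
    rng = random.Random(seed)
    observed = r_squared(mirna, response)
    shuffled = list(mirna)
    wins = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between genes and miRNA values
        if observed > r_squared(shuffled, response):
            wins += 1
    return wins / n_perm

# Hypothetical data: one miRNA score that drives the response, one that does not.
data_rng = random.Random(1)
mirna = [data_rng.gauss(0, 1) for _ in range(100)]
response = [-0.7 * m + data_rng.gauss(0, 0.5) for m in mirna]
noise = [data_rng.gauss(0, 1) for _ in range(100)]

frac_true = win_fraction(mirna, response)   # informative miRNA
frac_noise = win_fraction(noise, response)  # uninformative miRNA
```

An informative miRNA should win in nearly every permutation, which corresponds to the bars that almost reach 1 in the figure, whereas an uninformative one can land anywhere between 0 and 1.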
Figure 3. Linear models for day 5 secreted proteins represented graphically.
(A, B) Polysomal fraction; (C, D) total RNA. (A) and (C): protein logFC plotted against RNA logFC. The dashed blue line is the best-fitting line of the base model (protein logFC against RNA logFC). The solid black line is the identity line, which gives an idea of the real coefficient of the model. The filled colored dots are genes that move after the model with miRNAs is applied; hence, they represent genes that are better explained by our model. The arrows indicate the direction of the movement. (B) and (D): our linear model including the miRNA effect. In this case, the best (multivariate) model is shown: miR-130b and miR-558 (polysomal) and miR-150* (total). Filled dots are the genes that were corrected by our model and are now closer to the protein prediction line of the model (solid red line). The black identity line coincides with the red line. Note that the abscissas of (A) and (C) seem to have a compressed range with respect to the plots below, (B) and (D). This is not a compression, since the x-axes differ: (A) and (C) hold RNA logFC values, while (B) and (D) hold the values of the model including the miRNA effect.
Figure 4. 3′UTR differences for PluriNet genes.
The x-axis shows the ranking of the 3′UTR length differences, determined as described in section 1, for all genes used for logFC calculations in the total RNA fraction. The ranking of the genes belonging to the PluriNet is shown as a density (y-axis on the left). Negative length differences (CT>IN) lie to the left of the red dashed line; positive values lie to the right of the green dashed line. The wide space between those lines corresponds to genes with no difference in 3′UTR length. The median of the rankings is shown as a dotted black line. Blue tick marks show the ranking positions of the PluriNet genes. Above the density plot, the cumulative distribution of rankings is shown. The straight blue line has slope 1 and intercept 0. Gray dots represent the cumulative ranking of the PluriNet genes. The y-axis on the right gives the measure of this cumulative ranking. PluriNet genes with large negative values are under-represented and those with positive values are slightly over-represented. Moreover, only a marginal number of PluriNet genes have values of 0.
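The cumulative ranking curve described in this caption (gray dots compared with a line of slope 1 and intercept 0) can be sketched as below. This is only a minimal illustration with invented gene names; the function `cumulative_rank_curve` is hypothetical, and the ranking is assumed to run from the most negative to the most positive 3′UTR length difference.

```python
def cumulative_rank_curve(ranked_genes, gene_set):
    """Return (x, y) points for the members of gene_set.

    x is the rank of the member as a fraction of all ranked genes,
    y is the fraction of set members seen up to and including that rank.
    """
    members = [g in gene_set for g in ranked_genes]
    total = sum(members)
    if total == 0:
        raise ValueError("no gene of the set is present in the ranking")
    n = len(ranked_genes)
    points, seen = [], 0
    for i, is_member in enumerate(members, start=1):
        if is_member:
            seen += 1
            points.append((i / n, seen / total))
    return points

# Hypothetical ranking, ordered from most negative to most positive difference.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]
gene_set = {"g6", "g8", "g9", "g10"}  # stand-in for a set such as PluriNet

curve = cumulative_rank_curve(ranked, gene_set)
# curve == [(0.6, 0.25), (0.8, 0.5), (0.9, 0.75), (1.0, 1.0)]
```

Points that fall below the diagonal mean that fewer set members than expected have appeared by that rank, which in this toy example reflects a set concentrated at the positive end of the ranking.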
Figure 5. An example of how different microRNA binding sites arise from alternative transcripts.
The table shows the presence of the miRNAs in the transcripts. The longer the 3′UTR, the more binding sites are seen.
Mapping statistics of RNA-seq.
| donor | condition | raw data | reads for mapping | mapped | unmapped | junctions | % |
| 61 | CT_poly | | | | | | |
| 61 | IN_poly | | | | | | |
| 67 | CT_poly | | | | | | |
| 67 | IN_poly | | | | | | |
| 67 | CT_total | | | | | | |
| 70 | CT_poly | | | | | | |
| 70 | IN_poly | | | | | | |
| 70 | CT_total | | | | | | |
| 61 | IN_total | | | | | | |
| 67 | IN_total | | | | | | |
| 70 | IN_total | | | | | | |
| 61 | CT_total | | | | | | |
| 67 | CT_total | | | | | | |
Mapping data of the SOLiD runs. The following data are shown: donor number, condition (CT or IN, and polysomal or total RNA), number of raw reads obtained from the sequencing process, number of reads considered for mapping, number of mapped reads, number of unmapped reads, number of junctions, and the percentage of mapped reads.
Figure 6. Representative table for constructing the model.
For each gene we determined the proportion of FPKM contributed by each isoform in each sample and calculated the difference between conditions. Furthermore, we determined the miRNAs targeting the transcripts (inside their 3′UTRs). Each isoform has a 1 for a given miRNA if that miRNA is present in that transcript, and a 0 otherwise. For each miRNA and each gene, the vector of isoform differences is multiplied element-wise by the presence/absence vector of the miRNA (the assigned 1s and 0s). The intermediate result is thus a vector holding the respective difference if the miRNA is present in the isoform and 0 otherwise. The resulting vector is summed, giving a total value for that miRNA in that gene. This represents the mean weighted usage of the miRNA in that specific gene. Larger positive values indicate that the miRNA is used more (appears more often) in IN than in CT. Larger negative values represent a higher usage in CT (values around 0 indicate the same usage in both). The same procedure is carried out for each miRNA (so a vector of values is assigned to each gene) and for each gene. The gene-wise table below, in addition to showing the resulting values calculated above, also shows the other data needed for the model: the protein logFC values (at day 3, 5 and 7, from Molina et al.) and the respective RNA logFC values (our data).
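The weighted miRNA usage described in this caption can be written out as a short sketch. It is a hedged illustration rather than the authors' code: the function name `mirna_usage`, the isoform and miRNA labels, and the proportions are invented, and the difference is assumed to be taken as IN minus CT so that positive values mean higher usage in IN, as the caption states.

```python
def mirna_usage(prop_ct, prop_in, presence):
    """Weighted miRNA usage for one gene.

    prop_ct, prop_in: isoform -> proportion of the gene's FPKM in CT / IN
    presence: isoform -> set of miRNAs with a site in that isoform's 3'UTR
    Returns miRNA -> sum over isoforms of (prop_in - prop_ct) * presence (0 or 1).
    """
    usage = {}
    for isoform, mirnas in presence.items():
        # Assumed sign convention: IN minus CT.
        diff = prop_in.get(isoform, 0.0) - prop_ct.get(isoform, 0.0)
        for mir in mirnas:
            # An absent miRNA contributes 0, so only present ones are accumulated.
            usage[mir] = usage.get(mir, 0.0) + diff
    return usage

# Hypothetical gene with a short and a long 3'UTR isoform.
prop_ct = {"iso_short": 0.75, "iso_long": 0.25}
prop_in = {"iso_short": 0.25, "iso_long": 0.75}
presence = {
    "iso_short": {"miR-A"},          # site shared by both isoforms
    "iso_long": {"miR-A", "miR-B"},  # extra site only in the longer 3'UTR
}

usage = mirna_usage(prop_ct, prop_in, presence)
# usage == {"miR-A": 0.0, "miR-B": 0.5}
```

In this toy case the site shared by both isoforms nets to 0 because the isoform proportions only swap, while the site found only in the long isoform gets a positive value, reflecting the shift toward the longer 3′UTR in IN.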