Markus List, Azim Dehghani Amirabad, Dennis Kostka, Marcel H Schulz.
Abstract
MOTIVATION: MicroRNAs (miRNAs) are important non-coding post-transcriptional regulators that are involved in many biological processes and human diseases. Individual miRNAs may regulate hundreds of genes, giving rise to a complex gene regulatory network in which transcripts carrying miRNA binding sites act as competing endogenous RNAs (ceRNAs). Several methods for the analysis of ceRNA interactions exist, but these often do not adjust for statistical confounders or address the problem that more than one miRNA interacts with a target transcript.
Year: 2019 PMID: 31510670 PMCID: PMC6612827 DOI: 10.1093/bioinformatics/btz314
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Overview of the SPONGE workflow. (A) Predicted and/or experimentally validated gene–miRNA interactions are subjected to regularized regression on gene and miRNA expression data. Interactions with negative coefficients are retained since they indicate miRNA-induced inhibition of gene expression. (B) We compute sensitivity correlation coefficients for gene pairs based on shared miRNAs identified in (A). (C) Given the sample number, we compute empirical null models for various gene–gene correlation coefficients (k) and numbers of miRNAs (m). Sensitivity correlation coefficients are assigned to the best matching null model and a P-value is inferred. (D) After multiple testing correction, significant ceRNA interactions can be used to construct a genome-wide, disease or dataset-specific ceRNA interaction network
Fig. 2. Comparison of sensitivity correlation and SPONGE FDR control on liver cancer data. (A) mscor values (y-axis) compared to maximal scor values (x-axis) for the same gene–gene interaction. (B) mscor P-values obtained from sampling compared to P-value summarization of scor values using Fisher’s method. (C) Boxplot of gene–gene correlations for gene–miRNA–gene triplets obtained after selecting the top 5% ceRNA interactions according to the raw scor values (orange) or based on FDR corrected P-values from SPONGE (blue). t-test P-value between both distributions is shown on top
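Fisher's method, used in panel (B) to summarize per-miRNA scor P-values, combines k independent P-values into a single chi-square statistic with 2k degrees of freedom. The sketch below is a minimal standard-library version; the function name `fisher_combine` is hypothetical, and it assumes all P-values are strictly positive and independent.

```python
from math import exp, log

def fisher_combine(pvalues):
    """Combine independent P-values with Fisher's method.

    Returns (statistic, combined P-value). The statistic is
    -2 * sum(ln p) and follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis.
    """
    k = len(pvalues)
    stat = -2.0 * sum(log(p) for p in pvalues)
    # For an even number of degrees of freedom (2k) the chi-square
    # survival function has the closed form
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, exp(-half) * total

# Hypothetical per-miRNA P-values for one gene pair.
stat, combined = fisher_combine([0.05, 0.05])
print(stat, combined)
```

Because the miRNAs shared by a gene pair are rarely independent, the combined value should be read as an approximation.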
Fig. 3. Runtime comparison between SPONGE and JAMI, a fast method for computing ceRNA interactions based on CMI. (A) Runtime for varying number of samples on a fixed set of ca. 80 000 triplets. (B) Runtime for varying number of triplets on a fixed number of samples. Time was measured in CPU hours (y-axis)
Fig. 4. Analysis of SPONGE ceRNA interactions on the pan-cancer dataset. Barplots show the number of interactions (y-axis) that are initially analyzed (grey), obtained after the regression filter (Step 1) and after computing mscor values and FDR correction of empirical P-values (Step 3). The analysis is shown for miRNA–gene relationships for which miRNA binding sites (seeds) have been predicted (orange bars) and for a large set of true-negative miRNA–gene relationships, investigating miRNAs without seed matches in a given gene (blue bars)
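The last filtering step, an empirical P-value followed by FDR correction, can be sketched as follows. The record does not name the correction procedure, so Benjamini–Hochberg is an assumption here, as are the function names and the pseudocount in the empirical P-value.

```python
def empirical_pvalue(observed, null_values):
    """One-sided empirical P-value of an observed score.

    Counts null scores at least as large as the observed one and adds
    a pseudocount of one so the estimate is never exactly zero.
    """
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P-value so that the
    # adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical mscor values tested against a small simulated null.
null_scores = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20]
observed_scores = [0.25, 0.04, 0.11]
pvals = [empirical_pvalue(s, null_scores) for s in observed_scores]
print(benjamini_hochberg(pvals))
```

In practice the null distribution would contain far more sampled values than this toy list, since the smallest attainable empirical P-value is bounded by the number of null samples.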
Number of genes participating in significant ceRNA pan-cancer interactions (FDR < 1e−5) divided by Ensembl gene type
| Gene type | Number of genes |
|---|---|
| Protein coding | 12 776 |
| Pseudogenes | 1529 |
| lincRNA | 1086 |
| Antisense | 1025 |
| Processed transcript | 207 |
| Sense intronic | 69 |
| Sense overlapping | 67 |
Top 10 ceRNA regulating genes with highest node degree among genes differentially expressed between cancer and tumor-adjacent samples
| Rank | Ensembl gene id | HGNC gene symbol | Degree |
|---|---|---|---|
| 1 | ENSG00000038427 | VCAN | 1135 |
| 2 | ENSG00000113810 | SMC4 | 923 |
| 3 | ENSG00000166851 | PLK1 | 812 |
| 4 | ENSG00000115414 | FN1 | 698 |
| 5 | ENSG00000142945 | KIF2C | 519 |
| 6 | ENSG00000134013 | LOXL2 | 513 |
| 7 | ENSG00000141756 | FKBP10 | 481 |
| 8 | ENSG00000227036 | LINC00511 | 478 |
| 9 | ENSG00000258947 | TUBB3 | 433 |
| 10 | ENSG00000106089 | STX1A | 391 |
Note: The full table with 141 differentially expressed genes is shown in Supplementary Table S1.
Fig. 5. (A) Degree of ceRNA genes with mean expression (TPM > 100) and differential expression between cancer and tumor-adjacent samples (FDR < 0.01 and log fold change > 1). Number of ceRNA interactions (y-axis) is compared to mean expression (x-axis). Differential expression magnitude is shown as color code in the plot. (B) The 10 genes with highest degree ranked by their survival analysis P-value. (C) Kaplan–Meier survival plot of the non-coding RNA LINC00511