Gianluca Corrado, Toma Tebaldi, Giulio Bertamini, Fabrizio Costa, Alessandro Quattrone, Gabriella Viero, Andrea Passerini.
Abstract
BACKGROUND: The progress in mapping RNA-protein and RNA-RNA interactions at the transcriptome-wide level paves the way to decipher possible combinatorial patterns embedded in post-transcriptional regulation of gene expression.
Year: 2014 PMID: 24758252 PMCID: PMC4234518 DOI: 10.1186/1471-2164-15-304
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1 PTRcombiner workflow scheme. PTRcombiner is composed of two main methodological parts. The first, called “Mining combinatorial features” (orange panel), identifies groups (clusters) of regulatory elements (trans-factors) that act on common mRNAs. The second, called “Analyzing combinatorial features” (blue panels), explores the identified clusters to evaluate their biological characteristics in terms of commonly regulated mRNAs, Gene Ontology enrichments, and compatibility of binding sites among trans-factors belonging to the same cluster.
Figure 2 Interaction maps annotated in AURA 2. (A) Human trans-factors (RBPs and miRNAs) were ordered according to the number of annotated target UTRs. Trans-factors with fewer than 750 distinct UTR targets are not shown. (B) Distribution of the number of distinct trans-factors bound to the same UTR. (C) Graphical representation of the Boolean interaction matrix, derived from the input pairwise interactions. Each row corresponds to a trans-factor, each column to a UTR. Positive interactions are displayed in red.
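The Boolean interaction matrix of panel (C) can be sketched as follows. This is a minimal illustration, not the PTRcombiner implementation: the function name `build_interaction_matrix` and the toy pairs (including the UTR labels) are hypothetical. Row sums give the number of targets per trans-factor, as in panel (A), and column sums give the number of trans-factors per UTR, as in panel (B).

```python
def build_interaction_matrix(pairs):
    """Build a Boolean matrix from (trans_factor, utr) interaction pairs.

    Rows are trans-factors and columns are UTRs; a cell is True when the
    pair is present in the input. Duplicate pairs are harmless.
    """
    factors = sorted({f for f, _ in pairs})
    utrs = sorted({u for _, u in pairs})
    row = {f: i for i, f in enumerate(factors)}
    col = {u: j for j, u in enumerate(utrs)}
    matrix = [[False] * len(utrs) for _ in factors]
    for f, u in pairs:
        matrix[row[f]][col[u]] = True
    return factors, utrs, matrix


# Hypothetical toy input: the UTR labels are invented for illustration.
pairs = [
    ("ELAVL1", "UTR_A"),
    ("ELAVL1", "UTR_B"),
    ("TIA1", "UTR_B"),
    ("hsa-miR-16", "UTR_C"),
    ("ELAVL1", "UTR_A"),  # duplicate annotation, counted once
]
factors, utrs, matrix = build_interaction_matrix(pairs)

# Row sums: distinct target UTRs per trans-factor.
targets_per_factor = {f: sum(r) for f, r in zip(factors, matrix)}
# Column sums: distinct trans-factors bound to each UTR.
factors_per_utr = {
    u: sum(matrix[i][j] for i in range(len(factors)))
    for j, u in enumerate(utrs)
}
```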
Figure 3 Mining combinatorial features: identifying trans-factor clusters. The average size (i.e. the number of trans-factor members) of the identified clusters is displayed at different combinations of k and τ values. (A) The white spot marks the configuration of the parameters selected to extract the clusters in the presence of recurrent trans-factors. (B) For each trans-factor, the number of occurrences in the identified clusters is plotted on the x axis, against the number of bound UTRs on the y axis. Recurrent trans-factors occurring in more than one cluster are labeled in orange. (C) After all the recurrent trans-factors were removed from the analysis, the average size of the identified clusters was displayed on the y axis at different combinations of k and τ values. The white spot identifies the configuration of parameters selected to extract the clusters of sporadic trans-factors. (D) The proportions among the number of recurrent, sporadic, and absent trans-factors are shown. (E) The proportions among the number of interactions associated with recurrent, sporadic, and absent trans-factors are shown.
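Panel (B) labels a trans-factor as recurrent when it occurs in more than one cluster. A minimal sketch of that count is given below, assuming clusters are available as a mapping from cluster name to member list. The function name `recurrent_factors` is hypothetical, and only four rows of the cluster list are used, so the output illustrates the rule rather than reproducing the published classification.

```python
from collections import Counter


def recurrent_factors(clusters):
    """Return (recurrent, counts) for a dict of cluster name -> members.

    A trans-factor is treated as recurrent when it occurs in more than
    one cluster; each cluster contributes at most one occurrence.
    """
    counts = Counter(
        factor for members in clusters.values() for factor in set(members)
    )
    recurrent = {factor for factor, n in counts.items() if n > 1}
    return recurrent, counts


# Illustrative subset of the inferred clusters (not the full list).
clusters = {
    "R04": ["ELAVL1", "HNRNPD"],
    "R07": ["AGO1", "FMR1_iso1", "FMR1_iso7"],
    "R11": ["AGO1", "HNRNPH"],
    "R13": ["PUM1"],
}
recurrent, counts = recurrent_factors(clusters)
```

On this subset only AGO1 appears twice; with the full cluster list the counts, and therefore the recurrent set, would differ.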
List of the inferred clusters in the presence of recurrent trans-factors
| RBP | Clust R01 | AGO1, AGO2, ELAVL1, FMR1_iso1, FMR1_iso7, |
| | | FXR2, LIN28A, LIN28B, MOV10, TIA1, TIAL1, |
| | | ZC3H7B |
| RBP | Clust R02 | AGO1, AGO2, ELAVL1, IGF2BP1, IGF2BP2, |
| | | IGF2BP3, TIAL1 |
| Singleton | Clust R03 | AGO1 |
| RBP | Clust R04 | ELAVL1, HNRNPD |
| RBP | Clust R05 | AGO1, AGO2, ELAVL1, EWSR1, FMR1_iso1, |
| | | FUS, LIN28A, LIN28B, TAF15, TIA1, TIAL1, |
| | | ZC3H7B |
| RBP | Clust R06 | AGO1, ELAVL1, TIA1, TIAL1 |
| RBP | Clust R07 | AGO1, FMR1_iso1, FMR1_iso7 |
| RBP | Clust R08 | AGO1, AGO2, CAPRIN1, ELAVL1, FMR1_iso1, |
| | | FMR1_iso7, LIN28B, TIA1, TIAL1, ZC3H7B |
| RBP | Clust R09 | AGO1, AGO2, C22ORF28, ELAVL1, FMR1_iso1, |
| | | FMR1_iso7, LIN28B, TIA1, TIAL1, ZC3H7B |
| RBP-miRNA | Clust R10 | LIN28A, LIN28B, hsa-miR-221* |
| RBP | Clust R11 | AGO1, HNRNPH |
| RBP | Clust R12 | AGO1, AGO2, ELAVL1, FMR1_iso1, HNRNPC, |
| | | TIA1, TIAL1 |
| Singleton | Clust R13 | PUM1 |
| RBP | Clust R14 | AGO1, AGO2, ELAVL1, FMR1_iso1, FMR1_iso7, |
| | | HNRNPU, TIA1, TIAL1 |
| RBP | Clust R15 | AGO1, AGO2, ELAVL1, FMR1_iso1, FMR1_iso7, |
| | | HNRNPF, TIA1, TIAL1 |
| RBP | Clust R16 | AGO1, AGO2, ELAVL1, EWSR1, FMR1_iso1, |
| | | FMR1_iso7, FXR1, FXR2, LIN28A, LIN28B, TIA1, |
| | | TIAL1, ZC3H7B |
| RBP | Clust R17 | AGO1, AGO2, ELAVL1, FMR1_iso1, IGF2BP1, |
| | | IGF2BP2, IGF2BP3, PUM2, TIA1, TIAL1 |
| Singleton | Clust R18 | PABPC1 |
| Singleton | Clust R19 | U2AF2 |
| RBP-miRNA | Clust R20 | AGO1, AGO2, ELAVL1, FMR1_iso1, IGF2BP1, |
| | | IGF2BP2, IGF2BP3, TIA1, TIAL1, hsa-miR-130a, |
| | | hsa-miR-130b, hsa-miR-148a, hsa-miR-148b, |
| | | hsa-miR-301a, hsa-miR-301b |
| RBP-miRNA | Clust R21 | AGO1, AGO2, ELAVL1, FMR1_iso1, IGF2BP1, |
| | | IGF2BP2, IGF2BP3, TIA1, TIAL1, hsa-miR-15a, |
| | | hsa-miR-15b, hsa-miR-16, hsa-miR-424 |
| Singleton | Clust R22 | DGCR8 |
| RBP-miRNA | Clust R23 | AGO1, AGO2, ELAVL1, FMR1_iso1, IGF2BP1, |
| | | IGF2BP2, IGF2BP3, TIA1, TIAL1, hsa-miR-106b, |
| | | hsa-miR-17, hsa-miR-20a, hsa-miR-320, |
| | | hsa-miR-93 |
| RBP-miRNA | Clust R24 | AGO1, AGO2, ELAVL1, IGF2BP1, IGF2BP2, |
| | | IGF2BP3, TIAL1, hsa-let-7a, hsa-let-7b, |
| | | hsa-let-7c, hsa-let-7d, hsa-let-7e, hsa-let-7f, |
| | | hsa-let-7g, hsa-let-7i |
| RBP | Clust R25 | AGO1, AGO2, ELAVL1, FMR1_iso1, |
| | | HNRNPA2B1, TIA1, TIAL1 |
Clusters were classified according to their composition: “RBP” for those composed exclusively of RBPs; “miRNA”, composed exclusively of miRNAs; “RBP-miRNA”, composed of both RBPs and miRNAs; and “Singleton”, composed of only one trans-factor.
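The classification rule above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and recognizing miRNAs by the `hsa-` prefix is an assumption based on the identifiers that appear in the tables, not a rule stated by the authors.

```python
def is_mirna(name):
    # Assumption: miRNA identifiers start with "hsa-" (e.g. hsa-miR-16,
    # hsa-let-7a); every other identifier is treated as an RBP.
    return name.startswith("hsa-")


def classify_cluster(members):
    """Classify a cluster as Singleton, RBP, miRNA, or RBP-miRNA."""
    if not members:
        raise ValueError("a cluster needs at least one trans-factor")
    if len(members) == 1:
        return "Singleton"
    kinds = {is_mirna(m) for m in members}
    if kinds == {True}:
        return "miRNA"
    if kinds == {False}:
        return "RBP"
    return "RBP-miRNA"


# Member lists copied from the cluster tables.
examples = {
    "R03": ["AGO1"],
    "R04": ["ELAVL1", "HNRNPD"],
    "R10": ["LIN28A", "LIN28B", "hsa-miR-221*"],
    "S09": ["hsa-miR-15a", "hsa-miR-15b", "hsa-miR-16", "hsa-miR-424"],
    "S24": ["hsa-miR-124"],
}
labels = {name: classify_cluster(m) for name, m in examples.items()}
```

The singleton check comes first, so a cluster with a single miRNA (such as S24) is labeled “Singleton” rather than “miRNA”, matching the tables.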
List of the inferred clusters composed of sporadic trans-factors
| Singleton | Clust S01 | HNRNPD |
| RBP | Clust S02 | CAPRIN1, FUS, FXR1, MOV10, TAF15 |
| Singleton | Clust S03 | HNRNPH |
| RBP | Clust S04 | C22ORF28, CAPRIN1, MOV10 |
| Singleton | Clust S05 | HNRNPC |
| Singleton | Clust S06 | HNRNPU |
| Singleton | Clust S07 | HNRNPF |
| Singleton | Clust S08 | PUM1 |
| miRNA | Clust S09 | hsa-miR-15a, hsa-miR-15b, hsa-miR-16, |
| | | hsa-miR-424 |
| RBP-miRNA | Clust S10 | PUM2, hsa-miR-130a, hsa-miR-130b, |
| | | hsa-miR-148a, hsa-miR-148b, hsa-miR-19a, |
| | | hsa-miR-19b, hsa-miR-301a, hsa-miR-301b |
| Singleton | Clust S11 | HNRNPA2B1 |
| Singleton | Clust S12 | PABPC1 |
| Singleton | Clust S13 | U2AF2 |
| miRNA | Clust S14 | hsa-miR-106b, hsa-miR-17, hsa-miR-20a, |
| | | hsa-miR-93 |
| RBP | Clust S15 | MOV10, PUM2 |
| miRNA | Clust S16 | hsa-let-7a, hsa-let-7b, hsa-let-7c, hsa-let-7d, |
| | | hsa-let-7e, hsa-let-7f, hsa-let-7g, hsa-let-7i |
| Singleton | Clust S17 | DGCR8 |
| Singleton | Clust S18 | C17ORF85 |
| Singleton | Clust S19 | TARDBP |
| RBP | Clust S20 | FUS, MOV10, TAF15 |
| RBP-miRNA | Clust S21 | PUM2, hsa-miR-103, hsa-miR-107, |
| | | hsa-miR-183, hsa-miR-221, hsa-miR-222, |
| | | hsa-miR-23b, hsa-miR-25, hsa-miR-27a, |
| | | hsa-miR-27b, hsa-miR-32, hsa-miR-92a, |
| | | hsa-miR-96 |
| miRNA | Clust S22 | hsa-miR-103, hsa-miR-107, hsa-miR-15a, |
| | | hsa-miR-15b, hsa-miR-16, hsa-miR-29a, |
| | | hsa-miR-29b, hsa-miR-29c, hsa-miR-424 |
| Singleton | Clust S23 | CELF1 |
| Singleton | Clust S24 | hsa-miR-124 |
| Singleton | Clust S25 | hsa-miR-1 |
Clusters were classified according to their composition, as for Table 1.
Figure 4 Analyzing combinatorial features, step 1: biological annotation of the trans-factor clusters. The numbers of HGNC genes targeted by the top five ranking clusters that included either recurrent trans-factors (A) or sporadic trans-factors (B) are displayed. The Jaccard similarities among the top five ranking clusters that included recurrent trans-factors (C) or sporadic trans-factors (D) are given. Heatmaps show the top enriched Molecular Function GO terms associated with the lists of genes targeted by the top five ranking clusters obtained with recurrent trans-factors (E) or with sporadic trans-factors (F). Singletons were excluded from the analyses.
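The Jaccard similarity in panels (C) and (D) is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows, assuming the similarity is computed on each cluster's set of target genes. The gene identifiers and cluster names are hypothetical placeholders, and returning 0.0 for two empty sets is a convention chosen here.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen for two empty sets
    return len(a & b) / len(union)


# Hypothetical target gene sets for three clusters.
targets = {
    "cluster_1": {"G1", "G2", "G3"},
    "cluster_2": {"G2", "G3", "G4"},
    "cluster_3": {"G5"},
}
similarities = {
    (x, y): jaccard(targets[x], targets[y])
    for x, y in combinations(sorted(targets), 2)
}
```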
Figure 5 Analyzing combinatorial features, step 1: intra-cluster enrichment analysis. For cluster S2, the first column of the heatmap displays the top enriched GO terms associated with its target mRNAs, with the GO terms for biological processes (BP) in the upper panel, cellular components (CC) in the middle panel, and molecular functions (MF) in the lower panel. The remaining columns show the enrichment analysis performed on the lists of mRNAs interacting with each individual member of cluster S2. Cells are colored according to the enrichment P values, with significant enrichments displayed in shades of blue. The top rows of each panel list the semantic similarity values between enriched terms associated with the cluster and those associated with the single trans-factors.
Figure 6 Analyzing combinatorial features, step 2: RBP site classification. The pairwise classification performance values are displayed for cluster S2 (A) and cluster S4 (B). AUROC values are shown in the top-right halves of the tables and colored in shades of green, while F1-scores are shown in the bottom-left halves of the tables and colored in shades of blue. The distributions of the distances between binding sites of two distinct trans-factors belonging to cluster S2 (C) and S4 (D) are also shown.
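The two performance measures in panels (A) and (B) can be computed from labels and classifier outputs as sketched below. The classifier itself is not described in this record, so the labels, scores, and 0.5 decision threshold are hypothetical. The area under the ROC curve is computed here through its rank interpretation: the fraction of positive-negative pairs in which the positive example scores higher, with ties counted as one half.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, predictions) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, predictions) if l == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: label 1 = site of trans-factor X, 0 = site of Y.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

area = auroc(labels, scores)
f1 = f1_score(labels, predictions)
```

The pairwise form is quadratic in the number of examples; it is used here for clarity, and a sort-based implementation would be preferable for large binding-site sets.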
Figure 7 Analyzing combinatorial features: comparison between unbalanced and balanced normalization. The numbers of HGNC genes targeted by the top five ranking clusters obtained using unbalanced normalization (A) or balanced normalization (B) are displayed. Jaccard similarities among the target genes of the top five ranking clusters obtained with unbalanced normalization (C) or balanced normalization (D) are shown. Heatmaps show the top enriched Molecular Function GO terms associated with the lists of genes targeted by the top five ranking clusters obtained with unbalanced normalization (E) or balanced normalization (F). Singletons were excluded from the analyses.
Figure 8 Interaction maps annotated for yeast. (A) Yeast trans-factors (41 RBPs) were ordered according to the number of their annotated target genes. Trans-factors with fewer than 100 distinct targets are not shown. (B) Distribution of the number of distinct trans-factors bound to the same gene. (C) Graphical representation of the Boolean interaction matrix, derived from the input pairwise interactions. Each row corresponds to a trans-factor, each column to a gene. Positive interactions are displayed in red.