Angela Re, Davide Corá, Daniela Taverna, Michele Caselle.
Abstract
In this work, we describe a computational framework for the genome-wide identification and characterization of mixed transcriptional/post-transcriptional regulatory circuits in humans. We concentrated in particular on feed-forward loops (FFLs), in which a master transcription factor regulates a microRNA and, together with it, a set of joint target protein-coding genes. The circuits were assembled with a two-step procedure. We first constructed the transcriptional and post-transcriptional components of the human regulatory network separately, by looking for conserved over-represented motifs in human and mouse promoters and 3'-UTRs. Then, we combined the two subnetworks, looking for mixed feed-forward regulatory interactions, and found a total of 638 putative (merged) FFLs. To investigate their biological relevance, we filtered these circuits using three selection criteria: (I) Gene Ontology enrichment among the joint targets of the FFL, (II) independent computational evidence for the regulatory interactions of the FFL, extracted from external databases, and (III) relevance of the FFL in cancer. Most of the selected FFLs seem to be involved in various aspects of organism development and differentiation. We finally discuss a few of the most interesting cases in detail.
Year: 2009 PMID: 19603121 PMCID: PMC2898627 DOI: 10.1039/b900177h
Source DB: PubMed Journal: Mol Biosyst ISSN: 1742-2051
Fig. 1 Feed-forward loops. (a) Representation of a typical mixed feed-forward loop (FFL) analyzed in this work. In the square box, TF is the master transcription factor; in the diamond-shaped box, miR represents the microRNA involved in the circuit; in the round box, the Joint Target is the joint protein-coding target gene (JT). Inside each circuit, –• indicates transcriptional activation/repression, whilst indicates post-transcriptional repression. (b) Flow-chart of the annotation strategies for the feed-forward circuits. After building the catalogue of closed FFLs (see Fig. 2), each side of the circuit was expanded and analyzed using external support databases and functional annotations. Beside each circuit link, the source used for its annotation is reported; see Materials and Methods for details.
Fig. 2 Flow-chart of our pipeline for the identification of the mixed feed-forward regulatory loops. We built two independent but symmetrical pipelines for the construction of a transcriptional and, separately, a post-transcriptional regulatory network in humans. On the left: we defined a catalogue of core promoter regions around the transcription start sites (TSS) for protein-coding and miRNA genes in the human genome. We then applied a genome-wide sequence analysis strategy to identify a catalogue of putative human transcriptional regulatory motifs and the corresponding regulated genes. The key ingredients were the statistical properties of short DNA words (oligo analysis) and conservation in mouse, implemented in an alignment-free manner (conserved over-representation). On the right: a similar strategy was used, starting from a catalogue of human 3′-UTRs, to obtain a catalogue of human post-transcriptionally regulated genes, with a focus on miRNA-mediated interactions. We fixed the false discovery rate (FDR) level at 0.1 for both motif discovery pipelines. At the end, the two regulatory networks were merged to extract the complete dataset of closed mixed feed-forward loops (FFLs), as defined in Fig. 1a, and the results were filtered according to three different procedures: by looking for (I) significant functional (Gene Ontology) annotations among the joint targets of the FFLs, (II) independent computational evidence for the regulatory interactions of the FFLs, and (III) relevance to cancer. See Materials and Methods for details.
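The merging step described in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes each network is already available as a mapping from regulator to a set of targets; the function name `find_mixed_ffls` and all regulator and gene names below are hypothetical placeholders, not identifiers or data from the paper.

```python
def find_mixed_ffls(tf_to_mir, tf_to_gene, mir_to_gene):
    """Return {(tf, mir): sorted joint targets} for every closed mixed FFL.

    A circuit is closed when the TF regulates the miRNA and both
    regulators share at least one protein-coding target.
    """
    ffls = {}
    for tf, mirs in tf_to_mir.items():
        tf_genes = tf_to_gene.get(tf, set())
        for mir in mirs:
            # Joint targets: genes hit by both the TF and the miRNA
            joint = tf_genes & mir_to_gene.get(mir, set())
            if joint:
                ffls[(tf, mir)] = sorted(joint)
    return ffls


# Toy edges (hypothetical, for illustration only)
tf_to_mir = {"TF_A": {"miR-x", "miR-y"}, "TF_B": {"miR-x"}}
tf_to_gene = {"TF_A": {"G1", "G2", "G3"}, "TF_B": {"G4"}}
mir_to_gene = {"miR-x": {"G2", "G3", "G5"}, "miR-y": {"G6"}}

ffls = find_mixed_ffls(tf_to_mir, tf_to_gene, mir_to_gene)
for (tf, mir), jts in ffls.items():
    print(f"{tf}|{mir}", " ".join(jts))
```

Keying each circuit on the (TF, miRNA) pair mirrors the "TF|miRNA" circuit ids used in the tables, with all joint targets of the pair collected into one merged circuit.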
The most relevant mixed feed-forward loops (FFLs) obtained with the Gene Ontology filter. Mixed FFLs assembled with the pipeline outlined in Fig. 2 and characterized by enriched Gene Ontology functional annotations. For each circuit, we report the circuit id (FFL id: TF|miRNA) and the complete list of joint targets (JTs). We then report some of the most relevant Gene Ontology annotations, with the corresponding p-values evaluated using Fisher’s test. The complete dataset of circuits with their annotations is reported in the ESI, supplementary file S8.† Mature microRNA ids are written according to the standard nomenclature of miRBase;[47] for the TF and JT protein-coding genes, we used the standard HGNC ids. The P and F labels in the last column denote the “biological process” and “function” classifications, respectively
| FFL id | JTs | Fisher test | Gene Ontology characterization |
| AP-4|hsa-miR-133b | ADORA1 AP1GBP1 | 7.42e-5 | endocytosis (P) |
| AREB6|hsa-miR-126 | STRBP HERPUD1 CARD14 TRIM4 NP_995324.1 | 4.01e-6 | cellular developmental process (P) |
| | EGFL7 PIK3R1 WFDC12 CDKN2A KLF10 C17orf70 | 3.63e-5 | regulation of osteoclast differentiation (P) |
| | RORB FBXL2 PPP3CB | 6.20e-5 | leukocyte differentiation (P) |
| AREB6|hsa-miR-375 | PCSK6 LRP5 HABP2 USP6 GUF1 CNN3 PTPN4 | 1.94e-5 | anterior/posterior pattern formation (P) |
| | XR_017284.1 ATPAF1 LCN1L1 NLGN3 LRFN1 AQP4 | 7.86e-5 | regionalization (P) |
| | TCF2 | | |
| C-REL|hsa-miR-126 | ARHGAP22 DSCR1 EGFR PIK3R2 Q96N05_HUMAN | 2.64e-6 | regulation of cell migration (P) |
| | TOX2 PIK3R1 PARP16 ADAMTS9 EGFL7 | 2.97e-6 | phosphoinositide 3-kinase regulator activity (F) |
| | | 4.00e-6 | regulation of cell motility (P) |
| | | 4.74e-6 | regulation of locomotion (P) |
| C-REL|hsa-miR-199a | ENO3 DDR1 SP2 CCNL1 PALLD | 9.10e-5 | transmembrane receptor protein tyrosine kinase activity (F) |
| ELF-1|hsa-miR-342 | C22orf15 ADAMTS5 CCDC32 IBRDC2 C5orf24 UBE4B CCR2 RPE PHB Q6PK04_HUMAN | 2.97e-6 | protein ubiquitination during ubiquitin-dependent protein catabolic process (P) |
| ER|hsa-miR-135b | GBE1 HCN2 CD99L2 TTC21A BSN RNASE11 NP_787078.1 PRLR | 4.11e-5 | cellular protein complex assembly (P) |
| | ANGPT2 Q49AQ9_HUMAN | | |
| | ZNF69 FAM129A FMOD IL11 ISCA1 PR285_HUMAN | | |
| | CITED1 TGM2 MUSK DEFB123 MFSD3 C17orf28 | | |
| | NP_057628.1 LZTS2 | | |
| HMGIY|hsa-miR-152 | EDG1 Q86V52_HUMAN DMRTA2 SLC25A32 FGF1 ITGA5 MEOX2 EPAS1 ZNF33A ADAM17 MAPK6 RNF182 | 6.48e-5 | angiogenesis (P) |
| ICSBP|hsa-miR-223 | ADM GAST PRL GTDC1 FOXO3A | 1.40e-6 | hormone activity (F) |
| | | 2.18e-5 | reproductive process (P) |
| | | 7.49e-5 | multicellular organism reproduction (P) |
| IRF1|hsa-miR-126 | EGFR EGFL7 GOLPH3 BDH2 ZADH2 | 8.01e-5 | regulation of cell migration (P) |
| IRF-7|hsa-miR-26a | VAX1 GALNT10 CA3 EIF2S1 NDUFA4 | 8.01e-5 | regulation of cell migration (P) |
| | ARP19_HUMAN FBXO42 RPIA FBXL19 ALS2CR2 | 6.25e-5 | cellular response to stress (P) |
| | XR_017723.1 GSK3B DBR1 TTC13 NT5DC1 | | |
| MYC|hsa-miR-17-5p | BICC1 STK33 VSX1 EDD1 SLC24A4 NFAT5 E2F1 | 9.40e-0 | cellular metabolic process (P) |
| | C21orf25 C9orf117 MYNN MAPK1 | 9.56e-5 | primary metabolic process (P) |
| MYOD|hsa-miR-140 | ANK2 TSSK2 EIF2AK1 HMX2 THY1 ALAS2 UROC1 | 7.20e-6 | hemoglobin metabolic process (P) |
| | CDKL4 PPARA CYBB PPL CDS2 ZIC3 | 6.61e-5 | organ development (P) |
| SRY|hsa-miR-26a | FANCA GSK3B RPIA Q6ZQV3_HUMAN ALS2CR2 | 2.68e-5 | protein export from nucleus (P) |
| | KIF1C RG9MTD2 CDS1 BAG4 PPP2R3C | 5.64e-5 | anti-apoptosis (P) |
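As an illustration of how a Gene Ontology enrichment p-value of this kind can be obtained, the following sketch computes a one-sided Fisher exact test as a hypergeometric tail probability. The choice of a one-sided test, the background definition and the numbers in the usage lines are assumptions made for the example; the source only states that Fisher's test was used.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p-value for over-representation.

    N: genes in the background set
    K: background genes annotated with the GO term
    n: joint targets of the circuit
    k: joint targets annotated with the GO term
    Returns P(X >= k) for X hypergeometrically distributed.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical numbers: 2 of 2 joint targets carry a term that
# annotates 3 of 10 background genes.
p = fisher_enrichment_p(k=2, n=2, K=3, N=10)
print(f"{p:.4f}")
```

Because many GO terms are tested for each circuit, a multiple-testing correction would normally follow; that step is outside this sketch.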
Summary of the external annotations of mixed feed-forward loops, with examples. (a) General view: here we report the number of circuits in our database grouped by the number of links (from 1 to 3) that obtained an external annotation. Detailed view: here we specify the external resources used for the annotation scheme and their relative contributions. We report the number of circuits with an assessed link between: the transcription factor (TF) and the miRNA [TF → miR]; the TF and a joint target (JT) protein-coding gene [TF → JT]; the mature microRNA (miR) and a JT [miR → JT]. (b) Selection of a few circuits validated by the above tests. The complete dataset of circuits is reported in the ESI, supplementary file S8.† For each circuit, we report the circuit id (FFL id: TF|miRNA) and the complete list of JTs. Mature microRNA ids are written according to the standard nomenclature of miRBase;[47] for the TF and JT protein-coding genes, we used the standard HGNC ids
| (a) | General view: | |
| Number of annotated links | Number of circuits | |
| 3 | 75 | |
| 2 | 207 | |
| 1 | 334 | |
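The general view above is a tally of circuits by the number of externally supported links. A minimal sketch of such a tally follows, assuming each circuit comes with the set of its links that found external support; the circuit ids and support sets are hypothetical and do not reproduce the counts in the table.

```python
from collections import Counter

# The three links of a mixed FFL
LINKS = ("TF->miR", "TF->JT", "miR->JT")

# Hypothetical circuits mapped to the links with external support
support = {
    "TF_A|miR-x": {"TF->miR", "TF->JT", "miR->JT"},
    "TF_A|miR-y": {"miR->JT"},
    "TF_B|miR-x": {"TF->JT", "miR->JT"},
    "TF_C|miR-z": set(),
}

# Number of annotated links per circuit
n_links = {ffl: len(found & set(LINKS)) for ffl, found in support.items()}

# Circuits without any supported link are left out of the tally
tally = Counter(n for n in n_links.values() if n > 0)
for n in sorted(tally, reverse=True):
    print(n, tally[n])
```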
Top ten transcription factors and microRNAs ranked by out-degree and in-degree, respectively. Considering the links between transcription factors (TFs) and microRNA (miRNA) promoters defined in our transcriptional network [TF → miR link], we list the top ten TFs and miRNAs according to their out- and in-degree. The out-degree of a TF is defined as the number of miRNAs directly controlled by that TF. The in-degree of a miRNA is defined as the total number of TFs acting on it
| TF | Out-degree | miRNA | In-degree |
| MEIS1 | 31 | hsa-mir-148b | 15 |
| ER | 30 | hsa-mir-203 | 14 |
| SRY | 29 | hsa-mir-181d | 13 |
| HNF-1 | 27 | hsa-mir-99a | 12 |
| SOX-5 | 27 | hsa-mir-125b-2 | 12 |
| LEF1 | 23 | hsa-mir-423 | 11 |
| AREB6 | 22 | hsa-mir-129-2 | 11 |
| NCX | 18 | hsa-mir-149 | 11 |
| SRF | 18 | hsa-mir-214 | 11 |
| C-REL | 17 | hsa-mir-296 | 11 |
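The out-degree and in-degree defined in the caption can be computed directly from a list of TF → miR links. The sketch below assumes the links are available as (TF, miRNA) pairs; the edge list is a hypothetical toy example, not the network behind the table.

```python
from collections import Counter

# Hypothetical TF -> miRNA links (one tuple per regulatory link)
edges = [
    ("TF_A", "mir-1"),
    ("TF_A", "mir-2"),
    ("TF_B", "mir-1"),
    ("TF_C", "mir-1"),
    ("TF_A", "mir-1"),  # duplicate link, counted once
]

unique_edges = set(edges)

# Out-degree: number of distinct miRNAs controlled by each TF
out_degree = Counter(tf for tf, _ in unique_edges)
# In-degree: number of distinct TFs acting on each miRNA
in_degree = Counter(mir for _, mir in unique_edges)

top_tfs = out_degree.most_common(10)
top_mirs = in_degree.most_common(10)
print(top_tfs)
print(top_mirs)
```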
Cancer-related circuits. Here, we report the circuits that involve at least two cancer-related items. For each circuit, we indicate the circuit id (FFL id) in the first column, the master transcription factor (TF) in the second column, the microRNA (miRNA) in the third column and the joint protein-coding target genes (JTs) in the fourth column. For each circuit, only its cancer-related items are listed in the table, according to the role they serve within the circuit. In the upper panel, we report circuits for which the regulatory motifs in the promoter regions of the miRNA and of the JTs can be associated with a known TF. In the bottom panel, we report circuits for which the regulatory motif is uncharacterized. FFL id is the identifier of a merged circuit, composed of the TF and miRNA names (TF|miRNA) or, in the case of an unknown TF, of the exact DNA motif and the miRNA name. Mature miRNA ids are written according to the standard nomenclature of miRBase;[47] for the TF and JT protein-coding genes, we used the standard HGNC ids. For each circuit, the complete list of joint targets is available in the ESI, supplementary file S8†
| FFL id | TF | miRNA | JTs |
| AP-1|hsa-miR-142-3p | | hsa-miR-142-3p | DDIT3 |
| ATF-1|hsa-miR-199a* | | hsa-miR-199a* | MTCP1 |
| ATF6|hsa-miR-199a* | | hsa-miR-199a* | MTCP1 |
| ER|hsa-miR-375 | | | TPR, USP6 |
| HIF-1|hsa-miR-199a* | | hsa-miR-199a* | MTCP1 |
| HNF-3|hsa-let-7a | | hsa-let-7a | CCND2 |
| HNF-3|hsa-let-7f | | hsa-let-7f | CCND2 |
| HNF-3|hsa-miR-30a-5p | | | MYH11, BCL9 |
| HNF-3|hsa-miR-30c | | | MYH11, BCL9 |
| HSF2|hsa-let-7a | | hsa-let-7a | MYCN |
| HSF2|hsa-let-7f | | hsa-let-7f | MYCN |
| HSF2|hsa-miR-199a* | | hsa-miR-199a* | MYCN |
| IRF|hsa-miR-125b | | hsa-miR-125b | BCL2 |
| IY|hsa-miR-296 | | | RPL22, BCL2 |
| MYC|hsa-miR-17-5p | MYC | hsa-miR-17-5p | |
| MYC|hsa-miR-19a | MYC | hsa-miR-19a | |
| MYC|hsa-miR-20a | MYC | hsa-miR-20a | |
| NF-Y|hsa-miR-223 | | | APC, ATF1 |
| OCTAMER|hsa-miR-125b | | hsa-miR-125b | IRF4 |
| PAX-4|hsa-miR-125b | | hsa-miR-125b | IRF4 |
| SOX-5|hsa-miR-125b | | hsa-miR-125b | SS18 |
| SOX-5|hsa-miR-29a | | | EXT1, COL1A1 |
| SRY|hsa-miR-221 | | hsa-miR-221 | CCND2 |
| SRY|hsa-miR-412 | | | BRAF, ATIC |
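The selection rule stated in the caption (keep circuits with at least two cancer-related items and list only those items by role) can be sketched as follows. The reference sets of cancer-related genes and miRNAs, the function names and the circuits are hypothetical placeholders; the actual resources used are described in the paper's Materials and Methods.

```python
def cancer_related_items(circuit, cancer_genes, cancer_mirnas):
    """Return the cancer-related members of a circuit, grouped by role."""
    return {
        "TF": [circuit["tf"]] if circuit["tf"] in cancer_genes else [],
        "miRNA": [circuit["mirna"]] if circuit["mirna"] in cancer_mirnas else [],
        "JTs": [g for g in circuit["jts"] if g in cancer_genes],
    }


def is_cancer_circuit(circuit, cancer_genes, cancer_mirnas, min_items=2):
    """True if the circuit contains at least `min_items` cancer-related items."""
    items = cancer_related_items(circuit, cancer_genes, cancer_mirnas)
    return sum(len(v) for v in items.values()) >= min_items


# Hypothetical reference sets and circuits
cancer_genes = {"GENE_1", "GENE_2", "TF_A"}
cancer_mirnas = {"miR-x"}
circuits = [
    {"tf": "TF_A", "mirna": "miR-x", "jts": ["GENE_9"]},
    {"tf": "TF_B", "mirna": "miR-y", "jts": ["GENE_1", "GENE_2"]},
    {"tf": "TF_B", "mirna": "miR-x", "jts": ["GENE_9"]},
]

selected = [c for c in circuits if is_cancer_circuit(c, cancer_genes, cancer_mirnas)]
for c in selected:
    print(f"{c['tf']}|{c['mirna']}", cancer_related_items(c, cancer_genes, cancer_mirnas))
```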
Fig. 3 Graphical representation of Type I and Type II circuits. TF is the master transcription factor, miR represents the microRNA involved in the circuit and Joint Target is the joint target gene. Inside each circuit, → indicates transcriptional activation, whilst indicates transcriptional or post-transcriptional repression. In representing Type I and Type II circuits, we followed the nomenclature used in ref. 21.