Klev Diamanti, Husen M Umer, Marcin Kruczyk, Michał J Dąbrowski, Marco Cavalli, Claes Wadelius, Jan Komorowski.
Abstract
Gene transcription is regulated mainly by transcription factors (TFs). ENCODE and Roadmap Epigenomics provide global binding profiles of TFs, which can be used to identify regulatory regions. To this end, we implemented a method to systematically construct cell-type-specific and species-specific maps of regulatory regions and TF-TF interactions. We illustrated the approach by developing maps for five human cell lines and two other species. We detected ∼144k putative regulatory regions among the human cell lines, the majority of them ∼300 bp long. We found ∼20k putative regulatory elements in the ENCODE heterochromatic domains, suggesting a large regulatory potential in regions presumed to be transcriptionally silent. Among the most significant TF interactions identified in the heterochromatic regions were those between CTCF and the cohesin complex, in agreement with previous reports. Finally, we investigated the enrichment of the obtained putative regulatory regions in the 3D chromatin domains. More than 90% of the regions were found in the 3D contacting domains. We also found a significant enrichment of GWAS SNPs in the putative regulatory regions. These significant enrichments provide evidence that the regulatory regions play a crucial role in genomic structural stability. Additionally, we generated maps of putative regulatory regions for human prostate and colorectal cancer cell lines.
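The abstract describes building putative regulatory regions from TF binding profiles but does not spell out the procedure. As a rough illustration only, the sketch below merges overlapping or abutting TF peaks on the same chromosome into candidate regions and records which TFs fall in each. It is a generic interval-merging example, not the authors' method; the function name `merge_peaks`, the `max_gap` parameter and the example coordinates are all hypothetical.

```python
from collections import defaultdict


def merge_peaks(peaks, max_gap=0):
    """Merge TF peaks into candidate regions (illustrative sketch only).

    peaks: iterable of (chrom, start, end, tf) tuples, half-open coordinates.
    max_gap: hypothetical parameter; peaks at most this many bp apart are merged.
    Returns a list of (chrom, start, end, tuple_of_sorted_tf_names).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, tf in peaks:
        by_chrom[chrom].append((start, end, tf))

    regions = []
    for chrom in sorted(by_chrom):
        current = None  # [start, end, set_of_tfs] for the region being built
        for start, end, tf in sorted(by_chrom[chrom]):
            if current is not None and start <= current[1] + max_gap:
                # Peak overlaps or abuts the open region: extend it.
                current[1] = max(current[1], end)
                current[2].add(tf)
            else:
                if current is not None:
                    regions.append(
                        (chrom, current[0], current[1], tuple(sorted(current[2])))
                    )
                current = [start, end, {tf}]
        if current is not None:
            regions.append(
                (chrom, current[0], current[1], tuple(sorted(current[2])))
            )
    return regions


# Hypothetical peaks, invented for illustration.
example_peaks = [
    ("chr1", 100, 250, "CTCF"),
    ("chr1", 200, 400, "RAD21"),
    ("chr1", 390, 420, "SMC3"),
    ("chr1", 1000, 1100, "USF1"),
    ("chr2", 50, 80, "FOXA1"),
]
example_regions = merge_peaks(example_peaks)
```

With these invented peaks, the three overlapping chr1 peaks collapse into one region carrying CTCF, RAD21 and SMC3, while the isolated peaks stay as single-TF regions. A real pipeline would also need peak-quality filtering and a criterion for which TF combinations count as interactions, neither of which is specified here.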
Year: 2016 PMID: 27625394 PMCID: PMC5100580 DOI: 10.1093/nar/gkw800
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) Annotation of the putative regulatory regions for each of the five cell lines according to their proximity to GENCODEv23 genes (26). (B) Annotation of putative regulatory regions according to the merged annotations from the ChromHMM data set (cf. Materials and Methods). Regions that did not intersect with any of the ChromHMM annotations are marked as "Unannotated". ChromHMM annotations are not available for HeLa-S3. (C) Pair-wise comparison of gene expression differences among ChromHMM heterochromatic, insulator and promoter putative regulatory regions located in physical promoters (Supplementary Note S4). The Y-axis represents the number of physical gene promoters intersecting with heterochromatic, insulator or promoter putative regulatory regions in three different cell lines. The P-values indicate statistically significant differences (Wilcoxon rank-sum test) between gene expression levels in heterochromatin, insulators and promoters according to ChromHMM. "ns" denotes no statistically significant difference between the gene expression levels. (D) Biological validation of a subset of the regulatory regions proposed by tfNet. The X-axis gives the GWAS reference SNP IDs (rs) for the SNPs located within the regulatory regions, together with the ChromHMM annotation. The Y-axis shows the relative luciferase activity for each tested region. P-values are calculated between the control and each corresponding tested region (Mann–Whitney U test). "ns" denotes no statistically significant difference between the tested region and the control.
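The Figure 1 caption relies on the Wilcoxon rank-sum test, which is equivalent to the Mann–Whitney U test. As a minimal sketch of how such a comparison works, the code below computes the U statistic using midranks for ties and a two-sided P-value from the normal approximation (tie-corrected, no continuity correction). It assumes both samples are non-empty. The function name and the example values are hypothetical, and the approximation is rough for samples this small, where statistical software would typically use an exact test.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (u, p), where u is the smaller of the two U statistics.
    Illustrative sketch: tie-corrected variance, no continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the samples, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0  # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 are tied; assign the average 1-based rank.
        avg_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


# Hypothetical relative luciferase activities, invented for illustration.
control = [1.0, 1.1, 0.9]
tested = [2.0, 2.4, 2.2]
u_stat, p_value = mann_whitney_u(control, tested)
```

Here every control value is below every tested value, so U is 0, the strongest separation possible for three observations per group.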
Figure 2. Heatmap networks modelling the significant TF–TF interactions in putative regulatory regions of heterochromatin annotation. The colour intensity in each cell represents the TF–TF interaction significance for each network type in the corresponding cell line. The shown interactions are between (A) CTCF-RAD21-SMC3, (B) NRSF-SIX5-ZNF143, (C) BACH1-MAFF-MAFK, (D) CJUN-FOSL1-JUND, (E) USF1-USF2, (F) ARF2-BATF-NFIC-RUNX3, (G) HNF4A-HNF4G and (H) FOXA1-FOXA2.
Abundance of TFBSs participating in putative regulatory regions of heterochromatic annotation for the four examined cell lines and for each TF network model (co-occurring, overlapping and neighbouring). The percentages and the colour code show the ratio of TFBSs participating in heterochromatin to the total number of input TFBSs.
Figure 3. Participation of the putative regulatory regions (including DNaseI) in interacting domains for (A) GM12878 and (B) K562 (27,57). Region annotations are shown outside the circles. The percentages show the participation of regulatory regions of each annotation. The numbers between the inner and the outer circle represent the number of putative regulatory regions of a specific annotation interacting with other annotated regions. Putative regulatory regions participating in multiple interactions have been counted multiple times, while the numbers in pink give the actual number of putative regulatory regions detected by tfNet. The colour code for the putative regulatory region annotations is the same as in Figure 1B. The thickness of the ribbons shows the number of interacting regions of each annotation. The arcs of the innermost circle denote the edges of the corresponding ribbon. (C) Enrichment of GWAS SNPs in putative regulatory regions of interacting domains. In the first track, the blue-box clusters represent ChIP-seq peaks constituting regulatory regions located in the chromatin interacting domains. The lines show the three-dimensional interactions between the upstream and the two downstream domains (Supplementary Table S8). The GWAS SNPs enriched in the regulatory regions are shown in the second track. The SNPs shown as red bars are harboured by regions within the interacting domains, while those in blue are harboured by nearby regions. The third track shows the enrichment of histone modification and DNaseI signals. The final track shows the ENSEMBL genes close to the looping domains. The arrows show the direction of transcription.