Ivan Antonov, Yulia A Medvedeva.
Abstract
Many long noncoding RNAs are bound to chromatin, and some of these interactions are mediated by triple helices. It is usually assumed that a transcript can form triplexes with a distinct set of genomic loci, also known as triplex target sites (TTSs). Here we performed computational analyses of the TTSs that have been experimentally identified for particular RNAs. To assess the ability of these TTSs to bind other transcripts, we developed a method to estimate the statistical significance of the predicted number of triplexes for a given RNA-DNA pair. We demonstrated that each DNA set included a subset of sequences with the potential to form a statistically significant (adjusted p-value < 0.01) number of triplexes with the majority (>90%) of the analyzed transcripts. Because these DNA sequences are predicted to interact with a wide range of different RNAs, we called them "universal TTSs". While universal TTSs were quite rare in the human genome (around 0.5%), they were more frequent (>15%) among the MEG3 binding sites (ChOP-seq peaks) and especially among the shared Capture-seq peaks (40%). The universal TTSs were enriched in purine-rich low-complexity regions. The role of chromatin-bound RNAs in the formation of 3D chromatin structure is currently under active discussion. We speculate that such universal TTSs may contribute to establishing long-distance chromosomal contacts and may facilitate distal enhancer-promoter interactions. All the scripts and data files related to this study are available at: https://github.com/vanya-antonov/universal_tts
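The selection rule stated in the abstract (adjusted p-value < 0.01 for more than 90% of the analyzed transcripts) can be illustrated with a minimal sketch. This is not the authors' code from the repository. The function name `find_universal_tts`, the nested-dictionary input layout, and the toy identifiers are assumptions made for illustration only, and the adjusted p-values are taken as already computed by some upstream step.

```python
from typing import Dict, List

ALPHA = 0.01         # adjusted p-value cutoff quoted in the abstract
MIN_FRACTION = 0.90  # "majority (>90%)" of the analyzed transcripts


def find_universal_tts(adj_pvalues: Dict[str, Dict[str, float]],
                       alpha: float = ALPHA,
                       min_fraction: float = MIN_FRACTION) -> List[str]:
    """Return IDs of DNA sequences that are significant for more than
    `min_fraction` of the RNAs they were tested against.

    `adj_pvalues[dna_id][rna_id]` is a hypothetical layout holding the
    adjusted p-value of the predicted triplex count for one RNA-DNA pair.
    """
    universal = []
    for dna_id, per_rna in adj_pvalues.items():
        if not per_rna:
            continue  # no RNAs tested for this sequence
        n_significant = sum(1 for p in per_rna.values() if p < alpha)
        if n_significant / len(per_rna) > min_fraction:
            universal.append(dna_id)
    return universal


# Toy data: 20 hypothetical RNAs and three hypothetical DNA sequences.
rnas = [f"rna_{i}" for i in range(20)]
toy = {
    "peak_a": {r: (1.0 if i < 1 else 1e-4) for i, r in enumerate(rnas)},  # 19/20 significant
    "peak_b": {r: (1.0 if i < 2 else 1e-4) for i, r in enumerate(rnas)},  # 18/20 significant
    "peak_c": {r: 1.0 for r in rnas},                                     # none significant
}
print(find_universal_tts(toy))
```

In this toy input only `peak_a` passes, because the threshold is strict: 18 of 20 transcripts is exactly 90% and does not count as "more than 90%".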
Keywords: MEG3 lncRNA; triple helix; triplex target sites (TTS)
Year: 2018 PMID: 31131080 PMCID: PMC6518440 DOI: 10.12688/f1000research.13522.2
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. (A, B) The number of DNA sequences from (A) the ChOP-seq set or (B) the control genomic set with a statistically significant number of predicted triplexes for different query RNAs (the black dots). (C, D, E) Heat maps of the -log10(adjusted p-value) corresponding to the predicted triplexes between the 307 different query RNAs (columns) and (C) all the ChOP-seq peaks, (D) the control genomic sites or (E) the shared Capture-seq peaks (rows). The universal TTSs were identified based on their interactions with the 153 expressed transcripts (left part of each heat map) and visualized as a separate (top) cluster. The MEG3 column was intentionally drawn wider. The blue color corresponds to the RNA-DNA pairs with adjusted p-value = 1 (including cases where no triplexes were predicted). (F) Repeat classes present in different sets of genomic regions.
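The layout of the heat maps in panels C to E can be sketched as follows. This is only an illustration under stated assumptions: the function names `neg_log10` and `heatmap_matrix`, the input layout, and the `floor` parameter are hypothetical, and the sketch simply places the universal TTS rows first rather than performing any clustering. Pairs with no entry are treated as adjusted p-value = 1, matching the caption's note about cases where no triplexes were predicted.

```python
import math
from typing import Dict, List, Sequence, Tuple


def neg_log10(adj_p: float, floor: float = 1e-300) -> float:
    """-log10 of an adjusted p-value; p = 1 maps to 0.

    `floor` (an assumption) guards against log10(0).
    """
    value = -math.log10(max(adj_p, floor))
    return value if value != 0 else 0.0  # avoid returning -0.0


def heatmap_matrix(adj_pvalues: Dict[str, Dict[str, float]],
                   expressed_rnas: Sequence[str],
                   other_rnas: Sequence[str],
                   universal_ids: Sequence[str]
                   ) -> Tuple[List[str], List[str], List[List[float]]]:
    """Build rows, columns and -log10 values for a heat map.

    Columns: expressed transcripts first (left part), then the rest.
    Rows: universal TTSs first (top), remaining sequences after them.
    """
    columns = list(expressed_rnas) + list(other_rnas)
    universal = set(universal_ids)
    # Stable sort: False (universal) sorts before True (non-universal).
    rows = sorted(adj_pvalues, key=lambda dna_id: dna_id not in universal)
    matrix = [
        [neg_log10(adj_pvalues[dna_id].get(rna_id, 1.0)) for rna_id in columns]
        for dna_id in rows
    ]
    return rows, columns, matrix


# Toy example with hypothetical identifiers.
adj = {"s1": {"r1": 1.0, "r2": 0.01}, "s2": {"r1": 0.001}}
rows, columns, matrix = heatmap_matrix(adj, ["r1"], ["r2"], ["s2"])
print(rows, columns, matrix)
```

The resulting matrix can be passed to any plotting library; a zero cell then corresponds to the blue (adjusted p-value = 1) color described in the caption.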