Xiangying Sun, Zhezhen Wang, Johnathon M Hall, Carlos Perez-Cervantes, Alexander J Ruthenburg, Ivan P Moskowitz, Michael Gribskov, Xinan H Yang.
Abstract
Long noncoding RNAs (lncRNAs) localize in the cell nucleus and influence gene expression through a variety of molecular mechanisms. Chromatin-enriched RNAs (cheRNAs) are a unique class of lncRNAs that are tightly bound to chromatin and putatively function to locally cis-activate gene transcription. CheRNAs can be identified by biochemical fractionation of nuclear RNA followed by RNA sequencing, but until now, a rigorous analytic pipeline for nuclear RNA-seq has been lacking. In this study, we survey four computational strategies for nuclear RNA-seq data analysis and develop a new pipeline, Tuxedo-ch, which outperforms other approaches. Tuxedo-ch assembles a more complete transcriptome and identifies cheRNA with higher accuracy than other approaches. We used Tuxedo-ch to analyze benchmark datasets of K562 cells and further characterize the genomic features of intergenic cheRNA (icheRNA) and their similarity to enhancer RNAs (eRNAs). We quantify the transcriptional correlation of icheRNA and adjacent genes and show that icheRNA is more positively associated with neighboring gene expression than eRNA or cap analysis of gene expression (CAGE) signals. We also explore two novel genomic associations of cheRNA, which indicate that cheRNAs may function to promote or repress gene expression in a context-dependent manner. IcheRNA loci with significant levels of H3K9me3 modifications are associated with active enhancers, consistent with the hypothesis that enhancers are derived from ancient mobile elements. In contrast, antisense cheRNA (as-cheRNA) may play a role in local gene repression, possibly through local RNA:DNA:DNA triple-helix formation.
Year: 2020 PMID: 32040509 PMCID: PMC7034927 DOI: 10.1371/journal.pcbi.1007119
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.779
Fig 4. Known genomic features of the icheRNA in the K562 nucleus.
a, Distribution of three classes of expressed RNA (CPM>1) in fractionated libraries. The three classes were defined by their genomic locations relative to GENCODE (v25)-annotated coding genes. Chromatin-independent RNAs are RNAs that are not differentially expressed between CPE and SNE samples. b, Coding potential of icheRNA (red), ChromHMM-predicted intergenic eRNAs (yellow), FANTOM5-predicted eRNAs (green, of which the majority overlap with coding genes), isneRNA (blue) and mRNAs (purple). An intergenic RNA that overlaps (by at least 1 base) any ChromHMM- or FANTOM5-identified enhancer region is treated as a ChromHMM- or FANTOM5-predicted eRNA, respectively. Coding potential was assessed with the online tool CPC2. c, Percentage of GENCODE (v25)-annotated and unannotated RNAs in icheRNA and isneRNA. d, Pairwise correlation of expression between RNAs in four classes and their local genes. To pair an intergenic genomic feature with its neighboring gene, the adjacent upstream or downstream gene with the highest-magnitude PCC is selected. The relative density at a given PCC value is calculated by dividing the kernel density estimate of the indicated RNA and neighboring-gene pairs by that of its own random control. For example, the random control for the 3,298 icheRNA-neighboring gene pairs is obtained by pairing these icheRNA with randomly selected coding genes (dashed line). Two vertical dashed lines mark the significance cutoffs of PCC values at -0.8 and 0.8. e, Normalized expression values of fractionated RNA classes. Values are given in FPKM in the Poly(A)+ nuclear RNA-Seq library (x-axis, GSE88339) versus the nuclear total-RNA-Seq library (y-axis, GSE87982). f, Average ChIP-seq read density versus input at promoters (±1kb centered at TSS) of RNAs. P-values were calculated by two-sided Wilcoxon rank sum test: NS p>0.05, * p<0.01, ** p<1e-10, **** p<2.2e-16. (Note that in each panel, boxes without overlaps are significantly different; the **** label is omitted for simplicity.)
3k mRNAs were randomly selected from the 9.8k transcribed mRNAs, and 3k silent RNAs were randomly selected from the 66.9k annotated but K562-untranscribed RNAs.
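The pairing and relative-density calculation described for panel d can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian kernel, and the fixed bandwidth of 0.1 are all hypothetical choices, since the caption does not specify how the kernel density estimates were computed.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant profile as uncorrelated
    return cov / (sx * sy)


def best_neighbor_pcc(feature_expr, upstream_expr, downstream_expr):
    """Keep the PCC of whichever adjacent gene has the larger |PCC|."""
    candidates = [
        pearson(feature_expr, upstream_expr),
        pearson(feature_expr, downstream_expr),
    ]
    return max(candidates, key=abs)


def random_control_pccs(feature_exprs, gene_exprs, rng):
    """Pair each feature with a randomly chosen coding gene (random control)."""
    return [pearson(f, rng.choice(gene_exprs)) for f in feature_exprs]


def gaussian_kde(samples, bandwidth):
    """Return a Gaussian kernel density estimate as a callable."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def relative_density(observed_pccs, control_pccs, grid, bandwidth=0.1):
    """Observed density divided by random-control density at each grid point."""
    obs = gaussian_kde(observed_pccs, bandwidth)
    ctl = gaussian_kde(control_pccs, bandwidth)
    ratios = []
    for x in grid:
        c = ctl(x)
        ratios.append(obs(x) / c if c > 0 else float("nan"))
    return ratios


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy expression profiles across four samples (made-up numbers).
    features = [[1, 2, 3, 4], [4, 1, 3, 2]]
    genes = [[1, 3, 2, 4], [8, 6, 4, 2], [2, 2, 5, 1]]
    observed = [best_neighbor_pcc(f, genes[0], genes[1]) for f in features]
    control = random_control_pccs(features, genes, rng)
    grid = [i / 10 for i in range(-10, 11)]
    print(relative_density(observed, control, grid))
```

A relative density above 1 at a given PCC value means that the observed RNA and neighboring-gene pairs are enriched at that correlation compared with random pairing. In practice a library KDE with data-driven bandwidth selection would likely replace the hand-rolled estimator used here.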