The nuclear functions of long noncoding RNAs come into focus.

Zhenxing Song^1,2, Jiamei Lin^1,2, Zhengguo Li^1,2, Chuan Huang^1,2.

Abstract

Long noncoding RNAs (lncRNAs), defined as untranslated and tightly regulated transcripts longer than 200 nt, are common outputs of the eukaryotic genome. It is becoming increasingly apparent that many lncRNAs serve as important regulators in a variety of biological processes. In particular, some of them accumulate in the nucleus and function in diverse nuclear events, including chromatin remodeling, transcriptional regulation, RNA processing, and DNA damage repair. Here, we summarize recent progress on the functions of nuclear lncRNAs and provide insights into future research directions in this field.

Keywords: Circular RNA; MALAT1; NEAT1; Nucleus; RNA-Binding protein; XIST

Year: 2021 PMID: 33898883 PMCID: PMC8053782 DOI: 10.1016/j.ncrna.2021.03.002

Source DB: PubMed Journal: Noncoding RNA Res ISSN: 2468-0540

Introduction

It was long assumed that only a small proportion of the eukaryotic genome is actively transcribed to generate templates for protein translation. However, RNA deep sequencing of eukaryotic transcriptomes has revealed that transcription is pervasive across eukaryotic genomes and that a diverse group of RNA classes with little or no protein-coding capacity is produced [1,2]. Among these transcripts, long noncoding RNAs (lncRNAs) range from 200 nt to 100 kb in length. Although most lncRNAs (e.g. X-inactive specific transcript, aka XIST) are generated with splicing signals and/or modifications similar to those of protein-coding genes, as described by a comprehensive analysis [3], a number of lncRNAs are processed in unexpected ways. For example, the 3′ ends of MALAT1 (metastasis associated lung adenocarcinoma transcript 1) and NEAT1 (nuclear enriched abundant transcript 1) are processed by the tRNA biogenesis machinery, generating a stable triple helix structure instead of a poly(A) tail [[4], [5], [6]]. Circular RNAs are generated via back-splicing, a reaction in which a splicing donor is joined to an upstream acceptor [[7], [8], [9]]. Interestingly, some lncRNAs are highly conserved at the sequence level. The expression levels of certain lncRNAs are higher than those of most protein-coding genes in a cell- and/or tissue-specific manner. Importantly, the dysregulation of many lncRNAs is associated with various human diseases, such as cancers and brain disorders. For example, H19 is one of the most abundant lncRNAs during fetal development in mammals [10,11]. H19 is robustly induced during embryogenesis to a level comparable to that of the house-keeping gene β-actin, but is downregulated significantly after birth. Nevertheless, it remains highly expressed in adult skeletal muscle and heart [10,11]. Loss of H19 induces precocious muscle differentiation and defective embryogenesis [10,11].
Similar to mRNAs, many lncRNAs play post-transcriptional roles in the cytoplasm, as summarized by Rashid et al. [12]. For example, cytoplasmic lincRNA-p21 inhibits the translation of CTNNB1 and JUNB by sterically blocking their transcripts [13]. In addition, the circular RNAs ND1 and ND5 can promote the mitochondrial entry of nuclear-encoded proteins by interacting with the mitochondrial import complex [14]. It is important to note, however, that many lncRNAs also accumulate and function in the nucleus, and these have attracted broad interest in recent years. Nuclear lncRNAs play critical roles in chromatin remodeling, transcriptional regulation, RNA processing and modification, etc. In this review, we discuss the functions and underlying mechanisms of several well-studied nuclear lncRNAs and provide insights into future research directions in the field.

Nuclear speckle-associated lncRNA MALAT1 (metastasis associated lung adenocarcinoma transcript 1)

The MALAT1 locus is located on human chromosome 11q13 and generates a long noncoding, nuclear-retained transcript of ~8 kb [[15], [16], [17]]. Despite being transcribed by RNA polymerase II (Pol II) [18,19], the mature MALAT1 transcript does not contain a canonical poly(A) tail, because 3′ end processing of MALAT1 differs completely from classical mRNA 3′ terminus formation [4,20,21]. The 3′ end of the MALAT1 precursor lacks canonical cleavage/polyadenylation signals and instead contains a short, highly conserved poly(A)-rich tract [4]. During generation of mature MALAT1, the precursor is first cleaved endonucleolytically by RNase P immediately downstream of the poly(A)-rich tract, simultaneously producing the 3′ end of MALAT1 and the 5′ end of a tRNA-like small RNA (MALAT1-associated small cytoplasmic RNA, aka mascRNA) [4]. The 3′ end of mascRNA is then cleaved by RNase Z, followed by CCA addition (Fig. 1A) [4]. Curiously, MALAT1 accumulates in the nucleus, whereas mascRNA is mainly exported to the cytoplasm (Fig. 1A) [4]. Although MALAT1 lacks a poly(A) tail, it is highly abundant in human and mouse tissues [16,22,23] and is expressed at a level higher than a number of house-keeping genes [24], indicating that the 3′ end of MALAT1 can resist degradation by 3′-5′ exonucleases. Indeed, subsequent studies found that the highly conserved A- and U-rich motifs at the 3′ end of MALAT1 form a triple helix via base-pairing, and that disrupting the triple helix triggers MALAT1 degradation (Fig. 1B) [5,6,25], demonstrating that this structure protects mature MALAT1 from 3′-5′ exonucleolytic decay.

Fig. 1

The biogenesis, structure, and functions of MALAT1. (A) The MALAT1 precursor is first cleaved by RNase P immediately downstream of the poly(A)-rich tract, generating the 3′ end of MALAT1 and the 5′ end of mascRNA. The 3′ end of mascRNA is then cleaved by RNase Z, and CCA is added. MALAT1 accumulates in nuclear speckles, while mascRNA is exported to the cytoplasm. (B) The A- and U-rich motifs at the 3′ end of MALAT1 form a triple helix via base-pairing, protecting the 3′ end of MALAT1 from degradation. Figure is adapted from Ref. [5]. (C) MALAT1 participates in diverse molecular events and is associated with many diseases.

Unlike mascRNA and many other ncRNAs, MALAT1 specifically localizes to nuclear speckles (also called interchromatin granule clusters) [23,26,27], nuclear domains composed of multiple factors involved in transcription activation or repression, pre-mRNA splicing, mRNA modification, and mRNA nuclear export [[28], [29], [30]]. Three nuclear speckle proteins (RNPS1, IBP160, and SRm160) and two distinct motifs on MALAT1 contribute to this specific localization [27]. Such localization suggests that MALAT1 may play important roles in some molecular processes in the nucleus. In fact, MALAT1 is not a structural RNA required for the maintenance of nuclear speckle integrity [31,32], but it functions in chromosome segregation during mitosis [32]. Lack of MALAT1 results in aberrant mitosis and fragmentation of cell nuclei [32]. Intriguingly, Tripathi and colleagues found that MALAT1 interacts with the serine/arginine (SR) splicing factors, controls the distribution and phosphorylation status of SR proteins in nuclear speckles and, in turn, modulates alternative splicing (AS) of a set of SR-regulated pre-mRNAs in HeLa cells (Fig. 1C) [32]. It is thus likely that the cycling of at least some SR proteins between speckle domains and sites of transcription is regulated by MALAT1.
However, several observations contradict this view: loss of MALAT1 had almost no effect on the localization of nuclear speckle components or the phosphorylation status of SR proteins in knockout mice [24,33,34]. Moreover, no obvious connection between MALAT1 and AS regulation was found [24,33]. These inconsistent results might be explained by the possibility that MALAT1 has evolved a function in human cells that is not easily detected in mice, or that the function of MALAT1 is simply cell- or tissue-specific, since MALAT1 expression varies significantly among tissues and cells [16,33].

MALAT1 has also been reported to participate in the modulation of transcription (Fig. 1C). Indeed, a comprehensive analysis of the genomic binding sites of MALAT1 suggested that it localizes to many actively transcribed chromatin sites [35]. Human LTBP3 lies downstream of the MALAT1 locus and is transcribed in the opposite orientation to MALAT1. MALAT1 activates the transcription of LTBP3 by promoting the recruitment of the key transcription factor Sp1 to the LTBP3 promoter in mesenchymal stem cells [36]. A large reduction in the expression of NEAT1, an lncRNA locus upstream of MALAT1, was observed in MEFs obtained from MALAT1 knockout mice [27]. In addition to this cis effect on neighboring genes, MALAT1 controls the localization of the polycomb 2 protein (Pc2), a key histone modifier, which in turn influences the expression of many genes [37]. Yang and colleagues demonstrated that the association between MALAT1 and unmethylated Pc2 promotes E2F1 SUMOylation, resulting in a biochemical switch of the Pc2 chromodomain from interacting with repressive histone marks to preferring active histone marks. Subsequently, a number of growth-control genes are activated [37].

MALAT1 can also act as a microRNA sponge in the nucleus in an AGO2-dependent manner (Fig. 1C) [[38], [39], [40], [41]]. For example, there are two target sites for microRNA-9 on the MALAT1 transcript.
The MALAT1 level was significantly reduced in AGO2 immunoprecipitates purified from cells treated with anti-microRNA-9 as compared to the control sample [38]. In another case, MALAT1 modulates c-Myc expression by sponging microRNA-34a in melanoma cells [40].

The MALAT1 transcript was first discovered in a screen for genes whose expression has a potential relationship to metastasis in early-stage non-small cell lung cancer [16]. Afterwards, MALAT1 was found to be evolutionarily conserved from mouse to human and widely expressed in numerous cancer tissues [[42], [43], [44], [45]], implying that the dysregulation of MALAT1 plays crucial roles in cancers and that MALAT1 could serve as a cancer biomarker as well as a therapeutic target for clinical application (Fig. 1C). For instance, MALAT1 has been identified as an important regulator of lung cancer metastasis, probably owing to its effect on the expression of metastasis-associated genes [46]. Interestingly, mice treated with a MALAT1 antisense oligonucleotide (ASO) resisted metastasis of EBC-1 tumors to the lung, indicating a potential application of ASOs against MALAT1 in preventing lung cancer from spreading [46]. In colorectal cancer (CRC), the binding of MALAT1 to the tumor suppressor PSF (PTB-associated splicing factor) dissociates PTB (polypyrimidine-tract-binding protein) from PSF. The released PTB promotes CRC cell proliferation and tumor growth [47]. Two recent studies indicated that the secondary structure of MALAT1 varies among cancers [48,49]. Dimethyl sulfate sequencing and RNA structure analysis suggest that MALAT1 is more unstructured in chronic myeloid leukemia but undergoes 18 novel structural rearrangements in cervical cancer [49]. This provides new insights into how RNA structure affects the regulatory role of MALAT1 under various conditions.
Besides cancers, the dysregulation of MALAT1 is also involved in many other physiological processes and diseases, including brain development [26,50], vascular growth [51], cardiovascular diseases [52,53], and neurological disorders [[54], [55], [56], [57]], to name a few. Unfortunately, most of the precise mechanisms underlying MALAT1-associated diseases remain unclear, and further studies are needed to clarify them.

Nuclear paraspeckle-associated lncRNA NEAT1 (nuclear enriched abundant transcript 1)

NEAT1 was originally identified as a large and infrequently spliced transcript in a genome-wide screen for nuclear ncRNAs that might modulate gene expression [23]. The NEAT1 locus is adjacent to the MALAT1 locus on human chromosome 11q13 and generates two overlapping isoforms [58,59]. NEAT1_1, aka MENε, is the short isoform with a length of ~3700 nucleotides [60]. The 3′ end of NEAT1_1 undergoes the canonical cleavage/polyadenylation steps and therefore terminates in a poly(A) tail (Fig. 2A). In contrast, the long version of NEAT1 (NEAT1_2, aka MENβ) is over 23 kb and is stabilized by a triple helix structure following RNase P cleavage at its 3′ end (Fig. 2A and B) [5,6,60], similar to MALAT1. Although neither isoform has an open reading frame or the capacity to yield a peptide, both are highly and ubiquitously expressed in a wide range of tissues, such as ovary, prostate, colon, and pancreas [23].

As an architectural lncRNA, NEAT1 is essential for the organization and integrity of nuclear paraspeckles [[61], [62], [63], [64]], irregularly shaped subnuclear bodies close to the NEAT1 transcription site [31]. NEAT1 translocates to nuclear paraspeckles immediately after it is transcribed by RNA Pol II [65]. During daughter nuclei reformation, NEAT1 transcription occurs prior to the accumulation of the paraspeckle-associated proteins PSP1/p54 in paraspeckles [31]. NEAT1 depletion leads to loss of paraspeckles, whereas overexpression of NEAT1 increases paraspeckle number [31,60,65,66], implying that the paraspeckle-associated proteins might be multifunctional when endogenous NEAT1 expression is altered. In fact, NEAT1 has been found to bind specifically to PSP1/p54 [31]. Interestingly, PSP1 forms a heterodimer with p54 in a NEAT1-independent manner [66], but PSP1 without its RNA recognition motifs no longer interacts with NEAT1 or paraspeckles [31].
Furthermore, NEAT1 depletion has no significant effect on the protein level of PSP1 or p54, excluding the possibility that NEAT1 regulates the degradation of some paraspeckle proteins [60]. Therefore, nuclear paraspeckles are built on NEAT1, and paraspeckle dysfunction can be triggered by NEAT1 depletion. For example, the nuclear export of transcripts that contain paraspeckle-retention elements (Alu repeats in humans and SINEs in mice) or that are hyper-edited (A-to-I) is enhanced by NEAT1 depletion [66,67].

Fig. 2

The biogenesis, structure, and functions of NEAT1. (A) NEAT1_1 (the short form of NEAT1) is stabilized by a canonical poly(A) tail following cleavage/polyadenylation. However, NEAT1_2 (the long form of NEAT1) undergoes steps similar to those of MALAT1 biogenesis. (B) The A- and U-rich motifs, present upstream of the RNase P cleavage site, form a triple helix via base-pairing, protecting the 3′ end of NEAT1_2 from degradation. Figure is adapted from Ref. [6]. (C) NEAT1 is involved in diverse molecular events, such as RNA processing and transcriptional regulation. The dysregulation of NEAT1 contributes to many diseases.

NEAT1 not only functions as an architectural component of paraspeckles, but is also important to some other organelles (Fig. 2C). For example, NEAT1 and paraspeckles are crucial for mitochondrial homeostasis [68,69]. A recent study demonstrated that loss of mitochondrial proteins or mitochondrial stress results in abnormal expression of NEAT1 as well as a remarkable increase in elongated nuclear paraspeckles [68]. Moreover, many mRNAs of nuclear-encoded mitochondrial proteins contain paraspeckle-retention elements in their 3′ UTRs and are associated with NEAT1 [68], indicating strong regulatory communication between the nucleus and mitochondria. Abnormal NEAT1 expression leads to mitochondrial dysfunction, such as altered mitochondrial respiration and ATP production rate [68,69].

NEAT1 has strong regulatory effects on many molecular processes (Fig. 2C). Jiang and colleagues found that NEAT1 and its binding partner, the PSP1/p54 heterodimer, can enhance pri-miRNA processing [70]. NEAT1 may provide a binding platform for various pri-miRNAs and for microprocessor components that are attracted by the pri-miRNA-like hairpin structures of NEAT1 [70]. Additionally, NEAT1 can temporally regulate the splicing of PPARγ2 (peroxisome proliferator-activated receptor γ2) pre-mRNA via an SR protein SRp40-mediated pathway during adipogenesis.
In this case, NEAT1 binds to SRp40 and potentiates its phosphorylation by Clk kinase [71]. Wang and colleagues found that NEAT1 can activate viral gene expression by recruiting STAT3 (signal transducer and activator of transcription 3) to viral gene promoters in response to herpes simplex virus-1 (HSV-1) infection [72]. Similarly, their subsequent study showed that NEAT1 recruits the acetyltransferase CBP/P300 complex to the promoters of endocytosis-related genes, thereby altering H3K27 acetylation and H3K27 crotonylation in Alzheimer's disease [73].

Beyond RNA processing and transcriptional regulation, several studies have suggested that NEAT1 has a profound connection with translation activation and protein stability control. For example, during RNA polymerase I inhibition, NEAT1-sequestered p54 and PSF are released from nuclear paraspeckles upon NEAT1 depletion and relocate to c-Myc mRNA to activate its translation [74]. Curiously, NEAT1 depletion has almost no effect on the protein level of c-Myc under normal conditions [74], indicating that NEAT1-mediated c-Myc translation occurs only under stress conditions. In CD4+ T cells, NEAT1 physically binds to STAT3 protein and keeps it from ubiquitination [75]. This was exemplified by a NEAT1 knockdown experiment, which led to a large reduction in ubiquitination-labeled STAT3 protein [75]. Likewise, the half-life of another NEAT1-binding protein, DDX5, is drastically increased upon overexpression of NEAT1 in HT29 cells [76].

Similar to many other ncRNAs, NEAT1 contains potential target sites for various microRNAs [[77], [78], [79], [80], [81], [82]]. Importantly, many of the genes targeted by these microRNAs are oncogenes and contribute to tumorigenesis. Disrupting the cross-talk between NEAT1, microRNAs, and microRNA-target genes may promote the proliferation, migration, and invasion of cancer cells. Thus, NEAT1 could be an effective target for the diagnosis and therapy of many cancer types.
In prostate cancer, NEAT1 modulates the expression of the oncogene HMGA2 by sponging microRNA-98-5p [81]. In gastric cancer, the binding of microRNA-365a-3p to NEAT1 enhances ABCC4 expression [80]. In esophageal squamous cell carcinoma, the microRNA-129/CTBP2 axis is controlled by NEAT1 [82]. Nevertheless, how and why cytoplasmic microRNAs are controlled by a nuclear lncRNA remains elusive.

An increasing number of studies focusing on the physiological functions of NEAT1 have shown that strong upregulation of NEAT1 in several human diseases is associated with a drastic decrease in patient survival (Fig. 2C) [[83], [84], [85], [86], [87]]. In Parkinson's disease (PD), a chronic and progressive brain disorder that affects movement, NEAT1 expression was found to be significantly upregulated in PD mice [83,84]. Interestingly, depletion of NEAT1 could reverse apoptosis induced by PD inducers, possibly by controlling the expression of some PD contributors, such as α-synuclein and PINK1 [83,84]. In non-alcoholic fatty liver disease, loss of NEAT1 inhibits the expression of acetyl-CoA carboxylase and fatty acid synthase through the mTOR/S6K1 signaling pathway, implying that NEAT1 acts as a potent regulator of lipid biosynthesis and insulin signal transduction [88]. With intensive research on NEAT1-related diseases, we believe that therapeutic applications targeting NEAT1 may emerge soon.

Chromatin-associated lncRNA XIST (X-inactive specific transcript)

Less than 3% of the ancestral genes survive on the male Y chromosome, while the X chromosome retains more than 97% [89,90]. Therefore, males and females have unbalanced genetic content in mammals. To minimize the imbalance in sex chromosome-linked gene expression between XX females and XY males, a randomly chosen X chromosome is transcriptionally silenced in early female embryogenesis. This dosage compensation process is called X chromosome inactivation (XCI) and is initiated by the lncRNA XIST [[91], [92], [93], [94], [95], [96]]. It is important to note that XCI is by no means equivalent to loss of the X chromosome, since 3%–20% of the X-linked genes on the inactive X (Xi) chromosome remain active and are indispensable for female embryogenesis. This is exemplified by Turner syndrome, a genetic disorder that occurs in females with only a single X chromosome [97,98].

XIST was first described in the early 1990s as a long noncoding transcript generated exclusively from the Xi chromosome [[99], [100], [101], [102], [103]]. Once generated, XIST spreads across the entire length of the Xi chromosome, resulting in chromosome-wide gene repression. Further sequence analyses revealed that XIST is a 15–17 kb spliced and polyadenylated RNA that contains blocks of local tandem repeat elements designated A-F (Fig. 3) [102,103]. Significant progress has been made in elucidating the functional role of each individual repeat in establishing X chromosome-linked gene silencing. Hence, in this section we focus on these functional elements and their RBPs (RNA-binding proteins).

Fig. 3

Schematic representation of XIST and its RBPs. As a multi-tasking RNA molecule, XIST functions in a variety of molecular processes through different RBPs. The tandem repeats (denoted in boxes) contribute to the recruitment of different RBPs.

Using a series of inducible XIST transgenes, Wutz and colleagues found that XIST mutants lacking the A-repeat, which consists of 7–8 copies of a 24 nt GC-rich core sequence at the 5′ end of XIST, lose silencing activity and show decreased stability without any change in localization [104]. Furthermore, the transcription activity of XIST appears to be affected by deletion of the endogenous A-repeat region in DNA [105]. Besides affecting XIST itself, the role of the A-repeat in chromosome silencing is also linked to its binding proteins. For example, the A-repeat contributes to the interaction with RBM15 (RNA binding motif protein 15), which further targets the methyltransferase complex METTL3/14 to specific RNAs, including XIST, for addition of N6-methyladenosine (m6A) [106,107]. Importantly, loss of the m6A catalytic subunit or the m6A reader leads to a major silencing deficiency, indicating a role of m6A in XIST-mediated chromosome silencing [106]. In contrast, only a minor decrease in X silencing efficiency was observed in either RBM15 knockdown or knockout cells [106,108]. This discrepancy might be explained by the finding that RBM15B, the RBM15 homolog, shares redundant functions with RBM15 [106,109]. SPEN is another interactor of the A-repeat [110]. Depletion of SPEN results in impaired silencing of X chromosome-linked genes, such as Fbxl17, without disrupting XIST localization [107,108,110,111]. In addition, SPEN might initiate transcriptional repression by actively recruiting the transcriptional corepressor complex and the histone deacetylase HDAC3 [[112], [113], [114]]. Unlike RBM15 and SPEN, LBR (Lamin B receptor) has a widespread distribution across XIST [115,116].
The major binding site encompasses the entire A- and F-repeats, as shown by LBR CLIP-seq analysis [115,116]. Loss of the interaction between XIST and LBR leads to impaired chromosome silencing [108,116].

The B- and C-repeats have been reported to be critical for XIST spreading as well as for the recruitment of hnRNP K (heterogeneous nuclear ribonucleoprotein K) and the polycomb silencing complexes to XIST [110,117,118]. Chu and colleagues found that defective chromosome silencing can be induced by depletion of hnRNP K [110]. With further IF-coFISH (combined immunofluorescence and RNA fluorescence in situ hybridization) experiments, they demonstrated that hnRNP K colocalizes with H2AK119ub and H3K27me3, the earliest chromatin modifications occurring in XCI [110]. Depletion of hnRNP K reduces the accumulation of H3K27me3 and H2AK119ub along Xi without influencing the global level of these modifications, suggesting that the effect is specific to Xi. Indeed, hnRNP K depletion specifically dissociates XIST from the polycomb silencing complexes [110]. The C-repeat, together with the E-repeat, is also responsible for the binding of hnRNP U (heterogeneous nuclear ribonucleoprotein U) to XIST and, in turn, for XIST localization to chromatin [108,110,[119], [120], [121]]. Whether correct XIST localization depends on a combination of additional factors is still elusive, since the effect of hnRNP U cannot be confirmed in certain cell types [122,123]. In fact, an interesting study demonstrated that the YY1 protein, which can bind both RNA and DNA, acts as an adaptor that docks the C-repeat of the RNA onto the F-repeat region of the XIST locus [124]. Deleting the D-repeat with CRISPR/Cas9 in HEK293T cells significantly reduced XIST expression, implying that the D-repeat might play an important role in the stability or transcriptional activity of XIST [125].
Consequently, the silencing ability of XIST was abrogated, as shown by the upregulation of some X chromosome-linked genes [125].

3D structural reorganization is another critical pathway by which XIST, together with SmcHD1 (structural maintenance of chromosomes flexible hinge domain containing 1) and the polycomb factors, establishes Xi gene repression [[126], [127], [128], [129], [130], [131], [132], [133], [134]]. In this mechanism, the X-inactivated genes are relocalized from the periphery to the interior of chromatin during XCI, thereby being isolated from RNA Pol II and other general transcription factors [126].

XCI is not only a dosage compensation process that occurs during early female embryogenesis and is related to X-linked diseases in females, but also a critical modifier of many other diseases, including cancers and neurological disorders. For example, XIST could affect cancer development and progression by sponging various microRNAs, thereby influencing the expression of some oncogenes [[135], [136], [137]]. In addition, Yue and colleagues found that the level of XIST is significantly increased in mouse models of Alzheimer's disease (AD), suggesting that XIST might be used as a diagnostic marker for AD [138].

Nuclear circular RNA

Protein-coding genes of the eukaryotic genome are typically split by one or multiple introns. During the production of a mature mRNA, introns must be removed and exons joined together in linear order, a process known as canonical splicing [[139], [140], [141]]. Alternatively, splicing can also occur in such a way that a splicing donor is joined to an upstream acceptor, thereby generating a covalently closed RNA (circular RNA) [7,[142], [143], [144], [145]]. Although the first circular RNA was discovered in 1976 [146], only recently has it been appreciated that these covalently closed molecules represent a common class of transcriptional outputs from numerous protein-coding genes in eukaryotes [147,148]. Many labs have since devoted considerable effort to characterizing circular RNA biology, and remarkable progress has been made, as reviewed elsewhere [[7], [8], [9],143,[149], [150], [151], [152], [153], [154]]. In short, the biogenesis of circular RNAs is initiated by base-pairing between the flanking intronic repeats as well as by some RBPs, including QKI, Mbl, ADAR1, RBM20, FUS, etc [[155], [156], [157], [158], [159]]. The degradation of circular RNAs is controlled by different exonucleases and the platform protein gawky (aka GW182) in various contexts [[160], [161], [162], [163], [164]]. Once generated, most circular RNAs are exported to the cytoplasm in a length-dependent manner with the assistance of the DExH/D-box helicases DDX39A and DDX39B [165,166]; however, a small fraction of circular RNAs accumulates in the nucleus. In this section, we focus on nuclear circular RNAs and their functions.

In searching for lncRNAs involved in transcriptional regulation, Li and colleagues identified a special subclass of Pol II-binding circular RNAs, called EIciRNAs (exon-intron circular RNAs), whose intron(s) between circularized exons are not spliced out (Fig. 4) [167].
Similar to most EciRNAs (exonic circular RNAs), EIciRNAs are generated via back-splicing reactions in which the flanking Alu repeats facilitate their circularization [167]. Functionally, EIciRNAs have cis regulatory effects on the transcription activity of their host genes (Table 1), since depletion of an EIciRNA leads to a significant reduction in the level of the nascent mRNA of its host gene. Further mechanistic analyses indicated that EIciRNAs interact with U1 snRNP directly at the promoters of their host genes. Loss of U1 snRNA, or disruption of the U1 snRNA-EIciRNA interaction by antisense morpholino (AMO), impairs the transcription-enhancing effect of EIciRNAs [167]. Notably, EIciRNAs not only localize to their sites of transcription, but also accumulate in other nuclear subdomains, suggesting other potential regulatory roles and mechanisms of EIciRNAs [167].

Another example of a nuclear intron-containing circular RNA is the ciRNA (intronic circular RNA), a type of lariat RNA that escapes from debranching (Fig. 4) [168]. Like EIciRNAs, ciRNAs (e.g. circANKRD52) are able to interact with the Pol II complex and modulate the transcription efficiency of their host genes in cis (Table 1). Mechanistically, depletion of circANKRD52 has almost no effect on Pol II engagement, suggesting that circANKRD52 affects transcriptional rate but not transcription termination [168]. In another case, ciRNAs derived from the insulin gene (circIns2 in rat and circINS in human) regulate the transcription of genes involved in calcium signaling and insulin exocytosis via the RBP TAR DNA-binding protein 43 kDa (TDP-43) in the nucleus (Table 1) [169]. Low expression of circIns2 or circINS leads to defective insulin secretion by β-cells and is associated with type 2 diabetes [169]. EciRNAs are also capable of regulating transcription in the nucleus, as exemplified by circFECR1. CircFECR1 can activate the expression of its host gene by inducing promoter hypomethylation with the demethylase TET1 (Table 1) [170].
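The difference between canonical splicing and back-splicing described in this section comes down to the relative order of the joined splice sites: a donor joined to a downstream acceptor gives a linear junction, whereas a donor joined to an upstream acceptor gives a circular one. The following is a minimal sketch of that rule, not a method from the cited studies. It assumes that both sites are given as positions along the direction of transcription; the `Junction` type, the function name, and the coordinates are hypothetical and used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A splice junction; positions increase along the direction of transcription (assumption)."""
    donor: int     # 5' splice site, i.e. the end of the exon that donates
    acceptor: int  # 3' splice site, i.e. the start of the exon that accepts


def classify_junction(junction: Junction) -> str:
    """Label a junction as 'canonical' or 'back-splice' from the order of its sites."""
    if junction.acceptor > junction.donor:
        # Donor joined to a downstream acceptor: exons stay in linear order.
        return "canonical"
    if junction.acceptor < junction.donor:
        # Donor joined to an upstream acceptor: the product is covalently closed.
        return "back-splice"
    raise ValueError("donor and acceptor cannot share the same position")


# Hypothetical coordinates, not taken from any real transcript.
linear = Junction(donor=300, acceptor=500)
circular = Junction(donor=800, acceptor=500)
```

Under these assumptions, `classify_junction(linear)` yields `"canonical"` and `classify_junction(circular)` yields `"back-splice"`. The rule does not say whether introns are retained in the circle, so it cannot distinguish EciRNAs from EIciRNAs.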

Fig. 4

The biogenesis of circular RNAs. Most circular RNAs are generated from protein-coding genes via back-splicing, by which a splicing donor is joined to an upstream acceptor. In other cases, circular RNAs are derived from lariat intronic RNAs that escape from debranching.

Table 1

The nuclear functions of circular RNAs

circRNA	Type	Molecular Mechanism	Ref
circEIF3J	EIciRNA	Interacts with RNA Pol II and U1 snRNP to regulate transcription of its host gene	[167]
circPAIP2	EIciRNA	Interacts with RNA Pol II and U1 snRNP to regulate transcription of its host gene	[167]
circANKRD52	ciRNA	Interacts with RNA Pol II to regulate transcription of its host gene	[168]
circIns2/INS	ciRNA	Interacts with TDP-43 to regulate transcription	[169]
circFECR1	EciRNA	Induces promoter hypomethylation with the demethylase TET1	[170]
circ2082	EciRNA	Promotes DICER nuclear translocation via RBM3	[171]
cia-cGAS	EciRNA	Inhibits cGAS as a decoy	[172]
circSEP	EciRNA	Regulates alternative splicing of its host gene via R-loop	[173]
circCRM1	Intergenic circRNA	Maintains normal centromeric chromatin organization via R-loop	[174]
circSMARCA5	EciRNA	Downregulates its host gene expression via R-loop	[177]

Besides transcription, some nuclear circular RNAs can affect the function of their RBPs. For example, a recent study demonstrated that circ2082 has an important role in the nucleocytoplasmic distribution of the endoribonuclease DICER, a key component of the pre-microRNA processing complex (Table 1) [171]. In this case, circ2082 is recruited to DICER by RNA binding motif protein 3 (RBM3) in the nucleus of glioblastoma cells. The interaction is crucial for nuclear retention of DICER, since depletion of circ2082 restores the normal cytoplasmic localization of DICER [171]. The change in the nucleocytoplasmic distribution of DICER further reestablishes the assortment of cellular microRNAs (the microRNAome) [171]. Because dysregulation of the microRNAome is associated with cancers, circ2082 could be used as a therapeutic target for cancer treatment. The circular RNA cia-cGAS acts as a protein decoy for the DNA sensor cGAS and inactivates its synthase activity under homeostatic conditions, which in turn inhibits cGAS-mediated autoimmune responses (Table 1) [172].

Interestingly, certain nuclear circular RNAs are able to form an RNA:DNA hybrid (R-loop) with genomic DNA. In Arabidopsis, nuclear circSEP, derived from exon 6 of the SEPALLATA3 gene, was reported to affect the splicing of its host gene through an R-loop mechanism (Table 1). Formation of the R-loop leads to an increased level of a linear SEP transcript isoform that lacks exon 6, in turn driving floral homeotic phenotypes [173]. Importantly, circSEP:DNA R-loops are more abundant than their linear RNA:DNA counterparts, indicating a specific role of circSEP in R-loop formation [173].
In Zea mays, circular RNA:DNA R-loops are integral components of centromeric chromatin [174]. Circular RNAs derived from centromeric retrotransposons were found to bind to the centromere through R-loops (Table 1). Disrupting the R-loops results in abnormal localization of the centromeric H3 variant (CENH3), in turn affecting centromeric chromatin organization [174]. SMARCA5 is a key component of the DNA damage repair pathway and is required for chromatin remodeling [175,176]. In breast cancer cells, circSMARCA5 can downregulate the expression of the SMARCA5 gene via R-loop formation (Table 1). Therefore, circSMARCA5 has a negative effect on DNA repair capacity [177].
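The definition of back-splicing given above (a splicing donor joined to an upstream acceptor) can be made concrete with a small sketch. The `Junction` class, its field names, and the 1-based coordinate convention are illustrative assumptions and do not come from any tool cited in this review. The sketch only encodes the ordering rule, taking strand into account; it does not align reads or detect junctions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A splice junction on one chromosome (hypothetical 1-based coordinates)."""
    donor: int     # genomic position of the 5' splice site (splicing donor)
    acceptor: int  # genomic position of the 3' splice site (splicing acceptor)
    strand: str    # "+" or "-"


def is_back_splice(j: Junction) -> bool:
    """Return True when the donor is joined to an acceptor that lies upstream
    of it in transcript orientation, i.e. the junction is a back-splice."""
    if j.strand == "+":
        # Transcript runs toward higher coordinates: upstream means smaller.
        return j.acceptor < j.donor
    if j.strand == "-":
        # Transcript runs toward lower coordinates: upstream means larger.
        return j.acceptor > j.donor
    raise ValueError(f"unknown strand: {j.strand!r}")
```

For a plus-strand gene, a donor at position 3000 joined to an acceptor at position 2000 is classified as a back-splice, whereas a donor at 1000 joined to an acceptor at 2000 is an ordinary linear junction.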

Concluding remarks

The eukaryotic nucleus is a membrane-bound organelle that contains most of the cell's genetic information and controls gene expression. Many molecular processes, such as transcription, splicing, replication, histone modification, and DNA repair, occur in the nucleus and are regulated combinatorially by various protein factors. Only in the last 20 years have lncRNAs come to be considered important regulators of these processes [19,[178], [179], [180]]. Although some lncRNAs (e.g. XIST) have general effects on nuclear events, the majority seem to act on specific genes, implying that the total number of functional nuclear lncRNAs (annotated or unannotated) may be underestimated. Thus, defining novel nuclear roles of lncRNAs will help us understand the complexity of nuclear regulation. Having established an association between a lncRNA and a nuclear event, one can infer a possible mechanism from the phenotype. Informed by the studies discussed in the sections above, we summarize potential mechanisms for the functions of newly discovered nuclear lncRNAs. First, lncRNAs might indirectly affect certain events by controlling the expression or subcellular localization of key protein factors. With the development of high-throughput RNA-seq techniques and single-molecule tracking systems, this possibility can be readily examined. Second, lncRNAs might act as a molecular platform that brings diverse proteins together into an RNP complex, or as a decoy that sponges RNAs or even proteins and prevents them from associating with their targets. In these two cases, deletion analyses can be performed: if the hypothesis is right, deletion of a protein-binding motif in the lncRNA or of an RNA-binding motif in the protein should lead to loss of function. In addition, a variety of high-throughput sequencing methods have been developed to identify the interactomes of nuclear lncRNAs in vivo (Table 2).
These tools could help us map RNA-DNA [[181], [182], [183], [184], [185], [186], [187], [188], [189]], RNA-protein [[190], [191], [192], [193]], and RNA-RNA interactions [[194], [195], [196], [197], [198]]. Third, lncRNAs might act as organizers of nuclear architecture. Knockdown or knockout of such lncRNAs results in the breakdown of nuclear subdomains, thereby affecting specific reactions. Fourth, the act of transcription through a lncRNA locus, rather than the lncRNA product itself, may be regulatory, as discussed by Long et al. and Bassett et al. [178,199]. To distinguish between these possibilities, the lncRNA region in the genomic DNA can be replaced by a random sequence without changing its promoter. In parallel, the lncRNA transcript can be knocked down by siRNA, ASO, or the CRISPR-Cas13a system, or functionally blocked by AMO.

Table 2

The methods used to study RNA interactomes.

Method	Description	Experimental Purpose	Enhanced Version	Ref
CHIRP-seq	A method to pull down and map the RNA-associated chromatin using biotinylated antisense oligos	Identification of RNA-DNA interaction	CHART-seq, CHIRT-seq, hiCHIRP-seq	[[181], [182], [183], [184], [185]]
R-CHIP-seq	A method to pull down and map R-loops using catalytically dead RNase H	Identification of R-loop		[186]
DRIP-seq	A method to pull down and map R-loops using S9.6 antibody	Identification of R-loop	bisDRIP-seq, DRIPc-seq	[[187], [188], [189]]
CLIP-seq	A method to pull down and map the protein-binding RNA by immunoprecipitation	Identification of RNA-protein interaction	PAR-CLIP-seq, iCLIP-seq, hiCLIP-seq	[[190], [191], [192], [193]]
SHAPE-seq	A method to predict RNA secondary structures using chemical modifiers	Prediction of RNA-RNA interaction	icSHAPE-seq, DMS-seq, SHAPE-seq2.0	[[194], [195], [196], [197]]
Frag-seq	A method to predict RNA secondary structures using endonuclease P1	Prediction of RNA-RNA interaction		[198]

In conclusion, the last 20 years have revealed that lncRNAs are not simply rare transcriptional noise but are pervasive in eukaryotic systems. Although numerous surprises have been uncovered, they represent only the tip of the iceberg of lncRNA biology. More effort is required to understand how these special transcripts are controlled and how they function, especially in the nucleus. The era of lncRNA biology is just beginning.

Declaration of competing interest

No potential conflict of interest was reported by the authors.
