Functional annotation of lncRNA in high-throughput screening.

Chi Wai Yip1, Divya M Sivaraman1,2, Anika V Prabhu1, Jay W Shin1.   

Abstract

Recent efforts on the characterization of long non-coding RNAs (lncRNAs) revealed their functional roles in modulating diverse cellular processes. These include pluripotency maintenance, lineage commitment, carcinogenesis, and pathogenesis of various diseases. By interacting with DNA, RNA and protein, lncRNAs mediate multifaceted mechanisms to regulate transcription, RNA processing, RNA interference and translation. Of more than 173000 discovered lncRNAs, the majority remain functionally unknown. The cell type-specific expression and localization of the lncRNA also suggest potential distinct functions of lncRNAs across different cell types. This highlights the niche of identifying functional lncRNAs in different biological processes and diseases through high-throughput (HTP) screening. This review summarizes the current work performed and perspectives on HTP screening of functional lncRNAs where different technologies, platforms, cellular responses and the downstream analyses are discussed. We hope to provide a better picture in applying different technologies to facilitate functional annotation of lncRNA efficiently.
© 2021 The Author(s).

Keywords:  expression modulation; functional annotation; high-throughput screening; long non-coding RNA (lncRNA)

Year:  2021        PMID: 33835127      PMCID: PMC8564734          DOI: 10.1042/EBC20200061

Source DB:  PubMed          Journal:  Essays Biochem        ISSN: 0071-1365            Impact factor:   8.000


Introduction

The characterization of the mammalian transcriptome by the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects has revealed a large collection of long non-coding RNAs (lncRNAs), which are defined as transcripts longer than 200 nucleotides and account for the majority of the transcriptome [1,2]. A study summarizing multiple transcript collections has further revealed approx. 20 000 human lncRNAs with functional insight [3]. Indeed, the single-nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWASs) that are associated with diseases or traits are mainly located in non-coding regions [4], suggesting that both DNA regulatory elements and lncRNAs play functional roles in different diseases. Except for the canonical lncRNAs that were characterized early, such as XIST, NEAT1 and MALAT1 [5,6], the vast majority of lncRNAs were characterized only in the last decade. These lncRNAs have been shown to regulate pluripotency maintenance [7], cellular reprogramming [8], lineage commitment [9], carcinogenesis [10], and pathogenesis of various diseases [11]. While the pervasive transcription of lncRNAs implies they should acquire function over evolutionary time [12], the functional role of the majority of lncRNAs remains to be characterized. Notably, lncRNAs are expressed, localized, and functional in a cell type-specific manner [13-15], suggesting that an lncRNA may be functional in one cell type or a particular cellular phenotype but not in others. The diverse roles of lncRNAs in organismal development, physiological processes, and disease pathology have been revealed by the advent of high-throughput (HTP) screenings [15-25]. Various technologies have allowed loss-/gain-of-function HTP screening to become popular and affordable. These include short hairpin RNA (shRNA) pooled libraries, CRISPR-based single guide RNA (sgRNA) pooled libraries, antisense oligonucleotides (ASOs), and next-generation sequencing (NGS). While these HTP screening platforms have been frequently applied to mRNA functional screening, they have not yet become common for lncRNAs. Unlike protein-coding genes, the novelty of lncRNAs makes it challenging to hypothesize the connection between the lncRNA hits and the tested phenotypes. Thus, this review summarizes the use of HTP platforms for lncRNA screening and the downstream analyses that help to narrow the gap between lncRNA hits and the phenotype.

Selection and prioritization of lncRNA targets

Currently, the number of known human lncRNA transcripts is over 173 000 according to NONCODE [26] and over 268 000 according to LncBook [27]. However, the number expressed in a particular cell type could be far lower, since lncRNAs are highly cell type-specific [14]. Therefore, expression level is one of the key factors for selecting and prioritizing lncRNAs in a screen to improve the success rate. Indeed, loss-of-function screening on lncRNAs with a broad range of expression levels showed positive correlations between expression level and the cell growth phenotype [15,16]. Furthermore, acquiring differential expression data by microarray or NGS-based methods between the desired phenotype and control can improve target selection. For example, lncRNAs associated with cancer subtypes and clinical prognosis were identified from microarray data of four cancer types [28], and the list was adopted in a CRISPR-based screening study [17]. Similarly, Xu and colleagues [18] selected 25 lncRNAs to screen for drug resistance by choosing the lncRNAs that were up-regulated after drug treatments in RNA-seq analysis. Although it is possible to further prioritize candidates computationally to predict the most relevant lncRNAs [29-32], loss-/gain-of-function HTP screening is necessary to identify lncRNAs with a specific and relevant cellular phenotype.
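
The selection logic described above can be illustrated with a minimal sketch: keep lncRNAs that are expressed in the cell type of interest, then rank them by differential expression. The class name, the field names, the TPM unit and both thresholds are hypothetical choices made for this example and do not come from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class LncRNA:
    name: str
    expression_tpm: float    # expression in the screened cell type (assumed unit: TPM)
    log2_fold_change: float  # e.g. drug-treated versus untreated (assumed input)


def prioritize(candidates, min_tpm=1.0, min_log2fc=1.0, top_n=25):
    """Keep expressed, up-regulated lncRNAs and rank them by fold change.

    The thresholds are illustrative assumptions, not recommended values.
    """
    kept = [c for c in candidates
            if c.expression_tpm >= min_tpm and c.log2_fold_change >= min_log2fc]
    kept.sort(key=lambda c: c.log2_fold_change, reverse=True)
    return kept[:top_n]


candidates = [
    LncRNA("lnc_A", 12.0, 2.5),
    LncRNA("lnc_B", 0.2, 3.0),  # dropped: barely expressed in this cell type
    LncRNA("lnc_C", 8.0, 0.1),  # dropped: not up-regulated
    LncRNA("lnc_D", 5.0, 1.4),
]
selected = [c.name for c in prioritize(candidates)]
```

In practice the thresholds would be tuned to the expression distribution of the screened cell type.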

Modulation of lncRNA expression

Many loss-/gain-of-function methods are available for modulating the expression of lncRNAs [33]. However, only several of them are scalable for use in HTP screening. We will only cover ASO, RNA-interference (RNAi) and CRISPR/Cas13 systems, which target the RNA directly, as well as CRISPR/Cas9 (CRISPRn) and use of the catalytically dead Cas9 (dCas9) (CRISPRi/a) which target genomic DNA (Figure 1A). For a clearer goal of the review, we will start by comparing compatible modulation methods directly.
Figure 1

Functional screening for lncRNAs

(A) Methods for modulating lncRNA expression levels, by cleavage of the RNA directly, modulating the transcription, and knockout of the gene. (B) Expression modulation can be scaled to achieve HTP screening by oligo arrays and pooled libraries and (C) the resulting phenotypes for each of them.

ASO versus RNAi

Both ASO and RNAi (including shRNA exogenous expression and siRNA transfection) directly target the RNA molecules for degradation. The action of RNAi depends on the RNA-induced silencing complex (RISC) to cleave the RNA target, while ASOs form a DNA–RNA duplex with the RNA target for RNase H recognition and cleavage. Typical ASOs for degrading RNA utilize LNA gapmer technology [34] and often include a 2′MOE modification for stability and nuclease resistance. Both methodologies have been used in HTP screening for functional lncRNAs and shown to knock down lncRNAs efficiently [16,22-24]. In comparing ASO and RNAi, the key difference is the localization of the endogenous enzymes used to target the RNA molecule: RNase H is enriched in the nucleus, while RISC is enriched in the cytoplasm. Consequently, nuclear lncRNAs were more efficiently suppressed by ASOs, whereas cytoplasmic lncRNAs responded better to RNAi [35]. However, since most lncRNAs are enriched in the nucleus where chromatin is regulated by lncRNA [36,37], targeting the nuclear lncRNAs is often more beneficial. Additionally, ASOs were shown to be equally effective in targeting introns and exons of lncRNAs [16], indicating that ASOs could knock down cytoplasmic RNAs during their transcription in the nucleus. Indeed, lncRNA localization negatively affects RNAi more than ASO [35]. Therefore, ASO is recommended for modulating the expression of lncRNA in array-based screens. In contrast, shRNA pooled libraries allow the RNAi modality to be scaled up. Besides expression modulation, RNase H-inactive ASOs have been used to block splicing of nascent RNA [38,39]. The same method has been applied to block the splicing of an lncRNA, resulting in chromatin retention and malfunction of the lncRNA [40]. This highlights the added benefits of ASO in studying lncRNA biology.

CRISPRn versus CRISPRi/a

CRISPR/Cas9-mediated genome editing has been commonly used in mRNA functional screening and has also been applied to lncRNAs. Zhu and colleagues [17] adopted paired-guide RNAs to mediate deletions of various lengths from 700 lncRNA genes and identified 51 lncRNAs that affected cell growth. In the CRISPRi system, dCas9 is fused to the transcriptional repressor domains KRAB or methyl-CpG binding protein 2 (MeCP2) to achieve targeted suppression of gene expression [41-43]. Besides those for mRNA, several large-scale CRISPRi screenings have been performed to elucidate lncRNA functionality [15,21,25]. More recently, dCas9 fused with the transcriptional activator VP64 and other synergistic activators has been described [44]. Unlike conventional plasmid-based overexpression, CRISPRa activates lncRNA expression from the endogenous genomic locus, which has the advantage of capturing cis-acting and nuclear lncRNA functions. Thus, CRISPRa has been utilized for the functional screening of lncRNAs [19,20]. When comparing CRISPRn knockout and CRISPRi knockdown in pooled library studies targeting mRNAs, CRISPRn is more effective than CRISPRi [45,46]. However, there are several limitations when applying CRISPRn to modulate lncRNA [47]. Firstly, unlike mRNA, partial deletion of lncRNA genes may not ablate their functions, and lncRNA genes are often too long for complete deletion [48]. While other options such as deletion of the promoter could be considered, the varied efficiency of suppression for different loci is challenging in large-scale screening. Secondly, CRISPRn could affect other proximal functional elements and their topological interactions, thus confounding the mechanistic activity of lncRNAs. CRISPRn cleavage also hinders the detection of cis-regulation by lncRNAs. Finally, as many lncRNAs overlap with other genes, deletion by CRISPRn is less applicable. Additionally, dsDNA damage mediated by CRISPRn is known to trigger non-specific false positives [49], which does not occur with CRISPRi [50]. Therefore, CRISPRi/a is more applicable than CRISPRn for lncRNA expression modulation.

CRISPRi versus shRNA versus CRISPR/Cas13

The performance of pooled libraries for RNAi, CRISPRi, and CRISPRn was compared using 46 essential and 47 non-essential mRNAs in a negative selection screen [46]. This study found that (1) CRISPRn screening performs the best in both sgRNA- and gene-based analyses, (2) shRNA screening results reflected off-target effects of individual shRNAs, which can be reduced by using multiple shRNAs, and (3) CRISPRi screening shows virtually no off-target effects, but only 50% of the sgRNAs are effective, leading to a lower hit rate. As discussed earlier, CRISPRn is not the most suitable screening method for lncRNAs. For both the shRNA and CRISPRi methods, their shortcomings can be compensated by including more shRNA or sgRNA constructs. Additionally, sgRNA design can be improved by positioning transcription start sites (TSSs) accurately using FANTOM CAGE annotation [51]. Similarly, tools to improve the selection of lncRNA targets according to expression level and TSS annotation for sgRNA design are available [52]. Therefore, both CRISPRi and shRNA pooled screening are applicable for lncRNA functional screening, with the choice of targeting the DNA or RNA, respectively. However, a major advantage of CRISPR-based screening is the ongoing development of relevant technologies and supportive analytical algorithms. Recently, an RNA-guided RNA-targeting CRISPR effector, Cas13, was characterized and shown to function in both the nucleus and cytoplasm [53-55]. Notably, CRISPR/Cas13 is reported to have a high specificity, allowing the possibility of using closely related mismatch controls in knockdown studies [54]. However, Cas13 exhibits collateral activity after target recognition and cleaves any RNA in close proximity regardless of complementarity [54,56], which may hinder its utility for RNA knockdown. While Cas13 has already been used in a functional lncRNA screen [18], a comprehensive comparison with other knockdown methods is still needed.
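
To illustrate why including more constructs compensates for limited per-construct efficacy, the following sketch computes the probability that at least k of n constructs are effective. It assumes that constructs act independently and share the same success rate (50%, the CRISPRi figure quoted above); both are simplifying assumptions for this example, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(at least k of n constructs are effective) under a binomial model.

    Assumes independent constructs with an identical success probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Chance of obtaining at least two effective sgRNAs per target at p = 0.5
p_two_of_five = prob_at_least(2, 5, 0.5)  # 26/32
p_two_of_ten = prob_at_least(2, 10, 0.5)  # 1013/1024
```

Under these assumptions, doubling the number of constructs from five to ten raises the chance of having two effective constructs from about 81% to about 99%.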

Targeting the lncRNA transcripts versus lncRNA loci

In summary, targeting the transcripts of lncRNA, and not their DNA loci, is advantageous for (1) not interfering with the function of the lncRNA promoter, which may act as an enhancer for other genes, (2) targeting only individual isoforms since different isoforms could have opposing effects [57], and (3) allowing additional lncRNAs to be considered for screening, since the DNA-targeting approach must avoid affecting other genes on the same loci of intragenic, divergent, and antisense lncRNA genes. On the contrary, targeting the DNA loci could also investigate enhancer-like activities, broadening the coverage of functional non-coding regulatory elements in the genome. Furthermore, both ASO and RNAi methods exhibit independent off-target effects [16,58] whereas CRISPRi/a modulation of the DNA is less prone to off-target effects because of the narrow targeting window.

Scaling to HTP screening

HTP screening has shown promise in identifying individual lncRNAs which regulate a cellular phenotype. Moreover, as lncRNAs constitute the majority of the transcriptome and harbor many GWAS SNPs, using HTP screening to identify the proportion of lncRNAs that participate in cellular mechanisms, together with their genomic characteristics, has become a key tool to understand the genome and its role in disease. Additionally, HTP screening can identify associations between drug compounds and their gene targets, which include both protein-coding genes [59-62] and lncRNAs [18-20,63]. For example, Bester and colleagues [19] utilized a CRISPRa system to identify genes contributing to cytarabine resistance in an acute myeloid leukemia cell line. They revealed a group of lncRNAs driving cytarabine resistance via cis-regulation. Such studies further highlight the interconnected role lncRNAs play in various pathological mechanisms. Screening in an array format has the advantage of separating the individual perturbations. This allows measurement by qPCR to rule out unsuccessful perturbations [16,22], where the degree of knockdown significantly affects the hit rate [16]. This also provides the flexibility to directly record cellular phenotype, such as through high-content imaging [24]. Oligonucleotide arrays are available for chemically synthesized siRNA, ASO, and sgRNA (Figure 1B). Previously, the relatively high cost of LNA gapmer ASO limited its use in HTP screening. However, due to the lifting of the LNA patent protection, the cost is now comparable with siRNA and sgRNA (∼$200 USD per oligo). Alternatively, both shRNA and sgRNA can be generated by cloning into viral vectors in an array format; however, this task is laborious to scale up. When the throughput of an arrayed screen is limited by cost and experimental intensiveness, pooled library screening provides another option where the throughput can be nearly unlimited (∼$20 000 USD per library with 30 000 constructs).
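
As a rough worked comparison of the two price points quoted above, the sketch below computes the oligo cost of an arrayed screen and the per-construct cost of a pooled library. It covers reagent cost only; labor, consumables and sequencing are ignored, and the screen size of 300 targets with five oligos each is a hypothetical example.

```python
ARRAY_COST_PER_OLIGO_USD = 200      # approximate figure quoted in the text
POOLED_LIBRARY_COST_USD = 20_000    # approximate figure quoted in the text
POOLED_LIBRARY_CONSTRUCTS = 30_000


def arrayed_oligo_cost(n_targets, oligos_per_target):
    """Oligo cost of an arrayed screen (reagents only, an assumed simplification)."""
    return n_targets * oligos_per_target * ARRAY_COST_PER_OLIGO_USD


pooled_cost_per_construct = POOLED_LIBRARY_COST_USD / POOLED_LIBRARY_CONSTRUCTS
arrayed_300_targets = arrayed_oligo_cost(300, 5)  # hypothetical screen size
```

With these numbers, a pooled construct costs well under one dollar, whereas the oligos for a 300-target arrayed screen cost about $300 000.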

Screening by oligonucleotide array

Design of the array platform

When designing arrayed screens, at least two constructs with independent sequences showing effective knockdown and the same cellular phenotype are necessary to call a hit. In our previous study, which used 2021 ASOs to target 285 lncRNAs, 43.5% of ASOs were effective and 68.1% of lncRNA targets had at least two effective ASOs [16]. Our results showed that more highly expressed lncRNAs are more susceptible to ASO knockdown, suggesting that inclusion of such targets could improve the chances of obtaining an effective ASO. For RNAi, Guttman and colleagues [22] designed five shRNAs per target for 214 intergenic lncRNAs, of which 65% had at least one effective knockdown. Designing a functional ASO sequence without off-targeting for lncRNA is challenging, since lncRNA genes harbor many repetitive elements [64] and ASOs have the potential to target the intronic sequences of nascent transcripts. For RNAi, off-target effects were partly due to the dependence on a relatively short complementary seed sequence in the 3′ end of RNAi [65]. Therefore, in order to reach enough effective oligos and deal with the off-target effects, the starting number of oligos should be sufficiently high (e.g., five or more). The phenotypic response of transfecting oligos to cells is transient. For a longer phenotypic assay, such as differentiation, a lentivirus-mediated shRNA array can be considered, as done by Guttman and colleagues [22] to identify lncRNAs that are important in mouse ES cells.
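
The hit-calling rule stated at the start of this section (at least two independent constructs with effective knockdown and the same phenotype) can be sketched as follows. The input layout, the function name and the 50% knockdown cut-off are assumptions made for illustration; the cited studies may define an effective knockdown differently.

```python
from collections import defaultdict


def call_hits(results, min_knockdown=0.5, min_constructs=2):
    """Call a target a hit when enough effective constructs agree on the phenotype.

    results: iterable of (target, construct_id, knockdown_fraction, direction),
    where direction is +1, -1, or 0 (no phenotype). The knockdown threshold
    is an illustrative assumption.
    """
    by_target = defaultdict(lambda: defaultdict(set))
    for target, construct, knockdown, direction in results:
        if knockdown >= min_knockdown and direction != 0:
            by_target[target][direction].add(construct)
    return sorted(
        target for target, directions in by_target.items()
        if any(len(constructs) >= min_constructs for constructs in directions.values())
    )


results = [
    ("lnc_A", "ASO_1", 0.8, -1), ("lnc_A", "ASO_2", 0.7, -1), ("lnc_A", "ASO_3", 0.2, -1),
    ("lnc_B", "ASO_1", 0.9, -1), ("lnc_B", "ASO_2", 0.9, +1),  # discordant phenotypes
    ("lnc_C", "ASO_1", 0.6, -1), ("lnc_C", "ASO_2", 0.3, -1),  # only one effective ASO
]
hits = call_hits(results)
```

Here only lnc_A qualifies, because it is the only target with two effective constructs that agree on the direction of the phenotype.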

Phenotypes tested in array platform

When adopting an array method of screening, the cellular phenotypes of interest are abundant as compared with pooled screening. Numerous quantitative cellular assays have been utilized, such as those measuring growth, differentiation [66], infection [67], and endocytosis [68]. Among them, the inclusion of imaging is a major advantage for array screening. For instance, 50 lncRNAs were shown to affect cell morphology in human dermal fibroblasts by real-time imaging after ASO knockdown [16]. By applying high-content imaging to RNAi array screening, Stojic and colleagues [24] identified six lncRNAs for regulating mitotic progression, chromosome segregation, and cytokinesis.

Screening by pooled library

Design of the pooled library platform

When designing a pooled library screen, several factors determine the scale of the study. These include the total number of lncRNA targets, the number of sgRNA/shRNA constructs per lncRNA, the number of additional non-target scramble sequences, and the size of the coverage. A higher number of constructs could compensate for the off-target effects of shRNA and increase the number of effective sgRNAs. Typically, the number of constructs is in a range of 5–20 shRNA/sgRNA per target (Table 1). From an analytical point of view, the number of constructs should be at least 4, while a higher number allows for statistical analysis that can incorporate technical and biological variability to improve power [69]. The number of non-target controls reflects the variation of the screen, which is complicated by the randomness of construct distribution and cell-to-cell variation. Therefore, it is necessary to include a large pool of non-target controls, which usually ranges from ∼100 to ∼1000 [19,21,25]. Assuming a library of 10 000 constructs, with 10 constructs per lncRNA and 1000 non-target controls, the throughput of the library will be 900 lncRNAs. The infection coverage represents the number of cells taking up the same construct, where each cell is made to contain only one construct by restricting the multiplicity of infection (MOI) (usually ≤0.3). As the genomic integration event is random, a sufficient infection coverage can normalize the variability. The common infection coverage for each construct is approx. 300–500× [15,19,25] while sequencing coverage is approx. 1000×. Additional independent experimental replications of the same library are also necessary. Therefore, for a single replicate, the number of infected cells needed for 300× coverage is 3 million, and the starting number of cells is 10 million at an MOI of 0.3.
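
The sizing arithmetic in the paragraph above can be reproduced with a short sketch. The function names are hypothetical. The first function follows the simplification used in the text, where the infected fraction is taken to equal the MOI; the second applies a Poisson infection model, which is an added assumption and gives a somewhat larger starting cell number.

```python
from math import exp


def library_plan(total_constructs, constructs_per_target, non_target_controls,
                 coverage, moi):
    """Return (targets, infected cells, starting cells) for one replicate.

    Uses the approximation from the text: fraction of infected cells ~ MOI.
    """
    targets = (total_constructs - non_target_controls) // constructs_per_target
    infected_cells = total_constructs * coverage
    starting_cells = infected_cells / moi
    return targets, infected_cells, starting_cells


def starting_cells_poisson(infected_cells, moi):
    """Starting cells if infection follows a Poisson model (assumption):
    the fraction of cells with at least one integration is 1 - exp(-MOI)."""
    return infected_cells / (1 - exp(-moi))


targets, infected, starting = library_plan(10_000, 10, 1_000, 300, 0.3)
starting_poisson = starting_cells_poisson(infected, 0.3)
```

With the values from the text this gives 900 lncRNA targets, 3 million infected cells and 10 million starting cells; the Poisson variant suggests roughly 11.6 million starting cells for the same coverage.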
Table 1

Screenings applied to lncRNAs

Technologies | Cell types | LncRNAs | Constructs | Phenotype (% hit) | References

Array screening
siRNA | HeLa | 2231 | 4 (pooled) | Mitotic progression (0.1%), Chromosome segregation (0.1%), Cytokinesis (0.1%) | [24]
ASO | Human dermal fibroblast | 285 (194), 119 | 5–15 (2–10), ≥2 | Proliferation (7.7%), CAGE sequencing (10.9%) | [16]
shRNA | Mouse ES | 214 (147) | 5 (1–2) | Microarray (93%) | [22]

Pooled library screening
CRISPRi | Epidermal keratinocyte | 2263 | 5 | Proliferation (0.4%) | [25]
Cas13 | K562 | 25 | 10 | Proliferation with three anti-cancer drug treatments (64%) | [18]
CRISPRi | Human glioblastoma | 5689 | 10 | Proliferation with fractionated radiation (8.2%) | [21]
CRISPRa | MOLM14 AML | 14701 | ≥4 | Proliferation with Cytarabine treatment (19.5%) | [19]
CRISPRa | Human melanoma A375 | 10504 | ∼10 | Proliferation with Vemurafenib treatment (0.2%) | [20]
CRISPRi | iPSC, MCF7, U87, K562, MDA-MB-231, HeLa, HEK293T | 5543, 5725, 5689, 16401, 5725, 6158, 5785 | 10 | Proliferation (5.9%), Proliferation (1%), Proliferation (1.1%), Proliferation (0.4%), Proliferation (0.5%), Proliferation (0.4%), Proliferation (0.3%) | [15]
CRISPR | Huh7.5OC | 671 | ∼20 | Proliferation (7.6%) | [17]
shRNA | Mouse ES | 1280 | ≥3 | OCT4 expression (1.6%) | [23]

Unique molecular identifiers

Since cell-to-cell variability has posed challenges in interpreting phenotypes, strategies such as incorporating unique molecular identifiers (UMIs) in sgRNA libraries have been established [70]. The UMIs have allowed for the screening of clonally expanded and individually tagged cells, resulting in an increased sensitivity and robustness compared with conventional analyses. The statistical methods, including using the UMIs as internal replicates and in lineage dropout analyses, increase both the precision and the accuracy of the screen, as well as reducing the infection coverage needed to reach the same statistical power [71].

Phenotypes tested in pooled library platform

The cellular phenotypes assessed after genetic perturbation are diverse (Figure 1C), including survival advantage for robust cell growth [15,17,25], after drug treatments [18-20] or with fractionated radiation [21]. By combining with cell sorting, a wide variety of phenotypes can be measured, such as pluripotency maintenance [15,23,72], differentiation [73], protein transport [74], oxidative stress [75], and many more. Almost all the pooled library screenings for lncRNAs thus far have relied on survival advantage as the phenotype of interest. For example, Liu and colleagues [21] identified 434 and 33 lncRNAs that respectively support and reduce cell growth in human glioblastoma cells in the presence of clinically relevant doses of fractionated radiation. Additionally, Cai and colleagues [25] identified 9 lncRNAs that support robust cell proliferation of epidermal keratinocyte cells. The lower hit rate of this study may be due to a lower number of sgRNAs designed per target and thus lower statistical power. Another reason is that including treatments that place the cells under stress, as in the study by Liu and colleagues [21], can provoke the expression or function of lncRNAs, which constitute a significant fraction of the genes differentially expressed in response to cell stress [76]. As summarized for all lncRNA screenings in Table 1, the hit rate is generally higher if drug treatment is included. When combined with cell sorting, cell loss from the staining and washing steps of fluorescence-activated cell sorting (FACS) is a factor to consider, although a CRISPRi screen combined with FACS has been reported [15]. A larger population of transduced cells is needed to compensate for this cell loss and reach the final coverage. Formaldehyde fixation can reduce the degree of cell loss, while de-crosslinking is required to rescue the genomic DNA for PCR library construction [15]. Indeed, many screens rely on expression of exogenous genes carrying fluorescent signals [73,74] or fluorescent probe live trackers [75]. For example, Liu and colleagues [73] performed a CRISPRa screen in mouse ES cells, where the cell surface marker hCD8 was inserted downstream of Tubb3. Differentiated neurons were then separated by magnetic-activated cell sorting (MACS), which, combined with the use of cell surface markers, can minimize cell loss.

Analytical efforts

Pooled library screening requires rigorous bioinformatics analyses to interpret the results, with detection of false positives remaining a critical issue. Nevertheless, advanced analysis strategies for CRISPR applications are currently available [77], with several distinct algorithms established for evaluating the results of pooled library screening. Briefly, redundant siRNA activity [78] and HiTSelect [79] are designed for RNAi screening. Redundant siRNA activity ranks the targeted genes by log fold change and generates P-values from the ranking against a uniform distribution, while HiTSelect ranks target genes by considering both the effect on the phenotype and the number of active constructs using a random-effects model. The MAGeCK robust ranking algorithm [80] is commonly used in CRISPR-based screens. It uses the raw sgRNA read counts and adopts a negative binomial model to generate sgRNA P-values, which are combined to gene level by a modified robust ranking algorithm. CRISPhieRmix [81] uses the log fold change value of each sgRNA generated from standard count software such as DESeq2 [82], and provides empirical FDR for the target genes using a hierarchical mixture distribution. BAGEL [83] uses data from prior screens to build null distribution and positive effects to rank target genes. Besides, some algorithms such as MAGeCK maximum likelihood estimation [84] and JACKS [85] are designed to compare and pool multiple screens. CERES [86] is specifically designed to correct the side effects mediated by DNA damage from CRISPR cuts for cancer cells which exhibit large copy number variation. Bodapati and colleagues [69] compared these algorithms with CRISPR pooled screening data and suggested using MAGeCK robust ranking algorithm in most cases for its robustness and performance, while CRISPhieRmix was touted as the only algorithm taking the various sgRNA efficiencies into account for CRISPRi/a screenings. 
If UMIs are included, there is an advantage to incorporating additional statistical methods, such as internal replicate analysis and lineage dropout analysis [71]. Because the hits identified by pooled library screening are statistical outcomes from a population of perturbations with considerable false positives [87], validation by reproducing the phenotypic result with individual perturbation is sometimes necessary.
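
To make the general shape of such analyses concrete, the following toy sketch converts raw sgRNA read counts into log fold changes and summarizes them per gene by the median. It is deliberately simplified and is not an implementation of MAGeCK, CRISPhieRmix or any other tool named above; scaling by the non-target control reads, the pseudocount and the median summary are all assumptions chosen for illustration.

```python
from collections import defaultdict
from math import log2
from statistics import median


def log2_fold_changes(before, after, controls, pseudocount=1.0):
    """Per-sgRNA log2 fold change between two time points.

    before/after: dict mapping sgRNA -> raw read count.
    controls: non-target sgRNAs used to scale 'after' to 'before'
    (an assumed normalization, not the one used by any specific tool).
    """
    scale = sum(before[c] for c in controls) / sum(after[c] for c in controls)
    return {
        sg: log2((after.get(sg, 0) * scale + pseudocount) / (before[sg] + pseudocount))
        for sg in before
    }


def gene_scores(lfc, sgrna_to_gene):
    """Summarize sgRNA-level values to gene level by the median."""
    per_gene = defaultdict(list)
    for sg, value in lfc.items():
        per_gene[sgrna_to_gene[sg]].append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


before = {"A_1": 100, "A_2": 100, "A_3": 100, "NT_1": 100, "NT_2": 100, "NT_3": 100}
after = {"A_1": 20, "A_2": 30, "A_3": 25, "NT_1": 110, "NT_2": 95, "NT_3": 105}
sgrna_to_gene = {sg: ("non_target" if sg.startswith("NT") else "lnc_A") for sg in before}

lfc = log2_fold_changes(before, after, controls=["NT_1", "NT_2", "NT_3"])
scores = gene_scores(lfc, sgrna_to_gene)
```

In this toy data set the three constructs against lnc_A are depleted while the non-target controls stay near zero. The dedicated tools add the statistical modelling (count dispersion, ranking statistics, FDR control) that this sketch omits.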

Annotate the roles of the lncRNA hits

Transcriptome profiling

Except for several of the lncRNAs with structural implications [88] or those regulating translation [89], the majority of lncRNAs regulate other genes at the transcriptional [36,37] or RNA [90] levels. This signature allows us to unveil the functions of most lncRNAs by studying the transcriptomic changes. Therefore, follow-up studies restricted to lncRNA hits using Perturb-seq or CROP-seq [91-93] will be the most compatible with pooled library screening. Perturb-seq and CROP-seq are sequencing platforms designed to combine single cell RNA-seq and CRISPR-based genetic screens. To facilitate the detection of the non-polyadenylated gRNA in single cell transcriptome, Perturb-seq lentiviral vector harbors a gRNA-matched barcode upstream of the poly-A tail of the puromycin gene [91], while CROP-seq introduces an additional gRNA copy upstream of the poly-A of the puromycin gene [93]. Besides, a single cell RNA-seq platform has been applied to the shRNA screen by using a pol II-dependent shRNA backbone [94]. Single-cell transcriptomes containing the sgRNA/shRNA identity can unveil the mechanism mediated by specific lncRNAs, while a single-cell readout is advantageous if the cell population is heterogeneous (e.g., study design with differentiation). More recently, targeted Perturb-seq [95] was developed, allowing profiling of a subset of the transcriptome (e.g., genes near the loci of lncRNA hits). Alternatively, molecular phenotyping with bulk RNA-seq or CAGE-seq after lncRNA perturbation can reveal genes or pathways modulated by the lncRNA [16]. Such transcriptomic profiling will identify the global transcriptomic changes, which can be captured by differential expression profile followed by Gene Set Enrichment Analysis (GSEA) [96] and Gene Ontology (GO) [97] analyses. Ramilowski and colleagues [16] characterized ZNF213-AS1 by GSEA in controlling migration in dermal fibroblast and validated this in a wound closure migration assay. 
Besides global transcriptomic changes, transcriptomic profiling combined with computational analyses will also identify direct effector genes of the lncRNA.

Prediction of effector genes with known lncRNA mechanisms

Predicting direct effector genes is often necessary to connect lncRNA hits to the tested phenotype with cellular mechanisms. This will also reflect the functional mechanisms of the lncRNAs so that validation experiments can be designed. Since cis-regulation is one of the major mechanisms mediated by lncRNA [36,37], defining proximal genes either by 2D distance or 3D chromatin structure from Hi-C data will yield functional interactions between the lncRNA locus and the effector genes (Figure 2A). From certain disease phenotypes, effector genes can be identified by connecting lncRNAs to GWAS [98] and expression quantitative trait loci (eQTL) [99-101]. Additionally, co-expression analysis among the tissue-wide or cell type-wide data from various consortia [1,102] between the lncRNAs and effector genes should improve confidence in these networks.
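
Defining proximal genes by linear (2D) distance, as mentioned above, can be sketched as a simple window search around the lncRNA TSS. The function name, the input layout and the 500 kb window are hypothetical choices for this example; a 3D definition based on Hi-C contacts would need contact data instead of a fixed window.

```python
def proximal_genes(lncrna_tss, genes, window=500_000):
    """Return genes whose TSS lies within `window` bp of the lncRNA TSS.

    lncrna_tss: (chromosome, position); genes: list of (name, chromosome, tss).
    The window size is an illustrative assumption. Results are sorted by distance.
    """
    chrom, pos = lncrna_tss
    nearby = [
        (abs(tss - pos), name)
        for name, gene_chrom, tss in genes
        if gene_chrom == chrom and abs(tss - pos) <= window
    ]
    return [name for _, name in sorted(nearby)]


genes = [
    ("GENE_A", "chr1", 1_200_000),
    ("GENE_B", "chr1", 5_000_000),  # same chromosome, outside the window
    ("GENE_C", "chr2", 1_050_000),  # different chromosome
    ("GENE_D", "chr1", 900_000),
]
nearby = proximal_genes(("chr1", 1_000_000), genes)
```

The resulting candidate list can then be intersected with genes that change expression after perturbation of the lncRNA.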
Figure 2

Mechanisms of lncRNA in gene regulation

(A) Regulation of proximal genes. (B) Regulation of genes in trans. (C) Mediation of protein-DNA interaction to regulate gene expression. (D) Sequestering miRNA from mRNA by acting as miRNA sponge.

Moreover, lncRNAs are known to function by interacting with protein, DNA and RNA [103,104], while their subcellular localization can suggest their functional mechanisms [105]. Fractionation data are useful for inferring the mechanisms of lncRNAs, but data from a matched cell type are necessary because the subcellular localization of lncRNAs is cell type-specific [13]. For trans-regulation, experimental genome-wide RNA–DNA interaction analyses, such as GRID-seq [106] and RADICL-seq [107], and in silico RNA–DNA interaction predictions, such as triplex formation prediction [108], can be used as references (Figure 2B). Both cis and trans RNA–DNA interactions are likely to involve proteins as the executors, and linking a protein (chromatin modifier or transcription factor) with an RNA–DNA interaction will help characterize the downstream effects. Many DNA-binding proteins were found to be capable of binding RNA [109,110]. RNA-binding proteins (RBPs) are a class of proteins that cooperate with lncRNAs in post-transcriptional processes, such as splicing, cleavage and polyadenylation. ENCODE Phase III has generated RNA–RBP interaction data for 356 RBPs in K562 and HepG2 [109], while ChIP-seq data are also available for 58 of these RBPs [111]. Combining these two datasets could reveal RNA–DNA interactions with RBP content (Figure 2C). Last but not least, sequestering miRNA is one of the known mechanisms of lncRNA, and the differential expression of the mRNA competitor can be revealed from transcriptome profiling. A number of databases supporting the prediction of miRNA sponge interaction have been described [112] (Figure 2D).

Conclusion

Array and pooled library screenings have been established in many studies to identify functional protein-coding genes in different cellular phenotypes, but the work on lncRNAs is lagging, and the majority of lncRNAs have yet to be characterized. Advances in HTP screening platforms present an opportunity to explore the functionality of lncRNAs. Following the genome-wide screening of lncRNAs, it will also be imperative to investigate the molecular mechanisms of individual lncRNAs to determine their roles across different cell types, including in disease, and to identify functional conservation and redundancy. Direct RNA-targeting perturbation methods present the advantages of (1) distinguishing between isoforms, (2) avoiding interference with the enhancer function of lncRNA promoters, and (3) targeting lncRNAs even when their loci overlap with other genes. DNA-targeting perturbation methods provide more consistent results by having fewer off-target effects and allow functional screening of cis-regulatory elements. Array screening can be easily combined with different phenotypic readouts and allows the quantification of perturbation efficiency. Pooled library screening benefits from high throughput, but the results are preliminary and often require secondary screening and validation.
  112 in total

1.  An atlas of human long non-coding RNAs with accurate 5' ends.

Authors:  Chung-Chau Hon; Jordan A Ramilowski; Jayson Harshbarger; Nicolas Bertin; Owen J L Rackham; Julian Gough; Elena Denisenko; Sebastian Schmeier; Thomas M Poulsen; Jessica Severin; Marina Lizio; Hideya Kawaji; Takeya Kasukawa; Masayoshi Itoh; A Maxwell Burroughs; Shohei Noma; Sarah Djebali; Tanvir Alam; Yulia A Medvedeva; Alison C Testa; Leonard Lipovich; Chi-Wai Yip; Imad Abugessaisa; Mickaël Mendez; Akira Hasegawa; Dave Tang; Timo Lassmann; Peter Heutink; Magda Babina; Christine A Wells; Soichi Kojima; Yukio Nakamura; Harukazu Suzuki; Carsten O Daub; Michiel J L de Hoon; Erik Arner; Yoshihide Hayashizaki; Piero Carninci; Alistair R R Forrest
Journal:  Nature       Date:  2017-03-01       Impact factor: 49.962

2.  A promoter-level mammalian expression atlas.

Authors:  Alistair R R Forrest; Hideya Kawaji; Michael Rehli; J Kenneth Baillie; Michiel J L de Hoon; Vanja Haberle; Timo Lassmann; Ivan V Kulakovskiy; Marina Lizio; Masayoshi Itoh; Robin Andersson; Christopher J Mungall; Terrence F Meehan; Sebastian Schmeier; Nicolas Bertin; Mette Jørgensen; Emmanuel Dimont; Erik Arner; Christian Schmidl; Ulf Schaefer; Yulia A Medvedeva; Charles Plessy; Morana Vitezic; Jessica Severin; Colin A Semple; Yuri Ishizu; Robert S Young; Margherita Francescatto; Intikhab Alam; Davide Albanese; Gabriel M Altschuler; Takahiro Arakawa; John A C Archer; Peter Arner; Magda Babina; Sarah Rennie; Piotr J Balwierz; Anthony G Beckhouse; Swati Pradhan-Bhatt; Judith A Blake; Antje Blumenthal; Beatrice Bodega; Alessandro Bonetti; James Briggs; Frank Brombacher; A Maxwell Burroughs; Andrea Califano; Carlo V Cannistraci; Daniel Carbajo; Yun Chen; Marco Chierici; Yari Ciani; Hans C Clevers; Emiliano Dalla; Carrie A Davis; Michael Detmar; Alexander D Diehl; Taeko Dohi; Finn Drabløs; Albert S B Edge; Matthias Edinger; Karl Ekwall; Mitsuhiro Endoh; Hideki Enomoto; Michela Fagiolini; Lynsey Fairbairn; Hai Fang; Mary C Farach-Carson; Geoffrey J Faulkner; Alexander V Favorov; Malcolm E Fisher; Martin C Frith; Rie Fujita; Shiro Fukuda; Cesare Furlanello; Masaaki Furino; Jun-ichi Furusawa; Teunis B Geijtenbeek; Andrew P Gibson; Thomas Gingeras; Daniel Goldowitz; Julian Gough; Sven Guhl; Reto Guler; Stefano Gustincich; Thomas J Ha; Masahide Hamaguchi; Mitsuko Hara; Matthias Harbers; Jayson Harshbarger; Akira Hasegawa; Yuki Hasegawa; Takehiro Hashimoto; Meenhard Herlyn; Kelly J Hitchens; Shannan J Ho Sui; Oliver M Hofmann; Ilka Hoof; Furni Hori; Lukasz Huminiecki; Kei Iida; Tomokatsu Ikawa; Boris R Jankovic; Hui Jia; Anagha Joshi; Giuseppe Jurman; Bogumil Kaczkowski; Chieko Kai; Kaoru Kaida; Ai Kaiho; Kazuhiro Kajiyama; Mutsumi Kanamori-Katayama; Artem S Kasianov; Takeya Kasukawa; Shintaro Katayama; Sachi Kato; Shuji Kawaguchi; Hiroshi Kawamoto; Yuki I Kawamura; 
Tsugumi Kawashima; Judith S Kempfle; Tony J Kenna; Juha Kere; Levon M Khachigian; Toshio Kitamura; S Peter Klinken; Alan J Knox; Miki Kojima; Soichi Kojima; Naoto Kondo; Haruhiko Koseki; Shigeo Koyasu; Sarah Krampitz; Atsutaka Kubosaki; Andrew T Kwon; Jeroen F J Laros; Weonju Lee; Andreas Lennartsson; Kang Li; Berit Lilje; Leonard Lipovich; Alan Mackay-Sim; Ri-ichiroh Manabe; Jessica C Mar; Benoit Marchand; Anthony Mathelier; Niklas Mejhert; Alison Meynert; Yosuke Mizuno; David A de Lima Morais; Hiromasa Morikawa; Mitsuru Morimoto; Kazuyo Moro; Efthymios Motakis; Hozumi Motohashi; Christine L Mummery; Mitsuyoshi Murata; Sayaka Nagao-Sato; Yutaka Nakachi; Fumio Nakahara; Toshiyuki Nakamura; Yukio Nakamura; Kenichi Nakazato; Erik van Nimwegen; Noriko Ninomiya; Hiromi Nishiyori; Shohei Noma; Shohei Noma; Tadasuke Noazaki; Soichi Ogishima; Naganari Ohkura; Hiroko Ohimiya; Hiroshi Ohno; Mitsuhiro Ohshima; Mariko Okada-Hatakeyama; Yasushi Okazaki; Valerio Orlando; Dmitry A Ovchinnikov; Arnab Pain; Robert Passier; Margaret Patrikakis; Helena Persson; Silvano Piazza; James G D Prendergast; Owen J L Rackham; Jordan A Ramilowski; Mamoon Rashid; Timothy Ravasi; Patrizia Rizzu; Marco Roncador; Sugata Roy; Morten B Rye; Eri Saijyo; Antti Sajantila; Akiko Saka; Shimon Sakaguchi; Mizuho Sakai; Hiroki Sato; Suzana Savvi; Alka Saxena; Claudio Schneider; Erik A Schultes; Gundula G Schulze-Tanzil; Anita Schwegmann; Thierry Sengstag; Guojun Sheng; Hisashi Shimoji; Yishai Shimoni; Jay W Shin; Christophe Simon; Daisuke Sugiyama; Takaai Sugiyama; Masanori Suzuki; Naoko Suzuki; Rolf K Swoboda; Peter A C 't Hoen; Michihira Tagami; Naoko Takahashi; Jun Takai; Hiroshi Tanaka; Hideki Tatsukawa; Zuotian Tatum; Mark Thompson; Hiroo Toyodo; Tetsuro Toyoda; Elvind Valen; Marc van de Wetering; Linda M van den Berg; Roberto Verado; Dipti Vijayan; Ilya E Vorontsov; Wyeth W Wasserman; Shoko Watanabe; Christine A Wells; Louise N Winteringham; Ernst Wolvetang; Emily J Wood; Yoko Yamaguchi; Masayuki 
Yamamoto; Misako Yoneda; Yohei Yonekura; Shigehiro Yoshida; Susan E Zabierowski; Peter G Zhang; Xiaobei Zhao; Silvia Zucchelli; Kim M Summers; Harukazu Suzuki; Carsten O Daub; Jun Kawai; Peter Heutink; Winston Hide; Tom C Freeman; Boris Lenhard; Vladimir B Bajic; Martin S Taylor; Vsevolod J Makeev; Albin Sandelin; David A Hume; Piero Carninci; Yoshihide Hayashizaki
Journal:  Nature       Date:  2014-03-27       Impact factor: 49.962

3.  A pseudogene long-noncoding-RNA network regulates PTEN transcription and translation in human cells.

Authors:  Per Johnsson; Amanda Ackley; Linda Vidarsdottir; Weng-Onn Lui; Martin Corcoran; Dan Grandér; Kevin V Morris
Journal:  Nat Struct Mol Biol       Date:  2013-02-24       Impact factor: 15.369

4.  Triplexator: detecting nucleic acid triple helices in genomic and transcriptomic data.

Authors:  Fabian A Buske; Denis C Bauer; John S Mattick; Timothy L Bailey
Journal:  Genome Res       Date:  2012-05-01       Impact factor: 9.043

Review 5.  Making sense of GWAS: using epigenomics and genome engineering to understand the functional relevance of SNPs in non-coding regions of the human genome.

Authors:  Yu Gyoung Tak; Peggy J Farnham
Journal:  Epigenetics Chromatin       Date:  2015-12-30       Impact factor: 4.954

6.  New factors for protein transport identified by a genome-wide CRISPRi screen in mammalian cells.

Authors:  Kevin Moreau; Julien Villeneuve; Laia Bassaganyas; Stephanie J Popa; Max Horlbeck; Claudia Puri; Sarah E Stewart; Felix Campelo; Anupama Ashok; Cristian M Butnaru; Nathalie Brouwers; Kartoosh Heydari; Jean Ripoche; Jonathan Weissman; David C Rubinsztein; Randy Schekman; Vivek Malhotra
Journal:  J Cell Biol       Date:  2019-09-05       Impact factor: 10.539

7.  Transposable elements are major contributors to the origin, diversification, and regulation of vertebrate long noncoding RNAs.

Authors:  Aurélie Kapusta; Zev Kronenberg; Vincent J Lynch; Xiaoyu Zhuo; LeeAnn Ramsay; Guillaume Bourque; Mark Yandell; Cédric Feschotte
Journal:  PLoS Genet       Date:  2013-04-25       Impact factor: 5.917

8.  Widespread RNA binding by chromatin-associated proteins.

Authors:  David G Hendrickson; David R Kelley; Danielle Tenen; Bradley Bernstein; John L Rinn
Journal:  Genome Biol       Date:  2016-02-16       Impact factor: 13.583

9.  CRISPR/Cas9 screening using unique molecular identifiers.

Authors:  Bernhard Schmierer; Sandeep K Botla; Jilin Zhang; Mikko Turunen; Teemu Kivioja; Jussi Taipale
Journal:  Mol Syst Biol       Date:  2017-10-09       Impact factor: 11.429

10.  CRISPhieRmix: a hierarchical mixture model for CRISPR pooled screens.

Authors:  Timothy P Daley; Zhixiang Lin; Xueqiu Lin; Yanxia Liu; Wing Hung Wong; Lei S Qi
Journal:  Genome Biol       Date:  2018-10-08       Impact factor: 13.583

  1 in total

1.  PRR7-AS1 Correlates with Immune Cell Infiltration and Is a Diagnostic and Prognostic Marker for Hepatocellular Carcinoma.

Authors:  Yifan Lu; Songhai Chen; Qingqing Wang; Jie Zhang; Xuanzeng Pei
Journal:  J Oncol       Date:  2022-08-26       Impact factor: 4.501
