Multi-Omics Approaches to Study Long Non-coding RNA Function in Atherosclerosis.

Adam W Turner1, Doris Wong1,2, Mohammad Daud Khan1, Caitlin N Dreisbach1,3,4, Meredith Palmore1, Clint L Miller1,2,4,5,6.   

Abstract

Atherosclerosis is a complex inflammatory disease of the vessel wall involving the interplay of multiple cell types including vascular smooth muscle cells, endothelial cells, and macrophages. Large-scale genome-wide association studies (GWAS) and the advancement of next generation sequencing technologies have rapidly expanded the number of long non-coding RNA (lncRNA) transcripts predicted to play critical roles in the pathogenesis of the disease. In this review, we highlight several lncRNAs whose functional role in atherosclerosis is well-documented through traditional biochemical approaches as well as those identified through RNA-sequencing and other high-throughput assays. We describe novel genomics approaches to study both evolutionarily conserved and divergent lncRNA functions and interactions with DNA, RNA, and proteins. We also highlight assays to resolve the complex spatial and temporal regulation of lncRNAs. Finally, we summarize the latest suite of computational tools designed to improve genomic and functional annotation of these transcripts in the human genome. Deep characterization of lncRNAs is fundamental to unravel coronary atherosclerosis and other cardiovascular diseases, as these regulatory molecules represent a new class of potential therapeutic targets and/or diagnostic markers to mitigate both genetic and environmental risk factors.

Keywords:  atherosclerosis; cardiovascular disease; gene regulation; genomics; long noncoding (lnc) RNAs

Year:  2019        PMID: 30838214      PMCID: PMC6389617          DOI: 10.3389/fcvm.2019.00009

Source DB:  PubMed          Journal:  Front Cardiovasc Med        ISSN: 2297-055X


Introduction

Despite intensive research into the underlying pathogenesis of atherosclerosis/coronary artery disease (CAD), the disease remains a significant public health burden. Atherosclerosis is a complex disease involving both environmental and genetic risk factors resulting in plaque formation and inflammation in the vessel wall. A number of cellular responses have been proposed to contribute to disease progression, including endothelial cell dysfunction, vascular smooth muscle cell phenotypic switching, macrophage/foam cell activation/necrosis/defective efferocytosis, and defective lipid/lipoprotein metabolism (1, 2). Much of the research to date has focused on the role of protein coding genes in atherosclerosis, leading to the identification of a number of proteins described as key drivers. However, our understanding of the causal mechanisms of this disease remains limited, likely due to our incomplete functional knowledge of the non-coding genome. It is now well-established that more than 90% of disease-associated variants reside in non-coding regions, once considered “junk sequences” (3). An increasing number of studies, especially those utilizing high-throughput sequencing technologies, have shown that a number of non-coding RNAs (ncRNAs) are differentially regulated during disease, supporting potential functional roles for these molecules (4). The GENCODE project (part of the ENCODE project) and other large-scale initiatives such as NONCODE have revealed that 75% of the genome is transcribed, yet only 2% encodes protein, suggesting alternative functional roles for ncRNA transcripts (5, 6). Since the development of RNA-seq and other high-throughput sequencing assays, ncRNAs are now appreciated as key regulators of gene expression (7). A number of the essential types of non-coding RNAs, such as transfer, spliceosomal, and ribosomal RNAs, are already well-characterized.
Aside from these housekeeping RNA elements, the remaining types of non-coding RNAs are subdivided into two classes, small and long non-coding RNAs, based on their size. Examples of small non-coding RNAs include microRNAs (miRNA), small nucleolar RNAs (snoRNA), and PIWI-interacting RNAs (piRNA) (8). These elements can act as positive or negative regulators of gene expression and generally exert their influence through complementary base pairing to their target transcript 3′ or 5′ untranslated regions (9). Long non-coding RNAs (lncRNAs) represent a heterogeneous class of non-coding RNAs that includes transcripts >200 nucleotides, which lack functional protein coding ability (10). Within this lncRNA class, they are also classified based on their genomic location and broadly encompass enhancer-related RNAs (eRNAs), transcribed ultraconserved RNAs, intronic RNAs, long intergenic RNAs (lincRNAs), and natural antisense transcripts (NATs) (10). Contrary to canonical linear lncRNAs, a distinct group of lncRNAs are known as circular RNAs (circRNAs) due to their circular structure, which often results from backsplicing (11). By participating in both transcriptional and post-transcriptional stages, lncRNAs modulate gene expression through multiple distinct mechanisms. Further insight into these regulatory mechanisms will facilitate a better understanding of disease biology and identify additional viable targets for therapeutic intervention or diagnostics. Here, we present an overview of various lncRNAs relevant to atherosclerosis and highlight next-generation sequencing approaches to systematically investigate lncRNA function, as well as the ongoing challenges in this exciting field.
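The size-based split described above can be written down as a one-line rule. The following is a minimal sketch in Python, assuming only the >200 nucleotide cutoff stated in the text; the function name and the constant are illustrative and not part of any published tool. The sketch ignores the housekeeping classes (transfer, spliceosomal, and ribosomal RNAs) and does not test coding potential, which real annotation pipelines must also assess.

```python
# Length cutoff taken from the text: lncRNAs are transcripts >200 nucleotides.
LNCRNA_MIN_LENGTH_NT = 200  # illustrative constant name


def classify_noncoding_by_length(length_nt: int) -> str:
    """Assign a non-coding transcript to the small or long class by length only.

    Assumes the transcript is already known to lack functional protein
    coding ability and is not a housekeeping RNA.
    """
    if length_nt <= 0:
        raise ValueError("transcript length must be a positive integer")
    if length_nt > LNCRNA_MIN_LENGTH_NT:
        return "long non-coding RNA"
    return "small non-coding RNA"


if __name__ == "__main__":
    # A miRNA-sized transcript and a multi-kilobase transcript (toy lengths).
    for length in (22, 200, 2500):
        print(length, classify_noncoding_by_length(length))
```

Because the rule is a strict inequality, a transcript of exactly 200 nucleotides falls into the small class in this sketch.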

Mechanisms of Long Non-coding RNA Function

LncRNAs are a heterogeneous class of ncRNAs (>200 nucleotides in length) that do not contain a functional open reading frame (12). LncRNAs can be encoded on either the sense or antisense DNA strand and may be located within a protein coding gene or in the intergenic regions (12). Similar to mRNAs, lncRNAs are transcribed by RNA polymerase II. Many transcripts are polyadenylated, multi-exonic, undergo RNA splicing, and contain a 5' cap. Often, their active promoters are marked with H3K4me3 and gene bodies have H3K36me3 histone modifications (13). Unlike protein coding genes, lncRNAs are not translated into protein and thereby lack functional initiation and termination codons (8). They are expressed at much lower levels relative to their protein coding counterparts and lack robust evolutionary conservation. Despite this low level of conservation, the expression pattern of lncRNAs has been shown to be relatively cell/tissue specific (14–16). The mechanisms of action of these regulatory factors have been categorized into four broad groups, as proposed by Wang and Chang (17): signals, decoys, guides, and scaffolds. Signaling lncRNAs exhibit a high degree of spatial and temporal specificity and serve a role in signal transduction. Once transcribed, these signaling lncRNAs have effector functions in activating appropriate downstream pathways in response to a stimulus. Additionally, their presence may indicate a particular developmental cell state, condition, or overall transcriptional activity (17). Another mechanism by which lncRNAs exert their regulatory function is by acting as decoy molecules to limit the availability of RNA binding factors to interact with their partners. By preventing chromatin remodelers, transcription factors, and miRNAs from binding to their target genes, decoy lncRNAs can inhibit downstream effector functions (17).
It is noteworthy that miRNAs also have the ability to target lncRNAs directly and thereby influence transcriptional regulation and vascular functions (18–20). Aided by their ability to bind protein as well as base pair with target sequences, guide lncRNAs are responsible for localizing transcriptional regulators to specific regions. Similar to the role of guide lncRNAs, scaffold lncRNAs use their protein binding ability to provide a surface to mediate protein-protein interactions (17). In these various ways, lncRNAs represent a distinct class of regulatory elements to modulate transcriptional activities.
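The decoy activity toward miRNAs described above depends on base pairing between the miRNA and the lncRNA. As a minimal illustrative sketch in Python, the code below locates candidate binding sites under a common simplification: a site is a perfect match to the reverse complement of miRNA nucleotides 2 to 8 (the seed). The function names and the toy transcript are hypothetical. The let-7a-5p sequence is given as commonly annotated and should be verified against a miRNA database before reuse. Dedicated target prediction tools also weigh site context, accessibility, and conservation, all of which this sketch ignores.

```python
from typing import List

# Watson-Crick pairing for RNA (G:U wobble pairs are ignored in this sketch).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def seed_match_site(mirna: str) -> str:
    """Sequence on the target that pairs with miRNA nucleotides 2-8 (the seed)."""
    seed = mirna.upper()[1:8]  # 0-based slice covering nucleotides 2..8
    return reverse_complement_rna(seed)


def find_seed_sites(transcript: str, mirna: str) -> List[int]:
    """Return 0-based start positions of perfect seed matches in a transcript."""
    site = seed_match_site(mirna)
    rna = transcript.upper().replace("T", "U")  # accept DNA-style input
    positions = []
    start = rna.find(site)
    while start != -1:
        positions.append(start)
        start = rna.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # let-7a-5p as commonly annotated (assumption)
    toy_lncrna = "AAACUACCUCGGGCUACCUCAA"  # hypothetical sequence with two sites
    print(seed_match_site(let7a))
    print(find_seed_sites(toy_lncrna, let7a))
```

A transcript carrying several such sites is a more plausible sponge candidate than one carrying a single site, although site number alone does not establish decoy function.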

Overview of LncRNAs Involved in Atherosclerosis

Functional Studies of LncRNAs in Atherosclerosis

Depending on the cell types involved, lncRNAs play myriad roles in diverse atherosclerotic processes in the vessel wall including cell proliferation, migration, differentiation, apoptosis, and inflammation. They also play important roles in the regulation of cholesterol and lipid metabolism. Pertinent cell types include smooth muscle cells, endothelial cells, macrophages, and hepatocytes (Figure 1). A more comprehensive overview of key lncRNAs in atherogenic processes is given in Table 1. Perhaps the most well-studied in atherosclerosis is CDKN2B-AS1 or ANRIL (Antisense Non-coding RNA in the INK4 Locus), which acts in several cell types relevant to CAD (21, 22, 81). ANRIL acts as a guide lncRNA to localize polycomb repressive complex (PRC) at target promoters through a direct interaction with its subunits, CBX7 or SUZ12. PRC then adds H3K27me3 modifications to this region to repress transcription (22). Loss of function studies have suggested ANRIL acts in cis in order to regulate transcription of the nearby tumor suppressors, CDKN2A and CDKN2B (82, 83). Consistent with this mechanism of action, ANRIL expression correlates with a more proliferative phenotype in endothelial cells and vascular smooth muscle cells (VSMC) (22, 84, 85). In addition to acting in cis, ANRIL also acts in trans (via Alu elements) to regulate other genes that participate in proatherogenic pathways (22). Since ANRIL is not well-conserved in mice, in vivo functional studies have been challenging (86). A more complete overview of the atherogenic roles of ANRIL RNA species has been recently documented in a review by Holdt and Teupser (87).
Figure 1

Schematic of atherosclerotic processes and specific lncRNA functions. Top, LncRNAs are shown with described smooth muscle cell (SMC) functions, such as proliferation, apoptosis, autophagy, phenotypic switching, and differentiation. LncRNAs are also shown with endothelial cell (EC) functions such as differentiation, regulation of endothelial nitric oxide synthase (eNOS) mediated signaling, growth and angiogenesis. LncRNAs are shown with macrophage functions, such as macrophage polarization, cholesterol efflux, and inflammation. Also, lncRNAs are listed with functions in regulating cholesterol and triglyceride metabolism in hepatocytes and/or macrophages. Bottom, schematic showing example of atherosclerotic lesion after invasion of vascular endothelium by activated monocytes, which become macrophages upon chronic inflammatory stimulation. Exposure to oxidized LDL (oxLDL) particles promotes macrophage transformation to lipid-laden foam cells. Also depicted is the transformation of contractile SMCs to de-differentiated or modulated SMCs, as well as the transition of modulated SMCs to macrophage-like cells in the lesion. ECM, Extracellular matrix.

Table 1

List of long non-coding RNAs with functional relevance in coronary artery disease cell types/tissues.

| Name | Full name | Initial detection method(s) | Cell types/tissues relevant to CAD | Function relevant to CAD | References |
| --- | --- | --- | --- | --- | --- |
| ANRIL | A new large antisense non-coding RNA | Long range PCR and nucleotide sequencing | SMC, EC, Mac | Cell cycle regulation; Regulation of apoptosis | (21–24) |
| GAS5 | Growth arrest specific 5 | Characterization of cDNA library | SMC, EC, Mac | Regulates apoptosis; Regulates autophagy; Regulates smooth muscle differentiation | (25–28) |
| HIF1a-AS1 | Hypoxia-inducible factor 1-alpha Antisense RNA 1 | RIP, qRT-PCR | SMC | Regulation of VSMC apoptosis | (29–32) |
| LincRNA-p21 | Long intergenic non-coding RNA at p21 locus | Nimblegen lincRNA tiling microarray platform | SMC, Mac | Regulation of cell proliferation; Regulation of apoptosis | (33, 34) |
| MALAT1 | Metastasis associated lung adenocarcinoma transcript 1 | Identified in numerous physiological processes | SMC, EC, Mac | Regulation of cell proliferation; Regulation of inflammation; Regulation of autophagy | (35–40) |
| MEG3 | Maternally expressed 3 | Characterization of cDNA library | EC, SMC | Regulates endothelial cell proliferation; Inhibits angiogenesis; Regulates smooth muscle proliferation/migration | (41–43) |
| NEAT1 | Nuclear paraspeckle assembly transcript 1 | Affymetrix expression array | SMC, Mac | Regulation of VSMC phenotypic switching; Macrophage polarization | (44–47) |
| NEXN-AS1 | Nexilin antisense transcript 1 | lncRNA array | EC, SMC | Regulation of adhesion molecules/monocyte recruitment; Regulation of inflammatory cytokines | (48) |
| CHROME | Cholesterol homeostasis regulator of miRNA expression | Characterization of transcript proximal to CAD and plasma HDL-C associated locus | Mac, Liver | Regulation of cholesterol homeostasis | (49) |
| AK098656 | AK098656 | lncRNA array | SMC | Regulation of VSMC phenotypic switching | (50) |
| CASC15 | Cancer susceptibility 15 | Quantification of copy number gains in metastatic melanoma | SMC | VSMC stiffness | (51, 52) |
| H19 | H19 | cDNA characterization | SMC | Regulation of VSMC differentiation; let-7 miRNA sponge | (53–55) |
| MIR122HG/Lnc-Ang362 | MicroRNA 122 Host Gene | RNA-seq | SMC | Regulation of VSMC proliferation | (56) |
| MYOSLID | MYOcardin-induced Smooth muscle LncRNA, Inducer of Differentiation | RNA-seq | SMC | Regulation of SMC differentiation | (57) |
| PACER | P50-associated COX-2 extragenic RNA | Analysis of ChIP data, RT-qPCR | SMC | Regulation of COX-2 expression; VSMC stiffness | (51, 58) |
| SENCR | Smooth muscle and Endothelial cell-enriched migration/differentiation-associated long Non-coding RNA | RNA-seq | SMC | Regulation of myocardin, SMC contractile gene program | (59) |
| SMILR | Smooth muscle-induced lncRNA enhances replication | RNA-seq | SMC | Regulation of VSMC proliferation | (60) |
| CLDN10-AS1 | Claudin 10 antisense transcript 1 | lncRNA microarray | EC | Regulation of endothelial signaling | (61) |
| CTC-459I6.1 | RASGRF2 antisense RNA 1 | lncRNA microarray | EC | Regulation of endothelial signaling | (61) |
| GATA6-AS | GATA6 antisense | RNA-seq | EC | Regulation of endothelial signaling; Effects on endothelial-mesenchymal transition | (62) |
| LEENE | lncRNA that enhances eNOS expression | RNA-seq, chromosome conformation capture | EC | Regulation of eNOS and endothelial function | (63) |
| MIAT | Myocardial infarction associated transcript | Isolation of specific cDNA, RACE | EC | Regulation of angiogenesis | (64, 65) |
| NRON | Non-protein coding RNA, repressor of NFAT | Screen of cDNA library to identify conserved lncRNAs; shRNA screen to identify NFAT regulators | EC | Regulation of angiogenesis; Regulation of vascular development | (66–68) |
| sONE/ATG9B | Autophagy related 9B | cDNA characterization | EC | Regulation of eNOS | (69) |
| STEEL | Spliced transcript endothelial-enriched lncRNA | Custom lncRNA microarray | EC | Angiogenesis | (70) |
| TIE1-AS | Endothelial-specific receptor tyrosine kinase 1 antisense | Detection in EST libraries, RACE | EC | Regulation of vascular development | (71, 72) |
| Dnm3os | Dynamin-3 opposite strand/antisense RNA | Isolation of clones, RACE | Mac | Regulation of macrophage inflammation | (73, 74) |
| lncRNA-Mirt2 | lncRNA Myocardial infarction associated transcript | lncRNA expression microarray | Mac | Negative regulator of inflammation | (75) |
| MeXis | Macrophage-expressed LXR-induced sequence | RNA-seq | Mac | Regulation of cholesterol metabolism | (76) |
| APOA1-AS | Apolipoprotein A1 antisense transcript | Identification in EST library, RACE | Liver | Regulation of APOA1 expression | (77) |
| LeXis | Liver-expressed LXR-induced sequence | RNA-seq | Liver | Regulation of cholesterol metabolism | (78) |
| lncLSTR | lncRNA Liver-Specific Triglyceride Regulator | Characterization of imprinted genes; Reannotation of Affymetrix Mouse Genome 430 2.0 Array, search for liver-enriched transcripts | Liver | Regulation of triglyceride metabolism | (79) |
| TRIBAL | Tribbles homolog 1-associated locus | Identification in EST library, RACE | Liver | Regulation of triglyceride metabolism | (80) |

Color scheme: Gray, lncRNAs associated with multiple cell types/tissues. Purple, lncRNAs associated with smooth muscle cells (SMC). Green, lncRNAs associated with endothelial cells (EC), Yellow, lncRNAs associated with macrophages (Mac), Blue, lncRNAs associated with cholesterol metabolism in liver. RACE: Rapid amplification of cDNA ends, EST: Expressed sequence tag.

The ubiquitously expressed and evolutionarily conserved lncRNA, MALAT1, is decreased in atherosclerotic plaques (88, 89), and reduced MALAT1 levels in hematopoietic cells promotes atherosclerosis and inflammation in mice in vivo (89). In contrast, MALAT1 knockdown in VSMC and EC results in cell cycle arrest and reduced proliferation (35, 36).
Other lncRNAs (e.g., lincRNA-p21, HIF1α-AS, circular ANRIL, and GAS5) have been implicated in cell death/apoptotic pathways through various mechanisms (23, 29, 33, 90), as described elsewhere (91, 92). In particular, lincRNA-p21 was shown to be reduced in CAD patients and mouse models of atherosclerosis, and regulates p53-dependent smooth muscle cell proliferation and apoptosis (33). Many lncRNAs have established immune and inflammatory roles. For example, heterozygous MALAT1-deficient ApoE-/- mice have increased inflammation and atherosclerosis (93). LncRNA Dnm3os (dynamin 3 opposite strand) is upregulated in macrophages under diabetic conditions and regulates nucleolin and epigenetic mediated inflammatory responses (73). In addition, in human macrophages treated with oxidized LDL, HOTAIR regulates oxidative stress and inflammation (94). There are several lncRNAs with regulatory roles in lipid and cholesterol metabolism. CHROME (cholesterol homeostasis regulator of miRNA expression) is a lncRNA upregulated in carotid plaques, which regulates cholesterol homeostasis in primates in liver and macrophages by inhibiting miRNAs, such as miR-33 (49). NEAT1 promotes pro-atherogenic functions in THP-1 human macrophage cells, such as increased oxLDL lipid accumulation and inflammation, by acting as a sponge for miR-342-3p (44). Finally, differential expression of TRIBAL, APOA1-AS, and lncLSTR is linked to defects in lipid metabolic pathways, mainly in the liver. TRIBAL (TRIB1 associated locus) regulates Trib1 mRNA stability through mitogen activated kinase, consistent with Trib1 regulation (80, 95, 96). Increased TRIBAL expression stabilizes Trib1 expression and upregulates fatty acid oxidative pathways (80). Likewise, lncLSTR (liver-specific triglyceride regulator) regulates plasma triglyceride clearance by modulating apolipoprotein C2 (APOC2) levels and lipoprotein lipase activity (79).
APOA1-AS regulates cholesterol levels through epigenetic modulation of APOA1, a protein involved in the cholesterol efflux pathway (77).

LncRNAs With Genetic Associations in Atherosclerosis

Genome-wide association studies (GWAS) have linked genetic variation at the ANRIL locus (9p21.3) to many complex phenotypes including CAD, stroke, type 2 diabetes and multiple cancers (97). In addition to ANRIL, the 9p21.3 locus encodes three tumor suppressor proteins: CDKN2A, CDKN2B, and MTAP. Although each is an attractive candidate to explain the locus association with various diseases, several studies report CAD risk polymorphisms associated with ANRIL expression (21, 98). However, association studies of 9p21.3 genotype with ANRIL expression remain complex due to the numerous linear and circular ANRIL forms (23). Other less studied lncRNAs have been identified from genetic studies of CAD or related traits and may play critical roles in atherosclerosis. For instance, genetic variation in the imprinted lncRNA H19, involved in embryonic development (99) and oncogenesis (100), was associated with CAD and ischemic stroke in Chinese populations (101, 102). H19 was initially shown to be re-expressed in smooth muscle cells in human and rodent atherosclerotic plaques (103), and promotes VSMC proliferation by acting as a let-7a miRNA sponge to upregulate cyclin D (104). However, a recent study revealed endothelial cell restricted expression in human atherosclerotic plaques and a role in endothelial cell aging by suppressing STAT3 signaling (105), similar to lncRNA MEG3 (106). Another endothelial cell lncRNA, MIAT (Myocardial Infarction Associated Transcript), was previously associated with myocardial infarction in a large genetic study of a Japanese population (64). MIAT is upregulated in atherosclerotic plaques (88), and regulates microvascular dysfunction by acting as a competing endogenous RNA (65). Another lncRNA associated with CAD through large-scale GWAS is known as TARID (TCF21 antisense RNA inducing promoter demethylation) (107).
TARID was identified as an eQTL target gene in human coronary artery smooth muscle cells (108), and molecular studies suggest this lncRNA guides GADD45A mediated DNA demethylation and inactivation of TCF21 (109), a known tumor suppressor and vascular wall transcription factor associated with CAD (110–113). Yet, functional studies of TARID both in VSMCs and in vivo are needed to elucidate its potential role in atherosclerosis. With larger GWAS sample sizes, and complementary eQTL colocalization (114), and transcriptome-wide association studies (TWAS) (115), it is anticipated that even more lncRNAs will be identified with genetic association evidence.

Application of Transcriptomics to Identify and Study LncRNAs Relevant to CAD

General Considerations for Transcriptomics Studies of LncRNAs

While traditional methods to profile lncRNA transcriptomes have relied on microarrays or serial analysis of gene expression (SAGE), these approaches have largely been replaced by RNA-seq owing to its decreasing costs and greater output (116). In general, RNA-seq provides greater sensitivity and specificity to detect a broad range of ncRNA transcripts, novel isoforms, and interactions between ncRNAs (117). Nonetheless, there are some important considerations when designing and conducting RNA-seq based lncRNA screening experiments. For instance, given that lncRNAs are approximately 10X less abundant than mRNAs on average, the basal expression of a typical lncRNA is < 5 fragments per kilobase of transcript per million mapped reads (FPKM) (118). Thus, it is highly recommended to obtain deeper sequencing per sample (~100X read depth) than a typical RNA-seq experiment. Also, while up to 50% of lncRNAs appear to be poly-adenylated (119) and would be detected with mRNA library preparation kits, a more comprehensive landscape of lncRNAs and other ncRNAs, including eRNAs, would require total RNA [poly(A) and non-poly(A)] prepared with ribosomal RNA depletion methods. Distinguishing lncRNA transcripts from mRNA transcripts in short-read sequencing data remains a challenge; however, deeper, paired-end, and stranded sequencing should improve identification of lncRNAs (120). Also, careful study design is needed to ensure sufficient power to detect differentially expressed and transcript-specific lncRNAs when using standard count-based tools (120). Since many lncRNAs are tissue and cell-specific (121–123), it is also worth considering the effects of diluting weak signals from bulk populations of cells, as well as specific environmental contexts that may regulate lncRNA transcript levels. Below, we summarize recent findings of RNA-seq based lncRNA discoveries in specific cell types relevant to CAD/atherosclerosis.
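To make the abundance argument above concrete, the following minimal Python sketch applies the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments) and inverts it to estimate how many fragments a low-abundance lncRNA would yield at a given library size. The function names, transcript length, and library sizes are illustrative assumptions and are not taken from the cited studies.

```python
def fpkm(fragments: float, length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Equivalent to fragments / (length in kb) / (library size in millions).
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def expected_fragments(fpkm_value: float, length_bp: float,
                       total_mapped_fragments: float) -> float:
    """Invert the FPKM formula: fragments expected for a transcript at a given depth."""
    return fpkm_value * (length_bp / 1e3) * (total_mapped_fragments / 1e6)


if __name__ == "__main__":
    # Hypothetical 2 kb lncRNA with 150 fragments in a 30 million fragment library.
    print(fpkm(150, 2000, 30_000_000))
    # Fragments expected for a 1 FPKM, 2 kb lncRNA at two illustrative depths.
    for depth in (30_000_000, 100_000_000):
        print(depth, expected_fragments(1.0, 2000, depth))
```

Because the expected fragment count scales linearly with library size, a transcript near 1 FPKM gains proportionally more supporting fragments in a deeper library, which is the rationale for sequencing lncRNA screens more deeply than typical mRNA experiments.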

Transcriptomics of Vascular Smooth Muscle Cell Function

The first VSMC lncRNA discovered via RNA-seq was Lnc-Ang362 (HG-MIR222), which is upregulated in rat aortic smooth muscle cells upon stimulation with angiotensin II (56). Lnc-Ang362 promotes VSMC proliferation and is the host-transcript for both miR-221 and miR-222. Bell et al. conducted RNA-seq in human coronary artery smooth muscle cells and identified 31 previously unidentified lncRNAs (59). Notably, one of these was Smooth muscle and Endothelial cell-enriched migration/differentiation-associated long Non-coding RNA (SENCR), which is located antisense to the FLI1 gene. SENCR functionally promotes a contractile smooth muscle phenotype and inhibits migration (59). In a follow-up study, RNA-seq was performed in human coronary artery smooth muscle cells to examine the effect of myocardin (MYOCD) overexpression (57). MYOCD is a potent co-factor that binds with serum response factor (SRF) to activate an array of smooth muscle-specific genes that maintain smooth muscle cell differentiation (124–128). Over 100 lncRNAs were differentially expressed, one of which was identified as MYOcardin-induced Smooth muscle LncRNA, Inducer of Differentiation (MYOSLID). Functional studies demonstrated that MYOSLID, a direct transcriptional target of MYOCD/SRF, promotes smooth muscle differentiation and inhibits proliferation (57). Yu et al. used RNA-seq to compare transcriptomes of coronary and aortic smooth muscle cells subjected to both normal and pathological aortic stiffness, a subclinical risk factor for CAD and various aortic diseases (51). Only two of the top 20 ranked differentially expressed lncRNAs have been studied to date: CASC15 and PACER (RP5-973M2.2). These lncRNAs regulate expression of protein-coding genes in cis and PACER activates COX2 expression (52, 58, 129). Analysis of RNA-seq data highlighted the lncRNA MALAT1 as a key regulator of VSMC stiffness-induced proliferation and migration. 
Although MALAT1 was originally described as an endothelial lncRNA, MALAT1 regulates the phenotypic switching of VSMCs via activation of the autophagy pathway (36). Using RNA-seq in human smooth muscle cells, Ballantyne et al. identified over 300 differentially expressed lncRNAs upon platelet-derived growth factor and interleukin-1 alpha stimulation. The novel lncRNA identified in this study, Smooth Muscle-Induced LncRNA enhances Replication (SMILR), enhances smooth muscle cell proliferation and has increased expression in unstable atherosclerotic plaques (60). The lncRNA NEAT1 (nuclear paraspeckle assembly transcript 1) has recently been implicated in promoting the phenotypic switching of VSMCs (45). RNA-seq demonstrated that NEAT1 silencing increases the mRNA levels of numerous critical smooth muscle cell marker genes. Finally, to identify lncRNAs key in smooth muscle cell differentiation, Lim et al. combined and queried diverse RNA-seq datasets from Gene Expression Omnibus (GEO). This analysis identified dozens of lncRNAs with no previous evidence for roles in VSMC differentiation that warrant further investigation, either as cis transcriptional regulators or as suppressors of miRNA function (130). Custom lncRNA arrays have also been applied to identify lncRNAs involved in various processes critical in atherosclerosis. One example is a microarray analysis which identified 580 lncRNAs differentially expressed upon exposure of human aortic smooth muscle cells to cyclic mechanical stretch (131). Another example is the identification of AK098656, which is predominantly expressed in VSMCs, upregulated in hypertensive patients, and involved in promoting a synthetic smooth muscle cell phenotype (50).

Transcriptomics of Endothelial Cell Function

Although not all lncRNAs have a poly(A) tail, Michalik et al. performed deep sequencing of poly(A)-selected RNA in human umbilical vein endothelial cells (HUVECs) and found that over half of total RNA was composed of non-coding RNA, much of which was lncRNA (35). This study focused on five lncRNAs with high endothelial expression and strong conservation between mice and humans: MALAT1, linc00493, maternally expressed 3 (MEG3), taurine upregulated gene 1 (TUG1), and linc00657. MALAT1 and MEG3 are strongly upregulated in response to hypoxia, while linc00657 and TUG1 are moderately upregulated. With regard to angiogenesis, MALAT1 promotes angiogenesis and induces a switch of endothelial cells from a migratory cell phenotype to a proliferative cell phenotype (132). Huang et al. postulated that exosomal MALAT1 from oxidized LDL (oxLDL) treated endothelial cells (HUVECs) promotes macrophage polarization toward the M2 phenotype (133). MEG3 was shown to interact with epigenetic modifiers, to inhibit angiogenesis, and to contribute to age-related endothelial dysfunction (106, 134, 135). In another study, Miao et al. conducted RNA-seq profiling of endothelial cells subjected to both physiological and pathological flow for various time points (63). They identified and characterized LEENE (lncRNA that enhances eNOS expression) as a lncRNA highly correlated with endothelial nitric oxide synthase (eNOS) expression levels, which is downregulated upon pathological flow (63). Several lncRNAs characterized in smooth muscle cells also have functional significance in endothelial cells. For instance, the SMC lncRNA SENCR regulates the differentiation of pluripotent cells into endothelial cells and promotes angiogenesis in HUVECs (136). Custom lncRNA microarrays have identified endothelial cell enriched lncRNAs (70) and endothelial lncRNAs differentially expressed in response to specific treatments (e.g., oxidized LDL) (61).
These arrays have revealed new candidate lncRNAs in atherosclerosis including spliced-transcript endothelial-enriched lncRNA (STEEL) (70), NEXN-AS1 (48), CLDN10-AS1, and CTC-459I6.1 (61). STEEL facilitates the transcriptional stimulation of both eNOS and Kruppel-like factor 2 (KLF2) and promotes angiogenesis in vivo (70). STEEL expression is decreased in HUVECs exposed to “atheroprotective” flow and increased in HUVECs exposed to disturbed “atheroprone” flow.

Transcriptomics of Macrophage Function and Inflammation

In macrophages, LXR activation promotes cholesterol efflux through activation of target genes such as Abca1 during the formation of HDL. To investigate the regulation of LXR-dependent transcription in macrophages, a recent study conducted large-scale transcriptional profiling of mouse peritoneal macrophages in response to the LXR agonist GW3965. LXR activation stimulated transcription of an array of lncRNAs, of which MeXis was among the strongest induced (76). MeXis is well-conserved in mice and was shown to amplify the LXR-dependent expression of Abca1 in vivo and promote cholesterol efflux in macrophages (76). Loss of MeXis in Ldlr−/− mice was shown to accelerate atherosclerosis through impaired Abca1 expression in macrophages and resulted in decreased cholesterol efflux (76). ATAC-seq in peritoneal macrophages demonstrated decreased chromatin accessibility across the Abca1 locus in response to loss of MeXis. Querying the MeXis interactome through mass spectrometry revealed protein interactions with the nuclear receptor coactivator DDX17. Either directly or indirectly through one of its interacting targets, MeXis represents a potential therapeutic target to regulate macrophage cholesterol efflux. RNA-seq and lncRNA arrays have identified a number of other macrophage lncRNAs that could represent novel CAD targets. Zhang et al. performed deep RNA sequencing of human monocyte-derived macrophages as well as M1 activated (via interferon gamma and lipopolysaccharide stimulation) and M2 activated (via interleukin 4 stimulation) macrophages (137). This study identified 861 previously unannotated lincRNAs, most of which are not syntenic in mouse. Furthermore, the lncRNA expression profile is dramatically shifted upon M1 activation, supporting the inflammatory nature of atherosclerosis. Similarly, 109 unannotated CD14+ monocyte lincRNAs were highlighted upon exposure to inflammatory stress in vivo (138). 
Other recent array studies highlighted the macrophage lncRNAs Dnm3os and Mirt2. Dnm3os is upregulated in bone marrow-derived macrophages from diabetic mice compared to controls and is higher in monocytes from human type 2 diabetic patients compared to controls (73). Dnm3os alters global histone modifications in macrophages and upregulates various immune-response and inflammatory genes. LncRNA-Mirt2 is strongly induced by LPS, a toll-like receptor 4 (TLR4) ligand, and acts as a negative feedback regulator of inflammation (75).

Transcriptomics of Cholesterol Metabolism and Hepatocyte Function

Liver X receptors (LXRs) are nuclear receptor transcription factors that are important mediators of lipid and cholesterol metabolism. LXR targets include the ABC family of transporters, ApoE, LPL, and SREBP (139, 140). Liver-specific LXR alpha knockout mice develop increased cholesterol levels and atherosclerosis (141). Sallam et al. performed genome-wide transcriptional profiling of primary mouse hepatocytes upon stimulation with an LXR agonist (78). The most strongly induced gene was a non-coding RNA termed LeXis (liver-expressed LXR-induced sequence) that lies adjacent to the Abca1 gene. LeXis regulates several genes with roles in cholesterol biosynthesis, subsequently altering both liver and plasma cholesterol levels. Mass spectrometry was used to characterize the LeXis interactome and revealed binding to RALY, a ribonucleoprotein that acts as a transcriptional cofactor in the regulation of cholesterol biosynthetic genes (78). In the context of atherosclerosis, adenoviral overexpression of LeXis in the liver reduces atherosclerosis in a familial hypercholesterolemia mouse model (142). As discussed above, CHROME is another LXR-regulated lncRNA involved in cholesterol homeostasis (49). CHROME was first identified through a combination of genetic association studies for premature CAD and HDL-C and microarray-based expression profiling in human atherosclerotic plaques (49). RNA-seq of control and CHROME shRNA-treated HepG2 hepatocytes revealed the downstream pathways affected, including the LXR pathway, bile acid metabolism, cholesterol excretion, and fatty-acid β-oxidation pathways (49).

Application of Novel Genomic Technologies to Detect and Study LncRNA Functions

Novel Sequencing Technologies to Discover and Annotate Long Non-coding RNAs

Although next-generation sequencing has resulted in the identification of thousands of lncRNAs in the genome, many of these lncRNAs remain poorly characterized and annotated. It is often unclear where transcription begins and which exons are present in a particular isoform. Since lncRNAs are often expressed at lower levels than protein-coding genes, current transcriptomic data are unable to provide comprehensive mapping and characterization of isoforms. However, new sequencing technologies allow for better characterization due to longer read lengths, higher sensitivity, and higher accuracy. Techniques such as Iso-Seq (Pacific Biosciences) offer long-read sequencing using single-molecule, real-time (SMRT) sequencing, in which the sequence of a full-length transcript is captured in a single read (143). Despite these benefits, these single-molecule sequencers yield higher error rates compared with short-read sequencing technologies (e.g., Illumina). Nanopore technologies such as the MinION instrument (Oxford Nanopore Technologies) also allow single cDNA molecules to be sequenced without the need for amplification, provide sufficient read lengths to cover the full-length non-coding RNA, and result in less bias than other long-read approaches (144). This technique passes nucleic acids through a pore about 1 nm (10⁻⁹ m) in diameter, and the resulting changes in electric current are used to decipher the identity of each nucleotide (145). Since lncRNAs are typically less abundant than protein-coding genes (usually by one order of magnitude), they remain a challenge to study in bulk transcriptomic datasets. To improve the detection and annotation of lncRNAs, a method known as RACE (Rapid Amplification of cDNA Ends)-Seq was developed (146); however, this approach was limited by its low throughput. Later, a technique called RNA CaptureSeq was developed to enrich for long non-coding RNAs (147).
RNA CaptureSeq employs an array of oligonucleotide probes to capture select genes of interest, which can be applied to pull-down lncRNAs of interest (148, 149). More recently the GENCODE consortium improved upon RNA CaptureSeq by developing RNA Capture Long Seq (RNA CLS) with the goal of annotating lncRNAs with much higher confidence (150). RNA CLS overcomes the short-read length hurdle of RNA CaptureSeq by first capturing lncRNAs and then integrating with long-read sequencing.

DNA-Based LncRNA Interactions

Despite their low abundance, lncRNAs are known to function through specific molecular interactions with other RNA species and RNA binding proteins. Several high-throughput methods are now available to uncover the genomic DNA sequences that lncRNAs interact with and likely regulate (Figure 2A). Chromatin Isolation by RNA Purification (ChIRP-Seq) is a well-established technique to study lncRNA-chromatin interactions through RNA/chromatin crosslinking, purification using biotinylated antisense oligonucleotides, followed by high-throughput sequencing (151, 152). Domain-specific ChIRP (dChIRP) is a variation of ChIRP that can characterize lncRNA function and architecture at the RNA domain level (153). dChIRP can not only investigate lncRNA-chromatin interactions but also pairwise lncRNA-RNA and lncRNA-protein interactions.
Figure 2

Genomic approaches to capture lncRNA interactions. (A) DNA-based lncRNA interactions include Chromatin Isolation by RNA Purification (ChIRP) and Capture Hybridization Analysis of RNA Targets (CHART). An in situ based method to capture Global RNA Interactions with DNA (GRID) followed by deep sequencing uses a biotinylated bivalent linker to ligate RNA and dsDNA. (B) Protein-based lncRNA interactions include RNA Immunoprecipitation (RIP), which uses an antibody against an RNA binding protein (RBP) to capture RNA-protein interactions. Cross-linking Immunoprecipitation (CLIP) combines UV cross-linking with immunoprecipitation to capture RNA-protein interactions. Targets of RNA-binding proteins Identified By Editing (TRIBE) couples an RBP to an RNA editing enzyme (ADAR). Targets of the RBP are marked by adenosine to inosine RNA editing events and identified by sequencing. (C) RNA-based lncRNA interactions include RNA Antisense Purification, which uses a biotinylated probe to capture interacting RNAs and can be followed by sequencing or mass spectrometry. LIGation of interacting RNA (LIGR) followed by sequencing is a powerful approach to capture lncRNA-RNA interactions by in vivo crosslinking of RNA duplexes using the psoralen derivative 4'-aminomethyltrioxalen (AMT) and UV irradiation at 365 nm.

Capture hybridization analysis of RNA targets (CHART) (154, 155) is a similar method to experimentally determine where lncRNAs target and localize in the genome (Figure 2A). In the CHART protocol chromatin is crosslinked and lncRNAs subsequently hybridized to biotinylated C-oligos. After bead immobilization of lncRNA/DNA complexes, sequencing is conducted to identify lncRNA binding DNA regions. GRID-seq (global RNA interactions with DNA by deep sequencing) is a new unbiased method to capture global RNA-interactions (Figure 2A) that can be applied to investigate lncRNA-DNA interactions in cell lines relevant to atherosclerosis (156).
This GRID-seq technique uses a bivalent linker consisting of double-stranded DNA and single-stranded RNA to link RNAs with DNA in nuclei that have been fixed. Finally, MARGI (mapping RNA-genome interactions) is a high-throughput method that can be performed in vivo or on cells and reveal the genomic target sites of lncRNAs (157).

Protein-Based LncRNA Interactions

ChIRP-MS is an adaptation of the ChIRP protocol that is used to characterize the interacting proteome of a lncRNA (158). ChIRP-MS has identified protein interactors for lncRNAs such as LeXis, MeXis, and AK098656 (50, 76, 78). LncRNA pull-down followed by mass spectrometry has been conducted for several lncRNAs with potential roles in CAD such as circANRIL (23), STEEL (70), MALAT1 (159, 160), Dnm3os (73), lncLSTR (79), and GATA6-AS (62). Numerous additional methods exist to decipher the proteins binding lncRNAs. RAP-MS uses ultraviolet light to crosslink direct RNA-protein interactions (161). UV-C crosslinking immunoprecipitation (CLIP) is another powerful technique to interrogate direct protein-RNA interactions, and many variations have been adapted based on the implementation of high-throughput sequencing (Figure 2B). These include iCLIP, PAR-CLIP, HITS-CLIP, irCLIP, and eCLIP (162–166). High-throughput sequencing of RNA isolation by crosslinking immunoprecipitation (HITS-CLIP) was developed as a genome-wide means to interrogate RNA-protein interactions in vivo (164). TRIBE (targets of RNA-binding proteins identified by editing) is designed to identify RNA molecules that bind to RNA binding proteins (RBPs) (Figure 2B) (167). Advantages of TRIBE include its applicability to in vivo samples, the ability to be performed on a small number of cells, and no need for antibodies in the procedure. The TRIBE protocol couples an RNA editing enzyme to the RBP, and RNA targets that have been edited are identified via next-generation sequencing (TRIBE-seq). HyperTRIBE extends the TRIBE procedure by introducing a hyperactive mutation into the RNA editing enzyme, which improves the RNA editing efficiency and reduces the sequence bias of editing (168).
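As a rough illustration of how edited TRIBE targets might be called from sequencing data: inosine is read as guanosine during sequencing, so editing appears as A-to-G mismatches at reference adenosines. The sketch below assumes per-site A and G read counts have already been tabulated for an RBP-editing enzyme fusion sample and a control sample. The function names, thresholds, and site labels are hypothetical and are not taken from the TRIBE publications.

```python
from typing import Dict, List, Tuple

# Per-site read counts at reference adenosines: {site: {"A": n, "G": n}}
Counts = Dict[str, Dict[str, int]]

def editing_fraction(counts: Dict[str, int]) -> float:
    """Fraction of reads showing G at a reference A (inosine is read as G)."""
    total = counts.get("A", 0) + counts.get("G", 0)
    return counts.get("G", 0) / total if total else 0.0

def call_edited_sites(fusion: Counts, control: Counts,
                      min_depth: int = 20,          # hypothetical threshold
                      min_fraction: float = 0.10,   # hypothetical threshold
                      max_control: float = 0.01     # hypothetical threshold
                      ) -> List[Tuple[str, float]]:
    """Return sites edited in the fusion sample but not in the control."""
    called = []
    for site, c in fusion.items():
        depth = c.get("A", 0) + c.get("G", 0)
        if depth < min_depth:
            continue  # too few reads to estimate an editing fraction
        frac = editing_fraction(c)
        ctrl = editing_fraction(control.get(site, {}))
        # A high G fraction in the control suggests a SNP or endogenous editing
        if frac >= min_fraction and ctrl <= max_control:
            called.append((site, frac))
    return sorted(called, key=lambda item: item[1], reverse=True)

# Made-up counts for illustration only
fusion = {
    "chr1:1000": {"A": 70, "G": 30},
    "chr1:2000": {"A": 99, "G": 1},
    "chr2:500": {"A": 5, "G": 5},
    "chr3:42": {"A": 50, "G": 50},
}
control = {
    "chr1:1000": {"A": 100, "G": 0},
    "chr1:2000": {"A": 100, "G": 0},
    "chr3:42": {"A": 50, "G": 50},
}
sites = call_edited_sites(fusion, control)
print(sites)
```

Only the first site passes in this toy input: the second is below the editing threshold, the third lacks coverage, and the fourth shows the same G fraction in the control. Real analyses would also need strand handling and statistical testing, which are omitted here.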

RNA-Based LncRNA Interactions

RNA-centric RNA antisense purification (RAP) is a general approach to identify and study lncRNA functions (Figure 2C). This method uses long capture probes (120 nucleotides) tiled across an entire RNA sequence to pull down lncRNAs, followed by stringent wash conditions to reduce non-specific binding (169). There are now next-generation sequencing derived methodologies that have been established to better define RNA-RNA interactions. LIGR-seq (LIGation of interacting RNA followed by high-throughput sequencing) can capture base-paired RNA-RNA interactions (Figure 2C) (170). In LIGR-seq, RNA duplexes are cross linked with the psoralen derivative 4'-aminomethyltrioxalen (AMT) along with UV irradiation at 365 nm, and RNase R is added to digest linear and structural RNAs. This step enriches for AMT-crosslinked RNA-RNA duplexes that are subsequently subjected to next-generation sequencing. Though LIGR-seq does not work well for small RNAs such as microRNA (miRNA), it should be able to uncover novel dynamic and long range interactions between lncRNAs and other RNA molecules. Various other methods have been developed to study the RNA interactome for lncRNAs with functional relevance in atherosclerosis including PARIS (Psoralen Analysis of RNA Interactions and Structures) (171), SPLASH (172), and MARIO (173). These techniques all can provide valuable information because many lncRNA sequence and structural motifs act as functional scaffolds in the assembly of RNA-protein complexes (17). However, it should be noted that many of these assays could be biased toward capturing stable interactions, while more transient and stimulation specific interactions may require some enrichment steps.
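The probe tiling used in RAP can be sketched in a few lines. This minimal example generates antisense DNA probes of 120 nucleotides tiled end to end across a target RNA, as described above. The step size, the handling of the 3' tail, and the function names are assumptions made for illustration; probe filtering (for repeats or off-target matches) and biotin labeling are not modeled.

```python
def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of an RNA (or DNA) sequence."""
    pairs = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

def tile_antisense_probes(target: str, probe_len: int = 120, step: int = 120):
    """Yield (start, probe) pairs covering the target from 5' to 3'."""
    if len(target) < probe_len:
        raise ValueError("target is shorter than one probe")
    starts = list(range(0, len(target) - probe_len + 1, step))
    # Add a final window flush with the 3' end so the tail is covered
    last = len(target) - probe_len
    if starts[-1] != last:
        starts.append(last)
    for start in starts:
        yield start, reverse_complement(target[start:start + probe_len])

# Toy 300-nt target; a real lncRNA sequence would be used in practice
target_rna = "ACGU" * 75
probes = list(tile_antisense_probes(target_rna))
for start, probe in probes:
    print(start, len(probe), probe[:12] + "...")
```

A smaller `step` would produce overlapping probes, which trades more oligonucleotides for more even coverage.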

In situ Hybridization-Based Methods

A critical consideration when interrogating the function of a given lncRNA is identifying its endogenous tissue and cellular localization. While many lncRNAs are expected to be cytosolic and contribute to post-transcriptional, translational or post-translational gene regulation, nuclear lncRNAs could participate in transcriptional regulation, chromatin structure or mRNA export mechanisms (17). RNA Fluorescence in situ Hybridization (FISH) has been a traditional method to identify the subcellular localization of RNA within cells; however, it lacks sensitivity for lowly expressed lncRNAs. Single-molecule RNA FISH (smFISH) is a quantitative technique that provides the sensitivity to detect these lncRNAs and measures absolute transcript levels by using multiple short probes per target RNA (174). However, given that smFISH relies heavily on the optical detection of a limited number of fluorophores, it is restricted in its multiplexing capacity. Attempts to overcome this issue include implementation of combinatorial labeling by spectral barcodes and the incorporation of sequential hybridizations (seqFISH) using different colored probes in each hybridization round (175, 176). In seqFISH, individual transcripts are imaged as different colored dots and quantified by counting the number of dots. Multiplexed error-robust fluorescence in situ hybridization (merFISH) is an in situ targeted approach that utilizes two-step labeling and the detection of binary barcodes assigned to specific targets. This is accomplished by several rounds of hybridization, imaging, and cleavage of fluorophores from probes conjugated to readout sequences that interchange each cycle. Hybridization to readout sequences by the merFISH technique is much less time consuming than methods that utilize hybridization directly to target RNAs (177, 178). RNA SPOTS (sequential probing of targets) follows the same rationale as merFISH, except that it is used in vitro instead of in situ (179).
While still in the nascent stage, the emergence of spatial transcriptomics facilitates integration of RNA-seq expression data with spatial locations of RNA molecules in individual tissue sections (180). In this procedure, fixed tissue samples are annealed to regionally barcoded reverse transcription primers. Following reverse transcription, RNA-seq followed by computational reconstruction allows the two-dimensional localization and quantification of RNA molecules (180). This barcoded method has already been applied to spatially resolve gene expression in the human adult heart (181). While this procedure was originally developed to study mRNAs, it shows promise for the spatial resolution of lncRNAs, given the increased sensitivity and ability to identify context-specific expression profiles. One consideration for atherosclerosis FISH experiments is that heterogeneous cell types in lesions may be impacted differently by various fixation and hybridization conditions, so careful titration of reagents is recommended.
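The sequential barcoding logic of seqFISH described above can be illustrated with a small sketch. With F fluorophore colors and N hybridization rounds, up to F^N distinct color sequences are available as barcodes. The example below assigns barcodes to a few transcripts and counts decoded dots. The color set, round number, transcript panel, and dot calls are all hypothetical; real pipelines also handle image registration and error correction, which are not shown.

```python
from collections import Counter
from itertools import product

COLORS = ("red", "green", "blue", "yellow")  # hypothetical fluorophore set
ROUNDS = 3                                   # hypothetical number of rounds

def build_codebook(transcripts):
    """Assign each transcript a unique color sequence across rounds."""
    barcodes = product(COLORS, repeat=ROUNDS)  # len(COLORS) ** ROUNDS barcodes
    codebook = {code: name for name, code in zip(transcripts, barcodes)}
    if len(codebook) < len(transcripts):
        raise ValueError("not enough barcodes for this panel")
    return codebook

def count_transcripts(dots, codebook):
    """Count dots per transcript; each dot is a tuple of one color per round."""
    counts = Counter()
    for colors in dots:
        name = codebook.get(tuple(colors))
        counts[name if name is not None else "undecoded"] += 1
    return counts

capacity = len(COLORS) ** ROUNDS  # 4 colors over 3 rounds
codebook = build_codebook(["MeXis", "LeXis", "STEEL"])
dots = [
    ("red", "red", "red"),
    ("red", "red", "green"),
    ("red", "red", "red"),
    ("yellow", "yellow", "yellow"),  # not in the codebook
]
counts = count_transcripts(dots, codebook)
print(capacity, dict(counts))
```

The exponential growth of `capacity` with the number of rounds is what lets sequential hybridization exceed the multiplexing limit of single-round smFISH.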

Other LncRNA Functions

Finally, there are various omics methods that can define the dynamics of lncRNA transcription, stability and RNA modifications. Nascent RNA sequencing analysis, including global nuclear run-on sequencing (GRO-seq) and precision run-on sequencing (PRO-seq) assays, could enable comprehensive detection of transient RNA transcriptional events for multiple RNA species, including mRNA, lncRNA, and eRNA (182). While most transcriptomic datasets capture steady-state levels of lncRNA transcripts, they do not provide direct insights into the stability of lncRNAs. BRIC-seq (5'-bromo-uridine immunoprecipitation chase-deep sequencing analysis) is a method that pulse-labels endogenous RNAs and employs next-generation sequencing to measure RNA decay over time (183). Total RNAs (including lncRNAs) can be isolated from cells at desired time points under various cell-specific perturbations to facilitate functional analysis of lncRNA stability (e.g., lncRNA related to CAD). For example, direct measurements of lncRNA stability in response to CRISPR based loss/gain of gene function or drug treatments could be examined. Another technique, ICE-seq (inosine chemical erasing coupled with sequencing) (184) represents a promising approach to globally identify lncRNA adenosine to inosine modifications (e.g., in the context of atherosclerosis). Adenosine to inosine (A-to-I) RNA editing is the most abundant form of RNA editing in humans and results from adenosine deaminase acting on RNA (ADAR). A-to-I editing is common to all lncRNAs and affects lncRNA function through altered stability and target recognition (185, 186). A-to-I RNA editing of mRNA has already been demonstrated to have important functional consequences in atherosclerosis. For example, A-to-I editing of cathepsin S mRNA (CTSS) is associated with cathepsin S levels in patients with atherosclerosis. 
Treatment of endothelial cells with inflammatory cytokines or exposure to hypoxia was shown to induce cathepsin S RNA editing and gene/protein expression (187).
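To make the BRIC-seq decay measurement described above more concrete, the sketch below estimates an RNA half-life from a labeled-RNA time course. It assumes simple first-order decay, N(t) = N0 · exp(−k·t), so that the half-life is ln(2)/k, and fits k by least squares on log-transformed abundances. The function name and the time-course values are hypothetical; published BRIC-seq analyses may use different normalization and decay models.

```python
import math

def fit_half_life(times_h, abundances):
    """Fit ln(abundance) = ln(N0) - k * t; return (k, half_life_in_hours)."""
    if len(times_h) != len(abundances) or len(times_h) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(a) for a in abundances]  # abundances must be positive
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = -sxy / sxx  # decay rate is the negative of the fitted slope
    if k <= 0:
        raise ValueError("no decay detected in this time course")
    return k, math.log(2) / k

# Synthetic time course generated with a 4 h half-life, for illustration only
times = [0, 2, 4, 8, 12]
labeled_rna = [100 * 0.5 ** (t / 4) for t in times]
k, t_half = fit_half_life(times, labeled_rna)
print(round(k, 4), round(t_half, 2))
```

Comparing the fitted half-life of a lncRNA between a perturbation (for example, a CRISPR-based knockdown or drug treatment) and its control is one way the stability changes mentioned above could be summarized.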

Novel Computational Tools for LncRNA Annotation and Functional Prediction

Genomic annotation of lncRNA sequences requires defining the precise genomic coordinates of lncRNA exons and their respective transcription start sites. LncRNA annotation also involves functional annotation with respect to predicted biological mechanisms, subcellular localization, and affected cell types/tissues. While lncRNAs share some similarities with mRNAs such as transcript length and splicing structure (188), proper identification and characterization of specific long non-coding transcripts remains a challenge. Unlike mRNAs, lncRNAs often exhibit lower stability, lower abundance, less splicing and greater nuclear localization (189). With the widespread application of high-throughput sequencing technologies, both automated and manual methods have been adopted to properly define lncRNA sequences from RNA-seq data. Automated annotation generates a larger catalog of lncRNAs and relies on transcriptome assembly, for which there are two distinct strategies. In one automated approach, reads are first aligned to the reference genome to reveal all the possible splicing events, which are subsequently assembled into transcripts (190, 191). In another automated approach, transcripts are built de novo from experimental reads and later aligned to a particular reference genome. Fu et al. (192) used both short and long sequencing reads to demonstrate superior sensitivity of transcript assembly and isoform annotation accuracy with the de novo approach. Automated assembly is fast as it does not require wet-lab based characterization, and it is considerably cheaper than the manual approach (144). Although it produces a smaller catalog of lncRNAs compared to the automated method, manual annotation produces higher quality lncRNA transcript sequences and thus improves functional characterization.
The widely adopted GENCODE project annotation of lncRNAs utilizes a manual curation approach (193) and integrates different sources of data together with computational analyses to generate a transcript model. cDNA and expressed sequence tag (EST) sequences deposited in publicly available databases are typically the starting point for manually annotating lncRNA transcripts. These are integrated with Cap Analysis of Gene Expression (5′-CAGE) and poly(A) position profiling by sequencing (3P-seq) to characterize 5′ and 3′ ends, respectively. These manually annotated transcripts are then mapped to reference genomes and assigned exon and splice site locations (119). The RefSeq (Reference Sequence) project also implements manual annotation of long non-coding RNAs that are integrated with automated methods (194). Manually annotated lncRNAs can be further divided into subclasses such as intergenic lncRNAs (lincRNAs), antisense lncRNAs, and intronic lncRNAs. As cDNA annotation depends on the availability of full length transcripts, manual annotation focuses primarily on genomic annotation. As a result, the manual approach produces a more comprehensive set of pseudogenes and alternatively spliced transcripts (193). Another comprehensive database, established in 2016, is NONCODE, which is dedicated to collecting lncRNAs through integration with other databases (e.g., RefSeq and Ensembl) and exhaustive annotation. Compared to these other databases, NONCODE has collected more lncRNA transcripts (excluding tRNAs and rRNAs) and provides unique annotations of lncRNAs (e.g., RNA secondary structure, expression in exosomes, associations between lncRNA and disease) (6, 195). NONCODE also provides lncRNAs for over 15 species including mouse, zebrafish, and C. elegans. The latest version of NONCODE (v5), which also captures lncRNAs from the literature, consists of nearly 550,000 annotated lncRNAs (195).
There is now an array of computational tools to annotate the sequences and functions of the expanding catalog of lncRNAs, as described in Table 2. Existing computational methods for lncRNA identification include those that require a reference genome and those that are reference-free. Examples of methods requiring a reference genome include UClncR, lncScore (205), COME (206), and lncRScan-SVM (207). Reference-free methods to identify lncRNAs from RNA-seq data include LncADeep (198), lncRNAnet (197, 208), FEELnc (197), longdist (204), lncRNA-MFDL (209), and CPC2 (210). Many of these tools employ artificial intelligence algorithms (e.g., machine learning, deep learning) to distinguish lncRNAs from their protein-coding transcript counterparts.
Table 2

Comparison of selected computational tools.

Tool name and reference* | Model# | Code available | Application | Performance results
AnnoLnc (196) | Statistical approach | http://annolnc.cbi.pku.edu.cn | Annotation of human lncRNAs. | Not reported.
FEELnc (197) | Random forest | https://github.com/tderrien/FEELnc | Annotation of lncRNAs. | High classification power (AUC = 0.97).
LncADeep (198) | Deep belief network built as a stack of restricted Boltzmann machines | https://github.com/cyang235/LncADeep | Identification and functional annotation of lncRNAs. | With 10-fold cross validation, average sensitivity of 98.1%, specificity of 97.2%, and an average harmonic mean of 97.7%.
LncFunTK (199) | Statistical approach | https://github.com/zhoujj2013/lncfuntk | To integrate ChIP-seq, CLIP-seq and RNA-seq data to predict, prioritize and annotate lncRNA functions. | Calculates a Functional Information Score (FIS) to quantitatively predict functional importance.
lncLocator (200) | Ensemble of support vector machine and random forest classifiers | http://www.csbio.sjtu.edu.cn/bioinf/lncLocator/ | To predict lncRNA subcellular localizations. | Accuracy of 59% for prediction.
PennDiff (201) | Regression-based statistical approach | https://github.com/tigerhu15/PennDiff | To detect differential transcript isoforms from RNA-seq data. | Based on both annotations (RefSeq and Ensembl), estimates from PennDiff have Spearman correlation coefficients of 0.87 and 0.76, respectively.
SEEKR (202) | Statistical approach | https://github.com/CalabreseLab | Prediction of lncRNA subcellular localization and protein interactors. | LncRNAs of related function have similar k-mer profiles, despite lacking linear sequence similarity.
UClncR (203) | Statistical approach | http://bioinformaticstools.mayo.edu/research/UClncR | Performs transcript assembly, prediction of lncRNA candidates in bulk RNA-seq data, and quantification and annotation of both known and novel lncRNA candidates. | For lincRNA prediction, UClncR reported 66 “novel” lincRNA transcripts and 12 lncRNAs overlapping with nearby genes (recall rate of 90.7%).
A support vector machine based method to distinguish long non-coding RNAs from protein transcripts (204) | Support vector machine | https://github.com/hugowschneider/longdist.py | To distinguish lncRNAs from protein coding transcripts. | 98.21% accuracy in classifying long non-coding RNAs from protein coding transcripts.

*Three of the publications have not been constructed into available tools but rather represent a framework for analysis.

#Model type does not include preprocessing, which may or may not include alignment of protein-coding regions.

Unlike protein functions that can be inferred from protein-coding sequences, it is more difficult to infer lncRNA function from RNA sequences. Zhou et al. developed a tool, lncFunTK that calculates a Functional Information Score (FIS) to quantitatively measure the functional importance of a lncRNA (199), based on the top Gene Ontology and inferred regulatory networks for lncRNAs and their neighboring genes. Another tool, FEELnc, annotates lncRNA function by evaluating neighboring genes to predict both lncRNA function and mRNA partners (197). Given that lncRNA function often depends on subcellular localization, the lncLocator tool predicts five lncRNA categories: nucleus, cytoplasm, cytosol, exosome, and ribosome (200). LncADeep provides enriched pathways and functional modules for lncRNA functional annotation by integrating KEGG and Reactome Pathway databases in a deep learning framework (211). A novel method for lncRNA classification is SEEKR, which counts lncRNA k-mer frequencies from nucleotide sequences, which may be correlated with lncRNA localization or protein binding (202). Like other ncRNAs, lncRNA functional annotation also depends on accurate secondary structure prediction. There are several computational tools to predict lncRNA structures such as pysster (212), RNAfold (213), RNAstructure (214), and UNAfold (215). Other tools are available to study the evolutionary conservation of lncRNAs, including EvoFold (216), Evolinc (217), RNAz (218), and SISSIz (219). Since lncRNAs are generally poorly conserved, there are strategies to examine remnants of protein coding sequences in these lncRNAs (220).
Identifying evolutionarily conserved lncRNA structures and binding domains may provide clues to predict lncRNA function for follow-up experimentation.
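The k-mer counting idea behind SEEKR mentioned above can be illustrated with a simplified sketch. The example below builds length-normalized k-mer frequency profiles for nucleotide sequences and compares them by Pearson correlation. It is not the SEEKR implementation: the normalization, the choice of k, the function names, and the toy sequences are assumptions made for illustration, and the published tool applies its own standardization against a reference set of transcripts.

```python
from itertools import product
from math import sqrt

def kmer_profile(seq: str, k: int = 3):
    """Length-normalized counts of every k-mer over the ACGT alphabet."""
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]

def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

# Toy sequences: a and b share k-mer content but differ position by position
seq_a = "ACGACGACGACGACG"
seq_b = "CGACGACGACGACGA"
seq_c = "TTTTTTTTTTTTTTT"
profile_a, profile_b, profile_c = (kmer_profile(s) for s in (seq_a, seq_b, seq_c))
print(pearson(profile_a, profile_b) > pearson(profile_a, profile_c))
```

Because the profile ignores where each k-mer occurs, two transcripts can score as similar without aligning to each other, which is the property that makes this kind of comparison attractive for poorly conserved lncRNAs.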

Conclusion

The emergence of RNA-seq and other omics technologies in the past decade has catalyzed the identification of a plethora of novel lncRNAs. To date, more than 30 lncRNAs with functional relevance to CAD have been characterized (Table 1), yet numerous lncRNAs linked to endothelial, smooth muscle, macrophage, and lipid traits remain to be studied in greater detail. With the growing number of CAD GWAS candidate loci harboring lncRNAs, and improved fine-mapping and annotation approaches, there is an opportunity to functionally dissect these regions to develop novel strategies to target non-coding genomic risk factors. As outlined in this review, a multi-faceted approach is likely required to successfully prioritize and study these lncRNAs, which may include implementation of long-read and high-depth sequencing and improved computational tools, coupled with orthogonal high-throughput experimental validation assays. Careful consideration of the lower abundance and context-specific expression of lncRNAs, together with thoughtful study designs, may improve chances of success in these multi-omics assays. However, it should also be noted that in many cases, more traditional and lower throughput approaches would be equally appropriate to characterize a given lncRNA, thus reducing the overall costs and required expertise. For conserved lncRNAs with predicted roles in altering CAD pathogenesis, loss of function studies can be performed in animal models, such as the mouse (ApoE−/− or LDLR−/− backgrounds) or zebrafish. However, with the majority of human lncRNAs being poorly conserved across species, they may be better suited to studies in primary human cells or induced pluripotent stem cell (iPSC) derived vascular cells.
In the context of CAD and other cardiometabolic disorders, genetic manipulation of lncRNAs via antisense oligonucleotides (221) or CRISPR/Cas9 to either delete (23, 222, 223) or activate/repress lncRNA expression, may lead to the identification of specific lncRNA binding partners, subcellular localization and functional insights relevant to CAD. lncRNA discovery/annotation can be further improved by integrating these genetic perturbations with high-dimensional transcriptomic and epigenomic assays (e.g., RNA-seq, ATAC-seq and ChIP-seq) to mark lncRNA promoters, decipher RNA polymerase and transcription factor binding, and reveal the dynamics of lncRNA regulatory activities. Single-cell based assays may also shed light on cell-specific markers and dynamics of lncRNAs across lineages (224, 225). Unraveling the complexity of lncRNA function in the setting of atherosclerosis may hold the key to delineate causal disease-associated pathways. In this regard it will also be important to determine whether lncRNAs operate synergistically, serve redundant and/or compensatory roles with other dysregulated lncRNAs and/or mRNAs associated with CAD.

Author Contributions

AT and CM conceived of the manuscript. AT, DW, MK, CD, MP, and CM wrote the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
