Hui Jiang1, Furong Wu2, Nannan Jiang3, Jiarong Gao3, Jiafu Zhang3. 1. Experimental Center of Clinical Research, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui 230031, P.R. China. 2. Department of Pharmacy, Anhui Provincial Hospital, Hefei, Anhui 230001, P.R. China. 3. Department of Pharmacy, The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, Anhui 230031, P.R. China.
Abstract
Hepatic fibrosis (HF), one of the leading global health problems, is defined as aberrant and excess production of extracellular matrix components. The pathogenesis of HF is complex and poorly understood. Long non‑coding RNAs (LncRNAs) can interact with microRNAs (miRNAs) as competing endogenous RNAs (ceRNAs) to regulate the expression of target genes, which play a significant role in the initiation and progression of HF. In the present study, the LncRNA‑associated ceRNA network was reconstructed based on LncRNA, miRNA and mRNA expression profiles that were downloaded from National Center for Biotechnology Information Gene Expression Omnibus. Bioinformatics assessments including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway analyses were performed with Database for Annotation, Visualization and Integration Discovery. The ceRNA network was composed of 220 LncRNA nodes, 24 miRNA nodes, 164 mRNA nodes and 1,149 edges. Functional assays identified that a total of 338 GO terms and 25 pathways, including regulation of cytokine and collagen, and the transforming growth factor‑β and Toll‑like receptor signaling pathways, were significantly enriched. In addition, 4 LncRNAs (NONMMUT036242, XR_877072, XR_378619 and XR_378418) were highly related to HF and thereby chosen as key LncRNAs. The present study uncovered a ceRNA network that could further the understanding of the mechanisms underlying HF development and provide potential novel markers for clinical diagnosis and targets for treatment.
Hepatic fibrosis (HF), one of the leading global health problems, is defined as aberrant and excess production of extracellular matrix components. The pathogenesis of HF is complex and poorly understood. Long non‑coding RNAs (LncRNAs) can interact with microRNAs (miRNAs) as competing endogenous RNAs (ceRNAs) to regulate the expression of target genes, which play a significant role in the initiation and progression of HF. In the present study, the LncRNA‑associated ceRNA network was reconstructed based on LncRNA, miRNA and mRNA expression profiles that were downloaded from National Center for Biotechnology Information Gene Expression Omnibus. Bioinformatics assessments including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway analyses were performed with Database for Annotation, Visualization and Integration Discovery. The ceRNA network was composed of 220 LncRNA nodes, 24 miRNA nodes, 164 mRNA nodes and 1,149 edges. Functional assays identified that a total of 338 GO terms and 25 pathways, including regulation of cytokine and collagen, and the transforming growth factor‑β and Toll‑like receptor signaling pathways, were significantly enriched. In addition, 4 LncRNAs (NONMMUT036242, XR_877072, XR_378619 and XR_378418) were highly related to HF and thereby chosen as key LncRNAs. The present study uncovered a ceRNA network that could further the understanding of the mechanisms underlying HF development and provide potential novel markers for clinical diagnosis and targets for treatment.
Hepatic fibrosis (HF), a reversible pathophysiological process, is characterized by an increased and altered deposition of extracellular matrix (ECM) proteins that results in excessive tissue scarring and promotes chronic liver injury (1,2). Due to its end-stage outcome (cirrhosis), it is responsible for high morbidity and mortality (3). Over the past several years, great efforts have been made to provide insight into the molecular mechanisms underlying HF (4,5). However, the precise molecular mechanisms still remain to be determined (6). Therefore, elucidating the pathogenesis of liver fibrosis and identifying novel treatment strategies, including potential biomarkers and therapeutic targets to combat HF, are of great importance.Accumulating evidence suggests that although they have no or little protein-coding capacity, various non-coding RNAs (ncRNAs) serve as master regulators that affect expression levels of dozens or even hundreds of target genes (7,8). Long (L)ncRNAs are defined as RNA molecules that are >200 nucleotides in length, transcribed by RNA polymerase II and lacking an open reading frame (9). LncRNAs have been found to be associated with the pathogenesis of different types of diseases by regulating diverse biological processes, such as transcription, splicing and translation (10–13). To date, >50,000 LncRNAs have been cloned and identified in the human genome; however, only a small number of these have been functionally annotated (14). Improved knowledge has suggested that LncRNAs can regulate microRNA (miRNA) abundance by binding and sequestering them, thereby regulating the expression of target mRNAs (15). Thus, it has been demonstrated that an efficient way to the infer potential function of LncRNAs is by studying associated miRNAs and/or mRNAs whose functions have been annotated. This association has been demonstrated in several diseases, but has not been studied in HF (16–18).The present study reconstructed a competitive endogenous RNA (ceRNA) network using the data from National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/) Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/), based on the ceRNA theory, a new RNA language that mRNAs, transcribed pseudogenes, and LncRNAs ‘talk’ to each other using miRNA response elements (19). Subsequently, Gene Ontology (GO; http://geneontology.org/) and Kyoto Encyclopedia of Genes and Genomes (KEGG; http://www.kegg.jp/) pathway analysis displayed that a number of processes overrepresented in HF were related to the regulation of cytokine and collagen, and the transforming growth factor-β (TGF-β) and Toll-like receptor (TLR) signaling pathways. Furthermore, the ceRNA network analysis identified that 4 LncRNAs (NONMMUT036242, XR_877072, XR_378619 and XR_378418) were highly related to HF, suggesting that these LncRNAs were key LncRNAs, and could be used as potential diagnosis markers and treatment targets. Thus, the present study may help expand the understanding of the possible mechanism of HF from the perspective of LncRNAs, and provide some new potential LncRNA markers for the clinical diagnosis and treatment of HF.
Materials and methods
Raw data
Mouse miRNA expression data were downloaded from NCBI GEO (GSE77271) (20), based on the SurePrint G3 Mouse miRNA Microarray. A total of 6 samples were included in the dataset: 3 samples were the control group and 3 samples were the liver fibrosis group, which was induced using carbon-tetrachloride (CCl4). Mouse LncRNA and mRNA expression data were also downloaded from NCBI GEO (GSE80601) (21), generated using an Affymetrix GeneChip Mouse Exon 1.0 ST Array. A total of 10 samples were included in this dataset: 5 samples were the control group and 5 samples were the liver fibrosis group, which was also induced by CCl4.
Gene chip probe re-annotation
Based on the LncRNA classification pipeline constructed in our previous study (22), a number of LncRNAs represented on the Affymetrix microarrays were identified. First, the latest version of the NetAffx™ Annotation File (MoEx-1_0-st-v1 Probeset Annotations, CSV Format, release 36; uploaded 7/6/16) was obtained from the Affymetrix official website (http://www.affymetrix.com/support/technical/byproduct.affx?product=moexon-st). This annotation file was mapped to the MoEx-1_0-st-v1 probe sets ID. Second, the probe sets were annotated in the Refseq database (https://www.ncbi.nlm.nih.gov/refseq/); those IDs beginning with ‘NR’ were retained, and transcript IDs labeled with ‘NP’ were deleted. Probe sets from the Ensembl database (https://www.ensembl.org/index.html) and the online software BioMart v95 (https://www.ensembl.org/info/data/biomart/index.html) were applied to convert Affymetrix microarray IDs to Ensembl IDs, together with the corresponding gene type, and only RNAs type which were annotated as ‘lincRNA’, ‘sense_intronic’, ‘processed_transcript’, ‘antisense’, ‘sense_overlapping’, ‘3prime_overlapping_ncrna’ or ‘misc_RNA’ were retained. The probe sets that annotated in Noncode v4 (http://www.noncode.org/) were all retained. Next, based on the above two steps, probe set IDs annotated as ‘microRNA’, ‘snoRNAs’, ‘pseudogenes’ and other small RNAs were removed.
Screening of differentially expressed LncRNAs, miRNAs and mRNAs
First, differential probes of GSE77271 and GSE80601 were selected with thresholds of fold change (FC) >1.5 and P<0.05 using GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/). Second, differential probes were matched to miRNAs in GSE77271, and LncRNAs and mRNAs in GSE80601. Then, the differentially expressed LncRNAs (DELs), miRNAs (DEMis) and mRNAs (DEMs) were identified.
Reconstruction ceRNA network
The LncRNA-miRNA-mRNA network was reconstructed based on ceRNA theory as described previously (7). DEL sequence were obtained from Refseq, Noncode and Ensemble databases, whereas DEMi sequences were obtained from miRBase (http://www.mirbase.org/). Then, the target LncRNAs of miRNAs were predicted by MiRanda (http://www.microrna.org, release 2010) (23). The target mRNAs of miRNAs were predicted by MiRanda and TargetScan v7.2 (http://www.targetscan.org) (24). mRNAs and LncRNAs that were targeted and negatively co-expressed with a certain common miRNA were identified as LncRNA-miRNA-mRNA co-expression competing triplets. The ceRNA network was reconstructed by assembling all co-expression competing triplets, and was visualized using Cytoscape v3.2.8 (25) software. Simultaneously, all node degrees of the LncRNA-miRNA-mRNA network were calculated.
Functional enrichment analysis
To assess functional enrichment, GO (26,27) and KEGG pathway (28,29) analyses of mRNAs in the ceRNA network were performed using Database for Annotation, Visualization, and Integration Discovery (v6.8; http://david.ncifcrf.gov/). Then, a false discovery rate (FDR) was calculated to correct the P-value, and FDR <0.05 was selected as the threshold.
Reconstruction of the key LncRNA-miRNA-mRNA sub-networks
In order to evaluate the key LncRNAs, several topological properties, such as the hub nodes and relationship pairs, were considered. First, the hub nodes of LncRNAs with node degree >5 were selected. Second, the first relationship pairs of LncRNA-miRNA and the secondary relationship pairs of miRNA-mRNA were calculated. The LncRNAs whose total relationship pairs were in the top 4 were selected as key LncRNAs. Then, key LncRNA-miRNA-mRNA sub-networks were visualized using Cytoscape software. For further analysis, GO and KEGG pathway annotations for each of the key LncRNAs were analyzed by using their original mRNA neighbors in the key LncRNA-miRNA-mRNA sub-networks.
Results
Identification of DELs, DEMis and DEMs
Based on the cut-off criteria, a total of 254 DELs (107 significantly up- and 147 downregulated), 25 DEMis (12 significantly up- and 13 downregulated) and 472 DEMs (308 significantly up- and 164 downregulated) in the GSE77271 and GSE80601 dataset were identified.
LncRNA-miRNA-mRNA network
To further understand how LncRNAs regulate mRNAs through interactions with miRNAs in HF, a ceRNA network was reconstructed based on the above data and visualized. As presented in Fig. 1, the ceRNA network was composed of 220 LncRNA nodes, 24 miRNA nodes, 164 mRNA nodes and 1,149 edges. Base on the LncRNA-miRNA-mRNA network, it was found that one LncRNA can interact with multiple miRNAs and mRNAs, one miRNA can interact with multiple LncRNAs and mRNAs, and one mRNA can interact with multiple LncRNAs and miRNAs; this indicated that the expression profiles of LncRNAs, miRNAs and mRNAs were significantly associated.
Figure 1.
LncRNA-miRNA-mRNA network. The arrowheads represent LncRNAs, rhombuses represent miRNAs, circles represent mRNAs; upregulated genes are in pink, downregulated genes are in green. LncRNA, long non-coding RNA; miRNAs microRNAs.
Functional enrichment analysis of LncRNAs based on the ceRNA network
To explore the biological functions of LncRNAs during the development of HF, GO and KEGG analyses were conducted. GO analysis revealed that a total of 338 GO terms, including platelet-derived growth factor binding, and regulation of cytokine and collagen, were significantly altered. The top 30 significantly GO terms are shown in Fig. 2A. KEGG pathway analysis indicated that 25 pathways were significantly enriched, including the TGF-β, TLR and peroxisome proliferator-activated receptor signaling pathways. The top 30 pathways were demonstrated in Fig. 2B.
Figure 2.
GO and KEGG pathway analysis of differentially expressed mRNAs. (A) Top 30 enriched GO terms. (B) Top 30 changes in KEGG pathway enrichment. Circles represent biological process, triangles represent cellular component and squares represent molecular function. The different colors from green to red represent the P-value. The deeper the color, the more significant the P-value. The different sizes of the shapes represent the gene count number. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Topological analysis of the ceRNA network
In general, a higher degree indicates that a node in the ceRNA network was a hub that participated in more ceRNA interactions. According to a previous study by Han et al (30), in which they defined a hub as a node degree >5, the present study found that 77 nodes could be chosen as hub nodes, including 43 LncRNAs, 24 miRNAs and 10 mRNAs. The results are shown in Fig. 3. In addition, the number of first relationship LncRNA-miRNA pairs and secondary relationship miRNA-mRNA pairs of hubs were calculated. The results were demonstrated in Table I. On the basis of total relationship pairs, 4 LncRNAs (NONMMUT036242, XR_877072, XR_378619 and XR_378418) were selected as the key LncRNAs.
Figure 3.
Node degree analysis of the LncRNA-miRNA-mRNA network. LncRNA, long non-coding RNA; miRNAs microRNAs.
Table I.
LncRNA-miRNA and miRNA-mRNA pairs of network hubs.
Number
Gene name
LncRNA-miRNA pairs
miRNA-mRNA pairs
Total number
1
NONMMUT036242
8
203
211
2
XR_877072
10
199
209
3
XR_378619
7
187
194
4
XR_378418
7
171
178
5
NONMMUT025112
8
161
169
6
NONMMUT069361
8
159
167
7
NONMMUT041965
7
152
159
8
NONMMUT039701
7
144
151
9
NONMMUT067035
7
133
140
10
NONMMUT030757
6
127
133
11
NONMMUT043221
6
113
119
12
NONMMUT010366
6
106
112
13
NONMMUT053659
6
87
93
14
Gm34394
12
77
89
15
Gm41622
11
77
88
16
Gm35562
11
77
88
17
NONMMUT022066
8
74
82
18
Gm29966
8
74
82
19
NONMMUT013707
8
72
80
20
NONMMUT051651
7
72
79
21
NONMMUT058428
7
66
73
22
NONMMUT033508
8
64
72
23
NONMMUT033142
8
63
71
24
Gm32861
6
60
66
25
NONMMUT051738
8
58
66
26
NONMMUT048637
8
56
64
27
NONMMUT010786
6
57
63
28
NONMMUT059817
4
58
62
29
NONMMUT050190
7
55
62
30
NONMMUT041236
8
54
62
31
1700001L05Rik
7
54
61
32
NONMMUT068223
6
54
60
33
NONMMUT064230
6
54
60
34
NONMMUT044605
6
53
59
35
NONMMUT057586
8
48
56
36
NONMMUT053611
6
46
52
37
NONMMUT036431
7
45
52
38
NONMMUT039492
6
43
49
39
NONMMUT042546
6
41
47
40
Gm32374
6
36
42
41
NONMMUT022800
6
27
33
42
NONMMUT045026
6
23
29
43
NONMMUT025285
6
11
17
LncRNA, long non-coding RNA; miRNAs, microRNAs.
Key LncRNA-miRNA-mRNA sub-networks
The new key LncRNA-mRNA sub-networks were reconstructed by extracting the key LncRNAs and their linked miRNAs and mRNAs from the triple global network. As presented in Fig. 4, the LncRNA NONMMUT036242-miRNA-mRNA sub-network was composed of 8 miRNA nodes, 112 mRNA nodes and 211 edges. The LncRNA XR_877072-miRNA-mRNA sub-network was composed of 10 miRNA nodes, 120 mRNA nodes and 220 edges. The LncRNA XR_378619-miRNA-mRNA sub-network was composed of 7 miRNA nodes, 109 mRNA nodes and 197 edges. The LncRNA XR_378418-miRNA-mRNA sub-network was composed of 7 miRNA nodes, 100 mRNA nodes and 178 edges.
Figure 4.
Key LncRNA-miRNA-mRNA sub-networks. (A) LncRNA NONMMUT036242, (B) LncRNA XR_877072, (C) LncRNA XR_378619 and (D) LncRNA XR_378418 sub-networks. The arrowheads represent LncRNAs, rhombuses represent miRNAs, circles represent mRNAs; upregulated genes are in pink, downregulated genes in green. LncRNA, long non-coding RNA; miRNAs microRNAs.
Functional prediction of key LncRNAs-miRNA-mRNA sub-networks
GO and KEGG pathway annotations were further performed for the 4 key LncRNAs. The results revealed that 335 GO terms and 21 pathways were significantly enriched in the LncRNA NONMMUT036242-miRNA-mRNA sub-network, 230 GO terms and 23 pathways were significantly enriched in the LncRNA XR_877072-miRNA-mRNA sub-network, 399 GO terms and 27 pathways were significantly enriched in the LncRNA XR_378619-miRNA-mRNA sub-network, 200 GO terms and 18 pathways were significantly enriched in the LncRNA XR_378418-miRNA-mRNA sub-network. The top 30 significant GO terms are shown in Fig. 5 and the top 30 enriched pathways shown in Fig. 6. It is notable that some common GO terms such as ‘regulation of cytokine-mediated signaling pathway’, ‘regulation of plasma lipoprotein particle levels’ and ‘collagen’, and some common pathways such as ‘Wnt signaling pathway, ‘TGF-β signaling pathway’ and ‘Toll-like receptor signaling pathway’ were enriched for all 4 sub-networks.
Figure 5.
GO analysis of the key LncRNA-miRNA-mRNA sub-networks. The top 30 significantly enriched GO terms of the (A) LncRNA NONMMUT036242, (B) LncRNA XR_877072, (C) LncRNA XR_378619 and (D) LncRNA XR_378418-miRNA-mRNA sub-networks. Circles represent biological process, triangles represent cellular component, square represent molecular function. The different colors from green to red represent the P-value. The deeper the color, the more significant the P-value. The different sizes of the shapes represent the gene count. GO, Gene Ontology; LncRNA, long non-coding RNA; miRNAs, microRNAs.
Figure 6.
KEGG pathway analysis of the key LncRNA-miRNA-mRNA sub-networks. Top 30 enriched KEGG pathways of the (A) LncRNA NONMMUT036242, (B) LncRNA XR_877072, (C) LncRNA XR_378619 and (D) LncRNA XR_378418-miRNA-mRNA sub-networks. The different colors from green to red represent the P-value. The deeper the color, the more significant the P-value. The different sizes of the shapes represent the gene count number. LncRNA, long non-coding RNA; miRNAs, microRNAs; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Discussion
HF is well recognized as a wound-healing response that occurs in liver following any type of acute or chronic injury (31–33). It is hypothesized that liver fibrosis is reversible; however, the resulting structural damage to liver lobules and vasculature leads to cirrhosis, which is irreversible (34–36). Therefore, there is an increasing requirement for diagnostic and prognostic biomarkers of HF.Increasing evidence has demonstrated that LncRNAs are involved in numerous complex diseases by communicating with mRNAs and miRNAs, and with each other (37). Compared with protein-coding genes, LncRNAs have clear advantages as diagnostic and prognostic biomarkers. Exploiting interactions between LncRNAs and mRNAs/miRNAs to LncRNA functional similarity is an effective method to explore function of LncRNAs (38–40). The present study, on the basis of the ceRNA hypothesis, used interaction data generated using datasets from GEO to reconstruct a triple global network. The LncRNA-miRNA-mRNA network consisted of 220 LncRNAs, 24 miRNAs, 164 mRNAs and 1,149 interactions.Furthermore, to investigate the functions of HF-related genes, these genes were annotated via GO and KEGG pathway analysis, and it was found that the most significant enriched terms and pathways were regulation of cytokine and collagen, and TGF-β and TLR signaling. It is well known that activated hepatic stellate cells (HSCs) serve a role in liver fibrosis (41). Several studies have demonstrated that TGF-β plays a critical role in HSC activation, and that the TGF-β signaling pathway could be a potential therapeutic target for HF (42–44). This classic signaling pathway is activated by TGF-β binding to its receptors located on the cell membrane. The downstream proteins, including Smad2 and Smad3, are activated by phosphorylation, which further promotes the transcription of genes encoding ECM components, thereby accelerating the development of liver fibrosis (45).Previous studies have demonstrated that hub nodes, characterized by their high degree of connectivity to other nodes, can be used as topological properties of the network to evaluate the importance of genes (46,47). In general, a LncRNA that has more edges indicates that the LncRNA is a hub, which participates in more ceRNA interactions. Thus, the LncRNA is essential in network organization and plays a critical role in a network. In the present study, according to the node degree and relationship pairs, 4 LncRNAs (NONMMUT036242, XR_877072, XR_378619 and XR_378418) were selected as the key LncRNAs, which may be used as potential novel biomarkers for the clinical diagnosis and treatment of HF.Several LncRNAs, which are altered during HF and could affect HSC activation, have already been reported, including liver fibrosis-associated LncRNA1 (21), metastasis-associated lung adenocarcinoma transcript 1 (48), LncRNA-p21 (49) and maternally expressed 3 (50). However, functional characterization of the 4 key LncRNAs (NONMMUT036242, XR_877072, XR_378619 and XR_378418) is still in its infancy. In order to infer the potential functions of LncRNAs, according to the ceRNA theory, the related miRNAs and/or mRNAs, whose functions have been annotated, can be studied. While the LncRNA-miRNA-mRNA could provide a global view of all possible ceRNA interactions that could be used to investigate the regulatory properties of the LncRNAs, the key LncRNA-miRNA-mRNA sub-network revealed a more detailed picture of how the key LncRNAs synergized with competing mRNAs (51). According to the key LncRNA-miRNA-mRNA sub-networks, the LncRNA XR_877072, a seldom reported LncRNA, interacted with numerous mRNAs, including CD74, discoidin domain receptor 2 and tissue inhibitor of metallopeptidases 2, which were highly associated with HF (52–54). In addition, this LncRNA could directly interact with a number of miRNAs, including miR-378a-3p and miR-212-3p, which have been verified as important factors in the development of HF. For example, Hyun et al (55) reported that expression of miR-378a-3p declined in mice models of liver fibrosis. In activated HSCs transfected with miR-378a-3p mimic, the profibrotic gene expression of Vimentin, α-smooth muscle actin and collagen Iα1 decreased. In addition, the levels of Gli3, a critical target gene of the Hedgehog signaling pathway that promotes HF (56,57), were also downregulated (55). Zhu et al (58) confirmed that overexpression of miR-212-3p can induce the expression of HSC markers, including α-SMA, and collagens by activating the TGF-β signaling pathway.In conclusion, based on the ceRNA theory, the present study constructed an HF-associated LncRNA-miRNA-mRNA network to explore the biological functions of LncRNAs during the development of HF at a system-wide level. Notably, 4 LncRNAs (NONMMUT036242, XR_877072, XR_378619 and XR_378418) were selected that participated in the most ceRNA interactions, and may play important roles in HF. The present study uncovered a ceRNA network that could further the understanding of the mechanisms underlying HF progression and provide novel potential markers for clinical diagnosis and targets for treatment. However, there were still certain limitations to the present study. First, the 4 key LncRNAs need to be verified in liver tissue specimens from patients with liver fibrosis via reverse transcription-quantitative PCR analysis. Second, further experiments are required to validate their effects and mechanisms in HF. Finally, the homology of the 4 key LncRNAs, which were identified based on expression in mice, needs to be validated to support their use as potential biomarkers and therapeutic targets in clinical settings.
Authors: Przemyslaw Szafranski; Avinash V Dharmadhikari; Erwin Brosens; Priyatansh Gurha; Katarzyna E Kolodziejska; Ou Zhishuo; Piotr Dittwald; Tadeusz Majewski; K Naga Mohan; Bo Chen; Richard E Person; Dick Tibboel; Annelies de Klein; Jason Pinner; Maya Chopra; Girvan Malcolm; Gregory Peters; Susan Arbuckle; Sixto F Guiang; Virginia A Hustead; Jose Jessurun; Russel Hirsch; David P Witte; Isabelle Maystadt; Neil Sebire; Richard Fisher; Claire Langston; Partha Sen; Paweł Stankiewicz Journal: Genome Res Date: 2012-10-03 Impact factor: 9.043