
Gene-gene interaction network analysis of hepatocellular carcinoma using bioinformatic software.

Jin-Hua He1, Ze-Ping Han1, Pu-Zhao Wu1, Mao-Xian Zou1, Li Wang1, Yu-Bing Lv1, Jia-Bin Zhou1, Ming-Rong Cao2, Yu-Guang Li1.   

Abstract

Information processing tools and bioinformatics software have markedly advanced the ability of researchers to process and analyze biological data. Data from the genomes of humans and model organisms help researchers to identify topics to study, which in turn improves predictive accuracy, facilitates the identification of relevant genes and simplifies the validation of laboratory data. The objective of the present study was to investigate the regulatory network constituted by long non-coding RNAs (lncRNAs), microRNAs (miRNAs) and mRNAs in hepatocellular carcinoma (HCC). Microarray data from HCC datasets were downloaded from The Cancer Genome Atlas database, and the Limma package in R was used to identify the differentially expressed genes (DEGs) between HCC and normal samples. Gene ontology enrichment analysis of DEGs was conducted using the Database for Annotation, Visualization, and Integrated Discovery. TargetScan, microcosm, miRanda, miRDB and PicTar were used to predict target genes. lncRNAs associated with HCC were probed using the lncRNASNP database, and a lncRNA-miRNA-mRNA regulatory network was visualized using Cytoscape. The present study identified 114 differentially expressed miRNAs and 2,239 differentially expressed mRNAs; of these, 725 were downregulated genes that were primarily involved in complement and coagulation cascades, fatty acid metabolism and butanoate metabolism, among others. The remaining 1,514 were upregulated genes principally involved in DNA replication, oocyte meiosis and homologous recombination, among others. Through the integrated analysis of associations between different types of RNAs and target gene prediction, the present study identified 203 miRNA-mRNA pairs, including 28 miRNAs and 170 mRNAs, and identified 348 lncRNA-miRNA pairs, containing 28 miRNAs. Therefore, based on the associations among lncRNAs, miRNAs and mRNAs, the present study screened out 2,721 regulatory associations. The data in the present study provide a comprehensive bioinformatic analysis of genes, functions and pathways that may be involved in the pathogenesis of HCC.

Keywords:  The Cancer Genome Atlas; differentially expressed genes; hepatocellular carcinoma; long non-coding RNA; mRNA; microRNA

Year:  2018        PMID: 29805571      PMCID: PMC5950030          DOI: 10.3892/ol.2018.8408

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Hepatocellular carcinoma (HCC) accounts for >90% of liver cancer cases, making it the most common hepatic cancer (1). HCC has recently become one of the most frequently occurring tumors worldwide and is considered to be among the most lethal cancer types, making up approximately one-third of all cancer cases (2). HCC progression is characterized by a series of sequential, complex steps. Recent research has focused on the role of protein-coding genes in HCC pathogenesis. However, only ~1% of the human genome encodes protein, whereas another 4–9% is transcribed into short or long RNAs with restricted protein-coding abilities (3). MicroRNAs (miRNAs/miRs) are small, endogenous RNAs that degrade mRNA or inhibit translation, thus regulating gene expression (4). The aberrant expression of miRNAs serves an essential role in the progression of HCC. As such, miRNA replacement or inhibition therapies have considerable potential as treatments for HCC (5). Long non-coding RNAs (lncRNAs) are untranslated RNA transcripts >200 nucleotides in length that share many of the structural characteristics of mRNAs, including a poly(A) tail, 5′-cap and a promoter structure, but have no conserved open reading frames (6,7). Numerous lncRNAs are expressed in a temporal- and tissue-specific manner during development, exhibiting a range of different splicing patterns. lncRNAs epigenetically regulate the expression of protein-coding genes, exerting a strong effect on a number of different cellular processes, including those involved in the pathogenesis of multiple human cancer types (8,9). Non-coding RNAs (ncRNAs) have been suggested to form a regulatory interaction network; specifically, miRNAs interact with both mRNAs and lncRNAs, imposing an additional level of post-transcriptional regulation (10).
In the present study, mRNA and miRNA expression data of HCC were downloaded from The Cancer Genome Atlas (TCGA) and a lncRNA-miRNA-mRNA regulatory network was constructed. In total, five miRNA target gene prediction databases (TargetScan, microcosm, miRanda, miRDB and PicTar) were integrated as the support dataset of miRNA and mRNA pairs. lncRNA-miRNA information in the lncRNASNP database was also used as the support dataset of lncRNA and miRNA pairs. The lncRNA, miRNA and mRNA regulatory networks in HCC were analyzed with bioinformatics software to provide a theoretical basis for the molecular mechanism of HCC pathogenesis.

Materials and methods

Gene expression profiles

The gene expression dataset ‘batch8_9’ was downloaded from the TCGA project webpage (https://tcga-data.nci.nih.gov/tcga/). This set included 212 HCC samples and 50 matched non-cancerous samples for miRNA analysis, as well as 269 HCC samples and 51 matched non-cancerous samples for mRNA analysis. Data levels are sorted by data type, platform and center in TCGA. The downloaded data consisted of levels 1–4, of which level 3 (segmented or interpreted data) was used for further study. The median was calculated to standardize the original data.

Screening differentially expressed mRNA and miRNA

DEGs were analyzed using the DESeq (11) package in R and by paired Student's t-test. Differentially expressed mRNAs between the two groups of samples were analyzed using DESeq, which models read counts with a negative binomial distribution. The edgeR (12) package in R and a t-test were used to analyze differentially expressed miRNAs. The thresholds were an adjusted P-value (p.adj) of 0.05 and an absolute log2 fold change of 1. On the basis of the differentially expressed miRNAs and mRNAs, a correlation analysis was conducted on each pair of significantly differentially expressed miRNA and mRNA.
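The screening rule above can be illustrated with a minimal Python sketch. It is not the authors' R code: the gene names, fold changes and P-values are invented, and the choice of a strict `<` for p.adj and `>=` for the fold change is an assumption, since the text only gives the cut-off values. The sketch applies both thresholds to the output of two methods and keeps the genes called by both.

```python
# Hypothetical per-gene results from two tests: gene -> (log2 fold change, adjusted P-value).
# All values are invented for illustration; they are not data from the study.
deseq_results = {
    "GENE_A": (2.4, 0.001),
    "GENE_B": (-1.6, 0.020),
    "GENE_C": (0.4, 0.001),   # fails the fold-change threshold
    "GENE_D": (1.8, 0.200),   # fails the significance threshold
}
ttest_results = {
    "GENE_A": (2.1, 0.004),
    "GENE_B": (-1.2, 0.030),
    "GENE_C": (1.3, 0.010),
    "GENE_D": (1.9, 0.010),
}

PADJ_CUTOFF = 0.05    # adjusted P-value threshold from the text
LOG2FC_CUTOFF = 1.0   # absolute log2 fold change threshold from the text


def significant(results, padj_cutoff=PADJ_CUTOFF, lfc_cutoff=LOG2FC_CUTOFF):
    """Return genes with p.adj below the cutoff and |log2FC| at or above the cutoff."""
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if padj < padj_cutoff and abs(log2fc) >= lfc_cutoff
    }


def consensus_degs(first, second):
    """Keep genes called by both methods; split by the sign of log2FC in `first`."""
    shared = significant(first) & significant(second)
    up = sorted(g for g in shared if first[g][0] > 0)
    down = sorted(g for g in shared if first[g][0] < 0)
    return up, down


up_genes, down_genes = consensus_degs(deseq_results, ttest_results)
print("upregulated:", up_genes)
print("downregulated:", down_genes)
```

Taking the intersection is stricter than either method alone, which is consistent with the consensus gene set in the Results being smaller than the set reported by each individual test.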

Function analysis of DEGs

Gene Ontology (GO) (13) enrichment analysis of DEGs was conducted using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (14). A P-value cut-off of <0.05 was set as the screening condition.
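As a rough illustration of what an enrichment P-value measures, the sketch below computes a plain hypergeometric upper-tail probability in Python. This is an assumption-laden stand-in: DAVID reports a modified Fisher exact P-value (the EASE score), which is more conservative than the value computed here, and the counts in the example are invented.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N background genes,
    K of which carry the annotation of interest.

    k: annotated genes observed in the DEG list
    n: size of the DEG list
    K: annotated genes in the background
    N: size of the background
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    total = comb(N, n)
    # comb(a, b) returns 0 when b > a, so impossible terms drop out automatically.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Invented example: 20 background genes, 5 annotated, 4 DEGs, 2 of them annotated.
p_value = hypergeom_upper_tail(k=2, n=4, K=5, N=20)
print(f"enrichment P-value: {p_value:.4f}")
```

A term would pass the screening condition in the text when its P-value falls below 0.05; the invented example above does not.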

Integrated analysis of miRNA target gene prediction database

Integrated analysis of five miRNA target gene prediction databases, including TargetScan, microcosm, miRanda, miRDB and PicTar (15–18), was conducted to identify the number of miRNA-regulated target gene pairs. The correlation coefficient threshold was set at −0.3 and the significance threshold (p.adj) was set at 0.05. Pairs supported by two or more databases were retained and analyzed statistically.
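The two filters described above (support from at least two databases, and a sufficiently negative, significant correlation) can be sketched as set operations. The following Python example is a hedged illustration only: the miRNA and gene names, the predictions and the correlation values are hypothetical, and treating the thresholds as `r <= -0.3` and `p.adj < 0.05` is an assumption.

```python
from collections import Counter

# Hypothetical predictions: database name -> set of (miRNA, target gene) pairs.
predictions = {
    "TargetScan": {("miR-X", "GENE_1"), ("miR-X", "GENE_2")},
    "microcosm": {("miR-X", "GENE_1")},
    "miRanda": {("miR-Y", "GENE_3"), ("miR-X", "GENE_2")},
    "miRDB": {("miR-Y", "GENE_4")},
    "PicTar": set(),
}

# Hypothetical expression correlations: pair -> (correlation coefficient, adjusted P-value).
correlations = {
    ("miR-X", "GENE_1"): (-0.45, 0.001),
    ("miR-X", "GENE_2"): (-0.10, 0.300),
    ("miR-Y", "GENE_3"): (-0.50, 0.002),
}


def supported_pairs(db_predictions, min_support=2):
    """Return pairs predicted by at least `min_support` databases."""
    votes = Counter(pair for pairs in db_predictions.values() for pair in pairs)
    return {pair for pair, count in votes.items() if count >= min_support}


def anticorrelated_pairs(corr, r_cutoff=-0.3, padj_cutoff=0.05):
    """Return pairs whose correlation is at or below the cutoff and significant."""
    return {
        pair
        for pair, (r, padj) in corr.items()
        if r <= r_cutoff and padj < padj_cutoff
    }


# A pair is kept only if it passes both filters.
candidate_pairs = supported_pairs(predictions) & anticorrelated_pairs(correlations)
print(sorted(candidate_pairs))
```

Here only one pair survives: the second pair has database support but a weak correlation, and the third is strongly anticorrelated but predicted by a single database.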

Association analysis and regulatory network visualization of lncRNA, miRNA and mRNA

The regulatory association between lncRNAs and miRNAs was explored using lncRNASNP (19,20). Regulatory network visualization for the regulatory associations between miRNAs and mRNAs, and between lncRNAs and miRNAs, was conducted using Cytoscape (21).
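One plausible way to combine the two pair lists into a lncRNA-miRNA-mRNA network is to join them on the shared miRNA, then write the edges in Cytoscape's Simple Interaction Format (SIF), in which each line holds a source node, an interaction type and a target node. The Python sketch below assumes this join; the source does not state how the associations were assembled, and the node names and interaction labels are hypothetical.

```python
from collections import defaultdict

# Hypothetical input pairs; the study used lncRNASNP and the filtered miRNA-mRNA pairs.
lncrna_mirna = [("lnc-A", "miR-X"), ("lnc-B", "miR-X"), ("lnc-C", "miR-Z")]
mirna_mrna = [("miR-X", "GENE_1"), ("miR-X", "GENE_2"), ("miR-Y", "GENE_3")]


def join_on_mirna(lnc_pairs, mrna_pairs):
    """Return (lncRNA, miRNA, mRNA) triples that share the same miRNA."""
    targets = defaultdict(list)
    for mirna, mrna in mrna_pairs:
        targets[mirna].append(mrna)
    return [
        (lnc, mirna, mrna)
        for lnc, mirna in lnc_pairs
        for mrna in targets.get(mirna, [])
    ]


def to_sif_lines(triples):
    """Convert triples into de-duplicated, tab-delimited SIF edge lines."""
    edges = set()
    for lnc, mirna, mrna in triples:
        edges.add((lnc, "lncRNA-miRNA", mirna))   # interaction labels are illustrative
        edges.add((mirna, "miRNA-mRNA", mrna))
    return ["\t".join(edge) for edge in sorted(edges)]


triples = join_on_mirna(lncrna_mirna, mirna_mrna)
sif_lines = to_sif_lines(triples)
print(len(triples), "regulatory associations")
print("\n".join(sif_lines))
```

Because every lncRNA bound to a miRNA is paired with every mRNA target of that miRNA, the number of triples can exceed the number of input pairs. In the toy data, two lncRNA-miRNA pairs and two miRNA-mRNA pairs sharing miR-X yield four associations.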

Results

Screening differentially expressed mRNA

Differential expression analysis was used to compare gene expression levels between 51 human non-cancerous samples and 269 HCC samples. The original data were preprocessed and the expression data of 16,489 genes were obtained. DESeq identified 2,695 differentially expressed mRNAs. A total of 687 upregulated genes and 1,515 downregulated genes were identified as being significantly differentially expressed. The t-test identified 2,803 differentially expressed mRNAs. A total of 2,239 differentially expressed mRNAs were identified by both DESeq and the t-test (Fig. 1A), including 1,514 upregulated genes and 725 downregulated genes. edgeR identified 135 differentially expressed miRNAs, and the t-test identified 134. A total of 114 differentially expressed miRNAs were identified by both methods, 94 of which were upregulated and 20 of which were downregulated (Fig. 1B).
Figure 1.

Distribution of differentially expressed genes. (A) Differentially expressed mRNAs identified by DESeq and a t-test. (B) Differentially expressed miRNAs identified by edgeR and a t-test. RPM, read counts per million.

DEG functions and pathways

The GO enrichment bar chart directly reflects the distribution of DEGs for each enriched GO term with regard to biological process (BP), cellular component (CC) and molecular function (MF) (13). GO enrichment analysis of DEGs was conducted using DAVID, and 10 significantly enriched BP terms, 10 significantly enriched CC terms and 10 significantly enriched MF terms were selected to form the bar chart (Fig. 2). KEGG enrichment analysis (22) of DEGs was conducted using DAVID, with P<0.05 set as the screening threshold. The results showed that DEGs are mainly involved in the complement and coagulation cascades, fatty acid metabolism and butanoate metabolism (Fig. 3A). A pie chart showing the enriched pathways is presented in Fig. 3B.
Figure 2.

Distribution of differentially expressed gene families in significantly enriched gene ontology terms. BP, biological process; CC, cellular component; MF, molecular function.

Figure 3.

Differentially expressed genes in a significantly enriched KEGG pathway. (A) Bar graph depicting KEGG pathway enrichment. (B) Pie chart depicting KEGG pathway enrichment. KEGG, Kyoto Encyclopedia of Genes and Genomes; PPAR, peroxisome proliferator-activated receptor.

Correlation analysis of miRNA and mRNA

Correlation analysis was conducted on each significantly differentially expressed miRNA and mRNA, based on the analysis of differentially expressed miRNAs and mRNAs. As miRNAs contribute to cell proliferation, differentiation, apoptosis and other processes by inhibiting the translation of mRNA or degrading it (5), negatively associated miRNA and mRNA pairs were selected. In total, 3,133 miRNA and mRNA pairs were identified for the analysis of frequency distribution for the correlation of miRNA and mRNA (Fig. 4).
Figure 4.

Frequency distribution diagram of microRNA and mRNA correlation coefficients.
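The correlation step can be sketched as follows. The source does not name the correlation measure, so the use of the Pearson coefficient here is an assumption, and the expression values are invented. The sketch computes the coefficient for one miRNA-mRNA pair across matched samples and checks whether it is negative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length with at least 2 samples")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Invented expression values for one miRNA and one mRNA across five matched samples.
mirna_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna_expression = [10.0, 8.0, 6.0, 4.0, 2.0]

r = pearson(mirna_expression, mrna_expression)
is_negative_pair = r < 0
print(f"r = {r:.2f}, negatively associated: {is_negative_pair}")
```

Repeating this for every differentially expressed miRNA-mRNA pair and collecting the coefficients would give a distribution of the kind summarized in Fig. 4.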

Analysis of miRNA target gene database

Integrated analysis of five miRNA target gene prediction databases (TargetScan, microcosm, miRanda, miRDB and PicTar) was conducted and 7,099,255 miRNA-regulated target gene pairs were identified. The pairs supported by two or more databases were retained, and totaled 701,418; these pairs were then analyzed statistically (Fig. 5). Integrated analysis was then conducted for the pairs that met the conditions of the correlation analysis and whose interaction was supported by at least two databases. A total of 203 pairs were identified in the intersection of these two sets, including 28 miRNAs and 170 mRNAs.
Figure 5.

Integrated analysis of correlation and target gene prediction association.

miRNA and mRNA regulatory network visualization

In order to display the regulatory network more clearly, Cytoscape was used to visualize the 203 pairs, which included 28 miRNAs and 170 mRNAs (Fig. 6). Each node of the Cytoscape network represents a gene, miRNA or other data point. An edge between nodes represents an interaction between these biomolecules (17).
Figure 6.

MicroRNA-mRNA regulatory network.

Association analysis of lncRNA, miRNA and mRNA

The lncRNASNP database is a single nucleotide polymorphism (SNP) database that records human lncRNAs and contains SNP information on lncRNAs, structural changes caused by SNPs and the binding sites of lncRNAs and miRNAs. The database contains 72,000 lncRNA and miRNA pairs in total, including 5,118 experimentally validated pairs. The naming method used in the LNCipedia database, which provides comprehensive annotation of the sequence and structure of human lncRNAs, was also used to name lncRNAs in the lncRNASNP database. LNCipedia 3.0 contains 113,513 annotated human lncRNAs (19,20). Regulatory associations were screened according to the corresponding associations between lncRNAs and miRNAs, and between miRNAs and mRNAs; a total of 2,721 regulatory associations were identified. Cytoscape was used to visualize the regulatory network formed by these associations between lncRNAs, miRNAs and mRNAs, which included 12 miRNAs, 113 mRNAs and 199 lncRNAs (Fig. 7).
Figure 7.

lncRNA-miRNA-mRNA regulatory network. lncRNA, long non-coding RNA; miRNA, microRNA.

Discussion

The TCGA project was initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) in 2006, and includes rich clinical data and large-scale data of mRNA and miRNA expression profiles, copy number variations, DNA methylation and mutations. Full exploitation of the TCGA database may lay the foundation for future studies (23). In order to investigate the molecular mechanism of HCC pathogenesis, the gene expression profiles of HCC in the TCGA database were downloaded. Of the selected 2,239 DEGs, 1,514 were upregulated genes, including ANGPT1, CEBPA-AS1, ARRDC2, FAM213B, SAMD1 and PCNXL3, and 725 were downregulated genes, including PLA2G16, PXMP2, CDNF, GOT2, NR0B2 and DCXR. The upregulated genes were separately extracted and underwent KEGG pathway enrichment analysis using DAVID. Multiple genes, such as E2F1, E2F2, E2F5 and DBF4, are involved in the cell cycle. LIG1, POLA1, POLA2, MCM2, MCM3 and RNASEH2A are involved in DNA replication. CDK1, ADCY6, SGOL1, PKMYT1, ESPL1 and AURKA are involved in progesterone-mediated oocyte meiosis. These data suggest that alterations to these functions and pathways are the primary cause of tumorigenesis in HCC.
miRNAs regulate the expression of target genes post-transcriptionally via complementarity between the miRNA seed sequence and the target mRNA (24). miRNAs present tissue-specific expression, have numerous target genes, may regulate multiple protein-coding genes and can be involved in multiple molecular pathways associated with tumor evolution and progression, which are associated with multiple cancer types (25). The present study identified 114 differentially expressed miRNAs using TCGA, including hsa-mir-183, hsa-mir-552, hsa-mir-184, hsa-mir-1269, hsa-mir-217, hsa-mir-196a-1, hsa-mir-200c, hsa-mir-190b, hsa-mir-224 and hsa-mir-541. miRNAs can post-transcriptionally regulate the expression of their target genes through their incomplete complementarity with the 3′ untranslated region of target mRNAs (4).
On the basis of the analysis of differentially expressed miRNAs and mRNAs, negative correlation analysis was conducted on each significantly differentially expressed miRNA and mRNA. Through the integrated analysis of target gene prediction databases, 203 pairs, including 28 miRNAs and 170 mRNAs (Fig. 5), were identified, such as hsa-miR-101-5p and EZH2, hsa-miR-454-3p and ESR1, hsa-miR-101-3p-5p and CBFA2T2, hsa-miR-93-5p and ESR1, and hsa-mir-101-3P and DNMT3. miR-125b was reported to exert its hepatic tumor-suppressive effect by suppressing the expression of the oncogene LIN28B, suggesting it could have a therapeutic application (26). miR-375 regulates the YY1-associated protein 1 oncogene, meaning it could have a therapeutic role in HCC treatment (27). In addition, miR-375 targets AEG-1 in HCC and suppresses liver cancer cell growth in vitro and in vivo, which underlines the potential for the therapeutic use of miR-375 in the treatment of HCC (28). The present results suggest that the dysregulated expression of miRNAs may alter the expression of functional genes and the post-transcriptional regulation of HCC, cell proliferation, apoptosis, differentiation and metastasis. lncRNAs can regulate the expression of genes via transcriptional regulation, chromatin modification, post-transcriptional processing, genomic imprinting and the regulation of protein functions (29). Recent studies have suggested that lncRNAs may exert functions by targeting miRNAs (30,31). In the present study, the regulatory associations were screened according to the corresponding associations of lncRNA and miRNA, and of miRNA and mRNA, and the resulting network was visualized. The regulatory network included 12 miRNAs, 113 mRNAs and 199 lncRNAs, an example being the association between the lncRNA lnc-AC073263, the miRNA hsa-miR-182-5p and the mRNA BDNF (Fig. 7A).
Numerous studies have suggested that lncRNAs and miRNAs may competitively reduce the expression of mRNAs, which may cause tumor evolution and progression (32–34). For example, HULC may downregulate a number of miRNAs, including miR-372. Inhibition of miR-372 reduces the translational repression of its target gene, PRKACB, which allows it to be expressed and allows its protein product to phosphorylate CREB (35). The interactions between lncRNAs, miRNAs and mRNAs, and their role during tumorigenesis, can provide novel insights into the regulatory mechanisms that underlie several classes of important ncRNAs in cancer. The present study established a novel HCC network to facilitate data mining. In addition, the present study conducted functional module analysis within the network. However, further analyses are required to unravel the roles and mechanisms of the identified RNAs in the tumorigenesis and development of HCC. In conclusion, the present data provide a comprehensive bioinformatic analysis of genes and pathways that may be involved in the pathogenesis of HCC.
References (35 in total)

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  MicroRNA targeting specificity in mammals: determinants beyond seed pairing.

Authors:  Andrew Grimson; Kyle Kai-How Farh; Wendy K Johnston; Philip Garrett-Engele; Lee P Lim; David P Bartel
Journal:  Mol Cell       Date:  2007-07-06       Impact factor: 17.970

3.  The long noncoding RNA CHRF regulates cardiac hypertrophy by targeting miR-489.

Authors:  Kun Wang; Fang Liu; Lu-Yu Zhou; Bo Long; Shu-Min Yuan; Yin Wang; Cui-Yun Liu; Teng Sun; Xiao-Jie Zhang; Pei-Feng Li
Journal:  Circ Res       Date:  2014-02-20       Impact factor: 17.367

Review 4.  Characteristics of long non-coding RNA and its relation to hepatocellular carcinoma.

Authors:  Jin-Lan Huang; Lei Zheng; Yan-Wei Hu; Qian Wang
Journal:  Carcinogenesis       Date:  2013-12-02       Impact factor: 4.944

Review 5.  RNA in unexpected places: long non-coding RNA functions in diverse cellular contexts.

Authors:  Sarah Geisler; Jeff Coller
Journal:  Nat Rev Mol Cell Biol       Date:  2013-10-09       Impact factor: 94.444

6.  C-Myc-activated long noncoding RNA CCAT1 promotes colon cancer cell proliferation and invasion.

Authors:  Xiaolu He; Xueming Tan; Xiang Wang; Heiying Jin; Li Liu; Limei Ma; Hong Yu; Zhining Fan
Journal:  Tumour Biol       Date:  2014-09-04

7.  Differential expression analysis for sequence count data.

Authors:  Simon Anders; Wolfgang Huber
Journal:  Genome Biol       Date:  2010-10-27       Impact factor: 13.583

8.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

9.  The microRNA.org resource: targets and expression.

Authors:  Doron Betel; Manda Wilson; Aaron Gabow; Debora S Marks; Chris Sander
Journal:  Nucleic Acids Res       Date:  2007-12-23       Impact factor: 16.971

10.  Reciprocal regulation of PCGEM1 and miR-145 promote proliferation of LNCaP prostate cancer cells.

Authors:  Jin-Hua He; Jing-Zhi Zhang; Ze-Ping Han; Li Wang; Yu Bing Lv; Yu-Guang Li
Journal:  J Exp Clin Cancer Res       Date:  2014-09-10
