Literature DB >> 31402962

Discovery of core genes in colorectal cancer by weighted gene co-expression network analysis.

Cun Liao1, Xue Huang2, Yizhen Gong1, Qiuning Lin3.   

Abstract

The aim of the present study was to investigate the interactions among messenger RNAs (mRNAs), microRNAs (miRNAs), and long noncoding RNAs (lncRNAs) in colorectal cancer (CRC), in order to examine its underlying mechanisms. The raw gene expression data was downloaded from the Gene Expression Omnibus (GEO) database. An online tool, GEO2R, which is based on the limma package, was used to identify differentially expressed genes. The co-expression between lncRNAs and mRNAs was identified utilizing the weighted gene co-expression analysis package of R to construct a coding non-coding (CNC) network. The function of the genes in the CNC network was determined by performing Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathways enrichment analysis. The interactions among miRNAs, mRNAs and lncRNAs were predicted using Lncbase and mirWalk to construct the competing endogenous RNA (ceRNA) network. The expression of the genes involved in the ceRNA network was further validated in The Cancer Genome Atlas dataset. A total of 3,183 dysregulated mRNAs, 78 dysregulated miRNAs and 2,248 dysregulated lncRNAs were screened in two GEO datasets. Combined with the results of the dysregulated genes, 169 genes were selected to construct the CNC network. 'p53 signaling pathway' and the 'cell cycle' were the most significant enriched pathways in the genes involved in the CNC network. Finally, a validated ceRNA network composed of 2 lncRNAs (MIR22HG and RP11-61I13.3), 5 miRNAs (hsa-miR-765, hsa-miR-198, hsa-miR-125a-3p, hsa-miR-149-3p and hsa-miR-650) and 5 mRNAs (ANK2, BTK, GBP2, PCSK5 and PDK4) was obtained. In conclusion, MIR22HG may regulate PCSK5, BTK and PDK4, and RP11-61I13.3 may regulate the ANK2, GBP2, PCSK5 through sponging miRNAs to act on the progression of CRC, and the potential function of these genes have been revealed. However, the diagnostic and prognostic value of these genes requires further validation.

Entities:  

Keywords:  The Cancer Genome Atlas; bioinformatics analysis; colorectal cancer; competing endogenous RNA; weighted gene co-repression analysis

Year:  2019        PMID: 31402962      PMCID: PMC6676736          DOI: 10.3892/ol.2019.10605

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Colorectal cancer (CRC) is one of the most frequently diagnosed cancers of the digestive system. It was estimated that there would be 97,220 new cases and 50,630 CRC-related deaths in the United States in 2018 (1). Much of the rising burden is attributable to population growth and aging, as well as sociodemographic changes. Recurrence is the major cause of CRC-related death (2). Despite the diagnosis and treatment of CRC, the survival of the patient is closely associated to the stage of the tumor at diagnosis, and 40–50% of patient mortality is due to distant metastasis (3,4). Considering the enormous threat of CRC to human health, new diagnostic and therapeutic approaches are required for early cancer detection and effective treatment (5). In order to find a more effective treatment for CRC, a more thorough understanding of its pathogenesis is important. Previously, there has been increasing evidence that microRNAs (miRNAs) are functional in cancer progression by downregulating their targets, including mRNA, long non-coding RNA (lncRNA), circular RNA and pseudogenes (6–8). Nevertheless, overexpression of these targets could abolish the downregulatory effect of miRNA in turn (9–11). Moreover, more than one miRNA target could compete with miRNAs as competing endogenous RNA (ceRNA) and overexpression of one ceRNA could indirectly upregulate other ceRNAs (12,13). ceRNA crosstalk was first proposed by Poliseno et al (14). Phosphatase and tensin homolog pseudogene 1 (PTENP1) could upregulate PTEN by sponging their common miRNAs. It was subsequently confirmed that various genes could participate in the development of cancer through ceRNA mechanisms (15). In breast cancer, overexpression of NEAT1 induced by BRCA1-deficiency can silence hsa-miR-129-5p and upregulate WNT4, a target of hsa-miR-129-5p, which leads to activation of oncogenic WNT signaling (16). Liang et al (17) reported that the oncogenic functions of lncRNA H19 in CRC may be attributed to its ceRNA activity to sequester miR-138 and miR-200a and therefore, upregulate the expression levels of VIM, ZEB1 and ZEB2, the critical genes involved in epithelial mesenchymal transition. With the development of gene sequencing technology, it has been shown that the expression of various lncRNAs is either upregulated or downregulated in CRC tissues to regulate several signaling pathways, such as the Wnt, p13K/Akt and Ras pathways, through ceRNA crosstalk (18,19). Therefore, it has been proven that ceRNA crosstalk is a critical mechanism for the complex pathogenesis and multi-step development of CRC, which may be a potential starting point for the development of novel CRC treatment methods, which deserves further study. The aim of the present study, was to identify a ceRNA network between non-coding RNAs and coding genes in CRC and to reveal the potential mechanism of CRC progression, which may aid in establishing a novel clinical CRC treatment system.

Materials and methods

Gene datasets and clinical information

Raw gene expression data of the GSE109454 (20) and GSE41655 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE41655) datasets were downloaded from the National Center of Biotechnology Information Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/gds/). In the GSE109454 dataset, tumor samples and adjacent non-tumor samples were obtained from 6 patients with CRC. Clinical information of these CRC patients from the original study is shown in Table I. In the GSE41655 dataset, there were 15 normal colorectal samples, 59 colorectal adenoma samples and 33 colorectal adenocarcinoma samples. Clinical information of these 107 cases from the original study is shown in Table II. Histological tumor type and grade were evaluated according to the World Health Organization cancer classification (21) and tumor stage according to the Union for International Cancer Control TNM classification (22).
Table I.

Clinical information of the patients included in the GSE109454 dataset (18).

SexAge, yearsTNM stage (22)CEA, ng/ml
Male55II12.43
Male61IVA37.84
Female56IIIB23.60
Male59IIIA39.93
Female53II15.51
Female63IIIA18.89

CEA, carcinoembryonic antigen; TNM, Tumor-Node-Metastasis.

Table II.

Clinical information of the patients included in the GSE41655 dataset (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE41655).

SexAge, yearsPathological grade (21)pTNM stage (22)
Male35Normal epithelium
Male60Normal epithelium
Male53Normal epithelium
Male68Normal epithelium
Female61Normal epithelium
Female55Normal epithelium
Male29Normal epithelium
Male37Normal epithelium
Female38Normal epithelium
Female22Normal epithelium
Female62Normal epithelium
Male69Normal epithelium
Female46Normal epithelium
Male51Normal epithelium
Female39Normal epithelium
Male49Low-grade dysplasia
Female51Low-grade dysplasia
Male64High-grade dysplasia
Male69Low-grade dysplasia
Male54Low-grade dysplasia
Male60Low-grade dysplasia
Male61Low-grade dysplasia
Female74Low-grade dysplasia
Male64Low-grade dysplasia
Male61High-grade dysplasia
Female70High-grade dysplasia
Male83High-grade dysplasia
Female53High-grade dysplasia
Male49High-grade dysplasia
Female29High-grade dysplasia
Male49High-grade dysplasia
Female70High-grade dysplasia
Male80High-grade dysplasia
Male65High-grade dysplasia
Male56High-grade dysplasia
Female76High-grade dysplasia
Male56High-grade dysplasia
Male74High-grade dysplasia
Male61High-grade dysplasia
Male35Low-grade dysplasia
Male59Low-grade dysplasia
Female53High-grade dysplasia
Female68Low-grade dysplasia
Male61Low-grade dysplasia
Female43Low-grade dysplasia
Male37Low-grade dysplasia
Male55High-grade dysplasia
Male82High-grade dysplasia
Female62Low-grade dysplasia
Male72Low-grade dysplasia
Male62Low-grade dysplasia
Female34Low-grade dysplasia
Female61Low-grade dysplasia
Female58Low-grade dysplasia
Female53Low-grade dysplasia
Male76Low-grade dysplasia
Male29Low-grade dysplasia
Male37Low-grade dysplasia
Male38Low-grade dysplasia
Female22Low-grade dysplasia
Male69Low-grade dysplasia
Male69Low-grade dysplasia
Male65Low-grade dysplasia
Male49High-grade dysplasia
Male69Low-grade dysplasia
Male63Low-grade dysplasia
Male57Low-grade dysplasia
Female56Low-grade dysplasia
Female69Low-grade dysplasia
Female72Low-grade dysplasia
Female65Low-grade dysplasia
Male75Low-grade dysplasia
Male57Low-grade dysplasia
Female35Low-grade dysplasia
Male69AdenocarcinomaT2N0M0
Male63AdenocarcinomaT4N1M1
Male64AdenocarcinomaT3N1M0
Female75AdenocarcinomaT1N0M0
Male49AdenocarcinomaT3N1M0
Female70AdenocarcinomaT3N1M0
Male80AdenocarcinomaT1N0M0
Male65AdenocarcinomaT2N0M0
Male56AdenocarcinomaT3N0M0
Female76AdenocarcinomaT2N1M0
Male56AdenocarcinomaT1N0M0
Female68AdenocarcinomaT2N1M0
Male74AdenocarcinomaT4N1M0
Male61AdenocarcinomaT4N1M0
Male70AdenocarcinomaT3N1M0
Female80AdenocarcinomaT1N0M0
Male71AdenocarcinomaT2N0M0
Male35AdenocarcinomaT3N0M0
Male59AdenocarcinomaT3N0M0
Male60AdenocarcinomaT3N0M0
Female68AdenocarcinomaT2N0M0
Male61AdenocarcinomaT3N1M0
Male51AdenocarcinomaT3N1M0
Male55AdenocarcinomaT4N2M1
Male66AdenocarcinomaT3N1M0
Male72AdenocarcinomaT4N2M0
Female34AdenocarcinomaT4N2M0
Female63AdenocarcinomaT1N0M0
Male63AdenocarcinomaT1N0M0
Male69AdenocarcinomaT4N1M0
Female69AdenocarcinomaT2N0M0
Male61AdenocarcinomaT4N1M0
Male45AdenocarcinomaT2N0M0

pTNM, pathological Tumor-Node-Metastasis.

Identification of differentially expressed genes (DEGs)

GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/) is an online tool provided by GEO. GEO2R is based on the R language limma package (v3.26.8) (23). GEO2R was used to screen DEGs between tumor and normal samples in the GSE41655 and GSE109454 datasets. At the same time, using false discovery rate correction, multiple t-tests were used to determine the statistical significance of the DEGs. |log2 fold change (FC)|>1 and a P-value <0.05 were set as the cut-off criteria. Subsequently, probes without a corresponding gene symbol were filtered.

Weighted gene co-repression analysis (WGCNA)

To screen significant co-expression modules, WGCNA was performed as previously described (24,25). Probe sets were first filtered based on the variance of expression value across all samples. Probe sets with repeating gene symbols were also removed based on the expression variance. The R package WGCNA (v1.61) (26) was used to construct the co-expression networks. Independent signed networks were constructed from the CRC samples and normal samples. The adjacency matrix was constructed using a soft threshold power of 12 to make the soft threshold >0.8. Network interconnectedness was measured by calculating the topology overlap using the TOMdist function with a signed TOM-Type. The average hierarchical clustering was performed to group the genes based on the topological overlap dissimilarity measure (1-TOM) of their connection strengths. The network modules were identified using a dynamic tree cut algorithm, with a minimum cluster size of 30 and a merge threshold function of 0.25. Genes that were not assigned to particular modules were designated as grey. The module that had the strongest association with CRC was selected for further analysis and the co-expression in this module was filtered as follows: i) The weight score of co-expression >0.3; ii) only the coding and non-coding co-expression relationships were retained; iii) only the genes that were differentially expressed were retained; and iv) only the genes in one pair that had the same expression tendency were retained.

Gene function analysis

To determine the function of the selected genes in CRC, Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis for the genes in the co-expression network were implemented using DAVID (v6.8; http://david.ncifcrf.gov/). In the present study, pathway and process enrichment analysis was performed using the following ontological sources: GO Biological Processes (BPs) and KEGG Pathway. All genes in the H. sapiens genome were used as background in the enrichment analysis. The P-value was calculated based on accumulative hypergeometric distribution, and the q-value was calculated using the Benjamini-Hochberg program to consider multiple tests (27). When hierarchical clustering was performed on enriched terms, the κ score was used as a measure of similarity, and sub-trees with similarity >0.3 were then treated as clusters.

Prediction of the lncRNA-miRNA interactions and miRNA-mRNA interactions

The interactions between differentially expressed miRNAs (DEMs) and DEGs were predicted using miRWalk v3.0 (http://mirwalk.umm.uni-heidelberg.de/), which integrated the predictions of miRDB (28) and TargetScan (29). A score ≥0.95 was considered as the critical criterion for miRWalk predictive analysis. Considering the inhibition effect of miRNA on mRNA expression, the interactions of miRNA and mRNA with the same expression tendency were deleted. The interactions between miRNA and lncRNA were predicted using DIANA-LncBase v2.0 (30), and the score ≥0.4 was considered to be the cut-off criterion for predictive analysis in the LncBase prediction module. After the predicted targets were intersected with DEGs in the GSE109454 dataset and DEMs in the GSE41655 dataset, the mRNAs, miRNAs and lncRNAs were selected to construct the lncRNA-mRNA-miRNA regulatory network. The cytoscape software (v3.40; http://cytoscape.org/) was used to visualize the network.

Validation in the cancer genome atlas (TCGA) datasets

TCGA is a public platform for researchers to download and assess free datasets (https://cancergenome.nih.gov/) (31). RNA-sequencing (RNA-seq) data and clinical data from 31 cancer types are included in TCGA. In the present study, to improve the reliability of the analysis, the expression of hub genes was validated in TCGA datasets using an easy-use online tool GEPIA (v1.0; http://gepia.cancer-pku.cn/). The tumor type was limited to colon adenocarcinoma (COAD). Finally, RNA-seq data and clinical data from 275 primary tumor samples of COAD and 349 corresponding normal samples in TCGA datasets were used. Box and whisker plots were used to display the expression of the genes involved in the ceRNA network. |log2 FC|>1 and P-value <0.05 were set as the cut-off criteria to determine the differential expression between primary tumor samples and normal samples.

Results

Identification of DEGs

DEGs in the GSE109454 dataset and DEMs in the GSE41655 dataset were screened with a threshold of P<0.05 and |log2 FC|>1. As shown in Fig. 1, a total of 2,599 downregulated genes (1,430 mRNAs and 1,169 lncRNAs) and 2,832 upregulated genes (1,753 mRNAs and 1,079 lncRNAs) were screened in the GSE109454 dataset. In the GSE41655 dataset, a total of 116 DEMs, including 71 downregulated miRNAs and 45 upregulated miRNAs, between colorectal adenoma samples and normal colorectal samples, were screened (Fig. 2A and C), and a total of 109 DEMs, including 55 downregulated miRNAs and 54 upregulated miRNAs, between colorectal adenocarcinoma samples and normal colorectal samples, were screened (Fig. 2B and D). After the DEMs in the two groups were intersected, 78 DEMs were selected for further analysis.
Figure 1.

Identification of differentially expressed genes between tumor and normal samples in the GSE109454 dataset. (A) The volcano plot. The green dots indicate the significantly downregulated genes, the red dots indicate the significantly upregulated genes, while the blue dots indicate the genes with no significant difference. The horizontal axis represents the log2 (fold change) and the vertical axis represents the -log10 (P-value). (B and C) Heatmaps of the (B) differentially expressed mRNAs and (C) differentially expressed lncRNAs. Each row presents a dysregulated RNA transcript and each column represents a sample. Orange represents upregulation and blue represents downregulation.

Figure 2.

Identification of DEMs between tumor and normal samples in the GSE41655 dataset. (A) The volcano plots of the DEMs between colorectal adenoma samples and normal colorectal samples. (B) The volcano plots of the DEMs between colorectal adenocarcinoma samples and normal colorectal samples. The green dots indicate the significantly downregulated genes, the red dots indicate the significantly upregulated genes and the blue dots indicate the genes with no significant difference. The horizontal axis represents the log2 (fold change) and the vertical axis represents the -log10 (P-value). (C) Heatmaps of DEMs between colorectal adenoma samples and normal colorectal samples. (D) Heatmaps of DEMs between colorectal adenocarcinoma samples and normal colorectal samples. Each row presents a dysregulated RNA transcript and each column represents a sample. Orange represents upregulation and blue represents downregulation. DEMs, differentially expressed mRNAs.

Construction of co-expression network

Using the R package for WGCNA, 39 modules were generated (Fig. 3A) and all of the uncorrelated genes were assigned to the grey module. The number of the genes in every module is shown in Table III. The trait of the samples in the present study was divided into tumor and non-tumor. Out of 39 modules, the blue module was most positively associated with CRC (r=0.98; P=1×10−8; Fig. 3B). A total of 2,556,690 co-expression relationships in the blue module were therefore further filtered, and 169 genes (86 mRNAs and 83 lncRNAs) and 245 relationships were selected to construct the CNC network (Fig. 4).
Figure 3.

(A) Hierarchical clustering dendrogram of all the probe sets used in the weighted gene co-expression network analysis based on the 1-TOM dissimilarity measure. Each line of the dendrogram represents an individual probe. The multi-color bar represents the 39 modules identified using the dynamic tree cut algorithm. (B) Correlations between module eigengenes and tumor status. The numbers within the heatmap indicate correlations and P-values for the module-trait associations, respectively. Red wells indicate positive correlation, blue wells indicate negative correlation and white wells indicate no correlation.

Table III.

Number of genes in the 39 modules.

Module colorsFrequency
Black426
Blue2,249
Brown1,417
Cyan274
Dark green133
Dark grey103
Dark magenta56
Dark olive green59
Dark orange84
Dark red145
Dark turquoise103
Green507
Green yellow293
Grey191
Grey 60223
Light cyan246
Light green177
Light yellow169
Magenta349
Midnight blue264
Orange97
Pale turquoise60
Pink391
Plum 139
Purple313
Red439
Royal blue154
Saddle brown63
Salmon282
Sienna 352
Sky blue72
Sky blue 347
Steel blue63
Tan291
Turquoise3,809
Violet59
White80
Yellow909
Yellow green48
Figure 4.

Coding non-coding network. Red represents upregulation and blue represents downregulation. The V-shape represents lncRNA and circles represent mRNA.

GO and KEGG pathway enrichment analysis of genes in the CNC network

GO and KEGG enrichment analysis was performed for the genes in the blue module and the result is shown in Table IV. ‘p53 signaling pathway’ and ‘cell cycle’ were the most significantly enriched pathways in the KEGG pathway enrichment analysis. ‘Cell division’, ‘chromosome segregation’ and ‘mitotic nuclear division’ were the top 3 enriched BP terms.
Table IV.

Gene function enrichment analysis of the genes included in the coding non-coding network.

CategoryTermCountP-valueGenes
KEGGCell cycle  51.66×10−3CDK1, MAD2L1, CCNB2, E2F5, ATR
p53 signaling pathway  33.18×10−2CDK1, CCNB2, ATR
BPCell division137.04×10−8CDK1, MAD2L1, CCNB2, KIF11, OIP5, NEK2, KIF20B, NUF2, SDCCAG3, CENPF, AURKA, UBE2C
Mitotic nuclear division102.17×10−6CDK1, KIF11, CCNB2, OIP5, NEK2, KIF20B, NUF2, CENPF, CENPW, AURKA
Chromosome segregation  61.58×10−5KIF11, OIP5, NEK2, NUF2, CENPF, CENPW
DNA strand elongation involved in DNA replication  44.21×10−5GINS1, RFC3, GINS4, POLD2
Positive regulation of telomere maintenance via telomerase  44.34×10−4DKC1, NEK2, ATR, CCT6A
DNA replication  67.71×10−4CDK1, RFC3, GINS4, POLD2, ATR, DSCC1
DNA repair  77.81×10−4CDK1, RAD51AP1, FANCD2, PARPBP, RUVBL2, ACTL6A, ATR
CENP-A containing nucleosome assembly  41.04×10−3CENPL, OIP5, CENPQ, CENPW
DNA duplex unwinding  41.11×10−3GINS1, GINS4, RUVBL2, RAD54B
Sister chromatid cohesion  51.35×10−3CENPL, MAD2L1, CENPQ, NUF2, CENPF
Positive regulation of telomerase RNA localization to Cajal body  32.15×10−3DKC1, RUVBL2, CCT6A
Spindle organization  32.45×10−3KIF11, AURKA, AUNIP
Cell cycle  63.38×10−3DTYMK, SDCCAG3, AURKA, ATR, CDKN3, SUV39H2
G2/M transition of mitotic cell cycle  53.79×10−3CDK1, PLK4, CCNB2, NEK2, AURKA
Regulation of mitotic nuclear division  35.50×10−3MKI67, NEK2, KIF20B
Anaphase-promoting complex-dependent catabolic process  45.92×10−3CDK1, MAD2L1, AURKA, UBE2C
Cytoplasmic translation  35.96×10−3RPL6, RPL8, FTSJ1
Reciprocal meiotic recombination  38.52×10−3CCNB1IP1, RAD54B, TRIP13
Histone H4 acetylation  39.08×10−3NCOA1, RUVBL2, ACTL6A
Positive regulation of type I hypersensitivity  201.3871FCER1A, BTK
Mitotic nuclear envelope disassembly  31.78×10−2CDK1, CCNB2, RAE1
Regulation of mitotic centrosome separation  21.85×10−2KIF11, NEK2
Interstrand cross-link repair  32.18×10−2RAD51AP1, FANCD2, ATR
Mitotic centrosome separation  22.30×10−2KIF11, AURKA
Regulation of growth  32.52×10−2ARMC10, RUVBL2, ACTL6A
Cell proliferation  62.77×10−2CDK1, MKI67, DKC1, DLGAP5, DTYMK, CENPF
Positive regulation of DNA-directed DNA polymerase activity  23.21×10−2RFC3, DSCC1
Box C/D snoRNP assembly  23.21×10−2NUFIP1, RUVBL2
Protein ubiquitination involved in ubiquitin-dependent protein catabolic process  43.44×10−2CDK1, MAD2L1, AURKA, UBE2C
Positive regulation of establishment of protein localization to telomere  24.10×10−2DKC1, CCT6A
Negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle  34.31×10−2CDK1, MAD2L1, UBE2C
Positive regulation of ubiquitin-protein ligase activity involved in regulation of mitotic cell cycle transition  34.87×10−2CDK1, MAD2L1, UBE2C

Construction and validation of ceRNA network

To identify how lncRNAs regulate mRNAs through miRNAs in the CNC network, online prediction tools were used to determine miRNA-lncRNA and miRNA-mRNA interactions. The predicted miRNAs were then intersected with DEMs identified in the GSE41655 dataset. Subsequently a ceRNA network was constructed, which was composed of 24 mRNAs, 16 miRNAs and 10 lncRNAs (Fig. 5). To improve the reliability of the analysis, the expression of genes involved in the ceRNA network was validated in TCGA datasets. The results showed that 3 lncRNAs (MIR22HG, CHL1-AS2 and RP11-61I13.3) were downregulated in TCGA datasets (Fig. 6). As shown in Fig. 7, ANK2, BTK, FGL2, GBP2, PCSK5 and PDK4 were downregulated, and ARMC10, CENPF, CENPF, DKC1, GINS4, PAICS, PARPBP, RAD54B, RAE1 and TOMM34 were upregulated in TCGA datasets. Subsequently, these validated mRNAs and lncRNAs were selected to construct a ceRNA network. Finally, a validated ceRNA network composed of 2 lncRNAs (MIR22HG and RP11-61I13.3), 5 miRNAs (hsa-miR-765, hsa-miR-198, hsa-miR-125a-3p, hsa-miR-149-3p and hsa-miR-650) and 5 mRNAs (ANK2, BTK, GBP2, PCSK5 and PDK4) was obtained (Fig. 8). PCSK5 was at the center of the validated ceRNA network, and interacted with 4 miRNAs.
Figure 5.

Competing endogenous RNA network. Red represents upregulation and blue represents downregulation. The V-shape represents lncRNA, circles represent mRNA and triangles represent microRNA.

Figure 6.

Gene expression of the selected long non-coding RNAs validated in The Cancer Genome Atlas dataset. The dysregulation of (A) MIR22HG, (B) CHL1-AS2 and (C) RP11-61I13.3. The red box corresponds to tumor samples and the grey box corresponds to non-tumor samples. The vertical axis represents the relative expression of genes. *P<0.05. COAD, colon adenocarcinoma; T, tumor; N, normal.

Figure 7.

Gene expression of the selected mRNAs validated in The Cancer Genome Atlas dataset. The dysregulation of (A) ANK2, (B) ARMC10, (C) BTK, (D) CENPF, (E) DKC1, (F) FGL2, (G) GBP2, (H) GINS4, (I) PAICS, (J) PARPBP, (K) PCSK5, (L) PDK4, (M) RAD54B, (N) RAE1 and (O) TOMM34. The red box corresponds to tumor samples and the grey box corresponds to non-tumor samples. The vertical axis represents the relative expression of genes. *P<0.05. COAD, colon adenocarcinoma; T, tumor; N, normal.

Figure 8.

Competing endogenous RNA network constructed from validated genes. Red represents upregulation and blue represents downregulation. V-shape represents lncRNA, circle represents mRNA, and triangle represents miRNA.

Discussion

In developing countries, the incidence of CRC is rapidly increasing. CRC is the fifth most common malignant tumor after lung, liver, esophagus and breast cancer in China (32). Statistically, there were 376,300 new cases and 191,000 CRC-related deaths in China in 2015 (32). Therefore, CRC has become a public health issue and it is essential to understand the etiological factors and mechanisms of CRC progression to improve prevention and survival rates. In the present study, 3,183 dysregulated mRNAs, 78 dysregulated miRNAs and 2,248 dysregulated lncRNAs were screened in two GEO datasets. WGCNA analysis was conducted to identify the co-expression between coding genes and non-coding genes. Combined with the dysregulated genes, 169 genes were selected to construct the CNC network. GO and KEGG pathway enrichment analysis was performed to identify the biological function of the genes in the CNC network. The genes in the CNC network were significantly enriched in ‘cell cycle’ and ‘p53 signaling pathway’. DEMs identified from the GS41655 dataset were predicted to be involved in miRNA-lncRNA and miRNA-mRNA interactions were used to construct the ceRNA network. The expression of 3 lncRNAs (MIR22HG, CHL1-AS2 and RP11-61I13.3) and 15 mRNAs (ANK2, ARMC10, BTK, CENPF, DKC1, FGL2, GBP2, GINS4, PAICS, PARPBP, PCSK5, PDK4, RAD54B, RAE1 and TOMM34) in the ceRNA network were validated and found to be dysregulated in TCGA datasets. PCSK5 is at the center of the validated ceRNA network, and interacts with 4 miRNAs. PCSK5 belongs to the subtilisin-like proprotein convertase family. The members of this family are proprotein convertases that process a potential precursor protein into its biologically active product (33). Reports of the involvement of PCSK5 in cancer are rare. In triple-negative breast cancer (TNBC), a lack of PCSK5 could lead to the bioactivity of growth differentiation factor (GDF11) as a tumor-suppressor. PCSK5 reconstitution mobilizes the latent TNBC reservoir of GDF11 in vitro and inhibits TNBC metastasis to the lung of syngeneic hosts (34). However, the function of PCSK5 in CRC is largely unknown. The present study indicated that PCSK5 could be regulated by MIR22HG through miR-198 and miR-149-3p, and regulated by RP11-61I13.3 through miR-765 and miR-125-3p. MIR22HG has been identified as a tumor suppressor in several studies. In hepatocellular carcinoma (HCC), MIR22HG expression is significantly downregulated, and can regulate the expression of high mobility group box 1 and its downstream pathways by deriving miR-22-3p to suppress HCC cell proliferation, invasion and metastasis (35). In endometrial cancer (EC), MIR22HG can act as sponge of miR-141-3p, which could inhibit EC cells proliferation and induce EC cells apoptosis by targeting death-associated protein kinase 1 (36). In addition, the present study suggested that MIR22HG could regulate the expression of BTK and PDK4 through miR-650. Therefore, MIR22HG might be an important regulator in CRC. BTK is a new regulator of p53 and BTK induction, which leads to p53 phosphorylation. Inhibiting BTK can reduce p53-dependent senescence and apoptosis (37). In CRC, miR-650 is a direct regulator of N-myc downstream-regulated gene 2, which is a potential tumor suppressor gene (38). Therefore, MIR22HG might be an important regulator in CRC. RP11-61I13.3 is a novel lncRNA and little is known about its function. RP11-61I13.3 can also regulate GBP2 and ANK2 through miR-765, and miR-765 has been demonstrated to be an onco-miRNA in HCC (39) and oral squamous cancer (40). Meanwhile, GBP2 has been associated with a better prognosis in several cancer types, including breast cancer and ovarian cancer (41,42). In conclusion, two lncRNAs (MIR22HG and RP11-61I13.3) have been identified that can regulate several mRNAs through sponging miRNAs in CRC, and the potential functions of these genes have been revealed. The ceRNA network involved in PCSK5 has been shown to play an important role in CRC. However, the diagnostic and prognostic value of these genes still requires further validation.
  40 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets.

Authors:  Benjamin P Lewis; Christopher B Burge; David P Bartel
Journal:  Cell       Date:  2005-01-14       Impact factor: 41.582

3.  A general framework for weighted gene co-expression network analysis.

Authors:  Bin Zhang; Steve Horvath
Journal:  Stat Appl Genet Mol Biol       Date:  2005-08-12

4.  Redefining microRNA targets.

Authors:  Hervé Seitz
Journal:  Curr Biol       Date:  2009-04-16       Impact factor: 10.834

5.  Down-regulation of NDRG2 gene expression in human colorectal cancer involves promoter methylation and microRNA-650.

Authors:  Li Feng; Yun Xie; Hao Zhang; Yunlin Wu
Journal:  Biochem Biophys Res Commun       Date:  2011-02-23       Impact factor: 3.575

6.  Single-nucleotide polymorphisms of the proprotein convertase subtilisin/ kexin type 5 (PCSK5) gene.

Authors:  H Cao; A Mok; B Miskie; R A Hegele
Journal:  J Hum Genet       Date:  2001       Impact factor: 3.172

7.  The miR-200 family and miR-205 regulate epithelial to mesenchymal transition by targeting ZEB1 and SIP1.

Authors:  Philip A Gregory; Andrew G Bert; Emily L Paterson; Simon C Barry; Anna Tsykin; Gelareh Farshid; Mathew A Vadas; Yeesim Khew-Goodall; Gregory J Goodall
Journal:  Nat Cell Biol       Date:  2008-03-30       Impact factor: 28.824

8.  A coding-independent function of gene and pseudogene mRNAs regulates tumour biology.

Authors:  Laura Poliseno; Leonardo Salmena; Jiangwen Zhang; Brett Carver; William J Haveman; Pier Paolo Pandolfi
Journal:  Nature       Date:  2010-06-24       Impact factor: 49.962

9.  Microarray analysis after RNA amplification can detect pronounced differences in gene expression using limma.

Authors:  Ilhem Diboun; Lorenz Wernisch; Christine Anne Orengo; Martin Koltzenburg
Journal:  BMC Genomics       Date:  2006-10-09       Impact factor: 3.969

10.  WGCNA: an R package for weighted correlation network analysis.

Authors:  Peter Langfelder; Steve Horvath
Journal:  BMC Bioinformatics       Date:  2008-12-29       Impact factor: 3.169

View more
  5 in total

1.  LncRNA, a novel target biomolecule, is involved in the progression of colorectal cancer.

Authors:  Weihong Sun; Shaoshao Ren; Ran Li; Qingshan Zhang; Haiping Song
Journal:  Am J Cancer Res       Date:  2019-11-01       Impact factor: 6.166

2.  Downregulation of long non-coding RNA MAFG-AS1 represses tumorigenesis of colorectal cancer cells through the microRNA-149-3p-dependent inhibition of HOXB8.

Authors:  Zhiyan Ruan; Hongling Deng; Minhua Liang; Zhe Xu; Manxiang Lai; Hong Ren; Xiangliang Deng; Xinguo Su
Journal:  Cancer Cell Int       Date:  2020-10-19       Impact factor: 5.722

3.  MicroRNA‑198 suppresses tumour growth and metastasis in oral squamous cell carcinoma by targeting CDK4.

Authors:  Yuanyuan Kang; Ying Zhang; Yan Sun
Journal:  Int J Oncol       Date:  2021-05-13       Impact factor: 5.650

4.  ANK2 Hypermethylation in Canine Mammary Tumors and Human Breast Cancer.

Authors:  Johannes J Schabort; A-Reum Nam; Kang-Hoon Lee; Seok Won Kim; Jeong Eon Lee; Je-Yoel Cho
Journal:  Int J Mol Sci       Date:  2020-11-18       Impact factor: 5.923

5.  WGCNA reveals key gene modules regulated by the combined treatment of colon cancer with PHY906 and CPT11.

Authors:  Shuqin Xing; Yafei Wang; Kaiwen Hu; Fen Wang; Tao Sun; Quanwang Li
Journal:  Biosci Rep       Date:  2020-09-30       Impact factor: 3.840

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.