Literature DB >> 32953881

Coexpression Network Analysis Identifies a Novel Nine-RNA Signature to Improve Prognostic Prediction for Prostate Cancer Patients.

Jiarong Cai1, Zheng Chen2, Xuelian Chen1, He Huang3, Xia Lin4, Bin Miao5.   

Abstract

BACKGROUND: Prostate cancer (PCa) is the most common malignancy and the leading cause of cancer death in men. Recent studies suggest the molecular signature was more effective than the clinical indicators for the prognostic prediction, but all of the known studies focused on a single RNA type. The present study was to develop a new prognostic signature by integrating long noncoding RNAs (lncRNAs) and messenger RNAs (mRNAs) and evaluate its prognostic performance.
METHODS: The RNA expression data of PCa patients were downloaded from The Cancer Genome Atlas (TCGA) or Gene Expression Omnibus database (GSE17951, GSE7076, and GSE16560). The PCa-driven modules were identified by constructing a weighted gene coexpression network, the corresponding genes of which were overlapped with differentially expressed RNAs (DERs) screened by the MetaDE package. The optimal prognostic signature was screened using the least absolute shrinkage and selection operator analysis. The prognostic performance and functions of the combined prognostic signature was then assessed.
RESULTS: Twelve PCa-driven modules were identified using TCGA dataset and validated in the GSE17951 and GSE7076 datasets, and six of them were considered to be preserved. A total of 217 genes in these 6 modules were overlapped with 699 DERs, from which a nine-gene prognostic signature was identified (including 3 lncRNAs and 6 mRNAs), and the risk score of each patient was calculated. The overall survival was significantly shortened in patients having the risk score higher than the cut-off, which was demonstrated in TCGA (p = 5.063E - 03) dataset and validated in the GSE16560 (p = 3.268E - 02) dataset. The prediction accuracy of this risk score was higher than that of clinical indicators (the Gleason score and prostate-specific antigen) or the single RNA type, with the area under the receiver operator characteristic curve of 0.945. Besides, some new therapeutic targets and mechanisms (MAGI2-AS3-SPARC/GJA1/CYSLTR1, DLG5-AS1-DEFB1, and RHPN1-AS1-CDC45/ORC) were also revealed.
CONCLUSION: The risk score system established in this study may provide a novel reliable method to identify PCa patients at a high risk of death.
Copyright © 2020 Jiarong Cai et al.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 32953881      PMCID: PMC7482004          DOI: 10.1155/2020/4264291

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Prostate cancer (PCa) is the most frequently diagnosed malignancy in men, with an estimated 174,650 new cases and 31,620 deaths in 2019 in the United States [1]. Although diverse treatment strategies (including radical surgery, radiotherapy, chemotherapy, and androgen deprivation therapy) have been demonstrated to be effective, approximately 25% of PCa patients will experience recurrence, metastasis, and develop into castration-resistant PCa, leading to the poor overall survival (OS) [2]. Therefore, it is critical to early identify the patients who are at a high risk for death. Serum level of prostate-specific antigen (PSA) [3], the Gleason score [4], and tumor, node, and metastasis (TNM) staging [5] are the routine clinical indicators for the prediction of OS in patients with PCa. Nevertheless, in clinic, some scholars also observed that patients with the same stage could progress to opposite consequences [6], while similar prognostic outcomes were present in patients with different PSA levels or Gleason scores [7, 8]. Therefore, identification of more effective prognostic biomarkers is highly desirable. With the development of molecular biology, recent studies have attempted to develop the gene molecular signature for the prognostic prediction. Some have been demonstrated to possess a higher predictive power than the above clinical indicators. For example, Li et al. developed a risk score constructed by 6 protein-coding genes. Univariate and multivariate Cox regression analyses showed that this risk score was independent of TNM stage for biochemical recurrence- (BCR-) free survival prediction (hazard ratio (HR) = 3.045, 95% confidence interval (CI) = 1.655 − 5.602; p < 0.001). A subgroup analysis revealed that there were also significant survival differences when the patients with the same Gleason score (≤7 or >7) were classified into the high-risk score group and the low-risk score group [9]. Shi et al. established a prognostic risk score based on 9 protein-coding genes. Receiver operating characteristic (ROC) analysis indicated that the prediction accuracy of this risk score for BCR-free survival was higher than that of Gleason score (area under curve (AUC) = 0.836 vs.0.742) and pathological T stage (AUC = 0.836 vs.0.780) [10]. Similar superiority of the molecular risk score was also observed in the study of Huang et al. who found the four-long noncoding RNA- (lncRNA-) based risk score was independent of the American Joint Committee on Cancer T stage and Gleason score for the prediction of BCR-free survival and disease-free survival. The prediction accuracy for both 2- (AUC = 0.823 vs.0.787) and 5-year BCR (AUC = 0.833 vs.0.797) can be improved by 3.6% if the four-lncRNA signature was added to the clinical indicators [11]. Xu et al. identified an eight-lncRNA signature as an independent factor associated with BCR-free survival and demonstrated its prognostic ability was better than that of the Gleason score (AUC = 0.79 vs.0.688) and positive lymph node (AUC = 0.79 vs.0.622) [12]. However, all of these studies were exploring the prognostic value of a single RNA type. Previous studies on other cancers indicated the lncRNA-mRNA combined signature seemed to be more effective than the lncRNA or mRNA alone for the prognostic prediction [13, 14]. Hence, it is essential to further develop an lncRNA-mRNA integrated prognostic signature for PCa. Furthermore, most of the studies involving comparison with clinical indicators focused on the prediction for BCR-free survival, not for OS, which was also a novelty of our study. The objectives of the current study were (1) to establish a signature for OS prediction based on the PCa-driven lncRNAs and mRNAs which were screened by weighted gene coexpression network analysis (WGCNA) [15], a systematic biological method to cluster highly correlated genes; (2) to confirm the superiority of the integrated molecular signature to clinical indicators or single molecular type; and (3) to reveal the underlying functions of the gene signature.

2. Materials and Methods

2.1. Data Collection and Preprocessing

The level-3 RNA-seq dataset of PCa patients was downloaded from The Cancer Genome Atlas (TCGA, https://portal.gdc.cancer.gov/) on October 18, 2019, including 494 tumor samples (which had survival outcomes) and 54 control samples. This dataset was selected as the training dataset for module and signature identification. The normalized fragments per kilobase of exon per million fragments mapped (FPKM) were used to represent the expression of gene. Furthermore, three gene microarray datasets were downloaded from Gene Expression Omnibus (GEO, http://www. http://ncbi.nlm.nih.gov/geo/) because they also analyzed the gene expression profiles in tumor and control tissues of PCa patients (GSE17951: control, n = 14; tumor, n = 109 [16, 17] in which some samples without clear type description were deleted; GSE70768: control, n = 73; tumor, n = 126 [18]; these two datasets were used to estimate module preservation) or recorded the prognostic outcomes in all PCa patients (GSE16560: tumor, n = 281 [19]; this dataset was used for signature validation). The series matrix files were collected from GEO, and the probe IDs were converted into gene symbols via corresponding platforms (GSE17951: GPL570, [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array; GSE70768: GPL10558, Illumina HumanHT-12 V4.0 expression BeadChip; GSE16560: GPL5474, Human 6k Transcriptionally Informative Gene Panel for DASL). The known mRNAs and lncRNAs in the above sequencing or microarray datasets were reannotated by the HUGO Gene Nomenclature Committee (HGNC; http://www.genenames.org/) which contains the official nomenclature for 4,495 lncRNAs and 19,219 protein-coding genes [20]. RNAs with a median expression value equal to 0 were removed. Only the mRNAs and lncRNAs that were annotated in all included datasets were used for the following analyses.

2.2. Screening of PCa-Driven Modules

The WGCNA package in R (version 1.61; https://cran.r-project.org/web/packages/WGCNA/index.html) [15] was used to identify PCa-driven modules. Briefly, the expression and connectivity correlations of all RNAs between any two datasets (TCGA, GSE17951, and GSE7076) were first calculated to confirm their comparability. Then, the soft threshold power (β) was selected using the pickSoftThreshold function according to the scale-free topology criterion, with which the weighted adjacency matrix was generated and the gene dendrogram was constructed based on topological overlap matrix dissimilarity. Next, modules with a cutHeight of 0.995 and minSize ≥ 50 were identified using TCGA data via the dynamicTreeCut method [21]. The preservation of the identified modules was validated in the GSE17951 and GSE7076 datasets using the modulePreservation statistics [22]. Preservation Z − score > 5 implied the corresponding module was preserved. Finally, the potential function of preserved modules was analyzed according to the userListEnchment function, while their clinical association was investigated by moduleTraitCor and moduleTraitPvalue algorithms in the WGCNA package.

2.3. Screening of Differentially Expressed lncRNAs and mRNAs

The differentially expressed lncRNAs (DELs) and mRNAs (DEGs) between PCa and normal controls were screened using the MetaDE.ES function in the MetaDE package (version 1.0.5, https://cran.r-project.org/web/packages/MetaDE/). Briefly, due to the presence of the platform difference in three datasets (TCGA, GSE17951, and GSE7076), the heterogeneity of RNAs across them was first assessed by tau2 statistic and Chi-square-based Q-test. Only the RNAs with homogeneity (tau2 and Q p value > 0.05) were included for the differential analysis. The gene expression difference was determined by the MetaDE.pvalue algorithm, with the false discovery rate (FDR) < 0.05 set as the significance threshold. Furthermore, the log2FC (fold change) of RNAs in each dataset was also calculated. Only the RNAs with the consistency in significance and differential trend in three datasets were considered as differentially expressed RNAs (DERs).

2.4. Construction of an lnRNA-mRNA Prognostic Model

A Venn diagram (http://bioinformatics.psb.ugent.be/webtools/Venn/) was developed to identify the overlap between PCa-driven module RNAs and DERs, which were used as seeds for the following survival analysis. The association between the expression levels of DERs and OS was first evaluated by the univariate Cox proportional hazards regression analysis in the “survival” package of R (version, 2.41-1; http://bioconductor.org/packages/survivalr/). Only DERs with log-rank p < 0.05 were considered to have the prognostic potential, which then underwent the multivariate Cox regression analysis to assess their independence. In order to confirm whether the signature identified by multivariate analysis was optimal, L1-penalized (least absolute shrinkage and selection operator (LASSO)) Cox proportional hazard model in the penalized package (version, 0.9-5; http://bioconductor.org/packages/penalized/) [23, 24] was further applied. The risk score formula was generated based on the expression levels of prognostic RNAs (ExpDERs) and their LASSO coefficients (βDERs): The risk score of each patient was calculated according to the above formula, and then, the patients were divided into the low-risk group and the high-risk group using the median score as the cut-off point. The prognostic effect of the risk score was examined by the Kaplan–Meier estimate (log-rank p value < 0.05) and ROC analysis (AUC = 0.5 ~ 1; the AUC towards 1 indicated a good performance). Furthermore, univariate, multivariate Cox regression, and subsequent stratification analyses were also conducted to estimate the association between the risk score and clinical pathological characteristics, with the criteria of statistical significance set as p < 0.05.

2.5. Function Enrichment Analyses of the Prognostic RNAs

Due to the prognostic RNAs selected from different modules, their associations may not be reflected by the WGCNA. Thus, we also used the cor.test function (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/cor.test.html) in R to calculate the Pearson correlation coefficients (PCC) between prognostic lncRNAs and all module DERs and reconstructed the coexpression network using the Cytoscape software (version 3.6.1; http://www.cytoscape.org/). The biological processes and pathways of genes in the coexpression network were predicted using the online Database for Annotation, Visualization, and Integrated Discovery (DAVID) (version 6.8; http://david.abcc.ncifcrf.gov) [25]. Significant Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were selected based on the threshold value of p value < 0.05.

3. Results

3.1. WGCNA Analysis Identifies PCa-Driven Modules

After data processing and annotation, 695 lncRNAs and 6,613 protein-encoding mRNAs were identified to be shared within TCGA, GSE17951, and GSE7076 datasets. Thus, they were used for the WGCNA analysis. Correlation analysis showed that there were significantly positive correlations of RNAs between any two datasets irrespective of the expression level (Figure 1(a)) or the connectivity (Figure 1(b)), indicating these datasets were comparable. The soft threshold power β was selected as 8 according to the criterion of the scale-free topology (R2 = 0.9) (Figure 2(a)), using which the mean connectivity for the network was calculated (=1) (Figure 2(b)). A total of 12 modules were identified using TCGA dataset after the DynamicTreeCut analysis (Figures 3(a) and 3(b); Table 1), which was also validated in the GSE17951 and GSE7076 datasets (Figures 3(a) and 3(b)). The black module contained 99 genes (60 mRNAs and 39 lncRNAs); the blue module contained 964 genes (919 mRNAs and 45 lncRNAs); the brown module contained 422 genes (408 mRNAs and 14 lncRNAs); the green module contained 138 genes (134 mRNAs and 4 lncRNAs); the green-yellow module contained 71 genes (69 mRNAs and 2 lncRNAs); the grey module contained 1,705 genes (1,524 mRNAs and 181 lncRNAs); the magenta module contained 78 genes (67 mRNAs and 11 lncRNAs); the pink module contained 78 genes (74 mRNAs and 4 lncRNAs); the purple module contained 74 genes (50 mRNAs and 24 lncRNAs); the red module contained 129 genes (121 mRNAs and 8 lncRNAs); the turquoise module contained 1,491 genes (1,446 mRNAs and 45 lncRNAs); and the yellow module contained 412 genes (388 mRNAs and 24 lncRNAs) (Table 1). From Figure 3(c), we could see that the RNAs belonged to the similar module tended to group together (such as blue and turquoise). Among these 12 modules, blue, brown, green, pink, red, and yellow were considered to be preserved and may be PCa-driven (Table 1). This conclusion may be believable because there were also significant associations between these modules genes and clinical characteristics of PCa patients (Figure 4). the Blue module was significantly associated with Age, recurrence, Gleason_score, and Pathologic_N; the brown module was significantly associated with Age, Gleason_score, Pathologic_M, Pathologic_N, Pathologic_T, Radiation_therapy, and Targeted_molecular_therapy; the red module was significantly associated with recurrence, Gleason_score, Pathologic_M, Pathologic_N, Pathologic_T, prostate-specific antigen (PSA)_value, and Targeted_molecular_therapy; the yellow module was significantly associated with recurrence, Gleason_score, Pathologic_M, Pathologic_N, Pathologic_T, PSA_value, Radiation_therapy, and Targeted_molecular_therapy; and the green and pink modules were associated with all parameters (Figure 4).
Figure 1

Significant correlations between any two datasets (TCGA, GSE17951, and GSE7076). (a) The RNA expression levels. (b) The connectivity.

Figure 2

Soft threshold power β. (a) Soft threshold power β selected when the R2 reached 0.9 for the first time. (b) The mean connectivity corresponding to different β values.

Figure 3

Coexpression modules screened based on WGCNA analysis. (a) Clustering dendrogram of gene coexpression modules from TCGA, GSE17951, and GSE7076 datasets. (b) A dendrogram of the module eigengenes from TCGA, GSE17951, and GSE7076 datasets. (c) A multidimensional scaling (MDS) plot of the module eigengene from TCGA datasets.

Table 1

Preserved modules identified based on weighted gene coexpression network analysis.

IDColorModule sizemRNAlncRNAPreservation Z-scoreModule annotation
Module 1Black9960390.2408Peptidoglycan catabolic process
Module 2Blue96491945 8.9080 Translation
Module 3Brown42240814 19.2639 Immune response
Module 4Green1381344 11.3671 DNA replication
Module 5Green-yellow716924.0398Intracellular signaling cascade
Module 6Grey170515241811.0886Angiogenesis
Module 7Magenta7867114.6283Positive regulation of cardiac muscle contraction
Module 8Pink78744 9.9431 Vasculogenesis
Module 9Purple7450244.3446Regulation of transcription, DNA-templated
Module 10Red1291218 8.9918 Chemotaxis
Module 11Turquoise14911446451.9359Transcription, DNA-templated
Module 12Yellow41238824 38.0674 Cell-cell signaling

Bold indicated preserved modules with preservation Z − score > 5.

Figure 4

Heatmap of the correlation between module eigengenes and clinical traits.

3.2. The Venn Analysis Identifies Differentially Expressed PCa-Driven Module Genes

A total of 699 RNAs were identified as DERs in analysis of all three datasets (TCGA, GSE17951, and GSE7076), including 461 upregulated (37 DELs; 424 DEGs) and 238 downregulated (21 DELs; 217 DEGs). Figure 5(a) shows the samples were distinctly separated according to the expression (high, red; low, green) of these DERs, and the differential pattern was similar among different datasets, indicating these genes may be representative for PCa. These 699 genes were subsequently overlapped with the 2,143 genes of 12 PCa-driven modules. The results showed 217 (including 12 lncRNAs and 205 mRNAs) were common, containing 99 of the blue module, 40 of the brown module, 24 of the green module, 5 of the pink module, 15 of the red module, and 34 of the yellow module (Figure 5(b)), suggesting these 217 genes may be key PCa-driven module genes and may represent alternative biomarkers.
Figure 5

Identification of differentially expressed module genes. (a) Heatmap of differentially expressed RNAs in three datasets. (b) The Venn diagram to obtain the overlap between differentially expressed RNAs and module genes.

3.3. The LASSO Analysis Identifies a 9-Gene Signature for the Prognosis Prediction

The univariate Cox regression analysis revealed that 33 module DERs (including 26 DEGs and 7 DELs) were significantly related to OS (log-rank p < 0.05). Then, these 33 DERs were entered into the multivariate Cox regression model. As a result, 9 DERs (including 6 DEGs and 3 DELs) were identified as the independent predictors of OS. A subsequent LASSO analysis confirmed these 9 DERs from blue, green, and yellow modules may constitute the optimal prognostic signature (DEL: DLG5-AS1 (DLG5 antisense RNA 1), MAGI2-AS3 (MAGI2 antisense RNA 3), and RHPN1-AS1 (RHPN1 antisense RNA 1); DEG: GINS2 (GINS complex subunit 2), NLGN2 (neuroligin 2), EBNA1BP2 (EBNA1 binding protein 2), MELK (maternal embryonic leucine zipper kinase, EIF5AL1 (eukaryotic translation initiation factor 5A like 1), and G6PC3 (glucose-6-phosphatase catalytic subunit 3)) (Table 2). According to the LASSO coefficients and the gene expression levels, the risk score was calculated for each patient as follows: risk score = (−0.2216 × expression of DLG5 − AS1) + (−0.7093 × expression of MAGI2 − AS3) + (−1.5300 × expression of RHPN1 − AS1) + (1.4561 × expression of GINS2) + (1.7956 × expression of NLGN2) + (2.7422 × expression of EBNA1BP2) + (−0.0482 × expression of MELK]) + (−1.5588 × expression of EIF5AL1) + (−1.5551 × expression of G6PC3).
Table 2

The optimal prognostic signature.

SymbolModuleExpressionTypeUnivariate Cox regressionMultivariate Cox regressionLASSO coefficient
HR95% CI p valueHR95% CI p value
GINS2GreenUpregulatedmRNA3.2481.344-7.8494.450E-031.3991.261-2.0363.33E-021.4561
NLGN2YellowDownregulatedmRNA6.5141.237-14.311.350E-023.0122.000-9.0792.10E-021.7956
EBNA1BP2BlueUpregulatedmRNA7.1121.048-18.282.250E-024.4702.904-12.1152.79E-022.7422
DLG5-AS1BlueDownregulatedlncRNA0.6290.199-0.9873.655E-020.5890.290-0.9314.59E-02-0.2216
MAGI2-AS3YellowDownregulatedlncRNA0.3540.0572-0.9142.210E-020.5120.220-0.9134.51E-02-0.7093
RHPN1-AS1GreenUpregulatedlncRNA0.2720.0286-0.5812.210E-020.5130.236-0.7113.89E-02-1.5300
MELKGreenUpregulatedmRNA0.5570.047-0.6251.950E-020.3600.110-1.1793.84E-02-0.0482
EIF5AL1BlueUpregulatedmRNA0.1940.0311-0.4144.000E-020.5400.273-0.7013.26E-02-1.5588
G6PC3BlueUpregulatedmRNA0.2370.0617-0.9121.800E-020.3520.1117-0.5092.66E-02-1.5551

HR: hazard ratio; CI: confidence interval; LASSO: least absolute shrinkage and selection operator.

The patients were divided into the two groups by their corresponding risk scores (low-risk group,
Figure 6

The prediction performance assessment of the prognostic signature. (a) The Kaplan–Meier survival curve analysis of TCGA dataset. (b) The receiver operator characteristic (ROC) curve analysis of TCGA dataset. (c) The Kaplan–Meier survival curve analysis of the GSE16560 dataset. (d) The receiver operator characteristic curve analysis of the GSE16560 dataset. HR: hazard ratio; AUC: area under the ROC curve.

Table 3

The univariate and multivariate Cox regression analysis to identify independent prognostic factors.

VariablesTCGA (n = 494)Univariate analysisMultivariate analysis
HR95% CI p valueHR95% CI p value
Age (years, mean ± SD)61.03 ± 6.841.0530.956-1.1602.91E-01
Pathologic_M (M0/M1/-)452/3/392.8920.470-5.3661.48E-01
Pathologic_N (N0/N1/-)344/78/723.6090.799-16.307.46E-02
Pathologic_T (T2/T3/T4/-)186/291/10/72.2420.601-8.3732.27E-01
Radiation therapy (yes/no/-)59/386/492.9840.577-15.422.33E-01
Targeted molecular therapy (yes/no/-)52/392/503.1270.592-16.522.20E-01
Gleason score (6/7/8/9/10)45/245/63/137/42.9591.337-6.549 2.28E-03 1.6851.163-3.963 2.32E-02
Prostate-specific antigen1.75 ± 15.891.0621.004-1.124 9.60E-05 1.0221.019-1.054 1.52E-02
Recurrence (yes/no/-)368/58/687.2241.937-26.94 5.68E-04 2.0810.424-10.2223.67E-01
Prognostic score status (high/low)247/2479.5741.212-17.56 5.06E-03 5.8461.708-18.27 1.01E-02
Death (dead/alive)10/484
Overall survival days (months, mean ± SD)36.10 ± 26.32

SD: standard deviation; TCGA: The Cancer Genome Atlas; HR: hazard ratio; CI: confidence interval. Bold indicated the factors with statistical significance.

Figure 7

The superiority of the molecular prognostic signature to clinical indicators. (a) Stratification analysis for the Gleason score. (b) Stratification analysis for the level of prostate-specific antigen (PSA). (c) Time-dependent ROC curve analysis constructed according to various models. HR: hazard ratio; AUC: area under the receiver operator characteristic curve.

3.4. The DAVID Analysis Identifies the Function of Prognostic Genes

After calculation of the PCC between three prognostic DELs and module DEGs, a total of 259 relationship pairs were considered to be correlated due to their absolute value of PCC > 0.4. These relationship pairs were used to construct the coexpression network (Figure 8(a)), from which we could see that downregulated MAGI2-AS3 may coexpress with SPARC (secreted protein acidic and cysteine rich) and GJA1 (gap junction protein alpha 1) of the yellow module and CYSLTR1 (cysteinyl leukotriene receptor 1) of the brown module; downregulated DLG5-AS1 may coexpress with VPS37D (VPS37D subunit of ESCRT-I), EIF5AL1, and G6PC3 of the blue module or DEFB1 (defensin beta 1) of the red module; RHPN1-AS1 may coexpress with MELK, GINS2, ORC6 (origin recognition complex subunit 6), and CDC45 (cell division cycle 45) of the green module and EBNA1BP2 of the blue module. The DAVID enrichment analysis identified 14 GO biological process terms, including GO:0007204~positive regulation of cytosolic calcium ion concentration (GJA1), GO:0016525~negative regulation of angiogenesis (SPARC), GO:0032496~response to lipopolysaccharide (SPARC), GO:0006935~chemotaxis (DEFB1), GO:0039702~viral budding via host ESCRT complex (VPS37D), GO:0001937~negative regulation of endothelial cell proliferation (SPARC), GO:0019058~viral life cycle (VPS37D), and GO:0000727~double-strand break repair via break-induced replication (GINS2) (Table 4; Figure 8(b)). In addition, 6 KEGG pathways were also enriched, such as hsa04080:Neuroactive ligand-receptor interaction (CYSLTR1), hsa04110:Cell cycle (CDC45, ORC6), and hsa04020:Calcium signaling pathway (CYSLTR1) (Table 4; Figure 8(b)).
Figure 8

Function analysis for the prognostic genes. (a) A coexpression network between three prognostic lncRNAs and module differentially expressed mRNAs. Triangle, upregulated; inverted triangle, downregulated. The different colors corresponded to the module color. The larger nodes were the signature RNAs. (b) The DAVID enrichment analysis. GO: Gene Ontology.

Table 4

Function enrichment analysis.

CategoryTermCount p valueGenes
GO_BPGO:0007204~positive regulation of cytosolic calcium ion concentration84.15E-04PTGER1, PTGER2, CYSLTR1, GALR2, GJA1, CD52, FPR3, and CXCR3
GO_BPGO:0010818~T cell chemotaxis32.71E-03GPR183, CXCR3, and CXCL10
GO_BPGO:0016525~negative regulation of angiogenesis53.51E-03SERPINF1, FASLG, CXCR3, SPARC, and CXCL10
GO_BPGO:0032496~response to lipopolysaccharide76.37E-03PTGER1, PTGER2, KCNJ8, ELANE, FASLG, SPARC, and CXCL10
GO_BPGO:0006935~chemotaxis67.84E-03RARRES2, CYSLTR1, CXCR3, CCL5, DEFB1, and CXCL10
GO_BPGO:0006954~inflammatory response101.47E-02TUSC2, PTGER1, PTGER2, RARRES2, NMI, FPR3, CXCR3, CCL5, CXCL10, and AOC3
GO_BPGO:0039702~viral budding via host ESCRT complex32.04E-02CHMP2A, VPS37B, and VPS37D
GO_BPGO:0016197~endosomal transport42.89E-02CHMP2A, VPS37B, VPS37D, and RAB13
GO_BPGO:0001937~negative regulation of endothelial cell proliferation33.42E-02GJA1, CXCR3, and SPARC
GO_BPGO:0060333~interferon-gamma-mediated signaling pathway43.48E-02NMI, HLA-DPB1, HLA-E, and GBP1
GO_BPGO:0019058~viral life cycle33.87E-02CHMP2A, VPS37B, and VPS37D
GO_BPGO:0036258~multivesicular body assembly33.87E-02CHMP2A, VPS37B, and VPS37D
GO_BPGO:0015031~protein transport94.67E-02CHMP2A, COPZ2, DNAJC15, GOLT1A, EIF5AL1, KIF18A, VPS37B, VPS37D, and NACA2
GO_BPGO:0000727~double-strand break repair via break-induced replication24.93E-02GINS2, CDC45
KEGGhsa04080:Neuroactive ligand-receptor interaction71.10E-02PTGER1, PTGER2, GLRB, CYSLTR1, ADORA2B, GALR2, and FPR3
KEGGhsa04110:Cell cycle42.16E-02E2F2, CDC45, TTK, and ORC6
KEGGhsa04623:Cytosolic DNA-sensing pathway32.21E-02POLR2L, CCL5, and CXCL10
KEGGhsa00190:Oxidative phosphorylation2.511.10E-02UQCRC1, COX7A1, NDUFB7, and NDUFB9
KEGGhsa04060:Cytokine-cytokine receptor interaction53.86E-02OSM, FASLG, CXCR3, CCL5, and CXCL10
KEGGhsa04020:Calcium signaling pathway44.43E-02PTGER1, CYSLTR1, ADORA2B, and CACNA1H

GO: Gene Ontology; BP: biological process; KEGG: Kyoto Encyclopedia of Genes and Genomes.

4. Discussion

Using the training and validation datasets, we first identified 6 preserved PCa-driven modules and then screened 9 prognosis-related genes (including 3 lncRNAs: DLG5-AS1, MAGI2-AS3, and RHPN1-AS1; and 6 mRNAs: GINS2, NLGN2, EBNA1BP2, MELK, EIF5AL1, and G6PC3) from these modules to construct the risk score. The ROC curve analysis demonstrated the prediction accuracy of this molecular risk score was higher than that of clinical indicators (the Gleason score [AUC = 0.945 vs.0.57], PSA [AUC = 0.945 vs.0.578], and combined [AUC = 0.945 vs.0.673]), which was in line with the studies of Li et al. [9], Shi et al. [10], Huang et al. [11], and Xu et al. [12]. More importantly, our integrated model seemed to be more effective than the single mRNA model (Xu et al.: 4-mRNA, AUC = 0.945 vs.0.904 [26]) for OS prediction, which was also observed in our study (AUC = 0.945 vs.0.81) (Figure 7). Although there was no study to investigate the prognostic ability of the lncRNA signature for OS, their lower prognostic performance for BCR-free survival (Huang et al.: AUC = 0.823 [11]; Xu et al.: AUC = 0.79 [12]) may indirectly confirm the prognostic significance of the combined signature compared with the single lncRNA type, which was also reflected by our study (AUC = 0.945 vs.0.659) (Figure 7). Among these 9 genes, four of them had the consistency between the expression level and the expected prognosis results, that is, the high expression of oncogenes (GINS2 and EBNA1BP2: upregulated, HR > 1) predicted the worse OS, while the high expression of tumor suppressor genes (MAGI2-AS3 and DLG5-AS1: downregulated, HR < 1) predicted the better OS. These findings indicated these four genes may be especially crucial therapeutic targets for PCa. Although rare studies identified lncRNA MAGI2-AS3 as a prognostic biomarker for PCa, the reports of other cancers can indirectly explain their roles. For example, Liu et al. observed that MAGI2-AS3 was downregulated in breast cancer tissues compared to normal adjacent tissues [27]. Overexpression of MAGI2-AS3 suppressed the proliferative, migratory, and invasive capability, while promoted the apoptosis of lung squamous cell carcinoma [28], bladder cancer [29], breast cancer [27, 30], and hepatocellular carcinoma cells [31]. Downregulated MAGI2-AS3 was significantly associated with tumor size, lymph node metastasis, TNM stage, and poor OS [29, 31, 32]. Most of the studies revealed MAGI2-AS3 may function in cancers as a competing endogenous RNA for miRNAs (such as miR-374a/b-5p [28, 31], miRNA-23a-3p [33], and miR-15b-5p [29]) to regulate their target genes (such as CADM2 [28], SMG1 [31], PTEN [33], and CCDC19 [29]), while few indicated MAGI2-AS3 may directly interact with target gene KDM1A [34]. However, the mechanism of MAGI2-AS3 remained unclear. In this study, we predicted that downregulated MAGI2-AS3 may be involved in PCa by leading to the low expression of inflammation (SPARC) or calcium signaling pathway related genes (GJA1 and CYSLTR1). This hypothesis may be believable because these target genes have been implicated to be associated with PCa or other cancers. SPARC expression was found to be decreased in PCa cell lines, the mechanism of which may be attributed to the hypermethylation of its promoter. Also, hypermethylation level was shown to be correlated with a poor prognosis [35]. PCa cells treated with exogenous SPARC exhibited significantly decreased proliferative and migratory abilities [36]. GJA1 (also known as connexin 43) expression was measured to be significantly reduced in tumor tissues compared to that of benign prostatic hyperplasia. Reduced GJA1 expression was associated with high levels of preoperative PSA, high Gleason score, and advanced pT stage and was an independent predictor for shortened postoperative BCR-free survival [37]. Forced expression of GJA1 induced apoptosis of PCa cells by downregulation of Bcl-2 expression and upregulation of caspase-3 activity [38]. By immunohistochemistry (IHC) and quantitative reverse transcription-polymerase chain reaction (qRT-PCR) analyses, Venerito et al. found CYSLTR1 expression was decreased by 0.26-fold in esophageal squamous cell cancer tissues compared to control mucosa [39]. Also, the roles of GINS2 and EBNA1BP2 in PCa could be reflected by their associations with other cancers. GINS2 was shown to be upregulated in the cervical cancer cell lines and tumor specimens compared to the normal control. Patients with higher GINS2 expression had shorter OS than those with lower GINS2 expression [40]. Downregulation of GINS2 markedly inhibited cell proliferation, migration, and invasion [40] and increased the apoptosis rate [41]. Total saponins from Paris forrestii (Takht) H. Li. exhibited an anticancer effect on PCa cells by downregulating GINS2 [42]. Both protein and mRNA levels of EBNA1BP2 were reported to be upregulated in lung cancer samples. EBNA1BP2 may promote cancer cell proliferation by blocking the degradation of oncogene c-Myc [43]. lncRNA DLG5-AS1 may be a newly identified biomarker and therapeutic target for cancer because there was no evidence to show its association with cancer. In this study, we predicted downregulated DLG5-AS1 may exert roles in PCa by decreasing the transcription of DEFB1. This theory may be credible because DEFB1 had been found to be significantly downregulated in PCa tissues and three cell lines [44, 45]. The low expression of DEFB1 in PCa may be mediated by the hypermethylation of the 14 CpG dinucleotide units in its 5′-end promoter region [44]. High expression of DEFB1 was reported to correlate with better prognosis in patients with renal cell carcinoma [46]. The inconsistency between the expression level and the expected prognosis in the other five genes (RHPN1-AS1, MELK, EIF5AL1, and G6PC3: upregulated, but HR < 1; NLGN2: downregulated, but HR > 1) may be attributed to a protective response mechanism in order to resist the development of cancer. This speculation can be verified based on the function studies of these genes. Knockdown of RHPN1-AS1 was shown to result in the development of gefitinib resistance in non-small cell lung cancer cells, whereas the overexpression of RHPN1-AS1 sensitized gefitinib resistant NSCLC cells to gefitinib treatment. Decreased expression of RHPN1-AS1 was associated with poor prognosis of non-small cell lung cancer patients [47]. Furthermore, we also predicted RHPN1-AS1 can positively coexpress with CDC45 and ORC. Patients with low expression of CDC45 and ORC6 were also demonstrated to have significantly worse relapse-free survival and OS [48]. Mass spectrometry identified G6PC3 was a downstream target of Coronin 3. High expressed Coronin 3 was reported to promote the progression of hepatocellular carcinoma cells by inhibiting the expression of G6PC3 [50]. The roles of other genes needed further investigation in the future. Some limitations of our study should be acknowledged. First, this is a study to validate the prognostic value of our identified molecular signature using the retrospective data from the public-available dataset. Prospective clinical trials should be designed to further confirm its prediction ability. Second, wet experiments (IHC, qRT-PCR, overexpression, or knockdown) should be performed to elucidate the expression and roles of the signature genes in PCa because most of them were not reported previously and some were even contradictory. Third, the coexpression relationship between lncRNAs and mRNAs should be explored by chromatin immunoprecipitation, RNA immunoprecipitation, and biotin-labeled RNA pull-down assays [49].

5. Conclusion

Using the WGCNA and LASSO methods, we developed a nine-RNA (including 3 lncRNAs and 6 mRNAs) prognostic signature for PCa patients. This risk score could independently predict the OS and further discriminate the prognostic outcomes for patients with the Gleason score (8-10) and the high level of PSA (above median). Besides, our study may provide new therapeutic targets for PCa patients and the underlying mechanisms for them (MAGI2-AS3-SPARC/GJA1/CYSLTR1, DLG5-AS1-DEFB1, and RHPN1-AS1-CDC45/ORC).
  49 in total

1.  DAVID: Database for Annotation, Visualization, and Integrated Discovery.

Authors:  Glynn Dennis; Brad T Sherman; Douglas A Hosack; Jun Yang; Wei Gao; H Clifford Lane; Richard A Lempicki
Journal:  Genome Biol       Date:  2003-04-03       Impact factor: 13.583

2.  Hypermethylation of the SPARC promoter and its prognostic value for prostate cancer.

Authors:  Tieshi Liu; Xuefeng Qiu; Xiaozhi Zhao; Rong Yang; Huibo Lian; Feng Qu; Xiaogong Li; Hongqian Guo
Journal:  Oncol Rep       Date:  2017-11-29       Impact factor: 3.906

3.  Reduced cancer-specific survival of low prostate-specific antigen in high-grade prostate cancer: A population-based retrospective cohort study.

Authors:  Pan Song; Bo Yang; Zhufeng Peng; Jing Zhou; Zhengju Ren; Kun Fang; Luchen Yang; Linchuan Wang; Qiang Dong
Journal:  Int J Surg       Date:  2020-02-26       Impact factor: 6.071

4.  Identification of a Multi-RNA-Type-Based Signature for Recurrence-Free Survival Prediction in Patients with Uterine Corpus Endometrial Carcinoma.

Authors:  Peizhi Wang; Zhi Zeng; Xiaoting Shen; Xiaohui Tian; Qingjian Ye
Journal:  DNA Cell Biol       Date:  2020-02-27       Impact factor: 3.311

5.  Advanced prostate cancer survival in Spain according to the Gleason score, age and stage.

Authors:  J Campá; G Mar-Barrutia; J Extramiana; A Arróspide; J Mar
Journal:  Actas Urol Esp       Date:  2016-05-09       Impact factor: 0.994

6.  Expression of long non-coding RNA MAGI2‑AS3 in human gliomas and its prognostic significance.

Authors:  X-D Chen; M-X Zhu; S-J Wang
Journal:  Eur Rev Med Pharmacol Sci       Date:  2019-04       Impact factor: 3.507

7.  The impact of clinical stage on prostate cancer survival following radical prostatectomy.

Authors:  Matthew K Tollefson; R Jeffrey Karnes; Laureano J Rangel; Eric J Bergstralh; Stephen A Boorjian
Journal:  J Urol       Date:  2012-11-15       Impact factor: 7.450

8.  Reduced Connexin 43 expression is associated with tumor malignant behaviors and biochemical recurrence-free survival of prostate cancer.

Authors:  Ning Xu; Hui-Jun Chen; Shao-Hao Chen; Xue-Yi Xue; Hong Chen; Qing-Shui Zheng; Yong Wei; Xiao-Dong Li; Jin-Bei Huang; Hai Cai; Xiong-Lin Sun
Journal:  Oncotarget       Date:  2016-10-11

9.  Transcriptome analysis reveals a long non-coding RNA signature to improve biochemical recurrence prediction in prostate cancer.

Authors:  Jinyuan Xu; Yujia Lan; Fulong Yu; Shiwei Zhu; Jianrong Ran; Jiali Zhu; Hongyi Zhang; Lili Li; Shujun Cheng; Yun Xiao; Xia Li
Journal:  Oncotarget       Date:  2018-05-18

10.  A Novel Gene Signature-Based Model Predicts Biochemical Recurrence-Free Survival in Prostate Cancer Patients after Radical Prostatectomy.

Authors:  Run Shi; Xuanwen Bao; Joachim Weischenfeldt; Christian Schaefer; Paul Rogowski; Nina-Sophie Schmidt-Hegemann; Kristian Unger; Kirsten Lauber; Xuanbin Wang; Alexander Buchner; Christian Stief; Thorsten Schlomm; Claus Belka; Minglun Li
Journal:  Cancers (Basel)       Date:  2019-12-18       Impact factor: 6.639

View more
  3 in total

1.  Long non-coding RNA MAGI2-AS3 inactivates STAT3 pathway to inhibit prostate cancer cell proliferation via acting as a microRNA-424-5p sponge.

Authors:  Xin Wei; Yi Hou; Yan Zhang; Huaiwei Zhang; Zhou Sun; Xiangdi Meng; Zhixin Wang
Journal:  J Cancer       Date:  2022-01-01       Impact factor: 4.207

2.  Identification of Pathologic and Prognostic Genes in Prostate Cancer Based on Database Mining.

Authors:  Kun Liu; Yijun Chen; Pengmian Feng; Yucheng Wang; Mengdi Sun; Tao Song; Jun Tan; Chunyang Li; Songpo Liu; Qinghong Kong; Jidong Zhang
Journal:  Front Genet       Date:  2022-03-11       Impact factor: 4.599

3.  MicroRNA‑195‑5p is associated with cell proliferation, migration and invasion in prostate cancer and targets MIB1.

Authors:  Bin Chen; Guohui Bai; Xiaoyan Ma; Lulin Tan; Houqiang Xu
Journal:  Oncol Rep       Date:  2021-10-26       Impact factor: 3.906

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.