
Construction of a set of novel and robust gene expression signatures predicting prostate cancer recurrence.

Yanzhi Jiang1,2,3,4, Wenjuan Mei2,3,4,5, Yan Gu2,3,4, Xiaozeng Lin2,3,4, Lizhi He6, Hui Zeng2,3,4,7, Fengxiang Wei8, Xinhong Wan8, Huixiang Yang1, Pierre Major9, Damu Tang2,3,4.   

Abstract

We report here numerous novel genes and multiple new signatures which robustly predict prostate cancer (PC) recurrence. We extracted 696 differentially expressed genes relative to a reported PC signature from the TCGA dataset (n = 492) and built a 15-gene signature (SigMuc1NW) using Elastic-net with 10-fold cross-validation through analyzing their expressions at 1.5 standard deviation/SD below and 2 SD above a population mean. SigMuc1NW predicts biochemical recurrence (BCR) following surgery with 56.4% sensitivity, 72.6% specificity, and 63.24 median months disease free (MMDF) (P = 1.12e-12). The prediction accuracy is improved with the use of SigMuc1NW's cutpoint (P = 3e-15) and is further enhanced (sensitivity 67%, specificity 75.7%, MMDF 45.2, P = 0) when all 15 genes were analyzed through their cutpoints instead of their SDs. These genes individually associate with BCR using either SD or cutpoint as the cutoff points. Eight of 15 genes are individual risk factors after adjusting for age at diagnosis, Gleason score, surgical margin, and tumor stage. Eleven of 15 genes are novel to PC. SigMuc1NW discriminates BCR with time-dependent AUC (tAUC) values of 76.6% at 11.5 months (76.6%-11.5 m), 73.8%-22.3 m, 78.5%-32.1 m, and 76.4%-48.4 m. SigMuc1NW is correlated with adverse features of PC, high Gleason scores (odds ratio/OR 1.48, P < 2e-16), and advanced tumor stages (OR 1.33, P = 4.37e-13). SigMuc1NW remains an independent risk factor of BCR (HR 2.44, 95% CI 1.53-3.87, P = 1.62e-4) after adjusting for age at diagnosis, Gleason score, surgical margin, and tumor stage. In an independent PC (MSKCC) cohort (n = 140), these 15 genes were altered in PC vs normal tissue, metastatic PCs vs primary PCs, and recurrent PCs vs nonrecurrent PCs. Importantly, a 10-gene subsignature SigMuc1NW1 predicts BCR in MSKCC (P = 3.11e-15) and TCGA (P = 3.13e-12); SigMuc1NW1 discriminates BCR at 18.4 m with tAUC as 82.5%. 
Collectively, our analyses support SigMuc1NW as a novel and robust signature in predicting BCR of PC.
© 2018 The Authors. Published by FEBS Press and John Wiley & Sons Ltd.

Keywords:  MUC1; biomarkers; prostate cancer; prostate cancer recurrence

Year:  2018        PMID: 30024105      PMCID: PMC6120243          DOI: 10.1002/1878-0261.12359

Source DB:  PubMed          Journal:  Mol Oncol        ISSN: 1574-7891            Impact factor:   6.603


Abbreviations: ADT, androgen deprivation therapy; BCR, biochemical recurrence; CRPC, castration‐resistant prostate cancer; DEGs, differentially expressed genes; DFS, disease‐free survival; GS, Gleason score; MMDF, median months disease free; OS, overall survival; PC, prostate cancer; RP, radical prostatectomy

Introduction

Prostate cancer (PC) is the most common malignancy in men in the developed countries (Ferlay et al., 2015). The disease progresses with a large degree of disparity. While a large proportion of the low grade [Gleason score 6/WHO grade (group) I or ISUP (the International Society of Urological pathology) grade 1] tumors are not life‐threatening, approximately 30% of patients after radical prostatectomy (RP) will experience disease recurrence with a rise in serum prostate‐specific antigen (PSA) (Zaorsky et al., 2013); this biochemical recurrence (BCR) indicates significantly increased risk for PC metastasis and castration‐resistant prostate cancer (CRPC) (Semenas et al., 2012). Metastasis is the leading cause of PC death. The standard treatment for metastatic PC is androgen deprivation therapy (ADT), which offers palliative care as resistance in the form of CRPC always occurs. In this regard, intervention at the point of BCR will be more effective than at time when PC has advanced to later stages. Thus, effectively assessing PCs with increased risk of BCR is highly desirable. Recent developments have yielded three commercially available mRNA expression‐based multigene panels, Oncotype DX (Genomic Prostate Score/GPS), Prolaris (cell cycle progression/CCP), and Decipher (Genomic Classifier/GC). Both the 17‐gene Oncotype DX and the 31‐gene Prolaris improve risk stratification of patients with high risk of PC recurrence at time of diagnosis (Albala et al., 2016; Cuzick et al., 2011; Klein et al., 2014; Knezevic et al., 2013; Oderda et al., 2017) and after radical prostatectomy (RP) (Cooperberg et al., 2013; Cullen et al., 2015). The 22‐gene Decipher predicts metastasis following RP (Erho et al., 2013; Karnes et al., 2013; Klein et al., 2016). 
While these and other biomarkers assist decision making and thus improve patient management, their clinical application requires further validation (Lamy et al., 2017; Martin, 2016; McGrath et al., 2016; Patel and Gnanapragasam, 2016; Ross et al., 2016; Zhuang and Johnson, 2016). There is a clear need to improve our ability to stratify PCs with high risk of recurrence following RP. The challenge in accurately predicting PC recurrence is in part attributable to a complex network of pathways that drive the disease development. The Mucin 1 (MUC1) network plays a role in BCR after RP (Eminaga et al., 2016; Lin et al., 2017). MUC1 is a tumor‐associated antigen that has been intensively investigated (Apostolopoulos et al., 2015; Kufe, 2009; Nath and Mukherjee, 2014). MUC1 is a glycoprotein that is expressed on the apical surface of most epithelial tissues (de Paula Peres et al., 2015; Wurz et al., 2014); its glycosylation is altered in over 70% of cancers (Kufe, 2009; de Paula Peres et al., 2015). In PC, MUC1 expression is upregulated and aberrantly glycosylated (Arai et al., 2005; Cozzi et al., 2005; Rabiau et al., 2009). These abnormalities are associated with angiogenesis (Papadopoulos et al., 2001) and adverse clinical features (Eminaga et al., 2016). MUC1 upregulation weakly correlates with shortening in disease‐free survival (DFS) and overall survival (OS) (Eminaga et al., 2016) and associates with adverse histopathology following RP (Durrani et al., 2015). A 3‐protein panel (AZGP1, MUC1, and p53) is related to poor prognosis in men with local PC (Severi et al., 2014). Increases in MUC1 mRNA expression were detected in metastatic PC. Genomic alterations in a 25‐gene MUC1 network were marginally associated with PC recurrence (Wong et al., 2016). Among these 25 genes, genomic alterations in nine genes substantially enhanced the association (Lin et al., 2017). 
To further explore the biomarker value of the MUC1 network, we examined the transcriptome of the 9‐gene MUC1 genomic signature using the TCGA Provisional dataset within cBioPortal, and established 696 differentially expressed genes (DEGs). From these DEGs, a 15‐gene panel and multiple subpanels were constructed. These signatures robustly associate with reductions in DFS following RP in two independent PC datasets (n = 492 and n = 140). Cutpoints have been derived, which not only enhance the power of these signatures in the stratification of men with higher risk of BCR but also provide a guideline for the subsequent validation and clinical application. Taken together, we have constructed a set of novel and robust signatures to assess PC recurrence following RP.

Materials and methods

cBioPortal

The cBioPortal (Cerami et al., 2012; Gao et al., 2013) (http://www.cbioportal.org/index.do) database contains the most well‐organized and comprehensive data on cancer genetics for various cancer types. The TCGA Provisional datasets for individual cancer types cover genetic abnormalities, transcriptomes determined by either cDNA microarray or RNA sequencing, and the detailed clinical characteristics including disease outcomes (recurrence and mortality). The TCGA Provisional PC dataset has 492 patients with localized PC.

Establishing of multigene panel signatures

The largest TCGA Provisional dataset within the cBioPortal database (Cerami et al., 2012; Gao et al., 2013) (http://www.cbioportal.org/index.do), which includes 492 patients with follow‐up data, was used to derive 696 DEGs that are associated with the 9‐gene signature of the MUC1 genomic network (Lin et al., 2017). These DEGs were defined at q < 0.001. Follow‐up period, recurrence, and other clinical data were also extracted. Elastic‐net logistic regression within the glmnet package in R was used, with 10‐fold cross‐validation, to select variables with major impacts on BCR; the Elastic‐net mixing parameter α was set at 0.2 and 0.8. When α = 0, Elastic‐net operates as Ridge regression, which does not perform covariate selection but shrinks the coefficients of correlated predictors toward one another. When α = 1, it runs as Lasso, which tends to select one covariate among a group of related covariates; this will make a signature less robust. To enhance selection of highly related variables as a group while keeping the number of covariates to a minimum, we used two α values: 0.2 and 0.8. With this system, a 15‐gene panel was selected.
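The role of the mixing parameter α described above can be illustrated with a minimal sketch of the Elastic‐net penalty term in the form documented for glmnet, λ[(1 − α)/2 · ‖β‖₂² + α · ‖β‖₁]. The function name and the toy coefficient vector are assumptions made for illustration only; this is not the authors' code, and it does not perform the model fitting or cross‐validation itself.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic-net penalty: lam * ((1 - alpha) / 2 * sum(b^2) + alpha * sum(|b|)).

    alpha = 0 gives the Ridge penalty, alpha = 1 gives the Lasso penalty.
    """
    l1 = sum(abs(b) for b in beta)   # Lasso part (drives coefficients to zero)
    l2 = sum(b * b for b in beta)    # Ridge part (shrinks correlated coefficients together)
    return lam * ((1 - alpha) / 2 * l2 + alpha * l1)


# Hypothetical coefficient vector, used only to compare the settings.
toy_beta = [1.0, -2.0]
for a in (0.0, 0.2, 0.8, 1.0):
    print(a, elastic_net_penalty(toy_beta, lam=1.0, alpha=a))
```

At α = 0.2 the penalty is dominated by the Ridge term, which tends to retain groups of correlated genes, whereas at α = 0.8 the Lasso term dominates and fewer genes are expected to survive selection.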

Assignment of signature scores to patients/tumors

Individual component genes were examined for their ability to predict BCR using univariate Cox proportional hazards (PH) regression, and the Cox coefficients for individual component genes were obtained. The PH assumption was also tested. This analysis was performed using the R ‘survival’ package. The signature score for each patient was calculated as Sum (coef1 + coef2 + … + coefn), where coef1 … coefn are the Cox coefficients of the individual genes.
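The scoring rule can be sketched as follows. The coefficients are the univariate Cox coefficients reported in Table 2; the function name and the assumption that only genes altered in a given tumor (−1.5 SD downregulation or 2 SD upregulation) contribute their coefficient are ours, and the example patient is hypothetical.

```python
# Univariate Cox coefficients of the 15 SigMuc1NW genes (Table 2).
COX_COEF = {
    "SLCO2A1": 1.5813, "CGNL1": 0.9902, "SUPV3L1": 0.8437, "TATDN2": 1.3132,
    "MGAT4B": 1.5178, "VAV2": 1.1027, "SLC25A33": 1.096, "MCCC1": 0.8336,
    "ASNS": 1.3456, "CASKIN1": 1.0286, "DNMT3B": 1.2919, "AURKA": 1.0966,
    "OIP5": 1.365, "CTHRC1": 0.7981, "GOLGA7B": 2.0406,
}


def signature_score(altered_genes):
    """Sum the Cox coefficients of the signature genes altered in one tumor.

    Assumption: genes without the defined alteration contribute 0.
    """
    return sum(COX_COEF[g] for g in altered_genes)


# Hypothetical tumor with two altered component genes.
print(signature_score(["ASNS", "AURKA"]))
```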

Cutpoint estimation

The cutpoint of the signature score that separates recurrent tumors from those without BCR was estimated using Maximally Selected Rank Statistics (the maxstat package) in R. We also retrieved the RNA expression data for each component gene from the TCGA dataset and derived, for each gene, the expression cutpoint that discriminates PCs with BCR from those without BCR.
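The idea behind cutpoint estimation can be sketched in simplified form: scan candidate thresholds and keep the one that best separates the two resulting groups in a survival comparison. The sketch below swaps in a plain two‐group log‐rank chi‐square as the separation measure and omits the multiplicity‐adjusted P‐value; it is not the standardized rank statistic implemented in the maxstat package. All function names, the `min_group` constraint, and the toy data are assumptions.

```python
def logrank_chi2(times, events, group):
    """Two-group log-rank chi-square; group is a list of booleans."""
    o_minus_e, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)                        # at risk, all
        n1 = sum(1 for x, g in zip(times, group) if x >= t and g)  # at risk, group 1
        d = sum(1 for x, e in zip(times, events) if x == t and e)  # events at t
        d1 = sum(1 for x, e, g in zip(times, events, group)
                 if x == t and e and g)                            # events in group 1
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0


def best_cutpoint(scores, times, events, min_group=2):
    """Return (cutpoint, statistic) maximizing the log-rank chi-square."""
    best = (None, -1.0)
    for c in sorted(set(scores))[:-1]:
        group = [s > c for s in scores]
        k = sum(group)
        if k < min_group or len(scores) - k < min_group:
            continue  # skip splits that leave a very small group
        stat = logrank_chi2(times, events, group)
        if stat > best[1]:
            best = (c, stat)
    return best


# Toy data (hypothetical): high scores recur early.
scores = [0.2, 0.4, 0.5, 0.9, 1.9, 2.1, 2.5, 3.0]
months = [60, 55, 70, 65, 10, 12, 8, 15]
recurred = [0, 0, 1, 0, 1, 1, 1, 1]
cut, stat = best_cutpoint(scores, months, recurred)
print(cut, round(stat, 2))
```

Because the threshold is chosen to maximize the statistic, the unadjusted P‐value at the selected cutpoint is optimistic; accounting for this selection is the part of maximally selected rank statistics that this sketch leaves out.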

Regression analyses

Logistic regression was performed using R. Cox proportional hazards (Cox PH) regression analyses were carried out using the R survival package. The PH assumption was examined.

Pathway enrichment analysis

The GAGE (Luo et al., 2009) and Reactome (Yu and He, 2016) packages in R were used to analyze gene sets and pathways that were enriched in DEGs using the KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (gene ontology) databases.

Statistical analysis

Fisher's exact test was performed using the GraphPad Prism 5 software. Kaplan–Meier survival curves and log‐rank tests were generated using the R survival package and tools provided by cBioPortal. Univariate and multivariate Cox regression analyses were run using the R survival package. Time‐dependent receiver operating characteristic (tROC) analysis was performed using the R timeROC package. A value of P < 0.05 was considered statistically significant.

Results

Identification of DEGs which are associated with the 9‐gene MUC1 genomic signature

Biochemical recurrence (BCR) after surgical resection occurs in 30–40% of patients (Punnen et al., 2014); approximately 40% of these patients will develop metastatic disease (Briganti et al., 2015; Den et al., 2014). Improving our ability to predict BCR risk is clearly critical in preventing metastatic progression. We have recently constructed a 9‐gene genomic signature from the MUC1 genomic network (Lin et al., 2017); the signature effectively predicts BCR using the TCGA Provisional dataset: sensitivity 34.8%, specificity 83.6%, and median months disease free (MMDF) 73.36 months (P = 5.57e‐5) (Lin et al., 2017). BCR is a complex process driven by multiple pathway alterations. In this regard, we reasoned that the transcriptome associated with the 9‐gene genomic signature may yield a better signature. To investigate this possibility, we analyzed the 9‐gene signature‐associated transcriptome using the TCGA Provisional dataset within the cBioPortal database following the strategy outlined in Fig. 1A. Among 492 patients/tumors, 100 were positive for the signature (Fig. 1A). A comparison of the mean expression of individual genes between these 100 PCs and the other 392 PCs revealed a total of 696 differentially expressed genes (DEGs), which were defined at q < 0.001 (Table S1). These DEGs contained 416 downregulations and 280 upregulations (Fig. 1A; Table S1). Geneset enrichment analysis of these DEGs using the KEGG (kegg) kegg.set.hs dataset and the GAGE package in R revealed the upregulation of the genesets regulating cell cycle, oocyte meiosis, and progesterone‐mediated oocyte maturation (Table S2A), and downregulation of the genesets mediating focal adhesion and others (Table S2B). With the Gene Ontology (go) go.sets.hs dataset, the upregulated genesets include those regulating multiple aspects of cell cycle progression, DNA metabolism, and other processes related to cell proliferation (Table S2C). 
Downregulated genesets contain those that mediate cell adhesion, extracellular processes, and other events (Table S2D). Pathway enrichment analysis of the 696 DEGs using the Reactome package in R identified pathways regulating G1, M, DNA replication, and chromatid segregation (Table S2E). Collectively, the above analyses reveal an association of the 696 DEGs with PC cell proliferation, implying their potential in predicting PC progression.
Figure 1

Construction of a 15‐gene signature. (A) Strategy used to produce the signature. The TCGA Provisional dataset within cBioPortal has 492 prostate cancers with gene expression profiled by RNA sequencing. The cohort was first divided into two populations: one (n = 100) positive for a 9‐gene signature derived from a MUC1 genomic network (Lin et al., 2017) and another (n = 392) negative for the signature. From these two populations, 696 differentially expressed genes (DEGs) were identified based on the mean mRNA expression and q < 0.001. These DEGs consist of 416 downregulated genes and 280 upregulated genes. For the downregulated genes, we identified tumors with gene expression more than 1.5 SD (standard deviation) below a reference population mean (−1.5 SD); for the upregulated genes, we identified PCs with expression of these genes more than 2 SD above the population mean. We then performed model‐building using regularization‐coupled covariate selection of these 696 DEGs for their impact on BCR using the Elastic‐net penalty in the R glmnet package (Fig S1 for a typical selection), which resulted in a 15‐gene signature (SigMuc1NW). (B) PCs of the TCGA cohort with −1.5 SD (SLCO2A1 and CGNL1) and 2 SD expression are shown using OncoPrint (top gray illustration) and clustered (bottom color image). The disease‐free status is also included. The illustration was generated using tools provided by cBioPortal.

Construction of a 15‐gene signature SigMuc1NW to predict BCR following radical prostatectomy (RP)

We then analyzed the contributions of these 696 DEGs to BCR using the TCGA Provisional cohort, in which the primary treatment was RP (cBioPortal). While the classic system to construct a signature is to randomly divide a dataset into a training set and testing set (Lin et al., 2017), we chose to use the system of cross‐validation. This system is selected due to our large number of DEGs to be assessed for their impact on BCR and the availability of the powerful machine learning programs in the glmnet R package. Based on the heterogeneity of PCs, we reasoned that these DEGs may affect BCR when their expression is beyond a threshold level. For the downregulated DEGs, we separated PCs with their expression lower than 1.5 SD (standard deviation) of a reference population mean from those without this level of downregulation. For the upregulated DEGs, we grouped PCs with DEG expressions above 2 SD from the reference population mean (Fig. 1A). A reference population was either tumors within the dataset that are diploid for the gene of interest or the intact tumor population (http://www.cbioportal.org/faq.jsp). The justifications of using the levels of −1.5 SD downregulation and 2 SD upregulation here were based on our publication (Ojo et al., 2017) and to maintain a sufficient number of DEGs available for variable selection as a value below −1.5 SD or above 2 SD significantly reduced the number of qualified DEGs (data not shown). Using this re‐organized dataset containing the downregulations, upregulations, follow‐up period, and recurrence status for each patient, we then performed covariate selection with regularization using Elastic‐net logistic regression within the R glmnet package (Fig. 1A). To balance the selection of highly correlated covariates and minimization of the number of covariates, we ran Elastic‐net with the mixing parameter α set at 0.2 or 0.8. A 10‐fold cross‐validation was used in all selection settings. 
As expected, more covariates were selected at α = 0.2 (n = 17) than α = 0.8 (n = 5) (Fig. S1). We also performed covariate selection with a different setting (s = 0.5) which resulted in more covariates than the setting of α = 0.2. We then removed those DEGs with coefficient < 0.01 in the s = 0.5 setting and < 0.001 in the α = 0.2 setting. This resulted in a panel of 15 genes (SigMuc1NW; NW referring to network), including all 5 genes selected at α = 0.8, 14 genes selected from α = 0.2 (including all 5 genes selected at α = 0.8), and 15 DEGs from s = 0.5 (including all 14 genes selected at α = 0.2) (Table 1).
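The thresholding step described above (−1.5 SD for downregulated DEGs and 2 SD for upregulated DEGs, relative to a reference population) can be sketched as a z‐score binarization. This is an illustrative sketch only: the function names, the use of the sample standard deviation, and the toy expression values are assumptions, and the actual z‐scores in the study were those provided by cBioPortal.

```python
from statistics import mean, stdev


def z_scores(values, reference):
    """Standardize expression values against a reference population."""
    mu, sd = mean(reference), stdev(reference)
    return [(v - mu) / sd for v in values]


def flag_altered(values, reference, direction):
    """Return 1 for tumors beyond the threshold, else 0.

    direction="down": z < -1.5 (downregulated DEGs)
    direction="up":   z >  2.0 (upregulated DEGs)
    """
    z = z_scores(values, reference)
    if direction == "down":
        return [1 if x < -1.5 else 0 for x in z]
    if direction == "up":
        return [1 if x > 2.0 else 0 for x in z]
    raise ValueError("direction must be 'down' or 'up'")


# Hypothetical expression values for one gene.
reference = [10, 12, 11, 9, 13, 10, 11, 12]
tumors = [5, 11, 20]
print(flag_altered(tumors, reference, "down"))
print(flag_altered(tumors, reference, "up"))
```

The resulting 0/1 indicators, one column per DEG, together with follow‐up time and recurrence status, form the kind of re‐organized dataset that is passed to covariate selection.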
Table 1

The component genes of SigMuc1NW

| Gene | Locus | Name | Role in PC/other tumorigenesis | References |
|---|---|---|---|---|
| SLCO2A1 [a] | 3q22.1‐q22.2 | Solute carrier organic anion transporter family member 2A1 | Unknown/inactivation of it facilitates colon cancer formation | Guda et al., 2014 |
| CGNL1 [a] | 15q21.3 | Cingulin like 1 | Unknown/unknown | NA |
| SUPV3L1 [b] | 10q22.1 | Suv3 like RNA helicase | Unknown/unknown | NA |
| TATDN2 [b] | 3p25.3 | TatD DNase domain containing 2 | Unknown/unknown | NA |
| MGAT4B [b] | 5q35.3 | Mannosyl (alpha‐1,3‐)‐glycoprotein β‐1,4‐N‐acetylglucosaminyltransferase, isozyme B | Unknown/upregulation in murine hepatocellular carcinoma | Blomme et al., 2013 |
| VAV2 [b] | 9q34.2 | Vav guanine nucleotide exchange factor 2 | An androgen receptor (AR) coactivator; enhancing AR signaling in PC/ | Magani et al., 2017 |
| SLC25A33 [b] | 1p36.22 | Solute carrier family 25 member 33 | Unknown/a mitochondrial UTP carrier; contributing to IGF‐induced cell growth | Lyons et al., 2017 |
| MCCC1 [b] | 3q27.1 | Methylcrotonyl‐CoA carboxylase 1 | Unknown/gain of function was reported in oral squamous cell carcinoma | Ribeiro et al., 2014 |
| ASNS [b] | 7q21.3 | Asparagine synthetase | Contributing to CRPC/ | Sircar et al., 2012 |
| CASKIN1 [b] | 16p13.3 | CASK interacting protein 1 | Unknown/unknown | NA |
| DNMT3B [b] | 20q11.21 | DNA methyltransferase 3 beta | Likely facilitating CRPC/ | Gravina et al., 2011 |
| AURKA [b] | 20q13.2 | Aurora kinase A | Contributing to CRPC/ | Mosquera et al., 2013 |
| OIP5 [b] | 15q15.1 | Opa interacting protein 5 | Unknown/a cancer testis antigen detected in colorectal cancer | Tarnowski et al., 2016 |
| CTHRC1 [b] | 8q22.3 | Collagen triple helix repeat containing 1 | Unknown/promoting tumorigenesis in multiple cancer types | Ke et al., 2014 |
| GOLGA7B [b] | 10q24.2 | Golgin A7 family member B | Unknown/unknown | NA |

[a] −1.5 SD downregulated genes.

[b] 2 SD upregulated genes.

NA: not available.

Among the 15 genes, SLCO2A1 and CGNL1 are downregulated and the rest are upregulated (Table 1). Five genes (CGNL1, SUPV3L1, TATDN2, CASKIN1, and GOLGA7B) have no known function in either PC tumorigenesis or tumorigenesis in general (Table 1). Six genes (SLCO2A1, MGAT4B, SLC25A33, MCCC1, OIP5, and CTHRC1) have been shown to affect the tumorigenesis of other cancer types but not PC (Blomme et al., 2013; Chen et al., 2013; Guda et al., 2014; Ke et al., 2014; Lyons et al., 2017; Ribeiro et al., 2014; Tarnowski et al., 2016) (Table 1). OIP5 (Opa interacting protein 5) is a cancer testis antigen and has been reported in other cancer types as a tumor‐associated antigen (TAA) (Tarnowski et al., 2016); its detection in PC here suggests that OIP5 is a TAA for PC. The remaining four genes, VAV2 (VAV guanine nucleotide exchange factor 2), ASNS (asparagine synthetase), DNMT3B (DNA methyltransferase 3 beta), and AURKA (Aurora kinase A), not only promote PC pathogenesis but also play a role in the development of CRPC (Gravina et al., 2011; Magani et al., 2017; Mosquera et al., 2013; Sircar et al., 2012). VAV2 is a coactivator of androgen receptor (AR) and sustains AR signaling under androgen deprivation therapy (ADT) (Magani et al., 2017); it also promotes angiogenesis and metastasis (Barrio‐Real and Kazanietz, 2012). AURKA plays a critical role in mitosis (Dominguez‐Brauer et al., 2015; Plotnikova et al., 2015) and promotes the development of neuroendocrine PC under ADT (Beltran et al., 2011; Mosquera et al., 2013). DNMT3B may regulate epigenetic events to facilitate CRPC development (Hoffmann et al., 2007). Collectively, evidence supports an association of SigMuc1NW with PC recurrence. 
In line with this possibility, univariate Cox proportional hazards (PH) analysis revealed that all component genes at the defined expression levels (−1.5 SD downregulation and 2 SD upregulation) significantly predict BCR (Table 2). Except for TATDN2 and OIP5, the PH assumption of the Cox model was confirmed. The prediction for some genes (MGAT4B, ASNS, DNMT3B, and OIP5) is robust (Table 2), particularly considering that the prediction is based on individual genes.
Table 2

Association of the component genes of SigMuc1NW with PC recurrence [a]

| Genes | Coef [b] | HR [c] | 95% CI [d] | P‐value |
|---|---|---|---|---|
| SLCO2A1 [e] | 1.5813 | 4.861 | 1.763–13.4 | 0.00225** |
| CGNL1 [e] | 0.9902 | 2.692 | 1.546–4.686 | 0.000464*** |
| SUPV3L1 [f] | 0.8437 | 2.325 | 1.168–4.629 | 0.0163* |
| TATDN2 [f] | 1.3132 | 3.718 | 1.855–7.45 | 0.000213*** |
| MGAT4B [f] | 1.5178 | 4.562 | 2.245–9.272 | 2.73e‐5*** |
| VAV2 [f] | 1.1027 | 3.012 | 1.671–5.429 | 0.000244*** |
| SLC25A33 [f] | 1.096 | 2.992 | 1.55–5.777 | 0.00109** |
| MCCC1 [f] | 0.8336 | 2.302 | 1.322–4.007 | 0.00321** |
| ASNS [f] | 1.3456 | 3.84 | 2.064–7.145 | 2.15e‐5*** |
| CASKIN1 [f] | 1.0286 | 2.797 | 1.55–5.047 | 0.000636*** |
| DNMT3B [f] | 1.2919 | 3.64 | 1.928–6.87 | 6.73e‐5*** |
| AURKA [f] | 1.0966 | 2.994 | 1.692–5.298 | 0.000166*** |
| OIP5 [f] | 1.365 | 3.914 | 2.022–7.576 | 5.13e‐5*** |
| CTHRC1 [f] | 0.7981 | 2.221 | 1.15–4.289 | 0.0174* |
| GOLGA7B [f] | 2.0406 | 7.695 | 2.388–24.79 | 0.00063*** |

[a] Univariate Cox analysis was performed using the TCGA Provisional cohort (n = 492).

[b] Cox coefficient.

[c] Hazard ratio.

[d] Confidence interval.

[e] Gene expression was < −1.5 SD of the reference population mean.

[f] Gene expression was at > 2 SD of the reference population mean.

*P < 0.05; **P < 0.01; ***P < 0.001.
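The columns of Table 2 are linked by the standard Cox relationships HR = exp(coef) and 95% CI = exp(coef ± 1.96 × SE). As a worked check, the sketch below recomputes the SLCO2A1 row. The standard error is not reported in the table, so it is back‐calculated here from the published interval; that step, and the function names, are assumptions made for illustration.

```python
import math

Z95 = 1.959964  # two-sided 95% normal quantile


def hazard_ratio(coef):
    """Hazard ratio implied by a Cox coefficient."""
    return math.exp(coef)


def se_from_ci(lower, upper):
    """Back-calculate the coefficient's standard error from a 95% CI on the HR."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def wald_ci(coef, se):
    """95% Wald confidence interval for the hazard ratio."""
    return math.exp(coef - Z95 * se), math.exp(coef + Z95 * se)


# SLCO2A1 row of Table 2: coef 1.5813, HR 4.861, 95% CI 1.763-13.4
coef = 1.5813
se = se_from_ci(1.763, 13.4)
print(round(hazard_ratio(coef), 3), [round(x, 3) for x in wald_ci(coef, se)])
```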

In support of our selection of related genes, changes in the 15 genes show an overlapping profile (Fig. 1B, top panel) and their expression can be clustered (Fig. 1B, bottom panel). The downregulation/upregulation‐based alterations and gene expression‐derived cluster are well matched (Fig. 1B), providing a validation for our covariate selection. Importantly, patients with these changes are at risk of developing recurrent PC; that is, these patients are enriched with recurrent tumors (Fig. 1B, see the ‘Disease‐free status’ illustration). Tumors positive for SigMuc1NW are also robustly associated with reductions in disease‐free survival (DFS) (Fig. 2A, P = 1.12e‐12). The association has a sensitivity of 56.4% and a specificity of 72.6%; compared with the initially reported 9‐gene signature (sensitivity of 34.8%, specificity of 83.6%, P = 5.57e‐5) (Lin et al., 2017), sensitivity is markedly higher, at the cost of lower specificity. Considering that the TCGA cohort had 10 deaths in total, it is intriguing that 8 of these 10 deaths occurred in patients with SigMuc1NW‐positive PC (Fig. 2B, P = 0.00212), which is consistent with VAV2, ASNS, DNMT3B, and AURKA being factors promoting CRPC development (Gravina et al., 2011; Magani et al., 2017; Mosquera et al., 2013; Sircar et al., 2012). As expected, SigMuc1NW displays an overlapping pattern with the 9‐gene genomic signature used to select DEGs (Fig. S2). Inclusion of SigMuc1NW substantially enhanced the association of the 9‐gene signature with BCR (Fig. S3A,C) and significantly correlates with a reduction in overall survival (OS) (Fig. S3B).
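For readers less familiar with the two accuracy measures quoted above, sensitivity is the fraction of recurrent tumors that are signature‐positive, and specificity is the fraction of nonrecurrent tumors that are signature‐negative. The sketch below shows the arithmetic on toy counts; the counts are hypothetical and are not taken from the TCGA cohort, and the function name is an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: recurrent and signature-positive     fn: recurrent and signature-negative
    tn: nonrecurrent and signature-negative  fp: nonrecurrent and signature-positive
    """
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=56, fn=44, tn=290, fp=110)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```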
Figure 2

SigMuc1NW is associated with reductions in disease‐free survival (DFS) and overall survival (OS) in patients with PC. The TCGA Provisional cohort was used in these analyses. (A) The effect of SigMuc1NW on DFS. MDF: months disease free; MS: months survival; MMDF: median months disease free; NA: not available as MMDF being not reached. Numbers of patient at risk at the start of the indicated follow‐up period were included. (B) The impact of SigMuc1NW on OS. MMS: median months survival. Kaplan–Meier and log‐rank test were performed using the R survival Package.

SigMuc1NW effectively discriminates recurrent PCs from those without BCR

To examine the effectiveness of SigMuc1NW in separating recurrent PCs from those without BCR, we weighted the alterations of the 15 genes by their Cox coefficients (Table 2). The cumulative SigMuc1NW score for each patient was then calculated as ∑(f_i) (f_i: Cox coefficient of gene i, n = 15) (Table S3). The sensitivity and specificity of the SigMuc1NW scores in discriminating BCR were analyzed using time‐dependent ROC (tROC). The scores discriminate recurrent PC with tAUC (area under curve) ranging from 74.9% at 11.5 and 32.1 months to 69.7% at 48.4 months (Fig. 3A), suggesting that SigMuc1NW is particularly effective in predicting earlier BCR. To further investigate this application, we determined the cutpoint of the SigMuc1NW scores that separates recurrent from nonrecurrent PC using Maximally Selected Rank Statistics in the maxstat package in R (Fig. S4) and converted the scores into a binary code; scores ≤ 1.7833 (cutpoint, Fig. S4) were assigned ‘0’ and scores > 1.7833 were assigned ‘1’. PCs with scores above the cutpoint develop BCR faster than those with scores not above the cutpoint (Fig. 3B). Intriguingly, the cutpoint‐positive tumors developed BCR in an even shorter time frame (Fig. 3B; MMDF 33.1, 95% CI 30.9–73.4) than SigMuc1NW‐positive PCs (Fig. 2A; MMDF 63.2, 95% CI 40–77.3). The cutpoint thus not only will facilitate clinical examination of SigMuc1NW but also enhances its predictive power. Additionally, both mean and quartile 3 (Q3) scores can stratify patients with high risk of BCR with effectiveness comparable to that of SigMuc1NW (comparing Fig. 3C,D to Fig. 2A). The mean and Q3 scores cover 48 and 46 recurrent PCs, respectively (Fig. 3C,D), which are more than the 41 recurrent PCs marked by the cutpoint (Fig. 3B). Thus, the mean (0.918), Q3 (1.019), and cutpoint (1.7833) scores can also be used to predict BCR following RP with a range of BCR risk. 
We further demonstrated that SigMuc1NW (P = 1.62e‐4), cutpoint (P = 2.05e‐5) (Table 3), Mean (P = 1.19e‐4), and Q3 (P = 1.67e‐4) (data not shown) are independent risk factors for PC recurrence after adjusting for age at diagnosis, RP Gleason scores, surgical margin, and TNM tumor stage. When the World Health Organization (WHO) PC grading system [WHO grade (group) I‐V] or its equivalent ISUP (the International Society of Urological Pathology) grade (Egevad et al., 2016; Gordetsky and Epstein, 2016) (Table S3 for details) was used instead of Gleason grade, SigMuc1NW (P = 2.05e‐4), cutpoint (P = 1.91e‐5), Mean (P = 1.37e‐4), and Q3 (P = 1.86e‐4) remain independent risk factors for BCR. The demographics of the TCGA dataset with respect to the clinical characteristics used in the above multivariate Cox analyses are included (Table S4).
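The conversion of SigMuc1NW scores into a binary code can be sketched as below, using the cutpoint stated in the text (scores ≤ 1.7833 coded ‘0’, scores > 1.7833 coded ‘1’). The mean (0.918) and Q3 (1.019) values are passed as alternative thresholds; treating them with the same strictly‐greater‐than rule is an assumption, as are the function name and the example scores.

```python
CUTPOINT = 1.7833    # cutpoint from maximally selected rank statistics (Fig. S4)
MEAN_SCORE = 0.918   # mean SigMuc1NW score reported in the text
Q3_SCORE = 1.019     # third-quartile SigMuc1NW score reported in the text


def binary_code(score, threshold=CUTPOINT):
    """Return 1 if the score is above the threshold, else 0."""
    return 1 if score > threshold else 0


# Hypothetical patient scores.
for s in (0.0, 1.0, 1.7833, 2.44):
    print(s, binary_code(s), binary_code(s, MEAN_SCORE), binary_code(s, Q3_SCORE))
```

Lower thresholds such as the mean or Q3 flag more tumors as positive, which is consistent with these thresholds covering more recurrent PCs than the cutpoint.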
Figure 3

SigMuc1NW scores effectively stratify PCs with a high risk of recurrence. (A) All tumors within the TCGA Provisional cohort were scored for SigMuc1NW (see Results for details). The scores were analyzed for discrimination of tumors with high risk of recurrence using tROC. AUC at the indicated period of time (tAUC) along with the status of disease recurrence are indicated. DF: disease free. (B) The cutpoint of SigMuc1NW scores for effectively separating PCs with high risk of recurrence from low risk PCs was estimated (Fig S4 for details), followed by assigning binary codes to tumors based on the cutpoint (see Results for details). The effects of cutpoint on DFS of the patients in the TCGA cohort were then determined. (C, D) The effects of Mean and Q3 scores of SigMuc1NW on BCR in PC patients in the TCGA Provisional cohort. Kaplan–Meier and log‐rank test were performed using the R survival Package. The vertical dot line shows MMDF. The color dot curves are for 95% CI.

Table 3

Univariate and multivariate Cox analysis of SigMuc1NW for PC recurrence

| Factors | Univariate HR | Univariate 95% CI | Univariate P‐value | Multivariate (Sig) HR | Multivariate (Sig) 95% CI | Multivariate (Sig) P‐value | Multivariate (Cutpoint) HR | Multivariate (Cutpoint) 95% CI | Multivariate (Cutpoint) P‐value |
|---|---|---|---|---|---|---|---|---|---|
| Sig [a] | 4.16 | 2.74–6.36 | 5.54e‐11* | 2.44 | 1.53–3.87 | 1.62e‐4* | NA | NA | NA |
| Cutpoint [b] | 4.6 | 3.03–6.97 | 6.44e‐13* | NA | NA | NA | 2.67 | 1.70–4.20 | 2.05e‐5* |
| Age [c] | 1.03 | 0.99–1.06 | 0.0981 | 0.999 | 0.97–1.03 | 0.9711 | 1.001 | 0.97–1.03 | 0.9756 |
| GS [d] | 2.19 | 1.76–2.72 | 1.49e‐12* | 1.62 | 1.25–2.11 | 2.71e‐4* | 1.62 | 1.25–2.10 | 2.86e‐4* |
| SMargin [e] | 2.25 | 1.48–3.41 | 0.000137* | 1.25 | 0.79–1.98 | 0.3306 | 1.28 | 0.81–2.02 | 0.2976 |
| TumStge [f] | 3.68 | 2.08–6.51 | 8.19e‐6* | 1.82 | 0.97–3.40 | 0.0614 | 1.82 | 0.96–3.45 | 0.0668 |

[a] SigMuc1NW.

[b] SigMuc1NW‐derived cutpoint.

[c] Age at diagnosis.

[d] Radical prostatectomy Gleason score.

[e] Surgical margin.

[f] Tumor stages (0 for ≤ T2; 1 for T3 and T4).

HR, hazard ratio; CI, confidence interval; NA, not available.

*P < 0.05.

Enhancing the predictive efficiency of SigMuc1NW

To further demonstrate that SigMuc1NW is effective and robust, we analyzed the signature using the actual gene expression data instead of the SD (standard deviation)-based distribution. For this purpose, the RNA sequencing data for all 15 SigMuc1NW genes were retrieved from the TCGA dataset, and a cutpoint separating recurrent PCs was estimated for each gene (Table 4). All tumors were given a binary code for each of the 15 genes as described above, except for the two downregulated genes SLCO2A1 and CGNL1, for which tumors with expression less than the cutpoint were assigned '1'. Univariate Cox PH analysis was carried out with the PH assumption confirmed for all genes. All 15 genes, as defined by their cutpoints, significantly predict BCR (Fig. 4). Additionally, SLCO2A1, SUPV3L1, TATDN2, MGAT4B, VAV2, SLC25A33, ASNS, and OIP5 remain independent risk factors of BCR after adjusting for age at diagnosis, RP Gleason score, surgical margin, and TNM tumor stage (Table 5). These observations are appealing considering their single-gene nature, and the fact that 8/15 component genes of SigMuc1NW possess independent predictive value for BCR further supports SigMuc1NW as a signature for BCR.
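The binary coding rule described above can be illustrated with a minimal Python sketch. The cutpoints are copied from Table 4 for three of the 15 genes; the function name `binary_code`, the handling of ties at the cutpoint, and the expression values of the example tumor are assumptions made for illustration and are not taken from the original analysis, which was carried out in R.

```python
# Genes reported as downregulated in SigMuc1NW: coded 1 when expression
# falls below the cutpoint (all other genes are coded 1 above it).
DOWNREGULATED = {"SLCO2A1", "CGNL1"}

# Cutpoints for three of the 15 genes, copied from Table 4.
CUTPOINTS = {"SLCO2A1": 497.3292, "CGNL1": 3066.229, "OIP5": 16.4317}


def binary_code(gene: str, expression: float) -> int:
    """Return 1 if the tumor is cutpoint-positive for this gene, else 0."""
    cut = CUTPOINTS[gene]
    if gene in DOWNREGULATED:
        return int(expression < cut)
    # Assumption: a value exactly at the cutpoint is coded as negative.
    return int(expression > cut)


# Hypothetical expression values for one tumor (not from the dataset).
tumor = {"SLCO2A1": 350.0, "CGNL1": 4000.0, "OIP5": 25.0}
codes = {gene: binary_code(gene, value) for gene, value in tumor.items()}
print(codes)
```

In this hypothetical tumor, SLCO2A1 is below its cutpoint and OIP5 is above its cutpoint, so both are coded '1', whereas CGNL1 is not reduced and is coded '0'.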
Table 4

SigMuc1NW^a component genes defined at their cutpoints associate with BCR

| Genes | Cutpoint^b | P-value | Coef^c | P-value |
|---|---|---|---|---|
| SLCO2A1 | 497.3292 | 0.09128 | 0.7967 | 0.00499** |
| CGNL1 | 3066.229 | 0.004126** | 0.7966 | 0.000372*** |
| SUPV3L1 | 545.8928 | 0.007953** | 0.7992 | 0.000187*** |
| TATDN2 | 1756.057 | 0.002471** | 0.8731^d | 8.48e-5*** |
| MGAT4B | 1818.718 | 6.389e-5*** | 1.0331 | 2.61e-6*** |
| VAV2 | 1489.06 | 0.000547*** | 0.9402 | 9.94e-6*** |
| SLC25A33 | 297.5508 | 0.2522 | 0.8503 | 0.0218* |
| MCCC1 | 1233.159 | 0.001077** | 1.0179 | 1.2e-5*** |
| ASNS | 1041.086 | 0.01123* | 1.0544 | 0.000109*** |
| CASKIN1 | 106.4046 | 0.02646* | 0.7006 | 0.00125** |
| DNMT3B | 61.4086 | 0.008576** | 0.9082 | 0.000175*** |
| AURKA | 81.1249 | 3.807e-5*** | 1.0223 | 1.12e-6*** |
| OIP5 | 16.4317 | 4.237e-7*** | 1.242^d | 2.64e-8*** |
| CTHRC1 | 180.8622 | 0.01389* | 0.7608 | 0.000537*** |
| GOLGA7B | 23.2022 | 0.01249* | 0.7623 | 0.000581*** |

^a RNA sequencing data of SigMuc1NW's component genes were retrieved from the TCGA Provisional dataset (cBioPortal).

^b Cutpoint was estimated using Maximally Selected Rank Statistics in R.

^c Coefficient to BCR was determined using univariate Cox proportional hazards analysis.

^d PH assumption was at P < 0.05.

*P < 0.05; **P < 0.01; ***P < 0.001.

Figure 4

All 15 component genes are significantly associated with PC recurrence and the formulation of three subsignatures. The mRNA expression data for the 15 genes were retrieved from the TCGA Provisional dataset (cBioPortal). Individual cutpoints were derived, and binary codes were assigned to all tumors. The hazard ratio (HR) of PC recurrence for each individual gene was determined using the univariate Cox proportional hazards (PH) model. The PH assumption was evaluated and confirmed. These analyses were carried out using the R survival package. Individual HR, the 95% CI, and P-value are included. The inclusion of component genes in SigCut1, SigCut2, and SigCut3, which was based on the P-values, is shown.

Table 5

Univariate and multivariate Cox analysis of SigMuc1NW component genes defined at cutpoint for PC recurrence

| Factors | Univariate Cox analysis: HR | 95% CI | P-value | Multivariate Cox analysis: HR | 95% CI | P-value |
|---|---|---|---|---|---|---|
| Age^a | 1.03 | 0.99–1.06 | 0.0981 | NS^e | | |
| GS^b | 2.19 | 1.76–2.72 | 1.49e-12* | 1.71–1.89^f | (1.32–1.46)–(2.20–2.41)^f | 4.48e-7*–1.4e-5*^f |
| SMargin^c | 2.25 | 1.48–3.41 | 0.000137* | NS^e | | |
| TumStge^d | 3.68 | 2.08–6.51 | 8.19e-6* | 1.62–2.07 | (0.85–1.08)–(3.08–3.96)^f | 0.0272*^h–0.139^f,g |
| SLCO2A1 | 2.22 | 1.27–3.87 | 0.00499* | 1.82 | 1.04–3.19 | 0.0369* |
| SUPV3L1 | 2.22 | 1.46–3.38 | 1.87e-4* | 2.08 | 1.36–3.19 | 7.98e-4* |
| TATDN2 | 2.39 | 1.55–3.70 | 8.48e-5* | 2.15 | 1.37–3.37 | 8.35e-4* |
| MGAT4B | 2.81 | 1.83–4.32 | 2.61e-6* | 1.77 | 1.23–2.78 | 0.0128* |
| VAV2 | 2.56 | 1.69–3.89 | 9.94e-6* | 1.93 | 1.26–2.95 | 0.0024* |
| SLC25A33 | 2.34 | 1.13–4.84 | 0.0218* | 2.25 | 1.08–4.67 | 0.0297* |
| ASNS | 2.87 | 1.68–4.90 | 1.09e-4* | 1.91 | 1.09–3.36 | 0.0239* |
| OIP5 | 3.46 | 2.24–5.36 | 2.64e-8* | 1.94 | 1.20–3.12 | 0.00638* |

^a Age at diagnosis.

^b Radical prostatectomy Gleason score.

^c Surgical margin.

^d Tumor stages (0 for ≤ T2; 1 for T3 and T4).

^e Not significant.

^f Range of HR, 95% CI, and P-values resulting from multivariate Cox analysis with the individual genes.

^g The P-values for SLCO2A1 (P = 0.0749), MGAT4B (P = 0.0891), ASNS (P = 0.0917), and OIP5 (P = 0.139).

^h The P-values for SUPV3L1 (P = 0.0431*), TATDN2 (P = 0.0272*), VAV2 (P = 0.0364*), and SLC25A33 (P = 0.0334*).

HR, hazard ratio; CI, confidence interval.

Using the Cox coefficients (Table 4), all cutpoint‐positive events were converted to the respective coefficient values (Table S5). Based on the robustness defined by P‐values (Fig. 4), we formulated three subsignatures SigCut1, SigCut2, and SigCut3 (Fig. 4). 
All tumors were then scored for SigCut1, SigCut2, and SigCut3 using $\sum_{i=1}^{n} f_i$ ($f_i$: Cox coefficient of gene $i$; $n$ = 3, 6, or 15). All three subsignatures discriminate recurrent PC effectively with tAUC > 70% (Fig. 5A). The respective cutpoints were determined: 1.0331 (P = 6.166e-8) for SigCut1, 4.0135 (P = 1.005e-11) for SigCut2, and 5.4067 (P = 7.97e-15) for SigCut3. The respective binary code for each subsignature was then assigned to all tumors and used to perform survival analysis. All three subsignatures are strongly associated with reductions in DFS, with SigCut2 and SigCut3 being more robust (Fig. 5B–D). Nonetheless, they predict BCR with a range of effectiveness in terms of the number of recurrent tumors included, the duration of MMDF, and sensitivity/specificity: 71.4%/63.9% for SigCut1, 41.8%/87.5% for SigCut2, and 67.7%/75.7% for SigCut3 (Fig. 5B–D). Because the three subsignatures differ in their balance of sensitivity and specificity, they can be used together to predict recurrent PCs; this will significantly enhance their predictive power.
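As an illustration of the scoring scheme, the following Python sketch sums the Cox coefficients of the cutpoint-positive genes for the 15-gene SigCut3. The coefficients and the SigCut3 cutpoint (5.4067) are taken from Table 4 and the text; the function names, the choice to treat a score equal to the cutpoint as high risk, and the example tumor are hypothetical. The gene membership of SigCut1 and SigCut2 is given only in Fig. 4 and is therefore not reproduced here.

```python
# Univariate Cox coefficients for the 15 SigMuc1NW genes, copied from Table 4.
COEF = {
    "SLCO2A1": 0.7967, "CGNL1": 0.7966, "SUPV3L1": 0.7992, "TATDN2": 0.8731,
    "MGAT4B": 1.0331, "VAV2": 0.9402, "SLC25A33": 0.8503, "MCCC1": 1.0179,
    "ASNS": 1.0544, "CASKIN1": 0.7006, "DNMT3B": 0.9082, "AURKA": 1.0223,
    "OIP5": 1.242, "CTHRC1": 0.7608, "GOLGA7B": 0.7623,
}
SIGCUT3_CUTPOINT = 5.4067  # cutpoint of the SigCut3 score reported in the text


def signature_score(positive_genes):
    """Sum the Cox coefficients of the genes for which a tumor is cutpoint-positive."""
    return sum(COEF[gene] for gene in positive_genes)


def is_high_risk(score, cutpoint=SIGCUT3_CUTPOINT):
    # Assumption: a score at or above the cutpoint is coded as high risk.
    return score >= cutpoint


# Hypothetical tumor that is cutpoint-positive for six of the 15 genes.
positives = ["MGAT4B", "AURKA", "OIP5", "VAV2", "ASNS", "MCCC1"]
score = signature_score(positives)
print(round(score, 4), is_high_risk(score))
```

A tumor that is negative for all genes scores 0, and the maximum possible score is the sum of all 15 coefficients.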
Figure 5

Analyses of SigCut1, SigCut2, and SigCut3 for their association with reductions in DFS. The TCGA Provisional dataset was used here. (A) All tumors were scored for SigCut1, SigCut2, and SigCut3 using the respective Cox coefficients. Time-dependent AUCs for each signature at the current follow-up period and the corresponding recurrence status are shown. (B–D) The associations of SigCut1, SigCut2, and SigCut3 with BCR. (E) The Q1, Median, Cutpoint, and Q3 scores of SigCut3 were analyzed for the stratification of PC with high risk of recurrence. The number of individuals at risk at the indicated follow-up period is included. The multiple Kaplan–Meier curves and log-rank test were performed using the R survival package.

The Q1 (1.647), Median (3.589), and Q3 (6.386) scores all effectively stratify PC with high risk of BCR, with sensitivity/specificity/MMDF (median months disease free)/P-value of 93.4%/31.8%/81.2/6.76e-6 for Q1, 80.2%/56.9%/66.9/6.73e-11 for Median, and 56%/82%/40/0 for Q3 (Fig. S5). When Q1, Median, Q3, and cutpoint of SigCut3 are used together, they offer an impressive system to stratify recurrent and nonrecurrent PCs, with only a few recurrent cases among tumors with score < Q1 (Fig. 5E). Furthermore, in comparison with SD-defined SigMuc1NW (Fig. 2A), SigCut3 is clearly more effective (Fig. 5D). After adjusting for age at diagnosis, RP Gleason score, surgical margin, and TNM tumor stage, SigCut1 (P = 0.00308), SigCut2 (P = 1.55e-5), and SigCut3 (P = 2.97e-6) each independently predict BCR. All three signatures are associated with adverse features of PC: high tumor stages (T3 and T4) at odds ratio/95% CI of 1.78/1.51–2.12 (P = 2.39e-11) for SigCut1, 1.55/1.37–1.77 (P = 1.33e-11) for SigCut2, and 1.33/1.23–1.44 (P = 8.47e-13) for SigCut3, as well as high Gleason scores (8–10) at the respective odds ratio/95% CI of 2.19/1.86–2.6 (P < 2e-16), 1.84/1.62–2.1 (P < 2e-16), and 1.48/1.37–1.61 (P < 2e-16). 
Taken together, these observations validate the efficacy of SigMuc1NW.
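The combined use of the Q1, Median, cutpoint, and Q3 scores can be sketched as a tiered classification. The thresholds below are the values reported in the text (assuming the quartile values refer to SigCut3 scores, as in Fig. 5E); the tier labels, the function name `risk_tier`, and the rule that a score equal to a threshold falls into the higher tier are illustrative assumptions.

```python
from bisect import bisect_right

# Thresholds reported in the text, in ascending order:
# Q1, Median, cutpoint, Q3 (assumed here to be SigCut3 scores).
THRESHOLDS = [1.647, 3.589, 5.4067, 6.386]

# Hypothetical tier labels, from lowest to highest score band.
TIERS = ["< Q1", "Q1 to Median", "Median to cutpoint", "cutpoint to Q3", ">= Q3"]


def risk_tier(score: float) -> str:
    """Map a signature score to its band between consecutive thresholds."""
    # bisect_right counts how many thresholds are <= score, so a score
    # equal to a threshold is placed in the higher band (an assumption).
    return TIERS[bisect_right(THRESHOLDS, score)]


for s in (0.0, 1.647, 5.0, 7.0):
    print(s, risk_tier(s))
```

Under this scheme, tumors in the lowest band correspond to the group with score < Q1, in which only a few recurrent cases were observed.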

Validation of SigMuc1NW

We examined the expression of the individual component genes in PCs. The MSKCC (Cancer Cell 2010) (Taylor et al., 2010) dataset within cBioPortal has 216 PCs/patients with mRNA expression profiled using microarray; the expression data were organized for comparison between normal prostate tissues and PC (cBioPortal). Importantly, all primary PCs have been treated and follow-up information is available; this cohort thus supports survival analysis. To further validate SigMuc1NW, which was constructed using RNA sequencing data from the TCGA Provisional dataset, mRNA expression data for all 15 component genes along with all clinical information were extracted from the MSKCC dataset. Tissues can be grouped into normal prostate (n = 29), primary PCs (n = 149), recurred PCs (n = 36), and metastatic PCs (n = 9) (cBioPortal). Using this setting, of the two downregulated genes of SigMuc1NW (SLCO2A1 and CGNL1), CGNL1 showed significant reductions in primary PCs relative to normal prostate tissues, in metastatic PCs relative to localized PCs, and in recurrent PCs relative to nonrecurrent PCs (Fig. 6A–C). Most upregulated genes in SigMuc1NW were expressed at significantly higher levels in the same comparisons (Fig. 6A–C), supporting the authenticity of SigMuc1NW.
Figure 6

Alterations in the expression of the component genes in an independent PC population. Gene expression data determined by microarray were extracted from the MSKCC dataset (Robinson et al., 2015) within cBioPortal. The mRNA levels in normal and PC tissues (A), in primary PC and metastatic PC (B), and in nonrecurrent and recurrent PC (C) were determined. The number of cases used in the comparisons is indicated. Means ± SD are graphed. Statistical analyses were performed using Student's t‐test (2‐tailed). *P < 0.05, **P < 0.01, and ***P < 0.001.

Following the system described above, cutpoints for all 15 genes were estimated, binary codes were assigned, and the association of individual genes with BCR was determined using Cox PH regression (Table 6). Except for MCCC1, which was inversely associated with DFS, and four genes without a significant correlation with DFS, the other 10 genes significantly or robustly (CGNL1 and CTHRC1) predict BCR risk (Table 6). We then formulated a subsignature with these 10 genes (SigMuc1NW1). As described above, all tumors were scored for SigMuc1NW1 using their coefficients (Table 6). Analysis with tROC shows tAUC values ranging from 76.6% to 82.5% (Fig. 7A). SigMuc1NW1 thus effectively discriminates recurred PCs from nonrecurrent tumors across all follow-up periods from 18.4 months to 65 months (Fig. 7A); this efficiency matches that of SigMuc1NW in the discrimination of recurrent PCs in the TCGA cohort (Fig. 5A). Additionally, using the binary codes derived from the Q1 (0), Median (1.805), Q3 (3.727), and cutpoint (6.2136) scores of SigMuc1NW1, all these classifications significantly stratify recurrent PCs (Fig. 7B–E). The respective sensitivity/specificity/PPV (positive predictive value) are 36.1%/98.1%/86.7% for cutpoint, 97.2%/35.6%/34.3% for Q1, 75%/59.6%/39.1% for Median, and 52.8%/84.6%/54.3% for Q3 (Fig. 7B–E). The PPV for cutpoint is robust (86.7%). 
Collectively, through the combination of Q1, Median, Q3, and cutpoint, PC recurrence could be effectively predicted for patients in the MSKCC cohort. A similar situation was also demonstrated in the TCGA cohort using SigMuc1NW. In a reverse validation effort, we demonstrated that SigMuc1NW1 is also robustly associated with BCR in the TCGA cohort and significantly correlates with a reduction in OS in the TCGA dataset (Fig. 8A,B). Taken together, we provide a thorough validation of SigMuc1NW and SigMuc1NW1.
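The sensitivity, specificity, and PPV values quoted for SigMuc1NW1 follow from a standard 2 × 2 classification table. The Python sketch below shows the calculation. The per-threshold counts are not reported in the text; they are back-calculated here under the assumption of 36 recurrent and 104 nonrecurrent tumors (n = 140), chosen because they reproduce the reported percentages, and should be read as illustrative only.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    """Return (sensitivity, specificity, PPV) as percentages rounded to 0.1."""
    sensitivity = tp / (tp + fn)  # recurrent tumors correctly flagged
    specificity = tn / (tn + fp)  # nonrecurrent tumors correctly not flagged
    ppv = tp / (tp + fp)          # flagged tumors that actually recurred
    return tuple(round(100 * x, 1) for x in (sensitivity, specificity, ppv))


# Back-calculated (hypothetical) counts as (TP, FP, FN, TN), assuming
# 36 recurrent and 104 nonrecurrent tumors; not reported in the paper.
COUNTS = {
    "cutpoint": (13, 2, 23, 102),
    "Q1": (35, 67, 1, 37),
    "Median": (27, 42, 9, 62),
    "Q3": (19, 16, 17, 88),
}

for threshold, (tp, fp, fn, tn) in COUNTS.items():
    print(threshold, classification_metrics(tp, fp, fn, tn))
```

The calculation makes the trade-off explicit: the stringent cutpoint flags few tumors (high specificity and PPV, low sensitivity), whereas Q1 flags most tumors (high sensitivity, low specificity and PPV).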
Table 6

Cutpoint and Cox coefficients of SigMuc1NW component genes in the MSKCC cohort^a

| Genes | Cutpoint^b | P-value | Coef^c | HR | 95% CI | P-value |
|---|---|---|---|---|---|---|
| SLCO2A1 | 8.155098 | 0.7073 | 0.6364 | 1.89 | 0.7835–4.558 | 0.157 |
| CGNL1 | 10.02132 | 0.004758** | 1.4679 | 4.34 | 2.084–9.038 | 8.8e-5*** |
| SUPV3L1 | 7.655546 | 0.7029 | −0.6931 | 0.5 | 0.2277–1.098 | 0.0841 |
| TATDN2 | 7.755133 | 0.969 | −0.5149 | 0.5976 | 0.2476–1.442 | 0.252 |
| MGAT4B | 8.536576 | 0.01469* | 1.3245 | 3.76 | 1.833–7.712 | 0.000302*** |
| VAV2 | 7.801308 | 0.2076 | 0.8258 | 2.284 | 1.184–4.405 | 0.0138* |
| SLC25A33 | 8.653056 | 1 | 0.4752 | 1.608 | 0.6248–4.14 | 0.325 |
| MCCC1 | 7.789343 | 0.2982 | −1.0768 | 0.3407 | 0.1467–0.7911 | 0.0122* |
| ASNS | 7.946625 | 0.01918* | 1.1815 | 3.259 | 1.567–6.78 | 0.00157** |
| CASKIN1 | 8.142854 | 0.04935* | 1.0985 | 3 | 1.529–5.886 | 0.0014** |
| DNMT3B | 7.199673 | 0.06077 | 1.0373 | 2.822 | 1.385–5.749 | 0.00428** |
| AURKA | 7.215284 | 0.03781* | 1.0552 | 2.873 | 1.435–5.75 | 0.00288** |
| OIP5 | 6.026397 | 0.05557 | 0.9789 | 2.662 | 1.374–5.156 | 0.00372** |
| CTHRC1 | 7.827664 | 0.0001814*** | 1.631 | 5.109 | 2.4–10.88 | 2.33e-5*** |
| GOLGA7B | 7.534541 | 0.1695 | 1.1095 | 3.033 | 1.371–6.71 | 0.00617** |

^a Microarray data of SigMuc1NW's component genes were retrieved from the MSKCC dataset (cBioPortal).

^b Cutpoint was estimated using Maximally Selected Rank Statistics in R.

^c Coefficient to BCR was determined using univariate Cox proportional hazards analysis.

*P < 0.05; **P < 0.01; ***P < 0.001.
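In a Cox PH model the hazard ratio is the exponential of the coefficient, HR = exp(coef), so the Coef and HR columns of Table 6 carry the same information: a negative coefficient gives HR < 1 (as for MCCC1) and a positive coefficient gives HR > 1. The short Python check below uses four rows copied from Table 6; the variable names are illustrative.

```python
import math

# (coefficient, reported HR) pairs copied from Table 6.
TABLE6 = {
    "CGNL1": (1.4679, 4.34),
    "SUPV3L1": (-0.6931, 0.5),
    "MCCC1": (-1.0768, 0.3407),
    "CTHRC1": (1.631, 5.109),
}

for gene, (coef, reported_hr) in TABLE6.items():
    computed_hr = math.exp(coef)  # hazard ratio implied by the coefficient
    print(f"{gene}: exp({coef}) = {computed_hr:.4f} (reported HR {reported_hr})")
```

The same relation links the coefficients in Table 4 to the univariate HRs in Table 5.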

Figure 7

SigMuc1NW1 robustly predicts PC recurrence in an independent PC dataset. The follow-up data along with mRNA expression data for all 15 genes were retrieved from the MSKCC dataset (Robinson et al., 2015). SigMuc1NW1 was formed using 10 genes (see Results for details). Time-dependent AUCs were derived (A). The stratification of PC with increased risk of recurrence was analyzed using the cutpoint (B), Q1 (C), Median (D), and Q3 (E) scores of SigMuc1NW1. Numbers of individuals at risk at the current follow-up period are also included.

Figure 8

SigMuc1NW1 significantly correlates with reductions in DFS and OS in PC patients. The analyses were performed using the TCGA Provisional dataset. SigMuc1NW1 gene expression was based on the SD levels. Kaplan–Meier curve and log‐rank test were performed using tools provided by cBioPortal.

Finally, we attempted to compare the performance of SigMuc1NW to Prolaris (cell cycle progression/CPC) (Cuzick et al., 2011) in predicting BCR. The basis for this comparison was the similarities between SigMuc1NW and CPC: (a) like CPC, SigMuc1NW affects cell cycle progression (Table S2A and S2C; also see Discussion), and (b) similar to CPC, SigMuc1NW predicts BCR. As the CPC component genes promote cell cycle progression, we analyzed their effects on BCR using the 2 SD expression level. In the TCGA Provisional cohort, CPC is not correlated with a reduction in OS but is significantly associated with BCR (Fig. S6). However, its predictive accuracy is lower than that of SigMuc1NW (comparing Fig. 2 and Fig. S6). Because Prolaris is a real-time PCR-based signature whereas SigMuc1NW was derived from RNA-seq data, this comparison may not fully reflect the effectiveness of Prolaris in predicting BCR. Nonetheless, it suggests that SigMuc1NW (Fig. 2A, MMDF 63.24, P = 1.12e-12) offers comparable efficacy to Prolaris (Fig. S6, MMDF 66.89, P = 1.34e-4) in assessing PC recurrence.

Discussion

Progression to biochemical recurrence is a major turning point in PC development; from there, a large proportion of PCs will metastasize (Shipley et al., 2017), ultimately leading to death. The current treatments for metastatic PC are essentially palliative. It is thus highly desirable to effectively stratify PCs with higher risk of BCR following RP, allowing early intervention prior to metastatic progression. MUC1 drives tumor progression in multiple tumor types (Kufe, 2009; de Paula Peres et al., 2015; Wurz et al., 2014) through activating important oncogenic proteins and pathways including EGFR, β-catenin, NF-κB, and PKM2 (Kufe, 2009; Singh and Hollingsworth, 2006; Wong et al., 2015). In line with these functions, a 9-gene genomic signature was recently constructed from the MUC1 genomic network, which predicts BCR with relatively good effectiveness (Lin et al., 2017). Using a novel system, we report here a robust improvement of this 9-gene genomic signature in predicting BCR by systematically exploring its associated transcriptome. To the best of our knowledge, this is the first thorough analysis of a transcriptome associated not with a single gene but with a multigene signature; this transcriptome consists of 696 genes (Table S1). Because of the complex nature of cancer progression, in this case the progression to BCR, we chose not to focus on a specific aspect or pathway of tumorigenesis and instead performed a systematic examination of these 696 genes for their predictive power in BCR. This novel and comprehensive analytic approach has resulted in a new 15-gene panel. In the panel, 73.3% (11/15) of genes have not been reported to associate with PC. These 11 new PC genes include MGAT4B and OIP5. The former may play a role in the alteration of protein glycosylation, which is well known to be an important aspect of tumorigenesis (Munkley et al., 2016). Abnormalities in MUC1 glycosylation have been well demonstrated in tumorigenesis (Kufe, 2009; de Paula Peres et al., 2015). 
Thus, the inclusion of MGAT4B in the 15-gene panel is in accordance with the panel being derived from a 9-gene MUC1 genomic signature (Lin et al., 2017). The presence of OIP5 in SigMuc1NW suggests the protein as a tumor-associated antigen (TAA) in PC. TAAs have been extensively investigated in cancer diagnosis and therapy (Scheid et al., 2016). In this regard, the potential of OIP5 in PC diagnosis and therapy should be pursued. As the construction of SigMuc1NW was not aimed at specific pathways, the gene panel covers multiple pathways. In addition to the potential effects on protein glycosylation through MGAT4B, the panel contains proteins with RNA helicase activity (SUPV3L1, Table 1) and DNA methyltransferase activity (DNMT3B, Table 1). These activities are important in gene expression and epigenetic alterations, which are well known to facilitate cancer progression. SigMuc1NW also has a cell proliferation component. AURKA is emerging as an important regulator of mitosis and a critical player in tumorigenesis. As such, AURKA is a hotly pursued target in cancer therapy (Dominguez-Brauer et al., 2015; Plotnikova et al., 2015). Additionally, OIP5 is also known as Mis18β, which has recently been shown to play an important role in chromatid separation during mitosis (Nardi et al., 2016; Stellfox et al., 2016), adding another appealing feature for its inclusion in SigMuc1NW. Intriguingly, among the 15 genes, only four are known to function in PC and all four genes facilitate CRPC development, which is in accordance with the detection of SigMuc1NW elevation in mCRPCs (Table 6). As alterations in gene expression and epigenetic patterns are involved in CRPC, the 15-gene panel may also predict CRPC development, which will be examined in the future. Inclusion of genes functioning in multiple pathways is likely a major contributor to the robust nature of the signature. 
SigMuc1NW and a set of its subsignatures all effectively stratify PCs with increased risk of BCR, with P-values as low as 0, and are able to discriminate recurrent PC with tAUC > 75%. Through combination of the subsignatures, sensitivity, specificity, and PPV can reach high levels of 97.2%, 98.1%, and 86.7%, respectively (Fig. 7B–E), although each maximum is reached at a different threshold (Q1 for sensitivity; the cutpoint for specificity and PPV). Collectively, this evidence strongly indicates that the signatures constructed in this study will have important clinical applications in predicting PC recurrence. This possible clinical application is supported by the observation that the 15-gene panel is likely not overfitted. (a) The overfitting issue is largely addressed by modeling the 696 DEGs with covariate selection coupled with regularization (Elastic-net penalty in R) with 10-fold cross-validation. (b) The component genes were directly examined using a different system: maximally selected rank statistics-derived cutpoints; importantly, this system clearly improved the effectiveness of the SD-based signature. (c) The signatures were robust in two independent PC cohorts (TCGA Provisional and MSKCC). (d) RNA was profiled through two different platforms: RNA sequencing (TCGA) and microarray analysis (MSKCC). (e) The 15-gene panel is robustly associated with adverse features of PC: Gleason scores and tumor stages. These associations likely resulted in the reduced HR of Gleason scores and tumor stages when they were analyzed with SigMuc1NW in multivariate Cox analysis (Table 3). Between two commercially available multigene panels, Oncotype DX (12 genes plus 5 reference genes) and Prolaris (31 genes), there are no overlapping genes (Cuzick et al., 2011; Knezevic et al., 2013). This suggests the coexistence of different gene sets with predictive value for PC recurrence, which might be attributable to the complex mechanisms involved in disease progression. 
In this regard, our newly established SigMuc1NW, which contains a different set of genes from Oncotype DX and Prolaris, will enrich our ability to assess the risk of PC recurrence. While our research comprehensively supports that the signatures constructed here will have attractive clinical applications, realization of this potential requires further investigation.

Conclusions

We have formulated a novel strategy to derive differentially expressed genes (DEGs) relative to a reported PC signature from the most comprehensive large PC genomic dataset (the TCGA dataset) and to systematically analyze these DEGs (n = 696) for pathways affected and impacts on PC recurrence. In this effort, a novel multigene set (n = 15 genes, SigMuc1NW) has been constructed. SigMuc1NW robustly predicts PC recurrence and is an independent risk factor of PC recurrence after adjusting for age at diagnosis, Gleason score, surgical margin, and tumor stage. These 15 component genes include 5 candidate oncogenic genes and 6 novel PC genes; among these 11 novel genes affecting PC recurrence, 6 genes (SLCO2A1, SUPV3L1, TATDN2, MGAT4B, SLC25A33, and OIP5) individually predict PC recurrence after adjusting for the above clinical factors. Collectively, using the system reported here, we have identified novel genes affecting oncogenesis in general and PC pathogenesis in particular, and have constructed a novel and robust multigene set predicting PC recurrence. This system will have applications in the exploration of publicly available datasets for factors affecting cancer progression.

Author contributions

YJ, WM, and DT performed literature search and initial analyses. YG, XL, LH, and HZ contributed to the analysis. FW, XW, HY, PM, and DT designed the research. PM and DT supervised the project. YJ, WM, YG, PM, and DT prepared the manuscript. All authors edited the manuscript and approved the final manuscript for submission.

Fig. S1. Covariate selection from 696 DEGs using Elastic-net penalty.

Fig. S2. Overlap between the 9-gene genomic signature which we have previously reported (Lin et al., 2017) and the current signature (SigMuc1NW). Graph was produced using the TCGA Provisional dataset (n = 492, cBioPortal).

Fig. S3. The combined signature is significantly associated with reductions in DFS and OS in PC patients.

Fig. S4. Cutpoint estimation.

Fig. S5. SigMuc1NW scores effectively stratify PCs with elevated risk of recurrence following RP.

Fig. S6. CPC geneset is associated with a reduction in DFS but not OS in PC patients.

Table S1. Differentially expressed genes (DEGs) of a 9-gene signature identified in the TCGA Provisional dataset.

Table S2. (A) Upregulation of gene sets among the 696 DEGs associated with the 9-gene genomic signature within the kegg.sets.hs dataset. (B) Downregulation of gene sets among the 696 DEGs within the kegg.sets.hs dataset. (C) Upregulation of gene sets among the 696 DEGs within the GO.sets.hs dataset. (D) Downregulation of gene sets among the 696 DEGs within the GO.sets.hs dataset. (E) Pathways affected by the 696 DEGs associated with the 9-gene genomic signature.

Table S3. Scores of the component genes and some clinical characteristics of patients with prostate cancer in the TCGA Provisional dataset within cBioPortal.

Table S4. Demographics of the TCGA patient population. The clinical characteristics were extracted from the TCGA Provisional dataset within cBioPortal along with the indicated clinical data.

Table S5. Cutpoints of individual gene expression determined by RNA sequencing.
  67 in total

1.  ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization.

Authors:  Guangchuang Yu; Qing-Yu He
Journal:  Mol Biosyst       Date:  2016-02

2.  Prognostic value of an RNA expression signature derived from cell cycle proliferation genes in patients with prostate cancer: a retrospective study.

Authors:  Jack Cuzick; Gregory P Swanson; Gabrielle Fisher; Arthur R Brothman; Daniel M Berney; Julia E Reid; David Mesher; V O Speights; Elzbieta Stankiewicz; Christopher S Foster; Henrik Møller; Peter Scardino; Jorja D Warren; Jimmy Park; Adib Younus; Darl D Flake; Susanne Wagner; Alexander Gutin; Jerry S Lanchbury; Steven Stone
Journal:  Lancet Oncol       Date:  2011-03       Impact factor: 41.316

3.  Tumor angiogenesis is associated with MUC1 overexpression and loss of prostate-specific antigen expression in prostate cancer.

Authors:  I Papadopoulos; E Sivridis; A Giatromanolaki; M I Koukourakis
Journal:  Clin Cancer Res       Date:  2001-06       Impact factor: 12.531

4.  Validation of a genomic classifier that predicts metastasis following radical prostatectomy in an at risk patient population.

Authors:  R Jeffrey Karnes; Eric J Bergstralh; Elai Davicioni; Mercedeh Ghadessi; Christine Buerki; Anirban P Mitra; Anamaria Crisan; Nicholas Erho; Ismael A Vergara; Lucia L Lam; Rachel Carlson; Darby J S Thompson; Zaid Haddad; Benedikt Zimmermann; Thomas Sierocinski; Timothy J Triche; Thomas Kollmeyer; Karla V Ballman; Peter C Black; George G Klee; Robert B Jenkins
Journal:  J Urol       Date:  2013-06-11       Impact factor: 7.450

5.  Validation of a cell-cycle progression gene panel to improve risk stratification in a contemporary prostatectomy cohort.

Authors:  Matthew R Cooperberg; Jeffry P Simko; Janet E Cowan; Julia E Reid; Azita Djalilvand; Satish Bhatnagar; Alexander Gutin; Jerry S Lanchbury; Gregory P Swanson; Steven Stone; Peter R Carroll
Journal:  J Clin Oncol       Date:  2013-03-04       Impact factor: 44.544

6.  Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012.

Authors:  Jacques Ferlay; Isabelle Soerjomataram; Rajesh Dikshit; Sultan Eser; Colin Mathers; Marise Rebelo; Donald Maxwell Parkin; David Forman; Freddie Bray
Journal:  Int J Cancer       Date:  2014-10-09       Impact factor: 7.396

7.  Discovery and validation of a prostate cancer genomic classifier that predicts early metastasis following radical prostatectomy.

Authors:  Nicholas Erho; Anamaria Crisan; Ismael A Vergara; Anirban P Mitra; Mercedeh Ghadessi; Christine Buerki; Eric J Bergstralh; Thomas Kollmeyer; Stephanie Fink; Zaid Haddad; Benedikt Zimmermann; Thomas Sierocinski; Karla V Ballman; Timothy J Triche; Peter C Black; R Jeffrey Karnes; George Klee; Elai Davicioni; Robert B Jenkins
Journal:  PLoS One       Date:  2013-06-24       Impact factor: 3.240

8.  Overexpression of CTHRC1 in hepatocellular carcinoma promotes tumor invasion and predicts poor prognosis.

Authors:  Yu-Ling Chen; Ting-Huang Wang; Hey-Chi Hsu; Ray-Hwang Yuan; Yung-Ming Jeng
Journal:  PLoS One       Date:  2013-07-29       Impact factor: 3.240

9.  Overexpression of MUC1 and Genomic Alterations in Its Network Associate with Prostate Cancer Progression.

Authors:  Xiaozeng Lin; Yan Gu; Anil Kapoor; Fengxiang Wei; Tariq Aziz; Diane Ojo; Yanzhi Jiang; Michael Bonert; Bobby Shayegan; Huixiang Yang; Khalid Al-Nedawi; Pierre Major; Damu Tang
Journal:  Neoplasia       Date:  2017-09-18       Impact factor: 5.715

10.  GAGE: generally applicable gene set enrichment for pathway analysis.

Authors:  Weijun Luo; Michael S Friedman; Kerby Shedden; Kurt D Hankenson; Peter J Woolf
Journal:  BMC Bioinformatics       Date:  2009-05-27       Impact factor: 3.169


1.  Establishment of a Genomic-Clinicopathologic Nomogram for Predicting Early Recurrence of Hepatocellular Carcinoma After R0 Resection.

Authors:  Bin Yu; Han Liang; Qifa Ye; Yanfeng Wang
Journal:  J Gastrointest Surg       Date:  2020-03-03       Impact factor: 3.452

2.  A novel 8-gene panel for prediction of early biochemical recurrence in patients with prostate cancer after radical prostatectomy.

Authors:  Jinan Guo; Chenhui Zhao; Xinzhou Zhang; Zhong Wan; Tingting Chen; Jiashun Miao; Jinping Cai; Wenchuan Xie; Hao Chen; Mengli Huang; Xiaochen Zhao; Wei Wei; Qi Shen
Journal:  Am J Cancer Res       Date:  2022-07-15       Impact factor: 5.942

3.  Deep Learning-Based Multi-Omics Integration Robustly Predicts Relapse in Prostate Cancer.

Authors:  Ziwei Wei; Dunsheng Han; Cong Zhang; Shiyu Wang; Jinke Liu; Fan Chao; Zhenyu Song; Gang Chen
Journal:  Front Oncol       Date:  2022-06-23       Impact factor: 5.738

4.  The prognostic value and potential mechanism of Matrix Metalloproteinases among Prostate Cancer.

Authors:  Xinyu Geng; Chunyang Chen; Yuhua Huang; Jianquan Hou
Journal:  Int J Med Sci       Date:  2020-06-21       Impact factor: 3.738

5.  Jackknife Model Averaging Prediction Methods for Complex Phenotypes with Gene Expression Levels by Integrating External Pathway Information.

Authors:  Xinghao Yu; Lishun Xiao; Ping Zeng; Shuiping Huang
Journal:  Comput Math Methods Med       Date:  2019-04-08       Impact factor: 2.238

6.  FAM84B promotes prostate tumorigenesis through a network alteration.

Authors:  Yanzhi Jiang; Xiaozeng Lin; Anil Kapoor; Lizhi He; Fengxiang Wei; Yan Gu; Wenjuan Mei; Kuncheng Zhao; Huixiang Yang; Damu Tang
Journal:  Ther Adv Med Oncol       Date:  2019-05-13       Impact factor: 8.168

7.  Offsetting Expression Profiles of Prognostic Markers in Prostate Tumor vs. Its Microenvironment.

Authors:  Zhenyu Jia; Jianguo Zhu; Yangjia Zhuo; Ruidong Li; Han Qu; Shibo Wang; Meiyue Wang; Jianming Lu; John M Chater; Renyuan Ma; Ze-Zhen Liu; Zhiduan Cai; Yongding Wu; Funeng Jiang; Huichan He; Wei-De Zhong; Chin-Lee Wu
Journal:  Front Oncol       Date:  2019-06-26       Impact factor: 6.244

8.  Assessment of biochemical recurrence of prostate cancer (Review).

Authors:  Xiaozeng Lin; Anil Kapoor; Yan Gu; Mathilda Jing Chow; Hui Xu; Pierre Major; Damu Tang
Journal:  Int J Oncol       Date:  2019-10-04       Impact factor: 5.650

9.  Effective Prediction of Prostate Cancer Recurrence through the IQGAP1 Network.

Authors:  Yan Gu; Xiaozeng Lin; Anil Kapoor; Taosha Li; Pierre Major; Damu Tang
Journal:  Cancers (Basel)       Date:  2021-01-23       Impact factor: 6.639

10.  Differential Expression of a Panel of Ten CNTN1-Associated Genes during Prostate Cancer Progression and the Predictive Properties of the Panel Towards Prostate Cancer Relapse.

Authors:  Yan Gu; Mathilda Jing Chow; Anil Kapoor; Xiaozeng Lin; Wenjuan Mei; Damu Tang
Journal:  Genes (Basel)       Date:  2021-02-10       Impact factor: 4.096

