
Identifying Mutually Exclusive Gene Sets with Prognostic Value and Novel Potential Driver Genes in Patients with Glioblastoma.

Qian Gao1, Yan Cui1,2, Yanan Shen1, Yanyan Li1, Xue Gao1, Yanfeng Xi3, Tong Wang1.   

Abstract

The pathogenesis and prognosis of glioblastoma (GBM) remain poorly understood. Mutual exclusivity analysis can distinguish driver genes and pathways from passenger ones. The purpose of this study was to identify mutually exclusive gene sets (MEGSs) that have prognostic value and to detect novel driver genes in GBM. The genomic alteration profile and clinical information were derived from The Cancer Genome Atlas, and the MEGSA method was used to identify the MEGS. Next, we performed survival analysis and constructed a risk prediction model for prognostic stratification. Leave-one-out cross-validation and permutation test were used to evaluate its performance. Finally, we identified 21 statistically significant MEGSs. We found that the MEGS in the RB pathway was significantly associated with poor prognosis, after adjusting for age and gender (HR = 1.837, 95% CI: 1.192-2.831). Based on the risk prediction model, 208 (80.9%) and 49 (19.1%) patients were assigned to high- and low-risk groups, respectively (log-rank: p < 0.001, adjusted p=0.001). Additionally, we found that SPTA1, a novel gene involved in the MEGS, was mutually exclusive with members of cell cycle, P53, and RB pathways. In conclusion, the MEGS in the RB pathway had considerable clinical value for GBM prognostic stratification. Mutated SPTA1 may be involved in GBM development.
Copyright © 2019 Qian Gao et al.

Year:  2019        PMID: 31815141      PMCID: PMC6878817          DOI: 10.1155/2019/4860367

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Glioblastoma (GBM) is the most common and biologically aggressive primary brain tumor [1, 2]. Each year, it affects over 10,000 new patients in the United States [3]. Despite improvements in diagnostic and therapeutic approaches, patients with GBM have a poor prognosis, with a median overall survival (OS) time of 12–17 months [2, 4–6]. To improve the prognosis of GBM, it is important to understand its carcinogenic mechanism. Tumor development is primarily driven by the accumulation of lifetime somatic alterations [7, 8]. Therefore, identifying and understanding the genetic and pathway abnormalities that drive the initiation and progression of GBM are critical for the development of effective therapies [2]. The development of next-generation sequencing has generated a large amount of genomic data. The major tasks in analyzing these data are identifying driver alterations that contribute to carcinogenesis and investigating their functional interactions. These tasks can be approached via mutual exclusivity (ME) analysis [9, 10]. Mutual exclusivity of genomic alterations, meaning that genes belonging to the same functional pathway tend not to be mutated simultaneously in the same patient, has been observed in various cancer types [11, 12]. Over 25% of well-known cancer genes show an ME pattern [7]. Detecting an ME pattern is important for understanding tumorigenic mechanisms and identifying drug targets. Several methods based on mutual exclusivity have been proposed to uncover novel infrequent cancer drivers and investigate their functional relationships [9, 10, 13]. Mutually exclusive gene set analysis (MEGSA), proposed by Hua et al., is a new model for discovering mutually exclusive gene sets (MEGSs) de novo or from existing biological pathways. 
Simulation studies have indicated that MEGSA outperformed other methods, such as Dendrix, MDPFinder, Multi-Dendrix, and Mutex, in statistical power and in its capability for identifying specific MEGSs, especially highly imbalanced MEGSs [9]. However, one limitation of that analysis is that only nonsynonymous point mutations were taken into consideration when Hua et al. identified MEGSs in patients with GBM. Consequently, only one mutually exclusive gene pair (PTEN and IDH) was found. Compared with other types of somatic genetic alterations, copy number variation (CNV) accounts for a large fraction of genomic alterations in cancer and plays a critical role in carcinogenesis [14]. Therefore, it is necessary to take CNVs into account when performing mutual exclusivity analysis. Other studies have identified ME patterns related to GBM; however, no study has analyzed their prognostic value [9, 10, 13, 15–20]. Therefore, one purpose of this study is to identify MEGSs and detect novel infrequent driver genes in GBM by integrating nonsynonymous single-nucleotide variants (SNVs) and CNVs using MEGSA. A further objective is to assess the prognostic value of specific MEGSs.

2. Materials and Methods

2.1. Data

The preprocessed GBM genomic variant dataset was derived from Multi-Dendrix and contained 398 alterations (nonsynonymous SNVs and CNVs) in 261 patients [18, 21]. Data preparation for GBM is described in reference [18]. The clinical data were downloaded from The Cancer Genome Atlas (TCGA). Samples with incomplete survival information were excluded, and 257 patients with GBM were included in the survival and leave-one-out cross-validation (LOOCV) analyses.

2.2. Identifying MEGSs

In this study, MEGSA was employed to identify MEGSs and novel driver genes. In brief, MEGSA consists of three parts. First, a likelihood ratio test (LRT) statistic for testing mutual exclusivity is constructed. Second, a global null hypothesis (GNH) analysis is performed to test whether the set of M genes contains an MEGS of any size. Third, the optimal MEGS is identified using model selection [9]. Suppose that A0 is the mutation matrix of an MEGS, with N rows that correspond to patients and m columns that represent genes. The entry a_ik denotes the mutation status, which is 1 if gene k is mutated in subject i and 0 otherwise. The set of model parameters, Θ = (γ, P, Π), consists of the coverage, γ; the gene-specific background mutation rates, Π = (π_1, …, π_m); and the relative mutation frequencies of the genes in A0, P = (p_1, …, p_m). Under the assumption that p_k ∝ π_k, the total log likelihood across the N subjects follows the definition given in [9]. The LRT tests H0: γ = 0 versus H1: γ > 0, and its statistic has an asymptotic null distribution of 0.5χ²_0 + 0.5χ²_1. The GNH test is completed in three steps: (1) The multiple-path search algorithm is performed to determine the minimum p values for gene sets of each size, denoted as p_k (k = 2, …, K). (2) The permutation test is used to adjust these p values and obtain Q_k (k = 2, …, K). Intuitively, Q_k measures the significance of a search restricted to MEGSs of size k. (3) Finally, the overall statistic is defined as θ = min(Q_2, …, Q_K). Consider two putative MEGSs that pass the significance threshold: MEGS1 has two genes (G_1, G_2) with a nominal p value of p1, and MEGS2 has three genes (G_1, G_2, G_3) with a nominal p value of p2 based on the LRT. The null hypothesis of model selection is that none of the remaining M − 2 genes (G_3, …, G_M) is mutually exclusive of (G_1, G_2). We chose MEGS2 if p2 < p0, where p0 was chosen according to permutations with a false-positive rate <5%. This procedure was repeated until the size of the MEGS reached its preset maximum value, K, or the hypothesis test no longer rejected H0.
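
As a small illustration of the mixture null distribution mentioned above, the following Python sketch converts an LRT statistic into a p value under 0.5χ²_0 + 0.5χ²_1. It is a minimal sketch and not the MEGSA implementation: the function name `lrt_mixture_p_value` is hypothetical, and the code relies only on the standard identity P(χ²_1 > x) = erfc(√(x/2)) and on χ²_0 being a point mass at zero.

```python
import math


def lrt_mixture_p_value(lrt_stat):
    """P value of an LRT statistic under the 0.5*chi2_0 + 0.5*chi2_1 null.

    chi2_0 is a point mass at zero, so it contributes to the tail
    probability only when the observed statistic is zero (or negative
    because of numerical noise), in which case the p value is 1.
    """
    if lrt_stat <= 0.0:
        return 1.0
    # Tail of chi2 with one degree of freedom: P(chi2_1 > x) = erfc(sqrt(x / 2))
    return 0.5 * math.erfc(math.sqrt(lrt_stat / 2.0))


# The usual 5% critical value of chi2_1 corresponds to 2.5% under the mixture.
p = lrt_mixture_p_value(3.841458820694124)
```

Because half of the null mass sits at zero, a given statistic is twice as significant under the mixture as under a plain χ²_1 reference.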

2.3. Selection and Validation of MEGSs Related to Prognosis

We transformed the gene mutation profile to the MEGS mutation profile by assuming that the MEGS was mutated in a patient if any gene in the gene set was mutated [9]. Univariate and multivariable Cox proportional hazards models were constructed to assess the association between the MEGS, clinical characteristics, and 5-year survival. Next, we developed a risk prediction model based on the prognostic index of the multivariable Cox model for prognostic stratification and evaluated its performance using LOOCV [22, 23]. For each leave-one-out step, the risk score was calculated for the patient who was removed for testing. Following this, each patient was classified into the high- or low-risk group based on whether the risk score was above or below the cut-off value [23]. The cutoff among N risk scores was defined using maximally selected log-rank statistics [4, 23, 24]. Survival curves of the high- and low-risk groups were estimated using the Kaplan–Meier method, and significance was assessed using the log-rank test. To overcome any overfitting bias, the permutation test was used to adjust the log-rank p value. In brief, we randomly permuted the correspondence of survival time and censoring indicators to covariates and repeated the entire LOOCV process. The adjusted p value was calculated as the proportion of permutations whose log-rank statistics were greater than or equal to the value of the statistic for the original data [22, 23]. All analyses were considered statistically significant if p < 0.05. All analyses were performed using R 3.3.2 and SAS 9.2.
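
Two steps of this section can be sketched in a few lines of Python: collapsing gene-level alterations into a single MEGS-level status, and computing a permutation-adjusted p value as a proportion. This is a hedged illustration only. The function names, the toy alteration labels, and the toy numbers are assumptions for the example, and the statistics passed to `permutation_adjusted_p` would in practice come from repeating the whole LOOCV procedure on permuted data.

```python
def megs_status(gene_status, gene_set):
    """Return 1 for each patient with at least one altered gene in gene_set.

    gene_status maps an alteration label to a list of 0/1 values,
    one value per patient, all lists in the same patient order.
    """
    n_patients = len(gene_status[gene_set[0]])
    return [
        int(any(gene_status[gene][i] for gene in gene_set))
        for i in range(n_patients)
    ]


def permutation_adjusted_p(observed_stat, permuted_stats):
    """Proportion of permuted statistics >= the observed statistic."""
    hits = sum(1 for stat in permuted_stats if stat >= observed_stat)
    return hits / len(permuted_stats)


# Toy data for four hypothetical patients (not taken from the study).
toy = {
    "CDK4_amp": [1, 0, 0, 0],
    "CDKN2A_del": [0, 1, 0, 0],
    "RB1_mut": [0, 0, 0, 1],
}
status = megs_status(toy, ["CDK4_amp", "CDKN2A_del", "RB1_mut"])
adjusted = permutation_adjusted_p(10.0, [1.0, 12.0, 3.0, 10.0])
```

In the toy data, the third patient carries none of the three alterations and is therefore the only one coded as wild type for the gene set.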

3. Results

3.1. MEGSs in GBM

De novo analyses identified 21 significant but overlapping MEGSs (Supplementary ). These MEGSs involved 12 genetic abnormalities and a metagene, in which RB1, TP53, IDH1, PTEN, SPTA1, and NF1 occurred as single-nucleotide variants; CDK4, MDM2, EGFR, PDGFRA, and the metagene (MET, CAPZA2, ST7, ST7-AS1, ST7-OT4) possessed copy number amplification; CDKN2A and PTEN possessed copy number deletion. Figure 1(a) summarizes the 21 significant MEGSs via a network construction [9]. The vertices of the network are genes involved in MEGSs. The edges between gene pairs indicate that these genes are mutually exclusive in at least one MEGS. Furthermore, the weights of vertices and edges in the network are proportional to their frequency in the detected MEGSs. As shown in Figure 1(a), the most recurrent gene was CDKN2A (14/21), followed by TP53, RB1, CDK4, MDM2, IDH1, EGFR, PTEN, PDGFRA, NF1, and MET. All these genes have been linked to GBM [1, 2, 21, 25, 26]. The top three most significant MEGSs (Figure 1(b)), with p < 10^−17, were core members of the RB (CDK4 amplification, CDKN2A deletion, and RB1 mutations), P53 (CDKN2A deletion, TP53 mutations, and MDM2 amplification), and cell cycle signaling (CDKN2A deletion, RB1 and TP53 mutations, and MDM2 amplification) pathways, respectively [1, 6, 21]. The MEGSs with EGFR amplification and NF1 and TP53 mutations (p = 1.78 × 10^−7) were enriched in the MAPK pathway. Compared with other studies, we identified several novel less frequent genes, including SPTA1 (9.6%) and the metagene (MET, CAPZA2, ST7, ST7-OT4, ST7-AS1) (4.6%) [15, 18, 27, 28].
Figure 1

Results of mutual exclusivity analysis. (a) A network constructed based on the 21 significant MEGSs; MET(A) is the abbreviation of the metagene (MET, CAPZA2, ST7, ST7-OT4, ST7-AS1 (A)). (b) The top three most significant MEGSs (p < 10^−17).
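
The network summary in Figure 1(a) weights each gene and each gene pair by how often it appears across the detected MEGSs. The Python sketch below shows one way such weights could be counted. It is an assumption-laden illustration: the function name `megs_network` is hypothetical, and the input contains only the three top MEGSs named in the text rather than all 21, so the resulting counts do not reproduce the figure.

```python
from collections import Counter
from itertools import combinations


def megs_network(megs_list):
    """Count how often each gene and each gene pair occurs across MEGSs."""
    vertex_weight = Counter()
    edge_weight = Counter()
    for megs in megs_list:
        genes = sorted(set(megs))  # sorted so each pair has one canonical order
        vertex_weight.update(genes)
        edge_weight.update(combinations(genes, 2))
    return vertex_weight, edge_weight


# Only the three most significant MEGSs reported in the text.
top_three = [
    ["CDK4", "CDKN2A", "RB1"],           # RB pathway
    ["CDKN2A", "TP53", "MDM2"],          # P53 pathway
    ["CDKN2A", "RB1", "TP53", "MDM2"],   # cell cycle signaling
]
vertices, edges = megs_network(top_three)
```

With this restricted input, CDKN2A appears in all three sets, which mirrors its position as the most recurrent gene in the full analysis.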

3.2. Selection of MEGSs and Clinical Characteristics with Prognostic Value

After excluding individuals with incomplete survival information, 257 patients were included in the prognosis analysis, comprising 166 (64.6%) males and 91 (35.4%) females. The age at diagnosis ranged from 21 to 89 years, with a median of 61 years. A total of 234 (91.1%) patients were white, and 20 (7.8%) were of other ethnicities (Asian, black, or African American). Of the 257 patients, 209 (81.3%) died within 5 years, and the median survival time was 14.7 months. Univariate Cox regression showed that age ≥50 years [4], male gender, mutant CDK4(A)/CDKN2A(D)/RB1, and mutant CDK4(A)/SPTA1/RB1/CDKN2A(D) were significantly associated with poor prognosis (Table 1 and Supplementary ). Based on these results, we performed multivariable Cox regression analysis with a stepwise procedure (entry = 0.05, retention = 0.10). The results indicated that age (≥50 vs. <50 years), gender (male vs. female), and CDK4(A)/CDKN2A(D)/RB1 (mutant vs. wild type) were independent prognostic factors (Table 2). After adjusting for age and gender, GBM patients with mutant CDK4(A)/CDKN2A(D)/RB1 had a significantly higher risk of 5-year mortality than patients with the wild type (HR = 1.837, 95% CI: 1.192–2.831).
Table 1

Significant factors in univariate survival analysis.

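Factors | Level | N (%) | β^ | SE (β^) | Wald χ2 | p | HR (95% CI)
Age | ≥50 | 215 (83.66) | 0.540 | 0.195 | 7.62 | 0.006 | 1.716 (1.117, 2.516)
Age | <50 | 42 (16.34) | | | | |
Gender | Male | 166 (64.59) | 0.312 | 0.148 | 4.44 | 0.035 | 1.366 (1.022, 1.826)
Gender | Female | 91 (35.41) | | | | |
CDK4(A)/RB1/CDKN2A(D) | Mutant | 225 (87.55) | 0.587 | 0.219 | 7.20 | 0.007 | 1.799 (1.171, 2.761)
CDK4(A)/RB1/CDKN2A(D) | Wild | 32 (12.45) | | | | |
CDK4(A)/SPTA1/RB1/CDKN2A(D) | Mutant | 228 (88.72) | 0.571 | 0.227 | 6.30 | 0.012 | 1.769 (1.133, 2.762)
CDK4(A)/SPTA1/RB1/CDKN2A(D) | Wild | 29 (11.28) | | | | |
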
Table 2

Results of multivariable Cox proportional hazards analysis.

Variables | β^ | SE (β^) | Wald χ2 | p | HR (95% CI)
Age | 0.455 | 0.197 | 5.31 | 0.021 | 1.576 (1.070, 2.319)
Gender | 0.325 | 0.151 | 4.67 | 0.031 | 1.384 (1.031, 1.859)
CDK4(A)/CDKN2A(D)/RB1 | 0.608 | 0.221 | 7.59 | 0.006 | 1.837 (1.192, 2.831)
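
The hazard ratios and confidence intervals in Tables 1 and 2 follow from the Cox coefficients in the usual way: HR = exp(β^), with a Wald 95% interval of exp(β^ ± 1.96 × SE). The short Python sketch below recomputes the CDK4(A)/CDKN2A(D)/RB1 row of Table 2 from its rounded β^ and SE. The function name is hypothetical, and small discrepancies from the published values are expected because the inputs are rounded to three decimals.

```python
import math


def hazard_ratio_summary(beta, se, z=1.96):
    """Return (HR, lower, upper, Wald chi-square) for a Cox coefficient."""
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    wald_chi2 = (beta / se) ** 2
    return hr, lower, upper, wald_chi2


# Rounded coefficient and standard error from the MEGS row of Table 2.
hr, lower, upper, wald_chi2 = hazard_ratio_summary(0.608, 0.221)
```

The recomputed values agree with the reported HR of 1.837 (1.192, 2.831) to about two decimal places.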

3.3. Prognosis Stratification Based on Risk Prediction Model

We developed a risk prediction model based on the prognostic index of the multivariable Cox model to divide the patients into low- and high-risk groups. Taking practical and statistical significance into consideration, we chose 0.82 as the cut-off value using the maximally selected log-rank statistics (Figure 2(a)). There were 49 (19.1%) and 208 (80.9%) patients in the low- and high-risk groups, respectively. Figure 2(b) shows the Kaplan–Meier curves for the low- and high-risk groups (log-rank: p < 0.001). The adjusted log-rank p value calculated via the permutation test (1000 permutations) was 0.001. The univariate Cox model indicated that the mortality risk within 5 years in the high-risk group was 1.953 times that in the low-risk group (Table 3).
Figure 2

Prognosis stratification based on risk prediction. (a) Identifying the cut-off value using maximally selected log-rank statistics. (b) Kaplan–Meier curves for high- and low-risk groups.

Table 3

Cox regression containing only group variable.

Variables | β^ | SE (β^) | Wald χ2 | p | HR (95% CI)
Class | 0.669 | 0.185 | 13.04 | 0.000305 | 1.953 (1.358, 2.809)
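
To make the stratification rule in this section concrete, the Python sketch below computes a prognostic index as the linear predictor of the multivariable Cox model and compares it with the 0.82 cut-off. Several points are assumptions made for illustration: the coefficients are the full-data estimates from Table 2 (in LOOCV they would be re-estimated in each fold), the covariates are coded 0/1 with the reference levels age <50, female, and wild type, and the key names are hypothetical.

```python
# Full-data coefficients from Table 2; 0/1 coding is an assumption.
COEFFICIENTS = {"age_ge_50": 0.455, "male": 0.325, "megs_mutant": 0.608}
CUTOFF = 0.82  # cut-off value reported in the text


def prognostic_index(patient):
    """Linear predictor: sum of coefficient times covariate value."""
    return sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)


def risk_group(patient):
    """Assign 'high' when the prognostic index is above the cut-off."""
    return "high" if prognostic_index(patient) > CUTOFF else "low"


older_male_wild = {"age_ge_50": 1, "male": 1, "megs_mutant": 0}
older_female_mutant = {"age_ge_50": 1, "male": 0, "megs_mutant": 1}
groups = (risk_group(older_male_wild), risk_group(older_female_mutant))
```

Under these assumptions, a patient is classified as high risk only when the gene set is mutant together with at least one of the two clinical risk factors.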

4. Discussion

In this study, we identified MEGSs in GBM by integrating nonsynonymous SNVs and CNVs. Most genomic alterations that were involved in MEGSs were enriched in core pathways (RB, P53, and RTK/RAS/PI(3)K pathways) required for GBM pathogenesis [1, 6, 21], providing an important validation for the MEGSA. The most significant MEGSs included 3 genomic alterations: CDKN2A deletion, CDK4 amplification, and RB1 mutations (covered 87.7%). These genes are core members of the RB pathway, which plays a central role in the regulation of cell proliferation. In quiescent cells, hypophosphorylated RB (active) binds E2F to prevent cell progression through the G1/S cell checkpoint, whereas in the proliferating cell, the D-cyclin/CDK4/6 complex phosphorylates RB (inactive) leading to the release of E2F, which, in turn, induces genes required for DNA synthesis and cell growth. CDKN2A-p16INK4A is a negative regulator of the RB pathway, and CDKN2A-p16INK4A competes with D-cyclins to bind CDK4/6, which prevents the formation of the D-cyclin/CDK4/6 complex [1, 6, 29]. Intuitively, any genomic alteration, including CDKN2A deletion, CDK4 amplification, and RB1 mutations, can inactivate RB, resulting in cell proliferation. Moreover, our results showed that the ME pattern in the RB pathway was associated with poor prognosis in GBM. Previous studies have shown that disrupting the RB pathway is associated with prognosis of various human cancers [30-37]. Immunohistochemical analysis has shown that the underexpressed RB protein in gastric adenocarcinoma [36] and low expression of p16 (encoded by CDKN2A) in oral carcinoma [31], vertical growth phase melanoma [32], esophageal squamous cell carcinoma [35], and GBM [38] significantly predict poor patient survival. Bäcklund et al. have reported that any loss of CDKN2A and RB or the amplification of CDK4 in anaplastic astrocytoma (AA) was associated with decreased survival [39]. 
Furthermore, the poor prognosis of patients with an abnormal RB pathway may be due to the high resistance to etoposide (VP-16) caused by RB silencing [40]. One interpretation of mutual exclusivity, referred to as synthetic lethality, is that a second driver alteration within the same pathway is detrimental to cells and may result in cell death [12, 13, 41]. Therefore, our results provide a clue for the development of molecular targeted tumor therapies. Additionally, we developed a risk prediction model for prognostic stratification. The leave-one-out cross-validation and permutation test results supported the effectiveness of the model. ME analysis can overcome the limitation of frequency-based methods, which require large sample sizes, and can detect less frequently mutated genes [9, 10, 13]. Given that CDKN2A, TP53, RB1, PTEN, NF1, CDK4, MDM2, EGFR, PDGFRA, IDH1, and MET are well-known genes associated with GBM, the observed mutual exclusivity suggests that SPTA1 and CAPZA2 may be cancer genes. SPTA1, which is one of the most recurrent genes involved in the MEGSs, encodes α-spectrin. α-Spectrin and β-spectrin assemble into spectrin, an actin crosslinking and molecular scaffold protein that determines cell shape and membrane protein location [42, 43]. Alterations in SPTA1 are associated with colorectal cancer [44, 45] and small-cell lung cancer [42]. However, to date, the carcinogenic mechanism of mutated SPTA1 remains unknown. Previous studies have shown that nonerythroid α-spectrin interacts with proteins that are related to several cellular processes, such as DNA synthesis, cell cycle progression, and signal transduction, which is consistent with our findings [46]. We found that SPTA1 mutations were mutually exclusive with the core members of the RB, P53, and cell cycle pathways. Taken together, these data indicate that mutated SPTA1 may be related to abnormal cell proliferation and apoptosis in GBM development. 
CAPZA2 encodes the human actin-capping protein α-subunit. The function of the actin-capping protein is to block the growth of actin filaments by capping the barbed end [47]. The CAZ2 protein is overexpressed in breast cancer, and the F-actin-capping protein is linked to renal cell carcinoma [48, 49]. Moreover, Mueller et al. observed CAPZA2 amplification in glioma, which is in good agreement with our observation [50]. However, investigations of the role of CAPZA2 in cancer are rare. CAPZA2 may play a role in tumor-specific cell motility [48]. Recently, Ohishi et al. found that CAPZA2 negatively regulates cell invasion [51], which indicates that amplified CAPZA2 may be a favorable prognostic marker in cancer. To explore the prognostic value of SPTA1 and CAPZA2, survival analyses were performed. However, neither SPTA1 (p = 0.764) nor CAPZA2 (p = 0.213) was significantly associated with survival in patients with GBM after adjusting for age, gender, and CDK4(A)/RB1/CDKN2A(D). These data suggest that alterations in SPTA1 and CAPZA2 may be linked only to the formation of GBM. In contrast, the mutually exclusive gene set CDK4(A)/CDKN2A(D)/RB1 was involved in the formation of GBM and also predicted its prognosis. The main limitation of this study is the lack of external validation, which makes our results less reliable. Therefore, a study using a larger patient cohort and experiments with cell lines are required to validate our findings and allow more reliable conclusions to be reached.

5. Conclusions

In summary, we derived 21 MEGSs by integrating nonsynonymous single-nucleotide variants and copy number variations. In these MEGSs, only the ME pattern in the RB pathway predicted the prognosis of patients with GBM after adjusting for age and gender. This finding may help researchers develop molecular targeted therapies and identify high-risk GBM for better treatment. Additionally, we obtained several less frequent cancer genes, which may extend our knowledge on the pathogenesis of GBM.
  49 in total

1.  Multigene analysis of Rb pathway and apoptosis control in esophageal squamous cell carcinoma identifies patients with good prognosis.

Authors:  Dilek Güner; Isrid Sturm; Philipp Hemmati; Sandra Hermann; Steffen Hauptmann; Reinhard Wurm; Volker Budach; Bernd Dörken; Matthias Lorenz; Peter T Daniel
Journal:  Int J Cancer       Date:  2003-02-10       Impact factor: 7.396

2.  Identification of an amplified gene cluster in glioma including two novel amplified genes isolated by exon trapping.

Authors:  H W Mueller; A Michel; D Heckel; U Fischer; M Tönnes; L C Tsui; S Scherer; K D Zang; E Meese
Journal:  Hum Genet       Date:  1997-12       Impact factor: 4.132

3.  Short postoperative survival for glioblastoma patients with a dysfunctional Rb1 pathway in combination with no wild-type PTEN.

Authors:  L Magnus Bäcklund; Bo R Nilsson; Helena M Goike; Esther E Schmidt; Lu Liu; Koichi Ichimura; V Peter Collins
Journal:  Clin Cancer Res       Date:  2003-09-15       Impact factor: 12.531

4.  Novel breast cancer biomarkers identified by integrative proteomic and gene expression mapping.

Authors:  Keli Ou; Kun Yu; Djohan Kesuma; Michelle Hooi; Ning Huang; Wei Chen; Suet Ying Lee; Xin Pei Goh; Lay Keng Tan; Jia Liu; Sou Yen Soon; Suhaimi Bin Abdul Rashid; Thomas C Putti; Hiroyuki Jikuya; Tetsuo Ichikawa; Osamu Nishimura; Manuel Salto-Tellez; Patrick Tan
Journal:  J Proteome Res       Date:  2008-03-05       Impact factor: 4.466

5.  Abnormal expression of pRb, p16, and cyclin D1 in gastric adenocarcinoma and its lymph node metastases: relationship with pathological features and survival.

Authors:  Roger M Feakins; Carole D Nickols; Heena Bidd; Sarah-Jane Walton
Journal:  Hum Pathol       Date:  2003-12       Impact factor: 3.466

6.  MEGSA: A Powerful and Flexible Framework for Analyzing Mutual Exclusivity of Tumor Mutations.

Authors:  Xing Hua; Paula L Hyland; Jing Huang; Lei Song; Bin Zhu; Neil E Caporaso; Maria Teresa Landi; Nilanjan Chatterjee; Jianxin Shi
Journal:  Am J Hum Genet       Date:  2016-02-18       Impact factor: 11.043

7.  Evidence that synthetic lethality underlies the mutual exclusivity of oncogenic KRAS and EGFR mutations in lung adenocarcinoma.

Authors:  Arun M Unni; William W Lockwood; Kreshnik Zejnullahu; Shih-Queen Lee-Lin; Harold Varmus
Journal:  Elife       Date:  2015-06-05       Impact factor: 8.140

8.  A novel independence test for somatic alterations in cancer shows that biology drives mutual exclusivity but chance explains most co-occurrence.

Authors:  Sander Canisius; John W M Martens; Lodewyk F A Wessels
Journal:  Genome Biol       Date:  2016-12-16       Impact factor: 13.583

9.  CoMEt: a statistical approach to identify combinations of mutually exclusive alterations in cancer.

Authors:  Mark D M Leiserson; Hsin-Ta Wu; Fabio Vandin; Benjamin J Raphael
Journal:  Genome Biol       Date:  2015-08-08       Impact factor: 13.583

10.  The integrated landscape of driver genomic alterations in glioblastoma.

Authors:  Veronique Frattini; Vladimir Trifonov; Joseph Minhow Chan; Angelica Castano; Marie Lia; Francesco Abate; Stephen T Keir; Alan X Ji; Pietro Zoppoli; Francesco Niola; Carla Danussi; Igor Dolgalev; Paola Porrati; Serena Pellegatta; Adriana Heguy; Gaurav Gupta; David J Pisapia; Peter Canoll; Jeffrey N Bruce; Roger E McLendon; Hai Yan; Ken Aldape; Gaetano Finocchiaro; Tom Mikkelsen; Gilbert G Privé; Darell D Bigner; Anna Lasorella; Raul Rabadan; Antonio Iavarone
Journal:  Nat Genet       Date:  2013-08-05       Impact factor: 38.330
