Literature DB >> 32802843

Identification and Validation of an Energy Metabolism-Related lncRNA-mRNA Signature for Lower-Grade Glioma.

Jingwei Zhao1, Le Wang2, Bo Wei1.   

Abstract

Energy metabolic processes play important roles for tumor malignancy, indicating that related protein-coding genes and regulatory upstream genes (such as long noncoding RNAs (lncRNAs)) may represent potential biomarkers for prognostic prediction. This study will develop a new energy metabolism-related lncRNA-mRNA prognostic signature for lower-grade glioma (LGG) patients. A GSE4290 dataset obtained from Gene Expression Omnibus was used for screening the differentially expressed genes (DEGs) and lncRNAs (DELs). The Cancer Genome Atlas (TCGA) dataset was used as the prognosis training set, while the Chinese Glioma Genome Atlas (CGGA) was for the validation set. Energy metabolism-related genes were collected from the Molecular Signatures Database (MsigDB), and a coexpression network was established between energy metabolism-related DEGs and DELs to identify energy metabolism-related DELs. Least absolute shrinkage and selection operator (LASSO) analysis was performed to filter the prognostic signature which underwent survival analysis and nomogram construction. A total of 1613 DEGs and 37 DELs were identified between LGG and normal brain tissues. One hundred and ten DEGs were overlapped with energy metabolism-related genes. Twenty-seven DELs could coexpress with 67 metabolism-related DEGs. LASSO regression analysis showed that 9 genes in the coexpression network were the optimal signature and used to construct the risk score. Kaplan-Meier curve analysis showed that patients with a high risk score had significantly worse OS than those with a low risk score (TCGA: HR = 3.192, 95%CI = 2.182-4.670; CGGA: HR = 1.922, 95%CI = 1.431-2.583). The predictive accuracy of the risk score was also high according to the AUC of the ROC curve (TCGA: 0.827; CGGA: 0.806). Multivariate Cox regression analyses revealed age, IDH1 mutation, and risk score as independent prognostic factors, and thus, a prognostic nomogram was established based on these three variables. The excellent prognostic performance of the nomogram was confirmed by calibration and discrimination analyses. In conclusion, our findings provided a new biomarker for the stratification of LGG patients with poor prognosis.
Copyright © 2020 Jingwei Zhao et al.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 32802843      PMCID: PMC7403901          DOI: 10.1155/2020/3708231

Source DB:  PubMed          Journal:  Biomed Res Int            Impact factor:   3.411


1. Introduction

Lower-grade gliomas (LGG) that include World Health Organization (WHO) grade II and III diffuse gliomas are common infiltrative brain tumors in adults [1]. Although advances have been made for the treatment of LGG, including neurosurgical resection, chemotherapy, and radiotherapy, a considerable proportion of patients still experience recurrence and malignant transformation to high-grade glioblastoma multiforme (GBM; WHO grade IV) [2], leading to declines in their health-related quality of life [3] and eventual death [2]. This heterogeneity in the prognosis of patients with LGG highlights the necessity to develop effective biomarkers to early stratify the patients at high risk for poor outcomes and give preventative therapy. In order to maintain the malignant characteristics (rapid proliferation, migration, and invasion), tumor cells (including gliomas) need to produce a large amount of energy [4]. It is well known that carbohydrate, lipid, and amino acid metabolic processes are the main sources for the production of adenosine triphosphate (ATP) [5]. Therefore, the expression changes in genes involved in these metabolic processes may be important molecular mechanisms for the progression of gliomas, and the genes may represent potential biomarkers for prognostic prediction. This theory has been demonstrated by some scholars. For example, Qi et al. extracted the fatty acid catabolic metabolism-related genes from Molecular Signatures Database (MsigDB) and then identified an 8-gene risk signature using the Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis based on RNA-seq data from the Chinese Glioma Genome Atlas (CGGA) dataset and The Cancer Genome Atlas (TCGA) dataset. This risk signature was found to be an independent prognostic factor for patients with all grade gliomas (CGGA: hazard ratios (HR) = 4.0044, 95%confidence intervals (CI) = 2.7634‐5.8028; TCGA: HR = 1.7382, 95%CI = 1.0577‐2.8567) [6]. Wu et al. used the Cox proportional hazards model to screen a prognostic signature from the differential genes of lipid metabolism between LGG and GBM. Consequently, a nine-gene signature was obtained as a classifier, which was demonstrated to significantly distinguish the overall survival (OS) between the high- and low-risk group of CGGA and TCGA cohorts [7]. Univariate Cox regression analysis performed by Zhao et al. generated a signature including 45 glucose-related genes. This risk score was associated with the OS of patients in the CGGA (HR = 2.293, 95%CI = 1.471‐3.576) training dataset and TCGA (HR = 1.227, 95%CI = 1.000‐1.504) and GSE16011 (HR = 1.440; 95%CI = 1.016‐2.039) validation datasets [8]. Based on glioma datasets from TCGA, REMBRANDT (Repository for Molecular Brain Neoplasia Data), and GSE16011, Chen et al. established a glycolytic gene expression signature score by incorporating ten glycolytic genes. A high risk score was reported to predict poor prognosis for patients with GBM [9]. Zhou et al. obtained 587 energy metabolism-related genes (including 41 carbohydrate, 73 lipid, and 144 protein metabolisms) from MsigDB and overlapped with the 463 differentially expressed genes between LGG and GBM to develop a risk score. As a result, a 29-gene signature was identified, and its predictive accuracy can reach 87.2% [10]. However, energy metabolism-related prognostic biomarkers for LGG remain rarely reported. In addition, long noncoding RNAs (lncRNAs), a class of noncoding transcripts > 200 nucleotides in length, had also been demonstrated to affect cancer progression by regulation of the energy metabolism genes and then the related processes [11]. For example, Cheng et al. identified that highly expressed lncRNA X-inactive specific transcript (XIST) may promote cell viability, migration, invasion, and resistance to apoptosis by increasing glucose uptake, with the mechanism referring to upregulation of glucose transporters GLUT1 and GLUT3 [12]. The study of He et al. revealed that upregulated lncRNA UCA1 may induce glycolysis and invasion in glioma cells by competitively binding to miR-182 and then influencing the downstream target (fructose-2,6-biphosphatase) of miR-182 [13]. These findings indicated that the metabolism-related lncRNAs may also have underlying prognostic values for LGG; nevertheless, no studies focused on the lncRNA signature until now. Thus, the goal of this study was to construct a new energy metabolism-related lncRNA-mRNA prognostic signature for LGG patients.

2. Materials and Methods

2.1. Dataset Mining

Three datasets obtained from Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/), TCGA (https://gdc-portal.nci.nih.gov/; level 3), and CGGA (http://www.cgga.org.cn; level 3) were included in our study. GSE4290 in GEO was enrolled because it met the following inclusion criteria: (1) studying RNA expression profiles, (2) studying brain tissue samples, (3) the number of samples more than 50, and (4) consisting of LGG (n = 76) and normal controls (n = 23). GSE4290 was used for screening the differentially expressed RNAs (DERs). Samples of TCGA (n = 520) and CGGA (n = 431) were eligible if they (1) belonged to the LGG type and (2) provided the clinical survival outcomes because TCGA was used as the training set for constructing the prognosis model, while CGGA was used as the validation set to confirm the prognosis value of the established model.

2.2. Identification of Energy Metabolism-Related Genes

The mRNAs and lncRNAs in the GSE4290 dataset were reannotated by the HUGO Gene Nomenclature Committee (HGNC; http://www.genenames.org/) that includes the standard nomenclature for 4516 lncRNAs and 19,200 protein-coding genes [14]. The Linear Models for Microarray Data (LIMMA) method (version 3.34.7; https://bioconductor.org/packages/release/bioc/html/limma.html) [15] in the Bioconductor R package (version 3.4.1; http://www.R-project.org/) was used to identify differentially expressed genes (DEGs) and lncRNAs (DELs) between LGG and normal controls. The false discovery rate (FDR) < 0.05 and ∣logFC(fold change) | >1 served as the screening threshold. Clustering analysis was conducted by the Pheatmap package (version: 1.0.8; https://cran.r-project.org/web/packages/pheatmap) in R language. Energy metabolism-related gene sets (REACTOME_METABOLISM_OF_AMINO_ACIDS_AND_DERIVATIVES, REACTOME_METABOLISM_OF_CARBOHYDRATES, and REACTOME_METABOLISM_OF_LIPIDS) were collected from MSigDB (version 7.0; http://software.broadinstitute.org/gsea/msigdb/). These genes were overlapped with the DEGs to obtain energy metabolism-related DEGs.

2.3. Identification of Energy Metabolism-Related lncRNAs

Energy metabolism-related lncRNAs were determined by constructing a coexpression network between DELs and energy metabolism-related DEGs. The coexpression pairs were selected by calculation of Pearson correlation coefficients (PCC) between lncRNAs and DEGs by cor.test function (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/cor.test.html) in R. PCC > 0.6 revealed that a significant correlation existed. The network was visualized in the Cytoscape software (version 3.6.1; http://www.cytoscape.org/).

2.4. Function Enrichment Analysis

To illustrate the specific metabolic processes involved of the lncRNAs in the network, function enrichment analysis was performed for the genes coexpressed with lncRNAs using the Database for Annotation, Visualization and Integrated Discovery (DAVID) (version 6.8; http://david.abcc.ncifcrf.gov) [16] and Enrichr (https://amp.pharm.mssm.edu/Enrichr/) [17]. Significant Gene Ontology (GO) biological process terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways with p value < 0.05 were collected and visualized using the ggplot2 package (version 3.3.0; https://cran.r-project.org/web/packages/ggplot2) in R.

2.5. Signature Development and Validation

Based on the TCGA data, univariate and multivariate Cox regression analyses were sequentially performed to evaluate the prognostic ability of energy metabolism-related DELs and DEGs in the coexpression network using “survival” package in R (version, 2.41-1; http://bioconductor.org/packages/survivalr/), with p < 0.05 tested by log-rank testing as the statistical threshold. A Cox proportional hazards model based on the L1-penalized regularization regression algorithm in the penalized package (version, 0.9-5; http://bioconductor.org/packages/penalized/) [18, 19] was subsequently conducted in DELs and DEGs still significant after multivariate analysis to further screen the optimal signature combination. The expression differences of signature genes between LGG and GBM were subsequently identified in GSE4290 (71 vs. 81), CGGA (443 vs. 249), and TCGA (524 vs. 155) datasets via an unpaired t-test to verify their specificity, with p < 0.05 set as a statistical difference. Finally, the risk score was established as follows: Prognostic risk score = ∑βDERs × ExpDERs, where ExpDERs is the expression levels of prognostic DERs and ∑βDERs is the prognostic coefficients of DERs after LASSO analysis. The LGG patients were divided into the high-risk group and low-risk group using the median risk score as the cut-off. The Kaplan-Meier curve and receiver operating characteristic (ROC) curve were used to assess the predictive ability of the energy metabolism-related signature. These analyses were performed for the training dataset (TCGA) and validation dataset (CGGA), respectively. To validate if the risk score could be independent of other clinicopathological parameters, univariate and multivariate Cox regression analyses were performed using the training dataset, followed by the stratification analysis for clinical variables with p < 0.05 in multivariate analysis. Furthermore, a nomogram using the result of multivariate Cox regression analysis was constructed to predict the 3-year and 5-year OS. The performance of the nomogram was assessed by discrimination and calibration. The discrimination was determined by the area under the curve (AUC) of the ROC curve and concordance index (C-index). The calibration was evaluated by calibration curves, which shows the agreement between the predicted and observed survival probabilities.

3. Results

3.1. Identification of Energy Metabolism-Related DERs

A total of 15,183 protein-encoding mRNAs and 576 lncRNAs were annotated in three included datasets. Based on the LIMMA method, 1613 mRNAs and 37 lncRNAs were found to be differentially expressed between LGG and normal brain tissues in the GSE4290 dataset (Figure 1(a)). Hierarchical clustering analysis showed that LGG samples could be clearly distinguished from normal samples according to the expressions of these DERs (Figure 1(b)).
Figure 1

Identification of energy metabolism-related differentially expressed RNAs between lower-grade gliomas and normal brain tissues in the GSE4290 dataset. (a) Volcano plot to display the distribution of differentially expressed RNAs, which was performed by ggplot2. Green dots were downregulated RNAs; red dots were upregulated RNAs; black dots were RNAs not differentially expressed. FC: fold change; FDR: false discovery rate. (b) Heat map of differentially expressed RNAs, which was created by Pheatmap. Red indicated high expression; green indicated low expression. (c) Venn diagram to display the overlap between differentially expressed protein-coding genes (DEGs) and energy metabolism-related genes obtained from Molecular Signatures Database.

By searching the MsigDB database, a total of 1403 energy metabolism-related genes were downloaded, including 293 for carbohydrate, 738 for lipid, and 372 for amino acids and derivative metabolism. These energy metabolism-related genes were then overlapped with the above 1613 DEGs, with 110 shown to be shared (Figure 1(c)), indicating that they were energy metabolism-related DEGs. By calculation of PCC, a total of 585 coexpression pairs between 27 DELs and 67 energy metabolism-related DEGs (such as GABPB1-AS1-PON2 (paraoxonase 2), HAR1A-CYP46A1 (cytochrome P450 family 46 subfamily A member 1)/HK1 (hexokinase 1), LINC00599-INPP5J (inositol polyphosphate-5-phosphatase J), SNAI3-AS1-INPP5J/HK1, and SNHG1-GPC2 (glypican 2)) (Figure 2), were obtained, indicating that these 27 DELs may be associated with the regulation of energy metabolism. Function enrichment analysis using DAVID (Figure 3; Table 1) and Enrichr (Supplementary Table 1) also showed that these 67 DEGs were enriched into energy metabolism-related GO biological process terms (such as GO:0030203~glycosaminoglycan metabolic process (GPC2), GO:0030208~dermatan sulfate biosynthetic process (UST, uronyl 2-sulfotransferase), GO:0055114~oxidation-reduction process (CYP46A1), GO:0046856~phosphatidylinositol dephosphorylation (INPP5J), GO:0006094~gluconeogenesis (SLC25A1, solute carrier family 25 member 1), GO:0006629~lipid metabolic process (FABP6, fatty acid binding protein 6), GO:0046488 phosphatidylinositol metabolic process (MBOAT7, membrane bound O-acyltransferase domain containing 7), and GO:0006631 fatty acid metabolic process (PON2)) and KEGG pathways (such as hsa01100: metabolic pathways (INPP5J, HK1), hsa01200: carbon metabolism (HK1), and glycerophospholipid metabolism (MBOAT7)). Thus, these 27 DELs and 67 DEGs were used for the following analyses.
Figure 2

Identification of energy metabolism-related differentially expressed lncRNAs based on their coexpression with differentially expressed protein-coding mRNAs. Upregulated lncRNAs (square) and mRNAs (circle) were in red; downregulated lncRNAs (square) and mRNAs (circle) were in green. The coexpression pairs between lncRNAs and mRNAs were selected by calculation of Pearson correlation coefficients (PCC) by cor.test function. Only coexpression pairs with PCC > 0.6 were visualized.

Figure 3

Function enrichment analysis for genes in the coexpression network by the Database for Annotation, Visualization and Integrated Discovery database. (a) Gene Ontology (GO) biological process terms; (b) Kyoto Encyclopedia of Genes and Genomes (KEGG). These plots were generated using the ggplot2 package.

Table 1

Function enrichment.

CategoryTerm p valueGenes
Biology processGO:0006661~phosphatidylinositol biosynthetic process3.380e − 06INPP5J, ARF3, SYNJ1, PI4KA, PIP4K2A, MTMR7
GO:0030203~glycosaminoglycan metabolic process5.100e − 06B3GAT1, GPC2, CHST9, BCAN, VCAN
GO:0030206~chondroitin sulfate biosynthetic process1.310e − 04CHSY3, CHST9, BCAN, VCAN
GO:0009725~response to hormone6.240e − 04ME1, OXCT1, FHL2, DHCR24
GO:0035338~long-chain fatty-acyl-CoA biosynthetic process6.240e − 04ACOT7, ELOVL4, SLC25A1, ACOT4
GO:0030208~dermatan sulfate biosynthetic process1.009e − 03UST, BCAN, VCAN
GO:0055114~oxidation-reduction process2.379e − 03ME1, TYRP1, CYP46A1, CYP2C8, QDPR, CBR4, SRD5A1, CRYM, DHCR24
GO:0006699~bile acid biosynthetic process3.136e − 03CYP46A1, STAR, SLC27A2
GO:0046856~phosphatidylinositol dephosphorylation4.090e − 03INPP5J, SYNJ1, MTMR7
GO:0008652~cellular amino acid biosynthetic process4.792e − 03GLS2, ASPA, GOT1
GO:0042493~response to drug7.278e − 03STAR, BCHE, OXCT1, FABP3, SRD5A1, AACS
GO:0019083~viral transcription1.015e − 02RPLP0, NUP93, TPR, RPS21
GO:0006486~protein glycosylation1.040e − 02B3GAT1, B3GALT2, B4GALT6, LRP2
GO:0070859~positive regulation of bile acid biosynthetic process1.192e − 02NR1D1, STAR
GO:0006024~glycosaminoglycan biosynthetic process1.218e − 02GPC2, HAS1, HS3ST2
GO:0006094~gluconeogenesis1.332e − 02GOT1, ENO2, SLC25A1
GO:0006533~aspartate catabolic process1.587e − 02ASPA, GOT1
GO:0071320~cellular response to cAMP1.829e − 02STAR, RPLP0, SRD5A1
GO:0016101~diterpenoid metabolic process1.979e − 02STAR, SRD5A1
GO:0071394~cellular response to testosterone stimulus1.979e − 02SRD5A1, AACS
GO:0006629~lipid metabolic process2.485e − 02CPNE6, AACS, LRP2, FABP6
GO:0060992~response to fungicide2.760e − 02STAR, SRD5A1
GO:0044539~long-chain fatty acid import2.760e − 02FABP3, SLC27A2
GO:0044321~response to leptin2.760e − 02NR1D1, STAR
GO:0046951~ketone body biosynthetic process3.148e − 02HMGCLL1, AACS
GO:0007584~response to nutrient3.523e − 02STAR, OXCT1, AACS
GO:0035457~cellular response to interferon-alpha3.535e − 02STAR, TPR
GO:0032869~cellular response to insulin stimulus3.788e − 02GOT1, STAR, SRD5A1
GO:0046855~inositol phosphate dephosphorylation4.304e − 02SYNJ1, MTMR7
GO:0046488~phosphatidylinositol metabolic process4.304e − 02PITPNM3, SYNJ1
GO:0051292~nuclear pore complex assembly4.304e − 02NUP93, TPR
GO:0071872~cellular response to epinephrine stimulus4.686e − 02STAR, SRD5A1

KEGG pathwayhsa01100:metabolic pathways9.150e − 06ME1, PLD3, TYRP1, HMGCLL1, B3GALT2, CYP2C8, SYNJ1, PI4KA, QDPR, HK1, AGMAT, RIMKLA, ACOT4, GLS2, ASPA, CKMT1A, GOT1, CHSY3, INPP5J, ENO2, B4GALT6, MTMR7, DHCR24
hsa00562:inositol phosphate metabolism1.555e − 03INPP5J, SYNJ1, PI4KA, PIP4K2A, MTMR7
hsa00250:alanine, aspartate, and glutamate metabolism1.894e − 03GLS2, ASPA, GOT1, RIMKLA
hsa04070:phosphatidylinositol signaling system5.025e − 03INPP5J, SYNJ1, PI4KA, PIP4K2A, MTMR7
hsa00330:arginine and proline metabolism5.263e − 03CKMT1A, GOT1, AGMAT, CARNS1
hsa00062:fatty acid elongation1.343e − 02ACOT7, ELOVL4, ACOT4
hsa00650:butanoate metabolism1.558e − 02HMGCLL1, OXCT1, AACS
hsa00280:valine, leucine, and isoleucine degradation4.385e − 02HMGCLL1, OXCT1, AACS
hsa01200:carbon metabolism4.608e − 02ME1, GOT1, ENO2, HK1

3.2. Development of Energy Metabolism-Related DER-Based Risk Score

Univariate Cox regression analysis revealed that 13 out of 27 energy metabolism-related DELs and 27 out of 67 energy metabolism-related DEGs were significantly correlated with OS of LGG patients in the TCGA dataset. Then, they were entered into multivariate Cox regression. Five energy metabolism-related DELs and four energy metabolism-related DEGs were filtered as significant, independent prognostic factors. These 9 genes were further suggested to be the powerful prognostic indicators after LASSO regression analysis (Table 2). Also, there were significant expression differences in these 9 genes between LGG and GBM, indicating that they were specific signatures for LGG (Table 2). A risk score was constructed by combining the expression levels of these 9 genes with their LASSO coefficients as the following formula: (−0.05468 × expression of GABPB1-AS1) + (−0.64387 × expression of HAR1A) + (0.00904 × expression of LINC00599) + (−1.81399 × expression of SNAI3-AS1) + (0.29846 × expression of SNHG1) + (0.55872 × expression of FABP6) + (0.77305 × expression of MBOAT7) + (−0.40797 × expression of SLC25A1) + (−0.55501 × expression of UST).
Table 2

The optimal signature combination.

SymbolTypeExpression (LGG vs. CK)Expression (GBM vs. LGG)Multivariate Cox regression analysisLASSO coefficient
logFCFDR (GSE4290) p value (GSE4290) p value (CGGA) p value (TCGA)HR95% CI p value
GABPB1-AS1lncRNA1.072.72e − 077.59e − 036.72e − 068.83e − 2890.99460.9899-0.99942.846e − 02-0.05468
HAR1AlncRNA-1.517.65e − 044.31e − 031.65e − 151.62e − 1020.99380.988-0.99963.580e − 02-0.64387
LINC00599lncRNA-1.34.80e − 053.64e − 022.36e − 062.62e − 1051.00351.0001-1.0074.628e − 020.00904
SNAI3-AS1lncRNA-1.769.51e − 087.96e − 033.12e − 101.71e − 2580.98170.9701-0.99342.310e − 03-1.81399
SNHG1lncRNA1.36.06e − 091.28e − 011.28e − 048.19e − 2221.00541.0004-1.01043.328e − 020.29846
FABP6mRNA-2.181.13e − 072.94e − 014.34e − 021.42e − 1281.00621.004-1.00856.270e − 080.55872
MBOAT7mRNA-1.273.92e − 098.01e − 011.11e − 0301.00841.0017-1.01521.463e − 020.77305
SLC25A1mRNA1.032.39e − 094.05e − 035.96e − 1200.99550.9911-0.99994.934e − 02-0.40797
USTmRNA1.475.50e − 105.80e − 057.57e − 082.41e − 2560.99510.9914-0.99891.177e − 02-0.55501

LGG: lower-grade gliomas; GBM: glioblastoma multiforme; CK: normal control; CGGA: Chinese Glioma Genome Atlas; TCGA: The Cancer Genome Atlas; FC: fold change; FDR: false discovery rate; HR: hazard ratio; CI: confidence interval; LASSO: least absolute shrinkage and selection operator.

After calculation of the risk score for each patient in TCGA and CGGA datasets, the patients were dichotomized to the low-risk (patients with a high risk score had significantly worse OS than those with a low risk score (TCGA: HR = 3.192, 95%CI = 2.182‐4.670, p = 1.989e − 10, Figure 4(a); CGGA: HR = 1.922, 95%CI = 1.431‐2.583, p = 1.039e − 05, Figure 4(c)). The predictive accuracy of the prognostic risk score was also demonstrated to be high according to the AUC of ROC curve (TCGA: 0.827, Figure 4(b); CGGA: 0.806, Figure 4(d)).
Figure 4

The prognostic performance assessment for the risk score model. (a, c) Kaplan-Meier survival curve analysis to show the overall survival difference between the high- and low-risk group of the training (a) and validation (c) datasets; (b, d) receiver operator characteristic (ROC) curves to demonstrate the predictive accuracy for the overall survival of patients in the training (b) and validation (d) datasets. CGGA: Chinese Glioma Genome Atlas; TCGA: The Cancer Genome Atlas; HR: hazard ratio; AUC: area under the ROC curve.

To construct a prognostic nomogram, univariate and multivariate Cox regression analyses were performed for risk score and several clinical factors to explore all independent risk factors for OS. Univariate analysis demonstrated that age, histological type, isocitrate dehydrogenase 1 (IDH1) mutation, neoplasm histologic grade, radiation therapy, and risk score were significant prognostic factors. These significant variables further underwent multivariate analysis. As a result, age (Figure 5(a)), IDH1 mutation (Figure 5(d)), and risk score served as independent prognostic factors (Table 3). Also, stratification analysis showed that the risk score could further distinguish the prognosis of patients with the same age (<42 years, p = 2.529e − 04, Figure 5(b); ≥42 years, p = 3.071e − 09, Figure 5(c)) and IDH status (without mutation, p = 5.918e − 03, Figure 5(e); with mutation, p = 1.032e − 01, Figure 5(f)), implying that it is necessary to integrate the risk score into the clinical prognostic factors. Thus, a nomogram was then constructed based on these independent prognostic factors (Figure 6(a)). The excellent prognostic performance of the nomogram was confirmed by calibration (approximate to the 45-degree line for 3- and 5-year OS prediction) (Figure 6(b)) and discrimination (AUC = 0.845 and C‐index = 0.928; both higher than age, IDH status, and risk score model alone) (Table 4; Figure 6(c)).
Figure 5

Stratification analysis based on age (a) and IDH1 mutation (d) which were also independent prognostic factors for overall survival in addition to the risk score. The Kaplan-Meier curve showed significant differences in overall survival between the high-risk group and the low-risk group in different ages (b, c) and IDH1 mutation status (e, f). HR: hazard ratio; y: year; IDH: isocitrate dehydrogenase.

Table 3

Univariate and multivariate Cox regression analyses of clinical pathologic features for overall survival.

VariablesTCGA (N = 520)Univariate analysisMultivariate analysis
HR95% CI p valueHR95% CI p value
Age (years, mean ± SD)42.84 ± 13.391.0571.043-1.0722.78e − 151.0491.003-1.0963.59e − 02
Gender (male/female)286/2341.1450.811-1.6174.40e − 01
Animal insect allergy history (yes/no/-)16/289/2150.8310.202-3.4287.92e − 01
Asthma history (yes/no/-)22/346/1520.8940.409-1.9567.76e − 01
Food allergy history (yes/no/-)20/290/2101.0080.363-2.8019.87e − 01
Hay fever history (yes/no/-)38/304/1780.4880.177-1.3471.24e − 01
Headache history (yes/no/-)174/297/490.8410.576-1.2273.64e − 01
Histological type (astrocytoma/oligoastrocytoma/oligodendroglioma)194/132/1940.7560.620-0.9215.25e − 030.5290.244-1.1491.08e − 01
IDH1 mutation (yes/no/-)91/34/3950.1810.0675-0.4844.14e − 040.2260.0729-0.7019.98e − 03
Neoplasm histologic grade (G2/G3/-)254/265/13.4162.341-4.9851.84e − 110.9490.294-3.0639.30e − 01
Radiation therapy (yes/no/-)294/181/451.8471.209-2.8202.83e − 031.9810.584-6.7222.73e − 01
Targeted molecular therapy268/200/521.3890.955-2.0198.08e − 02
Risk score (high/low)260/2603.1922.182 -4.6701.99e − 105.0411.272-19.972.13e − 02
Death (dead/alive)133/387
Overall survival days (months, mean ± SD)32.97 ± 32.78

HR: hazard ratio; CI: confidence interval; SD: standard deviation; IDH: isocitrate dehydrogenase; TCGA: The Cancer Genome Atlas. Bold indicated the factors with statistical difference (p < 0.05).

Figure 6

Establishment and assessment of a prognostic nomogram based on age, IDH1 mutation status, and the risk score. (a) A prognostic nomogram; (b) the calibration assessment by calibration curves; (c) the discrimination assessment by receiver operator characteristic curve. RS: risk score; IDH: isocitrate dehydrogenase.

Table 4

The performance of the nomogram assessed by discrimination parameters.

ModelAUCC-index p valueSpecificitySensitivity
Age0.5640.74300.8890.278
IDH1 mutation0.6460.7338.138e − 050.8490.442
Multiclinical0.7790.81.278e − 060.8770.632
lncRNA alone0.7470.7029.97e − 140.8320.662
mRNA alone0.7390.73700.7780.662
Multi-RNA based0.8270.81200.7980.827
Multi-RNA combined0.8450.92800.8630.782

AUC: area under the curve of receiver operating characteristic curve; C-index: concordance index; IDH: isocitrate dehydrogenase.

4. Discussion

In this study, we, for the first time, attempted to develop an energy metabolism-related prognostic signature for LGG patients based on the lncRNAs and mRNAs. As a result, a prognostic risk score established by 9 energy metabolism-associated lncRNAs (GABPB1-AS1, HAR1A, LINC00599, SNAI3-AS1, and SNHG1)-mRNAs (FABP6, MBOAT7, SLC25A1, and UST) was generated. This risk score was demonstrated to be an independent prognostic factor for OS prediction, with the predictive accuracy reaching 82.7% for TCGA and 80.6% for the CGGA dataset, respectively, which seemed to be higher than the signature established by lncRNAs and mRNAs alone previously for LGG, such as Zhou et al. (29-mRNA, AUC = 79.1% for CGGA) [10], Ni et al. (25-mRNA, AUC = 77.1%) [20], Wang et al. (4-mRNA, AUC = 62.0%) [21], and Kiran et al. (8-lncRNA, AUC < 80%) [22]. This conclusion was also proved by our study (AUC = 0.827 vs. 0.747 for lncRNA alone model and 0.739 for mRNA alone model; C‐index = 0.812 vs. 0.702 for lncRNA alone model and 0.737 for mRNA alone model). Furthermore, the poor prognosis of LGG was traditionally determined according to clinical characteristics, including older age [23] and IDH1 nonmutational status [24, 25]. Thus, whether the risk score was superior or added additional prognostic value to these current clinical systems for prognosis prediction was also an important focus in the signature studies [10, 22, 26]. In the present study, we performed the stratification, AUC, C-index calculation, and nomogram analyses to confirm it. In line with previous studies [10, 22, 26], our results showed that the OS of patients with the same age and IDH1 mutation status can be further stratified by the risk score. Also, the AUC and C-index of risk score were obviously higher than age (AUC = 0.827 vs. 0.564; C‐index = 0.812 vs. 0.743) and IDH1 mutation status (AUC = 0.827 vs. 0.646; C‐index = 0.812 vs. 0.733). The prognostic performance was the highest (calibration plot: approximate to the 45-degree line for OS prediction; discrimination: AUC = 0.845; C‐index = 0.928) if age, IDH1 mutation, and the risk score were combined. These findings suggested that our identified combination (risk score+age+IDH1 mutation) may be a promising biomarker for clinical prediction of the outcomes in LGG patients. Although our molecular prognostic signature was new for LGG patients, several lncRNAs included had been demonstrated to be related to the progression and prognosis for glioma. For example, the study of Luan et al. revealed that GABPB1-AS1 was a protective factor for the poor OS in patients with glioma (HR = 0.668, 95%CI = 0.494‐0.904) and GABPB1-AS1 may be involved in glioma by regulating autophagy-related genes [27]. Zou et al. reported that HAR1A was significantly downregulated in patients with GBM compared with nontumor controls (logFC = −2.873, p = 2.98e − 11). Multivariate analysis showed that low HAR1A expression was an independent prognosis factor for OS of glioma patients (HR = 1.6, p = 0.021) [28]. Fu et al. found that the expression of LINC00599 was reduced in LGG and GBM tissues as well as glioma cell lines compared with normal brain tissues or human astrocytes. Low LINC00599 expression was associated with poor disease-free survival and OS of glioma patients. In vitro study implied that overexpression of LINC00599 inhibited cell migration and invasion through blocking the epithelial-mesenchymal transition process [29]. Liu et al. observed that SNHG1 was overexpressed in glioma tissues and cell lines. In vitro and in vivo assays suggested that SNHG1 promoted glioma progression by functioning as a sponge for miR-194 and then inducing the high expression of pleckstrin homology like domain family A, member 1 (PHLDA1) [30]. Li et al. elucidated that SNHG1 may regulate the malignant behavior of glioma cells by binding to miR-154-5p or miR-376b-3p and then enhancing the expression of downstream target of both miR-154-5p and miR-376b-3p FOXP2 [31]. Kaplan-Meier analysis of Wang et al. showed that high expression of SNHG1 was significantly associated with poor OS. Functional studies demonstrated that knockdown of SNHG1 suppressed glioma cell proliferation and cell invasion and increased cell apoptosis [32]. In line with these studies, we also demonstrated that GABPB1-AS1 and SNHG1 were highly expressed, while HAR1A and LINC00599 were lowly expressed in LGG compared with normal brain tissues. They were all OS-related genes for LGG. However, their functions in glioma remain not well understood. In this study, we predicted that these four lncRNAs may play crucial roles in LGG by regulating the transcription of energy metabolism-related genes, including GABPB1-AS1-PON2, HAR1A-CYP46A1/HK1, LINC00599-INPP5J, and SNHG1-GPC2. The roles of these energy metabolism-related genes had been implicated for cancers. PON2 is a member of the paraoxonase family and had been demonstrated to exert an antioxidative function by improving mitochondrial efficiency to reduce reactive oxygen species production. Theoretically, PON2 should be downregulated in cancers exposed to anoxia. However, several studies showed that PON2 expression was obviously increased in gastric cancer [33] and bladder cancer [34] tissues compared with normal tissue samples. Overexpression of PON2 led to a significant increase in tumor cell proliferation and resistance to oxidative stress [34], while silencing of PON2 expression inhibited cancer cell proliferation, migration, and invasion [24]. Patients with high PON2 expression had shorter OS compared with those having low PON2 expression [23]. These findings indicated that high expression of PON2 may represent a protective stress response. It was reported that CYP46A1, a brain-specific enzyme that converts the cholesterol into 24(S)-hydroxycholesterol (24OHC), was significantly decreased in GBM samples compared with normal brain tissue. Low expression of CYP46A1 was associated with increasing tumor grade and poor prognosis in human gliomas [35]. Overexpression of CYP46A1 or the use of activator suppressed cell proliferation and in vivo tumor growth by increasing 24OHC levels [25]. Malignant tumors often rely on glycolysis to produce ATP (that is, Warburg effect), but not tricarboxylic acid cycle. Thus, glycolytic enzymes may play important roles for cancer. Hexokinase (HK) is the first rate-limiting enzyme to phosphorylate glucose to form glucose-6-phosphate (G-6-P). Theoretically, all hexokinase isozymes (HK1 to HK4) should be highly expressed for tumor initiation and maintenance. However, a study showed that HK2 was upregulated [36] but HK1 was downregulated to increase glycolysis and accelerate tumor growth and metastasis [37]. This result may be attributed to the underlying regulation between HK1 and HK2, which was also confirmed in the study of Tseng et al. (that is, knockdown of HK1 increased the HK2 level; silencing of HK2 elevated HK1 expression) [37]. INPP5J is a negative regulator for PI3K/AKT signaling and thus may function as a tumor suppressor, which was demonstrated in breast cancer [38], ovarian cancer [39], hepatocellular carcinoma [40], and oesophageal squamous cell carcinoma [41]. GPC2 is a member of the human glypican family of proteins that mediate neuronal cell adhesion and neurite outgrowth by attaching to the cell surface via a GPI anchor. Hereby, GPC2 protein is highly expressed in brain tumors, which had been validated in neuroblastoma [42, 43]. In accordance with these studies, we also identified that PON2 and GPC2 were upregulated, while CYP46A1, HK1, and INPP5J were downregulated in LGG compared with normal control (Supplementary Table 1). The published study on SNAI3-AS1 was rarely reported, except one for hepatocellular carcinoma, in which highly expressed SNAI3-AS1 was shown to be correlated with shorter OS; knockdown of SNAI3-AS1 inhibited cell proliferation and metastasis, whereas inverse conclusions were obtained with overexpression of SNAI3-AS1 in vitro. Functional investigations showed that SNAI3-AS1 may affect tumorigenesis by inducing tumor epithelial to mesenchymal transition via regulating the UPF1/Smad7 signaling pathway [44]. Unfortunately, our results showed that SNAI3-AS1 was downregulated in LGG, suggesting that SNAI3-AS1 may be tissue-specifically expressed and a new target for LGG. Furthermore, we speculated that SNAI3-AS1 may function in LGG by coexpressing with HK1 and INPP5J as described above. In addition to lncRNAs, four energy metabolism-related genes (FABP6, MBOAT7, SLC25A1, and UST) were also included into the prognostic signature, indicating their importance for LGG. Some genes had been demonstrated to be related to the development and progression of cancer previously. For example, a gastric cancer risk allele carrier was observed to have downregulated expressions of MBOAT7 [45]. Overexpression of mitochondrial citrate carrier (SLC25A1) was proved to be associated with reduced survival of lung cancer patients [46]. The mechanisms of SLC25A1 in cancer referred to its antioxidant defense and maintenance of the self-renewal capability of cancer stem cells [46, 47]. Knockdown of UST could significantly reduce migration and adhesion in mouse melanoma cells and pulmonary metastasis in mice [48]. However, no studies focused on glioma, suggesting they may also represent new biomarkers for LGG. There were some limitations in this study. First, the expression of crucial lncRNAs and mRNAs should be verified with quantitative PCR, western blotting, or immunohistochemistry in LGG and normal brain tissues with larger sample size. Second, the prognosis ability of our risk score and combined model should be validated in newly hospitalized LGG patients. Third, in vitro and in vivo experiments are also essential to confirm the coexpression mechanisms of our identified lncRNAs (GABPB1-AS1-PON2, HAR1A-CYP46A1/HK1, LINC00599-INPP5J, and SNHG1-GPC2). Fourth, lncRNAs can function as competing endogenous RNAs (ceRNAs) to target mRNAs by sponging miRNAs. It may be a potential direction in the future to explore a signature established by lncRNA-miRNA-mRNA for prognosis prediction and reveal possible ceRNA mechanism axes for LGG as reported in other cancers [49]. Fifth, proteomics analysis also should be conducted in LGG and normal brain tissues in order to identify a protein alone or lncRNA-protein signature for prognosis prediction [50].

5. Conclusion

Our present study provided a new prognostic biomarker for LGG based on energy metabolism mechanisms. This prognostic signature consisted of five lncRNAs (GABPB1-AS1, HAR1A, LINC00599, SNAI3-AS1, and SNHG1) and four mRNAs (FABP6, MBOAT7, SLC25A1, and UST), which could classify patients into high- and low-risk subgroups exhibiting significantly different OS. Furthermore, this risk score was also combined with clinical characteristics (age, IDH1 mutation) to establish a prognostic nomogram, which may be more applicable for clinical use.
  50 in total

1.  limma powers differential expression analyses for RNA-sequencing and microarray studies.

Authors:  Matthew E Ritchie; Belinda Phipson; Di Wu; Yifang Hu; Charity W Law; Wei Shi; Gordon K Smyth
Journal:  Nucleic Acids Res       Date:  2015-01-20       Impact factor: 16.971

2.  MiR-661 contributed to cell proliferation of human ovarian cancer cells by repressing INPP5J expression.

Authors:  Tongyu Zhu; Jing Yuan; Yuzhi Wang; Cuiping Gong; Yudou Xie; Hong Li
Journal:  Biomed Pharmacother       Date:  2015-08-15       Impact factor: 6.529

3.  Evolutionary game theory elucidates the role of glycolysis in glioma progression and invasion.

Authors:  D Basanta; M Simon; H Hatzikirou; A Deutsch
Journal:  Cell Prolif       Date:  2008-12       Impact factor: 6.831

4.  Hexokinase 2 is required for tumor initiation and maintenance and its systemic deletion is therapeutic in mouse models of cancer.

Authors:  Qi Wang; Prashanth T Bhaskar; Krushna C Patra; Luke Miller; Zebin Wang; Will Wheaton; Navdeep Chandel; Markku Laakso; William J Muller; Eric L Allen; Abhishek K Jha; Gromoslaw A Smolen; Michelle F Clasquin; Brooks Robey; Nissim Hay
Journal:  Cancer Cell       Date:  2013-08-01       Impact factor: 31.743

5.  Health-related quality of life in long-term survivors with Grade II gliomas: the contribution of disease recurrence and Karnofsky Performance Status.

Authors:  Yoshiko Okita; Yoshitaka Narita; Ruriko Miyahara; Yasuji Miyakita; Makoto Ohno; Soichiro Shibui
Journal:  Jpn J Clin Oncol       Date:  2015-07-31       Impact factor: 3.019

6.  Expression and prognostic value of mRNAs in lower grade glioma with MGMT promoter methylated.

Authors:  Wen Wang; Junsheng Li; Fa Lin; Jia Guo; Jizong Zhao
Journal:  J Clin Neurosci       Date:  2020-03-28       Impact factor: 1.961

7.  Comprehensive analysis of genes based on chr1p/19q co-deletion reveals a robust 4-gene prognostic signature for lower grade glioma.

Authors:  Chuang Zhang; Ruoxi Yu; Zhi Li; Huicong Song; Dan Zang; Mingming Deng; Yibo Fan; Yunpeng Liu; Ye Zhang; Xiujuan Qu
Journal:  Cancer Manag Res       Date:  2019-05-29       Impact factor: 3.989

8.  Identification of GPC2 as an Oncoprotein and Candidate Immunotherapeutic Target in High-Risk Neuroblastoma.

Authors:  Kristopher R Bosse; Pichai Raman; Zhongyu Zhu; Maria Lane; Daniel Martinez; Sabine Heitzeneder; Komal S Rathi; Nathan M Kendsersky; Michael Randall; Laura Donovan; Sorana Morrissy; Robyn T Sussman; Doncho V Zhelev; Yang Feng; Yanping Wang; Jennifer Hwang; Gonzalo Lopez; Jo Lynne Harenza; Jun S Wei; Bruce Pawel; Tricia Bhatti; Mariarita Santi; Arupa Ganguly; Javed Khan; Marco A Marra; Michael D Taylor; Dimiter S Dimitrov; Crystal L Mackall; John M Maris
Journal:  Cancer Cell       Date:  2017-09-11       Impact factor: 38.585

9.  Exploring the role of paraoxonase-2 in bladder cancer: analyses performed on tissue samples, urines and cell cultures.

Authors:  Tiziana Bacchetti; Davide Sartini; Valentina Pozzi; Tiziana Cacciamani; Gianna Ferretti; Monica Emanuelli
Journal:  Oncotarget       Date:  2017-04-25

10.  lncRNAs PVT1 and HAR1A are prognosis biomarkers and indicate therapy outcome for diffuse glioma patients.

Authors:  Hecun Zou; Lan-Xiang Wu; Yonglong Yang; Shuang Li; Ying Mei; Yong-Bin Liu; Lihua Zhang; Yu Cheng; Hong-Hao Zhou
Journal:  Oncotarget       Date:  2017-08-12
View more
  2 in total

1.  An innovative prognostic model based on autophagy-related long noncoding RNA signature for low-grade glioma.

Authors:  Aierpati Maimaiti; Mirezhati Tuerhong; Yongxin Wang; Maimaitili Aisha; Lei Jiang; Xixian Wang; Yusufu Mahemuti; Yirizhati Aili; Zhaohai Feng; Maimaitijiang Kasimu
Journal:  Mol Cell Biochem       Date:  2022-02-13       Impact factor: 3.396

2.  A Prognostic Model of Bladder Cancer Based on Metabolism-Related Long Non-Coding RNAs.

Authors:  Jintao Hu; Cong Lai; Zefeng Shen; Hao Yu; Junyi Lin; Weibin Xie; Huabin Su; Jianqiu Kong; Jinli Han
Journal:  Front Oncol       Date:  2022-02-25       Impact factor: 6.244

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.