
Identification of purity and prognosis-related gene signature by network analysis and survival analysis in brain lower grade glioma.

Zujian Xiong^1,2,3, Yi Xiong^1,2,3, Hongwei Liu^1,3, Chang Li^1,2,3, Xuejun Li^1,3.

Abstract

The tumour microenvironment of brain lower grade glioma (LGG) consists mainly of non-tumour cells, including stromal cells and immune cells. These non-tumour cells dilute the purity of LGG and play pivotal roles in tumour growth and development, thereby affecting patient prognosis. Tumour purity is also associated with the molecular subtypes of LGG. In this study, we identified the module most relevant to purity by weighted gene co-expression network analysis (WGCNA) and then performed consensus network analysis and survival analysis to filter 61 significant genes related to both purity and prognosis. We then built a simplified model, based on the purity score from the consensus measurement of purity estimation (CPE), that achieved satisfactory predictive performance with random forest regression. HLA-E, MSN, GNG-5, MYL12A, ITGB4, PDPN, AGTRAP, S100A4, PLSCR1 and VAMP5 were selected as the genes most strongly correlated with both purity and prognosis. A risk score model based on these 10 genes could moderately predict patients' overall survival. Each of these 10 genes was positively correlated with immunosuppressive cells such as M2 macrophages but negatively correlated with patient prognosis, which may partially explain the poor prognosis of the low-purity group.


Year: 2020 PMID: 32869484 PMCID: PMC7576230 DOI: 10.1111/jcmm.15805

Source DB: PubMed Journal: J Cell Mol Med ISSN: 1582-1838 Impact factor: 5.310

INTRODUCTION

Glioma is the most common and lethal tumour type in the central nervous system (CNS). Compared with glioblastoma (GBM, WHO grade IV), lower grade gliomas (LGG, WHO grades II and III) grow slowly and are less malignant. Following bioinformatics studies of gliomas, the WHO added molecular markers, such as IDH mutation status and chromosome 1p/19q codeletion (1p/19q codel) status, to the glioma diagnostic guideline to improve the accuracy of diagnosis and treatment. Besides established LGG treatments such as surgery, chemotherapy and radiotherapy, immunotherapy is now a popular approach to treating tumours. However, clinical outcomes in large randomized clinical trials of checkpoint inhibitors have not been satisfactory. The efficacy of immunotherapy is partially influenced by the tumour microenvironment, which consists mainly of immune cells and stromal cells. Thus, considering the non‐tumour components within the tumour is essential for improving the therapeutic effect. Tumour purity is defined as the proportion of tumour cells in tumour tissue. It used to be determined by pathologists through visual evaluation, so the results could be affected by the sensitivity of histopathology or by interobserver bias. We adopted the consensus measurement of purity estimation (CPE), based on RNA‐seq data, to evaluate the non‐tumour proportion within LGG; former studies have shown that CPE estimates purity with good accuracy.

METHODS

Data download and pre‐processing

This study collected 697 samples from LGG patients in the TCGA and CGGA databases. RNA‐seq data and matched clinical data for 516 samples were acquired from the TCGA database via the Xena Browser developed by UCSC. RNA‐seq data and clinical data for 181 samples were downloaded from the CGGA database (http://www.cgga.org.cn). All RNA‐seq data were normalized by the transcripts per kilobase million (TPM) method for further analysis.
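As a rough illustration of the TPM normalization mentioned above, the following sketch converts raw read counts for a single sample into TPM values. The function name, the toy counts and the gene lengths are hypothetical and are not taken from the study; the authors' actual pipeline is not described in this text.

```python
def counts_to_tpm(counts, gene_lengths_bp):
    """Convert per-gene read counts for one sample to TPM (illustrative sketch)."""
    # Reads per kilobase: normalise each gene by its length first.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, gene_lengths_bp)]
    # Scale so that the values of the sample sum to one million.
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]


# Toy data: three hypothetical genes in one hypothetical sample.
toy_counts = [100, 300, 600]
toy_lengths_bp = [1000, 2000, 3000]
tpm = counts_to_tpm(toy_counts, toy_lengths_bp)
```

Because every sample is rescaled to the same total, TPM values can be compared between samples more directly than raw counts.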

Tumour purity estimate

Tumour purity scores were inferred by the CPE method as described previously. For more details, please see the File [Link], [Link].

SCNA and mutation analysis

Somatic copy number alteration (SCNA) data, detected by GISTIC 2.0, were downloaded from GDAC Firehose. MAF files of LGG were downloaded from the TCGA database and analysed with R/maftools.

Differentially expressed genes (DEGs) identification and functional enrichment analysis

Samples were divided into two purity groups by X‐tile based on CPE value. DEGs were identified with R/limma. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed with R/clusterProfiler. Gene set variation analysis (GSVA) was performed with R/GSVA; the reference gene sets were downloaded from MSigDB, and all parameters were set as advised. We used single‐sample gene set enrichment analysis (ssGSEA), implemented in R/GSVA, and CIBERSORT to estimate immune cell fractions.

Survival analysis and model building

Survival‐related analyses were conducted with R/survival and R/survminer. We first performed WGCNA and consensus network analysis with R/WGCNA, using a calculated soft threshold, to find the module most relevant to CPE with 200 permutations; parameters were chosen as indicated in Figure S1D‐S1E. Target genes were then filtered by log‐rank test and Pearson's correlation and further selected by lasso regression and random forest algorithms. We used a risk score to build the model: $\text{Risk score} = \sum_{i=1}^{N} (\text{Exp}_i \times \text{Coe}_i)$, where $N$, $\text{Exp}_i$ and $\text{Coe}_i$ represent the number of signature genes, the expression level of gene $i$ and its coefficient value, respectively. All TCGA patients were randomly divided into a training set (n = 344) and a test set (n = 163) to build the model, and CGGA patients were used as the validation set. A receiver operating characteristic (ROC) curve was generated to assess the accuracy of the model.
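The risk score can be sketched in a few lines, assuming it is the conventional sum over the signature genes of expression level times coefficient, as the definitions of N, Expi and Coei suggest. The coefficients, patient identifiers and expression values below are invented for illustration, and splitting patients at the median score is an assumption; the text does not state which cutoff the authors used.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of expression x coefficient over the signature genes."""
    return sum(expression[gene] * coef for gene, coef in coefficients.items())


# Hypothetical coefficients for three of the signature genes (illustrative only).
toy_coefficients = {"HLA-E": 0.2, "MSN": 0.1, "PDPN": 0.3}

# Hypothetical expression values for three hypothetical patients.
toy_patients = {
    "P1": {"HLA-E": 5.0, "MSN": 4.0, "PDPN": 1.0},
    "P2": {"HLA-E": 7.0, "MSN": 6.0, "PDPN": 3.0},
    "P3": {"HLA-E": 6.0, "MSN": 5.0, "PDPN": 2.0},
}

scores = {pid: risk_score(expr, toy_coefficients) for pid, expr in toy_patients.items()}

# Assumed cutoff: patients strictly above the median score are labelled high risk.
cutoff = median(scores.values())
groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
```

In practice the coefficients would come from the fitted regression model on the training set, and the same coefficients would then be applied unchanged to the test and validation cohorts.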

Statistical analysis

All statistical analyses were performed using R software, version 3.5.3. Continuous variables were compared between groups by Student's t test, one‐way ANOVA with post hoc pairwise Bonferroni tests or the Wilcoxon rank‐sum test. Correlations between continuous variables were evaluated by Spearman's or Pearson's correlation analysis. The workflow is depicted in Figure S1A. For more method details, please see the File [Link], [Link].

RESULTS AND DISCUSSION

By calculating the DEGs between the groups divided by X‐tile based on CPE value, we found that the DEGs of the low‐CPE group were enriched in immunological activity and immune cell‐cell adhesion pathways (Figure 1A‐B). However, KEGG analysis did not reveal any metabolic pathway with statistical significance. Hence, we assume that the metabolism of immune cells in the tumour microenvironment differs little between groups. Meanwhile, the immune infiltration analysis revealed that more innate immune cells and tumour‐killing cells, such as macrophages, cytotoxic T cells and NK cells, accumulated in the low‐purity tumour microenvironment, whereas adaptive immune cells such as T helper cells and B cells were more abundant in high‐purity tumours; overall, however, immune cells were less abundant in the high‐purity group than in the low‐purity group (Figure 1C). This is consistent with previous research showing that low purity is associated with a positive immune status and an intensive immune phenotype. On the other hand, the Kaplan‐Meier survival curve (Figure 1E) showed that low purity implied a significantly poorer prognosis: the high‐purity group had a better prognosis (median survival = 28.9 months) than the low‐purity group (median survival = 25.3 months). LGG tumour purity also correlated with molecular signatures such as IDH mutation, chromosome 1p/19q codeletion, chromosome 7 gain and 10 loss, and the 4 LGG subtypes (classical, proneural, neural and mesenchymal) (Figure 1G). Prognostically favourable alterations such as IDH mutation and chromosome 1p/19q codeletion tended to occur in the high‐purity group, which could partially explain its better prognosis (Figure 1F). Additionally, we observed more classical driver alterations of glioma, such as chromosome 7 gain and 10 loss and mutations of TP53, EGFR and ATRX, in the low‐purity group, which indicates the malignant potential of these tumours (Figure 1D,1H). The histologic and molecular subtypes of LGG had different CPE values (Figure 1I‐J).
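For readers unfamiliar with how a median survival is read off a Kaplan‐Meier curve, the sketch below implements the product‐limit estimator on toy data. It is a minimal illustration, not the authors' R/survival workflow; the function names, follow‐up times and event flags are hypothetical (1 = death observed, 0 = censored).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy follow-up times (months) and event indicators.
toy_times = [5, 8, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(curve)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from simply taking the median of the observed times.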

FIGURE 1

A, GO enrichment results of DEGs of the low‐purity group compared with the high‐purity group. B, GSVA enrichment results of DEGs of the low‐purity group compared with the high‐purity group. C, Immune infiltration evaluated by ssGSEA; the redder the cell, the higher the relative abundance of that immune cell type. D, CNV analysis results of the low‐ and high‐purity groups; CNV scores were higher in the low‐purity group, with the classical driver alterations of chromosome 7 gain and 10 loss. E, Kaplan‐Meier curves of the high‐ and low‐purity groups; low CPE, meaning low purity, indicated a poor prognosis. F, The distribution of CPE values in the LGG molecular subtypes defined by WHO. IDHmut‐codel, IDH mutation with 1p19q codeletion; IDHmut‐non‐codel, IDH mutation without 1p19q codeletion; IDHwt, IDH wild type. G, The relationship among LGG subtypes, specific molecular signatures and the purity score, CPE. H, Somatic variants of LGG in the low‐ and high‐purity groups. I, The distribution of CPE in the 4‐type classification. CL, classical subtype; ME, mesenchymal subtype; NE, neural subtype; PN, proneural subtype. J, Purity distribution in the different histologies of LGG

Eight modules were identified by WGCNA, and all modules passed the preservation verification (Zsummary > 10) (Figure 2A‐B), meaning that all modules were preserved. Among them, the yellow module was the most strongly correlated with CPE value (MS = −0.8) and could distinguish the high‐ and low‐purity groups well (Figure 2E, Figure S1B). We then used consensus network analysis to identify modules specific to the low‐purity group compared with the high‐purity group; 10 modules were identified, 4 of which were non‐preserved (Zsummary < 10) (Figure 2C‐D). We intersected these 4 modules with the yellow module to obtain our 178 target genes. These genes were enriched in immune‐related pathways, which means that immune differences were the major difference between the two groups (Figure S1C).
We then applied Pearson's correlation with CPE (coefficient > median) and the log‐rank test (FDR < 0.05) to the target genes and extracted the 85 genes that passed both tests. For further dimension reduction, we performed lasso regression to select 61 genes, all of which were also identified as risk factors by univariate Cox regression (Figure 2F). Purity prediction based on these 61 genes had power similar to that of ESTIMATE, which is based on the expression profiles of 141 immune genes and 141 stromal genes (Figure 2G), but used fewer genes. To simplify the model and connect purity with prognosis, we used three random forest algorithms and selected the 10 most correlated genes, which ranked in the top 15 in all three algorithms: HLA‐E, MSN, GNG‐5, MYL12A, ITGB4, PDPN, AGTRAP, S100A4, PLSCR1 and VAMP5 (Figure S2G–I; for details of gene function and fold change, please see the File [Link], [Link]). Expression of these 10 genes was higher in the low‐purity group than in the high‐purity group, and higher in the high‐purity group than in normal samples (Figure 2L). Additionally, we found that these 10 genes were positively correlated with PDCD1, with fairly high coefficients, and moderately positively correlated with CTLA4 (Figure S2F). We then calculated each sample's risk score based on these 10 genes and divided the TCGA samples into high‐ and low‐risk score groups, which showed an obvious prognostic difference; the risk score had moderately good prognostic prediction power (Figure 2H‐K). To validate the prediction power of the risk score, we entered it into a multivariate Cox regression together with gender, grade, IDH mutation, 1p19q codeletion and MGMT promoter methylation; the HR of the risk score was 1.03, with an FDR of 0.048. All 10 genes were highly expressed in the high‐risk score group and were associated with the low‐purity group in the TCGA data set and with immune cell fractions. We obtained consistent results in the CGGA dataset (Figure S2A‐D).
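The final selection step, keeping only genes that rank in the top 15 of every importance ranking, amounts to a set intersection. The sketch below shows that logic on toy data; the function name and the gene labels G1 to G5 are hypothetical, and the three lists merely stand in for the importance rankings produced by the three random forest algorithms, which are not reproduced here.

```python
def consensus_top_genes(rankings, top_n=15):
    """Return genes found in the top_n of every ranking, ordered as in the first ranking."""
    top_sets = [set(ranking[:top_n]) for ranking in rankings]
    shared = set.intersection(*top_sets)
    return [gene for gene in rankings[0][:top_n] if gene in shared]


# Toy importance rankings (most important first) from three hypothetical models.
toy_rankings = [
    ["G1", "G2", "G3", "G4", "G5"],
    ["G2", "G1", "G5", "G3", "G4"],
    ["G3", "G2", "G1", "G4", "G5"],
]

# With top_n=3, only G1 and G2 appear in the top three of all rankings.
selected = consensus_top_genes(toy_rankings, top_n=3)
```

Requiring agreement across several rankings trades sensitivity for stability: a gene that one algorithm ranks highly by chance is dropped unless the others agree.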
Since innate immune cells dominated the immune microenvironment of low‐purity LGG, we observed less Treg cell infiltration (Figure 1C). However, the M2 macrophage fraction, which can promote tumour progression and metastasis and suppress the reactivity of anti‐tumour immune cells by expressing high levels of soluble factors such as TGF‐β, correlated clearly and positively with these 10 genes, indicating that high expression of these genes is associated with a high infiltration fraction of M2 macrophages (Figure S2E). The increased infiltration of innate immune suppressors could partially explain both the poor prognosis, since M2 macrophages facilitate tumour growth by inducing anti‐inflammatory responses and multitherapy resistance and by impairing T cell activation and proliferation, and the unsatisfactory outcomes of checkpoint inhibitors targeting PD‐1, PD‐L1 or CTLA‐4, which are usually expressed at the T cell‐tumour cell interaction. This suggests that inhibiting innate immunosuppressive cells such as M2 macrophages deserves attention as a way to improve therapy outcomes. In summary, we identified the 10 most significant genes that contribute to both prognosis and purity. These genes showed consistent correlations with different immune infiltrating cells across databases and could be further explored as potential targets in immune therapy.

FIGURE 2

A, Identification of co‐expression modules in LGG by WGCNA. Red, black, yellow, gold, green, brown, blue and turquoise modules were identified; the grey module contains unassigned genes. B, Module preservation statistics of the 8 TCGA modules and their visualization; all modules were preserved (Zsummary > 10). C, Identification of consensus modules between the low‐ and high‐purity groups; based on the consensus modules, the low and high CPE groups each had their own module organization. D, Module preservation statistics of the consensus network analysis and their visualization; there were 4 non‐preserved modules in the low‐purity group compared with the high‐purity group: pink, magenta, purple and black. E, Correlation between the WGCNA co‐expression modules and clinical traits. The clinical traits included sex, overall survival (OS), death event, grade, CPE and three glioma histological types: oligoastrocytoma, oligodendroglioma and astrocytoma. F, Forest plot of the 61 filtered genes, all of which were identified as risk factors for patient prognosis. G, ROC curve of the prediction efficiency for tumour purity. The constructed model had predictive value equivalent to that of ESTIMATE. H, Kaplan‐Meier curves of the high‐ and low‐risk score groups calculated from the 10 genes in TCGA samples. I, Verification of risk score efficiency for prognosis in CGGA samples. J, ROC curve of risk score prediction power in the TCGA cohort. K, ROC curve of risk score prediction power in the CGGA cohort. L, Expression of the 10 genes in the low‐purity group, the high‐purity group and normal samples

CONFLICT OF INTEREST

The authors state that the research was conducted without any commercial or financial relationship that could be interpreted as a potential conflict of interest.

AUTHOR CONTRIBUTION

Zujian Xiong: Formal analysis (lead); Methodology (lead); Validation (lead); Visualization (lead); Writing‐original draft (lead); Writing‐review & editing (lead). Yi Xiong: Methodology (supporting); Validation (supporting); Visualization (supporting). Hongwei Liu: Methodology (supporting); Validation (supporting). Chang Li: Methodology (supporting); Writing‐review & editing (supporting). Xuejun Li: Conceptualization (lead); Funding acquisition (lead); Project administration (lead); Supervision (lead); Writing‐original draft (supporting); Writing‐review & editing (supporting).

Supplementary files: Figure S1; Figure S2; Table S1; Table S2.

