Literature DB >> 33854716

Mining TCGA database for prognostic genes in head and neck squamous cell carcinoma microenvironment.

Qiu-Chi Ran1,2, Sheng-Rong Long3, Yan Ye4, Chen Xie1, Zhuo-Lin XuXiao1, Yu-Song Liu1, Hong-Xia Pang5, Diwas Sunchuri1, Nai-Chia Teng6,7, Zhu-Ling Guo1,5.   

Abstract

BACKGROUND/
PURPOSE: Head and neck squamous cell carcinoma (HNSCC) is one of the most common malignant tumors. The aim of this study was to elucidate the effect of tumor microenvironment-related genes on the prognosis of HNSCC and to obtain tumor microenvironment-related genes that can predict poor prognosis in HNSCC patients.
MATERIALS AND METHODS: The ESTIMATE algorithm was applied to the HNSCC transcriptomic data downloaded from the TCGA (The cancer genome atlas), and then the samples were divided into two groups: high and low immune scoring groups, and high and low basal scoring groups to screen for differentially expressed genes (DEGs) associated with poor patient outcomes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis was performed to explore the potential functions of DEGs, and then to explore the potential prognostic value of individual DEGs. The results of survival analysis between DEGs and overall survival (OS) to explore tumor microenvironment-related genes relevant to the prognosis of HNSCC patients.
RESULTS: Fifty-nine tumor microenvironment-related genes were screened for association of OS with HNSCC (P < 0.05). The GO and KEGG enrichment analysis showed that the selected DEGs may mediate immune response, extracellular matrix, and immunoglobulin binding via neutrophil activation in HNSCC. Six of these DEGs, GIMAP6, SELL, TIFAB, KCNA3, P2RY8 and CCR4 were most significantly associated with OS (P < 0.001).
CONCLUSION: We identified six tumor microenvironment-related genes that were significantly associated with poor prognosis in HNSCC. These genes may inspire researchers to discover new targets and approaches for HNSCC treatment.
© 2020 Association for Dental Sciences of the Republic of China. Publishing services by Elsevier B.V.

Entities:  

Keywords:  HNSCC; Immune scores; TCGA; Tumor microenvironment

Year:  2020        PMID: 33854716      PMCID: PMC8025193          DOI: 10.1016/j.jds.2020.09.017

Source DB:  PubMed          Journal:  J Dent Sci        ISSN: 1991-7902            Impact factor:   2.080


Introduction

The incidence of malignant tumors of the head and neck ranks sixth in the world, among which laryngeal cancer, tongue cancer, hypopharyngeal cancer and nasopharyngeal cancer are the most common, and squamous cell carcinoma accounts for 90%–95% of the pathological types, which is called Head and neck squamous cell carcinoma (HNSCC). Despite recent advances in the surgical and comprehensive treatment of HNSCC, malignant pathological features such as early lymph node metastasis and aggressive growth are still important factors in its high rate of postoperative recurrence and metastasis, low survival rate, and poor long-term outcome. The use of targeted drugs and immunotherapy is bringing new hope to patients. Tumor microenvironment (TME), that is the cellular environment in which the tumor is located, is heterogeneous. TME is composed of immune cells, endothelial cells, fibroblasts and myeloid cells, as well as inflammatory mediators and extracellular matrix molecules, which is a key factor in tumor progression and high patient mortality. In TME, immune cells and stromal cells are two major non-neoplastic components that are valuable for tumor diagnosis and prognosis assessment, and tumor microenvironment prognosis-related genes may also be the next new therapeutic target. Yoshihara et al. designed an algorithm called ESTIMATE to explore the potential relationship between the degree of infiltration of immune and stromal cells and tumor prognosis. Subsequent studies soon applied the ESTIMATE algorithm to acute leukemia,, glioblastoma, renal clear cell carcinoma,, colon cancer, and breast cancer and showed the effectiveness of this big data-based algorithm. Therefore, this study was based on the ESTATE algorithm to explore the potential relationship between the infiltration of immune cells and stromal cells and the prognosis of HNSCC, and to obtain tumor microenvironment-related genes that can predict poor prognosis in HNSCC patients.

Materials and methods

Data resource

Downloaded from the TCGA data portal (https://portal.gdc.cancer.gov/, accessed August 30, 2019) publicly available HNSCC datasets, including tertiary data on gene expression profiles and survival information. The downloaded RNA expression data were scored by the ESTIMATE algorithm for basal and immune scoring, and cases of lung adenocarcinoma were classified into high and low groups based on the median score. All data involved in this study were downloaded from TCGA, and data collection and application were performed in accordance with TCGA publication guidelines and data access policy without additional approval from the local ethics committee.

Identification of differentially expressed genes (DEGs)

Data analysis was performed using the “limma” package (version 3.8) on R (version 3.6.0). The Wilcoxon rank sum test was used for statistical tests, and the following criteria were used for differentially expressed gene screening: FDR <0.05, |log2FC | > 1.

Heatmaps and clustering analysis

The heat map plot and sample clustering was generated using R “heatmap” package.

Enrichment analysis of DEGs

Functional enrichment analysis of DEGs was performed by R package clusterprofiler to identify gene ontology (GO) categories by their biological processes (BP), molecular functions (MF), or cellular components (CC). The clusterprofiler was also used to perform pathway enrichment analysis with reference from KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. FDR< 0.05 was used as the cut-off.

Overall survival curve

The survival R package was used to analyze the relationship between DEGs expression levels and the overall survival of HNSCC patients.

Results

We downloaded gene expression profiles and clinical information of all 546 HNSCC patients from the TCGA database. Among them, 63 (11.54%) patients were diagnosed with Grade 1, 310 (56.78%) patients were Grade 2, 125 (22.89%) patients were Grade 3, 7 (1.28%) patients were Grade 4, and 41 (7.51%) patients were of unknown grade. Based on ESTIMATE algorithm, immune scores were distributed between −1057.57 and 2785.18, and stromal scores ranged from −1941.57 to 1947.28, respectively (Fig. 1A, B).
Figure 1

Immune scores and stromal scores are associated with HNSCC grades and their overall survival. A) Distribution of immune scores of HNSCC grades. Boxplot shows that there is significant association between HNSCC grades and the level of immune scores (p < 0.05). B) Distribution of stromal scores of HNSCC grades. Box-plot shows that there is not significant association between HNSCC grades and the level of stromal scores (p < 0.05).

Immune scores and stromal scores are associated with HNSCC grades and their overall survival. A) Distribution of immune scores of HNSCC grades. Boxplot shows that there is significant association between HNSCC grades and the level of immune scores (p < 0.05). B) Distribution of stromal scores of HNSCC grades. Box-plot shows that there is not significant association between HNSCC grades and the level of stromal scores (p < 0.05). To find out the potential correlation of overall survival with immune scores and/or stromal scores, we divided the 500 HNSCC cases (only 500 of 502 cancer tissues samples have their grade clinical information) into top and bottom halves (high vs. low score groups) based on their scores. Kaplan–Meier survival curves (Fig. 2A) showed that median overall survival of cases with the low score group of immune scores are shorter than the cases in the high score group (p = 0.107 in log-rank test). Consistently, cases with lower stromal scores also showed shorter median overall survival compared to patients with higher stromal scores (Fig. 2B, p = 0.814 in log-rank test), although it was not statistically significant.
Figure 2

Immune scores and stromal scores are associated with HNSCC overall survival. A) HNSCC cases were divided into two groups based on their immune scores: the top half of cases with higher immune scores and the bottom half of cases with lower immune scores. As shown in the Kaplan–Meier survival curve, median survival of the low score group is shorter than high score group, as indicated by the log-rank test, p value is 0.107. B) HNSCC cases were divided into two groups based on their stromal scores: the top half of 209 cases and the bottom half of 208 cases. The median survival of the low score group is shorter than the high score group, however, it is not statistically different as indicated by the log-rank test p value is 0.814.

Immune scores and stromal scores are associated with HNSCC overall survival. A) HNSCC cases were divided into two groups based on their immune scores: the top half of cases with higher immune scores and the bottom half of cases with lower immune scores. As shown in the Kaplan–Meier survival curve, median survival of the low score group is shorter than high score group, as indicated by the log-rank test, p value is 0.107. B) HNSCC cases were divided into two groups based on their stromal scores: the top half of 209 cases and the bottom half of 208 cases. The median survival of the low score group is shorter than the high score group, however, it is not statistically different as indicated by the log-rank test p value is 0.814.

Comparison of gene expression profile with immune scores and stromal scores in HNSCC

To reveal the correlation of global gene expression profiles with immune scores and/or stromal scores, we compared Affymetrix microarray data of these 500 HNCSS cases obtained in TCGA database. Heatmaps in Fig. 3A, B showed distinct gene expression profiles of cases belong to high vs. low immune scores/stromal scores groups. For comparison based on immune scores, 777 genes were upregulated and 161 genes downregulated in the high score than the low score group (fold change > 1.5, p < 0.05). Similarly, for the high and low groups based on stromal scores, 986 genes were upregulated and 63 genes were downregulated in the high score group (fold change > 1.5, p < 0.05). Moreover, Venn diagrams (Fig. 3C, D) showed that 260 genes were commonly upregulated in the high scores groups, and 15 genes were commonly downregulated. It is worth mentioning that the DEGs extracted from the comparison of high vs. low immune scores groups covered the majority of genes extracted from the comparison based on stromal scores. Thus, we decided to focus on these DEGs for all subsequent analysis in this manuscript. To outline the potential function of the DEGs, we performed functional enrichment analysis of the 260 upregulated genes and 15 downregulated genes in high-immune scores group.
Figure 3

Comparison of gene expression profile with immune scores and stromal scores in HNSCC. Heatmaps were drawn based on the average linkage method and Pearson distance measurement method. Genes with higher expression are shown in red, lower expression are shown in blue, genes with same expression level are in white. A) Heatmap of the DEGs of immune scores of top half (high score) vs. bottom half (low score). p < 0.05, fold change >1.5). B) Heatmap of the DEGs of stromal scores of top half (high score) vs. bottom half (low score). p < 0.05, fold change >1.5). Venn diagrams showing the number of commonly upregulated C) or downregulated D) DEGs in stromal and immune score groups.

Comparison of gene expression profile with immune scores and stromal scores in HNSCC. Heatmaps were drawn based on the average linkage method and Pearson distance measurement method. Genes with higher expression are shown in red, lower expression are shown in blue, genes with same expression level are in white. A) Heatmap of the DEGs of immune scores of top half (high score) vs. bottom half (low score). p < 0.05, fold change >1.5). B) Heatmap of the DEGs of stromal scores of top half (high score) vs. bottom half (low score). p < 0.05, fold change >1.5). Venn diagrams showing the number of commonly upregulated C) or downregulated D) DEGs in stromal and immune score groups. We found 260 genes were commonly upregulated in the high scores groups, and 15 genes were commonly downregulated (Supplementary Table 1). In order to reveal the potential functions of differentially expressed genes, a functional enrichment analysis was performed on genes that were generally up- and down-regulated in immune and matrix scores. Functional enrichment clustering of these genes showed strong association with immune response as well. Top GO terms identified neutrophil activation involved in and mediated immune response, extracellular matrix, and immunoglobulin binding (Fig. 4A). In addition, all the pathways that were yielded from the KEGG identified Cytokine–cytokine receptor interaction, Complement and coagulation cascades and B cell receptor signal pathway (Fig. 4B).
Figure 4

A: Top 10 BP, CC, MF terms and B: top 25 KEGG pathways enrichment analysis was performed by R package clusterprofiler (p. adjusted <0.05).

A: Top 10 BP, CC, MF terms and B: top 25 KEGG pathways enrichment analysis was performed by R package clusterprofiler (p. adjusted <0.05).

Correlation of expression of individual DEGs in overall survival

To explore the potential roles of individual DEGs in overall survival, we generated Kaplan–Meier survival curves from TCGA database. Among the 275 DEGs in the high-immune scores group, a total of 59 DEGs (Supplementary Table 2) were shown to significantly predict poor overall survival in log-rank test (p < 0.05, selected 6 genes which most significant relate to OS (p < 0.001) are shown in Fig. 5).
Figure 5

DEGs extracted from TCGA database with overall survival. Kaplan–Meier survival curves were generated for selected DEGs extracted from the comparison of groups of high (red line) and low (blue line) gene expression. p < 0.001 in Log-rank test.

DEGs extracted from TCGA database with overall survival. Kaplan–Meier survival curves were generated for selected DEGs extracted from the comparison of groups of high (red line) and low (blue line) gene expression. p < 0.001 in Log-rank test.

Discussion

Data mining of TCGA databases has been widely used to predict cancer prognosis, and recent studies have shown that the tumor microenvironment plays a critical role in tumor growth and progression. The immune cells and stromal cells infiltrated in TME are composed of many different cell types. On the one hand, fibroblasts, as the most abundant stromal cells, form a physical barrier to avoid immune recognition and elimination of tumor cells; on the other hand, they promote tumor proliferation and metastasis by modulating the extracellular matrix and secreting relevant cytokines or growth factors., In this work, we attempted to identify tumor microenvironment related genes contributing to HNSCC overall survival in the TCGA database. In particular, by comparing global gene expression in a large number of cases with high vs. low immune scores, we extracted 59 genes involved in extracellular matrix and immune response. By analyzing 275 differentially expressed genes yielded from comparison of high vs. low immune scores (or stromal scores) groups, we found that many of them were involved in tumor microenvironment, as shown by GO term analysis (Fig. 2). This is in accordance with previous reports that the functions of immune cells and ECM molecules are closely interrelated in building tumor microenvironment in HNSCC.16, 17, 18, 19 Finally, we analyzed overall survival analysis of these 275 genes and found that 59 genes were associated with poor outcomes in HNSCC patients. There are 6 genes, GIMAP6, SELL, TIFAB, KCNA3, P2RY8 and CCR4 which most significant relate to OS of HNSCC. This demonstrates the prognostic value of our big data analysis of lung adenocarcinoma microenvironment-associated genes in the TCGA database of HNSCC based on the ESTIMATE algorithm. We are particularly interested in CCR4, GIMAP6 and SELL. CCR4 has been proven that it may have an important role in HNSCC progression, regional lymph node metastasis and recurrence. Significant progress has been made on the correlation of overall survival with gene expression in HNSCCs.20, 21, 22 Many of these experiments were done in tumor cell lines, animal models, or patients' tumor samples. GIMAP6 has also been reported to have a potential role in tumor evolution mechanisms related to inflammation and microenvironment. SELL also has been identified overexpressed in HNSCC that may modulate metastasis and affect survival. However, the complexity of HNSCC and HNSCC microenvironment demands more comprehensive analysis consisting of larger cohorts. The interaction between HNSCC and its tumor microenvironment critically affects tumor evolution, which subsequently impacts tumor subtype classification, recurrence, drug resistance, and the overall prognosis of patients. In summary, these 6 TME-associated DEGs were significantly associated with poor prognosis of HNSCC using HNSCC transcriptome data in this paper. Further study of these TME-associated genes may provide new insights into the potential relationship between TME and HNSCC prognosis, and these genes may also inspire researchers to discover new therapeutic targets and approaches for HNSCC.

Declaration of competing interest

No conflicts of interest relevant to this article were disclosed.
  24 in total

1.  clusterProfiler: an R package for comparing biological themes among gene clusters.

Authors:  Guangchuang Yu; Li-Gen Wang; Yanyan Han; Qing-Yu He
Journal:  OMICS       Date:  2012-03-28

Review 2.  Extracellular matrix degradation and remodeling in development and disease.

Authors:  Pengfei Lu; Ken Takai; Valerie M Weaver; Zena Werb
Journal:  Cold Spring Harb Perspect Biol       Date:  2011-12-01       Impact factor: 10.005

3.  Human Pancreatic Tumor Organoids Reveal Loss of Stem Cell Niche Factor Dependence during Disease Progression.

Authors:  Takashi Seino; Shintaro Kawasaki; Mariko Shimokawa; Hiroki Tamagawa; Kohta Toshimitsu; Masayuki Fujii; Yuki Ohta; Mami Matano; Kosaku Nanki; Kenta Kawasaki; Sirirat Takahashi; Shinya Sugimoto; Eisuke Iwasaki; Junichi Takagi; Takao Itoi; Minoru Kitago; Yuko Kitagawa; Takanori Kanai; Toshiro Sato
Journal:  Cell Stem Cell       Date:  2018-01-11       Impact factor: 24.633

Review 4.  Positive and negative influence of the matrix architecture on antitumor immune surveillance.

Authors:  Elisa Peranzoni; Ana Rivas-Caicedo; Houcine Bougherara; Hélène Salmon; Emmanuel Donnadieu
Journal:  Cell Mol Life Sci       Date:  2013-05-07       Impact factor: 9.261

5.  Identification of prognostic genes in the acute myeloid leukemia immune microenvironment based on TCGA data analysis.

Authors:  Haimeng Yan; Jianwei Qu; Wen Cao; Yang Liu; Gaofeng Zheng; Enfan Zhang; Zhen Cai
Journal:  Cancer Immunol Immunother       Date:  2019-10-24       Impact factor: 6.968

Review 6.  Hallmarks of cancer: the next generation.

Authors:  Douglas Hanahan; Robert A Weinberg
Journal:  Cell       Date:  2011-03-04       Impact factor: 41.582

Review 7.  Turning foes to friends: targeting cancer-associated fibroblasts.

Authors:  Xueman Chen; Erwei Song
Journal:  Nat Rev Drug Discov       Date:  2019-02       Impact factor: 84.694

8.  Retrospective analysis of prognostic factors in 205 patients with laryngeal squamous cell carcinoma who underwent surgical treatment.

Authors:  Si-Yi Zhang; Zhong-Ming Lu; Xiao-Ning Luo; Liang-Si Chen; Ping-Jiang Ge; Xin-Han Song; Shao-Hua Chen; Yi-Long Wu
Journal:  PLoS One       Date:  2013-04-04       Impact factor: 3.240

9.  Prognostic value and immune infiltration of novel signatures in clear cell renal cell carcinoma microenvironment.

Authors:  Wen-Hao Xu; Yue Xu; Jun Wang; Fang-Ning Wan; Hong-Kai Wang; Da-Long Cao; Guo-Hai Shi; Yuan-Yuan Qu; Hai-Liang Zhang; Ding-Wei Ye
Journal:  Aging (Albany NY)       Date:  2019-09-07       Impact factor: 5.682

Review 10.  Head and neck cancer: causes, prevention and treatment.

Authors:  Ana Lívia Silva Galbiatti; João Armando Padovani-Junior; José Victor Maníglia; Cléa Dometilde Soares Rodrigues; Érika Cristina Pavarino; Eny Maria Goloni-Bertollo
Journal:  Braz J Otorhinolaryngol       Date:  2013 Mar-Apr
View more
  1 in total

1.  Feature screening for survival trait with application to TCGA high-dimensional genomic data.

Authors:  Jie-Huei Wang; Cai-Rong Li; Po-Lin Hou
Journal:  PeerJ       Date:  2022-03-10       Impact factor: 2.984

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.