Identification for Exploring Underlying Pathogenesis and Therapy Strategy of Oral Squamous Cell Carcinoma by Bioinformatics Analysis.

Zheng Xu, Pan Jiang, Shengteng He

Abstract

BACKGROUND: Oral squamous cell carcinoma (OSCC), one of the most common cavity-associated cancers, has a high incidence and mortality worldwide. However, the cause and underlying molecular mechanisms of OSCC remain unclear.

MATERIAL AND METHODS: Three microarray datasets (GSE23558, GSE34105, and GSE74530) were downloaded from the Gene Expression Omnibus (GEO) database and integrated to identify differentially expressed genes (DEGs). We performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses to elucidate the biological roles of the DEGs. Protein-protein interaction (PPI) networks were constructed to identify hub genes. To validate the gene markers for OSCC, TCGA OSCC data were also assessed.

RESULTS: In total, 651 DEGs (288 upregulated and 363 downregulated) were identified, and principal component analysis (PCA) showed that they completely distinguished OSCC from normal control tissues. GO analysis indicated that the DEGs were enriched in chemokine activity in the biological process group, in growth factor activity and oxidoreductase activity in the molecular function group, and in extracellular exosome in the cellular component group. KEGG pathway analysis indicated that the DEGs mainly participated in the cytokine-cytokine receptor interaction, metabolism of xenobiotics by cytochrome P450, and glutathione metabolism pathways. The co-expression network identified core genes from the PPI network. Additionally, Kaplan-Meier survival analysis showed that CSF2 and EGF were significantly correlated with the overall survival of OSCC patients.

CONCLUSIONS: Our integrated bioinformatics analysis might provide valuable information for exploring potential new molecular biomarkers and therapeutic targets for OSCC.

Year:  2019        PMID: 31794546      PMCID: PMC6909914          DOI: 10.12659/MSM.917736

Source DB:  PubMed          Journal:  Med Sci Monit        ISSN: 1234-1010


Background

Oral cancers, including malignancies of the oral cavity and oropharynx, are the ninth most common malignancy globally, with an incidence of >300 000 cases each year [1]. More than 90% of these oral cancers are oral squamous cell carcinomas (OSCC) [2]. It is estimated that 75% of oral cancers are related to lifestyle choices such as tobacco and alcohol abuse [1,3,4]. Although a range of diagnostic and therapeutic strategies have been used, there are still more than 145 000 oral cancer deaths globally each year [5]. Most published studies report that the 5-year survival rate of OSCC is still around 50% [6-8].

With the development of molecular biological and pathological techniques, a variety of tumor markers have come to be regarded as powerful diagnostic and predictive tools, especially for therapies targeting OSCC. Gene expression microarrays, an efficient and large-scale technique for obtaining genetic data, have been widely used to collect and study gene expression profiles in many human tumors. These microarray studies offer a new approach for identifying cancer-related genes and provide promising prospects for molecular prediction and therapeutic targets [9]. However, the false-positive rate of any single microarray analysis makes it difficult to obtain reliable results. Therefore, we downloaded 3 original microarray datasets, GSE23558, GSE34105, and GSE74530, from the NCBI Gene Expression Omnibus (GEO) database, which together contained 122 samples (95 tumor and 27 normal samples). DEG screening, functional and pathway enrichment analyses, and protein-protein interaction (PPI) network analyses were performed. We also validated the findings with The Cancer Genome Atlas (TCGA) OSCC datasets. Our study should yield valuable genes for investigating the mechanisms underlying OSCC pathogenesis and provide candidate genes for identifying diagnostic and therapeutic OSCC targets.

Material and Methods

Microarray data

OSCC datasets in this study were downloaded from the GEO database. The DEGs were derived from 3 independent OSCC datasets, GSE23558, GSE34105, and GSE74530, comprising 95 OSCC and 27 normal samples (122 samples in total). The GSE23558 microarray data were based on the GPL6480 platform (Agilent-014850 Whole Human Genome Microarray 4×44K G4112F) and included 27 OSCC and 5 normal control samples. The GSE34105 dataset was based on the GPL14951 platform (Illumina HumanHT-12 WG-DASL V4.0 R2 expression beadchip) and comprised 62 OSCC and 16 normal control samples. The platform for GSE74530 was GPL570 ([HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array), and the dataset consisted of 6 OSCC and 6 adjacent normal samples. To validate the gene markers as a specific signature for OSCC, the mRNA expression data and clinical data of OSCC (403 samples) were also downloaded from the TCGA database. Gene symbols were annotated based on the Homo_sapiens.GRCh38.91.chr.gtf file. Log2 transformations were performed for all gene expression data. The average mRNA expression value was used when duplicate data were found.
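
The last two preprocessing steps (log2 transformation and averaging of duplicate entries) can be illustrated with a minimal Python sketch. The function name, the input layout, and the toy intensities are hypothetical, and the choice to average after the log2 transformation is an assumption, since the order of the two steps is not stated.

```python
import math
from collections import defaultdict

def collapse_duplicates(rows):
    """rows: iterable of (gene_symbol, [raw intensity per sample]).

    Returns {gene_symbol: [mean log2 intensity per sample]}.
    Assumption: values are log2-transformed first, then averaged.
    """
    grouped = defaultdict(list)
    for symbol, values in rows:
        grouped[symbol].append([math.log2(v) for v in values])
    collapsed = {}
    for symbol, profiles in grouped.items():
        n = len(profiles)
        # zip(*profiles) walks the samples column by column
        collapsed[symbol] = [sum(col) / n for col in zip(*profiles)]
    return collapsed

# Toy input with made-up intensities: two entries map to the same symbol.
example = collapse_duplicates([
    ("geneA", [8.0, 16.0]),
    ("geneA", [32.0, 64.0]),
    ("geneB", [4.0, 2.0]),
])
```

In this toy input the two `geneA` rows are merged into a single averaged profile, while `geneB` passes through with only the log2 transformation applied.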

Screening of DEGs

The affy package in R was used to normalize the raw probe-level data and convert them to expression profiles [10,11]. The limma package in R was used to identify DEGs between OSCC and normal control samples [12]. The P-value was adjusted by the Benjamini-Hochberg method [13]. An adjusted P-value <0.05 and |log2 fold change (FC)| >1 were used as the thresholds for DEG identification. The volcano plot was drawn with the ggplot2 package, and the heat map was constructed with the pheatmap package in R [14].
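
The thresholding step described above can be sketched in a few lines of Python. This is only an illustration of the Benjamini-Hochberg adjustment and the two cutoffs, not the limma workflow itself: the moderated statistics that limma computes are not reproduced, and the function names, gene labels, and input values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_degs(genes, log2_fc, pvalues, alpha=0.05, lfc_cutoff=1.0):
    """Keep genes with adjusted P < alpha and |log2 FC| > lfc_cutoff."""
    adjusted = benjamini_hochberg(pvalues)
    return [
        (gene, "up" if fc > 0 else "down")
        for gene, fc, q in zip(genes, log2_fc, adjusted)
        if q < alpha and abs(fc) > lfc_cutoff
    ]

# Made-up inputs for illustration only.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
toy_degs = select_degs(
    ["geneA", "geneB", "geneC", "geneD"],
    [2.0, -1.5, 0.5, -3.0],
    [0.01, 0.04, 0.03, 0.005],
)
```

Here `geneC` is dropped because its absolute log2 fold change does not exceed 1, even though its adjusted P-value is below 0.05.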

Principal component analysis (PCA) of DEGs

Principal component analysis (PCA), a multivariate dimension-reduction technique, was used to confirm that the DEGs discriminate between tumor samples and normal control samples [15,16]. A PCA graph was drawn using the pca3d package in R. In the resulting 3-dimensional graph, the DEGs were treated as variables and the difference between tumor and normal control samples was observed.
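
To make the idea concrete, the following Python sketch projects samples onto their first principal component only. It is a simplified stand-in for the 3-dimensional pca3d plot, using plain power iteration on mean-centered data; the function name and the toy expression values are hypothetical.

```python
import math

def first_principal_component(samples, iterations=200):
    """samples: list of equal-length expression vectors, one per sample.

    Returns (unit-length loading vector, per-sample scores) for the first
    principal component, estimated by power iteration on centered data.
    """
    n, p = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(p)]
    centered = [[s[j] - means[j] for j in range(p)] for s in samples]

    def project(axis):
        return [sum(x * a for x, a in zip(row, axis)) for row in centered]

    # Non-uniform start vector, to reduce the chance of starting orthogonal
    # to the leading component.
    start_norm = math.sqrt(sum((j + 1) ** 2 for j in range(p)))
    axis = [(j + 1) / start_norm for j in range(p)]
    for _ in range(iterations):
        scores = project(axis)
        # Multiply by X^T X without forming the covariance matrix explicitly.
        new_axis = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in new_axis))
        if norm == 0.0:
            break
        axis = [v / norm for v in new_axis]
    return axis, project(axis)

# Made-up expression values: three "tumor" and three "normal" samples.
tumor = [[5.0, 1.0, 5.0], [6.0, 0.0, 6.0], [5.5, 0.5, 5.0]]
normal = [[1.0, 5.0, 1.0], [0.0, 6.0, 0.0], [0.5, 5.5, 1.0]]
axis, scores = first_principal_component(tumor + normal)
```

When the two groups differ strongly across the selected genes, as in this toy input, their scores fall on opposite sides of zero along the first component, which is the kind of separation the PCA plot is meant to reveal.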

Gene Ontology (GO) and pathway enrichment analyses of DEGs

Gene Ontology (GO) analysis is a common approach for annotating genes and determining their biological characteristics, including cellular component (CC), molecular function (MF), and biological process (BP) [17]. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database is used to classify gene sets into their respective pathways [18]. The Database for Annotation, Visualization and Integrated Discovery (DAVID) is a gene functional classification tool that provides a comprehensive set of functional annotation tools for understanding the biological meaning behind large lists of genes [19,20]. P<0.05 was considered statistically significant.
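
The statistical idea behind term enrichment can be sketched with a plain hypergeometric upper-tail test in Python. This is an assumption-laden illustration: DAVID reports a modified Fisher exact P-value (the EASE score), which is more conservative than the unmodified test shown here, and the function name and example counts are hypothetical.

```python
from math import comb

def enrichment_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background
    K: background genes annotated with the term
    n: genes in the submitted list (for example, the DEGs)
    k: submitted genes annotated with the term
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Made-up counts: 10 background genes, 4 annotated, 3 submitted, 2 overlapping.
toy_p = enrichment_pvalue(k=2, n=3, K=4, N=10)
```

A term would then be reported as enriched when this probability falls below the chosen cutoff, which is P<0.05 in the analysis described above.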

Protein-protein interaction (PPI) network analysis

The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database provides comprehensive coverage of, and access to, experimental and predicted information on interactions between known and predicted proteins [21]. In our study, only experimentally validated interactions with a combined score >0.7 were considered significant, and the maximum number of interactors was set to 0 as the cutoff criterion. Hub proteins were identified as those interacting with the most partners, using R (version 3.5.1).
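
Hub identification by degree can be sketched as follows. The code below is a Python illustration rather than the R code used in the study; the edge-list layout `(protein_a, protein_b, combined_score)`, the function name, and the toy scores are all hypothetical.

```python
from collections import Counter

def hub_genes(edges, min_score=0.7, top=10):
    """Rank proteins by degree (number of distinct interaction partners).

    edges: iterable of (protein_a, protein_b, combined_score).
    Only edges with combined_score > min_score are counted, and each
    unordered pair is counted once.
    """
    degree = Counter()
    seen = set()
    for a, b, score in edges:
        pair = frozenset((a, b))
        if a == b or score <= min_score or pair in seen:
            continue
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    # Highest degree first; ties broken alphabetically for reproducibility.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top]

# Made-up network for illustration only.
toy_edges = [
    ("P1", "P2", 0.90),
    ("P1", "P3", 0.80),
    ("P1", "P4", 0.95),
    ("P2", "P3", 0.75),
    ("P2", "P1", 0.90),  # duplicate of the first edge, ignored
    ("P3", "P4", 0.40),  # below the score cutoff, ignored
]
toy_hubs = hub_genes(toy_edges)
```

In the toy network `P1` ranks first with three partners, which mirrors how the genes with the highest degree are treated as hub genes.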

Analysis of the hub genes with TCGA data

We also attempted to validate the findings with the TCGA OSCC datasets. A total of 403 samples were included, comprising 371 OSCC and 32 adjacent nontumor tissues. The expression level of each hub gene was extracted from all included data for further analysis. Student’s t-test was used to compare the hub genes between OSCC and adjacent nontumor tissues, as described previously [22]. Survival information was also obtained from TCGA. Survival analysis was carried out using the Kaplan-Meier method, and the log-rank test was performed to evaluate the statistical significance of the differences.
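
For readers unfamiliar with the Kaplan-Meier method, the product-limit estimator can be sketched in plain Python. The sketch covers only the survival curve for a single group; the log-rank comparison between groups is not implemented, and the function name and follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate for one group.

    times:  follow-up duration for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns [(time, survival probability)] at each time with a death.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve

# Made-up follow-up data: five patients, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

In a hub-gene analysis, one such curve would be computed for each expression group (for example, high versus low expression of a gene), and the two curves would then be compared with the log-rank test.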

Results

Identification of DEGs

The gene expression profiles of GSE23558, GSE34105, and GSE74530 were obtained from GEO. When the 3 datasets were screened with the limma package, a total of 651 genes were found to be differentially expressed in OSCC compared with normal controls. Among these DEGs, 288 genes were upregulated and 363 were downregulated. The cluster heatmap of the top 200 DEGs is shown in Figure 1. The volcano plot is shown in Figure 2.
Figure 1

Heatmap of DEGs between the OSCC and control groups. DEGs – differentially expressed genes; OSCC – oral squamous cell carcinoma.

Figure 2

Volcano plots of DEGs between the OSCC and control groups. DEGs – differentially expressed genes; OSCC – oral squamous cell carcinoma.

PCA of DEGs

To examine the overall gene expression patterns of OSCC and normal tissues, we performed PCA. As presented in Figure 3, OSCC and normal tissues were completely separated by the DEGs, indicating that the DEG expression patterns were specific and could be used to distinguish OSCC from normal tissues.
Figure 3

PCA of DEGs between the OSCC and control groups. PCA – principal component analysis; DEGs – differentially expressed genes; OSCC – oral squamous cell carcinoma.

GO and KEGG pathway analyses of DEGs

To gain further insight into gene function, we performed functional enrichment analysis of the DEGs with the DAVID online analysis tool, and GO terms enriched among the upregulated and downregulated genes at P<0.05 were selected. The GO analysis classified the DEGs into 3 functional groups: CC, MF, and BP (Figure 4). In the BP group, the DEGs were primarily enriched in chemokine activity, monocyte differentiation, signal transduction, and extracellular matrix disassembly. In the MF group, the enriched GO terms were primarily oxidoreductase activity, monooxygenase activity, positive regulation of cell proliferation, and response to lipopolysaccharide. In the CC group, the DEGs were enriched in extracellular exosome, external side of plasma membrane, extracellular region, and inflammatory response.
Figure 4

GO term analysis of DEGs. (A) GO analysis divided DEGs into 3 functional groups: molecular function, biological process, and cellular component. (B) Significantly enriched GO terms of DEGs in the different functional groups. GO – Gene Ontology; DEGs – differentially expressed genes.

As shown in Figure 5, KEGG pathway analysis identified cytokine-cytokine receptor interaction, metabolism of xenobiotics by cytochrome P450, and glutathione metabolism as the most significantly enriched pathways.
Figure 5

KEGG pathway enrichment analysis of DEGs. KEGG – Kyoto Encyclopedia of Genes and Genomes; DEGs – differentially expressed genes.

PPI network construction and modules selection

We submitted the expression products of 288 DEGs to the STRING database to construct PPI networks. After removing isolated nodes, a total of 774 protein interactions with a combined score >0.7 were retained, and the PPI network was constructed for additional analysis (Figure 6). Genes with higher degrees of interaction were considered hub genes. The top 30 hub genes in the PPI network are listed in Figure 7. The 10 genes with the highest degrees in this PPI network were UBC (degree=33), CSF2 (degree=24), PSMB8 (degree=24), HLA-A (degree=23), OAS2 (degree=21), HLA-C (degree=20), HLA-E (degree=20), ISG15 (degree=19), OAS3 (degree=19), and EGF (degree=18).
Figure 6

PPI network. PPI – protein-protein interaction.

Figure 7

Potential hub genes with higher degrees of interaction.

Validation of the hub gene expression with TCGA data

The expression of the hub genes in the 371 OSCC and 32 adjacent non-tumor tissues was also investigated with TCGA OSCC data. According to the univariate and multivariate Cox regression analyses, age and stage could be used as predictors of poorer survival in OSCC patients (Figure 8). Nine of the hub genes could be retrieved from the TCGA OSCC data: UBC, CSF2, PSMB8, OAS2, HLA-C, HLA-E, ISG15, OAS3, and EGF. In addition, the expression levels of CSF2, PSMB8, OAS2, HLA-C, HLA-E, ISG15, OAS3, and EGF differed significantly between OSCC and adjacent non-tumor tissues (Figure 9). Furthermore, we examined the Kaplan-Meier curves of the 9 hub genes (Figure 10) and found that CSF2 and EGF were significantly associated with patients’ prognoses. The results indicated that CSF2 and EGF can be used to differentiate between OSCC and adjacent non-tumor tissues.
Figure 8

(A, B) Univariate and multivariate Cox regression analyses of the correlation between clinical characteristics and overall survival.

Figure 9

Differential expression of the hub genes between the OSCC and adjacent non-tumor tissues. (A) UBC; (B) CSF2; (C) PSMB8; (D) OAS2; (E) HLA-C; (F) HLA-E; (G) ISG15; (H) OAS3; (I) EGF. OSCC – oral squamous cell carcinoma.

Figure 10

Kaplan-Meier curves for the hub genes in TCGA OSCC cohort. (A) UBC; (B) CSF2; (C) PSMB8; (D) OAS2; (E) HLA-C; (F) HLA-E; (G) ISG15; (H) OAS3; (I) EGF. OSCC – oral squamous cell carcinoma.

Discussion

OSCC is one of the most common types of human cancer worldwide, and the overall 5-year survival is about 50% [7,23]. However, the molecular mechanisms underlying OSCC pathogenesis and progression are largely unknown. Recently, microarray and high-throughput sequencing technologies have been rapidly and widely adopted to investigate genetic alterations during disease progression. These technologies appear to reveal promising targets for the early detection, diagnosis, and treatment of tumors. In our study, based on 3 microarray datasets and bioinformatics analysis, we identified 651 DEGs, including 288 upregulated and 363 downregulated genes, in OSCC tissues compared with non-tumor tissues. Using PCA, we found that OSCC tissues were clearly separated from non-tumor tissues, indicating that the screened DEGs were specific and could be used to identify relevant genes implicated in OSCC development. GO analysis showed that the DEGs were enriched in extracellular exosome, external side of plasma membrane, extracellular region, and inflammatory response. Exosomes are delivered into recipient cells by mechanisms such as cell fusion, receptor-mediated uptake, and internalization [24]. Therefore, exosomes may be involved in the physiological and pathological development of OSCC through regulation of cell-cell communication [25]. Growing evidence indicates that many tumor cells produce exosomes and that exosomes promote OSCC growth and progression, so they are emerging as potential tools for early OSCC therapy or control [26,27]. KEGG pathway annotation suggested that the DEGs were mainly involved in several pathways, including cytokine-cytokine receptor interaction, metabolism of xenobiotics by cytochrome P450, and glutathione metabolism. Cytokine and cytokine receptor interaction networks are regarded as crucial aspects of inflammation and tumor immunology [28].
In the PPI network, UBC, CSF2, PSMB8, HLA-A, OAS2, HLA-C, HLA-E, ISG15, OAS3, and EGF had the highest degrees of connectivity. The expression levels of CSF2, PSMB8, OAS2, HLA-C, HLA-E, ISG15, OAS3, and EGF differed significantly between OSCC and adjacent non-tumor tissues in the TCGA data. Survival analysis of the hub genes then revealed that CSF2 and EGF were significantly correlated with patients’ overall survival. These results suggest that CSF2 and EGF may play crucial roles in the progression of OSCC. In our study, CSF2 can be considered a hub gene in the OSCC PPI network. CSF-2 (GM-CSF) is now best viewed as a major regulator governing the functions of granulocyte and macrophage lineage populations at all stages of maturation. Several studies reported that the tumor-derived hematopoietic growth factor CSF-2 was significantly increased in OSCC tumors compared with controls [29-31]. Another hub gene, epidermal growth factor (EGF), encodes a key growth factor that initiates a series of biochemical events resulting in increased cell growth [32]. Increasing evidence shows that EGF is closely associated with migration, proliferation, and tumor apoptosis [33-35]. Moreover, several studies found that EGF levels in OSCC patients differed significantly from those in healthy controls [36-38]. EGF has also been correlated with a worse clinical outcome in OSCC [39]. Thus, CSF2 and EGF may contribute to the progression of OSCC. Despite these results, our study has some limitations. First, we performed a bioinformatics analysis using 3 different microarray datasets and platforms. Although we applied background correction and quantile normalization to the 3 microarray datasets, some residual variance may remain. Second, OSCC is a highly heterogeneous tumor. For example, there are many differences between oral cavity and oropharynx squamous cell carcinoma.
The base of the tongue should be considered separately from the oral cavity because of HPV infections. Third, most of the normal control tissues were adjacent to the tumors and may therefore involve different grades of dysplasia because of the field cancerization phenomenon. Fourth, although we identified some enriched pathways and hub genes in this study, the relationships between the enriched pathways and the hub genes, and the hierarchical processes within them, were not fully elucidated. Further studies are warranted to uncover the underlying molecular mechanisms of OSCC.

Conclusions

In conclusion, we integrated 3 microarray datasets and identified functional pathways and hub genes using bioinformatics methods. This study may provide useful evidence for future investigation into the mechanisms of OSCC and the selection of biomarkers for it. Moreover, a series of verification experiments should be performed to confirm the functions of the identified genes in OSCC.
  39 in total

1.  affy--analysis of Affymetrix GeneChip data at the probe level.

Authors:  Laurent Gautier; Leslie Cope; Benjamin M Bolstad; Rafael A Irizarry
Journal:  Bioinformatics       Date:  2004-02-12       Impact factor: 6.937

2.  Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Authors:  Da Wei Huang; Brad T Sherman; Richard A Lempicki
Journal:  Nat Protoc       Date:  2009       Impact factor: 13.491

3.  The prognostic value of immune checkpoints in oral squamous cell carcinoma.

Authors:  Meri Sieviläinen; Rabeia Almahmoudi; Ahmed Al-Samadi; Tuula Salo; Matti Pirinen; Alhadi Almangush
Journal:  Oral Dis       Date:  2018-11-12       Impact factor: 3.511

4.  Exosomes containing miR-21 transfer the characteristic of cisplatin resistance by targeting PTEN and PDCD4 in oral squamous cell carcinoma.

Authors:  Tao Liu; Gang Chen; Dawei Sun; Minghui Lei; Yongqiang Li; Changming Zhou; Xiaodong Li; Wei Xue; Hong Wang; Chunjun Liu; Jiang Xu
Journal:  Acta Biochim Biophys Sin (Shanghai)       Date:  2017-09-01       Impact factor: 3.848

5.  Clinical significance of EGFR, Her-2 and EGF in oral squamous cell carcinoma: a case control study.

Authors:  Vanessa F Bernardes; Frederico O Gleber-Netto; Sílvia F Sousa; Tarcília A Silva; Maria Cássia F Aguiar
Journal:  J Exp Clin Cancer Res       Date:  2010-04-29

6.  Gene expression profiling of primary cutaneous melanoma and clinical outcome.

Authors:  Véronique Winnepenninckx; Vladimir Lazar; Stefan Michiels; Philippe Dessen; Marguerite Stas; Soledad R Alonso; Marie-Françoise Avril; Pablo L Ortiz Romero; Thomas Robert; Ovidiu Balacescu; Alexander M M Eggermont; Gilbert Lenoir; Alain Sarasin; Thomas Tursz; Joost J van den Oord; Alain Spatz
Journal:  J Natl Cancer Inst       Date:  2006-04-05       Impact factor: 13.506

7.  EGF in saliva and tumor samples of oral squamous cell carcinoma.

Authors:  Vanessa Fátima Bernardes; Frederico Omar Gleber-Netto; Sílvia Ferreira Sousa; Tarcília Aparecida Silva; Mauro Henrique Nogueira Guimarães Abreu; Maria Cássia Ferreira Aguiar
Journal:  Appl Immunohistochem Mol Morphol       Date:  2011-12

8.  Serum cytokine levels in patients with oral mucous membrane disorders.

Authors:  T Yamamoto; K Yoneda; E Ueta; J Hirota; T Osaki
Journal:  J Oral Pathol Med       Date:  1991-07       Impact factor: 4.253

9.  RNA-seq analyses of multiple meristems of soybean: novel and alternative transcripts, evolutionary and functional implications.

Authors:  Lei Wang; Chenlong Cao; Qibin Ma; Qiaoying Zeng; Haifeng Wang; Zhihao Cheng; Genfeng Zhu; Ji Qi; Hong Ma; Hai Nian; Yingxiang Wang
Journal:  BMC Plant Biol       Date:  2014-06-17       Impact factor: 4.215

10.  Application of a Persistent Heparin Treatment Inhibits the Malignant Potential of Oral Squamous Carcinoma Cells Induced by Tumor Cell-Derived Exosomes.

Authors:  Shinya Sento; Eri Sasabe; Tetsuya Yamamoto
Journal:  PLoS One       Date:  2016-02-05       Impact factor: 3.240
