Jianyong Gao, Gang Tian, Xu Han, Qiang Zhu.
Abstract
Oral squamous cell carcinoma (OSCC) is the sixth most common type of cancer worldwide and has a poor prognosis. The present study aimed to identify gene signatures that could classify OSCC and predict prognosis in different stages. A training data set (GSE41613) and two validation data sets (GSE42743 and GSE26549) were acquired from the online Gene Expression Omnibus database. In the training data set, patients were classified based on the tumor‑node‑metastasis staging system, and subsequently grouped into low stage (L) or high stage (H). Signature genes between the L and H stages were selected by disparity index analysis, and samples were classified by the expression of these signature genes. The established classification was compared with the L and H classification, and fivefold cross validation was used to evaluate its stability. Enrichment analysis of the signature genes was performed with the Database for Annotation, Visualization and Integrated Discovery. The two validation data sets were used to assess the precision of the classification. Survival analysis was conducted following each classification using the 'survival' package in R. A set of 24 signature genes was identified by the classification model at a disparity index (Fi) cut-off of 0.47, and was used to distinguish OSCC samples in the two different stages. Overall survival of patients in the H stage was higher than that of patients in the L stage. Signature genes were primarily enriched in the 'ether lipid metabolism' pathway and in biological processes such as 'positive regulation of adaptive immune response' and 'apoptotic cell clearance'. The results provided a novel 24‑gene set that may be used as biomarkers to predict OSCC prognosis with high accuracy, and that may help to determine an appropriate treatment program for patients with OSCC in addition to the traditional evaluation index.
Year: 2017 PMID: 29257303 PMCID: PMC5783517 DOI: 10.3892/mmr.2017.8256
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Figure 1. Kaplan-Meier curve analysis indicated a significant difference in survival between H (n=41) and L (n=35) stage samples based on tumor-node-metastasis classification. P=2.00×10⁻⁵. H, high; L, low.
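A curve such as the one in Figure 1 is built from the Kaplan-Meier product-limit estimate. The abstract states that the survival analysis was run with the R 'survival' package; the Python below is only a minimal sketch of the estimator itself, and the follow-up times and event flags are hypothetical toy values, not data from GSE41613.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed death.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy follow-up data (months); NOT taken from the study.
high_stage = kaplan_meier([5, 8, 12, 20, 30], [1, 1, 0, 1, 0])
low_stage = kaplan_meier([10, 25, 40, 50, 60], [0, 1, 0, 0, 0])
```

Censored patients lower the number at risk without producing a step in the curve, which is why only times with an observed death appear in the output.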
Figure 2. Optimal threshold selection in the classification model. (A) Overlap scale of classifications using the model with the High and Low classifications. Gene sets with |Fi|>k were selected, and k from 0–1 was set with a step size of 0.01; Fi=0.47 (vertical line) was used as the cut-off value to classify the samples. (B) 5-fold cross validation results for 10 iterations, which are indicated by the different colored lines. Fi, disparity index; k, iteration step.
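The sweep described in the Figure 2 caption (select genes with |Fi|>k for k from 0 to 1 in steps of 0.01, then repeat 5-fold cross validation 10 times) can be sketched as follows. This is an illustration under assumptions: the record does not give the formula for the disparity index or the model evaluated in each fold, so Fi values are treated as given inputs, the gene names and values are hypothetical, and only the fold assignment is shown. The sample count of 76 is the sum of the H (n=41) and L (n=35) groups in Figure 1.

```python
import random


def select_genes(fi_by_gene, k):
    """Genes whose absolute disparity index exceeds the threshold k."""
    return sorted(g for g, fi in fi_by_gene.items() if abs(fi) > k)


def sweep_thresholds(fi_by_gene):
    """Map each k in 0.00, 0.01, ..., 1.00 to the gene set selected at that k."""
    return {i / 100: select_genes(fi_by_gene, i / 100) for i in range(101)}


def five_fold_splits(n_samples, seed=0):
    """Shuffle sample indices and return five (train, test) index pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::5] for i in range(5)]
    return [(sorted(set(indices) - set(fold)), sorted(fold)) for fold in folds]


# Hypothetical Fi values for placeholder genes; NOT values from the study.
toy_fi = {"GENE_A": 0.62, "GENE_B": -0.51, "GENE_C": 0.47, "GENE_D": 0.10}
sweep = sweep_thresholds(toy_fi)
selected_at_cutoff = sweep[0.47]

# Ten repeats of 5-fold cross validation over 76 samples, one seed per repeat.
repeats = [five_fold_splits(76, seed=s) for s in range(10)]
```

Because the selection uses a strict inequality, a gene whose |Fi| equals the cut-off exactly is excluded at that cut-off.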
List of 24 signature genes.
| Gene symbol | Gene name |
|---|---|
| ADA | Adenosine deaminase |
| CC2D2A | Coiled-coil and C2-domain containing 2A |
| C9ORF102 | Chromosome 9 open reading frame 102 |
| PRSS12 | Protease, serine 12 (also known as neurotrypsin and motopsin) |
| TNXB | Tenascin XB |
| SLC11A1 | Solute carrier family 11 member 1 (also known as natural resistance-associated macrophage protein 1) |
| GAPVD1 | GTPase activating protein and VPS9 domains 1 |
| THBS1 | Thrombospondin 1 |
| C19ORF53 | Chromosome 19 open reading frame 53 |
| IGSF10 | Immunoglobulin superfamily, member 10 |
| PLGLB2 | Plasminogen-like B2 |
| ROD1 | ROD1 regulator of differentiation 1 (also known as polypyrimidine tract-binding protein 3) |
| AGPAT2 | 1-acylglycerol-3-phosphate O-acyltransferase 2 (also known as lysophosphatidic acid acyltransferase β) |
| SESN3 | Sestrin 3 |
| CSNK1G1 | Casein kinase 1 γ1 |
| HMGN3 | High-mobility group nucleosomal-binding domain 3 |
| SLC2A3 | Solute carrier family 2 member 3 |
| FAM161A | Family with sequence similarity 161 member A |
| DDX31 | DEAD-box helicase 31 |
| JMJD6 | Jumonji-domain containing 6 (also known as arginine demethylase and lysine hydrolase) |
| PPAP2B | Phosphatidic acid phosphatase type 2B (also known as phospholipid phosphatase 3) |
| YEATS2 | YEATS-domain containing 2 |
| SERTAD4 | SERTA-domain containing 4 |
| NAPEPLD | N-acyl phosphatidylethanolamine phospholipase D |
Figure 3. Clustering analysis of signature gene expression and corresponding survival analyses. (A) Signature genes were used to mark samples in H and L classifications under the score-classification model. (B) Kaplan-Meier curve indicating a significant difference in survival between Cluster 1 and Cluster 2 classified with the boundary of ‘0’. (C) Heat map of signature gene expression in the two clusters of samples. (D) Kaplan-Meier curve indicating a significant difference in survival between Cluster 1 and Cluster 2 that were identified by expression profiling of the signature genes. H, high; L, low.
GO function term and KEGG pathway enrichment analysis of 24 signature genes.
| A, GOTERM_BP | P-value | Genes |
|---|---|---|
| GO:0006909~phagocytosis | 0.0016 | THBS1, SLC11A1 and JMJD6 |
| GO:0006897~endocytosis | 0.0024 | THBS1, SLC11A1 and JMJD6 |
| GO:0010324~membrane invagination | 0.0024 | THBS1, SLC11A1 and JMJD6 |
| GO:0002604~regulation of dendritic cell antigen processing and presentation | 0.0025 | THBS1 and SLC11A1 |
| GO:0002577~regulation of antigen processing and presentation | 0.0025 | THBS1 and SLC11A1 |
| GO:0051240~positive regulation of multicellular organismal process | 0.0033 | THBS1, AGPAT2, ADA and SLC11A1 |
| GO:0001819~positive regulation of cytokine production | 0.0056 | THBS1, AGPAT2 and SLC11A1 |
| GO:0043277~apoptotic cell clearance | 0.0075 | THBS1 and JMJD6 |
| GO:0042110~T cell activation | 0.0107 | ADA, SLC11A1 and JMJD6 |
| GO:0016044~membrane organization | 0.0112 | THBS1, SLC11A1 and JMJD6 |
| GO:0051241~negative regulation of multicellular organismal process | 0.0176 | THBS1, ADA and SLC11A1 |
| GO:0042116~macrophage activation | 0.0187 | SLC11A1 and JMJD6 |
| GO:0001817~regulation of cytokine production | 0.0212 | THBS1, AGPAT2 and SLC11A1 |
| GO:0002285~lymphocyte activation during immune response | 0.0224 | ADA and SLC11A1 |
| GO:0006644~phospholipid metabolic process | 0.0232 | AGPAT2, PPAP2B and NAPEPLD |
| GO:0002685~regulation of leukocyte migration | 0.0249 | THBS1 and ADA |
| GO:0046649~lymphocyte activation | 0.0253 | ADA, SLC11A1 and JMJD6 |
| GO:0019637~organophosphate metabolic process | 0.0256 | AGPAT2, PPAP2B and NAPEPLD |
| GO:0016192~vesicle-mediated transport | 0.0335 | THBS1, SLC11A1 and JMJD6 |
| GO:0048584~positive regulation of response to stimulus | 0.0347 | THBS1, ADA and SLC11A1 |
| GO:0002684~positive regulation of immune system process | 0.0352 | THBS1, ADA and SLC11A1 |
| GO:0045321~leukocyte activation | 0.0363 | ADA, SLC11A1 and JMJD6 |
| GO:0002824~positive regulation of adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains | 0.0371 | ADA and SLC11A1 |
| GO:0001568~blood vessel development | 0.0372 | THBS1, PPAP2B and JMJD6 |
| GO:0002821~positive regulation of adaptive immune response | 0.0383 | ADA and SLC11A1 |
| GO:0001944~vasculature development | 0.0388 | THBS1, PPAP2B and JMJD6 |
| GO:0002366~leukocyte activation during immune response | 0.0443 | ADA and SLC11A1 |
| GO:0002263~cell activation during immune response | 0.0443 | ADA and SLC11A1 |
| GO:0001818~negative regulation of cytokine production | 0.0467 | THBS1 and SLC11A1 |
| GO:0001775~cell activation | 0.0495 | ADA, SLC11A1 and JMJD6 |
| B, GOTERM_CC | | |
| GO:0044463~cell projection part | 0.0263 | PRSS12, CC2D2A and ADA |
| C, KEGG_PATHWAY | | |
| hsa00565: Ether lipid metabolism | 0.0472 | AGPAT2 and PPAP2B |
ADA, adenosine deaminase; AGPAT2, 1-acylglycerol-3-phosphate O-acyltransferase 2; BP, biological process; CC, cellular component; CC2D2A, coiled-coil and C2-domain containing 2A; GO, gene ontology; JMJD6, Jumonji-domain containing 6; KEGG, Kyoto Encyclopedia of Genes and Genomes; NAPEPLD, N-acyl phosphatidylethanolamine phospholipase D; PPAP2B, phosphatidic acid phosphatase type 2B; PRSS12, protease, serine 12; SLC11A1, solute carrier family 11 member 1; THBS1, thrombospondin 1.
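The P-values in the enrichment table express how unlikely it is to see that many signature genes in a term by chance. DAVID reports a modified Fisher's exact P-value (the EASE score); the sketch below is the plain hypergeometric upper tail, shown only to illustrate the idea, and is not DAVID's exact computation. The term size and background size in the example call are hypothetical, since the record does not report them.

```python
from math import comb


def hypergeom_upper_tail(hits, list_size, category_size, background_size):
    """P(X >= hits) when list_size genes are drawn without replacement from
    background_size genes, of which category_size are annotated to the term."""
    total = comb(background_size, list_size)
    max_hits = min(list_size, category_size)
    favourable = sum(
        comb(category_size, x) * comb(background_size - category_size, list_size - x)
        for x in range(hits, max_hits + 1)
    )
    return favourable / total


# Hypothetical example: 3 of 24 signature genes fall in a term with
# 100 annotated genes out of a 20000-gene background (assumed sizes).
p_three_hits = hypergeom_upper_tail(3, 24, 100, 20000)
```

With only 24 genes in the list, two or three hits in a moderately sized term are enough to reach a nominal P<0.05, which is consistent with the small gene counts per term in the table.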
Figure 4. Multivariate survival analysis of 24 signature genes. (A) ROC curve; AUC=0.97. (B) Kaplan-Meier curve indicating a significant difference in survival between high-risk and low-risk samples identified by the multivariate prognosis of 24 genes; P=2.48×10⁻¹⁹. AUC, area under the ROC curve; ROC, receiver operating characteristic.
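The AUC quoted for an ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise calculation is given below; the risk scores and labels are hypothetical toy values, and the record does not specify how the study derived its risk scores.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores (label 1 = high-risk); NOT values from the study.
toy_auc = roc_auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0])
```

An AUC of 1.0 means every positive sample outranks every negative one, and 0.5 corresponds to random ordering.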
Figure 5. Validation of survival analysis using two additional data sets. (A) ROC curve of the classification in the GSE42743 data set; AUC=0.994. (B) Kaplan-Meier curve indicating a significant difference in survival between high-risk and low-risk samples in GSE42743; P=4.55×10⁻¹⁵. (C) ROC curve of the recurrent classification in the GSE26549 data set; AUC=0.984. (D) Kaplan-Meier curve indicating a significant difference between high-risk and low-risk samples in GSE26549; P=1.41×10⁻¹⁴. AUC, area under the ROC curve; ROC, receiver operating characteristic.