
Novel deep learning-based survival prediction for oral cancer by analyzing tumor-infiltrating lymphocyte profiles through CIBERSORT.

Yeongjoo Kim^1,2, Ji Wan Kang^1,2, Junho Kang^1,2, Eun Jung Kwon^1,2, Mihyang Ha^1,2, Yoon Kyeong Kim^1,2, Hansong Lee^1,2, Je-Keun Rhee^3, Yun Hak Kim^2,4.

Abstract

Tumor-infiltrating lymphocytes (TILs) strongly influence the tumor microenvironment (TME) within mucosal neoplastic tissue in oral cancer (ORCA). Here, CIBERSORT profiles of ORCA data, filtered from the publicly accessible data of patients with head and neck cancer in The Cancer Genome Atlas (TCGA), were grouped by hierarchical clustering, and patients were then regrouped into binary risk groups based on the clustering-measuring scores and the survival patterns associated with individual groups. Based on this analysis, clinically reasonable differences were identified in 16 out of 22 TIL fractions between groups. A deep neural network classifier was trained using the TIL fraction patterns. This internally validated classifier was applied to an independent ORCA dataset from the International Cancer Genome Consortium data portal, where the predicted risk groups showed significantly different survival (p = .00685). Seven common differentially expressed genes between the two risk groups were obtained. This new approach confirms the importance of TILs in the TME and provides a direction for the use of a novel deep-learning approach for cancer prognosis.


Keywords: Head and neck cancer; cibersort; deep learning; international cancer genome consortium; oral cancer; the cancer genome atlas; tumor microenvironment; tumor-infiltrating lymphocytes

Year: 2021 PMID: 33854823 PMCID: PMC8018482 DOI: 10.1080/2162402X.2021.1904573

Source DB: PubMed Journal: Oncoimmunology ISSN: 2162-4011 Impact factor: 8.110

Introduction

Head and neck cancer (HNSC) is currently garnering much attention; approximately 53,260 patients have been newly diagnosed with HNSC in the United States in 2020 to date, and the estimated number of HNSC-associated deaths ranks 10th among the major malignant cancer types in the U.S.[1] Although these patients account for only 2.9% of the newly reported cancer cases, the incidence rate of HNSC was 11.4% and the mortality rate was 2.5% over the 5-year period from 2013 to 2017, and the 5-year relative survival rate from 2010 to 2016 was only 66.2%.[2] Therefore, the development of new HNSC biomarkers is important to compensate for the scarcity of research data that results from the small number of patients. Over the past several years, many studies have been conducted to develop a novel HNSC biomarker that predicts prognosis.[3-8] However, limitations of the current potential biomarkers make clinical application very challenging.[9] The importance of tumor-infiltrating lymphocyte (TIL) information has previously been reported,[10,11] as TIL levels identify high-risk groups of patients with oral tongue squamous cell carcinoma.[12] In particular, balanced levels of CD8+ T cells and regulatory T cells (Tregs) directly affect the survival rate of patients with oral cancer (ORCA).[13] In addition, the elevated abundance of cancer-associated fibroblasts is highly correlated with patient survival.[14] In TIL-focused analyses, differentially expressed gene (DEG) analysis is typically used to identify individual candidate biomarker genes from increases in gene expression levels. The results of such analyses may therefore include false-positive candidate genes that are not significantly related to the immune system.
Thus, broad identification of general immune microenvironments may be more effective than relying on the identification of individual levels of biomarker genes when analyzing immune-related survival rates.[15,16] For this reason, CIBERSORT, a popular TIL prediction method, was selected. Through the LM22 signature matrix, CIBERSORT provided the most diverse predictions of abundance levels across TIL subsets among the tumor microenvironment (TME) deconvolution tools available at the time. This was consistent with our purpose of identifying the “cell type-wide” immune cell landscape. Using a support vector regression-based machine learning method, Newman et al. have demonstrated via benchmarking analysis that CIBERSORT effectively resolves cell subtypes with similar gene expression patterns.[17] CIBERSORT analysis for various cancer types has enabled the development of novel biomarkers,[18-20] confirming the importance of focusing on immune cell fractions within the TME. Several existing studies have identified survival patterns by analyzing gene expression profiles using deep learning,[21-23] but deep learning studies that reveal TIL-specific patterns using secondary information, such as CIBERSORT output, have not been published thus far. Therefore, in this study, deep learning was suggested as an alternative strategy for identifying biomarkers that may provide details about survival patterns. In this strategy, RNA expression data of selected ORCA subgroups in the HNSC cohort of The Cancer Genome Atlas (TCGA) were converted into heterogeneous TME information by CIBERSORT and fed into a deep neural network (DNN) classifier.

Materials and Methods

RNA expression data and derived immunotype-predicted data preprocessing

The RNA-Sequencing (RNA-Seq) ORCA datasets were downloaded for training candidates and validation. The RNA expression and clinical data for head and neck squamous cell carcinoma were downloaded from the Broad Institute Genome Data Analysis Center Firehose database (https://gdac.broadinstitute.org/). Another ORCA dataset was downloaded from the International Cancer Genome Consortium (ICGC) data portal (https://dcc.icgc.org/). Detailed patient information after data preprocessing is provided in Table 1.

Table 1.

Clinical characteristics of the cohort of patients with head and neck cancer in The Cancer Genome Atlas for which the oral cancer data were filtered

		Total	Low-risk group	High-risk group
		Number (Percentage)
Age (years)	0–39	4(2.3)	3(4.1)	1(1.0)
	40–49	20(11.6)	8(10.8)	12(12.1)
	50–59	58(33.5)	27(36.5)	31(31.3)
	60–69	42(24.3)	14(18.9)	28(28.3)
	70–79	34(19.7)	15(20.3)	19(19.2)
	80+	15(8.7)	7(9.5)	8(8.1)
Sex	Male	124(71.7)	53(71.6)	71(71.7)
	Female	49(28.3)	21(28.4)	28(28.3)
N stage	N0	75(43.4)	34(45.9)	41(41.4)
	N1	28(16.2)	8(10.8)	20(20.2)
	N2	62(35.8)	29(39.2)	33(33.3)
	N3	2(1.2)	0(0.0)	2(2.0)
	NX	6(3.5)	3(4.1)	3(3.0)
T stage	T1	14(8.1)	10(13.5)	4(4.0)
	T2	56(32.4)	29(39.2)	27(27.3)
	T3	33(19.1)	15(20.3)	18(18.2)
	T4	66(38.2)	17(23.0)	49(49.5)
	TX	4(2.3)	3(4.1)	1(1.0)
M stage	M0	166(96.0)	68(91.9)	98(99.0)
	M1	2(1.2)	2(2.7)	0(0.0)
	MX	5(2.9)	4(5.4)	1(1.0)

The immune cell fractions in both TCGA and ICGC datasets were predicted via CIBERSORT using the LM22 signature matrix with 100 permutations and without quantile normalization, as directed on the website. After running CIBERSORT, the 203 of the 566 samples with a CIBERSORT p-value > 0.05 were removed. Another 77 samples from the hypopharynx, larynx, and oropharynx, which clearly did not belong to ORCA, were removed. A total of 113 samples from tongue sites (tongue base and oral tongue) were also removed because their clinical prognosis differs from that of the validation cohort, which consists of gingivobuccal cancer.[24,25] The remaining 173 samples were retained for further analysis. A detailed flowchart of the pipeline is shown in Figure 1(a).
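The exclusion steps described above can be checked with simple arithmetic. The following is a minimal sketch; the step labels are paraphrased for illustration, and only the counts are taken from the text.

```python
# Sample counts reported in the text; the labels are paraphrased for illustration.
total_samples = 566
exclusions = [
    ("CIBERSORT p-value > 0.05", 203),
    ("hypopharynx, larynx, or oropharynx site", 77),
    ("tongue site (tongue base or oral tongue)", 113),
]

remaining = total_samples
for reason, n_removed in exclusions:
    remaining -= n_removed
    print(f"removed {n_removed:>3} ({reason}); {remaining} remain")
```

The final value of `remaining` should equal the 173 samples carried forward for analysis.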

Figure 1.

(a) Pipeline flowchart depicting the data preprocessing step. (b) Pipeline flowchart for processing the classifier establishment step, including the validation process using a deep neural network (DNN) classifier. GDAC, Genome Data Analysis Center; HNSC, head and neck cancer; RNA-seq, RNA sequencing; TIL, tumor-infiltrating lymphocyte; DEG, differentially expressed gene; DNN, deep neural network; RF, random forests; DT, decision tree; ICGC, International Cancer Genome Consortium; ORCA, oral cancer

Statistical analysis

K-means clustering and hierarchical clustering were performed using the scikit-learn Python package (version 0.22.1). Consensus clustering was performed using the ConsensusClusterPlus R package (version 1.50.0). Differences in each LM22 fraction between groups were tested using the Mann‒Whitney U test. Survival analysis was performed using the lifelines Python package (version 0.24.2). Significance between survival curves was analyzed using the log-rank test. The boxplot visualization with test annotations was performed using the Statannot Python package (version 0.2.2).
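As an illustration of the log-rank comparison between two survival curves, the sketch below implements the standard two-group statistic using only the Python standard library. It is not the package implementation used in the study; the function name `logrank_test` and the toy follow-up times are assumptions made for the example.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi2, p_value) with 1 degree of freedom.

    events_* hold 1 for an observed death and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)               # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)         # deaths at t, both groups
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value

# Toy follow-up times (all deaths observed); not data from the study.
early = [1, 2, 3, 4, 5]
late = [10, 11, 12, 13, 14]
chi2, p_value = logrank_test(early, [1] * 5, late, [1] * 5)
```

With clearly separated toy groups such as these, the p-value falls well below 0.05; identical groups give a statistic of zero.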

DEG analysis

DEG analysis between TCGA risk groups was performed using the R limma package (version 3.42.2).[26] Thresholds of P-value < 0.05 and |log fold change| > 1.2 were applied to the results.
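The thresholding step can be sketched as a simple filter. The example below is illustrative only: the function name `filter_degs`, the field names, and the toy rows are assumptions, and the text does not state whether raw or adjusted P-values were used.

```python
def filter_degs(results, p_cutoff=0.05, lfc_cutoff=1.2):
    """Keep genes with P-value < p_cutoff and |log fold change| > lfc_cutoff."""
    return [
        row["gene"]
        for row in results
        if row["p_value"] < p_cutoff and abs(row["log_fc"]) > lfc_cutoff
    ]

# Made-up rows for illustration only; not results from the study.
toy_results = [
    {"gene": "GENE_A", "log_fc": 2.1, "p_value": 0.001},   # passes both thresholds
    {"gene": "GENE_B", "log_fc": -1.5, "p_value": 0.03},   # passes (downregulated)
    {"gene": "GENE_C", "log_fc": 0.4, "p_value": 0.0001},  # fold change too small
    {"gene": "GENE_D", "log_fc": 3.0, "p_value": 0.20},    # not significant
]
selected = filter_degs(toy_results)
```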

Survival prediction by deep learning classification

Deep learning classification was performed using a DNN classifier in the TensorFlow module (version 1.14.0) in Python, trained for 2000 steps with 7 × 7 hidden units. The hidden unit size was varied over square configurations from 2 × 2 to 30 × 30, and 7 × 7 was selected as the value with the highest accuracy. Loss was calculated using softmax cross entropy, with Adagrad as the optimizer and ReLU as the activation function. Accuracy was calculated automatically using the internal evaluation function. A detailed flowchart of the pipeline is shown in Figure 1(b).
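To make the architecture concrete, the sketch below runs a single untrained forward pass through a small fully connected network with ReLU activations and a softmax cross-entropy loss, using only the Python standard library. It is not the TensorFlow classifier used in the study. Reading “7 × 7” as two hidden layers of seven units each, the 22-value input (one per LM22 fraction), the two output classes, and the random weight initialization are all assumptions made for illustration; no training (Adagrad) is performed.

```python
import math
import random

def relu(vec):
    return [max(0.0, v) for v in vec]

def dense(vec, weights, biases):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, biases)]

def softmax(logits):
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(fractions, layers):
    """ReLU on the hidden layers, softmax on the output layer."""
    hidden = fractions
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(hidden, weights, biases))

def cross_entropy(probs, label):
    return -math.log(probs[label])

rng = random.Random(0)
sizes = [22, 7, 7, 2]  # assumed layout: 22 inputs, two hidden layers of 7, 2 risk classes
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

sample = [1.0 / 22] * 22  # toy input: uniform fractions summing to 1
probs = forward(sample, layers)
loss = cross_entropy(probs, label=1)

# Candidate square hidden-layer configurations, assuming the search ran from 2 x 2 to 30 x 30.
candidate_hidden = [[n, n] for n in range(2, 31)]
```

In a search of this kind, each entry of `candidate_hidden` would be trained and evaluated in turn, and the configuration with the highest accuracy retained.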

Analysis environment

The overall analysis was performed using Python (version 3.7.9; Python Software Foundation, Wilmington, DE, USA) and R (version 3.6.1; The R Foundation, Vienna, Austria) software. Any other version of Python/R packages of interest may be checked in an established conda environment within the provided docker image.

Results

Survival analysis of clustered CIBERSORT results from TCGA data

To obtain an unsupervised classification, the CIBERSORT results were subjected to various clustering analyses. The most important step in clustering was to determine the appropriate clustering method and optimal k-value. In this study, three clustering methods were considered as candidates: hierarchical, K-means, and consensus clustering. These clustering methods were all incorporated into the “intracohort validation” process, and the results were scored for each k-value using mutual information (MI), normalized MI (NMI), and adjusted MI (AMI). Hierarchical clustering with k = 3 was ultimately used; the scores of the candidate methods are summarized in Table 2.
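For readers unfamiliar with these scores, the sketch below computes MI and NMI between two label assignments using only the Python standard library. It is an illustration rather than the scikit-learn routine used in the study: the function names are assumptions, NMI is normalized by the arithmetic mean of the two entropies (one common convention), and AMI is omitted because it additionally requires the expected MI under random labeling. The text does not state which two label sets were compared, so the inputs here are toy values.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two label assignments."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in joint.items():
        mi += (n_ab / n) * math.log(n * n_ab / (count_a[a] * count_b[b]))
    return mi

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def normalized_mi(labels_a, labels_b):
    """NMI using the arithmetic mean of the two entropies as the normalizer."""
    denom = (entropy(labels_a) + entropy(labels_b)) / 2
    return mutual_information(labels_a, labels_b) / denom if denom > 0 else 1.0

# Toy labels: the same partition with permuted cluster names.
labels_a = [0, 0, 1, 1, 2, 2]
labels_b = [1, 1, 2, 2, 0, 0]
mi = mutual_information(labels_a, labels_b)
nmi = normalized_mi(labels_a, labels_b)
```

Because MI is bounded by the entropy of the labels, it grows with the number of clusters, which is one reason the normalized and adjusted variants are reported alongside it.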

Table 2.

Mutual information (MI), normalized MI (NMI), and adjusted MI (AMI) scores of potential clustering methods. The most acceptable scores among the k-values tested for each clustering/measuring method are highlighted.

	Consensus(k = 3)	Hierarchical(k = 2)	Hierarchical(k = 3)	Hierarchical(k = 4)	K-means(k = 2)	K-means(k = 3)
MI	0.423562	0.405244	0.687885	1.037712	0.426107	0.686712
NMI	0.414230	0.592863	0.654062	0.818148	0.674453	0.642146
AMI	0.400079	0.589346	0.645963	0.809483	0.671388	0.633973

A Kaplan‒Meier (K-M) graph was plotted to depict the overall survival of each defined cluster using the clinical data. Cluster 2 (green) exhibited a better prognosis compared to clusters 1 and 3 (Figure 2(a)). Cluster 1 (red) and cluster 3 (green) did not differ meaningfully in survival (p = .88740); therefore, the two groups were combined into a single high-risk group (Figure 2(b)).
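The regrouping and survival estimation can be sketched as follows. This is a minimal standard-library illustration, not the package implementation used in the study; the function name `kaplan_meier`, the cluster-to-risk mapping (which follows the cluster numbering in the text), and the toy records are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events holds 1 for an observed death and 0 for a censored observation.
    """
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical mapping: clusters 1 and 3 merged into one high-risk group.
CLUSTER_TO_RISK = {1: "high", 2: "low", 3: "high"}

# Toy records (cluster, months, event); not data from the study.
records = [(1, 5, 1), (1, 8, 1), (2, 20, 0), (2, 24, 1), (3, 6, 1), (3, 9, 0)]

groups = {"high": ([], []), "low": ([], [])}
for cluster, months, event in records:
    times, events = groups[CLUSTER_TO_RISK[cluster]]
    times.append(months)
    events.append(event)

km_high = kaplan_meier(*groups["high"])
km_low = kaplan_meier(*groups["low"])
```

Each curve is a step function that drops only at observed event times; censored patients leave the risk set without lowering the estimate.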

Figure 2.

(a) Kaplan‒Meier (K-M) plot of K-means clustering after cell-type identification by estimating relative subsets of RNA transcripts (k = 3 and n = 173). The yellow line (class 3) shows a distinct favorable survival pattern (p-value = 0.26592). (b) K-M plot of Figure 2(a) regrouped by binary risk group. Groups corresponding to the blue and green lines in Figure 2(a) are merged into one high-risk group (p-value = 0.01441).

Differences in the TIL fraction between the high- and low-risk groups

The Mann‒Whitney U test was used to determine the differences between the high- and low-risk groups in the TCGA ORCA dataset based on the abundance of each of the 22 TIL subsets (Figure 3). As shown in the boxplots, the fractions of naïve and memory B cells, CD8+ T cells, activated memory CD4+ T cells, follicular helper T cells, Tregs, resting natural killer (NK) cells, monocytes, M1 macrophages, resting dendritic cells, and resting mast cells were significantly increased in the low-risk group, whereas significant increases in the naïve CD4+ T cell, gamma delta T cell, M0 macrophage, activated mast cell, and eosinophil fractions were observed in the high-risk group. No significant difference was observed between the two groups with respect to the plasma cell, resting memory CD4+ T cell, activated NK cell, M2 macrophage, activated dendritic cell, and neutrophil fractions.
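The per-subset comparison can be illustrated with a small standard-library version of the Mann‒Whitney U test. This sketch uses the normal approximation without tie or continuity correction, so it is only a rough stand-in for a statistical package routine such as `scipy.stats.mannwhitneyu`; the function name `mann_whitney_u` and the toy fractions are assumptions.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for x, p_value). No tie or continuity correction.
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)

    # Assign 1-based ranks, averaging over runs of tied values.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u1, p_value

# Toy fractions for one cell subset in two groups; not data from the study.
low_risk = [0.10, 0.20, 0.30]
high_risk = [0.40, 0.50, 0.60]
u_stat, p_value = mann_whitney_u(low_risk, high_risk)
```

In the analysis described above, a test of this kind would be repeated once for each of the 22 LM22 fractions.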

Figure 3.

Bar plots indicating the differences in the estimated LM22 fraction between the high and low survival risk groups. Each p-value is written above the bar plots (NS: p > 0.05, *: p ≤ 0.05, **: p ≤ 0.01, ***: p ≤ 0.001, and ****: p ≤ 0.0001). The y-axis indicates the predicted fraction level of each cell subtype.

Common DEGs between high- and low-risk groups

To explore prominent potential biomarker genes, DEG analysis was performed between the predicted risk groups in both the TCGA and ICGC cohorts, yielding seven common DEGs in total. Small proline-rich protein 3 (SPRR3) was upregulated in the low-risk group, while collagen type XI alpha 1 chain (COL11A1), collagen type X alpha 1 chain (COL10A1), matrix metallopeptidase 11 (MMP11), matrix metallopeptidase 13 (MMP13), collagen triple helix repeat containing 1 (CTHRC1), and ring finger protein 128 (RNF128) showed significant upregulation in the high-risk group.
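Finding DEGs shared between two cohorts amounts to a set intersection. The sketch below is illustrative only: the function name `common_degs` and the toy gene symbols are assumptions, and requiring the same direction of change in both cohorts is an additional assumption not stated in the text.

```python
def common_degs(degs_a, degs_b):
    """Genes called in both cohorts with the same direction of change.

    Each argument maps a gene symbol to its log fold change.
    """
    shared = degs_a.keys() & degs_b.keys()
    return sorted(g for g in shared if (degs_a[g] > 0) == (degs_b[g] > 0))

# Made-up values for illustration only; not results from the study.
cohort_a = {"GENE_1": 2.0, "GENE_2": -1.5, "GENE_3": 1.3}
cohort_b = {"GENE_1": 1.4, "GENE_2": 1.6, "GENE_4": -2.0}
shared_genes = common_degs(cohort_a, cohort_b)
```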

Survival prediction based on the classified CIBERSORT results

To verify whether the survival of other public ORCA cohorts could be predicted from the patterns observed in the current data, a deep learning model using a DNN classifier was established. To validate the classifier model’s accuracy, two strategies were employed, i.e., “intracohort validation” and “intercohort validation.” In the first step, 80% of the samples (n = 138) in the TCGA ORCA dataset were used as inputs to train the DNN classifier. The remaining samples were divided into two test groups for validation (n = 17 for group 1, n = 18 for group 2). The accuracy was 100% for the first test group and 94.4% for the second, giving an average risk-group prediction accuracy of 97.2%. A graph depicting the changes in the loss function and internal accuracy of the training set over training steps, together with the accuracies in the individual test sets, is shown in Figure 4. Because the classifier predicted the survival pattern of the sample group based on the CIBERSORT results, the classifier was used to analyze a completely different RNA-Seq ORCA dataset.
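The sample counts in the intracohort validation can be reproduced with a simple split. The sketch below is an assumption-laden illustration: the function name `split_cohort`, the random shuffling, and the seed are not taken from the text, which does not describe how samples were assigned to each set.

```python
import random

def split_cohort(sample_ids, train_fraction=0.8, seed=0):
    """Shuffle the samples, keep train_fraction for training, halve the rest."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    held_out = ids[n_train:]
    half = len(held_out) // 2
    return ids[:n_train], held_out[:half], held_out[half:]

train, test1, test2 = split_cohort(range(173))

# Average of the two reported test accuracies (percent).
reported_accuracies = [100.0, 94.4]
mean_accuracy = sum(reported_accuracies) / len(reported_accuracies)
```

With 173 samples this yields 138 training samples and test sets of 17 and 18, matching the group sizes reported above.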

Figure 4.

Scalar visualization of the established deep neural network (DNN) classifier model over steps in the loss function (a) and accuracy with the training datasets (b), primary test set (c), and secondary test set (d).

There are some concerns that the remarkable performance of the classifier might be attributed to overfitting, which is always a risk in endeavors aimed at achieving maximum accuracy. Thus, the performance of the classifier must be validated using cohorts from completely different batches. The obtained ICGC ORCA RNA-Seq dataset was classified using the previously established classifier. The differential survival patterns between samples from the predicted high-risk (n = 6) and low-risk groups (n = 28) were analyzed using the K-M method. As shown in Figure 5, the survival of the predicted high-risk group was significantly lower than that of the predicted low-risk group (p = .00685). Detailed patient information for each predicted group is provided in Table 3.

Figure 5.

Kaplan‒Meier survival plot of the predicted International Cancer Genome Consortium oral cancer dataset. (p-value: 0.00685)

Table 3.

Clinical characteristics of the risk-group predicted cohort of patients with oral cancer in the International Cancer Genome Consortium data

		Total	Low-risk group	High-risk group
		Number (Percentage)
Age (years)	0–39	7(20.5)	7(25.0)	0(0.0)
	40–49	10(29.4)	8(28.6)	2(33.3)
	50–59	11(32.4)	10(35.7)	1(16.7)
	60–69	5(14.7)	3(10.7)	2(33.3)
	70–79	1(2.9)	0(0.0)	1(16.7)
	80+	0(0.0)	0(0.0)	0(0.0)
Sex	Male	28(82.4)	22(78.6)	6(100.0)
	Female	6(17.6)	6(21.4)	0(0.0)
N stage	N0	8(23.5)	8(28.6)	0(0.0)
	N1	17(50.0)	12(42.9)	5(83.3)
	N2	9(26.5)	8(28.6)	1(16.7)
	N3	0(0.0)	0(0.0)	0(0.0)
	NX	0(0.0)	0(0.0)	0(0.0)
T stage	T1	0(0.0)	0(0.0)	0(0.0)
	T2	0(0.0)	0(0.0)	0(0.0)
	T3	2(5.9)	2(7.1)	0(0.0)
	T4	32(94.1)	26(92.9)	6(100.0)
	TX	0(0.0)	0(0.0)	0(0.0)
M stage	M0	34(100.0)	28(100.0)	6(100.0)
	M1	0(0.0)	0(0.0)	0(0.0)
	MX	0(0.0)	0(0.0)	0(0.0)

To compare the classifier’s performance with that of similar modeling methods, the same training data were run through the same pipeline using two other methods: random forests and decision trees. The accuracy of the random forest method was approximately 94.1% on average between the two intracohort validation datasets, but the classifier established with this method did not show a significant survival difference between the predicted risk groups in intercohort validation. In contrast, the decision tree method yielded a significant survival difference between the two groups in external validation, but its internal validation accuracy was the lowest across methods, at only 82.4% on average. All K-M plots, receiver operating characteristic curve plots, and corresponding area under the curve scores for the validation results of the three methods over the pipeline are provided in Supplemental Figures 4 and 5.

Discussion

The analysis of hidden patterns within gene expression data is a powerful strategy for attaining an in-depth understanding of functional genomics. However, the complexity of biological networks and the large number of genes make data analysis very difficult; thus, clustering algorithms help derive useful information by identifying patterns in gene expression data.[27] Based on this idea, we clustered the ORCA CIBERSORT results, which accurately estimate immune composition, with the aim of achieving more immune-specific and noise-free clustering. In addition, to examine whether the results could be validly analyzed, the immunological characteristics of the high- and low-risk groups were determined by comparing the immune cell fractions estimated by CIBERSORT in each group.
The TME is an important indicator of the clinical and prognostic factors of cancer. Bin Liang et al. have identified the effects of 22 immune cell subsets in patients with HNSC and used their characteristics to reveal clinical relevance and define independent prognostic factors.[28] In this study, the actual risk groups were additionally predicted by classifying the signature patterns of TIL fractions in patients with HNSC and combining them with survival information. As a result of this effort, a small group of patients with HNSC whose survival rates were extremely low was successfully identified.
The purpose of this study was to identify the clearest differences between the two risk groups and to introduce these data patterns into a classifier; the classification method and k-value were therefore chosen to yield the most distant and significant clusters. This approach thus differs from selecting a method purely on clustering measures such as MI, NMI, and AMI. The measured score plays an important role in determining the optimal method/k-value based on survival patterns, but it cannot serve as the sole evidence.
For example, among the scores for the hierarchical clustering method shown in Table 2, those for k = 4 (MI = 1.037712, NMI = 0.818148, AMI = 0.809483) were markedly higher than those for k = 3 (MI = 0.687885, NMI = 0.654062, AMI = 0.645963), but in the K-M analysis, the p-value for k = 4 was poorer (k = 3, p = .1359 and k = 4, p = .2524). The survival plots based on K-M analysis are shown in Supplemental Figures 1–3.
Differences in LM22 subtypes between the risk groups were also investigated. Many immune cell subtypes that enhance immunity were decreased in the high-risk group. Focusing on T-cell subgroups, the high-risk group showed decreased activated memory CD4+ T cell fractions. Given that the predicted naïve CD4+ T cell fraction showed the opposite pattern, these data suggest that CD4+ T cell differentiation affects survival rates. Although further investigations on memory CD4+ T cells are required, several studies on cell development have revealed that memory CD4+ T cell development in tumors is crucial for various immunotherapeutic treatments, such as immune blockade therapy, in the context of enhancing the effectiveness of the anti-tumor response.[29] In addition, activated memory CD4+ T cell-induced activation of CD8+ T cells increases the direct killing of cancer cells, which is consistent with the marked difference in the current cell type data. The fractions of follicular helper T cells, which have been reported to play important roles in various cancer microenvironments,[30-32] were also decreased in the high-risk group. A higher fraction of gamma/delta T cells has been found in patients with HNSC than in healthy controls.[33] The corresponding result in this study also indicates that the elevated gamma/delta T cell fraction might contribute to the low survival rates of the predicted high-risk patient group.
Increased monocyte levels enhance macrophage polarization into M1 macrophages, which produce proinflammatory cytokines and reactive oxygen/nitrogen species that are crucial for host defense and tumor cell killing.[34] Chronic inflammation abnormalities and the accompanying oxidative stress lead to the development of various diseases, such as cancer.[35,36] In this context, the current predictions of monocyte and macrophage phenotype abundance revealed some interesting results. Specifically, although the M2 macrophage fraction did not differ between the two groups, the M0 macrophage fraction was significantly increased in the high-risk group, while the M1 macrophage and monocyte fractions showed the opposite pattern, suggesting that the proinflammatory deactivation caused by significantly decreased macrophage polarization played a crucial role in determining the survival rate in the high-risk group.
The resting and activated mast cell fractions also showed opposite patterns. Mast cell accumulation in tumor tissue can be either beneficial or detrimental to tumors; although further understanding of the key roles of mast cells in cancer is required, several studies have summarized the correlation between mast cells and cancer.[37]
It is also interesting to note that the current results are somewhat different from the survival analysis results based on Tregs and M0 macrophages reported by Bin Liang et al.[28] Both studies showed low survival rates in the groups containing low M0 macrophage counts, but the current study showed the opposite pattern in Treg counts between risk groups. This indicates that the classifier we established focused on differences in the cell subtypes with a more significant effect on survival among the 22 input channels, considering the overall TIL composition within malignant tissues.
It also suggests that analyzing survival patterns using individual factors, such as single immune-cell types, may be problematic because the TIL composition in the TME is heterogeneous.
Eosinophil activity affects tumors in various ways due to the immunobiological characteristics of eosinophils; they contribute to anti-tumor responses through their destructive features, but also promote tumor proliferation by inhibiting Th1 responses or increasing Th2 responses.[38] Although tumor-associated eosinophils are generally observed in hematological solid tumors with a favorable prognosis,[39] the eosinophil elevation in this study shows the opposite tendency. Further studies on the roles of eosinophils in ORCA would contribute to a better understanding of the results. Additionally, both naïve and memory B cell fractions were decreased in high-risk patients. This result suggests potential hazards of B cell deficiency, although the role of tumor-infiltrating B cells within the TME remains controversial[40] and requires further investigation.
Toruner et al. have reported significant downregulation of SPRR3 in oral squamous cell carcinoma compared to that in normal tissues,[41] a finding also reported in two individual studies.[42,43] The current results are consistent with these reports. COL11A1 and COL10A1, which encode circulating extracellular-matrix (ECM)-related proteins, are significantly elevated in breast cancer, gastric cancer, and pancreatic cancer.[44] Although further study is required to apply this result to ORCA, the current study supports the hypothesized role of the two genes in various types of cancer. Pal et al. have reported that MMP11 and MMP13 are stimulated by the tumor-specific ECM protein thrombospondin 1 (THBS1), whose expression is highly elevated in both cultured human ORCA cell lines and their co-cultivated mouse fibroblast cells.
The expression level of CTHRC1 is highly correlated with metastasis of ORCA cells.[45] Downregulation of RNF128 expression is correlated with poor prognosis in various types of malignancies.[46,47] However, in the current study, RNF128 was upregulated in the high-risk group compared to the low-risk group. A Cox proportional-hazards model was applied to the common DEGs, but no significant coefficient with survival was found among the genes included. In light of these results, determining the cancer survival prognosis with a single biomarker gene might overlook the importance of the TME as a whole.
The benefits gained from the clustering of existing gene expression data and analysis methods are well established.[48] However, the ability to achieve useful results by clustering secondarily analyzed expression data (in this case, CIBERSORT output) has been questionable. In this study, evidence was provided on the benefits of clustering CIBERSORT data that contained information on tumor-infiltrating immune cell subsets. Further study and understanding of these cell subsets in in vivo environments in various cancers are needed.
Due to the lack of public RNA-Seq ORCA datasets, the intercohort validation was performed using only a single cohort. In addition, some challenges remained regarding the skewed intercohort validation result. In spite of efforts to establish binary risk group models that were simultaneously sample-balanced, survival-distinct, and significant, the predicted high-risk patients accounted for only 17.6% of the total patients, even though most of them were at the T4 stage.

Conclusion

Despite efforts to develop novel biomarkers for HNSC, more research is needed. To establish an accurate, survival-specific predictive model, the public RNA-Seq ORCA dataset was virtually dissected and transformed into TIL-specific data using CIBERSORT. When these data were fed into the DNN classifier, it successfully predicted risk groups with distinct survival patterns in an independent ICGC ORCA cohort. Through this study, a novel approach based on deep learning is suggested that has potential application in various types of cancer.

43 in total

1. Tumor-infiltrating lymphocytes, particularly the balance between CD8(+) T cells and CCR4(+) regulatory T cells, affect the survival of patients with oral squamous cell carcinoma.

Authors: Yoshiko Watanabe; Fuminori Katou; Haruo Ohtani; Takashi Nakayama; Osamu Yoshie; Kenji Hashimoto
Journal: Oral Surg Oral Med Oral Pathol Oral Radiol Endod Date: 2010-03-29

2. Identification and Complete Validation of Prognostic Gene Signatures for Human Papillomavirus-Associated Cancers: Integrated Approach Covering Different Anatomical Locations.

Authors: Eun Jung Kwon; Mihyang Ha; Jeon Yeob Jang; Yun Hak Kim
Journal: J Virol Date: 2021-02-24 Impact factor: 5.103

3. Expression profile of epidermal differentiation complex genes in normal and anal cancer cells.

Authors: C Zucchini; A Biolchi; P Strippoli; R Solmi; G Rosati; M Del Governatore; E Milano; G Ugolini; N Salfi; A Farina; A Caira; S Zanotti; P Carinci; L Valvassori
Journal: Int J Oncol Date: 2001-12 Impact factor: 5.650

Review 4. Prevention of human cancer by modulation of chronic inflammatory processes.

Authors: Hiroshi Ohshima; Hiroshi Tazawa; Bakary S Sylla; Tomohiro Sawa
Journal: Mutat Res Date: 2005-08-03 Impact factor: 2.433

Review 5. A review on oral cancer biomarkers: Understanding the past and learning from the present.

Authors: Arvind Babu Rajendra Santosh; Thaon Jones; John Harvey
Journal: J Cancer Res Ther Date: 2016 Apr-Jun Impact factor: 1.805

6. Immune signature of T follicular helper cells predicts clinical prognostic and therapeutic impact in lung squamous cell carcinoma.

Authors: Feng Xu; Hongpan Zhang; Jiexin Chen; Ling Lin; Yongsong Chen
Journal: Int Immunopharmacol Date: 2019-12-10 Impact factor: 4.932

7. Evaluation of clustering algorithms for gene expression data.

Authors: Susmita Datta; Somnath Datta
Journal: BMC Bioinformatics Date: 2006-12-12 Impact factor: 3.169

8. Downregulation of RNF128 activates Wnt/β-catenin signaling to induce cellular EMT and stemness via CD44 and CTTN ubiquitination in melanoma.

Authors: Chuan-Yuan Wei; Meng-Xuan Zhu; Yan-Wen Yang; Peng-Fei Zhang; Xuan Yang; Rui Peng; Chao Gao; Jia-Cheng Lu; Lu Wang; Xin-Yi Deng; Nan-Hang Lu; Fa-Zhi Qi; Jian-Ying Gu
Journal: J Hematol Oncol Date: 2019-03-04 Impact factor: 17.388

9. Machine learning analysis of gene expression data reveals novel diagnostic and prognostic biomarkers and identifies therapeutic targets for soft tissue sarcomas.

Authors: David G P van IJzendoorn; Karoly Szuhai; Inge H Briaire-de Bruijn; Marie Kostine; Marieke L Kuijjer; Judith V M G Bovée
Journal: PLoS Comput Biol Date: 2019-02-20 Impact factor: 4.475

10. Profiles of immune cell infiltration in head and neck squamous carcinoma.

Authors: Bin Liang; Ye Tao; Tianjiao Wang
Journal: Biosci Rep Date: 2020-02-28 Impact factor: 3.840
