
Development of a Machine Learning Model for Survival Risk Stratification of Patients With Advanced Oral Cancer.

Yi-Ju Tseng1,2,3, Hsin-Yao Wang1,4, Ting-Wei Lin1, Jang-Jih Lu1,5,6, Chia-Hsun Hsieh5,7,8, Chun-Ta Liao9,10.   

Abstract

Importance: A tool for precisely stratifying postoperative patients with advanced oral cancer is crucial for the treatment plan, such as intensifying or deintensifying the regimen to improve their quality of life and prognosis.

Objective: To develop and validate a machine learning-based algorithm that can provide survival risk stratification for patients with advanced oral cancer who have comprehensive clinicopathologic and genetic data.

Design, Setting, and Participants: In this prognostic cohort study, the elastic net penalized Cox proportional hazards regression-based risk stratification model was developed and validated using single-center data collected between January 1, 1996, and December 31, 2011. In total, comprehensive clinicopathologic and genetic data (including clinical, pathologic, and 44 cancer-related gene variant profiles) of 334 patients with stage III or IV oral squamous cell carcinoma were used to develop and validate the algorithm in this 15-year cohort study. Data analysis was conducted between February 1, 2018, and May 6, 2020.

Main Outcomes and Measures: The main outcomes were cancer-specific survival, distant metastasis-free survival, and locoregional recurrence-free survival. Model performance was compared in terms of the Akaike information criterion and the Harrell concordance index (C index).

Results: Complete data were available for 334 patients (315 men; median age at onset, 48 years [interquartile range, 42-56 years]). The predictive models using comprehensive clinicopathologic and genetic data outperformed those using clinicopathologic data alone. In the groups of postoperative patients receiving adjuvant concurrent chemoradiotherapy, the models demonstrated higher classification performance than those using clinicopathologic data alone in cancer-specific survival (mean [SD] C index, 0.689 [0.050] vs 0.673 [0.051]; P = .02) and locoregional recurrence-free survival (mean [SD] C index, 0.693 [0.039] vs 0.678 [0.035]; P = .004). The classification performance in distant metastasis-free survival was not different (mean [SD] C index, 0.702 [0.056] vs 0.688 [0.048]; P = .09).

Conclusions and Relevance: A risk stratification model using comprehensive clinicopathologic and genetic data accurately differentiated the high-risk group from the low-risk group in cancer-specific survival and locoregional recurrence-free survival for postoperative patients with advanced oral cancer. This algorithm could be used through an online calculator to provide additional personalized information for postoperative management of patients with advanced oral squamous cell carcinoma.

Year:  2020        PMID: 32821921      PMCID: PMC7442932          DOI: 10.1001/jamanetworkopen.2020.11768

Source DB:  PubMed          Journal:  JAMA Netw Open        ISSN: 2574-3805


Introduction

Current postoperative treatment of advanced oral squamous cell cancer is often a combination of chemotherapy and radiotherapy.[1] One of the challenges for a physician is balancing treatment response against patient intolerance of toxic and adverse effects, including serious oral mucositis, dysphagia, speech impairment, dermatitis, headache, cognitive dysfunction, and muscle fibrosis.[2,3,4] In addition, the heterogeneity among patients with advanced oral cancer complicates treatment planning, and the treatment decision is reached after discussion between patients and physicians.[5] Risk stratification for patients with advanced cancer is crucial because it can be used to tailor the treatment to deintensify chemoradiotherapy for patients in the low-risk group or to intensify chemoradiotherapy for those in the high-risk group.[6,7,8] Moreover, precise risk stratification is associated with improved allocation and use of health care resources, and this information can be further used for care coordination.[9]

For precise treatment planning, tumor histologic information, such as TNM or staging, can be used for providing prognostic information.[10] Moreover, the gene variant profile demonstrates the possibility of indicating cancer prognosis through statistical data mining and machine learning (ML) techniques.[11,12,13] Recently, developing an ML-based model incorporating TNM data was associated with a promising clinical effect.[14] Statistical data mining and ML are excellent analytical methods for classification through identification of data patterns from complex data,[15] and they have been applied successfully in the medical field.[16,17,18,19] The precise estimation of prognosis by using clinicopathologic and genetic information, including clinical data, pathologic data, and the gene variant profile, would provide a comprehensive disease overview.[10]

Given the trans-omic data, it is reasonable to harness ML technologies, which are efficient at handling numerous predictors, to generate a risk stratification model. Here, we propose an elastic net penalized Cox proportional hazards regression–based risk stratification model to learn the patterns of different risk levels in cancer-specific survival, distant metastasis–free survival, and locoregional recurrence–free survival for postoperative patients with advanced oral cancer. According to the real-world database validation, our risk stratification models can be used as an online calculator by inputting the required data (eAppendix in the Supplement).

Methods

Data Source

We acquired data from a previously published study.[11] In total, 345 patients with oral squamous cell carcinoma were retrospectively recruited from Chang Gung Memorial Hospital in Taoyuan, Taiwan, between January 1, 1996, and December 31, 2011. All patients had been followed up for 30 months or until death. No patients were lost to follow-up under the enrollment criteria. Details regarding inclusion and exclusion criteria are described in a previously published study.[11] In brief, tumor samples were obtained from patients with stage III or IV node-positive cancer. The staging and pathologic diagnosis were assessed according to the criteria of the seventh edition of the American Joint Committee on Cancer.[20] The patients had not been treated for oral squamous cell carcinoma before the tumor samples were obtained. No metastatic disease was documented when the tumor sample was obtained during surgery as well. Treatment choices (surgery alone, surgery with adjuvant radiotherapy, and surgery with adjuvant concurrent chemoradiotherapy [CCRT]) were determined for each patient according to the National Comprehensive Cancer Network (before 2008) or Chang Gung guidelines (2008).[11,21] The study protocol was reviewed and approved by the Chang Gung Memorial Hospital Institutional Review Board, which waived patient consent because this was a retrospective study. We followed the Standards for Reporting of Diagnostic Accuracy (STARD) reporting guideline and the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline. Tumor samples were obtained during surgery for the following experiments of gene sequencing. 
The detailed settings of sample preparation and gene sequencing have been described in the previously published studies.[11,21] In brief, ultra-deep sequencing of 44 cancer-related gene variant profiles was analyzed using the Ion 318 chip on the Ion Torrent PGM (Personal Genome Machine) system (Thermo Fisher Scientific), in which hg19 reference genome was used as the reference. The 44 cancer-related gene variant profiles were ABL1 (OMIM 189980), AKT1 (OMIM 164730), ALK (OMIM 105590), APC (OMIM 611731), ATM (OMIM 607585), BRAF (OMIM 164757), CDH1 (OMIM 192090), CDKN2A (OMIM 600160), CSF1R (OMIM 164770), CTNNB1 (OMIM 116806), EGFR (OMIM 131550), ERBB2 (OMIM 164870), ERBB4 (OMIM 600543), FBXW7 (OMIM 606278), FGFR1 (OMIM 136350), FGFR2 (OMIM 176943), FGFR3 (OMIM 134934), FLT3 (OMIM 136351), HNF1A (OMIM 142410), HRAS (OMIM 190020), IDH1 (OMIM 147700), JAK3 (OMIM 600173), KDR (OMIM 191306), KIT (OMIM 164920), KRAS (OMIM 190070), MET (OMIM 164860), MLH1 (OMIM 120436), MPL (OMIM 159530), NOTCH1 (OMIM 190198), NPM1 (OMIM 164040), NRAS (OMIM 164790), PDGFRA (OMIM 173490), PIK3CA (OMIM 171834), PTEN (OMIM 601728), PTPN11 (OMIM 176876), RB1 (OMIM 614041), RET (OMIM 164761), SMAD4 (OMIM 600993), SMARCB1 (OMIM 601607), SMO (OMIM 601500), SRC (OMIM 190090), STK11 (OMIM 602216), TP53 (OMIM 191170), and VHL (OMIM 608537). Sanger sequencing or pyrosequencing was used for confirming variants detected using the Torrent Variant Caller plug-in, version 3.2 (Thermo Fisher Scientific). The genetic features were obtained by next-generation sequencing using an ultra-deep (>1000×) sequencing approach for the primary tumor samples, examining more than 1200 nonsynonymous variants containing missense, nonsense, indel, and splicing types of the variant. Information on the comprehensive clinical, pathologic, and genetic features of the patients was collected (Table). 
The comprehensive clinicopathologic and genetic features consisted of 5 clinical features (ie, sex, age at onset, alcohol drinking, betel quid chewing, and cigarette smoking), 17 pathologic features (eg, cancer primary site, pathologic T stage, pathologic N stage, pathologic stage, differentiation, pathologic tumor invasion depth, and nearest microscopic margin), and 44 gene features (ie, 44 cancer-related genes).

Table.

Characteristics of Patients With Oral Squamous Cell Carcinoma Who Underwent Surgery, Surgery With Adjuvant RT, or Surgery With Adjuvant CCRT

Values are No. (%) of patients unless otherwise indicated. For categorical characteristics, the P value is shown on the group row.

| Characteristic | Surgery alone (n = 25 [7.5%]) | Surgery with adjuvant RT (n = 98 [29.3%]) | Surgery with adjuvant CCRT (n = 211 [63.2%]) | P value |
|---|---|---|---|---|
| Sex | | | | .86 |
| - Male | 23 (92.0) | 93 (94.9) | 199 (94.3) | |
| - Female | 2 (8.0) | 5 (5.1) | 12 (5.7) | |
| Age at onset, median (IQR), y | 50 (39-60) | 48 (42-60) | 48 (43-55) | .80 |
| Alcohol drinking | 16 (64.0) | 64 (65.3) | 162 (76.8) | .07 |
| Betel quid chewing | 17 (68.0) | 81 (82.7) | 175 (82.9) | .18 |
| Cigarette smoking | 23 (92.0) | 89 (90.8) | 192 (91.0) | .98 |
| Cancer primary site | | | | .06 |
| - Tongue | 12 (48.0) | 35 (35.7) | 78 (37.0) | |
| - Mouth floor | 1 (4.0) | 6 (6.1) | 6 (2.8) | |
| - Lip | 0 | 1 (1.0) | 1 (0.5) | |
| - Buccal | 8 (32.0) | 41 (41.8) | 79 (37.4) | |
| - Gum (alveolar ridge) | 3 (12.0) | 10 (10.2) | 31 (14.7) | |
| - Hard palate | 0 | 5 (5.1) | 1 (0.5) | |
| - Retromolar trigone | 1 (4.0) | 0 | 15 (7.1) | |
| Pathologic T stage | | | | .21 |
| - 1 | 3 (12.0) | 4 (4.1) | 8 (3.8) | |
| - 2 | 10 (40.0) | 44 (44.9) | 82 (38.9) | |
| - 3 | 7 (28.0) | 20 (20.4) | 38 (18.0) | |
| - 4 | 5 (20.0) | 30 (30.6) | 83 (39.3) | |
| Pathologic N stage | | | | <.001 |
| - 1 | 14 (56.0) | 62 (63.3) | 44 (20.9) | |
| - 2a | 0 | 1 (1.0) | 2 (0.9) | |
| - 2b | 10 (40.0) | 26 (26.5) | 143 (67.8) | |
| - 2c | 1 (4.0) | 9 (9.2) | 22 (10.4) | |
| Pathologic stage | | | | <.001 |
| - III | 13 (52.0) | 43 (43.9) | 27 (12.8) | |
| - IV | 12 (48.0) | 55 (56.1) | 184 (87.2) | |
| Differentiation | | | | .09 |
| - Well differentiated | 4 (16.0) | 23 (23.5) | 31 (14.7) | |
| - Moderately differentiated | 15 (60.0) | 66 (67.3) | 140 (66.4) | |
| - Poorly differentiated | 6 (24.0) | 9 (9.2) | 40 (19.0) | |
| Pathologic tumor invasion depth, median (IQR), mm | 10 (5-15) | 12 (9-18) | 13 (8-19) | .17 |
| Nearest microscopic margin, median (IQR), mm | 7 (5-9) | 8.5 (6-10) | 8 (5-10) | .15 |
| Total dissected lymph nodes, median (IQR), No. | 35.0 (29.0-53.0) | 38.5 (28.0-52.2) | 47.0 (36.0-61.0) | <.001 |
| Positive lymph nodes on dissection, median (IQR), No. | 1 (1-3) | 1 (1-3) | 3 (2-4) | <.001 |
| Lower neck lymph node (level IV or V) involvement | 2 (8.0) | 3 (3.1) | 22 (10.4) | .09 |
| Extranodal extension | 10 (40.0) | 25 (25.5) | 161 (76.3) | <.001 |
| Perineural invasion | 5 (20.0) | 46 (46.9) | 122 (57.8) | .001 |
| Lymphatic vessel invasion | 0 | 13 (13.3) | 30 (14.2) | .13 |
| Vascular invasion | 0 | 3 (3.1) | 14 (6.6) | .20 |
| Skin invasion | 2 (8.0) | 9 (9.2) | 25 (11.8) | .70 |
| Bone marrow invasion | 2 (8.0) | 22 (22.4) | 44 (20.9) | .27 |
| Genetic features | | | | |
| TP53 | 16 (64.0) | 62 (63.3) | 140 (66.4) | .86 |
| PIK3CA | 6 (24.0) | 20 (20.4) | 44 (20.9) | .92 |
| CDKN2A | 2 (8.0) | 11 (11.2) | 29 (13.7) | .64 |
| HRAS | 8 (32.0) | 6 (6.1) | 16 (7.6) | <.001 |
| BRAF | 5 (20.0) | 6 (6.1) | 18 (8.5) | .09 |
| EGFR | 3 (12.0) | 6 (6.1) | 13 (6.2) | .53 |
| FGFR3 | 3 (12.0) | 6 (6.1) | 10 (4.7) | .33 |
| SMAD4 | 2 (8.0) | 4 (4.1) | 11 (5.2) | .72 |
| APC | 4 (16.0) | 5 (5.1) | 8 (3.8) | .03 |
| FGFR2 | 3 (12.0) | 3 (3.1) | 8 (3.8) | .12 |
| MET | 1 (4.0) | 4 (4.1) | 8 (3.8) | .99 |
| KIT | 2 (8.0) | 5 (5.1) | 6 (2.8) | .34 |
| PTEN | 3 (12.0) | 3 (3.1) | 7 (3.3) | .09 |
| ERBB4 | 2 (8.0) | 3 (3.1) | 8 (3.8) | .52 |
| RB1 | 1 (4.0) | 5 (5.1) | 6 (2.8) | .61 |
| RET | 1 (4.0) | 3 (3.1) | 7 (3.3) | .97 |
| ATM | 1 (4.0) | 4 (4.1) | 6 (2.8) | .83 |
| NOTCH1 | 1 (4.0) | 2 (2.0) | 7 (3.3) | .79 |
| ABL1 | 3 (12.0) | 4 (4.1) | 4 (1.9) | .02 |
| SMO | 2 (8.0) | 3 (3.1) | 5 (2.4) | .30 |
| STK11 | 1 (4.0) | 2 (2.0) | 7 (3.3) | .79 |
| FBXW7 | 2 (8.0) | 2 (2.0) | 5 (2.4) | .23 |
| AKT1 | 1 (4.0) | 2 (2.0) | 7 (3.3) | .79 |
| PDGFRA | 2 (8.0) | 2 (2.0) | 5 (2.4) | .23 |
| KDR | 1 (4.0) | 1 (1.0) | 6 (2.8) | .54 |
| CTNNB1 | 1 (4.0) | 1 (1.0) | 5 (2.4) | .59 |
| PTPN11 | 2 (8.0) | 2 (2.0) | 4 (1.9) | .16 |
| KRAS | 2 (8.0) | 1 (1.0) | 4 (1.9) | .09 |
| CDH1 | 1 (4.0) | 0 | 5 (2.4) | .24 |
| ERBB2 | 0 | 0 | 4 (1.9) | .31 |
| SMARCB1 | 0 | 0 | 6 (2.8) | .17 |
| JAK3 | 1 (4.0) | 0 | 3 (1.4) | .23 |
| FGFR1 | 1 (4.0) | 1 (1.0) | 2 (0.9) | .41 |
| HNF1A | 0 | 2 (2.0) | 1 (0.5) | .35 |
| MLH1 | 1 (4.0) | 0 | 3 (1.4) | .23 |
| VHL | 1 (4.0) | 0 | 2 (0.9) | .17 |
| IDH1 | 2 (8.0) | 0 | 2 (0.9) | .004 |
| FLT3 | 1 (4.0) | 0 | 2 (0.9) | .17 |
| NRAS | 0 | 0 | 3 (1.4) | .41 |
| MPL | 1 (4.0) | 0 | 1 (0.5) | .06 |
| NPM1 | 1 (4.0) | 1 (1.0) | 0 | .04 |
| ALK | 0 | 0 | 1 (0.5) | .75 |
| CSF1R | 0 | 0 | 1 (0.5) | .75 |
| SRC | 0 | 1 (1.0) | 0 | .30 |
| Survival outcomes | | | | |
| Cancer-specific survival | 10 (40.0) | 59 (60.2) | 125 (59.2) | .18 |
| Distant metastasis–free survival | 17 (68.0) | 76 (77.6) | 154 (73.0) | .54 |
| Locoregional recurrence–free survival | 17 (68.0) | 76 (77.6) | 174 (82.5) | .16 |

Abbreviations: CCRT, concurrent chemoradiation; IQR, interquartile range; RT, radiotherapy.


Model Development

Elastic net penalized Cox proportional hazards regression models were built using clinicopathologic and genetic features to identify the prognostic associations of the features and to calculate the survival index of each patient treated with different curative therapeutics[22,23,24] (eFigure 1 in the Supplement). To examine whether prognostic associations of the features and the distribution of survival indices indicated different prognostic survival outcomes, we built a model for predicting 3 types of outcomes: cancer-specific survival, distant metastasis–free survival, and locoregional recurrence–free survival. A repeated, nested 3-fold cross-validation was applied to tune (inner cross-validation) and evaluate (outer cross-validation) the models (eFigure 1 in the Supplement). The regularization parameter (λ) and the elastic net mixing parameter (α) were selected by inner 3-fold cross-validation on the training set. In each outer fold, the median survival index in the training set was selected to divide patients in the test set into high-risk and low-risk groups. The models were developed using R software with the glmnet package (R Foundation for Statistical Computing).[22] In addition, the performance of the elastic net penalized Cox proportional hazards regression models was compared with that of the regular Cox proportional hazards regression model to evaluate the effects of the elastic net penalty. For the regular model, we first built univariate Cox proportional hazards regression models for each clinicopathologic and genetic feature. The features associated with the outcomes (P < .05) were further used in the development of the multivariable Cox proportional hazards regression models. The median survival index was used to divide patients in the test set into high-risk and low-risk groups.
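For orientation, the elastic net penalized Cox objective can be written as follows. This is the formulation commonly given for the Cox family in glmnet, not a formula stated in this article, and the symbols x_i (feature vector), δ_i (event indicator), R(t_i) (risk set at event time t_i), and n (number of patients) are notation assumed here for illustration.

```latex
\hat{\beta} = \arg\max_{\beta}\left[
  \frac{2}{n}\sum_{i:\,\delta_i = 1}\left(
    x_i^{\top}\beta - \log\sum_{j \in R(t_i)} e^{x_j^{\top}\beta}
  \right)
  - \lambda\left(\alpha\lVert\beta\rVert_1 + \frac{1-\alpha}{2}\lVert\beta\rVert_2^2\right)
\right]
```

Setting α = 1 gives the lasso penalty and α = 0 the ridge penalty; intermediate values mix the two. The survival index of a patient is presumably the fitted linear predictor x^T β, although the article does not define it explicitly.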

Model Evaluation

For model evaluation, an outer 3-fold cross-validation was used to assess the performance of our models (eFigure 1 in the Supplement). The data were partitioned randomly into 3 sets, 1 set for testing and the other 2 sets for training. To evaluate the model stability, repeated nested cross-validation was performed 10 times for each outcome measurement. Thus, we generated 30 training and test sets to evaluate models for each type of prognostic survival and for each treatment method. Patients in the test set were classified into high-risk and low-risk groups based on their survival indices, with a threshold of median survival index in the training set. The log-rank test was used to compare the survival distributions between high-risk and low-risk groups. To evaluate the effectiveness of using comprehensive clinicopathologic and genetic features for model development, we compared the Akaike information criterion and the Harrell concordance index (C index)[25] of models built using the clinicopathologic and genetic features with those using clinicopathologic features alone and genetic features alone.
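The outer evaluation loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R code: the function names `repeated_kfold` and `stratify` are hypothetical, the inner tuning loop and model fitting are omitted, and it is assumed that a larger survival index means higher risk and that ties at the median fall into the low-risk group.

```python
import random
from statistics import median


def repeated_kfold(n_samples, k=3, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs from `repeats` random k-fold partitions."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # Deal the shuffled indices into k roughly equal folds.
        folds = [idx[i::k] for i in range(k)]
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, test


def stratify(train_indices, test_indices):
    """Label each test patient using the median survival index of the training set.

    Assumption: a higher survival index means higher risk.
    """
    cut = median(train_indices)
    return ["high" if value > cut else "low" for value in test_indices]


# 10 repeats x 3 folds = 30 train/test splits, here for the 211 CCRT patients.
splits = list(repeated_kfold(211))
```

Taking the threshold from the training set only, as the text describes, keeps the test fold untouched until the final classification.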

Feature Association Analysis

The associated prognostic clinicopathologic and genetic features were selected using elastic net penalized Cox proportional hazards regression models for 3 types of prognostic survival, and the coefficients were analyzed to evaluate the importance of the clinicopathologic and genetic features. On the basis of model development and evaluation approach (eFigure 1 in the Supplement), 30 models were built for each prognostic survival type. The number of times each feature was selected among the 30 models was used to evaluate the importance of the feature. The prognostic associations of the clinicopathologic and genetic features were defined as those of the features that were selected by more than 80% of the models (>24 of the 30 models). The hazard ratios of each feature, the exponential of the features’ coefficients, were used for comparing the association with the hazard rate of a given feature with a reference group.
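The selection-frequency rule and the hazard ratio calculation can be illustrated as below. This is a hedged sketch: the function names are hypothetical, and the feature names and coefficient values are toy placeholders invented for the example, not results from the study.

```python
import math
from collections import Counter


def selection_counts(coef_by_model):
    """Count, per feature, how many models gave it a nonzero coefficient."""
    counts = Counter()
    for coefs in coef_by_model:
        for name, beta in coefs.items():
            if beta != 0.0:
                counts[name] += 1
    return counts


def prognostic_features(coef_by_model, threshold=24):
    """Keep features selected by more than `threshold` models (>24 of 30 here)."""
    counts = selection_counts(coef_by_model)
    return sorted(name for name, c in counts.items() if c > threshold)


def hazard_ratio(beta):
    """Hazard ratio is the exponential of a Cox coefficient."""
    return math.exp(beta)


# Toy input (illustrative values only): 30 models, three hypothetical features.
models = [
    {
        "feature_a": 0.4,                      # nonzero in all 30 models
        "feature_b": 0.3 if m < 25 else 0.0,   # nonzero in 25 models
        "feature_c": 0.1 if m < 24 else 0.0,   # nonzero in exactly 24 models
    }
    for m in range(30)
]
selected = prognostic_features(models)
```

In this toy input `feature_c` is dropped because the rule is strict: it is selected by exactly 24 models, not more than 24.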

Statistical Analysis

Statistical analysis was conducted from February 1, 2018, to May 6, 2020. Analysis of variance was used for continuous data, and the Pearson χ2 test was used for categorical data. We performed repeated-measures analysis of variance with pairwise paired t test post hoc analyses and a nonparametric Friedman test with a pairwise paired Wilcoxon signed rank post hoc test on the Akaike information criterion and C index values of the models. The P values of pairwise comparisons were adjusted using the Bonferroni multiple testing correction method. All statistical tests were 2-sided, and P < .05 was considered statistically significant. All analyses were performed using R software, version 3.4.0 (R Foundation for Statistical Computing).
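As a brief illustration of the Bonferroni adjustment mentioned above, each raw P value is multiplied by the number of comparisons and capped at 1. The function name and the example P values are hypothetical; the number of pairwise comparisons actually adjusted in the study is not stated here.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted P values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Illustrative raw P values for three pairwise comparisons (not study data).
adjusted = bonferroni([0.01, 0.04, 0.5])
```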

Results

Patient Characteristics

Of 345 patients with oral squamous cell carcinoma who had clinical and next-generation sequencing data, 334 with complete data were included in the study. Of the 334 patients included in the analysis, the median age at onset was 48 years (interquartile range, 42-56 years), 315 patients (94.3%) were men, and the median follow-up duration was 55.0 months (interquartile range, 13-109 months). The Table shows the demographic, clinical, pathologic, and gene characteristics of the study population. In total, 211 patients (63.2%) underwent surgery with adjuvant CCRT, 98 (29.3%) underwent surgery with adjuvant radiotherapy, and 25 (7.5%) underwent surgery alone. Patients treated with postoperative adjuvant CCRT were more likely to have the following risk factors: extranodal extension (161 of 211 [76.3%]; P < .001), perineural invasion (122 of 211 [57.8%]; P = .001), high pathologic stage (stage IV, 184 of 211 [87.2%]; P < .001), and more total dissected lymph nodes (median, 47.0 [interquartile range, 36.0-61.0]; P < .001). The number of patients meeting cancer-specific survival outcomes was 194 (58.1%), the number of patients meeting distant metastasis–free survival outcomes was 247 (74.0%), and the number of patients meeting locoregional recurrence–free survival outcomes was 267 (79.9%).

Performance of Risk Prediction Model

The models built using clinicopathologic and genetic features successfully stratified patients who received postoperative CCRT (Figure 1; eFigure 2 in the Supplement [among 10 rounds of tests, only the first round of test results was plotted]), patients who received postoperative radiotherapy (Figure 2; eFigure 3 in the Supplement [the first round of test results]), and patients who received surgery alone (Figure 3; eFigure 4 in the Supplement [the first round of test results]) for cancer-specific survival and locoregional recurrence–free survival based on their survival indices. The number of patients and their follow-up durations in high-risk and low-risk groups for each survival outcome predicted by the models built using clinicopathologic and genetic features are shown in eTable 1 in the Supplement. The mean (SD) C indices of models for patients treated with postoperative adjuvant CCRT were 0.689 (0.050) for cancer-specific survival prediction, 0.702 (0.056) for distant metastasis–free survival prediction, and 0.693 (0.039) for locoregional recurrence–free survival prediction. For cancer-specific survival and locoregional recurrence–free survival prediction for patients treated with postoperative adjuvant CCRT, the C indices of the models built using clinicopathologic and genetic features were higher than those of the models using clinicopathologic features alone (cancer-specific survival: mean [SD] C index, 0.689 [0.050] vs 0.673 [0.051]; P = .02; locoregional recurrence–free survival: mean [SD] C index, 0.693 [0.039] vs 0.678 [0.035]; P = .004); however, the classification performance in distant metastasis–free survival was not different (mean [SD] C index, 0.702 [0.056] vs 0.688 [0.048]; P = .09) (eTable 3 in the Supplement). 
Furthermore, these models built using clinicopathologic and genetic features fit better than the models built using genetic features in cancer-specific survival and locoregional recurrence–free survival (eTable 2 in the Supplement). The elastic net penalized Cox proportional hazards regression models outperformed regular Cox proportional hazards regression models in cancer-specific survival (C index, 0.689 vs 0.616; P < .001), distant metastasis–free survival (0.702 vs 0.614; P < .001), and locoregional recurrence–free survival (0.693 vs 0.650; P = .001).
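To make the C index values above easier to interpret, the following is a minimal pure-Python sketch of the Harrell concordance index. It is a simplified illustration, not the implementation used in the study: the function name is hypothetical, pairs with tied event times are ignored, tied risk scores count as half-concordant, and a higher score is assumed to mean higher risk.

```python
def harrell_c_index(times, events, risk_scores):
    """Share of comparable pairs in which the earlier-failing patient has the higher score.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still under follow-up after that event time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustrative only): follow-up in months, 1 = event, 0 = censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
c_perfect = harrell_c_index(toy_times, toy_events, [0.9, 0.7, 0.2, 0.4])
c_partial = harrell_c_index(toy_times, toy_events, [0.9, 0.2, 0.7, 0.4])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for reported values near 0.69 to 0.70.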
Figure 1.

Kaplan-Meier Curves of Patients Who Received Postoperative Adjuvant Concurrent Chemoradiotherapy Stratified Using Elastic Net Penalized Cox Proportional Hazards Regression Models Built With Clinicopathologic and Genetic Features vs Clinicopathologic Features Alone

A, Cancer-specific survival. B, Distant metastasis–free survival. C, Locoregional recurrence–free survival. The shaded areas indicate 95% CIs.

Figure 2.

Kaplan-Meier Curves of Patients Who Received Postoperative Adjuvant Radiotherapy Stratified Using Elastic Net Penalized Cox Proportional Hazards Regression Models Built With Clinicopathologic and Genetic Features vs Clinicopathologic Features Alone

A, Cancer-specific survival. B, Distant metastasis–free survival. C, Locoregional recurrence–free survival. The shaded areas indicate 95% CIs.

Figure 3.

Kaplan-Meier Curves of Patients Who Underwent Surgery Alone Stratified Using Elastic Net Penalized Cox Proportional Hazards Regression Models Built With Clinicopathologic and Genetic Features vs Clinicopathologic Features Alone

A, Cancer-specific survival. B, Distant metastasis–free survival. C, Locoregional recurrence–free survival. The shaded areas indicate 95% CIs.


Features Associated With Prognostic Prediction

The prognostic associations of the clinicopathologic and genetic features were defined as those that were selected by more than 80% of the models (>24 of the 30 models). The essential features selected by the models for predicting prognostic survival types and their hazard ratios are shown in eFigure 5 and eTable 4 in the Supplement. Extranodal extension, positive lymph nodes on dissection, and HRAS variant were selected among the prognostic models for all types of prognostic survival measurements within the different treatment groups (eTable 4 in the Supplement).

Risk Stratification in Patients With Postoperative Adjuvant CCRT

A risk stratification result for patients who received postoperative adjuvant CCRT was visualized with the pattern of the predicted results from the 3 survival measurements (Figure 4). The risk predicted by the models using the cancer-specific survival measurement can represent the overall mortality for the patient. The risks from the other measurements can further represent the risk of cancer metastasis and local recurrence. With risk classified by the 3 survival measurements together, the patients can be further categorized into 4 subgroups: overall low-risk subgroup, heterogeneous low-risk subgroup, heterogeneous high-risk subgroup, and overall high-risk subgroup. The patients in the overall low-risk subgroup were classified as low risk in all survival measurements, and the patients in the overall high-risk subgroup were classified as high risk in all survival measurements. The patients in the heterogeneous low-risk subgroup were classified as low risk in the cancer-specific measurement but with at least 1 high-risk classification in the other measurements, which means that the patient had a relatively low mortality risk but different risks of cancer metastasis or local recurrence. Among 211 patients with postoperative adjuvant CCRT, 104 patients were classified as being in the overall high-risk subgroup and 55 patients were classified as being in the overall low-risk subgroup.
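The 4-subgroup rule can be expressed as a small function. This is a sketch under assumptions: the function name is hypothetical, and the definition of the heterogeneous high-risk subgroup (high risk for cancer-specific survival with at least 1 low-risk classification elsewhere) is inferred by symmetry with the heterogeneous low-risk definition, because the text does not state it explicitly.

```python
def risk_subgroup(css, dmfs, lrfs):
    """Combine three 'high'/'low' labels into one of four subgroups.

    css:  cancer-specific survival classification
    dmfs: distant metastasis-free survival classification
    lrfs: locoregional recurrence-free survival classification
    """
    labels = (css, dmfs, lrfs)
    if any(label not in ("high", "low") for label in labels):
        raise ValueError("labels must be 'high' or 'low'")
    if all(label == "low" for label in labels):
        return "overall low-risk"
    if all(label == "high" for label in labels):
        return "overall high-risk"
    # Mixed pattern: the cancer-specific label decides the side.
    # Assumption: the heterogeneous high-risk case mirrors the low-risk one.
    if css == "low":
        return "heterogeneous low-risk"
    return "heterogeneous high-risk"


example = risk_subgroup("low", "high", "low")
```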
Figure 4.

Risk Stratification From the Classification Result of Patients Who Underwent Surgery With Adjuvant Concurrent Chemoradiotherapy

Discussion

In this study, we aligned the clinical scenario of cancer treatment end points with the ML technique to conduct risk stratification for patients with advanced oral cavity squamous cell carcinoma. The ML model provides information on personalized risk stratification for locoregional recurrence, distant metastasis, and cancer-specific survival. Clinical physicians can intensify or deintensify follow-up durations and treatments based on risk stratifications for postoperative patients with advanced oral cancer. Our risk stratification models were trained and validated based on an East Asian population, in which the prevalence of oral cavity cancer is much higher because of culture, behavior, and socioeconomic status.[26] This work can fill the gap due to a lack of a prognosis prediction tool for the Asian population. A prognostic prediction for patients with cancer could have high accuracy when comprehensive information is used in the predictive model. Several head and neck cancer prognosis calculators, such as the Maastro Clinic, LifeMath, Leiden,[27] MyCancerJourney, Memorial Sloan Kettering, and Knight, have been published for the populations of the United States and the Netherlands.[28,29] However, these risk calculators adopted clinical information only, without pathologic and genetic features. The gene variant profile containing variant measurements for multiple genes is another promising approach to estimate the behavior of a tumor. Using the powerful analytical capability of next-generation sequencing, a parallel analysis of multiple genes is possible. 
However, cancer development is a complex interaction between tumor cells, paratumor tissues, and other host factors.[1] Thus, genetic features alone would not provide the whole picture of cancer, although gene features alone have proven useful in distinguishing between high-risk and low-risk individuals.[11] In this work, we demonstrated that an improved level of precision of risk stratification is associated with the use of comprehensive clinicopathologic and genetic data (namely, clinical, pathologic, and genetic features) in terms of the Akaike information criterion (Figure 1, Figure 2, and Figure 3; eTable 2 in the Supplement). Moreover, according to the intrinsic feature selection of elastic net penalized Cox proportional hazards regression models, only 6 of 44 tumor-related gene variant profiles were selected in risk stratification models (eFigure 5 in the Supplement). In our models, the number of genes that needed to be measured was reduced to 6, so multiplex polymerase chain reaction, rather than next-generation sequencing, would be sufficient to detect gene variants. The cost of gene tests in our models would be considerably reduced, and the models would become affordable to patients with oral cancer who had suboptimal socioeconomic status.[26] With the use of existing clinical and pathologic data, combined with the need for fewer genes to be tested, the risk stratification models can be directly integrated into the current workflow of managing postoperative patients with advanced oral cancer. Nested repeated cross-validation was used in our work, which is suitable for small data sets to provide an unbiased estimation of the performance of the prediction model and the importance of the feature.[30,31,32] Machine learning can be a powerful tool if physicians participate in the model development process to align and fulfill the clinical purpose. 
A generalizable approach and locally relevant data, but not a generalized model, are warranted for a clinically applicable ML model.[16,33] Combining the predicted risk classifications from the 3 survival measurements can provide a practical application of risk stratification (Figure 4). If a patient is classified as high risk in the predicted model based on a cancer-specific survival measurement, that may imply a more intensified need for outpatient and laboratory follow-up.[34] Furthermore, the risk classification in the predicted models of locoregional recurrence–free survival can provide additional useful information on the tailored management of radiotherapy, and the risk classification in the predicted models of distant metastasis–free survival can provide additional useful information on the tailored management of chemotherapy. For patients with postoperative adjuvant CCRT, the follow-up plan would typically be identical without personalized risk stratification. The risk stratification demonstrated in Figure 4 provided a clinically relevant approach for personalized risk assessment.

Limitations

Our work has some limitations. Our model was built and evaluated based on a relatively small, single, tertiary hospital–based retrospective cohort within an Asian population, which indicated a lack of generalizability to Western populations; model performance may differ when applied to data from other institutions. That is, the models might not be recommended for direct use in other institutions because of the high level of diversity of various factors. However, a nested, repeated, 3-fold cross-validation approach was used to minimize bias and imitate external validation. The workflow is generic and can be applied to different institutions. A prospective, multicenter trial is required to validate the utility of the risk stratification model in future studies. Although the predictive model had a relatively stable performance, some contradictory results were observed wherein the patient was simultaneously classified into high-risk and low-risk groups in different prognostic survival outcomes. In addition, the surgical procedure for advanced head and neck cancer was highly complicated and often involved reconstructive surgery, which may affect the patient’s outcome.[19]

Conclusions

In this prognostic cohort study, we developed and validated risk stratification models for postoperative patients with advanced oral cancer by using comprehensive clinicopathologic and genetic data. By aligning clinical treatment scenarios with the ML technique, the risk stratification models may indicate the prognostic risks of locoregional recurrence, distant metastasis, and cancer-specific survival. Accurate risk stratification with ML models, delivered through an online calculator, may facilitate more precise management of advanced oral cancer.