Dong-Lin Li1, Lin Zhang2, Hao-Ji Yan3,4, Yin-Bin Zheng5, Xiao-Guang Guo6, Sheng-Jie Tang1, Hai-Yang Hu1, Hang Yan1, Chao Qin1, Jun Zhang1, Hai-Yang Guo1, Hai-Ning Zhou1, Dong Tian2,3,4.
Abstract
Background: For patients with stage T1-T2 esophageal squamous cell carcinoma (ESCC), accurately predicting lymph node metastasis (LNM) remains challenging. We aimed to investigate the performance of machine learning (ML) models for predicting LNM in patients with stage T1-T2 ESCC.
Keywords: esophageal squamous cell carcinoma; lymph node metastasis; machine learning; predictive model; stage T1-T2
Year: 2022 PMID: 36158684 PMCID: PMC9496653 DOI: 10.3389/fonc.2022.986358
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. Flowchart of patient inclusion and exclusion. ESCC, esophageal squamous cell carcinoma.
Clinical characteristics of patients with T1-T2 stage esophageal squamous cell carcinoma.
| Characteristics | Training set | External test set |
|---|---|---|
| Age (years), median (range) | 65 (41-85) | 64 (40-80) |
| Sex | ||
| Male | 655 (69.5%) | 117 (75.5%) |
| Female | 287 (30.5%) | 38 (24.5%) |
| BMI | 22.9 ± 4.1 | 21.7 ± 2.5 |
| History of surgery | ||
| Yes | 290 (30.8%) | 40 (25.8%) |
| No | 652 (69.2%) | 115 (74.2%) |
| Tumor location | ||
| Upper | 131 (13.9%) | 21 (13.5%) |
| Middle | 606 (64.3%) | 78 (50.3%) |
| Lower | 205 (21.8%) | 56 (36.2%) |
| Preoperative complications | ||
| Yes | 525 (55.7%) | 114 (73.5%) |
| No | 417 (44.3%) | 41 (26.5%) |
| Endoscopic tumor length (cm) | 3.8 ± 2.0 | 3.7 ± 1.9 |
| Tumor size (cm) | 2.8 ± 1.4 | 3.3 ± 1.6 |
| Tumor differentiation | ||
| G1 | 322 (34.2%) | 28 (18.1%) |
| G2 | 530 (56.3%) | 95 (61.3%) |
| G3 | 90 (9.5%) | 32 (20.6%) |
| T stage† | ||
| T1a | 154 (16.4%) | 13 (8.4%) |
| T1b | 294 (31.2%) | 58 (37.4%) |
| T2 | 494 (52.4%) | 84 (54.2%) |
BMI, body mass index. †Staged according to the 8th edition of the UICC and AJCC cancer staging system.
Figure 2. Performance of 36 machine learning models. The heatmap shows the area under the curve for each machine learning algorithm (columns) combined with each feature selection method (rows). KNN, k-nearest neighbours; NB, naive Bayes; SVM, support vector machine; GBRM, generalized boosted regression modeling; RF, random forest; XGB, extreme gradient boosting machine; LASSO, least absolute shrinkage and selection operator; RFE, recursive feature elimination; DC, determination coefficient.
Performance of the optimal machine learning model and the T stage.
| Model/factor | External validation data | AUC (95% CI) | Sensitivity | Specificity | NPV | PPV |
|---|---|---|---|---|---|---|
| NB model | T1-T2 stage ESCC | 0.752 (0.674-0.829) | 0.787 | 0.638 | 0.822 | 0.585 |
| T stage | T1-T2 stage ESCC | 0.624 (0.547-0.701) | 0.672 | 0.543 | 0.718 | 0.488 |
| NB model | T1 stage ESCC | 0.789 (0.669-0.901) | 0.700 | 0.765 | 0.867 | 0.538 |
| NB model | T2 stage ESCC | 0.704 (0.590-0.818) | 0.732 | 0.628 | 0.711 | 0.652 |
NB, Naive Bayes; ESCC, esophageal squamous cell carcinoma; AUC, the area under the curve; CI, confidence interval; NPV, negative predictive value; PPV, positive predictive value.
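The four threshold-based metrics in the table above all follow from a single 2×2 confusion matrix. The sketch below shows the arithmetic. The counts are an assumption: this record does not report them, and they were back-calculated as one set of integers consistent with the NB model's reported metrics and the 155-patient external test set. The names `ConfusionCounts` and `diagnostic_metrics` are illustrative, not from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 confusion matrix for LNM prediction (positive = LNM present)."""
    tp: int  # LNM present, predicted present
    fn: int  # LNM present, predicted absent
    fp: int  # LNM absent, predicted present
    tn: int  # LNM absent, predicted absent


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV, and NPV as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true LNM cases detected
        "specificity": c.tn / (c.tn + c.fp),  # share of LNM-free cases cleared
        "ppv": c.tp / (c.tp + c.fp),          # predicted positives that are correct
        "npv": c.tn / (c.tn + c.fn),          # predicted negatives that are correct
    }


# ASSUMED counts (not reported in the source): inferred so that the
# rounded metrics match the NB model row for T1-T2 stage ESCC.
nb_counts = ConfusionCounts(tp=48, fn=13, fp=34, tn=60)
nb_metrics = {name: round(value, 3)
              for name, value in diagnostic_metrics(nb_counts).items()}
print(nb_metrics)
```

Under these assumed counts the rounded values match the NB row (sensitivity 0.787, specificity 0.638, PPV 0.585, NPV 0.822). Other counts could in principle give the same rounded figures, so the matrix is a consistency check and not the study's actual data.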
Figure 3. The receiver operating characteristic curve of the optimal machine learning model. The optimal machine learning model performed well in predicting LNM for patients with T1-T2, T1, and T2 stage esophageal squamous cell carcinoma (AUC 0.752, 0.789, and 0.704, respectively). The 95% confidence intervals are shown in parentheses. AUC, the area under the curve.
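To make the AUC concrete: it equals the probability that a randomly chosen patient with LNM receives a higher predicted value than a randomly chosen patient without LNM, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The scores are hypothetical and are not the study's data, `auc_from_scores` is an illustrative name, and the method behind the reported 95% confidence intervals is not stated in this record, so it is not reproduced here.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    pos_scores: predicted values for patients with LNM
    neg_scores: predicted values for patients without LNM
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# HYPOTHETICAL predicted values, for illustration only.
lnm_positive = [0.9, 0.8, 0.6, 0.4]
lnm_negative = [0.7, 0.5, 0.3, 0.2]
example_auc = auc_from_scores(lnm_positive, lnm_negative)
print(example_auc)  # 13 of 16 pairs are ordered correctly -> 0.8125
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a test set of this size. A rank-based formula gives the same value more efficiently for larger cohorts.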
Figure 4. Predicted values for patients with T1-T2 (A), T1 (B), and T2 (C) stage esophageal squamous cell carcinoma. The predicted values of the naive Bayes model clearly separated patients by lymph node status in the T1-T2, T1, and T2 stage groups. LNM, lymph node metastasis.