Gu-Wei Ji, Chen-Yu Jiao, Zheng-Gang Xu, Xiang-Cheng Li, Ke Wang, Xue-Hao Wang.
Abstract
BACKGROUND: Accurate prognosis assessment is essential for surgically resected intrahepatic cholangiocarcinoma (ICC), but published prognostic tools are limited by modest performance. We therefore aimed to establish a novel model to predict survival in resected ICC based on readily available clinical parameters using a machine learning technique.
Keywords: Intrahepatic cholangiocarcinoma; Machine learning; Modelling; Surgery; Survival
Year: 2022 PMID: 35277130 PMCID: PMC8915487 DOI: 10.1186/s12885-022-09352-3
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Fig. 1 Study flowchart and methodology. (A) Flow chart of the study population. (B) Pipeline to train, validate and test the gradient boosting machine. ICC, intrahepatic cholangiocarcinoma; FAHNJMU, First Affiliated Hospital of Nanjing Medical University; SEER, Surveillance, Epidemiology, and End Results; AJCC, American Joint Committee on Cancer
Comparison of demographic and clinicopathological characteristics between the training/validation and test cohorts
| Characteristics | Training/validation | Test | P value |
|---|---|---|---|
| Age, years | 60.0 (51.0–66.0) | 63.0 (55.0–70.0) | < 0.001 |
| Gender | | | < 0.001 |
| Female | 157 (39.2) | 333 (51.3) | |
| Male | 244 (60.8) | 316 (48.7) | |
| Tumor size, cm | 5.5 (3.7–7.5) | 5.5 (3.5–8.0) | 0.605 |
| Tumor number | | | < 0.001 |
| Single | 303 (75.6) | 402 (61.9) | |
| Multiple | 98 (24.4) | 140 (21.6) | |
| Unknown | 0 (0.0) | 107 (16.5) | |
| Vascular invasion | | | < 0.001 |
| Negative | 269 (67.1) | 316 (48.7) | |
| Microvascular | 49 (12.2) | 158 (24.3) | |
| Macrovascular | 83 (20.7) | 55 (8.5) | |
| Unknown | 0 (0.0) | 120 (18.5) | |
| Regional LNM | | | 0.819 |
| Absent | 323 (80.5) | 519 (80.0) | |
| Present | 78 (19.5) | 130 (20.0) | |
| Number of regional LNM | | | 0.174 |
| 0 | 323 (80.5) | 519 (80.0) | |
| 1–2 | 48 (12.0) | 96 (14.8) | |
| ≥ 3 | 30 (7.5) | 34 (5.2) | |
| Histological grade | | | < 0.001 |
| Well to moderate | 144 (35.9) | 372 (57.3) | |
| Poorly to undifferentiated | 257 (64.1) | 194 (29.9) | |
| Unknown | 0 (0.0) | 83 (12.8) | |
| Visceral peritoneum invasion | | | 0.268 |
| No | 344 (85.8) | 572 (88.1) | |
| Yes | 57 (14.2) | 77 (11.9) | |
| Direct invasion of adjacent organ | | | 0.579 |
| No | 363 (90.5) | 594 (91.5) | |
| Yes | 38 (9.5) | 55 (8.5) | |
| Fibrosis score | | | < 0.001 |
| None to moderate fibrosis | 269 (67.1) | 118 (18.2) | |
| Severe fibrosis or cirrhosis | 132 (32.9) | 31 (4.8) | |
| Unknown | 0 (0.0) | 500 (77.0) | |
| Type of surgery | | | < 0.001 |
| Wedge or segmental resection | 209 (52.1) | 213 (32.8) | |
| Lobectomy | 53 (13.2) | 241 (37.1) | |
| Extended lobectomy | 117 (29.2) | 96 (14.8) | |
| Extrahepatic bile duct resection | 22 (5.5) | 99 (15.3) | |
| Median CSS time, monthsᵃ | 29.6 (25.9–39.2) | 39.0 (35.0–44.0) | 0.011 |
Continuous variables reported as median (interquartile range) and categorical variables reported as number (percentage)
Abbreviations: LNM lymph node metastasis, CSS cancer-specific survival
†P value calculated by log-rank test
ᵃ Numbers in parentheses are 95% confidence intervals
Fig. 2 Overview of the gradient boosting machine (GBM) model. (A) Variables included in the model and their relative influence. (B) Illustrative example of the proposed GBM model, which combines the predictions of a large number of decision-tree stumps (the base learners) in a stepwise fashion. The prediction score is estimated by adding up the predictions (red numbers) attached to the terminal nodes that the patient traverses across all 2000 decision trees. (C) Performance of the GBM model compared with that of the American Joint Committee on Cancer (AJCC) staging system and the multifocality, extrahepatic extension, grade, nodal status, and age (MEGNA) prognostic score in the internal validation group. (D) Online model deployment based on the GBM prediction. LNM, lymph node metastasis
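The additive scoring described in panel B can be sketched in a few lines. This is a minimal illustration only: the `Stump` class, the feature names, the thresholds, and the leaf values are all hypothetical and are not taken from the fitted 2000-tree model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stump:
    """A depth-1 decision tree: one split and two terminal nodes."""
    feature: str      # variable the stump splits on (hypothetical names below)
    threshold: float  # go left if value <= threshold, otherwise right
    left: float       # prediction attached to the left terminal node
    right: float      # prediction attached to the right terminal node

    def predict(self, patient: dict) -> float:
        value = patient[self.feature]
        return self.left if value <= self.threshold else self.right


def prediction_score(stumps, patient, baseline=0.0):
    """Add up the terminal-node predictions of every stump the patient traverses."""
    return baseline + sum(stump.predict(patient) for stump in stumps)


# Illustrative stumps only; a real GBM would learn many of these from data.
stumps = [
    Stump("tumor_size_cm", 5.0, -0.25, 0.25),
    Stump("lnm_count", 0.0, -0.5, 0.5),
    Stump("age_years", 60.0, -0.125, 0.125),
]
patient = {"tumor_size_cm": 6.5, "lnm_count": 0, "age_years": 63}
score = prediction_score(stumps, patient)
print(score)
```

Each stump contributes one small number, so the final score is simply the sum of those contributions; in a fitted model the leaf values would already include the learning-rate shrinkage.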
Performance of proposed and existing prognostic tools for ICC
| Prognostic tools | C-statistic (95% CI) | P value |
|---|---|---|
| Training/validation cohort ( | ||
| GBM model | 0.751 (0.717–0.784) | ref |
| AJCC 8th edition | 0.673 (0.637–0.708) | < 0.001 |
| MEGNA prognostic score | 0.674 (0.638–0.710) | < 0.001 |
| Test cohort ( | ||
| GBM model | 0.723 (0.697–0.749) | ref |
| AJCC 8th edition | 0.636 (0.608–0.664) | < 0.001 |
| MEGNA prognostic scoreᵃ | 0.617 (0.582–0.651) | < 0.001 |
Abbreviations: ICC intrahepatic cholangiocarcinoma, CI confidence intervals, GBM gradient boosting machine, AJCC American Joint Committee on Cancer, MEGNA multifocality, extrahepatic extension, grade, nodal status, and age, FAHNJMU First Affiliated Hospital of Nanjing Medical University, SEER Surveillance, Epidemiology, and End Results
ᵃ Available at baseline (467/649) and compared with the GBM model in the corresponding sub-cohort
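For readers unfamiliar with the C-statistic reported above, the following is a minimal sketch assuming Harrell's definition for right-censored data (the record does not state which estimator was used). The function name and the toy data are illustrative, not from the study.

```python
def concordance_index(times, events, scores):
    """Harrell's C for right-censored data; a higher score means higher predicted risk.

    A pair (i, j) is comparable when subject i had the event strictly before
    subject j's follow-up time. Ties in score count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in months, event flag (1 = cancer death), risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.1]
c_stat = concordance_index(times, events, scores)
print(round(c_stat, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the tabulated values should be read.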
Fig. 3 Calibration and clinical utility of the gradient boosting machine (GBM) model. Calibration curves of predicted compared with observed CSS probability at 2 and 5 years in the training/validation (A) and test (B) cohorts. Decision curve analysis comparing the model with other strategies for predicting 2- and 5-year CSS in the training/validation (C) and test (D) cohorts. The y-axis measures the net benefit at a given threshold probability, which is estimated by summing the benefits (true-positive results) and subtracting the harms (false-positive results), weighting the latter by a factor related to the relative harm of an undetected disease compared with the harm of unnecessary treatment. The gray line represents the treat-all strategy (assuming all die of this disease), and the black line represents the treat-none strategy (assuming none die of this disease). The GBM-based model provided greater net benefits than the other strategies across the majority of threshold probabilities. CSS, cancer-specific survival
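The net-benefit calculation described in the caption can be sketched as follows. This assumes the standard decision-curve formula, net benefit = TP/n − FP/n × pt/(1 − pt), and a simplified setting in which every patient's outcome at the time horizon is known (no censoring), which is not how survival data are usually handled. The function name and the toy data are illustrative.

```python
def net_benefit(predicted_risk, outcome, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold.

    outcome is 1 if the event occurred by the time horizon, else 0.
    False positives are weighted by threshold / (1 - threshold).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    pairs = list(zip(predicted_risk, outcome))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)
    return true_pos / n - weight * false_pos / n


# Toy data: five patients with predicted risks and observed outcomes.
risks = [0.9, 0.8, 0.3, 0.2, 0.6]
outcomes = [1, 1, 0, 0, 0]

model_nb = net_benefit(risks, outcomes, threshold=0.5)
# Treat-all is the same calculation with everyone classified as high risk.
treat_all_nb = net_benefit([1.0] * len(outcomes), outcomes, threshold=0.5)
treat_none_nb = 0.0  # nobody treated: no true or false positives
print(model_nb, treat_all_nb, treat_none_nb)
```

Repeating the calculation over a grid of thresholds and plotting the three values against the threshold gives a decision curve of the kind shown in panels C and D.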
Fig. 4 Kaplan–Meier curves demonstrating the differences in cancer-specific survival among low-, intermediate-, and high-risk patients. Survival disparities among different risk groups in the training/validation cohort (A), the test cohort (B), and sub-cohorts stratified by American Joint Committee on Cancer (AJCC) stage (C–E)
Cancer-specific survival according to risk stratification
| Risk group | Median time, months | 2-year rate, % | 5-year rate, % | Hazard ratio | P value |
|---|---|---|---|---|---|
| Training/validation cohort ( | |||||
| Low-risk ( | 74.6 (58.3–NA) | 81.6 (76.0–87.6) | 58.1 (49.1–68.6) | 1 | |
| Intermediate-risk ( | 19.0 (16.6–23.7) | 39.9 (32.4–49.1) | 10.3 (4.8–22.2) | 3.901 (2.826–5.384) | < 0.001* |
| High-risk ( | 7.0 (4.4–9.9) | NA | NA | 2.794 (1.606–4.863) | < 0.001† |
| Test cohort ( | |||||
| Low-risk ( | 73.0 (60.0–89.0) | 82.5 (78.6–86.7) | 54.1 (48.4–60.4) | 1 | |
| Intermediate-risk ( | 28.0 (24.0–33.0) | 55.5 (49.6–62.1) | 18.5 (13.3–25.8) | 2.496 (1.980–3.146) | < 0.001* |
| High-risk ( | 11.0 (7.0–13.0) | 7.8 (3.0–19.9) | 0.0 (NA) | 3.509 (2.149–5.728) | < 0.001† |
Abbreviations: CI confidence intervals, NA not applicable
*P value versus low-risk; †P value versus intermediate-risk
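Median survival times and fixed-time survival rates such as those tabulated above are typically read off a Kaplan–Meier curve. The sketch below shows the product-limit estimate under that assumption; the function names and the toy data are illustrative and unrelated to the study cohorts.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: returns [(t, S(t))] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


def median_survival(curve):
    """First time at which S(t) falls to 0.5 or below; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data: follow-up in months and event flag (1 = cancer death, 0 = censored).
times = [6, 7, 10, 15, 19, 25]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
median = median_survival(curve)
print(curve, median)
```

When the curve never drops to 0.5, the median is undefined, which is one reason a table may report NA for a median or for one of its confidence limits.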