Xiawei Li, Litao Yang, Zheping Yuan, Jianyao Lou, Yiqun Fan, Aiguang Shi, Junjie Huang, Mingchen Zhao, Yulian Wu.
Abstract
BACKGROUND: Surgical resection is the only potentially curative treatment for pancreatic ductal adenocarcinoma (PDAC), and the survival of patients after radical resection is closely related to relapse. We aimed to develop models to predict the risk of relapse using machine learning methods based on multiple clinical parameters.
Keywords: Machine learning; PDAC; Prediction model; Radical surgery; Relapse
Year: 2021 PMID: 34193166 PMCID: PMC8243478 DOI: 10.1186/s12967-021-02955-7
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Characteristics of the study population in training set and validation set
| Variables | Category or statistic | Training (n = 183) | Validation (n = 79) | P value |
|---|---|---|---|---|
| Age (years) | Median (q1–q3) | 63.0 (56.0–70.0) | 63.0 (59.0–67.5) | 0.881 |
| Gender | Male (%) | 115 (62.8) | 43 (54.4) | 0.255 |
| | Female (%) | 68 (37.2) | 36 (45.6) | |
| BMI (kg/m²) | Median (q1–q3) | 22.4 (20.3–23.9) | 21.8 (19.9–24.1) | 0.353 |
| CEA (ng/mL) | < 5 (%) | 140 (76.5) | 51 (64.6) | 0.065 |
| | ≥ 5 (%) | 43 (23.5) | 28 (35.4) | |
| CA199 (U/mL) | < 37 (%) | 49 (26.8) | 16 (20.3) | 0.334 |
| | ≥ 37 (%) | 134 (73.2) | 63 (79.7) | |
| CA125 (U/mL) | < 35 (%) | 147 (80.3) | 68 (86.1) | 0.349 |
| | ≥ 35 (%) | 36 (19.7) | 11 (13.9) | |
| WBC (×10^9/L) | Median (q1–q3) | 6.0 (4.8–7.3) | 5.7 (4.4–6.8) | 0.220 |
| Hb (g/L) | Median (q1–q3) | 127.0 (116.0–140.0) | 129.0 (120.0–142.0) | 0.110 |
| Plt (×10^9/L) | Median (q1–q3) | 191.0 (154.5–232.0) | 204.0 (164.0–265.5) | 0.132 |
| Neut (×10^9/L) | Median (q1–q3) | 3.9 (2.9–4.8) | 3.4 (2.4–4.6) | 0.093 |
| Lymph (×10^9/L) | Median (q1–q3) | 1.4 (1.1–1.7) | 1.5 (1.1–1.9) | 0.275 |
| Mono (×10^9/L) | Median (q1–q3) | 0.5 (0.4–0.6) | 0.4 (0.3–0.6) | 0.295 |
| Alb (g/L) | Median (q1–q3) | 40.4 (37.3–43.3) | 40.4 (36.4–43.4) | 0.622 |
| Glb (g/L) | Median (q1–q3) | 26.7 (24.3–29.5) | 28.2 (26.1–31.9) | 0.002 |
| AGR | Median (q1–q3) | 1.5 (1.4–1.7) | 1.4 (1.2–1.6) | 0.005 |
| NLR | Median (q1–q3) | 2.7 (2.0–4.2) | 2.4 (1.6–3.3) | 0.153 |
| LMR | Median (q1–q3) | 3.0 (2.1–4.2) | 3.5 (2.0–5.1) | 0.113 |
| PLR | Median (q1–q3) | 137.1 (106.3–184.1) | 140.0 (96.6–217.2) | 0.528 |
| AST (U/L) | Median (q1–q3) | 44.0 (20.5–111.5) | 31 (20–90.5) | 0.448 |
| ALT (U/L) | Median (q1–q3) | 50.0 (17.0–200.5) | 29.0 (17.0–139.0) | 0.119 |
| ALP (U/L) | Median (q1–q3) | 156.0 (89.5–391.5) | 105.0 (67.5–367.0) | 0.257 |
| GGT (U/L) | Median (q1–q3) | 159.0 (25.0–704.0) | 51.0 (20.0–494.0) | 0.162 |
| TB (μmol/L) | Median (q1–q3) | 22.1 (12.0–177.3) | 14.6 (9.4–134.1) | 0.716 |
| DB (μmol/L) | Median (q1–q3) | 6.4 (2.6–105.2) | 5.5 (3.2–109.2) | 0.360 |
| Location | Head-isthmus (%) | 139 (76.0) | 54 (68.4) | 0.259 |
| | Body-tail (%) | 44 (24.0) | 25 (31.6) | |
| Margin | R0 (%) | 176 (96.2) | 79 (100.0) | 0.106 |
| | R1 (%) | 7 (3.8) | 0 (0.0) | |
| T stage | 1 (%) | 43 (23.5) | 2 (2.5) | < 0.001 |
| | 2 (%) | 86 (47.0) | 55 (69.6) | |
| | 3 (%) | 54 (29.5) | 22 (27.8) | |
| N stage | 0 (%) | 105 (57.4) | 36 (45.6) | 0.009 |
| | 1 (%) | 56 (30.6) | 39 (49.4) | |
| | 2 (%) | 22 (12.0) | 4 (5.1) | |
| VI | Yes (%) | 83 (45.4) | 20 (25.3) | 0.004 |
| | No (%) | 100 (54.6) | 59 (74.7) | |
| PI | Yes (%) | 143 (78.1) | 67 (84.8) | 0.283 |
| | No (%) | 40 (21.9) | 12 (15.2) | |
| ATI | Yes (%) | 82 (44.8) | 43 (54.4) | 0.195 |
| | No (%) | 101 (55.2) | 36 (45.6) | |
| Differentiation | Well (%) | 37 (20.2) | 5 (6.3) | 0.019 |
| | Moderate (%) | 133 (72.7) | 68 (86.1) | |
| | Poor or undifferentiated (%) | 13 (7.1) | 6 (7.6) | |
| OS | Median (q1–q3) | 19.0 (11.0–33.0) | 14.0 (9.0–28.0) | 0.340 |
| RFS | Median (q1–q3) | 11.0 (6.0–22.8) | 8.0 (4.0–19.5) | 0.503 |
| 1-year relapse | Yes (%) | 106 (57.9) | 49 (62.0) | 0.629 |
| | No (%) | 77 (42.1) | 30 (38.0) | |
| 2-year relapse | Yes (%) | 138 (75.4) | 59 (74.7) | 1.000 |
| | No (%) | 45 (24.6) | 20 (25.3) | |
BMI body mass index, CEA carcinoembryonic antigen, CA cancer antigen, WBC white blood cell, Hb hemoglobin, Plt platelet, Neut neutrophil, Lymph lymphocyte, Mono monocyte, Alb albumin, Glb globulin, AGR albumin-globulin ratio, NLR neutrophil-lymphocyte ratio, LMR lymphocyte-monocyte ratio, PLR platelet-lymphocyte ratio, AST aspartate transaminase, ALT alanine transaminase, ALP alkaline phosphatase, GGT gamma-glutamyltransferase, TB total bilirubin, DB direct bilirubin, VI vascular invasion, PI perineural invasion, ATI adipose tissue invasion, OS overall survival, RFS relapse-free survival
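The record does not say which test produced the values in the rightmost column of the table above. As a hedged illustration only, the sketch below assumes a Pearson chi-square test with Yates continuity correction for the two-category variables and recomputes it from the tabulated counts. The function name `yates_chi2_p` is hypothetical; only the Python standard library is used.

```python
import math


def yates_chi2_p(a: int, b: int, c: int, d: int) -> float:
    """P value of a Yates-corrected chi-square test on the 2x2 table [[a, b], [c, d]].

    Assumption: this is one plausible test for the 2x2 rows; the source
    does not name the test it used.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Continuity correction, capped so it never pushes past zero.
            diff = max(0.0, abs(observed[i][j] - expected) - 0.5)
            chi2 += diff * diff / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Rows = cohort (training, validation); columns = the two categories.
gender = yates_chi2_p(115, 68, 43, 36)       # male / female
relapse_1y = yates_chi2_p(106, 77, 49, 30)   # 1-year relapse yes / no
relapse_2y = yates_chi2_p(138, 45, 59, 20)   # 2-year relapse yes / no
print(f"gender={gender:.3f} 1y={relapse_1y:.3f} 2y={relapse_2y:.3f}")
```

Under this assumption the three results land close to the tabulated 0.255, 0.629 and 1.000, which is consistent with, but does not prove, a continuity-corrected chi-square test for these rows. Variables with more than two categories and the continuous variables would need other tests and are not covered here.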
Fig. 1 Relative importance of variables on models to predict 1-year relapse. Interpretation: N2 = N stage 1, N3 = N stage 2; grade 2 = moderate differentiation, grade 3 = poor differentiation or undifferentiated; ATI1 = with adipose tissue invasion; VI1 = with vascular invasion; CA1991 = CA 199 ≥ 37 U/mL
Fig. 2 Relative importance of variables on models to predict 2-year relapse. Interpretation: VI1 = with vascular invasion; N2 = N stage 1, N3 = N stage 2; Mono = monocyte; Alb = albumin; AGR = albumin-globulin ratio; CA1991 = CA 199 ≥ 37 U/mL
Fig. 3 Comparison of ROC curves and AUROC of different models to predict 1- and 2-year relapse in the training and validation sets (1-year relapse: training set: A, validation set: B, comparison of AUROC in validation set: C; 2-year relapse: training set: D, validation set: E, comparison of AUROC in validation set: F)
Performance comparison of different models to predict 1-year relapse in the validation set
| Model | AUC | 95% CI lower | 95% CI upper | Sensitivity | Specificity | Accuracy | PPV | NPV | F1 | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.708 | 0.579 | 0.823 | 0.878 | 0.400 | 0.696 | 0.705 | 0.667 | 0.782 | 0.448 |
| RF | 0.653 | 0.519 | 0.782 | 0.837 | 0.400 | 0.671 | 0.695 | 0.600 | 0.759 | 0.501 |
| SVM | 0.733 | 0.603 | 0.840 | 0.857 | 0.467 | 0.709 | 0.724 | 0.667 | 0.785 | 0.445 |
| GBM | 0.560 | 0.416 | 0.708 | 0.776 | 0.367 | 0.620 | 0.667 | 0.500 | 0.717 | 0.509 |
| NN | 0.720 | 0.604 | 0.836 | 0.878 | 0.400 | 0.696 | 0.705 | 0.667 | 0.782 | 0.448 |
| KNN | 0.600 | 0.460 | 0.740 | 0.837 | 0.467 | 0.696 | 0.719 | 0.636 | 0.774 | 0.496 |
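The threshold-based columns in the table above (sensitivity, specificity, accuracy, PPV, NPV, F1) all derive from a single confusion matrix. The sketch below shows the standard definitions, using counts for the LR row that are inferred rather than reported: they assume the 49 relapsed and 30 relapse-free validation patients from the baseline table, and the helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard confusion-matrix metrics, with relapse as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),              # relapses correctly flagged
        "specificity": tn / (tn + fp),              # non-relapses correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),                      # precision
        "npv": tn / (tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),          # harmonic mean of PPV and sensitivity
    }


# Inferred counts (an assumption, not stated in the source) for the LR row:
# 43 of 49 relapses detected, 12 of 30 non-relapses correctly classified.
lr = binary_metrics(tp=43, fn=6, tn=12, fp=18)
for name, value in lr.items():
    print(f"{name}: {value:.3f}")
```

With these counts the printed values round to 0.878, 0.400, 0.696, 0.705, 0.667 and 0.782, matching the LR row. AUC and RMSE depend on the predicted probabilities and cannot be recovered from the confusion matrix alone.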
Performance comparison of different models to predict 2-year relapse in the validation set
| Model | AUC | 95% CI lower | 95% CI upper | Sensitivity | Specificity | Accuracy | PPV | NPV | F1 | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.625 | 0.482 | 0.760 | 0.847 | 0.350 | 0.722 | 0.794 | 0.438 | 0.820 | 0.467 |
| RF | 0.655 | 0.518 | 0.784 | 0.898 | 0.250 | 0.734 | 0.779 | 0.455 | 0.835 | 0.431 |
| SVM | 0.597 | 0.460 | 0.731 | 0.831 | 0.200 | 0.671 | 0.754 | 0.286 | 0.790 | 0.471 |
| GBM | 0.652 | 0.517 | 0.776 | 0.831 | 0.250 | 0.684 | 0.766 | 0.333 | 0.797 | 0.463 |
| NN | 0.608 | 0.464 | 0.747 | 0.831 | 0.200 | 0.671 | 0.754 | 0.286 | 0.790 | 0.450 |
| KNN | 0.689 | 0.558 | 0.817 | 0.915 | 0.200 | 0.734 | 0.771 | 0.444 | 0.837 | 0.416 |
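Because the validation set is small, the rounded rates in the table above can be mapped back to whole-patient counts. The sketch below is a consistency check under stated assumptions: it takes the 59 relapsed and 20 relapse-free patients at 2 years from the baseline table, rounds sensitivity and specificity to the nearest integer count, and recomputes accuracy for comparison with the reported column. The names `ROWS` and `recover_counts` are hypothetical.

```python
N_POS, N_NEG = 59, 20  # 2-year relapse yes / no in the validation set

# (model, sensitivity, specificity, reported accuracy) copied from the table.
ROWS = [
    ("LR", 0.847, 0.350, 0.722),
    ("RF", 0.898, 0.250, 0.734),
    ("SVM", 0.831, 0.200, 0.671),
    ("GBM", 0.831, 0.250, 0.684),
    ("NN", 0.831, 0.200, 0.671),
    ("KNN", 0.915, 0.200, 0.734),
]


def recover_counts(sensitivity: float, specificity: float) -> tuple:
    """Nearest whole-patient true-positive and true-negative counts."""
    tp = round(sensitivity * N_POS)
    tn = round(specificity * N_NEG)
    return tp, tn


recomputed = {}
for model, sens, spec, reported_acc in ROWS:
    tp, tn = recover_counts(sens, spec)
    recomputed[model] = (tp + tn) / (N_POS + N_NEG)
    print(f"{model}: TP={tp} TN={tn} accuracy={recomputed[model]:.3f} "
          f"(reported {reported_acc:.3f})")
```

For every row the recomputed accuracy agrees with the reported value to three decimals, which suggests the columns are internally consistent with these class totals. The recovered counts remain an inference and are not figures given in the source.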