Chenyan Guo, Jue Wang, Yongming Wang, Xinyu Qu, Zhiwen Shi, Yan Meng, Junjun Qiu, Keqin Hua.
Abstract
BACKGROUND: Machine learning (ML) has been gradually integrated into oncologic research but seldom applied to predict cervical cancer (CC), and no model has been reported to predict survival and site-specific recurrence simultaneously. Thus, we aimed to develop ML models to predict survival and site-specific recurrence in CC and to guide individual surveillance.
Keywords: Artificial intelligence; Cervical cancer; Machine learning; Survival prediction
Year: 2021 PMID: 33618238 PMCID: PMC7907920 DOI: 10.1016/j.tranon.2021.101032
Source DB: PubMed Journal: Transl Oncol ISSN: 1936-5233 Impact factor: 4.243
Multivariate Cox analysis: factors associated with recurrence-free survival and overall survival in stage IA1 (LVSI) to IIB2 cervical cancer patients.
| Characteristics | No. | Multivariate RFS, HR (95%CI) | Multivariate OS, HR (95%CI) |
|---|---|---|---|
| FIGO stage, P value | | <0.001 | <0.001 |
| ⅠA1 (LVSI) | 27 | 1 | 1 |
| ⅠA2 | 136 | 0.502 [0.081,3.105] | 1.005 [0.136,7.347] |
| ⅠB1 | 3202 | 0.633 [0.196,2.044] | 0.735 [0.176,3.073] |
| ⅠB2 | 605 | 1.019 [0.312,3.326] | 1.678 [0.4,7.037] |
| ⅡA1 | 734 | 1.137 [0.349,3.708] | 1.347 [0.319,5.682] |
| ⅡA2 | 338 | 1.506 [0.458,4.949] | 1.828 [0.431,7.759] |
| ⅡB1 | 39 | 1.062 [0.249,4.530] | 1.299 [0.233,7.246] |
| ⅡB2 | 31 | 1.444 [0.368,5.667] | 2.133 [0.425,10.71] |
| Adjuvant treatment, P value | | 0.009 | 0.041 |
| No | 2123 | 1 | 1 |
| Yes | 2989 | 1.51 [1.109,2.055] | 1.44 [1.016,2.041] |
| Tumor size, cm, P value | | <0.001 | <0.001 |
| <0.5 | 975 | 1 | 1 |
| [0.5,1) | 96 | 0.517 [0.188,1.423] | 0.59 [0.188,1.856] |
| [1,1.5) | 338 | 0.458 [0.217,0.968] | 0.373 [0.157,0.89] |
| [1.5,2) | 428 | 1.873 [1.053,3.332] | 1.701 [0.867,3.337] |
| [2,2.5) | 422 | 0.886 [0.448,1.753] | 0.552 [0.233,1.309] |
| [2.5,3) | 518 | 0.956 [0.526,1.738] | 0.862 [0.432,1.719] |
| [3,3.5) | 727 | 0.884 [0.481,1.625] | 0.838 [0.416,1.687] |
| [3.5,4) | 623 | 1.141 [0.637,2.042] | 1.054 [0.537,2.071] |
| [4,4.5) | 375 | 0.9 [0.439,1.844] | 0.723 [0.31,1.685] |
| [4.5,5) | 188 | 1.33 [0.684,2.588] | 1.257 [0.589,2.682] |
| ≥5 | 422 | 1.808 [0.948,3.448] | 1.488 [0.696,3.18] |
| Histology, P value | | <0.001 | <0.001 |
| SCC | 4179 | 1 | 1 |
| AC | 576 | 1.706 [1.241,2.345] | 1.921 [1.351,2.733] |
| AS | 281 | 1.69 [1.14,2.506] | 1.949 [1.27,2.991] |
| Rare type | 76 | 2.395 [1.341,4.277] | 2.134 [1.062,4.29] |
| DSI, P value | | 0.006 | 0.001 |
| Negative | 1219 | 1 | 1 |
| <2/3 | 1542 | 1.007 [0.627,1.618] | 1.435 [0.771,2.668] |
| ≥2/3 | 2351 | 1.609 [1.032,2.507] | 2.414 [1.336,4.36] |
| Parametrial involvement, P value | | <0.001 | <0.001 |
| No | 4857 | 1 | 1 |
| Yes | 255 | 2.034 [1.509,2.742] | 1.851 [1.324,2.59] |
| LN metastasis, P value | | <0.001 | <0.001 |
| No | 4112 | 1 | 1 |
| Pelvic LNs | 710 | 1.414 [1.066,1.876] | 1.513 [1.096,2.089] |
| Common iliac LNs | 248 | 3.078 [2.263,4.186] | 3.5 [2.494,4.911] |
| Para-aortic LNs | 42 | 4.503 [2.523,8.04] | 6.543 [3.485,12.285] |
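A hazard ratio and its 95% CI together determine the Wald test behind the reported P value, which gives a quick consistency check on any row of the table. The sketch below assumes the intervals are symmetric Wald intervals on the log-hazard scale (the source does not state how they were computed); the function name `wald_from_hr` is hypothetical. It is applied to the two-level variable whose "Yes" row reports 1.51 [1.109,2.055] for RFS.

```python
import math

def wald_from_hr(hr, lower, upper, z_crit=1.959964):
    """Recover log-HR, its standard error, the Wald z statistic and a
    two-sided p-value from a hazard ratio and its 95% CI.

    Assumption: the CI is exp(beta +/- z_crit * se), i.e. symmetric on the log scale.
    """
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p

# RFS column, "Yes" row with HR 1.51 and CI [1.109, 2.055]
beta, se, z, p = wald_from_hr(1.51, 1.109, 2.055)
print(f"log-HR={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

For this row the recovered p-value is close to 0.009, in line with the value the table lists for that variable. For variables with more than two levels the table's P value is an overall test, so a single row cannot reproduce it.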
Fig. 1 Study schema for survival and recurrence analysis.
Baseline characteristics of stage ⅠA1 (LVSI) to ⅡB2 cervical cancer patients.
| Characteristics | Value |
|---|---|
| Age, years | 47.7 (± 9.6) |
| FIGO stage | |
| ⅠA1 (LVSI) | 27 (0.5%) |
| ⅠA2 | 136 (2.7%) |
| ⅠB1 | 3202 (62.6%) |
| ⅠB2 | 605 (11.8%) |
| ⅡA1 | 734 (14.4%) |
| ⅡA2 | 338 (6.6%) |
| ⅡB1 | 39 (0.8%) |
| ⅡB2 | 31 (0.6%) |
| Comorbidity | |
| Yes | 768 (15%) |
| No | 4344 (85%) |
| HPV infection | |
| Yes | 1963 (38.4%) |
| No | 594 (11.6%) |
| Unknown | 2555 (50%) |
| Adjuvant treatment | |
| Yes | 2989 (58.5%) |
| No | 2123 (41.5%) |
| Surgery approach | |
| MH | 4040 (79%) |
| LMH | 3799 (74.3%) |
| RMH | 236 (4.6%) |
| Trans-vaginal | 5 (0.1%) |
| OH | 1072 (21%) |
| Operative time, min | 213.5 (165, 251) |
| Blood loss, ml | 335.9 (150,400) |
| Transfusion | |
| Yes | 369 (7.2%) |
| No | 4743 (92.8%) |
| LEEP | |
| Yes | 982 (19.2%) |
| No | 4130 (80.8%) |
| Tumor size, cm | |
| <0.5 | 975 (19.1%) |
| [0.5,1) | 96 (1.9%) |
| [1,1.5) | 338 (6.6%) |
| [1.5,2) | 428 (8.4%) |
| [2,2.5) | 422 (8.3%) |
| [2.5,3) | 518 (10.1%) |
| [3,3.5) | 727 (14.2%) |
| [3.5,4) | 623 (12.2%) |
| [4,4.5) | 375 (7.3%) |
| [4.5,5) | 188 (3.7%) |
| ≥5 | 422 (8.3%) |
| Histology | |
| SCC | 4179 (81.7%) |
| AC | 576 (11.3%) |
| AS | 281 (5.5%) |
| Rare type | 76 (1.5%) |
| DSI | |
| Negative | 1219 (23.8%) |
| Inner 2/3 | 1542 (30.2%) |
| Outer 1/3 | 2351 (46%) |
| LVSI | |
| Yes | 2129 (41.6%) |
| No | 2983 (58.4%) |
| Surgical margin | |
| Yes | 399 (7.8%) |
| No | 4713 (92.2%) |
| Parametrial involvement | |
| Yes | 255 (5%) |
| No | 4857 (95%) |
| LN metastasis | |
| Yes | 1000 (19.6%) |
| Pelvic LNs | 710 (13.9%) |
| Common iliac LNs | 248 (4.9%) |
| Para-aortic LNs | 42 (0.8%) |
| No | 4112 (80.4%) |
| Keratinization | |
| Yes | 1168 (22.8%) |
| No | 2012 (39.4%) |
| Non-SCC | 933 (18.3%) |
| Unknown | 999 (19.5%) |
| Differentiation | |
| Low | 105 (2.1%) |
| Intermediate | 232 (4.5%) |
| High | 31 (0.6%) |
| Unknown | 4744 (92.8%) |
| P53 | |
| Negative | 1642 (32.1%) |
| + | 2215 (43.3%) |
| ++ | 76 (1.5%) |
| +++ | 26 (0.5%) |
| ++++ | 2 (0%) |
| Unknown | 1151 (22.5%) |
| P16 | |
| Negative | 213 (4.2%) |
| + | 2909 (56.9%) |
| ++ | 339 (6.6%) |
| +++ | 583 (11.4%) |
| ++++ | 58 (1.1%) |
| Unknown | 1010 (19.8%) |
| Ki67 | |
| Negative | 15 (0.3%) |
| 0–20% | 517 (10.1%) |
| 20–40% | 997 (19.5%) |
| 40–60% | 1038 (20.3%) |
| 60–80% | 1147 (22.4%) |
| 80–100% | 389 (7.6%) |
| Unknown | 1009 (19.7%) |
| Follow-up, months | 90 (18–162) |
Multivariate Cox analysis for predictors of site-specific recurrence.
| Characteristics | Local recurrence, HR (95%CI) | Thorax recurrence, HR (95%CI) | Abdomen recurrence, HR (95%CI) | Bone recurrence, HR (95%CI) |
|---|---|---|---|---|
| FIGO stage, P value | 0.013 | <0.001 | 0.006 | 0.003 |
| Ⅰ | 1 | 1 | 1 | 1 |
| Ⅱ | 1.517 [1.094,2.103] | 2.623 [1.599,4.304] | 2.421 [1.288,4.551] | 3.04 [1.466,6.306] |
| P value | 0.012 | 0.02 | 0.487 | 0.213 |
| No | 1 | 1 | 1 | 1 |
| Yes | 1.714 [1.123,2.616] | 2.194 [1.13,4.26] | 1.335 [0.591,3.011] | 1.813 [0.711,4.619] |
| Tumor size, cm, P value | 0.005 | | | |
| <2 | 1 | NC | NC | NC |
| [2,4) | 0.997 [0.693,1.434] | | | |
| ≥4 | 1.782 [1.162,2.732] | | | |
| Histology, P value | 0.012 | <0.001 | | |
| SCC | 1 | 1 | NC | NC |
| AC | 1.606 [1.034,2.494] | 2.161 [1.078,4.331] | | |
| AS | 1.392 [0.769,2.521] | 4.559 [2.406,8.638] | | |
| Rare type | 2.787 [1.299,5.978] | 4.673 [1.669,13.08] | | |
| DSI, P value | 0.004 | 0.152 | 0.222 | 0.818 |
| Negative | 1 | 1 | 1 | 1 |
| <2/3 | 0.921 [0.508,1.671] | 1.054 [0.352,3.158] | 3.694 [0.817,16.707] | 1.5 [0.385,5.841] |
| ≥2/3 | 1.849 [1.073,3.184] | 1.996 [0.745,5.346] | 3.488 [0.797,15.261] | 1.498 [0.401,5.597] |
| P value | 0.023 | 0.886 | | 0.387 |
| No | 1 | 1 | NC | 1 |
| Yes | 1.602 [1.068,2.403] | 1.050 [0.536,2.058] | | 1.511 [0.593,3.851] |
| P value | 0.002 | <0.001 | <0.001 | 0.274 |
| No | 1 | 1 | 1 | 1 |
| Yes | 2.005 [1.301,3.09] | 2.862 [1.602,5.111] | 3.778 [1.82,7.845] | 1.756 [0.64,4.822] |
| P value | 0.059 | <0.001 | 0.001 | 0.002 |
| No | 1 | 1 | 1 | 1 |
| Yes | 1.379 [0.988,1.925] | 3.047 [1.824,5.091] | 3.051 [1.583,5.881] | 3.159 [1.524,6.546] |
NC: not calculated because variables show no significance in univariate analysis.
Fig. 2 K for K-means was determined using the elbow method.
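The elbow method plots the within-cluster sum of squares (inertia) against K and picks the K after which further clusters bring little improvement. The sketch below is a minimal stdlib-only illustration on toy 2-D points, not the study's pipeline: the farthest-first initialisation and the "largest second difference" rule for locating the elbow are assumptions made here so the example is deterministic, and all function names are hypothetical. In practice the elbow is often read off the plot by eye.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(p[d] for p in pts) / n for d in range(len(pts[0])))

def farthest_first_centers(points, k):
    """Deterministic init: start at points[0], then repeatedly add the
    point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_inertia(points, k, n_iter=100):
    """Run Lloyd's algorithm and return the final within-cluster sum of squares."""
    centers = farthest_first_centers(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def elbow_k(inertias):
    """Heuristic elbow: the K with the largest second difference of inertia.
    inertias[i] must correspond to K = i + 1."""
    best_k, best_curv = None, float("-inf")
    for i in range(1, len(inertias) - 1):
        curv = inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
        if curv > best_curv:
            best_k, best_curv = i + 1, curv
    return best_k

# Toy data: two well-separated 4x4 grids of points.
blob_a = [(0.5 * i, 0.5 * j) for i in range(4) for j in range(4)]
blob_b = [(x + 10.0, y + 10.0) for x, y in blob_a]
points = blob_a + blob_b

inertias = [kmeans_inertia(points, k) for k in range(1, 7)]
best_k = elbow_k(inertias)
print(inertias, best_k)
```

On this toy data the inertia drops sharply from K = 1 to K = 2 and only marginally afterwards, so the heuristic selects K = 2.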
Fig. 3 Survival outcome comparisons between group A and B. Recurrence-free survival (A); Overall survival (B).
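Survival comparisons of this kind are commonly drawn as Kaplan-Meier curves, one per group. The source does not say which estimator was used, so the following is only a sketch under that assumption, with made-up follow-up times; `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times:  follow-up time per patient
    events: 1 if the event (recurrence or death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Patients censored at t are still at risk at t, then leave the risk set.
        n_at_risk -= deaths + censored
    return curve

# Illustrative data only (months, event flag); not taken from the study.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Running the function once per group and plotting the two step curves gives the kind of comparison shown in the figure; a log-rank test would be the usual companion for a P value.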
Comparison of model performance (predicted probability of recurrence and of survival, as binary outcomes).
| Model | Recurrence, validation AUC | Recurrence, validation Sen | Recurrence, validation Spe | Recurrence, test AUC | Recurrence, test Sen | Recurrence, test Spe | Survival, validation AUC | Survival, validation Sen | Survival, validation Spe | Survival, test AUC | Survival, test Sen | Survival, test Spe | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Logistic | 0.701(0.016) | 0.727(0.039) | 0.675(0.014) | 0.785 | 0.725 | 0.679 | 0.860(0.012) | 0.872(0.014) | 0.849(0.002) | 0.775 | 0.889 | 0.512 | |
| SVM | 0.703(0.016) | 0.769(0.033) | 0.636(0.011) | 0.794 | 0.768 | 0.659 | 0.771(0.026) | 0.853(0.040) | 0.688(0.025) | 0.836 | 0.796 | 0.714 | |
| ANN | 0.853(0.022) | 0.739(0.037) | 0.768(0.009) | 0.728 | 0.561 | 0.749 | 0.967(0.014) | 0.966(0.022) | 0.968(0.006) | 0.867 | 0.556 | 0.975 | |
| DT | 0.685(0.017) | 0.857(0.115) | 0.515(0.109) | 0.607 | 0.768 | 0.445 | 0.942(0.029) | 0.925(0.063) | 0.958(0.006) | 0.777 | 0.593 | 0.961 | |
| RF | 0.845(0.041) | 0.876(0.072) | 0.814(0.015) | 0.741 | 0.522 | 0.874 | 0.981(0.032) | 0.965(0.064) | 0.997(0.002) | 0.890 | 0.352 | 0.994 | |
| XGBoost | 0.778(0.025) | 0.881(0.056) | 0.740(0.020) | 0.751 | 0.667 | 0.674 | 0.980(0.032) | 0.966(0.065) | 0.994(0.004) | 0.906 | 0.593 | 0.991 | |
| LightGBM | 0.897(0.051) | 0.879(0.110) | 0.915(0.011) | 0.757 | 0.464 | 0.929 | 0.981(0.026) | 0.970(0.055) | 0.992(0.004) | 0.895 | 0.611 | 0.988 | |
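The table reports AUC alongside sensitivity (Sen) and specificity (Spe). AUC is threshold-free, while Sen and Spe depend on the probability cut-off, which is why a model can pair a high AUC with a low sensitivity (for example RF on the survival test group: AUC 0.890, Sen 0.352, Spe 0.994). The sketch below shows how the three numbers are typically computed from predicted probabilities; the 0.5 threshold and the toy labels and scores are assumptions, since the source does not state the cut-off used.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive scores higher than a
    random negative (ties count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sens_spec(y_true, scores, threshold=0.5):
    """Sensitivity and specificity after thresholding the scores."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: 3 events among 8 patients (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

auc = roc_auc(y_true, scores)
sen, spe = sens_spec(y_true, scores)
print(auc, sen, spe)
```

Lowering the threshold trades specificity for sensitivity without changing the AUC, which matters when events are rare and missed recurrences are costly.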
Comparison of model performance (probability prediction of recurrence site).
| Model | Validation AUC(std) | Validation local sensitivity(std) | Validation local specificity(std) | Validation distant sensitivity(std) | Validation distant specificity(std) | Test AUC | Test local sensitivity | Test local specificity | Test distant sensitivity | Test distant specificity | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| LR | 0.767(±0.034) | 0.494(±0.066) | 0.663(±0.018) | 0.662(±0.057) | 0.773(±0.027) | 0.776 | 0.389 | 0.757 | 0.455 | 0.842 | |
| SVM | 0.767(±0.026) | 0.556(±0.082) | 0.662(±0.026) | 0.656(±0.059) | 0.802(±0.035) | 0.781 | 0.472 | 0.750 | 0.424 | 0.844 | |
| ANN | 0.944(±0.012) | 0.837(±0.042) | 0.690(±0.004) | 0.850(±0.050) | 0.949(±0.015) | 0.637 | 0.167 | 0.920 | 0.273 | 0.932 | |
| DT | 0.725(±0.029) | 0.539(±0.159) | 0.705(±0.062) | 0.764(±0.142) | 0.671(±0.070) | 0.731 | 0.583 | 0.623 | 0.485 | 0.762 | |
| RF | 0.847(±0.032) | 0.568(±0.061) | 0.632(±0.023) | 0.722(±0.078) | 0.773(±0.027) | 0.737 | 0.306 | 0.861 | 0.515 | 0.863 | |
| XGBoost | 0.823(±0.041) | 0.524(±0.086) | 0.632(±0.027) | 0.695(±0.082) | 0.764(±0.033) | 0.715 | 0.333 | 0.823 | 0.515 | 0.837 | |
| LightGBM | 0.823(±0.041) | 0.524(±0.086) | 0.632(±0.027) | 0.695(±0.082) | 0.764(±0.033) | 0.716 | 0.333 | 0.823 | 0.545 | 0.864 | |
Comparison of model performance (prediction of RFS and OS).
| Model | RFS (344 events), validation C-index | RFS, validation mean absolute error | RFS, test C-index | RFS, test mean absolute error | OS (268 events), validation C-index | OS, validation mean absolute error | OS, test C-index | OS, test mean absolute error | |
|---|---|---|---|---|---|---|---|---|---|
| Cox | 0.753(±0.028) | 14.391(±1.119) | 0.782 | 11.717 | 0.797(±0.020) | 25.630(±2.077) | 0.794 | 23.390 | |
| RF | 0.783(±0.028) | 12.951(±1.343) | 0.785 | 11.396 | 0.802(±0.028) | 22.475(±2.169) | 0.850 | 20.085 | |
| GBDT | 0.766(±0.025) | 12.358(±1.103) | 0.786 | 11.079 | 0.787(±0.034) | 22.171(±2.083) | 0.825 | 21.415 | |
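The C-index in the table measures how well a model ranks patients by risk: among pairs where it is known who had the event first, it is the fraction in which the earlier event received the higher predicted risk. The sketch below implements Harrell's version of the statistic in plain Python; whether the study used exactly this variant is an assumption, and the data are illustrative.

```python
def concordance_index(times, events, risk):
    """Harrell's C-index.

    A pair (i, j) is comparable when patient i had the event and
    times[i] < times[j]. It is concordant when risk[i] > risk[j];
    tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Illustrative data only: follow-up in months, event flag, predicted risk score.
times = [5, 8, 12, 20, 25]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.5, 0.3, 0.1]

c = concordance_index(times, events, risk)
print(c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking; in the toy example 7 of the 8 comparable pairs are ordered correctly.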
Fig. 4 Screenshot of the web-based predictive calculator predicting individual conditional risk of death, risk of recurrence, risk of site-specific recurrence, RFS, and OS. The calculator is available at https://aicer.fckyy.org.cn. Choose or enter the value for each variable, and then press the “Submit” button.