Lizhao Yan1, Nan Gao1, Fangxing Ai1, Yingsong Zhao2, Yu Kang1, Jianghai Chen1, Yuxiong Weng1.
Abstract
Background: Accurate prediction of prognosis is critical for therapeutic decisions in chondrosarcoma patients. Several prognostic models have been built with multivariate Cox regression or binary classification-based machine learning to predict the 3- and 5-year survival of patients with chondrosarcoma, but few studies have investigated combining deep learning with time-to-event prediction. Compared with reducing the prediction to a binary classification problem, using deep learning to model the probability of an event as a function of time can provide better accuracy and flexibility.

Materials and methods: Patients diagnosed with chondrosarcoma between 2000 and 2018 were extracted from the Surveillance, Epidemiology, and End Results (SEER) registry. Three algorithms were selected for training: two based on neural networks (DeepSurv and neural multi-task logistic regression [NMTLR]) and one based on ensemble learning (random survival forest [RSF]). A multivariate Cox proportional hazards (CoxPH) model was also constructed for comparison. The dataset was randomly divided into training and testing datasets at a ratio of 7:3. Hyperparameters were tuned on the training dataset through 1,000 iterations of random search with 5-fold cross-validation. Model performance was assessed using the concordance index (C-index), the Brier score, and the Integrated Brier Score (IBS). The accuracy of predicting 1-, 3-, 5-, and 10-year survival was evaluated using receiver operating characteristic (ROC) curves, calibration curves, and the area under the ROC curve (AUC).
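The split and tuning procedure described in the methods can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the `score_fn` callback, and the toy search space are hypothetical, and a real run would train a survival model inside `score_fn` and return its validation C-index. The rounding used here may also give slightly different set sizes from the 2203/942 split reported in the study.

```python
import random

def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle 0..n-1 and reserve roughly `test_fraction` of them for testing."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]  # (train, test)

def k_folds(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]

def random_search(space, score_fn, train_idx, n_iter=1000, k=5, seed=0):
    """Sample n_iter configurations and keep the best mean cross-validated score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_iter):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        scores = [score_fn(cfg, fit, val) for fit, val in k_folds(train_idx, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_cfg, best_score = cfg, mean_score
    return best_cfg, best_score

def toy_score(cfg, fit_idx, val_idx):
    """Hypothetical stand-in for 'train on fit_idx, return C-index on val_idx'."""
    return -abs(cfg["nodes"] - 32) - cfg["learning_rate"]

# Hypothetical search space; the study's actual hyperparameter grid is not given here.
train_idx, test_idx = train_test_split_indices(3145, test_fraction=0.3, seed=42)
space = {"learning_rate": [0.1, 0.01, 0.001], "nodes": [16, 32, 64]}
best_cfg, best_score = random_search(space, toy_score, train_idx, n_iter=20)
```

Only the training indices are passed to the search, so the held-out 30% never influences hyperparameter selection.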
Keywords: DeepSurv; chondrosarcoma; deep learning; machine learning; survival analysis
Year: 2022 PMID: 36072795 PMCID: PMC9442032 DOI: 10.3389/fonc.2022.967758
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. Study profile and analysis pipeline.
Patient demographic, disease, treatment characteristics, and Cox regression analysis.
| Characteristic | Overall, N = 3,145¹ | Univariate HR² | Univariate 95% CI² | Univariate P-value | Multivariate HR² | Multivariate 95% CI² | Multivariate P-value |
|---|---|---|---|---|---|---|---|
| | | | | 0.23 | | | 0.17 |
| | 1,768 (56%) | — | — | | — | — | |
| | 1,377 (44%) | 1.10 | 0.94, 1.27 | | 0.85 | 0.68, 1.07 | |
| | 52 (18) | 1.05 | 1.05, 1.06 | | 1.04 | 1.03, 1.05 | |
| Female | 1,483 (47%) | — | — | | — | — | |
| Male | 1,662 (53%) | 1.48 | 1.29, 1.69 | | 1.58 | 1.27, 1.96 | |
| Conventional | 2,879 (92%) | — | — | | — | — | |
| Dedifferentiated | 266 (8.5%) | 6.30 | 5.34, 7.42 | | 1.96 | 1.42, 2.69 | |
| Extremity | 1,595 (51%) | — | — | | — | — | |
| Axial skeleton | 702 (22%) | 1.60 | 1.37, 1.86 | | 1.09 | 0.84, 1.42 | |
| Other | 848 (27%) | 0.77 | 0.65, 0.91 | | 0.72 | 0.54, 0.95 | |
| | | | | | | | 0.80 |
| | 1,083 (73%) | — | — | | — | — | |
| | 249 (17%) | 3.36 | 2.68, 4.22 | | 1.21 | 0.69, 2.14 | |
| | 15 (1.0%) | 1.33 | 0.49, 3.57 | | 0.73 | 0.21, 2.49 | |
| | 140 (9.4%) | 12.8 | 10.2, 16.2 | | 1.33 | 0.46, 3.83 | |
| Unknown | 1,658 | | | | | | |
| Well differentiated | 1,033 (39%) | — | — | | — | — | |
| Moderately differentiated | 1,099 (41%) | 1.75 | 1.45, 2.11 | | 1.40 | 1.05, 1.88 | |
| Poorly differentiated | 319 (12%) | 4.18 | 3.36, 5.22 | | 1.73 | 0.94, 3.20 | |
| Undifferentiated | 208 (7.8%) | 10.4 | 8.31, 13.0 | | 2.63 | 1.38, 5.03 | |
| Unknown | 486 | | | | | | |
| None | 393 (13%) | — | — | | — | — | |
| Local treatment | 1,066 (35%) | 0.24 | 0.20, 0.29 | | 0.54 | 0.37, 0.80 | |
| Radical excision with limb salvage | 1,243 (41%) | 0.33 | 0.28, 0.39 | | 0.48 | 0.33, 0.68 | |
| Amputation | 358 (12%) | 0.65 | 0.53, 0.80 | | 0.62 | 0.42, 0.90 | |
| Unknown | 85 | | | | | | |
| | | | | | | | 0.39 |
| | 2,822 (90%) | — | — | | — | — | |
| | 323 (10%) | 1.42 | 1.17, 1.72 | | 1.15 | 0.84, 1.56 | |
| | | | | | | | 0.18 |
| | 2,905 (92%) | — | — | | — | — | |
| | 240 (7.6%) | 4.92 | 4.14, 5.83 | | 1.26 | 0.90, 1.75 | |
| | 81 (60) | 1.00 | 1.00, 1.01 | | 1.00 | 1.00, 1.00 | |
| Unknown | 1,552 | | | | | | |
| | | | | 0.28 | | | 0.23 |
| | 2,867 (91%) | — | — | | — | — | |
| | 278 (8.8%) | 1.12 | 0.91, 1.37 | | 0.82 | 0.59, 1.14 | |
| No break in periosteum | 553 (29%) | — | — | | — | — | |
| Extension beyond periosteum | 1,251 (67%) | 2.27 | 1.81, 2.85 | | 1.50 | 1.12, 2.00 | |
| Further extension | 75 (4.0%) | 4.73 | 3.28, 6.82 | | 2.30 | 1.41, 3.75 | |
| Unknown | 1,266 | | | | | | |
| Not | 1,792 (93%) | — | — | | — | — | |
| Yes | 128 (6.7%) | 9.98 | 8.07, 12.4 | | 3.15 | 1.11, 8.93 | |
| Unknown | 1,225 | | | | | | |
| | 83 (67) | | | | | | |
| Alive | 2,241 (71%) | | | | | | |
| Dead | 904 (29%) | | | | | | |

¹ n (%); Mean (SD).
² HR = Hazard Ratio, CI = Confidence Interval.
P values are bolded to indicate they are less than 0.05.
Figure 2. Correlation coefficients for each pair of variables in the data set. The estimated correlations range from -1 to +1 and are represented by color depth: a value closer to -1 indicates a stronger negative correlation, and a value closer to +1 indicates a stronger positive correlation.
Characteristic distribution of data in training sets and test sets.
| Level | Overall | Train | Test | P-value |
|---|---|---|---|---|
| | 3145 | 2203 | 942 | |
| | 51.58 (17.53) | 51.70 (17.41) | 51.29 (17.82) | 0.547 |
| Female | 1483 (47.2) | 1036 (47.0) | 447 (47.5) | 0.857 |
| Male | 1662 (52.8) | 1167 (53.0) | 495 (52.5) | |
| Conventional | 2879 (91.5) | 2025 (91.9) | 854 (90.7) | 0.274 |
| Dedifferentiated | 266 (8.5) | 178 (8.1) | 88 (9.3) | |
| Extremity | 1595 (50.7) | 1121 (50.9) | 474 (50.3) | 0.395 |
| Axial skeleton | 702 (22.3) | 502 (22.8) | 200 (21.2) | |
| Other | 848 (27.0) | 580 (26.3) | 268 (28.5) | |
| Well differentiated | 1033 (38.8) | 725 (38.6) | 308 (39.5) | 0.933 |
| Moderately differentiated | 1099 (41.3) | 782 (41.6) | 317 (40.7) | |
| Poorly differentiated | 319 (12.0) | 228 (12.1) | 91 (11.7) | |
| Undifferentiated | 208 (7.8) | 145 (7.7) | 63 (8.1) | |
| None | 393 (12.8) | 266 (12.4) | 127 (14.0) | 0.571 |
| Local treatment | 1066 (34.8) | 762 (35.4) | 304 (33.5) | |
| Radical excision with limb salvage | 1243 (40.6) | 874 (40.6) | 369 (40.7) | |
| Amputation | 358 (11.7) | 251 (11.7) | 107 (11.8) | |
| | 80.65 (60.19) | 80.96 (62.00) | 79.88 (55.47) | 0.746 |
| No break in periosteum | 553 (29.4) | 389 (28.9) | 164 (30.8) | 0.425 |
| Extension beyond periosteum | 1251 (66.6) | 900 (66.8) | 351 (66.0) | |
| Further extension | 75 (4.0) | 58 (4.3) | 17 (3.2) | |
| Not | 1792 (93.3) | 1280 (93.5) | 512 (92.9) | 0.721 |
| Yes | 128 (6.7) | 89 (6.5) | 39 (7.1) | |
| | 83.16 (66.93) | 84.50 (66.86) | 80.04 (67.01) | 0.087 |
| Alive | 2241 (71.3) | 1572 (71.4) | 669 (71.0) | 0.882 |
| Dead | 904 (28.7) | 631 (28.6) | 273 (29.0) | |
Figure 3. Loss convergence graphs for the (A) DeepSurv and (B) neural multi-task logistic regression (NMTLR) models.
Performance of four survival models.
| Models | C-index (train) | C-index (test) | IBS | 1-year AUC | 3-year AUC | 5-year AUC | 10-year AUC |
|---|---|---|---|---|---|---|---|
| CoxPH | 0.782 | 0.773 | 0.126 | 0.923 (0.897-0.948) | 0.879 (0.852-0.906) | 0.865 (0.836-0.893) | 0.870 (0.841-0.899) |
| DeepSurv | | | | | | | |
| NMTLR | 0.850 | 0.821 | 0.115 | 0.928 (0.900-0.956) | 0.896 (0.870-0.922) | 0.889 (0.862-0.915) | 0.890 (0.863-0.917) |
| RSF | 0.829 | 0.803 | 0.128 | 0.931 (0.905-0.958) | 0.900 (0.873-0.926) | 0.889 (0.862-0.916) | 0.885 (0.857-0.913) |

CoxPH, standard Cox proportional hazards; NMTLR, neural multi-task logistic regression; RSF, random survival forest; IBS, Integrated Brier Score; AUC, area under the receiver operating characteristic curve; C-index, concordance index.
The C-index is calculated separately for the training and test datasets; all other metrics are calculated on the test set only.
Bolded metrics indicate the best value among the four models.
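For readers unfamiliar with the C-index reported in the table, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration, not the implementation used in the study: the function name is hypothetical, pairs with tied event times are simply skipped, and a higher `risks` value is assumed to mean a worse predicted prognosis.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs in which the earlier failure has the higher risk.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    risks  : model risk score (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:  # patient i died before patient j was last seen
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
one_swap = concordance_index(times, events, [0.7, 0.9, 0.4, 0.1])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the tabulated C-index values should be read.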
Figure 4. Prediction error curves. As a benchmark, a useful model has a Brier score below 0.25.
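The 0.25 benchmark follows from the definition of the Brier score: a model that predicts a survival probability of 0.5 for every patient has a squared error of 0.25 regardless of outcome. The sketch below illustrates this in plain Python. It is a simplification with hypothetical function names: it ignores censoring, whereas survival-analysis software typically reweights patients to account for it, and it integrates over time with the trapezoidal rule.

```python
def brier_score(predicted_survival, survived):
    """Mean squared gap between predicted survival probability and the 0/1 outcome."""
    pairs = list(zip(predicted_survival, survived))
    return sum((p - float(s)) ** 2 for p, s in pairs) / len(pairs)

def integrated_brier_score(time_grid, scores):
    """Time-averaged Brier score: trapezoidal area divided by the width of the grid."""
    area = 0.0
    for k in range(1, len(time_grid)):
        width = time_grid[k] - time_grid[k - 1]
        area += width * (scores[k] + scores[k - 1]) / 2.0
    return area / (time_grid[-1] - time_grid[0])

# An uninformative model that always predicts 0.5 scores exactly 0.25.
uninformative = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])

# Toy Brier scores at three time points, averaged over the interval.
ibs = integrated_brier_score([1, 3, 5], [0.1, 0.2, 0.1])
```

Lower is better for both quantities, so the IBS values in the performance table are read in the opposite direction from the C-index and AUC.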
Figure 5. Receiver operating characteristic (ROC) curves and calibration curves for 1-, 3-, 5-, and 10-year survival predictions. ROC curves for (A) 1-, (C) 3-, (E) 5-, and (G) 10-year survival predictions. Calibration curves for (B) 1-, (D) 3-, (F) 5-, and (H) 10-year survival predictions.
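One simple way to obtain an AUC at a fixed horizon is to treat patients who died by that time as cases and patients still under follow-up beyond it as controls. The sketch below does this in plain Python. It is an assumption-laden illustration rather than the study's method: the function name is hypothetical, patients censored before the horizon are dropped instead of being reweighted, and times are taken to share one unit with the horizon.

```python
def auc_at_horizon(times, events, risks, horizon):
    """Probability that a random case has a higher risk score than a random control.

    Cases    : death observed at or before `horizon`
    Controls : followed beyond `horizon` (alive at the horizon)
    Patients censored before `horizon` have unknown status and are excluded.
    """
    cases = [r for t, e, r in zip(times, events, risks) if e and t <= horizon]
    controls = [r for t, r in zip(times, risks) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                wins += 1.0
            elif case_risk == control_risk:
                wins += 0.5  # ties count as half
    return wins / (len(cases) * len(controls))

# Toy data: the second patient is censored before the horizon and is excluded.
times = [6, 30, 70, 90]
events = [1, 0, 0, 1]
risks = [0.9, 0.8, 0.3, 0.2]
auc_60 = auc_at_horizon(times, events, risks, horizon=60)
```

Repeating the calculation at several horizons gives one AUC per time point, which is how a set of 1-, 3-, 5-, and 10-year values can come from a single fitted model.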
Figure 6. Heatmap of feature importance for the DeepSurv, neural multi-task logistic regression (NMTLR), and random survival forest (RSF) models. Values are expressed as the percentage reduction in the C-index after the values of a feature are replaced by random numbers. Higher values indicate that the feature contributes more to the predictive accuracy of the corresponding model.
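The importance measure in the caption can be sketched as follows in plain Python. This is a hedged illustration, not the authors' code: the names `importance_pct` and `model_risk` are hypothetical, and the sketch shuffles a feature's observed values across patients (permutation) as one way of replacing them with random numbers, which may differ from how the random values were drawn in the study.

```python
import random

def c_index(times, events, risks):
    """Harrell's C-index; pairs with tied times are skipped in this sketch."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

def importance_pct(model_risk, rows, times, events, feature, seed=0):
    """Percentage drop in C-index after scrambling one feature's values."""
    rng = random.Random(seed)
    baseline = c_index(times, events, [model_risk(r) for r in rows])
    scrambled = [r[feature] for r in rows]
    rng.shuffle(scrambled)
    # Copy each row with only the chosen feature replaced.
    perturbed_rows = [dict(r, **{feature: v}) for r, v in zip(rows, scrambled)]
    perturbed = c_index(times, events, [model_risk(r) for r in perturbed_rows])
    return 100.0 * (baseline - perturbed) / baseline

# Toy model whose risk depends only on age; "noise" is ignored by the model.
rows = [{"age": 70, "noise": 3}, {"age": 60, "noise": 1},
        {"age": 50, "noise": 4}, {"age": 40, "noise": 2}]
times, events = [5, 10, 15, 20], [1, 1, 1, 1]
risk_from_age = lambda row: row["age"]
noise_importance = importance_pct(risk_from_age, rows, times, events, "noise")
age_importance = importance_pct(risk_from_age, rows, times, events, "age")
```

A feature the model never uses yields an importance of zero, because scrambling it leaves every prediction unchanged.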
Figure 7. Screenshot of the web-based application of the DeepSurv model.