
How Does the Skeletal Oncology Research Group Algorithm's Prediction of 5-year Survival in Patients with Chondrosarcoma Perform on International Validation?

Michiel E R Bongers, Aditya V Karhade, Elisabetta Setola, Marco Gambarotti, Olivier Q Groot, Kivilcim E Erdoğan, Piero Picci, Davide M Donati, Joseph H Schwab, Emanuela Palmerini

Abstract

BACKGROUND: The Skeletal Oncology Research Group (SORG) machine learning algorithm for predicting survival in patients with chondrosarcoma was developed using data from the Surveillance, Epidemiology, and End Results (SEER) registry. This algorithm was externally validated on a dataset of patients from the United States in an earlier study, where it demonstrated generally good performance but overestimated 5-year survival. The algorithm has not yet been validated in patients outside the United States. Such validation is important because an algorithm's reported performance may be misleading when it is applied to a different population.
QUESTIONS/PURPOSES: Does the SORG algorithm retain validity in patients who underwent surgery for primary chondrosarcoma outside the United States, specifically in Italy?
METHODS: A total of 737 patients were treated for chondrosarcoma between January 2000 and October 2014 at the Italian tertiary care center that was used for international validation. We excluded patients whose first surgical procedure was performed elsewhere (n = 25), patients who underwent nonsurgical treatment (n = 27), patients with a chondrosarcoma of the soft tissue or skull (n = 60), and patients with peripheral, periosteal, or mesenchymal chondrosarcoma (n = 161). Thus, 464 patients were ultimately included in this external validation study. Because the earlier SEER study served as the training set, this study, unlike most of its type, does not have separate training and validation sets. Although the earlier study overestimated 5-year survival, we did not modify the algorithm in this report, as this is the first international validation and the prior performance in the single-institution validation study from the United States may have been driven by a small sample or non-generalizable patterns related to its single-center setting. Variables needed for the SORG algorithm were manually collected from electronic medical records. These included sex, age, histologic subtype, tumor grade, tumor size, tumor extension, and tumor location. By inputting these variables into the algorithm, we calculated the predicted probability of survival for each patient.
The performance of the SORG algorithm was assessed through discrimination (the ability of a model to distinguish between the two classes of a binary outcome), calibration (the agreement between observed and predicted outcomes), overall performance (the accuracy of predictions), and decision curve analysis (an assessment of whether decisions made using the model are better than decisions made without it). For discrimination, the c-statistic (commonly known as the area under the receiver operating characteristic curve for binary classification) was calculated; this ranges from 0.5 (no better than chance) to 1.0 (excellent discrimination). The agreement between predicted and observed outcomes was visualized with a calibration plot, and the calibration slope and intercept were calculated. Perfect calibration results in a slope of 1 and an intercept of 0. For overall performance, the Brier score and the null-model Brier score were calculated. The Brier score ranges from 0 (perfect prediction) to 1 (poorest prediction). Appropriate interpretation of the Brier score requires comparison with the null-model Brier score, which is the score for an algorithm that predicts a probability equal to the population prevalence of the outcome for every patient. A decision curve analysis was performed to compare the potential net benefit of the algorithm versus other means of decision support, such as treating all or none of the patients.
There were several differences between this study and the earlier SEER study, and such differences are important because they help us to determine the performance of the algorithm in a group different from the initial study population. In this study from Italy, 5-year survival was lower than in the earlier SEER study (71% [319 of 450 patients] versus 76% [1131 of 1487 patients]; p = 0.03). There were more patients with dedifferentiated chondrosarcoma than in the earlier SEER study (25% [118 of 464 patients] versus 8.5% [131 of 1544 patients]; p < 0.001). In addition, patients in this study were older, tumors were larger, and the proportion of high-grade tumors was higher than in the earlier SEER study (age: 56 years [interquartile range {IQR} 42 to 67] versus 52 years [IQR 40 to 64]; p = 0.007; tumor size: 80 mm [IQR 50 to 120] versus 70 mm [IQR 42 to 105]; p < 0.001; tumor grade: 22% [104 of 464 had Grade 1], 42% [196 of 464 had Grade 2], and 35% [164 of 464 had Grade 3] versus 41% [592 of 1456 had Grade 1], 40% [588 of 1456 had Grade 2], and 19% [276 of 1456 had Grade 3]; p ≤ 0.001).
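The discrimination, overall performance, and calibration measures described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the ten-patient toy dataset are hypothetical, and the calibration intercept is estimated here jointly with the slope by logistic regression on the logit of the predicted probability, whereas some frameworks estimate the intercept with the slope fixed at 1. The abstract does not state which approach was used. The only figure taken from the abstract is the survival count of 319 of 450 patients.

```python
import math

def c_statistic(y_true, p_pred):
    """Share of (survivor, non-survivor) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    score = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                score += 1.0
            elif pp == pn:
                score += 0.5
    return score / (len(pos) * len(neg))

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def null_brier_score(y_true):
    """Brier score when every patient is assigned the outcome prevalence."""
    prevalence = sum(y_true) / len(y_true)
    return brier_score(y_true, [prevalence] * len(y_true))

def calibration_slope_intercept(y_true, p_pred, n_iter=50):
    """Fit logit(P(y=1)) = a + b * logit(p_pred) by Newton-Raphson.

    Returns (slope b, intercept a). Assumes 0 < p_pred < 1 for all patients.
    """
    x = [math.log(p / (1.0 - p)) for p in p_pred]
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        mu = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to a and b.
        g_a = sum(yi - mi for yi, mi in zip(y_true, mu))
        g_b = sum((yi - mi) * xi for yi, mi, xi in zip(y_true, mu, x))
        # Entries of the (negated) Hessian.
        w = [mi * (1.0 - mi) for mi in mu]
        h_aa = sum(w)
        h_ab = sum(wi * xi for wi, xi in zip(w, x))
        h_bb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return b, a

# Hypothetical toy data: 1 = alive at 5 years, 0 = not alive.
# Five patients predicted at 0.2 (one survives), five at 0.8 (four survive).
y_toy = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
p_toy = [0.2] * 5 + [0.8] * 5

print("c-statistic:", c_statistic(y_toy, p_toy))
print("Brier score:", brier_score(y_toy, p_toy))
print("null-model Brier score:", null_brier_score(y_toy))
print("calibration (slope, intercept):", calibration_slope_intercept(y_toy, p_toy))

# Null-model Brier score implied by the reported 319 survivors of 450 patients.
cohort_outcomes = [1] * 319 + [0] * 131
print("cohort null-model Brier score:", round(null_brier_score(cohort_outcomes), 2))
```

In the toy data the observed survival in each group equals the predicted probability, so the fitted slope is close to 1 and the intercept close to 0, the values that indicate perfect calibration. For a binary outcome the null-model Brier score reduces to prevalence × (1 − prevalence), which is why a survival proportion of 319 of 450 gives a value of about 0.21.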
RESULTS: Validation of the SORG algorithm in a primarily Italian population achieved a c-statistic of 0.86 (95% confidence interval [CI] 0.82 to 0.89), suggesting good-to-excellent discrimination. The calibration plot showed good agreement between the predicted probability and observed survival for predicted probabilities of 0.8 to 1.0. With predicted survival probabilities lower than 0.8, however, the SORG algorithm underestimated the observed proportion of patients with 5-year survival, reflected in the overall calibration intercept of 0.82 (95% CI 0.67 to 0.98) and calibration slope of 0.68 (95% CI 0.42 to 0.95). The Brier score for 5-year survival was 0.15, compared with a null-model Brier score of 0.21. The algorithm showed a favorable decision curve analysis in the validation cohort.
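The net benefit that underlies a decision curve can be sketched as follows. This is a generic illustration under stated assumptions, not the analysis performed in the study: the function names, the toy data, and the choice of which outcome counts as the "event" that triggers a decision are all hypothetical. Net benefit at a threshold probability is the true-positive rate per patient minus the false-positive rate per patient weighted by the odds of the threshold; a decision curve plots this quantity across a range of thresholds for the model, for treating all patients, and for treating none (net benefit 0).

```python
def net_benefit(y_true, p_pred, threshold):
    """Net benefit of acting on patients whose predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of acting on every patient regardless of prediction."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical toy data: 1 = event, 0 = no event, with predicted event probabilities.
y_toy = [1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
p_toy = [0.2] * 5 + [0.8] * 5

# Compare the three strategies over a few threshold probabilities.
for threshold in (0.25, 0.5, 0.75):
    print(
        threshold,
        net_benefit(y_toy, p_toy, threshold),      # use the model
        net_benefit_treat_all(y_toy, threshold),   # treat all
        0.0,                                       # treat none
    )
```

A model is considered favorable over a range of thresholds when its net benefit exceeds both the treat-all and treat-none strategies across that range.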
CONCLUSIONS: The SORG algorithm to predict 5-year survival for patients with chondrosarcoma retained good discriminative ability and overall performance on international external validation. However, as the calibration plot showed, it underestimated 5-year survival for patients with predicted probabilities from 0 to 0.8, with a maximum underestimation of 20%. The differences may reflect the baseline differences noted between the two study populations. The overall performance of the algorithm supports the utility of the algorithm and the validation presented here. A freely available digital application for the algorithm can be found at: https://sorg-apps.shinyapps.io/extremitymetssurvival/.
LEVEL OF EVIDENCE: Level III, prognostic study.

Year:  2020        PMID: 32433107      PMCID: PMC7491905          DOI: 10.1097/CORR.0000000000001305

Source DB:  PubMed          Journal:  Clin Orthop Relat Res        ISSN: 0009-921X            Impact factor:   4.755


