
Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival.

Arturo Moncada-Torres, Marissa C van Maaren, Mathijs P Hendriks, Sabine Siesling, Gijs Geleijnse.

Abstract

Cox Proportional Hazards (CPH) analysis is the standard for survival analysis in oncology. Recently, several machine learning (ML) techniques have been adapted for this task. Although they have been shown to yield results at least as good as those of classical methods, they are often disregarded because of their lack of transparency and little to no explainability, both of which are key for their adoption in clinical settings. In this paper, we used data from the Netherlands Cancer Registry on 36,658 non-metastatic breast cancer patients to compare the performance of CPH with that of ML techniques (Random Survival Forests, Survival Support Vector Machines, and Extreme Gradient Boosting [XGB]) in predicting survival, using the concordance index (c-index). We demonstrated that, in our dataset, ML-based models can perform at least as well as classical CPH regression (c-index [Formula: see text]), and in the case of XGB even better (c-index [Formula: see text]). Furthermore, we used Shapley Additive Explanation (SHAP) values to explain the models' predictions. We concluded that the difference in performance can be attributed to XGB's ability to model nonlinearities and complex interactions. We also investigated the impact of specific features on the models' predictions and the insights these features provide. Lastly, we showed that explainable ML can generate explicit knowledge of how models make their predictions, which is crucial for increasing the trust in and adoption of innovative ML techniques in oncology and healthcare overall.
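
The models above are compared with a concordance-type index. As a rough illustration only, the following is a minimal sketch of Harrell's c-index in plain Python; it is not the authors' implementation. The function name `concordance_index`, its arguments, and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index: share of comparable pairs ranked correctly.

    times       -- follow-up time per patient
    events      -- 1/True if the event was observed, 0/False if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: tied times are skipped. A pair is comparable only
        # if the shorter time corresponds to an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four patients, one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a higher c-index indicates better discrimination between patients.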


Year:  2021        PMID: 33772109      PMCID: PMC7998037          DOI: 10.1038/s41598-021-86327-7

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


