Xi Chen 1, Yingxue Li 2, Xiang Li 2, Xun Cao 3, Yanqun Xiang 1, Weixiong Xia 1, Jianpeng Li 4, Mingyong Gao 5, Yuyao Sun 2, Kuiyuan Liu 1, Mengyun Qiang 1, Chixiong Liang 1, Jingjing Miao 1, Zhuochen Cai 1, Xiang Guo 1, Chaofeng Li 6, Guotong Xie 7, Xing Lv 8.
Affiliations
1. State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Department of Nasopharyngeal Carcinoma, Sun Yat-sen University Cancer Center, Guangzhou 510060, PR China.
2. Ping An Healthcare Technology, Beijing 100032, PR China.
3. State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Department of Intensive Care Unit, Sun Yat-sen University Cancer Center, Guangzhou 510060, PR China.
4. Department of Radiology, Dongguan People's Hospital, Dongguan 523059, PR China.
5. Department of Medical Imaging, First People's Hospital of Foshan, Foshan 528000, PR China.
6. State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Department of Information Technology, Sun Yat-sen University Cancer Center, Guangzhou 510060, PR China. Electronic address: lichaofeng@sysucc.org.cn.
7. Ping An Healthcare Technology, Beijing 100032, PR China; Ping An Health Cloud Company Limited, Ping An International Smart City Technology Co., Ltd., Beijing 100032, PR China. Electronic address: xieguotong@pingan.com.cn.
8. State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Department of Nasopharyngeal Carcinoma, Sun Yat-sen University Cancer Center, Guangzhou 510060, PR China. Electronic address: lvxing@sysucc.org.cn.
Abstract
OBJECTIVES: We aimed to build a survival system that combines a highly accurate machine learning (ML) model with explainable artificial intelligence (AI) techniques to predict distant metastasis in patients with locoregionally advanced nasopharyngeal carcinoma (NPC), using magnetic resonance imaging (MRI)-based tumor burden features.
MATERIALS AND METHODS: A total of 1643 patients from three hospitals were enrolled according to set criteria. We used ML to develop a survival model based on tumor burden signatures and all clinical factors. Shapley Additive exPlanations (SHAP) was used to explain prediction results and to interpret the complex non-linear relationships between features and distant metastasis. For comparison, we also constructed models based on routinely used cancer stages, Epstein-Barr virus (EBV) DNA, or other clinical features. Concordance index (C-index), receiver operating characteristic (ROC) curve analysis and decision curve analysis (DCA) were used to assess the effectiveness of the models.
RESULTS: Our proposed system demonstrated consistent performance across independent cohorts. The concordance indexes were 0.773, 0.766 and 0.760 in the training, internal validation and external validation sets, respectively. SHAP provided personalized protective and risk factors for each NPC patient and uncovered some novel non-linear relationships between features and distant metastasis. Furthermore, high-risk patients who received induction chemotherapy (ICT) plus concurrent chemoradiotherapy (CCRT) had better 5-year distant metastasis-free survival (DMFS) than those who received CCRT alone, whereas ICT + CCRT and CCRT alone yielded similar DMFS in low-risk patients.
CONCLUSIONS: The interpretable machine learning system demonstrated superior performance in predicting metastasis in locoregionally advanced NPC. High-risk patients might benefit from ICT.
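For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is illustrative only and is not the authors' implementation; the function name `concordance_index`, the handling of tied times (skipped) and the toy data are assumptions made for this example. The C-index is the fraction of comparable patient pairs in which the patient who developed the event earlier was assigned the higher predicted risk.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index (illustrative sketch, not the study's code).

    times       -- observed follow-up time per patient
    events      -- 1 if the event (e.g. distant metastasis) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped for simplicity
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier patient was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event received the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study)
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.2, 0.6]
c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the values of 0.760 to 0.773 reported in the abstract.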