Chubin Ou (1, 2), Jiahui Liu (1), Yi Qian (2), Winston Chong (3), Xin Zhang (1), Wenchao Liu (1), Hengxian Su (1), Nan Zhang (1), Jianbo Zhang (1), Chuan-Zhi Duan (1), Xuying He (1)
1. National Key Clinical Specialty/Engineering Technology Research Center of Education Ministry of China, Guangdong Provincial Key Laboratory on Brain Function Repair and Regeneration, Department of Neurosurgery, Neurosurgery Institute, Zhujiang Hospital, Southern Medical University, Guangzhou, China.
2. Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Macquarie University, Sydney, NSW, Australia.
3. Monash Medical Centre, Monash University, Clayton, VIC, Australia.
Abstract
Background: Assessing the rupture risk of cerebral aneurysms is important but remains challenging. Recent studies applying machine learning to rupture risk evaluation have reported positive results, but they drew on limited aspects of the data, and their lack of interpretability may limit their use in a clinical setting. We aimed to develop interpretable machine learning models on multidimensional data for aneurysm rupture risk assessment.
Methods: Three hundred seventy-four aneurysms were included in the study. Demographics, medical history, lifestyle behaviors, lipid profile, and aneurysm morphology were collected for each patient. Prediction models were derived using machine learning methods (support vector machine, artificial neural network, and XGBoost) and conventional logistic regression. The derived models were compared with the PHASES score method. Shapley Additive Explanations (SHAP) analysis was applied to improve the interpretability of the best machine learning model and to reveal the reasoning behind its predictions.
Results: The best machine learning model (XGBoost) achieved an area under the receiver operating characteristic curve of 0.882 [95% confidence interval (CI) = 0.838-0.927], significantly higher than that of the logistic regression model (0.779; 95% CI = 0.729-0.829; P = 0.002) and the PHASES score method (0.758; 95% CI = 0.713-0.800; P = 0.001). Location, size ratio, and triglyceride level were the three most important features in predicting rupture. Two typical cases were analyzed to demonstrate the interpretability of the model.
Conclusions: This study demonstrated the potential of machine learning for aneurysm rupture risk assessment. Machine learning models performed better than the conventional statistical model and the PHASES score method. SHAP analysis can improve the interpretability of machine learning models and facilitate their use in a clinical setting.
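To make the idea behind the SHAP analysis concrete, the following is a minimal sketch of exact Shapley value attribution for a single prediction. It is not the study's model or pipeline: `toy_risk`, its coefficients, the feature encoding, and the `baseline` and `patient` values are all hypothetical and chosen only for illustration, and absent features are simply replaced by a baseline value, whereas the SHAP library uses model-specific estimators for tree ensembles such as XGBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value.
    This brute-force approach is only feasible for a handful of features.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Hypothetical toy risk score: the coefficients are invented for illustration
# and are NOT taken from the study.
def toy_risk(x):
    score = 0.30 * x["high_risk_location"] + 0.20 * x["size_ratio"]
    score += 0.10 * x["triglyceride"]
    score += 0.05 * x["high_risk_location"] * x["size_ratio"]  # interaction term
    return score


# Hypothetical reference case and hypothetical patient.
baseline = {"high_risk_location": 0, "size_ratio": 1.0, "triglyceride": 1.5}
patient = {"high_risk_location": 1, "size_ratio": 3.0, "triglyceride": 1.0}

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")

# Additivity: the contributions sum to the gap between the patient's score
# and the baseline score.
gap = toy_risk(patient) - toy_risk(baseline)
print(f"sum of contributions = {sum(phi.values()):.3f}, gap = {gap:.3f}")
```

Each value states how much one feature moves this patient's score away from the baseline score, and the contributions add up to the full difference. That additive property is what allows a single case to be explained feature by feature.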