Simon Nusinovici (1), Yih Chung Tham (2), Marco Yu Chak Yan (1), Daniel Shu Wei Ting (2), Jialiang Li (3), Charumathi Sabanayagam (2), Tien Yin Wong (4), Ching-Yu Cheng (5).
1. Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore.
2. Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore; Ophthalmology and Visual Sciences Academic Clinical Programme, Duke-NUS Medical School, Singapore.
3. Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore; Department of Statistics and Applied Probability, National University of Singapore, Singapore.
4. Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore; Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Ophthalmology and Visual Sciences Academic Clinical Programme, Duke-NUS Medical School, Singapore.
5. Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore; Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Ophthalmology and Visual Sciences Academic Clinical Programme, Duke-NUS Medical School, Singapore. Electronic address: chingyu.cheng@duke-nus.edu.sg.
Abstract
OBJECTIVE: To evaluate the performance of machine learning (ML) algorithms and to compare them with logistic regression for predicting the risk of cardiovascular diseases (CVDs), chronic kidney disease (CKD), diabetes (DM), and hypertension (HTN) in a prospective cohort study using simple clinical predictors. STUDY DESIGN AND SETTING: We conducted analyses in a population-based cohort study of Asian adults (n = 6,762). Five ML models (single-hidden-layer neural network, support vector machine, random forest, gradient boosting machine, and k-nearest neighbor) were compared with standard logistic regression. RESULTS: The 6-year incidences of CVD, CKD, DM, and HTN were 4.0%, 7.0%, 9.2%, and 34.6%, respectively. Logistic regression reached the highest area under the receiver operating characteristic curve for CKD (0.905 [0.88, 0.93]) and DM (0.768 [0.73, 0.81]) predictions. For CVD and HTN, the best models were the neural network (0.753 [0.70, 0.81]) and the support vector machine (0.780 [0.747, 0.812]), respectively. However, their differences from logistic regression were small (less than 1%) and nonsignificant. Logistic regression, gradient boosting machine, and neural network were consistently ranked among the best models. CONCLUSION: Logistic regression performs as well as ML models in predicting the risk of major chronic diseases with low incidence and simple clinical predictors.
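The abstract compares models by area under the receiver operating characteristic curve (AUC) and reports that differences from logistic regression were nonsignificant. It does not state how the intervals or the significance of the differences were obtained, so the following is only an illustrative sketch: it computes AUC from the Mann-Whitney statistic and uses a paired percentile bootstrap (an assumption, not necessarily the authors' method) to obtain an interval for the AUC difference between two models. The function names, the synthetic labels, and the two hypothetical score vectors are all invented for illustration.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the probability that a random
    case scores higher than a random non-case (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=200, seed=0):
    """Approximate 95% percentile interval for AUC(a) - AUC(b).

    Subjects are resampled with replacement, and both models are scored on
    the same resample so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # resample lacked cases or non-cases; draw again
        a = auc(y, [scores_a[i] for i in idx])
        b = auc(y, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[min(n_boot - 1, int(0.975 * n_boot))]
    return lower, upper


# Synthetic data only: 200 subjects with 10% incidence (not the study data).
rng = random.Random(42)
labels = [1] * 20 + [0] * 180
# Hypothetical predicted risks from two models with similar discrimination.
scores_lr = [y + rng.gauss(0, 1) for y in labels]
scores_ml = [s + rng.gauss(0, 0.1) for s in scores_lr]

print("AUC, model A:", round(auc(labels, scores_lr), 3))
print("AUC, model B:", round(auc(labels, scores_ml), 3))
print("95% interval for difference:",
      bootstrap_auc_difference(labels, scores_lr, scores_ml))
```

Under this approach, an interval for the difference that contains zero would be read as no detectable difference in discrimination between the two models.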
Authors: Bocheng Jing; W John Boscardin; W James Deardorff; Sun Young Jeon; Alexandra K Lee; Anne L Donovan; Sei J Lee Journal: Med Care Date: 2022-03-30 Impact factor: 3.178