Ni Wang1,2, Yanqun Huang1,2, Honglei Liu1,2, Xiaolu Fei3, Lan Wei3, Xiangkun Zhao1, Hui Chen1,2 (corresponding author, chenhui@ccmu.edu.cn). 1. School of Biomedical Engineering, Capital Medical University, No. 10, Xitoutiao, YouAnMen, Fengtai District, Beijing, 100069, China. 2. Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, No. 10, Xitoutiao, YouAnMen, Fengtai District, Beijing, 100069, China. 3. Information Center, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China.
Abstract
BACKGROUND: Conventional risk prediction techniques may not be the most suitable approach for personalized prediction in individual patients, which has motivated individualized predictive modeling based on similar patients. This study aimed to propose a comprehensive measurement of patient similarity using real-world electronic medical records data and to evaluate the effectiveness of individualized prediction of a patient's diabetes status based on that similarity. RESULTS: When using no more than 30% of the whole training sample, the personalized predictive models outperformed the corresponding traditional models built on randomly selected training samples of the same size (P < 0.001 for all). With only the top 1000 (10%), 700 (7%) and 1400 (14%) similar samples, the personalized random forest, k-nearest neighbor and logistic regression models reached their globally optimal performance, with areas under the receiver operating characteristic (ROC) curve of 0.90, 0.82 and 0.89, respectively. CONCLUSIONS: The proposed patient similarity measurement was effective for developing personalized predictive models. Its successful application in predicting a patient's diabetes status provides a useful reference for diagnostic decision support based on evidence from similar patients.
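The idea of building a prediction from only the most similar patients can be illustrated with a minimal sketch. The abstract does not specify how the similarity measurement is computed, so plain Euclidean distance on already-scaled numeric features is used here purely as a stand-in assumption. The function names (`top_k_similar`, `personalized_predict`) and the toy data are hypothetical, and the final step is a simple neighbor proportion rather than the random forest or logistic regression models evaluated in the study.

```python
import math
from typing import List, Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in dissimilarity; the study's own similarity measure is not given here."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_similar(query: Sequence[float],
                  cohort: List[Sequence[float]],
                  k: int) -> List[int]:
    """Return indices of the k cohort patients closest to the query patient."""
    ranked = sorted(range(len(cohort)),
                    key=lambda i: euclidean_distance(query, cohort[i]))
    return ranked[:k]


def personalized_predict(query: Sequence[float],
                         cohort: List[Sequence[float]],
                         labels: List[int],
                         k: int) -> float:
    """Estimate the probability of a positive label from the k most similar patients.

    Any classifier could be fitted on the selected subset instead;
    the neighbor proportion is used only to keep the sketch short.
    """
    neighbors = top_k_similar(query, cohort, k)
    return sum(labels[i] for i in neighbors) / len(neighbors)


# Toy, made-up data: two scaled features per patient, 1 = diabetes, 0 = no diabetes.
cohort = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9], [0.15, 0.15]]
labels = [0, 1, 0, 1, 0]
query = [0.85, 0.85]

local_estimate = personalized_predict(query, cohort, labels, k=2)
global_estimate = personalized_predict(query, cohort, labels, k=len(cohort))
print(local_estimate, global_estimate)
```

With `k` equal to the cohort size the estimate collapses to the overall prevalence, which mirrors the contrast the abstract draws between models trained on a similar subset and models trained without regard to similarity.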
Keywords:
Diabetes mellitus; Electronic medical records; Model performance; Patient similarity; Personalized prediction