Ehsan Rezaei-Darzi1, Farshad Farzadfar2, Amir Hashemi-Meshkini3, Iman Navidi4, Mahmoud Mahmoudi5, Mehdi Varmaghani3, Parinaz Mehdipour6, Mahsa Soudi Alamdari7, Batool Tayefi8, Shohreh Naderimagham2, Fatemeh Soleymani9, Alireza Mesdaghinia10, Alireza Delavari11, Kazem Mohammad5.
1. 1) Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran. 2) Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran.
2. 2) Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran. 3) Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Clinical Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran.
3. 4) Department of Pharmacoeconomics and Pharmaceutical Administration, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran. 2) Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Science Institute, Tehran, Iran.
4. 1) Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran. 2) Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran.
5. Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran.
6. 1) Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran. 2) Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran.
7. 2) Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran. 5) Department of Network Science and Technology, Faculty of New Sciences and Technologies, University of Tehran, Tehran, Iran.
8. Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Science Institute, Tehran University of Medical Sciences, Tehran, Iran.
9. Department of Pharmacoeconomics and Pharmaceutical Administration, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran.
10. Department of Environmental Health Engineering, School of Public Health and Institute of Public Health Research, Tehran University of Medical Sciences, Tehran, Iran.
11. Digestive Oncology Research Center, Digestive Disease Research Institute, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran.
Abstract
BACKGROUND: This study aimed to evaluate and compare the prediction accuracy of two data mining techniques, decision tree and neural network models, in assigning diagnoses to gastrointestinal (GI) prescriptions in Iran. METHODS: The study was conducted in three phases: data preparation, training, and testing. A sample was drawn from a database of 23 million pharmacy insurance claim records from 2004 to 2011; a total of 330 prescriptions were assessed and used both to train and to test the models. In the training phase, each selected prescription was assessed separately by a physician and a pharmacist and assigned a diagnosis. To test the performance of each model, k-fold stratified cross-validation was conducted, and sensitivity and specificity were measured. RESULTS: Overall, the two methods had very similar accuracies. In terms of the weighted averages of the true positive rate (sensitivity) and true negative rate (specificity), the decision tree classified slightly more accurately than the artificial neural network (ANN) (83.3% and 96% versus 80.3% and 95.1%, respectively). However, in terms of the weighted average ROC area (AUC of each class against all other classes), the ANN predicted the diagnosis more accurately (93.8% compared with 90.6%). CONCLUSION: According to the results of this study, the ANN and decision tree models show similar accuracy in assigning diagnoses to GI prescriptions.
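As an illustration of the summary metrics reported above, the following minimal Python sketch computes a weighted average of per-class sensitivity and specificity from a multiclass confusion matrix. It is not the authors' code: the function name `weighted_tpr_tnr`, the example matrix, and the choice to weight each class by its share of the true labels are all assumptions made for illustration, since the abstract does not state the weighting scheme.

```python
def weighted_tpr_tnr(confusion):
    """Return (weighted sensitivity, weighted specificity).

    confusion[i][j] is the number of cases whose true class is i
    and whose predicted class is j. Each class is treated one-vs-rest,
    and per-class rates are weighted by class frequency (an assumption).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    weighted_tpr = 0.0
    weighted_tnr = 0.0
    for k in range(n_classes):
        support = sum(confusion[k])          # cases truly in class k
        tp = confusion[k][k]
        fn = support - tp
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp
        tn = total - tp - fn - fp
        tpr = tp / support if support else 0.0
        tnr = tn / (tn + fp) if (tn + fp) else 0.0
        weight = support / total
        weighted_tpr += weight * tpr
        weighted_tnr += weight * tnr
    return weighted_tpr, weighted_tnr


# Hypothetical 3-class confusion matrix (not data from the study).
example = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
tpr, tnr = weighted_tpr_tnr(example)
print(f"weighted sensitivity = {tpr:.3f}, weighted specificity = {tnr:.3f}")
```

Under this frequency weighting, the weighted sensitivity equals the overall proportion of correctly classified cases, which may be why it can be read as an accuracy figure.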