Patrick Doupe (1), James Faghmous (2), Sanjay Basu (3)
1. Zalando SE, Berlin, Germany. Electronic address: patrick.doupe@zalando.de.
2. Center for Population Health Sciences and Center for Primary Care and Outcomes Research, Departments of Medicine and of Health Research and Policy, Stanford University, Stanford, CA, USA.
3. Research and Analytics, Collective Health, San Francisco, CA, USA; School of Public Health, Imperial College, London, England, United Kingdom.
Abstract
BACKGROUND: Machine learning is increasingly used to predict healthcare outcomes, including cost, utilization, and quality.
OBJECTIVE: We provide a high-level overview of machine learning for healthcare outcomes researchers and decision makers.
METHODS: We introduce key concepts for understanding the application of machine learning methods to healthcare outcomes research. We first describe current standards for rigorously learning an estimator, which is an algorithm developed through machine learning to predict a particular outcome. We cover the steps of data preparation, estimator family selection, parameter learning, regularization, and evaluation. We then compare 3 of the most common machine learning methods: (1) decision tree methods, which can be useful for identifying how different subpopulations experience different risks for an outcome; (2) deep learning methods, which can identify complex nonlinear patterns or interactions between variables predictive of an outcome; and (3) ensemble methods, which can improve predictive performance by combining multiple machine learning methods.
RESULTS: We demonstrate the application of common machine learning methods to a simulated insurance claims dataset. Specifically, we include statistical code in R and Python for developing and evaluating estimators that predict which patients are at heightened risk for hospitalization from ambulatory care-sensitive conditions.
CONCLUSIONS: Outcomes researchers should be aware of key standards for rigorously evaluating an estimator developed through machine learning approaches. Although multiple methods use machine learning concepts, different approaches are best suited for different research problems.
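The steps named in the abstract (data preparation, estimator family selection, parameter learning, regularization, and evaluation) can be illustrated with a minimal sketch. This is not the authors' published code. It assumes a hypothetical simulated dataset with two standardized features (`age`, `prior_visits`), picks logistic regression as the estimator family, learns parameters by batch gradient descent, applies an L2 penalty as one possible form of regularization, and evaluates on held-out rows with the area under the ROC curve. All function names, coefficients, and hyperparameters are illustrative assumptions.

```python
import math
import random


def simulate_claims(n, seed=0):
    """Data preparation: simulate (features, label) rows.
    Feature names and coefficients are hypothetical, for illustration only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(-1.0, 1.0)           # standardized age (assumed feature)
        prior_visits = rng.uniform(-1.0, 1.0)  # standardized prior visits (assumed feature)
        logit = -1.0 + 2.0 * age + 1.5 * prior_visits
        p = 1.0 / (1.0 + math.exp(-logit))
        y = 1 if rng.random() < p else 0       # 1 = hospitalized (simulated outcome)
        rows.append(([1.0, age, prior_visits], y))  # leading 1.0 is the intercept term
    return rows


def predict(weights, x):
    """Estimator family: logistic regression, returning a predicted risk in (0, 1)."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, l2=0.01, lr=0.5, epochs=300):
    """Parameter learning: batch gradient descent on L2-penalized log loss."""
    weights = [0.0] * len(rows[0][0])
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in rows:
            err = predict(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        for j in range(len(weights)):
            # Regularization: shrink coefficients, but leave the intercept unpenalized.
            penalty = l2 * weights[j] if j > 0 else 0.0
            weights[j] -= lr * (grad[j] + penalty)
    return weights


def auc(scores, labels):
    """Evaluation: probability that a random positive case is scored above
    a random negative case (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


data = simulate_claims(600)
train, test = data[:400], data[400:]  # hold out rows the estimator never sees in training
weights = fit_logistic(train)
test_auc = auc([predict(weights, x) for x, _ in test], [y for _, y in test])
print(f"held-out AUC: {test_auc:.3f}")
```

Evaluating on rows excluded from training is the key design choice here: performance measured on the training rows themselves would overstate how well the estimator generalizes to new patients.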