OBJECTIVE: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy.
STUDY DESIGN AND SETTING: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use.
RESULTS: We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting).
CONCLUSION: Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART) appear to be the most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice.
Copyright (c) 2010 Elsevier Inc. All rights reserved.