Sören R Künzel, Jasjeet S Sekhon, Peter J Bickel, Bin Yu.
Abstract
There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms, such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks, to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
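For orientation, the CATE named in the abstract is conventionally defined in potential-outcomes notation as shown below. The symbols (Y(1), Y(0), X, D, and τ) are assumed here for illustration; this record does not reproduce the paper's own notation.

```latex
% Assumed notation: Y(1), Y(0) are the potential outcomes under treatment
% and control, X is the covariate vector, and D = Y(1) - Y(0) is the
% (unobserved) individual treatment effect (ITE).
\[
  \tau(x) \;=\; \mathbb{E}\bigl[\, D \mid X = x \,\bigr]
          \;=\; \mathbb{E}\bigl[\, Y(1) - Y(0) \mid X = x \,\bigr]
\]
```

Because only one of Y(1) and Y(0) is observed for any unit, D is never observed directly, which is why a standard regression method cannot be fit to the CATE without a metaalgorithm wrapped around it.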
Keywords: conditional average treatment effect; heterogeneous treatment effects; minimax optimality; observational studies; randomized controlled trials
Year: 2019; PMID: 30770453; PMCID: PMC6410831; DOI: 10.1073/pnas.1804597116
Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205
Fig. 1. Intuition behind the X-learner with an unbalanced design. (A) Observed outcome and first-stage base learners. (B) Imputed treatment effects and second-stage base learners. (C) ITEs and CATE estimators.
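The three stages named in the Fig. 1 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' software package: the base learner is ordinary least squares rather than RF or BART, the data are synthetic with a linear CATE, the propensity score is taken as a known constant of 0.1, and the names `fit_linear` and `x_learner` are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Hypothetical base learner: least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef


def x_learner(X, y, w, propensity):
    """Sketch of the X-learner; w is a 0/1 treatment indicator."""
    X0, y0 = X[w == 0], y[w == 0]
    X1, y1 = X[w == 1], y[w == 1]

    # Stage 1: fit the response functions separately in each group.
    mu0 = fit_linear(X0, y0)
    mu1 = fit_linear(X1, y1)

    # Stage 2: impute treatment effects, then regress them on covariates.
    d1 = y1 - mu0(X1)   # treated units: observed minus predicted control
    d0 = mu1(X0) - y0   # control units: predicted treated minus observed
    tau1 = fit_linear(X1, d1)
    tau0 = fit_linear(X0, d0)

    # Stage 3: combine both estimates, weighting by the propensity score.
    def cate(Z):
        g = propensity(Z)
        return g * tau0(Z) + (1.0 - g) * tau1(Z)

    return cate


# Synthetic, unbalanced design (assumption): about 10% of units are treated.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
w = (rng.random(n) < 0.1).astype(int)
tau_true = 1.0 + 2.0 * X[:, 0]                      # linear CATE
y = X[:, 1] + w * tau_true + 0.1 * rng.normal(size=n)

cate = x_learner(X, y, w, propensity=lambda Z: np.full(len(Z), 0.1))
tau_hat = cate(X)
rmse = float(np.sqrt(np.mean((tau_hat - tau_true) ** 2)))
print(f"RMSE of the CATE estimate: {rmse:.3f}")
```

With a small propensity score, the combined estimate leans on `tau1`, whose imputed effects rely on the control-group model `mu0` fitted on the larger group. Swapping `fit_linear` for any other regression routine leaves the three stages unchanged.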
Fig. 2. Social pressure and voter turnout. Potential voters are grouped by the number of elections they participated in, ranging from 0 (potential voters who did not vote during the past five elections) to 5 (voters who participated in all five past elections). The width of each group is proportional to the size of the group. (Upper) Positive values correspond to the percentage of voters for which the predicted CATE is significantly positive, while negative values correspond to the percentage of voters for which the predicted CATE is significantly negative. (Lower) The plot shows the CATE estimate distribution for each bin.
Fig. 3. RMSE, bias, and variance for a simulation based on the social pressure and voter turnout experiment.
Fig. 4. Histograms for the distribution of the CATE estimates in the Reducing Transphobia study. The horizontal line shows the position of the estimated ATE. (A) X-RF. (B) T-RF. (C) S-RF.