
Statistical Inference with M-Estimators on Adaptively Collected Data.

Kelly W Zhang, Lucas Janson, Susan A Murphy.

Abstract

Bandit algorithms are increasingly used in real-world sequential decision-making problems. With this comes a growing desire to use the resulting datasets to answer scientific questions such as: Did one type of ad lead to more purchases? In which contexts is a mobile health intervention effective? However, classical statistical approaches fail to provide valid confidence intervals when applied to data collected with bandit algorithms. Alternative methods have recently been developed for simple models (e.g., comparison of means). Yet general methods are lacking for statistical inference with more complex models on data collected with (contextual) bandit algorithms; for example, current methods cannot be used for valid inference on parameters in a logistic regression model for a binary reward. In this work, we develop theory justifying the use of M-estimators, which include estimators based on empirical risk minimization as well as maximum likelihood, on data collected with adaptive algorithms, including (contextual) bandit algorithms. Specifically, we show that M-estimators, modified with particular adaptive weights, can be used to construct asymptotically valid confidence regions for a variety of inferential targets.
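
To make the idea of an adaptively weighted M-estimator concrete, here is a minimal Python sketch. The abstract does not say which weights are used, so the sketch assumes square-root importance weights W_t = sqrt(pi_sta(A_t) / pi_t(A_t)) relative to a fixed uniform reference policy. The two-armed epsilon-greedy bandit, the Gaussian rewards, the squared-error loss, and all function and parameter names are hypothetical choices made for illustration. The sketch computes point estimates only and does not construct confidence regions.

```python
import math
import random


def simulate_epsilon_greedy(T, true_means, epsilon=0.1, seed=0):
    """Run a two-armed epsilon-greedy bandit (illustrative, not from the paper).

    Returns a list of (action, reward, prob) triples, where prob is the
    probability the history-dependent policy assigned to the chosen action.
    """
    rng = random.Random(seed)
    sums = [0.0, 0.0]
    counts = [0, 0]
    history = []
    for _ in range(T):
        # Running mean reward per arm (0.0 before an arm has any data).
        means = [sums[a] / counts[a] if counts[a] else 0.0 for a in (0, 1)]
        greedy = 1 if means[1] > means[0] else 0
        # Action probabilities under the current adaptive policy.
        probs = [epsilon / 2, epsilon / 2]
        probs[greedy] += 1 - epsilon
        action = 1 if rng.random() < probs[1] else 0
        reward = rng.gauss(true_means[action], 1.0)
        sums[action] += reward
        counts[action] += 1
        history.append((action, reward, probs[action]))
    return history


def weighted_arm_means(history, stable_prob=0.5):
    """Weighted least-squares M-estimate of each arm's mean reward.

    Minimizes sum_t W_t * (Y_t - theta[A_t]) ** 2 with the ASSUMED weights
    W_t = sqrt(stable_prob / pi_t(A_t)). For this loss the minimizer has a
    closed form: a W-weighted average of the rewards within each arm.
    """
    num = [0.0, 0.0]
    den = [0.0, 0.0]
    for action, reward, prob in history:
        w = math.sqrt(stable_prob / prob)
        num[action] += w * reward
        den[action] += w
    return [num[a] / den[a] if den[a] > 0 else float("nan") for a in (0, 1)]


# Example: 2000 decisions, true arm means 0.0 and 0.5.
history = simulate_epsilon_greedy(T=2000, true_means=(0.0, 0.5), seed=1)
estimates = weighted_arm_means(history)
```

When every action probability equals `stable_prob`, all weights are 1 and the estimate reduces to the ordinary per-arm sample mean. For a logistic regression model of a binary reward, the closed form would be replaced by numerically minimizing the weighted negative log-likelihood.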


Year:  2021        PMID: 35757490      PMCID: PMC9232184     

Source DB:  PubMed          Journal:  Adv Neural Inf Process Syst        ISSN: 1049-5258


