Inference for Batched Bandits.

Kelly W Zhang, Lucas Janson, Susan A Murphy.

Abstract

As bandit algorithms are increasingly used in scientific studies and industrial applications, there is a growing need for reliable inference methods for the resulting adaptively collected data. In this work, we develop methods for inference on data collected in batches using a bandit algorithm. First, we prove that the ordinary least squares (OLS) estimator, which is asymptotically normal on independently sampled data, is not asymptotically normal on data collected using standard bandit algorithms when there is no unique optimal arm. This asymptotic non-normality means that naively treating the OLS estimator as approximately normal can lead to Type-1 error inflation and confidence intervals with below-nominal coverage probabilities. Second, we introduce the Batched OLS (BOLS) estimator, which we prove is (1) asymptotically normal on data collected from both multi-armed and contextual bandits and (2) robust to non-stationarity in the baseline reward.
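The contrast the abstract draws can be explored with a small simulation. The sketch below is illustrative only: the abstract does not state the BOLS formula, so the batch-wise standardization used here (estimate the arm difference within each batch, scale it to unit variance, then average across batches) is an assumed reading of "Batched OLS". The two-arm epsilon-greedy policy, the known noise standard deviation `sigma`, the forced allocation of one unit to each arm per batch, and all function names are likewise assumptions made for the example.

```python
import numpy as np


def simulate_batched_bandit(T=25, n=100, eps=0.1, means=(0.0, 0.0),
                            sigma=1.0, rng=None):
    """Two-arm batched epsilon-greedy bandit (illustrative assumption).

    Returns actions and rewards, each of shape (T, n).
    """
    rng = np.random.default_rng() if rng is None else rng
    means = np.asarray(means, dtype=float)
    actions = np.empty((T, n), dtype=int)
    rewards = np.empty((T, n))
    p1 = 0.5  # first batch: choose each arm with equal probability
    for t in range(T):
        a = rng.binomial(1, p1, size=n)
        # Simplification: force one unit onto each arm so that every
        # batch-level difference in means is defined.
        a[0], a[1] = 0, 1
        actions[t] = a
        rewards[t] = means[a] + sigma * rng.standard_normal(n)
        # Update the policy from all data collected so far.
        past_a = actions[: t + 1].ravel()
        past_r = rewards[: t + 1].ravel()
        m0 = past_r[past_a == 0].mean()
        m1 = past_r[past_a == 1].mean()
        p1 = 1 - eps / 2 if m1 > m0 else eps / 2
    return actions, rewards


def ols_z(actions, rewards, sigma=1.0, delta0=0.0):
    """Pooled z-statistic for the arm difference, ignoring adaptivity."""
    a, r = actions.ravel(), rewards.ravel()
    n1 = a.sum()
    n0 = a.size - n1
    delta_hat = r[a == 1].mean() - r[a == 0].mean()
    return (delta_hat - delta0) / (sigma * np.sqrt(1 / n0 + 1 / n1))


def bols_z(actions, rewards, sigma=1.0, delta0=0.0):
    """Batch-wise standardized statistic (assumed form of BOLS)."""
    T, n = actions.shape
    z = np.empty(T)
    for t in range(T):
        a, r = actions[t], rewards[t]
        n1 = a.sum()
        n0 = n - n1
        delta_hat = r[a == 1].mean() - r[a == 0].mean()
        # sqrt(n0 * n1 / n) equals 1 / sqrt(1/n0 + 1/n1) since n0 + n1 = n.
        z[t] = np.sqrt(n0 * n1 / n) * (delta_hat - delta0) / sigma
    return z.sum() / np.sqrt(T)


def rejection_rates(reps=1000, seed=0, sigma=1.0, **kw):
    """Share of runs in which each statistic exceeds 1.96 in absolute value."""
    rng = np.random.default_rng(seed)
    ols = np.empty(reps)
    bols = np.empty(reps)
    for i in range(reps):
        a, r = simulate_batched_bandit(sigma=sigma, rng=rng, **kw)
        ols[i] = ols_z(a, r, sigma)
        bols[i] = bols_z(a, r, sigma)
    return np.mean(np.abs(ols) > 1.96), np.mean(np.abs(bols) > 1.96)
```

With the default `means=(0.0, 0.0)` there is no unique optimal arm, which is the setting the abstract singles out. Calling `rejection_rates(reps=2000)` returns the empirical Type-1 error of each statistic at the nominal 5% level, so the two can be compared directly; the exact values depend on the seed, the batch size `n`, and the number of batches `T`.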

Year: 2020    PMID: 35002190    PMCID: PMC8734616

Source DB: PubMed    Journal: Adv Neural Inf Process Syst    ISSN: 1049-5258


