On Gap-Based Lower Bounding Techniques for Best-Arm Identification.

Lan V Truong, Jonathan Scarlett.

Abstract

In this paper, we consider techniques for establishing lower bounds on the number of arm pulls for best-arm identification in the multi-armed bandit problem. While a recent divergence-based approach was shown to provide improvements over an older gap-based approach, we show that the latter can be refined to match the former (up to constant factors) in many cases of interest under Bernoulli rewards, including the case where the mean rewards are bounded away from zero and one. Together with existing upper bounds, this indicates that the divergence-based and gap-based approaches are both effective for establishing sample complexity lower bounds for best-arm identification.
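
To illustrate why gaps and divergences can agree up to constant factors under Bernoulli rewards, the sketch below compares the KL divergence between two Bernoulli arms with the squared gap between their means. It is not the bound derived in the paper; it only checks the standard sandwich 2(p - q)^2 <= KL(p || q) <= (p - q)^2 / (q(1 - q)), which follows from Pinsker's inequality and the chi-squared upper bound on KL. The function names and the example means are assumptions chosen for illustration.

```python
import math


def bernoulli_kl(p: float, q: float) -> float:
    """KL(Bern(p) || Bern(q)) in nats, assuming 0 < p < 1 and 0 < q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_to_gap_ratio(p: float, q: float) -> float:
    """Ratio of the KL divergence to the squared gap (p - q)^2."""
    return bernoulli_kl(p, q) / (p - q) ** 2


def chi_square_constant(q: float) -> float:
    """Upper bound on the ratio from KL <= chi-squared: 1 / (q (1 - q))."""
    return 1.0 / (q * (1.0 - q))


# Illustrative mean pairs (hypothetical values), all with gap 0.1.
interior = kl_to_gap_ratio(0.5, 0.6)      # means well inside (0, 1)
boundary = kl_to_gap_ratio(0.101, 0.001)  # second mean close to zero

if __name__ == "__main__":
    for p, q in [(0.5, 0.6), (0.3, 0.4), (0.101, 0.001)]:
        ratio = kl_to_gap_ratio(p, q)
        print(f"p={p}, q={q}: KL/gap^2 = {ratio:.3f}, "
              f"upper constant = {chi_square_constant(q):.3f}")
```

When both means stay away from zero and one, the ratio is confined to a bounded range, so a divergence-based quantity and a gap-based quantity differ only by a constant factor. As a mean approaches the boundary, the upper constant 1 / (q(1 - q)) grows and the two can separate.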

Keywords:  PAC learning; best-arm identification; information-theoretic lower bounds; multi-armed bandits

Year: 2020; PMID: 33286559; PMCID: PMC7517353; DOI: 10.3390/e22070788

Source DB: PubMed; Journal: Entropy (Basel); ISSN: 1099-4300; Impact factor: 2.524

