
Dynamics of Boltzmann Q learning in two-player two-action games.

Ardeshir Kianercy, Aram Galstyan.

Abstract

We consider the dynamics of Q learning in two-player two-action games with a Boltzmann exploration mechanism. For any nonzero exploration rate the dynamics is dissipative, which guarantees that agent strategies converge to rest points that are generally different from the game's Nash equilibria (NEs). We provide a comprehensive characterization of the rest point structure for different games and examine the sensitivity of this structure with respect to the noise due to exploration. Our results indicate that for a class of games with multiple NEs the asymptotic behavior of learning dynamics can undergo drastic changes at critical exploration rates. Furthermore, we demonstrate that, for certain games with a single NE, it is possible to have additional rest points (not corresponding to any NE) that persist for a finite range of the exploration rates and disappear when the exploration rates of both players tend to zero.
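As a rough illustration of the setting described in the abstract, the following Python sketch iterates two stateless Q learners that choose between two actions with a Boltzmann (softmax) policy. It is not the authors' model: to keep the run deterministic it updates each Q value toward the expected reward against the opponent's current mixed strategy, rather than toward sampled rewards. The function names, the learning rate `alpha`, the step count, the coordination-game payoffs, and the two temperatures are all assumptions chosen for illustration.

```python
import math


def boltzmann_prob(q, temperature):
    """Probability of action 0 under a two-action Boltzmann (softmax) policy."""
    return 1.0 / (1.0 + math.exp(-(q[0] - q[1]) / temperature))


def simulate(payoff_a, payoff_b, temperature, q_init_a, q_init_b,
             alpha=0.1, steps=5000):
    """Iterate expected-reward Q updates for two players (hypothetical helper).

    payoff_a[i][j] and payoff_b[i][j] are the payoffs to players A and B
    when A plays action i and B plays action j.
    Returns the final probabilities (x, y) that A and B play action 0.
    """
    qa, qb = list(q_init_a), list(q_init_b)
    for _ in range(steps):
        x = boltzmann_prob(qa, temperature)  # P(A plays 0)
        y = boltzmann_prob(qb, temperature)  # P(B plays 0)
        # Expected reward of each action against the opponent's current mix.
        ra = [payoff_a[i][0] * y + payoff_a[i][1] * (1.0 - y) for i in range(2)]
        rb = [payoff_b[0][j] * x + payoff_b[1][j] * (1.0 - x) for j in range(2)]
        # Simplification (assumption): every action's Q value is updated
        # each step, using expected rather than sampled rewards.
        for k in range(2):
            qa[k] += alpha * (ra[k] - qa[k])
            qb[k] += alpha * (rb[k] - qb[k])
    return boltzmann_prob(qa, temperature), boltzmann_prob(qb, temperature)


# Illustrative coordination game with two pure NEs: both play 0, or both play 1.
COORDINATION = [[1.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    for temperature in (1.0, 0.1):
        x, y = simulate(COORDINATION, COORDINATION, temperature,
                        q_init_a=[0.1, 0.0], q_init_b=[0.1, 0.0])
        print(f"T={temperature}: P(A plays 0)={x:.4f}, P(B plays 0)={y:.4f}")
```

In this toy setup the two temperatures sit on either side of the point where the long-run behavior changes. At the higher temperature both players drift to the uniform mixed strategy despite the initial bias, while at the lower temperature the same initial bias is amplified and both settle close to (but not exactly at) a pure NE. This mirrors, in a much simplified form, the abstract's point that rest points differ from NEs and can change qualitatively with the exploration rate.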

Year:  2012        PMID: 22680455     DOI: 10.1103/PhysRevE.85.041145

Source DB:  PubMed          Journal:  Phys Rev E Stat Nonlin Soft Matter Phys        ISSN: 1539-3755


  5 in total

1.  Melioration Learning in Two-Person Games.

Authors:  Johannes Zschache
Journal:  PLoS One       Date:  2016-11-16       Impact factor: 3.240

2.  An Adaptive Learning Based Network Selection Approach for 5G Dynamic Environments.

Authors:  Xiaohong Li; Ru Cao; Jianye Hao
Journal:  Entropy (Basel)       Date:  2018-03-29       Impact factor: 2.524

3.  REinforcement learning to improve non-adherence for diabetes treatments by Optimising Response and Customising Engagement (REINFORCE): study protocol of a pragmatic randomised trial.

Authors:  Julie C Lauffenburger; Elad Yom-Tov; Punam A Keller; Marie E McDonnell; Lily G Bessette; Constance P Fontanet; Ellen S Sears; Erin Kim; Kaitlin Hanken; J Joseph Buckley; Renee A Barlev; Nancy Haff; Niteesh K Choudhry
Journal:  BMJ Open       Date:  2021-12-03       Impact factor: 2.692

4.  Dynamical systems as a level of cognitive analysis of multi-agent learning: Algorithmic foundations of temporal-difference learning dynamics.

Authors:  Wolfram Barfuss
Journal:  Neural Comput Appl       Date:  2021-06-23       Impact factor: 5.606

5.  Critical transitions in a game theoretic model of tumour metabolism.

Authors:  Ardeshir Kianercy; Robert Veltri; Kenneth J Pienta
Journal:  Interface Focus       Date:  2014-08-06       Impact factor: 3.906
