Reactive Reinforcement Learning in Asynchronous Environments.

Jaden B. Travnik, Kory W. Mathewson, Richard S. Sutton, Patrick M. Pilarski

Abstract

The relationship between a reinforcement learning (RL) agent and an asynchronous environment is often ignored. Frequently used models of the interaction between an agent and its environment, such as Markov Decision Processes (MDPs) or Semi-Markov Decision Processes (SMDPs), do not capture the fact that, in an asynchronous environment, the state of the environment may change during computation performed by the agent. In an asynchronous environment, minimizing reaction time (the time it takes for an agent to react to an observation) also minimizes the time in which the state of the environment may change following observation. In many environments, the reaction time of an agent directly impacts task performance by permitting the environment to transition into either an undesirable terminal state or a state where performing the chosen action is inappropriate. We propose a class of reactive reinforcement learning algorithms that address this problem of asynchronous environments by acting immediately after observing new state information. We compare a reactive SARSA learning algorithm with the conventional SARSA learning algorithm on two asynchronous robotic tasks (emergency stopping and impact prevention), and show that the reactive RL algorithm reduces the reaction time of the agent by approximately the duration of the algorithm's learning update. This new class of reactive algorithms may facilitate safer control and faster decision making without any change to standard learning guarantees.
Copyright © 2018 Travnik, Mathewson, Sutton and Pilarski.
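
The abstract's central algorithmic change is a reordering of the SARSA control loop: the reactive variant issues the chosen action as soon as the new observation arrives and performs the temporal-difference update afterward, so reaction time no longer includes the update's duration. The Python sketch below illustrates that reordering under assumptions introduced here (a tabular Q array, an epsilon_greedy helper, and a hypothetical env.act interface); it is not the authors' implementation.

import numpy as np

def epsilon_greedy(Q, s, epsilon, n_actions, rng):
    # Pick a random action with probability epsilon, otherwise the greedy one.
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

def conventional_sarsa_step(env, Q, s, a, r, s_next, alpha, gamma, epsilon, n_actions, rng):
    # Conventional ordering: learn first, then act.
    a_next = epsilon_greedy(Q, s_next, epsilon, n_actions, rng)
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])  # update runs before the action is issued
    env.act(a_next)  # assumed interface: the action reaches the environment only after the update
    return a_next

def reactive_sarsa_step(env, Q, s, a, r, s_next, alpha, gamma, epsilon, n_actions, rng):
    # Reactive ordering: act as soon as the new observation arrives, learn afterward.
    a_next = epsilon_greedy(Q, s_next, epsilon, n_actions, rng)
    env.act(a_next)  # assumed interface: reaction time excludes the update below
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])  # same update, performed after acting
    return a_next

Both step functions apply the same SARSA update to the same transition (s, a, r, s_next, a_next); only the position of env.act changes, which is why the reactive variant's reaction time shrinks by roughly the time the update takes.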

Keywords:  asynchronous environments; reaction time; real-time machine learning; reinforcement learning; resource-limited systems

Year:  2018        PMID: 33500958      PMCID: PMC7805616          DOI: 10.3389/frobt.2018.00079

Source DB:  PubMed          Journal:  Front Robot AI        ISSN: 2296-9144


  4 in total

1.  Mastering the game of Go with deep neural networks and tree search.

Authors:  David Silver; Aja Huang; Chris J Maddison; Arthur Guez; Laurent Sifre; George van den Driessche; Julian Schrittwieser; Ioannis Antonoglou; Veda Panneershelvam; Marc Lanctot; Sander Dieleman; Dominik Grewe; John Nham; Nal Kalchbrenner; Ilya Sutskever; Timothy Lillicrap; Madeleine Leach; Koray Kavukcuoglu; Thore Graepel; Demis Hassabis
Journal:  Nature       Date:  2016-01-28       Impact factor: 49.962

2.  Parallel Online Temporal Difference Learning for Motor Control.

Authors:  Wouter Caarls; Erik Schuitema
Journal:  IEEE Trans Neural Netw Learn Syst       Date:  2015-06-23       Impact factor: 10.451

3.  Musicians have better memory than nonmusicians: A meta-analysis.

Authors:  Francesca Talamini; Gianmarco Altoè; Barbara Carretti; Massimo Grassi
Journal:  PLoS One       Date:  2017-10-19       Impact factor: 3.240

4.  Superior memorizers employ different neural networks for encoding and recall.

Authors:  Johannes Mallow; Johannes Bernarding; Michael Luchtmann; Anja Bethmann; André Brechmann
Journal:  Front Syst Neurosci       Date:  2015-09-14
