Ivo Grondman, Maarten Vaandrager, Lucian Buşoniu, Robert Babuska, Erik Schuitema.
Abstract
We propose two new actor-critic algorithms for reinforcement learning. Both algorithms use local linear regression (LLR) to learn approximations of the functions involved. A crucial feature of the algorithms is that they also learn a process model, and this, in combination with LLR, provides an efficient policy update for faster learning. The first algorithm uses a novel model-based update rule for the actor parameters. The second algorithm does not use an explicit actor but learns a reference model which represents a desired behavior, from which desired control actions can be calculated using the inverse of the learned process model. The two novel methods and a standard actor-critic algorithm are applied to the pendulum swing-up problem, in which the novel methods achieve faster learning than the standard algorithm.
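As a rough illustration of the LLR approximator the abstract refers to, the Python sketch below keeps a memory of samples, queries the k nearest neighbors of a new input, and fits a local linear model to them by least squares. This is a generic memory-based LLR sketch, not the paper's implementation; the class name LLR, the parameter k, and the toy sine-learning usage are all illustrative assumptions.

    import numpy as np

    class LLR:
        """Memory-based local linear regression (illustrative sketch)."""
        def __init__(self, k=8):
            self.k = k      # neighbors per local fit (assumed tuning parameter)
            self.X = []     # stored inputs, e.g. states or state-action pairs
            self.Y = []     # stored targets, e.g. values or next states

        def add_sample(self, x, y):
            self.X.append(np.asarray(x, dtype=float))
            self.Y.append(np.asarray(y, dtype=float))

        def predict(self, x):
            x = np.asarray(x, dtype=float)
            X, Y = np.stack(self.X), np.stack(self.Y)
            # k nearest stored inputs by Euclidean distance
            idx = np.argsort(np.linalg.norm(X - x, axis=1))[:self.k]
            # least-squares fit of y ~ [x, 1] on the neighbors
            A = np.hstack([X[idx], np.ones((len(idx), 1))])
            beta, *_ = np.linalg.lstsq(A, Y[idx], rcond=None)
            return np.append(x, 1.0) @ beta

    # Usage: learn a toy one-dimensional "process model" y = sin(x)
    rng = np.random.default_rng(0)
    model = LLR(k=8)
    for _ in range(200):
        s = rng.uniform(-np.pi, np.pi)
        model.add_sample([s], [np.sin(s)])
    print(model.predict([0.5]))   # approximately sin(0.5) = 0.479

Because each prediction is a local fit, the same memory can serve as a learned process model whose local linear coefficients are directly usable in a policy update, which is in line with the efficiency argument in the abstract.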
Year: 2011 | PMID: 22156998 | DOI: 10.1109/TSMCB.2011.2170565
Source DB: PubMed | Journal: IEEE Trans Syst Man Cybern B Cybern | ISSN: 1083-4419