
Learning to reach by reinforcement learning using a receptive field based function approximation approach with continuous actions.

Minija Tamosiunaite, Tamim Asfour, Florentin Wörgötter.

Abstract

Reinforcement learning methods can be used in robotics applications, especially for specific target-oriented problems such as the reward-based recalibration of goal-directed actions. To this end, relatively large and continuous state-action spaces still need to be handled efficiently. The goal of this paper is thus to develop a novel, rather simple method that uses reinforcement learning with function approximation in conjunction with different reward strategies for solving such problems. To test our method, we use a four degree-of-freedom (DOF) reaching problem in 3D space, simulated by a two-joint robot arm system with two DOF per joint. Function approximation is based on 4D overlapping kernels (receptive fields), and the state-action space contains about 10,000 of these. Different types of reward structures are compared, for example reward-on-touching-only against reward-on-approach. Furthermore, forbidden joint configurations are punished. A continuous action space is used. In spite of the rather large number of states and the continuous action space, these reward/punishment strategies allow the system to find a good solution, usually within about 20 trials. The efficiency of our method demonstrated in this test scenario suggests that it might be possible to use it on a real robot for problems where mixed rewards can be defined and where other types of learning might be difficult.
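
The kernel-based function approximation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the kernel shape, the learning rule, or any parameter values, so the Gaussian kernels, the regular 10 x 10 x 10 x 10 grid (chosen only to match the figure of about 10,000 kernels), the TD(0) update, and all names and constants (`make_centers`, `activations`, `td_update`, `sigma`, `alpha`, `gamma`) are assumptions made for illustration. Continuous action selection and the reward schemes compared in the paper are not covered.

```python
import itertools
import numpy as np

def make_centers(n_per_dim=10, low=0.0, high=1.0, dims=4):
    """Place kernel centers on a regular grid (assumed layout): n_per_dim**dims kernels."""
    axis = np.linspace(low, high, n_per_dim)
    return np.array(list(itertools.product(axis, repeat=dims)))

def activations(state, centers, sigma=0.1):
    """Normalized Gaussian receptive-field activations for one 4D state (assumed kernel shape)."""
    sq_dist = np.sum((centers - state) ** 2, axis=1)
    phi = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return phi / phi.sum()

def td_update(weights, phi, phi_next, reward, alpha=0.5, gamma=0.9):
    """One TD(0) step on a linear value estimate V(s) = weights . phi(s); updates weights in place."""
    td_error = reward + gamma * (weights @ phi_next) - (weights @ phi)
    weights += alpha * td_error * phi
    return td_error

# Hypothetical usage: joint angles scaled to [0, 1], one rewarded transition.
centers = make_centers()
weights = np.zeros(len(centers))
s = np.full(4, 0.5)
s_next = np.full(4, 0.55)
delta = td_update(weights, activations(s, centers), activations(s_next, centers), reward=1.0)
```

Because the kernels overlap, a single rewarded transition raises the value estimate not only at the visited state but also in its neighborhood, which is one plausible reason such a representation can generalize from few trials.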

Year: 2009   PMID: 19229556   PMCID: PMC2798030   DOI: 10.1007/s00422-009-0295-8

Source DB: PubMed   Journal: Biol Cybern   ISSN: 0340-1200   Impact factor: 2.086


