Reinforcement learning via kernel temporal difference.

Jihye Bae, Pratik Chhatbar, Joseph T Francis, Justin C Sanchez, Jose C Principe.

Abstract

This paper introduces kernel Temporal Difference (TD)(λ), a kernel adaptive filter implemented with stochastic gradient on temporal differences, to estimate the state-action value function in reinforcement learning. Only the case λ=0 is studied in this paper. Experimental results show that the method can learn motor state decoding during a center-out reaching task performed by a monkey. The results are compared with those of a time delay neural network (TDNN) trained with backpropagation of the temporal difference error. In these experiments, kernel TD(0) converges faster and reaches a better solution than the neural network.
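
As a rough illustration of the idea in the abstract, the sketch below applies a TD(0) update to a value estimate written as a kernel expansion, V(x) = Σ αᵢ k(xᵢ, x): each observed state becomes a new kernel center whose coefficient is the step size times the TD error. It is a minimal sketch under stated assumptions, not the authors' implementation. It predicts a state value rather than the state-action value the paper estimates, and the Gaussian kernel, the names `KernelTD0`, `gaussian_kernel`, and `run_two_state_chain`, the toy two-state chain, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(x, centers, width):
    """Gaussian kernel between one state x and every row of centers."""
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * width ** 2))


class KernelTD0:
    """Value estimate V(x) = sum_i alpha_i * k(x_i, x), grown by one center per update.

    Hypothetical illustration; parameter defaults are assumptions, not values from the paper.
    """

    def __init__(self, dim, step_size=0.5, discount=0.9, width=0.5):
        self.centers = np.empty((0, dim))  # stored states x_i
        self.alphas = np.empty(0)          # coefficients alpha_i
        self.step_size = step_size
        self.discount = discount
        self.width = width

    def value(self, x):
        if len(self.alphas) == 0:
            return 0.0
        return float(self.alphas @ gaussian_kernel(x, self.centers, self.width))

    def update(self, x, reward, x_next, terminal):
        # TD(0) target: reward plus discounted value of the next state.
        target = reward if terminal else reward + self.discount * self.value(x_next)
        td_error = target - self.value(x)
        # Stochastic gradient step in function space: add a kernel centered
        # at x, weighted by step_size * td_error.
        self.centers = np.vstack([self.centers, x])
        self.alphas = np.append(self.alphas, self.step_size * td_error)
        return td_error


def run_two_state_chain(episodes=50):
    """Toy chain (assumed for illustration): s0 -> s1 with reward 0, s1 -> end with reward 1."""
    s0, s1 = np.array([0.0]), np.array([1.0])
    model = KernelTD0(dim=1)
    for _ in range(episodes):
        model.update(s0, 0.0, s1, terminal=False)
        model.update(s1, 1.0, s1, terminal=True)
    return model, s0, s1
```

With a discount of 0.9, the true values on this toy chain are 1.0 for the second state and 0.9 for the first, so the estimates returned by `run_two_state_chain()` should approach those numbers. Note that the expansion grows by one center per update; practical kernel adaptive filters usually add some form of sparsification, which this sketch omits.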

Year: 2011 PMID: 22255624 DOI: 10.1109/IEMBS.2011.6091370

Source DB: PubMed Journal: Conf Proc IEEE Eng Med Biol Soc ISSN: 1557-170X

Cited

5 in total

1. Kernel temporal differences for neural decoding.

Authors: Jihye Bae; Luis G Sanchez Giraldo; Eric A Pohlmeyer; Joseph T Francis; Justin C Sanchez; José C Príncipe
Journal: Comput Intell Neurosci Date: 2015-03-17

2. Reward Expectation Modulates Local Field Potentials, Spiking Activity and Spike-Field Coherence in the Primary Motor Cortex.

Authors: Junmo An; Taruna Yadav; John P Hessburg; Joseph T Francis
Journal: eNeuro Date: 2019-06-26

3. Neural Decoders Using Reinforcement Learning in Brain Machine Interfaces: A Technical Review. (Review)

Authors: Benton Girdler; William Caldbeck; Jihye Bae
Journal: Front Syst Neurosci Date: 2022-08-26

4. Towards a naturalistic brain-machine interface: hybrid torque and position control allows generalization to novel dynamics.

Authors: Pratik Y Chhatbar; Joseph T Francis
Journal: PLoS One Date: 2013-01-24 Impact factor: 3.240

5. Use of frontal lobe hemodynamics as reinforcement signals to an adaptive controller.

Authors: Marcello M DiStasio; Joseph T Francis
Journal: PLoS One Date: 2013-07-22 Impact factor: 3.240