Medial prefrontal cortex and the adaptive regulation of reinforcement learning parameters.

Mehdi Khamassi, Pierre Enel, Peter Ford Dominey, Emmanuel Procyk.

Abstract

Converging evidence suggests that the medial prefrontal cortex (MPFC) is involved in feedback categorization, performance monitoring, and task monitoring, and may contribute to the online regulation of reinforcement learning (RL) parameters that would affect decision-making processes in the lateral prefrontal cortex (LPFC). Previous neurophysiological experiments have shown MPFC activity encoding error likelihood, uncertainty, and reward volatility, as well as neural responses categorizing different types of feedback, for instance distinguishing between choice errors and execution errors. Rushworth and colleagues have proposed that the involvement of MPFC in tracking the volatility of the task could contribute to the regulation of one of the RL parameters, the learning rate. We extend this hypothesis by proposing that MPFC could contribute to the regulation of other RL parameters, such as the exploration rate and default action values in case of task shifts. Here, we analyze the sensitivity to RL parameters of behavioral performance in two monkey decision-making tasks, one with a deterministic reward schedule and the other with a stochastic one. We show that there exist optimal parameter values specific to each of these tasks, which need to be found for optimal performance and which are usually hand-tuned in computational models. In contrast, automatic online regulation of these parameters using some heuristics can help produce good, although non-optimal, behavioral performance in each task. We finally describe our computational model of MPFC-LPFC interaction used for online regulation of the exploration rate and its application to a human-robot interaction scenario. There, unexpected uncertainties are produced by the human introducing cued task changes or by cheating. The model enables the robot to autonomously learn to reset exploration in response to such uncertain cues and events.
The combined results provide concrete evidence specifying how prefrontal cortical subregions may cooperate to regulate RL parameters. They also show how such neurophysiologically inspired mechanisms can control advanced robots in the real world. Finally, the model's learning mechanisms that were challenged in the last robotic scenario provide testable predictions on the way monkeys may learn the structure of the task during the pretraining phase of the previous laboratory experiments.
Copyright © 2013 Elsevier B.V. All rights reserved.
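
The idea of regulating the exploration rate online, rather than hand-tuning it, can be illustrated with a minimal sketch. This is not the authors' MPFC-LPFC model: it assumes a simple delta-rule learner with softmax action selection, and the feedback-driven rule in `regulate_beta`, along with every function name, bound, and default value, is a hypothetical choice made only for illustration. Here the inverse temperature `beta` plays the role of the exploration rate (low `beta` means more exploration), and `alpha` is the learning rate.

```python
import math
import random


def softmax(values, beta):
    """Choice probabilities; beta is the inverse temperature."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def q_update(q, action, reward, alpha):
    """Delta-rule update of one action value; returns the prediction error."""
    delta = reward - q[action]
    q[action] += alpha * delta
    return delta


def regulate_beta(beta, reward, beta_min=1.0, beta_max=10.0, eta=0.5):
    """Hypothetical heuristic (an assumption, not the published model):
    drift toward exploitation after reward, toward exploration after error."""
    target = beta_max if reward > 0 else beta_min
    return beta + eta * (target - beta)


def run(n_trials=200, n_actions=4, alpha=0.5, switch_at=100, seed=0):
    """Deterministic-reward task with one uncued shift of the rewarded action."""
    rng = random.Random(seed)
    q = [0.0] * n_actions
    beta = 1.0
    rewarded = 0
    betas, rewards = [], []
    for t in range(n_trials):
        if t == switch_at:
            rewarded = 1  # task shift: a different action is now rewarded
        probs = softmax(q, beta)
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = 1.0 if action == rewarded else 0.0
        q_update(q, action, reward, alpha)
        beta = regulate_beta(beta, reward)
        betas.append(beta)
        rewards.append(reward)
    return betas, rewards


if __name__ == "__main__":
    betas, rewards = run()
    print("mean reward:", sum(rewards) / len(rewards))
```

In this sketch, an unrewarded choice after the shift pulls `beta` back toward `beta_min`, which loosely mirrors the "reset exploration" behavior described in the abstract. A fixed `beta` would instead have to be tuned separately for each task.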

Year:  2013        PMID: 23317844     DOI: 10.1016/B978-0-444-62604-2.00022-8

Source DB:  PubMed          Journal:  Prog Brain Res        ISSN: 0079-6123            Impact factor:   2.453


  13 in total

1.  Anterior Cingulate Cortex Cells Identify Process-Specific Errors of Attentional Control Prior to Transient Prefrontal-Cingulate Inhibition.

Authors:  Chen Shen; Salva Ardid; Daniel Kaping; Stephanie Westendorff; Stefan Everling; Thilo Womelsdorf
Journal:  Cereb Cortex       Date:  2014-03-02       Impact factor: 5.357

2.  A spiking neural integrator model of the adaptive control of action by the medial prefrontal cortex.

Authors:  Trevor Bekolay; Mark Laubach; Chris Eliasmith
Journal:  J Neurosci       Date:  2014-01-29       Impact factor: 6.167

3.  Prefrontal and anterior cingulate cortex neurons encode attentional targets even when they do not apparently bias behavior.

Authors:  Stephanie Westendorff; Daniel Kaping; Stefan Everling; Thilo Womelsdorf
Journal:  J Neurophysiol       Date:  2016-05-18       Impact factor: 2.714

4.  Temporal chunking as a mechanism for unsupervised learning of task-sets.

Authors:  Flora Bouchacourt; Stefano Palminteri; Etienne Koechlin; Srdjan Ostojic
Journal:  Elife       Date:  2020-03-09       Impact factor: 8.140

5.  The exploration-exploitation dilemma: a multidisciplinary framework.

Authors:  Oded Berger-Tal; Jonathan Nathan; Ehud Meron; David Saltz
Journal:  PLoS One       Date:  2014-04-22       Impact factor: 3.240

6.  Learning the value of information and reward over time when solving exploration-exploitation problems.

Authors:  Irene Cogliati Dezza; Angela J Yu; Axel Cleeremans; William Alexander
Journal:  Sci Rep       Date:  2017-12-05       Impact factor: 4.379

7.  Confidence and psychosis: a neuro-computational account of contingency learning disruption by NMDA blockade.

Authors:  F Vinckier; R Gaillard; S Palminteri; L Rigoux; A Salvador; A Fornito; R Adapa; M O Krebs; M Pessiglione; P C Fletcher
Journal:  Mol Psychiatry       Date:  2015-06-09       Impact factor: 15.992

8.  Spatiotemporal Spike Coding of Behavioral Adaptation in the Dorsal Anterior Cingulate Cortex.

Authors:  Laureline Logiaco; René Quilodran; Emmanuel Procyk; Angelo Arleo
Journal:  PLoS Biol       Date:  2015-08-12       Impact factor: 8.029

9.  Modulation of feedback-related negativity during trial-and-error exploration and encoding of behavioral shifts.

Authors:  Jérôme Sallet; Nathalie Camille; Emmanuel Procyk
Journal:  Front Neurosci       Date:  2013-11-14       Impact factor: 4.677

10.  Critical role for the mediodorsal thalamus in permitting rapid reward-guided updating in stochastic reward environments.

Authors:  Subhojit Chakraborty; Nils Kolling; Mark E Walton; Anna S Mitchell
Journal:  Elife       Date:  2016-05-02       Impact factor: 8.140
