Kenta Kimura, Noriaki Kanayama, Asako Toyama, Kentaro Katahira.
Abstract
This study aimed to investigate whether instrumental reward learning is affected by the cardiac cycle. To this end, we examined the effects of the cardiac cycle (systole or diastole) on the computational processes underlying participants' choices in an instrumental learning task. In this task, participants were required to select one of two discriminative stimuli (neutral visual stimuli) and immediately received reward/punishment feedback according to the probability assigned to the chosen stimulus. To manipulate the cardiac cycle, the presentation of the discriminative stimuli was timed to coincide with either cardiac systole or diastole. We fitted the participants' choices in the task with reinforcement learning (RL) models and estimated the parameters governing instrumental learning (i.e., learning rate and inverse temperature) separately for the systole and diastole trials. Model-based analysis revealed that the learning rate for positive prediction errors was higher than that for negative prediction errors in the systole trials; however, learning rates did not differ between positive and negative prediction errors in the diastole trials. These results demonstrate that the natural fluctuation of cardiac afferent signals can affect asymmetric value updating in instrumental reward learning.
Keywords: baroreflex; cardiac cycle; decision-making; instrumental learning; interoception; reinforcement learning; reward learning
Year: 2022 PMID: 35720717 PMCID: PMC9201078 DOI: 10.3389/fnins.2022.889440
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 5.152
FIGURE 1 Schematic illustration of the flow of one trial of the instrumental learning task. The presentation of the discriminative stimuli (indicated by the dashed line) was experimentally manipulated to coincide with either cardiac systole or diastole. The figure also shows the reward probability for one of the two discriminative stimuli (stimulus A), which changed across six blocks according to a pre-determined schedule.
Information concerning the five models compared on the basis of their fit to the choice data from 36 participants.
| Model name | Description | Number of free parameters | LML, mean (SE) |
|---|---|---|---|
| Q-A | The standard Q-learning model with asymmetric learning rates (α+ and α−) for positive and negative reward prediction errors | 3 | −211.5 (6.54) |
| Q-AF | The Q-A model with updating of unchosen action values using a forgetting parameter | 4 | −205.6 (7.34) |
| Q-AC | The Q-A model with a computational process for choice history using a decay rate (τ) and a perseverance parameter (φ) | 5 | −206.5 (7.28) |
| Q-AFC | The hybrid of the Q-AF and Q-AC models | 6 | −207.3 (7.37) |
| Null model | The biased random choice model, which chooses between the two options with fixed probabilities reflecting the bias in the participant's choices | 1 | −248.5 (0.75) |
Values are the mean (standard error) across participants of the log marginal likelihood (LML) for each model.
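To make the Q-A row concrete, the following is a minimal sketch of Q-learning with asymmetric learning rates and a softmax choice rule. It is not the authors' implementation. The function names, the reward coding (+1 for reward, −1 for punishment), the initial action values of 0, and the treatment of a zero prediction error as positive are all assumptions made for illustration. The sketch also fits a single parameter set and does not separate systole from diastole trials.

```python
import math

def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing stimulus A given action values and inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))

def update_q(q, reward, alpha_pos, alpha_neg):
    """Update one action value, using alpha_pos or alpha_neg depending on the sign of the prediction error."""
    delta = reward - q  # reward prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg  # assumption: delta == 0 uses alpha_pos
    return q + alpha * delta

def log_likelihood(choices, rewards, alpha_pos, alpha_neg, beta):
    """Log-likelihood of a choice sequence (0 = stimulus A, 1 = stimulus B) under the sketch model."""
    q = [0.0, 0.0]  # assumed initial action values
    ll = 0.0
    for choice, reward in zip(choices, rewards):
        p_a = softmax_choice_prob(q[0], q[1], beta)
        ll += math.log(p_a if choice == 0 else 1.0 - p_a)
        # Only the chosen action value is updated in this basic variant.
        q[choice] = update_q(q[choice], reward, alpha_pos, alpha_neg)
    return ll

# Hypothetical data: three trials with assumed +1 / -1 feedback coding.
example_ll = log_likelihood([0, 0, 1], [1.0, -1.0, 1.0],
                            alpha_pos=0.5, alpha_neg=0.1, beta=2.0)
```

The three free parameters here (`alpha_pos`, `alpha_neg`, `beta`) correspond to the parameter count listed for Q-A. The Q-AF and Q-AC variants would add a forgetting update for the unchosen value and a choice-history term, respectively, which this sketch omits.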
FIGURE 2 (A) The precision of the timing within the cardiac cycle, relative to the R-wave peak, is shown in the histogram. (B) The proportions of choosing stimulus A for each reward probability block. Error bars indicate standard error (SE). (C) Mean of the learning rates (α+ and α−) in the systole and diastole trials. Error bars indicate SE. An asterisk indicates a significant difference in the learning rate between the systole and diastole trials (*p < 0.05). (D) The difference in the learning rate asymmetry (α+ − α−) between the systole and diastole trials for each participant. A positive value represents a larger learning rate asymmetry in the systole trials than in the diastole trials, whereas a negative value represents a larger learning rate asymmetry in the diastole trials than in the systole trials. In 24 of 36 participants, the learning rate asymmetry was larger in the systole trials than in the diastole trials.
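The per-participant quantity in panel (D) can be computed as in the short sketch below. The function name and the parameter values are hypothetical and serve only to illustrate the arithmetic; they are not the study's estimates.

```python
def asymmetry_difference(alpha_pos_sys, alpha_neg_sys, alpha_pos_dia, alpha_neg_dia):
    """Learning rate asymmetry in systole minus asymmetry in diastole for one participant."""
    return (alpha_pos_sys - alpha_neg_sys) - (alpha_pos_dia - alpha_neg_dia)

# Hypothetical estimates per participant:
# (alpha+ systole, alpha- systole, alpha+ diastole, alpha- diastole)
estimates = [
    (0.6, 0.3, 0.4, 0.4),
    (0.5, 0.5, 0.6, 0.3),
    (0.7, 0.2, 0.5, 0.4),
]

diffs = [asymmetry_difference(*e) for e in estimates]

# Count participants whose asymmetry is larger in systole (positive difference).
n_larger_in_systole = sum(d > 0 for d in diffs)
```

With these made-up values, two of the three hypothetical participants show a positive difference, the same kind of count as the 24 of 36 reported in the caption.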