Lucas Kastner, Jana Kube, Arno Villringer, Jane Neumann.
Abstract
Successful learning hinges on the evaluation of positive and negative feedback. We assessed differential learning from reward and punishment in a monetary reinforcement learning paradigm, together with cardiac concomitants of positive and negative feedback processing. On the behavioral level, learning from reward resulted in more advantageous behavior than learning from punishment, suggesting a differential impact of reward and punishment on successful feedback-based learning. On the autonomic level, learning and feedback processing were closely mirrored by phasic cardiac responses on a trial-by-trial basis: (1) Negative feedback was accompanied by faster and prolonged heart rate deceleration compared to positive feedback. (2) Cardiac responses shifted from feedback presentation at the beginning of learning to stimulus presentation later on. (3) Most importantly, the strength of phasic cardiac responses to the presentation of feedback correlated with the strength of prediction error signals that alert the learner to the necessity for behavioral adaptation. Considering participants' weight status and gender revealed obesity-related deficits in learning to avoid negative consequences and less consistent behavioral adaptation in women compared to men. In sum, our results provide strong new evidence for the notion that during learning phasic cardiac responses reflect an internal value and feedback monitoring system that is sensitive to the violation of performance-based expectations. Moreover, inter-individual differences in weight status and gender may affect both behavioral and autonomic responses in reinforcement-based learning.
Keywords: gender; heart rate; obesity; prediction error; punishment; reinforcement learning; reward
Year: 2017 PMID: 29163004 PMCID: PMC5670147 DOI: 10.3389/fnins.2017.00598
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
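The abstract relates cardiac responses to prediction error signals but does not name the learning model. As a minimal sketch, assuming a simple Rescorla-Wagner style value update (an assumption, not necessarily the authors' model), a trial-wise prediction error could be computed as follows. The function names, the learning rate, and the outcome coding are hypothetical.

```python
def update_value(value, outcome, learning_rate=0.1):
    """Return (prediction_error, new_value) for a single trial.

    Assumes a Rescorla-Wagner style rule; the source does not specify
    which model was used to derive prediction errors.
    """
    prediction_error = outcome - value            # observed minus expected
    new_value = value + learning_rate * prediction_error
    return prediction_error, new_value


def run_trials(outcomes, learning_rate=0.1, initial_value=0.0):
    """Apply the update across a sequence of outcomes for one symbol."""
    value = initial_value
    errors = []
    for outcome in outcomes:
        prediction_error, value = update_value(value, outcome, learning_rate)
        errors.append(prediction_error)
    return errors, value


# Hypothetical outcome coding: 1 = reward received, 0 = no reward.
errors, final_value = run_trials([1, 1, 0], learning_rate=0.5)
```

Under this rule the prediction error shrinks as outcomes become expected and grows again when an expectation is violated, which is the kind of trial-wise quantity the abstract compares against phasic cardiac responses.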
Figure 1. Experimental task. Example trial and task structure with reward/gain and punishment/loss probabilities of the reinforcement learning task.
Descriptive statistics.
| Measure | Men without obesity | Women without obesity | Men with obesity | Women with obesity | p | p |
|---|---|---|---|---|---|---|
| Age (years) | 26.2 (5.78) | 25.0 (4.41) | 26.7 (3.2) | 26.0 (4.11) | 0.564 | 0.482 |
| Years of education | 13 (13-13) | 13 (13-13) | 13 (10-13) | 13 (10-13) | 0.187 | 0.657 |
| Height (m) | 1.80 (0.04) | 1.71 (0.06) | 1.80 (0.7) | 1.67 (0.06) | 0.206 | |
| Weight (kg) | 73.37 (4.77) | 63.75 (6.57) | 115.24 (15.54) | 99.37 (9.83) | | |
| BMI | 22.63 (1.20) | 21.73 (1.44) | 35.59 (3.24) | 35.61 (3.68) | 0.565 | |
| WHR (cm) | 0.82 (0.04) | 0.75 (0.04) | 0.95 (0.05) | 0.84 (0.05) | | |
| HR (beats per min) | 66.50 (8.74) | 65.33 (7.5) | 65.17 (10.53) | 67.17 (9.78) | 0.925 | 0.876 |
Distribution of gender, age, level of education, height, weight, body mass index (BMI), waist-to-hip ratio (WHR), and baseline heart rate in male and female participants with and without obesity. Values represent mean and standard deviation except for years of education [median (min-max)]. Group differences were determined by univariate ANOVA with obesity and gender as fixed between-subject factors. Significant group effects at p < 0.05 are marked in bold.
Figure 2. Behavioral results. Top: Number of advantageous choices (A) and reaction times (B) for the reward, punishment, and neutral conditions. Bottom: Significant gender effects in the number of advantageous choices (C) and in the overall number of switch trials (D) for the reward and punishment conditions. Advantageous choices refer to trials where participants chose the symbol with the higher probability of gaining a reward (reward condition) or avoiding a punishment (punishment condition). Switch trials refer to trials where participants changed their choice from one symbol in the previous trial of the same condition to the other symbol in the current trial. Statistically significant differences at p < 0.05 are marked with *.
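The two behavioral measures defined in this caption can be sketched in a few lines. This is an illustrative sketch only: it assumes the choices of one participant in one condition are stored in trial order as a list of symbol labels, and the function names and the labels "A" and "B" are hypothetical.

```python
def count_advantageous(choices, advantageous_symbol):
    """Count trials on which the higher-probability symbol was chosen."""
    return sum(1 for choice in choices if choice == advantageous_symbol)


def count_switches(choices):
    """Count trials whose choice differs from the previous trial
    of the same condition (the first trial cannot be a switch)."""
    return sum(
        1 for previous, current in zip(choices, choices[1:])
        if previous != current
    )


# Hypothetical choice sequence for one condition; "A" is assumed
# to be the advantageous symbol.
choices = ["A", "A", "B", "A", "A"]
n_advantageous = count_advantageous(choices, "A")
n_switches = count_switches(choices)
```

Because switches are defined relative to the previous trial of the same condition, the sequence passed in should already be filtered to a single condition before counting.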
Figure 3. Phasic cardiac responses. Sequence of inter-beat intervals (IBIs) in response to reward (red) and punishment (blue) for the first (left) and the second (right) experimental half. To plot responses to stimuli and feedback on a common scale, all IBIs in this figure, from stimulus presentation to heart rate (HR) recovery after feedback presentation, are referenced to a common IBI −2 prior to stimulus presentation and labeled relative to stimulus presentation as IBI 0 to IBI 7. Arrows mark the presentation of stimuli (ST) and feedback (FB). Note that the statistical analysis of IBIs was performed separately for stimulus and feedback presentation (see Figures 4, 5).
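The referencing described in this caption can be illustrated with a short sketch. It assumes that "referenced to" means subtracting the reference IBI from each later IBI; the caption does not say whether a difference or a ratio was used, so this is an assumption. The function name, parameter names, and example values are hypothetical.

```python
def reference_ibis(ibis_ms, stimulus_index, reference_offset=-2, n_after=7):
    """Express IBI 0..n_after relative to a reference IBI before the stimulus.

    ibis_ms          -- inter-beat intervals in milliseconds, in temporal order
    stimulus_index   -- position of IBI 0 (the IBI at stimulus presentation)
    reference_offset -- position of the reference IBI relative to IBI 0
    Assumption: referencing is done by subtraction, not division.
    """
    reference_index = stimulus_index + reference_offset
    if reference_index < 0 or stimulus_index + n_after >= len(ibis_ms):
        raise ValueError("not enough IBIs around the stimulus")
    reference = ibis_ms[reference_index]
    window = ibis_ms[stimulus_index:stimulus_index + n_after + 1]
    return [ibi - reference for ibi in window]


# Hypothetical IBI series; positive values indicate longer intervals
# than the reference, i.e. heart rate deceleration.
ibis_ms = [900, 910, 905, 920, 940, 950, 945, 930, 915, 905, 900]
relative = reference_ibis(ibis_ms, stimulus_index=2)
```

With a subtraction-based reference, lengthening IBIs (deceleration) appear as positive values and the later return toward zero corresponds to heart rate recovery.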
Figure 4. Cardiac responses to stimulus presentation. Effects of valence and time on changes in relative IBI length around stimulus presentation. (A) Deceleration in response to stimulus presentation was stronger and prolonged for stimuli predicting punishment compared to reward. (B) Cardiac reactivity in response to stimulus presentation was more pronounced during the second experimental half. Arrows mark the presentation of stimuli (ST). Significant differences (at p < 0.05) between conditions or between experimental halves 1 and 2 in the slope between neighboring IBIs are marked with *.
Figure 5. Effects of weight status and gender. Interaction between gender and valence on changes in relative IBI length around feedback presentation during the first experimental half. The interaction was driven by (1) stronger overall cardiac reactivity to feedback presentation in women compared to men (red vs. green lines), and (2) stronger anticipatory deceleration and faster recovery in the punishment condition (B) compared to the reward condition (A) in women only, with no observable differences in men. None of these effects was observable in the second half of the experiment. Arrows mark the presentation of feedback (FB). Significant differences (at p < 0.05) between genders in the slope between neighboring IBIs are marked with *.