Heyeon Park, Daeyeol Lee, Jeanyung Chey.
Abstract
Previous studies found that stress shifts behavioral control by promoting habits while decreasing goal-directed behaviors during reward-based decision-making. It is, however, unclear how stress disrupts the relative contribution of the two systems controlling reward-seeking behavior, i.e., model-free (or habit) and model-based (or goal-directed). Here, we investigated whether stress biases the contribution of model-free and model-based reinforcement learning processes differently depending on the valence of the outcome, and whether stress alters the learning rate, i.e., how quickly information from the new environment is incorporated into choices. Participants were randomly assigned to either a stress or a control condition, and performed a two-stage Markov decision-making task in which the reward probabilities underwent periodic reversals without notice. We found that stress increased the contribution of model-free reinforcement learning only after negative outcomes. Furthermore, stress decreased the learning rate. The results suggest that stress diminishes one's ability to make adaptive choices in multiple aspects of reinforcement learning. This finding has implications for understanding how stress facilitates maladaptive habits, such as addictive behavior, and other dysfunctional behaviors associated with stress in clinical and educational contexts.
Year: 2017 PMID: 28723943 PMCID: PMC5516979 DOI: 10.1371/journal.pone.0180588
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Task design.
(A) Task structure. The choice in the first stage leads probabilistically to different states in the second stage. Each stimulus in the second stage yields either 0 or 100 points with different probabilities. (B) Timeline of events in a single trial. (C) Reward probabilities of the four options in stage 2.
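The task structure described in this caption can be illustrated with a small simulation. This is only a sketch under stated assumptions: the source says that first-stage choices lead probabilistically to second-stage states, that options pay 0 or 100 points, and that reward probabilities reverse periodically without notice, but it gives no numbers here. The transition probability, reward probabilities, reversal schedule, reversal rule, and all function names below are hypothetical.

```python
import random

# All numeric values are illustrative assumptions, not the study's settings.
COMMON_PROB = 0.7                        # assumed P(first-stage choice leads to its "common" state)
REVERSAL_EVERY = 40                      # assumed number of trials between unannounced reversals
REWARD_PROBS = [[0.8, 0.2], [0.4, 0.6]]  # assumed P(100 points) indexed as [state][option]


def step(first_choice, second_choice_fn, reward_probs, rng):
    """Run one trial; return (state, transition type, second-stage choice, points)."""
    common = rng.random() < COMMON_PROB
    state = first_choice if common else 1 - first_choice
    second_choice = second_choice_fn(state)
    points = 100 if rng.random() < reward_probs[state][second_choice] else 0
    return state, ("common" if common else "rare"), second_choice, points


def reverse(reward_probs):
    """One possible reversal rule: swap probabilities across states and across options."""
    return [list(reversed(row)) for row in reversed(reward_probs)]


def simulate(n_trials, seed=0):
    """Simulate a randomly choosing agent (for illustration only)."""
    rng = random.Random(seed)
    probs = [row[:] for row in REWARD_PROBS]
    trials = []
    for t in range(n_trials):
        if t > 0 and t % REVERSAL_EVERY == 0:
            probs = reverse(probs)       # reversal happens without any cue to the agent
        first = rng.randrange(2)
        state, transition, second, points = step(
            first, lambda s: rng.randrange(2), probs, rng
        )
        trials.append({"first": first, "state": state, "transition": transition,
                       "second": second, "points": points})
    return trials
```

A random agent is used so that the sketch stays independent of any particular learning model; replacing `rng.randrange(2)` with a learner's choice rule would turn it into a model simulation.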
Fig 2. The effects of stress on decision-making.
(A) Hypothetical results of the stay-shift analysis expected for model-based (left) and model-free (right) reinforcement learning. (B) Behavioral results of the stay-shift analysis. Task performance in the control condition showed characteristics of both model-free and model-based influences, whereas stressed participants showed stronger characteristics of model-free reinforcement learning (stress × reward × transition interaction, p = .004). (C) Parameter estimates from the reinforcement learning model. Stress increased the discount factor, γ (that is, stress lowered the learning rate) (p = .002), and increased only the model-free tendency to switch to a different option after no reward (Δ_mf) (p < .001). Error bars represent SEM. Δ+mb, Δ_mb, and Δ+mf are parameters of the RL model that indicate the model-based tendency after reward, the model-based tendency after no reward, and the model-free tendency after reward, respectively.
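A stay-shift analysis of the kind summarized in panels A and B can be sketched as follows. The sketch assumes each trial is a dictionary with the hypothetical fields `first` (first-stage choice), `transition` (`"common"` or `"rare"`), and `points`; these names are not from the source. It relies on the usual logic for two-stage tasks: a purely model-free learner repeats rewarded first-stage choices regardless of transition type (a main effect of reward), whereas a model-based learner shows a reward × transition interaction.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """P(repeat previous first-stage choice), keyed by (rewarded, transition) of the previous trial."""
    counts = defaultdict(lambda: [0, 0])  # key -> [number of stays, number of trials]
    for prev, cur in zip(trials, trials[1:]):
        key = (prev["points"] > 0, prev["transition"])
        counts[key][0] += int(cur["first"] == prev["first"])
        counts[key][1] += 1
    return {key: stays / total for key, (stays, total) in counts.items()}


def reward_main_effect(p):
    """Model-free signature: more staying after reward, averaged over transition types."""
    rewarded = p[(True, "common")] + p[(True, "rare")]
    unrewarded = p[(False, "common")] + p[(False, "rare")]
    return (rewarded - unrewarded) / 2


def reward_by_transition(p):
    """Model-based signature: the reward effect flips sign after rare transitions."""
    return ((p[(True, "common")] - p[(True, "rare")])
            - (p[(False, "common")] - p[(False, "rare")]))
```

The two summary indices require all four reward × transition cells to be present. They are descriptive only; the group comparison reported in the caption would be tested on per-participant values.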
Best-fitting parameter estimates, shown as median plus quartiles across conditions.
| | γ | Δ+mf | Δ+mb | Δ_mf | Δ_mb |
|---|---|---|---|---|---|
| Control: 25th percentile | 0.23 | 0.70 | 0.10 | 0.12 | -0.41 |
| Control: median | 0.47 | 1.09 | 0.33 | 0.48 | -0.14 |
| Control: 75th percentile | 0.60 | 1.67 | 0.80 | 0.78 | 0.04 |
| Control: T | 8.41 | 3.30 | 2.00 | 5.17 | -2.74 |
| Stress: 25th percentile | 0.56 | 0.04 | -0.23 | -0.10 | -0.13 |
| Stress: median | 0.73 | 0.50 | -0.04 | 0.02 | -0.01 |
| Stress: 75th percentile | 0.95 | 1.06 | 0.18 | 0.14 | 0.11 |
| Stress: T | 11.50 | 3.84 | 0.02 | 0.62 | -0.89 |
Notes: Δ+mb and Δ_mb are parameters that represent the model-based tendency after reward and no reward, respectively. Δ+mf and Δ_mf are parameters that indicate the model-free tendency after reward and no reward, respectively.
* p < 0.01
** p < 0.001. T is the t value from the paired t-test which was performed to investigate whether each parameter was significantly different from zero.
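For readers who want to see how a t value against zero is formed, a minimal sketch follows. It assumes the statistic is computed from per-participant parameter estimates within one condition; a t-test of values against zero is the sample mean divided by its standard error. The function name and the example values are hypothetical, not data from the study.

```python
import math
import statistics


def t_against_zero(values):
    """t statistic for the null hypothesis that the population mean is zero."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample standard deviation (n - 1 denominator)
    return mean / (sd / math.sqrt(n))  # degrees of freedom: n - 1


# Hypothetical per-participant estimates of one parameter
estimates = [0.12, 0.48, 0.78, 0.30, 0.55]
t_value = t_against_zero(estimates)
```

The resulting value would be compared against a t distribution with n - 1 degrees of freedom to obtain the p-value thresholds marked by the asterisks.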