Michael Browning, Jacinta O'Shea, Margot Juliëtte Overman.
Abstract
BACKGROUND: Cognitive models of mood disorders emphasize a causal role of negative affective biases in depression. Computational work suggests that these biases may stem from a belief that negative events have a higher information content than positive events, resulting in preferential processing of and learning from negative outcomes. Learning biases therefore represent a promising target for therapeutic interventions. In this proof-of-concept study in healthy volunteers, we assessed the malleability of biased reinforcement learning using a novel cognitive training paradigm and concurrent transcranial direct current stimulation (tDCS).
Keywords: Affective disorders; Anxiety; Cognitive training; Computational psychiatry; Depression; Dorsolateral prefrontal cortex; Non-invasive brain stimulation; Reinforcement learning; Reward; Transcranial direct current stimulation; Volatility; dlPFC; tDCS
Year: 2020 PMID: 34720259 PMCID: PMC8550254 DOI: 10.1007/s10608-020-10146-9
Source DB: PubMed Journal: Cognit Ther Res ISSN: 0147-5916
Fig. 1 Schematic overview of the training sessions.
a Timeline of the study procedure.
b Example of a trial on the Information Bias Learning Task (IBLT). At the beginning of the task, participants are provided with a starting amount of £1.50. In each trial, a fixation cross flanked by two abstract stimuli is presented and the participant has to choose one of the stimuli via a button press. Once a stimulus is chosen, a win and a loss outcome are presented consecutively, with the order of their appearance (win first versus loss first) being randomised across trials. If the chosen stimulus is associated with a win the participant gains 10p, and if the chosen stimulus is associated with a loss the participant loses 10p. If the win and loss outcome both appear over the same stimulus, the participant does not win or lose any money irrespective of their choice. The win and loss outcomes are independent, meaning that the location of the win does not provide any information about the location of the loss. In this task, participants have to learn through experience which stimulus to choose in order to maximise total winnings.
c Structure of the IBLT for negative training in Study 1. The task consisted of 5 blocks of 80 trials each (vertical, dashed black lines separate the individual blocks). The x-axis represents the number of trials, with the y-axis indicating the probability p of an outcome appearing over stimulus ‘A’. The probability of the outcome appearing over stimulus ‘B’ can be calculated as 1 − p. The win outcomes are represented as continuous green lines, with the loss outcomes corresponding to the dashed red lines. The volatility of the win and loss outcomes is manipulated across the task blocks, with higher volatility being associated with a higher information content. In the first block, both the wins and losses are volatile (‘Both-volatile’ block), with the probability of an outcome appearing over stimulus ‘A’ switching between 20 and 80%. Here, both outcomes have a high information content, such that if the win/loss appears over shape ‘A’, it is more likely to be associated with shape ‘A’ than shape ‘B’ in the subsequent trials. In this block, participants are therefore expected to have high learning rates for both wins and losses. In blocks 2–4, on the other hand, volatility is manipulated so that losses are highly informative and wins are uninformative (‘Training’ blocks). Whereas the loss outcomes remain volatile, the association of shape ‘A’ with the win outcome is stable at 50%. Thus, the chance of the win appearing over either of the shapes remains equal across the trials, with its location on one trial providing no information about future trials. In these ‘Training’ blocks, it is expected that participants will have higher learning rates for loss than win outcomes. Finally, block 5 consists of another ‘Both-volatile’ block, in which both wins and losses are volatile. By comparing learning rates in block 5 with block 1, it is possible to quantify potential shifts in learning from win and loss outcomes following the ‘Training’ blocks.
d Structure of the IBLT for positive training in Study 2. Similar to Study 1, the IBLT comprises a ‘Both-volatile’ block, three ‘Training’ blocks, and a final ‘Both-volatile’ block. However, volatility of the win and loss outcomes in the ‘Training’ blocks is reversed, such that win outcomes are highly informative (volatile) and loss outcomes are uninformative (stable). Therefore, contrary to Study 1, participants are expected to demonstrate higher learning rates for win than loss outcomes in the ‘Training’ blocks of Study 2 (Color figure online)
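The block structure described in the Fig. 1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' task code: the caption gives the block count, block length, and the 20/80% and 50% probabilities, but not how often the volatile probability reverses, so `SWITCH_EVERY` is a hypothetical value. All function names are likewise invented for this sketch. For simplicity the win and loss schedules reverse on the same trials here, which the real task need not do.

```python
import random

N_BLOCKS = 5            # from the Fig. 1 caption
TRIALS_PER_BLOCK = 80   # from the Fig. 1 caption
SWITCH_EVERY = 20       # ASSUMPTION: reversal interval is not given in the caption


def volatile_probs(n_trials, start_high=True, switch_every=SWITCH_EVERY):
    """P(outcome appears over 'A'), alternating between 0.8 and 0.2."""
    high = start_high
    probs = []
    for t in range(n_trials):
        if t > 0 and t % switch_every == 0:
            high = not high  # reversal: 80% <-> 20%
        probs.append(0.8 if high else 0.2)
    return probs


def stable_probs(n_trials):
    """P(outcome appears over 'A') fixed at 0.5, i.e. uninformative."""
    return [0.5] * n_trials


def build_schedule(training="negative"):
    """Return (win_p, loss_p): per-trial probabilities for all 5 blocks.

    Blocks 1 and 5 are 'Both-volatile'. In blocks 2-4 ('Training'),
    negative training keeps losses volatile and wins stable;
    positive training reverses this.
    """
    if training not in ("negative", "positive"):
        raise ValueError("training must be 'negative' or 'positive'")
    win_p, loss_p = [], []
    for block in range(1, N_BLOCKS + 1):
        is_training = 2 <= block <= 4
        win_volatile = (not is_training) or training == "positive"
        loss_volatile = (not is_training) or training == "negative"
        win_p += (volatile_probs(TRIALS_PER_BLOCK, start_high=True)
                  if win_volatile else stable_probs(TRIALS_PER_BLOCK))
        loss_p += (volatile_probs(TRIALS_PER_BLOCK, start_high=False)
                   if loss_volatile else stable_probs(TRIALS_PER_BLOCK))
    return win_p, loss_p


def sample_locations(probs, rng):
    """Draw the stimulus ('A' or 'B') each outcome appears over."""
    return ["A" if rng.random() < p else "B" for p in probs]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
win_p, loss_p = build_schedule("negative")
win_loc = sample_locations(win_p, rng)    # win and loss locations are
loss_loc = sample_locations(loss_p, rng)  # drawn independently per trial
```

Calling `build_schedule("positive")` gives the Study 2 layout, with volatile wins and stable losses in blocks 2–4.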
Fig. 2 Schematic illustration of win-driven behaviour on the IBLT. a If the participant had chosen a stimulus associated with both a win and a loss on trial i, choosing the same stimulus on the next trial i + 1 suggested a stronger impact of win than loss outcomes on behaviour. b Conversely, if the participant had chosen a stimulus with neither a win nor a loss on trial i, selecting the alternative stimulus on the next trial i + 1 indicated a greater influence of win than loss outcomes on choice behaviour
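The classification rule in the Fig. 2 caption can be written as a short Python sketch. The function names and the list-of-labels data layout are assumptions made for illustration. Trials where the win and loss appear over different stimuli are treated here as uninformative, because both outcomes then favour the same choice; that handling is inferred from the caption and is not stated in it.

```python
def win_driven_flags(choices, win_locs, loss_locs):
    """Classify each trial pair (i, i + 1) following the Fig. 2 caption.

    choices, win_locs, loss_locs: equal-length sequences of 'A'/'B' labels.
    Returns a list with True (win-driven), False (loss-driven), or
    None (win and loss over different stimuli, so not classified).
    """
    flags = []
    for i in range(len(choices) - 1):
        if win_locs[i] != loss_locs[i]:
            flags.append(None)  # outcomes do not conflict on this trial
            continue
        both_on_chosen = win_locs[i] == choices[i]
        stayed = choices[i + 1] == choices[i]
        # Panel a: win and loss on the chosen stimulus, staying is win-driven.
        # Panel b: neither on the chosen stimulus, switching is win-driven.
        flags.append(stayed == both_on_chosen)
    return flags


def proportion_win_driven(flags):
    """Proportion p of win-driven choices among classified trial pairs."""
    informative = [f for f in flags if f is not None]
    if not informative:
        return None
    return sum(informative) / len(informative)


# Hypothetical five-trial example (labels invented for illustration)
choices = ["A", "A", "B", "B", "A"]
win_locs = ["A", "A", "B", "A", "A"]
loss_locs = ["A", "B", "A", "A", "B"]
flags = win_driven_flags(choices, win_locs, loss_locs)
p = proportion_win_driven(flags)
```

The proportion of loss-driven choices then follows as 1 − p, as in the Fig. 4 caption.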
Mean (SE) baseline questionnaire scores for participants completing negative IBLT training (Study 1) and positive IBLT training (Study 2)
| | Study 1 | Study 2 |
|---|---|---|
| BDI-II | 3.70 (1.33) | 4.00 (33.15) |
| STAI-Trait | 34.35 (2.08) | 33.15 (1.22) |
| PANAS Positive | | |
| Sham tDCS session | 33.40 (1.39) | 32.35 (1.70) |
| Bifrontal tDCS session | 33.45 (1.26) | 33.60 (1.87) |
| PANAS Negative | | |
| Sham tDCS session | 11.55 (0.62) | 11.35 (0.43) |
| Bifrontal tDCS session | 10.50 (0.18) | 11.80 (0.63) |
| STAI-State | | |
| Sham tDCS session | 27.40 (1.52) | 26.15 (1.29) |
| Bifrontal tDCS session | 26.35 (1.36) | 26.60 (1.15) |
BDI-II Beck’s Depression Inventory II, PANAS Positive and Negative Affect Scale, STAI State-Trait Anxiety Inventory
Fig. 3 Effects of negative IBLT training and tDCS on learning rates. a Across the three ‘Training’ blocks, participants demonstrated significantly higher learning rates for negative (loss) than positive (win) outcomes. Learning rates were pooled over tDCS condition. b Learning rates for both win and loss outcomes decreased over time in the ‘Both-volatile’ blocks carried out before (‘Pre’) and after (‘Post’) the training. Learning rates were pooled over tDCS condition. c Active tDCS did not alter learning rates for either wins or losses in the ‘Training’ blocks compared to sham tDCS. Learning rates are averaged across the three ‘Training’ blocks. d In the ‘Both-volatile’ blocks there were no significant effects of tDCS (p > 0.05) on learning rates for either wins or losses over time when contrasting blocks completed before (‘Pre’) and after (‘Post’) training. This suggests that tDCS did not influence the near-transfer of training effects on speed of learning. *p < 0.05, ***p < 0.001
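For readers unfamiliar with the term, a learning rate can be illustrated with a generic delta-rule update that keeps separate rates for win and loss outcomes. This record does not describe how the learning rates in Fig. 3 were estimated, so the sketch below is an assumption for illustration only and should not be read as the authors' model. The function names and the starting estimate of 0.5 are hypothetical.

```python
def delta_update(estimate, outcome, learning_rate):
    """Move the estimate toward the observed outcome.

    outcome is 1 if the event appeared over stimulus 'A', else 0.
    A higher learning_rate means recent outcomes count for more.
    """
    return estimate + learning_rate * (outcome - estimate)


def track_estimates(win_over_a, loss_over_a, alpha_win, alpha_loss, start=0.5):
    """Track P(win over 'A') and P(loss over 'A') with separate rates.

    ASSUMPTION: two independent delta-rule learners, one per outcome type.
    Returns a list of (win_estimate, loss_estimate) after each trial.
    """
    win_est, loss_est = start, start
    history = []
    for w, l in zip(win_over_a, loss_over_a):
        win_est = delta_update(win_est, w, alpha_win)
        loss_est = delta_update(loss_est, l, alpha_loss)
        history.append((win_est, loss_est))
    return history


# Two trials in which both outcomes appear over 'A'; the learner with the
# higher rate (here the win learner) updates its estimate faster.
history = track_estimates([1, 1], [1, 1], alpha_win=0.5, alpha_loss=0.1)
```

In this framing, a higher learning rate for losses than wins during negative training would mean that loss estimates respond more strongly to each new outcome.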
Fig. 4 Proportion (p) of win-driven choices during negative IBLT training by tDCS condition. The proportion of loss-driven choices can be calculated as 1 − p. a In the ‘Training’ blocks, participants tended to make more loss- than win-driven choices across the three blocks. b There was a significant decrease in win-driven choices (i.e. an increase in loss-driven choices) over time in the ‘Both-volatile’ blocks completed before (‘Pre’) and after (‘Post’) training. There was no evidence for an effect of tDCS in either the c ‘Training’ blocks or the d ‘Both-volatile’ blocks. **p < 0.01
Fig. 5 Effects of positive IBLT training on learning rates and win-driven choice behaviour. a Across the three ‘Training’ blocks, participants demonstrated significantly higher learning rates for positive (win) than negative (loss) outcomes. b On average, the proportion of win-driven choices (p) was greater than the proportion of loss-driven choices (1 − p) across the ‘Training’ blocks. ***p < 0.001