Florent Wyckmans, A Ross Otto, Miriam Sebold, Nathaniel Daw, Antoine Bechara, Mélanie Saeremans, Charles Kornreich, Armand Chatard, Nemat Jaafari, Xavier Noël.
Abstract
Compulsive behaviors (e.g., addiction) can be viewed as an aberrant decision process in which inflexible reactions automatically evoked by stimuli (habit) take control over decision making to the detriment of a more flexible (goal-oriented) behavioral learning system. These behaviors are thought to arise from learning algorithms known as "model-based" and "model-free" reinforcement learning. Gambling disorder, a form of addiction without the confound of neurotoxic effects of drugs, has been associated with impaired goal-directed control, but the way in which problem gamblers (PG) orchestrate model-based and model-free strategies has not been evaluated. Forty-nine PG and 33 healthy participants (CP) completed a two-step sequential choice task for which model-based and model-free learning have distinct and identifiable trial-by-trial learning signatures. The influence of common psychopathological comorbidities on those two forms of learning was investigated. PG showed impaired model-based learning, particularly after unrewarded outcomes. In addition, PG exhibited faster reaction times than CP following unrewarded decisions. Troubled mood, higher impulsivity (i.e., positive and negative urgency), and current and chronic stress reported via questionnaires did not account for those results. These findings demonstrate specific reinforcement learning and decision-making deficits in behavioral addiction that advance our understanding and may be important dimensions for designing effective interventions.
Year: 2019 PMID: 31873133 PMCID: PMC6927960 DOI: 10.1038/s41598-019-56161-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Demographic and psychological measures for Problem Gamblers (PG) and Control Participants (CP): mean (SD).
| Variable | PG | CP | Between groups difference |
|---|---|---|---|
| Gender ratio (male/female) | 38/7 | 29/4 | X²(1) = 0.19, p = 0.67 |
| Age | 31.31 (9.11) | 31.27 (7.93) | t(76) = 0.02, p = 0.98 |
| Years of education | 12.73 (2.63) | 12.88 (2.87) | t(76) = 0.23, p = 0.82 |
| OSPAN | 74.42 (10.96) | 79.19 (11.40) | t(76) = 1.97, p = 0.07 |
| CPGI | 14.13 (5.02) | 0 | |
| DSM-V | 6 (1.38) | 0 | |
| Impulsivity (UPPS-P) | 49.58 (9.5) | 48.12 (6.16) | t(75.02) = 0.82, p = 0.42 |
| | 10.82 (2.97) | 8.97 (2.14) | |
| | 12.02 (2.18) | 10.45 (2.03) | |
| | 8.33 (3.05) | 8.61 (2.03) | U = 645.5, p = 0.32 |
| | 7.51 (3.19) | 8.33 (2.34) | U = 577, p = 0.09 |
| | 10.89 (3.13) | 11.76 (2.28) | t(76) = 1.35, p = 0.18 |
| SCL-90-R | 70.13 (47.14) | 40.85 (29.37) | |
| Audit | 9.22 (8.36) | 10.48 (6.1) | U = 600, p = 0.15 |
| Smoker/Non-smoker | 20/25 | 15/18 | X²(1) = 0.01, p = 0.93 |
| FTND | 4.95 (2.61) | 3.47 (2.61) | t(33) = 1.66, p = 0.11 |
| Beck Depression Inventory | 7.44 (5.79) | 4.21 (3.94) | |
| Negative affect | 22.2 (9.3) | 18.3 (5.75) | U = 580.5, p = 0.1 |
| STAI-YA | 37.51 (12.56) | 33.3 (9.11) | U = 611.5, p = 0.19 |
| STAI-YB | 44.98 (12.56) | 39.36 (10.27) | |
| SRRS | 289 (183.28) | 222.83 (221.14) | |
| Current stress intensity | 3.46 (2.94) | 2.75 (2.19) | U = 575, p = 0.13 |
Significant differences between groups are displayed in bold. All tests were two-tailed and used a Student t-test, a Mann-Whitney U test, or a chi-square test. Welch correction was applied to Student t-tests when Levene’s test for homogeneity of variances was significant (p < 0.05).
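As a rough illustration of the Welch correction mentioned in the note above, the sketch below recomputes the UPPS-P comparison from the summary statistics in the table. It assumes group sizes of 45 (PG) and 33 (CP), inferred from the gender-ratio row, and takes the PG standard deviation as 9.5; both are assumptions about the printed values, and the function name `welch_t` is hypothetical.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sample sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# UPPS-P total score: PG 49.58 (SD assumed 9.5, n assumed 45),
# CP 48.12 (SD 6.16, n assumed 33).
t_stat, df = welch_t(49.58, 9.5, 45, 48.12, 6.16, 33)
print(f"t({df:.2f}) = {t_stat:.2f}")
```

Under these assumptions the result lands close to the reported t(75.02) = 0.82; the non-integer degrees of freedom are what signals that the Welch correction was applied to this row.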
Logistic regression coefficients indicating the influence of previous trial’s outcome, previous trial’s transition, and group on response repetition.
| Coefficient | Estimate (SE) | z value | P value |
|---|---|---|---|
| (Intercept) | 1.67 (0.1) | 16.26 | |
| Group | −0.16 (0.1) | −1.54 | 0.12 |
| Outcome | 0.55 (0.06) | 9.05 | |
| Transition | 0.2 (0.05) | 3.86 | |
| Group * Outcome | −0.02 (0.06) | −0.42 | 0.67 |
| Group * Transition | 0.04 (0.05) | 0.7 | 0.48 |
| Outcome * Transition | 0.32 (0.06) | 5.14 | |
| Group * Outcome * Transition | −0.12 (0.06) | −2 | |
Significant results are displayed in bold. *Significance at the 0.05 level; **Significance at the 0.01 level; ***Significance at the 0.001 level.
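To make the coefficients above more concrete, the following sketch converts them into predicted probabilities of repeating the first-stage choice through the logistic function. It assumes every predictor is coded −1/+1; the source does not state the coding or which group, outcome, or transition takes the positive value, so the resulting numbers are illustrative only. The names `COEF` and `stay_probability` are hypothetical.

```python
import math

# Estimates copied from the table above.
COEF = {
    "intercept": 1.67,
    "group": -0.16,
    "outcome": 0.55,
    "transition": 0.20,
    "group:outcome": -0.02,
    "group:transition": 0.04,
    "outcome:transition": 0.32,
    "group:outcome:transition": -0.12,
}

def stay_probability(group, outcome, transition):
    """Predicted probability of repeating the previous first-stage choice.
    Arguments are assumed to be coded -1 or +1 (an assumption, not stated in the source)."""
    z = (COEF["intercept"]
         + COEF["group"] * group
         + COEF["outcome"] * outcome
         + COEF["transition"] * transition
         + COEF["group:outcome"] * group * outcome
         + COEF["group:transition"] * group * transition
         + COEF["outcome:transition"] * outcome * transition
         + COEF["group:outcome:transition"] * group * outcome * transition)
    return 1.0 / (1.0 + math.exp(-z))  # logistic (inverse logit) link

# Print the four outcome-by-transition cells for each assumed group code.
for g in (-1, 1):
    for o in (-1, 1):
        for t in (-1, 1):
            print(g, o, t, round(stay_probability(g, o, t), 3))
```

Comparing the four outcome-by-transition cells within each group code shows how the Outcome * Transition term, and its modulation by Group, changes the predicted stay pattern.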
Figure 1. Probability of repeating the previous first-stage choice depending on the transition and the reward during the previous trial among (A) healthy subjects and (B) pathological gamblers. Error bars represent two times the standard error.
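The stay probabilities plotted in Figure 1 can be tabulated from trial-level data roughly as follows. This is a minimal sketch with made-up toy data; the function name, the 0/1 choice coding, and the 'common'/'rare' labels are assumptions for illustration, not the authors' analysis code.

```python
from collections import defaultdict

def stay_probabilities(choices, rewards, transitions):
    """Proportion of trials on which the first-stage choice is repeated,
    split by the previous trial's reward (True/False) and transition type."""
    counts = defaultdict(lambda: [0, 0])  # (rewarded, transition) -> [stays, total]
    for t in range(1, len(choices)):
        key = (bool(rewards[t - 1]), transitions[t - 1])
        counts[key][0] += int(choices[t] == choices[t - 1])
        counts[key][1] += 1
    return {key: stays / total for key, (stays, total) in counts.items()}

# Toy data (hypothetical): six trials from one participant.
choices = [0, 0, 1, 1, 0, 0]
rewards = [1, 0, 1, 0, 1, 0]
transitions = ["common", "common", "rare", "common", "rare", "common"]
probs = stay_probabilities(choices, rewards, transitions)
```

Each trial is classified by what happened on the trial before it, so the first trial contributes only as a "previous" trial and never as a stay/switch observation.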
Logistic regression coefficients indicating the influence of previous trial’s transition and group on response repetition depending on the previous trial’s outcome.
| Coefficient | Unrewarded previous trial: Estimate (SE) | Unrewarded previous trial: z value | Unrewarded previous trial: P value | Rewarded previous trial: Estimate (SE) | Rewarded previous trial: z value | Rewarded previous trial: P value |
|---|---|---|---|---|---|---|
| (Intercept) | 1.12 (0.09) | 11.93 | | 2.22 (0.14) | 16.07 | |
| Group | −0.13 (0.09) | −1.41 | 0.16 | −0.18 (0.14) | −1.34 | 0.18 |
| Transition | −0.12 (0.05) | −2.16 | | 0.51 (0.1) | 5.32 | |
| Group * Transition | 0.16 (0.05) | 2.95 | | −0.09 (0.1) | −0.92 | 0.36 |
Significant results are displayed in bold. * Significance at the 0.05 level; ** Significance at the 0.01 level; *** Significance at the 0.001 level.
Figure 2. Reaction time in milliseconds depending on the transition in both groups. The error bars represent the standard error.
Figure 3. (A) Reaction time in milliseconds depending on the previous trial's outcome in both groups. (B) Correlation among PG between gambling severity measured by the DSM score and the response time acceleration after a negative outcome. The error bars represent the standard error.
Figure 4. (A) Two-step decision task (adapted from Otto et al.[85]). First step: participants choose between two images, each leading preferentially to a green or a blue screen according to fixed probabilities. Second step: participants choose between two images, each associated with a probability of winning money. Those probabilities change slowly over time and differ according to the screen color. (B) Trial design. (C) Changes in second-step reward probabilities. (D) Theoretical decision patterns under a pure MF strategy or a pure MB strategy.
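The task structure described in the Figure 4 caption can be sketched as a small simulation. The common-transition probability (0.7), the drift step size, and the reward-probability bounds used here are assumptions chosen for illustration; the caption only says that transitions follow fixed probabilities and that reward probabilities change slowly. All names are hypothetical.

```python
import random

COMMON_PROB = 0.7       # assumed probability of the "preferential" transition
DRIFT_SD = 0.025        # assumed standard deviation of the per-trial drift
BOUNDS = (0.25, 0.75)   # assumed lower/upper bounds on reward probabilities

def drift(p, rng):
    """One step of a bounded Gaussian random walk with reflection at the bounds."""
    lo, hi = BOUNDS
    p += rng.gauss(0.0, DRIFT_SD)
    if p < lo:
        p = 2 * lo - p
    elif p > hi:
        p = 2 * hi - p
    return min(max(p, lo), hi)  # final clamp guards against extreme steps

def simulate_trial(first_choice, second_choice, reward_probs, rng):
    """Run one trial: sample the transition, sample the reward, then drift all reward probabilities."""
    common = rng.random() < COMMON_PROB
    state = first_choice if common else 1 - first_choice  # 0 = green, 1 = blue (arbitrary labels)
    rewarded = rng.random() < reward_probs[state][second_choice]
    for s in range(2):
        for a in range(2):
            reward_probs[s][a] = drift(reward_probs[s][a], rng)
    return state, common, rewarded

rng = random.Random(0)
reward_probs = [[0.4, 0.6], [0.7, 0.3]]  # arbitrary starting values within BOUNDS
history = [simulate_trial(rng.randrange(2), rng.randrange(2), reward_probs, rng)
           for _ in range(1000)]
common_rate = sum(common for _, common, _ in history) / len(history)
```

Here choices are drawn at random; replacing the two `rng.randrange(2)` calls with a learning agent is what would produce the MF or MB stay patterns shown in panel (D).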