Peter Zhukovsky1,2, Mickael Puaud1,2, Bianca Jupp1,2, Júlia Sala-Bayo1,2, Johan Alsiö1,2, Jing Xia1,2, Lydia Searle1,2, Zoe Morris1,2, Aryan Sabir1,2, Chiara Giuliano1,2, Barry J Everitt1,2, David Belin1,2, Trevor W Robbins1,2, Jeffrey W Dalley3,4,5.
Abstract
Addiction is regarded as a disorder of inflexible choice with behavior dominated by immediate positive rewards over longer-term negative outcomes. However, the psychological mechanisms underlying the effects of self-administered drugs on behavioral flexibility are not well understood. To investigate whether drug exposure causes asymmetric effects on positive and negative outcomes we used a reversal learning procedure to assess how reward contingencies are utilized to guide behavior in rats previously exposed to intravenous cocaine self-administration (SA). Twenty-four rats were screened for anxiety in an open field prior to acquisition of cocaine SA over six daily sessions with subsequent long-access cocaine SA for 7 days. Control rats (n = 24) were trained to lever-press for food under a yoked schedule of reinforcement. Higher rates of cocaine SA were predicted by increased anxiety and preceded impaired reversal learning, expressed by a decrease in lose-shift as opposed to win-stay probability. A model-free reinforcement learning algorithm revealed that rats with high, but not low cocaine escalation failed to exploit previous reward learning and were more likely to repeat the same response as the previous trial. Eight-day withdrawal from high cocaine escalation was associated, respectively, with increased and decreased dopamine receptor D2 (DRD2) and serotonin receptor 2C (HTR2C) expression in the ventral striatum compared with controls. Dopamine receptor D1 (DRD1) expression was also significantly reduced in the orbitofrontal cortex of high cocaine-escalating rats. These findings indicate that withdrawal from escalated cocaine SA disrupts how negative feedback is used to guide goal-directed behavior for natural reinforcers and that trait anxiety may be a latent variable underlying this interaction.
Year: 2019 PMID: 30952156 PMCID: PMC6895115 DOI: 10.1038/s41386-019-0381-0
Source DB: PubMed Journal: Neuropsychopharmacology ISSN: 0893-133X Impact factor: 7.853
Fig. 1 a Experimental timeline. Two cohorts of rats (each n = 24) were assessed for open field activity as a measure of anxiety followed by spatial-discrimination reversal learning. Rats in cohort 1 were trained to intravenously self-administer cocaine under short- and long-access schedules (ShA; LgA) while rats in cohort 2 (control group) responded for food reinforcement under identical schedules. Finally, rats in both cohorts were re-assessed for reversal learning prior to sacrifice and post mortem qRT-PCR. b, c Frequency distribution plots of ‘total trials to criterion’ for the cocaine and control rats. d qRT-PCR was used to assess gene expression in the OFC, dorsal and ventral striatum
Fig. 3 Modelling variables of learning and response flexibility on the spatial reversal learning task before and after intravenous cocaine self-administration compared with control rats. Data are means ± 1 SEM. No significant differences in α, β and κ were observed in future LE and HE rats compared with control rats (a–c, respectively). Whereas the rate of learning of a response after the completion of each trial (α) was not significantly affected by cocaine exposure (d), a significant increase in β (e) and κ (f) was observed in HE rats (*p < 0.05; **p < 0.01). Thus, HE rats failed to exploit what they had previously learnt (increased β) and showed an increased tendency to make the same response as the previous trial (increased κ). An example of the model fit is shown in the lower panel (g) with individual left and right responses in the upper yellow traces alongside the rewarded side (violet trace) and in the lower trace the modelled probabilities of the same animal making a left or right response using the modelled values of α, β and κ
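The roles of α, β and κ described in the Fig. 3 caption can be illustrated with a minimal sketch of a value-learning model with a softmax choice rule and a "stickiness" term. The captions do not give the model equations, so everything here is an assumption: the function names, the delta-rule update, and in particular the choice to treat β as a softmax temperature (so that a larger β means learned values are exploited less, matching the caption's reading of "increased β"). It is not the authors' fitted model.

```python
import math


def update_value(q, reward, alpha):
    """Delta-rule update: move the value of the chosen side toward the outcome.

    alpha is the learning rate (0..1). Hypothetical parametrization.
    """
    return q + alpha * (reward - q)


def choice_probs(q, prev_choice, beta, kappa):
    """Softmax choice probabilities with a stickiness bonus.

    q           -- dict mapping each action ("left"/"right") to its learned value
    prev_choice -- action taken on the previous trial, or None on the first trial
    beta        -- ASSUMED to be a temperature: larger beta flattens the
                   probabilities, i.e. learned values are exploited less
    kappa       -- bonus added to the previously chosen action (perseveration)
    """
    logits = {}
    for action, value in q.items():
        sticky = kappa if action == prev_choice else 0.0
        logits[action] = value / beta + sticky
    # Subtract the maximum logit for numerical stability before exponentiating.
    top = max(logits.values())
    weights = {a: math.exp(l - top) for a, l in logits.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}
```

Under these assumptions, raising `kappa` increases the probability of repeating the previous response even when both sides have equal value, and raising `beta` pulls the choice probabilities toward 0.5 regardless of what has been learned.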
Fig. 2 a Active and inactive lever-press responses of rats trained to self-administer cocaine. Data shown are means ± SEM. Since rats responded on a fixed-ratio 1 schedule, the number of lever presses was equivalent to the number of infusions received. Rats were divided into two groups: high escalation (HE) and low escalation (LE), based on a median split of escalation ratios. The escalation ratio was calculated as the ratio of the average number of active lever responses on days 12 and 13 to the number of lever responses on day 7 (D7, the first long-access session). During the first 6 days rats were given short access to cocaine (1 h daily sessions) under a fixed-ratio-1 (FR-1) schedule of reinforcement. On days 7–13 inclusive, access to cocaine was increased to 6 h under an FR-1 schedule. b Escalation ratios for each animal in the high and low escalation groups, based on a median split (independent samples t(17) = 4.2, p = 0.0006). c Individual reversal learning scores (total trials to criterion) before and after cocaine exposure in LE and HE rats compared with control rats. Data are means ± SEM. *p < 0.05. **p < 0.01. Relationships between anxiety and escalation of intravenous cocaine self-administration are shown in plots (d, e), including a line of best fit with 95% confidence intervals in dotted lines. A lower anxiety score equates to increased anxiety in the open field arena. d Significant positive relationship between escalation ratio during the first hour of cocaine self-administration and anxiety scores (r = 0.29, p < 0.05), consistent with significant group differences in anxiety scores between LE and HE rats (e)
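The escalation ratio and median split defined in the Fig. 2 caption can be sketched as follows. The function names, the input layout (a dict of active lever presses keyed by day number) and the handling of a rat that falls exactly on the median (assigned to HE here) are assumptions made for illustration; the caption does not specify them.

```python
from statistics import mean, median


def escalation_ratio(active_presses_by_day):
    """Mean active lever presses on days 12 and 13 divided by presses on day 7.

    active_presses_by_day -- dict mapping day number to active lever presses
                             (hypothetical layout).
    """
    late = mean([active_presses_by_day[12], active_presses_by_day[13]])
    return late / active_presses_by_day[7]


def median_split(ratios):
    """Label each rat "HE" or "LE" by comparing its ratio with the group median.

    ratios -- dict mapping a rat identifier to its escalation ratio.
    A ratio equal to the median is labelled "HE" (an assumption).
    """
    cut = median(ratios.values())
    return {rat: ("HE" if ratio >= cut else "LE") for rat, ratio in ratios.items()}
```

For example, a rat with 40 presses on day 7 and 70 and 90 presses on days 12 and 13 would have an escalation ratio of 80 / 40 = 2.0 under this definition (illustrative numbers, not data from the study).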
Fig. 4 Lose-shift and win-stay probabilities on the spatial reversal learning task before and after intravenous cocaine self-administration compared with control rats. Data are means ± SEM. Prior to cocaine exposure there were no significant differences in lose-shift and win-stay probabilities between any of the groups (a, d). However, in rats exhibiting high escalation, lose-shift probabilities significantly decreased compared with low escalation and control rats (b) unlike win-stay probabilities (e). Escalation ratios did not significantly correlate with incorrect (c) or correct (f) response latencies. Shown are the lines of best fit (solid lines) and 95% confidence intervals (dotted lines)
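Win-stay and lose-shift probabilities are conventionally computed from consecutive trial pairs: the proportion of rewarded trials followed by the same response, and the proportion of unrewarded trials followed by a different response. The sketch below follows that common definition; whether the authors applied additional exclusions (for example around reversals or omissions) is not stated in the caption, and the function name and input format are assumptions.

```python
def win_stay_lose_shift(choices, rewards):
    """Return (win_stay, lose_shift) probabilities for one session.

    choices -- sequence of responses per trial, e.g. "L" or "R"
    rewards -- sequence of outcomes per trial (truthy = rewarded)
    Each trial is scored by what the animal did on the NEXT trial, so the
    final trial contributes no observation.
    """
    wins = stays_after_win = 0
    losses = shifts_after_loss = 0
    for prev_choice, prev_reward, choice in zip(choices, rewards, choices[1:]):
        if prev_reward:
            wins += 1
            stays_after_win += choice == prev_choice
        else:
            losses += 1
            shifts_after_loss += choice != prev_choice
    # Return NaN when a probability is undefined (no wins or no losses).
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift
```

A reduced lose-shift value with an unchanged win-stay value, as described for the HE group, would correspond to animals repeating a response more often after an unrewarded trial while behaving normally after a rewarded one.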
Fig. 5 mRNA expression of DRD2 (a), DRD1 (b), HTR2A (c), and HTR2C (d) in the orbitofrontal cortex (OFC), ventral striatum (VS), and dorsomedial striatum (DMS) of control (n = 23), LE (n = 9), and HE (n = 10) rats. *p < 0.05 versus controls. Data are means ± 95% CIs