Katharine M Seip-Cammack1, Matthew L Shapiro2. 1. Friedman Brain Institute, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA katharine.seip-cammack@mssm.edu. 2. Friedman Brain Institute, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA.
Abstract
Behavioral flexibility allows individuals to adapt to situations in which rewards and goals change. Potentially addictive drugs may impair flexible decision-making by altering brain mechanisms that compute reward expectancies, thereby facilitating maladaptive drug use. To investigate this hypothesis, we tested the effects of oxycodone exposure on rats in two complementary learning and memory tasks that engage distinct learning strategies and neural circuits. Rats were trained first in either a spatial or a body-turn discrimination on a radial maze. After initial training, rats were given oxycodone or vehicle injections in their home cages for 5 d. Reversal learning was tested 36 h after the final drug exposure. We hypothesized that if oxycodone impaired behavioral flexibility, then drug-exposed rats should learn reversals more slowly than controls. Oxycodone exposure impaired spatial reversal learning when reward contingencies changed rapidly, but not when they changed slowly. During rapid reversals, oxycodone-exposed rats required more trials to reach criterion, made more perseverative errors, and were more likely to make errors after correct responses than controls. Oxycodone impaired body-turn reversal learning in similar patterns. Limited exposure to oxycodone reduced behavioral flexibility when rats were tested in a drug-free state, suggesting that impaired decision-making is an enduring consequence of oxycodone exposure.
Behavioral flexibility is required to adapt to changing rewards and goals. Selecting an appropriate response or strategy to achieve a goal is guided by expected outcomes, based on reward history (Pickens et al. 2003; Schoenbaum and Shaham 2008; Stalnaker et al. 2009; Young and Shapiro 2009, 2011; Riceberg and Shapiro 2012; Rudebeck et al. 2013). When reward contingencies change, behavioral flexibility is required to adjust ongoing responding to continue maximizing reward. Reversal-learning tasks provide a simple test of behavioral flexibility. In these tasks, the subject learns to withhold a previously rewarded response and instead make a previously unrewarded response. The neural mechanisms contributing to various reversal-learning tasks include distinct but overlapping neural circuits (Packard and McGaugh 1996; Packard 1999; White and McDonald 2002; Lee et al. 2008) that include subdivisions of the prefrontal cortex (Ragozzino et al. 1999, 2003; Killcross and Coutureau 2003; Fuster 2008; Schoenbaum et al. 2009; Young and Shapiro 2011; Riceberg and Shapiro 2012).
In humans and rodents, repeated performance of goal-directed behaviors can become increasingly automated and resistant to change (Knowlton et al. 1996; Yin and Knowlton 2004, 2006; Yin et al. 2004; Balleine and O'Doherty 2010; Lucantonio et al. 2014). These automated responses can either be adaptive or maladaptive, for example, by facilitating performance in stable situations or disrupting performance when behaviors persist despite changing contingencies. An increase in maladaptive, inflexible responding toward drugs or drug-associated cues is a hallmark of drug addiction and is thought to be a key factor contributing to relapse (Volkow and Fowler 2000; Everitt and Robbins 2005; Kalivas and Volkow 2005; Everitt et al. 2008; Lucantonio et al. 2012). Extended exposure to cocaine or amphetamine impairs reversal learning in rodents (Schoenbaum et al. 2004; Calu et al. 2007; Kosheleff et al. 2012; McCracken and Grace 2013) and monkeys (Jentsch et al. 2002), suggesting that psychostimulant exposure impairs behavioral flexibility (Stalnaker et al. 2009; Lucantonio et al. 2012). A shift from flexible to rigid or automated responding (Lucantonio et al. 2014), paired with differential reinforcing effects of the drug itself (Roitman et al. 2008; Berridge et al. 2009) may facilitate a transition from controlled to maladaptive patterns of drug use.
The present study investigated the extent to which prior exposure to oxycodone impairs behavioral flexibility once rats were drug-free. Oxycodone is a widely prescribed opiate analgesic (Kenan et al. 2012) with high abuse liability (Walsh et al. 2008; Wightman et al. 2012), and its recreational use has increased rapidly over the past decade (Compton and Volkow 2006). Oxycodone binds to μ- and κ-opioid receptors (Ross and Smith 1997; Kalso 2005, 2007; Lalovic et al. 2006), which are expressed in forebrain and midbrain structures involved in behavioral flexibility, learning, and memory (Mansour et al. 1987). The enduring functional consequences of oxycodone exposure have not been explored systematically. The present study investigated these consequences using two complementary learning and memory tasks that engage different learning strategies and neural circuits. Place-approach tasks and other types of spatial learning and memory require the hippocampal system, whereas body-turn and other egocentric learning and memory tasks require the dorsolateral striatum (Packard and McGaugh 1996; Packard 1999; White and McDonald 2002; Packard and Goodman 2013). To further investigate behavioral flexibility, we tested reversal learning in the spatial task using different schedules of contingency changes that either do or do not require the orbitofrontal cortex, a structure sensitive to expected outcomes based on reward history (Young and Shapiro 2009; Riceberg and Shapiro 2012). We hypothesized that limited oxycodone exposure would impair rats’ reversal learning and thereby reveal novel cognitive and neural mechanisms of behavioral inflexibility related to addiction.
Results
Three experiments tested the effects of oxycodone exposure on memory and reversal learning in rats. Each experiment included five types of sessions spanning a total of 10–11 d: initial learning of a maze discrimination task, a retention test, oxycodone or saline exposure (5 d), a post-drug reminder (memory) test, and reversal-learning tests (Fig. 1A). Experiments 1–2 (Exp. 1–2) tested spatial learning on a radial maze and varied the frequency and order of contingency changes, while Experiment 3 (Exp. 3) tested egocentric (body-turn) learning on a plus-shaped maze (Fig. 1B).
Figure 1.
(A) Experimental design. Three experiments were performed using different cohorts of rats. In each experiment, rats were acclimated to the maze, trained on an initial maze task, and then exposed to drug (or saline) for 5 d. After drug exposure ended, the rats were given a reminder session and then tested for reversal learning over 1–2 d. (B) Spatial and body-turn discriminations. Experiments 1 and 2 (Exp. 1–2) trained rats to find food reward at the end of a goal arm (e.g., Arm E) from each of three start arms (e.g., Arms A, C, and D) in the radial maze. During reversal trials, the food was placed in a different goal arm (e.g., Arm B). The two experiments varied the frequency and order of spatial reversals. Experiment 3 (Exp. 3) trained each rat to find food at the end of each goal arm (N and S) by turning left or right at the choice point from each of two start arms (E and W). During the reversal session, the rat had to turn in the opposite direction to find reward. Black arrows depict correct journeys from each start arm.
Experiment 1: low- to high-frequency reversals
This experiment was designed to test how brief oxycodone exposure affected spatial reversals that varied in frequency. Rats first learned a series of spatial reversals with infrequent changes to reward contingencies (low-frequency reversals, LFRs) that encouraged a “win-stay” strategy. After performing LFRs successfully, the rats were presented with a series of more rapid spatial reversals (high-frequency reversals, HFRs) that encouraged a “lose-shift” strategy (Riceberg and Shapiro 2012).
Oxycodone treatment did not impair spatial memory
Prior to drug treatment, rats learned the initial spatial discrimination quickly [trials to criterion (TtC): Oxy, 22.7 ± 1.46; Sal, 18.1 ± 2.2] and showed overnight retention of the task (percent of correct trials: Oxy, 91.3 ± 1.6%; Sal, 95.3 ± 1.8%). Acquisition and retention did not differ between groups prior to treatment. In a reminder session the day after the final drug injections, both groups required fewer trials to meet criterion on the same spatial discrimination task than during the initial (predrug) session [TtC: F(1,15) = 8.80, P < 0.05] and performed equally well throughout the stable performance phase of the session. Thus, oxycodone did not impair memory of a spatial task learned 7–8 d prior to drug exposure.
Oxycodone increased perseverative errors during slow spatial reversal learning
Drug exposure did not impair learning rate during LFRs. All rats needed more trials to learn the first reversal (block 2) than the other three blocks [main effect of block: F(3,48) = 26.79, P < 0.01; Fig. 2A]. Rats made more errors during the first reversal compared with subsequent reversals [main effect of block: perseverative, F(3,48) = 41.37, P < 0.01; nonperseverative, F(3,48) = 5.22, P < 0.01] and perseverative errors were more common in oxycodone-exposed rats [perseverative errors, treatment × block interaction: F(3,48) = 2.97, P < 0.05; Fig. 2B]. Thus, oxycodone did not impair initial acquisition but did reduce accuracy when contingencies changed infrequently.
Figure 2.
Oxycodone exposure did not impair performance of spatial reversals in which reward contingencies changed infrequently. (A) Spatial discrimination learning (Initial; Block 1) and reversal-learning rate (R1–3; Blocks 2–4) did not differ between treatment groups. (B) Learning rate was slower and errors were more frequent during the first spatial reversal (R1; Block 2) compared with subsequent blocks (R2–3; Blocks 3–4).
Oxycodone did not alter response selection or stable performance during slow reversals
During reversals that emphasize stable reward contingencies (LFRs), rats were equally likely to make consecutive correct choices (“win-stay”; Table 1) on each block regardless of treatment [main effect of block: F(3,48) = 20.54, P < 0.01]. The likelihood of consecutive errors (“lose-stay”) was highest during the first LFR (block 2) and decreased across blocks 3–4, regardless of treatment [main effect of block: F(2,32) = 4.56, P < 0.05]. During consecutive errors, the rats were more likely to return to the same arm (perseverative) than a different nonrewarded arm (nonperseverative) (percentage of consecutive errors that were perseverative: Oxy, 76 ± 7%; Sal, 67 ± 15%). Once rats reached learning criterion on each new goal arm, rats in both treatment groups completed the stable performance phase accurately (≥80% correct).
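The response-selection measures described above can be illustrated with a minimal Python sketch. It assumes that "win-stay" is scored as the probability of a correct choice immediately after a correct choice, and "lose-stay" as the probability of an error immediately after an error; the exact scoring used in Table 1 is not reproduced here, and the function name and example data are hypothetical.

```python
def stay_probabilities(outcomes):
    """Return (win_stay, lose_stay) for a sequence of trial outcomes.

    outcomes: list of booleans, True for a correct (rewarded) choice.
    Assumed scoring (not taken from the paper's Table 1):
      win_stay  = P(correct on trial t+1 | correct on trial t)
      lose_stay = P(error on trial t+1 | error on trial t)
    A value of None means the conditioning event never occurred.
    """
    wins = win_stay = losses = lose_stay = 0
    # Walk over consecutive trial pairs (t, t+1).
    for prev, curr in zip(outcomes, outcomes[1:]):
        if prev:
            wins += 1
            win_stay += int(curr)
        else:
            losses += 1
            lose_stay += int(not curr)
    ws = win_stay / wins if wins else None
    ls = lose_stay / losses if losses else None
    return ws, ls


# Hypothetical block of eight trials (illustrative data only).
example_block = [False, False, True, True, False, True, True, True]
print(stay_probabilities(example_block))
```

Scoring each block separately, as in the analyses above, would allow these probabilities to be compared across blocks and treatment groups.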
Rats in both treatment groups completed similar numbers of trial blocks in the HFR task (Oxy, 10.6 ± 0.79; Sal, 11.9 ± 0.13). Learning rate was analyzed in the first five reversals. Rats exposed to oxycodone required more trials to learn the HFR task than controls [main effect of treatment: F(1,16) = 18.39, P < 0.01] and the impairment was most pronounced on the first reversal [block 2: t(16) = −2.87, P < 0.05]. Oxycodone treatment increased perseverative errors [main effect of treatment: F(1,15) = 10.71, P < 0.01; treatment × block interaction: F(4,60) = 3.74, P < 0.01; Fig. 3B], especially during the first rapid reversal [block 2: t(16) = −2.51, P < 0.05]. Nonperseverative errors were unaffected by treatment. Thus, limited oxycodone exposure subsequently impaired rats’ ability to learn rapid spatial reversals when rats were first trained to adapt to relatively stable contingencies. The selective increase in perseverative errors suggests that the impairment reflected an inability to inhibit responses to previously rewarded goals.
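The distinction between the two error types can be sketched in Python. This is an illustration only: it assumes that an entry into the previously rewarded goal arm counts as a perseverative error and that an entry into any other unrewarded arm counts as a nonperseverative error. The function name and arm labels are hypothetical.

```python
def count_errors(choices, current_goal, previous_goal):
    """Return (perseverative, nonperseverative) error counts for one block.

    choices: sequence of arm labels entered on successive trials.
    Assumed classification (an illustration, not the authors' code):
      perseverative    = entry into the previously rewarded goal arm
      nonperseverative = entry into any other unrewarded arm
    """
    perseverative = sum(arm == previous_goal for arm in choices)
    nonperseverative = sum(
        arm not in (current_goal, previous_goal) for arm in choices
    )
    return perseverative, nonperseverative


# Hypothetical reversal block: reward moved from arm "E" to arm "B".
block_choices = ["E", "E", "B", "E", "F", "B", "B"]
print(count_errors(block_choices, current_goal="B", previous_goal="E"))
```

Separating the counts in this way is only possible when more than two goal arms are available, as in the radial maze used here.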
Figure 3.
Oxycodone exposure impaired learning the first spatial reversal when reward contingencies changed rapidly. (A) Oxycodone-exposed rats required significantly more trials to learn the first HFR (R1; Block 2) than control rats. (B) Perseverative errors differed across treatment groups, with oxycodone-exposed rats making more perseverative errors than control rats during the first reversal (R1; Block 2). Nonperseverative errors were similar across treatment groups. (*) P < 0.05 between treatment groups.
Oxycodone reduced win-stay response strategies during rapid reversals
Rats were equally likely to make consecutive errors (“lose-stay”) during the first rapid reversal (HFR block 2), regardless of treatment (Fig. 4A). In contrast, oxycodone-exposed rats were less likely to make consecutive correct choices (“win-stay”) during the first reversal (block 2) compared with saline-treated rats [t(16) = 3.11, P < 0.01; Fig. 4B]. Instead, drug-exposed rats continued to return to the previously rewarded goal arm even after making a rewarded response to the new goal arm. Together, the increase in perseverative errors and reduction in “win-stay” responses suggest that oxycodone impairs rats’ ability to update the value of the previously rewarded goal arm.
Figure 4.
Oxycodone exposure impaired win-stay responses when contingencies changed rapidly. (A) Rats in both treatment groups were equally likely to make consecutive errors during the first HFR (R1; Block 2). (B) Rats in the control group were more likely to make consecutive correct choices than the oxycodone-exposed rats during the first HFR (R1; Block 2). (*) P < 0.05 between treatment groups.
Experiment 2: high- to low-frequency spatial reversals
Exp. 1 revealed that, following limited oxycodone exposure, rats were impaired on an HFR task. Exp. 2 tested the extent to which a prior history of stable reward contingencies (LFRs) altered drug-exposed rats’ ability to learn HFRs. If oxycodone impairs rats’ overall ability to adapt to frequent contingency changes, then the same impairment should occur whether rapid reversals precede or follow training in infrequent reversals. In contrast, if the impairment depends upon prior training with relatively stable reward contingencies, then oxycodone-exposed rats should learn rapid reversals normally when they precede slow reversals.
Oxycodone did not impair high-frequency reversals when rats lacked low-frequency reversal experience
The schedule of initial training, drug exposure, and retesting was identical to that used in Exp. 1, except the rats first learned HFRs and then LFRs. Acquisition and retention of the initial task were similar to Exp. 1 and did not differ between groups prior to treatment (TtC: Oxy, 26.7 ± 1.9; Sal, 26.5 ± 4.2). After drug treatment, all rats reached criterion faster during the reminder (post-drug) versus the initial training (predrug) session [main effect of session: F(1,10) = 52.59, P < 0.01] and error rates remained low during the stable performance phase of the retention session.
Rats in both treatment groups required more trials to learn the first rapid reversal (block 2) compared with the rats in Exp. 1 that first received LFR training [main effect of experiment: F(1,20) = 29.20, P < 0.01]. Rats in both treatment groups learned the first HFR at the same rate (TtC: Oxy, 29.0 ± 4.0; Sal, 23.0 ± 2.9; Fig. 5A) and made similar numbers of perseverative and nonperseverative errors (Fig. 5B). Independent of drug history, the rats learned rapid spatial reversals (HFRs) more slowly when they had not first been trained using relatively stable contingencies. Moreover, when trained first using HFRs, all rats learned and performed LFRs similarly, demonstrating that oxycodone exposure did not impair rats’ overall ability to adapt to different frequencies of contingency changes. Rather, oxycodone exposure produced an asymmetric impairment in adapting to HFRs that depended on prior experience with relatively stable contingencies. The results suggest that exposure to the drug selectively impaired cognitive flexibility when integrated reward history and established response expectancies were violated, and task demands required shifting from a “win-stay” to a “win-shift” response strategy.
Figure 5.
Oxycodone exposure did not impair rapid reversal learning when rats were trained initially using those contingencies. (A) Both treatment groups learned the first high-frequency reversal (R1) at similar rates, and (B) with similar numbers and types of errors.
Experiment 3: body-turn reversal
Exp. 1–2 identified a selective and asymmetric effect of oxycodone exposure on spatial reversal learning. Exp. 3 tested the generality of these observations by investigating the effects of the same schedule of training, drug exposure, and testing on egocentric (body-turn) reversal learning, which required distinct response strategies and neural circuits (Packard and McGaugh 1996; Packard 1999; White and McDonald 2002; Packard and Goodman 2013).
Oxycodone did not impair retention of an egocentric task
Both groups required more trials to learn the initial body-turn task than the spatial task [comparison between all three experiments: F(2,45) = 16.73, P < 0.01; post hoc tests, P < 0.01]. Both groups recalled the initial task accurately (TtC: Oxy, 13.7 ± 2.1; Sal, 13.8 ± 2.4) and reached criterion faster during the reminder than the initial training session [main effect of session: F(1,10) = 52.59, P < 0.001].
To facilitate egocentric responses, the rat was placed on the same start arm and required to enter the correct goal in two consecutive trials before the other start arm was used (see Materials and Methods). The reversal session included similar numbers of switches between start arms in both treatment groups (Oxy, 10.8 ± 0.6 arms; Sal, 10.3 ± 0.6 arms). Rats learned the body-turn task when they made six consecutive correct choices, reflecting three switches in start arms. Ninety percent of the saline-exposed rats but only 58% of the oxycodone-exposed rats met this criterion, i.e., learned the body-turn reversal, within the 80-trial session (Fig. 6A). Rats that failed to meet this criterion (Oxy, n = 5; Sal, n = 1) were assigned the maximum number of trials (80) for statistical analysis. Oxycodone-exposed rats learned the reversal more slowly than saline-treated rats [TtC: Sal, 56.7 ± 3.8; Oxy, 68.5 ± 3.9; t(20) = 2.14, P < 0.05]. Performance during the reversal session was not related to performance during initial training, retention or reminder sessions. Because all rats learned reversals slowly and no rat reached criterion within the first seven start-arm switches, we examined learning rates during the first six start-arm switches. All rats required fewer trials to make two consecutive correct choices from each start arm as the session progressed, demonstrating learning [main effect of start-arm set, i.e., Start Arms 1–3 versus 4–6: F(1,20) = 8.66, P < 0.01; Fig. 6B], and the control rats learned faster than drug-exposed rats [treatment × start-arm set interaction: F(1,20) = 5.52, P < 0.05; Fig. 6B, inset]. Moreover, as the session progressed, all rats made fewer errors immediately after start arms were switched [main effect of start-arm set: F(1,20) = 29.97, P < 0.01; Fig. 6C], but control rats improved faster than drug-exposed rats [main effect of treatment: F(1,20) = 10.84, P < 0.01; treatment × start-arm set: P = 0.055; Fig. 6C]. 
Together, oxycodone exposure contributed to slower learning and reduced accuracy during the body-turn reversal.
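The learning criterion and the handling of rats that did not reach it can be sketched in Python. The run length of six consecutive correct choices and the 80-trial ceiling come from the text above. Counting trials to criterion (TtC) up to and including the trial that completes the run is an assumption, and the function name is hypothetical.

```python
def trials_to_criterion(outcomes, run_length=6, max_trials=80):
    """Return trials to criterion (TtC) for one reversal session.

    outcomes: list of booleans, True for a correct choice.
    Criterion: `run_length` consecutive correct choices.
    Assumption: TtC is the trial number on which the run is completed.
    Sessions that never reach criterion are assigned `max_trials`,
    mirroring the handling of non-learners described in the text.
    """
    streak = 0
    for trial, correct in enumerate(outcomes[:max_trials], start=1):
        # Extend the current run of correct choices or reset it on an error.
        streak = streak + 1 if correct else 0
        if streak == run_length:
            return trial
    return max_trials


# Hypothetical sessions (illustrative data only).
learner = [False] * 10 + [True] * 6
non_learner = [True, False] * 40
print(trials_to_criterion(learner), trials_to_criterion(non_learner))
```

Because non-learners are assigned the ceiling value, group means computed this way are conservative estimates of how slowly those rats learned.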
Figure 6.
Oxycodone exposure impaired body-turn reversal learning. (A) Control rats learned egocentric reversals faster than rats exposed to oxycodone. Though most of the saline-exposed rats (9/10) learned the body-turn reversal within 80 trials (arrow), only 7 of the 12 oxycodone-exposed rats did (arrowhead). (B) Control rats’ performance improved between the first and second three sets of trials (Start Arms 1–3 versus 4–6), whereas the drug-treated rats did not. (C) After start arms were switched, control rats’ performance during the first trials from the new start arm improved as the reversal session progressed (Start Arms 1–3 versus 4–6); performance in oxycodone-treated rats did not improve to the same degree. (D) Control rats made more consecutive correct responses and fewer consecutive errors as the reversal session progressed, compared with oxycodone-treated rats. (*) P < 0.05 for main effect of treatment, (#) P < 0.05 for group × start-arm set interaction.
Oxycodone reduced win-stay and lose-shift response strategies during the body-turn reversal
As the session progressed, rats were more likely to make consecutive correct choices and less likely to make consecutive errors [main effect of start-arm set: correct choices, F(1,20) = 9.64, P < 0.01; errors, F(1,20) = 18.59, P < 0.01]. However, control rats made more consecutive correct (“win-stay”) choices [treatment × Start-Arms interaction: F(1,20) = 9.04, P < 0.01] and fewer consecutive errors (“lose-stay”) [main effect of treatment: F(1,20) = 11.43, P < 0.01] midway through the session, compared with drug-treated rats (Fig. 6D).
Discussion
General summary
Behavioral flexibility, measured by reversal learning, was impaired when rats exposed briefly to oxycodone were later tested in a drug-free state. Oxycodone exposure did not affect either spatial or body-turn discrimination learning, and did not impair spatial reversal learning when reward contingencies changed infrequently. Oxycodone exposure did impair allocentric spatial reversals when contingencies changed rapidly and impaired an egocentric body-turn reversal. Oxycodone-exposed rats took longer to learn the first rapid spatial reversal, made more perseverative errors, and were less likely to maintain correct responses (“win-stay”) than controls. Oxycodone-exposed rats also learned a body-turn reversal more slowly than saline-treated rats, were more likely to make errors when starting from a new arm, less likely to maintain correct responses (“win-stay”), and more likely to make consecutive errors (“lose-stay”) during the reversal compared with saline-treated rats. Together, the results demonstrate that brief oxycodone exposure impairs behavioral flexibility when rats were tested in a drug-free state, and suggest that this commonly prescribed opiate analgesic may have an enduring effect on cognition.
Behavioral flexibility and drugs of abuse
Behavioral flexibility allows animals to adapt rapidly to changing contingencies. Adapting successfully to such changes involves several cognitive processes, such as decision-making, inhibitory control, response selection, and learning and memory (Jentsch et al. 2002; Schoenbaum et al. 2004; Ersche et al. 2006a,b, 2008; van der Plas et al. 2009; Kosheleff et al. 2012; McCracken and Grace 2013). Drugs of abuse cause widespread cellular and molecular changes in cortical and limbic circuitry, including prefrontal regions involved in behavioral flexibility (Nestler and Aghajanian 1997; Nestler 2005; Pickens et al. 2011; Robison and Nestler 2011), and prominent theories of addiction propose that ensuing cognitive impairments contribute to the trajectory of addiction (Jentsch and Taylor 1999; Volkow and Fowler 2000; Kalivas 2008; Lucantonio et al. 2012, 2014). Indeed, behavioral flexibility is impaired in animals given extended exposure to psychostimulants (Jentsch et al. 2002; Schoenbaum et al. 2004; Kosheleff et al. 2012; McCracken and Grace 2013) and in human addicts (Ersche et al. 2006a,b, 2008; van der Plas et al. 2009). However, the mechanisms underlying behavioral inflexibility in drug-exposed individuals are not well understood.
Reversal learning is a common measure of behavioral flexibility and requires that a subject learn to withhold a previously rewarded response and instead make a previously unrewarded response. Reversal learning is impaired in animals exposed chronically to cocaine or amphetamine and tested in the absence of drugs and drug-paired cues (Lucantonio et al. 2012), suggesting a general decline in flexible responding rather than a drug-directed or drug-induced response (Kantak et al. 2005; Berridge et al. 2009).
Although many abused drugs share common neural substrates (Nestler 2005), psychostimulant and opiate exposure cause distinct neurobiological and behavioral changes, indicating that unitary theories of addiction require further elaboration to explain key facets of abuse (Badiani et al. 2011). For example, cocaine or amphetamine addicts are more impaired than opiate addicts on cognitive tests involving planning, attention, and behavioral flexibility (Ersche et al. 2006a, 2008). To date, no study has systematically examined whether opiate drugs are associated with inflexible behavior, which could contribute to continued use and abuse in humans. The present study investigated rats’ behavioral flexibility after 5 d of exposure to oxycodone, a widely available opiate analgesic with high abuse liability (Kalso 2005; Compton and Volkow 2006; Walsh et al. 2008; Kenan et al. 2012; Wightman et al. 2012).
Oxycodone did not impair rats’ ability to respond to stable outcome expectancies
Experiment 1 tested behavioral flexibility using a place-approach task and several spatial reversals in a radial maze. During low-frequency reversals, the reward contingencies changed relatively slowly, using a schedule shown previously to require an intact orbitofrontal cortex (OFC) (Riceberg and Shapiro 2012). The OFC is required in tasks in which animals must update their expectation of reward when contingencies change, such as reinforcer devaluation (Gallagher et al. 1999; Pickens et al. 2003; Izquierdo et al. 2004; West et al. 2011) and reversal-learning tasks that use relatively stable reward histories (Izquierdo et al. 2004; Ghods-Sharifi et al. 2008; Young and Shapiro 2009; Riceberg and Shapiro 2012). Neurons in the OFC fire in response to expected outcomes (Schoenbaum et al. 1999, 2009; Morrison et al. 2011; Young and Shapiro 2011) but in cocaine-treated rats, OFC neurons do not signal aversive outcomes appropriately (Stalnaker et al. 2006). These data suggest that drug-exposed individuals are unable to update reward expectancies that contribute to new associative encoding in other brain regions (Stalnaker et al. 2007; Takahashi et al. 2009). This idea is consistent with metabolic and functional abnormalities in human OFC identified in long-term cocaine and heroin addicts, particularly during drug craving (Volkow et al. 1991, 2007; Volkow and Fowler 2000; Botelho et al. 2006; Ersche et al. 2006b, 2008).
To date, drug-induced deficits in reversal learning have only been identified using tasks that use relatively stable reward contingencies (Jentsch et al. 2002; Schoenbaum et al. 2004; Calu et al. 2007; Kosheleff et al. 2012) and are thus sensitive to OFC damage (Schoenbaum et al. 2004; Stalnaker et al. 2009; Lucantonio et al. 2012). Further, drug-induced deficits on these tasks have been identified in rats exposed chronically to cocaine or amphetamine and subsequently trained and tested 1–12 wk after drug exposure was discontinued (Schoenbaum et al. 2004; Calu et al. 2007; Stalnaker et al. 2007; Kosheleff et al. 2012; McCracken and Grace 2013), consistent with OFC deficits. Given these data, we predicted that rats given limited opiate exposure would continue to respond to formerly rewarded goal arms after contingencies changed, and that this deficit would be most prominent in tasks that used relatively stable contingencies. However, the rats in the present study were not impaired in an LFR task demonstrated in recent work to be impaired by OFC lesions (Riceberg and Shapiro 2012). Several methodological differences may contribute to the divergent results. The present study trained rats before drug exposure and confirmed retention of the initial task after drug exposure ended, prior to reversal training. Oxycodone, an opiate analgesic with high abuse liability (Compton and Volkow 2006; Walsh et al. 2008), was used instead of a psychostimulant (Badiani et al. 2011) and the length of drug exposure was substantially shorter than the chronic exposure used in previous reversal studies (Jentsch et al. 2002; Schoenbaum et al. 2004; Calu et al. 2007; Kosheleff et al. 2012; McCracken and Grace 2013). OFC deficits may vary with the duration of drug-free periods. Selective inactivation of OFC neurons that respond to drug-paired cues reduces drug-seeking in heroin-dependent rats after 14 d, but not 1 d, of withdrawal (Fanous et al. 2012), suggesting that OFC activity may contribute to the escalation of drug craving across the withdrawal period. If OFC deficits emerge as more time elapses after the end of drug exposure, then rats exposed to oxycodone should be more impaired in LFRs when tested after longer drug-free intervals.
Oxycodone impaired rats’ ability to adapt to high-frequency reversals after prior training using stable contingencies
High-frequency reversals require the animal to respond to rapidly changing contingencies and exemplify a type of behavioral flexibility that does not require the OFC (Riceberg and Shapiro 2012). After training in LFRs, saline-exposed rats adapted rapidly and responded consistently to the new contingencies, rarely making errors. In contrast, oxycodone-exposed rats learned HFRs more slowly and made more perseverative errors than controls. Perseverative and nonperseverative errors can be distinguished clearly in the radial maze because more than two potential goal arms are available to the rat. Though impaired at learning the first HFR, the oxycodone-exposed rats maintained a spatial strategy, alternating between the previous and current goal arm and rarely entering other available arms. The errors were not consistent with a body-turn strategy. Interestingly, the errors were typically not consecutive, contrary to what might be expected if the drug exposure induced rigid, compulsive responding. Instead, errors typically occurred after correct choices, indicating that drug-exposed rats did not maintain correct responses even after receiving a reward. Thus, the oxycodone-exposed rats were sensitive to new contingencies and “forgot” neither the general spatial strategy nor the specific location of the previous goal. Rather, oxycodone exposure disrupted the integration of recent reward history, a computation that allows rapidly changing contingencies to guide adaptive choices.
Rapid spatial reversal learning may depend upon the prelimbic/infralimbic regions of the medial prefrontal cortex (mPFC) more than the OFC (Riceberg and Shapiro 2012). Rats trained to make relatively rapid, serial spatial reversals on a plus-maze (∼11–15 trials per block), similar to HFRs, took longer to learn these reversals following bilateral mPFC inactivation (Guise and Shapiro 2013; Seip-Cammack et al. 2014). Performance on this task was also impaired by contralateral inactivation of the mPFC and dorsal hippocampus (Ragozzino et al. 2003; Seip-Cammack et al. 2014). These data suggest that rapid spatial reversal learning requires the interaction between the mPFC and hippocampus, perhaps because rules and strategies coded by the mPFC select appropriate hippocampal journey codes (Shapiro et al. 2014). This view predicts that errors could correlate with miscoding or inappropriate retrieval of recent journeys by hippocampus or rules by mPFC (Ferbinteanu and Shapiro 2003; Rich and Shapiro 2009).
Maze discrimination and reversal learning can, in principle, be maintained by several cognitive strategies, including recent explicit memory, stimulus–response (S–R) associations, immediate reward contingencies, and expectancies based on integrated reward history (White and McDonald 2002; White 2004). The asymmetric impairment in oxycodone-exposed rats in switching from LFR to HFR but not vice versa suggests that oxycodone did not impair rats’ general ability to adapt to new task demands. Rather, the transition between LFR and HFR tasks required a switch in optimal response strategies, from one with stable contingencies that favored persistence (“win-stay”) to one with more dynamic contingencies that required constant flexibility (“lose-shift”). Control rats learned HFRs quickly if they had prior LFR experience but slowly if they did not, suggesting that aspects of the task structure learned during the LFR task transferred to the HFR task. Indeed, performance on the LFR task may have been facilitated by the formation of reward expectancies based on integrated reward history (e.g., reward location is relatively stable) and, in normal rats, violations of these expectations may have produced an error signal that facilitated rapid reversal learning. Drug-exposed rats trained first on LFRs were more likely than controls to revisit previously rewarded arms (i.e., make perseverative errors) during subsequent HFRs, suggesting that learning was not driven by reward expectancy error. Rather, their performance was consistent with a persistent “win-stay” habit reflecting S–R associations that were maintained despite contingencies that favored a “lose-shift” strategy. This explanation is also consistent with the body-turn impairment identified in oxycodone-exposed rats.
Multiple memory systems and the role of the dorsolateral striatum
Experiment 3 tested behavioral flexibility using an egocentric reversal task in which rats had to learn a body-turn response (e.g., turn left at choice point) and modify that response when contingencies changed (e.g., turn right). Learned egocentric responses, also called S–R associations or “habits,” require the dorsolateral striatum (DLS) (Packard and McGaugh 1996; White and McDonald 2002; Rich and Shapiro 2009). Brief oxycodone exposure did not impair rats’ ability to learn an egocentric response, but severely impaired rats’ ability to update that response when contingencies changed. Oxycodone-exposed rats learned the body-turn reversal more slowly than controls, in part because they returned to the previously rewarded arm each time a trial started from the other start arm. Because this task used a plus-maze, perseverative and nonperseverative errors could not be distinguished: every error during a reversal had been the correct response in the previous block. Similarly, the plus-maze task cannot be used to distinguish whether the oxycodone-exposed rats reverted to the previously rewarded body-turn response or returned to the most recently rewarded spatial location. In either case, oxycodone impaired rats’ ability to modify an established strategy to learn a new association in a familiar context.
The physiological mechanisms required for cognitive flexibility may be impaired by direct pharmacological effects of the drug on synaptic plasticity. For example, S–R associations made prior to drug exposure were presumably encoded via enduring changes to corticostriatal circuits, in part through dopaminergic modulation of striatal synapses (White 1996). These associations are strengthened each time that a given response (e.g., turning left) that follows a given stimulus (e.g., choice point) is reinforced, and reactivation of these synapses by abused drugs may further enhance those associations (White 1996). Indeed, DLS neurons recorded from rats exposed to cocaine fire more rapidly in response to cues that predict appetitive stimuli (Takahashi et al. 2007). In mice exposed chronically to ethanol, long-term depression declined and synaptic arborization increased in the DLS; S–R learning also improved (DePoy et al. 2013). If drug exposure strengthens established S–R associations (Nelson and Killcross 2006), then reversal learning could be impaired because strong synaptic traces may decay relatively slowly and impede the induction of new synaptic weights, wiring patterns, or both via competitive plasticity mechanisms (Chklovskii et al. 2004; Kasanetz et al. 2010). Such mechanisms may also help explain why drug-exposed rats were less likely to make consecutive correct responses and more likely to revert to the previously learned association.
Conclusions
Addiction is commonly characterized as a transition from flexible, goal-directed processing to more rigid, habit-like responding (Everitt and Robbins 2005; Kalivas and Volkow 2005; Nelson and Killcross 2006; Everitt et al. 2008; Kalivas 2008; Zapata et al. 2010). This transition may involve a shift from prefrontal to subcortical (e.g., DLS) control over behavior (Everitt and Robbins 2005; Lucantonio et al. 2012, 2014; Gremel and Costa 2013). However, the circuit mechanisms of this transition have remained unclear. Drugs may influence learning and memory by altering different types of information processing (White 1996; Berridge et al. 2009; Kasanetz et al. 2010). The multiple memory systems model (White and McDonald 2002) proposes that distinct, parallel circuits store different types of information about the individual's environment, including sensory signals, internal states and motor responses. Altered plasticity within or between one or more components of these memory systems, as well as circuits that process reward and affective information (Nestler and Aghajanian 1997; Pickens et al. 2011; Robison and Nestler 2011; Smith et al. 2011) and that compute reward expectancies and abstract rules (Shapiro et al. 2014), may help explain key aspects of addiction-like behavior (White 1996).
The present study investigated how oxycodone, a widely abused opiate analgesic (Compton and Volkow 2006; Walsh et al. 2008) that likely alters molecular signaling throughout these distributed circuits (Sim-Selley et al. 2007; Seip-Cammack et al. 2013; Zhang et al. 2014), affects rats’ ability to respond to changing reward contingencies in a familiar environment. The findings suggest that a relatively brief history of oxycodone exposure does not alter memory for DLS- or HIPP-dependent tasks but does impair flexible adaptation to contingency changes within each domain, even when the rat was tested in a drug-free state. These results are consistent with disrupted interactions between the prefrontal cortex and circuits involved in different types of learning and memory (Shapiro et al. 2014). To our knowledge, this is the first study to demonstrate that impaired cognitive flexibility is an enduring consequence of exposure to a commonly prescribed opiate analgesic.
Materials and Methods
General procedures
Subjects
Male Long-Evans rats (n = 48), aged 90 d and weighing 250–300 g at the start of the experiment, were housed individually in a colony room on a 12-h reversed light/dark cycle. One week after they arrived in the laboratory, the rats were food restricted to ≥85% of their baseline body weight and maintained on a restricted diet for the duration of the experiment. All procedures were approved by the Institutional Animal Care and Use Committee and performed in accordance with National Institutes of Health guidelines.
Apparatus
Learning and memory were tested using two wooden mazes, each located in the center of a room containing distal visual stimuli on all four walls (Fig. 1B). Exp. 1–2 used a radial maze with six arms (A–F; 57 cm L × 10 cm W) bordered by raised edges (4.5 cm) and spaced equidistantly around a central platform (20 cm diameter, 56 cm H). Exp. 3 used a plus-shaped maze with four orthogonal arms (north, south, east, and west; 57 cm L × 8.5 cm W) bordered by raised edges (2.5 cm). Food reward (chocolate sprinkles) could be placed in a recessed round well (2 cm W × 1 cm deep) made of wire mesh. Inaccessible food rewards located beneath the mesh minimized the influence of odor cues on behavior. An elevated waiting platform (40 cm L × 30 cm W × 100 cm H) with raised edges (6 cm) was located next to the maze.
Maze acclimation
Rats were handled for 2–3 d and allowed to forage for chocolate sprinkles scattered on the maze. Behavioral training began after rats consumed all food reward from each goal arm twice in 10 min. Preexisting preferences for a certain maze arm or body-turn were noted, and rats were trained on a task opposite to their bias.
Behavior testing
Each maze was assigned equal proportions of “start arms” and “goal arms.” At the start of each “trial,” food reward was placed in the well of a goal arm and the rat was placed on the distal end of a start arm facing away from the center of the maze. The start arm was selected pseudorandomly for each trial, with the restrictions that no more than three consecutive trials used the same start arm and all start arms were used with approximately equal frequency throughout the session. A “choice” was defined when a rat entered a goal arm with its full body length. If the rat chose the correct arm, it was allowed to consume the food before being returned to the waiting platform. If the rat chose an incorrect arm (an “error”), the trial ended and the rat was returned to the waiting platform without food. The rat was allowed to self-correct, i.e., enter arms freely until it chose the correct arm and ate the food, during the first trial of every block or after five consecutive errors. Rats remained on the waiting platform for an intertrial interval of 5–10 sec. If a rat failed to meet criterion during any session, that session was repeated on subsequent days until criterion was reached.
Daily testing sessions were organized into one or more “blocks,” defined by a set of consecutive trials using the same goal arm. If a session included more than one block, the goal arm for each subsequent block was chosen pseudorandomly with the restrictions that the same goal arm was not used in consecutive blocks. Each goal arm was used at least once every 5–6 blocks and all goal arms were used with approximately equal frequency during the session. Errors were categorized operationally based on the correct response in the previous block. A “perseverative error” occurred when the rat chose the goal arm rewarded during the previous block; a “nonperseverative error” occurred when the rat chose any other nonrewarded arm.
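The operational error categories defined above can be illustrated with a minimal Python sketch. The function name `classify_choice`, the arm labels, and the example choice sequence are hypothetical and serve only to show the rule; they are not data or code from the study.

```python
from typing import List, Optional


def classify_choice(chosen_arm: str, goal_arm: str,
                    previous_goal_arm: Optional[str]) -> str:
    """Label one choice using the operational definitions in the text.

    A choice of the current goal arm is correct. A choice of the arm that
    was rewarded in the previous block is a perseverative error. A choice
    of any other nonrewarded arm is a nonperseverative error.
    """
    if chosen_arm == goal_arm:
        return "correct"
    if previous_goal_arm is not None and chosen_arm == previous_goal_arm:
        return "perseverative"
    return "nonperseverative"


# Hypothetical block in which the goal moved from arm "B" to arm "D".
choices: List[str] = ["B", "D", "B", "E", "D"]
labels = [classify_choice(c, goal_arm="D", previous_goal_arm="B")
          for c in choices]
```

In the first block of a session there is no previous goal arm, so the sketch treats every error there as nonperseverative; that handling is an assumption made for the example.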
Drug exposure
Rats were randomly assigned to a drug (n = 26) or saline (n = 22) treatment group. Oxycodone hydrochloride (Oxy, Sigma-Aldrich) was prepared daily (3 mg/kg in 0.9% sterile saline) and injected subcutaneously into the dorsal flank of rats twice daily (1000 and 1500 h) for 5 d. This dose and administration route are comparable to those used to induce place preference (Olmstead and Burns 2005; Rutten et al. 2011; Campbell et al. 2012). A comparable volume of saline (Sal, 0.2 mL) was administered to control rats. No rats displayed somatic withdrawal signs after drug injections were discontinued, indicating lack of overt physical dependence. All testing occurred ∼24 h after the last injection to minimize acute drug-induced effects on learning, memory, and locomotion (Patti et al. 2006). Given the plasma half-life of oxycodone in the rat (Huang et al. 2005; Nakamura et al. 2011), the amount of oxycodone remaining in circulation once testing resumed was predicted to be <0.016% of the injected dose. All rats continued to run on the maze and quickly consume available food rewards regardless of drug treatment, indicating that the palatable food reward was sufficiently motivating, in contrast to reduced responses to natural rewards that characterize severe drug withdrawal (Harris and Aston-Jones 2007).
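An estimate of residual drug such as the one above is commonly derived from first-order elimination. The text does not state the model or the half-life value used, so the following Python sketch rests on that assumption. The function name `fraction_remaining` and the example values are illustrative and are not taken from the cited pharmacokinetic studies.

```python
def fraction_remaining(elapsed_h: float, half_life_h: float) -> float:
    """Fraction of a dose still circulating after elapsed_h hours.

    Assumes single-compartment, first-order elimination, so the amount
    halves once per half-life: fraction = 0.5 ** (elapsed / half_life).
    """
    if half_life_h <= 0:
        raise ValueError("half_life_h must be positive")
    if elapsed_h < 0:
        raise ValueError("elapsed_h must be non-negative")
    return 0.5 ** (elapsed_h / half_life_h)


# Illustrative values only: three half-lives leave one eighth of the dose.
example = fraction_remaining(elapsed_h=6.0, half_life_h=2.0)
```

Under this model the residual fraction falls steeply with the ratio of elapsed time to half-life, which is why a delay of many half-lives leaves a negligible amount in circulation.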
Data analyses
Learning rate was assessed by the number of trials required to reach criterion on each task (trials to criterion, TtC). The numbers of errors (Err) made during each block were calculated for each session. The numbers of perseverative and nonperseverative errors were calculated separately in Exp. 1–2, which used a radial arm maze, but could not be dissociated in Exp. 3, which used a plus-maze. “Response selection” was quantified by calculating the probability of consecutive correct or error trials and classified as “win-shift,” “win-stay,” “lose-shift,” or “lose-stay” (Table 1; Riceberg and Shapiro 2012). The probabilities were calculated by dividing the number of trials defined by each strategy by the total number of trials in which that strategy was possible within each block [e.g., P(correct following correct) = (# correct following correct)/(# trials following correct)] and were then averaged across rats within each treatment group. Two-way ANOVAs were used to analyze the effects of treatment on task performance (e.g., TtC, error rate) and response selection, and were preceded by Mauchly's test of sphericity. Post hoc analyses used Bonferroni's comparisons or select t-tests, which were preceded by equality of variance tests. Data are presented as mean ± SEM.
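The conditional probabilities described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `response_selection`, the dictionary keys, and the example outcome sequence are hypothetical, and the mapping of each probability onto the strategy labels in Table 1 is not reproduced here.

```python
from typing import Dict, List, Optional


def response_selection(outcomes: List[bool]) -> Dict[str, Optional[float]]:
    """Conditional outcome probabilities within one block.

    outcomes lists the trials in order (True = correct, False = error).
    Each probability divides the number of trials of one type by the
    number of trials in which that type was possible, e.g.
    P(correct following correct) =
        (# correct following correct) / (# trials following correct).
    Returns None when no trial followed the conditioning outcome.
    """
    pairs = list(zip(outcomes, outcomes[1:]))
    after_correct = [nxt for prev, nxt in pairs if prev]
    after_error = [nxt for prev, nxt in pairs if not prev]

    def proportion(values: List[bool], target: bool) -> Optional[float]:
        if not values:
            return None
        return sum(v == target for v in values) / len(values)

    return {
        "p_correct_after_correct": proportion(after_correct, True),
        "p_error_after_correct": proportion(after_correct, False),
        "p_correct_after_error": proportion(after_error, True),
        "p_error_after_error": proportion(after_error, False),
    }


# Hypothetical block: error, correct, correct, error, correct.
probs = response_selection([False, True, True, False, True])
```

Per-block values from a function like this would then be averaged across rats within each treatment group, as the text describes.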
Experiment 1: low- to high-frequency spatial reversals
Initial learning
Rats (n = 18) were trained first to find food reward in a specific goal arm using spatial navigation (Riceberg and Shapiro 2012). This training session was divided into two phases, acquisition and stable performance. The “acquisition phase” consisted of one block. Self-correction was allowed during the first four trials of the block; this is the only session in which self-correction was allowed on more than one trial. Training continued until the rat made eight consecutive correct choices (learning criterion) or completed 40 trials. If a rat failed to reach criterion, the session ended and the rat repeated the same training session on subsequent days until criterion was met. On the same day that a rat reached criterion, the acquisition phase was followed by a “stable performance phase” consisting of 24 trials using the same goal arm and pseudorandomly assigned start arms. Rats had to perform ≥80% correct trials to pass this phase. Rats typically reached both criteria in a single session.
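The learning criterion described above (a run of consecutive correct choices within a capped number of trials) can be expressed as a short Python sketch. The function name `trials_to_criterion` is hypothetical, and counting TtC through the last trial of the criterion run is an assumption, since the text does not say whether the criterion trials themselves are included.

```python
from typing import List, Optional


def trials_to_criterion(outcomes: List[bool], run_length: int = 8,
                        max_trials: int = 40) -> Optional[int]:
    """Trial number on which the criterion run is completed.

    outcomes lists the trials in order (True = correct, False = error).
    Returns None if the rat does not complete run_length consecutive
    correct choices within max_trials, i.e. the session must be repeated.
    """
    streak = 0
    for trial_number, correct in enumerate(outcomes[:max_trials], start=1):
        streak = streak + 1 if correct else 0
        if streak >= run_length:
            return trial_number
    return None


# Hypothetical acquisition block: three early trials, then eight correct.
ttc = trials_to_criterion([False, True, False] + [True] * 8)
```

The same function covers the other criteria in the text by changing the arguments, for example `run_length=3` with `max_trials=100` for an HFR session (where the cap applies to the session rather than a single block).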
Retention session
The day after a rat reached initial learning criterion, it was given a retention test consisting of 24 trials using the same goal arm. Rats had to perform ≥80% correct trials to pass the retention test. Rats typically reached criterion in a single session. The day after the retention test, rats were randomly assigned to receive 5 d of oxycodone (Oxy, n = 10) or saline (Sal, n = 8) injections in their home cages.
Reminder session
The day after the final drug injections, the rats completed a reminder session consisting of acquisition (≤40 trials) and stable performance (12 trials) phases, similar to initial training. The same goal arm and learning criterion were used.
Low-frequency reversals (LFRs)
This session was designed to establish relatively stable reward contingencies and encourage rats to apply their reward history to a “win-stay” strategy. The session occurred on the day after the reminder session and consisted of four blocks. The first block used the same goal arm from the initial training sessions. In each of the subsequent three blocks, the rats learned to seek reward in a new goal arm, or “reverse” a stable behavioral response. The rat was trained until it made eight consecutive correct choices and then performed an additional 12 trials to confirm stable performance (≥80% correct), for a minimum of 20 trials per block. After reaching stable performance criterion, the food reward was moved to a different goal arm and another block began, to a maximum of 130 trials. If a rat failed to meet criterion on any block, the session was terminated.
High-frequency reversals (HFRs)
This session was designed to establish transient reward contingencies and require a high degree of behavioral flexibility. The session occurred on the day after the LFR session and consisted of 9–13 blocks. The first goal arm (block 1) was the same goal arm used in the final LFR block. In each block, rats were trained until they made three consecutive correct choices. Once criterion was reached, food was moved to a different goal arm and the next block began, to a maximum of 100 trials.
Experiment 2: high- to low-frequency spatial reversals
A second cohort of rats (Oxy, n = 4; Sal, n = 4) was trained on a schedule identical to Exp. 1 until Days 7–8, when they were tested first on HFRs and then on LFRs (Fig. 1A). Because the rats were trained first on HFRs, they did not have a history of stable reward contingencies on any arm other than the one rewarded during initial training. This experiment tested the extent to which a prior history of stable reward contingencies altered the effect of oxycodone exposure on reversal learning.
Experiment 3: body-turn reversals
A third cohort of rats (Oxy, n = 12; Sal, n = 10) was trained to make body-turn responses on a plus-shaped maze. To facilitate an egocentric strategy, the rats were prevented from entering the opposite start arm by a wooden block (30 cm L × 5 cm W × 10 cm H) so the two goal arms (i.e., north and south arms) were the only available choices to the rat. The training schedule was identical to that used in Exp. 1–2 except that the rats learned to make a consistent body-turn when exiting each start arm to obtain reward (e.g., “turn left”). As in Exp. 1–2, the initial training session was divided into acquisition and stable performance phases. During the acquisition phase, the rat was trained from one start arm until it made two consecutive correct choices, then it was trained from the other start arm to the same criterion. Session criterion was defined by six consecutive correct choices within the 80-trial block. Once a rat reached session criterion, the stable performance phase began and consisted of 24 trials using pseudorandomly assigned start arms. Rats had to perform ≥80% correct trials to pass this phase. Retention and reminder sessions were identical to those in Exp. 1–2. The reversal session (Day 7) was identical to the initial training session, except that rats were trained to turn in the direction opposite to the one learned previously (e.g., from “turn left” to “turn right”).
To quantify learning rate during different portions of the reversal session, the total number of trials that each rat performed from three consecutive start arms (e.g., all trials associated with Start Arms 1–3) was calculated. To assess performance accuracy, correct and incorrect trials were assigned values of 1 or 0. Values for the first three trials of each start arm were averaged to reflect rats’ performance when a new start arm was used. To assess performance after start arms were switched during different portions of the reversal session, these values were averaged across the first and second sets of three start arms (Arms 1–3 and 4–6). Two-way ANOVAs were used to assess if performance during the reversal session (TtC) covaried with performance during the initial learning, retention, and reminder sessions.
Competing interest statement
The authors have no conflicts of interest, financial or otherwise, pertaining to any aspect of the work reported in this manuscript. All experiments described herein comply with the current laws of the country in which they were performed.